International In-house Counsel Journal Vol. 9, No. 34, Winter 2016, 1

Data Analytics: The Future of Legal

JEREMIAH CHAN1 Legal Director, Google Inc., USA & JAY YONAMINE Senior Data Scientist, Google Inc., USA & NIGEL HSU Head of Operations, Verily Life Sciences LLC, USA

INTRODUCTION

During the 2002 season of Major League Baseball (MLB), the Oakland Athletics and New York Yankees both won an astounding 103 games, a feat only topped three times by any team in the last 24 years. Despite identical records, the New York Yankees’ payroll on opening day was $125,928,583 compared to only $39,679,746 for the Oakland Athletics. On a per-win basis, the Oakland Athletics paid $385,240 per win compared to the New York Yankees’ $1,222,608. The New York Yankees were not an outlier. All other MLB teams (excluding Oakland) paid on average $853,252 per win in 2002.

So how were the Oakland Athletics able to generate wins with roughly two to three times the cost efficiency of other teams in the league? By now, the answer has been well documented by Michael Lewis in the book and subsequent movie Moneyball, in which Lewis recounts how Oakland Athletics’ General Manager Billy Beane replaced traditional approaches to scouting and roster management with a rigorous use of data and statistical models. This is often referred to as “data analytics.”

The quest for increased efficiency is certainly not unique to MLB. Be it a widget factory, technology company, or legal department, the desire to maximize the quality of outcomes while maintaining or reducing costs is ubiquitous. As the “Big Data revolution” continues to grow, organizations increasingly look to data analytics as a core driver of increased efficiency. However, this is more difficult for some organizations than others, as it requires not only building or buying sophisticated technical infrastructure, but also generating buy-in from key stakeholders who might not believe in the power of data analytics.

For many in-house legal departments, debates over whether or not to move towards data-driven efficiency are over. In-house counsel can no longer afford to disregard relevant data in making decisions.
The question facing legal department leadership is not whether to incorporate data analytics, but how to most effectively do so. Regarding how, there is good news and bad news for legal departments. The bad news is that legal as an industry is late to the game. The good news is that many other industries have already paved the way, having spent decades resolving a host of technical challenges, and their efforts have resulted in clear best practices and massive increases in efficiency. The widespread adoption of data analytics across other industries suggests that legal departments may be forced to adapt faster than they would expect or prefer.

This article describes the evolution of the data landscape in the legal industry and highlights several applications for data analytics in legal. It also prognosticates the continued growth of data analytics in legal practice and the potential repercussions based on observations from other industries.

1 The views expressed herein are those of the authors, alone.

International In-house Counsel Journal ISSN 1754-0607 print/ISSN 1754-0607 online

SECTION 1: THE EVOLUTION OF THE DATA LANDSCAPE IN LEGAL

In 1965, more than one hundred thousand patent applications were filed in the United States. At that time, the U.S. Patent Office was managing thousands of physical documents, including applications and the voluminous correspondence between an applicant and an examiner in the course of procuring a patent (called “prosecution history”). IP law firms were also handling large volumes of paper in the course of patent law practice. Patent attorneys needed to search and copy documents from archives in order to understand the novelty of inventions for the purpose of filing new patent applications. They also had to order hard copies of prosecution histories from the Patent Office in order to evaluate the scope and validity of patents asserted in lawsuits.

Fast forward to today. Patents from all over the world have been digitized, OCRed, and persisted in a distributed and indexed database. Search functionality has gone from complex Boolean strings to powerful combinations of natural language and semantic search, class codes, citation networks, and a host of other signals.
Most importantly, everyone has free access to this search functionality through the Patent Office’s searchable database and tools like Google Patents (patents.google.com).2 These tools have not only raised the quality of the patent attorney’s work product, they have also minimized the amount of attorney time spent on tasks that historically required hours of work and weeks of waiting for hard copies.

Figure 1

Figure 1 describes the stages of development in the example above: (A) collecting all of the relevant physical documents; (B) scanning and digitizing the documents; (C) making the data searchable with tools like Google Patents; (D) visualizing the data with dynamic dashboards like Innography’s PatentIQ; and (E) applying advanced analytics to fully automate decisions. Every other industry has experienced the same sequence of events in achieving data efficiency, and the legal industry is no different. At a higher level, the five stages can be grouped into three principal phases of data evolution: (1) availability, (2) visualization, and (3) automation. This section provides a brief overview of each phase.

2 Google Patents makes it easy for anyone to search for patents and prior art from many sources, including the machine-translated full text of patents and applications from many patent offices and results from Google Scholar. It scales from simple searches performed by an individual inventor to extensive prior art and invalidation searches performed by patent examiners, agents, and attorneys. To efficiently search the growing amount of prior art in less time, it focuses on assisting the user in constructing their search query and surfacing the most relevant results. Full support for the Cooperative Patent Classification (CPC) scheme has been integrated, with results from Google Scholar machine-classified by CPCs to quickly narrow down non-patent prior art, and CPC autocomplete and result clustering suggestions shown to refine a query.

Availability

As in all other industries, the first step in achieving data-driven efficiency for legal departments is data availability, which entails digitizing and persisting the data of interest in a way that enables the end user to search the data. In some domains, users are so accustomed to having data available that it is easy to take this functionality for granted. For example, consider search functionality within an e-mail account – a capability that most legal professionals use every day. Underlying this feature is a highly complex, distributed database architecture that powers near-instant keyword and Boolean searches across potentially terabytes of text, saving countless hours.

For many legal tasks, data availability has yet to be achieved. For example, it is common for legal contracts to exist solely in .pdf format, even in large in-house departments.
This means that in-house counsel are unable to perform searches on the text in the .pdf or utilize contract metadata that could be valuable for other analyses. Fortunately, many industries (including legal) have spent years establishing best practices to achieve data availability on currently “unavailable” data. At a high level, the best practices include the following three steps:

1) Identify: Determine the specific data of interest and implement a process for obtaining it. For example, if the data of interest is legal contracts:
   a) Identify where the legal contracts are located (on a hard drive, on a web server, etc.)
   b) Determine the current file format (.pdf, .doc, .odt, .wpd, or .rtf)
   c) Implement a process for ingesting the documents on a regular schedule (an “extract” process, web scraper, etc.)
2) Prepare: Ensure that the data is in a format conducive to persistence in a database. If the data in question is document-based, the “prepare” step might include running optical character recognition (OCR) software to convert text in a .pdf to a machine-readable text format.
3) Persist: Store the cleaned, machine-readable data in a database. A schema is required to identify the specific fields of information to be stored as well as relationships between fields and appropriate indexing to enable fast queries.3

There are dozens of technical choices to be made, including on-premise vs. cloud-based hardware, open-source vs. commercial software, and SQL vs. NoSQL database architectures. For proprietary data, an organization may need to follow these steps fully within its own firewall. However, for many use cases, external companies provide data availability in the cloud and allow legal departments to access the data through application programming interfaces (APIs) or manually through software as a service (SaaS) applications.
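The "prepare" and "persist" steps above can be illustrated with a minimal sketch. It assumes the contract text has already been extracted (for example by OCR) and uses Python's built-in sqlite3 module with an in-memory database. The table name, field names, and sample documents are hypothetical and chosen only for illustration; a production schema would differ.

```python
import sqlite3

# Hypothetical schema for illustration; real contract metadata will differ.
SCHEMA = """
CREATE TABLE contracts (
    contract_id   INTEGER PRIMARY KEY,
    counterparty  TEXT NOT NULL,
    source_format TEXT NOT NULL,   -- e.g. 'pdf', 'doc'
    signed_year   INTEGER,
    body_text     TEXT NOT NULL    -- machine-readable text (e.g. OCR output)
);
CREATE INDEX idx_contracts_counterparty ON contracts (counterparty);
"""


def prepare(raw_text):
    """'Prepare' step: collapse the irregular whitespace typical of OCR output."""
    return " ".join(raw_text.split())


def persist(conn, counterparty, source_format, signed_year, raw_text):
    """'Persist' step: store one cleaned document and return its row id."""
    cur = conn.execute(
        "INSERT INTO contracts (counterparty, source_format, signed_year, body_text) "
        "VALUES (?, ?, ?, ?)",
        (counterparty, source_format, signed_year, prepare(raw_text)),
    )
    return cur.lastrowid


def search(conn, keyword):
    """Keyword search over stored text (SQLite LIKE ignores ASCII case)."""
    rows = conn.execute(
        "SELECT counterparty FROM contracts "
        "WHERE body_text LIKE '%' || ? || '%' ORDER BY counterparty",
        (keyword,),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# Toy documents standing in for text gathered during the 'identify' step.
persist(conn, "Acme Corp", "pdf", 2015, "This  LICENSE agreement\n grants ...")
persist(conn, "Globex", "doc", 2016, "Mutual non-disclosure agreement ...")
print(search(conn, "license"))
```

Once the text is persisted, the keyword search that was impossible against image-only files becomes a one-line query, and the metadata fields (counterparty, year) become available for other analyses.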
For some legal use cases, simply achieving data availability is sufficient, especially when a user has the ability to combine diverse information from multiple database locations. For example, consider a scenario in which your company is considering an acquisition of three companies and is interested in determining the legal risk that each company faces from patent lawsuits. To perform a quick evaluation of patent risk, an analyst might first want to determine the number of past litigations that each of the three companies has faced, along with revenue information about the plaintiffs in each case. Imagine that the database contains a table with corporate revenue information called financial_data, and a table with comprehensive information about patent litigations called patent_litigation_data. Since both contain a field for company name, an analyst could write a SQL query combining information in both tables. For example, a logical SQL query might be:

    SELECT pld.plaintiff, fd.revenue, pld.defendant, pld.year
    FROM patent_litigation_data pld
    LEFT JOIN financial_data fd
      ON pld.plaintiff = fd.company AND pld.year = fd.year
    WHERE pld.defendant = 'Name of Company of Interest';

This would return a result set containing a row for every litigation against the company of interest, as well as the year of the assertion, the name of the plaintiff, and the revenue of the plaintiff in the year of assertion. With this data, the analyst could make illustrative charts and figures, provide a descriptive narrative, or simply pass along the raw data to inform the analysis and decision-making process.

3 A database “field” is identical to a column in a spreadsheet.

Visualization

Although SQL provides an immensely powerful interface to leverage data in a database, it is not always practical. For example, when multiple users are regularly interested in the output of similar SQL queries, or if the organization does not include a sufficient number of query experts, dashboards or other data visualizations are preferable. Legal departments generally satisfy both conditions and are able to best leverage their persisted data through dashboards.
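When several users need the output of similar queries, a common first step toward a dashboard is to wrap the query in a parameterized function. The sketch below does this for the litigation join described above, with table aliases made explicit. It runs against an in-memory SQLite database. The company names and revenue figures are made up for illustration, and the column types are assumptions.

```python
import sqlite3


def litigation_history(conn, defendant):
    """Return (plaintiff, revenue, defendant, year) rows for one defendant."""
    query = """
        SELECT pld.plaintiff, fd.revenue, pld.defendant, pld.year
        FROM patent_litigation_data pld
        LEFT JOIN financial_data fd
          ON pld.plaintiff = fd.company AND pld.year = fd.year
        WHERE pld.defendant = ?
        ORDER BY pld.year, pld.plaintiff
    """
    return conn.execute(query, (defendant,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE financial_data (company TEXT, year INTEGER, revenue REAL);
    CREATE TABLE patent_litigation_data (plaintiff TEXT, defendant TEXT, year INTEGER);
""")
# Made-up rows for illustration only.
conn.executemany(
    "INSERT INTO financial_data VALUES (?, ?, ?)",
    [("PlaintiffCo", 2014, 5.0e6), ("PlaintiffCo", 2015, 7.5e6)],
)
conn.executemany(
    "INSERT INTO patent_litigation_data VALUES (?, ?, ?)",
    [
        ("PlaintiffCo", "TargetA", 2015),
        ("UnknownLLC", "TargetA", 2016),
        ("PlaintiffCo", "TargetB", 2014),
    ],
)

# One call per acquisition target replaces hand-editing the SQL each time.
for target in ("TargetA", "TargetB"):
    print(target, litigation_history(conn, target))
```

Because the join is a LEFT JOIN, a litigation whose plaintiff has no matching revenue record is still returned, with the revenue reported as missing (None in Python) rather than the row being dropped.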
The last few years have seen an explosion in enterprise-grade dashboard tools, including offerings from tech conglomerates like IBM, Oracle, and Microsoft, and dedicated business intelligence (BI) companies like Tableau and MicroStrategy. Irrespective of the vendor, properly constructed dashboards enable users to easily interact with large amounts of complex data without having to write SQL or other types of customized code. Moreover, the data and resulting aggregations are presented through graphics that often lead to more easily interpretable insights than purely numerical results. Although using dashboards to gain insights and efficiency from data does not require strong technical expertise, building the dashboard often does. Large legal departments might have the luxury of a full-time dashboard engineer, but other smaller departments may wish to leverage other non-legal engineers or a consulting firm to build and maintain visualizations. Most commercial SaaS applications also provide users with dashboard environments to interact with the underlying data.4 While many SaaS dashboards provide standard data manipulation flexibility, some incorporate sophisticated machine-learning algorithms to find relationships within the data that humans might not be able to discern themselves. For example, most patent-related SaaS applications implement unsupervised clustering algorithms to help visualize technology areas contained in a patent portfolio. Figure 2 shows a citation graph for a set of patents with colors representing different clusters created using graph segmentation. Although this and other examples incorporate complex machine learning, it is important to note that the output is generally used to further inform the decision-making process, rather than to fully automate it.

4 Some SaaS applications also provide a query interface to allow users to write customized queries.

Figure 2

Automation

Full automation is the final step in any data analytics ecosystem. For some industries, the complexities of the use cases and their inherent subjectivities mean that decisions will never be fully automated. Many legal use cases fall into this category; however, this does not mean that predictive algorithms cannot produce tremendous data-driven efficiencies short of complete automation.

Machine learning is a subfield of computer science that involves the use of statistical and mathematical functions to learn from and make predictions on data. It is the core technology underlying all automated systems.5 The machine learning community has made rapid advancements, with large-scale proprietary systems like IBM Watson pushing the boundaries of machine learning and even venturing into legal applications. Moreover, the quality and volume of open source machine learning tools are also increasing rapidly, with powerful offerings including Scikit-learn, Natural Language Toolkit, TensorFlow, SparkML, Distributed Machine Learning Toolkit, and others. These packages allow data scientists to easily implement a host of machine learning algorithms on complicated data structures:

• Unsupervised clustering
  o K-means
  o Hierarchical methods
• Natural Language Processing (NLP)
  o Word2Vec
  o Latent Dirichlet Allocation
• Supervised algorithms
  o Generalized Linear Models
  o Random Forests
  o Support Vector Machines
• Neural network-based approaches
  o Convolutional Neural Networks
  o Deep Learning
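Of the algorithms listed above, K-means is simple enough to sketch without any library. The following is a plain-Python illustration of the standard assign-and-update loop, not the implementation used by Scikit-learn or any commercial tool. The two-dimensional points are hypothetical stand-ins for document feature vectors, and the starting centroids are fixed by hand so that the result is reproducible.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def kmeans(points, initial_centroids, iterations=10):
    """Plain K-means on 2-D points, starting from the given centroids."""
    centroids = list(initial_centroids)
    labels = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return labels, centroids


# Hypothetical feature vectors: two visibly separate groups of documents.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
labels, centroids = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
print(labels)
```

In practice the feature vectors would have many more dimensions (for example, word frequencies from patent text), and a library implementation would handle initialization and convergence checks.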

5 Machine learning is also referred to under the more abstract category of artificial intelligence (AI).

Some of these algorithms are already being used in commercial legal applications today. For example, a number of patent analytics SaaS tools utilize sophisticated NLP and unsupervised clustering algorithms to generate automated patent landscapes and use supervised algorithms – like logistic regressions – to predict patent quality.

SECTION 2: SURVEY OF LEGAL APPLICATIONS

In the past few years, the legal industry has experienced rapid improvements in third party applications for data analytics. There now exists a profusion of useful data and systems covering many aspects of legal practice. The increasing degree of data availability and visualizations via third party applications has enabled in-house counsel to apply increasing scrutiny to common attorney behavior, saving time and money.

Outside Counsel Selection

When an operating company (OpCo) encounters a new lawsuit, the legal department must quickly select outside counsel to handle the matter. Typically, in-house counsel sends out a request for proposal to a selection of law firms, and those firms pitch their services, touting their win percentage and innovative approach to litigation. In reality, most law firms’ win percentages are misleading, and their approach is no more innovative than any other firm’s. What’s even worse is that most legal departments base their selection on attributes with little or no correlation to performance.6

With comprehensive litigation data available from the Public Access to Court Electronic Records system (PACER), legal departments can query a litany of factors relevant to firm performance.7 What is the firm’s litigation success rate? What types of litigations has the law firm handled in the past and how many in the same venue or before the same judge? What is the firm’s success rate in front of that judge and compared to other firms? In-house counsel can quickly answer these questions by utilizing various SaaS applications with established data availability. By using dashboard interfaces that summarize and interpret the data, legal departments can replace data-blind approaches with more data-informed processes.

The same approach can be applied to selecting outside counsel for patent prosecution. The US Patent Office’s Patent Application Information Retrieval system (PAIR) contains digitized prosecution histories of millions of patents and applications.8 Third party applications provide dashboards that allow in-house counsel to identify the success rates of various firms prosecuting patents at the art unit (technology area) and examiner level. When a company begins to file patent applications in a new technology area, its legal department can select from a variety of performance metrics to compile a detailed comparison of outside counsel.9 Figure 3 is a portion of an outside counsel report from PatentAdvisor that displays numerous performance metrics for a particular patent law firm across several art units.

6 Many in-house counsel select outside counsel based on subjective factors – e.g., nephew’s law firm, members of your golfing foursome, drinking buddies, etc.
7 Other litigation data providers include Docket Navigator, Lex Machina, Darts IP, and Innography.
8 Other prosecution data providers include PatentAdvisor, Juristat, and Twin Dolphin.
9 The user can define patent prosecution success in several ways, including the average number of office actions to allowance, the overall allowance rate, or some other set of measurable factors.

Figure 3

Outside Counsel Report

Art Unit | App Count | Average No. of Office Actions between Filing Date and Patent Issuance | Allowance Rate | % with at least One RCE before Patent Issuance | % where Applicant Filed Notice of Appeal before Patent Issuance
3763 | 523 | 2 | 67.2% (6722/10002) | 33.8% | 7.5%
1624 | 147 | 1.6 | 67.3% (11412/16956) | 18.4% | 6.5%
1625 | 142 | 1.3 | 68.1% (11638/17086) | 14.8% | 5.6%
1626 | 119 | 1.2 | 71% (12167/17146) | 15.9% | 3.6%
1645 | 100 | 2.1 | 49.5% (3779/7629) | 32.8% | 12.8%
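The allowance rates in Figure 3 can be reproduced from the counts printed in parentheses. The short sketch below does that arithmetic and ranks the art units. It assumes that each parenthesized pair is allowed applications over total applications, which the report excerpt does not state explicitly; the variable and function names are our own.

```python
# (art unit, first count, second count) as printed in Figure 3.
# Assumption: the pair is allowed applications / total applications.
FIGURE_3 = [
    ("3763", 6722, 10002),
    ("1624", 11412, 16956),
    ("1625", 11638, 17086),
    ("1626", 12167, 17146),
    ("1645", 3779, 7629),
]


def allowance_rate(allowed, total):
    """Allowance rate as a percentage, rounded to one decimal place."""
    return round(100.0 * allowed / total, 1)


rates = {unit: allowance_rate(allowed, total) for unit, allowed, total in FIGURE_3}

# Art units ordered from highest to lowest allowance rate.
ranked = sorted(rates, key=rates.get, reverse=True)

for unit in ranked:
    print(unit, rates[unit])
```

Under that assumption, the computed percentages match the figures shown in the report, and the same ranking could be applied to any of the other metrics that footnote 9 mentions as definitions of prosecution success.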

Patent Portfolio Development

The traditional prosecution methodology relies solely on technical and legal arguments between the applicant and examiner. An examiner rejects an application on the grounds that the subject matter is either not novel or not patentable. In turn, patent counsel analyzes the strength of the examiner’s rejections and responds with legal arguments based on why the technical subject matter is novel or patentable. However, arguing past an examiner and convincing her to adopt a contrary opinion is difficult.

Besides relying on legal arguments, patent practitioners can now rely on PAIR data to inform their decisions. PAIR data describes examiner behavior in response to patent counsel actions. Patent counsel can use this data to inform how they should respond to examiners in order to achieve the best outcome for a patent application. A variety of data providers aggregate and analyze PAIR data to generate a wealth of statistics on examiner behavior. Figure 4 is a portion of an examiner report from PatentAdvisor that provides statistics about a particular examiner’s overall practice, and the percentage of each type of rejection.

Figure 4


Aggregated examiner statistics include the examiner’s allowance rates, the examiner’s behavior after the applicant files an appeal, how the examiner responds to interviews, and average timing when issuing office actions. Each of these is a valuable data point that allows patent counsel to identify which of the many prosecution strategies she should employ when interfacing with a certain examiner. For example, if the data shows that a particular examiner is more likely to allow an application after the applicant conducts an interview with the examiner, patent counsel should make sure to request an interview with the examiner. Alternatively, the data may indicate that an examiner has an extremely low allowance rate until an appeal is made to the Patent Trial and Appeal Board. Patent counsel should consider this data in deciding whether to invest the money to file an appeal.

Similarly, PAIR data can inform overall patent portfolio strategy. Instead of looking at data on a case-by-case basis to drive individual prosecution decisions, PAIR data can be analyzed at the portfolio level. For example, patent counsel can use dashboards to filter a large portfolio and see lagging cases – i.e., pending applications that already have multiple requests for continued examination (RCEs) or cases that are assigned to an examiner with an extremely low allowance rate. By identifying such cases, patent counsel can avoid wasting additional investments of time and money by choosing not to file another RCE or by abandoning the case altogether.

Litigation Strategy

Data analytics can also inform litigation strategy. Legal departments can examine litigation data to understand the likelihood of a favorable outcome and formulate a strategy based on expected outcomes throughout the case. Before such litigation data was available, the legal department could at best rely on qualitative indicators about opposing counsel or the judge assigned to the case.
With the availability of robust litigation data, the legal department can quantitatively assess its probability of success by examining historical litigation data. What were the outcomes in similar types of cases? What were the average damage awards in those cases? Based on the outcome data, should you consider transferring to another venue? Does the judge tend to stay cases if the defendant challenges the patent-in-suit before the Patent Office? Does opposing counsel consistently employ certain tactics in the course of litigation? If so, perhaps there is an opportunity to prepare for those attacks by proactively addressing arguments or establishing a record to position a strong counter-attack. How often does the judge grant motions for summary judgment of invalidity and which arguments have been most successful? By reviewing the court’s track record, legal departments can extract important insights from the data in order to prepare positions with the highest likelihood of success.

Budget Management

In addition to legal practice areas, legal departments devote significant time and resources to managing their spend. The finance department relies on every organization within the company – engineering, sales, marketing, human resources, etc. – to manage its budget. The ability of each organization to meet its budget and provide accurate forecasts is critical to the company’s ability to meet its quarterly numbers and meet guidance. The legal department is no exception and is often a source of great expense.

The problem is that for many legal practice areas like patents or litigation, the unpredictable nature of legal proceedings makes it particularly challenging to track or predict costs. In patent prosecution, some applications are resolved quickly and others drag on for years through appeals. The timing of costs is also unpredictable because it is largely dictated by the examiner assigned to each case. Litigation is no different.
Some cases take a few months to reach a settlement, and others take many years through trials and appeals. The timing of litigation expenses is often dictated by the judge handling each matter. Because of the unpredictability in examiner and judge behavior, most outside counsel are unable to deliver accurate forecasts. As a result, the finance department is unable to reconcile accruals with invoices, and budgets are blown on a regular basis.

The first step in leveraging data analytics for budget management is to establish the availability of requisite data for spend analysis. This includes the company’s internal billing and invoice information, along with data for the relevant practice area – in this case, PAIR and PACER data. PAIR data reduces uncertainty about the timing of cost-incurring prosecution events. For example, it is certain that outside counsel will incur cost by drafting and filing a response to a first office action, but the time that it takes to issue a first office action may differ wildly from examiner to examiner. Estimating each examiner’s expected behavior from past performance provides much greater precision and predictability about when the cost will actually hit the budget. Similarly, PACER data can provide the average time that it takes a particular court or judge to reach critical milestones in a case. Figure 5 is a litigation milestones dashboard from Docket Navigator that shows average timelines for different types of dispositions in the Eastern District of Texas. Legal departments should incorporate these court-specific timelines into their budgets. By leveraging data analytics, legal departments can predict costs and build budgets with a higher degree of confidence.
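As a rough illustration of the examiner-based timing idea above, the sketch below shifts each application's filing date by its examiner's historical average time to a first office action and totals the expected response cost per calendar quarter. The examiner names, averages, filing dates, and the flat response cost are all hypothetical; real inputs would come from PAIR data and the company's own invoices.

```python
from collections import defaultdict
from datetime import date


def add_months(start, months):
    """Shift a date forward by whole months; returns the 1st of that month."""
    index = start.year * 12 + (start.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def quarter(d):
    """Label a date with its calendar quarter, e.g. '2016-Q4'."""
    return f"{d.year}-Q{(d.month - 1) // 3 + 1}"


def forecast(applications, avg_months_to_first_action, response_cost):
    """Sum the expected first-response cost per calendar quarter."""
    budget = defaultdict(float)
    for filing_date, examiner in applications:
        expected = add_months(filing_date, avg_months_to_first_action[examiner])
        budget[quarter(expected)] += response_cost
    return dict(budget)


# Hypothetical historical averages, in months from filing to first office action.
avg_months = {"Examiner A": 9, "Examiner B": 20}
# Hypothetical pending applications: (filing date, assigned examiner).
pending = [
    (date(2016, 1, 15), "Examiner A"),
    (date(2016, 3, 1), "Examiner A"),
    (date(2016, 2, 10), "Examiner B"),
]
budget = forecast(pending, avg_months, response_cost=3000.0)
print(budget)
```

A single average is the simplest possible estimate. A more careful forecast would use the spread of each examiner's historical timings, and the same structure applies to court-specific milestone timelines drawn from PACER data.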

Figure 5


SECTION 3: THE FUTURE OF THE LEGAL INDUSTRY

The data landscape in the legal industry is evolving at a rapid pace, and the rate of evolution is likely to accelerate moving forward. The availability and aggregation of data are well established for an assortment of legal practice areas, and there are many opportunities (as described above) for legal departments to leverage data analytics to inform and influence their decision-making process. Based on observations from other industries, we envision three major trends in the next five years.

Breaking down data silos

The increasing number of third party applications means that data availability is steadily increasing. However, the data that these applications are built on tends to be siloed, which creates barriers to simultaneously analyzing data from multiple applications. For example, consider the various tools that a legal department might use to manage patent strategy: patent ownership (Thomson Innovation); patent litigations and validity challenges (Docket Navigator); patent transfers (Patent Office Assignment Database); patent family and foreign patent relationships (LexisNexis TotalPatent); and a variety of characterization data about the strength, breadth, and relevant technology of each patent (Innography). The ability to run queries and build dashboards across all of this data can provide counsel with valuable answers to complex questions, such as: Which companies are purchasing or divesting patents that relate to your company’s business? Which patents relate to a technology area that your company is interested in entering? Which technology areas have the highest volume of litigation in the US and other jurisdictions? Which of your patents are relevant to competitor products in the US and worldwide?

In order to leverage data across applications, legal departments have two main options.
First, they can request API access or regular bulk downloads of the data used by each of their third party applications, and then persist this data in a comprehensive in-house database. This would provide the legal department with the flexibility to run queries and build its own internal dashboards on top of all of the siloed application data. Second, legal departments could choose to wait. The legal applications landscape is undergoing considerable consolidation, with recent mergers and acquisitions including LexisNexis’ purchase of Lex Machina, and CPA Global’s acquisition of Innography. Customer demand will drive further consolidation, and this will continue to break down data silos and enable more comprehensive analytics.

Legal department composition

The structure of most corporate legal departments has not changed in the last three decades. Senior leadership manages in-house counsel who are supported by paralegals and legal assistants. Senior leadership and in-house counsel are all attorneys by training, and paralegals have similar albeit less formal legal training. Some larger companies may also have technical specialists, contract managers, or non-attorney patent agents. Sophisticated legal departments might receive data analytics support from IT or other departments within the organization, but it is quite rare to have dedicated technical personnel who are familiar with data analytics. Very few legal departments have a full-time, dedicated database engineer, data scientist, or product manager.

The amount of data-driven efficiency that a legal department will be able to achieve will depend on the presence and experience of data professionals in the department. As in other industries, some members of the old guard will feel threatened by the introduction of new colleagues without law degrees who introduce new processes and offer opinions about legal decisions.
However, the value of data analytics has been conclusively demonstrated by early adopting industries. Legal departments will no longer be able to deny the value of having data professionals in their organization.

Automation

Many legal departments have found increased efficiency for many use cases by leveraging data availability and visualization. However, for all of these use cases, the data only contributes to the human decision-making process. There are few, if any, current legal scenarios for which decisions are made in a fully automated process, but fully automated systems have led to tremendous gains in other domains. Consider two examples. First, tens of billions of advertisements are viewed online every day. Virtually every advertisement on every website is chosen through a fully automated process in which sophisticated algorithms analyze past user behavior to choose which advertisement out of millions of options is most likely to be clicked on by a given user. Second, in online journalism, stalwarts like the Associated Press are increasingly leveraging machine learning to fully automate the process of writing financial articles.10 Already, a large percentage of financial news stories are written by algorithms, with no human intervention. In both use cases, fully automated systems have led to tremendous gains in efficiency.

There is no reason to expect that similar degrees of automation will not make their way to legal applications in the near future, especially around use cases with output that is structured according to clear guidelines. Many legal documents lend themselves well to machine learning applications. Contracts, license agreements, pleadings, and briefs all contain boilerplate language in highly structured formats that are no more complex than financial documents.

Advanced data analytics could also be used to fully automate patent processes. Consider the process of converting an invention into a patent application.
Typically, patent counsel will decide whether to file a patent application after evaluating an incoming invention based on the prior art and a set of criteria that is specific to the legal department’s objectives. This evaluation could be fully automated by applying the advanced prior art searching technology that already exists and teaching the machine to identify specific characteristics in the invention submission. Similar automation could be realized in the course of patent prosecution, where the response to an examiner’s office action is formulaic in nature.11 A computer could use PAIR data to generate an optimal response to a particular examiner by reviewing all of the responses submitted to the same examiner and employing the most successful arguments.

10 The Associated Press partnered with Automated Insights in 2014 to automate quarterly earnings reports. http://www.theverge.com/2015/1/29/7939067/ap-journalism-automation-robots-financial-reporting
11 Many companies have outsourced this work to offshore patent specialists.

CONCLUSION

The Oakland Athletics leveraged data to optimize cost efficiency beyond that of any other baseball team in the league, but it took many years for the sport to embrace data-driven decisions. Despite having an abundance of player performance data, cigar-smoking baseball scouts continued to make decisions with their gut. Today, they are all extinct. They have been replaced with statisticians and data scientists, and nearly every major sport in the world now leverages big data to drive its roster and playbook decisions.

In the same way, data analytics is poised to have a profound impact on the legal industry. There are many third party providers who offer access to rich data sources and sophisticated tools to interpret and analyze the data. By mining and visualizing the data, legal departments can observe trends and insights that provoke questions about many aspects of legal management. These questions will help generate actionable insights for many practice areas and identify data-driven decisions that will yield optimal results. Some legal departments are already leveraging data analytics to achieve better results. How does your legal department’s cost per win compare with others? The future of legal has arrived.

***

Jeremiah Chan is Legal Director on Google’s Global Patents Team, leading a team that is responsible for patent operations, analytics, and strategic initiatives.

Jay Yonamine is Senior Data Scientist on Google’s Global Patents Team, where he works on optimizing database architecture and machine learning applications for legal use cases.

Nigel Hsu is Head of Patent Operations for Verily Life Sciences, where he oversees process, analytics, and legal operations.

Google Inc. is a global technology company specializing in Internet-related services and products. Its mission is to organize the world’s information and make it universally accessible and useful.

Verily Life Sciences, LLC, formerly known as Google Life Sciences, is a company focused on bringing together technology and life sciences to uncover new truths about health and disease.