Applied Natural Language Processing for Law Practice

APPLIED NATURAL LANGUAGE PROCESSING FOR LAW PRACTICE BRIAN S. HANEY INTRODUCTION ................................................................................................................................. 1 I. NATURAL LANGUAGE PROCESSING ....................................................................................... 4 II. PREPROCESSING ...................................................................................................................... 7 A. Text Corpora .................................................................................................................... 8 B. Vector Space .................................................................................................................. 10 III. MODELS ......................................................................................................................... 12 A. Artificial Neural Networks ............................................................................................. 12 B. Reinforcement Learning ................................................................................................. 16 C. Transformer ................................................................................................................... 20 IV. APPLICATIONS IN LAW ................................................................................................... 22 A. Question Answering ....................................................................................................... 23 B. Document Review ........................................................................................................... 25 C. Legal Writing ................................................................................................................. 28 V. ETHICS .................................................................................................................................. 32 A. Professional Responsibility ............................................................................................ 32 B. Access to Justice ............................................................................................................. 35 C. Automated Labor ............................................................................................................ 38 CONCLUSION ................................................................................................................................... 42 APPENDIX A. SUMMARY OF NOTATION ......................................................................................... 43 IPTF] Applied Natural Language Processing for Law Practice 1 APPLIED NATURAL LANGUAGE PROCESSING FOR LAW PRACTICE BRIAN S. HANEY* Abstract: Scholars, lawyers, and commentators are predicting the end of the legal profession, citing specific examples of artificial intelligence (AI) systems out-performing lawyers in certain legal tasks. Yet, technology’s role in the practice of law is nothing new. The Internet, email, and databases like Westlaw and Lexis have Been altering legal practice for decades. Despite technology’s evolution across other industries, in many ways the practice of law remains static in its essential functions. The dynamics of legal technology are defined By the organization and quality of data, rather than innovation. This Article explores the state of the art in AI applications in law practice, offering three main contributions to legal scholarship. First, this Article explores various methods of natural language database generation and normalization. Second, this Article provides the first analysis of two types of machine learning models in law practice, deep reinforcement learning and the Transformer. Third, this Article introduces a novel natural language processing algorithm for legal writing. INTRODUCTION Since its inception at the Big Bang, the Universe has been expanding.1 Similarly, since the dawn of communication, so too has the universe of language been expanding.2 Indeed, similar to the way in which entropy guides the Universe from order to disorder,3 time drives the expansion of language.4 As the Bible tells the story, there was once a time at which the whole world had a common language.5 As a result, humans were incredibly powerful, deciding to build a bridge to the heavens called the Tower of © 2020, Brian S. Haney. All rights reserved. * B.A. Washington & Jefferson College. J.D. Notre Dame Law School. Thanks to Angela Elias, Broderick Haney, Brad Haney, Leslie Kaelbling, and Branden Keck for the helpful comments, suggestions and feedback. 1 STEPHEN HAWKING, A BRIEF HISTORY OF TIME 151 (1996). 2 ZOLTAN TOREY, THE CONSCIOUS MIND 51 (2014). 3 BRIAN GREENE, FABRIC OF THE COSMOS 151 (2005); see also Frederic H. Behr, Jr. et al., Estimating and Comparing Entropy Across Written Natural Languages Using PPM Compression, INSTITUTE DE RECHERCHE EN INFORMATIQUE FONDAMENTALE 1, https://www.irif.fr/~dxiao/docs/entropy.pdf (last visited May 8, 2020). 4 See Daniel Martin Katz et al., Legal N-Grams? A Simple Approach to Track the ‘Evolution’ of Legal Language (Dec. 16, 2011), https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1971953. 5 Genesis 11:1. 2 Intellectual Property & Technology Forum at Boston College Law School [2020 Babel.6 But then God said, “[i]f as one people speaking the same language, they have begun to do this, then nothing they plan to do will be impossible for them. Come, let us go down and confuse their language so they will not understand each other.”7 So, God scattered the people’s language across the Earth, stopping the Tower of Babel’s construction.8 As the late philosopher, Zoltan Torey described, “[t]he epigenesis of our highly articulated human language is a fascinating story.”9 But, despite its entropic diffusion across time, Torey argues, “[l]anguage need not be a cognitive trap but can become a liberating passport to ever-deepening insights into the world and conscious mind itself.”10 Interestingly, the intersection of the conscious mind and language is the heart of natural language processing (NLP) studies, a sub-field of artificial intelligence.11 In fact, mastery of language is thought by many to be one of the most difficult tasks for computers to conquer.12 For example, machine learning scholar Ethem Alpaydin argues the driving force of computing technology is the realization that every piece of information can be represented as numbers.13 It follows logically that all information can be processed with computers. The divides between syntax and semantics, however, —manifested in a computer’s inability to understand human language—remains one of the most challenging problems in artificial intelligence. Generally, Artificial Intelligence (AI) is any system replicating the thoughtful processes associated with the human mind.14 In fact, machine learning pioneer Paul John Werbos argued from an engineering point of view, the human brain itself is simply a computer—an information processing system.15 Werbos further argued, the function of any computer as a whole system is to compute its outputs.16 And, many thinkers throughout history have argued the human mind is a machine learning system.17 Indeed, AI scholar Murray Shanahan explains, “[a] person’s browser history and buying 6 See id. 11:4–5. 7 Id. 11:6–7. 8 Id. 11:9. 9 TOREY, supra note 2, at 47 (epigenesis describes a theory of development through gradual differentiation). 10 Id. at 105–106. 11 See Noam Chomsky, Language and Nature, 104 MIND 1, 2 (1995). 12 MAX TEGMARK, LIFE 3.0: BEING HUMAN IN THE AGE OF ARTIFICIAL INTELLIGENCE 90–91 (2017). 13 ETHEM ALPAYDIN, MACHINE LEARNING 2 (2016). 14 Brian S. Haney, The Perils and Promises of Artificial General Intelligence, 45 J. LEGIS. 151, 152 (2018). 15 PAUL JOHN WERBOS, THE ROOTS OF BACKPROPAGATION FROM ORDERED DERIVATIVES TO NEURAL NETWORKS AND POLITICAL FORECASTING 305 (1994). 16 Id. 17 Id. at 307. IPTF] Applied Natural Language Processing for Law Practice 3 habits, together with their personal information, are enough for machine learning algorithms to predict what they’ll buy and how much they’ll pay for it.”18 As a result of increasing advancements in AI and NLP technologies, University of Pittsburgh Professor of Law, Kevin Ashley argues, “[a]rtificial Intelligence & Law is a research field that is about to experience a revolution.”19 Ashley is not alone. Scholars, lawyers, and commentators alike are now predicting the end of the legal profession, citing specific examples of computers successfully performing lawyers’ jobs and solving the age-old problems associated with access to justice.20 The impact of technology on legal practice, however, is nothing new.21 The Internet, email, and legal research databases like Westlaw have been impacting legal practice for decades.22 Yet, technology continues to fail in solving problems relating to access to justice.23 Indeed, the impacts technology will have on law practice and problems like access to justice, depend on a variety of factors including, institutional barriers and technological capabilities. Today, NLP is the most commonly used method of AI in the practice of law. This Article proceeds in five parts. Part I explains natural language processing and its evolution as a field of study from its inception to current state.24 Part II explains the preprocessing phase of NLP tasks, which generally includes methods of data gathering, organization, and modeling.25 Part III explores various machine learning models at the heart of contemporary machine learning research.26 Part IV discusses

Applied Natural Language Processing for Law Practice

ARCHITECTS of INTELLIGENCE for Xiaoxiao, Elaine, Colin, and Tristan ARCHITECTS of INTELLIGENCE

Race in Academia

The Time for a National Study on AI Is NOW

Arxiv:1202.4545V2 [Physics.Hist-Ph] 23 Aug 2012

Download Global Catastrophic Risks 2020

Dissolving the Fermi Paradox), and Quite Likely Once We Account for the Fermi Observation

A R E S E a R C H Agenda

Artificial Intelligence

Global Catastrophic Risks 2017 INTRODUCTION

Research Priorities for Robust and Beneficial Artificial Intelligence

Artificial Superintelligence and Its Limits: Why Alphazero Cannot Become a General Agent

Intelligence Unleashed an Argument for AI in Education