Haskell Tutorial and Cookbook
Total Page:16
File Type:pdf, Size:1020Kb
Haskell Tutorial and Cookbook Mark Watson This book is for sale at http://leanpub.com/haskell-cookbook This version was published on 2021-02-14 This is a Leanpub book. Leanpub empowers authors and publishers with the Lean Publishing process. Lean Publishing is the act of publishing an in-progress ebook using lightweight tools and many iterations to get reader feedback, pivot until you have the right book and build traction once you do. © 2016 - 2021 Mark Watson Contents Cover Material, Copyright, and License ................................ 1 Preface ...................................................... 2 Additional Material in the Second Edition .............................. 2 A Request from the Author ....................................... 2 Structure of the Book .......................................... 3 Code Examples .............................................. 4 Functional Programming Requires a Different Mind Set ..................... 4 eBooks Are Living Documents ..................................... 4 Setting Up Your Development Environment ............................ 5 Why Haskell? ............................................... 6 Enjoy Yourself ............................................... 7 Acknowledgements ........................................... 7 Section 1 - Tutorial .............................................. 8 Tutorial on Pure Haskell Programming ................................. 9 Interactive GHCi Shell .......................................... 9 Introduction to Haskell Types ..................................... 17 Functions Are Pure ............................................ 21 Using Parenthesis or the Special $ Character and Operator Precedence . 23 Lazy Evaluation .............................................. 25 Understanding List Comprehensions ................................. 26 Haskell Rules for Indenting Code ................................... 28 Understanding let and where ..................................... 29 Conditional do Expressions and Anonymous Functions ..................... 30 Maps .................................................... 36 Sets ..................................................... 37 More on Functions ............................................ 38 Comments on Dealing With Immutable Data and How to Structure Programs . 40 Error Handling .............................................. 41 Testing Haskell Code ........................................... 42 Pure Haskell Wrap Up .......................................... 45 Tutorial on Impure Haskell Programming ............................... 46 CONTENTS Hello IO () Monad ............................................ 46 A Note About >> and >>= Operators ................................. 49 Console IO Example with Stack Configuration ........................... 51 File IO ................................................... 54 Error Handling in Impure Code .................................... 56 Network IO ................................................ 58 A Haskell Game Loop that Maintains State Functionally ..................... 61 A More Detailed Look at Monads ................................... 63 Using Applicative Operators <$> and <*>: Finding Common Words in Files . 65 List Comprehensions Using the do Notation ............................ 68 Dealing With Time ............................................ 69 Using Debug.Trace ............................................ 70 Wrap Up .................................................. 72 Section 2 - Cookbook ............................................ 73 Text Processing ................................................ 74 CSV Spreadsheet Files .......................................... 74 JSON Data ................................................. 76 Cleaning Natural Language Text ................................... 78 Natural Language Processing Tools ................................... 81 Resolve Entities in Text to DBPedia URIs .............................. 82 Bag of Words Classification Model .................................. 87 Text Summarization ........................................... 92 Part of Speech Tagging ......................................... 94 Natural Language Processing Wrap Up ............................... 98 Linked Data and the Semantic Web ................................... 99 The SPARQL Query Language .....................................100 A Haskell HTTP Based SPARQL Client ...............................101 Querying Remote SPARQL Endpoints ................................103 Linked Data and Semantic Web Wrap Up ..............................106 Web Scraping ................................................. 107 Using the Wreq Library .........................................107 Using the HandsomeSoup Library for Parsing HTML . 111 Web Scraping Wrap Up .........................................113 Using Relational Databases ........................................ 114 Database Access for Sqlite .......................................114 Database Access for Postgres ......................................115 Haskell Program to Play the Blackjack Card Game ......................... 120 CONTENTS Section 3 - Larger Projects ......................................... 131 Knowledge Graph Creator ......................................... 132 Code Layout For the KGCreator Project and strategies for sharing Haskell code between projects ..............................................134 The Main Event: Detecting Entities in Text .............................136 Utility Code for Generating RDF ...................................138 Utility Code for Generating Cypher Input Data for Neo4J . 145 Top Level API Code for Handling Knowledge Graph Data Generation . 150 Wrapup for Automating the Creation of Knowledge Graphs . 152 Hybrid Haskell and Python Natural Language Processing ..................... 153 Example Use of the Haskell NLP Client ...............................153 Setting up the Python NLP Server ...................................153 Understanding the Haskell NLP Client Code ............................154 Wrapup for Using the Python SpaCy NLP Service . 156 Hybrid Haskell and Python For Coreference Resolution ...................... 157 Installing the Python Coreference Server ..............................157 Understanding the Haskell Coreference Client Code . 158 Wrapup for Using the Python Coreference NLP Service . 160 Book Wrap Up ................................................. 161 Appendix A - Haskell Tools Setup .................................... 162 stack .....................................................162 Emacs Setup ................................................163 Do you want more of an IDE-like Development Environment? . 163 hlint .....................................................163 Cover Material, Copyright, and License Copyright 2016 Mark Watson. All rights reserved. This book may be shared using the Creative Commons “share and share alike, no modifications, no commercial reuse” license. This eBook will be updated occasionally so please periodically check the leanpub.com web page for this book¹ for updates. Please visit the author’s website². If you found a copy of this book on the web and find it of value then please consider buying a copy at leanpub.com/haskell-cookbook³ to support the author and fund work for future updates. ¹https://leanpub.com/haskell-cookbook ²http://markwatson.com ³https://leanpub.com/haskell-cookbook Preface This is the preface to the new second edition released summer of 2019. It took me over a year learning Haskell before I became comfortable with the language because I tried to learn too much at once. There are two aspects to Haskell development: writing pure functional code and writing impure code that needs to maintain state and generally deal with the world non- deterministically. I usually find writing pure functional Haskell code to be easy and a lot of fun. Writing impure code is sometimes a different story. This is why I am taking a different approach to teaching you to program in Haskell: we begin techniques for writing concise, easy to read and understand efficient pure Haskell code. I will then show you patterns for writing impure code to deal with file IO, network IO, database access, and web access. You will see that the impure code tends to be (hopefully!) a small part of your application and is isolated in the impure main program and in a few impure helper functions used by the main program. Finally, we will look at a few larger Haskell programs. Additional Material in the Second Edition In addition to updating the introduction to Haskell and tutorial material, I have added a few larger projects to the second edition. The project knowledge_graph_creator helps to automate the process of creating Knowledge Graphs from raw text input and generates data for both the Neo4J open source graph database as well as RDF data for use in semantic web and linked data applications. The project HybridHaskellPythonNlp is a hybrid project: a Python web service that provides access to the SpaCy natural language processing (NLP) library and select NLP deep learning models and a Haskell client for accessing this service. It sometimes makes sense to develop polyglot applications (i.e., applications written in multiple programming languages) to take advantage of language specific libraries and frameworks. We will also use a similar hybrid example HybridHaskellPythonCore- fAnaphoraResolution that uses another deep learning model to replace pronouns in text with the original nouns