Clojure for Beginners

Elango Cheran

June 22, 2013 for Get ClojureI Beginners Elango Cheran

Introduction Setup Overview Preview I Clojure (actually) implemented as a library Language Overview I Need standard (Sun/Oracle) Java 1.6+ - Clojure & Comparisons http://www.oracle.com/technetwork/java/ Tabular comparisons Clojure Code Building javase/downloads/index. Blocks

I Clojure JAR downloads - Clojure Design http://clojure.org/downloads Ideas Conclusion I Can run the REPL (“interpreter”) with Extras java -cp clojure-1.6.0.jar clojure.main Cascalog I Try Clojure - online vanilla REPL - http://tryclj.com/ Clojure for Get ClojureII Beginners Elango Cheran

Introduction Setup Overview Preview I Leiningen - de facto build tool - Language http://leiningen.org/ Overview Clojure Basics & I New project - lein new Comparisons Tabular comparisons I Open a REPL - lein repl Clojure Code Building Blocks I The REPL from Leiningen maintains proj. libs Clojure Design (classpath), command history, built-in docs, etc. Ideas Conclusion I So easy that you don’t notice Maven is underneath Extras I Light Table - evolving instant-feedback IDE - Cascalog http://www.lighttable.com/ Clojure for “Traditional” IDEs for ClojureI Beginners Elango Cheran

Introduction Setup Overview I (!) Preview Language I Paredit mode - one unique advtange of Lisp syntax Overview Clojure Basics & I Imbalanced parenthases (& unclosed strings) no longer Comparisons Tabular comparisons possible Clojure Code Building Blocks I Editing code structure as natural as editing code Clojure Design I Integrated REPL, lightweight editor, etc. Ideas

I Get Emacs 24 or later, and install emacs-starter-kit Conclusion I + Counterclockwise Extras Cascalog I “Strict Structural Edit Mode” is steadily replicating Paredit mode

I , IntelliJ, etc. Clojure for “Traditional” IDEs for ClojureII Beginners Elango Cheran

Introduction Setup Overview Preview Shortcuts to learn (and my configurations) Language Overview Clojure Basics & paredit-forward (-M-f), paredit-backward (C-M-b), Comparisons Tabular comparisons paredit-forward-slurp-sexp (C-), Clojure Code Building paredit-forward-barf-sexp (C-), Blocks Clojure Design paredit-backward-slurp-sexp (C-M-), Ideas paredit-backward (C-M-), Conclusion paredit-backward (C-M-b), paredit-backward (C-M-b), Extras Cascalog paredit-split-sexp (M-S), and there’s more . . . Clojure for What This Presentation Covers Beginners Elango Cheran

Introduction Setup Overview Preview Language Overview Clojure Basics & I An introduction to Clojure Comparisons Tabular comparisons Clojure Code Building I A cursory comparison of Java, Clojure, Ruby, and Scala Blocks I Code snippets as needed Clojure Design Ideas

I Explanation of design considerations Conclusion

I Additional resources Extras Cascalog Clojure for Interesting Things Not Covered Beginners Elango Cheran

Introduction Setup Overview Preview Language Overview Clojure Basics & Comparisons Tabular comparisons I ClojureScript Clojure Code Building Blocks I Specific DSLs & frameworks Clojure Design Ideas I Clojure’s concurrency constructs & STM Conclusion

Extras Cascalog Clojure for Overview of Presentation Beginners Elango Cheran

Introduction Setup Overview Preview Language Overview Clojure Basics & Comparisons I Brief intro of Clojure dev tools Tabular comparisons Clojure Code Building I Brief comparison of languages w/ snippets Blocks Clojure Design I Explanation of main Clojure concepts Ideas Conclusion I Hands-on example(s) Extras Cascalog 2. Open, use, and close multiple system resources 3. Filter all lines of a file based on a reg. exp. 4. Read in a line, skip first line, take every 3rd

Clojure for Teasers Beginners Elango Cheran

Introduction Setup Overview Preview Language Overview Clojure Basics & Comparisons 1. Average all numbers in a list Tabular comparisons Clojure Code Building Blocks Clojure Design Ideas

Conclusion

Extras Cascalog 3. Filter all lines of a file based on a reg. exp. 4. Read in a line, skip first line, take every 3rd

Clojure for Teasers Beginners Elango Cheran

Introduction Setup Overview Preview Language Overview Clojure Basics & Comparisons 1. Average all numbers in a list Tabular comparisons Clojure Code Building 2. Open, use, and close multiple system resources Blocks Clojure Design Ideas

Conclusion

Extras Cascalog 4. Read in a line, skip first line, take every 3rd

Clojure for Teasers Beginners Elango Cheran

Introduction Setup Overview Preview Language Overview Clojure Basics & Comparisons 1. Average all numbers in a list Tabular comparisons Clojure Code Building 2. Open, use, and close multiple system resources Blocks Clojure Design 3. Filter all lines of a file based on a reg. exp. Ideas Conclusion

Extras Cascalog Clojure for Teasers Beginners Elango Cheran

Introduction Setup Overview Preview Language Overview Clojure Basics & Comparisons 1. Average all numbers in a list Tabular comparisons Clojure Code Building 2. Open, use, and close multiple system resources Blocks Clojure Design 3. Filter all lines of a file based on a reg. exp. Ideas 4. Read in a line, skip first line, take every 3rd Conclusion Extras Cascalog Clojure for Teaser #1 Beginners Elango Cheran I Idea: Average all numbers in a list Introduction I Java Setup // int[] nums = {8, 6, 7, 5, 3, 0, 9}; Overview Preview float average(int[] nums) { Language float sum = 0.0; Overview Clojure Basics & for(intx : nums) { Comparisons Tabular comparisons sum += x; Clojure Code Building } Blocks Clojure Design return sum / nums.length; Ideas } Conclusion Extras I Clojure Cascalog ; (def nums [8 6 7 5 3 0 9]) (defn average[nums] (/(reduce + nums)(count nums))) I All values in input Java array, etc. must be of same type

I Unless you use an untyped Java collection ...

I ... and pre-emptively cast to float Clojure for Teaser #2I Beginners Elango Cheran

Introduction Setup I Idea: Open, use, and close multiple system resources Overview Preview I Java Language Sockets= new Socket("http://tryclj.com/", 80); Overview Clojure Basics & Comparisons OutputStream fos= new Tabular comparisons Clojure Code Building FileOutputStream("index copy.html"); Blocks PrintWriter out= new PrintWriter(fos); Clojure Design try { Ideas // do stuff... Conclusion } Extras finally { Cascalog out.close(); fos.close(); s.close(); } Clojure for Teaser #2II Beginners Elango Cheran

Introduction Setup Overview I Clojure Preview (with-open [s(Socket. "http://tryclj.com" 80) Language fos(FileOutputStream. Overview Clojure Basics & "index copy.html") Comparisons Tabular comparisons out(PrintWriter. fos)] Clojure Code Building Blocks ;; do stuff Clojure Design ) Ideas Conclusion I The predictable parts: Extras I .close() Cascalog

I Close in reverse order

I A try-catch-finally block for clean I/O usage Clojure for Teaser #3 Beginners Elango Cheran I Idea: Filter all lines of a file based on a reg. exp. Introduction Setup I Java Overview BufferedReader br= new BufferedReader(new Preview Language FileReader(file)); Overview String line; Clojure Basics & Comparisons while ((line = br.readLine()) != null) { Tabular comparisons Clojure Code Building if (line.matches("\\d{3}-\\d{3}-\\d{4}")) { Blocks System.out.println(line); Clojure Design Ideas } } Conclusion Extras br.close(); Cascalog

I Clojure (with-open [br(BufferedReader. (clojure.java.io/reader file))] (doseq [line(line-seq br)] (when(re-matches#" \d{3}-\d{3}-\d{4}" line) (println line)))) Clojure for Teaser #4 Beginners Elango Cheran

Introduction I Idea: Read in a line, skip first line, take every 3rd Setup Overview Preview I Java Language String line; Overview Clojure Basics & int counter = 0; Comparisons Tabular comparisons br.readLine(); // assume not EOF Clojure Code Building while ((line = br.readLine()) != null) { Blocks Clojure Design if (counter % 3 == 0) { Ideas

System.out.println(line); Conclusion

} Extras counter++; Cascalog }

I Clojure (doseq [line(take-nth3(rest(line-seq br)))] (println line)) Clojure for REPL Beginners Elango Cheran

Introduction Setup Overview I REPL = Read-Eval-Print Loop Preview Language I “Interactive interpreter” Overview Clojure Basics & Comparisons I user> 1 Tabular comparisons Clojure Code Building 1 Blocks user> 4.5 Clojure Design Ideas

4.5 Conclusion I Also try 22/7, \e, 10000000000000000000, first, str, +, Extras 2r10101010, "hello", Cascalog 0.000000000000000000000000314, [2 4 8], {"key""value"} Clojure for BindingsI Beginners Elango Cheran

Introduction Setup I “binding” = assigning a value to a symbol Overview Preview I Clojure promotes alternative ways to manage state, and Language “variable” would be misleading Overview Clojure Basics & Comparisons I In general Tabular comparisons Clojure Code Building I Bindings are made at diff. times w..t. compiling (static Blocks / dynamic) Clojure Design Ideas I Bindings are made within a (lexical / dynamic scope) Conclusion Extras I Clojure is dynamic (uses dynamic bindings) Cascalog

I Clojure promotes lexical scoping, allows easy dynamic scoping

I You can “hot swap” live code

I Lexical scope + a function = a closure Clojure for BindingsII Beginners Elango Cheran

I Clojure Introduction Setup user> (def a 3) Overview Preview #'user/a Language user> a Overview Clojure Basics & Comparisons 3 Tabular comparisons Clojure Code Building user> (def b 5) Blocks #'user/b Clojure Design Ideas user> b Conclusion

5 Extras Cascalog I Java inta = 3; a; intb = 5; b; Clojure for BindingsIII Beginners Elango Cheran I Ruby irb(main):001:0> a = 3 Introduction Setup 3 Overview irb(main):002:0> a Preview Language 3 Overview Clojure Basics & irb(main):003:0> b = 3 Comparisons Tabular comparisons Clojure Code Building 3 Blocks irb(main):004:0> b Clojure Design 3 Ideas Conclusion

I Scala Extras scala> val a = 3 Cascalog a: Int = 3

scala> a res10: Int = 3

scala> val b = 5 Clojure for BindingsIV Beginners Elango Cheran

Introduction Setup Overview Preview Language Overview Clojure Basics & b: Int = 5 Comparisons Tabular comparisons Clojure Code Building Blocks scala> b Clojure Design Ideas res11: Int = 5 Conclusion

Extras Cascalog Clojure for Typing Beginners Elango Cheran

I The types of values and how they are resolved Introduction Setup I Through Clojure, still using Java, just differently Overview Preview I Strong typing (like Java, Ruby, Scala; unlike Perl) Language Overview I Type hierarchies, interfaces, etc. Clojure Basics & Comparisons I Types of values are actual Java types. Try: Tabular comparisons Clojure Code Building (class 1) Blocks (class 4.5) Clojure Design (class "yolo") Ideas Conclusion I Dynamic typing (like Perl, Ruby, Scala; unlike Java) Extras I Type checking happens at run-time, not compile-time Cascalog I Optional typing might provide type annotation checking

I Trust in ’s ability to write good code I Benefit is expressive power (ex: macros) I Incremental development via REPL ⇒ less unexpected surprises Clojure for Typing ExamplesI Beginners Elango Cheran

I Clojure Introduction Setup user> (def a "not a Long") Overview #'user/a Preview > Language user (class a) Overview java.lang.String Clojure Basics & Comparisons user> (def a [1 2 3]) ;; no commas! commas Tabular comparisons Clojure Code Building treated like whitespace Blocks #'user/a Clojure Design Ideas user> (class a) clojure.lang.PersistentVector Conclusion Extras I Side note: Clojure has other “container types” (beyond Cascalog just a “variable”) to manage state I Java

I Variables are declared with a type that cannot change I Prevents a lack of clarity on what a symbol represents. . . I . . . but also restricts power of functions, collections, etc. Clojure for Typing ExamplesII Beginners Elango Cheran I Ruby

> a = "not a long" Introduction => "not a long" Setup Overview > a.class Preview => String Language Overview > a = [1, 2, 3] # commas required Clojure Basics & Comparisons => [1, 2, 3] Tabular comparisons Clojure Code Building > a.class Blocks => Array Clojure Design Ideas

I Scala Conclusion

scala> var c = 4.5 Extras c: Double = 4.5 Cascalog

scala> c.getClass res0: java.lang.Class[Double] = double

scala> c = 3.5 c: Double = 3.5 Clojure for Typing ExamplesIII Beginners Elango Cheran

Introduction Setup Overview scala> var c = "not a Long" // re-defining c Preview Language required to store object of diff type Overview Clojure Basics & c: java.lang.String = not a Long Comparisons Tabular comparisons Clojure Code Building scala> val d = Vector(1, 2, 3) Blocks Clojure Design d: scala.collection.immutable.Vector[Int] = Ideas

Vector(1, 2, 3) Conclusion

I A ‘val’ (”value”) in Scala is immutable Extras Cascalog I A ‘var’ (”variable”) is mutable but type is fixed, like Java Clojure for Follow AlongI Beginners Elango Cheran

Introduction Setup Overview 1. Install Leiningen and Light Table Preview 2. At the command line, run Language lein new oakww Overview Clojure Basics & 3. Run a REPL at the command line via Leiningen Comparisons Tabular comparisons I cd oakww Clojure Code Building Blocks I lein repl Clojure Design 4. Now open Light Table Ideas Conclusion I In the “Workspace” tab on the left, choose “Folder” Extras Link at top Cascalog I Select the folder of the Leiningen project we created (lein repl) I Expand to and click the source file (oakww > src > oakww > core.clj) Clojure for Follow AlongII Beginners Elango Cheran

Introduction Setup 5. Enter the following code in both command-line REPL Overview Preview and core.clj open in Light Table Language (class 4.5) Overview Clojure Basics & (class 22/7) Comparisons Tabular comparisons (defa [1 2 3]) Clojure Code Building (classa) Blocks Clojure Design (firsta) Ideas

(resta) Conclusion

(defb "hella") Extras (firstb) Cascalog (restb) (class(firstb)) (class(restb)) Clojure for Follow AlongIII Beginners Elango Cheran

Introduction Setup Overview Preview Language Overview Clojure Basics & Comparisons Tabular comparisons Clojure Code Building 6. In Light Table, in the “Command” tab on the left, Blocks select “Instarepl: Make current editor an Instarepl” Clojure Design Ideas

Conclusion

Extras Cascalog Clojure for Follow AlongIV Beginners Elango Cheran

Introduction Setup 7. Some notes on Light Table (curr. ver.: 0.4.11) Overview Preview I Constant evaluation Language Overview I Instant feedback Clojure Basics & I Works well in some cases (pure / stateless functions, Comparisons Tabular comparisons web, testing) Clojure Code Building Blocks I Not what you want in other cases (stateful fns / I/O, Clojure Design GUI) Ideas

I Standard command-line REPL is the “canonical” REPL Conclusion

I Especially if you have confusion on return vals vs. Extras stdout, etc. Cascalog

I Many people still stick with emacs + nREPL for optimal productivity Clojure for Functions Beginners Elango Cheran

Introduction Setup Overview Preview Language Overview Clojure Basics & Comparisons I Prefix notation - functions go in first position Tabular comparisons Clojure Code Building (defa3) Blocks (defb5) Clojure Design Ideas (+ab) Conclusion (+ab716) Extras Cascalog Clojure for Notes on SyntaxI Beginners Elango Cheran

Introduction Clojure Setup I Overview Preview I Myth: Lisp’s parentheses drown out code Language Overview Clojure Basics & Comparisons Tabular comparisons Clojure Code Building Blocks Clojure Design Ideas

Conclusion

Extras Cascalog Figure: from XKCD

I Well, Common Lisp does have a lot. . . I . . . but Clojure reduces them, uses vector square , too Clojure for Notes on SyntaxII Beginners Elango Cheran I Overall, Clojure has same or less

parens+brackets+braces than many other languages Introduction (less code!) Setup Overview objA.method(b, c, d); Preview ⇓ Language (function a b c d) Overview Clojure Basics & I Using Paredit mode (or equivalent) makes editing easy Comparisons Tabular comparisons and having imbalanced parens difficult Clojure Code Building Blocks Clojure Design Ideas

Conclusion

Extras Cascalog Figure: from XKCD

I Commas are whitespace

I Useful for macros I Java

I There is a lot of code Clojure for Notes on SyntaxIII Beginners Elango Cheran

Introduction Setup Overview Preview Language Overview Clojure Basics & Comparisons Tabular comparisons Clojure Code Building Blocks Clojure Design Ideas

Conclusion

Extras Cascalog

Figure: from Bonkers World

I Ruby

I fn call parens can be omitted when the result is not ambiguous Clojure for Notes on SyntaxIV Beginners Elango Cheran

I semicolon optional at end of the line Introduction Setup Overview > def add two(x) Preview > x + 2 Language Overview > end Clojure Basics & Comparisons => nil Tabular comparisons Clojure Code Building > add two 6 Blocks Clojure Design => 8 Ideas

I Scala Conclusion

I Type declarations go after a variable / function name, Extras not in front Cascalog

I Omissible when type can be inferred

I fn call parens can be omitted when the result is not ambiguous

I Semicolon optional at end of line Clojure for Data StructuresI Beginners Elango Cheran

Introduction Setup Overview Preview I 4 basic data structures with literal support in Clojure: Language lists, vectors, maps, sets Overview Clojure Basics & I List: (1 1 2 3) Comparisons Tabular comparisons I Clojure Code Building Vector: [1 1 2 3] Blocks I Set: #{1 2 3} Clojure Design Ideas I Map: {"eins" 1, "zwei" 2, "drei" 3 } Conclusion A lot of data can be represented through composites of I Extras these Cascalog

I Functions are executed through lists (fn is in first position) Clojure for Data StructuresII Beginners Elango Cheran I Clojure

(defl(list1123)) Introduction l Setup Overview (defv [1 1 2 3]) Preview v Language Overview (defs# {1 2 3}) Clojure Basics & Comparisons s Tabular comparisons Clojure Code Building (defm {"eins" 1, "zwei" 2, "drei"3 }) Blocks m Clojure Design Ideas

I Java Conclusion

// omitting plain arrays Extras import java.util.List; Cascalog import java.util.ArrayList; Listl= new ArrayList(); l.add(1); // only with auto-boxing starting in Java 1.5 aka 5 l.add(1); l.add(2); l.add(3); Clojure for Data StructuresIII Beginners System.out.println(l); Elango Cheran // [1, 1, 2, 3] Introduction Setup ArrayListv= new ArrayList(); // ArrayList Overview replaced Vector in Java 1.2 Preview Language Overview import java.util.Set; Clojure Basics & Comparisons import java.util.HashSet; Tabular comparisons Clojure Code Building Sets= new HashSet(); Blocks set.add(1); Clojure Design Ideas set.add(2); set.add(3); Conclusion Extras System.out.println(s); Cascalog // [1, 2, 3]

import java.util.Map; import java.util.HashMap; Mapm= new HashMap(); m.put("eins", 1); m.put("zwei", 2); Clojure for Data StructuresIV Beginners Elango Cheran m.put("drei", 3); System.out.println(m); Introduction Setup // {zwei=2, drei=3, eins=1} Overview Preview I Ruby Language v = [1, 2, 3] Overview Clojure Basics & v Comparisons Tabular comparisons s = Set.new([1, 2]) Clojure Code Building Blocks s Clojure Design m = {"eins"= > 1, "zwei"= > 2, "drei"= > 3} Ideas m Conclusion Extras I Scala Cascalog vall = List(1, 2, 3) val l2 = 1 :: 2 :: 3 :: List() l valv = Vector(1, 2, 3) v vals = Set(1, 2, 3) s Clojure for Data StructuresV Beginners Elango Cheran

Introduction Setup Overview Preview Language Overview Clojure Basics & Comparisons valm = Map("eins"- > 1, "zwei"- > 2, "drei"- > Tabular comparisons Clojure Code Building 3) Blocks Clojure Design m Ideas

Conclusion

Extras Cascalog Clojure for ImmutabilityI Beginners Elango Cheran

Introduction I Values don’t change after declared Setup Overview I Clojure Preview Language I Data structures (and any other value) are immutable Overview Clojure Basics & I Try: Comparisons Tabular comparisons (def v1 [5 6]) Clojure Code Building (def v2 [7 8]) Blocks Clojure Design (concat v1 v2) Ideas

v1 Conclusion

v2 Extras (defm {9 "nine", 8 "eight"}) Cascalog (assocm7 "seven") m I Java

I People with experience say no such thing as “somewhat immutable” code Clojure for ImmutabilityII Beginners Elango Cheran I No immutable data structures originally, except for Strings, actually Introduction Setup String str1= "hobnob with Bob Loblaw"; Overview String str2= " on his Law Blog"; Preview Language str1.concat(str2); Overview Clojure Basics & System.out.println("str1 = [" + str1 + "]"); Comparisons Tabular comparisons System.out.println("str2 = [" + str2 + "]"); Clojure Code Building // str1 = [hobnob with Bob Loblaw] Blocks Clojure Design // str2 = [ on his Law Blog] Ideas

Conclusion

String str3 = str1.concat(str2); Extras System.out.println("str1 = [" + str1 + "]"); Cascalog System.out.println("str2 = [" + str2 + "]"); System.out.println("str3 = [" + str3 + "]"); // str1 = [hobnob with Bob Loblaw] // str2 = [ on his Law Blog] // str3 = [hobnob with Bob Loblaw on his Law Blog] Clojure for ImmutabilityIII Beginners I Ruby Elango Cheran

I Like Java, does not have immutable types Introduction Setup I Scala Overview scala> val v1 = Vector(5, 6) Preview Language v1: scala.collection.immutable.Vector[Int] = Overview Clojure Basics & Vector(5, 6) Comparisons Tabular comparisons Clojure Code Building scala> val v2 = Vector(7, 8) Blocks Clojure Design v2: scala.collection.immutable.Vector[Int] = Ideas

Vector(7, 8) Conclusion

Extras scala> v1 ++ v2 Cascalog res1: scala.collection.immutable.Vector[Int] = Vector(5, 6, 7, 8)

scala> v1 res2: scala.collection.immutable.Vector[Int] = Vector(5, 6) Clojure for ImmutabilityIV Beginners Elango Cheran scala> v2 res3: scala.collection.immutable.Vector[Int] = Introduction Setup Vector(7, 8) Overview Preview Language scala> val m = Map( 9 -> "nine", 8 -> "eight") Overview Clojure Basics & m: Comparisons scala.collection.immutable.Map[Int,java.lang.String] Tabular comparisons Clojure Code Building = Map(9- > nine, 8- > eight) Blocks Clojure Design Ideas

scala> m + (7 -> "seven") Conclusion res4: Extras scala.collection.immutable.Map[Int,java.lang.String] Cascalog = Map(9- > nine, 8- > eight, 7- > seven)

scala> m res5: scala.collection.immutable.Map[Int,java.lang.String] = Map(9- > nine, 8- > eight) Clojure for ImmutabilityV Beginners Elango Cheran

Introduction Setup Overview I Referential transparency Preview Language I Don’t rebind symbols/names (bind fn results to new Overview Clojure Basics & symbols) Comparisons Tabular comparisons I Any code that references a symbol (ex: v1) always sees Clojure Code Building same value Blocks Clojure Design I “Either it works (all the time) or it doesn’t work at all” Ideas

happens more often Conclusion I Structural sharing through persistent data structures Extras Cascalog I Any code creating a new value using v1 reuses memory

I EX: copying, appending, subsets, etc. Clojure for ImmutabilityVI Beginners Elango Cheran I Value semantics

I Clojure Introduction Setup (def v3 v1) Overview v1 Preview Language v3 Overview (= v1 v3) Clojure Basics & Comparisons (= v3 [5 6]) Tabular comparisons Clojure Code Building (def v4 [1 [2 [3]]]) Blocks (def v5 [2 [3]]) Clojure Design Ideas (second v4) Conclusion (= v5(second v4)) Extras I Scala Cascalog val v3 = v1 v1 v3 v1 == v3 v3 == Vector(5,6) val v4 = Vector(1, Vector(2, Vector(3))) val v5 = Vector(2, Vector(3)) Clojure for ImmutabilityVII Beginners Elango Cheran

Introduction Setup Overview v5 == v4(1) Preview Language I Immutable values can be safely used in sets and in map Overview Clojure Basics & keys Comparisons Tabular comparisons I Whereas Java allows mutable objects in sets or map Clojure Code Building keys (unadvisable) Blocks Clojure Design I Python disallows mutable objects (ex: lists) in sets or Ideas

map keys Conclusion

I In general, Clojure uniquely teases out Extras Cascalog I State as value + time, and. . .

I Identity transcends time Clojure for Java, Ruby, Scala, & Clojure Beginners Elango Cheran

Introduction Setup Overview aspect Java Ruby Scala Clojure Preview Language strong typing Y Y Y Y Overview Clojure Basics & Comparisons dynamic typing N Y N Y Tabular comparisons Clojure Code Building interpreter/REPL N Y Y Y Blocks Clojure Design functional style N Y Y Y Ideas

“fun web prog.” N Y Y Y Conclusion

good for CLI script N Y N N Extras efficient with memory Y N Y Y Cascalog true multi-threaded Y N Y Y Clojure for Clojure ↔ ScalaI Beginners aspect Clojure Scala why? (Clojure) Elango Cheran

STM yes yes does for concur- Introduction rency what GC did Setup Overview for memory Preview Language OOP not really yes “It is better to Overview Clojure Basics & have 100 functions Comparisons Tabular comparisons operate on one Clojure Code Building Blocks data structure Clojure Design than 10 func- Ideas tions on 10 data Conclusion Extras structures.” Cascalog design patterns no some equivalent out- comes done in other ways FP yes sort of fns compose and can be used as ar- guments to other fns Clojure for Clojure ↔ ScalaII Beginners Elango Cheran aspect Clojure Scala why? (Clojure) Introduction Setup concurrency yes yes Clojure designed Overview for this from the Preview Language beginning Overview Clojure Basics & persistent data yes yes only reasonable Comparisons Tabular comparisons structures way to support Clojure Code Building Blocks immutable data Clojure Design structures Ideas sequence abstrac- yes yes fns on seqs : ob- Conclusion Extras tion jects :: UNIX : Cascalog DOS syntax regularity yes sort of nice for macros, readability (& pasting into REPL) Clojure for Clojure ↔ ScalaIII Beginners Elango Cheran

Introduction Setup Overview aspect Clojure Scala why? (Clojure) Preview Language Overview language extensi- yes yes* abstract repetitive Clojure Basics & Comparisons bility (macros) code not possible Tabular comparisons Clojure Code Building via fns and pat- Blocks terns Clojure Design Ideas

backwards com- yes yes* Clojure is relatively Conclusion

patibility very good at work- Extras ing with old ver- Cascalog sion code I Enter the following (in Light Table, if possible): (defn square [x] (*xx))

I Now enter: (square 2)

Clojure for Defining a Function Beginners Elango Cheran

Introduction Setup I Basic structure of a new fn Overview (defn fn-name Preview Language "documentation string" Overview Clojure Basics & Comparisons [arg1 arg2] Tabular comparisons Clojure Code Building ;; return value is last form Blocks ) Clojure Design Ideas

Conclusion

Extras Cascalog I Now enter: (square 2)

Clojure for Defining a Function Beginners Elango Cheran

Introduction Setup I Basic structure of a new fn Overview (defn fn-name Preview Language "documentation string" Overview Clojure Basics & Comparisons [arg1 arg2] Tabular comparisons Clojure Code Building ;; return value is last form Blocks ) Clojure Design Ideas

I Enter the following (in Light Table, if possible): Conclusion

(defn square Extras [x] Cascalog (*xx)) Clojure for Defining a Function Beginners Elango Cheran

Introduction Setup I Basic structure of a new fn Overview (defn fn-name Preview Language "documentation string" Overview Clojure Basics & Comparisons [arg1 arg2] Tabular comparisons Clojure Code Building ;; return value is last form Blocks ) Clojure Design Ideas

I Enter the following (in Light Table, if possible): Conclusion

(defn square Extras [x] Cascalog (*xx))

I Now enter: (square 2) Clojure for Lexical scope - let I Beginners Elango Cheran

I Can think of let form as giving “local variables” Introduction Setup I Except they must all be declared at the beginning Overview Preview I The let bindings also used to break up a nested form Language Overview into something more readable Clojure Basics & Comparisons I Example: Let’s find the solutions of a quadratic Tabular comparisons Clojure Code Building equation Blocks 2 Clojure Design I For ax + bx + c = 0, the solution is Ideas √ −b ± b2 − 4ac Conclusion x = Extras 2a Cascalog

I Test case: a = 1, b = −5, c = 6 ⇒ x 2 − 5x + 6 = 0 x = {2, 3} Clojure for Lexical scope - let II Beginners Elango Cheran

Introduction Setup Overview Preview I First pass: Language (defn quadsolve Overview Clojure Basics & "solve a quad eqn" Comparisons Tabular comparisons Clojure Code Building [a b c] Blocks [(/(+(-b)(-(square b)(*4ac)))(* Clojure Design 2 a))(/(-(-b)(-(square b)(*4ac))) Ideas (*2a))]) Conclusion Extras Cascalog I Check: (quadsolve 1 -5 6) Clojure for Lexical scope - let III Beginners Elango Cheran I Define: Introduction (defn discriminant Setup Overview "for a quadratic eqn's coefficients, Preview return the discriminant" Language Overview Clojure Basics & [a b c] Comparisons Tabular comparisons (-(square b)(*4ac))) Clojure Code Building Blocks I Check: Clojure Design (discriminant 1 -5 6) Ideas Conclusion Rewrite: I Extras (defn quadsolve Cascalog [a b c] (let [disc(discriminant a b c) disc-sqrt(Math/sqrt disc)] [(/(+(-b) disc-sqrt)(*2a))(/(- (-b) disc-sqrt)(*2a))])) Clojure for Lexical scope - let IV Beginners Elango Cheran

Introduction Setup Overview Preview Language Overview Clojure Basics & Comparisons I Math/sqrt refers to the sqrt static method of Java’s Tabular comparisons Clojure Code Building java.lang.Math Blocks I Check: Clojure Design (quadsolve 1 -5 6) Ideas Conclusion

Extras Cascalog Clojure for Control Flow - if, etc.I Beginners Elango Cheran

Introduction Setup Overview I if Preview I Takes a 3 expressions: a test, the “then”, and the “else” Language Overview I Note: test passes for all values except false and nil Clojure Basics & Comparisons I This “truthiness” holds for everything built off of if - Tabular comparisons Clojure Code Building when, and, or, if-not, when-not, etc. Blocks

I (if( < disc 0) Clojure Design Ideas (println "I don't like imaginary numbers!") Conclusion Extras [(/(+(-b) disc-sqrt)(*2a))(/(-(- Cascalog b) disc-sqrt)(*2a))]) I do

I Creates a form that evaluates/executes multiple forms inside it Clojure for Control Flow - if, etc.II Beginners Elango Cheran

Introduction Setup Overview I Returns the value of the last form Preview (if( < disc 0) Language Overview (println "I don't like imaginary numbers") Clojure Basics & Comparisons (do Tabular comparisons Clojure Code Building (println "I like real numbers!") Blocks [(/(+(-b) disc-sqrt)(*2a))(/(- Clojure Design Ideas (-b) disc-sqrt)(*2a))])) Conclusion

I when is the same as if, but with nil as “else” and a Extras do built in for “then” Cascalog

I Both and and or do short-circuit evaluation Clojure for map & reduce I Beginners Elango Cheran

Introduction Setup Overview Preview I Where’s my for loop?? Language Overview I Clojure Basics & Instead of dealing with index-based looping, you can Comparisons apply higher-order functions Tabular comparisons Clojure Code Building Blocks I map applies a fn on every element of a sequence Clojure Design Ideas I reduce uses a fn to accumulate an answer Conclusion I Apply fn on first 2 elements (or an initial value and first Extras element) Cascalog

I Continue applying fn on accumulated value and next element Clojure for map & reduce II Beginners Elango Cheran

Introduction Setup Overview user> (def data [3 5 9 1 5 4 2]) Preview #'user/data Language Overview user> (map square data) Clojure Basics & Comparisons (9 25 81 1 25 16 4) Tabular comparisons Clojure Code Building user> (reduce + data) Blocks 29 Clojure Design user> (defn sum-sq Ideas [nums] Conclusion (reduce + (map square nums))) Extras Cascalog #'user/sum-sq user> (sum-sq data) 161 Clojure for map & reduce III Beginners Elango Cheran

Introduction I Since Clojure fns are first-class citizens Setup Overview I You can have a vector of fns: [+ -] Preview I You can have an anonymous fn (doesn’t have a name): Language (fn [x](if(pos?x)x(-x))) Overview Clojure Basics & Comparisons I Our next rewrite of quadsolve: Tabular comparisons Clojure Code Building (defn quadsolve Blocks Clojure Design [a b c] Ideas (let [disc(discriminant a b c) Conclusion disc-sqrt(Math/sqrt disc) Extras soln-fn(fn [op](/(op(-b) Cascalog disc-sqrt)(*2a))) ops [+ -]] (map soln-fn ops))) Clojure for Closures Beginners Elango Cheran I soln-fn is a closure – the values of a, b, and disc-sqrt are pulled from surrounding scope Introduction Setup I Even if soln-fn is passed elsewhere, the values of a, b, Overview and disc-sqrt in soln-fn don’t change after fn Preview Language creation & binding Overview I fns ⇒ values ⇒ immutable Clojure Basics & Comparisons I Ex: you have to decrypt a lot of strings encrypted with Tabular comparisons Clojure Code Building the same public key Blocks I Instead of repeated (decrypt priv-key s ...) calls Clojure Design Ideas (defn decrypt-with-priv [priv-key] Conclusion Extras (fn [s] Cascalog (decrypt priv-key s))) (let [my-decrypt(decrypt-with-priv priv-key)] (my-decrypt s1) (my-decrypt s2) ...) I In many cases, as above, partial does the same Clojure for Java Interop Beginners Elango Cheran Java classes in JVM and classpath accessible I Introduction I Use full name unless imported, ex: (import Setup Overview 'java.net.URL) Preview I All of java.lang.* always imported, just like Java Language Overview New objects through : Clojure Basics & I new (new URL Comparisons "http://clojure.org") Tabular comparisons Clojure Code Building Blocks I Syntax shorcut: (URL. "http://clojure.org") Clojure Design I Static methods called through Class/method (ex: Ideas Math/sqrt) Conclusion Extras I Idiomatic member method call ex: (.toLowerCase Cascalog "sUpEr UgLy CaSiNg")

I More (& interesting) Java interop available (ex: proxy, memfn, etc.)

I Clojure way for Java patterns very neat (multimethods, protocols, records, types) Clojure for Sequence/List Processing FunctionsI Beginners Elango Cheran

I Many useful fns exist to transform sequences, work on Introduction specific collection types, or convert from one to another Setup Overview Preview I Examples: Language user> (filter even? data) Overview Clojure Basics & (4 2) Comparisons Tabular comparisons Clojure Code Building user> (remove even? data) Blocks (3 5 9 1 5) Clojure Design user> (take 3 data) Ideas Conclusion (3 5 9) Extras user> (drop 3 data) Cascalog (1 5 4 2) user> (first data) 3 user> (rest data) (5 9 1 5 4 2) user> (last data) Clojure for Sequence/List Processing FunctionsII Beginners Elango Cheran

Introduction Setup Overview Preview 2 Language Overview user> (butlast data) Clojure Basics & Comparisons (3 5 9 1 5 4) Tabular comparisons Clojure Code Building user> (take-while (fn [x] (< 1 x)) data) Blocks (3 5 9) Clojure Design Ideas

user> (drop-while (fn [x] (< 1 x)) data) Conclusion

(1 5 4 2) Extras user> (take-nth 2 data) Cascalog (3 9 5 2) Clojure for Sequence/List Processing FunctionsIII Beginners Elango Cheran

Introduction Setup user> (def nums [1 1 1 2 1 1 2 1 1 1 1 1 2 2 Overview Preview 1 3 1 2 2 1 1]) Language Overview #'user/nums Clojure Basics & Comparisons user> (frequencies nums) Tabular comparisons Clojure Code Building {1 13, 2 6, 3 1} Blocks user> (group-by odd? nums) Clojure Design Ideas

{true [1 1 1 1 1 1 1 1 1 1 3 1 1 1], false Conclusion

[2 2 2 2 2 2]} Extras user> (partition-by even? nums) Cascalog ((1 1) (2) (1 1) (2) (1 1 1 1 1) (2 2) (1 3 1) (2 2) (1 1)) Clojure for Adding/Removing/Getting single elements Beginners Elango Cheran

I cons puts an element at the front and returns a Introduction sequence Setup Overview Preview I conj adds an element in the most efficient manner and Language preserves the collection/sequence type Overview Clojure Basics & user> (cons 12 data) Comparisons Tabular comparisons Clojure Code Building (12 3 5 9 1 5 4 2) Blocks user> (conj data 12) Clojure Design [3 5 9 1 5 4 2 12] Ideas Conclusion user> (cons 12 s) Extras (12 1 2 3) Cascalog user> (conj s 12) #{1 2 3 12}

I assoc (for maps) adds a key and its value, dissoc removes a key and its value, given a key

I disj is the opposite of conj for a set Clojure for apply - unpacking sequences in fn calls Beginners Elango Cheran

Introduction I Some fns are meant for scalar args, not sequences: Setup Overview user> (max 3 8 9 5 -1 4 1 6) Preview 9 Language Overview user> (max [3 8 9 5 -1 4 1 6]) Clojure Basics & Comparisons [3 8 9 5 -1 4 1 6] Tabular comparisons Clojure Code Building Blocks I When what you want comes as a sequence. . . : Clojure Design user> (max (filter odd? [3 8 9 5 -1 4 1 Ideas 6])) Conclusion (3 9 5 -1 1) Extras Cascalog I . . . use apply to “unpack” the sequence and apply the fn: user> (apply max (filter odd? [3 8 9 5 -1 4 1 6])) 9 Clojure for Interlude - clojure.inspector Beginners Elango Cheran

Introduction Setup Overview I Run the following (preferably in command-line REPL): Preview Language Overview Clojure Basics & (use 'clojure.inspector) Comparisons Tabular comparisons Clojure Code Building (inspect [3 8 9 5 -1 4 1 6]) Blocks Clojure Design Ideas (inspect-tree [1 [2 [3 4]] 5]) Conclusion Extras (require '[clojure.xml :as xml]) Cascalog (inspect-tree(xml/parse "http://www.w3schools.com/xml/note.xml")) Clojure for MacrosI Beginners Elango Cheran

Introduction Setup Overview Preview Language Overview Clojure Basics & I Powerful pre-evaluation step Comparisons Tabular comparisons Clojure Code Building I A fn that transforms code (input and output is code) Blocks I Only possible when language’s code written in Clojure Design language’s data structures Ideas Conclusion

I Changing a language to accept code in its own data Extras structures ⇒ Lisp Cascalog Clojure for MacrosII Beginners I Basic threading macros (-> and ->>) Elango Cheran

I Write nested forms “inside out” (more readable) Introduction I -> puts result of previous form in 2nd position of next Setup Overview I ->> puts result of previous form in last position of next Preview Language I Our previous sum of squares example Overview Clojure Basics & I Before Comparisons (reduce+(map square nums)) Tabular comparisons Clojure Code Building Blocks I After Clojure Design (->> nums Ideas

(map square) Conclusion (reduce+)) Extras I Our previous teaser # 4 example Cascalog

I Before (take-nth3(rest(line-seq br))) I After (->> br line-seq rest (take-nth3)) Clojure for MacrosIII Beginners Elango Cheran

Introduction I Example with -> Setup Overview I Setup Preview (require [clojure.string :as string]) Language ' Overview (def line "col1\tcol2\tcol3\tcol4")) Clojure Basics & Comparisons I Before Tabular comparisons Clojure Code Building (Integer/parseInt(.substring(second Blocks (string/split line #"\t"))3)) Clojure Design Ideas I After Conclusion (-> line Extras (string/split #"\t") Cascalog second (.substring3) (Integer/parseInt)) I Nested nil checks Clojure for MacrosIV Beginners Elango Cheran

Introduction Setup Overview I Before Preview (fn [n] Language (when-let [nth-elem(get["http://g.co" Overview Clojure Basics & "http://t.co"] n)] Comparisons Tabular comparisons (when-let [fl(get nth-elem 7)] Clojure Code Building Blocks (get# {\g \t \f} fl)))) Clojure Design Ideas I After (fn [n] Conclusion (some-> ["http://g.co" "http://t.co"] Extras Cascalog (getn) (get7) (#{\ \t \f}))) Clojure for MacrosV Beginners Elango Cheran I Don’t create your own macros unless you have to

I Can’t compose like fns (⇔ can’t take value of macro) Introduction Setup I Macros harder to debug Overview Preview I Macros can (and/or should) be used in a few cases, Language including: Overview Clojure Basics & I Abstracting repetitive code where fns can’t (ex: Comparisons Tabular comparisons patterns) Clojure Code Building Blocks I Or even for simplifying control flow, if common enough Clojure Design I Creating a DSL on top of domain-relevant fns Ideas I Controlling when a form is evaluted Conclusion Macros allow individuals to add on to their language Extras I Cascalog I with-open

I . . . is a macro in Clojure I Copied into Python, but only possible as official language syntax (= impl’ed by language maintainers)

I The some-> threading macro

I (officially added in Clojure 1.5) I already functionally existed in contrib library as -?> Clojure for MacrosVI Beginners Elango Cheran

Introduction Setup Overview Preview Language I Most of Clojure is implemented as fns and macros Overview Clojure Basics & I A few special forms exist as elemental building blocks Comparisons Tabular comparisons I Rest of language (fns and macros) is composed of Clojure Code Building previously-defined forms (special forms, fns and Blocks Clojure Design macros) Ideas I Syntax is simple and doesn’t change Conclusion I New lang. versions mostly just add fns, macros, etc. Extras ⇒ backwards-compatibility Cascalog Clojure for High-level Design Decision Cascade Beginners Elango Cheran

Introduction Setup Overview Preview Language I Simplicity → isolate state Overview Clojure Basics & I Simplicity → immutability Comparisons Tabular comparisons Clojure Code Building I Concurrency → immutability Blocks Clojure Design I Concurrency → STM Ideas I Simplicity → functional programming Conclusion Extras I Functional programming → immutability Cascalog I Immutability → persistant data structures Clojure for Effects of Decisions Beginners Elango Cheran

Introduction Setup Overview Preview I Lisp Language Overview I Flexible syntax Clojure Basics & Comparisons I Less parentheses + brackets + etc. (!) Tabular comparisons Clojure Code Building I Macros Blocks I Functional programming Clojure Design Ideas I Simpler code Conclusion I Easier to reason about Extras I Places of mutation minimized, isolated Cascalog I Refential transparency elsewhere I Design patterns handled in simpler, more powerful ways Clojure for My Parting Message to You Beginners Elango Cheran I The basics are simple, but tremendous depth I May take time at first (initial investment), but simpler Introduction Setup code is perpetual payoff Overview Preview I Clojure/Lisp compared to other languages Language I Lisp helps you get better at programming (even if you Overview Clojure Basics & don’t use it) Comparisons Tabular comparisons I Not a better vs. worse Clojure Code Building Blocks I But maybe a powerful vs. more powerful Clojure Design I If we agree that two languages can differ in power (ex: Ideas Perl vs. Basic) Conclusion I Tradeoffs exist – always choose right tool for the job Extras I Ex: a language’s power may cost performance Cascalog I Many language discussions → emotional arguments b/c of proximity to mind & identity I Or so wrote Paul Graham - “Keep Your Identity Small” (& Paul Buchheit - “I am Nothing”) I Keep exploring I There are more cool aspects to Clojure I couldn’t fit here I And it’s still a young language Clojure for Abridged Set of Useful Resources Beginners Elango Cheran I Videos of Easy-to-follow Lectures by Rich Hickey

I At Clojure’s Youtube channel Introduction I Data structures; Sequences; Concurrency; Clojure for Setup Overview {Java , Lisp Programmers} Preview I Books (my recommendations) Language Overview I The Joy of Clojure - good intro that explains the ‘why’ Clojure Basics & Comparisons of Clojure Tabular comparisons Clojure Code Building I Clojure Programming - deeper, more comprehensive Blocks guide to Clojure for all levels Clojure Design Ideas I ClojureDocs Conclusion Clojure Cheatsheet I Extras I 4Clojure Cascalog I Getting through the first 100 is worth the challenge to get better I I learned a lot by following these users’ solutions: 0x89, pcl, austintaylor, jbear, maximental, nikelandjelo, jfacorro, jsmith145, chouser, cgrand I Shameless plug: The Newbie’s Guide to Learning Clojure Clojure for The End Beginners Elango Cheran

Introduction Setup Overview Preview Language Overview Clojure Basics & Comparisons Tabular comparisons Clojure Code Building Blocks I Thanks! Clojure Design Ideas

Conclusion

Extras Cascalog Clojure for What is Cascalog?I Beginners Elango Cheran

Introduction Setup Overview Preview Language I You have a MapReduce (Hadoop) installation Overview Clojure Basics & I You put data on the filesystem (HDFS) Comparisons Tabular comparisons I You perform queries / analysis on data Clojure Code Building Blocks I Cascalog enables queries in Datalog syntax Clojure Design Ideas I Datalog - Scheme-based subset of Prolog - queries must Conclusion terminate? Extras I “-log” - logic programming Cascalog

I logic programming is declarative (like SQL!) Clojure for What is Cascalog?II Beginners Elango Cheran

Introduction Setup Overview I The point Preview Language I Queries are now a set of filters Overview Clojure Basics & I ⇒ No special syntax Comparisons Tabular comparisons I ⇒ We can combine/compose queries, run them in Clojure Code Building parallel, etc. Blocks Clojure Design I Implemented as a DSL ⇒ can mix in regular fns Ideas I Based on Cascading - Java library on top of Hadoop Conclusion MapReduce Extras Cascalog I Cascading establishes concept of flows

I Casca- + -log = Cascalog Clojure for Most Basic SetupI Beginners Elango Cheran Create a new Leiningen project I Introduction Setup I Basic project.clj file: Overview (defproject happy-clickers "0.1.0-SNAPSHOT" Preview Language :description"FIXME: write description" Overview :url "http://example.com/FIXME" Clojure Basics & Comparisons :license {:name "Eclipse Public License" Tabular comparisons Clojure Code Building :url Blocks "http://www.eclipse.org/legal/epl-v10.html"} Clojure Design Ideas :dependencies [[org.clojure/clojure "1.5.1"] Conclusion [cascalog "1.10.1"]] Extras :repositories {"cloudera" Cascalog "https://repository.cloudera.com/artifactory/cloudera-repos"} :profiles {:provided {:dependencies [[org.apache.hadoop/hadoop-core "0.20.2-cdh3u5"]]}} :aot [happy-clickers.core] :main happy-clickers.core ) Clojure for Most Basic SetupII Beginners Elango Cheran

Introduction I Source file setup: Setup Overview (ns happy-clickers.core Preview (:gen-class) Language (:require [cascalog.ops :as ops] Overview Clojure Basics & Comparisons [cascalog.vars :as vars]) Tabular comparisons Clojure Code Building (:use [cascalog.api])) Blocks Clojure Design Ideas (defn -main Conclusion "initiate execution when run as a standalone Extras Cascalog app" [& args] ;; do stuff ) Clojure for Deployment Beginners Elango Cheran

Introduction Setup Overview Preview I lein uberjar - create the JAR file to run on Hadoop Language Overview Clojure Basics & I hadoop jar - run the JAR file Comparisons Tabular comparisons I Hadoop doesn’t know (or care) that JAR file generated Clojure Code Building through Clojure Blocks Clojure Design I Testing Ideas Conclusion I You can create a REPL to run queries, etc. I You can choose inputs to be from HDFS, LFS, or Extras hand-created Clojure data Cascalog I But still working on this, among other things . . . Clojure for Example Prompt Beginners Elango Cheran

Introduction Setup Overview Preview Language 1. Given a file of online events (uid, impression, click, etc.) Overview Clojure Basics & Comparisons 2. Per uid, get # of impressions, & # of clicks Tabular comparisons Clojure Code Building 3. Determine CTR = impressions/clicks Blocks Clojure Design 4. Filter out when clicks <= 2 or CTR < 0.02 Ideas Conclusion 5. For the CTR values, compute quartiles Extras 6. Add the quartile number to each uid Cascalog Clojure for Query 1 - get quartile boundaries Beginners Elango Cheran

Introduction Setup Overview Preview Language (defn query1 Overview Clojure Basics & [source] Comparisons Tabular comparisons (let [hclks(happy-clickers source) Clojure Code Building hclk-ctrs( <- [?ctr](hclks ?uid ?ctr)) Blocks Clojure Design ctr-quartiles( <- [?min ?b12 ?b23 ?b34 ?max] Ideas

(hclk-ctrs ?ctr)(quartile-bounds ?ctr: > ?min ?b12 Conclusion

?b23 ?b34 ?max))] Extras ctr-quartiles)) Cascalog Clojure for CTR calculation Beginners Elango Cheran

Introduction Setup (defn happy-clickers Overview [source] Preview Language (<- [?uid ?ctr] Overview Clojure Basics & (source ?uid ?impr ?clk ?actn) Comparisons (parse-int ?clk: > ?click) Tabular comparisons Clojure Code Building (parse-int ?impr: > ?impression) Blocks (ops/sum ?click: > ?clicks) Clojure Design Ideas (ops/sum ?impression: > ?impressions) Conclusion (<= 2 ?impressions) ;; includes preventing Extras divide-by-zero. as it Cascalog ;; turns out, order of predicates matters for the divide-by-zero check (div ?clicks ?impressions: > ?ctr) (< 0.05 ?ctr))) Clojure for Parsing input tapI Beginners Elango Cheran

Introduction Setup Overview (defn- in-tap-parsed Preview "Helper fn that takes lines of input from a source Language Overview tap, splits the line, and returns only a specified Clojure Basics & Comparisons constant number of Cascalog vars. Helper fn to be Tabular comparisons Clojure Code Building used whether input is textline or sequencefile" Blocks [dir num-fields source] Clojure Design (let [outargs(vars/gen-nullable-vars num-fields)] Ideas (<- outargs Conclusion (source ?line) Extras Cascalog (line-not-empty ?line) (parse-line num-fields ?line: >> outargs) (:distinct false)))) Clojure for Parsing input tapII Beginners Elango Cheran (defn textline-parsed "parse the input source as an HDFS TextLine (file). Introduction Setup opts are for hfs-seqfile / hfs-tap" Overview [dir num-fields & opts] Preview Language (let [source(apply hfs-textline dir opts)] Overview Clojure Basics & (in-tap-parsed dir num-fields source))) Comparisons Tabular comparisons Clojure Code Building (defn parse-int Blocks Clojure Design [s] Ideas

(Integer/parseInts)) Conclusion

Extras (defn parse-line Cascalog [num-fields line] (take num-fields(string/split line #" \t")))

(defn line-not-empty [line] (boolean(seq(.trim line)))) Clojure for Custom aggregator - compute quartile boundaries Beginners Elango Cheran

Introduction Setup Overview Preview Language Overview Clojure Basics & Comparisons Tabular comparisons (defbufferop quartile-bounds Clojure Code Building [tuples] Blocks Clojure Design [(incanter.stats/quantile(map first tuples))]) Ideas Conclusion

Extras Cascalog Clojure for Query 2 - Add quartile number Beginners Elango Cheran

(defn query2 Introduction [source ctr-quartiles] Setup Overview (let [hclks(happy-clickers source) Preview hclk-qnums( <- [?uid ?ctr ?qnum](hclks ?uid Language Overview ?ctr) Clojure Basics & Comparisons (ctr-quartiles ?min ?b12 ?b23 Tabular comparisons Clojure Code Building ?b34 ?max) Blocks (cast-dbls ?min ?b12 ?b23 ?b34 Clojure Design ?max: > ?min-dbl ?b12-dbl ?b23-dbl ?b34-dbl Ideas ?max-dbl) Conclusion (qnum-casc-fn ?min-dbl Extras Cascalog ?b12-dbl ?b23-dbl ?b34-dbl ?max-dbl ?ctr: > ?qnum) ;; need to specify to ;; Cascalog that this is a cross-join (cross-join))] hclk-qnums)) Clojure for Other quartile fns and queriesI Beginners Elango Cheran

Introduction Setup Overview Preview (defn quantile-num Language "find the quantile number (1-indexed) of data point Overview Clojure Basics & Comparisons x given a vector of quantile info as given by Tabular comparisons Clojure Code Building incanter's quantile fn (first and last are min-val Blocks and max-val of dataset)" Clojure Design [quantiles x] Ideas (let [quant-ranges(partition 2 1 quantiles)] Conclusion (inc Extras (first(keep-indexed #(if( <=(first %2)x Cascalog (second %2)) %1) quant-ranges))))) Clojure for Other quartile fns and queriesII Beginners Elango Cheran

Introduction Setup Overview Preview (defn cast-dbls Language [& nums] Overview Clojure Basics & (map#(Double/parseDouble%) nums)) Comparisons Tabular comparisons Clojure Code Building Blocks (defn qnum-casc-fn Clojure Design "create a wrapper fn for quantile-num that works Ideas with Cascalog, that is, doesn't take any collections Conclusion as args" Extras [min b12 b23 b34 max n] Cascalog (quantile-num [min b12 b23 b34 max] n)) Clojure for Run queries Beginners Elango Cheran (defn run Introduction "read in std in and return output" Setup [] Overview Preview (let [dir "hdfs:///data/dir/path/" Language intermediate Overview Clojure Basics & "hdfs:///intermediate/dir/path/" Comparisons Tabular comparisons output Clojure Code Building "hdfs:///output/dir/path/" Blocks Clojure Design source(seqfile-parsed dir 12 :source-pattern Ideas

"ds=201306{21,22,23,24,25,26,27}") Conclusion

Extras sink(hfs-textline output)] Cascalog (?-(hfs-textline intermediate)(query1 source)) (with-job-conf { "io.compression.codecs" "org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzopCodec"} (?- sink(query2 source(textline-parsed intermediate 5)))))) Clojure for Improving this Cascalog example Beginners Elango Cheran

Introduction Setup I Update versions (currently: Cascalog 2.1.1, etc.) Overview Preview Show testing situation – pretty simple I Language Overview I Parsing a tab-separated (TSV) file is already supported Clojure Basics & Comparisons by Cascalog fns (use those instead) Tabular comparisons Clojure Code Building Blocks I Instead of writing and reading the “intermediate” values Clojure Design to disk using 2 disjoint queries, it might be more Ideas efficient to pull into memory as Clojure data structures Conclusion using ??- or ??<- Extras Cascalog I There probably is a way to generalize the quartile code for any quantiles of size n (ex: “deciles” when n=10)

I The first two points above will further decrease code size