
Concatenative Programming

From Ivory to Metal
Jon Purdy

● Why Concatenative Programming Matters (2012)
● Spaceport (2012–2013) — engineering
● Facebook (2013–2014) — site integrity infrastructure (Haxl)
● There Is No Fork: An Abstraction for Efficient, Concurrent, and Concise Data Access (ICFP 2014)
● Xamarin/Microsoft (2014–2017) — Mono runtime (performance, GC)

What I Want in a Programming Language

● Prioritize reading & modifying code over writing it
● Be expressive—syntax closely mirroring high-level semantics
● Encourage “good” code (reusable, refactorable, testable, &c.)
● “Let me do what I want anyway”
● Have an “obvious” efficient mapping to real hardware (C)
● Be small—easy to understand & implement tools for
● Be a good citizen—FFI, embedding
● Don’t “assume you’re the world”

Notable Concatenative Programming Languages

● Forth (1970) Chuck Moore
● PostScript (1982) Warnock, Geschke, & Paxton
● Joy (2001) Manfred von Thun
● Factor (2003) Slava Pestov &al.
● Cat (2006) Christopher Diggins
● Kitten (2011) Jon Purdy
● Popr (2012) Dustin DeWeese
● …

History: Three Formal Systems of Computation

● Lambda Calculus (1930s) Alonzo Church
● Turing Machine (1930s) Alan Turing
● Recursive Functions (1930s) Kurt Gödel

Church’s Lambdas

e ::= x       Variables
    | λx. e   Functions
    | e₁ e₂   Applications

α-conversion: λx.M[x] ⇒ λy.M[y]

  λx.x ≅ λy.y
  λx.(λy.x) ≅ λy.(λz.y)

β-reduction: (λx.M)E ⇒ M[E/x]

  (λx.λy.λz.xz(yz))(λx.λy.x)(λx.λy.x)
  ≅ (λy.λz.(λx.λy.x)z(yz))(λx.λy.x)
  ≅ λz.(λx.λy.x)z((λx.λy.x)z)
  ≅ λz.z

Turing’s Machines

M = ⟨Q, Γ, b, Σ, δ, q₀, F⟩

Q               Set of states
Γ               Alphabet of symbols
b ∈ Γ           Blank symbol
Σ ⊆ Γ ∖ {b}     Input symbols
q₀ ∈ Q, F ⊆ Q   Initial & final states
δ               State transition function

● Begin with initial state & tape
● Repeat:
  ○ If in a final state, then halt
  ○ Apply transition function
  ○ Modify tape
  ○ Move left or right

δ : (Q ∖ F) × Γ → Q × Γ × {L, R}

Gödel’s Functions

f(x₁, x₂, …, xₖ) = n   Constant
S(x) = x + 1           Successor

Pᵢᵏ(x₁, x₂, …, xₖ) = xᵢ   Projection
f ∘ g                     Composition

ρ(f, g) Primitive recursion

μ(f)   Minimization

Four Formal Systems of Computation

● Lambda Calculus (1930s) Alonzo Church
● Turing Machine (1930s) Alan Turing
● Recursive Functions (1930s) Kurt Gödel
● Combinatory Logic (1950s) Moses Schönfinkel, Haskell Curry

Combinatory Logic (SKI, BCKW)

Just combinators and applications!

Sxyz = xz(yz)   Application   S = λx.λy.λz.xz(yz)   “Starling”
Kxy = x         Constant
Bxyz = x(yz)    Compose
Cxyz = xzy      Flip
Wxy = xyy       Duplicate

Kxy = x   Constant   K = λx.λy.x   “Kestrel”
Ix = x    Identity   I = λx.x      “Idiot”

SKKx = Kx(Kx) = x
M = SII = λx.xx
L = CBM = λf.λx.f(xx)
Y = SLL = λf.(λx.f(xx))(λx.f(xx))

What is concatenative programming?

Turing machines → imperative
Lambda calculus → functional
Combinatory logic →* concatenative

“A concatenative programming language is a point-free language in which all expressions denote functions, and the juxtaposition of expressions denotes function composition.”

“…a point-free language…”

Point-Free Programming

find . -name '*.txt' | awk '{print length($1),$1}' | sort -rn | head

define hist (List<Char> → List<Pair<Char, Int>>):
  { is_space not } filter sort group
  { \head \length both_to pair } map

hist ∷ String → [(Char, Int)]
hist = map (head &&& length) . group . sort . filter (not . isSpace)

Point-Free (Pointless, Tacit) Programming

● Programming: dataflow style using combinators to avoid references to variables or arguments
● Topology/geometry: abstract reasoning about spaces & regions without reference to any specific set of “points”
● Variables are “goto for data”: unstructured, sometimes needed, but composition is a better default
● “Name code, not data”
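As a quick check of the Haskell hist above, here is a self-contained sketch (imports from base only; the demo string is ours, not from the talk):

import Control.Arrow ((&&&))
import Data.Char (isSpace)
import Data.List (group, sort)

hist :: String -> [(Char, Int)]
hist = map (head &&& length) . group . sort . filter (not . isSpace)

main :: IO ()
main = print (hist "hello world")
-- [('d',1),('e',1),('h',1),('l',3),('o',2),('r',1),('w',1)]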

Value-Level Programming

Can Programming Be Liberated from the Von Neumann Style? (1977) John Backus

CPU & memory connected by the “von Neumann bottleneck”, accessed in a primitive “word-at-a-time” style; programming languages reflect that.

int inner_product
  (int n, int a[], int b[])
{
  int p = 0;
  for (int i = 0; i < n; ++i)
    p += a[i] * b[i];
  return p;
}

n = 3; a = {1, 2, 3}; b = {6, 5, 4};
p ← 0;
i ← 0;
p ← 0 + 1 * 6 = 6;
i ← 0 + 1 = 1;
p ← 6 + 2 * 5 = 16;
i ← 1 + 1 = 2;
p ← 16 + 3 * 4 = 28;
28

● Semantics & state closely coupled: values depend on all previous states
● Too low-level:
  ○ Compiler infers structure to optimize (e.g. vectorization)
  ○ Programmer mentally executes the program or steps through it in a debugger
● No high-level combining forms: everything built from primitives
● No useful algebraic properties:
  ○ Can’t easily factor out subexpressions without writing “wrapper” code
  ○ Can’t reason about subparts of programs without context (state, history)

FP

Def InnerProd ≡ (Insert +) ∘ (ApplyToAll ×) ∘ Transpose

Def InnerProd ≡ (/ +) ∘ (α ×) ∘ Trans

innerProd ∷ Num a ⇒ [[a]] → a
innerProd = sum . map product . transpose
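Worked out on the same input as the FP trace that follows (transpose here is Data.List.transpose):

innerProd [[1,2,3],[6,5,4]]
  = (sum . map product) [[1,6],[2,5],[3,4]]
  = sum [6,10,12]
  = 28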

FP

Def InnerProd ≡ (Insert +) ∘ (ApplyToAll ×) ∘ Transpose
Def InnerProd ≡ (/ +) ∘ (α ×) ∘ Trans

InnerProd:⟨⟨1,2,3⟩, ⟨6,5,4⟩⟩
((/ +) ∘ (α ×) ∘ Trans):⟨⟨1,2,3⟩, ⟨6,5,4⟩⟩
(/ +):((α ×):(Trans:⟨⟨1,2,3⟩, ⟨6,5,4⟩⟩))
(/ +):((α ×):⟨⟨1,6⟩, ⟨2,5⟩, ⟨3,4⟩⟩)
(/ +):⟨×:⟨1,6⟩, ×:⟨2,5⟩, ×:⟨3,4⟩⟩
(/ +):⟨6,10,12⟩
+:⟨6, +:⟨10,12⟩⟩
+:⟨6,22⟩
28

Function-Level Programming

● Stateless: values have no dependencies over time; all data dependencies are explicit
● High-level:
  ○ Expresses intent
  ○ Compiler knows structure
  ○ Programmer reasons about large conceptual units
● Made by only combining forms
● Useful algebraic properties
● Easily factor out subexpressions:
  Def SumProd ≡ (/ +) ∘ (α ×)
  Def ProdTrans ≡ (α ×) ∘ Trans
● Subprograms are all pure functions—all context explicit

J

innerProd =: +/@:(*/"1@:|:)
innerProd >1 2 3; 6 5 4

(+/ @: (*/"1 @: |:)) >1 2 3; 6 5 4

+/ (*/"1 (|: >1 2 3; 6 5 4))

+/ (*/"1 >1 6; 2 5; 3 4)

+/ 6 10 12

28

J

You can give verbose names to things:

sum =: +/
of =: @:
products =: */"1
transpose =: |:

innerProduct =: sum of products of transpose

(J programmers don’t.)

Function-Level Programming: Summary

● Primitive pure functions
● Combining forms: combinators, HOFs, “forks” & “hooks”
● Semantics defined by rewriting, not state transitions
● Enables purely algebraic reasoning about programs (“plug & chug”)
● Reuse mathematical intuitions from non-programming education
● Simple factoring of subprograms: “extract method” is cut & paste

Five Formal Systems of Computation

● Lambda Calculus (1930s) Alonzo Church
● Turing Machine (1930s) Alan Turing
● Recursive Functions (1930s) Kurt Gödel
● Combinatory Logic (1950s) Moses Schönfinkel, Haskell Curry
● Concatenative Calculus (~2000s) Manfred von Thun, Brent Kirby

Concatenative Calculus

The Theory of Concatenative Combinators (2002) Brent Kirby

E ::= C         Combinator
    | [ E ]     Quotation
    | E₁ E₂     Composition (≅ E₂ ∘ E₁)

[ A ] dup = [ A ] [ A ]
[ A ] [ B ] swap = [ B ] [ A ]
[ A ] drop =
[ A ] quote = [ [ A ] ]
[ A ] [ B ] cat = [ A B ]
[ A ] call = A

Concatenative Calculus

{ dup, swap, drop, quote, cat, call } is Turing-complete!

Smaller basis:

[ B ] [ A ] k = A
[ B ] [ A ] cake = [ [ B ] A ] [ A [ B ] ]

[ B ] [ A ] cons = [ [ B ] A ]
[ B ] [ A ] take = [ A [ B ] ]

Combinatory Logic (BCKW)

● B — apply functions    Bkab = k(ab)   Compose/apply
● C — reorder values     Ckab = kba     Flip
● K — delete values      Kka = k        Constant
● W — duplicate values   Wka = kaa      Duplicate

Connection to logic: substructure!

● W — contraction
● C — exchange
● K — weakening

Combinatory Logic

● BCKW = SKI
  ○ S = B(BW)(BBC)
  ○ K = K
  ○ I = WK
● SKI → LC (expand combinators)
● LC → SKI (abstraction elimination)
● { B, C, K, W } = LC

● BI = ordered + linear: “Exactly once, in order” (works in any category!)
● BCI = linear: “Exactly once”
● BCKI = affine: “At most once”
● BCWI = relevant: “At least once”
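These identities can be checked directly in Haskell, where each combinator is an ordinary function and GHC verifies the types (a sketch; the lowercase names are ours, since S, K, etc. are not valid Haskell identifiers):

s :: (a -> b -> c) -> (a -> b) -> a -> c
s x y z = x z (y z)

k :: a -> b -> a
k x _ = x

b :: (b -> c) -> (a -> b) -> a -> c
b x y z = x (y z)

c :: (a -> b -> c) -> b -> a -> c
c x y z = x z y

w :: (a -> a -> b) -> a -> b
w x y = x y y

-- I = WK and S = B(BW)(BBC), as above; both typecheck:
i :: a -> a
i = w k

s' :: (a -> b -> c) -> (a -> b) -> a -> c
s' = b (b w) (b b c)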

Substructural Type Systems

● Rust, ATS, Clean, Haskell (soon)
● Rust (affine): if a mutable reference exists, it must be unique—eliminate data races & synchronization overhead
● Avoid garbage collection: precisely track lifetimes of objects to make memory usage deterministic (predictable perf.)
● Reason about any resource: memory, file handles, locks, sockets…
● Enforce protocols: “consume” objects that are no longer valid
● Prevent invalid state transitions
● Reversible computing
● Quantum computing

Substructural Rules in Concatenative Calculus

[ A ] dup k = [ A ] [ A ] k           Wka = kaa
[ A ] [ B ] swap k = [ B ] [ A ] k    Ckab = kba
[ A ] drop k = k                      Kka = k

● Continuations are no longer scary or confusing
● The “current continuation” (call/cc) is simply the remainder of the program
● Saving a continuation is as easy as saving the stacks and instruction pointer

Concatenative Calculus ≈ Combinatory Logic + Continuation-Passing Style

“…all expressions denote functions […] juxtaposition…denotes function composition.”

Stacks

● Composition is the main way to build programs, but what are we composing functions of?
● We need a convenient data structure to store the program state and allow passing multiple values between functions
● Most concatenative languages use a heterogeneous stack, separate from the call stack, accessible to the programmer
● Other models have been proposed; a stack is convenient & efficient in practice

“Everything is an object a list a function”

Literals (“nouns”) take the stack & return it with the corresponding value on top.

2 : ∀s. s → s × ℤ
"hello" : ∀s. s → s × string

Operators & functions (“verbs”) pop inputs from & push outputs to the stack.

(+) : ∀s. s × ℤ × ℤ → s × ℤ
(±) : ∀s. s × ℤ × ℤ → s × ℤ × ℤ

Term 2 is a function: it pushes the value 2. 2 3 + is a function, equal to 5. It can be split into 2 3 and +, or into 2 and 3 +.

Higher-order functions (“adverbs”) take functions (“quotations”).

["ay", "bee", "cee"] { "bo" (+) say } each
// aybo beebo ceebo

Forth

: SQ ( n -- n^2 ) DUP * ;
2 SQ
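The same SQ word can be read as a pure function between stacks. A minimal Haskell sketch, modeling the stack as nested pairs (the names push and sq are ours):

push :: Int -> s -> (s, Int)
push n s = (s, n)

sq :: (s, Int) -> (s, Int)
sq (s, n) = (s, n * n)

-- "2 SQ" is the left-to-right composition  sq . push 2
-- applied to the empty stack:  sq (push 2 ()) == ((), 4)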

Imperative or pure? Both!

As rewriting:    2 SQ ⇒ 2 DUP * ⇒ 2 2 * ⇒ 4
As stack states: 2 ⇒ 2 2 ⇒ 4

: READ ( -- str ) … ;
: EVAL ( str -- val ) … ;
: PRINT ( val -- ) … ;
: LOOP ( -- ) READ EVAL PRINT LOOP ;
: REPL LOOP ;

Stack Shuffling

: MAX 2DUP < IF SWAP THEN DROP ;

5 3 MAX                          3 5 MAX
5 3 2DUP < IF SWAP THEN DROP     3 5 2DUP < IF SWAP THEN DROP
5 3 5 3 < IF SWAP THEN DROP      3 5 3 5 < IF SWAP THEN DROP
5 3 0 IF SWAP THEN DROP          3 5 1 IF SWAP THEN DROP
5 3 DROP                         3 5 SWAP DROP
5                                5 3 DROP
                                 5
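The net effect of MAX, written as a pure stack function in Haskell (a sketch; the name fmax and the nested-pair stack encoding are ours):

fmax :: ((s, Int), Int) -> (s, Int)
fmax ((s, x), y) = (s, if x < y then y else x)
-- 2DUP < computes the flag, IF SWAP THEN selects, DROP discards the loser.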

Local Variables

Can be more readable to drop from function level to value level with local variables. “Lambda” is decoupled into “anonymous function” and “variable binding”.

dup2 (<) if { swap } drop

→ x, y;
if (x < y) { y } else { x }

Locals are simply lambda expressions in disguise—composing instead of applying.

Remember: f g = g ∘ f = λs. g (f s)
f (→ x; g) = λs. (λx. g (snd s)) (fst s)

Translation to Lambdas

Simple translation from concatenative terms to lambda terms:

(a b)′ = λs. b′ (a′ s)

[ a ]′ = λs. pair (λt. a′ t) s   [strict]
       = λs. pair a′ s           [lazy]

dup′ = λs. pair (fst s) s
swap′ = λs. pair (fst (snd s)) (pair (fst s) (snd (snd s)))
…
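The same translation transcribed into Haskell, with pair = (,), fst, and snd from the Prelude (strict quotations; a sketch, with our names compose, quote, and call'):

-- Here the stack is a nested pair whose *first* component is the top.
compose :: (s -> t) -> (t -> u) -> s -> u
compose a b = \s -> b (a s)          -- (a b)′ = λs. b′ (a′ s)

quote :: (s -> t) -> u -> (s -> t, u)
quote a = \s -> (a, s)               -- [ a ]′ pushes the function itself

dup' :: (a, s) -> (a, (a, s))
dup' s = (fst s, s)

swap' :: (a, (b, s)) -> (b, (a, s))
swap' s = (fst (snd s), (fst s, snd (snd s)))

call' :: (s -> t, s) -> t
call' (f, s) = f s                   -- apply the quotation on top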

A Spoonful of Sugar

Having the option to write operators infix makes it easier to copy & tweak math expressions from other languages, even if it breaks concatenativity.

(1 + 2) * (3 + 4)
1 2 (+) 3 4 (+) (*)

b neg + (b ^ 2 - 4 * a * c) sqrt / (2 * a)
b neg b 2 (^) 4 a (*) c (*) (-) sqrt (+) 2 a (*) (/)

Same goes for conditionals: people are accustomed to if…elif…else and can choose a combinator form if they want its specific advantages.

Without local variables? Have fun.

Close mapping from syntax to semantics

● Data flow order matches program order: things happen the way you write them
● Syntax monoid: concatenation and the empty program; semantic monoid: function composition and the identity function on stacks
● Monoid homomorphism from syntax to semantics, preserving the identity and the joining operation
● Not an isomorphism: multiple input programs can map to the same semantics
● Programs compose! The meaning of the concatenation of two programs is the composition of their meanings
● Can be concatenative at the lexical level (Forth, Factor) or the term level (Kitten)
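The homomorphism is easy to exhibit for a toy instruction set (a Haskell sketch; Op, step, and denote are our names):

data Op = Push Int | Add

step :: Op -> [Int] -> [Int]
step (Push n) st      = n : st
step Add (y : x : st) = (x + y) : st
step Add st           = st            -- total here, for the sketch

denote :: [Op] -> ([Int] -> [Int])
denote = foldr (\op rest -> rest . step op) id

-- denote [] == id, and denote (p ++ q) == denote q . denote p:
-- concatenation of programs maps to composition of their meanings.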

Factor(ing) (concatenative.org wiki)

“C”:

var price = customer.orders[0].price;

Factor:

orders>> first price>>

Factor(ing) (concatenative.org wiki)

var orders = (customer == null ? null : customer.orders);
var order = (orders == null ? null : orders[0]);
var price = (order == null ? null : order.price);
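For comparison, the same null-safe chain in Haskell's Maybe monad (the record types here are illustrative, not from the talk):

import Data.Maybe (listToMaybe)

data Customer = Customer { orders :: Maybe [Order] }
data Order    = Order    { price  :: Maybe Int }

priceOfFirstOrder :: Customer -> Maybe Int
priceOfFirstOrder c = orders c >>= listToMaybe >>= price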

dup [ orders>> ] when dup [ first ] when dup [ price>> ] when

Factor(ing) (concatenative.org wiki)

MACRO: maybe ( quots -- )
    [ '[ dup _ when ] ] map [ ] join ;

{ [ orders>> ] [ first ] [ price>> ] } maybe

Value Propositions of Concatenative Programming

● Pure functions are a good default unit of behavior
● Function composition is a good default means of combining behaviors
● Juxtaposition is a convenient notation for composition
● Having a simple language with a strong mathematical foundation makes it easier to develop tooling and reason about code

Implementation

How do we make this efficient?

● Forth: typically threaded code to support dynamic behavior
● Stack is reified in memory for flexibility, but dynamic effects (?DUP, PICK) are frowned upon anyway
● If you have enough arity & type information, you can do ordinary native compilation

define ite
  (R…, (R… → S…), (R… → S…), Bool → S…):
  not if { swap } drop call
  // → f, t, x; if (x) { f } else { t } call

{"good"} {"oh no"} (1 < 2) ite
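What ite computes, as a pure Haskell function on a nested-pair stack (our transcription; the first-pushed quotation is the “then” branch, per the comment above):

ite :: (((s, s -> t), s -> t), Bool) -> t
ite (((s, f), g), cond) = (if cond then f else g) s
-- ite (((() , const "good"), const "oh no"), 1 < 2) evaluates to "good"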

Implementation of Stack-based Languages on Register Machines (1996) M. Anton Ertl

● Spectrum of representations
● Represent the stack in memory
● Cache the top value in a register (huge win for code size & perf.)
● Cache multiple values
● FSM of possible registers in calls
● Conversion to SSA/SSI/CPS
  ○ Program is a post-order flattened data flow graph
  ○ No dynamic stack ops
  ○ Must know arity of functions / generate specializations
  ○ Uses standard register allocation techniques
  ○ Stack shuffling becomes mov or no-op

Linear Lisp

Linear Logic and Permutation Stacks—The Forth Shall Be First (1993) Henry Baker

● Variables are consumed when used; copies must be explicit
● Can be compiled efficiently to a stack machine architecture
● Reduce the von Neumann bottleneck

“A…stack cache utilizes its space on the chip & memory bandwidth better than a register bank of the same capacity […] A linear stack machine should be even more efficient […] all of the data held in the stack cache is live data and is not just tying up space.”

● “Most people describe the top several positions of the Forth stack as ‘locations’, but it is more productive to think of them as ‘busses’, since no addressing is required to read from them at all--the ALU is directly connected to these busses.”
● “…one can conceive of multiple arithmetic operations being performed simultaneously on a number of the top items of the ‘stack’…in parallel”

● Because the call rate is so high, and functions are small, you can use the call stack to store not return addresses, but functions themselves
● A “call” copies the contents of a function onto the return stack (queue) and proceeds
● Can be implemented with a cyclic shift register—small loops are just repeated shifts of this register, no branch prediction required

Value Representation: Boxing?

● Pros: uniform representation, generic functions are easy—no need to generate specializations
● Cons: performance overhead of indirections; need RC or GC
● With no types or full static types, most things can be unboxed
● Small arrays: put elements directly on the stack; size is known
● Closures: copy captured variables onto the stack w/ a function pointer; invoking a closure is just pop+jump
● Otherwise: COW/RC

Static Typing

State of Type Systems in Concatenative Programming

● Most concatenative languages are dynamically typed (Joy, Factor, PostScript) or untyped (Forth)
● There have been a handful of Forths with simple type checkers
● Cat was the first concatenative language with static types based on Hindley–Milner; now defunct
● Nobody else was working on a statically typed one, so I started working on Kitten (2011)

“Simply Aritied” Languages

Approach used in some static Forths: each function has m inputs and n outputs

Type Inference for Stack Languages (2017) Rob Kleffner

dup  : a -- a a
swap : a b -- b a
drop : a --
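Checking such programs amounts to composing (inputs, outputs) pairs left to right; a Haskell sketch of the composition rule (our formulation):

-- Composing a word with effect (m1, n1) followed by one with (m2, n2):
-- the second consumes n1 outputs, borrowing from deeper if it needs more.
composeEffect :: (Int, Int) -> (Int, Int) -> (Int, Int)
composeEffect (m1, n1) (m2, n2) =
  (m1 + max 0 (m2 - n1), n2 + max 0 (n1 - m2))

-- e.g. SQ = DUP * :  composeEffect (1,2) (2,1) == (1,1), i.e. ( n -- n^2 )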

Problem: no stack polymorphism

call₁,₁ : a ( a -- b ) -- b
call₁,₂ : a ( a -- b c ) -- b c
call₂,₁ : a b ( a b -- c ) -- c
…

Typing with Tuples

Stack represented as a product type (tuple); “rest of stack” is polymorphic. Types can get unwieldy—add syntactic sugar to make it usable.

● dup : ∀s a. s × a → s × a × a
● swap : ∀s a b. s × a × b → s × b × a
● drop : ∀s a. s × a → s
● call : ∀s t. s × (s → t) → t
  Modus ponens: given a state & a proof (closure) that it implies a new state, we can get to the new state

define map
  (S…, List<A>, (T…, A → T…, B) → S…, List<B>)

With sugar:

define map
  (List<A>, (A → B) → List<B>)

● All functions are polymorphic wrt. the part of the stack they don’t touch; higher-order functions are higher-rank; recursion is polymorphic
● Complete and Easy Bidirectional Type Checking for Higher-Rank Polymorphism (2013) Joshua Dunfield, Neel Krishnaswami

E.g., the functional argument to map must be applied on different stack states.

map : ∀s a b.
  (s × List a × (s × a → s × b)
  → s × List b)

map : ∀s a b.
  (s × List a × (∀t. t × a → t × b)
  → s × List b)
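The second signature is rank-2 and can be transcribed into Haskell directly (curried rather than stack-passed; mapQ is our name):

{-# LANGUAGE RankNTypes #-}

-- The quotation must work on *any* stack t, so it may be applied
-- at every element regardless of what else is on the stack.
mapQ :: (forall t. (t, a) -> (t, b)) -> (s, [a]) -> (s, [b])
mapQ f (s, xs) = (s, [ snd (f ((), x)) | x <- xs ])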

Challenges with Stack Polymorphism

● Higher-order functions can be polymorphic over the stack—need to generate specializations based on arity (and calling convention)

E.g., the functional argument to dip may have an arbitrary (but known) effect.

dip : ∀s t a. (s × a × (s → t) → t × a)

{ drop } dip  =  swap drop

{ "meow" } dip "meow" swap Representing ● Can’t “do” anything with only pure functions; should we throw up our Effects hands and have an impure language? (Forth, Factor, Cat, &al.) ● Haskell uses monads: represent actions as values, build them with pure functions; under the hood, compile to imperative code ● Problem: monads don’t compose—can’t (always, easily) mix effects ● Solution: algebraic effects Effect Types define newline (-> +IO): "\n" print (Permissions) in Kitten define print_or_fail (Bool -> +IO +Fail): -> x; if (x): "good" print else: "bad" fail

If f needs +A and g needs +B, then f g needs +A +B (equivalently +B +A; permissions compose commutatively).
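A loose Haskell analogue: mtl-style class constraints also form a commutative union across calls (MonadIO and MonadFail are real classes from base; the analogy to Kitten permissions is ours):

import Control.Monad.IO.Class (MonadIO, liftIO)

-- Needs both "permissions"; the constraint set is unordered,
-- like +IO +Fail vs. +Fail +IO.
printOrFail :: (MonadIO m, MonadFail m) => Bool -> m ()
printOrFail x =
  if x
    then liftIO (putStrLn "good")
    else fail "bad"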

Inspired by Koka (2012) Daan Leijen

Compositional: a function has the effects of the functions it calls.
Polymorphic: a higher-order function has the effect of its argument:

map (List<A>, (A → B +P) → List<B> +P)

● Effects: enforce what a function is allowed to do (e.g. I/O, unsafe)
● Coeffects: enforce constraints on the environment where a function is called (e.g. platform)
● RAII: a “handler” that discharges a permission (e.g. locking)
● Optimizations: functions can be reordered iff their permissions are commutative

Finally…

Summary

● Simple, elegant foundation
● Surprising connections to deep areas of logic & mathematics
● Admits efficient implementation both in theory and in practice
● Amenable to programming “exotic” machines (stack architectures, reversible/quantum computers)
● Easy to reason about, modify, & refactor programs; easy to write good tooling with confidence
● Naturally supports static types and effect typing

Questions?

Bonus

Forth style: “compiling” vs. “interpreting” words (or mixed, depending on STATE). Factor uses this with its “macros” and “parsing words”.

Treat preceding terms as stack, evaluating code at compile time to construct new terms:

"%s: %d" #printf Term → Term List, Int32 → +IO Bonus: Arrows

Bonus: Arrows

Concatenative programming is closely related to the “arrows” of John Hughes for describing static data flow graphs.

// ***
(f *** g) (x, y) = (f x, g y)
both : (A, B, (A → C), (B → D) → C, D)
x y \f \g both

// &&&
(f &&& g) x = (f x, g x)
both_to : (A, (A → B), (A → C) → B, C)
x \f \g both_to

// first
dip : (S…, A, (S… → T…) → T…, A)
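The Haskell combinators referenced above, specialized to ordinary functions (all exported by base's Control.Arrow; the demo values are ours):

import Control.Arrow ((***), (&&&), first)

demo :: ((Int, String), (Int, Int), (String, Int))
demo =
  ( (negate *** reverse) (3, "abc")  -- (-3, "cba")
  , (succ &&& pred) 5                -- (6, 4)
  , first show (42, 7)               -- ("42", 7)
  )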