On the Duality of Streams: How Can Linear Types Help to Solve the Lazy IO Problem?
Draft: do not distribute. Submitted to IFL 2015, 2016/3/1.

Jean-Philippe Bernardy, Josef Svenningsson
Chalmers University of Technology and University of Gothenburg
{bernardy,josefs} at chalmers.se

Abstract

We present a novel stream-programming library for Haskell. Like other coroutine-based stream libraries, our library allows synchronous execution, which implies that effects are run in lockstep and no buffering occurs.

A novelty of our implementation is that it allows buffering or re-scheduling of effects to be introduced locally. The buffering requirements (or re-scheduling opportunities) are indicated by the type system.

Our library is based on a number of design principles, adapted from the theory of Girard's Linear Logic. These principles are applicable to the design of any Haskell structure where resource management (memory, IO, ...) is critical.

Categories and Subject Descriptors D.1.1 [Applicative (Functional) Programming]; D.3.3 [Language Constructs and Features]: Coroutines

Keywords Streams, Continuations, Linear Types

1. Introduction

As Hughes (?) famously noted, the strength of functional programming languages resides in the composition mechanisms that they provide. That is, simple components can be built and understood in isolation; one does not need to worry about interference effects when composing them. In particular, lazy evaluation makes it possible to construct complex programs by pipelining simple list transformation functions. Indeed, while strict evaluation forces each intermediate result to be fully reified between computational steps, lazy evaluation allows all the computations to run concurrently, often without ever allocating more than a single intermediate element at a time.

Unfortunately, lazy evaluation suffers from two drawbacks. First, it has unpredictable memory behavior. Consider the following function composition:

    f :: [a] -> [b]
    g :: [b] -> [c]
    h = g . f

One hopes that, at run-time, the intermediate list ([b]) will only be allocated element-wise, as outlined above. Unfortunately, this desired behavior does not always happen. Indeed, a necessary condition is that the production pattern of f matches the consumption pattern of g; otherwise buffering occurs. In practice, this means that a seemingly innocuous change in either of the function definitions may drastically change the memory behavior of the composition, without warning. If one cares about memory behavior, this means that the compositionality principle touted by Hughes breaks down.

Second, lazy evaluation does not extend nicely to effectful processing. That is, if (say) an input list is produced by reading a file lazily, one is exposed to losing referential transparency (as ? has shown). For example, one may rightfully expect [1] that both of the following programs have the same behavior:

    main = do inFile <- openFile "foo" ReadMode
              contents <- hGetContents inFile
              putStr contents
              hClose inFile

    main = do inFile <- openFile "foo" ReadMode
              contents <- hGetContents inFile
              hClose inFile
              putStr contents

Indeed, the putStr and hClose commands act on unrelated resources, and thus swapping them should have no observable effect. However, while the first program prints the foo file, the second one prints nothing. Indeed, because hGetContents reads the file lazily, the hClose operation has the effect of truncating the list. In the first program, printing the contents forces the file to be read. One may argue that hClose should not be called in the first place; but then, closing the handle happens only when the contents list can be garbage collected (in full), and relying on garbage collection for cleaning up resources is brittle; furthermore, this effect compounds badly with the first issue discussed above. If one wants to use lazy effectful computations, again, the compositionality principle is lost.

[1] This expectation is expressed in a Stack Overflow question, accessible at this URL: http://stackoverflow.com/questions/296792/haskell-io-and-closing-files

In this paper, we propose to tackle both of these issues by mimicking the computational behavior of Girard's linear logic (?) in Haskell. In fact, one way to read this paper is as an advocacy for linear types support in Haskell. While Kiselyov's iteratees (?) already solve the issues described above, our grounding in linear logic yields a rich structure of types for data streams, capturing various production and consumption patterns.

First, the type corresponding to on-demand production of elements is called a source (Src). An adaptation of the first code example above to use sources would look as follows, and give the guarantee that the composition does not allocate more memory than the sum of its components.

    f :: Src a -> Src b
    g :: Src b -> Src c
    h = g . f

Second, the type driving the consumption of elements is called a sink (Snk). For example, the standard output is naturally given a sink type:

    stdoutSnk :: Snk String

Using it, we can implement the printing of a file as follows, and guarantee the timely release of resources, even in the presence of exceptions:

    main = fileSrc "foo" `fwd` stdoutSnk

In the above, fileSrc provides the contents of a file, and fwd forwards data from a source to a sink. The types are as follows:

    fileSrc :: FilePath -> Src String
    fwd :: Src a -> Snk a -> IO ()

Sources provide data on-demand, while sinks decide when they are ready to consume data. This is an instance of the push/pull duality. In general, push-streams control the flow of computation, while pull-streams respond to it. We will see that this polarization does not need to match the flow of data. We support in particular data sources with push-flavor, called co-sources (CoSrc). Co-sources are useful for example when a data stream needs precise control over the execution of effects it embeds (see Sec. 6). For example, sources cannot be demultiplexed, but co-sources can.

In a program which uses both sources and co-sources, the need might arise to compose a function which returns a co-source with a function which takes a source as input: this is the situation where list-based programs would silently cause memory allocation. In our approach, this mismatch is caught by the type system, and the user must explicitly conjure a buffer to be able to write the composition:

    f :: Src a -> CoSrc b
    g :: Src b -> Src c
    h = g . buffer . f

The contributions of this paper are

• The formulation of principles for compositional resource-aware programming in Haskell (resources include memory and files). The principles are linearity, duality, and polarization. While borrowed from linear logic, as far as we know they have not been applied to Haskell programming before.

• An embodiment of the above principles, in the form of a Haskell library for streaming IO. Besides supporting compositionality as outlined above, our library features two concrete novel aspects:

  1. A more lightweight design than state-of-the-art co-routine based libraries.
  2. Support for explicit buffering and control structures, while still respecting compositionality (Sec.

2. Preliminary: negation and continuations

In this section we recall the basics of continuation-based programming. We introduce our notation, and justify effectful continuations.

We begin by assuming a type of effects Eff, which we keep abstract for now. We can then define negation as follows:

    type N a = a -> Eff

A shortcut for double negations is also convenient.

    type NN a = N (N a)

The basic idea (imported from classical logic) pervading this paper is that producing a result of type α is equivalent to consuming an argument of type Nα. Dually, consuming an argument of type α is equivalent to producing a result of type Nα. In this paper we call these equivalences the duality principle.

In classical logic, negation is involutive; that is: NN α = α. However, because we work within Haskell, we do not have this equality [2]. We can come close enough though. First, double negations can always be introduced, using the shift operator:

    shift :: a -> NN a
    shift x k = k x

Second, it is possible to remove double negations, but only if an effect can be output. Equivalently, triple negations can be collapsed to a single one:

    unshift :: N (NN a) -> N a
    unshift k x = k (shift x)

The above two functions are the return and join of the double negation monad [3]; indeed, adding a double negation in the type corresponds to sending the return value to its consumer. However, we will not be using this monadic structure anywhere in the following. Indeed, single negations play a central role in our approach, and the monadic structure is a mere diversion.

2.1 Structure of Effects

When dealing with purely functional programs, continuations have no effects. In this case, one can let Eff remain abstract, or define it to be the empty type: Eff = ⊥. This is also the natural choice when interpreting the original linear logic of Girard (?).

The pure logic makes no requirement on effects, but interpretations may choose to impose a richer structure on them. Such interpretations would then not be complete with respect to the logic, but they would remain sound. In our case, we first require Eff to be a monoid. Its unit (mempty) corresponds to program termination, while the operator (mappend) corresponds to sequential composition of effects. (This structure is standard to interpret the HALT and MIX rules in linear logic (??).)

For users of the stream library, Eff will remain an abstract monoid. However, in this paper we will develop concrete effectful streams, and therefore we greatly extend the structure of effects. In fact, because we will provide streams interacting with files and other operating-system resources, and write the whole code in standard Haskell, we must pick Eff = IO (), and ensure that Eff
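The claim that buffering occurs when the production pattern of f does not match the consumption pattern of g can be illustrated with ordinary lists. The following is only a sketch: the names streaming and buffering are hypothetical and are not part of the library presented in this paper. Both pipelines compute the same number, but the second one inserts reverse, which cannot emit its first element before it has traversed its whole input, so the entire intermediate list is typically materialised in memory.

```haskell
import Data.List (foldl')

-- Hypothetical example: the producer (map) emits one element at a time and the
-- consumer (a strict left fold) consumes one element at a time, so the
-- intermediate list can be allocated element-wise.
streaming :: [Int] -> Int
streaming = foldl' (+) 0 . map (* 2)

-- Hypothetical example: 'reverse' must see the end of its input before it can
-- produce anything, so the whole intermediate list is buffered.
buffering :: [Int] -> Int
buffering = foldl' (+) 0 . map (* 2) . reverse

main :: IO ()
main = do
  print (streaming [1 .. 1000000])  -- 1000001000000
  print (buffering [1 .. 1000000])  -- same result, different memory profile
```

Nothing in the types of the two functions distinguishes them, which is precisely the kind of silent change in memory behavior discussed in the introduction.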
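For comparison with the two lazy-IO programs that differ only in the order of putStr and hClose, a common workaround is to force the whole contents while the handle is still open. The sketch below is not the approach proposed in this paper; readStrictly is a hypothetical helper, and the file name foo is simply the one used in the running example. Depending on the GHC version, reading the contents after hClose without such forcing may yield an empty string or raise an error.

```haskell
import Control.Exception (evaluate)
import System.IO

-- Hypothetical helper (not from the paper): read a file via lazy IO, but force
-- the entire contents before the handle is closed.
readStrictly :: FilePath -> IO String
readStrictly path = do
  inFile   <- openFile path ReadMode
  contents <- hGetContents inFile
  -- Computing the length traverses the whole list, which reads the whole file.
  _ <- evaluate (length contents)
  hClose inFile
  return contents

main :: IO ()
main = do
  writeFile "foo" "hello\n"       -- create the input file for the demonstration
  contents <- readStrictly "foo"
  putStr contents                 -- printing after hClose is now safe
```

The price of this workaround is that the whole file resides in memory at once, and the handle is not released if an exception occurs before hClose; this is the trade-off that motivates a streaming interface with explicit resource management.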
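The negation types and the shift and unshift operators of Section 2 can be exercised directly once a concrete type of effects is chosen. The sketch below assumes Eff = IO () for the sake of a runnable example; printNN is a hypothetical consumer introduced here for illustration only.

```haskell
-- Assumption for this sketch: effects are IO actions.
type Eff  = IO ()
type N a  = a -> Eff
type NN a = N (N a)

-- Introduce a double negation: hand the value to whatever continuation arrives.
shift :: a -> NN a
shift x k = k x

-- Collapse a triple negation into a single one.
unshift :: N (NN a) -> N a
unshift k x = k (shift x)

-- Hypothetical consumer of a double-negated Int: it supplies 'print' as the
-- continuation that will eventually receive the Int.
printNN :: N (NN Int)
printNN nn = nn print

main :: IO ()
main = do
  shift (42 :: Int) print   -- the continuation 'print' is run on 42
  unshift printNN 7         -- unfolds to printNN (shift 7), i.e. print 7
```

Unfolding the definitions shows why the second call prints 7: unshift printNN 7 = printNN (shift 7) = shift 7 print = print 7.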