Transactions of the Institute of Systems, Control and Information Engineers, Vol. 33, No. 9, pp. 253–258, 2020

Paper

Concept and Design of Functional Programming for Interactive Art Polyphonic Jump!*

Ichiroh Kanaya†

This article presents the mathematical background of general interactive systems. The first principle of designing a large system is “divide and rule,” which implies that we can reduce human error by dividing a large system into smaller subsystems. Interactive systems, however, are often composed of many subsystems that are organically connected to one another and are thus extremely difficult to divide. In other words, we cannot apply the traditional mechanism of mathematical functions to the programming of interactive systems. We can overcome this difficulty by applying the framework of category theory to the programming, but this requires highly abstract mathematics, which is not widely familiar. In this article we introduce the fundamental idea of category theory using only λ-calculus, and then demonstrate how it can be used in the practical design of an interactive system. Finally, we mention how this discussion relates to the Kleisli category in mathematics.

Manuscript Received Date: July 29, 2019
∗ The material of this paper was partially presented at the 63rd Annual Conference of the Institute of Systems, Control and Information Engineers (SCI’19), which was held in May 2019.
† School of Information and Data Sciences, Nagasaki University; 1-14 Bunkyomachi, Nagasaki 852-8521, JAPAN
Key Words: interactive art, functional programming, referential transparency, category theory, monad in programming.

1. Introduction

The Oxford English Dictionary (OED) defines the word “function” as 1 an activity that is natural to or the purpose of a person or thing: ‘bridges perform the function of providing access across water’; ‘bodily functions’. 2 [Mathematics] a relation or expression involving one or more variables: ‘the function (bx + c).’

As defined by the OED, the word function has (at least) a double meaning: activity (the first meaning) and relation (the second meaning). The relation is often referred to as a mapping in mathematics, which typically implies referential transparency.

The λ-calculus is likewise built on mappings in the mathematical sense, and it is equivalent in computational power to a Turing machine. This means that every program (computer code) can be represented in the λ-calculus[1,2].

However, there are often difficulties in regarding interactive systems as a mapping. Moggi tackled this problem and found a unique solution: he applied the Kleisli category and regarded a function as a morphism, which is a more abstract concept than an ordinary mapping[3]. Moggi discovered that these interactive systems cannot be divided into smaller λ-calculus expressions. He explained that this was due to the mismatch between the types of input and output in the λ-calculus, and suggested that this gap could be overcome by regarding an interactive system as a morphism from a function to an action[4].

Moggi’s theory uses highly abstract mathematics and is generally difficult for average-level computer scientists to understand. However, Moggi’s concept is understandable without a deep understanding of Kleisli’s category theory.

In this paper we first interpret Moggi’s discussion without using category theory and then explain how this theory can be applied to our interactive art called Polyphonic Jump![5]. Finally, we provide a rigorous proof of our discussion by following Moggi’s original proposal.

2. Equation of Interaction

Let x and y be an input to and an output from a certain system, respectively. In general, neither x nor y is a scalar. Hereafter, we assume that all functions are referentially transparent except for those we explicitly mark otherwise.

Interactive systems can be classified into one of the following four classes. A class 0 system outputs a constant value, while a class 1 system outputs a value that is a function of time. A class 2 system outputs a value that is a function of arbitrary inputs (including time). A class 3 system outputs a value that is a function of an arbitrary input and its internal status.

Class 0: output y is constant, that is, y = c, where c is a constant value.


Class 1: output y is a function of time τ.1 We denote this function as f and call it the transfer function. We assume that all functions used in this paper are curried and that function application is left-associative, as is conventional in the λ-calculus. The equation for class 1 is

$$y = f\,\tau. \tag{1}$$

Class 2: output y is a function of an arbitrary value x. Thus, the equation is

$$y = f\,x. \tag{2}$$

Class 3: output y is a function of an arbitrary input x and an internal status s. If we allowed referential opacity of function f, we would obtain the following equation:

$$y = f!\,s\,x, \tag{3}$$

where function f! changes its behavior based on its internal status s, and rewrites the value of s whenever the function is evaluated. Because rewriting any variable is not allowed in this discussion, we must abandon this disruptive function f!.

1 Rich Hickey, the designer of the Clojure language, pointed out the difficulty of modeling time itself[9].

One well-known method for retaining referential transparency is placing the internal status outside the box. For example,

$$[y, t] = f\,[x, s] \tag{4}$$

is a referentially transparent equation. Here function f returns a pair consisting of output y and a new internal status t. To match the types of input and output, the argument is also a pair.

Assume we have a function named “inject” given by

$$\mathrm{inject}\,x \equiv \backslash s \to [x, s], \tag{5}$$

where \ denotes the λ of the λ-calculus. The function “inject” abstracts the internal status s, and thus we can call inject x an input with context.

Because we wish to apply transfer function f to input x, output y should be

$$y = \mathrm{inject}\,(f\,x). \tag{6}$$

Although Equation (6) is perfectly correct, it is not practical because output y is with context while input x is without context. A practical transfer function, say F, should look something like

$$y = F\,(\mathrm{inject}\,x). \tag{7}$$

Let us extract transfer function f as a parameter of function F as

$$F = \mathrm{bind}\,f \tag{8}$$

so that we can obtain

$$y = \mathrm{bind}\,f\,(\mathrm{inject}\,x). \tag{9}$$

Equation (9) of a Class 3 system corresponds to Equation (2) of a Class 2 system.

Next we introduce several symbols to improve readability. The $\dagger$ symbol denotes the injection operator and is given by

$$x^{\dagger} \equiv \mathrm{inject}\,x. \tag{10}$$

The $\star$ symbol denotes the binding operator and is given by

$$f^{\star}\,m \equiv \mathrm{bind}\,f\,m. \tag{11}$$

Equation (9) can be simplified by these operators as follows:

$$y = f^{\star}\,x^{\dagger}. \tag{12}$$

The binding operator is defined as

$$f^{\star}\,m \equiv \backslash s \to f\,x\,s' \quad \mathrm{where}\ [x, s'] = m\,s, \tag{13}$$

where the keyword where declares local variables, which can easily be translated into ordinary λ-calculus. Since almost all practical programming languages provide syntax for declaring local variables, we follow the popular programming-language style and use let...in instead of where:

$$f^{\star}\,m \equiv \backslash s \to \mathrm{let}\ [x, s'] = m\,s\ \mathrm{in}\ f\,x\,s', \tag{14}$$

where “let...in” is defined as

$$\mathrm{let}\ a = b\ \mathrm{in}\ c \equiv (\backslash a \to c)\,b. \tag{15}$$

3. Composition of Transfer Functions

Assume that transfer function f is the composition of two different transfer functions g and h; that is,

$$f = h \bullet g, \tag{16}$$

where

$$b \bullet a \equiv \backslash z \to b\,(a\,z). \tag{17}$$

Let x be the input to the system, and m be a contextual version of x. Thus m is given by

$$m \equiv x^{\dagger}. \tag{18}$$

Furthermore, let

$$n \equiv g^{\star}\,m, \tag{19}$$

where

$$n = \backslash s \to \mathrm{let}\ [x, s'] = m\,s\ \mathrm{in}\ g\,x\,s'. \tag{20}$$

Now we can expand $h^{\star}\,n$ as

$$h^{\star}\,n = \backslash t \to \mathrm{let}\ [x, s] = m\,t,\ [y, t'] = g\,x\,s\ \mathrm{in}\ h\,y\,t'. \tag{21}$$
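The definitions of inject (Equation (5)) and bind (Equation (14)), together with the expansion in Equations (18) to (21), can be tried out in a few lines of code. The following is a minimal Python sketch, not the implementation used in Polyphonic Jump!. It assumes that the pair [x, s] is represented as a tuple; the transfer functions g and h, the integer status, and the helper name pipeline are hypothetical and chosen only for illustration.

```python
def inject(x):
    """Equation (5): inject x = \\s -> [x, s]."""
    return lambda s: (x, s)


def bind(f):
    """Equation (14): bind f m = \\s -> let [x, s'] = m s in f x s'."""
    def bound(m):
        def run(s):
            x, s_next = m(s)      # evaluate the contextual input first
            return f(x)(s_next)   # then apply f to the value and the new status
        return run
    return bound


def g(x):
    """Hypothetical transfer function: doubles x and counts one evaluation."""
    return lambda s: (2 * x, s + 1)


def h(y):
    """Hypothetical transfer function: adds the status to y, then counts again."""
    return lambda t: (y + t, t + 1)


def pipeline(x):
    """Equations (18)-(21): m = inject x, n = bind g m, result = bind h n."""
    m = inject(x)
    n = bind(g)(m)
    return bind(h)(n)


if __name__ == "__main__":
    # Initial status 0: g yields (10, 1), then h yields (10 + 1, 2).
    print(pipeline(5)(0))  # (11, 2)
```

Because every function returns its new status instead of rewriting a variable, evaluating pipeline(5)(0) repeatedly always yields the same pair, which is the referential transparency that the discussion relies on.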


With $m = x^{\dagger}$ from Equation (18), Equation (21) becomes

$$h^{\star}\,(g^{\star}\,x^{\dagger}) = \backslash s \to \mathrm{let}\ [x, s'] = x^{\dagger}\,s,\ [y, s''] = g\,x\,s'\ \mathrm{in}\ h\,y\,s''. \tag{22}$$

Here h is applied to the output y of g, exactly as in the plain composition $(h \bullet g)\,x$ of Equation (17), while the status is threaded from s through s' to s''.

As seen above, transfer functions can be combined using the binding operator. Composition through the binding operator keeps the context, as shown in Equation (22), and also maintains the order of evaluation of the functions because the operator follows Equation (17).

Now we use another operator, $\diamond$, denoting quick composition of transfer functions. We can think of applying a non-contextual function $f_{NC}$ to a contextual m as

$$f_{NC}^{\diamond}\,m \equiv \backslash s \to (f_{NC}\,x)^{\dagger}\,s' \quad \mathrm{where}\ [x, s'] = m\,s. \tag{23}$$

Operator $\diamond$ gives context to function $f_{NC}$, and is known as a functor, as discussed later.

4. An Example: Polyphonic Jump!

Polyphonic Jump! is a system that allows humans to be immersed in a fantasy world in which many creatures create a polyphonic chorus. The audience stands in front of a huge canvas on which a picture of a forest has been painted in oils, and individuals jump to interact with oil-painted animals on the canvas as if they were also on the canvas. These individuals feel as though they are actually in a picture book[5].

For seamless integration of the physical painting, which presents true reality, and the computer-generated animation, which moves dynamically and interacts with the audience, we have incorporated real-time 3D modeling and projection technology in this artwork (see Figure 1).

Fig. 1 Polyphonic Jump! (Mayuko Kanazawa, Masataka Imura, and Ichi Kanaya)

As shown in Figure 2, Polyphonic Jump! has the following subunits: (A) clock generator, (B) image capturing unit, (C) animation frame database, (D) motion sensor, (E) animation generator, and (F) renderer. In the figure, white denotes trivially referentially transparent units and gray denotes non-trivially referentially transparent units. Arrows show the flow of information.

Fig. 2 System structure of Polyphonic Jump!

(A) The clock generator synchronizes all units by controlling the renderer (F).
(B) The image capturing unit captures a figure in the audience.
(C) The animation frame database retrieves each frame of the animations.
(D) The motion sensor returns True if a member of the audience is jumping, otherwise False.
(E) The animation generator (also referred to as the animation synthesizer unit), referring to the motion sensor (D), generates frame information in XML format based on the current time. Animation in this artwork is complex because multiple sequences run at different timings/speeds.1
(F) The renderer renders a frame based on the XML information given by the animation generator (E) and images from the animation frame database (C).

1 The animation synthesizer unit generates the appropriate frame number of the animation frame database (C) so that the renderer (F) can draw the required frame in time. This unit holds its internal status (e.g. a frame counter) and thus it is hard to divide in terms of referential transparency.

Fig. 3 Example of animation sequence of Polyphonic Jump!

Units (B), (C), (D), and (F) are trivially referentially transparent; in other words, they are categorized as Class 2 of Section 2. Unit (B) is a function that takes a time value and returns an image, unit (C) is a function that takes a query and returns images, unit (D) is a function that returns the audience’s motion, and unit (F) is a function that takes the frame information and returns a computer-graphics image.2

2 The image capturer (B) captures a camera image at every clock edge and provides the image to the renderer (F). The renderer (F) over/under-lays the captured image above/below the animation frame, which is given by the frame database (C) and indexed by the animation synthesizer unit (E).

Unit (A) returns a time-variant value. Note that f of Equation (1) of Class 1 is referentially transparent, though τ is not. Unit (A) is, however, still referentially transparent when we consider that it always returns an “evaluate the current time” action.

Conceptually, unit (E) has its own internal status, because it runs a pre-defined animation sequence (normal status) and starts a new animation when the motion sensor triggers the unit (triggered status). After a certain time, unit (E) returns to its normal status.

The actual unit (E) was designed to be completely referentially transparent. The internal status is given to the unit and proceeds through it as an action (a λ-calculus expression), which is described as s in Equation (4). This action is eventually evaluated in unit (F) once rendering has started.

Polyphonic Jump! assigns time (the action to evaluate the current time) to variable x in Equation (5), and the status of the animation generator to context s; thus the rest of the system does not need to recognize any internal status of connected units.

5. Note on Monad of Category Theory

We define a category $\mathcal{C}$ with objects A, B, ... and a morphism φ. Objects can be monoids, including the set of integers, lists of scalars, and trees of scalars. Morphism φ can be, for example, a function that returns the length of a list.

If we have the identity functor $\mathrm{id}_{\mathcal{C}}$ and a functor $T$ from category $\mathcal{C}$ to $\mathcal{C}$, we consider the following natural transformations η and μ:

$$\eta : \mathrm{id}_{\mathcal{C}} \to T, \tag{24}$$

$$\mu : T^{2} \to T. \tag{25}$$

Moreover, if η and μ are compatible with functor $T$, i.e., $\mu_A \bullet \eta_{TA} = \mu_A \bullet T\eta_A = \mathrm{id}_{TA}$ and $\mu_A \bullet T\mu_A = \mu_A \bullet \mu_{TA}$, the triple $[T, \eta, \mu]$ is called a monad in category theory[2].

Kleisli introduced operator $\star$ instead of the transformation μ of category theory, and called the triple $[T, \eta, \star]$ a Kleisli triple. Operator $\star$ follows these equations:

$$(\eta_A)^{\star} = \mathrm{id}_{TA}, \tag{26}$$

$$f^{\star} \bullet \eta_A = f, \tag{27}$$

$$g^{\star} \bullet f^{\star} = (g^{\star} \bullet f)^{\star}, \tag{28}$$

where morphism f maps A to $TB$ and g is another such morphism, from B to $TC$. Figure 4 illustrates the relationship among functor $T$, the natural transformation η, and operator $\star$.

Fig. 4 Relationship of operators $T$, $\eta$, $\star$ of Kleisli Triple

Kleisli’s triple is identical to our triple $[\diamond, \dagger, \star]$, which is called a Monad in Programming. For example, the triple [fmap, return, >>=] in the programming language Haskell is identical to Kleisli’s triple.

6. Concluding Remarks

In this paper we presented a rigorous mathematical framework for interactive systems. A difficulty in describing such interactive systems lies in dividing them into subsystems, owing to the organic connection of every part of the system. Global variables, hidden contexts, and non-referentially-transparent functions are examples of this difficulty in programming[6].

Referential transparency is a popular concept among mathematicians for reducing complexity. We can regard a function as a projection of values if the function is referentially transparent. The domain and codomain of a function are monoids if they have an identity projection. This means that such a projection can intuitively be divided into composite projections, thereby reducing the complexity for programmers. For this reason, some domain-specific languages for scientific computing support referential transparency[7,8].

Programs for interactive systems must consider both the input from users and the output to users, and thus they cannot be discussed simply as purely mathematical mappings. For example, composition of monoids is well studied and can be applied to scientific computing; however, it cannot be applied to interactive systems.

Interactive systems are, however, projections (morphisms) in terms of the Kleisli category. The Kleisli triple is identical to the monad of programming.

This paper showed that interactive systems can be described as a composition of subsystems without using highly abstract mathematics. It also illustrated the concrete example of Polyphonic Jump! and showed how our discussion corresponds with traditional category theory. Referential transparency is not the only way to divide interactive systems into subsystems. The monad of programming can spatially divide a system, while the continuation of programming can temporally divide a system. Unfortunately, continuation is known to disrupt referential transparency; however, we can still hope for the existence of a more abstract mechanism that treats referential transparency and continuation equally[9].

Acknowledgements

The author thanks Mayuko Kanazawa for leading the creation of Polyphonic Jump! The author also thanks Masataka Imura for total engineering of the work. This research was partially supported by JSPS KAKENHI Grant Number 17K00728, 18H04203, and 18K11396.

References

[1] A. Church: An unsolvable problem of elementary number theory; American Journal of Mathematics, Vol. 58, pp. 345–363 (1936)
[2] A. M. Ben-Amram: The Church-Turing thesis and its look-alikes; ACM SIGACT News, Vol. 36, No. 3, pp. 113–114 (2005)
[3] E. Moggi: Notions of computation and monads; Information and Computation, Vol. 93, No. 1 (1991)
[4] Y. Onoue: Self-reproducing programs; Information Processing Society of Japan (IPSJ) Magazine, Vol. 47, No. 3 (2006) (Japanese)
[5] M. Kanazawa, M. Imura and I. Kanaya: Polyphonic Jump!; Proc. 12th Conf. Institute of Environmental Art and Design (2011) (Japanese)
[6] G. Cousineau and M. Mauny: The Functional Approach to Programming, Cambridge University Press (1998)
[7] A. Umemura: Modification of algebraic specifications based on monads; Trans. , Vol. 93, No. 97, pp. 1–8, Information Processing Society of Japan (IPSJ) (1993) (Japanese)
[8] S. Wolfram: The Mathematica Book, Fourth Edition, Cambridge University Press (1999)
[9] R. Hickey: Are we there yet?; Proc. JVM Language Summit 2009, https://http://bit.ly/hickey2009 (2009)


Authors

Ichiroh Kanaya (Member)
Ichiroh (Ichi) Kanaya is a director of pineapple.cc and a professor of the School of Information and Data Sciences, Nagasaki University, Japan. He majored in computer science and its application to media art and design. He is also interested in applying computer technologies to the survey, restoration, and conservation of cultural properties, including the Pyramids in Egypt.
