University of Pennsylvania ScholarlyCommons Publicly Accessible Penn Dissertations 2017 Regular Programming Over Data Streams Mukund Raghothaman University of Pennsylvania,
[email protected] Follow this and additional works at: https://repository.upenn.edu/edissertations Part of the Computer Sciences Commons Recommended Citation Raghothaman, Mukund, "Regular Programming Over Data Streams" (2017). Publicly Accessible Penn Dissertations. 2540. https://repository.upenn.edu/edissertations/2540 This paper is posted at ScholarlyCommons. https://repository.upenn.edu/edissertations/2540 For more information, please contact
[email protected]. Regular Programming Over Data Streams Abstract Data streams arise in a variety of applications, such as feeds from financial markets, event streams from sensors and medical devices, logs produced by long-running programs, click-streams from websites, and packet sequences passing through internet routers. In this thesis, we are concerned with computing quantitative statistics over these streams, and with expressing transformations in the related domain of strings. Many string transformations are instances of simple patterns, such as inserting, deleting and replacing substrings, or applying a function to each element in the stream. Over data streams, the task is usually to compute some simple quantitative statistic, such as counting the number of occurrences of a pattern or the mean time between occurrences of an event. There has traditionally been limited programming language support for stream processing, and programmers are forced to write low-level code, by manually maintaining state and updating it on seeing each new input element. This sacrifices both ease of expression and amenability to static analysis. We propose a simple, expressive programming model for stream transformations, with strong theoretical foundations and fast evaluation algorithms.