Types from Data: Making Structured Data First-Class Citizens in F#


Tomas Petricek, University of Cambridge, [email protected]
Gustavo Guerra, Microsoft Corporation, London, [email protected]
Don Syme, Microsoft Research, Cambridge, [email protected]

Abstract

Most modern applications interact with external services and access data in structured formats such as XML, JSON and CSV. Static type systems do not understand such formats, often making data access more cumbersome. Should we give up and leave the messy world of external data to dynamic typing and runtime checks? Of course not!

We present F# Data, a library that integrates external structured data into F#. As most real-world data does not come with an explicit schema, we develop a shape inference algorithm that infers a shape from representative sample documents. We then integrate the inferred shape into the F# type system using type providers. We formalize the process and prove a relative type soundness theorem.

Our library significantly reduces the amount of data access code and it provides additional safety guarantees when contrasted with the widely used weakly typed techniques.

Categories and Subject Descriptors D.3.3 [Programming Languages]: Language Constructs and Features

Keywords F#, Type Providers, Inference, JSON, XML

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, contact the Owner/Author. Request permissions from [email protected] or Publications Dept., ACM, Inc., fax +1 (212) 869-0481. Copyright 2016 held by Owner/Author. Publication Rights Licensed to ACM.
PLDI '16 June 13–17, 2016, Santa Barbara, CA, United States
Copyright © 2016 ACM 978-1-nnnn-nnnn-n/yy/mm. $15.00
DOI: http://dx.doi.org/10.1145/(to come)

1. Introduction

Applications for social networks, finding tomorrow's weather or searching train schedules all communicate with external services. Increasingly, these services provide end-points that return data as CSV, XML or JSON. Most such services do not come with an explicit schema. At best, the documentation provides sample responses for typical requests.

For example, http://openweathermap.org/current contains one example to document an end-point to get the current weather. Using standard libraries, we might call it as follows (see footnote 1):

    let doc = Http.Request("http://api.owm.org/?q=NYC")
    match JsonValue.Parse(doc) with
    | Record(root) ->
        match Map.find "main" root with
        | Record(main) ->
            match Map.find "temp" main with
            | Number(num) -> printfn "Lovely %f!" num
            | _ -> failwith "Incorrect format"
        | _ -> failwith "Incorrect format"
    | _ -> failwith "Incorrect format"

Footnote 1. We abbreviate the full URL and omit the application key (available after registration). The returned JSON is shown in Appendix A and can be used to run the code against a local file.

The code assumes that the response has a particular shape described in the documentation. The root node must be a record with a main field, which has to be another record containing a numerical temp field representing the current temperature. When the shape is different, the code fails. While not immediately unsound, the code is prone to errors if strings are misspelled or an incorrect shape is assumed.

Using the JSON type provider from F# Data, we can write code with exactly the same functionality in two lines:

    type W = JsonProvider<"http://api.owm.org/?q=NYC">
    printfn "Lovely %f!" (W.GetSample().Main.Temp)

JsonProvider<"..."> invokes a type provider [23] at compile-time with the URL as a sample. The type provider infers the structure of the response and provides a type with a GetSample method that returns a parsed JSON with nested properties Main.Temp, returning the temperature as a number.

In short, the types come from the sample data. In our experience, this technique is both practical and surprisingly effective in achieving more sound information interchange in heterogeneous systems. Our contributions are as follows:

• We present F# Data type providers for XML, CSV and JSON (§2) and practical aspects of their implementation that contributed to their industrial adoption (§6).
• We describe a predictable shape inference algorithm for structured data formats, based on a preferred shape relation, that underlies the type providers (§3).
• We give a formal model (§4) and use it to prove relative type safety for the type providers (§5).

2. Type providers for structured data

We start with an informal overview that shows how F# Data type providers simplify working with JSON and XML. We introduce the necessary aspects of F# type providers along the way. The examples in this section also illustrate the key design principles of the shape inference algorithm:

• The mechanism is predictable (§6.5). The user directly works with the provided types and should understand why a specific type was produced from a given sample.
• The type providers prefer F# object types with properties. This allows extensible (open-world) data formats (§2.2) and it interacts well with developer tooling (§2.1).
• The above makes our techniques applicable to any language with nominal object types (e.g. variations of Java or C# with a type provider mechanism added).
• Finally, we handle practical concerns including support for different numerical types, null and missing data.

The supplementary screencast provides further illustration of the practical developer experience using F# Data.

2.1 Working with JSON documents

The JSON format is a popular data exchange format based on JavaScript data structures. The following is the definition of JsonValue used earlier (§1) to represent JSON data:

    type JsonValue =
      | Number of float
      | Boolean of bool
      | String of string
      | Record of Map<string, JsonValue>
      | Array of JsonValue[]
      | Null

The earlier example used only a nested record containing a number. To demonstrate other aspects of the JSON type provider, we look at an example that also involves an array:

    [ { "name":"Jan", "age":25 },
      { "name":"Tomas" },
      { "name":"Alexander", "age":3.5 } ]

The standard way to print the names and ages would be to pattern match on the parsed JsonValue, check that the top-level node is an Array and iterate over the elements, checking that each element is a Record with certain properties. We would throw an exception for values of an incorrect shape. As before, the code would specify field names as strings, which is error prone and cannot be statically checked.

Assuming people.json is the above example and data is a string containing JSON of the same shape, we can write:

    type People = JsonProvider<"people.json">

    for item in People.Parse(data) do
      printf "%s " item.Name

We now use a local file as a sample for the type inference, but then process data from another source. The code achieves a similar simplicity as when using dynamically typed languages, but it is statically type-checked.

Type providers. The notation JsonProvider<"people.json"> passes a static parameter to the type provider. Static parameters are resolved at compile-time and have to be constant. The provider analyzes the sample and provides a type People. F# editors also execute the type provider at development-time and use the provided types for auto-completion on "." and for background type-checking.

The JsonProvider uses a shape inference algorithm and provides the following F# types for the sample:

    type Entity =
      member Name : string
      member Age : option<float>

    type People =
      member GetSample : unit -> Entity[]
      member Parse : string -> Entity[]

The type Entity represents the person. The field Name is available for all sample values and is inferred as string. The field Age is marked as optional, because the value is missing in one sample. In F#, we use Option.iter to call the specified function (printing) only when an optional value is available. The two age values are an integer 25 and a float 3.5 and so the common inferred type is float. The names of the properties are normalized to follow standard F# naming conventions as discussed later (§6.3).

The type People has two methods for reading data. GetSample parses the sample used for the inference and Parse parses a JSON string. This lets us read data at runtime, provided that it has the same shape as the static sample.

Error handling. In addition to the structure of the types, the type provider also specifies the code of operations such as item.Name. The runtime behaviour is the same as in the earlier hand-written sample (§1): a member access throws an exception if data does not have the expected shape.

Informally, the safety property (§5) states that if the inputs are compatible with one of the static samples (i.e. the samples are representative), then no exceptions will occur. In other words, we cannot avoid all failures, but we can prevent some. Moreover, if http://openweathermap.org changes the shape of the response, the code in §1 will not re-compile and the developer knows that the code needs to be corrected.

Objects with properties. The sample code is easy to write thanks to the fact that most F# editors provide auto-completion when "." is typed (see the supplementary screencast). The developer does not need to examine the sample JSON file to see what fields are available. To support this scenario, our type providers map the inferred shapes to F# objects with (possibly optional) properties.
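To make the mapping from shapes to objects with optional properties concrete, the following is a minimal hand-written sketch of what code with the behaviour of the provided Entity type might look like. It is not the code that F# Data generates: the wrapper class, the parsePeople helper and the in-memory people value are hypothetical names introduced only for this illustration, and JsonValue is redeclared locally from the definition in §2.1 so that the sketch needs no library.

```fsharp
// JsonValue as defined in §2.1, redeclared so the sketch is self-contained.
type JsonValue =
  | Number of float
  | Boolean of bool
  | String of string
  | Record of Map<string, JsonValue>
  | Array of JsonValue[]
  | Null

// Hypothetical stand-in for the provided Entity type: a required Name
// property and an optional Age property over an underlying JSON value.
type Entity(value: JsonValue) =
  // Look up a field of the underlying record; fail on any other shape.
  let field name =
    match value with
    | Record fields -> Map.tryFind name fields
    | _ -> failwith "Incorrect format: expected a record"
  member this.Name : string =
    match field "name" with
    | Some (String s) -> s
    | _ -> failwith "Incorrect format: expected a string field 'name'"
  member this.Age : float option =
    match field "age" with
    | Some (Number n) -> Some n
    | None | Some Null -> None
    | _ -> failwith "Incorrect format: expected a numeric field 'age'"

// Hypothetical counterpart of People.Parse, working on an already parsed value.
let parsePeople (doc: JsonValue) : Entity[] =
  match doc with
  | Array items -> [| for item in items -> Entity item |]
  | _ -> failwith "Incorrect format: expected an array"

// The people.json sample from §2.1, written out as a JsonValue.
let people =
  Array [|
    Record (Map.ofList [ ("name", String "Jan"); ("age", Number 25.0) ])
    Record (Map.ofList [ ("name", String "Tomas") ])
    Record (Map.ofList [ ("name", String "Alexander"); ("age", Number 3.5) ]) |]

// Print each name, and the age only when it is present.
for item in parsePeople people do
  printf "%s " item.Name
  Option.iter (printf "(%f) ") item.Age
printfn ""
```

As in the hand-written code of §1, a member access in this sketch throws when the data does not have the expected shape; the difference is that field names appear once, inside the wrapper, rather than at every use site.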

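The inference outcome described in §2.1, where an integer 25 and a float 3.5 yield float and a field missing from one sample becomes optional, can be illustrated with a small sketch. This is a deliberately simplified approximation written for this overview, not the shape inference algorithm of §3: the Shape type, the optional and common functions and the fallback to Any are all assumptions made for illustration.

```fsharp
// Hypothetical, simplified shapes; the formal model in the paper is richer.
type Shape =
  | Int
  | Float
  | Text
  | Optional of Shape
  | RecordShape of Map<string, Shape>
  | Any

// Mark a shape as optional, without nesting Optional or wrapping Any.
let optional (shape: Shape) : Shape =
  match shape with
  | Optional _ | Any -> shape
  | _ -> Optional shape

// Find a shape that both arguments can be read as.
let rec common (a: Shape) (b: Shape) : Shape =
  match a, b with
  | Int, Int -> Int
  | Float, Float -> Float
  | Text, Text -> Text
  | Int, Float | Float, Int -> Float
  | Optional x, Optional y -> optional (common x y)
  | Optional x, y | y, Optional x -> optional (common x y)
  | RecordShape left, RecordShape right ->
      // Fields on both sides are merged; fields on one side become optional.
      let mergeLeft acc name s1 =
        match Map.tryFind name right with
        | Some s2 -> Map.add name (common s1 s2) acc
        | None -> Map.add name (optional s1) acc
      let addRightOnly acc name s2 =
        if Map.containsKey name left then acc
        else Map.add name (optional s2) acc
      let merged = Map.fold mergeLeft Map.empty left
      RecordShape (Map.fold addRightOnly merged right)
  | _ -> Any

// Shapes of the three records in the people.json sample.
let jan = RecordShape (Map.ofList [ ("name", Text); ("age", Int) ])
let tomas = RecordShape (Map.ofList [ ("name", Text) ])
let alexander = RecordShape (Map.ofList [ ("name", Text); ("age", Float) ])

// Combine the element shapes into one shape for all array elements.
let entity = List.reduce common [ jan; tomas; alexander ]
printfn "%A" entity
```

Under these assumptions the combined shape has a name field of shape Text and an age field of shape Optional Float, which corresponds to the Name : string and Age : option<float> members shown for Entity.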