Connecting LuaTeX to MongoDB (ArsTeXnica, Numero 26, 2018)
Roberto Giacomelli
ArsTeXnica Nº 26, October 2018

Abstract

This paper describes in detail a way to connect LuaTeX to a MongoDB database server. As a consequence, the Lua-powered typesetting engine becomes a candidate, along with other tools, for generating beautiful reports.

From small local projects to large web applications, MongoDB, the popular NoSQL database, creates a new point of view on datasets that is also found in LuaTeX. This could extend the development perspectives.

1 Reasons

I'm currently working on a project based on LuajitLaTeX, with SQLite3 (https://www.sqlite.org) as the database handler, to typeset a large number of high-quality reports. I described it in my article (GIACOMELLI, 2017).

LuajitTeX includes the LuaJIT FFI (Foreign Function Interface), a powerful technology for binding C executable modules such as the dynamically linked SQLite3 library. Not only does LuajitTeX become capable of collecting data by running SQL queries, but it also makes configuring the system easier.

Why did I discard simpler data management software such as spreadsheets or plain text files, and why didn't I ensure a full separation between TeX and the data layer, for example through an agnostic file format?

First of all, the main goal of this architecture is to ensure data safety: that is why I chose an RDBMS (Relational Database Management System). Every record saved in the archive is checked by the database engine itself, which controls the integrity of the relational model through a handful of concepts such as primary and foreign key constraints, the not null clause and so on.

Secondly, avoiding an intermediate layer between data and TeX speeds up modifications.

I have frequently adapted the project's entity-relationship diagrams whenever improvements were needed (e.g., the addition of a complex set of economic information). So I learned a lot about how to translate real-world entities into conceptual SQL models, trying to keep minimalism in mind as much as possible.

Changes were easy to apply but, after six years of development, things are no longer so easy to keep up to date, despite the success of the project. After all, the world is constantly changing. I also consider it more important to find database systems that allow a better data representation than to increase data bandwidth as fast as the hardware potentially allows. In other words, I'm looking for more natural criteria to model a reality in rapid and unpredictable evolution.

When I talk about a project I mean a production business where data are shared over the IT network infrastructure and automatically processed to produce reports. Just think of a freelance engineer or a small design team dealing with invoices for technical documents as part of their customer service, or a company delivering financial statements to its clients.

1.1 What I'll talk about

This paper gives a brief description of the unconventional NoSQL way of thinking in section 2, where I also introduce the MongoDB database system and fundamental concepts such as the document.

Section 3 and the following two are dedicated to a detailed step-by-step tutorial valid for the Ubuntu and Windows operating systems (section 4 and section 5 respectively). They provide instructions to set up a local MongoDB server for experimenting and a suitable MongoDB connector for LuaTeX.

Finally, from section 7 on I will discuss two simple and meaningful demo projects, useful for practicing project development.

1.2 Verbatim conventions

When a shell command doesn't fit in the column width, an ending backslash \ is added to split the text into several lines.

While experimenting, you can improve readability too by splitting lines in your command window according to your operating system's syntax. For instance, Windows PowerShell supports the Shift+Enter key combination for multiline editing, or a backtick ` (typed as ALT+0096 on Windows machines) at the end of each broken line. Beware: PowerShell inserts a space for each newline you enter, so you won't be able to split a file path, while Linux or Mac OSX users can safely use a backslash \ to break up even command names.

1.3 Requirements

You need a desktop computer running a 64-bit operating system in order to execute the demo project programs illustrated from section 7 on, which run database operations locally and typeset documents for reports. A basic understanding of Lua is also recommended in order to run those source files with LuaTeX knowingly.

2 NoSQL, Not only SQL

The NoSQL philosophy rejects the SQL model: no fixed schemas and no relational constraints are imposed on data. By leaving out those essential checks, web applications can take advantage of NoSQL in two main areas: a great capability to scale out (that is, partitioning data across several servers, optimizing costs and performance) and a flexible design of data models. In short, a powerful and easy-to-use storage engine.

The heart of the SQL system is the table: a tabular dataset organized in rows and columns that describes and represents entities. Improvements require reformulating the whole dataset archived in the table. This is what we call a fixed schema: it is simply not possible to write data that don't fit the current schema.

Quite the opposite holds for NoSQL databases: the information model can be changed if needed, whatever and wherever.

Google, Amazon and other big Internet companies are developing their own NoSQL data storage systems. In this article, I'm going to consider the open source project MongoDB, located at www.mongodb.com.

2.1 The MongoDB document

A MongoDB document is an ordered list of key/value pairs. As a matter of fact, not only can the database archive data in binary format, but it can also understand the structure of the key/value data type for querying.

A document looks like this:

    {"name":"Missouri River", "length km":3768}

In the absence of a fixed and predefined SQL-like schema, MongoDB developers usually visualize how information is structured simply by printing documents. Thus documents literally represent themselves.

In the document above, delimited by braces, there are two fields separated by a comma. Each field has a key and a value separated by a colon. Keys are strings while values are, respectively, a string and an integer. Values are not limited to atomic types like those just considered: they can also be compound types such as arrays (lists of comma-separated values enclosed in square brackets) or even other documents. For example, we can add geospatial information as an array of coordinates locating the river source:

    {
      "name" : "Missouri River",
      "length km" : 3768,
      "source coords" : [45.9275, -111.508056]
    }

That notation, pretty much the same as the table constructor in Lua, corresponds exactly to the JavaScript object type and highlights the strong inclination of MongoDB towards web development.

2.2 Collection and database

A MongoDB collection is a set of documents and, in turn, a database is a set of collections.

Collections are referenced by name, either as a single UTF-8 string or as a sequence of strings concatenated with the dot character. The latter is the recommended way to organize groups of documents with namespaces. For instance, the previous document with essential data about rivers could be part of the info.basic collection. Please note that in this case no collection belongs to another one: the dot is only an optional naming rule to identify a single collection.

There are no constraints defined on the document structure except on the field _id. If the document does not have an _id field, MongoDB will attach one to it with a default value of type ObjectId; otherwise it will check that the _id value, which can be of any type, is unique within the collection.

2.3 The JSON and BSON formats

JSON (JavaScript Object Notation) is a text-based open standard format for data interchange. It is built upon a key/value structure called object and a list called array. Value types belong to a list of seven, including string and number but also object and array themselves.

The JSON specification takes only a single web page, see https://www.json.org/, to be completely defined. As the website says, JSON is easy for humans to read and write, and it is also easy for machines to parse and generate.

JSON is not alone in the arena of structured text formats; competitors are TOML (Tom's Obvious, Minimal Language, https://github.com/toml-lang/toml), YAML (YAML Ain't Markup Language, http://yaml.org/), XML, RON and many others, but JSON is very popular on the modern web.

The data transfer over the network is based on the serialization/deserialization process where text is encoded into an efficient binary format.

• an independent execution of CLI tools like the mongo shell itself, using files as the communication channel with LuaTeX (see section 10).
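
As a minimal sketch of the text side of that serialization step, the river document of section 2.1 can be written as a Lua table and turned into JSON text. The names encode, is_array and river are hypothetical and introduced here only for illustration; they are not part of LuaTeX or of any MongoDB connector. Lua does not keep the order of string keys, so the sketch sorts them to get a repeatable result, whereas a real MongoDB document keeps its own field order. Only strings, numbers, booleans, arrays and nested documents are handled, and the binary encoding is not covered.

```lua
-- Hypothetical helper: encode a Lua value as JSON text.
-- Assumption: a table is either a proper sequence (array) or has string keys only.

-- A non-empty table whose keys are exactly 1..#t is treated as an array.
local function is_array(t)
  local count = 0
  for _ in pairs(t) do count = count + 1 end
  return count > 0 and count == #t
end

local function encode(value)
  local kind = type(value)
  if kind == "string" then
    -- Escape only backslashes and double quotes; control characters are not handled.
    local escaped = value:gsub('[\\"]', '\\%0')
    return '"' .. escaped .. '"'
  elseif kind == "number" or kind == "boolean" then
    return tostring(value)
  elseif kind == "table" then
    local parts = {}
    if is_array(value) then
      for i = 1, #value do
        parts[i] = encode(value[i])
      end
      return "[" .. table.concat(parts, ",") .. "]"
    end
    -- Sort the keys so that the output is repeatable.
    local keys = {}
    for key in pairs(value) do keys[#keys + 1] = key end
    table.sort(keys)
    for i, key in ipairs(keys) do
      parts[i] = encode(key) .. ":" .. encode(value[key])
    end
    return "{" .. table.concat(parts, ",") .. "}"
  end
  error("unsupported type: " .. kind)
end

-- The river document of section 2.1, written with the Lua table constructor.
local river = {
  name = "Missouri River",
  ["length km"] = 3768,
  ["source coords"] = { 45.9275, -111.508056 },
}

print(encode(river))
-- {"length km":3768,"name":"Missouri River","source coords":[45.9275,-111.508056]}
```

The table constructor mirrors the document notation almost field by field, which is the similarity with Lua noted in section 2.1. A production setup would more likely rely on an existing JSON library than on a hand-written encoder like this one.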