NoSQL Databases in Archaeology – a Funerary Case Study

Rens W. Cassée, S1228226
Thesis MSc Digital Archaeology, 1044CS05H-1718ARCH
Supervisor: Dr K. Lambers
Master Digital Archaeology and Archaeology of the Near East
Leiden University, Faculty of Archaeology
Leiden, 15-12-2017. Final version

Content

1. Introduction
2. Case study
2.1. Introduction
2.2. The Pre-Pottery Neolithic B
2.2.1. Funerary rites in the PPNB
2.2.2. The Pre-Pottery Neolithic B dataset
2.3. Funerary data
2.3.1. Excavation result
2.3.2. Osteoarchaeology
2.3.3. Literary & museum studies
3. Database management systems
3.1. Introduction
3.2. General history
3.3. DBMS in Archaeology
3.4. The Relational Database
3.4.1. Normalisation and consistency
3.4.2. Advantages and disadvantages
4. NoSQL
4.1. Introduction
4.2. Consistency in NoSQL databases
4.2.1. Master-Slave and Multi-Master replication
4.2.2. BASE
4.2.3. CAP-theorem
4.2.4. Aggregate
4.3. NoSQL database management systems
4.3.1. Document store
4.3.2. Graph store
4.3.3. RDF Triple-stores
4.3.4. Column store
4.3.5. Key-value store
4.4. Differences between RDBMS and NoSQL
5. RDBMS results
5.1. Building the database
5.2. Creating the queries
5.3. Query results and file durability
6. NoSQL results
6.1. Document store
6.1.1. Building the database
6.1.2. Creating the queries
6.1.3. Query results and file durability
6.2. Graph store
6.2.1. Building the database
6.2.2. Creating the queries
6.2.3. Query results and file durability
7. Discussion of the results
7.1. Comparison
7.1.1. Building the database
7.1.2. Creating the queries
7.1.3. Discussing the results
7.1.4. Durability
7.1.5. NoSQL vs Relational
7.2. Case study results
7.2.1. Answering the questions
8. Conclusion
Abstract
Samenvatting (Dutch)
Bibliography
Literature
Web pages
List of figures, tables, and appendices
Figures
Tables
Appendices

1. Introduction

This thesis explores the possible use of NoSQL databases in archaeology or, more accurately, the use of NoSQL databases in the field of funerary archaeology. The collection, analysis and storage of data are essential to any research area. However, the field of archaeology has a special relationship with its data. Archaeology tends to destroy its primary source of information: the archaeological record. Therefore, data observation, collection and preservation are more important in archaeology than in scientific fields in which experiments can be repeated.

People have developed a wide variety of hardware and software to accommodate the observation, collection and preservation of research data, some of it designed specifically for archaeology (Corsi et al. 2013, 120-127; Sobotkova et al. 2013, 80-88; Smith et al. 2013, 89-99). Software programs that automate the storage, manipulation and retrieval of structured bodies of information are called Database Management Systems (DBMS hereafter) (Auer & Kroenke 2011, 9-11; Elmasri & Navathe 2007, 4-6; Lock 2003, 89). Nowadays, archaeologists mostly use the relational DBMS (hereafter RDBMS). This database type rose to dominance in the 1980s and has held that position ever since (Elmasri & Navathe 2007, 56; Ryan 2004).

As new techniques and applications were developed, archaeologists were eager to adopt them in their research. These new technologies and applications resulted in a wider variety of data types. Instead of having only numerical and textual data, archaeologists must deal with digital photographs, 3D models, and georeferenced data. This creates an
