Concepts Guide
MarkLogic Server Concepts Guide
MarkLogic 10
May, 2019
Last Revised: 10.0, May, 2019
Copyright © 2019 MarkLogic Corporation. All rights reserved.

Table of Contents

1.0 Overview of MarkLogic Server ... 5
  1.1 Relational Data Model vs. Document Data Model ... 5
  1.2 XML Schemas ... 6
  1.3 High Performance Transactional Database ... 7
  1.4 Rich Search Features ... 7
  1.5 Text and Structure Indexes ... 7
  1.6 Semantics Support ... 8
  1.7 Binary Document Support ... 9
  1.8 MarkLogic APIs and Communication Protocols ... 9
  1.9 High Performance ... 10
  1.10 Clustered ... 11
  1.11 Cloud Capable ... 11
2.0 How is MarkLogic Server Used? ... 13
  2.1 Publishing/Media Industry ... 13
  2.2 Government / Public Sector ... 14
  2.3 Financial Services Industry ... 15
  2.4 Healthcare Industry ... 15
  2.5 Other Industries ... 17
3.0 Indexing in MarkLogic ... 18
  3.1 The Universal Index ... 18
    3.1.1 Word Indexing ... 19
    3.1.2 Phrase Indexing ... 20
    3.1.3 Relationship Indexing ... 21
    3.1.4 Value Indexing ... 22
    3.1.5 Word and Phrase Indexing ... 22
  3.2 Other Types of Indexes ... 23
    3.2.1 Range Indexing ... 23
      3.2.1.1 Range Queries ... 26
      3.2.1.2 Extracting Values ... 27
      3.2.1.3 Optimized "Order By" ... 27
      3.2.1.4 Using Range Indexes for Joins ... 28
    3.2.2 Word Lexicons ... 29
    3.2.3 Reverse Indexing ... 30
      3.2.3.1 Reverse Query Constructor ... 30
      3.2.3.2 Reverse Query Use Cases ... 31
      3.2.3.3 A Reverse Query Carpool Match ... 31
      3.2.3.4 The Reverse Index ... 33
      3.2.3.5 Range Queries in Reverse Indexes ... 35
    3.2.4 Triple Index ... 36
      3.2.4.1 Triple Index Basics ... 36
      3.2.4.2 Triple Data and Value Caches ... 37
      3.2.4.3 Triple Values and Type Information ... 38
      3.2.4.4 Triple Positions ... 38
      3.2.4.5 Index Files ... 39
      3.2.4.6 Permutations ... 39
  3.3 Index Size ... 39
  3.4 Fields ... 40
  3.5 Reindexing ... 40
  3.6 Relevance ... 40
  3.7 Indexing Document Metadata ... 40
    3.7.1 Collection Indexes ... 41
    3.7.2 Directory Indexes ... 41
    3.7.3 Security Indexes ... 41
    3.7.4 Properties Indexes ... 41
  3.8 Fragmentation of XML Documents ... 42
4.0 Data Management ... 43
  4.1 What's on Disk ... 43
    4.1.1 Databases, Forests, and Stands ... 44
    4.1.2 Tiered Storage ... 44
    4.1.3 Super Databases and Super Clusters ... 45
    4.1.4 Partitions, Partition Keys, and Partition Ranges ... 48
  4.2 Ingesting Data ... 50
  4.3 Modifying Data ... 52
  4.4 Multi-Version Concurrency Control ... 52
  4.5 Point-in-time Queries ... 53
  4.6 Locking ... 53
  4.7 Updates ... 53
  4.8 Isolating an update ... 54
  4.9 Documents are Like Rows ... 54
  4.10 MarkLogic Data Loading Mechanisms ... 55
  4.11 Content Processing Framework (CPF) ... 56
  4.12 Organizing Documents ... 57
    4.12.1 Directories ... 57
    4.12.2 Collections ... 57
    4.12.3 Unprotected Collections ... 58
    4.12.4 Protected Collections ... 58
  4.13 Database Rebalancing ... 58
  4.14 Bitemporal Documents ... 60
    4.14.1 Bitemporal Data Management ... 60
    4.14.2 Bitemporal Queries ... 61
  4.15 Managing Semantic Triples ... 61
5.0 Searching in MarkLogic Server ... 62
  5.1 High Performance Full Text Search ... 62