Search Developer's Guide (PDF)

MarkLogic Server
Search Developer’s Guide

MarkLogic 10
May, 2019
Last Revised: 10.0-7, June, 2021
Copyright © 2021 MarkLogic Corporation. All rights reserved.

Table of Contents

1.0 Developing Search Applications in MarkLogic Server
    1.1 Overview of Search Features in MarkLogic Server
        1.1.1 High Performance Full Text Search
        1.1.2 APIs for Multiple Programming Languages
        1.1.3 Support for Multiple Query Styles
        1.1.4 Support for Multiple Query Types
        1.1.5 Full XPath Search Support in XQuery
        1.1.6 Lexicon and Range Index-Based APIs
        1.1.7 Stemming, Wildcard, Spelling, and Much More Functionality
        1.1.8 Alerting API and Built-Ins
    1.2 Where to Find Search Information

2.0 Search API: Understanding and Using
    2.1 Understanding the Search API
        2.1.1 Making the Search API Available to Your Application
        2.1.2 Simple search:search Example and Response Output
        2.1.3 Automatic Query Text Parsing and Grammar
        2.1.4 Constrained Searches and Faceted Navigation
        2.1.5 Built-In Snippetting
        2.1.6 Search Term Completion
        2.1.7 Search Customization Via Options and Extensions
        2.1.8 Speed and Accuracy
    2.2 Controlling a Search With Query Options
    2.3 Search Term Completion Using search:suggest
        2.3.1 default-suggestion-source Option
        2.3.2 Choose Suggestions With the suggestion-source Option
        2.3.3 Use Multiple Query Text Inputs to search:suggest
        2.3.4 Make Suggestions Based on Cursor Position
        2.3.5 search:suggest Examples
    2.4 Creating a Custom Constraint
        2.4.1 Implementing the parse Function
        2.4.2 Implementing the start-facet Function
        2.4.3 Implementing the finish-facet Function
        2.4.4 Example: Creating a Simple Custom Constraint
        2.4.5 Example: Creating a Custom Constraint for Structured Queries
        2.4.6 Example: Creating a Custom Constraint Geospatial Facet
    2.5 Search Grammar
    2.6 Returning Lexicon Values With search:values
        2.6.1 Specifying the Input Lexicons
        2.6.2 Constraining and Filtering Your Results
        2.6.3 Example: Using a Query to Constrain Results
        2.6.4 Example: Filtering with Starting Value, Limit, and Page Length
        2.6.5 Example: Finding Value Co-Occurrences
        2.6.6 Additional Interfaces
    2.7 JSON Support in the Search API
    2.8 More Search API Examples
        2.8.1 Buckets Example
        2.8.2 Computed Buckets Example
        2.8.3 Sort Order Example

3.0 Searching Using String Queries
    3.1 String Query Overview
    3.2 The Default String Query Grammar
        3.2.1 Query Components and Operators
        3.2.2 Operator Precedence
        3.2.3 Using Relational Operators on Constraints
        3.2.4 String Query Examples

4.0 Searching Using Structured Queries
    4.1 Structured Query Overview
    4.2 Structured Query Concepts
        4.2.1 Major Query Categories
        4.2.2 Understanding the Difference Between Term and Word Queries
        4.2.3 Understanding Containment
        4.2.4 Text Match Semantics
        4.2.5 Structured Query Sub-Query Taxonomy
    4.3 Constructing a Structured Query
    4.4 Syntax Summary
    4.5 Examples of Structured Queries
        4.5.1 Example: Simple Structured Search
        4.5.2 Example: Structured Search With Constraint References as Text
        4.5.3 Example: Structured Search With Constraint References
        4.5.4 Example: Structured Search on Key-Value Metadata Fields
    4.6 Syntax Reference
        4.6.1 query
        4.6.2 term-query
        4.6.3 and-query
        4.6.4 or-query
        4.6.5 and-not-query
        4.6.6 not-query
        4.6.7 not-in-query
        4.6.8 true-query
        4.6.9 false-query
        4.6.10 near-query
        4.6.11 boost-query
        4.6.12 properties-fragment-query
        4.6.13 directory-query
        4.6.14 collection-query
        4.6.15 container-query
        4.6.16 document-query
        4.6.17 document-fragment-query