Nosql at Work with JCR and Apache Jackrabbit
Total Page:16
File Type:pdf, Size:1020Kb
NoSQL at work with JCR and Apache Jackrabbit Carsten Ziegeler, Adobe [email protected], Nov 2011 Vancouver About • Member of the ASF – Sling, Felix, Portals, Incubator – PMC: Felix, Portals, Incubator, Sling (Chair) • RnD Team at Adobe Research Switzerland • Arcle/Book Author • Technical Reviewer • JSR 286 Spec Group (Portlet API 2.0) 2 Movaon • Tried and trusted NoSQL soluon • Standard Java API – First spec released in May 2005 – Various implementaons, products, and soluons – Open Source implementaon since 2006 • Think about your data use cases / problems – JCR might help! 3 Consider JCR • Data structure • Supporng the web • ACID • Security • Addional features 4 The Structure of Data • A data storage should be flexible and • Allow to model app data in the “right” way – Opmal way of dealing with the data in the app 5 The Structure of Data • A data storage should be flexible and • Allow to model data in the “right” way • What is the “right” way? – Tables? – Key‐Value‐Pairs? – Schema based? – Semi structured or even unstructured? – Flat, hierarchical or graph? 6 The Structure of Data • The right way depends on the applicaon: – Tables – Key‐Value‐Pairs – Schema based – Semi structured and unstructured – Flat, hierarchical, and graph – … • An app might have more than one “right” way • But: A lot of data can be modeled in a hierarchy 7 Sample: Product Catalog Books DVDs English English Ficon IT SF Douglas Adams Databases 2001 2010 A Hitch.. Apache Jackrabbit 8 Java Content Repository • Hierarchical content – Nodes with properes – (Table is a special tree) • Structured – Nodetypes with typed properes • And/or semi structured and unstructured • Fine and coarse‐grained • Single repository for all content! 9 Sample: Product Catalog Databases Apache Jackrabbit tle authors ISBN cover 10 Sample: Product Catalog Databases Apache Jackrabbit tle authors ISBN cover Defined by node type 11 Sample: Product Catalog Databases Apache Jackrabbit tle authors ISBN cover User Images Defined by node type img1 img2 12 JSR 170 / JSR 283: Content Repository for JavaTM technology API • (Java) Standard – Version 1.0 and 2.0 – Supported by many vendors – Used by many products and projects – Several open source soluons • Data model and features • Connecng to a CR • Working / Interacon with a CR 13 14 Apache Jackrabbit – JSR 170 and 283 reference implementation – Apache TLP since 2006 – Vital community – Frequent Releases • 1.6.5 (JSR 170 based, EOL) • 2.2.9 (JSR 283) • (dev branch 2.3.2) – Components • Commons, API • RMI, WebDAV, WebApp, JCA • OCM http://jackrabbit.apache.org/ • And more... 14 Data and the Web? • The web is hierarchical by nature • Web applicaons provide data in different ways – HTML – JSON • Provide your data in a RESTful way – hp://…/products/books/english/it/databases/apachejackrabbit.(html|json) • Avoid mapping/conversion – hp://…/products.jsp?id=5643564 15 Resource Oriented Architecture I • Every piece of informaon is a resource – News entry, book, book tle, book cover image – Descripve URI • Stateless web architecture (REST) – Request contains all relevant informaon – Targets the resource • Leverage HTTP – GET for rendering, POST for operaons 16 Resource Oriented Architecture II • JCR and Apache Jackrabbit are a perfect match – Hierarchical – From a single piece of informaon to binaries • Elegant way to bring data to the web • Apache Sling is (the|one) web framework 17 Sample Applicaon: Slingshot • Digital Asset Management – Hierarchical storage of pictures – Upload Poor man's flickr... – Tagging – Searching – Automac thumbnail generaon • Sample applicaon from Apache Sling 18 Slingshot Content Structure Travel Family Europe Weddings Amsterdam Basel 1997 2007 City Photo Photo Photo Photo 19 Facts About Slingshot • Java web applicaon • Uses Apache Sling as web framework • Content repository managed by Apache Jackrabbit • Interacon through the JCR API 20 Authencaon and Access Control • Apache Jackrabbit supports JAAS – Custom login modules possible • Deny / Allow of privileges on a node – Like read, write, add, delete – Inheritance from parent • Tree allows structuring based on access rights • Access control is done in the data er! 21 Slingshot Content Structure johndoe janedoe public private public private Basel Amsterdam Weddings Photo Photo 1997 Read for everyone, write for owner Write for owner Photo Photo 22 Content Modeling • Create a content hierarchy – Based on how the data is used – Access rights • Node type – Defines the structure of a node (schema) – Leverage exisng types – Start unstructured – Use mixin node types where appropriate (semi structure) 23 Leverage Exisng Node Types nt:hierarchyNode mix:tle nt:folder mix:created nt:file mix:lastModified nt:linkedFile nt:unstructured 24 Modeling: Node Types and Mixins mix:tle mixin - jcr:descripon (string) - jcr:tle(date) slingshot:photo > mix:tle mixin nt:file - slingshot:locaon (string) - slingshot:tags (string[]) slingshot:album > mix:tle mixin nt:folder - slingshot:date (date) sling:Folder slingshot:tag > mix:tle nt:base 25 Namespaces • Namespaces can be used for – Node types – Node and property names • Single namespace per company or app • Reasonably unique namespace prefix • Prefixed names for – Node types (always!) – Structured content • No namespace for unstructured content 26 Content Modeling: Words of Advice • Use an applicaon root node – /my:content or /slingshot – Good for searching, backup, and migraon • Avoid flat hierarchies – User interface complexity • Content‐driven design – Design your content before your applicaon 27 Content Modeling: Words of Advice • Embrace change • Look at exisng node types (JSR 2.0) • Start unstructured • Semi structure through mixin node types • Checkout Apache Jackrabbit wiki and mailing lists – "Davids Model" 28 (Nearly) Everthing is Content • Applicaon domain specific content • Presentaon support (HTML, CSS, JavaScript, Images) • Documentaon, translaons • Server side scripts • Applicaon code • … 29 Content Silos without JCR • Structured content – Usually schema based • Unstructured content • Large data (images, movies etc.) • Different storage for each use case 30 Content Repository: Combined Features • Features of a RDBMS – Structure, integrity, transacons – Queries • Features of a file system – Hierarchy, binaries – Access control, locking • And more good stuff – Observaon, versioning – Unstructured, mul values, sort order 31 Content Repository: Combined Features • Features of a RDBMS – Structure, integrity, transacons – Queries No • Features of a file system content – Hierarchy, binaries silos – Access control, locking • And more good stuff – Observaon, versioning – Unstructured, mul values, sort order 32 The Repository Model • Repository: one (or more) workspaces • Workspace contains a tree of nodes • Node have properes • Nodes provide the content structure – May have children • Actual data is stored as values of properes • Types and namespaces! Implementation of JCR 33 Sample: Product Catalog Databases Apache Jackrabbit tle authors ISBN cover 34 Connecng to the Repository • JCR 2.0 provides RepositoryFactory • Uses Service Provider Mechanism • META‐INF/services/javax.jcr.RepositoryFactory • Just use • RepositoryFactory.getRepository(null) • Or specify connecon parameters • RepositoryFactory.getRepository(Map) • With Apache Sling: OSGi service! 35 Working with the Repository • Interacon is session based – Assemble credenals – Login into workspace Credentials myCredentials = new SimpleCredentials("USERID", "PASSWORD".toCharArray()); Session mySession = repository.login(myCredentials, "WORKSPACE"); 36 Traversing the Content • Traverse the repository – From the root or any node Node rootNode = mySession.getRootNode(); Node albumNode = rootNode.getNode("slingshot/albums/travel"); Node europeNode = albumNode.getNode("Europe"); 37 Retrieve a Property • Various ways to get a property – Different methods for each type – Automac type conversion if possible Property prop = albumNode.getProperty(jcr:description"); Value value = prop.getValue(); String desc = value.getString(); value.getBoolean(); value.getStream(); value.getLong(); value.getDate(); value.getDouble(); 38 Changing Content • Change everything you want – Add/Remove nodes – Add/Remove/Change properes – Transient space – Then save // change albumNode.addNode("newAlbum"); europeNode.setProperty(“jcr:description", "something"); // save... mySession.save(); // ... or revert all changes mySession.refresh(false); 39 Interacon Summary • Get the repository • Login to a workspace – Provides a session • Use the session to – Access nodes and their properes – Navigaon/traversal or query – CRUD and save changes • Close the session 40 Advanced Development • Apache Sling – REST based web framework – Powered by OSGi – Scripng Inside • Apache Jackrabbit OCM – Map content to Java objects and vice versa – Similar to database ORMs 41 Observaon • Enables applicaons to register interest in changes • Monitor changes • Act on changes API: ObservaonManager: addEventListener(EventListener listener, int eventTypes, java.lang.String absPath, boolean isDeep, java.lang.String[] uuid, java.lang.String[] nodeTypeName, boolean noLocal) Advanced JCR Features 42 Event Types • Six different events – Node added – Node removed – Node moved – Property added – Property changed – Property removed Advanced JCR Features 43 Event Listeners • Registered with a session • Registraon with oponal filters – Like node types, paths, event types • Receive events for every change – Set of changes API: public void onEvent(EventIterator events);