Data Management Organization Charter

Large Synoptic Survey Telescope (LSST)
Database Design

Jacek Becla, Daniel Wang, Serge Monkewitz, K-T Lim, Douglas Smith, Bill Chickering

LDM-135
08/02/2013

Change Record

  • Version 1.0, 6/15/2009
    Description: Initial version
    Revision Author: Jacek Becla
  • Version 2.0, 7/12/2011
    Description: Most sections rewritten, added scalability test section
    Revision Author: Jacek Becla
  • Version 2.1, 8/12/2011
    Description: Refreshed future-plans and schedule of testing sections, added section about fault tolerance
    Revision Author: Jacek Becla, Daniel Wang
  • Version 3.0, 8/2/2013
    Description: Synchronized with latest changes to the requirements (LSE-163). Rewrote most of the “Implementation” chapter. Documented new tests, refreshed all other chapters.
    Revision Author: Jacek Becla, Daniel Wang, Serge Monkewitz, Kian-Tat Lim, Douglas Smith, Bill Chickering

Table of Contents

1. Executive Summary
2. Introduction
3. Baseline Architecture
  3.1 Alert Production and Up-to-date Catalog
  3.2 Data Release Production
  3.3 User Query Access
    3.3.1 Distributed and parallel
    3.3.2 Shared-nothing
    3.3.3 Indexing
    3.3.4 Shared scanning
    3.3.5 Clustering
    3.3.6 Partitioning
    3.3.7 Technology choice
4. Requirements
  4.1 General Requirements
  4.2 Data Production Related Requirements
  4.3 Query Access Related Requirements
  4.4 Discussion
    4.4.1 Implications
    4.4.2 Query complexity and access patterns
5. Potential Solutions - Research
  5.1 The Research
  5.2 The Results
  5.3 Map/Reduce-based and NoSQL Solutions
  5.4 DBMS Solutions
    5.4.1 Parallel DBMSes
    5.4.2 Object-oriented solutions
    5.4.3 Row-based vs columnar stores
    5.4.4 Appliances
  5.5 Comparison and Discussion
6. Design Trade-offs
  6.1 Standalone Tests
    6.1.1 Spatial join performance
    6.1.2 Building sub-partitions
    6.1.3 Sub-partition overhead
    6.1.4 Avoiding materializing sub-partitions
    6.1.5 Billion row table / reference catalog
    6.1.6 Compression
    6.1.7 Full table scan performance
    6.1.8 Low-volume queries
    6.1.9 Solid state disks
  6.2 Data Challenge Related Tests
    6.2.1 DC1: data ingest
    6.2.2 DC2: source/object association
    6.2.3 DC3: catalog construction
    6.2.4 Winter-2013 Data Challenge: querying database for forced photometry
    6.2.5 Winter-2013 Data Challenge: partitioning 2.6 TB table for Qserv
    6.2.6 Winter-2013 Data Challenge: multi-billion-row table
7. Risk Analysis
  7.1 Potential Key Risks
  7.2 Risks Mitigations
8. Implementation of the Query Service (Qserv) Prototype
  8.1 Components
    8.1.1 MySQL
    8.1.2 XRootD
  8.2 Partitioning
  8.3 Query Generation
    8.3.1 Processing modules
    8.3.2 Processing module overview
  8.4 Dispatch
    8.4.1 Wire protocol
    8.4.2 Frontend
    8.4.3 Worker
  8.5 Threading Model
  8.6 Aggregation
  8.7 Indexing
  8.8 Data Distribution
    8.8.1 Database data distribution
