Gerard Lemson, Alex Szalay, Mike Rippin
DIBBS/SciServer: Collaborative Data-Driven Science

  • Started with the SDSS SkyServer
  • Built in a few months in 2001
  • Goal: instant access to rich content
  • Idea: bring the analysis to the data
  • Interactive access at the core
  • Much of the scientific process is about data
    ◦ Data collection, data cleaning, data archiving, data organization, data publishing, mirroring, data distribution, data analytics, data curation…

Form Based Queries
Image Access
Custom SQL
Batch Queries, MyDB
Cosmological Simulations
Turbulence Database
Web Service Access through Python

  • Interactive science on petascale data
  • Sustain and enhance our astronomy effort
  • Create scalable open numerical laboratories
  • Scale system to many petabytes
  • Deep integration with the “Long Tail”
  • Large footprint across many disciplines
    ◦ Also: Genomics, Oceanography, Materials Science
  • Use commonly shared building blocks
  • Major national and international impact

  • Offer more computing resources server side
  • Augment and combine SQL queries with easy-to-use scripting tools
  • Heavy use of virtual machines
  • Interactive portal via iPython/Matlab/R
  • Batch jobs
  • Enhanced visualization tools

  • CasJobs
    ◦ SQL, MyDB, batch
    ◦ FileDB: Raw data access from within RDB
  • SciDrive
    ◦ Dropbox-like, on-drop event handling
  • SciServer/compute
    ◦ Interactive/batch Python, R, Matlab in Docker container
  • MyScratch (File & DB)
  • SSO on all components
  • All published through REST

[Architecture diagram. Component labels: Login Portal, SkyServer, SciDrive, SciScript, MyScratch Files, MyScratch DB, OpenStack Keystone & Swift, REST API, Web UI, CasJobs UI Client, CasJobs REST API, CasJobs Job Scheduler, BatchAdmin Service, WS Client, SkyQuery, SkyQuery Scheduler, Parallel X-Match Engine, SkyNode, Registry, MyDB Servers, SDSS DB Servers, Misc. DB Servers, Linked Server Connections. Data set labels: SDSS DR1 through DR10, Turbulence, Cosmology, GLUSEEN, USNOB, IRAS, FIRST, ROSAT, 2MASS, Galex, WISE.]

  • Jupyter Notebooks in Docker
    ◦ http://www.nature.com/news/interactive-notebooks-sharing-the-code-1.16261
    ◦ https://developer.rackspace.com/blog/how-did-we-serve-more-than-20000-ipython-notebooks-for-nature/
  • Python, R, Matlab
  • Flexible way to attach data sets in volume containers
  • Extended to batch jobs

Astronomy
Materials Science
Turbulence
Genomics

I’ll be very happy to demo and discuss our services.
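As an illustration of the "Web Service Access through Python" and "All published through REST" points, here is a minimal sketch of how a script might package a SQL query for a REST endpoint. Everything service-specific in it is an assumption made for illustration and is not taken from the slides: the base URL, the path layout, the JSON field name `Query`, and the use of an `X-Auth-Token` header for single sign-on. The request is only built, never sent.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical base URL (assumption); replace with the real service address.
BASE_URL = "https://example.org/api"


def build_query_request(sql, context, token=None):
    """Build (but do not send) a POST request that carries a SQL query.

    The path layout, the "Query" field and the token header are
    illustrative assumptions, not a documented interface.
    """
    url = f"{BASE_URL}/contexts/{urllib.parse.quote(context)}/query"
    body = json.dumps({"Query": sql}).encode("utf-8")
    headers = {"Content-Type": "application/json", "Accept": "application/json"}
    if token is not None:
        # Assumed header name for passing a single sign-on token.
        headers["X-Auth-Token"] = token
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Example query text and context name are placeholders for illustration.
request = build_query_request(
    "SELECT TOP 10 ra, dec FROM PhotoObj", "DR10", token="demo-token"
)
print(request.get_method(), request.full_url)
```

Sending the request would be a matter of passing it to `urllib.request.urlopen` once the real endpoint and authentication scheme are known.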
