Joaquim Jornet-Somoza
Max Planck Institute for the Structure and Dynamics of Matter, Hamburg
NOMAD-Tecnalia Workshop, 6 July 2017
NOMAD Scope Big-Data Management

SCOPE – The NOMAD Center of Excellence

The NOMAD CoE develops a Materials Encyclopedia and Big-Data Analytics tools for materials science and engineering. This will be reinforced by advanced graphics and animation tools.

NOMAD goals: develop tools that enable researchers in basic science and engineering to advance materials science, identify new physical phenomena, and help industry improve existing products as well as develop novel products and technologies.

Homepage: https://NOMAD-CoE.eu

The Novel Materials Discovery (NOMAD) Laboratory maintains the largest Repository for input and output files of all important computational materials science codes.

From its open-access data it builds several Big-Data Services helping to advance materials science and engineering:
• Archive
• Encyclopedia
• Analytic Tools
• Advanced Graphics
• HPC Expertise & Hardware

NOMAD structure and use cases

What is NOMAD Lab? https://nomad-coe.eu

NOMAD Repository Organize And Share Materials Data

Contact concerning NOMAD in general: Matthias Scheffler

Data is the raw material for the 21st century.

[Figure: publications in ISI WOK on Density Functional Theory, 1990 to 2015; vertical axis from 0 to 14000.]

The NOMAD Repository accepts (and requests) input and output files of all important codes. Codes listed: BigDFT, CP2K, CPMD, DMol3, Elk, FLEUR, GPAW, MOLCAS, NWChem, ONETEP, ORCA, SIESTA, and TURBOMOLE.

Currently, the NOMAD Repository contains 5,045,064 entries.

• Upload interfaces: Curl, FTP.
• Support for the most important codes in computational materials science.
• Structure calculations in data sets.
• Share privately with collaborators.
• Share anonymously during peer review.
• Open Access Sharing.
• DOI support, to link from publication to data.
• DOI support, to link from data to publication.
• Guaranteed storage for 10 years.

NOMAD Code-Independent Archive Data Representation So That All Data Can Be Used for Analytics

Contact concerning the NOMAD Archive: Fawzi Mohamed and Luca Ghiringhelli

Challenge: Nomenclature, data representation, and file formats of the input and output files of these codes are different.

NOMAD META INFO

The metadata structure is a conceptual model to store results from ab initio and force-field atomistic calculations.
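To make the idea of a code-independent representation concrete, here is a minimal sketch of mapping code-specific output keys onto one shared set of names and units. Everything in it is an assumption made for illustration: the code labels (`code_a`, `code_b`), the native keys, and the common keys are hypothetical and are not taken from the actual NOMAD Meta Info.

```python
# Hypothetical sketch: none of these key names or code labels come from the
# real NOMAD Meta Info. They only illustrate the idea of a common schema.

HARTREE_TO_EV = 27.211386245988   # CODATA 2018 value
RYDBERG_TO_EV = HARTREE_TO_EV / 2

# For each (made-up) code: native key -> (common key, factor to the common unit)
KEY_MAPS = {
    "code_a": {
        "Etot_Ha": ("energy_total_eV", HARTREE_TO_EV),
        "natoms": ("number_of_atoms", 1),
    },
    "code_b": {
        "total_energy_Ry": ("energy_total_eV", RYDBERG_TO_EV),
        "n_at": ("number_of_atoms", 1),
    },
}


def normalize(code_name, raw):
    """Translate one code's parsed output into the common names and units."""
    key_map = KEY_MAPS[code_name]
    normalized = {}
    for native_key, value in raw.items():
        if native_key not in key_map:
            continue  # quantities without a common name are skipped here
        common_key, factor = key_map[native_key]
        normalized[common_key] = value * factor
    return normalized


# The same physical result reported by two codes in different conventions.
entry_a = normalize("code_a", {"Etot_Ha": -1.0, "natoms": 2})
entry_b = normalize("code_b", {"total_energy_Ry": -2.0, "n_at": 2})
```

After normalization both entries use the same keys and units, so they can be compared or fed into the same analysis without knowing which code produced them.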

Selected Open Access files are ready to download.

NOMAD Encyclopedia A web-based public infrastructure

Contact concerning the NOMAD Encyclopedia: Georg Huhs

The NOMAD Encyclopedia allows users to see, compare, explore, and understand computed materials data.

• The NOMAD Encyclopedia provides a materials-oriented view on the Archive data.
• Knowledge of the materials' various properties: structural features, mechanical and thermal behavior, electronic and magnetic properties, the response to light …
• User-friendly graphical user interface (GUI).
• Open API: experts can develop their own software.

https://encyclopedia-doc.nomad-coe.eu

INTERACTIVE STRUCTURE VIEW

STATISTICS OF THE SEARCH

PLOTS OF COMPUTED PROPERTIES

NOMAD Big-Data Analytics Toolkit Identify Correlations and Structure in The Archive Data

Contact concerning Big-Data Analytics: Luca Ghiringhelli

We develop and implement methods that identify correlations and structure in BIG DATA of materials.

[Figure: analytics workflow. Labels: INPUT, OUTPUT, METADATA, STRUCTURE, CLUSTERING, SORT & CLASSIFY, NEURAL NETWORK, RIDGE REGRESSION, LEARNED MODELS, PREDICTIONS.]
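Ridge regression is one of the model types named in the figure. As a rough illustration of what such a learned model does, the sketch below fits a single-feature ridge model in closed form. It is a toy example under stated assumptions: the function name `fit_ridge_1d` and the data are made up for illustration, and this is not the Analytics Toolkit's implementation.

```python
def fit_ridge_1d(xs, ys, lam):
    """Return the weight w that minimizes sum((y - w*x)**2) + lam * w**2.

    Setting the derivative with respect to w to zero gives the closed form
    w = sum(x*y) / (sum(x*x) + lam). There is no intercept term in this sketch.
    """
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    return sum_xy / (sum_xx + lam)


# Made-up data that follow y = 2*x exactly.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w_plain = fit_ridge_1d(xs, ys, lam=0.0)    # ordinary least squares
w_ridge = fit_ridge_1d(xs, ys, lam=10.0)   # penalty shrinks the weight

# Predict for a new input with the regularized model.
prediction = w_ridge * 5.0
```

With `lam=0.0` the fit reduces to ordinary least squares; a larger `lam` pulls the weight toward zero, which trades some accuracy on the training data for more stable predictions.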

ANALYTIC TOOLKIT FEATURES

The Analytics Toolkit is based on the concept of ‘notebooks’, which are interactive web pages where users can write and run their own code.

EXAMPLE https://analytics-toolkit.nomad-coe.eu/beaker/#/session/WMA13G

Open a pop-up window with Motivation & References

Open the interactive options with the possible variables

Explains how to proceed step by step

Press Run to launch the toolkit

Several steps can be visualised

A plot is usually shown with the results

Toolkits with interactive plots for a better understanding

https://analytics-toolkit.nomad-coe.eu/beaker/#/session/vxNPfl

The Toolkit tutorials are notebooks using a GUI. However, the underlying code (written in Python and HTML) is fully accessible and user editable.

Experts can easily edit cells to generate their own modified code.

The user management system allows users to store and share new notebooks with other users.

NOMAD Advanced Graphics Seeing Helps Understanding

Contact concerning Advanced Graphics: Rubén García Hernández

Enable comprehensive analysis and interactive visualization of molecular simulations.

Remote Visualization

We develop an infrastructure for remote visualisation of the multi-dimensional NOMAD data.

DOCKER CONTAINERS
• Interactive platform for data visualisation.
• No need for hardware or software installations.
• Allows graphical analyses of complex data.

ENCYCLOPEDIA

Virtual Reality Environment for immersive data exploration, training and dissemination.

• Users feel that they are located inside the dataset.
• Datasets become easily visible and understandable.
• Mobile systems will be supported.

NOMAD Outreach & Industry Reaching the General Public and Companies

Contact concerning Industry Networking: Alessandro De Vita, Angel Rubio

The NOMAD Laboratory A European Centre of Excellence

General: Matthias Scheffler, Kylie O’Brien
Repository: Matthias Scheffler, Claudia Draxl
Archive: Fawzi Mohamed
Materials Encyclopedia: Georg Huhs
Big-Data Analytics Toolkit: Luca Ghiringhelli
Advanced Graphics: Rubén García Hernández
Industry Networking and Use Cases: Alessandro De Vita, Angel Rubio

THANKS FOR YOUR ATTENTION

H2020 NOMAD: This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 676580.
