A webinar on CSC’s Services for Bio-users (23.03.2020)
CSC – Finnish ICT centre of expertise for research, education, culture and public administration

Outline
• Accessing CSC Services
• CSC Supercomputing Environment (i.e., Puhti)
• CSC Data Storage Environment (i.e., Allas)
• CSC Cloud Services (e.g., cPouta)
• Other Relevant Services for Biousers
• Take Home Message

Accessing CSC Services

How to get access?
Your Haka/Virtu user ID is your access to our services.
• Use the CSC customer portal MyCSC (https://my.csc.fi/welcome)
• Register to get a personal CSC user account
• If your organization does not have Haka, please contact our customer service

Customer service
• Support and guidance: [email protected]
• Weekdays 8.30–16.00

More about our customer portal – my.csc.fi
• Register as a CSC customer
• Manage your account
• Manage your personal information
• Create projects
• Apply for resources
• Add more services
• Add members to projects
Visit: https://docs.csc.fi/accounts/

CSC Supercomputing Environment
Visit: https://docs.csc.fi/computing/overview/

CSC supercomputing environment
Why:
• Large computational and memory requirements
• Limited scalability on local clusters
• Parallel computing is needed
• Time-consuming operations
• Non-optimised programs
CSC options:
• Puhti – our successor of Taito
  o Puhti – supercomputer with Intel CPUs
  o Puhti-ai – supercomputer with GPUs
• Mahti – our successor of Sisu (close to piloting phase)
• Lumi – EuroHPC (system installations: Q4/2020)

Some basic info on Puhti Supercomputer
• Pre-installed bio-software stack on Puhti is listed at: https://docs.csc.fi/apps/
• Understand the Puhti workspace directories, their default quotas and maximum number of files
  o HOME – user-specific / small data …
  o PROJAPPL – project-specific / your installations / sharing project code …
  o SCRATCH – project-specific / actual data / temporary space / automatic cleaning / billing units
• Support for module environment
  o module command module-name
• Slurm configuration for running batch jobs
  o note: #SBATCH --account=project_XXXXX
• Support for interactive jobs
• Support for Singularity containers

DEMO: Getting familiar with basic usage of Puhti
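A minimal sketch of what a first batch job in such a demo might look like. Only the `--account=project_XXXXX` line and the `module command module-name` pattern come from the slides. The file name `my_job.sh`, the job name, the partition name `small` and the resource values are assumptions for illustration; check docs.csc.fi for the values valid for your project.

```shell
#!/bin/bash
# Write a minimal Slurm batch script to a file (hypothetical name: my_job.sh).
# project_XXXXX is the placeholder used on the slide; replace it with your project.
cat > my_job.sh <<'EOF'
#!/bin/bash
# Hypothetical job name
#SBATCH --job-name=demo_job
# Billing project (placeholder from the slide)
#SBATCH --account=project_XXXXX
# Assumed partition name; check the documentation for available partitions
#SBATCH --partition=small
# Wall-time limit (hh:mm:ss)
#SBATCH --time=00:10:00
# One task with one core and 1 GB of memory per core
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=1G

# Load the software you need here, following the pattern
#   module command module-name
echo "Running on $(hostname)"
EOF

# Submit only where Slurm is available (e.g. on a Puhti login node).
if command -v sbatch >/dev/null 2>&1; then
    sbatch my_job.sh
else
    echo "sbatch not found; my_job.sh was written but not submitted"
fi
```

The script is written through a quoted here-document so that `$(hostname)` is evaluated when the job runs on a compute node, not when the file is created.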

CSC Data Storage Environment (Allas)
https://docs.csc.fi/data/Allas/

Allas – object storage service
• Active data
• Project-based
• Sharing data

Allas – first steps for Puhti
• Use https://my.csc.fi to apply for Allas access for your project
  o Allas is not automatically available
• On Puhti and (in future) Mahti, set up the connection to Allas with the commands:
  module load allas
  allas-conf
• Refer to our manual pages and start using Allas with rclone or a-tools:
  https://docs.csc.fi/data/Allas/introduction/

Allas – a-tools
• A-tools provide an easy and safer way to use Allas
  o Developed for the CSC server environment (Puhti, Mahti), but you can also install the tools on other Linux and Mac machines.
  o Unlike rclone, a-tools do not overwrite or remove data without asking!
  o Automatic packing and compression.
  o Uses default bucket names based on the directories of Puhti.
Visit: https://docs.csc.fi/data/Allas/using_allas/a_commands/

Example command with a-put
Puhti: /scratch/project_123 (SCRATCH)
  case1/
    data1.txt
    data2.txt
    data3.txt
Allas: quota for project_123
  123-puhti-SCRATCH
    case1.tar.zst
    case1.tar.zst_ameta
Command: a-put case1
case1/ is packed into case1.tar.zst before upload.

DEMO: Getting familiar with basic usage of Allas
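A minimal sketch of such a session, under stated assumptions. The helper `default_allas_target` is hypothetical and is not part of a-tools: it only mirrors the naming shown in the a-put example in these slides (/scratch/project_123/case1 becomes 123-puhti-SCRATCH/case1.tar.zst), and the real tools may apply further rules. The `a-list` and `a-get` calls assume the a-tools are installed and that `module load allas` and `allas-conf` have already been run in the same shell.

```shell
#!/bin/bash
# Hypothetical helper: guess the default Allas target for a directory on
# Puhti scratch, following the example on the slide
# (/scratch/project_123/case1 -> 123-puhti-SCRATCH/case1.tar.zst).
default_allas_target() {
    local dir="${1%/}"                 # strip a trailing slash, if any
    local name="${dir##*/}"            # last path component, e.g. case1
    local project="${dir#/scratch/}"   # drop the /scratch/ prefix
    project="${project%%/*}"           # keep only the project directory
    echo "${project#project_}-puhti-SCRATCH/${name}.tar.zst"
}

default_allas_target /scratch/project_123/case1

# The real commands only make sense where the a-tools are available.
# Beforehand, in the same shell: module load allas; allas-conf
if command -v a-put >/dev/null 2>&1; then
    a-put case1                              # pack, compress and upload case1/
    a-list 123-puhti-SCRATCH                 # list the objects in the bucket
    a-get 123-puhti-SCRATCH/case1.tar.zst    # download and unpack the object
fi
```

On a machine without a-tools the script only prints the guessed target name, which makes it safe to try before Allas access has been granted.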

CSC Cloud Services (e.g., Pouta)

Cloud computing use cases
• “We need root access”
• Deploying tools with web interfaces
• CSC Private Cloud (ePouta) for sensitive data
• Don’t want to stand in batch queues for the execution of jobs
• Advanced users – able to manage servers
• Difficult workflows – can’t run on Puhti

CSC cloud service models
• Infrastructure as a Service (IaaS): CSC’s ePouta/cPouta
• Platform as a Service (PaaS): CSC’s Rahti, CSC’s notebooks.csc.fi
• Software as a Service (SaaS): CSC’s Chipster, ...

DEMO: A few examples of deploying web tools on cPouta

ePouta IaaS Cloud for sensitive data
• ePouta is a cloud computing environment (Infrastructure as a Service, IaaS) designed for processing sensitive data
• It allows customers to access, use and manage virtualized infrastructure using a self-service model
• Further development is ongoing through ELIXIR activities

Other relevant services for biousers

Notebooks
• Easy to use: no software installations, no firewall rules, no extra registrations. Log in with your Haka account.
• Blueprints available:
  o Jupyter Notebooks: customize your own interactive working environment
  o RStudio servers: data analytics and visualization
  o Apache Spark: crunch your big data
  o TensorFlow and Keras: deep machine learning and data analytics
Visit: https://notebooks.csc.fi

Chipster
• Easy to use
• 450 analysis tools
  o Single cell RNA-seq
  o RNA-seq
  o miRNA-seq
  o 16S amplicon seq
  o ChIP-seq
  o etc.
• Tutorials on YouTube
• Log in with Haka
• https://chipster.csc.fi

Training portfolio
https://www.csc.fi/training
• Computing Platforms: Linux 1, 2 and 3; CSC Computing Platforms; Cloud computing; System workshops
• Methods & Software: Finite Element Methods (Elmer); Comp. Fluid Dynamics (OpenFOAM); Molecular Dynamics (Gromacs); Quantum Chemistry (GPAW); Next-Generation Sequencing; Visualisation
• Programming: Fortran; Python / R; Scripting; Parallel programming; PGAS languages
• High-Performance Computing: Parallel programming; Accelerators; Optimisation; Debugging; Parallel I/O
• Data: Data Intensive Computing; Data Management; Staging & Storage; Parallel I/O; Meta-data Repositories
• Networking: Network Administration; Network Technologies; Network Protocols; Network Services; Network Security
• IT Security: Secure IT Practices; Network Security; System Security
• Watch webinars on YouTube
• E-learning material
• CSC Summer School in HPC
• CSC Winter School in Bioinformatics
• CSC Spring School in Comp. Chemistry

Learning materials for bio-users
• Course and eLearning materials, tutorials and webinar recordings for bioscientists:
  o https://research.csc.fi/bioscience-learning-materials
  o https://research.csc.fi/rnaseq-tutorial
• Chipster: YouTube channel and course material packages
  o Course materials available for:
    o RNA-seq data analysis
    o Single cell RNA-seq data analysis
    o Virus detection using small RNA-seq
    o Community analysis of amplicon sequencing data (16S)
    o Detection and annotation of genomic variants
    o ChIP-seq data analysis
    o Microarray data analysis
https://research.csc.fi/biosciences

Fairdata.fi
• National integrated services for storing, describing, sharing and preserving research data
• Provided by MinEdu
• Produced by CSC and National Library of Finland
• Make your data safe, documented and citable
  o IDA – Research data storage service
  o ETSIN – Research data finder
  o QVAIN – Research dataset metadata tool
  o FAIRDATA-PAS – Digital preservation for research data

Take Home Message
• Manage your CSC services via our customer portal: my.csc.fi
• Make use of CSC resources for your research
  o Resources are (mostly) free for open science research
  o The CSC environment is different from a laptop or a single workstation
• Participate in CSC training, read the materials and watch webinars on YouTube
• CSC user documentation pages: docs.csc.fi
• Join the [email protected] e-mail list and get our bioNewsletter
• Support and guidance: [email protected]