A webinar on CSC's Services for Bio-users (23.03.2020)
CSC – Finnish ICT competence centre for research, education, culture and public administration

Outline
• Accessing CSC Services
• CSC Supercomputing Environment (i.e., Puhti)
• CSC Data Storage Environment (i.e., Allas)
• CSC Cloud Services (e.g., cPouta)
• Other Relevant Services for Bio-users
• Take Home Message

Accessing CSC Services

How to get access?
Your Haka/Virtu user ID is your access to our services.
• Use the CSC customer portal MyCSC (https://my.csc.fi/welcome)
• Register to get a personal CSC user account
• If your organization does not have Haka, please contact our customer service
Customer service
• Support and guidance: [email protected]
• Weekdays 8.30–16.00

More about our customer portal – my.csc.fi
• Register as a CSC customer
• Manage your account
• Manage your personal information
• Create projects
• Apply for resources
• Add more services
• Add members to projects
Visit: https://docs.csc.fi/accounts/

CSC Supercomputing Environment
Visit: https://docs.csc.fi/computing/overview/

CSC supercomputing environment
Why:
• Large computational and memory requirements
• Limited scalability on local clusters
• Parallel computing is needed
• Time-consuming operations
• Non-optimised programs
CSC options:
• Puhti – the successor of Taito
  o Puhti – supercomputer with Intel CPUs
  o Puhti-ai – supercomputer with GPUs
• Mahti – the successor of Sisu (close to piloting phase)
• Lumi – EuroHPC (system installations: Q4/2020)

Some basic info on the Puhti supercomputer
• Pre-installed bio-software stack on Puhti: https://docs.csc.fi/apps/
• Understand the Puhti workspace directories, their default quotas and maximum number of files
  o HOME – user-specific / small data …
  o PROJAPPL – project-specific / your installations / sharing project code …
  o SCRATCH – project-specific / actual data / temporary space / automatic cleaning / billing units
• Support for the module environment
  o module command module-name
• Slurm configuration for running batch jobs
  o note: #SBATCH --account=project_XXXXX
• Support for interactive jobs
• Support for Singularity containers

DEMO: Getting familiar with basic usage of Puhti

CSC Data Storage Environment (Allas)
https://docs.csc.fi/data/Allas/

Allas – object storage service
• Active data
• Project-based
• Sharing data

Allas – first steps for Puhti
• Use https://my.csc.fi to apply for Allas access for your project
  o Allas is not automatically available
• In Puhti and (in future) Mahti, set up the connection to Allas with the commands:
  module load allas
  allas-conf
• Refer to our manual pages and start using Allas with rclone or a-tools: https://docs.csc.fi/data/Allas/introduction/

Allas – a-tools
• A-tools provide an easy and safer way to use Allas
  o Developed for the CSC server environment (Puhti, Mahti), but you can install the tools on other Linux and Mac machines too.
  o Unlike rclone, a-tools do not overwrite or remove data without asking!
  o Automatic packing and compression.
  o Uses default bucket names based on the directories of Puhti
Visit: https://docs.csc.fi/data/Allas/using_allas/a_commands/

Example command with a-put
Command: a-put case1
[Figure: on Puhti, the directory /scratch/project_123/case1/ contains data1.txt, data2.txt and data3.txt. a-put packs it into case1.tar.zst and stores it in the Allas bucket 123-puhti-SCRATCH (within the Allas quota for project_123), together with the metadata object case1.tar.zst_ameta.]

DEMO: Getting familiar with basic usage of Allas

CSC Cloud Services (e.g., Pouta)

Cloud computing use cases
• "We need root access"
• Deploying tools with web interfaces
• CSC Private Cloud (ePouta) for sensitive data
• Don't want to wait in batch queues for jobs to run
• Advanced users – able to manage servers
• Difficult workflows – can't run on Puhti

CSC cloud service models
• Infrastructure as a Service (IaaS): CSC's ePouta/cPouta
• Platform as a Service (PaaS): CSC's Rahti, CSC's notebooks.csc.fi
• Software as a Service (SaaS): CSC's Chipster, ...

DEMO: a few examples of deploying web tools on cPouta

ePouta IaaS Cloud for sensitive data
• ePouta is a cloud computing environment (Infrastructure as a Service, IaaS) designed for processing sensitive data
• It allows customers to access, use and manage virtualized infrastructure using a self-service model.
• Further development is ongoing through ELIXIR activities

Other relevant services for bio-users

Notebooks
• Easy to use: no software installations, no firewall rules, no extra registrations. Log in with your Haka account.
• Blueprints available:
  o Jupyter Notebooks: customize your own interactive working environment
  o RStudio servers: data analytics and visualization
  o Apache Spark: crunch your big data
  o TensorFlow and Keras: deep learning and data analytics
Visit: https://notebooks.csc.fi

Chipster
• Easy to use
• 450 analysis tools
  o Single cell RNA-seq
  o RNA-seq
  o miRNA-seq
  o 16S amplicon seq
  o ChIP-seq
  o etc.
• Tutorials on YouTube
• Log in with Haka
• https://chipster.csc.fi

Training portfolio
https://www.csc.fi/training
Categories: Computing Platforms, Methods & Software, Programming, High-Performance Computing, Data, Networking, IT Security
Course topics:
• Linux 1, 2 and 3; CSC Computing Platforms; Cloud computing
• Finite Element Methods (Elmer); Comp. Fluid Dynamics (OpenFOAM); Molecular Dynamics (Gromacs); Quantum Chemistry (GPAW); Next-Generation Sequencing; Visualisation
• Fortran; Python / R; Scripting; PGAS languages
• Parallel programming; Accelerators; Optimisation; Debugging; Parallel I/O
• Data Intensive Computing; Data Management; Staging & Storage; Meta-data Repositories
• Network Administration; Network Technologies; Network Protocols; Network Services; Network Security
• Secure IT Practices; System Security; System workshops
Also available:
• Webinars on YouTube
• E-learning material
• CSC Summer School in HPC
• CSC Winter School in Bioinformatics
• CSC Spring School in Comp. Chemistry

Learning materials for bio-users
• Course and e-learning materials, tutorials and webinar recordings for bioscientists:
  o https://research.csc.fi/bioscience-learning-materials
  o https://research.csc.fi/rnaseq-tutorial
• Chipster: YouTube channel and course material packages
  o Course materials available for:
    o RNA-seq data analysis
    o Single cell RNA-seq data analysis
    o Virus detection using small RNA-seq
    o Community analysis of amplicon sequencing data (16S)
    o Detection and annotation of genomic variants
    o ChIP-seq data analysis
    o Microarray data analysis
https://research.csc.fi/biosciences

Fairdata.fi
• National integrated services for storing, describing, sharing and preserving research data
• Provided by MinEdu
• Produced by CSC and the National Library of Finland
• Make your data safe, documented and citable
  o IDA – Research data storage service
  o ETSIN – Research data finder
  o QVAIN – Research dataset metadata tool
  o FAIRDATA-PAS – Digital preservation for research data

Take Home Message
• Manage your CSC services via our customer portal: my.csc.fi
• Make use of CSC resources for your research
  o Resources are (mostly) free for open science research
  o The CSC environment is different from a laptop or a single workstation
• Participate in CSC training, read the materials and watch webinars on YouTube
• CSC user documentation pages: docs.csc.fi
• Join the [email protected] e-mail list and get our bioNewsletter
• Support and guidance: [email protected]
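
Appendix: a batch job sketch for Puhti
The Puhti slide mentions the module environment and the Slurm note "#SBATCH --account=project_XXXXX". The sketch below shows one way these pieces might fit together in a batch script. It is not taken from the webinar: the file name example_job.sh, the partition name small, the resource values and the module name biokit are all assumptions to replace with values from docs.csc.fi, and project_XXXXX is kept as the placeholder from the slide. The script only writes the job file and checks its syntax; it does not submit anything.

```shell
#!/usr/bin/env bash
# Write a minimal Slurm batch script and check its shell syntax.
set -euo pipefail

job_script="example_job.sh"   # hypothetical file name

# The quoted 'EOF' keeps the content literal (nothing is expanded here).
cat > "$job_script" <<'EOF'
#!/bin/bash
#SBATCH --job-name=example_job
# Billing project: replace the placeholder with your own project ID.
#SBATCH --account=project_XXXXX
# Assumed partition name and resource requests; adjust for your job.
#SBATCH --partition=small
#SBATCH --time=00:10:00
#SBATCH --ntasks=1
#SBATCH --mem-per-cpu=1G

# Load software through the module environment ("biokit" is an assumption).
module load biokit

# Run the actual work as a Slurm job step.
srun echo "Hello from $(hostname)"
EOF

# Syntax check only; nothing is submitted to the queue.
bash -n "$job_script"
echo "Wrote $job_script; on Puhti it could be submitted with: sbatch $job_script"
```

Keeping the #SBATCH lines at the top of the file, before the first command, matters because Slurm stops reading directives once the first executable line is reached.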