Bioinformatics Software Installation with Conda
Linux as a Computational Platform: 9/17/2020

Bioinformatics software is all over the place → dependencies are complex

• Dependencies are base packages that software written in Python, C, and other languages relies on.
• Installing them by hand can require C programming experience, and it is just not how you want to spend your time.
• A version of something required by thing 1 may conflict with a version required by thing 2.
• Package managers/installers to the rescue!

Conda is for Data Science, Bioinformatics

• Installs Python by default; can handle different environments for Python 2, Python 3, and also R.
• Anaconda Navigator: GUI
    • Includes Jupyter notebooks
    • A script editor with cool features
    • Environment and package manager
• Miniconda: command line only
    • Just the environment and package manager
• OK to have both
• Windows users: Anaconda Navigator and the Linux subsystem will be separate environments

Installing conda

• Go to: docs.conda.io/en/latest/miniconda.html
• Mac OS:
    1. Right-click on Python 3.8, Miniconda3 MacOSX 64-bit bash to copy the link.
    2. Use wget and paste in the link.
       $ wget <pasted link>
    3. $ shasum -a 256 Miniconda3-latest-MacOSX-x86_64.sh
       (compare the output with the SHA256 hash listed on the download page)
    4. $ bash Miniconda3-latest-MacOSX-x86_64.sh
• Windows Linux Subsystem:
    1. Right-click on Python 3.8, Miniconda3 Linux 64-bit to copy the link.
    2. Use wget in your Linux terminal and paste in the link.
       $ wget <pasted link>
    3. $ sha256sum Miniconda3-latest-Linux-x86_64.sh
       (compare the output with the SHA256 hash listed on the download page)
    4. $ bash Miniconda3-latest-Linux-x86_64.sh
• Then, in the installer (either platform):
    5. Accept license.
    6. Say ‘no’ for running conda init.

For macOS Catalina users whose shell is zsh (run echo $SHELL to check):
$ source /bin/activate (we may have to search for it)
$ conda init zsh

For everyone else:
$ conda init
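The checksum step in the installation list above only prints a hash; the comparison against the value on the download page is left to the eye. A minimal sketch of automating that comparison follows. The function names `sha256_of` and `verify_sha256` are hypothetical helpers, not part of conda, and the expected hash must be copied by hand from the Miniconda download page.

```shell
#!/usr/bin/env bash
# Sketch: verify a downloaded installer against a published SHA-256 hash.
# sha256_of and verify_sha256 are illustrative names, not conda commands.

# Print the SHA-256 hex digest of a file, using whichever tool is installed.
sha256_of() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | awk '{print $1}'       # Linux / Windows Linux Subsystem
    else
        shasum -a 256 "$1" | awk '{print $1}'   # macOS
    fi
}

# Return 0 if the file's digest equals the expected hash, 1 otherwise.
verify_sha256() {
    local file="$1" expected="$2" actual
    actual="$(sha256_of "$file")"
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file" >&2
        return 1
    fi
}
```

One possible use is to call `verify_sha256 Miniconda3-latest-Linux-x86_64.sh` followed by the hash copied from the download page, and to run the installer with bash only if the function reports OK.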

Now moving forward:
$ conda config --add channels conda-forge
conda-forge is a “channel” where a lot of packages come from.
$ conda install numpy
$ conda search wget
$ conda install tree

Let’s install bioinformatics software!!!

• $ conda install -c bioconda bwa
• bioconda is another channel that has bioinformatics software
• Confirm the downloads and installations when prompted
• $ which bwa
• $ bwa
• What else might you use? blast? bedtools? Try:
• $ conda install -c bioconda blast
• $ conda install -c bioconda bedtools repeatmasker

On Summit

$ ssh -l [email protected] login.rc.colostate.edu
$ nano ~/.condarc
…and paste the following four lines:
pkgs_dirs:
  - /projects/$USER/.conda_pkgs
envs_dirs:
  - /projects/$USER/software/anaconda/envs
$ source /curc/sw/anaconda3/latest
$ conda create -n py3.8 python==3.8
$ conda activate py3.8
$ conda config --add channels conda-forge
$ conda install -c bioconda hisat2 fastp bedtools fastqc
You will have to run conda activate py3.8 each time you log in again.
More info: https://curc.readthedocs.io/en/latest/software/python.html

• See Best Practices For Scientific Computing, PLoS, 2014.
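As an alternative to pasting the four .condarc lines into nano in the Summit steps, the file can be written from the shell. This is a sketch only: `write_condarc` is a hypothetical helper, it overwrites any existing file at the target path, and it assumes the two directory settings from the Summit steps are the only ones wanted.

```shell
#!/usr/bin/env bash
# Sketch: write the four .condarc lines without opening an editor.
# write_condarc is an illustrative name; it OVERWRITES the target file.

write_condarc() {
    # Target defaults to ~/.condarc; pass another path to try it out safely.
    local target="${1:-$HOME/.condarc}"
    # The quoted 'EOF' stops the shell from expanding $USER here, so the
    # file contains the literal text $USER, as in the lines pasted by hand.
    cat > "$target" <<'EOF'
pkgs_dirs:
  - /projects/$USER/.conda_pkgs
envs_dirs:
  - /projects/$USER/software/anaconda/envs
EOF
}
```

Calling `write_condarc /tmp/condarc.test` and inspecting the result with cat is a low-risk way to check the output before running `write_condarc` with no argument.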