Cluster Flow: a User Friendly Bioinformatics Workflow Tool

Supplementary Information

Figure S1: Example Pipeline Script

    /*
    -----------------------------------
    FastQ to Bismark Pipeline (RRBS Data)
    -----------------------------------
    This pipeline takes FastQ files as input, runs FastQC, Trim Galore,
    then processed with Bismark (alignment, deduplication, methylation
    extractor then tidying up and report generation). It requires a genome
    index with a corresponding genome folder and bismark converted genome.
    Trim Galore is run with the RRBS parameter.
    */

    #fastqc
    #trim_galore RRBS
    	#bismark_align
    		#bismark_methXtract
    			#bismark_report
    >bismark_summary_report

Typical pipeline script showing an analysis pipeline for reduced representation bisulfite sequencing (RRBS) data, from FastQ files to methylation calls with a project summary report. Steps prefixed with a hash symbol (#) run in parallel for each read group. All input files are channelled into the final process, which is prefixed with a greater-than symbol (>).

Table S1: List of supported tools

| Module Name | Program Name | Description | Program Website |
|---|---|---|---|
| bedToNrf | - | Custom script: takes a BED file and computes the NRF (non-redundant fraction). | https://github.com/mel-astar/mel-ngs/blob/master/mel-chipseq/chipseq-metrics/CalBedNrf.pl |
| bedtools_bamToBed | bedtools | Convert BAM files to BED files and sort. | http://bedtools.readthedocs.io |
| bedtools_intersectNeg | bedtools | Filter out blacklisted regions from a BAM using a BED file. | http://bedtools.readthedocs.io |
| bismark_align | Bismark | Align Bisulfite data | http://www.bioinformatics.babraham.ac.uk/projects/bismark |
| bismark_deduplicate | Bismark | Remove duplicates from aligned Bisulfite data | http://www.bioinformatics.babraham.ac.uk/projects/bismark |
| bismark_methXtract | Bismark | Extract methylation calls from aligned Bisulfite data | http://www.bioinformatics.babraham.ac.uk/projects/bismark |
| bismark_report | Bismark | Create a sample report of Bismark analysis | http://www.bioinformatics.babraham.ac.uk/projects/bismark |
| bismark_summary_report | Bismark | Create a summary report of a Bismark project | http://www.bioinformatics.babraham.ac.uk/projects/bismark |
| bowtie | Bowtie 1 / 2 | Run either Bowtie 1 or 2, depending on read length | |
| bowtie1 | Bowtie 1 | Align reads using Bowtie 1 | http://bowtie-bio.sourceforge.net |
| bowtie2 | Bowtie 2 | Align reads using Bowtie 2 | http://bowtie-bio.sourceforge.net/bowtie2 |
| bwa | BWA | Align reads using BWA | http://bio-bwa.sourceforge.net |
| cf_download | Cluster Flow | Download data from the web | http://clusterflow.io |
| cf_merge_files | Cluster Flow | Merge files using a regex (BAM, gzipped or text). | http://clusterflow.io |
| deeptools_bamCoverage | deepTools | Creates a bigWig coverage file from a BAM file | http://deeptools.readthedocs.io |
| deeptools_bamFingerprint | deepTools | Creates a fingerprint plot for ChIP-seq data | http://deeptools.readthedocs.io |
| fastq_screen | FastQ Screen | Runs FastQ Screen to check for contaminants | http://www.bioinformatics.babraham.ac.uk/projects/fastq_screen |
| fastqc | FastQC | Runs FastQC for basic QC metrics | http://www.bioinformatics.babraham.ac.uk/projects/fastqc |
| featureCounts | Subread featureCounts | Counts reads overlapping exons in a GTF file, grouped by gene. | http://bioinf.wehi.edu.au/featureCounts |
| hicup | HiCUP | Mapping and quality control for Hi-C data. | http://www.bioinformatics.babraham.ac.uk/projects/hicup |
| hisat2 | HISAT2 | Align RNA-seq reads with HISAT2 | https://ccb.jhu.edu/software/hisat2 |
| htseq_counts | HTSeq-count | Counts reads overlapping exons in a GTF file. | http://www-huber.embl.de/users/anders/HTSeq/doc/count.html |
| kallisto | Kallisto | Quantify abundances of transcripts from RNA-Seq data | https://pachterlab.github.io/kallisto |
| multiqc | MultiQC | Aggregates results from bioinformatics analyses across many samples into a single report | http://multiqc.info |
| phantompeaktools_runSpp | Phantom Peak Qual Tools | Runs cross correlation analysis on BAM files | https://github.com/kundajelab/phantompeakqualtools |
| picard_dedup | Picard | Mark duplicates in a BAM file | http://broadinstitute.github.io/picard |
| preseq_calc | Preseq | Calculate the complexity of a sequencing library | http://smithlabresearch.org/software/preseq |
| rseqc_geneBody_coverage | RSeQC | Calculate the average coverage across genes | http://rseqc.sourceforge.net |
| rseqc_inner_distance | RSeQC | Calculate the distance between reads for an RNA-seq library | http://rseqc.sourceforge.net |
| rseqc_junctions | RSeQC | QC plots for exon junction annotation and saturation | http://rseqc.sourceforge.net |
| rseqc_read_GC | RSeQC | Calculates a histogram showing the GC content of reads | http://rseqc.sourceforge.net |
| samtools_bam2sam | Samtools | Convert a BAM file to SAM | http://www.htslib.org |
| samtools_dedup | Samtools | Mark and remove duplicated sequences from BAM files | http://www.htslib.org |
| samtools_sort_index | Samtools | Sort and index a BAM file | http://www.htslib.org |
| sra_abidump | SRA-Tools | Extract csqual and csfasta files from .sra input | http://ncbi.github.io/sra-tools |
| sra_fqdump | SRA-Tools | Extract FastQ files from .sra input | http://ncbi.github.io/sra-tools |
| star | STAR | Align RNA data with STAR | https://github.com/alexdobin/STAR |
| tophat | TopHat | Align RNA data with TopHat | https://ccb.jhu.edu/software/tophat |
| trim_galore | TrimGalore! | Trim adapter sequence and low quality bases from reads | http://www.bioinformatics.babraham.ac.uk/projects/trim_galore |

List of modules with tool description and URL. Core Cluster Flow modules excluded. List valid at time of writing for Cluster Flow v0.4.
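To make the prefix convention described for Figure S1 concrete, the following is a minimal Python sketch that reads a pipeline script and reports, for each step, whether it runs once per read group (`#`) or once on all files together (`>`). It is not Cluster Flow's own parser: the names `Step` and `parse_pipeline` are hypothetical, leading whitespace is ignored, and only the two prefix symbols mentioned in the figure caption are recognised.

```python
import re
from typing import NamedTuple


class Step(NamedTuple):
    module: str      # module name, e.g. "trim_galore"
    params: list     # space-separated parameters following the name
    per_group: bool  # True for '#' steps, False for '>' steps


def parse_pipeline(text: str) -> list:
    """Classify pipeline steps by prefix symbol (illustrative helper only)."""
    # Drop the /* ... */ description block at the top of the script.
    body = re.sub(r"/\*.*?\*/", "", text, flags=re.DOTALL)
    steps = []
    for line in body.splitlines():
        line = line.strip()  # assumption: leading whitespace is ignored here
        if not line:
            continue
        prefix, rest = line[0], line[1:].split()
        if prefix not in "#>" or not rest:
            raise ValueError(f"unrecognised pipeline line: {line!r}")
        steps.append(Step(rest[0], rest[1:], prefix == "#"))
    return steps


# Steps copied from Figure S1; the description comment is shortened.
EXAMPLE = """/* FastQ to Bismark Pipeline (RRBS Data) */
#fastqc
#trim_galore RRBS
#bismark_align
#bismark_methXtract
#bismark_report
>bismark_summary_report
"""

for step in parse_pipeline(EXAMPLE):
    kind = "per read group" if step.per_group else "all files together"
    print(f"{step.module:24s} {kind:20s} {step.params}")
```

Run on the Figure S1 steps, the sketch marks every module as a per-read-group step except `bismark_summary_report`, which receives all files at once, and it records `RRBS` as a parameter passed to `trim_galore`.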
