
Cluster Flow: A user friendly bioinformatics workflow tool Supplementary Information Figure S1: Example Pipeline Script /* ----------------------------------- FastQ to Bismark Pipeline (RRBS Data) ----------------------------------- This pipeline takes FastQ files as input, runs FastQC, Trim Galore, then processed with Bismark (alignment, deduplication, methylation extractor then tidying up and report generation). It requires a genome index with a corresponding genome folder and bismark converted genome. Trim Galore is run with the RRBS parameter. */ #fastqc #trim_galore RRBS #bismark_align #bismark_methXtract #bismark_report >bismark_summary_report Typical pipeline script showing analysis pipeline for reduced representation bisulfite sequencing (RRBS) data, from FastQ files to methylation calls with a project summary report. Pipeline steps will run in parallel for each read group for steps prefixed with a hash symbol (#). All input files will be channelled into the final process, prefixed with a greater-than symbol (>). Table S1: List of supported tools Program Module Name Description Program Website Name Custom script - takes a BED file and https://github.com/mel-astar/mel- ngs/blob/master/mel- bedToNrf - computes the NRF (non-redundant chipseq/chipseq- fraction). metrics/CalBedNrf.pl bedtools_bamToBed bedtools Convert BAM files to BED files and sort. http://bedtools.readthedocs.io bedtools_intersectN Filter out blacklisted regions from a BAM bedtools http://bedtools.readthedocs.io eg using a BED file. http://www.bioinformatics.babraha bismark_align Bismark Align Bisulfite data m.ac.uk/projects/bismark Remove duplicates from aligned Bisulfite http://www.bioinformatics.babraha bismark_deduplicate Bismark data m.ac.uk/projects/bismark Extract methylation calls from aligned http://www.bioinformatics.babraha bismark_methXtract Bismark Bisulfite data m.ac.uk/projects/bismark Create a sample report of Bismark http://www.bioinformatics.babraha bismark_report Bismark analysis m.ac.uk/projects/bismark bismark_summary_rep Create a summary report of a Bismark http://www.bioinformatics.babraha Bismark ort project m.ac.uk/projects/bismark Run either Bowtie 1 or 2, depending on bowtie Bowtie 1 / 2 read length bowtie1 Bowite 1 Align reads using Bowtie 1 http://bowtie-bio.sourceforge.net http://bowtie- bowtie2 Bowtie 2 Align reads ugins Bowtie 2 bio.sourceforge.net/bowtie2 bwa BWA Align reads using BWA http://bio-bwa.sourceforge.net cf_download Cluster Flow Download data from the web http://clusterflow.io Merge files using a regex (BAM, gzipped cf_merge_files Cluster Flow http://clusterflow.io or text). deeptools_bamCovera Creates a bigWig coverage from from a deepTools http://deeptools.readthedocs.io ge BAM file deeptools_bamFinger Creates a fingerprint plot for ChIP-seq deepTools http://deeptools.readthedocs.io print data FastQ Runs FastQ Screen to check for http://www.bioinformatics.babraha fastq_screen Screen contaminants m.ac.uk/projects/fastq_screen http://www.bioinformatics.babraha fastqc FastQC Runs FastQC for basic QC metrics m.ac.uk/projects/fastqc Subread Counts reads overlapping exons in a http://bioinf.wehi.edu.au/featureCo featureCounts featureCoun GTF file, grouped by gene. unts ts Mapping and quality control for Hi-C http://www.bioinformatics.babraha hicup HiCUP data. m.ac.uk/projects/hicup Program Module Name Description Program Website Name hisat2 HISAT2 Align RNA-seq reads with HISAT2 https://ccb.jhu.edu/software/hisat2 HTSeq- Counts reads overlapping exons in a http://www- htseq_counts huber.embl.de/users/anders/HTSe count GTF file. q/doc/count.html Quantify abundances of transcripts from kallisto Kallisto https://pachterlab.github.io/kallisto RNA-Seq data Aggregates results from bioinformatics multiqc MultiQC analyses across many samples into a http://multiqc.info single report Phantom phantompeaktools_ru Runs cross correlation analysis on BAM https://github.com/kundajelab/pha Peak Qual nSpp files ntompeakqualtools Tools http://broadinstitute.github.io/picar picard_dedup Picard Mark duplicates in a BAM files d Calculate the complexity of a sequencing http://smithlabresearch.org/softwa preseq_calc Preseq library re/preseq rseqc_geneBody_cove Calculate the average coverage across RSeQC http://rseqc.sourceforge.net rage genes rseqc_inner_distanc Calculate the distance between reads for RSeQC http://rseqc.sourceforge.net e an RNA-seq library QC plots for exon junction annoation and rseqc_junctions RSeQC http://rseqc.sourceforge.net saturation Calculates a histogram showing the GC rseqc_read_GC RSeQC http://rseqc.sourceforge.net content of reads samtools_bam2sam Samtools Convert a BAM file to SAM http://www.htslib.org Mark and remove duplicated sequences samtools_dedup Samtools http://www.htslib.org from BAM files samtools_sort_index Samtools Sort and index a BAM file http://www.htslib.org Extract csqual and csfasta files from .sra sra_abidump SRA-Tools http://ncbi.github.io/sra-tools input sra_fqdump SRA-Tools Extract FastQ files from .sra input http://ncbi.github.io/sra-tools https://github.com/alexdobin/STA star STAR Align RNA data with STAR R tophat TopHat Align RNA data with TopHat https://ccb.jhu.edu/software/tophat Trim adapter sequence and low quality http://www.bioinformatics.babraha trim_galore TrimGalore! bases from reads m.ac.uk/projects/trim_galore List of modules with tool description and URL. Core Cluster Flow modules excluded. List valid at time of writing for Cluster Flow v0.4. .
Details
-
File Typepdf
-
Upload Time-
-
Content LanguagesEnglish
-
Upload UserAnonymous/Not logged-in
-
File Pages3 Page
-
File Size-