PeakMatcher: Matching Peaks Across Genome Assemblies Ronald J. Nowling Christopher R. Beal Scott Emrich Milwaukee School of Engineering Marquette University University of Tennessee–Knoxville Milwaukee, Wisconsin Milwaukee, Wisconsin Knoxville, Tennessee
[email protected] [email protected] [email protected] Susanta K. Behura Marc S. Halfon Molly Duman-Scheel University of Missouri SUNY at Buffalo Indiana University School of Medicine Columbia, Missouri Buffalo, New York South Bend, Indiana
[email protected] [email protected] [email protected] ABSTRACT The second data set was STARR-Seq (Self-Transcribing Active When reference genome assemblies are updated, the peaks from Regulatory Region Sequencing) of Drosophila melanogaster DNA DNA enrichment assays such as ChIP-Seq and FAIRE-Seq need in S2 culture cells [1]. We called STARR peaks against two versions to be called again using the new genome assembly. PeakMatcher (dm3 and r5.53) of the D. melanogaster genome [6]. PeakMatcher is an open-source package that aids in validation by matching matched 77.4% (precision) of the 4,195 dm3 peaks with 94.8% (recall) peaks across two genome assemblies using the alignment of reads of the 3,114 r5.53 peaks. or within the same genome. PeakMatcher calculates recall and PeakMatcher and associated documentation are available on precision while also outputting lists of peak-to-peak matches. GitHub (https://github.com/rnowling/peak-matcher) under the open- PeakMatcher uses read alignments to match peaks across genome source Apache Software License v2. PeakMatcher was written in assemblies. PeakMatcher finds all read aligned to one genome that Python 3 using the intervaltree library.