3rd Annual Center for Integrated Research Computing Poster Session, May 10, 2013

2013 Annual Poster Session


Computational and Characterization of 2-D Photonic Crystal Geometries for Biosensing
James E. Baker and Benjamin L. Miller
Physics and Astronomy
Dermatology
Biochemistry and Biophysics
Biomedical

The importance of early disease diagnosis, both for initiating successful clinical treatment and for preventing disease transmission, continues to drive the development of rapid, ultrasensitive, label-free biosensors. Sensors based on two-dimensional photonic bandgap crystal structures, in particular, have the potential for single-pathogen detection capabilities. To achieve such high sensitivity, the geometric structure of the photonic crystal must be designed so that pathogen binding is evident in the optical transmission spectrum of the crystal. Computational modeling methods allow us to simulate the electric field profiles and transmission spectra that we expect to observe for different photonic crystal geometries. Ongoing computational characterization of photonic crystal designs is presented.

Accelerating Design Using Pre-Selected Sequences
Stanislav Bellaousov and David Mathews
Biochemistry and Biophysics

Nanoscale nucleic acids could potentially be designed to be catalysts, pharmaceuticals, or probes for detecting pathogens. We hypothesize that designing nucleic acid molecules from pre-selected sequences, rather than from random sequences, would increase the speed of designing large molecules and also increase the accuracy of design. Helices should be formed in the optimal folding free energy change range and have maximal structure probability and minimal ensemble defect. Loops should be composed of sequences with the lowest ensemble free energy change. All sequences should have a low tendency to cross- and self-hybridize. These features are observed in RNA sequences with known structure. We demonstrate that pre-selected sequences accelerate the design of structures that are mimics of biologically relevant structures. This is implemented as a new structure design component of RNAstructure (http:// .urmc.rochester.edu/RNAstructure.html). This work is a collaboration with Celadon Laboratories, Inc. (http://www.celadonlabs.com/).

Next Generation Sequence Analysis of the Transcriptional Response to Neonatal Hyperoxia
Soumyaroop Bhattacharya, Chin-Yi Chu, Zhongyang Zhou, Min Yee, Ashley M. Lopez, Valerie A. Lunger, Bradley Buczynski, Gloria S. Pryhuber, Michael A. O'Reilly, and Thomas J. Mariani
Pediatrics

Rationale: Bronchopulmonary Dysplasia (BPD) is a major complication of preterm birth associated with significant morbidity. Due to the complexity of risk factors and the limited availability of clinical samples, understanding of this disease, its potential biomarkers, and its causal mechanisms is limited. BPD is a debilitating condition characterized by inflammation, enlarged airspaces, vascular dysmorphia and aberrant extracellular matrix accumulation that is typically described as arrested lung development. This is in part due to oxidative stress, resulting from therapeutic oxygen supplementation, that disrupts critical pathways of lung development. Rodent models involving neonatal exposure to excessive oxygen concentrations (hyperoxia) have been used to study the mechanisms contributing to BPD pathology. Transcriptomic assessment of the effects of hyperoxia in neonatal mouse lungs using RNA-Seq will help to identify genes and pathways associated with BPD.

Methods: Whole lung tissue from newborn C57BL/6 mice exposed to 100% oxygen for 10 days (n=8) and room air-exposed age matched controls (n=6) were compared. Total RNA was isolated from

individual whole lung tissues (n=14) and pooled in duplicates to perform transcriptome sequencing (RNA-Seq) using the Illumina Genome Analyzer II platform. Alignments were generated using multiple algorithms (ELAND v1.8 using Illumina CASAVA software; TopHat; and Shrimp2). Raw counts obtained from each alignment algorithm were further normalized using trimmed-mean and reads per million bases. Normalized gene expression data were filtered to remove undetected genes. Differentially expressed genes were detected using Significance Analysis of Microarrays (SAM) and CuffDiff2, on each version of mapped and normalized data. Ingenuity Pathway Analysis (IPA) was used for pathway and network analyses. Expression patterns for selected genes were examined by quantitative polymerase chain reaction (qPCR).

Results: A total of 248 genes were identified as differentially expressed between hyperoxia and control samples by both SAM (at median FDR = 0) and CuffDiff2 (at p < 0.05) and had a magnitude of change greater than or equal to two-fold (fold change ≥ 2). We successfully validated 36 of 51 genes by qPCR. There was a clear association between the magnitude of change identified by RNA-Seq and subsequent qPCR validation. There were also differences in the rate of validation dependent upon the mapping and gene selection approaches used. Canonical pathways significantly dysregulated in hyperoxia lungs included Nrf2-mediated oxidative stress signaling, p53 signaling, hepatic fibrosis and sildenafil pathways. Interestingly, most genes significantly affected (~70%) showed a pattern of expression consistent with an arrest in lung development following hyperoxia. A subset of the genes dysregulated in hyperoxic neonatal mouse lungs (13%) were also differentially expressed (as defined by t-test p<0.05) in human BPD lung tissue (Bhattacharya et al., 2012, AJRCCM).

Summary: We have generated genome-wide expression data from hyperoxia-exposed neonatal mouse lung tissue using RNA-Seq. We have identified and validated genes dysregulated in this model of BPD-like pathology. This model captures some aspects of BPD. Further analysis of these data will enhance our current knowledge of BPD, and may be useful for developing novel therapeutic strategies.

The Effects of Inhomogeneities within Colliding Flows on the Formation and Evolution of Molecular Clouds
Jonathan Carroll-Nellenback, Adam Frank, and Fabian Heitsch
Center for Integrated Research Computing
Physics and Astronomy

Observational evidence from local star-forming regions mandates that star formation occurs shortly after, or even during, molecular cloud formation. Models of the formation of molecular clouds in large-scale converging flows have identified the physical mechanisms driving the necessary rapid fragmentation. They also point to global gravitational collapse driving supersonic turbulence in molecular clouds. Previous cloud formation models have focused on turbulence generation, gravitational collapse, magnetic fields, and feedback. Here, we explore the effect of structure in the flow on the resulting clouds and the ensuing gravitational collapse. We compare two extreme cases, one with a collision between two smooth streams, and one with streams containing small clumps. We find that structured converging flows lead to a delay of local gravitational collapse ("star formation"). Thus, more gas has time to accumulate, eventually leading to a strong global collapse, and thus to a high star formation rate. Uniform converging flows fragment hydrodynamically early on, leading to the rapid onset of local gravitational collapse and an overall low sink formation rate.
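The read-count normalization and two-fold filter described in the RNA-Seq abstract above (Bhattacharya et al.) can be illustrated with a small sketch. It assumes plain per-million scaling and a symmetric fold-change cutoff; it does not reproduce the trimmed-mean normalization, SAM, or CuffDiff2 used in the study. The function names, the pseudocount, and the toy gene counts are all hypothetical.

```python
def counts_per_million(counts):
    """Scale one sample's raw read counts so that they sum to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no mapped reads")
    return {gene: 1e6 * n / total for gene, n in counts.items()}


def fold_change_hits(treated, control, min_fold=2.0, pseudocount=1.0):
    """Return {gene: ratio} for genes changed at least min_fold in either direction.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for genes that are undetected in the control sample.
    """
    hits = {}
    for gene in treated.keys() & control.keys():
        ratio = (treated[gene] + pseudocount) / (control[gene] + pseudocount)
        if ratio >= min_fold or ratio <= 1.0 / min_fold:
            hits[gene] = ratio
    return hits


# Hypothetical toy counts, not data from the study.
hyperoxia = {"geneA": 400, "geneB": 100, "geneC": 500}
room_air = {"geneA": 100, "geneB": 100, "geneC": 800}

hits = fold_change_hits(counts_per_million(hyperoxia), counts_per_million(room_air))
print(sorted(hits))  # genes passing the two-fold filter
```

In a real analysis the fold-change filter would be combined with a statistical test, as the abstract does with SAM and CuffDiff2; the ratio alone says nothing about significance.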


Prediction of Nucleic Acid α and ζ Torsions From 31P NMR Chemical Shifts
David E. Condon, Brendan C. Mort, Scott D. Kennedy, and Douglas H. Turner
Chemistry

Structural databases were searched for nucleic acids that had both 31P NMR chemical shifts assigned and known α and ζ torsion angles, specifically structures that had non-A-form characteristics, e.g. a hairpin. Quantum mechanical calculations were used to predict NMR chemical shifts using the GIAO method. Hartree-Fock and the DFT functionals wB97X-D, SSB-D, B3LYP, and a mean functional of all five were tested to see which would give the best agreement with the empirical data. Here, the NMR 31P chemical shifts estimated with B3LYP are recommended as a guide for future structural studies.

Interactions of the Antifungal Fengycin with Model Biomembranes Characterized using Molecular Simulation
Aaron Cravens, Joshua N. Horn, and Alan Grossfield
Biochemistry and Biophysics

With the advent of many immuno-compromising medicines as well as drug-resistant strains of fungi, there is an increasing need for new classes of antifungal compounds. The lipopeptide fengycin (FE), produced by Bacillus subtilis, has demonstrated potent antifungal activity while having little inhibitory effect towards mammalian cells and bacteria. Experiment indicates fengycin acts as a toxic agent via cytoplasmic membrane disruption, resulting in permeabilization of the cell membrane. Here we explore this hypothesis using coarse-grained molecular dynamics simulations. While the insertion of multiple fengycins into our model bilayers causes notable thinning, there is no significant difference in average thinning between the three bilayer types. This is likely due to fengycin clustering in the different types of membranes. Membranes with experimental susceptibility to fengycin demonstrated more clustering in our simulations. We hypothesize that fengycin-resistant membranes disperse fengycin about the membrane, eliminating radical pockets of curvature.

Discovery of Novel ncRNA by Scanning Multiple Genome Alignments
Yinghan Fu, Zhenjiang Xu, Zhi J. Lu, Shan Zhao, and David H. Mathews
Biochemistry and Biophysics

Recently, non-coding RNAs (ncRNAs) have been discovered with novel functions, and it has been appreciated that there is pervasive transcription. Therefore, de novo computational ncRNA detection methods that are accurate and efficient are desirable. The purpose of this study is to develop a ncRNA detection method based on structural conservation. A new method called Multifind, based on Multilign (Xu & Mathews 2011), was developed. It uses an algorithm that predicts common structures among multiple sequences and estimates the probability that the input sequences are ncRNA using a classification support vector machine (SVM). Multilign uses Dynalign (Mathews & Turner 2002), which folds and aligns two sequences simultaneously without requiring any sequence identity; its structure prediction quality will therefore not be affected by input sequence diversity. Benchmarks showed Multifind performs better than RNAz on testing sequences extracted from the Rfam database (Gardner et al. 2011), especially on sequences that are more diverse. For de novo ncRNA discovery in genomes, Multifind had an advantage in low similarity regions of genome alignments. Multifind takes about 48 hours to finish scanning the whole yeast genome alignment, whereas RNAz takes about 4 hours; its computational requirements therefore do not present a barrier for most users. The program was implemented in C++ and is included in the RNAstructure package (Reuter & Mathews, 2010): http://rna.urmc.rochester.edu.
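The abstract on ncRNA discovery does not say how Multifind walks along a genome alignment. A common approach for this kind of scan, assumed here purely for illustration, is to cut the alignment into overlapping fixed-width windows and score each window separately. The function name, window width, and step below are hypothetical and are not taken from Multifind or RNAstructure.

```python
def alignment_windows(alignment, window=120, step=40):
    """Yield (start, block) pairs over a multiple sequence alignment.

    alignment is a list of equal-length aligned rows (strings, gaps as '-').
    Each block holds the same column range from every row. A final window
    is added when needed so that the last columns are always covered.
    """
    if not alignment:
        return
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned rows must have the same length")
    last_start = max(length - window, 0)
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)
    for start in starts:
        yield start, [row[start:start + window] for row in alignment]


# Hypothetical two-row alignment, scanned with a tiny window for readability.
rows = ["ACGUACGUACG", "AC-UACGUAC-"]
windows = list(alignment_windows(rows, window=4, step=3))
for start, block in windows:
    print(start, block)
```

Each block would then be passed to whatever classifier scores a window; in the abstract that role is played by structure prediction plus an SVM.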


References:

Xu, Z.J. and Mathews, D.H. (2011) Multilign: an algorithm to predict secondary structures conserved in multiple RNA sequences, Bioinformatics, 27, 626-632.
Mathews, D.H. and Turner, D.H. (2002) Dynalign: An algorithm for finding the secondary structure common to two RNA sequences, Journal of Molecular Biology, 317, 191-203.
Gardner, P.P., et al. (2011) Rfam: Wikipedia, clans and the "decimal" release, Nucleic Acids Research, 39, D141.
Reuter, J.S. and Mathews, D.H. (2010) RNAstructure: software for RNA secondary structure prediction and analysis, BMC Bioinformatics, 11.

Whole Exome Sequencing at URMC for Gene Discovery in Developmental Brain Disorders using SOLVE for Causative Variant Identification
Dalia Ghoneim, Emily Tuttle, and Alex R. Paciorkowski
Center for Neural Development and Disease

Whole exome sequencing has the potential to identify causative variants in the coding genomic sequence in a variety of human Mendelian disorders. Pediatric developmental brain disorders such as autism and epilepsy are frequently de novo, and identification of disease-causing variants remains a challenge. We have implemented an analysis pipeline on the BlueHive cluster at the CIRC and applied it to whole exome datasets from 20 families with autism and/or epilepsy. BWA aligns fastq files to the hg19 reference genome. Samtools and Picard prepare the resulting data for further analysis. GATK then generates trio-specific variant call files. Annovar provides initial variant annotation. To address the challenges of causative variant identification, we developed SOLVE, a publicly available workflow that first filters variants by call quality and suspected de novo vs. recessive pedigree characteristics. SOLVE then presents variants for final annotation using the web-based bioinformatics platforms DBDB and Lynx. The workflow in its entirety requires a mean of 11.8 hours/trio on BlueHive. We have identified causative variants for pediatric epilepsy in 3 families and have confirmed these by Sanger sequencing. Our analysis has identified putatively causative variants in 4 additional families with combined epilepsy and autism phenotypes. This model addresses some of the challenges of whole exome sequencing for neurodevelopmental disorders and identifies areas for further optimization. The model can be modified for other phenotypes where whole exome sequencing for gene discovery is used.

The Influence of Geometric Surfactants on Phase Behavior of Hard Spheres and Spherocylinders
Alokendra Ghosh and Mitchell Anthamatten
Chemical Engineering

Hard-core repulsive systems are computationally inexpensive and useful models for investigating phase behavior physics of real systems including liquid crystals, polymers, , viruses and colloids. This work aims to investigate, through constant pressure Monte Carlo simulations, the entropy-driven phase behavior of binary mixtures of hard spheres and spherocylinders. The simulations also investigate how the presence of shape dyads, each consisting of a sphere tethered to a spherocylinder, affects ordering and miscibility of spheres with spherocylinders. The primary types of phase behavior of interest include: a) the ordering behavior of the spherocylinder component and how spheres influence the ordering transition; and b) bulk demixing, which involves concentration or depletion of species from regions of the simulation volume with length scales varying from a few particle diameters to the order of the box length. The general simulation program for three component mixtures was calibrated successfully against literature data for pure components. Then, it was used to generate phase diagrams for binary mixtures over a wide range of compositions and different particle

sizes. Finally, a small amount of dyads was added to investigate their effect on the phase behavior.

Hα and [S II] Emission from MHD Simulations of YSO Jets
Edward Hansen and Adam Frank
Physics and Astronomy

We study simulations of two-dimensional axisymmetric jets using the MHD code AstroBEAR. These simulations are based on those done by de Colle and Raga (2006). The jets are pulsed via a sinusoidally time-dependent ejection velocity. As the pulses run over each other, a complicated structure of internal shocks and rarefactions is formed. We compare a hydrodynamic run with magnetized runs of different field strengths. The field inside these jets is purely toroidal. We have implemented some micro-physics in the code, namely hydrogen and helium ionization and recombination, which enable us to produce and analyze emission maps of Hα and [S II]. Strong Hα emission typically marks shock fronts, and strong [S II] emission occurs inside cooling regions behind shocks. The results are consistent with this, and they also show enhanced emission from clumps within the jets. Furthermore, an increase in field strength shows an increase in shock velocities and jet collimation. We have also extended the same simulations to three dimensions. Simulations such as these are undoubtedly important to our understanding of observations of Herbig-Haro objects.

Exploring Rhodopsin-Bilayer Interactions via Coarse-Grained Molecular Dynamics Simulation
Joshua N. Horn, Ta-Chun Kao, and Alan Grossfield
Biochemistry and Biophysics

Proteins are dynamic in structure, with molecular motions dictated primarily by local physical forces. Integral membrane proteins differ in the sense that the heterogeneous environment plays a major role in protein flexibility and, in turn, function. Rhodopsin, a G protein-coupled receptor, is a membrane protein whose function is dependent on major environmental factors, including lipid composition, cholesterol concentration, and the ionic strength of the surrounding solvent. In this work, we further explored these effects by utilizing coarse-grained molecular dynamics to simulate large, native-like membranes for long timescales. We discovered clear preferences at the surface of the protein for polyunsaturated lipid tails, an effect that has been explored before with all-atom simulation, though not at the timescales present in this work. We also noted preferential binding regions for cholesterol and POPE lipids.

Mass Loading and Knot Formation in AGN Jets by Stellar Winds
Martin Huarte-Espinosa, Eric G. Blackman, Alex Hubbard, and Adam Frank
Institute of Optics
Physics and Astronomy

Jets from active galaxies propagate from the central black hole out to the radio lobes on scales of hundreds of kiloparsecs. The jets may encounter giant stars with strong stellar winds and produce observable signatures. For strong winds and weak jets, the interaction may truncate the jet flow during its transit via the mass loading. For weaker jets, the interaction can produce knots in the jet. We present recent 3D MHD numerical simulations to model the evolution of this jet-wind interaction and its observational consequences. We explore (i) the relative mechanical luminosity of the radio jets and the stellar winds, (ii) the impact parameter between the jets' axis and the stellar orbital path, and (iii) the relative magnetic field strength of the jets and the stellar winds.
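The sinusoidal pulsing described in the YSO jet abstract (Hansen and Frank) can be written as v(t) = v0 (1 + A sin(2πt/P)). The sketch below simply evaluates that form; the mean speed, relative amplitude, and period are hypothetical placeholders, not values from the AstroBEAR runs, and the units are whatever the caller chooses.

```python
import math


def ejection_velocity(t, v0=200.0, amplitude=0.25, period=30.0):
    """Jet launch speed at time t: v0 * (1 + amplitude * sin(2*pi*t/period)).

    v0 is the mean speed, amplitude the fractional modulation, and period
    the pulse period. All default values are illustrative assumptions.
    """
    return v0 * (1.0 + amplitude * math.sin(2.0 * math.pi * t / period))


# Sample one full period: fast material launched after slow material
# eventually catches up with it, which is how internal shocks arise.
samples = [(t, ejection_velocity(t)) for t in (0.0, 7.5, 15.0, 22.5, 30.0)]
for t, v in samples:
    print(f"t = {t:5.1f}  v = {v:7.2f}")
```

With these placeholder values the launch speed swings between 150 and 250, so each fast pulse overtakes the slower gas ahead of it.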


Multiple Exciton Generation in Quantum Dots
Heather Jaeger and Oleg Prezhdo
Chemistry

The unique properties of semiconductor quantum dots facilitate multi-exciton generation (MEG), the creation of more than one electron-hole pair per photon absorbed. When photons of energy in excess of the band gap are absorbed, hot excitons are generated. A competition exists between transferring the excess energy to phonons and re-distributing the excess energy within the electronic manifold. Energy not lost as heat can potentially generate additional excitons. Experimental evidence points to a fast generation (< 50 fs) of multi-excitons, which is indicative of a mechanism unlike impact ionization of bulk semiconductors.

Bandwidth and Energy Efficient Coordinated MAC Protocol Design and Implementation on SDRs
Bora Karaoglu and Wendi Heinzelman
Electrical and Computer Engineering

Mobile ad hoc networks (MANETs) are becoming increasingly common, and typical network loads considered for MANETs are increasing as applications evolve. This, in turn, increases the importance of bandwidth efficiency while maintaining tight requirements on energy consumption, delay and jitter. Coordinated channel access protocols have been shown to be well suited for highly loaded MANETs under uniform load distributions. However, these protocols are in general not as well suited for non-uniform load distributions as uncoordinated channel access protocols, due to the lack of on-demand dynamic channel allocation mechanisms that exist in infrastructure based coordinated protocols. In this project, we developed a lightweight dynamic channel allocation mechanism and a cooperative load balancing strategy that are applicable to cluster based MANETs to address this problem. We propose protocols that utilize these algorithms to improve performance in terms of throughput, energy consumption and inter-packet delay variation (IPDV). Through extensive simulations we show that both dynamic channel allocation and cooperative load balancing improve the bandwidth efficiency under non-uniform load distributions compared with protocols that do not use these mechanisms as well as compared with the IEEE 802.11 uncoordinated protocol. Although simulations are efficient tools to comparatively evaluate the efficiency of the protocols, they cannot reflect many of the challenges for real implementation of these protocols, such as clock-drift, synchronization, imperfect physical layers, and interference from devices out of the system. In this project we use SORA software defined radios to implement the TRACE protocol and determine the challenges in implementing this protocol in a real world communication system.

Retinal Changes Conformation During the Early Stages of Rhodopsin Activation
Nicholas Leioatts, Blake Mertz, Karina Martinez-Mayorga, Tod D. Romo, Michael C. Pitman, Scott E. Feller, Alan Grossfield, Michael F. Brown
Biochemistry and Biophysics

Rhodopsin, the mammalian dim-light receptor, is one of the best-characterized G protein-coupled receptors, a pharmaceutically important class of membrane proteins that has garnered much attention due to the recent availability of structural information. Yet, the activation mechanism is not fully understood. Here, we combined solid-state NMR with three separate microsecond-scale all-atom molecular dynamics simulations to understand the transition between the dark and metarhodopsin I (Meta I) states. From the simulations, we directly computed NMR spectra for specifically deuterated methyl groups in retinal. The simulation-based results corroborated one of two competing hypotheses for Meta I formation, the complex-counterion mechanism. Further simulation analysis revealed striking differences in ligand flexibility between the two

states; retinal was more dynamic in Meta I, adopting an elongated conformation. Surprisingly, this elongation also corresponded to a dramatic influx of bulk water into the hydrophobic core of the protein. Importantly, this enhanced retinal motion upon light activation may reconcile two recent crystal structures of active rhodopsin, which showed retinal in two distinct conformations.

Investigation of the Mechanism of Antimicrobial Lipopeptides using Coarse-Grained Molecular Dynamics Simulations
Dejun Lin, Joshua N. Horn, Zhen Xia, Pengyu Ren, and Alan Grossfield
Biochemistry and Biophysics

Antimicrobial lipopeptides (AMLPs) are a series of acylated cationic peptides with broad-spectrum antimicrobial activity and low hemolytic activity. We used microsecond-scale coarse-grained molecular dynamics simulations with the MARTINI force field to understand AMLPs' modes of action. Rigorous free energy calculations have been performed to probe the mechanism for their selectivity for different membranes. Although these studies provided useful insights, artifacts arising from the coarse representation of electrostatics in the MARTINI force field complicated further interpretation of the simulations. To address this deficiency, we are developing a new coarse-grained force field for AMLPs and lipids based on the elliptical Gay-Berne van der Waals potential and electric multipoles. This force field will retain much of the computational efficiency of current coarse-grained models, while its detailed representation of electrostatics and molecular shape will provide more realistic descriptions of molecular interactions among AMLPs and membrane lipids.

AstroBEAR 2.0 and its Performance on Blue Gene/Q
Baowei Liu and Jonathan Carroll-Nellenback
Center for Integrated Research Computing

AstroBEAR is an Adaptive Mesh Refinement (AMR), multi-physics parallel code for astrophysics. AMR remains at the cutting edge of computational astrophysics. AMR simulations adaptively change resolution within a computational domain to ensure that the most important features of the dynamics are simulated with highest accuracy. By allowing quiescent regions to evolve with low resolution, AMR simulations achieve order of magnitude increases in computational speed. Current AMR simulations require algorithms that are highly parallelized and manage memory efficiently. Here we present both the AMR and parallelization algorithm used in the AstroBEAR 2.0 code. We also present the strong scaling test and optimization results of AstroBEAR on our flagship, the new Blue Gene/Q at CIRC.

Estimation of Added Fluid Mass to Vibrating Cochlear Partition and its Contribution to the Frequency-Place Relation of Mammalian Cochlea
Yanju Liu, Sheryl M. Gracewski, and Jong-Hoon Nam
Mechanical Engineering

Background: A pure tone vibration delivered to the oval window travels along the cochlear partition (CP) until it culminates at a location specific to the tone. The traveling wave has been explained with mechanical models represented by a series of spring-mass resonators interacting with the fluid of cochlear ducts. The frequency-dependent peak response locations are primarily determined by the stiffness and mass of the CP. The stiffness gradient of most cochlear models is about 11 dB/octave, which is greater than experimentally measured stiffness gradients (4-9 dB/octave: Naidu & Mountain, 1998; Emadi, Richter et al., 2004). We hypothesized that the added fluid mass of the CP due to its fluid interaction increases

toward the apex so that a broader frequency range can be encoded with a limited stiffness gradient.

Methods: A finite element (FE) model of the CP based on anatomical and mechanical measurement data of gerbil cochlea was used to estimate the mass of the CP. Its static responses were validated against experimental results (Nam & Fettiplace, 2010). From the dynamic response of the FE model without fluid-interaction, we obtained the effective mass of the CP that depends on the geometry and the vibrating mode. In order to evaluate the fluid mass, a Neely-Kim type 1D passive cochlear model was adopted with stiffness and mass of spring-mass resonators obtained from the FE analyses. Due to added fluid mass, the peak resonating frequency at a location was lower than the natural frequency of the CP spring-mass resonator. The fluid mass was obtained from the difference between those two frequencies.

Results: The stiffness of the CP ranged from 760 to 0.3 mN/m per 10 µm section for 30 to 0.16 kHz frequency range (9 dB/oct). The mass of the CP per 10 µm section ranged from 30 (base) to 60 ng (apex), while the fluid mass ranged from 5 (base) to 1000 ng (apex). The structural mass was dominant only at the very basal location (< 2 mm from the basal end).

Conclusions: Except at the most basal locations, the effect of added fluid mass is greater than that of the structural mass of the CP, which agrees with a previous study (Lim & Steele, 2000) and recent experimental measurement (Dong & Olson, 2009). The stiffness gradient dominates the frequency gradient near the base while the fluid mass gradient explains the frequency gradient near the apex.

Compression and Rheology of Frictionless U-Shaped Particles in Two Dimensions
Theodore Marschall, Andrew Loheac, Scott Franklin, and Stephen Teitel
Physics and Astronomy

We simulate a system of soft, frictionless, U-shaped particles under both isotropic compression and uniform shear flow in two dimensions. The shape of the particles allows them to interlock, causing a geometry-induced cohesion. We investigate the jamming transition of this system as the packing fraction is increased, in an effort to learn whether such geometric cohesion can produce effects similar to what is found near the jamming transition of frictional disks.

A Computational Bayesian Approach to Gene Regulatory Network Estimation
Matthew N McCall, Helene McMurray, Anthony Almudevar, and Hartmut Land
Biostatistics and Computational Biology
Biomedical Genetics

Advances in genomic technology have led to the discovery of numerous genes whose expression differs between cellular conditions; however, genes do not act in isolation, rather they act together in complex networks that drive cellular function. By considering the interactions between genes (and gene products), one gains a more in-depth understanding of the underlying cellular mechanisms. Furthermore, estimation of gene regulatory networks is necessary to predict cellular response to interventions. Gene perturbation experiments are the primary tool to investigate gene regulatory networks and predict cellular response to interventions. Unfortunately, current network estimation algorithms are unable to adequately reconstruct gene networks from expression data. This is not surprising given that most network estimation algorithms function modularly and disregard uncertainty in previous steps. We

propose a computational Bayesian approach to network estimation that explicitly models and incorporates uncertainty in each step of the analysis. Instead of attempting to infer a single "best" network, we report a posterior density on the network that directly conveys the uncertainty in the inferred network structure.

Comparative Analysis of Next Generation Sequencing Alignment Software using Synthetic RNA Spike-in Controls
Jason R. Myers, Meghann Obrien, Kelly Schooping, Michelle Zanche, Samantha Lomber, John M. Ashton, and Steven R. Gill
Microbiology and Immunology

Next generation sequencing (NGS) technology has come to the forefront of genomic research over the last several years. The vast amount of data generated by NGS has led to an upsurge in development of software capable of analyzing large and complex datasets. Choosing the best analysis tool is a daunting task, often compounded by constant updates meant to improve performance. Using synthetic RNA spike-in controls developed through the ENCODE project, we generated a validated dataset to compare the results attained using different pieces of software in our RNA-Seq analysis workflow. We compared three commonly utilized NGS alignment tools (Shrimp2, BWA, and TopHat2) with respect to their accuracy and sensitivity with and without pre-processing. Shrimp2 outperforms both BWA and TopHat2, yielding the highest ratio of true positive to false positive transcripts with known fold changes. Given that sequence quality could adversely affect data alignment efficiency, we also tested performance of Shrimp2, BWA, and TopHat2 with raw versus pre-processed (low-complexity sequence removal, polyA removal, quality end-trimming) sequence data. With the synthetic RNA spike-in, we used CuffDiff2 to compare the differential expression results of the three alignment software packages. All three programs generally performed far better with the pre-processed data than with raw, where more true positives were identified while reducing the number of false positives. The preliminary data we present here strongly support the need for a validated dataset to directly compare the performance of NGS analysis software tools, thereby providing a standard of comparison. Furthermore, using our synthetic RNA dataset we were able to identify a robust, sensitive, and highly accurate workflow for RNA-Seq using pre-processing, Shrimp2, and CuffDiff2. Moreover, we show that sequence quality directly affects the differential expression results generated by the three sequence alignment tools tested thus far. Going forward, we will perform more in-depth comparisons on these and other tools utilized for RNA-Seq analysis to provide a comprehensive evaluation of NGS based software.

Accelerating Decoupled Look-ahead to Exploit Implicit Parallelism
Raj Parihar and Michael C. Huang
Electrical and Computer Engineering

Despite the proliferation of multi-core and multi-threaded , exploiting implicit parallelism for a single semantic thread is still a crucial component in achieving high performance. Look-ahead is a tried-and-true strategy to exploit implicit parallelism, but can have resource-inefficient implementations such as in a conventional, monolithic out-of-order core. While capable of generating significant performance gains, the look-ahead agent often becomes the new speed limit; thus, we explore a range of software and hardware based techniques to accelerate the look-ahead agent to exploit implicit parallelism. Fortunately, the look-ahead thread has no hard correctness constraints and presents new opportunities for optimizations which are not present in traditional . First, we explore speculative parallelization in the look-ahead thread, which is especially suited for the task of accelerating the look-ahead agent. Second, we observe that not all dependences are equal, and some links in a dependence chain are weak enough that removing them in the look-ahead thread does not materially affect the quality

of look-ahead. A trial-and-error approach and a framework based on a genetic algorithm can reliably identify weak instructions to improve the speed of the look-ahead thread. We show that while the two main drivers for single-thread performance (faster clocks and advancements in microarchitecture) have all but stopped in recent years, we can still uncover significant implicit parallelism using intelligent look-ahead techniques.

Characterizing the Conformational Space of Two Disordered Peptides in Different Solutions
Ana V. Rojas, David Easterhoff, John T. M. DiMaio, Stephen Dewhurst, Alan Grossfield, Hongyu Miao, and Bradley L. Nilsson
Biostatistics and Computational Biology
Microbiology and Immunology
Chemistry

Amyloid fibrils formed by peptides found in semen have been shown to enhance HIV infectivity in vitro. The first of these peptides to be identified was the 248-286 fragment of prostatic acid phosphatase (PAP248-286). PAP248-286 is highly cationic, and its fibrils might facilitate infection by decreasing the electrostatic repulsion between the negatively charged surfaces of the virus and the target cell. Whereas PAP248-286 can easily form fibrils in seminal fluid, it needs rapid agitation in other environments, and certain ions have been shown to be critical for its assembly into fibrils. However, mutation of the positively charged residues to alanine results in a peptide (PAP248-286Ala) that can more easily form fibrillar aggregates. We studied PAP248-286 and PAP248-286Ala fibril formation in water and water + NaCl environments. While PAP248-286Ala can efficiently form fibrils in both water and water + NaCl, PAP248-286 can only do so in a water + NaCl solution. The inability of PAP248-286 to form fibrils in water could be due solely to repulsion between the positively charged peptides, an effect that might be diminished by the presence of salt. However, it is also possible that the explanation lies in PAP248-286's failure to populate conformations that can easily lead to ordered aggregates. To answer this question, we used molecular dynamics simulations to characterize the ensemble of conformations populated by the two peptides in water and water + NaCl environments. The results indicate that PAP248-286Ala favors contacts that stabilize a strand-turn-strand, or beta-arch, motif around P31, the only proline residue in the sequence. The contacts stabilizing the beta-arch would bring positively charged residues into contact in PAP248-286, which, consistent with the experimental results, would be facilitated by the presence of negative ions.

A survey of structure and dynamics in HIV-1 Reverse Transcriptase
James M. Seckler, Hongyu Miao, and Alan Grossfield
Biostatistics and Computational Biology
Biochemistry and Biophysics

HIV-1 reverse transcriptase (RT) is a critical drug target for HIV treatment, and understanding the exact mechanisms of its function and inhibition would significantly accelerate the development of new anti-HIV drugs. RT is a heterodimeric, multifunctional, multidomain protein with a 66 kDa subunit containing all of the catalytic active sites, and a 51 kDa subunit which is thought to provide structural stability to the larger subunit. Structural information on reverse transcriptase alone has proven to be insufficient to explain the mechanism of inhibition by, and drug resistance to, non-nucleoside reverse transcriptase inhibitors. Elastic network modeling provides a technique to rapidly probe and compare protein dynamics. Combining elastic network modeling with hierarchical clustering of both structural and dynamic data reveals a wealth of novel information. Here we present an extensive survey of the dynamics of reverse transcriptase bound to a variety of ligands with a number of mutations, revealing a novel mechanism for drug resistance to non-nucleoside reverse transcriptase inhibitors, where hydrophobic core mutations subtly shift the position of the thumb subdomain, restoring active-state motion to multiple functionally significant

regions of HIV-1 RT. This model arises out of a combination of structural and dynamic information, rather than exclusively from one or the other.

Benchmarking Molecular Mechanics Force Fields using Optical Melting Experiments and Nearest Neighbor Parameters
Aleksandar Spasic, John Serafini, and David H. Mathews
Biochemistry and Biophysics

The ability of the Amber force field ff99 to predict relative free energies of RNA duplex formation was investigated. The test systems were three hexaloop RNA hairpins with an identical loop sequence and varying stem sequences. The potential of mean force of stretching the hairpins from the native state to an extended conformation was calculated using umbrella sampling simulations. The results were compared to the nearest neighbor parameter predictions. Because the hairpins have an identical loop sequence, the differences in free energies arise only from the composition of the stem region. The Amber ff99 force field was able to correctly predict the order of stabilities of the hairpins, although the magnitude of the free energy change is larger than that determined by optical melting experiments. The discrepancy was explained by noting that the unfolded state in the melting experiments is a random coil, while the end state in the umbrella sampling simulations was an elongated chain. The calculations can be compared to reference data by using a thermodynamic cycle. By applying the thermodynamic cycle to the transitions between the hairpins using simulations and nearest neighbor data, agreement was found to be within the error of the simulations, thus showing that the Amber force field ff99 is able to accurately predict relative free energies of RNA duplex formation.

Accelerating Calculations of RNA Secondary Structure Partition Functions
Harry A. Stern and David H. Mathews
Center for Integrated Research Computing
Biochemistry and Biophysics

RNA performs many diverse functions in the cell in addition to its role as a messenger of genetic information. These functions depend on its ability to fold to a unique three-dimensional structure determined by the sequence. The conformation of RNA is in part determined by its secondary structure, or the particular set of contacts between pairs of complementary bases. Prediction of the secondary structure of RNA from its sequence is therefore of great interest, but can be computationally expensive. In this work we accelerate computations of base-pair probabilities using parallel graphics processing units (GPUs). Calculation of the probabilities of RNA secondary structures using nearest-neighbor standard free energy change parameters has been implemented using CUDA to run on hardware with multiprocessor GPUs. A modified set of recursions was introduced, which reduces memory usage by about 25%. GPUs are fastest in single precision, and some hardware is restricted to single precision. This may introduce significant roundoff error. However, deviations in base-pair probabilities calculated using single precision were found to be negligible compared to those resulting from shifting the nearest-neighbor parameters by a random amount of magnitude similar to their experimental uncertainties. For

large sequences running on our particular hardware, the GPU implementation reduces execution time by a factor of close to 60 compared with an optimized serial implementation, and by a factor of 116 compared with the original code. Using GPUs can greatly accelerate computation of RNA secondary structure partition functions, allowing calculation of base-pair probabilities for large sequences in a reasonable amount of time, with a negligible compromise in accuracy due to working in single precision.

Non-Linear Finite Element Modeling of Gigantic Vaulted Structures Built with Unreinforced Concrete
Sarilyn Swayngim and Renato Perucchio
Mechanical Engineering

Modeling the fracture initiation, propagation, and failure of concrete, a quasi-brittle material, using large-scale non-linear finite element (FE) modeling provides critical insight into the structural response of monumental constructions under gravitational and seismic loads. As such, this type of numerical modeling provides a fundamental tool for understanding and preserving some of the most extraordinary monuments of the World Cultural Heritage. This study focuses on the Frigidarium of the Baths of Diocletian (298-305 AD), a gigantic cross-vaulted hall built with unreinforced pozzolanic concrete (opus caementicium), still standing today in excellent structural condition. Typical of Roman design of large to gigantic vaulted structures, the hall is characterized by large shear walls that counteract the horizontal thrust at the springing of the vault. Through systematic geometric solid modeling and detailed non-linear FE meshes, we investigate how the various structural elements respond to gravitational and horizontal accelerations simulating seismic conditions. In particular, we evaluate how the present shear wall configuration, which plays a crucial role in stabilizing the entire structure, evolved from previous Roman designs of similar buildings. In the context of this work, we are also exploring the applicability of energy-based criteria to determine the onset of global collapse conditions, which arise when fractures have compromised the structural integrity of the vault.

Stress Distribution of Jammed Particle Clusters and Maximum Entropy Principle
Yegang Wu and Steven Teitel
Physics and Astronomy

We prepared a large number of configurations with the same total stress, and used the entropy maximization postulate to derive the stress distributions on the clusters of particles. First, we show that the distribution is not a simple Boltzmann distribution: a quadratic term related to the tiling area should be included. Second, the tiling area is solely determined by the stress for the total system, and is correlated quadratically with the stress on clusters. This supports the view that tiling area is a second conserved quantity. Third, the joint distributions of stress and tiling area are examined and agree well with the maximum entropy postulate.
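As a rough illustration of the jammed-cluster abstract on this page, the sketch below compares a plain Boltzmann weight with one that carries an additional quadratic term in the cluster stress. The functional form exp(-alpha*s - lam*s**2), the names `maxent_weights`, `alpha`, and `lam`, and the numerical values are assumptions made for illustration only; the abstract states only that a quadratic term related to the tiling area is needed, and does not give the authors' expression or parameters.

```python
import math

def maxent_weights(stresses, alpha, lam=0.0):
    """Normalized weights p_i proportional to exp(-alpha*s_i - lam*s_i**2).

    alpha and lam play the role of Lagrange multipliers (hypothetical values);
    lam = 0 recovers a plain Boltzmann-like distribution.
    """
    log_w = [-alpha * s - lam * s * s for s in stresses]
    shift = max(log_w)                        # subtract the max to avoid overflow
    w = [math.exp(x - shift) for x in log_w]
    total = sum(w)
    return [x / total for x in w]

def mean(values, weights):
    """Expectation of `values` under the discrete distribution `weights`."""
    return sum(v * p for v, p in zip(values, weights))

# Illustrative grid of cluster stresses (arbitrary units).
stress_grid = [0.05 * i for i in range(201)]

boltzmann = maxent_weights(stress_grid, alpha=1.0)
with_quadratic = maxent_weights(stress_grid, alpha=1.0, lam=0.2)

print("mean stress, Boltzmann form:     ", mean(stress_grid, boltzmann))
print("mean stress, with quadratic term:", mean(stress_grid, with_quadratic))
```

With a positive `lam`, large cluster stresses are suppressed more strongly than in the Boltzmann form, so the mean stress on the grid is lower; fitting `alpha` and `lam` to measured cluster data would be a separate step not shown here.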
