Defining a Personal, Allele-Specific, and Single-Molecule Long-Read Transcriptome

Total Page:16

File Type:pdf, Size:1020Kb

Defining a Personal, Allele-Specific, and Single-Molecule Long-Read Transcriptome Defining a personal, allele-specific, and single-molecule long-read transcriptome Hagen Tilgnera,1, Fabian Gruberta,1, Donald Sharona,b,1, and Michael P. Snydera,2 aDepartment of Genetics, Stanford University, Stanford, CA 94305-5120; and bDepartment of Molecular, Cellular, and Developmental Biology, Yale University, New Haven, CT 06511 Edited by Sherman M. Weissman, Yale University School of Medicine, New Haven, CT, and approved June 3, 2014 (received for review January 8, 2014) Personal transcriptomes in which all of an individual’s genetic var- for both parents of GM12878 (GM12891 and GM12892), we iants (e.g., single nucleotide variants) and transcript isoforms (tran- show that despite the higher error rate of the PacBio platform, scription start sites, splice sites, and polyA sites) are defined and single molecules can be attributed to the alleles from which they quantified for full-length transcripts are expected to be important were transcribed, thereby generating accurate personal tran- for understanding individual biology and disease, but have not scriptomes. This technique allows the assessment of biased allelic been described previously. To obtain such transcriptomes, we se- expression and isoform expression. quenced the lymphoblastoid transcriptomes of three family mem- bers (GM12878 and the parents GM12891 and GM12892) by using Results a Pacific Biosciences long-read approach complemented with Illu- Increased Full-Length Representation of RNA Molecules by Circular mina 101-bp sequencing and made the following observations. Consensus Reads. We sequenced ∼711,000 circular consensus First, we found that reads representing all splice sites of a transcript reads (CCS) molecules from unamplified, polyA-selected RNA ≤ are evident for most sufficiently expressed genes 3 kb and often from the GM12878 cell line (see Fig. S1 for mapping statistics). for genes longer than that. Second, we added and quantified pre- We have recently shown that CCS often describe all splice sites viously unidentified splicing isoforms to an existing annotation, of typical RNA molecules, although the success rate declines as thus creating the first personalized annotation to our knowledge. RNA length increases (11). The CCS we sequenced here were Third, we determined SNVs in a de novo manner and connected GENETICS significantly longer (average 1,188 bp, maximum 6 kb) than those them to RNA haplotypes, including HLA haplotypes, thereby in the HOP sample (average 999.9 bp; Fig. 1A). Both datasets assigning single full-length RNA molecules to their transcribed al- lele, and demonstrated Mendelian inheritance of RNA molecules. showed equal representation of RNA molecules between 0.8 and Fourth, we show how RNA molecules can be linked to personal 1.3 kb, but beginning at 1.3 kb and even more pronounced after variants on a one-by-one basis, which allows us to assess differential 1.7 kb, the GM12878 sample represented longer RNA molecules – allelic expression (DAE) and differential allelic isoforms (DAI) from the more faithfully; RNA molecules of 2.7 4 kb were present in the phased full-length isoform reads. The DAI method is largely indepen- GM12878 sample, but are essentially absent in the HOP sample B ′ dent of the distance between exon and SNV—in contrast to fragmen- (Fig. 1 ). The distance from the 5 -mapping end to the nearest tation-based methods. Overall, in addition to improving eukaryotic annotated transcription start site (TSS) dropped significantly −16 transcriptome annotation, these results describe, to our knowledge, (one-sided Wilcoxon rank sum test; P < 2.2e ) in GM12878 the first large-scale and full-length personal transcriptome. Significance personalized medicine | isoform sequencing | platform comparison | alternative splicing | allele-specific expression RNA molecules of higher eukaryotes can be thousands of nucleotides long and are expressed from two distinct alleles, hort-read RNA sequencing (1–7) is a widely used tool in which can differ by single nucleotide variations (SNVs) in the Smodern day biology. In mammalian transcriptomes, multi- mature RNA molecule. The de facto standard in RNA biology is intron genes are common and the detection and quantification of short (≤101 bp) read sequencing, which, although very useful, different transcript isoforms is of high importance. Recent work does not cover the entire molecule in a read. We show that has shown that reconstruction and quantification of transcript using amplification-free long-read sequencing one can often (i) isoforms from short-read sequencing is insufficiently accurate (8, cover the entire molecule, (ii) determine the allele it originated 9). Simultaneously, a number of research groups have pursued from, and (iii) record its entire exon-intron structure within long-read sequencing (8, 10–12), and such datasets generally a single read, thus producing a full-length, allele-specific view excel at connecting different exons, up to entire transcripts, with of an individual’s transcriptome. By enhancing existing gene a compromise of lower sequencing depth. However, these studies annotations using long reads and quantifying this enhanced have not investigated allelic variants. Such information is crucial annotation using >100 million 101-bp paired-end reads, we for understanding personal transcriptomes and their potential overcome the smaller number of long reads. biological consequences. “ ” Author contributions: H.T., F.G., D.S., and M.P.S. designed research; H.T., F.G., and D.S. Here, we use the Pacific Biosciences (13) platform ( PacBio ) performed research; H.T. analyzed data; and H.T. and M.P.S. wrote the paper. to produce a single-molecule RNA-seq dataset in the GM12878 Conflict of interest statement: M.P.S. is on the scientific advisory board of Personalis cell line that is more comprehensive in both length and depth than and GenapSys. a recently described dataset from a human organ panel (HOP) This article is a PNAS Direct Submission. (11). We additionally sequenced cDNA from the same cell line Freely available online through the PNAS open access option. by using 101-bp paired-end (PE) sequencing on the Illumina Data deposition: The sequences reported in this paper have been deposited in the NCBI platform, to show the properties of genes that can be sequenced Sequence Read Archive (accession no. SRP036136). Further data are available at http:// by using long-read, single-molecule transcriptome sequencing. stanford.edu/~htilgner/2014_PNAS_paper/utahTrio.index.html. We use previously unidentified isoforms revealed by long-read 1H.T., F.G., and D.S. contributed equally to this work. sequencing to produce an enhanced and personalized genome 2To whom correspondence should be addressed. E-mail: [email protected]. annotation, which we quantify by using 101-bp PE Illumina This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10. reads. Finally, by producing single-molecule transcriptomes 1073/pnas.1400447111/-/DCSupplemental. www.pnas.org/cgi/doi/10.1073/pnas.1400447111 PNAS | July 8, 2014 | vol. 111 | no. 27 | 9869–9874 Downloaded by guest on October 1, 2021 A Read length D than those that had a CSMM (Fig. 2B), showing that deeper se- 3 ,1 21 0,0 00 13 221, ,000 13 1, 24,000 Gm12878 CSMM for BCKDK quencing of shorter reads detects more lowly expressed RNAs. M5 M2 M13 Because CCS generation requires read length to exceed cDNA Bp M11 1k 2k M6 M4 length by at least a factor of two, we hypothesized that CSMMs M8 M1 M10 M12 would not represent longer genes. Surprisingly, when calculating HOP ’78 M14 M9 M16 M3 the number of base pairs of the longest mature and annotated Long molecules: M15 B M7 GM12878 vs Hop Human organ panel CSMM for BCKDK transcript for each gene, we found this hypothesis to be wrong. M4 M9 Genes with and without a CSMM behaved largely similarly in M3 M5 ratio M10 − M1 terms of length. However, genes with a CSMM only very rarely M8 M2 C Log M7 represented genes smaller than 1 kb (Fig. 2 ), which is likely due 604 M13 − M6 M15 02k4k M12 to the use of magnetic beads in the loading procedure, which M18 Read length M19 M17 M14 disfavor short fragments. To derive an approximate predictive Distance to M16 C annotated TSS M11 statement from the above observations, we calculated the frac- Gencode version 15 annotation for BCKDK tion of genes that had at least one CSMM, for bins of gene Bp lengths and 101-bp PE Cufflinks-derived expression. At FPKMs >10 and gene lengths ≥ 1kb, 98% of genes receive a PacBio- 01k2k HOP ’78 CSMM when sequencing ∼711,000 CCS (Fig. 2D), whereas with an FPKM of >1, this percentage drops to 89%. When requiring Fig. 1. Increased length of CCS for the GM12878 sample. (A) Length distribu- at least 10 CSMMs (at FPKMs > 10), which may be useful for tion of CCS reads in the human organ panel (Hop; blue) (11) and CCS sequenced quantitative analyses, this fraction drops further to 68% (Fig. here for the GM12878 cell line (red). (B) Relative representation of molecules in E length bins in the two samples. y axis is calculated as log [(number of GM12878- 2 ). For genome-annotation purposes, CSMMs representing all CCS in bin + 1)/(number of Hop-CCS in bin +1)]. The red horizontal line gives the introns of an RNA molecule are useful. Sixty-three percent of expected ratio, which is above 0, because of the increased sequencing depth in CSMMs appear complete in that their first splice site is an an- GM12878. (C) Distribution of distances for CSMMs between the 5′ end of the notated first splice site and that their last splice site is an an- mapping and the closest annotated TSS of the same gene for both the Hop notated last splice site. By relaxing this criterion (SI Methods), (blue) and the GM12878 (red) sample. (D) All CSMMs mapped to the BCKDK ultimately 71% of CSMMs were classified as “full length.” For gene in the GM12878 cell line (red) and in the Hop sample (blue) as well as all Gencode15-annoated transcripts for this gene (black).
Recommended publications
  • The Genomic Organization and Expression Pattern of the Low-Affinity Fc Gamma Receptors (Fcγr) in the Göttingen Minipig
    Immunogenetics (2019) 71:123–136 https://doi.org/10.1007/s00251-018-01099-1 ORIGINAL ARTICLE The genomic organization and expression pattern of the low-affinity Fc gamma receptors (FcγR) in the Göttingen minipig Jerome Egli1 & Roland Schmucki1 & Benjamin Loos1 & Stephan Reichl1 & Nils Grabole1 & Andreas Roller1 & Martin Ebeling1 & Alex Odermatt2 & Antonio Iglesias1 Received: 9 August 2018 /Accepted: 24 November 2018 /Published online: 18 December 2018 # The Author(s) 2018 Abstract Safety and efficacy of therapeutic antibodies are often dependent on their interaction with Fc receptors for IgG (FcγRs). The Göttingen minipig represents a valuable species for biomedical research but its use in preclinical studies with therapeutic antibodies is hampered by the lack of knowledge about the porcine FcγRs. Genome analysis and sequencing now enabled the localization of the previously described FcγRIIIa in the orthologous location to human FCGR3A. In addition, we identified nearby the gene coding for the hitherto undescribed putative porcine FcγRIIa. The 1′241 bp long FCGR2A cDNA translates to a 274aa transmembrane protein containing an extracellular region with high similarity to human and cattle FcγRIIa. Like in cattle, the intracellular part does not contain an immunoreceptor tyrosine-based activation motif (ITAM) as in human FcγRIIa. Flow cytometry of the whole blood and single-cell RNA sequencing of peripheral blood mononuclear cells (PBMCs) of Göttingen minipigs revealed the expression profile of all porcine FcγRs which is compared to human and mouse. The new FcγRIIa is mainly expressed on platelets making the minipig a good model to study IgG-mediated platelet activation and aggregation. In contrast to humans, minipig blood monocytes were found to express inhibitory FcγRIIb that could lead to the underestimation of FcγR-mediated effects of monocytes observed in minipig studies with therapeutic antibodies.
    [Show full text]
  • Genetic and Genomic Analysis of Hyperlipidemia, Obesity and Diabetes Using (C57BL/6J × TALLYHO/Jngj) F2 Mice
    University of Tennessee, Knoxville TRACE: Tennessee Research and Creative Exchange Nutrition Publications and Other Works Nutrition 12-19-2010 Genetic and genomic analysis of hyperlipidemia, obesity and diabetes using (C57BL/6J × TALLYHO/JngJ) F2 mice Taryn P. Stewart Marshall University Hyoung Y. Kim University of Tennessee - Knoxville, [email protected] Arnold M. Saxton University of Tennessee - Knoxville, [email protected] Jung H. Kim Marshall University Follow this and additional works at: https://trace.tennessee.edu/utk_nutrpubs Part of the Animal Sciences Commons, and the Nutrition Commons Recommended Citation BMC Genomics 2010, 11:713 doi:10.1186/1471-2164-11-713 This Article is brought to you for free and open access by the Nutrition at TRACE: Tennessee Research and Creative Exchange. It has been accepted for inclusion in Nutrition Publications and Other Works by an authorized administrator of TRACE: Tennessee Research and Creative Exchange. For more information, please contact [email protected]. Stewart et al. BMC Genomics 2010, 11:713 http://www.biomedcentral.com/1471-2164/11/713 RESEARCH ARTICLE Open Access Genetic and genomic analysis of hyperlipidemia, obesity and diabetes using (C57BL/6J × TALLYHO/JngJ) F2 mice Taryn P Stewart1, Hyoung Yon Kim2, Arnold M Saxton3, Jung Han Kim1* Abstract Background: Type 2 diabetes (T2D) is the most common form of diabetes in humans and is closely associated with dyslipidemia and obesity that magnifies the mortality and morbidity related to T2D. The genetic contribution to human T2D and related metabolic disorders is evident, and mostly follows polygenic inheritance. The TALLYHO/ JngJ (TH) mice are a polygenic model for T2D characterized by obesity, hyperinsulinemia, impaired glucose uptake and tolerance, hyperlipidemia, and hyperglycemia.
    [Show full text]
  • Mouse Fcrla Conditional Knockout Project (CRISPR/Cas9)
    https://www.alphaknockout.com Mouse Fcrla Conditional Knockout Project (CRISPR/Cas9) Objective: To create a Fcrla conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering. Strategy summary: The Fcrla gene (NCBI Reference Sequence: NM_001160215 ; Ensembl: ENSMUSG00000038421 ) is located on Mouse chromosome 1. 5 exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 5 (Transcript: ENSMUST00000046322). Exon 2~4 will be selected as conditional knockout region (cKO region). Deletion of this region should result in the loss of function of the Mouse Fcrla gene. To engineer the targeting vector, homologous arms and cKO region will be generated by PCR using BAC clone RP23-100H5 as template. Cas9, gRNA and targeting vector will be co-injected into fertilized eggs for cKO Mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a targeted allele exhibit largely normal T-dependent and T-independent antibody responses with an increase in IgG1 after secondary challenge with sheep red blood cells. Exon 2~4 is not frameshift exon, and covers 69.69% of the coding region. The size of intron 1 for 5'-loxP site insertion: 5054 bp, and the size of intron 4 for 3'-loxP site insertion: 2439 bp. The size of effective cKO region: ~2074 bp. The cKO region does not have any other known gene. Page 1 of 8 https://www.alphaknockout.com Overview of the Targeting Strategy Wildtype allele 5' gRNA region gRNA region 3' 1 2 3 4 5 Targeting vector Targeted allele Constitutive KO allele (After Cre recombination) Legends Exon of mouse Fcrla Homology arm cKO region loxP site Page 2 of 8 https://www.alphaknockout.com Overview of the Dot Plot Window size: 10 bp Forward Reverse Complement Sequence 12 Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats.
    [Show full text]
  • Genomic Organization of Mouse Fcy Receptor Genes
    Proc. Nail. Acad. Sci. USA Vol. 87, pp. 2856-2860, April 1990 Immunology Genomic organization of mouse Fcy receptor genes (Fc receptors/exon-ntron junctions/protein domains/immunoglobulin gene superfamily) ANTHONY KULCZYCKI, JR.*t, JENNIFER WEBBER*, HAUNANI A. SOARES*, MICHAEL D. ONKEN*, JAMES A. THOMPSON*, DAVID D. CHAPLIN*t, DENNIS Y. LOH*t, AND JEFFREY P. TILLINGHAST* *Department of Medicine and tHoward Hughes Medical Institute, Washington University School of Medicine, Saint Louis, MO 63110 Communicated by David M. Kipnis, January 16, 1990 ABSTRACT We have isolated and characterized the gene responsible for intrathymic deletion of a large fraction of coding for the mouse Fc receptor that is termed Fc,,RHa. The potentially self-reactive T-cell clones (11-13). gene contains five exons and spans approximately 9 kilobases. An understanding of the relationship of the various Fc Unlike most members of the immunoglobulin gene superfam- receptor genes to each other, the factors influencing their ily, this gene utilizes multiple exons to encode its leader peptide. evolution, and molecular mechanisms regulating their The first exon encodes the hydrophobic region of the signal expression requires a detailed analysis of the structures of sequence; the second exon, which contains only 21 base pairs, their genes. Here we report the characterization of two encodes a segment of the signal peptidase recognition site; and members of this gene family. the beginning of the third exon encodes the predicted site of peptidase cleavage. The third and fourth exons each code for immunoglobulin-like extracellular domains. The fifth exon MATERIALS AND METHODS encodes the hydrophobic transmembrane domain and the Screening of Mouse cDNA and Genomic Libraries.
    [Show full text]
  • Fc Receptor-Like Proteins in Pathophysiology of B-Cell Disorder
    C al & ellu ic la n r li Im C m f u Journal of o n l o a l n o r Capone et al., J Clin Cell Immunol 2016, 7:3 g u y o J DOI: 10.4172/2155-9899.1000427 ISSN: 2155-9899 Clinical & Cellular Immunology Review Article Open Access Fc Receptor-Like Proteins in Pathophysiology of B-cell Disorder Mollie Capone, John Matthew Bryant, Natalie Sutkowski and Azizul Haque* Department of Microbiology and Immunology, and Hollings Cancer Center, Medical University of South Carolina, 173 Ashley Avenue, BSB-201, Charleston, SC 29425, USA *Corresponding author: Azizul Haque, Department of Microbiology and Immunology, Medical University of South Carolina, 173 Ashley Avenue, BSB-201, Charleston, SC 29425, USA, Tel: 843-792-9466; Fax: 843-792-2464; E-mail: [email protected] Received date: November 17, 2015; Accepted date: June 08, 2016; Published date: June 17, 2016 Copyright: © 2016 Capone M, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Abstract Members of the family of Fc receptor-like (FcRL) proteins, homologous to FcγRI, have been identified by multiple research groups. Consequently, they have been described using multiple nomenclatures including Fc receptor homologs (FcRH), immunoglobulin superfamily receptor translocation-associated genes (IRTA), immunoglobulin-Fc- gp42-related genes (IFGP), Src homology 2 domain-containing phosphatase anchor proteins (SPAP), and B cell cross-linked by anti-immunoglobulin M-activating sequences (BXMAS). They are now referred to under a unified nomenclature as FCRL.
    [Show full text]
  • Functional Genomics Atlas of Synovial Fibroblasts Defining Rheumatoid
    Ge et al. Genome Biology (2021) 22:247 https://doi.org/10.1186/s13059-021-02460-6 RESEARCH Open Access Functional genomics atlas of synovial fibroblasts defining rheumatoid arthritis heritability Xiangyu Ge1†, Mojca Frank-Bertoncelj2†, Kerstin Klein2, Amanda McGovern1, Tadeja Kuret2,3, Miranda Houtman2, Blaž Burja2,3, Raphael Micheroli2, Chenfu Shi1, Miriam Marks4, Andrew Filer5,6, Christopher D. Buckley5,6,7, Gisela Orozco1,8, Oliver Distler2, Andrew P. Morris1, Paul Martin1,8,9, Stephen Eyre1,8† and Caroline Ospelt2*† * Correspondence: caroline.ospelt@ usz.ch Abstract †Xiangyu Ge and Mojca Frank- Bertoncelj are co-first authors. Background: Genome-wide association studies have reported more than 100 risk Stephen Eyre and Caroline Ospelt loci for rheumatoid arthritis (RA). These loci are shown to be enriched in immune are co-last authors. cell-specific enhancers, but the analysis so far has excluded stromal cells, such as 2Department of Rheumatology, Center of Experimental synovial fibroblasts (FLS), despite their crucial involvement in the pathogenesis of RA. Rheumatology, University Hospital Here we integrate DNA architecture, 3D chromatin interactions, DNA accessibility, Zurich, University of Zurich, Zurich, and gene expression in FLS, B cells, and T cells with genetic fine mapping of RA loci. Switzerland Full list of author information is Results: We identify putative causal variants, enhancers, genes, and cell types for 30– available at the end of the article 60% of RA loci and demonstrate that FLS account for up to 24% of RA heritability. TNF stimulation of FLS alters the organization of topologically associating domains, chromatin state, and the expression of putative causal genes such as TNFAIP3 and IFNAR1.
    [Show full text]
  • Accurate Mutation Annotation and Functional Prediction Enhance the Applicability of -Omics Data in Precision Medicine
    The Texas Medical Center Library DigitalCommons@TMC The University of Texas MD Anderson Cancer Center UTHealth Graduate School of The University of Texas MD Anderson Cancer Biomedical Sciences Dissertations and Theses Center UTHealth Graduate School of (Open Access) Biomedical Sciences 5-2016 Accurate mutation annotation and functional prediction enhance the applicability of -omics data in precision medicine Tenghui Chen Follow this and additional works at: https://digitalcommons.library.tmc.edu/utgsbs_dissertations Part of the Bioinformatics Commons, Computational Biology Commons, Genomics Commons, and the Medicine and Health Sciences Commons Recommended Citation Chen, Tenghui, "Accurate mutation annotation and functional prediction enhance the applicability of -omics data in precision medicine" (2016). The University of Texas MD Anderson Cancer Center UTHealth Graduate School of Biomedical Sciences Dissertations and Theses (Open Access). 666. https://digitalcommons.library.tmc.edu/utgsbs_dissertations/666 This Dissertation (PhD) is brought to you for free and open access by the The University of Texas MD Anderson Cancer Center UTHealth Graduate School of Biomedical Sciences at DigitalCommons@TMC. It has been accepted for inclusion in The University of Texas MD Anderson Cancer Center UTHealth Graduate School of Biomedical Sciences Dissertations and Theses (Open Access) by an authorized administrator of DigitalCommons@TMC. For more information, please contact [email protected]. ACCURATE MUTATION ANNOTATION AND FUNCTIONAL PREDICTION
    [Show full text]
  • Fc Receptors for Immunoglobulins and Their Appearance During Vertebrate Evolution
    Fc Receptors for Immunoglobulins and Their Appearance during Vertebrate Evolution Srinivas Akula, Sayran Mohammadamin¤, Lars Hellman* Department of Cell and Molecular Biology, Uppsala University, The Biomedical Center, Uppsala, Sweden Abstract Receptors interacting with the constant domain of immunoglobulins (Igs) have a number of important functions in vertebrates. They facilitate phagocytosis by opsonization, are key components in antibody-dependent cellular cytotoxicity as well as activating cells to release granules. In mammals, four major types of classical Fc receptors (FcRs) for IgG have been identified, one high-affinity receptor for IgE, one for both IgM and IgA, one for IgM and one for IgA. All of these receptors are related in structure and all of them, except the IgA receptor, are found in primates on chromosome 1, indicating that they originate from a common ancestor by successive gene duplications. The number of Ig isotypes has increased gradually during vertebrate evolution and this increase has likely been accompanied by a similar increase in isotype-specific receptors. To test this hypothesis we have performed a detailed bioinformatics analysis of a panel of vertebrate genomes. The first components to appear are the poly-Ig receptors (PIGRs), receptors similar to the classic FcRs in mammals, so called FcRL receptors, and the FcR c chain. These molecules are not found in cartilagous fish and may first appear within bony fishes, indicating a major step in Fc receptor evolution at the appearance of bony fish. In contrast, the receptor for IgA is only found in placental mammals, indicating a relatively late appearance. The IgM and IgA/M receptors are first observed in the monotremes, exemplified by the platypus, indicating an appearance during early mammalian evolution.
    [Show full text]
  • Cataloguing Over-Expressed Genes in Epstein Barr Virus Immortalized Lymphoblastoid Cell Lines Through Consensus Analysis of Pacb
    bioRxiv preprint doi: https://doi.org/10.1101/125823; this version posted April 10, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. 1 Cataloguing over-expressed genes in Epstein Barr Virus immortalized lymphoblastoid cell lines through consensus analysis of PacBio transcriptomes corroborates hypomethylation of chromosome 1 Sandeep Chakraborty, R - 44/ 1, Celia Engineers, T. T. C Industrial Area, Rabale, Navi Mumbai, 400701, India. Abstract The ability of Epstein Barr Virus (EBV) to transform resting cell B-cells into immortalized lym- phoblastoid cell lines (LCL) provides a continuous source of peripheral blood lymphocytes that are used to model conditions in which these lymphocytes play a key role. Here, the PacBio generated transcrip- tome of three LCLs from a parent-daughter trio (SRAid:SRP036136) provided by a previous study [1] were analyzed using a kmer-based version of YeATS (KEATS). The set of over-expressed genes in these cell lines were determined based on a comparison with the PacBio transcriptome of twenty tissues pro- vided by another study (hOPTRS) [2]. MIR155 long non-coding RNA (MIR155HG), Fc fragment of IgE receptor II (FCER2), T-cell leukemia/lymphoma 1A (TCL1A), and germinal center associated signaling and motility (GCSAM) were genes having the highest expression counts in the three LCLs with no ex- pression in hOPTRS. Other over-expressed genes, having low expression in hOPTRS, were membrane spanning 4-domains A1 (MS4A1) and ribosomal protein S2 pseudogene 55 (RPS2P55). While some of these genes are known to be over-expressed in LCLs, this study provides a comprehensive cataloguing of such genes.
    [Show full text]
  • Study of Fc Receptor-Like Protein 3 and Its Binding
    Vol 5, Issue 5, Sep-Oct 2018 ISSN 2349-7041 Research Article STUDY OF FC RECEPTOR-LIKE PROTEIN 3 AND ITS BINDING EFFICIENCY WITH HERBAL AND ALLOPATHIC ANTIBIOTICS SASIKALA P1, PUGAZHENDY K*, TAMIZHAZHAGAN V1, JAYANTHI C, NEETA RAGHUVEER2 1Department of Zoology, Annamalai University, Tamilnadu, India. 2Department of Zoology, Bangalore University, Bangalore, India. Email: [email protected] ABSTRACT Objectives: This present study aimed to determine the effect of the drug (herbal and allopathic) on the FCRL3. FCRL3 protein sequence was collected from Uni-prot, protein sequence database and submitted to the SIB- BLAST. Methods: The multiple sequence alignment was performed by comparing FCRL3. The protein sequence of Human was compared with domestic duck, Wild Yak, Black flying fox, American mink by using CLUSTAL W2. The secondary structure was determined by using CFSSP. The composition of amino acids was determined by using Protparam. Results: The tertiary structure of FcRL 3 was predicted by using the CPH modeling server and visualized through RasMol. Ramachandran plot was performed for FCRL3 using Rampage software the active sites were predicted by CASTp. Conclusion: The docking of FCRL3 with selected herbal Acetaminophen and Allopathic methimazole was performed by Hex 6.3 selected target proteins. Keywords: Graves’s disease, Herbal drug, Allopathic drugs, Acetaminophen, Methimazole. INTRODUCTION Bioinformatics is the field of science in which biology, computer changes including psychosis, mania, anxiety, agitation and science and information technology merges to form a single depression [6]. Since Graves disease is an autoimmune disease discipline. Bioinformatics has not only become essential for basic which appears suddenly, often quite late in life, genomic and molecular biology research, but it's having a major a viral or bacterial infection may trigger antibodies which cross- impact on many areas of biotechnology and biomedical sciences.
    [Show full text]
  • WO 2017/075478 A2 4 May 20 17 (04.05.2017) W P O P C T
    (12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) (19) World Intellectual Property Organization International Bureau (10) International Publication Number (43) International Publication Date WO 2017/075478 A2 4 May 20 17 (04.05.2017) W P O P C T (51) International Patent Classification: [US/US]; 77 Massachusetts Ave., Cambridge, MA 02142 C12Q 1/68 (2006.01) (US). THE BRIGHAM AND WOMEN'S HOSPITAL, INC. [US/US]; 75 Francis Street, Boston, MA 021 15 (US). (21) International Application Number: PCT/US20 16/059507 (72) Inventors; and (71) Applicants : REGEV, Aviv [IL/US]; 15A Ellsworth Ave, (22) International Filing Date: Cambridge, MA 02139 (US). ANDERSON, Ana, Carri- 28 October 2016 (28.10.201 6) zosa [US/US]; 110 Cypress Street, Unit 3 11, Brookline, (25) Filing Language: English MA 02445 (US). CONG, Le [CN/US]; 100 Memorial Drive, Apt. 8-21B, Cambridge, MA 02142 (US). KU- (26) Publication Language: English CHROO, Vijay, K. [IN/US]; 30 Fairhaven Road, Newton, (30) Priority Data: MA 02149 (US). SINGER, Meromit [US/US]; 10 Bel 62/247,430 28 October 2015 (28. 10.2015) US mont Place, Somerville, MA 02143 (US). WANG, Chao 62/384,597 7 September 2016 (07.09.2016) US [US/US]; 6 Canal Park, Unit 302, Cambridge, MA 02 141 (US). (71) Applicants: THE BROAD INSTITUTE INC. [US/US]; 415 Main Street, Cambridge, MA 02142 (US). MAS¬ (74) Agents: KOWALSKI, Thomas, J. et al; Vedder Price SACHUSETTS INSTITUTE OF TECHNOLOGY P.C., 1633 Broadway, New York, NY 1001 9 (US). [Continued on nextpage] (54) Title: COMPOSITIONS AND METHODS FOR EVALUATING AND MODULATING IMMUNE RESPONSES BY USE OF IMMUNE CELL GENE SIGNATURES (57) Abstract: The present invention provides markers, marker signatures and molecular targets that correlate with dysfunction of immune cells and are ad vantageously independent of the immune cell activation status.
    [Show full text]
  • In Silico Analysis and Characterization of Differentially Expressed Genes
    bioRxiv preprint doi: https://doi.org/10.1101/2021.04.05.438487; this version posted May 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. In Silico Analysis and Characterization of Differentially Expressed Genes to Distinguish Glioma Stem Cells from Normal Neural Stem Cells Urja Parekh*1,2, Mohit Mazumder2, Harpreet Kaur2, Elia Brodsky2 1Deparment of Biological Science, Sunandan Divatia School of Science, Narsee Monjee Institute of Management Studies, India 2 Pine Biotech, USA Emails of Authors: Urja Parekh: [email protected] Mohit Mazumder: [email protected] Harpreet Kaur: [email protected]; [email protected] Elia Brodsky: [email protected] *Author for correspondence: [email protected] Abstract Glioblastoma multiforme (GBM) is a heterogeneous, invasive primary brain tumor that develops chemoresistance post therapy. Theories regarding the aetiology of GBM focus on transformation of normal neural stem cells (NSCs) to a cancerous phenotype or tumorigenesis driven via glioma stem cells (GSCs). Comparative RNA-Seq analysis of GSCs and NSCs can provide a better understanding of the origin of GBM. Thus, in the current study, we performed various bioinformatics analyses on transcriptional profiles of a total 40 RNA-seq samples including 20 NSC and 20 GSC, that were obtained from the NCBI-SRA (SRP200400). First, differential gene expression (DGE) analysis using DESeq2 revealed 348 significantly differentially expressed genes between GSCs and NSCs (padj. value <0.05, log2fold change ≥ 3.0 (for GSCs) and ≤ -3.0 (for NSCs)) with 192 upregulated and 156 downregulated genes in GSCs in comparison to NSCs.
    [Show full text]