Comparative transcriptome profiling of human and pig intestinal epithelial cells after

Deltacoronavirus infection

Thesis

Presented in Partial Fulfillment of the Requirements for the Degree Master of Science in the Graduate School of The Ohio State University

By

Diana Cruz Pulido, MSc

Graduate Program in Comparative and Veterinary Medicine.

The Ohio State University

2020

Thesis Committee:

Scott P. Kenney, PhD, Advisor

Vanessa Hale, DVM, PhD

Anastasia Vlasova, DVM, PhD

Copyrighted by

Diana Cruz Pulido

2020

Abstract

Porcine (PDCoV) is an emerging global infectious disease of the swine industry with unknown potential to cause human disease. Phylogenetic analysis suggests that PDCoV originated relatively recently from a host-switching event between and . Little is known about how PDCoV interacts with its differing hosts. Although human derived cell lines are susceptible to PDCoV infection, there is no direct evidence that PDCoV causes morbidity in humans. Currently, there are no treatments or vaccines available for PDCoV partially due to the mechanisms of infection remaining unknown. In this study, we explored the expression profiles of the primary host organism (swine) to a potential novel host (humans) after infection with PDCoV. For this we utilized cell lines derived from intestinal lineages to reproduce the primary sites of viral infection in the host. Porcine intestinal epithelial cells (IPEC-J2) and human intestinal epithelial cells

(HIEC) were infected with PDCoV. At 24 h post infection, total cellular RNA was harvested and analyzed using RNA-sequencing (RNA-seq) technology. We found that there are more differentially expressed (DEGs) in humans, 7,486, in comparison to pigs, 1134. On the transcriptional level, humans as the naïve host appear to be showing a more aggressive response to PDCoV infection in comparison to the primary host, pigs.

Our research shows key immune associated DEGs are shared between humans and pigs that were affected after PDCoV infection. These included genes related to the NF-kappa-

β family, the interferon (IFN) family, the kinase family, and signaling pathways such as the signaling pathway, JAK-STAT signaling pathway, inflammation/ signaling pathway, Toll-like

ii

receptor signaling pathway, NOD-like receptor signaling pathway, Ras signaling pathway and cytosolic DNA-sensing pathway. While similarities exist between humans and pigs in many pathways, our research suggests that adaptation of PDCoV to the porcine host required the ability to downregulate many response pathways including the interferon pathway and provides an important foundation that contributes to an understanding of the mechanisms of PDCoV infection across different hosts. This is the first report of transcriptome analysis of human cells infected by PDCoV.

Keywords: PDCoV infection, RNA-seq technology, differentially expressed genes, pathways.

iii

Acknowledgments

I would like to thank my advisor Dr. Scott Kenney for his enormous patience and support in implementing this idea/project in his lab; my committee members, Dr. Vanessa Hale and Dr. Anastasia Vlasova for their patience, guidance and valuable ideas in order to make the data analysis of this study stronger; my Bioinformatics specialist and dear husband, Dr. Wilber Ouma, for his patience and guidance in solving the most difficult bioinformatics tasks in this project; and Dr. Patricia Boley for her valuable support in the experimental part of this study that includes the implementation of RNA extraction and qRT-PCR.

iv

Vita

September 2008 ...... B.S. Microbiology, Universidad de los

Andes. Bogota, Colombia.

March 2010 ...... M.Sc. Biological Sciences. Universidad de los Andes. Bogota, Colombia.

2017 to present ...... Masters (PhD track) Student/ Graduate

Research Associate. The Ohio State

University - OARDC, Wooster, Ohio, USA

Publications

Cruz Diana P, Huertas Mónica G, Lozano Marcela, Zárate Lina and Zambrano María

Mercedes. Comparative analysis of diguanylate cyclase and phosphodiesterase genes in Klebsiella pneumoniae. BMC Microbiology 2012, 12:139.

Monica G. Huertas, Lina Zarate, Ivan C. Acosta, Leonardo Posada, Diana Cruz, Marcela

Lozano and Maria M. Zambrano. Klebsiella pneumoniae yfiRNB operon affects biofilm formation, polysaccharide production and drug susceptibility. Microbiology

2014, 160.

Fields of Study

Major Field: Comparative and Veterinary Medicine.

v

Table of Contents

Abstract ...... ii

Acknowledgments...... iv

Vita ...... v

List of Tables ...... vii

List of Figures ...... viii

Chapter 1: Deltacoronavirus. Introduction ...... 1

1.1. Hypothesis and Objectives ...... 12

Chapter 2: Transcriptome Analysis of IPEC and HIEC Cells in Response to

Deltacoronavirus Infection ...... 13

2.1. Materials and Methods ...... 13

2.2. Results ...... 17

2.3. Discussion ...... 44

2.4. Conclusions ...... 51

References ...... 55

Appendix A: IF Staining of the Inoculated IPEC cells ...... 60

Appendix B: Common Upregulated and Downregulated DEGs in Humans and Pigs ..... 61

Appendix C: DEGs in humans and pigs in three Pathways ...... 62

vi

List of Tables

Table 1. RNA sequencing and Genome mapping results……………………………….33

Table 2. DEGs in H. sapiens (HIEC cells) and S. scrofa (IPEC cells) at 24 hpi vs no infected cells……………………………………………………………………………..33

Table 3. classification of upregulated DEGs in humans and pigs at 24 hpi………………………………………………………………………………………..41

Table 4. Top 20 differential expressed genes in humans and pigs……………………...48

vii

List of Figures

Figure 1. Electrophoresis gel of 16 RNA samples (uninfected HIEC and IPEC cells)…19

Figure 2. Bioanalyzer electropherogram of the cDNA pooled library that contains twelve libraries of uninfected and infected HIEC and IPEC cells……………………..20

Figure 3. Quality control of reads in HIEC RNA-seq library (Hs_Dc_1) before trimming

(A) and after trimming (B)………………………………………………………………22

Figure 4. Adapter removal in HIEC RNA-seq library (Hs_Dc_1) before trimming (A) and after trimming (B)…………………………………………………………………...23

Figure 5. The density distribution of log-CPM values before (A) and after (B) filtering low expressed genes in HIEC RNA-seq libraries……………………………………… 24

Figure 6. The density distribution of log-CPM values before (A) and after (B) filtering low expressed genes in IPEC RNA-seq libraries………………………………………...25

Figure 7. Box plots showing the distribution of log-cpm values for unnormalized (left panel) and normalized (right panel) data in HIEC RNA-seq libraries…………………..26

Figure 8. Box plots showing the distribution of log-cpm values for unnormalized (left panel) and normalized (right panel) data in IPEC RNA-seq libraries…………………...27

Figure 9. MDS plot based on biological coefficient of variation, and Histogram of the variances of the principal components of the Human Data set…………………………29

Figure 10. MDS plot based on biological coefficient of variation, and Histogram of the variances of the principal components of the Pig Data set……………………………..29

Figure 11. IF staining of the inoculated HIEC cells at 24 hpi………………………....31

viii

Figure 12. DEGs in H. sapiens (HIEC cells)(A) and S. scrofa (IPEC cells) (B) at 24 hpi vs no infected cells………………………………………………………………………34

Figure 13. Heatmap of the top 100 genes that were differentially expressed in the

PDCoV-infected HIEC cells (humans) (A) and the PDCoV-infected IPEC cells (pigs) (B) compared with the no infected cells……………………………………………………..35

Figure 14. KEGG Gene Set Enrichment Analysis of DEGs in humans (A) and pigs (B) at 24 hpi…………………………………………………………………………………..36

Figure 15. DEGs in 10 common immune response -associated pathways in pigs and humans after PDCoV infection…………………………………………………………..38

Figure 16. Common upregulated DEGs in humans at 24 hpi across 10 pathways……...39

Figure 17. Common upregulated DEGs in humans at 24 hpi across 10 pathways……. 40

Figure 18. Common upregulated DEGs in pigs at 24 hpi across 9 pathways…………..41

Figure 19. Common downregulated DEGs in humans at 24 hpi across 10 pathways...... 42

Figure 20. Common downregulated DEGs in pigs at 24 hpi across 7 pathways………43

Figure 21. Orthologs of Upregulated DEGs between humans and pigs……………….44

Figure 22. Orthologs of Upregulated and Downregulated DEGs between humans and pigs…………………………………………………………………………………….…45

Figure 23. Orthologs of Upregulated and Downregulated DEGs between humans and pigs………………………………………………………………………………………46

ix

Chapter 1: Deltacoronavirus Introduction

General Introduction

Coronaviruses (CoVs) are a global distributed group of single-stranded positive sense RNA that infect a wide range of vertebrate hosts, causing diseases that range from mostly upper respiratory tract infections in humans to gastrointestinal tract infections that can lead to death (McBride, van Zyl, & Fielding, 2014). Phylogenetically, they are classified into four genera: Alpha -, Betha-, Gamma- and Deltacoronavirus (Li et al., 2018; Woo, Huang, Lau, & Yuen, 2010). In the past two decades, three highly pathogenic human have emerged from zoonotic viruses. They are the severe acute respiratory syndrome (SARS)-CoV, infecting ~8000 people worldwide with a case-fatality rate of ~10% in 2002-2003; the Middle East respiratory syndrome

(MERS)- CoV, infecting ~2500 people with a case–fatality rate of ~36%; and the new severe acute respiratory syndrome-related 2 (SARS-COV-2), which causes

Coronavirus Disease-2019 (COVID-19) with a global mortality rate yet to be determined

(Daniel Blanco-Melo, 2020). The first two viruses cause lethal respiratory infections in humans and both are famous examples of the genera CoVs that crossed species barriers (Li, et al., 2018). SARS represents a zoonotic introduction from bats through intermediate animal hosts to humans, and MERS was spread from bats to camels to humans (Li, et al., 2018). Deltacoronavirus, which contains bulbul coronavirus HKU11, thrush coronavirus HKU12 and munia coronavirus HKU13, and was most recently identified in pigs is potentially in the midst of a species crossing event into humans (Woo, et al., 2010).

1

History of the Disease

Porcine delta coronavirus (PDCoV) belongs to the Deltacoronavirus genus of the

Coronaviridae (Hu et al., 2015). PDCoV is an emerging global infectious disease of the swine industry with unknown potential to infect poultry and cause human disease. It was first detected in 2009 in fecal samples from pigs in Asia but its etiologic role was identified in 2014, when pigs in the United States were showing diarrhea (Woo et al.,

2012; Zhang, 2016). It was first identified in 2012 in Hong Kong, China during a molecular surveillance study (Woo, et al., 2012), and isolated from clinical cases of diarrhea in young pigs in the United States, in 2014 (L. Wang, Byrum, & Zhang, 2014).

During the initial outbreak of PDCoV, 30% of the diagnostic samples were positive

(Marthaler, Jiang, Collins, & Rossow, 2014). Another study reported that between

March and September of 2014, 6.9% of case IDs submitted from the US was positive for

PDCoV, while all case IDs coming from Canada and Chile were negative for PDCoV and only one case ID was positive for PDCoV in Mexico. From the positive cases reported in the U.S, twelve of the 25 states were positive for PDCoV while North Carolina, Indiana,

Ohio and Iowa had the highest percentage of positive case IDs (>10%) (Homwong et al.,

2016).

Swine enteric coronaviruses such as porcine epidemic diarrhea (PEDV) and porcine Deltacoronavirus (PDCoV) have emerged and spread in the North American swine industry in the last four years (Niederwerder & Hesse, 2018). In the United States

(U.S.), between 2013 and 2014, these newly emerged viruses have spread across the country and affected pigs, causing a high number of pig deaths and significant economic impacts, evading the control measures for viral infections in farms affected by PEDV and 2

PDCoV (Jung, Hu, & Saif, 2016). As a result, PEDV infection has caused severe economic losses with estimates of $900 million to $1.8 billion from 2013 to 2014 (Hou &

Wang, 2019; Paarlberg, 2014; Schulz & Tonsor, 2015). During the worst stages of the

PEDV outbreak, 10% (~seven million) of US pigs were infected (Hou & Wang, 2019).

Accordingly, given the reduced volumes of available pig and hog supplies, industry participants that depend on high volumes (for example, packers, processors, distributors, and retailers) were harmed. Consequently, pork packers experienced reductions in annual returns of $481 million following a 3% reduction in pigs from PEDV (Schulz &

Tonsor, 2015).

Clinical signs and ways of transmission

Deltacoronavirus is associated with clinical signs of acute, watery diarrhea, in sows and nursing pigs. It is frequently accompanied by acute, mild to moderate vomiting that lead to dehydration, loss of body weight lethargy and finally, death (Jung, et al.,

2016); mortality was found only in nursing pigs (Hu, et al., 2015). Experimental infection studies showed that conventional and gnotobiotic (Gn) piglets infected with

PDCoV developed mild to severe diarrhea, gross and microscopic intestinal lesions at 48-

72 h post infection (Jung et al., 2015; Ma et al., 2015; Zhang, 2016). As a consequence, these pigs show dehydration, loss of bodyweight, and lethargy, but they are able to maintain their appetite until sudden death (Jung, et al., 2016). Among the different clinical signs, experimentally infected conventional nursing pigs only developed diarrhea, which showed that a variety of disease outcomes are expected based on diverse experimental conditions. In like manner, conventional pigs are more susceptible to

3

PDCoV infection in comparison to Gn pigs in similar experimental conditions (Jung, et al., 2016).

Diarrhea induced by PDCoV is likely a consequence of malabsorption because of massive loss of absorptive enterocytes and functional disorders of these infected enterocytes, as it was shown in PEDV and TGEV (Jung, et al., 2016). Previous studies have shown that persisting diarrhea in infected nursing pigs is observed from five to ten days (Hu, Jung, Vlasova, & Saif, 2016; Jung, et al., 2015; Ma, et al., 2015). Fecal virus

RNA was also evident in PDCoV-infected nursing pigs and after recovery, pigs continue to discharge PDCoV RNA in their feces (Jung, et al., 2016). Gross and histological lesions can be observed during PDCoV infection in infected nursing pigs. Specifically, gross lesions are limited to the gastrointestinal tract and are characterized by thin and transparent intestinal walls, with accumulation of large amounts of yellow fluid in the intestinal lumen, while histological lesions are defined by acute, multifocal to diffuse, mild to severe atrophic enteritis in the proximal jejunum to ileum (Q. Chen et al., 2015;

Jung, et al., 2015; Jung, et al., 2016; Ma, et al., 2015). Currently, there is no evidence that PDCoV is transmissible to humans and there are no treatments or vaccines available for PDCoV due to its mechanisms of infection remaining unknown (Hu, et al., 2015;

Jiang et al., 2019)

Several possibilities have been considered for how PDCoV was introduced and rapidly spread throughout pig farms in North America, but the exact mechanism by which this virus entered and disseminated is not completely understood (Niederwerder &

Hesse, 2018). Some potential contributors could be: aerosols, neighboring farms, breaches in biosecurity, contaminated transport vehicles, feed and feed bags 4

(Niederwerder & Hesse, 2018). Another possible route of transmission for PDCoV includes carriers like older pigs with subclinical infection, similar to PEDV infection in young pigs (Jung, et al., 2016). Finally, it is important to stress that the fecal-oral route is the primary route of transmission (Jung, et al., 2016; Niederwerder & Hesse, 2018).

Genomics

Coronaviruses (CoVs) have the largest genomes (26.4 to 31.7 kb) among all known RNA viruses, with G + C contents varying from 32% to 43%. Different numbers of small open reading frames (ORFs) are present across their conserved genes (ORF1ab, spike, envelope, membrane and nucleocapsid) and downstream to nucleocapsid gene in different coronavirus lineages. This large genome has conferred coronaviruses with extra plasticity in accommodating and modifying genes (Woo, et al., 2010). Genetic plasticity allows CoVs to achieve unique diversity in emergence of new strains and species and to adapt to new host and ecological niches without using common biological vectors such as ticks, mosquitoes, etc.; this genetic diversity is maintained because of the accumulation of point mutation in genes (genetic drift) and due to low fidelity of the RNA-dependent

RNA polymerase and homologous RNA recombination (Vlasova, 2013). PDCoV is enveloped and pleomorphic with a diameter of 60-80 nm (Jung, et al., 2016). This virus is a single-stranded, enveloped, positive-sense RNA virus (Zhang, 2016). It has a genome of approximately 25.4 kb in length (excluding the poly A-tail) that basically encodes four structural : the Spike (S), envelope (E), Membrane (M) and

Nucleocapsid (N), and four nonstructural proteins (Jung, et al., 2016; Woo, et al., 2010;

Zhang, 2016). The PDCoV genome organization and arrangement includes the following regions and proteins: 5’ untranslated region, open reading frame 1a/1b (ORF 1a/1b), S, E, 5

M, nonstructural protein 6 (NS6), N, nonstructural protein 7 (NS7) and 3’ UTR (Jung, et al., 2016). A transcription regulatory sequence (TRS) motif has been identified at the 3’ end of the leader sequence preceding most ORFs (Cavanagh, 1997; Gorbalenya,

Enjuanes, Ziebuhr, & Snijder, 2006). It is believed that this motif directs recombination which is promoted by a unique template switching “copy-choice” mechanism during

RNA replication (Lai, 1992).

Some general characteristics of the structural and nonstructural proteins of coronaviruses and their viral replication have been identified but detailed functions and roles of these proteins in host cells are unknown. Some functions of these proteins have been identified, as follows. 1) ORF1a/1b occupies about two thirds of the CoV genome; it encodes the replicase polyprotein and is translated from ORF1a (11826 to 13425 nt) and

ORF1b (7983 to 8157 nt). 2) S protein (Spike) is used for receptor binding and viral entry, it is responsible for the “spikes” present on the surface of coronaviruses and is the most variable sequence in the CoV genomes. 3) E and M proteins are small transmembrane proteins associated with the envelope of all coronaviruses and, 4) The nucleocapsid (N) is a structural protein that establishes complexes with genomic RNA, interacts with the viral membrane protein during virion assembly, and plays a pivotal role in enhancing the efficiency of virus transcription and assembly (McBride, et al., 2014;

Woo, et al., 2010).

Viral replication and pathogenesis

With regard to tissue tropism of PDCoV, it is known that this virus is cytolytic and the enterocytes, that have been infected, develop acute necrosis, causing marked villous atrophy in the small intestine (Jung, et al., 2016). Therefore, the host receptor is a 6

major determinant of pathogenicity, tissue tropism and host range of the virus. Given these points, in general, coronavirus infection begins with the attachment of the S1 domain of the Spike protein with its associated receptor. This event causes a conformational change in the S2 subunit in S that promotes the fusion of the viral and cell plasma membrane. After the release of the viral nucleocapsid to the cytoplasm, the viral gRNA is translated through ribosomal frameshifting mechanism into polyproteins pp1a

(4382 amino acids) and pp1ab (7073 amino acids). These polyproteins are autoproteolytically processed by host and viral proteases that will help to generate 16 non-structural proteins (NSPs). These proteins will assemble and form the replication- transcription complex. With the assembly of this complex, the full-length positive strand of genomic RNA will be transcribed in order to form a full-length negative-strand template for the synthesis of new genomic RNAs and overlapping subgenomic negative- strand templates. The next step is coronaviral replication, in which the replicase- polymerase is involved. The genomic RNA is replicated and the subgenomic RNA transcribed and translated to form the structural and accessory proteins. The viral products will be assembled, resulting in the accumulation of new genomic RNA and structural components. In this stage of the infection, the helical nucleocapsid that contains the genomic RNA will interact with other viral structural proteins (S, E and M proteins) to form the assembled virion. The assembly of CoV particles will be completed through the release (budding) of helical nucleocapsid as a smooth-wall vesicle in the secretory pathway from the endoplasmic reticulum to the Golgi intermediate compartment (ERGIC) via exocytosis (Lim, Ng, Tam, & Liu, 2016).

7

When cells are exposed to pathogens such as viruses, the immune response plays a fundamental role in the early control of viral infection (Barber, 2001; Lim, et al., 2016).

The innate immune response is the first line of defense which is activated after the infection occurs and the second line of defense, represented by cells of the adaptive immune system (CD4+, CD8+ T and B cells) is generated by antigen-presenting cells

(APCs). The host APC can digest viral proteins, providing antigenic peptides and connecting them to the major histocompatibility complex (MHC), which will lead to the activation of the adaptive immune response (Kaminskyy & Zhivotovsky, 2010).

However, the host as well as the virus can manipulate innate immune mechanism as a form of defense or evasion strategy (Barber, 2001). In this case, CoV accessory proteins are indispensable in diverse cellular signaling such as cell proliferation, apoptosis and interferon signaling (Lim, et al., 2016). Interferon (IFN) and the IFN-induced cellular antiviral response are an important hallmark for virulence. Consequently, in virus- infected cells, viral components of replication known as the pathogen-associated molecular patterns (PAMPs) can be recognized by host pattern-recognition receptors

(PRRs) expressed in most somatic cells, such as the cytoplasmic retinoic acid-inducible gene I (RIG-I) (Kaminskyy & Zhivotovsky, 2010; Luo et al., 2016). The resulting signaling, which is activated by infection, induces the secretion of several , tumor necrosis factor (TNF), and immunoregulatory interleukins (ILs) (Kaminskyy &

Zhivotovsky, 2010). Likewise, PDCoV has evolved using several strategies in order to be able to replicate in the host. Some strategies involve the engagement of the apoptotic machineries for efficient viral infection and to be able to escape the innate immune response by impeding the activation of transcription factors IRF3 and NF-kB, both of 8

which are involved in the RIG-I signaling pathway disrupting IFN-B production (Jiang, et al., 2019; Lim, et al., 2016).

Interspecies transmission of PDCoV and bioinformatics

Viruses are a cause of considerable concern to human and animal health (Ibrahim et al., 2018; Kirk et al., 2015). There is a remarkably large number of viruses in the biosphere (they are approximately 1031, about 10 times more abundant than bacteria) and only a minuscule fraction has been identified (Ibrahim, et al., 2018). In recent years, we have seen both the emergence of new viral diseases (e.g. MERS, SARS) and the re- emergence of known diseases in new geographical areas (e.g. Zika, Dengue and

Chikungunya) (Ibrahim, et al., 2018). Consequently, the huge increase in the number of coronaviruses that has been discovered and their sequenced genomes is a valuable source of opportunities for performing genomics and bioinformatics analysis on this family of viruses (Woo, et al., 2010). Nevertheless, the natural reservoirs of CoVs and their emergence pathways remain unknown. For some CoV, animals, birds and humans can serve as a source of infection and in advantageous conditions can lead to disease outbreaks (Vlasova, 2013). Similarly, multiple viral genes can participate in CoV pathogenesis and result in different virulent phenotypes depending on interaction with host responses. Observations related to coronavirus tissue tropism and host range variants are evidence of CoV quasispecies during replication in tissue culture or in avian/mammalian species which indicate that CoVs can shift or extend their host range resulting in the emergence of new CoV variants in humans (Vlasova, 2013).

Consequently, CoVs can respond and evolve faster in order to adapt to new hosts/tissues/environments and can even manipulate these factors to perpetuate their 9

persistence and abundance. Additionally it is important to mention that according to the current subfamily classification, the alpha- and genera includes all known mammalian CoVs, while gamma- and newly emerging deltacoronavirus genera include avian CoVs. (Vlasova, 2013).

With respect to the previous information, phylogenetic analysis suggest that

PDCoV originated relatively recently from a host-switching event between birds and mammals (Li, et al., 2018). Little is known about how PDCoV interacts with its differing hosts. Recently, Li, W et.al., were able to show that PDCoV can infect cells from an exceptionally diverse range of species such as humans, pigs and chickens by binding to an interspecies conserved domain of aminopeptidase N (APN) (a receptor for cell entry)

(Boley et al., 2020; Li, et al., 2018). APN is a protein that plays an important role in enzymatic activity, peptide processing, cholesterol uptake, chemotaxis, cell signaling, and cell adhesion. This protein is extensively distributed and highly conserved in amino acid sequences across species of the Eukaryota domain, Animalia kingdom and is expressed in several tissues such as epithelial cells of the kidneys, respiratory and gastrointestinal tract

(Boley, et al., 2020). Selection of phylogenetically conserved receptors can provide viruses an evolutionary advantage by giving them the opportunity to explore alternative hosts, occasionally resulting in host switching and virus speciation (Woolhouse, Scott,

Hudson, Howey, & Chase-Topping, 2012).

By integrating bioinformatics methods, it could, in the future, be possible to predict viral evolution in patients, to predict the course of a virus infection, and to adjust therapeutic treatments (Ibrahim, et al., 2018). Transcript identification and quantification of have been separate activities in molecular biology since 10

the discovery of RNA’s role as the key intermediate between the genome and the proteome (Conesa et al., 2016). Understanding the transcriptome is a prerequisite for interpreting the functional elements of the genome and discovering the molecular constituents of cells and tissues as well as understanding development and disease (Z.

Wang, Gerstein, & Snyder, 2009). As a result, next-generation RNA-sequencing (RNA- seq) has emerged as the leading quantitative transcriptome profiling system (Mutz,

Heilkenbrinker, Lonne, Walter, & Stahl, 2013). This technique with its high resolution and sensitivity has reported novel transcribed regions, has provided a more precise measurements of levels of transcripts and splicing isoforms of known genes, and has mapped 5’ and 3’ boundaries for many genes (Z. Wang, et al., 2009). Additionally,

RNA-seq offers an unlimited dynamic range of quantification at reduced technical variability. These advantages, along with the declining cost of sequencing, make RNA- seq a very attractive method for whole-genome expression studies in many biological systems, including species with genomic sequences that are not yet determined (Van

Verk, Hickman, Pieterse, & Van Wees, 2013).

A typical RNA-seq experiment starts with mRNA. Hence, a population of RNA

(total or fractionated, such as poly (A) +) is converted into a cDNA library, in which the cDNA fragments have adaptors attached to one or both ends. This technique utilizes developed deep-sequencing technologies such as Illumina IG, Applied Biosystems

SOLiD and Roche 454 Life Science. The Illumina HiSeq platform is the most commonly used next-generation sequencing technology for RNA-seq, producing up to 3 billion reads per sequencing run between 1.5 and 11 days (Van Verk, et al., 2013). Following the description of this technique, each molecule, with or without amplification, is 11

sequenced in a high-throughput way in order to obtain short sequences from one end

(single-end sequencing) or both ends (pair-end sequencing). As a result, by sequencing these millions of DNA fragments in the library (known as ‘reads’), which are typically

30-400 bp, depending on the DNA sequencing technology used, an accurate measure of the relative abundance of each transcript and splice variants can be determined (Van

Verk, et al., 2013; Z. Wang, et al., 2009).

Lastly, it is important to note that robust library preparation methods that produce a representative, non-biased source of nucleic acid material from the genome under investigation are key factors for the success of RNA-seq experiments (van Dijk,

Jaszczyszyn, & Thermes, 2014). These methods usually start with ribosomal RNA

(rRNA) depletion or mRNA (poly (A)) enrichment since the large majority of cellular total RNA is rRNA. Hence, polyadenylated mRNAs are commonly extracted using oligo-dT beads, or as an alternative option, rRNAs can be selectively depleted by selective degradation of rRNAs and other 5’monophosphate RNAs by exonucleases (van

Dijk, et al., 2014). However, poly (A) enrichment suffers from biases when applied to low quality RNA samples, while ribosomal RNA depletion methods are more appropriate for the sequencing of RNA samples with lower quality since they reduce the amount of the highly abundant ribosomal RNAs from the total RNA samples using capture probes

(Schuierer et al., 2017).

1. 1. Hypothesis and Objectives

Considering the previous information, we hypothesize that gene expression may be different depending on the cell type and species in which the infection occurs, leading

12

to changes in disease severity. For this reason, we propose RNA-seq transcriptome profiling of intestinal epithelial human and swine cells infected by PDCoV.

In order to explore the RNA-seq transcriptome profiling of these cell lines, this study aimed to investigate if the intestinal epithelial human cells are susceptible to infection with cell culture-adapted PDCoV, to identify differentially expressed human and swine epithelial cell genes in response to PDCoV and, to find common DE genes and signaling pathways between humans and pigs. Jung, K et.al., was able to show that intestinal epithelial swine cells are susceptible to infection with PDCoV (Jung, Miyazaki,

Hu, & Saif, 2018). Determining the adaptability of PDCoV to different hosts is the first step in assessing this newly emerging viral disease. Understanding the origin of this virus, its pathogenesis, and the threat potential to the worldwide commercial meat production industry is also critical for preventing major outbreaks that can threat humans, affect food security and cause considerable economic damage. This is the first report of transcriptome analysis of human cells infected by PDCoV.

13

Chapter 2: Transcriptome analysis of IPEC and HIEC cells in response to Delta Coronavirus Infection

2.1. Materials and Methods

Virus

The PDCoV OH-FD22 cell culture adapted virus was previously isolated from small intestinal contents of a diarrheic pig from Ohio using swine testicular (ST) and

LLC porcine kidney (LLC-PK) cell cultures(Hu, et al., 2015; Hu, et al., 2016). The infectious titer as determined by tissue culture infective dose 50 (TCID50) was 7 x 107/ml

(1 x107 PFU/ml).

Intestinal epithelial swine (IPEC-J2), human intestinal epithelial (HIEC- P26) cells and PDCoV infection

IPEC-J2 and HIEC cells were a kind gift from by Dr. Linda Saif (Food Animal

Health Research Program, The Ohio State University, Wooster, OH, USA). IPEC-J2 cells (passage 40-58) were cultured in Dulbecco’s modified eagle medium/F12

(DMEM/F12) (Gibco, USA) supplemented with HEPES, 5% fetal bovine serum (FBS),

1% penicillin/streptomycin (Gibco), 1% insulin-transferrin-sodium selenite (Gibco) and

5 ng/ml of human epidermal growth factor –EGF (BioVision Inc) (Jung, et al., 2018).

HIEC cells (passages 25) were maintained on type I collagen-coated culture dishes in

Opti-MEM Reduced Serum Medium (Gibco, USA) supplemented with HEPES (20 mM)

(Gibco), GlutaMAX (10 mM) (Gibco), 4% fetal bovine serum (FBS) and 10 ng/mL of human epidermal growth factor –EGF (BioVision Inc) (HIEC-6 (ATCC ®CRL-3266 TM).

PDCoV OH-FD22 (Hu, et al., 2015) was utilized to infect the cells at a multiplicity of

14

infection (MOI) of 1. To infect, cells were rinsed with maintenance medium (Advance

MEM supplemented with 1% Anti-Anti, 1% NEAA and 1% HEPES), virus was added and allowed to adsorb for one hour in the presence of 0.25% Trypsin (HIEC cells) and

0.05% Trypsin (Gibco) (IPEC cells). Based upon previous research by Jung et.al. (Jung, et al., 2018) and Jiang et.al. (Jiang, et al., 2019) who were able to detect viral replication and gene expression alteration in cells a timepoint of 24 hours was chosen for ending the experiments.was propagated in IPEC-J2 and HIEC-P26 cell cultures as follows:

Immunofluorescence assay (IFA) for the detection of PDCoV antigen in HIEC cells

At 24 hours post infection HIEC cells (passage 26) either infected with PDCoV or mock infected were rinsed with phosphate buffered saline (PBS) and fixed with 4% paraformaldehyde solution in PBS for 30 minutes.Triton X-100 (Sigma) was then used as a permeabilization agent for 15 minutes at room temperature. Fixed cells were blocked using BetterBlocker Universal Blocking Solution (Thermo Fisher) for 30 minutes followed by incubation with the primary antibody (anti-N mouse monoclonal antibody

(1:2500), kindly supply by Dr. Steven Lawson at South Dakota State University) for one hour at 37°C. Cells were washed three times and the staining was completed by adding the secondary antibody (Alexa Fluor 488-conjugated goat α-mouse antibody (1:400)

(Life Technologies). Nuclei were visualized using 4′,6-diamidino-2-phenylindole

(DAPI) (Invitrogen). Cells were imaged using an Olympus IX-7 fluorescent microscope.

RNA extraction and quality control

RNA extraction was done using a GenCatch total RNA miniprep (Epoch Life

Science, Inc) following the manufacturer’s instructions. In order to remove all traces of

DNA, RNA samples were treated with DNase I using an InvitrogenTM TURBO DNA – 15

free kit. Uninfected IPEC and HIEC cells served as the host uninfected control samples

(6 samples – Control). Three replicates were used per sample for a total of 12 samples.

RNA quality was assessed by using TapeStation Analysis Software A.02.02 (Agilent

Technologies). RNA samples with RNA integrity (RINs) equal to 2 to 7 or greater than 7 were selected for library preparation and sequencing.

Library preparation and sequencing

The whole transcriptome RNA was enriched by depleting ribosomal RNA

(rRNA) using a NEBNext rRNA depletion kit (NEB). Input RNA concentrations, fragmentation conditions and PCR cycles for intact and degraded RNA were established, following the manufacturer’s protocol. RNA and cDNA samples were purified by using

DNA purification SPRI magnetic beads (abm®good). Completed libraries were quantified by using QuBitTM Assays (Invitrogen) and run on TapeStation Analysis

Software (Agilent) and Bioanalyzer in order to determine the library size. Libraries were pooled and sequenced using Illumina Hiseq Platform PE150 by Novogene Corporation

INC.

Data Pre-processing and alignment

Raw sequence reads were first subjected to a quality check that involved removal of adapter sequences by using FastQC

(https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) and BBMap

(https://sourceforge.net/projects/bbmap/). Reads were aligned to Homo sapiens GRCh38 genome release 97 (ftp.ensembl.org/pub/release-97/fasta/homo_sapiens/dna/) and Sus scrofa 11.1 genome release 97 (ftp://ftp.ensembl.org/pub/release-

97/fasta/sus_scrofa/dna/) using the Rsubread aligner (Y. Chen, Lun, & Smyth, 2016) 16

Expression data pre-processing

Gene expression counts were identified from the alignment files in BAM format using Rsubread (Y. Chen, et al., 2016). Raw count data was transformed to counts per million (CPM) and log-CPM using EdgeR (Y. Chen, et al., 2016). Genes that were not expressed in any biologically significant levels (CPM less than 1) were discarded.

Expression values were normalized using the trimmed mean of M-values (TMM) normalization (Robinson & Oshlack, 2010).

Differential Expression

Differential expression analysis was performed between uninfected and infected samples in each cell lines. Briefly, dispersion of each gene was first estimated and then, a generalized linear model (GLM) was fit on the expression dataset. Differential expression was tested using a quasi-likelihood (QL) F-test method. The analyses were performed with the limma and edgeR statistical packages on the R programming environment (Y. Chen, et al., 2016). The parameters for the numbers of up and down- regulated genes between each cell line were Bonferroni-Hochberg adjusted P-value cut- off of 0.05 and log fold change of 1 (FDR <= 0.05).

GO function and KEGG pathway enrichment analysis

Function classification and enrichment analysis of differentially expressed genes

(DEGs) were performed using KEGG Gene Set Enrichment Analysis available in the R package: ClusterProfiler, DAVID (https://david.ncifcrf.gov/home.jsp) and PANTHER

15.0 (http://www.pantherdb.org/tools/index.jsp). Pathway enrichment was analyzed based on the KEGG database (https://www.genome.jp/kegg/pathway.html). KEGG pathways with p-values < 0.05 were considered to be significantly enriched. 17

Intersections of common upregulated and downregulated DE genes across pathways and between species were represented by Venn Diagrams using the draw custom Venn Diagram tool (http://bioinformatics.psb.ugent.be/webtools/Venn/) and

Orthologs genes between Humans and Pigs were predicted using PANTHER 15.0

(http://www.pantherdb.org/tools/index.jsp) and protein BLAST from NCBI

(https://blast.ncbi.nlm.nih.gov/Blast.cgi) with the following parameters: E-value greater than 10-6, identity percentage greater than 30% and less than 97% and query coverage greater than 95%.

2.2. Results

RNA quality control

RNA was extracted from uninfected IPEC and HIEC cells (Controls) and infected cells after 24 hours post infection (hpi). Three replicates were used per sample for a total of 12 samples. The RNA integrity electrophoresis gel showed that most of the samples were partially degraded RNA (Fig. 1). Taking into account this gel, we were able to choose the best samples and the most appropriate library preparation method.

18

[nt] EL1(L) HM1 HM2 HM3 IM1 IM2 IM3 IM4 HI1 HI2 HI3 HI4 II1 II2 II3 II4

Figure 1. Electrophoresis gel of 15 RNA samples: uninfected HIEC and IPEC cells

(HM1, HM2, HM3, IM1, IM2, IM3, IM4); and infected HIEC and IPEC cells (HI1, HI2,

HI3, HI4, II1, II2, II3, II4). In each sample, it is possible to identify two bands: the one at the top is 28S and the one at the bottom is 18S. RNA integrity numbers (RINs) are represented as follows: Intact RNA (RIN > 7); Partially degraded RNA (RIN = 2 to 7);

Highly degraded RNA (RIN= 1 to 2). HM, HIEC Mock; IM, IPEC Mock; HI, HIEC

Infected; II, IPEC infected.

19

Library Quality Control

Due to low quality of our RNA samples, we decided to use Ribosomal RNA depletion methods in order to get our 12 cDNA libraries. In general, the main peaks of the libraries range from 440 bp, which includes 120 bp of adapters. The Q-PCR concentration of the pooled library was 10.30 nmol/L, which met the sequence requirement of Hiseq Xten illumina (Fig 2).

Figure 2. Bioanalyzer electropherogram of the cDNA pooled library that contains twelve libraries of uninfected and infected HIEC and IPEC cells. This is a data plot of fragment size (bp) versus Fluorescence intensity (fluorescence units, FU). Peaks at 35 bp and 10380 bp represent lower and upper markers respectively.

20

Sequence Read Preprocessing

According to the previous results, there were some experimental challenges related to low quality RNA samples and presence of adapter/primer dimers in most of the cDNA libraries that were generated in this experiment. In order to solve this problem, quality control of the reads and elimination of low quality reads was performed by using

FastQC in HIEC and IPEC RNA-seq libraries (Fig 3); and BBMap was used for removal of adapter sequences in HIEC and IPEC RNA-seq libraries (Fig 4).

21

A B

C D

Figure 3. Quality control of reads in HIEC RNA-seq library (Hs_Dc_1) before trimming (A) and after trimming (B); Quality control of reads in IPEC RNA-seq library (Ss_Dc_1) before trimming (C) and after trimming (D). Hs_Dc, Homo sapiens_Deltacoronavirus; Ss_Dc, Sus scrofa_Deltacoronavirus. The same procedure was applied for the five remaining RNA-seq libraries.

22

A B

C D

Figure 4. Adapter removal in HIEC RNA-seq library (Hs_Dc_1) before trimming

(A) and after trimming (B); Adapter removal in IPEC RNA-seq library (Ss_Dc_1) before trimming (C) and after trimming (D). Hs_Dc, Homo sapiens_Deltacoronavirus;

Ss_Dc, Sus scrofa_Deltacoronavirus. The same procedure was applied for the five remaining RNA-seq libraries.

Expression Data Preprocessing

From the alignment files, gene expression counts were identified and raw count data was transformed to counts per million (CPM) and log-CPM (Fig 5, 6). The CPM 23

values will help to filter the data based on favoring genes that are expressed in larger libraries over those expressed in smaller libraries (Y. Chen, et al., 2016). This will also help to show the number of genes within each sample that have no counts (CPM = 0) or more than 1, 2, 5 or 10 CPM and will be a guide in order to decide the threshold to remove very low expressed genes in any of the experimental conditions (Jimenez-Jacinto,

Sanchez-Flores, & Vega-Alvarado, 2019).

.

Figure 5. The density distribution of log-CPM values before (A) and after (B) filtering low expressed genes in HIEC RNA-seq libraries. Hs, Homo sapiens; Hs_Dc,

Homo sapiens_Deltacoronavirus. 24

Figure 6. The density distribution of log-CPM values before (A) and after (B) filtering low expressed genes in IPEC RNA-seq libraries. Ss, Sus scrofa; Ss_Dc, Sus scrofa_Deltacoronavirus.

Next, in order to enable the comparison of expression across different samples, expression values were normalized using the trimmed mean of M-values (TMM) normalization (Fig 7, 8). This is a robust way to estimate the ratio of the RNA production using a weight trimmed mean of the log expression ratios (Robinson &

Oshlack, 2010).

25

Figure 7. Box plots showing the distribution of log-cpm values for unnormalized

(left panel) and normalized (right panel) data in HIEC RNA-seq libraries. Hs,

Homo sapiens; Hs_Dc, Homo sapiens_Deltacoronavirus.

.

26

Figure 8. Box plots showing the distribution of log-cpm values for unnormalized

(left panel) and normalized (right panel) data in IPEC RNA-seq libraries. Ss, Sus scrofa; Ss_Dc, Sus scrofa_Deltacoronavirus.

Finally, the unsupervised method, Principal Component Analysis (PCA) was used in order to determine similarities and dissimilarities amongst samples. This is a mathematical algorithm that reduces the dimensionality of the data, retaining most of the variation in the dataset. This reduction can be accomplished by identifying directions called principal components (PCs). Samples can be plotted, so it can be possible to 27

visualize similarities and differences between samples and determine whether samples can be grouped (Lever, 2017; Ringner, 2008). In this case, the multi-dimensional scale

(MDS) plot was used for showing the distribution of samples across two dimensions based on the biological coefficient of variation (Fig 9, 10). Accordingly, samples that belong to the same condition or treatment should be closer to each other and distant to other conditions (Jimenez-Jacinto, et al., 2019). Given this information, the PCA in HIEC cells showed that the biological samples replicates of the same condition (uninfected and/or infected HIEC cells) are clustered together due to biological variation (Fig 9).

Therefore, the global quality of this RNA-seq dataset is excellent. However, the PCA in

IPEC cells is showing that despite the fact that biological samples replicates are trying to cluster together in two groups, there is an outlier (Ss_1). This outlier is increasing the technical variation (Fig 10). Taking into account that the variation in the data set is still due to biological variation, it is possible to conclude that the global quality of this RNA- seq dataset is acceptable.

28

Figure 9. MDS plot based on biological coefficient of variation, and Histogram of the variances of the principal components of the Human Data set. Hs_1/2/3  UTR

(Untreated). Hs_Dc_1/2/3  TR (Treated). Histogram: 1Biological variation; 2

Technical variation.

Figure 10. MDS plot based on biological coefficient of variation, and Histogram of the variances of the principal components of the Pig Data set. Ss_1/2/3  UTR

(Untreated). Ss_Dc_1/2/3  TR (Treated). Histogram: 1Biological variation; 2

Technical variation. 29

HIEC cells are susceptible to PDCoV infection

HIEC cells are a normal non-immortalized non-transformed human intestinal crypt cell model derived from immature intestine (Guezguez, Pare, Benoit, Basora, &

Beaulieu, 2014). These cells have been useful in studying human crypt cell functions such as proliferation, apoptosis, cell-matrix interactions, metabolism and the inflammatory response (Guezguez, et al., 2014). HIEC cells were susceptible to PDCoV infection as shown by presence of N protein in cells at 24 hours post infection (hpi) via immunofluorescence assay staining (IFA) positivity (Fig 11A-C) as compared to mock infected cells (Fig 11D-F). HIEC cells appeared susceptible to PDCoV mediated cell death as infected cells were reduced in number at 24 hours and were almost completely gone at 48 hpi compared to mock. We further confirmed that IPEC-J2 cells were also susceptible to PDCoV infection as previously reported (Jung, et al., 2018) and observed equivalent positive IFA signals at 24 hpi (see Appendix A).

30

PDCoV positive by IF staining DAPI staining of Panel A Overlap of Panels A and B

A B C

HIEC Negative control DAPI staining of Panel D Overlap of Panels D and E

D E F

Figure 11. IF staining of the inoculated HIEC cells at 24 hpi. A) IF stained positive

(green fluorescence) for PDCoV antigen. B) Blue-fluorescent 4’,6-diamidino-2- phenylindole dihydrochloride (DAPI) staining of Panel A to stain nuclear DNA. C)

Overlap of panels A and B. The arrowheads indicated clustered of cells positive for

PDCoV antigen. D) IF staining of PDCoV-uninoculated (0.05% Trypsin). E) DAPI staining of Panel D. F) Overlap of panels D and E.

PDCoV infection results in more differential expressed genes in humans compared with pigs

In order to identify the differentially expressed genes (DEGs) in the transcriptome of HIEC and IPEC cells after PDCoV infection, the global gene expression of these cells 31

with and without PDCoV infection (negative controls) were obtained by using RNA sequencing. More than 23 million clean reads remained after removing reads containing adapters and reads with low quality. All clean reads coming from HIEC and IPEC cells were mapped to the human reference genome (release 97) and the pig reference genome

(release 97), respectively. The reads showed total mapping rates from 81.59% to 95.83%

(Table 1). As a result, 7486 DEGs, with 4011 genes upregulated and 3475 genes downregulated, were identified in humans (HIEC cells) (Table 2, Fig 12A) and 1134

DEGs, with 542 genes upregulated and 592 genes downregulated, identified in pigs

(IPEC cells) at 24 hpi (Table 2, Fig 12B). Together, these results show that there are more genes upregulated in humans in comparison to pigs during PDCoV infection (Fig

2).

In order to identify the differential expressed genes in the transcriptome of HIEC and IPEC cells after PDCoV infection, the global gene expression of these cells with and without PDCoV infection (negative controls) were obtained by using RNA sequencing.

More than 23 million of clean reads remained after removing reads containing adapters and reads with low quality. All these clean reads coming from HIEC and IPEC cells were mapped to the Human reference genome (release 97) and the pig reference genome

(release 97) respectively, and they were showing total mapping rates from 81.59% up to

95.83% (Table 1). As a result, 7486 differentially expressed genes (DEGs), with 4011 genes upregulated and 3475 genes downregulated, were identified in humans (HIEC cells) (Table 2, Fig 12A) and 1134 DEGs, with 542 genes upregulated and 592 genes downregulated, were identified in pigs (IPEC cells) at 24 hpi (Table 2, Fig 12B).

32

Altogether, these results show that there are more genes upregulated in humans in comparison to pigs (Fig 12).

Table 1. RNA sequencing and Genome mapping results.

Sample Raw reads Clean reads Total mapping (%) Hs_1 47007321 45421159 95.83% Hs_2 50022577 48880712 95.52% Hs_3 43496780 42168797 94.63% Hs_Dc_1 15600403 14820248 85.82% Hs_Dc_2 24140872 23537480 82.23% Hs_Dc_3 48513411 47055941 81.59% Ss_1 23545237 23195923 94.46% Ss_2 53210729 51875151 95.94% Ss_3 53026614 52011848 94.54% Ss_Dc_1 44124431 42370051 89.67% Ss_Dc_2 53543270 52298369 86.84% Ss_Dc_3 50508987 48861970 94.36%

Hs, Homo sapiens

Hs_Dc, Homo sapiens_Deltacoronavirus

Ss, Sus scrofa

Ss_Dc, Sus scrofa_Deltacoronavirus

Table 2. DEGs in H. sapiens (HIEC cells) and S. scrofa (IPEC cells) at 24 hpi vs no infected cells.

Total Species Up Down Non sigf. Total genes DEs genes Homo sapiens 4011 3475 10784 18270 7486 Sus scrofa 542 592 10716 11850 1134 Up, upregulated genes

Down, downregulated genes

Non.sigf, genes with no significant differences 33

A B

Figure 12. DEGs in H. sapiens (HIEC cells)(A) and S. scrofa (IPEC cells) (B) at 24 hpi vs no infected cells. DEGs upregulated are represented by red dots, DEGs downregulated are represented by blue dots and genes with no significant differences in expression are represented by blue dots.

The hierarchical clustering heatmap in Figure 13 shows the top 100 DE expression profiling data in humans (Fig 13A) and pigs (Fig 13B) by which all these DE expression genes were upregulated in humans at 24 hpi.

34

A B

Figure 13. Heatmap of the top 100 genes that were differentially expressed in the

PDCoV-infected HIEC cells (humans) (A) and the PDCoV-infected IPEC cells (pigs)

(B) compared with the no infected cells (p < 0.05). Each row is a gene and each column is a sample. Upregulated genes were represented in red while downregulated genes were represented in blue. UTR, no infected cells; TR, infected cells.

35

Common pathways and genes are affected in the immune associated response to

PDCoV infection in humans and pigs

DGEs were selected for Gene Set Enrichment Analysis (GSEA) in Kyoto

Encyclopedia of Genes and Genomes (KEGG) pathway enrichment. A total of 60 pathways including 13 immune response- associated pathways were enriched in humans and pigs at 24 hpi (Fig 14).

A B

Figure 14. KEGG Gene Set Enrichment Analysis of DEGs in humans (A) and pigs

(B) at 24 hpi. Enriched signaling pathways with p <0.05 were considered to be statistically significant.

36

From these immune response-associated pathways, 8 were commonly enriched

(Viral protein interaction with cytokine and cytokine receptor, Cytokine-cytokine receptor interaction, NF-kappa β signaling pathway, B cell receptor signaling pathway,

JAK-STAT signaling pathway, Influenza A, Toll-like receptor signaling pathway, TNF signaling pathway, NOD-like receptor signaling pathway and Cytosolic DNA-sensing pathway) (Fig 14). Additionally, pathways such as Apoptosis signaling pathway, T cell activation, Interferon signaling pathway, Interleukin signaling pathway, TGF-beta signaling pathway and Ras signaling pathway were affected during the alteration of immune response of HIEC and IPEC cells caused by PDCoV infection (Fig 15).

From these pathways, there are more genes affected in Inflammation/cytokine signaling pathway in pigs and humans in comparison to the other 9 pathways (Fig 15). In the same way, more genes are upregulated in humans and more genes are downregulated in pigs across the 10 immune response-associated pathways in response to PDCoV infection (Fig 15).

37

Figure 15. DEGs in 10 common immune response -associated pathways in pigs and humans after PDCoV infection. Results are shown in 10 pathways: Apoptosis signaling pathway, B cell activation, Inflammation/cytokine signaling pathway,

Interferon signaling pathway, Interleukin signaling pathway, JAK/STAT signaling pathway, Ras signaling pathway, T cell activation, TGF-beta signaling pathway and Toll like receptor signaling pathway. Pink, down-regulate; Blue, up-regulate.

This study was also able to identify common upregulated and downregulated

DEGs in humans and pigs across the 10 pathways as well as orthologs between these 38

species (Figure 16 - 23; see Appendix B). Most of the common DE genes that are upregulated in humans and pigs across the 10 pathways are part of the NF-kappa-β transcription factor family (NFKBIA, NFKBIA), Interferon (IFNs) family (IFNB1,

IFNL1, IFNL3,), JAK/STAT family (JAK1, JAK2, STAT1, STAT2), Interleukin family

(CXCL8), protein kinase family such as MAPK kinase (MAP3K4, MAPK7) and RAF kinase (RAF1) (Fig 16 - 18).

Figure 16. Common upregulated DEGs in humans at 24 hpi across 10 pathways.

AP Apoptosis signaling pathway; BC B cell activation; ICInflammation/cytokine signaling pathway; TAT cell activation; TLRS Toll like receptor signaling pathway

39

Figure 17. Common upregulated DEGs in humans at 24 hpi across 10 pathways.

ICInflammation/cytokine signaling pathway; IFSInterferon signaling pathway;

JSSJAK/STAT signaling pathway; INS Interleukin signaling pathway; TGFS

TGF-beta signaling pathway; RSRas signaling pathway.

The Interferon family, including MX1 and some other DE genes were also upregulated in different categories according to the Gene Ontology Classification ( Table

3).

40

Table 3. Gene Ontology classification of upregulated DEGs in humans and pigs at 24

hpi.

Category Description Human Pig

BP Antiviral Defense, Influenza A IFNB1, IFI44L, IFNL1, OAS1, IFNL3 MX1, RSAD2 BP Defense response to virus IFNB1, IFI44L, IFNL1, OAS1, IFNL3 CCL5,MX1,RSAD2 BP Negative regulation of viral genome replication IFNB1, OAS1, IFNL3 CCL5, MX1, RSAD2 BP Cytokine and Inflammatory response IFNB1, IFNL1, IFNL3, NLRP4 CCL5, SH2B2, RSAD2 MF Cytokine activity IFNB1, IFNL1, IFNL3 LIFR KEGG_PATHWAY Jak-STAT signaling pathway IFNB1, IFNL1, IFNL3 LIFR KEGG_PATHWAY Cytokine-cytokine receptor interaction IFNB1, IFNL1, IFNL3 LIFR, CCL5 BP Innate immune response FGA, IFNL1, IFNL3 MX1, RSAD2 BP Pos/Neg Regulation of endothelial cell apoptotic process FGA, FGG CCL5, NEK6, BHLHB9 BP Pos/Neg regulation of apoptotic signaling via death domain FGA, FGG CCL5, NEK6, BHLHB9 BP B cell differentiation IFNB1, FLT3 SH2B2 BP Type I interferon signaling pathway IFNB1, OAS1, NLRP4 CCL5, DHX58, MX1 BP negative regulation of T cell differentiation IFNB1, IFNL1 CCL5

Figure 18. Common upregulated DEGs in pigs at 24 hpi across 9 pathways. AP

Apoptosis signaling pathway; BC B cell activation; ICInflammation/cytokine

signaling pathway; TAT cell activation; TLRS Toll like receptor signaling pathway;

41

IFSInterferon signaling pathway; JSSJAK/STAT signaling pathway; INS

Interleukin signaling pathway; RSRas signaling pathway.

In contrast, common DE genes that are downregulated across the 10 pathways in humans and pigs belong to the protein kinase family such as PIK3CB/ LOC100622663,

Son of Sevenless 2 (SOS2), MAPK family (MAPK3, MAPK14) and the FOS family of transcription factors (FOS) (Fig 19 - 20).

Figure 19. Common downregulated DEGs in humans at 24 hpi across 10 pathways.

AP Apoptosis signaling pathway; BC B cell activation; ICInflammation/cytokine signaling pathway; TAT cell activation; TLRS Toll like receptor signaling pathway;

IFSInterferon signaling pathway; JSSJAK/STAT signaling pathway; INS

Interleukin signaling pathway; TGFS TGF-beta signaling pathway; RSRas signaling pathway.

42

Figure 20. Common downregulated DEGs in pigs at 24 hpi across 7 pathways. AP

Apoptosis signaling pathway; BC B cell activation; ICInflammation/cytokine signaling pathway; TAT cell activation; TLRS Toll like receptor signaling pathway;

INS Interleukin signaling pathway; RSRas signaling pathway.

Interestingly, there are approximately 16 orthologs of Upregulated and

Downregulated DE genes between humans and pigs that were identified in different pathways such as: Apoptosis signaling pathway, B and T cell activation,

Inflammation/cytokine signaling pathway, Interleukin signaling pathway, Toll like receptor signaling pathway and Ras signaling pathway (Fig 21 - 23). From these pathways, five orthologs DE genes were upregulated in humans and downregulated in pigs in the Apoptosis signaling pathway (Fig 23), three orthologs DE genes were upregulated in humans and pigs in Inflammation/cytokine signaling pathway (Fig 21), 43

and two orthologs DE genes were downregulated in humans and pigs in Ras signaling pathway (Fig 22). Most of these orthologs had Identity percentage greater than 77% (Fig

21 - 23).

Figure 21. Orthologs of Upregulated DEGs between humans and pigs in: Apoptosis signaling pathway, B and T cell activation and Inflammation/cytokine signaling pathway.

Up_H, Upregulate_humans; Down_H, Downregulate_humans; Up_P, Upregulate_pigs;

Down_P, Downregulate_pigs. These orthologous DE genes were inferred based on percentage of identity, gene name and function.

44

Figure 22. Orthologs of Upregulated and Downregulated DEGs between humans and pigs in: B cell activation, Inflammation/cytokine signaling pathway, Interleukin signaling pathway, Toll like receptor signaling pathway and Ras signaling pathway.

Up_H, Upregulate_humans; Down_H, Downregulate_humans; Up_P, Upregulate_pigs;

Down_P, Downregulate_pigs. These orthologous DE genes were inferred based on percentage of identity, gene name and function.

45

Figure 23. Orthologs of Upregulated and Downregulated DEGs between humans and pigs in: Apoptosis signaling pathway, B and T cell activation and Toll like receptor signaling pathway. Up_H, Upregulate_humans; Down_H, Downregulate_humans; Up_P,

Upregulate_pigs; Down_P, Downregulate_pigs. These orthologous DE genes were inferred based on percentage of identity, gene name and function.

2.3. Discussion

Coronaviruses (CoVs) produce respiratory and gastrointestinal disease in humans, poultry, swine and cattle. CoVs are divided into 4 genera, ,

Betacoronavirus, and Deltacoronavirus. Viruses from each genus have been detected in diverse host species, but and (DCoV) have been isolated primarily in birds (Boley, et al., 2020;

Woo, et al., 2012). In vitro studies have shown that PDCoV has access to cells of diverse range of species including humans, pigs and chickens by binding to an interspecies 46

conserved domain on the protein aminopeptidase N ( APN) (Li, et al., 2018). The ability of PDCoV to infect cells of multiple species and cause illness and death makes it a potentially emerging zoonotic pathogen that should be studied in more depth. Therefore, understanding how cross-species transmission of CoVs takes place is critical to predict which viruses might be in the vicinity of SARS-,MERS-, or COVID-19 – like pandemic.

Moreover, studies of CoV cross-species infection can inform development of novel therapeutics and strategies to fight CoVs in susceptible animal hosts before they become a threat to human health (Boley, et al., 2020). The present study is the first report of transcriptome analysis of human cells infected by PDCoV.

We explored the transcriptome of (HIEC) human cells and (IPEC) pig cells infected by this virus using RNA sequencing. The evaluation of the RNA-seq data resulted in an overall alignment rate from 81.59% up to 95.83% (Table 1). The global quality of the Human RNA-seq dataset was excellent (Fig 9), while the quality of the Pig

RNA-seq dataset was acceptable (Fig 10). The Principal Component Analysis enabled us to conclude that some technical issues related to the Pig Data set could have happened during the experimental phase of the project. These technical issues could be related to low concentration and RNA quality, or using a rRNA Depletion Kit specific for

Human/Mouse/Rat.

IPEC-J2 cells are intestinal porcine enterocytes that were isolated from the jejunum of a neonatal unsuckled piglet (Vergauwen, 2015). Jung and colleagues reported that IPEC-J2 cells are susceptible to infection with PDCoV (Jung, et al., 2018). This was also shown in our lab (see Appendix A). In this study, we also found that HIEC cells are susceptible to PDCoV infection (Fig 11). In order to investigate the mechanism of 47

PDCoV infection, differentially expressed genes were identified in IPEC and HIEC cells at 24 hpi using RNA sequencing. By comparing the global gene expression of these cells with and without PDCoV infection, we found more differentially expressed genes in humans in comparison to pigs and as a result, more genes were upregulated in humans than in pigs at 24 hpi (Fig 12 and Fig 13). We also found that the top 10 upregulated genes in humans exhibited a higher magnitude of response upon PDCoV infection (more than a 10-fold change in transcriptional response) compared to the host response of pigs

(more than 2.7 –fold change) From these genes, Interferon Beta 1_Human (IFNB1),

Interferon-induced protein 44-like_Human (IFI44L), Interferon-induced GTP-binding protein_Pig (Mx1) and Inflammatory response protein 6_Pig (RSAD2) showed the highest logFC (Table 4).

Table 4. Top 20 differential expressed genes in humans and pigs

Gene Ensembl ID Symbol Species logFC logCPM P-value FDR Expression ENSG00000171855 IFNB1 Homo sapiens 10.58987 1.872253 3.75E-109 3.43E-106 Up ENSG00000137959 IFI44L Homo sapiens 7.578162 5.152949 1.81E-148 4.73E-145 Up ENSG00000182393 IFNL1 Homo sapiens 6.661177 0.255202 1.02E-36 1.23E-34 Up ENSG00000089127 OAS1 Homo sapiens 6.056076 3.024683 7.22E-99 5.49E-96 Up

ENSG00000197110 IFNL3 Homo sapiens 6.04519 -2.12294 4.45E-08 6.04E-07 Up

ENSG00000160505 NLRP4 Homo sapiens 6.964062 -1.43834 3.24E-12 8.50E-11 Up

ENSG00000171560 FGA Homo sapiens 6.937664 -1.46837 6.65E-11 1.49E-09 Up ENSG00000171557 FGG Homo sapiens 9.368923 0.642956 4.41E-48 7.67E-46 Up ENSG00000122025 FLT3 Homo sapiens 5.550694 1.443451 1.18E-49 2.17E-47 Up

ENSG00000146151 HMGCLL1 Homo sapiens -1.30409 1.70339 6.21E-07 6.71E-06 Down

ENSG00000241399 CD302 Homo sapiens -1.2697 1.857402 1.16E-05 9.19E-05 Down

ENSG00000188037 CLCN1 Homo sapiens -1.30207 -1.21937 0.022515 0.061942 Down

ENSG00000111775 COX6A1 Homo sapiens -2.15232 -1.40514 0.00119 0.005294 Down

48

ENSG00000115290 GRB14 Homo sapiens -1.29817 0.132904 0.000525 0.002599 Down

ENSG00000149084 HSD17B12 Homo sapiens -1.66029 -0.77379 0.007978 0.026197 Down

ENSG00000136542 GALNT5 Homo sapiens -1.64265 0.786164 9.36E-07 9.72E-06 Down

ENSG00000145757 SPATA9 Homo sapiens -2.11211 -1.72707 0.005062 0.018026 Down

ENSG00000124678 TCP11 Homo sapiens -2.03175 -1.04844 0.000526 0.002602 Down

ENSG00000169435 RASSF6 Homo sapiens -1.61692 -0.66667 0.001183 0.00527 Down

ENSG00000112175 BMP5 Homo sapiens -1.83933 0.846557 6.13E-06 5.24E-05 Down ENSSSCG00000012077 MX1 Sus scrofa 2.041311 2.158071 8.89E-06 0.005018 Up ENSSSCG00000008648 RSAD2 Sus scrofa 2.635093 0.56878 2.12E-07 0.000419 Up

ENSSSCG00000017705 CCL5 Sus scrofa 1.952532 -0.72996 0.00204 0.137896 Up

ENSSSCG00000007682 SH2B2 Sus scrofa 1.144246 -0.04913 0.013264 0.335123 Up

ENSSSCG00000016850 LIFR Sus scrofa 1.289662 -0.93357 0.047963 0.516218 Up

ENSSSCG00000005587 NEK6 Sus scrofa 1.359133 -0.15415 0.04673 0.510582 Up

ENSSSCG00000012520 BHLHB9 Sus scrofa 1.406445 -0.07198 0.00568 0.233698 Up ENSSSCG00000017416 DHX58 Sus scrofa 1.140561 0.216966 0.012688 0.32265 Up

ENSSSCG00000005311 CD72 Sus scrofa -1.83387 -0.28071 0.000267 0.046614 Down

ENSSSCG00000012959 CATSPER1 Sus scrofa -1.21347 -0.30285 0.032173 0.464939 Down

ENSSSCG00000020740 DSG4 Sus scrofa -1.43382 -0.29562 0.001842 0.133961 Down

ENSSSCG00000030368 HSP70.2 Sus scrofa -1.47794 -0.58141 0.015439 0.353103 Down

ENSSSCG00000027997 NME5 Sus scrofa -1.42071 -0.45481 0.005247 0.222052 Down

ENSSSCG00000030850 TEX11 Sus scrofa -1.23861 -0.81813 0.042827 0.497553 Down

ENSSSCG00000006648 CTSS Sus scrofa -1.71573 -0.60694 0.002659 0.158376 Down

Therefore, these results suggest that from the transcriptional level, humans are showing a more aggressive response to PDCoV infection in comparison to pigs; and the

Interferon Type I related to innate immune response was activated during PDCoV infection at 24 hpi. Recent studies in PK-15 cells infected by PDCoV are comparable with this result (Jiang, et al., 2019).

When cells are exposed to pathogens such as viruses, immune responses are induced as a way of host defense (Lim, et al., 2016). Similarly, during viral infections, 49

apoptosis is induced as one of the host antiviral responses in order to limit virus replication and production (Lim, et al., 2016). In this study, we were able to identify common pathways and genes related to innate associated response that were affected after PDCoV infection in humans and pigs. Our data suggests that eight immune – associated signaling pathways were commonly enriched in humans and pigs in response to PDCoV infection (Fig 14). In fact, some of these pathways including NOD-like receptor signaling pathway, JAK-STAT signaling pathway and the cytosolic DNA- sensing pathway play a pivotal role in innate immune response (Jiang, et al., 2019).

These pathways are known to perform an important function in the process how the interferon family, specifically type-I IFN, develops antiviral function in innate immune response (Jiang, et al., 2019). Additionally, we found that pathways such as the apoptosis signaling pathway, T-cell activation, interferon signaling pathway, interleukin signaling pathway, TGF-beta signaling pathway and Ras signaling pathway were affected during the alteration of immune response of HIEC and IPEC cells caused by PDCoV infection

(Fig 15, see Appendix C). Most of the DE genes in these pathways were significantly upregulated in humans and significantly downregulated in pigs at 24 hpi in comparison to cells without PDCoV infection (Fig 15). These results indicate that these pathways, affected during PDCoV infection, can enhance or inhibit immune response of IPEC and

HIEC cells and therefore, humans are responding differently to PDCoV infection in comparison to pigs. These results also showed that there are more genes affected in

Inflammation/cytokine signaling pathway in pigs and humans in comparison to the other nine pathways (Fig 15; see Appendix B).

50

Transcriptome studies related to NK cells have identified the expression of

Interferons (IFNs), tumor necrosis factor (TNF) and Inflammation/cytokine stimulation in the activation of NK cells under the presence of viruses (Campbell et al., 2015; Costanzo et al., 2018). Interestingly, these cells can increase up to 100- fold the INFs (Barber,

2001). In fact, we were able to identify similar responses in HIEC cells infected by

PDCoV. These results also indicate that the inflammation increase in HIEC cells faster than in IPEC cells and that HIEC cells can limit viral replication by releasing cytokines.

However, further studies are needed in order to determine if this is a host strategy called

“antiviral state” for preventing or limiting viral replication or a common complication produce by the virus, called “cytokine storm” that can lead to death in humans.

Moreover, it should be pointed out that despite the fact that the pig reference genome assembly has new annotations that represent a significant improvement over the previous pig genome annotations, they are still far from being complete (Beiki et al., 2019).

Therefore, the results of identifying common pathways and genes related to innate associated response that were affected after PDCoV infection in pigs will be limited in comparison to the finding in humans.

This study was also able to show that there are some common upregulated and downregulated DEGs in humans and pigs across 10 pathways as well as orthologs between these species (Figure 16 - 23; see Appendix B). Most of the common DE genes that were upregulated across the 10 pathways in humans and pigs are part of the NF- kappa-β transcription factor family, Interferon (IFNs) family and JAK/STAT family (see

Appendix C); protein kinase family such as MAPK kinase and RAF kinase (Fig 16 - 18); while most of the genes that were downregulated across the 10 pathways in humans and 51

pigs belong to the protein kinase family such as PIK3CB, SOS2 and MAPK family, and the FOS family of transcription factors (Fig 19 - 20). These results could indicate that these DE genes can play an important role in the PDCoV mechanism of infection in humans and pigs. With regards to this, it has been shown that the IFNs can be induced by a number of stimuli including viruses and dsRNA through mechanism that can involve the activation of NF-κB (Barber, 2001). After the secretion from these cells, these cytokines bind to specific cell-surface receptors and start the induction of a number of responsive genes via signaling through JAK-STAT signaling pathway (Barber, 2001).

The regulation of IFNs via this pathway leads to the phosphorylation of STAT1 and

STAT 2 and the recruitment of JAK1 and JAK2, which are genes that were upregulated in humans in this study and common in some pathways (Fig 16 - 18; See Appendix C).

Genes of the JAK/STAT family may be required for the optimal expression of several pro-apoptotic genes, suggesting that these genes may have multiple roles in the cell

(Barber, 2001).

Previous studies have demonstrated that PDCoV infection fails to induce IFN-β production in LLC-PK1 cells (Luo, et al., 2016). In this study, we found that I-IFN and

III-IFN production in IPEC cells was increased at 24 hpi (See Appendix B). We believe that the reason for this is because it has been shown that the I-IFN response might be inhibited at 36 hpi by PDCoV (Jiang, et al., 2019). Furthermore, type III IFNs have a unique tropism where their signaling and functions are restricted to epithelial cells

(Stanifer, Pervolaraki, & Boulant, 2019). We found that III-IFN production in HIEC cells, specifically receptor 2 (IFNGR2), was reduced at 24 hpi by

PDCoV (See Appendix B). Likewise, there is growing evidence that IFNs can be 52

activated by the mitogen activated protein kinase (MAPK) pathway and it has been recently shown that MAPKs are important for type III but not type I IFN in order to mediate antiviral protection in human intestinal epithelial cells (Stanifer, et al., 2019). In this study, we have identified several upregulated and downregulated DE genes that belong to the MAPK kinase family (See Appendix B). Finally, this study was also able to identify five orthologs DE genes that were upregulated in humans and downregulated in pigs in the Apoptosis signaling pathway (Fig 23). This can be due to the fact that apoptotic cell death induced by virus has a complex role in host defense and might promote the clearance of viruses or serve as a mechanism for virus-induced tissue damage and progression of disease (Kaminskyy & Zhivotovsky, 2010). Under these circumstances, we speculate that HIEC cells are using these set of DE genes to limit

PDCoV replication and production, while PDCoV is reducing the activation of the same set of DE genes and it is using this as a viral strategy in order to be successful in the progression of PDCoV infection in IPEC cells. Of note, three orthologs DE genes were upregulated in humans and pigs in Inflammation/cytokine signaling pathway and two orthologs DE genes were downregulated in humans and pigs in Ras signaling pathway

(Fig 21 - 22). All together, these results may suggest that there are some aspects that are similar in the immune associated response to PDCoV infection in humans and pigs.

Conclusions

In summary, we were able to explore the transcriptome of HIEC and IPEC cells after PDCoV infection going from reads to genes to pathways. Our data suggested that first, there are more differentially expressed genes in humans in comparison to pigs and as a result, more genes were upregulated in humans than in pigs at 24 hpi; secondly, from 53

the transcriptional level, humans are showing a more aggressive response to PDCoV infection in comparison to pigs and third, despite the fact that humans and pig share key immune associated DE genes and signaling pathways, PDCoV is using different strategies in order to be successful in the progression of PDCoV infection.

These data can also provide an important foundation that contributes to an understanding of the mechanisms of CoV cross species transmission. Based on learning lessons from SARS and MERS outbreaks, there is lack of drugs capable of controlling

CoV viral activity. Therefore, we believe that these data can be used for the development of novel therapeutics and strategies to deter the virus from spreading in pigs and subsequently to humans. To our knowledge, this is the first report of transcriptome analysis of human cells infected by PDCoV.

54

References

Barber, G. N. (2001). Host defense, viruses and apoptosis. Cell Death Differ, 8(2), 113- 126.

Beiki, H., Liu, H., Huang, J., Manchanda, N., Nonneman, D., Smith, T. P. L., et al. (2019). Improved annotation of the domestic pig genome through integration of Iso-Seq and RNA-seq data. BMC Genomics, 20(1), 344.

Boley, P. A., Alhamo, M. A., Lossie, G., Yadav, K. K., Vasquez-Lee, M., Saif, L. J., et al. (2020). Porcine Deltacoronavirus Infection and Transmission in Poultry, United States(1). Emerg Infect Dis, 26(2), 255-265.

Campbell, A. R., Regan, K., Bhave, N., Pattanayak, A., Parihar, R., Stiff, A. R., et al. (2015). Gene expression profiling of the human natural killer cell response to Fc receptor activation: unique enhancement in the presence of interleukin-12. BMC Med Genomics, 8, 66.

Cavanagh, D. (1997). : a new order comprising Coronaviridae and Arteriviridae. Arch Virol, 142(3), 629-633.

Chen, Q., Gauger, P., Stafne, M., Thomas, J., Arruda, P., Burrough, E., et al. (2015). Pathogenicity and pathogenesis of a United States porcine deltacoronavirus cell culture isolate in 5-day-old neonatal piglets. Virology, 482, 51-59.

Chen, Y., Lun, A. T., & Smyth, G. K. (2016). From reads to genes to pathways: differential expression analysis of RNA-Seq experiments using Rsubread and the edgeR quasi-likelihood pipeline. F1000Res, 5, 1438.

Conesa, A., Madrigal, P., Tarazona, S., Gomez-Cabrero, D., Cervera, A., McPherson, A., et al. (2016). A survey of best practices for RNA-seq data analysis. Genome Biol, 17, 13.

Costanzo, M. C., Kim, D., Creegan, M., Lal, K. G., Ake, J. A., Currier, J. R., et al. (2018). Transcriptomic signatures of NK cells suggest impaired responsiveness in HIV-1 infection and increased activity post-vaccination. Nat Commun, 9(1), 1212.

Daniel Blanco-Melo, B. E. N.-P., Wen-Chun Liu, Rasmus Moller, Maryline Panis, David Sachs, Randy A. Albrecht, Benjamin R. tenOever. (2020). SARS-CoV-2 launches a unique transcriptional signature from in vitro, ex vivo, and in vivo systems. bioRxiv. 55

Gorbalenya, A. E., Enjuanes, L., Ziebuhr, J., & Snijder, E. J. (2006). Nidovirales: evolving the largest RNA virus genome. Virus Res, 117(1), 17-37.

Guezguez, A., Pare, F., Benoit, Y. D., Basora, N., & Beaulieu, J. F. (2014). Modulation of stemness in a human normal intestinal epithelial crypt cell line by activation of the WNT signaling pathway. Exp Cell Res, 322(2), 355-364.

Homwong, N., Jarvis, M. C., Lam, H. C., Diaz, A., Rovira, A., Nelson, M., et al. (2016). Characterization and evolution of porcine deltacoronavirus in the United States. Prev Vet Med, 123, 168-174.

Hou, Y., & Wang, Q. (2019). Emerging Highly Virulent Porcine Epidemic Diarrhea Virus: Molecular Mechanisms of Attenuation and Rational Design of Live Attenuated Vaccines. Int J Mol Sci, 20(21).

Hu, H., Jung, K., Vlasova, A. N., Chepngeno, J., Lu, Z., Wang, Q., et al. (2015). Isolation and characterization of porcine deltacoronavirus from pigs with diarrhea in the United States. J Clin Microbiol, 53(5), 1537-1548.

Hu, H., Jung, K., Vlasova, A. N., & Saif, L. J. (2016). Experimental infection of gnotobiotic pigs with the cell-culture-adapted porcine deltacoronavirus strain OH-FD22. Arch Virol, 161(12), 3421-3434.

Ibrahim, B., McMahon, D. P., Hufsky, F., Beer, M., Deng, L., Mercier, P. L., et al. (2018). A new era of virus bioinformatics. Virus Res, 251, 86-90.

Jiang, S., Li, F., Li, X., Wang, L., Zhang, L., Lu, C., et al. (2019). Transcriptome analysis of PK-15 cells in innate immune response to porcine deltacoronavirus infection. PLoS One, 14(10), e0223177.

Jimenez-Jacinto, V., Sanchez-Flores, A., & Vega-Alvarado, L. (2019). Integrative Differential Expression Analysis for Multiple EXperiments (IDEAMEX): A Web Server Tool for Integrated RNA-Seq Data Analysis. Front Genet, 10, 279.

Jung, K., Hu, H., Eyerly, B., Lu, Z., Chepngeno, J., & Saif, L. J. (2015). Pathogenicity of 2 porcine deltacoronavirus strains in gnotobiotic pigs. Emerg Infect Dis, 21(4), 650-654.

Jung, K., Hu, H., & Saif, L. J. (2016). Porcine deltacoronavirus infection: Etiology, cell culture for virus isolation and propagation, molecular epidemiology and pathogenesis. Virus Res, 226, 50-59.

Jung, K., Miyazaki, A., Hu, H., & Saif, L. J. (2018). Susceptibility of porcine IPEC-J2 intestinal epithelial cells to infection with porcine deltacoronavirus (PDCoV) and

56

serum cytokine responses of gnotobiotic pigs to acute infection with IPEC-J2 cell culture-passaged PDCoV. Vet Microbiol, 221, 49-58.

Kaminskyy, V., & Zhivotovsky, B. (2010). To kill or be killed: how viruses interact with the cell death machinery. J Intern Med, 267(5), 473-482.

Kirk, M. D., Pires, S. M., Black, R. E., Caipo, M., Crump, J. A., Devleesschauwer, B., et al. (2015). Correction: World Health Organization Estimates of the Global and Regional Disease Burden of 22 Foodborne Bacterial, Protozoal, and Viral Diseases, 2010: A Data Synthesis. PLoS Med, 12(12), e1001940.

Lai, M. M. (1992). RNA recombination in animal and plant viruses. Microbiol Rev, 56(1), 61-79.

Lever, J., Krzywinski, M and Altman, N. (2017). Principal component analysis. Nature Methods, 14.

Li, W., Hulswit, R. J. G., Kenney, S. P., Widjaja, I., Jung, K., Alhamo, M. A., et al. (2018). Broad receptor engagement of an emerging global coronavirus may potentiate its diverse cross-species transmissibility. Proc Natl Acad Sci U S A, 115(22), E5135-E5143.

Lim, Y. X., Ng, Y. L., Tam, J. P., & Liu, D. X. (2016). Human Coronaviruses: A Review of Virus-Host Interactions. Diseases, 4(3).

Luo, J., Fang, L., Dong, N., Fang, P., Ding, Z., Wang, D., et al. (2016). Porcine deltacoronavirus (PDCoV) infection suppresses RIG-I-mediated interferon-beta production. Virology, 495, 10-17.

Ma, Y., Zhang, Y., Liang, X., Lou, F., Oglesbee, M., Krakowka, S., et al. (2015). Origin, evolution, and virulence of porcine deltacoronaviruses in the United States. mBio, 6(2), e00064.

Marthaler, D., Jiang, Y., Collins, J., & Rossow, K. (2014). Complete Genome Sequence of Strain SDCV/USA/Illinois121/2014, a Porcine Deltacoronavirus from the United States. Genome Announc, 2(2).

McBride, R., van Zyl, M., & Fielding, B. C. (2014). The coronavirus nucleocapsid is a multifunctional protein. Viruses, 6(8), 2991-3018.

Mutz, K. O., Heilkenbrinker, A., Lonne, M., Walter, J. G., & Stahl, F. (2013). Transcriptome analysis using next-generation sequencing. Curr Opin Biotechnol, 24(1), 22-30. Niederwerder, M. C., & Hesse, R. A. (2018). Swine enteric coronavirus disease: A review of 4 years with porcine epidemic diarrhoea virus and porcine 57

deltacoronavirus in the United States and Canada. Transbound Emerg Dis, 65(3), 660-675.

Paarlberg, P. (2014). Updated Estimated Economic Welfare Impacts Of Porcine Epidemic Diarrhea Virus (PEDV). Working Papers

Ringner, M. (2008). What is principal component analysis? Nat Biotechnol, 26(3), 303- 304.

Robinson, M. D., & Oshlack, A. (2010). A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol, 11(3), R25.

Schuierer, S., Carbone, W., Knehr, J., Petitjean, V., Fernandez, A., Sultan, M., et al. (2017). A comprehensive assessment of RNA-seq protocols for degraded and low-quantity samples. BMC Genomics, 18(1), 442.

Schulz, L. L., & Tonsor, G. T. (2015). Assessment of the economic impacts of porcine epidemic diarrhea virus in the United States. J Anim Sci, 93(11), 5111-5118.

Stanifer, M. L., Pervolaraki, K., & Boulant, S. (2019). Differential Regulation of Type I and Type III Interferon Signaling. Int J Mol Sci, 20(6). van Dijk, E. L., Jaszczyszyn, Y., & Thermes, C. (2014). Library preparation methods for next-generation sequencing: tone down the bias. Exp Cell Res, 322(1), 12-20. Van Verk, M. C., Hickman, R., Pieterse, C. M., & Van Wees, S. C. (2013). RNA-Seq: revelation of the messengers. Trends Plant Sci, 18(4), 175-179.

Vergauwen, H. (2015). The IPEC-J2 Cell Line. 125-134.

Vlasova, A. N., Saif LJ. (2013). Biological Aspects of the Interspecies Transmission of Selected Coronavirus. In S. K. Singh (Eds.)

Wang, L., Byrum, B., & Zhang, Y. (2014). Detection and genetic characterization of deltacoronavirus in pigs, Ohio, USA, 2014. Emerg Infect Dis, 20(7), 1227-1230.

Wang, Z., Gerstein, M., & Snyder, M. (2009). RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet, 10(1), 57-63.

Woo, P. C., Huang, Y., Lau, S. K., & Yuen, K. Y. (2010). Coronavirus genomics and bioinformatics analysis. Viruses, 2(8), 1804-1820.

Woo, P. C., Lau, S. K., Lam, C. S., Lau, C. C., Tsang, A. K., Lau, J. H., et al. (2012). Discovery of seven novel Mammalian and avian coronaviruses in the genus deltacoronavirus supports bat coronaviruses as the gene source of

58

alphacoronavirus and betacoronavirus and avian coronaviruses as the gene source of gammacoronavirus and deltacoronavirus. J Virol, 86(7), 3995-4008.

Woolhouse, M., Scott, F., Hudson, Z., Howey, R., & Chase-Topping, M. (2012). Human viruses: discovery and emergence. Philos Trans R Soc Lond B Biol Sci, 367(1604), 2864-2871.

Zhang, J. (2016). Porcine deltacoronavirus: Overview of infection dynamics, diagnostic methods, prevalence and genetic evolution. Virus Res, 226, 71-84.

59

Appendix A: IF staining of the inoculated IPEC cells at 24 hpi

PDCoV positive by IF DAPI staining of Panel A Overlap of Panels A and B staining C A B

IPEC Negative control DAPI staining of Panel D Overlap of Panels D and E

D E F

IF staining of the inoculated IPEC cells at 24 hpi. A) IF stained positive (green fluorescence) for PDCoV antigen. B) Blue-fluorescent 4’,6-diamidino-2-phenylindole dihydrochloride (DAPI) staining of Panel A to stain nuclear DNA. C) Overlap of panels

A and B. The arrowheads indicated clustered of cells positive for PDCoV antigen. D) IF staining of PDCoV-uninoculated (0.05% Trypsin). E) DAPI staining of Panel D. F)

Overlap of panels D and E.

60

Appendix B: Common upregulated and downregulated DEGs in humans and pigs at 24 hpi

A. Common upregulated DEGs in humans and pigs at 24 hpi Entry Description Human Pig hsa04210 Apoptosis signaling pathway MAP3K14,MAP4K4,TRAF2,TNFSF10,BCL2L1,FOS,ATF3,BIRC3,NFKBIA,PRKCG,REL BOK,CASP7,LOC100622859,NFRSF1A,MCL1,MAP4K4,NFKB2 ,RELB,JUN,EIF2AK2,ATF4,NFKB2,MCL1,BAG1,HSPA5,MAP2K3,MAP3K1,RIPK1,PI K3CD,JDP2,CASP7,NFKB1,TNFRSF10A,BIRC2,DAXX,MAP2K7,RELA,HSPA1B,FAS, CFLAR,BAX hsa04662 B cell activation FOS,CD22,VAV3,RAF1,PLCG2,ITPR1,PTPRC,NFKB1,ITPR3,NFKBIA,JUN,NFATC2, NFKB2,HRAS NFATC1,NFKBIL1,CD79A,PIK3CD,NFKB2 hsa04060 Inflammation/cytokine signaling pathway IFNB1, IFNL1, IFNL3,ITGAL, MAP3K4, C5AR1, IL6, INPPL1,CCL5,GRK6,IFNGR2,NFKB2,PTGS1,SOCS4,GNAI3,CXCL8, MYH3,NFAT5,RAF1,RELB,RGS4,PAK6,ITGB2,PTGS2,ITGA9,PLCG2,JUND,ADCY5,IT NFKB2,NFKBIE,LIFR PR1,PAK5,PRKX,GRK4,PLCD1,NFKB1,CXCL8,NFKBIA,SOCS7,JUN,REL,NFATC2,NF ATC1,RELA,SOCS6,STAT1,NFKBIE,GNAO1,JAK2,IL15,NFKB2,CXCR4 hsa04060 Interferon signaling pathway IFNB1, IFNL1, IFNL3, SOCS3, SOCS2, PIAS4, SOCS7, SOCS6, STAT1,MAPK3,JAK2 MX1,IFNGR2,SOCS4,MAPK7 hsa04657 Interleukin signaling pathway IL11,MAP3K4,IL23A,IL6,FOS,RAF1,STAT5A,MAPK6,MYC,IRS1,IL15RA,IL6R,CDKN1 CXCL8,MAPK7 B,CXCL8,STAT2,RPS6KA1,STAT4,STAT5B,MKNK2,IRS2,FOXO3,IL15,CDKN1A,MAP KAPK2 hsa04630 JAK/STAT signaling pathway IFNB1, IFNL1, IFNL3,STAT5A,PTPRC,STAT2,PIAS4,STAT4,STAT5B,STAT1,JAK2 LIFR hsa04660 T cell activation PIK3R1,PIK3R3,FOS,CDC42,VAV3,RAF1,RIPK1,PTPRC,NFKB1,MAP3K1,NFKBIA,JU NFKB2,HRAS,LOC100622663 N,NFATC2,NFATC1,CD80,PIK3CD,WIPF2,NFKB2 hsa04350 TGF-beta signaling pathway TNFRSF10B,ACVR1,KRAS,GDF9,FOSL1,CITED4,ACVR1C,JUND,ALG5,DCP1B,INHB INHBB, INHBA,ALG5,HRAS,SMAD7,MYBPC2 A,CREBBP,BMP2,BAMBI,ACVR1B,JUN,SMAD3,SMURF1,SNIP1,DCP1A,SMAD7,EP 300,ACVR2B hsa04620 Toll like receptor signaling pathway TNFAIP3,IFNB1,ECSIT,LY96,TOLLIP,PTGS2,NFKB1,TICAM1,MAP3K1,NFKBIA,MAP NFKB2,TNFAIP3,NFKBIE,RSAD2 3K8,MYD88,JUN,REL,RELA,MAP2K3,TIRAP,CD14,NFKBIE,TLR3,IRF7,NFKB2 hsa04014 Ras signaling pathway MAP3K4,RHOB,RAF1,ETS1,MAP3K1,JUN,RPS6KA1,MAP2K7,STAT1,PIK3CD,PC00 HRAS 167,MAPKAPK2

*Common genes across pathways and between humans and pigs are highlighted in red.

B.. Common downregulated DEGs in humans and pigs at 24 hpi

Entry Description Human Pig hsa04210 Apoptosis signaling pathway PIK3CB,MAPK3,PIK3CA,MAP2K4,IGF2R,BAG4,ATF6,ATF6B,GHITM,TMBIM6,MAP MAP3K5,FOS,BCL2L11,HSP70.2,REL,MAP3K14,LOC100622663 4K2,MAPK9,MCM5,BCL2L2,HSPA8,MAPK10,HSPA2 hsa04662 B cell activation CALM3,PPP3CB,CDC42,FRK,RAC1,MAPK14,PIK3CA,MAPK10,MAPK3,SOS2,PPP3 FOS,LOC100622663,SOS1,REL CA,MAPK9,CALM3,PIK3CB hsa04060 Inflammation/cytokine signaling pathway GNG2,PRKACA,BCL3,PLCB1,MYLK3,PRKACB,KRAS,CDC42,GNG12,ITGB1,PLCH1,A RPC2,VWA2,IFNAR1,GNAI1,ACTG1,GNB3,RAC1,COL14A1,ARPC5,PAK1,ACTG2,A MYH7B,CAMK2D,COL14A1,PLA2G4B,LOC100622663,SOS1,NFAT5 CTB,ADCY6,RGS18,CCR3,MAPK3,IFNGR2,ITGA4,PIK3CB,COL12A1,COL6A1 hsa04060 Interferon signaling pathway MAPK14, MAPK10,IFNGR2,MAPK9,PTPN11,IL12RB1 PIAS1 hsa04657 Interleukin signaling pathway IL13RA1,RPS6KA6,RPS6KA2,GSK3B,PIK3CA,MAPK3,SOS2,IL11RA,PIK3CB FOS,IL1A,LOC100622663,SOS1 hsa04630 JAK/STAT signaling pathway MAPK14 PIAS1 hsa04660 T cell activation CALM3,PPP3CB,RAC1,PIK3C3,PAK1,HLA_CLASSII,PIK3CA,MAPK10,MAPK3,HLA- FOS,SOS1 DMA,SOS2,PPP3CA,MAPK9,CALM3,PIK3CB hsa04350 TGF-beta signaling pathway BMP5,BMPR1A,TGFBR2,BMP1,GDF11,BMPR1B,MAPK14,MAPK10,MAPK3,INHBB ,MAPK9 hsa04620 Toll like receptor signaling pathway IRAK4,UBE2N,MAPK14,MAPK10,MAPK3,MAPK9 TIRAP,REL hsa04014 Ras signaling pathway KRAS,CDC42,RPS6KA6,RAC1,PLD1,RPS6KA2,RHOA,PIK3C3,PAK1,GSK3B,MAP2K4 PLD1,LOC100622663,SOS1 ,MAP2K3,MAPK14,PIK3CA,MAPK10,MAPK3,SOS2,MAP2K6,MAPK9,PIK3CB

Common genes across pathways and between humans and pigs are highlighted in blue.

61

Appendix C: Upregulated and downregulated DEGs in humans and pigs in 3 Pathways

A. Upregulated and downregulated DEGs in humans_ Cytokine – Cytokine Receptor Interaction Pathway

B. Upregulated and downregulated DEGs in pigs_ Cytokine – Cytokine Receptor Interaction Pathway

62

C. Upregulated and downregulated DEGs in humans_ Apoptosis signaling pathway

D. Upregulated and downregulated DEGs in pigs_ Apoptosis signaling pathway

63

E. Upregulated and downregulated DEGs in humans_ JAK-STAT signaling pathway

F. Upregulated and downregulated DEGs in pigs_ JAK-STAT signaling pathway

64