bioRxiv preprint doi: https://doi.org/10.1101/2020.03.18.997379; this version posted March 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Rewiring of expression in PWBC at artificial insemination is predictive of pregnancy 2 outcome in cattle (Bos taurus)

3

4 Sarah E. Moorey1, Bailey N. Walker2, Michelle F. Elmore3,4, Joshua B. Elmore4, Soren P. 5 Rodning3 and Fernando H. Biase2

6

7 1 Department of Animal Science, University of Tennessee, Knoxville, TN; 2 Department of 8 Animal and Poultry Sciences, Virginia Polytechnic Institute and State University, Blacksburg, 9 VA, USA; 3 Department of Animal Sciences, Auburn University, Auburn, AL, USA; 4 Alabama 10 Cooperative Extension System, AL, USA.

11 Corresponding author:

12 Fernando Biase

13 175 West Campus Drive, Blacksburg, VA 24061. Phone 540-231-9520. Email [email protected] 14 ORCID ID https://orcid.org/0000-0001-7895-0223

15

1 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.18.997379; this version posted March 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

16 ABSTRACT

17 Infertility is a disease that affects humans and cattle in similar ways. The resemblance includes 18 complex genetic architecture, multiple etiology, low heritability of fertility related traits in females, 19 and the frequency in the female population. Here, we used cattle as a biomedical model to test 20 the hypothesis that profiles of -coding expressed in peripheral 21 white blood cells (PWBCs), and circulating micro RNAs in plasma, are associated with female 22 fertility, measured by pregnancy outcome. We drew blood samples from 17 female calves on the 23 day of artificial insemination and analyzed transcript abundance for 10496 genes in PWBCs and 24 290 circulating micro RNAs. The females were later classified as pregnant to artificial 25 insemination, pregnant to natural breeding or not pregnant. We identified 1860 genes producing 26 significant differential coexpression (eFDR<0.002) based on pregnancy outcome. Additionally, 27 237 micro RNAs and 2274 genes in PWBCs presented differential coexpression based on 28 pregnancy outcome. Furthermore, using a machine learning prediction algorithm we detected a 29 subset of genes whose abundance could be used for blind categorization of pregnancy outcome. 30 Our results provide strong evidence that bloodborne transcript abundance is highly associated 31 with fertility in females.

2 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.18.997379; this version posted March 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

32 INTRODUCTION

33 The Word Health Organization considers Infertility to be a “disease of the reproductive 34 system”1. Infertility is a common problem in humans and agriculturally important large mammals. 35 For instance, prevalence of infertility may reach 16%2 and 15%3 in humans and beef female 36 calves, respectively. Similar to what has been observed in humans4, the genetic architecture of 37 fertility in beef heifers is very complex5,6,7,8,9,10,11,12. In line with this complexity, female reproductive 38 traits have low heritability. The heritability of traits related to female reproductive fitness (i.e.: birth 39 rate and age at last reproduction) in humans ranges from 0.05 to 0.2813. Likewise, reproductive 40 related traits in cattle such as heifer pregnancy and first service conception range from 0.07 to 41 0.205,14,15,16,17,18,19 and 0.03 to 0.185,14,17, respectively. Due to genetic and physiological 42 resemblance with human, cattle have been identified as an important biomedical model for 43 dissecting female fertility traits20,21.

44 Beyond the importance as a biomedical model, cattle production systems provide 45 approximately 28%22 of the protein supply globally. Improving cattle production efficiency is 46 essential for farmers to attain sustainable production and support the growing demand for animal 47 protein22. Infertility is a major factor that hinders efficiency in cattle production, and it starts with 48 limited success of pregnancy in young female calves. First breeding success greatly influences 49 the lifetime efficiency of beef replacement heifers. Heifers that calve early in their first calving 50 season experience increased productivity and longevity than their later calving herd 51 mates23,24,25,26. Furthermore, the genetic correlation between yearling pregnancy rate and lifetime 52 pregnancy rate is high (0.92-0.97)27,28. Therefore, the ability to identify heifers that experience 53 optimal fertility during the first breeding is essential to the sustainability of beef cattle production 54 systems. 55 The examination of the genetic components of fertility in beef heifers have yielded several 56 genes potentially associated with fertility traits5,6,7,8,9,10,11,12, but the effect of these markers are 57 minimal, and there is no clear redundancy in genetic markers identified across breeds. Beyond 58 the genomic profiling, the analysis of multiple layers of an individual’s molecular blueprint is likely 59 key for understanding the underlying biology of complex traits29. In line with this rationale, 60 expression-trait association studies have emerged as a means to better understand complex 61 traits30,31. Specifically related to fertility, there is evidence that circulating micro RNA (miRNA) 62 profiles are indicative of reproductive function32. Furthermore, we have previously identified 63 differences in the transcriptome of peripheral white blood cells (PWBC) among heifers of different 64 fertility potential33.

3 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.18.997379; this version posted March 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

65 In the present study, we set out to investigate gene transcripts in PWBCs and circulating 66 micro RNA (miRNA) profiles in heifers enrolled in fixed time artificial insemination (AI). We utilized 67 a multi-dimensional approach to investigate differences in coexpression of protein-coding genes 68 in PWBCs, and coexpression between circulating miRNA and mRNAs in PWBCs. We also show 69 that transcript abundance in PWBCs can be successfully utilized for the prediction of heifer 70 reproductive outcome. 71

72 RESULTS

73 Overview of the experiment and data generated

74 We generated transcriptome data for mRNA from PWBC and small RNA from plasma 75 collected at the time of AI from heifers that became pregnant to AI (AI-preg, n=6), pregnant later 76 in the breeding season from natural breeding with bulls NB (NB, NB-preg, n=6), or failed to 77 become pregnant (not-pregnant, n=5, Fig. 1a). We produced, on average, 25.8 and 7.1 million 78 reads from mRNA and small RNA, respectively. Following the filtering for lowly expressed genes, 79 10496 protein coding genes in PWBC and 290 miRNAs in plasma samples were used for 80 differential gene expression, coexpression and prediction analyses (Fig. 1a). Notably, three of the 81 five not-pregnant samples clustered separately from other subjects based on the expression 82 levels of protein-coding genes. Contrariwise, there was no separation of samples based on the 83 miRNA expression levels (Fig. 1c).

84 Differential mRNA:mRNA coexpression associated with pregnancy outcome

85 First, we asked whether there was a general coexpression profile of the protein-coding 86 genes expressed in PWBCs amongst all 17 samples. We identified that 1633 genes that formed 87 2017 pairs with highly significant correlated expression levels (|r|>0.98, FDR<0.02, Fig. 2a, 88 Supplementary Table 1). Notably, ~99% of those significant correlations were positive (Fig. 2a). 89 Several genes showing coexpression were associated with ‘translation’, ‘positive regulation of 90 gene expression’, ‘ differentiation’, and ‘negative regulation of extrinsic apoptotic signaling 91 pathway’ amongst other biological processes (FDR<0.1, Supplementary Table 2). This result 92 indicates the occurrence of gene regulatory networks in PWBCs that are biologically meaningful 93 for the .

94 Next, we investigated if the pair-wise gene coexpression differs between animals that 95 became pregnant relative to those that did not become pregnant. We identified 1860 genes with 96 significant differential correlation based on the pregnancy outcome (eFDR<0.002, Fig. 2b,c,

4 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.18.997379; this version posted March 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

97 Supplementary Table 3, Supplementary Fig. 1). Of these 1860 genes, 963 genes formed 716 98 negative, 1053 genes formed 708 positive, four genes lost two positive, and 40 genes formed 23 99 inverted coexpression connections in not-preg heifers relative to both AI-preg and NB-preg 100 heifers. Among the 963 genes that formed new negative coexpression connections in not-preg 101 heifers, there was enrichment of the biological processes “proteolysis” (n=38 genes) and 102 “translation” (n=52 genes, FDR<0.1, see Fig. 2d for selected genes and Supplementary Table 4 103 for list of genes). Five of the 40 genes that formed inverted coexpression connection in not-preg 104 heifers relative to the pregnant counterparts were associated with “negative regulation of 105 transcription by RNA polymerase II” (FDR<0.1, see Fig. 2d for selected genes and Supplementary 106 Table 5 for a list of genes). These results indicate that physiological factors associated with fertility 107 or infertility act on the transcription protein-coding genes in PWBCs.

108 Differential miRNA:mRNA coexpression associated with pregnancy outcome

109 The generation of data from small RNA (from plasma) and mRNA (from PWBCs) from the 110 same subjects (n=17) permitted us to ask whether circulating miRNAs have correlated expression 111 with protein-coding genes expressed in the PWBCs. We identified 141 pairs of miRNA:mRNA 112 with |r|>0.85 (eFDR<0.0001, Supplementary Fig. 2) that were formed by 106 protein-coding genes 113 and 33 miRNAs (Supplementary table 6). Among the 106 protein-coding genes existing in high 114 correlation with miRNA, we identified three significantly enriched biological processes, namely: 115 “peptidyl-threonine phosphorylation”, “cilium assembly” and “regulation of gene expression” (Fig. 116 3a, Supplementary table 7). Of notice, 30 of the 141 miRNA:mRNA pairs that presented negative 117 correlation between the expression levels have been previously identified as interacting pairs in 118 the miRWalk database (Fig. 3b). Additionally, bta-mir-744 and bta-mir-320a emerged as the two 119 most connected miRNAs (13 and 6 connections, respectively). It is evident that circulating 120 miRNAs have regulatory roles over transcripts of protein-coding genes expressed in the PWBCs.

121 We also interrogated miRNA:mRNA pairs for differential coexpression associated with 122 pregnancy outcome. We identified 2330 and 2738 miRNA:mRNA pairs with gain of negative and 123 positive correlated expression, respectively, (|r|>0.9, eFDR≤0.01, Supplementary Fig. 3) in not- 124 pregnant heifers that were not identified compared to in pregnant heifers (both AI-preg and NB- 125 preg, Fig. 3c). In addition, there were 51, two and three connections showing inverted, loss of 126 negative and loss of positive correlations in not-pregnant heifers relative to the pregnant (both AI- 127 preg and NB-preg) counterparts (Fig. 3c, see Supplementary table 8 for genes). Among the 1220 128 protein-coding genes associated with gain of negative correlation with miRNAs, there were 129 biological processes significantly enriched, namely: “immunoglobulin mediated immune

5 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.18.997379; this version posted March 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

130 response”, “negative regulation of DNA-binding transcription factor activity”, “ 131 chemotaxis” and “mitochondrial respiratory chain complex I assembly” (FDR<0.1, Supplementary 132 table 9). Among the 1288 genes associated with gain of positive correlation with miRNAs, the 133 following biological processes were significantly enriched: “regulation of DNA replication” and 134 “cellular response to BMP stimulus” (FDR<0.1, Supplementary table 10). These results indicate 135 that several miRNA and mRNA pairs have their connections altered in relation to the heifer’s 136 fertility potential.

137 Differential gene expression associated with pregnancy outcome

138 Differential gene expression analysis revealed 67 (55 up- and 12 down-regulated) and 81 139 (63 up- and 18 down-regulated) protein-coding genes with altered expression profiles between 140 PWBC of AI-preg versus not-pregnant, and NB-preg versus not-pregnant heifers, respectively 141 (eFDR≤0.05, Fig. 4a, Supplementary tables 11 and 12, Supplementary Fig. 4). Notably, 26 genes 142 were common between both pregnant groups versus the not-pregnant group (Fig. 4b). Gene 143 ontology and pathway analyses of differentially expressed genes between AI-preg and Not- 144 pregnant heifers revealed significant enrichment for the terms “signal transduction” (ARAP3, 145 LSP1, NFAM1, OR12D2, RARA; FDR<0.05). When differentially expressed genes between NB- 146 preg and not-pregnant heifers were analyzed, significant enrichment was determined for the terms 147 “signal transduction” (ARAP3, HRH2, LSP1, TIAM2, TNFRSF17, TNFRSF1A; FDR=0.062) and 148 the KEGG Pathway “cytokine-cytokine receptor interaction” (IFNGR2, IL5RA, LTB, TNFRSF17, 149 TNFRSF1A; FDR=0.0045). Plasma miRNA profiles were very similar among samples from all 150 pregnancy outcomes with the exception of miR-11995 that was more abundant in AI-preg versus 151 not-pregnant heifers. These results indicate that quantitative differences exist in transcript 152 abundance of protein-coding genes expressed in PWBC relative to the pregnancy outcome.

153 Next, we asked if differential gene expression could be associated with differential 154 miRNA:mRNA coexpression. An intersection of the results identified 12 differentially expressed 155 genes (10 annotated genes) in not-pregnant heifers relative to the AI-preg and NB-preg groups 156 that also gained coexpression with 27 miRNAs (Fig. 4c). Notably, all genes that had higher 157 expression in not-pregnant heifers (CEACAM4, FCGR2, ISG20, JAML, LSP1, NCF1, NFE2, 158 PLEKHO1) showed a gain in negative coexpression with specific miRNAs. By contrast MGAT4D 159 was lower expressed in not-pregnant heifers and gained a positive coexpression with a miRNA in 160 not-pregnant heifers.

161 Assessment of transcript levels as predictors of pregnancy outcome

6 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.18.997379; this version posted March 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

162 Motivated by the observations that several genes have their abundance altered in 163 relationship to the fertility potential pregnancy (Fig. 4a), we asked whether transcript abundances 164 could serve as predictor features across datasets. We generated gene expression values for a 165 similar dataset we produced previously (GSE10362834), which also consisted of transcriptome 166 data from PWBCs collected from heifers at the time of AI and were classified according to their 167 pregnancy outcome as AI-preg (n=6) or not-pregnant (n=6). This dataset was particularly fitting 168 for this interrogation because the sampling occurred one year earlier on the same site. We 169 combined the expression values of the previously produced dataset (year one) with those of the 170 current dataset (year two) and contrasted the transcript abundance between heifers that were 171 categorized as AI-preg versus not pregnant accounting for the different years as a fixed effect in 172 the model. This analysis identified 198 genes whose abundance were significantly different 173 between the two experimental groups (AI-preg, not pregnant, FDR<0.03, Fig. 5a, Supplementary 174 table 13).

175 We used the transcript abundance values for these 198 genes obtained from samples 176 collected in year one as input (training set) for machine learning algorithms to derive models of 177 prediction to discriminate pregnancy outcome (AI-preg or not-pregnant, Fig. 5b). Next, we used 178 the transcript abundance values for these 198 genes obtained from samples collected in year two 179 as input to the algorithms for a blind classification of pregnancy outcome (AI-preg or not-pregnant, 180 Fig. 5c). The accuracy of classifications was assessed based on the known pregnancy outcome, 181 and parallelized random forest was the algorithm that produced the greatest proportion of correct 182 classification of all 11 heifers tested (46.3% out of 2000 randomizations), followed by 53.1% 183 correct classification for 10 out of 11 heifers (Fig. 5d). Notably, the top five genes that showed the 184 largest degree of importance for the classification were RPL39, SMIM26, LONRF3, GATA3, 185 N6AMT1 (Fig. 5e), of which, RPL39, SMIM26, N6AMT1 were up-regulated and LONRF3 and 186 GATA3 were down-regulated in AI-preg heifers relative to the not- pregnant group (Fig. 5f).

187

188 DISCUSSION

189 Here, we report a multilayered transcriptome analysis of bloodborne transcripts sampled 190 on the day of AI. The main objective of our study was to understand gene regulatory networks35 191 occurring in PWBCs, and rewiring that can occur based on the heifer’s fertility potential. We 192 focused on the day of artificial insemination because it is the last day prior to the fertilization, when 193 the heifer has the first opportunity to become pregnant. We interrogated mRNA and miRNA

7 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.18.997379; this version posted March 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

194 abundance levels obtained from the same individuals using the coexpression framework, which 195 is a valuable systems biology approach to gain insights into the biology of complex traits35. 196 Furthermore, using machine learning36, we detected consistent altered gene transcript abundance 197 associated with infertility across independent datasets. Collectively, our results provide important 198 aspects of regulatory networks in PWBCs that shed light on the physiological status of an animal 199 and its preparedness for becoming pregnant.

200 Similar to work reported in humans37, our gene network analysis focusing on mRNAs 201 detected several protein-coding genes with strong coexpression in PWBCs. Interestingly, nearly 202 all of the correlated expression inferred as statistically significant across all heifers (|r|>0.98, 203 FDR<0.02) were positive. As expected38, in silico functional gene co-expression analysis 204 indicated groups of genes that are linked to specific biological processes. A contrast of 205 coexpressing pairs of genes between samples collected from not-pregnant heifers and the two 206 groups of heifers that became pregnant displayed a large number of connections present in not- 207 pregnant heifers that were not identified in both AI-preg and NB-preg heifers. Our strict criteria 208 (|r|>0.99, eFDR<0.002) utilized to identify these altered connections indicate functional regulatory 209 relationships between selected genes.

210 Surprisingly, a great majority of the altered connections between gene pairs were present 211 gained in not-pregnant heifers, while there was minor loss of significant connections in not- 212 pregnant heifers relative to the AI-preg and NB-preg groups. The enrichment for specific biological 213 processes of genes that formed new negative connections was a clear evidence of the biological 214 relevance of these new connections. Notably, among genes annotated with proteolytic activity, 215 ADAM1939, ASRGL140, ATG4B41, DPEP242, IMMP2L43, PSMB944, SERPINE245, UCHL346, 216 USP4247 have been implicated in female infertility. A more drastic change in patterns of 217 coexpression was observed with the inversion of gene connections (significant pairwise 218 coexpression in one group with significant coexpression with opposite direction in another group). 219 These gene circuits changing from activation to repression are explained by current models of 220 incoherent feed-forward loops48. Cumulatively, the large occurrence of new connections and the 221 inverted connections created in PWBCs exclusively in not pregnant heifers indicate that gene 222 regulation is altered as an adaptative response to a stimulus.

223 Our profiling of small RNAs from plasma identified 263 annotated miRNAs, out of which 224 131 and 122 were previously identified in sera and circulating exosomes, respectively, among 225 other cattle organs49. Only 38 annotated miRNAs were profiled in peripheral blood mononucleated 226 cells in humans50, which demonstrates that our plasma samples were depleted of intact cells.

8 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.18.997379; this version posted March 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

227 Immuno-related cells composing the PWBC population can uptake circulating small 228 RNAs51. Although our data did not allow us to trace the origin of the tissue that produced the 229 miRNAs, we detected 33 circulating miRNAs whose abundance are associated with the 230 abundance of 106 protein-coding genes. As reported in humans52, we quantified positive and 231 negative correlated abundance between miRNA and protein-coding genes. Negative correlations 232 mostly occur due to the targeting of mRNA by a specific miRNA. Indeed, we found several pairs 233 of miRNA:mRNA negatively correlated that had been identified as interacting pairs that fit this 234 model. By comparison, the positive coexpression between miRNA:mRNA pairs can be explained 235 by a miRNA targeting an intermediate gene, which in turn is negatively associated with a third 236 gene, thereby creating a positive correlation between a miRNA:mRNA pair (see Fig. 3 in 237 reference53). In our results, we only identify examples of miRNA:mRNA pairs that follow the 238 intermediate action of miRNAs when relaxing the stringent criteria for inference of significant 239 coexpression. Regardless of the model of interaction, it was notable that many protein-coding 240 genes that were coexpressed with miRNAs were enriched for specific biological processes, 241 including “regulation of gene expression”. These results add further evidence to the growing body 242 of literature suggesting that circulating miRNAs act as mediators of cellular function54,55.

243 We also tested whether circulating miRNA and PWBC mRNA pairs have differential 244 coexpression relative to the pregnancy outcome. Following the same trend observed for 245 miRNA:mRNA coexpression in PWBCs, there were 1000-fold more miRNA:mRNA connections 246 that were present in heifers that did not become pregnant compared to the loss in coexpression. 247 Results from gene enrichment analysis support the notion that genes forming new miRNA:mRNA 248 coexpression connections in not-pregnant heifers function in the immune system, although our 249 data do not permit inferences on whether particular immune processes were more or less 250 functional in not-pregnant heifers.

251 Critical aspects of female fertility are tied to an appropriate regulation of the immune 252 system. Maternal immune tolerance to pregnancy starts with the exposure to in the 253 semen56, and is carried throughout placentation and pregnancy establishment57. Notably, 24 out 254 of 26 differentially expressed genes between pregnant (AI-preg and NB-preg) and not-pregnant 255 heifers showed greater abundance in not-pregnant heifers relative to the other two groups. (AI- 256 preg and NB-preg). While we did not detect differential abundance of miRNAs relative to 257 pregnancy outcome, the gain of coexpression between miRNAs and mRNAs from differentially 258 expressed genes functionally linked with immunology is in line with regulatory roles that certain 259 miRNAs exert on immune cells58. This result lead us to hypothesize that the heifers in the not-

9 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.18.997379; this version posted March 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

260 pregnant group have a heightened activity of specific cells or functions in the immune system 261 relative to the heifers that became pregnant. It remains unclear, however, if this possible higher 262 activity was stimulated by extrinsic factors or is within the naturally-occurring variation59.

263 The results of differential gene expression and differential coexpression fostered a 264 hypothesis that transcript abundance in the PWBCs can be used as predictors of pregnancy 265 outcome. We tested this hypothesis by applying machine learning algorithms on two groups of 266 heifers that were bred in 2015 (year one) and 2016 (year two). Parallel random forest emerged 267 as the algorithm with over 90% efficiency of classification nearly all trials executed, which confirms 268 the potential of accurate classification of samples using RNA-seq data under the case-control 269 framework60,61. The results show that while not one single gene emerges as a potential biomarker, 270 the accumulated information of transcript abundance from multiple genes can be powerful for the 271 identification of fertility potential in cattle.

272 Although we detected altered coexpression with high confidence, and in some instances, 273 with significant enrichment of genes in specific terms, the consequence of the 274 differential coexpression on those biological functions remains undetermined. In addition, it 275 remains unclear whether the altered patterns of gene expression profiles are connected to 276 immune functions that are related to infertility62. The data analyzed here does not allow us to infer 277 causality63, however our results converge to a strong indication that differences in gene 278 expression profiles are associated63 with fertility in cattle.

279 In summary, analyses of mRNA from PWBCs and circulating extracellular miRNA by 280 multiple approaches (differential gene expression, differential gene coexpression and machine 281 learning classification) provided complementary evidence that gene expression profiles in the 282 PWBCs are associated with the fertility potential in beef heifers. The overall difference detected 283 in transcript abundance for specific genes is reproducible across samples and show promise for 284 the development of predictors of fertility potential in beef cattle.

285

286 METHODS

287 Heifer classification according to pregnancy outcome

288 All procedures with animals were performed in accordance with protocols approved by 289 Institutional Animal and Care and Use Committee at Auburn University.

10 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.18.997379; this version posted March 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

290 Crossbred female calves (n=32, Angus-Simmental, Bos taurus) were developed as 291 replacement heifers64,65 at the Black Belt Research and Extension Center (Auburn University). 292 Thirty-four days prior to AI, heifers were evaluated for reproductive tract score (scale of 1-566) and 293 body condition score (BCS; scale of 1-967) for the assertion that morphological and physiological 294 conditions were suitable for breeding. One animal was eliminated from the breeding herd at this 295 time.

296 Thirty one animals, 14 months of age on average, underwent estrous synchronization for 297 fixed-time artificial insemination utilizing the 7-Day Co-Synch + CIDR protocol68. An experienced 298 veterinarian inseminated all heifers with semen from sires of proven fertility. Fourteen days after 299 insemination, heifers were exposed to two fertile bulls for natural breeding for 60 days. An 300 experienced veterinarian performed pregnancy evaluation by transrectal palpation on days 69 301 and 120 post artificial insemination. Presence or absence of a conceptus, alongside 302 morphological features indicating fetal age were used to classify heifers as pregnant to AI (AI- 303 pregnant), pregnant to natural breeding (NB-pregnant), or not-pregnant.

304 Collection, processing and preservation of biological material

305 Immediately after AI, 10ml of blood was drawn from the jugular vein into vacutainers 306 containing 18mg K2 EDTA (Becton, Dickinson and Company, Franklin, NJ). The tubes were 307 inverted to prevent blood coagulation and immersed in ice for approximately 4 hours. Plasma and 308 buffy coat were separated by centrifugation for 10 minutes at 2000×g at 4°C. Plasma was 309 removed (~1.5ml) and centrifuged 10 minutes at 500×g at 4°C for further removal of cells. The 310 plasma devoid of cells was then transferred to a new microcentrifuge tube and preserved at - 311 80°C. The buffy coat was removed and deposited into 14ml of red blood cell lysis solution (0.15 312 M ammonium chloride, 10mM potassium bicarbonate, 0.1mM EDTA, Cold Spring Harbor 313 Protocols) for 10 minutes at room temperature (24-25°C). After a centrifugation, 800xg for 10 314 minutes, the solution was discarded and the pellet containing PWBCs was re-suspended in 200 315 µl of RNAlater® (Lifetechnologies™, Carlsbad, CA). PWBC samples were stored at -80°C.

316 RNA extraction, library preparation, and RNA sequencing

317 Total RNA was isolated from PWBCs of 18 heifers (AI-pregnant (n=6), NB-pregnant (n=6), 318 and not-pregnant (n=6)) using TRIzolä reagent (Invitrogen, Carlsbad, CA)69,70 following the 319 manufacturer’s protocol. RNA yield was quantified using the Qubit™ RNA Broad Range Assay 320 Kit (Eurogene, OR) on a Qubit® Fluorometer, and integrity was assessed on an Agilent 2100 321 Bioanalyzer (Agilent, Santa Clara, CA) using an Agilent RNA 6000 Nano kit (Agilent, Santa Clara,

11 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.18.997379; this version posted March 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

322 CA). Aproximatly 500 µg of total RNA was used as input for the TruSeq Stranded mRNA Library 323 Prep kit (Illumina, Inc., San Diego, CA); and library preparation was carried out following 324 manufacturer’s instructions. Libraries were quantified with the Qubit™ dsDNA High Sensitivity 325 Assay Kit (Eurogene, OR); and quality was evaluated using the High Sensitivity DNA chip (Agilent, 326 Santa Clara, CA) on an Agilent 2100 Bioanalyzer. Libraries were sequenced in a HiSeq2500 327 system at the Genomic Services Laboratory at HudsonAlpha, Huntsville, AL to generate 125nt 328 long, pair-end reads.

329 Total RNA (circulating extracellular RNA) was isolated from 1ml of plasma by adding 9 ml 330 of TRIzolä reagent and following manufacturer’s protocols through organic phase separation. The 331 aqueous phase containing the RNA was then collected and purified with the RNA Clean & 332 Concentrator-5 kit (ZymoResearch, Irvine, CA), according to manufacturer’s instructions. Total 333 RNA yield was quantified in test samples using the Qubit™ RNA Broad Range Assay Kit 334 averaging 21.4ng, whereas samples prepared for sequencing were not quantified and all of the 335 eluted RNA was directly used for library preparation.

336 Total RNA obtained from plasma was desiccated to a volume of 2.5µl for library 337 preparation with the TruSeq Small RNA Library Prep kit (Illumina, Inc., San Diego, CA) following 338 manufacturer’s instructions, with the minor alteration that reagent volumes were halved in all 339 reactions until reverse transcription was performed, as described elsewhere71. After PCR 340 amplification, libraries were loaded onto a 10% polyacrylamide gel72 and electrophoresis was 341 carried out for three hours at 100 volts alongside a 25/100 bp mixed DNA ladder (Bioneer, 342 Alameda, CA); and the custom DNA ladder included in the TruSeq Small RNA Library Prep kit. 343 The gel was soaked in GelRed® Nucleic Acid Gel Stain (Biotium, Inc., Fremont, CA) for DNA 344 staining. DNA bands in the range of 145 -160 base pairs were isolated from the gel by the crush- 345 and-soak method73 with the following modifications. Gel fragments were placed into a 0.5 ml tube 346 containing small holes at the bottom, mounted into a 2ml tube. The tubes were centrifuged at 347 14,000xg for two minutes at room temperature. The crushed gel was soaked in 200µl of water at 348 4°C overnight, followed by removal of the solution containing the library to a new tube.

349 Concentration of each library was determined by Qubit™ dsDNA High Sensitivity Assay 350 Kit, and the volume corresponding to 1.5ng of each library was pooled for further purification with 351 the DNA Clean & Concentrator™-5 (ZymoResearch, Irvine, CA) following the manufacturer’s 352 instructions. The pool was quantified with Qubit™ dsDNA High Sensitivity Assay Kit, and the DNA 353 profile was evaluated using the High Sensitivity DNA chip on an Agilent 2100 Bioanalyzer.

12 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.18.997379; this version posted March 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

354 Libraries were sequenced in a NextSeq system at the Genomic Services Laboratory at 355 HudsonAlpha, Huntsville, AL to generate 75nt long, single end reads.

356 Raw sequences and metadata are deposited in the Gene Expression Omnibus database 357 (GSE146041)

358 RNA sequencing data processing

359 Sequences were submitted to a custom build bioinformatics pipeline74,75,76. Reads were 360 aligned to the bovine genome (ARS-UCD1.2 obtained from Ensembl77) using HISAT278. 361 Sequences aligning to multiple places on the genome, or with 5 or more mismatches, were filtered 362 out. The sequences were then marked for duplicates, and non-duplicated pairs of reads (mRNA) 363 or single reads (miRNA) were used for transcript quantification at the gene level. The fragments 364 were counted against the Ensembl gene annotation79 (version 1.87) using featureCounts80. 365 RNAseq data from one animal (not-pregnant) were removed from analysis due to low sequencing 366 yield (613,765 reads).

367 Filtering of lowly expressed genes

368 All analytical procedures were carried out in R software81. Supplementary Code contain 369 all scripts used for analytical procedures described below and the creation of the plots, which can 370 also be obtained on figshare82.

371 Counts per million (CPM), and fragments per kilobase per million reads (FPKM) were 372 calculated using “edgeR”83. In order to remove quantification uncertainty associated to lowly 373 expressed genes and erroneous identification of differentially expressed genes84,85, we retained 374 genes with at least 2 CPM for PWBC mRNA or at least 1 CPM for plasma miRNA in five or more 375 samples for downstream analyses. Further analytical procedures were carried out with genes 376 annotated as “protein-coding” and “miRNA” for the PWBC RNA-sequencing and plasma small 377 RNA-sequencing, respectively.

378 Pair-wise gene coexpression and differential coexpression for protein-coding 379 genes

380 First, we calculated pair-wise Pearson’s correlation coefficient (r)86,87 for all 17 animals 381 sampled. Significance of the correlations was assessed using the “TestCor” package based on 382 the empirical statistical test and the large-scale correlation test with normal approximation88 for 383 the calculation of the false discovery rate (FDR). Further biological interpretation of results was

13 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.18.997379; this version posted March 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

384 performed on pairs of genes if abs(r)>0.98, which corresponded to a nominal P≤1.2x10-16 and 385 FDR<0.02.

386 Next, we calculated within group pair-wise Pearson’s correlation coefficient (r) for each 387 group: AI-pregnant, NB-pregnant, and not-pregnant. We interpreted a differential correlation as 388 biologically meaningful based on the following scenarios, assuming that groups 1 and 2 can be

389 assigned arbitrarily: inverted correlation, if � − � > 1.95; gained correlation in group

390 1 and lost correlation in group 2, if � > 0.99 and −0.1 < � < 0.1.

391 Significance was estimated by the eFDR approach75,89,90. We resampled (5000 392 randomizations) the labels of pregnancy outcome and calculated r for each pair of genes in the

393 scrambled data set (rscramble). eFDR was obtained based on the proportion of the rscramble greater 394 than a specified r (i.e.: for r=0.99, eFDR<0.002, Supplementary Fig. 1), relative to the total number 395 of correlations calculated using the formulas described elsewhere75,89,90.

396 mRNA-miRNA coexpression and differential coexpression

397 We calculated Pearson’s correlation (r)86,87 values between miRNA and mRNA genes for 398 all 17 samples. Significance was estimated by the eFDR approach75,89,90 by shuffling (5000 399 randomizations) the animal identification on mRNA dataset, thus breaking the common source of 400 both RNA types (Supplementary Fig. 2).

401 Pearson’s correlation (r) values were calculated for mRNA-miRNA transcript pairs within 402 groups: AI-pregnant, NB-pregnant, and not-pregnant heifers. We interpreted a differential

403 correlation as biologically meaningful based on the following scenarios � > 0.99 and

404 −0.1 < � < 0.1. Significance was assessed by the eFDR approach (Supplementary Fig. 3). 405 We further interrogated the data for experimentally validated miRNA-mRNA interactions obtained 406 from miRWalk database (version 3)91,92.

407 Differential expression of mRNA and miRNA

408 Differences of transcript levels between groups based on the pregnancy outcome were 409 determined from fragment counts using “edgeR”83 and “DESeq2”93. For each comparison, a gene 410 was initially inferred as differentially expressed if the nominal P value was ≤0.03 in both “edgeR” 411 and “DESeq2” analyses. This nominal P value corresponded to eFDR≤0.05 (Supplementary Fig. 412 4), as calculated according to the procedure and formulas outlined elsewhere75,89,90 using 10000 413 randomizations of sample reshuffling. Genes inferred as differentially expressed based on P 414 values were further selected through “leave-one-out” cross validation94,95. Only genes whose

14 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.18.997379; this version posted March 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

415 significance were maintained after all rounds of sampling were further considered for biological 416 interpretation.

417 Testing for enrichment of gene ontology terms and KEGG pathways

418 Differentially expressed genes and genes that displayed differential coexpression among 419 heifers of AI-pregnant, NB-pregnant, and not-pregnant classifications were analyzed for 420 enrichment of Gene Ontology96 categories and KEGG Pathways97. In all tests, the background 421 gene list consisted of all genes detected as expressed in our dataset98. GO annotations and 422 transcript lengths were obtained from BioMart99 and KEGG annotation was obtained from the 423 “org.Bt.eg.db” package. We used “GoSeq” package100 to estimate significance of category 424 enrichment using the sampling method100. We utilized FDR for multiple testing under 425 dependency101, and inferred statistical significance of GO terms and KEGG pathways if FDR < 426 0.10.

427 Assessment of predictive potential of gene expression levels by machine learning 428 algorithms.

429 We focused the mRNA sequencing data from AI-pregnant and not-pregnant heifers from 430 this report and data we collected from the same site on the previous year and published 431 elsewhere34,76. The raw sequences from the previous dataset were realigned, and data were 432 treated as described above.

433 To remove the genes not informative to the machine learning algorithm, we identified 434 differentially expressed genes between the two groups (AI-pregnant and not-pregnant). We used 435 the same procedures and standards aforementioned with the exception that we added year as a 436 fixed effect into the model. There were 198 genes differentially expressed between these two 437 groups (AI-preg and not-pregnant). We performed a variance stabilizing transformation on the 438 counts for these 198 genes, and the transformed data were used as input.

439 For the machine learning approach, we used the expression data for the 198 DEGs from 440 year one (2015) as a training set and the data for the same group of genes from year two (2016) 441 as a test set. Initially we processed the training set for the development of models and assessed 442 the prediction accuracy for 86 classification algorithms (Supplementary Fig. 5). Accuracy was 443 determined by the number of samples correctly classified divided by the number of samples tested 444 (n=11). The models that showed prediction accuracy ≥0.9 (classified correctly at least 10 445 samples, accuracy P value ≤0.01), were further subjected to 2000 random predictions with 446 different initial ‘seed’ numbers.

15 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.18.997379; this version posted March 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

447 FIGURE LEGEND 448 Fig. 1: Overview of the experimental design and data. a Schematics of the experimental design, 449 classification scheme based on fertility, data produced and overview of the analyses. b 450 Dimensionality reduction of the transcriptome data obtained from the mRNA abundance of 451 protein-coding genes in PWBCs and plasma extracellular miRNA.

452

453 Fig. 2: mRNA:mRNA coexpression in PWBCs. a Network of genes with highly correlated mRNA 454 abundance. b Coexpression of 1860 protein-coding genes in pregnant heifers (AI-preg and NB- 455 preg). c Profile of the differential coexpression in heifers classified as not-pregnant relative to the 456 counterparts that became pregnant. d Scatterplot of selected genes associated with “proteolysis” 457 that were coexpressed in heifers classified as not-pregnant relative to the counterparts that 458 became pregnant. e Scatterplot of selected genes associated with “negative regulation of 459 transcription by RNA polymerase II” that showed inverted coexpression in heifers classified as 460 not-pregnant relative to the counterparts that became pregnant.

461

462 Fig. 3: miRNA from plasma coexpressed with mRNA in PWBCs. a miRNA:mRNA pairs 463 significantly coexpressed whose genes are significantly enriched in specific biological processes. 464 b miRNA:mRNA pairs significantly coexpressed with evidence of interaction from miRBase. c 465 Scatterplots of selected pairs representing differential coexpression between miRNA:mRNA in 466 heifers classified as not-pregnant relative to the counterparts that became pregnant

467

468 Fig. 4: Differential coexpression among heifers of different reproductive outcome. a Heatmap 469 depicting the expression levels of genes differentially expressed according to specific contrasts. 470 b Transcript abundance of the genes differentially expressed on both contrasts: AI-preg vs not- 471 pregnant and NB-preg vs not-pregnat. c miRNA:mRNA pairs significantly coexpressed whose 472 protein-coding genes are differentially expressed in PWBCs based on pregnancy outcome.

473

474 Fig. 5: Prediction of pregnancy outcome based on mRNA levels and machine learning algorithms. 475 a Representation of the contrasting of gene expression levels of heifers classified as AI-preg and 476 not-pregnant from two years and the number of differentially expressed genes (DEGs). b 477 Depiction of the training set used as input for the modeling of the transcriptome dataset by multiple

16 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.18.997379; this version posted March 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

478 machine learning algorithms. c Depiction of the data used as the test set for prediction of 479 pregnancy outcome and the representation of the calculation of accuracy. d Accuracy of blind 480 pregnancy prediction based on the algorithm “parallel random forest” in 2000 trials. e Variable 481 importance for the top five genes. f Transcript levels for the top five genes most informative for 482 the machine learning algorithm to classify pregnancy outcome.

483

484 DATA AVAIABILITY

485 All raw sequencing files and raw read counts are deposited in the National Center for 486 Biotechnology Information under the Gene Expression Omnibus database, accession: 487 GSE146041 and GSE103628. All the intermediate data and results produced from raw files can 488 be reproduced based on the codes available on Supplementary Code.

489

490 AUTHOR CONTRIBUTIONS

491 F.B. conceived designed and supervised the experiments. S.E.M. performed sample collection, 492 library preparations, contributed to the data analysis, prepared the first draft and edited the final 493 draft. B.N.W. contributed to sample collection, assisted on the laboratory assays and editing of 494 the final draft. M.F.E. is responsible for the acquisition and curation of the animal database. J.B.E 495 and S.P.R. were responsible for the health and reproductive management of the animals. F.B. 496 conceived designed, supervised the experiments, contributed to data analysis and wrote the final 497 draft. 498

17 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.18.997379; this version posted March 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

499 REFERENCES 500 501 1. Zegers-Hochschild F, et al. The International Glossary on Infertility and Fertility Care, 502 2017. Fertil Steril 108, 393-406 (2017). 503 504 2. Boivin J, Bunting L, Collins JA, Nygren KG. International estimates of infertility 505 prevalence and treatment-seeking: potential need and demand for infertility medical care. 506 Human reproduction (Oxford, England) 22, 1506-1512 (2007). 507 508 3. Dickinson SE, et al. Evaluation of age, weaning weight, body condition score, and 509 reproductive tract score in pre-selected beef heifers relative to reproductive potential. J 510 Anim Sci Biotechnol 10, 18 (2019). 511 512 4. Gajbhiye R, Fung JN, Montgomery GW. Complex genetics of female fertility. NPJ 513 Genom Med 3, 29 (2018). 514 515 5. Peters SO, et al. Heritability and Bayesian genome-wide association study of first service 516 conception and pregnancy in Brangus heifers. Journal of Animal Science 2013, 605-612 517 (2013). 518 519 6. Neupane M, et al. Loci and pathways associated with uterine capacity for pregnancy and 520 fertility in beef cattle. PLoS One 12, e0188997 (2017). 521 522 7. McDaneld TG, et al. Genomewide association study of reproductive efficiency in female 523 cattle. Journal of Animal Science 2014, 1945-1957 (2014). 524 525 8. McDaneld TG, et al. Y are you not pregnant: Identification of Y segments 526 in female cattle with decreased reproductive efficiency. Journal of Animal Science 90, 527 2142-2151 (2012). 528 529 9. de Camargo GM, Costa RB, de Albuquerque LG, Regitano LC, Baldi F, Tonhati H. 530 Association between JY-1 gene polymorphisms and reproductive traits in beef cattle. 531 Gene 533, 477-480 (2014). 532 533 10. Dias MM, et al. Study of lipid metabolism-related genes as candidate genes of sexual 534 precocity in Nellore cattle. Genet Mol Res 14, 234-243 (2015). 535 536 11. Irano N, et al. Genome-wide association study for indicator traits of sexual precocity in 537 Nellore cattle. PLoS One 11, e0159502 (2016). 538 539 12. Junior GAO, et al. Genomic study and medical subject headings enrichment analysis of 540 early pregnancy rate and antral follicle numbers in Nelore heifers. Journal of Animal 541 Science 95, 4796-4812 (2017). 542 543 13. Kosova G, Abney M, Ober C. Colloquium papers: Heritability of reproductive fitness 544 traits in a human population. P Natl Acad Sci USA 107 Suppl 1, 1772-1778 (2010).

18 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.18.997379; this version posted March 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

545 546 14. Fortes MRS, et al. Gene network analyses of first service conception in Brangus heifers: 547 Use of genome and trait associations, hypothalamic-transcriptome information, and 548 transcription factors. Journal of Animal Science 2012, 2894-2906 (2012). 549 550 15. Doyle SP, Golden BL, Green RD, Brinks JS. Additive genetic parameter estimates for 551 heifer pregnancy and subsequent reproduction in Angus females. Journal of Animal 552 Science 78, 2091-2098 (2000). 553 554 16. Toghiani S, Hay E, Sumreddee P, Geary TW, Rekaya R, Roberts AJ. Genomic prediction 555 of continuous and binary fertility traits of females in a composite beef cattle breed. 556 Journal of Animal Science 95, 4787-4795 (2017). 557 558 17. Bormann JM, Totir LR, Kachman SD, Fernando RL, Wilson DE. Pregnancy rate and 559 first-service conception rate in Angus heifers. Journal of Animal Science 84, 2022-2025 560 (2006). 561 562 18. McAllister CM, Speidel SE, Crews DH, Jr., Enns RM. Genetic parameters for 563 intramuscular fat percentage, marbling score, scrotal circumference, and heifer pregnancy 564 in Red Angus cattle. Journal of Animal Science 89, 2068-2072 (2011). 565 566 19. Boddhireddy P, Kelly MJ, Northcutt S, Prayaga KC, Rumph J, DeNise S. Genomic 567 predictions in Angus cattle: Comparisons of sample size, response variables, and 568 clustering methods for cross-validation. Journal of Animal Science 92, 485-497 (2014). 569 570 20. Adams GP, Pierson RA. Bovine Model for Study of Ovarian Follicular Dynamics in 571 Humans. Theriogenology 43, 113-120 (1995). 572 573 21. Abedal-Majed MA, Cupp AS. Livestock animals to study infertility in women. Anim 574 Front 9, 28-33 (2019). 575 576 22. Henchion M, Hayes M, Mullen AM, Fenelon M, Tiwari B. Future Protein Supply and 577 Demand: Strategies and Factors Influencing a Sustainable Equilibrium. Foods 6, (2017). 578 579 23. Cushman RA, Kill LK, Funston RN, Mousel EM, Perry GA. Heifer calving date 580 positively influences calf weaning weights through six parturitions. Journal of Animal 581 Science 2013, 4486-4491 (2013). 582 583 24. Marshall DM, Minqiang W, Freking BA. Relative calving date of first-calf heifers as 584 related to production efficiency and subsequent reproductive performance. Journal of 585 Animal Science 68, 1812-1817 (1990). 586 587 25. Lesmeister JL, Burfening PJ, Blackwell RL. Date of first calving in beef cows and 588 subsequent calf production. Journal of Animal Science 36, 1-6 (1973). 589

19 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.18.997379; this version posted March 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

590 26. Damiran D, Larson KA, Pearce LT, Erickson NE, Lardner BHA. Effect of calving period 591 on beef cow longevity and lifetime productivity in western Canada. Translational Animal 592 Science 2, S61-S65 (2018). 593 594 27. Morris CA, Cullen NG. A note on genetic correlations between pubertal traits of males or 595 females and lifetime pregnancy rate in beef cattle. Livestock Production Science 39, 291- 596 297 (1994). 597 598 28. Mwansa PB, Kemp RA, Jr DHC, Kastelic JP, Bailey DRC, Coulter GH. Selection for 599 cow lifetime pregnancy rate using bull and heifer growth and reproductive traits in 600 composite cattle. Canadian Journal of Animal Science 80, 507-510 (2000). 601 602 29. Li-Pook-Than J, Snyder M. iPOP goes the world: integrated personalized Omics profiling 603 and the road toward improved health care. Chem Biol 20, 660-666 (2013). 604 605 30. Mancuso N, Shi H, Goddard P, Kichaev G, Gusev A, Pasaniuc B. Integrating Gene 606 Expression with Summary Association Statistics to Identify Genes Associated with 30 607 Complex Traits. Am J Hum Genet 100, 473-487 (2017). 608 609 31. Gusev A, et al. Integrative approaches for large-scale transcriptome-wide association 610 studies. Nat Genet 48, 245-252 (2016). 611 612 32. Ioannidis J, Donadeu FX. Circulating microRNA profiles during the bovine oestrous 613 cycle. PLoS One 11, e0158160 (2016). 614 615 33. Dickinson SE, et al. Transcriptome profiles in peripheral white blood cells at the time of 616 artificial insemination discriminate beef heifers with different fertility potential. BMC 617 Genomics 19, 129 (2018). 618 619 34. Dickinson SE, et al. Transcriptome profiles in peripheral white blood cells at the time of 620 artificial insemination discriminate beef heifers with different fertility potential. BMC 621 Genomics 19, (2018). 622 623 35. Emmert-Streib F, Dehmer M, Haibe-Kains B. Gene regulatory networks and their 624 applications: understanding biological and medical problems in terms of networks. Front 625 Cell Dev Biol 2, 38 (2014). 626 627 36. Xu C, Jackson SA. Machine learning and complex biological data. Genome Biol 20, 76 628 (2019). 629 630 37. Monaco G, et al. RNA-Seq Signatures Normalized by mRNA Abundance Allow 631 Absolute Deconvolution of Human Immune Cell Types. Cell Rep 26, 1627-1640 e1627 632 (2019). 633

20 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.18.997379; this version posted March 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

634 38. van Dam S, Vosa U, van der Graaf A, Franke L, de Magalhaes JP. Gene co-expression 635 analysis for functional classification and gene-disease predictions. Brief Bioinform 19, 636 575-592 (2018). 637 638 39. Sheridan MA, et al. Early onset preeclampsia in a model for human placental trophoblast. 639 P Natl Acad Sci USA 116, 4336-4345 (2019). 640 641 40. Fitzgerald HC, et al. Idiopathic infertility in women is associated with distinct changes in 642 proliferative phase uterine fluid proteins. Biol Reprod 98, 752-764 (2018). 643 644 41. Yang HL, Mei J, Chang KK, Zhou WJ, Huang LQ, Li MQ. Autophagy in endometriosis. 645 Am J Transl Res 9, 4707-4725 (2017). 646 647 42. Pelch KE, Schroder AL, Kimball PA, Sharpe-Timms KL, Davis JW, Nagel SC. Aberrant 648 gene expression profile in a mouse model of endometriosis mirrors that observed in 649 women. Fertil Steril 93, 1615-1627 e1618 (2010). 650 651 43. Demain LA, Conway GS, Newman WG. Genetics of mitochondrial dysfunction and 652 infertility. Clin Genet 91, 199-207 (2017). 653 654 44. Koks S, et al. The differential transcriptome and ontology profiles of floating and 655 cumulus granulosa cells in stimulated human antral follicles. Mol Hum Reprod 16, 229- 656 240 (2010). 657 658 45. Li SH, et al. Correlation of cumulus gene expression of GJA1, PRSS35, PTX3, and 659 SERPINE2 with oocyte maturation, fertilization, and embryo development. Reprod Biol 660 Endocrinol 13, 93 (2015). 661 662 46. Mtango NR, Sutovsky M, Susor A, Zhong Z, Latham KE, Sutovsky P. Essential role of 663 maternal UCHL1 and UCHL3 in fertilization and preimplantation embryo development. 664 J Cell Physiol 227, 1592-1603 (2012). 665 666 47. Ingham NJ, et al. Mouse screen reveals multiple new genes underlying mouse and human 667 hearing loss. PLoS Biol 17, e3000194 (2019). 668 669 48. Reeves GT. The engineering principles of combining a transcriptional incoherent 670 feedforward loop with negative feedback. J Biol Eng 13, 62 (2019). 671 672 49. Sun HZ, Chen Y, Guan LL. MicroRNA expression profiles across blood and different 673 tissues in cattle. Sci Data 6, 190013 (2019). 674 675 50. Vaz C, et al. Analysis of microRNA transcriptome by deep sequencing of small RNA 676 libraries of peripheral blood. BMC Genomics 11, 288 (2010). 677 678 51. Fritz JV, et al. Sources and Functions of Extracellular Small RNAs in Human 679 Circulation. Annu Rev Nutr 36, 301-336 (2016).

21 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.18.997379; this version posted March 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

680 681 52. Diaz G, Zamboni F, Tice A, Farci P. Integrated ordination of miRNA and mRNA 682 expression profiles. BMC Genomics 16, 767 (2015). 683 684 53. Ritchie W, Rajasekhar M, Flamant S, Rasko JE. Conserved expression patterns predict 685 microRNA targets. PLoS Comput Biol 5, e1000513 (2009). 686 687 54. Bayraktar R, Van Roosbroeck K, Calin GA. Cell-to-cell communication: microRNAs as 688 hormones. Mol Oncol 11, 1673-1686 (2017). 689 690 55. Rayner KJ, Hennessy EJ. Extracellular communication via microRNA: lipid particles 691 have a new message. J Lipid Res 54, 1174-1181 (2013). 692 693 56. Robertson SA, Sharkey DJ. The role of semen in induction of maternal immune tolerance 694 to pregnancy. Semin Immunol 13, 243-254 (2001). 695 696 57. Moffett A, Loke C. Immunology of placentation in eutherian mammals. Nat Rev 697 Immunol 6, 584-594 (2006). 698 699 58. Momen-Heravi F, Bala S. miRNA regulation of innate immunity. J Leukoc Biol, (2018). 700 701 59. Hughes HD, Carroll JA, Burdick Sanchez NC, Richeson JT. Natural variations in the 702 stress and acute phase responses of cattle. Innate Immun 20, 888-896 (2014). 703 704 60. Wenric S, Shemirani R. Using Supervised Learning Methods for Gene Selection in RNA- 705 Seq Case-Control Studies. Front Genet 9, 297 (2018). 706 707 61. Zararsiz G, et al. A comprehensive simulation study on classification of RNA-Seq data. 708 PLoS One 12, e0182507 (2017). 709 710 62. Brazdova A, Senechal H, Peltre G, Poncet P. Immune Aspects of Female Infertility. Int J 711 Fertil Steril 10, 1-10 (2016). 712 713 63. Altman N, Krzywinski M. Association, correlation and causation. Nat Methods 12, 899- 714 900 (2015). 715 716 64. Patterson DJ, Smith MF. Management considerations in beef heifer development and 717 puberty. Vet Clin North Am Food Anim Pract 29, xiii-xiv (2013). 718 719 65. Larson RL, White BJ, Laflin S. Beef Heifer Development. Vet Clin North Am Food Anim 720 Pract 32, 285-302 (2016). 721 722 66. Holm DE, Thompson PN, Irons PC. The value of reproductive tract scoring as a predictor 723 of fertility and production outcomes in beef heifers. J Anim Sci 87, 1934-1940 (2009). 724

22 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.18.997379; this version posted March 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

725 67. Rae DO, Kunkle WE, Chenoweth PJ, Sand RS, Tran T. Relationship of parity and body 726 condition score to pregnancy rate in Florida beef-cattle. Theriogenology 1993, 1143-1152 727 (1993). 728 729 68. Larson JE, et al. Synchronization of estrus in suckled beef cows for detected estrus and 730 artificial insemination and timed artificial insemination using gonadotropin-releasing 731 hormone, prostaglandin F2α, and progesterone. Journal of Animal Science 2006, 332– 732 342 (2006). 733 734 69. Rio DC, Ares M, Jr., Hannon GJ, Nilsen TW. Purification of RNA using TRIzol (TRI 735 reagent). Cold Spring Harb Protoc 2010, pdb prot5439 (2010). 736 737 70. Chomczynski P, Sacchi N. The single-step method of RNA isolation by acid guanidinium 738 thiocyanate-phenol-chloroform extraction: twenty-something years on. Nat Protoc 1, 739 581-585 (2006). 740 741 71. Yeri A, et al. Total Extracellular Small RNA Profiles from Plasma, Saliva, and Urine of 742 Healthy Subjects. Scientific Reports 7, 44061 (2017). 743 744 72. Chory J, Pollard JD, Jr. Separation of small DNA fragments by conventional gel 745 electrophoresis. Curr Protoc Mol Biol Chapter 2, Unit2 7 (2001). 746 747 73. Green MR, Sambrook J. Isolation of DNA Fragments from Polyacrylamide Gels by the 748 Crush and Soak Method. Cold Spring Harb Protoc 2019, pdb prot100479 (2019). 749 750 74. Biase FH, et al. Massive dysregulation of genes involved in and placental 751 development in cloned cattle conceptus and maternal endometrium. P Natl Acad Sci USA 752 113, 14492-14501 (2016). 753 754 75. Biase FH, Kimble KM. Functional signaling and gene regulatory networks between the 755 oocyte and the surrounding cumulus cells. BMC Genomics 19, 351 (2018). 756 757 76. Dickinson SE, Biase FH. Transcriptome data of peripheral white blood cells from beef 758 heifers collected at the time of artificial insemination. Data Brief 18, 706-709 (2018). 759 760 77. Kinsella RJ, et al. Ensembl BioMarts: a hub for data retrieval across taxonomic space. 761 Database 2011, bar030 (2011). 762 763 78. Kim D, Langmead B, Salzberg SL. HISAT: a fast spliced aligner with low memory 764 requirements. Nat Methods 12, 357-360 (2015). 765 766 79. Flicek P, et al. Ensembl 2014. Nucleic Acids Res 42, D749-755 (2014). 767 768 80. Liao Y, Smyth GK, Shi W. featureCounts: an efficient general purpose program for 769 assigning sequence reads to genomic features. Bioinformatics 30, 923-930 (2014). 770

23 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.18.997379; this version posted March 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

771 81. Ihaka R, Gentleman. R: A Language and Environment for Statistical Computing. Journal 772 of Computational and Graphical Statistics 5, 299-214 (2012). 773 774 82. Biase FH.Supplementary code and files to Rewiring of gene expression in PWBC at 775 artificial insemination is predictive of pregnancy outcome in cattle (Bos taurus). 776 https://doi.org/10.6084/m9.figshare.11985666.v2. 777 778 83. Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential 779 expression analysis of digital gene expression data. Bioinformatics 26, 139-140 (2010). 780 781 84. Bullard JH, Purdom E, Hansen KD, Dudoit S. Evaluation of statistical methods for 782 normalization and differential expression in mRNA-Seq experiments. BMC 783 Bioinformatics 11, 94 (2010). 784 785 85. Robinson MD, Oshlack A. A scaling normalization method for differential expression 786 analysis of RNA-seq data. Genome Biol 11, R25 (2010). 787 788 86. Bansal M, Belcastro V, Ambesi-Impiombato A, di Bernardo D. How to infer gene 789 networks from expression profiles. Mol Syst Biol 3, (2007). 790 791 87. Serin EA, Nijveen H, Hilhorst HW, Ligterink W. Learning from Co-expression 792 Networks: Possibilities and Challenges. Front Plant Sci 7, 444 (2016). 793 794 88. Cai TT, Liu W. Large-Scale Multiple Testing of Correlations. J Am Stat Assoc 111, 229- 795 240 (2016). 796 797 89. Storey JD, Tibshirani R. Statistical significance for genomewide studies. P Natl Acad Sci 798 USA 100, 9440-9445 (2003). 799 800 90. Sham PC, Purcell SM. Statistical power and significance testing in large-scale genetic 801 studies. Nat Rev Genet 15, 335-346 (2014). 802 803 91. Dweep H, Sticht C, Pandey P, Gretz N. miRWalk--database: prediction of possible 804 miRNA binding sites by "walking" the genes of three genomes. J Biomed Inform 44, 839- 805 847 (2011). 806 807 92. Dweep H, Gretz N. miRWalk2.0: a comprehensive atlas of microRNA-target 808 interactions. Nat Methods 12, 697 (2015). 809 810 93. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for 811 RNA-seq data with DESeq2. Genome Biol 15, 550 (2014). 812 813 94. Picard RR, Cook RD. Cross-Validation of Regression-Models. J Am Stat Assoc 79, 575- 814 583 (1984). 815

24 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.18.997379; this version posted March 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

816 95. Breiman L, Spector P. Submodel Selection and Evaluation in Regression - the X-Random 817 Case. Int Stat Rev 60, 291-319 (1992). 818 819 96. Ashburner M, et al. Gene ontology: tool for the unification of biology. The Gene 820 Ontology Consortium. Nat Genet 25, 25-29 (2000). 821 822 97. Du J, Yuan Z, Ma Z, Song J, Xie X, Chen Y. KEGG-PATH: Kyoto encyclopedia of 823 genes and genomes-based pathway analysis using a path analysis model. Molecular 824 BioSystems 10, 2441-2447 (2014). 825 826 98. Timmons JA, Szkop KJ, Gallagher IJ. Multiple sources of bias confound functional 827 enrichment analysis of global -omics data. Genome Biol 16, 186 (2015). 828 829 99. Durinck S, et al. BioMart and Bioconductor: a powerful link between biological 830 databases and microarray data analysis. Bioinformatics 21, 3439-3440 (2005). 831 832 100. Young MD, Wakefield MJ, Smyth GK, Oshlack A. Gene ontology analysis for RNA-seq: 833 accounting for selection bias. Genome Biology 11, R14 (2010). 834 835 101. Benjamini Y, Yekutieli D. The control of the false discovery rate in multiple testing 836 under dependency. Ann Statist 29, 1165-1188 (2001). 837 838 839

25 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.18.997379; this version posted March 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

840 Figure 1 10496 protein coding genes 290 miRNAs a AI-preg NB-preg not preg b 1049610496 protein protein coding coding genes genes c 290290 miRNAs miRNAs (n=6) (n=6) (n=5) PWBC → mRNA, plasma → miRNA Differential Differential transcript

coexpression abundance SNE dim 2 SNE dim 2 − − t t SNE dim 2 SNE dim 2 SNE dim 2 SNE dim 2 − − − − t t mRNA: mRNA: mRNA miRNA t t mRNA miRNA t−SNE dim 1 t−SNE dim 1 Gene ontology Prediction of fertility t−SNEt−SNE dim dim 1 1 t−SNEt−SNE dim dim 1 1 enrichment analysis by artificial intelligence AIPreg-preg AI NBPreg-preg NB notNot pregPreg 841 PregPreg AI AI PregPreg NB NB NotNot Preg Preg 842

26 EIF4EBP3 GNPDA2

FCGRT IFI30

CUTA

RIC8A TMX2 IDH2 FABP3 NDUFV3 NUP43 ENSBTAG00000000524 B4GALT3

SLC25A20 RNF122 GOT1 AKR1A1

PHF23 AAMP IKBKB TJAP1 CD48 SEPT7 JTB YIPF3 ENSBTAG00000040331 USB1 LRP10 PIP4P1 COPG1 PIGC ADPRHL2 MAP7D1 CSNK2B CMTM7 HDDC3 VTI1B FAM50A CHTOP AMT C7H1orf35 AP4E1 RPS8 ENSBTAG00000048228 CD40 GEMIN7 SLC7A6OS AP5S1 RPLP0 C3H1orf123 BOLA-DMB PRR14 SDHAF1 ENSBTAG00000034433 SLC38A5 N4BP2 FAM126A RB1 THAP11 PGAP2 PSME2 DGUOK SYNGR2 CPTP MRPL10 MAGED1 FBXW4 HSD17B8 NRM EXOSC10 GRPEL1 ENSBTAG00000048688 CCDC32 ANKLE1 SNX13 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.18.997379; this version posted MarchGRINA 20,MCM7 2020.LCK PBX2 The copyright holder for this preprint CHD1 CYBC1 CHMP4B POLR2E NDUFB10 VPS16 HYPK ENSBTAG00000014449 RPS4X LIN37 ARHGEF2 CXHXorf65 TAP1 LCAT BABAM1 DHX16 ENO1 ENSBTAG00000050579 HMGA1 ENSBTAG00000009783 EIF4A1 SCAF11 TTC7A DTX3 ZNF263 CDK9 TNFSF10 ATF6B VPS72 RELB ZPR1 MAP3K12 EMD RPL24 ACSS2 MRPL52 EEF1B2 SNAPIN ATP5MF UFC1 SDHA RNASEH2A CTDNEP1 PPP4R3B PRCC SRRT AP1S1 RELA ELP5 MINPP1 ANP32A MRPL34 CD52 FANCG COA3 HDAC6 ENSBTAG00000051063 ENSBTAG00000045678ENSBTAG00000005315 ERGIC3 PLP2 NDUFS3 GPSM3 (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to displayEXOSC6 POLR1D the preprintINPP5D in perpetuity. It is made IMPA2 WDR18 IER2 RPS6 ENSBTAG00000022278 HDAC3 RNF113A RPS15A RPL26 TPM3 SUPT5H EI24 ENSBTAG00000030281 RASA1 RMND5B PSENEN DCTN3 NEDD8 COX6A1 REV3L MRPS12 OSGEP CD79A RPL18A ENSBTAG00000049543 PHGDH CD19 DBNL LGALS1 BOD1 ENSBTAG00000053048 PSME1 EFR3A ATF4 CREB3 PEA15 POLR2C ENSBTAG00000055082 RPL14 RPL10A KRTCAP2 SLC50A1 HDGF NOSIP CNPY2 RPL3 RPL27 ENSBTAG00000047341 PTK2B FUNDC2 EMC10 RPS7 RBM14 SIT1 ENSBTAG00000027213 IMPDH2 GCC2 PSMB8 RFTN1 RPS27 TOMM7 TRAPPC2L RPL12 RPL7L1 FN3KRP COX8A DNAJB2 RPS6KA3 NDUFB9 RHOF RPL35 ZC3H10 AAMDCCOMMD7ARHGDIB RPS20 ENSBTAG00000045504 available under aCC-BY-NC-ND 4.0 InternationalRPS21 license. EBP RPS10TSTD1 BANF1 FTH1 EIF1 SASS6 NUDT2 TMEM256 GRK6 RASSF1 RPL32 ENSBTAG00000020550 USPL1 RASGRP2 RBM4 RPL37A RPL19 TSR2 REEP4 MAP4K2 KLHL11 PSMB4 TNFRSF13CEIF1AD NHEJ1 SSR4 NDUFB11 STK19 SLF2 RABEPK EIF3K PREB PSMC4 ENSBTAG00000055198PAAF1 ZCCHC17 GMIP WWP2 SH3BGRL3 MFNG PSMB3 RRAS2 ATP6V1F CTSA TADA3 DPCD ENSBTAG00000023274 ZDHHC17 RGL2 ENSBTAG00000049167RPL10 KCND1 MTCH1EIF3D NDUFS5 TNFAIP8L2 RPL18 PPP1R18 QARS TLE3 TMSB10 RPL34 NFKBID PSMC5 PTPN6 OXA1L CTSZ PFDN5 ENSBTAG00000012401 MRPL37 ZNF207 DRG2 THYN1 GPANK1 ENSBTAG00000047694 RPS14 DAXX RALGAPA1 C7H5orf24 SGF29 PIH1D1 PHB2 ITGA4 DENND4A VPS18 SF3A2 RPS16 RPL29 SEPSECS SERGEF VEGFB ENSBTAG00000045568 DEDD LTB JAGN1 PHIP SSR2 CCDC107 CFL1 RPS11 SRA1 CPSF2 RETREG2 ARHGAP9 FMNL3 RPS18 MAP3K14 RPL13 VAMP8 NUDC BBX LTN1 ENSBTAG00000031834 PARVG UBXN1 ARAF PIGT PRELID1 FTL TFAM DHFR ENSBTAG00000001538 COX4I1 RPL27A RPS19 RPS3 EIF3I EML4 GRWD1 C7H19orf53 TMEM147 C7H19orf70 VAMP5 TINF2 HDAC1 FIS1 FAU RRP7A ACP5 MTX1 IRF9 ARAP2 WDR83OS GMFG DCAF8 DCTN2 GMPPB MEPCE PEX19 UBR3 ZNF292 INTS11HSH2D EEF1G UBR2 PDS5B DMTF1 ATP6V0D1 MYL6 RPS24 ENSBTAG00000016391C19H17orf49 ENSBTAG00000049993 CDK3 SMARCD2 POLR2GENSBTAG00000000560 ATM MAP2K2 RPL13A ENSBTAG00000053606 PSMD8 ERCC1 AGBL5 ATP6V0B RIPK3 VPS13B HCST PFN1 TNPO2 PRKAG1 SEC23IP PPT2 UTRN MIA GPX1 ENSBTAG00000012276 DOCK11 PDS5A PAGR1 UFD1SLC25A11 UBR1 PUF60 UXT RPL37 COPZ1 SPEN GHDC TAF10 SLC7A7 VAV1 EIF2B4 HMCES BPTF BRWD3 NPAS2 COX14 RPS15 FBL PDCD6IP FKBP2 RPS5 LDB1 UROD EMC9 SLC25A39 DDB1 LANCL2 TLK1 ERAS PEX16 PDK2 TBXA2R G6PC3 CUEDC2 SF3B2 ENSBTAG00000046623 USP34 FCER1G DDX39A COMMD1 AKAP11 ZNF148 HINT2 H2AFJ RNF167 TRIR MRPL21 UBA52 CD37 PIK3C2A TSR3 MRPL27 INPPL1 EMP3 HECTD1 NUFIP2 AP1G2 CD74 ZMYM3 PPP2R1A RPS12 TMED9 RRP36 SENP5 SETD2 PSMB10 GYPC TNF ATP5MC2 SAP25 RPLP2 HSD17B10 RACK1 IL17RA PRDX2 CNN2 ENSBTAG00000034609 BAK1 ENSBTAG00000017734 ZC3H11A RPL35A EIF2B5 ZMYM4 SBNO1 RPS19BP1 LAMTOR2 ARRB2 VAC14 SLC44A2 PTDSS1 BID LENG8 ENSBTAG00000002550 RPS2 ELK4 AMDHD2 UBL5 GAPDH KLHL24 HINFP ENSBTAG00000049817 JAK3 RFC1 CNTRL EGLN2 ITGA2B KHNYN KXD1 RPL23 DYNLL2 SNAPC2 PQLC2 MIEN1 CTSW ACSF2 CHMP6 EIF4G3 S100A13 ENSBTAG00000025968 KMT2E DHX58 WDR13 IMP4 UNC119 PPP1CA SETX SHPRH BIRC6 EIF3F ABI3 RASGRP4 TCIRG1 ENSBTAG00000023792 TMEM265 PRKD3 ANKRD46 LGALS3BP RPL8 SREBF2 BDP1 NUCB1 PPP4C MAF1 OTUB1ATP5MG HINT1 DDB2 IRF3 RPL7A CLASP2 IFI35 RNF181 DMXL1 RHOT2 NDUFAF8 ATP2A3 POLR2I USP24 CALM3 RBCK1 TBC1D10B SNRPD2 PBXIP1 FAM219B SMG1 GUK1 SEMA4A PQBP1 TAGLN2 SERTAD1 UBA1 EXOSC7 CORO1A VPS8 ZBTB11 MTMR9 CIB1 TTC3 RPS13 ENSBTAG00000018987 RALGAPB AURKAIP1 CASP8AP2 YIF1A RPL21 NDUFS2 RSAD1 SPI1 CD2BP2 PKN1 PRPF19 PRR3 BRMS1 U2SURP DNPH1 DGKD ENSBTAG00000006241 IDH3B IL27RA ARHGAP25 NPAT SCNM1 SPSB3 PSME3 DBF4 WDR6 NELFE RPSA AQR JMJD7 EIF3G COPS6 ARL6IP4 ENSBTAG00000052566 USF1 CDK5RAP3 MRPL54 INTS2 FGR CAPNS1 PSMD2 CTC1 ATG101 ACADVL ATAD2 PREPL SIGIRR MED9 C2CD2L VPS25 RAD23A PHYHD1 SRP72 TMEM219 RRP12 ATP2B1 ENSBTAG00000019994 POLR2A KCTD11 PSMG4 ENSBTAG00000034656NACA AP4M1 ZNF518A ENSBTAG00000038379 ZDHHC16 CNIH1 PPP2R5D TNFRSF1A FGD3 UTP14A TTC37 CIITA USO1 BTAF1 MED22 CKB RHOG EEA1 FKBP11 ENSBTAG00000054814 ZNF202 DCAF11 B9D2 ZFAND2B VSIR Figure 2 PRRC2A DCTN1 843 HECTD3 RIF1 BOD1L1 HINT3 ING4 ABHD16A AP1M1

KBTBD8 CALCOCO1 URM1

EIF4EBP3 GNPDA2 OGA ZNF410 TIMM13 SMAD5 CNOT1 FAM241A MARCH7 IFIT3 SP1 SEC24C

MEAF6 IFI30 FCGRT KDM4C VAT1L DCAF1 FAM91A1 CUTA DR1 NAP1L4 SUPT16H RNF125 IL2RG PRPF3 ADSL PEX13 RIC8A ATP5F1B TMX2 ABCF3 IL1B a IDH2 BRAP FABP3 RPE NDUFV3 NUP43 BRD2 SETDB1 ENSBTAG00000000524 B4GALT3 BMS1 LIN7C NMT1 ADGRG5 SDS DRAP1 SMC4 SLC25A20 PLCL2 NXF1 PSMB9 RNF122 CCT3 ETAA1 GOT1 AKR1A1 KARS ENSBTAG00000027319

PHF23 IPO5 AAMP IKBKB TJAP1 ENSBTAG00000026650 CD48 SEPT7 CALR CERS5 JTB YIPF3 ENSBTAG00000040331 POLDIP3 USB1 TRPV2 NAIF1 LRP10 ZNF644 PIP4P1 ACTR1A COPG1 PIGC AMPD2 ADPRHL2 MAP7D1 CSNK2B CMTM7 HDDC3 VTI1B FAM50A CHTOP AMT C7H1orf35 AP4E1 ATOX1 RPS8 ENSBTAG00000048228 CD40 GEMIN7 SLC7A6OS AP5S1 TMEM106B RPLP0 C3H1orf123 SDHAF1 ENSBTAG00000034433 ZBTB17 BOLA-DMB PRR14 FAM126A SLC38A5 N4BP2 RB1 AP2M1 FAM208A THAP11 PGAP2 PSME2 DGUOK SYNGR2 CPTP MRPL10 MAGED1 FBXW4 HSD17B8 NRM EXOSC10 GRPEL1 ENSBTAG00000048688 CCDC32 ANKLE1 SNX13 GRINA MCM7 LCK PBX2 CHD1 CYBC1 CHMP4B POLR2E NDUFB10 VPS16 HYPK ENSBTAG00000014449 RPS4X LIN37 ARHGEF2 CXHXorf65 TAP1 LCAT BABAM1 HK1 ZBTB1 DHX16 ENO1 ENSBTAG00000050579 HMGA1 ENSBTAG00000009783 EIF4A1 SCAF11 TTC7A DTX3 ZNF263 CDK9 TNFSF10 ATF6B VPS72 RELB ZPR1 MAP3K12 EMD RPL24 ACSS2 MRPL52 EEF1B2 SNAPIN ATP5MF UFC1 SDHA RNASEH2A CTDNEP1 PPP4R3B PRCC SRRT AP1S1 ANP32A RELA ELP5 MINPP1 COA3 MRPL34 HDAC6 CD52 FANCG ERGIC3 NDUFS3ENSBTAG00000051063 ENSBTAG00000045678ENSBTAG00000005315 POLR1D PLP2 GPSM3 INPP5D MKI67 CISD2 UGGT1 PTMA YTHDF2 TMEM259 CDC14B PBRM1 PANK3 HSPB1 GMCL1 INO80C IMPA2 WDR18EXOSC6 IER2 RPS6 ENSBTAG00000022278 HDAC3 RNF113A RPS15A RPL26 TPM3 SUPT5H EI24 ENSBTAG00000030281 RASA1 TCTA HEATR5A ENSBTAG00000047432 RMND5B PSENEN DCTN3 NEDD8 COX6A1 REV3L MRPS12 OSGEP CD79A RPL18A ENSBTAG00000049543 PNRC2 MAN2A1 ENSBTAG00000054178 WWP1 BROX PHGDH CD19 PCNP ANO6 ENSBTAG00000027962 DBNL LGALS1 ENSBTAG00000053048 BOD1 PSME1 YLPM1 EFR3A ATF4 CREB3 PEA15 POLR2C ENSBTAG00000055082 RPL14 ALS2 RPL10A KRTCAP2 SLC50A1 KRR1 HDGF NOSIP CNPY2 RPL3 RPL27 ENSBTAG00000047341 C7H19orf67 BRWD1 JUND ENSBTAG00000025564 PTK2B FUNDC2 EMC10 RPS7 EIF5 TMEM170A RBM14 SIT1 ENSBTAG00000027213 IMPDH2 GCC2 PSMB8 RFTN1 RPS27 TOMM7 TRAPPC2L RPL12 RPL7L1 FN3KRP COX8A DNAJB2 RPS6KA3 NDUFB9 RHOF RPL35 ZC3H10 AAMDCCOMMD7ARHGDIB RPS20 ENSBTAG00000045504 RPS21 EBP KPNB1 RPS10 FTH1 SASS6 TSTD1 BANF1 EIF1 UBN1 CHD9 NUDT2 TMEM256 GRK6 RASSF1 RPL32 ENSBTAG00000020550 USPL1 RASGRP2 RBM4 RPL19 MAP4K2 RPL37A TSR2 KLHL11 CIZ1 RAB3GAP2 MUS81 RPLP1 REEP4 EIF1AD NHEJ1 PSMB4 TNFRSF13C NDUFB11 STK19 SSR4 RABEPK PSMC4 PAAF1 ZCCHC17 SLF2 EIF3K PREB SH3BGRL3 ENSBTAG00000055198 PSMB3 CTSA GMIP TADA3 WWP2 MFNG ZDHHC17 RRAS2 ATP6V1F DPCD RGL2 ENSBTAG00000023274 ENSBTAG00000049167RPL10 G3BP1 KCND1 MTCH1EIF3D NDUFS5 TNFAIP8L2 RPL18 DCLRE1C PPP1R18 QARS TLE3 TMSB10 RPL34 NFKBID PSMC5 PTPN6 OXA1L CTSZ PFDN5 ENSBTAG00000012401 MRPL37 ZNF207 DRG2 THYN1 GPANK1 ENSBTAG00000047694 RPS14 DAXX RALGAPA1 C7H5orf24 SGF29 PIH1D1 PHB2 ITGA4 DENND4A VPS18 SF3A2 RPS16 RPL29 SEPSECS SERGEF VEGFB ENSBTAG00000045568 DEDD LTB JAGN1 PHIP SSR2 CCDC107 CFL1 RPS11 SRA1 CPSF2 RETREG2 ARHGAP9 FMNL3 RPS18 MAP3K14 RPL13 VAMP8 NUDC BBX LTN1 ENSBTAG00000031834 PARVG UBXN1 ARAF USP14 YIPF6 CLASRP GRN NUB1 ENSBTAG00000032764 UBL3 TXNDC9 PPP3R1 CD22 PSMD3 GOLGB1 ZEB2 PALM PIGT PRELID1 FTL TFAM DHFR ENSBTAG00000001538 COX4I1 RPL27A RPS19 RPS3 EIF3I EML4 CPSF4 GRWD1 C7H19orf53 TMEM147 C7H19orf70 VAMP5 TINF2 HDAC1 FIS1 FAU ARAP2 RRP7A ACP5 MTX1 IRF9 STUB1 WDR83OS GMFG DCAF8 DCTN2 GMPPB MEPCE PEX19 UBR3 ZNF292 INTS11HSH2D EEF1G UBR2 PDS5B DMTF1 G2E3 SBDS C5orf51 VPS11 STAT6 SYNCRIP ENSBTAG00000006878 CPSF1 ATP6V0D1 MYL6 RPS24 ENSBTAG00000016391C19H17orf49 ENSBTAG00000049993 CDK3 SMARCD2 POLR2GENSBTAG00000000560 ATM MAP2K2 RPL13A ENSBTAG00000053606 PSMD8 HSPA13 RPL6 UPK3B ERCC1 AGBL5 ATP6V0B RIPK3 VPS13B HCST PFN1 TNPO2 PRKAG1 SEC23IP CD82 MIIP PPT2 UTRN CHD2 MIA GPX1 ENSBTAG00000012276 DOCK11 PDS5A PAGR1 UFD1SLC25A11 UBR1 PUF60 UXT RPL37 COPZ1 SPEN LCORL NXT2 ZFYVE16 MRTO4 WAS DHX15 HEATR1 MPST GHDC TAF10 SLC7A7 VAV1 EIF2B4 HMCES BPTF BRWD3 NPAS2 COX14 RPS15 FBL PDCD6IP FKBP2 RPS5 LDB1 DNAJC13 ENSBTAG00000009359 UROD EMC9 SLC25A39 DDB1 LANCL2 TLK1 ERAS PEX16 PLXNA3 PDK2 TBXA2R G6PC3 CUEDC2 SF3B2 ENSBTAG00000046623 USP34 FCER1G DDX39A COMMD1 AKAP11 ZNF148 HINT2 H2AFJ RNF167 TRIR MRPL21 UBA52 CD37 PIK3C2A EP300 TSR3 MRPL27 INPPL1 EMP3 HECTD1 NUFIP2 AP1G2 CD74 ZMYM3 PPP2R1A RPS12 TMED9 RRP36 SENP5 SETD2 PSMB10 GYPC TNF ATP5MC2 SAP25 RPLP2 HSD17B10 RACK1 IL17RA PRDX2 CNN2 ENSBTAG00000034609 BAK1 ENSBTAG00000017734 ZC3H11A RPL35A EIF2B5 ZMYM4 SBNO1 CAND1 NAE1 DHRS1 MGAT4A SH3GLB2 FEN1 EIF3L CHML ENSBTAG00000010871 COPS9 CTSF RNF13 PDLIM1 HDHD3 CD2AP EIF3J HGS ZNF570 CA9 RFX7 RBL1 CPLANE1 QKI PPIA SLC52A2 SLC38A1 SLK SCAMP3 TFIP11 PNP VWA8 TFEB PPP1CB CASP6 RPS19BP1 LAMTOR2 ARRB2 VAC14 SLC44A2 PTDSS1 BID LENG8 ENSBTAG00000002550 RPS2 ELK4 AMDHD2 UBL5 GAPDH KLHL24 HINFP ENSBTAG00000049817 JAK3 RFC1 CNTRL EGLN2 ITGA2B KHNYN KXD1 RPL23 DYNLL2 SNAPC2 PQLC2 MIEN1 CTSW ACSF2 CHMP6 EIF4G3 S100A13 ENSBTAG00000025968 KMT2E DHX58 WDR13 IMP4 UNC119 PPP1CA SETX SHPRH BIRC6 EIF3F ABI3 RASGRP4 TCIRG1 ENSBTAG00000023792 TMEM265 PRKD3 DHX29 TBK1 USP11 ZFP14 SIPA1 BZW2 ENSBTAG00000014862 NFAT5 MGAT2 SMIM7 MINDY1 CYB5R4 SLC6A4 TRIM2 MSANTD4 LNPK VPS28 DENND6A AGER ZMYM1 ATAD2B FAM208B IQGAP1 ENSBTAG00000045729 UNC45A CAPRIN1 RASA2 ENSBTAG00000051824 CHD3 IST1 MED1 KLC4 TM9SF3 EZH1 ANKRD46 LGALS3BP RPL8 SREBF2 BDP1 NUCB1 PPP4C MAF1 OTUB1ATP5MG HINT1 DDB2 IRF3 RPL7A CLASP2 IFI35 RNF181 DMXL1 RHOT2 NDUFAF8 ATP2A3 POLR2I USP24 CALM3 RBCK1 TBC1D10B SNRPD2 PBXIP1 FAM219B SMG1 GUK1 SEMA4A PQBP1 TAGLN2 SERTAD1 UBA1 EXOSC7 CORO1A VPS8 ZBTB11 MTMR9 CIB1 TTC3 RPS13 ENSBTAG00000018987 RALGAPB TAOK3 CNPPD1 TECR JARID2 RBMX HTT ZFYVE19 MPIG6B VPS4A COG8 DDRGK1 BET1L ENSBTAG00000013917 AMN1 MOV10 CLK2 FAM122B MARCKS ENSBTAG00000015839 TMEM208 RBM47 ZZEF1 NT5C3A PGPEP1 GALNT7 ADAM9 CDC27 MED11 KMT2B SLC35A2 MYDGF CAPN10 TOP2A RLF AURKAIP1 CASP8AP2 YIF1A RPL21 NDUFS2 RSAD1 SPI1 CD2BP2 PKN1 PRPF19 PRR3 BRMS1 U2SURP DNPH1 DGKD ENSBTAG00000006241 IDH3B IL27RA ARHGAP25 NPAT SCNM1 SPSB3 PSME3 DBF4 WDR6 NELFE RPSA AQR JMJD7 EIF3G COPS6 ARL6IP4 ENSBTAG00000052566 USF1 CDK5RAP3 MRPL54 INTS2 FGR CAPNS1 PSMD2 CTC1 COG6 NUP214 B3GAT3 ZBTB40 NBEA LARS2 ENSBTAG00000024492 TUBB1 ZNF394 KRI1 NAA10 AATF DYNC1H1 SPAST ZNF692 ENSBTAG00000051942 YTHDC2 ENSBTAG00000006523 TRANK1 GSTK1 P2RY12 KMT2C UBE2W PRPF4 CYBB ZBTB38 FAR1 NDUFAF3 SYNGAP1 PGAP3 DUS1L E4F1 KNL1 RMND5A ATG101 ACADVL ATAD2 PREPL SIGIRR MED9 C2CD2L VPS25 RAD23A PHYHD1 SRP72 TMEM219 RRP12 ATP2B1 ENSBTAG00000019994 POLR2A KCTD11 PSMG4 ENSBTAG00000034656NACA AP4M1 ZNF518A ENSBTAG00000038379 ZDHHC16 CNIH1 PPP2R5D TNFRSF1A FGD3 UTP14A TTC37 ADCK2 MTHFR MCM3AP TDG APAF1 UTP20 C23H6orf47 MRPL43 ATL3 ZNF329 GSTP1 ADAMTSL4 KDM6A TARS2 CHMP1A UBP1 C5AR2 TMEM167B FMNL1 SPIB TMPO EEF2 ENSBTAG00000048925 UBAP2L AGPAT1 MBNL1 PIM2 WASL AKIRIN1 TMEM134 NLRC5 FXR2 GPR132 NEU1 CIITA USO1 BTAF1 MED22 CKB RHOG EEA1 FKBP11 ENSBTAG00000054814 ZNF202 DCAF11 B9D2 ZFAND2B VSIR PRRC2A DCTN1 TAF6 ZNF574 GIGYF2 NUDT12 EMC6 CDC5L CHMP4A TMA7 CYFIP2 GPI CDK10 MARS COPB1 MAP4K1 TRAPPC1 PTPRC BASP1 UBE2D3 MAPKAPK2 RAP1A ENSBTAG00000052956 ATP5ME ENSBTAG00000054972 EIF3CL NRADD KRAS SLC25A44 RAD18 MPP5 TESC ZSWIM8 SEPT1 FHL3 LRRN4CL HECTD3 BOD1L1 RIF1 HINT3 ING4 ABHD16A AP1M1 GRK2 PRKAB1 BZW1 ENSBTAG00000006383 TRIM28 DLST TTBK2 GIT2 MATK DDX27 ARCN1 ENTPD4 APPL1 LGALS9 ENSBTAG00000006851 STT3A WDR25 UBXN7 TRAPPC8 OGG1 MYBL2 NAA50 ASPM IFIT1 POLDIP2 CYTIP POGLUT1 ARID3A TRAFD1 SRP68 RNF20

KBTBD8 CALCOCO1 URM1

r > 0.98 r < -0.98 STK10 PARK7 NUS1 RPS3A FLNA PPP1R9B ZKSCAN8 ENSBTAG00000052260 SGSH GOLGA2 ENSBTAG00000054322 RABGAP1 DNTTIP2 CYTH4 NR3C1 EIF4E2 DOC2G ENSBTAG00000046239 UGCG DOK1 ADAM8 EXOC5 MASTL IFIT2 GSK3A IPMK ASCC3 ENSBTAG00000052511 SMARCC2 DDX23 FDFT1

OGA ZNF410 TIMM13 SMAD5 CNOT1 FAM241A MARCH7 IFIT3 SP1 SEC24C

MEAF6 KDM4C VAT1L DCAF1 FAM91A1 DR1 NAP1L4 SUPT16H RNF125 IL2RG PRPF3 ADSL ATP5F1B PEX13 ABCF3 IL1B RPE BRAP BRD2 SETDB1 BMS1 LIN7C d NMT1 ADGRG5 SDS PLCL2 DRAP1 SMC4 CCT3 NXF1 PSMB9 ETAA1 KARS AI-preg NB-preg not preg AI-preg NB-preg not preg ENSBTAG00000027319

IPO5

ENSBTAG00000026650 CALR CERS5 POLDIP3 TRPV2 NAIF1 ZNF644 b ACTR1A AMPD2

ATOX1

ZBTB17 TMEM106B AP2M1 N=1860 genesFAM208A

HK1 ZBTB1

MKI67 CISD2 UGGT1 correlationPTMA YTHDF2 TMEM259 CDC14B PBRM1 PANK3 HSPB1 GMCL1 INO80C TCTA HEATR5A ENSBTAG00000047432 PNRC2 MAN2A1 ENSBTAG00000054178 WWP1 BROX PCNP ANO6 ENSBTAG00000027962 ALS2 YLPM1 C7H19orf67 BRWD1 KRR1 JUND ENSBTAG00000025564 EIF5 TMEM170A

KPNB1 UBN1 CHD9 CIZ1 RAB3GAP2 MUS81 RPLP1

G3BP1 DCLRE1C

USP14 YIPF6 CLASRP GRN NUB1 ENSBTAG00000032764 UBL3 TXNDC9 PPP3R1 CD22 PSMD3 GOLGB1 ZEB2 PALM CPSF4

STUB1 G2E3 SBDS C5orf51 VPS11 STAT6 SYNCRIP ENSBTAG00000006878 CPSF1

HSPA13 RPL6 UPK3B CD82 MIIP CHD2

LCORL NXT2 ZFYVE16 MRTO4 WAS DHX15 HEATR1 MPST

DNAJC13 ENSBTAG00000009359 PLXNA3

EP300

CAND1 NAE1 DHRS1 MGAT4A SH3GLB2 FEN1 EIF3L CHML ENSBTAG00000010871 COPS9 CTSF RNF13 PDLIM1 HDHD3 CD2AP EIF3J HGS ZNF570 CA9 RFX7 RBL1 CPLANE1 QKI PPIA SLC52A2 SLC38A1 SLK SCAMP3 TFIP11 PNP VWA8 TFEB PPP1CB CASP6

DHX29 TBK1 USP11 ZFP14 SIPA1 BZW2 ENSBTAG00000014862 NFAT5 MGAT2 SMIM7 MINDY1 CYB5R4 SLC6A4 TRIM2 MSANTD4 LNPK VPS28 DENND6A AGER ZMYM1 ATAD2B FAM208B IQGAP1 ENSBTAG00000045729 UNC45A CAPRIN1 RASA2 ENSBTAG00000051824 CHD3 IST1 MED1 KLC4 TM9SF3 EZH1

TAOK3 CNPPD1 TECR JARID2 RBMX HTT ZFYVE19 MPIG6B VPS4A COG8 DDRGK1 BET1L ENSBTAG00000013917 AMN1 MOV10 CLK2 FAM122B MARCKS ENSBTAG00000015839 TMEM208 RBM47 ZZEF1 NT5C3A PGPEP1 GALNT7 ADAM9 CDC27 MED11 KMT2B SLC35A2 MYDGF CAPN10 TOP2A RLF pregnant heifers heifers pregnant

COG6 NUP214 B3GAT3 ZBTB40 NBEA LARS2 ENSBTAG00000024492 TUBB1 ZNF394 KRI1 NAA10 AATF DYNC1H1 SPAST ZNF692 ENSBTAG00000051942 YTHDC2 ENSBTAG00000006523 TRANK1 GSTK1 P2RY12 KMT2C UBE2W PRPF4 CYBB ZBTB38 FAR1 NDUFAF3 SYNGAP1 PGAP3 DUS1L E4F1 KNL1 RMND5A

ADCK2 MTHFR MCM3AP TDG APAF1 UTP20 C23H6orf47 MRPL43 ATL3 ZNF329 GSTP1 ADAMTSL4 KDM6A TARS2 CHMP1A UBP1 C5AR2 TMEM167B FMNL1 SPIB TMPO EEF2 ENSBTAG00000048925 UBAP2L AGPAT1 MBNL1 PIM2 WASL AKIRIN1 TMEM134 NLRC5 FXR2 GPR132 NEU1

TAF6 ZNF574 GIGYF2 NUDT12 EMC6 CDC5L CHMP4A TMA7 CYFIP2 GPI CDK10 MARS COPB1 MAP4K1 TRAPPC1 PTPRC BASP1 UBE2D3 MAPKAPK2 RAP1A ENSBTAG00000052956 ATP5ME ENSBTAG00000054972 EIF3CL NRADD KRAS SLC25A44 RAD18 MPP5 TESC ZSWIM8 SEPT1 FHL3 LRRN4CL

GRK2 PRKAB1 BZW1 ENSBTAG00000006383 TRIM28 DLST TTBK2 GIT2 MATK DDX27 ARCN1 ENTPD4 APPL1 LGALS9 ENSBTAG00000006851 STT3A WDR25 UBXN7 TRAPPC8 OGG1 MYBL2 NAA50 ASPM IFIT1 POLDIP2 CYTIP POGLUT1 ARID3A TRAFD1 SRP68 RNF20

STK10 PARK7 NUS1 RPS3A FLNA PPP1R9B ZKSCAN8 ENSBTAG00000052260 SGSH GOLGA2 ENSBTAG00000054322 RABGAP1 DNTTIP2 CYTH4 NR3C1 EIF4E2 DOC2G ENSBTAG00000046239 UGCG DOK1 ADAM8 EXOC5 MASTL IFIT2 GSK3A IPMK ASCC3 ENSBTAG00000052511 SMARCC2 DDX23 FDFT1 Coexpression

c N=1860 genes proteolysis differential correlation e AI-preg NB-preg not preg AI-preg NB-preg not preg preg - and NB - coexpression vs AI vs preg Differential Differential not

844 negative regulation of transcription by RNA polymerase II 845

a. Network with all genes

27 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.18.997379; this version posted March 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

846 Figure 3

a micro RNAs protein-coding genes biological processes bta-mir-320a-1 CDK10 bta-mir-181d MIR423 VRK2 bta-mir-130b bta-mir-130a GSN TMEM67 peptidyl-threonine phosphorylation bta-mir-744 EHD1 ROM1

USP16 cilium assembly MIR375 TBK1 bta-mir-1224 CLK1 regulation of gene expression bta-mir-181b-2 STK40 bta-mir-3956 FBF1 b coding genes coding - protein

micro RNAs AI-preg NB-preg not preg AI-preg NB-preg not preg c

847 848

Perhaps should put GO BP results into a table

28 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.18.997379; this version posted March 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

849 Figure 4 a contrasts b AI preg vs NB preg

AI preg vs NB preg AI preg vs Not preg

AI preg vs Not preg

AI preg vs Not preg NB preg vs Not preg

NB preg vs Not preg

AI-preg NB-preg not preg

c

850 851

29 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.18.997379; this version posted March 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

852 Figure 5

a AI preg Not preg b TRAINING SET FOR CLASSIFICATION c MODEL BASED CLASSIFICATION OF SAMPLES FROM 198 genes for TEST GROUP (year 2) 12 samples (year 1) AI preg Not preg Year 1 Year Year 2 Year 198 AI preg Not preg DEGs Year 1 Year ASSESS ACURACY BASED ON BLIND CLASSIFICATION 11 correct 10 correct 9 correct 1 correct … Year 2 Year 82 MACHINE LEARNING MODELS 11/11= 100% 10/11≅ 90% 9/11≅ 81% 1/11= 9% accuracy accuracy accuracy accuracy

d Parallel Random Forest e f

853 854

30