ABSTRACT

MITCHELL III, ROBERT DRAKE. Global Human Health Risks for Arthropod Repellents or Insecticides and Alternative Control Strategies. (Under the direction of Dr. R. Michael Roe).

Protein-coding genes and environmental chemicals. New paradigms for human health risk assessment of environmental chemicals emphasize the use of molecular methods and human-derived cell lines. In this study, we examined the effects of the repellent DEET (N,N-diethyl-m-toluamide) and the phenylpyrazole insecticide fipronil (fluocyanobenpyrazole) on transcript levels in primary human hepatocytes. These chemicals were tested individually and as a mixture. RNA-Seq showed that 100 µM DEET significantly increased transcript levels for 108 genes and lowered transcript levels for 64 genes, whereas fipronil at 10 µM increased the levels of 2,246 transcripts and decreased the levels of 1,428 transcripts. Fipronil was 21 times more effective than DEET in eliciting changes, even though the treatment concentration was 10-fold lower for fipronil than for DEET. The mixture of DEET and fipronil produced a more than additive effect (levels increased for 3,017 transcripts and decreased for 2,087 transcripts). The transcripts affected in our treatments influenced various biological pathways and processes important to normal cellular functions.

Long non-coding RNAs and environmental chemicals. While the synthesis and use of new chemical compounds is at an all-time high, the study of their potential impact on human health is quickly falling behind. We chose to examine the effects of two common environmental chemicals, the insect repellent DEET and the insecticide fipronil, on transcript levels of long non-protein coding RNAs (lncRNAs) in primary human hepatocytes. While lncRNAs are believed to play a critical role in numerous important biological processes, many remain uncharacterized, and their functions and modes of action remain largely unclear. RNA-Seq showed that 100 µM DEET significantly increased transcript levels for 2 lncRNAs and lowered transcript levels for 18 lncRNAs, while fipronil at 10 µM increased transcript levels for 76 lncRNAs and decreased levels for 193 lncRNAs. A mixture of 100 µM DEET and 10 µM fipronil increased transcript levels for 75 lncRNAs and lowered transcript levels for 258 lncRNAs. Differentially expressed lncRNA genes were mapped to chromosomes, analyzed by proximity to neighboring protein-coding genes, and functionally characterized via gene ontology and molecular mapping algorithms. While further testing is required to assess the organismal impact of changes in transcript levels, initial analysis links several of the dysregulated lncRNAs to processes and pathways critical to proper cellular function.

Tick Haller’s organ detects infrared light. The Haller’s organ (HO), unique to ticks and mites, is found only on the first tarsus of the front pair of legs. The current thinking is that the HO’s main function is chemosensation analogous to that of the insect antennae, but the functionality of its atypical structure (exclusive to the Acari) is unexplained. We provide the first evidence that the HO allows the American dog tick, Dermacentor variabilis, to respond to infrared (IR) light. Unfed D. variabilis adults with their HOs present were positively phototactic to IR. However, when the HOs were removed by amputation of the tarsi bearing the HOs, no IR response was detected. Ticks in these experiments were also attracted to white light with and without the HOs, but were only positively phototactic to white light when the ocelli (primitive eyes) were present. Covering the eyes did not prevent IR attraction. A TRPA1 receptor was characterized from a D. variabilis-specific HO transcriptome we constructed. This receptor was homologous to the transient receptor potential cation channel, subfamily A, member 1 (TRPA1) from the pit organ of the pit viper, python, and boa families of snakes, the only receptor identified so far for IR detection. The ability of ticks to use IR for host finding is consistent with their obligatory hematophagy and has practical applications in tick trapping and the development of new repellents.

© Copyright 2017 Robert Drake Mitchell III

All Rights Reserved

Global Human Health Risks for Arthropod Repellents or Insecticides and Alternative Control Strategies

by Robert Drake Mitchell III

A dissertation submitted to the Graduate Faculty of North Carolina State University in partial fulfillment of the requirements for the degree of Doctor of Philosophy

Entomology

Raleigh, North Carolina

2017

APPROVED BY:

Dr. R. Michael Roe, Committee Chair

Dr. Daniel Sonenshine

Dr. Ernest Hodgson

Dr. Marcé Lorenzen

Dr. Dominic Reisig

DEDICATION

I dedicate this dissertation to my mother and father, Bob and Beulah Mitchell. Without your love and support I wouldn’t be the man I am today or have had the opportunities you provided. Thank you from the bottom of my heart. Also, to my fiancée Tiffany and my sisters Marie and Chris for their love and support.


BIOGRAPHY

Robert Drake Mitchell III was born to Bob and Beulah Mitchell on January 8, 1980 during a blizzard in Norfolk, VA. He was the third child in the family and the only boy. He attended Norfolk Collegiate School in Norfolk, VA through the entirety of high school and continued his education at American University in Washington, DC. Soon after receiving a bachelor’s degree in Biology he attended Old Dominion University in Norfolk, VA in pursuit of a Master’s degree. Under the tutelage of Dr. Daniel E. Sonenshine he developed a love for his work as well as a respect for its importance. He attended Eastern Virginia Medical School for three years as a graduate student before moving to Raleigh, NC to pursue a Ph.D. in Entomology at North Carolina State University under the guidance of Dr. R. Michael Roe.


ACKNOWLEDGEMENTS

To my family and friends. I’d like to thank everyone who has supported me along the way, including my fiancée Tiffany Benzine, my mother Beulah Mitchell, my sisters Marie Davis and Christine Mitchell, and my father Bob Mitchell, who passed away six years ago. I regret that he is not here to see me succeed, but I know that he always had faith in me, and I will cherish that feeling for the rest of my life. Also, my nieces and nephews, who I hope have success in their own lives.

I’d also like to personally thank two mentors that I hope to emulate throughout my career, Dr. Daniel Sonenshine and Dr. R. Michael Roe. Without either of them I’d not be where I am today or have the dedication and appreciation for my work that I’ve developed over the years and learned from them. Their tireless efforts have provided so much for the scientific community and for myself as well.

My committee members have also been instrumental in my growth and understanding so I’d like to thank them as well: Dr. Marcé Lorenzen, Dr. Ernest Hodgson, Dr. Dominic Reisig, Dr. R. Michael Roe, and Dr. Daniel Sonenshine. Each has helped me at critical steps along the way and also helped me to stay focused through good times and more difficult times.

Finally, I’d like to thank those that I’ve worked with throughout the years and gotten to know very well. I have such an eclectic group of friends that have supported and challenged me throughout my education, including Anirudh Dhammi, Jiwei Zhu, Jaap van Kretschmar, Nick Travanty, Ann Carr, Loganathan Ponnusamy, Grayson Cave, John Strider, Haley Thornton Sutton, Marcel Deguenon, Charles Apperson, Clyde Sorenson, Shane Ceraul, Sayed Khalil, Kevin Donohue, Xin Guo and so many others.


TABLE OF CONTENTS

List of Tables ...... vii
List of Figures ...... ix
Chapter 1: Impact of Environmental Chemicals on the Transcriptome of Primary Human Hepatocytes: Potential for Health Effects ...... 1
    Abstract ...... 2
    Introduction ...... 3
    Materials and Methods ...... 5
        Cell Culture and Treatments ...... 5
        RNA Isolation, Quality Assessment, and Sequencing ...... 7
        Data Analysis ...... 8
        Quantitative PCR (qPCR) Analysis ...... 10
    Results ...... 12
        DEET and Fipronil Exposure Alter Gene Expression in Primary Human Hepatocytes ...... 12
        DEET-Fipronil Mixture ...... 15
        Impact of DEET, Fipronil and DEET plus Fipronil on Biological Processes and Molecular Functions ...... 17
        Chromosomal Distribution of DEET, Fipronil, and DEET plus Fipronil Dysregulated Genes ...... 17
        P450s ...... 19
        Transcripts Related to Endocrine Metabolism and Function ...... 20
        Transcripts involved in Steroid Hormone Biosynthesis were affected by DEET, Fipronil and DEET plus Fipronil ...... 22
    Discussion ...... 25
    Acknowledgements ...... 27
    References ...... 28
    Tables ...... 38
    Figures ...... 48
Chapter 2: Differential Expression Profile of lncRNAs from Primary Human Hepatocytes Following DEET and Fipronil Exposure ...... 56
    Abstract ...... 57
    Introduction ...... 58
    Materials and Methods ...... 62
        Cell Culture and Treatments ...... 62
        RNA Isolation, Quality Assessment, and Sequencing ...... 64
        Data Analysis ...... 66
    Results and Discussion ...... 69
        DEET and Fipronil Exposure Significantly Alter lncRNA Transcript Levels in Primary Human Hepatocytes ...... 69
        Chromosomal Distribution of lncRNAs Dysregulated by DEET and Fipronil ...... 72
        Comparison of Dysregulated lncRNA and Protein-Coding Gene Chromosomal Distribution ...... 73
        Chromosomal Maps Help Visualize Dysregulated Protein-Coding Gene-lncRNA Relationships ...... 76
        Identifying Protein-Coding Genes Associated with lncRNAs based on Genomic Location ...... 79
        Inferring Functionality of lncRNAs through lncRNA-Coding Gene Relationships ...... 81
        LncRNAs Dysregulated by DEET and Fipronil Important in Biological Processes ...... 82
        Metabolic Pathways Important to Normal Cellular Function Influenced by Dysregulated lncRNAs and Neighboring Protein-Coding Genes ...... 86
    Implications and Future Directions ...... 94
    Acknowledgements ...... 96
    References ...... 97
    Tables ...... 108
    Figures ...... 115
    Supplemental Material ...... 138
        Supplement 1 ...... 138
        Supplement 2 ...... 148
        Supplement 3 ...... 160
        Supplement 4 ...... 161
        Supplement 5 ...... 163
        Supplement 6 ...... 164
Chapter 3: Infrared Light Detection by the Haller’s Organ of Adult American Dog Ticks, Dermacentor variabilis (Ixodida: Ixodidae) ...... 167
    Abstract ...... 168
    Introduction ...... 169
    Materials and Methods ...... 172
        Ticks ...... 172
        Scanning Electron Microscopy ...... 172
        Behavioral Bioassays ...... 173
        Transcriptomic Analysis ...... 175
    Results and Discussion ...... 177
        Tick Forelegs Detect IR Light ...... 177
        Mechanism of IR Detection in Ticks ...... 180
        Molecular Clues for IR Detection in Ticks ...... 183
    Conclusion ...... 186
    Acknowledgements ...... 187
    References ...... 188
    Tables ...... 192
    Figures ...... 194


LIST OF TABLES

Table 1.1. Putative functions for the 15 protein-coding transcripts with the highest log2 fold change that were up-regulated and down-regulated when primary human hepatocytes were treated for 72 hours with 100 µM DEET ...... 38

Table 1.2. Putative functions for the 15 protein-coding transcripts with the highest log2 fold change that were up-regulated and down-regulated when primary human hepatocytes were treated for 72 hours with 10 µM fipronil ...... 40

Table 1.3. Putative functions for the 15 protein-coding transcripts with the highest log2 fold change that were up-regulated and down-regulated when primary human hepatocytes were treated for 72 hours with a mixture of 100 µM DEET and 10 µM fipronil ...... 42

Table 1.4. Transcripts associated with endocrine disruption that were up- or down-regulated when primary human hepatocytes were treated for 72 hours with 100 µM DEET, 10 µM fipronil or a mixture of 100 µM DEET and 10 µM fipronil ...... 44

Table 1.5. Transcripts for P450s that were up- or down-regulated when primary human hepatocytes were treated for 72 hours with 100 µM DEET, 10 µM fipronil or a mixture of 100 µM DEET and 10 µM fipronil (left panel). Venn diagram of P450 distribution between three treatments (right panel) ...... 47

Table 2.1. Long non-protein coding RNAs (lncRNAs) whose transcripts were up- or down-regulated in primary human hepatocytes after exposure to 100 µM DEET, 10 µM fipronil, or a mixture of 100 µM DEET and 10 µM fipronil for 72 hours. Log2FC = log2 fold change; DT = 100 µM DEET; Fip = 10 µM fipronil; and DT+Fip = 100 µM DEET plus 10 µM fipronil mixture ...... 108

Table 2.2. Chromosomal distribution of lncRNAs significantly dysregulated (P ≤ 0.01) after primary human hepatocytes were exposed to 100 µM DEET, 10 µM fipronil, or a mixture of 100 µM DEET and 10 µM fipronil. DT+Fip = mixture of 100 µM DEET and 10 µM fipronil ...... 110

Table 2.3. Chromosomal distribution of protein-coding genes significantly dysregulated (P ≤ 0.01) after primary human hepatocytes were exposed to 100 µM DEET, 10 µM fipronil, or a mixture of 100 µM DEET and 10 µM fipronil. DT+Fip = mixture of 100 µM DEET and 10 µM fipronil ...... 112

Table 2.4. Protein-coding and non-protein coding genes neighboring (within 1,000 kb) the 20 lncRNAs whose transcripts were up- or down-regulated by 100 µM DEET using GREAT algorithm parameters. The GREAT algorithm defines neighboring genes as those whose transcription start site (TSS) is within 1,000 kb of the input lncRNAs. lncRNA = long non-protein coding RNA; kb = kilobases. All gene names are HUGO Gene Nomenclature Committee (HGNC) gene symbols ...... 114


Table 3.1. Top 5 BLASTP hits for two HO-specific RNA-Seq contigs putatively assigned as TRPA1s exclusive to the HO of Dermacentor variabilis from our transcriptome (top) and schematic representation of putative HO-specific TRPA1 partial transcripts in D. variabilis from our transcriptomes aligned with a full-length TRPA1 from C. atrox and a putative TRPA1 from I. scapularis and A. aureolatum using DELTA-BLAST (bottom). (*) indicates putatively assigned TRPA1 (i.e. partial transcript, TRPA1 homolog, or TRPA1-like protein); “Desc.” is contig identification assignment; “Hit Acc.” is accession number for match to contig; “Sim.” is similarity score between contig and hit, i.e. the extent to which the two sequences are related; “Len.” is amino acid alignment length between contig and hit; “Pos.” is number of exact amino acid matches between query and subject sequence. The numbers on the schematic (bottom) above the protein sequence illustrations represent amino acid positions. BLASTP = protein-protein BLAST; DELTA-BLAST = Domain Enhanced Lookup Time Accelerated BLAST. Matching organisms: Amblyomma aureolatum, Ixodes scapularis (deer tick), Cancer borealis (Jonah crab), Homarus americanus (American lobster), Lingula anatina (lamp shell, a brachiopod), Stegodyphus mimosarum (velvet spider), Parasteatoda tepidariorum (common house spider) ...... 192


LIST OF FIGURES

Figure 1.1. Statistically significant (alpha = 0.05) fold change in transcript levels compared to a control in primary human hepatocytes treated for 72 hours with (A) 100 µM DEET, (B) 10 µM fipronil or (C) a mixture of 100 µM DEET and 10 µM fipronil. The control was treated with DMSO alone. (D) Venn diagram showing the common and exclusive total up- and down-regulated transcripts for each treatment (A-C). For A-C, plotted points that are red are statistically significant, whereas plotted points that are black are not significant ...... 48

Figure 1.2. Top 10 Gene Ontology (GO) level 3 (biological processes) matches for the statistically significant (alpha = 0.05) changes in transcript levels compared to a control (Fig. 1) in primary human hepatocytes treated for 72 hours with 100 µM DEET (top left), 10 µM fipronil (top right), and a mixture of 100 µM DEET and 10 µM fipronil (bottom, middle) in DMSO. The control was treated with DMSO alone ...... 49

Figure 1.3. Top 10 Gene Ontology (GO) level 3 (molecular functions) matches for the statistically significant (alpha = 0.05) changes in transcript levels compared to a control (Fig. 1) in primary human hepatocytes treated for 72 hours with 100 µM DEET (top left), 10 µM fipronil (top right), and a mixture of 100 µM DEET and 10 µM fipronil (bottom, middle) in DMSO. The control was treated with DMSO alone ...... 50

Figure 1.4. Human chromosomal maps showing the location of genes for the statistically significant (α = 0.05) up- and down-regulated transcripts produced by primary human hepatocytes treated for 72 hours with 100 µM DEET in DMSO (4A), 10 µM fipronil in DMSO (4B), and a mixture of 100 µM DEET plus 10 µM fipronil in DMSO (4C) compared to a DMSO-only control. For each chromosome drawn, length is proportional to the relative length of each chromosome in the human genome. Also shown as an indentation in the chromosome structure is the relative position of the centromere; the relative location of each gene that produced the up- and down-regulated transcripts is indicated by a horizontal line. The black number below each chromosome illustration identifies which chromosome each represents from the human genome. The blue number above the chromosome illustrations in 4A-4C represents the percentage of genes that were differentially-expressed on that chromosome. The black number above the chromosome illustrations in 4A-4C represents the genes affected as a percentage of total genes on that chromosome. The right and left panels in 4D are a graphical representation of the numerical values above each chromosome illustration ...... 51

Figure 1.5. Transcripts for enzymes in the steroid hormone biosynthesis pathway that were significantly (α = 0.05) up- (green) or down- (red) regulated when primary human hepatocytes were treated for 72 hours with 100 µM DEET, 10 µM fipronil, or a combination of 100 µM DEET plus 10 µM fipronil. 3β-HSD1 = hydroxy-delta-5-steroid dehydrogenase, 3 beta- and steroid delta-isomerase 1; 17β-HSD = 17β-hydroxysteroid dehydrogenase; 11β-HSD1 = 11β-hydroxysteroid dehydrogenase type 1; POR = cytochrome P450 reductase; CYB5A = cytochrome b5, type A; SULT2A1 = sulfotransferase 2A1; CYP21A2 = cytochrome P450 21A2. d = DEET; f = fipronil; and d+f = DEET plus fipronil ...... 55


Figure 2.1. Relationships between the number of long non-protein coding RNAs (lncRNAs) whose transcripts were significantly up- or down-regulated (A) and protein-coding genes whose transcripts were differentially expressed (B) when primary human hepatocytes were treated with DEET (100 µM), fipronil (10 µM), or a mixture of the two (100 µM DEET and 10 µM fipronil) for 72 h. (A) lncRNAs with transcripts differentially expressed at P ≤ 0.01; (B) protein-coding genes with transcripts differentially expressed at P ≤ 0.01 ...... 115

Figure 2.2. Log2 fold change of transcripts significantly differentially expressed from long non-protein coding RNA genes by chromosome (P ≤ 0.01). Shared by all 3 means those transcripts were differentially expressed in all 3 treatments; Fip only means those transcripts were only differentially expressed when treated with 10 µM fipronil; Shared Fip and DT+Fip means those transcripts were only differentially expressed when treated with fipronil or a combination of DEET and fipronil, but not 100 µM DEET alone; and DT+Fip only means those transcripts were only differentially expressed with the combination of DEET and fipronil, but not each treatment alone. A single representative log2 fold change value was used for transcripts that were differentially expressed under more than one treatment condition. *A representative log2 fold change refers to the average log2 fold change for any genes whose transcript expression was affected by more than one treatment, like shared by all 3, where the same genes were dysregulated by all 3 treatment conditions ...... 116

Figure 2.3. Chromosome maps showing either (A) location of lncRNAs with up- or down-regulated transcripts or (B-D) location of lncRNAs with differentially expressed transcripts in relation to dysregulated protein-coding genes after exposure of primary human hepatocytes to 100 µM DEET, 10 µM fipronil, or a mixture of 100 µM DEET and 10 µM fipronil. (A) Chromosomal location of lncRNAs with significantly up- or down-regulated transcripts (P ≤ 0.01) when hepatocytes were exposed to 100 µM DEET; (B) chromosomal location of lncRNAs and protein-coding genes significantly dysregulated (P ≤ 0.01) when hepatocytes were exposed to 100 µM DEET; (C) chromosomal location of lncRNAs and protein-coding genes significantly dysregulated (P ≤ 0.01) when hepatocytes were exposed to 10 µM fipronil; (D) chromosomal location of lncRNAs and protein-coding genes significantly dysregulated (P ≤ 0.01) when hepatocytes were exposed to a mixture of 100 µM DEET and 10 µM fipronil ...... 117

Figure 2.4. Chromosomal location of lncRNA genes within 1,000 kb of protein-coding genes with differentially expressed transcripts after primary human hepatocytes were exposed to 100 µM DEET or 10 µM fipronil. (4A) Location of dysregulated lncRNAs and neighboring (within 1,000 kb) protein-coding genes affected by 100 µM DEET on selected chromosomes and (4B) location of dysregulated lncRNAs and neighboring protein-coding genes affected by 10 µM fipronil on selected chromosomes. p-arm = short arm of chromosome; q-arm = long arm of chromosome; black star = corresponding lncRNA from 100 µM DEET treatment ...... 122

Figure 2.5. Genomic relationship of lncRNA transcription sites whose transcripts were differentially expressed in primary human hepatocytes after exposure to 100 µM DEET to the nearest protein-coding gene transcription start site (TSS) that lies within 1,000 kilobases (kb) of the lncRNA. If the nearest TSS was over 1,000 kb away, then no neighboring protein-coding genes were assigned. Neighboring protein-coding genes are defined by the GREAT algorithm as those within 1,000 kb of an input genomic region (lncRNA in this case). The GREAT algorithm assumes that lncRNAs within 1,000 kb of a neighboring gene TSS can affect transcription of that neighboring gene. GREAT calculates the distance between input sequences and target TSS by measuring the distance from the middle of an lncRNA to the closest TSS of a protein-coding gene. (A) Number of protein-coding genes associated with up- or down-regulated lncRNA transcription sites (referred to as genomic regions in the graphs) after 100 µM DEET exposure; (B) genomic distance in kb of lncRNA transcription sites with up- or down-regulated transcripts after 100 µM DEET exposure to closest protein-coding gene TSS, and (C) genomic distance in kb and orientation of lncRNA transcription sites with up- or down-regulated transcripts after 100 µM DEET exposure to closest protein-coding gene TSS. Values in (A) that are red indicate the number of genomic regions (i.e., lncRNAs) that do not lie within 1,000 kb of a protein-coding gene TSS. Percentages on the Y axis in (A) refer to the ratio of differentially expressed lncRNAs that neighbor 0, 1, or 2 or more protein-coding genes to the total number of lncRNAs dysregulated by 100 µM DEET (20 total in this case). Percentages on the Y axis in (B) refer to the ratio of lncRNAs that fall into categories within a certain range from the TSS of the closest protein-coding gene to the total number of lncRNAs that fall within these ranges (36 total in this case since a single gene can span more than one range category). Percentages on the Y axis in (C) refer to the ratio of lncRNAs that fall into categories within a certain range both before and after the closest protein-coding gene TSS to the total number of lncRNAs that fall within these ranges (36 total in this case) ...... 124

Figure 2.6. Genomic relationship of lncRNA transcription sites whose transcripts were differentially expressed in primary human hepatocytes after exposure to 10 µM fipronil to the nearest protein-coding gene transcription start site (TSS) that lies within 1,000 kilobases (kb) of the lncRNA. If the nearest TSS was over 1,000 kb away, then no neighboring protein-coding genes were assigned. Neighboring protein-coding genes are defined by the GREAT algorithm as those within 1,000 kb of an input genomic region (lncRNA in this case). The GREAT algorithm assumes that lncRNAs within 1,000 kb of a neighboring gene TSS can affect transcription of that neighboring gene. GREAT calculates the distance between input sequences and target TSS by measuring the distance from the middle of an lncRNA to the closest TSS of a protein-coding gene. (A) Number of protein-coding genes associated with up- or down-regulated lncRNA transcription sites (referred to as genomic regions in the graphs) after 10 µM fipronil exposure; (B) genomic distance in kb of lncRNA transcription sites with up- or down-regulated transcripts after 10 µM fipronil exposure to closest protein-coding gene TSS, and (C) genomic distance in kb and orientation of lncRNA transcription sites with up- or down-regulated transcripts after 10 µM fipronil exposure to closest protein-coding gene TSS. Values in (A) that are red indicate the number of genomic regions (i.e., lncRNAs) that do not lie within 1,000 kb of a protein-coding gene TSS. Percentages on the Y axis in (A) refer to the ratio of differentially expressed lncRNAs that neighbor 0, 1, or 2 or more protein-coding genes to the total number of lncRNAs dysregulated by 10 µM fipronil (269 total in this case). Percentages on the Y axis in (B) refer to the ratio of lncRNAs that fall into categories within a certain range from the TSS of the closest protein-coding gene to the total number of lncRNAs that fall within these ranges (478 total in this case since a single gene can span more than one range category). Percentages on the Y axis in (C) refer to the ratio of lncRNAs that fall into categories within a certain range both before and after the closest protein-coding gene TSS to the total number of lncRNAs that fall within these ranges (478 total in this case) ...... 126

Figure 2.7. Genomic relationship of lncRNA transcription sites whose transcripts were differentially expressed in primary human hepatocytes after exposure to 100 µM DEET plus 10 µM fipronil to the nearest protein-coding gene transcription start site (TSS) that lies within 1,000 kilobases (kb) of the lncRNA. If the nearest TSS was over 1,000 kb away, then no neighboring protein-coding genes were assigned. Neighboring protein-coding genes are defined by the GREAT algorithm as those within 1,000 kb of an input genomic region (lncRNA in this case). The GREAT algorithm assumes that lncRNAs within 1,000 kb of a neighboring gene TSS can affect transcription of that neighboring gene. GREAT calculates the distance between input sequences and target TSS by measuring the distance from the middle of an lncRNA to the closest TSS of a protein-coding gene. (A) Number of protein-coding genes associated with up- or down-regulated lncRNA transcription sites (referred to as genomic regions in the graphs) after 100 µM DEET plus 10 µM fipronil exposure; (B) genomic distance in kb of lncRNA transcription sites with up- or down-regulated transcripts after 100 µM DEET plus 10 µM fipronil exposure to closest protein-coding gene TSS, and (C) genomic distance in kb and orientation of lncRNA transcription sites with up- or down-regulated transcripts after 100 µM DEET plus 10 µM fipronil exposure to closest protein-coding gene TSS. Values in (A) that are red indicate the number of genomic regions (i.e., lncRNAs) that do not lie within 1,000 kb of a protein-coding gene TSS. Percentages on the Y axis in (A) refer to the ratio of differentially expressed lncRNAs that neighbor 0, 1, or 2 or more protein-coding genes to the total number of lncRNAs dysregulated by 100 µM DEET plus 10 µM fipronil (331 total in this case). Percentages on the Y axis in (B) refer to the ratio of lncRNAs that fall into categories within a certain range from the TSS of the closest protein-coding gene to the total number of lncRNAs that fall within these ranges (603 total in this case since a single gene can span more than one range category). Percentages on the Y axis in (C) refer to the ratio of lncRNAs that fall into categories within a certain range both before and after the closest protein-coding gene TSS to the total number of lncRNAs that fall within these ranges (603 total in this case) ...... 128

Figure 2.8. Top 10 biological processes and signaling pathways affected by exposure of primary human hepatocytes to 100 µM DEET for 72 hours. (A) Top 10 biological processes affected by treatment with 100 µM DEET and (B) top 10 signaling pathways affected by treatment with 100 µM DEET ...... 130

Figure 2.9. Top 10 biological processes and signaling pathways affected by exposure of primary human hepatocytes to 10 µM fipronil for 72 hours. (A) Top 10 biological processes affected by treatment with 10 µM fipronil and (B) top 10 signaling pathways affected by treatment with 10 µM fipronil ...... 131


Figure 2.10. Top 10 biological processes and signaling pathways affected by exposure of primary human hepatocytes to 100 µM DEET plus 10 µM fipronil for 72 hours. (A) Top 10 biological processes affected by treatment with 100 µM DEET plus 10 µM fipronil and (B) top 10 signaling pathways affected by treatment with 100 µM DEET plus 10 µM fipronil ...... 132

Figure 2.11. Protein–protein interaction network mapping direct (physical) as well as indirect (functional) associations with gene products from protein-coding genes neighboring lncRNAs whose transcripts were differentially expressed after primary human hepatocytes were exposed to 100 µM DEET (as defined by GREAT). Protein networks are a topological summary of protein-protein interactions with a queried protein, or multiple protein inputs, based on known and predicted interactions determined experimentally or from curated databases. STRING version 10 is a database that compiles interaction information from various sources and utilizes this data to generate the interaction networks displayed here. Settings were as follows: Number of input = 42; Minimum required interaction score: Low confidence (0.150); Max number of interactors to show: 1st shell = query proteins only; 2nd shell = maximum of 5 interactors; Disconnected nodes hidden in network. Colored nodes = query proteins and first shell of interactors; white nodes = second shell of interactors; large nodes = some 3D structure is known or predicted; small nodes = protein of unknown 3D structure; blue lines (known interaction) = from curated databases; magenta lines (known interaction) = experimentally determined; green lines (predicted interactions) = gene neighborhood; red lines (predicted interactions) = gene fusions; purple lines (predicted interactions) = gene co-occurrence; chartreuse lines = text mining; black lines = co-expression; violet lines = protein homology ...... 133

Figure 2.12. Protein–protein interaction network mapping direct (physical) as well as indirect (functional) associations with proteins involved in the immune response. Gene products from protein-coding genes neighboring lncRNAs whose transcripts were differentially expressed after primary human hepatocytes were exposed to a mixture of 100 µM DEET and 10 µM fipronil were used as inputs. Some protein-coding genes involved in the immune response were also included whose transcripts were differentially expressed after exposure to a mixture of 100 µM DEET and 10 µM fipronil, but were not within 1,000 kb of any dysregulated lncRNAs. Protein networks are a topological summary of protein-protein interactions with a queried protein, or multiple protein inputs, based on known and predicted interactions determined experimentally or from curated databases. STRING version 10 is a database that compiles interaction information from various sources and utilizes this data to generate the interaction networks displayed here. See Figure 2.11 caption for STRING settings and line color meanings ...... 134

Figure 2.13. Protein–protein interaction network mapping direct (physical) as well as indirect (functional) associations with proteins involved in the p53 signaling pathway. Gene products from protein-coding genes neighboring lncRNAs whose transcripts were differentially expressed after primary human hepatocytes were exposed to a mixture of 100 µM DEET and 10 µM fipronil were used as inputs. Some protein-coding genes involved in the p53 signaling pathway were also included whose transcripts were differentially expressed after exposure to a mixture of 100 µM DEET and 10 µM fipronil, but were not within 1,000 kb of any dysregulated lncRNAs. Protein networks are a topological summary of protein-protein interactions with a queried protein, or multiple protein inputs, based on known and predicted interactions determined experimentally or from curated databases. STRING version 10 is a database that compiles interaction information from various sources and utilizes this data to generate the interaction networks displayed here. See Figure 2.11 caption for STRING settings and line color meanings ...... 135

Figure 2.14. Protein–protein interaction network mapping direct (physical) as well as indirect (functional) associations with proteins involved in the Ras signaling pathway. Gene products from protein-coding genes neighboring lncRNAs whose transcripts were differentially expressed after primary human hepatocytes were exposed to a mixture of 100 µM DEET and 10 µM fipronil were used as inputs. Some protein-coding genes involved in the Ras signaling pathway were also included whose transcripts were differentially expressed after exposure to a mixture of 100 µM DEET and 10 µM fipronil, but were not within 1,000 kb of any dysregulated lncRNAs. Protein networks are a topological summary of protein-protein interactions with a queried protein, or multiple protein inputs, based on known and predicted interactions determined experimentally or from curated databases. STRING version 10 is a database that compiles interaction information from various sources and utilizes this data to generate the interaction networks displayed here. See Figure 2.11 caption for STRING settings and line color meanings ...... 136

Figure 2.15. Protein–protein interaction network mapping direct (physical) as well as indirect (functional) associations with proteins involved in the Wnt signaling pathway. Gene products from protein-coding genes neighboring lncRNAs whose transcripts were differentially expressed after primary human hepatocytes were exposed to a mixture of 100 µM DEET and 10 µM fipronil were used as inputs. Some protein-coding genes involved in the Wnt signaling pathway were also included whose transcripts were differentially expressed after exposure to a mixture of 100 µM DEET and 10 µM fipronil, but were not within 1,000 kb of any dysregulated lncRNAs. Protein networks are a topological summary of protein-protein interactions with a queried protein, or multiple protein inputs, based on known and predicted interactions determined experimentally or from curated databases. STRING version 10 is a database that compiles interaction information from various sources and utilizes this data to generate the interaction networks displayed here. See Figure 2.11 caption for STRING settings and line color meanings ...... 137

Figure 3.1. Scanning electron micrographs of Dermacentor variabilis Haller’s organ (HO) and associated structures. (A) female, dorsal view at 25X, dotted line where tarsus I including HO was removed, (B) female, dorsal view of HO anterior pit and capsule at 500X, (C) male, dorsal view of HO anterior pit and capsule at 500X, and (D) female, dorsal view, aperture opening of capsule at 2500X. Arrows in panels B-D indicate undescribed structures resembling auricular or campaniform sensilla that may serve as IR detectors or assist in this function in both male and female D. variabilis. The white star in panel A denotes the location of the HO (star just above structure). The ocellus (primitive eye) is located between the brackets in panel A. The white dotted line in panel A denotes the location where the HO was ablated for the corresponding trials. Oc = ocellus, Cp = capsule, AP = anterior pit ..... 194


Figure 3.2. Arena calibration points and video screenshot. (A) Choice arena where two of the ports at right angles to each other were fitted with identical light sources (either visible light or infrared), and (B) high definition, IR-capable video camera capture of bioassay trial where HO and ocelli were present and unobstructed (tick moving toward IR light). At the beginning of each assay a single tick was placed at the start and a light source was illuminated (yellow bulb, lane 1). After crossing the finish line of “Direction I” the first light source was turned off (grey bulb, lane 2) and the second light source (at a right angle) was immediately illuminated (yellow bulb, lane 2). Once the tick traveled from the start to the finish of “Direction II” the assay was over. Any deviation out of the field of the light beam (and not correcting toward the light source) was considered non-responsive. Movement toward each light source was considered non-responsive if the tick took longer than 1 minute to move within 2.5 cm of the illuminated source. Yellow bulbs denote lights that were turned on while the grey bulb represents a light that was turned off ...... 195

Figure 3.3. Dermacentor variabilis choice assay conditions and results. (A) Tick response to visible light with Haller’s organs (HOs) removed or ocelli (Oc) blocked. (B) Tick response to infrared light with Haller’s organs (HOs) removed or ocelli (Oc) blocked. “+HO +Oc” means that both the HOs and Oc were intact for those trials. “+HO -Oc” means that the HOs were intact and the Oc were covered with black paint for those trials. “-HO +Oc” means that the HOs were removed and the Oc were intact for those trials. A black “X” on the illustrations above each bar graph represents where the ticks’ HOs were ablated or Oc were blocked. The frequency response was analyzed using a chi-squared test of homogeneity of proportions under the null hypothesis that the expected proportion for either choice (response or no response) was 0.50. Response to either visible or infrared light was significant at P ≤ 0.001 (**), P ≤ 0.01 (*) ...... 196


─── CHAPTER 1 ───

Impact of Environmental Chemicals on the Transcriptome of Primary Human Hepatocytes: Potential for Health Effects

Robert D. Mitchell III,1 Anirudh Dhammi,2 Andrew Wallace,3 Ernest Hodgson,4 and R. Michael Roe 5

1 Department of Entomology, Toxicology Program, Department of Biology, North Carolina State University, Raleigh, NC 27695, USA
2 Department of Entomology, North Carolina State University, Raleigh, NC 27695, USA
3 Toxicology Program, Department of Biology, North Carolina State University, Raleigh, NC 27695, USA
4 Department of Applied Ecology, Toxicology Program, Department of Biology, North Carolina State University, Raleigh, NC 27695, USA
5 Department of Entomology, Toxicology Program, Department of Biology, North Carolina State University, Raleigh, NC 27695, USA

Corresponding author: Dr. R. Michael Roe, E-mail: [email protected]

This chapter was published in the Journal of Biochemistry and Molecular Toxicology. Received 22 September 2015; revised 19 February 2016; accepted 25 February 2016.


Abstract

New paradigms for human health risk assessment of environmental chemicals emphasize the use of molecular methods and human-derived cell lines. In this study, we examined the effects of the insect repellent DEET (N,N-diethyl-m-toluamide) and the phenylpyrazole insecticide fipronil (fluocyanobenpyrazole) on transcript levels in primary human hepatocytes. These chemicals were tested individually and as a mixture. RNA-Seq showed that 100 µM DEET significantly increased transcript levels (α = 0.05) for 108 genes and lowered transcript levels for 64 genes, whereas fipronil at 10 µM increased the levels of 2,246 transcripts and decreased the levels of 1,428 transcripts. Fipronil was 21 times more effective than DEET in eliciting changes, even though the treatment concentration was 10-fold lower for fipronil than for DEET. The mixture of DEET and fipronil produced a more than additive effect (levels increased for 3,017 transcripts and decreased for 2,087 transcripts). The transcripts affected for all chemical treatments were classified by GO analysis and mapped to chromosomes. Specific pathways and individual transcripts affected are discussed. Changes found in transcript levels in response to treatments will require further research to understand their importance in overall cellular, organ and organismic function.


Introduction

Exposure of people to environmental chemicals, for example pesticides, could have deleterious effects on human health. Examining the health risk of these chemicals using molecular methods and human-derived cell lines has been recommended [1-7]. Thus, in the current study, we examined the effect of treatments with the insect repellent DEET (N,N-diethyl-m-toluamide) and the phenylpyrazole insecticide fipronil (fluocyanobenpyrazole) on transcript levels in primary human hepatocytes.

DEET, when applied to skin or clothing, offers protection against many biting insects such as mosquitoes and fleas. First developed by the U.S. military in 1946, it became available for public use in 1957 [8]. The concentration of DEET in commercially available products ranges from 5-100%, with the length of protection positively correlated with the concentration of DEET in the formulation [9, 10]. Approximately one-third of the U.S. population is exposed to DEET annually, reaffirming the need for extensive examination of potential toxic effects [11]. Although DEET is generally considered safe and its effects benign, isolated toxic effects associated with its use have been reported, including seizures, cardiovascular toxicity, and symptoms associated with Gulf War Syndrome such as fatigue, headaches, and dermatitis [10, 12, 13]. The dermal absorption of DEET is affected by co-application with oxybenzone-containing sunscreens [14]. Also, combined exposure to DEET and other pest control compounds such as permethrin enhanced biochemical, behavioral, and metabolic changes compared to each compound separately [15]. An important site of metabolism of DEET is the liver [16].

Fipronil is a highly effective γ-aminobutyric acid (GABA) receptor antagonist used in the urban environment and on companion animals in over 70 countries as well as on over 100 different crops to control pest insects [17, 18]. It was discovered by Rhône-Poulenc in the 1980s and became commercially available in 1993. The mode of action of fipronil involves binding to GABA-A and glutamate-gated chloride channels in insects, preventing the opening of chloride ion channels and resulting in uncontrollable neuron firing and central nervous system (CNS) toxicity. Fipronil is used at application levels ranging from 0.6-200 g active ingredient per hectare (a.i./ha) depending on target pest and product formulation [19]. It is highly toxic to fish, aquatic invertebrates, lizards, and gallinaceous birds and moderately toxic to mammals. Fipronil also has been suggested to contribute to colony collapse disorder in bees and was banned for use on sunflowers and corn in the European Union in 2013 [17, 20-22]. Long-term exposure of rats to fipronil can induce thyroid cancer and affect fertility, with the resultant offspring showing delayed development and increased mortality [23]. While fipronil is approved by the U.S. Environmental Protection Agency (EPA) for use on dogs and cats and around the home for termites, more research is needed to understand its potential health implications because of the close contact humans of all ages have with companion animals, potentially protracted exposure in and around the home, and potential synergistic effects with other chemicals.

The objective of this study was to analyze the impact of DEET and fipronil, separately and in combination, on global gene expression in human hepatocytes. RNA-Seq was used to analyze the impact of these compounds on transcript levels and to identify specific systems or pathways in which gene expression at the transcript level was up- and down-regulated. Testing DEET and fipronil as a mixture allowed us to assess the impact of chemical mixtures at the molecular level in human cells.


Materials and Methods

Cell Culture and Treatments

Plated primary human hepatocytes were obtained from Life Technologies Corporation, Carlsbad, CA, USA within 24 hours of their procurement from the patient. They arrived at our laboratory submerged in William’s E Medium in 12-well culture plates coated with a Collagen (Type I) substratum and a Geltrex® overlay at a density of 0.67 × 10^6 cells per well. The cells were harvested from the liver of a Caucasian female, age 62, with a body mass index (BMI) of 26.3 and no history of smoking or alcohol consumption. Individuals with a BMI > 27.5 show moderate steatosis and lower levels indicate low or no steatosis [24], a condition that may affect molecular studies. Upon arrival, the medium the cells were shipped in was removed and replaced with fresh, sterile William’s E Medium containing 0.292 g/L L-glutamine (Cat. No. W1878, Sigma-Aldrich, St. Louis, MO, USA) supplemented with (i) 10% (v/v) fetal bovine serum (Cat. No. S11050, Atlanta Biologicals, Norcross, GA, USA), (ii) 10^-7 M dexamethasone (Cat. No. P0500, Steraloids, Inc., Newport, RI, USA), (iii) insulin-transferrin-selenium A formulated with 0.17 mM insulin, 0.0069 mM transferrin, 0.0039 mM sodium selenite and 100.0 mM sodium pyruvate (Cat. No. 51300-044, Invitrogen, Carlsbad, CA, USA), (iv) penicillin/streptomycin/amphotericin B solution containing 10,000 units/mL of penicillin, 10,000 µg/mL of streptomycin, and 25 µg/mL of Fungizone® antimycotic (Cat. No. 15240-062, Invitrogen, Carlsbad, CA, USA), and (v) a sterile gentamicin/glutamate solution containing 200 mM L-glutamine and 5 mg/mL gentamicin in 0.9% sodium chloride (Cat. No. G9654, Sigma-Aldrich, St. Louis, MO, USA). The plate was then placed in a humidified incubator (relative humidity of 95%) at 5% CO2/95% air at a temperature of 37°C for 24 hours. After 24 hours, the medium was changed once again, and the cells were incubated for another 24 hours, resulting in a period of 48 hours during which the cells were maintained in fresh media for viability and quality observations.

Treatments with DEET and fipronil began 48 hours after the cells arrived. On the same plate, three wells of primary human hepatocytes were each inoculated with DEET (purity > 98%; Cat. No. F2284, Chem Service, Inc., West Chester, PA, USA) producing a final concentration of 100 µM in each well, three different wells were each inoculated with fipronil (purity > 98%; Cat. No. PS2136, Chem Service, Inc., West Chester, PA, USA) producing a final concentration of 10 µM in each well, and three different wells were each inoculated with a combination of DEET and fipronil (mixed together before adding) producing a final concentration of 100 µM DEET and 10 µM fipronil. DEET and fipronil were ordered two weeks before the cells and kept at room temperature until the day of dosing the cells. The insecticides were added to the culture media dissolved (wt/vol) in DMSO (dimethyl sulfoxide; ≥ 99.7% pure; Cat. No. BP231-100, Fisher Scientific International, Inc., Hampton, NH, USA). The amount of DMSO (0.1% final concentration) was the same for all treatments and was previously shown to produce minimal cytotoxicity or changes in gene expression for hepatocytes in culture (LeCluyse et al. [25]). The concentration levels chosen for the insecticides were determined from the dose-response data of Das et al. [26, 27] for DEET and fipronil, respectively, that produced the maximum increase in P450 transcript, protein and activity levels for multiple P450s in both primary human hepatocytes and immortalized liver (HepG2) cells and with the lowest possible cytotoxic effects. This choice of dose was made to repeat as closely as possible our previous experiments at the maximum impact of these chemicals on P450s in the absence of cytotoxicity [26, 27] but at the same time expand our work to an examination of the impact of these treatments on global transcript levels. The experimental assay conditions in this paper for primary human hepatocytes were identical to those of Das et al. [26, 27]. The DEET concentration chosen (100 µM) is in the intoxication range of what would be expected in human blood within 8 hours after a dermal treatment, 6.7-fold greater than what would be expected in human blood when DEET is appropriately applied at a maximum dose, and approximately one-fifth that for a person subjected to an acute intentional oral overdose of DEET [28]. No data are available on fipronil levels in human blood; the fipronil treatment level in this study was one-tenth that of DEET. As a vehicle control, three separate wells were treated with culture medium supplemented with DMSO only, in the same amount as the insecticide-treated wells. Once all cells received the treatment or carrier only, they were incubated undisturbed for 72 hours in a humidified incubator under the environmental conditions previously described.

RNA Isolation, Quality Assessment, and Sequencing

After 72 hours of treatment, the medium was removed from all wells, and the wells were washed with 500 µL of 1X phosphate-buffered saline formulated with 1 mM KH2PO4, 155 mM NaCl and 3 mM Na2HPO4-7H2O at a pH of 7.4 (Cat. No. 10010-023, Life Technologies, Carlsbad, CA, USA). Then, 350 µL of RLT lysis buffer (Qiagen, Inc., Valencia, CA, USA) was added to each well, and the cells were scraped from the plate’s solid support using a sterile, disposable cell scraper (producing a cell suspension in RLT buffer). Suspensions from each individual well (3 DEET wells, 3 fipronil wells, 3 DEET plus fipronil wells, and 3 control wells) were separately stored in 1.5 mL microcentrifuge tubes at -80°C until they were used for RNA extraction using the RNeasy Mini Kit (Cat. No. 74104, Qiagen, Inc., Valencia, CA, USA) per the manufacturer’s protocol. Each isolated total RNA sample was separately analyzed for purity on an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA) by the North Carolina State Genome Sciences Laboratory, Raleigh, NC. No samples with an RNA Integrity Number (RIN) of less than 9.0 were used for sequencing; the lowest RIN obtained was 9.4. Sequencing of all treatments and the controls was performed on the Illumina HiSeq 2000 platform (Illumina, Inc., San Diego, CA, USA) at the Beijing Genomics Institute collaborative genome center at the Children’s Hospital of Philadelphia (BGI@CHOP, Philadelphia, PA, USA). RNA-Seq libraries were prepared using the TruSeq RNA Sample Preparation kit following the manufacturer’s protocol. The libraries were multiplexed with three samples per lane randomly distributed over four flow cell lanes (12 samples total as defined above) and run on the paired-end read (2 × 100-bp) setting. The calculated Phred quality scores (Q scores) showed that approximately 97% of the bases read in each lane had a score of >Q30. Quality scores are used to predict the probability that an error will occur during base calling. Runs where the majority of the bases score Q30 or above (1 error in 1000 bases) are ideal for most sequencing applications [29].
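The relationship between a Phred quality score and the base-calling error probability mentioned above can be sketched as follows. This is a minimal illustration of the standard Phred definition (Q = -10 log10 P); the function names are ours and are not part of any sequencing software.

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Probability that a base call with Phred score q is incorrect."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p: float) -> float:
    """Phred score corresponding to a base-calling error probability p."""
    return -10 * math.log10(p)


# Print the expected error rate for a few common quality scores.
for q in (10, 20, 30, 40):
    p = phred_to_error_probability(q)
    print(f"Q{q}: error probability {p:.4f}, about 1 error in {round(1 / p)} bases")
```

Under this definition, Q30 corresponds to an error probability of 0.001, which matches the "1 error in 1000 bases" figure quoted in the text.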

Data Analysis

Elements of the Tuxedo suite pipeline [30] were used by the NC State University Bioinformatics Consulting Core to analyze the RNA-Seq data generated. Each of the fastq files was aligned to the hg19 build of the human genome. The hg19 annotation file from the University of California, Santa Cruz (UCSC) was used along with Cufflinks [31, 32] to guide an assembly for each of the datasets. These twelve assemblies were merged with CuffMerge, and the transcripts from the merged assembly were used as an annotation file for calculation of differential expression via CuffDiff. In running CuffDiff, the ‘rescue method’ for multi-reads was implemented, and normalization was performed using a geometric mean. Quality-control and result plots were generated with the cummeRbund package [33]. Scatter plots were generated that indicated no problems with the normalization steps. The distribution of fragments per kilobase of exon per million fragments mapped (FPKM) appeared to be similar across all replicates. The genes that were indicated to have been differentially expressed, at a significance level of α = 0.05, were extracted from the original data and further categorized using the Blast2GO program into categories including biological processes, molecular functions, and cellular components [34]. The Kyoto Encyclopedia of Genes and Genomes (KEGG) database was used to categorize differentially-expressed genes into molecular networks derived from genomic, chemical and systemic functional information, including those dysregulated (significantly up- or down-regulated) in the steroid hormone biosynthesis pathway [35, 36]. In addition, the web application Idiographica version 2.3 (http://www.ncrna.org/idiographica) was used to develop chromosomal maps of genes dysregulated by treatments and Venny version 2.0 (http://bioinfogp.cnb.csic.es/tools/venny) to generate Venn diagrams [37, 38]. Thresholds of 1.4- and 2-fold up- or down-regulation (≥log2 fold change of ±0.50 and ±1.00, respectively) were also reported for those messages that were differentially expressed at a significance level of α = 0.05. However, an a priori exclusion of statistically significant differences based on an arbitrary expression threshold was not conducted in order to provide all differences that were found. Common thresholds in the literature typically range between a log2 fold change of ±0.30 and ±1.50 to account for biological and technical noise but could also potentially exclude important biologically significant differences needing further study.
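The way the optional thresholds narrow a list of significant transcripts can be sketched as follows. This is only an illustration: the records and gene labels are hypothetical, and the function is not part of the Tuxedo suite; in the study itself the significance calls came from CuffDiff.

```python
# Hypothetical records: (gene label, log2 fold change, significant at alpha = 0.05).
results = [
    ("gene_a", 1.82, True),
    ("gene_b", 0.35, True),
    ("gene_c", -0.72, True),
    ("gene_d", -1.40, True),
    ("gene_e", 2.10, False),  # large change, but not statistically significant
]


def count_passing(records, min_abs_log2fc=0.0):
    """Count significant records whose |log2 fold change| meets the threshold."""
    return sum(1 for _, lfc, sig in records if sig and abs(lfc) >= min_abs_log2fc)


# log2 thresholds of 0.50 and 1.00 correspond to about 1.4-fold and 2-fold changes.
for threshold in (0.0, 0.50, 1.00):
    linear_fold = 2 ** threshold
    n = count_passing(results, threshold)
    print(f"|log2FC| >= {threshold:.2f} (about {linear_fold:.1f}-fold): {n} transcripts")
```

A threshold of 0.0 reproduces the "no threshold" case, in which every statistically significant difference is retained.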

Quantitative PCR (qPCR) Analysis

Total RNA (1 µg) isolated from one of the DEET-treated samples and one of the control samples using the RNeasy Mini Kit (Cat. No. 74104, Qiagen, Inc., Valencia, CA, USA) was reverse transcribed to produce complementary DNA (cDNA) with SuperScript II Reverse Transcriptase (Cat. No. 18064-022, Life Technologies, Carlsbad, CA, USA) as per the manufacturer’s instructions. Bio-Rad SsoFast™ EvaGreen® Supermix (Cat. No. 1725200, Bio-Rad Laboratories, Inc., Hercules, CA, USA) was then used to carry out qPCR on the Bio-Rad C1000 with a CFX384 Real-Time PCR System (Bio-Rad Laboratories, Inc., Hercules, CA, USA) to confirm that the same pattern of up- and down-regulation for specific transcripts observed from HiSeq could also be observed by qPCR. The transcripts targeted for qPCR were based on genes shown by previous researchers to be induced by DEET, fipronil, or prototypical inducers like rifampin [25, 27, 39]. Primers used for the qPCR assays were obtained from Integrated DNA Technologies (IDT, Inc., San Jose, CA, USA). The primer and probe sets used included GAPDH (glyceraldehyde 3-phosphate dehydrogenase; For. 5’-GCTGAGAACGGGAAGCTTGTCAT-3’/Rev. 5’-TCTCCATGGTGGTGAAGACGC-3’), P450 1A1 (For. 5’-GCATGGGCAAGCGGAAGTGTA-3’/Rev. 5’-CATAGATGGGGGTCATGTCCACCT-3’), P450 1A2 (For. 5’-GACGTCCTGCAGATCCGCATT-3’/Rev. 5’-AGGGTGGAGGTGTAGAGGTCAGGC-3’), P450 2B6 (For. 5’-GGAAACCGCTGGAAGGTGCT-3’/Rev. 5’-AGGAAGGTGGCGTCCATGAG-3’) and P450 3A4 (For. 5’-GCTGGCTATGAAACCACGAGCA-3’/Rev. 5’-GTGGGTGGTGCCTTATTGGGT-3’). The cycling conditions were as follows: hot start at 95°C for 3 min (initial denaturation of cDNA), then 40 cycles of 95°C for 10 s (denaturation) and 55°C for 30 s (primer annealing and extension), followed by a melt curve analysis where the temperature ramps from 65°C to 95°C in 5°C increments for 5 s each (after the amplification cycles). Measurements of amplicon fluorescence were performed after each round of amplification when the EvaGreen® fluorescent dye had been successfully intercalated into the newly-synthesized DNA strand. A melt curve was used to assess whether amplification produced single, specific products. Relative expression values were normalized against human GAPDH expression.
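Normalization of a target transcript against GAPDH can be illustrated with the widely used 2^-ΔΔCt calculation. This is a minimal sketch under stated assumptions: the text above does not specify which quantification model was applied, the calculation assumes roughly 100% amplification efficiency for both primer pairs, and all Ct values shown are hypothetical.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target transcript (treated vs. control) by 2^-ddCt.

    The reference gene (GAPDH here) is assumed to be unaffected by treatment.
    """
    delta_treated = ct_target_treated - ct_ref_treated   # dCt, treated sample
    delta_control = ct_target_control - ct_ref_control   # dCt, control sample
    delta_delta = delta_treated - delta_control          # ddCt
    return 2 ** (-delta_delta)


# Hypothetical Ct values: the target crosses threshold 2 cycles earlier after
# treatment while the reference gene is unchanged.
fold = relative_expression(
    ct_target_treated=24.0, ct_ref_treated=18.0,
    ct_target_control=26.0, ct_ref_control=18.0,
)
print(f"Relative expression (treated vs. control): {fold:.1f}-fold")
```

A value above 1 indicates up-regulation relative to the DMSO control, and a value below 1 indicates down-regulation.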


Results

DEET and Fipronil Exposure Alter Gene Expression in Primary Human Hepatocytes

qPCR for the transcripts CYP1A1, 1A2, 2B6 and 3A4 from the DEET-alone treatment was conducted to independently validate our RNA-Seq results. We chose the transcripts for these four enzymes because the effect of DEET was previously examined by our group [27] for the same enzymes in primary human liver cells using the same culturing methods as in this paper and at the same DEET concentration used in our current study. CYP1A1, 1A2, 2B6 and 3A4 transcripts were significantly up-regulated when compared to the DMSO control (data not shown), which was in agreement with our previously published work [27]. In our RNA-Seq analysis in the current work, 1A1, 1A2 and 2B6 were up-regulated (+1.82, +1.24, and +3.82 log2 fold change, respectively), corresponding to both our earlier work [27] and our qPCR results in this current study of cells treated with DEET and analyzed by RNA-Seq; up-regulation of 3A4 transcripts was not detected by RNA-Seq. We concluded from these results that our current study was a successful replication of the work of Das et al. [27] using two different methods to measure transcript levels and that our RNA-Seq results were reasonably validated.

When primary human hepatocytes were treated with 100 µM DEET, the transcripts for 172 genes were affected (α = 0.05), of which 108 transcripts were up-regulated and 64 were down-regulated (Figure 1A). This represented 1.2% of the total number of detected gene transcripts (where sufficient data were present to be included in the testing model for the treatment and control, i.e. 172 of 13,882 gene transcripts). When treated with 10 µM fipronil, the transcripts for 3,674 genes were affected, of which 2,246 transcripts were up-regulated and 1,428 were down-regulated (Figure 1B). This represented 24.2% of the total number of detected gene transcripts (where sufficient data were present to be included in the testing model for the treatment and control, i.e. 3,674 of 15,171 gene transcripts). Putative functions for the 15 protein-coding transcripts with the highest log2 fold change that were up-regulated and down-regulated for DEET are shown in Table 1 and for fipronil in Table 2. The Venn diagram in Figure 1D shows that 149 of the same transcripts were up- or down-regulated by both DEET and fipronil, representing 86.6% versus 4.1%, respectively, of the total number of transcripts that were affected by each compound. At a 1.4-fold up- or down-regulation threshold (≥log2 fold change of ±0.50), DEET affected the same 172 gene transcripts as when no threshold was applied. For fipronil at a ≥log2 fold change of ±0.50, 2,217 genes were affected (60% of the no threshold results), where 1,242 transcripts were up-regulated and 975 were down-regulated. At a 2-fold up- or down-regulation threshold (≥log2 fold change of ±1.00), DEET affected 88 transcripts (51% of the no threshold results) with 55 transcripts up-regulated and 33 down-regulated. For fipronil at 2-fold, 683 genes were affected (19% of the no threshold results) where 337 transcripts were up-regulated and 346 down-regulated. The differences between DEET and fipronil in their impact on gene expression might be expected since they represent totally different chemistries with different pesticide modes of action.
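The percentages reported above follow directly from the transcript counts, as the short calculation below shows. The counts are taken from the text; the function and variable names are ours.

```python
def summarize(up, down, tested, shared):
    """Return (affected, % of tested transcripts, % of affected that are shared)."""
    affected = up + down
    pct_of_tested = 100 * affected / tested
    pct_shared = 100 * shared / affected
    return affected, round(pct_of_tested, 1), round(pct_shared, 1)


SHARED_BY_BOTH = 149  # transcripts dysregulated by both DEET and fipronil

deet = summarize(up=108, down=64, tested=13_882, shared=SHARED_BY_BOTH)
fipronil = summarize(up=2_246, down=1_428, tested=15_171, shared=SHARED_BY_BOTH)

for name, (affected, pct_tested, pct_shared) in (("DEET", deet), ("fipronil", fipronil)):
    print(f"{name}: {affected} affected, {pct_tested}% of tested transcripts, "
          f"{pct_shared}% shared with the other compound")
```

The same 149 shared transcripts therefore account for most of the DEET response but only a small fraction of the much larger fipronil response.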

Many of the transcript levels affected the most by both DEET and fipronil were associated with xenobiotic metabolism, which consists of a series of metabolic pathways responsible for modifying foreign chemical substances that are not naturally produced by the human body (xenobiotics) to facilitate their detoxication and elimination. For example, cytochrome P450, family 2, subfamily B, polypeptide 6 (CYP2B6) and cytochrome P450, family 3, subfamily A, polypeptide 7 (CYP3A7) were up-regulated by both DEET and fipronil at similar log2 fold change values (Tables 1 and 2). Considering that the fipronil treatment concentration was 10-fold lower than that for DEET and the number of transcripts that were significantly up- or down-regulated (dysregulated genes) was 21.4 times greater in the former, these results show that fipronil had a greater influence on the hepatocyte transcript levels than DEET. Alkaline phosphatase transcripts were over nine times as abundant (log2 fold change of +5.90 or a +59.71 fold change) after the fipronil treatment than they were after the DEET treatment (log2 fold change of +2.71 or a +6.54 fold change). Elevated alkaline phosphatase levels can be an indicator of liver disease or liver damage as well as bone problems [40]; enzyme levels were not measured in this study. Interestingly, aldehyde dehydrogenase family 3, member A1 (ALDH3A1) transcripts were significantly up-regulated by DEET and fipronil (log2 fold change of +2.77 or a +6.82 fold change and log2 fold change of +1.29 or a +2.45 fold change, respectively) and alcohol dehydrogenase 1B (ADH1B) transcripts were significantly down-regulated by fipronil (log2 fold change of -3.35 or a -10.20 fold change). ALDH3A1 and ADH1B are both critical players in the metabolism of ethanol, where alcohol dehydrogenase catalyzes the conversion of ethanol to acetaldehyde, and aldehyde dehydrogenase is responsible for the oxidation of acetaldehyde. The inability to properly process acetaldehyde, a highly toxic and carcinogenic compound, has been associated with several forms of cancer [41, 42]. What this means in terms of comparative toxicity and exposure risk between DEET and fipronil is unknown, especially since our work is limited to only examining transcript levels at one time point post treatment.
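The paired log2 and linear fold change values quoted above use a signed convention in which down-regulation is written as a negative fold change. A minimal sketch of that conversion, using values from the text (the function name is ours):

```python
def signed_fold_change(log2fc: float) -> float:
    """Convert a log2 fold change to a signed linear fold change.

    Positive input -> 2**x; negative input -> -(2**|x|), so that a
    log2 fold change of -1 is reported as a -2 fold change.
    """
    magnitude = 2 ** abs(log2fc)
    return magnitude if log2fc >= 0 else -magnitude


alp_fipronil = signed_fold_change(5.90)   # alkaline phosphatase, fipronil
alp_deet = signed_fold_change(2.71)       # alkaline phosphatase, DEET
adh1b_fipronil = signed_fold_change(-3.35)

print(f"Alkaline phosphatase: fipronil {alp_fipronil:+.2f}, DEET {alp_deet:+.2f}")
print(f"Ratio fipronil/DEET: {alp_fipronil / alp_deet:.1f}")
print(f"ADH1B, fipronil: {adh1b_fipronil:+.2f}")
```

The ratio of the two linear fold changes is the same as 2 raised to the difference of the log2 values (5.90 - 2.71 = 3.19), which is where the "over nine times" comparison comes from.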


DEET-Fipronil Mixture

When hepatocytes were treated with a combination of 100 µM DEET and 10 µM fipronil together, 5,104 transcripts were dysregulated, of which 3,017 transcripts were up-regulated and 2,087 were down-regulated (Figure 1C). This represented 36.6% of the total number of detected gene transcripts (where sufficient data were present to be included in the testing model for the treatment and control, i.e. 5,104 of 13,940 gene transcripts). At a 1.4-fold up- or down-regulation threshold (≥log2 fold change of ±0.50), 3,010 transcripts were affected by 100 µM DEET plus 10 µM fipronil (59% of the no threshold results), where 1,561 transcripts were up-regulated and 1,449 transcripts were down-regulated. At a 2-fold up- or down-regulation threshold (≥log2 fold change of ±1.00), 908 transcripts were affected by 100 µM DEET plus 10 µM fipronil (18% of the no threshold results), where 356 transcripts were up-regulated and 552 transcripts were down-regulated. Putative functions for the 15 protein-coding transcripts with the highest log2 fold change that were up-regulated and down-regulated for DEET plus fipronil are shown in Table 3. If the effects of DEET and fipronil on hepatocyte transcript levels were independent, the additive response expected when no threshold was applied would have been 3,846 dysregulated genes. What was observed from the mixture was 5,104 transcripts that were affected, 1.3-fold higher than expected (or an additional 1,258 transcripts). At the level of individual transcripts, however, a greater than additive effect occurred for some transcripts but not for others. For cytochrome P450 (family 4, subfamily A, polypeptide 11; CYP4A11), the log2 fold change for DEET was -1.11 (-2.16 fold change), for fipronil -2.46 (-5.50 fold change), and for the combination of DEET and fipronil -3.61 (-12.21 fold change), while the simple additive response would have been a -7.66 fold change; here the mixture produced a greater than additive response. On the other hand, for cytochrome P450 (family 4, subfamily A, polypeptide 22; CYP4A22), the log2 fold change for DEET was -1.06 (-2.10 fold change), for fipronil -2.08 (-4.26 fold change), and for the combination of DEET and fipronil -2.70 (-6.31 fold change); in this case, a greater than additive effect was not observed. The greater than additive effect for the total number of transcripts affected for the mixture occurred even when thresholds were applied in our analysis.
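The additivity comparisons above can be reproduced with a few lines of arithmetic. The sketch below uses the counts and fold changes reported in the text and follows the text's definition of the simple additive response as the sum of the individual fold changes; the function and variable names are ours.

```python
# Number of dysregulated genes per treatment (no threshold applied).
deet_genes, fipronil_genes, mixture_genes = 172, 3_674, 5_104

expected_additive = deet_genes + fipronil_genes
excess = mixture_genes - expected_additive
ratio = mixture_genes / expected_additive
print(f"Expected {expected_additive}, observed {mixture_genes} "
      f"({excess} more, {ratio:.1f}-fold)")


def exceeds_additive(deet_fc, fipronil_fc, mixture_fc):
    """True if the mixture fold change is larger in magnitude than the sum of
    the individual fold changes (both assumed to act in the same direction)."""
    return abs(mixture_fc) > abs(deet_fc + fipronil_fc)


# Signed linear fold changes: (DEET, fipronil, DEET plus fipronil).
transcripts = {
    "CYP4A11": (-2.16, -5.50, -12.21),
    "CYP4A22": (-2.10, -4.26, -6.31),
}
greater_than_additive = {
    name: exceeds_additive(*values) for name, values in transcripts.items()
}
for name, flag in greater_than_additive.items():
    print(f"{name}: greater than additive = {flag}")
```

The check confirms that the mixture response exceeded the additive expectation for CYP4A11 but not for CYP4A22.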

DEET and fipronil together had a more than additive impact on transcript levels based simply on the number of dysregulated genes (and in some cases for individual transcripts), which demonstrates the importance of looking at mixtures to understand the impact of environmental chemistry on human systems. Both of these compounds are used in the urban environment, DEET as a repellent and fipronil on companion animals, for example, and the potential exists that both might be found in the human body at the same time. The differential expression levels of 148 gene transcripts were shared among the three treatment conditions (DEET, fipronil, and DEET plus fipronil) and were essentially the same as those shared when DEET and fipronil were applied separately (Figure 1D). Interestingly, 1,939 transcripts were unique to the DEET plus fipronil treatment (not affected by DEET or fipronil alone), suggesting their regulation was specific to when the two compounds were combined (Figure 1D). Additional components related to the endocrine system that were not dysregulated by DEET or fipronil alone but were dysregulated by the combination of DEET plus fipronil (Table 4) as well as four P450s (Table 5) will be discussed in more detail below.

Impact of DEET, Fipronil, and DEET plus Fipronil on Biological Processes and Molecular Functions

Of the 10 biological processes most affected by the three treatments when no thresholds were applied (Figure 2), cellular and primary metabolic processes were highly affected by DEET (16% and 14%, respectively), fipronil (12% and 12%, respectively), and DEET plus fipronil (12% and 12%, respectively). There were no major differences between treatments. However, the top two biological processes affected by fipronil and DEET plus fipronil were single-organism cellular processes and organic substance metabolic processes.

Of the 10 molecular functions most affected by the three treatments (Figure 3), protein binding and ion binding were the two functions most affected by DEET (24% and 16%, respectively), fipronil (25% and 16%, respectively), and DEET plus fipronil (25% and 16%, respectively). Again, no major differences were found between treatments for these two functions. Both the biological process and molecular function results suggest that the hepatocytes respond to the two different chemistries in the same way, based on which transcripts were up- and down-regulated, and that the mixture produced the same kinds of changes even though it had a greater than additive effect (discussed earlier).

Chromosomal Distribution of DEET, Fipronil, and DEET plus Fipronil Dysregulated Genes

We used a human chromosomal mapping algorithm [37] to display the location of genes that demonstrated significantly up- and down-regulated transcripts (α=0.05) produced by primary human hepatocytes treated for 72 hours with 100 µM DEET, 10 µM fipronil, and 100 µM DEET plus 10 µM fipronil compared to a DMSO control (Figures 4A-4C). All human chromosomes contained at least one gene whose transcript level was altered by DEET and fipronil applications, but the number and distribution varied between chromosomes and treatments. Figure 4D (left panel) shows the number of differentially-expressed genes per chromosome versus the total number of genes dysregulated by each specific treatment as a percentage. Figure 4D (right panel) shows the number of differentially-expressed genes relative to the total number of genes present in that chromosome [43] as a percentage when DEET, fipronil, or DEET plus fipronil were applied. We find that, in general, the number of genes per chromosome affected by each treatment is proportionally similar among the treatments and does not always correspond to the size of the chromosome itself. For example, chromosome 1 has the highest number of dysregulated genes for each treatment, likely because it is the largest human chromosome with the most genes (5,361 genes).

However, chromosome 19 has the second-highest number of differentially-expressed genes in each treatment (except in the DEET-only treatment, where it is third-highest), yet it is one of the smallest chromosomes in the human genome. Chromosome size is not a reliable indicator of gene number, as chromosome 19 contains 2,936 genes while chromosome 20, which is slightly larger than chromosome 19, contains only 1,440 genes. Chromosome 19 does have a higher percentage of genes dysregulated by all three treatments than chromosome 2, which contains 4,105 genes. When the number of genes up- or down-regulated per chromosome is expressed as a percentage of the number of genes on that chromosome (Figure 4D, right panel), the values for each treatment are fairly similar. Chromosome 13 has the lowest percentage of genes dysregulated in each treatment compared to the number of genes in that chromosome, but the values for chromosomes 6, 12, 16, 17, and 22 are very close to one another. Chromosomes 1 and 19 have the highest number of differentially-expressed genes across all three treatments, but values for 2, 11, and 12 are very close to one another. We also observed that genes on the short arm (p arm) of chromosomes 13, 14, 15 and 22 were not affected by DEET, fipronil, or the two together, and that a large span of genes on the long arm (q arm) of chromosome 1 was unaffected by all three treatments; these correspond to regions of the chromosomes where few or no protein-coding genes have been detected [43].
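The two normalizations behind Figure 4D differ only in the denominator, which the following sketch makes explicit. It is an illustration under stated assumptions: the per-chromosome gene totals are those quoted in the text, while the dysregulated counts are hypothetical placeholders (the per-chromosome counts are not given here), and only four chromosomes are included.

```python
# Gene totals per chromosome as quoted in the text.
genes_on_chromosome = {"1": 5361, "2": 4105, "19": 2936, "20": 1440}

# Hypothetical numbers of dysregulated genes for one treatment (illustration only).
dysregulated = {"1": 400, "2": 200, "19": 300, "20": 100}


def percent_of_treatment(counts):
    """Left-panel style: share of all dysregulated genes found on each chromosome."""
    total = sum(counts.values())
    return {chrom: 100 * n / total for chrom, n in counts.items()}


def percent_of_chromosome(counts, totals):
    """Right-panel style: share of each chromosome's genes that are dysregulated."""
    return {chrom: 100 * n / totals[chrom] for chrom, n in counts.items()}


by_treatment = percent_of_treatment(dysregulated)
by_chromosome = percent_of_chromosome(dysregulated, genes_on_chromosome)
```

With these placeholder counts, chromosome 1 carries the largest share of dysregulated genes, while chromosome 19 has the larger fraction of its own genes affected. This is the kind of contrast between the two panels that the text describes.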

P450s

Table 5 (left panel) summarizes the cytochrome P450 transcripts (CYPs or P450s) that were up- and down-regulated by DEET, fipronil, and DEET plus fipronil, and the Venn diagram in the right panel shows the association of each dysregulated P450 to each treatment and between treatments (α=0.05). There are currently 57 known human CYP genes and 59 pseudogenes; the gene products serve many roles, including the metabolism of endogenous substrates [44, 45]. They perform the bulk of activities in phase I (enzymatic transformation) of xenobiotic metabolism, where lipid-soluble compounds are chemically converted into water-soluble compounds. While some P450s metabolize only one substrate (e.g., aromatase), many P450s are non-specific and can metabolize different toxicants; alterations in their activity can result in serious human diseases such as type II autoimmune hepatitis [46]. For the DEET treatment, transcripts for 13 P450s were up-regulated and 2 were down-regulated, while transcripts for 17 P450s were up-regulated and 7 were down-regulated for the fipronil treatment. For DEET, these same numbers are maintained at a ≥log2 fold change threshold of ±0.50, and at a 2-fold threshold (≥log2 fold change of ±1.00) there are 11 up-regulated transcripts and 2 down-regulated transcripts. For the fipronil treatment there are 16 up-regulated and 6 down-regulated P450 transcripts at a ≥log2 fold change threshold of ±0.50 and 13 up-regulated and 3 down-regulated transcripts at a ≥log2 fold change of ±1.00. For the DEET plus fipronil treatment, transcripts for 17 P450s were up-regulated and 11 P450s were down-regulated when no threshold was applied, which again suggests that fipronil has more of an effect on primary hepatocytes than DEET (and that the combination of the two has a greater effect). The numbers shift slightly at the ≥log2 fold change threshold of ±0.50, where 16 transcripts are up-regulated and 6 are down-regulated for the mixture. At a 2-fold threshold there are 13 up-regulated transcripts and 3 down-regulated transcripts.
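The threshold counts above amount to filtering significant transcripts by the magnitude of their log2 fold change and tallying each direction. A minimal sketch follows, with an illustrative helper name of our own. The first five values are DEET log2 fold changes quoted elsewhere in this chapter (Table 1.1 and the CYP4A11/CYP4A22 example); the two entries named CYP_X and CYP_Y are hypothetical and are included only so that the thresholds change the counts.

```python
def count_by_direction(log2_changes, threshold=0.0):
    """Count up- and down-regulated transcripts whose absolute log2 fold
    change is at least `threshold` (0.0 means no threshold applied)."""
    up = sum(1 for v in log2_changes.values() if v > 0 and v >= threshold)
    down = sum(1 for v in log2_changes.values() if v < 0 and -v >= threshold)
    return up, down


log2_changes = {
    "CYP2B6": 3.82,    # reported for DEET
    "CYP3A7": 3.40,    # reported for DEET
    "CYP2A6": 2.21,    # reported for DEET
    "CYP4A11": -1.11,  # reported for DEET
    "CYP4A22": -1.06,  # reported for DEET
    "CYP_X": 0.30,     # hypothetical, illustration only
    "CYP_Y": -0.70,    # hypothetical, illustration only
}

no_threshold = count_by_direction(log2_changes)
half_log2 = count_by_direction(log2_changes, threshold=0.50)
two_fold = count_by_direction(log2_changes, threshold=1.00)
```

A log2 threshold of 1.00 corresponds to a 2-fold change, so the last call mirrors the 2-fold cutoff used in the text.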

Twelve P450s were up- or down-regulated in the same manner with treatment of DEET, fipronil, or DEET plus fipronil when no threshold was applied. Indicative of a greater than additive effect, transcripts for 4 P450s were differentially expressed only by the combination of DEET and fipronil. CYP2C19, CYP4F2, CYP4F12, and CYP46A1 were all significantly down-regulated by DEET plus fipronil, but not by DEET or fipronil alone. Also, as mentioned above, there are certain situations (e.g., CYP4A11 and CYP4A22) where P450 transcripts are up- or down-regulated in the same manner, but the dysregulation is more pronounced when DEET plus fipronil is applied.

Transcripts Related to Endocrine Metabolism and Function

Table 4 summarizes the affected transcripts that produce proteins involved in endocrine metabolism and function. In addition to components of the steroid hormone biosynthesis pathway that were dysregulated at the transcript level (Figure 5), there are several other genes critical to endocrine regulation whose transcripts were significantly up- or down-regulated. Estrogen receptors, which bind the hormone estrogen, are involved in endocrine disruption as well as disease prediction and prognosis [47]. Over-expression of the estrogen receptors is found in 70% of breast cancers, and under-expression in transgenic mice is linked to obesity [48, 49]. Transcripts for both estrogen receptor 1 (ESR1) and estrogen receptor 2 (ESR2) were significantly down-regulated after fipronil and DEET plus fipronil treatment, but not the DEET treatment alone. Interestingly, the level of down-regulation was greater with the combined DEET plus fipronil treatment than with fipronil alone for both ESR1 (-2.06 log2 fold change vs. -1.81 log2 fold change) and ESR2 (-1.53 log2 fold change vs. -1.33 log2 fold change), suggesting a greater than additive effect for the mixture. Adrenergic receptors (adrenoceptors) alpha 1A, alpha 1B, alpha 2C, beta 1, and beta 2, which bind catecholamines to stimulate the sympathetic nervous system [50], were significantly dysregulated at the transcript level by both DEET and fipronil.

Androgen receptors (AR), which bind androgenic hormones in the cytoplasm and transport them into the nucleus, were down-regulated at the transcript level by fipronil and DEET plus fipronil. Furthermore, gene transcripts involved in thyroid hormone production and activity were significantly affected by both DEET and fipronil. Improper regulation of thyroid hormone can impair, among other things, proper brain development and control of metabolism. Transcription of deiodinase, iodothyronine, type I (DIO1) was up-regulated by DEET, fipronil, and DEET plus fipronil. DIO1 is important in the activation of the prohormone thyroxine (T4) into the active hormone triiodothyronine (T3) and in the inactivation of triiodothyronine to diiodothyronine (T2) [51]. Thyroid hormone receptor interactor 4 (TRIP4) was up-regulated only when the DEET plus fipronil treatment was administered. TRIP4 is a transcriptional coactivator that associates with several nuclear receptors, including the androgen receptor noted above as dysregulated [52]. The parathyroid hormone 2 receptor (PTH2R), which binds parathyroid hormone (PTH), was up-regulated at the transcript level by fipronil only. PTH regulates serum calcium levels that can influence bone development as well as kidney and intestinal functions [53-55]. Several other factors associated with endocrine functions involved in glucose, protein and fat metabolism are listed in Table 4, and abnormal function of any one of these components can have potentially detrimental effects on human growth and development. At a threshold of ≥log2±0.50, all 7 transcripts for the DEET treatment (100%), 32 out of 43 transcripts for the fipronil treatment (74%), and 36 out of 49 transcripts for the DEET plus fipronil treatment (73%) were observed. At a threshold of ≥log2±1.00, 3 of the 7 transcripts (43%) for the DEET treatment, 18 of the 43 transcripts (42%) for the fipronil treatment, and 20 of the 49 transcripts (41%) for the DEET plus fipronil treatment were observed. Differences at the transcript level for these genes do not necessarily mean differences in substrate metabolism, and the impact of these transcript levels on the organism as a whole is unknown. We also did not examine dose effects or the duration of the impacts that were measured at the transcript level.
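The percentages quoted for the two thresholds follow directly from the retained and total transcript counts. The sketch below recomputes them as a consistency check; the helper name is an illustrative assumption, and the mixture total is taken as 49, the value given for the ±0.50 threshold and the one consistent with the reported 41%.

```python
# (total, retained at log2 >= 0.50, retained at log2 >= 1.00) per treatment,
# using the counts given in the text.
endocrine_counts = {
    "DEET": (7, 7, 3),
    "fipronil": (43, 32, 18),
    "DEET plus fipronil": (49, 36, 20),
}


def retained_percent(total, retained):
    """Percentage of transcripts still observed at a threshold, rounded to a whole number."""
    return round(100 * retained / total)


percent_at_050 = {
    name: retained_percent(total, at_050)
    for name, (total, at_050, _) in endocrine_counts.items()
}
percent_at_100 = {
    name: retained_percent(total, at_100)
    for name, (total, _, at_100) in endocrine_counts.items()
}
```

The recomputed values are 100%, 74% and 73% at the ±0.50 threshold and 43%, 42% and 41% at the ±1.00 threshold.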

Transcripts involved in Steroid Hormone Biosynthesis were affected by DEET, Fipronil, and DEET plus Fipronil

Based on KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway analysis [35, 36], the steroid hormone biosynthesis pathway had the highest number of up- and down-regulated transcripts out of approximately 100 pathways identified for the DEET alone treatment with no threshold applied. For fipronil and DEET plus fipronil, the steroid hormone biosynthesis pathway was the 11th most affected pathway identified. However, there were still many transcripts dysregulated by fipronil and DEET plus fipronil in the steroid hormone biosynthesis pathway. The affected transcripts in the steroid hormone biosynthesis pathway are shown in Figure 5. The differential expression (log2 fold change) values for all of the components visualized in Figure 5 can be found in Table 4 and are denoted with an asterisk following the “gene symbol” name. Of these 7 components, the vast majority are significant at a log2 threshold of ≥0.50 and many are still significant at a log2 threshold of ≥1.00.

Steroid hormones, derived from cholesterol in mammals, act as intermediaries between the brain and specific tissues or organs and participate in a wide variety of functions in the human body including inflammatory events, pregnancy, and mineral and sugar metabolism [56, 57]. Disruptions in the production of steroid hormones can occur on many levels and can contribute to defects in development, metabolism and physiology. Cholesterol, the major sterol, is converted to pregnenolone by a mitochondrial CYP and then further transformed into progesterone, the primary progestin. 3-β-hydroxysteroid dehydrogenase/Δ-5-4 isomerase (HSD3B1) transcripts were significantly down-regulated by both DEET and fipronil (-1.89 log2 fold change for DEET, -3.06 log2 fold change for fipronil, and -2.78 log2 fold change for DEET plus fipronil). This enzyme catalyzes the conversion of pregnenolone to progesterone, 17-alpha-pregnenolone to 17-alpha-progesterone, dehydroepiandrosterone (DHEA) to androstenedione, and androstenediol to testosterone and is essential to steroid hormone production [58, 59]. Mutations in this gene are associated with an uncommon form of congenital adrenal hyperplasia that can have mild to severe effects including sexual ambiguity, infertility and ambiguous genitalia [60]. CYP17A1 catalyzes the conversion of progesterone to 17α-OH-progesterone. While we found no differential levels of the transcript for this enzyme, there were several associated co-enzymes that were dysregulated.

Cytochrome b5, type A (CYB5A) transcripts were up-regulated by both DEET and fipronil, while sulfotransferase 2A1 (SULT2A1) transcripts were up-regulated when DEET plus fipronil was applied. CYB5A and SULT2A1 are critical enzymes in the P450c17 (CYP17) catalysis of the aforementioned reaction [61]. 17α-OH-progesterone is converted to 11-deoxycortisol by CYP21A2, whose transcript was up-regulated by both fipronil and the DEET plus fipronil combination. The messenger RNAs (mRNAs) for 11β-hydroxysteroid dehydrogenase type 1 (HSD11B1), the enzyme that reduces cortisone to the active hormone cortisol [62], were down-regulated by fipronil and DEET plus fipronil. 17β-hydroxysteroid dehydrogenases (17β-HSDs) catalyze the dehydrogenation of 17-hydroxysteroids in steroidogenesis, including the interconversion of DHEA and androstenediol, androstenedione and testosterone, and estrone and estradiol [63]. Hydroxysteroid (17-beta) dehydrogenase 13 (HSD17B13) was up-regulated by DEET and down-regulated by fipronil and by the combination of DEET plus fipronil. Transcripts for HSD17B2, HSD17B3, HSD17B4, HSD17B6, HSD17B7, HSD17B11, HSD17B12 and HSD17B14 were also significantly dysregulated by fipronil and/or DEET plus fipronil (Figure 5 and Table 4). Cytochrome P450 oxidoreductase (POR) is an essential enzyme in sex steroid hormone synthesis [64] and was significantly up-regulated by DEET and fipronil.

Discussion

Much of what we know of the impact of DEET and fipronil on human health was obtained through a combination of case reports, observational studies, cell culture systems and rodent models [65-69]. While these methods have value, human primary cells more closely mimic the physiological state of cells in vivo for the human system but have not been widely utilized. Furthermore, previous studies utilizing primary human cells were limited in focus to specific enzymes, like P450s, and examined compounds applied singly or in combination, but never the combination of DEET and fipronil as it relates to human health [15]. With primary human hepatocytes, Das et al. [26] showed that DEET induced the enzymes CYP1A2, CYP2A6, CYP2B6 and CYP3A4 at the transcript, protein and activity level. We confirmed induction (increased transcript levels) for three of these four enzymes as well as ten other CYPs, and found reduced transcript levels for two others. Das et al. [27] showed that fipronil increased CYP1A1, CYP2B6, CYP3A4 and CYP3A5 transcript, protein and enzyme activity levels. We saw induction of all four of these enzymes at the transcript level and of an additional thirteen others, as well as reductions in the transcript levels of seven CYPs by fipronil (Table 5).

This study incorporated the analysis of the application of DEET and fipronil, both singly and, for the first time, together. Using RNA-Seq, we were able to expand the range of gene targets from a select few to potentially the entire human transcriptome. Primary human hepatocytes more closely mimic what is happening in the human body than non-human model organisms or immortalized cell lines. Human hepatocytes also provide the advantage of allowing investigation of human variability. It is widely believed that even small amounts of a toxicant can have profoundly different effects on a fetus or infant compared to an adult [70], for example. Also, just as male and female anatomy differ greatly, so do their responses to foreign chemicals [71].

DEET (N, N-diethyl-m-toluamide) is generally considered safe for human use and is placed in toxicity category 3 by the U.S. EPA, which means it is “slightly toxic” and may be harmful if inhaled, swallowed, or absorbed through the skin [72]. We found 172 genes whose transcripts were differentially expressed after DEET exposure (Figure 1, top left), over 20 times fewer than for the fipronil treatment (at one-tenth of the DEET concentration), suggesting that DEET is less toxic than fipronil based on changes in transcript levels. Fipronil is a category 2 or “moderately toxic” compound [73] and altered the levels of more transcripts than DEET. Unexpectedly, the combination of the two compounds displayed a more than additive response, as 1,939 transcripts were dysregulated that were unique to the combinatorial treatment (Figure 1, bottom right).

We also suggest that while DEET and fipronil have different structures and modes of action, there is a common response in primary human cells at the level of CYP transcripts. With the combination of DEET plus fipronil, many of the CYPs that were affected by each singly were activated. Additional CYPs were also up- or down-regulated (Table 5). Furthermore, transcripts that code for proteins involved in endocrine regulation, including steroid hormone biosynthesis, were up- and down-regulated by DEET, fipronil, and the combination of the two, suggesting further research is needed to understand the functional importance of these differences. More work is needed to understand dose response, human variability, the effect of chemical mixtures, and whether changes in transcript levels are related to epigenetics and to differences in substrate metabolism and in cellular, organ and organismic function.

Acknowledgements. The authors gratefully acknowledge Sam Suarez, of the Department of Entomology and the Toxicology Program, Department of Biology, North Carolina State University for his assistance in graphical representation of the data and Dr. Daniel E. Sonenshine of the Department of Biological Sciences at Old Dominion University for his assistance in reviewing the manuscript. We would also like to thank Jeff Roach from the University of North Carolina-Chapel Hill Information Technology Services and Elizabeth Scholl from the North Carolina State University Bioinformatics Consulting Core for their assistance with bioinformatics analyses. This research was supported in part by the US Central Appalachian Regional Education and Research Center (CARERC) Pilot Study. RDM was supported by an Entomology Department Teaching Assistantship at NC State University.

References

1. Kullman SW, Mattingly CJ, Meyer JN, Whitehead A. Perspectives on Informatics in

Toxicology. In: Hodgson E, editor. A Textbook of Modern Toxicology. Hoboken, NJ:

John Wiley and Sons; 2010. p 593-605.

2. Judson RS, Houck KA, Kavlock RJ, Knudsen TB, Martin MT, Mortensen HM, Reif

DM, Rotroff DM, Shah I, Richard AM. In vitro screening of environmental chemicals

for targeted testing prioritization: the ToxCast project. Environmental health

perspectives (Online) 2010;118(4):485.

3. National Research Council. Toxicity testing in the 21st century: A vision and a

strategy: National Academies Press; 2007.

4. Schmidt CW. TOX 21: new dimensions of toxicity testing. Environ Health Perspect

2009;117(8):A348-A353.

5. National Toxicology Program. Tox21: Transforming Environmental Health. Web.

2011. Retrieved 3 May 2015.

6. Gibbs RA, Belmont JW, Hardenbol P, Willis TD, Yu F, Yang H, Ch'ang L-Y, Huang

W, Liu B, Shen Y. The international HapMap project. Nature 2003;426(6968):789-

796.

7. Thorisson GA, Smith AV, Krishnan L, Stein LD. The international HapMap project

web site. Genome research 2005;15(11):1592-1593.

8. Schoenig GP, Osimitz TG, Gabriel KL, Hartnagel R, Gill MW, Goldenthal EI.

Evaluation of the chronic toxicity and oncogenicity of N, N-diethyl-m-toluamide

(DEET). Toxicological Sciences 1999;47(1):99-109.

9. Arthur O, Maciarello J. Essential oil analysis and field evaluation of the citrosa plant

“Pelargonium citrosum” as a repellent against populations of Aedes mosquitoes.

Journal of the American Mosquito Control Association 1996;12(1):69-74.

10. Osimitz TG, Murphy JV, Fell L, Page BC. Adverse events associated with the use of

insect repellents containing N, N-diethyl-m-toluamide (DEET). Regulatory

Toxicology and Pharmacology 2010;56(1):93-99.

11. Veltri JC, Osimitz TG, Bradford DC, Page BC. Retrospective analysis of calls to

poison control centers resulting from exposure to the insect repellent N, N-diethyl-m-

toluamide (DEET) from 1985-1989. Clinical Toxicology 1994;32(1):1-16.

12. Steele L, Sastre A, Gerkovich MM, Cook MR. Complex factors in the etiology of

Gulf War illness: wartime exposures and risk factors in veteran subgroups.

Environmental health perspectives 2012;120(1):112.

13. Osimitz TG, Murphy JV. Neurological effects associated with use of the insect

repellent N, N-diethyl-m-toluamide (DEET). Journal of Toxicology: Clinical

Toxicology 1997;35(5):435-441.

14. Yiin L-M, Tian J-N, Hung C-C. Assessment of dermal absorption of DEET-

containing insect repellent and oxybenzone-containing sunscreen using human

urinary metabolites. Environmental Science and Pollution Research 2015;22(9):7062-

7070.

15. Abu-Qare AW, Abou-Donia MB. Combined exposure to DEET (N, N-diethyl-m-

toluamide) and permethrin: pharmacokinetics and toxicological effects. Journal of

Toxicology and Environmental Health, Part B 2003;6(1):41-53.

16. Ierapetritou MG, Georgopoulos PG, Roth CM, Androulakis IP. Tissue-level

modeling of xenobiotic metabolism in liver: An emerging tool for enabling clinical

translational research. Clin Transl Sci 2009;2(3):228-37.

17. Carrington D. EU to ban fipronil to protect honeybees. Web. 2013. Retrieved 8 May

2015.

18. Hamon N, Gamboa H, Ernesto J, Garcia M. Fipronil: a major advance for the control

of boll weevil in Colombia. Beltwide Cotton Conferences (USA). 1996.

19. Tingle CC, Rother JA, Dewhurst CF, Lauer S, King WJ. Fipronil: environmental fate,

ecotoxicology, and human health concerns. Reviews of environmental contamination

and toxicology: Springer; 2003. p 1-66.

20. Zaluski R, Kadri SM, Alonso DP, Martins Ribolla PE, de Oliveira Orsi R. Fipronil

promotes motor and behavioral changes in honey bees (Apis mellifera) and affects the

development of colonies exposed to sublethal doses. Environmental Toxicology and

Chemistry 2015;34(5):1062-1069.

21. Nicodemo D, Maioli MA, Medeiros HC, Guelfi M, Balieira KV, De Jong D,

Mingatto FE. Fipronil and imidacloprid reduce honeybee mitochondrial activity.

Environmental Toxicology and Chemistry 2014;33(9):2070-2075.

22. Nahar N, Ohtani T. Imidacloprid and Fipronil induced abnormal behavior and

disturbed homing of forager honey bees Apis mellifera. J Entomol and Zool Stud

2015;3:20-24.

23. Hurley PM. Mode of carcinogenic action of pesticides inducing thyroid follicular cell

tumors in rodents. Environmental Health Perspectives 1998;106(8):437.

24. Peng C, Yuan D, Li B, Wei Y, Yan L, Wen T, Zhao J, Yang J, Wang W, Xu M. Body

mass index evaluating donor hepatic steatosis in living donor liver transplantation.

2009. Elsevier. p 3556-3559.

25. LeCluyse EL, Madan A, Hamilton G, Carroll K, DeHaan R, Parkinson A. Expression

and regulation of cytochrome P450 enzymes in primary cultures of human

hepatocytes. Journal of biochemical and molecular toxicology 2000;14(4):177-188.

26. Das PC, Cao Y, Rose RL, Cherrington N, Hodgson E. Enzyme induction and

cytotoxicity in human hepatocytes by chlorpyrifos and N,N-diethyl-m-toluamide

(DEET). Drug Metabol Drug Interact 2008;23(3-4):237-60.

27. Das PC, Cao Y, Cherrington N, Hodgson E, Rose RL. Fipronil induces CYP isoforms

and cytotoxicity in human hepatocytes. Chem Biol Interact 2006;164(3):200-14.

28. Baselt RC, Cravey RH. Disposition of toxic drugs and chemicals in man.

Seal Beach, California: Biomedical publications; 2011.

29. Illumina Inc. Quality Scores for Next-Generation Sequencing. Web. 2011. Retrieved

6 June 2015.

30. Trapnell C, Roberts A, Goff L, Pertea G, Kim D, Kelley DR, Pimentel H, Salzberg

SL, Rinn JL, Pachter L. Differential gene and transcript expression analysis of RNA-

seq experiments with TopHat and Cufflinks. Nat Protoc 2012;7(3):562-78.

31. Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, Van Baren MJ, Salzberg

SL, Wold BJ, Pachter L. Transcript assembly and quantification by RNA-Seq reveals

unannotated transcripts and isoform switching during cell differentiation. Nature

biotechnology 2010;28(5):511-515.

32. Trapnell C, Hendrickson DG, Sauvageau M, Goff L, Rinn JL, Pachter L. Differential

analysis of gene regulation at transcript resolution with RNA-seq. Nature

biotechnology 2013;31(1):46-53.

33. Goff L, Trapnell C, Kelley D. cummeRbund: Analysis, exploration, manipulation,

and visualization of Cufflinks high-throughput sequencing data. 2013. R package

version 2.13.0.

34. Conesa A, Gotz S, Garcia-Gomez JM, Terol J, Talon M, Robles M. Blast2GO: a

universal tool for annotation, visualization and analysis in functional genomics

research. Bioinformatics 2005;21(18):3674-6.

35. Kanehisa M, Goto S, Sato Y, Kawashima M, Furumichi M, Tanabe M. Data,

information, knowledge and principle: back to metabolism in KEGG. Nucleic acids

research 2014;42(D1):D199-D205.

36. Kanehisa M, Goto S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic

acids research 2000;28(1):27-30.

37. Kin T, Ono Y. Idiographica: a general-purpose web application to build idiograms

on-demand for human, mouse and rat. Bioinformatics 2007;23(21):2945-2946.

38. Oliveros J. VENNY. An interactive tool for comparing lists with Venn Diagrams.

2007.

39. Usmani KA, Rose RL, Goldstein JA, Taylor WG, Brimfield AA, Hodgson E. In vitro

human metabolism and interactions of repellent N, N-diethyl-m-toluamide. Drug

metabolism and disposition 2002;30(3):289-294.

40. WebMD. Alkaline Phosphatase. Web. 2015. Retrieved 9 July 2015.

41. Jelski W, Szmitkowski M. Alcohol dehydrogenase (ADH) and aldehyde

dehydrogenase (ALDH) in the cancer diseases. Clinica Chimica Acta 2008;395(1):1-

5.

42. Jelski W, Kutylowska E, Laniewska-Dunaj M, Szmitkowski M. Alcohol

dehydrogenase (ADH) and aldehyde dehydrogenase (ALDH) as candidates for tumor

markers in patients with pancreatic cancer. J Gastrointestin Liver Dis 2011;20(3):255-

259.

43. Flicek P, Amode MR, Barrell D, Beal K, Billis K, Brent S, Carvalho-Silva D,

Clapham P, Coates G, Fitzgerald S, Gil L. Ensembl 2014. Nucleic acids research.

2014 Jan 1;42(D1):D749-55.

44. Pikuleva IA, Waterman MR. Cytochromes P450: roles in diseases. Journal of

Biological Chemistry 2013;288(24):17091-17098.

45. Nelson D. The Cytochrome P450 Homepage. Human Genomics 4, 2009; 59-65.

46. Villeneuve J-P, Pichette V. Cytochrome P450 and liver diseases. Current drug

metabolism 2004;5(3):273-282.

47. US EPA. Endocrine Disruption Screening Program (EDSP) Assays Under

Consideration. Web. 2011. Retrieved 8 May 2015.

48. Deroo BJ, Korach KS. Estrogen receptors and human disease. The Journal of clinical

investigation 2006;116(3):561.

49. Ohlsson C, Hellberg N, Parini P, Vidal O, Bohlooly M, Rudling M, Lindberg MK,

Warner M, Angelin B, Gustafsson J-Å. Obesity and disturbed lipoprotein profile in

estrogen receptor-α-deficient male mice. Biochemical and biophysical research

communications 2000;278(3):640-645.

50. Strosberg A. Structure, function, and regulation of adrenergic receptors. Protein

science: a publication of the Protein Society 1993;2(8):1198.

51. Gereben B, Zavacki AM, Ribich S, Kim BW, Huang SA, Simonides WS, Zeold A,

Bianco AC. Cellular and molecular basis of deiodinase-regulated thyroid hormone

signaling. Endocrine reviews 2008;29(7):898-938.

52. Lee JW, Choi H-S, Gyuris J, Brent R, Moore DD. Two classes of proteins dependent

on either the presence or absence of thyroid hormone for interaction with the thyroid

hormone receptor. Molecular Endocrinology 1995;9(2):243-254.

53. Usdin TB, Gruber C, Bonner TI. Identification and functional expression of a

receptor selectively recognizing parathyroid hormone, the PTH2 receptor. Journal of

Biological Chemistry 1995;270(26):15455-15458.

54. Coetzee M, Kruger MC. Osteoprotegerin-receptor activator of nuclear factor-kappaB

ligand ratio: a new approach to osteoporosis treatment? Southern medical journal

2004;97(5):506-511.

55. Poole KE, Reeve J. Parathyroid hormone—a bone anabolic and catabolic agent.

Current opinion in pharmacology 2005;5(6):612-617.

56. Falkenstein E, Tillmann H-C, Christ M, Feuring M, Wehling M. Multiple actions of

steroid hormones—a focus on rapid, nongenomic effects. Pharmacological reviews

2000;52(4):513-556.

57. Diamanti-Kandarakis E, Bourguignon J-P, Giudice LC, Hauser R, Prins GS, Soto

AM, Zoeller RT, Gore AC. Endocrine-disrupting chemicals: an Endocrine Society

scientific statement. Endocrine reviews 2009;30(4):293-342.

58. Pelletier G, Dupont E, Simard J, Luu-The V, Bélanger A, Labrie F. Ontogeny and

subcellular localization of 3β-hydroxysteroid dehydrogenase (3β-HSD) in the human

and rat adrenal, ovary and testis. The Journal of steroid biochemistry and molecular

biology 1992;43(5):451-467.

59. Simard J, Ricketts M-L, Gingras S, Soucy P, Feltus FA, Melner MH. Molecular

biology of the 3β-hydroxysteroid dehydrogenase/Δ5-Δ4 isomerase gene family.

Endocrine reviews 2005;26(4):525-582.

60. Simard J, Moisan AM, Morel Y. Congenital adrenal hyperplasia due to 3beta-

hydroxysteroid dehydrogenase/Delta (5)-Delta (4) isomerase deficiency. 2002. p 255-

276.

61. Nakamura Y, Xing Y, Hui X-G, Kurotaki Y, Ono K, Cohen T, Sasano H, Rainey

WE. Human adrenal cells that express both 3β-hydroxysteroid dehydrogenase type 2

(HSD3B2) and cytochrome b5 (CYB5A) contribute to adrenal androstenedione

production. The Journal of steroid biochemistry and molecular biology

2011;123(3):122-126.

62. Utriainen P, Laakso S, Jääskeläinen J, Voutilainen R. Polymorphisms of POR,

SULT2A1 and HSD11B1 in children with premature adrenarche. Metabolism

2012;61(9):1215-1219.

63. Adamski J, Jakob FJ. A guide to 17β-hydroxysteroid dehydrogenases. Molecular and

cellular endocrinology 2001;171(1):1-4.

64. Tomalik-Scharte D, Maiter D, Kirchheiner J, Ivison HE, Fuhr U, Arlt W. Impaired

hepatic drug and steroid metabolism in congenital adrenal hyperplasia due to P450

oxidoreductase deficiency. European Journal of Endocrinology 2010;163(6):919-924.

65. Briassoulis G, Narlioglou M, Hatzis T. Toxic encephalopathy associated with use of

DEET insect repellents: a case analysis of its toxicity in children. Human &

experimental toxicology 2001;20(1):8-14.

66. Chen-Hussey V, Behrens R, Logan JG. Assessment of methods used to determine the

safety of the topical insect repellent N, N-diethyl-m-toluamide (DEET). Parasit

Vectors 2014;7(1):173.

67. Barr DB, Ananth CV, Yan X, Lashley S, Smulian JC, Ledoux TA, Hore P, Robson

MG. Pesticide concentrations in maternal and umbilical cord sera and their relation to

birth outcomes in a population of pregnant women and newborns in New Jersey.

Science of the total environment 2010;408(4):790-795.

68. Gu X, Wang T, Collins D, Kasichayanula S, Burczynski F. In vitro evaluation of

concurrent use of commercially available insect repellent and sunscreen preparations.

British Journal of Dermatology 2005;152(6):1263-1267.

69. McCain WC, Lee R, Johnson MS, Whaley JE, Ferguson JW, Leach G. Acute oral

toxicity study of pyridostigmine bromide, permethrin, and DEET in the laboratory rat.

Journal of Toxicology and Environmental Health Part A 1997;50(2):113-124.

70. Scheuplein R, Charnley G, Dourson M. Differential sensitivity of children and adults

to chemical toxicity: I. Biological basis. Regulatory Toxicology and Pharmacology

2002;35(3):429-447.

71. Mennecozzi M, Landesmann B, Palosaari T, Harris G, Whelan M. Sex Differences in

Liver Toxicity—Do Female and Male Human Primary Hepatocytes React Differently

to Toxicants In Vitro? PLoS One 2015 Apr 7;10(4): 1-23.

72. Sudakin DL, Trevathan WR. DEET: a review and update of safety and risk in the

general population. Journal of Toxicology: Clinical Toxicology 2003;41(6): 831-839.

73. Tingle CC, Rother JA, Dewhurst CF, Lauer S, King WJ. Fipronil: environmental

fate, ecotoxicology, and human health concerns. Reviews of environmental

contamination and toxicology Springer New York, 2003;176:1-66.

TABLES

Table 1.1. Putative functions for the 15 protein-coding transcripts with the highest log2 fold changeᵃ that were up-regulated and down-regulated when primary human hepatocytes were treated for 72 hours with 100 µM DEET.

| Differential Expression | Gene Symbol | Gene Name | Function | Log2 Fold Change |
|---|---|---|---|---|
| Up | CYP2B6 | Cytochrome P450, family 2, subfamily B, polypeptide 6 | Enzyme | +3.82 |
| Up | CYP3A7 | Cytochrome P450, family 3, subfamily A, polypeptide 7 | Enzyme | +3.40 |
| Up | CYP3A43 | Cytochrome P450, family 3, subfamily A, polypeptide 43 | Enzyme | +3.26 |
| Up | CYP2B7P1 | Cytochrome P450, family 2, subfamily B, polypeptide 7 pseudogene 1 | Enzyme | +2.97 |
| Up | CXCL13 | Chemokine (C-X-C motif) ligand 13 | Immune response | +2.80 |
| Up | ALDH3A1 | Aldehyde dehydrogenase 3 family, member A1 | Enzyme | +2.77 |
| Up | ALPI | Alkaline phosphatase | Enzyme | +2.71 |
| Up | FOXN4 | Forkhead box N4 | Transcription factor | +2.59 |
| Up | GSTA2 | Glutathione S-transferase alpha 2 | Enzyme | +2.51 |
| Up | CYP2A7 | Cytochrome P450, family 2, subfamily A, polypeptide 7 | Enzyme | +2.45 |
| Up | CYP2A13 | Cytochrome P450, family 2, subfamily A, polypeptide 13 | Enzyme | +2.34 |
| Up | EPHX1 | Epoxide hydrolase 1, microsomal (xenobiotic) | Enzyme | +2.30 |
| Up | CYP2C8 | Cytochrome P450, family 2, subfamily C, polypeptide 8 | Enzyme | +2.30 |
| Up | TNFRSF19 | Tumor necrosis factor receptor superfamily, member 19 | Signal transduction | +2.28 |
| Up | CYP2A6 | Cytochrome P450, family 2, subfamily A, polypeptide 6 | Enzyme | +2.21 |
| Down | UROC1 | Urocanate hydratase 1 | Enzyme | -2.06 |
| Down | HSD3B1 | Hydroxy-delta-5-steroid dehydrogenase, 3 beta- and steroid delta-isomerase 1 | Enzyme | -1.89 |
| Down | PCDH11X | Protocadherin 11 X-linked | Cell adhesion | -1.68 |
| Down | BTNL9 | Butyrophilin-like 9 | Immune response | -1.63 |
| Down | C1orf111 | Chromosome 1 open reading frame 111 | Unknown | -1.52 |
| Down | HAPLN3 | Hyaluronan and proteoglycan link protein 3 | Cell adhesion | -1.47 |
| Down | TAT | Tyrosine aminotransferase, nuclear gene encoding mitochondrial protein | Enzyme | -1.43 |
| Down | GPR153 | G protein-coupled receptor 153 | Signal transduction | -1.41 |
| Down | ADRA1A | Adrenoceptor alpha 1A, transcript variant 1 | Signal transduction | -1.38 |
| Down | TRABD2B | TraB domain containing 2B | Enzyme | -1.25 |
| Down | C1orf226 | Chromosome 1 open reading frame 226, transcript variant 2 | Unknown | -1.22 |
| Down | ABCB11 | ATP-binding cassette, sub-family B (MDR/TAP), member 11 | Transmembrane transport | -1.20 |
| Down | PLCH2 | Phospholipase C, eta 2 | Enzyme | -1.20 |
| Down | NOS1AP | Nitric oxide synthase 1 (neuronal) adaptor protein, transcript variant 2 | Cell signaling | -1.20 |
| Down | DNAH5 | Dynein, axonemal, heavy chain 5 | Cell movement | -1.18 |

ᵃ Fold change was statistically significant (α = 0.05).
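The tables report changes on a log2 scale, so it can help to convert a few values back to linear ratios relative to the DMSO control. The following is a minimal Python sketch for illustration only and is not part of the original analysis; the function name `log2fc_to_fold` is hypothetical, and the example values are copied from Table 1.1.

```python
def log2fc_to_fold(log2fc: float) -> float:
    """Convert a signed log2 fold change to a linear ratio (treated / control)."""
    return 2.0 ** log2fc


# Example values copied from Table 1.1 (100 µM DEET).
examples = {"CYP2B6": 3.82, "CYP2A6": 2.21, "UROC1": -2.06, "DNAH5": -1.18}

for gene, l2fc in examples.items():
    ratio = log2fc_to_fold(l2fc)
    # A ratio above 1 means more transcript than control; below 1 means less.
    print(f"{gene}: log2 fold change {l2fc:+.2f} -> {ratio:.2f} x control")
```

On this scale a value of +1 is a doubling and -1 is a halving, so the +3.82 reported for CYP2B6 corresponds to roughly a 14-fold increase over the control.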

Table 1.2. Putative functions for the 15 protein-coding transcripts with the highest log2 fold changeᵃ that were up-regulated and down-regulated when primary human hepatocytes were treated for 72 hours with 10 µM fipronil.

| Differential Expression | Gene Symbol | Gene Name | Function | Log2 Fold Change |
|---|---|---|---|---|
| Up | ALPI | Alkaline phosphatase | Enzyme | +5.90 |
| Up | FAM26D | Family with sequence similarity 26, member D | Pore-forming subunit | +5.12 |
| Up | C4orf6 | Chromosome 4 open reading frame 6 | Unknown | +4.46 |
| Up | CA12 | Carbonic anhydrase XII | Enzyme | +4.06 |
| Up | CYP3A4 | Cytochrome P450, family 3, subfamily A, polypeptide 4 | Enzyme | +3.86 |
| Up | SLC51B | Solute carrier family 51, beta subunit | Transporter activity | +3.80 |
| Up | PRAMEF8 | PRAME family member 8 | Retinoic acid binding | +3.73 |
| Up | MT3 | Metallothionein 3 | Binding heavy metals | +3.72 |
| Up | CYP3A7 | Cytochrome P450, family 3, subfamily A, polypeptide 7 | Enzyme | +3.71 |
| Up | CYP2B6 | Cytochrome P450, family 2, subfamily B, polypeptide 6 | Enzyme | +3.54 |
| Up | HTR3A | 5-hydroxytryptamine (serotonin) receptor 3A | Signal transduction | +3.52 |
| Up | TMPRSS4 | Transmembrane protease, serine 4 | Enzyme | +3.42 |
| Up | STC1 | Stanniocalcin 1 | Cell signaling | +3.35 |
| Up | CYP26A1 | Cytochrome P450, family 26, subfamily A, polypeptide 1 | Enzyme | +3.34 |
| Up | CCDC64B | Coiled-coil domain containing 64B | GTPase binding | +3.21 |
| Down | MTRNR2L1 | MT-RNR2-like 1 | Neuroprotective role | -12.02 |
| Down | VN1R2 | Vomeronasal 1 receptor 2 | Pheromone receptor | -5.64 |
| Down | FCAR | Fc fragment of IgA, receptor for (FCAR), transcript variant 1 | Immune activity | -5.00 |
| Down | ZNF793 | Zinc finger protein 793 | Transcriptional regulation | -4.45 |
| Down | TMEM212 | Transmembrane protein 212 | Transmembrane transport | -4.35 |
| Down | KLK14 | Kallikrein-related peptidase 14 | Enzyme | -4.29 |
| Down | CABP4 | Calcium binding protein 4 | Synaptic function | -4.01 |
| Down | KIAA0226 | KIAA0226, transcript variant 1 | Cell degradation | -3.70 |
| Down | ZNF486 | Zinc finger protein 486 | Transcriptional regulation | -3.69 |
| Down | C21orf62 | Chromosome 21 open reading frame 62, transcript variant 1 | Unknown | -3.58 |
| Down | LILRA5 | Leukocyte immunoglobulin-like receptor, subfamily A, member 5 | Immune activity | -3.57 |
| Down | REXO1 | REX1, RNA exonuclease 1 homolog (S. cerevisiae) | Nucleic acid binding | -3.53 |
| Down | MT1B | Metallothionein 1B | Binding heavy metals | -3.46 |
| Down | WNT7B | Wingless-type MMTV integration site family, member 7B | Cell signaling | -3.44 |
| Down | ADH1B | Alcohol dehydrogenase 1B (class I), beta polypeptide, transcript variant 1 | Enzyme | -3.35 |

ᵃ Fold change was statistically significant (α = 0.05).

Table 1.3. Putative functions for the 15 protein-coding transcripts with the highest log2 fold changeᵃ that were up-regulated and down-regulated when primary human hepatocytes were treated for 72 hours with a mixture of 100 µM DEET and 10 µM fipronil.

| Differential Expression | Gene Symbol | Gene Name | Function | Log2 Fold Change |
|---|---|---|---|---|
| Up | ALPI | Alkaline phosphatase | Enzyme | +6.09 |
| Up | CA12 | Carbonic anhydrase XII | Enzyme | +4.43 |
| Up | SLC51B | Solute carrier family 51, beta subunit | Transporter activity | +4.31 |
| Up | C4orf6 | Chromosome 4 open reading frame 6 | Unknown | +4.30 |
| Up | FAM26D | Family with sequence similarity 26, member D | Pore-forming subunit | +4.26 |
| Up | CYP26A1 | Cytochrome P450, family 26, subfamily A, polypeptide 1 | Enzyme | +4.23 |
| Up | HTR3A | 5-hydroxytryptamine (serotonin) receptor 3A | Ion channel receptor | +4.16 |
| Up | CYP2B6 | Cytochrome P450, family 2, subfamily B, polypeptide 6 | Enzyme | +3.84 |
| Up | GSTA2 | Glutathione S-transferase alpha 2 | Enzyme | +3.81 |
| Up | MT3 | Metallothionein 3 | Binding heavy metals | +3.77 |
| Up | CYP3A4 | Cytochrome P450, family 3, subfamily A, polypeptide 4 | Enzyme | +3.73 |
| Up | PRAMEF8 | PRAME family member 8 | Retinoic acid binding | +3.61 |
| Up | CYP3A7 | Cytochrome P450, family 3, subfamily A, polypeptide 7 | Enzyme | +3.57 |
| Up | BRINP2 | Bone morphogenetic protein/retinoic acid inducible neural-specific 2 | regulation | +3.31 |
| Up | PSG9 | Pregnancy specific beta-1-glycoprotein 9 | Immune response | +3.17 |
| Down | MTRNR2L1 | MT-RNR2-like 1 | Neuroprotective role | -11.56 |
| Down | VN1R2 | Vomeronasal 1 receptor 2 | Pheromone receptor | -6.72 |
| Down | AC079610.2 | Uncharacterized LOC100130451 | Unknown | -6.07 |
| Down | MLANA | Melan-A | Immune response | -4.66 |
| Down | C21orf62 | Chromosome 21 open reading frame 62, transcript variant 1 | Unknown | -4.43 |
| Down | GSG1 | Germ cell associated 1, transcript variant 3 | RNA polymerase binding | -4.42 |
| Down | HTRA4 | HtrA serine peptidase 4 | Enzyme | -4.40 |
| Down | MT1B | Metallothionein 1B | Binding heavy metals | -4.32 |
| Down | CABP4 | Calcium binding protein 4 | Synaptic function | -4.32 |
| Down | KLK14 | Kallikrein-related peptidase 14 | Enzyme | -4.27 |
| Down | TMEM212 | Transmembrane protein 212 | Transmembrane transport | -4.22 |
| Down | ZNF793 | Zinc finger protein 793 | Transcriptional regulation | -4.09 |
| Down | HMGCS2 | 3-hydroxy-3-methylglutaryl-CoA synthase 2 | Enzyme | -3.93 |
| Down | KIAA0226 | KIAA0226, transcript variant 1 | Cell degradation | -3.65 |
| Down | CYP4A11 | Cytochrome P450, family 4, subfamily A, polypeptide 11 | Enzyme | -3.61 |

ᵃ Fold change was statistically significant (α = 0.05).

Table 1.4. Transcripts associated with endocrine disruption that were up- or down-regulatedᵃ when primary human hepatocytes were treated for 72 hours with 100 µM DEET, 10 µM fipronil or a mixture of 100 µM DEET and 10 µM fipronil.

| Gene Symbol | Gene Name | Function | Log2 Change: DEET | Log2 Change: Fipronil | Log2 Change: DEET+Fipronil |
|---|---|---|---|---|---|
| ADRA1A | Adrenoceptor alpha 1A | Fight-or-flight response | -1.39 | -2.79 | -2.32 |
| ADRA1B | Adrenoceptor alpha 1B | Fight-or-flight response | ─ᶜ | -0.40 | -0.61 |
| ADRA2C | Adrenoceptor alpha 2C | Fight-or-flight response | ─ | +1.91ᵇ | +1.48 |
| ADRB1 | Adrenoceptor beta 1 | Fight-or-flight response | +1.13 | +2.41 | +1.51 |
| ADRB2 | Adrenoceptor beta 2 | Fight-or-flight response | ─ | +0.77 | +1.14 |
| AR | Androgen receptor | Steroid hormone activity | ─ | -0.40 | -0.53 |
| CALCB | Calcitonin-related polypeptide beta | Pain perception | ─ | +1.77 | +2.00 |
| CCND1 | Cyclin D1 | Thyroid hormone activity | ─ | ─ | +0.31 |
| CYB5A* | Cytochrome b5, type A | Steroid hormone biosynthesis | +0.88 | +1.07 | +1.02 |
| CYP21A2* | Cytochrome P450, family 21, subfamily A, polypeptide 2 | Steroid hormone biosynthesis | ─ | +0.88 | +0.76 |
| DIO1 | Deiodinase, iodothyronine, type I | Thyroid hormone activity | +0.90 | +0.89 | +1.13 |
| EBAG9 | Estrogen receptor binding site associated | Steroid hormone activity | ─ | ─ | +0.32 |
| ESRRA | Estrogen-related receptor alpha | Steroid hormone activity | ─ | ─ | +0.46 |
| ESR1 | Estrogen receptor 1 | Steroid hormone activity | ─ | -1.81 | -2.06 |
| ESR2 | Estrogen receptor 2 (ER beta) | Steroid hormone activity | ─ | -1.33 | -1.53 |
| GREB1 | Growth regulation by estrogen in breast cancer 1 | Steroid hormone activity | ─ | -1.03 | -1.05 |
| GHR | Growth hormone receptor | Stimulate growth | ─ | -0.36 | -0.65 |
| HSD3B1* | Hydroxy-delta-5-steroid dehydrogenase, 3 beta- and steroid delta-isomerase 1 | Steroid hormone biosynthesis | -1.89 | -3.06 | -2.77 |
| HSD3B7 | Hydroxy-delta-5-steroid dehydrogenase, 3 beta- and steroid delta-isomerase 7 | Steroid hormone biosynthesis | ─ | +0.70 | ─ |
| HSD11B1* | Hydroxysteroid (11-beta) dehydrogenase 1 | Steroid hormone biosynthesis | ─ | -0.44 | -0.66 |
| HSD17B2 | Hydroxysteroid (17-beta) dehydrogenase 2 | Steroid hormone biosynthesis | ─ | +0.48 | +0.32 |
| HSD17B3 | Hydroxysteroid (17-beta) dehydrogenase 3 | Steroid hormone biosynthesis | ─ | -1.25 | -0.98 |
| HSD17B4 | Hydroxysteroid (17-beta) dehydrogenase 4 | Steroid hormone biosynthesis | ─ | -0.36 | ─ |
| HSD17B6 | Hydroxysteroid (17-beta) dehydrogenase 6 | Steroid hormone biosynthesis | ─ | -0.57 | -1.17 |
| HSD17B7 | Hydroxysteroid (17-beta) dehydrogenase 7 | Steroid hormone biosynthesis | ─ | +0.75 | +0.45 |
| HSD17B11 | Hydroxysteroid (17-beta) dehydrogenase 11 | Steroid hormone biosynthesis | ─ | +0.52 | +0.76 |
| HSD17B12 | Hydroxysteroid (17-beta) dehydrogenase 12 | Steroid hormone biosynthesis | ─ | +0.42 | +0.40 |
| HSD17B13* | Hydroxysteroid (17-beta) dehydrogenase 13 | Steroid hormone biosynthesis | +0.70 | -0.33 | -1.38 |
| HSD17B14 | Hydroxysteroid (17-beta) dehydrogenase 14 | Steroid hormone biosynthesis | ─ | +1.58 | +1.79 |
| HSDL1 | Hydroxysteroid dehydrogenase like 1 | Steroid hormone biosynthesis | ─ | +0.58 | +0.57 |
| HTR3A | 5-hydroxytryptamine (serotonin) receptor 3A, ionotropic | Neurotransmission | ─ | +3.52 | +4.16 |
| IGF1 | Somatomedin C | Stimulate growth | ─ | -1.38 | -1.77 |
| INS | Insulin | Glucose regulation | ─ | -0.58 | -1.17 |
| NCOR1 | Nuclear receptor corepressor 1 | Thyroid hormone activity | ─ | ─ | -0.33 |
| NR1I2 | Nuclear receptor subfamily 1, group I, member 2 (pregnane X receptor) | Steroid hormone activity | ─ | -0.51 | -0.36 |
| NR3C2 | Nuclear receptor subfamily 3, group C, member 2 (mineralocorticoid receptor) | Steroid hormone activity | ─ | ─ | -0.51 |
| NRIP1 | Nuclear receptor interacting protein 1 | Lipid and glucose metabolism | ─ | +0.31 | +0.35 |
| NR0B2 | Nuclear receptor subfamily 0, group B, member 2 | Thyroid hormone activity | ─ | +0.35 | +0.58 |
| POR* | P450 (cytochrome) oxidoreductase | Steroid hormone biosynthesis | +0.97 | +1.12 | +1.24 |
| PTH2R | Parathyroid hormone 2 receptor | Calcium regulation | ─ | +0.92 | ─ |
| RARA | Retinoic acid receptor, alpha | Thyroid hormone activity | ─ | ─ | -0.46 |
| RXRA | Retinoid X receptor, alpha | Thyroid hormone activity | ─ | -0.60 | -0.49 |
| SLC16A2 | Solute carrier family 16, member 2 (thyroid hormone transporter) | Thyroid hormone transport | ─ | -0.52 | -0.79 |
| SULT1A1 | Sulfotransferase family, cytosolic, 1A, phenol-preferring, member 1 | Hormone biosynthesis | ─ | +0.37 | +0.64 |
| SULT1A2 | Sulfotransferase family, cytosolic, 1A, phenol-preferring, member 2 | Hormone biosynthesis | ─ | ─ | +0.91 |
| SULT1B1 | Sulfotransferase family, cytosolic, 1B, member 1 | Steroid hormone biosynthesis | ─ | -1.30 | -0.94 |
| SULT1C2 | Sulfotransferase family, cytosolic, 1C, member 2 | Hormone biosynthesis | ─ | +1.00 | +1.19 |
| SULT1E1 | Sulfotransferase family 1E, estrogen-preferring, member 1 | Steroid hormone biosynthesis | ─ | -1.66 | -0.91 |
| SULT2A1* | Sulfotransferase family, cytosolic, 2A, dehydroepiandrosterone (DHEA)-preferring, member 1 | Hormone biosynthesis | ─ | ─ | +0.82 |
| TEF | Thyrotrophic embryonic factor | Gene repair | ─ | +0.65 | +0.45 |
| THRSP | Thyroid hormone responsive | Lipogenesis | ─ | +1.97 | +1.44 |
| TRIP4 | Thyroid hormone receptor interactor 4 | Thyroid hormone activity | ─ | ─ | +0.31 |

ᵃ Up or down change was statistically significant (α = 0.05).
ᵇ An asterisk "*" denotes a component of the steroid hormone biosynthesis pathway.
ᶜ A horizontal line means that no significant change in gene expression was detected.

Table 1.5. Transcripts for P450s that were up- or down-regulatedᵃ when primary human hepatocytes were treated for 72 hours with 100 µM DEET, 10 µM fipronil or a mixture of 100 µM DEET and 10 µM fipronil (left panel). Venn diagram of P450 distribution between three treatments (right panel).

Up-regulated

| DEET: P450 | Log2 Change | Fipronil: P450 | Log2 Change | DEET+Fipronil: P450 | Log2 Change |
|---|---|---|---|---|---|
| CYP1A1ᵇ | +1.82 | CYP1A1 | +1.09 | CYP1A1 | +2.02 |
| CYP1A2 | +1.24 | ─ | ─ | CYP1A2 | +1.06 |
| CYP2A6 | +2.21 | CYP2A6 | +1.31 | CYP2A6 | +1.75 |
| CYP2A7 | +2.45 | ─ | ─ | CYP2A7 | +1.10 |
| CYP2A13 | +2.34 | ─ | ─ | ─ | ─ |
| CYP2B6 | +3.82 | CYP2B6 | +3.54 | CYP2B6 | +3.85 |
| CYP2B7P1 | +2.97 | CYP2B7P | +2.42 | CYP2B7P | +2.80 |
| CYP2C8 | +2.30 | CYP2C8 | +2.39 | CYP2C8 | +2.04 |
| CYP2C9 | +0.93 | CYP2C9 | +0.75 | CYP2C9 | +0.62 |
| ─ᶜ | ─ | CYP2D7P | +0.39 | ─ | ─ |
| ─ | ─ | CYP3A4 | +3.86 | CYP3A4 | +3.73 |
| CYP3A5 | +1.15 | CYP3A5 | +1.17 | CYP3A5 | +1.21 |
| CYP3A7 | +3.40 | CYP3A7 | +3.71 | CYP3A7 | +3.57 |
| CYP3A43 | +3.26 | CYP3A43 | +2.70 | CYP3A43 | +1.47 |
| ─ | ─ | CYP4F22 | +1.26 | CYP4F22 | +0.95 |
| ─ | ─ | CYP21A2 | +0.88 | CYP21A2 | +0.76 |
| ─ | ─ | CYP26A1 | +3.34 | CYP26A1 | +4.23 |
| ─ | ─ | CYP26B1 | +2.15 | CYP26B1 | +2.42 |
| ─ | ─ | CYP51A1 | +0.62 | ─ | ─ |
| POR | +0.97 | POR | +1.12 | POR | +0.97 |

Down-regulated

| DEET: P450 | Log2 Change | Fipronil: P450 | Log2 Change | DEET+Fipronil: P450 | Log2 Change |
|---|---|---|---|---|---|
| ─ | ─ | ─ | ─ | CYP2C19 | -0.35 |
| ─ | ─ | CYP2E1 | -1.60 | CYP2E1 | -1.87 |
| CYP4A11ᵇ | -1.11 | CYP4A11 | -2.46 | CYP4A11 | -3.61 |
| CYP4A22 | -1.07 | CYP4A22 | -2.08 | CYP4A22 | -2.70 |
| ─ | ─ | ─ | ─ | CYP4F2 | -0.36 |
| ─ | ─ | ─ | ─ | CYP4F12 | -0.39 |
| ─ | ─ | CYP4V2 | -0.46 | CYP4V2 | -0.56 |
| ─ | ─ | CYP8B1 | -0.84 | CYP8B1 | -1.01 |
| ─ | ─ | CYP20A1 | -0.56 | CYP20A1 | -0.66 |
| ─ | ─ | CYP27C1 | -0.85 | CYP27C1 | -1.38 |
| ─ | ─ | ─ | ─ | CYP46A1 | -0.98 |

ᵃ Up or down change was statistically significant (α = 0.05).
ᵇ Colored enzymes were affected in the same direction by all three treatments. Green enzymes were up-regulated by all three treatments, while red enzymes were down-regulated by all three treatments.
ᶜ Straight line means not differentially regulated.
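The Venn diagram named in the caption of Table 1.5 can be approximated from the tabulated P450 names with plain set operations. The sketch below is illustrative only and was not used in the original work; the helper `venn3` is a hypothetical name, and it rests on the assumption that the labels CYP2B7P1 (DEET column) and CYP2B7P (other two columns) refer to the same pseudogene.

```python
def venn3(a, b, c):
    """Return the seven disjoint regions of a three-set Venn diagram."""
    return {
        "all_three": a & b & c,
        "a_and_b_only": (a & b) - c,
        "a_and_c_only": (a & c) - b,
        "b_and_c_only": (b & c) - a,
        "a_only": a - b - c,
        "b_only": b - a - c,
        "c_only": c - a - b,
    }


# Gene names copied from Table 1.5.
# ASSUMPTION: CYP2B7P1 is written here as CYP2B7P so the three columns match.
up = {
    "DEET": {"CYP1A1", "CYP1A2", "CYP2A6", "CYP2A7", "CYP2A13", "CYP2B6",
             "CYP2B7P", "CYP2C8", "CYP2C9", "CYP3A5", "CYP3A7", "CYP3A43", "POR"},
    "fipronil": {"CYP1A1", "CYP2A6", "CYP2B6", "CYP2B7P", "CYP2C8", "CYP2C9",
                 "CYP2D7P", "CYP3A4", "CYP3A5", "CYP3A7", "CYP3A43", "CYP4F22",
                 "CYP21A2", "CYP26A1", "CYP26B1", "CYP51A1", "POR"},
    "mixture": {"CYP1A1", "CYP1A2", "CYP2A6", "CYP2A7", "CYP2B6", "CYP2B7P",
                "CYP2C8", "CYP2C9", "CYP3A4", "CYP3A5", "CYP3A7", "CYP3A43",
                "CYP4F22", "CYP21A2", "CYP26A1", "CYP26B1", "POR"},
}
down = {
    "DEET": {"CYP4A11", "CYP4A22"},
    "fipronil": {"CYP2E1", "CYP4A11", "CYP4A22", "CYP4V2", "CYP8B1",
                 "CYP20A1", "CYP27C1"},
    "mixture": {"CYP2C19", "CYP2E1", "CYP4A11", "CYP4A22", "CYP4F2", "CYP4F12",
                "CYP4V2", "CYP8B1", "CYP20A1", "CYP27C1", "CYP46A1"},
}

# Regions are keyed with a = DEET, b = fipronil, c = mixture.
for label, sets in (("up", up), ("down", down)):
    regions = venn3(sets["DEET"], sets["fipronil"], sets["mixture"])
    for region, genes in regions.items():
        print(label, region, sorted(genes))
```

Under that naming assumption, the "all_three" region corresponds to the enzymes the table footnote describes as colored, that is, those changed in the same direction by every treatment.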

FIGURES

Figure 1.1. Statistically significant (α = 0.05) fold change in transcript levels compared to a control in primary human hepatocytes treated for 72 hours with (A) 100 µM DEET, (B) 10 µM fipronil or (C) a mixture of 100 µM DEET and 10 µM fipronil in DMSO. The control was treated with DMSO alone. (D) Venn diagram showing the common and exclusive total up- and down-regulated transcripts for each treatment (A-C). For A-C, plotted points that are red are statistically significant, whereas plotted points that are black are not significant.

Figure 1.2. Top 10 Gene Ontology (GO) level 3 (biological processes) matches for the statistically significant (α = 0.05) changes in transcript levels compared to a control (Fig. 1.1) in primary human hepatocytes treated for 72 hours with 100 µM DEET (left), 10 µM fipronil (middle), and a mixture of 100 µM DEET and 10 µM fipronil (right) in DMSO. The control was treated with DMSO alone.

Figure 1.3. Top 10 Gene Ontology (GO) level 3 (molecular functions) matches for the statistically significant (α = 0.05) changes in transcript levels compared to a control (Fig. 1.1) in primary human hepatocytes treated for 72 hours with 100 µM DEET (left), 10 µM fipronil (middle), and a mixture of 100 µM DEET and 10 µM fipronil (right) in DMSO. The control was treated with DMSO alone.

Figure 1.4. Human chromosomal maps showing the location of genes for the statistically significant (α = 0.05) up- and down-regulated transcripts produced by primary human hepatocytes treated for 72 hours with 100 µM DEET in DMSO (4A), 10 µM fipronil in DMSO (4B), and a mixture of 100 µM DEET plus 10 µM fipronil in DMSO (4C) compared to a DMSO-only control. Each chromosome is drawn with a length proportional to its relative length in the human genome. The relative position of the centromere is shown as an indentation in the chromosome structure, and the relative location of each gene that produced an up- or down-regulated transcript is indicated by a horizontal line. The black number below each chromosome illustration identifies the chromosome it represents. The blue number above the chromosome illustrations in 4A-4C represents the percentage of genes that were differentially expressed on that chromosome. The black number above the chromosome illustrations in 4A-4C represents the genes affected as a percentage of total genes on that chromosome. The right and left panels in 4D are a graphical representation of the numerical values above each chromosome illustration.
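The two percentages described in the caption can be sketched as follows. This is one reading of the caption, not the authors' code: the blue number is taken as the share of all differentially expressed genes that sit on a given chromosome, and the black number as the affected genes divided by that chromosome's total gene count. The function name and the toy counts are hypothetical and are not data from this study.

```python
def chromosome_percentages(de_counts, total_counts):
    """For each chromosome return (share of all DE genes, share of the
    chromosome's genes that are DE), both as percentages."""
    total_de = sum(de_counts.values())
    result = {}
    for chrom, n_de in de_counts.items():
        share_of_de = 100.0 * n_de / total_de                 # "blue" number (assumed reading)
        share_of_chrom = 100.0 * n_de / total_counts[chrom]   # "black" number (assumed reading)
        result[chrom] = (round(share_of_de, 2), round(share_of_chrom, 2))
    return result


# HYPOTHETICAL toy counts, chosen only to make the arithmetic easy to follow.
toy_de = {"1": 3, "2": 1}          # differentially expressed genes per chromosome
toy_total = {"1": 100, "2": 50}    # annotated genes per chromosome

for chrom, (blue, black) in chromosome_percentages(toy_de, toy_total).items():
    print(f"chromosome {chrom}: {blue}% of DE genes, {black}% of genes on chromosome")
```

Reporting both numbers separates chromosomes that carry many affected genes simply because they are gene-rich from those that are affected out of proportion to their gene content.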

[Figure 1.4 panels 4A (DEET), 4B (fipronil), 4C (DEET plus fipronil), and 4D (graphical summary): chromosome maps annotated with the per-chromosome percentages described in the caption.]

Figure 1.5. Transcripts for enzymes in the steroid hormone biosynthesis pathway that were significantly (α = 0.05) up- (green) or down- (red) regulated when primary human hepatocytes were treated for 72 hours with 100 µM DEET, 10 µM fipronil, or a combination of 100 µM DEET plus 10 µM fipronil. 3β-HSD1 = hydroxy-delta-5-steroid dehydrogenase, 3 beta- and steroid delta-isomerase 1; 17β-HSD = 17β-hydroxysteroid dehydrogenase; 11β-HSD1 = 11β-hydroxysteroid dehydrogenase type 1; POR = cytochrome P450 reductase; CYB5A = cytochrome b5, type A; SULT2A1 = sulfotransferase 2A1; CYP21A2 = cytochrome P450 21A2. d = DEET; f = fipronil; d+f = DEET plus fipronil.

─── CHAPTER 2 ───

Differential Expression Profile of lncRNAs from Primary Human Hepatocytes Following DEET and Fipronil Exposure

Robert D. Mitchell,1 Ernest Hodgson,2,3 Andrew Wallace,3 and R. Michael Roe1,4

1 Department of Entomology and Plant Pathology, Campus Box 7647, 3230 Ligon Street, North Carolina State University, Raleigh, NC 27695-7647, USA ([email protected]; [email protected])
2 Department of Applied Ecology, Toxicology Program, Department of Biology, North Carolina State University, Raleigh, NC 27695, USA ([email protected])
3 Toxicology Program, Department of Biology, North Carolina State University, Raleigh, NC 27695, USA ([email protected])

4Corresponding author, e-mail: [email protected]

This chapter was formatted for the International Journal of Molecular Sciences.

Abstract

While the synthesis and use of new chemical compounds are at an all-time high, the study of their potential impact on human health is quickly falling behind. We chose to examine the effects of two common environmental chemicals, the insect repellent DEET (N, N-diethyl-m-toluamide) and the insecticide fipronil (fluocyanobenpyrazole), on transcript levels of long non-protein coding RNAs (lncRNAs) in primary human hepatocytes. While lncRNAs are believed to play a critical role in numerous important biological processes, many remain uncharacterized, and their functions and modes of action remain largely unclear, especially in relation to environmental chemicals. RNA-Seq showed that 100 µM DEET significantly increased transcript levels for 2 lncRNAs and lowered transcript levels for 18 lncRNAs, while fipronil at 10 µM increased transcript levels for 76 lncRNAs and decreased levels for 193 lncRNAs. A mixture of 100 µM DEET and 10 µM fipronil increased transcript levels for 75 lncRNAs and lowered transcript levels for 258 lncRNAs. Differentially expressed lncRNA genes were mapped to chromosomes, analyzed by proximity to neighboring protein-coding genes, and functionally characterized via gene ontology and molecular mapping algorithms.

While further testing is required to assess the organismal impact of changes in transcript levels, initial analysis links several of the dysregulated lncRNAs to processes and pathways critical to proper cellular function like the innate and adaptive immune response and p53 signaling pathway.

Key Words: DEET, Fipronil, Long non-protein coding RNAs, lncRNA, noncoding, non-coding, environmental chemicals

Introduction

The study of the impact of environmental chemicals on human health has fallen significantly behind the rate at which we synthesize new chemical compounds. Recent studies suggest that while we generate nearly 10 million new chemical compounds annually, factors such as lack of funding and reduced public interest and awareness have diminished the amount of research into the potential consequences of human exposure to environmental chemicals [1].

To fully understand the impact of chemicals on human health and provide more rapid and comprehensive methods to evaluate risk, we must take advantage of recent advances in high-throughput DNA sequencing, annotation of the human genome, and bioinformatics, while at the same time using global approaches relating effects of chemical exposure on molecular pathways to whole organism function.

The insect repellent DEET (N, N-diethyl-m-toluamide) and the insecticide fipronil (fluocyanobenpyrazole) are pesticides that have a high potential for human exposure. DEET is applied to skin at concentrations of 5 to 100% to repel insects and other arthropods and is used by approximately 30% of the U.S. population annually [2]. Recently, fetal Zika virus infections in the Americas linked to severe birth defects including microcephaly [3] have further increased our utilization of and dependency on DEET. Expectant mothers are encouraged to use this repellent on their skin at a minimum concentration of 30% active ingredient every time they are at risk of being bitten by mosquitoes before and during pregnancy [4]. The impact of this level of repetitive DEET use on human health has never been considered before. Fipronil is an insecticide used to treat companion animals for fleas and ticks as well as around the home for other insect control, e.g., termites, roaches, and ants [5]. It is also used as a pesticide in numerous countries around the world to protect crops such as corn and cotton [6-8]. However, several countries in the European Union have restricted the use of fipronil, citing its potential lethality to honeybees [6]. High doses of fipronil in humans have in some cases resulted in severe vomiting, agitation, and seizures [9].

While both DEET and fipronil have been available commercially for many years (DEET since 1957 and fipronil since 1993), the study of these compounds directly in human systems can improve our understanding of their toxicology [10] and also provide a model for developing a new approach to risk assessment for environmental chemicals in general.

Previous studies have measured DEET and fipronil metabolite levels in blood and urine, but no work has been conducted at the DNA/RNA level in relation to human health [11, 12].

DEET is metabolized in humans by cytochrome P450 enzymes into the primary metabolites N, N-diethyl-m-hydroxymethylbenzamide (BALC) and N-ethyl-m-toluamide (ET). While several P450s were demonstrated to be active in DEET metabolism, CYP2B6 was the principal P450 responsible for the conversion of DEET to BALC and CYP2C19 was the principal P450 responsible for the conversion of DEET to ET [13]. DEET metabolites are primarily excreted from the human body in urine but can also be expelled in feces [14]. The predominant metabolite of fipronil is fipronil sulfone (5-amino-1-(2,6-dichloro-4-trifluoromethylphenyl)-3-cyano-4-trifluoromethylsulfonylpyrazole), which is formed primarily by the cytochrome P450 enzyme CYP3A4 [15, 16]. Unlike DEET, fipronil is primarily eliminated in the feces [17]. We previously provided evidence that the exposure of primary human hepatocytes to 100 µM DEET, 10 µM fipronil, and a mixture of 100 µM DEET and 10 µM fipronil significantly altered transcript levels for numerous protein-coding and non-protein coding genes [18]. Here the goal was to determine which epigenetic elements, specifically long non-protein coding RNA (lncRNA) transcripts, were significantly differentially expressed after exposure to DEET and fipronil at the same concentrations and what role they may play in the response to these chemicals.

Long non-protein coding RNAs (lncRNAs) are RNA transcripts greater than 200 nucleotides long that rarely code for protein. They are the largest class of noncoding genes and are processed much like messenger RNA (mRNA), i.e., they are transcribed from active chromatin and can have a 5’ cap and a poly(A) tail. The nucleotide sequence and natural structure of lncRNAs predominantly determine which RNA, DNA, or proteins they will interact with [19].

We now know that protein-coding genes account for less than 2% of the human genome and the majority of transcripts do not code for protein but perform other essential functions [20].

There are several distinct sub-categories of lncRNAs based on their configuration in the genome (i.e., location and proximity to protein-coding genes). Antisense lncRNAs are transcribed from the strand opposite a protein-coding gene or other sense-strand transcript. Intronic lncRNAs are transcribed wholly within an intron (no overlap with the exons) of a protein-coding gene. Intergenic lncRNAs (lincRNAs) are transcribed completely within the genomic space between protein-coding genes without overlapping the transcriptional units [21]. Once thought to be “junk” or “artifacts”, lncRNAs are now believed to play a critical role at every level of gene regulation in major biological processes like growth, development, and metabolism and are referred to as the “dark energy” of DNA [19, 22, 23]. They can serve as transcription signals, transcription factor decoys, guides for chromatin-modifying enzymes, and molecular scaffolds facilitating ribonucleoprotein complex formation [20]. While recent reviews of lncRNAs focus on their activity across numerous biological processes, there is relatively little known of their mode(s) of action or their activity in response to environmental chemicals [24].
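The positional sub-categories described above (antisense, intronic, intergenic) can be expressed as a small interval test. The following is a simplified sketch, not the annotation pipeline used in this study; the `Gene` record, the function names, and the order in which the categories are checked are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Gene:
    chrom: str
    start: int            # 1-based, inclusive
    end: int              # inclusive
    strand: str           # "+" or "-"
    exons: List[Tuple[int, int]] = field(default_factory=list)


def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True when two closed intervals share at least one position."""
    return a_start <= b_end and b_start <= a_end


def classify_lncrna(lnc: Gene, coding_genes: List[Gene]) -> str:
    """Assign a positional category relative to protein-coding genes.

    ASSUMED precedence: intergenic, then antisense, then intronic, else 'other'.
    """
    hits = [g for g in coding_genes
            if g.chrom == lnc.chrom and overlaps(lnc.start, lnc.end, g.start, g.end)]
    if not hits:
        return "intergenic"      # no overlap with any protein-coding gene
    if any(g.strand != lnc.strand for g in hits):
        return "antisense"       # overlaps a gene on the opposite strand
    for g in hits:
        inside_gene = g.start <= lnc.start and lnc.end <= g.end
        touches_exon = any(overlaps(lnc.start, lnc.end, s, e) for s, e in g.exons)
        if inside_gene and not touches_exon:
            return "intronic"    # wholly within an intron of a same-strand gene
    return "other"               # e.g. same-strand overlap with an exon


# Hypothetical protein-coding gene with two exons and one intron.
coding = [Gene("chr1", 1000, 5000, "+", exons=[(1000, 1500), (4000, 5000)])]
print(classify_lncrna(Gene("chr1", 2000, 3000, "+"), coding))
print(classify_lncrna(Gene("chr1", 2000, 3000, "-"), coding))
print(classify_lncrna(Gene("chr1", 6000, 7000, "+"), coding))
```

A transcript can satisfy more than one definition (for example, an opposite-strand transcript lying inside an intron), so any real annotation has to fix a precedence; the one used here is only one possible choice.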

Considering an almost complete lack of knowledge of the role of lncRNAs in animal and human responses to environmental chemicals, the objective of this study was to analyze the impact of two common environmental chemicals, DEET and fipronil, alone and in a mixture, on lncRNA transcript levels in primary human hepatocytes. The research also included an analysis of the interaction of lncRNA transcription with that for coding genes to provide leads for future assessments of the risks of chemical exposure. Studying DEET and fipronil both alone and in combination in primary liver cells can provide insight into whether processes and pathways are shared or unique among the exposure conditions examined. The use of primary human cells provides the closest possible estimate of the chemical-human global molecular interaction.

Materials and Methods

Cell Culture and Treatments

Plated primary human hepatocytes were obtained from Life Technologies Corporation, Carlsbad, CA, USA within 24 hours of their procurement from the patient. They arrived at our laboratory submerged in William’s E Medium in 12-well culture plates coated with a Collagen (Type I) substratum and a Geltrex® overlay at a density of 0.67 × 10^6 cells per well. The cells were harvested from the liver of a Caucasian female, age 62, with a body mass index (BMI) of 26.3 and no history of smoking or alcohol consumption. Individuals with a BMI > 27.5 show moderate steatosis and lower BMI levels indicate low or no steatosis [25], a condition that may affect molecular studies. Upon arrival, the medium the cells were shipped in was removed and replaced with fresh, sterile William’s E Medium containing 0.292 g/L L-glutamine (Cat. No. W1878, Sigma-Aldrich, St. Louis, MO, USA) supplemented with (i) 10% (v/v) fetal bovine serum (Cat. No. S11050, Atlanta Biologicals, Norcross, GA, USA), (ii) 10^-7 M dexamethasone (Cat. No. P0500, Steraloids, Inc., Newport, RI, USA), (iii) insulin-transferrin-selenium A formulated with 0.17 mM insulin, 0.0069 mM transferrin, 0.0039 mM sodium selenite and 100.0 mM sodium pyruvate (Cat. No. 51300-044, Invitrogen, Carlsbad, CA, USA), (iv) penicillin/streptomycin/amphotericin B solution containing 10,000 units/mL of penicillin, 10,000 µg/mL of streptomycin, and 25 µg/mL of Fungizone® antimycotic (Cat. No. 15240-062, Invitrogen, Carlsbad, CA, USA), and (v) a sterile gentamicin/glutamine solution containing 200 mM L-glutamine and 5 mg/mL gentamicin in 0.9% sodium chloride (Cat. No. G9654, Sigma-Aldrich, St. Louis, MO, USA). The plate was then placed in a humidified incubator (relative humidity of 95%) at 5% CO2/95% air at a temperature of 37°C for 24 hours. After 24 hours, the medium was changed once again, and the cells were incubated for 24 hours, resulting in a period of 48 hours during which the cells were maintained in fresh media for viability and quality observations.

Treatments with DEET and fipronil began 48 hours after the cells arrived. On the same plate, three wells of primary human hepatocytes were each inoculated with DEET (purity >

98%; Cat. No. F2284, Chem Service, Inc., West Chester, PA, USA) producing a final concentration of 100 µM in each well, three different wells were inoculated each with fipronil (purity > 98%; Cat. No. PS2136, Chem Service, Inc., West Chester, PA, USA) producing a final concentration of 10 µM in each well, and three different wells each were inoculated with a combination of DEET and fipronil (mixed together before adding) producing a final concentration of 100 µM DEET and 10 µM fipronil. DEET and fipronil were kept at room temperature until the day of dosing the cells. The insecticides were added to the culture media dissolved (wt/vol) in DMSO (dimethyl sulfoxide; ≥ 99.7% pure; Cat.

No. BP231-100, Fisher Scientific International, Inc., Hampton, NH, USA). The amount of

DMSO (0.1% final concentration) was the same for all treatments and was previously shown to produce minimal cytotoxicity or changes in gene expression for hepatocytes in culture

(LeCluyse et al. [26]). The concentration levels chosen for the insecticides were determined from the dose-response data of Das et al. [27, 16] for DEET and fipronil, respectively, that produced the maximum increase in P450 transcript, protein and activity levels for multiple

P450s in both primary human hepatocytes and immortalized liver (HepG2) cells and with the


lowest possible cytotoxic effects. This choice of dose was made to repeat as closely as possible our previous experiments at the maximum impact of these chemicals on P450s in the absence of cytotoxicity [26, 27] while expanding our work to examine the impact of these treatments on global transcript levels. The experimental assay conditions in this paper for primary human hepatocytes were identical to those of Das et al. (22, 23). The DEET concentration chosen (100 µM) is in the intoxication range of what would be expected in human blood within 8 hours after a dermal treatment, 6.7-fold greater than what would be expected in human blood when DEET is appropriately applied at a maximum dose, and approximately one-fifth that for a person subjected to an acute intentional oral overdose of

DEET [28]. No data are available on fipronil levels in human blood; the fipronil treatment level in this study was one-tenth that of DEET. As a vehicle control, three separate wells were treated with culture medium supplemented with DMSO only, in the same amount as the insecticide treated wells. Once all cells received the treatment or carrier only, they were incubated undisturbed for 72 hours in a humidified incubator under the environmental conditions previously described.

RNA Isolation, Quality Assessment, and Sequencing

After 72 hours of treatment, the medium was removed from all wells, and the wells washed with 500 µL of 1X phosphate-buffered saline formulated with 1 mM KH2PO4, 155 mM NaCl and 3 mM Na2HPO4-7H2O at a pH of 7.4 (Cat. No. 10010-023, Life Technologies,

Carlsbad, CA, USA). Then, 350 µL of RLT lysis buffer (Qiagen, Inc., Valencia, CA, USA) was added to each well, and the cells scraped from the plate’s solid support using a sterile,


disposable cell scraper (producing a cell suspension in RLT buffer). Suspensions from each individual well (3 DEET wells, 3 fipronil wells, 3 DEET plus fipronil wells, and 3 control wells) were separately stored in 1.5 mL microcentrifuge tubes at -80°C until they were used for RNA extraction using the RNeasy Mini Kit (Cat. No. 74104, Qiagen, Inc., Valencia, CA,

USA) per the manufacturer’s protocol. Each isolated total RNA sample was separately analyzed for purity on an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA,

USA) by the North Carolina State Genome Sciences Laboratory, Raleigh, NC. No samples with an RNA Integrity Number (RIN) of less than 9.0 were used for sequencing; the lowest

RIN obtained was 9.4. Sequencing of all treatments and the controls was performed on the

Illumina HiSeq 2000 platform (Illumina, Inc., San Diego, CA, USA) at the Beijing Genomics

Institute collaborative genome center at the Children’s Hospital of Philadelphia

(BGI@CHOP, Philadelphia, PA, USA). RNA-Seq libraries were prepared using the TruSeq

RNA Sample Preparation kit following the manufacturer’s protocol. The libraries were multiplexed with three samples per lane randomly distributed over four flow cell lanes (12 samples total as defined above) and run on the paired-end read (2 × 100-bp) setting. The calculated Phred quality scores (Q scores) showed that approximately 97% of the bases read in each lane had a score of >Q30. Quality scores are used to predict the probability that an error will occur during base calling. Runs where the majority of the bases score Q30 or above (1 error in 1000 bases) are ideal for most sequencing applications [29].
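The relationship between a Phred quality score and the base-calling error probability is P = 10^(-Q/10). The short sketch below illustrates that conversion; the function names are illustrative only and are not part of the sequencing pipeline used in this study.

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Error probability implied by a Phred quality score: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p: float) -> float:
    """Inverse conversion: Q = -10 * log10(P)."""
    return -10 * math.log10(p)


# Q30 corresponds to roughly 1 expected error per 1,000 base calls.
for q in (10, 20, 30, 40):
    p = phred_to_error_probability(q)
    print(f"Q{q}: error probability {p:.4g}")
```

Under this definition, a run in which about 97% of bases exceed Q30 has an expected error rate below 1 in 1,000 for those bases.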


Data Analysis

Elements of the Tuxedo suite pipeline [30] were used to analyze the RNA-Seq data. Each of the fastq files was aligned to the hg19 build of the human genome. The hg19 annotation file from the University of California, Santa Cruz (UCSC) was used along with Cufflinks [31,

32] to guide an assembly for each of the datasets. These assemblies were merged with

CuffMerge, and the transcripts from the merged assembly were used as an annotation file for calculation of differential expression via CuffDiff. In running CuffDiff, the ‘rescue method’ for multi-reads was implemented, and normalization was performed using a geometric mean.

Quality-control and result plots were generated with the CummeRbund package [33]. Scatter plots indicated no problems with the normalization step. The distribution of fragments per kilobase of exon per million fragments mapped (FPKM) appeared to be similar across all replicates.
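As a reminder of what the FPKM unit expresses, the sketch below computes it from raw counts using the textbook definition. This is a simplification offered for illustration: Cufflinks and CuffDiff estimate fragment abundances with a statistical model and an effective transcript length, so the function here (a hypothetical helper, not part of the Tuxedo suite) will not reproduce their values exactly.

```python
def fpkm(fragments: int, transcript_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of exon per million fragments mapped.

    fragments              -- fragments assigned to the transcript
    transcript_length_bp   -- exonic length of the transcript in base pairs
    total_mapped_fragments -- all mapped fragments in the library
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    kilobases = transcript_length_bp / 1_000
    millions_mapped = total_mapped_fragments / 1_000_000
    return fragments / (kilobases * millions_mapped)


# Illustrative numbers only (not data from this study):
# 500 fragments on a 2,000-bp transcript in a library of 20 million mapped fragments.
example_value = fpkm(500, 2_000, 20_000_000)
print(f"FPKM = {example_value:.2f}")
```

Because the value is scaled by both transcript length and library size, similar FPKM distributions across replicates suggest that sequencing depth did not differ in a way that would distort the comparison.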

The protein-coding and non-protein-coding genes whose transcripts were differentially expressed at a significance level of P ≤ 0.01 were arranged in Venn diagrams to explore shared and unique genes among our treatment conditions. The lncRNAs with differentially expressed transcripts, at a significance level of P ≤ 0.01, were extracted from the total transcriptome data set and further classified using the Genomic Regions Enrichment of Annotations Tool (GREAT) to map lncRNAs whose transcripts were up- or down-regulated to potential target genes based on proximity to a transcription start site (TSS) and gene annotations of the neighboring proteins [34]. The GREAT algorithm assumes that lncRNA transcription sites within 1,000 kilobases (kb) of a “neighboring” gene TSS can


affect transcription of that neighboring gene. If the nearest TSS is over 1,000 kb away then no neighboring protein-coding genes are assigned. Neighboring or associated protein-coding genes are defined by the GREAT algorithm as those within 1,000 kb of an input lncRNA transcription site. GREAT calculates the distance between input sequences (lncRNA genes in this case) and target TSSs by measuring the distance in nucleotide base pairs from the middle of each input sequence to the closest TSS of a protein-coding gene. GREAT was run with the binomial and hypergeometric functions disabled, but all other filters intact, since much of the input data (i.e., lncRNAs with up- and down-regulated transcripts) were largely unannotated. Dysregulated lncRNAs and their associated protein-coding genes, as refined in the GREAT algorithm, were analyzed for potential functions using the Protein ANalysis

THrough Evolutionary Relationships (PANTHER version 11) classification system.

PANTHER uses Gene Ontology (GO)-slim terms to classify genes based on annotations established by the Gene Ontology Consortium. GO-slim is useful with large data sets and is a viable choice when a broader classification of gene products is desired. Gene symbols

(official HUGO gene nomenclature committee (HGNC) gene symbols [35]) from our lncRNAs and neighboring protein-coding genes were also fed into the PANTHER “gene list analysis” tool to visualize and further annotate GO associations and conserved signaling pathways [36].

The web application, Idiographica version 2.3 (http://www.ncrna.org/idiographica), was used to develop chromosome maps of genes dysregulated by selected treatments and Venny version 2.0 (http://bioinfogp.cnb.csic.es/tools/venny) to generate Venn diagrams [37, 38].

We also used the Biological Database Network’s “Database to Database Conversions” tool


(https://biodbnet-abcc.ncifcrf.gov/db/db2db.php) to convert between various identifiers necessary to run specific algorithms [39] and as inputs in Microsoft Excel, Word, and

PowerPoint (2013 version). STRING version 10.0 [40] was used to generate protein-protein interaction models to visualize interconnectivity of signaling pathway components.


Results and Discussion

DEET and Fipronil Exposure Significantly Alter lncRNA Transcript Levels in Primary

Human Hepatocytes

Primary human hepatocytes were treated with either 100 µM DEET, 10 µM fipronil, or a mixture of 100 µM DEET and 10 µM fipronil. At a significance level of P ≤ 0.01 we observed transcripts for 2 lncRNA genes up-regulated and 18 down-regulated by 100 µM

DEET. This accounted for 0.04% of the total number of coding and noncoding genes identified in the latest Ensembl release (Ensembl release 87) of the annotated human genome

[41]. Specifically, of all of the annotated categories recognized by the Ensembl project that includes coding genes, noncoding genes (defined as small noncoding genes, long noncoding genes, and miscellaneous noncoding genes), and pseudogenes, 20 of the 56,384 genes were lncRNA genes whose transcripts were differentially expressed after exposure to 100 µM

DEET. When primary human hepatocytes were treated with 10 µM fipronil there were 76 lncRNA genes whose transcripts were up-regulated and 193 down-regulated accounting for

0.48% of the total number of coding and noncoding genes identified in the latest human genome annotation (i.e., 269 of 56,384 genes). When primary human hepatocytes were treated with a mixture of 100 µM DEET and 10 µM fipronil we observed 75 lncRNA genes whose transcripts were up-regulated and 258 down-regulated. This accounted for 0.59% of the total number of coding and noncoding genes identified in the latest human genome annotation (i.e., 333 of 56,384 genes).
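The percentages above follow directly from the up- and down-regulated counts and the 56,384 annotated genes cited from Ensembl release 87. A minimal sketch of that arithmetic is given below; the variable and function names are illustrative and are not taken from the analysis pipeline.

```python
TOTAL_ANNOTATED_GENES = 56_384  # Ensembl release 87 total cited in the text

# (up-regulated, down-regulated) lncRNA genes per treatment, as reported above
lncrna_counts = {
    "100 uM DEET": (2, 18),
    "10 uM fipronil": (76, 193),
    "DEET + fipronil mixture": (75, 258),
}


def percent_of_genes(up: int, down: int, total: int = TOTAL_ANNOTATED_GENES) -> float:
    """Dysregulated lncRNA genes as a percentage of all annotated genes."""
    return 100 * (up + down) / total


for treatment, (up, down) in lncrna_counts.items():
    pct = percent_of_genes(up, down)
    print(f"{treatment}: {up + down} of {TOTAL_ANNOTATED_GENES} genes ({pct:.2f}%)")
```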


In this study we included transcribed pseudogenes as lncRNAs and specifically defined an lncRNA as any non-protein-coding gene whose transcripts were ≥ 200 nucleotides long. A pseudogene is a highly similar copy of a protein-coding gene that, in most cases, no longer produces a functional protein product; the protein-coding gene that it resembles is termed the parental gene of that pseudogene [42]. Pseudogenes typically regulate parental genes as lncRNA transcripts, and previous studies established that pseudogenes, when transcribed, function as drivers of gene regulation just as lncRNAs do. However, not all pseudogenes are actively transcribed (although all the ones we worked with in this study were), and some estimate that only 2-20% of pseudogenes in the human genome are actively transcribed at all [42-44]. Pseudogenes function as regulators of target genes by having their transcripts interact with target gene promoters, by being processed into short noncoding RNAs that interact with RNA sense strand transcripts, or by hybridizing to sense strand transcripts [43]. Experimental evidence supports roles for transcribed pseudogenes in regulating messenger RNAs (mRNAs) via small interfering RNAs [45], regulating other lncRNA transcripts [46], and functioning as microRNA (miRNA) decoys

[47]. One of the two lncRNA genes whose transcripts were up-regulated by 100 µM DEET was a pseudogene, and 5 of the 18 (28%) with down-regulated transcripts were pseudogenes.

Thirty-four of the 76 (45%) lncRNA genes whose transcripts were up-regulated by 10 µM fipronil were pseudogenes, and 72 of the 193 (37%) with down-regulated transcripts were pseudogenes. Thirty-two of the 75 (43%) lncRNA genes whose transcripts were up-regulated by the 100 µM DEET plus 10 µM fipronil mixture were pseudogenes, and 97 of the 258 (38%) with down-regulated transcripts were pseudogenes. We have a long way to go to truly understand how lncRNA and pseudogene transcripts interact among themselves and with


other epigenetic elements, and protein-coding genes and gene products. However, lncRNAs are certainly abundant as even conservative estimates put the total number of transcribed lncRNAs at approximately 35,000-40,000 versus the approximately 20,000 protein-coding genes [42]. It would be difficult to understand why the human body would transcribe so many lncRNAs if they did not serve a useful function.

Fig. 1A shows that 20 of the lncRNAs whose transcripts were significantly differentially expressed (P ≤ 0.01) were shared among all three treatment conditions (100 µM DEET, 10

µM fipronil, and 100 µM DEET combined with 10 µM fipronil). Interestingly, there were no lncRNAs whose differentially expressed transcripts were unique only to the 100 µM DEET treatment. There were 204 lncRNA genes whose transcripts were differentially expressed that were shared between the 10 µM fipronil treatment and the 100 µM DEET plus 10 µM fipronil treatment, 45 that were unique to the fipronil-only treatment, and 109 that were unique to the DEET plus fipronil mixture. Fig. 1B shows the relationship between the statistically significant up- and down-regulated protein-coding genes at P ≤ 0.01, which we defined as any genes that were processed into messenger RNA (mRNA), exported from the nucleus into the cytosol, and translated by ribosomes into protein. This definition excludes any noncoding RNAs or the few pseudogenes that are now thought to code for protein and may be reclassified in the near future [48].
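The Venn regions reported for Fig. 1A can be reproduced with ordinary set operations. The sketch below uses placeholder identifiers (the real gene lists are in Table 1 and the supplements) sized to match the reported region counts, and then recovers each region and the per-treatment totals. It is an illustration of the bookkeeping, not the Venny tool itself.

```python
from itertools import count

# Placeholder identifiers only; they stand in for the real lncRNA gene IDs.
_next_id = count(1)


def make_ids(n: int) -> set[str]:
    """Create n unique placeholder gene identifiers."""
    return {f"lnc_{next(_next_id)}" for _ in range(n)}


# Region sizes as reported for Fig. 1A
shared_all_three = make_ids(20)
fipronil_and_mixture = make_ids(204)
fipronil_only = make_ids(45)
mixture_only = make_ids(109)

deet = set(shared_all_three)
fipronil = shared_all_three | fipronil_and_mixture | fipronil_only
mixture = shared_all_three | fipronil_and_mixture | mixture_only


def venn_regions(a: set, b: set, c: set) -> dict[str, set]:
    """Split three sets into the seven mutually exclusive Venn regions."""
    return {
        "a_only": a - b - c,
        "b_only": b - a - c,
        "c_only": c - a - b,
        "a_and_b_only": (a & b) - c,
        "a_and_c_only": (a & c) - b,
        "b_and_c_only": (b & c) - a,
        "all_three": a & b & c,
    }


regions = venn_regions(deet, fipronil, mixture)
for name, members in regions.items():
    print(f"{name}: {len(members)}")
print(len(deet), len(fipronil), len(mixture))
```

With these region sizes the treatment totals come out to 20, 269, and 333 lncRNA genes, matching the counts given earlier for DEET, fipronil, and the mixture.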

Table 1 shows the 20 lncRNAs whose transcripts were differentially expressed across all three treatment conditions. Of those, only 5 lncRNAs (25%) had assigned Gene Ontology

(GO) terms at the biological process, cellular component, or molecular function level. These


will be discussed in more detail below, but it is worth noting that the log2 fold change

(log2FC), a common metric for differential expression values [49], was very similar and in the same direction, either up- or down-regulated, for each lncRNA among the 100 µM

DEET, 10 µM fipronil, and 100 µM DEET plus 10 µM fipronil conditions. Fig. 2 shows the distribution and magnitude (in ± log2FC values) of all differentially expressed lncRNA transcripts across all human chromosomes.
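For readers less familiar with the metric, a log2FC is the base-2 logarithm of the ratio of treated to control expression, so a value of 1 means a doubling and a value of -1 means a halving. The sketch below shows the conversion in both directions. The function names are illustrative; CuffDiff computes log2FC internally from its own FPKM estimates, so this is not the pipeline's code. The example value of -0.57 is the log2FC reported later in this chapter for HULC under 10 µM fipronil.

```python
import math


def log2_fold_change(treated: float, control: float) -> float:
    """log2 of the treated/control expression ratio (both must be positive)."""
    if treated <= 0 or control <= 0:
        raise ValueError("expression values must be positive")
    return math.log2(treated / control)


def fold_from_log2fc(log2fc: float) -> float:
    """Linear-scale ratio (treated/control) implied by a log2FC value."""
    return 2 ** log2fc


# A log2FC of -0.57 corresponds to treated expression at roughly
# two-thirds of the control level.
ratio = fold_from_log2fc(-0.57)
print(f"treated/control ratio: {ratio:.3f}")
```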

Chromosomal Distribution of lncRNAs Dysregulated by DEET and Fipronil

Table 2 shows the distribution of lncRNAs whose transcripts were differentially expressed (P

≤ 0.01) in primary human hepatocytes on an individual chromosome basis including the sex chromosome. In the 100 µM DEET treatment chromosome 16 had the most lncRNAs affected (3) while only 2 lncRNAs were dysregulated on each of chromosomes 5, 6, 7, 9, 11, and 19. Since there were only 20 lncRNAs in total affected by the DEET-only treatment the rest of the chromosomes only had 1 or no lncRNAs dysregulated by the treatment. There were many more lncRNAs whose transcripts were up- or down-regulated by the 10 µM fipronil treatment (269 total or 13.5X as many), which led to all 23 of the chromosomes having lncRNAs that were affected by the treatment. The three chromosomes with the most lncRNAs affected were chromosomes 1, 7, and 10, while the chromosomes with the lowest number of lncRNAs affected were chromosomes 13 and 18. For the mixture of 100 µM

DEET and 10 µM fipronil the three chromosomes that had the most lncRNAs dysregulated by the treatment were chromosomes 1, 6, and 7 while chromosomes 13, 18, and 21 were the three with the lowest number of affected lncRNAs.


In Table 2 we took the total number of lncRNAs dysregulated on each chromosome (up- and down-regulated combined) and divided that number by the total number of coding and noncoding genes known to exist on each chromosome based on the latest Ensembl release

[41], generating a percentage of dysregulated lncRNAs versus total number of genes. These calculations revealed that for the 100 µM DEET treatment chromosome 16 had 0.13% of its genes affected by the treatment (3 out of 2,375), which was the highest percentage of genes affected across all chromosomes for this treatment. The next closest was chromosome 9 where 0.09% of the genes were dysregulated (2 out of 2,224) and chromosomes 2, 3, 8, 12,

13, 14, 15, 18, 20, 21, and 22 had no lncRNAs that were affected in response to 100 µM

DEET. The three chromosomes most affected by the 10 µM fipronil treatment were (0.78%), chromosome 10 (0.78%), and chromosome 9 (0.67%) while the three chromosomes least affected were chromosome 13 (0.16%), chromosome 14 (0.23%), and chromosome 17 (0.28%). In the 100 µM DEET plus 10 µM fipronil mixture the three most affected chromosomes were chromosome 7 (1.02%), chromosome 16 (0.88%), and chromosome 10 (0.87%) and the three least affected were chromosome 13 (0.24%), chromosome 8 (0.30%), and chromosome X (0.34%).

Comparison of Dysregulated lncRNA and Protein-Coding Gene Chromosomal

Distribution

We compared our findings on the chromosomal distribution of lncRNA genes with up- and down-regulated transcripts to the chromosomal distribution of protein-coding genes dysregulated after primary human hepatocytes were exposed to both DEET and fipronil using


the percentage of genes dysregulated versus the total number of genes on each chromosome.

Table 3 shows the chromosomal distribution of protein-coding genes significantly dysregulated (P ≤ 0.01) after primary human hepatocytes were treated with 100 µM DEET,

10 µM fipronil, and a mixture of 100 µM DEET plus 10 µM fipronil. It takes into account the most recent Ensembl gene annotations from December of 2016 [41] in calculations comparing the number of dysregulated protein-coding genes to the total number of coding and noncoding genes on each chromosome. For the 100 µM DEET treatment the three chromosomes that had the most dysregulated protein-coding genes were chromosome 7

(0.57%), chromosome 1 (0.50%), and chromosome 4 (0.57%). These percentages did not correspond with the top three chromosomes that had the most lncRNAs affected in the DEET-only treatment. In the 10 µM fipronil treatment the three chromosomes with the most dysregulated protein-coding genes were chromosome 1 (7.92%), chromosome 12 (7.53%), and chromosome 16 (7.49%) and the three chromosomes with the lowest number of dysregulated protein-coding genes were chromosome 13 (3.84%), chromosome 14 (4.72%), and chromosome 18 (4.73%). Chromosomes 13 and 14 had the fewest lncRNAs with differentially expressed transcripts in the 10 µM fipronil treatment and no lncRNAs dysregulated in the 100 µM DEET treatment, which suggests that these two chromosomes play a smaller role in the response of liver cells to DEET and fipronil at these concentrations than the other chromosomes. There was little correlation between the chromosomes with the most dysregulated lncRNAs and protein-coding genes in response to 10 µM fipronil. The three chromosomes with the highest number of differentially expressed transcripts from protein-coding genes in response to the mixture of 100 µM DEET plus 10 µM fipronil were chromosome 19 (12.27%), chromosome 1 (10.99%), and chromosome 12 (10.21%). There


was no obvious correlation with the dysregulated lncRNA profile and the protein-coding gene profile in response to the mixture either. Chromosome 13 (as it did in the 100 µM

DEET and 10 µM fipronil treatments) had the lowest percentage of protein-coding genes affected by the 100 µM DEET plus 10 µM fipronil treatment (5.57%), suggesting that chromosome 13 is less important in the human hepatocyte response to a mixture of DEET and fipronil than any other chromosome if we base our assumption on the percent of dysregulated genes versus the total number of genes per chromosome.

In summary, 20 lncRNA genes in primary human hepatocytes had transcripts that were significantly up- or down-regulated (P ≤ 0.01) by 100 µM DEET, 269 by 10 µM fipronil

(13.5X the number dysregulated by 100 µM DEET only), and 333 by a mixture of 100 µM

DEET and 10 µM fipronil (1.2X the number dysregulated by 10 µM fipronil alone and 16.7X the number dysregulated by 100 µM DEET alone). Therefore, 0.04% of the total number of known genes (both coding and noncoding) were lncRNAs dysregulated by exposure to 100

µM DEET, 0.48% were lncRNAs affected by the 10 µM fipronil treatment, and 0.59% were lncRNAs dysregulated by the 100 µM DEET plus 10 µM fipronil treatment. We observed a more-than-additive effect with the mixture of 100 µM DEET and 10 µM fipronil together

(333 lncRNAs whose transcripts were up- or down-regulated) versus a purely additive effect that would have totaled 289 lncRNAs with differentially expressed transcripts, which was the sum of the dysregulated lncRNAs from the 100 µM DEET treatment and the 10 µM fipronil treatment. This reveals that more lncRNA transcripts, and protein-coding transcripts from our previous study [18], were up- or down-regulated in primary human hepatocytes to metabolize a mixture of 100 µM DEET and 10 µM fipronil together than either chemical


alone. Interestingly, the concentration of fipronil used was 10-fold lower than the concentration of DEET used (10 µM fipronil versus 100 µM DEET) suggesting that a 100

µM fipronil dosage might elicit a much stronger response than a 100 µM DEET dosage.

These findings underline the importance of studying environmental chemicals as they are more typically encountered, in combination rather than singly, especially for vulnerable populations such as agricultural field workers, who may encounter these substances more regularly.
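The comparison between the observed mixture response and the additive expectation described in this summary reduces to a few lines of arithmetic, sketched below with the lncRNA counts reported above. The names are illustrative, and "additive expectation" is used exactly as defined in the text, that is, the simple sum of the two single-chemical counts.

```python
# Dysregulated lncRNA genes per treatment (P <= 0.01), as reported in the text
counts = {"DEET": 20, "fipronil": 269, "mixture": 333}

# Additive expectation as defined in the text: sum of the single-chemical counts
additive_expectation = counts["DEET"] + counts["fipronil"]
excess_over_additive = counts["mixture"] - additive_expectation


def fold(numerator: str, denominator: str) -> float:
    """Ratio of dysregulated lncRNA counts between two treatments."""
    return counts[numerator] / counts[denominator]


print(f"additive expectation: {additive_expectation}")
print(f"observed in mixture:  {counts['mixture']} (excess {excess_over_additive})")
print(f"fipronil vs DEET:     {fold('fipronil', 'DEET'):.2f}X")
print(f"mixture vs fipronil:  {fold('mixture', 'fipronil'):.2f}X")
print(f"mixture vs DEET:      {fold('mixture', 'DEET'):.2f}X")
```

The same calculation could be applied to the protein-coding transcript counts from the earlier study if those totals are substituted into the dictionary.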

Chromosomal Maps Help Visualize Dysregulated Protein-Coding Gene-lncRNA

Relationships

RNA-Seq (transcriptomic) data were analyzed to reveal the identity and genomic location of lncRNAs and protein-coding genes whose transcripts were significantly differentially expressed (P ≤ 0.01) after exposure to 100 µM DEET, 10 µM fipronil, and a mixture of 100

µM DEET with 10 µM fipronil. Genomic location data were analyzed with the Idiographica algorithm [37] to visualize the location and orientation of dysregulated lncRNA genes in relation to dysregulated protein-coding genes across all human chromosomes. There were 5 instances where a dysregulated lncRNA transcription site (from the 100 µM DEET treatment) occupied genomic space within 1,000 kilobases (kb) of a dysregulated protein-coding gene transcription site. This 1,000 kb range from an lncRNA transcription site to a neighboring protein-coding gene transcription start site (TSS) constituted a regulatory region as defined by the Genomic Regions Enrichment of Annotations Tool (GREAT); this will be discussed in more detail later. A transcription start site (TSS) is the first nucleotide base in a


section of DNA where RNA polymerase II begins to synthesize a complementary RNA transcript at the 5’ end of a gene [50]. In the 100 µM DEET treatment there were even 2 examples of a dysregulated lncRNA gene residing within 300 kb of another dysregulated lncRNA gene, suggesting that some of these long noncoding elements may actually influence the activity of one another or the protein-coding genes with which each interacts. Fig. 3A shows the distribution of the 20 lncRNA genes whose transcripts were differentially expressed after primary human hepatocytes were treated with 100 µM DEET for 72 h.

Chromosomes 1, 4, 5, 6, 7, 9, 11, 16, 17, 19, and X had lncRNAs that were dysregulated by the treatment. When we displayed the dysregulated lncRNAs (20 genes; P ≤ 0.01) and dysregulated protein-coding genes (152 genes, P ≤ 0.01) together after 100 µM DEET treatment (Fig. 3B), we could clearly visualize the protein-coding genes that were closely oriented with lncRNA genes.

Fig. 4A displays magnified regions on specific chromosomes where lncRNAs with differentially expressed transcripts were closely associated (based on distance in base pairs) with neighboring dysregulated protein-coding genes from the 100 µM DEET treatment. Fig.

4B shows the dysregulated lncRNAs and protein-coding genes from the 10 µM fipronil treatment in the same regions that were highlighted with the DEET-only data in Fig. 4A. The same dysregulated protein-coding and lncRNA genes that were neighbors in the 100 µM

DEET treatment are visible for the fipronil treatment in addition to many others; recall the fipronil treatment activated 13.5X as many differentially expressed lncRNA gene transcripts and 21.3X as many dysregulated protein-coding gene transcripts as the DEET-only treatment. These newly activated coding and noncoding genes whose transcripts were


differentially expressed were seemingly regulated by the same lncRNAs that were also affected by the 100 µM DEET treatment based on proximity to one another. This suggests the possibility that the dysregulated lncRNAs were responsible for activation or repression of more than one coding or noncoding gene, or that all of the “new” lncRNAs with transcripts that were up- or down-regulated by the DEET and fipronil treatments were not simply regulating genes that were in close proximity (i.e., within 1 kb). We will continue our association analysis with the coding and noncoding genes that were within 1,000 kb of neighboring lncRNAs (termed cis activation) because the algorithm that we used did not focus on long-distance relationships (trans activation), but it is known that certain lncRNAs can affect distant coding and noncoding genomic regions [51].

Fig. 3C-D shows the lncRNA genes and protein-coding genes whose transcripts were differentially expressed when primary human hepatocytes were treated with 10 µM fipronil or 100 µM DEET plus 10 µM fipronil, respectively. There are too many genes listed to label by gene symbol so the dysregulated lncRNA genes are simply labeled with blue bars and the dysregulated protein-coding genes are labeled with orange bars. All gene markers on the chromosomal maps represent their relative size (in base pairs) and position in relation to size and position of all the known coding and noncoding genes on each human chromosome.


Identifying Protein-Coding Genes Associated with lncRNAs based on Genomic

Location

To predict which protein-coding genes (not already identified as within 1,000 kb of a dysregulated lncRNA in our data set) were associated with the 20 lncRNAs whose transcripts were differentially expressed after the 100 µM DEET treatment, the 10 µM fipronil treatment, and the 100 µM DEET plus 10 µM fipronil treatment we used the Genomic

Regions Enrichment of Annotations Tool (GREAT). This algorithm associated all lncRNAs with up- and down-regulated transcripts to their closest transcription start site (TSS) that had a GO-annotated protein-coding gene with function related to the flanking lncRNA transcription site. If no GO functions were assigned to a specific lncRNA, the program would simply calculate which TSSs were within 1,000 kb of the middle of the lncRNA gene.

The program only considered neighboring protein-coding genes and did not associate lncRNA genes with other noncoding gene transcription regions. Early studies and previous algorithms designed to predict gene interactions typically only focused on chromosomal regions within 2 kb or less of a target gene TSS, rationalizing that anything farther away would have much less of a chance at regulating transcription of that target gene. The

GREAT algorithm was designed to incorporate distal cis-regulatory elements with regulatory regions of transcription instead of narrowing the focus to only proximal elements as previous gene-based tools had done [34].
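The association rule described above (measure from the middle of each lncRNA gene to the TSS of nearby protein-coding genes, and keep those within 1,000 kb) can be sketched as follows. This is a simplified stand-in written for illustration and is not GREAT's implementation; the class names, gene names, and coordinates are all hypothetical, and the distance bins simply mirror the ranges reported for Fig. 5B.

```python
from dataclasses import dataclass

MAX_DISTANCE_BP = 1_000_000  # 1,000 kb window described in the text


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"

    @property
    def tss(self) -> int:
        """TSS is the 5' end: start on the + strand, end on the - strand."""
        return self.start if self.strand == "+" else self.end


@dataclass(frozen=True)
class Region:
    name: str
    chrom: str
    start: int
    end: int

    @property
    def midpoint(self) -> int:
        return (self.start + self.end) // 2


def associate(region: Region, genes: list[Gene],
              max_distance: int = MAX_DISTANCE_BP) -> list[tuple[str, int]]:
    """Return (gene name, signed distance in bp) for every gene whose TSS lies
    within max_distance of the region midpoint, nearest first.
    Negative distances are upstream of the TSS relative to gene orientation."""
    hits = []
    for gene in genes:
        if gene.chrom != region.chrom:
            continue
        offset = region.midpoint - gene.tss
        if gene.strand == "-":
            offset = -offset
        if abs(offset) <= max_distance:
            hits.append((gene.name, offset))
    return sorted(hits, key=lambda hit: abs(hit[1]))


def distance_bin(distance_bp: int) -> str:
    """Bin an absolute distance into the ranges used for Fig. 5B."""
    kb = abs(distance_bp) / 1_000
    if kb < 5:
        return "0-5 kb"
    if kb < 50:
        return "5-50 kb"
    if kb < 500:
        return "50-500 kb"
    return "500-1,000 kb"


# Hypothetical coordinates for illustration only
lnc = Region("LNC_EXAMPLE", "chr1", 1_000_000, 1_002_000)
genes = [
    Gene("GENE_A", "chr1", 1_051_000, 1_060_000, "+"),
    Gene("GENE_B", "chr1", 400_000, 801_000, "-"),
    Gene("GENE_C", "chr1", 3_000_000, 3_010_000, "+"),  # beyond 1,000 kb
    Gene("GENE_D", "chr2", 1_000_000, 1_010_000, "+"),  # different chromosome
]
neighbors = associate(lnc, genes)
for name, dist in neighbors:
    print(name, dist, distance_bin(dist))
```

A tool restricted to a 2 kb window would report no neighbors for this hypothetical lncRNA, which illustrates why a wider window retains associations that proximal-only approaches discard.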

Fig. 5 A-C displays the number of neighboring protein-coding genes (or other lncRNAs in a few cases) per dysregulated input lncRNA sequence, the absolute distance to the closest TSS


in kilobases (kb) for each dysregulated lncRNA transcription site, and the orientation and distance of each dysregulated lncRNA gene to its closest TSS, respectively, after 100 µM

DEET exposure. Some of the protein-coding genes with differentially expressed transcripts that we identified previously and that lay within 1,000 kb of these 20 dysregulated lncRNA genes were not designated as targets in the GREAT analysis, but we chose to include them in all further downstream analysis. In Fig. 5A, only 4 dysregulated lncRNAs had 1 neighboring protein-coding gene while 16 of the dysregulated lncRNAs had 2 neighboring protein-coding genes within 1,000 kb. Fig. 5B shows that, of the resulting 36 lncRNA-gene pairs, 2 were within 0-5 kb of the protein-coding gene TSS, 14 were within 5-50 kb, 18 were within 50-500 kb, and 2 were 500-1,000 kb away. This implies that in previous gene-based tools focused on this type of analysis, over half of our dataset would have been excluded based on distance to the nearest TSS. Fig. 5C shows the distance and orientation of the 20 dysregulated lncRNAs relative to their neighboring protein-coding genes, where 23 pairs were upstream of the TSS and 13 were downstream. Similar bar graphs are displayed for the 10 µM fipronil treatment

(Fig. 6A-C) and the 100 µM DEET plus 10 µM fipronil mixture (Fig. 7A-C). Table 4 shows the lncRNAs whose transcripts were differentially expressed after 100 µM DEET exposure and their neighboring protein-coding genes and lncRNA genes, including those we found previously (before using the GREAT algorithm) to be dysregulated and within 1,000 kb of an lncRNA whose transcripts were up- or down-regulated. Supplement 1 shows the protein-coding genes associated with the dysregulated lncRNAs from the 10 µM fipronil treatment and Supplement 2 shows the same from the 100 µM DEET plus 10 µM fipronil treatment.

After GREAT analysis was utilized to establish the identity of protein-coding genes that were associated with the dysregulated lncRNAs from each treatment condition, we used these


genes as input into a functional analysis program to determine within what biological pathways the genes may perform a function.

Inferring Functionality of lncRNAs through lncRNA-Coding Gene Relationships

The Protein ANalysis THrough Evolutionary Relationships (PANTHER) classification system was utilized to classify the lncRNA and protein-coding genes that were linked to one another in the GREAT analysis into functional groups and pathways. PANTHER GO-slim annotation focusing on biological processes showed that components of 11 biological processes were activated by 100 µM DEET. Fig. 8A displays the top 10 of these 11 biological processes whose lncRNA and neighboring protein-coding gene transcripts were differentially expressed by 100 µM DEET (based on the number of genes included in each pathway). The two highest represented were cellular processes (27%) and metabolic processes (25%), but several other critical biological processes such as immune system processes, cell killing, and biological regulation were also included. A search of the approximately 177 primary signaling pathways in the PANTHER database found that several of our dysregulated lncRNAs and neighboring protein-coding genes from the DEET-only treatment were associated with Ras pathway activity, the PI3 kinase pathway, the p53 pathway, and immune response pathways among others (Fig. 8B), discussed in more detail later.

When we input our list of dysregulated lncRNAs and associated protein-coding genes from the 10 µM fipronil treatment, we obtained matches for 14 biological processes, 11 of which


were included in the 100 µM DEET analysis (Fig. 9A). The 3 processes that were not included in the DEET-only dataset were reproduction, biological adhesion, and rhythmic processes. When we searched the signaling pathway database with the 10 µM fipronil data, we obtained 45 matches that included all 11 pathways affected by the 100 µM DEET treatment. This was expected since all of the dysregulated lncRNAs found in the 100 µM

DEET treatment were also affected by the 10 µM fipronil treatment. However, the fipronil-only treatment activated 34 additional signaling pathways of which the top 10 are displayed in Fig. 9B and the complete list is included in Supplement 3. For the 100 µM DEET plus 10

µM fipronil mixture, we found the same 14 biological processes activated that were also elicited by the 10 µM fipronil treatment of which the top 10 are shown in Fig. 10A. We found 68 total signaling pathways associated with the mixture of which 11 were shared with the 100 µM DEET and 10 µM fipronil treatments and the 34 additional matches from the fipronil-only treatment were shared, leaving 23 pathways unique to the response of primary human hepatocytes to a mixture of 100 µM DEET and 10 µM fipronil. Fig. 10B shows the top 10 signaling pathways associated with the DEET plus fipronil mixture and Supplement 4 shows the complete list of 68. The mixture of 100 µM DEET and 10 µM fipronil activated many more signaling pathways than either DEET or fipronil alone.

LncRNAs Dysregulated by DEET and Fipronil Important in Biological Processes

We identified several genes (coding and noncoding) that were significantly affected (P ≤ 0.01) by 100 µM DEET, 10 µM fipronil, and a mixture of 100 µM DEET plus 10 µM fipronil, and many of those genes were linked to critical biological processes. Here we discuss in more detail a subset of lncRNAs whose transcript expression was affected in our experiments and that are crucial to important biological processes.

Highly up-regulated in liver cancer (HULC) transcripts. HULC transcripts were significantly down-regulated in the 10 µM fipronil treatment (log2FC of -0.57). While HULC expression is known to influence the development of hepatocellular carcinoma (HCC), the most common type of liver cancer, there are conflicting reports regarding its over- or under-expression in relation to HCC [52]. In pancreatic cancer, HULC overexpression is positively correlated with larger tumors and decreased survivability [53], and in gastric cancer HULC overexpression contributed to lymph node metastasis [54]. However, liver studies by Yang et al. (2015) established that higher levels of HULC in HCC resulted in less vascular invasion and increased survivability in some instances, which conflicts with other HCC studies [55]. It is also known that HULC can act as a miRNA sponge to reduce miRNA activity [56]. One miRNA that HULC interacts with is mir-372, which is known to suppress tumorigenesis in certain types of cancer like endometrial carcinoma. The up-regulation of HULC could therefore potentially inhibit the tumor suppression capability of mir-372 [57]. Wu et al. (2015) demonstrated that down-regulation of mir-372 was correlated with tumor metastasis and poor prognosis in HCC [58]. Therefore, down-regulation of HULC may be beneficial in certain situations where an abundance of mir-372 is necessary to combat HCC. Focused research is needed to determine in what scenarios the up- or down-regulation of HULC can serve as a prognostic or diagnostic indicator of liver disease, which is likely to differ between stages and types.

H19. The H19 gene, sometimes referred to as long intergenic non-protein coding RNA 8, codes for an lncRNA whose transcripts were up-regulated in both the 10 µM fipronil and 100 µM DEET plus 10 µM fipronil treatments (log2FC of +1.55 and +0.59, respectively). Overexpression of H19 is associated with tumorigenesis in several different tissue types, and blocking H19 in breast cancer and HCC cells reduced their ability to grow and develop [59, 60]. H19 also has the ability to affect p53 (discussed in more detail later), which has been called the “master regulator” due to its ability to prevent genome mutations. H19 was up-regulated in cells containing mutant p53 where oxygen conditions were low, but its expression remained normal when oxygen levels were sufficient [61].

LncRNA metastasis associated lung adenocarcinoma transcript 1 (MALAT1). MALAT1 is one of the most studied lncRNAs, and its transcripts were significantly down-regulated in the 100 µM DEET, 10 µM fipronil, and 100 µM DEET plus 10 µM fipronil treatments (log2FC of -1.34, -1.87, and -2.28, respectively). It plays a multitude of roles in processes including gene splicing and nuclear organization, but its overexpression is related to several types of cancer. MALAT1 overexpression promotes malignancy in cancer cells and controls gene expression of several metastasis-associated transcripts in lung cancer cells [62]. However, the down-regulation of MALAT1 (as we observed in primary human hepatocytes) was shown to induce tumor progression in a recent breast cancer study [63]. Therefore, the effect of over- or under-expression of MALAT1 is likely very tissue specific. Down-regulation of MALAT1 as we observed in our data set could play a completely different role in primary liver cells than it does in other tissues and systems. We must consider that many of the studies determining the role of MALAT1 were conducted in immortalized cell lines or model non-human organisms and not primary human hepatocytes [56].

Nuclear enriched abundant transcript 1 (NEAT1) transcripts. NEAT1 transcripts were also down-regulated in all three of our treatment conditions (log2FC of -1.10 for 100 µM DEET, -1.22 for 10 µM fipronil, and -1.52 for the 100 µM DEET plus 10 µM fipronil mixture). The lncRNA NEAT1 plays an important role in the formation of nuclear paraspeckles, which are sub-nuclear bodies formed in response to stress, viral infection, and circadian rhythm maintenance. NEAT1 is more of an architectural component that interacts with proteins that have a direct role in transcription and RNA processing. Down-regulation of NEAT1 results in impairment or inhibition of paraspeckle formation, which could have a profound effect on some of the processes mentioned above, such as the response to viral infection [64].

Interestingly, recent studies demonstrate that NEAT1 and MALAT1 co-localize to hundreds of genomic sites likely due to cues from the transcription process and not specific DNA sequences [65]. Therefore, disruption of either of these lncRNAs could affect many signaling pathways in a positive or negative manner either by their interaction with one another or on genetic components that they both may influence.

X-inactive specific transcript (XIST) and TSIX transcripts were both significantly down-regulated by the 10 µM fipronil and 100 µM DEET plus 10 µM fipronil treatments (XIST, log2FC of -0.60 and -0.93, respectively; TSIX, log2FC of -0.59 and -0.97, respectively). Both XIST and TSIX are lncRNAs involved in the process of X inactivation where certain genes on the sex chromosomes are repressed to equalize gene expression among the sexes. The function of XIST is to coat certain regions of the sex chromosomes and suppress gene expression at the coating site (i.e., X inactivation) while TSIX, the antisense of XIST, functions to down-regulate the expression of XIST when necessary [66]. It would seem logical that the repression of both of these lncRNAs would disrupt functionality of the entire X inactivation system but further study is required.

Maternally expressed 3 (MEG3). MEG3 transcripts were down-regulated in both the 10 µM fipronil and 100 µM DEET plus 10 µM fipronil treatments (log2FC of -1.19 and -1.70, respectively). This lncRNA activates p53 and functions as a tumor suppressor, so its expression is typically reduced or lost in cancer cells. A 2015 study revealed that MEG3 interacts with several TGF-β pathway genes, which are important in several cellular processes like cell growth, differentiation, and apoptosis [67, 68]. Therefore, dysregulation of MEG3 could negatively influence all of these processes.
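
Because the lncRNA changes in this section are reported as log2 fold changes, it can help to see them as linear ratios. The Python sketch below converts the values quoted in the text, assuming that log2FC means log2(treated/control); that definition, and the helper names `fold_change` and `percent_change`, are assumptions made for illustration rather than details taken from the analysis pipeline.

```python
# log2 fold changes quoted in the text, keyed by lncRNA and treatment.
LOG2FC = {
    "HULC": {"fipronil": -0.57},
    "H19": {"fipronil": 1.55, "mixture": 0.59},
    "MALAT1": {"DEET": -1.34, "fipronil": -1.87, "mixture": -2.28},
    "NEAT1": {"DEET": -1.10, "fipronil": -1.22, "mixture": -1.52},
    "XIST": {"fipronil": -0.60, "mixture": -0.93},
    "TSIX": {"fipronil": -0.59, "mixture": -0.97},
    "MEG3": {"fipronil": -1.19, "mixture": -1.70},
}


def fold_change(log2fc):
    """Linear treated/control ratio implied by a log2 fold change."""
    return 2.0 ** log2fc


def percent_change(log2fc):
    """Percent change relative to control (negative means a decrease)."""
    return (fold_change(log2fc) - 1.0) * 100.0


for gene, by_treatment in LOG2FC.items():
    for treatment, value in by_treatment.items():
        print(f"{gene:7s} {treatment:9s} log2FC={value:+.2f} "
              f"ratio={fold_change(value):.2f} "
              f"change={percent_change(value):+.0f}%")
```

Under this convention a log2FC of -1 corresponds to a halving of transcript abundance and +1 to a doubling, so values between -1 and -2 indicate that less than half but more than a quarter of the control level remains.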

Metabolic Pathways Important to Normal Cellular Function Influenced by Dysregulated lncRNAs and Neighboring Protein-Coding Genes

In our previous work we established that several protein-coding genes whose transcription was up- and down-regulated by DEET and fipronil exposure were involved in critical metabolic signaling pathways [18]. Here we establish that DEET and fipronil also influence the transcription of many lncRNAs that are either directly or indirectly involved in the activity of signaling pathways key to normal cellular function. Fig. 8B displays the top 10 signaling pathways (based on the number of genes included in each pathway) affected by exposure to 100 µM DEET. Figs. 9B and 10B display the top 10 signaling pathways affected by 10 µM fipronil and a mixture of 100 µM DEET and 10 µM fipronil, respectively. In total, the 10 µM fipronil treatment affected 45 signaling pathways (Supplement 4) and the mixture of 100 µM DEET and 10 µM fipronil affected 68 signaling pathways (Supplement 5). Some of the most affected included the immune system, p53, Ras, and Wnt signaling pathways.

Note that some components are involved in more than one pathway. Fig. 11 is an interaction network developed in STRING version 10 [40] showing direct (physical) and indirect (functional) associations between protein products of the protein-coding genes that neighbor (within 1,000 kb) dysregulated lncRNA genes after exposure to 100 µM DEET. Some protein-coding genes whose transcripts were differentially expressed after treatment with 100 µM DEET, but that were not within 1,000 kb of any dysregulated lncRNA, were also included. Interaction networks are included to visualize the interconnectedness of the components affected by the DEET-only treatment and to examine which components grouped together closely and which were more distantly connected to one another. When appropriate, further STRING models are included to visualize proteins linked to specific pathways that are associated directly or indirectly with dysregulated lncRNAs after exposure to DEET and fipronil.
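
The 1,000 kb neighborhood criterion used throughout this section can be stated as a simple distance test. The Python sketch below is a minimal illustration under stated assumptions: the `Locus` type, the `is_neighbor` and `neighbors` helpers, and all coordinates are hypothetical, and the sketch does not reproduce the regulatory-domain rules applied by GREAT.

```python
from typing import NamedTuple

WINDOW_BP = 1_000 * 1_000  # 1,000 kb expressed in base pairs


class Locus(NamedTuple):
    name: str
    chrom: str
    position: int  # a single reference coordinate (e.g., a TSS), in bp


def is_neighbor(lncrna, gene, window_bp=WINDOW_BP):
    """True if both loci are on the same chromosome and within the window."""
    return (lncrna.chrom == gene.chrom
            and abs(lncrna.position - gene.position) <= window_bp)


def neighbors(lncrnas, genes, window_bp=WINDOW_BP):
    """Map each lncRNA name to the names of genes inside its window."""
    return {
        lnc.name: [g.name for g in genes if is_neighbor(lnc, g, window_bp)]
        for lnc in lncrnas
    }


# Hypothetical loci, chosen only to exercise the three possible outcomes.
lncrnas = [Locus("lncRNA_A", "chr1", 5_000_000)]
genes = [
    Locus("gene_near", "chr1", 5_800_000),         # 800 kb away: neighbor
    Locus("gene_far", "chr1", 7_500_000),          # 2,500 kb away: excluded
    Locus("gene_other_chrom", "chr2", 5_000_000),  # different chromosome
]
pairs = neighbors(lncrnas, genes)
print(pairs)
```

The resulting gene lists are the kind of input that would then be submitted to a tool such as STRING to draw the interaction networks.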

Innate and Adaptive Immunity. The immune system is a defense mechanism that protects the body from foreign invaders like bacteria and viruses. Deficiencies in this network can result in autoimmune diseases, inflammatory diseases, and cancer. The current emergence of drug-resistant bacteria, which our bodies can no longer successfully combat via innate and adaptive immunity, is a concern that approaches a crisis [69]. Environmental chemicals that further weaken the immune system exacerbate an already dire dilemma. The lncRNA NEAT1, whose transcripts were significantly down-regulated in all three treatments (see Table 1), is known to influence immune gene expression and immune cell functions. NEAT1 does this by binding to splicing factor proline and glutamine rich (SFPQ), which in turn activates the transcription of the gene that codes for chemokine interleukin 8 or IL8 [19]. We did not see differential expression of SFPQ, but we did observe the up-regulation of IL8 in the 100 µM DEET treatment and the 100 µM DEET plus 10 µM fipronil mixture but not in the 10 µM fipronil condition. The lncRNA growth arrest-specific transcript 5 (GAS5), whose transcripts were up-regulated in the 10 µM fipronil treatment and the 100 µM DEET plus 10 µM fipronil mixture, is critical in regulation of the cell cycle and apoptotic control of T cells, and its normal expression is linked to tumor suppression [70]. However, GAS5 overexpression was shown to inhibit growth of T cells and promote spontaneous apoptosis [71]. Noncoding repressor of NFAT (NRON) is known to inhibit nuclear factor of activated T cells (NFAT). While we did not observe dysregulation of NRON, we did see significant down-regulation of NFAT5 transcripts in both the fipronil-only treatment and the DEET plus fipronil mixture [19].

Using GO-Slim annotations in PANTHER we identified 4 additional genes neighboring (within 1,000 kb) lncRNAs that were involved in either antigen processing and presentation or the immune response in the 100 µM DEET treatment. MHC class I polypeptide-related sequence A and B (MICA and MICB) were two of these genes, both neighboring the lncRNA gene HLA Complex P5 (HCP5). Products of the C-X-C Motif Chemokine Ligand 8 gene (CXCL8), also associated with HCP5, and the Major Histocompatibility Complex, Class I, C protein gene (HLA-C) were identified as immune response components. HLA-C is associated with the dysregulated lncRNAs psoriasis susceptibility 1 candidate 3 (PSORS1C3) and HCP5. HCP5 is an endogenous retrovirus that has become part of the human genome and is specifically associated with HIV-1 viral load, where expression of one variant of the final protein product of HCP5 is shown to interact with HIV-1 and reduce its viral presence [72].

When we expanded our signaling pathway search to the dysregulated lncRNAs and neighboring protein-coding genes from the 10 µM fipronil treatment, we obtained matches for 29 immune system components, and when we expanded it to the 100 µM DEET plus 10 µM fipronil lncRNAs and associated protein-coding genes, we obtained 45 matches to immune functions. Fig. 12 shows an interaction map of all of the protein products of the protein-coding genes that neighbor dysregulated lncRNA genes after hepatocytes were treated with a mixture of 100 µM DEET and 10 µM fipronil. Several dysregulated genes from our previous study were also included because they function within the immune response pathway (including, but not limited to, FOS, JUNB, LCP2, TLR1, TLR2, TLR3, and TLR4). Serine/threonine-protein kinase B-Raf (BRAF), whose gene neighbors the differentially expressed lncRNA NDUFB-AS1 (whose transcripts were down-regulated by both the 10 µM fipronil and 100 µM DEET plus 10 µM fipronil treatments), is a protein that transmits signals from the outside to the inside of a cell and is ultimately involved in cell growth and proliferation. The accumulation of mutations in BRAF, as in many other genes critical to normal human processes, contributes to the development of cancer [73]. The Hallmarks of Cancer, published in 2000 (updated in 2011), was a seminal paper that established six cellular alterations necessary to dictate malignant growth that we still follow today [74, 75]. These alterations can arise from perturbations of components in any of the key metabolic processes discussed here.

Transformation-Related Protein 53 (p53) Signaling Pathway. The p53 pathway helps the body to respond to stress and prevents genome mutations by activating cell cycle arrest, cellular senescence, DNA repair, or apoptosis. Transformation-related protein 53 (p53) is any isoform of the protein coded from the trp53 gene and has been referred to as the “guardian of the genome” largely due to its function as a tumor suppressor [76]. While p53 regulates a large set of genes, the p53 pathway is itself under the control of multiple self-regulatory pathways, including 7 negative and 3 positive feedback loops [77]. Several lncRNAs have already been implicated in regulation of the p53 pathway at various levels, and we found that some previously described lncRNAs and neighboring protein-coding genes were affected by all three of our treatment conditions in some capacity. The lncRNA genes MALAT1 and MEG3, whose transcripts are termed p53 regulators, were dysregulated by both DEET and fipronil as described previously. The lncRNA H19, whose transcripts are considered p53 effectors, was also dysregulated [20]. In the 100 µM DEET treatment we identified 1 protein-coding gene connected with p53 activity, PDPK1, which neighbors the dysregulated lncRNA gene ERVK13-1. In the 10 µM fipronil and 100 µM DEET plus 10 µM fipronil treatments there were 5 (MDM2, RCHY1, PDPK1, CDKN2B, and PRKAB2) and 7 (PIK3C3, HDAC2, MDM2, RCHY1, PDPK1, CDKN2B, and PRKAB2) lncRNA genes or lncRNA-associated protein-coding genes linked to p53 activity, respectively, one of which was mouse double minute 2 homolog (MDM2). The MDM2 protein, associated with the dysregulated lncRNA gene LOC100130075, is a well-known regulator of the p53 pathway as it controls six of the ten known feedback loops mentioned above. The presence of MDM2 limits the growth-suppressive functions of p53 by degrading the protein, and MDM2 levels decrease when p53 must respond to stress [78]. In this study LOC100130075 transcripts were down-regulated by the 10 µM fipronil and 100 µM DEET plus 10 µM fipronil treatments, which could have affected MDM2 activity in relation to p53. The protein 3-phosphoinositide-dependent protein kinase 1 (PDPK1), whose TSS neighbors the lncRNA gene ERVK13-1 (whose transcripts were down-regulated in all 3 treatment conditions), is also connected with p53 activity. PDPK1 is called the “master kinase” because of its importance in signaling pathways tied to growth factors, hormones, and insulin [79]. PDPK1 is a negative regulator of p53 and its levels are elevated in several different types of cancer including prostate, liver, and breast cancer. Its inhibition has been shown to hinder tumor growth and it is a promising candidate for cancer intervention [80]. We also noted that several of the protein-coding transcripts that we previously identified as significantly up- or down-regulated were directly or indirectly related to p53 functionality, including ATM, SIRT1, CDKN2A, CDKN2B, CREBBP, PAK2, PDRG1, TP53INP2, CDIP1, PERP, RRM2B, TRIAP1, CSNK2A1, CSNK2A2, and HIPK2. This implies that the lncRNAs linked to p53 activity whose transcripts were up- or down-regulated by our treatments could have influenced these genes or their protein products as well, since they all play some role in the same molecular pathway. Fig. 13 shows an interaction map of all of the protein products of the protein-coding genes that neighbor dysregulated lncRNA genes after hepatocytes were treated with a mixture of 100 µM DEET and 10 µM fipronil.
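
The p53-linked gene lists for the three treatments can be compared directly with set operations. The Python sketch below uses only the gene symbols given in the text; the variable names are illustrative and the sketch is not part of the original PANTHER analysis.

```python
# Genes linked to p53 activity in each treatment, as listed in the text.
P53_LINKED = {
    "DEET": {"PDPK1"},
    "fipronil": {"MDM2", "RCHY1", "PDPK1", "CDKN2B", "PRKAB2"},
    "mixture": {"PIK3C3", "HDAC2", "MDM2", "RCHY1", "PDPK1",
                "CDKN2B", "PRKAB2"},
}

# Genes linked to p53 in every treatment.
shared_by_all = set.intersection(*P53_LINKED.values())

# Genes that appear only when DEET and fipronil are combined.
mixture_only = (P53_LINKED["mixture"]
                - P53_LINKED["fipronil"]
                - P53_LINKED["DEET"])

print("shared by all treatments:", sorted(shared_by_all))
print("unique to the mixture:", sorted(mixture_only))
```

The same comparison could be repeated for the Ras and Wnt gene lists to separate shared components from those seen only with the mixture.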

Rat Sarcoma (Ras) Signaling Pathway. The Ras pathway is another signaling pathway critical to proper cellular functioning since it is intimately involved in cell growth, differentiation, and survival. Perturbations in this system typically result in the Ras protein becoming continually switched “on” even when there is no signal for it to do so, resulting in uncontrolled cellular proliferation [81, 82]. “Rasopathies” as they are termed, like Noonan-like CBL syndrome, Costello syndrome, and cardio-facio-cutaneous (CFC) syndrome, make infants in particular more prone to cancer and abnormal myelopoiesis [69]. H-Ras, N-Ras, K-Ras4A and K-Ras4B are four isoforms of Ras found in vertebrates, but since binding occurs in all four at a conserved N-terminal region they are commonly referred to collectively as “Ras” [83]. Ras can bind to and regulate at least twenty different effector proteins, and transcripts for several of these were either up- or down-regulated or associated with elements dysregulated by both DEET and fipronil, suggesting that lncRNAs can have either a direct or indirect effect on Ras functionality [84]. PANTHER analysis showed that in the 100 µM DEET treatment only PDPK1 (the same gene linked to the p53 pathway) was associated with the Ras pathway. In the 10 µM fipronil treatment there were 7 genes associated with the Ras pathway (BRAF, RALA, PDPK1, RASSF7, RASSF6, RASAL2, and RASA4CP). RAS p21 protein activator 4C (RASA4CP), a pseudogene, likely acts as a suppressor of Ras since p21 is a known suppressor of Ras. In the mixture of 100 µM DEET and 10 µM fipronil there were 9 genes involved with the Ras pathway. Seven of these were the same as those found in the 10 µM fipronil treatment and the two unique to the mixture were PIK3C3 and IGFBP1. Phosphatidylinositol 3-kinase catalytic subunit type 3 (PIK3C3) and insulin-like growth factor binding protein 1 (IGFBP1) both mediate different aspects of Ras activation [85, 86]. We found several protein-coding genes whose transcripts were significantly up- and down-regulated in our previous study that are connected to Ras functionality, including RHOB (a Ras homolog), several variants of the RAB gene (a member of the Ras oncogene family), SYNGAP1, RREB1, RAPH1, RAP2C, RASSF1, RASSF2, RASSF6, RASSF7, RGL3, and RGL4. Fig. 14 shows an interaction map of all of the protein products of the protein-coding genes that neighbor dysregulated lncRNA genes after hepatocytes were treated with a mixture of 100 µM DEET and 10 µM fipronil. A greater understanding of how and where lncRNAs interact with the Ras pathway will lead to better treatment of diseases resulting from deficiencies in this pathway.

Proto-Oncogene Int-1 Homolog (Wnt) Signaling Pathway. The Wnt signaling pathway was affected by the 10 µM fipronil and 100 µM DEET plus 10 µM fipronil treatments. This pathway controls essential metabolic functions like cell fate determination, cell polarity, and cell proliferation and is conserved in all metazoan animals. Wnt proteins bind receptors of the Frizzled and LDL receptor-related protein (LRP) families on the cell surface and the message is relayed to intracellular transcription sites of Wnt target genes [87]. Our data showed 5 lncRNA-associated protein-coding genes linked to the Wnt signaling pathway from the 10 µM fipronil treatment (NKD2, EP400, PPP2R53, TLE3, and KREMEN1) and 5 from the 100 µM DEET plus 10 µM fipronil treatment (NKD2, EP400, TLE3, HDAC2, and WNT7B). We also observed several Frizzled receptor family associated protein transcripts up-regulated in our protein-coding gene expression data sets for these two treatments, including WLS, FZD6, FZD7, SFRP4, MFRP, and SMO, and some other dysregulated Wnt pathway protein-coding genes (LRP5L, LRP6, WNT5B, AXIN1, and DKK3). Fig. 15 shows an interaction map of all of the protein products of the protein-coding genes that neighbor dysregulated lncRNA genes after hepatocytes were treated with a mixture of 100 µM DEET and 10 µM fipronil. LDL Receptor Related Protein 5 (LRP5) dysregulation can cause high bone mass and eye vascular defects, and dysregulation of LDL Receptor Related Protein 6 (LRP6) can contribute to early coronary disease and osteoporosis. Dysregulation of Wnt family member 5B (WNT5B) is associated with type II diabetes and dysregulation of AXIN1 can cause caudal duplication or cancer [88].

Implications and Future Directions

As seen in Figs. 6-8 and Supplements 3-6, there were many other biological processes and pathways that were not discussed that are also very important in normal cellular function.

Transcriptomic analyses revealed possible lncRNA-coding gene partnerships and their putative functions, but more work remains to establish cause-and-effect relationships. However, the research reported here identifies potential risks. Our analysis is the first of its kind to link lncRNAs to neighboring protein-coding genes (and other lncRNAs) and functions related to DEET and fipronil exposure, either alone or in combination, in primary human hepatocytes. It can help us to begin to understand the complex molecular interactions that are responsible for the human liver’s response to two common environmental chemicals at the epigenetic level.

Although many of the lncRNAs we found were uncharacterized, we inferred function from factors such as chromosomal position in relation to protein-coding and noncoding genes that had been previously assigned function and associated with key molecular pathways. We also determined the relative impact of DEET and fipronil on human hepatocytes based on the lncRNA expression profiles associated with each. All of this may be useful for future studies that aim to use lncRNAs to measure exposure to environmental chemicals and as prognostic and diagnostic indicators of overexposure and disease. In addition, specific lncRNAs could be utilized for prevention of disease or treatments related to these and other chemicals.

This type of information is becoming more essential as millions of new chemical compounds are synthesized each year and released into the environment at a wide range of concentrations (for varying durations) as repellents and pesticides, among other uses. The rise of the Zika epidemic and associated microcephaly in children, along with other debilitating birth defects, has prompted governmental agencies like the Centers for Disease Control and Prevention (CDC) to recommend repellents that are effective at preventing disease transmission. Expectant mothers are encouraged to use DEET on their skin at a minimum concentration of 30% active ingredient every time they are at risk of being bitten by mosquitoes before and during pregnancy [4]. However, we do not know the potential long-term effects of repeated and prolonged use of these chemicals. In addition, fear and misinformation may encourage people to use DEET at higher concentrations over a longer duration to avoid contracting the disease. This could potentially be extremely harmful to unborn or infant children, the elderly, or immunocompromised individuals, as well as to otherwise healthy individuals. Finally, even though we tested specific concentrations of two common environmental chemicals, we have no data on the effects of repetitive or long-term exposure to DEET (and fipronil) on human health.

Acknowledgements

The authors gratefully acknowledge Jeff Roach from the University of North Carolina-Chapel Hill Information Technology Services and Elizabeth Scholl from the North Carolina State University Bioinformatics Consulting Core for their assistance with bioinformatics analyses. This research was supported in part by the U.S. Central Appalachian Regional Education and Research Center (CARERC) Pilot Study. RDM was supported by an Entomology Department Teaching Assistantship and a Graduate School Doctoral Dissertation Completion Grant at NC State University.

References

1. Bernhardt ES, Rosi EJ, Gessner MO. Synthetic chemicals as agents of global change. Frontiers in Ecology and the Environment 2017.

2. Veltri JC, Osimitz TG, Bradford DC, Page BC. Retrospective analysis of calls to poison control centers resulting from exposure to the insect repellent N, N-diethyl-m-toluamide (DEET) from 1985–1989. Journal of Toxicology: Clinical Toxicology 1994;32(1):1-16.

3. Mlakar J, Korva M, Tul N, Popović M, Poljšak-Prijatelj M, Mraz J, Kolenc M, Resman Rus K, Vesnaver Vipotnik T, Fabjan Vodušek V. Zika virus associated with microcephaly. N Engl J Med 2016;374(10):951-958.

4. Morris H. Zika: the latest advice for travellers. Volume 2017: The Telegraph; 2016.

5. Vargo EL, Parman V. Effect of fipronil on subterranean termite colonies (Isoptera: Rhinotermitidae) in the field. Journal of economic entomology 2012;105(2):523-532.

6. Carrington D. EU to ban fipronil to protect honeybees. Volume 2017: The Guardian; 2013.

7. Hamon N, Gamboa H, Ernesto J, Garcia M. Fipronil: a major advance for the control of boll weevil in Colombia. 1996.

8. Tingle CC, Rother JA, Dewhurst CF, Lauer S, King WJ. Fipronil: environmental fate, ecotoxicology, and human health concerns. Reviews of environmental contamination and toxicology: Springer; 2003. p 1-66.

9. Mohamed F, Senarathna L, Percy A, Abeyewardene M, Eaglesham G, Cheng R, Azher S, Hittarage A, Dissanayake W, Sheriff MR. Acute Human Self-Poisoning with the N-Phenylpyrazole Insecticide Fipronil: a GABAA-Gated Chloride Channel Blocker. Journal of Toxicology: Clinical Toxicology 2004;42(7):955-963.

10. Schoenig GP, Osimitz TG, Gabriel KL, Hartnagel R, Gill MW, Goldenthal EI. Evaluation of the chronic toxicity and oncogenicity of N, N-diethyl-m-toluamide (DEET). Toxicological Sciences 1999;47(1):99-109.

11. Heffernan A, English K, Toms L, Calafat A, Valentin-Blasini L, Hobson P, Broomhall S, Ware R, Jagals P, Sly P. Cross-sectional biomonitoring study of pesticide exposures in Queensland, Australia, using pooled urine samples. Environmental Science and Pollution Research 2016;23(23):23436-23448.

12. Herin F, Boutet-Robinet E, Levant A, Dulaurent S, Manika M, Galatry-Bouju F, Caron P, Soulat J-M. Thyroid function tests in persons with occupational exposure to fipronil. Thyroid 2011;21(7):701-706.

13. Usmani KA, Rose RL, Goldstein JA, Taylor WG, Brimfield AA, Hodgson E. In vitro human metabolism and interactions of repellent N, N-diethyl-m-toluamide. Drug metabolism and disposition 2002;30(3):289-294.

14. Selim S, Hartnagel RE, Osimitz TG, Gabriel KL, Schoenig GP. Absorption, metabolism, and excretion of N, N-diethyl-m-toluamide following dermal application to human volunteers. Toxicological Sciences 1995;25(1):95-100.

15. Tang J, Usmani KA, Hodgson E, Rose RL. In vitro metabolism of fipronil by human and rat cytochrome P450 and its interactions with testosterone and diazepam. Chemico-biological interactions 2004;147(3):319-329.

16. Das PC, Cao Y, Cherrington N, Hodgson E, Rose RL. Fipronil induces CYP isoforms and cytotoxicity in human hepatocytes. Chemico-biological interactions 2006;164(3):200-214.

17. National Library of Medicine Hazardous Substances Data Bank. Fipronil. Web. 2013. Retrieved 10 April 2017.

18. Mitchell RD, Dhammi A, Wallace A, Hodgson E, Roe RM. Impact of Environmental Chemicals on the Transcriptome of Primary Human Hepatocytes: Potential for Health Effects. Journal of biochemical and molecular toxicology 2016;30(8):375-395.

19. Zhang Y, Cao X. Long noncoding RNAs in innate immunity. Cellular & molecular immunology 2016;13(2):138-147.

20. Zhang A, Xu M, Mo Y-Y. Role of the lncRNA–p53 regulatory network in cancer. Journal of molecular cell biology 2014;6(3):181-191.

21. Villegas VE, Zaphiropoulos PG. Neighboring gene regulation by antisense long non-coding RNAs. International journal of molecular sciences 2015;16(2):3251-3266.

22. Rinn JL, Chang HY. Genome regulation by long noncoding RNAs. Annual review of biochemistry 2012;81:145-166.

23. Wang KC, Chang HY. Molecular mechanisms of long noncoding RNAs. Molecular cell 2011;43(6):904-914.

24. Kung JT, Colognori D, Lee JT. Long noncoding RNAs: past, present, and future. Genetics 2013;193(3):651-669.

25. Peng C, Yuan D, Li B, Wei Y, Yan L, Wen T, Zhao J, Yang J, Wang W, Xu M. Body mass index evaluating donor hepatic steatosis in living donor liver transplantation. 2009. Elsevier. p 3556-3559.

26. LeCluyse EL, Madan A, Hamilton G, Carroll K, DeHaan R, Parkinson A. Expression and regulation of cytochrome P450 enzymes in primary cultures of human hepatocytes. Journal of biochemical and molecular toxicology 2000;14(4):177-188.

27. Das PC, Cao Y, Rose RL, Cherrington N, Hodgson E. Enzyme induction and cytotoxicity in human hepatocytes by chlorpyrifos and N, N-diethyl-m-toluamide (DEET). 2008.

28. Baselt RC, Cravey RH. Disposition of toxic drugs and chemicals in man. Seal Beach, California: Biomedical publications; 2011.

29. Illumina I. Quality Scores for Next-Generation Sequencing. 2011.

30. Trapnell C, Roberts A, Goff L, Pertea G, Kim D, Kelley DR, Pimentel H, Salzberg SL, Rinn JL, Pachter L. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protoc 2012;7(3):562-78.

31. Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, Van Baren MJ, Salzberg SL, Wold BJ, Pachter L. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nature biotechnology 2010;28(5):511-515.

32. Trapnell C, Hendrickson DG, Sauvageau M, Goff L, Rinn JL, Pachter L. Differential analysis of gene regulation at transcript resolution with RNA-seq. Nature biotechnology 2013;31(1):46-53.

33. Goff L, Trapnell C, Kelley D. cummeRbund: Analysis, exploration, manipulation, and visualization of Cufflinks high-throughput sequencing data. R package version 2012;2(1).

34. McLean CY, Bristor D, Hiller M, Clarke SL, Schaar BT, Lowe CB, Wenger AM, Bejerano G. GREAT improves functional interpretation of cis-regulatory regions. Nature biotechnology 2010;28(5):495-501.

35. Gray KA, Yates B, Seal RL, Wright MW, Bruford EA. Genenames.org: the HGNC resources in 2015. Nucleic acids research 2014:gku1071.

36. Mi H, Huang X, Muruganujan A, Tang H, Mills C, Kang D, Thomas PD. PANTHER version 11: expanded annotation data from Gene Ontology and Reactome pathways, and data analysis tool enhancements. Nucleic Acids Research 2016:gkw1138.

37. Kin T, Ono Y. Idiographica: a general-purpose web application to build idiograms on-demand for human, mouse and rat. Bioinformatics 2007;23(21):2945-2946.

38. Oliveros J. VENNY. An interactive tool for comparing lists with Venn Diagrams. 2007. 2008.

39. Mudunuri U, Che A, Yi M, Stephens RM. bioDBnet: the biological database network. Bioinformatics 2009;25(4):555-556.

40. Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP. STRING v10: protein–protein interaction networks, integrated over the tree of life. Nucleic acids research 2014:gku1003.

41. Aken BL, Ayling S, Barrell D, Clarke L, Curwen V, Fairley S, Banet JF, Billis K, Girón CG, Hourlier T. The Ensembl gene annotation system. Database 2016;2016:baw093.

42. Milligan MJ, Lipovich L. Pseudogene-derived lncRNAs: emerging regulators of gene expression. Frontiers in genetics 2015;5:476.

43. Milligan MJ, Harvey E, Yu A, Morgan AL, Smith DL, Zhang E, Berengut J,

Sivananthan J, Subramaniam R, Skoric A. Global intersection of long non-coding

RNAs with processed and unprocessed pseudogenes in the human genome. Frontiers

in genetics 2016;7.

44. Harrison PM, Zheng D, Zhang Z, Carriero N, Gerstein M. Transcribed processed

pseudogenes in the human genome: an intermediate form of expressed retrosequence

lacking protein-coding ability. Nucleic acids research 2005;33(8):2374-2383.

45. Tam OH, Aravin AA, Stein P, Girard A, Murchison EP, Cheloufi S, Hodges E, Anger

M, Sachidanandam R, Schultz RM. Pseudogene-derived small interfering RNAs

regulate gene expression in mouse oocytes. Nature 2008;453(7194):534-538.

46. Watanabe T, Cheng E-c, Zhong M, Lin H. Retrotransposons and pseudogenes

regulate mRNAs and lncRNAs via the piRNA pathway in the germline. Genome

research 2015;25(3):368-380.

47. Poliseno L, Salmena L, Zhang J, Carver B, Haveman WJ, Pandolfi PP. A coding-

independent function of gene and pseudogene mRNAs regulates tumour biology.

Nature 2010;465(7301):1033-1038.

48. Branca RM, Orre LM, Johansson HJ, Granholm V, Huss M, Pérez-Bercoff Å,

Forshed J, Käll L, Lehtiö J. HiRIEF LC-MS enables deep proteome coverage and

unbiased proteogenomics. Nature methods 2014;11(1):59-62.

49. Dündar F, Skrabanek L, Zumbo P. Introduction to differential gene expression

analysis using RNA-seq. 2015.

50. Stamatoyannopoulos JA. Illuminating eukaryotic transcription start sites. Nature

methods 2010;7(7):501-503.

51. Esteller M. Non-coding RNAs in human disease. Nature Reviews Genetics

2011;12(12):861-874.

52. Yu X, Zheng H, Chan MT, Wu WKK. HULC: an oncogenic long non‐coding RNA in

human cancer. Journal of Cellular and Molecular Medicine 2016.

53. Peng W, Gao W, Feng J. Long noncoding RNA HULC is a novel biomarker of poor

prognosis in patients with pancreatic cancer. Medical oncology 2014;31(12):1-7.

54. Zhao Y, Guo Q, Chen J, Hu J, Wang S, Sun Y. Role of long non-coding RNA HULC

in cell proliferation, apoptosis and tumor metastasis of gastric cancer: a clinical and in

vitro investigation. Oncology reports 2014;31(1):358-364.

55. Yang Z, Lu Y, Xu Q, Tang B, Park C-K, Chen X. HULC and H19 played different

roles in overall and disease-free survival from hepatocellular carcinoma after curative

hepatectomy: a preliminary analysis from gene expression omnibus. Disease markers

2015;2015.

56. Li CH, Chen Y. Targeting long non-coding RNAs in cancers: progress and prospects.

The international journal of biochemistry & cell biology 2013;45(8):1895-1910.

57. Liu B-L, Sun K-X, Zong Z-H, Chen S, Zhao Y. MicroRNA-372 inhibits endometrial

carcinoma development by targeting the expression of the Ras homolog gene family

member C (RhoC). Oncotarget 2016;7(6):6649.

58. Wu G, Wang Y, Lu X, He H, Liu H, Meng X, Xia S, Zheng K, Liu B. Low mir-372

expression correlates with poor prognosis and tumor metastasis in hepatocellular

carcinoma. BMC cancer 2015;15(1):182.

59. Barsyte-Lovejoy D, Lau SK, Boutros PC, Khosravi F, Jurisica I, Andrulis IL, Tsao

MS, Penn LZ. The c-Myc oncogene directly induces the H19 noncoding RNA by

allele-specific binding to potentiate tumorigenesis. Cancer research

2006;66(10):5330-5337.

60. Berteaux N, Lottin S, Monté D, Pinte S, Quatannens B, Coll J, Hondermarck H,

Curgy J-J, Dugimont T, Adriaenssens E. H19 mRNA-like noncoding RNA promotes

breast cancer cell proliferation through positive control by E2F1. Journal of biological

chemistry 2005;280(33):29625-29636.

61. Matouk IJ, Mezan S, Mizrahi A, Ohana P, Abu-lail R, Fellig Y, Galun E, Hochberg

A. The oncofetal H19 RNA connection: hypoxia, p53 and cancer. Biochimica et

Biophysica Acta (BBA)-Molecular Cell Research 2010;1803(4):443-451.

62. Gutschner T, Hämmerle M, Eißmann M, Hsu J, Kim Y, Hung G, Revenko A, Arun

G, Stentrup M, Groß M. The noncoding RNA MALAT1 is a critical regulator of the

metastasis phenotype of lung cancer cells. Cancer research 2013;73(3):1180-1189.

63. Yang Z, Lu W, Ning L, Hao D, Jian S, Hai-Feng C. Downregulation of long non-

coding RNA MALAT1 induces tumor progression of human breast cancer through

regulating CCND1 expression. Open Life Sciences 2016;11(1):232-236.

64. Fox AH, Lamond AI. Paraspeckles. Cold Spring Harbor perspectives in biology

2010;2(7):a000687.

65. West JA, Davis CP, Sunwoo H, Simon MD, Sadreyev RI, Wang PI, Tolstorukov MY,

Kingston RE. The long noncoding RNAs NEAT1 and MALAT1 bind active

chromatin sites. Molecular cell 2014;55(5):791-802.

66. Lee J, Davidow LS, Warshawsky D. Tsix, a gene antisense to Xist at the X-

inactivation centre. Nature genetics 1999;21(4):400-404.

67. Mondal T, Subhash S, Vaid R, Enroth S, Uday S, Reinius B, Mitra S, Mohammed A,

James AR, Hoberg E. MEG3 long noncoding RNA regulates the TGF-β pathway

genes through formation of RNA-DNA triplex structures. Nature communications

2015;6.

68. Markowitz SD, Roberts AB. Tumor suppressor activity of the TGF-β pathway in

human cancers. Cytokine & growth factor reviews 1996;7(1):93-102.

69. WHO. U.N. issues list of 12 most worrying drug-resistant bacteria. Volume 2017.

London: The Associated Press; 2017.

70. Liu Y, Zhao J, Zhang W, Gan J, Hu C, Huang G, Zhang Y. lncRNA GAS5 enhances

G1 cell cycle arrest via binding to YBX1 to regulate p21 expression in stomach

cancer. Scientific reports 2015;5:10159.

71. Mourtada-Maarabouni M, Hedge VL, Kirkham L, Farzaneh F, Williams GT. Growth

arrest in human T-cells is controlled by the non-coding RNA growth-arrest-specific

transcript 5 (GAS5). Journal of cell science 2008;121(7):939-946.

72. van Manen D, Kootstra NA, Boeser-Nunnink B, Handulle MA, van't Wout AB,

Schuitemaker H. Association of HLA-C and HCP5 gene regions with the clinical

course of HIV-1 infection. Aids 2009;23(1):19-28.

73. Davies H, Bignell GR, Cox C, Stephens P, Edkins S, Clegg S, Teague J, Woffendin

H, Garnett MJ, Bottomley W. Mutations of the BRAF gene in human cancer. Nature

2002;417(6892):949-954.

74. Hanahan D, Weinberg RA. The hallmarks of cancer. cell 2000;100(1):57-70.

75. Hanahan D, Weinberg RA. Hallmarks of cancer: the next generation. cell

2011;144(5):646-674.

76. Read A, Strachan T. Chapter 18: Cancer Genetics. Human molecular genetics 1999;2.

77. Harris SL, Levine AJ. The p53 pathway: positive and negative feedback loops.

Oncogene 2005;24(17):2899-2908.

78. Moll UM, Petrenko O. The MDM2-p53 interaction. Molecular Cancer Research

2003;1(14):1001-1008.

79. Mora A, Komander D, van Aalten DM, Alessi DR. PDK1, the master regulator of

AGC kinase signal transduction. 2004. Elsevier. p 161-170.

80. Mäemets-Allas K, Viil J, Jaks V. A Novel Inhibitor of AKT1–PDPK1 Interaction

Efficiently Suppresses the Activity of AKT Pathway and Restricts Tumor Growth In

Vivo. Molecular cancer therapeutics 2015;14(11):2486-2496.

81. Weinberg RA. Oncogenes, antioncogenes, and the molecular bases of multistep

carcinogenesis. Cancer Research 1989;49(14):3713-3721.

82. Zenonos K, Kyprianou K. RAS signaling pathways, mutations and their role in

colorectal cancer. World J Gastrointest Oncol 2013;5(5):97-101.

83. Rotblat B, Leprivier G, Sorensen PH. A possible role for long non-coding RNA in

modulating signaling pathways. Medical hypotheses 2011;77(6):962-965.

84. Mitin N, Rossman KL, Der CJ. Signaling interplay in Ras superfamily function.

Current Biology 2005;15(14):R563-R574.

85. Denduluri SK, Idowu O, Wang Z, Liao Z, Yan Z, Mohammed MK, Ye J, Wei Q,

Wang J, Zhao L. Insulin-like growth factor (IGF) signaling in tumorigenesis and the

development of cancer drug resistance. Genes & Diseases 2015;2(1):13-25.

86. Wang J, Yuan Y, Zhou Y, Guo L, Zhang L, Kuai X, Deng B, Pan Z, Li D, He F.

Protein interaction data set highlighted with human Ras-MAPK/PI3K signaling

pathways. Journal of proteome research 2008;7(9):3879-3889.

87. Komiya Y, Habas R. Wnt signal transduction pathways. Organogenesis 2008;4(2):68-

75.

88. Clevers H, Nusse R. Wnt/β-catenin signaling and disease. Cell 2012;149(6):1192-

1205.

TABLES

Table 2.1. Long non-protein coding RNAs (lncRNAs) whose transcripts were up- or down-regulated in primary human hepatocytes after exposure to 100 µM DEET, 10 µM fipronil, or a mixture of 100 µM DEET and 10 µM fipronil for 72 hours. Log2FC = log2 fold change; DT = 100 µM DEET; Fip = 10 µM fipronil; and DT+Fip = 100 µM DEET plus 10 µM fipronil mixture.

Differential expression | Gene symbol^A | Gene name^B | log2FC (DT) | log2FC (Fip) | log2FC (DT+Fip)
Up | CYP2B7P | Cytochrome P450 family 2 subfamily B member 7, pseudogene | +2.96 | +2.42 | +2.79
Up | HCP5 | HLA complex P5 | +0.78 | +0.93 | +1.38
Down | MALAT1 | Metastasis associated lung adenocarcinoma transcript 1 | -1.34 | -1.87 | -2.28
Down | NEAT1 | Nuclear paraspeckle assembly transcript 1 | -1.18 | -1.22 | -1.52
Down | LINC01554 | Long intergenic non-protein coding RNA 1554 | -1.10 | -1.37 | -1.33
Down | LINC01004 | Long intergenic non-protein coding RNA 1004 | -1.58 | -2.01 | -2.26
Down | PSORS1C3 | Psoriasis susceptibility 1 candidate 3 | -1.58 | -0.85 | -1.39
Down | AQP7P1 | Aquaporin 7 pseudogene 1 | -0.80 | -1.05 | -0.55
Down | SCART1 | Scavenger receptor protein family member | -1.49 | -1.24 | -1.89
Down | PDXDC2P | Pyridoxal dependent decarboxylase domain containing 2 | -0.78 | -1.14 | -1.41
Down | LINC00893 | Long intergenic non-protein coding RNA 893 | -1.21 | -1.18 | -1.51
Down | WASH5P | WAS protein family homolog 5 pseudogene | -0.79 | -0.76 | -1.06
Down | PFN1P2 | Profilin 1 pseudogene 2 | -1.24 | -1.33 | -1.78
Down | LINC00482 | Long intergenic non-protein coding RNA 482 | -1.01 | -0.91 | -1.06
Down | ERVK13-1 | Endogenous retrovirus group K13 member 1 | -0.84 | -0.78 | -1.25
Down | LOC100289230 | Uncharacterized LOC100289230 | -1.66 | -1.80 | -1.44
Down | LOC728040 | HCG1813624 | -1.37 | -2.50 | -3.47
Down | LOC100190986 | Uncharacterized LOC100190986 | -0.91 | -0.96 | -1.29
Down | LINC01000 | Long intergenic non-protein coding RNA 1000 | -0.71 | -0.53 | -0.88
Down | LOC100272217 | Uncharacterized LOC100272217 | -1.46 | -1.29 | -1.54

^A HUGO Gene Nomenclature Committee (HGNC) gene symbol (31). ^B National Center for Biotechnology Information (NCBI) gene description. Plus means up-regulated. Minus means down-regulated.
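The log2FC values in Table 2.1 can be converted back to linear fold changes when a more intuitive magnitude is wanted. The following is a minimal sketch, assuming the conventional definition log2FC = log2(treated / control); the function name is hypothetical and is not part of the original analysis pipeline.

```python
def log2fc_to_fold_change(log2fc: float) -> float:
    """Convert a log2 fold change to a linear fold change (treated / control).

    Assumes the usual definition log2FC = log2(treated / control).
    """
    return 2.0 ** log2fc


# Two DT-column values copied from Table 2.1.
for symbol, log2fc in [("CYP2B7P", 2.96), ("MALAT1", -1.34)]:
    fold = log2fc_to_fold_change(log2fc)
    print(f"{symbol}: log2FC {log2fc:+.2f} corresponds to a {fold:.2f}-fold level")
```

Under this definition, a log2FC of +1 is a doubling of transcript level and a log2FC of -1 is a halving.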

Table 2.2. Chromosomal distribution of lncRNAs significantly dysregulated (P ≤ 0.01) after primary human hepatocytes were exposed to 100 µM DEET, 10 µM fipronil, or a mixture of 100 µM DEET and 10 µM fipronil. DT+Fip = mixture of 100 µM DEET and 10 µM fipronil.

Chromosome | Genes per chromosome (total)^a | lncRNAs affected per chromosome (DEET) | lncRNAs affected per chromosome (Fipronil) | lncRNAs affected per chromosome (DT+Fip) | Dysregulated lncRNAs vs total genes (DEET) | Dysregulated lncRNAs vs total genes (Fipronil) | Dysregulated lncRNAs vs total genes (DT+Fip)
1 | 5166 | 1 | 27 | 35 | 0.02% | 0.52% | 0.68%
2 | 3920 | 0 | 15 | 17 | 0.00% | 0.38% | 0.43%
3 | 2984 | 0 | 13 | 16 | 0.00% | 0.44% | 0.54%
4 | 2468 | 1 | 11 | 15 | 0.04% | 0.45% | 0.61%
5 | 2795 | 2 | 14 | 18 | 0.07% | 0.50% | 0.64%
6 | 2827 | 2 | 16 | 23 | 0.07% | 0.57% | 0.81%
7 | 2830 | 2 | 22 | 29 | 0.07% | 0.78% | 1.02%
8 | 2321 | 0 | 7 | 7 | 0.00% | 0.30% | 0.30%
9 | 2224 | 2 | 15 | 15 | 0.09% | 0.67% | 0.67%
10 | 2173 | 1 | 17 | 19 | 0.05% | 0.78% | 0.87%
11 | 3159 | 2 | 12 | 13 | 0.06% | 0.38% | 0.41%
12 | 2841 | 0 | 14 | 12 | 0.00% | 0.49% | 0.42%
13 | 1275 | 0 | 2 | 3 | 0.00% | 0.16% | 0.24%
14 | 2204 | 0 | 5 | 8 | 0.00% | 0.23% | 0.36%
15 | 2105 | 0 | 12 | 15 | 0.00% | 0.57% | 0.71%
16 | 2375 | 3 | 16 | 21 | 0.13% | 0.67% | 0.88%
17 | 2896 | 1 | 8 | 17 | 0.03% | 0.28% | 0.59%
18 | 1120 | 0 | 4 | 4 | 0.00% | 0.36% | 0.36%
19 | 2852 | 2 | 14 | 16 | 0.07% | 0.49% | 0.56%
20 | 1376 | 0 | 5 | 6 | 0.00% | 0.36% | 0.44%
21 | 819 | 0 | 5 | 4 | 0.00% | 0.61% | 0.49%
22 | 1309 | 0 | 5 | 10 | 0.00% | 0.38% | 0.76%
X | 2345 | 1 | 8 | 8 | 0.04% | 0.34% | 0.34%
Total | 56384 | 20 | 267^b | 331^b | | |

^a Total coding and noncoding genes based on Ensembl release 87 [37]. ^b Two genes omitted since chromosome location not well-established.
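The percentage columns in Table 2.2 appear to be the number of affected lncRNAs on a chromosome divided by the total number of genes on that chromosome. The following is a minimal sketch of that arithmetic, using the chromosome 1 counts from the table; the function name and the two-decimal rounding are assumptions made for illustration.

```python
def percent_affected(n_affected: int, n_total_genes: int) -> float:
    """Affected genes as a percentage of all genes on a chromosome, to 2 decimals."""
    if n_total_genes <= 0:
        raise ValueError("n_total_genes must be positive")
    return round(100.0 * n_affected / n_total_genes, 2)


# Chromosome 1 in Table 2.2: 5166 genes in total; 1, 27 and 35 lncRNAs affected.
chromosome_1_total = 5166
for treatment, n_affected in [("DEET", 1), ("Fipronil", 27), ("DT+Fip", 35)]:
    print(treatment, f"{percent_affected(n_affected, chromosome_1_total):.2f}%")
```

The same calculation applies to the protein-coding counts when the affected-gene count is swapped in.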

Table 2.3. Chromosomal distribution of protein-coding genes significantly dysregulated (P ≤ 0.01) after primary human hepatocytes were exposed to 100 µM DEET, 10 µM fipronil, or a mixture of 100 µM DEET and 10 µM fipronil. DT+Fip = mixture of 100 µM DEET and 10 µM fipronil.

Chromosome | Genes per chromosome (total)^a | Coding genes affected per chromosome (DEET) | Coding genes affected per chromosome (Fipronil) | Coding genes affected per chromosome (DT+Fip) | Dysregulated coding genes vs total genes (DEET) | Dysregulated coding genes vs total genes (Fipronil) | Dysregulated coding genes vs total genes (DT+Fip)
1 | 5166 | 26 | 409 | 568 | 0.50% | 7.92% | 10.99%
2 | 3920 | 5 | 222 | 328 | 0.13% | 5.66% | 8.37%
3 | 2984 | 11 | 198 | 264 | 0.37% | 6.64% | 8.85%
4 | 2468 | 11 | 158 | 209 | 0.45% | 6.40% | 8.47%
5 | 2795 | 8 | 156 | 224 | 0.29% | 5.58% | 8.01%
6 | 2827 | 9 | 202 | 277 | 0.32% | 7.15% | 9.80%
7 | 2830 | 16 | 193 | 264 | 0.57% | 6.82% | 9.33%
8 | 2321 | 6 | 139 | 186 | 0.26% | 5.99% | 8.01%
9 | 2224 | 9 | 150 | 207 | 0.40% | 6.74% | 9.31%
10 | 2173 | 9 | 151 | 175 | 0.41% | 6.95% | 8.05%
11 | 3159 | 7 | 208 | 298 | 0.22% | 6.58% | 9.43%
12 | 2841 | 4 | 214 | 290 | 0.14% | 7.53% | 10.21%
13 | 1275 | 1 | 49 | 71 | 0.08% | 3.84% | 5.57%
14 | 2204 | 1 | 104 | 156 | 0.05% | 4.72% | 7.08%
15 | 2105 | 8 | 114 | 160 | 0.38% | 5.42% | 7.60%
16 | 2375 | 10 | 178 | 242 | 0.42% | 7.49% | 10.19%
17 | 2896 | 5 | 183 | 291 | 0.17% | 6.32% | 10.05%
18 | 1120 | 2 | 53 | 68 | 0.18% | 4.73% | 6.07%
19 | 2852 | 12 | 256 | 350 | 0.42% | 8.98% | 12.27%
20 | 1376 | 4 | 81 | 128 | 0.29% | 5.89% | 9.30%
21 | 819 | 1 | 37 | 48 | 0.12% | 4.52% | 5.86%
22 | 1309 | 4 | 90 | 132 | 0.31% | 6.88% | 10.08%
X | 2345 | 3 | 125 | 164 | 0.13% | 5.33% | 6.99%
Total | 56384 | 172 | 3670 | 5100 | | |

^a Total coding and noncoding genes based on Ensembl release 87 [37]. ^b Two genes omitted since chromosome location not well-established.

Table 2.4. Protein-coding and non-protein coding genes neighboring (within 1,000 kb) the 20 lncRNAs up- or down-regulated by 100 µM DEET using GREAT algorithm parameters. The GREAT algorithm defines neighboring genes as those whose transcription start site (TSS) is within 1,000 kb of the input lncRNAs. lncRNA = long non-protein coding RNA; kb = kilobases. All gene names are HUGO Gene Nomenclature Committee (HGNC) gene symbols.

lncRNA | Gene(s) within 1,000 kb of lncRNA
CYP2B7P | CYP2A7 (-54710), CYP2B6 (-53837), CYP2A6*, CYP2A13*
HCP5 | MICB (-33621), MICA (+60915)
AQP7P1 | ANKRD20A1 (-646908)
MALAT1 | SCYL1 (-22962), FRMD8 (+115516), NEAT1*
SCART1 | CYP2E1 (-59218), MTG1 (+67017)
PFN1P2 | PPIAL4B (-247525), NBPF9 (-199977)
PDXDC2P | PDPR (-92503), CLEC18A (+69943), NQO1*
LINC01000 | CALU (-88173), METTL2B (+174390)
LOC100190986 | METTL9 (-166237), NPIPB3 (-13482)
PSORS1C3 | POU5F1 (-5124), HLA-C (+96269), HCP5*
LINC01554 | GLRX (-33468), ELL2 (+105889)
LOC100272217 | FUBP3 (-1184)
LINC00893 | IDS (-28345), CXorf40A (-6965)
NEAT1 | SCYL1 (-100412), FRMD8 (+38066), MALAT1*
WASH5P | OR4F17 (-41513)
LOC100289230 | CHD1 (-3535)
LINC00482 | SLC38A10 (-10731), TMEM105 (+24638)
LINC01004 | KMT2E (-27723), LHFPL3 (+657799)
ERVK13-1 | KCTD5 (-16561), PDPK1 (+127950), RPS2*
LOC728040 | AFM (+36985), RASSF6 (+101963), CXCL8*

*Neighboring differentially expressed protein-coding gene or lncRNA previously found to be within 1,000 kb of a dysregulated lncRNA (after 100 µM DEET treatment) before the GREAT algorithm parameters were implemented.

FIGURES

Fig. 2.1. Relationships between the number of long non-protein coding RNAs (lncRNAs) whose transcripts were significantly up- or down-regulated (A) and protein-coding genes whose transcripts were differentially expressed (B) when primary human hepatocytes were treated with DEET (100 µM), fipronil (10 µM), or a mixture of the two (100 µM DEET and 10 µM fipronil) for 72 h. (A) lncRNAs with transcripts differentially expressed at P ≤ 0.01; (B) protein-coding genes with transcripts differentially expressed at P ≤ 0.01.

Fig. 2.2. Log2 fold change of transcripts significantly differentially expressed from long non-protein coding RNA genes by chromosome (P ≤ 0.01). Shared by all 3 means those transcripts were differentially expressed in all 3 treatments; Fip only means those transcripts were only differentially expressed when treated with 10 µM fipronil; Shared Fip and DT+Fip means those transcripts were only differentially expressed when treated with fipronil or a combination of DEET and fipronil, but not 100 µM DEET alone; and DT+Fip only means those transcripts were only differentially expressed with the combination of DEET and fipronil, but not each treatment alone. A single representative log2 fold change value was used for transcripts that were differentially expressed under more than one treatment condition. *A representative log2 fold change refers to the average log2 fold change for any genes whose transcript expression was affected by more than one treatment, like shared by all 3, where the same genes were dysregulated by all 3 treatment conditions.
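The representative log2 fold change described in this caption is an average across the treatments in which a gene was affected. The following is a minimal sketch of that calculation, assuming a simple arithmetic mean; the function name is hypothetical, and the example values are the three MALAT1 entries from Table 2.1.

```python
from statistics import mean
from typing import Sequence


def representative_log2fc(values: Sequence[float]) -> float:
    """Arithmetic mean of a gene's log2 fold changes across treatments."""
    if not values:
        raise ValueError("at least one log2 fold change is required")
    return mean(values)


# MALAT1 was down-regulated in all three treatments (DT, Fip, DT+Fip).
malat1 = [-1.34, -1.87, -2.28]
print(f"MALAT1 representative log2FC: {representative_log2fc(malat1):.2f}")
```

Because the mean is taken on the log2 scale, it corresponds to a geometric mean of the underlying linear fold changes.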

Fig. 2.3. Chromosome maps showing either (A) location of lncRNAs with up- or down-regulated transcripts or (B-D) location of lncRNAs with differentially expressed transcripts in relation to dysregulated protein-coding genes after exposure of primary human hepatocytes to 100 µM DEET, 10 µM fipronil, or a mixture of 100 µM DEET and 10 µM fipronil. (A) Chromosomal location of lncRNAs with significantly up- or down-regulated transcripts (P ≤ 0.01) when hepatocytes were exposed to 100 µM DEET; (B) chromosomal location of lncRNAs and protein-coding genes significantly dysregulated (P ≤ 0.01) when hepatocytes were exposed to 100 µM DEET; (C) chromosomal location of lncRNAs and protein-coding genes significantly dysregulated (P ≤ 0.01) when hepatocytes were exposed to 10 µM fipronil; (D) chromosomal location of lncRNAs and protein-coding genes significantly dysregulated (P ≤ 0.01) when hepatocytes were exposed to a mixture of 100 µM DEET and 10 µM fipronil.

Fig. 2.3A. Green bars and gene symbols denote identity and location of lncRNAs with up-regulated transcripts. Red bars and gene symbols denote identity and location of lncRNAs with down-regulated transcripts.

Fig. 2.3B. Blue bars and gene symbols denote identity and location of lncRNAs with differentially expressed transcripts (both up- and down-regulated). Orange bars and gene symbols denote identity and location of protein-coding genes with differentially expressed transcripts (both up- and down-regulated).

Fig. 2.3C. Blue bars denote location of lncRNAs with differentially expressed transcripts (both up- and down-regulated). Orange bars denote location of protein-coding genes with differentially expressed transcripts (both up- and down-regulated).

Fig. 2.3D. Blue bars denote location of lncRNAs with differentially expressed transcripts (both up- and down-regulated). Orange bars denote location of protein-coding genes with differentially expressed transcripts (both up- and down-regulated).

Fig. 2.4. Chromosomal location of lncRNA genes within 1,000 kb of protein-coding genes with differentially expressed transcripts after primary human hepatocytes were exposed to 100 µM DEET or 10 µM fipronil. (4A) Location of dysregulated lncRNAs and neighboring (within 1,000 kb) protein-coding genes affected by 100 µM DEET on selected chromosomes and (4B) location of dysregulated lncRNAs and neighboring protein-coding genes affected by 10 µM fipronil on selected chromosomes. p-arm = short arm of chromosome; q-arm = long arm of chromosome; black star = corresponding lncRNA from 100 µM DEET treatment.

Fig. 2.5. Genomic relationship between the transcription sites of lncRNAs whose transcripts were differentially expressed in primary human hepatocytes after exposure to 100 µM DEET and the nearest protein-coding gene transcription start site (TSS) that lies within 1,000 kilobases (kb) of the lncRNA. If the nearest TSS was more than 1,000 kb away, no neighboring protein-coding genes were assigned. Neighboring protein-coding genes are defined by the GREAT algorithm as those within 1,000 kb of an input genomic region (an lncRNA in this case). The GREAT algorithm assumes that lncRNAs within 1,000 kb of a neighboring gene TSS can affect transcription of that neighboring gene. GREAT calculates the distance between input sequences and target TSS by measuring the distance from the middle of an lncRNA to the closest TSS of a protein-coding gene. (A) Number of protein-coding genes associated with up- or down-regulated lncRNA transcription sites (referred to as genomic regions in the graphs) after 100 µM DEET exposure; (B) genomic distance in kb from lncRNA transcription sites with up- or down-regulated transcripts after 100 µM DEET exposure to the closest protein-coding gene TSS; and (C) genomic distance in kb and orientation of lncRNA transcription sites with up- or down-regulated transcripts after 100 µM DEET exposure relative to the closest protein-coding gene TSS. Values in (A) that are red indicate the number of genomic regions (i.e., lncRNAs) that do not lie within 1,000 kb of a protein-coding gene TSS. Percentages on the Y axis in (A) refer to the ratio of differentially expressed lncRNAs that neighbor 0, 1, or 2 or more protein-coding genes to the total number of lncRNAs dysregulated by 100 µM DEET (20 total in this case). 
Percentages on the Y axis in (B) refer to the ratio of lncRNAs that fall into categories within a certain range from the TSS of the closest protein-coding gene to the total number of lncRNAs that fall within these ranges (36 total in this case since a single gene can span more than one range category). Percentages on the Y axis in (C) refer to the ratio of lncRNAs that fall into categories within a certain range both before and after the closest protein-coding gene TSS to the total number of lncRNAs that fall within these ranges (36 total in this case).
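The distance rule stated in this caption (midpoint of the lncRNA to the closest protein-coding TSS, with a 1,000 kb cutoff) can be sketched as follows. This is a simplified illustration and not the GREAT implementation: the function name, the coordinates in the example, and the sign convention (midpoint minus TSS, ignoring strand) are all assumptions.

```python
from typing import Optional, Sequence, Tuple

MAX_DISTANCE_BP = 1_000_000  # the 1,000 kb window stated in the caption


def nearest_tss(
    lnc_start: int, lnc_end: int, tss_positions: Sequence[int]
) -> Optional[Tuple[int, int]]:
    """Return (tss, signed_distance_bp) for the TSS closest to the lncRNA midpoint.

    Returns None when no TSS lies within MAX_DISTANCE_BP of the midpoint.
    The sign is midpoint minus TSS; strand is ignored (an assumption).
    """
    midpoint = (lnc_start + lnc_end) // 2
    best: Optional[Tuple[int, int]] = None
    for tss in tss_positions:
        distance = midpoint - tss
        if abs(distance) > MAX_DISTANCE_BP:
            continue  # outside the window, so not a neighboring gene
        if best is None or abs(distance) < abs(best[1]):
            best = (tss, distance)
    return best


# Hypothetical coordinates on a single chromosome, for illustration only.
print(nearest_tss(1_000, 3_000, [500, 10_000]))  # closest TSS and signed distance
print(nearest_tss(0, 100, [5_000_000]))          # no TSS within the window
```

An lncRNA for which the function returns None would be counted among the genomic regions with no neighboring protein-coding gene in panel (A).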

Fig. 2.6. Genomic relationship between the transcription sites of lncRNAs whose transcripts were differentially expressed in primary human hepatocytes after exposure to 10 µM fipronil and the nearest protein-coding gene transcription start site (TSS) that lies within 1,000 kilobases (kb) of the lncRNA. If the nearest TSS was more than 1,000 kb away, no neighboring protein-coding genes were assigned. Neighboring protein-coding genes are defined by the GREAT algorithm as those within 1,000 kb of an input genomic region (an lncRNA in this case). The GREAT algorithm assumes that lncRNAs within 1,000 kb of a neighboring gene TSS can affect transcription of that neighboring gene. GREAT calculates the distance between input sequences and target TSS by measuring the distance from the middle of an lncRNA to the closest TSS of a protein-coding gene. (A) Number of protein-coding genes associated with up- or down-regulated lncRNA transcription sites (referred to as genomic regions in the graphs) after 10 µM fipronil exposure; (B) genomic distance in kb from lncRNA transcription sites with up- or down-regulated transcripts after 10 µM fipronil exposure to the closest protein-coding gene TSS; and (C) genomic distance in kb and orientation of lncRNA transcription sites with up- or down-regulated transcripts after 10 µM fipronil exposure relative to the closest protein-coding gene TSS. Values in (A) that are red indicate the number of genomic regions (i.e., lncRNAs) that do not lie within 1,000 kb of a protein-coding gene TSS. Percentages on the Y axis in (A) refer to the ratio of differentially expressed lncRNAs that neighbor 0, 1, or 2 or more protein-coding genes to the total number of lncRNAs dysregulated by 10 µM fipronil (269 total in this case). 
Percentages on the Y axis in (B) refer to the ratio of lncRNAs that fall into categories within a certain range from the TSS of the closest protein-coding gene to the total number of lncRNAs that fall within these ranges (478 total in this case since a single gene can span more than one range category). Percentages on the Y axis in (C) refer to the ratio of lncRNAs that fall into categories within a certain range both before and after the closest protein-coding gene TSS to the total number of lncRNAs that fall within these ranges (478 total in this case).

Fig. 2.7. Genomic relationship between the transcription sites of lncRNAs whose transcripts were differentially expressed in primary human hepatocytes after exposure to 100 µM DEET plus 10 µM fipronil and the nearest protein-coding gene transcription start site (TSS) that lies within 1,000 kilobases (kb) of the lncRNA. If the nearest TSS was more than 1,000 kb away, no neighboring protein-coding genes were assigned. Neighboring protein-coding genes are defined by the GREAT algorithm as those within 1,000 kb of an input genomic region (an lncRNA in this case). The GREAT algorithm assumes that lncRNAs within 1,000 kb of a neighboring gene TSS can affect transcription of that neighboring gene. GREAT calculates the distance between input sequences and target TSS by measuring the distance from the middle of an lncRNA to the closest TSS of a protein-coding gene. (A) Number of protein-coding genes associated with up- or down-regulated lncRNA transcription sites (referred to as genomic regions in the graphs) after 100 µM DEET plus 10 µM fipronil exposure; (B) genomic distance in kb from lncRNA transcription sites with up- or down-regulated transcripts after 100 µM DEET plus 10 µM fipronil exposure to the closest protein-coding gene TSS; and (C) genomic distance in kb and orientation of lncRNA transcription sites with up- or down-regulated transcripts after 100 µM DEET plus 10 µM fipronil exposure relative to the closest protein-coding gene TSS. Values in (A) that are red indicate the number of genomic regions (i.e., lncRNAs) that do not lie within 1,000 kb of a protein-coding gene TSS. Percentages on the Y axis in (A) refer to the ratio of differentially expressed lncRNAs that neighbor 0, 1, or 2 or more protein-coding genes to the total number of lncRNAs dysregulated by 100 µM DEET plus 10 µM fipronil (331 total in this case). 
Percentages on the Y axis in (B) refer to the ratio of lncRNAs that fall into categories within a certain range from the TSS of the closest protein-coding gene to the total number of lncRNAs that fall within these ranges (603 total in this case since a single gene can span more than one range category). Percentages on the Y axis in (C) refer to the ratio of lncRNAs that fall into categories within a certain range both before and after the closest protein-coding gene TSS to the total number of lncRNAs that fall within these ranges (603 total in this case).

Fig. 2.8. Top 10 biological processes and signaling pathways affected by exposure of primary human hepatocytes to 100 µM DEET for 72 hours. (A) Top 10 biological processes affected by treatment with 100 µM DEET and (B) top 10 signaling pathways affected by treatment with 100 µM DEET.

Fig. 2.9. Top 10 biological processes and signaling pathways affected by exposure of primary human hepatocytes to 10 µM fipronil for 72 hours. (A) Top 10 biological processes affected by treatment with 10 µM fipronil and (B) top 10 signaling pathways affected by treatment with 10 µM fipronil.

Fig. 2.10. Top 10 biological processes and signaling pathways affected by exposure of primary human hepatocytes to 100 µM DEET plus 10 µM fipronil for 72 hours. (A) Top 10 biological processes affected by treatment with 100 µM DEET plus 10 µM fipronil and (B) top 10 signaling pathways affected by treatment with 100 µM DEET plus 10 µM fipronil.

Fig. 2.11. Protein–protein interaction network mapping direct (physical) as well as indirect (functional) associations with gene products from protein-coding genes neighboring lncRNAs whose transcripts were differentially expressed after primary human hepatocytes were exposed to 100 µM DEET (as defined by GREAT). Protein networks are a topological summary of protein–protein interactions with a queried protein, or multiple protein inputs, based on known and predicted interactions determined experimentally or from curated databases. STRING version 10 is a database that compiles interaction information from various sources and uses these data to generate the interaction networks displayed here. Settings were as follows: Number of input proteins = 42; Minimum required interaction score: Low confidence (0.150); Max number of interactors to show: 1st shell = query proteins only; 2nd shell = maximum of 5 interactors; Disconnected nodes hidden in network. Colored nodes = query proteins and first shell of interactors; white nodes = second shell of interactors; large nodes = some 3D structure is known or predicted; small nodes = protein of unknown 3D structure; blue lines (known interaction) = from curated databases; magenta lines (known interaction) = experimentally determined; green lines (predicted interactions) = gene neighborhood; red lines (predicted interactions) = gene fusions; purple lines (predicted interactions) = gene co-occurrence; chartreuse lines = text mining; black lines = co-expression; violet lines = protein homology.

Fig. 2.12. Protein–protein interaction network mapping direct (physical) as well as indirect (functional) associations with proteins involved in the immune response. Gene products from protein-coding genes neighboring lncRNAs whose transcripts were differentially expressed after primary human hepatocytes were exposed to a mixture of 100 µM DEET and 10 µM fipronil were used as inputs. Some protein-coding genes involved in the immune response were also included whose transcripts were differentially expressed after exposure to a mixture of 100 µM DEET and 10 µM fipronil, but were not within 1,000 kb of any dysregulated lncRNAs. Protein networks are a topological summary of protein-protein interactions with a queried protein, or multiple protein inputs, based on known and predicted interactions determined experimentally or from curated databases. STRING version 10 is a database that compiles interaction information from various sources and utilizes this data to generate the interaction networks displayed here. See Figure 2.11 caption for STRING settings and line color meanings.

Fig. 2.13. Protein–protein interaction network mapping direct (physical) as well as indirect (functional) associations with proteins involved in the p53 signaling pathway. Gene products from protein-coding genes neighboring lncRNAs whose transcripts were differentially expressed after primary human hepatocytes were exposed to a mixture of 100 µM DEET and 10 µM fipronil were used as inputs. Some protein-coding genes involved in the p53 signaling pathway were also included whose transcripts were differentially expressed after exposure to a mixture of 100 µM DEET and 10 µM fipronil, but were not within 1,000 kb of any dysregulated lncRNAs. Protein networks are a topological summary of protein-protein interactions with a queried protein, or multiple protein inputs, based on known and predicted interactions determined experimentally or from curated databases. STRING version 10 is a database that compiles interaction information from various sources and utilizes this data to generate the interaction networks displayed here. See Figure 2.11 caption for STRING settings and line color meanings.

Fig. 2.14. Protein–protein interaction network mapping direct (physical) as well as indirect (functional) associations with proteins involved in the Ras signaling pathway. Gene products from protein-coding genes neighboring lncRNAs whose transcripts were differentially expressed after primary human hepatocytes were exposed to a mixture of 100 µM DEET and 10 µM fipronil were used as inputs. Some protein-coding genes involved in the Ras signaling pathway were also included whose transcripts were differentially expressed after exposure to a mixture of 100 µM DEET and 10 µM fipronil, but were not within 1,000 kb of any dysregulated lncRNAs. Protein networks are a topological summary of protein-protein interactions with a queried protein, or multiple protein inputs, based on known and predicted interactions determined experimentally or from curated databases. STRING version 10 is a database that compiles interaction information from various sources and utilizes this data to generate the interaction networks displayed here. See Figure 2.11 caption for STRING settings and line color meanings.

Fig. 2.15. Protein–protein interaction network mapping direct (physical) as well as indirect (functional) associations with proteins involved in the Wnt signaling pathway. Inputs were the gene products of protein-coding genes neighboring lncRNAs whose transcripts were differentially expressed after primary human hepatocytes were exposed to a mixture of 100 µM DEET and 10 µM fipronil. Also included were some protein-coding genes in the Wnt signaling pathway whose transcripts were differentially expressed after exposure to the same mixture but that were not within 1,000 kb of any dysregulated lncRNA. A protein network is a topological summary of the interactions of one or more queried proteins, based on known and predicted interactions determined experimentally or drawn from curated databases. STRING version 10, a database that compiles interaction information from various sources, was used to generate the interaction networks displayed here. See the Figure 2.11 caption for STRING settings and line color meanings.

SUPPLEMENTAL MATERIAL

Supplement 1

10 µM Fipronil lncRNAs and Neighboring^A Protein-Coding Genes

Gene symbol (lncRNA) | Gene symbol (protein-coding gene) and distance (in bp) to lncRNA
RPL23AP7 | RABL2A (-8041), FOXD4L1 (+120104)
CYP2B7P | CYP2A7 (-54710), CYP2B6 (-53837)
MT1L | MT1E (-7336), MT2A (+9940)
MGC72080 | ASNS (-96919), OCM2 (+20735)
SNX29P2 | NPIPB11 (+70356), LAT (+348847)
HSD17B7P2 | ZNF37A (+273106)
GOLGA6L5P | ZSCAN2 (-91294), ADAMTSL3 (+730104)
RRN3P1 | NPIPB4 (+49516), OTOA (+129388)
RP9P | FKBP9 (-27413), AVL9 (+434503)
GTF2IP4 | STAG3L3 (-118930), NSUN5 (+127639)
LOC728554 | PROP1 (+116478), B4GALT7 (+279664)
MT1DP | MT1B (-7585), MT1A (+5648)
WASH3P | OR4F4 (-46858)
RPSAP58 | ZNF726 (-119325), RPSAP58 (+32551)
FGF7P6 | NONE
PI4KAP2 | UBE2L3 (-54203), HIC2 (+52784)
FKBP9P1 | VOPP1 (-120295), SEPT14 (+169969)
LOC154761 | OR2F2 (-110824), CTAGE6 (-66646)
CYP2D7 | CYP2D6 (-11486), TCF20 (+73054)
GUCY2EP | TSKU (-90775), LRRC32 (-30977)
WASH2P | RABL2A (-33544), FOXD4L1 (+94601)
RASA4CP | DBNL (-8120), UBE2D4 (+110147)
MSL3P1 | TRPM8 (-50471), HJURP (-12360)
FBLL1 | FBLL1 (+989)
GOLGA2P7 | ZSCAN2 (-264476), ADAMTSL3 (+556922)
LOC654342 | NONE
SERPINB9P1 | SERPINB1 (-23577), SERPINB9 (+37697)
LOC389765 | NAA35 (-116706), AGTPBP1 (-82622)
ROCK1P1 | TUBB8P12 (-66086), USP14 (-42965)
TEKT4P2 | NONE
WHAMMP1 | GOLGA8O (-71160), GOLGA8N (-66662)
MAFIP | ZNF595 (+31120), ZNF732 (+214780)
STAG3L2 | GTF2IRD2 (-34650), WBSCR16 (+187132)
LOC100288778 | IQSEC3 (-86308)
GAS5 | ZBTB37 (-2738)
SNHG6 | MCMDC2 (+27396), TCF24 (+64445)
H19 | MRPL23 (+49227), IGF2 (+144733)
SNHG5 | SYNCRIP (-34585)
EPB41L4A-AS1 | NREP (-184582), EPB41L4A (+257803)
UCA1 | OR10H1 (-24508), CYP4F2 (+65486)
MIR4435-2HG | BCL2L11 (+310135), ANAPC1 (+453626)
ZMIZ1-AS1 | ZMIZ1 (-63648), RPS24 (+971626)
UBA6-AS1 | UBA6 (-10712), GNRHR (+42469)
DANCR | USP46 (-53961), ERVMER34-1 (+38344)
DLGAP1-AS1 | DLGAP1 (+479687), TGIF1 (+523973)
CYTOR | PLGLB1 (-539027), PLGLB2 (-259604)
LINC01512 | TMEM63B (-212297), VEGFA (+143910)
LOC100268168 | RPL26L1 (-1818)
LINC00239 | PPP2R5C (-30317), DIO3 (+170130)
URB1-AS1 | URB1 (-519)
LOC100133669 | CYP11B2 (-82368), LY6E (-18318)
LOC115110 | TNFRSF14 (-4984)
DRAIC | RPLP1 (+113796), TLE3 (+531596)
LOC153910 | GPR126 (+280243), HIVEP2 (+363029)
LINC00938 | ARID2 (-2845)
LOC642361 | SFTPD (+122351), SFTPA1 (+215807)
LOC284344 | PSG4 (-24444), PSG9 (+39310)
LOC729970 | CNN3 (-18371), ALG14 (+127296)
LINC00673 | SOX9 (+377042), SLC39A11 (+594624)
LOC100294145 | HLA-DMB (+42103), PSMB9 (+44806)
MIR210HG | PHRF1 (-9464), RASSF7 (+6005)
LOC100506844 | CTDSP2 (-87067), XRCC6BP1 (-7771)
LINC00888 | MCF2L2 (-23535), KLHL6 (+103879)
OSER1-AS1 | JPH2 (-30915), FITM2 (+92676)
LOC100507389 | PCOLCE2 (-45402), PAQR9 (+28731)
LYPLAL1-AS1 | LYPLAL1 (-46463), TGFB2 (+781146)
IL10RB-AS1 | IL10RB (-412)
LINC00862 | ZNF281 (+51888), NR5A2 (+330566)
LINC00266-1 | PCMTD2 (+41101)
HCP5 | MICB (-33621), MICA (+60915)
HEIH | MGAT1 (-23011), ZFP62 (+28437)
SH3BP5-AS1 | SH3BP5 (+34840), CAPN7 (+91546)
RCHY1 | THAP6 (-17740), PARM1 (+563617)
LINC01587 | C4orf6 (+1322), EVC2 (+182089)
LINC01619 | DCN (-882312), BTG1 (+80461)
PCAT6 | KDM5B (-1959)
HLA-DRB6 | HLA-DRB5 (-26070), HLA-DRB1 (+33491)
HLA-H | HLA-A (-51918), HLA-G (+62363)
ANKRD20A12P | NONE
TPTEP1 | CCT8L2 (-32560), XKR3 (+196329)
CMAHP | FAM65B (-198762), LRRC16A (-169699)
NPY6R | MYOT (-61820), HNRNPA0 (-51621)
MBL1P | SFTPD (+30754), SFTPA1 (+307404)
AQP7P1 | ANKRD20A1 (-646908)
AKR7L | EMC1 (-18476), AKR7A3 (+19222)
HERC2P4 | ZNF267 (+278162), TP53TG3 (+524812)
PGM5P2 | FOXD4L6 (+88155)
SCART1 | CYP2E1 (-59218), MTG1 (+67017)
RPL32P3 | H1FX (-74859), EFCAB12 (+37515)
RNF5P1 | TACC1 (-186488), FGFR1 (-132871)
PFN1P2 | PPIAL4B (-247525), NBPF9 (-199977)
LOC220729 | BDH1 (-64967), KIAA0226 (+115765)
CES1P1 | CES1 (+65402), SLC6A2 (+111112)
EP400NL | DDX51 (+39024), EP400 (+155348)
GGT8P | NONE
GUSBP2 | ZNF322 (-221819), HIST1H2BJ (+218730)
DPY19L2P2 | NAPEPLD (-78558), PMPCB (-69682)
PDXDC2P | PDPR (-92503), CLEC18A (+69943)
UBE2Q2P1 | ZSCAN2 (-47317), ADAMTSL3 (+774081)
SPDYE7P | CALN1 (-459128), POM121 (-13450)
PPIEL | BMP8A (+34026), PABPC4 (+50729)
LOC652276 | KCTD5 (-65536), PDPK1 (+78975)
CROCCP3 | NECAP2 (+39332), NBPF1 (+133419)
CLUHP3 | ZNF720 (-9216), AHSP (+176154)
HLA-J | ZNRD1 (-30157), HLA-A (+89837)
LOC155060 | ZNF783 (+29125), ZNF777 (+169827)
ADAM1A | MAPKAPK5 (+57856), TMEM116 (+112629)
MT1JP | MT1A (-2254)
AZGP1P1 | ZKSCAN1 (-33082), AZGP1 (-6342)
LOC730102 | RASAL2 (-72068), SEC16B (-52158)
ESPNP | CROCC (-216263), NBPF1 (-92200)
FAM86B3P | SGK223 (+149769), ZNF705B (+293095)
LOC728989 | NBPF12 (+128691), PRKAB2 (+141376)
BMS1P4 | SEC24C (-29559), AGAP5 (-17036)
HTATSF1P2 | RIPK1 (-45961), NQO2 (+22489)
GOLGA2P5 | ANKS1B (-180633), ACTR6 (-35217)
ZNF37BP | ZNF33B (+105372)
ANKRD36BP1 | TBX19 (-34535), SFT2D2 (+20497)
FAM45BP | RBMX2 (+94225), ENOX2 (+407040)
GUSBP3 | SERF1B (-350315), GTF2H2C (+114712)
LOC389834 | NONE
LOC283788 | NONE
MTMR9LP | LCK (-14554), EIF3I (+14315)
FRG1JP | ANKRD20A1 (+514318), FOXD4L6 (+761125)
BEND3P3 | SFTPA1 (+74989), SFTPD (+263169)
ARHGAP27P1-BPTFP1-KPNA2P3 | SMURF2 (-103762), LRRC37A3 (+152955)
AKR1C8P | AKR1CL1 (+15194), AKR1C3 (+75917)
RPL23AP87 | METRNL (+144052)
ANKRD20A9P | TUBA3C (+328666)
PMS2P9 | POMZP3 (-419018), FGL2 (+153567)
GLUD1P3 | AGAP5 (-35779), SEC24C (-10816)
GCSHP3 | INO80D (-29890), NDUFS1 (+43122)
WASH5P | OR4F17 (-41513)
CTSLP2 | ASAH2C (-102299), AGAP9 (-58519)
GTF2H2B | SERF1A (-467821), SMN2 (+383254)
YY1P2 | NXPH2 (-117901)
DNM1P41 | ZSCAN2 (-91294), ADAMTSL3 (+730104)
CEACAM22P | IGSF23 (-66343), ZNF180 (-46023)
SMG1P7 | EXOSC6 (+29123), CLEC18C (+48782)
ZNF767P | KRBA1 (-129085), ZNF746 (-88165)
HERC2P7 | GOLGA8S (-207661), GOLGA8I (+137090)
LOC100131257 | CCZ1B (-259508), C1GALT1 (-96337)
TMEM198B | DNAJC14 (+279)
ALG1L9P | KRTAP5-11 (-217810), DEFB108B (-32515)
PRKXP1 | ASB7 (-49017), CERS3 (-8797)
LOC100130075 | MDM2 (-3211)
LOC202181 | B4GALT7 (+45288), PROP1 (+350854)
AKR1C6P | AKR1C1 (-69283), AKR1E2 (+67716)
MALAT1 | SCYL1 (-22962), FRMD8 (+115516)
HULC | SLC35B3 (-175051)
XIST | ZCCHC13 (-481711), CHIC1 (+259270)
MEG3 | RTL1 (+41282), DLK1 (+116738)
TSIX | ZCCHC13 (-481711), CHIC1 (+259270)
NEAT1 | SCYL1 (-100412), FRMD8 (+38066)
LINC00115 | OR4F16 (-140191), SAMD11 (-98874)
HNF1A-AS1 | SPPL3 (-66694), HNF1A (-7478)
LINC01000 | CALU (-88173), METTL2B (+174390)
LINC01018 | NSUN2 (+47954), UBE2QL1 (+136714)
PTGES2-AS1 | PTGES2 (-1119)
MIR100HG | UBASH3B (-509593), BLID (-29867)
PVT1 | TMEM75 (-47595)
LINC00926 | CGNL1 (-72438), TCF12 (+385444)
LOC100128288 | KRBA2 (+12063), ODF4 (+19613)
NDUFB2-AS1 | NDUFB2 (+4320), BRAF (+223773)
LOC100190986 | METTL9 (-166237), NPIPB3 (-13482)
RAMP2-AS1 | RAMP2 (-2709)
PWARSN | SNURF (+27904), UBE3A (+425756)
ZNF674-AS1 | ZNF674 (-1541)
LOC100128573 | PEX11G (+15920), ARHGEF18 (+33411)
FLJ42627 | KCTD5 (-39920), PDPK1 (+104591)
LINC00999 | ZNF37A (+345813)
MIR600HG | GPR21 (+77958), STRBP (+156091)
LINC01558 | TCP10 (-393425), C6orf123 (+6160)
LINC00574 | C6orf70 (+44206), DLL1 (+403634)
PSORS1C3 | POU5F1 (-5124), HLA-C (+96269)
LINC00174 | KCTD7 (-352430), TPST1 (+183027)
LINC01554 | GLRX (-33468), ELL2 (+105889)
CDKN2B-AS1 | DMRTA1 (-388899), CDKN2B (-48579)
LINC00663 | ZNF14 (-33295), ZNF506 (+55359)
LOC143666 | PHRF1 (-1675)
LOC90784 | ST3GAL5 (-133028), POLR1A (+84113)
LINC00923 | ARRDC4 (-152176)
LOC100132111 | RORC (-9142), THEM5 (+12683)
LINC00265 | CDK13 (-185942), RALA (+140612)
LOC93429 | IGFL1 (-17213), IGFL2 (+64757)
THUMPD3-AS1 | THUMPD3 (+30119), LHFPL4 (+160631)
ESRG | LRTM1 (+329463), CACNA2D3 (+475945)
LINC01126 | ZFP36L2 (-1424)
MGC27382 | PTGFR (-191542), GIPC2 (+253629)
MZF1-AS1 | UBE2M (-8031), MZF1 (+6584)
LINC00910 | ARL4D (-19588), TMEM106A (+92845)
LOC100272217 | FUBP3 (-1184)
MAN1B1-AS1 | MAN1B1 (-1046)
LINC00893 | IDS (-28345), CXorf40A (-6965)
LINC00894 | MAMLD1 (-385794), MAGEA8 (+135951)
TAPT1-AS1 | TAPT1 (-15884), LDB2 (+656384)
MIR99AHG | USP25 (+610124)
LOC284412 | HKR1 (-67204), ZNF383 (+49194)
LINC01347 | PLD5 (-554511), CEP170 (+176021)
LOC643406 | PROKR2 (-157433), GPCPD1 (+136861)
TLR8-AS1 | TMSB4X (-52052), TLR8 (+16438)
SLC25A25-AS1 | PTGES2-AS1 (-13381), SLC25A25 (+46751)
DKFZP434I0714 | FBXW7 (-1662)
WAC-AS1 | MPP7 (-245993), WAC (-7284)
FAM83H-AS1 | FAM83H (-6437), SCRIB (+75141)
LOC283177 | B3GAT1 (-78718)
LINC00514 | CLDN9 (-20675), PKMYT1 (-11242)
LINC01252 | PRB2 (-160650), ETV6 (-93639)
LINC00921 | ZNF263 (-17276), MEFV (-9040)
LOC100132077 | HIATL1 (-27839), ZNF169 (+87401)
LINC00842 | NPY4R (+40393), ANXA8L1 (+50113)
FLJ22763 | MORC1 (-25267), DPPA2 (+173108)
LINC01502 | PAEP (+19260), GLT6D1 (+58522)
LINC00885 | TFRC (-69574), ZDHHC19 (+59632)
CCDC18-AS1 | TMED5 (-147232), DR1 (-17928)
LOC257396 | MOCS2 (-2730)
DNAJC27-AS1 | DNAJC27 (-33809), POMC (+163000)
LINC01270 | PTPN1 (-206535), CEBPB (+112980)
LINC00959 | EBF3 (-123516), GLRX3 (-49042)
LINC01160 | ADORA3 (-100174), RAP1A (-16121)
LINC00939 | TMEM132B (+644415)
LINC01530 | ENSG00000167765 (+1299), ZNF175 (+21783)
LINC00997 | FKBP9 (-196800), AVL9 (+265116)
LOC728752 | ZNF566 (-772)
LOC100129917 | CPLX1 (+45200), PCGF3 (+75232)
LOC100289230 | CHD1 (-3535)
LOC644656 | ZNF143 (-838)
LINC01061 | FABP2 (-85701), PDE5A (+220900)
LHX4-AS1 | LHX4 (+22389), ACBD6 (+250279)
LINC01963 | XRCC5 (+111076), MARCH4 (+153487)
PCAT18 | KCTD1 (-146194), AQP4 (+170189)
SCARNA17 | ACAA2 (-273)
LOC100506730 | AKR7A3 (-5241), AKR7A2 (+17655)
LOC100505918 | TBX19 (+130382), XCL2 (+132575)
LOC728730 | TMEM178A (-146539), MAP4K3 (-82067)
TMCC1-AS1 | TRH (-72914), TMCC1 (-20925)
ARHGEF26-AS1 | ARHGEF26 (+20111), DHX36 (+183383)
LINC00482 | SLC38A10 (-10731), TMEM105 (+24638)
LINC00665 | ZNF565 (-107308), ZFP14 (+56810)
LINC01125 | ZAP70 (-27156), ACTR1B (-22297)
STPG3-AS1 | NELFB (-3315)
LOC284865 | RTN4R (+66707), ZDHHC8 (+69876)
LINC00672 | LASP1 (+57417), PLXDC1 (+224373)
LINC01426 | CLIC6 (+95957), RUNX1 (+283996)
LOC401320 | GGCT (-58224), GARS (-31613)
LINC00641 | OR5AU1 (-47426), HNRNPC (+65990)
RBM26-AS1 | NDFIP2 (-65831), RBM26 (-9533)
KCNQ1OT1 | KCNQ1 (+202059), CDKN1C (+238831)
LINC01089 | SETD1B (-5357), RHOF (-5113)
LOC286437 | H2BFWT (-2122)
LINC01004 | KMT2E (-27723), LHFPL3 (+657799)
ERVK13-1 | KCTD5 (-16561), PDPK1 (+127950)
LOC157273 | TNKS (-225849), PPP1R3B (-179369)
SMG7-AS1 | NMNAT2 (-47827), SMG7 (-6074)
LINC00261 | FOXA2 (+14865), PAX1 (+863939)
ENTPD3-AS1 | RPL14 (-37095), ENTPD3 (+33089)
LINC00941 | TSPAN11 (-127762), CAPRIN2 (-44682)
PRICKLE2-AS1 | PSMD6 (-123149), PRICKLE2 (+78746)
LINC00864 | GLUD1 (-307271), MINPP1 (-102738)
LINC01146 | GPR65 (+50823), KCNK10 (+267302)
LOC284581 | PM20D1 (-28966), SLC26A9 (+64377)
MIRLET7DHG | ZNF169 (-68743), PTPDC1 (+106104)
MGC27345 | RBM28 (+41185), LEP (+61440)
LOC171391 | PDDC1 (-2620)
LOC728040 | AFM (+36985), RASSF6 (+101963)
ZNRF3-AS1 | KREMEN1 (-102491), ZNRF3 (+87035)
UGDH-AS1 | UGDH (-55732), SMIM14 (+55740)
LOC100506688 | BRD9 (-100001), NKD2 (-16004)
PTCSC3 | MBIP (+197137), BRMS1L (+297221)

^A Neighboring refers to mRNAs whose transcription start site is within 1,000 kb of the lncRNA. Plus means upstream while minus means downstream.
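For readers who want to work with these neighbor lists programmatically, the following is a minimal Python sketch, not the method used to build the table. It assumes each neighbor cell has the form "SYMBOL (+distance), SYMBOL (-distance)" or the literal "NONE", and it applies the footnote's conventions (plus means upstream, minus means downstream, neighborhood of 1,000 kb). The names `parse_neighbors`, `direction`, `within_window`, and `WINDOW_BP` are hypothetical helpers introduced only for illustration.

```python
import re
from typing import List, Tuple

# Assumption: 1,000 kb neighborhood, taken from the table footnote.
WINDOW_BP = 1_000_000

# One "SYMBOL (+distance)" or "SYMBOL (-distance)" item inside a table cell.
_ENTRY = re.compile(r"([^,()]+?)\s*\(([+-]\d+)\)")


def parse_neighbors(cell: str) -> List[Tuple[str, int]]:
    """Turn a neighbor cell into (gene symbol, signed distance in bp) pairs."""
    if cell.strip().upper() == "NONE":
        return []  # lncRNA with no protein-coding neighbor listed
    return [(m.group(1).strip(), int(m.group(2))) for m in _ENTRY.finditer(cell)]


def direction(distance_bp: int) -> str:
    """Footnote convention: plus is upstream, minus is downstream.

    The tables list no zero distances, so zero is not treated specially here.
    """
    return "upstream" if distance_bp > 0 else "downstream"


def within_window(distance_bp: int, window_bp: int = WINDOW_BP) -> bool:
    """True when the gene lies inside the neighborhood window."""
    return abs(distance_bp) <= window_bp


if __name__ == "__main__":
    # Example cell copied from the RPL23AP7 row of the table.
    for gene, dist in parse_neighbors("RABL2A (-8041), FOXD4L1 (+120104)"):
        print(gene, dist, direction(dist), within_window(dist))
```

Keeping the distance as a signed integer preserves both pieces of information in the table (direction and separation), so filtering by a smaller window only requires changing `window_bp`.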


Supplement 2

100 µM DEET plus 10 µM Fipronil lncRNAs and Neighboring^A Protein-Coding Genes

Gene symbol (lncRNA) | Gene symbol (protein-coding gene) and distance (in bp) to lncRNA
RPL23AP7 | RABL2A (-8041), FOXD4L1 (+120104)
CYP2B7P | CYP2A7 (-54710), CYP2B6 (-53837)
MT1L | MT1E (-7336), MT2A (+9940)
SMG1P5 | NPIPB13 (-55682), CD2BP2 (+53878)
TEKT4P2 | NONE
STAG3L4 | TYW1 (+315267)
CIDECP | FANCD2 (-4570)
MGC72080 | ASNS (-96919), OCM2 (+20735)
SNX29P2 | NPIPB11 (+70356), LAT (+348847)
GBP1P1 | LRRC8B (-108597), GBP6 (+52248)
RP9P | FKBP9 (-27413), AVL9 (+434503)
GTF2IP4 | STAG3L3 (-118930), NSUN5 (+127639)
LOC728554 | PROP1 (+116478), B4GALT7 (+279664)
WASH3P | OR4F4 (-46858)
WASH2P | RABL2A (-33544), FOXD4L1 (+94601)
RASA4CP | DBNL (-8120), UBE2D4 (+110147)
FAM86DP | CNTN3 (-907193), FRG2C (-235997)
FAM86FP | CLEC6A (-218929), FAM90A1 (-9379)
MSL3P1 | TRPM8 (-50471), HJURP (-12360)
FBLL1 | FBLL1 (+989)
RPSAP58 | ZNF726 (-119325), RPSAP58 (+32551)
FGF7P6 | NONE
RPL23AP82 | RABL2B (+5281), ACR (+40165)
GOLGA2P7 | ZSCAN2 (-264476), ADAMTSL3 (+556922)
ROCK1P1 | TUBB8P12 (-66086), USP14 (-42965)
SERPINB9P1 | SERPINB1 (-23577), SERPINB9 (+37697)
LOC654342 | NONE
PMS2P4 | TYW1 (+292471)
TREML3P | TREML4 (-15074), TREML2 (-12056)
BMS1P20 | VPREB1 (+65806), ZNF280B (+198612)
PI4KAP2 | UBE2L3 (-54203), HIC2 (+52784)
FKBP9P1 | VOPP1 (-120295), SEPT14 (+169969)
EPB41L4A-AS1 | NREP (-184582), EPB41L4A (+257803)
UCA1 | OR10H1 (-24508), CYP4F2 (+65486)
MIR4435-2HG | BCL2L11 (+310135), ANAPC1 (+453626)
ZMIZ1-AS1 | ZMIZ1 (-63648), RPS24 (+971626)
UBA6-AS1 | UBA6 (-10712), GNRHR (+42469)
LINC00998 | LINC00998 (+963)
DANCR | USP46 (-53961), ERVMER34-1 (+38344)
NCBP2-AS2 | NCBP2 (-721)
LINC00467 | TRAF5 (+80808), RD3 (+85272)
SVIL-AS1 | LYZL1 (+126910), SVIL (+319830)
URB1-AS1 | URB1 (-519)
MGC12916 | HS3ST3B1 (+22599), PMP22 (+938907)
DRAIC | RPLP1 (+113796), TLE3 (+531596)
PRR34-AS1 | PPARA (-96129), WNT7B (-77361)
LINC00116 | NPHP1 (-12192), LIMS4 (+255582)
LOC153910 | GPR126 (+280243), HIVEP2 (+363029)
LINC00863 | MINPP1 (-213755), GLUD1 (-196254)
LOC284344 | PSG4 (-24444), PSG9 (+39310)
LINC00673 | SOX9 (+377042), SLC39A11 (+594624)
LOC100294145 | HLA-DMB (+42103), PSMB9 (+44806)
SNHG16 | PRCD (+21466), ST6GALNAC2 (+24572)
LOC100506844 | CTDSP2 (-87067), XRCC6BP1 (-7771)
SNHG9 | RNF151 (-1624), RPS2 (-390)
SNHG5 | SYNCRIP (-34585)
LINC00888 | MCF2L2 (-23535), KLHL6 (+103879)
HCG26 | MICB (-26297), MICA (+68239)
CYTOR | PLGLB1 (-539027), PLGLB2 (-259604)
OSER1-AS1 | JPH2 (-30915), FITM2 (+92676)
LOC100507389 | PCOLCE2 (-45402), PAQR9 (+28731)
SNHG8 | PRSS12 (+73711), NDST3 (+244947)
ZFAS1 | ZNFX1 (+10639), DDX27 (+48233)
SNHG15 | CCM2 (-42182), MYO1G (-5746)
GAS5 | ZBTB37 (-2738)
SNHG6 | MCMDC2 (+27396), TCF24 (+64445)
LYPLAL1-AS1 | LYPLAL1 (-46463), TGFB2 (+781146)
HCP5 | MICB (-33621), MICA (+60915)
HEIH | MGAT1 (-23011), ZFP62 (+28437)
H19 | MRPL23 (+49227), IGF2 (+144733)
SH3BP5-AS1 | SH3BP5 (+34840), CAPN7 (+91546)
LINC01619 | DCN (-882312), BTG1 (+80461)
LINC01587 | C4orf6 (+1322), EVC2 (+182089)
EXOC3-AS1 | EXOC3 (-823), C5orf55 (+808)
RCHY1 | THAP6 (-17740), PARM1 (+563617)
HLA-DRB6 | HLA-DRB5 (-26070), HLA-DRB1 (+33491)
HLA-H | HLA-A (-51918), HLA-G (+62363)
TPTEP1 | CCT8L2 (-32560), XKR3 (+196329)
CMAHP | FAM65B (-198762), LRRC16A (-169699)
PGM5P2 | FOXD4L6 (+88155)
SCART1 | CYP2E1 (-59218), MTG1 (+67017)
RPL32P3 | H1FX (-74859), EFCAB12 (+37515)
LOC220729 | BDH1 (-64967), KIAA0226 (+115765)
CES1P1 | CES1 (+65402), SLC6A2 (+111112)
EP400NL | DDX51 (+39024), EP400 (+155348)
GUSBP2 | ZNF322 (-221819), HIST1H2BJ (+218730)
AFG3L1P | CENPBD1 (-12747), DBNDD1 (+24840)
ZNF37BP | ZNF33B (+105372)
PFN1P2 | PPIAL4B (-247525), NBPF9 (-199977)
LOC646214 | OR4M2 (-431852)
LRRC37A4P | CRHR1 (-271342), PLEKHM1 (-22459)
HERC2P4 | ZNF267 (+278162), TP53TG3 (+524812)
DPY19L2P2 | NAPEPLD (-78558), PMPCB (-69682)
PI4KAP1 | GGTLC3 (-23416), USP41 (+353835)
ABCC6P1 | NOMO2 (-22660), RPS15A (+205568)
PDXDC2P | PDPR (-92503), CLEC18A (+69943)
ARHGAP27P1-BPTFP1-KPNA2P3 | SMURF2 (-103762), LRRC37A3 (+152955)
LOC202181 | B4GALT7 (+45288), PROP1 (+350854)
UBE2Q2P1 | ZSCAN2 (-47317), ADAMTSL3 (+774081)
CROCCP3 | NECAP2 (+39332), NBPF1 (+133419)
CLUHP3 | ZNF720 (-9216), AHSP (+176154)
AKR1C8P | AKR1CL1 (+15194), AKR1C3 (+75917)
ANKRD20A9P | TUBA3C (+328666)
PMS2P9 | POMZP3 (-419018), FGL2 (+153567)
SMA4 | SERF1A (-691868), SMN2 (+159207)
GOLGA8IP | GOLGA8S (-341001), GOLGA8I (+3750)
SPDYE7P | CALN1 (-459128), POM121 (-13450)
FAHD2CP | ANKRD36C (-30972), GPAT2 (+13209)
PPIEL | BMP8A (+34026), PABPC4 (+50729)
AQP7P1 | ANKRD20A1 (-646908)
HLA-J | ZNRD1 (-30157), HLA-A (+89837)
ANKRD36BP1 | TBX19 (-34535), SFT2D2 (+20497)
APOC1P1 | APOC4 (-13144), APOC1 (+14847)
MTMR9LP | LCK (-14554), EIF3I (+14315)
GOLGA2P5 | ANKS1B (-180633), ACTR6 (-35217)
ESPNP | CROCC (-216263), NBPF1 (-92200)
WHAMMP2 | APBA2 (-138002), GOLGA8M (-35649)
BMS1P4 | SEC24C (-29559), AGAP5 (-17036)
RPS10P7 | ENSG00000269690_no gene symbol (-102637), CSRP1 (-10792)
CROCCP2 | CROCC (-297369), NBPF1 (-11094)
LOC728989 | NBPF12 (+128691), PRKAB2 (+141376)
MST1P2 | CROCC (-273953), NBPF1 (-34510)
BEND3P3 | SFTPA1 (+74989), SFTPD (+263169)
FAM86B3P | SGK223 (+149769), ZNF705B (+293095)
FAM45BP | RBMX2 (+94225), ENOX2 (+407040)
SEPT7P2 | IGFBP1 (-141955), ADCY1 (+171880)
FAM86EP | ADRA2C (+182283), OTOP1 (+278208)
CEACAM22P | IGSF23 (-66343), ZNF180 (-46023)
LOC155060 | ZNF783 (+29125), ZNF777 (+169827)
ADAM1A | MAPKAPK5 (+57856), TMEM116 (+112629)
CCDC144B | TBC1D28 (+60715), LGALS9C (+104934)
MT1JP | MT1A (-2254)
AZGP1P1 | ZKSCAN1 (-33082), AZGP1 (-6342)
LOC730102 | RASAL2 (-72068), SEC16B (-52158)
RPL23AP87 | METRNL (+144052)
BTN2A3P | BTN3A3 (-14483), BTN3A1 (+23752)
FAM153C | N4BP3 (-85417), PROP1 (-31784)
YY1P2 | NXPH2 (-117901)
DNM1P41 | ZSCAN2 (-91294), ADAMTSL3 (+730104)
HLA-L | TRIM26 (-77082), TRIM39 (-36396)
AKR7L | EMC1 (-18476), AKR7A3 (+19222)
ZNF767P | KRBA1 (-129085), ZNF746 (-88165)
GOLGA8T | GOLGA8T (+6340), CHRFAM7A (+252060)
CTSLP2 | ASAH2C (-102299), AGAP9 (-58519)
GTF2H2B | SERF1A (-467821), SMN2 (+383254)
HERC2P7 | GOLGA8S (-207661), GOLGA8I (+137090)
TMEM198B | DNAJC14 (+279)
HTATSF1P2 | RIPK1 (-45961), NQO2 (+22489)
LOC100130075 | MDM2 (-3211)
GCSHP3 | INO80D (-29890), NDUFS1 (+43122)
WASH5P | OR4F17 (-41513)
GLUD1P3 | AGAP5 (-35779), SEC24C (-10816)
ANKRD20A12P | NONE
GOLGA8S | MKRN3 (-203771), GOLGA8S (+6690)
LOC100132057 | PPIAL4G (+67226)
GUSBP11 | IGLL1 (-97647), RGL4 (-12906)
ALG1L9P | KRTAP5-11 (-217810), DEFB108B (-32515)
PRKXP1 | ASB7 (-49017), CERS3 (-8797)
SMG1P7 | EXOSC6 (+29123), CLEC18C (+48782)
GUSBP9 | SERF1A (-367305), SMN2 (+483770)
FRG1HP | FOXD4L6 (+464748), ANKRD20A1 (+810695)
LOC100131257 | CCZ1B (-259508), C1GALT1 (-96337)
FRG1JP | ANKRD20A1 (+514318), FOXD4L6 (+761125)
URAHP | C16orf3 (-5290), URAHP (+12582)
GUSBP3 | SERF1B (-350315), GTF2H2C (+114712)
PRORSD1P | MTIF2 (-14216), CCDC88A (+136218)
C3P1 | RDH8 (+44497), ANGPTL6 (+45050)
LOC652276 | KCTD5 (-65536), PDPK1 (+78975)
DKFZP586I1420 | ZNRF2 (+87115), NOD1 (+107358)
MBL1P | SFTPD (+30754), SFTPA1 (+307404)
KCNQ1OT1 | KCNQ1 (+202059), CDKN1C (+238831)
MEG3 | RTL1 (+41282), DLK1 (+116738)
DIO3OS | DIO3 (-7402), LOC100288160 (+661021)
LINC01089 | SETD1B (-5357), RHOF (-5113)
MALAT1 | SCYL1 (-22962), FRMD8 (+115516)
LINC00261 | FOXA2 (+14865), PAX1 (+863939)
XIST | ZCCHC13 (-481711), CHIC1 (+259270)
KC6 | PIK3C3 (-454773)
SCARNA17 | ACAA2 (-273)
TSIX | ZCCHC13 (-481711), CHIC1 (+259270)
CDKN2B-AS1 | DMRTA1 (-388899), CDKN2B (-48579)
LOC441204 | SNX10 (+158004), SKAP2 (+414815)
A1BG-AS1 | A1BG (-1328)
LINC00294 | TCP11L1 (+38385), CSTF3 (+83663)
SYNE3 | CLMN (-88772), SYNE3 (+67158)
PSMG3-AS1 | ELFN1 (-108270), TMEM184A (-23419)
LINC00923 | ARRDC4 (-152176)
LOC100132111 | RORC (-9142), THEM5 (+12683)
PCAT18 | KCTD1 (-146194), AQP4 (+170189)
LINC00115 | OR4F16 (-140191), SAMD11 (-98874)
HNF1A-AS1 | SPPL3 (-66694), HNF1A (-7478)
LINC01000 | CALU (-88173), METTL2B (+174390)
PSMD5-AS1 | PSMD5 (-5723), PHF19 (+28621)
LINC01018 | NSUN2 (+47954), UBE2QL1 (+136714)
PTGES2-AS1 | PTGES2 (-1119)
MIR100HG | UBASH3B (-509593), BLID (-29867)
LINC00926 | CGNL1 (-72438), TCF12 (+385444)
LOC100128288 | KRBA2 (+12063), ODF4 (+19613)
NDUFB2-AS1 | NDUFB2 (+4320), BRAF (+223773)
LINC01134 | AJAP1 (-890616), C1orf174 (-7640)
LOC100190986 | METTL9 (-166237), NPIPB3 (-13482)
RAMP2-AS1 | RAMP2 (-2709)
HECTD2-AS1 | HECTD2 (+48866), PPP1R3C (+173843)
MRPL23-AS1 | MRPL23 (+39286), IGF2 (+154674)
LOC100128573 | PEX11G (+15920), ARHGEF18 (+33411)
FLJ42627 | KCTD5 (-39920), PDPK1 (+104591)
LINC00999 | ZNF37A (+345813)
KMT2E-AS1 | KMT2E (-1838)
LINC02035 | SEMA5B (+86670), DIRC2 (+94669)
MIR600HG | GPR21 (+77958), STRBP (+156091)
LINC01558 | TCP10 (-393425), C6orf123 (+6160)
LINC00240 | ZNF322 (-298282), HIST1H2BJ (+142267)
LINC00574 | C6orf70 (+44206), DLL1 (+403634)
HCG27 | POU5F1 (-30171), HLA-C (+71222)
PSORS1C3 | POU5F1 (-5124), HLA-C (+96269)
LINC00839 | ZNF33B (+153130)
BAIAP2-AS1 | BAIAP2 (-3236)
LINC00174 | KCTD7 (-352430), TPST1 (+183027)
PDCD4-AS1 | PDCD4 (-1910)
LINC01554 | GLRX (-33468), ELL2 (+105889)
LINC00663 | ZNF14 (-33295), ZNF506 (+55359)
LOC100130691 | NFE2L2 (-72968), AGPS (-54545)
LOC143666 | PHRF1 (-1675)
LINC01140 | LMO4 (-178984), RP5-1052I5.2 (+156475)
LINC00265 | CDK13 (-185942), RALA (+140612)
LOC93429 | IGFL1 (-17213), IGFL2 (+64757)
THUMPD3-AS1 | THUMPD3 (+30119), LHFPL4 (+160631)
FAM41C | OR4F16 (-185763), SAMD11 (-53302)
ESRG | LRTM1 (+329463), CACNA2D3 (+475945)
LINC01126 | ZFP36L2 (-1424)
LINC00671 | G6PC (-14094), AOC3 (+35520)
MGC27382 | PTGFR (-191542), GIPC2 (+253629)
MZF1-AS1 | UBE2M (-8031), MZF1 (+6584)
LINC00910 | ARL4D (-19588), TMEM106A (+92845)
LOC100272217 | FUBP3 (-1184)
MAN1B1-AS1 | MAN1B1 (-1046)
LINC00893 | IDS (-28345), CXorf40A (-6965)
LINC00894 | MAMLD1 (-385794), MAGEA8 (+135951)
TAPT1-AS1 | TAPT1 (-15884), LDB2 (+656384)
MIR99AHG | USP25 (+610124)
AFDN-AS1 | MLLT4 (-1648)
NEAT1 | SCYL1 (-100412), FRMD8 (+38066)
FTX | ZCCHC13 (-185241), CHIC1 (+555740)
ADORA2A-AS1 | UPB1 (-33872), ADORA2A (+33626)
LOC100289511 | SLC10A1 (+28145), SRSF5 (+42244)
LINC01347 | PLD5 (-554511), CEP170 (+176021)
LOC643406 | PROKR2 (-157433), GPCPD1 (+136861)
TLR8-AS1 | TMSB4X (-52052), TLR8 (+16438)
SLC25A21-AS1 | SLC25A21 (+247009), PAX9 (+263992)
SLC25A25-AS1 | PTGES2-AS1 (-13381), SLC25A25 (+46751)
LOC100422737 | C6orf203 (-149094), QRSL1 (+122860)
DKFZP434I0714 | FBXW7 (-1662)
FAM83H-AS1 | FAM83H (-6437), SCRIB (+75141)
LOC283177 | B3GAT1 (-78718)
LINC00514 | CLDN9 (-20675), PKMYT1 (-11242)
LINC01252 | PRB2 (-160650), ETV6 (-93639)
F11-AS1 | F11 (+117566), MTNR1A (+172056)
LINC00921 | ZNF263 (-17276), MEFV (-9040)
LOC100288069 | SAMD11 (-153962), OR4F16 (-85103)
LOC100132077 | HIATL1 (-27839), ZNF169 (+87401)
LINC00842 | NPY4R (+40393), ANXA8L1 (+50113)
LINC00885 | TFRC (-69574), ZDHHC19 (+59632)
CCDC18-AS1 | TMED5 (-147232), DR1 (-17928)
DNAJC27-AS1 | DNAJC27 (-33809), POMC (+163000)
NDUFA6-AS1 | NDUFA6 (-17186), CYP2D6 (+22763)
LINC01270 | PTPN1 (-206535), CEBPB (+112980)
LINC00959 | EBF3 (-123516), GLRX3 (-49042)
LINC01160 | ADORA3 (-100174), RAP1A (-16121)
LINC00939 | TMEM132B (+644415)
LINC01530 | AC018755.1 (+1299), ZNF175 (+21783)
LINC00997 | FKBP9 (-196800), AVL9 (+265116)
LOC728752 | ZNF566 (-772)
LOC100129917 | CPLX1 (+45200), PCGF3 (+75232)
LOC100289230 | CHD1 (-3535)
LOC644656 | ZNF143 (-838)
LINC01061 | FABP2 (-85701), PDE5A (+220900)
LHX4-AS1 | LHX4 (+22389), ACBD6 (+250279)
LINC01963 | XRCC5 (+111076), MARCH4 (+153487)
LOC100506730 | AKR7A3 (-5241), AKR7A2 (+17655)
LOC100505918 | TBX19 (+130382), XCL2 (+132575)
LOC653160 | DLGAP3 (-47617), ZMYM6NB (+8151)
LOC728730 | TMEM178A (-146539), MAP4K3 (-82067)
TMCC1-AS1 | TRH (-72914), TMCC1 (-20925)
LINC00482 | SLC38A10 (-10731), TMEM105 (+24638)
PITRM1-AS1 | PITRM1 (+17527), PFKP (+87764)
LINC01125 | ZAP70 (-27156), ACTR1B (-22297)
STPG3-AS1 | NELFB (-3315)
LOC100507387 | THOC3 (-154058), SIMC1 (-116005)
LOC100506746 | SLC10A6 (-80608), AFF1 (-5130)
LINC00672 | LASP1 (+57417), PLXDC1 (+224373)
LINC00311 | GSE1 (-327698), KIAA0513 (+222306)
LINC01268 | MARCKS (+13304), HDAC2 (+100604)
LINC01426 | CLIC6 (+95957), RUNX1 (+283996)
LOC401320 | GGCT (-58224), GARS (-31613)
LINC00641 | OR5AU1 (-47426), HNRNPC (+65990)
RBM26-AS1 | NDFIP2 (-65831), RBM26 (-9533)
LOC286437 | H2BFWT (-2122)
LINC01004 | KMT2E (-27723), LHFPL3 (+657799)
LOC100130899 | TNRC6B (-143471), GRAP2 (+133372)
LINC01569 | SRL (-7727), TFAP4 (+23268)
ERVK13-1 | KCTD5 (-16561), PDPK1 (+127950)
LOC157273 | TNKS (-225849), PPP1R3B (-179369)
LOC100287015 | MCPH1 (-1540)
RAD51-AS1 | RAD51 (-744)
SMG7-AS1 | NMNAT2 (-47827), SMG7 (-6074)
LOC100506990 | DEFB130 (-232996), LONRF1 (+204178)
ENTPD3-AS1 | RPL14 (-37095), ENTPD3 (+33089)
LOC728175 | ENPP6 (-129543), IRF2 (+127077)
PCAT19 | CEACAM21 (-99287), ATP5SL (-37870)
SNAP25-AS1 | SNAP25 (-53215), ANKEF1 (+130566)
GAS6-AS1 | GAS6 (+24226), TMEM255B (+80598)
PRICKLE2-AS1 | PSMD6 (-123149), PRICKLE2 (+78746)
LINC00864 | GLUD1 (-307271), MINPP1 (-102738)
LINC01146 | GPR65 (+50823), KCNK10 (+267302)
LOC284581 | PM20D1 (-28966), SLC26A9 (+64377)
MIRLET7DHG | ZNF169 (-68743), PTPDC1 (+106104)
MGC27345 | RBM28 (+41185), LEP (+61440)
NIPBL-AS1 | NIPBL (-2732)
LOC100130744 | ANKH (+83489), FAM105B (+123541)
LOC171391 | PDDC1 (-2620)
LOC728040 | AFM (+36985), RASSF6 (+101963)
PRICKLE2-AS3 | PSMD6 (-123149), PRICKLE2 (+78746)
LINC00854 | ARL4D (-99078), TMEM106A (+13355)
LOC729732 | DEFB130 (-232996), LONRF1 (+204178)
UGDH-AS1 | UGDH (-55732), SMIM14 (+55740)
ASB16-AS1 | TMUB2 (-8338), ASB16 (+8005)
PTCSC3 | MBIP (+197137), BRMS1L (+297221)
TMEM9B-AS1 | TMEM9B (-5706), NRIP3 (+33570)
TMEM220-AS1 | TMEM220 (-33927), PIRT (+74202)
LOC100506688 | BRD9 (-100001), NKD2 (-16004)
LINC01750 | CTTNBP2NL (-401477), KCND3 (-5549)

^A Neighboring refers to mRNAs whose transcription start site is within 1,000 kb of the lncRNA. Plus means upstream while minus means downstream.

Supplement 3

No. | Biological Process GO-slim Annotation (10 µM Fipronil) | Hits
1 | cellular process (GO:0009987) | 192
2 | metabolic process (GO:0008152) | 185
3 | response to stimulus (GO:0050896) | 53
4 | localization (GO:0051179) | 44
5 | biological regulation (GO:0065007) | 41
6 | developmental process (GO:0032502) | 41
7 | multicellular organismal process (GO:0032501) | 37
8 | cellular component organization or biogenesis (GO:0071840) | 29
9 | immune system process (GO:0002376) | 29
10 | reproduction (GO:0000003) | 8
11 | biological adhesion (GO:0022610) | 7
12 | locomotion (GO:0040011) | 3
13 | cell killing (GO:0001906) | 2
14 | rhythmic process (GO:0048511) | 1

Supplement 4

No. | PANTHER Signaling Pathway (10 µM Fipronil) | Hits
1 | Ubiquitin proteasome pathway (P00060) | 7
2 | Gonadotropin-releasing hormone receptor pathway (P06664) | 6
3 | Angiogenesis (P00005) | 5
4 | Wnt signaling pathway (P00057) | 5
5 | T cell activation (P00053) | 5
6 | Integrin signaling pathway (P00034) | 4
7 | Heterotrimeric G-protein signaling pathway-Gq alpha and Go alpha mediated pathway (P00027) | 4
8 | TGF-beta signaling pathway (P00052) | 4
9 | Apoptosis signaling pathway (P00006) | 3
10 | Interleukin signaling pathway (P00036) | 3
11 | Insulin/IGF pathway-protein kinase B signaling cascade (P00033) | 3
12 | Inflammation mediated by chemokine and cytokine signaling pathway (P00031) | 3
13 | p53 pathway (P00059) | 3
14 | Heterotrimeric G-protein signaling pathway-Gi alpha and Gs alpha mediated pathway (P00026) | 3
15 | Ras Pathway (P04393) | 3
16 | EGF receptor signaling pathway (P00018) | 3
17 | Parkinson disease (P00049) | 3
18 | Notch signaling pathway (P00045) | 3
19 | CCKR signaling map (P06959) | 3
20 | Huntington disease (P00029) | 2
21 | p53 pathway feedback loops 2 (P04398) | 2
22 | VEGF signaling pathway (P00056) | 2
23 | FGF signaling pathway (P00021) | 2
24 | P53 pathway feedback loops 1 (P04392) | 2
25 | PDGF signaling pathway (P00047) | 2
26 | Nicotinic acetylcholine receptor signaling pathway (P00044) | 2
27 | De novo purine biosynthesis (P02738) | 1
28 | Adrenaline and noradrenaline biosynthesis (P00001) | 1
29 | Insulin/IGF pathway-mitogen activated protein kinase kinase/MAP kinase cascade (P00032) | 1
30 | Hypoxia response via HIF activation (P00030) | 1
31 | Asparagine and aspartate biosynthesis (P02730) | 1
32 | Nicotine pharmacodynamics pathway (P06587) | 1
33 | p53 pathway by glucose deprivation (P04397) | 1
34 | Thyrotropin-releasing hormone receptor signaling pathway (P04394) | 1
35 | Toll receptor signaling pathway (P00054) | 1
36 | General transcription by RNA polymerase I (P00022) | 1
37 | Bupropion degradation (P05729) | 1
38 | PI3 kinase pathway (P00048) | 1
39 | Opioid proopiomelanocortin pathway (P05917) | 1
40 | Nicotine degradation (P05914) | 1
41 | Glutamine glutamate conversion (P02745) | 1
42 | Cadherin signaling pathway (P00012) | 1
43 | Dopamine receptor mediated signaling pathway (P05912) | 1
44 | B cell activation (P00010) | 1
45 | Corticotropin releasing factor receptor signaling pathway (P04380) | 1

Supplement 5

No. | Biological Process GO-slim Annotation (100 µM DEET + 10 µM Fipronil) | Hits
1 | cellular process (GO:0009987) | 223
2 | metabolic process (GO:0008152) | 205
3 | response to stimulus (GO:0050896) | 67
4 | localization (GO:0051179) | 59
5 | developmental process (GO:0032502) | 53
6 | biological regulation (GO:0065007) | 49
7 | immune system process (GO:0002376) | 45
8 | multicellular organismal process (GO:0032501) | 38
9 | cellular component organization or biogenesis (GO:0071840) | 37
10 | biological adhesion (GO:0022610) | 11
11 | reproduction (GO:0000003) | 9
12 | locomotion (GO:0040011) | 5
13 | rhythmic process (GO:0048511) | 2
14 | cell killing (GO:0001906) | 2

Supplement 6

No. PANTHER Signaling Pathway (100 µM DEET+ 10 µM Fipronil) Hits 1 Heterotrimeric G-protein signaling pathway-Gi alpha and Gs alpha 7 mediated pathway (P00026) 2 T cell activation (P00053) 7 3 Gonadotropin-releasing hormone receptor pathway (P06664) 7 4 Ubiquitin proteasome pathway (P00060) 6 5 Angiogenesis (P00005) 5 6 Integrin signaling pathway (P00034) 5 7 Inflammation mediated by chemokine and cytokine signaling pathway 5 (P00031) 8 p53 pathway (P00059) 5 9 Heterotrimeric G-protein signaling pathway-Gq alpha and Go alpha 5 mediated pathway (P00027) 10 Wnt signaling pathway (P00057) 5 11 Insulin/IGF pathway-protein kinase B signaling cascade (P00033) 4 12 Ras Pathway (P04393) 4 13 TGF-beta signaling pathway (P00052) 4 14 EGF receptor signaling pathway (P00018) 4 15 PDGF signaling pathway (P00047) 4 16 Nicotinic acetylcholine receptor signaling pathway (P00044) 4 17 CCKR signaling map (P06959) 4 18 Apoptosis signaling pathway (P00006) 3 19 p53 pathway feedback loops 2 (P04398) 3 20 Parkinson disease (P00049) 3 21 Notch signaling pathway (P00045) 3 22 Corticotropin releasing factor receptor signaling pathway (P04380) 3 23 Interleukin signaling pathway (P00036) 2 24 Alzheimer disease-presenilin pathway (P00004) 2 164

25 Alzheimer disease-amyloid secretase pathway (P00003) 2 26 Alpha adrenergic receptor signaling pathway (P00002) 2 27 Phenylethylamine degradation (P02766) 2 28 Adrenaline and noradrenaline biosynthesis (P00001) 2 29 Hypoxia response via HIF activation (P00030) 2 30 Huntington disease (P00029) 2 31 VEGF signaling pathway (P00056) 2 32 Thyrotropin-releasing hormone receptor signaling pathway (P04394) 2 33 P53 pathway feedback loops 1 (P04392) 2 34 Endothelin signaling pathway (P00019) 2 35 PI3 kinase pathway (P00048) 2 36 Opioid proopiomelanocortin pathway (P05917) 2 37 Cadherin signaling pathway (P00012) 2 38 Dopamine receptor mediated signaling pathway (P05912) 2 39 Beta3 adrenergic receptor signaling pathway (P04379) 1 40 Metabotropic glutamate receptor group III pathway (P00039) 1 41 Beta2 adrenergic receptor signaling pathway (P04378) 1 42 Beta1 adrenergic receptor signaling pathway (P04377) 1 43 Ionotropic glutamate receptor pathway (P00037) 1 44 5HT4 type receptor mediated signaling pathway (P04376) 1 45 De novo purine biosynthesis (P02738) 1 46 5HT3 type receptor mediated signaling pathway (P04375) 1 47 5HT2 type receptor mediated signaling pathway (P04374) 1 48 5HT1 type receptor mediated signaling pathway (P04373) 1 49 Insulin/IGF pathway-mitogen activated protein kinase kinase/MAP 1 kinase cascade (P00032) 50 Asparagine and aspartate biosynthesis (P02730) 1 51 Nicotine pharmacodynamics pathway (P06587) 1 52 Synaptic vesicle trafficking (P05734) 1

165

53 GABA-B receptor II signaling (P05731) 1 54 p53 pathway by glucose deprivation (P04397) 1 55 Toll receptor signaling pathway (P00054) 1 56 FGF signaling pathway (P00021) 1 57 Oxytocin receptor mediated signaling pathway (P04391) 1 58 Bupropion degradation (P05729) 1 59 Opioid prodynorphin pathway (P05916) 1 60 Opioid proenkephalin pathway (P05915) 1 61 Nicotine degradation (P05914) 1 62 Glutamine glutamate conversion (P02745) 1 63 Muscarinic acetylcholine receptor 2 and 4 signaling pathway (P00043) 1 64 Blood coagulation (P00011) 1 65 Muscarinic acetylcholine receptor 1 and 3 signaling pathway (P00042) 1 66 B cell activation (P00010) 1 67 Metabotropic glutamate receptor group II pathway (P00040) 1 68 Pyrimidine Metabolism (P02771) 1


─── CHAPTER 3 ───

Infrared Light Detection by the Haller’s Organ of Adult American Dog Ticks, Dermacentor variabilis (Ixodida: Ixodidae)

Robert D. Mitchell,a Jiwei Zhu,a Ann L. Carr,a Anirudh Dhammi,a Grayson Cave,a Daniel E. Sonenshine,b and R. Michael Roea,* aDepartment of Entomology and Plant Pathology, Campus Box 7647, 3230 Ligon Street, North Carolina State University, Raleigh, NC 27695-7647 USA bDepartment of Biological Sciences, Old Dominion University, Norfolk, VA 23529 USA ([email protected]), *Corresponding author, e-mail: [email protected]

E-mail addresses: [email protected] (R.D. Mitchell), [email protected] (J. Zhu), [email protected] (A. L. Carr), [email protected] (A. Dhammi), [email protected] (G. Cave), [email protected] (R. M. Roe)

Abbreviations: HO = Haller’s organ, ADT = American dog tick, Oc = ocelli, AP = anterior pit, Cp = capsule, IR = infrared, TRPA1 = transient receptor potential cation channel, subfamily A, member 1, SEM = scanning electron microscopy

This chapter was formatted for the journal Ticks and Tick-Borne Diseases.


Abstract

The Haller’s organ (HO), unique to ticks and mites, is found only on the first tarsus of the front pair of legs. The organ has an unusual morphology consisting of an anterior pit (AP) with protruding sensilla and a posterior capsule (Cp). The current thinking is that the HO’s main function is chemosensation analogous to the insect antennae, but the functionality of its atypical structure (exclusive to the Acari) is unexplained. We provide the first evidence that the HO allows the American dog tick (ADT), Dermacentor variabilis, to respond to infrared (IR) light. Unfed D. variabilis adults with their HOs present were positively phototactic to IR. However, when the HOs were removed, no IR response was detected. Ticks in these experiments were also attracted to white light with and without the HOs, but were only positively phototactic to white light when the ocelli (primitive eyes) were unobstructed. Covering the eyes did not prevent IR attraction. A TRPA1 receptor was characterized from a D. variabilis-specific HO transcriptome we constructed. This receptor was homologous to transient receptor potential cation channel, subfamily A, member 1 (TRPA1) from the pit organ of the pit viper, python, and boa families of snakes, the only receptor identified so far for IR detection. HO scanning electron microscopy (SEM) studies in ADT showed the AP and Cp but also novel structures not previously described; the potential role of these structures in IR detection is discussed. The ability of ticks to use IR for host finding is consistent with their obligatory hematophagy and has practical applications in tick trapping and the development of new repellents.

Keywords: American dog tick, Dermacentor variabilis, Haller’s organ, Infrared, TRPA1, Light


1. Introduction

Ticks are responsible for transmitting the majority of arthropod vector-borne disease agents in the U.S., and the incidence of tick-borne disease is on the rise because of globalization, population growth, people moving to rural areas, and climate change [1, 2]. The American dog tick (ADT), Dermacentor variabilis, the focus of this study, is a hard tick in the family Ixodidae that lives a non-nidicolous lifestyle in North and South America. ADT vectors the causative agent (the bacterium Rickettsia rickettsii) of Rocky Mountain spotted fever (RMSF) as well as other serious human pathogens. Transmission occurs from the tick to humans during blood feeding; once established in the host, RMSF can cause severe headaches, nausea, vomiting, and death if not treated within approximately the first week of symptom onset [3]. The number of reported cases of RMSF in the U.S. has been on the rise, with fewer than 500 cases reported in 1993 and 2500 cases in 2008 [4].

Understanding how ticks find their host is of utmost importance to disease prevention and personal protection from tick bites. Early morphological observations, generally made with light microscopy in the late 19th and early 20th century, revealed an abundance of sensory sensilla scattered across the surface of ticks as well as patches localized to specific areas of the palps and tarsus I of each foreleg. General differences in morphology and orientation were noticed, but the examination of fine detail was limited by the microscopic capabilities of the day [5, 6]. Work in the 1970s, fueled by the advent of electron microscopy and improved electrophysiology, further advanced our characterization of the different sensillar types associated with these regions. The foretarsal sensory organ, commonly referred to as the Haller’s organ (HO) in ticks, is a sensilla-rich structure thought to be mostly chemosensory. The HO is unique to the Acari and not found in any other animals [7-9]. Some of its structures, such as the capsule, are not typical of a chemosensory organ. However, the current consensus is that the HO is functionally analogous to the insect antenna [10].

In insects, there are two types of IR-sensing organs: (Type 1) photomechanic sensilla found in Melanophila beetles [11-13] and (Type 2) photothermal microbolometers found in the Australian fire-beetle Merimna [14-16], which are structurally and functionally similar to the pit organ IR detectors in snakes [17]. In Type 1, each receptor is adapted from a hair mechanoreceptor [13, 18]: the dendrite of a sensory cell sits at the bottom of a subcuticular, pressurized, fluid-filled chamber (surrounded by a thin spherical layer of cuticle that opens to the outside via a small pore). IR absorption and conversion to heat increases the internal pressure, which physically deflects the dendrite and is detected by the stretch-gated ion channels residing on it [19]. In Type 2 sensors, a specialized IR-absorbing membrane with a low thermal mass is suspended above a hollow inner chamber, in the case of insects by a pedicel. The IR-absorbing membrane contains a highly branched dendritic mass capable of detecting small changes in temperature from absorption of radiant energy [16, 17]. Gracheva et al. (2010) used an unbiased transcriptional profiling method to identify the receptors in snakes, which detect IR signals through a thermotransduction mechanism, as transient receptor potential cation channel, subfamily A, member 1 (TRPA1) channels. TRPA1 channels have not been studied for their role in IR perception elsewhere.


The only report of IR detection in the Acari was in the spiny rat mite, Laelaps echidnina, by Bruce (1971), where IR detection was localized to the forelegs; however, no specific region of tarsus I (including the HO), specific sensilla, or receptors were shown to be involved in IR detection [20]. Three observations motivated the current study: ticks are obligatory blood feeders, the HO is morphologically similar to IR-receptor organs in insects and snakes, and we had preliminary HO-specific transcriptomic data showing non-chemosensory receptor channels in the tick foreleg. The current study was therefore conducted to determine whether ticks are responsive to IR and to determine the role of the HO in this response.


2. Materials and methods

2.1. Ticks

Unfed, virgin adult (male and female) American dog ticks, Dermacentor variabilis (Ixodida: Ixodidae), were provided by Dr. Daniel E. Sonenshine, professor emeritus and eminent scholar of biological sciences at Old Dominion University (Norfolk, VA), and by Lisa Coburn, manager of the Tick Rearing Facility in the Department of Entomology and Plant Pathology at Oklahoma State University (Stillwater, OK). Ticks were obtained from multiple sources to ensure our results were not specific to one strain. Upon arrival, ticks were maintained at 26±1°C and 80±5% relative humidity on a 16:8 light/dark cycle until needed for assays or imaging.

2.2. Scanning Electron Microscopy

For scanning electron microscopy (SEM; Fig. 1), methods were adapted from Sonenshine et al. (1984) [21]. Ticks were sacrificed by freezing at -80°C for at least 2 h, removed, washed immediately 3 times in 70% ethanol:distilled water, and stored in fresh 70% ethanol:distilled water in a 1.5 mL microcentrifuge tube until imaging was performed. SEM was performed at the Analytical Instrumentation Facility at North Carolina State University (Raleigh, NC), where tick specimens were removed from the 70% ethanol, air-dried for approximately 10 min, mounted on a metal plug with double-sided mounting tape, coated with 100-200 Å of a gold-palladium mixture (60 Au/40 Pd), and scanned with a Hitachi S-3200N variable pressure scanning microscope.


2.3. Behavioral Bioassays

Choice bioassays were conducted to assess IR versus visible light taxis for the ocelli versus HOs of the American dog tick (ADT). We designed the assay to limit interference from extraneous sources. The test arena (Fig. 2A), a flat white plastic tray, was 25 cm from start to finish in “Direction I” and (at a right angle) 25 cm from start to finish in “Direction II”. Assay conditions were 24±1°C and 45±5% relative humidity in total darkness (except for the light source being tested) in a walk-in incubator. Unfed, virgin male and female adult D. variabilis with both HOs intact or removed and/or both ocelli intact or disrupted (i.e., ocelli obscured with black paint) were incubated for 5 h under assay conditions in complete darkness prior to bioassay. Each HO was removed by amputation with a sharp razor blade by cutting through tarsus I just distal to the tibia-tarsus I joint (Fig. 1A, dotted line). Very little, if any, hemolymph was lost from the wound, which was therefore not sealed. The coating used to cover the ocelli was black nail polish (Wet N’Wild, Los Angeles, CA) that was applied, allowed to dry, and then applied and dried a second time. There was no visible indication of cracks in the paint before or after bioassays were conducted. We did not measure light penetration through these two layers of paint since the results from the bioassays suggested the treatment was effective in preventing the detection of white light and had no impact on IR detection. The IR (880 nm) was produced by an Evolva T20 instrument (Dexcel International Co., Guangdong, China) and the visible light (450-650 nm) by a BYB Super Bright 9 LED (BYB limit Co., United Kingdom). Fig. 2B shows a typical light projection on the arena surface. Ticks did not respond to the light sources when they were turned off. Using a Ryobi infrared thermometer (Ryobi Limited, Hiroshima, Japan), we observed no measurable difference (±0.1°C) from the ambient incubator temperature at the surface of the light sources where light was projected (or at any other surface of the flashlights tested) or anywhere on the surface of the test arena 5 min after the light was turned on. This indicates that no measurable convective heat emanated from the light sources and that tick responses were only to the detection of radiant energy.

At the beginning of each assay, a single tick (randomly selected with no consideration of sex) was placed at a pre-determined start location, labeled “Start” within the arena (Fig. 2A), and its movement in response to IR or visible light exposure was visualized and recorded using a high-resolution, IR-capable video camera (Canon XA25, Tokyo, Japan). The zoom capability of this camera allowed visualization from approximately 1 m from the test arena, and the observer was never closer to the test arena than the camera during any single trial. In total darkness, there was no positive or negative tick taxis relative to the camera. There were 4 possible responses from a tick once the assay began: (1) movement toward the light source, (2) movement away from the light source, (3) random movement, or (4) no movement at all. Ticks that did not move at all after a minimum of 30 sec were removed from the assay and not included in the results; this excluded 5 of our 115 tick trials (approximately 4%). No ticks were observed to be repelled by the light source. Ticks that constantly changed direction without regard for the light source were considered non-light responsive (neither positively nor negatively phototactic). Therefore, two end points were recorded: attraction to the light source or a non-response to the light source.

In each assay, a single tick was challenged twice; a successful response required not only movement toward a light source (from “Start” to “Finish”) in Direction I (Fig. 2A, lane 1) but also the ability to correct its movement when a new light source was introduced at a right angle to the initial challenge, Direction II (Fig. 2A, lane 2). If a tick successfully navigated from start to finish in “Direction I,” the first light source was turned off and a second light source of identical type was immediately projected into the arena at a right angle. If the tick did not move toward the second light source, then the overall movement was considered a non-response. Successful movement of a tick from start to finish in both Direction I and Direction II in the same trial was considered positive phototaxis. The light sources were arranged at a right angle to help demonstrate that the ticks were responding only to the light sources and not to other potential cues such as extraneous light, noise, CO2, air movement, or heat sources within the incubator. The results were statistically evaluated using chi-square analysis under the null hypothesis that the expected proportion for either choice (response or no response) was 0.50 (Microsoft Excel and PowerPoint. Redmond, Washington: Microsoft, 2013).
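The scoring rule and the chi-square evaluation described above can be sketched as follows. This is a minimal illustration, not the spreadsheet analysis used in the study: the function names and the trial data are hypothetical, and the statistic is assumed to be a one-degree-of-freedom goodness-of-fit test against an expected proportion of 0.50 for each outcome.

```python
import math


def score_trial(reached_finish_dir1, reached_finish_dir2):
    """Positive phototaxis requires reaching the finish in both Direction I and II."""
    return reached_finish_dir1 and reached_finish_dir2


def chi_square_even_split(responders, non_responders):
    """Chi-square goodness-of-fit (d.f. = 1) against a 0.50 expected proportion."""
    n = responders + non_responders
    if n == 0:
        raise ValueError("at least one scored trial is required")
    expected = n * 0.5
    chi2 = ((responders - expected) ** 2
            + (non_responders - expected) ** 2) / expected
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical data: 12 scored ticks, none of which completed both directions.
trials = [(False, False)] * 12
responders = sum(score_trial(d1, d2) for d1, d2 in trials)
chi2, p_value = chi_square_even_split(responders, len(trials) - responders)
print(f"responders = {responders}, chi2 = {chi2:.1f}, p < 0.001: {p_value < 0.001}")
```

Ticks that never moved are assumed to have been dropped before scoring, so they do not appear in `trials`.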

2.4. Transcriptomic Analysis

A sequencing library was constructed from RNA extracted from the front pair of legs, from just proximal to the HO to the end of the leg, of unfed virgin adult D. variabilis males [22]. The library was sequenced using the Illumina HiSeq 2000 platform (Illumina, San Diego, CA). Briefly, data sets generated by HiSeq were cleaned, trimmed, and assembled de novo using the CLC pipeline assembler and scaffolder (Qiagen, Valencia, CA). Blast2GO (BioBam, Valencia, Spain) and the GenBank non-redundant (nr) database were then used for functional annotation. A BLASTX and BLASTN search of the Haller’s organ transcriptome using the western diamondback rattlesnake IR-detecting TRPA1 revealed the presence of 5 putative partial TRPA1 transmembrane proteins. Those partial TRPA1 sequences covered a range of e-values; therefore, we chose to focus on the two contigs with the lowest e-values and longest sequence lengths for further analysis (contig 66838 and contig 70248). Both contigs were used to search against the non-redundant (nr) protein sequence database using BLASTX with a maximum expect threshold of 10. The contigs aligned closely to a nearly full-length TRPA1 receptor in Amblyomma aureolatum and a partial TRPA1 receptor found in Ixodes scapularis, both tick species. An alignment of a TRPA1 known to be involved in IR detection in the western diamondback rattlesnake, Crotalus atrox, was constructed with the putative TRPA1s from A. aureolatum, I. scapularis, and our 2 contigs from the D. variabilis HO transcriptome using the MUSCLE ver. 3.9.31 algorithm (Edgar, 2004). A summary of the alignment result is included at the bottom of Table 1. C. atrox was chosen for the alignments because it was one of three pit-bearing snake species whose TRPA1 was shown bioinformatically, anatomically, and functionally to serve as an IR detector (Gracheva et al., 2010). Furthermore, of the three species whose TRPA1s were described, C. atrox aligned most closely at the amino acid level with our contigs from the HO transcriptome. We define our HO-specific transcriptome as the sum of all the messenger RNAs that were expressed in dissected forelegs (including the HO) that were removed from multiple adult male D. variabilis specimens and pooled together for sequencing.


3. Results and discussion

3.1. Tick Forelegs Detect IR Light

Ticks are obligatory blood feeders, often on warm-blooded animals. This lifestyle requires ticks to locate a host multiple times each generation to progress from larvae to nymphs, from nymphs to adults, in some tick species to progress through multiple nymphal stages, and for female ticks to develop eggs. Although there has been research on chemical cues that might be attractive to ticks and assist in host finding, and there is a general view that heat is a component of host finding, no one has previously considered the possibility that ticks might be able to detect IR light [10, 23, 24]. Some snakes have facial pit organs for IR detection that are used for prey location. There are also IR-sensing organs in some beetles that are used for locating forest fires to take advantage of new food sources and for mating after fire events [14, 17]. Furthermore, there is one study in mites suggesting that the front pair of legs was important in IR light detection [20]. Finally, in our studies of the HO transcriptome in the American dog tick, where we were examining mechanisms for chemoreception, we found receptor proteins that could be involved in light reception (discussed in more detail later). Taken together, these findings led us to examine whether ticks could detect IR.

Currently, all visual perception in ticks is attributed to the ocelli found on the dorsolateral surface just above the second pair of legs (Fig. 1A, single ocellus in brackets). In contrast, the HO (Fig. 1) has traditionally been regarded as a chemosensory organ analogous to insect antennae [7-10]. We designed a bioassay to assess IR versus visible light taxis for the ocelli versus HOs in ADT. Fig. 2A shows a schematic diagram of the test arena dimensions and relative positioning of the light source being tested during any single trial. Fig. 2B shows the appearance of the light on the arena surface and is a screenshot of an assay in which the tick is moving toward the light source. Fig. 2B displays IR detected by an IR video camera; the arena appears completely dark to the human eye.

We tested the response of male and female virgin adult ADT to white (visible) light and IR under the following conditions: (1) HOs and ocelli intact, (2) HOs removed and ocelli intact, and (3) HOs intact and ocelli blocked. A summary of the results for the different conditions we tested is provided in Fig. 3. Removal of the HO is defined as amputation of the end of the leg from just distal to the tibia-tarsus I joint (Fig. 1A, dotted line). This included the HO and any other possible sensory structures found on the amputated part of the leg. Unfed D. variabilis with their HOs and ocelli (Oc) left intact (+HO +Oc) were positively phototactic to white (visible) light at a distance of 25 cm (Fig. 3A). The proportion of ticks responding to white light with their HOs and Oc intact was 0.81 (Fig. 3A; first bar on left). Using a chi-squared test of homogeneity of proportions with the null hypothesis that ticks have a 50:50 chance of responding or not responding to the light source (a proportion of 0.5), the observed proportion, 0.81, was significantly different from 0.5 (degrees of freedom, d.f. = 1, n = 42, χ² = 28.6, p ≤ 0.001). With the HOs left intact but the Oc occluded with black paint (+HO -Oc; Fig. 3A, second bar from the left), no ticks were attracted to the white light (d.f. = 1, n = 12, χ² = 12.0, p ≤ 0.001). It was clear the HO was not involved in attraction to the visible light. With the HOs removed by amputation and the Oc intact (-HO +Oc; Fig. 3A, third bar from the left), ADTs moved toward the white light. The proportion of ticks responding to white light with their HOs ablated was 0.67 (d.f. = 1, n = 12, χ² = 6.0, p ≤ 0.01). This treatment is also important in showing that removal of the tarsi on the front pair of legs does not affect the ability of the ticks to walk toward light under the conditions of our bioassay.

Fig. 3B shows that ticks were attracted to IR presented at 880 nm and that only the tarsi on the front pair of legs, and not the ocelli or the tarsi on any of the other legs, are responsible for this attraction. Unfed adult D. variabilis with their HOs and ocelli (Oc) left intact (+HO +Oc; Fig. 3B, left bar) were positively phototactic to IR at 880 nm. The proportion of ticks responding to IR light with their HOs and Oc intact was 0.70, which was significantly different from the null hypothesis, 0.5 (d.f. = 1, n = 20, χ² = 10.8, p ≤ 0.001). Seventy percent of the ticks assayed moved to the IR light; 30 percent moved about randomly. The proportion of ticks responding to IR light with their HOs intact and Oc covered (+HO -Oc; Fig. 3B, second bar from the left) was 0.75 (d.f. = 1, n = 16, χ² = 9.6, p ≤ 0.001). These studies show either that the ocelli are not involved in IR attraction or that the black paint covering the ocelli did not block the IR light. When the HO was removed and the ocelli left intact (-HO +Oc; Fig. 3B, third bar from the left), no ticks were attracted to the IR light (d.f. = 1, n = 10, χ² = 10.0, p ≤ 0.001). The finding that covering the Oc with black paint with the HO intact had no impact on attraction to IR, whereas removing the HO and leaving the Oc intact eliminated IR attraction, clearly rules out the Oc as being involved in IR detection. Also, the removal of the HO (the tarsi on the front pair of legs) had no impact on the ability of the ticks to crawl to visible light (Fig. 3A). Therefore, it is likely that removing the tarsi on the front pair of legs is the reason the ticks did not crawl to the IR light (Fig. 3B).


In summary, these results suggest that, in ADT, visible light is detected by the ocelli and IR only by the tarsi on the front pair of legs. We assume that IR detection involves the HO or sensory structures closely associated with it, since the HO is exclusive to the tarsi on the front legs and is the most prominent sensory organ not found on the other walking legs.

3.2. Mechanism of IR detection in ticks

Understanding the mechanism of IR detection in ticks could lead to novel methods for disruption of this system and prevention of blood feeding on hosts. In insects, there are two different types of IR-sensing organs: (Type 1) photomechanic IR receptors and (Type 2) photothermal receptors. Type 1 receptors, found in insects, are thought to be adapted from hair mechanoreceptors [13, 18]. They contain a single sensory dendrite suspended at the bottom of a pressurized fluid-filled chamber. IR absorption and conversion to heat by the organ increases its internal pressure, which is detected by stretch-gated ion channels (possibly TRPs) via physical deflection of the dendrite. In Type 2 IR organs, found in both insects and snakes [16, 17], a thin layer of cuticle (which appears circular in shape from the outside) covers an air-filled pit. This cover internally contains a large multipolar neuron [14, 15], and the air-filled pit below is open to the outside by a narrow slit between the cuticular cap and the rest of the body. The cap is attached by a single pedicel and contains the main branches of the cap sensor neurons. When IR light is absorbed by the cuticular cap, the increase in temperature of the cap is detected by thermal sensors in the cap sensory dendrites. The IR receptor protein detecting the temperature change is a transient receptor potential cation channel, subfamily A, member 1, or TRPA1, studied so far only in snakes [17]. The air-filled cavity below the cuticular cap is thought to increase the sensitivity of the organ by enhancing thermal insulation and reducing thermal mass of the cap [16].

The HO in ticks is composed of two main areas, an anterior pit (AP) and a proximal capsule (Cp) (Fig. 1B). In adult male and female D. variabilis, there are 6 sensilla in the AP, which is the same number found in the AP of adult black-legged ticks, Ixodes scapularis, and the meadow tick, Dermacentor reticulatus [25, 26]. In contrast, adult lone star ticks, Amblyomma americanum, and several soft tick species have 7 sensilla in their AP. The sensilla in the AP of both male and female ADT appeared to be a mixture of different types of chemosensory sensilla similar to those described in Amblyomma americanum [27]. The capsule (Fig. 1B-D) is approximately 125 microns (µ) front to back and 75 µ along the long axis of the leg (Fig. 1B-C), with a pit below that is 30-40 µ deep in ADT. The pit cover has a thin, jagged aperture approximately 60 µ long and 1-3 µ wide (Fig. 1B-C). The capsule pit contains sensilla (not visible in Fig. 1) surrounded by dozens of cuticular projections of differing morphology called pleomorphs, which are much more easily seen in species whose capsule is partly or completely open [9, 28]. In I. scapularis, the capsule has a larger, rounded opening which shows a single central projection from the pit bottom [26]. In the soft tick, Ornithodoros rostratus, there is no visible cover and the pit is filled with hundreds of long, thin pleomorphs that, under high magnification, resemble branching marine corals or the filiform papillae found on the surface of the human tongue [28].

The ADT capsule appears morphologically to be a hybrid between the Type 1 and Type 2 receptors described in insects and snakes. The ADT capsule is similar externally to the Type 2 IR receptor organ found on the fore coxae of the Australian 'little ash beetle', Acanthocnemus nigricans. In this beetle, as in snakes, an innervated disk overlays an air-filled cavity and absorbs IR radiation, and the increase in cuticular temperature is measured by temperature receptors within the disk. The circular cap in insects and snakes appears as a slit in ADT. There is no evidence in ADT that the slit cover is innervated. On the other hand, the capsule pit in ADT may be fluid-filled like Type 1 receptors. Histological studies show there are secretory cells associated with the ADT capsule pit which release their products directly into the pit space [8]. The ADT capsule differs from Type 1 IR organs in that the pore leading from the fluid-filled pit to the outside appears as a slit in ADT. We found a number of other sensory structures associated with the HO (shown with arrows, Fig. 1C-D). The Type 1 IR organs in insects are 170-320 µ by 80-150 µ in size on the outside and 17-100 µ in depth. Type 2 IR organs in insects are 150-180 µ across. The structures indicated by the arrows (approximately 2 µ) are therefore likely not involved in IR detection, based on their size. In some cases, as in Fig. 1B-C, the structures are similar in external appearance to auricular or campaniform sensilla. Campaniform sensilla have traditionally been linked with mechanosensation, including functioning as stretch receptors [14, 29, 30]. These structures (Fig. 1B-D) may also be involved in the detection of humidity [18, 30, 31]. More work will be needed to determine their function. The best candidate for IR detection in ticks is the capsule area of the HO, based on its similarities with both Type 1 and Type 2 IR detection organs in insects and snakes.


3.3. Molecular clues for IR detection in ticks

Snakes are the only organisms so far in which specific IR receptor proteins (and not just the proposed mechanism of IR detection) have been characterized at the molecular level. The IR receptor protein detecting temperature change in snakes is a transient receptor potential cation channel, subfamily A, member 1, or TRPA1 [17]. The TRP superfamily of ion channels plays important roles in various sensory functions in a wide variety of animal species, from vision and hearing to taste and mechanosensation. TRPs are 6-transmembrane cation-permeable channels that mediate the entry of positively charged ions like sodium, calcium, and magnesium. TRPs contain ankyrin repeats composed of 33 amino acids that are organized into α-helices connected by β-hairpin motifs. These ankyrin repeats, which vary in number between TRP types and different animal species, are likely involved in protein-protein interactions and appear to play a critical role in their sensitivity to a wide variety of stimuli. TRPs are categorized into 8 distinct sub-families in metazoans, and we identified representatives of several of these sub-families in our HO-specific transcriptome. TRPA1 can function as a stress receptor but has also been associated with temperature sensitivity and more recently IR detection [17, 32-34].

We found contigs in our HO-specific transcriptome from Illumina sequencing that aligned to several different types of TRP, including two contigs (contig 66838 and contig 70248) that aligned to putative TRPA1s from both I. scapularis and A. aureolatum. In Table 1, the top 5 hits based on e-value are listed for both of our HO-specific contigs along with data points describing where the strongest alignments occurred. Contig 66838 was 98% similar to a putative TRPA1 in A. aureolatum (top hit with e-value 1e-124) where 174 amino acids of our query sequence matched a 178 amino acid stretch of the subject sequence. The next four hits were I. scapularis (deer tick), C. borealis (Jonah crab), H. americanus (American lobster), and L. anatina (duck mussel) with e-values of 8e-82, 1e-55, 3e-53 and 5e-51, respectively. Contig 70248 was 91% similar to a putative TRPA1 in A. aureolatum (top hit with e-value 8e-59) where 101 amino acids of our query sequence matched a 111 amino acid stretch of the subject sequence. The next four hits were I. scapularis, C. borealis, S. mimosarum (velvet spider), and P. tepidariorum (common house spider) with e-values of 2e-43, 2e-21, 3e-21 and 7e-21, respectively.
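The percentages quoted above follow directly from the matched and aligned lengths, and the hits are listed in order of increasing e-value. A short sketch makes the arithmetic explicit; the variable and function names are illustrative, and only the numbers are taken from the text.

```python
# (matched residues, aligned length) for each contig's top hit, as reported above.
top_hits = {
    "contig 66838": (174, 178),
    "contig 70248": (101, 111),
}


def percent_match(matched, aligned_length):
    """Matched residues as a percentage of the aligned stretch."""
    return 100.0 * matched / aligned_length


for contig, (matched, aligned_length) in top_hits.items():
    print(f"{contig}: {percent_match(matched, aligned_length):.1f}%")

# E-values for the five best hits of each contig, in the order listed above.
evalues = {
    "contig 66838": [1e-124, 8e-82, 1e-55, 3e-53, 5e-51],
    "contig 70248": [8e-59, 2e-43, 2e-21, 3e-21, 7e-21],
}

# A lower e-value means a stronger hit, so each ranked list should ascend.
for contig, values in evalues.items():
    print(contig, "ranked best-first:", values == sorted(values))
```

Rounded to whole numbers, the two ratios give the 98% and 91% stated in the text.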

Our HO-specific contigs also aligned with the TRPA1 of Crotalus atrox, the western diamondback rattlesnake, and of other pit-bearing snake species studied by Gracheva et al. (2010) in a region immediately following the conserved ankyrin repeats associated with all TRPs. At the bottom of Table 1 we show an illustration of the alignment performed with the MUSCLE algorithm [35], showing where our contigs aligned at the amino acid level with the snake and other tick species whose TRPA1s (full or partial) have been identified. This alignment with the snake TRPA1 is significant because a functional relationship has been established between the receptor and IR detection. This relationship has not been established in insects. Contig 66838 has a 35% identity and an e-value of 4e-35 when compared to the C. atrox complete TRPA1. Contig 70248 has a 37% identity and an e-value of 5e-5 when compared to the C. atrox complete TRPA1. The A. aureolatum putative TRPA1 (accession number JAT98721.1) that aligns with both of our putative TRPA1 contigs has a 28% identity and an e-value of 1e-115 when compared to the C. atrox complete TRPA1. The I. scapularis putative TRPA1 (accession number XP_002434584.1) that aligns with both of our putative TRPA1 contigs has a 28% identity and an e-value of 6e-28 when compared to the C. atrox complete TRPA1. Identity is defined as “the extent to which two (nucleotide or amino acid) sequences have the same residues at the same positions in an alignment, often expressed as a percentage” [36]. Our analysis suggests that the HO contains a TRPA1 that may be responsible for IR detection in ticks, similar to the function of TRPA1 in several pit-bearing snake species.
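The definition of identity quoted above can be illustrated with a small calculation on two already-aligned sequences. This is a sketch under stated assumptions, not the BLAST implementation: the function name is hypothetical, the example sequences are invented toy peptides rather than TRPA1 data, and the denominator is assumed to be the number of alignment columns in which at least one sequence has a residue (tools differ on this choice).

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of compared alignment columns holding the same residue.

    Assumption: columns where both sequences have a gap are skipped; columns
    with a gap in only one sequence count as compared but not identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap and res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * identical / compared


# Toy example: 5 identical positions out of 7 compared columns.
print(f"{percent_identity('MKT-LLV', 'MKTALIV'):.1f}%")
```

Because the choice of denominator changes the result, identity values are only comparable when computed the same way across all sequence pairs.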


4. Conclusion

In summary, we presented evidence that the American dog tick, Dermacentor variabilis, is positively phototactic to IR light and that the organ for IR detection is found exclusively on the front tarsi. Morphological comparisons of the HO on the tarsi of the front legs of ADT with IR receptors in insects and snakes suggest that the capsule area of the HO might be responsible for IR detection in ticks. A putative TRPA1 transcript was found in an ADT HO-specific transcriptome; it was similar to the TRPA1 receptor of the snake C. atrox, which has been shown to be involved in IR detection.

Acknowledgements

The authors gratefully acknowledge Charles Mooney at the Analytical Instrumentation Facility at North Carolina State University for his assistance with SEM imaging. This work was funded by grants to RMR and DES from NIH (1R21AI096268) and NSF (IOS-0949194). RM and AC were also supported in part by a Graduate Student Teaching Assistantship from the Department of Entomology at North Carolina State University.

References

1. Spach DH, Liles WC, Campbell GL, Quick RE, Anderson DE, Jr., Fritsche TR. Tick-borne diseases in the United States. N Engl J Med 1993;329(13):936-47.

2. Sonenshine DE, Roe RM. Biology of Ticks Volume II. New York: Oxford University Press; 2014.

3. Biggs HM. Diagnosis and management of tickborne rickettsial diseases: Rocky Mountain spotted fever and other spotted fever group rickettsioses, ehrlichioses, and anaplasmosis—United States. MMWR. Recommendations and Reports 2016;65.

4. CDC. Rocky mountain spotted fever (RMSF): symptoms, diagnosis, and treatment. Volume 2016: US Department of Health and Human Services; 2010.

5. Haller G. Vorläufige bemerkungen über das gehörorgan der Ixodiden. Zool Anz 1881;4:165-166.

6. Nuttall GHF, Cooper WF, Robinson LE. On the structure of “Haller's Organ” in the Ixodoidea. Parasitology 1908;1(03):238-242.

7. Amosova L, Raikhel A, Ivanov V, Leonovich S. An atlas of ixodid tick ultrastructure. 1983.

8. Balashov YS. Atlas of the electron microscopic anatomy of ixodid ticks. Atlas elektronno-mikroskopicheskoi anatomii iksodovykh kleshchei. 1979.

9. Roshdy MA, Foelix RF, Axtell RC. The subgenus Persicargas (Ixodoidea: Argasidae: Argas). 16. Fine structure of Haller's organ and associated tarsal setae of adult A. (P.) arboreus Kaiser, Hoogstraal, and Kohls. J Parasitol 1972;58(4):805-16.

10. Sonenshine DE, Roe RM. Biology of Ticks Volume I. New York: Oxford University Press; 2014.

11. Evans WG. Infra-red receptors in Melanophila acuminata DeGeer. Nature 1964;202(4928):211-211.

12. Schmitz H, Bleckmann H. The photomechanic infrared receptor for the detection of forest fires in the beetle Melanophila acuminata (Coleoptera: Buprestidae). Journal of Comparative Physiology A 1998;182(5):647-657.

13. Schmitz H, Bleckmann H, Mürtz M. Infrared detection in a beetle. Nature 1997;386(6627):773.

14. Schmitz H, Schmitz A, Bleckmann H. A new type of infrared organ in the Australian "fire-beetle" Merimna atrata (Coleoptera: Buprestidae). Naturwissenschaften 2000;87(12):542-545.

15. Schmitz H, Schmitz A, Bleckmann H. Morphology of a thermosensitive multipolar neuron in the infrared organ of Merimna atrata (Coleoptera, Buprestidae). Arthropod Structure & Development 2001;30(2):99-111.

16. Schmitz H, Schmitz A, Trenner S, Bleckmann H. A new type of insect infrared organ of low thermal mass. Naturwissenschaften 2002;89(5):226-229.

17. Gracheva EO, Ingolia NT, Kelly YM, Cordero-Morales JF, Hollopeter G, Chesler AT, Sanchez EE, Perez JC, Weissman JS, Julius D. Molecular basis of infrared detection by snakes. Nature 2010;464(7291):1006-11.

18. Vondran T, Apel KH, Schmitz H. The infrared receptor of Melanophila acuminata De Geer (Coleoptera: Buprestidae): ultrastructural study of a unique insect thermoreceptor and its possible descent from a hair mechanoreceptor. Tissue Cell 1995;27(6):645-58.

19. Klocke D, Schmitz A, Soltner H, Bousack H, Schmitz H. Infrared receptors in pyrophilous (“fire loving”) insects as model for new un-cooled infrared sensors. Beilstein journal of nanotechnology 2011;2(1):186-197.

20. Bruce WA. Perception of infrared radiation by the spiny rat mite Laelaps echidnina (Acari: Laelapidae). Annals of the Entomological Society of America 1971;64(4):925-931.

21. Sonenshine DE, Homsher PJ, Carson KA, Wang VD. Evidence of the role of the cheliceral digits in the perception of genital sex pheromones during mating in the American dog tick, Dermacentor variabilis (Acari: Ixodidae). Journal of medical entomology 1984;21(3):296-306.

22. Carr AL. Profiling of acarine attractants and chemosensation: North Carolina State University; 2015.

23. Lahille F. Contribution à l'étude des Ixodidés de la République Argentine: Imprimerie du Bureau Météorologique; 1905.

24. Lees A. The sensory physiology of the sheep tick, Ixodes ricinus L. Journal of Experimental Biology 1948;25(2):145-207.

25. Buczek A, Buczek L, Kusmierz A, Olszewski K, Jasik K. Ultrastructural investigations of Haller’s organ in Dermacentor reticulatus (Fabr.) (Acari: Ixodida: Ixodidae). Acarid Phylogeny and Evolution: Adaptation in Mites and Ticks: Springer; 2002. p 227-231.

26. Homsher PJ, Sonenshine DE. Scanning electron microscopy of ticks for systematic studies: Fine structure of Haller's organ in ten species of Ixodes. Transactions of the American Microscopical Society 1975:368-374.

27. Foelix RF, Axtell RC. Ultrastructure of Haller's organ in the tick Amblyomma americanum (L.). Z Zellforsch Mikrosk Anat 1972;124(3):275-92.

28. Klompen JS, Oliver JH, Jr. Haller's organ in the tick family Argasidae (Acari: Parasitiformes: Ixodida). J Parasitol 1993;79(4):591-603.

29. Beadle D. Muscle attachment in the tick, Boophilus decoloratus Koch (Acarina: Ixodidae). International Journal of Insect Morphology and Embryology 1973;2(3):247-255.

30. Obenchain FD, Galun R. Physiology of Ticks: Current Themes in Tropical Science: Elsevier; 2013.

31. Woolley TA. Some sense organs of ticks as seen by scanning electron microscopy. Transactions of the American Microscopical Society 1972:35-47.

32. Nilius B, Appendino G, Owsianik G. The transient receptor potential channel TRPA1: from gene to pathophysiology. Pflügers Archiv-European Journal of Physiology 2012;464(5):425-458.

33. Paulsen CE, Armache J-P, Gao Y, Cheng Y, Julius D. Structure of the TRPA1 ion channel suggests regulatory mechanisms. Nature 2015;520(7548):511-517.

34. Ramsey IS, Delling M, Clapham DE. An introduction to TRP channels. Annu. Rev. Physiol. 2006;68:619-647.

35. Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic acids research 2004;32(5):1792-1797.

36. Fassler J, Cooper P. BLAST glossary. Volume 2016; 2011.

TABLES

Table 3.1. Top 5 BLASTP hits for two HO-specific RNA-Seq contigs from our Dermacentor variabilis transcriptome that were putatively assigned as TRPA1s exclusive to the HO (top), and schematic representation of the putative HO-specific TRPA1 partial transcripts of D. variabilis aligned with a full-length TRPA1 from C. atrox and putative TRPA1s from I. scapularis and A. aureolatum using DELTA-BLAST (bottom). (*) indicates a putatively assigned TRPA1 (i.e., partial transcript, TRPA1 homolog, or TRPA1-like protein); “Desc.” is the contig identification assignment; “Hit Acc.” is the accession number of the match to the contig; “Sim.” is the similarity score between contig and hit, i.e., the extent to which the two sequences are related; “Len.” is the amino acid alignment length between contig and hit; “Pos.” is the number of exact amino acid matches between query and subject sequence. The numbers above the protein sequence illustrations in the schematic (bottom) represent amino acid positions. BLASTP = protein-protein BLAST; DELTA-BLAST = Domain Enhanced Lookup Time Accelerated BLAST. Matching organisms: Amblyomma aureolatum, Ixodes scapularis (deer tick), Cancer borealis (Jonah crab), Homarus americanus (American lobster), Lingula anatina (a brachiopod), Stegodyphus mimosarum (velvet spider), Parasteatoda tepidariorum (common house spider).

Sequence      Desc.   Top 5 Hits       Hit Acc.        E-val.  Sim.  Bit-Score  Len.  Pos.
Contig 66838  TRPA1*  A. aureolatum    JAT98721.1      1e-124  98    385        178   174
Contig 66838  TRPA1*  I. scapularis    XP_002434584.1  8e-82   85    252        143   122
Contig 66838  TRPA1*  C. borealis      APG53778.1      1e-55   51    198        167   86
Contig 66838  TRPA1*  H. americanus    APG53784.1      3e-53   50    186        175   87
Contig 66838  TRPA1*  L. anatina       XP_013406001.1  5e-51   48    185        165   79
Contig 70248  TRPA1*  A. aureolatum    JAT98721.1      8e-59   91    204        111   101
Contig 70248  TRPA1*  I. scapularis    XP_002434584.1  2e-43   83    151        81    67
Contig 70248  TRPA1*  C. borealis      APG53778.1      2e-21   44    97.8       112   49
Contig 70248  TRPA1*  S. mimosarum     KFM57166.1      3e-21   43    96.7       116   50
Contig 70248  TRPA1*  P. tepidariorum  XP_015910181.1  7e-21   44    95.9       117   51

Table 3.1. continued

FIGURES

[Fig. 3.1 image: panels A-D with scale bars of 1 mm (A), 50 µm (B and C), and 10 µm (D); structures labeled Cp, Oc, and AP.]

Fig. 3.1. Scanning electron micrographs of Dermacentor variabilis Haller’s organ (HO) and associated structures. (A) Female, dorsal view at 25X; the white dotted line marks where tarsus I, including the HO, was removed (ablated) for the corresponding trials. (B) Female, dorsal view of HO anterior pit and capsule at 500X. (C) Male, dorsal view of HO anterior pit and capsule at 500X. (D) Female, dorsal view, aperture opening of capsule at 2500X. Arrows in panels B-D indicate undescribed structures resembling auricular or campaniform sensilla that may serve as IR detectors or assist in this function in both male and female D. variabilis. The white star in panel A denotes the location of the HO (star just above structure). The ocellus (primitive eye) is located between the brackets in panel A. Oc = ocellus, Cp = capsule, AP = anterior pit.

Fig. 3.2. Arena calibration points and video screenshot. (A) Choice arena where two of the ports at right angles to each other were fitted with identical light sources (either visible light or infrared), and (B) high definition, IR-capable video camera capture of bioassay trial where HO and ocelli were present and unobstructed (tick moving toward IR light). At the beginning of each assay a single tick was placed at the start and a light source was illuminated (yellow bulb, lane 1). After crossing the finish line of “Direction I” the first light source was turned off (grey bulb, lane 2) and the second light source (at a right angle) was immediately illuminated (yellow bulb, lane 2). Once the tick traveled from the start to the finish of “Direction II” the assay was over. Any deviation out of the field of the light beam (and not correcting toward the light source) was considered non-responsive. Movement toward each light source was considered non-responsive if the tick took longer than 1 minute to move within 2.5 cm of the illuminated source. Yellow bulbs denote lights that were turned on while the grey bulb represents a light that was turned off.

Fig. 3.3. Dermacentor variabilis choice assay conditions and results. (A) Tick response to visible light with Haller’s organs (HOs) removed or ocelli (Oc) blocked. (B) Tick response to infrared light with Haller’s organs (HOs) removed or ocelli (Oc) blocked. “+HO +Oc” means that both the HOs and Oc were intact for those trials. “+HO -Oc” means that the HOs were intact and the Oc were covered with black paint for those trials. “-HO +Oc” means that the HOs were removed and the Oc were intact for those trials. A black “X” on the illustrations above each bar graph represents where the ticks’ HOs were ablated or Oc were blocked. The frequency response was analyzed using a chi-squared test of homogeneity of proportions under the null hypothesis that the expected proportion for either choice (response or no response) was 0.50. Response to either visible or infrared light was significant at P ≤ 0.001 (**), P ≤ 0.01 (*).
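The test named in the caption can be sketched as follows. This is a minimal illustration of a chi-squared test with one degree of freedom against an expected proportion of 0.50 for each outcome. The function name `chi_squared_equal_proportions` and the counts (18 responders, 2 non-responders) are hypothetical and are not results from this study, and the sketch makes no claim about the statistical software the authors used.

```python
import math


def chi_squared_equal_proportions(responded: int, not_responded: int):
    """Chi-squared test (1 df) against an expected proportion of 0.5 per outcome.

    Returns a (statistic, p_value) tuple.
    """
    total = responded + not_responded
    if total <= 0:
        raise ValueError("need at least one observation")
    expected = total * 0.5  # null hypothesis: 0.50 for either choice
    statistic = sum(
        (observed - expected) ** 2 / expected
        for observed in (responded, not_responded)
    )
    # For 1 degree of freedom, the upper-tail chi-squared probability
    # equals erfc(sqrt(statistic / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts, for illustration only.
stat, p = chi_squared_equal_proportions(responded=18, not_responded=2)
print(f"chi-squared = {stat:.2f}, p = {p:.4g}")
```

The closed-form p-value holds only for one degree of freedom; with small expected counts, an exact binomial test may be a more suitable choice.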
