Supplementary Material
SUPPLEMENTARY MATERIAL

Transcriptomics supports local sensory regulation in the antennae of the kissing bug Rhodnius prolixus

Jose Manuel Latorre-Estivalis; Marcos Sterkel; Sheila Ons; and Marcelo Gustavo Lorenzo

DATABASES

Database S1 – Protein sequences of all target genes in fasta format.

Database S2 – Edited Generic Feature Format (GFF) file of the R. prolixus genome used for read mapping and gene expression analysis.

Database S3 – FPKM values of target genes in the three libraries.

Database S4 – Fasta sequences from different insects used in the CT/DH – CRF/DH and nuclear receptor phylogenetic analyses.

FIGURES

Figure S1 – Molecular phylogenetic analyses of calcitonin diuretic (CT) and corticotropin-releasing factor-related (CRF) like diuretic hormone (DH) receptors of R. prolixus and other insects. The evolutionary history of R. prolixus CT/DH and CRF/DH receptors was inferred using the Maximum Likelihood method in PhyML v3.0. The support values on the bipartitions correspond to SH-like P values, which were calculated with the aLRT SH-like test. The CT/DH receptor 3 clade is highlighted in red. The R. prolixus CT/DH and CRF/DH receptors are displayed in blue. The LG amino-acid substitution model was used. Species abbreviations: Dmel, Drosophila melanogaster; Aaeg, Aedes aegypti; Agam, Anopheles gambiae; Clec, Cimex lectularius; Hhal, Halyomorpha halys; Rpro, Rhodnius prolixus; Amel, Apis mellifera; Acyrthosiphon pisum; and Tcas, Tribolium castaneum. The glutamate receptor sequence from D. melanogaster (FlyBase Acc. N° CG11144) was used as an out-group. The sequences used from other insects are in Supplementary Database S4.

Figure S2 – Molecular phylogenetic analysis of nuclear receptor genes of R. prolixus and other insects. The evolutionary history of R. prolixus nuclear receptors was inferred using the Maximum Likelihood method in PhyML v3.0. The support values on the bipartitions correspond to SH-like P values, which were calculated with the aLRT SH-like test. The R. prolixus nuclear receptors are displayed in blue. The LG amino-acid substitution model was used. Drosophila melanogaster sequences were obtained from Velarde et al. (2006) and subsequently used as queries in BLASTp searches against the P. humanus and C. lectularius transcript databases from VectorBase. Species abbreviations: Dmel, Drosophila melanogaster; Phum, Pediculus humanus; Clec, Cimex lectularius. The RproEip75b sequence used was from isoform B (from our antennal transcriptome) because the sequence of isoform A is considered incomplete. The sequences used from other insects are in Supplementary Database S4.

Figure S3 – Alignment of R. prolixus takeout protein sequences. Sequences were aligned with CLUSTAL X v2.0 (Thompson et al., 1997). Asterisks indicate identical amino acids, colons show conserved exchanges and periods show homologous amino acids. The Drosophila melanogaster takeout protein sequence was obtained from Justice et al. (2003). The two conserved cysteine residues defining the Takeout family in many insects (Touhara et al., 1993) are marked with white boxes. The positions of the conserved motifs 1 and 2 (So et al., 2000) are indicated with grey boxes. Predicted signal peptides are underlined. Species abbreviations: Rpro, Rhodnius prolixus; and Dmel, Drosophila melanogaster.

Figure S4 – Structure and organization of takeout gene clusters. Scaffold IDs are presented on the left. White arrows represent each Takeout gene and its position on the scaffold.

Table S1. Details of neuropeptide and neurohormone precursor genes.
Columns are: Gene – the gene and protein name we are assigning; VectorBase code – the official gene number in the RproC3 genome assembly, prefix is RPRC; Scaffold – the RproC3.3 genome assembly supercontig ID; AAs – number of encoded amino acids in the protein; Comments – comments on the OGS gene model and repairs performed in the genome assembly based on BLAST against de novo antennal assemblies. NTE: Amino terminal region; CTE: Carboxyl terminal region; VB: VectorBase; GB: GenBank.

| Gene | Gene symbol | VectorBase code | Scaffold | Isoforms | AAs | Hit against the antennal de novo assemblies | Comments |
|---|---|---|---|---|---|---|---|
| Adipokinetic hormone/corazonin-related peptide | ACP | - | KQ035347 | - | 126 | Yes | New gene model created based on GB sequence Acc. N° KM975505 (Zandawala et al., 2015a) |
| Adipokinetic hormone | AKH | RPRC000416 | KQ034546 | A | 71 | No | VectorBase prediction identical to GB sequence Acc. N° KM283242 (Zandawala et al., 2015b) |
| | | | | B | 70 | Yes | New isoform identified based on antennal assemblies and included in the edited genome GFF |
| AstA | AstA | - | KQ034293 | - | 203 | Yes | New gene model created based on GB sequences Acc. N° GQ856315 and JN559385 (Ons et al., 2011) |
| Allatostatin B | MIP | - | KQ034158 | - | 254 | Yes | New gene model created based on Ons et al., 2011 |
| Allatostatin CC | AstCC | RPRC000300 | KQ034374 | - | 117 | Yes | No changes in VB prediction |
| Allatostatin CCC | AstCCC | - | KQ034609 | - | 100 | No | New gene model created based on Ons et al., 2011. Previously annotated as AstC |
| Allatotropin | AT | - | KQ034313 | - | 119 | Yes | New gene model created based on GB sequence Acc. N° GQ162783 (Ons et al., 2011) |
| Bursicon alpha | Burs-alfa | RPRC000797 | KQ034200 | - | 169 | Partial | No changes in VB prediction |
| Bursicon beta | Burs-beta | - | KQ034059 | - | 107 | No | Identified in this work using T. castaneum GB sequence Acc. N° NM_001114308.1 as query |
| Diuretic hormone 31 | Dh31 | RPRC000977 | KQ034472 (5′UTR); KQ034594 (rest of the gene model) and KQ037272 (last exon) | A | 146 | Yes | Identical to GB sequences Acc. N° GQ856316 and AEA51300 (Ons et al., 2011). Last 42 amino acids are located in the KQ037272 supercontig |
| | | | | B | 109 | No | Identical to GB sequences Acc. N° GQ856317 and AEA51301 (Ons et al., 2011) |
| | | | | C | 206 | Yes | Identical to GB sequence Acc. N° HM030714.1 (Zandawala et al., 2011) |
| Cardioacceleratory peptide (CAPA/CAP2b) | CAPA | RPRC000639 | KQ034830 | A | 158 | No | VB prediction identical to GB sequence Acc. N° ABS17680 (Paluzzi et al., 2008). In VB classified as non-translating CDS |
| | | RPRC000563 | KQ034830 | B | 158 | No | VB prediction identical to GB sequence Acc. N° ACH70295. In VB classified as non-translating CDS |
| Crustacean cardioactive peptide | CCAP | RPRC000466 | KQ034330 | - | 129 | Yes | VB prediction identical to GB sequence Acc. N° GQ888668 (Ons et al., 2011) |
| CCHamide peptide | CCHa | - | KQ034137 | - | 104 | Partial | New gene model created based on Ons et al., 2011 and included in the edited genome GFF |
| CNMamide peptide | CNMa | RPRC010893 | KQ034609 | - | 150 | No | No changes in VB prediction |
| Corazonin | CZ | - | KQ034239 | - | | No | New gene model created based on Ons et al., 2011 and included in the edited genome GFF |
| Diuretic hormone 44 | Dh44 | RPRC000596 | KQ034102 | - | 151 | Yes | No changes in VB prediction, which is identical to GB sequence Acc. N° HM153808 (Te Brugge et al., 2011b), annotated as corticotropin releasing factor-like protein |
| Eclosion hormone | EH | RPRC014242 | KQ034677 | - | 241 | No | Partial sequence. Initial methionine is still missing |
| Elevenin-1 | Elevin-1 | RPRC003083 | KQ034317 | - | 66 | No | No changes in VB prediction |
| Elevenin-2 | Elevin-2 | RPRC003084 | KQ034317 | - | 87 | Yes | No changes in VB prediction |
| Ecdysis triggering hormone | ETH | RPRC014486 | KQ034462 | - | 146 | Yes | No changes in VB prediction |
| FLP | FMRFamida | RPRC014988 | KQ035274 | - | 273 | Partial | No changes in VB prediction |
| Glycoprotein hormone alpha 2 | GPA2 | RPRC007092 | KQ034094 | - | 122 | Partial | No changes in VB prediction |
| Glycoprotein hormone beta 5 | GPB5 | - | KQ034094 | - | 156 | No | New gene model created based on C. lectularius sequence XP_014244389.1. Methionine is missing |
| Kinin | | RPRC000022 | KQ034106 | - | 398 | Yes | VB prediction identical to GB sequence Acc. N° BK007870 (Te Brugge et al., 2011a) |
| IDLSRF-like peptide | - | RPRC000351 | KQ034112 | - | 168 | Yes | No changes in VB prediction |
| Insulin-like peptide | Ilp | RPRC007020 | KQ034142 | - | 126 | Yes | VB prediction identical to GB sequence Acc. N° AMS34841.1 (Defferrari et al., 2016) |
| ITG-like | - | - | KQ034255 | - | 214 | Yes | New gene model created based on Ons et al., 2011 and included in the edited genome GFF |
| Ion transport peptide | ITP | RPRC000519 | KQ034208 | A | 111 | No | Annotated as sulfakinin (GB sequence Acc. N° GQ2539210). Last 33 amino acids are not in the genome |
| | | | | B | 117 | Yes | VB prediction identical to GB sequence Acc. N° GU207866 (Ons et al., 2011) |
| Long Neuropeptide F | LNPF | RPRC008107 | KQ034255 | No | 105 | Yes | VB prediction extended in NTE region and initial methionine fixed according to GB sequence Acc. N° KT898124.1 (Sedra and Lange, 2016) |
| Myosuppressin | Ms | RPRC000203 | KQ034384 | No | 88 | Yes | VB prediction identical to GB sequence Acc. N° GQ344501 (Ons et al., 2011) |
| Natalisin | NTL | RPRC003680 | KQ034106 | - | 196 | Partial | No changes in VB prediction |
| Neuroparsin | NP | RPRC002095 | Q034340 | | 113 | Yes | VB prediction identical to GB sequence Acc. N° GU207864 (Ons et al., 2011) |
| Neuropeptide like precursor 1 | NPLP1 | RPRC011668 | KQ034238 | - | 454 | Yes | VB prediction identical to GB sequence Acc. N° GU207865 (Ons et al., 2011) |
| NVP-like | PH2 | RPRC003052 | ACPB03040762 | - | 299 | Yes | VB prediction was shorter in NTE and CTE regions. Impossible to fix the gene model due to a problem in the genome assembly |
| Orcokinin | OK | RPRC014678 | KQ034149 | A | 165 | Yes | One exon added in NTE region of VB prediction based on GB sequence Acc. N° FJ167860 |
| | | | | B | 392 | Yes | Only first 52 amino acids present in the genome. Sequence identical to GB sequence Acc. N° FJ761320 (Sterkel et al., 2012) |
| | | | | C | 422 | No | Same problem as mentioned for isoform B. Sequence identical to GB sequence Acc. N° KF179047 |
| Pigment dispersing factor | PDF | - | KQ034061 | - | 48 | Yes | New gene model created based on Ons et al., 2011 and included in the edited genome GFF |
| Proctolin | Proc | RPRC000390 | KQ034188 | - | 97 | No | VB prediction identical to GB sequence Acc. N° JN543225 (Orchard et al., 2011) |
| Pyrokinin | PK-PBAN | - | KQ034521 | - | 122 | Yes | New gene model created based on GB sequence GU230851 and included in the edited genome GFF |
| RYamide | RYa | RPRC000461 | KQ035177 | - | 107 | Yes | Initial methionine of VB model was fixed according to antennal assemblies |

New gene model created based on GB sequence Acc.