© Indian Academy of Sciences

RESEARCH ARTICLE

Computational identification and characterization of novel microRNA in the mammary gland of dairy goat (Capra hircus)

BO QU1,2, YOUWEN QIU1, ZHEN ZHEN1, FENG ZHAO2, CHUNMEI WANG1,2, YINGJUN CUI1,2, QIZHANG LI1,2 and LI ZHANG1,2∗

1Faculty of Life Sciences, Northeast Agricultural University, Harbin 150030, People’s Republic of China 2Key Laboratory of Dairy Science of the Education Ministry, Harbin 150030, People’s Republic of China

Abstract Many studies have indicated that microRNAs (miRNAs) influence the development of the mammary gland by posttranscriptionally affecting their target genes. The objective of this research was to identify novel miRNAs in the mammary gland of dairy goats with a bioinformatics approach based on expressed sequence tag (EST) and genome survey sequence (GSS) analyses. For the first time, we searched all known miRNAs of major mammals against the goat EST and GSS databases to identify new miRNAs. We then validated the newly predicted miRNAs with stem–loop reverse transcription followed by a SYBR Green polymerase chain reaction assay. Finally, 29 mature miRNAs were identified and verified; of these, 14 were grouped into 13 families based on seed sequence identity. Subsequently, 85 potential target genes of the newly verified miRNAs were predicted, most of which seemed to encode proteins participating in the regulation of metabolism, signal transduction, growth and development. The prediction accuracy for the new miRNAs was 70.37%, which confirmed that the methods used in this study were efficient and reliable. Detailed analyses of the sequence characteristics of the novel miRNAs of the goat mammary gland were performed. In conclusion, these results provide a reference for further identification of miRNAs in animals without a complete genome and thus improve the understanding of miRNAs in the caprine mammary gland.

[Qu B., Qiu Y., Zhen Z., Zhao F., Wang C., Cui Y., Li Q. and Zhang L. 2016 Computational identification and characterization of novel microRNA in the mammary gland of dairy goat (Capra hircus). J. Genet. 95, xx–xx]

Introduction

MicroRNAs (miRNAs) are a large class of endogenous noncoding small RNAs that average 22 nucleotides (nt) in length and are derived from distinctive hairpin precursors of plants and animals (Carrington and Ambros 2003; Bartel 2004). MiRNAs play important roles in posttranscriptional regulation because they can negatively regulate gene expression by recognizing completely and partially complementary sequences in target messenger RNAs (mRNAs) for mRNA cleavage or the inhibition of mRNA translation (Engels and Hutvagner 2006; Filipowicz et al. 2008; Friedman et al. 2009). Hundreds of miRNAs have been identified in various animal and plant species since the first miRNA, lin-4, was discovered in Caenorhabditis elegans (Lee et al. 1993). Many previous studies have shown that miRNAs are associated with diverse biological phenomena, such as cell growth, apoptosis, development, differentiation and tumourigenesis (Hwang and Mendell 2006; Anglicheau et al. 2010).

Capra hircus, the domestic goat, is an important livestock animal. It is a milk-producing and meat-producing animal that is economically important throughout the world, especially in China, India and other developing countries (Zeder and Hesse 2000). In addition, it is one of the best model organisms for mammary gland bioreactor studies. Currently, there are over 1000 goat breeds and over 830 million goats around the world according to incomplete figures from the United Nations Food and Agriculture Organization (http://www.fao.org/corp/statistics/en/). The mammary gland of goat is a productive organ that can convert various kinds of nutritious substances into milk. The healthy development and regular functional differentiation of the mammary gland are important and valuable. Many studies have shown that miRNAs influence mammary gland development by affecting the posttranscriptional expression of their target genes (Silveri et al. 2006; Foubert et al. 2010; Li et al. 2012a).

∗ For correspondence. E-mail: [email protected].

Keywords. dairy goat; mammary gland; miRNAs; expressed sequence tag; bioinformatics approach.

Journal of Genetics, DOI 10.1007/s12041-016-0674-6 Bo Qu et al.

Thus, the identification of mammary gland miRNAs in dairy goat is worth further investigation.

Although a growing number of miRNAs have been identified in diverse animal and plant species during recent years, the number of miRNAs that have been identified from domesticated ruminants is far smaller than that of other animals (Liu et al. 2010; Jevsinek et al. 2013). For example, the number of identified miRNAs in Bos taurus and Ovis aries is 793 and 153, respectively (Fatima and Morris 2013; Wang et al. 2013). Unfortunately, there have been only a few reports on miRNAs in goat to date, and only a few miRNAs from dairy goat testis exist in the Sanger miRBase v21.0 (June 2014) (Griffiths-Jones et al. 2008; Kozomara and Griffiths-Jones 2011).

At present, there are three methods for identifying miRNAs: the direct cloning approach, the next-generation sequencing approach and the bioinformatics approach (Mendes et al. 2009). However, direct cloning does not detect miRNAs that have low expression levels. Once these miRNAs are cloned, bioinformatic tools are required to locate their origin in the genome (Berezikov et al. 2006; Caiment et al. 2010). It is well known that the goat genome project is currently in progress and the genome annotation is incomplete, and thus, the cloning approach cannot work to find the goat miRNAs (Dong et al. 2013a). The advent of next-generation sequencing technology provides a more reliable and sensitive method to identify new miRNAs. To date, through deep sequencing, the miRNA expression profiles of mammary gland (Ji et al. 2012a, b; Li et al. 2012b; Dong et al. 2013b), testis (Wu et al. 2014), muscle (Ling et al. 2013), skin and hair (Liu et al. 2012; Yuan et al. 2013) in goat have already been reported. Nevertheless, it is too expensive and time consuming, and requires extensive computational analyses to distinguish miRNAs from other noncoding RNAs of similar size (Huang et al. 2011; Gomes et al. 2013). In comparison, the computational approach is more efficient and has been widely applied to identify miRNAs in diverse animal and plant species (Li et al. 2010; Allmer and Yousef 2012; Takada and Asahara 2012).

The comparative genomic approach is widely used to identify new miRNAs in computational methods since many miRNAs are evolutionarily highly conserved from species to species in animals and plants (Huang et al. 2011; Takada and Asahara 2012). Due to the incomplete genome of goat, the expressed sequence tags (EST) and genome survey sequences (GSS) that are available in the public databases provide useful complements for discovering miRNAs. It has been reported that the identification of miRNAs with EST and GSS analyses has some advantages over the other methods and facilitates the prediction and study of miRNAs in nonmodel animals and plants, especially those in which a complete genome is not yet available (Li et al. 2010; Huang et al. 2011; Allmer and Yousef 2012). A computational search alone, however, is not enough for identifying miRNAs. Many additional criteria are necessary for distinguishing miRNA from other types of small RNA and increasing the prediction accuracy before the final experimental validation, and these include secondary structure prediction, minimal folding free energy (MFE) and some identification algorithms (Berezikov et al. 2006; Mendes et al. 2009; Takada and Asahara 2012).

In this study, we aimed to determine goat miRNA in the mammary gland by using a bioinformatics approach that was based on previously reported algorithms (Berezikov et al. 2006; Mendes et al. 2009; Dong et al. 2013a). To identify novel miRNAs, we searched, for the first time, all reported miRNAs of major mammals deposited in the miRBase against the goat EST and GSS databases. Further, we validated these newly predicted miRNAs with quantitative real-time reverse transcription-polymerase chain reaction (qRT-PCR). Subsequently, we examined the sequence characteristics of the verified novel miRNAs. Finally, we predicted the potential target genes of newly validated miRNAs. We hope that the results of our study will provide a starting point for further study of miRNA identification in animals without a complete genome and improve the understanding of miRNAs in caprine mammary gland.

Materials and methods

Sequence database and reference miRNA dataset

Publicly available EST and GSS sequences of goat were obtained from the GenBank nucleotide (nt) databases at the National Center for Biotechnology Information (NCBI, October 2013). A total of 14,497 ESTs and 300 GSSs were deposited in the EST and GSS databases, respectively. All previously known mammalian miRNA sequences were downloaded from the Sanger miRBase v21.0 (June 2014) (Griffiths-Jones et al. 2008; Kozomara and Griffiths-Jones 2011), which consists of a total of 7078 mature miRNA sequences from major mammals, including Bos taurus, Ovis aries, Sus scrofa, Canis familiaris, Mus musculus, Rattus norvegicus and Homo sapiens. The duplicates in the miRNA set were removed by performing multiple sequence alignments with ClustalW (Larkin et al. 2007) to avoid redundancy, and then, only 4583 unique mature miRNA sequences were considered for conserved miRNA prediction.

Computational resources

The alignment tool, basic local alignment search tool (BLAST) ver. 2.2.28 (October 2013), which was downloaded from the NCBI website (ftp://ftp.ncbi.nlm.nih.gov//executables/blast+), was used to identify the potentially conserved miRNAs. To predict the secondary structure of pre-miRNA, the ViennaRNA Package 2.0 (http://rna.tbi.univie.ac.at/) (Bernhart 2011; Lorenz et al. 2011) was used to generate RNA secondary structure and to calculate the MFE. MiPred software with a random forest prediction model was applied to distinguish real pre-miRNAs from pseudo-pre-miRNAs with similar stem–loops (Jiang et al. 2007). The MiPred web server is available at http://www.bioinf.seu.edu.cn/miRNA/.
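The redundancy-removal step in the reference dataset (7078 mature sequences reduced to 4583 unique ones) can be illustrated with a minimal sketch. The authors report using ClustalW multiple sequence alignments for this step; the sketch below swaps in a much simpler technique, exact string identity after normalising case and alphabet, so it should be read as an approximation of the idea rather than the published procedure. The record names and sequences are hypothetical toy data.

```python
from typing import Iterable, List, Tuple


def unique_mature_sequences(records: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep the first record for each distinct mature sequence.

    Sequences are normalised to upper-case RNA (T -> U) before comparison,
    so 'ugagguag' and 'TGAGGTAG' are treated as the same molecule.
    """
    seen = set()
    unique = []
    for name, seq in records:
        key = seq.strip().upper().replace("T", "U")
        if key not in seen:
            seen.add(key)
            unique.append((name, key))
    return unique


# Hypothetical toy records; names and sequences are illustrative only.
toy = [
    ("xxx-miR-a", "UGAGGUAGUAGGUUGUAUAGUU"),
    ("yyy-miR-a", "ugagguaguagguuguauaguu"),  # same sequence, other species
    ("zzz-miR-b", "TGAGGTAGTAGGTTGTGTGGTT"),  # DNA alphabet, distinct sequence
]

for name, seq in unique_mature_sequences(toy):
    print(name, seq)
```

Keeping the first occurrence preserves the input order, which makes it easy to trace each retained sequence back to the species record it came from.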


In silico detection of potential miRNAs of mammary gland in goat

The procedure that was used to search putative miRNAs in the present study was improved from previously reported EST-based approaches (Catalano et al. 2012; Muvva et al. 2012; Patanun et al. 2013). An overview of the in silico detection of novel goat miRNA in mammary gland is presented in figure 1. First, to determine potential miRNAs, the 4583 nonredundant mature miRNAs were taken as a reference and searched against the assembled ESTs and GSSs with a locally installed BLAST program, ver. 2.2.28, with a sensitive parameter setting (word size, 7 and E-value cutoff, 10) (Zhou et al. 2009). Then, output sequences with fewer than four mismatches compared with the query miRNA sequences were adopted. All the BLAST results were saved in FASTA format and used for further analysis. Second, sequences of 400 nt were extracted (200 nt upstream and downstream from the BLAST hits). If the length of a sequence was less than 400 nt, the entire available sequence was used as a miRNA precursor sequence. These extracted sequences were then searched against the NCBI nr (nonredundant protein) database using BLASTX with default parameters to remove the protein-coding sequences. Third, the remaining candidate precursor sequences of the potential miRNA homologues were assessed for secondary structure with RNAFold, ver. 1.8.4, from the ViennaRNA Package 2.0. All parameters were set to default values. The potential miRNAs of goat were subsequently identified based on the criteria reported previously (Frazier and Zhang 2011; Monavar et al. 2012; Patanun et al. 2013; Vishwakarma and Jadeja 2013; Panda et al. 2014). Finally, the novel hairpin candidates were screened to distinguish real miRNA precursors from pseudo-miRNA precursors with the MiPred web-based software. All MiPred outputs, including the MFEs and the prediction confidence of the random forest classifier, were recorded.

It is noteworthy that both the MFE and the MFE index (MFEI) of the secondary structures are essential to distinguish miRNAs from other small RNAs (Monavar et al. 2012; Vishwakarma and Jadeja 2013). The MFE was expressed as negative kcal/mol. The adjusted MFE (AMFE) represented the MFE of 100 nt. It was calculated using the following formula: AMFE = (MFE/length of the precursor miRNA sequence) × 100. The MFEI was calculated using the following equation: MFEI = AMFE/(G + C)% (Zhang et al. 2006b; Ng Kwang Loong and Mishra 2007; Monavar et al. 2012; Panda et al. 2014). All the data on the sequence characteristics of the novel goat miRNAs, including the contents of A, C, G and U, the A + U content, the G + C content, the base composition of the pre-miRNA sequences and the base composition at each position of the mature miRNA sequences, were processed using Perl scripts.

Detection of miRNA expression by qRT-PCR

The epithelial cells of the dairy goat mammary gland were frozen and stored in the Key Laboratory of Dairy Science of the Education Ministry, China. After resuscitation, the total RNA of the cells was extracted using TRIzol reagent (Life Technologies Corporation, Carlsbad, USA) according to the manufacturer’s instructions. The integrity and purity

Figure 1. Overview of the in silico detection of potential miRNAs of the mammary gland in goat. EST, expressed sequence tag; GSS, genome survey sequence; nt, nucleotides.
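Two numeric steps of the pipeline summarised in figure 1, the extraction of flanking sequence around a BLAST hit and the AMFE/MFEI calculation, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' Perl code: the function names are hypothetical, the window is taken as the hit plus up to 200 nt on each side, and the magnitude of the MFE is used so that the index is positive, in line with the positive values listed in table 4. The precursor and its MFE in the example are invented for illustration.

```python
def extract_window(seq: str, hit_start: int, hit_end: int, flank: int = 200) -> str:
    """Return the hit plus up to `flank` nt on each side.

    Coordinates are 0-based and end-exclusive. The window is clipped at the
    sequence ends, so a short EST is returned in full.
    """
    start = max(0, hit_start - flank)
    end = min(len(seq), hit_end + flank)
    return seq[start:end]


def gc_percent(seq: str) -> float:
    """G + C content of a sequence, in percent."""
    s = seq.upper()
    return 100.0 * (s.count("G") + s.count("C")) / len(s)


def amfe(mfe: float, length: int) -> float:
    """Adjusted MFE: |MFE| normalised to 100 nt (assumption: magnitude is used)."""
    return 100.0 * abs(mfe) / length


def mfei(mfe: float, seq: str) -> float:
    """MFE index: AMFE divided by the G + C percentage."""
    return amfe(mfe, len(seq)) / gc_percent(seq)


# Hypothetical 80-nt precursor with 50% G + C and an assumed MFE of -32 kcal/mol.
precursor = "GC" * 20 + "AU" * 20
print(amfe(-32.0, len(precursor)))   # AMFE of the toy precursor
print(mfei(-32.0, precursor))        # MFEI of the toy precursor

# A 22-nt hit inside a 1000-nt sequence yields a 422-nt window under this reading.
print(len(extract_window("A" * 1000, 500, 522)))
```

In practice the MFE would come from RNAFold output; it is passed in here as a plain number so that the sketch does not depend on any particular ViennaRNA interface.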

of the RNA were measured using electrophoresis traces and A260/A280 values, respectively. Reverse transcription was then performed with the RevertAid™ First Strand cDNA Synthesis kit (Thermo Fisher Scientific, Waltham, USA) according to the supplier’s protocol. Finally, real-time PCR was performed using the standard SYBR Green PCR protocol (Maxima™ SYBR Green/ROX quantitative PCR (qPCR) Master Mix, Thermo Fisher Scientific) on an Applied Biosystem’s 7300 Sequence Detection System. The PCR was programmed as follows: initial denaturation at 95°C for 10 min, which was followed by 40 cycles of denaturation at 95°C for 15 s, annealing at 60°C for 30 s and extension at 72°C for 15 s. All reactions were performed in triplicate. U6, one of the uniformly expressed small RNAs, was used as an internal control. The threshold cycle (CT) values were automatically determined by the instrument, and the levels of miRNA expression were presented as mean CT value ± standard deviation (SD) (Schmittgen and Livak 2008; VanGuilder et al. 2008; Livak and Schmittgen 2001).

Computational prediction of putative targets for the newly verified miRNAs

The potential targets of newly verified miRNAs were predicted using the same strategy as reported previously (Barozai 2012). All the newly verified mature miRNA sequences were subjected to the NCBI BLASTN program (http://blast.ncbi.nlm.nih.gov) as queries. The parameters were adjusted as: database, reference mRNA sequences (refseq_rna); organism, goat (taxid:9925) and program selection, highly similar sequences (megablast). The miRNA sequences showing 75% query coverage were selected and subsequently confirmed for target prediction with the software RNA-hybrid according to its operation manual (Krüger and Rehmsmeier 2006), available at http://bibiserv.techfak.uni-bielefeld.de/rnahybrid/. Only those targets having a stringent seed site located at positions 2–7 from the 5′ end of the miRNA, along with the supplementary site, and a hybridization MFE of −20 kcal/mol were considered genuine. The results validated by RNA-hybrid were saved.

Results and discussion

Identification of potential miRNAs of the caprine mammary gland

Previous studies have shown that EST analyses are a powerful tool to identify new miRNAs among various species that do not have whole genome sequences available. EST analyses are particularly advantageous because they can significantly enhance the ability to identify miRNAs and to investigate their structure and function (Li et al. 2010; Allmer and Yousef 2012; Takada and Asahara 2012; Gomes et al. 2013). Like EST analyses, GSS analyses are also used for miRNA identification (Frazier and Zhang 2011; Vishwakarma and Jadeja 2013). Currently, this method has been widely adopted by scientists around the world to identify miRNAs in animals (Zhou and Liu 2010; Barozai 2012), plants (Catalano et al. 2012; Muvva et al. 2012; Patanun et al. 2013; Vishwakarma and Jadeja 2013; Panda et al. 2014) and insects (Jia et al. 2010).

In this study, we used computational methods that were based on EST and GSS to predict new miRNAs in caprine mammary gland. The sequences and structural properties of known miRNAs were used to screen the candidate miRNAs in the EST and GSS databases of goat. A total of 7078 mature miRNAs from Bos taurus, Ovis aries, Sus scrofa, Canis familiaris, Mus musculus, Rattus norvegicus and Homo sapiens were downloaded from the miRBase. After the removal of duplicate miRNAs, only 4583 unique mature miRNA sequences were used as a reference to identify potential miRNAs. The 4583 nonredundant mature miRNAs were compared with 14,497 ESTs and 300 GSSs with the locally installed BLAST program. According to the preliminary matching, 2868 candidates (2786 sequences from ESTs and 82 sequences from GSSs) were selected. After removal of the protein-coding sequences with BLASTX, 667 candidates (620 sequences from ESTs and 47 sequences from GSSs) remained. The secondary structures of the remaining candidates were predicted with RNAFold. A total of 274 candidates (263 sequences from ESTs and 11 sequences from GSSs) were chosen, and other probable false pre-miRNA sequences were removed manually. After screening with the criteria mentioned previously and classifying with MiPred, 41 candidates (39 sequences from ESTs and two sequences from GSSs) were finally obtained (table 1).

Verification of miRNA expression using qRT-PCR

With the development of miRNA studies, it is generally appreciated that the detection of miRNA requires a more specific and sensitive detection method because of the limitations in their length and expression levels. Fortunately, a stem–loop real-time RT-PCR assay has been developed, and this assay can quantify mature miRNAs in a fast, specific, accurate and reliable manner (Chen et al. 2005). Until now, this approach has been successfully used to detect and quantify the miRNAs in many animals and plants (Ji et al. 2012a, b; Monavar et al. 2012; Dong et al. 2013b; Wu et al. 2014). Generally, qPCR data can be analysed and presented as absolute or relative values. Relative quantification is the preferred method, and the most common method for relative quantification is the 2^(−ΔΔCT) method (Varkonyi-Gasic and Hellens 2011). However, this method was not suitable for the present study. The CT value is the number of cycles required for each reaction to reach an arbitrary amount of fluorescence (VanGuilder et al. 2008). In this case, the mean CT values of the qRT-PCR data were used to determine the expression levels of novel miRNAs in the mammary gland of goat as in previous studies (Ji et al. 2012a; Monavar et al. 2012).

Here, stem–loop real-time RT-PCR was applied in the mammary gland of goat to verify the predicted miRNAs.


Table 1. The potential miRNAs of the mammary gland in goat that were determined using bioinformatics approach.

Name       Sequence of mature miRNA     ML   PL  P
mir-est01  AGCAAAGCGGGGGUGGGCCUGG       22   77  0.132
mir-est02  AGAAUAAAACUCACCCAAAUCUGU     24   76  0.089
mir-est03  CCUGUAAAUGGCCAUAUUUACU       22   71  0.153
mir-est04  ACGCACAGAGGUCUCAAAAUUCAUG    25   88  0.282
mir-est05  GCUUUAUUGGGUUGGCCAAAAAGUUC   26   75  0.122
mir-est06  GGUGUCACUAUCAAUGAUU          19   63  0.021
mir-est07  UGUUGGCCAAAAAGUUCACUCUGGG    25   82  0.001
mir-est08  CAACCUUAAAAUGUGCAUCUCUCU     24   70  0.035
mir-est09  GGGUUGGCCAAAAAGUUCAUUCAGG    25   74  0.043
mir-est10  AAAAAGUUCACUCUGGGUGUUCUG     24   68  0.001
mir-est11  GUUGGCCAAAAAGUUCAUUCAGG      23   74  0.149
mir-est12  UGAAGAGCAUUUCUGGGCAAA        21   71  0.086
mir-est13  GAGUCCCUACAAUUUAAACU         20   76  0.003
mir-est14  CAAUUAGCCCAGAGGUGAUGUU       22   65  0.005
mir-est15  AUAUGAAGGGGGCAUACUUAUA       22   67  0.012
mir-est16  CCAACUGUGCAAUUUAGCAAGAGA     24   61  0.042
mir-est17  AAUGAAGUUGGACGCAUGAACUUU     24   70  0.091
mir-est18  GUGCAGUUUAAGGAACUAAUAUAA     24   62  0.048
mir-est19  ACCGGAGGCAGUCUAACAGUGGAUCGU  27   80  0.323
mir-est20  GUGGCGAGCCAUGGUACGCC         20   52  0.517
mir-est21  UUGGCCAAAAAGUUCAUUCAGG       22   74  0.096
mir-est22  UUGUCUGCGUCUCUGCUCUCC        21   61  0.049
mir-est23  GGGUUGGCCAAAAAGUUCAUUCAG     24   77  0.127
mir-est24  GGUGAAUGGCACACUGUUGUG        21   73  0.074
mir-est25  ACAUAAAUUCUAAUACUAAUA        21   61  0.053
mir-est26  CGGGGGUGGGCCUGGUGCCCCUGA     24   93  0.029
mir-est27  ACCAUGCUGCUGACUAGAUGAC       22   70  0.685
mir-est28  GGAAAGCAAGGCUGGUCUCAC        21  101  0.375
mir-est29  UGAAGAGUACCUUUUGUCAAA        21   88  0.020
mir-est30  GCAUGCAAGCUUCAGGAGUU         20   58  0.064
mir-est31  GUUGGGGUUUUUCUGGAUUU         20   92  0.018
mir-est32  GGUCUAGGUGAUCUGGAGCCCU       22   74  0.144
mir-est33  UUUAGACUGCAGAAUCCAUU         20   57  0.036
mir-est34  AAAUAUUAAGAGCCUCCCC          19   66  0.066
mir-est35  CGCUUCCACCAGAGCUAGAU         20   70  0.516
mir-est36  CGGGGGUGGGCCUGGUGCCCCU       22   81  0.093
mir-est37  CUAUUUUAUAAACUCCCA           18   53  0.003
mir-est38  UCUGACCUUAUUAGCAGGUGCCU      23   60  0.402
mir-est39  GAAAGCCUAUGCUGGCUGCUA        21   85  0.006
mir-gss01  CAGAUUCCUCUUUAGUUAAGUCA      23  100  0.103
mir-gss02  AACUUUAGUCAUCAUCACAU         23  123  0.770

ML, length of mature miRNA; PL, length of pre-miRNA. P value of the randomization test was obtained through MiPred.
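The screening funnel that leads to the 41 candidates in table 1 can be checked with a short tally. The sketch below only re-adds the EST and GSS counts quoted in the text for each stage and reports how much of the previous stage survives; the stage labels are paraphrases chosen for this illustration, not terms defined by the authors.

```python
# Candidate counts after each filtering stage, as quoted in the text:
# (stage label, sequences from ESTs, sequences from GSSs)
stages = [
    ("BLAST match against reference miRNAs", 2786, 82),
    ("after BLASTX removal of protein-coding hits", 620, 47),
    ("after RNAFold secondary-structure screen", 263, 11),
    ("after criteria screen and MiPred", 39, 2),
]

# Total candidates per stage (ESTs + GSSs).
totals = [est + gss for _, est, gss in stages]

previous = None
for (label, est, gss), total in zip(stages, totals):
    if previous is None:
        note = ""
    else:
        note = f" ({100.0 * total / previous:.1f}% of previous stage)"
    print(f"{label}: {total}{note}")
    previous = total
```

The per-stage sums agree with the totals given in the text (2868, 667, 274 and 41).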

All the potential miRNAs were subjected to qRT-PCR. The primers used in the qRT-PCR assay are shown in table 2. Consequently, 29 miRNAs were validated and 12 were not detected among the 41 novel miRNAs (table 3). The prediction accuracy for the new miRNAs was 70.37%, which confirmed that the methods used in this study were efficient and reliable. The mean CT values of each new miRNA may be assessed for differential expression in the mammary gland of goat. As shown in table 3 and figure 2, mir-est08 showed the highest level of expression with a mean CT ± SD of 5.0291 ± 0.3718, which was followed by mir-est05 with a mean CT ± SD of 5.6986 ± 0.3637, and then mir-est03 with a mean CT ± SD of 16.6691 ± 0.2550. There were 13 miRNAs with mean CT values less than 30, which could imply that these miRNAs were highly expressed in the mammary gland or ubiquitous in goat. Additionally, 16 miRNAs with mean CT values ranging from 30 to 33 were detected, indicating that they were expressed in relatively low abundance. However, it has been previously reported that miRNA genes tend to be expressed in a specific developmental stage or tissue (Jia et al. 2010). Further, the transcripts of most miRNAs are not as abundant as those of protein-coding genes. Therefore, these results are generally acceptable for the above-mentioned reasons. The undetermined miRNAs might have gone undetected for several reasons. First, they could be pseudo-miRNAs. Second, they could be expressed at very low levels in the animal.


Table 2. Primers used in this study for qRT-PCR.

Name                      Sequence
mir-est01-F               ACACTCCAGCTGGGGAGCAAAGCGGGGGTG
mir-est02-F               ACACTCCAGCTGGGGAGAATAAAACTCACCCA
mir-est03-F               ACACTCCAGCTGGGGCCTGTAAATGGCCAT
mir-est04-F               ACACTCCAGCTGGGGACGCACAGAGGTCTCAAA
mir-est05-F               ACACTCCAGCTGGGGGCTTTATTGGGTTGGCCAA
mir-est06-F               ACACTCCAGCTGGGGGTGTCACTATCA
mir-est07-F               ACACTCCAGCTGGGGTGTTGGCCAAAAAGTTCA
mir-est08-F               ACACTCCAGCTGGGGCAACCTTAAAATGTGCA
mir-est09-F               ACACTCCAGCTGGGGGGGTTGGCCAAAAAGTTC
mir-est10-F               ACACTCCAGCTGGGGAAAAAGTTCACTCTGGG
mir-est11-F               ACACTCCAGCTGGGGGTTGGCCAAAAAGTTC
mir-est12-F               ACACTCCAGCTGGGTGAAGAGCATTTCTG
mir-est13-F               ACACTCCAGCTGGGGAGTCCCTACAATT
mir-est14-F               ACACTCCAGCTGGGGCAATTAGCCCAGAGG
mir-est15-F               ACACTCCAGCTGGGATATGAAGGGGGCA
mir-est16-F               ACACTCCAGCTGGGCCAACTGTGCAATTTA
mir-est17-F               ACACTCCAGCTGGGAATGAAGTTGGACGCA
mir-est18-F               ACACTCCAGCTGGGGTGCAGTTTAAGGAAC
mir-est19-F               ACACTCCAGCTGGGGACCGGAGGCAGTCTAACAGT
mir-est20-F               ACACTCCAGCTGGGGTGGCGAGCCAT
mir-est21-F               ACACTCCAGCTGGGTTGGCCAAAAAGTT
mir-est22-F               ACACTCCAGCTGGGTTGTCTGCGTCTC
mir-est23-F               ACACTCCAGCTGGGGGGTTGGCCAAAAAGT
mir-est24-F               ACACTCCAGCTGGGGGTGAATGGCACA
mir-est25-F               ACACTCCAGCTGGGACATAAATTCTAA
mir-est26-F               ACACTCCAGCTGGGGCGGGGGTGGGCCTGGTG
mir-est27-F               ACACTCCAGCTGGGACCATGCTGCTGAC
mir-est28-F               ACACTCCAGCTGGGGGAAAGCAAGGCT
mir-est29-F               ACACTCCAGCTGGGTGAAGAGTACCTTTT
mir-est30-F               ACACTCCAGCTGGGGCATGCAAGCTT
mir-est31-F               ACACTCCAGCTGGGGTTGGGGTTTTT
mir-est32-F               ACACTCCAGCTGGGGGGTCTAGGTGATCTG
mir-est33-F               ACACTCCAGCTGGGTTTAGACTGCAGAA
mir-est34-F               ACACTCCAGCTGGGAAATATTAAGAGC
mir-est35-F               ACACTCCAGCTGGGCGCTTCCACCAGAG
mir-est36-F               ACACTCCAGCTGGGCGGGGGTGGGCCTGGT
mir-est37-F               ACACTCCAGCTGGGCTATTTTATAAA
mir-est38-F               ACACTCCAGCTGGGTCTGACCTTATTAGCAG
mir-est39-F               ACACTCCAGCTGGGGAAAGCCTATGCTGG
mir-gss01-F               ACACTCCAGCTGGGGCAGATTCCTCTTTAGT
mir-gss02-F               ACACTCCAGCTGGGGAACTTTAGTCATC
U6-F                      CTCGCTTCGGCAGCACA
U6-R                      AACGCTTCACGAATTTGCGT
Universal reverse primer  TGGTGTCGTGGAGTCG

Finally, their expression could be absent in the mammary gland.

Sequence characteristics of novel pre-miRNAs and mature miRNAs

The sequence characteristics of pre-miRNAs and mature miRNAs in plants have been reported previously in many studies (Silveri et al. 2006; Yuan et al. 2013). Recently, the same analyses have reportedly been extended to other species, including animals (Zhou et al. 2009; Zhou and Liu 2010; Barozai 2012; Ling et al. 2013) and insects (Jia et al. 2010). In the present study, a detailed analysis of the major sequence characteristics of the verified new miRNAs in goat was performed and compared with the previous reports. The complete information on the verified miRNAs, including the miRNA family, the location of the miRNA, A + U content, A/U ratio, MFE and MFEI, is provided in table 4. Moreover, statistical summaries of the major sequence characteristics of the pre-miRNAs are shown in table 5.

It has been shown that many animal and plant miRNAs are evolutionarily conserved and can be classified into the same miRNA gene family. Analyses of miRNA families show an obvious tendency for the miRNAs of a family to share a common seed sequence. The seed sequence is defined as nucleotides 2–7 from the 5′ end of mature miRNA sequences, and this is the specific determinant for target recognition (Lewis et al. 2005; Friedman et al. 2009). Therefore, in part, miRNAs


are easily able to acquire new functions because of a single nucleotide change in their seed region, which would completely alter their target spectra (Liu et al. 2008). Thus, all newly verified miRNAs were screened to identify miRNA families by using the sequence identity of the seed sequence as previously reported (Zhou et al. 2009; Zhou and Liu 2010; Frazier and Zhang 2011; Barozai 2012; Muvva et al. 2012). As a result, 14 miRNAs were grouped into 13 families among the 29 validated miRNAs (table 4).

Table 3. Results of qRT-PCR.

Name       Mean CT ± SD       Name       Mean CT ± SD
mir-est01  30.4798 ± 0.3237   mir-est22  27.1234 ± 0.0695
mir-est02  28.0415 ± 0.2637   mir-est23  30.9869 ± 0.1653
mir-est03  16.6691 ± 0.2550   mir-est24  30.0641 ± 0.0737
mir-est04  31.7358 ± 0.2866   mir-est25  30.4770 ± 0.2537
mir-est05  5.6986 ± 0.3637    mir-est26  24.1457 ± 0.1738
mir-est06  Undetermined       mir-est27  25.0756 ± 0.0477
mir-est07  Undetermined       mir-est28  31.5199 ± 0.4051
mir-est08  5.0291 ± 0.3718    mir-est29  Undetermined
mir-est09  Undetermined       mir-est30  28.0148 ± 0.0462
mir-est10  16.7433 ± 0.2235   mir-est31  31.7374 ± 0.7171
mir-est11  Undetermined       mir-est32  32.7685 ± 0.1606
mir-est12  30.7673 ± 0.1764   mir-est33  Undetermined
mir-est13  28.5663 ± 0.1357   mir-est34  Undetermined
mir-est14  30.9406 ± 0.5851   mir-est35  Undetermined
mir-est15  29.2669 ± 0.3007   mir-est36  Undetermined
mir-est16  30.0813 ± 0.0874   mir-est37  Undetermined
mir-est17  30.5196 ± 0.2735   mir-est38  32.1140 ± 0.0188
mir-est18  31.6031 ± 0.0518   mir-est39  Undetermined
mir-est19  Undetermined       mir-gss-1  28.7151 ± 0.1308
mir-est20  28.8625 ± 0.2063   mir-gss-2  31.2683 ± 0.1357
mir-est21  30.4587 ± 0.1179   U6         7.7675 ± 0.1041

CT, threshold cycle; SD, standard deviation.

As shown in tables 4, 5 and figure 3, the lengths of the verified mature miRNAs in the caprine mammary gland were within 20–26 nt with an average value of 22 ± 2 nt, and the identified pre-miRNAs varied from 52 to 123 nt with an average value of 75 ± 15 nt, which was consistent with previously reported pre-miRNAs in various animals and plants (Liu et al. 2008; Zhou et al. 2009; Zhou and Liu 2010; Frazier and Zhang 2011; Barozai 2012; Muvva et al. 2012). It has been reported that the majority of mature miRNAs (>75%) have lengths of 20 and 21 nt in many plants, including sweet potato, soybean and maize (Zhang et al. 2009; Dehury et al. 2013). However, only 31.03% of the mature miRNAs had lengths of 20 and 21 nt as shown in figure 3a, and this was consistent with the findings of previous studies in animals (Zhou et al. 2009; Barozai 2012). Further, the majority of the newly identified pre-miRNAs (82.78%) had 60–99 nucleotides (figure 3b), which was similar to animals, including horse and pig miRNAs (Zhou et al. 2009; Zhou and Liu 2010), and different from previously reported plants (Zhang et al. 2009; Monavar et al. 2012; Dehury et al. 2013; Vishwakarma and Jadeja 2013; Panda et al. 2014). Interestingly, except for mir-est14, all the newly identified miRNA sequences were located at the 5′ end of the miRNA precursor sequence (table 4).

Generally, the frequency of the four nucleotides (A, U, G and C) is uneven in animal pre-miRNA sequences (Wang et al. 2013). In this study, the distributions of A, U, G and C were different (table 5). The U (29.17 ± 6.20%) and A (26.97 ± 5.93%) nucleotides were predominant compared to the other two nucleotides: G (23.67 ± 6.67%) and C (20.19 ± 5.44%). Thus, it was obvious that the pre-miRNA sequences contained more A + U (56.14 ± 9.72%) nucleotides than G + C (43.86 ± 9.72%). When identifying the potential miRNAs, the content of the A + U nucleotides was an important

Figure 2. Expression of potential novel miRNAs in the mammary gland of the goat. The relative expression abundance is expressed as mean CT value.
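The counts quoted for the qRT-PCR screen (29 detected, 12 undetermined, 13 with mean CT below 30 and 16 between 30 and 33) can be re-derived from the mean CT values in table 3. The sketch below is only a bookkeeping aid: the values are transcribed by hand from table 3, `None` stands for "Undetermined", and the CT = 30 cut-off is the one used in the text. The ratio 29/41 evaluates to roughly 70.7%, which is close to, though not identical with, the 70.37% quoted in the text.

```python
# Mean CT values transcribed from table 3; None marks "Undetermined".
# The U6 internal control is deliberately left out.
mean_ct = {
    "mir-est01": 30.4798, "mir-est02": 28.0415, "mir-est03": 16.6691,
    "mir-est04": 31.7358, "mir-est05": 5.6986, "mir-est06": None,
    "mir-est07": None, "mir-est08": 5.0291, "mir-est09": None,
    "mir-est10": 16.7433, "mir-est11": None, "mir-est12": 30.7673,
    "mir-est13": 28.5663, "mir-est14": 30.9406, "mir-est15": 29.2669,
    "mir-est16": 30.0813, "mir-est17": 30.5196, "mir-est18": 31.6031,
    "mir-est19": None, "mir-est20": 28.8625, "mir-est21": 30.4587,
    "mir-est22": 27.1234, "mir-est23": 30.9869, "mir-est24": 30.0641,
    "mir-est25": 30.4770, "mir-est26": 24.1457, "mir-est27": 25.0756,
    "mir-est28": 31.5199, "mir-est29": None, "mir-est30": 28.0148,
    "mir-est31": 31.7374, "mir-est32": 32.7685, "mir-est33": None,
    "mir-est34": None, "mir-est35": None, "mir-est36": None,
    "mir-est37": None, "mir-est38": 32.1140, "mir-est39": None,
    "mir-gss-1": 28.7151, "mir-gss-2": 31.2683,
}

detected = {name: ct for name, ct in mean_ct.items() if ct is not None}
undetermined = [name for name, ct in mean_ct.items() if ct is None]

# Bins used in the text: mean CT < 30 versus 30 <= mean CT <= 33.
higher = sorted(name for name, ct in detected.items() if ct < 30)
lower = sorted(name for name, ct in detected.items() if 30 <= ct <= 33)

rate = len(detected) / len(mean_ct)
print(f"detected {len(detected)} of {len(mean_ct)} ({100 * rate:.2f}%)")
print(f"mean CT < 30: {len(higher)}; mean CT 30-33: {len(lower)}")
```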


Table 4. Major sequence characteristics of the verified miRNAs in the caprine mammary gland.

Name       miRNA family  Loc  A/U   A + U (%)  ML  PL   MFE    MFEI
mir-est01  -             5′   0.87  36.36      22  77   44.42  0.79
mir-est02  -             5′   1.00  57.89      24  76   21.93  0.87
mir-est03  miR-3607-3p   5′   1.05  54.93      22  71   22.08  0.86
mir-est04  miR-1451      5′   0.80  51.14      25  88   31.04  0.80
mir-est05  -             5′   0.75  56.00      26  75   28.75  0.91
mir-est08  -             5′   0.86  74.29      24  70   21.27  0.81
mir-est10  miR-2284      5′   0.74  58.82      24  68   25.44  0.90
mir-est12  miR-300       5′   1.17  70.42      21  71   42.26  1.39
mir-est13  miR-2114      5′   0.47  57.89      20  76   31.06  1.17
mir-est14  -             3′   0.77  60.00      22  65   21.91  0.74
mir-est15  -             5′   1.41  61.19      22  67   20.94  0.86
mir-est16  -             5′   2.00  59.02      24  61   26.53  1.04
mir-est17  -             5′   0.78  58.57      24  70   28.25  1.08
mir-est18  -             5′   1.39  69.35      24  62   22.38  1.24
mir-est20  -             5′   1.40  46.15      20  52   27.54  0.76
mir-est21  miR-548s      5′   0.71  55.41      22  74   29.01  0.96
mir-est22  miR-632       5′   0.67  49.18      21  61   41.02  1.18
mir-est23  -             5′   0.72  55.84      24  77   39.27  1.11
mir-est24  miR-1257      5′   0.95  53.42      21  73   36.17  0.95
mir-est25  -             5′   1.05  73.77      21  61   18.69  2.14
mir-est26  miR-2305      5′   0.95  41.94      24  93   59.59  0.81
mir-est27  -             5′   0.39  45.71      22  70   27.35  0.78
mir-est28  -             5′   1.29  47.52      21  101  45.88  0.79
mir-est30  miR-300       5′   1.09  39.66      20  58   21.72  0.75
mir-est31  miR-3170      5′   0.78  61.96      20  92   30.81  0.84
mir-est32  miR-601       5′   1.20  44.59      22  74   36.91  0.84
mir-est38  miR-4256      5′   0.79  56.67      23  60   29.21  1.02
mir-gss-1  -             5′   0.89  66.00      23  100  26.68  0.77
mir-gss-2  miR-1278      5′   1.08  64.23      23  123  38.04  1.02

Loc, location of the miRNA; ML, length of the verified mature miRNA; PL, length of the precursor miRNA; MFE, minimum free energy; MFEI, MFE index.
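The seed-based grouping behind the family column of table 4 rests on one operation: taking nucleotides 2–7 from the 5′ end of each mature sequence and comparing them. A minimal sketch of that operation follows. It groups a handful of candidate sequences copied from table 1 among themselves, whereas the authors assigned families by comparison with known miRNAs, so the groups printed here are an illustration of the seed definition and not a reproduction of table 4. The function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def seed(mature: str) -> str:
    """Nucleotides 2-7 from the 5' end (1-based), as defined in the text."""
    return mature.upper().replace("T", "U")[1:7]


def group_by_seed(mirnas: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each seed sequence to the names of the miRNAs that carry it."""
    groups = defaultdict(list)
    for name, seq in mirnas.items():
        groups[seed(seq)].append(name)
    return dict(groups)


# Mature sequences copied from table 1 (a small subset, for illustration).
candidates = {
    "mir-est01": "AGCAAAGCGGGGGUGGGCCUGG",
    "mir-est09": "GGGUUGGCCAAAAAGUUCAUUCAGG",
    "mir-est23": "GGGUUGGCCAAAAAGUUCAUUCAG",
    "mir-est26": "CGGGGGUGGGCCUGGUGCCCCUGA",
    "mir-est36": "CGGGGGUGGGCCUGGUGCCCCU",
}

for seed_seq, names in group_by_seed(candidates).items():
    print(seed_seq, names)
```

Because the comparison uses only six positions, a single substitution inside the seed moves a sequence to a different group, which is the sensitivity the text attributes to the seed region.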

Table 5. Statistical summary of the major sequence characteristics of the newly identified pre-miRNAs.

Characteristic | Mean ± SD
Sequence length (nt) | 74.69 ± 15.26
A + U (%) | 56.14 ± 9.72
G + C (%) | 43.86 ± 9.72
A% | 26.97 ± 5.93
U% | 29.17 ± 6.20
G% | 23.67 ± 6.67
C% | 20.19 ± 5.44
A/U | 0.97 ± 0.33
G/C | 1.24 ± 0.44

filter criterion because a higher A + U content might make the secondary structure of a pre-miRNA unstable and more easily processed into mature miRNA by the RNA-induced silencing complex. In addition, the A/U and G/C ratios were 0.97 ± 0.33 and 1.24 ± 0.44, respectively, in the newly identified pre-miRNAs, suggesting that the pre-miRNA sequences of the caprine mammary gland contained ∼10% more U than A and ∼20% more G than C. These results were in agreement with previous studies in plants (Barozai et al. 2013; Dehury et al. 2013; Patanun et al. 2013), which probably indicates that the pattern of base composition in animal pre-miRNA sequences is similar to that of plants.

In the present study, the base composition at each position in the mature miRNAs was also examined (figure 4). Reportedly, U is the predominant nucleotide at the 5′ end of mature miRNA sequences in many plants (Monavar et al. 2012; Dehury et al. 2013; Patanun et al. 2013), and a similar result was obtained in horse (Zhou et al. 2009). However, G was the dominant nucleotide at the first position in the mature miRNAs of caprine mammary gland and U was the least frequent. Interestingly, C prevailed at position 19, which was in accordance with plants. The reasons for these findings remain unclear and need further investigation.

The formation of a stem–loop hairpin secondary structure is a critical step in miRNA maturation. However, a stem–loop hairpin structure is not a unique feature of miRNAs because other RNAs, such as mRNA, rRNA and tRNA, can also fold into similar hairpin structures. Consequently, three criteria, namely negative MFE, adjusted MFE (AMFE) and MFEI, have been proposed to distinguish potential miRNAs from other RNAs (Zhang et al. 2006b). Generally, the lower the MFE, the more stable the secondary structure of the pre-miRNA. Here, the mean value of the MFE was −30.90 ± 9.43 kcal/mol with a range

Journal of Genetics In silico detection of potential miRNAs

Figure 3. The length distribution of the (a) novel mature miRNAs and (b) pre-miRNAs in the mammary gland of goat.
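The composition statistics and the length-normalized folding indices named in the text can be made concrete with a short sketch. It assumes the commonly used definitions AMFE = 100 × |MFE| / length and MFEI = AMFE / (G + C)%, which may differ in detail from the authors' implementation. The sequence and MFE value below are hypothetical; in practice the MFE would come from an RNA folding program.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Percent composition and A/U, G/C ratios of an RNA (or cDNA) sequence.

    Assumes the sequence contains at least one U and one C, otherwise the
    ratios are undefined.
    """
    rna = seq.upper().replace("T", "U")
    counts = Counter(rna)
    n = len(rna)
    pct = {base: 100.0 * counts[base] / n for base in "AUGC"}
    return {
        "length": n,
        "A+U%": pct["A"] + pct["U"],
        "G+C%": pct["G"] + pct["C"],
        "A/U": counts["A"] / counts["U"],
        "G/C": counts["G"] / counts["C"],
    }


def folding_indices(mfe_kcal: float, length: int, gc_percent: float) -> tuple:
    """Return (AMFE, MFEI) under the assumed definitions given above."""
    amfe = 100.0 * abs(mfe_kcal) / length  # free energy per 100 nt
    mfei = amfe / gc_percent               # further normalized by G+C content
    return amfe, mfei


# Hypothetical 80-nt precursor (20 A, 24 U, 20 G, 16 C) with an assumed MFE.
toy_precursor = "A" * 20 + "U" * 24 + "G" * 20 + "C" * 16
toy_mfe = -30.6  # kcal/mol, illustrative value only

comp = base_composition(toy_precursor)
amfe, mfei = folding_indices(toy_mfe, comp["length"], comp["G+C%"])
print(comp)
print(f"AMFE = {amfe:.2f}, MFEI = {mfei:.2f}")
```

Dividing by length and then by G + C content is what lets precursors of different sizes and compositions be compared against a single cutoff.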

Figure 4. The percentage distribution of the base composition at each position in the mature miRNA sequences of caprine mammary gland.

of −18.69 to −59.59 kcal/mol. The MFEI is also a reliable criterion for screening potential miRNAs because it normalizes the effect of sequence length on the MFE. It has been reported that a candidate RNA sequence is more likely to be an miRNA when its MFEI is greater than 0.85 (Zhang et al. 2006b; Jones-Rhoades 2012). Here, the MFEIs of the pre-miRNAs in caprine mammary gland ranged from 0.74 to 2.14 with an average of 0.97 ± 0.28, and a majority of the novel verified miRNAs had MFEI values over 0.85 (table 4). These results were similar to previous reports in other animal and plant species (Liu et al. 2008; Zhou et al. 2009; Frazier and Zhang 2011; Barozai 2012; Monavar et al. 2012; Muvva et al. 2012; Patanun et al. 2013; Vishwakarma and Jadeja 2013; Panda et al. 2014).

Prediction of putative targets for the newly verified miRNAs

It is generally appreciated that the regulatory mechanism of the mature miRNA in mammalian systems depends on complementary base pairing, primarily to the 3′-UTR of the target mRNA, which subsequently inhibits translation and/or causes degradation of the mRNA (Oulas et al. 2012;


Table 6. Putative targets of the novel miRNA in caprine mammary gland.

miRNA | Putative target | Function
mir-est01 | Diacylglycerol kinase alpha (DGKA) | Cancer and tumour-related
mir-est02 | Docking protein 1 (DOK1) | Transportation
mir-est03 | TTK protein kinase (TTK) | Growth and development
 | Sorting nexin 15 (SNX15) | Transportation
mir-est04 | DIS3 mitotic control homologue (DIS3) | Growth and development
 | RIO kinase 1 (RIOK1) | Metabolism
 | Phosducin-like (PDCL) | Signalling
mir-est05 | Complement component 3d | Immunity
mir-est08 | Zinc finger protein 667 (ZNF667) | Transcription factor
 | Lysosomal trafficking regulator (LYST) | Transportation
mir-est12 | NUT midline carcinoma, family member 1 (NUTM1) | Cancer and tumour-related
 | Fibronectin leucine-rich transmembrane protein 1 (FLRT1) | Growth and development
 | HECT and RLD domain containing E3 protein ligase family member 1 (HERC1) | Growth and development
mir-est13 | Molybdenum cofactor synthesis 2 (MOCS2) | Metabolism
 | Cerebellin 1 precursor (CBLN1) | Metabolism
mir-est14 | Vacuolar protein sorting 13 homologue D (VPS13D) | Transportation
mir-est15 | RCC1 domain containing 1 (RCCD1) | Growth and development
mir-est16 | Patched 1 (PTCH1) | Signalling
 | LPS-responsive vesicle trafficking, beach and anchor containing (LRBA) | Transportation
mir-est18 | ST6 beta-galactosamide alpha-2,6-sialyltransferase 1 (ST6GAL1) | Signalling
 | Leucine-rich repeats and IQ motif containing 3 (LRRIQ3) | Structural proteins
 | Mitochondrial translational initiation factor 2 (MTIF2) | Growth and development
mir-est20 | von Willebrand factor A domain containing 7 (VWA7) | Growth and development
 | Zinc finger protein 862 (ZNF862) | Transcription factor
 | Actin-binding Rho-activating protein (ABRA) | Signalling
mir-est21 | Transmembrane protein 87B (TMEM87B) | Structural proteins
 | Solute carrier family 25, member 30 (SLC25A30) | Transportation
 | Cysteine sulfinic acid decarboxylase (CSAD) | Metabolism
 | 3′-phosphoadenosine 5′-phosphosulfate synthase 1 (PAPSS1) | Metabolism
mir-est22 | Ubiquitin specific peptidase 53 (USP53) | Transcription factor
 | Family with sequence similarity 114, member A2 (FAM114A2) | Cancer and tumour-related
 | Phosphoglycerate kinase 1 (PGK1) | Metabolism
 | O-linked N-acetylglucosamine (GlcNAc) transferase (OGT) | Metabolism
 | Casein kinase 1, delta (CSNK1D) | Metabolism
 | 20 open reading frame, human C5orf42 (C20H5orf42) | Structural proteins
 | Phosphoinositide-3-kinase, regulatory subunit 5 (PIK3R5) | Signalling
mir-est23 | Acyl-CoA dehydrogenase, C-4 to C-12 straight chain (ACADM) | Metabolism
 | RAS and EF-hand domain containing (RASEF) | Signalling
 | Tropomodulin 2 (TMOD2) | Structural proteins
 | PERP, TP53 apoptosis effector (PERP) | Signalling
 | Spectrin repeat containing, nuclear envelope 1 (SYNE1) | Structural proteins
 | CGRP receptor component (CRCP) | Metabolism
 | Transmembrane protein 87B (TMEM87B) | Structural proteins
 | Cysteine sulfinic acid decarboxylase (CSAD) | Metabolism
 | Spectrin repeat containing, nuclear envelope family member 3 (SYNE3) | Structural proteins
 | MRE11 meiotic recombination 11 homologue A (S. cerevisiae) (MRE11A) | Growth and development
 | Medium-chain specific acyl-CoA dehydrogenase | Metabolism
 | Maltase-glucoamylase | Metabolism
 | Glycerol kinase 5 (GK5) | Metabolism
mir-est24 | Inhibitor of kappa light polypeptide gene enhancer in B-cells, kinase complex-associated protein (IKBKAP) | Immunity
 | Carboxylesterase 4A (CES4A) | Metabolism
 | Peroxisomal biogenesis factor 1 (PEX1) | Metabolism
mir-est25 | Sodium channel voltage-gated type I alpha subunit (SCN1A) | Metabolism
 | Cytokine-induced apoptosis inhibitor 1 (CIAPIN1) | Signalling
 | Sperm-specific antigen 2 (SSFA2) | Growth and development
mir-est26 | SCO-spondin (SSPO) | Growth and development
 | Ataxin 7-like 2 (ATXN7L2) | Growth and development
 | Formimidoyltransferase cyclodeaminase (FTCD) | Metabolism
 | Ankyrin 1, erythrocytic (ANK1) | Structural proteins
 | Complement component 2 (C2) | Immunity
 | BPI-fold containing family B, member 3 (BPIFB3) | Structural proteins
 | Antioxidant 1 copper chaperone (ATOX1) | Transportation
mir-est27 | Nuclear receptor coactivator 6 (NCOA6) | Transcription factor
mir-est28 | Zinc finger protein 133 (ZNF133) | Transcription factor
 | Pleckstrin homology domain containing, family B (evectins) member 2 (PLEKHB2) | Metabolism
mir-est30 | Voltage-dependent L-type calcium channel subunit alpha-1S-like (LOC102172260) | Signalling
 | Fibronectin type III and ankyrin repeat domains 1 (FANK1) | Structural proteins
 | Arylacetamide deacetylase-like 3 (AADACL3) | Metabolism
 | Minichromosome maintenance complex component 4 (MCM4) | Cancer and tumour-related
mir-est31 | E1A binding protein p400 (EP400) | Cancer and tumour-related
 | Kelch-like family member 20 (KLHL20) | Metabolism
 | G protein-coupled receptor 111 (GPR111) | Signalling
 | G protein-coupled receptor 179 (GPR179) | Signalling
mir-est32 | Solute carrier family 39 (zinc transporter), member 4 (SLC39A4) | Transportation
 | Mucin 6, oligomeric mucus/gel-forming (MUC6) | Metabolism
 | Golgin A3 (GOLGA3) | Metabolism
mir-est38 | Regulatory factor X, 3 (RFX3) | Transcription factor
mir-gss01 | Uronyl-2-sulfotransferase (UST) | Metabolism
 | Nuclear factor, erythroid 2-like 1 (NFE2L1) | Transcription factor
 | Ubiquitin-like 3 (UBL3) | Transcription factor
 | Natriuretic peptide receptor A/guanylate cyclase A (atrionatriuretic peptide receptor A) (NPR1) | Metabolism
 | Calcium channel, voltage-dependent, L type, alpha 1F subunit (CACNA1F) | Signalling
mir-gss02 | Signal peptide peptidase like 2A (SPPL2A) | Metabolism
 | Ring finger protein (C3H2C3 type) 6 (RNF6) | Transcription factor
 | DnaJ (Hsp40) homologue, subfamily C, member 2 (DNAJC2) | Metabolism
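The category totals quoted in the text can be recomputed directly from table 6. The sketch below encodes each miRNA's targets by functional category only, in table order; the short codes (M, S, G, St, TF, T, C, I) are shorthand introduced here for compactness and are not notation from the paper.

```python
from collections import Counter

CATEGORY_NAMES = {
    "M": "metabolism", "S": "signalling", "G": "growth and development",
    "St": "structural proteins", "TF": "transcription factor",
    "T": "transportation", "C": "cancer and tumour-related", "I": "immunity",
}

# One entry per miRNA in table 6; one code per listed target, in table order.
target_categories = {
    "mir-est01": "C",
    "mir-est02": "T",
    "mir-est03": "G T",
    "mir-est04": "G M S",
    "mir-est05": "I",
    "mir-est08": "TF T",
    "mir-est12": "C G G",
    "mir-est13": "M M",
    "mir-est14": "T",
    "mir-est15": "G",
    "mir-est16": "S T",
    "mir-est18": "S St G",
    "mir-est20": "G TF S",
    "mir-est21": "St T M M",
    "mir-est22": "TF C M M M St S",
    "mir-est23": "M S St S St M St M St G M M M",
    "mir-est24": "I M M",
    "mir-est25": "M S G",
    "mir-est26": "G G M St I St T",
    "mir-est27": "TF",
    "mir-est28": "TF M",
    "mir-est30": "S St M C",
    "mir-est31": "C M S S",
    "mir-est32": "T M M",
    "mir-est38": "TF",
    "mir-gss01": "M TF TF M S",
    "mir-gss02": "M TF M",
}

per_mirna = {name: codes.split() for name, codes in target_categories.items()}

# Tally targets per functional category and per miRNA.
totals = Counter(code for codes in per_mirna.values() for code in codes)
n_targets = sum(totals.values())
most_targets = max(per_mirna, key=lambda name: len(per_mirna[name]))

for code, count in totals.most_common():
    print(f"{CATEGORY_NAMES[code]}: {count}")
print(f"total targets: {n_targets} across {len(per_mirna)} miRNAs")
print(f"most targets: {most_targets} ({len(per_mirna[most_targets])})")
```

The tally gives 85 targets over 27 miRNAs, with metabolism (27) the largest category, which matches the counts reported in the text and the absence of predicted targets for mir-est10 and mir-est17.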

Peterson et al. 2014). Consequently, identifying miRNA targets is an important step in validating the identified miRNAs. Here, a total of 85 targets were predicted for the 29 novel miRNAs in caprine mammary gland using a combination of BLAST and RNAhybrid algorithms (table 6). This analysis indicates that a single miRNA can regulate more than one target gene in caprine mammary gland, which reflects the cooperative regulation of gene expression by miRNAs. No target could be predicted from the current database for mir-est10 and mir-est17.

Earlier studies have documented that most miRNA targets are metabolic transporters, signal transduction factors and transcription factors (Zhang et al. 2006a, c; Bhardwaj et al. 2010). Our results were largely consistent with these findings. The 85 potential miRNA targets were categorized into broad divisions with various biological functions, including metabolism (27 genes), signalling (12 genes), growth and development (11 genes), structural proteins (10 genes), transcription factors (nine genes), transportation (eight genes), cancer and tumour-related (five genes) and immunity (three genes) (table 6).

The mammary gland is one of the few tissues in mammals that can repeatedly undergo growth, functional differentiation and regression. MiRNAs can regulate gene expression at a posttranscriptional level either by causing RNA degradation or by blocking translation through base pairing with complementary sequences within mRNA (Silveri et al. 2006; Foubert et al. 2010). Many miRNAs have now been identified in animals by experimental and computational approaches. Adequate characterization of the actions of miRNAs in the mammary gland in different physiological states could provide new insights into how miRNAs act as master regulators of gene expression in many biological and pathological processes of the mammary gland (Wright et al. 2010; Yu and Pestell 2012; Gigli and Maizon 2013). In this study, as shown in table 6, proteins involved in metabolic processes were the most common class of targets annotated for the novel miRNAs in caprine mammary gland. Moreover, cell signalling proteins and growth and development associated proteins formed two further broad groups of targets. These results of the target analysis are closely connected with the physiological function of the mammary gland. Therefore, it can be concluded that miRNAs play an important role in regulating the development and functional differentiation of caprine mammary gland.

Conclusion

This study identified and verified 29 mature miRNAs in the caprine mammary gland with a bioinformatics approach that was based on EST and GSS analyses and qRT-PCR. The prediction accuracy for the new miRNAs was 70.37%, which confirmed that the methods used in this study were efficient and reliable. Our work may provide a reference for the further identification of miRNAs in animals without a complete genome. Eighty-five potential target genes were predicted for the newly verified miRNAs. The majority of those targets seemed to encode proteins participating in the regulation of metabolism, signal transduction, growth and development. These targets will be validated in a subsequent study. It can be expected that these novel miRNAs and their

potential targets will be good functional genomic resources to understand the miRNA gene regulatory mechanisms in the mammary gland development and function of dairy goat.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (grant nos. 31100959 and 31401093) and the China Postdoctoral Science Foundation (grant no. 2011M500633).

References

Allmer J. and Yousef M. 2012 Computational methods for ab initio detection of microRNAs. Front. Genet. 3, 209.
Anglicheau D., Muthukumar T. and Suthanthiran M. 2010 MicroRNAs, small RNAs with big effects. Transplantation 90, 105–112.
Barozai M. Y. 2012 The novel 172 sheep (Ovis aries) microRNAs and their targets. Mol. Biol. Rep. 39, 6259–6266.
Barozai M. Y., Din M. and Baloch I. A. 2013 Structural and functional based identification of the bean (Phaseolus) microRNAs and their targets from expressed sequence tags. J. Struct. Funct. Genomics 14, 11–18.
Bartel D. P. 2004 MicroRNAs: genomics, biogenesis, mechanism, and function. Cell 116, 281–297.
Berezikov E., Cuppen E. and Plasterk R. H. 2006 Approaches to microRNA discovery. Nat. Genet. 38, 2–7.
Bernhart S. H. 2011 RNA structure prediction. Methods Mol. Biol. 760, 307–323.
Bhardwaj J., Mohammad H. and Yadav S. K. 2010 Computational identification of microRNAs and their targets from the expressed sequence tags of horsegram (Macrotyloma uniflorum (Lam.) Verdc.). J. Struct. Funct. Genomics 4, 233–240.
Caiment F., Charlier C., Hadfield T., Cockett N., Georges M. and Baurain D. 2010 Assessing the effect of the CLPG mutation on the microRNA catalog of skeletal muscle using high-throughput sequencing. Genome Res. 20, 1651–1662.
Carrington J. C. and Ambros V. 2003 Role of microRNAs in plant and animal development. Science 301, 336–338.
Catalano D., Pignone D., Sonnante G. and Finetti-Sialer M. M. 2012 In-silico and in-vivo analyses of EST databases unveil conserved miRNAs from Carthamus tinctorius and Cynara cardunculus. BMC Bioinformatics 13, S12.
Chen C., Ridzon D. A., Broomer A. J., Zhou Z., Lee D. H., Nguyen J. T. et al. 2005 Real-time quantification of microRNAs by stem-loop RT-PCR. Nucleic Acids Res. 33, e179.
Dehury B., Panda D., Sahu J., Sahu M., Sarma K., Barooah M. et al. 2013 In silico identification and characterization of conserved miRNAs and their target genes in sweet potato (Ipomoea batatas L.) expressed sequence tags (ESTs). Plant Signal Behav. 8, e26543.
Dong Y., Xie M., Jiang Y., Xiao N., Du X., Zhang W. et al. 2013a Sequencing and automated whole-genome optical mapping of the genome of a domestic goat (Capra hircus). Nat. Biotechnol. 31, 135–141.
Dong F., Ji Z. B., Chen C. X., Wang G. Z. and Wang J. M. 2013b Target gene and function prediction of differentially expressed microRNAs in lactating mammary glands of dairy goats. Int. J. Genomics 2013, 917342.
Engels B. M. and Hutvagner G. 2006 Principles and effects of microRNA-mediated post-transcriptional gene regulation. Oncogene 25, 6163–6169.
Fatima A. and Morris D. G. 2013 MicroRNAs in domestic livestock. Physiol. Genomics 45, 685–696.
Filipowicz W., Bhattacharyya S. N. and Sonenberg N. 2008 Mechanisms of post-transcriptional regulation by microRNAs, are the answers in sight? Nat. Rev. Genet. 9, 102–114.
Foubert E., De Craene B. and Berx G. 2010 Key signalling nodes in mammary gland development and cancer. The Snail1-Twist1 conspiracy in malignant breast cancer progression. Breast Cancer Res. 12, 206.
Frazier T. P. and Zhang B. 2011 Identification of plant microRNAs using expressed sequence tag analysis. Methods Mol. Biol. 678, 13–25.
Friedman R. C., Farh K. K., Burge C. B. and Bartel D. P. 2009 Most mammalian mRNAs are conserved targets of microRNAs. Genome Res. 19, 92–105.
Gigli I. and Maizon D. O. 2013 MicroRNAs and the mammary gland: a new understanding of gene expression. Genet. Mol. Biol. 36, 465–474.
Gomes C. P., Cho J. H., Hood L., Franco O. L., Pereira R. W. and Wang K. 2013 A review of computational tools in microRNA discovery. Front. Genet. 4, 81.
Griffiths-Jones S., Saini H. K., van Dongen S. and Enright A. J. 2008 miRBase, tools for microRNA genomics. Nucleic Acids Res. 36, 154–158.
Huang Y., Zou Q., Wang S. P., Tang S. M., Zhang G. Z. and Shen X. J. 2011 The discovery approaches and detection methods of microRNAs. Mol. Biol. Rep. 38, 4125–4135.
Hwang H. W. and Mendell J. T. 2006 MicroRNAs in cell proliferation, cell death, and tumorigenesis. Br. J. Cancer 94, 776–780.
Jevsinek Skok D., Godnic I., Zorc M., Horvat S., Dovc P., Kovac M. and Kunej T. 2013 Genome-wide in silico screening for microRNA genetic variability in livestock species. Anim. Genet. 44, 669–677.
Ji Z., Wang G., Xie Z., Wang J., Zhang C., Dong F. and Chen C. 2012a Identification of novel and differentially expressed microRNAs of dairy goat mammary gland tissues using Solexa sequencing and bioinformatics. PLoS One 7, e49463.
Ji Z., Wang G., Xie Z., Zhang C. and Wang J. 2012b Identification and characterization of microRNA in the dairy goat (Capra hircus) mammary gland by Solexa deep-sequencing technology. Mol. Biol. Rep. 39, 9361–9371.
Jia Q., Lin K., Liang J., Yu L. and Li F. 2010 Discovering conserved insect microRNAs from expressed sequence tags. J. Insect Physiol. 56, 1763–1789.
Jiang P., Wu H., Wang W., Ma W., Sun X. and Lu Z. 2007 MiPred, classification of real and pseudo microRNA precursors using random forest prediction model with combined features. Nucleic Acids Res. 35, W339–W344.
Jones-Rhoades M. W. 2012 Conservation and divergence in plant microRNAs. Plant Mol. Biol. 80, 3–16.
Kozomara A. and Griffiths-Jones S. 2011 miRBase: integrating microRNA annotation and deep-sequencing data. Nucleic Acids Res. 39, 152–157.
Krüger J. and Rehmsmeier M. 2006 RNAhybrid: microRNA target prediction easy, fast and flexible. Nucleic Acids Res. 34, W451–W454.
Larkin M. A., Blackshields G., Brown N. P., Chenna R., McGettigan P. A., McWilliam H. et al. 2007 Clustal W and Clustal X version 2.0. Bioinformatics 23, 2947–2948.
Lee R. C., Feinbaum R. L. and Ambros V. 1993 The C. elegans heterochronic gene lin-4 encodes small RNAs with antisense complementarity to lin-14. Cell 75, 843–854.
Lewis B. P., Burge C. B. and Bartel D. P. 2005 Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell 120, 15–20.
Li L., Xu J., Yang D., Tan X. and Wang H. 2010 Computational approaches for microRNA studies, a review. Mamm. Genome 21, 1–12.
Li Z., Liu H., Jin X., Lo L. and Liu J. 2012a Expression profiles of microRNAs from lactating and non-lactating bovine mammary glands and identification of miRNA related to lactation. BMC Genomics 13, 731.
Li Z., Lan X., Guo W., Sun J., Huang Y., Wang J., Huang T., Lei C., Fang X. and Chen H. 2012b Comparative transcriptome profiling of dairy goat microRNAs from dry period and peak lactation mammary gland tissues. PLoS One 7, e52388.
Ling Y. H., Ding J. P., Zhang X. D., Wang L. J., Zhang Y. H., Li Y. S., Zhang Z. J. and Zhang X. R. 2013 Characterization of microRNAs from goat (Capra hircus) by Solexa deep-sequencing technology. Genet. Mol. Res. 12, 1951–1961.
Liu N., Okamura K., Tyler D. M., Phillips M. D., Chung W. J. and Lai E. C. 2008 The evolution and functional diversification of animal microRNA genes. Cell Res. 18, 985–996.
Liu H. C., Hicks J. A., Trakooljul N. and Zhao S. H. 2010 Current knowledge of microRNA characterization in agricultural animals. Anim. Genet. 41, 225–231.
Liu Z., Xiao H., Li H., Zhao Y., Lai S., Yu X. et al. 2012 Identification of conserved and novel microRNAs in cashmere goat skin by deep sequencing. PLoS One 7, e50001.
Livak K. J. and Schmittgen T. D. 2001 Analysis of relative gene expression data using real-time quantitative PCR and the 2(−Delta Delta C(T)) Method. Methods 25, 402–408.
Lorenz R., Bernhart S. H., Höner Zu Siederdissen C., Tafer H., Flamm C., Stadler P. F. and Hofacker I. L. 2011 ViennaRNA Package 2.0. Algorithms Mol. Biol. 6, 26.
Mendes N. D., Freitas A. T. and Sagot M. F. 2009 Current tools for the identification of miRNA genes and their targets. Nucleic Acids Res. 37, 2419–2433.
Monavar F. A., Mohammadi S., Frazier T. P., Abbasi A., Abedini R., Karimi Farsad L. et al. 2012 Identification and validation of Asteraceae miRNAs by the expressed sequence tag analysis. Gene 493, 253–259.
Muvva C., Tewari L., Aruna K., Ranjit P., Md Z. S., Md K. A. and Veeramachaneni H. 2012 In silico identification of miRNAs and their targets from the expressed sequence tags of Raphanus sativus. Bioinformation 8, 98–103.
Ng Kwang Loong S. and Mishra S. K. 2007 Unique folding of precursor microRNAs, quantitative evidence and implications for de novo identification. RNA 13, 170–187.
Oulas A., Karathanasis N., Louloupi A., Iliopoulos I., Kalantidis K. and Poirazi P. 2012 A new microRNA target prediction tool identifies a novel interaction of a putative miRNA with CCND2. RNA Biol. 9, 1196–1207.
Panda D., Dehury B., Sahu J., Barooah M., Sen P. and Modi M. K. 2014 Computational identification and characterization of conserved miRNAs and their target genes in garlic (Allium sativum L.) expressed sequence tags. Gene 537, 333–342.
Patanun O., Lertpanyasampatha M., Sojikul P., Viboonjun U. and Narangajavana J. 2013 Computational identification of microRNAs and their targets in cassava (Manihot esculenta Crantz.). Mol. Biotechnol. 53, 257–269.
Peterson S. M., Thompson J. A., Ufkin M. L., Sathyanarayana P., Liaw L. and Congdon C. B. 2014 Common features of microRNA target prediction tools. Front. Genet. 5, 23.
Schmittgen T. D. and Livak K. J. 2008 Analyzing real-time PCR data by the comparative C(T) method. Nat. Protoc. 3, 1101–1108.
Silveri L., Tilly G., Vilotte J. L. and Le Provost F. 2006 MicroRNA involvement in mammary gland development and breast cancer. Reprod. Nutr. Dev. 46, 549–556.
Takada S. and Asahara H. 2012 Current strategies for microRNA research. Mod. Rheumatol. 22, 645–653.
VanGuilder H. D., Vrana K. E. and Freeman W. M. 2008 Twenty-five years of quantitative PCR for gene expression analysis. Biotechniques 44, 619–626.
Varkonyi-Gasic E. and Hellens R. P. 2011 Quantitative stem-loop RT-PCR for detection of microRNAs. Methods Mol. Biol. 744, 145–157.
Vishwakarma N. P. and Jadeja V. J. 2013 Identification of miRNA encoded by Jatropha curcas from EST and GSS. Plant Signal Behav. 8, e23152.
Wang X., Gu Z. and Jiang H. 2013 MicroRNAs in farm animals. Animal 3, 1–9.
Wright J. A., Richer J. K. and Goodall G. J. 2010 microRNAs and EMT in mammary cells and breast cancer. J. Mammary Gland Biol. Neoplasia 15, 213–223.
Wu J., Zhu H., Song W., Li M., Liu C., Li N. et al. 2014 Identification of conservative microRNAs in Saanen dairy goat testis through deep sequencing. Reprod. Domest. Anim. 49, 32–40.
Yu Z. and Pestell R. G. 2012 Small non-coding RNAs govern mammary gland tumorigenesis. J. Mammary Gland Biol. Neoplasia 17, 59–64.
Yuan C., Wang X., Geng R., He X., Qu L. and Chen Y. 2013 Discovery of cashmere goat (Capra hircus) microRNAs in skin and hair follicles by Solexa sequencing. BMC Genomics 14, 511.
Zeder M. A. and Hesse B. 2000 The initial domestication of goats (Capra hircus) in the Zagros mountains 10,000 years ago. Science 287, 2254–2257.
Zhang B. H., Pan X. P., Cox S. B., Cobb G. P. and Anderson T. A. 2006a Computational identification of microRNAs and their targets. Comput. Biol. Chem. 6, 395–407.
Zhang B. H., Pan X. P., Cox S. B., Cobb G. P. and Anderson T. A. 2006b Evidence that miRNAs are different from other RNAs. Cell Mol. Life Sci. 63, 246–254.
Zhang B. H., Pan X. P. and Anderson T. A. 2006c Identification of 188 conserved maize microRNAs and their targets. FEBS Lett. 15, 3753–3762.
Zhang L., Chia J. M., Kumari S., Stein J. C., Liu Z., Narechania A. et al. 2009 A genome-wide characterization of microRNA genes in maize. PLoS Genet. 5, e1000716.
Zhou B. and Liu H. L. 2010 Computational identification of new porcine microRNAs and their targets. Anim. Sci. J. 81, 290–296.
Zhou M., Wang Q., Sun J., Li X., Xu L., Yang H. et al. 2009 In silico detection and characteristics of novel microRNA genes in the Equus caballus. Genomics 94, 125–131.

Received 24 September 2015, in final revised form 5 January 2016; accepted 8 January 2016 Unedited version published online: 12 January 2016 Final version published online: 25 August 2016
