Metagenomic analysis of bacterial community composition among the cave sediments of Indo-Burman biodiversity hotspot region

Surajit De Mandal, Zothansanga and Nachimuthu Senthil Kumar*

Department of Biotechnology, Mizoram University, Aizawl-796004, Mizoram, India.

*Corresponding author:
Email: [email protected]
Mobile: +91-9436352574

PeerJ PrePrints | http://dx.doi.org/10.7287/peerj.preprints.631v1 | CC-BY 4.0 Open Access | rec: 22 Nov 2014, publ: 22 Nov 2014

ABSTRACT

Caves in Mizoram, Northeast India, are potential biodiversity hotspots because of the historical significance of the formation of the Indo-Burman plateau and because their diversity remains unexplored. High-throughput paired-end Illumina sequencing of the V4 region of the 16S rRNA gene was performed to systematically evaluate the bacterial communities of three caves situated in Champhai district of Mizoram, Northeast India. A total of 10,643 operational taxonomic units (OTUs; based on a 97% cutoff), comprising 21 bacterial phyla and 21 candidate phyla, were found at a sequencing depth of 1,140,013 reads. The overall taxonomic profile obtained by BLAST against the RDP classifier and the Greengenes OTU database revealed high diversity within the bacterial communities, dominated by [...], while members of Archaea were less diverse and mainly comprised Euryarchaeota. Analysis revealed that Farpuk (CFP) cave has low diversity and is mainly dominated by Actinobacteria (80% of reads), whereas diverse communities were found in the caves of Murapuk (CMP) and Lamsialpuk (CLP). Analysis of rare and abundant species also revealed that a major portion of the identified OTUs fell within the rare biosphere. Significantly, all these caves recorded a high number of unclassified OTUs, which might represent novel species. Further analysis with whole genome sequencing is needed to validate the novel species and to determine their functional significance.

Subjects Biodiversity, Cave Ecology, Microbiology

Keywords Cave, Indo-Burman plateau, Bacterial diversity, Illumina sequencing

INTRODUCTION

The Indo-Burma region, one of the 25 global biodiversity hotspots, is one of the richest biomes of the world, with high species diversity (Myers et al., 2000). The region is spread over 262,379 sq. km and represents the transition zone between the Indian and Indochinese subregions of the Oriental biogeographic region (Mani, 1974). It contains an estimated 9.7% of the world's known endemic plant species and 8.3% of the endemic vertebrate species (Brook et al., 2003). Interestingly, few reports are available on microbial diversity in the Indo-Burma region, particularly from caves.

Caves represent a subsurface habitat and are less explored in terms of biodiversity and community composition because of environmental and geographical constraints. The lack of photosynthesis and limited nutrient sources make caves an extreme environment in which to sustain life. However, alternative energy in the form of allochthonous organic material, transported from the surface by bats, rodents and human activities or by percolating water, is utilized by certain groups of microorganisms (Barton, 2006). These ecosystems, with their extreme temperature, osmolarity, pressure and pH, force their inhabitants to adopt diverse and novel metabolic pathways for oxidizing reduced metals, fixing gases and utilizing various aromatic compounds. Organic matter supports the formation of secondary microbial communities, usually visible as multicolored (yellow, grey, white or pink) cloddy coatings on carbonate or clay-coated walls in the form of biofilms with unusual coloration, precipitates and corrosion residues (Barton, 2006).

Caves also act as long-term reservoirs for endemic as well as allochthonous microorganisms (Engel et al., 2010). Earlier studies reported diverse groups of microorganisms associated with different geological and environmental factors (Adetutu et al., 2011; 2012), and cave microorganisms have already been implicated in astrobiology, drug discovery and cave conservation studies (Northup et al., 2011; Saiz-Jimenez, 2012). These microbial communities also influence the formation and preservation of cave deposits through constructive and destructive processes. Cave microbes are also important because they act as primary producers that sustain populations of more complex organisms (Barton and Northup, 2007).

The majority of cave microbial diversity studies have used culture-dependent techniques, which can reveal only 1% of the total microorganisms. In recent years, methodologies have been developed to detect environmental microorganisms independently of culture-based screening. Molecular microbial ecology tools such as denaturing gradient gel electrophoresis (DGGE) and clone library analysis are used by many researchers to characterize these uncultured microbes, but these techniques are also insufficient to analyze the entire population in a community (Adetutu et al., 2012). With the advancement of Next Generation Sequencing, cave microbial ecology research has expanded, allowing culture-independent techniques to reveal more of the hidden biodiversity and the key processes happening inside caves.

This study uses high-throughput Illumina sequencing of sediment samples collected from caves situated on the Indo-Burmese border in Champhai district, Mizoram, Northeast India, to contribute to a better understanding of their microbial communities.

MATERIALS AND METHODS

Three caves, namely Murapuk (CMP), Lamsialpuk (CLP) and Farpuk (CFP), were selected for the present study because they are devoid of human influence and have not been studied before. Sediment samples were collected from different locations in the caves; upon collection, the samples were sieved and preserved at 4°C. No specific permit was obtained for the sampling since it did not involve any endangered species or protected area. Sediment samples were analyzed for carbon and nitrogen content with a CHNS/O analyzer (Perkin Elmer, USA), and the pH of the samples was measured with a pH meter (Table 1).

Soil community DNA was extracted from 0.5 g of soil sample using the Fast DNA spin kit (MP Biomedical, Solon, OH, USA) following the manufacturer's protocol. DNA concentration was quantified using a microplate reader (Molecular Devices SpectraMax 2E). The V4 hypervariable region of the 16S rRNA gene was amplified using 2 μL each of 10 pmol/μL forward and reverse primers, 515F (5′-GTGCCAGCMGCCGCGGTAA-3′) and 806R (5′-GGACTACHVGGGTWTCTAAT-3′). The amplification mix contained 5 μL of 40 mM dNTP, 5 μL of 5X Phusion HF reaction buffer, 0.2 μL of 2 U/μL F-540Special Phusion HS DNA Polymerase, 5 ng of input DNA and water to make up the total volume to 25 μL. High-throughput Illumina MiSeq sequencing was performed at Scigenome Labs, Cochin, India (Table 2).
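The 515F and 806R primers contain IUPAC degenerate bases (M, H, V and W), so each primer stands for a small pool of concrete sequences. The sketch below is an illustration only and is not part of the original protocol; the helper name `expand_primer` is hypothetical, and only the IUPAC codes that occur in these two primers are listed.

```python
from itertools import product

# IUPAC nucleotide codes occurring in the two primers (subset of the full table).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC",   # A or C
    "W": "AT",   # A or T
    "H": "ACT",  # any base except G
    "V": "ACG",  # any base except T
}

# Primer sequences as given in the text (5' to 3').
PRIMER_515F = "GTGCCAGCMGCCGCGGTAA"
PRIMER_806R = "GGACTACHVGGGTWTCTAAT"


def expand_primer(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    for name, seq in (("515F", PRIMER_515F), ("806R", PRIMER_806R)):
        variants = expand_primer(seq)
        print(name, len(seq), "nt,", len(variants), "variants")
```

The number of variants is the product of the choices at each degenerate position, which is why the reverse primer, with three degenerate positions, represents a larger pool than the forward primer.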

Sequence quality was assessed from the base quality score distributions, the average base content per read and the GC distribution of the reads. Singletons, i.e. unique sequences that did not cluster with any other sequence, were removed because they may result from sequencing errors and can give rise to spurious OTUs. Chimeras were removed using the UCHIME method, and the pre-processed consensus V4 sequences were clustered into Operational Taxonomic Units (OTUs) based on their sequence similarity using the Uclust program (similarity cutoff = 0.97). All the pre-processed reads were used to identify the OTUs with the QIIME program, and a representative sequence was constructed for each OTU. The representative sequences were then aligned to the Greengenes core set reference database using the PyNAST program (Caporaso et al., 2010; DeSantis et al., 2006) and classified using the RDP classifier and the Greengenes OTU database. Sequences that could not be classified were categorized as unknown.
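To make the idea of clustering at a 97% cutoff and discarding singletons concrete, here is a deliberately simplified sketch. It is not the Uclust algorithm: it assumes equal-length, pre-aligned reads, measures identity as the fraction of matching positions, and assigns each read to the first centroid it matches. All function names are hypothetical.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this toy identity needs equal-length sequences")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def greedy_cluster(reads, cutoff=0.97):
    """Assign each read to the first centroid it matches at >= cutoff.

    Returns a list of (centroid, members) pairs; a read that matches no
    existing centroid becomes the centroid of a new cluster.
    """
    clusters = []
    for read in reads:
        for centroid, members in clusters:
            if identity(read, centroid) >= cutoff:
                members.append(read)
                break
        else:
            clusters.append((read, [read]))
    return clusters


def drop_singletons(clusters):
    """Keep only clusters supported by more than one read."""
    return [(centroid, members) for centroid, members in clusters
            if len(members) > 1]


if __name__ == "__main__":
    base = "A" * 100
    reads = [base, "A" * 98 + "CC", "A" * 90 + "C" * 10, "G" * 100]
    clusters = greedy_cluster(reads)
    print(len(clusters), "clusters before filtering")
    print(len(drop_singletons(clusters)), "clusters after removing singletons")
```

Real tools work on unaligned reads and use heuristics to find candidate centroids quickly, so this sketch only conveys the thresholding and filtering logic.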

QIIME software was used to calculate the Shannon index and Observed species metrics. The Shannon metric is computed from the observed OTU abundances and accounts for both richness and evenness, whereas the Observed species metric counts the unique OTUs present in the sample. Beta diversity among the three bacterial communities (CLP, CMP and CFP) was compared by calculating a distance matrix using the UniFrac approach (Lozupone and Knight, 2005). A weighted UPGMA tree was constructed by performing a jackknife test with 10 replicates, each sub-sample containing 100,000 random reads selected from each sample.
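The two alpha diversity metrics named above can be written in a few lines. The following is a minimal sketch rather than QIIME's own code: it takes a list of per-OTU read counts for one sample, and the logarithm base is left as a parameter (base 2 is assumed here as the default).

```python
import math


def observed_species(counts):
    """Number of OTUs with at least one read in the sample."""
    return sum(1 for c in counts if c > 0)


def shannon_index(counts, base=2.0):
    """Shannon diversity H = -sum(p_i * log(p_i)) over OTUs with reads."""
    total = sum(counts)
    if total == 0:
        return 0.0
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p, base)
    return h


if __name__ == "__main__":
    even = [10, 10, 10, 10]   # four equally abundant OTUs
    skewed = [37, 1, 1, 1]    # same richness, one dominant OTU
    print(observed_species(even), round(shannon_index(even), 3))
    print(observed_species(skewed), round(shannon_index(skewed), 3))
```

Both toy samples have the same Observed species value, but the skewed one has a lower Shannon index, which is how a community dominated by a single group can score as less diverse even when many OTUs are present.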

RESULTS AND DISCUSSION

Because of their difficult geology, caves are among the most remote and inaccessible environments for research, but they are now being considered potential biodiversity hotspots owing to their unique ecological significance. Most of the caves in Mizoram are of tectonic origin, formed by tension cleavage of the compact host rock (Gebauer et al., 2001). Since these caves present extreme conditions, it is assumed that the microorganisms living in them would be mostly novel and undisturbed. Studying this unique habitat provides an opportunity to understand global microbial diversity, novel population assemblages, energy dynamics and metabolism (Ortiz et al., 2013). Previous studies of caves across the world revealed heterotrophic interactions and carbon turnover by Alpha- and Betaproteobacteria, Firmicutes and Actinobacteria (Macalady et al., 2008; Barton, 2014), and the application of next generation sequencing technology to these environments is expected to yield more information about microbial physiology in caves (Tetu et al., 2013; Ortiz et al., 2014).

In the present study, paired-end Illumina sequencing generated 585,434 raw sequences, with 90% of reads having a Phred score greater than 30. The Illumina method is cost effective and allows more detailed taxonomic profiles to be determined across samples (Nelson et al., 2014). After quality checking of the V4 region of the 16S rRNA gene, reads were clustered into OTUs based on their sequence similarity using the Uclust program (97% similarity level). A total of 1,140,013 preprocessed reads were clustered into 10,643 OTUs. Sample libraries ranged from 259,895 (CFP) to 470,260 (CLP) sequence reads (Table 2). Such a large number of sequence reads is common for underground microbial communities compared with surface environments (Moss et al., 2011; Epure et al., 2014).
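For readers less familiar with Phred scores, a score Q corresponds to an estimated base-calling error probability of 10^(-Q/10), so a score of 30 means roughly one error in a thousand bases. The short sketch below illustrates this relationship and how a "fraction above Q30" figure could be computed; the function names and the toy quality values are hypothetical and are not taken from the study's data.

```python
def phred_to_error_prob(q):
    """Estimated probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def fraction_above(scores, threshold=30):
    """Fraction of quality scores strictly greater than the threshold."""
    if not scores:
        raise ValueError("no quality scores given")
    return sum(1 for s in scores if s > threshold) / len(scores)


if __name__ == "__main__":
    for q in (20, 30, 40):
        print("Q%d -> error probability %g" % (q, phred_to_error_prob(q)))
    toy_scores = [40, 30, 20, 35]  # illustrative values only
    print("fraction above Q30:", fraction_above(toy_scores))
```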

The number of OTUs and the Shannon diversity indices are summarized in Table 3. On the basis of the OTUs, CMP has the highest diversity, followed closely by CLP. The Shannon index also shows high diversity in the CMP bacterial community. Rarefaction curves for the Observed species and Shannon metrics are shown in Fig. 1. The Observed species metric is simply the count of unique OTUs identified in the sample. This analysis shows that the sample from CMP is more diverse than the other two samples. Beta diversity represents the explicit comparison of microbial communities based on their composition. Unweighted UniFrac reveals a close relationship between the three communities, with no difference in the distance matrix (Table 4). The consensus UPGMA tree built with the weighted UniFrac approach groups the two communities CLP and CMP together, showing the existence of similar [...], while the bacterial community in CFP is different from the others (Fig. 2). This difference may be due to the more remote location of Farpuk (CFP) compared with the other two caves, which probably allows CFP to retain its unique bacterial community without disturbance. The sequences from the CFP sample had more than 50% singletons in the consensus reads; these are believed to carry no taxonomic information and were therefore removed before further sequence analysis, leading to a lower representation of the diversity of bacterial species.

A total of 21 bacterial phyla and 21 candidate phyla were identified from all the cave sediments, which were mostly dominated by Actinobacteria, Planctomycetes, [...] and Proteobacteria. The relative abundances of the ten most dominant phyla are shown in Fig. 3 and Fig. 4. Previous studies recorded diverse groups of Actinobacteria in caves and their role in colored crystal formation on cave walls, and therefore also in constructive biomineralization processes (Barton et al., 2001). Our study also detected Actinobacteria as the most dominant phylum (35.97% of total sequences), with the majority of them (244 OTUs) falling under the order Actinomycetales, followed by [...] and Acidimicrobiales. Other orders identified include [...], Rubrobacteraceae and MMB-A2-108. In our study, twelve Actinobacteria were identified up to species level, including Streptomyces radiopugnans, Virgisporangium ochraceum, Actinomadura vinacea, Streptomyces lanatus, Rhodococcus fascians, Saccharopolyspora hirsuta, S. mirabilis and Mycobacterium celatum. In all the cave samples, denovo3283 was the most dominant phylotype, and the BLAST result shows 100% similarity with Mycobacterium. Another dominant phylotype was denovo5355, which is closely related to Arthrobacter, a member of the GC-rich actinomycetes capable of utilizing a wide and diverse range of organic substances as carbon and energy sources, such as nicotine, nucleic acids and various herbicides and pesticides.

The phylum Chloroflexi had the second largest number of sequences (13.96%), with 1999 OTUs dominated by the class Ktedonobacteria. Genera identified within this phylum include Chloroflexus, FFCH10602, Caldilinea, Ardenticatena, Chloronema and Oscillochloris. The most dominant OTUs in this phylum were denovo1827 and denovo9830, which were classified under the orders Thermogemmatisporales and TK10, respectively. Members of this phylum are commonly found in caves as well as in other environments, and include anaerobic thermophiles, filamentous anoxygenic phototrophs and anaerobic organohalide respirers.

On average, 13.76% of all sequences belonged to Planctomycetes, a distinct phylum of the domain Bacteria characterized by intracellular compartmentalization and a lack of peptidoglycan in the cell wall. Most of the dominant OTUs within this group were classified under the orders WD2101 and Gemmatales. This phylum is among the most abundant in the CMP (22.82% of all reads) and CLP (18.43% of all reads) samples, whereas in CFP it accounts for only 0.03%. Although this phylum is a common member of cave bacterial communities, its role in caves is not clear because of the limited number of cultured representatives. A few studies have shown its involvement in the metabolism of sulfated polysaccharides as well as in the oxidation of ammonia (Schmid et al., 2000; Jetten et al., 2003).

Proteobacteria were found to be diverse in all three bacterial communities. A total of 46154 sequences in 497 OTUs were found under the subphylum Alphaproteobacteria. The major OTUs in this subphylum were classified under the order Rhizobiales. The dominant genera were Kaistobacter (Sphingomonadaceae), Bradyrhizobium (Bradyrhizobiaceae) and Rhodoplanes (Hyphomicrobiaceae). Betaproteobacteria were less diverse: only 19 OTUs (2792 sequences) were detected among all the samples. The most dominant OTUs within Betaproteobacteria were denovo360, denovo2244 and denovo8071, all classified under the genus Burkholderia. Forty-nine OTUs from 3534 sequences were grouped under Gammaproteobacteria, dominated by the genus Dyella. A further 323 sequences clustering into 37 OTUs were classified under the subphylum Deltaproteobacteria, all of which were present in low numbers.

The phylum Acidobacteria was moderately abundant among the cave samples, representing 11.44% of the total sequences obtained. This phylum included the families Solibacteraceae, Koribacteraceae and Acidobacteriaceae. The most dominant OTUs in this phylum were denovo7994 and denovo9544, belonging to the classes Chloracidobacteria and Acidobacteria-6, respectively. Two OTUs within this phylum, denovo6901 and denovo5227, show close sequence similarity with Candidatus Solibacter usitatus Ellin6076, which is adapted to survive under low-nutrient conditions (Ward et al., 2009).

A total of 33411 reads (CLP = 8, CFP = 11583 and CMP = 21820) comprising 361 OTUs were classified within the phylum Armatimonadetes (formerly known as 'candidate division OP10'), a dominant and globally distributed lineage within the 'uncultured majority'. All the OTUs were classified under the genus Fimbriimonas except denovo1709, which was classified under the genus Chthonomonas. Only one OTU (denovo4733), found in CMP, was classified under the genus Gemmatimonas. Seven OTUs containing 89 reads (CFP = 53, CLP = 2 and CMP = 34) were affiliated with the phylum Nitrospirae, with only two genera detected: JG37-AG-70 and Nitrospira. Members of this group are obligate chemolithoautotrophs that obtain energy by the oxidation of nitrite, and they were detected previously in Mexican anchialine caves and the Tito Bustillo cave (Pohlman et al., 1997; Schabereiter-Gurtner et al., 2002). Our analysis reveals 45 OTUs under the phylum Bacteroidetes. Taxa classified to the species level were Cytophaga xylanolytica, Flavobacterium succinicans, Bacteroides plebeius, Sphingobacterium multivorum and Fontibacter flavus. The most dominant OTUs were classified under Flavisolibacter (denovo1336 and denovo3516) and Adhaeribacter (denovo8478).

There were 334 sequences comprising 19 OTUs classified under the phylum Euryarchaeota, divided into four classes: Methanomicrobia, Thermoplasmata, Halobacteria and Methanobacteria. Genera identified in this phylum include Methanocella, Methanocorpusculum, Halolamina, Methanoculleus, Methanosarcina, Haloquadratum, Methanobacterium, Methanosaeta and Methanoplanus. However, none of the Euryarchaeota OTUs were classified up to the species level in any of the three cave communities, and all were present in low numbers. The phylum Crenarchaeota was also present in very low quantity and was clustered into 21 OTUs (1698 reads in total). All the Crenarchaeota were assigned to two classes, MBGA and Thaumarchaeota.

Our analysis identified twenty-one candidate phyla (candidate bacterial lineages), mostly falling within the rare biosphere. The most dominant OTU among the candidate phyla was denovo1407 (3094 reads), classified under the phylum AD3 and having close sequence similarity with the environmental clone LuqGS470001 (Minyard et al., 2012). This clone was originally recovered from deep saprolite and saprock, where it is believed to play a role in mineral weathering in deep tropical saprolite (Minyard et al., 2012); it was found to be a common inhabitant of all the analyzed caves. Other candidate phyla identified in our study include LD1, MPV, NKB, OD, OD1, OD3, TM6, TM7, WS1, WS2, WS3, WWE1, ZB3, BH1, BRC1, FCPU, GAL, GN and Kazan3b. The top ten bacterial genera based on OTU number and the top ten OTUs based on total read count among the cave samples are presented in Supplementary Tables S1 and S2, respectively. The relative abundance of bacterial diversity from phylum to species level is shown in Supplementary Figs. S1 to S3.

Illumina sequencing revealed a large number of phylotypes in these cave samples belonging to the rare biosphere, i.e. microorganisms with extremely low abundance (Reid and Buckley, 2011). Following a previous study (Aravindraja et al., 2013), we defined rare species as those representing <0.01% of the total community and abundant species as all others. According to this classification, rare species made up 75.72-83.01% of the samples, whereas abundant species made up 16.98-24.27% (Fig. 5). The ratio of rare to abundant OTUs was similar in all three samples, within a range of 3.11-4.88. The most abundant phylotype was denovo6722, classified under Actinobacteria and present in all three cave samples. Fig. 6 shows the unique and shared species within the rare biosphere of the three samples. The Venn diagram shows that only 171 rare OTUs (1.78%) were shared among the three communities; the majority of the rare species in CFP were unique, whereas many common species were observed between CFP and CMP. Among abundant species, 3.94% were shared by all three samples. Many OTUs were rare in one community but abundant in another. This suggests that different environmental factors prevail among the caves, causing some groups to become dormant and join the rare biosphere. These members can become active and abundant under favorable environmental conditions. The species commonly identified among the cave samples were members of the phyla Actinobacteria, Firmicutes and Proteobacteria (Table 5). Further analysis with whole genome sequencing will reveal the actual role of the rare and abundant phyla present in the cave samples.
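The rare/abundant split described above is a simple threshold on relative abundance. As a minimal sketch, assuming an OTU table held as a dictionary of read counts for one sample (the OTU names and counts below are invented for illustration), the classification and the rare-to-abundant ratio could be computed like this:

```python
def split_rare_abundant(otu_counts, rare_cutoff=0.0001):
    """Split OTUs into rare (< 0.01% of all reads) and abundant (the rest)."""
    total = sum(otu_counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    rare, abundant = [], []
    for otu, count in otu_counts.items():
        if count / total < rare_cutoff:
            rare.append(otu)
        else:
            abundant.append(otu)
    return rare, abundant


def rare_to_abundant_ratio(otu_counts, rare_cutoff=0.0001):
    """Number of rare OTUs divided by the number of abundant OTUs."""
    rare, abundant = split_rare_abundant(otu_counts, rare_cutoff)
    if not abundant:
        raise ValueError("no abundant OTUs; ratio is undefined")
    return len(rare) / len(abundant)


if __name__ == "__main__":
    # Hypothetical counts for one sample, not data from this study.
    counts = {"denovoA": 99990, "denovoB": 5, "denovoC": 5}
    rare, abundant = split_rare_abundant(counts)
    print("rare:", rare)
    print("abundant:", abundant)
    print("ratio:", rare_to_abundant_ratio(counts))
```

Note that the ratio counts OTUs, not reads: a sample can consist mostly of reads from a handful of abundant OTUs while most of its OTUs are rare, which is the pattern reported for these caves.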

This study provides an in-depth analysis of the unexplored bacterial diversity in cave samples from Mizoram, with a large number of classified phyla (twenty), candidate phyla (twenty-one) and a large portion of unclassified bacteria, indicating the possible presence of novel species. The proportion of classified reads decreased progressively from phylum to species level. The two most dominant phylotypes were denovo6722 (11.72%) and denovo4035 (5.72%), belonging to Actinobacteria and [...], respectively. The remaining phylotypes present in the communities had low (<4%) abundance. The present study revealed a unique bacterial community in Farpuk, which was mostly classified under uncultured Actinobacteria. These uncultured species could be a major source of new antibiotics. This analysis also revealed that bacterial diversity is higher in the CMP and CLP samples than in the CFP sample. This might be because CFP is situated in an extremely remote place and its diversity is not influenced by exogenous sources to the same extent as the other caves, or because each cave environment has a different nutrient composition or ecological condition favoring specific bacterial taxa.

ACKNOWLEDGEMENTS

This research was funded by a grant from the State Biotech Hub sponsored by the Department of Biotechnology, Govt. of India, New Delhi. We would like to thank Mr. Lalrinhlua for his help in sampling.

Competing Interests

The authors declare there are no competing interests.

Author Contributions

• Surajit De Mandal, Zothansanga and Nachimuthu Senthil Kumar conceived and designed the experiments, analyzed the data, wrote the paper, prepared Figures and Tables. Surajit De Mandal performed the experiments.

DNA Deposition

The following information was supplied and is under process regarding the deposition of DNA sequences: EBI Sequence Read Archive, Project Number PRJEB7730 and ERP008676.

Supplemental Information

Supplemental information for this article is attached. The file contains 2 Tables and 3 Figures.

REFERENCES

Adetutu EM, Thorpe K, Bourne S, Cao X, Shahsavari E, Kirby G, Ball AS. 2011. Phylogenetic diversity of fungal communities in areas accessible and not accessible to tourists in Naracoorte Caves. Mycologia 103:959-968 http://dx.doi.org/10.3852/10-256.

Adetutu EM, Thorpe K, Shahsavari E, Bourne S, Cao X, Fard RMN, Kirby G, Ball AS. 2012. Bacterial community survey of sediments at Naracoorte Caves, Australia. International Journal of Speleology 41(2):137-147.

Aravindraja C, Viszwapriya D, Karutha PS. 2013. Ultradeep 16S rRNA Sequencing Analysis of Geographically Similar but Diverse Unexplored Marine Samples Reveal Varied Bacterial Community Composition. PLoS ONE 8(10):e76724 doi:10.1371/journal.pone.0076724.

Barton HA, Northup DE. 2007. Geomicrobiology in cave environments: past, current and future perspectives. Journal of Cave and Karst Studies 69:163-178.

Barton HA, Spear JR, Pace NR. 2001. Microbial life in the underworld: biogenicity in secondary mineral formations. Geomicrobiology Journal 18:359-368.

Barton HA. 2006. Introduction to cave microbiology: a review for the non-specialist. Journal of Cave and Karst Studies 68:43-54.

Barton HA. 2014. "Starving Artists: Bacterial Oligotrophic Heterotrophy in Caves." in Life in Extreme Environments: Microbial Life of Cave Systems. ed. D. Wagner. (Berlin, Germany: De Gruyter).

Brook BW, Sodhi NS, Ng PKL. 2003. Catastrophic extinctions follow deforestation in Singapore. Nature 424:420-424.

Caporaso JG, Bittinger K, Bushman FD, DeSantis TZ, Andersen GL, Knight R. 2010a. PyNAST: a flexible tool for aligning sequences to a template alignment. Bioinformatics 26:266-267.

Caporaso JG, Kuczynski J, Stombaugh J, Bittinger K, Bushman FD, Costello EK, Fierer N, Peña AG, Goodrich JK, Gordon JI, Huttley GA, Kelley ST, Knights D, Koenig JE, Ley RE, Lozupone CA, McDonald D, Muegge BD, Pirrung M, Reeder J, Sevinsky JR, Turnbaugh PJ, Walters WA, Widmann J, Yatsunenko T, Zaneveld J, Knight R. 2010b. QIIME allows analysis of high-throughput community sequencing data. Nature Methods 7:335-336 DOI 10.1038/nmeth.f.303.

DeSantis TZ, Hugenholtz P, Larsen N, Rojas M, Brodie EL, Keller K, Huber T, Dalevi D, Hu P, Andersen GL. 2006. Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB. Applied and Environmental Microbiology 72:5069-5072.

Engel AS, Meisinger DB, Porter ML, Payn R, Schmid M, Stern LA, Schleifer KH, Lee NM. 2010. Linking phylogenetic and functional diversity to nutrient spiraling in microbial mats from Lower Kane Cave (USA). The ISME Journal 4:98-110.

Epure J, Meleg JN, Munteanu CM, Roban RD, Moldovan OT. 2014. Bacterial and Fungal Diversity of Quaternary Cave Sediment Deposits. Geomicrobiology Journal 31(2):116-127 DOI 10.1080/01490451.2013.815292.

Gebauer HD, Chhakchhuak B, Sootinck N. 2001. Caves of Mizoram (speleological projects in NE-India) 5:61.

Jetten MS, Sliekers O, Kuypers M, Dalsgaard T, van Niftrik L, Cirpus I, van de Pas-Schoonen K, Lavik G, Thamdrup B, Le Paslier D, Op den Camp HJ, Hulth S, Nielsen LP, Abma W, Third K, Engström P, Kuenen JG, Jørgensen BB, Canfield DE, Sinninghe Damsté JS, Revsbech NP, Fuerst J, Weissenbach J, Wagner M, Schmidt I, Schmid M, Strous M. 2003. Anaerobic ammonium oxidation by marine and freshwater planctomycete-like bacteria. Applied Microbiology and Biotechnology 63:107-114.

Lozupone C, Knight R. 2005. UniFrac: a new phylogenetic method for comparing microbial communities. Applied and Environmental Microbiology 71:8228-8235 DOI 10.1128/AEM.71.12.8228-8235.2005.

Macalady JL, Dattagupta S, Schaperdoth I, Jones DS, Druschel GK, Eastman D. 2008. Niche differentiation among sulfur-oxidizing bacterial populations in cave waters. The ISME Journal 2:590-601.

Mani MS. 1974. Ecology and Biogeography in India, Vol. 1. Dr. W. Junk b.v. Publishers, The Hague.

Minyard ML, Bruns MA, Liermann LJ, Buss HL, Brantley SL. 2012. Bacterial Associations with Weathering Minerals at the Regolith-Bedrock Interface, Luquillo Experimental Forest, Puerto Rico. Geomicrobiology Journal 29:792-803.

Moss JA, Nocker A, Snyder RA. 2011. Microbial characteristics of a submerged karst cave system in Northern Florida. Geomicrobiology Journal 28:719-931.

Myers N, Mittermeier RA, Mittermeier CG, da Fonseca GAB, Kent J. 2000. Biodiversity hotspots for conservation priorities. Nature 403:853-858.

Nelson MC, Morrison HG, Benjamino J, Grim SL, Graf J. 2014. Analysis, Optimization and Verification of Illumina-Generated 16S rRNA Gene Amplicon Surveys. PLoS ONE 9(4): e94249.

Northup DE, Melim LA, Spilde MN, Hathaway JM, Garcia MG, Moya M, Stone FD, Boston PJ, Dapkevicius MLNE, Riquelme C. 2011. Lava cave microbial communities within mats and secondary mineral deposits: Implications for life detection on other planets. Astrobiology 11: 601-618.

Ortiz M, Legatzki A, Neilson JW, Fryslie B, Nelson WM, Wing RA, Soderlund CA, Pryor BM, Maier RM. 2014. Making a living while starving in the dark: metagenomic insights into the energy dynamics of a carbonate cave. The ISME Journal 8: 478-491.

Ortiz M, Neilson JW, Nelson WM, Legatzki A, Byrne A, Yu Y, Wing RA, Soderlund CA, Pryor BM, Pierson LS, Maier RM. 2013. Profiling Bacterial Diversity and Taxonomic Composition on Speleothem Surfaces in Kartchner Caverns, AZ. Microbial Ecology 65: 371-383.

Pohlman JW, Iliffe TM, Cifuentes LA. 1997. A stable isotope study of organic cycling and the ecology of an anchialine cave ecosystem. Marine Ecology Progress Series 155: 17-27.

Reid A, Buckley M. 2011. The Rare Biosphere: A report from the American Academy of Microbiology. Washington, DC: American Academy of Microbiology.

Saiz-Jimenez C. 2012. Microbiological and environmental issues in show caves. World Journal of Microbiology and Biotechnology 28: 2453-2464.

Schabereiter-Gurtner C, Saiz-Jimenez C, Piñar G, Lubitz W, Rölleke S. 2002a. Altamira cave Paleolithic paintings harbour partly unknown bacterial communities. FEMS Microbiology Letters 211: 7-11.

Schmid M, Twachtmann U, Klein M, Strous M, Juretschko S, Jetten M, Metzger JW, Schleifer KH, Wagner M. 2000. Molecular evidence for genus level diversity of bacteria capable of catalyzing anaerobic ammonium oxidation. Systematic and Applied Microbiology 23: 93-106.

Tetu SG, Breakwell K, Elbourne LD, Holmes AJ, Gillings MR, Paulsen IT. 2013. Life in the dark: metagenomic evidence that a microbial slime community is driven by inorganic nitrogen metabolism. The ISME Journal 7: 1227-1236.

Ward NL, Challacombe JF, Janssen PH, Henrissat B, Coutinho PM, Wu M, Xie G, Haft DH, Sait M, Badger J, Barabote RD, Bradley B, Brettin TS, Brinkac LM, Bruce D, Creasy T, Daugherty SC, Davidsen TM, Deboy RT, Detter JC, Dodson RJ, Durkin AS, Ganapathy A, Gwinn-Giglio M, Han CS, Khouri H, Kiss H, Kothari SP, Madupu R, Nelson K, Nelson WC, Paulsen I, Penn K, Ren Q, Rosovitz MJ, Selengut JD, Shrivastava S, Sullivan SA, Tapia R, Thompson LS, Watkins KL, Yang Q, Yu C, Zafar N, Zhou L, Kuske CR. 2009. Three genomes from the phylum Acidobacteria provide insight into their lifestyles in soils. Applied and Environmental Microbiology 75: 2046-2056.


Table 1 Details of the cave samples used in the present study. All samples were collected in Champhai, Mizoram, Northeast India (2014).

Cave (sample name)   Latitude      Longitude     Elevation (MSL)   Humidity (%)   Temperature (°C)   pH    C (%)    N (%)   H (%)
Murapuk (CMP)        N23°44.295'   E92°39'770'   4927              44             22                 7.2   110.50   13.46   30.90
Lamsialpuk (CLP)     N23°08.019'   E93°16'896'   4446              40             24                 7.2   126.79   13.96   9.96
Farpuk (CFP)         N23°06.055'   E92°17'911'   4645              44             23                 6.8   40.58    3.19    9.07


Table 2 Summary statistics of Illumina paired-end reads (V4 region of 16S rRNA gene) used in this study.

Processing stage (bp)              CLP       CMP       CFP
Total reads                        674,406   635,210   690,975
Passed conserved region filter     617,278   583,165   621,772
Passed spacer filter               616,828   582,750   620,134
Passed read quality filter         616,738   582,628   619,973
Passed mismatch filter             568,149   538,641   560,239
Consensus reads                    568,149   538,641   538,641
Reads after singleton removal      470,943   406,640   262,430
Chimeric sequences                 683       1,252     2,535
Pre-processed reads                470,260   405,388   259,895
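The filtering cascade in Table 2 can be read as simple arithmetic. The sketch below is only an illustration, not part of the original pipeline: it copies four of the Table 2 values per sample, computes the fraction of raw reads that survive pre-processing, and checks that the reads after singleton removal minus the chimeric sequences equal the pre-processed reads. The helper name `summarize` is hypothetical.

```python
# Values copied from Table 2:
# (total reads, reads after singleton removal, chimeric sequences, pre-processed reads)
table2 = {
    "CLP": (674_406, 470_943, 683, 470_260),
    "CMP": (635_210, 406_640, 1_252, 405_388),
    "CFP": (690_975, 262_430, 2_535, 259_895),
}


def summarize(counts):
    """Return (fraction of raw reads retained, True if the chimera step adds up)."""
    total, after_singletons, chimeric, preprocessed = counts
    retained = preprocessed / total
    consistent = (after_singletons - chimeric) == preprocessed
    return retained, consistent


summary = {sample: summarize(counts) for sample, counts in table2.items()}

for sample, (retained, consistent) in summary.items():
    print(f"{sample}: {retained:.1%} of raw reads retained; "
          f"chimera step consistent: {consistent}")
```

Under this reading, CFP retains a noticeably smaller share of its raw reads than CLP or CMP, mostly because of the singleton-removal step.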


Table 3 Summary of Illumina operational taxonomic units (OTUs) and alpha diversity estimates using the QIIME tool.

Sample name   Total OTU   Shannon index
CMP           6555        9.30
CLP           4968        9.05
CFP           3108        4.75
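For readers unfamiliar with the Shannon index in Table 3, the following is a minimal sketch of how it is computed from OTU counts. It assumes a base-2 logarithm; this is an assumption, although it is at least consistent with the table, since 9.30 for 6555 OTUs exceeds the natural-log maximum ln(6555) ≈ 8.79. The toy communities and the function names `shannon_index` and `max_shannon` are hypothetical and are not taken from the study.

```python
import math


def shannon_index(counts, base=2):
    """Shannon diversity H = -sum(p_i * log(p_i)) over OTUs with non-zero counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("counts must contain at least one read")
    return -sum((c / total) * math.log(c / total, base) for c in counts if c > 0)


def max_shannon(n_otus, base=2):
    """Upper bound on H for n_otus OTUs, reached when all OTUs are equally abundant."""
    return math.log(n_otus, base)


# Hypothetical toy communities (not data from this study)
even = [25, 25, 25, 25]    # four equally abundant OTUs
skewed = [97, 1, 1, 1]     # one dominant OTU
print(shannon_index(even), shannon_index(skewed))

# OTU totals and Shannon values from Table 3, expressed relative to the log2 upper bound
table3 = {"CMP": (6555, 9.30), "CLP": (4968, 9.05), "CFP": (3108, 4.75)}
relative = {s: h / max_shannon(n) for s, (n, h) in table3.items()}
for sample, ratio in relative.items():
    print(sample, round(ratio, 2))
```

A dominated community scores far below its upper bound, which is one way to read the low CFP value alongside the Actinobacteria dominance reported for that cave.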


Table 4 Unweighted UniFrac distance matrix among the cave samples.

Sample name   CLP        CMP        CFP
CLP           0          0.761516   0.550632
CMP           0.761516   0          0.779452
CFP           0.550632   0.779452   0
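As a reading aid for Table 4, the sketch below checks that the matrix is a valid distance matrix (zero diagonal, symmetric) and ranks the sample pairs from most to least similar. It is only an illustration on the published values; the helper names are hypothetical, and no tree-building method is implied.

```python
from itertools import combinations

samples = ["CLP", "CMP", "CFP"]

# Unweighted UniFrac distances copied from Table 4 (rows and columns follow `samples`)
dist = [
    [0.0,      0.761516, 0.550632],
    [0.761516, 0.0,      0.779452],
    [0.550632, 0.779452, 0.0],
]


def is_valid_distance_matrix(m):
    """True if the matrix has a zero diagonal and is symmetric."""
    n = len(m)
    zero_diag = all(m[i][i] == 0 for i in range(n))
    symmetric = all(m[i][j] == m[j][i] for i in range(n) for j in range(n))
    return zero_diag and symmetric


def ranked_pairs(names, m):
    """Return (distance, sample_a, sample_b) tuples sorted from closest to farthest."""
    pairs = [(m[i][j], names[i], names[j])
             for i, j in combinations(range(len(names)), 2)]
    return sorted(pairs)


pairs = ranked_pairs(samples, dist)
for d, a, b in pairs:
    print(f"{a}-{b}: {d}")
```

On these values CLP and CFP form the closest pair, while CMP is the most distant from both. Unweighted UniFrac uses presence or absence of lineages only, so this ranking does not reflect differences in abundance.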


Table 5 Shared and unique taxa (identified up to species level) in the cave samples. Species marked with * are present in all the samples.

Species in CFP                    Species in CLP                    Species in CMP
Actinomadura vinacea*             Actinomadura vinacea*             Actinomadura vinacea*
Clostridium bowmanii*             Bacillus badius                   Afipia felis
Clostridium butyricum*            Bacteroides plebeius              Bacillus badius
Glaciecola polaris                Brevundimonas diminuta            Burkholderia tuberum
Mycobacterium celatum             Burkholderia tuberum              Clostridium acetobutylicum
Peredibacter starrii*             Caenispirillum salinarum          Clostridium bifermentans
Rhodococcus fascians              Clostridium bifermentans          Clostridium bowmanii*
Saccharopolyspora hirsuta*        Clostridium bowmanii*             Clostridium butyricum*
Sphingomonas azotifigens*         Clostridium butyricum*            Coccomyxa subellipsoidea
Stenotrophomonas acidaminiphila   Clostridium perfringens           Corallococcus exiguus
Streptomyces mirabilis*           Clostridium tetani                Desulfosporosinus meridiei
Thermomonas fusca                 Clostridium venationis            Escherichia coli
Virgisporangium ochraceum*        Coccomyxa subellipsoidea          Flavobacterium succinicans*
                                  Cytophaga xylanolytica            Glaciecola polaris
                                  Escherichia coli                  Leptolyngbya frigida
                                  Flavobacterium succinicans*       Luteibacter rhizovicinus
                                  Fontibacter flavus                Nevskia ramosa
                                  Inquilinus limosus                Paenibacillus chondroitinus
                                  Kosmotoga mrcj                    Paenibacillus ginsengarvi
                                  Luteibacter rhizovicinus          Peredibacter starrii*
                                  Megamonas hypermegale             Propionispira arboris
                                  Methylobacterium organophilum     Pseudomonas viridiflava
                                  Mycobacterium celatum             Rhodococcus fascians
                                  Paenibacillus chondroitinus       Roseomonas mucosa
                                  Paenibacillus curdlanolyticus     Saccharopolyspora hirsuta*
                                  Peredibacter starrii*             Shewanella algae
                                  Pseudomonas viridiflava           Sphingobacterium multivorum
                                  Saccharopolyspora hirsuta*        Sphingomonas azotifigens*
                                  Shewanella algae                  Streptomyces lanatus
                                  Singulisphaera rosea              Streptomyces mirabilis*
                                  Sphingobacterium multivorum       Syntrichia ruralis
                                  Sphingomonas azotifigens*         Virgisporangium ochraceum*
                                  Sphingomonas wittichii
                                  Streptomyces mirabilis*
                                  Streptomyces radiopugnans
                                  Veillonella dispar
                                  Virgisporangium ochraceum*
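The shared and unique entries in Table 5 correspond to set intersections and differences. The sketch below illustrates this on a small subset of the species names from the table (four per cave, chosen for illustration only), so its output does not reproduce the full table. The variable names are hypothetical.

```python
# Illustrative subset of the species lists in Table 5 (not the complete lists)
species = {
    "CFP": {"Actinomadura vinacea", "Glaciecola polaris",
            "Mycobacterium celatum", "Thermomonas fusca"},
    "CLP": {"Actinomadura vinacea", "Bacillus badius",
            "Mycobacterium celatum", "Veillonella dispar"},
    "CMP": {"Actinomadura vinacea", "Afipia felis",
            "Bacillus badius", "Glaciecola polaris"},
}

# Species present in every cave: intersection of all sets
shared = set.intersection(*species.values())

# Species found in exactly one cave: remove everything seen in the other caves
unique = {}
for cave, names in species.items():
    others = set().union(*(s for c, s in species.items() if c != cave))
    unique[cave] = names - others

print("Shared by all caves:", sorted(shared))
for cave, names in unique.items():
    print(f"Unique to {cave}:", sorted(names))
```

The same operations applied to the full species lists would give the counts that a three-set Venn diagram displays.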


Figure 1 Rarefaction analysis of alpha diversity among CMP, CLP and CFP samples. The diversity metrics used were a) observed number of species and b) Shannon diversity index.


Figure 2 Phylogenetic tree based on the distances between samples CLP, CMP and CFP with the weighted UniFrac approach.


Figure 3 Taxonomic classification of reads at phylum level for the cave samples. Only the top 10 enriched categories are shown in the figure. Classification was performed using the RDP classifier and the Greengenes OTU database.


Figure 4 Taxonomic classification of OTUs at phylum level for the cave samples. Only the top 10 enriched categories are shown in the figure. Classification was performed using the RDP classifier and the Greengenes OTU database.


Figure 5 Percentage of abundant and rare OTUs among the cave samples.


Figure 6 Venn diagram showing the unique and shared species among the rare and abundant OTUs of the cave samples.
