Functional role of β domain in the Thermoanaerobacter tengcongensis glucoamylase

Zilong Li, Pingying Wei, Hairong Cheng, Peng He, Qinhong Wang, Ning Jiang*

Z. Li · P. Wei · P. He · N. Jiang (*)
Department of Industrial Microbiology and Biotechnology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, 100101, P.R. China

Z. Li · P. Wei
Graduate University, Chinese Academy of Sciences, Beijing, 100049, P.R. China

H. Cheng
The State Key Laboratory of Microbial Metabolism & School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, P.R. China

Q. Wang
Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, 300386, China

E-mail: [email protected]

Figure legends

Fig. S1 Amino acid sequence alignment with other bacterial BDs, performed using MEGA 5.10. (A) Multiple sequence alignment of nine N-terminal BDs from bacterial glucoamylases. Abbreviations of the sources of individual BD genes are as follows: Tte, T. tengcongensis MB4 (gi 20808230); Tsi, T. siderophilus SR4 (gi 392939254); Cth, C. thermoamylolyticum (gi 34146785); Tth, T. thermosaccharolyticum (gi 304316266); Mau, Mahella australiensis 50-1 BON (gi 332981855); Afe, A. ferrooxidans ATCC 23270 (gi 218665170); Cal, C. algicola DSM 14237 (gi 159186340); Rca, R. castenholzii DSM 13941 (gi 156742555); and Atu, A. tumefaciens str. C58 (gi 159186340). Residues that belong to the consensus sequences are highlighted in black and indicated as “consensus.” Secondary structural assignments of the modeled TteGA BD are indicated by helices (α-helices) and arrows (β-strands). (B) Rooted phylogenetic tree of the nine bacterial GA BDs. The tree is based on the multiple sequence alignment of the same nine BDs. The branch lengths are proportional to the sequence divergence.

Fig. S2 Structural alignments with bacterial and fungal GAs, performed using Chimera 1.6. All modeled structures were based on the structure of TthGA (PDB code 1LF6). (A) Alignment of the modeled structures of nine prokaryotic GAs. Each GA was labeled in the color of its abbreviated name. The catalytic general acid and base were highlighted as spheres (Glu in white). (B) Alignment with HjGA, the only fungal GA whose overall 3D structure has been solved. The general acid and base were shown as spheres. A significant interdomain interaction region was highlighted by a cyan dashed rectangle and magnified in (C). (C) Interaction between region D46T47W48 (blue) of the BD and αL1. Hydrogen bonds (yellow lines) and van der Waals forces (yellow dashed lines and letters) were indicated. Residues involved were shown as sticks and labeled with red abbreviations.

Fig. S3 Schematic representation of the truncation and mutagenesis sites. The start and end of each construct were labeled with their residue numbers in the complete amino acid sequence. The arrows indicated the β strands in the β domain. D46T47W48 and I339 were the interaction sites to be mutated.

Fig. S1

Fig. S2

Fig. S3

Table S1. Bacterial strains, plasmids, and primers used in this study

| Strain, plasmid, or primer | Relevant characteristic/sequence (restriction enzymes) | Source/reference/note |
| --- | --- | --- |
| Strain | | |
| E. coli DH5α | F– φ80 lacZ∆M15 ∆(lacZYA-argF)U169 recA1 endA1 hsdR17(rk–, mk+) phoA supE44 thi-1 gyrA96 relA1 λ– | TransGen (Cat. No. CD201) |
| E. coli Rosetta (DE3) | F– ompT hsdSB (rB– mB–) gal dcm (DE3) pRARE2 (CamR) | Novagen (Cat. no. 70987-3) |
| T. tengcongensis MB4 | CCCCM AS 1.2430T = DSM 15242, the source of glucoamylase gene | (Xue et al. 2001) |
| Plasmid | | |
| pET21a | Expression vector with C-terminal hexahistidine affinity tag | Novagen |
| pET21a-TteGA | pET21a derivative. Expression vector of TteGA, deletion of residues 1-21 (signal peptide) | This study |
| pET21a-TS1 | pET21a derivative. Expression vector of TS1, deletion of residues 1-31 (included β strand 1) | This study |
| pET21a-TS2 | pET21a derivative. Expression vector of TS2, deletion of residues 1-57 (included β strand 1-2) | This study |
| pET21a-TS3 | pET21a derivative. Expression vector of TS3, deletion of residues 1-71 (included β strand 1-3) | This study |
| pET21a-TS16 | pET21a derivative. Expression vector of TS16, deletion of residues 1-267 (included β strand 1-16) | This study |
| pET21a-T47A | pET21a derivative. Expression vector of TteGA mutant: T47A | This study |
| pET21a-W48A | pET21a derivative. Expression vector of TteGA mutant: W48A | This study |
| pET21a-T47W48A2 | pET21a derivative. Expression vector of TteGA mutant: T47A and W48A | This study |
| pET21a-I339A | pET21a derivative. Expression vector of TteGA mutant: I339A | This study |
| pET21a-CD | pET21a derivative. Expression vector of CD (without hexahistidine affinity tag), residues 303-703 | This study |
| pET28a | Expression vector with N-terminal hexahistidine affinity tag | Novagen |
| pET28a-BD | pET28a derivative. Expression vector of BD, residues 22-302 | This study |
| pET28a-CD | pET28a derivative. Expression vector of CD, residues 303-703 | This study |
| Primer (5'→3') | | This study |
| TteGA forward | CGCGGATCCTGTTCTGATGTTTCATACGTGAAGG (BamHI) | This study |
| TteGA reverse | CGCCCTCGAGTCTCTCCCCTAATACATACCTTTTG (XhoI) | This study |
| BD forward | GCGTCACGCATATGTGTTCTGATGTTTCATACGTGAA (NdeI) | This study |
| BD reverse | CGCCCTCGAGATTACCCGTCGCAGTATTTATTCCATTC (XhoI) | This study |
| CD forward | GCGTCACGCATATGCTTAAAAACTTTGGTGGAGA (NdeI) | This study |
| CD reverse | CGCCCTCGAGATTATCTCTCCCCTAATACATACCTTTTG (XhoI) | This study |
| TS1 forward | CGCGGATCCCATTTAGATAAAACCGAAGCTTCTC (BamHI) | This study |
| TS2 forward | CGCGGATCCACTGCCAATAATGATGTATCGAAAG (BamHI) | This study |
| TS3 forward | CGCGGATCCCAGGGGGCCCTTTCTGAGATTTACT (BamHI) | This study |
| TS16 forward | CGCGGATCCGAAAGTGAGGAAGAGGCGTTAAAGA (BamHI) | This study |
| T47A forward | GTTTTTGAGCTGTCGCCGCTGCATCTCTCTCCCCTGGTCC | This study |
| T47A reverse | GGACCAGGGGAGAGAGATGCAGCGGCGACAGCTCAAAAAC | This study |
| W48A forward | GTTTTTGAGCTGTCGCCGCTGTATCTCTCTCCCCTGGTCC | This study |
| W48A reverse | GGACCAGGGGAGAGAGATACAGCGGCGACAGCTCAAAAAC | This study |
| T47W48A2 forward | GTTTTTGAGCTGTCGCCGCTGCATCTCTCTCCCCTGGTCC | This study |
| T47W48A2 reverse | GGACCAGGGGAGAGAGATGCAGCGGCGACAGCTCAAAAAC | This study |
| I339G forward | TCCTTCTCCCCATGGACCTGAAAGGGAGGCAAT | This study |
| I339G reverse | ATTGCCTCCCTTTCAGGTCCATGGGGAGAAGGA | This study |
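The primer entries in Table S1 can be sanity-checked with a short script. The sketch below is an illustration only and is not part of the original study: it assumes that each mutagenesis primer pair is meant to be an exact reverse-complement pair and that each cloning primer should contain the recognition sequence of the enzyme named in parentheses. The helper names `reverse_complement` and `has_site` are hypothetical; the recognition sequences are the standard ones for BamHI, XhoI, and NdeI.

```python
# Minimal sketch (assumption-based) for checking primers copied from Table S1.

# Standard recognition sequences of the enzymes named in the table.
RESTRICTION_SITES = {"BamHI": "GGATCC", "XhoI": "CTCGAG", "NdeI": "CATATG"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def has_site(primer: str, enzyme: str) -> bool:
    """True if the primer contains the recognition sequence of `enzyme`."""
    return RESTRICTION_SITES[enzyme] in primer.upper()


# Sequences copied verbatim from Table S1.
t47a_forward = "GTTTTTGAGCTGTCGCCGCTGCATCTCTCTCCCCTGGTCC"
t47a_reverse = "GGACCAGGGGAGAGAGATGCAGCGGCGACAGCTCAAAAAC"
ttega_forward = "CGCGGATCCTGTTCTGATGTTTCATACGTGAAGG"
ttega_reverse = "CGCCCTCGAGTCTCTCCCCTAATACATACCTTTTG"

# Check 1: the mutagenesis pair should be reverse complements of each other.
pair_ok = reverse_complement(t47a_forward) == t47a_reverse

# Check 2: the cloning primers should carry the sites listed in the table.
sites_ok = has_site(ttega_forward, "BamHI") and has_site(ttega_reverse, "XhoI")

print(pair_ok, sites_ok)
```

The same two checks can be applied to the remaining rows by substituting the corresponding sequences and enzyme names.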
