Supplementary Information
Supplementary Information for ‘5-Methylation of Cytosine in CG:CG Base-Pair Steps: A Physicochemical Mechanism for the Epigenetic Control of DNA Nanomechanics’

Tahir I. Yusufaly*, Yun Li** and Wilma K. Olson***

* Rutgers, the State University of New Jersey, Department of Physics and Astronomy, Piscataway, NJ, USA, 08854
** Delaware Valley College, Department of Chemistry and Biochemistry, Doylestown, PA, USA, 18901
*** Rutgers, the State University of New Jersey, Department of Chemistry and Chemical Biology, Piscataway, NJ, USA, 08854

A. PCA Data Tables and Secondary Plots

Table S1: CG:CG base-pair step parameter mean values, standard deviations, and magnitudes for one unit of each of the three dominant principal components. The first row shows the latent score, or fraction of the total PCA eigenvalues captured by the particular component. The second row is an estimate of the contribution of the methyl groups to the Boltzmann partition function $Z_{\mathrm{methylation}} = e^{-\Delta F_{\mathrm{methylation}}/k_B T}$ at room temperature, based on DFT calculations. The analysis included 213 data points. Units of angular parameters are degrees and units of translational parameters are Angstroms. Base-pair parameters subscripted with C:G or G:C refer to values for the lower C:G or upper G:C pairs (as illustrated in Figure 3 of the main text), respectively. Particularly dominant parameter motions, namely those greater than 0.1 Angstrom or 1 degree in magnitude, are highlighted in bold.

Table S2: AC:GT base-pair step parameter mean values, standard deviations, and magnitudes for one unit of each principal component. The analysis included 498 data points. See captions in Table I for further explanation of table information.

Table S3: GA:TC base-pair step parameter mean values, standard deviations, and magnitudes for one unit of each principal component. The analysis included 308 data points. See captions in Table I for further explanation of table information.
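The two derived quantities named in the Table S1 caption, the latent score and the Boltzmann factor for the methyl-group contribution, can be illustrated with a minimal Python sketch. It is not the authors' analysis code. The function names, the choice of kcal/mol as the unit for the free-energy difference, and the value of 298.15 K for "room temperature" are assumptions made here for illustration; the captions do not state them.

```python
import math

# Molar gas constant in kcal/(mol*K), used as k_B on a per-mole basis.
# ASSUMPTION: free-energy differences are expressed in kcal/mol.
K_B_KCAL_PER_MOL_K = 0.0019872041

# ASSUMPTION: "room temperature" is taken as 298.15 K.
ROOM_TEMPERATURE_K = 298.15


def latent_scores(eigenvalues):
    """Return each PCA eigenvalue as a fraction of the eigenvalue sum."""
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("eigenvalue sum must be positive")
    return [value / total for value in eigenvalues]


def boltzmann_factor(delta_f, temperature_k=ROOM_TEMPERATURE_K):
    """Return exp(-delta_f / (k_B * T)) for delta_f in kcal/mol."""
    return math.exp(-delta_f / (K_B_KCAL_PER_MOL_K * temperature_k))


if __name__ == "__main__":
    # Hypothetical eigenvalues, not values taken from Tables S1 to S5.
    example_eigenvalues = [3.0, 2.0, 1.0]
    print(latent_scores(example_eigenvalues))

    # Hypothetical free-energy difference of 0.5 kcal/mol.
    print(boltzmann_factor(0.5))
```

With this sign convention, a positive free-energy difference gives a factor below one and a negative difference gives a factor above one, so the second row of each table can be read as a relative statistical weight.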
Table S4: GC:GC base-pair step parameter mean values, standard deviations, and magnitudes for one unit of each principal component. The analysis included 264 data points. See captions in Table I for further explanation of table information.

Table S5: GG:CC base-pair step parameter mean values, standard deviations, and magnitudes for one unit of each principal component. The analysis included 67 data points. See captions in Table I for further explanation of table information.

B. Non-Redundant Protein-DNA Dataset

Protein family | Protein | NDB_ID | Citation

Enzymes
DNA glycosylases | Adenine glycosylase MutY | PD0496, PD0634 | [1]
 | 8-Oxoguanine glycosylase | PD0117, PD0195, PD0629, PD0631, PD0632 | [2, 3, 4]
 | 3-Methyladenine DNA glycosylase | PD0099, PD0168, PD0172 | [5, 6]
 | Uracil DNA glycosylase | PD0052, PD0053, PD0127 | [7, 8]
 | Formamidopyrimidine DNA glycosylase | PD0264, PD0292, PD0447, PD0474, PD0551 | [9, 10, 11, 12, 13]
 | Endonuclease III | PD0396, PD0397 | [14]
 | Endonuclease VIII | PD0253 | [15]
Recombinases | Cre recombinase | PD0003, PD0047, PD0103, PD0166, PD0457, PD0601 | [16, 17, 18, 19, 20, 21]
 | HIN recombinase | PD0213, PD0234, PDE009 | [22, 23]
DNA photolyases | DNA photolyase | PD0552 | [24]
Glucosyltransferases | β-Glucosyltransferase | PD0333, PD0534 | [25, 26]
DNA lyases | DNase domain of colicin E7 | PD0454 | [27]
 | Deoxyribonuclease I | PDE005, PDE006 | [28, 29]
DNA endonucleases | Restriction endonuclease BamHI | PD0139, PDE020 | [30, 31]
 | Restriction endonuclease Bgl I | PD0108 | [32]
 | Restriction endonuclease Bgl II | PD0101 | [33]
 | Restriction endonuclease BsoBI | PD0085 | [34]
 | Eco O109I | PD0608 | [35]
 | Restriction endonuclease Eco RI | PD0062 | [36]
 | Restriction endonuclease Eco RV | PD0010, PD0013, PD0038, PD0132, PDE014 | [37, 38, 39, 40, 41]
 | Restriction endonuclease HincII | PD0577 | [42]
 | Restriction endonuclease MunI | PD0151 | [43]
 | Restriction endonuclease MspI | PD0515 | [44]
 | Restriction endonuclease NaeI | PD0207 | [45]
 | Restriction endonuclease NgoIV | PD0177 | [46]
 | Restriction endonuclease Pvu II | PD0006, PD0147 | [47, 48]
 | Homing endonuclease I-Cre I | PD0189, PD0547, PD0561 | [49, 50, 51]
 | Homing endonuclease I-Mso I | PD0335 | [52]
 | Intron-encoded homing endonuclease I-Ppo I | PDE0144 | [53]
 | Homing endonuclease I-Sce I | PD0481 | [54]
 | DNA-binding domain of intron endonuclease I-Tev I | PD0200, PD0538 | [55, 56]
 | Engineered homing endonuclease I-Dmo I/I-Cre I | PD0342 | [57]
 | Endonuclease IV | PD0068 | [58]
 | Flap endonuclease-1 | PD0502 | [59]
 | Periplasmic endonuclease Vvn | PD0403 | [60]
 | Very short patch repair (VSR) endonuclease | PD0100 | [61]
Topoisomerases | Eukaryotic DNA topoisomerase I | PD0256, PDE0142 | [62, 63]
Transposases | DNA conjugative relaxase TrwC | PD0394 | [64]
 | DNA transposase Tc3 | PDE0128 | [65]
 | DNA transposase Tn5 | PD0350 | [66]
 | DNA methylase HhaI | PDE141 | [67]
ATPase | Mismatch repair protein MutS | PD0142 | [68]
DNA methyltransferase | DNA methylase Taq I | PD0188 | [69]
Reverse transcriptase | MMLV reverse transcriptase | PD0126, PD0359 | [70, 71]
DNA polymerases | DNA polymerase I | PD0030, PD0032, PD0033, PD0066, PD0300, PD0302, PD0303, PD0317, PDE0131, PDE0133 | [72, 73, 74, 75]
 | DNA polymerase IV | PD0251, PD0252, PD0358, PD0510, PD0535 | [76, 77, 78, 79]
 | DNA polymerase λ | PD0503, PD0604 | [80, 81]
 | DNA polymerase β | PDE0126 | [82]
 | DNA polymerase ι | PD0543 | [83]
 | T7 phage DNA polymerase | PD0522, PD0554 | [84, 85]
RNA polymerases | T7 phage RNA polymerase | PD0086 | [86]
Excisionase | Excisionase | PD0494 | [87]

Regulatory Proteins
Nucleic acid-binding three-helical bundles | Antennapedia homeodomain | PD0007 | [88]
 | Engrailed homeodomain | PD0016, PDT043 | [89, 90]
 | Even-skipped homeodomain | PDT031 | [91]
 | Ultrabithorax homeodomain | PD0042 | [92]
 | Mating type protein α2 | PD0257 | [93]
 | Mating type protein A1 | PDR049, PDT028 | [94, 95]
 | Msx-1 | PD0211 | [96]
 | Paired protein | PD0050, PD0259, PDE025, PDR018 | [97, 98, 99, 100]
 | Pbx1 | PD0075, PD0455 | [101, 102]
 | POU homeodomain Pit-1 | PDR034 | [103]
 | POU homeodomain Oct-1 | PD0225 | [104]
 | Interferon regulatory factor-2 | PD0076 | [105]
 | Regulatory factor X | PD0111 | [106]
 | Serum response factor accessory protein 1a | PD0020, PD0027 | [107]
 | Catabolite gene activator protein | PD0299 | [108]
 | Transcription factor PU.1 | PDT033 | [109]
 | HNF-3 | PDT013 | [110]
 | Ets-like protein 1 | PD0116 | [111]
 | 39 kDa initiator binding protein | PD0452 | [112]
 | GA binding protein (GabP) α | PDT048 | [113]
 | Replication terminator protein | PD0167 | [114]
 | Repressor activator protein 1 | PDT035 | [115]
 | Heat-shock transcription factor | PD0073, PD0183, PD0184 | [116, 117]
 | Ets-1 transcription factor | PD0260 | [98]
 | MarA | PDR056 | [118]
GerE-like domains | Nitrate/nitrite response regulator | PD0219 | [119]
Spo0A-like domains | Sporulation regulating transcription factor Spo0A | PD0314 | [120]
Trp repressor-like domains | Trp repressor | PDR009, PDR013 | [121, 122]
DnaA-like domains | Chromosomal replication initiation factor DnaA | PD0371 | [123]
σ4-like domains | σ factor SigA | PD0284 | [124]
Putative domains | Multidrug-efflux transporter regulator BmrR | PD0483 | [125]
λ repressor-like domains | λ C1 repressor | PDR010 | [126]
 | 434 C1 repressor | PDR004 | [127, 128, 129]
 | 434 Cro repressor | PDR001 | [130]
 | Purine repressor | PD0056, PD0224 | [131, 132]
Skn-1-like domains | Skn-1 | PDT064 | [133]
Leucine zipper domains | Pap1 | PD0180 | [134]
 | C-Jun | PD0241 | [135]
 | Oncogene product v-Myb | PD0290 | [136]
 | GCN4 basic protein | PDT029 | [137]
Helix-loop-helix domains | Myc proto-oncogene protein | PD0386 | [138]
 | Max protein | PD0387 | [138]
 | Sterol regulatory element binding protein SREBP-1A | PDT062 | [139]
KorB-like domains | Transcription repressor KorB | PD0480 | [140]
Ribbon-helix-helix domains | Arc repressor | PD0035 | [141]
 | Methionine repressor | PD0173, PD0245, PD0246, PD0248 | [142]
Cyclin-like domains | Transcription factor IIB | PD0070 | [143]
Tetracycline repressor-like domains | Tetracycline repressor | PD0122 | [144]
Immunoglobulin-like β-sandwiches | T-box protein 3 | PD0311 | [145]
 | T domain from Brachyury transcription factor | PDT045 | [146]
 | Tumor suppressor p53 | PDR027 | [147]
 | NF-κB p50 subunit | PDT015 | [148]
 | NF-κB p52 subunit | PDR032 | [149]
 | NF-κB p65 subunit | PDR051 | [150]
 | Sporulation specific transcription factor Ndt80 | PD0341 | [151]
Ferredoxin-like domains | Papillomavirus-1 E2 protein | PD0227, PDV001 | [152, 153]
 | Epstein-Barr virus nuclear antigen-1 | PD0024 | [154]
SRF-like domains | Myocyte enhancer factor Mef2a | PD0121 | [155]
 | Myocyte enhancer factor Mef2b | PD0363 | [156]
STAT-like domains | Stat3 β | PD0028 | [157]
TraR-like domains | Quorum-sensing transcription factor TraR | PD0298 | [158]
TBP-like domains | TATA-box binding protein (TBP) | PD0154, PD0155, PD0156, PD0157, PD0158, PD0159, PD0160, PD0161, PD0162, PD0163, PD0164, PDT012, PDT034 | [159, 160, 161]
SeqA C-terminal domain | Replication modulator SeqA | PD0390 | [162]
Smad MH1 domain | Smad MH1 domain | PD0409 | [163]
Zinc fingers | Designed zinc finger protein | PDTB41 | [164]
 | Zif268 zinc finger protein | PD0231, PD0424, PDT039, PDT055, PDT057, PDT058 | [165, 166, 167, 168]
 | TATA box-binding zinc finger TATAZF | PD0187 | [169]
 | Yin-yang 1 zinc finger | PDT038 | [170]
Zn2Cys6 domains | Heme activator protein Hap1 | PD0089, PD0090 | [171, 105]
 | Zn2Cys6 protein Put3 | PDT044 | [172]
Glucocorticoid receptor-like domains | Ecdysone receptor-ultraspiracle heterodimer | PD0471 | [173]
 | Retinoid X receptor-thyroid hormone receptor heterodimer | PDR021 | [174]
 | Estrogen receptor | PDRC03 | [175]
 | Glucocorticoid receptor | PD0475, PDT030 | [176, 177]
 | Retinoid X receptor | PD0071, PD0115 | [178, 179]
 | Orphan nuclear receptor RevErb | PD0008 | [180]
Three-helical bundle + leucine zipper | Oncogene product C-Myb + C/ebp β | PD0289 | [136]
Zinc finger + Leucine zipper | Zif268 + Leucine zipper | PD0312 | [181]
TFIIA + TBP | TFIIA + TBP | PD0369, PD0393, PDT036 | [182, 183]
TFIIB + TBP | TFIIB + TBP | PDR031 | [184]
σ4 + λ repressor | σ4 + λ repressor | PD0492 | [185]
Three helical-bundle + SRF | Three helical-bundle + SRF | PDR036 | [186]

Structural Proteins
HMG-box domains | High mobility group box protein | PD0051, PD0110 | [187, 188]
IHF-like domains | HU protein