Transcriptomics-guided design of synthetic promoters for a mammalian system

Joseph K. Cheng1 and Hal S. Alper1,2,*

1Department of Chemical Engineering, The University of Texas at Austin, 200 E Dean Keeton St. Stop C0400, Austin, Texas 78712
2Institute for Cellular and Molecular Biology, The University of Texas at Austin, 2500 Speedway Avenue, Austin, Texas 78712

*Correspondence and requests for materials should be addressed to H.S.A. ([email protected])

Supporting Information

Accession codes of nucleotide sequences used in this manuscript

Human cytomegalovirus: M60321.1
Simian Virus 40: J02400.1
Human chromosomes, GRCh38.p2 assembly: NC_000001.11, NC_000002.12, NC_000003.12, NC_000004.12, NC_000005.10, NC_000006.12, NC_000007.14, NC_000008.11, NC_000009.12, NC_000010.11, NC_000011.10, NC_000012.12, NC_000013.11, NC_000014.9, NC_000015.10, NC_000016.10, NC_000017.11, NC_000018.10, NC_000019.10, NC_000020.11, NC_000021.9, NC_000022.11, NC_000023.11, NC_000024.10

Equations

From the derived GMM, we can determine the probabilities of several key concerns:
1) the false positive probability at a particular threshold expression value (log-transformed) of belonging to the high expression group;
2) the false negative probability at a particular threshold expression value (log-transformed) of belonging to the high expression group; and
3) the probability of observing an expression value (log-transformed), X, less than or equal to a specified expression value (e.g. median expression value) if X belongs to the high expression group.
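
The three probabilities listed above can be illustrated numerically. The following is a minimal sketch, not the authors' implementation. It assumes a two-component univariate GMM on log-transformed expression, with one Gaussian for the low expression group and one for the high expression group, and it treats each probability as a class-conditional tail area. The original equations may instead weight the components by their mixture proportions. The parameter values, the threshold, and all function names are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist


def false_positive_prob(threshold, low):
    """P(X >= threshold | X in low group): a low-group value is called 'high'."""
    return 1.0 - low.cdf(threshold)


def false_negative_prob(threshold, high):
    """P(X < threshold | X in high group): a high-group value is called 'low'."""
    return high.cdf(threshold)


def prob_at_most(value, high):
    """P(X <= value | X in high group), e.g. value = median expression."""
    return high.cdf(value)


# Hypothetical component parameters on the log-transformed scale
# (assumed for illustration; not taken from the manuscript).
low_group = NormalDist(mu=1.0, sigma=1.0)
high_group = NormalDist(mu=4.0, sigma=1.0)
threshold = 2.5

fp = false_positive_prob(threshold, low_group)
fn = false_negative_prob(threshold, high_group)
p_le = prob_at_most(4.0, high_group)

print(f"false positive probability: {fp:.4f}")
print(f"false negative probability: {fn:.4f}")
print(f"P(X <= 4.0 | high group):   {p_le:.4f}")
```

Because the two hypothetical components have equal variance and the threshold sits midway between their means, the false positive and false negative probabilities coincide here. With fitted parameters they would generally differ, and moving the threshold trades one against the other.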