Mouse Gpbp1l1 Knockout Project (CRISPR/Cas9)

https://www.alphaknockout.com

Objective: To create a Gpbp1l1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary:
The Gpbp1l1 gene (NCBI Reference Sequence: NM_029868; Ensembl: ENSMUSG00000034042) is located on mouse chromosome 4.
12 exons are identified, with the ATG start codon in exon 3 and the TAG stop codon in exon 12 (Transcript: ENSMUST00000030460).
Exons 3~5 will be selected as the target site.
Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production.
The pups will be genotyped by PCR followed by sequencing analysis.

Note:
Exon 3 starts from the coding region.
Exons 3~5 cover 32.98% of the coding region.
The size of the effective KO region: ~3744 bp.
The KO region does not contain any other known gene.

Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3' with exons 1, 3, 4, 5 and 12 labeled and two gRNA regions marked. Legend: exon of mouse Gpbp1l1; knockout region.]

Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot; legend: Forward, Reverse Complement]

Note: The 689 bp section upstream of Exon 3 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot; legend: Forward, Reverse Complement]

Note: The 2000 bp section downstream of Exon 5 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution plot]

Summary: Full Length (689 bp) | A (29.03%, 200) | C (15.82%, 109) | T (37.01%, 255) | G (18.14%, 125)

Note: The 689 bp section upstream of Exon 3 is analyzed to determine the GC content.
No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution plot]

Summary: Full Length (2000 bp) | A (30.85%, 617) | C (18.75%, 375) | T (31.9%, 638) | G (18.5%, 370)

Note: The 2000 bp section downstream of Exon 5 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

BLAT Search Results (up)

QUERY    SCORE  START  END  QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
--------------------------------------------------------------------------------------
YourSeq  689    1      689  689    100.0%    chr4   +       116570236  116570924  689
YourSeq  73     223    451  689    91.1%     chr14  -       57832305   57832627   323
YourSeq  69     206    365  689    87.1%     chr9   -       28843037   28843190   154
YourSeq  69     223    372  689    89.9%     chr12  +       27433381   27433557   177
YourSeq  68     162    302  689    92.6%     chr6   +       60983118   60983263   146
YourSeq  62     93     300  689    74.3%     chr4   -       136058204  136058327  124
YourSeq  61     206    302  689    93.0%     chr8   -       88194505   88194617   113
YourSeq  61     223    301  689    93.0%     chr5   -       119478250  119478338  89
YourSeq  61     187    301  689    83.6%     chrX   +       92785423   92785524   102
YourSeq  60     223    301  689    89.5%     chr14  -       10233998   10234089   92
YourSeq  59     216    302  689    90.5%     chr12  +       35183275   35183362   88
YourSeq  58     225    302  689    91.5%     chr1   -       136708595  136708683  89
YourSeq  57     206    300  689    93.9%     chrX   -       47020965   47021068   104
YourSeq  57     223    298  689    92.7%     chr10  -       112366531  112366623  93
YourSeq  57     223    305  689    91.4%     chrX   +       153411452  153411552  101
YourSeq  57     223    302  689    88.0%     chr10  +       68872743   68872833   91
YourSeq  55     223    298  689    93.7%     chr5   -       110241710  110241799  90
YourSeq  55     635    689  689    100.0%    chr15  -       42032742   42032796   55
YourSeq  54     223    302  689    86.5%     chr11  +       42125252   42125341   90
YourSeq  53     635    689  689    98.2%     chr1   -       4693424    4693478    55

Note: The 689 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr4   +       116574616  116576615  2000
YourSeq  129    525    671   2000   94.6%     chr13  -       59699614   59699975   362
YourSeq  128    510    684   2000   92.7%     chr5   -       100709547  100709746  200
YourSeq  128    514    675   2000   86.8%     chr1   +       24465277   24465434   158
YourSeq  118    532    686   2000   92.3%     chr6   +       121126290  121126645  356
YourSeq  113    526    716   2000   92.5%     chr2   -       21162834   21163047   214
YourSeq  112    557    1085  2000   78.9%     chr13  -       90877131   90877367   237
YourSeq  111    532    686   2000   92.4%     chr4   -       132798717  132798902  186
YourSeq  109    511    651   2000   90.0%     chr3   -       107652916  107653060  145
YourSeq  109    525    651   2000   95.1%     chr1   +       34527172   34527299   128
YourSeq  106    549    1074  2000   75.9%     chr4   +       120917171  120917461  291
YourSeq  104    532    692   2000   94.1%     chr9   -       80112542   80112831   290
YourSeq  103    105    651   2000   78.9%     chrX   -       101107281  101107593  313
YourSeq  93     535    649   2000   95.2%     chr9   +       57662526   57662642   117
YourSeq  93     553    675   2000   90.6%     chr19  +       11064282   11064428   147
YourSeq  93     532    681   2000   83.4%     chr17  +       94465432   94465553   122
YourSeq  92     535    649   2000   94.3%     chr3   -       154216180  154216296  117
YourSeq  91     535    650   2000   96.0%     chr19  +       57143002   57143119   118
YourSeq  90     536    649   2000   94.2%     chrX   -       140396440  140396554  115
YourSeq  89     535    652   2000   92.4%     chr2   -       32546556   32546674   119

Note: The 2000 bp section downstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.

Gene and protein information:

Gpbp1l1 GC-rich promoter binding protein 1-like 1 [ Mus musculus (house mouse) ]
Gene ID: 77110, updated on 24-Oct-2019

Gene summary

Official Symbol: Gpbp1l1 (provided by MGI)
Official Full Name: GC-rich promoter binding protein 1-like 1 (provided by MGI)
Primary source: MGI:MGI:1924360
See related: Ensembl:ENSMUSG00000034042
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: BC002292; 5330440M15Rik
Expression: Ubiquitous expression in large intestine adult (RPKM 15.8), thymus adult (RPKM 14.8) and 28 other tissues
Orthologs: human

Genomic context

Location: 4; 4 D1
Exon count: 14

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  4    NC_000070.6 (116557179..116593902)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    4    NC_000070.5 (116230332..116266487)

[Figure: Chromosome 4 - NC_000070.6]

Transcript information: This gene has 4 transcripts

Gene: Gpbp1l1 ENSMUSG00000034042

Description: GC-rich promoter binding protein 1-like 1 [Source:MGI Symbol;Acc:MGI:1924360]
Gene Synonyms: 5330440M15Rik
Location: Chromosome 4: 116,557,658-116,593,882 forward strand. GRCm38:CM000997.2
About this gene: This gene has 4 transcripts (splice variants), 215 orthologues, 2 paralogues and is a member of 1 Ensembl protein family.
Transcripts

Name         Transcript ID          bp    Protein     Translation ID        Biotype         CCDS       UniProt  Flags
Gpbp1l1-201  ENSMUST00000030460.14  3588  473aa       ENSMUSP00000030460.8  Protein coding  CCDS18511  Q6NZP2   TSL:1 GENCODE basic APPRIS P1
Gpbp1l1-202  ENSMUST00000106475.1   2642  473aa       ENSMUSP00000102083.1  Protein coding  CCDS18511  Q6NZP2   TSL:1 GENCODE basic APPRIS P1
Gpbp1l1-203  ENSMUST00000131913.1   714   No protein  -                     lncRNA          -          -        TSL:3
Gpbp1l1-204  ENSMUST00000138837.1   448   No protein  -                     lncRNA          -          -        TSL:3

[Figure: genomic context on chromosome 4, 116.55Mb to 116.60Mb (56.23 kb). Forward strand: Gm12953-201 (lncRNA); Gpbp1l1-201 and Gpbp1l1-202 (protein coding); Gpbp1l1-203 and Gpbp1l1-204 (lncRNA); Ccdc17-201 (protein coding); Ccdc17-202, Ccdc17-203, Ccdc17-204 and Ccdc17-205 (lncRNA). Contig: AL669953.7. Reverse strand: Tmem69-201 and Tmem69-202 (protein coding); Tmem69-203 (lncRNA); C530005A16Rik-201 (lncRNA); Nasp-201, Nasp-202 and Nasp-203 (protein coding); Nasp-204 and Nasp-208 (lncRNA). Regulatory Build legend: CTCF, Open Chromatin, Promoter, Promoter Flank. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (RNA gene).]

Transcript: ENSMUST00000030460

[Figure: protein view of Gpbp1l1-201 (protein coding, 36.23 kb, forward strand; ENSMUSP00000030...), scale 0 to 473. Tracks: MobiDB lite; Low complexity (Seg); Pfam: Vasculin family; PANTHER: Vasculin family, PTHR14339:SF10; sequence variants (dbSNP and all other sources). Variant legend: missense variant, splice region variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
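The flank statistics quoted in this report can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the vendor's pipeline: the names FLANKS, BLAT_SELF_HITS, gc_percent, span and summarize are hypothetical, and it assumes the BLAT chromosome coordinates are 1-based and inclusive. It recomputes the GC percentage of each flank from the reported base counts, confirms that the top BLAT hit for each flank spans the full query, and counts the bases lying between the two flanks on chr4.

```python
# Base counts as reported in the GC content summaries of this report.
FLANKS = {
    "upstream": {"A": 200, "C": 109, "T": 255, "G": 125},
    "downstream": {"A": 617, "C": 375, "T": 638, "G": 370},
}

# chr4 coordinates of the 100% identity BLAT hit for each flank.
# Assumption: coordinates are 1-based and inclusive.
BLAT_SELF_HITS = {
    "upstream": (116570236, 116570924),
    "downstream": (116574616, 116576615),
}


def gc_percent(counts):
    """Return the percentage of G and C bases among all counted bases."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


def span(start, end):
    """Length of an inclusive coordinate interval."""
    return end - start + 1


def summarize():
    """Collect per-flank lengths and GC content, plus the gap between flanks."""
    result = {}
    for name, counts in FLANKS.items():
        start, end = BLAT_SELF_HITS[name]
        result[name] = {
            "length_bp": sum(counts.values()),
            "blat_span_bp": span(start, end),
            "gc_percent": round(gc_percent(counts), 2),
        }
    # Bases strictly between the end of the upstream flank and the
    # start of the downstream flank.
    upstream_end = BLAT_SELF_HITS["upstream"][1]
    downstream_start = BLAT_SELF_HITS["downstream"][0]
    result["between_flanks_bp"] = downstream_start - upstream_end - 1
    return result


if __name__ == "__main__":
    for key, value in summarize().items():
        print(key, value)
```

Under these assumptions the base counts sum to the stated flank lengths (689 bp and 2000 bp), and the two flanks are separated by 3691 bp. The report does not give the exact gRNA cut sites, so this interval should not be expected to match the ~3744 bp effective KO region exactly.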
