https://www.alphaknockout.com

Mouse Glipr1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Glipr1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Glipr1 gene (NCBI Reference Sequence: NM_028608; Ensembl: ENSMUSG00000056888) is located on mouse chromosome 10. Six exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 6 (Transcript: ENSMUST00000074805). Exon 3 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Glipr1 gene. To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP23-374M16 as the template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Targeted inactivation of this gene renders mice more vulnerable to spontaneous tumorigenesis, leading to the formation of a wide spectrum of tumors and significantly shorter tumor-free survival times.

Exon 3 starts at about 54.25% of the coding region. Knockout of exon 3 will result in a frameshift of the gene. Intron 2, which receives the 5'-loxP site, is 4555 bp; intron 3, which receives the 3'-loxP site, is 2125 bp. The effective cKO region is ~586 bp and does not contain any other known gene.
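As a rough illustration of the arithmetic behind these statements, the sketch below converts the 54.25% figure into an approximate coding-sequence position and applies the usual rule of thumb that deleting a number of coding nucleotides not divisible by three shifts the reading frame. The 255 aa protein length is taken from the transcript table in this report. Counting the stop codon in the coding length, the function names, and the example exon lengths are assumptions made for illustration only, since the report does not state the exact length of exon 3.

```python
PROTEIN_AA = 255               # protein length listed for ENSMUST00000074805
CDS_NT = (PROTEIN_AA + 1) * 3  # assumption: coding length includes the stop codon


def approx_cds_position(fraction: float, cds_nt: int = CDS_NT) -> int:
    """Approximate 1-based coding nucleotide at a given fraction of the CDS."""
    return round(fraction * cds_nt)


def causes_frameshift(deleted_coding_nt: int) -> bool:
    """A deletion shifts the frame when its coding length is not a multiple of 3."""
    return deleted_coding_nt % 3 != 0


# Approximate position at which exon 3 begins, in nucleotides and in codons.
exon3_start_nt = approx_cds_position(0.5425)
exon3_start_codon = (exon3_start_nt - 1) // 3 + 1
print(exon3_start_nt, exon3_start_codon)

# Hypothetical exon lengths, for illustration only (not taken from the report).
print(causes_frameshift(100))  # 100 is not divisible by 3
print(causes_frameshift(99))   # 99 is divisible by 3
```

The positions are approximate because the percentage in the report is rounded; the exact exon boundaries should be read from the Ensembl transcript record.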


Overview of the Targeting Strategy

[Figure: schematic of the targeting strategy. Panels, top to bottom: wildtype allele (with the 5' and 3' gRNA regions marked), targeting vector, targeted allele, and constitutive KO allele (after Cre recombination). Legend: exon of mouse Glipr1; homology arm; cKO region; exon of mouse Krr1; loxP site.]


Overview of the Dot Plot
Window size: 10 bp

[Figure: dot plot of Sequence 12 against itself; forward and reverse-complement matches are plotted.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether it contains tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
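The idea behind such a self-comparison can be sketched as follows. This is a minimal illustration, not the tool used to produce the plot: it records every pair of positions whose 10 bp words are identical on the forward strand only, and the function name and toy sequences are hypothetical. Off-diagonal hits at a short, regular offset are what a tandem repeat would look like.

```python
def self_dot_plot(seq: str, window: int = 10) -> set[tuple[int, int]]:
    """Return (i, j) pairs with i < j whose window-length words are identical."""
    positions: dict[str, list[int]] = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)

    hits = set()
    for starts in positions.values():
        # Every pair of start positions sharing the same word is an off-diagonal dot.
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.add((starts[a], starts[b]))
    return hits


# Toy sequences (hypothetical): a 10 bp unit repeated in tandem, and a non-repetitive one.
tandem = "ACGGTCATTG" * 2
unique = "ACGGTCATTGCA"
print(self_dot_plot(tandem))  # one off-diagonal hit, at offset 10
print(self_dot_plot(unique))  # no off-diagonal hits
```

An empty result for a real sequence would correspond to the "no significant tandem repeat" conclusion, although a production tool would also score inexact matches and the reverse complement.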

Overview of the GC Content Distribution
Window size: 300 bp

[Figure: GC content distribution plot for Sequence 12.]

Summary: Full length (7086 bp) | A (26.66%, 1889) | C (23.74%, 1682) | T (30.86%, 2187) | G (18.74%, 1328)

Note: The sequence of the homologous arms and cKO region is analyzed to determine its GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
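The base counts in the summary line are enough to recompute the overall composition. The sketch below does that and also shows, on a toy sequence, how a sliding-window GC profile of the kind plotted above could be computed; the function name and the toy sequence are assumptions for illustration, and the actual 7086 bp sequence is not reproduced in this report.

```python
# Base counts copied from the summary line of this report.
counts = {"A": 1889, "C": 1682, "T": 2187, "G": 1328}

total = sum(counts.values())
percent = {base: round(100 * n / total, 2) for base, n in counts.items()}
gc_percent = round(100 * (counts["G"] + counts["C"]) / total, 2)
print(total, percent, gc_percent)


def windowed_gc(seq: str, window: int = 300) -> list[float]:
    """GC percentage of every full window of the given size, stepping by 1 bp."""
    seq = seq.upper()
    return [
        100 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


# Toy sequence and a small window, for illustration only.
print(windowed_gc("GGCCAATT", window=4))
```

With the reported counts, the overall GC content works out to about 42.5%, which is consistent with the note that no high-GC region is present.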


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr10  -       111989136  111992135  3000
YourSeq  322    1407   1864  3000   86.1%     chr13  -       41708539   41709040   502
YourSeq  318    1313   1861  3000   89.6%     chr5   +       41885706   41886375   670
YourSeq  317    1378   1858  3000   88.4%     chr1   -       171972452  171972983  532
YourSeq  317    1383   1860  3000   85.8%     chr8   +       128257159  128257664  506
YourSeq  315    1407   1860  3000   91.0%     chr16  +       93508903   93509385   483
YourSeq  314    1378   1872  3000   89.5%     chr14  +       72730322   72730885   564
YourSeq  311    1419   1864  3000   90.7%     chr18  +       80488621   80489101   481
YourSeq  309    1407   1864  3000   89.6%     chr15  +       41766106   41766592   487
YourSeq  307    1402   1864  3000   89.4%     chr4   -       8471176    8471662    487
YourSeq  305    1412   1861  3000   88.6%     chr16  +       20182570   20183044   475
YourSeq  301    1419   1857  3000   89.0%     chr1   -       14525248   14860958   335711
YourSeq  300    1393   1855  3000   86.0%     chr2   -       172212599  172213114  516
YourSeq  299    1394   1858  3000   88.0%     chr18  +       13214209   13214692   484
YourSeq  298    1407   1860  3000   87.9%     chr10  -       23878581   23879090   510
YourSeq  296    1412   1864  3000   88.9%     chr1   -       183607855  183608316  462
YourSeq  295    1375   1860  3000   87.6%     chr15  +       96939073   96939561   489
YourSeq  292    1392   1864  3000   90.0%     chrX   -       68548714   68549222   509
YourSeq  292    1412   1860  3000   90.4%     chr13  -       75346481   75588079   241599
YourSeq  290    1375   1858  3000   84.9%     chr2   -       103823767  103824286  520

Note: The 3000 bp sequence upstream of exon 3 was searched against the genome with BLAT. Apart from the expected self-match on chr10, no significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr10  -       111985550  111988549  3000
YourSeq  157    842    1022  3000   91.9%     chr9   -       59545520   59545691   172
YourSeq  155    841    1005  3000   97.0%     chr4   +       98763451   98763615   165
YourSeq  154    841    1005  3000   97.0%     chr8   +       69746500   69746671   172
YourSeq  153    846    1023  3000   91.4%     chr10  +       81868834   81869008   175
YourSeq  151    841    1009  3000   95.3%     chr17  +       80056257   80400909   344653
YourSeq  150    849    1215  3000   96.9%     chr9   +       57756510   57756912   403
YourSeq  149    841    1005  3000   94.0%     chr14  +       7947877    7948040    164
YourSeq  149    841    1005  3000   94.5%     chr12  +       24967180   24967343   164
YourSeq  148    841    1006  3000   94.6%     chr2   +       32192149   32192314   166
YourSeq  147    841    1001  3000   95.7%     chr7   +       104143911  104144071  161
YourSeq  147    844    998   3000   97.5%     chr5   +       96227287   96227441   155
YourSeq  147    841    1006  3000   94.6%     chr3   +       47197850   47198016   167
YourSeq  147    841    1005  3000   94.6%     chr12  +       77447631   77447795   165
YourSeq  146    845    1010  3000   93.9%     chr4   -       124875757  124875921  165
YourSeq  146    840    1001  3000   95.1%     chr2   -       127415231  127415392  162
YourSeq  146    842    1001  3000   95.7%     chr11  -       59765988   59766147   160
YourSeq  145    841    1001  3000   95.1%     chr15  -       59660701   59660861   161
YourSeq  145    841    1001  3000   95.6%     chr14  -       52126511   52126671   161
YourSeq  145    841    1001  3000   95.7%     chr10  -       70302436   70302600   165

Note: The 3000 bp sequence downstream of exon 3 was searched against the genome with BLAT. Apart from the expected self-match on chr10, no significant similarity is found.
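The self-matches in the two BLAT tables also allow a quick consistency check on the stated cKO region size. The sketch below assumes that the genomic START and END columns are 1-based inclusive coordinates (the SPAN values agree with that reading) and that the two 3000 bp queries directly flank the cKO region; both are interpretations of the report rather than statements made in it. The variable and function names are illustrative.

```python
# chr10 self-match coordinates copied from the two BLAT tables (minus strand).
upstream_hit = (111989136, 111992135)    # 3000 bp upstream of exon 3
downstream_hit = (111985550, 111988549)  # 3000 bp downstream of exon 3


def span(start: int, end: int) -> int:
    """Length of a 1-based, inclusive genomic interval."""
    return end - start + 1


# Glipr1 is on the reverse strand, so "upstream" has the higher coordinates.
cko_start = downstream_hit[1] + 1
cko_end = upstream_hit[0] - 1
cko_length = span(cko_start, cko_end)

print(span(*upstream_hit), span(*downstream_hit))
print(f"chr10:{cko_start}-{cko_end}", cko_length)
```

Under these assumptions the interval between the two flanks is 586 bp, which agrees with the effective cKO region size of ~586 bp given in the strategy summary.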


Gene information: Glipr1, GLI pathogenesis-related 1 (glioma) [Mus musculus (house mouse)]
Gene ID: 73690, updated on 12-Aug-2019

Gene summary

Official Symbol: Glipr1 (provided by MGI)
Official Full Name: GLI pathogenesis-related 1 (glioma) (provided by MGI)
Primary source: MGI:MGI:1920940
See related: Ensembl:ENSMUSG00000056888
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: RTVP1; RTVP-1; mRTVP-1; 2410114O14Rik
Expression: Ubiquitous expression in bladder adult (RPKM 18.1), placenta adult (RPKM 15.6) and 26 other tissues
Orthologs: human; all

Genomic context

Location: 10; 10 D2

Exon count: 7

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  10   NC_000076.6 (111985448..112002639, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    10   NC_000076.5 (111422504..111434320, complement)

[Figure: genomic context of Glipr1 on chromosome 10 (NC_000076.6).]


Transcript information: This gene has 5 transcripts

Gene: Glipr1 ENSMUSG00000056888

Description: GLI pathogenesis-related 1 (glioma) [Source:MGI Symbol;Acc:MGI:1920940]
Gene Synonyms: 2410114O14Rik, RTVP-1, RTVP1, mRTVP-1
Location: Chromosome 10: 111,985,448-112,002,631 reverse strand. GRCm38:CM001003.2
About this gene: This gene has 5 transcripts (splice variants), 177 orthologues, 13 paralogues, is a member of 1 Ensembl protein family and is associated with 2 phenotypes.

Transcripts

Name        Transcript ID          bp    Protein     Translation ID        Biotype         CCDS       UniProt         Flags
Glipr1-204  ENSMUST00000162508.8   1183  255aa       ENSMUSP00000123990.2  Protein coding  CCDS24169  Q4QQK5, Q9CWG1  TSL:5, GENCODE basic, APPRIS P1
Glipr1-201  ENSMUST00000074805.11  1081  255aa       ENSMUSP00000074359.5  Protein coding  CCDS24169  Q4QQK5, Q9CWG1  TSL:1, GENCODE basic, APPRIS P1
Glipr1-203  ENSMUST00000161870.2   474   98aa        ENSMUSP00000134094.1  Protein coding  -          G3UYI2          CDS 5' incomplete, TSL:3
Glipr1-205  ENSMUST00000174201.1   446   No protein  -                     lncRNA          -          -               TSL:3
Glipr1-202  ENSMUST00000159550.1   352   No protein  -                     lncRNA          -          -               TSL:2


[Figure: Ensembl region view, 37.18 kb, chromosome 10 from about 111.98 Mb to 112.01 Mb. Forward strand: Krr1-203 and Krr1-204 (protein coding), Krr1-201 and Krr1-202 (retained intron); contig AC167231.12. Reverse strand: Glipr1-201, Glipr1-203 and Glipr1-204 (protein coding), Glipr1-202 and Glipr1-205 (lncRNA), and Gm7269-201 (unprocessed pseudogene). A Regulatory Build track is shown (legend: CTCF, open chromatin, promoter, promoter flank, transcription factor binding site). Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana) and non-protein coding (processed transcript; pseudogene; RNA gene).]


Transcript: ENSMUST00000074805

[Figure: Ensembl protein view of Glipr1-201 (protein coding; reverse strand, 11.82 kb), protein ENSMUSP00000074..., with a scale bar from 0 to 255. Tracks: Transmembrane heli...; Low complexity (Seg); Cleavage site (Sign...); Superfamily (CAP superfamily); SMART (CAP domain); Prints (Cysteine-rich secretory protein-related; Venom allergen 5-like); Pfam (CAP domain); PROSITE patterns (Allergen V5/Tpx-1-related, conserved site); PANTHER (PTHR10334:SF275; Cysteine-rich secretory protein-related); Gene3D (CAP superfamily); CDD (cd05385); sequence variants (dbSNP and all other sources). Variant legend: missense variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
