https://www.alphaknockout.com

Mouse Creg1 Knockout Project (CRISPR/Cas9)

Objective: To create a Creg1 knockout mouse model (C57BL/6J) by CRISPR/Cas9-mediated genome engineering.

Strategy summary: The Creg1 gene (NCBI Reference Sequence: NM_011804; Ensembl: ENSMUSG00000040713) is located on mouse chromosome 1. Four exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 4 (Transcript: ENSMUST00000111432). Exons 1~4 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice heterozygous for a knock-out allele exhibit decreased neovascularization after induction of hindlimb ischemia, and show increased infarction size, elevated cardiomyocyte apoptosis and impaired autophagy following myocardial ischemia/reperfusion injury.

Exon 1 starts from about 0.15% of the coding region. Exons 1~4 cover 100.0% of the coding region. The size of the effective KO region is ~10060 bp. The KO region does not contain any other known gene.
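The quoted KO region size can be cross-checked against the chr1 coordinates listed in the BLAT Search Results sections of this report. The sketch below assumes that the 2000 bp upstream section ends immediately before the start codon and that the 2000 bp downstream section begins immediately after the stop codon; the constant and function names are illustrative only.

```python
# Coordinates copied from the chr1 (+ strand) self-hits in the BLAT tables.
# Assumption: these flanks directly abut the start and stop codons.
UPSTREAM_FLANK_END = 165_763_785      # last base of the 2000 bp upstream section
DOWNSTREAM_FLANK_START = 165_773_846  # first base of the 2000 bp downstream section


def span_between(flank_end: int, flank_start: int) -> int:
    """Count the bases lying strictly between two 1-based coordinates."""
    return flank_start - flank_end - 1


ko_span = span_between(UPSTREAM_FLANK_END, DOWNSTREAM_FLANK_START)
print(ko_span)  # 10060
```

Under these assumptions the interval between the two flanks matches the ~10060 bp figure given above.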


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', showing exons 1 to 4 with a gRNA region on each side of the knockout region. Legend: exon of mouse Creg1; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: self-alignment dot plot of the upstream section. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of start codon is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: self-alignment dot plot of the downstream section. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section downstream of stop codon is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
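The self-alignment behind a dot plot can be sketched in a few lines. The example below is a minimal illustration, not the tool used to produce the figures: it compares the forward strand only (the figures also plot reverse-complement matches), requires exact identity over the whole window, and uses a made-up toy sequence. The function name `dot_plot` is hypothetical.

```python
def dot_plot(seq: str, window: int = 15) -> list:
    """Return sorted (i, j) pairs, i < j, whose window-length words are identical.

    Off-diagonal hits like these are what show up as repeat signals
    when a sequence is plotted against itself.
    """
    positions = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    hits = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


# Toy sequence (not from the Creg1 locus): one 15 bp unit repeated in tandem.
unit = "ACGTTGCATCGGATC"
toy = "TTTT" + unit + unit + "GGGG"
print(dot_plot(toy, window=15))  # [(4, 19)]
```

A sequence without repeats returns an empty list, which corresponds to a dot plot showing only the main diagonal.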


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the upstream section.]

Summary: Full Length(2000bp) | A(29.7% 594) | C(22.4% 448) | T(24.9% 498) | G(23.0% 460)

Note: The 2000 bp section upstream of start codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the downstream section.]

Summary: Full Length(2000bp) | A(27.75% 555) | C(24.1% 482) | T(26.2% 524) | G(21.95% 439)

Note: The 2000 bp section downstream of stop codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
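The base-composition summaries and the windowed GC profile can be reproduced with a short script. This is a sketch under assumptions: the function names are illustrative, the window step of 1 bp is a guess (the report states only the 300 bp window size), and the flank sequences themselves are not included here, so only the reported upstream counts are used for the overall figure.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Map each base to (count, percent of sequence length)."""
    counts = Counter(seq.upper())
    n = len(seq)
    return {base: (counts[base], 100.0 * counts[base] / n) for base in "ACGT"}


def sliding_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """GC percentage for each full window along the sequence."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        profile.append(100.0 * (chunk.count("G") + chunk.count("C")) / window)
    return profile


# Overall GC of the upstream section, from the counts reported in its summary line.
upstream_counts = {"A": 594, "C": 448, "T": 498, "G": 460}
upstream_gc = 100.0 * (upstream_counts["G"] + upstream_counts["C"]) / sum(upstream_counts.values())
print(round(upstream_gc, 1))  # 45.4
```

Calling `sliding_gc(flank_sequence)` on an actual 2000 bp flank would give the curve plotted in the figures, up to the unknown step size.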


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr1   +       165761786  165763785  2000
YourSeq  275    80     573   2000   85.7%     chr6   -       143370441  143371067  627
YourSeq  265    885    1187  2000   95.0%     chr13  -       20488266   20488577   312
YourSeq  260    886    1176  2000   95.2%     chr6   -       73557722   73558035   314
YourSeq  259    899    1191  2000   95.8%     chr18  +       11567384   11567680   297
YourSeq  258    887    1189  2000   94.0%     chr10  +       98027252   98027586   335
YourSeq  257    586    1166  2000   88.0%     chr11  +       112131086  112131517  432
YourSeq  256    886    1167  2000   95.8%     chr8   -       58047646   58047946   301
YourSeq  255    886    1187  2000   94.5%     chr1   +       58547541   58547854   314
YourSeq  254    886    1189  2000   95.1%     chr4   -       48024939   48025246   308
YourSeq  254    886    1187  2000   94.2%     chr19  -       59115038   59115351   314
YourSeq  254    886    1189  2000   95.4%     chr18  -       40828042   40828358   317
YourSeq  254    886    1167  2000   95.4%     chr1   -       168228009  168228301  293
YourSeq  254    886    1186  2000   94.8%     chr4   +       72104547   72104874   328
YourSeq  253    886    1189  2000   93.5%     chr9   -       77180115   77180425   311
YourSeq  252    887    1167  2000   95.4%     chr1   -       73179114   73179413   300
YourSeq  252    886    1188  2000   94.4%     chr9   +       77564841   77565155   315
YourSeq  252    895    1189  2000   94.0%     chr1   +       182971646  182971938  293
YourSeq  251    886    1167  2000   95.4%     chr18  -       30545915   30546216   302
YourSeq  250    886    1167  2000   94.7%     chr2   -       22863631   22863931   301

Note: The 2000 bp section upstream of start codon is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr1   +       165773846  165775845  2000
YourSeq  76     1036   1209  2000   92.3%     chr6   +       82903969   83126436   222468
YourSeq  40     1035   1132  2000   89.8%     chr6   +       98046369   98046466   98
YourSeq  40     833    1080  2000   83.7%     chr10  +       40328853   40329098   246
YourSeq  37     1119   1205  2000   72.5%     chr19  -       42050808   42050898   91
YourSeq  37     1027   1132  2000   86.4%     chr12  +       55767503   55767607   105
YourSeq  34     1057   1132  2000   94.9%     chr9   +       58579752   58579827   76
YourSeq  33     1057   1131  2000   94.8%     chr5   -       143574091  143574168  78
YourSeq  33     1160   1205  2000   94.5%     chr3   -       131637037  131637082  46
YourSeq  33     1160   1197  2000   94.6%     chr11  +       77484143   77484181   39
YourSeq  30     1038   1136  2000   96.9%     chr2   -       75573184   75573283   100
YourSeq  29     1052   1081  2000   100.0%    chr18  -       54528655   54528685   31
YourSeq  26     1119   1147  2000   96.6%     chr9   -       16573356   16573386   31
YourSeq  24     1919   1948  2000   90.0%     chr13  -       104237652  104237681  30
YourSeq  23     702    725   2000   100.0%    chr2   -       147035267  147035291  25
YourSeq  21     1143   1171  2000   86.3%     chr1   -       58013175   58013203   29

Note: The 2000 bp section downstream of stop codon is BLAT searched against the genome. No significant similarity is found.
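For readers who want to work with the BLAT hits programmatically, the rows can be parsed into typed records. The sketch below is illustrative: the `BlatHit` type and `parse_hit` function are hypothetical names, the parser assumes each row ends with the ten whitespace-separated fields from SCORE to SPAN, and the three sample rows are copied from the downstream results. It separates the self-hit from the best remaining hit without applying any significance threshold, since the report does not state one.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float   # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(row: str) -> BlatHit:
    """Parse one result row; only the last ten fields (SCORE..SPAN) are used."""
    f = row.split()[-10:]
    return BlatHit(int(f[0]), int(f[1]), int(f[2]), int(f[3]),
                   float(f[4].rstrip("%")), f[5], f[6],
                   int(f[7]), int(f[8]), int(f[9]))


rows = [
    "YourSeq 2000 1 2000 2000 100.0% chr1 + 165773846 165775845 2000",
    "YourSeq 76 1036 1209 2000 92.3% chr6 + 82903969 83126436 222468",
    "YourSeq 40 1035 1132 2000 89.8% chr6 + 98046369 98046466 98",
]
hits = [parse_hit(r) for r in rows]
self_hit = max(hits, key=lambda h: h.score)
best_other = max((h for h in hits if h is not self_hit), key=lambda h: h.score)
print(self_hit.chrom, self_hit.t_start, best_other.score)  # chr1 165773846 76
```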


Gene and information:

Creg1 cellular repressor of E1A-stimulated genes 1 [Mus musculus (house mouse)]
Gene ID: 433375, updated on 14-Aug-2019

Gene summary

Official Symbol: Creg1 (provided by MGI)
Official Full Name: cellular repressor of E1A-stimulated genes 1 (provided by MGI)
Primary source: MGI:MGI:1344382
See related: Ensembl:ENSMUSG00000040713
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Creg; AA755314
Expression: Broad expression in placenta adult (RPKM 225.0), liver adult (RPKM 174.7) and 21 other tissues
Orthologs: human

Genomic context

Location: 1; 1 H2.3
Exon count: 5

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  1    NC_000067.6 (165763758..165775309)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    1    NC_000067.5 (167693911..167705435)

Chromosome 1 - NC_000067.6


Transcript information: This gene has 3 transcripts

Gene: Creg1 ENSMUSG00000040713

Description: cellular repressor of E1A-stimulated genes 1 [Source:MGI Symbol;Acc:MGI:1344382]
Gene Synonyms: Creg
Location: 165,763,746-165,775,308 forward strand. GRCm38:CM000994.2
About this gene: This gene has 3 transcripts (splice variants), 194 orthologues, 1 paralogue and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID         bp    Protein  Translation ID        Biotype                  CCDS       UniProt  Flags
Creg1-202  ENSMUST00000111432.9  2163  220aa    ENSMUSP00000107060.3  Protein coding           CCDS35759  O88668   TSL:1 GENCODE basic APPRIS P1
Creg1-201  ENSMUST00000040298.4  425   94aa     ENSMUSP00000041234.4  Protein coding           -          K4DI63   TSL:3 GENCODE basic
Creg1-203  ENSMUST00000140769.1  2317  146aa    ENSMUSP00000137087.1  Nonsense mediated decay  -          J3QP41   TSL:1

[Figure: Ensembl genomic view, 31.56 kb, 165.76 Mb to 165.78 Mb. Forward strand genes (comprehensive set): Creg1-203 (nonsense mediated decay), Creg1-202 (protein coding), Creg1-201 (protein coding), Gm16565-201 (lncRNA), Gm16565-202 (lncRNA). Contig: AC124587.5. Reverse strand genes (comprehensive set): Gm36972-201 (lncRNA). Regulatory Build legend: CTCF, Open Chromatin, Promoter, Promoter Flank. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (RNA gene, processed transcript).]


Transcript: ENSMUST00000111432

[Figure: Creg1-202 (protein coding), 11.56 kb, forward strand. Protein feature tracks for ENSMUSP00000107...: Low complexity (Seg); Cleavage site (Sign...); Superfamily SSF50475; Pfam PF13883; PIRSF Cellular repressor of E1A-stimulated genes (CREG); PANTHER PTHR13343:SF21 and PTHR13343; Gene3D FMN-binding split barrel. Sequence variants (dbSNP and all other sources). Variant legend: start lost, missense variant, synonymous variant. Scale bar: 0 to 220.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
