https://www.alphaknockout.com

Mouse Crip1 Knockout Project (CRISPR/Cas9)

Objective: To create a Crip1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Crip1 gene (NCBI Reference Sequence: NM_007763; Ensembl: ENSMUSG00000006360) is located on mouse chromosome 12. Five exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 4 (Transcript: ENSMUST00000006523). Exons 1~4 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 1 starts from about 0.43% of the coding region. Exons 1~4 cover 100.0% of the coding region. The size of the effective KO region is ~1542 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: wildtype allele, 5' to 3', showing exons 1 to 5 of mouse Crip1 and two gRNA regions. Legend: exon of mouse Crip1; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the sequence aligned against itself, forward and reverse complement.]

Note: The 2000 bp section upstream of the start codon is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
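The self-alignment behind a dot plot can be sketched in a few lines. This is a minimal illustration, not the tool used for this report: it compares every 15 bp word of a sequence with every other word on the forward strand only, and the function names and the `max_offset` cutoff for calling a repeat "tandem" are assumptions made for the example.

```python
def dot_plot_matches(seq, window=15):
    """Return sorted (i, j) pairs, i < j, whose window-length words are identical."""
    positions = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    hits = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


def has_tandem_repeat(seq, window=15, max_offset=200):
    """Flag matches that lie close to the main diagonal (hypothetical cutoff)."""
    return any(j - i <= max_offset for i, j in dot_plot_matches(seq, window))


unit = "ACGTTGCAAGGCTTAACCGGT"           # toy 21 bp unit, not from the Crip1 locus
print(has_tandem_repeat(unit))           # False: no 15 bp word occurs twice
print(has_tandem_repeat(unit * 2))       # True: the unit is repeated in tandem
```

Each match corresponds to one off-diagonal dot in the plot; a run of matches parallel to the diagonal indicates a repeat.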

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the sequence aligned against itself, forward and reverse complement.]

Note: The 2000 bp section downstream of the stop codon is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(2000bp) | A(21.2% 424) | C(27.15% 543) | T(23.65% 473) | G(28.0% 560)

Note: The 2000 bp section upstream of the start codon is analyzed to determine the GC content. Regions of significantly high GC content are found, and the gRNA site is selected outside of these regions.
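A sliding-window GC scan of this kind can be sketched as follows. The 300 bp window comes from the report; the function names and the threshold are assumptions for illustration, since the report does not state what fraction it treats as significantly high.

```python
def gc_windows(seq, window=300, step=1):
    """Yield (start, gc_fraction) for each full window along the sequence."""
    seq = seq.upper()
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        yield start, (chunk.count("G") + chunk.count("C")) / window


def high_gc_starts(seq, window=300, threshold=0.70):
    """Window start positions at or above the threshold (hypothetical value)."""
    return [s for s, frac in gc_windows(seq, window) if frac >= threshold]


# Toy sequence: a 600 bp GC-only block between two 600 bp AT-only blocks.
toy = "AT" * 300 + "GC" * 300 + "AT" * 300
flagged = high_gc_starts(toy, threshold=0.5)
print(flagged[0], flagged[-1])   # 450 1050
```

A gRNA candidate would then be checked against the flagged windows before selection.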

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(2000bp) | A(20.7% 414) | C(26.45% 529) | T(25.2% 504) | G(27.65% 553)

Note: The 2000 bp section downstream of the stop codon is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
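The base counts in the two summaries are enough to derive the overall GC content of each flank. The short check below uses only the counts printed in this report; the variable and function names are chosen for the example.

```python
# Base counts copied from the two "Summary" lines of this report.
upstream = {"A": 424, "C": 543, "T": 473, "G": 560}
downstream = {"A": 414, "C": 529, "T": 504, "G": 553}


def gc_percent(counts):
    """Overall GC content, in percent, from per-base counts."""
    total = sum(counts.values())
    return 100 * (counts["G"] + counts["C"]) / total


for name, counts in (("upstream", upstream), ("downstream", downstream)):
    print(name, sum(counts.values()), round(gc_percent(counts), 2))
# upstream 2000 55.15
# downstream 2000 54.1
```

Both flanks sum to the stated 2000 bp, and their overall GC content differs by about one percentage point, so the high-GC regions reported upstream are a local feature rather than a property of the whole flank.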


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr12  +       113150115  113152114  2000
YourSeq  34     1399   1463  2000   70.3%     chrX   -       146144196  146144242  47
YourSeq  31     1425   1471  2000   84.9%     chr7   +       3235688    3235731    44
YourSeq  29     1455   1493  2000   87.2%     chr18  -       16636273   16636311   39
YourSeq  29     1423   1469  2000   96.8%     chr1   -       54102025   54102310   286
YourSeq  27     1426   1455  2000   96.7%     chr10  +       121704063  121704108  46
YourSeq  25     1856   1880  2000   100.0%    chr5   +       122513959  122513983  25
YourSeq  24     1419   1443  2000   100.0%    chr11  -       84669646   84669672   27
YourSeq  23     1678   1700  2000   100.0%    chr4   +       138986001  138986023  23
YourSeq  23     257    280   2000   100.0%    chr10  +       66447011   66447036   26
YourSeq  22     1451   1472  2000   100.0%    chr4   -       142487695  142487716  22
YourSeq  22     1062   1093  2000   84.4%     chr1   +       69468111   69468142   32
YourSeq  21     1301   1321  2000   100.0%    chr3   -       140885146  140885166  21
YourSeq  21     1451   1471  2000   100.0%    chr3   -       96622651   96622671   21
YourSeq  21     1448   1468  2000   100.0%    chr17  -       33831151   33831171   21
YourSeq  21     1451   1471  2000   100.0%    chr14  -       67953316   67953336   21
YourSeq  21     1451   1471  2000   100.0%    chr10  -       82803771   82803791   21
YourSeq  21     1451   1471  2000   100.0%    chr1   -       64827319   64827339   21
YourSeq  21     1398   1418  2000   100.0%    chr1   -       38044627   38044647   21
YourSeq  21     1451   1471  2000   100.0%    chr8   +       3524481    3524501    21

Note: The 2000 bp section upstream of the start codon is BLAT searched against the genome. Apart from the expected full-length match to chr12, no significant similarity is found; the best secondary hit has a score of 34.
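The screening of such a hit list can be sketched as below. The rows are the first four upstream hits from this report; the field names, the helper functions, and the `MIN_SIGNIFICANT_SCORE` cutoff are assumptions for illustration, since the report does not state the score it treats as significant.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one whitespace-separated row (without the QUERY column)."""
    f = line.split()
    return BlatHit(int(f[0]), int(f[1]), int(f[2]), int(f[3]),
                   float(f[4].rstrip("%")), f[5], f[6],
                   int(f[7]), int(f[8]), int(f[9]))


def secondary_hits(hits, min_score=0):
    """Hits other than the full-length self-match, at or above min_score."""
    return [h for h in hits
            if not (h.score == h.q_size and h.identity == 100.0)
            and h.score >= min_score]


MIN_SIGNIFICANT_SCORE = 100  # hypothetical cutoff, not taken from the report

ROWS = """
2000 1 2000 2000 100.0% chr12 + 113150115 113152114 2000
34 1399 1463 2000 70.3% chrX - 146144196 146144242 47
31 1425 1471 2000 84.9% chr7 + 3235688 3235731 44
29 1455 1493 2000 87.2% chr18 - 16636273 16636311 39
"""

hits = [parse_hit(line) for line in ROWS.strip().splitlines()]
best = max(secondary_hits(hits), key=lambda h: h.score)
print(best.chrom, best.score)                               # chrX 34
print(secondary_hits(hits, min_score=MIN_SIGNIFICANT_SCORE))  # []
```

With any cutoff above 34, only the self-match on chr12 remains, which is consistent with the note above.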

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr12  +       113153657  113155656  2000
YourSeq  25     909    933   2000   100.0%    chr5   +       135756018  135756042  25
YourSeq  24     910    935   2000   88.0%     chr6   +       117793017  117793041  25
YourSeq  24     906    933   2000   85.2%     chr6   +       117794159  117794185  27
YourSeq  24     910    933   2000   100.0%    chr14  +       11485753   11485776   24
YourSeq  24     909    932   2000   100.0%    chr12  +       53270603   53270626   24
YourSeq  23     909    933   2000   96.0%     chr17  +       6062247    6062271    25
YourSeq  22     585    608   2000   95.9%     chr10  +       4188676    4188699    24
YourSeq  22     394    418   2000   95.9%     chr1   +       86676441   86676467   27
YourSeq  21     913    933   2000   100.0%    chr14  -       35912886   35912906   21
YourSeq  21     17     37    2000   100.0%    chr13  -       84874906   84874926   21
YourSeq  21     1969   1990  2000   100.0%    chr10  -       49904760   49904782   23
YourSeq  21     1658   1678  2000   100.0%    chr1   -       95730917   95730937   21
YourSeq  20     1505   1524  2000   100.0%    chr1   -       133409387  133409406  20

Note: The 2000 bp section downstream of the stop codon is BLAT searched against the genome. Apart from the expected full-length match to chr12, no significant similarity is found; the best secondary hit has a score of 25.
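The chr12 self-matches in the two BLAT tables also give the coordinates of the interval between the flanks. The sketch below assumes, as the notes state, that the upstream flank ends immediately before the start codon and the downstream flank begins immediately after the stop codon; the function and constant names are chosen for the example.

```python
def region_between(upstream_end: int, downstream_start: int):
    """Return (start, end, size) of the 1-based inclusive interval between two flanks."""
    start = upstream_end + 1
    end = downstream_start - 1
    return start, end, end - start + 1


UPSTREAM_FLANK_END = 113152114      # end of the upstream chr12 self-match
DOWNSTREAM_FLANK_START = 113153657  # start of the downstream chr12 self-match

start, end, size = region_between(UPSTREAM_FLANK_END, DOWNSTREAM_FLANK_START)
print(f"chr12:{start}-{end} ({size} bp)")   # chr12:113152115-113153656 (1542 bp)
```

The resulting 1542 bp agrees with the effective KO region size given in the strategy summary.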


Gene and information: Crip1 cysteine-rich protein 1 (intestinal) [ Mus musculus (house mouse) ] Gene ID: 12925, updated on 12-Aug-2019

Gene summary

Official Symbol: Crip1 (provided by MGI)
Official Full Name: cysteine-rich protein 1 (intestinal) (provided by MGI)
Primary source: MGI:MGI:88501
See related: Ensembl:ENSMUSG00000006360
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: CRHP; CRP1; Crip
Expression: Biased expression in large intestine adult (RPKM 3684.4), small intestine adult (RPKM 2998.6) and 10 other tissues
Orthologs: human; all

Genomic context

Location: 12 F1; 12 61.59 cM

Exon count: 5

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  12   NC_000078.6 (113152012..113153879)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    12   NC_000078.5 (114390223..114392090)

Chromosome 12 - NC_000078.6


Transcript information: This gene has 6 transcripts

Gene: Crip1 ENSMUSG00000006360

Description: cysteine-rich protein 1 (intestinal) [Source:MGI Symbol;Acc:MGI:88501]
Gene Synonyms: CRP1, Crip
Location: Chromosome 12: 113,146,316-113,153,879 forward strand. GRCm38:CM001005.2
About this gene: This gene has 6 transcripts (splice variants), 131 orthologues, 21 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt     Flags
Crip1-206  ENSMUST00000200553.1   516   77aa        ENSMUSP00000143680.1  Protein coding   CCDS26205  P63254      TSL:2 GENCODE basic APPRIS P1
Crip1-201  ENSMUST00000006523.11  445   77aa        ENSMUSP00000006523.7  Protein coding   CCDS26205  P63254      TSL:1 GENCODE basic APPRIS P1
Crip1-205  ENSMUST00000199089.4   509   128aa       ENSMUSP00000142803.1  Protein coding   -          A0A0G2JEK2  TSL:3 GENCODE basic
Crip1-202  ENSMUST00000196932.1   1809  No protein  -                     Retained intron  -          -           TSL:NA
Crip1-204  ENSMUST00000198909.1   1505  No protein  -                     Retained intron  -          -           TSL:1
Crip1-203  ENSMUST00000198597.4   552   No protein  -                     lncRNA           -          -           TSL:3

[Figure: Ensembl genomic view of a 27.56 kb region of chromosome 12 (113.14 Mb to 113.16 Mb), comprehensive gene set, showing the transcripts of Mta1 (Mta1-201 to Mta1-206), Crip2 (Crip2-201 to Crip2-206), Crip1 (Crip1-201 to Crip1-206) and Tedc1 (Tedc1-201 to Tedc1-206), contigs AC073562.6 and AC161112.5, and the Regulatory Build track. Regulation legend: CTCF, enhancer, promoter, promoter flank. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (processed transcript, RNA gene).]


Transcript: ENSMUST00000006523

[Figure: Crip1-201 (protein coding), 1.87 kb, forward strand, with protein domain annotations and sequence variants (dbSNP and all other sources) shown along a scale bar from 0 to 77. Variant legend: synonymous variant.]

Protein domains and features:
Superfamily: SSF57716
SMART: Zinc finger, LIM-type
Pfam: Zinc finger, LIM-type
PROSITE profiles: Zinc finger, LIM-type
PROSITE patterns: Zinc finger, LIM-type
PANTHER: PTHR46074:SF3; PTHR46074
Gene3D: 2.10.110.10
CDD: cd09478

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
