https://www.alphaknockout.com

Mouse Crabp2 Knockout Project (CRISPR/Cas9)

Objective: To create a Crabp2 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Crabp2 gene (NCBI Reference Sequence: NM_007759; Ensembl: ENSMUSG00000004885) is located on mouse chromosome 3. Four exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 4 (Transcript: ENSMUST00000005019). Exons 1~4 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Homozygotes for targeted null mutations may exhibit an additional postaxial digit, usually on a single forepaw. Penetrance is dependent on the genetic background.

Exon 1 starts from about 0.24% of the coding region. Exons 1~4 cover 100.0% of the coding region. The size of the effective KO region is ~4194 bp. The KO region does not contain any other known gene.
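The ~4194 bp figure can be cross-checked against the coordinates listed in the BLAT search results of this report. The following is a minimal sketch, assuming the effective KO region is the interval lying strictly between the 2000 bp upstream section (ending at chr3:87948841) and the 2000 bp downstream section (starting at chr3:87953036). The variable names are illustrative, and the reading of the 0.24% value is an assumption, not the stated method.

```python
# Coordinates copied from the full-length self-hit rows of the BLAT results (GRCm38, chr3).
upstream_end = 87948841      # last base of the 2000 bp section upstream of the start codon
downstream_start = 87953036  # first base of the 2000 bp section downstream of the stop codon

# Assumption: the KO region is everything strictly between the two flanking sections.
ko_size = downstream_start - upstream_end - 1
print(ko_size)

# Assumption: a 138 aa protein plus one stop codon gives the coding length in bp,
# and "starts from about 0.24%" is read as the first coding base over that length.
cds_length = (138 + 1) * 3
start_fraction_percent = round(100 * 1 / cds_length, 2)
print(cds_length, start_fraction_percent)
```

Under these assumptions the interval size agrees with the reported KO region size, and the first base of a 417 bp coding sequence corresponds to about 0.24%.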


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', showing exons 1 to 4 with two gRNA regions marked. Legend: exon of mouse Crabp2; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of the start codon is aligned with itself to determine whether there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
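The self-alignment idea behind a dot plot can be sketched in a few lines. This is a simplified illustration, not the tool used to produce the plots in this report: it assumes exact matching of fixed-length words (real dot plot programs may tolerate mismatches), and the function names are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def self_dot_plot(seq, window=15):
    """Return (forward, reverse) lists of (i, j) dots for a self-comparison.

    A forward dot (i, j) with i < j means seq[i:i+window] == seq[j:j+window].
    A reverse dot (i, j) means seq[i:i+window] equals the reverse complement
    of seq[j:j+window].
    """
    # Index every window-length word by the offsets at which it occurs.
    positions = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)

    forward, reverse = [], []
    for word, starts in positions.items():
        # Off-diagonal forward matches: the same word at two different offsets.
        forward.extend((i, j) for i in starts for j in starts if i < j)
        # Reverse-complement matches: the word's reverse complement occurs in seq.
        rc_starts = positions.get(reverse_complement(word), [])
        reverse.extend((i, j) for i in starts for j in rc_starts)
    return sorted(forward), sorted(reverse)


# Toy input: a 4-base unit repeated in tandem, followed by unrelated sequence.
toy = "ACGT" * 3 + "GGATCCTTAGA"
fwd, rev = self_dot_plot(toy, window=6)
print(fwd)
```

Runs of forward dots parallel to the main diagonal, offset by a multiple of the repeat unit, are the signature of tandem repeats. For the report's setting one would pass the 2000 bp flanking sequence with `window=15`.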

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section downstream of the stop codon is aligned with itself to determine whether there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(2000bp) | A(20.55% 411) | C(28.55% 571) | T(21.85% 437) | G(29.05% 581)

Note: The 2000 bp section upstream of the start codon is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
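A sliding-window GC profile of the kind described above can be sketched as follows. The 300 bp window comes from this report; the step size, the 70% cut-off and all function names are assumptions for illustration, since the report does not state what counts as a significantly high GC-content region.

```python
def gc_percent(seq):
    """Percentage of G and C bases in seq (0.0 for an empty string)."""
    if not seq:
        return 0.0
    gc = sum(1 for base in seq.upper() if base in "GC")
    return 100.0 * gc / len(seq)


def windowed_gc(seq, window=300, step=1):
    """GC percentage for each window of the given size, sliding by step."""
    return [gc_percent(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]


def high_gc_windows(seq, window=300, step=1, threshold=70.0):
    """Start offsets of windows whose GC percentage exceeds threshold.

    The default threshold is a hypothetical value, not one taken from the report.
    """
    profile = windowed_gc(seq, window, step)
    return [i * step for i, value in enumerate(profile) if value > threshold]
```

An empty result from `high_gc_windows` on a flanking sequence would correspond to the statement that no high GC-content region is found, under whichever threshold is chosen.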

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(2000bp) | A(21.95% 439) | C(25.55% 511) | T(29.25% 585) | G(23.25% 465)

Note: The 2000 bp section downstream of the stop codon is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
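The overall GC content of each flanking section follows directly from the base counts in the two summary lines. The sketch below parses those lines as printed in this report; the helper names and the regular expression are illustrative assumptions.

```python
import re


def parse_summary(line):
    """Extract base counts from a 'Summary:' line as a dict such as {'A': 411, ...}."""
    pairs = re.findall(r"([ACGT])\(\s*[\d.]+%\s+(\d+)\)", line)
    return {base: int(count) for base, count in pairs}


def gc_fraction_percent(counts):
    """Overall GC percentage computed from per-base counts."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


up = parse_summary(
    "Summary: Full Length(2000bp) | A(20.55% 411) | C(28.55% 571) | T(21.85% 437) | G(29.05% 581)")
down = parse_summary(
    "Summary: Full Length(2000bp) | A(21.95% 439) | C(25.55% 511) | T(29.25% 585) | G(23.25% 465)")

print(gc_fraction_percent(up), gc_fraction_percent(down))
```

From the reported counts, the upstream section works out to (571 + 581) / 2000 = 57.6% GC and the downstream section to (511 + 465) / 2000 = 48.8% GC.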


BLAT Search Results (up)

QUERY     SCORE  START    END  QSIZE  IDENTITY  CHROM  STRAND      START        END   SPAN
------------------------------------------------------------------------------------------
YourSeq    2000      1   2000   2000    100.0%  chr3   +        87946842   87948841   2000
YourSeq      30    524    563   2000     83.4%  chrY   -        61110131   61110168     38
YourSeq      30    524    563   2000     83.4%  chrY   -        47458841   47458878     38
YourSeq      30    524    563   2000     83.4%  chrY   -        15163306   15163343     38
YourSeq      29    459    488   2000    100.0%  chr17  -        39318396   39318494     99
YourSeq      28    524    563   2000     84.9%  chrY   -        59403235   59403272     38
YourSeq      28    524    563   2000     84.9%  chrY   +        43165185   43165222     38
YourSeq      27   1063   1090   2000    100.0%  chr2   +       170794296  170794326     31
YourSeq      25   1064   1089   2000    100.0%  chr19  -        30427638   30427666     29
YourSeq      25   1065   1091   2000     88.5%  chr1   -        73664621   73664646     26
YourSeq      25   1064   1090   2000     96.3%  chr5   +        23402316   23402342     27
YourSeq      23    465    487   2000    100.0%  chr7   +        56392131   56392153     23
YourSeq      23   1064   1089   2000     96.2%  chr13  +       107977371  107977398     28
YourSeq      23    470    492   2000    100.0%  chr1   +        64018823   64018845     23
YourSeq      22    468    489   2000    100.0%  chr16  -        69523065   69523086     22
YourSeq      22   1426   1447   2000    100.0%  chr14  -       122609075  122609096     22
YourSeq      22    105    126   2000    100.0%  chr1   -        59472905   59472926     22
YourSeq      22    468    489   2000    100.0%  chr7   +       118707626  118707647     22
YourSeq      22    466    489   2000     87.0%  chr5   +        15508975   15508997     23
YourSeq      21    904    924   2000    100.0%  chrY   -        40430646   40430666     21

Note: The 2000 bp section upstream of the start codon is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY     SCORE  START    END  QSIZE  IDENTITY  CHROM  STRAND      START        END   SPAN
------------------------------------------------------------------------------------------
YourSeq    2000      1   2000   2000    100.0%  chr3   +        87953036   87955035   2000
YourSeq     128    866   1210   2000     92.1%  chr5   +       100604721  100605066    346
YourSeq     108    877   1060   2000     91.7%  chr9   -        48911742   48911935    194
YourSeq     105   1078   1206   2000     95.0%  chr4   +        41167993   41168131    139
YourSeq     102    988   1206   2000     88.9%  chr8   -        25168287   25168501    215
YourSeq      98    964   1205   2000     86.7%  chr12  +        85867360   85867596    237
YourSeq      97   1101   1466   2000     92.9%  chr19  +        36850235   36850744    510
YourSeq      95   1093   1210   2000     92.8%  chr1   -        89059178   89059300    123
YourSeq      94    963   1206   2000     83.1%  chr11  +        72149151   72149371    221
YourSeq      90    963   1212   2000     89.5%  chr2   +       154626264  154626674    411
YourSeq      86    963   1138   2000     94.8%  chr1   -       171043044  171043423    380
YourSeq      86   1094   1206   2000     93.9%  chr18  +        35879906   35880023    118
YourSeq      83   1098   1206   2000     93.7%  chr5   -       129987635  129987748    114
YourSeq      83   1078   1212   2000     95.7%  chr1   +        68736448   68736591    144
YourSeq      79    877    973   2000     92.5%  chr3   +        66868444   66868548    105
YourSeq      79   1093   1220   2000     95.6%  chr1   +       156548961  156549218    258
YourSeq      75    872   1206   2000     74.2%  chr1   +        33835379   33835526    148
YourSeq      74    907   1047   2000     88.6%  chr1   -        39475330   39505416  30087
YourSeq      72    897   1131   2000     89.2%  chr12  -        55267905   55268428    524
YourSeq      72   1098   1210   2000     95.0%  chr1   -       134528580  134528697    118

Note: The 2000 bp section downstream of the stop codon is BLAT searched against the genome. No significant similarity is found.
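Result rows like those above can be screened programmatically for partial hits elsewhere in the genome. The sketch below is illustrative only: the field names, the `BlatHit` type and the score cut-off of 100 are assumptions, not part of the report or of BLAT itself, and the sample rows are copied from the downstream results.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(row):
    """Parse one result row; tolerates a leading 'browser details' prefix."""
    fields = row.split()
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    # The first remaining field is the query name (e.g. YourSeq), which is not kept.
    _, score, qs, qe, qsize, ident, chrom, strand, ts, te, span = fields
    return BlatHit(int(score), int(qs), int(qe), int(qsize),
                   float(ident.rstrip("%")), chrom, strand,
                   int(ts), int(te), int(span))


def secondary_hits(hits, min_score=100):
    """Hits other than the full-length self match, at or above min_score.

    min_score is a hypothetical cut-off chosen for this example.
    """
    return [h for h in hits
            if not (h.score == h.q_size and h.identity == 100.0)
            and h.score >= min_score]


rows = [
    "YourSeq 2000 1 2000 2000 100.0% chr3 + 87953036 87955035 2000",
    "YourSeq 128 866 1210 2000 92.1% chr5 + 100604721 100605066 346",
    "YourSeq 108 877 1060 2000 91.7% chr9 - 48911742 48911935 194",
    "YourSeq 72 1098 1210 2000 95.0% chr1 - 134528580 134528697 118",
]
hits = [parse_hit(r) for r in rows]
flagged = secondary_hits(hits)
print([(h.chrom, h.score) for h in flagged])
```

Whether a given partial hit matters for primer or gRNA placement is a judgment the report makes separately; the filter only lists candidates for inspection.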


Gene and information: Crabp2 cellular retinoic acid binding protein II [ Mus musculus (house mouse) ] Gene ID: 12904, updated on 13-Aug-2019

Gene summary

Official Symbol: Crabp2 (provided by MGI)
Official Full Name: cellular retinoic acid binding protein II (provided by MGI)
Primary source: MGI:MGI:88491
See related: Ensembl:ENSMUSG00000004885
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Crabp-2; CrabpII; AI893628
Expression: Biased expression in CNS E11.5 (RPKM 125.3), limb E14.5 (RPKM 55.3) and 3 other tissues
Orthologs: human, all

Genomic context

Location: 3 F1; 3
Exon count: 4

Annotation release | Status | Assembly | Chr | Location
108 | current | GRCm38.p6 (GCF_000001635.26) | 3 | NC_000069.6 (87948693..87953372)
Build 37.2 | previous assembly | MGSCv37 (GCF_000001635.18) | 3 | NC_000069.5 (87752615..87757294)

Chromosome 3 - NC_000069.6


Transcript information: This gene has 2 transcripts

Gene: Crabp2 ENSMUSG00000004885

Description: cellular retinoic acid binding protein II [Source:MGI Symbol;Acc:MGI:88491]
Gene Synonyms: Crabp-2, CrabpII
Location: Chromosome 3: 87,948,666-87,953,376 forward strand. GRCm38:CM000996.2
About this gene: This gene has 2 transcripts (splice variants), 240 orthologues, 15 paralogues, is a member of 1 Ensembl protein family and is associated with 2 phenotypes.

Transcripts

Name | Transcript ID | bp | Protein | Translation ID | Biotype | CCDS | UniProt | Flags
Crabp2-201 | ENSMUST00000005019.5 | 931 | 138aa | ENSMUSP00000005019.5 | Protein coding | CCDS17460 | P22935 | TSL:1 GENCODE basic APPRIS P1
Crabp2-202 | ENSMUST00000163040.1 | 658 | No protein | - | Retained intron | - | - | TSL:2

[Figure: Ensembl region view, 24.71 kb, chromosome 3 from 87.94 Mb to 87.96 Mb, forward strand. Genes (comprehensive set): Isg20l2-201 (protein coding), Crabp2-201 (protein coding), Crabp2-202 (retained intron). Contig: AC158233.2. Regulatory Build legend: CTCF, Promoter, Promoter Flank. Gene legend: protein coding (merged Ensembl/Havana); non-protein coding (processed transcript).]


Transcript: ENSMUST00000005019

[Figure: Ensembl protein view of Crabp2-201 (protein coding), 4.71 kb, forward strand, with a scale bar from 0 to 138 aa. Domain annotations: Superfamily: Calycin; Prints: Cytosolic fatty-acid binding; Pfam: Lipocalin/cytosolic fatty-acid binding domain; PROSITE patterns: Cytosolic fatty-acid binding; PANTHER: Cellular retinoic acid-binding protein 2, Intracellular lipid binding protein; Gene3D: Calycin. Sequence variants (dbSNP and all other sources) legend: missense variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
