https://www.alphaknockout.com

Mouse Carf Knockout Project (CRISPR/Cas9)

Objective: To create a Carf knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Carf gene (NCBI Reference Sequence: NM_139150; Ensembl: ENSMUSG00000026017) is located on mouse chromosome 1. Fifteen exons are identified, with the ATG start codon in exon 2 and the TAA stop codon in exon 15 (Transcript: ENSMUST00000187978). Exons 2~4 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a null allele have aberrant learning and memory.

The coding region starts in exon 2. Exons 2~4 cover 17.61% of the coding region. The size of the effective KO region is ~4083 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', showing exons 1, 2, 3, 4 and 15, with two gRNA regions marked. Legend: exon of mouse Carf; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: self-alignment dot plot of the upstream sequence. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: self-alignment dot plot of the downstream sequence. Legend: Forward; Reverse Complement.]

Note: The 1008 bp section downstream of Exon 4 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
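The self-alignment behind such a dot plot can be sketched as follows. This is a minimal illustration, not the tool used for this report: it assumes exact matches of 15 bp windows, whereas a real dot plot program may tolerate mismatches, and the function names `reverse_complement` and `self_dot_plot` are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def self_dot_plot(seq, window=15):
    """Find exact window matches of a sequence against itself.

    Returns two lists of (i, j) start positions (0-based):
    forward  - window at i equals window at j in the same sequence, i != j
    reverse  - window at i equals window at j in the reverse complement
    Assumption: exact matching only; no mismatches are tolerated.
    """
    seq = seq.upper()
    n_windows = len(seq) - window + 1

    # Index every window of the sequence by its content.
    index = {}
    for i in range(n_windows):
        index.setdefault(seq[i:i + window], []).append(i)

    # Forward matches: same window content at two different positions.
    forward = []
    for i in range(n_windows):
        for j in index[seq[i:i + window]]:
            if i != j:
                forward.append((i, j))

    # Reverse-complement matches: look up windows of the reverse complement.
    rc = reverse_complement(seq)
    reverse = []
    for j in range(n_windows):
        for i in index.get(rc[j:j + window], []):
            reverse.append((i, j))

    return forward, reverse


# Toy example with a 7 bp window: "GATTACA" occurs twice in tandem.
forward, reverse = self_dot_plot("GATTACAGATTACA", window=7)
```

Off-diagonal forward points that line up parallel to the main diagonal indicate tandem repeats; an empty `forward` list corresponds to a flank with no repeats at this window size.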


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the upstream sequence.]

Summary: Full Length(2000bp) | A(29.65% 593) | C(20.15% 403) | T(31.95% 639) | G(18.25% 365)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
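The percentages in the Summary line follow directly from the base counts, and the overall GC content is the combined share of G and C. A minimal sketch of that arithmetic, using the upstream counts from the Summary line (the function names are hypothetical):

```python
def composition(counts):
    """Percentage of each base, rounded to two decimals."""
    total = sum(counts.values())
    return {base: round(100 * n / total, 2) for base, n in counts.items()}


def gc_percent(counts):
    """Overall GC content in percent, rounded to two decimals."""
    total = sum(counts.values())
    return round(100 * (counts["G"] + counts["C"]) / total, 2)


# Base counts taken from the Summary line of the 2000 bp upstream section.
upstream = {"A": 593, "C": 403, "T": 639, "G": 365}

print(sum(upstream.values()))   # full length in bp
print(composition(upstream))    # per-base percentages
print(gc_percent(upstream))     # overall GC content
```

For the upstream section this gives (403 + 365) / 2000, an overall GC content of 38.4%.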

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the downstream sequence.]

Summary: Full Length(1008bp) | A(28.67% 289) | C(17.56% 177) | T(39.19% 395) | G(14.58% 147)

Note: The 1008 bp section downstream of Exon 4 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
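A GC content distribution with a 300 bp window can be computed by sliding the window along the sequence one base at a time. The sketch below is one possible way to do this, not necessarily how the plots in this report were produced; `sliding_gc` is a hypothetical name, and the report does not state what threshold counts as a "high GC-content region".

```python
def sliding_gc(seq, window=300):
    """GC percentage of every window of the given size, stepping by 1 bp.

    Returns an empty list if the sequence is shorter than the window.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    seq = seq.upper()
    if len(seq) < window:
        return []

    is_gc = [1 if base in "GC" else 0 for base in seq]

    # Count G/C in the first window, then update the count incrementally.
    running = sum(is_gc[:window])
    values = [100 * running / window]
    for start in range(1, len(seq) - window + 1):
        running += is_gc[start + window - 1] - is_gc[start - 1]
        values.append(100 * running / window)
    return values


# Toy example with a 4 bp window instead of 300 bp.
print(sliding_gc("GGCCAATT", window=4))
```

Taking `max(sliding_gc(seq))` gives the GC content of the richest window, which can then be compared against whatever cutoff the primer design or sequencing protocol requires.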


BLAT Search Results (up)

QUERY SCORE START END QSIZE IDENTITY CHROM STRAND START END SPAN
------------------------------------------------------------------
YourSeq 2000 1 2000 2000 100.0% chr1 + 60106127 60108126 2000
YourSeq 180 1343 1546 2000 96.0% chr1 + 36389954 36396867 6914
YourSeq 170 1346 1561 2000 92.0% chr17 + 5910470 5910685 216
YourSeq 169 1344 1553 2000 90.0% chr1 - 180774515 180774709 195
YourSeq 168 1340 1547 2000 93.4% chr10 - 121508226 121921962 413737
YourSeq 167 1343 1563 2000 93.3% chr4 + 98724594 98724820 227
YourSeq 166 1335 1552 2000 88.2% chr5 - 100542398 100542608 211
YourSeq 166 1343 1547 2000 89.8% chr5 + 65415018 65415205 188
YourSeq 165 1343 1545 2000 93.2% chr12 - 86096247 86096458 212
YourSeq 165 1343 1661 2000 87.4% chr14 + 34748563 34748840 278
YourSeq 164 1343 1545 2000 90.4% chr7 + 49588690 49588881 192
YourSeq 160 1343 1544 2000 90.2% chr16 + 32771865 32772055 191
YourSeq 159 1328 1547 2000 89.8% chr14 - 8167791 8167978 188
YourSeq 158 1343 1547 2000 88.0% chr8 - 69745938 69746120 183
YourSeq 157 1368 1553 2000 90.7% chr8 - 12814193 12814375 183
YourSeq 157 1343 1546 2000 90.4% chr7 - 3199861 3200052 192
YourSeq 157 1342 1555 2000 89.9% chr2 - 170202916 170203113 198
YourSeq 157 1343 1545 2000 87.4% chr10 - 59624640 59624822 183
YourSeq 157 1354 1545 2000 90.0% chr4 + 135901784 135901964 181
YourSeq 157 1343 1545 2000 93.5% chr12 + 51761901 51862153 100253

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY SCORE START END QSIZE IDENTITY CHROM STRAND START END SPAN
------------------------------------------------------------------
YourSeq 1008 1 1008 1008 100.0% chr1 + 60124913 60125920 1008
YourSeq 32 933 993 1008 65.8% chr17 + 62452568 62452606 39
YourSeq 29 969 1004 1008 94.0% chr8 - 67004051 67004095 45
YourSeq 26 958 998 1008 89.3% chrX + 159767264 159767303 40
YourSeq 24 979 1004 1008 96.2% chr18 + 36311621 36311646 26
YourSeq 21 984 1004 1008 100.0% chrX - 127908836 127908856 21
YourSeq 21 984 1004 1008 100.0% chr1 - 178561326 178561346 21
YourSeq 21 984 1004 1008 100.0% chrX + 164785786 164785806 21
YourSeq 21 984 1004 1008 100.0% chr8 + 7701366 7701386 21
YourSeq 21 955 975 1008 100.0% chr17 + 13779956 13779976 21
YourSeq 21 457 483 1008 88.9% chr13 + 118224181 118224207 27
YourSeq 20 451 470 1008 100.0% chr1 - 62752989 62753008 20

Note: The 1008 bp section downstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
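BLAT result rows such as those above can be screened programmatically. The sketch below parses one row per hit in the column order shown (QUERY, SCORE, query START and END, QSIZE, IDENTITY, CHROM, STRAND, target START and END, SPAN), sets aside the full-length self-match, and reports the remaining hits above a cutoff. The names `BlatHit`, `parse_hit` and `secondary_hits` are hypothetical, and the cutoffs are illustrative assumptions: the report does not state what it treats as "significant similarity".

```python
from dataclasses import dataclass


@dataclass
class BlatHit:
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int

    @property
    def is_self_match(self):
        """True for the full-length, 100% identity hit of the query itself."""
        return (self.q_start == 1 and self.q_end == self.q_size
                and self.identity == 100.0)

    @property
    def query_coverage(self):
        """Fraction of the query covered by this hit."""
        return (self.q_end - self.q_start + 1) / self.q_size


def parse_hit(line):
    """Parse one whitespace-separated BLAT result row."""
    fields = line.split()
    # Drop the "browser details" link text if it was copied with the row.
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    return BlatHit(
        query=fields[0], score=int(fields[1]),
        q_start=int(fields[2]), q_end=int(fields[3]), q_size=int(fields[4]),
        identity=float(fields[5].rstrip("%")),
        chrom=fields[6], strand=fields[7],
        t_start=int(fields[8]), t_end=int(fields[9]), span=int(fields[10]),
    )


def secondary_hits(hits, min_score=0, min_coverage=0.0):
    """Hits other than the self-match that meet both (assumed) cutoffs."""
    return [h for h in hits
            if not h.is_self_match
            and h.score >= min_score
            and h.query_coverage >= min_coverage]


rows = [
    "YourSeq 1008 1 1008 1008 100.0% chr1 + 60124913 60125920 1008",
    "YourSeq 32 933 993 1008 65.8% chr17 + 62452568 62452606 39",
    "YourSeq 29 969 1004 1008 94.0% chr8 - 67004051 67004095 45",
]
hits = [parse_hit(row) for row in rows]
for hit in secondary_hits(hits, min_score=30):
    print(hit.chrom, hit.score, round(hit.query_coverage, 3))
```

Filtering on query coverage as well as score helps separate short repeat-like matches from a second locus that matches most of the flank.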


Gene and information:

Carf calcium response factor [Mus musculus (house mouse)]
Gene ID: 241066, updated on 12-Aug-2019

Gene summary

Official Symbol: Carf (provided by MGI)
Official Full Name: calcium response factor (provided by MGI)
Primary source: MGI:MGI:2182269
See related: Ensembl:ENSMUSG00000026017
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Als2cr8; ECBRC-FC1; ECBRC-FC2
Expression: Broad expression in testis adult (RPKM 2.5), CNS E18 (RPKM 1.9) and 25 other tissues
Orthologs: human; all

Genomic context

Location: 1; 1 C2
Exon count: 18

Annotation release | Status | Assembly | Chr | Location
108 | current | GRCm38.p6 (GCF_000001635.26) | 1 | NC_000067.6 (60098221..60153953)
Build 37.2 | previous assembly | MGSCv37 (GCF_000001635.18) | 1 | NC_000067.5 (60155125..60207878)

[Figure: genomic context of Carf on Chromosome 1 - NC_000067.6]


Transcript information: This gene has 10 transcripts

Gene: Carf ENSMUSG00000026017

Description: calcium response factor [Source:MGI Symbol;Acc:MGI:2182269]
Gene Synonyms: Als2cr8
Location: Chromosome 1: 60,098,247-60,153,953 forward strand. GRCm38:CM000994.2
About this gene: This gene has 10 transcripts (splice variants), 167 orthologues, is a member of 1 Ensembl protein family and is associated with 7 phenotypes.

Transcripts

Name | Transcript ID | bp | Protein | Translation ID | Biotype | CCDS | UniProt | Flags
Carf-209 | ENSMUST00000187978.6 | 5463 | 689aa | ENSMUSP00000141169.1 | Protein coding | CCDS14989 | Q8VHI4 | TSL:1, GENCODE basic, APPRIS P3
Carf-206 | ENSMUST00000180952.7 | 2157 | 689aa | ENSMUSP00000137825.1 | Protein coding | CCDS14989 | Q8VHI4 | TSL:1, GENCODE basic, APPRIS P3
Carf-201 | ENSMUST00000027171.11 | 2102 | 654aa | ENSMUSP00000027171.5 | Protein coding | CCDS78592 | A8VI08 | TSL:1, GENCODE basic, APPRIS ALT2
Carf-202 | ENSMUST00000124986.7 | 410 | 79aa | ENSMUSP00000121293.1 | Protein coding | - | D3Z2W4 | CDS 3' incomplete, TSL:3
Carf-203 | ENSMUST00000130075.7 | 2763 | 87aa | ENSMUSP00000137867.1 | Nonsense mediated decay | - | M0QWJ8 | TSL:1
Carf-207 | ENSMUST00000186107.6 | 1837 | 255aa | ENSMUSP00000139554.1 | Nonsense mediated decay | - | A8VI09 | TSL:1
Carf-204 | ENSMUST00000132949.2 | 493 | 55aa | ENSMUSP00000139878.1 | Nonsense mediated decay | - | A0A087WPQ8 | CDS 5' incomplete, TSL:3
Carf-205 | ENSMUST00000150008.7 | 3795 | No protein | - | Retained intron | - | - | TSL:2
Carf-210 | ENSMUST00000191232.1 | 3235 | No protein | - | Retained intron | - | - | TSL:NA
Carf-208 | ENSMUST00000186779.1 | 2416 | No protein | - | Retained intron | - | - | TSL:NA


[Figure: Ensembl genome browser view of a 75.71 kb region (about 60.10 Mb to 60.16 Mb). Forward strand: Carf-201, Carf-202, Carf-206 and Carf-209 (protein coding); Carf-203, Carf-204 and Carf-207 (nonsense mediated decay); Carf-205, Carf-208 and Carf-210 (retained intron); Gm15464-201 (processed pseudogene). Contigs: AC116581.4, AC138597.7. Reverse strand: Wdr12-201, Wdr12-202, Wdr12-203, Wdr12-206 and Wdr12-207 (protein coding); Wdr12-204 (retained intron). Regulatory Build legend: CTCF, Open Chromatin, Promoter, Promoter Flank. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (processed transcript; pseudogene).]


Transcript: ENSMUST00000187978

[Figure: Ensembl view of Carf-209 (protein coding; 48.61 kb, forward strand) and its translation ENSMUSP00000141..., with tracks for MobiDB lite, Low complexity (Seg), Pfam (Calcium-responsive), PANTHER (Calcium-responsive transcription factor) and sequence variants (dbSNP and all other sources). Variant legend: frameshift variant, missense variant, synonymous variant. Scale bar: 0 to 689.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
