https://www.alphaknockout.com

Mouse Lkaaear1 Knockout Project (CRISPR/Cas9)

Objective: To create an Lkaaear1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Lkaaear1 gene (NCBI Reference Sequence: NM_199023; Ensembl: ENSMUSG00000045794) is located on mouse chromosome 2. Four exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 4 (Transcript: ENSMUST00000052416). Exons 2~4 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts from about 0.17% of the coding region. Exons 2~4 cover 100.0% of the coding region. The size of the effective KO region is ~781 bp. The KO region does not contain any other known gene.
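The 0.17% figure can be reproduced with simple arithmetic. The sketch below assumes the coding sequence length is derived from the 198 aa protein listed for Lkaaear1-201 in the transcript table of this report, plus one stop codon, and that the start codon sits at CDS position 1. The names `AA_LENGTH`, `CDS_BP` and `cds_percent` are illustrative only.

```python
# Assumption: CDS length = (198 amino acids + 1 stop codon) * 3 bp.
AA_LENGTH = 198
CDS_BP = (AA_LENGTH + 1) * 3


def cds_percent(position_bp, cds_bp=CDS_BP):
    """Return a 1-based CDS position as a percentage of the CDS length."""
    return 100.0 * position_bp / cds_bp


# The ATG start codon is in exon 2, so exon 2 begins at CDS position 1.
exon2_start_pct = round(cds_percent(1), 2)
print(CDS_BP, exon2_start_pct)
```

Under these assumptions the first coding base lies at roughly 0.17% of the coding region, which matches the value quoted in the note.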


Overview of the Targeting Strategy

Figure: Wildtype allele (5' to 3') showing exons 1 to 4 of mouse Lkaaear1 and the two gRNA regions. Legend: exon of mouse Lkaaear1; knockout region.


Overview of the Dot Plot (up)

Window size: 15 bp. Legend: forward matches; reverse complement matches.

Note: The 2000 bp section upstream of the start codon is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
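A self-comparison dot plot of this kind can be sketched as an exact word match at the stated window size. The code below is a minimal illustration and not the tool used for this report; it assumes exact 15 bp matches with no mismatch tolerance, and the function names `revcomp` and `dot_plot_matches` are hypothetical.

```python
def revcomp(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def dot_plot_matches(seq, window=15):
    """List (i, j, strand) for positions i < j whose window-length words
    are identical ("+") or reverse complementary ("-")."""
    seq = seq.upper()
    n_words = len(seq) - window + 1
    index = {}
    for i in range(n_words):
        index.setdefault(seq[i:i + window], []).append(i)
    hits = []
    for i in range(n_words):
        word = seq[i:i + window]
        for j in index.get(word, []):
            if j > i:  # skip the main diagonal and mirrored duplicates
                hits.append((i, j, "+"))
        for j in index.get(revcomp(word), []):
            if j > i:
                hits.append((i, j, "-"))
    return hits
```

Runs of consecutive off-diagonal hits would correspond to the diagonal lines that indicate repeats in the plot; an empty or sparse result is consistent with the absence of significant tandem repeats.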

Overview of the Dot Plot (down)

Window size: 15 bp. Legend: forward matches; reverse complement matches.

Note: The 2000 bp section downstream of the stop codon is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

Summary: Full Length(2000bp) | A(29.35% 587) | C(23.7% 474) | T(25.4% 508) | G(21.55% 431)

Note: The 2000 bp section upstream of the start codon is analyzed to determine the GC content. No significant high-GC region is found, so this region is suitable for PCR screening or sequencing analysis.
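The overall GC content follows directly from the base counts in the summary line, and the windowed distribution can be sketched in the same way. The example below is illustrative: the base counts are those reported for the upstream section, while the function names and the `step` parameter are assumptions, not the settings of the tool that produced the plot.

```python
def gc_percent(seq):
    """Return the GC content of a sequence as a percentage."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq, window=300, step=1):
    """Return the GC percentage of each window along the sequence."""
    return [gc_percent(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]


# Base counts reported for the 2000 bp upstream section.
upstream_counts = {"A": 587, "C": 474, "T": 508, "G": 431}
total = sum(upstream_counts.values())
upstream_gc = 100.0 * (upstream_counts["G"] + upstream_counts["C"]) / total
print(total, upstream_gc)
```

The counts sum to 2000 bp and give an overall GC content of 45.25% for the upstream section.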

Overview of the GC Content Distribution (down)

Window size: 300 bp

Summary: Full Length(2000bp) | A(25.05% 501) | C(27.0% 540) | T(23.6% 472) | G(24.35% 487)

Note: The 2000 bp section downstream of the stop codon is analyzed to determine the GC content. No significant high-GC region is found, so this region is suitable for PCR screening or sequencing analysis.


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
----------------------------------------------------------------------------------------
YourSeq  2000   1       2000  2000   100.0%    chr2   -       181697656  181699655  2000
YourSeq  501    1       513   2000   98.9%     chr2   +       181701119  181701629  511
YourSeq  220    24      718   2000   86.0%     chr7   +       90514998   90515706   709
YourSeq  186    67      385   2000   80.9%     chr14  +       72922966   72923262   297
YourSeq  184    31      385   2000   90.2%     chrX   -       60963764   60964346   583
YourSeq  184    72      385   2000   83.6%     chr3   +       54323346   54323648   303
YourSeq  180    70      370   2000   87.1%     chr10  -       11071726   11072075   350
YourSeq  180    71      385   2000   85.2%     chr17  +       63728766   63729081   316
YourSeq  175    70      385   2000   85.6%     chr10  +       44712319   44712629   311
YourSeq  174    70      385   2000   81.4%     chr13  -       101579824  101580133  310
YourSeq  166    70      386   2000   86.7%     chr18  +       62097482   62097799   318
YourSeq  164    65      385   2000   82.7%     chr4   -       131503317  131503621  305
YourSeq  162    70      385   2000   88.6%     chr3   +       57281267   57281586   320
YourSeq  160    70      385   2000   84.2%     chr6   -       56318760   56319066   307
YourSeq  156    61      366   2000   88.0%     chrX   -       143729759  143730243  485
YourSeq  154    74      384   2000   84.4%     chrX   +       6617794    6618096    303
YourSeq  153    65      371   2000   85.5%     chr6   -       31429414   31429719   306
YourSeq  151    29      385   2000   83.0%     chr9   +       74914723   74915038   316
YourSeq  150    57      388   2000   83.7%     chr2   -       146648695  146649013  319
YourSeq  149    65      385   2000   86.4%     chr7   -       40408616   40768587   359972

Note: The 2000 bp section upstream of the start codon is BLAT searched against the genome. No significant similarity is found.
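BLAT result rows like those above can be parsed and screened programmatically. The sketch below reads the last eleven whitespace-separated fields of each row, so it tolerates a leading "browser details" link label, and lists hits other than the full-length self match. The `min_score` threshold, the `BlatHit` type and the function names are hypothetical; the threshold is an arbitrary illustration and not the significance criterion applied in this report.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(line):
    """Parse one BLAT result row; any leading link label is ignored."""
    f = line.split()[-11:]
    if len(f) != 11:
        raise ValueError(f"expected 11 fields, got {len(f)}: {line!r}")
    return BlatHit(f[0], int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def secondary_hits(hits, min_score):
    """Return hits scoring >= min_score, excluding the full-length self match."""
    return [h for h in hits
            if h.score >= min_score
            and not (h.score == h.q_size and h.identity == 100.0)]


# First three rows of the upstream BLAT table.
ROWS = [
    "YourSeq 2000 1 2000 2000 100.0% chr2 - 181697656 181699655 2000",
    "YourSeq 501 1 513 2000 98.9% chr2 + 181701119 181701629 511",
    "YourSeq 220 24 718 2000 86.0% chr7 + 90514998 90515706 709",
]
hits = [parse_blat_row(r) for r in ROWS]
for h in secondary_hits(hits, min_score=200):  # hypothetical threshold
    print(h.chrom, h.strand, h.t_start, h.t_end, h.score, h.identity)
```

Listing secondary hits this way makes it straightforward to check whether candidate genotyping primers placed in the flanking region would also anneal elsewhere in the genome.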

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
----------------------------------------------------------------------------------------
YourSeq  2000   1       2000  2000   100.0%    chr2   -       181694873  181696872  2000
YourSeq  21     137     157   2000   100.0%    chr18  -       6881591    6881611    21
YourSeq  21     1685    1707  2000   95.7%     chr1   -       109807566  109807588  23
YourSeq  21     87      107   2000   100.0%    chr1   -       109552591  109552611  21
YourSeq  20     910     931   2000   95.5%     chr1   -       4677391    4677412    22

Note: The 2000 bp section downstream of the stop codon is BLAT searched against the genome. No significant similarity is found.


Gene and information:

Lkaaear1: LKAAEAR motif containing 1 (IKAAEAR murine motif) [Mus musculus (house mouse)]
Gene ID: 277496, updated on 12-Aug-2019

Gene summary

Official Symbol: Lkaaear1 (provided by MGI)
Official Full Name: LKAAEAR motif containing 1 (IKAAEAR murine motif) (provided by MGI)
Primary source: MGI:MGI:2685538
See related: Ensembl:ENSMUSG00000045794
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Gm692; 4930526D03Rik
Expression: Restricted expression toward testis adult (RPKM 83.1)
Orthologs: human

Genomic context

Location: 2; 2 H4
Exon count: 6

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  2    NC_000068.7 (181696793..181698449, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    2    NC_000068.6 (181431500..181433147, complement)

Chromosome 2 - NC_000068.7


Transcript information: This gene has 2 transcripts

Gene: Lkaaear1 ENSMUSG00000045794

Description: LKAAEAR motif containing 1 (IKAAEAR murine motif) [Source:MGI Symbol;Acc:MGI:2685538]
Gene Synonyms: 4930526D03Rik, LOC277496
Location: Chromosome 2: 181,696,793-181,698,442 reverse strand. GRCm38:CM000995.2
About this gene: This gene has 2 transcripts (splice variants), 83 orthologues and is a member of 1 Ensembl protein family.

Transcripts

Name          Transcript ID         bp   Protein  Translation ID        Biotype         CCDS       UniProt  Flags
Lkaaear1-201  ENSMUST00000052416.3  718  198aa    ENSMUSP00000061134.3  Protein coding  CCDS17221  Q8BIG2   TSL:1, GENCODE basic, APPRIS P1
Lkaaear1-202  ENSMUST00000132409.1  420  115aa    ENSMUSP00000116083.1  Protein coding  -          F6RL86   CDS 5' incomplete, TSL:5


Figure: Ensembl genomic view of a 21.65 kb region (181.690 Mb to 181.705 Mb). Forward strand: Tcea2-201, -205 and -207 (protein coding), Tcea2-202, -206 and -208 (retained intron), and Tcea2-204 (lncRNA). Contigs: AL844529.10, AL845173.4. Reverse strand: Lkaaear1-201 and Lkaaear1-202 (protein coding), Rgs19-201 to Rgs19-207, Rgs19-211 and Rgs19-212 (protein coding), and Rgs19-208 to Rgs19-210 (lncRNA). Additional tracks: Regulatory Build (CTCF, open chromatin, promoter, promoter flank, transcription factor binding site). Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (RNA gene; processed transcript).


Transcript: ENSMUST00000052416

Figure: Lkaaear1-201 (protein coding, reverse strand, 1.65 kb) with protein feature tracks for ENSMUSP00000061...: MobiDB lite, low complexity (Seg), Pfam (Protein LKAAEAR1) and PANTHER (Protein LKAAEAR1). Sequence variants (dbSNP and all other sources) are shown with the legend: inframe insertion, missense variant, synonymous variant. Scale bar: 0 to 198 amino acids.

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
