https://www.alphaknockout.com

Mouse Limch1 Knockout Project (CRISPR/Cas9)

Objective: To create a Limch1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Limch1 gene (NCBI Reference Sequence: NM_001001980; Ensembl: ENSMUSG00000037736) is located on mouse chromosome 5. 26 exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 26 (Transcript: ENSMUST00000101164). Exons 5~7 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 5 starts from about 10.88% of the coding region. Exons 5~7 cover 10.66% of the coding region. The size of the effective KO region is ~9684 bp. The KO region does not contain any other known gene.
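The percentages in the note can be turned into approximate nucleotide positions. The sketch below assumes the coding region length is 3 x 1057 nt (taken from the 1057 aa protein of ENSMUST00000101164, stop codon excluded); that length and the helper name pct_to_nt are assumptions for illustration, and because the percentages are rounded, the results are estimates only.

```python
# Assumption: coding length derived from the 1057 aa protein, stop codon excluded.
PROTEIN_LENGTH_AA = 1057
CDS_LENGTH_NT = PROTEIN_LENGTH_AA * 3  # 3171 nt


def pct_to_nt(percent, total_nt):
    """Convert a percentage of the coding region to a rounded nucleotide count."""
    return round(percent / 100 * total_nt)


# Approximate coding offset at which exon 5 begins (10.88% of the coding region).
exon5_start_nt = pct_to_nt(10.88, CDS_LENGTH_NT)

# Approximate number of coding nucleotides removed with exons 5~7 (10.66%).
deleted_coding_nt = pct_to_nt(10.66, CDS_LENGTH_NT)

# A remainder other than 0 suggests the deletion would shift the reading frame.
frame_remainder = deleted_coding_nt % 3

print(exon5_start_nt, deleted_coding_nt, frame_remainder)
```

Under these assumptions the deleted coding length is not a multiple of three, which would be consistent with a frameshift downstream of exon 4; the exact exon boundaries should be confirmed against the transcript annotation.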


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (5' to 3') showing exons 1, 5, 6, 7 and 26, with one gRNA region on each side of exons 5~7. Legend: exon of mouse Limch1; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: self-alignment dot plot. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of Exon 5 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
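As a minimal sketch of the self-alignment idea described in the note, the code below indexes every window of a sequence and reports off-diagonal matches, which is where tandem repeats appear. It is not the tool used to produce the plots in this report; the function names and the toy sequence are hypothetical, and only exact window matches are considered.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq, window=15):
    """Return (forward, reverse) lists of (i, j) exact window matches of seq against itself.

    forward holds off-diagonal matches with j > i; reverse holds positions j whose
    window equals the reverse complement of the window at i.
    """
    seq = seq.upper()
    n_windows = len(seq) - window + 1
    positions = defaultdict(list)
    for i in range(n_windows):
        positions[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(n_windows):
        word = seq[i:i + window]
        forward.extend((i, j) for j in positions[word] if j > i)
        reverse.extend((i, j) for j in positions.get(reverse_complement(word), []))
    return forward, reverse


# Toy input: a 7 bp unit repeated in tandem, scanned with a 7 bp window.
toy = "GATTACA" * 5
forward, reverse = self_dot_plot(toy, window=7)
print(sorted({j - i for i, j in forward}))  # offsets between matching windows
```

In the toy sequence every forward match lies at an offset that is a multiple of the repeat unit, which is the pattern of parallel diagonals that marks a tandem repeat in a dot plot. For the 2000 bp flanks the window would be 15, as stated above.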

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: self-alignment dot plot. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section downstream of Exon 7 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution plot.]

Summary: Full Length(2000bp) | A(27.5% 550) | C(18.55% 371) | T(32.7% 654) | G(21.25% 425)

Note: The 2000 bp section upstream of Exon 5 is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution plot.]

Summary: Full Length(2000bp) | A(25.45% 509) | C(20.1% 402) | T(33.15% 663) | G(21.3% 426)

Note: The 2000 bp section downstream of Exon 7 is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
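The overall GC content of each flank follows directly from the base counts in the two summaries. The sketch below computes it and also shows one way a 300 bp sliding-window profile could be calculated; the function names are hypothetical, and the windowing scheme (step size, handling of sequence ends) used for the original plots is not stated in this report.

```python
def gc_percent_from_counts(a, c, g, t):
    """Overall GC percentage from the four base counts."""
    return 100.0 * (g + c) / (a + c + g + t)


def sliding_gc(seq, window=300, step=1):
    """GC percentage of each full window along seq (a simple sketch)."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        profile.append(100.0 * (chunk.count("G") + chunk.count("C")) / window)
    return profile


# Base counts copied from the two summary lines above.
upstream_gc = gc_percent_from_counts(a=550, c=371, g=425, t=654)
downstream_gc = gc_percent_from_counts(a=509, c=402, g=426, t=663)
print(round(upstream_gc, 2), round(downstream_gc, 2))
```

Both flanks come out at roughly 40% GC overall, which agrees with the notes that no region of high GC content stands out.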


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr5   +       66957589   66959588   2000
YourSeq  129    1696   1916  2000   92.7%     chr1   +       181883559  181883802  244
YourSeq  128    1677   1848  2000   86.0%     chr7   -       142047299  142047452  154
YourSeq  124    1515   1848  2000   93.2%     chr8   -       22795859   22796426   568
YourSeq  121    1683   1848  2000   85.6%     chr13  -       100691239  100691392  154
YourSeq  120    1677   1848  2000   84.9%     chr2   +       25004726   25004878   153
YourSeq  119    1718   1990  2000   89.2%     chr8   -       93712420   93712691   272
YourSeq  119    1685   1848  2000   92.2%     chr6   -       35072686   35073236   551
YourSeq  119    1679   1848  2000   85.9%     chr3   -       133565314  133565465  152
YourSeq  119    1693   1848  2000   91.0%     chr19  -       7499212    7499504    293
YourSeq  119    1718   1848  2000   96.2%     chr19  +       5163673    5164126    454
YourSeq  118    1682   1848  2000   87.0%     chr4   -       150987215  150987377  163
YourSeq  118    1645   1847  2000   92.8%     chr4   -       106355300  106355751  452
YourSeq  118    1703   1848  2000   88.9%     chrX   +       144166551  144166689  139
YourSeq  118    1684   1848  2000   85.5%     chr11  +       29578426   29578566   141
YourSeq  117    1718   1848  2000   94.7%     chr17  +       66459821   66459951   131
YourSeq  117    1718   1854  2000   93.4%     chr17  +       45620860   45621005   146
YourSeq  116    1720   1847  2000   95.4%     chr2   +       157319058  157319185  128
YourSeq  115    1719   1848  2000   94.7%     chr3   -       68831869   68832006   138
YourSeq  115    1698   1848  2000   84.9%     chr10  +       4506552    4506690    139

Note: The 2000 bp section upstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr5   +       66969273   66971272   2000
YourSeq  36     1510   1691  2000   94.9%     chr16  -       52763045   52763326   282
YourSeq  34     982    1034  2000   92.5%     chr9   -       43881350   43881403   54
YourSeq  31     987    1035  2000   81.7%     chr14  -       69799410   69799458   49
YourSeq  31     978    1038  2000   89.5%     chr5   +       136283210  136283270  61
YourSeq  30     1240   1283  2000   96.9%     chr4   +       25238456   25238871   416
YourSeq  27     987    1031  2000   80.0%     chrX   -       166521895  166521939  45
YourSeq  26     934    961   2000   96.5%     chr9   -       11780936   11780963   28
YourSeq  25     1019   1043  2000   100.0%    chr7   +       79333267   79333291   25
YourSeq  24     1014   1039  2000   96.2%     chr16  -       93289566   93289591   26
YourSeq  24     1019   1042  2000   100.0%    chr13  -       34570179   34570202   24
YourSeq  24     1016   1039  2000   100.0%    chr11  -       87674873   87674896   24
YourSeq  24     1019   1042  2000   100.0%    chr4   +       57071563   57071586   24
YourSeq  23     985    1031  2000   69.6%     chr13  -       14706885   14706930   46
YourSeq  23     801    823   2000   100.0%    chr12  -       90146785   90146807   23
YourSeq  23     1016   1038  2000   100.0%    chr2   +       63057531   63057553   23
YourSeq  22     1019   1042  2000   95.9%     chr3   -       53499384   53499407   24
YourSeq  22     1674   1697  2000   87.0%     chr3   -       32663219   32663241   23
YourSeq  21     1019   1039  2000   100.0%    chr11  -       100324861  100324881  21
YourSeq  21     1012   1032  2000   100.0%    chr1   -       157374501  157374521  21

Note: The 2000 bp section downstream of Exon 7 is BLAT searched against the genome. No significant similarity is found.
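The top hit in each BLAT table places the corresponding 2000 bp flank on chr5, and the gap between the two flanks can be compared with the stated KO region size. The sketch below assumes the displayed coordinates are 1-based and inclusive and that the flanks directly abut the knockout region; both points are assumptions, and the helper name interval_between is hypothetical.

```python
# Top BLAT hits copied from the two tables above: (chromosome, start, end).
upstream_flank = ("chr5", 66957589, 66959588)
downstream_flank = ("chr5", 66969273, 66971272)


def interval_between(up, down):
    """Return (start, end, length) of the region lying between two flanks.

    Assumes 1-based inclusive coordinates on the same chromosome.
    """
    if up[0] != down[0]:
        raise ValueError("flanks are on different chromosomes")
    start = up[2] + 1
    end = down[1] - 1
    return start, end, end - start + 1


ko_start, ko_end, ko_length = interval_between(upstream_flank, downstream_flank)
print(ko_start, ko_end, ko_length)
```

Under these assumptions the interval between the flanks is 9684 bp, in agreement with the effective KO region size given in the strategy summary.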


Gene and information:

Limch1 LIM and calponin homology domains 1 [ Mus musculus (house mouse) ]
Gene ID: 77569, updated on 24-Oct-2019

Gene summary

Official Symbol: Limch1 (provided by MGI)
Official Full Name: LIM and calponin homology domains 1 (provided by MGI)
Primary source: MGI:MGI:1924819
See related: Ensembl:ENSMUSG00000037736
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: mKIAA1102; 3732412D22Rik
Expression: Biased expression in lung adult (RPKM 20.3), heart adult (RPKM 7.0) and 11 other tissues
Orthologs: human; all

Genomic context

Location: 5; 5 C3.1
Exon count: 28

Annotation release | Status | Assembly | Chr | Location
108 | current | GRCm38.p6 (GCF_000001635.26) | 5 | NC_000071.6 (66745889..67057159)
Build 37.2 | previous assembly | MGSCv37 (GCF_000001635.18) | 5 | NC_000071.5 (67137079..67448398)

[Figure: genomic context of Limch1 on Chromosome 5 - NC_000071.6.]


Transcript information: This gene has 20 transcripts

Gene: Limch1 ENSMUSG00000037736

Description: LIM and calponin homology domains 1 [Source:MGI Symbol;Acc:MGI:1924819]
Gene Synonyms: 3732412D22Rik
Location: Chromosome 5: 66,745,827-67,057,158 forward strand. GRCm38:CM000998.2
About this gene: This gene has 20 transcripts (splice variants), 222 orthologues, 1 paralogue, is a member of 1 Ensembl protein family and is associated with 2 phenotypes.

Transcripts

Name | Transcript ID | bp | Protein | Translation ID | Biotype | CCDS | UniProt | Flags
Limch1-201 | ENSMUST00000038188.13 | 5903 | 901aa | ENSMUSP00000043163.7 | Protein coding | CCDS57347 | Q3UH68 | TSL:1, GENCODE basic
Limch1-202 | ENSMUST00000101164.10 | 5192 | 1057aa | ENSMUSP00000098723.4 | Protein coding | CCDS39102 | Q3UH68 | TSL:5, GENCODE basic, APPRIS P2
Limch1-219 | ENSMUST00000238785.1 | 7498 | 1511aa | ENSMUSP00000158661.1 | Protein coding | - | - | GENCODE basic, APPRIS ALT2
Limch1-220 | ENSMUST00000238993.1 | 5981 | 927aa | ENSMUSP00000159044.1 | Protein coding | - | - | GENCODE basic
Limch1-204 | ENSMUST00000118242.7 | 5050 | 1068aa | ENSMUSP00000112732.1 | Protein coding | - | D3YU22 | TSL:5, GENCODE basic, APPRIS ALT2
Limch1-205 | ENSMUST00000119854.7 | 4185 | 774aa | ENSMUSP00000112651.2 | Protein coding | - | D3YU59 | CDS 5' incomplete, TSL:1
Limch1-203 | ENSMUST00000117601.7 | 3157 | 898aa | ENSMUSP00000113544.2 | Protein coding | - | D3Z589 | TSL:5, GENCODE basic
Limch1-209 | ENSMUST00000132991.4 | 876 | 268aa | ENSMUSP00000123337.2 | Protein coding | - | F6X3N8 | CDS 3' incomplete, TSL:2
Limch1-207 | ENSMUST00000127184.7 | 852 | 237aa | ENSMUSP00000114681.1 | Protein coding | - | D3YV55 | CDS 3' incomplete, TSL:5
Limch1-206 | ENSMUST00000122812.4 | 520 | 29aa | ENSMUSP00000144176.1 | Protein coding | - | A0A0J9YUH0 | CDS 3' incomplete, TSL:3
Limch1-214 | ENSMUST00000153174.1 | 435 | 60aa | ENSMUSP00000118979.1 | Nonsense mediated decay | - | F6W1Y9 | CDS 5' incomplete, TSL:2
Limch1-208 | ENSMUST00000130228.1 | 360 | 83aa | ENSMUSP00000116126.1 | Nonsense mediated decay | - | D6RHT2 | TSL:5
Limch1-218 | ENSMUST00000202830.1 | 2538 | No protein | - | Retained intron | - | - | TSL:NA
Limch1-212 | ENSMUST00000140428.7 | 2458 | No protein | - | Retained intron | - | - | TSL:1
Limch1-216 | ENSMUST00000201852.3 | 1018 | No protein | - | Retained intron | - | - | TSL:3
Limch1-217 | ENSMUST00000202048.1 | 786 | No protein | - | Retained intron | - | - | TSL:2
Limch1-211 | ENSMUST00000137394.1 | 680 | No protein | - | Retained intron | - | - | TSL:2
Limch1-215 | ENSMUST00000201322.3 | 3356 | No protein | - | lncRNA | - | - | TSL:5
Limch1-210 | ENSMUST00000135334.2 | 735 | No protein | - | lncRNA | - | - | TSL:3
Limch1-213 | ENSMUST00000147050.7 | 624 | No protein | - | lncRNA | - | - | TSL:3


[Figure: Ensembl genome browser view of the Limch1 locus, 66.8Mb to 67.0Mb (331.33 kb), forward and reverse strands.]

Forward strand genes (Comprehensive set): the 20 Limch1 transcripts (Limch1-201 to Limch1-220), plus Gm6517-201 (processed pseudogene), Gm43282-201 (TEC) and Gm43281-201 (TEC).
Contigs: AC152416.2, AC158994.4, AC119834.12
Reverse strand genes (Comprehensive set): Gm15949-201 (lncRNA), Gm23841-201 (miRNA), Gm15948-201 (lncRNA), Gm42713-201 (TEC)
Regulation Legend: CTCF; Enhancer; Open Chromatin; Promoter; Promoter Flank; Transcription Factor Binding Site
Gene Legend, Protein Coding: Ensembl protein coding; merged Ensembl/Havana
Gene Legend, Non-Protein Coding: pseudogene; RNA gene; processed transcript


Transcript: ENSMUST00000101164

[Figure: Ensembl protein summary for Limch1-202 (protein coding; 310.37 kb, forward strand), ENSMUSP00000098..., scale bar 0 to 1057.]

Annotation tracks shown:
MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils)
Superfamily: CH domain superfamily
SMART: Calponin homology domain; Zinc finger, LIM-type
Prints: Calponin/LIMCH1
Pfam: Calponin homology domain; Domain of unknown function DUF4757; Zinc finger, LIM-type
PROSITE profiles: Calponin homology domain; Zinc finger, LIM-type
PROSITE patterns: Zinc finger, LIM-type
PANTHER: LIM and calponin homology domains-containing protein 1; PTHR15551
Gene3D: CH domain superfamily; 2.10.110.10
CDD: Calponin homology domain; cd08368
Variant tracks: All sequence SNPs/i...; Sequence variants (dbSNP and all other sources)
Variant Legend: missense variant; splice region variant; synonymous variant

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
