https://www.alphaknockout.com

Mouse Carmil1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Carmil1 conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Carmil1 gene (NCBI Reference Sequence: NM_026825.3; Ensembl: ENSMUSG00000021338) is located on mouse chromosome 13. 38 exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 38 (Transcript: ENSMUST00000072889). Exon 5 is selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Carmil1 gene. To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP23-75N16 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: No abnormal phenotype was observed in a high-throughput screen, nor in a pathology assessment.

Exon 5 starts at about 6.07% of the coding region. Knockout of exon 5 will result in a frameshift of the gene. The size of intron 4 for 5'-loxP site insertion is 8693 bp, and the size of intron 5 for 3'-loxP site insertion is 9313 bp. The size of the effective cKO region is ~622 bp. The cKO region does not contain any other known gene.
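The frameshift reasoning above can be illustrated with a minimal sketch. It assumes that a deletion shifts the reading frame whenever the number of removed coding bases is not a multiple of 3, and it estimates where 6.07% falls in the coding sequence from the 1374 aa protein length listed for ENSMUST00000072889. The function names are hypothetical, the exon 5 length is not stated in this report, and whether the percentage counts the stop codon is an assumption.

```python
def causes_frameshift(deleted_coding_bp: int) -> bool:
    """Return True if removing this many coding bases shifts the reading frame."""
    if deleted_coding_bp <= 0:
        raise ValueError("deleted_coding_bp must be positive")
    return deleted_coding_bp % 3 != 0


def coding_offset_bp(fraction: float, protein_aa: int) -> int:
    """Approximate CDS offset in bp for a fractional position.

    Assumption: the coding region is (protein length + 1 stop codon) * 3 bases.
    """
    cds_bp = (protein_aa + 1) * 3
    return round(fraction * cds_bp)


# Illustrative exon lengths only; the real exon 5 length is not given here.
print(causes_frameshift(100))          # length not divisible by 3
print(causes_frameshift(99))           # length divisible by 3
print(coding_offset_bp(0.0607, 1374))  # rough start of exon 5 within the CDS
```

Under these assumptions, exon 5 would begin roughly 250 bp into the coding sequence, so a frameshift there would affect most of the protein.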


Overview of the Targeting Strategy

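[Figure: schematic of the targeting strategy showing four alleles: the wildtype allele (exons 1, 5 and 38 labelled, with the 5' and 3' gRNA regions flanking exon 5), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Carmil1; homology arm; cKO region; loxP site.]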


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence aligned against itself. Series: Forward, Reverse Complement. Axis label: Sequence 12.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
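As a rough illustration of the self-alignment idea, the sketch below lists every pair of positions whose 10 bp windows are identical, which is what the off-diagonal dots in a forward-strand dot plot represent. It is not the tool used to produce the plot in this report; the function name is hypothetical, exact matching is assumed, and the reverse-complement series is left out.

```python
from collections import defaultdict


def self_dot_plot(seq: str, window: int = 10):
    """Return sorted (i, j) pairs, i < j, whose windows of length `window` are identical.

    Forward strand only; the main diagonal (i == j) is omitted.
    """
    seq = seq.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)

    hits = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


# Toy inputs: a short repeat produces off-diagonal hits, a unique sequence does not.
print(self_dot_plot("AAAA", window=2))
print(self_dot_plot("ACGT", window=2))
```

Long runs of hits parallel to the diagonal would suggest a tandem repeat; an empty or sparse result matches the "no significant tandem repeat" conclusion.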

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence. Axis label: Sequence 12.]

Summary: Full Length(7122bp) | A(24.4% 1738) | C(23.2% 1652) | T(31.14% 2218) | G(21.26% 1514)
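The summary line can be checked with a few lines of arithmetic. The sketch below recomputes the total length, the per-base percentages and the overall GC fraction from the four base counts given above; the function name is hypothetical.

```python
def composition(counts):
    """Return (total length, per-base percentages, GC percentage) from base counts."""
    total = sum(counts.values())
    pct = {base: round(100 * n / total, 2) for base, n in counts.items()}
    gc = round(100 * (counts["G"] + counts["C"]) / total, 2)
    return total, pct, gc


# Base counts taken from the summary line.
counts = {"A": 1738, "C": 1652, "T": 2218, "G": 1514}
total, pct, gc = composition(counts)
print(total)  # sum of the four counts
print(pct)    # percentage of each base
print(gc)     # overall GC percentage
```

The counts sum to the stated 7122 bp, and the overall GC content works out to about 44.45%.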

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
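A windowed GC scan of this kind can be sketched as follows. The 300 bp window matches the window size stated above; the step size, the 70% cutoff for "high GC" and the function names are assumptions made for illustration, since the report does not state them.

```python
def gc_windows(seq: str, window: int = 300, step: int = 50):
    """Return (start, GC percent) for each full window along the sequence."""
    seq = seq.upper()
    result = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        result.append((start, 100 * gc / window))
    return result


def high_gc_starts(seq: str, window: int = 300, step: int = 50, threshold: float = 70.0):
    """Return window start positions whose GC percent meets the (assumed) threshold."""
    return [s for s, pct in gc_windows(seq, window, step) if pct >= threshold]


# Toy sequence with a small window so the behaviour is easy to follow.
toy = "GGCC" + "AATT"
print(gc_windows(toy, window=4, step=4))
print(high_gc_starts(toy, window=4, step=4))
```

An empty list from `high_gc_starts` on the real sequence would correspond to the conclusion that no significant high GC-content region is present.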


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
YourSeq  3000   1      3000  3000   100.0%    chr13  -       24165173   24168172   3000
YourSeq  100    1488   1611  3000   91.4%     chr18  -       60460851   60460975   125
YourSeq  88     1460   1569  3000   85.3%     chr13  +       81049513   81049607   95
YourSeq  85     1460   1566  3000   84.8%     chr5   -       133723790  133723881  92
YourSeq  85     1460   1569  3000   84.3%     chr17  +       79212468   79212565   98
YourSeq  85     1460   1566  3000   87.4%     chr12  +       14278957   14279056   100
YourSeq  84     1462   1566  3000   89.4%     chr2   -       12691652   12691754   103
YourSeq  84     1487   1761  3000   92.0%     chr18  -       56681436   56681729   294
YourSeq  81     1468   1566  3000   85.4%     chr14  -       46062188   46062277   90
YourSeq  77     1460   1552  3000   85.6%     chrX   -       12032679   12032762   84
YourSeq  76     1463   1552  3000   89.1%     chr5   -       14202775   14202859   85
YourSeq  73     1484   1569  3000   94.1%     chrX   -       157127454  157127539  86
YourSeq  73     1461   1545  3000   85.8%     chr8   +       29541300   29541376   77
YourSeq  69     1460   1540  3000   85.0%     chrX   +       83226657   83226729   73
YourSeq  69     1460   1559  3000   85.9%     chr10  +       44131593   44131685   93
YourSeq  68     1460   1552  3000   84.9%     chr14  -       78472826   78472911   86
YourSeq  68     1491   1566  3000   92.0%     chr6   +       106405777  106405851  75
YourSeq  64     1490   1566  3000   88.6%     chr3   -       45279089   45279161   73
YourSeq  58     1575   1712  3000   95.4%     chr16  +       90516251   90516624   374
YourSeq  30     1736   1769  3000   94.2%     chr4   +       12387345   12387378   34

Note: The 3000 bp section upstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
YourSeq  3000   1      3000  3000   100.0%    chr13  -       24161551   24164550   3000
YourSeq  78     2241   2539  3000   77.9%     chr7   -       119675286  119675503  218
YourSeq  78     2432   2656  3000   68.1%     chr5   +       43588544   43588701   158
YourSeq  74     2439   2704  3000   70.0%     chr18  +       20467584   20467740   157
YourSeq  72     2447   2647  3000   71.0%     chr8   -       33989546   33989703   158
YourSeq  68     2432   2654  3000   70.3%     chr16  -       33157709   33157910   202
YourSeq  62     2423   2534  3000   77.7%     chr14  +       54366132   54366243   112
YourSeq  62     2432   2653  3000   78.1%     chr11  +       35555284   35555482   199
YourSeq  62     2423   2534  3000   77.7%     chr10  +       92972141   92972252   112
YourSeq  61     2455   2731  3000   67.4%     chr6   -       52627455   52627653   199
YourSeq  59     2438   2534  3000   80.5%     chr5   -       110586870  110586966  97
YourSeq  59     2452   2546  3000   81.1%     chr5   +       90489651   90489745   95
YourSeq  58     2427   2677  3000   73.3%     chr11  +       97461861   97462289   429
YourSeq  57     2453   2589  3000   77.7%     chr12  -       107944993  107945417  425
YourSeq  57     2432   2534  3000   77.7%     chr11  +       57699833   57699935   103
YourSeq  56     2432   2534  3000   78.0%     chr2   -       49504824   49504927   104
YourSeq  56     2432   2513  3000   84.2%     chr14  +       46866561   46866642   82
YourSeq  55     2432   2723  3000   67.7%     chr10  +       40337866   40338018   153
YourSeq  54     2438   2535  3000   87.4%     chr11  +       102803177  103191327  388151
YourSeq  53     2436   2530  3000   77.9%     chr9   +       123646616  123646710  95

Note: The 3000 bp section downstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.
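One way to read the BLAT tables is to separate the self hit (score equal to the query size) from secondary hits and ask whether any secondary hit covers a meaningful share of the query. The sketch below parses rows in the column order shown above and applies a cutoff of 10% of the query size. That cutoff, the class and the function names are assumptions for illustration, not the criterion used in this report.

```python
from dataclasses import dataclass


@dataclass
class BlatHit:
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str


def parse_hit(line: str) -> BlatHit:
    """Parse one row: QUERY SCORE START END QSIZE IDENTITY CHROM STRAND START END SPAN."""
    f = line.split()
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7])


def secondary_hits(hits, min_fraction: float = 0.10):
    """Return non-self hits whose score is at least min_fraction of the query size."""
    return [h for h in hits
            if h.score < h.q_size and h.score >= min_fraction * h.q_size]


# First three rows of the upstream search results.
rows = [
    "YourSeq 3000 1 3000 3000 100.0% chr13 - 24165173 24168172 3000",
    "YourSeq 100 1488 1611 3000 91.4% chr18 - 60460851 60460975 125",
    "YourSeq 88 1460 1569 3000 85.3% chr13 + 81049513 81049607 95",
]
hits = [parse_hit(r) for r in rows]
print(secondary_hits(hits))
```

With this cutoff no secondary hit is reported for the rows shown, since the best non-self score (100) is far below the 3000 bp query size.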


Gene and information:

Carmil1 capping protein regulator and myosin 1 linker 1 [ Mus musculus (house mouse) ]
Gene ID: 68732, updated on 26-Jun-2020

Gene summary

Official Symbol: Carmil1 (provided by MGI)
Official Full Name: capping protein regulator and myosin 1 linker 1 (provided by MGI)
Primary source: MGI:MGI:1915982
See related: Ensembl:ENSMUSG00000021338
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: CARMIL; CARML1; Lrrc16; Lrrc16a; AI425970; D130057M20; 1110037D04Rik
Expression: Ubiquitous expression in testis adult (RPKM 3.6), CNS E18 (RPKM 3.4) and 28 other tissues

Genomic context

Location: 13; 13 A3.1

Exon count: 42

Annotation release  Status             Assembly                      Chr  Location
108.20200622        current            GRCm38.p6 (GCF_000001635.26)  13   NC_000079.6 (24012481..24280802, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    13   NC_000079.5 (24104353..24372659, complement)

Chromosome 13 - NC_000079.6


Transcript information: This gene has 12 transcripts

Gene: Carmil1 ENSMUSG00000021338

Description: capping protein regulator and myosin 1 linker 1 [Source:MGI Symbol;Acc:MGI:1915982]
Gene Synonyms: 1110037D04Rik, Carmil, Lrrc16, Lrrc16a
Location: Chromosome 13: 24,012,344-24,280,795 reverse strand. GRCm38:CM001006.2
About this gene: This gene has 12 transcripts (splice variants), 343 orthologues, 4 paralogues, is a member of 1 Ensembl protein family and is associated with 1 phenotype.

Transcripts

Name         Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt  Flags
Carmil1-201  ENSMUST00000072889.11  4988  1374aa      ENSMUSP00000072662.5  Protein coding           CCDS36621  Q6EDY6   TSL:1, GENCODE basic, APPRIS P3
Carmil1-202  ENSMUST00000110398.7   4520  1332aa      ENSMUSP00000106028.2  Protein coding           CCDS79171  D3Z030   TSL:1, GENCODE basic, APPRIS ALT2
Carmil1-212  ENSMUST00000151566.1   1185  173aa       ENSMUSP00000120971.1  Protein coding           -          F6XQY1   CDS 5' incomplete, TSL:1
Carmil1-208  ENSMUST00000140042.7   580   57aa        ENSMUSP00000127121.1  Protein coding           -          E9Q4H6   CDS 3' incomplete, TSL:3
Carmil1-203  ENSMUST00000123076.7   428   20aa        ENSMUSP00000130100.1  Protein coding           -          E9Q7X4   CDS 3' incomplete, TSL:3
Carmil1-205  ENSMUST00000125901.7   4071  932aa       ENSMUSP00000126522.1  Nonsense mediated decay  -          F7AI27   CDS 5' incomplete, TSL:1
Carmil1-211  ENSMUST00000147261.1   645   No protein  -                     Processed transcript     -          -        TSL:5
Carmil1-210  ENSMUST00000144159.1   394   No protein  -                     Processed transcript     -          -        TSL:5
Carmil1-207  ENSMUST00000136517.1   311   No protein  -                     Processed transcript     -          -        TSL:3
Carmil1-206  ENSMUST00000128416.1   215   No protein  -                     Processed transcript     -          -        TSL:5
Carmil1-204  ENSMUST00000125420.7   2501  No protein  -                     Retained intron          -          -        TSL:1
Carmil1-209  ENSMUST00000142171.1   2350  No protein  -                     Retained intron          -          -        TSL:1


[Figure: Ensembl genome browser view of the Carmil1 locus, 288.45 kb, spanning approximately 24.05 Mb to 24.25 Mb. Forward strand: Gm11343-201 (processed pseudogene). Contigs: AL606464.11, AL683873.14, AL590864.12. Reverse strand genes (Comprehensive set): Carmil1-201 (protein coding), Carmil1-202 (protein coding), Carmil1-203 (protein coding), Carmil1-204 (retained intron), Carmil1-205 (nonsense mediated decay), Carmil1-206 (processed transcript), Carmil1-207 (processed transcript), Carmil1-208 (protein coding), Carmil1-209 (retained intron), Carmil1-210 (processed transcript), Carmil1-211 (processed transcript), Carmil1-212 (protein coding). Additional track: Regulatory Build. Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (processed transcript, pseudogene).]


Transcript: ENSMUST00000072889

[Figure: protein feature view of Carmil1-201 (protein coding; reverse strand, 268.14 kb) for ENSMUSP00000072... Tracks: MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils); Superfamily (SSF52047); SMART (SM00368); Pfam; PANTHER (PTHR24112); Gene3D; sequence variants (dbSNP and all other sources). Domain labels shown: CARMIL, pleckstrin homology domain; CARMIL, C-terminal domain; Leucine-rich repeat; Leucine-rich repeat-containing protein 16A; Leucine-rich repeat domain superfamily; PH-like domain superfamily. Variant legend: missense variant, splice region variant, synonymous variant. Scale bar: 0 to 1374.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
