https://www.alphaknockout.com

Mouse Lima1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Lima1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas9-mediated genome engineering.

Strategy summary: The Lima1 gene (NCBI Reference Sequence: NM_001113545; Ensembl: ENSMUSG00000023022) is located on mouse chromosome 15. Eleven exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 11 (Transcript: ENSMUST00000073691). Exon 5 is selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Lima1 gene. To engineer the targeting vector, the homology arms and the cKO region will be generated by PCR using BAC clone RP23-268E5 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a knock-out allele exhibit decreased intestinal cholesterol absorption.

Exon 5 starts about 27.93% of the way into the coding region. Knockout of exon 5 will result in a frameshift of the gene. Intron 4, which receives the 5'-loxP site, is 11750 bp long; intron 5, which receives the 3'-loxP site, is 1087 bp long. The effective cKO region is ~585 bp. The cKO region does not contain any other known gene.
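The position and frameshift statements above reduce to simple arithmetic. The following is a minimal sketch, assuming the coding length is that of the 753 aa protein listed for ENSMUST00000073691 plus one stop codon. The report does not state the length of exon 5, so the exon length used here is a labelled placeholder and the helper names are illustrative only.

```python
def cds_length_nt(protein_aa: int, include_stop: bool = True) -> int:
    """Coding length in nucleotides for a protein of the given size."""
    return (protein_aa + (1 if include_stop else 0)) * 3


def approx_cds_offset(fraction: float, cds_nt: int) -> int:
    """Approximate nucleotide offset that corresponds to a fraction of the CDS."""
    return round(fraction * cds_nt)


def causes_frameshift(deleted_coding_nt: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_nt % 3 != 0


cds_nt = cds_length_nt(753)                 # 753 aa is taken from this report
offset = approx_cds_offset(0.2793, cds_nt)  # 27.93% is taken from this report
HYPOTHETICAL_EXON_NT = 100                  # placeholder, NOT the real exon 5 length
print(cds_nt, offset, causes_frameshift(HYPOTHETICAL_EXON_NT))
```

Under these assumptions exon 5 would begin roughly 632 nt into a 2262 nt coding sequence. The frameshift check only becomes meaningful once the real coding length of exon 5 is substituted for the placeholder.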


Overview of the Targeting Strategy

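[Figure: schematic of the targeting strategy. Panels, top to bottom: Wildtype allele (exons 1, 5, 6 and 11 labelled; gRNA regions marked on the 5' and 3' sides), Targeting vector, Targeted allele, Constitutive KO allele (after Cre recombination). Legend: exon of mouse Lima1, homology arm, cKO region, loxP site.]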
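The step from the targeted allele to the constitutive KO allele can be illustrated on a toy string. This is a minimal sketch, assuming two loxP sites in the same orientation and using the canonical 34 bp loxP sequence. The flanking and exon strings are invented placeholders, not Lima1 sequence, and `cre_excise` is a hypothetical helper name.

```python
# Canonical 34 bp loxP site (13 bp inverted repeats around an 8 bp spacer).
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(allele: str, site: str = LOXP) -> str:
    """Remove the segment between the first two same-orientation loxP sites.

    One loxP site is left behind, as in the constitutive KO allele.
    Alleles with fewer than two sites are returned unchanged.
    """
    first = allele.find(site)
    if first == -1:
        return allele
    second = allele.find(site, first + len(site))
    if second == -1:
        return allele
    return allele[: first + len(site)] + allele[second + len(site):]


# Toy sequences only: placeholders standing in for intron 4, exon 5 and intron 5.
intron4, exon5, intron5 = "AAAACCCC", "GATTACAGATTACA", "GGGGTTTT"
targeted = intron4 + LOXP + exon5 + LOXP + intron5
knockout = cre_excise(targeted)
print(len(targeted) - len(knockout))  # bases removed: the floxed segment plus one loxP site
```

The sketch models only the sequence outcome of recombination. It says nothing about recombination efficiency or about loxP sites in opposite orientation, which would invert rather than delete the segment.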


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence against itself, showing forward and reverse-complement matches. Axis label as extracted: Sequence 12.]

Note: The sequence of the homology arms and cKO region is aligned with itself to check for tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
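The idea behind a self dot plot can be sketched in a few lines. This is a simplified illustration, not the tool used for the report: it scores only exact forward matches of a fixed word size (10 bp, matching the stated window) and ignores reverse-complement matches. The function name and the toy sequences are assumptions.

```python
def self_dot_plot(seq: str, window: int = 10) -> list[tuple[int, int]]:
    """Return (i, j) pairs with i < j whose window-length words are identical.

    Each pair is an off-diagonal dot; runs of dots parallel to the main
    diagonal indicate a repeated segment.
    """
    positions: dict[str, list[int]] = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    hits = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


# Toy inputs: a 12 bp unit repeated in tandem, and a sequence without repeats.
tandem = "ACGTACGGTCAT" * 2
unique = "ACGTTGCAAGCTTAGGCTCA"
print(self_dot_plot(tandem))
print(self_dot_plot(unique))
```

A tandem repeat shows up as dots offset from the diagonal by the repeat period (12 in the toy case), while a repeat-free sequence yields no off-diagonal dots.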

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence. Axis label as extracted: Sequence 12.]

Summary: Full length (7085 bp) | A (26.79%, 1898) | C (21.85%, 1548) | T (27.75%, 1966) | G (23.61%, 1673)

Note: The sequence of the homology arms and cKO region is analyzed to determine its GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
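The summary figures can be rechecked from the base counts, and the windowed scan can be sketched on a toy sequence. This is a minimal sketch: the counts come from the summary line above, while `base_composition`, `windowed_gc` and the toy input are illustrative assumptions rather than the tool used for the report.

```python
def base_composition(counts: dict[str, int]) -> dict[str, float]:
    """Percentage of each base, rounded to two decimals."""
    total = sum(counts.values())
    return {base: round(100 * n / total, 2) for base, n in counts.items()}


def windowed_gc(seq: str, window: int = 300, step: int = 1) -> list[float]:
    """GC fraction of each full window along the sequence."""
    seq = seq.upper()
    return [
        sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


counts = {"A": 1898, "C": 1548, "T": 1966, "G": 1673}  # from the summary line
percent = base_composition(counts)
gc_percent = round(100 * (counts["G"] + counts["C"]) / sum(counts.values()), 2)
print(sum(counts.values()), percent, gc_percent)
print(windowed_gc("GGCCAATT", window=4, step=2))  # toy sequence, not Lima1
```

The counts sum to the stated 7085 bp, and the overall GC content works out to about 45.46%, consistent with the note that no high-GC region stands out.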


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-------------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr15  -       99807994   99810993   3000
YourSeq  344    115     809   3000   91.6%     chr5   +       32837940   32838816   877
YourSeq  325    391     755   3000   94.6%     chr5   -       115103178  115103542  365
YourSeq  311    283     727   3000   89.6%     chr4   +       34406788   34458203   51416
YourSeq  297    414     746   3000   94.9%     chr15  +       5117286    5118677    1392
YourSeq  294    400     755   3000   91.3%     chr5   +       67453484   67453817   334
YourSeq  288    391     758   3000   90.7%     chr9   -       59844264   59844624   361
YourSeq  277    408     809   3000   92.7%     chr19  +       16025447   16026171   725
YourSeq  274    391     746   3000   88.1%     chr18  +       45198795   45199144   350
YourSeq  261    447     755   3000   93.2%     chr4   -       129761578  129761881  304
YourSeq  248    89      746   3000   81.9%     chr4   +       98352367   98352775   409
YourSeq  247    451     767   3000   91.4%     chrX   +       53772875   53773192   318
YourSeq  241    421     752   3000   88.0%     chr6   -       72687966   72688269   304
YourSeq  240    115     716   3000   86.4%     chrX   -       99687470   99688124   655
YourSeq  231    1       381   3000   91.1%     chr11  -       60792185   60971732   179548
YourSeq  216    511     752   3000   93.8%     chr6   -       34810584   34810824   241
YourSeq  213    473     755   3000   88.0%     chrX   +       106612162  106612448  287
YourSeq  213    115     734   3000   83.2%     chrX   +       101058904  101059335  432
YourSeq  213    474     757   3000   87.5%     chr17  +       27522620   27522873   254
YourSeq  207    472     739   3000   91.3%     chr15  +       21049438   21049713   276

Note: The 3000 bp section upstream of exon 5 is BLAT searched against the genome. Apart from the expected 100% match to its own locus on chr15, no significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-------------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr15  -       99804409   99807408   3000
YourSeq  469    370     2835  3000   93.2%     chr5   -       134161268  134671263  509996
YourSeq  449    377     2844  3000   91.2%     chr3   +       30520851   31069174   548324
YourSeq  300    1286    2835  3000   92.6%     chr4   +       123692925  123731613  38689
YourSeq  245    2289    2835  3000   84.4%     chr9   +       110058211  110058713  503
YourSeq  240    2300    2835  3000   89.2%     chr6   +       86533806   86534411   606
YourSeq  221    2343    2835  3000   91.8%     chr11  -       119581984  119582579  596
YourSeq  216    2296    2825  3000   89.5%     chr14  -       32132654   32133261   608
YourSeq  209    2289    2835  3000   85.5%     chr16  -       3996590    3997101    512
YourSeq  203    2518    2835  3000   87.5%     chr19  +       5792042    5792636    595
YourSeq  200    2288    2835  3000   92.4%     chr11  -       96041402   96389424   348023
YourSeq  200    2518    2835  3000   91.4%     chr6   +       87905536   87905922   387
YourSeq  194    2289    2835  3000   82.2%     chr4   -       45109579   45110055   477
YourSeq  187    2302    2835  3000   90.9%     chr7   -       29133094   29133706   613
YourSeq  186    2296    2835  3000   83.9%     chr1   +       131912874  131913240  367
YourSeq  184    2296    2835  3000   90.4%     chr10  +       41433656   41434255   600
YourSeq  182    2572    2835  3000   91.0%     chr7   +       28394051   28394487   437
YourSeq  170    2519    2893  3000   84.6%     chr15  +       44308631   44308844   214
YourSeq  168    2517    2825  3000   88.9%     chr1   -       154803859  154804186  328
YourSeq  166    2607    2854  3000   91.9%     chr10  -       40272521   40272777   257

Note: The 3000 bp section downstream of exon 5 is BLAT searched against the genome. Apart from the expected 100% match to its own locus on chr15, no significant similarity is found.
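The two 100% chr15 hits in the BLAT results locate the upstream and downstream flanks on the genome, so the distance between them can be compared with the stated size of the cKO region. This is a minimal sketch, assuming the reported coordinates are 1-based and inclusive (which agrees with both spans being 3000). Reading the gap as the cKO region is an inference from the numbers, not a statement from the report.

```python
def span(interval: tuple[int, int]) -> int:
    """Length of a 1-based inclusive interval."""
    start, end = interval
    return end - start + 1


def gap_between(a: tuple[int, int], b: tuple[int, int]) -> int:
    """Number of bases strictly between two non-overlapping intervals."""
    lower, upper = sorted([a, b])
    return upper[0] - lower[1] - 1


upstream_hit = (99807994, 99810993)    # chr15, minus strand, upstream query
downstream_hit = (99804409, 99807408)  # chr15, minus strand, downstream query
print(span(upstream_hit), span(downstream_hit), gap_between(upstream_hit, downstream_hit))
```

Both flanks span 3000 bp and are separated by 585 bp, which agrees with the ~585 bp effective cKO region given in the strategy summary.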


Gene and information: Lima1 LIM domain and actin binding 1 [Mus musculus (house mouse)]
Gene ID: 65970, updated on 12-Aug-2019

Gene summary

Official Symbol: Lima1 (provided by MGI)
Official Full Name: LIM domain and actin binding 1 (provided by MGI)
Primary source: MGI:MGI:1920992
See related: Ensembl:ENSMUSG00000023022
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Eplin
Expression: Broad expression in bladder adult (RPKM 44.8), colon adult (RPKM 23.3) and 27 other tissues

Genomic context

Location: 15 F1; 15 56.13 cM

Exon count: 14

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  15   NC_000081.6 (99778468..99875477, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    15   NC_000081.5 (99608899..99705887, complement)

[Figure: genomic context of Lima1 on Chromosome 15 - NC_000081.6.]


Transcript information: This gene has 6 transcripts

Gene: Lima1 ENSMUSG00000023022

Description: LIM domain and actin binding 1 [Source:MGI Symbol;Acc:MGI:1920992]
Gene Synonyms: 1110021C24Rik, 3526402A12Rik, EPLIN, epithelial protein lost in neoplasm
Location: Chromosome 15: 99,778,470-99,875,456 reverse strand. GRCm38:CM001008.2
About this gene: This gene has 6 transcripts (splice variants), 221 orthologues, is a member of 1 Ensembl protein family and is associated with 7 phenotypes.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt         Flags
Lima1-201  ENSMUST00000073691.4  4243  753aa       ENSMUSP00000073371.3  Protein coding   CCDS49729  Q9ERG0          TSL:1, GENCODE basic, APPRIS ALT2
Lima1-202  ENSMUST00000109024.8  4109  593aa       ENSMUSP00000104652.2  Protein coding   CCDS37206  Q8CD09, Q9ERG0  TSL:1, GENCODE basic, APPRIS P3
Lima1-206  ENSMUST00000231121.1  448   109aa       ENSMUSP00000155772.1  Protein coding   -          A0A2R8VI67      CDS 3' incomplete
Lima1-203  ENSMUST00000171450.1  2680  No protein  -                     Retained intron  -          -               TSL:1
Lima1-204  ENSMUST00000172119.1  506   No protein  -                     Retained intron  -          -               TSL:2
Lima1-205  ENSMUST00000230033.1  405   No protein  -                     Retained intron  -          -               -
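The transcript table can be filtered programmatically when choosing a reference transcript. The report does not state how ENSMUST00000073691 was selected, so the rule below (the longest protein among protein-coding transcripts with a complete CDS) is only an illustrative heuristic. The record layout and function name are assumptions.

```python
# Rows transcribed from the transcript table in this report; "aa" is None
# where the table lists no protein.
transcripts = [
    {"name": "Lima1-201", "id": "ENSMUST00000073691.4", "aa": 753,
     "biotype": "Protein coding", "flags": "TSL:1 GENCODE basic APPRIS ALT2"},
    {"name": "Lima1-202", "id": "ENSMUST00000109024.8", "aa": 593,
     "biotype": "Protein coding", "flags": "TSL:1 GENCODE basic APPRIS P3"},
    {"name": "Lima1-206", "id": "ENSMUST00000231121.1", "aa": 109,
     "biotype": "Protein coding", "flags": "CDS 3' incomplete"},
    {"name": "Lima1-203", "id": "ENSMUST00000171450.1", "aa": None,
     "biotype": "Retained intron", "flags": "TSL:1"},
    {"name": "Lima1-204", "id": "ENSMUST00000172119.1", "aa": None,
     "biotype": "Retained intron", "flags": "TSL:2"},
    {"name": "Lima1-205", "id": "ENSMUST00000230033.1", "aa": None,
     "biotype": "Retained intron", "flags": ""},
]


def complete_coding(rows: list[dict]) -> list[dict]:
    """Protein-coding transcripts whose CDS is not flagged as incomplete."""
    return [
        r for r in rows
        if r["biotype"] == "Protein coding" and "incomplete" not in r["flags"]
    ]


candidates = complete_coding(transcripts)
longest = max(candidates, key=lambda r: r["aa"])
print(longest["name"], longest["id"], longest["aa"])
```

With this heuristic the selected transcript is Lima1-201, the same transcript the strategy summary uses, although other criteria such as APPRIS annotation could lead to a different choice.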


[Figure: Ensembl genome browser view of a 116.99 kb region of chromosome 15 (99.78 Mb to 99.88 Mb).
Forward strand: Gm17058-201 (lncRNA), Gm17057-201 (lncRNA), Gm18095-201 (processed pseudogene), Gm18890-201 (processed pseudogene), Gm49494-202 (retained intron), Gm49494-201 (lncRNA).
Contigs: AC134548.5, AC138177.12.
Reverse strand: Cers5-201, Cers5-202, Cers5-203 and Cers5-205 (protein coding), Cers5-204 (nonsense mediated decay), Gm25897-201 (snRNA), Gm4468-201 (processed pseudogene), Lima1-201, Lima1-202 and Lima1-206 (protein coding), Lima1-203, Lima1-204 and Lima1-205 (retained intron).
Tracks: Genes (Comprehensive set), Regulatory Build.
Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site.
Gene legend: Protein coding (Ensembl protein coding, merged Ensembl/Havana); Non-protein coding (processed transcript, RNA gene, pseudogene).]


Transcript: ENSMUST00000073691

[Figure: protein domain view of Lima1-201 (protein coding; reverse strand, 96.99 kb), protein ENSMUSP00000073..., with a scale bar from 0 to 753.
Annotation tracks: MobiDB lite; Low complexity (Seg); Superfamily SSF57716; SMART, Pfam, PROSITE profiles and PROSITE patterns (Zinc finger, LIM-type); PANTHER PTHR24206 (LIM domain and actin-binding protein 1); Gene3D 2.10.110.10; CDD cd09485; sequence variants (dbSNP and all other sources).
Variant legend: missense variant, splice region variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
