https://www.alphaknockout.com

Mouse Lhx2 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create an Lhx2 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Lhx2 gene (NCBI Reference Sequence: NM_010710; Ensembl: ENSMUSG00000000247) is located on mouse chromosome 2. Five exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 5 (Transcript: ENSMUST00000000253). Exon 3 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Lhx2 gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP24-318M1 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a knock-out allele exhibit lethality during fetal development and the perinatal period, with abnormal liver, telencephalon, olfactory bulb, basal ganglion, and eye morphology.

Exon 3 starts at about 26.6% of the coding region. Knockout of exon 3 will result in a frameshift of the gene. The size of intron 2 for 5'-loxP site insertion is 935 bp, and the size of intron 3 for 3'-loxP site insertion is 5045 bp. The size of the effective cKO region is ~904 bp. The cKO region does not contain any other known gene.
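The position and frameshift statements above come down to simple arithmetic. The following is a minimal sketch, assuming the 406 aa protein length listed for ENSMUST00000000253 in the transcript table of this report and a coding length that excludes the stop codon. The exon lengths passed to `causes_frameshift` are hypothetical placeholders, since the report does not state the size of exon 3.

```python
# Protein length of Lhx2-201 as listed in the transcript table of this report.
PROTEIN_LENGTH_AA = 406
# Assumption: coding length counted without the stop codon.
CDS_LENGTH_NT = PROTEIN_LENGTH_AA * 3
# "Exon 3 starts at about 26.6% of the coding region."
EXON3_START_FRACTION = 0.266


def approx_codon_at_fraction(fraction, cds_length_nt=CDS_LENGTH_NT):
    """Return the approximate 1-based codon index at a fractional CDS position."""
    nt_position = round(fraction * cds_length_nt)
    # Ceiling division maps a 1-based nucleotide position to its codon.
    return max(1, -(-nt_position // 3))


def causes_frameshift(deleted_coding_nt):
    """A deletion shifts the reading frame when its length is not a multiple of 3."""
    return deleted_coding_nt % 3 != 0


codon = approx_codon_at_fraction(EXON3_START_FRACTION)
print(f"Exon 3 begins near codon {codon} of {PROTEIN_LENGTH_AA}")

# Hypothetical exon lengths, for illustration only.
print(causes_frameshift(202))  # not divisible by 3
print(causes_frameshift(201))  # divisible by 3
```

Under these assumptions exon 3 begins roughly a quarter of the way into the protein, so a frameshift introduced there would disrupt most of the downstream coding sequence.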


Overview of the Targeting Strategy

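[Figure: schematic of the targeting strategy with four rows: the wildtype allele (5' and 3' gRNA regions marked; exon labels recovered from the figure: 1, 2, 3, 5), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Lhx2, homology arm, cKO region, loxP site.]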
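The step from the targeted allele to the constitutive KO allele can be illustrated at the string level. This is a minimal sketch, not the vector design itself: it assumes two loxP sites in the same orientation and uses the canonical 34 bp loxP sequence; the flanking and floxed sequences are short hypothetical placeholders, not Lhx2 sequence.

```python
# Canonical loxP site: 13 bp inverted repeat, 8 bp spacer, 13 bp inverted repeat.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def cre_excise(allele, site=LOXP):
    """Delete the segment between the first two same-orientation sites.

    One copy of the site remains, as in the constitutive KO allele.
    The allele is returned unchanged if fewer than two sites are found.
    """
    first = allele.find(site)
    if first == -1:
        return allele
    second = allele.find(site, first + len(site))
    if second == -1:
        return allele
    # Keep everything up to the first site, then resume at the second site.
    return allele[:first] + allele[second:]


# Hypothetical placeholder sequences for illustration only.
five_prime_arm = "AAAACCCC"
floxed_region = "GGGGTTTTGGGG"  # stands in for the cKO region (exon 3)
three_prime_arm = "CCCCAAAA"

targeted = five_prime_arm + LOXP + floxed_region + LOXP + three_prime_arm
knockout = cre_excise(targeted)
print(len(targeted) - len(knockout))  # bases removed: floxed region plus one loxP
```

The removed length equals the floxed segment plus one loxP site, which is why the post-recombination allele retains a single loxP scar.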


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence aligned against itself (axis label: Sequence 12), showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
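The self-alignment behind a dot plot can be approximated by looking for identical words of the window length at two different positions. The sketch below is an assumption about the general technique, not the tool used for this report; it reports forward-strand matches only and uses short hypothetical test sequences.

```python
def self_dot_plot(seq, window=10):
    """Return sorted (i, j) pairs, i < j, where the words of length
    `window` starting at i and j are identical (forward strand only)."""
    positions = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)

    dots = []
    for starts in positions.values():
        # Every pair of occurrences of the same word is one off-diagonal dot.
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                dots.append((starts[a], starts[b]))
    return sorted(dots)


# Hypothetical sequences for illustration only.
repeat_unit = "ACGTACGTAC"
with_repeat = repeat_unit + "TTGCA" + repeat_unit
without_repeat = "ACGTTGCAAGGCTTAACCGG"

print(self_dot_plot(with_repeat, window=10))
print(self_dot_plot(without_repeat, window=10))
```

A run of dots parallel to the main diagonal would indicate a repeat; an empty result corresponds to the "no significant tandem repeat" outcome noted above.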

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence (axis label: Sequence 12).]

Summary: Full Length(7404bp) | A(19.19% 1421) | C(27.9% 2066) | T(23.01% 1704) | G(29.89% 2213)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found, so it may be difficult to construct this targeting vector.
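The overall GC content follows directly from the base counts in the summary line, and the distribution plot corresponds to a sliding-window calculation. A minimal sketch, assuming the counts above and a simple fixed-size window; the `windowed_gc` helper and its `step` parameter are illustrative names, not part of the report's tooling.

```python
# Base counts from the summary line (full length 7404 bp).
BASE_COUNTS = {"A": 1421, "C": 2066, "T": 1704, "G": 2213}


def gc_fraction_from_counts(counts):
    """Overall GC fraction computed from per-base counts."""
    total = sum(counts.values())
    return (counts["G"] + counts["C"]) / total


def windowed_gc(seq, window=300, step=1):
    """Return (start, GC fraction) for each full window along the sequence."""
    seq = seq.upper()
    return [
        (start, sum(base in "GC" for base in seq[start:start + window]) / window)
        for start in range(0, len(seq) - window + 1, step)
    ]


overall = gc_fraction_from_counts(BASE_COUNTS)
print(f"Total length: {sum(BASE_COUNTS.values())} bp, GC: {overall:.1%}")

# Toy sequence with a small window, for illustration only.
print(windowed_gc("GGCCAATT", window=4, step=2))
```

The counts give an overall GC content of about 57.8%, consistent with the note that high-GC regions are present.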


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   +       38351200   38354199   3000
YourSeq  34     2821   2869  3000   92.5%     chr17  +       37181229   37181281   53
YourSeq  31     2817   2851  3000   94.3%     chr4   +       27313408   27313442   35
YourSeq  29     2817   2851  3000   94.0%     chr8   +       109342281  109342321  41
YourSeq  28     140    199   3000   96.7%     chr1   -       24586579   24586688   110
YourSeq  26     2819   2865  3000   68.5%     chr5   -       145060341  145060381  41
YourSeq  26     2831   2863  3000   74.1%     chr16  +       88479081   88479107   27
YourSeq  25     461    485   3000   100.0%    chr7   +       44336060   44336084   25
YourSeq  24     2824   2847  3000   100.0%    chr9   +       53271371   53271394   24
YourSeq  21     322    342   3000   100.0%    chr4   +       152129756  152129776  21
YourSeq  21     2686   2706  3000   100.0%    chr3   +       88804336   88804356   21
YourSeq  21     2831   2851  3000   100.0%    chr1   +       144820216  144820236  21
YourSeq  20     2913   2934  3000   95.5%     chr1   +       37665876   37665897   22

Note: The 3000 bp section upstream of Exon 3 was searched against the genome with BLAT. No significant similarity was found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   +       38355104   38358103   3000
YourSeq  61     2085   2145  3000   100.0%    chr16  -       86483546   86483606   61
YourSeq  57     2085   2145  3000   98.4%     chr6   +       139721588  139721704  117
YourSeq  25     500    528   3000   85.8%     chr1   -       5383069    5383096    28
YourSeq  24     2749   2780  3000   87.5%     chr14  +       88579425   88579456   32
YourSeq  23     500    522   3000   100.0%    chr4   -       83316310   83316332   23
YourSeq  21     503    523   3000   100.0%    chr15  -       88092646   88092666   21
YourSeq  21     505    525   3000   100.0%    chr10  -       113602491  113602511  21
YourSeq  21     978    998   3000   100.0%    chr13  +       113875541  113875561  21

Note: The 3000 bp section downstream of Exon 3 was searched against the genome with BLAT. No significant similarity was found.
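The chr2 coordinates of the two full-length BLAT hits can be cross-checked against the stated cKO region size. A minimal sketch, assuming the coordinates are 1-based and inclusive as displayed; the helper names are illustrative.

```python
# Full-length (score 3000) chr2 hits from the upstream and downstream BLAT searches.
UPSTREAM_HIT = (38351200, 38354199)
DOWNSTREAM_HIT = (38355104, 38358103)


def inclusive_length(start, end):
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


def gap_between(upstream, downstream):
    """Number of bases lying strictly between two non-overlapping intervals."""
    return downstream[0] - upstream[1] - 1


print(inclusive_length(*UPSTREAM_HIT))            # upstream query length
print(inclusive_length(*DOWNSTREAM_HIT))          # downstream query length
print(gap_between(UPSTREAM_HIT, DOWNSTREAM_HIT))  # bases between the two flanks
```

Under this assumption the interval between the two flanks is 904 bp, which matches the effective cKO region size of ~904 bp given in the strategy summary.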


Gene and information:

Lhx2 LIM homeobox protein 2 [Mus musculus (house mouse)]
Gene ID: 16870, updated on 24-Sep-2019

Gene summary

Official Symbol: Lhx2 (provided by MGI)
Official Full Name: LIM homeobox protein 2 (provided by MGI)
Primary source: MGI:MGI:96785
See related: Ensembl:ENSMUSG00000000247
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: ap; LH2A; Lh-2; Lim2; apterous
Expression: Biased expression in CNS E14 (RPKM 42.9), whole brain E14.5 (RPKM 35.2) and 6 other tissues
Orthologs: human; all

Genomic context

Location: 2; 2 B

Exon count: 8

Annotation release  Status             Assembly                       Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)   2    NC_000068.7 (38339271..38369737)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)     2    NC_000068.6 (38206828..38225248)

[Figure: genomic context of Lhx2 on Chromosome 2 - NC_000068.7]


Transcript information: This gene has 7 transcripts

Gene: Lhx2 ENSMUSG00000000247

Description: LIM homeobox protein 2 [Source:MGI Symbol;Acc:MGI:96785]
Gene Synonyms: LH2A, Lh-2, ap, apterous
Location: Chromosome 2: 38,339,281-38,369,733 forward strand. GRCm38:CM000995.2
About this gene: This gene has 7 transcripts (splice variants), 206 orthologues, 21 paralogues, is a member of 1 Ensembl protein family and is associated with 26 phenotypes.

Transcripts

Name      Transcript ID         bp    Protein     Translation ID        Biotype                  CCDS       UniProt         Flags
Lhx2-201  ENSMUST00000000253.5  2840  406aa       ENSMUSP00000000253.5  Protein coding           CCDS16008  Q543C6, Q9Z0S2  TSL:1, GENCODE basic, APPRIS P1
Lhx2-203  ENSMUST00000143783.8  1696  365aa       ENSMUSP00000114797.2  Protein coding           CCDS71048  F6Z9H5          TSL:1, GENCODE basic
Lhx2-202  ENSMUST00000133661.7  654   192aa       ENSMUSP00000115179.1  Protein coding           -          F6SM03          CDS 3' incomplete, TSL:2
Lhx2-205  ENSMUST00000155964.2  375   108aa       ENSMUSP00000121462.3  Protein coding           -          A0A0A0MQK9      CDS 3' incomplete, TSL:2
Lhx2-207  ENSMUST00000176229.1  592   49aa        ENSMUSP00000135402.1  Nonsense mediated decay  -          H3BKI3          TSL:3
Lhx2-206  ENSMUST00000175896.1  1243  No protein  -                     Retained intron          -          -               TSL:2
Lhx2-204  ENSMUST00000149664.1  701   No protein  -                     Retained intron          -          -               TSL:2


[Figure: Ensembl genome browser view of a 50.45 kb region of chromosome 2 (38.33-38.37 Mb), forward strand. Transcripts shown: Lhx2-201, Lhx2-202, Lhx2-203 and Lhx2-205 (protein coding); Lhx2-204 and Lhx2-206 (retained intron); Lhx2-207 (nonsense mediated decay). Other tracks: contigs AL929186.7 and BX813320.2; lncRNA genes Gm27197-201 and Gm13584-201; Regulatory Build. Regulation legend: CTCF, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana) and non-protein coding (RNA gene; processed transcript).]


Transcript: ENSMUST00000000253

[Figure: protein domain view of Lhx2-201 (protein coding; 18.98 kb, forward strand; protein ENSMUSP00000000...). Annotation tracks: MobiDB lite; low complexity (Seg); Superfamily (SSF57716; Homeobox-like domain superfamily); SMART (Zinc finger, LIM-type; Homeobox domain); Pfam (Zinc finger, LIM-type; Homeobox domain); PROSITE profiles (Zinc finger, LIM-type; Homeobox domain); PROSITE patterns (Zinc finger, LIM-type; Homeobox, conserved site); PANTHER (PTHR24208; PTHR24208:SF80); Gene3D (2.10.110.10; 1.10.10.60); CDD (cd09469; cd09377; Homeobox domain); sequence variants (dbSNP and all other sources; legend: missense variant, synonymous variant). Scale bar: 0 to 406 aa.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
