https://www.alphaknockout.com

Mouse Lhx4 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create an Lhx4 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Lhx4 gene (NCBI Reference Sequence: NM_010712; Ensembl: ENSMUSG00000026468) is located on mouse chromosome 1. Six exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 6 (Transcript: ENSMUST00000027740). Exon 2 is selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Lhx4 gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-440C7 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mutations in this gene result in abnormal lung development and neonatal lethality.

Exon 2 starts at about 6.58% of the coding region. Knockout of exon 2 will result in a frameshift of the gene. Intron 1, which receives the 5'-loxP site, is 16840 bp; intron 2, which receives the 3'-loxP site, is 14474 bp. The effective cKO region is ~672 bp and does not contain any other known gene.
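The position and frameshift statements can be cross-checked with simple arithmetic. The sketch below is illustrative only: it assumes a coding length of 1170 nt (390 codons, from the 390 aa protein listed for ENSMUST00000027740, stop codon excluded), and the helper names coding_offset and causes_frameshift are hypothetical. The report does not state the length of exon 2, so no exon length is filled in here.

```python
def coding_offset(fraction: float, cds_length: int) -> int:
    """Approximate number of coding nucleotides upstream of a fractional position."""
    return round(fraction * cds_length)


def causes_frameshift(deleted_coding_nt: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_nt % 3 != 0


# Assumption: 390 aa protein -> 390 codons -> 1170 coding nucleotides (stop codon excluded).
CDS_LENGTH = 390 * 3

# Exon 2 is reported to start at about 6.58% of the coding region.
upstream_nt = coding_offset(0.0658, CDS_LENGTH)
print(f"Coding nucleotides upstream of exon 2: about {upstream_nt}")
```

Under these assumptions roughly 77 coding nucleotides lie upstream of exon 2. Once the coding length of exon 2 is known, passing it to causes_frameshift checks the frameshift claim directly.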


Overview of the Targeting Strategy

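[Figure: schematic of the targeting strategy showing, from top to bottom, the wildtype allele (exons 1, 2 and 6 labeled, with the 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Lhx4, homology arm, cKO region, loxP site.]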
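The step from the targeted allele to the constitutive KO allele can be illustrated with a toy string model. This is a minimal sketch, not the project's actual sequences: it assumes the canonical 34 bp loxP site, two sites in the same orientation, and made-up flanking and floxed sequences. The function name cre_excise is hypothetical.

```python
# Canonical 34 bp loxP site (assumed; the report does not list the site sequence).
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(allele: str, loxp: str = LOXP) -> str:
    """Model Cre recombination between two same-orientation loxP sites.

    The sequence between the first and last site is removed and a single
    loxP site remains. Alleles with fewer than two sites are returned unchanged.
    """
    first = allele.find(loxp)
    last = allele.rfind(loxp)
    if first == -1 or first == last:
        return allele
    return allele[:first] + allele[last:]


# Toy sequences, purely illustrative.
five_prime_flank = "ACGTACGT"
floxed_region = "TTGACCTTGACC"   # stands in for the cKO region containing exon 2
three_prime_flank = "GGCATGGC"

targeted_allele = five_prime_flank + LOXP + floxed_region + LOXP + three_prime_flank
ko_allele = cre_excise(targeted_allele)
print(len(targeted_allele), "->", len(ko_allele))
```

In this model the KO allele is shorter than the targeted allele by the floxed region plus one loxP site, which is the size difference a PCR genotyping assay across the region would detect.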


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 aligned against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. Tandem repeats are found in the dot plot matrix, so it may be difficult to construct this targeting vector.
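A self-alignment of this kind can be sketched with exact word matching. The code below is an assumption about the general technique, not the tool used for this report: it records every pair of positions whose 10 bp words are identical and treats off-diagonal forward matches as repeat candidates. The function names and the toy sequence are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def dot_plot_matches(seq_x: str, seq_y: str, window: int = 10):
    """Return (i, j) pairs where seq_x[i:i+window] equals seq_y[j:j+window]."""
    index = {}
    for j in range(len(seq_y) - window + 1):
        index.setdefault(seq_y[j:j + window], []).append(j)
    matches = []
    for i in range(len(seq_x) - window + 1):
        for j in index.get(seq_x[i:i + window], []):
            matches.append((i, j))
    return matches


def self_repeats(seq: str, window: int = 10):
    """Forward matches of a sequence against itself, excluding the main diagonal."""
    return [(i, j) for i, j in dot_plot_matches(seq, seq, window) if i < j]


# Toy sequence: a 10 bp unit repeated twice in tandem.
toy = "ACGTTGCAAT" * 2
print(self_repeats(toy))                            # forward repeats
print(dot_plot_matches(toy, reverse_complement(toy)))  # reverse-complement matches
```

Runs of matches parallel to the main diagonal indicate direct repeats; matches against the reverse complement indicate inverted repeats.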

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(7172bp) | A(22.55% 1617) | C(24.46% 1754) | T(29.0% 2080) | G(24.0% 1721)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
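The summary counts can be checked, and a windowed GC profile computed, with a few lines of code. This is a minimal sketch: the base counts are taken from the summary line above, while the function names and the short example sequence are hypothetical.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq: str, window: int = 300, step: int = 1):
    """GC fraction of each window of the given size, moving by `step` bases."""
    return [gc_fraction(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]


# Base counts reported in the summary line.
counts = {"A": 1617, "C": 1754, "T": 2080, "G": 1721}
total = sum(counts.values())
overall_gc = (counts["G"] + counts["C"]) / total
print(f"Total length: {total} bp, overall GC: {overall_gc:.2%}")

# Toy example with a 4 bp window and a 2 bp step.
print(windowed_gc("GGCCAATT", window=4, step=2))
```

The four counts sum to the stated full length of 7172 bp, and the overall GC content works out to about 48.5%.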


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr1   -       155725212  155728211  3000
YourSeq  217    913    1482  3000   88.7%     chr11  +       53389834   53390326   493
YourSeq  216    830    1124  3000   93.2%     chr10  -       81339282   81339602   321
YourSeq  214    913    1466  3000   90.1%     chr6   +       37883069   37883472   404
YourSeq  213    913    1389  3000   90.6%     chr4   +       150279268  150279518  251
YourSeq  212    691    1142  3000   90.2%     chr19  +       23230314   23230555   242
YourSeq  211    914    1125  3000   100.0%    chr1   +       39531767   39531982   216
YourSeq  210    946    1482  3000   96.1%     chr17  -       42961280   42961860   581
YourSeq  208    824    1140  3000   97.3%     chr8   +       121595224  121595792  569
YourSeq  206    916    1124  3000   99.6%     chr9   -       45830515   45830732   218
YourSeq  205    831    1123  3000   98.6%     chr13  -       62758296   62758603   308
YourSeq  205    913    1135  3000   96.9%     chrX   +       8238384    8238766    383
YourSeq  203    914    1390  3000   90.7%     chr8   +       119919746  119920053  308
YourSeq  203    913    1123  3000   98.6%     chr11  +       74941811   74942029   219
YourSeq  202    913    1123  3000   96.2%     chr15  -       82751079   82751285   207
YourSeq  202    911    1125  3000   98.2%     chr1   -       58694111   58694328   218
YourSeq  202    913    1123  3000   98.1%     chr8   +       106832740  106832950  211
YourSeq  202    913    1123  3000   96.2%     chr7   +       24527558   24527764   207
YourSeq  201    913    1124  3000   97.7%     chr4   +       117200575  117200791  217
YourSeq  199    913    1123  3000   97.7%     chr11  +       64005431   64006039   609

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr1   -       155721540  155724539  3000
YourSeq  30     1531   1563  3000   97.0%     chr16  -       91531224   91531257   34
YourSeq  30     1524   1556  3000   97.0%     chr16  -       91531224   91531257   34
YourSeq  28     898    926   3000   100.0%    chr13  -       12314928   12315561   634
YourSeq  28     184    215   3000   93.8%     chr7   +       16797071   16797102   32
YourSeq  28     2406   2449  3000   76.5%     chr6   +       17097076   17097114   39
YourSeq  26     1521   1570  3000   64.3%     chr16  -       91531224   91531252   29
YourSeq  25     2401   2427  3000   88.5%     chr6   -       34592402   34592427   26
YourSeq  22     2407   2430  3000   87.0%     chr2   -       51236699   51236721   23
YourSeq  22     2411   2432  3000   100.0%    chr11  -       23638509   23638530   22
YourSeq  20     2412   2431  3000   100.0%    chr8   -       109029800  109029819  20

Note: The 3000 bp section downstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
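BLAT result rows like those above are easy to screen programmatically. The sketch below parses whitespace-separated rows in the column order QUERY SCORE START END QSIZE IDENTITY CHROM STRAND START END SPAN and lists hits other than the full-length self match. The names BlatHit, parse_blat_row and off_target_hits are hypothetical, and the min_score threshold is an illustrative assumption; the report does not state the criterion behind its "no significant similarity" conclusion.

```python
from typing import List, NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float   # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(row: str) -> BlatHit:
    """Parse one row; the first field (query name) is skipped."""
    f = row.split()
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def off_target_hits(hits: List[BlatHit], min_score: int = 100) -> List[BlatHit]:
    """Hits scoring at least min_score, excluding the full-length self match."""
    return [h for h in hits if min_score <= h.score < h.q_size]


# Two rows copied from the downstream search results.
rows = [
    "YourSeq 3000 1 3000 3000 100.0% chr1 - 155721540 155724539 3000",
    "YourSeq 30 1531 1563 3000 97.0% chr16 - 91531224 91531257 34",
]
hits = [parse_blat_row(r) for r in rows]
print(off_target_hits(hits, min_score=100))
print(off_target_hits(hits, min_score=20))
```

Lowering min_score makes the screen stricter about which secondary hits are reported; the appropriate cutoff depends on the downstream use, such as primer or gRNA design.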


Gene information: Lhx4 LIM homeobox protein 4 [Mus musculus (house mouse)] Gene ID: 16872, updated on 11-Sep-2019

Gene summary

Official Symbol: Lhx4 (provided by MGI)
Official Full Name: LIM homeobox protein 4 (provided by MGI)
Primary source: MGI:MGI:101776
See related: Ensembl:ENSMUSG00000026468
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Gsh4; Gsh-4; A330062J17Rik
Expression: Low expression observed in reference dataset
Orthologs: human, all

Genomic context

Location: 1 G3; 1 67.47 cM

Exon count: 7

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  1    NC_000067.6 (155698031..155751726, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    1    NC_000067.5 (157548822..157589157, complement)

[Figure: genomic context of Lhx4 on Chromosome 1 - NC_000067.6.]


Transcript information: This gene has 2 transcripts

Gene: Lhx4 ENSMUSG00000026468

Description: LIM homeobox protein 4 [Source:MGI Symbol;Acc:MGI:101776]
Gene Synonyms: A330062J17Rik, Gsh-4, Gsh4
Location: Chromosome 1: 155,698,031-155,751,684 reverse strand. GRCm38:CM000994.2
About this gene: This gene has 2 transcripts (splice variants), 211 orthologues, 21 paralogues, is a member of 1 Ensembl protein family and is associated with 11 phenotypes.

Transcripts

Name      Transcript ID          bp    Protein  Translation ID        Biotype         CCDS       UniProt     Flags
Lhx4-201  ENSMUST00000027740.13  5514  390aa    ENSMUSP00000027740.7  Protein coding  CCDS48399  P53776      TSL:1, GENCODE basic, APPRIS P1
Lhx4-202  ENSMUST00000195275.1   928   223aa    ENSMUSP00000141662.1  Protein coding  -          A0A0A6YWR5  CDS 3' incomplete, TSL:5

[Figure: Ensembl genome browser view of a 73.65 kb region (155.70 Mb to 155.76 Mb). Forward strand: Acbd6-201 (protein coding) and Acbd6-205 (nonsense mediated decay). Contigs: AC117598.7 and AC121314.7. Reverse strand: Lhx4-201 and Lhx4-202 (both protein coding). Tracks: Genes (Comprehensive set), Regulatory Build. Regulation legend: CTCF, open chromatin, promoter flank, transcription factor binding site. Gene legend: protein coding (Ensembl protein coding, merged Ensembl/Havana); non-protein coding (processed transcript).]


Transcript: ENSMUST00000027740

[Figure: protein feature view of Lhx4-201 (protein coding; reverse strand, 44.00 kb) for ENSMUSP00000027..., with a scale bar from 0 to 390. Annotation tracks: PDB-ENSP mappings; MobiDB lite; low complexity (Seg); Superfamily SSF57716 (Homeobox-like domain superfamily); SMART, Pfam and PROSITE profiles (Zinc finger, LIM-type; Homeobox domain); PROSITE patterns (Zinc finger, LIM-type; Homeobox, conserved site); PANTHER PTHR24208 and PTHR24208:SF116; Gene3D 2.10.110.10 and 1.10.10.60; CDD cd09468, cd09376 and Homeobox domain; sequence variants (dbSNP and all other sources). Variant legend: synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
