https://www.alphaknockout.com

Mouse Lbh Knockout Project (CRISPR/Cas9)

Objective: To create an Lbh knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Lbh gene (NCBI Reference Sequence: NM_029999; Ensembl: ENSMUSG00000024063) is located on mouse chromosome 17. Three exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 3 (Transcript: ENSMUST00000024857). Exon 2 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a knock-out allele exhibit impaired mammary gland development and decreased litter size.

Exon 2 starts at about 8.57% of the coding region and covers 32.7% of it. The size of the effective KO region is ~103 bp. The KO region does not contain any other known gene.
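The percentages above can be checked with a little arithmetic. The sketch below assumes a 315 bp coding sequence (105 codons, taken from the 105 aa length of Lbh-201, stop codon excluded); that length is an inference, not a figure stated in this report, and the function and variable names are illustrative only. It also checks whether the ~103 bp region is a multiple of 3, since a deletion whose length is not divisible by 3 would be expected to shift the downstream reading frame.

```python
# Assumption: coding length inferred from the 105 aa protein (stop codon excluded).
CDS_LENGTH_BP = 105 * 3
# Effective KO region size as stated in the report.
EXON2_CODING_BP = 103


def exon2_summary(cds_len=CDS_LENGTH_BP, exon2_len=EXON2_CODING_BP, start_pct=8.57):
    """Recompute the exon 2 figures from the assumed coding length."""
    # Coding bases that precede exon 2, derived from the reported start position.
    coding_bp_before = round(cds_len * start_pct / 100)
    return {
        "coding_bp_before_exon2": coding_bp_before,
        "exon2_pct_of_cds": round(100 * exon2_len / cds_len, 1),
        # True when deleting the region would shift the reading frame.
        "frameshift_if_deleted": exon2_len % 3 != 0,
    }


if __name__ == "__main__":
    print(exon2_summary())
```

Under these assumptions the recomputed coverage agrees with the 32.7% quoted above, and the region length is not a multiple of 3.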


Overview of the Targeting Strategy

Figure: wildtype allele with exons 1, 2 and 3, the 5' gRNA region and the 3' gRNA region. Legend: exon of mouse Lbh; knockout region.


Overview of the Dot Plot (up)

Window size: 15 bp

Figure: self dot plot of the upstream sequence. Legend: Forward; Reverse Complement.

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
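The self-alignment described in the note can be illustrated with a minimal sketch. It is not the tool used to produce the report's dot plots; it simply records every pair of positions whose fixed-size windows are identical, so that off-diagonal hits indicate repeated sequence. The function name is hypothetical, only forward-strand matches are considered (the report's plots also show the reverse complement), and exact matching is assumed.

```python
from collections import defaultdict


def self_dot_plot(seq, window=15):
    """Return sorted (i, j) pairs with i < j whose window-sized words match exactly."""
    # Index every word of length `window` by its start position.
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)

    # Every pair of start positions sharing a word is one off-diagonal dot.
    hits = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


if __name__ == "__main__":
    # Toy input with a small window so the result is easy to inspect.
    print(self_dot_plot("ACGTAC", window=2))
```

An empty result means no word of the chosen size occurs twice; runs of hits at a constant offset j - i are what a tandem repeat looks like in this representation.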

Overview of the Dot Plot (down)

Window size: 15 bp

Figure: self dot plot of the downstream sequence. Legend: Forward; Reverse Complement.

Note: The 2000 bp section downstream of Exon 2 is aligned with itself to determine if there are tandem repeats. No significant tandem repeats are found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

Figure: GC content distribution along the upstream sequence.

Summary: Full Length(2000bp) | A(23.05% 461) | C(24.45% 489) | T(27.2% 544) | G(25.3% 506)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. Significant high GC-content regions are found. The gRNA site is selected outside of these high GC-content regions.

Overview of the GC Content Distribution (down)

Window size: 300 bp

Figure: GC content distribution along the downstream sequence.

Summary: Full Length(2000bp) | A(24.8% 496) | C(21.2% 424) | T(25.6% 512) | G(28.4% 568)

Note: The 2000 bp section downstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content regions are found, so this region is suitable for PCR screening or sequencing analysis.
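The GC analysis can be sketched as follows. This is a minimal illustration, not the software used for the report: `gc_from_counts` derives the overall GC percentage from the base counts in the two Summary lines, and `sliding_gc` computes a windowed profile of the kind plotted with a 300 bp window. All function names are hypothetical, and the step size of the report's plots is not stated, so `step` is an assumption.

```python
def gc_percent(seq):
    """GC percentage of a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq, window=300, step=1):
    """GC percentage of each full window along the sequence."""
    return [
        gc_percent(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


def gc_from_counts(a, c, t, g):
    """Overall GC percentage from per-base counts."""
    return 100.0 * (g + c) / (a + c + t + g)


if __name__ == "__main__":
    # Base counts copied from the two Summary lines in the report.
    print("upstream GC %:", gc_from_counts(a=461, c=489, t=544, g=506))
    print("downstream GC %:", gc_from_counts(a=496, c=424, t=512, g=568))
```

Both flanks come out close to 50% GC overall, so the high-GC regions noted upstream are local features that only a windowed profile would reveal.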


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr17  +       72919189   72921188   2000
YourSeq  28     704    739   2000   93.6%     chr3   -       55587673   55587708   36
YourSeq  22     1074   1095  2000   100.0%    chr3   -       4825475    4825496    22
YourSeq  22     699    720   2000   100.0%    chr2   +       143341380  143341401  22
YourSeq  22     169    190   2000   100.0%    chr14  +       20193075   20193096   22
YourSeq  21     1900   1920  2000   100.0%    chr16  -       79322926   79322946   21
YourSeq  21     708    730   2000   95.7%     chr1   -       71614401   71614423   23
YourSeq  21     1899   1919  2000   100.0%    chr2   +       169040646  169040666  21
YourSeq  21     1689   1709  2000   100.0%    chr18  +       26134246   26134266   21
YourSeq  21     1690   1710  2000   100.0%    chr1   +       179886441  179886461  21
YourSeq  20     1689   1708  2000   100.0%    chr1   +       108258495  108258514  20
YourSeq  20     1898   1917  2000   100.0%    chr1   +       77114986   77115005   20

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. Apart from the full-length match to the query locus itself, no significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr17  +       72921292   72923291   2000
YourSeq  26     781    808   2000   88.9%     chr1   -       11275400   11275426   27
YourSeq  25     1118   1145  2000   96.3%     chr1   +       108703850  108703878  29
YourSeq  21     1391   1411  2000   100.0%    chr17  -       25098051   25098071   21
YourSeq  20     1607   1626  2000   100.0%    chr1   -       92240938   92240957   20
YourSeq  20     1216   1235  2000   100.0%    chr1   +       134682524  134682543  20
YourSeq  20     584    603   2000   100.0%    chr1   +       25138057   25138076   20

Note: The 2000 bp section downstream of Exon 2 is BLAT searched against the genome. Apart from the full-length match to the query locus itself, no significant similarity is found.
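The two full-length BLAT hits also give the genomic position of the targeted region. The sketch below assumes that the coordinates are 1-based and inclusive (consistent with each 2000 bp query spanning end - start + 1 = 2000) and that the two flanks lie immediately either side of the region; both points are assumptions, and the names are illustrative.

```python
# Full-length hits copied from the BLAT tables: (chromosome, start, end).
UP_FLANK = ("chr17", 72919189, 72921188)
DOWN_FLANK = ("chr17", 72921292, 72923291)


def region_between(up, down):
    """Return (chrom, start, end, length) of the gap between two flanks.

    Assumes 1-based inclusive coordinates on the same strand and chromosome.
    """
    chrom_up, _, up_end = up
    chrom_down, down_start, _ = down
    if chrom_up != chrom_down:
        raise ValueError("flanks are on different chromosomes")
    start, end = up_end + 1, down_start - 1
    if end < start:
        raise ValueError("flanks overlap or are adjacent")
    return chrom_up, start, end, end - start + 1


if __name__ == "__main__":
    print(region_between(UP_FLANK, DOWN_FLANK))
```

Under these assumptions the gap between the flanks is 103 bp, which matches the ~103 bp effective KO region given in the strategy summary.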


Gene and information:

Lbh limb-bud and heart [Mus musculus (house mouse)]
Gene ID: 77889, updated on 12-Aug-2019

Gene summary

Official Symbol: Lbh (provided by MGI)
Official Full Name: limb-bud and heart (provided by MGI)
Primary source: MGI:MGI:1925139
See related: Ensembl:ENSMUSG00000024063
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: 1810009F10Rik; 6720416L16Rik
Expression: Broad expression in heart adult (RPKM 216.8), lung adult (RPKM 92.7) and 21 other tissues
Orthologs: human

Genomic context

Location: 17; 17 E1.3
Exon count: 3

Annotation release | Status | Assembly | Chr | Location
108 | current | GRCm38.p6 (GCF_000001635.26) | 17 | NC_000083.6 (72918305..72941946)
Build 37.2 | previous assembly | MGSCv37 (GCF_000001635.18) | 17 | NC_000083.5 (73267645..73291286)

Chromosome 17 - NC_000083.6


Transcript information: This gene has 2 transcripts

Gene: Lbh ENSMUSG00000024063

Description: limb-bud and heart [Source:MGI Symbol;Acc:MGI:1925139]
Gene Synonyms: 1810009F10Rik, 6720416L16Rik
Location: Chromosome 17: 72,918,305-72,941,947 forward strand. GRCm38:CM001010.2
About this gene: This gene has 2 transcripts (splice variants), 187 orthologues, 3 paralogues, is a member of 1 Ensembl protein family and is associated with 8 phenotypes.

Transcripts

Name | Transcript ID | bp | Protein | Translation ID | Biotype | CCDS | UniProt | Flags
Lbh-201 | ENSMUST00000024857.13 | 3069 | 105aa | ENSMUSP00000024857.6 | Protein coding | CCDS28964 | Q9CX60 | TSL:1, GENCODE basic, APPRIS P1
Lbh-202 | ENSMUST00000148556.1 | 336 | 56aa | ENSMUSP00000123062.1 | Nonsense mediated decay | - | F6UVG6 | CDS 5' incomplete, TSL:3

Figure: genome browser view of a 43.64 kb region (72.91Mb to 72.95Mb, forward and reverse strands) showing Lbh-201 (protein coding), Lbh-202 (nonsense mediated decay), contigs AC163036.2 and AC151266.4, and the Regulatory Build. Regulation legend: CTCF, Enhancer, Promoter, Promoter Flank. Gene legend: protein coding (merged Ensembl/Havana); non-protein coding (processed transcript).


Transcript: ENSMUST00000024857

Figure: protein view of Lbh-201 (protein coding; 23.64 kb, forward strand) with tracks for MobiDB lite, Prints (Protein LBH), Pfam (LBH domain), PIRSF (Protein LBH), PANTHER (PTHR14987:SF2, PTHR14987) and sequence variants (dbSNP and all other sources). Variant legend: synonymous variant. Scale bar: 0 to 105.

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
