https://www.alphaknockout.com

Mouse Hapln1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Hapln1 conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Hapln1 gene (NCBI Reference Sequence: NM_013500; Ensembl: ENSMUSG00000021613) is located on mouse chromosome 13. Five exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 5 (Transcript: ENSMUST00000022108). Exon 2 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Hapln1 gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-53P5 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Homozygotes for a targeted null mutation exhibit defects in cartilage development and delayed bone formation with short limbs and craniofacial anomalies. Mutants usually die as neonates due to respiratory failure, but some survive and develop dwarfism.

Exon 2 contains the ATG start codon and therefore the beginning of the coding region. Knockout of exon 2 will result in a frameshift of the gene. The size of intron 1 for 5'-loxP site insertion is 43785 bp, and the size of intron 2 for 3'-loxP site insertion is 16616 bp. The size of the effective cKO region is ~606 bp. The cKO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (with the 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Exons 1, 2 and 5 are labeled. Legend: exon of mouse Hapln1, homology arm, cKO region, loxP site.]
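The step from the targeted allele to the constitutive KO allele can be illustrated with a toy string model. This is only a sketch: the arm and exon strings below are made-up placeholders, not Hapln1 sequence, the function name `cre_excise` is hypothetical, and the model assumes two loxP sites in the same orientation, which is the configuration in which Cre deletes the intervening segment and leaves a single loxP site behind.

```python
# Canonical 34 bp loxP site: 13 bp repeat + 8 bp spacer + 13 bp repeat.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def cre_excise(allele: str, loxp: str = LOXP) -> str:
    """Toy model of Cre recombination between two same-orientation loxP sites.

    Removes the first loxP site plus the floxed segment and keeps the
    second loxP site, so exactly one loxP site remains.
    Returns the allele unchanged if fewer than two sites are found.
    """
    first = allele.find(loxp)
    if first == -1:
        return allele
    second = allele.find(loxp, first + len(loxp))
    if second == -1:
        return allele
    return allele[:first] + allele[second:]


# Placeholder sequences (assumptions for illustration only).
five_prime_arm = "AAAACCCC"
floxed_exon = "GGGGTTTTGGGG"
three_prime_arm = "CCCCAAAA"

targeted_allele = five_prime_arm + LOXP + floxed_exon + LOXP + three_prime_arm
ko_allele = cre_excise(targeted_allele)

print(len(targeted_allele), len(ko_allele))
```

In this model the KO allele is shorter than the targeted allele by the length of the floxed segment plus one loxP site, which is the kind of size difference a PCR genotyping assay across the region would detect.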


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 aligned against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full length (7106 bp) | A (31.72%, 2254) | C (19.77%, 1405) | T (29.40%, 2089) | G (19.11%, 1358)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
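The overall GC content implied by the base counts in the summary can be checked with a few lines of arithmetic. The sketch below also includes a simple windowed GC calculation of the kind a 300 bp window plot is based on. The function name `gc_windows` and the non-overlapping step are assumptions for illustration, since the report does not state how its windows are stepped.

```python
# Base counts as reported in the summary line.
counts = {"A": 2254, "C": 1405, "T": 2089, "G": 1358}

total = sum(counts.values())
gc_fraction = (counts["G"] + counts["C"]) / total
print(f"length = {total} bp, GC = {gc_fraction:.2%}")


def gc_windows(seq: str, window: int = 300, step: int = 300) -> list[float]:
    """GC fraction for each full window along seq.

    Non-overlapping windows by default (an assumption; the report only
    gives the window size). A trailing partial window is ignored.
    """
    seq = seq.upper()
    fractions = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        fractions.append((chunk.count("G") + chunk.count("C")) / window)
    return fractions


# Tiny made-up sequence to show the windowed calculation.
print(gc_windows("GGCCAATT", window=4, step=4))
```

The counts sum to the stated full length of 7106 bp, and the overall GC content comes out a little under 39%.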


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr13  +       89581473   89584472   3000
YourSeq  232    148    582   3000   92.1%     chr6   +       122733423  122733898  476
YourSeq  226    148    583   3000   89.6%     chr11  -       86595646   86906149   310504
YourSeq  218    148    582   3000   90.2%     chr1   -       91915534   92122464   206931
YourSeq  218    148    586   3000   83.7%     chr10  +       80626968   80627298   331
YourSeq  215    148    586   3000   83.9%     chr17  -       31515581   31515924   344
YourSeq  200    145    586   3000   84.1%     chr7   -       12976029   12976352   324
YourSeq  189    154    568   3000   87.2%     chr10  +       52346599   52347127   529
YourSeq  157    148    550   3000   80.1%     chr16  +       30568078   30568325   248
YourSeq  155    148    586   3000   87.2%     chr13  -       55236519   55237018   500
YourSeq  146    239    586   3000   91.5%     chr11  +       74794513   74795162   650
YourSeq  140    418    586   3000   93.1%     chr13  -       98318371   98318538   168
YourSeq  139    169    586   3000   91.2%     chr11  +       97313495   97314121   627
YourSeq  135    415    586   3000   92.0%     chr10  +       83119677   83119876   200
YourSeq  134    31     586   3000   85.4%     chr17  -       28210330   28210863   534
YourSeq  133    435    586   3000   94.1%     chr13  +       58171690   58171842   153
YourSeq  132    435    586   3000   93.5%     chr11  +       15471731   15471882   152
YourSeq  131    436    586   3000   93.4%     chr18  +       34998411   34998561   151
YourSeq  130    435    586   3000   92.8%     chr17  -       21529735   21529886   152
YourSeq  130    153    542   3000   88.7%     chr10  -       80874413   80874973   561

Note: The 3000 bp section upstream of exon 2 is BLAT searched against the genome. Apart from the expected self-match on chr13, no significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr13  +       89585079   89588078   3000
YourSeq  345    2486   2975  3000   94.2%     chr8   -       101895110  101895842  733
YourSeq  345    2355   2865  3000   94.7%     chr19  -       50298740   50299250   511
YourSeq  341    2489   2872  3000   93.4%     chr12  +       17046133   17046508   376
YourSeq  340    2490   2869  3000   95.0%     chr6   -       62730883   62731264   382
YourSeq  340    2487   2872  3000   93.8%     chr3   +       134581258  134581642  385
YourSeq  339    2489   2872  3000   94.4%     chr9   +       12863375   12863755   381
YourSeq  339    2489   2862  3000   95.5%     chr15  +       40809589   40809964   376
YourSeq  338    2486   2869  3000   95.7%     chrX   -       105199757  105200141  385
YourSeq  338    2489   2862  3000   95.5%     chrX   +       77481833   77482208   376
YourSeq  337    2489   2872  3000   93.5%     chr11  +       45231239   45231621   383
YourSeq  336    2489   2872  3000   94.2%     chrX   -       46042442   46042831   390
YourSeq  336    2487   2865  3000   94.2%     chr2   -       17520144   17520505   362
YourSeq  335    2486   2865  3000   94.7%     chr4   +       90811645   90812023   379
YourSeq  335    2489   2860  3000   95.2%     chr11  +       81581744   81582116   373
YourSeq  334    2486   2884  3000   91.3%     chr12  -       116810650  116811040  391
YourSeq  334    2489   2872  3000   93.8%     chr10  -       129165603  129165982  380
YourSeq  334    2489   2859  3000   95.9%     chr9   +       86960103   86960475   373
YourSeq  334    2486   2869  3000   94.0%     chr6   +       45601996   45983296   381301
YourSeq  334    2489   2872  3000   93.5%     chr18  +       47810158   47810540   383

Note: The 3000 bp section downstream of exon 2 is BLAT searched against the genome. Apart from the expected self-match on chr13, no significant similarity is found.
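The chr13 self-matches in the two BLAT tables give the genomic coordinates of the upstream and downstream 3000 bp sections, so the size of the segment between them can be worked out directly. The sketch below assumes the displayed coordinates are 1-based and inclusive, which is consistent with each reported SPAN of 3000. The variable names are illustrative only.

```python
# chr13 self-match coordinates taken from the two BLAT result tables.
upstream_hit = (89581473, 89584472)    # 3000 bp section upstream of exon 2
downstream_hit = (89585079, 89588078)  # 3000 bp section downstream of exon 2


def span(start: int, end: int) -> int:
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


# The segment lying between the two flanking sections.
between_start = upstream_hit[1] + 1
between_end = downstream_hit[0] - 1

print("upstream span:  ", span(*upstream_hit))
print("downstream span:", span(*downstream_hit))
print(f"between: chr13:{between_start}-{between_end}, "
      f"{span(between_start, between_end)} bp")
```

Under that assumption the segment between the two sections is 606 bp, which agrees with the stated effective cKO region size of ~606 bp.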


Gene and information: Hapln1 hyaluronan and proteoglycan link protein 1 [Mus musculus (house mouse)]
Gene ID: 12950, updated on 12-Aug-2019

Gene summary

Official Symbol: Hapln1 (provided by MGI)
Official Full Name: hyaluronan and proteoglycan link protein 1 (provided by MGI)
Primary source: MGI:MGI:1337006
See related: Ensembl:ENSMUSG00000021613
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: LP; CLP; LP-1; Crtl1; Crtl1l; BB099155
Expression: Biased expression in limb E14.5 (RPKM 38.8), CNS E14 (RPKM 5.4) and 4 other tissues
Orthologs: human; all

Genomic context

Location: 13 C3; 13 45.5 cM

Exon count: 5

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  13   NC_000079.6 (89540529..89611832)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    13   NC_000079.5 (89680241..89751437)

[Figure: genomic context of Hapln1 on chromosome 13 (NC_000079.6).]


Transcript information: This gene has 2 transcripts

Gene: Hapln1 ENSMUSG00000021613

Description: hyaluronan and proteoglycan link protein 1 [Source:MGI Symbol;Acc:MGI:1337006]
Gene Synonyms: CLP, Crtl1, Crtl1l, LP-1, cartilage linking protein 1, link protein
Location: Chromosome 13: 89,539,796-89,611,652 forward strand. GRCm38:CM001006.2
About this gene: This gene has 2 transcripts (splice variants), 196 orthologues, 7 paralogues, is a member of 1 Ensembl protein family and is associated with 24 phenotypes.

Transcripts

Name        Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Hapln1-201  ENSMUST00000022108.8  5715  356aa       ENSMUSP00000022108.7  Protein coding   CCDS26671  Q9QUP5   TSL:1 GENCODE basic APPRIS P1
Hapln1-202  ENSMUST00000225678.1  464   No protein  -                     Retained intron  -          -        -

[Figure: Ensembl genome browser view of a 91.86 kb region of chromosome 13 (89.54 Mb to 89.62 Mb), forward and reverse strands. Tracks: Hapln1-201 (protein coding) and Hapln1-202 (retained intron); contig AC154460.2; Gm8546-201 (processed pseudogene); Regulatory Build. Regulation legend: CTCF, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (merged Ensembl/Havana); non-protein coding (processed transcript, pseudogene).]


Transcript: ENSMUST00000022108

[Figure: protein domain and variant view of Hapln1-201 (protein coding), 71.86 kb, forward strand, for protein ENSMUSP00000022...; scale bar 0 to 356. Annotation tracks:]

Cleavage site (Sign...
Superfamily: Immunoglobulin-like domain superfamily; C-type lectin fold
SMART: Immunoglobulin V-set domain; Link domain; Immunoglobulin subtype
Prints: Link domain
Pfam: Immunoglobulin V-set domain; Link domain
PROSITE profiles: Immunoglobulin-like domain; Link domain
PROSITE patterns: Link domain
PANTHER: PTHR22804:SF10; PTHR22804
Gene3D: Immunoglobulin-like fold; C-type lectin-like/link domain superfamily
CDD: cd05877; cd03519; cd03518
All sequence SNPs/i...: Sequence variants (dbSNP and all other sources)
Variant legend: missense variant; synonymous variant

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
