http://www.alphaknockout.com/ Mouse Kdm7a Knockout Project (CRISPR/Cas9)

Objective: To create a Kdm7a knockout mouse model (C57BL/6J) by CRISPR/Cas9-mediated genome engineering.

Strategy summary: The Kdm7a gene (NCBI Reference Sequence: NM_001033430; Ensembl: ENSMUSG00000042599) is located on mouse chromosome 6. Twenty exons have been identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 20 (Transcript: ENSMUST00000002305). Exons 5-6 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Homozygous mutants exhibit abnormal hair follicle, tail, sebaceous gland, rib, and vertebrae morphology and decreased circulating iron levels.

Exon 5 starts at about 19.86% of the coding region. Exons 5-6 cover 11.67% of the coding region. The effective KO region is ~1664 bp in size. The KO region does not contain any other known gene.
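The percentages above can be converted back into approximate nucleotide counts. The sketch below assumes the coding region is 940 codons x 3 nt = 2820 nt (stop codon excluded), a figure inferred from the 940 aa protein listed for ENSMUST00000002305 rather than stated in this report. The function and variable names are illustrative only.

```python
# Assumption: CDS length = 940 codons x 3 nt (stop codon excluded),
# inferred from the 940 aa protein of ENSMUST00000002305.
CDS_LENGTH_NT = 940 * 3  # 2820 nt


def nt_from_percent(percent, total=CDS_LENGTH_NT):
    """Return the integer nucleotide count closest to a rounded percentage."""
    return round(total * percent / 100)


exon5_offset = nt_from_percent(19.86)  # coding nt upstream of exon 5
deleted_cds = nt_from_percent(11.67)   # coding nt contained in exons 5-6

# A deleted coding length that is not a multiple of 3 shifts the reading frame.
causes_frameshift = deleted_cds % 3 != 0

print(exon5_offset, deleted_cds, causes_frameshift)
```

Under these assumptions, exons 5-6 hold roughly 329 coding nucleotides, which is not a multiple of three, so their removal would be expected to shift the downstream reading frame. The exact exon boundaries should be confirmed against the transcript annotation.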


Overview of the Targeting Strategy

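[Figure: wildtype allele of mouse Kdm7a, with exons 1, 5, 6 and 20 labelled. The 5' gRNA region and the 3' gRNA region flank the knockout region spanning exons 5-6. Legend: exon of mouse Kdm7a; knockout region.]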


Overview of the Dot Plot (up) Window size: 15 bp

[Figure: dot plot of the upstream sequence against itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of Exon 5 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
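A self-alignment dot plot of this kind can be sketched as follows. This is a minimal illustration, not the tool used for the report: it assumes exact matching of 15 bp windows on the forward strand only (the reverse-complement comparison shown in the figure is omitted), and the function name and toy sequence are hypothetical.

```python
def self_dot_plot(seq, window=15):
    """Return sorted (i, j) pairs with i < j whose window-length words are identical."""
    positions = {}
    # Index every window-length word by its start position.
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)

    hits = []
    # Any word seen at two or more positions yields off-diagonal dots.
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


# Toy input: a 20 nt unit repeated twice in tandem, with short flanks.
unit = "ACGTTGCAAGCTTAGGCTCA"
toy = "TTTTT" + unit + unit + "GGGGG"

hits = self_dot_plot(toy, window=15)
offsets = {j - i for i, j in hits}
print(len(hits), offsets)
```

Dots that line up on a diagonal at a constant offset from the main diagonal (here, an offset equal to the repeat unit length) are the signature of a tandem repeat. A sequence without such diagonals returns few or no hits.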

Overview of the Dot Plot (down) Window size: 15 bp

[Figure: dot plot of the downstream sequence against itself. Legend: Forward; Reverse Complement.]

Note: The 1020 bp section downstream of Exon 6 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.


Overview of the GC Content Distribution (up) Window size: 300 bp


Summary: Full Length(2000bp) | A(32.3% 646) | C(19.5% 390) | G(17.1% 342) | T(31.1% 622)

Note: The 2000 bp section upstream of Exon 5 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down) Window size: 300 bp


Summary: Full Length(1020bp) | A(26.67% 272) | C(17.06% 174) | G(15.88% 162) | T(40.39% 412)

Note: The 1020 bp section downstream of Exon 6 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
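The overall GC content of each flanking region follows directly from the base counts in the two summaries. A minimal sketch, using only the counts reported above (the function name is illustrative):

```python
def gc_percent(a, c, g, t):
    """Return (total length, GC percentage) from per-base counts."""
    total = a + c + g + t
    return total, 100.0 * (g + c) / total


# Base counts copied from the two summary lines in this report.
upstream = gc_percent(a=646, c=390, g=342, t=622)
downstream = gc_percent(a=272, c=174, g=162, t=412)

print("upstream:   %d bp, GC %.2f%%" % upstream)
print("downstream: %d bp, GC %.2f%%" % downstream)
```

This gives about 36.6% GC for the 2000 bp upstream section and about 32.9% GC for the 1020 bp downstream section. These are whole-region averages. The plotted distribution uses a 300 bp sliding window, so local values will vary around them.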


BLAT Search Results (up)

QUERY    SCORE  Q_START  Q_END  QSIZE  IDENTITY  CHROM  STRAND  T_START    T_END      SPAN
------------------------------------------------------------------------------------------
YourSeq  2000   1        2000   2000   100.0%    chr6   -       39170911   39172910   2000
YourSeq  473    579      1219   2000   88.6%     chrX   -       159091204  159092113  910
YourSeq  421    586      1233   2000   89.8%     chr5   +       134587266  134588231  966
YourSeq  400    688      1219   2000   91.7%     chr18  +       44971841   44972789   949
YourSeq  397    751      1232   2000   92.8%     chr1   -       65797974   65798459   486
YourSeq  392    750      1233   2000   90.9%     chrX   +       152719816  152720299  484
YourSeq  392    759      1457   2000   89.9%     chr18  +       54248302   54249066   765
YourSeq  391    764      1233   2000   91.3%     chr9   -       70660294   70660761   468
YourSeq  390    759      1233   2000   92.6%     chr9   -       51910717   51911191   475
YourSeq  386    762      1233   2000   90.2%     chr17  -       73442696   73443163   468
YourSeq  386    759      1235   2000   90.9%     chr10  -       25277720   25278196   477
YourSeq  384    759      1233   2000   90.7%     chr15  -       6503606    6504080    475
YourSeq  383    750      1233   2000   90.3%     chr6   -       7572148    7572630    483
YourSeq  382    754      1233   2000   89.6%     chrX   -       58418837   58419315   479
YourSeq  382    759      1233   2000   90.6%     chr3   -       59378349   59378822   474
YourSeq  382    750      1233   2000   88.8%     chr2   -       126344662  126345141  480
YourSeq  382    751      1233   2000   90.5%     chr2   -       114463188  114463676  489
YourSeq  382    759      1221   2000   92.7%     chr10  -       89902152   89902614   463
YourSeq  382    750      1233   2000   89.3%     chr6   +       25009226   25009708   483
YourSeq  382    759      1233   2000   90.5%     chr5   +       21967124   21967598   475

Note: The 2000 bp section upstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.
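To review hits like these programmatically, each result row can be split into typed fields. The sketch below assumes whitespace-separated rows in the column order shown (score, query start/end, query size, identity, chromosome, strand, target start/end, span) with the leading "YourSeq" label already removed. The helper names are hypothetical, and the three sample rows are copied from the upstream results.

```python
def parse_blat_row(row):
    """Split one whitespace-separated BLAT summary row into typed fields."""
    f = row.split()
    return {
        "score": int(f[0]),
        "q_start": int(f[1]),
        "q_end": int(f[2]),
        "q_size": int(f[3]),
        "identity": float(f[4].rstrip("%")),
        "chrom": f[5],
        "strand": f[6],
        "t_start": int(f[7]),
        "t_end": int(f[8]),
        "span": int(f[9]),
    }


def query_coverage(hit):
    """Fraction of the query spanned by the hit (inclusive coordinates)."""
    return (hit["q_end"] - hit["q_start"] + 1) / hit["q_size"]


rows = [
    "2000 1 2000 2000 100.0% chr6 - 39170911 39172910 2000",
    "473 579 1219 2000 88.6% chrX - 159091204 159092113 910",
    "421 586 1233 2000 89.8% chr5 + 134587266 134588231 966",
]
hits = [parse_blat_row(r) for r in rows]

for hit in hits:
    print(hit["chrom"], hit["score"], hit["identity"], round(query_coverage(hit), 3))
```

The first row is the query matching its own locus on chr6. The remaining rows span only part of the query, which is the kind of comparison needed when judging whether a secondary hit matters for primer or gRNA placement.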

BLAT Search Results (down)

QUERY    SCORE  Q_START  Q_END  QSIZE  IDENTITY  CHROM  STRAND  T_START    T_END      SPAN
------------------------------------------------------------------------------------------
YourSeq  1020   1        1020   1020   100.0%    chr6   -       39169104   39170123   1020
YourSeq  25     540      566    1020   96.3%     chr13  -       71252305   71252331   27
YourSeq  24     190      216    1020   96.3%     chrX   -       137725385  137725412  28
YourSeq  24     892      915    1020   100.0%    chr15  +       91830960   91830983   24
YourSeq  22     341      363    1020   100.0%    chr1   -       34980366   34980391   26
YourSeq  22     252      276    1020   96.0%     chr5   +       5125706    5125731    26
YourSeq  22     63       84     1020   100.0%    chr14  +       98683615   98683636   22
YourSeq  21     524      544    1020   100.0%    chr16  -       79610461   79610481   21
YourSeq  20     773      792    1020   100.0%    chr1   +       49692381   49692400   20

Note: The 1020 bp section downstream of Exon 6 is BLAT searched against the genome. No significant similarity is found.

Gene information:

Kdm7a lysine (K)-specific demethylase 7A [ Mus musculus (house mouse) ]
Gene ID: 338523, updated on 18-Jul-2020

Gene summary

Official Symbol: Kdm7a (provided by MGI)
Official Full Name: lysine (K)-specific demethylase 7A (provided by MGI)
Primary source: MGI:MGI:2443388
See related: Ensembl:ENSMUSG00000042599
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Jhdm1d; BB041802; mKIAA1718; A630082K20Rik
Expression: Ubiquitous expression in liver E14 (RPKM 5.3), liver E14.5 (RPKM 4.4) and 28 other tissues
Orthologs: human; all

Genomic context

Location: 6; 6 B1
Exon count: 20

Annotation release  Status             Assembly                      Chr  Location
108.20200622        current            GRCm38.p6 (GCF_000001635.26)  6    NC_000072.6 (39136620..39206773, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    6    NC_000072.5 (39086619..39156772, complement)

Chromosome 6 - NC_000072.6


Transcript information: This gene has 2 transcripts

Gene: Kdm7a ENSMUSG00000042599

Description: lysine (K)-specific demethylase 7A [Source:MGI Symbol;Acc:MGI:2443388]
Gene Synonyms: A630082K20Rik, ENSMUSG00000073143, Jhdm1d, Kdm7a
Location: Chromosome 6: 39,136,623-39,206,789 reverse strand. GRCm38:CM000999.2
About this gene: This gene has 2 transcripts (splice variants), 325 orthologues, 4 paralogues, is a member of 1 Ensembl protein family and is associated with 14 phenotypes.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype               CCDS       UniProt  Flags
Kdm7a-201  ENSMUST00000002305.8  9566  940aa       ENSMUSP00000002305.8  Protein coding        CCDS51753  Q3UWM4   TSL:1 GENCODE basic APPRIS P1
Kdm7a-202  ENSMUST00000127036.1  3077  No protein  -                     Processed transcript  -          -        TSL:1

[Figure: Ensembl genome browser view of the Kdm7a locus on chromosome 6, 39.14-39.20 Mb (90.17 kb). Forward strand: 4930599N23Rik-201 (antisense) and Gm49041-201 (bidirectional promoter lncRNA). Contigs: AC153818.5, AC155729.3. Reverse strand: Kdm7a-201 (protein coding) and Kdm7a-202 (processed transcript). Additional track: Regulatory Build. Gene legend: protein coding (merged Ensembl/Havana); non-protein coding (bidirectional promoter lncRNA, processed transcript). Regulation legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site.]


Transcript: ENSMUST00000002305

[Figure: protein domain and variant view of Kdm7a-201 (protein coding; reverse strand, 70.17 kb), plotted against the 940 aa protein. Annotation tracks:
MobiDB lite
Low complexity (Seg)
Superfamily: Zinc finger, FYVE/PHD-type; SSF51197
SMART: Zinc finger, PHD-type; JmjC domain
Pfam: Zinc finger, PHD-finger; JmjC domain; Jumonji, helical domain
PROSITE profiles: Zinc finger, PHD-finger; JmjC domain
PROSITE patterns: Zinc finger, PHD-type, conserved site
PANTHER: PTHR23123; PTHR23123:SF15
Gene3D: 1.20.58.1360; Zinc finger, RING/FYVE/PHD-type; 2.60.120.650
CDD: cd15640
Sequence variants (dbSNP and all other sources). Variant legend: splice acceptor variant, missense variant, splice region variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC, VectorBuilder.
