https://www.alphaknockout.com

Mouse Kdm2b Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Kdm2b conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Kdm2b gene (NCBI Reference Sequence: NM_001003953; Ensembl: ENSMUSG00000029475) is located on mouse chromosome 5. Twenty-three exons have been identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 23 (Transcript: ENSMUST00000046073). Exons 15 to 19 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Kdm2b gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-330H17 as the template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a targeted allele that does not express the long form exhibit exencephaly, fetal and postnatal lethality, coloboma, curly tail, oligozoospermia, increased apoptosis, and increased neuronal precursor proliferation.

Exon 15 starts at about 51.52% of the coding region. Knocking out exons 15 to 19 will result in a frameshift of the gene. Intron 14, the site of the 5'-loxP insertion, is 830 bp long; intron 19, the site of the 3'-loxP insertion, is 520 bp long. The effective cKO region is ~2816 bp. The cKO region does not contain any other known gene.
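As a rough illustration of the figures above, the following Python sketch converts the stated 51.52% position into an approximate nucleotide and codon position within the coding sequence of ENSMUST00000046073 (1309 aa). It assumes that the percentage is measured against a coding sequence that includes one stop codon; the helper `causes_frameshift` is a hypothetical name, and exon-level coding lengths are not given in this report, so they are not reproduced here.

```python
# Values taken from this report.
PROTEIN_LENGTH_AA = 1309          # ENSMUST00000046073 protein length
EXON15_START_FRACTION = 0.5152    # "about 51.52% of the coding region"

# Assumption: the coding sequence length counts the stop codon as well.
cds_length_bp = (PROTEIN_LENGTH_AA + 1) * 3

# Approximate first coding nucleotide of exon 15 (1-based) and its codon.
exon15_start_nt = round(EXON15_START_FRACTION * cds_length_bp)
exon15_start_codon = (exon15_start_nt - 1) // 3 + 1


def causes_frameshift(deleted_coding_bp: int) -> bool:
    """Return True when a deletion of this many coding bases shifts the frame."""
    return deleted_coding_bp % 3 != 0


if __name__ == "__main__":
    print(f"CDS length (assumed): {cds_length_bp} bp")
    print(f"Exon 15 starts near coding nucleotide {exon15_start_nt} "
          f"(codon {exon15_start_codon})")
```

Under these assumptions, the deletion begins roughly halfway through the protein; whether a given exon set shifts the reading frame depends only on whether its total coding length is a multiple of three.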


Overview of the Targeting Strategy

[Figure: schematic of four alleles drawn 5' to 3'. Wildtype allele: exons 1 and 14 to 23, with gRNA regions flanking exons 15 to 19. Targeting vector. Targeted allele. Constitutive KO allele (after Cre recombination). Legend: exon of mouse Kdm2b; homology arm; cKO region; loxP site.]
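The step from the targeted allele to the constitutive KO allele can be sketched in a few lines of Python. This is a toy model only: it assumes two loxP sites in the same orientation and uses the canonical 34 bp loxP sequence, while the arm and cKO sequences are short placeholders rather than real Kdm2b sequence. The function name `cre_excise` is hypothetical.

```python
# Canonical 34 bp loxP site (13 bp repeat, 8 bp spacer, 13 bp inverted repeat).
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(allele: str, site: str = LOXP) -> str:
    """Model Cre recombination between the first two same-orientation sites.

    The sequence between the two sites is removed and a single site remains.
    Alleles with fewer than two sites are returned unchanged.
    """
    first = allele.find(site)
    if first == -1:
        return allele
    second = allele.find(site, first + len(site))
    if second == -1:
        return allele
    # Keep everything before the first site, then resume at the second site.
    return allele[:first] + allele[second:]


if __name__ == "__main__":
    # Placeholder sequences standing in for the homology arms and cKO region.
    five_arm, cko_region, three_arm = "AAAA", "GGGGCCCC", "TTTT"
    targeted = five_arm + LOXP + cko_region + LOXP + three_arm
    print(cre_excise(targeted))
```

In this model the recombined allele is the 5' arm, one loxP site, and the 3' arm, which mirrors the constitutive KO allele in the schematic.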


Overview of the Dot Plot

[Figure: dot plot of the sequence (labelled "Sequence 12") aligned against itself. Window size: 10 bp. Forward and reverse-complement matches are shown.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
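A self-alignment of this kind can be approximated with exact k-mer matching. The sketch below is a simplified stand-in for the dot plot tool used in the report (which is not named here): it records every pair of positions whose 10 bp windows match exactly, either on the forward strand or as reverse complements. The function names are hypothetical and the example sequence is a toy string, not the Kdm2b sequence.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of matching window start positions.

    forward: pairs (i, j) with i < j whose windows are identical.
    reverse: pairs (i, j) where window i equals the reverse complement
             of window j.
    """
    index = defaultdict(list)
    last_start = len(seq) - window
    for i in range(last_start + 1):
        index[seq[i:i + window]].append(i)

    forward = sorted(
        (i, j)
        for positions in index.values()
        for i in positions
        for j in positions
        if i < j
    )
    reverse = sorted(
        (i, j)
        for j in range(last_start + 1)
        for i in index.get(revcomp(seq[j:j + window]), [])
    )
    return forward, reverse


if __name__ == "__main__":
    toy = "ACGTTGCAAT" + "GGG" + "ACGTTGCAAT"  # one 10 bp direct repeat
    fwd, rev = self_dot_plot(toy, window=10)
    print("forward matches:", fwd)
    print("reverse-complement matches:", rev)
```

Runs of forward matches lying close to the main diagonal are the usual signature of tandem repeats; exact matching is stricter than most dot plot tools, so this sketch may miss diverged repeats.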

Overview of the GC Content Distribution

[Figure: GC content distribution along the sequence (labelled "Sequence 12"). Window size: 300 bp.]

Summary: Full Length(9316bp) | A(24.21% 2255) | C(23.7% 2208) | T(26.79% 2496) | G(25.3% 2357)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found, so it may be difficult to construct this targeting vector.
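The base counts in the summary line can be checked directly, and a windowed GC profile can be computed in the same way as the plot. The sketch below is illustrative only: the base counts come from the summary above, while `gc_windows` is a hypothetical helper and the report does not state what threshold it uses to call a region "high GC".

```python
# Base counts copied from the summary line of this report.
BASE_COUNTS = {"A": 2255, "C": 2208, "T": 2496, "G": 2357}

total_bp = sum(BASE_COUNTS.values())
overall_gc_percent = 100 * (BASE_COUNTS["G"] + BASE_COUNTS["C"]) / total_bp


def gc_windows(seq: str, window: int = 300, step: int = 1):
    """Return (start, gc_fraction) for each full window along the sequence."""
    results = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        results.append((start, gc / window))
    return results


if __name__ == "__main__":
    print(f"Total length: {total_bp} bp")
    print(f"Overall GC content: {overall_gc_percent:.2f}%")
    # Toy sequence; the real 9316 bp sequence is not included in this report.
    print(gc_windows("GGCCAATT", window=4, step=4))
```

The overall GC content is close to 49%, so the difficulty noted above stems from local windows rather than from the average composition.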


BLAT Search Results (up)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
---------------------------------------------------------------------------------------
YourSeq   3000      1  3000   3000    100.0%  chr5   -       122882739  122885738  3000
YourSeq    151   1176  1469   3000     93.3%  chr11  +       120724474  120724910   437
YourSeq    149   1311  1580   3000     93.1%  chr11  +       101761545  101761969   425
YourSeq    141    839  1462   3000     85.3%  chr1   -       170594927  170595475   549
YourSeq    138   1311  1467   3000     96.7%  chr12  +        65001641   65001800   160
YourSeq    136   1309  1463   3000     94.2%  chr11  +        53588016   53588171   156
YourSeq    135   1314  1467   3000     94.2%  chr6   -       120448952  120449106   155
YourSeq    135   1313  1464   3000     92.0%  chr4   -       107129144  107129292   149
YourSeq    134   1308  1463   3000     93.3%  chrX   -       142982285  142982439   155
YourSeq    134   1316  1468   3000     95.3%  chr5   -       110495507  110495660   154
YourSeq    133   1313  1464   3000     94.7%  chr4   -       132965631  132965783   153
YourSeq    133   1324  1597   3000     84.5%  chr1   -        86158184   86158331   148
YourSeq    133   1313  1459   3000     96.0%  chr3   +       131026949  131027098   150
YourSeq    133   1313  1466   3000     93.6%  chr18  +        46519033   46519188   156
YourSeq    133   1312  1464   3000     92.8%  chr11  +        72212728   72212879   152
YourSeq    132   1293  1463   3000     94.1%  chr2   -        90821956   90822128   173
YourSeq    132   1313  1467   3000     94.0%  chr1   -       171572880  171573059   180
YourSeq    132   1311  1465   3000     93.5%  chrX   +       144265618  144265773   156
YourSeq    131   1315  1467   3000     93.5%  chr2   -        29692522   29692676   155
YourSeq    131   1313  1463   3000     93.4%  chr17  -        20296349   20296499   151

Note: The 3000 bp section upstream of Exon 15 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
---------------------------------------------------------------------------------------
YourSeq   3000      1  3000   3000    100.0%  chr5   -       122876923  122879922  3000
YourSeq    431    381  1488   3000     94.8%  chr4   -        40912923   40913357   435
YourSeq     31   2507  2539   3000     90.7%  chr13  +        29736451   29736482    32
YourSeq     27   2770  2807   3000     83.9%  chrX   -       115649211  115649246    36
YourSeq     27   2869  2896   3000    100.0%  chr13  +        13579126   13579165    40
YourSeq     26   2513  2538   3000    100.0%  chr12  -       106226661  106226686    26
YourSeq     26   2497  2522   3000    100.0%  chr8   +        19949445   19949470    26
YourSeq     25   1673  1697   3000    100.0%  chr4   -       114732910  114732934    25
YourSeq     23    978  1000   3000    100.0%  chr8   -        63111955   63111977    23
YourSeq     23    969   991   3000    100.0%  chr8   +        93415406   93415428    23
YourSeq     22    153   174   3000    100.0%  chr19  -        38088076   38088097    22
YourSeq     21   1872  1892   3000    100.0%  chr6   -        42578466   42578486    21
YourSeq     21   2559  2579   3000    100.0%  chr16  -        31517716   31517736    21

Note: The 3000 bp section downstream of Exon 19 is BLAT searched against the genome. No significant similarity is found.
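The self-hits in the two BLAT tables (the first row of each) also give the genomic coordinates of the flanking sections, from which the size of the region between them can be derived. The sketch below assumes that both coordinate pairs are 1-based and inclusive, as the reported SPAN values suggest; the variable and function names are hypothetical.

```python
# chr5 coordinates of the 100% self-hits, copied from the two BLAT tables.
UPSTREAM_SECTION = (122882739, 122885738)    # 3000 bp upstream of exon 15
DOWNSTREAM_SECTION = (122876923, 122879922)  # 3000 bp downstream of exon 19


def span(start: int, end: int) -> int:
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


# Kdm2b lies on the reverse strand, so the upstream section has the higher
# genomic coordinates and the cKO region sits between the two sections.
cko_start = DOWNSTREAM_SECTION[1] + 1
cko_end = UPSTREAM_SECTION[0] - 1
cko_size_bp = span(cko_start, cko_end)

if __name__ == "__main__":
    print(f"Upstream section:   {span(*UPSTREAM_SECTION)} bp")
    print(f"Downstream section: {span(*DOWNSTREAM_SECTION)} bp")
    print(f"Region between them: chr5:{cko_start}-{cko_end} ({cko_size_bp} bp)")
```

The interval between the two sections works out to 2816 bp, which agrees with the effective cKO region size stated in the strategy summary.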


Gene and protein information:

Kdm2b lysine (K)-specific demethylase 2B [Mus musculus (house mouse)]
Gene ID: 30841, updated on 12-Aug-2019

Gene summary

Official Symbol: Kdm2b (provided by MGI)
Official Full Name: lysine (K)-specific demethylase 2B (provided by MGI)
Primary source: MGI:MGI:1354737
See related: Ensembl:ENSMUSG00000029475
Gene type: protein coding
RefSeq status: REVIEWED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Cxxc2; Fbl10; PCCX2; Fbxl10; Jhdm1b
Summary: The protein encoded by this gene is a H3K36-specific histone demethylase, which contains an N-terminal jumonji C domain, a CxxC zinc finger domain, a plant homeodomain finger, an F-box, and eight leucine-rich repeats. Amongst its demonstrated functions, this protein plays roles in the suppression of premature cellular senescence, leukemia maintenance and development, maintenance of mouse embryonic stem cell pluripotency, and induced pluripotent stem cell generation. Mice homozygous for a targeted deletion of the zinc finger domain display embryonic lethality with development ceasing at approximately 7 to 8 days post coitum, demonstrating an essential role in early development. A pseudogene of this gene is found on chromosome 4. Alternative splicing results in multiple transcript variants. [provided by RefSeq, Oct 2014]
Expression: Ubiquitous expression in CNS E14 (RPKM 10.9), whole brain E14.5 (RPKM 10.7) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 5; 5 F
Exon count: 29

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  5    NC_000071.6 (122870664..122989270, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    5    NC_000071.5 (123320677..123439101, complement)

[Figure: genomic context of Kdm2b on Chromosome 5 - NC_000071.6]


Transcript information: This gene has 20 transcripts

Gene: Kdm2b ENSMUSG00000029475

Description: lysine (K)-specific demethylase 2B [Source:MGI Symbol;Acc:MGI:1354737]
Gene Synonyms: Cxxc2, Fbxl10, Jhdm1b
Location: Chromosome 5: 122,870,665-122,989,823 reverse strand. GRCm38:CM000998.2
About this gene: This gene has 20 transcripts (splice variants), 317 orthologues, 4 paralogues, is a member of 1 Ensembl protein family and is associated with 31 phenotypes.

Transcripts

Name       Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt  Flags
Kdm2b-202  ENSMUST00000046073.15  5180  1309aa      ENSMUSP00000038229.9  Protein coding           CCDS39259  Q6P1G2   TSL:1, GENCODE basic, APPRIS P2
Kdm2b-201  ENSMUST00000031435.13  3532  776aa       ENSMUSP00000031435.7  Protein coding           CCDS19657  Q6P1G2   TSL:1, GENCODE basic
Kdm2b-204  ENSMUST00000118027.7   5062  1266aa      ENSMUSP00000114052.2  Protein coding           -          D3YVU4   TSL:5, GENCODE basic
Kdm2b-203  ENSMUST00000086200.10  4999  1303aa      ENSMUSP00000083376.4  Protein coding           -          E9QL25   TSL:5, GENCODE basic, APPRIS ALT2
Kdm2b-205  ENSMUST00000121739.7   4107  1254aa      ENSMUSP00000114049.1  Protein coding           -          D3YVU7   TSL:5, GENCODE basic, APPRIS ALT2
Kdm2b-218  ENSMUST00000156474.7   1598  514aa       ENSMUSP00000118488.1  Protein coding           -          D3YUE3   CDS 3' incomplete, TSL:1
Kdm2b-214  ENSMUST00000145082.1   1043  286aa       ENSMUSP00000114731.1  Protein coding           -          D3YV31   CDS 3' incomplete, TSL:2
Kdm2b-217  ENSMUST00000152872.1   888   296aa       ENSMUSP00000119746.1  Protein coding           -          F6QTG9   CDS 5' and 3' incomplete, TSL:3
Kdm2b-207  ENSMUST00000127403.7   2543  38aa        ENSMUSP00000120912.1  Nonsense mediated decay  -          D6RHM8   TSL:5
Kdm2b-212  ENSMUST00000139674.1   4244  No protein  -                     Retained intron          -          -        TSL:1
Kdm2b-206  ENSMUST00000123479.7   2753  No protein  -                     Retained intron          -          -        TSL:1
Kdm2b-210  ENSMUST00000134501.7   1491  No protein  -                     Retained intron          -          -        TSL:1
Kdm2b-209  ENSMUST00000132419.1   641   No protein  -                     Retained intron          -          -        TSL:2
Kdm2b-220  ENSMUST00000174357.1   414   No protein  -                     Retained intron          -          -        TSL:5
Kdm2b-213  ENSMUST00000143273.7   724   No protein  -                     lncRNA                   -          -        TSL:5
Kdm2b-216  ENSMUST00000150543.1   697   No protein  -                     lncRNA                   -          -        TSL:3
Kdm2b-215  ENSMUST00000147544.7   625   No protein  -                     lncRNA                   -          -        TSL:5
Kdm2b-208  ENSMUST00000129998.7   550   No protein  -                     lncRNA                   -          -        TSL:3
Kdm2b-219  ENSMUST00000173355.7   516   No protein  -                     lncRNA                   -          -        TSL:3
Kdm2b-211  ENSMUST00000138929.7   464   No protein  -                     lncRNA                   -          -        TSL:3

[Figure: Ensembl genome browser view of the Kdm2b locus, 139.16 kb, spanning 122.88 Mb to 122.98 Mb. Forward strand: Rnf34-201 and Rnf34-202 (protein coding), A930024E05Rik-201 (lncRNA). Contig: AC121564.4. Reverse strand: Kdm2b-201 to Kdm2b-220, Gm43411-201 (TEC), Gm44574-201 (lncRNA). Regulatory Build legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Factor Binding Site. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (processed transcript, RNA gene).]


Transcript: ENSMUST00000046073

[Figure: protein domain and variant map for Kdm2b-202 (protein coding; reverse strand, 118.44 kb; protein ENSMUSP00000038..., scale 0 to 1309 aa). Tracks and annotated features:
MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils)
Superfamily: SSF51197; Zinc finger, FYVE/PHD-type; SSF52047
SMART: JmjC domain; Zinc finger, PHD-type; Leucine-rich repeat, cysteine-containing subtype
Pfam: Jumonji, helical domain; Zinc finger, CXXC-type; F-box domain; Cupin-like domain 8; Zinc finger, PHD-finger
PROSITE profiles: JmjC domain; Zinc finger, PHD-finger; Zinc finger, CXXC-type
PANTHER: PTHR23123:SF10; PTHR23123
Gene3D: 2.60.120.650; 1.20.58.1360; 1.20.58.2210; Zinc finger, RING/FYVE/PHD-type; Leucine-rich repeat domain superfamily
CDD: cd15644
Sequence variants (dbSNP and all other sources). Variant legend: stop gained; missense variant; splice region variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
