https://www.alphaknockout.com

Mouse Kat6a Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Kat6a conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Kat6a gene (NCBI Reference Sequence: NM_001081149; Ensembl: ENSMUSG00000031540) is located on mouse chromosome 8. Eighteen exons have been identified, with the ATG start codon in exon 3 and the TGA stop codon in exon 18 (Transcript: ENSMUST00000044331). Exon 4 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Kat6a gene. To engineer the targeting vector, the homology arms and cKO region will be generated by PCR using BAC clone RP23-327N5 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Homozygous null mice display perinatal lethality, cyanosis, decreased hematopoietic progenitor cell numbers, and severely impaired spleen and thymus development, but are not anemic. Heterozygotes display strain-background-dependent reductions in fertility.

Exon 4 starts at about 10.03% of the coding region. Knockout of exon 4 will result in a frameshift of the gene. The size of intron 3 for 5'-loxP site insertion is 40306 bp, and the size of intron 4 for 3'-loxP site insertion is 4429 bp. The size of the effective cKO region is ~873 bp. The cKO region does not contain any other known gene.
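The frameshift statement rests on reading-frame arithmetic: deleting a stretch of coding sequence whose length is not a multiple of 3 shifts the downstream frame. The following Python sketch illustrates this. The exon length in the example is a hypothetical placeholder (this report does not state the length of exon 4), and the coding-sequence length is an assumption: 2003 codons plus one stop codon, based on the 2003 aa protein listed for ENSMUST00000044331.

```python
def causes_frameshift(deleted_coding_bp: int) -> bool:
    """Return True if deleting this many coding bases shifts the reading frame."""
    if deleted_coding_bp <= 0:
        raise ValueError("deleted_coding_bp must be positive")
    return deleted_coding_bp % 3 != 0


def cds_offset(fraction: float, protein_aa: int) -> int:
    """Approximate CDS position (nt) at a given fraction of the coding region.

    Assumption: the CDS is protein_aa codons plus one stop codon.
    """
    cds_len = (protein_aa + 1) * 3
    return round(fraction * cds_len)


# Exon 4 is reported to start at about 10.03% of the coding region.
approx_start_nt = cds_offset(0.1003, 2003)

# HYPOTHETICAL exon length, for illustration only (not taken from the report).
example_exon_len = 100
print(approx_start_nt, causes_frameshift(example_exon_len))
```

Under these assumptions the start of exon 4 falls roughly 600 nt into the coding sequence, so a frameshift there would disrupt most of the protein.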


Overview of the Targeting Strategy

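[Figure: schematic of the targeting strategy, showing the wildtype allele (exons 1, 4 and 18 labeled, with the 5' and 3' gRNA regions), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Kat6a; homology arm; cKO region; loxP site.]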


Overview of the Dot Plot

[Figure: dot plot of the sequence aligned against itself. Window size: 10 bp. Forward and reverse-complement matches are shown.]

Note: The sequence of the homology arms and cKO region is aligned against itself to detect tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
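A self dot plot of this kind can be sketched as follows. This is a minimal illustration, not the tool used to produce the report: it assumes exact matching of fixed-size windows (10 bp by default) and returns coordinate pairs rather than drawing a plot. The function names are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string; unknown symbols map to N."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp.get(b, "N") for b in reversed(seq.upper()))


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) exact window matches.

    forward: seq[i:i+window] == seq[j:j+window], with i < j
             (the trivial main diagonal is omitted)
    reverse: seq[i:i+window] equals the reverse complement of seq[j:j+window]
    """
    seq = seq.upper()
    index = {}
    for i in range(len(seq) - window + 1):
        index.setdefault(seq[i:i + window], []).append(i)

    forward, reverse = [], []
    for kmer, positions in index.items():
        # Repeated windows on the same strand (direct repeats).
        for a in positions:
            for b in positions:
                if a < b:
                    forward.append((a, b))
        # Windows matching on the opposite strand (inverted repeats).
        for a in positions:
            for b in index.get(reverse_complement(kmer), []):
                reverse.append((a, b))
    return sorted(forward), sorted(reverse)


# Toy sequence with one direct repeat of a 10 bp unit.
fwd, rev = self_dot_plot("ACGTTGCAAT" + "GGGG" + "ACGTTGCAAT")
print(fwd)
```

Runs of consecutive forward matches close to the main diagonal would indicate tandem repeats; an empty or sparse forward list is consistent with the note above.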

Overview of the GC Content Distribution

[Figure: GC content distribution along the sequence. Window size: 300 bp.]

Summary: Full length (7109 bp) | A (29.41%, 2091) | C (18.65%, 1326) | T (30.37%, 2159) | G (21.56%, 1533)

Note: The sequence of the homology arms and cKO region is analyzed to determine its GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
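The overall GC content implied by the base counts in the summary, and a windowed GC profile like the one plotted, can be computed as in the sketch below. The counts are taken from the summary line; the function names and the toy sequence are illustrative assumptions, and the actual 7109 bp sequence is not reproduced in this report.

```python
# Base counts as reported in the summary line (full length 7109 bp).
REPORTED_COUNTS = {"A": 2091, "C": 1326, "T": 2159, "G": 1533}


def gc_percent_from_counts(counts: dict) -> float:
    """Overall GC percentage from a dict of base counts."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


def gc_windows(seq: str, window: int = 300, step: int = 1):
    """Return (start, gc_fraction) for each window of the given size."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        profile.append((start, gc / window))
    return profile


overall_gc = round(gc_percent_from_counts(REPORTED_COUNTS), 2)
print(sum(REPORTED_COUNTS.values()), overall_gc)

# Toy sequence only: a fully GC window followed by a fully AT window.
toy_profile = gc_windows("GC" * 150 + "AT" * 150, window=300, step=300)
print(toy_profile)
```

The reported counts sum to the stated full length and give an overall GC content of about 40%, which is consistent with the absence of GC-rich stretches noted above.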


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr8   +       22899859   22902858   3000
YourSeq  113    924    1383  3000   82.9%     chr15  -       76983197   76983552   356
YourSeq  111    771    902   3000   94.5%     chrX   -       71719257   71719393   137
YourSeq  111    772    900   3000   92.8%     chr2   +       59321406   59321533   128
YourSeq  109    772    898   3000   93.6%     chr17  -       92021200   92021327   128
YourSeq  107    772    906   3000   88.2%     chr1   +       99419848   99419978   131
YourSeq  106    772    906   3000   92.2%     chr13  +       88722199   88722334   136
YourSeq  105    772    906   3000   92.7%     chr2   +       37292801   37292935   135
YourSeq  104    945    1382  3000   82.4%     chr15  -       88506500   88506881   382
YourSeq  103    948    1391  3000   82.8%     chr16  -       5399697    5400017    321
YourSeq  102    1043   1312  3000   86.0%     chr12  -       79862332   79862555   224
YourSeq  102    1043   1404  3000   94.2%     chr1   +       132873355  132873862  508
YourSeq  101    949    1383  3000   82.1%     chr14  -       29811881   29812200   320
YourSeq  95     924    1314  3000   80.4%     chr8   -       90124480   90124792   313
YourSeq  95     908    1147  3000   79.2%     chr4   +       126549294  126549487  194
YourSeq  94     1043   1383  3000   77.7%     chr8   -       9668926    9669232    307
YourSeq  94     1043   1395  3000   90.6%     chr1   -       52569394   52569746   353
YourSeq  94     943    1383  3000   82.6%     chr8   +       95041019   95041397   379
YourSeq  92     1043   1383  3000   82.7%     chr2   -       71605721   71605973   253
YourSeq  90     1043   1383  3000   90.3%     chr12  +       28796859   28797223   365

Note: The 3000 bp section upstream of exon 4 is BLAT searched against the genome. Apart from the expected 100% match at the Kat6a locus on chr8, no significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr8   +       22903468   22906467   3000
YourSeq  96     2546   2859  3000   84.0%     chr8   -       116841556  116841903  348
YourSeq  95     2551   2859  3000   91.6%     chr14  +       119862765  119863425  661
YourSeq  92     2547   2688  3000   88.5%     chr5   -       112880194  112880371  178
YourSeq  86     2554   2801  3000   89.9%     chr13  +       113838940  113839222  283
YourSeq  81     2588   2804  3000   94.7%     chr10  +       117581484  117581753  270
YourSeq  80     2546   2685  3000   83.4%     chr14  -       44266833   44267001   169
YourSeq  80     2568   2688  3000   91.7%     chr8   +       24272475   24272618   144
YourSeq  79     2543   2676  3000   87.9%     chr13  +       21732955   21733104   150
YourSeq  79     2549   2685  3000   83.8%     chr12  +       41790981   41791146   166
YourSeq  78     2546   2685  3000   90.7%     chr5   +       92916766   92916933   168
YourSeq  78     2549   2685  3000   91.5%     chr1   +       52602694   52912045   309352
YourSeq  77     2546   2688  3000   89.8%     chr13  +       59698466   59698637   172
YourSeq  76     2556   2685  3000   84.7%     chr14  -       117995099  117995258  160
YourSeq  76     2546   2685  3000   89.6%     chr16  +       71136730   71136897   168
YourSeq  76     2549   2801  3000   85.2%     chr1   +       37262562   37262870   309
YourSeq  75     2549   2685  3000   90.4%     chr13  -       9410197    9410362    166
YourSeq  75     2562   2688  3000   86.6%     chr10  -       83515330   83515492   163
YourSeq  75     2543   2687  3000   83.1%     chr10  -       45767217   45767398   182
YourSeq  75     2549   2685  3000   83.2%     chr1   -       47187759   47187924   166

Note: The 3000 bp section downstream of exon 4 is BLAT searched against the genome. Apart from the expected 100% match at the Kat6a locus on chr8, no significant similarity is found.
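One way to screen BLAT hit rows such as those listed above is sketched below. The parser accepts rows with or without the leading "browser details" link text. The identity and query-coverage thresholds are hypothetical choices for illustration only; this report does not state the criterion used to judge significance.

```python
from typing import List, NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int

    @property
    def query_coverage(self) -> float:
        """Fraction of the query covered by this hit."""
        return (self.q_end - self.q_start + 1) / self.q_size


def parse_hit(line: str) -> BlatHit:
    """Parse one whitespace-separated BLAT result row."""
    fields = line.split()
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]  # drop link text if present
    if len(fields) != 11:
        raise ValueError(f"expected 11 fields, got {len(fields)}: {line!r}")
    _query, score, qs, qe, qsize, ident, chrom, strand, ts, te, span = fields
    return BlatHit(int(score), int(qs), int(qe), int(qsize),
                   float(ident.rstrip("%")), chrom, strand,
                   int(ts), int(te), int(span))


def strong_hits(hits: List[BlatHit],
                min_identity: float = 90.0,   # HYPOTHETICAL threshold
                min_coverage: float = 0.5):   # HYPOTHETICAL threshold
    """Hits that are both highly identical and cover much of the query."""
    return [h for h in hits
            if h.identity >= min_identity and h.query_coverage >= min_coverage]


# First three rows of the upstream search.
rows = [
    "YourSeq 3000 1 3000 3000 100.0% chr8 + 22899859 22902858 3000",
    "YourSeq 113 924 1383 3000 82.9% chr15 - 76983197 76983552 356",
    "YourSeq 111 771 902 3000 94.5% chrX - 71719257 71719393 137",
]
hits = [parse_hit(r) for r in rows]
strong = strong_hits(hits)
print([h.chrom for h in strong])
```

With these example thresholds only the self-match on chr8 is retained; the remaining hits are short partial matches covering a small fraction of the 3000 bp query.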


Gene and protein information:

Kat6a K(lysine) acetyltransferase 6A [Mus musculus (house mouse)]
Gene ID: 244349, updated on 24-Oct-2019

Gene summary

Official Symbol: Kat6a (provided by MGI)
Official Full Name: K(lysine) acetyltransferase 6A (provided by MGI)
Primary source: MGI:MGI:2442415
See related: Ensembl:ENSMUSG00000031540
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: MOZ; Myst3; Zfp220; 1500036M03; 9930021N24Rik
Expression: Ubiquitous expression in thymus adult (RPKM 13.7), CNS E11.5 (RPKM 10.5) and 28 other tissues

Genomic context

Location: 8; 8 A2

Exon count: 19

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  8    NC_000074.6 (22859442..22943259)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    8    NC_000074.5 (23970011..24053734)

[Figure: genomic context of Kat6a on chromosome 8 (NC_000074.6).]


Transcript information: This gene has 4 transcripts

Gene: Kat6a ENSMUSG00000031540

Description: K(lysine) acetyltransferase 6A [Source:MGI Symbol;Acc:MGI:2442415]
Gene Synonyms: 9930021N24Rik, MOZ, Myst3, Zfp220
Location: Chromosome 8: 22,859,535-22,943,259 forward strand. GRCm38:CM001001.2
About this gene: This gene has 4 transcripts (splice variants), 221 orthologues, 9 paralogues, is a member of 1 Ensembl protein family and is associated with 59 phenotypes.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype         CCDS       UniProt  Flags
Kat6a-201  ENSMUST00000044331.6  9113  2003aa      ENSMUSP00000038181.6  Protein coding  CCDS40294  G3X940   TSL:1, GENCODE basic, APPRIS P2
Kat6a-202  ENSMUST00000110696.7  9041  2003aa      ENSMUSP00000106324.1  Protein coding  CCDS40294  G3X940   TSL:1, GENCODE basic, APPRIS P2
Kat6a-204  ENSMUST00000238975.1  9191  2053aa      ENSMUSP00000159155.1  Protein coding  -          -        GENCODE basic, APPRIS ALT2
Kat6a-203  ENSMUST00000130718.1  419   No protein  -                     lncRNA          -          -        TSL:5

[Figure: Ensembl genome browser view of the Kat6a locus (103.72 kb, forward strand, 22.86 Mb to 22.94 Mb). Tracks: transcripts Kat6a-202, Kat6a-204 and Kat6a-201 (protein coding), Kat6a-203 (lncRNA) and Gm45555-201 (TEC); contigs AC142414.3 and AC115361.5; Regulatory Build. Regulation legend: CTCF, Enhancer, Open, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (processed transcript, RNA gene).]


Transcript: ENSMUST00000044331

[Figure: Ensembl protein summary for Kat6a-201 (protein coding; 83.71 kb, forward strand; protein ENSMUSP00000038..., scale 0 to 2003 aa). Annotation tracks:
MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils)
Superfamily: Acyl-CoA N-acyltransferase; Winged helix DNA-binding domain superfamily; Zinc finger, FYVE/PHD-type
SMART: Linker histone H1/H5, domain H15; Zinc finger, PHD-type
Pfam: MYST, zinc finger domain; Zinc finger, PHD-finger; Histone acetyltransferase domain, MYST-type
PROSITE profiles: Histone acetyltransferase domain, MYST-type; Zinc finger, PHD-finger; Linker histone H1/H5, domain H15
PANTHER: PTHR10615; Histone acetyltransferase KAT6A
Gene3D: 3.30.60.60; Winged helix-like DNA-binding domain superfamily; Zinc finger, RING/FYVE/PHD-type; 3.40.630.30
CDD: cd15527; cd04301; cd15618
Sequence variants (dbSNP and all other sources). Variant legend: inframe insertion; missense variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
