https://www.alphaknockout.com

Mouse Atxn7l3 Knockout Project (CRISPR/Cas9)

Objective: To create an Atxn7l3 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Atxn7l3 gene (NCBI Reference Sequence: NM_001098836; Ensembl: ENSMUSG00000059995) is located on mouse chromosome 11. Thirteen exons have been identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 13 (Transcript: ENSMUST00000107132). Exons 2~13 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mouse embryonic stem cells homozygous for a knock-out allele exhibit strikingly increased H2B monoubiquitination (H2Bub) levels and fail to show loss of global H2Bub following inhibition of transcriptional elongation.

Exon 2 starts at about 0.09% of the coding region. Exons 2~13 cover 100.0% of the coding region. The size of the effective KO region is ~3340 bp. The KO region does not contain any other known gene.

Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3' with exons 1 to 13 and two gRNA regions marked. Legend: exon of mouse Atxn7l3; knockout region.]

Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: forward; reverse complement.]

Note: The 2000 bp section upstream of start codon is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
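
The self-alignment described in the note above can be sketched as follows. This is a minimal illustration, not the tool that produced the plot: it assumes that a dot is drawn wherever two windows of the stated size match exactly, either in the forward orientation or as reverse complements. The function names are hypothetical.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 15):
    """Find exact window matches of a sequence against itself.

    Returns two sorted lists of (i, j) start positions (0-based):
    forward matches with i < j (the trivial main diagonal is omitted),
    and reverse-complement matches.
    """
    seq = seq.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for kmer, starts in positions.items():
        # The same window at two different positions suggests a repeat.
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                forward.append((starts[a], starts[b]))
        # A window whose reverse complement also occurs in the sequence.
        for i in starts:
            for j in positions.get(reverse_complement(kmer), []):
                reverse.append((i, j))
    return sorted(forward), sorted(reverse)


# Toy sequences with a short window, for illustration only.
fwd, rev = self_dot_plot("ACGGTCATTTACGGTCA", window=7)
print(fwd, rev)
```

A region free of tandem repeats yields no runs of forward matches close to the main diagonal, which is the pattern the note reports for this 2000 bp section.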

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: forward; reverse complement.]

Note: The 2000 bp section downstream of stop codon is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (up)

Window size: 300 bp

Summary: Full Length(2000bp) | A(15.05% 301) | C(30.4% 608) | T(16.45% 329) | G(38.1% 762)

Note: The 2000 bp section upstream of start codon is analyzed to determine the GC content. Significant high GC-content regions are found. The gRNA site is selected outside of these high GC-content regions.
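
As a worked check of the upstream summary above, the base counts sum to 2000 and give an overall GC content of (608 + 762) / 2000 = 68.5%. The sketch below reproduces that arithmetic and shows one way a 300 bp sliding-window GC profile could be computed. It is an illustration under assumptions: the step size of 1 bp and the function names are hypothetical and are not taken from the report.

```python
def base_percentages(counts: dict) -> dict:
    """Convert base counts to percentages of the total length."""
    total = sum(counts.values())
    return {base: 100.0 * n / total for base, n in counts.items()}


def sliding_gc(seq: str, window: int = 300) -> list:
    """GC fraction of every window of the given size, moving 1 bp at a time."""
    seq = seq.upper()
    if window <= 0 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    gc = sum(base in "GC" for base in seq[:window])
    fractions = [gc / window]
    for start in range(1, len(seq) - window + 1):
        gc -= seq[start - 1] in "GC"           # base leaving the window
        gc += seq[start + window - 1] in "GC"  # base entering the window
        fractions.append(gc / window)
    return fractions


# Counts from the upstream summary line in this report.
upstream_counts = {"A": 301, "C": 608, "T": 329, "G": 762}
percentages = base_percentages(upstream_counts)
gc_percent = percentages["C"] + percentages["G"]
print(sum(upstream_counts.values()), round(gc_percent, 2))

# Toy sequence with a short window, for illustration only.
print(sliding_gc("GGCCAATT", window=4))
```

Windows whose GC fraction is well above the overall value are the high GC-content regions that the gRNA site is chosen to avoid.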

Overview of the GC Content Distribution (down)

Window size: 300 bp

Summary: Full Length(2000bp) | A(17.5% 350) | C(29.15% 583) | T(24.3% 486) | G(29.05% 581)

Note: The 2000 bp section downstream of stop codon is analyzed to determine the GC content. Significant high GC-content regions are found. The gRNA site is selected outside of these high GC-content regions.

BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr11  -       102295000  102296999  2000
YourSeq  42     306    544   2000   95.7%     chr1   -       36939301   37327656   388356
YourSeq  29     617    650   2000   84.4%     chr8   -       57804152   57804183   32
YourSeq  29     124    190   2000   96.8%     chr1   +       128308460  128308558  99
YourSeq  27     797    830   2000   96.7%     chr9   -       46012796   46012829   34
YourSeq  26     630    661   2000   77.8%     chr1   -       59889036   59889062   27
YourSeq  25     300    326   2000   100.0%    chr13  -       109261391  109261424  34
YourSeq  25     630    660   2000   77.0%     chr1   -       59889030   59889055   26
YourSeq  24     522    545   2000   100.0%    chrX   -       68678730   68678753   24
YourSeq  23     524    546   2000   100.0%    chr12  -       112929725  112929747  23
YourSeq  23     460    482   2000   100.0%    chr9   +       22594659   22594681   23
YourSeq  23     1855   1878  2000   100.0%    chr1   +       23308308   23308333   26
YourSeq  22     303    324   2000   100.0%    chr7   -       120440470  120440491  22
YourSeq  22     303    324   2000   100.0%    chr19  -       20362601   20362622   22
YourSeq  21     303    324   2000   100.0%    chr1   +       56971956   56971978   23
YourSeq  20     524    543   2000   100.0%    chr15  +       93336889   93336908   20

Note: The 2000 bp section upstream of start codon is BLAT searched against the genome. No significant similarity is found.
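
One way to screen a BLAT result list like the one above is to set aside the full-length self hit and list any remaining hit above a chosen score. The sketch below is illustrative only: the `min_score` cutoff is a hypothetical parameter, since the report does not state the criterion behind "no significant similarity", and the four rows are copied from the upstream results.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one whitespace-separated result row (query name in column 1)."""
    f = line.split()
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def is_self_hit(hit: BlatHit) -> bool:
    """A hit covering the whole query at 100% identity."""
    return hit.q_start == 1 and hit.q_end == hit.q_size and hit.identity == 100.0


def secondary_hits(hits, min_score: int):
    """Hits other than the self hit that score at least min_score."""
    return [h for h in hits if not is_self_hit(h) and h.score >= min_score]


rows = """\
YourSeq 2000 1 2000 2000 100.0% chr11 - 102295000 102296999 2000
YourSeq 42 306 544 2000 95.7% chr1 - 36939301 37327656 388356
YourSeq 29 617 650 2000 84.4% chr8 - 57804152 57804183 32
YourSeq 29 124 190 2000 96.8% chr1 + 128308460 128308558 99
"""
hits = [parse_hit(line) for line in rows.splitlines()]

# min_score=40 is an arbitrary illustrative cutoff, not the report's criterion.
for hit in secondary_hits(hits, min_score=40):
    print(hit.chrom, hit.score, hit.identity)
```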

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr11  -       102289658  102291657  2000
YourSeq  279    1686   1999  2000   95.2%     chr8   +       81164038   81164360   323
YourSeq  51     790    852   2000   93.4%     chr10  -       112926977  112927040  64
YourSeq  24     106    129   2000   100.0%    chr5   +       123503985  123504008  24
YourSeq  23     1763   1785  2000   100.0%    chr11  -       19634960   19634982   23
YourSeq  22     876    900   2000   95.9%     chr1   -       62916266   62916291   26
YourSeq  22     702    723   2000   100.0%    chr3   +       107801466  107801487  22

Note: The 2000 bp section downstream of stop codon is BLAT searched against the genome. No significant similarity is found.
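
The chr11 self hits of the two BLAT searches allow a rough consistency check on the stated KO region size. The sketch below assumes that the upstream query ends immediately before the start codon and that the downstream query begins immediately after the stop codon. The report does not state this explicitly, so treat the result as an estimate. Because the gene lies on the reverse strand, the upstream window has the higher coordinates.

```python
def interval_length(start: int, end: int) -> int:
    """Length of a closed, 1-based coordinate interval."""
    return end - start + 1


def gap_between(lower: tuple, upper: tuple) -> int:
    """Number of bases strictly between two non-overlapping closed intervals."""
    lower_end = lower[1]
    upper_start = upper[0]
    if lower_end >= upper_start:
        raise ValueError("intervals overlap or are in the wrong order")
    return upper_start - lower_end - 1


# chr11 self hits from the two BLAT tables in this report.
downstream_window = (102289658, 102291657)  # 2000 bp downstream of the stop codon
upstream_window = (102295000, 102296999)    # 2000 bp upstream of the start codon

assert interval_length(*downstream_window) == 2000
assert interval_length(*upstream_window) == 2000

region = gap_between(downstream_window, upstream_window)
print(region)  # bases between the two windows on chr11
```

Under these assumptions the region between the two windows is 3342 bp, in line with the ~3340 bp effective KO region given in the strategy summary.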

Gene and information: Atxn7l3 ataxin 7-like 3 [ Mus musculus (house mouse) ] Gene ID: 217218, updated on 15-Aug-2019

Gene summary

Official Symbol: Atxn7l3 (provided by MGI)
Official Full Name: ataxin 7-like 3 (provided by MGI)
Primary source: MGI:MGI:3036270
See related: Ensembl:ENSMUSG00000059995
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: E030022H21Rik
Expression: Ubiquitous expression in cortex adult (RPKM 32.3), whole brain E14.5 (RPKM 28.1) and 28 other tissues

Genomic context

Location: 11; 11 D
Exon count: 14

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  11   NC_000077.6 (102289300..102297045, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    11   NC_000077.5 (102150614..102157943, complement)

Chromosome 11 - NC_000077.6

Transcript information: This gene has 6 transcripts

Gene: Atxn7l3 ENSMUSG00000059995

Description: ataxin 7-like 3 [Source:MGI Symbol;Acc:MGI:3036270]
Gene Synonyms: E030022H21Rik
Location: Chromosome 11: 102,289,300-102,296,631 reverse strand. GRCm38:CM001004.2
About this gene: This gene has 6 transcripts (splice variants), 225 orthologues, 4 paralogues, is a member of 1 Ensembl protein family and is associated with 9 phenotypes.

Transcripts

Name         Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Atxn7l3-203  ENSMUST00000107134.7  3712  347aa       ENSMUSP00000102752.1  Protein coding   CCDS48941  A2AWT3   TSL:5 GENCODE basic APPRIS ALT1
Atxn7l3-202  ENSMUST00000107132.2  3706  354aa       ENSMUSP00000102750.2  Protein coding   CCDS48942  A2AWT3   TSL:5 GENCODE basic APPRIS P4
Atxn7l3-201  ENSMUST00000073234.8  3688  347aa       ENSMUSP00000072967.2  Protein coding   CCDS48941  A2AWT3   TSL:1 GENCODE basic APPRIS ALT1
Atxn7l3-204  ENSMUST00000137387.7  3408  350aa       ENSMUSP00000122610.1  Protein coding   -          Z4YN62   CDS 5' incomplete TSL:5
Atxn7l3-205  ENSMUST00000141516.7  3130  258aa       ENSMUSP00000121917.1  Protein coding   -          F6RCN9   CDS 5' incomplete TSL:5
Atxn7l3-206  ENSMUST00000145484.1  3102  No protein  -                     Retained intron  -          -        TSL:1

[Figure: Ensembl genome browser view of a 27.33 kb region of chromosome 11 (102.28 Mb to 102.30 Mb), contig AL954730.15. Forward strand: Asb16-201 (protein coding) and Tmub2-201 to Tmub2-205 (protein coding and lncRNA). Reverse strand: Atxn7l3-201 to Atxn7l3-206 and Ubtf-201 to Ubtf-206, Ubtf-209, Ubtf-210 and Ubtf-213. Tracks: genes (comprehensive set) and Regulatory Build. Regulation legend: CTCF; open chromatin; promoter; promoter flank. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (processed transcript; RNA gene).]

Transcript: ENSMUST00000107132

[Figure: Ensembl protein summary for Atxn7l3-202 (protein coding, reverse strand, 7.33 kb), protein ENSMUSP00000102..., with a scale bar from 0 to 354 amino acids. Domain tracks: MobiDB lite; low complexity (Seg); Pfam (SAGA complex, Sgf11 subunit; SCA7 domain); PROSITE profiles (SCA7 domain); PANTHER (PTHR46367; PTHR46367:SF2); HAMAP (SAGA complex, Sgf11 subunit); Gene3D (3.30.160.60). Variant track: sequence variants (dbSNP and all other sources). Variant legend: splice region variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
