https://www.alphaknockout.com

Mouse Tra2a Knockout Project (CRISPR/Cas9)

Objective: To create a Tra2a knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Tra2a gene (NCBI Reference Sequence: NM_198102; Ensembl: ENSMUSG00000029817) is located on mouse chromosome 6. Eight exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 8 (Transcript: ENSMUST00000031841). Exons 2 to 6 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts from about 4.37% of the coding region. Exons 2 to 6 cover 86.76% of the coding region. The size of the effective KO region is ~7040 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3' with the numbered exons of mouse Tra2a and two gRNA regions flanking the knockout region. Legend: exon of mouse Tra2a; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot. Legend: Forward, Reverse Complement. Axis label: Sequence 12.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine whether it contains tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot. Legend: Forward, Reverse Complement. Axis label: Sequence 12.]

Note: The 540 bp section downstream of Exon 6 is aligned with itself to determine whether it contains tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
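The self-alignment described in the two notes above can be sketched in a few lines of Python. This is a minimal illustration, not the tool that produced the report's dot plots: the function names `reverse_complement` and `self_dot_plot` are hypothetical, and exact matching of 15 bp windows is an assumption, since the report gives only the window size and not the scoring rule.

```python
from typing import Dict, List, Tuple


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def self_dot_plot(seq: str, window: int = 15) -> List[Tuple[int, int, str]]:
    """List exact window-length matches of a sequence against itself.

    Each hit is (i, j, strand) with 0-based window starts i <= j.
    "+" means the windows at i and j are identical (the trivial i == j
    diagonal is skipped); "-" means the window at i equals the reverse
    complement of the window at j.
    Assumption: exact matching; real dot plot tools may tolerate mismatches.
    """
    seq = seq.upper()
    last_start = len(seq) - window
    # Index every window by its sequence so lookups are O(1).
    index: Dict[str, List[int]] = {}
    for i in range(last_start + 1):
        index.setdefault(seq[i:i + window], []).append(i)

    hits: List[Tuple[int, int, str]] = []
    for i in range(last_start + 1):
        word = seq[i:i + window]
        for j in index.get(word, []):
            if j > i:  # skip the main diagonal and mirrored duplicates
                hits.append((i, j, "+"))
        for j in index.get(reverse_complement(word), []):
            if j >= i:  # j == i is a window that is its own reverse complement
                hits.append((i, j, "-"))
    return hits
```

In such a sketch, "+" hits lying close to the main diagonal would correspond to tandem repeats, and an empty or near-empty hit list would correspond to the "no significant tandem repeat" conclusion stated in the notes.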


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution plot. Axis label: Sequence 12.]

Summary: Full Length (2000 bp) | A (31.9%, 638) | C (12.65%, 253) | T (33.95%, 679) | G (21.5%, 430)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution plot. Axis label: Sequence 12.]

Summary: Full Length (540 bp) | A (24.26%, 131) | C (15.19%, 82) | T (37.59%, 203) | G (22.96%, 124)

Note: The 540 bp section downstream of Exon 6 is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
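The overall GC content of each flank follows directly from the base counts in the two Summary lines, and the per-window profile can be sketched as a sliding window. The code below is an illustrative sketch only: the names `gc_percent` and `sliding_gc` are hypothetical, and the step of 1 bp between windows is an assumption, since the report states only the 300 bp window size.

```python
from typing import Dict, List


def gc_percent(counts: Dict[str, int]) -> float:
    """Overall GC percentage from per-base counts (keys A, C, G, T)."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


def sliding_gc(seq: str, window: int = 300, step: int = 1) -> List[float]:
    """GC percentage of each window of the given size along a sequence.

    Assumption: windows advance by `step` bases; the report does not state
    the step used for its plots.
    """
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        profile.append(100.0 * (chunk.count("G") + chunk.count("C")) / window)
    return profile


# Base counts copied from the Summary lines of the report.
upstream = {"A": 638, "C": 253, "T": 679, "G": 430}    # 2000 bp upstream of Exon 2
downstream = {"A": 131, "C": 82, "T": 203, "G": 124}   # 540 bp downstream of Exon 6

print(f"upstream GC:   {gc_percent(upstream):.2f}%")    # (253 + 430) / 2000 = 34.15%
print(f"downstream GC: {gc_percent(downstream):.2f}%")  # (82 + 124) / 540 = 38.15%
```

Both flanks come out well under 50% GC overall, which is consistent with the notes above; the sliding-window profile would be needed to rule out short GC-rich stretches within a flank.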


BLAT Search Results (up)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
---------------------------------------------------------------------------------------
YourSeq   2000      1  2000   2000    100.0%  chr6   -        49252516   49254515  2000
YourSeq     24   1256  1280   2000    100.0%  chr1   -        96102106   96102131    26

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
---------------------------------------------------------------------------------------
YourSeq    540      1   540    540    100.0%  chr6   -        49244936   49245475   540
YourSeq     46    342   424    540     92.5%  chr18  -        37735755   37736179   425
YourSeq     40    322   399    540     78.2%  chr11  +        62474917   62474993    77
YourSeq     39    351   401    540     88.3%  chr13  +        17681976   17682026    51
YourSeq     39    129   489    540     54.6%  chr10  +       128063406  128063493    88
YourSeq     37    351   399    540     87.8%  chr4   +       139015570  139015618    49
YourSeq     37    349   399    540     86.3%  chr19  +        47689010   47689060    51
YourSeq     33    349   399    540     82.4%  chr2   -        44944087   44944137    51
YourSeq     33    353   401    540     83.7%  chr14  +        26978728   26978776    49
YourSeq     31    353   436    540     91.9%  chr5   -       150641039  150641123    85
YourSeq     30    355   400    540     82.7%  chr7   +        47707351   47707396    46
YourSeq     29    351   399    540     79.6%  chr15  -       100262574  100262622    49
YourSeq     29    453   485    540     94.0%  chr1   +       161248570  161248602    33
YourSeq     28    454   485    540     93.8%  chr10  -        17924420   17924451    32
YourSeq     28    351   400    540     78.0%  chr2   +        33939530   33939579    50
YourSeq     27    326   366    540     89.7%  chr8   -        84008389   84008428    40
YourSeq     27    342   376    540     84.9%  chr6   -       113513811  113513844    34
YourSeq     26    304   333    540     86.3%  chr1   -        56788239   56788267    29
YourSeq     26    322   362    540     85.8%  chr11  +         6295714    6295752    39
YourSeq     25    351   397    540     76.6%  chr7   -        64694691   64694737    47

Note: The 540 bp section downstream of Exon 6 is BLAT searched against the genome. No significant similarity is found.
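The top BLAT hit in each table gives the chr6 coordinates of the two flanking sections, and these can be cross-checked against the stated KO region size. The sketch below assumes that the 2000 bp and 540 bp sections directly adjoin the KO region and that the coordinates are 1-based and inclusive; neither point is stated explicitly in the report. The `Hit` class and `region_between` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Hit:
    """One BLAT hit, reduced to its genomic placement."""
    chrom: str
    strand: str
    start: int  # assumed 1-based, inclusive
    end: int    # assumed 1-based, inclusive

    @property
    def span(self) -> int:
        return self.end - self.start + 1


def region_between(a: Hit, b: Hit) -> Tuple[int, int]:
    """Coordinates of the interval lying strictly between two hits."""
    left, right = sorted((a, b), key=lambda hit: hit.start)
    return left.end + 1, right.start - 1


# Top hits copied from the two BLAT tables of the report.
upstream_flank = Hit("chr6", "-", 49252516, 49254515)    # 2000 bp upstream of Exon 2
downstream_flank = Hit("chr6", "-", 49244936, 49245475)  # 540 bp downstream of Exon 6

ko_start, ko_end = region_between(upstream_flank, downstream_flank)
ko_size = ko_end - ko_start + 1
print(f"chr6:{ko_start}-{ko_end} ({ko_size} bp)")  # chr6:49245476-49252515 (7040 bp)
```

Under these assumptions the interval between the flanks is 7040 bp, which agrees with the effective KO region size given in the strategy summary. Because Tra2a lies on the reverse strand, the section upstream of Exon 2 has the higher genomic coordinates.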


Gene and information: Tra2a transformer 2 alpha [ Mus musculus (house mouse) ] Gene ID: 101214, updated on 10-Oct-2019

Gene summary

Official Symbol: Tra2a (provided by MGI)
Official Full Name: transformer 2 alpha (provided by MGI)
Primary source: MGI:MGI:1933972
See related: Ensembl:ENSMUSG00000029817
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: mAWMS1; AL022798; 1500010G04Rik; G430041M01Rik
Expression: Broad expression in CNS E11.5 (RPKM 66.1), CNS E14 (RPKM 56.3) and 24 other tissues

Genomic context

Location: 6; 6 B2.3
Exon count: 10

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  6    NC_000072.6 (49243921..49264248, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    6    NC_000072.5 (49193920..49214051, complement)

Chromosome 6 - NC_000072.6


Transcript information: This gene has 5 transcripts

Gene: Tra2a ENSMUSG00000029817

Description: transformer 2 alpha [Source:MGI Symbol;Acc:MGI:1933972]
Gene Synonyms: 1500010G04Rik, G430041M01Rik, mAWMS1
Location: Chromosome 6: 49,243,924-49,264,033 reverse strand. GRCm38:CM000999.2
About this gene: This gene has 5 transcripts (splice variants), 192 orthologues, 1 paralogue, is a member of 1 Ensembl protein family and is associated with 5 phenotypes.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Tra2a-201  ENSMUST00000031841.8  1799  282aa       ENSMUSP00000031841.7  Protein coding           CCDS39486  E9QP00      TSL:5, GENCODE basic, APPRIS P3
Tra2a-204  ENSMUST00000204189.2  1777  280aa       ENSMUSP00000145039.1  Protein coding           CCDS85044  A0A0N4SVC2  TSL:1, GENCODE basic, APPRIS ALT2
Tra2a-202  ENSMUST00000203820.2  2090  59aa        ENSMUSP00000144908.1  Nonsense mediated decay  -          A0A0N4SV13  TSL:1
Tra2a-203  ENSMUST00000204013.1  3788  No protein  -                     Retained intron          -          -           TSL:NA
Tra2a-205  ENSMUST00000204818.1  3040  No protein  -                     Retained intron          -          -           TSL:1


[Figure: genome browser view of the Tra2a locus, 40.11 kb, from 49.24 Mb to 49.27 Mb.
Forward strand: Gm43980-201 (TEC); contig AC158667.8.
Reverse strand: Tra2a-202 (nonsense mediated decay), Tra2a-204 (protein coding), Tra2a-205 (retained intron), Tra2a-201 (protein coding), Tra2a-203 (retained intron).
Tracks: Contigs; Genes (Comprehensive set...); Regulatory Build.
Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site.
Gene legend: Protein Coding (merged Ensembl/Havana, Ensembl protein coding); Non-Protein Coding (processed transcript).]


Transcript: ENSMUST00000031841

[Figure: protein domain view of Tra2a-201 (protein coding), reverse strand, 20.10 kb; protein ENSMUSP00000031...; scale bar 0 to 282.]

Annotation tracks shown in the figure:
MobiDB lite
Low complexity (Seg)
Superfamily: RNA-binding domain superfamily
SMART: RNA recognition motif domain
Pfam: RNA recognition motif domain
PROSITE profiles: RNA recognition motif domain
PANTHER: PTHR15241; PTHR15241:SF3
Gene3D: Nucleotide-binding alpha-beta plait domain superfamily
CDD: cd12363
Sequence variants (dbSNP and all other sources). Variant legend: frameshift variant, splice region variant, synonymous variant.

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
