https://www.alphaknockout.com

Mouse Zap70 Knockout Project (CRISPR/Cas9)

Objective: To create a Zap70 knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Zap70 gene (NCBI Reference Sequence: NM_009539; Ensembl: ENSMUSG00000026117) is located on mouse chromosome 1. 13 exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 13 (Transcript: ENSMUST00000027291). Exons 2~9 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mutant mice show T cell defects. Null mutants lack alpha-beta T cells in the thymus and have fewer T cells in dendritic and intestinal epithelium. Spontaneous and knock-in missense mutations affect T cell receptor signaling, one of the former resulting in severe chronic arthritis.

Exon 2 starts from the coding region. Exons 2~9 cover 69.36% of the coding region. The size of the effective KO region is ~9090 bp. The KO region does not contain any other known gene.
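The figures above can be cross-checked with a little interval arithmetic. The sketch below is illustrative only: it takes the flank coordinates from the BLAT self-hits reported later in this document, and it assumes that the coding sequence length equals the 618 aa protein plus one stop codon. The helper names (`span`, `gap_between`) are made up for this example.

```python
# Flank coordinates copied from the BLAT self-hits in this report (chr1, + strand).
UP_FLANK = (36768812, 36770811)    # 2000 bp section upstream of Exon 2
DOWN_FLANK = (36779893, 36780837)  # 945 bp section downstream of Exon 9


def span(start, end):
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


def gap_between(left, right):
    """Number of bases strictly between two non-overlapping intervals."""
    return right[0] - left[1] - 1


# Assumption (not stated in the report): CDS = 618 codons + 1 stop codon.
PROTEIN_AA = 618
cds_nt = (PROTEIN_AA + 1) * 3
# Approximate number of coding bases covered by the reported 69.36%.
targeted_nt = round(0.6936 * cds_nt)

if __name__ == "__main__":
    print("flank sizes:", span(*UP_FLANK), span(*DOWN_FLANK))
    print("bases between flanks:", gap_between(UP_FLANK, DOWN_FLANK))
    print("assumed CDS length:", cds_nt, "targeted coding bases:", targeted_nt)
```

Under these assumptions the distance between the two flanks comes out close to the reported ~9090 bp. The exact KO size depends on where the gRNAs cut, which this report does not specify.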


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', showing exons 1 to 9 and exon 13 of mouse Zap70, with a gRNA region on each side of the knockout region. Legend: exon of mouse Zap70; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of Sequence 12 against itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
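A self-comparison dot plot of this kind can be sketched as follows. This is a minimal illustration of the general technique (exact word matches with a 15 bp window, on both the forward strand and the reverse complement), not the tool that produced the plots in this report. The function names are hypothetical.

```python
from collections import defaultdict


def revcomp(seq):
    """Reverse complement of a DNA string (A, C, G, T, N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def self_dot_plot(seq, window=15):
    """Return (forward, reverse) lists of (i, j) dots for exact word matches.

    A forward dot (i, j) means the words starting at i and j are identical
    (the trivial main diagonal i == j is skipped). A reverse dot (i, j)
    means the word at j equals the reverse complement of the word at i.
    """
    seq = seq.upper()
    index = defaultdict(list)
    for i in range(len(seq) - window + 1):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(len(seq) - window + 1):
        word = seq[i:i + window]
        for j in index[word]:
            if j != i:
                forward.append((i, j))
        for j in index.get(revcomp(word), []):
            reverse.append((i, j))
    return forward, reverse


if __name__ == "__main__":
    fwd, rev = self_dot_plot("ACGTACGT", window=4)
    print("forward dots:", fwd)
    print("reverse-complement dots:", rev)
```

Runs of off-diagonal forward dots indicate direct repeats (tandem repeats when they lie close to the diagonal). A gRNA site would then be chosen in a stretch with no such dots.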

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of Sequence 12 against itself. Legend: Forward; Reverse Complement.]

Note: The 945 bp section downstream of Exon 9 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content plot for Sequence 12.]

Summary: Full Length(2000bp) | A(24.6% 492) | C(24.35% 487) | T(29.85% 597) | G(21.2% 424)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
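The overall GC percentage follows directly from the base counts in the summary line, and the distribution can be computed with a sliding window. The sketch below assumes a plain sliding window of 300 bp with a step of 1 bp. The report does not state the step size or the threshold it uses for "high GC", so neither is modeled here.

```python
def gc_from_counts(a, c, t, g):
    """Overall GC percentage from base counts."""
    return 100.0 * (g + c) / (a + c + t + g)


def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq, window=300, step=1):
    """GC fraction of each full window along the sequence."""
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # Base counts copied from the two summary lines in this report.
    print("upstream GC%:", round(gc_from_counts(a=492, c=487, t=597, g=424), 2))
    print("downstream GC%:", round(gc_from_counts(a=376, c=233, t=97, g=239), 2))
    # Toy sequence to show the windowed profile.
    print(windowed_gc("GGCCAATT", window=4))
```

Both flanks come out near 45 to 50% GC overall, which agrees with the notes that no strongly GC-rich stretch was found.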

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content plot for Sequence 12.]

Summary: Full Length(945bp) | A(39.79% 376) | C(24.66% 233) | T(10.26% 97) | G(25.29% 239)

Note: The 945 bp section downstream of Exon 9 is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr1   +       36768812   36770811   2000
YourSeq  115    1410   1560  2000   89.2%     chr2   +       164800113  164800263  151
YourSeq  113    1412   1568  2000   84.3%     chr1   -       144968815  144968968  154
YourSeq  110    1424   1615  2000   88.3%     chr1   -       135240233  135240619  387
YourSeq  109    1426   1568  2000   88.2%     chr11  -       95700599   95700741   143
YourSeq  108    1410   1550  2000   90.3%     chr4   -       90358844   90359107   264
YourSeq  107    1411   1547  2000   88.2%     chr1   -       79681172   79681307   136
YourSeq  104    1411   1546  2000   86.7%     chr5   -       80675865   80675999   135
YourSeq  104    1411   1546  2000   88.9%     chr4   -       115034126  115034283  158
YourSeq  104    1412   1547  2000   84.8%     chr8   +       93232006   93232137   132
YourSeq  103    1410   1546  2000   88.9%     chr10  -       7304365    7304508    144
YourSeq  103    1411   1555  2000   83.0%     chr1   -       109089499  109089634  136
YourSeq  103    1410   1547  2000   88.9%     chr4   +       129690457  129690595  139
YourSeq  101    1416   1555  2000   84.7%     chr11  -       60075057   60075194   138
YourSeq  100    1408   1555  2000   87.9%     chr18  +       58142202   58142350   149
YourSeq  100    1410   1546  2000   84.3%     chr1   +       60673946   60674079   134
YourSeq  99     1408   1552  2000   90.4%     chrX   +       10922059   10922211   153
YourSeq  99     1412   1546  2000   86.1%     chrX   +       8173849    8173981    133
YourSeq  98     1412   1547  2000   85.1%     chr4   -       80293424   80293558   135
YourSeq  98     1412   1569  2000   80.0%     chr10  -       128185208  128185353  146

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
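Rows like those in the BLAT table can be read programmatically to separate the query's own locus from secondary hits. The sketch below is illustrative: it assumes whitespace-separated rows with the eleven columns shown in the header, tolerates the leading "browser details" link text that UCSC BLAT output carries, and treats a hit as the self-hit when it covers the whole query at 100% identity. The function names and that self-hit rule are assumptions made for this example, not part of the report.

```python
def parse_blat_rows(text):
    """Parse BLAT hit rows into dicts; header and malformed lines are skipped."""
    hits = []
    for line in text.strip().splitlines():
        fields = line.split()
        if fields[:2] == ["browser", "details"]:
            fields = fields[2:]
        if len(fields) != 11 or not fields[1].isdigit():
            continue
        hits.append({
            "query": fields[0],
            "score": int(fields[1]),
            "q_start": int(fields[2]),
            "q_end": int(fields[3]),
            "q_size": int(fields[4]),
            "identity": float(fields[5].rstrip("%")),
            "chrom": fields[6],
            "strand": fields[7],
            "t_start": int(fields[8]),
            "t_end": int(fields[9]),
            "span": int(fields[10]),
        })
    return hits


def is_self_hit(hit):
    """Assumed rule: full-length query coverage at 100% identity."""
    return (hit["q_start"] == 1 and hit["q_end"] == hit["q_size"]
            and hit["identity"] == 100.0)


SAMPLE = """
QUERY SCORE START END QSIZE IDENTITY CHROM STRAND START END SPAN
YourSeq 2000 1 2000 2000 100.0% chr1 + 36768812 36770811 2000
YourSeq 115 1410 1560 2000 89.2% chr2 + 164800113 164800263 151
YourSeq 113 1412 1568 2000 84.3% chr1 - 144968815 144968968 154
"""

if __name__ == "__main__":
    hits = parse_blat_rows(SAMPLE)
    secondary = [h for h in hits if not is_self_hit(h)]
    best = max(secondary, key=lambda h: h["score"])
    print("self hits:", sum(is_self_hit(h) for h in hits))
    print("best secondary hit:", best["chrom"], best["score"], best["identity"])
```

For the upstream flank, every secondary hit is short compared with the 2000 bp query and falls in the same query window (roughly positions 1408 to 1615), which suggests a single repetitive element rather than a second copy of the locus.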

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  945    1      945   945    100.0%    chr1   +       36779893   36780837   945
YourSeq  265    133    603   945    91.7%     chr6   +       12714732   12715472   741
YourSeq  259    70     649   945    87.2%     chr16  -       9465667    9466230    564
YourSeq  240    130    649   945    92.4%     chr15  +       83737661   83738473   813
YourSeq  238    121    595   945    87.4%     chr1   +       55188793   55189242   450
YourSeq  237    95     657   945    91.9%     chr12  -       7890083    7890779    697
YourSeq  231    206    657   945    91.7%     chr18  +       13921772   13922275   504
YourSeq  230    142    650   945    91.5%     chr12  +       112936621  112937466  846
YourSeq  227    79     599   945    87.2%     chr10  +       97547141   97547528   388
YourSeq  226    84     646   945    93.0%     chr1   -       192665028  192665657  630
YourSeq  221    250    649   945    91.2%     chr5   -       24656297   24656861   565
YourSeq  221    124    650   945    86.3%     chr3   -       135577332  135577719  388
YourSeq  219    136    660   945    91.7%     chr1   -       4256866    4258124    1259
YourSeq  218    108    644   945    87.1%     chr6   +       81977212   81977637   426
YourSeq  217    136    657   945    84.7%     chr1   +       166493819  166494160  342
YourSeq  213    104    641   945    91.2%     chr18  -       60638573   60639113   541
YourSeq  213    157    662   945    85.0%     chr16  +       17560225   17560616   392
YourSeq  213    136    650   945    90.9%     chr10  +       80361916   80362879   964
YourSeq  210    122    580   945    92.7%     chr3   -       135577156  135577703  548
YourSeq  207    248    650   945    92.8%     chr6   +       12714723   12715343   621

Note: The 945 bp section downstream of Exon 9 is BLAT searched against the genome. No significant similarity is found.


Gene and information: Zap70 zeta-chain (TCR) associated protein kinase [Mus musculus (house mouse)] Gene ID: 22637, updated on 22-Oct-2019

Gene summary

Official Symbol: Zap70 (provided by MGI)

Official Full Name: zeta-chain (TCR) associated protein kinase (provided by MGI)

Primary source: MGI:MGI:99613

See related: Ensembl:ENSMUSG00000026117

Gene type: protein coding

RefSeq status: REVIEWED

Organism: Mus musculus

Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus

Also known as: Srk; mur; mrtle; ZAP-70

Summary: This gene encodes a member of the protein tyrosine kinase family. The encoded protein is essential for development of T lymphocytes and thymocytes, and functions in the initial step of T lymphocyte receptor-mediated signal transduction. A mutation in this gene causes chronic autoimmune arthritis, similar to rheumatoid arthritis in humans. Mice lacking this gene are deficient in alpha-beta T lymphocytes in the thymus. In humans, mutations in this gene cause selective T-cell defect, a severe combined immunodeficiency disease characterized by a selective absence of CD8-positive T lymphocytes. Alternative splicing results in multiple transcript variants. [provided by RefSeq, Jan 2014]

Expression: Biased expression in thymus adult (RPKM 70.4), spleen adult (RPKM 17.5) and 3 other tissues

Genomic context

Location: 1 B; 1 15.41 cM

Exon count: 16

Annotation release  Status             Assembly                       Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)   1    NC_000067.6 (36761798..36782821)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)     1    NC_000067.5 (36818706..36839661)

[Figure: genomic context, Chromosome 1 - NC_000067.6]


Transcript information: This gene has 4 transcripts

Gene: Zap70 ENSMUSG00000026117

Description: zeta-chain (TCR) associated protein kinase [Source:MGI Symbol;Acc:MGI:99613]

Gene Synonyms: Srk, TZK, ZAP-70

Location: Chromosome 1: 36,761,798-36,782,818 forward strand. GRCm38:CM000994.2

About this gene: This gene has 4 transcripts (splice variants), 188 orthologues, 32 paralogues, is a member of 1 Ensembl protein family and is associated with 57 phenotypes.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt     Flags
Zap70-201  ENSMUST00000027291.6  2245  618aa       ENSMUSP00000027291.4  Protein coding   CCDS14888  P43404      TSL:1 GENCODE basic APPRIS P1
Zap70-202  ENSMUST00000185871.1  454   85aa        ENSMUSP00000139990.1  Protein coding   -          A0A087WQ05  CDS 3' incomplete TSL:2
Zap70-204  ENSMUST00000190128.1  1338  No protein  -                     Retained intron  -          -           TSL:1
Zap70-203  ENSMUST00000186624.1  808   No protein  -                     Retained intron  -          -           TSL:2

[Figure: Ensembl region view, 41.02 kb, chromosome 1 from 36.76 Mb to 36.79 Mb.
Forward strand genes (Comprehensive set): Zap70-201 (protein coding), Zap70-202 (protein coding), Zap70-204 (retained intron), Zap70-203 (retained intron), 4933424G06Rik-204 (protein coding), 4933424G06Rik-202 (lncRNA).
Contig: AC084389.1.
Reverse strand genes (Comprehensive set): Gm18828-202 (lncRNA), Gm18828-201 (transcribed processed pseudogene), Tmem131-201 (protein coding), Tmem131-208 (protein coding).
Regulatory Build track. Regulation legend: CTCF; Open Chromatin; Promoter; Promoter Flank; Transcription Factor Binding Site.
Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (processed transcript; RNA gene).]


Transcript: ENSMUST00000027291

[Figure: Ensembl protein summary for Zap70-201 (protein coding), 21.02 kb, forward strand, with a scale bar from 0 to 618 aa.
Domain and feature tracks:
MobiDB lite; Low complexity (Seg)
Superfamily: SH2 domain superfamily; Protein kinase-like domain superfamily
SMART: SH2 domain; Tyrosine-protein kinase, catalytic domain
Prints: SH2 domain; Serine-threonine/tyrosine-protein kinase, catalytic domain
Pfam: SH2 domain; Serine-threonine/tyrosine-protein kinase, catalytic domain
PROSITE profiles: SH2 domain; Protein kinase domain
PROSITE patterns: Protein kinase, ATP binding site; Tyrosine-protein kinase, active site
PIRSF: Tyrosine-protein kinase, non-receptor SYK/ZAP-70
PANTHER: PTHR24418; PTHR24418:SF369
Gene3D: SH2 domain superfamily; 3.30.200.20; 1.10.510.10; Tyrosine-protein kinase SYK/ZAP-70, inter-SH2 domain superfamily
CDD: SYK/ZAP-70, N-terminal SH2 domain; cd05115
Sequence variants (dbSNP and all other sources). Variant legend: missense variant; splice region variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
