https://www.alphaknockout.com

Mouse Get4 Knockout Project (CRISPR/Cas9)

Objective: To create a Get4 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Get4 gene (NCBI Reference Sequence: NM_026269; Ensembl: ENSMUSG00000025858) is located on mouse chromosome 5. Nine exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 9 (Transcript: ENSMUST00000026976). Exons 2~8 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts at about 15.9% of the coding region. Exons 2~8 cover 75.43% of the coding region. The size of the effective KO region is ~4995 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: wildtype allele, 5' to 3', showing exons 1 to 9 with a gRNA region on each side of the knockout region. Legend: exon of mouse Get4; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of Sequence 1 against Sequence 2. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
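The self-alignment described in this note can be illustrated with a minimal Python sketch. It assumes exact word matching with a 15 bp window, which may differ from the scoring used by the tool that produced the report's dot plots. The function names `reverse_complement` and `self_dot_plot` are hypothetical.

```python
from collections import defaultdict

# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq, window=15):
    """Compare a sequence with itself using exact word matches.

    Returns two sorted lists of (i, j) start positions (0-based):
    forward hits with i < j, where the words at i and j are identical,
    and reverse-complement hits with i <= j, where the word at j is the
    reverse complement of the word at i. The trivial main diagonal
    (i == j on the forward strand) is omitted.
    """
    seq = seq.upper()
    index = defaultdict(list)
    for i in range(len(seq) - window + 1):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for word, starts in index.items():
        forward.extend((a, b) for a in starts for b in starts if a < b)
        partners = index.get(reverse_complement(word), [])
        reverse.extend((a, b) for a in starts for b in partners if a <= b)
    return sorted(forward), sorted(reverse)


# Toy example with a short window: a 5 bp unit repeated after a spacer.
toy_forward, toy_reverse = self_dot_plot("AAGTCTTTAAGTC", window=5)
```

Off-diagonal forward hits that line up into long runs would indicate direct repeats, while runs of reverse-complement hits would indicate inverted repeats. An empty or sparse result corresponds to the "no significant tandem repeat" reading.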

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of Sequence 1 against Sequence 2. Legend: Forward; Reverse Complement.]

Note: The 1505 bp section downstream of Exon 8 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution plot for the sequence.]

Summary: Full Length(2000bp) | A(21.85% 437) | C(25.1% 502) | T(31.5% 630) | G(21.55% 431)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
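As a rough illustration of the GC analysis, the sketch below computes overall GC content from the base counts given in the report's Summary lines and shows one way a 300 bp sliding window could be evaluated. The helper names and the step size are assumptions. The report does not state how its plots were generated.

```python
def gc_from_counts(counts):
    """Overall GC fraction from a dict of base counts (keys A, C, G, T)."""
    total = sum(counts.values())
    return (counts["G"] + counts["C"]) / total


def gc_fraction(seq):
    """GC fraction of a DNA string; returns 0.0 for an empty string."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq, window=300, step=1):
    """GC fraction of every full window, moving `step` bases at a time."""
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts copied from the Summary lines of the report.
upstream_counts = {"A": 437, "C": 502, "T": 630, "G": 431}      # 2000 bp
downstream_counts = {"A": 288, "C": 412, "T": 421, "G": 384}    # 1505 bp

upstream_gc = gc_from_counts(upstream_counts)      # (502 + 431) / 2000
downstream_gc = gc_from_counts(downstream_counts)  # (412 + 384) / 1505
```

From the reported counts, the overall GC content works out to about 46.7% for the upstream section and about 52.9% for the downstream section. `sliding_gc` would need the actual sequences, which are not included in the report.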

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution plot for the sequence.]

Summary: Full Length(1505bp) | A(19.14% 288) | C(27.38% 412) | T(27.97% 421) | G(25.51% 384)

Note: The 1505 bp section downstream of Exon 8 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.


BLAT Search Results (up)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
-------------------------------------------------------------------------------------
YourSeq   2000      1  2000   2000    100.0%  chr5   +       139260482  139262481  2000
YourSeq     38    552   964   2000     61.4%  chr1   -        21296477   21296767   291
YourSeq     38   1063  1124   2000     80.7%  chr11  +        76848774   76848835    62
YourSeq     36   1183  1322   2000     92.9%  chr12  +       112366927  112367114   188
YourSeq     32   1084  1124   2000     91.2%  chr6   -        81706130   81706169    40
YourSeq     32   1085  1210   2000     94.5%  chr17  +         7449679    7449806   128
YourSeq     30   1245  1322   2000     96.9%  chr2   -        52409504   52409583    80
YourSeq     30   1071  1128   2000     75.9%  chr10  -       118039889  118039946    58
YourSeq     27   1281  1319   2000     84.7%  chr15  +        84637625   84637663    39
YourSeq     25   1072  1096   2000    100.0%  chr1   -       135285800  135285824    25
YourSeq     24   1186  1209   2000    100.0%  chr11  +       121598826  121598849    24
YourSeq     22   1075  1128   2000     70.4%  chr1   -       194317406  194317459    54
YourSeq     22   1082  1105   2000     95.9%  chr1   +       128557842  128557865    24
YourSeq     21    669   689   2000    100.0%  chr1   -        12689143   12689163    21
YourSeq     21    350   372   2000     95.7%  chr1   +        86766456   86766478    23
YourSeq     20   1188  1209   2000     95.5%  chr11  +        57021957   57021978    22
YourSeq     20   1188  1209   2000     95.5%  chr1   +        34695550   34695571    22

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
-------------------------------------------------------------------------------------
YourSeq   1505      1  1505   1505    100.0%  chr5   +       139267477  139268981  1505
YourSeq     58    557   640   1505     90.5%  chr17  -        47810457   47810541    85
YourSeq     57    557   737   1505     92.6%  chr12  -         3986267    3986507   241
YourSeq     56    539   637   1505     82.1%  chr9   -        43831845   43831943    99
YourSeq     56    556   645   1505     85.8%  chr7   -       118212134  118212221    88
YourSeq     56    557   644   1505     81.9%  chr4   -       149297601  149297688    88
YourSeq     53    557   645   1505     91.7%  chr2   -       123816414  123816501    88
YourSeq     53    557   648   1505     87.4%  chr7   +        61450869   61450958    90
YourSeq     52    557   645   1505     82.0%  chr8   -       129260591  129260677    87
YourSeq     52    557   634   1505     83.4%  chr4   -       106855549  106855626    78
YourSeq     52    557   645   1505     90.0%  chr11  -        39022450   39022537    88
YourSeq     51    539   739   1505     68.9%  chr18  -        35072701   35072805   105
YourSeq     51    557   636   1505     87.0%  chr16  -        32397263   32397342    80
YourSeq     51    609   737   1505     91.9%  chr11  -        84493418   84493641   224
YourSeq     51    557   646   1505     86.9%  chr8   +        77635690   77635777    88
YourSeq     51    557   646   1505     86.9%  chr4   +       117057264  117057351    88
YourSeq     51    586   717   1505     85.0%  chr17  +        56369843   56370377   535
YourSeq     50    557   638   1505     89.1%  chrX   -       151643736  151643817    82
YourSeq     50    557   643   1505     88.0%  chr9   -        56352810   56352894    85
YourSeq     49    555   646   1505     86.5%  chr8   -        55410408   55410497    90

Note: The 1505 bp section downstream of Exon 8 is BLAT searched against the genome. No significant similarity is found.
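The chr5 coordinates of the two full-length BLAT hits allow a quick consistency check on the stated KO region size. The sketch below assumes that the 2000 bp upstream section and the 1505 bp downstream section directly flank the deleted region, which the report does not state explicitly. The `Interval` type and the `region_between` helper are hypothetical.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """Genomic interval with 1-based, inclusive coordinates."""
    chrom: str
    start: int
    end: int

    @property
    def length(self):
        return self.end - self.start + 1


def region_between(upstream, downstream):
    """Return the interval lying strictly between two flanking intervals."""
    if upstream.chrom != downstream.chrom:
        raise ValueError("flanks are on different chromosomes")
    if upstream.end >= downstream.start:
        raise ValueError("flanks overlap or are in the wrong order")
    return Interval(upstream.chrom, upstream.end + 1, downstream.start - 1)


# 100% identity hits from the BLAT tables (GRCm38, chr5, + strand).
upstream_flank = Interval("chr5", 139260482, 139262481)
downstream_flank = Interval("chr5", 139267477, 139268981)

# Assumed KO region: everything between the two flanks.
ko_region = region_between(upstream_flank, downstream_flank)
ko_size = ko_region.length
```

Under that assumption the gap between the flanks spans chr5:139262482-139267476, which is 4995 bp and in line with the ~4995 bp effective KO region given in the strategy summary.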


Gene and information: Get4 golgi to ER traffic protein 4 [ Mus musculus (house mouse) ] Gene ID: 67604, updated on 10-Oct-2019

Gene summary

Official Symbol: Get4 (provided by MGI)
Official Full Name: golgi to ER traffic protein 4 (provided by MGI)
Primary source: MGI:MGI:1914854
See related: Ensembl:ENSMUSG00000025858
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Cee; AW412535; 1110007L15Rik
Expression: Ubiquitous expression in ovary adult (RPKM 65.6), adrenal adult (RPKM 60.9) and 28 other tissues
Orthologs: human

Genomic context

Location: 5; 5 G2
Exon count: 10

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  5    NC_000071.6 (139252324..139270050)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    5    NC_000071.5 (139728278..139746004)

Chromosome 5 - NC_000071.6


Transcript information: This gene has 6 transcripts

Gene: Get4 ENSMUSG00000025858

Description: golgi to ER traffic protein 4 [Source:MGI Symbol;Acc:MGI:1914854]
Gene Synonyms: 1110007L15Rik
Location: Chromosome 5: 139,252,324-139,270,051 forward strand. GRCm38:CM000998.2
About this gene: This gene has 6 transcripts (splice variants), 196 orthologues, is a member of 1 Ensembl protein family and is associated with 1 phenotype.

Transcripts

Name      Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Get4-201  ENSMUST00000026976.11  2107  327aa       ENSMUSP00000026976.5  Protein coding   CCDS19805  Q9D1H7   TSL:1 GENCODE basic APPRIS P3
Get4-202  ENSMUST00000110878.1   1180  274aa       ENSMUSP00000106502.1  Protein coding   CCDS51682  Q9D1H7   TSL:1 GENCODE basic APPRIS ALT2
Get4-204  ENSMUST00000130326.7   835   246aa       ENSMUSP00000117473.1  Protein coding   -          D3Z4J5   CDS 3' incomplete TSL:3
Get4-206  ENSMUST00000138508.7   701   197aa       ENSMUSP00000116975.1  Protein coding   -          D3Z7S0   CDS 3' incomplete TSL:5
Get4-203  ENSMUST00000124420.1   1167  No protein  -                     Retained intron  -          -        TSL:2
Get4-205  ENSMUST00000138059.1   390   No protein  -                     Retained intron  -          -        TSL:3


[Figure: genome browser view of a 37.73 kb region (139.25 Mb to 139.28 Mb), contig AC125065.3. Forward strand, comprehensive gene set: Sun1-201, Sun1-202, Sun1-204, Sun1-205, Sun1-206 and Sun1-211 (protein coding), Sun1-210 (nonsense mediated decay); Get4-201, Get4-202, Get4-204 and Get4-206 (protein coding), Get4-203 and Get4-205 (retained intron). Reverse strand: Adap1-201 (protein coding), Adap1-203, Adap1-204 and Adap1-206 (lncRNA), Adap1-202 (retained intron). Regulatory Build legend: CTCF, Enhancer, Promoter, Promoter Flank. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (processed transcript, RNA gene).]


Transcript: ENSMUST00000026976

[Figure: Get4-201 (protein coding), 17.73 kb, forward strand. Protein ENSMUSP00000026... annotation tracks: MobiDB lite; Low complexity (Seg); Pfam: Golgi to ER traffic protein 4; PANTHER: PTHR12875:SF0, Golgi to ER traffic protein 4; Gene3D: Tetratricopeptide-like helical domain superfamily. Variant track: sequence variants (dbSNP and all other sources); legend: missense variant, synonymous variant. Scale bar: 0 to 327.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
