https://www.alphaknockout.com

Mouse Mgat5 Knockout Project (CRISPR/Cas9)

Objective: To create an Mgat5 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Mgat5 gene (NCBI Reference Sequence: NM_145128; Ensembl: ENSMUSG00000036155) is located on mouse chromosome 1. Eighteen exons are identified, with the ATG start codon in exon 3 and the TAG stop codon in exon 18 (Transcript: ENSMUST00000038361). Exon 3 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for deficiencies in this gene have immune system abnormalities and reduced cancer growth and metastasis.

Exon 3 contains the start of the coding region and covers 10.86% of it. The effective KO region is ~383 bp in size and does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: wildtype allele of mouse Mgat5 with exons 1, 3 and 18 labeled and the 5' and 3' gRNA regions marked. Legend: exon of mouse Mgat5; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: self-alignment dot plot of the upstream sequence, showing forward and reverse-complement matches.]

Note: The 2000 bp section upstream of Exon 3 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: self-alignment dot plot of the downstream sequence, showing forward and reverse-complement matches.]

Note: The 2000 bp section downstream of Exon 3 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
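The self-alignment described in the notes above can be sketched in a few lines of Python. This is a minimal illustration, not the tool used to produce the plots. It reports forward-strand matches only (the plots also show reverse-complement matches), the function name `self_dotplot_matches` is hypothetical, and the toy sequence is invented rather than taken from the Mgat5 locus.

```python
from collections import defaultdict

def self_dotplot_matches(seq, window=15):
    """Return sorted (i, j) pairs, i < j, where the window-sized words
    starting at positions i and j of seq are identical (forward strand only)."""
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    matches = []
    for starts in positions.values():
        # Every pair of occurrences of the same word is one off-diagonal dot.
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                matches.append((starts[a], starts[b]))
    return sorted(matches)

# Toy input (an assumption for illustration): an 18 bp unit repeated twice
# in tandem, with short non-repetitive flanks on either side.
unit = "ACGTTGCAAGGCTTAACC"
toy = "TTTTGGGGCCTG" + unit + unit + "TGGGCCCAAAT"

matches = self_dotplot_matches(toy, window=15)
offsets = sorted({j - i for i, j in matches})
print(matches)   # dots parallel to the main diagonal
print(offsets)   # a constant offset equal to the repeat unit length suggests a tandem repeat
```

In a real dot plot, a run of dots at a constant offset from the main diagonal is what marks a tandem repeat; a gRNA site would then be chosen outside the positions covered by such runs.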


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the upstream sequence.]

Summary: Full Length(2000bp) | A(23.35% 467) | C(26.25% 525) | T(27.35% 547) | G(23.05% 461)

Note: The 2000 bp section upstream of Exon 3 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the downstream sequence.]

Summary: Full Length(2000bp) | A(25.0% 500) | C(22.55% 451) | T(28.85% 577) | G(23.6% 472)

Note: The 2000 bp section downstream of Exon 3 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
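As a rough check, the overall GC content of each flank can be derived from the base counts in the two summaries, and the windowed profile can be sketched with a simple sliding window. The code below is an illustrative sketch only: the function names `gc_percent` and `sliding_gc` are hypothetical, the step size is an assumption, and the report does not state what threshold counts as a "significant" high-GC region.

```python
def gc_percent(counts):
    """Overall GC percentage from a dict of base counts."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total

def sliding_gc(seq, window=300, step=1):
    """GC percentage of each window of the given size along seq."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        profile.append(100.0 * (chunk.count("G") + chunk.count("C")) / window)
    return profile

# Base counts copied from the two summary lines (2000 bp each).
upstream = {"A": 467, "C": 525, "T": 547, "G": 461}
downstream = {"A": 500, "C": 451, "T": 577, "G": 472}

up_gc = gc_percent(upstream)      # (525 + 461) / 2000
down_gc = gc_percent(downstream)  # (451 + 472) / 2000
print(f"upstream GC: {up_gc:.2f}%, downstream GC: {down_gc:.2f}%")

# Toy sequence (invented) to show the windowed profile.
toy_profile = sliding_gc("GGCCAATT", window=4, step=4)
print(toy_profile)
```

Both flanks come out a little under 50% GC overall, which is consistent with the notes that no high GC-content region was found.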


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1       2000  2000   100.0%    chr1   +       127304777  127306776  2000
YourSeq  53     117     200   2000   82.1%     chr18  +       43056257   43056337   81
YourSeq  52     128     200   2000   83.9%     chr14  -       45434019   45434089   71
YourSeq  52     115     200   2000   88.9%     chr4   +       134205778  134205862  85
YourSeq  49     127     313   2000   90.2%     chr9   -       114892823  114893061  239
YourSeq  45     115     200   2000   92.6%     chr10  +       63421996   63422090   95
YourSeq  43     126     182   2000   85.0%     chr12  -       110892374  110892428  55
YourSeq  43     128     200   2000   88.0%     chr10  -       40860380   40860451   72
YourSeq  43     117     170   2000   84.7%     chr3   +       89500096   89500147   52
YourSeq  43     133     207   2000   76.8%     chr11  +       78819620   78819693   74
YourSeq  42     127     208   2000   84.7%     chr14  +       105221375  105221454  80
YourSeq  42     130     200   2000   82.2%     chr11  +       30047407   30047475   69
YourSeq  41     133     200   2000   87.8%     chr10  +       67068110   67068176   67
YourSeq  39     130     200   2000   97.7%     chr11  -       87718924   87718998   75
YourSeq  39     683     729   2000   91.5%     chr1   +       120102659  120102705  47
YourSeq  38     168     208   2000   97.5%     chr2   -       134638165  134638206  42
YourSeq  38     127     199   2000   90.3%     chr17  -       8358735    8358805    71
YourSeq  38     128     277   2000   89.6%     chr15  -       99829890   99830039   150
YourSeq  38     130     200   2000   81.9%     chr15  -       11894572   11894638   67
YourSeq  38     132     200   2000   83.7%     chr10  +       121520280  121520346  67

(QSTART/QEND are positions within the 2000 bp query; TSTART/TEND are positions on the listed chromosome.)

Note: The 2000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
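One way to read the hit table is to separate the query's match to its own locus from every other hit and see how small the best remaining score is relative to the query length. The sketch below does this for the first few upstream rows. It is illustrative only: the helper name `best_off_target` is hypothetical, and the report does not state the cutoff it uses for "significant similarity", so the percentage is a descriptive figure, not a pass/fail criterion.

```python
# (chromosome, score) pairs copied from the first rows of the upstream BLAT table.
QUERY_SIZE = 2000
hits = [("chr1", 2000), ("chr18", 53), ("chr14", 52), ("chr4", 52), ("chr9", 49)]

def best_off_target(hits, query_size):
    """Split off the top-scoring (self) hit and describe the best remaining hit."""
    ranked = sorted(hits, key=lambda h: h[1], reverse=True)
    self_hit, others = ranked[0], ranked[1:]
    chrom, score = others[0]
    return self_hit, chrom, score, 100.0 * score / query_size

self_hit, chrom, score, percent = best_off_target(hits, QUERY_SIZE)
print(f"self hit: {self_hit[0]} (score {self_hit[1]})")
print(f"best other hit: {chrom}, score {score} ({percent:.2f}% of query length)")
```

For these rows the best non-self score is a few percent of the 2000 bp query length, which is in line with the note that no significant similarity is found.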

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1       2000  2000   100.0%    chr1   +       127307018  127309017  2000
YourSeq  47     998     1153  2000   98.0%     chr13  -       120053124  120053285  162
YourSeq  44     1922    1985  2000   81.4%     chr10  -       121458680  121458741  62
YourSeq  44     1920    1980  2000   93.9%     chr2   +       11682174   11682234   61
YourSeq  43     1335    1389  2000   93.9%     chr2   -       76340376   76340438   63
YourSeq  42     1920    1995  2000   77.7%     chr1   -       151982488  151982563  76
YourSeq  42     1920    1973  2000   88.9%     chr17  +       48955689   48955742   54
YourSeq  40     1920    1977  2000   84.4%     chr1   -       120520202  120520258  57
YourSeq  40     368     432   2000   81.3%     chr11  +       118154401  118154465  65
YourSeq  39     1914    1969  2000   85.8%     chr17  -       33536970   33537026   57
YourSeq  39     1919    1982  2000   93.2%     chr1   -       120186247  120186311  65
YourSeq  39     1918    1978  2000   82.0%     chr16  +       90470276   90470336   61
YourSeq  39     1917    1982  2000   86.7%     chr16  +       17717200   17717263   64
YourSeq  38     364     433   2000   72.1%     chr11  +       51495036   51495103   68
YourSeq  37     1921    1999  2000   86.7%     chr12  +       54271465   54271542   78
YourSeq  36     1921    1964  2000   90.0%     chr18  +       62925786   62925828   43
YourSeq  36     1920    1969  2000   86.0%     chr15  +       80519429   80519478   50
YourSeq  35     1920    1976  2000   80.8%     chr9   -       13625996   13626052   57
YourSeq  35     1920    1970  2000   84.4%     chr1   -       191556219  191556269  51
YourSeq  34     367     431   2000   85.0%     chr17  -       7364104    7364166    63

(QSTART/QEND are positions within the 2000 bp query; TSTART/TEND are positions on the listed chromosome.)

Note: The 2000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
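The chr1 self-hits of the two flanking queries also give a quick way to recover the interval between them. The sketch below assumes, as the notes imply but do not state explicitly, that the upstream flank ends immediately before exon 3 and the downstream flank starts immediately after it. The function name `gap_between` is hypothetical, and the comparison with the coding length (740 aa, from the transcript table in this report) is an observation made here, not a statement from the report.

```python
# chr1 coordinates of the 100% self-hits in the two BLAT tables (GRCm38).
upstream_flank = (127304777, 127306776)
downstream_flank = (127307018, 127309017)

def gap_between(up, down):
    """Return (start, end, length) of the interval between two flanks,
    using 1-based inclusive coordinates."""
    start = up[1] + 1
    end = down[0] - 1
    return start, end, end - start + 1

exon_start, exon_end, exon_len = gap_between(upstream_flank, downstream_flank)
print(f"chr1:{exon_start}-{exon_end} ({exon_len} bp)")

# Assumption: a 740 aa protein corresponds to 740 * 3 coding nucleotides
# (stop codon excluded).
coding_nt = 740 * 3
share = round(100.0 * exon_len / coding_nt, 2)
print(f"{exon_len} bp is {share}% of {coding_nt} coding nt")
```

Under these assumptions the interval is 241 bp, which is numerically consistent with the stated 10.86% of the coding region and is smaller than the ~383 bp effective KO region.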


Gene and protein information:

Mgat5 mannoside acetylglucosaminyltransferase 5 [ Mus musculus (house mouse) ]
Gene ID: 107895, updated on 27-Aug-2019

Gene summary

Official Symbol: Mgat5 (provided by MGI)
Official Full Name: mannoside acetylglucosaminyltransferase 5 (provided by MGI)
Primary source: MGI:MGI:894701
See related: Ensembl:ENSMUSG00000036155
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: AI480971; GlcNAc-TV; 4930471A21Rik; 5330407H02Rik
Expression: Ubiquitous expression in colon adult (RPKM 9.9), frontal lobe adult (RPKM 9.4) and 27 other tissues
Orthologs: human

Genomic context

Location: 1; 1 E3
Exon count: 19

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  1    NC_000067.6 (127204919..127488334)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    1    NC_000067.5 (129101563..129379549)

[Figure: genomic context of Mgat5 on Chromosome 1 - NC_000067.6.]


Transcript information: This gene has 4 transcripts

Gene: Mgat5 ENSMUSG00000036155

Description: mannoside acetylglucosaminyltransferase 5 [Source:MGI Symbol;Acc:MGI:894701]
Gene Synonyms: 4930471A21Rik, 5330407H02Rik, GlcNAc-TV, beta1,6N-acetylglucosaminyltransferase V
Location: Chromosome 1: 127,205,015-127,488,336 forward strand. GRCm38:CM000994.2
About this gene: This gene has 4 transcripts (splice variants), 199 orthologues, 2 paralogues, is a member of 1 Ensembl protein family and is associated with 13 phenotypes.

Transcripts

Name       Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt        Flags
Mgat5-201  ENSMUST00000038361.10  8436  740aa       ENSMUSP00000038359.4  Protein coding   CCDS15245  Q059T5 Q8R4G6  TSL:1, GENCODE basic, APPRIS P1
Mgat5-202  ENSMUST00000171405.1   3216  740aa       ENSMUSP00000129166.1  Protein coding   CCDS15245  Q059T5 Q8R4G6  TSL:1, GENCODE basic, APPRIS P1
Mgat5-204  ENSMUST00000190921.1   2984  No protein  -                     Retained intron  -          -              TSL:1
Mgat5-203  ENSMUST00000189427.6   2229  No protein  -                     Retained intron  -          -              TSL:1


[Figure: Ensembl region view, 303.32 kb, chromosome 1 from about 127.2 Mb to 127.4 Mb.
Forward strand (Comprehensive set): Mgat5-201 (protein coding), Mgat5-202 (protein coding), Mgat5-203 (retained intron), Mgat5-204 (retained intron), Gm38301-201 (TEC), Gm37772-201 (TEC), Gm37510-201 (TEC), Gm23370-201 (scaRNA), Gm23370-202 (scaRNA), Gm23734-201 (snoRNA).
Contigs: AC158147.2, AC131793.2, AC130527.4.
Reverse strand (Comprehensive set): Gm8451-201 (processed pseudogene), Tmem163-201 (nonsense mediated decay), Tmem163-202 (retained intron), Tmem163-203 (protein coding), Tmem163-205 (protein coding), Tmem163-204 (lncRNA).
Regulatory Build legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site.
Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (pseudogene; processed transcript; RNA gene).]


Transcript: ENSMUST00000038361

[Figure: Ensembl view of Mgat5-201 (protein coding), 283.32 kb, forward strand, with protein feature tracks: transmembrane helices, low complexity (Seg), Pfam family 18, Domain of unknown function DUF4525, PANTHER PTHR15075 and PTHR15075:SF5, and sequence variants (dbSNP and all other sources). Variant legend: missense variant, splice region variant, synonymous variant. Scale bar: 0 to 740.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
