https://www.alphaknockout.com

Mouse Alg14 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create an Alg14 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Alg14 gene (NCBI Reference Sequence: NM_024178; Ensembl: ENSMUSG00000039887) is located on mouse chromosome 3. Four exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 4 (Transcript: ENSMUST00000039442). Exon 2 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Alg14 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-228D13 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts at about 21.04% of the coding region. Knockout of exon 2 will result in a frameshift of the gene. The size of intron 1 for 5'-loxP site insertion is 6620 bp, and the size of intron 2 for 3'-loxP site insertion is 21321 bp. The size of the effective cKO region is ~655 bp. The cKO region does not contain any other known gene.
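The frameshift statement in the note rests on simple arithmetic: deleting an exon shifts the reading frame whenever the number of removed coding bases is not a multiple of three. A minimal sketch in Python follows. The function name and the example lengths are hypothetical, since the report does not state the coding length of exon 2.

```python
def causes_frameshift(deleted_coding_bp: int) -> bool:
    """Return True if deleting this many coding bases shifts the reading frame."""
    if deleted_coding_bp <= 0:
        raise ValueError("deleted_coding_bp must be positive")
    # A deletion preserves the frame only when it removes whole codons.
    return deleted_coding_bp % 3 != 0


# Hypothetical exon lengths for illustration only; not taken from the report.
for length in (120, 121, 122):
    print(length, causes_frameshift(length))
```

An in-frame deletion (length divisible by three) would remove amino acids without disrupting the downstream frame, which is why exon choice matters for a loss-of-function design.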


Overview of the Targeting Strategy

[Figure: schematic of the targeting strategy. Panels: wildtype allele (with 5' and 3' gRNA regions marked), targeting vector, targeted allele, and constitutive KO allele (after Cre recombination). Legend: exon of mouse Alg14, homology arm, cKO region, loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence against itself, showing forward and reverse complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
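To illustrate how a self-alignment dot plot exposes tandem repeats, here is a minimal sketch in Python. It assumes exact matching of 10 bp words on the forward strand only; the actual plotting tool, its scoring, and its reverse complement handling are not described in the report, and the toy sequence is hypothetical.

```python
def self_dot_plot(seq: str, window: int = 10):
    """Return sorted (i, j) pairs, i < j, where the words of length
    `window` starting at positions i and j are identical."""
    seq = seq.upper()
    positions = {}
    # Index every window-sized word by its start positions.
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    hits = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


# Hypothetical toy sequence: a 7 bp unit repeated three times between unique flanks.
toy = "ACGTTGCAAC" + "GATTACA" * 3 + "TTGACCGTAA"
hits = self_dot_plot(toy, window=10)
# Off-diagonal hits that share one offset indicate a repeat with that period.
offsets = sorted({j - i for i, j in hits})
print(hits)
print(offsets)
```

Runs of hits parallel to the main diagonal at a constant offset are the signature of tandem repeats, which is what the note refers to.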

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full length (7155 bp) | A: 26.54% (1899) | C: 24.84% (1777) | T: 29.76% (2129) | G: 18.87% (1350)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
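The base counts in the summary can be checked, and the overall GC content derived, with a short calculation. This sketch uses only the counts quoted in the summary line; it gives the whole-sequence GC percentage, not the 300 bp windowed distribution shown in the figure.

```python
# Base counts as reported in the summary line.
counts = {"A": 1899, "C": 1777, "T": 2129, "G": 1350}

total = sum(counts.values())
percent = {base: 100 * n / total for base, n in counts.items()}
gc_percent = percent["G"] + percent["C"]

print(f"Total length: {total} bp")
for base, value in percent.items():
    print(f"{base}: {value:.2f}%")
print(f"Overall GC content: {gc_percent:.2f}%")
```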


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
-------------------------------------------------------------------------------------------
YourSeq   3000       1  3000   3000    100.0%  chr3   +       121295355  121298354    3000
YourSeq    302     831  1561   3000     93.3%  chr17  +        26559271   27049591  490321
YourSeq    278    1405  1772   3000     91.0%  chr18  -        35684815   35940460  255646
YourSeq    256    1388  1769   3000     87.9%  chr17  -        45398472   45398856     385
YourSeq    256    1389  1769   3000     92.1%  chr1   -       184990768  185177597  186830
YourSeq    245    1397  1770   3000     89.2%  chr1   -        59676243   59676623     381
YourSeq    243    1438  1769   3000     89.6%  chr5   +        92180767   92181123     357
YourSeq    236    1396  1769   3000     89.6%  chr17  -        28064577   28065192     616
YourSeq    233    1214  1769   3000     87.3%  chr11  -        97760292   97760686     395
YourSeq    225     894  1787   3000     84.4%  chrX   +       102202195  102202537     343
YourSeq    221    1440  1767   3000     91.4%  chr19  -         4048993    4049512     520
YourSeq    221    1414  1777   3000     89.9%  chr17  -        46784975   46785596     622
YourSeq    215    1239  1573   3000     88.9%  chr11  -       106404960  106405283     324
YourSeq    214    1436  1769   3000     89.5%  chr7   +       126316030  126316521     492
YourSeq    212    1462  1733   3000     93.1%  chr2   +       129843002  129843616     615
YourSeq    205    1217  1769   3000     86.1%  chr17  +        31439027   31439556     530
YourSeq    198     894  1572   3000     83.8%  chr11  -       107499220  107499503     284
YourSeq    195    1404  1738   3000     92.3%  chr15  -        73180817   73181230     414
YourSeq    191    1215  1582   3000     87.6%  chr19  -        43754457   43967119  212663
YourSeq    190    1214  1767   3000     90.0%  chr11  -       113706884  113707435     552

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
-------------------------------------------------------------------------------------------
YourSeq   3000       1  3000   3000    100.0%  chr3   +       121299010  121302009    3000
YourSeq    500     407  1033   3000     93.5%  chr7   +       133488590  133489557     968
YourSeq    432     372   991   3000     95.1%  chr7   +       133488660  133489845    1186
YourSeq    430     498  1029   3000     94.1%  chr10  -        42998340   42999011     672
YourSeq    429     378  1035   3000     91.8%  chr18  -        83045850   83046361     512
YourSeq    429     518  1022   3000     96.6%  chr19  +         6190191    6190724     534
YourSeq    401     371   935   3000     96.0%  chr7   +       133488725  133489819    1095
YourSeq    399     371   990   3000     91.7%  chr10  -        42998322   42998919     598
YourSeq    394     371  1008   3000     92.3%  chr10  -        42998322   42998907     586
YourSeq    388     372   891   3000     93.9%  chr16  -        48130246   48130745     500
YourSeq    361     371   936   3000     92.9%  chr18  -        83045850   83046323     474
YourSeq    361     371  1034   3000     92.3%  chr10  -        42998398   42999003     606
YourSeq    361     459   995   3000     93.5%  chr5   +        87799949   87800399     451
YourSeq    358     371  1013   3000     92.2%  chr12  -        99705252   99705658     407
YourSeq    353     371   764   3000     95.7%  chr19  +         6190269    6190724     456
YourSeq    346     581  1015   3000     96.6%  chr4   -       155445286  155445738     453
YourSeq    345     469  1026   3000     91.6%  chr5   +        17633244   17633641     398
YourSeq    344     371   872   3000     95.6%  chr10  -        42998319   42998874     556
YourSeq    338     596  1034   3000     92.5%  chr18  -        83045917   83046307     391
YourSeq    329     587  1034   3000     94.2%  chr10  -        42998452   42999011     560

Note: The 3000 bp section downstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
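The chr3 self-matches in the two BLAT tables give the genomic coordinates of the flanking sections, which allows a quick consistency check against the stated cKO region size. The sketch below assumes the coordinates are 1-based and inclusive, as the reported SPAN of 3000 suggests; the variable and function names are illustrative.

```python
# chr3 coordinates of the top (100% identity) hit in each BLAT table.
upstream = (121295355, 121298354)    # 3000 bp section upstream of exon 2
downstream = (121299010, 121302009)  # 3000 bp section downstream of exon 2


def span(interval):
    """Length of a 1-based, inclusive interval."""
    start, end = interval
    return end - start + 1


# Bases lying strictly between the two flanking sections.
gap_bp = downstream[0] - upstream[1] - 1

print(span(upstream), span(downstream), gap_bp)
```

The interval between the two flanking sections comes out at 655 bp, in line with the "~655 bp" effective cKO region given in the strategy note.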


Gene information: Alg14 asparagine-linked glycosylation 14 [Mus musculus (house mouse)] Gene ID: 66789, updated on 12-Aug-2019

Gene summary

Official Symbol: Alg14 (provided by MGI)
Official Full Name: asparagine-linked glycosylation 14 (provided by MGI)
Primary source: MGI:MGI:1914039
See related: Ensembl:ENSMUSG00000039887
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: AI854024; 5430428G01Rik
Expression: Ubiquitous expression in lung adult (RPKM 8.4), subcutaneous fat pad adult (RPKM 8.1) and 28 other tissues

Genomic context

Location: 3; 3 G1

Exon count: 5

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  3    NC_000069.6 (121291712..121363096)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    3    NC_000069.5 (120994735..121064929)

[Figure: genomic context of Alg14 on chromosome 3 (NC_000069.6).]


Transcript information: This gene has 4 transcripts

Gene: Alg14 ENSMUSG00000039887

Description: asparagine-linked glycosylation 14 [Source:MGI Symbol;Acc:MGI:1914039]
Gene Synonyms: 5430428G01Rik
Location: Chromosome 3: 121,291,773-121,363,094 forward strand. GRCm38:CM000996.2
About this gene: This gene has 4 transcripts (splice variants), 188 orthologues, is a member of 1 Ensembl protein family and is associated with 4 phenotypes.

Transcripts

Name       Transcript ID          bp    Protein  Translation ID        Biotype                  CCDS       UniProt     Flags
Alg14-201  ENSMUST00000039442.11  2037  217aa    ENSMUSP00000038387.7  Protein coding           CCDS17801  Q9D081      TSL:1, GENCODE basic, APPRIS P1
Alg14-203  ENSMUST00000198341.1   831   133aa    ENSMUSP00000142775.1  Protein coding           -          A0A0G2JEH5  CDS 5' incomplete, TSL:2
Alg14-204  ENSMUST00000199554.1   441   92aa     ENSMUSP00000142857.1  Protein coding           -          A0A0G2JEQ0  TSL:3, GENCODE basic
Alg14-202  ENSMUST00000198235.1   474   60aa     ENSMUSP00000142989.1  Nonsense mediated decay  -          A0A0G2JF20  TSL:3


[Figure: Ensembl genome browser view of a 91.32 kb region of chromosome 3 (121.30 Mb to 121.36 Mb). Forward strand transcripts: Alg14-201 (protein coding), Alg14-202 (nonsense mediated decay), Alg14-203 (protein coding), Alg14-204 (protein coding) and Gm26027-201 (misc RNA). Contigs: AC161210.5, AC105336.14. Reverse strand transcripts: Tmem56-204 and Tmem56-205 (protein coding), Tmem56-202 and Tmem56-206 (lncRNA). A Regulatory Build track is also shown, with a legend for CTCF, enhancer, open chromatin, promoter, promoter flank and transcription factor binding site features.]


Transcript: ENSMUST00000039442

[Figure: protein domain view of Alg14-201 (protein coding; 71.32 kb, forward strand; protein scale 0 to 217 aa). Annotation tracks: transmembrane helices; Superfamily SSF53756; Pfam, Oligosaccharide biosynthesis protein Alg14-like; PANTHER, Oligosaccharide biosynthesis protein Alg14-like; Gene3D 3.40.50.2000; CDD cd03785; sequence variants (dbSNP and all other sources), with missense and synonymous variants in the legend.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
