https://www.alphaknockout.com

Mouse Alg13 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create an Alg13 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Alg13 gene (NCBI Reference Sequence: NM_026247; Ensembl: ENSMUSG00000041718) is located on mouse chromosome X. Four exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 4 (Transcript: ENSMUST00000070801). Exons 3~4 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Alg13 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-106O9 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Males hemizygous for a null allele exhibit environmentally induced seizures and increased susceptibility to pharmacologically induced seizures. Females homozygous for a different null allele show increased body fat and decreased lean body mass, decreased bone mineral density, decreased granulocyte numbers and increased leukocyte numbers.

Exons 3~4 cover 50.71% of the coding region. The start codon is in exon 1 and the stop codon is in exon 4. The size of intron 2 for 5'-loxP site insertion: 1950 bp. The size of effective cKO region: ~2389 bp. The cKO region does not contain any other known gene.
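
The conditional design relies on Cre recombinase removing the segment between two loxP sites that sit in the same orientation, leaving a single loxP site behind. The following is a minimal sketch of that outcome on a toy string. The function name cre_excise and the placeholder segments are hypothetical and do not represent the real Alg13 locus or the vendor's vector sequence; only the 34 bp loxP sequence is the standard published site.

```python
# Illustrative sketch only: toy sequences, not the real Alg13 locus.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"  # standard 34 bp loxP site


def cre_excise(allele: str, loxp: str = LOXP) -> str:
    """Return the allele after excision between two same-orientation loxP sites.

    One loxP site remains in the product. Alleles with fewer than two
    sites are returned unchanged.
    """
    first = allele.find(loxp)
    if first == -1:
        return allele
    second = allele.find(loxp, first + len(loxp))
    if second == -1:
        return allele
    # Keep everything up to and including the first loxP, then resume
    # after the second loxP: the floxed segment and one site are removed.
    return allele[: first + len(loxp)] + allele[second + len(loxp):]


# Hypothetical placeholder segments standing in for the real sequence.
upstream = "GGGG"      # stands in for exons 1-2 and part of intron 2
floxed = "CCCCCCCC"    # stands in for the cKO region (exons 3~4)
downstream = "TTTT"    # stands in for sequence downstream of exon 4

targeted_allele = upstream + LOXP + floxed + LOXP + downstream
ko_allele = cre_excise(targeted_allele)
```

In this toy model the wildtype allele (no loxP sites) passes through unchanged, while the targeted allele loses the floxed segment, which mirrors the relationship between the targeted allele and the constitutive KO allele.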


Overview of the Targeting Strategy

[Figure: schematic of the targeting strategy showing the wildtype allele (exons 1 to 4, with the 5' and 3' gRNA regions), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Alg13; homology arm; cKO region; loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence aligned against itself, with forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
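
A self-alignment dot plot of this kind can be sketched as follows, assuming exact matching of fixed-length windows (10 bp here, as in the plot). This is an illustration of the general technique, not the tool used to produce the report; the function names and the toy sequence are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) exact window matches.

    forward: seq[i:i+window] equals seq[j:j+window]
    reverse: seq[i:i+window] equals the reverse complement of seq[j:j+window]
    """
    index = {}
    for i in range(len(seq) - window + 1):
        index.setdefault(seq[i:i + window], []).append(i)

    forward, reverse = [], []
    for j in range(len(seq) - window + 1):
        word = seq[j:j + window]
        forward.extend((i, j) for i in index.get(word, []))
        reverse.extend((i, j) for i in index.get(reverse_complement(word), []))
    return forward, reverse


def off_diagonal(points):
    """Drop the trivial main diagonal (each window matching itself)."""
    return [(i, j) for i, j in points if i != j]


# Toy sequence with a 12 bp direct repeat separated by a 5 bp spacer.
unit = "ACGGTCATTGCA"
toy = unit + "GGGGG" + unit
forward, reverse = dot_plot(toy, window=10)
repeats = off_diagonal(forward)
```

Off-diagonal forward matches indicate direct repeats; a sequence without tandem repeats yields only the main diagonal.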

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(8616bp) | A(25.94% 2235) | C(21.1% 1818) | T(32.16% 2771) | G(20.8% 1792)
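
The percentages in the summary follow directly from the base counts. A short check, using only the counts stated above (the variable names are illustrative):

```python
# Base counts as reported in the summary line.
counts = {"A": 2235, "C": 1818, "T": 2771, "G": 1792}

total = sum(counts.values())  # should equal the stated full length
percent = {base: round(100 * n / total, 2) for base, n in counts.items()}

# Overall GC content derived from the same counts (about 41.9%).
gc_percent = round(100 * (counts["G"] + counts["C"]) / total, 2)
```

The overall GC content is an average over the full 8616 bp, so it does not by itself show local GC-rich stretches; those only appear in the windowed distribution.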

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found. It may be difficult to construct this targeting vector.
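
A windowed GC profile like the one described can be sketched as below, assuming a 300 bp window as in the plot. The step size and the 0.65 threshold for calling a window "high GC" are assumptions chosen for illustration; the report does not state the criterion it used.

```python
def windowed_gc(seq: str, window: int = 300, step: int = 50):
    """Return a list of (start, gc_fraction) for windows along seq."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        profile.append((start, gc / window))
    return profile


def high_gc_windows(seq: str, window: int = 300, step: int = 50,
                    threshold: float = 0.65):
    """Windows whose GC fraction meets the (assumed) threshold."""
    return [(s, f) for s, f in windowed_gc(seq, window, step) if f >= threshold]


# Toy sequence: 400 bp AT-only, 400 bp GC-only, 400 bp AT-only.
toy = "AT" * 200 + "GC" * 200 + "AT" * 200
profile = windowed_gc(toy)
flagged = high_gc_windows(toy)
```

Applied to the real 8616 bp sequence, the flagged windows would mark the stretches most likely to complicate PCR amplification and vector assembly.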


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
------------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chrX   +       144318763  144321762  3000
YourSeq  444    1786   2552  3000   90.8%     chr10  -       62761682   63424311   662630
YourSeq  397    2110   2556  3000   94.0%     chr5   -       33075976   33076421   446
YourSeq  394    2110   2552  3000   93.9%     chr6   -       48523459   48523895   437
YourSeq  393    2110   2552  3000   94.4%     chr2   +       157521645  157522087  443
YourSeq  393    2110   2552  3000   93.6%     chr17  +       29136928   29137363   436
YourSeq  391    2110   2546  3000   94.0%     chr16  +       8922013    8922442    430
YourSeq  389    2110   2552  3000   94.0%     chr6   +       47815249   47815691   443
YourSeq  384    2133   2923  3000   93.7%     chr16  -       35919983   35920827   845
YourSeq  383    2135   2846  3000   90.5%     chr12  -       85662293   85662894   602
YourSeq  382    2110   2929  3000   90.3%     chr9   +       90221242   90221838   597
YourSeq  382    2112   2552  3000   93.5%     chr14  +       99365633   99366076   444
YourSeq  381    2110   2552  3000   92.3%     chr7   -       34175653   34176090   438
YourSeq  379    2110   2552  3000   93.1%     chr11  +       73225873   73226325   453
YourSeq  377    2110   2526  3000   95.3%     chr13  +       6662511    6662927    417
YourSeq  377    2110   2526  3000   95.3%     chr1   +       181241937  181242353  417
YourSeq  376    2135   2552  3000   95.0%     chr5   +       123255455  123255872  418
YourSeq  370    2110   2526  3000   93.4%     chr2   +       108874883  108875291  409
YourSeq  366    2110   2552  3000   92.3%     chr19  -       12524696   12525135   440
YourSeq  365    1776   2544  3000   91.3%     chr7   -       83705855   83706629   775

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
------------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chrX   +       144324129  144327128  3000
YourSeq  615    1621   2468  3000   91.2%     chr12  -       79415211   79416130   920
YourSeq  611    1621   2449  3000   90.3%     chr2   +       104904749  104905650  902
YourSeq  606    1551   2438  3000   89.1%     chr1   -       172843681  172844661  981
YourSeq  605    1629   2471  3000   90.5%     chrX   +       71241767   71242676   910
YourSeq  604    1636   2461  3000   91.4%     chr12  -       39816517   39817414   898
YourSeq  603    1621   2471  3000   89.9%     chr6   -       148339838  148340772  935
YourSeq  603    1647   2471  3000   89.2%     chr9   +       59806865   59835699   28835
YourSeq  602    1621   2449  3000   89.6%     chr5   -       88359284   88360174   891
YourSeq  602    1553   2471  3000   89.3%     chr2   +       129642748  129643728  981
YourSeq  602    1626   2471  3000   90.8%     chr11  +       72773998   72774915   918
YourSeq  596    1621   2471  3000   89.9%     chr14  +       55375587   55376502   916
YourSeq  595    1635   2471  3000   90.3%     chr2   +       154064958  154065855  898
YourSeq  594    1635   2471  3000   89.0%     chr8   +       40689684   40690583   900
YourSeq  594    1626   2448  3000   91.0%     chr16  +       7730189    8145493    415305
YourSeq  590    1553   2471  3000   89.6%     chr9   -       7514746    7515740    995
YourSeq  587    1634   2471  3000   90.3%     chr8   -       5631553    5632453    901
YourSeq  587    1635   2471  3000   89.1%     chr4   +       122804985  122805883  899
YourSeq  585    1668   2471  3000   88.3%     chr18  +       6877717    6878583    867
YourSeq  585    1658   2471  3000   90.0%     chr14  +       99214239   99215107   869

Note: The 3000 bp section downstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
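
The BLAT rows can be screened programmatically. The sketch below parses three of the rows above, sets aside the self hit (100% identity over the full query), and reports the strongest remaining hit with its query coverage. The BlatHit class, the parse_hit helper, and the use of query coverage as a screening measure are assumptions for illustration; the report does not state how significance was judged.

```python
from dataclasses import dataclass


@dataclass
class BlatHit:
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int

    @property
    def query_coverage(self) -> float:
        """Fraction of the query covered by the aligned query interval."""
        return (self.q_end - self.q_start + 1) / self.q_size


def parse_hit(line: str) -> BlatHit:
    fields = line.split()
    # Skip leading non-numeric tokens such as "browser details YourSeq".
    while not fields[0].isdigit():
        fields.pop(0)
    return BlatHit(
        int(fields[0]), int(fields[1]), int(fields[2]), int(fields[3]),
        float(fields[4].rstrip("%")), fields[5], fields[6],
        int(fields[7]), int(fields[8]), int(fields[9]),
    )


# Three rows copied from the downstream BLAT result.
rows = """\
browser details YourSeq 3000 1 3000 3000 100.0% chrX + 144324129 144327128 3000
browser details YourSeq 615 1621 2468 3000 91.2% chr12 - 79415211 79416130 920
browser details YourSeq 611 1621 2449 3000 90.3% chr2 + 104904749 104905650 902
""".splitlines()

hits = [parse_hit(row) for row in rows]
# The self hit is the full-length, 100% identity match on chrX.
off_target = [h for h in hits
              if not (h.identity == 100.0 and h.query_coverage == 1.0)]
best = max(off_target, key=lambda h: h.score)
```

For these rows the best off-target hit covers well under a third of the 3000 bp query, which is consistent with the hits reflecting a shared repetitive element rather than a second homologous locus.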


Gene and information:

Alg13 asparagine-linked glycosylation 13 [Mus musculus (house mouse)]
Gene ID: 67574, updated on 11-Sep-2019

Gene summary

Official Symbol: Alg13 (provided by MGI)
Official Full Name: asparagine-linked glycosylation 13 (provided by MGI)
Primary source: MGI:MGI:1914824
See related: Ensembl:ENSMUSG00000041718
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: MDS031; Glt28d1; 2810046O15Rik; 4833435D08Rik
Expression: Ubiquitous expression in testis adult (RPKM 4.3), limb E14.5 (RPKM 4.0) and 28 other tissues

Genomic context

Location: X; X F2

Exon count: 28

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  X    NC_000086.7 (144317966..144374450)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    X    NC_000086.6 (140752509..140759740)

[Figure: genomic context of Alg13 on Chromosome X - NC_000086.7.]


Transcript information: This gene has 15 transcripts

Gene: Alg13 ENSMUSG00000041718

Description: asparagine-linked glycosylation 13 [Source:MGI Symbol;Acc:MGI:1914824]
Gene Synonyms: 2810046O15Rik, 4833435D08Rik, Glt28d1, MDS031
Location: Chromosome X: 144,317,804-144,374,450 forward strand. GRCm38:CM001013.2
About this gene: This gene has 15 transcripts (splice variants), 153 orthologues, 1 paralogue, is a member of 2 Ensembl protein families and is associated with 17 phenotypes.

Transcripts

Name       Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Alg13-202  ENSMUST00000070801.10  1802  165aa       ENSMUSP00000068403.4  Protein coding           CCDS30456  Q9D8C3      TSL:1, GENCODE basic, APPRIS P1
Alg13-215  ENSMUST00000238864.1   4191  969aa       ENSMUSP00000159009.1  Protein coding           -          -           GENCODE basic
Alg13-203  ENSMUST00000123710.7   604   61aa        ENSMUSP00000126052.2  Protein coding           -          E9Q161      TSL:1, GENCODE basic
Alg13-207  ENSMUST00000149330.7   479   61aa        ENSMUSP00000130514.2  Protein coding           -          E9Q161      TSL:2, GENCODE basic
Alg13-210  ENSMUST00000154827.7   4168  155aa       ENSMUSP00000131622.1  Nonsense mediated decay  -          E9PX10      TSL:1
Alg13-212  ENSMUST00000197316.4   3839  51aa        ENSMUSP00000142956.1  Nonsense mediated decay  -          A0A0G2JEZ0  TSL:5
Alg13-213  ENSMUST00000198039.4   3307  48aa        ENSMUSP00000143124.1  Nonsense mediated decay  -          A0A0G2JFD0  TSL:5
Alg13-206  ENSMUST00000145724.6   859   81aa        ENSMUSP00000131340.3  Nonsense mediated decay  -          E9Q2P6      TSL:5
Alg13-211  ENSMUST00000197138.4   877   No protein  -                     Retained intron          -          -           TSL:5
Alg13-214  ENSMUST00000199040.1   719   No protein  -                     Retained intron          -          -           TSL:5
Alg13-208  ENSMUST00000149427.7   2996  No protein  -                     lncRNA                   -          -           TSL:5
Alg13-204  ENSMUST00000132416.7   2608  No protein  -                     lncRNA                   -          -           TSL:5
Alg13-209  ENSMUST00000149811.2   619   No protein  -                     lncRNA                   -          -           TSL:5
Alg13-201  ENSMUST00000040338.8   508   No protein  -                     lncRNA                   -          -           TSL:5
Alg13-205  ENSMUST00000144185.4   476   No protein  -                     lncRNA                   -          -           TSL:5


[Figure: Ensembl genome browser view of the Alg13 locus (76.65 kb, forward strand). Tracks: transcripts Alg13-201 to Alg13-215 (comprehensive gene set) labelled by biotype; contig AL713978.9; reverse-strand genes Gm2214-201 (processed pseudogene) and Trpc5-201 (protein coding); Regulatory Build. Regulation legend: CTCF; promoter; promoter flank; transcription factor binding site. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (pseudogene; processed transcript; RNA gene).]


Transcript: ENSMUST00000070801

[Figure: Ensembl protein view of Alg13-202 (protein coding; 7.39 kb, forward strand) with a scale bar from 0 to 165 amino acids. Domain annotations: Superfamily SSF53756; Pfam Glycosyl transferase, family 28, C-terminal; PANTHER UDP-N-acetylglucosamine transferase subunit Alg13-like, PTHR12867:SF7; Gene3D 3.40.50.2000; CDD cd03785. Sequence variants (dbSNP and all other sources) are shown, with a variant legend for missense and synonymous variants.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
