https://www.alphaknockout.com

Mouse Gga1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Gga1 conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Gga1 gene (NCBI Reference Sequence: NM_145929; Ensembl: ENSMUSG00000033128) is located on mouse chromosome 15. Seventeen exons have been identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 17 (Transcript: ENSMUST00000041587). Exons 2~3 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Gga1 gene. To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP24-83M14 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a gene-trapped allele display decreased birth weight, slow postnatal weight gain, hypoglycemia, increased plasma levels of acid hydrolases, and partial neonatal lethality.

Exon 2 starts at about 2.31% of the coding region. Knockout of exons 2~3 will result in a frameshift of the gene. Intron 1, used for 5'-loxP site insertion, is 3420 bp; intron 3, used for 3'-loxP site insertion, is 1064 bp. The effective cKO region is ~1666 bp. The cKO region does not contain any other known gene.
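The ~1666 bp figure can be cross-checked against the BLAT results in this report, where the 3000 bp upstream query ends at chr15:78880788 and the 3000 bp downstream query starts at chr15:78882455. The sketch below assumes these two queries directly flank the effective cKO region (the report does not state this explicitly) and that coordinates are 1-based and inclusive; the function name is illustrative only.

```python
def bases_between(upstream_end: int, downstream_start: int) -> int:
    """Count the bases lying strictly between two 1-based, inclusive intervals."""
    return downstream_start - upstream_end - 1


# Coordinates taken from the first (self) hit of each BLAT table in this report.
UPSTREAM_QUERY_END = 78_880_788      # end of the 3000 bp upstream query
DOWNSTREAM_QUERY_START = 78_882_455  # start of the 3000 bp downstream query

cko_size = bases_between(UPSTREAM_QUERY_END, DOWNSTREAM_QUERY_START)
print(cko_size)  # 1666
```

Under these assumptions the gap between the two queries matches the stated size of the effective cKO region.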


Overview of the Targeting Strategy

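[Figure: Schematic of the targeting strategy. Shown from top to bottom: the wildtype allele (5' to 3', exons 1, 2, 3, 4, 5 ... 17, with the two gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Gga1; homology arm; cKO region; loxP site.]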


Overview of the Dot Plot

[Figure: Self dot plot of sequence 12. Window size: 10 bp. Forward and reverse-complement matches are shown.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether it contains tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
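A minimal sketch of how a self dot plot with a 10 bp window might be computed is given below. It is not the tool used for this report; the function names are hypothetical, and only exact word matches are reported, whereas a real dot plot program may allow mismatches.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of off-diagonal dots (i, j) with i < j.

    forward: the words starting at i and j are identical.
    reverse: the word at i equals the reverse complement of the word at j.
    """
    seq = seq.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for word, starts in positions.items():
        # Every pair of identical words gives one forward dot.
        for a, i in enumerate(starts):
            for j in starts[a + 1:]:
                forward.append((i, j))
        # Words matching the reverse complement give reverse dots.
        for j in positions.get(reverse_complement(word), ()):
            for i in starts:
                if i < j:
                    reverse.append((i, j))
    return sorted(forward), sorted(reverse)


# Toy sequence: a 10 bp word repeated after a 2 bp spacer.
fwd, rev = self_dot_plot("AAAAACCCCC" + "GT" + "AAAAACCCCC")
print(fwd, rev)  # [(0, 12)] []
```

Runs of dots parallel to the main diagonal would indicate direct or tandem repeats; an empty or sparse result is consistent with the conclusion stated in the note.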

Overview of the GC Content Distribution

[Figure: GC content distribution along sequence 12. Window size: 300 bp.]

Summary: Full Length(8166bp) | A(21.45% 1752) | C(28.46% 2324) | T(23.34% 1906) | G(26.75% 2184)

Note: The sequence of the homologous arms and cKO region is analyzed to determine its GC content. Regions of significantly high GC content are found, so this targeting vector may be difficult to construct.
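The overall GC content implied by the summary line can be worked out from the base counts, and a windowed profile can be sketched in the same way. The example below is illustrative only: the function names are hypothetical, the non-overlapping window step is an assumption (the report gives only the 300 bp window size), and the sequence itself is not included in this report.

```python
def gc_percent_from_counts(a: int, c: int, g: int, t: int) -> float:
    """Overall GC percentage from per-base counts."""
    return 100.0 * (g + c) / (a + c + g + t)


def gc_windows(seq: str, window: int = 300, step: int = 300):
    """GC percentage per window as a list of (start, percent) tuples.

    A trailing stretch shorter than `window` is ignored.
    """
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        profile.append((start, 100.0 * gc / window))
    return profile


# Base counts from the summary line of this report (full length 8166 bp).
overall = gc_percent_from_counts(a=1752, c=2324, g=2184, t=1906)
print(f"{overall:.2f}%")  # 55.20%
```

The overall value of about 55% is only an average; the windowed profile is what reveals the locally GC-rich stretches that the note flags as a potential difficulty.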


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr15  +       78877789   78880788   3000
YourSeq  372    912    1325  3000   95.9%     chr18  +       57457015   57457449   435
YourSeq  369    913    1305  3000   97.0%     chr1   +       20790406   20790798   393
YourSeq  368    913    1312  3000   96.0%     chr5   -       118450776  118451175  400
YourSeq  366    874    1305  3000   95.6%     chrX   -       101969641  101970358  718
YourSeq  366    910    1315  3000   94.9%     chr1   -       31374905   31375309   405
YourSeq  364    914    1331  3000   93.6%     chr2   -       40491428   40491834   407
YourSeq  364    913    1310  3000   95.0%     chr14  -       45210306   45210701   396
YourSeq  363    912    1305  3000   96.2%     chr16  -       30713525   30713918   394
YourSeq  363    881    1305  3000   95.6%     chr15  -       94333148   94333600   453
YourSeq  363    914    1470  3000   94.6%     chr1   -       185843324  185844036  713
YourSeq  363    914    1318  3000   94.8%     chr2   +       174520302  174520705  404
YourSeq  363    913    1309  3000   95.3%     chr2   +       92792521   92792916   396
YourSeq  362    899    1305  3000   95.5%     chr15  -       7228265    7228676    412
YourSeq  362    914    1305  3000   95.7%     chr1   -       151033579  151033969  391
YourSeq  361    914    1318  3000   94.7%     chr18  -       6094323    6094725    403
YourSeq  361    914    1310  3000   95.2%     chr14  -       24417487   24417882   396
YourSeq  361    912    1305  3000   96.0%     chr12  -       26536441   26536834   394
YourSeq  360    914    1310  3000   96.0%     chr8   -       95216939   95217336   398
YourSeq  360    905    1305  3000   96.0%     chr17  -       56790425   56790831   407

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
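The report does not state the criterion behind "no significant similarity". One way to put the secondary hits in perspective is to compare each hit's score and query coverage with the self hit. The sketch below parses three rows from the upstream BLAT table; the class and function names are hypothetical, and taking the top-scoring row as the self hit is an assumption.

```python
from dataclasses import dataclass


@dataclass
class BlatHit:
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int

    @property
    def query_coverage(self) -> float:
        """Fraction of the query covered by this hit (1-based, inclusive)."""
        return (self.q_end - self.q_start + 1) / self.q_size


def parse_hit(row: str) -> BlatHit:
    """Parse one BLAT result row; any leading link text is ignored."""
    f = row.split()[-11:]  # QUERY SCORE START END QSIZE IDENTITY CHROM STRAND START END SPAN
    return BlatHit(
        score=int(f[1]), q_start=int(f[2]), q_end=int(f[3]), q_size=int(f[4]),
        identity=float(f[5].rstrip("%")), chrom=f[6], strand=f[7],
        t_start=int(f[8]), t_end=int(f[9]),
    )


ROWS = [
    "YourSeq 3000 1 3000 3000 100.0% chr15 + 78877789 78880788 3000",
    "YourSeq 372 912 1325 3000 95.9% chr18 + 57457015 57457449 435",
    "YourSeq 369 913 1305 3000 97.0% chr1 + 20790406 20790798 393",
]

hits = [parse_hit(row) for row in ROWS]
self_hit = max(hits, key=lambda h: h.score)          # assumed to be the query's own locus
secondary = [h for h in hits if h is not self_hit]
best = max(secondary, key=lambda h: h.score)
print(best.chrom, best.score, f"{best.query_coverage:.1%}")  # chr18 372 13.8%
```

In this sample the best secondary hit covers about 14% of the 3000 bp query, compared with 100% for the self hit.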

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr15  +       78882455   78885454   3000
YourSeq  176    2393   2584  3000   97.4%     chr17  -       66460467   66460662   196
YourSeq  173    2174   2557  3000   94.8%     chr15  +       102327479  102327863  385
YourSeq  170    2395   2569  3000   98.9%     chr16  -       17965465   17965659   195
YourSeq  168    1      2213  3000   96.2%     chr2   -       34376011   34780573   404563
YourSeq  164    2385   2578  3000   93.7%     chr4   -       135405410  135405607  198
YourSeq  162    2399   2573  3000   96.6%     chr3   -       84827457   84827634   178
YourSeq  161    2385   2559  3000   96.6%     chr9   -       101127353  101127527  175
YourSeq  159    2396   2567  3000   96.5%     chrX   +       152279278  152279463  186
YourSeq  159    2395   2581  3000   94.9%     chr10  +       56455297   56455834   538
YourSeq  158    2395   2556  3000   97.6%     chr6   -       142965589  142965749  161
YourSeq  158    2395   2556  3000   98.8%     chr5   -       109898791  109898952  162
YourSeq  158    2393   2559  3000   95.8%     chr3   +       66891001   66891165   165
YourSeq  157    2395   2557  3000   98.2%     chr2   -       132170883  132171045  163
YourSeq  157    2395   2559  3000   97.6%     chr12  +       54705987   54706151   165
YourSeq  157    2392   2573  3000   96.0%     chr11  +       104163218  104163407  190
YourSeq  155    2363   2553  3000   95.4%     chr7   -       45658871   45659184   314
YourSeq  155    2395   2563  3000   96.5%     chr5   -       90266506   90266677   172
YourSeq  155    2385   2561  3000   93.3%     chr16  -       64793029   64793195   167
YourSeq  153    2395   2559  3000   95.2%     chr7   -       79234526   79234689   164

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.


Gene information: Gga1, golgi associated, gamma adaptin ear containing, ARF binding protein 1 [Mus musculus (house mouse)]
Gene ID: 106039, updated on 10-Oct-2019

Gene summary

Official Symbol: Gga1 (provided by MGI)
Official Full Name: golgi associated, gamma adaptin ear containing, ARF binding protein 1 (provided by MGI)
Primary source: MGI:MGI:2146207
See related: Ensembl:ENSMUSG00000033128
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: AU016030; AW209092; 4930406E12Rik
Expression: Ubiquitous expression in adrenal adult (RPKM 56.3), duodenum adult (RPKM 51.4) and 28 other tissues

Genomic context

Location: 15; 15 E1

Exon count: 17

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  15   NC_000081.6 (78877167..78894585)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    15   NC_000081.5 (78707620..78725015)

Chromosome 15 - NC_000081.6


Transcript information: This gene has 5 transcripts

Gene: Gga1 ENSMUSG00000033128

Description: golgi associated, gamma adaptin ear containing, ARF binding protein 1 [Source:MGI Symbol;Acc:MGI:2146207]
Gene Synonyms: 4930406E12Rik
Location: Chromosome 15: 78,877,190-78,894,585 forward strand. GRCm38:CM001008.2
About this gene: This gene has 5 transcripts (splice variants), 259 orthologues, 10 paralogues, is a member of 1 Ensembl protein family and is associated with 9 phenotypes.

Transcripts

Name      Transcript ID         bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Gga1-201  ENSMUST00000041587.7  3034  635aa       ENSMUSP00000035992.7  Protein coding           CCDS27625  Q8R0H9      TSL:1, GENCODE basic, APPRIS P1
Gga1-203  ENSMUST00000230192.1  2665  363aa       ENSMUSP00000155780.1  Nonsense mediated decay  -          A0A2R8VI72  -
Gga1-205  ENSMUST00000230772.1  860   No protein  -                     Retained intron          -          -           -
Gga1-202  ENSMUST00000229353.1  842   No protein  -                     Retained intron          -          -           -
Gga1-204  ENSMUST00000230243.1  393   No protein  -                     lncRNA                   -          -           -


[Figure: Ensembl genome browser view of the Gga1 locus, 37.40 kb, spanning 78.87 Mb to 78.90 Mb, with forward and reverse strands. Tracks and features shown: Gga1-201 (protein coding), Gga1-203 (nonsense mediated decay), Gga1-204 (lncRNA), Gga1-205 and Gga1-202 (retained intron); Sh3bp1-201, -202, -203 and -205 (protein coding), Sh3bp1-204 and -206 (nonsense mediated decay), Sh3bp1-207 (retained intron); Mir6955-201 (miRNA); Gm49510-201 (protein coding); Gm26634-201 (lncRNA); contig AL592169.14; Regulatory Build. Regulation legend: CTCF; enhancer; open chromatin; promoter; promoter flank. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (processed transcript; RNA gene).]


Transcript: ENSMUST00000041587

[Figure: Ensembl protein summary for Gga1-201 (protein coding; 17.40 kb, forward strand; protein ENSMUSP00000035...; scale 0 to 635 aa). Annotation tracks shown:
MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils)
Superfamily: ENTH/VHS; SSF89009; Clathrin adaptor, appendage, Ig-like subdomain superfamily
SMART: VHS domain; Clathrin adaptor, alpha/beta/gamma-adaptin, appendage, Ig-like subdomain
Pfam: VHS domain; GGA, N-GAT domain; Clathrin adaptor, alpha/beta/gamma-adaptin, appendage, Ig-like subdomain; GAT domain
PROSITE profiles: VHS domain; GAT domain; Gamma-adaptin ear (GAE) domain
PANTHER: PTHR45905; PTHR45905:SF4
Gene3D: ENTH/VHS; GAT domain superfamily; 2.60.40.1230; 1.20.5.170
CDD: cd17009; cd14239
Sequence variants (dbSNP and all other sources): missense variant; synonymous variant]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
