https://www.alphaknockout.com

Mouse Agbl4 Knockout Project (CRISPR/Cas9)

Objective: To create an Agbl4 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Agbl4 gene (NCBI Reference Sequence: NM_030231; Ensembl: ENSMUSG00000061298) is located on mouse chromosome 4. 13 exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 13 (Transcript: ENSMUST00000097920). Exon 3 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a knock-out allele exhibit abnormal platelet morphology and physiology, impaired megakaryopoiesis, increased spleen weight and increased susceptibility to HSV or VACV infection.

Exon 3 starts at about 9.75% of the way into the coding region and covers 7.72% of the coding region. The size of the effective KO region is ~125 bp. The KO region contains no other known gene.
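The coverage figure can be checked with simple arithmetic, and the same numbers show the expected effect on the reading frame. The sketch below assumes the coding region is 540 codons x 3 = 1620 bp (the 540 aa length listed for ENSMUST00000097920 in the transcript table, stop codon excluded) and that all ~125 bp of the KO region are coding. Both are assumptions made for illustration, and the variable names are hypothetical.

```python
# Assumed inputs: 540 aa protein -> 1620 bp CDS (stop codon excluded); ~125 bp KO region.
PROTEIN_LENGTH_AA = 540
KO_REGION_BP = 125

cds_bp = PROTEIN_LENGTH_AA * 3
coverage_pct = round(100 * KO_REGION_BP / cds_bp, 2)

# A deletion whose length is not a multiple of 3 shifts the downstream reading frame.
causes_frameshift = KO_REGION_BP % 3 != 0

print(f"CDS length: {cds_bp} bp")
print(f"KO region covers {coverage_pct}% of the CDS")
print(f"Deletion shifts the reading frame: {causes_frameshift}")
```

Under these assumptions the computed coverage matches the 7.72% stated above, and because 125 is not a multiple of 3, removing the region would be expected to shift the reading frame of the downstream exons.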


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele, 5' to 3', with exons 1, 3 and 13 labeled and two gRNA regions marked. Legend: exon of mouse Agbl4; knockout region.]


Overview of the Dot Plot (up) Window size: 15 bp

[Figure: self dot plot, axes labeled "Sequence 12". Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of Exon 3 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
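As a rough illustration of what a self dot plot with a 15 bp window detects, the sketch below records every pair of positions whose 15-mers are identical. It is a minimal sketch, not the tool used for this report: it covers forward-strand matches only (the plot legend also lists reverse-complement matches), the function name is hypothetical, and the input sequences are toy examples rather than the Agbl4 flanks.

```python
def self_dot_plot(seq, window=15):
    """Return (i, j) pairs with i < j whose window-length substrings are identical."""
    starts_by_word = {}
    for i in range(len(seq) - window + 1):
        starts_by_word.setdefault(seq[i:i + window], []).append(i)
    dots = set()
    for starts in starts_by_word.values():
        for a, i in enumerate(starts):
            for j in starts[a + 1:]:
                dots.add((i, j))
    return dots

# Toy inputs: a 16 bp unit repeated three times, and a 24 bp non-repetitive control.
unit = "ACGTTGCAAGGCTTAC"
repeat_dots = self_dot_plot(unit * 3)
control_dots = self_dot_plot("ACGTTGCAAGGCTTACGATCCGTA")

print("off-diagonal dots in tandem repeat:", len(repeat_dots))
print("off-diagonal dots in control:", len(control_dots))
```

A tandem repeat shows up as off-diagonal dots at a constant offset equal to the repeat unit length (here 16 bp), while a sequence without repeats yields no off-diagonal dots.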

Overview of the Dot Plot (down) Window size: 15 bp

[Figure: self dot plot, axes labeled "Sequence 12". Legend: Forward; Reverse Complement.]

Note: The 2000 bp section downstream of Exon 3 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up) Window size: 300 bp

[Figure: GC content distribution plot, axis labeled "Sequence 12".]

Summary: Full Length(2000bp) | A(31.9% 638) | C(13.0% 260) | T(40.45% 809) | G(14.65% 293)

Note: The 2000 bp section upstream of Exon 3 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
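A GC content distribution with a 300 bp window can be computed by sliding a window along the sequence and counting G and C in each position. The following is a minimal sketch under stated assumptions: the function name, the toy sequence and the 0.70 cutoff are all hypothetical, since the report does not state which threshold counts as a "significant high GC-content region".

```python
def gc_windows(seq, window=300, step=1):
    """Return (start, gc_fraction) for each full window along seq."""
    seq = seq.upper()
    result = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        result.append((start, (chunk.count("G") + chunk.count("C")) / window))
    return result

# Toy sequence (not the Agbl4 flank): 600 bp AT-only followed by 600 bp GC-only.
toy = "AT" * 300 + "GC" * 300
HIGH_GC = 0.70  # hypothetical cutoff, for illustration only

profile = gc_windows(toy, window=300, step=300)
high_gc_starts = [start for start, gc in profile if gc > HIGH_GC]

print(profile)
print(high_gc_starts)
```

With a real 2000 bp flank, `step=1` gives the full profile; windows above the chosen cutoff would mark stretches that may be harder to amplify or sequence.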

Overview of the GC Content Distribution (down) Window size: 300 bp

[Figure: GC content distribution plot, axis labeled "Sequence 12".]

Summary: Full Length(2000bp) | A(27.35% 547) | C(19.35% 387) | T(32.4% 648) | G(20.9% 418)

Note: The 2000 bp section downstream of Exon 3 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
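The overall GC content of each flank follows directly from the base counts in the two Summary lines. The sketch below copies those counts and computes the percentages; the dictionary layout and function name are illustrative only.

```python
# Base counts copied from the two Summary lines (2000 bp each).
counts = {
    "upstream": {"A": 638, "C": 260, "T": 809, "G": 293},
    "downstream": {"A": 547, "C": 387, "T": 648, "G": 418},
}

def gc_percent(base_counts):
    """Percentage of G plus C among all counted bases."""
    total = sum(base_counts.values())
    return round(100 * (base_counts["G"] + base_counts["C"]) / total, 2)

totals = {name: sum(c.values()) for name, c in counts.items()}
gc_by_region = {name: gc_percent(c) for name, c in counts.items()}

print(totals)
print(gc_by_region)
```

Both sets of counts sum to 2000 bp, and the overall GC content works out to about 27.65% upstream and 40.25% downstream.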


BLAT Search Results (up)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
---------------------------------------------------------------------------------------
YourSeq   2000      1  2000   2000    100.0%  chr4   +       110578052  110580051  2000
YourSeq     36    113   216   2000     87.5%  chr2   -        11827516   11827617   102
YourSeq     35      1   122   2000     92.9%  chr14  -       115143410  115143715   306
YourSeq     25      1    30   2000     93.4%  chr1   -         8506753    8507244   492
YourSeq     24    103   128   2000     96.2%  chr15  -        88336871   88336896    26
YourSeq     22   1340  1365   2000     92.4%  chr14  +        90667891   90667916    26

Note: The 2000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
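One way to read the upstream BLAT table is to separate the full-length self-match from the remaining hits and compare the best remaining score with the query size. The sketch below does this with the rows copied from the table; the field names are illustrative, and no significance threshold is applied because the report does not state one.

```python
from typing import NamedTuple

class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    identity_pct: float
    chrom: str
    strand: str
    t_start: int
    t_end: int

# Rows copied from the upstream BLAT table (QSIZE and SPAN columns omitted).
hits = [
    BlatHit(2000, 1, 2000, 100.0, "chr4", "+", 110578052, 110580051),
    BlatHit(36, 113, 216, 87.5, "chr2", "-", 11827516, 11827617),
    BlatHit(35, 1, 122, 92.9, "chr14", "-", 115143410, 115143715),
    BlatHit(25, 1, 30, 93.4, "chr1", "-", 8506753, 8507244),
    BlatHit(24, 103, 128, 96.2, "chr15", "-", 88336871, 88336896),
    BlatHit(22, 1340, 1365, 92.4, "chr14", "+", 90667891, 90667916),
]

QUERY_SIZE = 2000

# The highest-scoring hit is the query matching its own locus.
self_hit = max(hits, key=lambda h: h.score)
other_hits = [h for h in hits if h is not self_hit]
best_other = max(other_hits, key=lambda h: h.score)
score_ratio = best_other.score / QUERY_SIZE

print("self match:", self_hit.chrom, self_hit.t_start, self_hit.t_end)
print("best other hit:", best_other.chrom, "score", best_other.score)
print("best other score / query size:", score_ratio)
```

For the upstream flank, the best hit outside the Agbl4 locus scores 36 against a query of 2000 bp, which is the kind of gap the note describes as no significant similarity.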

BLAT Search Results (down)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
---------------------------------------------------------------------------------------
YourSeq   2000      1  2000   2000    100.0%  chr4   +       110580177  110582176  2000
YourSeq    315    968  1491   2000     85.1%  chr8   +        41587089   41587588   500
YourSeq    314    968  1491   2000     85.1%  chr1   +        89279824   89280300   477
YourSeq    310    931  1472   2000     85.6%  chr15  +        36192699   36193208   510
YourSeq    301    968  1491   2000     87.7%  chrX   +        71950798   71951282   485
YourSeq    300    972  1520   2000     85.3%  chr4   -        58940909   58941418   510
YourSeq    296    968  1508   2000     86.5%  chr10  +        93899119   93899634   516
YourSeq    291    968  1491   2000     85.0%  chr4   +        83407341   83407845   505
YourSeq    290    971  1490   2000     86.0%  chr12  +        75282446   75282900   455
YourSeq    285    968  1447   2000     84.1%  chr1   -       189063043  189063504   462
YourSeq    274    968  1491   2000     83.3%  chr2   +        50728133   50728616   484
YourSeq    274    982  1491   2000     84.7%  chr1   +       189357375  189357853   479
YourSeq    271    968  1491   2000     84.7%  chr6   +        81843799   81844346   548
YourSeq    271    968  1491   2000     86.3%  chr1   +        59097123   59097629   507
YourSeq    269    972  1447   2000     83.0%  chr18  -        37651604   37652053   450
YourSeq    268    970  1491   2000     84.3%  chr7   -       126652615  126653114   500
YourSeq    261    968  1491   2000     85.2%  chr9   -       121140078  121140561   484
YourSeq    255    968  1491   2000     82.6%  chr3   +        66819058   66819526   469
YourSeq    253    986  1490   2000     84.8%  chr16  -         4760440    4760875   436
YourSeq    252    968  1413   2000     88.9%  chr16  +        64876950   64877399   450

Note: The 2000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.


Gene and information:

Agbl4 ATP/GTP binding protein-like 4 [Mus musculus (house mouse)]
Gene ID: 78933, updated on 10-Oct-2019

Gene summary

Official Symbol: Agbl4 (provided by MGI)
Official Full Name: ATP/GTP binding protein-like 4 (provided by MGI)
Primary source: MGI:MGI:1918244
See related: Ensembl:ENSMUSG00000061298
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: 4930578N11Rik; 4931433A01Rik
Expression: Biased expression in testis adult (RPKM 3.9), cortex adult (RPKM 0.7) and 3 other tissues
Orthologs: human, all

Genomic context

Location: 4; 4 C7-D1
Exon count: 19

Annotation release | Status | Assembly | Chr | Location
108 | current | GRCm38.p6 (GCF_000001635.26) | 4 | NC_000070.6 (110397649..111664324)
Build 37.2 | previous assembly | MGSCv37 (GCF_000001635.18) | 4 | NC_000070.5 (110070396..111330115)

[Figure: genomic context view, Chromosome 4 - NC_000070.6]


Transcript information: This gene has 10 transcripts

Gene: Agbl4 ENSMUSG00000061298

Description: ATP/GTP binding protein-like 4 [Source:MGI Symbol;Acc:MGI:1918244]
Gene Synonyms: 4930578N11Rik, 4931433A01Rik, Ccp6
Location: Chromosome 4: 110,397,661-111,664,324 forward strand. GRCm38:CM000997.2
About this gene: This gene has 10 transcripts (splice variants), 218 orthologues, 5 paralogues, is a member of 1 Ensembl protein family and is associated with 16 phenotypes.

Transcripts

Name | Transcript ID | bp | Protein | Translation ID | Biotype | CCDS | UniProt | Flags
Agbl4-206 | ENSMUST00000106592.7 | 2959 | 494aa | ENSMUSP00000102202.1 | Protein coding | CCDS71443 | E4W7Y4 | TSL:1 GENCODE basic APPRIS P1
Agbl4-201 | ENSMUST00000080744.12 | 1864 | 463aa | ENSMUSP00000079568.6 | Protein coding | CCDS51263 | Q09LZ8 | TSL:1 GENCODE basic
Agbl4-202 | ENSMUST00000097920.8 | 1798 | 540aa | ENSMUSP00000095533.2 | Protein coding | CCDS51264 | Q09LZ8 | TSL:1 GENCODE basic
Agbl4-209 | ENSMUST00000148038.1 | 4012 | 277aa | ENSMUSP00000118551.1 | Protein coding | - | F6VQN0 | CDS 5' incomplete TSL:5
Agbl4-203 | ENSMUST00000106587.8 | 2631 | 164aa | ENSMUSP00000102197.2 | Protein coding | - | A2A9S6 | TSL:1 GENCODE basic
Agbl4-205 | ENSMUST00000106591.7 | 1406 | 346aa | ENSMUSP00000102201.1 | Protein coding | - | A2A9S7 | TSL:1 GENCODE basic
Agbl4-204 | ENSMUST00000106589.8 | 400 | 63aa | ENSMUSP00000102199.2 | Protein coding | - | A2A9S5 | CDS 3' incomplete TSL:5
Agbl4-208 | ENSMUST00000142460.1 | 834 | No protein | - | lncRNA | - | - | TSL:5
Agbl4-210 | ENSMUST00000154123.7 | 595 | No protein | - | lncRNA | - | - | TSL:5
Agbl4-207 | ENSMUST00000136433.1 | 574 | No protein | - | lncRNA | - | - | TSL:2


[Figure: Ensembl region view, 1.29 Mb, 110.4 Mb to 111.6 Mb, forward and reverse strands. Tracks: Genes (Comprehensive set...), Contigs, Regulatory Build.
Forward strand: Agbl4-201, Agbl4-202, Agbl4-203, Agbl4-204, Agbl4-205, Agbl4-206, Agbl4-209 (protein coding); Agbl4-207, Agbl4-208, Agbl4-210 (lncRNA); Gm12804-201, Gm12806-201 (processed pseudogene); Bend5-201, Bend5-206 (protein coding); Bend5-202, Bend5-203, Bend5-204, Bend5-205 (lncRNA).
Contigs: AL669959.22, AL662829.8, AL627183.16, AL669965.13.
Reverse strand: Gm12807-201, Gm12805-201 (processed pseudogene); n-R5s191-201 (rRNA).
Regulation legend: CTCF; Enhancer; Open Chromatin; Promoter; Promoter Flank; Transcription Factor Binding Site.
Gene legend: Protein Coding (Ensembl protein coding; merged Ensembl/Havana); Non-Protein Coding (RNA gene; pseudogene).]


Transcript: ENSMUST00000097920

[Figure: transcript and protein view of Agbl4-202 (protein coding), 1.26 Mb, forward strand. Protein ENSMUSP00000095... is annotated with the following domains and features:
Superfamily: SSF53187
SMART: Peptidase M14, carboxypeptidase A
Pfam: Cytosolic carboxypeptidase, N-terminal; Peptidase M14, carboxypeptidase A
PANTHER: PTHR12756:SF9; PTHR12756
Gene3D: 2.60.40.3120; 3.40.630.10
CDD: cd06908
Variant track: sequence variants (dbSNP and all other sources). Variant legend: missense variant; splice region variant; synonymous variant.
Scale bar: 0 to 540.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
