https://www.alphaknockout.com

Mouse Agbl4 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create an Agbl4 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Agbl4 gene (NCBI Reference Sequence: NM_030231; Ensembl: ENSMUSG00000061298) is located on mouse chromosome 4. Thirteen exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 13 (Transcript: ENSMUST00000097920). Exon 3 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Agbl4 gene. To engineer the targeting vector, the homology arms and cKO region will be generated by PCR using BAC clone RP23-283J17 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a knock-out allele exhibit abnormal platelet morphology and physiology, impaired megakaryopoiesis, increased spleen weight and increased susceptibility to HSV or VACV infection.

Exon 3 starts at about 9.75% of the way into the coding region. Knockout of exon 3 will result in a frameshift of the gene. The size of intron 2 for 5'-loxP site insertion is 101463 bp, and the size of intron 3 for 3'-loxP site insertion is 375391 bp. The size of the effective cKO region is ~625 bp. The cKO region does not contain any other known gene.
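The frameshift reasoning can be illustrated with a minimal sketch. It assumes only the general rule that deleting a coding segment preserves the reading frame when its length is a multiple of 3. The exon lengths used below are hypothetical, since the report does not state the coding length of exon 3, and the CDS length is an assumption inferred from the 540 aa protein listed for ENSMUST00000097920 plus one stop codon.

```python
def causes_frameshift(deleted_coding_bp: int) -> bool:
    """A deletion shifts the reading frame unless its length is a multiple of 3."""
    return deleted_coding_bp % 3 != 0


def cds_offset_bp(cds_length_bp: int, fraction: float) -> int:
    """Approximate nucleotide offset for a fractional position in the CDS."""
    return round(cds_length_bp * fraction)


# Assumption: CDS length inferred from a 540 aa protein plus one stop codon.
ASSUMED_CDS_BP = (540 + 1) * 3

if __name__ == "__main__":
    # Rough position of the exon 3 start within the coding sequence.
    print("approx. CDS offset (bp):", cds_offset_bp(ASSUMED_CDS_BP, 0.0975))
    # Hypothetical exon lengths, for illustration only.
    for hypothetical_len in (99, 100, 101):
        print(hypothetical_len, "bp deleted -> frameshift:",
              causes_frameshift(hypothetical_len))
```

With the real exon 3 coding length substituted for the hypothetical values, the same check indicates whether the deletion is expected to disrupt the downstream reading frame.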


Overview of the Targeting Strategy

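[Figure: schematic of the targeting strategy. Four rows are shown: (1) wildtype allele with exons 1, 3 and 13 labeled and the 5' and 3' gRNA regions marked; (2) targeting vector; (3) targeted allele; (4) constitutive KO allele (after Cre recombination).]

Legends: exon of mouse Agbl4; homology arm; cKO region; loxP site.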
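The step from the targeted allele to the constitutive KO allele can be sketched as a string operation. This is a toy model only: the flanking and exon sequences are placeholders, not Agbl4 sequence, and the function name `cre_excise` is hypothetical. It assumes two loxP sites in the same orientation, in which case Cre removes the intervening segment and leaves a single loxP site behind.

```python
# 34 bp loxP site: 13 bp inverted repeat, 8 bp spacer, 13 bp inverted repeat.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def cre_excise(allele: str, loxp: str = LOXP) -> str:
    """Remove the segment between the first two same-orientation loxP sites.

    One loxP site remains after excision. If fewer than two sites are
    found, the allele is returned unchanged.
    """
    first = allele.find(loxp)
    if first == -1:
        return allele
    second = allele.find(loxp, first + len(loxp))
    if second == -1:
        return allele
    # Keep everything up to and including the first site, then resume
    # after the second site.
    return allele[: first + len(loxp)] + allele[second + len(loxp):]


if __name__ == "__main__":
    # Placeholder sequences for illustration only.
    five_prime, exon3, three_prime = "AAAA", "GGGCCC", "TTTT"
    targeted = five_prime + LOXP + exon3 + LOXP + three_prime
    ko = cre_excise(targeted)
    print(len(targeted), "->", len(ko))
```

Searching the same strand for an identical 34 bp string only detects direct repeats, which is the configuration that leads to excision rather than inversion.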


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homology arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
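A self dot plot of this kind can be approximated by indexing every 10 bp window of the sequence and reporting positions where the same window occurs again, either on the forward strand or as a reverse complement. The sketch below is an assumption about how such a plot could be computed, not the tool used for the report. Function names are hypothetical and exact word matching is used for simplicity.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_matches(seq: str, window: int = 10):
    """Return (forward, reverse) lists of off-diagonal window matches.

    forward: pairs (i, j), i < j, where seq[i:i+window] == seq[j:j+window].
    reverse: pairs (i, p) where seq[i:i+window] equals the reverse
             complement of seq[p:p+window].
    """
    index = defaultdict(list)
    for i in range(len(seq) - window + 1):
        index[seq[i:i + window]].append(i)

    forward = sorted(
        (i, j) for positions in index.values()
        for i in positions for j in positions if i < j
    )

    rc = revcomp(seq)
    n = len(seq)
    reverse = []
    for j in range(len(rc) - window + 1):
        # Window j of the reverse complement covers seq[n-window-j : n-j].
        for i in index.get(rc[j:j + window], []):
            reverse.append((i, n - window - j))
    return forward, sorted(reverse)


if __name__ == "__main__":
    # Toy sequence with one 10 bp direct repeat, for illustration only.
    toy = "ACGTTGCAAC" + "GGG" + "ACGTTGCAAC"
    fwd, rev = self_dot_matches(toy, window=10)
    print("forward repeats:", fwd)
    print("reverse-complement matches:", rev)
```

Runs of consecutive forward matches parallel to the main diagonal would indicate direct or tandem repeats. An empty or sparse result corresponds to the clean matrix described in the note.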

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full length (7125 bp) | A (27.37%, 1950) | C (16.70%, 1190) | T (38.26%, 2726) | G (17.67%, 1259)

Note: The sequence of the homology arms and cKO region is analyzed to determine the GC content. No significant high-GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
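The summary figures can be cross-checked from the reported base counts, and a sliding-window GC profile can be computed along the same lines. The sketch below uses the counts from the summary line. The `sliding_gc` helper is a hypothetical illustration of a 300 bp window scan and is not the tool used for the report.

```python
# Base counts as reported in the summary line.
REPORTED_COUNTS = {"A": 1950, "C": 1190, "T": 2726, "G": 1259}


def percent_composition(counts: dict) -> dict:
    """Percentage of each base, rounded to two decimals."""
    total = sum(counts.values())
    return {base: round(100 * n / total, 2) for base, n in counts.items()}


def gc_percent_from_counts(counts: dict) -> float:
    """Overall GC percentage from base counts."""
    return 100 * (counts["G"] + counts["C"]) / sum(counts.values())


def sliding_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """GC percentage for each window of the given size along seq."""
    seq = seq.upper()
    return [
        100 * sum(seq[i:i + window].count(b) for b in "GC") / window
        for i in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    print("total length:", sum(REPORTED_COUNTS.values()))
    print("composition (%):", percent_composition(REPORTED_COUNTS))
    print("overall GC (%):", round(gc_percent_from_counts(REPORTED_COUNTS), 2))
```

The counts sum to the stated 7125 bp, and the overall GC content works out to roughly 34%, consistent with the note that no high-GC region is present.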


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr4   +       110576802  110579801  3000
YourSeq  95     232    583   3000   90.7%     chr13  +       102336585  102336936  352
YourSeq  86     230    582   3000   94.0%     chr10  +       98620561   98620932   372
YourSeq  70     230    553   3000   90.5%     chr11  -       92043379   92043744   366
YourSeq  57     230    369   3000   89.1%     chr7   -       56530701   56530842   142
YourSeq  48     230    399   3000   94.7%     chr3   -       102797710  102797890  181
YourSeq  44     135    262   3000   89.1%     chr1   +       97685646   97685774   129
YourSeq  43     539    782   3000   93.9%     chr12  -       60050959   60051210   252
YourSeq  41     230    350   3000   91.2%     chrX   -       131104561  131104680  120
YourSeq  37     241    350   3000   85.8%     chr2   -       67760842   67760949   108
YourSeq  37     226    276   3000   86.1%     chr12  +       42994719   42994767   49
YourSeq  36     1363   1466  3000   87.5%     chr2   -       11827516   11827617   102
YourSeq  35     230    266   3000   97.3%     chr11  -       19041215   19041251   37
YourSeq  35     538    582   3000   92.7%     chr12  +       97901651   97901697   47
YourSeq  34     230    276   3000   87.5%     chr2   -       113045660  113045705  46
YourSeq  34     232    272   3000   92.5%     chr17  +       51219749   51219789   41
YourSeq  33     230    276   3000   94.6%     chrX   -       132594614  132594661  48
YourSeq  33     230    266   3000   88.9%     chrX   +       131304471  131304506  36
YourSeq  33     535    583   3000   94.6%     chr12  +       50889266   50889316   51
YourSeq  32     539    582   3000   90.0%     chr12  +       13364780   13364825   46

Note: The 3000 bp section upstream of exon 3 is BLAT searched against the genome. Apart from the target locus itself on chr4, no significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr4   +       110580427  110583426  3000
YourSeq  358    1929   2399  3000   91.1%     chr12  -       55623900   55624512   613
YourSeq  306    718    1241  3000   84.8%     chr1   +       89279824   89280300   477
YourSeq  275    718    1159  3000   85.7%     chr1   -       189063064  189063504  441
YourSeq  267    2436   2823  3000   88.1%     chr19  -       7550610    7551001    392
YourSeq  257    2436   2824  3000   86.2%     chrX   -       163142236  163142622  387
YourSeq  256    2438   2804  3000   89.0%     chr15  +       59267060   59267427   368
YourSeq  255    2416   2805  3000   90.6%     chr2   -       181430657  181431052  396
YourSeq  255    2441   2805  3000   88.8%     chr1   -       158903545  158903916  372
YourSeq  253    2416   2823  3000   87.2%     chr19  -       47915903   47916295   393
YourSeq  252    2436   2824  3000   85.8%     chr4   -       39245342   39245735   394
YourSeq  252    2436   2824  3000   86.9%     chr9   +       40144080   40144472   393
YourSeq  251    2449   2824  3000   88.0%     chrX   +       41709822   41710198   377
YourSeq  248    2449   2826  3000   85.3%     chrX   +       86635892   86636267   376
YourSeq  246    2415   2795  3000   83.7%     chr8   -       73190451   73190826   376
YourSeq  246    2440   2805  3000   86.5%     chr8   +       127375860  127376224  365
YourSeq  245    2415   2823  3000   87.4%     chr15  -       24334632   24335282   651
YourSeq  245    2439   2824  3000   87.3%     chr1   -       176556363  176556760  398
YourSeq  244    2436   2826  3000   86.8%     chr2   -       43965706   43966108   403
YourSeq  244    2436   2806  3000   87.9%     chr1   -       179934994  179935366  373

Note: The 3000 bp section downstream of exon 3 is BLAT searched against the genome. Apart from the target locus itself on chr4, no significant similarity is found.
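The two self-hits on chr4 can be used to cross-check the stated cKO region size. The sketch below parses BLAT result rows in the column order shown above and computes the gap between the end of the upstream query and the start of the downstream query. It assumes that the two 3000 bp queries directly flank the cKO region, which the report does not state explicitly. The `Hit` class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(row: str) -> Hit:
    """Parse one whitespace-separated BLAT result row (QUERY column first)."""
    f = row.split()
    return Hit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
               float(f[5].rstrip("%")), f[6], f[7],
               int(f[8]), int(f[9]), int(f[10]))


def query_coverage(hit: Hit) -> float:
    """Fraction of the query covered by the aligned block."""
    return (hit.q_end - hit.q_start + 1) / hit.q_size


# Self-hits copied from the upstream and downstream result tables.
UP_SELF = parse_hit("YourSeq 3000 1 3000 3000 100.0% chr4 + 110576802 110579801 3000")
DOWN_SELF = parse_hit("YourSeq 3000 1 3000 3000 100.0% chr4 + 110580427 110583426 3000")

# Bases strictly between the two flanking queries (inclusive coordinates).
gap_bp = DOWN_SELF.t_start - UP_SELF.t_end - 1

if __name__ == "__main__":
    print("gap between flanking queries (bp):", gap_bp)
    top_off_target = parse_hit(
        "YourSeq 358 1929 2399 3000 91.1% chr12 - 55623900 55624512 613")
    print("top downstream off-target coverage:",
          round(query_coverage(top_off_target), 3))
```

Under that assumption the gap is 625 bp, which agrees with the effective cKO region size of ~625 bp given in the strategy summary.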


Gene information: Agbl4, ATP/GTP binding protein-like 4 [Mus musculus (house mouse)]
Gene ID: 78933, updated on 10-Oct-2019

Gene summary

Official Symbol: Agbl4 (provided by MGI)
Official Full Name: ATP/GTP binding protein-like 4 (provided by MGI)
Primary source: MGI:MGI:1918244
See related: Ensembl:ENSMUSG00000061298
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: 4930578N11Rik; 4931433A01Rik
Expression: Biased expression in testis adult (RPKM 3.9), cortex adult (RPKM 0.7) and 3 other tissues
Orthologs: human, all

Genomic context

Location: 4; 4 C7-D1

Exon count: 19

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  4    NC_000070.6 (110397649..111664324)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    4    NC_000070.5 (110070396..111330115)

Chromosome 4 - NC_000070.6


Transcript information: This gene has 10 transcripts

Gene: Agbl4 ENSMUSG00000061298

Description: ATP/GTP binding protein-like 4 [Source:MGI Symbol;Acc:MGI:1918244]
Gene Synonyms: 4930578N11Rik, 4931433A01Rik, Ccp6
Location: Chromosome 4: 110,397,661-111,664,324 forward strand. GRCm38:CM000997.2
About this gene: This gene has 10 transcripts (splice variants), 218 orthologues, 5 paralogues, is a member of 1 Ensembl protein family and is associated with 16 phenotypes.

Transcripts

Name       Transcript ID          bp    Protein     Translation ID        Biotype         CCDS       UniProt  Flags
Agbl4-206  ENSMUST00000106592.7   2959  494aa       ENSMUSP00000102202.1  Protein coding  CCDS71443  E4W7Y4   TSL:1, GENCODE basic, APPRIS P1
Agbl4-201  ENSMUST00000080744.12  1864  463aa       ENSMUSP00000079568.6  Protein coding  CCDS51263  Q09LZ8   TSL:1, GENCODE basic
Agbl4-202  ENSMUST00000097920.8   1798  540aa       ENSMUSP00000095533.2  Protein coding  CCDS51264  Q09LZ8   TSL:1, GENCODE basic
Agbl4-209  ENSMUST00000148038.1   4012  277aa       ENSMUSP00000118551.1  Protein coding  -          F6VQN0   CDS 5' incomplete, TSL:5
Agbl4-203  ENSMUST00000106587.8   2631  164aa       ENSMUSP00000102197.2  Protein coding  -          A2A9S6   TSL:1, GENCODE basic
Agbl4-205  ENSMUST00000106591.7   1406  346aa       ENSMUSP00000102201.1  Protein coding  -          A2A9S7   TSL:1, GENCODE basic
Agbl4-204  ENSMUST00000106589.8   400   63aa        ENSMUSP00000102199.2  Protein coding  -          A2A9S5   CDS 3' incomplete, TSL:5
Agbl4-208  ENSMUST00000142460.1   834   No protein  -                     lncRNA          -          -        TSL:5
Agbl4-210  ENSMUST00000154123.7   595   No protein  -                     lncRNA          -          -        TSL:5
Agbl4-207  ENSMUST00000136433.1   574   No protein  -                     lncRNA          -          -        TSL:2


[Figure: Ensembl genome browser view of the Agbl4 locus on chromosome 4, spanning 1.29 Mb (110.4 Mb to 111.6 Mb).]

Forward strand genes (comprehensive set): Agbl4-201, Agbl4-202, Agbl4-203, Agbl4-204, Agbl4-205, Agbl4-206, Agbl4-209 (protein coding); Agbl4-207, Agbl4-208, Agbl4-210 (lncRNA); Gm12804-201, Gm12806-201 (processed pseudogene); Bend5-201, Bend5-206 (protein coding); Bend5-202, Bend5-203, Bend5-204, Bend5-205 (lncRNA).
Contigs: AL669959.22, AL662829.8, AL627183.16, AL669965.13.
Reverse strand genes (comprehensive set): Gm12807-201, Gm12805-201 (processed pseudogene); n-R5s191-201 (rRNA).
Regulatory Build legend: CTCF; Enhancer; Open Chromatin; Promoter; Promoter Flank; Transcription Factor Binding Site.
Gene legend: Protein coding (Ensembl protein coding; merged Ensembl/Havana); Non-protein coding (RNA gene; pseudogene).


Transcript: ENSMUST00000097920

[Figure: Ensembl transcript view of Agbl4-202 (protein coding), spanning 1.26 Mb on the forward strand, with protein domain and variant tracks for ENSMUSP00000095... plotted on a scale from 0 to 540 amino acids.]

Superfamily: SSF53187
SMART: Peptidase M14, carboxypeptidase A
Pfam: Cytosolic carboxypeptidase, N-terminal; Peptidase M14, carboxypeptidase A
PANTHER: PTHR12756:SF9; PTHR12756
Gene3D: 2.60.40.3120; 3.40.630.10
CDD: cd06908
Sequence variants (dbSNP and all other sources), variant legend: missense variant; splice region variant; synonymous variant.

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
