https://www.alphaknockout.com

Mouse Arf5 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create an Arf5 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Arf5 gene (NCBI Reference Sequence: NM_007480; Ensembl: ENSMUSG00000020440) is located on mouse chromosome 6. Six exons have been identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 6 (Transcript: ENSMUST00000020717). Exons 2~4 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Arf5 gene. To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP24-176I9 as the template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:

Exon 2 starts at about 12.59% of the coding region. Knockout of exons 2~4 will result in a frameshift of the gene. Intron 1, targeted for 5'-loxP site insertion, is 534 bp; intron 4, targeted for 3'-loxP site insertion, is 494 bp. The effective cKO region is ~1452 bp. The cKO region does not contain any other known gene.
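The frameshift reasoning in the note can be sketched as a short check. This is only an illustration: the per-exon coding lengths below are hypothetical placeholders (the report does not list them), and the coding length of 180 x 3 nt assumes the 180 aa protein reported for ENSMUST00000020717, with the stop codon excluded.

```python
def causes_frameshift(deleted_coding_lengths):
    """Return True if deleting these coding segments shifts the reading frame.

    A deletion keeps the frame only when the total number of removed
    coding nucleotides is a multiple of 3.
    """
    return sum(deleted_coding_lengths) % 3 != 0


def coding_offset_nt(fraction, cds_length_nt):
    """Approximate number of coding nucleotides upstream of a position
    given as a fraction of the coding region."""
    return round(fraction * cds_length_nt)


# HYPOTHETICAL coding lengths (nt) for exons 2, 3 and 4; not from the report.
hypothetical_exon_lengths = [81, 110, 126]
print("frameshift:", causes_frameshift(hypothetical_exon_lengths))

# Assumption: coding region = 180 codons x 3 nt (stop codon excluded).
print("coding nt before exon 2:", coding_offset_nt(0.1259, 180 * 3))
```

With real exon boundaries taken from the transcript annotation, the same check would confirm whether removing exons 2~4 leaves the downstream exons out of frame.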


Overview of the Targeting Strategy

[Figure: schematic of the targeting strategy, drawn 5' to 3'. Panels: wildtype allele (exons 1 to 6, with the two gRNA regions marked), targeting vector, targeted allele, and constitutive KO allele (after Cre recombination). Legend: homology arm; exon of mouse Arf5; cKO region; loxP site.]


Overview of the Dot Plot

[Figure: self-alignment dot plot of Sequence 12. Window size: 10 bp. Forward and reverse-complement matches are shown.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether it contains tandem repeats. No significant tandem repeats are found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
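As a rough illustration of the self-alignment described in the note, the sketch below lists the positions at which a 10 bp window of a sequence matches another window of the same sequence, either directly or as a reverse complement. It is a simplified stand-in for a dot plot tool, not the software used for this report; the function names and the toy sequences are assumptions made for the example.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def self_dot_plot(seq, window=10):
    """Return (forward, reverse) lists of (i, j) window start positions.

    forward: windows at i < j with identical sequence.
    reverse: window at j is the reverse complement of the window at i.
    """
    seq = seq.upper()
    index = {}
    for i in range(len(seq) - window + 1):
        index.setdefault(seq[i:i + window], []).append(i)

    forward = sorted(
        (i, j) for pos in index.values() for i in pos for j in pos if i < j
    )
    reverse = []
    for j in range(len(seq) - window + 1):
        rc = reverse_complement(seq[j:j + window])
        for i in index.get(rc, []):
            reverse.append((i, j))
    return forward, reverse


# Toy input: a 10 bp unit repeated after a 3 bp spacer (direct repeat).
toy = "AAAACCCCGG" + "TTT" + "AAAACCCCGG"
forward_hits, reverse_hits = self_dot_plot(toy, window=10)
print(forward_hits)
```

Long diagonal runs of forward hits away from the main diagonal would indicate repeats; the note reports that no significant ones were seen for this region.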

Overview of the GC Content Distribution

[Figure: GC content distribution along Sequence 12. Window size: 300 bp.]

Summary: Full Length(7949bp) | A(22.52% 1790) | C(26.71% 2123) | T(23.88% 1898) | G(26.9% 2138)

Note: The sequence of the homologous arms and cKO region is analyzed to determine its GC content. Regions of significantly high GC content are found, so constructing this targeting vector may be difficult.
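The overall GC fraction follows directly from the base counts in the summary line, and a sliding window like the one in the plot can be sketched as below. The base counts come from the report; the `windowed_gc` helper, its `step` parameter and the toy input are assumptions for illustration, since the actual sequence is not reproduced here.

```python
# Base counts taken from the summary line of the report.
counts = {"A": 1790, "C": 2123, "T": 1898, "G": 2138}
total = sum(counts.values())
gc_fraction = (counts["G"] + counts["C"]) / total
print(f"length = {total} bp, overall GC = {gc_fraction:.2%}")


def windowed_gc(seq, window=300, step=1):
    """GC fraction of each window of `window` bp, advancing by `step` bp."""
    seq = seq.upper()
    return [
        sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


# Toy input only: one all-GC window followed by one all-AT window.
print(windowed_gc("GGCC" + "AATT", window=4, step=4))
```

Windows whose GC fraction is well above the overall value are the ones likely to complicate PCR amplification and vector assembly.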


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr6   +       28421041   28424040   3000
YourSeq  161    1742    2216  3000   90.5%     chr17  +       70677308   70678286   979
YourSeq  147    1852    2203  3000   85.5%     chr13  -       3587368    3587692    325
YourSeq  145    1850    2029  3000   90.1%     chr6   +       124950067  124950233  167
YourSeq  145    1856    2212  3000   94.0%     chr1   +       182056105  182056612  508
YourSeq  143    1851    2028  3000   95.0%     chr10  +       76466823   76467070   248
YourSeq  141    1855    2029  3000   95.0%     chr16  -       3290173    3290352    180
YourSeq  141    1792    1998  3000   93.9%     chr10  -       60004328   60004650   323
YourSeq  141    1851    2004  3000   96.2%     chr15  +       99154100   99154255   156
YourSeq  140    1850    2004  3000   95.5%     chr3   -       157980298  157980454  157
YourSeq  140    1834    2002  3000   90.7%     chr9   +       83593374   83593537   164
YourSeq  140    1856    2029  3000   88.6%     chr9   +       64768950   64769106   157
YourSeq  140    1837    2001  3000   89.9%     chr7   +       55786184   55786340   157
YourSeq  140    1851    2004  3000   95.5%     chr11  +       82777460   82777613   154
YourSeq  139    1850    2004  3000   95.5%     chr6   -       113535255  113535412  158
YourSeq  139    1851    2004  3000   95.5%     chr17  -       17422766   17422921   156
YourSeq  139    1851    2004  3000   95.5%     chr15  -       93727472   93727627   156
YourSeq  139    1836    2004  3000   92.0%     chr15  -       38534585   38534752   168
YourSeq  139    1832    2004  3000   88.4%     chr5   +       142684874  142685038  165
YourSeq  139    1856    2004  3000   96.7%     chr5   +       108254506  108254654  149

Note: The 3000 bp section upstream of exon 2 was searched against the genome with BLAT. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr6   +       28425493   28428492   3000
YourSeq  52     2706    2766  3000   86.0%     chr9   +       67121911   67121967   57
YourSeq  52     2708    2761  3000   98.2%     chr7   +       72755100   72755153   54
YourSeq  50     2708    2761  3000   94.3%     chr13  -       45969697   45969749   53
YourSeq  50     2708    2761  3000   94.3%     chr12  -       13685566   13685618   53
YourSeq  50     2708    2761  3000   94.3%     chr12  -       8443011    8443063    53
YourSeq  50     2708    2761  3000   94.3%     chr1   -       27264706   27264758   53
YourSeq  50     2708    2762  3000   98.2%     chr15  +       91484483   91484538   56
YourSeq  50     2708    2761  3000   94.3%     chr13  +       115771860  115771912  53
YourSeq  50     2708    2761  3000   94.3%     chr1   +       184741220  184741272  53
YourSeq  50     2708    2761  3000   94.3%     chr1   +       140403553  140403605  53
YourSeq  49     2708    2761  3000   98.1%     chr14  -       54102937   54103013   77
YourSeq  49     2708    2761  3000   98.1%     chr11  -       109059341  109059394  54
YourSeq  49     2708    2762  3000   92.5%     chr10  +       19736409   19736462   54
YourSeq  49     2708    2761  3000   98.1%     chr1   +       136451881  136451973  93
YourSeq  48     2708    2761  3000   92.4%     chr8   -       65426397   65426449   53
YourSeq  48     2708    2761  3000   92.4%     chr14  -       9069078    9069130    53
YourSeq  48     2706    2761  3000   94.0%     chr12  -       99549959   99550013   55
YourSeq  48     2712    2761  3000   98.0%     chr10  +       115428037  115428086  50
YourSeq  47     2715    2761  3000   100.0%    chr10  -       96615685   96615731   47

Note: The 3000 bp section downstream of exon 4 was searched against the genome with BLAT. No significant similarity is found.
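The chr6 coordinates of the two 100% identity hits (one per BLAT search) can be used to cross-check the size of the region lying between the upstream and downstream queries. The sketch below assumes the coordinates are 1-based and inclusive, which is consistent with each hit spanning exactly 3000 bp; the helper names are illustrative only.

```python
def span_bp(start, end):
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


def gap_between(upstream_end, downstream_start):
    """Number of bases strictly between two non-overlapping intervals."""
    return downstream_start - upstream_end - 1


# chr6 coordinates of the 100% identity hits reported above.
up_start, up_end = 28421041, 28424040      # 3000 bp upstream of exon 2
down_start, down_end = 28425493, 28428492  # 3000 bp downstream of exon 4

print(span_bp(up_start, up_end), span_bp(down_start, down_end))
print("bases between the two queries:", gap_between(up_end, down_start))
```

The gap between the two queries works out to 1452 bp, the same figure the strategy summary gives for the effective cKO region.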


Gene information: Arf5, ADP-ribosylation factor 5 [Mus musculus (house mouse)]
Gene ID: 11844, updated on 10-Oct-2019

Gene summary

Official Symbol: Arf5 (provided by MGI)
Official Full Name: ADP-ribosylation factor 5 (provided by MGI)
Primary source: MGI:MGI:99434
See related: Ensembl:ENSMUSG00000020440
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Expression: Ubiquitous expression in duodenum adult (RPKM 584.2), small intestine adult (RPKM 299.8) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 6; 6 A3.3
Exon count: 6

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  6    NC_000072.6 (28423604..28426602)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    6    NC_000072.5 (28373640..28376499)

[Figure: genomic context of Arf5 on Chromosome 6 - NC_000072.6]


Transcript information: This gene has 3 transcripts

Gene: Arf5 ENSMUSG00000020440

Description: ADP-ribosylation factor 5 [Source:MGI Symbol;Acc:MGI:99434]
Location: Chromosome 6: 28,423,560-28,426,602 forward strand. GRCm38:CM000999.2
About this gene: This gene has 3 transcripts (splice variants), 209 orthologues, 29 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name      Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Arf5-201  ENSMUST00000020717.11  1116  180aa       ENSMUSP00000020717.5  Protein coding   CCDS19952  P84084   TSL:1 GENCODE basic APPRIS P1
Arf5-202  ENSMUST00000169841.1   768   180aa       ENSMUSP00000127281.1  Protein coding   CCDS19952  P84084   TSL:5 GENCODE basic APPRIS P1
Arf5-203  ENSMUST00000202028.1   919   No protein  -                     Retained intron  -          -        TSL:NA

[Figure: Ensembl region view, 23.04 kb, Chromosome 6 from 28.415 Mb to 28.435 Mb. Forward strand: Arf5-201 and Arf5-202 (protein coding), Arf5-203 (retained intron), Fscn3-201 (protein coding), Fscn3-202 (lncRNA). Contig: AC068608.5. Reverse strand: Gcc1-201, Gcc1-202, Gcc1-203 and Gcc1-204 (protein coding). Regulatory Build legend: CTCF, promoter, promoter flank. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (processed transcript, RNA gene).]


Transcript: ENSMUST00000020717

[Figure: protein domain view of Arf5-201 (protein coding; 3.04 kb, forward strand; protein ENSMUSP00000020...; scale bar 0 to 180 aa). Annotated features:
Coiled-coils (Ncoils)
TIGRFAM: Small GTP-binding protein domain
Superfamily: P-loop containing nucleoside triphosphate hydrolase
SMART: SM00175, SM00178, SM00177
Prints: Small GTPase superfamily, ARF/SAR type
Pfam: Small GTPase superfamily, ARF/SAR type
PROSITE profiles: PS51417
PANTHER: PTHR11711:SF314, PTHR11711
Gene3D: 3.40.50.300
CDD: cd04150]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
