https://www.alphaknockout.com

Mouse Arl8a Conditional Knockout Project (CRISPR/Cas9)

Objective: To create an Arl8a conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Arl8a gene (NCBI Reference Sequence: NM_026823; Ensembl: ENSMUSG00000026426) is located on mouse chromosome 1. Seven exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 7 (Transcript: ENSMUST00000027684). Exons 2-3 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Arl8a gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-355D14 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts at about 22.22% of the coding region. Knockout of exons 2-3 will result in a frameshift of the gene. The size of intron 1 for 5'-loxP site insertion is 5370 bp, and the size of intron 3 for 3'-loxP site insertion is 1309 bp. The size of the effective cKO region is ~896 bp. The cKO region does not contain any other known gene.
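The conditional design relies on Cre recombinase deleting the segment between two loxP sites that sit in the same orientation, leaving one loxP site behind. The following is a toy string-level sketch of that excision, not the project's actual allele: the flanking and floxed sequences are made-up placeholders, and the function name cre_excise is hypothetical. Only the 34 bp loxP sequence is the canonical one.

```python
# Toy model of Cre-mediated excision between two directly repeated loxP sites.
# All sequences except LOXP are invented placeholders, not Arl8a sequence.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"  # canonical 34 bp loxP site


def cre_excise(allele: str, site: str = LOXP) -> str:
    """Remove the segment between the first two same-orientation sites.

    One copy of the site is left behind. The allele is returned unchanged
    if fewer than two sites are found.
    """
    first = allele.find(site)
    if first == -1:
        return allele
    second = allele.find(site, first + len(site))
    if second == -1:
        return allele
    # Keep everything up to the first site, then resume at the second site.
    return allele[:first] + allele[second:]


left_arm = "GGGTTTAAACCC"      # placeholder 5' homology arm
floxed = "ACGTACGTACGTAC"      # placeholder for the floxed exons
right_arm = "TTTGGGCCCAAA"     # placeholder 3' homology arm

targeted = left_arm + LOXP + floxed + LOXP + right_arm
constitutive_ko = cre_excise(targeted)
```

In this sketch the recombined allele is shorter than the targeted allele by the floxed segment plus one loxP site, which mirrors the relationship between the targeted allele and the constitutive KO allele.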


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (exons 1-7, with the 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Arl8a, homology arm, cKO region, loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 aligned against itself, showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix, so it may be difficult to construct this targeting vector.
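A minimal sketch of how such a self-alignment can be computed, assuming an exact-match criterion over a 10 bp window. The tool and matching rule actually used for the report are not stated, and the function names and the example sequence are hypothetical.

```python
from collections import defaultdict


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def window_matches(seq_x: str, seq_y: str, window: int = 10):
    """Return (i, j) pairs where seq_x[i:i+window] equals seq_y[j:j+window]."""
    index = defaultdict(list)
    for j in range(len(seq_y) - window + 1):
        index[seq_y[j:j + window]].append(j)
    return [(i, j)
            for i in range(len(seq_x) - window + 1)
            for j in index.get(seq_x[i:i + window], [])]


def self_dot_plot(seq: str, window: int = 10):
    """Dots for a sequence against itself, excluding the main diagonal."""
    forward = [(i, j) for i, j in window_matches(seq, seq, window) if i != j]
    reverse = window_matches(seq, reverse_complement(seq), window)
    return forward, reverse


# Hypothetical example: one 10 bp unit repeated twice in tandem.
unit = "ACGGTCATTG"
forward, reverse = self_dot_plot(unit * 2, window=10)
```

Off-diagonal forward dots, such as the pair produced by the tandem copy in this example, are the signal that the note above refers to. Reverse-complement dots would indicate inverted repeats instead.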

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(7396bp) | A(24.63% 1822) | C(24.68% 1825) | T(26.78% 1981) | G(23.9% 1768)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
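The base counts in the summary line can be checked, and the overall GC fraction derived, with a few lines of arithmetic. The sketch below also includes a windowed GC function. The 300 bp window comes from the report, but the step size and the function name gc_windows are assumptions, since the report does not say how the windows were advanced.

```python
# Base counts copied from the summary line of the report.
counts = {"A": 1822, "C": 1825, "T": 1981, "G": 1768}

total = sum(counts.values())  # should equal the stated full length
gc_percent = 100 * (counts["G"] + counts["C"]) / total


def gc_windows(seq: str, window: int = 300, step: int = 300):
    """GC percentage of each full window of the sequence.

    The step size is an assumption; a smaller step gives a sliding window.
    """
    seq = seq.upper()
    return [100 * sum(base in "GC" for base in seq[i:i + window]) / window
            for i in range(0, len(seq) - window + 1, step)]
```

The counts sum to the stated 7396 bp and give an overall GC content of about 48.6%, which is consistent with the note that no high GC-content region stands out.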


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr1   +       135149222  135152221  3000
YourSeq  163    215    393   3000   95.6%     chr2   -       169786019  169786197  179
YourSeq  162    216    393   3000   96.1%     chr8   +       111647684  111647872  189
YourSeq  162    216    393   3000   94.4%     chr11  +       98865745   98865921   177
YourSeq  161    215    393   3000   95.0%     chr11  +       88858048   88858226   179
YourSeq  161    215    393   3000   95.0%     chr10  +       3176254    3176432    179
YourSeq  160    215    393   3000   95.0%     chr1   -       181294859  181295039  181
YourSeq  159    215    393   3000   95.5%     chr2   -       23166782   23166961   180
YourSeq  159    216    394   3000   94.5%     chr16  -       32260435   32260613   179
YourSeq  159    221    409   3000   94.5%     chr6   +       135209514  135209741  228
YourSeq  158    216    393   3000   95.5%     chr15  -       25699884   25700063   180
YourSeq  158    216    392   3000   94.9%     chr11  -       116633605  116633781  177
YourSeq  158    225    393   3000   97.1%     chr10  -       85931879   85932048   170
YourSeq  158    226    393   3000   97.7%     chr4   +       40838574   40838747   174
YourSeq  158    226    393   3000   97.1%     chr15  +       97149445   97149612   168
YourSeq  158    226    393   3000   97.7%     chr1   +       68893575   68893747   173
YourSeq  157    218    393   3000   95.5%     chr17  -       29080136   29080318   183
YourSeq  157    226    394   3000   96.5%     chr12  -       59186521   59186689   169
YourSeq  157    221    393   3000   95.4%     chr1   +       24188164   24188336   173
YourSeq  156    214    393   3000   92.0%     chr13  -       106921086  106921261  176

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. Apart from the query locus itself on chr1, no significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr1   +       135153118  135156117  3000
YourSeq  138    436    706   3000   87.4%     chr18  +       12097617   12097876   260
YourSeq  133    458    703   3000   92.4%     chr11  -       98212842   98213519   678
YourSeq  129    399    703   3000   88.1%     chrX   -       73830839   73831200   362
YourSeq  114    229    674   3000   85.4%     chr1   +       74980893   75158448   177556
YourSeq  113    411    695   3000   87.6%     chr17  -       28745220   28745718   499
YourSeq  108    565    745   3000   88.0%     chr1   -       81979528   81979702   175
YourSeq  108    413    701   3000   85.6%     chr1   -       71519314   71768681   249368
YourSeq  107    547    711   3000   83.6%     chr3   +       108864991  108865152  162
YourSeq  106    395    653   3000   91.7%     chr15  +       76411536   76411800   265
YourSeq  106    438    720   3000   89.5%     chr1   +       10017924   10040922   22999
YourSeq  101    386    703   3000   83.8%     chr7   +       109997959  109998243  285
YourSeq  97     563    700   3000   87.3%     chr9   -       101059254  101059766  513
YourSeq  96     377    512   3000   89.5%     chr18  -       36526893   36527374   482
YourSeq  96     565    729   3000   84.0%     chr1   -       41099510   41099672   163
YourSeq  96     373    513   3000   90.0%     chr4   +       11274805   11274958   154
YourSeq  95     393    513   3000   90.0%     chr11  -       88336621   88336742   122
YourSeq  93     563    708   3000   86.3%     chr14  +       10525055   10525197   143
YourSeq  93     565    705   3000   92.7%     chr1   +       52993190   52993348   159
YourSeq  92     393    513   3000   90.6%     chr8   +       64742217   64742338   122

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. Apart from the query locus itself on chr1, no significant similarity is found.
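The two 100% chr1 hits above bracket the region between the upstream and downstream flanks, so their coordinates give a quick cross-check of the stated ~896 bp effective cKO region. This is a sketch that assumes the BLAT coordinates are 1-based and inclusive, which is consistent with each hit spanning exactly 3000 bp. The helper name inclusive_length is hypothetical.

```python
def inclusive_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive coordinate interval."""
    return end - start + 1


# Top (self) hits on chr1, copied from the two BLAT tables.
up_start, up_end = 135149222, 135152221        # 3000 bp upstream of exon 2
down_start, down_end = 135153118, 135156117    # 3000 bp downstream of exon 3

# Bases strictly between the two flanks.
cko_span = inclusive_length(up_end + 1, down_start - 1)
```

Under that assumption, the gap between the flanks comes to 896 bp, matching the effective cKO region size given in the strategy notes.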


Gene and information: Arl8a ADP-ribosylation factor-like 8A [Mus musculus (house mouse)]

Gene ID: 68724, updated on 12-Aug-2019

Gene summary

Official Symbol: Arl8a (provided by MGI)
Official Full Name: ADP-ribosylation factor-like 8A (provided by MGI)
Primary source: MGI:MGI:1915974
See related: Ensembl:ENSMUSG00000026426
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: gie2; Arl10b; 1110033P22Rik
Expression: Ubiquitous expression in CNS E18 (RPKM 133.7), whole brain E14.5 (RPKM 99.2) and 25 other tissues
Orthologs: human

Genomic context

Location: 1; 1 E4

Exon count: 7

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  1    NC_000067.6 (135146834..135156268)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    1    NC_000067.5 (137043411..137052845)

Chromosome 1 - NC_000067.6


Transcript information: This gene has 4 transcripts

Gene: Arl8a ENSMUSG00000026426

Description: ADP-ribosylation factor-like 8A [Source:MGI Symbol;Acc:MGI:1915974]
Gene Synonyms: 1110033P22Rik, Arl10b
Location: Chromosome 1: 135,146,824-135,156,269 forward strand. GRCm38:CM000994.2
About this gene: This gene has 4 transcripts (splice variants), 192 orthologues, 29 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Arl8a-201  ENSMUST00000027684.10  1713  186aa       ENSMUSP00000027684.4  Protein coding   CCDS15316  Q8VEH3   TSL:1, GENCODE basic, APPRIS P1
Arl8a-203  ENSMUST00000125774.1   1495  165aa       ENSMUSP00000121545.1  Protein coding   -          F6QKK2   CDS 5' incomplete, TSL:3
Arl8a-202  ENSMUST00000123344.1   664   No protein  -                     Retained intron  -          -        TSL:2
Arl8a-204  ENSMUST00000141177.1   472   No protein  -                     Retained intron  -          -        TSL:2

[Figure: Ensembl region view, 29.45 kb, 135.14-135.16 Mb. Forward strand: Ptpn7-201, Ptpn7-202 and Ptpn7-203 (protein coding), Ptpn7-204 (retained intron), Arl8a-201 and Arl8a-203 (protein coding), Arl8a-202 and Arl8a-204 (retained intron), Gm15445-201 (lncRNA). Contig: AC133097.3. Reverse strand: Gm26280-201 (rRNA), Gpr37l1-201 (protein coding). Regulatory Build legend: CTCF, enhancer, promoter, promoter flank. Gene legend: protein coding (Ensembl protein coding, merged Ensembl/Havana); non-protein coding (processed transcript, RNA gene).]


Transcript: ENSMUST00000027684

[Figure: Ensembl protein summary for Arl8a-201 (protein coding; 9.45 kb, forward strand; ENSMUSP00000027...), scale 0-186 aa. Domain annotations: TIGRFAM, Small GTP-binding protein domain; Superfamily, P-loop containing nucleoside triphosphate hydrolase; SMART, SM00177, SM00178, SM00175; Prints, Small GTPase superfamily, ARF/SAR type; Pfam, Small GTPase superfamily, ARF/SAR type; PROSITE profiles, PS51417; PANTHER, PTHR45732:SF11, PTHR45732; Gene3D, 3.40.50.300; CDD, cd04159. Sequence variants (dbSNP and all other sources): splice region variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
