https://www.alphaknockout.com

Mouse Arpc4 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create an Arpc4 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Arpc4 gene (NCBI Reference Sequence: NM_026552; Ensembl: ENSMUSG00000079426) is located on mouse chromosome 6. Six exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 6 (Transcript: ENSMUST00000156898). Exons 4~5 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Arpc4 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-339P1 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Deleting exons 4~5 does not cause a frameshift; these exons cover 52.98% of the coding region. Intron 3, which receives the 5'-loxP site, is 2007 bp; intron 5, which receives the 3'-loxP site, is 3110 bp. The effective cKO region is ~2079 bp and contains no other known gene.
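The frameshift statement in the note can be checked with simple modular arithmetic. The sketch below is illustrative only: the report does not give the coding length of exons 4~5, so the 267 bp value is back-calculated from the stated 52.98% and from the 168 aa protein length listed for ENSMUST00000156898, assuming the stop codon is excluded from the coding length.

```python
def causes_frameshift(deleted_coding_bp: int) -> bool:
    """Return True if removing this many coding bases shifts the reading frame."""
    return deleted_coding_bp % 3 != 0


# 168 aa protein -> 504 coding bases (ASSUMPTION: stop codon not counted).
cds_bp = 168 * 3

# ASSUMPTION: back-calculated from "52.98% of the coding region";
# this length is not stated in the report itself.
deleted_bp = round(0.5298 * cds_bp)

print(deleted_bp, causes_frameshift(deleted_bp))
```

Under these assumptions the deleted coding length is a multiple of three, which is consistent with the note that the region is not a frameshift region.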


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (with the 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Arpc4, homology arm, cKO region, loxP site.]
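The step from the targeted allele to the constitutive KO allele can be illustrated as a string operation. This is a toy sketch, not the vector design: it assumes two loxP sites in the same orientation, uses the canonical 34 bp loxP sequence (not stated in the report), and the flanking and floxed sequences are placeholders.

```python
# Canonical 34 bp loxP site (ASSUMPTION: the report does not list the sequence).
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(allele: str, loxp: str = LOXP) -> str:
    """Remove the segment between two same-orientation loxP sites.

    One loxP site remains after recombination. If fewer than two sites
    are present, the allele is returned unchanged.
    """
    first = allele.find(loxp)
    if first == -1:
        return allele
    second = allele.find(loxp, first + len(loxp))
    if second == -1:
        return allele
    # Keep everything up to and including the first site, then resume
    # after the second site.
    return allele[: first + len(loxp)] + allele[second + len(loxp):]


# Placeholder sequences standing in for the arms and the cKO region.
targeted = "AAAA" + LOXP + "CCCCGGGG" + LOXP + "TTTT"
ko_allele = cre_excise(targeted)
print(len(targeted), len(ko_allele))
```

The floxed placeholder segment and one loxP copy are removed, leaving a single loxP site between the two flanks.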


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether it contains tandem repeats. Tandem repeats are found in the dot plot matrix, so this targeting vector may be difficult to construct.
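A self dot plot of this kind can be approximated by indexing every window-sized substring and recording where it recurs. The following is a minimal sketch under that assumption; it is not the tool used to produce the figure, and the function names are hypothetical.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) match coordinates.

    forward: windows at i and j (i < j) are identical.
    reverse: window at i equals the reverse complement of the window at j.
    """
    index = defaultdict(list)
    last_start = len(seq) - window
    for i in range(last_start + 1):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(last_start + 1):
        kmer = seq[i:i + window]
        forward.extend((i, j) for j in index[kmer] if j > i)
        # .get avoids adding empty entries to the index.
        reverse.extend((i, j) for j in index.get(revcomp(kmer), []) if j >= i)
    return forward, reverse


# Toy tandem repeat: three copies of a 7 bp unit, scanned with a 7 bp window.
fwd, rev = self_dot_plot("GATTACA" * 3, window=7)
print(fwd[:3])
```

Off-diagonal forward matches at a constant offset, as in the toy repeat, are the pattern that indicates tandem repeats.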

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(8579bp) | A(25.28% 2169) | C(22.83% 1959) | T(27.86% 2390) | G(24.02% 2061)

Note: The sequence of the homologous arms and cKO region is analyzed to determine its GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
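The overall GC content follows directly from the base counts in the summary line. The sketch below recomputes it and adds a windowed GC function of the kind the plot implies; the function name and the non-overlapping default step are assumptions, since the report only states the window size.

```python
# Base counts taken from the summary line of the report.
counts = {"A": 2169, "C": 1959, "T": 2390, "G": 2061}

total = sum(counts.values())
gc_percent = 100 * (counts["G"] + counts["C"]) / total
print(total, round(gc_percent, 2))


def gc_windows(seq: str, window: int = 300, step: int = 300):
    """Return (start, GC fraction) for each full window along seq.

    ASSUMPTION: non-overlapping windows by default; the report gives
    only the window size, not the step.
    """
    seq = seq.upper()
    result = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        result.append((start, (chunk.count("G") + chunk.count("C")) / window))
    return result
```

The counts sum to the stated full length of 8579 bp, and the overall GC content comes to about 46.86%.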


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr6   +       113380833  113383832  3000
YourSeq  288    854    2034  3000   93.7%     chr10  -       7671410    8180404    508995
YourSeq  252    904    2034  3000   90.2%     chr1   -       33671099   33849592   178494
YourSeq  166    507    1029  3000   90.3%     chr15  -       98071462   98138517   67056
YourSeq  162    431    1028  3000   83.3%     chr10  -       71314048   71314432   385
YourSeq  159    507    1029  3000   92.6%     chr11  -       80358689   80359234   546
YourSeq  157    845    1028  3000   94.5%     chr4   -       130577481  130578149  669
YourSeq  157    845    1029  3000   93.0%     chr12  +       8209665    8209855    191
YourSeq  156    853    1029  3000   96.0%     chr14  -       121394905  121395087  183
YourSeq  156    844    1030  3000   92.6%     chr1   +       44109259   44109451   193
YourSeq  155    853    1459  3000   84.9%     chr8   -       106965235  106965799  565
YourSeq  155    852    1030  3000   93.9%     chr3   -       52993091   52993312   222
YourSeq  155    844    1029  3000   92.5%     chr17  -       34938939   34939131   193
YourSeq  155    848    1028  3000   93.4%     chr14  +       57584737   57594842   10106
YourSeq  154    853    1028  3000   94.4%     chr15  -       100044332  100044550  219
YourSeq  154    853    1028  3000   94.3%     chr15  -       100236650  100236829  180
YourSeq  154    849    1028  3000   93.9%     chr1   -       60145870   60146053   184
YourSeq  154    842    1029  3000   92.5%     chr11  +       98162405   98162765   361
YourSeq  154    848    1029  3000   91.0%     chr10  +       14592835   14593011   177
YourSeq  153    853    1029  3000   93.8%     chr11  -       95917782   95917963   182

Note: The 3000 bp section upstream of exon 4 is BLAT searched against the genome. Apart from the expected self-match on chr6, no significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr6   +       113385912  113388911  3000
YourSeq  468    613    1211  3000   93.6%     chr1   +       172379010  172379628  619
YourSeq  324    728    1213  3000   89.5%     chr11  +       40712506   40713295   790
YourSeq  105    793    942   3000   90.3%     chr4   -       136348094  136348255  162
YourSeq  105    774    926   3000   87.8%     chr2   +       112488858  112489011  154
YourSeq  99     811    957   3000   87.4%     chr10  +       62933094   62933541   448
YourSeq  98     833    1277  3000   92.4%     chr15  +       31992474   31993064   591
YourSeq  91     773    904   3000   84.3%     chr17  +       36859653   36859781   129
YourSeq  90     807    926   3000   91.1%     chr8   +       124496782  124496911  130
YourSeq  90     774    925   3000   81.3%     chr19  +       19764394   19764524   131
YourSeq  88     795    925   3000   88.5%     chr13  -       62067140   62067270   131
YourSeq  88     817    933   3000   90.2%     chr4   +       101529563  101529682  120
YourSeq  85     830    995   3000   88.2%     chr12  +       85210683   85210910   228
YourSeq  85     803    926   3000   92.3%     chr1   +       75294771   75294906   136
YourSeq  84     775    923   3000   82.2%     chr15  +       76326035   76326165   131
YourSeq  83     814    923   3000   92.9%     chr5   -       71019043   71019153   111
YourSeq  83     803    923   3000   88.0%     chr16  -       20448590   20448710   121
YourSeq  82     794    904   3000   88.7%     chr1   -       153426330  153426452  123
YourSeq  81     795    925   3000   89.4%     chr15  -       7437291    7437423    133
YourSeq  81     719    930   3000   75.3%     chr14  -       24152156   24152274   119

Note: The 3000 bp section downstream of exon 5 is BLAT searched against the genome. Apart from the expected self-match on chr6, no significant similarity is found.
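The chr6 self-matches in the two BLAT tables give the genomic coordinates of the flanking sections, so the interval between them can be computed as a cross-check on the stated cKO region size. This sketch assumes the two 3000 bp sections directly abut the cKO region, which the report implies but does not state; the helper name is hypothetical.

```python
# chr6 self-match coordinates from the BLAT tables (1-based, inclusive).
UP_FLANK = (113380833, 113383832)    # 3000 bp upstream section
DOWN_FLANK = (113385912, 113388911)  # 3000 bp downstream section


def interval_between(up, down):
    """Return (start, end, length) of the region lying between two flanks.

    ASSUMPTION: the flanks directly abut the region of interest.
    """
    start = up[1] + 1
    end = down[0] - 1
    return start, end, end - start + 1


start, end, length = interval_between(UP_FLANK, DOWN_FLANK)
print(f"chr6:{start}-{end} ({length} bp)")
```

Under that assumption the interval is 2079 bp long, matching the effective cKO region size given in the strategy note.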


Gene and information:

Arpc4 actin related protein 2/3 complex, subunit 4 [ Mus musculus (house mouse) ]
Gene ID: 68089, updated on 12-Aug-2019

Gene summary

Official Symbol: Arpc4 (provided by MGI)
Official Full Name: actin related protein 2/3 complex, subunit 4 (provided by MGI)
Primary source: MGI:MGI:1915339
See related: Ensembl:ENSMUSG00000079426
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: 20kDa; p20-Arc; AI327076; 5330419I20Rik
Expression: Ubiquitous expression in large intestine adult (RPKM 58.0), placenta adult (RPKM 48.0) and 28 other tissues
Orthologs: human

Genomic context

Location: 6; 6 E3

Exon count: 6

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  6    NC_000072.6 (113378113..113390447)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    6    NC_000072.5 (113328107..113340441)

[Figure: genomic context map, Chromosome 6 - NC_000072.6.]


Transcript information: This gene has 4 transcripts

Gene: Arpc4 ENSMUSG00000079426

Description: actin related protein 2/3 complex, subunit 4 [Source:MGI Symbol;Acc:MGI:1915339]
Gene Synonyms: 5330419I20Rik, p20-Arc
Location: Chromosome 6: 113,378,115-113,390,448 forward strand. GRCm38:CM000999.2
About this gene: This gene has 4 transcripts (splice variants), 256 orthologues, is a member of 1 Ensembl protein family and is associated with 7 phenotypes.

Transcripts

Name       Transcript ID         bp    Protein  Translation ID        Biotype         CCDS       UniProt  Flags
Arpc4-201  ENSMUST00000156898.4  2270  168aa    ENSMUSP00000114839.1  Protein coding  CCDS20417  P59999   TSL:1 GENCODE basic APPRIS P1
Arpc4-203  ENSMUST00000203578.2  1323  78aa     ENSMUSP00000145344.1  Protein coding  CCDS85117  Q3TX55   TSL:1 GENCODE basic
Arpc4-204  ENSMUST00000204802.1  794   78aa     ENSMUSP00000144751.1  Protein coding  CCDS85117  Q3TX55   TSL:3 GENCODE basic
Arpc4-202  ENSMUST00000171058.7  695   91aa     ENSMUSP00000131690.1  Protein coding  CCDS51871  E9PWA7   TSL:5 GENCODE basic


[Figure: Ensembl genomic region view, chromosome 6, 113.37 Mb to 113.40 Mb (32.33 kb). Forward strand: Arpc4-201, Arpc4-202, Arpc4-203 and Arpc4-204 (protein coding); Ttll3-201, Ttll3-202, Ttll3-203 and Ttll3-206 (protein coding), Ttll3-208 (lncRNA), Ttll3-209 (nonsense mediated decay), Ttll3-205 (retained intron). Reverse strand: Tada3-201, Tada3-202, Tada3-203 and Tada3-207 (protein coding); Tada3-204, Tada3-205 and Tada3-206 (retained intron). Contigs: AC155287.6, AC153910.6. Regulatory Build track legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank.]


Transcript: ENSMUST00000156898

[Figure: Ensembl protein view of Arpc4-201 (protein coding; 12.33 kb, forward strand), scale 0 to 168 aa. Domain annotations: Superfamily and Gene3D, Arp2/3 complex subunit 2/4; Pfam, PIRSF and PANTHER, Actin-related protein 2/3 complex subunit 4. Variant track: sequence variants (dbSNP and all other sources); legend: synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
