https://www.alphaknockout.com

Mouse Arhgef7 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create an Arhgef7 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Arhgef7 gene (NCBI Reference Sequence: NM_001113517; Ensembl: ENSMUSG00000031511) is located on mouse chromosome 8. 19 exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 19 (Transcript: ENSMUST00000110909). Exon 6 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Arhgef7 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-196K9 as template. Cas9, gRNA and targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 6 starts at about 28.6% of the coding region. Knockout of exon 6 will result in a frameshift of the gene. Intron 5, used for 5'-loxP site insertion, is 5541 bp; intron 6, used for 3'-loxP site insertion, is 8915 bp. The effective cKO region is ~589 bp. The cKO region does not contain any other known gene.
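The frameshift statement in the note follows from reading-frame arithmetic: removing a coding segment whose length is not a multiple of three shifts every downstream codon. The Python sketch below illustrates that check under stated assumptions. The function names are illustrative, the 782 aa protein length is the value listed for ENSMUST00000110909 in the transcript table of this report, and the exon length used in the example is a hypothetical placeholder, because the report does not state the coding length of exon 6.

```python
def causes_frameshift(deleted_coding_bp: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_bp % 3 != 0


def coding_offset(fraction: float, protein_aa: int) -> int:
    """Approximate coding position (bp) at a given fraction of the CDS.

    The CDS length is taken as (amino acids + 1 stop codon) * 3.
    """
    cds_bp = (protein_aa + 1) * 3
    return round(fraction * cds_bp)


# 782 aa: protein length listed for ENSMUST00000110909 in this report.
approx_exon6_start = coding_offset(0.286, 782)
print("Approximate CDS position where exon 6 starts (bp):", approx_exon6_start)

# HYPOTHETICAL exon length, for illustration only; not taken from the report.
example_exon_bp = 100
print("Deletion of", example_exon_bp, "coding bp causes a frameshift:",
      causes_frameshift(example_exon_bp))
```

With the real coding length of exon 6 substituted for the placeholder, the same check would confirm or refute the frameshift prediction.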


Overview of the Targeting Strategy

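[Figure: schematic of the targeting strategy, showing four panels: the wildtype allele with the 5' and 3' gRNA regions (exons 1, 6 and 19 labelled), the targeting vector, the targeted allele, and the constitutive KO allele after Cre recombination. Legend: exon of mouse Arhgef7, homology arm, cKO region, loxP site.]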


Overview of the Dot Plot

Window size: 10 bp

[Figure: self-alignment dot plot of the sequence; legend: Forward, Reverse Complement.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
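A self-alignment dot plot of this kind can be approximated by indexing every window of the sequence and reporting pairs of positions that share the same word, on the forward strand or as a reverse complement. The sketch below is a simplified exact-match version written for illustration; it is not the tool used to produce the report, and the function names and toy sequence are assumptions.

```python
def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of matching window start positions.

    forward: pairs (i, j), i < j, where seq[i:i+window] == seq[j:j+window]
    reverse: pairs (i, j) where seq[i:i+window] is the reverse complement
             of seq[j:j+window]
    """
    index = {}
    for i in range(len(seq) - window + 1):
        index.setdefault(seq[i:i + window], []).append(i)

    forward = [(i, j) for pos in index.values() for i in pos for j in pos if i < j]

    n = len(seq)
    rc = revcomp(seq)
    reverse = []
    for k in range(n - window + 1):
        # rc[k:k+window] is the reverse complement of seq[n-k-window : n-k]
        for i in index.get(rc[k:k + window], []):
            reverse.append((i, n - k - window))
    return forward, reverse


# Toy sequence (hypothetical): a 10 bp word repeated after a 5 bp spacer.
toy = "ACGTTGCAAC" + "GGGGG" + "ACGTTGCAAC"
fwd, rev = self_dot_plot(toy, window=10)
print("forward matches:", fwd)
print("reverse-complement matches:", rev)
```

Off-diagonal runs of consecutive matches would indicate tandem or inverted repeats; the report states that none of significance were found in this region.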

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(7089bp) | A(25.57% 1813) | C(21.96% 1557) | T(27.62% 1958) | G(24.84% 1761)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
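The summary counts can be cross-checked, and the windowed analysis sketched, with a few lines of Python. The base counts below are copied from the summary line; the sliding-window function is an illustrative assumption about how such a profile might be computed, not the report's actual tool, and the toy sequence is hypothetical.

```python
# Base counts copied from the summary line of this report.
counts = {"A": 1813, "C": 1557, "T": 1958, "G": 1761}

total = sum(counts.values())
gc_percent = 100 * (counts["G"] + counts["C"]) / total
print("Total length (bp):", total)
print("Overall GC content (%):", round(gc_percent, 1))


def gc_windows(seq: str, window: int = 300, step: int = 1):
    """GC fraction for each window of the given size along the sequence."""
    return [
        sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


# Toy sequence (hypothetical) with a small window, for illustration.
print(gc_windows("GGCCAATT", window=4, step=2))
```

The counts sum to the stated full length of 7089 bp, and the overall GC content is just under 47%, which is consistent with the note that no high-GC region stands out.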


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr8   +       11788207   11791206   3000
YourSeq  217    464    718   3000   92.5%     chr10  +       24681480   24681733   254
YourSeq  216    440    718   3000   90.3%     chr1   -       131878712  131878992  281
YourSeq  203    464    720   3000   90.2%     chr5   -       147241984  147242281  298
YourSeq  196    455    753   3000   91.9%     chr3   +       9039204    9039526    323
YourSeq  194    461    718   3000   91.8%     chr10  -       62571440   62572111   672
YourSeq  190    461    718   3000   93.3%     chr15  +       75650468   75652528   2061
YourSeq  189    455    718   3000   93.6%     chr11  -       107134096  107134672  577
YourSeq  188    456    691   3000   91.6%     chrX   +       94140458   94140695   238
YourSeq  187    461    715   3000   92.4%     chr1   -       179735594  179735850  257
YourSeq  185    461    718   3000   93.1%     chr11  +       88136500   88136781   282
YourSeq  184    455    715   3000   91.1%     chr7   -       12816962   12817270   309
YourSeq  184    465    721   3000   93.9%     chr3   +       137875296  137875917  622
YourSeq  182    456    712   3000   92.2%     chr14  -       49014561   49014840   280
YourSeq  180    456    719   3000   92.3%     chr17  -       28426289   28426550   262
YourSeq  179    461    688   3000   92.8%     chr4   -       132626580  132627028  449
YourSeq  179    461    716   3000   94.2%     chr4   -       11304906   11305236   331
YourSeq  179    463    720   3000   92.5%     chr10  +       121433815  121434119  305
YourSeq  178    455    710   3000   92.0%     chr8   -       121490373  121490650  278
YourSeq  177    456    715   3000   91.6%     chrX   -       152129518  152281491  151974

Note: The 3000 bp section upstream of Exon 6 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr8   +       11791796   11794795   3000
YourSeq  1133   1730   3000  3000   95.7%     chr10  -       3986027    3987318    1292
YourSeq  1131   1730   3000  3000   95.8%     chr2   -       106896584  106897867  1284
YourSeq  1130   1753   3000  3000   95.8%     chr9   +       50418485   50419725   1241
YourSeq  1126   1730   3000  3000   95.4%     chr9   -       109209553  109210838  1286
YourSeq  1126   1749   3000  3000   95.2%     chr12  +       115949500  116107645  158146
YourSeq  1124   1749   3000  3000   95.4%     chr8   -       107967266  107968494  1229
YourSeq  1120   1749   3000  3000   95.6%     chr5   +       88390103   88449383   59281
YourSeq  1119   1749   3000  3000   95.4%     chr10  -       18750829   19080030   329202
YourSeq  1119   1749   3000  3000   95.2%     chr1   -       93249236   93250475   1240
YourSeq  1117   1749   3000  3000   95.2%     chr11  -       48509260   48510497   1238
YourSeq  1117   1749   3000  3000   95.5%     chrX   +       67434543   67663653   229111
YourSeq  1115   1749   3000  3000   95.2%     chr17  -       54898937   54900187   1251
YourSeq  1113   1749   3000  3000   94.6%     chr7   -       100594798  100596029  1232
YourSeq  1112   1749   3000  3000   94.5%     chr10  -       44205838   44207072   1235
YourSeq  1112   1779   3000  3000   95.8%     chr15  +       89041745   89042950   1206
YourSeq  1108   1749   3000  3000   94.4%     chr1   -       32272796   32274031   1236
YourSeq  1108   1749   3000  3000   95.0%     chr8   +       56803103   57081968   278866
YourSeq  1108   1746   3000  3000   95.4%     chr3   +       21782711   21783970   1260
YourSeq  1107   1749   3000  3000   94.4%     chr5   -       105369118  105370351  1234

Note: The 3000 bp section downstream of Exon 6 is BLAT searched against the genome. No significant similarity is found.
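The top-scoring chr8 hits in the two BLAT tables give the genomic coordinates of the upstream and downstream 3000 bp sections, so the interval between them can be computed directly. The sketch below assumes the displayed coordinates are 1-based and inclusive, which the SPAN column supports (3000 = END - START + 1); the variable and function names are illustrative.

```python
# Top-scoring (100.0% identity) hits copied from the two BLAT tables above.
upstream_hit = {"chrom": "chr8", "strand": "+", "start": 11788207, "end": 11791206}
downstream_hit = {"chrom": "chr8", "strand": "+", "start": 11791796, "end": 11794795}


def span(start: int, end: int) -> int:
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


def gap_between(up_end: int, down_start: int) -> int:
    """Number of bases strictly between two 1-based, inclusive intervals."""
    return down_start - up_end - 1


print("Upstream span (bp):", span(upstream_hit["start"], upstream_hit["end"]))
print("Downstream span (bp):", span(downstream_hit["start"], downstream_hit["end"]))
print("Interval between sections (bp):",
      gap_between(upstream_hit["end"], downstream_hit["start"]))
```

Under that coordinate assumption the interval between the two sections is 589 bp, which agrees with the ~589 bp effective cKO region given in the strategy note.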


Gene and information: Arhgef7 Rho guanine nucleotide exchange factor (GEF7) [ Mus musculus (house mouse) ] Gene ID: 54126, updated on 19-Oct-2019

Gene summary

Official Symbol: Arhgef7 (provided by MGI)
Official Full Name: Rho guanine nucleotide exchange factor (GEF7) (provided by MGI)
Primary source: MGI:MGI:1860493
See related: Ensembl:ENSMUSG00000031511
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: PIX; Cool; Pak3bp; cool-1; p85SPR; betaPix; beta-Pix; p85Cool1; betaPix-b; betaPix-c; mKIAA0142
Expression: Ubiquitous expression in cerebellum adult (RPKM 15.8), whole brain E14.5 (RPKM 14.4) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 8; 8 A1.1

Exon count: 25

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  8    NC_000074.6 (11728001..11835219)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    8    NC_000074.5 (11728105..11835219)

[Figure: genomic context of Arhgef7 on Chromosome 8 - NC_000074.6.]


Transcript information: This gene has 13 transcripts

Gene: Arhgef7 ENSMUSG00000031511

Description: Rho guanine nucleotide exchange factor (GEF7) [Source:MGI Symbol;Acc:MGI:1860493]
Gene Synonyms: Cool, PIX, Pak interacting exchange factor, betaPix, betaPix-b, betaPix-c, cool-1, p85SPR
Location: Chromosome 8: 11,727,721-11,835,219 forward strand. GRCm38:CM001001.2
About this gene: This gene has 13 transcripts (splice variants), 256 orthologues, 20 paralogues, is a member of 1 Ensembl protein family and is associated with 3 phenotypes.

Transcripts

Name         Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Arhgef7-205  ENSMUST00000110909.8   4926  782aa       ENSMUSP00000106534.2  Protein coding           CCDS52480  Q9ES28      TSL:1, GENCODE basic, APPRIS ALT1
Arhgef7-203  ENSMUST00000098938.8   4628  705aa       ENSMUSP00000096538.2  Protein coding           CCDS52481  Q9ES28      TSL:1, GENCODE basic, APPRIS ALT1
Arhgef7-202  ENSMUST00000074856.12  4451  646aa       ENSMUSP00000074399.6  Protein coding           CCDS22099  A0A0R4J0X8  TSL:1, GENCODE basic, APPRIS P3
Arhgef7-204  ENSMUST00000110904.1   4038  636aa       ENSMUSP00000106529.2  Protein coding           -          D3Z0V2      CDS 5' incomplete, TSL:1
Arhgef7-209  ENSMUST00000210012.1   732   200aa       ENSMUSP00000147641.1  Protein coding           -          A0A1B0GRS3  CDS 3' incomplete, TSL:3
Arhgef7-210  ENSMUST00000210104.1   574   131aa       ENSMUSP00000148109.1  Protein coding           -          A0A1B0GSX2  CDS 3' incomplete, TSL:3
Arhgef7-212  ENSMUST00000211409.1   613   68aa        ENSMUSP00000148111.1  Nonsense mediated decay  -          A0A1B0GSX4  CDS 5' incomplete, TSL:3
Arhgef7-207  ENSMUST00000154204.1   3355  No protein  -                     Retained intron          -          -           TSL:1
Arhgef7-201  ENSMUST00000033908.13  3221  No protein  -                     Retained intron          -          -           TSL:1
Arhgef7-208  ENSMUST00000209686.1   2205  No protein  -                     Retained intron          -          -           TSL:5
Arhgef7-206  ENSMUST00000151225.1   1761  No protein  -                     Retained intron          -          -           TSL:1
Arhgef7-211  ENSMUST00000210287.1   898   No protein  -                     Retained intron          -          -           TSL:5
Arhgef7-213  ENSMUST00000211510.1   334   No protein  -                     lncRNA                   -          -           TSL:3


[Figure: Ensembl region view of chromosome 8, 11.72 Mb to 11.84 Mb (127.50 kb), forward and reverse strands. Gene track (Comprehensive set), forward strand: Arhgef7-201 to Arhgef7-213 (protein coding: 202, 203, 204, 205, 209, 210; retained intron: 201, 206, 207, 208, 211; nonsense mediated decay: 212; lncRNA: 213), 1700128E19Rik-201 (lncRNA), Gm18991-201 (processed pseudogene), Tex29-201, Tex29-204 and Tex29-205 (protein coding), Tex29-202 (lncRNA), Tex29-203 and Tex29-206 (nonsense mediated decay). Reverse strand: Gm15875-201 (lncRNA). Contigs: AC126800.3, AC120785.13. Regulatory Build track. Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: protein coding (Ensembl protein coding, merged Ensembl/Havana); non-protein coding (pseudogene, processed transcript, RNA gene).]


Transcript: ENSMUST00000110909

[Figure: protein domain view of Arhgef7-205 (protein coding; ENSMUSP00000106...; 99.05 kb, forward strand), with a scale bar from 0 to 782 amino acids. Annotation tracks: MobiDB lite; Low complexity (Seg); Superfamily (CH domain superfamily, Dbl homology (DH) domain superfamily, SH3-like domain superfamily, SSF50729); SMART (Calponin homology domain, SH3 domain, Dbl homology (DH) domain, Pleckstrin homology domain); Prints (SH3 domain); Pfam (Calponin homology domain, SH3 domain, Dbl homology (DH) domain, PF16614, PF16615, Pleckstrin homology domain); PROSITE profiles (Calponin homology domain, SH3 domain, Dbl homology (DH) domain, Pleckstrin homology domain); PROSITE patterns (Guanine-nucleotide dissociation stimulator, CDC24, conserved site); PANTHER (PTHR46026:SF3, PTHR46026); Gene3D (CH domain superfamily, 2.30.30.40, Dbl homology (DH) domain superfamily, PH-like domain superfamily); CDD (Calponin homology domain, Dbl homology (DH) domain, cd01225, Rho guanine nucleotide exchange factor 7, SH3 domain). Sequence variants (dbSNP and all other sources); variant legend: missense variant, splice region variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
