https://www.alphaknockout.com

Mouse Farp1 Knockout Project (CRISPR/Cas9)

Objective: To create a Farp1 knockout mouse model (C57BL/6J) by CRISPR/Cas9-mediated genome engineering.

Strategy summary: The Farp1 gene (NCBI Reference Sequence: NM_134082; Ensembl: ENSMUSG00000025555) is located on mouse chromosome 14. Twenty-seven exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 27 (Transcript: ENSMUST00000026635). Exon 2 is selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 contains the start of the coding region. Exon 2 covers 5.44% of the coding region. The size of the effective KO region is ~194 bp. The KO region does not contain any other known gene.
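The 5.44% figure can be cross-checked against numbers given elsewhere in this report. The following is a minimal sketch, not the method used to produce the report. It assumes (the report does not state this) that the two 2000 bp flanking windows in the BLAT tables directly abut exon 2, and that the coding length is the 1048 aa Farp1-201 protein length times three, with the stop codon not counted.

```python
# Cross-check of the exon 2 share of the coding region.
# Assumption: the 2000 bp flanking windows from the BLAT tables abut exon 2,
# so the gap between them equals the exon 2 length.
UP_WINDOW_END = 121_102_012      # end of upstream window (chr14, BLAT table)
DOWN_WINDOW_START = 121_102_184  # start of downstream window (chr14, BLAT table)
PROTEIN_LENGTH_AA = 1048         # Farp1-201 protein length (transcript table)

exon2_bp = DOWN_WINDOW_START - UP_WINDOW_END - 1
cds_bp = PROTEIN_LENGTH_AA * 3   # assumption: stop codon not counted
exon2_share_pct = 100 * exon2_bp / cds_bp

print(f"exon 2: {exon2_bp} bp, {exon2_share_pct:.2f}% of the coding region")
```

Under these assumptions the gap is 171 bp, which agrees with the stated 5.44%. The ~194 bp effective KO region is larger than this, as would be expected if the gRNA sites lie outside the exon boundaries.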


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', showing exons 1, 2 and 27 and two gRNA regions. Legend: exon of mouse Farp1; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section downstream of Exon 2 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
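The self-alignment described in the two dot plot notes can be sketched as follows. The report does not name the tool or matching rule used, so this is an illustration only: it assumes exact matching of 15 bp windows, and the function names and the toy sequence are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def window_matches(query: str, target: str, window: int = 15):
    """Return (i, j) pairs where query[i:i+window] equals target[j:j+window]."""
    index = {}
    for j in range(len(target) - window + 1):
        index.setdefault(target[j:j + window], []).append(j)
    hits = []
    for i in range(len(query) - window + 1):
        for j in index.get(query[i:i + window], []):
            hits.append((i, j))
    return hits


def self_dot_plot(seq: str, window: int = 15):
    """Dots for a sequence against itself, forward and reverse complement."""
    seq = seq.upper()
    # Drop the main diagonal: every window trivially matches itself.
    forward = [(i, j) for i, j in window_matches(seq, seq, window) if i != j]
    reverse = window_matches(seq, reverse_complement(seq), window)
    return forward, reverse


# Toy input (hypothetical): a 7 bp unit repeated four times between unique flanks.
toy = "CCGTAGCTAGGCTTAACG" + "GATTACA" * 4 + "TTGCGGATCCTAGCATCG"
forward, reverse = self_dot_plot(toy, window=15)
print(sorted({j - i for i, j in forward}))  # diagonal offsets of forward dots
```

Off-diagonal forward dots that line up at a fixed offset indicate a tandem repeat with that period. A gRNA site would then be chosen at coordinates that do not fall on any such diagonal.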


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution plot.]

Summary: Full Length(2000bp) | A(19.6% 392) | C(29.15% 583) | T(30.05% 601) | G(21.2% 424)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution plot.]

Summary: Full Length(2000bp) | A(24.25% 485) | C(22.45% 449) | T(29.7% 594) | G(23.6% 472)

Note: The 2000 bp section downstream of Exon 2 is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
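The GC analysis can be illustrated with a short sketch. The overall GC percentages below are computed directly from the base counts in the two Summary lines. The sliding-window function is an assumed implementation of a 300 bp window scan (the report does not describe the tool or the step size), and its names and the toy sequence are hypothetical.

```python
from collections import Counter


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a DNA string."""
    counts = Counter(seq.upper())
    return 100 * (counts["G"] + counts["C"]) / len(seq)


def sliding_gc(seq: str, window: int = 300, step: int = 50):
    """(start, GC%) for each window; the step size is an assumption."""
    return [(start, gc_percent(seq[start:start + window]))
            for start in range(0, len(seq) - window + 1, step)]


# Base counts copied from the two Summary lines of this report.
reported = {
    "upstream": {"A": 392, "C": 583, "T": 601, "G": 424},
    "downstream": {"A": 485, "C": 449, "T": 594, "G": 472},
}
overall_gc = {name: 100 * (c["G"] + c["C"]) / sum(c.values())
              for name, c in reported.items()}
print(overall_gc)

# Toy sequence (hypothetical): 300 bp of pure GC followed by 300 bp of pure AT.
toy_windows = sliding_gc("GC" * 150 + "AT" * 150, window=300, step=300)
print(toy_windows)
```

From the reported counts, the overall GC content is 50.35% upstream and 46.05% downstream. The windowed scan is what would reveal a local GC-rich stretch that an overall average hides.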


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr14  +       121100013  121102012  2000
YourSeq  157    298    630   2000   91.6%     chr13  +       93817522   93817890   369
YourSeq  123    295    617   2000   81.8%     chr1   -       162335660  162335802  143
YourSeq  121    444    623   2000   91.3%     chr1   -       158746535  158746816  282
YourSeq  119    299    623   2000   93.5%     chr10  +       125744706  126092789  348084
YourSeq  118    423    631   2000   89.2%     chr4   +       47343807   47344011   205
YourSeq  116    299    627   2000   81.1%     chr9   -       40116065   40116209   145
YourSeq  113    295    627   2000   89.4%     chr10  -       90585662   90720631   134970
YourSeq  101    39     597   2000   78.8%     chr4   -       120359737  120359953  217
YourSeq  98     425    623   2000   82.0%     chr1   -       43200915   43201050   136
YourSeq  95     299    631   2000   85.9%     chr10  +       64218972   64219288   317
YourSeq  93     295    468   2000   91.1%     chr1   -       178549021  178549242  222
YourSeq  92     299    629   2000   94.3%     chr1   -       179139971  179328177  188207
YourSeq  90     369    623   2000   80.6%     chr13  +       44364410   44364578   169
YourSeq  85     273    439   2000   86.5%     chr8   +       96646093   96646251   159
YourSeq  84     453    630   2000   80.7%     chr3   -       115622161  115622265  105
YourSeq  84     257    438   2000   79.4%     chr17  -       4049114    4049215    102
YourSeq  82     463    624   2000   92.0%     chr10  -       76991481   76991709   229
YourSeq  81     295    551   2000   80.3%     chr10  -       39839774   39839921   148
YourSeq  80     299    509   2000   92.4%     chr11  -       19437755   19438046   292

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr14  +       121102184  121104183  2000
YourSeq  50     1853   1972  2000   86.5%     chr7   +       118465316  118465433  118
YourSeq  47     1939   1999  2000   92.6%     chr5   +       142594861  142594923  63
YourSeq  41     1903   1972  2000   79.8%     chr7   -       122309534  122309603  70
YourSeq  40     1904   1975  2000   87.1%     chr16  +       36920861   36920935   75
YourSeq  37     1897   1972  2000   81.4%     chr16  -       10626989   10627061   73
YourSeq  37     1905   1959  2000   87.8%     chr15  -       79943266   79943321   56
YourSeq  37     1902   1982  2000   93.1%     chr6   +       134274702  134274785  84
YourSeq  34     290    344   2000   78.1%     chr17  +       82746294   82746342   49
YourSeq  34     1819   1972  2000   97.3%     chr15  +       81060528   81060682   155
YourSeq  33     1860   1969  2000   88.1%     chr13  -       29185908   29186018   111
YourSeq  33     1904   1958  2000   85.4%     chr17  +       29195395   29195448   54
YourSeq  32     1900   1969  2000   92.4%     chr8   -       127607496  127607567  72
YourSeq  32     1904   1959  2000   78.6%     chr8   +       116588409  116588464  56
YourSeq  32     1904   1973  2000   70.4%     chr12  +       31889930   31889997   68
YourSeq  31     1904   1972  2000   91.7%     chr16  -       11310329   11310397   69
YourSeq  31     1904   1955  2000   69.5%     chr6   +       134727425  134727460  36
YourSeq  31     1939   1979  2000   88.6%     chr15  +       13151724   13151763   40
YourSeq  29     1946   1976  2000   90.0%     chr15  +       67967798   67967827   30
YourSeq  28     1861   1916  2000   75.0%     chr17  +       81124613   81124668   56

Note: The 2000 bp section downstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
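To make the BLAT tables easier to assess, the rows can be parsed and the strongest hit other than the query's own locus reported. This is a minimal sketch: `BlatHit` and `parse_hit` are hypothetical names, the column order is taken from the table header, and the three sample rows are copied from the upstream table.

```python
from dataclasses import dataclass


@dataclass
class BlatHit:
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(row: str) -> BlatHit:
    """Parse one whitespace-separated row of the BLAT results table."""
    fields = row.split()
    if fields[:2] == ["browser", "details"]:  # link text left by the web page
        fields = fields[2:]
    (_query, score, q_start, q_end, q_size, identity,
     chrom, strand, t_start, t_end, span) = fields
    return BlatHit(int(score), int(q_start), int(q_end), int(q_size),
                   float(identity.rstrip("%")), chrom, strand,
                   int(t_start), int(t_end), int(span))


rows = [
    "YourSeq 2000 1 2000 2000 100.0% chr14 + 121100013 121102012 2000",
    "YourSeq 157 298 630 2000 91.6% chr13 + 93817522 93817890 369",
    "YourSeq 123 295 617 2000 81.8% chr1 - 162335660 162335802 143",
]
hits = [parse_hit(r) for r in rows]

# Exclude the full-length, 100% identity hit: that is the query's own locus.
off_target = [h for h in hits
              if not (h.score == h.q_size and h.identity == 100.0)]
best = max(off_target, key=lambda h: h.score)
coverage_pct = 100 * (best.q_end - best.q_start + 1) / best.q_size
print(best.chrom, best.score, f"{coverage_pct:.2f}% of query")
```

For the upstream window, the best hit outside the target locus scores 157 and spans query positions 298 to 630, about 17% of the 2000 bp query.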


Gene and information:

Farp1: FERM, RhoGEF (Arhgef) and pleckstrin domain protein 1 (chondrocyte-derived) [Mus musculus (house mouse)]
Gene ID: 223254, updated on 12-Aug-2019

Gene summary

Official Symbol: Farp1 (provided by MGI)
Official Full Name: FERM, RhoGEF (Arhgef) and pleckstrin domain protein 1 (chondrocyte-derived) (provided by MGI)
Primary source: MGI:MGI:2446173
See related: Ensembl:ENSMUSG00000025555
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Cdep; AW228844; BC030329
Expression: Ubiquitous expression in limb E14.5 (RPKM 25.3), CNS E18 (RPKM 22.0) and 27 other tissues
Orthologs: human

Genomic context

Location: 14; 14 E5
Exon count: 30

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  14   NC_000080.6 (121035168..121283744)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    14   NC_000080.5 (121434796..121682948)

[Figure: genomic context on Chromosome 14 - NC_000080.6.]


Transcript information: This gene has 4 transcripts

Gene: Farp1 ENSMUSG00000025555

Description: FERM, RhoGEF (Arhgef) and pleckstrin domain protein 1 (chondrocyte-derived) [Source:MGI Symbol;Acc:MGI:2446173]
Gene Synonyms: Cdep
Location: Chromosome 14: 121,035,200-121,283,744 forward strand. GRCm38:CM001007.2
About this gene: This gene has 4 transcripts (splice variants), 252 orthologues, 10 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Farp1-201  ENSMUST00000026635.7  4885  1048aa      ENSMUSP00000026635.6  Protein coding   CCDS37015  F8VPU2   TSL:5 GENCODE basic APPRIS P1
Farp1-202  ENSMUST00000135010.7  797   211aa       ENSMUSP00000116985.1  Protein coding   -          E9Q805   CDS 3' incomplete TSL:5
Farp1-203  ENSMUST00000137971.1  961   No protein  -                     Retained intron  -          -        TSL:3
Farp1-204  ENSMUST00000153607.1  777   No protein  -                     lncRNA           -          -        TSL:3

[Figure: Ensembl genome browser view of the Farp1 locus, 268.55 kb, scale 121.05Mb to 121.25Mb. Forward strand: Farp1-201 and Farp1-202 (protein coding), Farp1-203 (retained intron), Farp1-204 (lncRNA), B930095G15Rik-201 (lncRNA). Reverse strand: Stk24-201 (protein coding). Contigs: AC165163.2, AC167566.1, AC154618.2. Regulatory Build legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (RNA gene, processed transcript).]


Transcript: ENSMUST00000026635

[Figure: Ensembl protein summary for Farp1-201 (protein coding; 248.18 kb, forward strand), with domain annotations drawn on a scale of 0 to 1048 aa. Annotation tracks as far as recoverable from the figure:]

MobiDB lite; Low complexity (Seg)
Superfamily: Ubiquitin-like domain superfamily; Dbl homology (DH) domain superfamily; SSF50729; FERM superfamily, second domain
SMART: Band 4.1 domain; FERM adjacent (FA); Pleckstrin homology domain; FERM, C-terminal PH-like domain; Dbl homology (DH) domain
Prints: Band 4.1 domain; Ezrin/radixin/moesin-like
Pfam: FERM, N-terminal; FERM adjacent (FA); Dbl homology (DH) domain; FERM central domain; Pleckstrin homology domain; FERM, C-terminal PH-like domain
PROSITE profiles: FERM domain; Dbl homology (DH) domain; Pleckstrin homology domain
PROSITE patterns: FERM conserved site
PANTHER: PTHR45858:SF2; PTHR45858
Gene3D: 3.10.20.90; PH-like domain superfamily; FERM/acyl-CoA-binding protein superfamily; Dbl homology (DH) domain superfamily
CDD: cd17189; FERM central domain; Dbl homology (DH) domain; cd13235; FARP1/FARP2/FRMD7, FERM domain C-lobe; cd01220
Sequence variants (dbSNP and all other sources): missense variant; splice region variant; synonymous variant

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
