https://www.alphaknockout.com

Mouse Lrrfip1 Knockout Project (CRISPR/Cas9)

Objective: To create an Lrrfip1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Lrrfip1 gene (NCBI Reference Sequence: NM_001111311; Ensembl: ENSMUSG00000026305) is located on mouse chromosome 1. Eight exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 8 (Transcript: ENSMUST00000097649). Exons 5~7 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 5 starts from about 14.45% of the coding region. Exons 5~7 cover 18.56% of the coding region. The size of the effective KO region is ~7171 bp. The KO region does not contain any other known gene.
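The percentages in this note can be turned into approximate base counts. The sketch below is an estimate only: it assumes the percentages are taken relative to the coding sequence of transcript Lrrfip1-203 (729 aa, so 729 codons without the stop codon), which this report does not state. The function name `fraction_to_nt` is hypothetical.

```python
# Rough estimate of the coding bases affected by deleting exons 5~7.
# ASSUMPTION: percentages refer to a 729-codon CDS (stop codon excluded).
PROTEIN_LENGTH_AA = 729               # Lrrfip1-203, from the transcript table
CDS_LENGTH_NT = PROTEIN_LENGTH_AA * 3


def fraction_to_nt(percent: float, cds_length: int = CDS_LENGTH_NT) -> int:
    """Convert a percentage of the coding region to a rounded base count."""
    return round(percent / 100 * cds_length)


upstream_nt = fraction_to_nt(14.45)   # coding bases before exon 5
deleted_nt = fraction_to_nt(18.56)    # coding bases within exons 5~7
in_frame = deleted_nt % 3 == 0        # True if the deletion preserves the frame

print(upstream_nt, deleted_nt, in_frame)
```

Under this assumption the estimated deletion length is not a multiple of 3, which would be consistent with a frameshift; the exact exon boundaries should be confirmed against the transcript annotation.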


Overview of the Targeting Strategy

[Figure: wildtype allele, 5' to 3', with two gRNA regions marked; exons 1, 5, 6, 7 and 8 shown. Legend: exon of mouse Lrrfip1; knockout region.]


Overview of the Dot Plot (up)
Window size: 15 bp

[Figure: self-alignment dot plot of the sequence. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of Exon 5 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
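A self-alignment of this kind can be sketched as follows. This is a minimal illustration, not the tool used to produce the plot: it considers exact matches of window-sized words on the forward strand only (the reverse-complement comparison is omitted), and the function names `self_dot_plot` and `repeat_offsets` are hypothetical.

```python
from collections import defaultdict


def self_dot_plot(seq: str, window: int = 15):
    """Return sorted (i, j) pairs, i < j, where the words of length
    `window` starting at positions i and j are identical."""
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    dots = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                dots.append((starts[a], starts[b]))
    return sorted(dots)


def repeat_offsets(dots):
    """Count dots per diagonal offset (j - i). A heavily populated
    offset suggests a tandem repeat with that period."""
    counts = defaultdict(int)
    for i, j in dots:
        counts[j - i] += 1
    return dict(counts)


# Toy example with a 4 bp repeat unit and a 4 bp window.
toy = "ACGTACGTACGT"
print(repeat_offsets(self_dot_plot(toy, window=4)))
```

For a 2000 bp flank, the same functions would be called with the default 15 bp window; a gRNA site would then be chosen outside the positions that contribute to crowded diagonals.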

Overview of the Dot Plot (down)
Window size: 15 bp

[Figure: self-alignment dot plot of the sequence. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section downstream of Exon 7 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)
Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(2000bp) | A(25.35% 507) | C(25.15% 503) | T(26.4% 528) | G(23.1% 462)

Note: The 2000 bp section upstream of Exon 5 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
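The overall GC content follows directly from the base counts in the summary line. A minimal check, using the upstream counts reported above (the helper name `gc_percent` is hypothetical):

```python
def gc_percent(counts: dict) -> float:
    """Overall GC content (percent) from per-base counts."""
    total = sum(counts.values())
    return 100 * (counts["G"] + counts["C"]) / total


# Base counts from the summary line for the 2000 bp upstream section.
upstream = {"A": 507, "C": 503, "T": 528, "G": 462}

print(sum(upstream.values()))          # total bases counted
print(round(gc_percent(upstream), 2))  # overall GC percent
```

The counts sum to the stated full length of 2000 bp, and the overall GC content is close to 50%, in line with the note that no high GC-content region is found.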

Overview of the GC Content Distribution (down)
Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(2000bp) | A(24.0% 480) | C(22.1% 442) | T(30.55% 611) | G(23.35% 467)

Note: The 2000 bp section downstream of Exon 7 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
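A windowed GC scan of the kind summarized here can be sketched as below. This is an illustration under stated assumptions: the step size and the 70% cutoff for a "high GC" window are hypothetical values chosen for the example and are not taken from this report.

```python
def windowed_gc(seq: str, window: int = 300, step: int = 1):
    """Return (start, GC percent) for each full window along seq."""
    seq = seq.upper()
    result = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        result.append((start, 100 * gc / window))
    return result


def high_gc_windows(seq: str, window: int = 300, threshold: float = 70.0):
    """Start positions of windows whose GC percent meets the threshold.
    The 70% default is an ASSUMED cutoff for illustration only."""
    return [s for s, gc in windowed_gc(seq, window) if gc >= threshold]


# Toy example with a 4 bp window so the output is easy to follow.
print(windowed_gc("GGCCAATT", window=4, step=2))
print(high_gc_windows("GGCCAATT", window=4))
```

An empty result from `high_gc_windows` on a flank would correspond to the conclusion that the region is suitable for PCR screening or sequencing.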


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr1   +       91103128   91105127   2000
YourSeq  48     17     68    2000   98.0%     chr10  +       84855400   84855451   52
YourSeq  46     25     112   2000   92.0%     chr6   -       126541610  126541696  87
YourSeq  46     23     69    2000   100.0%    chr4   +       33855798   33855866   69
YourSeq  46     18     65    2000   98.0%     chr1   +       52551994   52552041   48
YourSeq  45     1650   1717  2000   92.4%     chr9   -       104454097  104454167  71
YourSeq  45     28     92    2000   91.5%     chr13  -       52927536   52927598   63
YourSeq  45     24     68    2000   100.0%    chr1   -       34654530   34654574   45
YourSeq  43     25     67    2000   100.0%    chr1   -       16174711   16174753   43
YourSeq  43     33     93    2000   80.9%     chr14  +       119262118  119262169  52
YourSeq  42     18     61    2000   97.8%     chr3   +       85634664   85634707   44
YourSeq  41     26     68    2000   97.7%     chr2   +       62975194   62975236   43
YourSeq  40     18     57    2000   100.0%    chr16  -       37224523   37224562   40
YourSeq  39     18     58    2000   97.6%     chr1   +       34734033   34734073   41
YourSeq  38     1649   1717  2000   85.2%     chr12  -       70738826   70738898   73
YourSeq  37     27     68    2000   97.7%     chr5   +       143673782  143673831  50
YourSeq  37     27     66    2000   97.5%     chr1   +       39871621   39871676   56
YourSeq  36     1704   1840  2000   95.0%     chr5   -       50762313   50762472   160
YourSeq  36     31     68    2000   97.4%     chr2   +       123534389  123534426  38
YourSeq  34     30     65    2000   97.3%     chr1   -       47925551   47925586   36

Note: The 2000 bp section upstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.
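The BLAT result rows can be parsed into records to separate the full-length self-match from secondary hits. The sketch below is an illustration only: the `BlatHit` type and the helper names are hypothetical, the column order is taken from the table header above, and treating any hit with a score below the query size as "secondary" is an assumption made for this example.

```python
from typing import List, NamedTuple


class BlatHit(NamedTuple):
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float   # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(row: str) -> BlatHit:
    """Parse one whitespace-separated result row.
    A leading 'browser details' link text, if present, is dropped."""
    fields = row.split()
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    if len(fields) != 11:
        raise ValueError(f"expected 11 columns, got {len(fields)}: {row!r}")
    q, score, qs, qe, qsize, ident, chrom, strand, ts, te, span = fields
    return BlatHit(q, int(score), int(qs), int(qe), int(qsize),
                   float(ident.rstrip("%")), chrom, strand,
                   int(ts), int(te), int(span))


def secondary_hits(hits: List[BlatHit]) -> List[BlatHit]:
    """Hits other than the full-length self-match (ASSUMED criterion)."""
    return [h for h in hits if h.score < h.q_size]


# First two rows of the upstream table above.
rows = [
    "browser details YourSeq 2000 1 2000 2000 100.0% chr1 + 91103128 91105127 2000",
    "browser details YourSeq 48 17 68 2000 98.0% chr10 + 84855400 84855451 52",
]
hits = [parse_blat_row(r) for r in rows]
best_secondary = max(secondary_hits(hits), key=lambda h: h.score)
print(best_secondary.chrom, best_secondary.score, best_secondary.span)
```

The best secondary hit covers only a short stretch of the 2000 bp query, which is what the note summarizes as no significant similarity.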

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr1   +       91112299   91114298   2000
YourSeq  32     226    263   2000   94.5%     chr6   -       116451514  116451558  45
YourSeq  31     225    263   2000   97.1%     chr15  -       50979847   50979897   51
YourSeq  30     698    742   2000   78.2%     chr7   +       98285988   98286026   39
YourSeq  27     1964   1999  2000   96.6%     chr12  +       85919416   85919455   40
YourSeq  26     225    251   2000   100.0%    chr4   +       74855210   74855243   34
YourSeq  26     225    250   2000   100.0%    chr17  +       40219426   40219451   26
YourSeq  25     225    251   2000   88.5%     chr16  -       14478445   14478470   26
YourSeq  25     225    251   2000   88.5%     chr12  -       61683930   61683955   26
YourSeq  25     225    251   2000   96.3%     chr12  -       52910906   52910932   27
YourSeq  25     225    250   2000   100.0%    chr1   -       27544716   27544743   28
YourSeq  25     225    249   2000   100.0%    chr13  +       50508765   50508789   25
YourSeq  25     225    250   2000   100.0%    chr1   +       51011149   51011176   28
YourSeq  24     228    251   2000   100.0%    chrX   +       145079729  145079752  24
YourSeq  24     226    250   2000   100.0%    chr1   +       135404670  135404696  27
YourSeq  23     1673   1695  2000   100.0%    chr15  +       63260286   63260308   23
YourSeq  23     7      36    2000   76.0%     chr1   +       74622414   74622439   26
YourSeq  22     225    246   2000   100.0%    chr3   -       99778441   99778462   22
YourSeq  21     745    765   2000   100.0%    chr17  -       68485291   68485311   21
YourSeq  21     230    250   2000   100.0%    chr16  -       90300831   90300851   21

Note: The 2000 bp section downstream of Exon 7 is BLAT searched against the genome. No significant similarity is found.
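The self-matches of the two flanks give their genomic coordinates on chr1, so the distance between them can be compared with the stated KO region size. This is a sketch under two assumptions not stated in the report: the coordinates are 1-based and inclusive, and the flanks directly abut the KO region. The names `flank_length` and `gap_between` are hypothetical.

```python
# Full-length self-matches from the two BLAT tables (chrom, start, end).
UP_FLANK = ("chr1", 91103128, 91105127)     # 2000 bp upstream of exon 5
DOWN_FLANK = ("chr1", 91112299, 91114298)   # 2000 bp downstream of exon 7


def flank_length(flank) -> int:
    """Length of an interval with 1-based inclusive coordinates (ASSUMED)."""
    _, start, end = flank
    return end - start + 1


def gap_between(up, down) -> int:
    """Number of bases strictly between two intervals on one chromosome."""
    if up[0] != down[0]:
        raise ValueError("flanks are on different chromosomes")
    return down[1] - up[2] - 1


print(flank_length(UP_FLANK), flank_length(DOWN_FLANK))
print(gap_between(UP_FLANK, DOWN_FLANK))
```

Under these assumptions the gap between the flanks is 7171 bp, matching the effective KO region size given in the strategy summary.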


Gene and information: Lrrfip1 leucine rich repeat (in FLII) interacting protein 1 [ Mus musculus (house mouse) ] Gene ID: 16978, updated on 12-Aug-2019

Gene summary

Official Symbol: Lrrfip1 (provided by MGI)
Official Full Name: leucine rich repeat (in FLII) interacting protein 1 (provided by MGI)
Primary source: MGI:MGI:1342770
See related: Ensembl:ENSMUSG00000026305
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Fliiap1; AU024550
Expression: Ubiquitous expression in bladder adult (RPKM 22.2), placenta adult (RPKM 10.6) and 26 other tissues
Orthologs: human, all

Genomic context

Location: 1; 1 D
Exon count: 28

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  1    NC_000067.6 (90996930..91128944)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    1    NC_000067.5 (92895304..93025521)

[Figure: genomic context of Lrrfip1 on Chromosome 1 - NC_000067.6]


Transcript information: This gene has 12 transcripts

Gene: Lrrfip1 ENSMUSG00000026305

Description: leucine rich repeat (in FLII) interacting protein 1 [Source:MGI Symbol;Acc:MGI:1342770]
Gene Synonyms: FLAP (FLI LRR associated protein), Fliiap1
Location: Chromosome 1: 90,998,737-91,128,944 forward strand. GRCm38:CM000994.2
About this gene: This gene has 12 transcripts (splice variants), 236 orthologues, 1 paralogue, is a member of 1 Ensembl protein family and is associated with 6 phenotypes.

Transcripts

Name Transcript ID bp Protein Translation ID Biotype CCDS UniProt Flags

Lrrfip1-203 ENSMUST00000097649.9 3645 729aa ENSMUSP00000095254.3 Protein coding CCDS48320 Q3UZ39 TSL:1 GENCODE basic

Lrrfip1-204 ENSMUST00000097650.9 2760 628aa ENSMUSP00000095255.3 Protein coding CCDS35662 Q3UZ39 TSL:1 GENCODE basic APPRIS P3

Lrrfip1-212 ENSMUST00000189617.2 2155 663aa ENSMUSP00000139811.1 Protein coding CCDS78647 A0A087WPK3 TSL:1 GENCODE basic APPRIS ALT2

Lrrfip1-201 ENSMUST00000068116.12 2050 428aa ENSMUSP00000065850.6 Protein coding CCDS48319 G5E8E1 TSL:5 GENCODE basic APPRIS ALT2

Lrrfip1-202 ENSMUST00000068167.12 2849 628aa ENSMUSP00000063878.6 Protein coding - E9Q9T1 TSL:5 GENCODE basic APPRIS ALT2

Lrrfip1-205 ENSMUST00000185531.6 1747 575aa ENSMUSP00000139497.1 Protein coding - A0A087WNU6 CDS 3' incomplete TSL:5

Lrrfip1-211 ENSMUST00000189505.6 1031 323aa ENSMUSP00000141024.1 Protein coding - A0A087WSF5 CDS 3' incomplete TSL:1

Lrrfip1-206 ENSMUST00000186762.6 447 98aa ENSMUSP00000139902.1 Protein coding - A0A087WPT0 CDS 3' incomplete TSL:5

Lrrfip1-208 ENSMUST00000187532.1 483 No protein - Retained intron - - TSL:3

Lrrfip1-209 ENSMUST00000188094.6 2642 No protein - lncRNA - - TSL:1

Lrrfip1-210 ENSMUST00000188708.1 654 No protein - lncRNA - - TSL:2

Lrrfip1-207 ENSMUST00000187375.1 215 No protein - lncRNA - - TSL:5


[Figure: Ensembl genome browser view, 150.21 kb, 91.00Mb to 91.10Mb, forward and reverse strands. Tracks: comprehensive gene set with transcripts Lrrfip1-201 to Lrrfip1-212 (protein coding, retained intron and lncRNA as listed in the table above); contigs AC118682.8 and AC162883.4; Regulatory Build.
Regulation legend: CTCF; Enhancer; Open Chromatin; Promoter; Promoter Flank; Transcription Factor Binding Site.
Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (processed transcript; RNA gene).]


Transcript: ENSMUST00000097649

[Figure: Lrrfip1-203 (protein coding), 63.86 kb, forward strand, with protein annotation tracks for ENSMUSP00000095...: MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils); Pfam (Leucine-rich repeat flightless-interacting protein 1/2); PANTHER PTHR19212:SF7 (Leucine-rich repeat flightless-interacting protein 1/2).
Sequence variants (dbSNP and all other sources). Variant legend: inframe insertion; missense variant; splice region variant; synonymous variant.
Scale bar: 0 to 729.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
