https://www.alphaknockout.com

Mouse Fem1a Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Fem1a conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Fem1a gene (NCBI Reference Sequence: NM_010192; Ensembl: ENSMUSG00000043683) is located on mouse chromosome 17. One exon is identified, with both the ATG start codon and the TGA stop codon in exon 1 (Transcript: ENSMUST00000060253). Exon 1 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Fem1a gene. To engineer the targeting vector, the homology arms and cKO region will be generated by PCR using BAC clone RP23-351E9 as the template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Homozygous knockout results in increased susceptibility to DSS-induced colitis and colitis-associated tumorigenesis.

Exon 1 covers 100.0% of the coding region. Both the start codon and the stop codon are in exon 1. The size of the effective cKO region is ~1995 bp. The cKO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: schematic of the targeting strategy. Rows: Wildtype allele (5' to 3', exon 1 carrying the ATG start codon and TGA stop codon, with two gRNA regions marked); Targeting vector; Targeted allele; Constitutive KO allele (after Cre recombination). Legend: homology arm; exon of mouse Fem1a; cKO region; loxP site.]
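
The relationship between the targeted allele and the constitutive KO allele in the overview above can be illustrated with a toy model. The sketch below is not the project's design tool: the flank and exon strings are made-up placeholders, and `cre_excise` is a hypothetical helper. It assumes the commonly cited 34 bp loxP sequence and two loxP sites in the same orientation, in which case Cre recombination removes the intervening segment and leaves a single loxP site.

```python
# Commonly cited 34 bp loxP site: 13 bp repeat, 8 bp spacer, 13 bp inverted repeat.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def cre_excise(allele: str, site: str = LOXP) -> str:
    """Model Cre recombination between two same-orientation loxP sites.

    The first site and everything up to the second site are removed,
    so exactly one site remains. Alleles with fewer than two sites
    are returned unchanged.
    """
    first = allele.find(site)
    if first == -1:
        return allele
    second = allele.find(site, first + len(site))
    if second == -1:
        return allele
    return allele[:first] + allele[second:]


# Toy sequences (placeholders, not the real Fem1a sequence).
toy_exon = "ATGAAACCCGGGTGA"          # starts with ATG, ends with TGA
wildtype = "GGGG" + toy_exon + "CCCC"
targeted = "GGGG" + LOXP + toy_exon + LOXP + "CCCC"
constitutive_ko = cre_excise(targeted)
```

In this toy model the wildtype allele is untouched by `cre_excise`, while the targeted (floxed) allele loses the exon only after recombination, which is the defining property of a conditional knockout.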


Overview of the Dot Plot

Window size: 10 bp

[Figure: self-alignment dot plot of Sequence 12, showing forward and reverse complement matches.]

Note: The sequence of the homology arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
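
As a rough illustration of the self-alignment described in the note, the following sketch finds exact window matches of a sequence against itself, in both forward and reverse complement orientation. It is only an assumption about how such a dot plot can be computed (exact 10-mer matching); the tool actually used for this report is not named, and `self_dot_plot` and `revcomp` are hypothetical helper names.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string (non-ACGT characters are kept)."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) dot coordinates.

    forward: i < j and the windows starting at i and j are identical.
    reverse: i <= j and the window at i equals the reverse complement
             of the window at j.
    The trivial main diagonal (i == j, forward) is omitted.
    """
    seq = seq.upper()
    index = defaultdict(list)
    n_windows = len(seq) - window + 1
    for i in range(n_windows):
        index[seq[i:i + window]].append(i)

    forward = [
        (i, j)
        for positions in index.values()
        for a, i in enumerate(positions)
        for j in positions[a + 1:]
    ]
    reverse = []
    for j in range(n_windows):
        for i in index.get(revcomp(seq[j:j + window]), []):
            if i <= j:
                reverse.append((i, j))
    return forward, reverse
```

Off-diagonal forward dots that line up close to the main diagonal indicate tandem repeats; an empty or sparse result corresponds to the "no significant tandem repeat" conclusion.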

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution plot for Sequence 12.]

Summary: Full Length(7962bp) | A(23.56% 1876) | C(27.14% 2161) | T(22.9% 1823) | G(26.4% 2102)

Note: The sequence of the homology arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found, so it may be difficult to construct this targeting vector.
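
The base counts in the summary can be cross-checked, and a windowed GC profile of the kind plotted above can be sketched, as follows. The base counts are taken from the summary line; `gc_fraction_windows` is a hypothetical helper, and a step of 1 bp between windows is an assumption, since the report states only the 300 bp window size.

```python
# Base counts as given in the summary line.
counts = {"A": 1876, "C": 2161, "T": 1823, "G": 2102}
total = sum(counts.values())                              # full length in bp
gc_percent = 100 * (counts["G"] + counts["C"]) / total    # overall GC content


def gc_fraction_windows(seq: str, window: int = 300) -> list[float]:
    """GC fraction of every window of the given size, sliding 1 bp at a time."""
    seq = seq.upper()
    if len(seq) < window:
        return []
    is_gc = [1 if base in "GC" else 0 for base in seq]
    running = sum(is_gc[:window])
    fractions = [running / window]
    for i in range(window, len(seq)):
        # Slide right by one base: add the new base, drop the oldest one.
        running += is_gc[i] - is_gc[i - window]
        fractions.append(running / window)
    return fractions


print(f"length = {total} bp, overall GC = {gc_percent:.2f}%")
```

The counts sum to the stated full length of 7962 bp and give an overall GC content of about 53.54%. The note about high-GC regions concerns local windows, which the overall figure does not capture.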


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
--------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr17  +       56253910   56256909   3000
YourSeq  235    506    893   3000   92.4%     chr6   +       124843802  124844256  455
YourSeq  230    506    892   3000   87.3%     chr17  -       66463356   66463714   359
YourSeq  228    501    880   3000   92.0%     chr2   +       120145734  120146340  607
YourSeq  224    506    893   3000   90.7%     chr11  +       20132425   20132994   570
YourSeq  222    491    841   3000   91.8%     chr16  -       17690271   17690712   442
YourSeq  219    506    874   3000   86.7%     chr10  -       91140513   91140828   316
YourSeq  219    506    857   3000   92.1%     chr11  +       75317962   75433806   115845
YourSeq  205    511    825   3000   94.8%     chr7   -       19708229   19708611   383
YourSeq  187    506    893   3000   90.1%     chr8   +       85121236   85121602   367
YourSeq  186    506    819   3000   93.6%     chr4   +       116009246  116009898  653
YourSeq  184    506    892   3000   90.8%     chr7   -       21434188   21574902   140715
YourSeq  177    506    744   3000   90.2%     chr5   +       46102008   46102207   200
YourSeq  176    506    802   3000   94.0%     chr15  -       81750860   81751487   628
YourSeq  174    501    786   3000   94.4%     chr18  -       34934310   34934848   539
YourSeq  173    479    683   3000   94.4%     chr7   -       29924068   29924274   207
YourSeq  172    492    716   3000   92.2%     chr7   +       28327672   28327901   230
YourSeq  169    501    695   3000   93.9%     chr17  -       23471669   23471865   197
YourSeq  168    491    683   3000   94.8%     chr17  +       21736434   21736629   196
YourSeq  167    506    695   3000   95.3%     chr6   +       47483413   47483606   194

Note: The 3000 bp section upstream of Exon 1 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
--------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr17  +       56258872   56261871   3000
YourSeq  829    1      908   3000   95.6%     chr11  -       29822586   29823492   907
YourSeq  72     2894   2992  3000   90.4%     chr15  +       80986412   80986807   396
YourSeq  65     2896   2978  3000   97.2%     chr10  +       53768199   53768290   92
YourSeq  63     2897   2975  3000   93.3%     chr12  +       54644894   54644987   94
YourSeq  63     2894   2977  3000   97.2%     chr10  +       7215176    7215268    93
YourSeq  62     2896   2975  3000   97.1%     chr14  -       88083736   88083824   89
YourSeq  62     2894   2975  3000   89.1%     chr14  -       58238431   58238530   100
YourSeq  61     2876   2994  3000   94.2%     chr16  +       54236662   54236809   148
YourSeq  61     2102   2975  3000   64.4%     chr12  +       105693840  105693933  94
YourSeq  61     2894   2963  3000   94.3%     chr10  +       39157915   39157986   72
YourSeq  60     2894   2975  3000   94.3%     chr13  -       4823246    4823336    91
YourSeq  60     2894   2975  3000   94.3%     chr10  -       32581931   32582021   91
YourSeq  58     2894   2975  3000   95.3%     chr14  +       45479966   45480052   87
YourSeq  56     2894   2964  3000   98.4%     chr16  +       86194572   86194642   71
YourSeq  56     2894   2975  3000   93.9%     chr1   +       33815482   33815572   91
YourSeq  55     2892   2964  3000   96.8%     chr1   +       64729135   64729212   78
YourSeq  54     2897   2991  3000   89.8%     chr11  -       67575372   67575475   104
YourSeq  54     2924   2995  3000   91.6%     chr10  -       93494313   93494382   70
YourSeq  54     2894   2962  3000   93.6%     chr10  +       85202900   85202970   71

Note: The 3000 bp section downstream of Exon 1 is BLAT searched against the genome. No significant similarity is found.
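
When reviewing BLAT results like those above, it can help to express each hit as a fraction of the query it covers. The sketch below parses whitespace-separated rows in the column order of the tables (QUERY, SCORE, query START and END, QSIZE, IDENTITY, CHROM, STRAND, target START and END, SPAN). The three sample rows are copied from the downstream table; `BlatHit`, `parse_blat_row` and `query_coverage` are hypothetical names, and the code applies no significance threshold, since the report does not state the criterion it used.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float   # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(row: str) -> BlatHit:
    """Parse one whitespace-separated BLAT summary row (11 columns)."""
    f = row.split()
    if len(f) != 11:
        raise ValueError(f"expected 11 columns, got {len(f)}: {row!r}")
    return BlatHit(f[0], int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def query_coverage(hit: BlatHit) -> float:
    """Fraction of the query sequence spanned by the hit (inclusive coordinates)."""
    return (hit.q_end - hit.q_start + 1) / hit.q_size


# Sample rows copied from the downstream BLAT table.
ROWS = [
    "YourSeq 3000 1 3000 3000 100.0% chr17 + 56258872 56261871 3000",
    "YourSeq 829 1 908 3000 95.6% chr11 - 29822586 29823492 907",
    "YourSeq 72 2894 2992 3000 90.4% chr15 + 80986412 80986807 396",
]

hits = [parse_blat_row(r) for r in ROWS]
for h in hits:
    print(f"{h.chrom:6} {h.strand} score={h.score:5d} "
          f"identity={h.identity:5.1f}% coverage={query_coverage(h):.3f}")
```

The first row in each table is the query matching its own locus on chr17 at full coverage; the remaining rows are the secondary hits whose coverage and identity are being judged.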


Gene and information:

Fem1a fem 1 homolog a [Mus musculus (house mouse)]

Gene ID: 14154, updated on 12-Aug-2019

Gene summary

Official Symbol: Fem1a (provided by MGI)
Official Full Name: fem 1 homolog a (provided by MGI)
Primary source: MGI:MGI:1335089
See related: Ensembl:ENSMUSG00000043683
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Fem1aa; AW611390
Orthologs: human; all

Genomic context

Location: 17 D; 17 29.24 cM

Exon count: 1

Annotation release | Status | Assembly | Chr | Location
108 | current | GRCm38.p6 (GCF_000001635.26) | 17 | NC_000083.6 (56256793..56263608)
Build 37.2 | previous assembly | MGSCv37 (GCF_000001635.18) | 17 | NC_000083.5 (56396216..56403031)

[Figure: genomic context, Chromosome 17 - NC_000083.6.]


Transcript information: This gene has 1 transcript

Gene: Fem1a ENSMUSG00000043683

Description: feminization 1 homolog a (C. elegans) [Source:MGI Symbol;Acc:MGI:1335089]
Location: Chromosome 17: 56,256,810-56,263,610 forward strand. GRCm38:CM001010.2
About this gene: This gene has 1 transcript (splice variant), 169 orthologues, 5 paralogues, is a member of 1 Ensembl protein family and is associated with 8 phenotypes.

Transcripts

Name | Transcript ID | bp | Protein | Translation ID | Biotype | CCDS | UniProt | Flags
Fem1a-201 | ENSMUST00000060253.4 | 6801 | 654aa | ENSMUSP00000057996.3 | Protein coding | CCDS28899 | Q9Z2G1 | TSL:NA, GENCODE basic, APPRIS P1

[Figure: Ensembl region view, 26.80 kb (56.25Mb to 56.27Mb). Tracks: Genes (Comprehensive set...) with Fem1a-201 > (protein coding) and < Ticam1-201 (protein coding); Contigs (AC026385.19 >); Regulatory Build. Regulation legend: CTCF; Promoter; Promoter Flank. Gene legend: protein coding, merged Ensembl/Havana.]


Transcript: ENSMUST00000060253

[Figure: Ensembl transcript and protein feature view for Fem1a-201 (protein coding), 6.80 kb, forward strand, protein ENSMUSP00000057... Scale bar: 0 to 654. Tracks:]

Low complexity (Seg)
Superfamily: Tetratricopeptide-like helical domain superfamily; Ankyrin repeat-containing domain superfamily
SMART: Ankyrin repeat
Prints: Ankyrin repeat
Pfam: Ankyrin repeat-containing domain; PF13857
PROSITE profiles: Ankyrin repeat-containing domain; Ankyrin repeat
PANTHER: PTHR24173; PTHR24173:SF12
Gene3D: Tetratricopeptide-like helical domain superfamily; Ankyrin repeat-containing domain superfamily
All sequence SNPs/i...: Sequence variants (dbSNP and all other sources)
Variant legend: missense variant; synonymous variant

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
