https://www.alphaknockout.com

Mouse Reg4 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Reg4 conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Reg4 gene (NCBI Reference Sequence: NM_026328; Ensembl: ENSMUSG00000027876) is located on mouse chromosome 3. Six exons are identified, with the ATG start codon in exon 2 and the TAG stop codon in exon 6 (Transcript: ENSMUST00000029469). Exon 2 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Reg4 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-4O4 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:

Exon 2 starts from about 100% of the coding region. The knockout of exon 2 will result in a frameshift of the gene. The size of intron 1 for 5'-loxP site insertion: 2454 bp, and the size of intron 2 for 3'-loxP site insertion: 4943 bp. The size of the effective cKO region: ~567 bp. The cKO region does not contain any other known gene.


Overview of the Targeting Strategy

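[Figure: targeting strategy diagram with four panels: wildtype allele (with the 5' and 3' gRNA regions marked), targeting vector, targeted allele, and constitutive KO allele (after Cre recombination). Exons 1, 2 and 6 are labeled. Legend: exon of mouse Reg4, homology arm, cKO region, loxP site.]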
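The step from the targeted allele to the constitutive KO allele can be illustrated with a small string model. This is a minimal sketch, not the vendor's pipeline: it assumes two loxP sites in the same orientation, uses the commonly cited 34 bp loxP sequence, and the arm and exon sequences are short hypothetical placeholders rather than real Reg4 sequence.

```python
# Commonly cited 34 bp loxP site: 13 bp repeat, 8 bp spacer, 13 bp inverted repeat.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(allele: str, loxp: str = LOXP) -> str:
    """Model Cre recombination between two same-orientation loxP sites.

    The segment between the sites is removed together with one loxP copy,
    so a single loxP site remains in the product.
    """
    first = allele.find(loxp)
    if first == -1:
        return allele  # no loxP site: nothing to excise
    second = allele.find(loxp, first + len(loxp))
    if second == -1:
        return allele  # only one loxP site: nothing to excise
    # Keep everything through the first loxP, then resume after the second one.
    return allele[: first + len(loxp)] + allele[second + len(loxp):]


# Hypothetical toy sequences standing in for the homology arms and exon 2.
five_prime_arm = "AAAACCCC"
exon2 = "GGGGATGTTTT"
three_prime_arm = "CCCCGGGG"

targeted_allele = five_prime_arm + LOXP + exon2 + LOXP + three_prime_arm
ko_allele = cre_excise(targeted_allele)

# The floxed exon is gone and one loxP site is left behind.
print(ko_allele == five_prime_arm + LOXP + three_prime_arm)
```

The model ignores site orientation and strand, which matter in practice: inverted loxP sites lead to inversion rather than excision.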


Overview of the Dot Plot

[Figure: dot plot of Sequence 12 aligned against itself; window size: 10 bp; forward and reverse-complement matches plotted.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
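A self-alignment of this kind can be sketched as follows. This is an illustrative approximation that reports exact window matches only; the tool used for the report is not named, so the function names and the exact-match criterion are assumptions. Only the 10 bp window size comes from the report.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def dot_plot_matches(seq: str, window: int = 10):
    """Return (forward, reverse) sets of (i, j) window start positions.

    forward: windows at i and j are identical (i != j, so the main
             diagonal of the self-comparison is skipped).
    reverse: the window at i equals the reverse complement of the window at j.
    """
    index = {}
    last_start = len(seq) - window
    for i in range(last_start + 1):
        index.setdefault(seq[i:i + window], []).append(i)

    forward, reverse = set(), set()
    for j in range(last_start + 1):
        word = seq[j:j + window]
        for i in index.get(word, []):
            if i != j:
                forward.add((i, j))
        for i in index.get(reverse_complement(word), []):
            reverse.add((i, j))
    return forward, reverse


# Toy input: the same 10 bp unit appears twice, separated by a spacer.
unit = "ACGGTCATTG"
toy = unit + "TTTTTCCCCC" + unit
forward, reverse = dot_plot_matches(toy, window=10)
print((0, 20) in forward)  # True: the repeated unit shows up off the diagonal
```

Runs of off-diagonal forward dots would point to direct or tandem repeats, and reverse dots to inverted repeats; an empty or sparse result corresponds to the "no significant tandem repeat" conclusion.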

Overview of the GC Content Distribution

[Figure: GC content distribution along Sequence 12; window size: 300 bp.]

Summary: Full Length (7067 bp) | A (28.58%, 2020) | C (20.66%, 1460) | T (25.81%, 1824) | G (24.95%, 1763)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
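The overall GC content follows directly from the base counts in the summary, and a windowed profile can be computed in the same way. The sketch below is illustrative only: the function names and the step size are assumptions, and the toy sequence is not Reg4 sequence. The 300 bp window and the base counts are taken from the report.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """GC fraction of each full-length window along the sequence."""
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts as reported in the summary line.
counts = {"A": 2020, "C": 1460, "T": 1824, "G": 1763}
total = sum(counts.values())
overall_gc_percent = 100 * (counts["G"] + counts["C"]) / total
print(total, round(overall_gc_percent, 2))

# Toy sequence: one GC-only window followed by one AT-only window.
toy = "GC" * 150 + "AT" * 150
print(windowed_gc(toy, window=300, step=300))
```

The counts sum to the stated full length of 7067 bp, and G plus C together make up roughly 45.6% of the sequence.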


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr3   +       98221525   98224524   3000
YourSeq  41     952    1016  3000   84.5%     chr11  +       34384562   34384621   60
YourSeq  38     819    973   3000   78.6%     chr16  +       87046214   87046359   146
YourSeq  37     512    897   3000   97.5%     chr13  -       48726071   48726462   392
YourSeq  32     852    897   3000   79.5%     chr10  +       23466112   23466151   40
YourSeq  30     1654   1688  3000   81.3%     chr2   -       26245613   26245644   32
YourSeq  28     991    1018  3000   100.0%    chr7   +       24617785   24617812   28
YourSeq  26     948    973   3000   100.0%    chr4   -       137280261  137280286  26
YourSeq  26     872    897   3000   100.0%    chr3   -       114580086  114580111  26
YourSeq  26     872    897   3000   100.0%    chr16  -       10345976   10346001   26
YourSeq  26     872    897   3000   100.0%    chr14  -       47924341   47924366   26
YourSeq  26     948    973   3000   100.0%    chr11  -       68881311   68881336   26
YourSeq  26     944    973   3000   93.4%     chr2   +       168582470  168582499  30
YourSeq  26     2282   2314  3000   79.4%     chr17  +       67209413   67209442   30
YourSeq  26     872    897   3000   100.0%    chr14  +       24145801   24145826   26
YourSeq  26     872    897   3000   100.0%    chr12  +       110427986  110428011  26
YourSeq  26     872    897   3000   100.0%    chr10  +       67925481   67925506   26
YourSeq  25     1668   1693  3000   100.0%    chr5   -       124292806  124292832  27
YourSeq  25     945    973   3000   93.2%     chr13  -       64309463   64309491   29
YourSeq  24     991    1016  3000   96.2%     chr4   -       118530351  118530376  26

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
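One way to read the table is to compare the self-hit with the best-scoring secondary hit. The sketch below parses a few of the rows listed above; the `BlatHit` type and `parse_hit` helper are hypothetical names introduced for illustration, and the report does not state which cutoff it uses to call a hit significant.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one whitespace-separated BLAT result row.

    Only the last 11 fields are used, so leading link labels such as
    "browser details" are tolerated.
    """
    f = line.split()[-11:]
    return BlatHit(
        f[0], int(f[1]), int(f[2]), int(f[3]), int(f[4]),
        float(f[5].rstrip("%")), f[6], f[7], int(f[8]), int(f[9]), int(f[10]),
    )


# First three rows of the upstream BLAT table.
rows = [
    "YourSeq 3000 1 3000 3000 100.0% chr3 + 98221525 98224524 3000",
    "YourSeq 41 952 1016 3000 84.5% chr11 + 34384562 34384621 60",
    "YourSeq 38 819 973 3000 78.6% chr16 + 87046214 87046359 146",
]

hits = sorted((parse_hit(r) for r in rows), key=lambda h: h.score, reverse=True)
self_hit, best_other = hits[0], hits[1]

# Share of the 3000 bp query covered by the best secondary hit's score.
print(best_other.chrom, best_other.score, best_other.score / self_hit.q_size)
```

The best secondary hit scores 41 against a 3000 bp query, which is in line with the note that no significant similarity is found.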

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr3   +       98225092   98228091   3000
YourSeq  62     1395   1497  3000   95.6%     chr14  -       74765648   74766074   427
YourSeq  50     1442   1496  3000   96.3%     chr3   -       16691629   16691683   55
YourSeq  50     1443   1500  3000   96.5%     chr2   +       97661797   97661857   61
YourSeq  49     1443   1498  3000   94.5%     chr15  -       78224139   78224195   57
YourSeq  49     1443   1499  3000   94.6%     chr2   +       173646514  173646571  58
YourSeq  49     1448   1502  3000   88.7%     chr12  +       119959597  119959649  53
YourSeq  48     1443   1495  3000   96.2%     chr11  -       46718192   46718262   71
YourSeq  48     1445   1498  3000   96.3%     chrX   +       142591485  142591550  66
YourSeq  47     1448   1499  3000   96.2%     chr2   +       156602329  156602418  90
YourSeq  47     1441   1494  3000   86.3%     chr19  +       50804610   50804660   51
YourSeq  46     1445   1497  3000   94.3%     chr2   -       46180503   46180558   56
YourSeq  46     1445   1497  3000   94.3%     chr17  -       53281619   53281671   53
YourSeq  46     1445   1494  3000   91.9%     chr3   +       124592217  124592265  49
YourSeq  46     1450   1498  3000   100.0%    chr17  +       84812375   84812430   56
YourSeq  45     1445   1498  3000   86.0%     chr2   -       20607074   20607124   51
YourSeq  45     1443   1494  3000   94.2%     chr12  +       77032499   77032559   61
YourSeq  44     1448   1498  3000   87.8%     chr5   +       66259439   66259487   49
YourSeq  44     1448   1495  3000   89.2%     chr15  +       86200564   86200609   46
YourSeq  43     1444   1494  3000   85.8%     chr3   -       150507348  150507396  49

Note: The 3000 bp section downstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
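The chr3 self-hits of the two BLAT searches also allow a quick consistency check on the stated cKO region size. The sketch below assumes the reported coordinates are 1-based and inclusive (which is consistent with each 3000 bp query reporting a span of 3000) and that the cKO region is exactly the interval between the two flanks; both are inferences from the numbers, not statements in the report.

```python
def interval_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive genomic interval."""
    return end - start + 1


def gap_between(upstream_end: int, downstream_start: int) -> int:
    """Number of bases strictly between two 1-based, inclusive intervals."""
    return downstream_start - upstream_end - 1


# chr3 self-hit coordinates from the upstream and downstream BLAT tables.
upstream = (98221525, 98224524)
downstream = (98225092, 98228091)

# Each flank should be 3000 bp long under the inclusive-coordinate assumption.
print(interval_length(*upstream), interval_length(*downstream))

# Bases between the flanks, to compare with the ~567 bp effective cKO region.
cko_size = gap_between(upstream[1], downstream[0])
print(cko_size)  # 567
```

Under these assumptions the interval between the two flanks is 567 bp, which agrees with the effective cKO region size given in the strategy notes.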


Gene and information:

Reg4 regenerating islet-derived family, member 4 [Mus musculus (house mouse)]
Gene ID: 67709, updated on 12-Aug-2019

Gene summary

Official Symbol: Reg4 (provided by MGI)
Official Full Name: regenerating islet-derived family, member 4 (provided by MGI)
Primary source: MGI:MGI:1914959
See related: Ensembl:ENSMUSG00000027876
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: GISP; RELP; 2010002L15Rik
Expression: Biased expression in colon adult (RPKM 46.3), large intestine adult (RPKM 32.9) and 2 other tissues
Orthologs: human, all

Genomic context

Location: 3; 3 F2.2

Exon count: 6

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  3    NC_000069.6 (98222138..98236748)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    3    NC_000069.5 (98026079..98040671)

[Figure: genomic context of Reg4 on Chromosome 3 - NC_000069.6.]


Transcript information: This gene has 1 transcript

Gene: Reg4 ENSMUSG00000027876

Description: regenerating islet-derived family, member 4 [Source:MGI Symbol;Acc:MGI:1914959]
Gene Synonyms: 2010002L15Rik, RELP
Location: Chromosome 3: 98,222,156-98,236,748 forward strand. GRCm38:CM000996.2
About this gene: This gene has 1 transcript (splice variant), 111 orthologues, 6 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name      Transcript ID         bp    Protein  Translation ID        Biotype         CCDS       UniProt  Flags
Reg4-201  ENSMUST00000029469.4  1021  157aa    ENSMUSP00000029469.4  Protein coding  CCDS17661  Q9D8G5   TSL:1 GENCODE basic APPRIS P1

[Figure: Ensembl genome browser view, 34.59 kb, forward and reverse strands, spanning approximately 98.22 Mb to 98.24 Mb. Tracks: genes (Comprehensive set...), showing Reg4-201 (protein coding); contigs (AC121771.3); Regulatory Build. Regulation legend: CTCF, enhancer, open chromatin, promoter flank. Gene legend: protein coding, merged Ensembl/Havana.]


Transcript: ENSMUST00000029469

[Figure: Ensembl transcript and protein view of Reg4-201 (protein coding), 14.59 kb, forward strand. Protein tracks for ENSMUSP00000029...: Cleavage site (Sign...; Superfamily: C-type lectin fold; SMART: C-type lectin-like; Prints: PR01504; Pfam: C-type lectin-like; PROSITE profiles: C-type lectin-like; PANTHER: PTHR45710, PTHR45710:SF6; Gene3D: C-type lectin-like/link domain superfamily; CDD: cd03594. Variant tracks: All sequence SNPs/i..., sequence variants (dbSNP and all other sources). Variant legend: missense variant, synonymous variant. Scale bar: 0 to 157.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
