https://www.alphaknockout.com

Mouse Gpr141 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Gpr141 conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Gpr141 gene (NCBI Reference Sequence: NM_181754.4; Ensembl: ENSMUSG00000053101) is located on Mouse chromosome 13. Five exons are identified, with both the ATG start codon and the TAG stop codon in exon 5 (Transcript: ENSMUST00000065335). Exon 5 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in the loss of function of the Mouse Gpr141 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-223E19 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO Mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:

Exon 5 covers 100.0% of the coding region; both the start codon and the stop codon are in exon 5. The size of intron 4 for 5'-loxP site insertion: 2392 bp. The size of the effective cKO region: ~1188 bp. The cKO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (with the gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Gpr141; homology arm; cKO region; loxP site.]
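The step from the targeted allele to the constitutive KO allele can be illustrated with a toy string model. This is only a sketch under stated assumptions: the loxP site is the commonly cited 34 bp sequence, the arm and exon sequences are made-up placeholders (not Gpr141 sequence), and `cre_excise` is a hypothetical helper name, not part of any library.

```python
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"  # commonly cited 34 bp loxP site


def cre_excise(allele: str, site: str = LOXP) -> str:
    """Model Cre recombination between two same-orientation sites.

    The segment between the first two sites is removed together with one
    site, so exactly one site remains in the product.
    """
    first = allele.find(site)
    if first == -1:
        return allele  # no site at all: nothing to excise
    second = allele.find(site, first + len(site))
    if second == -1:
        return allele  # only one site: nothing to excise
    return allele[:first] + allele[second:]


# Placeholder sequences for illustration only (NOT real Gpr141 sequence).
five_arm = "AAAACCCC"
exon5 = "ATGGCTAAGTAG"
three_arm = "CCTTGGAA"

targeted = five_arm + LOXP + exon5 + LOXP + three_arm  # floxed exon
ko = cre_excise(targeted)                              # after Cre
print(ko == five_arm + LOXP + three_arm)
```

In this model the wildtype allele, which carries no loxP site, is returned unchanged, matching the expectation that Cre only affects the targeted allele.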


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether it contains tandem repeats. Tandem repeats are found in the dot plot matrix, so this targeting vector may be difficult to construct.
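A self-comparison of this kind can be sketched as follows. The sketch assumes exact matching of 10 bp windows, which may differ from the scoring used by the tool that produced the plot; `self_dot_matches` and the toy sequences are hypothetical and for illustration only.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def self_dot_matches(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) dots for exact window matches.

    forward: seq[i:i+window] == seq[j:j+window] with i < j (off-diagonal only)
    reverse: seq[i:i+window] == reverse complement of seq[j:j+window]
    """
    index = defaultdict(list)
    for i in range(len(seq) - window + 1):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for kmer, positions in index.items():
        # Same window seen at two different positions: a repeat.
        for a in positions:
            for b in positions:
                if a < b:
                    forward.append((a, b))
        # Window whose reverse complement also occurs in the sequence.
        for j in index.get(reverse_complement(kmer), []):
            for i in positions:
                reverse.append((i, j))
    return sorted(forward), sorted(reverse)


# Toy sequence with a 10 bp unit repeated three times in tandem.
unit = "ACGTTGCAAG"
toy = "TTTTT" + unit * 3 + "CCCCC"
forward, reverse = self_dot_matches(toy)
print((5, 15) in forward, (15, 25) in forward)
```

Tandem repeats show up as forward dots running parallel to the main diagonal at an offset equal to the repeat unit length (10 in the toy sequence).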

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full length (7415 bp) | A (30.73%, 2279) | C (17.83%, 1322) | T (31.84%, 2361) | G (19.60%, 1453)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
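The overall GC content implied by the summary counts, and a sliding-window profile of the kind plotted above, can be computed as in the sketch below. The function names are hypothetical, and the step size of 1 bp is an assumption, since the report states only the 300 bp window.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq: str, window: int = 300, step: int = 1):
    """Yield (start, GC%) for every full window along seq."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, gc_percent(seq[start:start + window])


# Base counts taken from the summary line of the report.
counts = {"A": 2279, "C": 1322, "T": 2361, "G": 1453}
total = sum(counts.values())
overall_gc = 100.0 * (counts["G"] + counts["C"]) / total
print(total, round(overall_gc, 2))  # 7415 37.42
```

The counts sum to the stated full length of 7415 bp, and the overall GC content of about 37.4% is consistent with the note that no high-GC region is present.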


BLAT Search Results (up)

QUERY    SCORE  Q_START  Q_END  QSIZE  IDENTITY  CHROM  STRAND  T_START    T_END      SPAN
-------------------------------------------------------------------------------------------
YourSeq  3000   1        3000   3000   100.0%    chr13  -       19752854   19755853   3000
YourSeq  489    1476     2109   3000   89.2%     chr5   -       80666581   80667205   625
YourSeq  168    1794     2117   3000   88.6%     chr16  -       72562135   72562653   519
YourSeq  162    1837     2115   3000   88.0%     chrX   -       155567767  155568262  496
YourSeq  159    1869     2105   3000   90.1%     chr10  +       115737510  115737751  242
YourSeq  157    1869     2104   3000   82.4%     chr8   +       103842209  103842437  229
YourSeq  155    1869     2171   3000   85.8%     chr3   +       104377131  104377650  520
YourSeq  154    1869     2116   3000   83.0%     chr1   -       65260586   65260837   252
YourSeq  154    1867     2118   3000   88.5%     chr8   +       47603209   47603766   558
YourSeq  153    1875     2112   3000   84.9%     chr13  +       91789676   91789888   213
YourSeq  150    1874     2117   3000   85.8%     chr3   -       158556780  158557046  267
YourSeq  150    1869     2106   3000   86.4%     chr15  +       25311832   25312071   240
YourSeq  148    1877     2105   3000   89.9%     chr2   +       122231208  122231441  234
YourSeq  147    1869     2106   3000   86.7%     chr13  -       101106141  101106378  238
YourSeq  145    1874     2112   3000   83.3%     chr6   -       105287857  105288101  245
YourSeq  145    1869     2105   3000   86.9%     chr5   +       18023201   18023460   260
YourSeq  144    1842     2328   3000   88.4%     chr1   -       129489321  129489862  542
YourSeq  143    1869     2093   3000   86.6%     chr5   -       110155518  110155746  229
YourSeq  143    1874     2103   3000   81.8%     chr14  +       119416895  119417127  233
YourSeq  141    1867     2110   3000   90.3%     chrX   -       47683473   47967183   283711

Note: The 3000 bp section upstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  Q_START  Q_END  QSIZE  IDENTITY  CHROM  STRAND  T_START    T_END      SPAN
-------------------------------------------------------------------------------------------
YourSeq  3000   1        3000   3000   100.0%    chr13  -       19748689   19751688   3000
YourSeq  36     2649     2696   3000   97.5%     chr2   +       150572695  150572867  173
YourSeq  34     396      464    3000   71.1%     chr16  +       51573500   51573551   52

Note: The 3000 bp section downstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.
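Result rows like those above can be parsed and screened programmatically. The following is a minimal sketch: `BlatHit`, `parse_hit` and `off_target_hits` are hypothetical names, and the score threshold is an arbitrary illustrative value, not the criterion used in this report to judge significance.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one result row; any tokens before the query name are skipped."""
    f = line.split()
    f = f[f.index("YourSeq") + 1:]
    return BlatHit(int(f[0]), int(f[1]), int(f[2]), int(f[3]),
                   float(f[4].rstrip("%")), f[5], f[6],
                   int(f[7]), int(f[8]), int(f[9]))


def off_target_hits(hits, min_score):
    """Drop the full-length 100% self-match; keep hits scoring >= min_score."""
    return [h for h in hits
            if not (h.identity == 100.0
                    and h.q_end - h.q_start + 1 == h.q_size)
            and h.score >= min_score]


# Rows from the downstream search in this report.
rows = [
    "YourSeq 3000 1 3000 3000 100.0% chr13 - 19748689 19751688 3000",
    "YourSeq 36 2649 2696 3000 97.5% chr2 + 150572695 150572867 173",
    "YourSeq 34 396 464 3000 71.1% chr16 + 51573500 51573551 52",
]
hits = [parse_hit(r) for r in rows]
# min_score=35 is an arbitrary threshold chosen only for illustration.
print([h.chrom for h in off_target_hits(hits, min_score=35)])  # ['chr2']
```

In these rows the SPAN column equals T_END - T_START + 1, which gives a quick consistency check after parsing.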


Gene and information:

Gpr141 G protein-coupled receptor 141 [Mus musculus (house mouse)]
Gene ID: 353346, updated on 26-Jun-2020

Gene summary

Official Symbol: Gpr141 (provided by MGI)
Official Full Name: G protein-coupled receptor 141 (provided by MGI)
Primary source: MGI:MGI:2672983
See related: Ensembl:ENSMUSG00000053101
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: PGR13
Expression: Low expression observed in reference dataset

Genomic context

Location: 13; 13 A2
Exon count: 6

Annotation release  Status             Assembly                      Chr  Location
108.20200622        current            GRCm38.p6 (GCF_000001635.26)  13   NC_000079.6 (19749680..19824336, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    13   NC_000079.5 (19841551..19916126, complement)

[Figure: genomic context of Gpr141 on chromosome 13 (NC_000079.6).]


Transcript information: This gene has 4 transcripts

Gene: Gpr141 ENSMUSG00000053101

Description: G protein-coupled receptor 141 [Source:MGI Symbol;Acc:MGI:2672983]
Gene Synonyms: Pgr13
Location: Chromosome 13: 19,749,682-19,824,257 reverse strand. GRCm38:CM001006.2
About this gene: This gene has 4 transcripts (splice variants), 266 orthologues, 5 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name        Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt     Flags
Gpr141-201  ENSMUST00000065335.2  3433  305aa       ENSMUSP00000066921.2  Protein coding   CCDS26264  Q7TQP0      TSL:1 GENCODE basic APPRIS P1
Gpr141-203  ENSMUST00000222664.1  646   113aa       ENSMUSP00000152482.1  Protein coding   -          A0A1Y7VM27  CDS 3' incomplete TSL:2
Gpr141-204  ENSMUST00000222772.1  1531  No protein  -                     Retained intron  -          -           TSL:1
Gpr141-202  ENSMUST00000221268.1  1334  No protein  -                     Retained intron  -          -           TSL:NA

[Figure: Ensembl genome browser view of a 94.58 kb region of chromosome 13 (19.74 Mb to 19.82 Mb), showing contigs AC153016.2 and AC122886.3, the processed pseudogene Gm19154-201 on the forward strand, the Regulatory Build track, and on the reverse strand the transcripts Gpr141-201 and Gpr141-203 (protein coding) and Gpr141-204 and Gpr141-202 (retained intron).]


Transcript: ENSMUST00000065335

[Figure: protein feature view of Gpr141-201 (protein coding; reverse strand, 74.58 kb) with tracks for ENSMUSP00000066..., Transmembrane heli..., Low complexity (Seg), domain annotations, and sequence variants (dbSNP and all other sources; missense and synonymous variants). Scale bar: 0 to 305.]

Domain annotations shown in the figure:
Superfamily: SSF81321
Prints: G protein-coupled receptor, -like
Pfam: G protein-coupled receptor, rhodopsin-like
PROSITE profiles: GPCR, rhodopsin-like, 7TM
PANTHER: PTHR24237; PTHR24237:SF35
Gene3D: 1.20.1070.10
CDD: cd14994

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
