https://www.alphaknockout.com

Mouse Insig2 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create an Insig2 conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Insig2 gene (NCBI Reference Sequence: NM_133748; Ensembl: ENSMUSG00000003721) is located on Mouse chromosome 1. Six exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 6 (Transcript: ENSMUST00000003818). Exon 3 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the Mouse Insig2 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-36M6 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO Mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 3 starts at about 36.3% of the coding region. Knockout of exon 3 will result in a frameshift of the gene. The size of intron 2 for 5'-loxP site insertion is 7271 bp, and the size of intron 3 for 3'-loxP site insertion is 4894 bp. The size of the effective cKO region is ~625 bp. The cKO region does not contain any other known gene.
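The position and frameshift statements in the note can be checked with simple arithmetic. The sketch below assumes a 225-aa protein (the length listed for Insig2-201 in the transcript table of this report). The helper names and the example exon length are illustrative assumptions, since the report does not state the coding length of exon 3.

```python
# Sketch only: helper names and the example exon length are illustrative assumptions.

def cds_offset_bp(fraction: float, protein_aa: int) -> int:
    """Approximate number of coding bases that precede a given fraction of the CDS."""
    cds_bp = protein_aa * 3  # coding bases, stop codon excluded
    return round(fraction * cds_bp)


def causes_frameshift(deleted_coding_bp: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_bp % 3 != 0


offset = cds_offset_bp(0.363, 225)   # about 36.3% into a 225-aa CDS
first_codon = offset // 3 + 1        # 1-based codon containing the next coding base
print(offset, first_codon)

hypothetical_exon_bp = 125           # NOT from the report; illustration only
print(causes_frameshift(hypothetical_exon_bp))
```

Under these assumptions, exon 3 begins roughly 245 coding bases into the CDS, so a frameshift there would affect most of the protein.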


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (exons labeled 1, 3 and 6; 5' and 3' gRNA regions), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Insig2; homology arm; cKO region; loxP site.]
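The relationship between the targeted allele and the constitutive KO allele can be illustrated with a toy model. The sketch below assumes two loxP sites in the same orientation flanking exon 3, so that Cre excises the intervening segment and leaves a single loxP site behind. The list representation and the function name are hypothetical and serve only as an illustration.

```python
# Toy model only: alleles are lists of labels, not real sequence.

def cre_recombine(allele: list[str]) -> list[str]:
    """Excise everything between the first two loxP sites, keeping one loxP.

    Assumes both sites are in the same orientation (excision, not inversion).
    """
    sites = [i for i, element in enumerate(allele) if element == "loxP"]
    if len(sites) < 2:
        return list(allele)  # nothing to excise
    first, second = sites[0], sites[1]
    return allele[:first + 1] + allele[second + 1:]


targeted = ["exon1", "exon2", "loxP", "exon3", "loxP", "exon4", "exon5", "exon6"]
ko = cre_recombine(targeted)
print(ko)
```

In this model the wildtype allele (no loxP sites) is returned unchanged, which mirrors the fact that the gene stays intact until Cre is expressed in a floxed background.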


Overview of the Dot Plot

[Figure: self-alignment dot plot of Sequence 12. Window size: 10 bp. Legend: forward; reverse complement.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
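A self-alignment dot plot of this kind can be approximated with exact word matching. The following is a minimal sketch, not the tool used for this report: it records a dot wherever two windows of the sequence are identical (forward) or where one window equals the reverse complement of another. Function names and the toy sequence are assumptions; tandem repeats appear as dots whose offsets are multiples of the repeat unit.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def word_positions(seq: str, window: int) -> dict:
    """Map each window-length word to the list of its start positions."""
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    return positions


def forward_dots(seq: str, window: int = 10) -> list:
    """(i, j) pairs with i < j where the words starting at i and j are identical."""
    dots = []
    for starts in word_positions(seq, window).values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                dots.append((starts[a], starts[b]))
    return sorted(dots)


def revcomp_dots(seq: str, window: int = 10) -> list:
    """(i, j) pairs where the word at i equals the reverse complement of the word at j."""
    positions = word_positions(seq, window)
    dots = []
    for j in range(len(seq) - window + 1):
        rc = reverse_complement(seq[j:j + window])
        dots.extend((i, j) for i in positions.get(rc, []))
    return sorted(dots)


toy = "GATTACA" * 3  # three tandem copies of a 7-bp unit (toy data)
dots = forward_dots(toy, window=7)
print(dots)
```

Exact matching is stricter than most dot plot tools, which usually tolerate mismatches within the window, so this sketch would under-report diverged repeats.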

Overview of the GC Content Distribution

[Figure: GC content distribution of Sequence 12. Window size: 300 bp.]

Summary: Full Length(7125bp) | A(25.67% 1829) | C(20.49% 1460) | T(30.96% 2206) | G(22.88% 1630)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
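The overall GC content follows directly from the base counts in the summary line. The sketch below recomputes it and also shows one way a windowed GC profile could be calculated. The `windowed_gc` helper and its `step` parameter are assumptions for illustration, not the method used to produce the plot in this report.

```python
# Base counts taken from the summary line of this report.
counts = {"A": 1829, "C": 1460, "T": 2206, "G": 1630}

total = sum(counts.values())
gc_percent = 100 * (counts["G"] + counts["C"]) / total
print(total, round(gc_percent, 2))


def windowed_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """GC percentage of each window of the given size (hypothetical helper)."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        profile.append(100 * gc / window)
    return profile


# Toy input only; the real 7125-bp sequence is not included in the report.
print(windowed_gc("GGCCAATT", window=4, step=4))
```

The counts sum to the stated full length of 7125 bp, and the overall GC content comes to about 43.4%.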


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr1   -       121312576  121315575  3000
YourSeq  71     1      83    3000   89.1%     chr15  -       88829328   88829402   75
YourSeq  66     1      69    3000   100.0%    chr2   -       59348106   59348212   107
YourSeq  66     1      69    3000   98.6%     chr14  -       14688638   14689168   531
YourSeq  65     1      69    3000   98.6%     chr5   -       130801191  130801388  198
YourSeq  64     1      69    3000   97.1%     chr7   -       121008371  121008539  169
YourSeq  64     1      69    3000   100.0%    chr3   +       103557900  103558026  127
YourSeq  64     1      67    3000   98.6%     chr15  +       99380691   99380811   121
YourSeq  63     1      69    3000   97.1%     chr11  +       58777376   59096221   318846
YourSeq  62     1      67    3000   97.1%     chr2   -       169945001  169945091  91
YourSeq  61     1      69    3000   95.7%     chr5   -       79043441   79043527   87
YourSeq  61     1      69    3000   97.0%     chr7   +       67890139   67890215   77
YourSeq  58     1      67    3000   94.1%     chr2   -       169944971  169945039  69
YourSeq  58     1      69    3000   92.7%     chr10  +       94128907   94128993   87
YourSeq  57     1      69    3000   98.4%     chrX   -       145868154  145868246  93
YourSeq  57     1      69    3000   98.4%     chr19  -       52268010   52268148   139
YourSeq  57     1      67    3000   93.9%     chr16  -       23548804   23548870   67
YourSeq  57     1      69    3000   98.4%     chr10  -       113279704  113279796  93
YourSeq  57     1      69    3000   96.8%     chr10  +       68225033   68225709   677
YourSeq  56     1      69    3000   96.8%     chr9   -       121128094  121128184  91

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr1   -       121308951  121311950  3000
YourSeq  458    2009   2650  3000   89.1%     chr2   +       115412971  115413590  620
YourSeq  437    2007   2647  3000   88.2%     chr17  -       56526691   56527306   616
YourSeq  410    2007   2685  3000   89.9%     chr6   +       90330802   90331593   792
YourSeq  404    2007   2513  3000   90.3%     chr13  +       89775556   89776055   500
YourSeq  400    2024   2648  3000   85.3%     chr14  -       21915938   21916539   602
YourSeq  392    2007   2616  3000   88.6%     chr7   -       4227003    4227761    759
YourSeq  387    2021   2649  3000   87.9%     chr5   +       27304961   27305560   600
YourSeq  379    2007   2493  3000   89.2%     chr11  +       24909070   24909539   470
YourSeq  378    2007   2500  3000   89.9%     chr11  -       21831888   21832374   487
YourSeq  378    2007   2508  3000   89.2%     chr18  +       15099211   15099692   482
YourSeq  377    2007   2513  3000   91.8%     chr14  -       49288390   49288994   605
YourSeq  376    2007   2491  3000   88.2%     chr2   +       134712580  134713054  475
YourSeq  375    2007   2490  3000   88.9%     chr2   +       142728527  142728993  467
YourSeq  375    2008   2494  3000   88.8%     chr12  +       10560496   10560970   475
YourSeq  375    2008   2491  3000   88.3%     chr1   +       97489050   97489523   474
YourSeq  374    2007   2493  3000   88.5%     chr17  -       14797718   14798195   478
YourSeq  372    2007   2500  3000   88.8%     chr3   -       29926968   29927451   484
YourSeq  372    2007   2491  3000   89.2%     chr18  +       16384972   16385455   484
YourSeq  372    2008   2687  3000   85.4%     chr10  +       114774037  114774524  488

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
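For readers who want to inspect the BLAT hits programmatically, the rows can be parsed into records and summarized. This is a minimal sketch using three rows copied from the downstream table. The `BlatHit` type, the `query_coverage` helper, and the idea of separating the full-length self hit from partial hits are illustrative assumptions, not the criteria used in this report to judge significance.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float   # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one whitespace-separated BLAT summary row."""
    f = line.split()
    return BlatHit(f[0], int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def query_coverage(hit: BlatHit) -> float:
    """Fraction of the query covered by the aligned interval."""
    return (hit.q_end - hit.q_start + 1) / hit.q_size


rows = [
    "YourSeq 3000 1 3000 3000 100.0% chr1 - 121308951 121311950 3000",
    "YourSeq 458 2009 2650 3000 89.1% chr2 + 115412971 115413590 620",
    "YourSeq 437 2007 2647 3000 88.2% chr17 - 56526691 56527306 616",
]
hits = [parse_hit(r) for r in rows]

# The first row is the query matching its own locus; the rest are partial hits.
partial = [h for h in hits if h.score < h.q_size]
for h in partial:
    print(h.chrom, h.identity, round(query_coverage(h), 3))
```

In these rows the partial hits cover only about a fifth of the 3000-bp query, all within roughly the same query interval (positions ~2007 to ~2687).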


Gene information: Insig2 insulin induced gene 2 [ Mus musculus (house mouse) ] Gene ID: 72999, updated on 24-Oct-2019

Gene summary

Official Symbol: Insig2 (provided by MGI)
Official Full Name: insulin induced gene 2 (provided by MGI)
Primary source: MGI:MGI:1920249
See related: Ensembl:ENSMUSG00000003721
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Insig-2; 2900053I11Rik; C730043J18Rik
Expression: Ubiquitous expression in liver adult (RPKM 40.0), large intestine adult (RPKM 16.4) and 26 other tissues
Orthologs: human, all

Genomic context

Location: 1; 1 E2.3

Exon count: 7

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  1    NC_000067.6 (121304353..121332662, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    1    NC_000067.5 (123200930..123229157, complement)

[Figure: genomic context, Chromosome 1 - NC_000067.6]


Transcript information: This gene has 14 transcripts

Gene: Insig2 ENSMUSG00000003721

Description: insulin induced gene 2 [Source:MGI Symbol;Acc:MGI:1920249]
Gene Synonyms: 2900053I11Rik, C730043J18Rik, Insig-2
Location: Chromosome 1: 121,304,353-121,332,589 reverse strand. GRCm38:CM000994.2
About this gene: This gene has 14 transcripts (splice variants), 199 orthologues, 1 paralogue and is a member of 1 Ensembl protein family.

Transcripts

Name Transcript ID bp Protein Translation ID Biotype CCDS UniProt Flags

Insig2-201 ENSMUST00000003818.13 2697 225aa ENSMUSP00000003818.7 Protein coding CCDS15236 Q91WG1 TSL:1 GENCODE basic APPRIS P1

Insig2-202 ENSMUST00000071064.12 1292 225aa ENSMUSP00000065485.6 Protein coding CCDS15236 Q91WG1 TSL:1 GENCODE basic APPRIS P1

Insig2-203 ENSMUST00000159085.7 1252 225aa ENSMUSP00000124345.1 Protein coding CCDS15236 Q91WG1 TSL:1 GENCODE basic APPRIS P1

Insig2-208 ENSMUST00000160968.7 1204 225aa ENSMUSP00000123747.1 Protein coding CCDS15236 Q91WG1 TSL:5 GENCODE basic APPRIS P1

Insig2-214 ENSMUST00000186915.1 674 117aa ENSMUSP00000140292.1 Protein coding CCDS78670 A0A087WQP7 TSL:3 GENCODE basic

Insig2-213 ENSMUST00000162790.1 557 146aa ENSMUSP00000124697.1 Protein coding - E0CXF1 CDS 3' incomplete TSL:3

Insig2-212 ENSMUST00000162582.1 478 90aa ENSMUSP00000125046.1 Protein coding - E0CXS5 CDS 3' incomplete TSL:5

Insig2-207 ENSMUST00000160688.1 471 78aa ENSMUSP00000123702.1 Protein coding - E0CZ15 CDS 3' incomplete TSL:3

Insig2-210 ENSMUST00000161818.1 362 35aa ENSMUSP00000123993.1 Protein coding - E0CYQ1 CDS 3' incomplete TSL:2

Insig2-204 ENSMUST00000159125.1 359 59aa ENSMUSP00000123729.1 Protein coding - E0CZ00 CDS 3' incomplete TSL:2

Insig2-209 ENSMUST00000161068.1 204 26aa ENSMUSP00000125216.1 Protein coding - E0CXL0 CDS 3' incomplete TSL:5

Insig2-211 ENSMUST00000162019.1 613 No protein - Retained intron - - TSL:1

Insig2-205 ENSMUST00000159192.1 465 No protein - Retained intron - - TSL:3

Insig2-206 ENSMUST00000159528.1 334 No protein - Retained intron - - TSL:2


[Figure: genome browser view of the Insig2 locus, 48.24 kb, 121.30Mb to 121.34Mb. Tracks: contigs (AC163333.6); genes on the forward strand (Gm37174-201, TEC) and on the reverse strand (Gm38283-201, TEC; Insig2-201 through Insig2-214, protein coding or retained intron); Regulatory Build. Regulation legend: CTCF; enhancer; open chromatin; promoter; promoter flank. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (processed transcript).]


Transcript: ENSMUST00000003818

[Figure: protein view of Insig2-201 (protein coding; reverse strand, 23.67 kb). Tracks: ENSMUSP00000003... (Transmembrane heli...); Pfam (Insulin-induced protein family); PANTHER (Insulin-induced protein family, PTHR15301:SF10); All sequence SNPs/i... (sequence variants, dbSNP and all other sources). Variant legend: missense variant; synonymous variant. Scale bar: 0 to 225.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
