https://www.alphaknockout.com

Mouse Irs4 Knockout Project (CRISPR/Cas9)

Objective: To create an Irs4 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Irs4 gene (NCBI Reference Sequence: NM_010572; Ensembl: ENSMUSG00000054667) is located on mouse chromosome X. Two exons are identified, with both the ATG start codon and the TGA stop codon in exon 1 (Transcript: ENSMUST00000067841). Exon 1 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Homozygotes for a targeted null mutation exhibit a 10% reduction in male adult size, slightly impaired oral glucose tolerance, and decreased reproductive ability.

Exon 1 starts from about 0.03% of the coding region and covers 100.0% of the coding region. The size of the effective KO region is ~3646 bp. The KO region does not contain any other known gene.
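As a rough consistency check on these figures, the coding-region length can be estimated from the 1216 aa protein length listed for ENSMUST00000067841 in the transcript table. The sketch below assumes the coding region is counted from the ATG through the stop codon, and that "starts from about 0.03%" refers to the first coding base; both are assumptions for illustration, and the function names are hypothetical.

```python
# Rough consistency check (assumption: the coding region runs from the ATG
# through the stop codon, so its length is 3 bp per residue plus 3 bp).
PROTEIN_LENGTH_AA = 1216  # protein length listed for ENSMUST00000067841


def cds_length_bp(protein_length_aa: int) -> int:
    """Coding-sequence length in bp, including the 3 bp stop codon."""
    return protein_length_aa * 3 + 3


def position_percent(position_bp: int, total_bp: int) -> float:
    """Express a 1-based position as a percentage of the total length."""
    return 100.0 * position_bp / total_bp


cds_bp = cds_length_bp(PROTEIN_LENGTH_AA)
first_base_pct = position_percent(1, cds_bp)
print(cds_bp, round(first_base_pct, 2))
```

Under these assumptions the estimated coding length is within a few bp of the reported ~3646 bp KO region, which fits exon 1 covering the whole coding region.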


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', showing exons 1 and 2 and two gRNA regions. Legend: exon of mouse Irs4; knockout region.]


Overview of the Dot Plot (up) Window size: 15 bp

[Figure: dot plot of the upstream sequence against itself. Legend: Forward; Reverse Complement. Axes: Sequence.]

Note: The 2000 bp section upstream of start codon is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
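The self-alignment behind a dot plot can be sketched as follows. This is a minimal illustration, not the tool used to produce the report: it records every pair of positions whose 15 bp words are identical on the forward strand only (reverse-complement matches are omitted for brevity), and the function name `dot_plot_matches` is hypothetical.

```python
from typing import Dict, List, Tuple


def dot_plot_matches(seq: str, window: int = 15) -> List[Tuple[int, int]]:
    """Return sorted (i, j) pairs with i < j whose window-length words match.

    Each pair is one off-diagonal dot in a self-versus-self dot plot;
    runs of dots parallel to the main diagonal indicate repeats.
    """
    seq = seq.upper()
    positions: Dict[str, List[int]] = {}
    # Index every word of length `window` by its start position.
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    matches: List[Tuple[int, int]] = []
    # Any word seen at more than one position yields off-diagonal dots.
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                matches.append((starts[a], starts[b]))
    return sorted(matches)


# Toy example: an 8 bp unit repeated twice in tandem.
toy = "ACGTTGCA" * 2
print(dot_plot_matches(toy, window=8))
```

A sequence without repeats at the chosen window size returns an empty list, which corresponds to a dot plot showing only the main diagonal.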

Overview of the Dot Plot (down) Window size: 15 bp

[Figure: dot plot of the downstream sequence against itself. Legend: Forward; Reverse Complement. Axes: Sequence.]

Note: The 2000 bp section downstream of stop codon is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.


Overview of the GC Content Distribution (up) Window size: 300 bp

[Figure: GC content distribution along the upstream sequence.]

Summary: Full Length(2000bp) | A(24.15% 483) | C(26.4% 528) | T(25.5% 510) | G(23.95% 479)

Note: The 2000 bp section upstream of start codon is analyzed to determine the GC content. Significant high GC-content regions are found. The gRNA site is selected outside of these high GC-content regions.
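A sliding-window GC calculation of this kind can be sketched as below. The 300 bp window matches the report; the 65% cutoff for calling a window "high GC" and all function names are hypothetical choices for illustration, since the report does not state its criterion.

```python
from typing import List, Tuple


def gc_percent(seq: str) -> float:
    """GC content of a sequence as a percentage (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq: str, window: int = 300, step: int = 1) -> List[Tuple[int, float]]:
    """GC percentage for each window, keyed by the window's 0-based start."""
    return [
        (i, gc_percent(seq[i:i + window]))
        for i in range(0, len(seq) - window + 1, step)
    ]


def high_gc_windows(seq: str, window: int = 300,
                    threshold: float = 65.0, step: int = 1) -> List[int]:
    """Start positions of windows at or above the (hypothetical) threshold."""
    return [i for i, gc in windowed_gc(seq, window, step) if gc >= threshold]


# Overall GC content from the base counts in the upstream summary line
# (C = 528, G = 479, full length = 2000 bp).
upstream_gc = 100.0 * (528 + 479) / 2000
print(round(upstream_gc, 2))
```

The overall figure computed from the summary counts is an average; a region can still contain individual 300 bp windows well above it, which is what the windowed scan is meant to expose.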

Overview of the GC Content Distribution (down) Window size: 300 bp

[Figure: GC content distribution along the downstream sequence.]

Summary: Full Length(2000bp) | A(27.9% 558) | C(20.85% 417) | T(30.85% 617) | G(20.4% 408)

Note: The 2000 bp section downstream of stop codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chrX   -       141725199  141727198  2000
YourSeq  62     367    428   2000   100.0%    chr6   +       76928200   76928261   62
YourSeq  45     378    428   2000   94.2%     chr7   -       78730381   78730431   51
YourSeq  41     380    428   2000   91.9%     chr1   -       12176116   12176164   49
YourSeq  28     399    428   2000   96.7%     chr9   +       77055457   77055486   30
YourSeq  28     401    428   2000   100.0%    chr13  +       73694384   73694411   28
YourSeq  26     397    428   2000   90.7%     chr1   -       18165245   18165276   32
YourSeq  25     405    431   2000   96.3%     chr17  +       55397820   55397846   27
YourSeq  25     278    311   2000   88.9%     chr1   +       81149544   81149576   33
YourSeq  21     504    525   2000   100.0%    chr1   -       69325601   69325623   23

Note: The 2000 bp section upstream of start codon is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chrX   -       141719551  141721550  2000
YourSeq  30     734    801   2000   94.2%     chr4   -       152172686  152172754  69
YourSeq  23     1972   1996  2000   87.5%     chr16  +       76995915   76995938   24

Note: The 2000 bp section downstream of stop codon is BLAT searched against the genome. No significant similarity is found.
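One way to screen BLAT output such as the downstream results is to set aside the full-length self-match and compare the remaining scores against a cutoff. The sketch below uses the three downstream hits reported above; the cutoff of half the query size and the names `BlatHit`, `secondary_hits` and `flagged_hits` are hypothetical, since the report does not state how "significant similarity" was judged.

```python
from typing import List, NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


# The three downstream hits listed in the BLAT results above.
DOWNSTREAM_HITS = [
    BlatHit(2000, 1, 2000, 2000, 100.0, "chrX", "-", 141719551, 141721550, 2000),
    BlatHit(30, 734, 801, 2000, 94.2, "chr4", "-", 152172686, 152172754, 69),
    BlatHit(23, 1972, 1996, 2000, 87.5, "chr16", "+", 76995915, 76995938, 24),
]


def secondary_hits(hits: List[BlatHit]) -> List[BlatHit]:
    """Hits other than the full-length match of the query to its own locus."""
    return [h for h in hits if h.score < h.q_size]


def flagged_hits(hits: List[BlatHit], min_fraction: float = 0.5) -> List[BlatHit]:
    """Secondary hits scoring at least min_fraction of the query size.

    min_fraction is an illustrative cutoff, not the report's criterion.
    """
    return [h for h in secondary_hits(hits) if h.score >= min_fraction * h.q_size]


print(len(secondary_hits(DOWNSTREAM_HITS)), flagged_hits(DOWNSTREAM_HITS))
```

With this cutoff none of the secondary hits is flagged, in line with the note that no significant similarity is found.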


Gene and information: Irs4, insulin receptor substrate 4 [Mus musculus (house mouse)]. Gene ID: 16370, updated on 12-Aug-2019

Gene summary

Official Symbol: Irs4 (provided by MGI)
Official Full Name: insulin receptor substrate 4 (provided by MGI)
Primary source: MGI:MGI:1338009
See related: Ensembl:ENSMUSG00000054667
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: IRS-4
Expression: Biased expression in CNS E18 (RPKM 1.0), testis adult (RPKM 0.8) and 8 other tissues

Genomic context

Location: X F2; X 62.43 cM
Exon count: 2

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  X    NC_000086.7 (141710998..141725217, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    X    NC_000086.6 (138145541..138159760, complement)

[Figure: genomic context of Irs4 on Chromosome X - NC_000086.7.]


Transcript information: This gene has 1 transcript

Gene: Irs4 ENSMUSG00000054667

Description: insulin receptor substrate 4 [Source:MGI Symbol;Acc:MGI:1338009]
Gene Synonyms: IRS-4
Location: Chromosome X: 141,710,998-141,725,263 reverse strand. GRCm38:CM001013.2
About this gene: This gene has 1 transcript (splice variant), 174 orthologues, 3 paralogues, is a member of 1 Ensembl protein family and is associated with 9 phenotypes.

Transcripts

Name      Transcript ID         bp    Protein  Translation ID        Biotype         CCDS       UniProt  Flags
Irs4-201  ENSMUST00000067841.7  6294  1216aa   ENSMUSP00000067085.7  Protein coding  CCDS30445  Q9Z0Y7   TSL:1, GENCODE basic, APPRIS P1

[Figure: Ensembl genome browser view of a 34.27 kb region of Chromosome X (141.71 Mb to 141.73 Mb). Tracks: Gm15295-201 (lncRNA, forward strand); contigs BX571779.6 and AL671983.18; Irs4-201 (protein coding, reverse strand); Regulatory Build. Regulation legend: open chromatin, promoter, promoter flank. Gene legend: protein coding (merged Ensembl/Havana); non-protein coding (RNA gene).]


Transcript: ENSMUST00000067841

[Figure: Ensembl protein summary for Irs4-201 (protein coding; reverse strand, 14.27 kb; protein ENSMUSP00000067..., scale 0 to 1216 aa). Annotation tracks: MobiDB lite; Low complexity (Seg); Superfamily SSF50729; SMART SM01244; Prints; Pfam; PROSITE profiles; PANTHER PTHR10614:SF2; Gene3D; CDD cd01257. Annotated features: Pleckstrin homology domain; IRS-type PTB domain; Insulin receptor substrate; PH-like domain superfamily. Sequence variants (dbSNP and all other sources), variant legend: inframe insertion, missense variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
