https://www.alphaknockout.com

Mouse Gipc1 Knockout Project (CRISPR/Cas9)

Objective: To create a Gipc1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Gipc1 gene (NCBI Reference Sequence: NM_018771; Ensembl: ENSMUSG00000019433) is located on mouse chromosome 8. Seven exons have been identified, with the ATG start codon in exon 2 and the TAA stop codon in exon 7 (Transcript: ENSMUST00000019577). Exons 2~7 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a gene-trapped allele display reduced body and heart weight, selective arteriogenesis and arterial endothelial cell defects, and impaired cardiac performance and wound healing. Mice homozygous for a knock-out allele exhibit low molecular weight proteinuria.

Exon 2 starts from about 0.1% of the coding region. Exons 2~7 cover 100.0% of the coding region. The size of the effective KO region is ~3274 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

Figure: Wildtype allele (5' to 3') showing exons 1 to 7 of mouse Gipc1, with two gRNA regions marked. Legend: exon of mouse Gipc1; knockout region.


Overview of the Dot Plot (up)

Window size: 15 bp

Figure: Dot plot of the upstream sequence against itself (axis label: Sequence 12). Legend: Forward; Reverse Complement.

Note: The 2000 bp section upstream of start codon is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the Dot Plot (down)

Window size: 15 bp

Figure: Dot plot of the downstream sequence against itself (axis label: Sequence 12). Legend: Forward; Reverse Complement.

Note: The 2000 bp section downstream of stop codon is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
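The self-alignment described in the two dot plot notes can be illustrated with a minimal sketch. It assumes an exact-match comparison of fixed-size windows (15 bp by default, matching the stated window size); the tool actually used for the figures is not named in this report, so the function names and the toy sequence below are illustrative only. Tandem repeats appear as forward matches away from the main diagonal, at offsets equal to multiples of the repeat unit length.

```python
from collections import defaultdict

# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq, window=15):
    """Compare a sequence with itself using exact window matches.

    Returns two sorted lists of (i, j) start positions (0-based):
    forward matches with i < j (the trivial main diagonal is skipped),
    and matches between a window at i and the reverse complement of
    the window at j.
    """
    seq = seq.upper()
    n = len(seq)

    # Index every window by its sequence.
    index = defaultdict(list)
    for i in range(n - window + 1):
        index[seq[i:i + window]].append(i)

    forward = sorted(
        (i, j)
        for positions in index.values()
        for i in positions
        for j in positions
        if i < j
    )

    rc = reverse_complement(seq)
    reverse = []
    for k in range(n - window + 1):
        # The window at position k of the reverse complement covers
        # the window starting at n - window - k of the original.
        j = n - window - k
        for i in index.get(rc[k:k + window], ()):
            reverse.append((i, j))

    return forward, sorted(reverse)


# Hypothetical toy input: a 7 bp unit repeated three times in tandem.
toy = "TTGACCA" * 3
forward, reverse = self_dot_plot(toy, window=7)
offsets = sorted({j - i for i, j in forward})
print(offsets)  # offsets are multiples of the 7 bp repeat unit
```

For a real 2000 bp flank, the forward list would be plotted as dots at (i, j); a region with no off-diagonal lines, as reported for the downstream section, is the case where the forward list is empty or nearly so.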


Overview of the GC Content Distribution (up)

Window size: 300 bp

Figure: GC content distribution along the upstream sequence (axis label: Sequence 12).

Summary: Full Length(2000bp) | A(23.3% 466) | C(24.3% 486) | T(26.4% 528) | G(26.0% 520)

Note: The 2000 bp section upstream of start codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

Figure: GC content distribution along the downstream sequence (axis label: Sequence 12).

Summary: Full Length(2000bp) | A(21.1% 422) | C(24.8% 496) | T(25.0% 500) | G(29.1% 582)

Note: The 2000 bp section downstream of stop codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
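As a worked check of the two summaries, the overall GC content follows directly from the reported base counts. The sketch below also includes a sliding-window helper of the kind implied by the 300 bp window size; the function names and the step parameter are assumptions made for illustration, not the method used to draw the figures.

```python
def gc_percent_from_counts(a, c, t, g):
    """Overall GC content (%) from per-base counts."""
    total = a + c + t + g
    return 100.0 * (g + c) / total


def sliding_gc(seq, window=300, step=1):
    """GC content (%) of each window; returns (start, percent) pairs."""
    seq = seq.upper()
    result = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        result.append((start, 100.0 * gc / window))
    return result


# Base counts taken from the two Summary lines of this report.
upstream = gc_percent_from_counts(a=466, c=486, t=528, g=520)
downstream = gc_percent_from_counts(a=422, c=496, t=500, g=582)

print(f"upstream GC:   {upstream:.1f}%")    # 50.3%
print(f"downstream GC: {downstream:.1f}%")  # 53.9%
```

Both flanks therefore sit close to 50% GC overall, which is consistent with the notes that no significant high GC-content region is found; the windowed profile would still be needed to rule out a local GC-rich stretch.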


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr8   +       83658994   83660993   2000
YourSeq  160    672    931   2000   96.0%     chrX   -       160432846  160433146  301
YourSeq  158    546    827   2000   94.9%     chr19  +       43685045   43685491   447
YourSeq  154    546    827   2000   93.8%     chr4   +       108473887  108474264  378
YourSeq  153    664    841   2000   94.8%     chr9   -       14405665   14405846   182
YourSeq  153    686    928   2000   95.4%     chr1   -       63106719   63107316   598
YourSeq  150    572    823   2000   89.7%     chr10  +       88354129   88354361   233
YourSeq  148    672    827   2000   97.5%     chr4   +       98918636   98918791   156
YourSeq  148    546    819   2000   94.6%     chr15  +       28200549   28200847   299
YourSeq  147    664    827   2000   95.7%     chr3   -       119975257  119975424  168
YourSeq  147    664    827   2000   95.7%     chr1   -       20686926   20687095   170
YourSeq  146    664    827   2000   95.0%     chr9   -       115289921  115290083  163
YourSeq  146    290    827   2000   82.3%     chr6   -       120354979  120355162  184
YourSeq  146    662    827   2000   94.6%     chr4   -       136085300  136085470  171
YourSeq  146    546    1009  2000   82.7%     chr12  +       111566336  111566618  283
YourSeq  145    672    827   2000   96.8%     chr4   +       92883209   92883366   158
YourSeq  144    672    829   2000   94.3%     chr15  -       43896422   43896578   157
YourSeq  144    672    839   2000   94.0%     chr15  -       39085597   39085776   180
YourSeq  144    365    824   2000   84.9%     chr12  +       24813471   24813629   159
YourSeq  143    664    827   2000   93.2%     chr19  +       29214709   29214871   163

Note: The 2000 bp section upstream of start codon is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr8   +       83664268   83666267   2000
YourSeq  108    384    533   2000   90.9%     chr11  +       80076118   80122134   46017
YourSeq  102    375    515   2000   85.0%     chr17  +       31213242   31213375   134
YourSeq  101    386    523   2000   88.6%     chr18  -       36470223   36470361   139
YourSeq  100    392    529   2000   89.6%     chr3   -       19873767   19873904   138
YourSeq  100    384    510   2000   93.1%     chr17  -       29390786   29390913   128
YourSeq  100    412    548   2000   93.1%     chr14  -       29757101   29757488   388
YourSeq  96     375    523   2000   84.5%     chr8   -       110916941  110917083  143
YourSeq  96     4      521   2000   82.3%     chr11  +       4255939    4256371    433
YourSeq  95     384    506   2000   90.0%     chr1   +       74827161   74827287   127
YourSeq  94     376    523   2000   81.0%     chr14  +       55439911   55440041   131
YourSeq  93     375    522   2000   79.9%     chr4   -       129127067  129127200  134
YourSeq  90     414    524   2000   88.1%     chr19  -       3711960    3712068    109
YourSeq  89     406    524   2000   88.1%     chr3   +       90460310   90460823   514
YourSeq  89     412    523   2000   88.0%     chr18  +       7349431    7349540    110
YourSeq  89     412    529   2000   93.3%     chr11  +       98843606   98844007   402
YourSeq  87     412    523   2000   85.1%     chr4   +       126347475  126347581  107
YourSeq  87     414    523   2000   87.1%     chr12  +       55328592   55328699   108
YourSeq  86     392    523   2000   94.8%     chr14  -       49014948   49015079   132
YourSeq  86     374    514   2000   82.0%     chr1   -       74619867   74619996   130

Note: The 2000 bp section downstream of stop codon is BLAT searched against the genome. No significant similarity is found.
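The top hit in each BLAT table is the query mapping back to its own location on chr8, and those coordinates can be used to cross-check the stated KO region size. The sketch below assumes that the coordinates are 1-based and inclusive, that the upstream flank ends immediately before the start codon, and that the downstream flank begins immediately after the stop codon; the variable and function names are illustrative.

```python
# Self-hit coordinates on chr8 from the two BLAT tables in this report.
UP_START, UP_END = 83658994, 83660993      # 2000 bp upstream flank
DOWN_START, DOWN_END = 83664268, 83666267  # 2000 bp downstream flank


def span(start, end):
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


# Assumption: the region between the two flanks runs from the first
# base of the start codon to the last base of the stop codon.
ko_start = UP_END + 1
ko_end = DOWN_START - 1
ko_size = span(ko_start, ko_end)

print(span(UP_START, UP_END))      # 2000
print(span(DOWN_START, DOWN_END))  # 2000
print(ko_start, ko_end, ko_size)   # 83660994 83664267 3274
```

Under these assumptions the interval between the flanks is 3274 bp, which agrees with the effective KO region size of ~3274 bp given in the strategy summary.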


Gene and information:

Gipc1 GIPC PDZ domain containing family, member 1 [ Mus musculus (house mouse) ]

Gene ID: 67903, updated on 10-Oct-2019

Gene summary

Official Symbol: Gipc1 (provided by MGI)
Official Full Name: GIPC PDZ domain containing family, member 1 (provided by MGI)
Primary source: MGI:MGI:1926252
See related: Ensembl:ENSMUSG00000019433
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: GIPC; TIP-2; TaxIP2; Semcap1; Glut1CIP; Rgs19ip1
Expression: Ubiquitous expression in stomach adult (RPKM 41.0), colon adult (RPKM 40.6) and 28 other tissues
Orthologs: human

Genomic context

Location: 8; 8 C2

Exon count: 8

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  8    NC_000074.6 (83649408..83664789)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    8    NC_000074.5 (86176577..86188688)

Chromosome 8 - NC_000074.6


Transcript information: This gene has 3 transcripts

Gene: Gipc1 ENSMUSG00000019433

Description: GIPC PDZ domain containing family, member 1 [Source:MGI Symbol;Acc:MGI:1926252]
Gene Synonyms: Glut1CIP, Rgs19ip1, Semcap1, TIP-2, TaxIP2, neurophilin1-IP, synectin
Location: Chromosome 8: 83,652,677-83,664,694 forward strand. GRCm38:CM001001.2
About this gene: This gene has 3 transcripts (splice variants), 176 orthologues, 2 paralogues, is a member of 1 Ensembl protein family and is associated with 12 phenotypes.

Transcripts

Name       Transcript ID         bp    Protein  Translation ID        Biotype         CCDS       UniProt     Flags
Gipc1-201  ENSMUST00000019577.9  1522  333aa    ENSMUSP00000019577.8  Protein coding  CCDS22457  Q9Z0G0      TSL:1 GENCODE basic APPRIS P1
Gipc1-203  ENSMUST00000212463.1  476   65aa     ENSMUSP00000148824.1  Protein coding  -          A0A1D5RML2  CDS 3' incomplete TSL:5
Gipc1-202  ENSMUST00000211985.1  370   105aa    ENSMUSP00000148847.1  Protein coding  -          A0A1D5RMN2  CDS 3' incomplete TSL:3


Figure: Ensembl region view, 32.02 kb (83.65 Mb to 83.67 Mb).

Forward strand (Comprehensive set): Gipc1-201 (protein coding), Gipc1-202 (protein coding), Gipc1-203 (protein coding), Ptger1-201 (protein coding), Ptger1-202 (retained intron), Ptger1-203 (lncRNA).

Contigs: AC151992.3, AC164432.3.

Reverse strand (Comprehensive set): Pkn1-201 (protein coding), Pkn1-204 (nonsense mediated decay), Pkn1-208 (protein coding), Pkn1-203 (retained intron), Pkn1-202 (retained intron), Pkn1-207 (lncRNA), Pkn1-209 (lncRNA).

Regulatory Build legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank.

Gene legend: Protein Coding (Ensembl protein coding; merged Ensembl/Havana); Non-Protein Coding (RNA gene; processed transcript).


Transcript: ENSMUST00000019577

Figure: Transcript view of Gipc1-201 (protein coding), 12.02 kb, forward strand, with protein feature tracks for ENSMUSP00000019... along a scale bar from 0 to 333.

Protein feature tracks: PDB-ENSP mappings; MobiDB lite; Low complexity (Seg); Superfamily (PDZ superfamily); SMART (PDZ domain); Pfam (PDZ domain); PROSITE profiles (PDZ domain); PIRSF (PDZ domain-containing protein GIPC1/2/3); PANTHER (PTHR12259:SF4; PDZ domain-containing protein GIPC1/2/3); Gene3D (2.30.42.10); CDD (cd00992).

Sequence variants (dbSNP and all other sources). Variant legend: missense variant; synonymous variant.

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
