https://www.alphaknockout.com

Mouse Gipc1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Gipc1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Gipc1 gene (NCBI Reference Sequence: NM_018771; Ensembl: ENSMUSG00000019433) is located on mouse chromosome 8. Seven exons are identified, with the ATG start codon in exon 2 and the TAA stop codon in exon 7 (Transcript: ENSMUST00000019577). Exon 2 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Gipc1 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-455G23 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a gene-trapped allele display reduced body and heart weight, selective arteriogenesis and arterial endothelial cell defects, and impaired cardiac performance and wound healing. Mice homozygous for a knock-out allele exhibit low molecular weight proteinuria.

Exon 2 starts from about 100% of the coding region. Knockout of exon 2 will result in a frameshift of the gene. The size of intron 1, for 5'-loxP site insertion, is 8221 bp, and the size of intron 2, for 3'-loxP site insertion, is 674 bp. The size of the effective cKO region is ~788 bp. The cKO region does not contain any other known gene.


Overview of the Targeting Strategy

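[Figure: schematic of the targeting strategy, showing four alleles in order: the wildtype allele (with the 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Gipc1; homology arm; cKO region; loxP site.]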


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
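The self-alignment idea behind a dot plot can be illustrated with a minimal sketch. This is not the tool used to produce the plot in this report; the function name `self_dot_plot` and the toy sequence are hypothetical, and only forward-strand matches are considered (the reverse-complement comparison is omitted). The window of 10 bp matches the window size stated above.

```python
def self_dot_plot(seq, window=10):
    """Return sorted (i, j) pairs, i < j, where the window-length words
    starting at positions i and j (0-based) are identical.

    Off-diagonal hits of this kind are what appear as extra lines in a
    self dot plot and indicate repeated sequence.
    """
    positions = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)

    hits = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


# Hypothetical toy sequence: a 10 bp word repeated after a 5 bp spacer.
toy = "ACGTACGTAC" + "TTGCA" + "ACGTACGTAC"
print(self_dot_plot(toy, window=10))
```

A sequence without repeats returns an empty list, which corresponds to a dot plot showing only the main diagonal.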

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(7288bp) | A(23.23% 1693) | C(26.19% 1909) | T(23.41% 1706) | G(27.17% 1980)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found, so this targeting vector may be difficult to construct.
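As a quick check of the summary line, the overall GC content can be recomputed from the reported base counts. The sliding-window helper below is only a sketch of how a 300 bp window scan could be done; `windowed_gc` is a hypothetical name, and the actual sequence is not included in this report, so the helper is shown without real input.

```python
# Base counts taken from the summary line above.
counts = {"A": 1693, "C": 1909, "T": 1706, "G": 1980}

total = sum(counts.values())
gc_percent = 100 * (counts["G"] + counts["C"]) / total
print(total, round(gc_percent, 2))


def windowed_gc(seq, window=300, step=1):
    """Return the GC percentage of each window of the given size.

    Hypothetical helper: windows start every `step` bases and only
    full-length windows are reported.
    """
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        values.append(100 * gc / window)
    return values
```

The counts sum to the stated full length of 7288 bp, and the overall GC content works out to roughly 53%. The note above refers to local high-GC regions within that average.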


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr8   +       83657744   83660743   3000
YourSeq  160    1922    2181  3000   96.0%     chrX   -       160432846  160433146  301
YourSeq  158    1796    2077  3000   94.9%     chr19  +       43685045   43685491   447
YourSeq  153    1914    2091  3000   94.8%     chr9   -       14405665   14405846   182
YourSeq  153    1936    2178  3000   95.4%     chr1   -       63106719   63107316   598
YourSeq  150    1822    2073  3000   89.7%     chr10  +       88354129   88354361   233
YourSeq  148    1922    2077  3000   97.5%     chr4   +       98918636   98918791   156
YourSeq  148    1796    2069  3000   94.6%     chr15  +       28200549   28200847   299
YourSeq  147    1914    2077  3000   95.7%     chr3   -       119975257  119975424  168
YourSeq  147    1914    2077  3000   95.7%     chr1   -       20686926   20687095   170
YourSeq  146    1914    2077  3000   95.0%     chr9   -       115289921  115290083  163
YourSeq  146    1540    2077  3000   82.3%     chr6   -       120354979  120355162  184
YourSeq  146    1912    2077  3000   94.6%     chr4   -       136085300  136085470  171
YourSeq  146    1796    2259  3000   82.7%     chr12  +       111566336  111566618  283
YourSeq  145    1922    2077  3000   96.8%     chr4   +       92883209   92883366   158
YourSeq  144    1922    2079  3000   94.3%     chr15  -       43896422   43896578   157
YourSeq  144    1922    2089  3000   94.0%     chr15  -       39085597   39085776   180
YourSeq  144    1615    2074  3000   84.9%     chr12  +       24813471   24813629   159
YourSeq  143    1914    2077  3000   93.2%     chr19  +       29214709   29214871   163
YourSeq  142    1923    2077  3000   96.2%     chr6   -       143021051  143021209  159

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
------------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr8   +       83661532   83664531   3000
YourSeq  65     430     516   3000   87.4%     chr10  -       81341584   81341670   87
YourSeq  60     2046    2154  3000   82.5%     chr10  +       126870446  126870551  106
YourSeq  57     2078    2205  3000   90.0%     chr10  +       84607955   84608084   130
YourSeq  52     2092    2190  3000   92.0%     chr14  +       20761984   20969836   207853
YourSeq  49     2042    2124  3000   82.7%     chr7   +       19346978   19347401   424
YourSeq  48     2122    2189  3000   91.4%     chr7   +       133592100  133592169  70
YourSeq  48     2046    2226  3000   90.2%     chr5   +       37636910   37637101   192
YourSeq  46     2045    2114  3000   82.9%     chr12  -       82540402   82540471   70
YourSeq  46     2090    2165  3000   87.8%     chr10  +       103449631  103449705  75
YourSeq  44     2039    2100  3000   85.5%     chr12  -       69498473   69498534   62
YourSeq  43     2046    2246  3000   93.9%     chr5   -       111685174  111685392  219
YourSeq  43     2039    2189  3000   75.3%     chr1   +       23756415   23756566   152
YourSeq  42     2050    2185  3000   93.8%     chr12  -       19457600   19457738   139
YourSeq  42     2046    2176  3000   93.8%     chr16  +       25214745   25214877   133
YourSeq  41     2038    2204  3000   75.0%     chr11  -       20209868   20210024   157
YourSeq  41     2052    2203  3000   95.6%     chr10  -       29647672   29648252   581
YourSeq  41     2050    2185  3000   93.7%     chr12  +       21816915   21817055   141
YourSeq  40     2090    2154  3000   93.5%     chrX   -       12931471   12931536   66
YourSeq  40     2048    2337  3000   55.3%     chr17  +       26232921   26233033   113

Note: The 3000 bp section downstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
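The top hit in each BLAT table places the query back on chr8, and the gap between the two hits can be compared with the stated cKO region size. The sketch below assumes that the target coordinates are 1-based and inclusive, which is consistent with the reported span of 3000 for each top hit; the variable and function names are illustrative only.

```python
# Top-hit target coordinates on chr8 from the two BLAT tables
# (assumed 1-based, inclusive).
upstream = (83657744, 83660743)    # 3000 bp section upstream of exon 2
downstream = (83661532, 83664531)  # 3000 bp section downstream of exon 2


def span(start, end):
    """Length of an inclusive coordinate interval."""
    return end - start + 1


# The region lying between the two flanking sections.
cko_start = upstream[1] + 1
cko_end = downstream[0] - 1

print(span(*upstream), span(*downstream))
print(cko_start, cko_end, span(cko_start, cko_end))
```

Under that assumption the interval between the two flanks is 788 bp, in agreement with the effective cKO region size of ~788 bp given in the strategy summary.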


Gene and information:

Gipc1 GIPC PDZ domain containing family, member 1 [ Mus musculus (house mouse) ]
Gene ID: 67903, updated on 10-Oct-2019

Gene summary

Official Symbol: Gipc1 (provided by MGI)
Official Full Name: GIPC PDZ domain containing family, member 1 (provided by MGI)
Primary source: MGI:MGI:1926252
See related: Ensembl:ENSMUSG00000019433
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: GIPC; TIP-2; TaxIP2; Semcap1; Glut1CIP; Rgs19ip1
Expression: Ubiquitous expression in stomach adult (RPKM 41.0), colon adult (RPKM 40.6) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 8; 8 C2

Exon count: 8

Annotation release  Status             Assembly                     Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26) 8    NC_000074.6 (83649408..83664789)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)   8    NC_000074.5 (86176577..86188688)

Chromosome 8 - NC_000074.6


Transcript information: This gene has 3 transcripts

Gene: Gipc1 ENSMUSG00000019433

Description: GIPC PDZ domain containing family, member 1 [Source:MGI Symbol;Acc:MGI:1926252]
Gene Synonyms: Glut1CIP, Rgs19ip1, Semcap1, TIP-2, TaxIP2, neurophilin1-IP, synectin
Location: Chromosome 8: 83,652,677-83,664,694 forward strand. GRCm38:CM001001.2
About this gene: This gene has 3 transcripts (splice variants), 176 orthologues, 2 paralogues, is a member of 1 Ensembl protein family and is associated with 12 phenotypes.

Transcripts

Name       Transcript ID         bp    Protein  Translation ID        Biotype         CCDS       UniProt     Flags
Gipc1-201  ENSMUST00000019577.9  1522  333aa    ENSMUSP00000019577.8  Protein coding  CCDS22457  Q9Z0G0      TSL:1, GENCODE basic, APPRIS P1
Gipc1-203  ENSMUST00000212463.1  476   65aa     ENSMUSP00000148824.1  Protein coding  -          A0A1D5RML2  CDS 3' incomplete, TSL:5
Gipc1-202  ENSMUST00000211985.1  370   105aa    ENSMUSP00000148847.1  Protein coding  -          A0A1D5RMN2  CDS 3' incomplete, TSL:3


[Figure: Ensembl genomic view of a 32.02 kb region of chromosome 8 (83.65 Mb to 83.67 Mb). Forward strand: Gipc1-201, Gipc1-202 and Gipc1-203 (protein coding); Ptger1-201 (protein coding), Ptger1-202 (retained intron) and Ptger1-203 (lncRNA). Contigs: AC151992.3, AC164432.3. Reverse strand: Pkn1-201 and Pkn1-208 (protein coding), Pkn1-204 (nonsense mediated decay), Pkn1-203 and Pkn1-202 (retained intron), Pkn1-207 and Pkn1-209 (lncRNA). Regulatory Build legend: CTCF, enhancer, open chromatin, promoter, promoter flank. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana) and non-protein coding (RNA gene; processed transcript).]


Transcript: ENSMUST00000019577

[Figure: Ensembl protein view of Gipc1-201 (protein coding; 12.02 kb, forward strand), translation ENSMUSP00000019..., with a scale bar from 0 to 333 residues. Annotation tracks: PDB-ENSP mappings; MobiDB lite; low complexity (Seg); Superfamily: PDZ superfamily; SMART: PDZ domain; Pfam: PDZ domain; PROSITE profiles: PDZ domain; PIRSF: PDZ domain-containing protein GIPC1/2/3; PANTHER: PTHR12259:SF4, PDZ domain-containing protein GIPC1/2/3; Gene3D: 2.30.42.10; CDD: cd00992. Sequence variants (dbSNP and all other sources) are shown, with missense and synonymous variants in the legend.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
