https://www.alphaknockout.com

Mouse Pign Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Pign conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Pign gene (NCBI Reference Sequence: NM_013784; Ensembl: ENSMUSG00000056536) is located on mouse chromosome 1. 31 exons are identified, with the ATG start codon in exon 4 and the TGA stop codon in exon 31 (Transcript: ENSMUST00000186485). Exon 8 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Pign gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-213N12 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for an ENU-induced allele exhibit abnormal gastrulation, forebrain hypoplasia, coloboma, and microphthalmia.

Exon 8 starts at about 19.69% of the coding region. Knockout of exon 8 will result in a frameshift of the gene. Intron 7, used for 5'-loxP site insertion, is 3696 bp; intron 8, used for 3'-loxP site insertion, is 1043 bp. The effective cKO region is ~625 bp. The cKO region does not contain any other known gene.
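The frameshift reasoning above can be sketched in a few lines. This is an illustrative sketch only: the 931 aa protein length is taken from the transcript table in this report, the assumption that the "coding region" includes the stop codon is ours, and the exon length passed to `causes_frameshift` is a hypothetical value because the report does not state the length of exon 8.

```python
def cds_length_nt(protein_aa: int, include_stop: bool = True) -> int:
    """Coding-sequence length in nucleotides for a protein of the given length."""
    return protein_aa * 3 + (3 if include_stop else 0)


def approx_cds_offset(fraction: float, cds_nt: int) -> int:
    """Approximate nucleotide offset into the CDS for a fractional position."""
    return round(fraction * cds_nt)


def causes_frameshift(exon_len_nt: int) -> bool:
    """Deleting an exon shifts the reading frame unless its length is a multiple of 3."""
    return exon_len_nt % 3 != 0


if __name__ == "__main__":
    cds = cds_length_nt(931)                 # 931 aa from the transcript table
    offset = approx_cds_offset(0.1969, cds)  # exon 8 starts at ~19.69% of the CDS
    print(f"CDS length: {cds} nt; exon 8 starts near CDS position {offset}")
    # HYPOTHETICAL exon length, for illustration only (not from the report)
    print("Frameshift for a 100 nt exon:", causes_frameshift(100))
```

An exon whose length is not divisible by 3 leaves the downstream codons out of frame after deletion, which is the property the strategy relies on.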


Overview of the Targeting Strategy

[Figure: schematic of four alleles. Wildtype allele, with the 5' and 3' gRNA regions flanking exon 8 (exons 1, 8, 9 and 31 are labeled); targeting vector; targeted allele; constitutive KO allele (after Cre recombination). Legend: exon of mouse Pign; homology arm; cKO region; loxP site.]
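The relationship between the targeted allele and the constitutive KO allele can be illustrated with a toy model. This is a minimal sketch, assuming both loxP sites are in the same orientation (so Cre excises the intervening segment and leaves a single loxP site); the feature names in the list are illustrative labels, not sequence data.

```python
from typing import List

LOXP = "loxP"


def cre_recombine(allele: List[str]) -> List[str]:
    """Excise everything between the first and last loxP site, leaving one loxP.

    Assumes directly repeated (same-orientation) loxP sites.
    """
    sites = [i for i, feature in enumerate(allele) if feature == LOXP]
    if len(sites) < 2:
        return list(allele)  # nothing to recombine
    first, last = sites[0], sites[-1]
    return allele[:first] + [LOXP] + allele[last + 1:]


if __name__ == "__main__":
    targeted = ["exon7", LOXP, "exon8", LOXP, "exon9"]
    print(cre_recombine(targeted))
```

In this model the floxed exon 8 is removed only when Cre is present, which is what makes the knockout conditional.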


Overview of the Dot Plot

[Figure: dot plot of the sequence against itself, window size 10 bp. Forward and reverse-complement matches are shown; both axes are labeled "Sequence 12".]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
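A self dot plot of this kind can be approximated by indexing every fixed-length word in the sequence and reporting pairs of positions that share a word, either directly or as a reverse complement. The sketch below is an assumption about how such a plot could be computed, not the tool used for this report; only the 10 bp window size comes from the text.

```python
from collections import defaultdict
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10) -> Tuple[List[Tuple[int, int]], List[Tuple[int, int]]]:
    """Return (forward, reverse) match coordinates for a sequence against itself.

    A forward dot (i, j), i < j, means the word at i recurs at j.
    A reverse dot (i, j), i <= j, means the word at j is the reverse
    complement of the word at i.
    """
    seq = seq.upper()
    positions = defaultdict(list)
    last_start = len(seq) - window
    for i in range(last_start + 1):
        positions[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(last_start + 1):
        word = seq[i:i + window]
        forward.extend((i, j) for j in positions.get(word, []) if j > i)
        rc = reverse_complement(word)
        reverse.extend((i, j) for j in positions.get(rc, []) if j >= i)
    return forward, reverse


if __name__ == "__main__":
    demo = "ACGTTGCAAT" + "GGG" + "ACGTTGCAAT"  # toy sequence with one direct repeat
    fwd, rev = self_dot_plot(demo, window=10)
    print("forward dots:", fwd)
    print("reverse dots:", rev)
```

Long diagonal runs of forward dots away from the main diagonal would indicate repeats; an essentially empty matrix corresponds to the "no significant tandem repeat" conclusion.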

Overview of the GC Content Distribution

[Figure: GC content distribution along Sequence 12, window size 300 bp.]

Summary: Full Length(7125bp) | A(27.97% 1993) | C(15.45% 1101) | T(37.26% 2655) | G(19.31% 1376)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
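The overall GC content follows directly from the base counts in the summary line, and the windowed profile can be computed with a simple sliding window. The sketch below uses the reported counts; the function names and the `step` parameter are illustrative assumptions, and the 300 bp default mirrors the window size stated above.

```python
from typing import Dict, List

# Base counts as reported in the summary line (full length 7125 bp)
REPORTED_COUNTS: Dict[str, int] = {"A": 1993, "C": 1101, "T": 2655, "G": 1376}


def gc_fraction_from_counts(counts: Dict[str, int]) -> float:
    """Overall GC fraction from per-base counts."""
    total = sum(counts.values())
    return (counts["G"] + counts["C"]) / total


def windowed_gc(seq: str, window: int = 300, step: int = 1) -> List[float]:
    """GC fraction of each window of the given size along the sequence."""
    seq = seq.upper()
    result = []
    for i in range(0, len(seq) - window + 1, step):
        chunk = seq[i:i + window]
        result.append((chunk.count("G") + chunk.count("C")) / window)
    return result


if __name__ == "__main__":
    print(f"Total length: {sum(REPORTED_COUNTS.values())} bp")
    print(f"Overall GC content: {gc_fraction_from_counts(REPORTED_COUNTS):.2%}")
```

With the reported counts the overall GC content comes out at roughly 34.8%, consistent with the note that no high GC-content region is present.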


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr1   -       105649658  105652657  3000
YourSeq  472    256    1489  3000   86.8%     chr8   +       76979618   76983172   3555
YourSeq  335    792    1510  3000   85.8%     chr3   -       135188375  135189156  782
YourSeq  330    741    1473  3000   86.0%     chr5   -       72836550   72837340   791
YourSeq  325    524    1499  3000   86.2%     chr17  -       52289376   52290388   1013
YourSeq  314    741    1492  3000   87.9%     chr5   -       30081026   30081811   786
YourSeq  314    518    1467  3000   85.1%     chr2   -       149699856  149700806  951
YourSeq  302    575    1354  3000   86.4%     chr2   -       12565104   12565905   802
YourSeq  300    504    1488  3000   87.2%     chr1   +       107758331  107759364  1034
YourSeq  287    789    1638  3000   88.4%     chr6   +       40316876   40317753   878
YourSeq  286    518    1438  3000   87.9%     chr14  -       119931901  119932878  978
YourSeq  283    524    1035  3000   88.7%     chr12  -       72162560   72163078   519
YourSeq  265    511    1443  3000   87.6%     chr13  +       28137574   28138629   1056
YourSeq  260    558    1433  3000   83.8%     chr6   +       16357951   16358814   864
YourSeq  259    518    1486  3000   85.3%     chr18  +       25974863   25975855   993
YourSeq  255    741    1503  3000   86.5%     chr1   -       53954399   53955229   831
YourSeq  253    524    1180  3000   82.5%     chr14  -       115491663  115492302  640
YourSeq  253    898    1447  3000   87.0%     chr9   +       13860179   13860746   568
YourSeq  253    536    1055  3000   85.0%     chr8   +       85243735   85244241   507
YourSeq  250    575    1338  3000   81.4%     chr6   +       63223892   63224670   779

Note: The 3000 bp section upstream of Exon 8 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr1   -       105646033  105649032  3000
YourSeq  32     1052   1097  3000   90.0%     chr13  +       60140448   60140593   146
YourSeq  23     1855   1878  3000   100.0%    chr15  +       100971366  100971391  26

Note: The 3000 bp section downstream of Exon 8 is BLAT searched against the genome. No significant similarity is found.
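To compare hits more easily, each BLAT result row can be parsed into a record and summarized by how much of the 3000 bp query it covers. This is a sketch under stated assumptions: the row format is the whitespace-separated layout QUERY SCORE START END QSIZE IDENTITY CHROM STRAND START END SPAN shown above, and the `BlatHit` class and helper names are hypothetical, not part of any BLAT tooling.

```python
from dataclasses import dataclass


@dataclass
class BlatHit:
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one whitespace-separated BLAT result row."""
    f = line.split()
    return BlatHit(
        query=f[0], score=int(f[1]), q_start=int(f[2]), q_end=int(f[3]),
        q_size=int(f[4]), identity=float(f[5].rstrip("%")), chrom=f[6],
        strand=f[7], t_start=int(f[8]), t_end=int(f[9]), span=int(f[10]),
    )


def query_coverage(hit: BlatHit) -> float:
    """Fraction of the query covered by the aligned query interval."""
    return (hit.q_end - hit.q_start + 1) / hit.q_size


def is_self_hit(hit: BlatHit) -> bool:
    """True for the expected full-length, 100% identity match to the source locus."""
    return hit.identity == 100.0 and query_coverage(hit) == 1.0


if __name__ == "__main__":
    rows = [
        "YourSeq 3000 1 3000 3000 100.0% chr1 - 105649658 105652657 3000",
        "YourSeq 472 256 1489 3000 86.8% chr8 + 76979618 76983172 3555",
    ]
    for row in rows:
        hit = parse_hit(row)
        print(hit.chrom, f"coverage={query_coverage(hit):.1%}", "self" if is_self_hit(hit) else "other")
```

Separating the self hit from the remaining rows makes it straightforward to inspect the score, identity and coverage of the strongest secondary matches.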


Gene and information: Pign phosphatidylinositol glycan anchor biosynthesis, class N [ Mus musculus (house mouse) ] Gene ID: 27392, updated on 12-Aug-2019

Gene summary

Official Symbol: Pign (provided by MGI)
Official Full Name: phosphatidylinositol glycan anchor biosynthesis, class N (provided by MGI)
Primary source: MGI:MGI:1351629
See related: Ensembl:ENSMUSG00000056536
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: PIG-N; Gm20308
Expression: Ubiquitous expression in limb E14.5 (RPKM 3.0), testis adult (RPKM 2.6) and 28 other tissues
Orthologs: human

Genomic context

Location: 1; 1 E2.1

Exon count: 34

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  1    NC_000067.6 (105518421..105663741, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    1    NC_000067.5 (107417716..107560253, complement)

[Figure: genomic context of Pign on chromosome 1 (NC_000067.6).]


Transcript information: This gene has 12 transcripts

Gene: Pign ENSMUSG00000056536

Description: phosphatidylinositol glycan anchor biosynthesis, class N [Source:MGI Symbol;Acc:MGI:1351629]
Location: Chromosome 1: 105,518,422-105,663,677 reverse strand. GRCm38:CM000994.2
About this gene: This gene has 12 transcripts (splice variants), 212 orthologues, is a member of 1 Ensembl protein family and is associated with 4 phenotypes.

Transcripts

Name      Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt     Flags
Pign-205  ENSMUST00000186485.6   6759  931aa       ENSMUSP00000139638.1  Protein coding   CCDS15205  G3X9F1      TSL:1 GENCODE basic
Pign-201  ENSMUST00000070699.14  6528  931aa       ENSMUSP00000069969.8  Protein coding   CCDS15205  G3X9F1      TSL:5 GENCODE basic
Pign-206  ENSMUST00000187537.6   4006  826aa       ENSMUSP00000140020.1  Protein coding   -          A0A087WQ32  TSL:5 GENCODE basic APPRIS P5
Pign-210  ENSMUST00000190811.6   3015  798aa       ENSMUSP00000140844.1  Protein coding   -          A0A087WS03  TSL:1 GENCODE basic APPRIS ALT2
Pign-204  ENSMUST00000186195.1   408   74aa        ENSMUSP00000139490.1  Protein coding   -          A0A087WNT9  CDS 5' incomplete TSL:3
Pign-203  ENSMUST00000185983.6   2564  No protein  -                     Retained intron  -          -           TSL:1
Pign-212  ENSMUST00000191408.6   1362  No protein  -                     Retained intron  -          -           TSL:1
Pign-207  ENSMUST00000187909.1   996   No protein  -                     Retained intron  -          -           TSL:1
Pign-208  ENSMUST00000188858.1   532   No protein  -                     Retained intron  -          -           TSL:2
Pign-202  ENSMUST00000185209.1   362   No protein  -                     Retained intron  -          -           TSL:2
Pign-211  ENSMUST00000190945.6   4031  No protein  -                     lncRNA           -          -           TSL:5
Pign-209  ENSMUST00000190139.1   1967  No protein  -                     lncRNA           -          -           TSL:1


[Figure: Ensembl genome browser view of a 165.26 kb region of chromosome 1 (105.52 Mb to 105.66 Mb). Forward strand: Gm20302-201 and Gm8004-201 (processed pseudogenes), Gm22426-201 (miRNA), and Relch transcripts (Relch-201, Relch-202 and Relch-210, protein coding; Relch-205, nonsense mediated decay; Relch-212, retained intron; Relch-208 and Relch-211, lncRNA). Contigs: AC132082.3, AC166354.3. Reverse strand: Pign transcripts (Pign-201, Pign-205, Pign-206, Pign-210 and Pign-204, protein coding; Pign-202, Pign-203, Pign-207, Pign-208 and Pign-212, retained intron; Pign-209 and Pign-211, lncRNA), plus Gm28403-201 and Gm28404-201 (lncRNA). A Regulatory Build track (CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site) is shown below the genes.]


Transcript: ENSMUST00000186485

[Figure: protein domain view of Pign-205 (protein coding; reverse strand, 145.24 kb; scale 0 to 931 aa). Annotated features: transmembrane helices; low complexity (Seg); Superfamily: Alkaline-phosphatase-like, core domain superfamily; Pfam: Type I phosphodiesterase/nucleotide pyrophosphatase/phosphate transferase, and GPI ethanolamine phosphate transferase 1, C-terminal; PANTHER: GPI ethanolamine phosphate transferase 1; Gene3D: Alkaline-phosphatase-like, core domain superfamily; CDD: GPI ethanolamine phosphate transferase 1, N-terminal. Sequence variants (dbSNP and all other sources) are marked as missense, splice region, or synonymous variants.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
