https://www.alphaknockout.com

Mouse Qsox1 Knockout Project (CRISPR/Cas9)

Objective: To create a Qsox1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Qsox1 gene (NCBI Reference Sequence: NM_001024945; Ensembl: ENSMUSG00000033684) is located on mouse chromosome 1. Twelve exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 12 (Transcript: ENSMUST00000035325). Exons 2 to 4 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Homozygotes for an ENU-induced mutation show cardiovascular phenotypes including persistent truncus arteriosus, atrioventricular septal defects and vascular ring, as well as eye defects, short snout, micrognathia, cleft palate, tracheoesophageal fistula, polydactyly and spleen hypoplasia.

Exon 2 starts at about 12.25% of the coding region. Exons 2 to 4 cover 11.14% of the coding region. The effective KO region is ~9252 bp in size and contains no other known gene.
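The percentages above can be converted into approximate nucleotide counts. The sketch below assumes a coding length of 748 codons x 3 = 2244 nt, derived from the 748 aa protein listed for ENSMUST00000035325 in the transcript table of this report (stop codon excluded). That length is an assumption rather than a figure stated in the report, and the function name `percent_to_nt` is hypothetical.

```python
# Assumption: coding length derived from the 748 aa protein of
# ENSMUST00000035325 (stop codon excluded); not stated in this report.
CDS_LENGTH_NT = 748 * 3


def percent_to_nt(percent: float, cds_length: int = CDS_LENGTH_NT) -> int:
    """Convert a percentage of the coding region to a rounded nt count."""
    return round(percent / 100 * cds_length)


exon2_offset_nt = percent_to_nt(12.25)    # coding nt upstream of exon 2
deleted_coding_nt = percent_to_nt(11.14)  # coding nt within exons 2 to 4

# A remainder other than 0 would suggest the deletion shifts the reading frame.
frame_remainder = deleted_coding_nt % 3

print(exon2_offset_nt, deleted_coding_nt, frame_remainder)
```

Under the stated assumption, a non-zero remainder would be consistent with a frameshift after removal of exons 2 to 4; the exact exon boundaries should be checked against the transcript annotation before relying on this.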


Overview of the Targeting Strategy

[Figure: wild-type allele drawn 5' to 3', showing exons 1, 2, 3, 4 and 12 with two gRNA regions marked. Legend: exon of mouse Qsox1; knockout region.]


Overview of the Dot Plot (up) Window size: 15 bp

[Figure: self-alignment dot plot. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
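The self-alignment described in the note can be sketched as follows. This is a minimal illustration, not the tool used to produce the report: it records every pair of positions whose 15 bp windows are identical, covers forward-strand matches only, and uses a hypothetical function name and a toy sequence.

```python
def dot_plot_matches(seq: str, window: int = 15) -> list[tuple[int, int]]:
    """Return (i, j) pairs with i < j whose windows of `window` bp are identical.

    Forward-strand matches only; the trivial diagonal (i == j) is omitted.
    """
    seq = seq.upper()
    positions: dict[str, list[int]] = {}
    # Index every window by its sequence so identical windows share a key.
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    matches = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                matches.append((starts[a], starts[b]))
    return sorted(matches)


# Toy example (not from the report): a 20 bp unit repeated in tandem.
unit = "ACGTTGCAAGCTTAGGCTAA"
off_diagonal = dot_plot_matches(unit * 2, window=15)
print(off_diagonal)
```

A tandem repeat shows up as a run of matches at a constant offset (here 20), which corresponds to a line parallel to the main diagonal in the plot. A sequence without repeats returns an empty list.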

Overview of the Dot Plot (down) Window size: 15 bp

[Figure: self-alignment dot plot. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section downstream of Exon 4 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.


Overview of the GC Content Distribution (up) Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(2000bp) | A(27.25% 545) | C(21.1% 422) | T(24.85% 497) | G(26.8% 536)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down) Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(2000bp) | A(23.55% 471) | C(26.1% 522) | T(26.8% 536) | G(23.55% 471)

Note: The 2000 bp section downstream of Exon 4 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
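The two composition summaries can be turned into overall GC percentages, and the windowed analysis can be sketched in the same way. The code below is an illustration under assumptions: the function names are hypothetical, and the report does not say how its 300 bp windows are stepped, so the step size is left as a parameter.

```python
from collections import Counter


def base_composition(seq: str) -> dict[str, int]:
    """Count A, C, G and T in a sequence (other symbols are ignored)."""
    counts = Counter(seq.upper())
    return {base: counts.get(base, 0) for base in "ACGT"}


def gc_percent(counts: dict[str, int]) -> float:
    """Overall GC percentage from a base-count dictionary."""
    total = sum(counts.values())
    return 100 * (counts["G"] + counts["C"]) / total


def sliding_gc(seq: str, window: int = 300, step: int = 1) -> list[float]:
    """GC percentage of each window; the step size is an assumption."""
    seq = seq.upper()
    return [
        100 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts copied from the two Summary lines of this report.
upstream = {"A": 545, "C": 422, "T": 497, "G": 536}
downstream = {"A": 471, "C": 522, "T": 536, "G": 471}
print(round(gc_percent(upstream), 2), round(gc_percent(downstream), 2))
```

With the reported counts this gives an overall GC content of 47.9% upstream and 49.65% downstream, in line with the note that neither flank contains a markedly GC-rich region.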


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr1   -       155803798  155805797  2000
YourSeq  123    168    385   2000   88.3%     chr10  +       64022734   64336475   313742
YourSeq  120    175    387   2000   85.8%     chr6   -       136294842  136295056  215
YourSeq  118    163    387   2000   86.4%     chr4   +       99420308   99420532   225
YourSeq  115    212    387   2000   86.6%     chr9   -       107320065  107320242  178
YourSeq  114    194    387   2000   86.2%     chr1   -       88399608   88399800   193
YourSeq  113    173    339   2000   86.9%     chr8   -       85328917   85329081   165
YourSeq  113    174    380   2000   86.1%     chr3   +       103825915  103826120  206
YourSeq  111    223    387   2000   91.4%     chr2   -       116942831  116942997  167
YourSeq  109    161    404   2000   84.5%     chr6   +       47234569   47234810   242
YourSeq  109    222    404   2000   86.1%     chr2   +       148526124  148526307  184
YourSeq  107    202    380   2000   85.6%     chr6   -       144854152  144854330  179
YourSeq  107    174    404   2000   87.4%     chr13  -       49922242   49922472   231
YourSeq  107    175    385   2000   81.1%     chr10  -       69934176   69934383   208
YourSeq  107    173    387   2000   87.5%     chr1   -       89899600   89899814   215
YourSeq  107    175    399   2000   88.8%     chr4   +       123562674  123562897  224
YourSeq  101    232    403   2000   83.9%     chr9   -       63947562   63947734   173
YourSeq  101    175    397   2000   84.4%     chr16  +       24003748   24003964   217
YourSeq  101    219    387   2000   89.8%     chr11  +       62101628   62101797   170
YourSeq  99     170    338   2000   82.4%     chr11  -       95630615   95630781   167

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. Apart from the query's own locus on chr1 (100.0% identity over all 2000 bp), no significant similarity is found.
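To review hits like those listed above programmatically, each row can be parsed into a record and its query coverage computed. This is a sketch under assumptions: the class and function names are hypothetical, the field names `q_start`/`t_start` are introduced only to tell the two START/END column pairs apart, and the report does not state which threshold it uses for "significant" similarity, so no cutoff is applied here.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int   # START/END in the query (first pair of columns)
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int   # START/END in the genome (second pair of columns)
    t_end: int
    span: int


def parse_hit(row: str) -> BlatHit:
    """Parse one whitespace-separated result row into a BlatHit."""
    fields = row.split()
    # Drop the 'browser details' link text if it is still present.
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    _, score, qs, qe, qsize, ident, chrom, strand, ts, te, span = fields
    return BlatHit(int(score), int(qs), int(qe), int(qsize),
                   float(ident.rstrip("%")), chrom, strand,
                   int(ts), int(te), int(span))


def query_coverage(hit: BlatHit) -> float:
    """Percentage of the query covered by the hit (inclusive coordinates)."""
    return 100 * (hit.q_end - hit.q_start + 1) / hit.q_size


self_hit = parse_hit("YourSeq 2000 1 2000 2000 100.0% chr1 - 155803798 155805797 2000")
best_other = parse_hit("YourSeq 123 168 385 2000 88.3% chr10 + 64022734 64336475 313742")
print(query_coverage(self_hit), round(query_coverage(best_other), 1))
```

For the upstream flank, the self-hit covers the whole query, while the best secondary hit covers roughly 11% of it at 88.3% identity.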

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr1   -       155792546  155794545  2000
YourSeq  74     1068   1146  2000   98.8%     chr5   +       69435880   69435969   90
YourSeq  72     1079   1158  2000   98.7%     chr13  +       53002563   53002798   236
YourSeq  67     1092   1158  2000   100.0%    chr2   -       102651632  102651698  67
YourSeq  49     1068   1120  2000   94.2%     chr10  -       37212966   37213017   52
YourSeq  31     732    767   2000   85.3%     chr16  +       92875186   92875219   34
YourSeq  20     1509   1530  2000   95.5%     chr1   +       16830805   16830826   22

Note: The 2000 bp section downstream of Exon 4 is BLAT searched against the genome. Apart from the query's own locus on chr1 (100.0% identity over all 2000 bp), no significant similarity is found.
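The self-hits in the two BLAT tables give the genomic positions of both 2000 bp flanks, so the interval between them can be compared with the stated KO region size. The sketch below assumes that the flanks directly abut the knockout region and that the coordinates are 1-based and inclusive (consistent with each self-hit spanning exactly 2000 bp). The function name is hypothetical.

```python
# Self-hits from the two BLAT tables (chr1, minus strand, 1-based inclusive).
# Qsox1 is on the reverse strand, so the upstream flank has the higher coordinates.
UPSTREAM_FLANK = (155803798, 155805797)    # 2000 bp upstream of exon 2
DOWNSTREAM_FLANK = (155792546, 155794545)  # 2000 bp downstream of exon 4


def region_between(low_flank: tuple[int, int],
                   high_flank: tuple[int, int]) -> tuple[int, int, int]:
    """Return (start, end, length) of the interval strictly between two flanks."""
    start = low_flank[1] + 1
    end = high_flank[0] - 1
    return start, end, end - start + 1


ko_start, ko_end, ko_length = region_between(DOWNSTREAM_FLANK, UPSTREAM_FLANK)
print(ko_start, ko_end, ko_length)
```

Under these assumptions the interval is chr1:155794546-155803797, which is 9252 bp long and matches the effective KO region size given in the strategy summary.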


Gene and information:

Qsox1 quiescin Q6 sulfhydryl oxidase 1 [Mus musculus (house mouse)]
Gene ID: 104009, updated on 1-Oct-2019

Gene summary

Official Symbol: Qsox1 (provided by MGI)
Official Full Name: quiescin Q6 sulfhydryl oxidase 1 (provided by MGI)
Primary source: MGI:MGI:1330818
See related: Ensembl:ENSMUSG00000033684
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: SOx; QSOX; Qscn6; b2b2673Clo; 1300003H02Rik
Expression: Broad expression in liver adult (RPKM 148.9), liver E18 (RPKM 139.0) and 21 other tissues
Orthologs: human; all

Genomic context

Location: 1; 1 G3
Exon count: 12

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  1    NC_000067.6 (155778155..155812899, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    1    NC_000067.5 (157625285..157660029, complement)

Chromosome 1 - NC_000067.6


Transcript information: This gene has 8 transcripts

Gene: Qsox1 ENSMUSG00000033684

Description: quiescin Q6 sulfhydryl oxidase 1 [Source:MGI Symbol;Acc:MGI:1330818]
Gene Synonyms: 1300003H02Rik, QSOX, Qscn6, b2b2673Clo
Location: 155,776,029-155,812,889 reverse strand. GRCm38:CM000994.2
About this gene: This gene has 8 transcripts (splice variants), 196 orthologues, 1 paralogue, is a member of 1 Ensembl protein family and is associated with 13 phenotypes.

Transcripts

Name       Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Qsox1-201  ENSMUST00000035325.14  3348  748aa       ENSMUSP00000035658.8  Protein coding   CCDS15386  Q8BND5   TSL:1 GENCODE basic APPRIS P3
Qsox1-207  ENSMUST00000194632.1   2433  568aa       ENSMUSP00000142301.1  Protein coding   CCDS78713  Q8BND5   TSL:1 GENCODE basic APPRIS ALT2
Qsox1-202  ENSMUST00000111764.7   2722  661aa       ENSMUSP00000107394.2  Protein coding   -          Q8BND5   TSL:1 GENCODE basic APPRIS ALT2
Qsox1-208  ENSMUST00000195419.1   3395  No protein  -                     Retained intron  -          -        TSL:NA
Qsox1-205  ENSMUST00000140809.1   2664  No protein  -                     Retained intron  -          -        TSL:1
Qsox1-206  ENSMUST00000151368.1   922   No protein  -                     Retained intron  -          -        TSL:2
Qsox1-204  ENSMUST00000132495.1   373   No protein  -                     Retained intron  -          -        TSL:2
Qsox1-203  ENSMUST00000130701.1   366   No protein  -                     lncRNA           -          -        TSL:5


[Figure: Ensembl region view, 56.86 kb, from 155.77 Mb to 155.82 Mb.
Forward strand: Gm37089-201 (processed pseudogene), Gm37539-201 (TEC), Gm37571-201 (lncRNA).
Contig: AC121314.7.
Reverse strand: Qsox1-201, Qsox1-202 and Qsox1-207 (protein coding); Qsox1-204, Qsox1-205, Qsox1-206 and Qsox1-208 (retained intron); Qsox1-203 (lncRNA).
Additional track: Regulatory Build.
Regulation legend: CTCF; enhancer; open chromatin; promoter; promoter flank; transcription factor binding site.
Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (RNA gene; pseudogene; processed transcript).]


Transcript: ENSMUST00000035325

[Figure: protein domain view of Qsox1-201 (protein coding; reverse strand, 34.73 kb), with a scale bar from 0 to 748.
Feature tracks: ENSMUSP00000035...; Transmembrane heli...; PDB-ENSP mappings; MobiDB lite; Low complexity (Seg); Cleavage site (Sign...
Superfamily: -like superfamily; ERV/ALR sulfhydryl oxidase domain superfamily.
Pfam: Thioredoxin domain; ERV/ALR sulfhydryl oxidase domain; Sulfhydryl oxidase, Trx-like domain; Sulfhydryl oxidase, flavin adenine dinucleotide (FAD) binding domain.
PROSITE profiles: Thioredoxin domain; ERV/ALR sulfhydryl oxidase domain.
PANTHER: PTHR22897:SF6; Sulfhydryl oxidase.
Gene3D: 3.40.30.10; ERV/ALR sulfhydryl oxidase domain superfamily; Sulfhydryl oxidase, flavin adenine dinucleotide (FAD) binding domain superfamily.
CDD: cd02992.
Sequence variants (dbSNP and all other sources). Variant legend: missense variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
