https://www.alphaknockout.com

Mouse Mxra8 Knockout Project (CRISPR/Cas9)

Objective: To create an Mxra8 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Mxra8 gene (NCBI Reference Sequence: NM_024263; Ensembl: ENSMUSG00000029070) is located on mouse chromosome 4. Ten exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 10 (Transcript: ENSMUST00000030947). Exons 1~10 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Phenotypic analysis of mice homozygous for a gene trap allele indicates that this mutation has no notable phenotype in any parameter tested in a high-throughput screen.

Exon 1 starts from about 0.08% of the coding region. Exons 1~10 cover 100.0% of the coding region. The size of the effective KO region is ~3486 bp. The KO region does not contain any other known gene.
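The ~3486 bp figure can be cross-checked against the chr4 coordinates given in the BLAT search results of this report. The following is a minimal sketch, assuming that the 2000 bp upstream window ends immediately before the ATG and that the 2000 bp downstream window starts immediately after the stop codon (the report does not state this explicitly). The function name ko_region is illustrative only.

```python
# Coordinates taken from the BLAT tables in this report (chr4, + strand).
# Assumption: the two 2000 bp flanks directly abut the start and stop codons.
UPSTREAM_END = 155839842      # last base of the upstream 2000 bp window
DOWNSTREAM_START = 155843329  # first base of the downstream 2000 bp window


def ko_region(upstream_end, downstream_start):
    """Return (start, end, length) of the interval between the two flanks.

    Coordinates are 1-based and inclusive.
    """
    start = upstream_end + 1
    end = downstream_start - 1
    return start, end, end - start + 1


start, end, length = ko_region(UPSTREAM_END, DOWNSTREAM_START)
print(f"chr4:{start}-{end} ({length} bp)")  # chr4:155839843-155843328 (3486 bp)
```

Under these assumptions the interval from the start codon to the stop codon matches the stated size of the effective KO region.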


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3' with exons 1 to 10 of mouse Mxra8; one gRNA region lies on each side of the knockout region. Legend: exon of mouse Mxra8; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot. Axis label: Sequence 12. Legend: forward; reverse complement.]

Note: The 2000 bp section upstream of the start codon is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
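The self-alignment described in the note can be sketched as follows. This is a minimal illustration, not the tool used for this report (the report does not name it). It assumes exact matches of a fixed-size window, whereas real dot plot programs may tolerate mismatches. The names self_dot_plot and repeat_offsets are hypothetical. A tandem repeat shows up as many forward matches sharing the same offset, that is, a diagonal parallel to the main diagonal.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq, window=15):
    """Find exact window matches of a sequence against itself.

    Returns (forward, reverse): forward holds pairs (i, j) with i < j whose
    windows are identical; reverse holds pairs (i, j) where the window at i
    equals the reverse complement of the window at j.
    """
    seq = seq.upper()
    index = defaultdict(list)
    for i in range(len(seq) - window + 1):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for kmer, positions in index.items():
        for a, i in enumerate(positions):
            for j in positions[a + 1:]:
                forward.append((i, j))
        # .get() avoids inserting new keys while iterating over the index
        for j in index.get(reverse_complement(kmer), []):
            for i in positions:
                reverse.append((i, j))
    return sorted(forward), sorted(reverse)


def repeat_offsets(forward):
    """Count forward matches per diagonal offset (j - i)."""
    counts = defaultdict(int)
    for i, j in forward:
        counts[j - i] += 1
    return dict(counts)


# Toy input: a 10 bp unit repeated three times (not data from this report).
toy = "ACGGTCATTG" * 3
fwd, rev = self_dot_plot(toy, window=10)
print(repeat_offsets(fwd))
```

On a real 2000 bp flank, an offset with a long run of matches would suggest a tandem repeat with that period, while a sequence without repeats yields few or no off-diagonal matches.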

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot. Axis label: Sequence 12. Legend: forward; reverse complement.]

Note: The 2000 bp section downstream of the stop codon is aligned with itself to determine whether there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution plot. Axis label: Sequence 12.]

Summary: Full Length(2000bp) | A(22.45% 449) | C(24.1% 482) | T(23.05% 461) | G(30.4% 608)

Note: The 2000 bp section upstream of the start codon is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution plot. Axis label: Sequence 12.]

Summary: Full Length(2000bp) | A(25.9% 518) | C(24.2% 484) | T(26.05% 521) | G(23.85% 477)

Note: The 2000 bp section downstream of the stop codon is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
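The base composition summaries and the 300 bp windowed GC profile can be reproduced along the following lines. This is a sketch only: the function names are hypothetical, the report does not say how its plots were computed, and the flank sequences themselves are not included here, so the example uses the base counts printed in the two summary lines.

```python
from collections import Counter


def base_composition(seq):
    """Return {base: (count, percent)} for A, C, G and T."""
    counts = Counter(seq.upper())
    total = len(seq)
    return {b: (counts[b], 100.0 * counts[b] / total) for b in "ACGT"}


def windowed_gc(seq, window=300, step=1):
    """GC fraction of each window of the given size along the sequence."""
    seq = seq.upper()
    profile = []
    for i in range(0, len(seq) - window + 1, step):
        chunk = seq[i:i + window]
        profile.append((chunk.count("G") + chunk.count("C")) / window)
    return profile


def gc_percent_from_counts(a, c, t, g):
    """Overall GC percentage from per-base counts."""
    return 100.0 * (c + g) / (a + c + t + g)


# Base counts copied from the two summary lines of this report.
upstream_gc = gc_percent_from_counts(a=449, c=482, t=461, g=608)
downstream_gc = gc_percent_from_counts(a=518, c=484, t=521, g=477)
print(f"upstream GC: {upstream_gc:.2f}%")      # upstream GC: 54.50%
print(f"downstream GC: {downstream_gc:.2f}%")  # downstream GC: 48.05%
```

What counts as a "significant high GC-content region" is not defined in the report, so any cutoff applied to the windowed profile would be a separate assumption.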


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr4   +       155837843  155839842  2000
YourSeq  40     1266   1319  2000   87.1%     chr15  +       79535195   79535248   54
YourSeq  37     1275   1321  2000   89.4%     chr11  +       60125932   60125978   47
YourSeq  35     1265   1366  2000   85.8%     chr2   -       171595156  171595258  103
YourSeq  34     1529   1564  2000   97.3%     chr9   -       119885905  119885940  36
YourSeq  31     1528   1562  2000   97.0%     chr13  -       93855829   93855863   35
YourSeq  30     903    954   2000   94.2%     chr9   -       121801916  121801986  71
YourSeq  29     1265   1295  2000   96.8%     chr17  +       53627897   53627927   31
YourSeq  28     1528   1556  2000   100.0%    chr4   +       134327156  134327185  30
YourSeq  27     1538   1564  2000   100.0%    chr3   +       70625998   70626024   27
YourSeq  26     1269   1296  2000   96.5%     chr2   -       154751706  154751733  28
YourSeq  26     1268   1297  2000   93.4%     chr7   +       102528581  102528610  30
YourSeq  26     1265   1294  2000   93.4%     chr10  +       82820328   82820357   30
YourSeq  26     1265   1294  2000   93.4%     chr1   +       179698057  179698086  30
YourSeq  26     1525   1553  2000   96.5%     chr1   +       51890126   51890159   34
YourSeq  25     1276   1300  2000   100.0%    chr18  -       10891459   10891483   25
YourSeq  25     928    953   2000   100.0%    chr1   -       45751910   45751946   37
YourSeq  25     1525   1551  2000   100.0%    chr1   -       34755574   34755606   33
YourSeq  24     987    1013  2000   96.2%     chr1   -       26256269   26256298   30
YourSeq  24     1672   1706  2000   69.3%     chr1   +       40899905   40899931   27

Note: The 2000 bp section upstream of start codon is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr4   +       155843329  155845328  2000
YourSeq  249    1361   1919  2000   91.7%     chr14  +       20605987   20654194   48208
YourSeq  242    1358   1905  2000   91.7%     chr12  -       85162160   85295287   133128
YourSeq  210    1307   1698  2000   91.6%     chr13  +       8863505    8864032    528
YourSeq  199    1350   1902  2000   94.7%     chr14  +       64192480   64226751   34272
YourSeq  186    1354   1721  2000   94.3%     chr12  +       53980130   53980513   384
YourSeq  183    1383   1698  2000   85.5%     chr4   +       148566566  148566854  289
YourSeq  183    1345   1669  2000   88.3%     chr16  +       17108981   17109343   363
YourSeq  182    1353   1706  2000   94.7%     chr5   -       66256160   66256788   629
YourSeq  181    1361   1721  2000   86.6%     chr2   -       152770408  152770732  325
YourSeq  143    1543   1698  2000   93.5%     chr17  -       80175816   80175968   153
YourSeq  134    1542   1688  2000   96.0%     chr1   +       56172916   56173063   148
YourSeq  127    1359   1644  2000   93.8%     chr12  -       82261156   82261697   542
YourSeq  125    1348   1691  2000   86.1%     chr4   -       129549272  129549602  331
YourSeq  123    1565   1702  2000   92.7%     chrX   +       8003403    8003538    136
YourSeq  121    1346   1678  2000   84.2%     chr1   -       180720693  180721004  312
YourSeq  120    1359   1646  2000   87.0%     chr11  -       83331666   83331975   310
YourSeq  120    1354   1676  2000   94.9%     chr15  +       82151651   82182793   31143
YourSeq  119    1564   1710  2000   91.7%     chr7   -       27204845   27205010   166
YourSeq  118    1345   1520  2000   85.9%     chr6   -       53129849   53130029   181

Note: The 2000 bp section downstream of stop codon is BLAT searched against the genome. No significant similarity is found.
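To review BLAT hits programmatically, the result rows can be parsed into records and the self-match separated from the other hits. This is a sketch under assumptions: the regular expression relies only on the column order shown in the tables above, the names BlatHit, parse_blat and non_self_hits are hypothetical, and treating a full-length 100% identity hit as the query locus itself is a simplification. The report does not define a numeric threshold for "significant similarity", so none is applied.

```python
import re
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


# One row: YourSeq SCORE START END QSIZE IDENTITY% CHROM STRAND START END SPAN
ROW = re.compile(
    r"YourSeq\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+([\d.]+)%\s+(\S+)\s+([+-])"
    r"\s+(\d+)\s+(\d+)\s+(\d+)"
)


def parse_blat(text):
    """Extract BLAT rows from text, whether flattened or one row per line."""
    hits = []
    for m in ROW.finditer(text):
        score, qs, qe, qsize, ident, chrom, strand, ts, te, span = m.groups()
        hits.append(BlatHit(int(score), int(qs), int(qe), int(qsize),
                            float(ident), chrom, strand,
                            int(ts), int(te), int(span)))
    return hits


def non_self_hits(hits):
    """Drop full-length, 100% identity hits (assumed to be the query locus)."""
    return [h for h in hits
            if not (h.score == h.q_size and h.identity == 100.0)]


# First three rows of the downstream table, as they appear in the raw output.
sample = (
    "browser details YourSeq 2000 1 2000 2000 100.0% chr4 + 155843329 155845328 2000 "
    "browser details YourSeq 249 1361 1919 2000 91.7% chr14 + 20605987 20654194 48208 "
    "browser details YourSeq 242 1358 1905 2000 91.7% chr12 - 85162160 85295287 133128"
)
hits = parse_blat(sample)
others = non_self_hits(hits)
best = max(others, key=lambda h: h.score)
print(best.chrom, best.score, best.identity)  # chr14 249 91.7
```

Sorting the remaining hits by score or by query interval makes it easier to see which part of the 2000 bp flank attracts secondary matches.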


Gene and information: Mxra8 matrix-remodelling associated 8 [ Mus musculus (house mouse) ] Gene ID: 74761, updated on 12-Aug-2019

Gene summary

Official Symbol: Mxra8 (provided by MGI)
Official Full Name: matrix-remodelling associated 8 (provided by MGI)
Primary source: MGI:MGI:1922011
See related: Ensembl:ENSMUSG00000029070
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Asp3; Dicam; AI131686; 1200013A08Rik; 1700095D18Rik
Expression: Broad expression in lung adult (RPKM 383.7), ovary adult (RPKM 167.8) and 15 other tissues
Orthologs: human, all

Genomic context

Location: 4 E2; 4 87.58 cM
Exon count: 10

Annotation release  Status             Assembly                    Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  4    NC_000070.6 (155839680..155844102)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    4    NC_000070.5 (155213789..155218211)

Chromosome 4 - NC_000070.6


Transcript information: This gene has 7 transcripts

Gene: Mxra8 ENSMUSG00000029070

Description: matrix-remodelling associated 8 [Source:MGI Symbol;Acc:MGI:1922011]
Gene Synonyms: 1200013A08Rik
Location: Chromosome 4: 155,839,680-155,844,088 forward strand. GRCm38:CM000997.2
About this gene: This gene has 7 transcripts (splice variants), 231 orthologues, 17 paralogues, is a member of 1 Ensembl protein family and is associated with 1 phenotype.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Mxra8-201  ENSMUST00000030947.3  2203  442aa       ENSMUSP00000030947.3  Protein coding   CCDS19044  Q9DBV4   TSL:1 GENCODE basic APPRIS P1
Mxra8-206  ENSMUST00000141883.7  1097  311aa       ENSMUSP00000114929.1  Protein coding   -          A2AD97   CDS 3' incomplete TSL:2
Mxra8-205  ENSMUST00000141766.7  788   No protein  -                     Retained intron  -          -        TSL:3
Mxra8-202  ENSMUST00000126487.1  715   No protein  -                     Retained intron  -          -        TSL:1
Mxra8-207  ENSMUST00000143886.1  594   No protein  -                     Retained intron  -          -        TSL:3
Mxra8-204  ENSMUST00000133592.1  564   No protein  -                     Retained intron  -          -        TSL:3
Mxra8-203  ENSMUST00000132142.1  449   No protein  -                     lncRNA           -          -        TSL:5


[Figure: Ensembl region view, 24.41 kb, chromosome 4 from 155.83Mb to 155.85Mb.]

Forward strand transcripts (Comprehensive set...):
Aurkaip1: Aurkaip1-205 (protein coding), Aurkaip1-201 (protein coding), Aurkaip1-206 (retained intron), Aurkaip1-203 (protein coding), Aurkaip1-204 (lncRNA), Aurkaip1-202 (protein coding)
Mxra8: Mxra8-206 (protein coding), Mxra8-203 (lncRNA), Mxra8-201 (protein coding), Mxra8-204 (retained intron), Mxra8-207 (retained intron), Mxra8-205 (retained intron), Mxra8-202 (retained intron)
Dvl1: Dvl1-201 (protein coding), Dvl1-208 (protein coding), Dvl1-207 (retained intron), Dvl1-203 (retained intron), Dvl1-205 (retained intron), Dvl1-206 (retained intron), Dvl1-204 (lncRNA)

Contigs: AL670236.9
Reverse strand genes (Comprehensive set...): Mxra8os-201 (lncRNA)

Regulation legend: CTCF; Promoter; Promoter Flank
Gene legend, protein coding: merged Ensembl/Havana; Ensembl protein coding
Gene legend, non-protein coding: processed transcript; RNA gene


Transcript: ENSMUST00000030947

[Figure: Ensembl protein summary for Mxra8-201 (protein coding), 4.36 kb, forward strand. Scale bar: 0 to 442.]

Protein feature tracks shown for ENSMUSP00000030...:
Transmembrane heli...
PDB-ENSP mappings
MobiDB lite
Low complexity (Seg)
Cleavage site (Sign...
Superfamily: Immunoglobulin-like domain superfamily
SMART: Immunoglobulin V-set domain; Immunoglobulin subtype
Pfam: Immunoglobulin V-set domain
PROSITE profiles: Immunoglobulin-like domain
PANTHER: Matrix remodeling-associated protein 8
Gene3D: Immunoglobulin-like fold
CDD: cd00096

Variant tracks: All sequence SNPs/i...; Sequence variants (dbSNP and all other sources)
Variant legend: missense variant; synonymous variant

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
