https://www.alphaknockout.com

Mouse Def6 Knockout Project (CRISPR/Cas9)

Objective: To create a Def6 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Def6 gene (NCBI Reference Sequence: NM_027185; Ensembl: ENSMUSG00000002257) is located on mouse chromosome 17. Eleven exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 11 (Transcript: ENSMUST00000002327). Exons 3~9 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Homozygous mutants spontaneously develop systemic autoimmunity. Females are primarily affected, displaying hypergammaglobulinemia, accumulation of effector/memory T cells and IgG+ B cells, and production of autoantibodies.

Exon 3 starts at about 12.59% of the coding region. Exons 3~9 cover 71.11% of the coding region. The size of the effective KO region is ~8699 bp. The KO region does not contain any other known gene.
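The figures above can be cross-checked with simple arithmetic. The sketch below assumes a coding sequence of 630 codons (1890 bp, taken from the 630 aa protein listed for ENSMUST00000002327, stop codon excluded) and uses the chr17 coordinates reported in the BLAT results for the exon 3 and exon 9 sections. The function names are illustrative only. The exon-to-exon span it computes is close to the stated ~8699 bp but not identical, presumably because the effective KO region is defined by the gRNA cut sites, which this report does not list.

```python
# Assumption: CDS length = 630 codons * 3 bp, stop codon not counted.
CDS_BP = 630 * 3


def percent_to_bp(percent, cds_bp=CDS_BP):
    """Convert a percentage of the coding region into an approximate bp count."""
    return round(percent / 100 * cds_bp)


def inclusive_span(start, end):
    """Length of a genomic interval with 1-based inclusive coordinates."""
    return end - start + 1


# Percentages quoted in the strategy summary.
exon3_offset_bp = percent_to_bp(12.59)   # coding bp upstream of exon 3
exons3_9_bp = percent_to_bp(71.11)       # coding bp removed with exons 3~9

# chr17 coordinates from the BLAT hits: exon 3 start, exon 9 end.
span_bp = inclusive_span(28217603, 28226299)

print(f"coding bp before exon 3: ~{exon3_offset_bp}")
print(f"coding bp in exons 3~9:  ~{exons3_9_bp}")
print(f"exon 3 start to exon 9 end: {span_bp} bp")
```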


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', with exon labels 1, 3, 4, 5, 6, 7, 8, 9 and 11 and two gRNA regions flanking the knockout region. Legend: exon of mouse Def6; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: self-alignment dot plot of Sequence 12. Legend: Forward; Reverse Complement.]

Note: The 186 bp section of Exon 3 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
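The self-alignment behind a dot plot can be sketched as an exact-match word search. The code below is a minimal illustration, not the tool used to produce the plots in this report: it reports every pair of positions that share an identical window (forward) or whose windows are reverse complements of each other. The exon sequences are not included in this report, so the usage lines run on short made-up sequences, and the function names are hypothetical.

```python
def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def self_dot_plot(seq, window=15):
    """Return (forward, reverse) lists of (i, j) window start positions.

    forward: i < j and seq[i:i+window] == seq[j:j+window]
    reverse: seq[i:i+window] == revcomp(seq[j:j+window])
    The trivial main diagonal (i == j) is left out of the forward list.
    """
    seq = seq.upper()
    n_words = len(seq) - window + 1
    index = {}
    for i in range(n_words):
        index.setdefault(seq[i:i + window], []).append(i)

    forward = sorted(
        (i, j) for hits in index.values() for i in hits for j in hits if i < j
    )

    rc = revcomp(seq)
    reverse = []
    for k in range(n_words):
        # rc[k:k+window] is the reverse complement of the window starting at j.
        j = len(seq) - window - k
        for i in index.get(rc[k:k + window], []):
            reverse.append((i, j))
    return forward, sorted(reverse)


# Made-up sequences for illustration only.
tandem = "ACGGTCATTGCAGTCAAGGC" * 2          # a 20 bp unit repeated twice
fwd, rev = self_dot_plot(tandem, window=15)
print(len(fwd), "forward off-diagonal matches")

fwd_small, rev_small = self_dot_plot("AAAAGGTTTT", window=4)
print(rev_small)
```

An empty forward list and an empty reverse list correspond to the "no significant tandem repeat" outcome described in the note.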

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: self-alignment dot plot of Sequence 12. Legend: Forward; Reverse Complement.]

Note: The 199 bp section of Exon 9 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution plot of Sequence 12.]

Summary: Full Length(186bp) | A(22.04% 41) | C(24.73% 46) | T(24.19% 45) | G(29.03% 54)

Note: The 186 bp section of Exon 3 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution plot of Sequence 12.]

Summary: Full Length(199bp) | A(27.14% 54) | C(23.12% 46) | T(9.05% 18) | G(40.7% 81)

Note: The 199 bp section of Exon 9 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
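The base counts in the two summaries are enough to recompute the overall GC content of each section. The sketch below does that and also shows a generic sliding-window GC calculation. The window function is run on a made-up 8 bp string because the exon sequences are not part of this report, and all function names are illustrative.

```python
def gc_from_counts(counts):
    """Return (total length, GC percentage) from a dict of base counts."""
    total = sum(counts.values())
    gc_percent = 100 * (counts["G"] + counts["C"]) / total
    return total, round(gc_percent, 2)


def windowed_gc(seq, window):
    """GC percentage of every full window along seq, with a step of 1 bp."""
    seq = seq.upper()
    return [
        100 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


# Base counts copied from the two Summary lines.
exon3_counts = {"A": 41, "C": 46, "T": 45, "G": 54}
exon9_counts = {"A": 54, "C": 46, "T": 18, "G": 81}

print(gc_from_counts(exon3_counts))  # (186, 53.76)
print(gc_from_counts(exon9_counts))  # (199, 63.82)

# Made-up sequence, only to show the window calculation.
print(windowed_gc("GGCCAATT", window=4))
```

The totals reproduce the stated section lengths (186 bp and 199 bp), which is a quick consistency check on the summary lines.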


BLAT Search Results (up)

QUERY    SCORE  START  END  QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  186    1      186  186    100.0%    chr17  +       28217603   28217788   186
YourSeq  20     98     117  186    100.0%    chr15  -       93522854   93522873   20

Note: The 186 bp section of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END  QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  199    1      199  199    100.0%    chr17  +       28226101   28226299   199
YourSeq  51     9      109  199    91.9%     chr1   -       165767367  165767491  125
YourSeq  28     10     60   199    96.7%     chr2   +       170247312  170247365  54
YourSeq  25     81     110  199    96.3%     chr7   -       127558346  127558387  42
YourSeq  24     81     109  199    96.2%     chr4   -       128221745  128221776  32
YourSeq  24     85     109  199    100.0%    chr11  -       120629506  120629536  31
YourSeq  24     81     105  199    100.0%    chrX   +       93146049   93146075   27
YourSeq  24     81     109  199    96.2%     chr7   +       122257465  122257505  41
YourSeq  24     81     110  199    80.8%     chr2   +       16329996   16330022   27
YourSeq  23     130    152  199    100.0%    chr9   -       7764245    7764267    23
YourSeq  23     81     109  199    80.0%     chr7   -       35278492   35278517   26
YourSeq  23     82     110  199    80.0%     chr2   +       20896883   20896908   26
YourSeq  22     143    164  199    100.0%    chr4   +       99616484   99616505   22
YourSeq  21     35     55   199    100.0%    chr17  -       15056869   15056889   21
YourSeq  20     82     103  199    95.5%     chr9   -       80671890   80671911   22
YourSeq  20     45     64   199    100.0%    chr9   +       122814938  122814957  20
YourSeq  20     1      22   199    95.5%     chr8   +       71680756   71680777   22
YourSeq  20     142    161  199    100.0%    chr12  +       91749287   91749306   20

Note: The 199 bp section of Exon 9 is BLAT searched against the genome. No significant similarity is found.
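The "no significant similarity" reading of the BLAT tables can be made explicit by parsing the rows and flagging any secondary hit whose score is a large fraction of the query size. The sketch below is one way to do that. The 0.5 cut-off is a hypothetical threshold chosen for illustration (the report does not state the criterion it applied), and the class and function names are not from any BLAT tooling.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(row):
    """Parse one whitespace-separated result row (query name in column 1)."""
    f = row.split()
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def secondary_hits(hits, min_fraction=0.5):
    """Hits other than the best one scoring >= min_fraction * query size.

    min_fraction is an assumed, illustrative threshold.
    """
    best = max(hits, key=lambda h: h.score)
    return [h for h in hits
            if h is not best and h.score >= min_fraction * h.q_size]


# First three rows of the exon 9 (down) table.
rows = [
    "YourSeq 199 1 199 199 100.0% chr17 + 28226101 28226299 199",
    "YourSeq 51 9 109 199 91.9% chr1 - 165767367 165767491 125",
    "YourSeq 28 10 60 199 96.7% chr2 + 170247312 170247365 54",
]
hits = [parse_hit(r) for r in rows]
best_hit = max(hits, key=lambda h: h.score)

print(best_hit.chrom, best_hit.score)
print(secondary_hits(hits))                     # nothing at the 0.5 cut-off
print(secondary_hits(hits, min_fraction=0.25))  # a looser cut-off flags chr1
```

With the assumed 0.5 cut-off only the on-target chr17 hit remains, which matches the note. The strongest secondary hit (score 51 of 199 on chr1) appears only when the threshold is loosened.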


Gene and information: Def6 differentially expressed in FDCP 6 [ Mus musculus (house mouse) ] Gene ID: 23853, updated on 12-Aug-2019

Gene summary

Official Symbol: Def6 (provided by MGI)
Official Full Name: differentially expressed in FDCP 6 (provided by MGI)
Primary source: MGI:MGI:1346328
See related: Ensembl:ENSMUSG00000002257
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Ibp; Slat; Slat2; Slat6; AV094905; 2410003F05Rik; 6430538D02Rik
Expression: Biased expression in thymus adult (RPKM 67.0), spleen adult (RPKM 23.0) and 8 other tissues
Orthologs: human; all

Genomic context

Location: 17; 17 A3.3
Exon count: 11

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  17   NC_000083.6 (28207456..28228608)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    17   NC_000083.5 (28344723..28365553)

Chromosome 17 - NC_000083.6


Transcript information: This gene has 8 transcripts

Gene: Def6 ENSMUSG00000002257

Description: differentially expressed in FDCP 6 [Source:MGI Symbol;Acc:MGI:1346328]
Gene Synonyms: 2410003F05Rik, 6430538D02Rik, IBP, IRF-4-binding protein, SLAT
Location: Chromosome 17: 28,207,778-28,228,608 forward strand. GRCm38:CM001010.2
About this gene: This gene has 8 transcripts (splice variants), 207 orthologues, 2 paralogues, is a member of 1 Ensembl protein family and is associated with 19 phenotypes.

Transcripts

Name      Transcript ID         bp    Protein     Translation ID        Biotype         CCDS       UniProt     Flags
Def6-201  ENSMUST00000002327.5  2277  630aa       ENSMUSP00000002327.5  Protein coding  CCDS28574  A0A0R4IZX1  TSL:1, GENCODE basic, APPRIS P1
Def6-205  ENSMUST00000233264.1  1155  385aa       ENSMUSP00000156702.1  Protein coding  -          A0A3B2W417  CDS 5' and 3' incomplete
Def6-203  ENSMUST00000233170.1  774   257aa       ENSMUSP00000156601.1  Protein coding  -          Q8C2K1      GENCODE basic
Def6-207  ENSMUST00000233560.1  581   180aa       ENSMUSP00000156873.1  Protein coding  -          A0A3B2W860  CDS 3' incomplete
Def6-208  ENSMUST00000233958.1  432   56aa        ENSMUSP00000156736.1  Protein coding  -          A0A3B2WCX4  CDS 3' incomplete
Def6-206  ENSMUST00000233534.1  385   129aa       ENSMUSP00000156804.1  Protein coding  -          A0A3B2W486  CDS 5' and 3' incomplete
Def6-204  ENSMUST00000233205.1  445   No protein  -                     lncRNA          -          -           -
Def6-202  ENSMUST00000146724.1  349   No protein  -                     lncRNA          -          -           TSL:3


[Figure: Ensembl region view, 40.83 kb, 28.20 Mb to 28.23 Mb, showing forward-strand and reverse-strand tracks (comprehensive gene set), contigs and the Regulatory Build.]

Forward strand transcripts shown:
Zfp523: -202 (protein coding), -213 (nonsense mediated decay), -205 (retained intron), -201 (protein coding), -208 (nonsense mediated decay), -214 (protein coding), -212 (protein coding), -204 (protein coding), -206 (retained intron), -209 (nonsense mediated decay), -203 (nonsense mediated decay), -207 (protein coding), -210 (retained intron)
Def6: -201 (protein coding), -207 (protein coding), -208 (protein coding), -202 (lncRNA), -206 (protein coding), -203 (protein coding), -205 (protein coding), -204 (lncRNA)
Ppard: -201 (protein coding), -207 (protein coding), -202 (retained intron), -209 (retained intron), -205 (protein coding), -203 (retained intron)

Contigs: AC126937.3
Reverse strand genes: Gm49874-201 (lncRNA)

Regulation Legend: CTCF; Open Chromatin; Promoter; Promoter Flank; Transcription Factor Binding Site
Gene Legend: Protein Coding (merged Ensembl/Havana; Ensembl protein coding); Non-Protein Coding (processed transcript; RNA gene)


Transcript: ENSMUST00000002327

[Figure: Def6-201 (protein coding), 20.83 kb, forward strand, with protein annotation tracks for ENSMUSP00000002... drawn against a scale bar from 0 to 630.]

Annotation tracks, in the order shown: MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils); Superfamily (SSF50729, EF-hand domain pair); SMART (Pleckstrin homology domain); Pfam (Pleckstrin homology domain); PROSITE profiles (Pleckstrin homology domain); PANTHER (PTHR14383, PTHR14383:SF2); Gene3D (PH-like domain superfamily); CDD (cd13273); Sequence variants (dbSNP and all other sources)

Variant Legend: missense variant; synonymous variant

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
