https://www.alphaknockout.com

Mouse Hecw2 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Hecw2 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Hecw2 gene (NCBI Reference Sequence: NM_001001883; Ensembl: ENSMUSG00000042807) is located on mouse chromosome 1. 29 exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 29 (Transcript: ENSMUST00000087659). Exon 6 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Hecw2 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-124D3 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 6 starts at about 12.08% of the coding region. Knockout of exon 6 will result in a frameshift of the gene. The size of intron 5 for 5'-loxP site insertion is 3769 bp, and the size of intron 6 for 3'-loxP site insertion is 2009 bp. The size of the effective cKO region is ~670 bp. The cKO region does not contain any other known gene.
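The position quoted in the note can be turned into an approximate coordinate with a few lines of arithmetic. The sketch below is illustrative only: it assumes the percentage is measured against a coding sequence of 1578 codons (the protein length listed for ENSMUST00000087659) with the stop codon excluded, and the helper causes_frameshift is a hypothetical name. The length of exon 6 itself is not stated in this report, so the helper is shown without being applied to it.

```python
# Values taken from this report
PROTEIN_LENGTH_AA = 1578        # protein length of ENSMUST00000087659
EXON6_START_FRACTION = 0.1208   # "about 12.08% of the coding region"

# Assumption: coding region = 3 nt per amino acid, stop codon not counted
cds_length_nt = PROTEIN_LENGTH_AA * 3
exon6_start_nt = EXON6_START_FRACTION * cds_length_nt
exon6_start_codon = exon6_start_nt / 3


def causes_frameshift(deleted_coding_nt: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_nt % 3 != 0


print(f"CDS length: {cds_length_nt} nt")
print(f"Exon 6 starts near CDS nucleotide {exon6_start_nt:.0f} (codon ~{exon6_start_codon:.0f})")
```

Under these assumptions exon 6 begins early in the coding sequence, so a frameshift there would affect most of the protein downstream of that point.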


Overview of the Targeting Strategy

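[Figure: schematic of the targeting strategy showing, from top to bottom, the wildtype allele (exons 1 to 29, with the 5' and 3' gRNA regions flanking exon 6), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Hecw2; homology arm; cKO region; loxP site.]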


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence aligned against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
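For readers unfamiliar with self-alignment dot plots, the following is a minimal sketch of the idea, not the tool used to produce the plot in this report. It records every pair of positions whose 10 bp windows match exactly, on the forward strand and against the reverse complement. The function names and the toy sequence are illustrative assumptions; a tandem repeat shows up as a run of off-diagonal forward matches with a constant offset.

```python
from collections import defaultdict


def reverse_complement(seq: str) -> str:
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) start positions with identical windows.

    forward: windows at i < j on the same strand (the trivial diagonal i == j is skipped).
    reverse: window at i matches the reverse complement of the window at j.
    """
    seq = seq.upper()
    index = defaultdict(list)
    for i in range(len(seq) - window + 1):
        index[seq[i:i + window]].append(i)

    forward = sorted((i, j) for pos in index.values() for i in pos for j in pos if i < j)

    rc = reverse_complement(seq)
    reverse = []
    for k in range(len(rc) - window + 1):
        j = len(seq) - window - k  # start of this window in original coordinates
        for i in index.get(rc[k:k + window], []):
            reverse.append((i, j))
    return forward, sorted(reverse)


# Toy sequence (not from the report): a 12 bp unit repeated twice in tandem
toy = "TTTT" + "ACGGTCATGCAT" * 2 + "GGGG"
forward, reverse = self_dot_plot(toy, window=10)
print(forward)  # off-diagonal forward matches, all 12 bp apart
```

An empty or sparse forward list away from the diagonal corresponds to the "no significant tandem repeat" reading of the plot.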

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full length (7170 bp) | A (27.73%, 1988) | C (20.54%, 1473) | T (30.93%, 2218) | G (20.79%, 1491)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
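The base counts in the summary are enough to recompute the overall GC content, and the windowed profile can be sketched in a few lines. This is an illustrative example rather than the vendor's analysis code: the counts come from the summary line, while windowed_gc and its step parameter are assumed names, and the short demo sequence is a toy input because the 7170 bp sequence itself is not reproduced in this report.

```python
# Base counts from the summary line of this report
counts = {"A": 1988, "C": 1473, "T": 2218, "G": 1491}

total = sum(counts.values())
gc_fraction = (counts["G"] + counts["C"]) / total
print(f"Total length: {total} bp, overall GC content: {gc_fraction:.2%}")


def windowed_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """GC fraction of each window of `window` bp, moving `step` bp at a time."""
    seq = seq.upper()
    return [
        sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


# Toy input (not from the report) with a small window so the result is easy to check
demo_profile = windowed_gc("GGCCAATT", window=4, step=2)
print(demo_profile)
```

The counts sum to the stated full length of 7170 bp, which is a quick consistency check on the summary line.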


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr1   -       53933539   53936538   3000
YourSeq  54     2716   2790  3000   88.4%     chrX   -       100070903  100070974  72
YourSeq  35     2729   2789  3000   78.1%     chr5   -       58109836   58109891   56
YourSeq  30     2721   2755  3000   94.3%     chr16  +       68328149   68328200   52
YourSeq  30     2717   2755  3000   97.0%     chr15  +       95301163   95301201   39
YourSeq  26     2730   2758  3000   96.5%     chr10  +       31857161   31857191   31
YourSeq  25     2716   2740  3000   100.0%    chr4   -       40475906   40475930   25
YourSeq  21     2721   2741  3000   100.0%    chr10  +       64481559   64481579   21
YourSeq  20     2722   2741  3000   100.0%    chr2   +       82761871   82761890   20
YourSeq  20     2716   2735  3000   100.0%    chr16  +       40623723   40623742   20

Note: The 3000 bp section upstream of Exon 6 is BLAT searched against the genome. No significant similarity is found.
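One way to read the upstream BLAT table is to separate the full-length on-target hit from the remaining hits and look at how small the latter are. The sketch below does that for the ten rows reported above. The Hit type and parse_hits function are illustrative names, and no significance threshold is implied; the code only summarizes the numbers already in the table.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hits(text: str) -> list:
    """Parse whitespace-separated BLAT summary rows (query name in the first column)."""
    hits = []
    for line in text.strip().splitlines():
        f = line.split()
        hits.append(Hit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                        float(f[5].rstrip("%")), f[6], f[7],
                        int(f[8]), int(f[9]), int(f[10])))
    return hits


UPSTREAM = """
YourSeq 3000 1 3000 3000 100.0% chr1 - 53933539 53936538 3000
YourSeq 54 2716 2790 3000 88.4% chrX - 100070903 100070974 72
YourSeq 35 2729 2789 3000 78.1% chr5 - 58109836 58109891 56
YourSeq 30 2721 2755 3000 94.3% chr16 + 68328149 68328200 52
YourSeq 30 2717 2755 3000 97.0% chr15 + 95301163 95301201 39
YourSeq 26 2730 2758 3000 96.5% chr10 + 31857161 31857191 31
YourSeq 25 2716 2740 3000 100.0% chr4 - 40475906 40475930 25
YourSeq 21 2721 2741 3000 100.0% chr10 + 64481559 64481579 21
YourSeq 20 2722 2741 3000 100.0% chr2 + 82761871 82761890 20
YourSeq 20 2716 2735 3000 100.0% chr16 + 40623723 40623742 20
"""

hits = parse_hits(UPSTREAM)
on_target = max(hits, key=lambda h: h.score)
others = [h for h in hits if h is not on_target]
best_other = max(others, key=lambda h: h.score)

print(f"On-target: {on_target.chrom}:{on_target.t_start}-{on_target.t_end}, score {on_target.score}")
print(f"Best other hit: {best_other.chrom}, score {best_other.score}, span {best_other.span} bp")
print(f"Other hits cover query positions {min(h.q_start for h in others)}"
      f"-{max(h.q_end for h in others)} only")
```

In this table every secondary hit falls within the same short stretch of the query (positions 2716 to 2790), and the largest of them spans 72 bp of a 3000 bp query.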

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr1   -       53929869   53932868   3000
YourSeq  25     2850   2878  3000   88.9%     chr1   +       128724669  128724696  28
YourSeq  24     1550   1574  3000   100.0%    chr1   +       23798701   23798726   26
YourSeq  23     1367   1389  3000   100.0%    chr14  +       118823535  118823557  23
YourSeq  22     2858   2879  3000   100.0%    chr4   +       152080481  152080502  22
YourSeq  22     2704   2725  3000   100.0%    chr17  +       33711071   33711092   22
YourSeq  21     1440   1460  3000   100.0%    chrX   +       74320091   74320111   21

Note: The 3000 bp section downstream of Exon 6 is BLAT searched against the genome. No significant similarity is found.
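The two on-target BLAT hits also give a simple cross-check on the size of the region between the searched flanks. The sketch below assumes the coordinates are 1-based and inclusive, which is consistent with each hit having a span of 3000 bp; the variable and function names are illustrative. Because Hecw2 lies on the reverse strand, the upstream flank has the higher chromosome coordinates.

```python
# Top (on-target) hits on chr1 from the two BLAT tables in this report
upstream_flank = (53933539, 53936538)    # 3000 bp upstream of exon 6
downstream_flank = (53929869, 53932868)  # 3000 bp downstream of exon 6


def interval_length(interval) -> int:
    """Length of a 1-based, inclusive interval (assumed coordinate convention)."""
    start, end = interval
    return end - start + 1


# Bases lying strictly between the two flanks
between_flanks = upstream_flank[0] - downstream_flank[1] - 1

print(f"Upstream flank:   {interval_length(upstream_flank)} bp")
print(f"Downstream flank: {interval_length(downstream_flank)} bp")
print(f"Between flanks:   {between_flanks} bp")
```

Under that coordinate assumption the interval between the flanks is 670 bp, which agrees with the effective cKO region size of ~670 bp given in the strategy note.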


Gene and information:

Hecw2 HECT, C2 and WW domain containing E3 ubiquitin protein ligase 2 [Mus musculus (house mouse)]
Gene ID: 329152, updated on 12-Aug-2019

Gene summary

Official Symbol: Hecw2 (provided by MGI)
Official Full Name: HECT, C2 and WW domain containing E3 ubiquitin protein ligase 2 (provided by MGI)
Primary source: MGI:MGI:2685817
See related: Ensembl:ENSMUSG00000042807
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Gm971; Nedl2; mKIAA1301; A730039N16Rik; D030049F17Rik
Expression: Broad expression in cortex adult (RPKM 2.4), lung adult (RPKM 2.2) and 19 other tissues
Orthologs: human, all

Genomic context

Location: 1; 1 C1.1

Exon count: 33

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  1    NC_000067.6 (53806872..54195034, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    1    NC_000067.5 (53863718..54251878, complement)

Chromosome 1 - NC_000067.6


Transcript information: This gene has 6 transcripts

Gene: Hecw2 ENSMUSG00000042807

Description: HECT, C2 and WW domain containing E3 ubiquitin protein ligase 2 [Source:MGI Symbol;Acc:MGI:2685817]
Gene Synonyms: A730039N16Rik, D030049F17Rik, Nedl2
Location: Chromosome 1: 53,806,876-54,195,168 reverse strand. GRCm38:CM000994.2
About this gene: This gene has 6 transcripts (splice variants), 219 orthologues, 23 paralogues, is a member of 1 Ensembl protein family and is associated with 3 phenotypes.

Transcripts

Name       Transcript ID          bp     Protein     Translation ID        Biotype          CCDS       UniProt         Flags
Hecw2-203  ENSMUST00000120904.7   11263  1578aa      ENSMUSP00000113283.1  Protein coding   CCDS14956  Q6I6G8          TSL:5, GENCODE basic, APPRIS P1
Hecw2-201  ENSMUST00000087659.10  11171  1578aa      ENSMUSP00000084942.4  Protein coding   CCDS14956  Q6I6G8          TSL:5, GENCODE basic, APPRIS P1
Hecw2-202  ENSMUST00000097741.2   1132   294aa       ENSMUSP00000095348.2  Protein coding   CCDS14957  A3KPB7, Q6I6G8  TSL:1, GENCODE basic
Hecw2-204  ENSMUST00000146850.1   1933   No protein  -                     Retained intron  -          -               TSL:1
Hecw2-206  ENSMUST00000152870.1   665    No protein  -                     lncRNA           -          -               TSL:2
Hecw2-205  ENSMUST00000150677.7   363    No protein  -                     lncRNA           -          -               TSL:3


[Figure: Ensembl genome browser view of the Hecw2 locus, 408.29 kb, spanning 53.8 Mb to 54.2 Mb on chromosome 1. Forward strand: Mir7681-201 (miRNA). Contigs: AC150896.6, AC136517.3. Reverse strand: Hecw2-203, Hecw2-201 and Hecw2-202 (protein coding), Hecw2-204 (retained intron), Hecw2-205 and Hecw2-206 (lncRNA), Gm24251-201 (snRNA), Gm37633-201 (TEC). Regulatory Build legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (Ensembl protein coding); non-protein coding (RNA gene, processed transcript).]


Transcript: ENSMUST00000087659

[Figure: Ensembl protein summary for Hecw2-201 (protein coding, reverse strand, 388.15 kb), protein ENSMUSP00000084..., scale 0 to 1578 aa. Annotation tracks:
MobiDB lite, Low complexity (Seg), Coiled-coils (Ncoils)
Superfamily: SSF49562; WW domain superfamily; HECT, E3 ligase catalytic domain
SMART: C2 domain; WW domain; HECT domain
Pfam: E3 ubiquitin-protein ligase HECW1/2, N-terminal; WW domain; HECT domain; C2 domain; E3 ubiquitin-protein ligase HECW1, helical box domain
PROSITE profiles: C2 domain; WW domain; HECT domain
PROSITE patterns: WW domain
PANTHER: PTHR11254:SF127; PTHR11254
Gene3D: C2 domain superfamily; 2.20.70.10; 3.90.1750.10; 3.30.2410.10; 2.60.40.2840; 3.30.2160.10
CDD: E3 ubiquitin-protein ligase HECW, C2 domain; WW domain; HECT domain
Sequence variants (dbSNP and all other sources); variant legend: stop gained, missense variant, splice region variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
