https://www.alphaknockout.com

Mouse Ahnak Conditional Knockout Project (CRISPR/Cas9)

Objective: To create an Ahnak conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Ahnak gene (NCBI Reference Sequence: NM_009643; Ensembl: ENSMUSG00000069833) is located on mouse chromosome 19. Five exons are identified, with the ATG start codon in exon 3 and the TAG stop codon in exon 5 (Transcript: ENSMUST00000092956). Exon 4 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Ahnak gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-241A8 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for one knock-out allele exhibit decreased T cell proliferation and increased susceptibility to parasitic infection.
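The conversion of the floxed (targeted) allele into the constitutive KO allele can be illustrated with a minimal sketch. It assumes the canonical 34 bp loxP sequence and two loxP sites in the same orientation; the flanking and exon strings are toy placeholders, not Ahnak sequence, and the function name `cre_excise` is hypothetical.

```python
# Canonical 34 bp loxP site: 13 bp inverted repeat, 8 bp spacer, 13 bp inverted repeat.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(allele: str, loxp: str = LOXP) -> str:
    """Simulate Cre excision between the first two same-orientation loxP sites.

    Returns the allele unchanged if fewer than two loxP sites are present.
    """
    first = allele.find(loxp)
    if first == -1:
        return allele
    second = allele.find(loxp, first + len(loxp))
    if second == -1:
        return allele
    # Keep everything through the first loxP, then resume after the second loxP,
    # so exactly one loxP site remains at the junction.
    return allele[: first + len(loxp)] + allele[second + len(loxp):]


# Toy sequences (placeholders only, assumed for illustration).
upstream = "GGGCCC"
floxed_region = "ATGAAATTT"
downstream = "CCCGGG"

targeted_allele = upstream + LOXP + floxed_region + LOXP + downstream
ko_allele = cre_excise(targeted_allele)
print(len(targeted_allele), len(ko_allele))
```

In this toy model the KO allele is shorter than the targeted allele by the floxed region plus one loxP site, which is the size difference a genotyping PCR across the region would be expected to detect.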

Exon 4 starts from about 0.93% of the coding region. The knockout of exon 4 will result in a frameshift of the gene. Intron 3, which receives the 5'-loxP site, is 419 bp; intron 4, which receives the 3'-loxP site, is 835 bp. The effective cKO region is ~647 bp and contains no other known gene.
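As a rough check on these figures, the sketch below estimates where 0.93% of the coding region falls and shows the usual frameshift rule (a deleted coding length that is not a multiple of 3 shifts the reading frame). It assumes the 5656 aa protein length listed for ENSMUST00000092956 in this report and counts the stop codon as part of the coding region; the exact coding length of exon 4 is not stated here, so the frameshift helper is shown only as a general rule. Function names are hypothetical.

```python
PROTEIN_LENGTH_AA = 5656     # protein length reported for ENSMUST00000092956
EXON4_START_FRACTION = 0.0093  # "about 0.93% of the coding region"


def cds_length_nt(protein_aa: int, include_stop: bool = True) -> int:
    """Coding sequence length in nucleotides (assumption: stop codon counted)."""
    return (protein_aa + (1 if include_stop else 0)) * 3


def causes_frameshift(deleted_coding_nt: int) -> bool:
    """A deletion shifts the frame when its coding length is not a multiple of 3."""
    return deleted_coding_nt % 3 != 0


cds_nt = cds_length_nt(PROTEIN_LENGTH_AA)
approx_offset_nt = EXON4_START_FRACTION * cds_nt
print(f"CDS length: {cds_nt} nt; exon 4 begins near coding nt {approx_offset_nt:.0f}")
```

Under these assumptions exon 4 begins within roughly the first 160 coding nucleotides, so a frameshift introduced there would disrupt nearly the entire protein.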


Overview of the Targeting Strategy

Figure: schematic of the targeting strategy, showing the wildtype allele (exons 1 to 5, with the 5' and 3' gRNA regions), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination).

Legend: exon of mouse Ahnak; homology arm; cKO region; loxP site.


Overview of the Dot Plot

Window size: 10 bp

Figure: dot plot of the sequence against itself, showing forward and reverse complement matches.

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
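A self-alignment dot plot of this kind can be sketched as follows. This is a minimal exact-match version using the 10 bp window stated above; the tool that produced the report's plot may score matches differently, and the function names here are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (A/C/G/T/N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def dot_plot_matches(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) dot coordinates.

    forward: seq[i:i+window] == seq[j:j+window] with i < j (off-diagonal repeats).
    reverse: seq[i:i+window] == reverse complement of seq[j:j+window].
    """
    seq = seq.upper()
    n = len(seq)

    # Index every window-length word by its start positions.
    index = {}
    for i in range(n - window + 1):
        index.setdefault(seq[i:i + window], []).append(i)

    forward = sorted(
        (i, j) for positions in index.values()
        for i in positions for j in positions if i < j
    )

    reverse = []
    rc = reverse_complement(seq)
    for k in range(n - window + 1):
        # Window k of the reverse complement corresponds to seq[j:j+window].
        j = n - window - k
        for i in index.get(rc[k:k + window], []):
            reverse.append((i, j))

    return forward, sorted(reverse)


# Toy example: a 10 bp word repeated after a 4 bp spacer.
forward, reverse = dot_plot_matches("ACGTTGCAAGTTTTACGTTGCAAG", window=10)
print(forward)
```

Runs of consecutive off-diagonal forward dots would indicate direct or tandem repeats; runs of reverse dots would indicate inverted repeats.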

Overview of the GC Content Distribution

Window size: 300 bp

Figure: GC content distribution along the sequence.

Summary: Full length (7106 bp) | A (24.53%, 1743) | C (21.66%, 1539) | T (25.95%, 1844) | G (27.86%, 1980)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
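The summary counts above give the overall GC content directly, and a windowed profile can be computed the same way. The sketch below uses the base counts from the summary and a simple sliding window; the step size and function names are assumptions for illustration, and the toy string is not Ahnak sequence.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def sliding_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """GC fraction for each full window along the sequence."""
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts taken from the report's summary line.
COUNTS = {"A": 1743, "C": 1539, "T": 1844, "G": 1980}
total_bp = sum(COUNTS.values())
overall_gc = (COUNTS["G"] + COUNTS["C"]) / total_bp
print(f"{total_bp} bp, overall GC = {overall_gc:.2%}")

# Toy windowed profile (placeholder sequence, window of 4 for readability).
toy_profile = sliding_gc("GGCCAATT", window=4, step=2)
print(toy_profile)
```

The counts sum to the stated full length of 7106 bp, and the overall GC content comes out close to 50%, consistent with the note that no high-GC region stands out.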


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART    TEND      SPAN
YourSeq  3000   1       3000  3000   100.0%    chr19  +       8997468   9000467   3000
YourSeq  40     754     803   3000   84.5%     chr17  -       81064051  81064097  47
YourSeq  33     762     796   3000   100.0%    chrX   +       65835044  65835086  43
YourSeq  30     760     789   3000   100.0%    chr14  -       57321072  57321101  30
YourSeq  29     754     789   3000   87.9%     chr12  -       45882163  45882197  35
YourSeq  29     764     795   3000   96.9%     chr18  +       64607514  64607546  33
YourSeq  28     764     794   3000   96.7%     chr18  -       14008845  14008883  39
YourSeq  27     752     814   3000   61.6%     chr2   -       6476255   6476306   52
YourSeq  22     2820    2841  3000   100.0%    chr13  +       51588055  51588076  22

Note: The 3000 bp section upstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART   TEND     SPAN
YourSeq  3000   1       3000  3000   100.0%    chr19  +       9001115  9004114  3000
YourSeq  501    1512    2995  3000   81.9%     chr19  +       9004306  9015155  10850
YourSeq  396    2325    2996  3000   81.9%     chr19  +       9004180  9012078  7899
YourSeq  387    2071    2965  3000   81.6%     chr19  +       9009533  9013733  4201
YourSeq  379    2283    3000  3000   79.4%     chr19  +       9007807  9011476  3670
YourSeq  357    2317    3000  3000   79.7%     chr19  +       9006632  9009358  2727
YourSeq  318    2061    3000  3000   83.0%     chr19  +       9004494  9009736  5243
YourSeq  256    2073    2833  3000   82.7%     chr19  +       9011962  9015155  3194
YourSeq  248    2337    2835  3000   82.6%     chr19  +       9005695  9012841  7147
YourSeq  244    2422    2893  3000   77.9%     chr19  +       9004537  9010985  6449
YourSeq  187    2421    2698  3000   85.0%     chr19  +       9011659  9014072  2414
YourSeq  185    2337    2995  3000   92.0%     chr19  +       9004939  9010670  5732
YourSeq  185    2457    3000  3000   84.0%     chr19  +       9004312  9006373  2062
YourSeq  178    2060    2572  3000   86.9%     chr19  +       9012792  9015116  2325
YourSeq  148    2460    2827  3000   91.2%     chr19  +       9005839  9012995  7157
YourSeq  106    2428    2567  3000   91.5%     chr19  +       9013352  9013719  368

Note: The 3000 bp section downstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
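The top hit in each BLAT table is the query mapping back to its own locus on chr19, so the two self-hits can be used to cross-check the size of the region between the flanks. The sketch below assumes the coordinates are 1-based and inclusive, as BLAT browser results are usually displayed; the helper names are hypothetical.

```python
# Self-hit coordinates on chr19 from the two BLAT tables (top row of each).
UPSTREAM_FLANK = (8997468, 9000467)
DOWNSTREAM_FLANK = (9001115, 9004114)


def span(interval) -> int:
    """Length of a 1-based inclusive interval."""
    start, end = interval
    return end - start + 1


def gap_between(left, right) -> int:
    """Number of bases strictly between two non-overlapping intervals."""
    return right[0] - left[1] - 1


gap_bp = gap_between(UPSTREAM_FLANK, DOWNSTREAM_FLANK)
print(span(UPSTREAM_FLANK), span(DOWNSTREAM_FLANK), gap_bp)
```

Each flank spans 3000 bp, and the interval left between them is 647 bp, which agrees with the ~647 bp effective cKO region given in the strategy summary.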


Gene and information:

Ahnak AHNAK nucleoprotein (desmoyokin) [Mus musculus (house mouse)]
Gene ID: 66395, updated on 12-Aug-2019

Gene summary

Official Symbol: Ahnak (provided by MGI)
Official Full Name: AHNAK nucleoprotein (desmoyokin) (provided by MGI)
Primary source: MGI:MGI:1316648
See related: Ensembl:ENSMUSG00000069833
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: DY6
Expression: Biased expression in bladder adult (RPKM 56.2), lung adult (RPKM 31.1) and 13 other tissues
Orthologs: human, all

Genomic context

Location: 19; 19 A

Exon count: 12

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  19   NC_000085.6 (8989284..9076935)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    19   NC_000085.5 (9063774..9151409)

Chromosome 19 - NC_000085.6


Transcript information: This gene has 5 transcripts

Gene: Ahnak ENSMUSG00000069833

Description: AHNAK nucleoprotein (desmoyokin) [Source:MGI Symbol;Acc:MGI:1316648]
Gene Synonyms: 1110004P15Rik, 2310047C17Rik, DY6
Location: Chromosome 19: 8,989,284-9,076,914 forward strand. GRCm38:CM001012.2
About this gene: This gene has 5 transcripts (splice variants), 175 orthologues, 2 paralogues, is a member of 1 Ensembl protein family and is associated with 14 phenotypes.

Transcripts

Name       Transcript ID          bp     Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Ahnak-202  ENSMUST00000092956.3   18100  5656aa      ENSMUSP00000090633.2  Protein coding           CCDS29564  E9Q616      TSL:1, GENCODE basic
Ahnak-201  ENSMUST00000092955.10  830    150aa       ENSMUSP00000090632.3  Protein coding           CCDS50384  G5E8K8      TSL:2, GENCODE basic, APPRIS P1
Ahnak-203  ENSMUST00000236390.1   2200   311aa       ENSMUSP00000158470.1  Protein coding           -          A0A494BBD5  CDS 3' incomplete
Ahnak-205  ENSMUST00000237912.1   1188   176aa       ENSMUSP00000157404.1  Nonsense mediated decay  -          A0A494B8Y7  -
Ahnak-204  ENSMUST00000237033.1   989    No protein  -                     lncRNA                   -          -           -


Figure: Ensembl genome browser view of a 107.63 kb region of chromosome 19 (8.98 Mb to 9.08 Mb). Forward strand transcripts: Ahnak-205 (nonsense mediated decay), Ahnak-202 (protein coding), Ahnak-201 (protein coding), Ahnak-203 (protein coding), Ahnak-204 (lncRNA). Contig: AC130819.4. Reverse strand gene: Scgb1a1-201 (protein coding). A Regulatory Build track is also shown.

Regulation legend: CTCF; enhancer; open chromatin; promoter; promoter flank; transcription factor binding site.

Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (processed transcript; RNA gene).


Transcript: ENSMUST00000092956

Figure: Ahnak-202 (protein coding), 29.91 kb, forward strand, with protein annotation tracks for ENSMUSP00000090...: MobiDB lite; low complexity (Seg); coiled-coils (Ncoils); Superfamily (PDZ superfamily); SMART (PDZ domain); Pfam (PDZ domain); PROSITE profiles (PDZ domain); PANTHER (PTHR23348, PTHR23348:SF41); Gene3D (2.30.42.10). Sequence variants (dbSNP and all other sources) are shown along the protein; scale bar 0 to 5656.

Variant legend: inframe deletion; missense variant; synonymous variant.

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
