https://www.alphaknockout.com

Mouse Hars2 Knockout Project (CRISPR/Cas9)

Objective: To create a Hars2 knockout Mouse model (C57BL/6N) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Hars2 gene (NCBI Reference Sequence: NM_080636; Ensembl: ENSMUSG00000019143) is located on mouse chromosome 18. Thirteen exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 13 (Transcript: ENSMUST00000152954). Exons 2~13 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts from about 6.8% of the coding region. Exons 2~13 cover 93.27% of the coding region. The size of the effective KO region is ~5562 bp. The KO region contains no other known gene.
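The percentages in the note can be cross-checked with simple arithmetic. The sketch below assumes the coding sequence is 505 codons long (the protein length listed for ENSMUST00000152954 later in this report), counted as 1515 nt without the stop codon. That counting convention is an assumption; the report does not state how the percentages were derived.

```python
# Cross-check of the coding-region percentages quoted in the note.
# Assumption: CDS length = 505 codons x 3 nt, stop codon excluded.
CDS_NT = 505 * 3

covered_pct = 93.27                          # exons 2~13, from the note
upstream_pct = round(100 - covered_pct, 2)   # share of the CDS before exon 2

# Approximate number of coding nucleotides that remain in exon 1.
upstream_nt = round(CDS_NT * upstream_pct / 100)

print(upstream_pct)  # percent of the coding region upstream of exon 2
print(upstream_nt)   # approximate coding nt left in exon 1
```

Under this assumption roughly 102 coding nucleotides (34 codons) lie upstream of exon 2, which is in line with the "about 6.8%" figure.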


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (5' to 3') showing exons 1 to 13 of mouse Hars2 and two gRNA regions. Legend: exon of mouse Hars2; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: self-alignment dot plot. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: self-alignment dot plot. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section downstream of stop codon is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
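The self-alignment behind a dot plot can be sketched as an exact word match of the sequence against itself. The function below is a minimal illustration, not the tool used for this report: it handles forward matches only, and the name `self_dot_plot` and the exact-match criterion are assumptions.

```python
def self_dot_plot(seq, window=15):
    """Return sorted (i, j) pairs, i < j, where the words of length
    `window` starting at i and j are identical (forward matches only)."""
    seq = seq.upper()
    positions = {}
    # Index every word of length `window` by its start position.
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    hits = []
    # Any word seen at more than one position is an off-diagonal dot.
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


# Toy example with a short window: "ACGT" occurs at positions 0 and 4.
toy_hits = self_dot_plot("ACGTACGT", window=4)
print(toy_hits)  # [(0, 4)]
```

Runs of off-diagonal hits at a constant offset `j - i` correspond to the tandem repeats that the dot plot is inspected for. Reverse-complement matches would need a second pass over the reverse-complemented sequence.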


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution plot.]

Summary: Full Length(2000bp) | A(22.85% 457) | C(19.65% 393) | T(34.75% 695) | G(22.75% 455)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
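A windowed GC profile of the kind plotted here can be sketched as follows. The 300 bp window comes from this report; the step size, the function name `gc_windows`, and the handling of short sequences are assumptions made for illustration.

```python
def gc_windows(seq, window=300, step=1):
    """GC fraction of each `window`-nt window, advancing `step` nt at a time.

    Returns an empty list if the sequence is shorter than one window.
    """
    seq = seq.upper()
    return [
        sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


# Toy example with a 4-nt window moved 2 nt at a time.
toy_profile = gc_windows("GGCCAATT", window=4, step=2)
print(toy_profile)  # [1.0, 0.5, 0.0]
```

A region would be flagged for closer inspection when the profile stays high over many consecutive windows; the cutoff for "high" is not given in the report.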

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution plot.]

Summary: Full Length(2000bp) | A(30.4% 608) | C(19.85% 397) | T(26.7% 534) | G(23.05% 461)

Note: The 2000 bp section downstream of stop codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
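The overall GC content of each 2000 bp flank follows directly from the base counts in the two Summary lines. The short calculation below uses only those counts; the variable and function names are illustrative.

```python
# Base counts copied from the two Summary lines of this report.
summaries = {
    "upstream": {"A": 457, "C": 393, "T": 695, "G": 455},
    "downstream": {"A": 608, "C": 397, "T": 534, "G": 461},
}


def gc_percent(counts):
    """Percentage of G and C among all counted bases."""
    total = sum(counts.values())
    return 100 * (counts["G"] + counts["C"]) / total


overall = {name: round(gc_percent(c), 2) for name, c in summaries.items()}
print(overall)  # {'upstream': 42.4, 'downstream': 42.9}
```

Both flanks come out at about 42 to 43% GC overall.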


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
------------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr18  +       36783566   36785565   2000
YourSeq  223    1250   1765  2000   94.2%     chr11  -       72555350   72555931   582
YourSeq  221    1245   1768  2000   90.1%     chr16  +       17231542   17231988   447
YourSeq  208    1245   1750  2000   88.6%     chr8   -       13970336   13970601   266
YourSeq  198    1333   1768  2000   87.9%     chr7   +       27630398   27630785   388
YourSeq  197    1245   1750  2000   85.4%     chr14  -       63518559   63518828   270
YourSeq  195    1186   1751  2000   91.5%     chr7   +       28397022   28397597   576
YourSeq  191    1560   1767  2000   94.5%     chr9   +       6237817    6238014    198
YourSeq  187    1560   1750  2000   99.0%     chr6   +       47857719   47857909   191
YourSeq  186    1564   1765  2000   97.0%     chr7   +       31493942   31494142   201
YourSeq  184    1245   1750  2000   86.1%     chr5   -       121563921  121564167  247
YourSeq  183    1560   1750  2000   98.0%     chr8   -       119180199  119180389  191
YourSeq  183    1560   1750  2000   98.0%     chr3   -       10432318   10432508   191
YourSeq  183    1142   1728  2000   93.4%     chr18  -       38606801   38607421   621
YourSeq  183    1560   1750  2000   96.9%     chr9   +       114502463  114502652  190
YourSeq  183    1560   1750  2000   96.9%     chr6   +       113175930  113176119  190
YourSeq  183    1559   1750  2000   98.0%     chr11  +       49810673   49810866   194
YourSeq  182    1563   1750  2000   98.5%     chr4   +       132798935  132799122  188
YourSeq  182    1562   1750  2000   96.8%     chr3   +       88154076   88154262   187
YourSeq  182    1560   1750  2000   98.0%     chr18  +       56582308   56582502   195

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
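One simple way to read a BLAT hit list is to set aside the full-length self hit and compare the best remaining score with the query size. The sketch below does this for the first rows of the upstream table; the helper name `off_target_hits` and the score-to-query-size ratio are illustrative choices, not the criterion used in this report.

```python
# (score, q_start, q_end, identity_pct, chrom): first rows of the upstream table.
hits = [
    (2000, 1, 2000, 100.0, "chr18"),
    (223, 1250, 1765, 94.2, "chr11"),
    (221, 1245, 1768, 90.1, "chr16"),
    (208, 1245, 1750, 88.6, "chr8"),
]
QSIZE = 2000


def off_target_hits(rows, qsize):
    """Drop the full-length 100% self hit and express each remaining
    score as a fraction of the query size (rounded to 2 decimals)."""
    return [
        (chrom, score, round(score / qsize, 2))
        for score, q_start, q_end, identity, chrom in rows
        if not (score == qsize and identity == 100.0)
    ]


best = max(off_target_hits(hits, QSIZE), key=lambda hit: hit[1])
print(best)  # ('chr11', 223, 0.11)
```

Here the best secondary hit scores about 11% of the full-length self match.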

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
------------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr18  +       36791128   36793127   2000
YourSeq  727    1      921   2000   89.9%     chrX   +       59197642   59198576   935
YourSeq  79     707    908   2000   76.2%     chr13  -       90953566   90953730   165
YourSeq  71     742    867   2000   77.8%     chr6   -       82299498   82299616   119
YourSeq  64     745    908   2000   87.1%     chr19  -       45195626   45195790   165
YourSeq  60     708    1298  2000   65.8%     chr19  +       58136336   58136518   183
YourSeq  58     703    808   2000   86.3%     chr1   -       178428123  178428584  462
YourSeq  57     700    839   2000   80.5%     chr6   -       38170533   38170671   139
YourSeq  57     703    809   2000   78.7%     chr19  -       8842446    8842550    105
YourSeq  57     742    1756  2000   78.8%     chr12  +       3632409    3875906    243498
YourSeq  56     686    809   2000   93.8%     chr5   +       144048282  144048406  125
YourSeq  56     741    838   2000   87.0%     chr15  +       94574163   94574259   97
YourSeq  56     789    1240  2000   75.8%     chr1   +       180627564  180627987  424
YourSeq  55     703    843   2000   91.9%     chr3   -       54594096   54594235   140
YourSeq  53     700    851   2000   65.3%     chr4   -       122906532  122906631  100
YourSeq  53     704    806   2000   75.8%     chr13  -       92687925   92688027   103
YourSeq  53     1470   1767  2000   66.7%     chr17  +       35689357   35689440   84
YourSeq  53     700    809   2000   96.7%     chr11  +       31812810   31812919   110
YourSeq  52     605    841   2000   94.9%     chr1   -       37793615   37794183   569
YourSeq  52     734    808   2000   92.0%     chr5   +       25718962   25719037   76

Note: The 2000 bp section downstream of stop codon is BLAT searched against the genome. No significant similarity is found.
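The chr18 coordinates of the two full-length self hits also allow a cross-check of the KO region size. The calculation below assumes that the coordinates are 1-based and inclusive and that the two 2000 bp flanks directly abut the targeted region; both points are assumptions, although the self-hit spans of exactly 2000 are consistent with the first.

```python
# chr18 (+ strand) target coordinates of the self hits in the two BLAT tables.
upstream_flank = (36783566, 36785565)     # 2000 bp upstream of exon 2
downstream_flank = (36791128, 36793127)   # 2000 bp downstream of the stop codon


def span(start, end):
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


# Region enclosed by the flanks (assumed to abut it directly).
region_start = upstream_flank[1] + 1
region_end = downstream_flank[0] - 1

print(region_start, region_end, span(region_start, region_end))
# 36785566 36791127 5562
```

The enclosed interval is 5562 bp, matching the effective KO region size of ~5562 bp given in the strategy summary.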


Gene and information:

Hars2 histidyl-tRNA synthetase 2 [Mus musculus (house mouse)]
Gene ID: 70791, updated on 12-Aug-2019

Gene summary

Official Symbol: Hars2 (provided by MGI)
Official Full Name: histidyl-tRNA synthetase 2 (provided by MGI)
Primary source: MGI:MGI:1918041
See related: Ensembl:ENSMUSG00000019143
Gene type: protein coding
RefSeq status: REVIEWED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: HO3; HARSR; Harsl; 4631412B19Rik
Summary: This gene encodes a putative member of the class II family of aminoacyl-tRNA synthetases. These enzymes play a critical role in protein biosynthesis by charging tRNAs with their cognate amino acids. This protein is encoded by the nuclear genome but is likely to be imported to the mitochondrion, where it is thought to catalyze the ligation of histidine to tRNA molecules. Mutations in a similar gene in human have been associated with Perrault syndrome 2 (PRLTS2). [provided by RefSeq, Mar 2015]
Expression: Ubiquitous expression in CNS E11.5 (RPKM 11.8), CNS E14 (RPKM 10.4) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 18; 18 B2

Exon count: 14

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  18   NC_000084.6 (36783202..36792562)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    18   NC_000084.5 (36942934..36952216)

Chromosome 18 - NC_000084.6


Transcript information: This gene has 7 transcripts

Gene: Hars2 ENSMUSG00000019143

Description: histidyl-tRNA synthetase 2 [Source:MGI Symbol;Acc:MGI:1918041]
Gene Synonyms: 4631412B19Rik, HARSR, HO3, Harsl
Location: Chromosome 18: 36,783,008-36,792,562 forward strand. GRCm38:CM001011.2
About this gene: This gene has 7 transcripts (splice variants), 225 orthologues, 1 paralogue and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Hars2-206  ENSMUST00000152954.7  3336  505aa       ENSMUSP00000117231.1  Protein coding   CCDS29165  Q99KK9   TSL:1 GENCODE basic APPRIS P1
Hars2-201  ENSMUST00000019287.8  2002  424aa       ENSMUSP00000019287.8  Protein coding   CCDS84376  G5E823   TSL:1 GENCODE basic
Hars2-203  ENSMUST00000131952.1  819   No protein  -                     Retained intron  -          -        TSL:5
Hars2-205  ENSMUST00000145876.1  772   No protein  -                     Retained intron  -          -        TSL:2
Hars2-202  ENSMUST00000124204.1  769   No protein  -                     Retained intron  -          -        TSL:1
Hars2-207  ENSMUST00000155842.1  406   No protein  -                     Retained intron  -          -        TSL:5
Hars2-204  ENSMUST00000134122.7  896   No protein  -                     lncRNA           -          -        TSL:3


[Figure: Ensembl genome browser view of a 29.55 kb region (36.78 Mb to 36.80 Mb).
Forward strand: Hars2-206 (protein coding), Hars2-201 (protein coding), Hars2-204 (lncRNA), Hars2-202, Hars2-203, Hars2-205 and Hars2-207 (retained intron), Zmat2-201 (protein coding), Zmat2-202 (retained intron), Vaultrc5-201 (misc RNA).
Contig: AC027740.11.
Reverse strand: Hars-201 (protein coding), Hars-204 (retained intron), Hars-203 (nonsense mediated decay).
Regulatory Build legend: CTCF; Open Chromatin; Promoter; Promoter Flank.
Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (processed transcript; RNA gene).]


Transcript: ENSMUST00000152954

[Figure: Ensembl protein summary for Hars2-206 (protein coding; 9.55 kb, forward strand; ENSMUSP00000117...; scale bar 0 to 505). Annotation tracks:
Low complexity (Seg)
Cleavage site (Sign...)
TIGRFAM: Histidine-tRNA ligase
Superfamily: SSF55681; SSF52954
Pfam: Class II Histidinyl-tRNA synthetase (HisRS)-like catalytic core domain; Anticodon-binding
PROSITE profiles: Aminoacyl-tRNA synthetase, class II
PIRSF: Histidine-tRNA ligase/ATP phosphoribosyltransferase regulatory subunit
PANTHER: PTHR11476; PTHR11476:SF6
Gene3D: 3.30.930.10; Anticodon-binding domain superfamily
CDD: Class II Histidinyl-tRNA synthetase (HisRS)-like catalytic core domain; Histidyl-anticodon-binding
Sequence variants (dbSNP and all other sources). Variant legend: missense variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
