https://www.alphaknockout.com

Mouse Ly6e Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Ly6e conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Ly6e gene (NCBI Reference Sequence: NM_001164036; Ensembl: ENSMUSG00000022587) is located on mouse chromosome 15. Four exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 4 (Transcript: ENSMUST00000051698). Exons 2~4 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Ly6e gene. To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP24-400G11 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for disruptions in this gene die as embryos as a result of heart defects.

Exons 2~4 cover 100.0% of the coding region. The start codon is in exon 2, and the stop codon is in exon 4. The size of intron 1 for 5'-loxP site insertion: 2059 bp. The size of the effective cKO region: ~2848 bp. The cKO region does not contain any other known gene.

Overview of the Targeting Strategy

[Figure: schematic of four alleles. Wildtype allele (5' to 3', exons 1 to 4, with the two gRNA regions marked); targeting vector; targeted allele; constitutive KO allele (after Cre recombination). Legend: exon of mouse Ly6e; homology arm; cKO region; loxP site.]
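
The step from the targeted allele to the constitutive KO allele can be illustrated with a minimal Python sketch. It assumes two loxP sites in the same orientation flanking the cKO region, so that Cre recombination removes the floxed segment and leaves a single loxP site. The arm and exon strings are toy placeholders, not Ly6e sequence, and the function name `cre_excise` is hypothetical.

```python
# Canonical 34 bp loxP site: 13 bp inverted repeat, 8 bp spacer, 13 bp inverted repeat.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def cre_excise(allele: str, loxp: str = LOXP) -> str:
    """Model Cre recombination between two same-orientation loxP sites.

    The segment between the sites and one loxP copy are removed,
    so exactly one loxP site remains in the returned sequence.
    """
    first = allele.find(loxp)
    if first == -1:
        return allele  # no loxP site at all
    second = allele.find(loxp, first + len(loxp))
    if second == -1:
        return allele  # only one loxP site: nothing to excise
    # Keep everything before the first site, then resume at the second site.
    return allele[:first] + allele[second:]


# Toy sequences (assumptions for illustration only).
five_prime_arm = "AAAAAAAAAA"
floxed_region = "CCGGCCGGCC"   # stands in for exons 2~4
three_prime_arm = "TTTTTTTTTT"

targeted = five_prime_arm + LOXP + floxed_region + LOXP + three_prime_arm
knockout = cre_excise(targeted)
```

In this toy model `knockout` equals the 5' arm, one loxP site, and the 3' arm joined together, which mirrors the constitutive KO allele in the schematic.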

Overview of the Dot Plot

[Figure: self-alignment dot plot of Sequence 12 against itself, showing forward and reverse-complement matches. Window size: 10 bp.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
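
The self-alignment described in the note can be approximated with a short Python sketch. It is an exact-match word comparison, not the tool that produced the report's dot plot: every window of the sequence is compared against every other window, and matches off the main diagonal indicate repeats. The function names are hypothetical, and the example uses a toy sequence with a 4 bp window rather than the 10 bp window of the report.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of matching window start positions.

    forward: pairs (i, j) with i < j whose windows are identical.
    reverse: pairs (i, j) where the window at j equals the reverse
             complement of the window at i.
    """
    index = defaultdict(list)
    n_windows = len(seq) - window + 1
    for i in range(n_windows):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(n_windows):
        word = seq[i:i + window]
        forward.extend((i, j) for j in index[word] if j > i)
        reverse.extend((i, j) for j in index.get(reverse_complement(word), []))
    return forward, reverse


# Toy sequence: the 8 bp unit ATGCCGTA occurs twice, separated by TTT.
toy = "ATGCCGTA" + "TTT" + "ATGCCGTA"
forward_hits, reverse_hits = self_dot_plot(toy, window=4)
```

A run of consecutive forward hits at a constant offset (here an offset of 11) is the diagonal line that a direct repeat produces in a dot plot.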

Overview of the GC Content Distribution

[Figure: GC content distribution along Sequence 12. Window size: 300 bp.]

Summary: Full Length(7403bp) | A(23.41% 1733) | C(26.89% 1991) | T(23.3% 1725) | G(26.39% 1954)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. Significant high GC-content regions are found. It may be difficult to construct this targeting vector.
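
As a worked check of the summary, the overall GC content follows directly from the base counts, and a windowed profile can be computed the same way. The Python sketch below uses the counts from the summary line; the `sliding_gc` helper is a hypothetical illustration of a 300 bp window scan and is not the tool used to draw the report's plot.

```python
def gc_fraction(counts: dict) -> float:
    """Fraction of G and C among all counted bases."""
    total = sum(counts.values())
    return (counts["G"] + counts["C"]) / total


def sliding_gc(seq: str, window: int = 300, step: int = 1):
    """Yield (start, GC fraction) for every full window of the sequence."""
    seq = seq.upper()
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        yield start, (chunk.count("G") + chunk.count("C")) / window


# Base counts taken from the summary line of this report.
counts = {"A": 1733, "C": 1991, "T": 1725, "G": 1954}
overall = gc_fraction(counts)  # (1991 + 1954) / 7403
```

The four counts sum to the stated full length of 7403 bp, and the overall GC content works out to about 53.29%.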

BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr15  +       74954559   74957558   3000
YourSeq  48     1941   1990  3000   100.0%    chr11  -       114639306  114639361  56
YourSeq  47     1090   1151  3000   89.1%     chr12  +       72620675   72620735   61
YourSeq  40     1942   1983  3000   97.7%     chr4   -       115059026  115059067  42
YourSeq  35     1955   1993  3000   97.4%     chr8   -       78933551   78933592   42
YourSeq  35     1949   1983  3000   100.0%    chr15  +       53695812   53695846   35
YourSeq  29     1949   1994  3000   74.2%     chr16  -       63633001   63633037   37
YourSeq  26     952    992   3000   85.8%     chr11  -       79778154   79778192   39
YourSeq  26     1969   1994  3000   100.0%    chr10  +       13711232   13711257   26
YourSeq  23     1750   1773  3000   100.0%    chr2   -       143846166  143846191  26
YourSeq  22     778    799   3000   100.0%    chr9   -       63963955   63963976   22
YourSeq  22     2491   2515  3000   96.0%     chr10  +       80210380   80210410   31
YourSeq  21     1733   1753  3000   100.0%    chr6   -       34953021   34953041   21

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr15  +       74958712   74961711   3000
YourSeq  513    1      597   3000   93.4%     chr12  +       72621140   72621727   588
YourSeq  382    11     618   3000   84.7%     chr11  +       73390328   73390924   597
YourSeq  50     849    1199  3000   76.0%     chrX   +       94151372   94151692   321
YourSeq  49     1163   1285  3000   92.9%     chr8   -       79546868   79547051   184
YourSeq  44     1164   1219  3000   95.9%     chr11  +       99162337   99162395   59
YourSeq  43     1154   1201  3000   95.8%     chr10  -       39005036   39005095   60
YourSeq  43     1114   1198  3000   95.8%     chr15  +       103251505  103251632  128
YourSeq  42     1163   1214  3000   95.7%     chr2   -       169092828  169092881  54
YourSeq  42     1163   1215  3000   95.7%     chr17  +       55832661   55832713   53
YourSeq  41     1098   1200  3000   91.9%     chr11  -       119657074  119657204  131
YourSeq  41     1154   1201  3000   93.7%     chr6   +       124890191  124890250  60
YourSeq  39     1163   1213  3000   81.3%     chrX   -       21277250   21277297   48
YourSeq  39     1157   1201  3000   90.7%     chr16  -       47607126   47607169   44
YourSeq  39     1157   1201  3000   88.7%     chr11  -       86187292   86187335   44
YourSeq  38     1165   1204  3000   97.5%     chr6   -       22581973   22582012   40
YourSeq  38     1164   1203  3000   97.5%     chr5   +       20101283   20101322   40
YourSeq  38     1157   1200  3000   87.5%     chr14  +       100438852  100438892  41
YourSeq  38     1164   1201  3000   100.0%    chr14  +       61781759   61781796   38
YourSeq  37     1151   1196  3000   95.2%     chr8   -       8339976    8340033    58

Note: The 3000 bp section downstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
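
For readers who want to re-examine the hit lists, the Python sketch below parses BLAT result rows into records and separates the self-hit from the secondary hits. The four sample rows are copied from the downstream search; the `BlatHit` type and `parse_hit` function are hypothetical names, and the rule used to identify the self-hit (score equal to query size) is an assumption for illustration. Which secondary hits count as significant remains the report's judgement.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one whitespace-separated BLAT row (without the query name)."""
    f = line.split()
    return BlatHit(
        int(f[0]), int(f[1]), int(f[2]), int(f[3]),
        float(f[4].rstrip("%")), f[5], f[6],
        int(f[7]), int(f[8]), int(f[9]),
    )


# First four rows of the downstream search, copied from the report.
ROWS = """\
3000 1 3000 3000 100.0% chr15 + 74958712 74961711 3000
513 1 597 3000 93.4% chr12 + 72621140 72621727 588
382 11 618 3000 84.7% chr11 + 73390328 73390924 597
50 849 1199 3000 76.0% chrX + 94151372 94151692 321
"""

hits = [parse_hit(row) for row in ROWS.splitlines() if row.strip()]

# Assumption: the hit whose score equals the query size is the query itself.
secondary = sorted(
    (h for h in hits if h.score < h.q_size),
    key=lambda h: h.score,
    reverse=True,
)
best_secondary = secondary[0]
```

Sorting by score puts the strongest secondary match first, which makes it easy to compare its span and identity against the 3000 bp query.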

Gene and information: Ly6e lymphocyte antigen 6 complex, E [ Mus musculus (house mouse) ] Gene ID: 17069, updated on 24-Oct-2019

Gene summary

Official Symbol: Ly6e (provided by MGI)
Official Full Name: lymphocyte antigen 6 complex, locus E (provided by MGI)
Primary source: MGI:MGI:106651
See related: Ensembl:ENSMUSG00000022587
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Ly67; Tsa1; RIG-E; Sca-2; TSA-1
Expression: Broad expression in thymus adult (RPKM 431.0), spleen adult (RPKM 360.8) and 23 other tissues
Orthologs: human

Genomic context

Location: 15 D3; 15 34.29 cM

Exon count: 6

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  15   NC_000081.6 (74955016..74959905)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    15   NC_000081.5 (74785481..74790335)

[Figure: genomic context of Ly6e on Chromosome 15 - NC_000081.6]

Transcript information: This gene has 16 transcripts

Gene: Ly6e ENSMUSG00000022587

Description: lymphocyte antigen 6 complex, locus E [Source:MGI Symbol;Acc:MGI:106651]
Gene Synonyms: 9804, Ly67, RIG-E, Sca-2, TSA-1, Tsa1
Location: Chromosome 15: 74,955,051-74,959,905 forward strand. GRCm38:CM001008.2
About this gene: This gene has 16 transcripts (splice variants), 190 orthologues, 2 paralogues, is a member of 1 Ensembl protein family and is associated with 17 phenotypes.

Transcripts

Name Transcript ID bp Protein Translation ID Biotype CCDS UniProt Flags

Ly6e-201 ENSMUST00000051698.13 2281 136aa ENSMUSP00000056703.7 Protein coding CCDS27538 Q99JA5 TSL:1 GENCODE basic APPRIS P1

Ly6e-202 ENSMUST00000169343.7 2154 136aa ENSMUSP00000132081.1 Protein coding CCDS27538 Q99JA5 TSL:5 GENCODE basic APPRIS P1

Ly6e-210 ENSMUST00000188866.6 2068 136aa ENSMUSP00000140145.1 Protein coding CCDS27538 Q99JA5 TSL:3 GENCODE basic APPRIS P1

Ly6e-215 ENSMUST00000191436.6 1122 136aa ENSMUSP00000139549.1 Protein coding CCDS27538 Q99JA5 TSL:1 GENCODE basic APPRIS P1

Ly6e-203 ENSMUST00000185861.6 838 136aa ENSMUSP00000141145.1 Protein coding CCDS27538 Q99JA5 TSL:2 GENCODE basic APPRIS P1

Ly6e-207 ENSMUST00000187606.6 735 136aa ENSMUSP00000139471.1 Protein coding CCDS27538 Q99JA5 TSL:2 GENCODE basic APPRIS P1

Ly6e-208 ENSMUST00000188042.1 714 136aa ENSMUSP00000141059.1 Protein coding CCDS27538 Q99JA5 TSL:3 GENCODE basic APPRIS P1

Ly6e-206 ENSMUST00000187284.6 654 136aa ENSMUSP00000140553.1 Protein coding CCDS27538 Q99JA5 TSL:5 GENCODE basic APPRIS P1

Ly6e-212 ENSMUST00000190810.6 924 117aa ENSMUSP00000139482.1 Protein coding - A0A087WNT2 CDS 3' incomplete TSL:1

Ly6e-211 ENSMUST00000189186.6 795 63aa ENSMUSP00000139477.1 Protein coding - A0A087WNS9 CDS 3' incomplete TSL:1

Ly6e-204 ENSMUST00000185863.6 617 113aa ENSMUSP00000140060.1 Protein coding - A0A087WQ65 CDS 3' incomplete TSL:3

Ly6e-213 ENSMUST00000191127.6 575 119aa ENSMUSP00000139966.1 Protein coding - A0A087WPY4 CDS 3' incomplete TSL:3

Ly6e-214 ENSMUST00000191145.6 431 117aa ENSMUSP00000140829.1 Protein coding - A0A087WRZ2 CDS 3' incomplete TSL:3

Ly6e-209 ENSMUST00000188503.1 1211 No protein - Retained intron - - TSL:1

Ly6e-205 ENSMUST00000186927.1 1201 No protein - Retained intron - - TSL:5

Ly6e-216 ENSMUST00000191439.1 815 No protein - Retained intron - - TSL:NA

[Figure: Ensembl genomic view of the Ly6e locus, 24.86 kb, axis from 74.950 Mb to 74.965 Mb, forward and reverse strands. Tracks: the 16 Ly6e transcripts (protein coding: Ly6e-201, -202, -203, -204, -206, -207, -208, -210, -211, -212, -213, -214, -215; retained intron: Ly6e-205, -209, -216); contig AC124637.10; gene Gm29126-201 (unprocessed pseudogene); Regulatory Build. Regulation legend: CTCF; enhancer; promoter; promoter flank. Gene legend: protein coding (Ensembl protein coding); non-protein coding (pseudogene, processed transcript).]

Transcript: ENSMUST00000051698

[Figure: Ensembl protein summary for Ly6e-201 (protein coding; 4.83 kb, forward strand), scale bar 0 to 136. Tracks as labeled in the figure:]

ENSMUSP00000056...
Transmembrane heli...
Low complexity (Seg)
Cleavage site (Sign...
Superfamily: SSF57302
SMART: Ly-6 antigen/uPA receptor-like
Pfam: Ly-6 antigen/uPA receptor-like
PANTHER: PTHR16983; PTHR16983:SF13
Gene3D: 2.10.60.10
CDD: cd00117
All sequence SNPs/i...: Sequence variants (dbSNP and all other sources)
Variant legend: missense variant; synonymous variant

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
