https://www.alphaknockout.com

Mouse Tinagl1 Knockout Project (CRISPR/Cas9)

Objective: To create a Tinagl1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Tinagl1 gene (NCBI Reference Sequence: NM_023476; Ensembl: ENSMUSG00000028776) is located on mouse chromosome 4. Twelve exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 12 (Transcript: ENSMUST00000105998). Exons 2~12 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Female mice homozygous for a null mutation display impaired fertility, and homozygous pups born to homozygous females show impaired postnatal survival.

Exon 2 starts at about 0.07% of the coding region. Exons 2~12 cover 100.0% of the coding region. The size of the effective KO region is ~8130 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele, drawn 5' to 3', showing exons 1 to 12 of mouse Tinagl1 with two gRNA regions marked. Legend: exon of mouse Tinagl1; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of Sequence 12 against itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of start codon is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
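The self-alignment described in the note above can be illustrated with a minimal word-match dot plot. This is only a sketch under stated assumptions: the report does not name the tool that produced its dot plots, the function names below are hypothetical, and exact matching of 15 bp words is assumed because the report gives a 15 bp window size. Hits off the main diagonal of a self-comparison would indicate repeated words, and hits against the reverse complement would indicate inverted repeats.

```python
# Sketch of a word-match dot plot (assumed method; not the report's actual tool).

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def dot_plot_matches(seq_a, seq_b, window=15):
    """Return (i, j) pairs where seq_a[i:i+window] == seq_b[j:j+window]."""
    # Index every window-length word of seq_b by its start position.
    index = {}
    for j in range(len(seq_b) - window + 1):
        index.setdefault(seq_b[j:j + window], []).append(j)
    # Look up each window-length word of seq_a in that index.
    hits = []
    for i in range(len(seq_a) - window + 1):
        for j in index.get(seq_a[i:i + window], []):
            hits.append((i, j))
    return hits


def off_diagonal_hits(seq, window=15):
    """Self-comparison hits away from the main diagonal (repeated words)."""
    return [(i, j) for i, j in dot_plot_matches(seq, seq, window) if i != j]


def inverted_hits(seq, window=15):
    """Hits between seq and its reverse complement (inverted repeats)."""
    return dot_plot_matches(seq, reverse_complement(seq), window)
```

On a real 2000 bp flank, an empty or near-empty result from `off_diagonal_hits` would be consistent with the "no significant tandem repeat" conclusion, although what counts as significant is not defined in the report.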

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of Sequence 12 against itself. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section downstream of stop codon is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(2000bp) | A(21.1% 422) | C(27.4% 548) | T(21.1% 422) | G(30.4% 608)

Note: The 2000 bp section upstream of start codon is analyzed to determine the GC content. Significant high GC-content regions are found. The gRNA site is selected outside of these high GC-content regions.
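A sliding-window GC calculation of the kind summarized above might look like the following sketch. The 300 bp window comes from the report; the function names, the `step` parameter and the `threshold` default are hypothetical, since the report does not say how a "significant" high GC-content region is defined. The last lines only re-derive the overall GC fraction from the base counts reported in the upstream summary (C = 548, G = 608 of 2000 bp).

```python
# Sketch of sliding-window GC content (assumed method; names are illustrative).

def base_composition(seq):
    """Count A, C, G and T in a DNA string (case-insensitive)."""
    seq = seq.upper()
    return {base: seq.count(base) for base in "ACGT"}


def sliding_gc(seq, window=300, step=1):
    """Return (start, gc_fraction) for each full window along seq."""
    seq = seq.upper()
    result = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        result.append((start, gc / window))
    return result


def high_gc_windows(seq, window=300, step=1, threshold=0.65):
    """Windows whose GC fraction meets a cutoff (threshold is hypothetical)."""
    return [(s, f) for s, f in sliding_gc(seq, window, step) if f >= threshold]


# Overall GC fraction of the upstream 2000 bp section, from the reported counts.
upstream_gc = (548 + 608) / 2000
print(f"upstream GC fraction: {upstream_gc:.3f}")
```

A gRNA site could then be checked against the start positions returned by `high_gc_windows`, with the cutoff adjusted to whatever criterion the lab actually applies.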

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(2000bp) | A(22.65% 453) | C(30.75% 615) | T(24.7% 494) | G(21.9% 438)

Note: The 2000 bp section downstream of stop codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr4   -       130174200  130176199  2000
YourSeq  39     1461   1616  2000   95.5%     chr4   -       130896837  130897358  522
YourSeq  39     80     155   2000   91.4%     chr3   +       121558297  121558372  76
YourSeq  36     92     141   2000   89.2%     chr4   +       48131156   48131206   51
YourSeq  33     92     141   2000   88.4%     chrX   -       11957891   11957941   51
YourSeq  32     200    241   2000   97.2%     chr11  -       56201776   56201826   51
YourSeq  31     180    216   2000   91.9%     chrX   -       101488736  101488772  37
YourSeq  31     195    227   2000   97.0%     chr13  -       5058154    5058186    33
YourSeq  30     179    220   2000   85.8%     chr16  +       30411261   30411302   42
YourSeq  29     180    216   2000   83.4%     chr5   -       134113586  134113621  36
YourSeq  29     111    147   2000   83.4%     chrX   +       68834596   68834631   36
YourSeq  28     126    155   2000   89.7%     chr8   -       46537905   46537933   29
YourSeq  27     91     141   2000   96.6%     chr1   -       62467122   62467173   52
YourSeq  27     91     128   2000   86.9%     chr7   +       18989446   18989491   46
YourSeq  27     126    154   2000   89.3%     chr5   +       147436254  147436281  28
YourSeq  25     130    158   2000   85.8%     chr4   -       116729991  116730018  28
YourSeq  25     180    216   2000   83.8%     chr2   -       102230746  102230782  37
YourSeq  25     180    216   2000   83.8%     chr10  -       42529466   42529502   37
YourSeq  25     185    215   2000   90.4%     chr9   +       15879181   15879211   31
YourSeq  24     92     121   2000   92.9%     chr3   -       97817771   97817801   31

Note: The 2000 bp section upstream of start codon is BLAT searched against the genome. No significant similarity is found.
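One way to read a BLAT result table like the one above is to separate the full-length self-hit from the secondary hits and compare the secondary scores with the query size. The sketch below does this for the first three upstream rows. It is illustrative only: the `BlatHit` class, the `parse_hit` helper and the `MIN_FRACTION` cutoff are hypothetical, and the report does not state what score it treats as "significant similarity".

```python
# Sketch: split BLAT hits into the self-hit and secondary hits (assumed workflow).
from dataclasses import dataclass


@dataclass
class BlatHit:
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line):
    """Parse one whitespace-separated row; field 0 is the query name."""
    f = line.split()
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


# First three rows of the upstream BLAT table, copied from the report.
rows = [
    "YourSeq 2000 1 2000 2000 100.0% chr4 - 130174200 130176199 2000",
    "YourSeq 39 1461 1616 2000 95.5% chr4 - 130896837 130897358 522",
    "YourSeq 39 80 155 2000 91.4% chr3 + 121558297 121558372 76",
]
hits = [parse_hit(r) for r in rows]

# The self-hit scores the full query length; everything else is secondary.
self_hits = [h for h in hits if h.score == h.q_size]
secondary = [h for h in hits if h.score < h.q_size]
best_secondary = max(h.score for h in secondary)

MIN_FRACTION = 0.1  # hypothetical cutoff, not taken from the report
flagged = [h for h in secondary if h.score / h.q_size >= MIN_FRACTION]
```

With these three rows the best secondary score is small relative to the 2000 bp query, so nothing is flagged under the assumed cutoff.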

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr4   -       130164068  130166067  2000
YourSeq  129    1353   1655  2000   83.4%     chr11  +       82695101   82695321   221
YourSeq  121    1333   1655  2000   89.2%     chr3   +       153446036  153446370  335
YourSeq  111    1361   1648  2000   92.5%     chr7   +       29368026   29368314   289
YourSeq  107    1362   1650  2000   89.4%     chr6   -       52056076   52056359   284
YourSeq  94     1419   1647  2000   85.1%     chr10  -       100687212  100687431  220
YourSeq  92     1396   1655  2000   88.3%     chr7   -       3583492    3583748    257
YourSeq  90     1335   1648  2000   83.2%     chr4   -       104585126  104585430  305
YourSeq  84     1511   1647  2000   86.8%     chr9   +       44004785   44004921   137
YourSeq  84     1536   1649  2000   90.3%     chr5   +       135319840  135319972  133
YourSeq  79     1544   1648  2000   91.6%     chr1   -       88454198   88454311   114
YourSeq  70     1361   1648  2000   91.5%     chr1   -       155776417  155776750  334
YourSeq  68     1396   1647  2000   75.6%     chr15  -       84185636   84185854   219
YourSeq  68     1417   1579  2000   81.3%     chr11  -       61227756   61227886   131
YourSeq  66     1361   1649  2000   91.4%     chr12  +       8463075    8463368    294
YourSeq  64     1001   1085  2000   93.4%     chr11  -       9011579    9011681    103
YourSeq  62     1546   1624  2000   89.5%     chr5   -       132977990  132978085  96
YourSeq  61     1016   1096  2000   93.1%     chr9   +       48407052   48407149   98
YourSeq  60     1361   1528  2000   78.7%     chr14  -       40354897   40355056   160
YourSeq  60     1005   1083  2000   91.9%     chr12  +       23358707   23358801   95

Note: The 2000 bp section downstream of stop codon is BLAT searched against the genome. No significant similarity is found.


Gene and information: Tinagl1 tubulointerstitial nephritis antigen-like 1 [ Mus musculus (house mouse) ] Gene ID: 94242, updated on 12-Aug-2019

Gene summary

Official Symbol: Tinagl1 (provided by MGI)
Official Full Name: tubulointerstitial nephritis antigen-like 1 (provided by MGI)
Primary source: MGI:MGI:2137617
See related: Ensembl:ENSMUSG00000028776
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: AZ1; AZ-1; Arg1; Lcn7; TARP; Tinagl; 1110021J17Rik
Expression: Broad expression in bladder adult (RPKM 158.0), placenta adult (RPKM 140.0) and 19 other tissues
Orthologs: human

Genomic context

Location: 4; 4 D2.2
Exon count: 13

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  4    NC_000070.6 (130165600..130175122, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    4    NC_000070.5 (129842844..129852366, complement)
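As a quick consistency check on the two annotation rows above, the gene span implied by each coordinate pair can be compared. The sketch below assumes the coordinates are 1-based and inclusive, as is usual for NCBI locations; the helper name `span_length` is hypothetical.

```python
# Sketch: compare the Tinagl1 span on the two assemblies listed above.

def span_length(start, end):
    """Length of a 1-based, inclusive interval (assumed coordinate convention)."""
    return end - start + 1


grcm38 = (130165600, 130175122)   # NC_000070.6, complement
mgscv37 = (129842844, 129852366)  # NC_000070.5, complement

len_grcm38 = span_length(*grcm38)
len_mgscv37 = span_length(*mgscv37)

# Offset between assemblies, measured at each end of the interval.
offset_start = grcm38[0] - mgscv37[0]
offset_end = grcm38[1] - mgscv37[1]

print(len_grcm38, len_mgscv37, offset_start, offset_end)
```

Equal lengths and equal offsets at both ends would suggest the annotated locus was shifted as a block between assemblies rather than changed in size.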

[Figure: genomic context of Tinagl1 on Chromosome 4 - NC_000070.6.]


Transcript information: This gene has 9 transcripts

Gene: Tinagl1 ENSMUSG00000028776

Description: tubulointerstitial nephritis antigen-like 1 [Source:MGI Symbol;Acc:MGI:2137617]
Gene Synonyms: 1110021J17Rik, AZ-1, Arg1, Lcn7, androgen-regulated gene 1
Location: Chromosome 4: 130,164,454-130,175,122 reverse strand. GRCm38:CM000997.2
About this gene: This gene has 9 transcripts (splice variants), 198 orthologues, 27 paralogues, is a member of 1 Ensembl protein family and is associated with 2 phenotypes.

Transcripts

Name         Transcript ID         bp    Protein     Translation ID        Biotype                  CCDS       UniProt  Flags
Tinagl1-202  ENSMUST00000105998.7  1976  466aa       ENSMUSP00000101620.1  Protein coding           CCDS18708  Q99JR5   TSL:1 GENCODE basic APPRIS P2
Tinagl1-203  ENSMUST00000105999.8  1952  466aa       ENSMUSP00000101621.2  Protein coding           CCDS18708  Q99JR5   TSL:1 GENCODE basic APPRIS P2
Tinagl1-209  ENSMUST00000175992.7  1847  435aa       ENSMUSP00000134900.1  Protein coding           -          H3BJ97   TSL:5 GENCODE basic APPRIS ALT2
Tinagl1-204  ENSMUST00000132545.2  977   161aa       ENSMUSP00000135453.1  Protein coding           -          H3BKM8   CDS 3' incomplete TSL:2
Tinagl1-201  ENSMUST00000030560.8  2085  466aa       ENSMUSP00000030560.2  Nonsense mediated decay  -          Q99JR5   TSL:1
Tinagl1-206  ENSMUST00000145000.2  911   No protein  -                     Retained intron          -          -        TSL:2
Tinagl1-207  ENSMUST00000145774.2  849   No protein  -                     Retained intron          -          -        TSL:5
Tinagl1-205  ENSMUST00000133660.2  596   No protein  -                     Retained intron          -          -        TSL:1
Tinagl1-208  ENSMUST00000175822.1  1100  No protein  -                     lncRNA                   -          -        TSL:5


[Figure: Ensembl genome browser view of a 30.67 kb region (130.16Mb to 130.18Mb), with forward and reverse strands and contig AL606925.16. Reverse-strand transcripts shown: Tinagl1-201 (nonsense mediated decay); Tinagl1-202, Tinagl1-203, Tinagl1-204 and Tinagl1-209 (protein coding); Tinagl1-205, Tinagl1-206 and Tinagl1-207 (retained intron); Tinagl1-208 (lncRNA). A Regulatory Build track is included. Regulation legend: CTCF; Enhancer; Open Chromatin; Promoter; Promoter Flank. Gene legend: Protein Coding (Ensembl protein coding; merged Ensembl/Havana); Non-Protein Coding (RNA gene; processed transcript).]


Transcript: ENSMUST00000105998

[Figure: protein domain and variant view for Tinagl1-202 (protein coding; reverse strand, 9.40 kb), protein ENSMUSP00000101... Tracks: Low complexity (Seg); Cleavage site (Sign...; Superfamily: Papain-like cysteine peptidase superfamily; SMART: Peptidase C1A, papain C-terminal; Pfam: Peptidase C1A, papain C-terminal; PROSITE profiles: Somatomedin B domain; PROSITE patterns: Somatomedin B domain, Cysteine peptidase, histidine active site; PANTHER: PTHR12411:SF270, PTHR12411; Gene3D: 3.90.70.10; CDD: cd02620; All sequence SNPs/i...: Sequence variants (dbSNP and all other sources). Variant legend: missense variant; splice region variant; synonymous variant. Scale bar: 0 to 466.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
