https://www.alphaknockout.com

Mouse Tbrg4 Knockout Project (CRISPR/Cas9)

Objective: To create a Tbrg4 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Tbrg4 gene (NCBI Reference Sequence: NM_001130457; Ensembl: ENSMUSG00000000384) is located on mouse chromosome 11. 13 exons are identified, with the ATG start codon in exon 3 and the TGA stop codon in exon 12 (Transcript: ENSMUST00000189268). Exons 3~12 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 3 starts from about 0.05% of the coding region. Exons 3~12 cover 100.0% of the coding region. The size of the effective KO region is ~7603 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: wildtype allele (5' to 3') showing exons 1 to 13 of mouse Tbrg4, with the two gRNA regions and the knockout region marked.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the sequence aligned against itself, showing forward and reverse-complement matches.]

Note: The 2000 bp section upstream of start codon is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
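The self-alignment described in the note can be illustrated with a small sketch. This is not the tool used for the report; it is a minimal exact-match dot plot assuming a 15 bp word size, forward-strand matches only, and a made-up toy sequence. The function names `self_dot_plot` and `repeat_offsets` are hypothetical.

```python
from collections import defaultdict


def self_dot_plot(seq, window=15):
    """Return sorted (i, j) pairs, i < j, where the words of length
    `window` starting at positions i and j are identical."""
    seq = seq.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    dots = []
    for starts in positions.values():
        # Every pair of occurrences of the same word is one off-diagonal dot.
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                dots.append((starts[a], starts[b]))
    return sorted(dots)


def repeat_offsets(dots):
    """Count dots per diagonal offset (j - i). A heavily populated
    offset corresponds to a line parallel to the main diagonal,
    which is how a tandem repeat appears in a dot plot."""
    counts = defaultdict(int)
    for i, j in dots:
        counts[j - i] += 1
    return dict(counts)


# Toy input (hypothetical): a 20 bp unit repeated three times between flanks.
unit = "ACGTTGCAAGGCTTACCGTA"
toy = "TTTTTGGGGGCCCCC" + unit * 3 + "AAAAACCCCCGGGGG"
dots = self_dot_plot(toy, window=15)
offsets = repeat_offsets(dots)
```

In this toy case the populated offsets are multiples of the repeat unit length, so a gRNA site would be chosen outside the positions covered by those dots.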

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the sequence aligned against itself, showing forward and reverse-complement matches.]

Note: The 2000 bp section downstream of stop codon is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(2000bp) | A(21.85% 437) | C(22.3% 446) | T(27.8% 556) | G(28.05% 561)

Note: The 2000 bp section upstream of start codon is analyzed to determine the GC content. Significant high GC-content regions are found. The gRNA site is selected outside of these high GC-content regions.
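A windowed GC scan of the kind summarized above can be sketched as follows. The 300 bp window comes from the report; the 65% cut-off for a "high GC" window and the function names are assumptions for illustration only, since the report does not state its criterion.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq, window=300, step=1):
    """Return (start, gc_fraction) for each full window along seq."""
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


def high_gc_windows(seq, window=300, step=1, threshold=0.65):
    """Windows at or above `threshold`. The 0.65 default is an
    assumed cut-off, not a value taken from the report."""
    return [(s, gc) for s, gc in sliding_gc(seq, window, step) if gc >= threshold]


# Overall GC of the upstream section, from the base counts in the summary line.
upstream_counts = {"A": 437, "C": 446, "T": 556, "G": 561}
overall_gc = (upstream_counts["G"] + upstream_counts["C"]) / sum(upstream_counts.values())
```

The overall GC fraction computed from the summary counts is close to 50%, so any high GC-content regions reported for this section are local features that only a windowed scan would reveal.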

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(2000bp) | A(22.25% 445) | C(23.65% 473) | T(28.15% 563) | G(25.95% 519)

Note: The 2000 bp section downstream of stop codon is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr11  -       6624221    6626220    2000
YourSeq  164    575    746   2000   97.7%     chr13  -       55175710   55175881   172
YourSeq  163    576    747   2000   97.7%     chr16  -       17852658   17852830   173
YourSeq  163    576    749   2000   97.2%     chr11  -       74627326   74627505   180
YourSeq  159    578    747   2000   95.9%     chrX   -       164367894  164368061  168
YourSeq  159    575    746   2000   96.6%     chr11  -       85256585   85256758   174
YourSeq  158    581    746   2000   97.6%     chr2   +       33279712   33279877   166
YourSeq  158    577    742   2000   97.6%     chr11  +       46901203   46901368   166
YourSeq  156    585    748   2000   97.6%     chr7   -       46029531   46029694   164
YourSeq  156    576    746   2000   96.0%     chr17  +       26148601   26148778   178
YourSeq  156    576    749   2000   94.9%     chr17  +       17065568   17065741   174
YourSeq  155    581    743   2000   97.6%     chr2   -       153819366  153819528  163
YourSeq  155    579    743   2000   97.0%     chr1   -       82818508   82818672   165
YourSeq  155    576    739   2000   97.6%     chr6   +       4968032    4968197    166
YourSeq  154    580    747   2000   95.9%     chr4   +       124577706  124577873  168
YourSeq  154    575    747   2000   94.8%     chr12  +       91872598   91872771   174
YourSeq  153    573    738   2000   95.8%     chr3   -       121965701  121965865  165
YourSeq  153    584    747   2000   95.1%     chr17  -       27641975   27642136   162
YourSeq  153    584    746   2000   97.0%     chr8   +       71441266   71441428   163
YourSeq  153    577    748   2000   94.8%     chr5   +       125390913  125391085  173

Note: The 2000 bp section upstream of start codon is BLAT searched against the genome. No significant similarity is found.
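One way to read the hit list is to ask how much of the 2000 bp query each secondary hit covers. The sketch below parses rows in the column order of the table header (QUERY, SCORE, START, END, QSIZE, IDENTITY, CHROM, STRAND, START, END, SPAN) and computes query coverage. The report does not state its criterion for "significant" similarity, so the 50% coverage threshold and the names `BlatHit`, `parse_hit` and `query_coverage` are assumptions for illustration.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line):
    """Parse one whitespace-separated hit row into a BlatHit."""
    f = line.split()
    return BlatHit(
        f[0], int(f[1]), int(f[2]), int(f[3]), int(f[4]),
        float(f[5].rstrip("%")), f[6], f[7], int(f[8]), int(f[9]), int(f[10]),
    )


def query_coverage(hit):
    """Fraction of the query covered by the aligned interval."""
    return (hit.q_end - hit.q_start + 1) / hit.q_size


# First three rows of the upstream search; row 1 is the query matching itself.
rows = [
    "YourSeq 2000 1 2000 2000 100.0% chr11 - 6624221 6626220 2000",
    "YourSeq 164 575 746 2000 97.7% chr13 - 55175710 55175881 172",
    "YourSeq 163 576 747 2000 97.7% chr16 - 17852658 17852830 173",
]
hits = [parse_hit(r) for r in rows]

# Assumed criterion: a secondary hit matters only if it covers
# at least half of the query. This threshold is not from the report.
MIN_COVERAGE = 0.5
secondary = [h for h in hits[1:] if query_coverage(h) >= MIN_COVERAGE]
```

Under this assumed threshold the secondary hits shown here are short local matches covering under a tenth of the query, which is in line with the note's conclusion.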

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr11  -       6614616    6616615    2000
YourSeq  152    1411   1945  2000   84.6%     chr9   +       69790052   69790433   382
YourSeq  139    1764   1962  2000   91.7%     chr2   +       29159721   29160240   520
YourSeq  139    1231   1935  2000   91.2%     chr13  +       101274763  101275504  742
YourSeq  137    1778   1966  2000   93.1%     chr6   -       99153199   99153517   319
YourSeq  134    1777   1942  2000   91.4%     chr1   -       128369189  128369355  167
YourSeq  133    1794   1970  2000   89.3%     chr11  +       51618848   51619039   192
YourSeq  132    1777   1941  2000   87.3%     chrX   +       47951459   47951615   157
YourSeq  130    1806   1964  2000   89.9%     chr19  +       6107136    6107287    152
YourSeq  130    1802   1965  2000   89.2%     chr10  +       81171440   81171600   161
YourSeq  128    1777   1936  2000   87.4%     chr17  +       23624928   23625077   150
YourSeq  126    1802   1939  2000   95.7%     chr12  -       8555016    8555153    138
YourSeq  126    1802   1958  2000   91.9%     chr10  -       24011157   24011312   156
YourSeq  126    1777   1925  2000   93.2%     chr17  +       47569562   47778834   209273
YourSeq  126    1806   1962  2000   87.2%     chr1   +       37839911   37840058   148
YourSeq  125    1802   1962  2000   90.9%     chr4   +       110003680  110004198  519
YourSeq  125    1802   1940  2000   95.0%     chr3   +       137111484  137111622  139
YourSeq  123    1793   1940  2000   91.9%     chr3   -       135561849  135562003  155
YourSeq  122    1805   1956  2000   88.9%     chr1   -       7087319    7087466    148
YourSeq  122    1806   1966  2000   92.4%     chr3   +       87888816   87889120   305

Note: The 2000 bp section downstream of stop codon is BLAT searched against the genome. No significant similarity is found.


Gene and information: Tbrg4 transforming growth factor beta regulated gene 4 [ Mus musculus (house mouse) ] Gene ID: 21379, updated on 24-Oct-2019

Gene summary

Official Symbol: Tbrg4 (provided by MGI)
Official Full Name: transforming growth factor beta regulated gene 4 (provided by MGI)
Primary source: MGI:MGI:1100868
See related: Ensembl:ENSMUSG00000000384
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Cpr2; Tb-12; R74877; AA120735; AA408001; AI527316; 2310042P22Rik
Expression: Ubiquitous expression in colon adult (RPKM 29.8), large intestine adult (RPKM 26.5) and 28 other tissues
Orthologs: human

Genomic context

Location: 11; 11 A1
Exon count: 13

Annotation release  Status             Assembly                        Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)    11   NC_000077.6 (6615598..6626084, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)      11   NC_000077.5 (6515601..6526070, complement)

[Figure: genomic context of Tbrg4 on chromosome 11 (NC_000077.6).]


Transcript information: This gene has 12 transcripts

Gene: Tbrg4 ENSMUSG00000000384

Description: transforming growth factor beta regulated gene 4 [Source:MGI Symbol;Acc:MGI:1100868]
Gene Synonyms: 2310042P22Rik, Cpr2, TB-12
Location: Chromosome 11: 6,615,598-6,626,067 reverse strand. GRCm38:CM001004.2
About this gene: This gene has 12 transcripts (splice variants), 183 orthologues, 5 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt  Flags
Tbrg4-212  ENSMUST00000189268.6   2347  630aa       ENSMUSP00000140835.1  Protein coding           CCDS24423  Q91YM4   TSL:1, GENCODE basic, APPRIS P1
Tbrg4-201  ENSMUST00000000394.13  2290  630aa       ENSMUSP00000000394.7  Protein coding           CCDS24423  Q91YM4   TSL:1, GENCODE basic, APPRIS P1
Tbrg4-207  ENSMUST00000136682.7   647   158aa       ENSMUSP00000114174.1  Protein coding           -          Q5SWP1   CDS 3' incomplete, TSL:3
Tbrg4-208  ENSMUST00000144463.1   540   112aa       ENSMUSP00000120103.1  Protein coding           -          Q5SWP0   CDS 3' incomplete, TSL:3
Tbrg4-211  ENSMUST00000156969.7   2324  630aa       ENSMUSP00000114256.1  Nonsense mediated decay  -          Q91YM4   TSL:1
Tbrg4-209  ENSMUST00000150697.7   2241  365aa       ENSMUSP00000123131.1  Nonsense mediated decay  -          E9PUT1   TSL:1
Tbrg4-206  ENSMUST00000134016.1   682   No protein  -                     Retained intron          -          -        TSL:3
Tbrg4-210  ENSMUST00000151008.1   652   No protein  -                     Retained intron          -          -        TSL:1
Tbrg4-205  ENSMUST00000132446.1   602   No protein  -                     Retained intron          -          -        TSL:2
Tbrg4-204  ENSMUST00000131815.7   853   No protein  -                     lncRNA                   -          -        TSL:3
Tbrg4-203  ENSMUST00000131477.1   582   No protein  -                     lncRNA                   -          -        TSL:5
Tbrg4-202  ENSMUST00000131313.1   380   No protein  -                     lncRNA                   -          -        TSL:3


[Figure: Ensembl genome browser view of a 30.47 kb region of chromosome 11 (6.61 Mb to 6.63 Mb), contig AL603787.8. All annotated features are on the reverse strand: Nacad-201 (protein coding); Tbrg4-201, -207, -208 and -212 (protein coding); Tbrg4-209 and -211 (nonsense mediated decay); Tbrg4-205, -206 and -210 (retained intron); Tbrg4-202, -203 and -204 (lncRNA); Wap-201 (protein coding); Wap-202 and -203 (lncRNA); Gm24313-201 and Snora5c-201 (snoRNA). Regulatory Build legend: CTCF, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: protein coding (Ensembl protein coding, merged Ensembl/Havana); non-protein coding (RNA gene, processed transcript).]


Transcript: ENSMUST00000189268

[Figure: protein domain view of Tbrg4-212 (protein coding; reverse strand, 10.47 kb) on a scale of 0 to 630 aa. Tracks, in the order shown: Low complexity (Seg); Superfamily: Armadillo-type fold; SMART: RAP domain; Pfam: FAST kinase-like protein, subdomain 2; FAST kinase leucine-rich; RAP domain; PROSITE profiles: RAP domain; PANTHER: PTHR21228, PTHR21228:SF59; sequence variants (dbSNP and all other sources), with missense and synonymous variants marked.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
