https://www.alphaknockout.com

Mouse Tgfb1i1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Tgfb1i1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Tgfb1i1 gene (NCBI Reference Sequence: NM_001289550; Ensembl: ENSMUSG00000030782) is located on mouse chromosome 7. Eleven exons have been identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 11 (Transcript: ENSMUST00000167965). Exons 1~7 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Tgfb1i1 gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-144N15 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a knock-out allele exhibit an abnormal response to wire injury of femoral arteries and increased VSMC apoptosis in response to wire injury or mechanical stress. Mice homozygous for a different knock-out allele show normal platelet integrin function both in vitro and in vivo.

Exons 1~7 cover 51.63% of the coding region. The start codon is in exon 1, and the stop codon is in exon 11. The size of intron 7 for 3'-loxP site insertion: 2582 bp. The size of the effective cKO region: ~2941 bp. The cKO region does not contain any other known gene.
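
As a rough sanity check on the coding-region figure, the stated percentage can be turned into base pairs. The sketch below assumes the percentage refers to the 461-codon CDS of ENSMUST00000167965 (the protein length listed in the transcript table of this report). Whether the stop codon is counted is not stated, so both cases are computed. The function names are hypothetical.

```python
PROTEIN_LENGTH_AA = 461       # ENSMUST00000167965, from the transcript table
CKO_CODING_FRACTION = 0.5163  # 51.63% of the coding region, as stated above


def cds_length_bp(protein_aa, include_stop=False):
    """CDS length in bp for a protein of protein_aa residues."""
    return 3 * (protein_aa + (1 if include_stop else 0))


def deleted_coding_bp(fraction, cds_bp):
    """Approximate number of coding bases covered by the cKO region."""
    return round(fraction * cds_bp)


for include_stop in (False, True):
    cds = cds_length_bp(PROTEIN_LENGTH_AA, include_stop)
    covered = deleted_coding_bp(CKO_CODING_FRACTION, cds)
    print(f"stop codon counted: {include_stop}, CDS {cds} bp, cKO coding ~{covered} bp")
```

Under either assumption, a little over 700 coding bases fall inside exons 1~7. This is an estimate only and does not replace the exon coordinates of the actual design.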

Overview of the Targeting Strategy

[Figure: schematic, drawn 5' to 3', of the wildtype allele, the targeting vector, the targeted allele and the constitutive KO allele (after Cre recombination). The gRNA regions and the ATG start codon are marked. Exon labels shown: 6, 1 2 3 4 5 6 7, 11.]

Legends: exon of mouse Armc5; homology arm; exon of mouse Tgfb1i1; cKO region; loxP site.
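
The step from the targeted allele to the constitutive KO allele can be sketched as a string operation. This is a minimal model, not the vendor's pipeline: it assumes two loxP sites in the same orientation flanking the cKO region, uses the canonical 34 bp loxP sequence, and runs on a made-up toy allele. The function name `cre_excise` is hypothetical.

```python
# Canonical 34 bp loxP site: 13 bp inverted repeat, 8 bp spacer, 13 bp inverted repeat.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def cre_excise(allele, site=LOXP):
    """Model Cre recombination between the two outermost same-orientation sites.

    The sequence between the sites is removed together with one site copy,
    so a single loxP site remains. Alleles with fewer than two sites are
    returned unchanged. Inverted sites (which would give an inversion
    rather than a deletion) are not modelled.
    """
    first = allele.find(site)
    last = allele.rfind(site)
    if first == -1 or first == last:
        return allele
    return allele[:first] + allele[last:]


# Toy targeted allele (hypothetical sequences standing in for the real ones).
five_prime_arm = "GGGG"
floxed_region = "ACGTACGT"  # stands in for exons 1~7
three_prime_arm = "CCCC"
targeted = five_prime_arm + LOXP + floxed_region + LOXP + three_prime_arm

ko_allele = cre_excise(targeted)
print(len(targeted), len(ko_allele))
```

In this toy model the KO allele keeps both arms and one loxP site, and loses the floxed region plus the second site.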

Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence against itself (both axes: Sequence 12), with forward and reverse-complement matches shown separately.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
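
The self-alignment described in the note can be sketched as an exact-match dot plot with a 10 bp window. This is only an illustration on a toy sequence. The tool and scoring actually used for the report are not stated, and the function names are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def self_dot_plot(seq, window=10):
    """Return (forward, reverse) lists of (i, j) dots for exact window matches.

    forward: window at i equals window at j, with i < j (main diagonal skipped).
    reverse: window at i equals the reverse complement of the window at j.
    """
    seq = seq.upper()
    positions = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)

    forward, reverse = [], []
    for i in range(len(seq) - window + 1):
        word = seq[i:i + window]
        forward.extend((i, j) for j in positions[word] if j > i)
        reverse.extend((i, j) for j in positions.get(reverse_complement(word), []))
    return forward, reverse


# Toy input (hypothetical): a 10 bp unit repeated twice in tandem.
unit = "GATTACAGGC"
toy = "TTTT" + unit + unit + "CCCC"
forward, reverse = self_dot_plot(toy, window=10)
print("forward dots:", forward)
```

A tandem repeat appears as forward dots on a diagonal parallel to the main one, offset by the repeat unit length. In the toy sequence the two copies of the unit give a single dot at (4, 14).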

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content plotted along Sequence 12.]

Summary: Full Length(8999bp) | A(21.26% 1913) | C(27.84% 2505) | T(26.8% 2412) | G(24.1% 2169)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found. It may be difficult to construct this targeting vector.
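
The overall GC content follows directly from the base counts in the Summary line, and the 300 bp window analysis can be sketched in the same way. The counts below are copied from this report. The windowing function is a generic illustration with hypothetical names, since the actual 8999 bp sequence is not reproduced here.

```python
# Base counts copied from the Summary line of this report.
counts = {"A": 1913, "C": 2505, "T": 2412, "G": 2169}


def gc_fraction_from_counts(base_counts):
    """Overall GC fraction from per-base counts."""
    total = sum(base_counts.values())
    return (base_counts["G"] + base_counts["C"]) / total


def windowed_gc(seq, window=300, step=1):
    """Yield (start, gc_fraction) for every full window along seq."""
    seq = seq.upper()
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        yield start, (chunk.count("G") + chunk.count("C")) / window


total_length = sum(counts.values())
overall_gc = gc_fraction_from_counts(counts)
print(f"length {total_length} bp, overall GC {overall_gc:.2%}")
```

The counts sum to the stated full length of 8999 bp and give an overall GC content of about 51.9%. The overall value is moderate, so the high-GC regions mentioned in the note must be local features that only the windowed profile reveals.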

BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr7   +       128243997  128246996  3000
YourSeq  69     1585   1944  3000   71.8%     chr11  -       73091596   73091736   141
YourSeq  51     1615   1944  3000   63.5%     chr17  -       25741161   25741246   86
YourSeq  42     1898   1976  3000   85.3%     chr11  +       36193122   36193209   88
YourSeq  41     1895   1994  3000   80.4%     chr12  -       112097210  112097307  98
YourSeq  41     1901   1943  3000   100.0%    chr2   +       87757060   87757108   49
YourSeq  39     1904   1944  3000   97.6%     chr11  -       53377109   53377149   41
YourSeq  39     1903   1943  3000   100.0%    chr1   +       5275237    5275283    47
YourSeq  38     1904   1943  3000   100.0%    chr19  -       47466474   47466519   46
YourSeq  38     1904   1943  3000   100.0%    chr19  +       4391043    4391088    46
YourSeq  38     1904   1943  3000   100.0%    chr16  +       92360318   92360363   46
YourSeq  38     1904   1943  3000   100.0%    chr16  +       76870793   76870838   46
YourSeq  38     1904   1943  3000   100.0%    chr15  +       49392133   49392178   46
YourSeq  38     1904   1943  3000   100.0%    chr15  +       28332503   28332548   46
YourSeq  37     1577   1634  3000   86.3%     chr17  -       83870464   83870522   59
YourSeq  37     1903   1943  3000   95.2%     chr4   +       50915906   50915946   41
YourSeq  37     1905   1943  3000   100.0%    chr15  +       84330539   84330583   45
YourSeq  37     1905   1943  3000   92.2%     chr11  +       8262374    8262411    38
YourSeq  36     1904   1943  3000   97.5%     chr18  +       11560403   11560448   46
YourSeq  36     1574   1635  3000   79.1%     chr17  +       46988188   46988249   62

Note: The 3000 bp section upstream of Exon 1 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr7   +       128249746  128252745  3000
YourSeq  203    1158   1460  3000   95.2%     chr19  +       8673972    8881700    207729
YourSeq  152    1125   1296  3000   95.9%     chr10  +       111104609  111104892  284
YourSeq  109    1159   1300  3000   89.9%     chr10  +       111104765  111104892  128
YourSeq  99     1002   1463  3000   77.1%     chr10  -       62385050   62385329   280
YourSeq  98     1169   1292  3000   95.4%     chr10  +       111104713  111104852  140
YourSeq  92     1002   1431  3000   76.1%     chr12  -       52545814   52546093   280
YourSeq  85     1002   1454  3000   74.4%     chr11  -       87027889   87028286   398
YourSeq  84     1002   1462  3000   73.3%     chr19  -       47361135   47361423   289
YourSeq  82     1163   1437  3000   87.9%     chr17  -       13197973   13198494   522
YourSeq  82     1009   1461  3000   87.2%     chr11  -       60512600   60513112   513
YourSeq  82     1339   1473  3000   86.8%     chr11  +       52372469   52391755   19287
YourSeq  82     1344   1475  3000   87.3%     chr1   +       171572716  171572891  176
YourSeq  81     929    1460  3000   72.3%     chr11  -       66806341   66806524   184
YourSeq  79     1009   1470  3000   91.5%     chr2   -       181407166  181407668  503
YourSeq  79     1158   1417  3000   92.5%     chr2   -       20647198   20647475   278
YourSeq  79     1215   1459  3000   87.4%     chr1   +       55156800   55157132   333
YourSeq  77     1002   1453  3000   73.3%     chr11  +       84298173   84298482   310
YourSeq  77     1339   1454  3000   90.7%     chr11  +       80350253   80350412   160
YourSeq  76     1339   1452  3000   89.5%     chr9   -       94888295   94888433   139

Note: The 3000 bp section downstream of Exon 7 is BLAT searched against the genome. No significant similarity is found.
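
The "no significant similarity" conclusion can be checked by parsing the BLAT rows and ranking the hits other than the query's own locus. The sketch below uses three rows from the downstream results in raw UCSC BLAT text layout. The parser, the `BlatHit` type and the rule for recognising the self-hit are assumptions for illustration, and no significance threshold is implied.

```python
import re
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


# One hit: YourSeq SCORE START END QSIZE IDENTITY% CHROM STRAND START END SPAN
HIT_RE = re.compile(
    r"YourSeq\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+([\d.]+)%\s+(\S+)\s+([+-])"
    r"\s+(\d+)\s+(\d+)\s+(\d+)"
)


def parse_blat(text):
    """Extract every hit from BLAT text, whether flattened or one row per line."""
    hits = []
    for m in HIT_RE.finditer(text):
        s, qs, qe, qz, ident, chrom, strand, ts, te, span = m.groups()
        hits.append(BlatHit(int(s), int(qs), int(qe), int(qz), float(ident),
                            chrom, strand, int(ts), int(te), int(span)))
    return hits


def off_target_hits(hits):
    """Drop the full-length, 100% identity hit (the query's own locus)."""
    return [h for h in hits
            if not (h.identity == 100.0 and h.q_start == 1 and h.q_end == h.q_size)]


sample = (
    "browser details YourSeq 3000 1 3000 3000 100.0% chr7 + 128249746 128252745 3000 "
    "browser details YourSeq 203 1158 1460 3000 95.2% chr19 + 8673972 8881700 207729 "
    "browser details YourSeq 152 1125 1296 3000 95.9% chr10 + 111104609 111104892 284"
)
hits = parse_blat(sample)
best = max(off_target_hits(hits), key=lambda h: h.score)
print(best.chrom, best.score, best.q_end - best.q_start + 1)
```

For the downstream section, the strongest hit outside the chr7 locus scores 203 against a 3000 bp query and covers only query positions 1158 to 1460.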

Gene and information: Tgfb1i1 transforming growth factor beta 1 induced transcript 1 [ Mus musculus (house mouse) ] Gene ID: 21804, updated on 10-Oct-2019

Gene summary

Official Symbol: Tgfb1i1 (provided by MGI)
Official Full Name: transforming growth factor beta 1 induced transcript 1 (provided by MGI)
Primary source: MGI:MGI:102784
See related: Ensembl:ENSMUSG00000030782
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Hic5; ARA55; TSC-5; hic-5
Expression: Broad expression in bladder adult (RPKM 38.4), ovary adult (RPKM 25.8) and 20 other tissues
Orthologs: human

Genomic context

Location: 7; 7 F3

Exon count: 11

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  7    NC_000073.6 (128245182..128255699)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    7    NC_000073.5 (135390385..135397226)

[Figure: genomic context of Tgfb1i1 on Chromosome 7 - NC_000073.6.]

Transcript information: This gene has 13 transcripts

Gene: Tgfb1i1 ENSMUSG00000030782

Description: transforming growth factor beta 1 induced transcript 1 [Source:MGI Symbol;Acc:MGI:102784]
Gene Synonyms: ARA55, TSC-5, hic-5
Location: Chromosome 7: 128,246,812-128,255,699 forward strand. GRCm38:CM001000.2
About this gene: This gene has 13 transcripts (splice variants), 170 orthologues, 2 paralogues, is a member of 1 Ensembl protein family and is associated with 9 phenotypes.

Transcripts

Name         Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt  Flags
Tgfb1i1-207  ENSMUST00000167965.7   1746  461aa       ENSMUSP00000132100.1  Protein coding           CCDS21893  Q62219   TSL:1, GENCODE basic, APPRIS P2
Tgfb1i1-203  ENSMUST00000163609.7   1053  350aa       ENSMUSP00000133134.1  Protein coding           CCDS80812  Q62219   TSL:5, GENCODE basic
Tgfb1i1-204  ENSMUST00000164710.7   3788  483aa       ENSMUSP00000130964.1  Protein coding           -          Q62219   TSL:1, GENCODE basic
Tgfb1i1-201  ENSMUST00000070656.11  3026  444aa       ENSMUSP00000068529.5  Protein coding           -          Q62219   TSL:1, GENCODE basic, APPRIS ALT2
Tgfb1i1-205  ENSMUST00000165667.7   1812  422aa       ENSMUSP00000127695.1  Protein coding           -          E9Q1D5   TSL:1, GENCODE basic
Tgfb1i1-208  ENSMUST00000168825.7   1585  112aa       ENSMUSP00000132685.2  Nonsense mediated decay  -          E9PYQ1   CDS 5' incomplete, TSL:5
Tgfb1i1-209  ENSMUST00000169919.7   1317  61aa        ENSMUSP00000131705.1  Nonsense mediated decay  -          Q62219   TSL:1
Tgfb1i1-210  ENSMUST00000170115.1   1287  114aa       ENSMUSP00000129958.1  Nonsense mediated decay  -          E9Q1V4   TSL:5
Tgfb1i1-213  ENSMUST00000206691.1   2120  No protein  -                     Retained intron          -          -        TSL:2
Tgfb1i1-211  ENSMUST00000171888.7   2044  No protein  -                     Retained intron          -          -        TSL:2
Tgfb1i1-212  ENSMUST00000205654.1   2005  No protein  -                     Retained intron          -          -        TSL:2
Tgfb1i1-202  ENSMUST00000163553.7   1993  No protein  -                     Retained intron          -          -        TSL:2
Tgfb1i1-206  ENSMUST00000166755.7   1966  No protein  -                     Retained intron          -          -        TSL:2

[Figure: Ensembl genome browser view of a 28.89 kb region (128.24Mb to 128.26Mb), forward and reverse strands. Forward-strand tracks show the Tgfb1i1 transcripts (Tgfb1i1-201 to Tgfb1i1-213) between the neighbouring Armc5 transcripts (Armc5-201, protein coding; Armc5-202, lncRNA) and Slc5a2 transcripts (Slc5a2-202, Slc5a2-205 and Slc5a2-212, protein coding; Slc5a2-206, retained intron; Slc5a2-209, nonsense mediated decay). Also shown: contig AC124566.5, the reverse-strand gene 9130023H24Rik-201 (protein coding) and the Regulatory Build track.]

Regulation Legend: CTCF; Promoter; Promoter Flank.

Gene Legend: Protein Coding (Ensembl protein coding; merged Ensembl/Havana); Non-Protein Coding (RNA gene; processed transcript).

Transcript: ENSMUST00000167965

[Figure: Ensembl protein summary for Tgfb1i1-207 (protein coding; 6.73 kb, forward strand), with a scale bar from 0 to 461 amino acids.]

Annotation tracks shown: MobiDB lite; Low complexity (Seg); Superfamily SSF57716; SMART: Zinc finger, LIM-type; Pfam: PF03535, Zinc finger, LIM-type; PROSITE profiles: Zinc finger, LIM-type; PROSITE patterns: Zinc finger, LIM-type; PIRSF: Leupaxin/Paxillin/TGFB1I1; PANTHER: PTHR24216, PTHR24216:SF27; Gene3D: 2.10.110.10; CDD: cd09336, cd09337, cd09412; sequence variants (dbSNP and all other sources).

Variant Legend: missense variant; synonymous variant.

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
