https://www.alphaknockout.com

Mouse Creld1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Creld1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas9-mediated genome engineering.

Strategy summary: The Creld1 gene (NCBI Reference Sequence: NM_133930; Ensembl: ENSMUSG00000030284) is located on mouse chromosome 6. Ten exons have been identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 10 (Transcript: ENSMUST00000032422). Exons 3 to 4 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Creld1 gene. To engineer the targeting vector, the homology arms and the cKO region will be generated by PCR using BAC clone RP24-263G9 as the template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Homozygous knockout is embryonic lethal, with abnormal vasculature, abnormal brain and craniofacial development, and reduced atrioventricular cushion size at E10.5.

Exon 3 starts about 20.48% of the way into the coding region. Knocking out exons 3 to 4 will cause a frameshift in the gene. The 5'-loxP site will be inserted into intron 2 (3493 bp) and the 3'-loxP site into intron 4 (863 bp). The effective cKO region is ~1000 bp and does not contain any other known gene.
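The figures above can be turned into an approximate coding position. The following is a minimal sketch, not part of the original design: it assumes the 20.48% is measured over the coding sequence of Creld1-201 (420 aa in the transcript table), and whether the stop codon is counted is not stated in the report, so both cases are shown. The function names are hypothetical. The report does not list exon lengths, so `causes_frameshift` only encodes the general rule that a deletion whose coding length is not a multiple of 3 shifts the reading frame.

```python
PROTEIN_LENGTH_AA = 420          # Creld1-201, from the transcript table
EXON3_START_FRACTION = 0.2048    # "about 20.48% of the coding region"


def coding_length_nt(protein_length_aa: int, include_stop: bool = False) -> int:
    """Coding length in nucleotides for a protein of the given length."""
    return 3 * protein_length_aa + (3 if include_stop else 0)


def approx_coding_position(fraction: float, cds_length_nt: int) -> int:
    """Approximate number of coding nucleotides lying upstream of a boundary."""
    return round(fraction * cds_length_nt)


def causes_frameshift(deleted_coding_nt: int) -> bool:
    """A deletion shifts the frame when its coding length is not a multiple of 3."""
    return deleted_coding_nt % 3 != 0


if __name__ == "__main__":
    for include_stop in (False, True):
        cds = coding_length_nt(PROTEIN_LENGTH_AA, include_stop)
        pos = approx_coding_position(EXON3_START_FRACTION, cds)
        print(f"CDS {cds} nt (stop counted: {include_stop}): "
              f"exon 3 begins after roughly {pos} coding nt")
```

Under either assumption the boundary falls within the first quarter of the coding sequence, which is consistent with the report's expectation that deleting exons 3 to 4 disrupts most of the protein.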


Overview of the Targeting Strategy

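[Figure: schematic of the targeting strategy, showing the wildtype allele (5' to 3', with exons 1, 3, 4, 5, 6 and 10 labeled and the two gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele after Cre recombination. Legend: exon of mouse Creld1, homology arm, cKO region, loxP site.]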


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence (labeled "Sequence 12") against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homology arms and cKO region was aligned with itself to check for tandem repeats. No significant tandem repeat was found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
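The self-alignment described in the note can be sketched as follows. This is a minimal illustration assuming an exact-match word comparison with a 10 bp window; the tool actually used for the report is not named, and the function names and the toy sequence are hypothetical. Off-diagonal forward matches indicate direct repeats, and reverse-complement matches indicate inverted repeats.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence containing only A, C, G, T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) window start positions.

    forward: seq[i:i+window] == seq[j:j+window], with i != j
    reverse: seq[i:i+window] == reverse complement of seq[j:j+window]
    """
    n = len(seq)
    index = {}
    for i in range(n - window + 1):
        index.setdefault(seq[i:i + window], []).append(i)

    # Off-diagonal identical words on the forward strand (direct repeats).
    forward = [(i, j)
               for positions in index.values()
               for i in positions for j in positions if i != j]

    # Words on the reverse-complement strand, mapped back to forward coordinates.
    rc = reverse_complement(seq)
    reverse = []
    for k in range(n - window + 1):
        j = n - window - k  # forward-strand start of the window that rc[k:k+window] covers
        for i in index.get(rc[k:k + window], []):
            reverse.append((i, j))
    return forward, reverse


if __name__ == "__main__":
    unit = "ACGGTCATTG"  # toy 10-mer, not taken from the Creld1 locus
    forward, reverse = self_dot_plot(unit + "TTTTT" + unit, window=10)
    print("direct repeats:", sorted(forward))
    print("inverted repeats:", sorted(reverse))
```

A sequence without tandem repeats yields few or no off-diagonal forward matches near the main diagonal, which is the pattern the note reports.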

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence (labeled "Sequence 12").]

Summary: Full length (7500 bp) | A (23.13%, 1735) | C (23.47%, 1760) | T (26.95%, 2021) | G (26.45%, 1984)

Note: The sequence of the homology arms and cKO region was analyzed to determine its GC content. No region of significantly high GC content was found, so this region is suitable for PCR screening or sequencing analysis.
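The summary counts give the overall GC content directly, and the windowed profile can be reproduced with a sliding window. The sketch below uses the base counts from the summary line; the `gc_windows` helper and its one-base step are assumptions, since the report states only the 300 bp window size.

```python
# Base counts taken from the summary line of the report.
BASE_COUNTS = {"A": 1735, "C": 1760, "T": 2021, "G": 1984}


def overall_gc_percent(counts: dict) -> float:
    """Overall GC percentage from per-base counts."""
    total = sum(counts.values())
    return 100 * (counts["G"] + counts["C"]) / total


def gc_windows(seq: str, window: int = 300) -> list:
    """GC percentage of every window of the given size, advancing one base at a time."""
    seq = seq.upper()
    if len(seq) < window:
        return []
    is_gc = [base in ("G", "C") for base in seq]
    running = sum(is_gc[:window])
    profile = [100 * running / window]
    for i in range(window, len(seq)):
        running += is_gc[i] - is_gc[i - window]  # slide the window by one base
        profile.append(100 * running / window)
    return profile


if __name__ == "__main__":
    print(f"total length: {sum(BASE_COUNTS.values())} bp")
    print(f"overall GC: {overall_gc_percent(BASE_COUNTS):.2f}%")
```

With the reported counts the overall GC content is close to 50%, which agrees with the note that no high-GC region stands out.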


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
--------------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr6   +       113484822  113487821  3000
YourSeq  282    467     1802  3000   95.0%     chr17  -       35034586   35468283   433698
YourSeq  159    1241    1795  3000   85.2%     chr16  +       32652703   32653124   422
YourSeq  143    1518    1799  3000   92.9%     chr17  -       88078670   88079134   465
YourSeq  142    1655    2183  3000   83.8%     chr4   +       108473872  108474222  351
YourSeq  140    1588    1802  3000   93.2%     chr12  +       69440676   69441194   519
YourSeq  140    1664    2183  3000   93.8%     chr11  +       22624602   22625168   567
YourSeq  137    467     619   3000   96.0%     chr10  +       18083697   18083853   157
YourSeq  136    1655    1803  3000   96.0%     chrX   -       99865838   99865987   150
YourSeq  134    1601    1799  3000   94.7%     chr11  +       57946912   57947348   437
YourSeq  132    1511    1802  3000   84.0%     chr18  -       16560224   16560376   153
YourSeq  132    1664    2064  3000   94.0%     chr11  +       88858088   88858488   401
YourSeq  130    1646    1798  3000   90.9%     chr13  -       100726614  100726758  145
YourSeq  130    1660    1802  3000   95.9%     chr11  -       113680816  113680971  156
YourSeq  129    1664    1804  3000   95.8%     chr13  +       51709088   51709228   141
YourSeq  129    1657    1809  3000   92.8%     chr11  +       96755585   96755742   158
YourSeq  127    1664    1802  3000   95.7%     chr14  -       57918492   57918630   139
YourSeq  126    1664    1799  3000   96.4%     chr16  -       48262154   48262289   136
YourSeq  126    1656    1802  3000   93.2%     chr14  -       15251491   15251643   153
YourSeq  126    1664    1805  3000   94.4%     chr1   -       75295348   75295489   142

Note: The 3000 bp section upstream of Exon 3 was searched against the genome with BLAT. Apart from the expected full-length match to chr6, no significant similarity was found.

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
--------------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr6   +       113488822  113491821  3000
YourSeq  267    2250    2607  3000   92.4%     chr11  -       29122250   29122656   407
YourSeq  265    2245    2607  3000   93.1%     chr16  +       13956797   13957162   366
YourSeq  264    2275    2607  3000   94.3%     chr3   -       95304173   95304513   341
YourSeq  262    2249    2607  3000   92.5%     chr17  -       35116815   35117333   519
YourSeq  258    2270    2607  3000   93.3%     chr17  -       47510151   47510488   338
YourSeq  257    2249    2607  3000   91.8%     chr8   -       70881700   70882065   366
YourSeq  254    2249    2620  3000   88.0%     chr8   +       72291704   72292063   360
YourSeq  250    2249    2607  3000   91.6%     chr17  +       83676296   83676653   358
YourSeq  249    2277    2607  3000   93.2%     chr5   -       142980951  142981325  375
YourSeq  247    2270    2607  3000   91.8%     chr15  -       73403165   73403504   340
YourSeq  245    2255    2610  3000   88.0%     chr15  +       59084242   59084572   331
YourSeq  243    2249    2599  3000   92.2%     chr8   +       43339741   43340269   529
YourSeq  238    2252    2622  3000   91.9%     chr1   +       85040397   85315609   275213
YourSeq  236    2255    2585  3000   93.4%     chr11  -       4733226    4733568    343
YourSeq  231    2254    2607  3000   90.6%     chr12  +       70222039   70222408   370
YourSeq  205    2262    2609  3000   93.3%     chr5   +       103915152  103915700  549
YourSeq  196    2349    2623  3000   92.6%     chr19  +       36134723   36135007   285
YourSeq  180    2286    2607  3000   91.7%     chr15  +       99835900   99836397   498
YourSeq  179    2328    2607  3000   88.0%     chr9   +       31214256   31214523   268

Note: The 3000 bp section downstream of Exon 4 was searched against the genome with BLAT. Apart from the expected full-length match to chr6, no significant similarity was found.
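The two full-length chr6 hits also give a quick consistency check on the stated cKO region size. The sketch below assumes that the two 3000 bp queries directly flank the cKO region and that the coordinates are 1-based and inclusive; neither is stated in the report. The `Hit` class and helper names are hypothetical, and only four rows from the BLAT tables are encoded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One BLAT result row (a subset of the columns in the tables above)."""
    score: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int


# Self-matches and best non-self hits copied from the two BLAT tables.
UP_SELF = Hit(3000, 100.0, "chr6", "+", 113484822, 113487821)
DOWN_SELF = Hit(3000, 100.0, "chr6", "+", 113488822, 113491821)
UP_BEST_OTHER = Hit(282, 95.0, "chr17", "-", 35034586, 35468283)
DOWN_BEST_OTHER = Hit(267, 92.4, "chr11", "-", 29122250, 29122656)


def gap_between(left: Hit, right: Hit) -> int:
    """Number of bases strictly between two hits on the same chromosome."""
    if left.chrom != right.chrom:
        raise ValueError("hits are on different chromosomes")
    return right.t_start - left.t_end - 1


def relative_score(hit: Hit, query_size: int = 3000) -> float:
    """BLAT score as a fraction of the query length."""
    return hit.score / query_size


if __name__ == "__main__":
    print("bases between flanks:", gap_between(UP_SELF, DOWN_SELF))
    for name, hit in (("upstream", UP_BEST_OTHER), ("downstream", DOWN_BEST_OTHER)):
        print(f"best non-self {name} hit: {hit.chrom}, "
              f"{relative_score(hit):.1%} of query length")
```

Under these assumptions the gap between the flanks is 1000 bp, matching the ~1000 bp effective cKO region, and the best non-self hits score below 10% of the query length.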


Gene and information: Creld1 cysteine-rich with EGF-like domains 1 [ Mus musculus (house mouse) ] Gene ID: 171508, updated on 28-Sep-2019

Gene summary

Official Symbol: Creld1 (provided by MGI)
Official Full Name: cysteine-rich with EGF-like domains 1 (provided by MGI)
Primary source: MGI:MGI:2152539
See related: Ensembl:ENSMUSG00000030284
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: i11E7; AI843811
Expression: Ubiquitous expression in bladder adult (RPKM 14.0), subcutaneous fat pad adult (RPKM 13.2) and 28 other tissues
Orthologs: human

Genomic context

Location: 6 E3; 6 52.77 cM

Exon count: 10

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  6    NC_000072.6 (113483356..113493343)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    6    NC_000072.5 (113433563..113443332)

[Figure: genomic context of Creld1 on chromosome 6 (NC_000072.6).]


Transcript information: This gene has 6 transcripts

Gene: Creld1 ENSMUSG00000030284

Description: cysteine-rich with EGF-like domains 1 [Source:MGI Symbol;Acc:MGI:2152539]
Location: Chromosome 6: 113,483,297-113,493,343 forward strand. GRCm38:CM000999.2
About this gene: This gene has 6 transcripts (splice variants), 192 orthologues, 8 paralogues, is a member of 1 Ensembl protein family and is associated with 8 phenotypes.

Transcripts

Name        Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt        Flags
Creld1-201  ENSMUST00000032422.5  2306  420aa       ENSMUSP00000032422.5  Protein coding   CCDS20423  A8C1T7 Q91XD7  TSL:1 GENCODE basic APPRIS P1
Creld1-204  ENSMUST00000147932.7  1609  No protein  -                     Retained intron  -          -              TSL:1
Creld1-206  ENSMUST00000204920.1  1327  No protein  -                     Retained intron  -          -              TSL:NA
Creld1-203  ENSMUST00000135852.1  920   No protein  -                     Retained intron  -          -              TSL:3
Creld1-202  ENSMUST00000129125.1  889   No protein  -                     Retained intron  -          -              TSL:2
Creld1-205  ENSMUST00000156764.1  643   No protein  -                     Retained intron  -          -              TSL:1


[Figure: Ensembl genome browser view of a 30.05 kb region of chromosome 6 (113.48 Mb to 113.50 Mb, contig AC153910.6). Forward strand: Creld1-201 (protein coding) and Creld1-202 to Creld1-206 (retained intron); Il17rc-201, Il17rc-203, Il17rc-204 and Il17rc-205 (protein coding) and Il17rc-202 (retained intron). Reverse strand: Prrt3-201 to Prrt3-205 (protein coding). Regulatory Build track legend: CTCF, open chromatin, promoter, promoter flank. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding) and non-protein coding (processed transcript).]


Transcript: ENSMUST00000032422

[Figure: Ensembl protein view of Creld1-201 (protein coding; 10.04 kb, forward strand), translation ENSMUSP00000032..., with a scale bar from 0 to 420. Annotation tracks shown:
Transmembrane heli..., Low complexity (Seg), Cleavage site (Sign...
Superfamily: Growth factor receptor cysteine-rich domain superfamily
SMART: EGF-like domain; Furin-like repeat; EGF-like calcium-binding domain
Pfam: Domain of unknown function DUF3456; EGF-like calcium-binding domain
PROSITE profiles: EGF-like domain
PROSITE patterns: EGF-like calcium-binding, conserved site; EGF-like, conserved site; EGF-type aspartate/asparagine hydroxylation site; Laminin EGF domain
PANTHER: PTHR24034; PTHR24034:SF114
Gene3D: 2.90.20.10; 2.10.25.10
CDD: Furin-like repeat; cd00054
Sequence variants (dbSNP and all other sources), with legend: missense variant, splice region variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
