http://www.alphaknockout.com/

Mouse Ubap2l Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Ubap2l conditional knockout Mouse model (C57BL/6N) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Ubap2l gene (NCBI Reference Sequence: NM_001165983; Ensembl: ENSMUSG00000042520) is located on mouse chromosome 3. 28 exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 28 (Transcript: ENSMUST00000064639). Exons 5-8 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Ubap2l gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-18H19 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a transgenic gene disruption exhibit decreased female body size and reduced female fertility.

Exon 5 starts at about 8.39% of the coding region. Knockout of exons 5-8 will result in a frameshift of the gene. The size of intron 4 for 5'-loxP site insertion is 4683 bp, and the size of intron 8 for 3'-loxP site insertion is 2694 bp. The size of the effective cKO region is ~5651 bp. The cKO region does not contain any other known gene.
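The frameshift statement above comes down to modular arithmetic: a deletion shifts the reading frame when the number of removed coding bases is not a multiple of 3. The sketch below illustrates that check and converts the 8.39% figure into an approximate base offset. The function names are hypothetical, the exon lengths in the usage lines are placeholders rather than the real Ubap2l exon sizes, and the coding length of 3336 bp is an assumption derived from the 1112 aa protein listed for ENSMUST00000064639 (stop codon not counted).

```python
from typing import Sequence


def causes_frameshift(deleted_exon_lengths_bp: Sequence[int]) -> bool:
    """Return True if deleting these coding exon lengths shifts the reading frame."""
    if any(length < 0 for length in deleted_exon_lengths_bp):
        raise ValueError("exon lengths must be non-negative")
    # The frame is preserved only when the deleted coding length is a multiple of 3.
    return sum(deleted_exon_lengths_bp) % 3 != 0


def approx_cds_offset(percent_of_cds: float, cds_length_bp: int) -> int:
    """Convert a position given as a percentage of the coding region to a base offset."""
    return round(percent_of_cds / 100 * cds_length_bp)


# Assumption: 1112 aa * 3 = 3336 coding bases (stop codon not counted).
ASSUMED_CDS_LENGTH_BP = 1112 * 3

if __name__ == "__main__":
    # Placeholder exon lengths for illustration only, NOT the real sizes of exons 5-8.
    print(causes_frameshift([100, 121, 90, 75]))
    print(approx_cds_offset(8.39, ASSUMED_CDS_LENGTH_BP))
```

Under the assumed coding length, 8.39% corresponds to roughly base 280 of the coding sequence. The real frameshift check would use the coding lengths of exons 5-8 taken from the transcript annotation.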

The transcripts Ubap2l-209, Ubap2l-211, Ubap2l-213, Ubap2l-217 and Ubap2l-221 will not be affected by deleting this cKO region.


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (5' to 3', exons 1, 5, 6, 7, 8 and 28 labelled, with two gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele after Cre recombination. Legend: exon of mouse Ubap2l, homology arm, cKO region, loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence aligned against itself, showing forward and reverse-complement matches. Axis label: Sequence 12.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to check for tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
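The self-alignment described in the note can be illustrated with a small word-match search: every pair of positions whose 10 bp windows are identical becomes a dot, and tandem repeats appear as runs of dots parallel to the main diagonal. This is a minimal sketch, not the tool used for the report; the function name is hypothetical, only forward-strand matches are covered, and the test sequence is made up.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def self_matches(seq: str, window: int = 10) -> List[Tuple[int, int]]:
    """Return sorted (i, j) pairs, i < j, whose window-length words are identical."""
    if window <= 0:
        raise ValueError("window must be positive")
    # Index every word of the given length by its start positions.
    positions: Dict[str, List[int]] = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    # Any word seen at two or more positions contributes off-diagonal dots.
    hits: List[Tuple[int, int]] = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


if __name__ == "__main__":
    # Made-up sequence: a 12 bp unit repeated twice in tandem.
    repeated = "AAACCCGGGTTT" * 2
    print(self_matches(repeated))  # dots along a diagonal offset by 12
```

An empty result for a real sequence would correspond to a dot plot with nothing off the main diagonal, which is the situation the note describes.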

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution plot. Axis label: Sequence 12.]

Summary: Full Length(11952bp) | A(27.52% 3289) | C(17.78% 2125) | G(20.07% 2399) | T(34.63% 4139)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
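The base counts in the summary are enough to recompute the overall GC fraction, and a sliding window gives the kind of profile the plot shows. The sketch below does both; the function names are hypothetical and the windowed example uses a made-up toy sequence, since the 11952 bp sequence itself is not included in this report.

```python
from typing import List


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    upper = seq.upper()
    return (upper.count("G") + upper.count("C")) / len(upper)


def windowed_gc(seq: str, window: int = 300, step: int = 1) -> List[float]:
    """GC fraction of each full window, moving by `step` bases."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts copied from the summary line of the report.
BASE_COUNTS = {"A": 3289, "C": 2125, "G": 2399, "T": 4139}
TOTAL_BASES = sum(BASE_COUNTS.values())
OVERALL_GC = (BASE_COUNTS["G"] + BASE_COUNTS["C"]) / TOTAL_BASES

if __name__ == "__main__":
    print(TOTAL_BASES)                      # matches the stated full length
    print(round(OVERALL_GC * 100, 2))       # overall GC percentage
    print(windowed_gc("GGCCAATT", window=4, step=4))  # toy sequence
```

The counts sum to the stated 11952 bp and give an overall GC content of about 37.85%, which fits the note that no high-GC region stands out.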


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr3   -       90039267   90042266   3000
YourSeq  258    1716   2251  3000   92.3%     chr5   -       124479394  124622403  143010
YourSeq  224    1743   2254  3000   93.8%     chr13  +       81499736   81708453   208718
YourSeq  222    1751   2277  3000   91.1%     chr2   +       91225107   91225737   631
YourSeq  217    1719   2254  3000   91.9%     chr12  +       22065913   22386971   321059
YourSeq  204    1716   2251  3000   94.0%     chr11  -       117908218  117908796  579
YourSeq  199    1716   2262  3000   86.3%     chr11  +       51618834   51619195   362
YourSeq  197    1740   2234  3000   84.1%     chr12  +       86934083   86934363   281
YourSeq  191    1740   2252  3000   82.8%     chr11  +       3459716    3460045    330
YourSeq  185    1724   2251  3000   83.8%     chr8   -       96405743   96406048   306
YourSeq  178    1617   2242  3000   83.0%     chr2   -       34765542   34765879   338
YourSeq  170    1792   2251  3000   82.6%     chr12  +       76753972   76754274   303
YourSeq  165    1746   2223  3000   86.4%     chr11  -       87097758   87098133   376
YourSeq  161    1740   2115  3000   95.6%     chr2   +       132148203  132148679  477
YourSeq  154    1751   2242  3000   82.7%     chrX   -       42045731   42046112   382
YourSeq  154    1716   2169  3000   90.6%     chrX   -       35857771   35858367   597
YourSeq  151    1719   2103  3000   92.7%     chr3   -       130661285  130661678  394
YourSeq  151    1847   2242  3000   94.8%     chr1   +       170778864  170779431  568
YourSeq  150    1717   1886  3000   94.6%     chr16  -       13782853   13783021   169
YourSeq  149    1712   1883  3000   92.7%     chr4   +       35300466   35300632   167

Note: The 3000 bp section upstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr3   -       90030815   90033814   3000
YourSeq  112    682    1414  3000   92.4%     chr15  -       79204431   79548033   343603
YourSeq  84     1359   2029  3000   86.4%     chr11  +       60110947   60125072   14126
YourSeq  83     1359   2028  3000   89.0%     chr11  -       3264108    3316303    52196
YourSeq  57     1953   2030  3000   91.5%     chr18  -       27406804   27406883   80
YourSeq  50     1947   2037  3000   85.8%     chr1   -       62915326   62915422   97
YourSeq  50     1946   2031  3000   83.8%     chr10  +       20450536   20450623   88
YourSeq  48     1348   1414  3000   87.7%     chr17  -       45211166   45211232   67
YourSeq  47     982    1414  3000   86.2%     chr13  -       93487310   93487767   458
YourSeq  46     1950   2016  3000   85.1%     chr10  +       76010631   76010700   70
YourSeq  46     1713   1991  3000   68.0%     chr1   +       122637251  122637439  189
YourSeq  45     1358   1484  3000   80.3%     chr19  +       4459591    4468622    9032
YourSeq  44     1953   2029  3000   84.7%     chr11  -       46685412   46685493   82
YourSeq  43     1359   1414  3000   89.3%     chr16  +       65141995   65142052   58
YourSeq  43     1354   1417  3000   84.2%     chr1   +       166492673  166492736  64
YourSeq  42     1345   1414  3000   80.0%     chr9   +       57461997   57462066   70
YourSeq  41     1953   2009  3000   93.7%     chr5   -       114350726  114350783  58
YourSeq  41     1946   2011  3000   95.8%     chr1   -       88659130   88659202   73
YourSeq  40     1953   2011  3000   91.7%     chr18  -       43915169   43915228   60
YourSeq  39     1359   1419  3000   82.0%     chr13  -       38809630   38809690   61

Note: The 3000 bp section downstream of Exon 8 is BLAT searched against the genome. No significant similarity is found.
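The BLAT rows can be screened programmatically by parsing each row and separating the expected self hit (full-length, 100% identity) from secondary hits. The sketch below assumes the whitespace-separated column order shown in the tables (query, score, query start, query end, query size, identity, chromosome, strand, target start, target end, span). The class and function names and the score cutoff are hypothetical, since the report does not state what threshold counts as significant.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BlatHit:
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int

    @property
    def is_self_hit(self) -> bool:
        # The query matching its own locus: full length at 100% identity.
        return (self.q_start == 1 and self.q_end == self.q_size
                and self.identity == 100.0)


def parse_hit(line: str) -> BlatHit:
    """Parse one whitespace-separated BLAT result row."""
    fields = line.split()
    # Drop the "browser details" link text if the row still carries it.
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    if len(fields) != 11:
        raise ValueError(f"expected 11 fields, got {len(fields)}: {line!r}")
    (_query, score, q_start, q_end, q_size, identity,
     chrom, strand, t_start, t_end, span) = fields
    return BlatHit(int(score), int(q_start), int(q_end), int(q_size),
                   float(identity.rstrip("%")), chrom, strand,
                   int(t_start), int(t_end), int(span))


def secondary_hits(hits: List[BlatHit], min_score: int) -> List[BlatHit]:
    """Hits other than the self hit whose score reaches the (hypothetical) cutoff."""
    return [h for h in hits if not h.is_self_hit and h.score >= min_score]


# First two rows of the upstream table, copied from the report.
SAMPLE_ROWS = [
    "browser details YourSeq 3000 1 3000 3000 100.0% chr3 - 90039267 90042266 3000",
    "YourSeq 258 1716 2251 3000 92.3% chr5 - 124479394 124622403 143010",
]

if __name__ == "__main__":
    parsed = [parse_hit(row) for row in SAMPLE_ROWS]
    print(secondary_hits(parsed, min_score=1000))  # cutoff chosen for illustration
```

In both tables the best secondary score (258 upstream, 112 downstream) is far below the self-hit score of 3000, which fits the notes that no significant similarity is found.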

Gene information:

Ubap2l ubiquitin-associated protein 2-like [ Mus musculus (house mouse) ]

Gene ID: 74383, updated on 10-Oct-2019

Gene summary

Official Symbol: Ubap2l (provided by MGI)
Official Full Name: ubiquitin-associated protein 2-like (provided by MGI)
Primary source: MGI:MGI:1921633
See related: Ensembl:ENSMUSG00000042520
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: C77168; Nice-4; mKIAA0144; 3110083O19Rik; 4932431F02Rik; A430103N23Rik
Expression: Ubiquitous expression in limb E14.5 (RPKM 19.2), testis adult (RPKM 17.9) and 28 other tissues
Orthologs: human; all

Genomic context

Location: 3 F1; 3

Exon count: 35

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  3    NC_000069.6 (89999589..90052609, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    3    NC_000069.5 (89803511..89856437, complement)

[Figure: genomic context of Ubap2l, Chromosome 3 - NC_000069.6]


Transcript information: This gene has 23 transcripts

Gene: Ubap2l ENSMUSG00000042520

Description: ubiquitin-associated protein 2-like [Source:MGI Symbol;Acc:MGI:1921633]
Gene Synonyms: 3110083O19Rik, 4932431F02Rik, A430103N23Rik, NICE-4
Location: Chromosome 3: 90,000,140-90,052,628 reverse strand. GRCm38:CM000996.2
About this gene: This gene has 23 transcripts (splice variants), 203 orthologues, 1 paralogue, is a member of 1 Ensembl protein family and is associated with 3 phenotypes.

Transcripts

Name Transcript ID bp Protein Translation ID Biotype CCDS UniProt Flags

Ubap2l-202 ENSMUST00000064639.14 4073 1112aa ENSMUSP00000066138.8 Protein coding CCDS50966 Q80X50 TSL:1 GENCODE basic

Ubap2l-208 ENSMUST00000196843.4 3975 1107aa ENSMUSP00000143459.1 Protein coding CCDS79960 Q80X50 TSL:5 GENCODE basic

Ubap2l-203 ENSMUST00000090908.10 3629 983aa ENSMUSP00000088424.7 Protein coding CCDS79959 A0A0H2UH17 TSL:1 GENCODE basic APPRIS ALT1

Ubap2l-204 ENSMUST00000195995.4 3516 1014aa ENSMUSP00000143638.1 Protein coding CCDS50965 Q80X50 TSL:1 GENCODE basic APPRIS P3

Ubap2l-201 ENSMUST00000029553.15 3509 1105aa ENSMUSP00000029553.9 Protein coding CCDS38498 Q80X50 TSL:1 GENCODE basic

Ubap2l-220 ENSMUST00000199834.4 3494 1014aa ENSMUSP00000143254.1 Protein coding CCDS50965 Q80X50 TSL:1 GENCODE basic APPRIS P3

Ubap2l-215 ENSMUST00000198322.4 3385 1067aa ENSMUSP00000142524.1 Protein coding CCDS79958 A0A0G2JDV6 TSL:1 GENCODE basic APPRIS ALT1

Ubap2l-211 ENSMUST00000197177.4 3503 497aa ENSMUSP00000143246.1 Protein coding - A0A0G2JFN7 CDS 5' incomplete TSL:1

Ubap2l-221 ENSMUST00000199929.1 623 57aa ENSMUSP00000142488.1 Protein coding - A0A0G2JDT1 CDS 3' incomplete TSL:2

Ubap2l-209 ENSMUST00000196917.1 592 50aa ENSMUSP00000142602.1 Protein coding - A0A0G2JE24 CDS 3' incomplete TSL:3

Ubap2l-217 ENSMUST00000199050.1 473 129aa ENSMUSP00000142719.1 Protein coding - A0A0G2JEC6 CDS 5' incomplete TSL:3

Ubap2l-206 ENSMUST00000196633.4 410 105aa ENSMUSP00000143423.1 Protein coding - A0A0G2JG47 CDS 3' incomplete TSL:3

Ubap2l-213 ENSMUST00000197903.4 328 81aa ENSMUSP00000143519.1 Protein coding - A0A0G2JGD0 CDS 3' incomplete TSL:3

Ubap2l-219 ENSMUST00000199612.4 2789 No protein - Retained intron - - TSL:1

Ubap2l-205 ENSMUST00000196568.4 2681 No protein - Retained intron - - TSL:1

Ubap2l-216 ENSMUST00000199016.1 2396 No protein - Retained intron - - TSL:NA

Ubap2l-218 ENSMUST00000199301.1 2318 No protein - Retained intron - - TSL:1

Ubap2l-214 ENSMUST00000198282.1 2271 No protein - Retained intron - - TSL:1

Ubap2l-207 ENSMUST00000196747.1 2099 No protein - Retained intron - - TSL:NA

Ubap2l-212 ENSMUST00000197633.1 1490 No protein - Retained intron - - TSL:NA

Ubap2l-210 ENSMUST00000196952.1 1398 No protein - Retained intron - - TSL:NA

Ubap2l-223 ENSMUST00000200301.1 620 No protein - Retained intron - - TSL:3

Ubap2l-222 ENSMUST00000200195.1 479 No protein - Retained intron - - TSL:3


[Figure: Ensembl genome browser view of a 72.49 kb region (90.00 Mb to 90.06 Mb). Tracks: Genes (Comprehensive set), Contigs (AC163616.3), Regulatory Build. Forward-strand genes: Gm19710 (lncRNA), 4933434E20Rik, AC163616.1, Gm16540 (lncRNA). Reverse-strand genes: Hax1, Ubap2l, Gm24608 (snoRNA), Mir7669 (miRNA). Regulation legend: CTCF, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: protein coding (Ensembl protein coding, merged Ensembl/Havana); non-protein coding (processed transcript, RNA gene).]


Transcript: ENSMUST00000064639

[Figure: protein domain view of Ubap2l-202 (protein coding, reverse strand, 52.19 kb), protein ENSMUSP00000066..., scale bar 0 to 1112. Annotation tracks: MobiDB lite; Low complexity (Seg); Superfamily: UBA-like superfamily; SMART: Ubiquitin-associated domain; Pfam: UBAP2/protein lingerer; PROSITE profiles: Ubiquitin-associated domain; PANTHER: PTHR16308, PTHR16308:SF18; Gene3D: 1.10.8.10; CDD: cd14277; sequence variants (dbSNP and all other sources). Variant legend: splice region variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC, VectorBuilder.
