https://www.alphaknockout.com

Mouse Hnrnpul1 Knockout Project (CRISPR/Cas9)

Objective: To create a Hnrnpul1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Hnrnpul1 gene (NCBI Reference Sequence: NM_144922; Ensembl: ENSMUSG00000040725) is located on mouse chromosome 7. Fifteen exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 15 (Transcript: ENSMUST00000206832). Exons 2 to 5 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs to produce KO mice. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts at about 11.49% of the coding region. Exons 2 to 5 cover 19.17% of the coding region. The size of the effective KO region is ~5898 bp. The KO region does not contain any other known gene.
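The percentages in the note can be turned into approximate nucleotide counts. The sketch below is illustrative only: it assumes the percentages are relative to the 859-codon coding sequence of ENSMUST00000206832 (protein length taken from the transcript table in this report) with the stop codon excluded, which the report does not state. The function name `percent_to_nt` is hypothetical.

```python
# Assumption: percentages refer to the coding sequence of ENSMUST00000206832,
# taken as 859 codons with the stop codon excluded.
PROTEIN_LENGTH_AA = 859
cds_length_nt = PROTEIN_LENGTH_AA * 3


def percent_to_nt(percent, total_nt):
    """Convert a percentage of the coding region to a rounded nucleotide count."""
    return round(total_nt * percent / 100)


# Approximate offset of exon 2 from the start of the coding sequence.
exon2_start_offset_nt = percent_to_nt(11.49, cds_length_nt)
# Approximate amount of coding sequence carried by exons 2 to 5.
deleted_coding_nt = percent_to_nt(19.17, cds_length_nt)

print(f"coding sequence length: {cds_length_nt} nt")
print(f"exon 2 starts near coding nt: {exon2_start_offset_nt}")
print(f"coding nt removed with exons 2 to 5: {deleted_coding_nt}")
# Estimate only: a remainder other than 0 would be consistent with a frameshift.
print(f"remainder modulo 3: {deleted_coding_nt % 3}")
```

These values are estimates derived from rounded percentages; the exact exon boundaries should be confirmed against the Ensembl annotation before drawing conclusions about the reading frame.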

Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', with exons 1, 2, 3, 4, 5 and 15 labeled and two gRNA regions marked. Legend: exon of mouse Hnrnpul1; knockout region.]

Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: self-alignment dot plot of the upstream sequence. Legend: forward; reverse complement. Axis label: Sequence 12.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: self-alignment dot plot of the downstream sequence. Legend: forward; reverse complement. Axis label: Sequence 12.]

Note: The 2000 bp section downstream of Exon 5 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
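The self-alignment behind a dot plot can be sketched as follows. This is a minimal illustration, not the tool used to produce the plots in this report: it records every pair of positions whose 15 bp windows match exactly, on the forward strand or as a reverse complement. The function names and the toy sequence are hypothetical.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def self_dot_plot(seq, window=15):
    """Return (forward, reverse) lists of (i, j) window start positions.

    forward: windows at i and j (j > i) are identical.
    reverse: the window at j equals the reverse complement of the window at i.
    """
    seq = seq.upper()
    index = defaultdict(list)
    last_start = len(seq) - window
    for i in range(last_start + 1):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(last_start + 1):
        word = seq[i:i + window]
        forward.extend((i, j) for j in index[word] if j > i)
        reverse.extend((i, j) for j in index.get(reverse_complement(word), []))
    return forward, reverse


# Toy input: a 16 bp unit repeated three times forms a tandem repeat.
tandem = "ACGGTCATTGCAGTCA" * 3
forward_hits, reverse_hits = self_dot_plot(tandem)
print(forward_hits[:3])
```

Off-diagonal forward hits spaced at a constant distance, as in the toy input, are the pattern a tandem repeat leaves in the dot plot matrix.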

Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the upstream sequence. Axis label: Sequence 12.]

Summary: Full length (2000 bp) | A (26.1%, 522) | C (22.05%, 441) | T (29.85%, 597) | G (22.0%, 440)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
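A sliding-window GC profile of the kind summarized above can be sketched as follows. The 300 bp window comes from this report; the step size and the 0.65 cutoff for a "high GC" window are illustrative assumptions, since the report does not state either value. All function names are hypothetical.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def sliding_gc(seq, window=300, step=50):
    """Return (start, gc_fraction) for each window; step is an assumed value."""
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


def high_gc_windows(profile, threshold=0.65):
    """Windows whose GC fraction exceeds an assumed, illustrative threshold."""
    return [(start, gc) for start, gc in profile if gc > threshold]


# Toy input: a GC-only block followed by an AT-only block.
toy = "GC" * 150 + "AT" * 150
profile = sliding_gc(toy, window=300, step=300)
print(profile)
print(high_gc_windows(profile))
```

An empty result from `high_gc_windows` on a real flank would correspond to the report's statement that no significant high GC-content region is found, under whatever cutoff is chosen.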

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the downstream sequence. Axis label: Sequence 12.]

Summary: Full length (2000 bp) | A (28.0%, 560) | C (20.75%, 415) | T (28.55%, 571) | G (22.7%, 454)

Note: The 2000 bp section downstream of Exon 5 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
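The base counts in the two summaries can be checked for consistency and converted to an overall GC percentage. The sketch below uses only the counts printed in this report; the dictionary layout and function name are hypothetical.

```python
# Base counts copied from the two summary lines in this report.
COUNTS = {
    "upstream of exon 2": {"A": 522, "C": 441, "T": 597, "G": 440},
    "downstream of exon 5": {"A": 560, "C": 415, "T": 571, "G": 454},
}


def gc_percent(counts):
    """Overall GC percentage from a dict of per-base counts."""
    total = sum(counts.values())
    return 100 * (counts["G"] + counts["C"]) / total


for name, counts in COUNTS.items():
    total = sum(counts.values())
    print(f"{name}: total={total} bp, GC={gc_percent(counts):.2f}%")
```

Both flanks sum to the stated 2000 bp, and the overall GC content is about 44% in each, which agrees with the notes that neither region is GC-rich.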

BLAT Search Results (up)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
---------------------------------------------------------------------------------------
YourSeq   2000      1  2000   2000    100.0%  chr7   -        25750991   25752990  2000
YourSeq     36   1648  1717   2000     92.9%  chr7   +        25751274   25751343    70
YourSeq     33    681   713   2000    100.0%  chr1   -       157438784  157438816    33
YourSeq     32   1405  1449   2000     94.5%  chr10  -        11501381   11501639   259
YourSeq     29    219   278   2000     96.8%  chr1   -       191979201  191979469   269
YourSeq     25    809   837   2000     80.8%  chr14  +        30640351   30640376    26
YourSeq     24    693   716   2000    100.0%  chr12  -        19077676   19077699    24
YourSeq     23   1196  1219   2000    100.0%  chr1   -        23283899   23283926    28
YourSeq     21    354   374   2000    100.0%  chr12  -       106270906  106270926    21
YourSeq     21    858   878   2000    100.0%  chr10  +        10395250   10395270    21
YourSeq     21   1459  1487   2000     86.3%  chr1   +        37510860   37510888    29
YourSeq     20    802   821   2000    100.0%  chr1   -        67722146   67722165    20
YourSeq     20    195   216   2000     95.5%  chr10  +        23567100   23567121    22
YourSeq     20   1585  1604   2000    100.0%  chr1   +        84782533   84782552    20
YourSeq     20   1289  1308   2000    100.0%  chr1   +        51207259   51207278    20

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
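Rows of BLAT output like those above can be parsed to separate the self hit from secondary hits. The sketch below is an illustration, not part of the report's pipeline: it assumes whitespace-separated rows with the eleven columns shown in the header, optionally preceded by the "browser details" link text, and uses three rows copied from the upstream results. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BlatHit:
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(line):
    """Parse one whitespace-separated BLAT result row into a BlatHit."""
    fields = line.split()
    # Drop the "browser details" link text if it is present.
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    (_query, score, q_start, q_end, q_size,
     identity, chrom, strand, t_start, t_end, span) = fields
    return BlatHit(int(score), int(q_start), int(q_end), int(q_size),
                   float(identity.rstrip("%")), chrom, strand,
                   int(t_start), int(t_end), int(span))


rows = [
    "browser details YourSeq 2000 1 2000 2000 100.0% chr7 - 25750991 25752990 2000",
    "browser details YourSeq 36 1648 1717 2000 92.9% chr7 + 25751274 25751343 70",
    "browser details YourSeq 33 681 713 2000 100.0% chr1 - 157438784 157438816 33",
]
hits = [parse_blat_row(row) for row in rows]
self_hit = max(hits, key=lambda hit: hit.score)
secondary = [hit for hit in hits if hit is not self_hit]
best_secondary_score = max(hit.score for hit in secondary)
print(f"self hit: {self_hit.chrom}:{self_hit.t_start}-{self_hit.t_end}")
print(f"best secondary score: {best_secondary_score}")
```

Comparing the best secondary score with the query size gives a quick, informal sense of how unique the flank is; what counts as "significant" similarity is a judgment the report does not quantify.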

BLAT Search Results (down)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
---------------------------------------------------------------------------------------
YourSeq   2000      1  2000   2000    100.0%  chr7   -        25743093   25745092  2000
YourSeq    151    937  1122   2000     89.1%  chr10  -        43925744   43925920   177
YourSeq    145    950  1122   2000     93.5%  chr19  -        52554750   52554922   173
YourSeq    142    950  1126   2000     94.0%  chr12  -        98374587   98374765   179
YourSeq    141    957  1122   2000     93.8%  chr3   -        37440849   37441016   168
YourSeq    139    968  1122   2000     94.9%  chr7   -       127127269  127127423   155
YourSeq    139    970  1122   2000     95.5%  chr3   -        31648539   31648691   153
YourSeq    138    969  1122   2000     94.9%  chrX   -       152307706  152307859   154
YourSeq    138    970  1122   2000     95.5%  chr7   -        43722822   43722976   155
YourSeq    137    969  1126   2000     93.7%  chr7   -        66892645   66892803   159
YourSeq    137    957  1126   2000     91.6%  chr13  -        74843085   74843269   185
YourSeq    136    969  1122   2000     94.2%  chr2   -       129982669  129982822   154
YourSeq    136    969  1122   2000     94.2%  chr17  -        26278704   26278857   154
YourSeq    136    969  1128   2000     92.5%  chr15  -       100017948  100018107   160
YourSeq    136    970  1126   2000     93.7%  chr10  -         3589954    3590126   173
YourSeq    136    969  1122   2000     94.8%  chrX   +        94227138   94227293   156
YourSeq    136    969  1122   2000     94.2%  chr16  +        15790820   15790973   154
YourSeq    135    969  1122   2000     94.8%  chr1   -         4815929    4816082   154
YourSeq    135    975  1126   2000     94.8%  chr7   +        99490280   99490432   153
YourSeq    135    969  1122   2000     94.2%  chr18  +        60773373   60773529   157

Note: The 2000 bp section downstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.
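The chr7 coordinates of the two self hits can be used to cross-check the stated size of the KO region. The sketch below assumes that the 2000 bp upstream and downstream flanks directly abut the region to be deleted, which the report implies but does not state; the coordinates are copied from the first row of each BLAT table. The function name is hypothetical.

```python
# Self-hit coordinates on chr7 (minus strand), copied from the BLAT results.
UPSTREAM_FLANK = (25750991, 25752990)    # 2000 bp upstream of exon 2
DOWNSTREAM_FLANK = (25743093, 25745092)  # 2000 bp downstream of exon 5


def interval_between(flank_a, flank_b):
    """Return the inclusive (start, end) of the gap between two flanks."""
    low, high = sorted([flank_a, flank_b])
    return low[1] + 1, high[0] - 1


ko_start, ko_end = interval_between(UPSTREAM_FLANK, DOWNSTREAM_FLANK)
ko_size = ko_end - ko_start + 1
print(f"region between flanks: chr7:{ko_start}-{ko_end} ({ko_size} bp)")
```

Under this assumption the gap between the flanks is 5898 bp, which agrees with the effective KO region size given in the strategy note.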

Gene and information:

Hnrnpul1 heterogeneous nuclear ribonucleoprotein U-like 1 [Mus musculus (house mouse)]
Gene ID: 232989, updated on 24-Oct-2019

Gene summary

Official Symbol: Hnrnpul1 (provided by MGI)
Official Full Name: heterogeneous nuclear ribonucleoprotein U-like 1 (provided by MGI)
Primary source: MGI:MGI:2443517
See related: Ensembl:ENSMUSG00000040725
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: E1BAP5; E1B-AP5; Hnrnpul; Hnrpul1; E130317O14Rik
Expression: Ubiquitous expression in thymus adult (RPKM 34.0), testis adult (RPKM 32.4) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 7; 7 A3
Exon count: 17

Annotation release | Status | Assembly | Chr | Location
108 | current | GRCm38.p6 (GCF_000001635.26) | 7 | NC_000073.6 (25721161..25756268, complement)
Build 37.2 | previous assembly | MGSCv37 (GCF_000001635.18) | 7 | NC_000073.5 (26507055..26539739, complement)

[Figure: genomic context of Hnrnpul1 on chromosome 7 (NC_000073.6).]

Transcript information: This gene has 5 transcripts

Gene: Hnrnpul1 ENSMUSG00000040725

Description: heterogeneous nuclear ribonucleoprotein U-like 1 [Source:MGI Symbol;Acc:MGI:2443517]
Gene Synonyms: E130317O14Rik, E1B-AP5, E1BAP5, Hnrnpul, Hnrpul1
Location: Chromosome 7: 25,721,165-25,754,757 reverse strand. GRCm38:CM001000.2
About this gene: This gene has 5 transcripts (splice variants), 190 orthologues, 3 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name | Transcript ID | bp | Protein | Translation ID | Biotype | CCDS | UniProt | Flags
Hnrnpul1-205 | ENSMUST00000206832.1 | 3784 | 859aa | ENSMUSP00000146263.1 | Protein coding | CCDS20995 | Q8VDM6 | TSL:1, GENCODE basic, APPRIS P1
Hnrnpul1-202 | ENSMUST00000108401.2 | 1240 | 340aa | ENSMUSP00000104038.1 | Protein coding | CCDS39841 | Q8VDM6 | TSL:1, GENCODE basic
Hnrnpul1-201 | ENSMUST00000043765.13 | 3354 | 759aa | ENSMUSP00000037268.8 | Protein coding | - | Q8VDM6 | TSL:1, GENCODE basic
Hnrnpul1-203 | ENSMUST00000206041.1 | 2615 | No protein | - | Retained intron | - | - | TSL:NA
Hnrnpul1-204 | ENSMUST00000206260.1 | 2296 | No protein | - | Retained intron | - | - | TSL:NA

[Figure: Ensembl genome browser view of a 53.59 kb region of chromosome 7 (25.72 Mb to 25.76 Mb), with forward and reverse strands. Tracks: Contigs (AC162614.2); Genes (Comprehensive set); Regulatory Build. All transcripts shown are drawn on the reverse strand:
Ccdc97: 201 (protein coding), 202 (protein coding), 203 (lncRNA), 204 (retained intron), 205 (lncRNA), 206 (lncRNA), 207 (lncRNA), 208 (retained intron), 209 (lncRNA)
Hnrnpul1: 201 (protein coding), 202 (protein coding), 203 (retained intron), 204 (retained intron), 205 (protein coding)
Axl: 201 (protein coding), 202 (protein coding), 203 (retained intron), 205 (retained intron), 206 (retained intron)
Regulation legend: CTCF; Enhancer; Open Chromatin; Promoter; Promoter Flank; Transcription Factor Binding Site.
Gene legend: Protein Coding (Ensembl protein coding; merged Ensembl/Havana); Non-Protein Coding (RNA gene; processed transcript).]

Transcript: ENSMUST00000206832

[Figure: Ensembl protein summary for Hnrnpul1-205 (protein coding; reverse strand, 33.59 kb), translation ENSMUSP00000146..., scale bar 0 to 859. Annotation tracks:
MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils)
Superfamily: SAP domain superfamily; Concanavalin A-like lectin/glucanase domain superfamily; P-loop containing nucleoside triphosphate hydrolase
SMART: SAP domain; SPRY domain
Prints: PR01217
Pfam: SAP domain; SPRY domain; PF13671
PROSITE profiles: SAP domain; B30.2/SPRY domain
PANTHER: Heterogeneous nuclear ribonucleoprotein U-like protein 1; PTHR12381
Gene3D: SAP domain superfamily; 2.60.120.920; 3.40.50.300
CDD: Heterogeneous nuclear ribonucleoprotein U, SPRY domain
Sequence variants (dbSNP and all other sources). Variant legend: inframe insertion; missense variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
