Mouse Gcfc2 Conditional Knockout Project (CRISPR/Cas9)

https://www.alphaknockout.com

Objective:
To create a Gcfc2 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary:
The Gcfc2 gene (NCBI Reference Sequence: NM_177884; Ensembl: ENSMUSG00000035125) is located on mouse chromosome 6. Seventeen exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 17 (Transcript: ENSMUST00000043195).

Exon 5 is selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Gcfc2 gene.

To engineer the targeting vector, the homology arms and the cKO region will be generated by PCR using BAC clone RP23-442L23 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:
Exon 5 starts at about 29.69% of the coding region. Knockout of exon 5 will result in a frameshift of the gene.
The size of intron 4 for 5'-loxP site insertion: 2994 bp. The size of intron 5 for 3'-loxP site insertion: 3358 bp.
The size of the effective cKO region: ~616 bp.
The cKO region does not contain any other known gene.

Overview of the Targeting Strategy
[Figure: schematic of the wildtype allele (exons 1, 5 and 17 labeled; 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Gcfc2, homology arm, cKO region, loxP site.]

Overview of the Dot Plot
[Figure: self-alignment dot plot, window size 10 bp, showing forward and reverse-complement matches.]
Note: The sequence of the homology arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
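The self-alignment behind a dot plot can be illustrated with a minimal Python sketch. It is not the tool used to produce the report's figure; the function names, the exact-match criterion, and the toy sequences are assumptions made for illustration. Only the 10 bp window size comes from the report. Each off-diagonal forward match marks a direct repeat, and each reverse-complement match marks an inverted repeat.

```python
from collections import defaultdict

# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Compare a sequence with itself using exact window matches.

    Returns two lists of (i, j) window start positions with i < j:
    forward matches (direct repeats) and reverse-complement matches
    (inverted repeats). The main diagonal (i == j) is skipped.
    """
    seq = seq.upper()
    # Index every window by its start positions.
    index = defaultdict(list)
    for i in range(len(seq) - window + 1):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(len(seq) - window + 1):
        word = seq[i:i + window]
        forward.extend((i, j) for j in index[word] if j > i)
        rc_hits = index.get(reverse_complement(word), [])
        reverse.extend((i, j) for j in rc_hits if j > i)
    return forward, reverse


if __name__ == "__main__":
    # Toy sequences (hypothetical, not taken from the Gcfc2 locus).
    direct = "AACCGTAACCG"    # AACCG occurs twice in the same orientation
    inverted = "AACCGTCGGTT"  # AACCG followed by its reverse complement
    print(self_dot_plot(direct, window=5))
    print(self_dot_plot(inverted, window=5))
```

On a real 7 kb sequence, isolated single-window hits are expected by chance; it is long diagonal runs of consecutive hits that would indicate a repeat large enough to interfere with PCR or sequencing.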
Overview of the GC Content Distribution
[Figure: GC content distribution along the sequence, window size 300 bp.]
Summary: Full Length (7116 bp) | A (26.48%, 1884) | C (19.83%, 1411) | T (32.39%, 2305) | G (21.3%, 1516)
Note: The sequence of the homology arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

BLAT Search Results (up)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
---------------------------------------------------------------------------------------
YourSeq   3000      1  3000   3000    100.0%  chr6        +   81932657   81935656  3000
YourSeq    173   1493  1667   3000     99.5%  chrX        -   36466504   36466678   175
YourSeq    104    351   827   3000     80.0%  chr5        +  147746729  147747173   445
YourSeq     91    709   899   3000     88.9%  chr14       -   70822160   70822354   195
YourSeq     74    355   831   3000     71.7%  chr17       -   48742224   48742662   439
YourSeq     72    761   909   3000     87.5%  chr11       +   69315796   69315949   154
YourSeq     64    709   867   3000     81.4%  chr5        +  117186629  117186957   329
YourSeq     59    725   867   3000     87.4%  chr13       -   76014115   76014257   143
YourSeq     59    711   830   3000     84.8%  chr1        +  166108468  166108748   281
YourSeq     58    730   830   3000     82.4%  chr1        -   79662300   79662391    92
YourSeq     55    728   830   3000     86.6%  chr6        -   82809716   82809816   101
YourSeq     54    709   830   3000     92.2%  chr1        -  192660416  192660570   155
YourSeq     54    734   844   3000     89.8%  chr1        +   87303822   87303934   113
YourSeq     53    707   775   3000     85.3%  chr7        +   91524710   91524777    68
YourSeq     53   1145  1236   3000     79.4%  chr13       +   43692395   43692487    93
YourSeq     52    732   842   3000     86.2%  chrX        -   36032664   36032776   113
YourSeq     52    698   775   3000     83.4%  chr19       -   31156124   31156201    78
YourSeq     52   1148  1350   3000     93.3%  chr12       +   79417985   79418412   428
YourSeq     50    709   780   3000     84.8%  chrX        -  163213165  163213236    72
YourSeq     50   1201  1264   3000     89.1%  chr5        -  129667498  129667561    64

Note: The 3000 bp section upstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
---------------------------------------------------------------------------------------
YourSeq   3000      1  3000   3000    100.0%  chr6        +   81936273   81939272  3000
YourSeq    138    360   558   3000     86.2%  chr10       -   93496906   93497078   173
YourSeq    131    383   616   3000     86.6%  chr16       -   24959408   24959640   233
YourSeq    130    355   558   3000     81.6%  chr15       -   42876589   42876759   171
YourSeq    124    360   558   3000     84.8%  chr12       +   95099058   95099236   179
YourSeq    123    382   558   3000     89.0%  chr16       +   88055032   88055201   170
YourSeq    121    393   558   3000     88.1%  chr3        -   88705253   88705420   168
YourSeq    118    383   558   3000     86.1%  chr18       +   27993938   27994121   184
YourSeq    118    383   558   3000     82.0%  chr11       +   58202233   58202391   159
YourSeq    117    407   558   3000     90.6%  chr1        -  171432537  171432693   157
YourSeq    117     37   558   3000     78.5%  chr13       +   38051846   38052032   187
YourSeq    117    381   558   3000     81.8%  chr13       +   29001380   29001549   170
YourSeq    116    408   558   3000     89.4%  chr11       -   20809404   20809559   156
YourSeq    116    409   558   3000     93.4%  chr5        +  122409518  122409679   162
YourSeq    116    382   547   3000     83.5%  chr18       +   22898691   22898838   148
YourSeq    115    392   558   3000     84.6%  chr6        -   51521896   51522059   164
YourSeq    115    407   575   3000     88.4%  chr3        +   62568991   62569169   179
YourSeq    115    414   558   3000     91.0%  chr15       +   79207112   79207260   149
YourSeq    115    381   558   3000     85.7%  chr12       +   68156597   68156766   170
YourSeq    115    383   558   3000     88.3%  chr10       +   38382942   38383128   187

Note: The 3000 bp section downstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.

Gene and protein information:
Gcfc2 GC-rich sequence DNA binding factor 2 [Mus musculus (house mouse)]
Gene ID: 330361, updated on 12-Aug-2019

Gene summary
Official Symbol: Gcfc2 (provided by MGI)
Official Full Name: GC-rich sequence DNA binding factor 2 (provided by MGI)
Primary source: MGI:MGI:2141656
See related: Ensembl:ENSMUSG00000035125
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: GCF2; Tcf9; AW146020; A130099G21
Expression: Ubiquitous expression in CNS E11.5 (RPKM 5.4), limb E14.5 (RPKM 4.3) and 27 other tissues
Orthologs: human; all

Genomic context
Location: 6; 6 C3
Exon count: 22

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  6    NC_000072.6 (81910562..81959098)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    6    NC_000072.5 (81873663..81909092)

[Figure: genomic context on chromosome 6, NC_000072.6.]

Transcript information:
Gene: Gcfc2 ENSMUSG00000035125
Description: GC-rich sequence DNA binding factor 2 [Source:MGI Symbol;Acc:MGI:2141656]
Gene Synonyms: AW146020
Location: Chromosome 6: 81,923,669-81,959,915 forward strand. GRCm38:CM000999.2
About this gene: This gene has 8 transcripts (splice variants), 227 orthologues, 1 paralogue and is a member of 1 Ensembl protein family.
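As a rough cross-check of the coordinates quoted in this report, the following Python sketch recomputes the flank sizes, the gap between the flanks, and whether both flanks fall inside the Ensembl gene interval. It assumes that the BLAT hits and the Ensembl location refer to the same assembly (GRCm38) and that the interval between the two chr6 self-matches corresponds to the cKO region; neither assumption is stated explicitly in the report. The constant and function names are illustrative.

```python
# Coordinates copied from the report (1-based, inclusive).
GENE = (81_923_669, 81_959_915)             # Ensembl location of Gcfc2
UPSTREAM_FLANK = (81_932_657, 81_935_656)   # chr6 self-match, BLAT (up)
DOWNSTREAM_FLANK = (81_936_273, 81_939_272) # chr6 self-match, BLAT (down)


def span(interval):
    """Length in bp of a 1-based inclusive interval."""
    start, end = interval
    return end - start + 1


def gap_between(left, right):
    """Number of bases strictly between two non-overlapping intervals."""
    return right[0] - left[1] - 1


def contains(outer, inner):
    """True if `inner` lies entirely within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


if __name__ == "__main__":
    print("upstream flank length:", span(UPSTREAM_FLANK))
    print("downstream flank length:", span(DOWNSTREAM_FLANK))
    print("bases between flanks:", gap_between(UPSTREAM_FLANK, DOWNSTREAM_FLANK))
    print("flanks inside gene:",
          contains(GENE, UPSTREAM_FLANK) and contains(GENE, DOWNSTREAM_FLANK))
```

Under these assumptions, the number of bases between the two flanks comes to 616, which agrees with the effective cKO region size of about 616 bp given in the strategy notes.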
Transcripts

Name       Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Gcfc2-201  ENSMUST00000043195.10  4192  769aa       ENSMUSP00000035644.4  Protein coding           CCDS20260  Q8BKT3      TSL:1, GENCODE basic, APPRIS P1
Gcfc2-207  ENSMUST00000203959.1   630   175aa       ENSMUSP00000144868.1  Protein coding           -          A0A0N4SUY0  CDS 5' incomplete, TSL:3
Gcfc2-206  ENSMUST00000152996.7   3519  263aa       ENSMUSP00000138136.1  Nonsense mediated decay  -          S4R198      TSL:1
Gcfc2-204  ENSMUST00000132301.1   2429  No protein  -                     Retained intron          -          -           TSL:1
Gcfc2-202  ENSMUST00000127949.1   1953  No protein  -                     Retained intron          -          -           TSL:1
Gcfc2-203  ENSMUST00000129678.1   641   No protein  -                     Retained intron          -          -           TSL:3
Gcfc2-205  ENSMUST00000147673.1   612   No protein  -                     Retained intron          -          -           TSL:3
Gcfc2-208  ENSMUST00000204691.1   453   No protein  -                     Retained intron          -          -           TSL:NA

[Figure: Ensembl region view, 56.25 kb, 81.92 Mb to 81.96 Mb. Forward strand: Gcfc2-201 (protein coding), Gcfc2-206 (nonsense mediated decay), Gcfc2-204 (retained intron), Gcfc2-207 (protein coding), Gcfc2-208 (retained intron), Gcfc2-202 (retained intron), Gcfc2-203 (retained intron), Gcfc2-205 (retained intron). Contig: AC129024.4. Reverse strand: Mrpl19-201 (protein coding), Mrpl19-202 (retained intron), Mrpl19-203 (lncRNA). Regulatory Build track with legend: CTCF, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding), non-protein coding (RNA gene, processed transcript).]

Transcript: ENSMUST00000043195
[Figure: Gcfc2-201 (protein coding), 36.25 kb, forward strand, with protein ENSMUSP00000035... annotated by MobiDB lite, low complexity (Seg), coiled-coils (Ncoils), Pfam (GC-rich sequence DNA-binding factor-like domain) and PANTHER (GC-rich sequence DNA-binding factor, PTHR12214:SF3). Sequence variants (dbSNP and all other sources) are shown with legend: inframe deletion, missense variant, synonymous variant. Scale bar: 0 to 769 amino acids.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
