Extensive genomic plasticity identified by anti-cancer Huaier therapy

Manami Tanaka, Bradeion Institute of Medical Sciences, Co. Ltd.; Tomoo Tanaka, Bradeion Institute of Medical Sciences, Co. Ltd.; Fei Teng, BGI-Shenzhen; Hong Lin, BGI-Shenzhen; Chenying Xu, BGI-Shenzhen; Ning Li, BGI-Japan; Zhu Luo, BGI-Japan; Ding Wei, Japan Kampo NewMedicine, Co. Ltd.; Zhengxin Lu, QiDong Gaitianli Medicines Co. Ltd.

Biological Sciences - Article

Keywords: Huaier (Trametes robiniophila murr), Cancer therapy, Total RNA- and small RNA-sequencing, RNA editing events, Transcriptional control

Posted Date: October 22nd, 2020

DOI: https://doi.org/10.21203/rs.3.rs-91260/v1

License: This work is licensed under a Creative Commons Attribution 4.0 International License.

Extensive genomic plasticity identified by anti-cancer Huaier therapy

Manami Tanaka1)*, Tomoo Tanaka1), Fei Teng2), Hong Lin2), Chenying Xu2), Ning Li3), Zhu Luo3), Ding Wei4), and Zhengxin Lu5)

1) Bradeion Institute of Medical Sciences, Co. Ltd., Itado 344-1, Hiratsuka, Kanagawa 259-1211, Japan.
2) BGI-Shenzhen, Building NO.7, BGI Park, No.21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China.
3) BGI-Japan, Kobe KIMEC Center BLDG. 8F, 1-5-2 Minatojima-minamimachi, Chuo-ku, Kobe 650-0047, Japan.
4) Japan Kampo NewMedicine, Co. Ltd., 2-8-10 Kayaba-Cho, Chuo-Ku, Tokyo 103-0025, Japan.
5) QiDong Gaitianli Medicines Co. Ltd., Jiangsu Province, China.

Key words: Huaier (Trametes robiniophila murr), Cancer therapy, Total RNA- and small RNA-sequencing, RNA editing events, Transcriptional control

The clinical significance of the anti-cancer effects of Huaier has been emphasized recently. We have shown that the broad spectrum of Huaier effects is based on the rescue of the disrupted Hippo signalling pathway, especially through the rescue of transcriptional dysregulation. We initiated clinical research to understand the mode of action thoroughly by total RNA and small non-coding RNA sequencing. Here we report the surprising genomic plasticity induced by Huaier, observed over the time course of recovery from cancer. Extensive changes in RNA editing events were observed first, such as an average of 92,427 SNP variants per individual (22,688 in normal controls). The subsequent changes in transcription and translation resulted in drastic changes in the number of up- and down-regulated transcripts; at maximum, 85% of transcripts changed (23,210/27,447). Together with the many novel sequences found among small nuclear RNAs and the consequent gene silencing derived from miRNA-mediated post-transcriptional control, quantitative and qualitative changes in the expression of transcription factors (mean 1,115 per person) contributed to the rescue of dysregulated functions, which occurred in almost all physiological functions required for cancer recovery. Transcriptional control was demonstrated as a functional lineage map between each core transcription factor and its functionally linked transcripts, and recovery could be clearly seen as the eventual silencing of those massive changes. The genetic alterations, especially those related to the inhibition of the NFkB and TGF-b signalling pathways, showed significant effects on the improvement of prognosis. Thus, Huaier therapy contributes to cancer cell death and the regeneration of damaged tissue through extensive and systematic genomic modulation.

Recent advances in cancer therapy integrate the most advanced devices and inventions from broad areas of science1-3. Despite rapid progress in methodology4-8, many tasks remain unsolved for the successful control of cancer and of diseases that originate from multiple factors and environmental stresses over a lifetime9. Recent advances in technology have made it possible to cultivate useful natural herb species in quantity10-14 and to distribute them, quality-controlled, in conventional granule form. Since the 1970s, Huaier (Trametes robiniophila Murr) has been widely recognized as an anti-cancer drug in China12-14 (Chinese administration license No. Z-20000109). Huaier has been reported to have significant anti-cancer effects without any side effects or toxicity13-23. The effective ingredient of Huaier is a proteoglycan, which consists of 41.53% polysaccharides, 12.93% amino acids and 8.72% water13,14. The proteoglycan is the most effective anti-cancer element among all the isolated ingredients of Huaier extract, as confirmed in breast cancer MCF-7 cells15, liver cancer H22 cells16,17 and MHCC97-H cells18, lung cancer19, ovarian cancer20, colorectal cancer21, and human fibrosarcoma HT1080 cells22. More importantly, Huaier has no toxicity or side effects. Our previous work provided simple and successful evidence that the rescue of the Hippo signalling pathway is a molecular basis of the anti-cancer effects of Huaier4,5. However, this was not enough to explain all the functions associated with the anti-cancer effects, such as 1) immunological alterations that both enhance immunity and inhibit excessive reactions, 2) normal tissue reproduction after specific cell death and the elimination of cell debris in the cancer lesion, 3) reduction of (oxidative) stress materials in the liver, and 4) recovery from opportunistic infections during cancer therapy. Therefore, we initiated clinical research to identify the effects of Huaier on every biophysiological system and the influences of these systems on each other.
The samples used in this approach should be free from the effects of chemotherapeutic agents, especially molecular-targeting anti-cancer drugs, which largely disrupt molecular mechanisms. The present study was therefore conducted in collaboration with cancer patients who, although not large in number, wished to receive Huaier treatment. The medical characteristics of the 31 volunteer patients are summarized in Table 1. The mean age is 67 years, since this research excluded congenital cancers and tumours. The present study summarizes the results of sequencing 92 samples from 31 patients, including 2 normal controls (patients No. 21 and 26) and 4 patients with benign tumours (No. 13, 14, 18, and 20). Normal control No. 21 is the same individual as No. 25, who was recorded as an opportunistic infection case 8 months after the end of the first research period. To compare the results with those of the other volunteer patients with opportunistic virus infections, sampling was continued for over one year in patients No. 5, 8, and 10. Volunteer No. 20, who did not receive Huaier, was later found to have adenoma progression in the colon, which was successfully removed by endoscopic dissection one month later.

The cancer origin and stage (benign tumour to cancer TNM stages 0 to IV) varied among patients, and their symptom changes and the anti-cancer effects after Huaier administration were confirmed by ordinary clinical tests and by the evaluation of CT, MRI, and X-ray images. The volunteers summarized in Table 1 were free from any conventional anti-cancer chemotherapy during the research period (3 months). TNM stage IV patients were treated with Huaier only. Interestingly, 5 volunteers had a virus infection during the observation period (No. 5, 8, 10, 12, and 21/25). The methodology used in the present study was also valid for identifying virus infection and for analysing it quantitatively (Fig. 4b). These results confirmed that Huaier could not prevent virus infection; however, the infection remained silent and was completely cured within a reasonable period (one month to half a year), depending on the severity of the original disease.

The peripheral blood samples obtained (92 samples from 32 cancer patients) were analysed by RNA extraction and construction of a whole-transcriptome library for sequencing on the BGISEQ-500 platform, followed by whole-transcriptome mapping3,7,8,25-28. We also performed total small non-coding RNA sequencing on the same BGISEQ-500 platform3,7,8,25-28; the small RNAs were found to be between 18 and 30 nucleotides in length. The results were compared between samples obtained before and after 1 month and 3 months of Huaier administration. Quantitative analysis of >7.0 GB of RNA sequencing data per sample gave an estimated >27,000 transcribed genes, with a mean of 27,447 per sample (Table 2). The average mapping ratio to the reference genome was 89.64% (ranging from 71.49% to 94.51%); 43,296 genes were identified per sample, of which 18,952 are already known genes and 24,344 are novel genes.
The total number of novel transcripts identified was 172,487, composed of 1) 108,991 previously unknown splicing events of known genes, 2) 24,344 novel coding transcripts without any known features, and 3) 39,152 long non-coding RNAs. The novel genes and small nuclear non-coding RNAs obtained in the present study have been deposited in the NCBI GEO (GSE157086). The sequencing reads were filtered to remove low-quality reads, adaptor-polluted reads, and reads with a high content of unknown bases (N). After read filtering, we mapped the clean reads to the reference genome using HISAT129 and confirmed the sample quality. At the same time, the BAM files of the genome mapping results were prepared for the Integrative Genomics Viewer (IGV)30, provided by BGI. This tool supports importing multiple samples for comparison and shows the distribution of reads in exon, intron, UTR, and intergenic regions based on the annotation results.
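The read-filtering step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function name `is_clean_read`, the placeholder adaptor sequence, and all three thresholds are hypothetical, since the text does not give the actual cut-offs.

```python
# Illustrative read filter: drops adaptor-polluted reads, reads with too many
# unknown bases (N), and low-quality reads. All thresholds are assumed values.

PLACEHOLDER_ADAPTOR = "ACGTACGTACGT"  # hypothetical sequence, not a real adaptor


def is_clean_read(seq, quals, adaptor=PLACEHOLDER_ADAPTOR,
                  max_n_fraction=0.05, low_quality=15,
                  max_low_quality_fraction=0.2):
    """Return True if a read passes all three filters.

    seq   -- read sequence as a string
    quals -- per-base Phred quality scores (one integer per base)
    """
    if not seq or len(seq) != len(quals):
        return False  # empty or malformed record
    seq = seq.upper()
    if adaptor and adaptor in seq:
        return False  # adaptor-polluted
    if seq.count("N") / len(seq) > max_n_fraction:
        return False  # too many unknown bases
    n_low = sum(1 for q in quals if q < low_quality)
    if n_low / len(quals) > max_low_quality_fraction:
        return False  # too many low-quality bases
    return True


# Four toy reads of 21 bases each, one per outcome.
reads = [
    ("GATTACAGATTACAGATTACA", [30] * 21),              # passes all filters
    ("GATTACA" + PLACEHOLDER_ADAPTOR + "GA", [30] * 21),  # contains the adaptor
    ("GATTNNNNATTACAGATTACA", [30] * 21),              # 4 of 21 bases are N
    ("GATTACAGATTACAGATTACA", [5] * 10 + [30] * 11),   # 10 of 21 bases low quality
]
clean = [seq for seq, quals in reads if is_clean_read(seq, quals)]
```

Only the first toy read survives; in practice the cut-offs would be taken from the filtering tool's own defaults rather than from the values assumed here.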

After genome mapping, StringTie31 was used to reconstruct transcripts; with the genome annotation information, novel transcripts were identified using Cuffcompare (a tool of Cufflinks)32, and the coding ability of those new transcripts was predicted by CPC (Coding Potential Calculator)33. At the same time, we used GATK (Genome Analysis Toolkit)34 to identify single nucleotide polymorphism (SNP) and insertion and deletion (INDL) variants in each sample. The final results are shown in Fig. 1a. We identified a total of 8,503,297 SNP variants in the 92 samples, a mean of 92,427 SNPs per sample (ranging from 48,994 to 146,102), compared with 22,688 in total among normal healthy individuals3. Among the SNP variants, the A-G and C-T transitions were the most common (3,152,487 and 3,142,470, respectively; 74.0%), followed by the A-C and A-T transversions (552,611 and 406,169, respectively; 11.3%) and the C-G and G-T transversions (700,403 and 549,157, respectively; 14.7%), which is consistent with previous reports on oesophageal squamous carcinoma cells8. The mean number of transitions per sample was 68,423 (ranging from 34,672 to 110,293), and that of transversions was 24,004 (ranging from 14,322 to 35,809) (Table 2). These ratios did not differ significantly among samples, but such a gross increase in SNP variants has not been reported before in samples from the same person. Transitions were about three times as frequent as transversions, a ratio much higher than that identified in oesophageal squamous carcinoma cells8, breast cancer cells16, and other cancer cells17-21 analysed before. No significant differences were identified by cancer origin, gender, or cancer stage. In contrast, significant increases were observed at the time of opportunistic virus infection in patients No. 8, 10, 12, and 21.
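The transition and transversion totals quoted above can be recomputed from the six per-pair counts. The sketch below uses only the counts given in the text; the variable and function names are chosen for illustration.

```python
# SNP counts per unordered base pair, copied from the text above.
snp_counts = {
    ("A", "G"): 3_152_487,
    ("C", "T"): 3_142_470,
    ("A", "C"): 552_611,
    ("A", "T"): 406_169,
    ("C", "G"): 700_403,
    ("G", "T"): 549_157,
}

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(base_a, base_b):
    """A transition swaps purine for purine or pyrimidine for pyrimidine."""
    pair = {base_a, base_b}
    return pair <= PURINES or pair <= PYRIMIDINES


transitions = sum(n for (a, b), n in snp_counts.items() if is_transition(a, b))
transversions = sum(n for (a, b), n in snp_counts.items()
                    if not is_transition(a, b))
total = transitions + transversions
ti_tv_ratio = transitions / transversions
transition_share = transitions / total

print(f"transitions:   {transitions:,} ({transition_share:.1%})")
print(f"transversions: {transversions:,}")
print(f"Ti/Tv ratio:   {ti_tv_ratio:.2f}")
```

With these counts the six classes sum to the reported 8,503,297 variants, transitions account for 74.0% of them, and the transition-to-transversion ratio is close to 2.85.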
The SNP locations were distributed about 40% each in exons and introns, with the rest located in intergenic regions (12%) and within 2,000 bp upstream and downstream (2% each) of the expressed genes. The locations of the numerous INDLs (>2,000,000 per sample) showed a similar distribution. Both SNP and INDL locations were found 40 to 45% in exons, and up to 60% when the adjacent regions were included. Thus, these RNA editing events were not randomly scattered across the whole genome, nor did they correlate with cancer origin or stage.

We also detected differentially spliced genes (DSG) by comparing samples before and after Huaier administration, using rMATS35. The summary of the splicing variation comparison is shown in Fig. 1b. DSGs are regulated by alternative splicing (AS), which allows a variety of isoforms to be produced from a single gene. Changes in the relative abundance of isoforms, regardless of changes in expression, indicate a splicing-related mechanism. We detected five types of AS events: skipped exon (SE), alternative 5’ splice site (A5SS), alternative 3’ splice site (A3SS), mutually exclusive exons (MXE), and retained intron (RI). A total of 2,557,753 splicing events were observed in the 92 samples: 1,303,783 SE (51%); 211,336 MXE (8.2%); 264,368 A5SS (10.3%); 343,124 A3SS (13.4%); and 435,142 RI (17.1%). The mean numbers of each event per sample (= per person) were 14,172 (ranging from 7,567 to 19,240) SE; 2,297 (ranging from 1,229 to 3,258) MXE; 2,874 (ranging from 1,655 to 3,704) A5SS; 3,730 (ranging from 2,341 to 4,699) A3SS; and 27,802 (ranging from 16,294 to 36,079) RI, respectively. As with the SNP variants, the ratios of splicing events showed no significant differences among patients, irrespective of cancer origin, cancer stage, age, or gender (Table 2); the only exception was at the time of virus infection. Both results clearly indicate that these genomic events were influenced not by cancer origin, stage, or personal differences, but by extrinsic factors such as virus infection. The important finding is the total number of changes, i.e., the quantitative alterations identified, which clearly indicates significant genomic flexibility and capability in each patient. The average numbers of RNA editing events such as SNPs and INDLs showed that Huaier administration induced approximately 4 times higher mobility of the genome (ranging from 2 to 7 times higher) than that identified in the normal population3. The splicing event statistics suggest that this expanded plasticity and flexibility enhanced the genomic capability to manage the subsequent transcription and translation. The importance of these results is discussed later, together with the results of the transcriptome analysis and of small non-coding RNA sequencing.

After genome mapping and Gene Ontology classification, we defined 24,344 novel transcripts. We then merged the novel coding transcripts with the reference transcripts to obtain a complete reference, mapped the clean reads to it using Bowtie36, and calculated the expression level for each sample with RSEM37. The total number of predicted transcripts was 27,447, and the mean gene mapping ratio was 89.64% (ranging from 71.49% to 94.51%).
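Two of the summary figures quoted earlier in this section, the share of each splicing event type and the fold increase in SNP counts, can be recomputed from the raw numbers in the text. The sketch below makes one assumption that should be read with care: it treats the 22,688 SNPs reported for normal individuals as a per-individual baseline, which the text does not state explicitly.

```python
# Splicing event totals across the 92 samples, copied from the text.
splicing_events = {
    "SE": 1_303_783,
    "MXE": 211_336,
    "A5SS": 264_368,
    "A3SS": 343_124,
    "RI": 435_142,
}
total_events = sum(splicing_events.values())
event_shares = {name: n / total_events for name, n in splicing_events.items()}

# SNP counts per sample (mean, minimum, maximum) and the normal baseline.
# Assumption: the baseline is comparable to a per-sample count.
NORMAL_SNPS = 22_688
snps_per_sample = {"mean": 92_427, "min": 48_994, "max": 146_102}
fold_over_normal = {k: v / NORMAL_SNPS for k, v in snps_per_sample.items()}

for name, share in event_shares.items():
    print(f"{name:5s} {share:.1%}")
for label, fold in fold_over_normal.items():
    print(f"SNP fold change ({label}): {fold:.1f}x")
```

Under that assumption the fold changes come out at roughly 4.1 for the mean and 2.2 to 6.4 for the range, which matches the "approximately 4 times, ranging from 2 to 7 times" statement, and the five event counts sum to the reported 2,557,753.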
Based on the expression information, box plot analysis was performed to show the distribution of gene expression levels in each sample, followed by construction of a density map that revealed the change in gene abundance over the time course of Huaier administration. This gene expression analysis enabled us to compare the levels of differentially expressed genes (DEG) among samples from the same patient over the time course of Huaier administration (Fig. 2). The numbers of up- and down-regulated DEGs are summarized in Table 3. We also used MA plots, volcano plots, scatter plots, and heatmaps to show the distributions of DEGs (details are given at the BGI website: http://www.bgitechsolutions.com/). As shown in Table 3, the average number of up-regulated genes was 2,589 out of the total 27,447 transcripts (ranging from 21 to 5,671), whereas that of down-regulated genes was 2,273 (ranging from 28 to 4,227).
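Counting up- and down-regulated DEGs, as summarized above, usually comes down to thresholding a fold change and an adjusted significance value per gene. The sketch below illustrates that general idea only; the text does not state which DEG caller or thresholds were used, so the cut-offs (absolute log2 fold change of at least 1, q-value of at most 0.05), the function name, and the gene labels are all hypothetical.

```python
from collections import Counter


def classify_deg(log2_fold_change, q_value,
                 min_abs_log2fc=1.0, max_q=0.05):
    """Label one gene as 'up', 'down', or 'unchanged' (assumed thresholds)."""
    if q_value > max_q or abs(log2_fold_change) < min_abs_log2fc:
        return "unchanged"
    return "up" if log2_fold_change > 0 else "down"


# Hypothetical comparison of one patient before vs. after administration:
# gene label -> (log2 fold change, q-value)
comparison = {
    "gene_a": (2.3, 0.001),    # strong, significant increase
    "gene_b": (-1.7, 0.010),   # strong, significant decrease
    "gene_c": (0.4, 0.0001),   # significant but too small a change
    "gene_d": (3.0, 0.200),    # large change but not significant
}
summary = Counter(classify_deg(fc, q) for fc, q in comparison.values())
print(dict(summary))
```

The per-patient counts in Table 3 correspond to the "up" and "down" tallies of such a classification, whatever the actual thresholds were.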

Fig. 2 shows these changes by individual. Clearly, there were high responders as well as low responders to Huaier. Generally, TNM stage IV cancer patients were typical low responders (such as patient No. 7, who had colorectal cancer with type I diabetes and a long history of combination chemotherapy before taking Huaier). A sudden increase in up-regulated genes was the typical first observation, followed by an inversion to an increase in down-regulated genes. Endoscopic dissection of the cancer lesion often caused a massive increase in up-regulated genes (patients No. 9-11, 13, 14, and 19), roughly 10 to 25% of total transcripts, which might reflect the repair function and tissue reproduction promoted by Huaier. A history of chemotherapy or radiotherapy delayed any response (patients No. 5, 6, 9, 24, 27, and 30), as typically observed in patient No. 5. Interestingly, just as observed for RNA editing events, opportunistic virus infection caused the greatest increase in up-regulated genes, highlighted by asterisks in Fig. 2. Huaier administration successfully brought those changes back down to the normal level, within 30 days in the normal control and within 3 to 6 months in cancer patients (patients No. 5, 8, and 10), with the exception of patient No. 12, the most severe case, a stage IV lung cancer patient with multiple metastases who received no additional medical care or hospitalization.

As reported, the anti-cancer effects of Huaier are significant in hepatocellular carcinoma14,16-18,23, breast cancer15,23, and other cancers19-23. In the present study, significant effects on recovery from oesophageal cancer (squamous cell cancer and adenocarcinoma) were observed, even with multiple metastases (patients No. 5, 9, 28, and 29). Interestingly, patient No. 29, who had lung metastasis, recovered successfully as judged by CT images and endoscopic examination, without any significant increase in either up- or down-regulated transcripts.
The clinical examinations agreed well with the results obtained here, showing significant improvement or elimination of metastatic lesions without any further progression of the disease. In addition, we emphasize that genomic plasticity was continuously observed even in normal healthy individuals. It seems that every person, even without any medical problems or disorders, is subject to environmental stress, infective agents, and ageing processes that disrupt physiological functions. Huaier showed efficacy even in healthy individuals, according to the extent of functional disruption. This observation points to the potential of Huaier to rescue silent disruptions at the molecular level and restore normal homeostasis.

Following the quantitative analysis of DEGs, we performed Gene Ontology (GO) classification (Fig. 3a) and functional enrichment. GO has three ontologies: molecular function, cellular component, and biological process; the results were compared for each patient over the time course of Huaier administration. We also performed KEGG pathway classification25 for each cancer analysed in the present study (https://www.genome.jp/kegg/, Fig. 3b) and functional enrichment (Fig. 3c). Since KEGG has no basic pathway figure for oesophageal cancer, we do not show the molecular linkages in this cancer. A broad range of enriched pathways and of up- and down-regulation was found among cases, and alterations in transcripts were observed across all functional categories, i.e., every process involved in molecular function, cellular component, and biological process classified by KEGG pathways (the numbers of DEGs are shown in extended Table 1).

In Fig. 3, the typical GO changes are shown by three types of analysis. Since abundant numbers and kinds of alterations were identified, these panels focus on the influence of Huaier on: a. opportunistic infection; b. the responsible molecules altered in each cancer and signalling pathway; and c. the responsible alterations observed in colorectal cancer with progression of the stage (from adenoma to stage IV), by pathway functional enrichment analysis. Fig. 3a shows the greatest DEG changes. The largest changes are highlighted by a red box; the level was 3,477.8 on average (ranging from 1,826 to 5,671). These changes are also indicated by asterisks in Fig. 2 and Table 3. The virus origins in the 5 patients were HTLV-1 or related RNA viruses (patients No. 5, 10, and 12), papillomavirus (patient No. 8), and Epstein-Barr virus (patient No. 21/25), according to KEGG classification. These patients improved significantly with the administration of Huaier, except for patient No. 12, who died 150 days after Huaier administration. Patient No. 8 underwent surgical dissection of a facial basal cell cancer just before the last sampling. As shown in Fig. 3a, continuous Huaier administration (20 g per day) could completely restore the molecular functions to the normal level (recovery of homeostasis).
The time required for recovery seemed to vary among patients and physiological situations (from 30 days to several months), and a severe case such as patient No. 12 could not recover from immunodeficiency when treated with Huaier only. However, even in this case, the prognosis was improved, with survival up to 150 days and a better QOL than predicted. Immunological improvements seemed to be closely linked to the regulation of NFkB signalling pathways, which was also closely related to improved prognosis, as shown in Fig. 3b (comparison among patients No. 8, 21, and 12). Fig. 3b shows the detailed factors and molecules with up- or down-regulation according to the KEGG pathway map24 (https://www.genome.jp/kegg/). The panels represent the effects of Huaier on each cancer (noted at upper left) after 90 days of administration, with altered DEGs indicated by red boxes (up-regulation) and blue boxes (down-regulation). In each panel, specified DEG names are highlighted in grey boxes, especially those related to oncogenes and tumour suppressor genes. The numbers of DEGs classified by KEGG pathways in all patients over the time course are shown in extended Table 1.

We searched for possible molecular changes by comparing the patients who recovered with those who died (the patient numbers from Table 1 are written on the left side of each panel). The representative changes are highlighted in grey boxes, such as oncogenes (c-Myc, TERT, NRF2, c-Met, PIK3CA, K-Ras, HER2/neu, EGFR, MDM2, PDGF, PDGFR, CDK4), tumour suppressors (Rb, p53, PTEN, p16, Smad4, BRCA2, INK4a/ARF, AXIN, KEAP1, ARID1A), and DNA repair genes (hMLH1, hMSH2, hMSH3, hMSH6). Huaier administration altered many such clusters of oncogenes and tumour suppressor genes, in both the up- and down-regulated directions. It was difficult to identify the specific molecular systems and factors controlling cancer recovery by Huaier, and such a broad spectrum of DEG alterations has not previously been reported in the same individual. The modified transcriptomes contained all the molecules currently targeted by anti-cancer chemotherapy. Huaier seemed to compensate for the limitations of the targeting approach of conventional chemotherapy by rescuing all the required physiological functions. Clinically, the rescued functions seemed to be responsible for specific cell death in occult metastases in the peripheral blood (preventing the recurrence and relapse of cancer) and for the recovery from anaemia and hypoleukocytemia. This was typically observed in the results from patient No. 12, as up-regulation of the DEGs responsible for signal transfer in the NFkB and TGF-b signalling pathways25, with further influences on related regulatory systems. Immunological systems and pathways are multi-functional biosystems, and both deficiency and excessive functional enhancement cause various disorders and diseases. The results obtained in patient No. 12 provided a hint about the major molecules and pathways.
Continuous up-regulation of the DEGs responsible for signal transfer in the NFkB and TGF-b signalling pathways25 was observed in this patient, whereas the other patients, who recovered successfully, showed down-regulation of these systems at a certain point during complementary Huaier therapy (Fig. 3b). From the results of the present study, we identified effects of Huaier on immunotolerance in patients in the course of recovery from cancer, which depended on the quantity and duration of administration and also on the time at which Huaier therapy began. Sequential control of immunological reactions, both activation and modulation of excessive enhancement, through the NFkB and TGF-b signalling pathways was required for recovery from cancer. In extended Tables 2 and 3, we provide the copy number changes that might be linked to NFkB, for NFkB subunits 1 and 2 and for molecules related to the production of iPS cells (KIT, Myc, OCT3/4, SOX2, LIN28A, and NANOG). However, no significant increase or decrease was observed among patients.

Not only genes associated with signal transduction and carcinogenesis but also many other genes of physiological importance should be affected, as links to the main pathways for the maintenance of homeostasis. More importantly, almost all the transcriptional misregulation was rescued chiefly by up-regulated transcription factors, together with down-regulation of N-CoR (Fig. 3b). The effects of Huaier on the rescue of transcriptional control have been reported before7,8, and in the present study we frequently identified rescue of the Hippo signalling pathway4,5 and the Wnt signalling pathway22 in the KEGG pathway24 classification results. The functional enrichment classified by KEGG pathway24, specifically for each TNM stage of colorectal cancer, is shown in Fig. 3c. Blue boxes represent down-regulated genes and pathways, and red boxes indicate up-regulated ones. The panels are arranged according to the stage of cancer progression, and the last panels show a comparison of the effects of Huaier on benign adenoma vs. adenocarcinoma. Patient No. 20 was a control without Huaier administration, with latent progression of a benign adenoma (dissected endoscopically one month after the final sampling). There were significantly more suppressive effects on the responsible oncogenes and tumour suppressors in benign adenoma cases than in progressive adenocarcinoma cases. Details of the related signalling pathways are shown in Fig. 3b. The results from patient No. 20 indicated that, early in the process of cancer progression, the immunological pathways were down-regulated, whereas cell cycle signalling and oxidative stress accumulation were up-regulated. Interestingly, the alterations in the oxidative stress reduction systems induced by Huaier were identical in different cancers, with an especially strong similarity between hepatoma and meningioma (patients No. 8 and No. 10 in Fig. 3b), which might suggest that the required rescue system and/or the functionally damaging factors arising from stress accumulation are identical in these two organs, the liver and the central nervous system. The typical low responder to Huaier, patient No. 7, did not show any significant DEG alterations in any pathway.

We analysed the relationship between these quantitative and qualitative DEG alterations and the encoded transcription factors (TF). Fig. 4a shows the transcription factor families and their quantitative changes, classified by the DEGs belonging to each family. In total, 7,664 DEGs encoding transcription factors were altered to rescue and/or modify the transcriptional misregulation noted in Fig. 3b, within 30 days after Huaier administration. We have already shown the effects of Huaier on the rescue of transcriptional control using Drosophila mutants5, and the present study provides the systematic and quantitative background. These dynamic changes in transcription factors might explain the enormous range of transcriptional control rescue by Huaier. As shown in Table 4, 66 transcription factor families were altered and modified by up- or down-regulation of the encoding DEGs; the maximum differences identified were 2,535 DEGs within 30 days and 2,293 DEGs at 90 days after Huaier administration. Again, we emphasise that these effects were observed without side effects or toxicity, which resulted in improved QOL for the patients.

The changes in the TF-DEG network are shown in Fig. 4b as a linkage map of TF-cored transcripts over the time course of Huaier administration. The red and green dots represent up-regulated and down-regulated DEGs, respectively. Each purple ball represents a TF as a core. The larger the core, the more DEGs are regulated by that core transcription factor. The results indicated that an extreme lineage of gene expression was evoked soon after Huaier administration in cancer patients, as shown in the preceding tables and figures, suggesting that the genetic changes are based on transcriptional control via small nuclear RNAs. These results clearly demonstrate the extensive changes and rescue in transcriptional control induced by Huaier administration, with variation between low and high responders and between slow and fast responders. Recovery from cancer was easily identified from the quantitative level of regulation in the transcriptomes, and confirmed by CT, MRI, and X-ray images of the patients (data not shown). However, many patients showed no significant changes in gene expression. Surprisingly, no significant quantitative changes were found in the patient with oesophageal cancer even with multiple metastases (patient No. 29).

In the present study, to obtain precise information on the transcriptional alterations reported previously, small non-coding RNAs (sncRNA) were sequenced in parallel using BGISEQ-500 technology30,31. The numbers of small non-coding RNAs detected in each sample are summarized in Table 5. Before data analysis, low-quality tags were removed from the samples (data filtering), and only clean tags of sufficient base quality were used.
Finally, length distribution analysis indicated that the small RNAs were between 18 and 30 nucleotides long. After filtering, clean tags were mapped to sRNA databases such as miRBase38 and Rfam39. To ensure that each unique small RNA was mapped to only one annotation, we followed the priority rule miRNA > piRNA > Rfam > other sRNA. After sRNA annotation, the remaining unknown tags were used to predict novel sRNAs based on their architectural features. In this way, novel miRNAs, piRNAs, and siRNAs were predicted. Table 6 shows the results for the different kinds of sRNA in all samples (the ratio of each component is shown in extended Table 4). The structures of the miRNAs and the sequences of the novel small RNAs were deposited in the NCBI GEO (GSE157086) and are stored in precursor.tar.gz. The small RNA expression level was calculated as TPM40 (Transcripts Per Kilobase Million). Differentially expressed small RNAs (DESs) were analysed by comparison between samples. The DESs in each pairwise comparison are shown in Fig. 5. The changes in each patient over the time course are shown in Fig. 6.
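The priority rule for small RNA annotation (miRNA > piRNA > Rfam > other sRNA) and the 18 to 30 nucleotide length window can be sketched as follows. This is an illustration of the rule as stated in the text, not the annotation software used in the study; the function names, the "unannotated" label, and the toy tag sequences are assumptions made for the example.

```python
# Annotation categories in decreasing priority, as stated in the text.
PRIORITY = ("miRNA", "piRNA", "Rfam", "other sRNA")


def assign_annotation(hits):
    """Pick one category for a tag that may hit several databases."""
    for category in PRIORITY:
        if category in hits:
            return category
    return "unannotated"  # candidate for novel sRNA prediction


def in_length_range(tag, min_len=18, max_len=30):
    """Keep tags within the 18-30 nt window reported for small RNAs."""
    return min_len <= len(tag) <= max_len


# Toy tags (made-up sequences) mapped to the set of databases they hit.
tag_hits = {
    "ACGT" * 5 + "AC": {"miRNA", "Rfam"},        # 22 nt, two hits
    "TTGACC" * 4: {"piRNA", "other sRNA"},       # 24 nt, two hits
    "GGCAT" * 4: {"Rfam"},                       # 20 nt, one hit
    "CCATT" * 4: set(),                          # 20 nt, no hit
    "ACGT" * 3: {"miRNA"},                       # 12 nt, too short
}

annotations = {
    tag: assign_annotation(hits)
    for tag, hits in tag_hits.items()
    if in_length_range(tag)
}
```

Each retained tag ends up with exactly one label, and tags without any database hit are the ones passed on to novel sRNA prediction.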

In patient No. 5 (oesophageal squamous cell cancer with lung metastasis), we identified 20 up-regulated novel elements at 90 days after Huaier administration; together with one down-regulated element, all 21 novel elements were down-regulated at 11 months after administration (Fig. 6). The slow onset of novel siRNA function was also noteworthy in this patient. Compared with the changes in the transcriptome (Fig. 2) and the TF-DEG network (Fig. 4b), these siRNAs appeared to contribute to the drastic down-regulation of the transcriptome, which was followed later by up-regulation of over 3,000 transcripts. The DEG changes were also associated with drastic up-regulation of transcription factor families. siRNA has been reported to interfere with the expression of specific genes with complementary nucleotide sequences by degrading mRNA after transcription, thereby preventing translation41. In addition, siRNAs typically act by cleaving the mRNA before translation and have 100% complementarity, and thus very tight target specificity42. Taken together, the sequential genetic events found in patient No. 5 would rescue the misregulation of the translational and transcriptional systems in vivo, as required for cancer recovery. In the other patients, no significant siRNA function was identified.

On the other hand, as shown in Table 5, an average of 1,003 miRNAs per sample (ranging from 685 to 1,378) were identified within 90 days after Huaier administration. The DESs in each pairwise comparison (Table 5, Figs. 5 and 6) revealed several hundred miRNA alterations in each patient, and over half of the total identified miRNAs were differentially expressed. Comparison of Figs. 2 and 6 clearly demonstrated miRNA-mediated gene silencing effects. For example, the significant changes in up- and down-regulated DEGs were inversely related to the quantitative DES changes: the significant increase of up-regulated miRNAs in patient No. 8 (Fig. 6, middle column) resulted in a significant decrease of both up- and down-regulated DEGs in Fig. 2. In contrast, the significant decrease of up-regulated DESs in Fig. 6, related to the human papillomavirus infection suggested by Fig. 3a, drastically increased both up- and down-regulated DEGs, up to 3,330 and 3,505, respectively (12.2% of the total DEGs). The total number of DEG changes in patient No. 8 reached 85% of total transcripts (23,210/27,447). In patient No. 10, an increase of down-regulated miRNAs was observed together with a massive increase of down-regulated DEGs (4,274; 15.6% of the total). In both patients No. 8 and No. 10, the opportunistic virus infection might have influenced the drastic decrease in the number of miRNAs. The present study also added information on many novel miRNAs that contributed significantly to cancer recovery, and further investigation of these molecules in vitro and in vivo is expected. miRNAs have been reported to typically silence genes by repression of translation, with broader specificity than siRNA, and to function in RNA silencing and post-transcriptional regulation of gene expression42-44.
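The ratios quoted above for patients No. 8 and No. 10 can be checked with a short Python sketch. It assumes that both percentages use the 27,447 total transcripts given in the text as the denominator; the helper name is hypothetical.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100.0 * part / whole, 1)


TOTAL_TRANSCRIPTS = 27_447  # denominator quoted in the text

# Patient No. 8: total DEG changes quoted as 23,210 of 27,447 transcripts.
patient8_ratio = percent(23_210, TOTAL_TRANSCRIPTS)

# Patient No. 10: 4,274 down-regulated DEGs (same denominator assumed).
patient10_ratio = percent(4_274, TOTAL_TRANSCRIPTS)

print(patient8_ratio, patient10_ratio)
```

Under that assumption the first value rounds to the reported 85% and the second matches the reported 15.6%.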

In the present study, haemoglobin levels in TNM stage IV colorectal cancer improved from 6.0 to 12.0 g/dl after 4 months’ treatment with 20 g Huaier per day. miRNAs also play crucial roles in the regulation of complex enzymatic cascades, including the haemostatic blood coagulation system45. Large-scale studies of functional miRNA targeting have recently uncovered rational therapeutic targets in the haemostatic system46.

Piwi-interacting RNA (piRNA) is the largest class of small non-coding RNA molecules (containing about 24 nucleotides in the present study) expressed in animal cells47. piRNAs have been reported to be distinct from microRNAs (miRNAs) in size (24 nucleotides as opposed to 21–22 nt), lack of sequence conservation, and increased complexity. However, as shown in Fig. 6, the dynamics of piRNAs were similar to those of miRNAs, although the total numbers of known and novel piRNAs were much smaller than those of miRNAs. The present study, together with the other small RNA information, sheds light on the function of piRNA in RNA-mediated adaptive immunity against transposon expansions and invasions, especially in cases with opportunistic infection. While many unusual expression patterns of long ncRNAs in disease states have been reported48, it remains unknown how those ncRNAs function in causing diseases. The transcriptome analyses here identified numerous differentially expressed novel ncRNAs in various forms of cancer, such as colorectal carcinoma and hepatocellular carcinoma. Many SNPs have been found to be associated with certain disease conditions and to function in association with targeting ncRNAs in transcriptional regulation48. All the novel small RNA (DSG) sequences identified were also deposited in the NCBI GEO (GSE157086).

In conclusion, the present study documented almost all the genomic and genetic events in cancer patients, especially those identified in the recovered patients. Recovery from cancer was equivalent to the recovery of homeostasis, achieved through the modulation and rescue of transcriptional dysregulation by the multiple elements involved in each step of transcription and translation. sncRNAs, together with dynamic RNA editing in SNP variation, directly influenced these processes, which resulted in large-scale alterations in DEGs.
Huaier effects could be observed quantitatively and widely in the biophysiological systems classified by Gene Ontology; the changes presented here were, in total, derived from the physiological capability and flexibility of each patient. Thus, we have observed a striking molecular function underlying the anti-cancer effects of Huaier. These changes and modulations together contributed to every point required for the dynamic and systematic improvement of transcriptional control to recover the biophysiological condition, i.e., individual homeostasis.

Although there have been many reports of specific cancer cell death as an anti-cancer effect of Huaier5, 12-23, the efficacy of Huaier should also be emphasized for its significant potential in recovery from opportunistic virus infections, and in tissue reproduction to regenerate damaged tissues and organs after the removal of cancer cells in the lesion. The elimination of occult metastasis (after surgical treatment) should be emphasized, too. These effects are spontaneous, and dose- and duration-dependent. This is the major reason why Huaier does not cause any side effects and/or toxicity.

Acknowledgements The authors wish to thank the cancer patient volunteers and the many healthy volunteers who kindly collaborated in the present study. We also wish to thank Prof. Dr. Tongbiao Zhao, Professor of Stem Cell and Immunology, Institute of Zoology, Chinese Academy of Sciences, China, for critical revision of the manuscript. The present study was supported by a grant-in-aid from QiDong Gaitianli Medicines Co. Ltd. and Japan Kampo NewMedicine, Co. Ltd.

Authors’ contributions T.T. and M.T. designed the study from clinical observation of cancer patients receiving Huaier treatment (as a complementary therapy), managed the sampling and clinical assessment of the patient volunteers, statistically analysed the data, and drafted the manuscript. F.T., H.L. and C.X. managed total RNA and small nuclear RNA sequencing and conducted systematic analysis of the data. Z.L. and D.W. contributed to the provision of Huaier granules and to clinical evaluation of the data, with a particular focus on immunological evaluation.

Author information The authors have no competing interest to declare. Readers are welcome to comment on the paper. Correspondence should be addressed to T.T. ([email protected]) and M.T. ([email protected]), who contributed equally to this work. Requests for Huaier extract and commercially available granules should be addressed to D.W. ([email protected]) and M.T. ([email protected]).

Conflict of interest The authors have no competing interest to declare.

Table 1. Clinical features of cancer patients and normal controls. The asterisked samples (*) of No. 21 are the same as those of No. 25. The double asterisks (**) for Nos. 5, 8, 10 and 12 indicate that recovery from the opportunistic RNA virus infection could be clearly judged numerically by the methods employed in the present study.

Table 2. The numbers of RNA editing events identified in each sample. Left column summarizes the numbers by the classifications by SNP variation types, and right column for the classifications by splice event types.

Table 3. Quantitative analysis of up- and down-regulated differentially expressed genes (DEGs) over the time course of Huaier administration. The asterisks (*) indicate the time of opportunistic virus infection.

Table 4. Transcription factor (TF) families classified according to the DEGs belonging to them, and the numbers of DEGs up- or down-regulated over the time course of Huaier administration.

Table 5. The numbers of detected small non-coding RNA.

Table 6. The numbers of differentially expressed small RNAs (DSGs) in each patient by the time course of Huaier administration. a. numbers of differentially expressed siRNAs; b. the numbers of differentially expressed miRNAs; c. the numbers of differentially expressed piRNAs.

Extended Table 1. Comparison of the numbers of DEGs by KEGG pathway classification.

Extended Table 2. Comparison of NFkB subunit 1 (upper) and subunit 2 (lower) copy numbers in all the samples.

Extended Table 3. Comparison of regulated DEGs (up- or down-) supposed to be closely related to the production of iPS cells (KIT, Myc, OCT3/4, SOX2, LIN28A and NANOG). a. Comparison between before and 30 days after Huaier administration. b. Comparison between before and 90 days after Huaier administration.

Extended Table 4. The resulting ratio (%) of each analysed small non-coding RNA class. To ensure that every unique small RNA was mapped to only one annotation, we followed the priority rule miRNA>piRNA>snoRNA>Rfam>other small RNAs.
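The SNP classes reported in Table 2 can be illustrated with a short Python sketch. It assumes the usual definition (A-G and C-T changes are transitions, the other four base pairs are transversions); the function names are hypothetical, and the counts are those listed for sample KS1_1 in Table 2.

```python
BASES = set("ACGT")
TRANSITION_PAIRS = {frozenset("AG"), frozenset("CT")}


def classify_substitution(ref, alt):
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    if frozenset((ref, alt)) in TRANSITION_PAIRS:
        return "transition"
    return "transversion"


def summarise(pair_counts):
    """Sum counts keyed by base pair (e.g. "A-G") into the two classes."""
    totals = {"transition": 0, "transversion": 0}
    for pair, count in pair_counts.items():
        ref, alt = pair.split("-")
        totals[classify_substitution(ref, alt)] += count
    totals["total"] = totals["transition"] + totals["transversion"]
    return totals


# Counts for sample KS1_1 as listed in Table 2.
ks1_1 = {"A-G": 27269, "C-T": 27034, "A-C": 4598,
         "A-T": 3092, "C-G": 6200, "G-T": 4659}
summary = summarise(ks1_1)
print(summary)
```

For this sample the sums reproduce the Transition, Transversion and Total columns of Table 2 (54,303, 18,549 and 72,852).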

Figure Legends

Fig. 1. SNP variance after Huaier administration. a. classifications by SNP variation types; b. classifications by splice event types.

Fig. 2. Comparison of the numbers of differentially expressed genes (DEGs) in each patient over the time course of Huaier administration. The asterisks (*) indicate the time of opportunistic virus infection.

Fig. 3. Pathway classification and functional enrichment of DEGs. a. Pathway functional enrichment results for up- and down-regulated genes. The X axis represents the pathway terms. The Y axis represents the number of up- and down-regulated genes. b. Detailed analysis of the Huaier effects on each cancer by KEGG biological pathways25. The pathways responsible for cancer progression, stress reduction, and major immunological responses related to the NFkB and TGFb signalling pathways were specifically compared between patients with recovery and those with poor prognosis. c. Functional enrichment of the factors classified by KEGG pathways. The colour indicates the q-value (high: white, low: blue); a lower q-value indicates more significant enrichment. Point size indicates the number of DEGs (bigger dots correspond to larger numbers). Rich Factor refers to the enrichment factor, which is the quotient of the foreground value (the number of DEGs) and the background value (the total number of genes). The larger the value, the more significant the enrichment.

Fig. 4. TF-DEG network in each patient over the time course of Huaier administration. a. TF families classified by the encoding DEGs, with the numbers of up- and down-regulated DEGs. b. Predicted TF-DEG network. The red and green dots represent the up-regulated and down-regulated DEGs, respectively. Each purple ball represents a transcription factor; the larger the node, the more DEGs the transcription factor regulates.

Fig. 5. Comparison of the numbers of DESs until 90 days of Huaier administration. Upper: Statistics of siRNA, middle: miRNA, lower: piRNA.

Fig. 6. Comparison of the numbers of DESs in each patient by time course of Huaier administration.
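The Rich Factor described in the legend of Fig. 3c can be sketched as follows. This assumes that the foreground is the number of DEGs annotated to a pathway and the background is the total number of genes annotated to that pathway; the pathway names and counts are hypothetical, and the q-value calculation is not reproduced here.

```python
def rich_factor(degs_in_pathway, genes_in_pathway):
    """Return foreground / background for one pathway."""
    if genes_in_pathway <= 0:
        raise ValueError("background gene count must be positive")
    if not 0 <= degs_in_pathway <= genes_in_pathway:
        raise ValueError("foreground must lie between 0 and the background")
    return degs_in_pathway / genes_in_pathway


# Hypothetical (DEGs in pathway, all genes in pathway) pairs.
pathways = {
    "pathway_A": (30, 120),
    "pathway_B": (12, 200),
    "pathway_C": (45, 90),
}

# Rank pathways from the most to the least enriched by Rich Factor.
ranked = sorted(pathways, key=lambda name: rich_factor(*pathways[name]),
                reverse=True)
for name in ranked:
    degs, genes = pathways[name]
    print(f"{name}: Rich Factor = {rich_factor(degs, genes):.3f}")
```

A pathway with few annotated genes can reach a high Rich Factor with only a handful of DEGs, which is why the legend reports the q-value and the DEG number alongside it.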

METHODS Project Design. The present study was initiated to gain a systematic understanding of the genomic and genetic requirements for cancer recovery. Huaier compounds were provided by the manufacturer for this purpose, with strict control of transfer to Japan, storage conditions, and provision to the patient volunteers. The study was conducted strictly according to the guidelines of the Declaration of Helsinki and the principles of good clinical practice. Written informed consent was obtained from the patients. This clinical research was designed according to the Consolidated Standards of Clinical Research Trials guidelines, was submitted to the Japanese Medical Association on 9th February 2018, and was approved on 5th March 2018 (ID: JMA-IIA00335). The project has been conducted under monthly review by an ethics committee consisting of experts in medicine, nursing, law, pharmaceutics, and the business community (first committee held on 9th February 2018). We used Huaier compounds as complementary therapy, without any chemotherapy or radiotherapy, which disrupt the molecular systems. Only surgical operation was performed when applicable. We thus planned and initiated an open-style, before-after controlled study, using peripheral blood as the sampling material, to understand almost all molecular events in each patient taking Huaier7,8. The sampling material was whole blood, as reported previously7,8, for the following reasons: 1) it is easy to obtain from any candidate patient, 2) the procedure is non- or less hazardous, and 3) it is applicable to patients at any stage, even after surgical operation. Compared with other sampling, RNA extracted from the nuclear cell components of peripheral blood reflects biophysiological changes rapidly, and monitors the course of any treatment more sensitively5,6 than other samples such as dissected organs.

Patient characteristics and sample collection. The volunteers were basically chosen at the request of each individual, under the following conditions: 1) diagnosed with cancer in the clinic, 2) receiving no anti-cancer chemotherapy, radiotherapy, or any other measure that influences genomic or genetic events under physiological conditions during the 90-day research period, and 3) aged between 35 and 85. The normal healthy volunteers were chosen from individuals 1) without any diagnosed diseases, symptoms, or disorders, and 2) without drug administration, with an age distribution from ca. 35 to 85 years. Blood samples were obtained from each volunteer three times: 1) before, 2) 30 days after, and 3) 90 days after the start of Huaier administration at 20 g per day14, 23. In some cases, additional sampling was performed for follow-up study in patients who continued taking Huaier. The samples (4 ml blood) in PAXgene Blood RNA Tubes (QIAGEN, Valencia, CA) were handled and kept at −80 °C until shipping to BGI Shenzhen by dry-ice cargo. RNA extraction and quality checks for sequencing were performed at BGI Shenzhen according to the BGI protocols. Details of the following procedures are provided on the website http://www.bgitechsolutions.com/.

RNA extraction & miRNA library construction. Total RNA was extracted at BGI Shenzhen from the transferred whole blood samples using Trizol (Invitrogen, Carlsbad, CA, USA), according to the manufacturer’s instructions. The mix was centrifuged at 12,000×g for 5 min at 4°C. The supernatant was transferred to a new 2.0 ml tube, to which 0.3 ml of chloroform/isoamyl alcohol (24:1) per 1.5 ml of Trizol reagent was added. After the mix was centrifuged at 12,000×g for 10 min at 4°C, the aqueous phase was transferred to a new 1.5 ml tube, to which a volume of isopropyl alcohol equal to that of the supernatant was added. The mix was centrifuged at 12,000×g for 20 min at 4°C and the supernatant was removed. After washing with 1 ml of 75% ethanol, the RNA pellet was air-dried in the biosafety cabinet and then dissolved by adding 25–100 μl of DEPC-treated water. Subsequently, total RNA was qualified and quantified using a NanoDrop (Thermo Fisher Scientific, MA, USA) and an Agilent 2100 Bioanalyzer.

mRNA Library Construction. Oligo(dT)-attached magnetic beads were used to purify mRNA. Purified mRNA was fragmented into small pieces with fragment buffer at an appropriate temperature. First-strand cDNA was then generated using random hexamer-primed reverse transcription, followed by second-strand cDNA synthesis. Afterwards, A-Tailing Mix and RNA Index Adapters were added by incubation for end repair. The cDNA fragments obtained from the previous step were amplified by PCR, and the products were purified with Ampure XP Beads and then dissolved in EB solution.
The product was validated on the Agilent Technologies 2100 Bioanalyzer for quality control. The double-stranded PCR products from the previous step were heat-denatured and circularized by the splint oligo sequence. The single-strand circular DNA (ssCir DNA) was formed as the final library. The final library was amplified with phi29 to make DNA nanoballs (DNBs), each of which had more than 300 copies of one molecule. The DNBs were loaded into the patterned nanoarray, and single-end 50-base reads were generated on the BGISEQ-500 platform (BGI-Shenzhen, China).

The library was prepared with 1 μg of total RNA for each sample. Total RNA was separated by electrophoresis on a 15% urea denaturing polyacrylamide gel (PAGE), and the small RNA regions corresponding to the 18–30 nt bands in the marker lane (14-30 ssRNA Ladder Marker, TAKARA, Japan) were excised and recovered. The 18–30 nt small RNAs were then ligated to a 5′-adaptor and a 3′-adaptor. The adaptor-ligated small RNAs were subsequently transcribed into cDNA by SuperScript II Reverse Transcriptase (Invitrogen, USA), and several rounds of PCR amplification with PCR Primer Cocktail and PCR Mix were performed to enrich the cDNA fragments. The PCR products were selected by agarose gel electrophoresis for target fragments of 100–120 bp, and then purified with the QIAquick Gel Extraction Kit (QIAGEN, Valencia, CA). The library was qualified and quantified by two methods: the distribution of fragment sizes was checked using the Agilent 2100 Bioanalyzer, and the library was quantified using real-time quantitative PCR (qPCR) (TaqMan Probe). The final ligation PCR products were sequenced using the BGISEQ-500 platform (BGI-Shenzhen, China).

Transcriptome Resequencing. We first removed the reads mapped to rRNAs to obtain the raw data, and then filtered out low-quality reads (more than 20% of bases with quality lower than 10), reads with adaptors, and reads with unknown bases (N bases more than 5%) to obtain the clean reads. We then mapped the clean reads onto the reference genome (hg38_UCSC_20180118), followed by novel gene prediction, SNP and INDEL calling, and gene splicing detection. Finally, we identified DEGs (differentially expressed genes) between samples and performed clustering analysis and functional annotation. In the transcriptome analysis, libraries were constructed for all samples and sequenced on the MGISEQ-2000 and BGISEQ-500 platforms. Paired-end reads were aligned to UCSC hg38 using HISAT2 and Bowtie2. We calculated gene expression by RSEM, called DEGs by the PoissonDis method, predicted novel transcripts by StringTie, and called SNPs by GATK.

Small RNA analysis. In the small RNA analysis, libraries were constructed for all samples and sequenced on the BGISEQ-500 platform. Single-end reads were aligned to UCSC hg38, miRBase 22 and Rfam by Bowtie2. We calculated miRNA expression with an in-house program, predicted miRNAs by miRDeep2, and predicted target genes by miRanda and TargetScan. The following processes were applied in the subsequent data analysis.
a. Eliminate low-quality reads, small tags, adaptors and other contaminants to obtain clean reads.
b. Summarize the length distribution of the clean tags, and the common and specific sequences between samples.
c. Annotate the clean tags into different categories.
d. Predict novel miRNAs by exploring the characteristic hairpin structure of the miRNA precursor.
e. Functionally annotate known miRNAs, including Gene Ontology and Pathway.
The detailed protocols are provided at the BGI website (http://www.bgitechsolutions.com/). The novel genes and small nuclear non-coding RNAs obtained in the present study have been deposited in the NCBI GEO (GSE157086).
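The clean-read criteria stated under Transcriptome Resequencing (reject reads with more than 20% of bases below quality 10, or with more than 5% N bases) can be sketched in Python as below. The function and parameter names are hypothetical, adaptor removal is omitted, and this is not the BGI pipeline itself.

```python
def is_clean_read(sequence, qualities, min_quality=10,
                  max_low_quality_fraction=0.20, max_n_fraction=0.05):
    """Return True if a read passes the quality and unknown-base filters.

    `sequence` is the base string and `qualities` the per-base quality
    scores as integers, one score per base.
    """
    if len(sequence) != len(qualities):
        raise ValueError("sequence and qualities must have the same length")
    if not sequence:
        return False
    length = len(sequence)
    # Count bases whose quality score is below the threshold.
    low_quality = sum(1 for q in qualities if q < min_quality)
    # Count unknown bases.
    unknown = sequence.upper().count("N")
    return (low_quality / length <= max_low_quality_fraction
            and unknown / length <= max_n_fraction)


# Hypothetical 10-base reads (illustrative only).
good = is_clean_read("ACGTACGTAC", [30, 30, 5, 30, 30, 30, 8, 30, 30, 30])
bad_quality = is_clean_read("ACGTACGTAC", [5, 5, 5, 30, 30, 30, 30, 30, 30, 30])
bad_n = is_clean_read("ACGNNCGTAC", [30] * 10)
print(good, bad_quality, bad_n)
```

The first read has exactly 20% low-quality bases and is kept, since the stated criterion removes only reads with more than 20%.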

References
1. Zauber, A. G. et al. Colonoscopic polypectomy and long-term prevention of colorectal-cancer deaths. N. Engl. J. Med. 366, 687-696 (2012).
2. Okada, K., Sadahiro, S., Ogimi, T. et al. Tattooing improves the detection of small lymph nodes and increases the number of retrieved lymph nodes in patients with rectal cancer who receive preoperative chemoradiotherapy: A randomized controlled clinical trial. Am. J. Surgery 215, 563-569 (2017).
3. Peng, Z. et al. Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome. Nat. Biotechnol. 30, 253–260 (2012).
4. Mo, J. S., Park, J. W. & Guan, K. L. The Hippo signaling pathway in stem cell biology and cancer. EMBO Rep. 25, 642-656 (2014).
5. Tanaka, T. et al. Huaier Regulates Cell Fate by the Rescue of Disrupted Transcription Control in the Hippo Signaling Pathway. Arch. Clin. Biomed. Res. 1, 179–199 (2017).
6. Tanaka, M. et al. Rapid and quantitative detection of a mammalian septin Bradeion as a practical diagnostic method of colorectal and urologic cancers. Med. Sci. Monit. 9, 61-68 (2003).
7. Jiang, Y. et al. Proteomics identifies new therapeutic targets of early-stage hepatocellular carcinoma. Nature 567, 257–261 (2019).
8. Song, Y. et al. Identification of genomic alterations in oesophageal squamous cell cancer. Nature 509, 91–95 (2014).
9. Chongtham, A. & Agrawal, N. Curcumin modulates cell death and is protective in Huntington’s disease model. Sci. Rep. 6, 18736-18745 (2016).
10. Da Rocha, A. B., Lopes, R. M. & Schwartsmann, G. Natural products in anticancer therapy. Curr. Opin. Pharmacol. 1, 364-369 (2001).
11. Li L, Ye S, W. Y. Progress on experimental research and clinical application of Trametes robiniophila. Bull. Chin. Cancer 16, 110–113 (2007).
12. Huo, Q. & Yang, Q. Role of Huaier Extract as a Promising Anticancer Drug. Adapt. Med. 4, 1–6 (2012).
13. Sun, Y. et al. A polysaccharide from the fungi of Huaier exhibits anti-tumor potential and immunomodulatory effects. Carbohydr. Polym. 92, 577-582 (2013).
14. Wang, X. et al. Chinese medicines for prevention and treatment of human hepatocellular carcinoma: current progress on pharmacological actions and mechanisms. J. Integr. Med. 13, 142–164 (2015).
15. Wang, X. et al. Huaier aqueous extract inhibits stem-like characteristics of MCF7 breast cancer cells via inactivation of hedgehog pathway. Tumor Biol. 35, 10805-10813 (2014).
16. Li, C. et al. A Huaier polysaccharide inhibits hepatocellular carcinoma growth and metastasis. Tumour Biol. 36, 1739–1745 (2015).
17. Zheng, J. et al. Huaier polysaccharides suppresses hepatocarcinoma MHCC97-H cell metastasis via inactivation of EMT and AEG-1 pathway. Int. J. Biol. Macromol. 64, 106–110 (2014).
18. Chen, Q. et al. Effect of Huaier granule on recurrence after curative resection of HCC: A multicentre, randomised clinical trial. Gut 67, 2006–2016 (2018).
19. Wu, T. et al. Huaier suppresses proliferation and induces apoptosis in human pulmonary cancer cells via upregulation of miR-26b-5p. FEBS Lett. 588, 2107-2114 (2014).
20. Yan, X. et al. Huaier aqueous extract inhibits ovarian cancer cell motility via the AKT/GSK3beta/beta-catenin pathway. PLoS One 8, e63731 (2013).
21. Zhang, T. et al. Huaier aqueous extract inhibits colorectal cancer stem cell growth partially via downregulation of the Wnt/β-catenin pathway. Oncol. Lett. 5, 1171-1176 (2013).
22. Cui, Y., Meng, H., Liu, W., Wang, H. & Liu, Q. Huaier aqueous extract induces apoptosis of human fibrosarcoma HT1080 cells through the mitochondrial pathway. Oncol. Lett. 9, 1590-1596 (2015).
23. Song, X., Li, Y., Zhang, H. & Yang, Q. The anticancer effect of Huaier (Review). Oncol. Rep. 34, 12–21 (2015).
24. Kanehisa, M. et al. KEGG for linking genomes to life and the environment. Nucleic Acids Res. 36, 480–484 (2008).
25. Ramsköld, D. et al. Full-Length mRNA-Seq from single cell levels of RNA and individual circulating tumor cells. Nat. Biotechnol. 30, 777–782 (2013).
26. Cock, P. J. A., Fields, C. J., Goto, N., Heuer, M. L. & Rice, P. M. The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants. Nucleic Acids Res. 38, 1767–1771 (2009).
27. Kim, D., Langmead, B. & Salzberg, S. L. HISAT: a fast spliced aligner with low memory requirements. Nat. Methods 12, 357–360 (2015).
28. Wang, Z., Gerstein, M. & Snyder, M. RNA-Seq: a revolutionary tool for transcriptomics. Nat. Rev. Genet. 10, 57–63 (2010).
29. Mortazavi, A., Williams, B. A., McCue, K., Schaeffer, L. & Wold, B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat. Methods 5, 621–628 (2008).
30. Robinson, J. T. et al. Integrative Genome Viewer. Nat. Biotechnol. 29, 24–6 (2011).
31. Pertea, M. et al. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat. Biotechnol. 33, 290–295 (2015).
32. Trapnell, C. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat. Protoc. 7, 562–578 (2013).
33. Kong, L. et al. CPC: Assess the protein-coding potential of transcripts using sequence features and support vector machine. Nucleic Acids Res. 35, 345–349 (2007).
34. DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498 (2011).
35. Shen, S. et al. rMATS: Robust and flexible detection of differential alternative splicing from replicate RNA-Seq data. Proc. Natl. Acad. Sci. U. S. A. 111, E5593–E5601 (2014).
36. Langmead, B., Trapnell, C., Pop, M. & Salzberg, S. L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, (2009).
37. Li, B. & Dewey, C. N. RSEM: Accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics 12, (2011).
38. Griffiths-Jones, S. miRBase: microRNA sequences, targets and gene nomenclature. Nucleic Acids Res. 34, D140–D144 (2006).
39. Nawrocki, E. P. et al. Rfam 12.0: Updates to the RNA families database. Nucleic Acids Res. 43, D130–D137 (2015).
40. Li, B., Ruotti, V., Stewart, R. M., Thomson, J. A. & Dewey, C. N. RNA-Seq gene expression estimation with read mapping uncertainty. Bioinformatics 26, 493–500 (2009).
41. Laganà, A. et al. Computational design of artificial RNA molecules for gene regulation. Methods Mol. Biol. 1269, 393–412 (2015).
42. Mack, G. S. MicroRNA gets down to business. Nat. Biotechnol. 25, 631–638 (2007).
43. Fabian, M. R., Sonenberg, N. & Filipowicz, W. Regulation of mRNA Translation and Stability by microRNAs. Annu. Rev. Biochem. 79, 351–379 (2010).
44. Bartel, D. P. MicroRNAs: Target Recognition and Regulatory Functions. Cell 136, 215–233 (2009).
45. Teruel-Montoya, R., Rosendaal, F. R. & Martínez, C. MicroRNAs in hemostasis. J. Thromb. Haemost. 13, 170–181 (2015).
46. Nourse, J., Braun, J., Lackner, K., Hüttelmaier, S. & Danckwardt, S. Large-scale identification of functional microRNA targeting reveals cooperative regulation of the hemostatic system. J. Thromb. Haemost. 16, 2233-2245 (2018).
47. Seto, A. G., Kingston, R. E. & Lau, N. C. The Coming of Age for Piwi Proteins. Mol. Cell 26, 603-609 (2007).
48. Ma, L. et al. LncBook: a curated knowledgebase of human long non-coding RNAs. Nucleic Acids Res. 47, D128–D134 (2019).

Table 1

No. age gender diagnosis stage metastasis [sampling time after Huaier administration: before, 30 days, 90 days, 6 months, 9 months, 11 months, 12 months, 14 months, 18 months] complication additional treatment prognosis

1 70 M lung cancer (small cell carcinoma) IV lymph nodes, liver ○ ○ ○ pleural effusion, anemia chemotherapy for 1 year deceased (144 days)
2 67 F pancreas cancer IV liver, lumbar vertebra ○ -- not specified chemotherapy for 4 months + radiotherapy deceased (35 days)
3 70 M pancreas cancer IV peritoneal metastasis ○ ○ ○ lung, severe anemia chemotherapy for 4 months + radiotherapy deceased (120 days)
4 67 M cholangiocarcinoma IIb liver ○ ○ ○ jaundice traditional Chinese medicine complex deceased (10 months)
5 72 M oesophageal squamous cell cancer IIb lung ○ ○ ○** ○ ○ ○ not specified chemotherapy for 4 months + radiotherapy remission
6 68 M stomach cancer (signet ring cell ca.) IIa not detected ○ ○ - not specified Vitamin C injection deceased (35 days)
7 56 M colorectal cancer IV lymph nodes, lung ○ ○ - diabetes mellitus (type I) chemotherapy for 1 year deceased (140 days)
8 78 M hepatoma I pancreas (suspected) ○ ○ ○ ○ ○** ○ hepatitis C (fulminant) Interleukin alfa + liver treatment remission
9 67 M oesophageal squamous cell cancer I hypopharynx ○ ○ ○ hyperlipidemia radiotherapy for 35 days remission
10 39 F meningioma (benign) -- ○ ○ ○ ○ ○** ○ not specified surgical dissection at 6 weeks after Huaier tr. complete recovery
11 76 M oesophageal adenocarcinoma I - ○ ○ ○ adenoma in stomach surgical dissection before Huaier tr. complete recovery
12 71 F lung cancer (adenocarcinoma) IV brain, vertebra ○ ○ ○** anemia, pleural effusion surgical dissection + chemotherapy deceased (150 days)
13 61 M colorectal adenoma (benign) - - ○ ○ ○ not specified endoscopic polyp dissection after Huaier tr. complete recovery
14 61 F colorectal adenoma (benign) - - ○ ○ ○ hepatoma (suspected) endoscopic polyp dissection after Huaier tr. complete recovery
15 73 M prostate cancer in situ (biopsy) 0 - ○ ○ ○ liver dysfunction None complete recovery
16 60 M prostate cancer in situ (biopsy) 0 - ○ ○ ○ liver dysfunction None complete recovery
17 62 M stomach cancer (adenocarcinoma) IV lung, liver ○ ○ ○ anemia chemotherapy followed deceased (150 days)
18 63 M lung cancer suspected by CT 0 - ○ ○ ○ prostate hyperplasia None complete recovery
19 70 F colorectal cancer IIb lymph nodes ○ ○ ○ not specified endoscopic dissection at 2 weeks after Huaier tr. remission
20 70 F colorectal adenoma (benign) - - ○ ○ ○ hypertension Huaier-free control, endoscopic dissection complete recovery
21 66 M healthy control - - ○ ○ ○ ○* ○* ○* not specified None healthy
22 73 F breast cancer IV lymph nodes, lung, liver ○ ○ ○ anemia, anorexia None deceased (93 days)
23 61 F breast cancer IV lymph nodes, lung, liver ○ -- anemia, anorexia surgical dissection planned deceased by accident
24 72 M cholangiocarcinoma IV lung, liver, prostate ○ ○ - anemia, anorexia chemotherapy for 4 months + radiotherapy deceased (58 days)
25 66 M healthy control - - ○* ○* ○* not specified None healthy
26 73 M healthy control - - ○ ○ ○* prostate hyperplasia None healthy
27 61 M prostate cancer I - ○ ○ ○* not specified surgical dissection before Huaier tr. remission
28 64 F oesophageal squamous cell cancer II lymph nodes ○ ○ ○* multi-cancer suspected endoscopic dissection before Huaier tr. remission
29 83 F oesophageal squamous cell cancer IV lung ○ ○ ○* dysphagia endoscopic dissection before Huaier tr. remission
30 72 M pancreas cancer IV lung, liver, vertebra ○ ○ ○* anemia, anorexia, ascites chemotherapy for 4 months deceased (60 days)
31 72 M pancreas cancer IV lung, liver, vertebra ○ ○ ○* anemia, anorexia, ascites chemotherapy for 6 months + immunotherapy deceased (120 days)

Table 2

Sample  Skipped exons  Mutually exclusive exons  Alternative 5' splice sites  Alternative 3' splice sites  Retained introns  A-G  C-T  Transition  A-C  A-T  C-G  G-T  Transversion  Total SNPs
KS1_1  12,633  1,975  2,620  3,453  4,317  27,269  27,034  54,303  4,598  3,092  6,200  4,659  18,549  72,852
KS1_2  12,935  2,023  2,665  3,497  4,346  29,664  29,625  59,289  5,093  3,570  6,664  5,155  20,482  79,771
KS1_3  13,415  2,062  2,777  3,615  4,448  30,485  30,057  60,542  5,182  3,482  6,895  5,199  20,758  81,300
KS2_1  10,520  1,700  2,211  2,989  3,973  28,184  27,953  56,137  4,881  3,388  6,113  4,761  19,143  75,280
KS3_1  13,561  2,160  2,681  3,498  4,343  28,505  28,393  56,898  4,947  3,423  6,688  4,938  19,996  76,894
KS3_2  12,212  1,868  2,559  3,341  4,287  27,712  27,633  55,345  4,719  3,141  6,344  4,760  18,964  74,309
KS3_3  12,927  2,103  2,608  3,437  4,338  28,275  28,022  56,297  4,765  3,161  6,506  4,849  19,281  75,578
KS4_1  15,222  2,366  2,957  3,801  4,603  36,375  36,491  72,866  6,186  4,226  8,137  6,173  24,722  97,588
KS4_2  15,323  2,432  2,887  3,766  4,485  35,590  35,178  70,768  6,130  4,301  7,948  6,182  24,561  95,329
KS4_3  17,577  2,887  3,154  3,959  4,667  36,892  36,941  73,833  6,369  4,596  8,350  6,424  25,739  99,572
KS5_1  12,774  2,081  2,787  3,777  4,873  25,825  25,597  51,422  4,389  3,122  5,997  4,515  18,023  69,445
KS5_2  13,852  2,352  2,914  3,902  4,968  27,696  27,980  55,676  4,795  3,397  6,351  4,961  19,504  75,180
KS5_3  18,120  3,014  3,589  4,624  5,451  39,988  39,975  79,963  6,905  5,193  9,208  6,998  28,304  108,267
KS5_4  16,227  2,883  3,226  3,896  4,788  35,477  35,332  70,809  5,968  4,635  7,651  6,196  24,450  95,259
KS5_5  10,747  1,702  2,391  3,195  4,434  22,091  21,880  43,971  4,399  3,151  5,381  4,234  17,165  61,136
KS5_6  11,740  1,851  2,522  3,421  4,564  22,635  22,337  44,972  4,338  3,272  5,377  4,184  17,171  62,143
KS6_1  14,622  2,316  2,851  3,732  4,543  34,169  33,696  67,865  5,860  3,996  7,595  5,713  23,164  91,029
KS6_2  12,488  1,936  2,546  3,465  4,290  28,918  28,489  57,407  4,951  3,397  6,377  4,905  19,630  77,037
KS7_1  10,993  1,857  2,211  2,991  3,921  27,082  26,902  53,984  4,486  3,152  5,843  4,605  18,086  72,070
KS7_2  11,537  1,947  2,349  3,083  4,071  28,315  28,919  57,234  4,719  3,330  6,268  4,762  19,079  76,313
KS8_1  16,009  2,647  3,358  4,342  5,300  33,089  32,820  65,909  5,539  3,819  7,516  5,573  22,447  88,356
KS8_2  18,702  3,088  3,704  4,629  5,460  46,621  45,998  92,619  7,997  6,005  9,990  7,954  31,946  124,565
KS8_3  15,285  2,495  3,297  4,192  5,297  39,089  38,915  78,004  7,244  5,460  8,994  7,381  29,079  107,083
KS8_4  17,751  2,900  3,573  4,673  5,495  48,001  48,049  96,050  7,468  5,514  10,046  7,721  30,749  126,799
KS8_5  17,177  2,986  3,406  4,098  4,930  38,094  37,786  75,880  6,508  5,044  7,995  6,497  26,044  101,924
KS8_6  14,582  2,240  2,957  3,888  4,887  32,377  31,778  64,155  6,135  4,689  7,556  5,890  24,270  88,425
KS9_1  14,382  2,194  2,762  3,674  4,514  34,213  34,018  68,231  5,876  4,125  7,724  5,705  23,430  91,661
KS9_2  14,635  2,273  2,835  3,686  4,562  33,480  33,400  66,880  5,640  4,048  7,445  5,525  22,658  89,538
KS9_3  12,535  1,985  2,625  3,393  4,444  30,875  30,871  61,746  5,649  3,874  7,381  5,700  22,604  84,350
KS10_1  15,393  2,510  3,235  4,241  5,272  35,699  35,402  71,101  5,956  4,138  7,819  5,866  23,779  94,880
KS10_2  16,365  2,689  3,370  4,405  5,366  38,584  38,318  76,902  6,626  4,944  8,655  6,490  26,715  103,617
KS10_3  11,826  1,964  2,643  3,435  4,752  27,475  27,433  54,908  5,415  3,927  6,495  5,435  21,272  76,180
KS10_4  18,428  3,258  3,567  4,255  4,968  44,956  45,029  89,985  7,499  5,837  9,296  7,565  30,197  120,182
KS10_5  12,485  1,949  2,699  3,588  4,715  27,704  27,533  55,237  5,166  3,774  6,673  4,970  20,583  75,820
KS10_6  12,617  1,958  2,641  3,527  4,624  26,403  26,226  52,629  5,010  3,818  6,341  4,873  20,042  72,671
KS11_1  14,220  2,245  2,865  3,701  4,546  35,106  35,126  70,232  5,795  3,987  7,748  5,798  23,328  93,560
KS11_2  16,731  2,639  3,048  4,022  4,742  39,922  39,628  79,550  6,836  4,974  8,868  6,580  27,258  106,808
KS11_3  11,624  1,791  2,475  3,234  4,254  35,839  35,902  71,741  7,779  6,345  8,429  7,819  30,372  102,113
KS12_1  10,686  1,720  2,336  3,091  4,196  27,510  27,409  54,919  5,294  3,725  6,689  5,345  21,053  75,972
KS12_2  12,663  1,922  2,623  3,395  4,402  35,134  35,049  70,183  6,964  5,014  8,311  6,789  27,078  97,261
KS12_3  19,240  3,092  3,545  4,519  5,309  55,119  55,174  110,293  8,861  6,305  11,648  8,995  35,809  146,102
KS13_1  11,896  1,904  2,513  3,249  4,296  29,132  29,286  58,418  5,758  4,048  7,057  5,783  22,646  81,064
KS13_2  11,674  1,847  2,527  3,252  4,321  27,996  27,911  55,907  5,470  3,792  6,709  5,556  21,527  77,434
KS13_3  18,317  3,111  3,476  4,417  5,254  43,614  43,369  86,983  6,905  4,761  9,382  7,033  28,081  115,064
KS14_1  12,044  1,900  2,541  3,321  4,324  33,015  32,764  65,779  6,542  4,787  7,510  6,519  25,358  91,137
KS14_2  11,072  1,774  2,394  3,157  4,196  28,324  27,978  56,302  5,352  3,848  6,611  5,332  21,143  77,445
KS14_3  17,198  2,760  3,342  4,237  5,119  51,175  51,497  102,672  8,268  5,811  10,522  8,239  32,840  135,512
KS15_1  12,475  1,901  2,634  3,469  4,413  35,232  35,381  70,613  6,959  5,347  7,974  6,805  27,085  97,698
KS15_2  18,053  2,917  3,407  4,358  5,238  44,555  44,772  89,327  7,142  5,156  9,486  7,152  28,936  118,263
KS15_3  17,762  2,992  3,385  4,357  5,231  47,501  47,843  95,344  7,479  5,545  9,955  7,652  30,631  125,975
KS16_1  13,103  2,142  2,722  3,476  4,431  32,430  32,311  64,741  5,734  4,308  7,107  5,791  22,940  87,681
KS16_2  17,066  2,797  3,306  4,284  5,231  47,450  47,705  95,155  7,665  5,529  9,850  7,587  30,631  125,786
KS16_3  16,040  2,636  3,202  4,117  5,105  40,282  40,706  80,988  6,480  4,619  8,441  6,515  26,055  107,043
KS17_1  14,215  2,237  2,800  3,663  4,497  36,985  37,179  74,164  6,534  4,642  8,299  6,686  26,161  100,325
KS17_2  17,502  2,844  3,338  4,279  5,176  40,056  39,674  79,730  6,475  4,622  8,664  6,456  26,217  105,947
KS17_3  17,730  2,901  3,443  4,329  5,199  41,720  41,848  83,568  6,746  4,918  8,973  6,842  27,479  111,047
KS18_1  13,401  2,095  2,703  3,525  4,470  32,309  32,471  64,780  6,009  4,296  7,488  6,062  23,855  88,635
KS18_2  17,753  2,945  3,363  4,255  5,232  43,897  43,978  87,875  6,943  4,999  9,222  7,086  28,250  116,125
KS18_3  16,334  2,606  3,210  4,158  5,122  46,729  47,595  94,324  7,428  5,183  9,897  7,486  29,994  124,318
KS19_1  12,627  1,976  2,689  3,413  4,400  32,839  32,461  65,300  5,767  4,073  7,462  5,842  23,144  88,444
KS19_2  18,471  3,063  3,510  4,421  5,256  53,519  53,149  106,668  8,397  6,116  11,009  8,621  34,143  140,811
KS19_3  17,534  2,852  3,377  4,342  5,264  43,392  43,758  87,150  7,013  5,127  9,100  7,114  28,354  115,504
KS20_1  17,427  2,820  3,449  4,415  5,633  46,529  46,114  92,643  7,303  5,013  9,752  7,220  29,288  121,931
KS20_2  18,082  2,934  3,577  4,564  5,734  48,356  48,352  96,708  7,619  5,341  10,239  7,667  30,866  127,574
KS20_3  17,442  2,990  3,451  4,065  5,177  41,496  41,651  83,147  6,914  5,226  8,617  6,931  27,688  110,835
KS21_1  18,735  3,119  3,691  4,699  5,835  49,871  49,872  99,743  7,940  5,777  10,397  7,881  31,995  131,738
KS21_2  18,089  2,956  3,606  4,544  5,758  40,383  40,374  80,757  6,537  4,840  8,775  6,531  26,683  107,440
KS21_3  16,511  2,908  3,342  4,095  4,956  42,217  42,302  84,519  6,893  5,146  8,560  6,878  27,477  111,996
KS22_1  16,179  2,815  3,232  4,254  5,433  40,353  40,852  81,205  6,296  4,528  8,296  6,347  25,467  106,672
KS22_2  16,133  2,907  3,192  3,834  4,977  41,292  42,167  83,459  6,722  5,191  8,272  6,731  26,916  110,375
KS22_3  13,438  2,395  2,726  3,377  4,570  33,215  33,595  66,810  5,566  4,119  6,716  5,619  22,020  88,830
KS24_1  15,205  2,711  3,048  3,736  4,718  35,233  35,324  70,557  5,596  4,331  7,123  5,744  22,794  93,351
KS24_2  12,298  2,055  2,491  3,457  4,549  24,505  24,229  48,734  4,615  3,494  5,742  4,497  18,348  67,082
KS25_1  15,246  2,425  3,037  4,052  5,116  34,453  34,396  68,849  6,573  5,024  8,076  6,412  26,085  94,934
KS25_2  13,651  2,163  2,824  3,705  4,745  29,443  29,408  58,851  5,906  4,693  7,089  5,652  23,340  82,191
KS25_3  9,768  1,521  2,182  2,984  4,174  20,786  20,363  41,149  4,153  3,222  4,915  3,878  16,168  57,317
KS26_1  10,311  1,638  2,339  3,124  4,320  21,851  21,440  43,291  4,310  3,351  5,164  4,079  16,904  60,195
KS26_2  12,071  1,968  2,450  3,284  4,388  26,961  26,603  53,564  5,300  4,149  6,281  5,065  20,795  74,359
KS26_3  11,753  1,805  2,497  3,418  4,503  24,446  23,855  48,301  4,886  3,898  5,664  4,502  18,950  67,251
KS27_1  12,121  1,908  2,555  3,453  4,563  27,042  26,719  53,761  5,052  3,788  6,235  4,956  20,031  73,792
KS27_2  9,849  1,601  2,133  2,936  4,094  21,453  21,177  42,630  4,329  3,397  5,006  4,030  16,762  59,392
KS27_3  11,910  1,931  2,538  3,362  4,520  25,451  24,962  50,413  5,085  3,982  5,923  4,714  19,704  70,117
KS28_1  13,399  2,083  2,767  3,688  4,825  30,644  30,117  60,761  5,925  4,673  7,156  5,699  23,453  84,214
KS28_2  11,134  1,741  2,481  3,366  4,560  25,741  25,490  51,231  4,910  4,039  5,976  4,748  19,673  70,904
KS28_3  12,818  1,902  2,710  3,546  4,695  30,399  29,688  60,087  5,869  4,608  7,040  5,735  23,252  83,339
KS29_1  9,946  1,516  2,200  2,973  4,200  23,372  23,178  46,550  4,420  3,358  5,590  4,399  17,767  64,317
KS29_2  10,666  1,667  2,333  3,106  4,334  26,053  25,443  51,496  5,020  3,975  6,078  4,779  19,852  71,348
KS29_3  12,686  1,944  2,674  3,524  4,654  34,292  33,949  68,241  6,782  5,429  8,009  6,618  26,838  95,079
KS30_1  10,639  1,637  2,309  3,115  4,302  33,734  33,064  66,798  7,068  5,506  8,201  6,824  27,599  94,397
KS30_2  8,795  1,351  1,997  2,803  4,030  24,684  24,469  49,153  5,126  4,056  5,801  4,865  19,848  69,001
KS31_1  7,657  1,229  1,655  2,341  3,412  17,423  17,249  34,672  3,711  3,141  4,052  3,418  14,322  48,994
KS31_2  14,796  2,436  2,931  3,795  4,847  32,220  32,134  64,354  6,182  4,956  7,428  5,935  24,501  88,855

Total  1,303,783  211,336  264,368  343,124  435,142  3,152,487  3,142,470  6,294,957  552,611  406,169  700,403  549,157  2,208,340  8,503,297
Average  14,172  2,297  2,874  3,730  4,730  34,266  34,157  68,423  6,007  4,415  7,613  5,969  24,004  92,427
Max  19,240  3,258  3,704  4,699  5,835  55,119  55,174  110,293  8,861  6,345  11,648  8,995  35,809  146,102
Min  7,657  1,229  1,655  2,341  3,412  17,423  17,249  34,672  3,711  3,092  4,052  3,418  14,322  48,994
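The summary rows of the SNP panel can be cross-checked arithmetically. The following is a minimal sketch, not part of the original analysis pipeline: it assumes that Transition is the sum of the A-G and C-T columns, that Transversion is the sum of the A-C, A-T, C-G and G-T columns, and that the Average row divides the Total row by the 92 samples listed (KS1_1 to KS31_2). The names `snp_total`, `N_SAMPLES` and `summarize` are illustrative only.

```python
# Column totals copied from the "Total" row of the SNP panel above.
snp_total = {
    "A-G": 3_152_487,
    "C-T": 3_142_470,
    "A-C": 552_611,
    "A-T": 406_169,
    "C-G": 700_403,
    "G-T": 549_157,
}

# Number of sample rows in the table (assumption: counted from the listing).
N_SAMPLES = 92


def summarize(counts, n_samples):
    """Recompute the derived columns from the six per-type SNP counts."""
    transition = counts["A-G"] + counts["C-T"]
    transversion = (
        counts["A-C"] + counts["A-T"] + counts["C-G"] + counts["G-T"]
    )
    total = transition + transversion
    return {
        "transition": transition,
        "transversion": transversion,
        "total": total,
        # Mean SNP count per sample, rounded to the nearest integer.
        "mean_per_sample": round(total / n_samples),
    }


summary = summarize(snp_total, N_SAMPLES)
print(summary)
```

Under these assumptions the recomputed transition, transversion and total counts agree with the Total row, and the mean per sample agrees with the Average row (92,427), the figure quoted in the abstract.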

Table 3

Huaier administration time points: 30 days, 90 days, 6 months, 9 months, 11 months, 1 year, 14 months, 1.5 year, Total
Each row lists up/down counts for the time points sampled in that patient, in chronological order, ending with the Total up/down pair.

Patient No.  up  down  ...
1  954  434  58  63  1,012  497
2  deceased after 1st sampling
3  303  723  528  1,134  831  1,857
4  512  113  319  259  831  372
5  67  28  1,750  105  2,351  3,351  3,468*  2,847  147  560  7,783  6,891
6  35  121  35  121
7  67  36  67  36
8  2,877  554  358  791  720  850  3,330*  3,505  3,533  2,891  10,818  8,591
9  47  225  1,005  471  1,052  696
10  1,799  812  810  4,247  5,671*  2,477  3,540  3,439  215  200  12,035  11,175
11  1,523  48  169  857  1,692  905
12  549  125  1,826*  353  2,375  478
13  36  53  1,119  505  1,862  558
14  66  94  1,685  1,178  1,751  1,272
15  1,455  1,141  60  47  1,515  1,188
16  1,166  736  35  186  1,201  922
17  313  239  140  442  453  681
18  664  498  62  183  726  681
19  1,580  823  26  186  1,606  1,009
20  21  30  3,746  3,103  3,767  3,133
21  204  82  1,424  2,746  3,094*  2,037  68  71  97  196  4,887  5,132
22  830  1,557  89  78  919  1,635
24  2,615  3,031  2,615  3,031
25  68  71  97  196  165  267
26  534  2,430  2,235  460  2,769  2,890
27  147  281  215  56  2,769  2,890
28  39  46  103  73  2,769  2,890
29  224  330  319  120  2,769  2,890
30  157  1,548  2,769  2,890
31  1,251  351  1,251  351

Average  2,589  2,273
Max  12,035  11,175
Min  35  36

Table 4

TF name  30 days  90 days
AF-4  56  52
AP-2  3  2
ARID  240  220
bHLH  150  165
C/EBP  10  9
CBF  -  3
CG-1  20  22
COE  7  4
CP2  12  9
CSD  11  9
CSL  1  0
CTF/NFI  12  10
CUT  86  98
DM  1  3
E2F  34  25
ESR-like  33  23
ETS  61  43
Fork  138  118
GCM  1  3
GCNF-like  4  2
GTF2I  19  22
HMG  272  271
HMGI/HMGY  100  79
Homeobox  730  665
HPD  3  3
HSF  23  24
HTH  4  5
IRF  21  18
MBD  131  117
MH1  16  20
MYB  184  175
NCU-G1  2  2
NDT80/PhoG  4  5
NF-YA  1  2
NF-YB  9  5
NF-YC  14  13
NGFIB-like  14  7
Nrf1  1  3
Others  1,128  1,037
P53  7  11
PAX  24  9
PC4  11  6
Pou  51  34
RFX  22  15
RHD  146  105
Runt  9  9
RXR-like  13  9
SAND  32  21
SRF  18  16
STAT  24  15
T-box  20  17
TEA  3  2
TF_bZIP  158  116
TF_Otx  12  6
THAP  39  22
THR-like  37  35
TSC22  13  10
Tub  17  16
ZBTB  342  333
zf-BED  16  14
zf-C2H2  2,535  2,293
zf-C2HC  415  451
zf-GATA  101  96
zf-LITAF-like  4  4
zf-MIZ  26  26
zf-NF-X1  13  4

Total  7,664  6,988
Max  2,535  2,293
Min  1  0

Table 5

Sample name  Known miRNA count  Novel miRNA count  Known piRNA count  Novel piRNA count  Known siRNA count  Novel siRNA count
KS1_1  1,001  317  48  213  0  2
KS1_2  1,044  251  60  167  0  1
KS1_3  985  262  69  170  0  1
KS2_1  941  161  74  99  0  0
KS3_1  923  190  68  88  0  1
KS3_2  903  182  69  97  0  0
KS3_3  978  260  86  148  0  2
KS4_1  993  214  87  144  0  1
KS4_2  941  180  88  112  0  1
KS4_3  1,005  129  70  108  0  0
KS5_1  946  317  61  194  0  0
KS5_2  919  255  69  162  0  1
KS5_3  962  191  86  125  0  19
KS5_4  1,000  108  118  108  0  0
KS5_5  708  26  35  24  0  0
KS5_6  906  64  72  71  0  0
KS6_1  989  260  67  174  0  2
KS6_2  996  289  63  168  0  1
KS7_1  983  201  63  125  0  0
KS7_2  970  259  59  124  0  0
KS8_1  956  238  69  148  0  1
KS8_2  946  218  73  141  0  0
KS8_3  729  15  73  21  0  0
KS8_4  1,018  205  81  168  0  1
KS8_5  914  124  78  106  0  0
KS8_6  671  14  43  14  0  0
KS9_1  905  273  50  189  0  0
KS9_2  986  240  74  145  0  0
KS9_3  721  30  64  35  0  1
KS10_1  995  281  57  159  0  0
KS10_2  980  207  73  124  0  0
KS10_3  769  21  67  27  0  0
KS10_4  891  112  68  58  0  1
KS10_5  848  51  30  45  0  0
KS10_6  784  41  31  39  0  0
KS11_1  1,054  324  60  230  0  2
KS11_2  970  206  92  129  0  0
KS11_3  818  39  82  55  0  1
KS12_1  741  14  67  25  0  1
KS12_2  782  16  78  35  0  1
KS12_3  943  181  81  123  0  1
KS13_1  908  34  76  46  0  3
KS13_2  773  40  66  57  0  0
KS13_3  982  153  91  138  0  0
KS14_1  728  16  45  17  0  0
KS14_2  809  51  63  50  0  0
KS14_3  967  151  120  135  0  0
KS15_1  781  19  71  27  0  1
KS15_2  972  127  86  99  0  0
KS15_3  935  127  72  102  0  1
KS16_1  829  62  62  51  0  0
KS16_2  942  173  73  116  0  0
KS16_3  969  186  73  168  0  1
KS17_1  981  244  62  138  0  0
KS17_2  1,015  167  81  131  0  1
KS17_3  956  110  88  83  0  0
KS18_1  684  24  47  22  0  0
KS18_2  920  106  75  115  0  0
KS18_3  898  107  74  85  0  0
KS19_1  926  51  84  121  0  0
KS19_2  988  138  85  111  0  0
KS19_3  987  160  75  129  0  0
KS20_1  864  84  53  67  0  0
KS20_2  978  155  102  121  0  1
KS20_3  887  85  83  72  0  0
KS21_1  1,023  178  88  133  0  1
KS21_2  990  163  90  161  0  1
KS21_3  909  90  103  59  0  0
KS22_1  1,045  185  65  177  0  1
KS22_2  937  106  115  104  0  0
KS22_3  854  67  78  61  0  0
KS24_1  992  178  103  181  0  0
KS24_2  811  47  31  43  0  0
KS25_1  792  44  49  28  0  0
KS25_2  698  23  60  16  0  0
KS25_3  725  37  45  112  0  0
KS26_1  806  41  52  34  0  0
KS26_2  761  33  35  30  0  0
KS26_3  793  30  64  26  0  0
KS27_1  734  30  40  24  0  0
KS27_2  708  23  47  17  0  0
KS27_3  744  24  45  28  0  0
KS28_1  723  23  39  28  0  0
KS28_2  824  31  51  20  0  0
KS28_3  724  25  39  25  0  0
KS29_1  758  19  43  21  0  0
KS29_2  757  17  39  23  0  0
KS29_3  722  26  47  34  0  0
KS30_1  768  38  27  28  0  0
KS30_2  813  53  49  40  0  0
KS31_1  757  15  67  16  0  0
KS31_2  817  25  64  31  0  0

Average  882  121  67  91  0  1
Max  1,054  324  120  230  0  19
Min  671  14  27  14  0  0

Table 6

Huaier administration time points: 30 days, 90 days, 6 months, 9 months, 11 months, 1 year, 14 months, 1.5 year
Each row lists Up-regulated/Down-regulated counts for the time points sampled in that patient, in chronological order.

siRNA
KS No.  Up-regulated  Down-regulated  ...
1  0  0  0  0
3  0  1  1  0
4  0  0  0  1
5  1  0  20  1  0  21  0  0  0  0
6  1  1
7  0  0
8  0  1  1  0  1  0  0  1  0  0
9  0  0  1  0
10  0  0  0  0  1  0  0  1  0  0
11  0  2  1  0
12  0  0  1  1
13  0  3  0  0
14  0  0  0  0
15  0  1  1  0
16  0  0  1  0
17  0  0  0  0
18  0  0  0  0
19  0  0  0  0
20  1  0  0  1
21  0  0  0  1  0  0  0  0  0  0
22  0  1  0  0
24  0  0
25  0  0  0  0
26  0  0  0  0
27  0  0  0  0
28  0  0  0  0
29  0  0  0  0
30  0  0
31  0  0

miRNA
KS No.  Up-regulated  Down-regulated  ...
1  22  126  51  50
3  54  68  253  23
4  18  142  135  196
5  51  128  118  184  113  233  28  446  302  9
6  42  28
7  113  45
8  193  232  49  549  602  28  18  302  33  389
9  79  74  38  552
10  55  204  52  512  338  60  86  243  13  55
11  52  327  82  473
12  38  5  464  62
13  40  95  400  33
14  128  5  396  50
15  458  44  28  57
16  368  63  68  25
17  131  175  17  129
18  379  25  2  6
19  333  131  85  19
20  320  28  53  331
21  27  59  37  267  115  236  27  130  55  20
22  41  246  34  153
24  46  439
25  27  130  55  20
26  19  70  74  27
27  41  37  20  20
28  86  11  14  98
29  17  9  29  10
30  55  8
31  90  18

piRNA
KS No.  Up-regulated  Down-regulated  ...
1  16  19  47  26
3  27  15  65  9
4  8  46  39  54
5  34  73  62  85  54  68  11  129  64  2
6  21  35
7  21  18
8  63  76  6  129  156  1  11  92  1  94
9  18  65  19  135
10  21  62  9  115  50  14  28  53  7  12
11  34  128  28  126
12  6  2  119  4
13  31  30  122  27
14  30  0  126  12
15  105  6  29  22
16  93  10  54  11
17  38  48  4  47
18  110  1  9  40
19  85  90  43  16
20  84  21  27  96
21  43  33  10  105  25  52  6  18  100  5
22  43  104  8  66
24  18  171
25  6  18  100  5
26  8  15  8  11
27  6  10  11  2
28  7  12  11  9
29  8  5  8  4
30  17  2
31  24  2

Fig. 1a. SNP number (y-axis, 0 to 160,000) by single nucleotide polymorphism (SNP) variant type (x-axis: A-G, C-T, Transition, A-C, A-T, C-G, G-T, Transversion, Total), plotted for each sample from KS1_1 to KS31_2.

Fig. 1b. Number (y-axis, 0 to 25,000) by splice event type (x-axis: Skipped exons, Mutually exclusive exons, Alternative 5' splice sites, Alternative 3' splice sites, Retained introns), plotted for each sample from KS1_1 to KS31_2.

Fig. 3a-1. Panel labels: 5 (8 months); 8 (3 months); 10 (4 months); 12.

Fig. 3a-2. Panel label: 21/25; 30 days. Bar charts titled "DEGs Number of The Most Enriched Pathway" (y-axis: DEGs Number; bars: up, down; x-axis: enriched pathways).

Fig. 3b (11 sheets). Panel labels by sheet: 8; 8 and 31; 10 and 22; 26 (Prostate hyperplasia) and 15; 1 and 12; 5, 8 and 12; 5, 8, 10, 21/25 and 12 (three sheets, the last marked NFkB).

Fig. 3c-1. Enriched KEGG Pathway plots (x-axis: Rich Factor; legends: Qvalue, Gene_number). Column headings: 30 days, 90 days; Up-regulated, Down-regulated. Group heading: Colon-endoscopic dissection. Panels: No.14 Adenoma; No.13 Multiple adenoma; No.19 Stage IIb cancer.

Fig. 3c-2. Enriched KEGG Pathway plots (x-axis: Rich Factor; legends: Qvalue, Gene_number). Column headings: 30 days, 90 days. Panels: No.7 Stage IV cancer; No.20 Benign adenoma (2.0 x 2.0 cm diameter, no treatment, or Huaier administration).

Fig. 3c-3. Panel labels: 13, 14; 19, 20.

Fig. 4a. Column headings: 30 days, 90 days.

Fig. 4b. One panel per patient (Patient No.1 to No.31; 21=25) at 1 month and 3 months of Huaier administration. Additional time points: Patient No.5 at 11 months, 14 months and 1.5 year; Patient No.8 at 6 months, 12 months and 1.5 year; Patient No.10 at 9 months, 14 months and 1.5 year; Patient No.21 (21=25) at 11 months, 12 months and 14 months.

Fig. 5. Huaier administration: 30 days, 90 days. Rows: siRNA, miRNA, piRNA.

Fig. 6. siRNA, miRNA and piRNA panels over the course of Huaier administration. Patient No.5: 1 month, 3 months, 11 months, 14 months, 18 months. Patient No.8: 1 month, 3 months, 6 months, 12 months, 18 months. Patient No.10: 1 month, 3 months, 9 months, 14 months, 18 months. Patient No.21/25: 1 month, 3 months, 9 months, 14 months, 18 months.

Extended Table 1

KS No.  1 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 24 25 26 27 28 29 30 31
Pathway  up/down pairs for each KS No., in the order above
Cellular community - eukaryotes  72 34 18 37 35 6 3 2 0 9 7 2 176 66 4 10 124 74 84 1 53 6 2 7 3 5 149 36 134 19 20 13 67 13 144 27 0 0 2 5 63 156 238 174 3 10 33 155 5 18 0 0 15 26 10 114 89 22
Transport and catabolism  103 35 37 43 56 7 7 1 1 9 5 0 309 66 2 14 181 77 158 1 36 8 1 2 3 4 144 89 0 0 25 10 72 27 168 46 0 1 12 4 128 194 274 353 3 2 36 177 5 14 3 3 13 36 11 135 81 27
Cell growth and death  63 38 20 131 26 2 3 0 3 3 10 0 192 45 3 8 109 75 110 2 27 9 2 0 2 6 126 68 102 36 12 17 50 18 161 38 1 1 9 5 96 130 188 209 6 2 30 151 8 9 4 8 16 17 14 92 81 17
Cell motility  13 10 4 6 5 2 2 0 0 3 1 0 32 27 1 3 32 21 19 0 11 1 0 0 0 0 51 10 37 7 5 4 28 3 52 9 0 0 0 0 17 59 82 36 0 1 11 42 2 2 0 0 1 9 1 24 14 6
Signal transduction  318 147 69 188 149 23 18 6 1 39 62 7 677 189 8 57 558 265 378 9 169 29 8 23 26 31 579 182 0 0 70 74 238 75 540 141 3 4 32 20 287 689 1,070 660 12 30 181 657 27 78 4 19 64 108 55 489 358 136
Signaling molecules and interaction  58 41 15 50 24 9 7 2 1 5 10 1 96 35 3 12 97 79 49 1 33 9 3 4 3 6 69 35 0 0 20 11 36 9 61 20 0 1 3 4 32 71 134 129 5 7 40 137 6 23 4 3 17 13 18 86 93 26
Membrane transport  1233001001007401646010000161210020410000156 14 01380100010942
Folding, sorting and degradation  50 21 9 61 32 42125220045 113 20 106 2 29 3 1 1 1 1 81 74 56 53 8 7 36 35 94 51 0 0 14 1 80 80 143 223 0 5 19 103 4 12 1 0 12 9 2 66 54 4
Translation  77 8 8 14 45 2 2 0 0 4 2 1 231 11 0 27 107 27 110 4 27 1 3 3 5 3 72 147 0 0 10 4 34 83 76 111 4 3 43 2 73 69 99 249 1 4 10 127 2 5 0 1 8 7 2 53 53 5
Transcription  11 31361110000 67 12 0 4 40 10 36 1 6 0 1 2 1 2 41 45 0 0 4 1 12 28 39 35 1 1 12 1 25 32 52 90 1 1 8 52 2 6 0 0 5 2 0 20 31 4
Replication and repair  9 3 1 36 5 1 2 0 0 1 0 0 66 1 0 1 34 12 37 2 2 1 1 1 0 2 30 39 0 0 3 2 9 21 34 27 1 0 2 1 12 16 27 79 2 1 5 69 3 0 3 0 6 1 2 21 22 3
Infectious diseases: Viral  116 146 20 212 65 14 9 5 0 15 15 23 0 0 6 21 253 143 209 3 67 7 1 10 7 5 287 116 260 60 87 19 131 35 322 57 0 1 14 7 176 354 497 444 4 13 84 366 11 33 2 1 36 40 14 187 250 41
Cancers: Overview  144 71 31 119 70 17 12 3 3 26 15 6 353 111 9 19 271 135 196 3 90 20 3 9 11 5 270 86 250 50 43 43 140 40 286 69 1 1 11 12 138 295 444 355 6 17 87 332 10 32 3 8 28 54 26 238 158 63
Cancers: Specific types  149 83 44 150 64 9 8 6 0 18 17 8 439 110 2 23 313 117 234 2 125 16 0 10 11 1 261 81 284 41 49 41 129 45 298 43 0 0 4 11 142 336 513 387 1 27 82 385 9 29 1 4 30 60 24 237 172 53
Infectious diseases: Parasitic  69 19 17 40 23 4 4 3 0 5 6 1 0 0 0 16 79 66 58 5 22 6 0 8 5 4 84 39 71 22 17 6 40 9 78 30 0 0 5 1 57 122 174 128 2 6 47 131 8 14 2 1 19 17 7 62 86 32
Endocrine and metabolic diseases  35 14 4 23 12 64005400024 83 19 51 1 16 7 1 2 3 2 70 60 55 38 11 5 25 38 63 44 1 0 14 0 38 76 125 117 1 6 22 92 9 8 0 0 17 5 2 40 58 12
Immune diseases  24 11 5 30 16 0 6 2 4 4 2 0 42 13 1 3 30 53 35 1 23 3 1 2 1 5 24 66 30 21 8 14 8 13 19 26 1 0 6 1 31 47 58 105 2 1 19 128 3 6 1 5 8 5 9 39 80 16
Infectious diseases: Bacterial  55 27 9 60 21 49426610033 82 69 73 5 21 6 1 4 0 3 123 79 67 40 13 21 41 22 123 33 1 1 10 4 92 165 215 196 2 1 61 118 7 9 3 5 18 26 10 65 71 23
Neurodegenerative diseases  73 12 8 23 31 4 9 0 0 0 1 0 0 0 2 13 110 30 94 3 10 2 0 1 1 5 54 170 23 108 10 4 25 123 47 160 0 1 54 2 76 58 104 204 0 2 19 86 28 8 1 2 25 11 3 52 47 14
Cardiovascular diseases  38 16 11 22 10 12 11 1 0 4 3 0 54 43 6 2 45 47 34 0 37 9 0 4 1 2 68 32 47 11 7 16 33 8 60 25 0 0 3 0 26 65 106 72 5 4 31 88 5 5 1 2 12 7 5 57 32 11
Substance dependence  19 21 4 13 12 3 2 0 2 0 1 0 41 18 1 4 38 29 30 1 14 3 0 5 0 5 27 28 0 0 2 12 14 8 19 15 1 0 6 2 24 55 89 57 0 1 15 34 1 7 0 5 2 9 10 50 26 3
Drug resistance: Antineoplastic  18 11 9 18 8 3 1 0 0 0 5 0 47 17 2 4 37 22 35 1 14 2 0 1 1 0 45 13 44 4 4 3 21 9 47 11 0 0 1 3 24 53 74 54 0 2 13 50 0 5 0 1 3 11 4 31 21 5
Global and overview maps  94 33 29 90 56 20 6 1 4 8 7 1 262 63 2 17 266 55 182 5 66 19 9 4 9 13 134 139 91 97 20 36 53 69 138 119 2 3 30 6 101 146 221 411 7 8 49 382 17 27 2 4 27 19 18 148 128 32
Lipid metabolism  40 12 13 33 13 5 0 0 0 3 0 0 76 10 0 25 86 21 64 7 11 8 0 1 10 9 31 25 28 24 7 13 16 22 36 40 1 1 4 6 30 54 91 119 9 3 23 107 1 16 1 5 14 4 10 54 49 15
Amino acid metabolism  21 20 11 31 17 14 3 0 1 5 6 0 49 25 1 2 63 10 33 0 9 14 3 1 0 3 60 16 35 7 15 11 25 2 46 10 1 0 7 6 16 38 59 102 4 4 17 117 7 7 0 2 4 11 7 39 31 17
Carbohydrate metabolism  20 8 13 16 12 1 1 1 0 1 3 0 72 18 0 2 103 15 45 0 22 5 2 1 2 2 61 18 43 5 3 9 28 2 62 7 0 0 4 4 31 55 77 142 2 6 14 131 3 8 0 2 5 12 7 46 33 16
Nucleotide metabolism  16 8 5 20 12 4 2 0 0 1 1 1 49 20 1 7 46 16 38 1 6 2 0 2 2 3 39 47 0 0 6 6 14 24 32 42 0 4 3 2 19 32 61 90 0 1 9 88 3 5 0 0 4 5 1 37 41 6
Glycan biosynthesis and metabolism  17 1 5 18 710012100001 46 26 36 1 13 3 1 1 1 3 31 14 22 11 5 4 14 3 22 11 0 0 2 2 13 42 59 60 0 0 13 77 0 6 0 0 1 6 0 32 29 5
Metabolism of cofactors and vitamins  7 8 6 15 931000000024 31 11 29 2 12 5 0 1 1 3 16 27 13 16 7 5 9 6 20 16 0 0 0 1 13 19 32 60 2 0 12 46 1 5 0 0 1 4 6 28 13 9
Metabolism of other amino acids  8 2 5 10 2 2 0 0 0 1 0 0 15 7 0 0 21 4 11 0 4 2 0 1 0 0 13 7 5 2 3 7 7 2 13 8 0 0 0 0 12 9 18 37 0 0 6 23 1 0 0 0 2 1 4 10 9 2
Energy metabolism  20 0 2 4 13 1 2 0 0 0 0 0 43 6 0 3 48 2 35 0 1 0 0 0 0 0 3 59 2 51 0 2 2 39 4 54 0 0 19 0 13 8 16 61 0 0 3 17 9 1 2 0 8 1 1 9 7 1
Xenobiotics biodegradation and metabolism  2 5 1 4 5 1 0 0 0 0 0 0 13 6 0 0 25 3 12 1 11 00002690004434 11 0 0 1 0 2 5 17 25 3 0 2 23 0 6 0 0 0 2 0 15 7 5
Metabolism of terpenoids and polyketides  1 0 1 0 2 1 1 0 0 1 0 0 9 0 0 0 406020000044110021310120215900190300100260
Biosynthesis of other secondary metabolites  1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 412010000031210030300010110601020100000210
Immune system  107 95 49 155 58 13 15 7 8 11 34 7 295 121 3 29 212 176 205 8 103 12 1 4 9 14 298 153 203 82 31 37 113 60 281 97 0 3 33 9 147 414 546 351 18 6 107 417 24 23 3 4 39 38 32 167 264 62
Endocrine system  120 74 28 87 62 18 7 1 1 11 7 2 277 95 2 23 249 115 166 5 66 18 2 20 12 7 270 82 223 41 26 41 89 41 222 45 4 4 10 10 117 302 470 284 3 11 79 270 10 44 2 16 21 49 24 226 135 52
Digestive system  38 47 24 26 20 9 2 1 1 8 2 1 0 0 3 10 79 64 44 2 18 14 2 11 6 6 123 33 66 16 20 11 29 16 68 29 2 7 0 1 33 118 193 114 0 6 31 160 7 28 3 6 10 56 17 98 94 26
Nervous system  64 33 5 33 34 4 4 0 0 5 4 0 124 39 1 14 100 49 87 6 24 3 1 4 3 7 100 75 85 47 6 15 41 37 94 53 0 0 24 1 70 141 198 157 1 3 30 106 5 10 1 6 12 9 5 120 49 32
Development  14 22 2 25 9 4 2 2 0 3 1 1 41 29 1 7 40 41 36 2 16 1 1 2 0 2 55 11 46 5 10 6 29 6 55 10 0 0 1 4 20 84 114 42 1 3 26 61 7 9 0 0 7 11 7 50 34 8
Sensory system  14 10 1 11 11 1 0 0 0 1 1 0 33 15 1 4 18 19 17 2 8 2 0 1 1 2 28 23 0 0 3 12 9 8 17 11 0 0 0 0 12 38 59 29 0 1 8 34 2 12 0 5 4 8 8 51 24 10
Circulatory system  35 8 9 6 13 3 4 0 0 2 2 0 38 18 0 10 49 20 32 1 8 2 0 4 1 1 42 29 24 25 8 9 11 22 33 27 0 4 7 0 27 46 69 42 4 0 15 37 4 5 0 4 15 14 4 48 19 12
Aging  24 88962000320 55 8 1 3 35 7 24 0 7 3 1 0 0 0 43 12 35 6 1 3 13 7 22 7 0 1 1 1 27 34 52 42 0 0 14 27 4 3 0 0 1 3 7 23 7 7
Environmental adaptation  8 3 1 2 8 0 1 1 0 0 2 0 17 5 0 1 11 9 12 0 2 0 1 1 0 1 13 7 14 5 0 2 5 2 15 3 0 0 3 0 11 16 25 15 0 1 6 18 0 3 0 1 4 2 0 22 9 3
Excretory system  11 77462001200 20 13 1 2 16 14 13 0 2 1 0 3 1 2 36 8 13 8 2 2 3 5 21 9 0 3 2 1 9 31 45 21 0 0 3 28 0 4 1 0 0 17 4 23 9 3
Subtotal  2,197 1,177 572 1,881 1,091 242 174 51 36 230 247 65 4,317 1,266 78 418 4,294 2,072 3,271 91 1,245 262 53 161 143 178 4,070 2,295 2,413 961 601 562 1,710 1,039 3,946 1,629 25 46 411 140 2,364 4,761 7,139 6,554 110 198 1,324 5,668 260 550 48 123 535 746 391 3,417 2,879 848
Total  3,374 2,453 1,333 225 266 312 5,583 496 6,366 3,362 1,507 214 321 6,365 3,374 1,163 2,749 5,575 71 551 7,125 13,693 308 6,992 810 171 1,281 3,808 3,727

Extended Table 2

Sampling time after Huaier administration
Patient No.  before  30 days  90 days  6 months  9 months  11 months  12 months  14 months  18 months
1  23.06  20.05  21.09
3  22.99  14.65  19.6
4  22.74  21.04  23.77
5  21.45  19.9  25.54  28.45  17.14  23.48
6  27.4  22.58
7  18.51  20.83
8  29.63  18.34  10.74  29.02  31.66  33.18
9  23.16  26.02  12.38
10  31.57  23.27  8.98  41.44  40.88  32.9
11  24.17  21.2  8.68
12  8.47  12.03  23.35
13  7.58  9.44  24.26
14  13.07  11.84  29.77
15  10.7  30.66  25.81
16  11.97  27.01  22.99
17  34.31  28.1  23.56
18  14.88  29.76  26.83
19  11.05  26.31  24.84
20  30.67  35.74  48.99
21  25.44  22.13  44.74  26.53*  20.69*  22.14*
22  20.32  36.91  34.64
24  34.26  23.15
25  26.53*  20.69*  22.14*
26  22.49  15.75  19.92
27  21.7  19.15  21.75
28  29.78  28.25  32.33
29  26.28  24  23.43
30  22.47  19.36
31  20.77  22.99
*: the same

Sampling time after Huaier administration
Patient No.  before  30 days  90 days  6 months  9 months  11 months  12 months  14 months  18 months
1  50.04  45.41  53
3  80.16  35.28  63.57
4  54.79  34.88  39.16
5  54.12  50.67  55.49  9.21  34.18  45.4
6  57.97  56.38
7  37  46.12
8  67.05  33.18  35.83  64.21  8.55  39.79
9  58.04  68.18  46.61
10  55.78  36.03  39.44  12.52  47.62  36.65
11  58.89  48.58  39.43
12  36.52  49.91  50
13  47.2  42.85  52.84
14  39.14  43.04  43.93
15  52.74  64.62  55.82
16  39.7  53.61  58.45
17  45.09  46.08  46.77
18  47.37  59.43  60.89
19  41.59  51.46  57.81
20  59.34  68.57  13.88
21  49.47  53.41  18.36  41.31*  31.46*  38.62*
22  65.44  11.82  12.02
24  13.14  49.17
25  41.31*  31.46*  38.62*
26  40.07  26.23  33.86
27  40.55  29.7  33.8
28  44.58  45.79  47.44
29  33.51  34.68  33.48
30  34.88  36.39
31  30.37  44.66
*: the same

Extended Table 3

a
iPS-related genes
iPS genes\KS No.  1 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 24 25 26 27 28 29 30 31
KIT 3,815  Up ******* Up ******* Up ****** Up *****
MYC 4,609  ************** Up ***** Down Up * Down *****
OCT3/4 5,460  ************** Up * Up Up ******** Down **
SOX2 6,657  ***** ***********
LIN28A 79,729  ** Down ********************* Down ****
NANOG 79,923  ****************************

b
iPS-related genes
iPS genes\KS No.  1 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 24 25 26 27 28 29
KIT 3,815  *-**--****** Up Up *******-*****
MYC 4,609  *-**-- Down * Down Down Up Up ****** Down Down Down -*****
OCT3/4 5,460  *-* Up --**** Up Up Up * Up * Up * Down **-*****
SOX2 6,657  -*--*********-**
LIN28A 79,729  *- Down *--***** Down Down ********-*****
NANOG 79,923  *-**--***************-*****

Extended Table 4

sRNA Type             Average   S.D.    Max.    Min.
miRNA
  mature                87.05   4.15   94.44   71.24
  precursor              6.35   1.72   10.33    3.66
genome
  intron                 1.33   0.92    3.68    0.21
  intergenic             1.07   1.16    6.39    0.06
  exon                   0.51   0.39    1.40    0.06
  repeat                 0.46   0.37    1.70    0.05
sncRNA
  piRNA                  0.06   0.06    0.26    0.00
  rRNA                   0.05   0.04    0.19    0.00
  snoRNA                 0.03   0.02    0.10    0.00
  tRNA                   0.01   0.01    0.03    0.00
  snRNA                  0.00   0.00    0.00    0.00
  hairpin                0.00   0.00    0.00    0.00
unknown
  unmap                  1.94   1.73   12.37    0.30
Rfam other sncRNA        1.14   1.14    4.94    0.07

Figures

Figure 1

SNP variance after Huaier administration. a. Classifications by SNP variation types; b. classifications by splice event types.

Figure 2

Comparison of the numbers of differentially expressed genes (DEGs) in each patient over the time course of Huaier administration. The asterisks (*) indicate the time of opportunistic virus infection.

Figure 3

Pathway classification and functional enrichment of DEGs. a. Pathway functional enrichment results for up- and down-regulated genes. The X axis shows the pathway terms; the Y axis shows the number of up- and down-regulated genes. b. Detailed analysis of the Huaier effects on each cancer by KEGG biological pathways (ref. 25). The pathways responsible for cancer progression, stress reduction, and major immunological responses related to the NFkB and TGFb signalling pathways were specifically compared between patients with recovery and those with poor prognosis. c. Functional enrichment of the factors classified by KEGG pathways. The colour indicates the q-value (high: white, low: blue); a lower q-value indicates more significant enrichment. Point size indicates the number of DEGs (bigger dots refer to larger numbers). Rich Factor refers to the value of the enrichment factor, which is the quotient of the foreground value (the number of DEGs) and the background value (the total gene amount). The larger the value, the more significant the enrichment.

Figure 4

TF-DEG network in each patient over the time course of Huaier administration. a. TF families classified by the encoding DEGs, with the numbers of up- and down-regulated DEGs in each family. b. Predicted TF-DEG network. The red and green dots represent the up-regulated and down-regulated DEGs, respectively. Each purple ball represents a transcription factor; the larger the node, the more DEGs the transcription factor regulates.

Figure 5

Comparison of the numbers of DESs up to 90 days of Huaier administration. Upper: statistics of siRNA; middle: miRNA; lower: piRNA.

Figure 6

Comparison of the numbers of DESs in each patient by time course of Huaier administration.
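
The Rich Factor named in the caption of Figure 3c is described there as a quotient of a foreground value (DEGs) and a background value (total genes). A minimal sketch of that ratio follows, assuming that both counts refer to the same KEGG pathway; the function name `rich_factor` and the example counts are hypothetical illustrations and are not taken from the study data or its analysis pipeline.

```python
def rich_factor(n_deg_in_pathway: int, n_genes_in_pathway: int) -> float:
    """Return foreground / background for one pathway.

    Assumption (not confirmed by the source): the foreground is the number
    of DEGs annotated to the pathway and the background is the total number
    of genes annotated to that same pathway.
    """
    if n_genes_in_pathway <= 0:
        raise ValueError("background gene count must be positive")
    if not 0 <= n_deg_in_pathway <= n_genes_in_pathway:
        raise ValueError("foreground count must lie between 0 and the background count")
    return n_deg_in_pathway / n_genes_in_pathway


# Hypothetical counts for illustration only: 25 DEGs among 200 pathway genes.
example_value = rich_factor(25, 200)
print(example_value)
```

Under this reading the value lies between 0 and 1, and a pathway in which a larger share of its genes are DEGs plots further along the Rich Factor axis.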

Supplementary Files

This is a list of supplementary files associated with this preprint.

Figure3.pdf