Prediction and Interpretation of Effective Material Basis and Mechanism of TCM prescription Lung-toxin Dispelling Formula No.1/Harmony Shot in COVID-19 control and prevention based on “Network-Molecular Docking-LC-MS” Analysis

Abstract

The Lung-toxin Dispelling Formula No.1 (referred to as Harmony Shot, HSh) was established based on the theory of TCM medicinal properties and classical TCM prescriptions. It has demonstrated therapeutic benefits in both the prevention and treatment of COVID-19. However, the material basis and mechanism of action of HSh remain unclear. This study integrates network pharmacology, molecular docking, and LC-MS techniques to analyze and clarify the material basis of HSh. First, the targets were obtained by network pharmacology for pathway analysis, protein-protein interaction (PPI) network construction, and prediction of the core active compounds. Then, the binding ability of those active compounds with SARS-CoV-2 3CLpro (6LU7) was evaluated by molecular docking via computational pattern recognition. Finally, the compounds with high 6LU7 inhibitory activity were identified and confirmed by LC-MS in both non-targeted and targeted manners using reference standards. A global analysis was performed and visualized to reveal the intricate network among the compounds, the ingredients from which they derive, the corresponding targets, and the TCM classification and meridian tropism theory. Our results illuminate the effective material basis and possible mechanism of action of the HSh prescription, which may be a promising TCM prescription for COVID-19 prevention and treatment.

Key words: Lung-toxin Dispelling Formula No.1/Harmony Shot; COVID-19; SARS-CoV-2; 3CLpro; network pharmacology; molecular docking; LC-MS; meridian tropism; Zang-fu viscera

Abbreviations

3CLpro, 3-chymotrypsin-like protease; WHO, World Health Organization; COVID-19, the new coronavirus disease; PHEIC, public health emergency of international concern; IHR, International Health Regulations; TCM, traditional Chinese medicine; HSh, Lung-toxin Dispelling Formula No.1 (referred to as Harmony Shot); PPI, protein-protein interaction; RSS, regulated strength score; FDR, false discovery rate.

1. Introduction

On January 30, 2020, the World Health Organization (WHO) announced that the outbreak of the new coronavirus disease (COVID-19) was a public health emergency of international concern (PHEIC) [1]. This is the sixth time WHO has declared a PHEIC since the International Health Regulations (IHR) came into effect in 2005 [2]. Owing to its high infectiousness, epidemic susceptibility, and fatality rate, COVID-19 has spread across China and to many countries and regions outside China, endangering the health of people all over the world as a global health crisis. In recent months, effective preparedness and response measures for disease prevention and control have been carried out in China, and the preliminary results seem promising and encouraging. However, no effective antiviral treatment for infected patients has been approved or confirmed thus far. From the perspective of traditional Chinese medicine (TCM), COVID-19 can be defined as a "plague", along with 321 other documented plagues in ancient Chinese history. TCM has played an essential role in plague prevention and control in China. Throughout the history of battling plagues, TCM has gradually formed a unique and complete system with invaluable experience in clinical treatment on both theoretical and practical levels. Today, TCM provides alternative treatments for the effective prevention and control of emerging epidemics of acute respiratory tract infection around the world. At the moment, the combination of TCM and allopathic medicine is the primary treatment option in all COVID-19 affected areas across China [3]. The Lung-toxin Dispelling Formula No.1 (referred to as Harmony Shot, HSh) was established based on the theory of TCM medicinal properties and classical TCM prescriptions, in combination with clinical practice.
There are nine TCM ingredients in HSh: Schizonepetae Herba, Lonicerae Japonicae Flos, Forsythiae Fructus, Scrophulariae Radix, Gleditsiae Spina, Armeniacae Semen Amarum, Vespae Nidus, Glycyrrhizae Radix et Rhizoma, and Ginseng Radix et Rhizoma. This prescription has been used in clinical practice for more than a decade and has proved effective for the prevention and treatment of acute respiratory tract infection, as well as the common cold and flu. In recent months, Harmony Shot has been tested in clinical trials for severe/critical patients with COVID-19 pneumonia, with definite efficacy and no side effects [4]. Additionally, eight of the nine ingredients in HSh are plant-based, with the remaining one (Vespae Nidus) from a non-plant source. All of the ingredients have been permitted as dietary supplements by the FDA in favor of their safety and suitability for long-term use. HSh has now demonstrated therapeutic benefits in both the prevention and treatment of COVID-19 globally. TCM has substantial advantages in treating difficult and severe diseases, with solid clinical evidence, but it is still considered a form of alternative medicine, mainly because the material basis and mechanism of action of TCM prescriptions are unclear. The critical situation of the COVID-19 global epidemic urges us to seek out novel strategies beyond traditional antiviral treatments. To better understand and promote the clinical application of the Harmony Shot prescription globally, this study intends to clarify the chemical material basis and mechanism of action underlying the therapeutic effects of Harmony Shot.

Although technological limitations in drug research have been reduced, challenges still exist in studying the chemical material basis and mechanism of TCM prescriptions. A TCM prescription is a complex system composed of multiple TCM herbs, and this complexity makes it challenging to unveil its holistic actions with the current reductionist research approaches, which separate the herbal ingredients and the targets during the study [5]. In the era of Omics technologies, an increasing amount of data has become available, shifting the focus of the scientific community from the classic "drug-reductionist" to the novel "drug-holistic" system-based approaches [6]. Network pharmacology is a branch of systems biology that explores the correlation between drugs and diseases from a comprehensive perspective, which is consistent with the holistic view, systematic approach and compatibility principle of TCM [7] [8].

As early as the emergence of SARS in 2003, the structure of 3-chymotrypsin-like protease (3CLpro) was determined [9]. It is the main protease that cleaves the viral polyproteins into replication-related proteins, and it is highly conserved across the coronavirus family, including SARS-CoV-2, SARS-CoV, and MERS-CoV. Therefore, it is considered a validated target in the design of potential anti-coronavirus inhibitors. Molecular docking on 3CLpro is an efficient approach to activity simulation screening [10] for obtaining active compounds, and it is widely used in modern drug design and drug discovery.

This study builds on the success of network analysis. A network of interactions among the prescription, each TCM ingredient and its meridian tropism, the chemical composition, and the targets was established to analyze the possible pathways of action and predict the chemical material basis of the TCM prescription, providing a new approach for the in-depth understanding of complex drug systems in the treatment of complex diseases. Understanding the possible intervention mechanism of HSh on COVID-19 from the perspective of biological and molecular networks is a crucial research strategy that can help to form early prevention and treatment schemes for the new coronavirus pneumonia. Then, the binding ability of the active compounds with SARS-CoV-2 3CLpro (6LU7) was evaluated by molecular docking via computational pattern recognition. Finally, the holistic profile of HSh and the screened active compounds were analyzed by ultra-high performance liquid chromatography coupled with tandem mass spectrometry using non-targeted and targeted metabolomics approaches. Through the above studies, the chemical material basis and mechanism of HSh were preliminarily elucidated.

2. Material and Methods

The schematic workflow of this study is demonstrated in the graphical abstract. All the version and accession information of the databases and software was listed in Table S1.

2.1 Network pharmacology

2.1.1 Construction of the “Ingredient-Property-Flavor-Meridian Tropism” network for Harmony Shot prescription

There are nine ingredients in the Harmony Shot (HSh) prescription. The corresponding information about their traditional Chinese medicine (TCM) properties (Hot, Cold, Warm, Cool, and Neutral), flavors (Sweet, Bitter, Pungent, and Salty), and meridian tropisms (Lung, Stomach, Heart, Spleen, Liver, Kidney, Large intestine, and Small intestine) was compiled from the 2015 Chinese Pharmacopoeia, ETCM, SymMap, NPASS and TCMID (Table 1) [ref]. A network of ingredient-property-flavor-meridian tropism was constructed using Cytoscape 3.7.2.

2.1.2 Compound database construction, compound target prediction, and target retrieval

Compounds of the nine ingredients in the HSh prescription were obtained from two TCM databases, TCMSP and TCMID. For compounds from the TCMSP database, the built-in screening criteria were set to OB (oral bioavailability) ≥ 30% and DL (drug-likeness) ≥ 0.18, and the predicted targets of the compounds that passed were obtained from TCMSP. Compounds compiled from the TCMID database were first submitted to the FAFDrugs4 platform, an external drug-likeness screening tool, and only compounds that passed the Drug-like soft[*] filters were kept for target prediction. These compounds were then processed with the SEA, TargetNet, and SwissTargetPrediction databases to obtain their predicted targets. The targets predicted by the two strategies were pooled, and detailed information on the predicted targets was queried using the UniProt, GeneCards, and STRING databases.
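The TCMSP screening step can be illustrated with a minimal sketch. It only applies the two thresholds stated above (OB ≥ 30% and DL ≥ 0.18); the record layout, the field names `ob` and `dl`, and the example compounds are hypothetical and are not taken from TCMSP or from this study.

```python
from typing import Dict, List, Union

# Thresholds stated in the text for the TCMSP built-in screening.
OB_MIN = 30.0   # oral bioavailability, in percent
DL_MIN = 0.18   # drug-likeness index

# Hypothetical record layout: {"id": ..., "ob": ..., "dl": ...}
Record = Dict[str, Union[str, float]]


def passes_screen(record: Record) -> bool:
    """Return True if a compound meets both the OB and DL thresholds."""
    return float(record["ob"]) >= OB_MIN and float(record["dl"]) >= DL_MIN


def screen_compounds(records: List[Record]) -> List[Record]:
    """Keep only compounds that pass the drug-likeness screening."""
    return [r for r in records if passes_screen(r)]


# Made-up example values, for illustration only.
example = [
    {"id": "cmpd_A", "ob": 46.4, "dl": 0.28},  # passes both criteria
    {"id": "cmpd_B", "ob": 12.0, "dl": 0.40},  # fails OB
    {"id": "cmpd_C", "ob": 35.0, "dl": 0.10},  # fails DL
]
kept_ids = [r["id"] for r in screen_compounds(example)]
print(kept_ids)
```

Both criteria are combined with a logical AND, so a compound with high bioavailability but low drug-likeness (or the reverse) is dropped before target prediction.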

2.1.3 Construction of the HSh network

Cytoscape 3.7.2 software was used for network construction and topological analysis. The binary high-quality interactome of H. sapiens (62435 queries, on 03/04/2020) was downloaded from the HINT (High-quality INTeractomes) database, which curates a compilation of high-quality protein-protein interactions from 8 interactome resources (BioGRID, MINT, iRefWeb, DIP, IntAct, HPRD, MIPS, and the PDB). All the downloaded queries were processed with Cytoscape, and the HINT network was constructed as the initial framework. The HSh compound targets were then mapped onto the HINT network, and the resulting network is referred to as the HSh network.
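The mapping step can be sketched as extracting the subnetwork that the predicted targets induce on the background interactome. This is only an illustration under stated assumptions: the study performed the step in Cytoscape, the function name `map_targets_to_interactome` is hypothetical, and dropping self-interactions and targets without any retained interaction is an assumed filtering rule, not one documented in the text.

```python
from typing import Iterable, Set, Tuple

Edge = Tuple[str, str]


def map_targets_to_interactome(
    edges: Iterable[Edge], targets: Iterable[str]
) -> Tuple[Set[str], Set[Edge]]:
    """Return (nodes, edges) of the subnetwork induced by the targets.

    Assumptions of this sketch: interactions are undirected, self-loops are
    discarded, and targets with no retained interaction are dropped.
    """
    target_set = set(targets)
    sub_edges = {
        tuple(sorted((a, b)))          # normalise so (a, b) == (b, a)
        for a, b in edges
        if a in target_set and b in target_set and a != b
    }
    nodes = {node for edge in sub_edges for node in edge}
    return nodes, sub_edges


# Toy interactome with made-up protein names.
background = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "A"), ("B", "A")]
predicted_targets = ["A", "B", "C", "E"]  # "E" has no interaction partner here

nodes, sub_edges = map_targets_to_interactome(background, predicted_targets)
print(sorted(nodes), sorted(sub_edges))
```

In this toy case the edge to "D" is removed because "D" is not a predicted target, and "E" disappears because it has no partner among the targets, which mirrors how a list of preliminary targets can shrink once it is mapped onto a background network.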

2.1.4 Submodule recognition and protein-protein interaction (PPI) analysis of the HSh network, and the KEGG biological pathway enrichment

To better understand the HSh network, its submodules were clustered using the MCODE plug-in of Cytoscape. Each submodule represents a group of genes that are tightly co-regulated to achieve their molecular functions. The default settings of the MCODE plug-in were adopted as follows: Network Scoring - Include Loops: false; Degree Cutoff: 2; Cluster Finding - Node Score Cutoff: 0.2; Haircut: true; Fluff: false; K-Core: 2; Max. Depth from Seed: 100. The biological pathways involved in the submodules were further analyzed.

2.1.5 Construction of the HSh hub gene network and evaluation of the HSh regulatory strength on the targets

To focus on the more important genes among the 148 targets in the HSh network, the HSh hub gene network was constructed by selecting the nodes and edges with the highest degree of connection. In the HSh network, nodes with the highest connection score were defined as "HSh hub genes". To evaluate the strength of the compounds in the prescription on the HSh hub gene network, we introduced a parameter called the target "regulated strength score (RSS)", defined as follows:

$$\mathrm{RSS} = \sum_{i=1}^{n} C_i$$

n: the total number of compounds that target each HSh hub gene

C1 (C2, C3…Cn): the number of ingredients that contain compound C1 (C2, C3…Cn)

The higher the regulated strength score, the stronger the effects of the compounds in the prescription on the target.
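As a worked illustration of the RSS definition, the following minimal sketch sums, for one hub gene, the number of ingredients containing each compound that targets the gene. The two dictionaries and all compound, gene and herb names are hypothetical placeholders, not data from this study.

```python
from typing import Dict, Set


def regulated_strength_score(
    gene: str,
    compound_targets: Dict[str, Set[str]],
    compound_ingredients: Dict[str, Set[str]],
) -> int:
    """RSS = sum over the n compounds targeting `gene` of C_i,
    where C_i is the number of ingredients containing compound i."""
    return sum(
        len(compound_ingredients[compound])
        for compound, targets in compound_targets.items()
        if gene in targets
    )


# Hypothetical compound -> predicted targets mapping.
compound_targets = {
    "c1": {"G1", "G2"},
    "c2": {"G1"},
    "c3": {"G2"},
}
# Hypothetical compound -> ingredients (herbs) containing it.
compound_ingredients = {
    "c1": {"herb_a", "herb_b", "herb_c"},
    "c2": {"herb_a"},
    "c3": {"herb_b", "herb_c"},
}

rss_g1 = regulated_strength_score("G1", compound_targets, compound_ingredients)
rss_g2 = regulated_strength_score("G2", compound_targets, compound_ingredients)
print(rss_g1, rss_g2)
```

Here G1 is targeted by c1 (found in three herbs) and c2 (one herb), giving RSS = 3 + 1 = 4, while G2 is targeted by c1 and c3, giving RSS = 3 + 2 = 5. A compound shared by several ingredients therefore contributes more to the score than one found in a single ingredient.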

2.1.6 Target-pathway and disease-target construction with DAVID and STRING databases

The HSh hub genes were uploaded to the DAVID database, and "functional annotation clustering" was chosen under disease analysis. "Classification stringency" was set to high, and disease clusters were obtained. Lung injury related diseases were picked from the top-ranked cluster. The STRING biological function enrichment analysis was restricted to human disease, and the FDR (false discovery rate) threshold was set below 1×10⁻⁶. Pathways involved in lung infection and viral infection were selected for the subsequent analysis. A disease-target network was constructed between lung-/virus-related diseases and the HSh hub genes, and the disease-target related core genes were identified.

2.2 Molecular docking of the HSh compounds with SARS-CoV-2 3CLpro (6LU7)

The 3D structure of each compound was first constructed with ChemOffice software, its energy was minimized under the MMFF94 force field, and it was saved in *.mol2 format. The 3D structure of 3CLpro was downloaded from the PDB database in *.pdb format. PyMOL software was used to remove water molecules from the protein, add hydrogens, and perform other preparation steps, and AutoDock software was then used to convert the compounds and the target protein to *.pdbqt format. Finally, AutoDock Vina was run for virtual docking. A binding energy below zero indicates that the ligand can bind to the receptor spontaneously. It is generally accepted that, when the conformation of the ligand-receptor complex is stable, the lower the energy, the more likely binding is to occur. A binding energy ≤ −5.0 kJ/mol was selected as the screening criterion.
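The screening criterion can be expressed as a simple filter over docking scores. The sketch below assumes the best-pose binding energy of each ligand has already been collected into a dictionary; it does not call AutoDock Vina, the ligand names and energies are made up, and the cutoff value and its unit are taken from the text as stated.

```python
from typing import Dict, List, Tuple

# Screening cutoff as stated in the text (binding energy <= -5.0).
ENERGY_CUTOFF = -5.0


def screen_by_binding_energy(
    scores: Dict[str, float], cutoff: float = ENERGY_CUTOFF
) -> List[Tuple[str, float]]:
    """Return ligands whose binding energy is at or below the cutoff,
    ordered from strongest (most negative) to weakest predicted binding."""
    hits = [(name, energy) for name, energy in scores.items() if energy <= cutoff]
    return sorted(hits, key=lambda item: item[1])


# Hypothetical best-pose energies for three ligands.
docking_scores = {"ligand_A": -7.2, "ligand_B": -4.9, "ligand_C": -5.0}

hits = screen_by_binding_energy(docking_scores)
print(hits)
```

The comparison is inclusive, so a ligand scoring exactly at the cutoff is retained, in line with the "≤" in the criterion.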

2.3 Analysis of chemical material basis of the Harmony Shot

2.3.1 Reagents and materials

HPLC grade methanol and gradient grade acetonitrile were obtained from Fisher Scientific (Ottawa, Canada) and Merck KGaA (Darmstadt, Germany), respectively. LC-MS grade ammonium formate obtained from Fluka (Buchs, Switzerland) and formic acid (98%-100%) obtained from Merck KGaA were used to prepare the mobile phase. Type-1 water (18.2 MΩ) was obtained from a Milli-Q Biocel system from Millipore (Billerica, MA, USA). All chemical standards were purchased from Shanghai Standard Technology Co., Ltd. and Chengdu DeSiTe Biological Technology Co., Ltd.

Harmony Shot is composed of nine different herbs: Schizonepetae Herba (Jing-Jie), Lonicerae Japonicae Flos (Jin-Yin-Hua), Forsythiae Fructus (Lian-Qiao), Scrophulariae Radix (Xuan-Shen), Gleditsiae Spina (Zao-Ci), Armeniacae Semen Amarum (Ku-Xin-Ren), Vespae Nidus (Feng-Fang), Glycyrrhizae Radix et Rhizoma (Gan-Cao), and Ginseng Radix et Rhizoma (Ren-Shen). All nine herbs and the HSh decoction were provided by Shanghai Cai Tong De Pharmacy.

2.3.2 Sample preparation and solutions

A 0.30 g portion of the powdered Harmony Shot decoction was placed in a 50 mL centrifuge tube, then soaked and ultrasonically extracted with 25 mL of 50% methanol (v/v) for 30 min. After centrifugation at 11000 rpm for 10 min, the supernatant was used as the test solution. Each single herb was also extracted with water and dried in vacuum, and 0.10 g of each powdered herbal extract was extracted by the same method and used as a test solution.

For identification, a stock solution of each chemical standard was prepared in methanol, and two mixed standard solutions were then diluted with 50% methanol (v/v) to about 20 μg/mL.

2.3.3 UPLC/QTOF MS Conditions.

The LC-MS analysis was performed on a Waters ACQUITY I-Class UPLC (Waters, Milford, MA, USA) equipped with a binary solvent manager, a sample manager, and a column manager. A Waters HSS T3 column (2.1 × 150 mm, 1.7 μm), together with a Waters on-line filter, was used at 35 ℃. The mobile phase consisted of water containing 0.1% formic acid (v/v) (A) and acetonitrile containing 0.1% formic acid (v/v) (B), following a gradient elution program: 0–2 min: 0%–2% (B); 2–22 min: 2%–60% (B); 22–24 min: 60%–90% (B); 24–29 min: 90% (B); 29–30 min: 90%–0% (B); 30–35 min: 0% (B); 35–36 min: 100%–0% (B); 36–41 min: 0% (B). Post-column infusion was performed with methanol containing 10 mM ammonium formate and 0.05% formic acid (0.2 mL/min). The flow rate was set at 0.4 mL/min, and 2 μL of the test solution was injected for UPLC analysis.

High-accuracy mass spectrometric data were recorded on a Waters Xevo G2-S QTOF mass spectrometer (Waters, Manchester, UK). Tune parameters were set for MSE experiments: capillary voltage, 2.5 kV (negative mode) and 3.0 kV (positive mode); sampling cone, 40 V; source offset voltage, 60 V; source temperature, 150 ℃; desolvation temperature, 500 ℃; cone gas flow, 20 L/h; desolvation gas, 800 L/h. The mass analyzer scanned over a mass range of 50–1200 Da within 0.1 s under a low collision energy of 6 V. A high collision energy ramp of 15–60 V was employed for both negative and positive modes. Data calibration was performed using an external reference (LockSpray™) with constant infusion of 1 ng/μL leucine enkephalin (LE; Sigma-Aldrich, St. Louis, MO, USA) at a flow rate of 5 μL/min, with reference to the ion at m/z 554.2615 for negative mode and 556.2771 for positive mode. Data acquisition was controlled by MassLynx V4.1 software (Waters Corporation, Milford, USA). Automatic metabolite characterization was performed using UNIFI 1.9.4 (Waters, Milford, USA) by searching the in-house library.

3. Results and Discussion

3.1.1 The “Ingredient-Property-Flavor-Meridian Tropism” network for the Harmony Shot prescription

Based on TCM theory, the nine ingredients in the Harmony Shot were classified by their property, flavor, and meridian tropism with representative colors (Table 1). A network was constructed among all items in the TCM classification system plus the nine ingredients in the HSh prescription using the Cytoscape software (Fig. 1). Regarding the color scheme in Fig. 1, the ingredient nodes are in blue, and the colors of the flavor and meridian tropism nodes correspond to the five elements and their matching colors in TCM theory: wood - green, fire - red, earth - yellow, metal - white and water - black. The three nodes with the highest degree of connection in the network are the Lung, Stomach and Heart, with connection values of 7, 5 and 4 respectively. Seven ingredients in HSh are attributed to the Lung meridian, the exceptions being Gleditsiae Spina and Vespae Nidus. Among the property and flavor nodes, the three with the largest connection degree are Sweet, Bitter and Warm, with connection degrees of 5, 4 and 4 respectively.

3.1.2 Compound database construction, compound target prediction, and target gene retrieval

Among the nine herbs of the HSh prescription, eight were archived in the TCMSP database: Schizonepetae Herba, Lonicerae Japonicae Flos, Forsythiae Fructus, Scrophulariae Radix, Gleditsiae Spina, Armeniacae Semen Amarum, Glycyrrhizae Radix et Rhizoma, and Ginseng Radix et Rhizoma. The remaining one, Vespae Nidus, was retrieved from the TCMID database. The compounds included in the HSh prescription were searched in these two databases, and a total of 1071 compounds were retrieved for all 9 herbs. Since not all compounds have drug-like ADME (absorption, distribution, metabolism and excretion) properties, a drug-likeness screening was performed on the compounds prior to target prediction. The TCMSP database has built-in oral bioavailability and drug-likeness screening filters, as well as target prediction functions. The drug-likeness screening for compounds in Vespae Nidus from the TCMID database was performed using the Drug-like soft[*] filters. Compounds passing the filter were uploaded to the SEA, TargetNet and SwissTargetPrediction databases for target prediction. A total of 157 compounds passed the drug-likeness screening, and 339 corresponding preliminary targets were obtained. The compounds, the targets and their distribution in the ingredients are shown in Fig. S1, and the compounds common to several ingredients are listed in Table S2.

3.1.3 Construction of the HSh ingredient-compound-target and protein-protein interaction (PPI) networks

To build the HSh target network, we downloaded the H. sapiens binary dataset from the HINT (High-quality INTeractomes) database and processed it with Cytoscape software to construct a background HINT network as the initial framework. Then, the 339 preliminary targets of HSh were mapped onto the HINT network, and the results were further filtered, yielding a smaller network with 346 edges and 148 nodes, i.e. 148 targets, referred to as the HSh network. These 148 targets corresponded to 155 compounds, and their connection with the ingredients in the HSh prescription is shown in Fig. 2. The PPI was further analyzed to reveal the interactions among the 148 target genes in the HSh network (Fig. 3).

3.1.4 Submodule analysis and protein-complex recognition of the HSh target network with KEGG biological pathway enrichment

To find the clusters (highly interconnected regions) in the HSh network, submodule recognition analysis was performed on the 148 targets using MCODE, a Cytoscape plug-in. Six submodules were obtained, covering 23 of the 148 targets (Fig. 4). GO analysis with STRING revealed that 19 of the 23 genes were related to the nucleoplasm, indicating that the six submodules are highly interconnected in their biological functions. To further reveal the structural and functional connections of the 23 genes in the submodules, enrichment analysis of "protein complex-based gene sets" was performed using the CPDB database. The results showed that the gene products in Submodule 4 can form the "integrin alpha4beta1:VCAM1" protein complex (complex source: Reactome, P-value = 1.15e-08) (Fig. 4d), and those in Submodule 3 can form the IkappaB kinase complex (IKBKB, CHUK, IKBKAP, NFKBIA, RELA, MAP3K14) (complex source: CORUM, P-value = 2.28e-07) (Fig. 4c). In addition, CDK4, CDK6, and CCND2 in Submodule 5 can also form a protein complex (Fig. 4e). These results demonstrate that the 23 genes in the submodules may be functionally connected and synergistically regulated. It is known that the VCAM-1 receptor is the key mediator in the viral attachment and entry processes of human rhinoviruses (HRVs). The IkappaB kinase complex is the master regulator of the classic NF-κB inflammation pathway, which is closely related to the body's inflammatory response. HSh may therefore regulate the formation of the inflammatory storm and reduce excessive inflammation, which would ameliorate the severe systemic damage in COVID-19 patients.

3.1.5 Construction of the HSh hub gene network and evaluation of the HSh regulatory strength on the network

To focus on the more important genes among the 148 targets in the HSh network, topological analysis was performed to calculate the average connection of the network. The mean node degree is 4.7, and the median is 3. Therefore, nodes with a degree greater than or equal to 6 were defined as the hub nodes, i.e. the hub genes, of the network, which usually have greater influence and play a more important role in the network. In total, 42 hub genes were identified in the HSh target network (Fig. 3). To measure the strength of the regulatory effects on the hub genes, we introduced a parameter called the "regulated strength score". The higher the regulated strength score, the stronger the effects of the HSh compounds on the hub gene. Among the 42 hub genes of the target network, 11 had a regulated strength score greater than 30, indicating that these targets are more strongly affected by the compounds in HSh and play a greater role in the therapeutic effects of the HSh prescription.
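The hub selection can be sketched as a degree count followed by a threshold. As a consistency check on the reported values, an undirected network with 346 edges and 148 nodes has a mean degree of 2 × 346 / 148 ≈ 4.7. The code below is a minimal illustration on a toy edge list with made-up node names; it assumes the edge list is undirected and already deduplicated, and it is not the Cytoscape procedure used in the study.

```python
from collections import Counter
from statistics import mean, median
from typing import Dict, Iterable, List, Tuple

HUB_MIN_DEGREE = 6  # threshold stated in the text


def node_degrees(edges: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count the degree of each node in an undirected edge list,
    ignoring self-loops."""
    degree: Counter = Counter()
    for a, b in edges:
        if a == b:
            continue
        degree[a] += 1
        degree[b] += 1
    return dict(degree)


def hub_nodes(
    edges: Iterable[Tuple[str, str]], min_degree: int = HUB_MIN_DEGREE
) -> List[str]:
    """Return the nodes whose degree is at least `min_degree`."""
    degree = node_degrees(edges)
    return sorted(node for node, d in degree.items() if d >= min_degree)


# Toy network: "H" touches six neighbours, and n1-n2 are also linked.
toy_edges = [("H", f"n{i}") for i in range(1, 7)] + [("n1", "n2")]

degrees = node_degrees(toy_edges)
hubs = hub_nodes(toy_edges)
print(hubs, mean(degrees.values()), median(degrees.values()))

# Mean degree implied by the reported network size (346 edges, 148 nodes).
reported_mean_degree = round(2 * 346 / 148, 1)
print(reported_mean_degree)
```

Only "H" reaches the threshold in the toy network. Setting the cutoff above both the mean and the median, as done in the text, selects the small set of most highly connected nodes.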

3.1.6 Target-pathway and disease-target network construction with DAVID and STRING databases

The 42 hub gene proteins were uploaded to the STRING database for functional enrichment analysis. The statistical threshold was set to FDR < 1×10⁻⁶, and the KEGG enrichment was restricted to "the basic biological processes", which include Metabolism, Genetic Information Processing, Environmental Information Processing, Cellular Processes, and Organismal Systems. In total, 39 pathways were obtained from the pathway enrichment analysis, and the top entries were Signal transduction, Transport and catabolism, Cell growth and death, Cellular community - eukaryotes, Immune system, neural system, Nervous system, Development and regeneration (Fig. 5). This indicates that these pathways are important biological processes for HSh intervention, and that HSh may act on them to achieve its therapeutic effects. From the enrichment analysis of the 42 hub genes with the STRING and DAVID databases, 29 related diseases were identified, involving 38 targets that correspond to 134 compounds from the 9 medicinal materials (Fig. 6 and Table S3). Among them, 12 diseases were attributed to the lung and large intestine, involving 32 targets and corresponding to 118 compounds. There were 4 disease pathways attributed to the liver and gallbladder, involving 18 targets and corresponding to 92 compounds; 6 disease pathways attributed to the kidney and bladder, involving 29 targets and corresponding to 125 compounds; 5 disease pathways attributed to the stomach and spleen, involving 14 targets and corresponding to 115 compounds; and 2 diseases attributed to the heart and small intestine, involving 5 targets and corresponding to 121 compounds. These results demonstrate the complex regulatory mechanism of TCM, in which multiple compounds work on multiple targets and act on multiple diseases.

Based on the "disease - targets - compounds" network, the five compounds with the highest connection degree were MOL000098, MOL000006, MOL000173, MOL000497, and MOL000422 (connection degree: 26, 19, 17, 16, and 16, respectively). The five genes with the highest connection degree were ESR1, HSP90AA1, AR, PPARG, and GSK3B (connection degree: 106, 90, 88, 85, and 75, respectively), in accordance with the PPI analysis shown in Fig. 3, where the size of each node is proportional to the node degree and the color depth of the node is proportional to the regulated strength score of the target. All five genes were key nodes in the HSh network, indicating that these compounds and targets may play greater regulatory roles in the HSh network. Looking further into the biological functions of the five genes and their proteins, ESR1, HSP90AA1, and AR all have significant roles in cellular proliferation and differentiation, while PPARG and GSK3B are involved in lipid and glucose metabolism, respectively. HSh may affect these crucial host cellular pathways to enhance the overall health of the body and improve its ability to fight viruses.

Unlike SARS, where the most affected organ is the lungs, the new coronavirus attacks not only the lungs but also the heart, kidneys, intestines and other organs, causing multiple organ failure within a short window. Although the disease is located in the lung and requires immediate medical care, it is necessary to consider that a deficiency of the lung will affect the spleen and stomach and harm the heart and kidney as well. Based on the topological analysis of the ingredient and meridian tropism network, seven of the nine ingredients in the prescription enter the lung meridian, because the disease is located in the lung and the lung should be treated with priority. Five ingredients enter the stomach meridian, two the spleen meridian, and two the kidney meridian. These Zang-fu viscera are known as the nature-nurture essences that support the body's wellness. The four ingredients entering the heart meridian suppress the overreaction to the disease in the lung, and the two ingredients entering the liver meridian remove any inhibitory effects of the liver on the lung, maximizing the lung's ability to fight the disease. From the TCM property and flavor perspective, there are 4 warm, 3 cool and 2 neutral ingredients, and 5 sweet, 4 bitter, 2 pungent and 1 salty ingredient. All ingredients work synergistically in harmony, coordinating all the Zang-fu viscera globally to fight the disease. Integrating medical theories from both allopathic medicine and TCM meridian tropism makes it evident that TCM encompasses ample therapeutic information that is beyond our current comprehension. Fortunately, our lack of understanding of TCM has not hindered its successful application and profound impact in fighting critical diseases like COVID-19. The Harmony Shot prescription focuses on and treats the body at the holistic level rather than through a specific and limited treatment method.

3.2 Molecular docking with SARS-CoV-2 3CLpro (6LU7) and the antiviral potential of HSh

The network analysis gives us an integrative perspective on the underlying connections and interactions among the prescription, the compounds, their targets and pathways, and on how they work together on the human body to fight COVID-19. However, it is unknown whether the candidate compounds in HSh have any anti-coronavirus activity, such as blocking the penetration, uncoating, or synthesis of viral protein or nucleic acid. To find out whether the HSh compounds act directly on SARS-CoV-2, molecular docking was performed to evaluate the affinity between the candidate compounds in HSh and 3CLpro (3C-like protease) (Fig. 7). The SARS-CoV-2 3CLpro (6LU7) is required for the replication of coronaviruses and is considered a validated target in the design of potential anti-coronavirus inhibitors. The docking results showed that the binding energy between the core bioactive compounds in HSh and 3CLpro is far below −5.0 kJ/mol, indicating that the core bioactive compounds in HSh may have direct anti-SARS-CoV-2 activity in addition to their effects on the body (Fig. 7 and Table 2). To validate the accuracy and sensitivity of the computational parameters used in the molecular docking, we also tested 14 approved antiviral drugs with reported clinical effectiveness on SARS-CoV-2, and the results were consistent with previous studies (Fig. 7 and Table 2). The disease-target network analysis revealed seven pathways related to viral infections in the enrichment analysis (Fig. S2), indicating the potential therapeutic effects of HSh on viral infections.

3.3 Multicomponent analysis of Harmony Shot

By optimizing the gradient elution program, satisfactory separation of the major peaks was achieved in both negative and positive ion modes, as shown in Fig. 8 and Fig. S3. The obtained MSE data were imported into UNIFI software for automatic component characterization. By comparison with the TCM library, 150 peaks were identified or tentatively characterized by elemental composition and fragment matching analyses; all of them are listed in Table S4.

Among them, 61 compounds might be from Schizonepetae Herba, and 40 compounds might be from Lonicerae Japonicae Flos. Together, these two herbs provide almost half of the identified compounds. Glycyrrhizae Radix et Rhizoma provided 34 compounds, and 19 compounds might be from Forsythiae Fructus. About 80% of the compounds were provided by these four herbs. Each of the other five herbs contributed fewer than ten compounds. Through the above analysis, we obtained the overall chemical profile of Harmony Shot and assigned the identified compounds to their corresponding TCM ingredients, which provides chemical network support for clarifying the material basis and for further in-depth research. Based on the network pharmacology and molecular docking analysis, 118 compounds were obtained, and 48 representative compounds were selected for the inhibition activity test against the new coronavirus 3CL hydrolase (3CLpro). At a concentration of 100 µM, the inhibition rate of 22 compounds was more than 50% (unpublished data). We performed LC-MS targeted extraction and analysis of the 22 bioactive compounds, in combination with their chemical standards, in the sample solution of the prescription water decoction. Seven bioactive compounds could be identified and confirmed (Table 3). The results suggest that these seven bioactive compounds could be used in subsequent drug discovery studies and in the quality control evaluation of the prescription.

4. Conclusions

In the current study, the prevention and treatment of COVID-19 by HSh was analyzed systematically and thoroughly. The prediction and verification of the chemical material basis and mechanism of action in the complex system of a TCM compound prescription was successfully realized. Intricate networks were constructed for HSh to unveil the complex interactions among its nine ingredients and their properties, flavors, and meridian tropisms. The treatment principle and method of the whole system, with the lung as the main site and the other Zang-fu viscera addressed at the same time, were also analyzed.

Inspired by the fact that the TCM prescription Lung-toxin Dispelling Formula No.1/Harmony Shot has been used successfully to treat severe and critical COVID-19 clinically in China, and as a preventive dietary supplement for COVID-19 in the US, our study applied network pharmacology, molecular docking, and UPLC/QTOF-MS approaches to unveil the complex interaction between the TCM prescription and the disease. The complex system of the TCM prescription was integrated with its targets, the disease pathways, and the corresponding Zang-fu viscera to reveal the correlations within their intricate network. It is therefore crucial to integrate modern biological and pharmacological technologies into TCM research in order to gain a deeper understanding of TCM compounds and to better promote their clinical application. The establishment of novel prescriptions based on TCM theory requires more effective integration of sub-disciplines and technologies.

To further elucidate the underlying mechanism of TCM, future studies will analyze the chemical material basis and mechanism of action of Harmony Shot using metabolomics, proteomics, and other advanced technologies.

Graphical Abstract

Figure 1 The “Ingredient-Property-Flavor-Meridian Tropism” network of Harmony Shot (HSh) prescription.

The network reveals the interconnection of the nine ingredients in HSh and the corresponding information about their traditional Chinese medicine (TCM) properties (Hot, Cold, Warm, Cool, and Neutral), flavors (Sweet, Bitter, Pungent, and Salty), and meridian tropisms (Lung, Stomach, Heart, Spleen, Liver, Kidney, Large intestine, and Small intestine). The color scheme of the nodes and edges corresponds to the five elements and their colors in TCM theory: Green, Wood; Red, Fire; Yellow, Earth; White, Metal; Black, Water. Blue: the nine ingredients in HSh.

Figure 2 HSh ingredient-compound-target network.

The network demonstrated the interconnections among the 148 predicted targets of HSh, the corresponding 155 compounds, and the nine ingredients. Diamond, HSh prescription; triangles, TCM ingredients in HSh; hexagons, compounds in TCM ingredients; arrowhead, predicted targets of HSh.

Figure 3 The protein-protein interaction (PPI) network of the predicted HSh targets.

The PPI network of the 148 predicted targets in HSh was visualized. The size of the node is proportional to the node degree. The color depth of the node is proportional to the regulated score of the target. The targets with the highest regulated scores were HSP90AA1 (heat shock protein 90kDa alpha (cytosolic), class A member 1), ESR1 (Estrogen Receptor 1), and AR (Androgen Receptor).
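As a minimal sketch of how node size can be tied to node degree, the following Python snippet counts the edges touching each node of an undirected network. The edge list is hypothetical (including the placeholder node `EXAMPLE_TARGET`), and the linear scale factor is an arbitrary assumption; the actual network was built from the databases and software listed in Table S1.

```python
from collections import Counter

# Hypothetical undirected edges for illustration only; they are not the
# interactions reported for the 148 predicted targets.
edges = [
    ("HSP90AA1", "ESR1"),
    ("HSP90AA1", "AR"),
    ("ESR1", "AR"),
    ("HSP90AA1", "EXAMPLE_TARGET"),
]

def node_degrees(edge_list):
    """Count how many edges touch each node of an undirected network."""
    degrees = Counter()
    for a, b in edge_list:
        degrees[a] += 1
        degrees[b] += 1
    return degrees

degrees = node_degrees(edges)

# Node size scaled linearly with degree (the factor 10 is arbitrary).
node_sizes = {node: 10 * d for node, d in degrees.items()}
```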

Figure 4 Cluster analysis and protein-complex recognition of the HSh target network.

Cluster analysis was performed on the HSh target network to reveal the highly interconnected regions within the HSh target network. Six clusters were identified, indicating their closely related biological functions.

Figure 5 KEGG pathway enrichment of the 42 HSh hub genes

The proteins encoded by the 42 hub genes were subjected to functional enrichment analysis using the STRING database, with the false discovery rate (FDR) threshold set below 1×10−6. A total of 39 pathways were obtained, mainly related to biological functions such as signal transduction, transport and catabolism, and cell growth and death. The size of each symbol is proportional to the number of genes included.
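To make the FDR filter concrete, the sketch below applies a Benjamini-Hochberg-type adjustment to raw enrichment p-values and keeps pathways below the 1×10−6 threshold. It assumes that the FDR is a Benjamini-Hochberg adjusted p-value; the pathway names and p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw p-values for four pathways.
raw = {"pathway_A": 1e-9, "pathway_B": 3e-8, "pathway_C": 2e-4, "pathway_D": 0.03}

fdr = dict(zip(raw, benjamini_hochberg(list(raw.values()))))
FDR_THRESHOLD = 1e-6  # threshold stated in the legend
significant = [name for name, q in fdr.items() if q < FDR_THRESHOLD]
```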

Figure 6 The intricate network includes the disease pathways, their corresponding Meridian Tropism, HSh hub genes, the corresponding compounds, and TCM ingredients.

The circle in the center described the disease pathways and their corresponding Meridian Tropism. Nearly half of the disease pathways were related to the Lung Meridian (white quadrangles), followed by the Stomach (yellow quadrangles) and Kidney Meridians (black quadrangles). They all matched the primary therapeutic effects and the underlying Zang-fu viscera theory of HSh prescription.

Figure 7 HSh candidate compounds and representative results of molecular docking with SARS-CoV-2 3CLpro (6LU7)

Schematic diagrams demonstrated the 3CLpro binding sites and proximate affinity of candidate compounds in HSh and valganciclovir (an FDA approved antiviral drug). Black dots: carbon atoms; blue dots: nitrogen atoms; red dots: oxygen atoms; green dotted lines: hydrogen bonds; red combs: residues.

Figure 8 Representative base ion chromatograms from the UPLC/QTOF-MS analysis

The HSh decoction was analyzed using UPLC/QTOF-MS, and the base ion chromatograms of the positive ions (upper) and the negative ions (lower) were obtained for the candidate compounds.

Figure S1 Upset plot showing the number of screened compound distribution in the nine ingredients of HSh.

GLRR (Glycyrrhizae Radix et Rhizoma) had the highest contribution to the screened compounds in HSh, followed by VN (Vespae Nidus) and FSF (Forsythiae Fructus). There were a few compounds found in more than one ingredient in the HSh prescription.

Figure S2 The network of viral infection-related pathways in the enrichment analysis and the corresponding HSh hub genes.

There were seven viral infection-related pathways enriched from the HSh hub genes, indicating the potential therapeutic effects of HSh on viral infections.

Figure S3 Typical extracted ion chromatograms of the sample solution and mixed standard solution

Bioactive compounds were identified and confirmed against their chemical standards by targeted LC-MS extraction and analysis.

Table 1 The Property, Flavor and Meridian Tropism of Ingredients in Harmony Shot

Ingredients; Abbreviation; Chinese name; Property (Cold, Warm, Neutral); Flavor (Salty, Bitter, Sweet, Pungent); Meridian tropism (Lung, Liver, Heart, Large Intestine, Small Intestine, Spleen, Kidney, Stomach)

Ginseng Radix et Rhizoma GSRR Renshen √ ◎ ◎ ◎ ◎
Glycyrrhizae Radix et Rhizoma GLRR Gancao √ ◎ ◎ ◎ ◎
Schizonepetae Herba SPH Jingjie √ ◎ ◎
Lonicerae Japonicae Flos LJF Jinyinhua √ ◎ ◎ ◎
Armeniacae Semen Amarum ASA Kuxingren √ ◎ ◎
Forsythiae Fructus FSF Lianqiao √ ◎ ◎ ◎
Scrophulariae Radix SR Xuanshen √ ◎ ◎ ◎
Gleditsiae Spina GS Zaojiaoci √ ◎ ◎
Vespae Nidus VN Fengfang √ ◎
Total 4 3 2 5 4 2 1 7 5 4 2 2 2 1 1

Table 2 The binding energy values between SARS-CoV-2 3CLpro (PDB: 6LU7) and the core compounds in HSh, and between 3CLpro and drugs with reported clinical therapeutic effects against COVID-19

No. Source ID English name OB (%) DL Molecular formula Molecular weight 3CL binding affinity (kJ/mol)

0 PDB -like / Inhibitor 6lu7_ligand C35H48N6O8 680.79 -24.70

1 ASA MOL010921 estrone 53.56 0.32 C18H22O2 270.16 -23.44

2 ASA MOL002311 Glycyrol 90.78 0.67 C21H18O6 366.11 -24.28

3 ASA MOL004355 Spinasterol 42.98 0.76 C29H48O 412.37 -22.60

4 ASA MOL004841 Licochalcone B 76.76 0.19 C16H14O5 286.08 -22.60

5 ASA MOL004903 liquiritin 65.69 0.74 C21H22O9 418.13 -28.88

6 ASA MOL004908 Glabridin 53.25 0.47 C20H20O4 324.14 -24.70

7 ASA MOL012922 l-SPD 87.35 0.54 C19H21NO4 327.15 -25.53

8 GSRR MOL000358 beta-sitosterol 36.91 0.75 C29H50O 414.72 -20.93

9 GSRR MOL000422 kaempferol 41.88 0.24 C15H10O6 286.24 -24.70

10 FSF MOL003283 (2R,3R,4S)-4-(4-hydroxy-3-methoxy-phenyl)-7-methoxy-2,3-dimethylol-tetralin-6-ol 66.51 0.39 C20H24O6 360.41 -24.70

11 FSF MOL003305 PHILLYRIN 36.40 0.86 C27H34O11 534.56 -23.44

12 FSF MOL000006 luteolin 36.16 0.25 C15H10O6 286.24 -28.88

13 FSF MOL000098 quercetin 46.43 0.28 C15H10O7 302.24 -27.21

14 GS MOL013179 fisetin 52.60 0.24 C15H10O6 286.24 -27.21

15 GS MOL013296 Fustin 50.91 0.24 C15H12O6 288.26 -25.95

16 GS MOL000449 Stigmasterol 43.83 0.76 C29H48O 412.70 -23.44

17 SR MOL002222 sugiol 36.11 0.28 C20H28O2 300.44 -24.28

18 SR MOL007662 harpagoside_qt 122.87 0.32 C18H20O6 332.35 -23.86

19 SPH MOL011856 Schkuhrin I 54.45 0.52 C22H28O8 420.18 -27.21

20 SPH MOL002881 Diosmetin 31.14 0.27 C16H12O6 300.06 -23.86

21 VN Coriatin Good Accepted C15H18O5 278.12 -24.28

22 GLRR MOL004898 (E)-3-[3,4-dihydroxy-5-(3-methylbut-2-enyl)phenyl]-1-(2,4-dihydroxyphenyl)prop-2-en-1-one 46.27 0.31 C20H20O5 340.38 -23.44

23 GLRR MOL004904 licopyranocoumarin 80.36 0.65 C21H20O7 384.38 -25.12

24 GLRR MOL004910 Glabranin 52.90 0.31 C20H20O4 324.38 -24.70

25 GLRR MOL004911 Glabrene 46.27 0.44 C20H18O4 322.36 -28.46

26 GLRR MOL004912 Glabrone 52.51 0.50 C20H16O5 336.34 -28.05

27 GLRR MOL004915 Eurycarpin A 43.28 0.37 C20H18O5 338.36 -24.28

28 GLRR MOL004949 Isolicoflavonol 45.17 0.42 C20H18O6 354.36 -26.37

29 GLRR MOL004959 1-Methoxyphaseollidin 69.98 0.64 C21H22O5 354.40 -23.86

30 GLRR MOL004961 Quercetin der. 46.45 0.33 C17H14O7 330.29 -26.79

31 GLRR MOL000497 licochalcone a 40.79 0.29 C21H22O4 338.40 -22.60

32 GLRR MOL000500 Vestitol 74.66 0.21 C16H16O4 272.30 -24.28

33 GLRR MOL005008 Glycyrrhiza flavonol A 41.28 0.60 C20H18O7 370.36 -25.53

34 GLRR MOL001484 Inermine 75.18 0.54 C16H12O5 284.27 -23.86

35 GLRR MOL000239 Jaranol 50.83 0.29 C17H14O6 314.29 -24.28

36 GLRR MOL000354 isorhamnetin 49.60 0.31 C16H12O7 316.27 -25.12

37 GLRR MOL003656 Lupiwighteone 51.64 0.37 C20H18O5 338.36 -23.86

38 GLRR MOL000392 formononetin 69.67 0.21 C16H12O4 268.27 -25.12

39 GLRR MOL000417 Calycosin 47.75 0.24 C16H12O5 284.27 -25.12

40 GLRR MOL004328 naringenin 59.29 0.21 C15H12O5 272.26 -24.28

41 GLRR MOL004810 glyasperin F 75.84 0.54 C20H18O6 354.36 -27.21

42 GLRR MOL004811 Glyasperin C 45.56 0.40 C21H24O5 356.42 -24.28

43 GLRR MOL004827 Semilicoisoflavone B 48.78 0.55 C20H16O6 352.34 -25.53

44 GLRR MOL004829 Glepidotin B 64.46 0.34 C20H20O5 340.38 -26.37

45 GLRR MOL004856 Gancaonin A 51.08 0.40 C21H20O5 352.39 -24.28

46 GLRR MOL004879 Glycyrin 52.61 0.47 C22H22O6 382.41 -24.28

47 GLRR MOL004884 Licoisoflavone B 38.93 0.55 C20H16O6 352.34 -26.79

48 GLRR MOL004885 licoisoflavanone 52.47 0.54 C20H18O6 354.36 -24.28

s1 Lopinavir C37H48N4O5 628.80 -22.19

s2 Ritonavir C37H48N6O5S2 720.96 -24.28

s3 Remdesivir C27H35N6O8P 602.58 -28.05

s4 Darunavir C27H37N3O7S 547.66 -24.28

s5 Arbidol C22H25BrN2O3S 531.89 -21.35

s6 CHLOROQUINE C18H26ClN3 319.87 -18.84

s7 Indinavir C36H47N5O4 613.79 -26.79

s8 Saquinavir C38H50N6O5 670.84 -30.56

s9 NELFINAVIR C32H45N3O4S 567.78 -25.53

s10 Tipranavir C31H33F3N2O5S 602.66 -27.63

s11 Cyclosporin A C62H111N11O12 1202.64 171.62

s12 Vancomycin C66H75Cl2N9O24 1449.25 114.69

s13 Ribavirin C8H12N4O5 244.20 -25.95

s14 Valganciclovir C14H22N6O5 354.36 -26.79

The core compounds and their binding affinities with 3CLpro are presented. Compounds in bold are those identified and confirmed in the subsequent LC-MS analysis (Section 3 in Results and Discussion).

Table 3. Targeted compounds screened in the total ion chromatogram of LC-MS

No. Compound Origin Rt (min) Adduct m/z Mass error (ppm) Formula Name
1 C020* LQ 13.49 +H/-H 287.0541/285.0401 -3.1/-1.3 C15H10O6 Luteolin
2 C116* GC 22.66 -H 351.0868 -1.6 C20H16O6 Licoisoflavone B
3 C023* ZC 13.49 +H/-H 287.0541/285.0401 -3.1/-1.3 C15H10O6 Fisetin
4 C022* LQ 9.84 +H 303.0493 -2.2 C15H10O7 Quercetin
5 C093* GC 20.69 -H 353.1021 -2.9 C20H18O6 Glyasperin F
6 C057* GC 20.09 +H 355.1175 -0.3 C20H18O6 Isolicoflavonol
7 C099* GC 22.66 -H 351.0868 -1.6 C20H16O6 Semilicoisoflavone-B

*: The peak in the chromatogram of the sample solution had the same retention time as the peak in the chromatogram of the standard solution.
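The mass errors in Table 3 can be checked with a short calculation. The sketch below computes the theoretical m/z of a singly charged [M+H]+ or [M-H]- ion from the molecular formula and expresses the deviation of the observed m/z in ppm. The function names and the small isotope-mass table are assumptions made for illustration; small differences from the tabulated values can arise from rounding or from the constants used by the instrument software.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotopes, and the proton mass.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
PROTON = 1.00727646

def monoisotopic_mass(formula):
    """Sum monoisotopic masses for a simple formula such as 'C15H10O6'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total

def ppm_error(observed_mz, formula, adduct):
    """Mass error in ppm for a singly charged '+H' or '-H' ion."""
    shift = PROTON if adduct == "+H" else -PROTON
    theoretical = monoisotopic_mass(formula) + shift
    return (observed_mz - theoretical) / theoretical * 1e6

# Observed m/z values for luteolin and quercetin copied from Table 3.
luteolin_pos = ppm_error(287.0541, "C15H10O6", "+H")
luteolin_neg = ppm_error(285.0401, "C15H10O6", "-H")
quercetin_pos = ppm_error(303.0493, "C15H10O7", "+H")
```

Under these assumptions, the computed errors agree with the tabulated values to within a few tenths of a ppm.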

Table S1. The version and accession information of all the databases and software

Database Name Version Access date Address Ref

ETCM 2018 http://www.nrc.ac.cn:9090/ETCM/index.php/Home/Index/index.html [1]

Symmap 2018 https://www.symmap.org/ [2]

NPASS 1.0 2017-10-13 http://bidd2.nus.edu.sg/NPASS/ [3]

TCMID 2.0 2017 http://119.3.41.228:8000/tcmid/ [4]

TCMSP 2.3 2014-05-31 http://www.tcmspw.com/tcmsp.php [5]

FAF-Drugs4 Version 4 2017-04-26 http://fafdrugs4.mti.univ-paris-diderot.fr/index.html [6]

Similarity ensemble approach (SEA) Latest Version 2019-03-26 http://sea.bkslab.org/ [7]

TargetNet 1.0 2014-02-25 http://targetnet.scbdd.com/ [8]

SwissTargetPrediction 2019 http://www.swisstargetprediction.ch/index.php [9]

UniProt 2019-10-15 https://www.uniprot.org/ [10]

GeneCards 4.13 2020-02-03 https://www.genecards.org/ [11]

String 11.0 2019-01-19 https://string-db.org/ [12]

HINT 2019-04 http://hint.yulab.org/ [13]

Cytoscape 3.7.2 2019-05-13 https://cytoscape.org/ [14]

MCODE 1.6 2020-01-15 http://apps.cytoscape.org/apps/mcode [15]

DAVID 6.8 Oct-2016 https://david.ncifcrf.gov/ [16]

PDB https://www.rcsb.org/structure/6LU7 [17]

R 3.6.1 2019-07-05 https://www.r-project.org/ [18]

Rstudio 1.2.1335 2019-04-08 https://rstudio.com/ [19]

ggplot2 3.2.1 https://ggplot2.tidyverse.org/index.html [20]

AutoDock 4.2.6 2014-08-04 http://autodock.scripps.edu/ [21]

AutoDockVina 1.1.2 2011-05-11 http://vina.scripps.edu/ [22]

ligplot+ 2.2 https://www.ebi.ac.uk/thornton-srv/software/LigPlus/ [23]

Pymol 0.99 https://sourceforge.net/projects/pymol/files/Legacy/ [24]

ConsensusPathDB (CPDB) Release 34 2019-01-15 http://consensuspathdb.org/ [25]

OmicSolution 26 2020-02-15 https://www.omicsolution.org/wkomics/main/ [26]

References

1 Xu HY, Zhang YQ, Liu ZM, Chen T, Lv CY, Tang SH, et al. ETCM: an encyclopaedia of traditional Chinese medicine. Nucleic Acids Res 2019; 47: D976-D82.
2 Wu Y, Zhang F, Yang K, Fang S, Bu D, Li H, et al. SymMap: an integrative database of traditional Chinese medicine enhanced by symptom mapping. Nucleic Acids Res 2019; 47: D1110-D7.
3 Zeng X, Zhang P, He W, Qin C, Chen S, Tao L, et al. NPASS: natural product activity and species source database for natural product research, discovery and tool development. Nucleic Acids Res 2018; 46: D1217-D22.
4 Xue R, Fang Z, Zhang M, Yi Z, Wen C, Shi T. TCMID: Traditional Chinese Medicine integrative database for herb molecular mechanism analysis. Nucleic Acids Res 2013; 41: D1089-95.
5 Ru J, Li P, Wang J, Zhou W, Li B, Huang C, et al. TCMSP: a database of systems pharmacology for drug discovery from herbal medicines. J Cheminform 2014; 6: 13.
6 Lagorce D, Bouslama L, Becot J, Miteva MA, Villoutreix BO. FAF-Drugs4: free ADME-tox filtering computations for chemical biology and early stages drug discovery. Bioinformatics 2017; 33: 3658-60.
7 Gfeller D, Michielin O, Zoete V. Shaping the interaction landscape of bioactive molecules. Bioinformatics 2013; 29: 3073-9.
8 Yao Z-J, Dong J, Che Y-J, Zhu M-F, Wen M, Wang N-N, et al. TargetNet: a web service for predicting potential drug–target interaction profiling via multi-target SAR models. J Comput Aided Mol Des 2016; 30: 413-24.
9 Daina A, Michielin O, Zoete V. SwissTargetPrediction: updated data and new features for efficient prediction of protein targets of small molecules. Nucleic Acids Res 2019; 47: W357-W64.
10 UniProt Consortium. UniProt: a worldwide hub of protein knowledge. Nucleic Acids Res 2019; 47: D506-D15.
11 Stelzer G, Rosen N, Plaschkes I, Zimmerman S, Twik M, Fishilevich S, et al. The GeneCards Suite: From Gene Data Mining to Disease Genome Sequence Analyses. Curr Protoc Bioinformatics 2016; 54: 1.30.1-1.30.33.
12 Szklarczyk D, Gable AL, Lyon D, Junge A, Wyder S, Huerta-Cepas J, et al. STRING v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res 2019; 47: D607-D13.
13 Das J, Yu H. HINT: High-quality protein interactomes and their applications in understanding human disease. BMC Syst Biol 2012; 6: 92.
14 Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 2003; 13: 2498-504.
15 Bader GD, Hogue CW. An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinformatics 2003; 4: 2.
16 Sherman BT, Lempicki RA. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc 2009; 4: 44.
17 The crystal structure of COVID-19 main protease in complex with an inhibitor N3. TO BE PUBLISHED.
18 Ihaka R, Gentleman R. R: a language for data analysis and graphics. J Comput Graph Stat 1996; 5: 299-314.
19 RStudio Team. RStudio: integrated development for R. RStudio, Inc., Boston, MA; 2015. URL http://www.rstudio.com.
20 Wickham H. ggplot2: elegant graphics for data analysis. (Springer, 2016).
21 Morris GM, Huey R, Lindstrom W, Sanner MF, Belew RK, Goodsell DS, et al. AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility. J Comput Chem 2009; 30: 2785-91.
22 Trott O, Olson AJ. AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J Comput Chem 2010; 31: 455-61.
23 Laskowski RA, Swindells MB. LigPlot+: multiple ligand-protein interaction diagrams for drug discovery. J Chem Inf Model 2011; 51: 2778-86.
24 DeLano WL. Pymol: An open-source molecular graphics tool. CCP4 Newsletter on Protein Crystallography 2002; 40: 82-92.
25 Kamburov A, Pentchev K, Galicka H, Wierling C, Lehrach H, Herwig R. ConsensusPathDB: toward a more complete picture of cell biology. Nucleic Acids Res 2011; 39: D712-7.
26 Wang S, Cai Y, Cheng J, Li W, Liu Y, Yang H. motifeR: An Integrated Web Software for Identification and Visualization of Protein Posttranslational Modification Motifs. Proteomics 2019; 19: e1900245.

Table S2. The distribution of common compounds in TCM ingredient of Harmony Shot

Compound GLRR SPH LJF ASA FSF GSRR SR GS VN SUM

MOL000211 1 0 0 1 1 0 0 0 0 3

MOL002311 1 0 0 1 0 0 0 0 0 2

MOL000359 1 1 0 1 0 0 1 1 0 5

MOL000422 1 0 1 0 1 1 0 1 0 5

MOL004841 1 0 0 1 0 0 0 0 0 2

MOL004903 1 0 0 1 0 0 0 0 0 2

MOL004908 1 0 0 1 0 0 0 0 0 2

MOL005017 1 0 0 1 0 0 0 0 0 2

MOL000098 1 1 1 0 1 0 0 1 0 5

MOL000006 0 1 1 0 1 0 0 0 0 3

MOL000358 0 1 1 0 1 1 1 1 0 6

MOL000449 0 1 1 1 0 0 0 1 0 4

MOL002914 0 0 1 0 0 0 0 1 0 2

Table S3. The number of compounds and potential targets of Harmony Shot

TCM in HSh; Chinese name; Compound numbers in database (Pre-screening; Post-screening); Target numbers

GINSENG RADIX ET RHIZOMA Renshen 190 2 78

GLYCYRRHIZAE RADIX ET RHIZOMA Gancao 280 88 224

SCHIZONEPETAE HERBA Jingjie 159 10 188

LONICERAE JAPONICAE FLOS Jinyinhua 236 17 206

ARMENIACAE SEMEN AMARUM Kuxingren 113 16 67

FORSYTHIAE FRUCTUS Lianqiao 150 19 214

SCROPHULARIAE RADIX Xuanshen 47 5 47

GLEDITSIAE SPINA Zaojiaoci 30 10 194

VESPAE NIDUS Fengfang 31 20 101

Table S4. Information of 150 compounds identified from Harmony Shot by UPLC/QTOF MS

No. Tentative identification MW (Da) Mass error (ppm) RT (min) TCM
1 Cynanuriculoside-A_qt_1 490.2922 -1.7 0.80 GC
2 Isohexane 86.1092 -2.3 0.87 RS/ZC/GC/XR
3 3,4,3',4'-Tetrahydroxy-2-methoxychalcone 288.1017 5.6 0.89 JYH/GC/RS/XR
4 Corymbosin 358.1064 3.0 0.90 FF/XS/LQ
5 ISOHEPTANE 100.1247 -3.2 0.91 JYH/GC/XS/JJ/RS
6 Forsythiaside D 478.1670 -3.4 1.03 GC/RS/JJ
7 Licochalcone-a 338.1536 4.7 1.04 ZC/XR/XS/LQ/JYH/JJ
8 Isolariciresinol 360.1590 4.5 1.38 all
9 Glabrone 336.0976 -6.4 2.26 FF
10 Licoisoflavone-B 352.0930 -4.9 2.72 FF
11 Licoagrocarpin 338.1542 6.2 3.44 LQ
12 Salidroside 300.1232 6.6 4.19 LQ
13 Sweroside Swertiamarin 372.1406 -3.8 4.20 JYH
14 Glabranin 324.1368 1.8 4.26 LQ
15 Rengyoside B 320.1486 4.3 4.31 LQ
16 4H-1-Benzopyran-4-one-2-(4-(beta-D-glucopyranosyloxy)phenyl)-2_3-dihydro-5_7-dihydroxy-(2S)- 434.1214 0.2 4.59 JYH
17 Rengyoside A 322.1646 5.1 4.62 LQ
18 Artonin-E 436.1487 -7.3 4.83 XR
19 Adoxosidic acid 376.1375 1.5 5.26 JJ
20 (-)-Medicocarpin 432.1417 -0.8 5.50 XS
21 caleolarioside-A 478.1453 -4.5 5.56 JYH
22 (-)-Medicocarpin 432.1420 -0.1 5.58 RS
23 Argininyl-fructosyl-glucose 498.2218 8.6 5.84 JYH
24 Hesperidin 610.1900 0.3 5.95 XR
25 schizonepetoside A/D 330.1663 -4.1 6.00 LQ
26 Isoliquiritin 418.1259 -1.0 6.05 JYH
27 14-deoxy-12(R)-sulfoandrographolide 414.1676 -8.4 6.14 JJ
28 Glyasperin-A 422.1688 -9.9 6.19 JYH
29 Folinic-acid 473.1646 -2.6 6.28 JJ
30 Forsythoside-E 462.1732 -1.1 6.57 JJ
31 schizonepetoside A/B 330.1662 -4.5 6.66 JYH
32 Forsythoside-B 756.2496 2.5 6.98 JYH
33 8-Prenylwighteone 406.1766 -3.4 7.07 JYH
34 7-epi-Vogeloside 432.1619 -2.9 7.14 JJ
35 3-(2-hydroxy-4-methoxyphenyl)-2H-chromen-7-ol 270.0904 4.5 7.40 JJ
36 4- feruloylquinic acids 368.1126 5.4 7.47 JYH
37 Glabrol 392.2003 4.0 7.56 JJ
38 Glyasperin-E 444.1612 8.9 7.56 JJ
39 amygdalin 457.1582 -0.4 7.67 JJ
40 Daidzein-dimethyl-ether 282.0886 -2.2 7.78 JJ
41 Shinpterocarpin 322.1179 -8.6 7.99 LQ/GC/FF/JYH
42 (2R)-2-(3_4-dihydroxy-5-(3-methylbut-2-enyl)phenyl)-5_7-dihydroxy-8-(3-methylbut-2-enyl)chroman-4-one 424.1900 2.9 8.26 LQ
43 Gancaonin-T 398.2123 7.9 8.27 RS
44 Morusin 420.1569 -0.9 8.33 JYH
45 schizonepetoside B 330.1689 3.0 8.33 JYH
46 Forsythoside-C 640.2021 2.7 8.37 JJ
47 Isoschaftoside 564.1441 -6.7 8.48 GC
48 Xambioona/Kanzonol-E 388.1640 -8.3 8.49 JYH
49 19-2β.3α-4- acetyl-7- [(2,2-acetylamino)ethyl]-3-through-2-(3,4-dihydroxyphenyl)benzoxazine 386.1483 1.3 8.53 LQ
50 Ethenyl-(beta-D-glucopyranosyloxy)-hexahydro-5-oxo-pyrano-pyridine-3-carboxylic-acid 443.1436 1.8 8.85 JYH
51 Gancaonin-H 420.1568 -1.3 8.90 JYH
52 forsythidmethylester 404.1314 -1.2 8.91 JYH
53 (-)-Medicocarpin 432.1400 -4.7 9.07 JJ
54 Oxy-chromone 448.0990 -3.4 9.07 JJ
55 trihydroxy-methyl-2-(4-hydroxyphenyl)acetate 450.1514 -2.5 9.08 JJ
56 (E)-Aldosecologanin 742.2677 -0.9 9.19 JJ
57 Spathulenol 220.1844 7.0 9.21 LQ/ZC/FF
58 (+)-pinoresinol -O-β-D- glucopyranoside 520.1948 0.7 9.27 JYH
59 Hispaglabridin-B 390.1790 -10.0 9.31 JYH
60 Kanzonols-X 394.2147 0.8 9.31 JYH
61 Kanzonol-E 388.1682 2.0 9.34 JYH
62 Schaftoside 596.1381 0.7 9.41 JYH
63 Scropolioside-A_qt 590.2006 1.1 9.49 JYH
64 Isoviolanthin 578.1581 -9.5 9.70 GC
65 nicotiflorin 594.1528 -9.8 9.78 ZC
66 16- palmitic acid 610.1542 1.3 9.84 JJ
67 Trifolirhizin 446.1200 -2.9 9.88 GC
68 Isoglycyrrhiza glycoside 550.1694 1.3 9.88 GC
69 Rutin 610.1524 -1.6 9.89 JYH
70 Acteoside 624.2061 1.0 9.92 JJ
71 Deoxidized brucinosine 404.1343 5.7 9.95 GC
72 Hyperin 464.0942 -2.7 10.04 JJ
73 Amygdalin isomer 457.1573 -2.5 10.10 JYH
74 Matairesinol monoglucoside 520.1919 -4.6 10.23 JJ
75 Ssibirioside 472.1562 -3.7 10.28 JJ
76 Astragalin_2 448.0996 -2.1 10.30 JYH
77 Kanzonol-F 420.1954 3.8 10.31 LQ
78 Forsythoside-G 770.2628 -0.7 10.38 JJ
79 Euchrenone 406.2125 -4.7 10.58 JJ
80 1,3-O- dicaffeoylquinic acid 516.1274 1.1 10.90 JYH
81 Aaringin 580.1755 -6.7 11.10 LQ/ZC
82 Vitexin 432.1042 -3.2 11.20 FF/JYH/GC
83 Paeonioflorin 480.1600 -6.5 11.21 JJ
84 Epivogeloside 388.1387 4.1 11.26 GC
85 Isochlorogenic-acid 516.1261 -1.3 11.31 JYH
86 P=Paeonioflorin 480.1624 -1.6 11.35 JYH
87 Campesteryl-ferulate 576.4133 -7.6 11.37 JJ
88 Centauroside 758.2634 0.1 11.38 JYH
89 7alpha-L-Rhamnosyl-6-methoxylutcolin 462.1138 -5.3 11.49 JYH
90 angroside-C 784.2803 1.7 11.50 JJ
91 Notoginsenoside-R6 962.5456 0.6 11.59 JJ
92 (-)-secologanin 390.1181 4.4 11.76 JYH
93 Notoginsenoside R1 932.5353 0.9 11.84 JJ
94 Matairesinoside 520.1918 -4.9 11.85 JJ
95 Kanzonols-K 436.1887 0.3 11.87 LQ
96 Liquiritin-apioside 550.1679 -1.3 11.87 GC
97 Gycyroside 562.1680 -1.1 11.94 GC
98 2',7-Dihydroxy-4'-methoxyisoflavan-7-O-d-glucopyranoside 434.1560 -4.0 11.96 JJ
99 Licuraside 550.1689 0.6 12.08 GC
100 6-O-α-D- Galactose giharpside 656.2365 7.1 12.16 JYH
101 Ginsenoside Re 946.5527 2.6 12.29 JJ
102 (6′-O-Palmitoyl)-sitosterol-3-O-β-D-glucoside 800.6563 3.9 12.30 RS
103 Ginsenoside-Rh4_qt 458.3769 2.0 12.31 JJ
104 Quercetin_5 400.3730 5.9 12.31 JJ
105 Ginsenoside Rf 800.4947 3.0 12.32 JJ
106 Ginsenoside-Rh4 620.4282 -1.0 12.38 JJ
107 Ononin 430.1260 -0.8 12.38 GC
108 3β-O-trans-p-Caffeoyl alphitolic acid 634.3921 8.1 12.72 JJ/FF
109 3β-O-trans-p-Caffeoyl alphitolic acid 634.3893 3.7 12.79 JJ/JYH/GC
110 Helixin 750.4616 8.0 12.80 JJ
111 Phillyrin isomer 534.2049 -9.0 12.87 LQ
112 Arctiin 534.2095 -1.0 12.96 JJ
113 Patchouli ene 218.2056 8.6 13.60 FF
114 Harpagoside 494.1789 0.1 13.68 JJ
115 Harpagoside 494.1809 3.8 13.68 JJ
116 Acacetin -7-O-β-D- glucoside 446.1210 -0.7 13.83 JYH
117 liquoric-acid 484.3184 -1.1 14.40 GC
118 Deoxidized oleanolic acid 440.3655 0.1 14.93 JJ
119 Sanchinoside-C1 800.4960 4.6 14.94 JJ
120 Cryptomeriol 300.2113 6.8 15.21 LQ
121 Ginsenoside Rb1 1108.6081 4.5 15.59 JJ
122 (hydroxymethyl)-oxane-triol 604.4349 1.6 15.59 JJ
123 Betulin 442.3804 -1.5 15.59 JJ
124 20-Hexadecanoylingenol 586.4234 0.1 15.96 JJ
125 Licorice-saponin-G2 838.4019 3.8 16.01 JJ
126 Ginsenoside Ro 956.5020 4.0 16.05 JJ
127 Ginsenoside Rb3 1078.5971 4.3 16.30 JJ
128 Malonylginsenoside-Rc 1164.5975 4.0 16.46 JJ
129 Glabrolide 468.3223 -3.6 17.18 GC
130 3-O-beta-D-Glucuronopyranosyl-gypsogenin 646.3709 -1.2 17.23 GC
131 Glycyram_1 822.4039 0.2 17.23 GC
132 Zizyberanalic-acid 470.3381 -3.3 17.23 GC
133 Glyasperins-D 370.1773 -1.9 17.85 GC
134 Glycyram_2 822.4066 3.4 18.11 GC
135 Licorice-saponin-K2 822.4066 3.4 18.11 GC
136 Licorice-saponin-J2 824.4239 5.4 18.71 GC
137 Licochalcone-G 354.1484 4.8 19.26 GC
138 Arctigenin 372.1591 5.2 19.39 GC
139 Glyasperins-K 368.1624 0.1 19.40 XR/GC
140 Gancaonin-G 352.1334 6.6 19.77 XR/GC
141 Asiatic acid 488.3539 6.9 19.86 JJ/LQ
142 Glyasperin-B 370.1437 5.8 20.06 GC
143 Glyasperins-M 368.1284 6.7 22.50 ZC/XR/GC
144 Ginsenoside-Rg5 766.4889 2.7 23.71 JJ
145 Licorice-saponin-B2 808.4230 -1.7 23.71 RS
146 Apioglycyrrhizin_qt 470.3392 -0.9 24.25 JJ
147 Campesteryl-ferulate 576.4148 -5.4 27.17 JJ
148 Lonicerin -7-O-α-D- glucoside 450.1119 -9.0 28.74 JJ
149 Glabrolide 468.3224 -3.3 34.14 JJ
150 Licorice-saponin-C2_qt 454.3405 -9.1 34.97 JJ