Results and Discussion

4. RESULTS AND DISCUSSION

4.1 Genetic diversity analysis of coat protein (CP) gene of Papaya ringspot virus-P (PRSV-P) isolates from multiple locations of Western India

Results

4.1.1 Sequence analysis

In this study, fourteen CP gene sequences of PRSV-P originating from multiple locations in the Western Indian states of Gujarat and Maharashtra (Fig. 3.1) were analyzed and compared with 46 other CP sequences from different geographic locations in America (8), Australia (1), Asia other than India (13) and India (24) (Table 4.1; Fig. 4.1). The CP length of the present isolates varies from 855 to 861 nucleotides, encoding 285 to 287 amino acids.

Fig. 4.1: Amplification of PRSV-P coat protein (CP) gene from 14 isolates of Western India. Lanes from left to right: 1: Ladder (1 kb), 2: IN-GU-JN, 3: IN-GU-SU, 4: IN-GU-DS, 5: IN-GU-RM, 6: IN-GU-VL, 7: IN-MH-PN, 8: IN-MH-KO, 9: IN-MH-PL, 10: IN-MH-SN, 11: IN-MH-JL, 12: IN-MH-AM, 13: IN-MH-AM, 14: IN-MH-AK, 15: IN-MH-NS, 16: Negative control. The red arrow indicates the amplicon of the coat protein (CP) gene.

Table 4.1: Sources of coat protein (CP) gene sequences of PRSV-P isolates from India and other countries used in this study.

Country | Name of isolate* | Origin¥ | Length (nts) | GenBank Acc. No. | Reference
India | IN-GU-JN | GU-Jamnagar | 861 | MG977140 | This study
India | IN-GU-SU | GU-Surat | 855 | MG977142 | This study
India | IN-GU-DS | GU-Desalpur | 855 | MG977139 | This study
India | IN-GU-RM | GU-Ratlam | 858 | MG977141 | This study
India | IN-GU-VL | GU-Valsad | 855 | MG977143 | This study
India | IN-MH-PU | MH-Pune | 861 | MH311882 | This study
India | IN-MH-PN | MH-Pune | 861 | MG977150 | This study
India | IN-MH-KO | MH-Kolhapur | 858 | MG977147 | This study
India | IN-MH-PL | MH-Palghar | 858 | MG977149 | This study
India | IN-MH-SN | MH-Sangli | 861 | MG977151 | This study
India | IN-MH-JL | MH-Jalgaon | 858 | MG977146 | This study
India | IN-MH-AM | MH-Amravati | 858 | MG977145 | This study
India | IN-MH-AK | MH-Akola | 855 | MG977144 | This study
India | IN-MH-NS | MH-Nashik | 858 | MG977148 | This study
India | IN-PU-S | MH-Pune | 855 | AY238881 | Jain et al., 2004a
India | IN-PU-M | MH-Pune | 855 | AF063220 | Jain et al., 2004a
India | IN-HP | HP | 852 | AY458617 | Jain et al., 2004a
India | IN-BH | BH | 852 | EF210196 |
India | IN-RJ | RJ | 855 | KF989499 |
India | IN-MP | MP | 858 | KC149501 |
India | IN-AS | AS | 852 | KC149500 |
India | IN-SIK | SIK | 852 | DQ354072 | Jain et al., 2004a
India | IN-MG | MG | 861 | MF356497 | Mishra and Patil, 2018
India | IN-CG1 | CG | 861 | DQ354071 |
India | IN-CG2 | CG | 852 | AY491011 | Jain et al., 2004a
India | IN-HY | HR | 852 | DQ088670 |
India | IN-JK | JH-Ranchi | 852 | AY458619 | Jain et al., 2004a
India | IN-UP1 | UP | 855 | AY238882 | Jain et al., 2004a
India | IN-UP2 | UP | 852 | AY458620 | Jain et al., 2004a
India | IN-DL | DL | 846 | AY238883 | Jain et al., 2004a
India | IN-WB | WB | 852 | JN979406 |
India | IN-AP1 | AP | 918 | DQ666638 | Srinivasulu and SaiGopal, 2012
India | IN-AP2 | AP | 918 | AY839864 | Srinivasulu and SaiGopal, 2012
India | IN-TN1 | TN | 918 | DQ666641 | Srinivasulu and SaiGopal, 2012
India | IN-TN2 | TN | 858 | EF104919 |
India | IN-KE | KE | 918 | DQ666640 | Srinivasulu and SaiGopal, 2012
India | IN-KA1 | KA | 918 | DQ666639 | Srinivasulu and SaiGopal, 2012
India | IN-KA2 | KA | 918 | AY238884 | Jain et al., 2004a
Asia (other than India) | BD | Bangladesh | 861 | AY423557 | Jain et al., 2004a
Asia (other than India) | CH1-HN | China | 867 | KT895257 |
Asia (other than India) | CH2-LM | China | 858 | KT633943 | Shen et al., 2014
Asia (other than India) | ID-1 | Indonesia | 831 | AF374864 |
Asia (other than India) | JP-S | Japan | 858 | D50591 |
Asia (other than India) | MY-M185P | Malaysia | 861 | AB044342 |
Asia (other than India) | MN-P | Myanmar | 851 | AB822931 |
Asia (other than India) | NP-Ne2 | Nepal | 852 | AB823733 |
Asia (other than India) | PH | Philippines | 834 | AF374863 |
Asia (other than India) | SL | Sri Lanka | 837 | U14741 | Bateson et al., 1994
Asia (other than India) | TW-YK | Taiwan | 861 | X97251 | Wang et al., 1997
Asia (other than India) | TH | Thailand | 858 | AY162218 | Charoensilp et al., 2003
Asia (other than India) | VN-P24 | Viet Nam | 834 | AF506867 |
America | BR-PA1 | Brazil | 861 | KC748221 | Martinez et al., 2014
America | CO | Colombia | 861 | KT275938 | Ortiz-Rojas et al., 2017
America | CU-MT1 | Cuba | 861 | KC748227 | Martinez et al., 2014
America | US-FL | US (Florida) | 855 | AF196839 | Davis et al., 1999
America | US-HA | US (Hawaii) | 861 | S46722 | Yeh et al., 1992
America | JM-FS | Jamaica | 834 | DQ104817 | Chin et al., 2007
America | MX-VrAl | Mexico | 855 | AF319496 |
America | VE-T | Venezuela | 837 | DQ339578 | Chin et al., 2007
Australia | AU-BD | Australia | 837 | U14736 | Bateson et al., 1994

*Names of the isolates are abbreviated or modified in this study. References are given only for published sequences; blank cells indicate sequences that were submitted to NCBI but not published. ¥Indian state names are abbreviated as: AP: Andhra Pradesh, AS: Assam, BH: Bihar, CG: Chhattisgarh, DL: Delhi, GU: Gujarat, HP: Himachal Pradesh, HR: Haryana, JK/JH: Jharkhand, KA: Karnataka, KE: Kerala, MG: Meghalaya, MH: Maharashtra, MP: Madhya Pradesh, SIK: Sikkim, RJ: Rajasthan, TN: Tamil Nadu, UP: Uttar Pradesh, WB: West Bengal.

The KE (lysine and glutamic acid) repeats previously reported in the N-terminus of the PRSV CP (Silva-Rosales et al., 2000) were observed in all PRSV-P sequences analyzed. The conserved regions commonly found in potyvirus CP sequences, such as WCIEN, PQQID, DAG and QMKAAA (Shukla et al., 1994), were present in all the PRSV-P isolates from Maharashtra and Gujarat. The 14 CP sequences obtained in this study were compared with 46 other reported PRSV-P sequences at the amino acid and nucleotide levels (Table 4.2). A general summary of sequence diversity among the 60 PRSV-P isolates from different countries and continents, including the present isolates from Gujarat and Maharashtra, is presented in Table 4.3. When compared with isolates from other Asian countries and other Indian states, the isolates from Maharashtra and Gujarat showed considerable heterogeneity, with sequence identities of 82-98 % at the amino acid (aa) level and 81-100 % at the nucleotide (nt) level (Table 4.3). Similarly, comparison with isolates from neighboring countries indicated a diverse PRSV population, with aa sequence identities of 87-91 % with Bangladesh, 85-93 % with Nepal, 85-89 % with Myanmar, 87-90 % with Sri Lanka and 88-93 % with China. However, the PRSV-P isolates from Gujarat and Maharashtra were more closely related to the PRSV-P isolates from Japan (92-94 %) and the USA (91-95 %) at the aa level (Table 4.2). All Indian isolates shared 81-96 % aa identity with Asian isolates, 83-93 % with Australian isolates and 81-97 % with American isolates. A considerably heterogeneous PRSV-P population was observed within Gujarat, with aa sequence identities of 91-100 %. A similar situation was observed within Maharashtra, where the aa sequence identities shared among the PRSV-P population were 91-99 % (Table 4.3).
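The identity ranges quoted above are minimum and maximum values over pairwise comparisons of aligned sequences. The following is a minimal sketch of that calculation, not the software used in this study. The function names, the toy alignment and the choice to skip gapped columns are assumptions made for illustration only.

```python
from itertools import combinations


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity between two aligned sequences of equal length.

    Assumption: alignment columns with a gap ('-') in either sequence
    are skipped rather than counted as mismatches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def identity_range(alignment: dict) -> tuple:
    """Minimum and maximum pairwise percent identity in an alignment."""
    values = [
        percent_identity(alignment[x], alignment[y])
        for x, y in combinations(alignment, 2)
    ]
    return min(values), max(values)


# Hypothetical toy alignment (not real PRSV data).
toy_alignment = {
    "isolate_A": "DAGKEKEKQM",
    "isolate_B": "DAGKE--KQM",
    "isolate_C": "DAGKEKERQL",
}

low, high = identity_range(toy_alignment)
print(f"pairwise identity range: {low:.0f}-{high:.0f} %")
```

Treating gaps as mismatches instead would lower the identities of isolates that differ in the number of KE repeats, so the gap convention should be stated whenever such ranges are compared across studies.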

Table 4.2: PRSV coat protein (CP) gene (amino acid) percent identity between PRSV-P isolates from this study and other reported isolates from different countries.

Columns 1-14: percent identity with each of the 14 isolates of this study (Gujarat State: IN-GU-JN, IN-GU-SU, IN-GU-DS, IN-GU-RM, IN-GU-VL; Maharashtra State: IN-MH-PN, IN-MH-PU, IN-MH-KO, IN-MH-PL, IN-MH-SN, IN-MH-JL, IN-MH-AM, IN-MH-AK, IN-MH-NS). The order of the rotated column labels is not recoverable here, so the values are listed in the order in which they appear.

Group | PRSV-P isolate | Columns 1-14
India (other states) | IN-HP | 86 91 87 90 91 89 90 90 89 88 91 90 90 93
India (other states) | IN-MG | 86 92 88 90 91 89 91 90 89 88 92 91 90 92
India (other states) | IN-BH | 86 91 87 90 91 89 91 90 89 89 90 90 91 89
India (other states) | IN-RJ | 94 94 98 96 94 92 92 94 91 93 94 94 94 90
India (other states) | IN-MP | 90 94 90 92 94 96 95 94 95 92 92 92 93 92
India (other states) | IN-AS | 85 90 86 89 90 88 89 89 88 87 91 91 89 92
India (other states) | IN-SIK | 85 90 86 89 90 88 89 89 88 87 91 91 89 93
India (other states) | IN-CG2 | 87 91 87 91 91 90 92 91 90 89 91 91 91 90
India (other states) | IN-CG1 | 91 94 92 94 94 94 93 94 93 94 93 93 95 93
India (other states) | IN-HY | 86 92 88 90 92 90 92 91 90 90 91 91 92 93
India (other states) | IN-JK | 86 92 88 91 92 90 91 91 90 90 92 91 92 93
India (other states) | IN-UP2 | 87 91 87 91 91 90 92 91 90 89 91 91 91 94
India (other states) | IN-UP1 | 90 95 92 93 95 93 93 97 92 94 93 94 95 84
India (other states) | IN-DL | 87 92 89 91 92 91 92 91 90 90 92 92 92 82
India (other states) | IN-WB | 86 92 87 91 92 91 91 90 90 90 91 91 91 85
India (other states) | IN-AP1 | 84 89 85 86 89 90 90 88 90 87 86 86 88 90
India (other states) | IN-AP2 | 82 86 83 85 86 87 87 86 87 84 83 83 86 84
India (other states) | IN-TN1 | 85 89 85 87 89 90 90 88 89 86 86 86 88 83
India (other states) | IN-TN2 | 90 94 90 92 94 95 96 94 95 92 92 92 94 90
India (other states) | IN-KE | 83 89 85 86 89 89 89 87 89 85 85 85 87 90
India (other states) | IN-KA1 | 84 88 84 86 88 88 89 88 89 85 85 85 87 90
India (other states) | IN-KA2 | 91 95 91 93 95 97 97 95 98 93 92 92 95 92
India (other states) | IN-PU-S | 91 96 93 94 95 94 94 98 93 95 94 95 96 90
India (other states) | IN-PU-M | 91 96 93 94 95 94 94 98 93 95 94 95 96 90
Asia (other than India) | VN-P24 | 87 92 90 90 92 90 90 90 89 90 90 90 91 90
Asia (other than India) | ID-1 | 87 91 89 89 91 91 90 91 91 90 90 89 90 88
Asia (other than India) | PH | 84 89 87 87 89 87 87 88 87 87 87 87 87 85
Asia (other than India) | TW-YK | 89 94 91 92 93 92 94 93 92 91 92 93 93 90
Asia (other than India) | CH1-HN | 90 92 91 91 92 92 92 92 91 91 90 91 92 88
Asia (other than India) | CH2-LM | 90 93 91 91 93 92 93 93 92 91 91 92 93 90
Asia (other than India) | TH | 89 93 92 91 93 92 92 92 91 91 91 92 92 89
Asia (other than India) | JP-S | 90 94 92 93 94 93 93 94 92 93 92 93 94 90
Asia (other than India) | MN-P | 86 89 87 87 89 87 87 87 87 85 87 87 87 85
Asia (other than India) | NP-Ne2 | 85 90 86 89 90 88 90 89 88 88 90 90 90 93
Asia (other than India) | BD | 87 91 89 90 91 91 90 91 91 89 90 90 91 89
Asia (other than India) | SL | 87 90 88 89 90 89 90 90 89 88 87 88 90 86
Asia (other than India) | MY-M185P | 89 93 91 91 93 92 92 93 91 91 91 92 92 89
Australia | AU-BD | 89 92 90 91 92 91 91 91 91 91 90 91 91 89
America | Cu-MT1 | 89 93 91 93 92 92 92 92 91 91 91 91 92 89
America | BR-PA1 | 89 93 91 92 93 92 92 93 92 91 92 92 93 90
America | US-HA | 91 95 93 94 95 94 94 94 93 94 94 94 95 92
America | US-FL | 91 95 93 93 95 93 93 94 93 93 93 93 94 91
America | CO | 88 92 90 91 92 91 91 92 91 90 91 91 92 89
America | MX-VrAl | 89 93 91 92 92 92 92 93 91 92 91 91 93 90
America | VE-T | 87 91 89 90 90 89 89 90 89 89 89 89 90 87
America | JM-FS | 84 88 86 87 87 87 87 87 86 85 86 86 87 86

Table 4.3: Summary of coat protein (CP) gene identity (percent minimum – maximum) at amino acid (diagonally lower values in blue) and nucleotide (diagonally upper values in yellow) levels between PRSV-P isolates from India and other countries.

GJ MH RoI RoA AU AM Overall 90-99 88-98 81-98 83-94 89-90 86-92 81-99 GJ 91-100 87-100 89-95 80-100 82-92 85-90 82-93 81-100 MH 91-99 78-100 82-96 83-98 76-96 80-92 77-94 76-100 RoI 82-100 78-95 82-96 83-98 81-96 82-88 83-91 78-95 RoA 86-96 100 89-92 90-92 83-93 87-95 92-94 92-100 AU 100 88-96 84-95 85-94 81-97 83-97 92-95 88-96 AM 87-97 76-100 82-100 83-99 81-100 83-97 92-100 87-97 Overall 81-100

GJ: Gujarat; MH: Maharashtra; RoI: Rest of India; RoA: Rest of Asia; AU: Australia; AM: America

4.1.2 Phylogeographic analysis

The phylogenetic tree based on the amino acid sequences of the CP gene revealed that the PRSV-P isolates grouped into three major clusters, which were further divided into sub-clusters (Fig. 4.2). All the American isolates grouped separately, together with a few Asian isolates. The Asian isolates, including those from India, grouped in two major clusters that were further divided into sub-clusters/groups. The Indian isolates were divided into two sub-clusters/groups separated by the isolates from China, Japan, Taiwan, Malaysia, Vietnam, Thailand and the Philippines. With a few exceptions, the isolates from Gujarat and Maharashtra grouped into three sub-clusters/groups among the Indian isolates, viz., one group aligned with the South Indian isolates and the other two aligned with North Indian isolates. These data reveal that the PRSV-P population is highly diverse in the Gujarat and Maharashtra states of Western India and forms distinct sub-groups/clusters within the Indian PRSV population. A high level of diversity among PRSV isolates was also indicated by the two-dimensional color-coded graphical representation of pairwise percent sequence identities (Fig. 4.3).

Fig. 4.2: Phylogenetic tree based on the coat protein (CP) gene coding region of PRSV-P isolates from all over the world, including the isolates of this study. Neighbor-Joining tree with 1000 bootstrap replicates based on the amino acid sequence of the coat protein (CP) gene. The isolates of this study are indicated with solid black diamonds. The tree was rooted on the out-group sequence of Zucchini yellow mosaic virus (ZYMV, NCBI accession no. EF062582).

Fig. 4.3: Pairwise nucleotide percent sequence identities of PRSV-P isolates based on Coat protein (CP) gene.

4.1.3 Recombination and selection pressure analysis

Results of the recombination analysis of 60 CP sequences worldwide are shown in Table 4.4. Nine putative recombination sites were detected among the 60 PRSV-P isolates analyzed in this study. Only recombination events detected by at least four of the seven methods used in the Recombination Detection Program (RDP) analysis were considered significant. The majority of recombination events were located in the N-terminal region of the CP gene. The Indian CP sequences appeared as major parents 15 times out of 24 and as minor parents 26 times out of 34. The majority of recombinant sequences (more than 60 %) were also from Indian isolates. Among the Indian sequences, the CP sequences from the Western Indian states of Maharashtra and Gujarat appear to have significant involvement (more than 50 %) in recombination, either as major or minor parents or as recombinant sequences, which indicates that the CP sequences of PRSV-P from Western India are the most prone to recombination. Synonymous Non-synonymous Analysis Program (SNAP) analysis of the evolutionary pressure acting on the CP gene of the 60 PRSV-P isolates gave 1.2884 non-synonymous substitutions (dN) and 1.9012 synonymous substitutions (dS), for a dN/dS ratio of 0.6776. These results suggest a greater evolutionary dynamic for the PRSV isolates from Western India due to nucleotide substitutions.
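The dN/dS ratio reported above can be reproduced from the two SNAP values with simple arithmetic. The sketch below is illustrative only: the function names and the tolerance used to call a ratio "near neutral" are assumptions, and the conventional reading (a ratio below 1 suggests purifying selection, above 1 positive selection) is a general rule of thumb rather than a conclusion drawn in this section.

```python
def dn_ds_ratio(dn: float, ds: float) -> float:
    """Ratio of non-synonymous (dN) to synonymous (dS) substitutions."""
    if ds <= 0:
        raise ValueError("dS must be positive to form a ratio")
    return dn / ds


def selection_regime(omega: float, tolerance: float = 0.05) -> str:
    """Conventional interpretation of dN/dS.

    Assumption: 'tolerance' is an arbitrary illustrative band around 1,
    not a statistical test.
    """
    if omega < 1.0 - tolerance:
        return "purifying (negative) selection"
    if omega > 1.0 + tolerance:
        return "positive (diversifying) selection"
    return "near neutral"


# Values reported in the text for the CP gene of the 60 isolates.
reported_dn = 1.2884
reported_ds = 1.9012

omega = dn_ds_ratio(reported_dn, reported_ds)
print(f"dN/dS = {omega:.4f} -> {selection_regime(omega)}")
```

The computed ratio is about 0.6777, which agrees with the reported 0.6776 to within rounding of the fourth decimal place.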

Table 4.4: Recombination sites detected in the coat protein (CP) gene of PRSV-P using seven non-parametric and parametric methods.

Recombination position | Recombinant sequence(s) | Parental sequence(s): Minor | Parental sequence(s): Major

IN-MH-KO, TH, TW5, IN-UP1, JP-S, IN-MP, US-FL, IN-RJ, IN-AP1, IN-AP2, IN-PU-S, IN-PU-M, 20-114 IN-KA2 IN-TN1, IN-TN2, IN-MH-AK, IN-KE, IN-KA1, IN-GU-SU, IN-MH-JL, IN-GU-DS, IN-MH-NS IN-GU-VL IN-MH-KO, IN-MP, IN-AP1, IN-MH-SN, CH2-LM, TH 29-104 IN-TN1, IN-MH-PN IN-TN2, IN-KE, IN-KA1, IN-MH-JL, IN-MH-NS 40-92 IN-MG IN-AP2 IN-GU-JN IN-MH-AK, US-FL, MY-M185P, IN-RJ, TH, JP-S IN-UP1, IN-PU-S IN-MH-PL 48-102 IN-PU-M, IN-GU-SU, IN-GU-DS, IN-GU-VL IN-MH-AK, Cu-MT1, BR-PA1, 52-93 CH2-LM IN-MH-PL US-HA, US-FL, CO MY-M185P, US-HA, 61-857 BD CH1-HN TW-YK US-FL, IN-RJ, IN-UP1, IN-PU-S, IN-CG1, US-HA, IN-PU-M, TW-YK, 70-855 BD IN-MH-AK, MY-M185P, IN-GU-SU, IN-GU-JN IN-GU-DS, IN-GU-VL IN-MH-SN, 106-858 IN-AP2, JP-S IN-MG, IN-MH-PU IN-MH-PN IN-KA2, IN-RJ, MX-VrAl, 774-854 IN-PU-S IN-GU-VL IN-GU-DS

Isolates shown in red were studied in the present investigation.
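The significance criterion applied to the events in Table 4.4 (support from at least four of the seven detection methods) amounts to a simple consensus filter, sketched below. This is not the RDP program itself. The seven method names are those commonly bundled with RDP and are assumed here, since the text does not list them, and the two example events are hypothetical.

```python
from dataclasses import dataclass

# Assumption: the seven methods are the ones commonly bundled with RDP;
# the source text does not name them.
METHODS = ("RDP", "GENECONV", "BootScan", "MaxChi", "Chimaera", "SiScan", "3Seq")


@dataclass(frozen=True)
class RecombinationEvent:
    label: str
    supported_by: frozenset  # names of the methods that detected the event


def is_significant(event: RecombinationEvent, min_methods: int = 4) -> bool:
    """True if the event is supported by at least `min_methods` known methods."""
    return len(event.supported_by & set(METHODS)) >= min_methods


def significant_events(events, min_methods: int = 4):
    """Keep only the events that pass the consensus threshold."""
    return [e for e in events if is_significant(e, min_methods)]


# Hypothetical events for illustration (not taken from Table 4.4).
events = [
    RecombinationEvent(
        "event_A", frozenset({"RDP", "GENECONV", "MaxChi", "Chimaera", "3Seq"})
    ),
    RecombinationEvent("event_B", frozenset({"RDP", "BootScan", "SiScan"})),
]

for event in significant_events(events):
    print(event.label, "is supported by", len(event.supported_by), "methods")
```

Requiring agreement among several methods reduces false positives from any single detection algorithm, at the cost of possibly discarding weakly supported true events.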

4.1.4 Discussion

In this study, we report the CP sequences of papaya-infecting PRSV-P from 14 different locations in the Western Indian states of Gujarat and Maharashtra, which are major papaya-producing states. This is the first comprehensive study of PRSV-P CP sequences from the western regions of India with recombination analysis using a larger sample of 60 PRSV-P isolates. The size of the CP in the present study varies from 285 to 287 amino acids, which is similar to earlier reports where CP length varied from 280 to 286 amino acids (Jain et al., 2004a). The CP size variation is mainly due to variation in the number of KE repeats in the amino-terminal region of the CP. Several reports on PRSV (Bateson et al., 1994; Silva-Rosales et al., 2000) indicated that the first fifty amino acids of the N-terminal region are highly variable, as in most potyviruses (Shukla et al., 1994). However, most of the changes in the N-terminal region of the CP gene were found to be identical. Previous studies have also reported the conserved motifs described above in PRSV CP sequences (Bateson et al., 2002; Hema and Prasad 2004; Jain et al., 2004a; Bag et al., 2007; Srinivasulu and SaiGopal 2012). Considerable mixing and movement of PRSV isolates within and between Indian states has been suggested by phylogenetic studies (Hema and Prasad 2004; Jain et al., 2004a; Bag et al., 2007; Srinivasulu and SaiGopal 2012; Akhter et al., 2013). Our findings agree with earlier sequence comparison studies, according to which the PRSV isolates from the Indian sub-continent are the most diverse (Quemada et al., 1990; Bateson et al., 1994; Silva-Rosales et al., 2000; Hema and Prasad 2004; Jain et al., 2004; Bag et al., 2007; Wei et al., 2007; Srinivasulu and SaiGopal 2012; Akhter et al., 2013).
The level of sequence divergence in the PRSV populations from Australia and the USA (Quemada et al., 1990; Bateson et al., 1994; Silva-Rosales et al., 2000; Wei et al., 2007) is lower than in other continents. On the other hand, a higher level of divergence in the PRSV population from the Indian sub-continent has been reported earlier (Hema and Prasad 2004; Jain et al., 2004b; Bag et al., 2007; Srinivasulu and SaiGopal 2012; Akhter et al., 2013). The wide range of cropping systems and cultivation practices followed in the different agro-climatic regions of India may contribute to the higher level of divergence in the CP gene of PRSV. A comparatively high CP sequence variability of up to 13 % was observed in the PRSV population even within the smaller geographic area of the Western Indian states of Gujarat and Maharashtra. This might be due to higher recombination and selection pressure on the virus, as this is a major intensive papaya-producing region with tropical climatic conditions favoring the survival and multiplication of the virus and its vector. The presence of recombination in PRSV provides an understanding of its molecular evolution and helps in the study of characteristics such as host specificity, geographical distribution and the emergence of new epidemics (Mangrauthia et al., 2008). To the best of our knowledge, this is the first report providing evidence of recombination in CP gene sequences of PRSV-P from the western regions of India using a large number (60) of sequences. We detected 9 putative recombination sites in the CP sequences, compared with an earlier report that indicated one putative recombination event when 15 sequences were analyzed (Mangrauthia et al., 2008). This difference could be due to the smaller number of samples used in the earlier analysis, as detection of recombination depends on sufficient sampling; insufficient sampling is believed to limit the detection of recombination. The sequence diversity and recombination analyses of 60 samples support earlier observations that recombination is a dominant feature of PRSV evolution (Ohshima et al., 2002). A larger number of sequences allows more robust detection of recombination sites and hotspots, the lack of which was a major limitation of previous studies (Mangrauthia et al., 2008). The CP sequence analyses of PRSV-P isolates from the western states of India further strengthen the concept of PRSV origin in the Indian sub-continent (Bateson et al., 1994; Mangrauthia et al., 2008).
Molecular recombination and variability studies of the CP gene have great implications for understanding virus populations, their origin and evolution and, more importantly, for the epidemiology and management of the virus through genetic engineering. Moreover, sequence variability plays an important role in overcoming host resistance (Chiang et al., 2001; Tennant et al., 2001; Tripathi et al., 2004). Such studies will also be helpful in designing an effective transgene for engineering transgenic resistance in commercial papaya and in selecting a suitable mild isolate for cross-protection to mitigate the economic damage caused by this devastating virus. The present study gives an indication of the extent of recombination and variability in the CP gene of PRSV-P in the Western Indian states, which had not previously been studied in such detail. This will have a significant impact on understanding the virus epidemiology and on designing strategies for PRSV management.

4.2 Molecular characterization and genetic analysis of PRSV-Pune genome: First report from Western India

Results

4.2.1 PRSV-Pune genome sequence analysis

During the disease survey, we collected 20 different virus-infected papaya leaf samples from the Pune district of Maharashtra. The samples were confirmed using Enzyme-Linked Immunosorbent Assay (ELISA) based on a polyclonal antibody (pAb) specific to PRSV. All collected samples were positive in ELISA. One isolate from Kusgaon village in Pune, showing severe mosaic and vein clearing, was maintained on papaya in an insect-proof glasshouse and selected for complete genome analysis. Severe PRSV symptoms were observed on leaves of inoculated papaya 20-25 dpi, and the virus infection was reconfirmed using ELISA and reverse transcription polymerase chain reaction (RT-PCR) with PRSV CP gene-specific primers. Two confirmed PRSV-positive samples, along with virus-free leaf samples, were subjected to whole transcriptome sequencing to obtain the complete genome of the PRSV-Pune isolate from Kusgaon. A total of 182,164,620 paired sequences were generated in the total transcriptome sequencing of the papaya samples. Contigs were generated by de novo assembly. Contigs giving BLAST hits to PRSV were filtered and mapped to the geographically closest PRSV genome from India (accession no. KP743981). Subsequently, the complete PRSV sequence was recovered, assembled and analyzed. Primers were synthesized based on contigs conserved between the two infected samples for further confirmation (Table 4.7). The conserved domain search at NCBI and BLASTn analysis of the assembled contigs identified the sequence as a PRSV genome. RNA sequencing of infected hosts has previously been used in several cases to characterize their viruses (Kreuze et al., 2009; Adams et al., 2013; Barba et al., 2014).
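The contig-filtering step described above (keeping only contigs with a BLAST hit to PRSV) can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in this study: it assumes the BLAST results are available in the standard 12-column tabular format, the e-value cut-off is arbitrary, and the contig names and the accession XX000001 are hypothetical. Only KP743981 comes from the text.

```python
import csv
import io

# Column names of the standard 12-column BLAST tabular output.
FIELDS = (
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
)


def contigs_hitting(handle, subject_ids, max_evalue=1e-5):
    """Return the sorted contig IDs with a hit to any of `subject_ids`.

    Assumption: `max_evalue` is an illustrative threshold, not the one
    used in the study.
    """
    hits = set()
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        record = dict(zip(FIELDS, row))
        if record["sseqid"] in subject_ids and float(record["evalue"]) <= max_evalue:
            hits.add(record["qseqid"])
    return sorted(hits)


# Hypothetical BLAST rows; contig names and XX000001 are made up.
example_table = (
    "contig_1\tKP743981\t86.20\t1500\t207\t0\t1\t1500\t100\t1599\t0.0\t1200\n"
    "contig_2\tXX000001\t99.00\t800\t8\t0\t1\t800\t1\t800\t0.0\t1400\n"
    "contig_3\tKP743981\t80.00\t50\t10\t0\t1\t50\t5000\t5049\t0.5\t30\n"
)

prsv_contigs = contigs_hitting(io.StringIO(example_table), {"KP743981"})
print(prsv_contigs)
```

In this toy input, contig_2 is dropped because it hits a different subject and contig_3 because its e-value is above the cut-off, leaving only contig_1 for mapping to the reference genome.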

The genome of the PRSV-Pune isolate (10,326 bases) consists of one uninterrupted ORF spanning bases 85-10,119. The genome organization of PRSV-Pune is shown in Fig. 2.1 of the review chapter. Bases 1-84 and 10,120-10,326 constitute the 5′ and 3′ UTRs, respectively, excluding the poly-A tail. The genes or cistrons are located at nucleotides 85-1725 (P1), 1726-3096 (HC-Pro), 3097-4131 (P3), 4132-4278 (6K1), 4279-6192 (CI), 6193-6363 (6K2), 6364-6930 (NIa-VPg), 6931-7644 (NIa-Pro), 7645-9255 (NIb) and 9256-10119 (CP). The complete genome sequence of PRSV-Pune infecting papaya was deposited in NCBI GenBank under accession number MH311882. The nucleotide (nt) and amino acid (aa) sequences of PRSV-P isolates originating from India and the rest of the world were analyzed for genetic diversity (Table 4.5), and the summarized data are presented in Table 4.6. The sequence identity matrix showed that the PRSV-Pune isolate shares 82-86 % nt and 87-90 % aa sequence identity with the previously reported PRSV isolates included in the present study (Tables 4.5 and 4.6). The identity among all Indian PRSV-P isolates was 82-86 % and 88-89 % at the nt and aa levels, respectively. The full genome sequence of the present isolate exhibited the highest sequence identity (86 %) with the PRSV genome sequence (MF356497) of an isolate from a North-Eastern state of India and the lowest identity (82 %) with papaya-infecting isolates from Brazil and Colombia (Table 4.6). At the poly-protein (3345 aa) level, the highest sequence identity (90 %) was shared with isolates from Taiwan (TW3, TW7 and TW10) and the lowest identities were noted with the Colombian isolates. Maximum sequence identity with other isolates was observed in 6K2 (91-100 % at the aa level). In contrast, 6K1 showed the lowest identity (46-87 % at the nt level and 21-85 % at the aa level), indicating the highest variability among all PRSV genes. The 6K1 of the Meghalaya isolate showed the least identity (46 % nt and 21 % aa). P1 showed 70-78 % and 66-75 % identity at the nt and aa levels, respectively. The other genes (HC-Pro, P3, CI, VPg, NIa-Pro, NIb and CP) showed 79-91 % and 87-98 % identity at the nt and aa levels, respectively (Tables 4.5 and 4.6).

The well-studied motifs KITC, PTK and FRNK in HC-Pro, AVGSGKST in CI, GDD in NIb and DAG in the CP coding region, which have important roles in aphid transmission, symptom development and virus replication, were identified in PRSV-Pune. These motifs are commonly found and have been reported in other PRSV isolates and potyviruses (Mangrauthia et al., 2009; Shukla et al., 1994). The large translated poly-protein of PRSV, as in other potyviruses, is processed by three virus-encoded proteases that cleave it into a total of ten mature peptides. The proteases and cleavage sites identified for PRSV-Pune are shown in Table 4.8. The proteolytic cleavage junctions P1/HC-Pro (MEQY/S), HC-Pro/P3 (HYIVG/G), P3/6K1 (VIHQ/A), 6K1/CI (VYHQ/S), CI/6K2 (DYHQ/N), 6K2/NIa-VPg (VFHQ/G), NIa-VPg/Pro (VHHE/G), Pro/NIb (VFEQ/S) and NIb/CP (VFHQ/S) were found in the present isolate of PRSV-P.
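The cistron coordinates reported above for PRSV-Pune can be checked for internal consistency with a short script: consecutive cistrons should tile the ORF without gaps or overlaps, and each length should be a multiple of three. The sketch below uses only the coordinates given in the text; the function and variable names are illustrative.

```python
# Cistron coordinates (1-based, inclusive) as reported in the text.
CISTRONS = [
    ("P1", 85, 1725),
    ("HC-Pro", 1726, 3096),
    ("P3", 3097, 4131),
    ("6K1", 4132, 4278),
    ("CI", 4279, 6192),
    ("6K2", 6193, 6363),
    ("NIa-VPg", 6364, 6930),
    ("NIa-Pro", 6931, 7644),
    ("NIb", 7645, 9255),
    ("CP", 9256, 10119),
]
ORF_START, ORF_END = 85, 10119


def cistron_lengths(cistrons):
    """Length in nucleotides of each cistron."""
    return {name: end - start + 1 for name, start, end in cistrons}


def tiling_problems(cistrons, orf_start, orf_end):
    """List any gaps, overlaps or frame problems; empty if consistent."""
    problems = []
    expected_start = orf_start
    for name, start, end in cistrons:
        if start != expected_start:
            problems.append(f"{name} starts at {start}, expected {expected_start}")
        if (end - start + 1) % 3 != 0:
            problems.append(f"{name} length is not a multiple of 3")
        expected_start = end + 1
    if expected_start != orf_end + 1:
        problems.append(f"cistrons end at {expected_start - 1}, ORF ends at {orf_end}")
    return problems


lengths = cistron_lengths(CISTRONS)
total_nt = sum(lengths.values())
print("problems:", tiling_problems(CISTRONS, ORF_START, ORF_END))
print("ORF length:", total_nt, "nt =", total_nt // 3, "codons")
```

With these coordinates the cistrons tile the ORF exactly, and the ORF length of 10,035 nt corresponds to 3,345 codons, in line with the poly-protein length quoted in the text.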

Table 4.5: List of PRSV-P complete genome sequences and their sources from different countries used in this study for analysis.

Sr. No. | Abbreviation | Isolate | Accession No. | Genome size (nt) | Origin | Year of submission in NCBI
1 | IN1 | DEL | EF017707 | 10317 | India | 2006
2 | IN2 | PRSVR3 | KJ755852 | 10326 | India | 2014
3 | IN3 | HYD | KP743981 | 10341 | India | 2015
4 | IN4 | Meghalaya | MF356497 | 10343 | India | 2017
5 | IN5* | Pune | MH311882 | 10326 | India | 2018
6 | IN6# | - | EU475877 | 10335 | India | 2008
7 | CH1 | - | EF183499 | 10323 | China | 2006
8 | CH2 | HN-1 | HQ424465 | 10332 | China | 2010
9 | CH3 | HNvb | KF791028 | 10326 | China | 2013
10 | CH4 | - | KF734962 | 10332 | China | 2013
11 | CH5 | HN-DF | KT895257 | 10332 | China | 2013
12 | CH6 | LM | KT633943 | 10325 | China | 2015
13 | TH | - | AY162218 | 10323 | Thailand | 2002
14 | TW1 | YK | X97251 | 10326 | Taiwan | 1996
15 | TW2 | Ping-tong (DF) | DQ340769 | 10326 | Taiwan | 2005
16 | TW3 | Ping-tong (SMN) | DQ340770 | 10326 | Taiwan | 2005
17 | TW4 | Ping-tong (SM) | DQ340771 | 10326 | Taiwan | 2005
18 | TW5 | P-5-19 | EU882728 | 10320 | Taiwan | 2008
19 | TW6 | PrT3-AX-N-Ad6 | JX448369 | 10326 | Taiwan | 2012
20 | TW7 | PrT3-AX-N-Ph | JX448370 | 10326 | Taiwan | 2012
21 | TW8 | PrT3-AX-M-Ph | JX448371 | 10326 | Taiwan | 2012
22 | TW9 | pfT3-AX-D-Ph | JX448372 | 10326 | Taiwan | 2012
23 | TW10 | pFT3-NP | JX448373 | 10326 | Taiwan | 2012
24 | BZ | FEV | KP462721 | 10349 | Brazil | 2015
25 | CO1 | VR | KT275937 | 10326 | Colombia | 2015
26 | CO2 | CH | KT275938 | 10326 | Colombia | 2015
27 | MX | Mexo VrPo | AY231130 | 10320 | Mexico | 2003
28 | US1 | HA | S46722 | 10326 | Hawaii | 1992
29 | US2 | PG | EU126128 | 10320 | Hawaii | 2007
30 | US3 | PTX | KY271954 | 10320 | USA | 2016

* - Isolate studied in the present investigation; #-PRSV-W isolate from India

Table 4.6: Percent sequence identity of PRSV-Pune isolate with other reported PRSV-P isolates worldwide.

Values are percent identity, listed in this column order: FL nt; 5'UTR nt; Polyprotein nt, aa; P1 nt, aa; HC-Pro nt, aa; P3 nt, aa; 6K1 nt, aa; CI nt, aa; 6K2 nt, aa; VPg nt, aa; NIa nt, aa; NIb nt, aa; CP nt, aa; 3'UTR nt.

Isolate | GenBank Acc. No. | Percent identity (25 values in the order above)
IN1 | EF017707 | 84 72 84 88 72 68 85 91 87 93 81 75 88 93 87 93 87 94 85 91 83 91 88 93 83
IN2 | KJ755852 | 84 74 84 89 74 70 86 92 86 92 85 83 86 93 88 93 84 94 84 93 82 93 90 94 81
IN3 | KP743981 | 84 72 84 88 77 73 85 92 85 91 82 77 88 93 87 96 87 95 85 91 83 91 88 81 80
IN4 | MF356497 | 86 72 86 88 71 68 89 92 90 95 46 21 90 92 91 93 90 93 89 92 85 87 92 92 93
IN6 | EU475877 | 82 74 82 88 75 71 84 91 84 92 83 81 87 93 87 93 80 93 81 91 82 91 88 93 83
CH1 | EF183499 | 83 73 83 88 75 71 84 92 86 93 83 81 85 93 84 93 83 92 84 92 81 92 89 93 83
CH2 | HQ424465 | 83 68 83 89 75 71 84 92 86 93 81 77 87 93 88 93 83 92 82 92 82 92 88 93 82
CH3 | KF791028 | 83 72 83 88 75 70 85 91 86 94 84 79 85 93 86 93 83 92 82 94 81 94 89 93 81
CH4 | KF734962 | 83 74 83 89 74 70 86 92 85 93 84 81 85 93 84 93 83 93 83 93 81 93 89 94 81
CH5 | KT895257 | 83 74 83 88 74 69 85 91 86 93 83 77 87 93 87 96 83 91 82 92 81 92 88 92 80
CH6 | KT633943 | 83 76 83 89 76 72 85 92 85 91 86 81 85 93 89 98 84 92 83 92 81 92 88 93 82
TH | AY162218 | 83 73 83 88 73 70 85 93 85 92 85 81 86 93 85 95 84 94 82 92 81 92 90 93 84
TW1 | X97251 | 84 71 84 90 77 73 85 93 87 93 82 83 86 93 88 96 83 93 84 94 82 94 89 93 83
TW2 | DQ340769 | 84 74 84 89 77 73 86 92 86 93 87 85 87 94 87 93 84 94 83 93 82 93 90 95 81
TW3 | DQ340770 | 84 74 84 89 77 73 86 92 86 92 87 85 86 94 88 96 84 94 84 93 82 93 90 94 81
TW4 | DQ340771 | 84 75 84 90 78 75 86 92 86 92 86 85 87 94 88 95 84 94 83 92 82 92 91 96 81
TW5 | EU882728 | 84 68 84 89 76 72 85 92 86 93 87 83 86 93 88 93 83 92 83 93 82 93 89 92 83
TW6 | JX448369 | 84 74 84 89 77 73 86 92 86 92 87 85 86 93 85 96 83 93 83 93 82 93 90 94 80
TW7 | JX448370 | 84 74 84 89 76 72 86 92 86 92 87 85 86 94 85 96 83 93 83 93 82 93 90 94 80
TW8 | JX448371 | 84 75 84 90 78 75 86 92 86 93 87 85 87 94 88 95 84 94 83 92 82 92 91 96 81
TW9 | JX448372 | 84 75 84 89 78 75 86 92 86 93 85 85 87 94 88 91 84 94 83 92 82 92 91 96 81
TW10 | JX448373 | 84 74 84 89 77 73 86 92 86 93 85 83 86 93 87 93 85 94 82 93 82 93 90 96 81
BZ | KP462721 | 82 62 83 88 71 68 85 92 86 92 83 77 86 94 88 95 83 92 84 92 82 92 88 91 82
CO1 | KT275937 | 82 67 82 87 71 67 84 91 86 91 83 77 86 93 91 95 86 95 84 91 81 91 86 91 81
CO2 | KT275938 | 82 67 82 87 70 66 84 91 86 92 81 77 86 93 88 93 86 94 85 92 80 92 87 91 54
MX | AY231130 | 83 68 83 88 73 69 85 91 87 92 85 79 86 93 87 98 84 95 84 90 82 90 88 93 83
US1 | S46722 | 83 69 83 88 71 67 84 91 86 92 82 79 87 93 90 93 84 94 85 91 83 91 89 93 81
US2 | EU126128 | 83 69 83 88 71 67 84 91 86 92 82 77 86 93 89 93 84 94 85 89 83 89 89 93 83
US3 | KY271954 | 83 69 83 88 71 68 84 91 87 94 81 77 86 93 88 100 85 94 84 90 79 90 88 93 81

Table 4.7: List of the overlapping primer pairs designed for amplification of the entire PRSV-Pune isolate genome.

Primer name | Nucleotide sequence (5'-3') | Position in genome | Annealing temp (°C) | Product length (nt)
PRSV-P-1F | AAATAAAACAACTCAACACAACACAATCAAAAAC | 1-34 | 60 | 1202
PRSV-P-1R | ATTTTTGTGACGACAGAGTCACTCACTTG | 1174-1202 | |
PRSV-P-2F | GAAGCAGTGCATTTGGCACTAGAAATTGG | 868-896 | 62 | 1667
PRSV-P-2R | AATTTGATTTTCAACCATTTGCTGCCG | 2509-2535 | |
PRSV-P-3F | GCACTTACGTGTGACAACCAGTTGGAC | 2302-2328 | 60 | 1213
PRSV-P-3R | AACCCTTGCTTGAGTAGACCTTCATCC | 3489-3515 | |
PRSV-P-4F | ACGCTCGCTAGAACTATTATGGAGCAG | 3340-3366 | 59 | 1502
PRSV-P-4R | AGCCATTGCTTGGCTATCTATAACG | 4818-4842 | |
PRSV-P-5F | GGTAATGTATACCTAAAAGAGCCTACCAGA | 4594-4623 | 62 | 1835
PRSV-P-5R | TCGACCAAGCTTAGCATTTGCAGCAG | 6404-6429 | |
PRSV-P-6F | CAATCCCTCGGAGCTATCAGAGCGGA | 6157-6182 | 62 | 1670
PRSV-P-6R | GCTTTTCATGTAACGACCCATGAGTGGA | 7800-7827 | |
PRSV-P-7F | CCTAGCAACTTAGTAACGAAGCATGTC | 7705-7731 | 61 | 1890
PRSV-P-7R | ACTTCTCAAACTGAGACTGAGTGGCACG | 9568-9595 | |
PRSV-P-8F | CTGAGGCTGTGGATGCGGGCTTAAACGA | 9263-9290 | 60 | 1038
PRSV-P-8R | TTTTTTTTTTTTTTGCACGTGGGCC | Poly-A | |

Table 4.8: Predicted cleavage sites for the PRSV-Pune proteins and percentage sequence identity among the different PRSV infected papaya compared to the other isolates based on their origins.

Columns: UTR and functional regions of PRSV genome | Position in genome | Predicted cleavage site in PRSV-Pune | Predicted cleavage sites among 30 PRSV-P sequences | Nucleotide sequence identity shared by PRSV-Pune with isolates from America | Asia | India | Amino acid sequence identity shared by PRSV-Pune with isolates from America | Asia | India

5'UTR | 1-84 | - | - | 62-69% | 68-76% | 72-74% | - | - | -
Polyprotein | 85-10119 | - | - | 82-83% | 83-86% | 83-86% | 87-88% | 88-90% | 88-89%
P1/HC-Pro | 85-1725 | MEQY/S | MEQY/N, except CO2 (Y/S), US1, US2 (MDQY/N) | 70-73% | 71-78% | 72-77% | 66-69% | 67-75% | 68-73%
HC-Pro/P3 | 1726-3096 | HYIVG/G | HYIVG/G | 84-85% | 84-89% | 84-89% | 91-92% | 91-93% | 91-92%
P3/6K1 | 3097-4131 | VIHQ/A | VIHQ/A | 86-87% | 84-90% | 84-90% | 91-94% | 91-95% | 91-95%
6K1/CI | 4132-4287 | VYHQ/S | VYHQ/S, except US3 (VFHQ/N) | 81-85% | 46-87% | 46-85% | 77-79% | 21-85% | 21-83%
CI/6K2 | 4288-6192 | DYHQ/N | VYHQ/S, except IN1, IN2, TW1 (Q/G), CH2, CO2 (VFHQ/S) | 86-87% | 85-90% | 86-90% | 93-94% | 92-94% | 92-93%
6K2/VPg | 6193-6363 | VFHQ/G | VFHQ/G | 87-91% | 84-91% | 87-91% | 93-100% | 91-98% | 93-96%
VPg/NIa-Pro | 6364-6930 | VHHE/G | VHHE/G, except CO1 (VNHE/G), TW5 (VYHE/G) | 83-86% | 80-90% | 80-90% | 92-95% | 91-95% | 93-95%
NIa-Pro/NIb | 6931-7644 | VFEQ/S | VFEQ/S | 84-85% | 81-89% | 81-89% | 89-92% | 91-94% | 91-93%
NIb/CP | 7645-9255 | VFHQ/S | VFHQ/S, except US3 (E/S), CO2 (AYHQ/S), IN3 (VYHQ/S) | 79-83% | 81-85% | 82-85% | 89-92% | 87-94% | 87-93%
CP | 9256-10119 | | | 86-89% | 88-92% | 88-92% | 91-93% | 81-96% | 81-94%
3'UTR | 10120-10326 | | | 61-94% | 90-95% | 90-94% | | |
Full genome | 1-10326 | | | 82-83% | 82-86% | 82-86% | | |

4.2.2 Phylogenetic analysis

PRSV full genome sequences retrieved from GenBank (Table 4.5) were used, along with the PRSV-Pune isolate, for phylogenetic analysis based on the deduced amino acid sequences of the poly-protein and of individual genes such as P1, HC-Pro, 6K1, 6K2, NIa, NIb and CP. The phylogenetic tree of the poly-protein shows two major lineages, Asian and American. The Asian lineage includes 18 isolates: ten from Taiwan, six from China, one from India and one from Thailand. Five of the six Indian PRSV isolates grouped in the American cluster together with the 7 isolates of American origin, whereas one isolate (IN2) aligned within the Asian cluster along with the Taiwan isolates. The phylogenetic analysis clearly shows that the papaya-infecting PRSV isolates from India are, with one exception, more closely related to PRSV-P sequences from Brazil, Colombia, Hawaii and Mexico than to other Asian isolates from China, Thailand and Taiwan (Fig. 4.4). Similar phylogenies, with some differences, were also reported by others (Parameswari et al., 2007; Noa-Carrazana et al., 2007; Mangrauthia et al., 2008; Ortiz-Rojas and Chaves-Bedoya, 2017).

Fig. 4.4: Phylogenetic tree based on the poly-protein of complete genome sequences of 30 PRSV isolates infecting papaya. The tree was constructed using the maximum likelihood method based on the JTT model with 1000 bootstrap replicates. The black shaded circle indicates the PRSV-Pune isolate. Moroccan watermelon mosaic virus (MWMV, Accession no. EF579955) was used as an out-group.

4.2.3 Recombination and selection pressure analysis

Recombination sites were detected in individual genes (cistrons), in the poly-protein and in the full genomes of PRSV-P (Table 4.9). A higher number of recombination events was detected in the first 1020 nucleotides of the PRSV genome, which suggests that the 5′ UTR and P1 played an important role in shaping the PRSV genome. This agrees with earlier findings reported from India (Parameswari et al., 2007; Mangrauthia et al., 2008; Mangrauthia et al., 2009). No recombination events were detected in the 6K1 and 6K2 cistrons. Some recombination events were intra-lineage and others were inter-lineage. PRSV-Pune acted as the major parent in the HC-Pro cistron for the IN4 isolate and as a minor parent for IN3 and TW5 in NIb and CP, respectively (Table 4.9).

Substitutions per site per cistron (sub/site/cistron) for the 30 PRSV sequences in this study show that P1 and 6K1 were the most variable cistrons, whereas NIa was the most conserved and under the strongest evolutionary constraint (Table 4.10). The cistron-wise ratio of non-synonymous to synonymous substitutions (dN/dS) was low, ranging from 0.017 to 0.259, which indicates negative selection (Kryazhimskiy and Plotkin, 2008), especially in NIa (0.017) (Table 4.10).

Table 4.9: Recombination sites detected in different regions/genes of Papaya ringspot virus (PRSV-P) genomes using seven non-parametric and parametric methods.

Genomic region/cistron   Recombination position(s)   Recombinant sequence(s)   Minor parental sequence(s)   Major parental sequence(s)

5′ UTR 1-64 IN4,US2,1,MX,US IN3 TW5,CH2,5,TW4,1,TH

MX P1 372-1637 IN6,1,3 TH,TW1-10,CH1, 2 ,3, 5 ,BZ,US2,1,CO1,2,US3

CH2,MX, BZ,US1,CO1, 383-564 IN6,1,3 IN2,TW2,3,5,6,7,8,9,10 2,US3,IN4 536-876 MX,US2,1,CO1,2,US3 CH1-3,6,4 TH

1347-1606 IN4 CH6 CO2

536-733 BZ TH CH6,CH4,CH3

427-609 US3,CO1,2,CH1,2,US1,MX IN6 TW2,3,4,5,6,7,8, CH6

577-658 TW1-10,IN2,CH5 IN1 IN4

686-961 TH CH4 US1,US3,M,US2

416-1020 CH1,CO1,2,CH2,US1,MX,US3 CH5 IN4

Hc-Pro 110-1369 IN4 B,CO1,CO2 IN5

1113-1369 IN6 IN1 US3,M

411-1340 IN2 TW3,6,7 TW2,TW10

P3 3364-3896 IN2 TW4 TW2,TW9,TW10

CI 28-922 CH1 CH4 CH6

81-557 CH2 CH4,CH3 CH5

68-1808 CH6 CH5 CH4,CH3

NIa 1-595 IN6 IN1 CO2,CO1

Nib 1318-1603 IN6 H1,H2,T11 IN1

1216-1602 CH6 CH3 CH1,CH2,CH4

1-1565 IN3 I5 IN1

CP 90-855 US1,MX,US3 IN4 US1,BZ,US2

CO1,BZ,CO2,US1,TW2, 7, 66-858 CH3,1,6,TH IN4 1 106-860 IN4 HN,CH2,CH5 US1,BZ,US2,IN6,CO1,2

74-855 TW5 T1,IN5,T6,T10 BZ

3′ UTR 58-204 CH3,6 TW5,CH1 CO2,US1

5′ UTR+P1 39-405 IN6 IN4 CH4,2,6

P1+HcPro 659-2027 TW5 TW8 TW1

HcPro+CI 2108-4894 IN2 TW10, 2, 9 TW3,6,7

P3+VPg 3470-6508 MX US1,US2 US3

CI+VPg 4845-6782 CH2 CH5 CH4,1

CI+ NIa 5995-7023 TW2,IN2 TW8, 4 TW7,8

CI+ NIb 6098-8634 CH2,6 TW6, 2, 3, 8, 9, 10, 1 CH5

VPg+ NIa 6624-7459 IN4 CH5 CO1,2

NIa +CP 6996-9275 CH4,1,2 CH5 TW1, 3, 5, 6, 7. 8. 9

NIb +P1 764-9221 CH1 CH5 CH4

Table 4.10: Functional peptide/ cistron wise nucleotide substitution among the PRSV isolates.

P1 HC-Pro P3 CI 6K1 6K2 NIa NIb CP VPg

dN 0.184155 0.04396 0.036144 0.033243 0.116195 0.041113 0.024947 0.049053 0.028503 0.027211

dS 1.263774 0.998043 0.898865 0.874522 0.59128 0.711489 1.469254 1.964828 0.633991 0.915084

dN/dS 0.145718 0.044046 0.040211 0.038013 0.196514 0.057784 0.016979 0.024966 0.044958 0.029736

Note: d, nucleotide diversity; dN, substitution in non-synonymous position; dS, substitution in synonymous position for cistron to cistron comparison by SNAP
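As a consistency check on Table 4.10, the dN/dS ratios can be recomputed from the tabulated dN and dS values and classified using the thresholds stated in the discussion (dN/dS > 1 positive, < 1 purifying, = 1 neutral). The sketch below is a minimal illustration only: the values were estimated with SNAP in this study, and the function names `dn_ds_ratio` and `classify_selection` are hypothetical helpers introduced here, not part of SNAP.

```python
# Per-cistron (dN, dS) values copied from Table 4.10.
TABLE_4_10 = {
    "P1": (0.184155, 1.263774),
    "HC-Pro": (0.04396, 0.998043),
    "P3": (0.036144, 0.898865),
    "CI": (0.033243, 0.874522),
    "6K1": (0.116195, 0.59128),
    "6K2": (0.041113, 0.711489),
    "NIa": (0.024947, 1.469254),
    "NIb": (0.049053, 1.964828),
    "CP": (0.028503, 0.633991),
    "VPg": (0.027211, 0.915084),
}


def dn_ds_ratio(dn: float, ds: float) -> float:
    """Return dN/dS; hypothetical helper, not a SNAP function."""
    if ds <= 0:
        raise ValueError("dS must be positive to form a ratio")
    return dn / ds


def classify_selection(ratio: float, tol: float = 1e-9) -> str:
    """Classify a dN/dS ratio: > 1 positive, < 1 purifying, = 1 neutral."""
    if abs(ratio - 1.0) <= tol:
        return "neutral"
    return "positive" if ratio > 1.0 else "purifying"


ratios = {cistron: dn_ds_ratio(dn, ds) for cistron, (dn, ds) in TABLE_4_10.items()}
most_constrained = min(ratios, key=ratios.get)   # lowest dN/dS
least_constrained = max(ratios, key=ratios.get)  # highest dN/dS

for cistron, ratio in ratios.items():
    print(f"{cistron:7s} dN/dS = {ratio:.6f} ({classify_selection(ratio)})")
print("most constrained cistron:", most_constrained)
print("least constrained cistron:", least_constrained)
```

Run on the tabulated values, every cistron falls below 1, with NIa giving the lowest ratio, which matches the interpretation of negative (purifying) selection given in the text.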

Table 4.11: Nucleotide diversity and substitution among the PRSV isolates from different geographical origins.

         World (full genome)   World (polyprotein)   Among India   Asia without India   America   Asia
dN       0.09789               0.061704              0.1148        0.095135             0.2003    0.057792
dS       0.661652              1.060303              0.0614        0.662272             0.127     0.376015
dN/dS    0.147948              0.058195              0.5341        0.143649             0.6365    0.153696

Note: d, nucleotide diversity; dN, substitution in non-synonymous position; dS, substitution in synonymous position for geographical origin wise and for all isolates comparison by SNAP

4.2.4 Discussion

This study reports the complete genome analysis of a severe and distinct isolate of PRSV-P (PRSV-Pune), which is the first complete genome sequence (NCBI GenBank accession number MH311882) from Maharashtra state and the fifth from India. In our isolate, the cleavage sites differ at P1/HC-Pro (Y/S instead of Y/N) and at CI/6K2 (Q/N instead of Q/S) compared with the majority of published isolates (Table 4.5). Alterations or mutations of amino acids at and around the cleavage sites are rare and have been reported in only a few cases. The implications of these mutations at cleavage sites are not well understood. However, the mutations at two cleavage sites in the PRSV-Pune isolate may interfere with the biological functions of the cleaved proteins (P1, HC-Pro, CI and 6K2) and with their interactions with other proteins.

P1 is the least conserved protein among potyviruses (Urcuqui-Inchima et al., 2001) and plays a role in poly-protein processing and cell-to-cell movement of the virus (Gonsalves et al., 2010). Aphid transmission of potyviruses requires interaction between CP and HC-Pro (Pirone, 1981) and between HC-Pro and the aphid stylet (Peng et al., 1998). HC-Pro also plays a key role in PTGS suppression, long-distance movement and genome replication (Anandalakshmi et al., 1998; Kasschau and Carrington, 2001). CI and 6K2 play important roles in genome replication and cell-to-cell movement. The conserved N-terminal region of CI possesses ATPase, RNA-binding and RNA helicase activities (Fernandez and Garcia, 1996; Kadare and Haenni, 1997). 6K2 is involved in genome amplification and in anchoring the replication apparatus to ER-like membranes (Urcuqui-Inchima et al., 2001).
The roles of P1, HC-Pro, CI and 6K2 and their interactions with other proteins are essential for viability of the virus, and the mutations at the poly-protein processing cleavage sites in PRSV-Pune might contribute to making it distinct, well adapted for survival and able to induce severe disease in its host.

The phylogenetic tree generated in the present study (Fig. 4.3), which includes more and newer full genome sequences from India, Brazil, Colombia, China, Mexico, US, Taiwan and Thailand that were not included in previous reports, shows that Indian PRSV isolates continue to be distinct from other Asian isolates and closer to American isolates, indicating an interesting phylogeographic relationship among papaya-infecting PRSV. The grouping of Indian PRSV-P in the American lineage suggests probable dispersal of Indian PRSV through importation, which supports the earlier report by Mangrauthia et al. (2008).

As previously reported, recombination is a dominant feature of PRSV evolution (Lai, 1992; Worobey and Holmes, 1999; Ohshima et al., 2007; Mangrauthia et al., 2008). Our recombination analysis of 30 PRSV genomes shows the same. Mangrauthia et al. (2008) showed that molecular diversity and recombination events shaped the PRSV genome throughout the world. The clustering of Indian PRSV-P with American isolates (Brazil, Colombia, Mexico and US), together with the high gene-wise divergence of the Indian population (Jain et al., 1998; Bag et al., 2007), indicates dispersal of PRSV to these countries from the Indian subcontinent. This study contributes to the understanding of the evolution, host specificity and regional distribution of PRSV populations.

The maximum number of recombination sites was observed in P1, followed by the CP, HC-Pro, CI and NIb genes, indicating the key role of the 5′ end of the genome in PRSV genome architecture and dynamics. Because P1 showed the most recombination events, it is likely to play an important role in host-related selective pressure and in the evolution of the PRSV genome (Valli et al., 2007). In the 5′ UTR and P1 regions, Indian PRSV isolates were the common parent for most of the recombination sites. This further supports the concept of PRSV origin in the Indian subcontinent (Bateson et al., 2002). The environment serves as a strong driver for circulating viruses, leading to their genome evolution (Chare and Holmes, 2004).
Multiple mutations occur during the replication of RNA viruses, giving them considerably higher evolutionary rates than the cellular genes of their hosts (Duffy et al., 2008; Holmes, 2009). These nucleotide mutations can change the encoded amino acids, and the pattern of such changes is used to determine selection pressure in studies of virus evolution. Synonymous changes (dS) do not alter the amino acid, whereas non-synonymous changes (dN) do. Positive selection (dN/dS > 1), purifying selection (dN/dS < 1) or neutral selection (dN/dS = 1) is inferred from the relative magnitude of these two types of change within the genome (Nielsen and Yang, 1998; Zhou et al., 2010). New viral strains that show positive selection are evolving rapidly, whereas purifying or neutral selection indicates slow evolution leading to a stabilized viral population. Recombination has a significant effect on the molecular evolution of viruses.

Negative selection pressure was found to be common in most PRSV populations, across different geographic locations and continents (Table 4.11). However, diversifying selection pressure has been predicted for some plant viruses (Chaves-Bedoya et al., 2013; Moury et al., 2002). Harmful mutations were eliminated naturally in the CP gene sequences of Potato virus Y (PVY) populations, which showed purifying selection (Hosseini et al., 2017). Selection pressure analysis of Maize dwarf mosaic virus (MDMV) in Spain showed negative selection for the P1-HC region and concluded that the MDMV population was structured spatially but not temporally or by host (Achon et al., 2012). Bagyalakshmi et al. (2012) showed that the HC-Pro gene of Sugarcane streak mosaic virus (SCSMV) in India was under strong negative selection. The highest selection was predicted on the P1 gene in the Chinese Turnip mosaic virus (TuMV) population (Wang et al., 2009a). These findings suggest that recombination and selection pressure have a significant role in the slowly evolving PRSV-P population, mainly in India.

In this study we report for the first time the complete genome sequence of the PRSV-Pune isolate from Maharashtra state of India, its genetic analysis and its relationship with other PRSV-P isolates reported worldwide. Phylogenetic analysis showed that, in spite of the highly diverse Indian PRSV population, our isolate and other Indian PRSV isolates aligned with American isolates, which further supports the hypothesis of PRSV origin in India. Detection of recombination in the whole genome and at the cistron level in our isolate highlights its importance for PRSV genome structure and for diversified host selection under the ecological constraints of India.
Purifying selection, together with the distinct phylogenetic pattern and recombination, indicates slow evolution of the PRSV-Pune isolate, which makes it ecologically well fitted to the region. To study PRSV genome dynamics in relation to virus evolution, more information is needed on papaya-infecting PRSV-P isolates from different eco-geographical locations in India and worldwide. This study could broaden our understanding of the genetic recombination, evolution and molecular architecture of PRSV in India and worldwide, and may have implications for the development of effective genetic strategies for integrated virus disease management.

4.3 Transcriptome analysis of PRSV-Pune infected leaf samples of C. papaya cultivars PM and PS3

Results –

4.3.1 Gene expression analysis of PRSV-Pune infected leaf samples of C. papaya cultivars PM and PS3

To determine the changes in the papaya transcriptome associated with PRSV infection, we performed a time-course RNA-Seq study using next generation sequencing. Pusa Majesty (PM) is a PRSV-susceptible papaya cultivar, whereas Pune Selection-3 (PS3) is a tolerant cultivar developed against PRSV. Two-month-old plants of cultivars PM and PS3 were treated with phosphate buffer (mock) or phosphate buffer containing PRSV, and systemic leaves showing symptoms were harvested at 21 dpi (Fig. 4.5 and 4.6). Total RNA was extracted from the harvested leaf samples (Fig. 4.7). To confirm PRSV infection in the harvested leaf samples, RT-PCR was performed using primers specific for the coat protein gene of PRSV. A total of four cDNAs (PM control, PM infected, PS3 control and PS3 infected) were sequenced by Illumina HiSeq 500.

Fig. 4.5: Cultivar Pusa Majesty (PM) samples used for transcriptome study. a. PM control and PRSV-Pune infected plants; b. PM leaf with PRSV-Pune infection after 21 dpi. Red arrow indicates severe symptoms produced on PM leaf infected with PRSV-Pune after 21 dpi. These symptomatic (infected) leaf samples and non-symptomatic (control) leaf samples were harvested at 21 dpi in this study.

Fig. 4.6: Cultivar Pune Selection-3 (PS3) samples used for transcriptome study. a. PS3 control and PRSV-Pune infected plants; b. PS3 leaf with PRSV-Pune infection after 21 dpi. Red arrow indicates mild symptoms produced on PS3 leaf infected with PRSV-Pune after 21 dpi. These symptomatic (infected) leaf samples and non-symptomatic (control) leaf samples were harvested at 21 dpi in this study.

Fig. 4.7: Total RNA extraction for transcriptome study. From left to right lanes: 1: PM control, 2: PM infected, 3: PS3 control, 4: PS3 infected. These four RNA samples were used in the RNA-Seq study.

Table 4.12: Summary and matches of Illumina HiSeq 500 reads with C. papaya genome.

                                 Number of sequences                                       % Reads mapped
                                 PM control   PM infected   PS3 control   PS3 infected     PM control   PM infected   PS3 control   PS3 infected
Reads mapped in pairs            38,668,984   36,095,094    43,864,988    43,498,518       86.38        90.24         88.91         90.50
Reads mapped in broken pairs     3,953,494    2,444,969     3,935,506     2,521,872        8.83         6.11          7.98          5.25
Reads not mapped                 2,142,566    1,457,563     1,534,296     2,046,770        4.79         3.64          3.11          4.26
Total                            44,765,044   39,997,626    49,334,790    48,067,160       100.00       100.00        100.00        100.00
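The percentage columns of Table 4.12 follow directly from the read counts: each category count is divided by the sample total. The sketch below recomputes them as a cross-check, together with the approximate across-sample means quoted in the text. It is illustrative only; the dictionary layout and the helper names `mapping_percentages` and `mean_percentage` are assumptions made for this example and are not taken from the mapping software used in the study.

```python
# Read counts per sample, copied from Table 4.12.
READ_COUNTS = {
    "PM control": {
        "mapped in pairs": 38_668_984,
        "mapped in broken pairs": 3_953_494,
        "not mapped": 2_142_566,
    },
    "PM infected": {
        "mapped in pairs": 36_095_094,
        "mapped in broken pairs": 2_444_969,
        "not mapped": 1_457_563,
    },
    "PS3 control": {
        "mapped in pairs": 43_864_988,
        "mapped in broken pairs": 3_935_506,
        "not mapped": 1_534_296,
    },
    "PS3 infected": {
        "mapped in pairs": 43_498_518,
        "mapped in broken pairs": 2_521_872,
        "not mapped": 2_046_770,
    },
}


def mapping_percentages(counts: dict) -> dict:
    """Express each read category as a percentage of the sample total."""
    total = sum(counts.values())
    return {category: 100.0 * n / total for category, n in counts.items()}


percentages = {sample: mapping_percentages(c) for sample, c in READ_COUNTS.items()}


def mean_percentage(category: str) -> float:
    """Unweighted mean of one category's percentage across the four samples."""
    return sum(p[category] for p in percentages.values()) / len(percentages)


for sample, pct in percentages.items():
    summary = ", ".join(f"{cat}: {value:.2f} %" for cat, value in pct.items())
    print(f"{sample:13s} {summary}")

for category in ("mapped in pairs", "mapped in broken pairs", "not mapped"):
    print(f"mean {category}: {mean_percentage(category):.2f} %")
```

The means are unweighted averages of the four per-sample percentages, which is how the rounded figures of roughly 89 %, 7 % and 3.95 % in the text appear to have been obtained.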

Table 4.13: Summary of differential expression of genes (DEGs) in transcriptome study.

Differentially expressed genes (DEGs)                PM     PS3    Total (PM + PS3)
Up-regulated DEGs                                    181    212
Down-regulated DEGs                                  17     30
Total DEGs                                           198    242    440

Up-regulated DEGs with GO assigned terms             116    132
Down-regulated DEGs with GO assigned terms           10     14
Total DEGs with GO assigned terms                    126    146    272

Up-regulated DEGs with GO unassigned terms           65     80
Down-regulated DEGs with GO unassigned terms         7      16
Total DEGs with GO unassigned terms                  72     96     168

Table 4.13 lists the total up- and down-regulated DEGs together with the cultivar-wise counts. All of these DEGs were used for Gene Ontology analysis, in which some up- and down-regulated DEGs were assigned GO terms while others remained unassigned.

Sequencing of the four samples generated a total of 172,827,456 reads. After removal of adapter and low-quality sequences, 162,127,585 reads with 150 base pair (bp) ends were mapped, as shown in Table 4.12. Transcript assembly and quantification of differential gene expression were carried out using the available SunUp papaya reference genome (Ming et al., 2008). Changes in transcript expression were analyzed with the Cuffdiff program from Cufflinks (Trapnell et al., 2012). These programs were chosen because they use different ways to model the negative binomial dispersion parameter (Cumbie et al., 2011; Rapaport et al., 2013; Soneson et al., 2013). Pairwise comparisons were made between PRSV-inoculated and mock-inoculated samples for both varieties, PM and PS3. After mapping the sequence reads of the four samples to the SunUp reference genome, approximately 89 % of reads mapped in pairs, 7 % mapped in broken pairs and 3.95 % remained unmapped. A False Discovery Rate (FDR) cut-off of 5 % was used to select genes with significant differential expression. Only differentially expressed genes (DEGs) that met this criterion and showed a log2 fold change (FC) of at least ±1 in Fragments Per Kilobase of transcript per Million mapped reads (FPKM) were kept for further analysis. Overall, more DEGs were up-regulated than down-regulated in PRSV-infected samples of both PM and PS3 (Table 4.13). In the infected versus control comparison, PM showed a total of 198 DEGs and PS3 showed 242 DEGs. Of the total DEGs obtained, 32 remain to be functionally annotated in both PM and PS3. In this study, DEGs with higher expression in infected than in control samples are referred to as “up-regulated genes”, while those with higher expression in control than in infected samples are defined as “down-regulated genes”.

4.3.2 Gene ontology of C. papaya cultivars PM and PS3 under PRSV-Pune infection

Gene Ontology (GO) analysis was carried out for functional analysis of the DEGs. The up- and down-regulated DEGs were grouped according to the GO analysis summarized in Table 4.13. GO terms are assigned to DEGs under three ontologies: biological process, molecular function and cellular component. Of the total 440 DEGs, 272 DEGs encoding proteins had at least one GO term and 168 DEGs remained without a GO assignment (Table 4.13). In the cultivar-wise GO study of PM, 116 of the 181 up-regulated DEGs and only 10 of the 17 down-regulated DEGs were assigned GO terms. In total, 65 up-regulated and 7 down-regulated DEGs were not functionally characterized on the basis of GO in PM (Fig. 4.8). In biological processes, response to stimulus (30.3 %) and response to abiotic stimulus (16.6 %) were significantly up-regulated in PM. Developmental programmed cell death (16.6 %) was significantly down-regulated in PM.
Molecular function categories for PM showed high up-regulation of catalytic activity (51.4 %), hydrolase activity (11.4 %) and transaminase activity (4.2 %), whereas chitin binding, zinc ion transmembrane transporter activity, pattern binding, polysaccharide binding and cysteine-type endopeptidase activity (25 % each) were significantly down-regulated in PM. In the cellular component ontology, cell part and cell (82.7 %) were represented among the up-regulated DEGs of PM. No down-regulated GO terms were obtained for cellular component in PM.
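The DEG selection criteria described in Section 4.3.1 (FDR cut-off of 5 % and a log2 fold change of at least ±1, with positive values meaning higher expression in infected samples) can be sketched as a simple filter. This is a minimal illustration under stated assumptions, not the Cuffdiff implementation: the `GeneResult` record, the `classify_deg` function, the use of a strict `<` comparison for the FDR cut-off and the three example records (gene identifiers, fold changes and q-values) are all hypothetical and are not data from this study.

```python
from dataclasses import dataclass
from typing import List, Optional

FDR_CUTOFF = 0.05      # 5 % false discovery rate, as stated in the text
LOG2FC_CUTOFF = 1.0    # absolute log2 fold change threshold, as stated in the text


@dataclass(frozen=True)
class GeneResult:
    """One gene from an infected-vs-control comparison (hypothetical record)."""
    gene_id: str
    log2_fc: float   # log2(FPKM infected / FPKM control)
    q_value: float   # FDR-adjusted p-value


def classify_deg(result: GeneResult) -> Optional[str]:
    """Return 'up', 'down', or None when the gene fails either criterion."""
    # Assumption: genes exactly at the FDR cut-off are not counted as significant.
    if result.q_value >= FDR_CUTOFF:
        return None
    if abs(result.log2_fc) < LOG2FC_CUTOFF:
        return None
    return "up" if result.log2_fc > 0 else "down"


# Hypothetical example records, for illustration only.
examples: List[GeneResult] = [
    GeneResult("gene_A", 4.53, 0.001),   # strong increase in infected leaves
    GeneResult("gene_B", -1.07, 0.020),  # higher in control leaves
    GeneResult("gene_C", 0.60, 0.001),   # significant but below the FC threshold
]

calls = {g.gene_id: classify_deg(g) for g in examples}
for gene_id, call in calls.items():
    print(gene_id, call if call is not None else "not retained")
```

Applying both criteria together is what restricts the analysis to the 440 DEGs summarized in Table 4.13; a gene that passes only one of the two tests is not retained.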

[Figure 4.8: horizontal bar chart of GO terms for up-regulated and down-regulated DEGs, grouped into biological process, molecular functions and cellular components; horizontal axis from 0% to 100%.]

Fig. 4.8: Gene ontology (GO) analysis for PRSV-Pune infected leaf samples of cultivar Pusa Majesty (PM).

The GO study for PS3 showed that 132 of the 212 up-regulated DEGs and only 14 of the 30 down-regulated DEGs were assigned GO terms. In total, 80 up-regulated and 16 down-regulated DEGs were not functionally characterized on the basis of GO in PS3. The GO analysis for PS3 is shown in Fig. 4.9. In biological processes, metabolic process (54.7 %), cellular process (52.8 %), small molecule metabolic process (28.3 %), response to stimulus (28.3 %), response to stress (22.6 %), catabolic process (15 %) and response to abiotic stimulus (15 %) were significantly up-regulated in PS3. Carbohydrate catabolic process (28.5 %), catabolic process (42.8 %) and cell wall macromolecule metabolic process (42.8 %), together with S-glycoside catabolic process, glycosinolate catabolic process, response to insect, sulfur compound catabolic process, cell wall macromolecule catabolic process and cell wall macromolecule metabolic process (14.2 % each), were significantly down-regulated in terms of biological processes in PS3. Molecular function categories for PS3 showed high up-regulation of catalytic activity (58.6 %) and hydrolase activity (13.7 %). PS3 showed down-regulation of GO terms for catalytic activity (80 %) and hydrolase activity (40 %), along with chitin binding, beta-amylase activity, glucosidase activity, pattern binding, amylase activity and O-methyltransferase activity (10 % each). In the cellular component ontology, cell part and cell (88.2 %) and cytoplasm (43.1 %) were represented among the up-regulated DEGs of PS3, whereas plant-type cell wall, cell wall and external encapsulating structure (40 %) were significantly down-regulated in PS3.

[Figure 4.9: horizontal bar chart of GO terms for up-regulated and down-regulated DEGs, grouped into biological process, molecular functions and cellular components; horizontal axis from 0% to 100%.]

Fig. 4.9: Gene ontology (GO) analysis for PRSV-Pune infected leaf samples of cultivar Pune Selection-3 (PS3).

4.3.3 Differential gene expression (DEGs) analysis for PRSV-Pune infected leaf samples of C. papaya cultivars PM and PS3

Details of the differential gene expression (DEG) analysis for PRSV-Pune infected leaf samples of PM and PS3 are given in Tables 4.14 and 4.15. DEGs that are commonly up- or down-regulated in PM and PS3 appear in both tables, because this repetition was necessary for the cultivar-wise study of DEGs.

4.3.3.1 Common up-regulated DEGs in C. papaya cultivars PM and PS3

A total of 29 DEGs were commonly up-regulated in both PM and PS3. Genes encoding the DnaJ family, C/III subfamily, the Class-IV pyridoxal-phosphate-dependent aminotransferase family and the Iron/ascorbate-dependent oxidoreductase family showed the highest fold-change up-regulation in both PM and PS3. The Homogentisate dioxygenase family and Myo-inositol oxygenase family were the next most highly up-regulated common protein-encoding genes, followed by the Amino acid/polyamine transporter 2 family, Gibberellic acid-stimulated Arabidopsis (GASA) family, Proline oxidase family, Expansin family and the others listed in Table 4.14.

4.3.3.2 Common down-regulated genes in C. papaya cultivars PM and PS3

Two genes, encoding the Peptidase S9A family (-1.87 FC) and the SPR1 (SPIRAL1) family (-1.10 FC), were down-regulated in both PM and PS3.

4.3.3.3 Significantly up-regulated genes in cultivar PM

Apart from the common up-regulated DEGs, 11 genes were significantly up-regulated in PM. Genes encoding the Syntaxin family (4.53 FC), RING-type zinc finger family (3.13 FC) and Calmodulin family (3.12 FC) were the most significantly up-regulated in PM. The Lysosomal lysine/arginine transporter (Laat-1) family (2.47 FC), Acyl-CoA-binding protein (ACBP) family (2.21 FC), Methyltransferase superfamily (2.07 FC) and Programmed cell death 4 (PDCD4) family (1.71 FC) also showed significant expression in response to PRSV-Pune infection in PM.

4.3.3.4 Significantly up-regulated genes in cultivar PS3

A total of 11 DEGs showed significant up-regulation in PS3 apart from the common up-regulated genes (Table 4.15). The Casparian strip membrane proteins (CASP) family (5.5 FC) and Defensin-like (DEFL) family (5.03 FC) showed the highest up-regulation in PS3. The GDXG lipolytic enzyme family (4.87 FC), Drug/metabolite transporter (DMT) superfamily (4.27 FC), 2-oxoisovalerate dehydrogenase subunit alpha (BCKDHA) family (3.94 FC) and 2-oxoacid dehydrogenase family (3.92 FC) were among the protein-encoding genes uniquely up-regulated in PS3.

4.3.3.5 Significantly down-regulated genes in cultivar PM

Two genes were uniquely down-regulated in PM: the HEATR5 family (-1.07 FC) and the ZRT/IRT-like protein (ZIP) transporter family (-1.02 FC).

4.3.3.6 Significantly down-regulated genes in cultivar PS3

A gene encoding EXO/Phosphate-Responsive Protein (EXORDIUM family) was down-regulated in PS3 with a fold change (FC) of -2.67.

4.3.3.7 Glycosyl hydrolase family expressing genes in cultivar PM

Glycosyl hydrolase 35 (5.66 and 5.5 FC), Glycosyl hydrolase 3 (4.84 FC), Glycosyl hydrolase 14 (3.25 and 2.78 FC), Glycosyl hydrolase 36 (2.76 FC) and Glycosyl hydrolase 16 (2.38 FC) were up-regulated in PM. Glycosyl hydrolase 19 family (-2.79 FC) and Glycosyl hydrolase 61 (-3.03 FC) were down-regulated in PM.

4.3.3.8 Glycosyl hydrolase family expressing genes in cultivar PS3

Glycosyl hydrolase 35 (6.31 and 5.77 FC), Glycosyl hydrolase 3 (4.83 FC), Glycosyl hydrolase 9 (3.24 FC), Glycosyl hydrolase 14 (3.18 FC), Glycosyl hydrolase 36 (2.65 FC) and Glycosyl hydrolase 16 (2.1 FC) were up-regulated in PS3. Glycosyl hydrolase 1 family (-1.8 FC), Glycosyl hydrolase 19 family (-2.79 and -2.74 FC) and Glycosyl hydrolase 14 (-4.27 FC) were down-regulated in PS3.

4.3.3.9 Transcription factors in cultivar PM

Four transcription factors were significantly up-regulated in PM: Regulator of rDNA transcription protein 15 (ART2/RRT15) family (3.11 FC), Leucine zipper motif (bZIP) family (2.47 FC), ARG7 family (2.02 FC) and Plant ACBP60 (2.01 FC).

4.3.3.10 Transcription factors in cultivar PS3

As in PM, transcription factors were also differentially expressed in PS3.
Regulator of rDNA transcription protein 15 (ART2/RRT15) family (3.09 FC), Leucine zipper motif (bZIP) family (1.91 FC) and Plant ACBP60 protein family (1.95 FC) were significantly up-regulated in PS3. A gene encoding a transcription factor of the NFYB/HAP3 subunit family (-1.34 FC) was down-regulated in PS3.

4.3.3.11 Heat shock proteins in cultivars PM and PS3

Three heat shock protein expressing genes were commonly up-regulated in both PM and PS3: Small heat shock protein (HSP20) family (2.63 and 2.62 FC in PM and PS3, respectively), ClpX chaperone family (2.39 and 2.43 FC in PM and PS3, respectively) and Sigma-70 factor family (1.79 and 1.77 FC in PM and PS3, respectively).

4.3.3.12 Genes encoding protein families over-represented in cultivar PM

AB hydrolase superfamily proteins were encoded by five different genes with fold change (FC) ranging from 1.66 to 2.77. Class I-like SAM-binding methyltransferase superfamily (2.42 FC) and Class V-like SAM-binding methyltransferase superfamily (3.22 FC) were encoded by two different genes. Class IV pyridoxal-phosphate-dependent aminotransferase family (4.06 FC) and Class-V pyridoxal-phosphate-dependent aminotransferase family (1.66 FC) were up-regulated in PM. Glycosyltransferase 61 family (-3.03 FC) and Glycosyltransferase 92 family (3.52 FC) were significantly down- and up-regulated in PM, respectively. The Multi antimicrobial extrusion (MATE) family was encoded by three different genes with fold changes of 2.95, 2.48 and 2.27, respectively. Pentatricopeptide repeat (PPR) PCMP-H sub-family and Pentatricopeptide repeat (PPR) P sub-family were up-regulated in PM with fold changes of 2.96 and 3.02, respectively. Phenylalanine ammonia lyase (PAL/histidase) family (2.38 and 2.33 FC) was encoded by two different genes. Phosphatidylethanolamine-binding protein family (2.02 and 2.40 FC) was encoded by two different genes. The Ser/Thr protein kinase family was represented by the WNK sub-family (two different genes, 3.12 FC), the CDPK sub-family (3.12 FC) and a further Ser/Thr protein kinase (3.12 FC).
4.3.3.13 Genes encoding protein families over-represented in cultivar PS3

AB hydrolase superfamily protein (1.69 and 2.37 FC) was encoded by different genes in PS3. Class IV pyridoxal-phosphate-dependent aminotransferase family (4.60 FC) and Class-V pyridoxal-phosphate-dependent aminotransferase family (1.71 FC) were significantly up-regulated in PS3. Class V-like SAM-binding methyltransferase superfamily (3.22 FC) and Class I-like SAM-binding methyltransferase superfamily (-1.34 FC) were up- and down-regulated in PS3, respectively. The Multi antimicrobial extrusion (MATE) family was encoded by two different genes with 2.16 and 2.52 FC. Phenylalanine ammonia lyase (PAL/histidase) family was encoded by two different genes with 2.43 and 2.60 FC. Phosphatidylethanolamine-binding protein family was encoded by two different genes with fold changes of 1.98 and 1.70. A Protein kinase superfamily gene encoding the SNF1 family was significantly down-regulated in PS3 with -1.22 FC.

Among the genes encoding over-represented protein families, the following seven were common to PM and PS3: evm.TU.supercontig_69.93 encoding AB hydrolase superfamily, Lipase family (2.17 FC in PM and 2.37 FC in PS3); evm.TU.supercontig_73.24 and evm.TU.supercontig_73.21 encoding Multi antimicrobial extrusion (MATE) family (2.95 and 2.48 FC in PM; 2.16 and 2.52 FC in PS3); evm.TU.supercontig_390.3 and evm.TU.supercontig_92.120 encoding Phenylalanine ammonia lyase (PAL/histidase) family (2.38 and 2.33 FC in PM; 2.43 and 2.60 FC in PS3); and evm.TU.supercontig_3.421 and evm.TU.supercontig_3387.2 encoding Phosphatidylethanolamine-binding protein family (2.02 and 2.40 FC in PM; 1.98 and 1.70 FC in PS3). Overall, 41 genes (29 genes mentioned in common up-regulated genes, 7 common over-represented genes and 6 glycosyl hydrolase genes) were found to be common to both PM and PS3.

4.3.3.14 Common genes and proteins in transcriptome and proteome analysis of cultivars PM and PS3

A total of 11 DEGs encoding proteins were common to the transcriptome study and the label-free proteome study of PM and PS3 leaf samples in response to PRSV-Pune infection (Table 4.14, 4.15, 4.18 and 4.19). ATP-dependent AMP-binding enzyme family was up-regulated at the transcriptome level (gene id evm.TU.supercontig_10.200, 2.71 FC in PM and 2.90 FC in PS3) and down-regulated at the proteome level in PM (protein id XP_021905490, -0.88 FC).
The Beta-class carbonic anhydrase family was up-regulated in both the transcriptome and proteome studies (evm.TU.supercontig_21.6 with 2.95 FC in PM and 3.15 FC in PS3 in the transcriptome; XP_021887045 with 1.20 FC in PS3 in the proteome). The Class-V pyridoxal-phosphate-dependent aminotransferase family (evm.TU.supercontig_892.2) showed fold changes of 1.66 in PM and 1.71 in PS3 in the transcriptome study, and XP_021899177, a protein belonging to the same family, showed a fold change of 1.10 in PS3. The Early light-induced protein (ELIP)/psbS family, encoded by evm.TU.supercontig_44.124, showed 1.9 FC in PM and 1.93 FC in PS3, whereas XP_21899783 indicated down-regulation of the ELIP/psbS family in PM with -0.91 FC. The Glutathione S-transferase (GST) superfamily, Tau family, was down-regulated in PM in the transcriptome analysis (evm.TU.supercontig_1.379 with -1.68 FC). The GST superfamily, Phi family, was represented by XP_021906879 (0.73 FC) in PM and by XP_021902740 and XP_021906879 (0.54 and 0.76 FC) in PS3. In PM, the heat shock protein 70 family encoded by evm.TU.supercontig_1.398 was up-regulated with 1.51 FC in the transcriptome study. At the protein level, both PM and PS3 showed up-regulation of the heat shock protein 70 family, represented by XP_021910429 (0.69 FC in PM) and XP_021891591 (0.51 FC in PS3). Late Embryogenesis Abundant (LEA) type 3 family proteins, encoded by evm.TU.supercontig_9.251, were up-regulated at the transcriptome level with 2.21 FC in PM and 2.36 FC in PS3, whereas XP_021906959, representing LEA type 3 proteins, showed significant down-regulation in PS3 (-0.71 FC) in the proteome analysis. The transcriptome study revealed that the peptidase C1 family gene evm.TU.supercontig_286.15 was down-regulated (-1.29 FC) in PM. At the proteome level, down-regulation of peptidase C1 family proteins was observed in both PM (P05994, -0.91 FC) and PS3 (XP_021888999, -1.05 FC). In PM, evm.TU.supercontig_468.4 encoding the Peroxidase family was significantly up-regulated with 3.67 FC in the transcriptome study, whereas XP_021900593 showed down-regulation of peroxidase (-0.80 FC) in the proteome study. The Short-chain dehydrogenases/reductases (SDR) family encoded by evm.TU.supercontig_23.100 was up-regulated (2.90 FC) in PM in the transcriptome study, whereas XP_021894719, representing the SDR family, was down-regulated in PS3 with -0.75 FC in the label-free proteome analysis. The Universal ribosomal protein uL4 family, encoded by evm.TU.supercontig_112.52, was up-regulated (1.92 FC) in PM in the transcriptome study, whereas P10056, representing the Universal ribosomal protein uS5 family, showed down-regulation in the proteome study with -0.68 FC in PS3.
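The transcript-versus-protein comparison above can be summarised by checking whether a family changes in the same direction at both levels. The following Python sketch is illustrative only. It uses a subset of the fold-change values quoted in this section, pairs only values measured in the same cultivar, and compares signs rather than magnitudes, on the assumption (not stated in the text) that a positive proteome fold change means up-regulation and a negative one means down-regulation. The names `records`, `direction` and `concordance` are hypothetical.

```python
# Each record: (protein family, cultivar, transcript log2 FC, protein FC).
# Values are copied from the text of this section; only same-cultivar pairs are used.
records = [
    ("Beta-class carbonic anhydrase", "PS3", 3.15, 1.20),
    ("Class-V PLP-dependent aminotransferase", "PS3", 1.71, 1.10),
    ("Heat shock protein 70", "PM", 1.51, 0.69),
    ("Peptidase C1", "PM", -1.29, -0.91),
    ("ATP-dependent AMP-binding enzyme", "PM", 2.71, -0.88),
    ("ELIP/psbS", "PM", 1.90, -0.91),
    ("LEA type 3", "PS3", 2.36, -0.71),
    ("Peroxidase", "PM", 3.67, -0.80),
]


def direction(fold_change):
    """Return 'up' for a positive fold change and 'down' otherwise."""
    return "up" if fold_change > 0 else "down"


def concordance(rows):
    """Label each (family, cultivar) pair as concordant or discordant by sign."""
    labels = {}
    for family, cultivar, rna_fc, protein_fc in rows:
        same = direction(rna_fc) == direction(protein_fc)
        labels[(family, cultivar)] = "concordant" if same else "discordant"
    return labels


if __name__ == "__main__":
    for key, label in concordance(records).items():
        print(key, label)
```

A sign-only comparison sidesteps the question of whether the two platforms report fold change on the same scale, at the cost of ignoring effect size.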


Table 4.14: Transcriptome profile (Differentially expressed genes) of cultivar Pusa Majesty (PM).

Gene Id | Protein families | Log2 fold change

Common up-regulated genes in PM and PS3
evm.TU.supercontig_362.2 | DnaJ family, C/III subfamily | 4.08
evm.TU.supercontig_160.23 | Class-IV pyridoxal-phosphate-dependent aminotransferase | 4.06
evm.TU.supercontig_62.66 | Iron/ascorbate-dependent oxidoreductase | 4.01
evm.TU.supercontig_80.119 | Major facilitator superfamily | 3.39
evm.TU.supercontig_131.80 | Homogentisate dioxygenase | 3.05
evm.TU.supercontig_78.51 | Myo-inositol oxygenase | 3.02
evm.TU.supercontig_122.44 | Amino acid/polyamine transporter 2 | 2.8
evm.TU.supercontig_111.7 | Gibberellic acid-stimulated arabidopsis (GASA) | 2.8
evm.TU.supercontig_652.4 | Proline oxidase | 2.78
evm.TU.supercontig_19.182 | Cytochrome P450 | 2.75
evm.TU.supercontig_20.162 | Expansin | 2.72
evm.TU.supercontig_2.435 | Eukaryotic ribosomal protein eL30 | 2.57
evm.TU.supercontig_25.134 | Aldehyde dehydrogenase | 2.44
evm.TU.supercontig_34.88 | Vacuolar protein sorted associated protein 18 (VPS18) | 2.44
evm.TU.supercontig_132.11 | PTR2/POT transporter Proton dependent oligopeptide transporter | 2.14
evm.TU.supercontig_12.97 | Uridine kinase | 2.08
evm.TU.supercontig_5.45 | Inosine-uridine preferring nucleoside hydrolase (IUNH) | 2
evm.TU.supercontig_70.105 | Glycerophosphoryl diester phosphodiesterase | 1.97
evm.TU.supercontig_146.27 | Pyruvate dehydrogenase (acetyl-transferring) kinase/Branched chain alpha ketoacid dehydrogenase kinase (PDK/BCKDK) kinase | 1.96
evm.TU.supercontig_11.144 | Pyridoxal 5'-phosphate synthase subunits (PdxS/SNZ) | 1.92
evm.TU.supercontig_160.11 | Myo-inositol tetrakisphosphate kinase (ITPK1) | 1.88
evm.TU.supercontig_46.171 | Plant acyltransferase | 1.87
evm.TU.supercontig_81.10 | Thioredoxin | 1.87
evm.TU.supercontig_43.103 | Gamma-glutamylcyclotransferase | 1.84
evm.TU.supercontig_2639.1 | Plant lipid transfer proteins (LTP) | 1.82
evm.TU.supercontig_49.77 | Sweet sugar transporter | 1.74
evm.TU.supercontig_21.18 | Small conductance Mechanosensitive ion channel (MscS) | 1.72
evm.TU.supercontig_892.2 | Class-V pyridoxal-phosphate-dependent aminotransferase | 1.66
evm.TU.supercontig_19.110 | An Arabidopsis Homolog of Trithorax (ATX1) | 1.6

Common down-regulated genes in PM and PS3
evm.TU.supercontig_26.240 | Peptidase S9A | -1.87
evm.TU.supercontig_180.27 | SPIRAL1 | -1.10

Significantly up-regulated genes in PM
evm.TU.supercontig_27.114 | Syntaxin | 4.53
evm.TU.supercontig_142.5 | RING-type zinc finger | 3.13
evm.TU.supercontig_33.178 | Calmodulin | 3.12
evm.TU.supercontig_37.86 | Lysosomal lysine/arginine transporter (Laat-1) | 2.47
evm.TU.supercontig_21.112 | Acyl-CoA-binding protein (ACBP) | 2.21
evm.TU.supercontig_9.335 | Methyltransferase superfamily | 2.07
evm.TU.supercontig_73.34 | Mildew resistance locus o (MLO) | 1.88
evm.TU.supercontig_27.238 | Heavy metal-associated isoprenylated plant protein (HIPP) | 1.81
evm.TU.supercontig_9.337 | FK506-binding proteins (FKBPs) type PPIase | 1.76
evm.TU.supercontig_5.221 | Mitochondrial carrier | 1.73
evm.TU.supercontig_4.43 | Programmed cell death 4 (PDCD4) | 1.71

Significantly down-regulated genes in PM
evm.TU.supercontig_3785.2 | HEATR5 | -1.07
evm.TU.supercontig_30.115 | ZRT/IRT-like protein (ZIP) transporter | -1.02

Glycosyl Hydrolase
evm.TU.supercontig_93.34 | Glycosyl hydrolase 35 | 5.66
evm.TU.supercontig_25.57 | Glycosyl hydrolase 35 | 5.5
evm.TU.supercontig_1071.2 | Glycosyl hydrolase 3 | 4.84
evm.TU.supercontig_59.40 | Glycosyl hydrolase 14 | 3.25
evm.TU.supercontig_15.140 | Glycosyl hydrolase 14 | 2.78
evm.TU.supercontig_46.40 | Glycosyl hydrolase 36 | 2.76
evm.TU.supercontig_2.137 | Glycosyl hydrolase 16 | 2.38
evm.TU.supercontig_9.348 | Glycosyl hydrolase 19 | -2.79

Transcription factors in PM
evm.TU.supercontig_121.10 | Regulator of rDNA transcription protein 15 (ART2/RRT15) | 3.11
evm.TU.supercontig_117.90 | BZIP | 2.47
evm.TU.supercontig_20.155 | ARG7 | 2.02
evm.TU.supercontig_46.118 | Plant ACBP60 protein | 2.01

Heat shock proteins
evm.TU.supercontig_48.3 | Small heat shock protein (HSP20) | 2.63
evm.TU.supercontig_21.23 | ClpX chaperone | 2.39
evm.TU.supercontig_3.54 | Sigma-70 factor | 1.79

Protein families over-represented in PM
evm.TU.supercontig_6.139 | AB hydrolase superfamily | 1.66
evm.TU.supercontig_1323.3 | AB hydrolase superfamily | 1.99
evm.TU.supercontig_85.111 | AB hydrolase superfamily | 1.87
evm.TU.supercontig_69.93 | AB hydrolase superfamily, Lipase | 2.17
evm.TU.supercontig_1.322 | AB hydrolase superfamily, Monoacylglycerol lipase | 1.87
evm.TU.supercontig_18.2 | Class I-like SAM-binding methyltransferase superfamily | 2.42
evm.TU.supercontig_18.51 | Class V-like SAM-binding methyltransferase superfamily | 3.22
evm.TU.supercontig_329.1 | Glycosyltransferase 61 | -3.03
evm.TU.supercontig_1291.2 | Glycosyltransferase 92 | 3.52
evm.TU.supercontig_73.24 | Multi antimicrobial extrusion (MATE) | 2.95
evm.TU.supercontig_73.21 | Multi antimicrobial extrusion (MATE) | 2.48
evm.TU.supercontig_55.57 | Multi antimicrobial extrusion (MATE) | 2.27
evm.TU.supercontig_111.60 | Pentatricopeptide repeat (PPR) PCMP-H subfamily | 2.96
evm.TU.supercontig_131.84 | Pentatricopeptide repeat (PPR) P subfamily | 3.02
evm.TU.supercontig_390.3 | Phenylalanine ammonia lyase (PAL/histidase) | 2.38
evm.TU.supercontig_92.120 | Phenylalanine ammonia lyase (PAL/histidase) | 2.33
evm.TU.supercontig_3.421 | Phosphatidylethanolamine-binding protein family | 2.02
evm.TU.supercontig_3387.2 | Phosphatidylethanolamine-binding protein family | 2.40
evm.TU.supercontig_11.131 | Ser/Thr protein kinase family, WNK subfamily | 3.12
evm.TU.supercontig_748.4 | Ser/Thr protein kinase family | 2.32
evm.TU.supercontig_11.131 | Ser/Thr protein kinase, CDPK subfamily | 3.12
evm.TU.supercontig_11.138 | Ser/Thr protein kinase, WNK subfamily | 3.12

Gene and Protein families encoding similar proteins
evm.TU.supercontig_10.200 | ATP-dependent AMP-binding enzyme | 2.71
evm.TU.supercontig_21.6 | Beta-class carbonic anhydrase | 2.95
evm.TU.supercontig_892.2 | Class-V pyridoxal-phosphate-dependent aminotransferase | 1.66
evm.TU.supercontig_44.124 | Early light induced protein (ELIP)/psbS family | 1.9
evm.TU.supercontig_1.379 | Glutathione S transferase (GST), Tau family | -1.68
evm.TU.supercontig_1.398 | Heat shock protein 70 | 1.51
evm.TU.supercontig_9.251 | Late Embryogenesis Abundant proteins (LEA) type 3 | 2.21
evm.TU.supercontig_286.15 | Peptidase C1 | -1.29
evm.TU.supercontig_468.4 | Peroxidase | 3.67
evm.TU.supercontig_23.100 | Short-chain dehydrogenases/reductases (SDR) | 2.90
evm.TU.supercontig_112.52 | Universal ribosomal protein uL4 | 1.92

Note: Common up- and down-regulated genes and other genes common to PM and PS3 are presented in both Table 4.14 and Table 4.15 to avoid confusion. Tables 4.14 and 4.15 present the cultivar-wise expression profiles (DEGs) for PM and PS3 respectively. Blank cells indicate that no hits were found for the respective gene families against the respective databases.
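Because the table reports fold change on a log2 scale, each value can be converted back to a linear expression ratio by raising 2 to that value. The Python sketch below is a minimal illustration, assuming the column is a base-2 logarithm as its header states; which samples form the numerator and denominator of the ratio is not given in this section, so the result is simply called a ratio. The function name `log2fc_to_ratio` is hypothetical.

```python
def log2fc_to_ratio(log2_fc: float) -> float:
    """Convert a log2 fold change into a linear expression ratio."""
    return 2.0 ** log2_fc


if __name__ == "__main__":
    # Values taken from Table 4.14: DnaJ family (4.08) and Peptidase S9A (-1.87).
    for name, value in [("DnaJ family, C/III subfamily", 4.08), ("Peptidase S9A", -1.87)]:
        print(f"{name}: log2 FC {value} -> ratio {log2fc_to_ratio(value):.2f}")
```

On this scale a log2 fold change of 4.08 corresponds to roughly a 17-fold increase, while any negative value corresponds to a ratio below 1.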


Table 4.15: Transcriptome profile (Differentially expressed genes) of cultivar Pune Selection-3 (PS3).

Gene Id | Protein families | Log2 fold change

Common up-regulated genes in PM and PS3
evm.TU.supercontig_160.23 | Class-IV pyridoxal-phosphate-dependent aminotransferase | 4.60
evm.TU.supercontig_362.2 | DnaJ family, C/III subfamily | 4.30
evm.TU.supercontig_62.66 | Iron/ascorbate-dependent oxidoreductase | 4.01
evm.TU.supercontig_131.80 | Homogentisate dioxygenase | 3.29
evm.TU.supercontig_78.51 | Myo-inositol oxygenase | 3.05
evm.TU.supercontig_122.44 | Amino acid/polyamine transporter 2 | 3.03
evm.TU.supercontig_80.119 | Major facilitator superfamily | 2.85
evm.TU.supercontig_111.7 | Gibberellic acid-stimulated arabidopsis (GASA) | 2.69
evm.TU.supercontig_652.4 | Proline oxidase | 2.68
evm.TU.supercontig_2.435 | Eukaryotic ribosomal protein eL30 | 2.63
evm.TU.supercontig_20.162 | Expansin | 2.38
evm.TU.supercontig_23.35 | Cytochrome P450 | 2.33
evm.TU.supercontig_25.134 | Aldehyde dehydrogenase | 2.25
evm.TU.supercontig_34.88 | Vacuolar protein sorted associated protein 18 (VPS18) | 2.22
evm.TU.supercontig_132.11 | PTR2/POT transporter Proton dependent oligopeptide transporter | 2.14
evm.TU.supercontig_12.97 | Uridine kinase | 2.05
evm.TU.supercontig_5.45 | Inosine-uridine preferring nucleoside hydrolase (IUNH) | 2.03
evm.TU.supercontig_70.105 | Glycerophosphoryl diester phosphodiesterase | 1.97
evm.TU.supercontig_146.27 | Pyruvate dehydrogenase (acetyl-transferring) kinase/Branched chain alpha keto acid dehydrogenase kinase (PDK/BCKDK) kinase | 1.96
evm.TU.supercontig_11.144 | Pyridoxal 5'-phosphate synthase subunits (PdxS/SNZ) | 1.94
evm.TU.supercontig_160.11 | Myo-inositol tetrakisphosphate kinase (ITPK1) | 1.9
evm.TU.supercontig_46.171 | Plant acyltransferase | 1.89
evm.TU.supercontig_81.10 | Thioredoxin | 1.84
evm.TU.supercontig_43.103 | Gamma-glutamylcyclotransferase | 1.76
evm.TU.supercontig_2639.1 | Plant lipid transfer proteins (LTP) | 1.75
evm.TU.supercontig_49.77 | Sweet sugar transporter | 1.72
evm.TU.supercontig_892.2 | Class-V pyridoxal-phosphate-dependent aminotransferase | 1.71
evm.TU.supercontig_21.18 | Small conductance Mechanosensitive ion channel (MscS) | 1.7
evm.TU.supercontig_19.110 | An Arabidopsis Homolog of Trithorax (ATX1) | 1.7

Common down-regulated genes in PM and PS3
evm.TU.supercontig_26.240 | Peptidase S9A | -1.87
evm.TU.supercontig_180.27 | SPIRAL1 | -1.10

Significantly up-regulated genes in PS3
evm.TU.supercontig_5.247 | Casparian strip membrane proteins (CASP) | 5.5
evm.TU.supercontig_138.21 | Defensin like (DEFL) | 5.03
evm.TU.supercontig_222.8 | 'GDXG' lipolytic enzyme | 4.87
evm.TU.supercontig_127.41 | Drug/metabolite transporter (DMT) superfamily | 4.27
evm.TU.supercontig_26.70 | 2-oxoisovalerate dehydrogenase subunit alpha (BCKDHA) | 3.94
evm.TU.supercontig_50.171 | 2-oxoacid dehydrogenase | 3.92
evm.TU.supercontig_62.2 | Acetyl-coenzyme A carboxylase carboxyl transferase subunit beta/Propanoyl-CoA: carbon dioxide subunit beta (AccD/PCCB) | 3.33
evm.TU.supercontig_82.44 | Alanine dehydrogenases and NAD(P) transhydrogenase subunit alpha (AlaDH/PNT) | 3.29
evm.TU.supercontig_97.27 | - | 2.43
evm.TU.supercontig_64.140 | Amino acid/polyamine transporter 2 | 2.13
evm.TU.supercontig_1.226 | Ars A ATPase | 1.89

Significantly down-regulated genes in PS3
evm.TU.supercontig_3.291 | EXO/Phosphate-Responsive Family Protein (EXORDIUM) | -2.67

Glycosyl Hydrolase
evm.TU.supercontig_93.34 | Glycosyl hydrolase 35 | 6.31
evm.TU.supercontig_25.57 | Glycosyl hydrolase 35 | 5.77
evm.TU.supercontig_1071.2 | Glycosyl hydrolase 3 | 4.83
evm.TU.supercontig_15.140 | Glycosyl hydrolase 9 | 3.24
evm.TU.supercontig_59.40 | Glycosyl hydrolase 14 | 3.18
evm.TU.supercontig_46.40 | Glycosyl hydrolase 36 | 2.65
evm.TU.supercontig_2.137 | Glycosyl hydrolase 16 | 2.1
evm.TU.supercontig_116.16 | Glycosyl hydrolase 1 | -1.8
evm.TU.supercontig_9.348 | Glycosyl hydrolase 19 | -2.74
evm.TU.supercontig_48.185 | Glycosyl hydrolase 100 | -4.25
evm.TU.supercontig_16.112 | Glycosyl hydrolase 14 | -4.27

Transcription factors in PS3
evm.TU.supercontig_121.10 | Regulator of rDNA transcription protein 15 (ART2/RRT15) | 3.09
evm.TU.supercontig_117.90 | BZIP | 1.91
evm.TU.supercontig_46.118 | Plant ACBP60 protein | 1.95
evm.TU.supercontig_136.3 | NFYB/HAP3 subunit | -1.34

Heat shock proteins
evm.TU.supercontig_48.3 | Small heat shock protein (HSP20) | 2.62
evm.TU.supercontig_21.23 | ClpX chaperone | 2.43
evm.TU.supercontig_3.54 | Sigma-70 factor | 1.77

Protein families over-represented in PS3
evm.TU.supercontig_6.139 | AB hydrolase superfamily | 1.69
evm.TU.supercontig_69.93 | AB hydrolase superfamily, Lipase family | 2.37
evm.TU.supercontig_18.51 | Class V-like SAM-binding methyltransferase superfamily | 3.22
evm.TU.supercontig_20.130 | Class I-like SAM-binding methyltransferase superfamily | -1.34
evm.TU.supercontig_73.24 | Multi antimicrobial extrusion (MATE) | 2.16
evm.TU.supercontig_73.21 | Multi antimicrobial extrusion (MATE) | 2.52
evm.TU.supercontig_390.3 | Phenylalanine ammonia lyase (PAL/histidase) | 2.43
evm.TU.supercontig_92.120 | Phenylalanine ammonia lyase (PAL/histidase) | 2.60
evm.TU.supercontig_3.421 | Phosphatidylethanolamine-binding protein | 1.98
evm.TU.supercontig_3387.2 | Phosphatidylethanolamine-binding protein | 1.70
evm.TU.supercontig_7.83 | Ser/Thr protein kinase, SNF1 subfamily | -1.22

Gene and Protein families encoding similar proteins
evm.TU.supercontig_10.200 | ATP-dependent AMP-binding enzyme | 2.90
evm.TU.supercontig_21.6 | Beta-class carbonic anhydrase | 3.15
evm.TU.supercontig_12.98 | Class I fructose-bisphosphate aldolase | 1.84
evm.TU.supercontig_131.89 | Class-I pyridoxal-phosphate-dependent aminotransferase | 3.6
evm.TU.supercontig_892.2 | Class-V pyridoxal-phosphate-dependent aminotransferase | 1.71
evm.TU.supercontig_44.124 | Early light induced protein (ELIP)/psbS | 1.93
evm.TU.supercontig_25.180 | 'GDSL' lipolytic enzyme | -1.62
evm.TU.supercontig_9.251 | Late Embryogenesis Abundant proteins (LEA) type 3 | 2.36

Note: Common up- and down-regulated genes and other genes common to PM and PS3 are presented in both Table 4.14 and Table 4.15 to avoid confusion. Tables 4.14 and 4.15 present the cultivar-wise expression profiles (DEGs) for PM and PS3 respectively. Blank cells indicate that no hits were found for the respective gene families against the respective databases.
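Genes shared by the two cultivar profiles can be found by intersecting the gene ids of Tables 4.14 and 4.15. The Python sketch below is an illustration rather than the procedure used in this study. It covers only the lipase, MATE, PAL and phosphatidylethanolamine-binding rows of the over-represented families, plus the SNF1 row from PS3, and the names `pm`, `ps3` and `common_genes` are hypothetical.

```python
# Log2 fold changes keyed by gene id, copied from a subset of rows
# in Table 4.14 (PM) and Table 4.15 (PS3).
pm = {
    "evm.TU.supercontig_69.93": 2.17,
    "evm.TU.supercontig_73.24": 2.95,
    "evm.TU.supercontig_73.21": 2.48,
    "evm.TU.supercontig_55.57": 2.27,
    "evm.TU.supercontig_390.3": 2.38,
    "evm.TU.supercontig_92.120": 2.33,
    "evm.TU.supercontig_3.421": 2.02,
    "evm.TU.supercontig_3387.2": 2.40,
}
ps3 = {
    "evm.TU.supercontig_69.93": 2.37,
    "evm.TU.supercontig_73.24": 2.16,
    "evm.TU.supercontig_73.21": 2.52,
    "evm.TU.supercontig_390.3": 2.43,
    "evm.TU.supercontig_92.120": 2.60,
    "evm.TU.supercontig_3.421": 1.98,
    "evm.TU.supercontig_3387.2": 1.70,
    "evm.TU.supercontig_7.83": -1.22,
}


def common_genes(profile_a, profile_b):
    """Return {gene id: (FC in a, FC in b)} for gene ids present in both profiles."""
    shared = sorted(profile_a.keys() & profile_b.keys())
    return {gene: (profile_a[gene], profile_b[gene]) for gene in shared}


if __name__ == "__main__":
    for gene, (fc_pm, fc_ps3) in common_genes(pm, ps3).items():
        print(f"{gene}: PM {fc_pm}, PS3 {fc_ps3}")
```

For this subset the intersection contains the seven genes named in the text as common to the over-represented categories; genes listed for one cultivar only, such as evm.TU.supercontig_55.57 in PM, drop out.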

4.3.3.15 Discussion

In this study, we analyzed changes in the papaya leaf transcriptome of two cultivars, PM and PS3, in response to PRSV-Pune infection. High-throughput sequencing generated 162,127,585 filtered reads, and 89 % of the total reads were uniquely mapped to the papaya reference genome. These uniquely mapped reads were used for the analysis of differential gene expression and gene ontologies. The transcriptome analysis of PM and PS3 leaves showed that the expression profiles of multiple genes were altered in response to PRSV-Pune in both the susceptible (PM) and tolerant (PS3) papaya cultivars. The genes identified in both cultivars form part of metabolic and cytosolic regulatory networks, disease resistance processes and hormonal signaling pathways. The expression patterns were analyzed using two platforms, RNA-seq and label-free proteome analysis, which gave highly consistent results and supported the accuracy and reliability of the RNA-seq results. The inducible defence response pathways were not completely activated at 21 dpi, because many of the common DEGs were up-regulated and very few were down-regulated in both PM and PS3. These co-regulated genes were responsible for increasing the basal defence response, thereby improving disease resistance against PRSV-Pune infection in both tolerant (PS3) and susceptible (PM) papaya. Genes and proteins involved in the photosynthesis pathway were mainly up-regulated in susceptible PM at 21 dpi but down-regulated in the tolerant cultivar PS3 at 21 dpi. The reason for this divergent response between tolerant and susceptible plants in the photosynthesis pathway has not previously been known in plant-pathogen interactions (Du et al., 2015). Plants have different genes and proteins that are involved in PAMP-triggered immunity (PTI) and effector-triggered immunity (ETI), two distinct defence strategies against pathogen infection. PTI and ETI are plant immune responses generated for protection against pathogen attack (Jones and Dangl, 2006). Gene ontology (GO) terms for biological processes, molecular functions and cellular components were assigned to the up- and down-regulated genes. Different GO terms were obtained for PM and PS3, suggesting that these two papaya cultivars may already have developed different genetic pathways as part of their adaptation to the surrounding environment at an early stage of the growth cycle.

In the present investigation, genes encoding Glycosyl hydrolase families 3, 9, 14, 16 and 35 and expansin proteins were found to be mostly up-regulated in both PM and PS3. In plants, two important groups of proteins, AB hydrolases and Glycosyl hydrolases, are responsible for the degradation and reorganization of cell wall polysaccharides (Minic and Jouanin, 2006). In addition, a group of extracellular proteins known as 'expansins' plays a key role in modifying the mechanical properties of the plant cell wall, leading to turgor-driven cell extension (Li et al., 2002). Vorwerk et al. (2004) reported that cell wall polysaccharides function as signaling molecules that elicit defence responses. The major facilitator superfamily (MFS) is a group of secondary active membrane transporters involved in the transport of a wide range of substrates (Law et al., 2008). According to Liu et al. (2018), the sugar transporter (STP) gene, a type of MFS, plays a crucial role in the distribution of monosaccharides and in various plant metabolic processes. In our observations, genes encoding sweet sugar transporter family proteins were up-regulated in both PM and PS3. Lv et al. (2017) reported that Fructose-1,6-bisphosphate aldolase (FBA) is a key enzyme involved in glycolysis and gluconeogenesis. FBA contributes to growth, development and stress responses in plants. In agreement with Lv et al. (2017), the FBA-encoding gene was up-regulated in PS3 in our study. Carbohydrate metabolism can be altered by both the pathogen and its host to fulfil their own requirements (Heil and Bostock, 2002). Up-regulation of Sweet proteins is essential for plant immunity and for growth of the pathogen (Tauzin et al., 2014; Lanubile et al., 2015).

According to Stiles et al. (2008), soybean Itpks play a vital role in the inositol lipid-independent pathway of InsP6 synthesis. Monoacylglycerol lipase (MAGL) hydrolyses monoacylglycerol (MAG) to fatty acid and glycerol during triacylglycerol breakdown (Kim et al., 2016). Burton et al. (2005) stated that Acyl-CoA-binding protein (ACBP) has a major role in acyl-CoA metabolism, especially in ceramide synthesis and in signaling. In our study, the genes encoding Itpks, MAGL and ACBP were up-regulated in both PM and PS3. According to Chekan et al. (2017), the PCY1 S9A protease is involved in transamidation and yields cyclic products via acyl-enzyme intermediates. In our study, the PCY1 S9A-encoding gene was down-regulated in both PM and PS3. Carbohydrate and amino acid metabolic alterations are responsible for the development of symptoms in response to virus infection (Babu et al., 2008). In line with this, we observed more severe symptoms on systemic leaves of PM than of PS3, suggesting a role for changes in carbohydrate and amino acid metabolism. This was also supported by the GO study, in which down-regulation of polysaccharide binding in PM and of carbohydrate metabolism in PS3 was observed. In our study, both PM and PS3 showed up-regulation of the gene encoding ATX1. According to Alvarez-Venegas et al. (2003), ATX1 is an epigenetic regulator with histone H3K4 methyltransferase activity that has a role in floral development in plants. We observed that PM showed up-regulation of genes encoding the mitochondrial carrier family (MCF) and MATE. The major role of MCF is to facilitate the transport of different substrates, such as folates, S-adenosylmethionine, ADP-glucose, ATP, ADP and AMP, in plant plastids (Haferkamp, 2007).
Similarly, the MATE gene is responsible for the transport of primary and secondary metabolites across several membrane systems of the cell (dos Santos et al., 2017). 3-Deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) is involved in the phenylpropanoid pathway, in which phenylalanine ammonia-lyase (PAL) is an important enzyme that produces secondary metabolites such as lignin, flavonoids, furanocoumarins, phytoalexins and protectants. According to Shadle et al. (2003), PAL plays an important role in the induction of disease resistance upon pathogen attack in plants, whereas Maeda and Dudareva (2012) stated that DAHP is activated by tryptophan and inhibited by arogenate. We observed that the DAHP-encoding gene was up-regulated in PS3; it provides a precursor for the synthesis of a key growth regulator that contributes significantly to biotic stress tolerance, such as tolerance to PRSV-Pune attack in PM and PS3.

Early light-inducible proteins (ELIPs) are closely related to the light-harvesting chlorophyll a/b-binding (Cab) antenna proteins of photosystems I and II. Their main purpose is to perform photoprotective functions in photosynthetic membranes under light stress (Heddad and Adamska, 2002). In our study, the ELIP-encoding gene was up-regulated in both cultivars, PM and PS3. According to Qi et al. (2006), CYP51 sterol demethylases belong to the cytochrome P450 family, play an important role in the synthesis of essential sterols and serve as targets for cholesterol-lowering drugs and antifungal agents. We recorded up-regulation of the CYP51 gene in PS3. It might be that PRSV interferes with chloroplast structure and function during development, and that a shift from non-cyclic to cyclic photophosphorylation produces more ATP to meet the demand of PRSV after infection in papaya. GO terms for photosynthesis were down-regulated in PM, and PS3 showed similar results at the transcriptome level.

Thioredoxin (TRX) aids in the regulation of different immune processes in plants. It acts chiefly as a cysteine reductase and assists in the signaling of different proteins (Mata-Perez and Spoel, 2018). In both PM and PS3, the TRX-encoding gene was up-regulated, indicating a role in signaling in response to PRSV-Pune infection. Mildew Resistance Locus O (MLO) proteins are signaling proteins involved in the activation and priming of defence against P. syringae (Gruner et al., 2018). MLO showed up-regulation in PM, indicating a defence priming response against PRSV.
PS3 showed up-regulation of genes encoding WUSCHEL (WUS) and CLAVATA (CLV) proteins, which provide a feedback loop that regulates the size of the shoot meristem. Wahl (2010) showed that the FAF genes have the potential to modulate the CLV3-WUS feedback loop. Beers et al. (2004) stated that genes of the S8 family of subtilisin-like serine proteases, the C1A family of papain-like cysteine proteases and the A1 family of pepsin-like aspartic proteases have key roles in signal transduction, leaf and embryo development and disease resistance. Down-regulation of the Peptidase C1-encoding gene was seen only in PM, suggesting its importance in susceptibility to PRSV.

According to Alam and Ghosh (2018), several stress-related proteins are up-regulated to combat the biotic stress caused by PRSV. In our study, both PM and PS3 agreed with Alam and Ghosh (2018), showing up-regulation that indicates a strong host response for protection against infection. Catalase produces water from hydrogen peroxide, which is necessary to keep the level of reactive oxygen species (ROS) in the cell low and so protect cells from stress-induced apoptosis (O'Brien et al., 2012). Pyridoxine–pyruvate aminotransferase is associated with vitamin B6 metabolism and catalyzes a reversible transamination reaction between pyridoxine and pyruvate to form pyridoxal and L-alanine. Because of its antioxidant functions, it is involved in abiotic and biotic stress responses in plants (Fitzpatrick et al., 2011). 'Ferroptosis' is an iron-dependent form of programmed cell death caused by the accumulation of lipid peroxides in the absence of glutathione-dependent antioxidant defences (Conlon and Dixon, 2017). Aubert et al. (1998) stated that gibberellin up-regulates genes encoding GASA proteins, which are involved in plant developmental processes. Non-haem iron-containing enzymes such as dioxygenases take part in the biosynthesis of abscisic acid, gibberellins and ethylene as well as flavonoids and alkaloids (Prescott and John, 1996). Over-expression of rice Myo-inositol oxygenase (MIOX) reduced oxidative damage to plant cells under drought stress (Duan et al., 2012). All of the above genes showed up-regulation in both PM and PS3, indicating a similar line of defence against PRSV at the genomic level.
In our study, we recorded up-regulation of genes encoding LEA3, calmodulin-binding nuclear proteins, HIPP3 (Heavy metal-associated isoprenylated plant protein 3) and PDCD4 in PM. LEA3 is highly similar to the indole-3-acetic acid-induced protein ARG2 reported from mungbean (Vigna radiata L.) (George et al., 2009). According to Yang and Poovaiah (2002), calmodulin-binding nuclear proteins encoded by the AtSR gene family have specific CGCG-box DNA-binding activities that assist in many signaling pathways that cause damage to organs in plants. HIPP3 acts as an upstream regulator of stress- and development-dependent pathways of the pathogen response via its bound zinc (Zschiesche et al., 2015), whereas PDCD4 functions in the hormone-mediated signaling response to abiotic stress that causes programmed cell death (Cheng et al., 2013). The defence response shown by PM against PRSV-Pune was observed to differ from that of PS3, indicating a genetic basis for their respective susceptibility and tolerance to PRSV.

Quinlan (2014) showed that the branched-chain 2-oxoacid dehydrogenase (BCKDH), pyruvate dehydrogenase (PDH) and 2-oxoglutarate dehydrogenase (OGDH) complexes are flavin-dependent enzymes that use NAD+ or NADH of the mitochondrial matrix to produce superoxide/H2O2. Defensins are antimicrobial peptides that contain defensin motifs and show tissue-specific expression patterns upon pathogen infection and environmental stress (Graham et al., 2008). We observed up-regulation of BCKDH and defensins in PS3, indicating strong stress tolerance. Glutathione S-transferases (GSTs; Phi and Tau classes in plants) are involved in different biological mechanisms such as cell proliferation, stress response, oxidative factor scavenging, cellular detoxification, cell death regulation and secondary metabolite biosynthesis (Chronopoulou et al., 2017). In the present investigation, the GST-encoding gene showed significant down-regulation in PM, suggesting a low defence response that leads to susceptibility to PRSV-P. The peroxidase–catalase superfamily consists of Classes I, II and III, which are involved in the evolution of bifunctional catalase–peroxidases, carbon recycling by lignin degradation and secondary metabolism respectively (Zamocky et al., 2010). Our results showed down-regulation of peroxidase in PS3, which could be playing a role in breaking down the resistance to PRSV-Pune. EXORDIUM (EXO) genes were down-regulated in the presence of exogenous cytokinin, suggesting that they act as negative regulators of cell division (Farrar et al., 2003).
The Snf1 (sucrose non-fermenting-1)-related protein kinases (SnRKs) have three sub-families in plants, viz. SnRK1, SnRK2 and SnRK3, which together comprise 38 members in Arabidopsis and link metabolic and stress signaling (Halford and Hey, 2009). The EXO and Snf1 genes were found to be down-regulated in PS3, which could help provide a complex regulatory network conferring tolerance to the invading PRSV-Pune pathogen.


In the present work, we recorded that the S-adenosyl-L-methionine–IV (SAM–IV) encoding gene was up-regulated in both cultivars, PM and PS3. According to Weinberg (2008), SAM–IV has the same nucleotide positions as SAM–I. Both riboswitches regulate gene expression by sensing S-adenosylmethionine (SAM or AdoMet). Thus, changes at the level of nuclear mechanisms might be essential for both PRSV and papaya in their counter-defence against each other for survival. FK506, rapamycin and cyclosporine A are immunosuppressive drugs that bind to proteins known as 'immunophilins'. FK506-binding protein (FKBP) family members contribute to functions such as protein folding, receptor signaling, protein trafficking and transcription (Kang et al., 2008). This may be the reason for the up-regulation in PM of FKBP-type PPIases, which have peptidylprolyl cis/trans isomerase (PPIase) activity. According to Yu et al. (2009), the AtECB2 protein is essential for accD RNA editing and early chloroplast biogenesis in Arabidopsis thaliana. In our observations, PS3 showed up-regulation of the AtECB2-encoding gene. The ACBP60 protein family includes proteins such as SARD1, which binds to the ICS1 promoter upstream at GAAATTTTGG (-1217 bp) in response to pathogen attack and positively regulates salicylic acid (SA) synthesis (Zhang et al., 2010; Wang et al., 2015a). In our present work, we recorded post-translational protein modification in both PM and PS3. Lysine methylation is a common post-translational modification found on histone and non-histone proteins. Eukaryotic elongation factor-1 alpha (eEF1A) is a non-histone protein known to play an important role in the protein synthesis machinery, in the regulation of the actin cytoskeleton and in viral replication (Jakobsson et al., 2018).
Uracil phosphoribosyltransferases (UPRTs) are non-essential, energy-saving enzymes that catalyze the formation of uridine monophosphate (UMP) from uracil and phosphoribosylpyrophosphate (Mainguet et al., 2009). The enzymes E1, E2 and E3 take part in protein ubiquitination through the covalent attachment of ubiquitin to target proteins. Among them, the RING-type E3 has a cysteine-rich domain that binds two zinc atoms, together with diverse substrate recognition domains. These are important in target-specific proteolysis and abscisic acid signaling in plants (Stone et al., 2005; Liu and Stone, 2011). The presence of these genes indicated the papaya response in terms of proteolysis and signaling in response to PRSV infection. 'Syntaxins' are membrane-trafficking proteins of the SNARE protein family. As stated by Giovannone et al. (2018), Stx3S expression was responsible for altering the expression of cancer-associated genes and promoted cell proliferation via membrane fusion of the C-terminus. PM showed up-regulation of the RING-type zinc finger family and Syntaxins, which could be related to the necrotic symptoms observed on leaves infected with PRSV that finally resulted in the death of PM plants. The GO study showed enrichment in down-regulation of cell wall catabolism and carbohydrate metabolism in PS3 and down-regulation of programmed cell death in PM. SPR1 is a microtubule-stabilizing protein degraded by the 26S proteasome. The microtubule (MT) disassembly response to salt stress is achieved via SPR1 degradation (Wang et al., 2011). Microtubules are important structural components of the cytoskeleton, giving support and playing a major functional role in the cell (Garner et al., 2004). Our results showed that the SPR1-encoding gene was down-regulated in both PM and PS3, suggesting activation of a response towards invading PRSV-Pune.

In our present investigation, both varieties showed up-regulation of transcription factors and heat shock proteins (Hsp). The basic leucine zipper motif (bZIP) is involved in downstream signaling triggered by PAMPs and plays a key role in pathogen defence (Asai et al., 2002; Bhattarai et al., 2010). Calmodulin-binding protein 60-like (CBP60g) is triggered by microbe-associated molecular patterns (Vlot et al., 2009; Wang et al., 2009b; Wan et al., 2012). In the present study, the Calmodulin-encoding gene showed up-regulation in PM, and Serine/threonine-protein kinase encoding genes were up-regulated in both PM and PS3.
These results indicate that LRR receptor-like serine/threonine-protein kinases, with sub-families such as calcium-dependent protein kinases (CDPKs), play a positive sensory role in plant-pathogen interaction, as noted by Park et al. (2004) and Choi et al. (2009). Weber-Ban et al. (1999) reported that ClpA, a member of the Hsp100 chaperone family, directs the ATP-dependent degradation of substrate proteins bearing specific sequences, while the 19S ATPase degrades ubiquitinated proteins. Up-regulation of HSPs such as HSP20, ClpX and Sigma70 in both PM and PS3 shows the importance and relevance of proteasome-related proteins and HSPs as a line of defence in plant-virus interaction. Virus replication-related genes, such as those encoding nonstructural protein (NSm)-binding proteins of the DnaJ family, have been studied by Soellick et al. (2000) in Nicotiana tabacum and Arabidopsis thaliana. These genes are involved in cell-to-cell viral movement. We observed up-regulation of these genes in both PM and PS3 following PRSV infection, consistent with virus movement in plant cells.
In our study, we recorded up-regulation of toxicity/heavy metal-related genes. The small conductance mechanosensitive ion channel (MscS) family was up-regulated in both cultivars, whereas PS3 also showed up-regulation of the drug/metabolite transporter (DMT) superfamily and the ArsA ATPase family. We observed that PS3 exhibited down-regulation of genes encoding the EXORDIUM (EXO)/phosphate-responsive protein family and NFYB/HAP3 transcription factors. Qu et al. (2015) showed that the TaNFYA-B1 gene from wheat plays an important role in root development and that most NFYAs respond positively to low nitrogen availability. The tolerance of PS3 to PRSV infection may be due to activation of these heavy metal-related genes and proteins, which might be involved in providing nutrients for plant growth and in indirectly supporting the complex network of stress tolerance.

4.4 Label-free proteomic analysis of PRSV-Pune infected leaf samples of C. papaya cultivars PM and PS3

4.4.1 Overview of differential proteome analysis of PRSV-Pune infected leaf samples of C. papaya cultivars PM and PS3

The comparison between infected and healthy (control) plants yielded 131 and 132 quantified proteins in Pusa Majesty (PM) and Pune Selection-3 (PS3), respectively. A differentially expressed protein was defined as "up-regulated" when its expression was higher in the infected sample than in the control, and as "down-regulated" in the opposite case, provided that the abundance change was significant at p ≤ 0.05 with a log2 fold change (FC) cutoff of at least ±0.50. Of the 131 proteins in PM, 10 were up- and 19 down-accumulated; of the 132 proteins in PS3, 24 were up- and 29 down-accumulated (Tables 4.16 and 4.17).
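
The classification rule described above can be illustrated with a minimal Python sketch. It assumes that the fold-change cutoff of ±0.50 is on the log2 scale (as in the "Log 2 fold change" column of Table 4.18); the class name, function name and p-values below are hypothetical and are not taken from the analysis pipeline used in this study.

```python
from dataclasses import dataclass

P_VALUE_CUTOFF = 0.05   # significance threshold stated in the text
LOG2_FC_CUTOFF = 0.50   # minimum absolute fold change (assumed to be log2)


@dataclass
class ProteinResult:
    accession: str
    log2_fc: float   # infected versus control, log2 scale (assumption)
    p_value: float


def classify(protein: ProteinResult) -> str:
    """Label a protein as up-regulated, down-regulated or unchanged."""
    significant = protein.p_value <= P_VALUE_CUTOFF
    large_enough = abs(protein.log2_fc) >= LOG2_FC_CUTOFF
    if not (significant and large_enough):
        return "unchanged"
    return "up-regulated" if protein.log2_fc > 0 else "down-regulated"


# Fold changes for the first two entries come from Table 4.18;
# all p-values and the last two entries are hypothetical.
examples = [
    ProteinResult("XP_021895074", 1.05, 0.01),
    ProteinResult("XP_021902981", -1.08, 0.02),
    ProteinResult("HYPOTHETICAL_1", 0.30, 0.01),  # below the fold-change cutoff
    ProteinResult("HYPOTHETICAL_2", 0.90, 0.20),  # not significant
]
labels = {p.accession: classify(p) for p in examples}
```

Both conditions must hold at once, so a protein with a large fold change but p > 0.05 is still treated as unchanged.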


Table 4.16: Proteomic coverage of PRSV-Pune infected and control leaf samples of cultivars Pusa Majesty (PM) and Pune Selection-3 (PS3).

| Sample | Total number of peptides | Number of quantified proteins |
|---|---|---|
| Control PM | 750 | 131 |
| Infected PM | 764 | |
| Control PS3 | 748 | 132 |
| Infected PS3 | 727 | |
| Total | 2989 | 263 |

Table 4.17: Summary of differential protein expression and gene ontology (GO) analysis of PRSV-Pune infected and control leaf samples of cultivars Pusa Majesty (PM) and Pune Selection-3 (PS3).

| Sample | Number of proteins | GO terms assigned (total GO terms) |
|---|---|---|
| Up-regulated proteins in PM | 10 | 10 (59) |
| Down-regulated proteins in PM | 19 | 16 (106) |
| Up-regulated proteins in PS3 | 24 | 22 (199) |
| Down-regulated proteins in PS3 | 29 | 23 (166) |
| Total | 82 | 71 (530) |

GO terms were assigned to only 72 proteins out of 82 proteins. The total number of GO terms was 530 for the 72 GO-assigned proteins.

4.4.2 Gene ontology analysis of PRSV-Pune infected leaf samples of C. papaya cultivars PM and PS3

Gene ontology analysis of up- and down-regulated proteins in both PM and PS3 was carried out in terms of biological processes, molecular functions and cellular components (Fig. 4.10 and 4.11).


4.4.2.1 Gene ontology analysis for cultivar PM

A total of ten up-regulated proteins were enriched with at least one GO term in PM. Among biological processes, metabolic process accounted for 88.8 %, cellular metabolic process for 77.7 % and response to abiotic stimulus for 44.4 %. Down-regulated biological processes in PM were response to stimulus (57.1 %, the highest), photosynthesis (28.5 %), non-photochemical quenching (14.2 %), defence response to bacterium (14.2 %), response to bacterium (14.2 %), developmental programmed cell death (7.1 %) and glucosinolate catabolic process (7.1 %). Among molecular functions, cobalt ion binding showed 22.2 % enrichment in up-regulated proteins, while the others, such as catalase activity, ribulose-bisphosphate carboxylase activity, alanine-oxo-acid transaminase activity, transition metal ion binding and L-malate dehydrogenase activity, contributed up to 11.1 % each. In the down-regulated categories, tetrapyrrole binding (21.4 %), chlorophyll binding (14.2 %), antioxidant activity (14.2 %) and peroxiredoxin activity (7.1 %) were found in PM. Among cellular components, cytoplasm and intracellular showed 88.8 % involvement, membrane-bounded organelle 77.7 %, chloroplast and plastid 66.6 %, and cytosolic ribosome, chloroplast stroma and ribonucleoprotein complex 33.3 % in up-regulated proteins. Among down-regulated proteins, enrichment was highest in membrane (71.4 %), plastid stroma (71.4 %) and cytoplasmic part (64.2 %).
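
The GO percentages above express the share of GO-annotated proteins in a group that carry a given term. A minimal sketch of that calculation is given below; the function name and the protein-to-term mapping are hypothetical illustrations and do not reproduce the annotation output used in this study.

```python
from collections import Counter


def go_term_percentages(annotations: dict) -> dict:
    """Return, for each GO term, the percentage of proteins annotated with it.

    `annotations` maps a protein identifier to the list of GO terms assigned
    to that protein. Each protein is counted once per term.
    """
    n_proteins = len(annotations)
    if n_proteins == 0:
        return {}
    counts = Counter(
        term for terms in annotations.values() for term in set(terms)
    )
    return {term: round(100 * n / n_proteins, 1) for term, n in counts.items()}


# Hypothetical annotation of four proteins (not data from this study).
annotations = {
    "protein_1": ["metabolic process", "response to abiotic stimulus"],
    "protein_2": ["metabolic process"],
    "protein_3": ["photosynthesis"],
    "protein_4": ["metabolic process", "photosynthesis"],
}
percentages = go_term_percentages(annotations)
```

Because one protein can carry several GO terms, the percentages of different terms within a category do not need to add up to 100 %.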


[Figure 4.10: horizontal bar chart of GO terms for up- and down-regulated proteins in three panels (Biological process, Molecular functions, Cellular components); x-axis from 0 % to 100 %.]

Fig. 4.10: Gene ontology (GO) analysis for PRSV-Pune infected Pusa Majesty (PM) leaf proteins.


4.4.2.2 Gene ontology analysis for cultivar PS3

Among biological processes, up-regulated proteins showed the highest involvement in response to stimulus and metabolic process (up to 70 %), followed by response to stress (60 %) and response to abiotic stimulus (55 %). For down-regulated proteins, cellular process accounted for 80.9 %, metabolic process for 66.6 % and response to stimulus for 61.9 %. Among molecular functions, up-regulated proteins showed lyase activity (15.7 %), while glutathione binding, peptide binding and glutathione transferase activity were 10.5 % each. Down-regulated molecular functions of PS3 proteins included glyceraldehyde-3-phosphate dehydrogenase activity and oxidoreductase activity (9 %) and transporter (4.5 %). Among cellular components, up-regulated proteins in PS3 were highest (94.7 %) for plastid part, intracellular organelle part, organelle part, chloroplast, plastid, cytoplasm and organelle. For down-regulated PS3 proteins, intracellular part (85.7 %), cytoplasm and organelle (80.9 %) and intracellular membrane-bounded organelle (76.1 %) contributed most.


[Figure 4.11: horizontal bar chart of GO terms for up- and down-regulated proteins in three panels (Biological process, Molecular functions, Cellular components); x-axis from 0 % to 100 %.]

Fig. 4.11: Gene ontology (GO) analysis for PRSV-Pune infected Pune Selection-3 (PS3) leaf proteins.


4.4.3 Differential protein expression in cultivars PM and PS3

4.4.3.1 Significantly up-regulated proteins in cultivar PM

A total of six significantly up-regulated proteins were found in PM: the RuBisCO small chain family (1.04 FC), Cyclophilin-type PPIase family (0.89 FC), malate dehydrogenase (MDH) superfamily (0.71 FC) and Catalase family (1.02 FC), along with two other proteins, EP1-like glycoprotein 2 (1.05 FC) and ATP synthase 24 kDa subunit, mitochondrial (0.55 FC) (Table 4.18).

4.4.3.2 Significantly down-regulated proteins in cultivar PM

Photosynthesis-related proteins were down-regulated in PM, including the PsbB/PsbC family (-0.72 FC), ATPase B chain family (-0.71 FC), Psb28 family (-0.69 FC), Peroxiredoxin family (-0.65 FC), Light-harvesting chlorophyll a/b-binding (LHC) protein family (-0.58 FC) and PsbP family (-0.52 FC). The Pectinmethylesterase inhibitor (PMEI) family (-1.08 FC) and the Plasma membrane intrinsic protein (PIP) aquaporin family (-0.86 FC) were also significantly down-regulated in PM (Table 4.18).

Table 4.18: Differential protein expression in cultivar Pusa Majesty (PM) leaf samples infected with PRSV-Pune.

| NCBI accession | Description | Protein families | Log 2 fold change (FC) |
|---|---|---|---|
| Significantly up-regulated proteins in PM | | | |
| XP_021895074 | EP1-like glycoprotein 2 [C. papaya] | | 1.05 |
| XP_021912379 | Ribulose bisphosphate carboxylase small chain, chloroplastic-like isoform X1 [C. papaya] | RuBisCO small chain family | 1.04 |
| XP_021908251 | Catalase isozyme 2-like [C. papaya] | Catalase family | 1.02 |
| XP_021902521 | Peptidyl-prolyl cis-trans isomerase-like [C. papaya] | Cyclophilin-type PPIase family | 0.89 |
| XP_021899518 | Malate dehydrogenase [C. papaya] | Malate dehydrogenase superfamily | 0.71 |
| XP_021895519 | ATP synthase 24 kDa subunit, mitochondrial [C. papaya] | | 0.55 |
| Significantly down-regulated proteins in PM | | | |
| XP_021902981 | Pectinesterase-like [C. papaya] | Pectinmethylesterase inhibitor (PMEI) family | -1.08 |
| XP_021895280 | Aquaporin PIP1-3 [C. papaya] | Plasma membrane intrinsic protein (PIP) aquaporin family | -0.86 |
| XP_016719338 | Photosystem II CP47 reaction center protein-like [Gossypium hirsutum] | PsbB/PsbC family | -0.72 |
| XP_021886985 | ATP synthase subunit b chloroplastic [C. papaya] | ATPase B chain family | -0.71 |
| XP_021893773 | Photosystem II reaction center PSB28 protein, chloroplastic [C. papaya] | Psb28 family | -0.69 |
| YP_001671682 | Chlorophyll a apoprotein A2 [C. papaya] | | -0.67 |
| XP_021894414 | Photosystem I P700 chlorophyll a apoprotein A2 [C. papaya] | Peroxiredoxin family | -0.65 |
| XP_021887635 | Osmotin-like protein [C. papaya] | | -0.59 |
| XP_021903389 | Chlorophyll a-b binding protein CP26, chloroplastic [C. papaya] | Light-harvesting chlorophyll a/b-binding (LHC) protein family | -0.58 |
| XP_021910744 | Thylakoid lumenal 19 kDa protein, chloroplastic [C. papaya] | | -0.54 |
| XP_021887763 | Oxygen-evolving enhancer protein 2, chloroplastic-like [C. papaya] | PsbP family | -0.52 |
| YP_001671683 | Photosystem I P700 chlorophyll a apoprotein A1 [C. papaya] | | -0.52 |
| BAE47589 | Photosystem II protein D2 (chloroplast) [Lactuca sativa] | | -0.52 |
| Protein families up-regulated common with transcriptome study | | | |
| XP_021906879 | Glutathione S-transferase F6-like [C. papaya] | Glutathione S-transferase (GST) superfamily, Phi family | 0.73 |
| XP_021910429 | Heat shock cognate 70 kDa protein 2 [C. papaya] | Heat shock protein 70 family | 0.69 |
| XP_021892143 | 40S ribosomal protein S10-1-like [C. papaya] | Eukaryotic ribosomal protein eS10 family | 0.59 |
| Protein families down-regulated common with transcriptome study | | | |
| P86276 | GDSL esterase/lipase | GDSL lipolytic enzyme family | -1.60 |
| XP_021899783 | Chlorophyll a-b binding protein CP24 10A, chloroplastic [C. papaya] | Early light-inducible protein (ELIP)/psbS family | -0.91 |
| P05994 | C. papaya proteinase 4 | Peptidase C1 family | -0.91 |
| XP_021905490 | Probable acyl-activating enzyme 17, peroxisomal [C. papaya] | ATP-dependent AMP-binding enzyme family | -0.88 |
| XP_021900593 | Thylakoid lumenal 29 kDa protein, chloroplastic [C. papaya] | Peroxidase family | -0.80 |

Note: Common up- and down-regulated proteins and other proteins common to PM and PS3 are presented in both Table 4.18 and Table 4.19 to avoid confusion. Tables 4.18 and 4.19 present the cultivar-wise expression profiles of proteins for PM and PS3, respectively. Blank cells indicate that no hits were found for the respective proteins and protein families against the respective databases.
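
Because Table 4.18 reports fold changes on the log2 scale, it can help to convert them back to linear ratios when reading the values. The short sketch below shows the conversion; the function names are illustrative only and are not part of the original analysis. For instance, a log2 FC of 1.05 corresponds to roughly a 2.07-fold increase, and a log2 FC of -1.08 to about 0.47 times the control abundance.

```python
def log2_to_linear(log2_fc: float) -> float:
    """Convert a log2 fold change to a linear infected/control ratio."""
    return 2.0 ** log2_fc


def percent_change(log2_fc: float) -> float:
    """Express a log2 fold change as a percentage change relative to control."""
    return (log2_to_linear(log2_fc) - 1.0) * 100.0


# Values taken from Table 4.18 (PM): EP1-like glycoprotein 2 and Pectinesterase-like.
ep1_ratio = log2_to_linear(1.05)
pmei_ratio = log2_to_linear(-1.08)
```

On this scale the ±0.50 cutoff corresponds to a linear ratio of about 1.41 (up) or 0.71 (down).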

4.4.3.3 Significantly up-regulated proteins in cultivar PS3

Cultivar PS3 showed more up-regulated proteins than PM (Table 4.19). Significantly up-regulated proteins included Ribulose-1,5-bisphosphate carboxylase/oxygenase large subunit, partial (chloroplast) (Jacaratia corumbensis, 2.36 FC), Chaperonin (HSP60) family (1.74 and 0.54 FC), ATPase B chain family (1.66 and 1.07 FC), RuBisCO small chain family (1.56 FC), Light-harvesting chlorophyll a/b-binding (LHC) protein family (1.30 FC), NAD(P)-dependent epimerase/dehydratase family (0.96 and 0.88 FC), RuBisCO activase family (0.82 FC), ATPase gamma chain family (0.64 FC), malate dehydrogenase (MDH) superfamily (0.59 FC), Catalase family (0.59 FC), Flavin mononucleotide (FMN)-dependent alpha-hydroxy acid dehydrogenase family (0.58 FC) and Fructose bisphosphatase (FBPase) class 1 family (0.52 FC).

4.4.3.4 Significantly down-regulated proteins in cultivar PS3

Significantly down-regulated proteins were slightly more numerous than significantly up-regulated proteins in PS3 (Table 4.19). Photosynthesis-related proteins such as the PsbB/PsbC family (-1.79 FC), PsbQ family (-0.91 FC), PsaE family (-0.88 FC), PsbP family (-0.80 and -0.52 FC), PsaH family (-0.74 FC) and Light-harvesting chlorophyll a/b-binding (LHC) protein family (-0.72 FC) were significantly down-regulated. The Protease inhibitor I3 (leguminous Kunitz-type inhibitor) family (-1.47 FC), Major latex protein (MLP) family (-0.96 FC), Cyclophilin-type PPIase family (-0.68 FC), Pectinmethylesterase inhibitor (PMEI) family (-0.57 FC), Histone H2A family (-0.54 FC), Glyceraldehyde-3-phosphate dehydrogenase family (-0.54 and -0.52 FC) and Triosephosphate isomerase family (-0.50 FC) were also down-regulated along with the photosynthesis-related proteins mentioned above.

Table 4.19: Differential protein expression in cultivar Pune Selection-3 (PS3) leaf samples infected with PRSV-Pune.

| NCBI accession | Description | Protein families | Log 2 fold change |
|---|---|---|---|
| Significantly up-regulated proteins in PS3 | | | |
| AFQ13416 | Ribulose-1,5-bisphosphate carboxylase/oxygenase large subunit, partial (chloroplast) [Jacaratia corumbensis] | | 2.36 |
| XP_021896168 | RuBisCO large subunit-binding protein subunit alpha [C. papaya] | Chaperonin (HSP60) family | 1.74 |
| YP_001671669 | ATP synthase CF0 B subunit [C. papaya] | ATPase B chain family | 1.66 |
| XP_021910306 | Ribulose bisphosphate carboxylase small chain, chloroplastic-like [C. papaya] | RuBisCO small chain family | 1.56 |
| XP_021897916 | Chlorophyll a-b binding protein 6, chloroplastic [C. papaya] | Light-harvesting chlorophyll a/b-binding (LHC) protein family | 1.30 |
| YP_001671691 | Ribulose-1,5-bisphosphate carboxylase/oxygenase large subunit [C. papaya] | | 1.24 |
| XP_021886985 | ATP synthase subunit b', chloroplastic [C. papaya] | ATPase B chain family | 1.07 |
| XP_021895014 | Probable glutathione S-transferase [C. papaya] | Glutathione S-transferase (GST) superfamily, HSP26 family | 1.07 |
| XP_021899026 | Chloroplast stem-loop binding protein of 41 kDa b, chloroplastic [C. papaya] | NAD(P)-dependent epimerase/dehydratase family | 0.96 |
| KDO51169 | Hypothetical protein CISIN_1g0462972mg, partial [Citrus sinensis] | NAD(P)-dependent epimerase/dehydratase family | 0.88 |
| XP_021894604 | Ribulose bisphosphate carboxylase/oxygenase activase, chloroplastic [C. papaya] | RuBisCO activase family | 0.82 |
| XP_021889075 | ATP synthase gamma chain, chloroplastic, partial [C. papaya] | ATPase gamma chain family | 0.64 |
| XP_021903754 | Malate dehydrogenase, glyoxysomal [C. papaya] | Malate dehydrogenase (MDH) superfamily | 0.59 |
| XP_021902483 | Catalase isozyme 1 [C. papaya] | Catalase family | 0.59 |
| XP_021894989 | Peroxisomal (S)-2-hydroxy-acid oxidase [C. papaya] | FMN-dependent alpha-hydroxy acid dehydrogenase family | 0.58 |
| XP_021887635 | Osmotin-like protein [C. papaya] | | 0.56 |
| XP_021888722 | Chaperonin 60 subunit beta 2, chloroplastic [C. papaya] | Chaperonin (HSP60) family | 0.54 |
| XP_021904150 | Sedoheptulose-1,7-bisphosphatase, chloroplastic [C. papaya] | Fructose bisphosphatase (FBPase) class 1 family | 0.52 |
| Significantly down-regulated proteins in PS3 | | | |
| XP_016719338 | Photosystem II CP47 reaction center protein-like [Gossypium hirsutum] | | -2.50 |
| YP_001671683 | Photosystem I P700 chlorophyll a apoprotein A1 [C. papaya] | | -2.29 |
| B1A931 | Photosystem II CP43 reaction center protein | PsbB/PsbC family | -1.79 |
| XP_021909451 | Latex serine proteinase inhibitor-like [C. papaya] | Protease inhibitor I3 (leguminous Kunitz-type inhibitor) family | -1.47 |
| BAE47589 | Photosystem II protein D2 (chloroplast) [Lactuca sativa] | | -1.28 |
| YP_001671682 | Photosystem I P700 chlorophyll a apoprotein A2 [C. papaya] | | -1.25 |
| XP_021905533 | MLP-like protein 43 [C. papaya] | Major latex protein (MLP) family | -0.96 |
| XP_021889816 | Oxygen-evolving enhancer protein 3-2, chloroplastic [C. papaya] | PsbQ family | -0.91 |
| XP_021907348 | Photosystem I reaction center subunit IV B, chloroplastic [C. papaya] | PsaE family | -0.88 |
| XP_021901105 | Cytochrome b6-f complex iron-sulfur subunit 2, chloroplastic [C. papaya] | | -0.83 |
| XP_021910570 | PsbP-like protein 1, chloroplastic [C. papaya] | PsbP family | -0.80 |
| XP_021894937 | Photosystem I reaction center subunit VI-2, chloroplastic-like [C. papaya] | PsaH family | -0.74 |
| XP_021904725 | Chlorophyll a-b binding protein of LHCII type 1 [C. papaya] | Light-harvesting chlorophyll a/b-binding (LHC) protein family | -0.72 |
| XP_021910489 | 40S ribosomal protein S2-4-like [C. papaya] | Cyclophilin-type PPIase family | -0.68 |
| XP_021902521 | Peptidyl-prolyl cis-trans isomerase-like [C. papaya] | | -0.67 |
| YP_001671700 | Photosystem II protein V [C. papaya] | | -0.67 |
| XP_021902981 | Pectinesterase-like [C. papaya] | Pectinmethylesterase inhibitor (PMEI) family | -0.57 |
| XP_021892997 | Histone H2A [C. papaya] | Histone H2A family | -0.54 |
| XP_021895754 | Glyceraldehyde-3-phosphate dehydrogenase A, chloroplastic [C. papaya] | Glyceraldehyde-3-phosphate dehydrogenase family | -0.54 |
| XP_021902809 | Glyceraldehyde-3-phosphate dehydrogenase B, chloroplastic [C. papaya] | Glyceraldehyde-3-phosphate dehydrogenase family | -0.52 |
| XP_021887763 | Oxygen-evolving enhancer protein 2, chloroplastic-like [C. papaya] | PsbP family | -0.52 |
| XP_021910469 | Triosephosphate isomerase, cytosolic [C. papaya] | Triosephosphate isomerase family | -0.50 |
| Protein families up-regulated common with transcriptome study | | | |
| XP_021887045 | Carbonic anhydrase, chloroplastic isoform X1 [C. papaya] | Beta-class carbonic anhydrase family | 1.20 |
| XP_021899177 | Serine--glyoxylate aminotransferase [C. papaya] | Class-V pyridoxal-phosphate-dependent aminotransferase family | 1.10 |
| XP_021906879 | Glutathione S-transferase F6-like [C. papaya] | Glutathione S-transferase (GST) superfamily, Phi family | 0.76 |
| XP_021907901 | Fructose-bisphosphate aldolase 1, chloroplastic [C. papaya] | Class I fructose-bisphosphate aldolase family | 0.74 |
| XP_021902740 | Glutathione S-transferase-like [C. papaya] | Glutathione S-transferase (GST) superfamily, Phi family | 0.54 |
| XP_021891591 | Heat shock 70 kDa protein 6, chloroplastic [C. papaya] | Heat shock protein family, DnaK subfamily | 0.51 |
| Protein families down-regulated common with transcriptome study | | | |
| XP_021888999 | C. papaya proteinase 4 [C. papaya] | Peptidase C1 family | -1.05 |
| ACO95142 | Beta-thioglucoside glucohydrolase [C. papaya] | Glycosyl hydrolase 1 family | -0.77 |
| XP_021894719 | Protochlorophyllide reductase, chloroplastic [C. papaya] | Short-chain dehydrogenases/reductases (SDR) family | -0.75 |
| XP_021906959 | Desiccation-related protein At2g46140 [C. papaya] | Late Embryogenesis Abundant proteins (LEA) type 3 family | -0.71 |
| P10056 | C. papaya peptidase A | Universal ribosomal protein uS5 family | -0.68 |
| P85084 | Endochitinase | Glycosyl hydrolase 19 family | -0.58 |
| XP_021903319 | MDIS1-interacting receptor like kinase 2-like [C. papaya] | Ser/Thr protein kinase family | -0.52 |

Note: Common up- and down-regulated proteins and other proteins common to PM and PS3 are presented in both Table 4.18 and Table 4.19 to avoid confusion. Tables 4.18 and 4.19 present the cultivar-wise expression profiles of proteins for PM and PS3, respectively. Blank cells indicate that no hits were found for the respective proteins and protein families against the respective databases.
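
Since several accessions appear in both Table 4.18 and Table 4.19, the two cultivars can be compared directly. The sketch below is only an illustration: it uses a subset of accessions and log2 fold changes copied from the two tables, and the function name is hypothetical. It lists proteins whose direction of change differs between PM and PS3.

```python
# Subset of shared accessions with log2 fold changes from Tables 4.18 and 4.19.
pm_log2_fc = {
    "XP_021886985": -0.71,  # ATP synthase subunit b, chloroplastic
    "XP_021887635": -0.59,  # Osmotin-like protein
    "XP_021902521": 0.89,   # Peptidyl-prolyl cis-trans isomerase-like
    "XP_021902981": -1.08,  # Pectinesterase-like
    "XP_016719338": -0.72,  # Photosystem II CP47 reaction center protein-like
    "XP_021906879": 0.73,   # Glutathione S-transferase F6-like
}
ps3_log2_fc = {
    "XP_021886985": 1.07,
    "XP_021887635": 0.56,
    "XP_021902521": -0.67,
    "XP_021902981": -0.57,
    "XP_016719338": -2.50,
    "XP_021906879": 0.76,
}


def opposite_direction(first: dict, second: dict) -> list:
    """Return accessions present in both inputs with opposite-sign fold changes."""
    shared = first.keys() & second.keys()
    return sorted(acc for acc in shared if first[acc] * second[acc] < 0)


contrasting = opposite_direction(pm_log2_fc, ps3_log2_fc)
```

With this subset, the ATP synthase subunit b, osmotin-like protein and peptidyl-prolyl cis-trans isomerase-like entries change in opposite directions in the two cultivars, whereas the other three change in the same direction.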

4.4.4 Discussion

The sequencing of the C. papaya genome has made transcriptomic and proteomic studies feasible (Ming et al., 2008). To date, several different proteins have been studied from C. papaya: 186 proteins from latex (Rodrigues et al., 2012; Dhouib et al., 2011), 1581 proteins from chromoplasts (Wang et al., 2013), 71 proteins from leaves (Rodrigues et al., 2011), 54 from fruit pulp (Huerta-Ocampo et al., 2012; Nogueira et al., 2012) and 76 from somatic embryos (de Moura Vale et al., 2014). Rodrigues et al. (2011, 2012) isolated HSP70, latex serine protease inhibitors and other pathosystem-related proteins from C. papaya leaf and latex.
Photosynthesis is a basic and essential process in plants, necessary for growth and maintenance by providing ATP, NADPH and carbon skeletons, and for assisting under various stresses (Kangasjarvi et al., 2014). Various photosynthesis-related proteins such as PsbO, PSI proteins and ATP synthase play a role in viral infection (Lehto et al., 2003; Jang et al., 2013; Siriwan et al., 2013). Abbink et al. (2002) reported that Tobacco mosaic virus (TMV) replication was enhanced by silencing of the gene for the 33K subunit of the oxygen-evolving complex of photosystem II in Nicotiana benthamiana. Photosynthesis-related proteins showed significant differential expression in PRSV-P infected C. papaya, suggesting a crucial role for the light-dependent reactions in plant-virus interaction. Reactive oxygen species (ROS) production from the reduced plastoquinone (PQ) pool, owing to a higher electron flow ratio, leads to the activation of defence responses such as the hypersensitive reaction (HR), stress-related hormones and signaling gene activation in plants (Lehto et al., 2003; Shao et al., 2008; Karpinski et al., 2013; Szechynska-Hebda and Karpinski, 2013; Siriwan et al., 2013). GO terms showed down-regulation of chloroplast, plastid and cellular metabolic process in our study, suggesting an impact of PRSV-Pune on the host photosynthesis machinery.
Eukaryotic ribosomal proteins are essential in the regulation of proteolysis and of different stress responses such as virus infections (Sullivan et al., 2003; Wolf and Hilt, 2004). Jin and Villegas (2006) reported that down-regulation of RPN9 aided phloem development, affecting resistance to the movement of TMV and Turnip mosaic virus (TuMV) in N. benthamiana. The lower abundance of the 26S proteasome at early stages of PMeV infection and the development of C. papaya laticifers were responsible for the osmotic imbalance caused in the cells. Thus, the systemic spread of the virus was negatively affected by the development of vascular tissue via the 26S proteasome (Rodrigues et al., 2009). Down-regulation of glycosyl hydrolase proteins and other cell-wall-associated proteins seems to interfere with PRSV movement. Cell-wall proteins are host factors that affect the local and systemic translocation of viruses (An et al., 2008; Lionetti et al., 2014). These results indicate a correlation between the down-regulation of glycosyl hydrolase cell-wall proteins and the development of systemic symptoms in PRSV-Pune infected leaves of PM and PS3.
Other proteins that play a significant role as pathogenicity or virulence factors were also expressed in response to PRSV-Pune. Malate dehydrogenase (MDH), fructose 1,6-bisphosphatase (FBPase) and glyceraldehyde-3-phosphate dehydrogenase (GAPDH) act as virulence factors in plant-pathogen interaction (Pandey et al., 2018). These three showed differential expression in the present investigation, suggesting their probable role as virulence or pathogenicity factors in the papaya-PRSV interaction.
The MLP43 gene is reported to be a positive regulator of ABA- and drought-stress responses in Arabidopsis, taking part in the regulation of ROS levels, ABA-responsive gene expression, and electrolyte and water balance (Wang et al., 2015b). Major latex protein (MLP) showed differential expression in our study and might be involved in the stress and signaling pathway network of plant-virus interaction.
