Gene xxx (2015) xxx–xxx

Contents lists available at ScienceDirect

Gene

journal homepage: www.elsevier.com/locate/gene

Unc-51 like kinase 1 (ULK1) in silico analysis for biomarker identification: A vital component of autophagy

Rohit Randhawa a, Manika Sehgal a, Tiratha Raj Singh a, Ajay Duseja b, Harish Changotra a,⁎

a Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology, Waknaghat, Solan 173 234, Himachal Pradesh, India
b Department of Hepatology, Postgraduate Institute of Medical Education and Research, Chandigarh 160 012, India

Article info

Article history:
Received 19 October 2014
Received in revised form 3 February 2015
Accepted 5 February 2015
Available online xxxx

Keywords:
Autophagy
ULK1
Palmitoylation
Phosphorylation
Haplotype
Protein–protein interactions

Abstract

Autophagy is a degradation pathway in which the lysosomal machinery breaks down damaged organelles, such as the endoplasmic reticulum and mitochondria, into their building blocks to maintain homeostasis within the cell. ULK1, a serine/threonine kinase, is conserved across species from yeasts to mammals and plays a central role in the autophagy pathway. It receives signals from upstream modulators such as TIP60, mTOR and AMPK and relays them to its downstream substrates like Ambra1 and ZIP kinase. The activity of this complex is regulated through protein–protein interactions and post-translational modifications. Applying in silico analysis, we identified: (i) conserved patterns of ULK1 that reflect the evolutionary relationship between the species, with species of the same family being more closely related to each other than to the others. (ii) A total of 23 TFBS distributed throughout ULK1, of which nuclear factor (erythroid-derived) 2 (NFE2) is of utmost significance because of its high importance rate; NFE2 has already been shown experimentally to play a role in the autophagy pathway. Most of the over-represented TFBS were of the zinc-coordinating class, and we suggest that this information could be used to modulate the pathway by modifying the interactions of these TFs with ULK1. (iii) The CATTT haplotype, found prominently with a frequency of 0.774 in the studied population, and nsSNPs that could have a harmful effect on the ULK1 protein and could be tested further. (iv) A total of 83 phosphorylation sites, of which 26 are already known and 57 are new, including one at a tyrosine residue that could be studied further for its involvement in ULK1 regulation and hence autophagy. Furthermore, 4 palmitoylation sites at positions 426, 927, 1003 and 1049 were found, which could be studied further for their roles in protein–protein interactions as well as in trafficking.

© 2015 Published by Elsevier B.V.

1. Introduction

Autophagy is an evolutionarily conserved degradation pathway in which cytoplasmic portions, including damaged organelles and misfolded or aggregated proteins, are sequestered in double-membrane vesicles called autophagosomes. These contents are then delivered to the lysosomes for degradation, resulting in removal or recycling of damaged or harmful contents from the cell to maintain cellular homeostasis. This pathway is dysregulated in many diseases, including neurodegenerative, inflammatory, muscle, cardiac, infectious and neoplastic diseases. Modulation of the autophagy pathway could therefore help in better therapeutic management of these diseases.

In mammals, autophagy plays an important role in preimplantation development, survival during neonatal starvation, cell differentiation, erythropoiesis and lymphopoiesis. Autophagy is actively induced in all neonatal tissues early during development. The five identified Atg1 homologues in mammals are the uncoordinated (Unc) 51-like kinases (ULK) 1 to 4 and STK36. The carboxy-terminal domain (CTD), which is required for binding to other essential components of autophagy, Atg13 and FIP200, is lacking in ULK3, ULK4 and STK36. Therefore, ULK1 and ULK2 are the primary candidate mammalian Atg1 orthologues essential for induction of autophagy. ULK1 protein expression patterns have been studied (Kundu et al., 2008). The ULK1 knockout mouse was viable and did not show any evident developmental defects, in contrast to other core autophagy genes (Atg3, Atg5, Atg7, Atg9 and Atg16L1), whose deletion led to neonatal lethality. Expression levels of ULK1, but not of ULK2, were elevated during erythroid maturation, suggesting that ULK2 was not involved in this process. Moreover, the authors also showed an important role of ULK1 in selective clearance of mitochondria and ribosomes in reticulocytes. ULK1 may not be essential for murine survival because (i) ULK2, which shows >50% homology with ULK1, is functionally redundant and can induce autophagy, and/or (ii) a ULK1-independent mechanism of autophagy exists. Furthermore, Chan and co-workers have shown that in HEK293 cells ULK1 was critical for inducing autophagy in response to amino acid starvation (Chan et al., 2009). Therefore, the focus of this study was to analyze ULK1, which is a major regulator of autophagy.

ULK1, a serine–threonine kinase, is one of the central human autophagy-related genes and its chromosomal location is 12q24.3.

⁎ Corresponding author. E-mail addresses: [email protected], [email protected] (H. Changotra).

http://dx.doi.org/10.1016/j.gene.2015.02.056 0378-1119/© 2015 Published by Elsevier B.V.

Please cite this article as: Randhawa, R., et al., Unc-51 like kinase 1 (ULK1) in silico analysis for biomarker identification: A vital component of autophagy, Gene (2015), http://dx.doi.org/10.1016/j.gene.2015.02.056 2 R. Randhawa et al. / Gene xxx (2015) xxx–xxx

The ULK1 gene is 28,517 bp long with 28 exons and is translated to 1050 amino acids. ULK1 forms a stable complex with Atg13, FIP200 and Atg101. This complex plays a crucial role in the initiation step of autophagy. ULK1 regulates its substrates and is itself regulated by phosphorylation events. mTOR, AMPK and TIP60 are its well-known upstream regulators. It is hyperphosphorylated in nutrient-rich conditions and dephosphorylated on starvation. So far, around 30 phosphorylation sites have been identified on ULK1, and most of the kinases responsible for its phosphorylation, as well as the functions of these sites, are still unidentified (Mack et al., 2012). This supports the view that phosphorylation events play an important role in ULK1 regulation. Recently, decreased expression of ULK1 has been shown in breast cancer patients, which was associated with cancer progression and low autophagic activity (Tang et al., 2012). However, another study showed higher expression of ULK1 in hepatocellular carcinoma (HCC) patients, in whom higher ULK1 expression was associated with a low survival rate (Xu et al., 2013). These studies indicate different roles of autophagy in different types of cancers and indeed in different diseases. Therefore, ULK1 could be used as a prognostic marker for cancer patients. Moreover, this gene has been shown to be involved in genetic susceptibility to Crohn's disease (CD). Recent studies have shown the association of three SNPs (rs12303764, rs10902469 and rs7488085) with CD (Henckaerts et al., 2011). These variations could be used as prognostic markers in therapeutic interventions after validation in a larger number of patients and in other populations.

In the present study, we computationally analyzed the ULK1 gene and reconstructed its phylogeny, which suggests that ULK1 sequences are most closely related within a family. We identified new TFBS, nsSNPs with their possible phenotypic effects on ULK1 protein function, phosphorylation and palmitoylation sites, and protein–protein interactions. This comprehensive in silico analysis would be helpful to unravel the functions of this gene and to understand its autophagy as well as non-autophagy roles.

2. Material and methods

In this study, an extensive examination of the ULK1 gene was carried out, subdivided into five major sections: (1) identification of site-specific residues and phylogenetic analysis; (2) recognition of regulatory elements and over-represented transcription factor binding sites (TFBS); (3) detection of nsSNPs, their phenotypic effects and quantitative statistical analysis for genetic parameters; (4) elucidation of putative phosphorylation and palmitoylation sites; and (5) protein–protein interaction (PPI) studies.

2.1. Identification of site-specific residues and phylogenetic analysis for ULK1

The analyses began with retrieval of the human protein sequence for ULK1 (GenBank Accession Number: AAC32326) from the National Center for Biotechnology Information (NCBI). Corresponding protein sequences for seven other species were also retrieved and examined for their evolutionary conservation. These species belong to the families Hominidae (Pan troglodytes; GenBank Accession Number: JAA43195), Bovidae (Bos taurus; GenBank Accession Number: NP_001192856), Cricetidae (Cricetulus griseus; GenBank Accession Number: EGW02429), Pteropodidae (Pteropus alecto; GenBank Accession Number: ELK14239), Muridae (Rattus norvegicus; GenBank Accession Number: NP_001101811, Mus musculus; GenBank Accession Number: NP_033495) and Pipidae (Xenopus (Silurana) tropicalis; GenBank Accession Number: NP_001106388).

2.1.1. Evolutionary conserved and variable regions

The genetic variations leading to different phenotypes were analyzed by observing the variable regions in the multiple sequence alignment (MSA) generated for ULK1. The alignment was carried out using multiple sequence comparison by log-expectation (MUSCLE) (Edgar, 2004) and multiple alignment using fast Fourier transform (MAFFT) (Katoh et al., 2002). These programs use log-expectation scores and fast Fourier transform methods, respectively, to provide better average accuracy and speed than other MSA algorithms. Both programs were used with their default parameters.

2.1.2. Evolutionary relationship associated with ULK1

Highly conserved regions play an imperative role in phylogenetic tree reconstruction. Therefore, the evolutionary relationship among eight species was elucidated on the basis of sequence similarities by


Fig. 1. Partial representation of Multiple Sequence Alignment of ULK1 gene for 8 different species. This was carried out using multiple sequence comparison by log-expectation (MUSCLE) and multiple alignment using fast Fourier transform (MAFFT) which use log-expectation scores and fast Fourier transform methods, respectively. Programs were used with their default parameters. Human sequence was taken as a reference and is shown at the top of MSA. Areas in boxes represent various conserved regions in MSA.
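As a rough illustration of how conserved regions like the boxed areas in Fig. 1 can be located once an alignment exists, the sketch below scans an MSA column by column and groups fully conserved columns into blocks. It is only a minimal sketch: the function names, the `min_len` cutoff and the toy sequences are assumptions made for illustration, not part of MUSCLE, MAFFT or the authors' procedure.

```python
from typing import List, Tuple


def conserved_columns(alignment: List[str], gap: str = "-") -> List[int]:
    """Return 0-based column indices where all sequences share one residue."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have equal length")
    conserved = []
    for col in range(length):
        residues = {seq[col] for seq in alignment}
        # A column counts as conserved only if it holds a single non-gap residue.
        if len(residues) == 1 and gap not in residues:
            conserved.append(col)
    return conserved


def conserved_blocks(columns: List[int], min_len: int = 3) -> List[Tuple[int, int]]:
    """Group consecutive conserved columns into inclusive (start, end) blocks."""
    blocks = []
    start = prev = None
    for col in columns:
        if start is None:
            start = prev = col
        elif col == prev + 1:
            prev = col
        else:
            if prev - start + 1 >= min_len:
                blocks.append((start, prev))
            start = prev = col
    if start is not None and prev - start + 1 >= min_len:
        blocks.append((start, prev))
    return blocks


# Toy alignment with made-up sequences (hypothetical, not real ULK1 data).
toy_msa = [
    "MEPGRGGTETVGKF",
    "MEPGRGGVETVGKF",
    "MEPGRAGTE-VGKF",
]
cols = conserved_columns(toy_msa)
blocks = conserved_blocks(cols)
print(blocks)
```

In practice the input would be the aligned sequences written by the alignment program; the block length cutoff is a free choice and would need tuning for a 1050-residue protein.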


Fig. 2. Evolutionary relationship among eight species included in the study. This was done by applying molecular evolutionary genetics analysis 5 (MEGA5). The phylogenetic tree was reconstructed by using maximum parsimony (MP), a character-based method for deducing phylogenetic trees by minimizing the total number of evolutionary steps. The analysis was performed for the verification of inferred tree by taking 1000 bootstrap replicates to generate statistically significant phylogenetic tree.
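The parsimony criterion named in the caption of Fig. 2, minimizing the total number of evolutionary steps, can be sketched with Fitch's small-parsimony algorithm on a fixed rooted binary tree. This is a minimal sketch and not MEGA5's implementation: the tree encoding as nested tuples, the function names and the four toy sequences are all assumptions, and the tree search and bootstrap resampling performed by MEGA5 are not reproduced.

```python
from typing import Dict, Set, Tuple, Union

# A tree is either a leaf name or a pair of subtrees (hypothetical encoding).
Tree = Union[str, tuple]


def fitch_site(tree: Tree, states: Dict[str, str]) -> Tuple[Set[str], int]:
    """Return (candidate ancestral states, minimum changes) for one site."""
    if isinstance(tree, str):
        # Leaf: its observed residue, no changes needed below it.
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_site(left, states)
    right_set, right_cost = fitch_site(right, states)
    common = left_set & right_set
    if common:
        # Children can share a state, so no extra change is required here.
        return common, left_cost + right_cost
    # Disjoint state sets force one additional change at this node.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_length(tree: Tree, alignment: Dict[str, str]) -> int:
    """Sum the minimum number of changes over all alignment columns."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for site in range(n_sites):
        column = {name: seq[site] for name, seq in alignment.items()}
        total += fitch_site(tree, column)[1]
    return total


# Toy data (made up): four short sequences and two candidate topologies.
toy = {"human": "MKAT", "chimp": "MKAT", "mouse": "MRAS", "rat": "MRAT"}
tree_a = (("human", "chimp"), ("mouse", "rat"))
tree_b = (("human", "mouse"), ("chimp", "rat"))
length_a = parsimony_length(tree_a, toy)
length_b = parsimony_length(tree_b, toy)
print(length_a, length_b)
```

Under this criterion the topology with the smaller length is preferred; here the tree that groups the two primates and the two rodents needs fewer changes than the alternative.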

applying molecular evolutionary genetics analysis 5 (MEGA5), which helps to estimate the rates of molecular evolution and deduce ancestral affiliations (Tamura et al., 2011). The phylogenetic reconstruction was achieved by means of maximum parsimony (MP), a character-based method for deducing phylogenetic trees by minimizing the total number of evolutionary steps required to explain a given set of data. Bootstrap analysis with 1000 replicates was also performed to verify the inferred tree and assess its statistical support.

2.2. Identification of regulatory elements and over-represented TFBS

Identification of regulatory elements like enhancers, silencers and repressors involved in controlling the expression of ULK1 provides useful insights into how the gene is regulated and expressed under the influence of these factors. Distant regulatory elements of co-regulated genes (DiRE) (Gotea and Ovcharenko, 2008) and oPOSSUM 3 (Kwon et al., 2012) were used for detection of the regulatory elements in ULK1 and of over-represented transcription factor binding sites, respectively.


Fig. 3. Transcription factor binding sites in ULK1 gene. (a) Distant regulatory elements of co-regulated genes (DiRE) showed that 75% of the total transcription factors were present in UTR and remaining 25% in the intron region. (b) Sequence logos obtained from JASPAR.


Additionally, the JASPAR database (Sandelin et al., 2004) was explored for similar datasets to identify the classes, families and sequence logos of transcription factors (TFs) and their binding sites.

Table 2
List of top ten transcription factors with their rates of occurrence and importance.

| # | Transcription factor | Occurrence | Importance |
|---|---|---|---|
| 1 | NFE2 | 25.00% | 0.48555 |
| 2 | CACCCBINDINGFACTOR | 25.00% | 0.24984 |
| 3 | CHOP | 25.00% | 0.24961 |
| 4 | MEF3 | 25.00% | 0.24805 |
| 5 | GLI | 25.00% | 0.24727 |
| 6 | BARBIE | 25.00% | 0.24727 |
| 7 | ARNT | 25.00% | 0.23340 |
| 8 | WT1 | 25.00% | 0.21587 |
| 9 | BACH2 | 25.00% | 0.20391 |
| 10 | STAT | 25.00% | 0.19805 |

2.3. Identification of nsSNPs, their phenotypic effects and quantitative statistical analyses for genetic parameters

nsSNPs are nucleotide changes that alter an amino acid in the protein sequence. The altered amino acid may or may not affect the function of the protein. When it does, the effect arises from a change in the protein's (1) structure, (2) stability or (3) functional binding sites. To identify nsSNPs, we used the Sorting Intolerant From Tolerant (SIFT) and Polymorphism Phenotyping (PolyPhen) tools, which are popular standard tools for predicting intolerant or damaging variants. They are based on sequence homology, conservation, structure and SWISS-PROT annotation. We analyzed the ULK1 gene computationally for SNPs because experimental methods are complicated, expensive and time consuming. The SNPs were further analyzed for their phenotypic effect in coding sequences. The SIFT algorithm was used for identification of genetic variations leading to diverse phenotypes in ULK1 (Ng and Henikoff, 2003). The prediction is based on the generated SIFT score and relies on the principle of protein conservation, namely that protein evolution correlates strongly with protein function. PolyPhen, which predicts on the basis of sequence, structural and evolutionary annotations of substitutions in proteins (Ramensky et al., 2002), was also used to validate these phenotypic consequences. The profile scores for the two amino acid variants (native and mutant) were calculated and assessed to evaluate their phenotypic effects. On the basis of the scores, PolyPhen categorizes substitutions into three classes, i.e. 'benign', 'possibly damaging' and 'probably damaging'.

The genotype data for the ULK1 gene from the CEU population (Utah residents with Northern and Western European ancestry from the CEPH collection) were retrieved from the International HapMap Project (Thorisson et al., 2005) and analyzed for various quantitative genetic parameters: linkage disequilibrium (LD), haplotypes and SNPs. These parameters represent the combination of alleles on neighboring loci on the chromosome being transmitted together and the involvement of alleles in a non-random mode of inheritance in the population. This analysis was performed using Haploview (Barrett et al., 2005), and the parameters analyzed for genetic association were D′ and r². The D′ value provides a measure of LD between the two blocks, with values closer to zero indicating a higher amount of historical recombination between them, and r² gives the squared correlation coefficient between the two loci under study.

2.4. Elucidation of putative phosphorylation and palmitoylation sites in ULK1

Most processes occurring in a cell are controlled by signaling cascades dependent on phosphorylation/dephosphorylation. ULK1 is a kinase that is itself controlled by phosphorylation events and phosphorylates its substrates to modulate their activity (Bononi et al., 2011; Wu et al., 2014). Hence, detection of phosphorylation sites in ULK1 may unravel important functional aspects of its involvement in various disorders (Olsen et al., 2010; Wang et al., 2010; Shang et al., 2011). The NetPhos algorithm (Blom et al., 1999) was used for the prediction of phosphorylation sites at serine (S), threonine (T) and tyrosine (Y) residues in the ULK1 amino acid sequence. This algorithm uses an artificial neural network (ANN) based method trained on PhosphoBase (Kreegipuu et al., 1999), a database of experimentally validated phosphorylated proteins.

Detection of palmitoylation sites is also an important component of this study. The palmitoylation sites were obtained from CSS-PALM (Zhou et al., 2006), a tool based on the clustering and scoring strategy (CSS) algorithm for the prediction of palmitoylation sites.

Table 1
Regulatory elements, their types and identified transcription factors.

| # | Regulatory element | Type | Score | Gene | Candidate TFBS (relative positions) |
|---|---|---|---|---|---|
| 1 | chr12:130,946,862–130,947,195 | Intron | 2.544 | chr12:130,905,754–130,979,065 ULK1 | CACCCBINDINGFACTOR(120) NFE2(129) BACH2(130) GLI(207) WT1(214) RFX1(307) PR(324) GRE(324) |
| 2 | chr12:130,945,120–130,945,301 | UTR5 | 1.221 | chr12:130,905,754–130,979,065 ULK1 | PAX5(4) NRF1(6) HIC1(9) MTF1(37) ZBRK1(45) |
| 3 | chr12:130,972,084–130,972,437 | UTR3 | 2.027 | chr12:130,905,754–130,979,065 ULK1 | SMAD4(32) MYOGNF1(122) STAT(169) HSF1(205) BARBIE(233) |
| 4 | chr12:130,973,189–130,973,466 | UTR3 | 2.784 | chr12:130,905,754–130,979,065 ULK1 | ARNT(65) CHOP(82) ERR1(151) MEF3(217) PPARG(258) |

Table 3
Over-represented transcription factor binding sites and their annotation.

| TF | JASPAR ID | Class | Family | Target gene hits | Target TFBS hits | Z-score | Fisher score |
|---|---|---|---|---|---|---|---|
| Zfx | MA0146.1 | Zinc-coordinating | BetaBetaAlpha-zinc finger | 1 | 5 | 16.721 | 0.919 |
| Tcfcp2l1 | MA0145.1 | Other | CP2 | 1 | 4 | 12.964 | 0.849 |
| Klf4 | MA0039.2 | Zinc-coordinating | BetaBetaAlpha-zinc finger | 1 | 8 | 11.997 | 0.596 |
| MIZF | MA0131.1 | Zinc-coordinating | BetaBetaAlpha-zinc finger | 1 | 1 | 11.047 | 2.476 |
| Egr1 | MA0162.1 | Zinc-coordinating | BetaBetaAlpha-zinc finger | 1 | 2 | 9.729 | 1.358 |
| SP1 | MA0079.2 | Zinc-coordinating | BetaBetaAlpha-zinc finger | 1 | 6 | 8.687 | 0.692 |
| RORA_2 | MA0072.1 | Zinc-coordinating | Hormone-nuclear receptor | 1 | 1 | 8.63 | 1.936 |
| NR4A2 | MA0160.1 | Zinc-coordinating | Hormone-nuclear receptor | 1 | 5 | 7.765 | 0.577 |
| RORA_1 | MA0071.1 | Zinc-coordinating | Hormone-nuclear receptor | 1 | 2 | 6.65 | 1.054 |
| NFYA | MA0060.1 | Other Alpha-Helix | NFY CCAAT-binding | 1 | 1 | 5.88 | 1.44 |
| Zfp423 | MA0116.1 | Zinc-coordinating | BetaBetaAlpha-zinc finger | 1 | 1 | 5.353 | 1.47 |
| ZEB1 | MA0103.1 | Zinc-coordinating | BetaBetaAlpha-zinc finger | 1 | 8 | 4.153 | 0.349 |
| Foxa2 | MA0047.2 | Winged helix–turn–helix | Forkhead | 1 | 2 | 2.758 | 0.753 |
| MZF1_5-13 | MA0057.1 | Zinc-coordinating | BetaBetaAlpha-zinc finger | 1 | 3 | 2.147 | 0.612 |
| E2F1 | MA0024.1 | Winged helix–turn–helix | E2F | 1 | 1 | 2.033 | 1.115 |
| GABPA | MA0062.2 | Winged helix–turn–helix | Ets | 1 | 1 | 1.883 | 0.973 |
| AP1 | MA0099.2 | Zipper-type | Leucine zipper | 1 | 4 | 1.584 | 0.412 |
| Stat3 | MA0144.1 | Ig-fold | Stat | 1 | 1 | 1.386 | 0.918 |
| Esrrb | MA0141.1 | Zinc-coordinating | Hormone-nuclear receptor | 1 | 1 | 1.181 | 0.89 |
| Myf | MA0055.1 | Zipper-type | Helix–loop–helix | 1 | 1 | 1.031 | 0.891 |

2.5. Protein–Protein Interaction studies for ULK1

Complex interaction networks involving the ULK1 protein, which is often implicated in the same or different pathways as its partners, were constructed with the Search Tool for Retrieval of Interacting Genes and proteins (STRING) version 9.05 (Szklarczyk et al., 2011). The tool predicts the interactions between various proteins based on a confidence score and validates the connections using databases, text mining and gene fusion support. The interaction studies were performed in various modes and with different parameters to obtain a robust network model for ULK1 and its associated interacting partners. The modes include confidence view, evidence view, action mode and the interactive view, used to infer the most appropriate interactions among nodes in the network.

3. Results and discussions

3.1. Phylogenetic analysis shows ULK1 gene is evolutionary conserved and is more closely related in a family

The conserved regions in protein sequences generally signify the integrity and stability of the genome, which further affect the basic cellular processes. The conserved positions are considered to be involved in important functions, active sites of enzymes and binding sites of protein receptors (del Sol et al., 2006). The MSAs generated with the MUSCLE and MAFFT tools for the ULK1 protein were found to be quite similar, and various important conserved patterns were identified. These patterns may have important associations with diseased states and with the evolutionary relationship among these organisms. Fig. 1a illustrates the MSA for the eight sequences, where highly conserved regions are highlighted inside the red blocks, whereas Fig. 1b demonstrates the variations in residues among the regions found in ULK1 of the eight species. The maximum conserved blocks are present in the kinase domain, followed by the S/T-rich region and the CTD. Dots represent conservation with respect to the human ULK1 sequence, while variable characters are shown as amino acids. As discussed above, these sequence similarities (conserved regions) reflect the evolutionary relationship between the species.

For the phylogenetic analysis of the ULK1 protein sequences of the eight species (selected based on availability of complete validated protein sequence data), MEGA5 was used with 1000 bootstrap replicates and the MP method, and an evolutionary tree was reconstructed as shown in Fig. 2. The generated phylogenetic tree helps in understanding the evolutionary relationship between the different species. As shown by the bootstrap values, Homo sapiens and P. troglodytes group together with a value of 100, indicating that ULK1 in these species is nearly identical and could have originated from the same ancestor. Similarly, R. norvegicus and M. musculus group together with a value of 93 and could have the same ancestral origin as C. griseus, with a bootstrap value of 100. X. (Silurana) tropicalis, an amphibian species, shows a vast difference in its origin and evolution from the others when studied on the basis of the ULK1 protein sequence. From these data, we conclude that these sequences tend to be more closely related within a family than between families.

3.2. Various transcription factor binding sites are distributed throughout the ULK1 gene

The regulatory elements in the ULK1 gene were analyzed using the DiRE software with the default value of 5000 for the random set of genes. This in silico approach revealed a total of four potential regulatory elements in the ULK1 gene, of which three are untranslated regions (UTR), corresponding to 75% of the total regulatory region in ULK1, and one is an intron, representing 25% of the total regulatory region as

Table 4
The class-wise categorization of transcription factors detected in ULK1.

| Class name | TFs | Total gene hits | TFBS hits |
|---|---|---|---|
| Zinc-coordinating | Zfx, Klf4, MIZF, Egr1, SP1, RORA_1, RORA_2, NR4A2, Zfp423, ZEB1, Esrrb, MZF1_5-13 | 12 | 43 |
| Zipper-type | AP1, Myf | 2 | 5 |
| Winged helix–turn–helix | Foxa2, E2F1, GABPA | 3 | 4 |
| Ig-fold | Stat3 | 1 | 1 |
| Other alpha-helix | NFYA | 1 | 1 |
| Other | Tcfcp2l1 | 1 | 4 |
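The class-wise totals in Table 4 can be re-derived from the per-TF hits in Table 3 with a simple grouping step. The sketch below transcribes the TF name, class and TFBS hits from Table 3; the variable and function names are assumptions for illustration, and the number of TFs per class equals the "Total gene hits" column only because every TF in Table 3 has exactly one target gene hit.

```python
from collections import defaultdict

# (TF, class, target TFBS hits), transcribed from Table 3.
TABLE3 = [
    ("Zfx", "Zinc-coordinating", 5),
    ("Tcfcp2l1", "Other", 4),
    ("Klf4", "Zinc-coordinating", 8),
    ("MIZF", "Zinc-coordinating", 1),
    ("Egr1", "Zinc-coordinating", 2),
    ("SP1", "Zinc-coordinating", 6),
    ("RORA_2", "Zinc-coordinating", 1),
    ("NR4A2", "Zinc-coordinating", 5),
    ("RORA_1", "Zinc-coordinating", 2),
    ("NFYA", "Other alpha-helix", 1),
    ("Zfp423", "Zinc-coordinating", 1),
    ("ZEB1", "Zinc-coordinating", 8),
    ("Foxa2", "Winged helix-turn-helix", 2),
    ("MZF1_5-13", "Zinc-coordinating", 3),
    ("E2F1", "Winged helix-turn-helix", 1),
    ("GABPA", "Winged helix-turn-helix", 1),
    ("AP1", "Zipper-type", 4),
    ("Stat3", "Ig-fold", 1),
    ("Esrrb", "Zinc-coordinating", 1),
    ("Myf", "Zipper-type", 1),
]


def summarize_by_class(rows):
    """Collect TF names and summed TFBS hits for each structural class."""
    summary = defaultdict(lambda: {"tfs": [], "tfbs_hits": 0})
    for tf, tf_class, hits in rows:
        summary[tf_class]["tfs"].append(tf)
        summary[tf_class]["tfbs_hits"] += hits
    return dict(summary)


summary = summarize_by_class(TABLE3)
total_hits = sum(entry["tfbs_hits"] for entry in summary.values())
zinc_share = summary["Zinc-coordinating"]["tfbs_hits"] / total_hits

for tf_class, entry in summary.items():
    print(tf_class, len(entry["tfs"]), entry["tfbs_hits"])
print("zinc-coordinating share of TFBS hits:", round(zinc_share, 3))
```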


Fig. 4. Linkage disequilibrium plot, SNPs and haplotypes in ULK1. (a) The LD plot identified 14 kb block in ULK1. SNPs in the block are highlighted. The five important SNPs identified were rs9652059, rs1134574, rs7953348, rs11615995 and rs11616018 with minor frequencies of 0.2, 0.085, 0.198, 0.228 and 0.207 respectively. (b) CATTT haplotype was prominently found with frequency of 0.774 in the studied population.
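The minor allele frequencies quoted in the caption of Fig. 4 can be related to the predicted heterozygosity (PredHET) column of Table 5 through the Hardy-Weinberg expectation 2pq. The sketch below assumes that PredHET is computed as 2 × MAF × (1 − MAF), which is how Haploview's marker summary is usually described; the function name is hypothetical, and small deviations are expected because the published MAF values are rounded.

```python
def predicted_heterozygosity(maf: float) -> float:
    """Expected heterozygote frequency 2pq for a biallelic marker."""
    if not 0.0 <= maf <= 0.5:
        raise ValueError("minor allele frequency must lie in [0, 0.5]")
    return 2.0 * maf * (1.0 - maf)


# (MAF, reported PredHET) for the five block SNPs, transcribed from Table 5.
BLOCK_SNPS = {
    "rs9652059": (0.2, 0.32),
    "rs1134574": (0.085, 0.155),
    "rs7953348": (0.198, 0.318),
    "rs11615995": (0.228, 0.352),
    "rs11616018": (0.207, 0.328),
}

# Absolute difference between the recomputed and the reported value.
deviations = {
    snp: abs(predicted_heterozygosity(maf) - reported)
    for snp, (maf, reported) in BLOCK_SNPS.items()
}

for snp, (maf, reported) in BLOCK_SNPS.items():
    print(snp, reported, round(predicted_heterozygosity(maf), 4))
```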

288 shown in Fig. 3a. Among the three UTR regions, two are 3′ UTRs and one of these zinc-coordinating residues gives rise to diverse classes of ZnFs. 321 289 is 5′ UTR. Presence of regulatory elements at 5′ and 3′ indicates that The interactions of these classes of TFs with the ULK1 gene could be 322 290 ULK1 is a complex gene and it could be regulated from both ends utilized to modulate its expression which in turn could alter autophagy 323 291 (Heinrich and Pagtakhan, 2004). Additionally, it is proposed that the pathway. The latter recently has been shown to involve in various 324 292 neighboring generic locations might play a critical role in ULK1 regula- diseases and its manipulation could be utilized for therapeutic interven- 325 293 tion and ultimately in autophagy. A total of 23 TFs in UTR and intron re- tions (Beerli et al., 2000; Pabo et al., 2001; Segal et al., 2004). 326 294 gions were identified using DiRE and are shown along with their locus, 295 positions and score in Table 1. These TFs either bind directly or in the 3.3. Linkage disequilibrium analysis shows that 5 SNPs are linked and in 327 296 form of complex to the transcriptional regulatory region of ULK1 ULK1, 4 nsSNPs exist that have damaging/harmful effect 328 297 which could further control its expression. Among these detected tran- 298 scription factors, CHOP, E2F1, NFE2 and STAT have already been studied In order to analyze various genetic parameters (including linkage 329 299 for their role in ULK1 expression where CHOP and E2F1 enhance the disequilibrium (LD), haplotypes and SNPs) for the ULK1 gene that 330 300 autophagy process whereas NFE2 and STAT suppress the process of au- could give an idea about its involvement in predisposition of various 331 301 tophagy (Deretic et al., 2013; Fullgrabe et al., 2014). All these TFs have diseases, the genotype data for CEU (CEPH—Utah Residents with North- 332 302 equal occurrence rate of 25% (Table 2). 
Based on high importance rate ern and Western European Ancestry) obtained from The International 333 303 (0.48555), nuclear factor (erythroid-derived) 2 (NFE2) is of utmost sig- HapMap Project for ULK1 gene was subjected to extensive statistical 334 304 nificance when compared to the rest of the TFs (importance rate of analysis. These parameters act as vital biomarkers for the functional 335 305 b0.24984) found in the analysis (Table 2). Distribution of TFBS all over association with a variety of diseases. The LD analysis revealed an 336 306 the ULK1 gene suggests multiple points of their action and this informa- important block in the ULK1 gene with five important SNPs having 337 307 tion could further be utilized to understand its regulation (Whitfield non-random association as represented in Fig. 4a. The five important 338 308 et al., 2012). These multiple hotspots could be utilized to control the SNPs identified were rs9652059, rs1134574, rs7953348, rs11615995 339 309 regulation of ULK1 in an efficient way. Furthermore, to substantiate 310 these results, we used oPOSSUM tool to find over-represented TFBS in 311 the promoter region of ULK1. The identified TFs are shown in Tables 3 Table 6 t6:1 Information on two loci under consideration with their statistical inference. t6:2 312 &4with the sequence logos of TFs represented in Fig. 3b. The recogni- 313 tion of TFBS depends on features that consider the search parameters L1 L2 D′ LOD r2 CIlow CIhi t6:3 314 of JASPAR N8bitsandN75% as the threshold of position specificscoring rs9652059 rs1134574 1 7.49 0.363 0.71 1 t6:4 315 matrices. Based on standard parameters, we identified 20 TFs binding to rs9652059 rs7953348 1 24.79 1 0.92 1 t6:5 316 58 sites (Table 3). 
On further grouping based on their class, we observed rs9652059 rs11615995 1 21.21 0.903 0.89 1 t6:6 t6:7 317 that major over-represented TFBS (43/58) in ULK1 were of zinc- rs9652059 rs11616018 1 22.98 0.948 0.89 1 rs9652059 rs12303764 1 5.2 0.151 0.69 1 t6:8 318 – – coordinating class (Table 4), followed by winged helix turn helix, rs9652059 rs3088051 0.835 1.99 0.06 0.35 0.95 t6:9 319 zipper type, alpha helix andUNCORRECTED Ig-fold. Zinc coordinating class of TFs con- rs1134574 rs7953348 PROOF 1 6.99 0.839 0.69 1 t6:10 320 tains zinc fingers (ZnFs), a widespread protein domain, and the spacing rs1134574 rs11615995 1 6.81 0.324 0.7 1 t6:11 rs1134574 rs11616018 1 6.78 0.321 0.69 1 t6:12 rs1134574 rs12303764 1 1.71 0.054 0.31 1 t6:13 t5:1 Table 5 rs1134574 rs3088051 0.57 0.28 0.01 0.05 0.89 t6:14 t5:2 The linkage disequilibrium table representing important SNPs. rs7953348 rs11615995 1 20.57 0.899 0.88 1 t6:15 rs7953348 rs11616018 1 22.98 0.948 0.89 1 t6:16 t5:3 S. no Name ObsHET PredHET HWpval %Geno MAF Alleles rs7953348 rs12303764 1 5.2 0.151 0.69 1 t6:17 t5:4 1 rs9652059 0.267 0.32 0.331 100 0.2 C:T rs7953348 rs3088051 0.833 1.98 0.062 0.35 0.95 t6:18 t5:5 2 rs1134574 0.169 0.155 1 98.9 0.085 A:G rs11615995 rs11616018 1 22.33 0.949 0.9 1 t6:19 t5:6 3 rs7953348 0.259 0.318 0.2774 96.7 0.198 T:C rs11615995 rs12303764 1 5.48 0.189 0.7 1 t6:20 t5:7 4 rs11615995 0.316 0.352 0.6146 94.4 0.228 T:C rs11615995 rs3088051 0.825 1.87 0.065 0.33 0.95 t6:21 t5:8 5 rs11616018 0.276 0.328 0.3664 96.7 0.207 T:C rs11616018 rs12303764 1 5.48 0.159 0.7 1 t6:22 t5:9 6 rs12303764 0.552 0.471 0.3335 96.7 0.379 T:G rs11616018 rs3088051 0.84 2.13 0.067 0.37 0.96 t6:23 t5:10 7 rs3088051 0.424 0.387 0.7676 97.8 0.263 A:G rs12303764 rs3088051 0.616 2.02 0.086 0.27 0.82 t6:24

Please cite this article as: Randhawa, R., et al., Unc-51 like kinase 1 (ULK1) in silico analysis for biomarker identification: A vital component of autophagy, Gene (2015), http://dx.doi.org/10.1016/j.gene.2015.02.056

and rs11616018 with minor allele frequencies of 0.2, 0.085, 0.198, 0.228 and 0.207 respectively as illustrated in Table 5. These SNPs also had r² ≥ 0.8 which further validated the higher correlation between the loci (Table 6). Furthermore, we identified one haplotype block in the ULK1 gene where 5 SNPs and 4 haplotypes were recognized with different population frequencies (Fig. 4b). The CATTT haplotype was the most prominent, with a frequency of 0.774 in the studied population. Coding nsSNPs may have a significant impact on the function of a protein, and such SNPs could be damaging or deleterious (Marín-Martín et al., 2014). We used SIFT and PolyPhen to analyze the deleterious impact of these nsSNPs on the ULK1 protein. The predicted impact of a change in amino acid may be tolerated or damaging depending on the tolerance index and probability scores. We identified 4 nsSNPs, i.e. rs79965940, rs61942435, rs55824543 and rs56364352, which could have harmful functional effects on the ULK1 protein (Table 7). It is proposed that these SNPs and their damaging or deleterious impact could be examined in other populations for their possible involvement in various diseases associated with ULK1 and autophagy.

Table 7
The identified coding non-synonymous SNPs having damaging effects.

                                   SIFT prediction                       PolyPhen prediction
SNP ID       Amino acid change    Tolerance index   Predicted impact    Probability score   Predicted impact
rs79965940   N148T                0.01              Damaging            –                   Benign
rs61942435   A991V                0                 Damaging            –                   Benign
rs55824543   T503M                0.01              Damaging            0.224               Benign
rs56364352   S298L                0.05              Tolerated           1                   Probably damaging

3.4. ULK1 comprises novel phosphorylation and palmitoylation sites

Phosphorylation and palmitoylation sites play a crucial role in protein–protein interactions, and hence in the functions of a protein (Smotrys and Linder, 2004; Watanabe and Osada, 2012). As mentioned earlier, ULK1 activity is controlled by phosphorylation events and, in addition, it phosphorylates its substrates to modulate their activity (Wang et al., 2010; Wu et al., 2014). Most of the processes occurring in a cell are controlled by signaling cascades dependent on phosphorylation/dephosphorylation. Hence, detection of the phosphorylation sites in ULK1 could be helpful to understand its various functional aspects as well as its involvement in various diseases like bone cancer, cervical adenocarcinoma, gastric cancer and lung cancer (Olsen et al., 2010). For prediction of phosphorylation sites at S, T and Y amino acids in ULK1, the NetPhos algorithm was used, and residues with a prediction score ≥ 0.5 were considered phosphorylated. The ordered information retrieved from NetPhos includes the protein ID, the phosphorylated AA in the sequence, a stretch of 9 AAs with the phosphorylated residue at the center, and the score. A complete list of all the phosphorylation sites identified in ULK1 is shown in Supplementary Table 1. Some of the predicted phosphorylation sites have already been verified experimentally and the corresponding PubMed IDs are listed in Table 8. Among these phosphorylation sites, the newly identified ones are shown in Fig. 5 with their position in the domains of the ULK1 protein. We found a new phosphorylation site in ULK1 at a tyrosine (Y) residue at amino acid position 295 in the kinase domain, with a significant score (0.809), which has not been verified experimentally and could further be studied to unravel its impact on the protein function. This tyrosine phosphorylation site is present in the kinase domain and thus may regulate the phosphotransferase activity of the protein.

Table 8
Experimentally verified phosphorylation sites in ULK1.

Name         Position   Context sequence   Score   Prediction   Predicted kinase              PubMed IDs
ULK1_HUMAN   225        FQASSPQDL          0.993   *S*          –                             19807128
ULK1_HUMAN   317        ASPPSLGEM          0.732   *S*          PKCα                          22932492
ULK1_HUMAN   403        GRTPSPSPP          0.924   *S*          GSK3β                         22932492
ULK1_HUMAN   405        TPSPSPPCS          0.987   *S*          GSK3β                         21383122
ULK1_HUMAN   411        PCSSSPSPS          0.895   *S*          GSK3β, ERK1                   22932492
ULK1_HUMAN   460        TPRSSAIRR          0.962   *S*          –                             22932492
ULK1_HUMAN   465        AIRRSGSTS          0.984   *S*          –                             16964243
ULK1_HUMAN   467        RRSGSTSPL          0.986   *S*          AMPK, PKCδ                    19807128
ULK1_HUMAN   469        SGSTSPLGF          0.975   *S*          ERK1                          21383122
ULK1_HUMAN   477        FARASPSPP          0.989   *S*          ERK1                          18691976
ULK1_HUMAN   479        RASPSPPAH          0.891   *S*          CDC2, CDK5                    21383122, 18691976, 19807128
ULK1_HUMAN   495        ARKMSLGGG          0.969   *S*          AMPK, AKT, PKA, PKCδ, PKCμ    22932492
ULK1_HUMAN   533        RGGRSPRPG          0.994   *S*          CDK5                          21383122
ULK1_HUMAN   544        APEHSPRTS          0.986   *S*          CDC2, CDK5                    22932492
ULK1_HUMAN   556        CRLHSAPNL          0.807   *S*          AMPK                          21383122, 21205641, 18669648, 18846507
ULK1_HUMAN   694        GRSFSTSRL          0.987   *S*          AKT                           22932492
ULK1_HUMAN   716        PDPGSTESL          0.894   *S*          CK1                           22932492
ULK1_HUMAN   719        GSTESLQEK          0.574   *S*          CK1                           22932492
ULK1_HUMAN   747        AGGTSSPSP          0.551   *S*          –                             16964243
ULK1_HUMAN   761        GSPPSGSTP          0.575   *S*          –                             16964243
ULK1_HUMAN   775        TRMFSAGPT          0.983   *S*          –                             21383122
ULK1_HUMAN   866        ALKGSASEA          0.886   *S*          –                             19807128
ULK1_HUMAN   1042       ERRLSALLT          0.99    *S*          AMPK, PKA                     19807128
ULK1_HUMAN   456        TQFQTPRSS          0.971   *T*          –                             18669648
ULK1_HUMAN   636        DFPKTPSSQ          0.671   *T*          CDK5                          22932492
ULK1_HUMAN   695        RSFSTSRLT          0.975   *T*          –                             22932492
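As a quick sanity check on Table 8, the sketch below confirms that each 9-residue context has the predicted residue at its center, that every score meets the 0.5 cut-off, and tallies sites per residue type. The rows are transcribed by hand from the table (kinase and PubMed columns omitted), and the helper names are illustrative only.

```python
from collections import Counter

# (position, 9-residue context, score, predicted residue), transcribed from Table 8.
TABLE8 = [
    (225, "FQASSPQDL", 0.993, "S"), (317, "ASPPSLGEM", 0.732, "S"),
    (403, "GRTPSPSPP", 0.924, "S"), (405, "TPSPSPPCS", 0.987, "S"),
    (411, "PCSSSPSPS", 0.895, "S"), (460, "TPRSSAIRR", 0.962, "S"),
    (465, "AIRRSGSTS", 0.984, "S"), (467, "RRSGSTSPL", 0.986, "S"),
    (469, "SGSTSPLGF", 0.975, "S"), (477, "FARASPSPP", 0.989, "S"),
    (479, "RASPSPPAH", 0.891, "S"), (495, "ARKMSLGGG", 0.969, "S"),
    (533, "RGGRSPRPG", 0.994, "S"), (544, "APEHSPRTS", 0.986, "S"),
    (556, "CRLHSAPNL", 0.807, "S"), (694, "GRSFSTSRL", 0.987, "S"),
    (716, "PDPGSTESL", 0.894, "S"), (719, "GSTESLQEK", 0.574, "S"),
    (747, "AGGTSSPSP", 0.551, "S"), (761, "GSPPSGSTP", 0.575, "S"),
    (775, "TRMFSAGPT", 0.983, "S"), (866, "ALKGSASEA", 0.886, "S"),
    (1042, "ERRLSALLT", 0.990, "S"), (456, "TQFQTPRSS", 0.971, "T"),
    (636, "DFPKTPSSQ", 0.671, "T"), (695, "RSFSTSRLT", 0.975, "T"),
]


def center_matches(context: str, residue: str) -> bool:
    """True if the context has 9 residues and its middle one equals `residue`."""
    return len(context) == 9 and context[4] == residue


def summarize(rows, threshold: float = 0.5):
    """Return (positions failing either check, count of sites per residue type)."""
    inconsistent = [
        position
        for position, context, score, residue in rows
        if not center_matches(context, residue) or score < threshold
    ]
    counts = Counter(residue for _, _, _, residue in rows)
    return inconsistent, counts


if __name__ == "__main__":
    inconsistent, counts = summarize(TABLE8)
    print("rows:", len(TABLE8))
    print("inconsistent positions:", inconsistent)
    print("sites per residue:", dict(counts))
```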


Fig. 5. ULK1 domain structure with newly identified phosphorylation sites. NetPhos algorithm was used for prediction of phosphorylation sites at serine (S), threonine (T) and tyrosine (Y) residues. A total of 58 new phosphorylation sites were identified out of which 25 are in kinase domain, 29 are in Ser/Thr rich domain and only 4 are in CT domain.

Identification of palmitoylation sites was an important component of this study. Palmitoylation adds palmitic acid to a protein, which increases its hydrophobicity and helps it associate with membranes (Wu et al., 2014). The identification of these sites would help to understand complex processes such as sub-cellular trafficking between the membrane compartments (Rocks et al., 2005) and protein–protein interactions (Joyoti, 2004) that occur due to palmitoylation. Many autophagy related proteins have been studied for their palmitoylation sites (Mercer et al., 2009). In mammals, ATG101, an important autophagy related protein, interacts with ATG13 (a ULK1 interacting protein), which is a component of macroautophagy (Mercer et al., 2009). ATG101 is localized to the isolation membrane or phagophore, which surrounds the materials to be degraded in the lysosome. The interaction of ATG101 with the phagophore may be due to the palmitoylation site present at the third amino acid, i.e. cysteine (Mercer et al., 2009). Identification of these sites would help us in exploring diverse interactions with ULK1 and their sub-cellular localizations. These palmitoylation sites were obtained from CSS-PALM. In ULK1, we found 4 palmitoylation sites at positions 426, 927, 1003 and 1049 (Fig. 6) which could further be investigated for their role in trafficking and protein–protein interactions. These are the putative positions where palmitic acid may attach to the protein at cysteine residues, anchoring it to the membrane.

3.5. ULK1 interacts with various proteins

The functional property of a protein could be analyzed by exploring its interactions with other proteins and their involvement in different biochemical pathways (Perkins et al., 2010). Several proteins interacting with ULK1 have been identified in this analysis from the PPI network generated by STRING as shown in Fig. 7. The main proteins which were found to interact with ULK1 are gamma-amino butyric acid receptor-associated protein (GABARAP), Activating Molecule in Beclin-1-Regulated Autophagy (AMBRA1), Mammalian target of rapamycin (mTOR), Regulatory-associated protein of mTOR (RPTOR), RB1-inducible coiled-coil protein 1 (RB1CC1), Autophagy-related protein 13 (ATG13), Beclin-1 (BECN1) and Synaptic Ras GTPase-activating protein 1 (SYNGAP1) (Cecconi et al., 2007; Chano et al., 2007; Sun et al., 2010; Van Humbeeck et al., 2011; Pagliarini et al., 2012; Tameno et al., 2012; Johnson et al., 2013; Koike et al., 2013; Tang et al., 2013; Wirth et al., 2013; Yang et al., 2013; Yu et al., 2013). These proteins interact with ULK1 and have been shown to play important roles in autophagy as well as in associated pathways. Further exploration of the structural elements of these interacting proteins could provide clues to their association with a myriad of biological regulatory processes.

4. Conclusion

Autophagy is a lysosomal degradation pathway for damaged cytoplasmic organelles or cytosolic components of a cell, which has recently been shown to be involved in many diseases. Its modulation for therapeutic intervention and better management of diseases in which its dysregulation has been shown is under debate. ULK1, the mammalian homologue of autophagy related gene-1 (Atg1), plays a central role in the autophagy pathway. In this study, we identified in silico TFBS which were present throughout ULK1 and were of zinc coordinating class; CATTT haplotype (0.774) as prominent; four nsSNPs which could have

Fig. 6. The putative palmitoylation sites in ULK1. Four novel putative palmitoylation sites were identified applying CSS-PALM.


Fig. 7. ULK1 interacting partners. STRING (version 9.05) was used to find ULK1 interacting proteins; the scale represents the various search parameters and genes that are mapped through co-expression analysis.

harmful effect on ULK1 protein (Table 7); and 87 and 4 phosphorylation and palmitoylation sites, respectively. We suggest that this information could be utilized in experimental studies to further gain insights into the functions of ULK1.

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.gene.2015.02.056.

Conflict of interest

The authors declare that they have no conflict of interest.

Acknowledgments

This work is supported by funding (BT/PR6784/GBD/27/466/2012) from Department of Biotechnology, India to HC.

References

Barrett, J.C., Fry, B., Maller, J., Daly, M.J., 2005. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 21, 263–265.
Beerli, R.R., Dreier, B., Barbas III, C.F., 2000. Positive and negative regulation of endogenous genes by designed transcription factors. Proc. Natl. Acad. Sci. U. S. A. 97, 1495–1500.
Blom, N., Gammeltoft, S., Brunak, S., 1999. Sequence and structure-based prediction of eukaryotic protein phosphorylation sites. J. Mol. Biol. 294, 1351–1362.
Bononi, A., Agnoletto, C., De Marchi, E., Marchi, S., Patergnani, S., Bonora, M., Giorgi, C., Missiroli, S., Poletti, F., Rimessi, A., Pinton, P., 2011. Protein kinases and phosphatases in the control of cell fate. Enzym. Res. 2011, 329098.
Cecconi, F., Di Bartolomeo, S., Nardacci, R., Fuoco, C., Corazzari, M., Giunta, L., Romagnoli, A., Stoykova, A., Chowdhury, K., Fimia, G.M., Piacentini, M., 2007. A novel role for autophagy in neurodevelopment. Autophagy 3, 506–508.
Chan, E.Y., Longatti, A., McKnight, N.C., Tooze, S.A., 2009. Kinase-inactivated ULK proteins inhibit autophagy via their conserved C-terminal domains using an Atg13-independent mechanism. Mol. Cell. Biol. 29, 157–171.
Chano, T., Okabe, H., Hulette, C.M., 2007. RB1CC1 insufficiency causes neuronal atrophy through mTOR signaling alteration and involved in the pathology of Alzheimer's diseases. Brain Res. 1168, 97–105.
del Sol, A., Fujihashi, H., Amoros, D., Nussinov, R., 2006. Residue centrality, functionally important residues, and active site shape: analysis of enzyme and non-enzyme families. Protein Sci. 15, 2120–2128.
Deretic, V., Saitoh, T., Akira, S., 2013. Autophagy in infection, inflammation and immunity. Nat. Rev. Immunol. 13, 722–737.
Edgar, R.C., 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792–1797.
Fullgrabe, J., Klionsky, D.J., Joseph, B., 2014. The return of the nucleus: transcriptional and epigenetic control of autophagy. Nat. Rev. Mol. Cell Biol. 15, 65–74.
Gotea, V., Ovcharenko, I., 2008. DiRE: identifying distant regulatory elements of co-expressed genes. Nucleic Acids Res. 36, W133–W139.
Heinrich, G., Pagtakhan, C.J., 2004. Both 5′ and 3′ flanks regulate Zebrafish brain-derived neurotrophic factor . BMC Neurosci. 5, 19.
Henckaerts, L., Cleynen, I., Brinar, M., John, J.M., Van Steen, K., Rutgeerts, P., Vermeire, S., 2011. Genetic variation in the autophagy gene ULK1 and risk of Crohn's disease. Inflamm. Bowel Dis. 17, 1392–1397.
Johnson, S.C., Rabinovitch, P.S., Kaeberlein, M., 2013. mTOR is a key modulator of ageing and age-related disease. Nature 493, 338–345.
Joyoti, B., 2004. Protein palmitoylation and dynamic modulation of protein function. Curr. Sci. 87, 6.
Katoh, K., Misawa, K., Kuma, K., Miyata, T., 2002. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30, 3059–3066.
Koike, M., Tanida, I., Nanao, T., Tada, N., Iwata, J., Ueno, T., Kominami, E., Uchiyama, Y., 2013. Enrichment of GABARAP relative to LC3 in the axonal initial segments of neurons. PLoS One 8, e63568.
Kreegipuu, A., Blom, N., Brunak, S., 1999. PhosphoBase, a database of phosphorylation sites: release 2.0. Nucleic Acids Res. 27, 237–239.
Kundu, M., Lindsten, T., Yang, C.Y., Wu, J., Zhao, F., Zhang, J., Selak, M.A., Ney, P.A., Thompson, C.B., 2008. Ulk1 plays a critical role in the autophagic clearance of mitochondria and ribosomes during reticulocyte maturation. Blood 112, 1493–1502.
Kwon, A.T., Arenillas, D.J., Worsley Hunt, R., Wasserman, W.W., 2012. oPOSSUM-3: advanced analysis of regulatory motif over-representation across genes or ChIP-Seq datasets. G3 (Bethesda) 2, 987–1002.
Mack, H.I., Zheng, B., Asara, J.M., Thomas, S.M., 2012. AMPK-dependent phosphorylation of ULK1 regulates ATG9 localization. Autophagy 8, 1197–1214.
Marín-Martín, F.R., Soler-Rivas, C., Martín-Hernández, R., Rodriguez-Casado, A., 2014. A comprehensive in silico analysis of the functional and structural impact of nonsynonymous SNPs in the ABCA1 transporter gene. Cholesterol 19.
Mercer, C.A., Kaliappan, A., Dennis, P.B., 2009. A novel, human Atg13 binding protein, Atg101, interacts with ULK1 and is essential for macroautophagy. Autophagy 5, 649–662.
Ng, P.C., Henikoff, S., 2003. SIFT: predicting amino acid changes that affect protein function. Nucleic Acids Res. 31, 3812–3814.
Olsen, J.V., Vermeulen, M., Santamaria, A., Kumar, C., Miller, M.L., Jensen, L.J., Gnad, F., Cox, J., Jensen, T.S., Nigg, E.A., Brunak, S., Mann, M., 2010. Quantitative phosphoproteomics reveals widespread full phosphorylation site occupancy during mitosis. Sci. Signal. 3, ra3.
Pabo, C.O., Peisach, E., Grant, R.A., 2001. Design and selection of novel Cys2His2 zinc finger proteins. Annu. Rev. Biochem. 70, 313–340.
Pagliarini, V., Wirawan, E., Romagnoli, A., Ciccosanti, F., Lisi, G., Lippens, S., Cecconi, F., Fimia, G.M., Vandenabeele, P., Corazzari, M., Piacentini, M., 2012. Proteolysis of Ambra1 during apoptosis has a role in the inhibition of the autophagic pro-survival response. Cell Death Differ. 19, 1495–1504.
Perkins, J.R., Diboun, I., Dessailly, B.H., Lees, J.G., Orengo, C., 2010. Transient protein–protein interactions: structural, functional, and network properties. Structure 18, 1233–1243.


Ramensky, V., Bork, P., Sunyaev, S., 2002. Human non-synonymous SNPs: server and survey. Nucleic Acids Res. 30, 3894–3900.
Rocks, O., Peyker, A., Kahms, M., Verveer, P.J., Koerner, C., Lumbierres, M., Kuhlmann, J., Waldmann, H., Wittinghofer, A., Bastiaens, P.I., 2005. An acylation cycle regulates localization and activity of palmitoylated Ras isoforms. Science 307, 1746–1752.
Sandelin, A., Alkema, W., Engstrom, P., Wasserman, W.W., Lenhard, B., 2004. JASPAR: an open-access database for eukaryotic transcription factor binding profiles. Nucleic Acids Res. 32, D91–D94.
Segal, D.J., Goncalves, J., Eberhardy, S., Swan, C.H., Torbett, B.E., Li, X., Barbas III, C.F., 2004. Attenuation of HIV-1 replication in primary human cells with a designed zinc finger transcription factor. J. Biol. Chem. 279, 14509–14519.
Shang, L., Chen, S., Du, F., Li, S., Zhao, L., Wang, X., 2011. Nutrient starvation elicits an acute autophagic response mediated by Ulk1 dephosphorylation and its subsequent dissociation from AMPK. Proc. Natl. Acad. Sci. U. S. A. 108, 4788–4793.
Smotrys, J.E., Linder, M.E., 2004. Palmitoylation of intracellular signaling proteins: regulation and function. Annu. Rev. Biochem. 73, 559–587.
Sun, C., Southard, C., Witonsky, D.B., Kittler, R., Di Rienzo, A., 2010. Allele-specific down-regulation of RPTOR expression induced by retinoids contributes to climate adaptations. PLoS Genet. 6, e1001178.
Szklarczyk, D., Franceschini, A., Kuhn, M., Simonovic, M., Roth, A., Minguez, P., Doerks, T., Stark, M., Muller, J., Bork, P., Jensen, L.J., von Mering, C., 2011. The STRING database in 2011: functional interaction networks of proteins, globally integrated and scored. Nucleic Acids Res. 39, D561–D568.
Tameno, H., Chano, T., Ikebuchi, K., Ochi, Y., Arai, A., Kishimoto, M., Shimada, T., Hisa, Y., Okabe, H., 2012. Prognostic significance of RB1-inducible coiled-coil 1 in salivary gland cancers. Head Neck 34, 674–680.
Tamura, K., Peterson, D., Peterson, N., Stecher, G., Nei, M., Kumar, S., 2011. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol. Biol. Evol. 28, 2731–2739.
Tang, J., Deng, R., Luo, R.Z., Shen, G.P., Cai, M.Y., Du, Z.M., Jiang, S., Yang, M.T., Fu, J.H., Zhu, X.F., 2012. Low expression of ULK1 is associated with operable breast cancer progression and is an adverse prognostic marker of survival for patients. Breast Cancer Res. Treat. 134, 549–560.
Tang, Z., Bereczki, E., Zhang, H., Wang, S., Li, C., Ji, X., Branca, R.M., Lehtio, J., Guan, Z., Filipcik, P., Xu, S., Winblad, B., Pei, J.J., 2013. Mammalian target of rapamycin (mTor) mediates tau protein dyshomeostasis: implication for Alzheimer disease. J. Biol. Chem. 288, 15556–15570.
Thorisson, G.A., Smith, A.V., Krishnan, L., Stein, L.D., 2005. The International HapMap Project Web site. Genome Res. 15, 1592–1593.
Van Humbeeck, C., Cornelissen, T., Hofkens, H., Mandemakers, W., Gevaert, K., De Strooper, B., Vandenberghe, W., 2011. Parkin interacts with Ambra1 to induce mitophagy. J. Neurosci. 31, 10249–10261.
Wang, Y.T., Tsai, C.F., Hong, T.C., Tsou, C.C., Lin, P.Y., Pan, S.H., Hong, T.M., Yang, P.C., Sung, T.Y., Hsu, W.L., Chen, Y.J., 2010. An informatics-assisted label-free quantitation strategy that depicts phosphoproteomic profiles in lung cancer cell invasion. J. Proteome Res. 9, 5582–5597.
Watanabe, N., Osada, H., 2012. Phosphorylation-dependent protein–protein interaction modules as potential molecular targets for cancer therapy. Curr. Drug Targets 13, 1654–1658.
Whitfield, T.W., Wang, J., Collins, P.J., Partridge, E.C., Aldred, S.F., Trinklein, N.D., Myers, R.M., Weng, Z., 2012. Functional analysis of transcription factor binding sites in human promoters. Genome Biol. 13, R50.
Wirth, M., Joachim, J., Tooze, S.A., 2013. Autophagosome formation—the role of ULK1 and Beclin1–PI3KC3 complexes in setting the stage. Semin. Cancer Biol. 23, 301–309.
Wu, W., Tian, W., Hu, Z., Chen, G., Huang, L., Li, W., Zhang, X., Xue, P., Zhou, C., Liu, L., Zhu, Y., Li, L., Zhang, L., Sui, S., Zhao, B., Feng, D., 2014. ULK1 translocates to mitochondria and phosphorylates FUNDC1 to regulate mitophagy. EMBO Rep. 15, 566–575.
Xu, H., Yu, H., Zhang, X., Shen, X., Zhang, K., Sheng, H., Dai, S., Gao, H., 2013. UNC51-like kinase 1 as a potential prognostic biomarker for hepatocellular carcinoma. Int. J. Clin. Exp. Pathol. 6, 711–717.
Yang, H., Rudge, D.G., Koos, J.D., Vaidialingam, B., Yang, H.J., Pavletich, N.P., 2013. mTOR kinase structure, mechanism and regulation. Nature 497, 217–223.
Yu, M., Gou, W.F., Zhao, S., Xiao, L.J., Mao, X.Y., Xing, Y.N., Takahashi, H., Takano, Y., Zheng, H.C., 2013. Beclin 1 expression is an independent prognostic factor for gastric carcinomas. Tumour Biol. 34, 1071–1083.
Zhou, F., Xue, Y., Yao, X., Xu, Y., 2006. CSS-PALM: palmitoylation site prediction with a clustering and scoring strategy (CSS). Bioinformatics 22, 894–896.

