Challenges in Using Hierarchical Clustering to Identify Asthma Subtypes: Choosing the Variables and Variable Transformation Mate

Challenges in Using Hierarchical Clustering to Identify Asthma Subtypes: Choosing the Variables and Variable Transformation Mate

Challenges in using hierarchical clustering to identify asthma subtypes: choosing the variables and variable transformation Matea Deliu, University of Manchester, Health e-Research Centre, Manchester Tolga Yavuz, University of Manchester, Manchester Matthew Sperrin and Umit Sahiner, Imperial College London, London Danielle Belgrave and Cansin Sackesen Adnan Custovic Omer Kalayci, Haceteppe University, Ankara Introduction The use of unsupervised clustering has identified different subtypes of asthma. Choosing the variables to input into the clustering algorithm is one of the important considerations. The majority of previous studies selected variables based on expert advice, whilst others used dimension reduction techniques such as principal component analysis (PCA). We aimed to compare the results of unsupervised clustering when using raw variables, or variables transformed using dimensionality reduction techniques. Methods We recruited 613 asthmatics aged 6–23 years (Ankara, Turkey). We conducted extensive phenotyping, with 49 variables including demographic data, sensitization, lung function, medication, peripheral eosinophilia, and markers of asthma severity. We performed hierarchical clustering (HC) using: (1) all variables and (2) PCA-transformed variables. Results PCA revealed 5 components describing atopy and variations in asthma severity, which were then used to infer cluster assignment. The optimal HC solution in both PCA-transformed and raw untransformed data identified 5- clusters which were not identical. Both identified mild asthma with good lung function, severe atopic asthma and late- onset atopic mild asthma. Clustering without PCA identified early-onset severe atopic asthma and late-onset atopic asthma with high BMI, whilst early onset non-atopic mild asthma in females was identified in HC with PCA. However, cluster stability was poor. A comparison of the two cluster outputs revealed four key features driving cluster allocations: age of onset, asthma severity, atopic status, and asthma exacerbations. Re-clustering with these features markedly improved cluster stability. Conclusion Different methodologies applied to the same dataset identified differing clusters of asthma. We identified four main features that could be represent a new framework for clustering children with asthma. This will eventually bring us one step closer to identifying heterogeneity and subtypes of asthma thereby paving the way towards precision based medicine. .

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    1 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us