Deep Neural Generative Model of Functional MRI Images For

JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2015 1 Deep Neural Generative Model of Functional MRI Images for Psychiatric Disorder Diagnosis Takashi Matsubara, Member, IEEE, Tetsuo Tashiro, Nonmember, and Kuniaki Uehara, Nonmember Abstract—Accurate diagnosis of psychiatric disorders plays a dimensional samples compared to datasets for other machine- critical role in improving the quality of life for patients and learning tasks. Unsophisticated application of machine- potentially supports the development of new treatments. Many learning techniques tends to overfit to training samples and studies have been conducted on machine learning techniques that seek brain imaging data for specific biomarkers of disorders. to fail in generalizing to unknown samples. Many existing These studies have encountered the following dilemma: A direct techniques employed Pearson correlation coefficients (PCC) as classification overfits to a small number of high-dimensional a feature, whereby the PCCs were considered to represent the samples but unsupervised feature-extraction has the risk of functional connectivity between brain regions [10], [12], [13]. extracting a signal of no interest. In addition, such studies Then, the techniques consist of feature-selection, dimension- often provided only diagnoses for patients without presenting the reasons for these diagnoses. This study proposed a deep neural reduction, and classification. Instead of the PCCs, other studies generative model of resting-state functional magnetic resonance employed unsupervised dimension-reduction such as princi- imaging (fMRI) data. The proposed model is conditioned by the pal components analysis (PCA) and independent components assumption of the subject’s state and estimates the posterior analysis (ICA) [4], [5], [7], [8] in order to identify low- probability of the subject’s state given the imaging data, using dimensional dominant patterns directly in each frame or each Bayes’ rule. This study applied the proposed model to diag- nose schizophrenia and bipolar disorders. Diagnostic accuracy time-window and to extract the former as features. Then, was improved by a large margin over competitive approaches, these studies diagnosed subjects using supervised classifiers. namely classifications of functional connectivity, discrimina- These unsupervised feature-selection and dimension-reduction tive/generative models of region-wise signals, and those with approaches are considered to reduce the risk of overfitting. unsupervised feature-extractors. The proposed model visualizes However, they inevitably risk extracting factors unrelated brain regions largely related to the disorders, thus motivating further biological investigation. to the disorder, rather than extracting disorder-related brain activity [14]. Index Terms—deep learning, generative model, functional In contrast, artificial neural networks with deep architectures magnetic resonance imaging, psychiatric-disorder diagnosis, schizophrenia, bipolar disorder (deep neural networks; DNNs) are attracting attention in the machine-learning field (see [15], [16] for a review). They have the ability to approximate arbitrary functions and learn high- I. INTRODUCTION level features from a given dataset automatically, and thereby CCURATE diagnosis of neurological and psychiatric dis- improve performance in classification and regression tasks A orders plays a critical role in improving quality of life for related to images, speech, natural language, and more besides. patients; it provides an opportunity for appropriate treatment Variations of DNNs have been employed for neuroimaging and prevention of further disease progression. Moreover, it datasets. A multilayer perceptron (MLP) has been employed potentially enables the effectiveness of treatments to be evalu- as a supervised classifier [3], [8]. An autoencoder (AE) and its ated and supports the development of new treatments. With variations such as variational autoencoder (VAE) [17] and ad- advances in brain imaging techniques such as (functional) versarial autoencoder (AAE) [18] also have been employed as magnetic resonance imaging (MRI) and positron emission an unsupervised feature-extractor [5], [10]. These approaches tomography (PET) [1], many studies have attempted to find share common difficulties with the aforementioned techniques arXiv:1712.06260v2 [stat.ML] 12 Apr 2019 specific biomarkers of neurological and psychiatric disorders but they are uniquely characterized by their modifiable struc- in brain images using machine learning techniques [2], e.g., tures: The AE can be extended to a deep neural generative for schizophrenia [3], [4], Alzheimer’s disease (AD) [5], [6], model (DGM), which implements relationships between mul- and others [7]–[10]. Resting-state fMRI (rs-fMRI) has received tiple factors (e.g., fMRI images, class labels, imposed tasks, considerable attention [4]–[10]. This approach visualizes inter- and stimuli) in its network structure [17]–[21]. The DGM actions among brain regions in subjects at rest, that is, it does with class labels is no longer just an unsupervised feature- not require subjects to perform tasks and to receive stimuli, extractor but is a generative model of the joint distribution which eliminates potential confounders, e.g., individual task- of data points and class labels. Using Bayes’ rule, the DGM skills [11]. also works as a supervised classifier [22]–[24]. Hence, the Although neuroimaging datasets continue to increase in DGM has the aspects of both a supervised classifier and an size [1], each dataset contains only a small number of high- unsupervised feature-extractor. Several studies have compared simple discriminative and generative models (i.e., logistic T. Matsubara, T. Tashiro, and K. Uehara are with the Graduate regression and naive Bayes). They have revealed theoretically School of System Informatics, Kobe University, Hyogo, Japan e-mail: [email protected]. and experimentally that the generative model classifies a small- Manuscript received April 19, 2005; revised August 26, 2015. sized dataset better than the discriminative model [22], [23], JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2015 2 [25]. While this relationship is not guaranteed to hold for more complicated models like deep neural networks, a DGM φ yi zi;t potentially overcomes the difficulties that both conventional supervised classifiers and unsupervised feature-extractors en- counter. Given the above, this paper proposes a machine-learning- based method of diagnosing psychiatric disorders using a x DGM of rs-fMRI images. Our proposed DGM considers three i;t factors: a feature obtained from an fMRI image, a class label θ (controls or patients), and the remaining frame-wise variability. Ti The frame-wise variability is assumed to represent temporal N states of dynamic functional connectivity, what a subject has in mind at that moment, and other factors that vary over time. Fig. 1. Our proposed generative model of fMRI features (fMRI images or extracted feature vectors) x with diagnosis y and remaining variabilities It also contains signal of no interest (e.g., body motion that i;t i zi;t. preprocessing does not remove successfully). Each subject is expected to belong to one of the classes. Each scan image obtained from a subject is considered to be generated given of an image set obtained from a subject, which is ap- the subject’s class and the remaining frame-wise variability. plicable to other biomedical datasets such as electroen- Then, if a subject’s images are more likely generated given the cephalogram (EEG) data. class of patients rather than the class of controls, the subject is • For (semi-)supervised classification, extant studies em- considered to have the disorder because of Bayes’ rule. Since ployed DGMs with discriminative models (feedforward our proposed DGM explicitly has the class label as a visible MLPs) q(yjx) as internal components [18], [19], [21]; variable, unlike the ordinary AE, it is free from the risk of they still have a larger risk of overfitting of the discrim- not extracting activity of interest. Furthermore, we propose inative models q(yjx). In contrast, our proposed DGM a method for the proposed DGM to evaluate the contribution does not have such a discriminative model q(yjx) but weight of each brain region to the diagnosis, which potentially employs Bayes’ rule for classification; it works well provides a score that assesses the disorder progression. for a small-sized dataset [22]–[24] and achieves higher We evaluate our proposed DGM using open rs-fMRI diagnostic accuracies as shown in Section IV-B. datasets of schizophrenia and bipolar disorders provided by • Unlike MLP and classifiers with feature extractions, the OpenfMRI (https://openfmri.org/dataset/ds000030/). We ob- proposed DGM can measure the contribution weight of tained a region-wise feature vector from each fMRI image by each brain region to the diagnosis as shown in Sec- using an automated anatomical labeling (AAL) template [26]. tion IV-C. This potentially evaluates the disorder progres- Our experimental results demonstrate that our proposed DGM sion and contributes to further biomedical investigations achieves better diagnostic accuracy than existing PCC-based of the underlying mechanisms. approaches [27], [28], frame-wise classification using MLP, Gaussian mixture model (GMM), and MLP with AE, [4], [7], [8], and models of temporal dynamics such as hidden II. DEEP NEURAL GENERATIVE MODEL Markov model (HMM) [5], [29] and long short-term mem- A. Generative

Load more