Supplementary Material

Summary of Scan Success Rates Reported by Yerys et al (continued)

Yerys et al [41] reported that, on average, only 50% of all ADHD patients (both on and off MPH) were able to complete an entire fMRI study, compared to an 88% completion rate for controls. They also reported that 95% of both the medicated and unmedicated ADHD subjects successfully completed at least one session in an fMRI battery. Furthermore, the percentage of ADHD children who successfully completed at least one session in an fMRI battery was higher than both the epilepsy group (93%) and the ASD group (81%).

Further Considerations on the Practicalities of Scanning Children and Adolescents

Excessive head motion is the most common cause of failed fMRI scans. Yerys et al [41] reported that medicated ADHD patients had the lowest percentage of runs failed because of excessive head motion, lower even than healthy controls. It is not clear why the overall success rate of the medicated ADHD patients was nevertheless equal to that of the unmedicated ADHD patients; the medicated patients failed a larger number of sessions for reasons classified by the authors as “other”.

A review of thirty-two fMRI studies in ADHD showed that, on average, 16.4 children and adolescents with ADHD and 15.3 controls were included per study, counting only those used in the analysis (the number of ADHD subjects scanned in each of the thirty-two fMRI studies is shown in Supplementary Table 1). The largest number of ADHD patients scanned in a study was 52 [41]; the smallest (not including patients who were scanned but later excluded from the study) was 7 [15]. Desmond and Glover [12] suggest that including 12 subjects in an fMRI study would be adequate to find voxel differences at a lenient significance threshold of p<0.05. This depends entirely on the effect size of interest, however: smaller numbers of subjects give lower power to detect differences that are actually present, and therefore an increased risk of type II errors. They also suggest that the number of subjects must be doubled if a more stringent significance threshold is required. As the average number of ADHD patients and controls combined in the thirty-two studies reviewed was 31.7 subjects, some studies may have required more subjects to obtain higher statistical power [9]. Consequently, studies which have a low number of subjects and find negative results are of less interest, as a null result could be due to a lack of statistical power. However, a study which rejects the null hypothesis, regardless of the number of subjects, is of more interest, as the result is significantly different from what would have been expected by chance.

Statistical power is notoriously difficult to estimate for neuroimaging studies as it varies from brain region to brain region. Some brain regions (e.g. medial orbitofrontal cortex, inferior temporal lobes) adjacent to air-filled spaces in the head (e.g. nasal sinuses, ear canals) are additionally affected by signal dropout (the ‘susceptibility’ artefact), and statistical power is further affected by other factors such as poor image quality (see the main text). For individual-subject MVPA studies it is generally thought that even more subjects are required to achieve high accuracy and generalisability of predictions. In all fMRI studies considered here, the total duration of an fMRI study was restricted to 30 minutes or less.
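The averages quoted above can be checked against Supplementary Table 1. The short Python sketch below copies the per-study counts from that table, in table order, and recomputes the means and the largest and smallest ADHD groups. The variable names are illustrative only. The combined figure of 31.7 is the sum of the two rounded means; the unrounded combined mean works out at 31.75.

```python
from statistics import mean

# (ADHD patients, controls) included in the analysis, one pair per study,
# copied row by row from Supplementary Table 1.
STUDY_COUNTS = [
    (11, 11), (10, 6), (12, 12), (18, 37), (12, 13), (19, 23), (7, 7), (11, 11),
    (22, 22), (20, 9), (10, 14), (12, 13), (19, 0), (14, 12), (16, 16), (11, 11),
    (11, 15), (16, 20), (17, 15), (20, 20), (13, 13), (20, 20), (14, 20), (15, 14),
    (20, 0), (25, 25), (10, 10), (27, 27), (19, 20), (52, 32), (13, 12), (9, 11),
]

adhd = [n_adhd for n_adhd, _ in STUDY_COUNTS]
controls = [n_ctrl for _, n_ctrl in STUDY_COUNTS]

mean_adhd = mean(adhd)          # average ADHD group size per study
mean_controls = mean(controls)  # average control group size per study

print(f"studies reviewed:        {len(STUDY_COUNTS)}")
print(f"mean ADHD patients:      {mean_adhd:.1f}")
print(f"mean controls:           {mean_controls:.1f}")
# Sum of the two rounded means, as quoted in the text.
print(f"combined (rounded sums): {round(mean_adhd, 1) + round(mean_controls, 1):.1f}")
print(f"largest / smallest ADHD group: {max(adhd)} / {min(adhd)}")
```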

Group Level Univariate Analyses of Scanning Data

A common conventional technique for analysing brain images from clinical populations is to apply a t-test at each voxel to test the null hypothesis of no difference between groups [21]. In this statistical test, each small volume (voxel) of the brain is compared with the corresponding voxel across all subjects in each of the groups (e.g. patient and control). The mean and standard deviation at each voxel (whose values correspond to grey matter probabilities in a grey matter segmented image) are computed for both groups and a t-score is calculated at each voxel. Voxels with a t-score exceeding a given significance threshold (typically p = 0.05 or p = 0.01), given the degrees of freedom, indicate brain regions for which the null hypothesis can be rejected. Because this method involves a large number of simultaneous tests, a correction for multiple testing is required. Long-established correction methods in neuroimaging include Family-Wise Error (FWE), False Discovery Rate (FDR), and cluster-based methods [21]. This combination of pre-processing and group-level statistical inference using structural brain scans is referred to as Voxel-Based Morphometry (VBM; http://www.fil.ion.ucl.ac.uk/spm/doc/papers/john_vbm_methods.pdf).

Two meta-analyses of VBM studies of ADHD reported significant grey matter reductions in the putamen/globus pallidus [16] and in the lentiform nucleus extending to the caudate nucleus [26]. The most often reported decreases in child and adolescent ADHD also include the dorsolateral prefrontal cortex, cerebellum and white matter of the corpus callosum [34]. Abell et al [1] reported grey matter decreases in the right paracingulate sulcus and the left inferior frontal gyrus in young adults with autism when compared with controls. Increases in grey matter were also found in the amygdala/peri-amygdaloid cortex, middle temporal gyrus, and inferior temporal gyrus [1].
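As a rough illustration of the voxel-wise procedure described above, the Python sketch below computes a pooled-variance two-sample t-score at every voxel of a synthetic data set and then derives a family-wise-error-corrected threshold from the permutation distribution of the maximum |t| across voxels. The permutation approach is only one of several ways of controlling FWE; it is used here because it needs nothing beyond the standard library, and it is not claimed to be the procedure used in the studies cited. The function names (t_scores, max_t_threshold), the group sizes, the noise level and the simulated reduction in voxels 0 to 4 are all assumptions made for the example.

```python
import random
from statistics import mean, variance

def t_scores(group_a, group_b):
    """Pooled-variance two-sample t-score at every voxel.

    Each group is a list of subjects; each subject is a list of voxel values.
    """
    n_a, n_b = len(group_a), len(group_b)
    scores = []
    # zip(*group) regroups the data voxel by voxel instead of subject by subject.
    for vox_a, vox_b in zip(zip(*group_a), zip(*group_b)):
        pooled_var = ((n_a - 1) * variance(vox_a)
                      + (n_b - 1) * variance(vox_b)) / (n_a + n_b - 2)
        std_err = (pooled_var * (1 / n_a + 1 / n_b)) ** 0.5
        scores.append((mean(vox_a) - mean(vox_b)) / std_err)
    return scores

def max_t_threshold(group_a, group_b, n_perm=200, alpha=0.05, seed=0):
    """|t| threshold controlling FWE, from permuted group labels (max statistic)."""
    rng = random.Random(seed)
    everyone = group_a + group_b          # new outer list; inputs are not modified
    max_stats = []
    for _ in range(n_perm):
        rng.shuffle(everyone)             # random reassignment of group labels
        perm_t = t_scores(everyone[:len(group_a)], everyone[len(group_a):])
        max_stats.append(max(abs(t) for t in perm_t))
    max_stats.sort()
    return max_stats[min(n_perm - 1, round((1 - alpha) * n_perm))]

# Synthetic example (hypothetical values): 12 subjects per group, 40 voxels,
# with a grey matter reduction simulated in voxels 0-4 of the patient group.
data_rng = random.Random(42)
N_VOXELS, N_PER_GROUP, AFFECTED = 40, 12, range(5)

def simulate_subject(is_patient):
    return [data_rng.gauss(0.5, 0.05) - (0.2 if is_patient and v in AFFECTED else 0.0)
            for v in range(N_VOXELS)]

controls = [simulate_subject(False) for _ in range(N_PER_GROUP)]
patients = [simulate_subject(True) for _ in range(N_PER_GROUP)]

t = t_scores(controls, patients)
threshold = max_t_threshold(controls, patients)
significant = [v for v, score in enumerate(t) if abs(score) > threshold]

print(f"FWE-corrected |t| threshold: {threshold:.2f}")
print("voxels exceeding the threshold:", significant)
```

Comparing every |t| against a single threshold taken from the distribution of the maximum is what makes the correction family-wise: under the null hypothesis, the chance that any voxel at all exceeds it is about alpha.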

Mathematical overview of the linear SVM

This section briefly outlines the main mathematics of the SVM as simply as possible. If there are $N$ subjects in a training set, the training data can be written as $\{\mathbf{x}_i, y_i\}$, where $i = 1 \ldots N$, $\mathbf{x}_i$ is a vector of the selected voxels from one subject's image, and $y_i$ is that subject's class label (i.e. $-1$ or $+1$; the two labels are assigned to the classes arbitrarily). As described in the main text, the SVM algorithm uses the training set to identify the decision boundary which best separates the two classes. This decision boundary (or hyperplane) can be described by $\mathbf{w} \cdot \mathbf{x} + b = 0$, where $\mathbf{w}$ is normal to the hyperplane and $b / \|\mathbf{w}\|$ is the perpendicular distance from the hyperplane to the origin. If the classes are linearly separable, there exists a vector $\mathbf{w}$ and a scalar $b$ such that the inequalities:

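$$
\begin{aligned}
\mathbf{w} \cdot \mathbf{x}_i + b &\geq +1 \quad \text{if } y_i = +1 \\
\mathbf{w} \cdot \mathbf{x}_i + b &\leq -1 \quad \text{if } y_i = -1
\end{aligned}
\qquad \text{(Eq. 1)}
$$

are consistent throughout the training set. In order to classify data which are not linearly separable, Cortes and Vapnik [10] introduced a slack variable, $\xi_i$ (where $i = 1 \ldots N$), which allows for misclassified points:

$$
\begin{aligned}
\mathbf{w} \cdot \mathbf{x}_i + b &\geq +1 - \xi_i \quad \text{if } y_i = +1 \\
\mathbf{w} \cdot \mathbf{x}_i + b &\leq -1 + \xi_i \quad \text{if } y_i = -1 \\
\xi_i &\geq 0 \quad \forall i
\end{aligned}
\qquad \text{(Eq. 2)}
$$

which can be combined into:

$$
y_i(\mathbf{w} \cdot \mathbf{x}_i + b) - 1 + \xi_i \geq 0 \quad \text{where } \xi_i \geq 0 \;\; \forall i \qquad \text{(Eq. 3)}
$$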
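For any candidate hyperplane, the smallest slack that satisfies Eq. 3 for subject i is max(0, 1 - y_i(w.x_i + b)). The Python sketch below computes these slacks on a made-up two-dimensional data set and uses them in a plain subgradient descent on the standard soft-margin objective, half the squared norm of w plus C times the summed slacks. This is a minimal illustration under stated assumptions, not the solver used in the cited derivations (which work through a quadratic programme); the toy data, function names, learning rate and number of epochs are all hypothetical.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def slacks(w, b, X, y):
    """Smallest xi_i >= 0 satisfying y_i (w.x_i + b) - 1 + xi_i >= 0 (Eq. 3)."""
    return [max(0.0, 1.0 - yi * (dot(w, xi) + b)) for xi, yi in zip(X, y)]

def objective(w, b, X, y, C):
    """Soft-margin objective: 0.5 * ||w||^2 + C * sum of slacks."""
    return 0.5 * dot(w, w) + C * sum(slacks(w, b, X, y))

def train(X, y, C=1.0, lr=0.01, epochs=2000):
    """Plain subgradient descent on the soft-margin objective."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = list(w), 0.0      # gradient of 0.5 * ||w||^2 is w
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1:  # subject currently has non-zero slack
                grad_w = [g - C * yi * v for g, v in zip(grad_w, xi)]
                grad_b -= C * yi
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict(w, b, x):
    """Class label given by the sign of w.x + b."""
    return 1 if dot(w, x) + b >= 0 else -1

# Toy training set (hypothetical): the last subject carries label -1 but lies
# inside the +1 cluster, so it can only be accommodated through its slack.
X = [(2.0, 2.0), (3.0, 3.0), (2.5, 3.5),
     (0.0, 0.5), (-1.0, 0.0), (0.5, -1.0), (1.8, 1.9)]
y = [1, 1, 1, -1, -1, -1, -1]
C = 1.0

w, b = train(X, y, C=C)
print("w =", [round(wj, 3) for wj in w], " b =", round(b, 3))
print("slacks:", [round(s, 3) for s in slacks(w, b, X, y)])
print("objective at start:", objective([0.0, 0.0], 0.0, X, y, C))
print("objective after training:", round(objective(w, b, X, y, C), 3))
print("prediction for (3, 3):", predict(w, b, (3.0, 3.0)))
```

A larger C makes slack more expensive and pushes the solution towards separating every training subject, at the cost of a narrower margin; a smaller C tolerates more subjects on the wrong side of the margin.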

The SVM optimises the hyperplane by minimising the classification errors and maximising the margin (the distance between the hyperplane and the closest subject of either class label). The introduction of the slack variable (the ‘soft margin’) allows classification to be performed even when there are subjects located on the incorrect side of the margin, as it penalises each misclassified subject as a function of the distance from the hyperplane [10, 20]. The derivation is omitted here (see Bishop [4], Cristianini & Shawe-Taylor [11] and Fletcher [20]); the hyperplane is found by minimising:

$$
\min \; \frac{1}{2}\|\mathbf{w}\|^2 + C \sum_{i=1}^{N} \xi_i \qquad \text{(Eq. 4)}
$$

such that Eq. 3 is satisfied. The parameter $C$ corresponds to the soft-margin parameter outlined in the main text. Once the SVM has identified the optimal hyperplane from the training data (assuming a suitable soft-margin parameter has been selected as described in the main text), novel (test) data can then be classified. Using a linear kernel, a new subject $\{\mathbf{x}^*, y^*\}$ is classified by:

$$
f(\mathbf{x}^*) = \sum_{i=1}^{N} w_i x_i^* + b \qquad \text{(Eq. 5)}
$$

where the sign of $f(\mathbf{x}^*)$ determines which class label the subject is predicted to have. In general, the main equation for classification using SVM (and RVM) is:

$$
f(\mathbf{x}) = \sum_{i=1}^{N} w_i K(\mathbf{x}, \mathbf{x}_i) + b \qquad \text{(Eq. 6)}
$$

where $K(\mathbf{x}, \mathbf{x}_i)$ describes the kernel function which has been selected (e.g. linear, polynomial, radial basis function, etc.) [4].

Supplementary References

1. Abell F, Krams M, Ashburner J, Passingham R, Friston K, Frackowiak R, Happe F, Frith C, Frith U (1999) The neuroanatomy of autism: a voxel-based whole brain analysis of structural scans. Neuroreport 10:1647-1651
2. Adler CM, Delbello MP, Mills NP, Schmithorst V, Holland S, Strakowski SM (2005) Comorbid ADHD is associated with altered patterns of neuronal activation in adolescents with bipolar disorder performing a simple attention task. Bipolar Disorders 7:577-588
3. Anderson CM, Polcari A, Lowen SB, Renshaw PF, Teicher MH (2002) Effects of methylphenidate on functional magnetic resonance relaxometry of the cerebellar vermis in boys with ADHD. The American Journal Of Psychiatry 159:1322-1328
4. Bishop CM (2006) Pattern recognition and machine learning. Springer
5. Booth JR, Burman DD, Meyer JR, Lei Z, Trommer BL, Davenport ND, Li W, Parrish TB, Gitelman DR, Mesulam MM (2005) Larger deficits in brain networks for response inhibition than for visual selective attention in attention deficit hyperactivity disorder (ADHD). Journal Of Child Psychology And Psychiatry, And Allied Disciplines 46:94-111
6. Brotman MA, Rich BA, Guyer AE, Lunsford JR, Horsey SE, Reising MM, Thomas LA, Fromm SJ, Towbin K, Pine DS, Leibenluft E (2010) Amygdala activation during emotion processing of neutral faces in children with severe mood dysregulation versus ADHD or bipolar disorder. The American Journal Of Psychiatry 167:61-69
7. Cao Q, Zang Y, Zhu C, Cao X, Sun L, Zhou X, Wang Y (2008) Alerting deficits in children with attention deficit/hyperactivity disorder: event-related fMRI evidence. Brain Research 1219:159-168
8. Cao X, Cao Q, Long X, Sun L, Sui M, Zhu C, Zuo X, Zang Y, Wang Y (2009) Abnormal resting-state functional connectivity patterns of the putamen in medication-naive children with attention deficit hyperactivity disorder. Brain Research 1303:195-206
9. Cohen J (1977) Statistical power analysis for the behavioral sciences. Academic Press
10. Cortes C, Vapnik V (1995) Support-vector networks. Machine Learning 20:273-297
11. Cristianini N, Shawe-Taylor J (2000) An Introduction to Support Vector Machines: And Other Kernel-Based Learning Methods. Cambridge University Press
12. Desmond JE, Glover GH (2002) Estimating sample size in functional MRI (fMRI) neuroimaging studies: statistical power analyses. Journal Of Neuroscience Methods 118:115-128
13. Durston S, Davidson MC, Mulder MJ, Spicer JA, Galvan A, Tottenham N, Scheres A, Xavier Castellanos F, van Engeland H, Casey BJ (2007) Neural and behavioral correlates of expectancy violations in attention-deficit hyperactivity disorder. Journal Of Child Psychology And Psychiatry, And Allied Disciplines 48:881-889
14. Durston S, Mulder M, Casey BJ, Ziermans T, van Engeland H (2006) Activation in ventral prefrontal cortex is sensitive to genetic vulnerability for attention-deficit hyperactivity disorder. Biological Psychiatry 60:1062-1070
15. Durston S, Tottenham NT, Thomas KM, Davidson MC, Eigsti I-M, Yang Y, Ulug AM, Casey BJ (2003) Differential patterns of striatal activation in young children with and without ADHD. Biological Psychiatry 53:871-878
16. Ellison-Wright I, Ellison-Wright Z, Bullmore E (2008) Structural brain change in Attention Deficit Hyperactivity Disorder identified by meta-analysis. BMC Psychiatry 8:51
17. Epstein JN, Casey BJ, Tonev ST, Davidson MC, Reiss AL, Garrett A, Hinshaw SP, Greenhill LL, Glover G, Shafritz KM, Vitolo A, Kotler LA, Jarrett MA, Spicer J (2007) ADHD- and medication-related brain activation effects in concordantly affected parent-child dyads with ADHD. Journal Of Child Psychology And Psychiatry, And Allied Disciplines 48:899-913
18. Epstein JN, Delbello MP, Adler CM, Altaye M, Kramer M, Mills NP, Strakowski SM, Holland S (2009) Differential patterns of brain activation over time in adolescents with and without attention deficit hyperactivity disorder (ADHD) during performance of a sustained attention task. Neuropediatrics 40:1-5
19. Fassbender C, Zhang H, Buzy WM, Cortes CR, Mizuiri D, Beckett L, Schweitzer JB (2009) A lack of default network suppression is linked to increased distractibility in ADHD. Brain Research 1273:114-128
20. Fletcher T (2009) Support Vector Machines Explained. University College London (UCL)
21. Friston KJ, Ashburner JT, Kiebel SJ, Nichols TE, Penny WD (2007) Statistical Parametric Mapping: The Analysis of Functional Brain Images. Academic Press, London
22. Hoekzema E, Carmona S, Tremols V, Gispert JD, Guitart M, Fauquet J, Rovira M, Bielsa A, Soliva JC, Tomas X, Bulbena A, Ramos-Quiroga A, Casas M, Tobeña A, Vilarroya O (2010) Enhanced neural activity in frontal and cerebellar circuits after cognitive training in children with attention-deficit/hyperactivity disorder. Human Brain Mapping 31:1942-1950
23. Kobel M, Bechtel N, Weber P, Specht K, Klarhofer M, Scheffler K, Opwis K, Penner I-K (2009) Effects of methylphenidate on working memory functioning in children with attention deficit/hyperactivity disorder. European Journal Of Paediatric Neurology: EJPN: Official Journal Of The European Paediatric Neurology Society 13:516-523
24. Konrad K, Neufang S, Hanisch C, Fink GR, Herpertz-Dahlmann B (2006) Dysfunctional attentional networks in children with attention deficit/hyperactivity disorder: evidence from an event-related functional magnetic resonance imaging study. Biological Psychiatry 59:643-651
25. Mostofsky SH, Rimrodt SL, Schafer JGB, Boyce A, Goldberg MC, Pekar JJ, Denckla MB (2006) Atypical motor and sensory cortex activation in attention-deficit/hyperactivity disorder: a functional magnetic resonance imaging study of simple sequential finger tapping. Biological Psychiatry 59:48-56
26. Nakao T, Radua J, Rubia K, Mataix-Cols D (2011) Gray matter volume abnormalities in ADHD: voxel-based meta-analysis exploring the effects of age and stimulant medication. The American Journal Of Psychiatry 168:1154-1163
27. Passarotti AM, Sweeney JA, Pavuluri MN (2010) Neural correlates of response inhibition in pediatric bipolar disorder and attention deficit hyperactivity disorder. Psychiatry Research 181:36-43
28. Peterson BS, Potenza MN, Wang Z, Zhu H, Martin A, Marsh R, Plessen KJ, Yu S (2009) An FMRI study of the effects of psychostimulants on default-mode processing during Stroop task performance in youths with ADHD. The American Journal Of Psychiatry 166:1286-1294
29. Pliszka SR, Glahn DC, Semrud-Clikeman M, Franklin C, Perez III R, Xiong J, Liotti M (2006) Neuroimaging of inhibitory control areas in children with attention deficit hyperactivity disorder who were treatment naive or in long-term treatment. The American Journal Of Psychiatry 163:1052-1060
30. Rubia K, Cubillo A, Smith AB, Woolley J, Heyman I, Brammer MJ (2010) Disorder-specific dysfunction in right inferior prefrontal cortex during two inhibition tasks in boys with attention-deficit hyperactivity disorder compared to boys with obsessive-compulsive disorder. Human Brain Mapping 31:287-299
31. Rubia K, Halari R, Cubillo A, Mohammad A-M, Brammer M, Taylor E (2009) Methylphenidate normalises activation and functional connectivity deficits in attention and motivation networks in medication-naive children with ADHD during a rewarded continuous performance task. Neuropharmacology 57:640-652
32. Rubia K, Halari R, Cubillo A, Mohammad A-M, Scott S, Brammer M (2010) Disorder-specific inferior prefrontal hypofunction in boys with pure attention-deficit/hyperactivity disorder compared to boys with pure conduct disorder during cognitive flexibility. Human Brain Mapping 31:1823-1833
33. Rubia K, Halari R, Smith AB, Mohammad M, Scott S, Brammer MJ (2009) Shared and disorder-specific prefrontal abnormalities in boys with pure attention-deficit/hyperactivity disorder compared to boys with pure CD during interference inhibition and attention allocation. Journal Of Child Psychology And Psychiatry, And Allied Disciplines 50:669-678
34. Seidman LJ, Valera EM, Makris N (2005) Structural brain imaging of attention-deficit/hyperactivity disorder. Biological Psychiatry 57:1263-1272
35. Shafritz KM, Marchione KE, Gore JC, Shaywitz SE, Shaywitz BA (2004) The effects of methylphenidate on neural systems of attention in attention deficit hyperactivity disorder. The American Journal Of Psychiatry 161:1990-1997
36. Solanto MV, Schulz KP, Fan J, Tang CY, Newcorn JH (2009) Event-related FMRI of inhibitory control in the predominantly inattentive and combined subtypes of ADHD. Journal Of Neuroimaging: Official Journal Of The American Society Of Neuroimaging 19:205-212
37. Suskauer SJ, Simmonds DJ, Fotedar S, Blankner JG, Pekar JJ, Denckla MB, Mostofsky SH (2008) Functional magnetic resonance imaging evidence for abnormalities in response selection in attention deficit hyperactivity disorder: differences in activation associated with response inhibition but not habitual motor response. Journal Of Cognitive Neuroscience 20:478-493
38. Vaidya CJ, Bunge SA, Dudukovic NM, Zalecki CA, Elliott GR, Gabrieli JDE (2005) Altered neural substrates of cognitive control in childhood ADHD: evidence from functional magnetic resonance imaging. The American Journal Of Psychiatry 162:1605-1613
39. van 't Ent D, van Beijsterveldt CEM, Derks EM, Hudziak JJ, Veltman DJ, Todd RD, Boomsma DI, De Geus EJC (2009) Neuroimaging of response interference in twins concordant or discordant for inattention and hyperactivity symptoms. Neuroscience 164:16-29
40. Wang L, Zhu C, He Y, Zang Y, Cao Q, Zhang H, Zhong Q, Wang Y (2009) Altered small-world brain functional networks in children with attention-deficit/hyperactivity disorder. Human Brain Mapping 30:638-649
41. Yerys BE, Jankowski KF, Shook D, Rosenberger LR, Barnes KA, Berl MM, Ritzl EK, Vanmeter J, Vaidya CJ, Gaillard WD (2009) The fMRI success rate of children and adolescents: typical development, epilepsy, attention deficit/hyperactivity disorder, and autism spectrum disorders. Human Brain Mapping 30:3426-3435
42. Zang Y-F, He Y, Zhu C-Z, Cao Q-J, Sui M-Q, Liang M, Tian L-X, Jiang T-Z, Wang Y-F (2007) Altered baseline brain activity in children with ADHD revealed by resting-state functional MRI. Brain & Development 29:83-91
43. Zhu C-Z, Zang Y-F, Cao Q-J, Yan C-G, He Y, Jiang T-Z, Sui M-Q, Wang Y-F (2008) Fisher discriminative analysis of resting-state brain function for attention-deficit/hyperactivity disorder. Neuroimage 40:110-120

Supplementary Table 1: The number of ADHD subjects in a selection of fMRI studies.

fMRI Study               Number of ADHD patients     Number of controls
                         included in the analysis    included in the analysis
Adler et al [2]          11 (bipolar + ADHD)         11 (bipolar only)
Anderson et al [3]       10                          6
Booth et al [5]          12                          12
Brotman et al [6]        18                          37
Cao et al [7]            12                          13
Cao et al [8]            19                          23
Durston et al [15]       7                           7
Durston et al [14]       11                          11
Durston et al [13]       22                          22
Epstein et al [17]       20                          9
Epstein et al [18]       10                          14
Fassbender et al [19]    12                          13
Hoekzema et al [22]      19                          0
Kobel et al [23]         14                          12
Konrad et al [24]        16                          16
Mostofsky et al [25]     11                          11
Passarotti et al [27]    11                          15
Peterson et al [28]      16                          20
Pliszka et al [29]       17                          15
Rubia et al [33]         20                          20
Rubia et al [31]         13                          13
Rubia et al [30]         20                          20
Rubia et al [32]         14                          20
Shafritz et al [35]      15                          14
Solanto et al [36]       20                          0 (comparing ADHD subtypes)
Suskauer et al [37]      25                          25
Vaidya et al [38]        10                          10
Van ’t Ent et al [39]    27                          27
Wang et al [40]          19                          20
Yerys et al [41]         52                          32 matched with the ADHD group (137 total)
Zang et al [42]          13                          12
Zhu et al [43]           9                           11