<p>Supplementary statistical methods for gene expression analysis.</p>
<p>Combination of data sets</p>
<p>Of the 18 samples run on the cDNA array, 7 were also run on the Pathways array.</p>
<p>There are 562 distinct genes (by Unigene cluster ID) common to both arrays. They are represented by 773 probes on the cDNA array and 784 probes on the Pathways array.</p>
<p>There is not necessarily a one-to-one correspondence between probes on one array and probes on the other: a gene that is queried by a single probe on one array might be queried by two or more probes on the other, or a gene may even be queried by multiple probes on each array.</p>
<p>To assess the correlation of expression levels between the two array platforms, we calculated the correlations</p>
<p>\[ r_{i,j,k} = r(X_{i,j}, Y_{i,k}), \quad i = 1, \ldots, 562, \quad j = 1, \ldots, n_i, \quad k = 1, \ldots, m_i, \]</p>
<p>where \(n_i\) is the number of probes for the \(i\)-th common gene on the Pathways array, \(m_i\) is the number of probes for the \(i\)-th common gene on the cDNA array, \(X_{i,j}\) are the 7 sample expression values for the \(j\)-th probe of the \(i\)-th common gene on the Pathways array, and \(Y_{i,k}\) are the expression values from the 7 corresponding samples for the \(k\)-th probe of the \(i\)-th common gene on the cDNA array. A total of 1032 correlation values were calculated.</p>
<p>A histogram of these correlations is clearly skewed towards 1 (Supplementary Figure 1), suggesting that there is substantial correlation between expression values across the two array platforms.</p>
<p>Under the null hypothesis, the transformation of a correlation coefficient \(r\),</p>
<p>\[ T_n = \frac{\sqrt{n-2}\, r}{\sqrt{1 - r^2}}, \]</p>
<p>is approximately Student’s T with \(n - 2\) degrees of freedom [ref: Mathematical Statistics, Peter Bickel &amp; Kjell Doksum, Holden-Day, Oakland CA 1977, p 221]. So, with \(n = 7\) samples and under the null hypothesis of no correlation between the cDNA array and the Pathways array,</p>
<p>\[ T_{i,j,k} = \frac{\sqrt{n-2}\, r_{i,j,k}}{\sqrt{1 - r_{i,j,k}^2}} \]</p>
<p>is approximately Student’s T with 5 degrees of freedom. The one-sided Kolmogorov-Smirnov test for comparing distributions is highly significant (p &lt; 0.00001), so we reject the null hypothesis, and conclude that the correlation between the two arrays is not random.</p>
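<p>The procedure described above can be illustrated with a short Python sketch. It is a minimal sketch under stated assumptions, not the code used for the analysis: the function names, the dictionary layout (gene ID mapped to a list of probe profiles), and the toy expression values are all hypothetical. The direction of the one-sided Kolmogorov-Smirnov statistic and the asymptotic p-value approximation exp(-2 N D^2) are likewise assumptions, since the text does not say how the p-value was obtained. The closed-form CDF applies only to 5 degrees of freedom, which corresponds to the 7 shared samples.</p>

```python
import math
from itertools import product


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_transform(r, n):
    """T = sqrt(n - 2) * r / sqrt(1 - r^2); requires |r| < 1."""
    return math.sqrt(n - 2) * r / math.sqrt(1.0 - r * r)


def t_cdf_5df(t):
    """Closed-form CDF of Student's t with exactly 5 degrees of freedom."""
    theta = math.atan(t / math.sqrt(5.0))
    s, c = math.sin(theta), math.cos(theta)
    return 0.5 + (theta + s * c * (1.0 + (2.0 / 3.0) * c * c)) / math.pi


def cross_platform_correlations(pathways, cdna):
    """Return r(X_ij, Y_ik) for every probe pair of every shared gene.

    pathways, cdna: dict mapping gene ID -> list of probe profiles, each
    profile holding values for the same samples in the same order.
    """
    out = []
    for gene in sorted(set(pathways) & set(cdna)):
        for x, y in product(pathways[gene], cdna[gene]):
            out.append(pearson(x, y))
    return out


def ks_one_sided(t_values, cdf):
    """One-sided KS statistic D = max(F0(t) - F_N(t)).

    D is large when the sample is shifted towards larger values than the
    null distribution F0. The p-value is the asymptotic approximation
    exp(-2 * N * D^2) (an assumption; exact methods differ for small N).
    """
    ts = sorted(t_values)
    n = len(ts)
    d = max(cdf(t) - i / n for i, t in enumerate(ts))
    return d, math.exp(-2.0 * n * d * d)


# Toy data: made-up values for 7 samples, NOT taken from the study.
pathways_toy = {
    "geneA": [[5.1, 6.3, 7.0, 8.2, 6.8, 9.1, 7.7]],
    "geneB": [[3.2, 3.9, 2.8, 4.5, 4.1, 3.0, 3.6],
              [3.0, 4.2, 2.5, 4.4, 4.3, 3.1, 3.3]],
}
cdna_toy = {
    "geneA": [[4.8, 6.0, 7.4, 7.9, 6.5, 9.4, 7.2],
              [5.5, 6.1, 6.6, 8.8, 7.1, 8.7, 8.0]],
    "geneB": [[2.9, 4.1, 3.0, 4.0, 4.4, 2.7, 3.8]],
}

r_values = cross_platform_correlations(pathways_toy, cdna_toy)
t_values = [t_transform(r, 7) for r in r_values]
d_stat, p_approx = ks_one_sided(t_values, t_cdf_5df)
print(len(r_values), "correlations; D =", d_stat, "; approx p =", p_approx)
```

<p>In this sketch a gene with several probes on each platform contributes one correlation per probe pair, which is how 562 common genes can yield more than 562 correlation values.</p>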