Data Analysis
Mechanisms of Mitosis
Data Analysis

You are now ready to statistically analyze the data your team has collected.

I. Mitotic Index and Mitotic Phase Indices

Your mitotic cell counts comprise a survey of the number of cells in each stage of mitosis in your two populations (treatment and control). You counted mitotic cells in 8 treatment and 8 control roots, and then calculated a Mitotic Index (M) for each sample.

M = n_m / N
n_m = the number of mitotic cells in the sample
N = the total number of cells counted in the sample

You also calculated a Mitotic Phase Index (M_phase) for each phase of mitosis you found in your treatment and control groups.

M_phase = n_phase / n_m
n_phase = the number of cells in [mitotic phase] in the sample
n_m = the total number of mitotic cells in the sample

Use the indices you recorded in the table templates during Sessions 3 and 4 for the data analysis described in the following section.

II. Applying a Statistical Test to Your Mitotic Indices

Your mitotic indices can be ranked (treated as ordinal data), but they cannot be assumed to follow a normal distribution. The non-parametric Mann-Whitney U test is appropriate for this type of data.

The Mann-Whitney U test measures the degree of overlap between two sets of data that can be ranked (i.e., put in order of ascending values).

• large overlap means no significant difference between your populations
  o fail to reject the null hypothesis
• small (5% or less) overlap means a significant difference between your populations
  o reject your null hypothesis

Non-parametric test for two samples: Mann-Whitney U

The Mann-Whitney test allows the investigator (you) to compare your two cell populations without assuming that your Mitotic Index values are normally distributed.

The Mann-Whitney U does have its rules.
For this test to be appropriate:

• You must be comparing two random, independent samples (treatment & control)
• The measurements (Mitotic Indices) should be ordinal
• No two measurements should have exactly the same value
  o (though we can deal with “ties” in a way that will be explained shortly)

1. State your null and alternative hypotheses.

Ho:
HA:

Example:
Ho: There is no difference in the ranks of Mitotic Indices (M) between meristematic cells in an onion treated with aqueous trifluralin and an onion treated with plain water.
HA: There is a difference in the ranks of Mitotic Indices (M) between meristematic cells in an onion treated with aqueous trifluralin and an onion treated with plain water.

2. State the significance level (α) required to reject Ho. This is typically α = 0.05, so Ho is rejected when the probability value (P) is 0.05 or less.

3. Rank your Mitotic Indices from smallest to largest in a table. Note which index came from which population of cells (Treatment or Control).

Example:
• Table 1 shows 16 (imaginary) Mitotic Indices from treatment (T) and control (C) onion root tips.
• Table 2 shows the values ranked and labeled by population.

Table 1. Mitotic Indices for treatment and control root tips.

Sample #   M treatment   M control
1          0.20          0.55
2          0.25          0.60
3          0.45          0.65
4          0.35          0.80
5          0.15          0.35
6          0.10          0.75
7          0.55          0.70
8          0.40          0.85

Table 2. Ranked Mitotic Indices. Tied values are marked with an asterisk (*).

Rank   Ranked M values   Cell population
1      0.10              T
2      0.15              T
3      0.20              T
4      0.25              T
5      0.35*             T
6      0.35*             C
7      0.40              T
8      0.45              T
9      0.55*             T
10     0.55*             C
11     0.60              C
12     0.65              C
13     0.70              C
14     0.75              C
15     0.80              C
16     0.85              C

4. Assign points to each ranked value (see Table 3):

• Each “treatment” value gets one point for every “control” value that appears below it in the ranked table.
• Every “control” value gets one point for every “treatment” value that appears below it in the ranked table.
• For example, the first value, 0.10 (T), has 8 Control values below it, so it gets 8 points.
• Rank 6, 0.35 (C), has 3 Treatment values below it, so it gets 3 points.
• Tied values split the sum of their points. For example:
  o Rank 5 (0.35) has 8 points
  o Rank 6 (0.35) has 3 points
  o 8 + 3 = 11
  o Each rank gets half of 11, or 5.5

Table 3. Points assigned to ranked M values in Treatment and Control onion cell populations (example). Tied values split their total points equally.

Rank   Ranked M values   Cell population   Points
1      0.10              T                 8
2      0.15              T                 8
3      0.20              T                 8
4      0.25              T                 8
5      0.35              T                 8 → 5.5
6      0.35              C                 3 → 5.5
7      0.40              T                 7
8      0.45              T                 7
9      0.55              T                 7 → 3.5
10     0.55              C                 0 → 3.5
11     0.60              C                 0
12     0.65              C                 0
13     0.70              C                 0
14     0.75              C                 0
15     0.80              C                 0
16     0.85              C                 0

5. Calculate a U statistic for each cell population by adding the points for that population.

U_treatment = 8 + 8 + 8 + 8 + 5.5 + 7 + 7 + 3.5 = 55
U_control = 5.5 + 3.5 + 0 + 0 + 0 + 0 + 0 + 0 = 9

Your U statistic is the smaller of these two values. In the imaginary example our U value is 9. The lower the U value, the greater the difference between the two groups being compared. (For example, if none of the M values overlapped, the U value would be zero.)

III. Critical values for non-parametric statistics

We have defined our significance level (α) as 0.05. This implies that a true null hypothesis will be rejected only 5% of the time. (The significance level alone does not tell you how often a false null hypothesis will be rejected; that depends on the sample size and on how large the real difference is.) You reject the null hypothesis if the P value obtained from your data is less than or equal to 0.05.

A critical value of a statistic (e.g., Mann-Whitney U) is the value associated with a significance level less than or equal to α. (We are using the traditional value of α, 0.05.)

Critical values for the Mann-Whitney U statistic (at different sample sizes) are shown in Table 4. In the previous imaginary example, with treatment and control groups of 8 samples each, the critical value for rejection of the null hypothesis is 13. The Mann-Whitney U statistic of 9 is lower than this cut-off value. This means there is very little overlap between the two populations (they are significantly different). The null hypothesis is rejected.

Table 4.
Critical values for the Mann-Whitney U statistic. Find the value that corresponds to the sample sizes (8 and 8) of your two cell populations. If your U value is equal to or smaller than the one shown in the table, a difference this large would be expected 5% of the time or less if the null hypothesis were true, so reject your null hypothesis. If your U value is larger than that shown in the table, fail to reject your null hypothesis.

IV. Graphic Representation of your Data

Tables of numerical data are important, but they are not always the best way to present your data to an audience. As the old saying goes, “A picture is worth a thousand words.” The most effective way to present your experimental results, whenever possible, is with a figure.

A. Mitosis Raw Data

A simple bar graph can be used to represent the proportion of cells in your sample that you found in each stage of mitosis. An example can be seen in Figure 1. In a bar graph, categories may be placed in any order, and do not necessarily represent a continuum.

Figure 1. A bar graph showing a hypothetical distribution of cells in each stage of mitosis in a study population of cells.

Don’t confuse a bar graph with a histogram. A histogram depicts continuous data. An example of a histogram is shown in Figure 2.

Figure 2. A histogram showing a hypothetical distribution of cells of different diameter in a population of cells.

Note that each bar on the histogram represents a specific subset of a range of continuous numerical data that occur in a set order.

Figures, unlike tables, have their legends placed underneath. Always use proper format for figures and tables in your work.

B. Visualizing Mann-Whitney U results

Because the Mann-Whitney U provides a measure of how great the overlap is between two groups being compared, a box plot is a good way to represent your Mann-Whitney U results.
The box plot can be created to show the median of each group, the range of values, and their overlap. An example of a box plot is shown in Figure 3, with a key and explanation in Figure 4.

Figure 3. Sample box plot showing overlap of mitotic index values for two populations of cells.

Figure 4. The black bar in the center of each population’s values represents the median. The Interquartile Range (IQR) includes 50% of the values, and is bordered on the bottom by the 25th percentile and on the top by the 75th percentile. The range is the region between the minimum and maximum values. The star represents a data point that is an outlier.
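
As a cross-check on the hand calculation in Section II, the point-counting procedure can be sketched in Python. This is only an illustrative sketch, not part of the lab protocol. The function names (mitotic_index, handout_u, pairwise_u) are hypothetical, the data are the imaginary values from Table 1, and the critical value of 13 is the one quoted in Section III for two groups of 8. The rule "reject when U is at or below the critical value" is the usual table convention and is assumed here.

```python
# Imaginary Mitotic Indices from Table 1 (not real data).
TREATMENT = [0.20, 0.25, 0.45, 0.35, 0.15, 0.10, 0.55, 0.40]
CONTROL = [0.55, 0.60, 0.65, 0.80, 0.35, 0.75, 0.70, 0.85]
CRITICAL_U_8_8 = 13  # value quoted in Section III for n = 8 and 8, alpha = 0.05


def mitotic_index(n_mitotic, n_total):
    """M = n_m / N for one sample."""
    return n_mitotic / n_total


def handout_u(treatment, control):
    """Return (U_treatment, U_control) using the point rule of steps 4 and 5."""
    ranked = [(m, "T") for m in treatment] + [(m, "C") for m in control]
    ranked.sort(key=lambda item: item[0])  # stable sort: T listed first in ties

    # Raw points: one point for every value of the OTHER population ranked below.
    raw = []
    for i, (_, group) in enumerate(ranked):
        raw.append(sum(1 for _, other in ranked[i + 1:] if other != group))

    # Tied values split the sum of their points equally.
    points = list(raw)
    start = 0
    while start < len(ranked):
        end = start
        while end < len(ranked) and ranked[end][0] == ranked[start][0]:
            end += 1
        share = sum(raw[start:end]) / (end - start)
        for k in range(start, end):
            points[k] = share
        start = end

    u_t = sum(p for p, (_, g) in zip(points, ranked) if g == "T")
    u_c = sum(p for p, (_, g) in zip(points, ranked) if g == "C")
    return u_t, u_c


def pairwise_u(treatment, control):
    """Conventional variant: each tied treatment/control pair counts 0.5."""
    u_t = 0.0
    for t in treatment:
        for c in control:
            if t < c:
                u_t += 1.0
            elif t == c:
                u_t += 0.5
    return u_t, len(treatment) * len(control) - u_t


if __name__ == "__main__":
    u_t, u_c = handout_u(TREATMENT, CONTROL)
    u = min(u_t, u_c)
    print("U_treatment =", u_t, " U_control =", u_c, " U =", u)
    # Assumed decision rule: reject Ho when U is at or below the critical value.
    print("Reject Ho:", u <= CRITICAL_U_8_8)
```

Run as a script, this should report U_treatment = 55 and U_control = 9, matching the worked example. Statistical software usually handles ties slightly differently, giving half a point to each tied treatment/control pair. The pairwise_u function illustrates that convention and gives 60 and 4 for the same data, so the conclusion for this example does not change.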