Journal Club September 16, 2010 - STATS Moment


1. Understanding Receiver Operating Characteristic Curves (ROC curves)

ROC curves are a tool used to visually and numerically represent the diagnostic performance of a test that yields results as a continuous value, e.g. a white blood cell (WBC) count for detecting appendicitis. In other words, a ROC curve illustrates how accurately a test discriminates diseased cases from normal cases. Let’s review a few basic concepts that will help us better understand ROC curves:

                 Disease present        Disease absent
Test positive    True Positive (TP)     False Positive (FP)
Test negative    False Negative (FN)    True Negative (TN)

Sensitivity: probability that a test result will be positive when the disease is present (true positive rate, expressed as a percentage), or the ability to detect disease in those who have it. Sensitivity = TP/(TP + FN)

Specificity: probability that a test result will be negative when the disease is not present (true negative rate, expressed as a percentage), or the ability to correctly identify those without the disease as disease-free. Specificity = TN/(TN + FP)
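As a quick check on the two formulas, here is a minimal Python sketch. The counts are made up for illustration and do not come from any study; the function names are our own.

```python
def sensitivity(tp: int, fn: int) -> float:
    """TP / (TP + FN): fraction of diseased cases the test detects."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """TN / (TN + FP): fraction of non-diseased cases the test clears."""
    return tn / (tn + fp)


# Hypothetical 2x2 table: 100 patients with disease, 200 without.
tp, fn = 80, 20
fp, tn = 30, 170

print(f"Sensitivity: {sensitivity(tp, fn):.0%}")  # 80 / 100
print(f"Specificity: {specificity(tn, fp):.0%}")  # 170 / 200
```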

When you consider the results of a particular test in two populations, one population with a disease, the other population without the disease, you will rarely observe a perfect separation between the two groups. Indeed, the distributions of the test results will overlap, as shown in the following figure. For every possible cut-off point or criterion value (e.g. a specific WBC count) that you select to discriminate between the two populations, there will be some cases with the disease correctly classified as positive (TP = True Positive fraction), but some cases with the disease will be classified as negative (FN = False Negative fraction). On the other hand, some cases without the disease will be correctly classified as negative (TN = True Negative fraction), but some cases without the disease will be classified as positive (FP = False Positive fraction).
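To make the four fractions concrete, the sketch below applies one cut-off to two small groups of WBC counts. The values and the cut-off of 12 are invented for illustration, and we assume a result counts as positive when it is at or above the cut-off.

```python
def confusion_counts(diseased, healthy, cutoff):
    """Count TP, FN, FP, TN, calling a value test-positive when it is >= cutoff."""
    tp = sum(v >= cutoff for v in diseased)
    fn = len(diseased) - tp
    fp = sum(v >= cutoff for v in healthy)
    tn = len(healthy) - fp
    return tp, fn, fp, tn


# Hypothetical WBC counts; note that the two groups overlap.
diseased = [9, 11, 12, 13, 14, 15, 16, 18]
healthy = [5, 6, 7, 8, 9, 10, 11, 13]

tp, fn, fp, tn = confusion_counts(diseased, healthy, cutoff=12)
print(f"TP={tp} FN={fn} FP={fp} TN={tn}")  # TP=6 FN=2 FP=1 TN=7
```

Because the groups overlap, no single cut-off in this toy data set gives zero false positives and zero false negatives at the same time.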

When you select a higher criterion value, e.g. a WBC of 15, the false positive fraction will decrease and specificity will increase, but the true positive fraction and sensitivity will also decrease.

When you select a lower criterion value, e.g. a WBC of 10, the true positive fraction and sensitivity will increase. On the other hand, the false positive fraction will also increase, and therefore the true negative fraction and specificity will decrease.

The ROC curve

In a Receiver Operating Characteristic (ROC) curve the true positive rate (Sensitivity), also thought of as the “signal” that the test is detecting, is plotted on the y axis, while the false positive rate (100 - Specificity), also thought of as the “noise” that the test should be trying to avoid, is plotted on the x axis. Each point on the ROC plot represents a sensitivity/specificity pair corresponding to a particular decision threshold, e.g. a WBC of 13. A test with perfect discrimination (no overlap in the two distributions) has a ROC plot that passes through the upper left corner (100% sensitivity, 100% specificity). Therefore the closer the ROC plot is to the upper left corner, the higher the overall accuracy of the test.
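Each cut-off gives one point on the plot. The sketch below computes that point for three cut-offs, using invented WBC counts and assuming a result is positive when it is at or above the cut-off; the function name is our own.

```python
def roc_point(diseased, healthy, cutoff):
    """Return (100 - specificity, sensitivity) in percent for one cut-off."""
    tpr = 100 * sum(v >= cutoff for v in diseased) / len(diseased)
    fpr = 100 * sum(v >= cutoff for v in healthy) / len(healthy)
    return fpr, tpr


# Hypothetical WBC counts for patients with and without the disease.
diseased = [9, 11, 12, 13, 14, 15, 16, 18]
healthy = [5, 6, 7, 8, 9, 10, 11, 13]

for cutoff in (10, 13, 15):
    fpr, tpr = roc_point(diseased, healthy, cutoff)
    print(f"WBC >= {cutoff}: x = {fpr:.1f}%, y = {tpr:.1f}%")
# WBC >= 10: x = 37.5%, y = 87.5%
# WBC >= 13: x = 12.5%, y = 62.5%
# WBC >= 15: x = 0.0%, y = 37.5%
```

Raising the cut-off moves the point down and to the left (fewer false positives, lower sensitivity); lowering it moves the point up and to the right.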

The dashed line represents the “line of no discrimination”, meaning that a test following this line is no better than random chance at discriminating between diseased and non-diseased states.

Finally, ROC curves can be summarized by a single value referred to as the Area Under the Curve (AUC). This is a measure of test accuracy. An AUC close to 1 corresponds to an excellent diagnostic test, whereas an AUC close to 0.5 corresponds to a poor diagnostic test, one that discriminates little better than chance.
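One common way to estimate the AUC from data is to connect the ROC points with straight lines and add up the trapezoids underneath. The sketch below does this for invented WBC counts; it is only an illustration of the idea, not the method used by any particular statistics package, and the function names are our own.

```python
def roc_curve(diseased, healthy):
    """Return (FPR, TPR) pairs as fractions, from the strictest cut-off down."""
    cutoffs = sorted(set(diseased) | set(healthy), reverse=True)
    points = [(0.0, 0.0)]  # cut-off above every value: nobody tests positive
    for c in cutoffs:
        tpr = sum(v >= c for v in diseased) / len(diseased)
        fpr = sum(v >= c for v in healthy) / len(healthy)
        points.append((fpr, tpr))
    return points


def auc(points):
    """Area under the ROC points by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical WBC counts for patients with and without the disease.
diseased = [9, 11, 12, 13, 14, 15, 16, 18]
healthy = [5, 6, 7, 8, 9, 10, 11, 13]

points = roc_curve(diseased, healthy)
print(f"AUC = {auc(points):.2f}")  # AUC = 0.90
```

For comparison, the diagonal line of no discrimination from (0, 0) to (1, 1) gives `auc([(0.0, 0.0), (1.0, 1.0)])` equal to 0.5.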

Check out the following interactive sites for more information:
http://www.anaesthetist.com/mnm/stats/roc/Findex.htm
http://www.youtube.com/watch?v=_-PwDwaQQpY

2. Critical Appraisal of Clinical Decision Rules

We will be practicing this more at journal club, but here is a little background…

Clinical decision rules (CDR) attempt to formally test, simplify, and increase the accuracy of clinicians' diagnostic and prognostic assessments. Three steps are involved in the development and testing of a CDR: (1) creation of the rule, (2) testing or validating the rule, and (3) assessing the impact of the rule on clinical behavior.

Thus, there are four levels of clinical decision rules:

a) Level IV – A newly derived clinical decision rule with validation performed retrospectively on the population used to derive the rule in the first place, e.g. the BMJ article for this month’s journal club.

Consider the following questions in evaluating these articles:
Were all important predictors included in the derivation process?
Were all important predictors present in a significant proportion of the study population?
Does the rule make clinical sense?

b) Level III/II – The clinical rule has been validated prospectively on a similar, narrow population (III) or on broad, heterogeneous populations (II) different from that used to derive it, e.g. the CATCH study for pediatric minor head injury.

Consider the following questions for Level II and III CDRs:
Did the patients represent a wide spectrum of severity of disease?
Was there a blinded assessment of the criterion standard?
Was there explicit and accurate interpretation of the predictor variables and the actual rule without knowledge of the outcome?
Did the results of the assessment of the variables or of the rule influence the decision to perform the criterion standard?
How powerful is the rule?

c) Level I – Impact analysis has demonstrated a change in clinical behavior or patient outcomes as a result of using the clinical prediction rule, e.g. the Ottawa ankle rules.
