10/31/2012
Image quantification in (radiation) therapy
Robert Jeraj
Associate Professor of Medical Physics, Human Oncology, Radiology and Biomedical Engineering
Translational Imaging Research Program
University of Wisconsin Carbone Cancer Center
Biomarkers and surrogate endpoints
Biomarkers are characteristics that can be objectively imaged as indicators of normal biological processes, pathogenic processes, or pharmacologic responses to therapeutic interventions.
Biomarkers as surrogate endpoints are biomarkers that are intended to substitute for clinical endpoints. Surrogate endpoints are expected to predict clinical benefit (or harm or lack of benefit or harm) based on epidemiologic, therapeutic, pathophysiologic, or other scientific evidence.
Imaging as a biomarker
[Figure: Tumor effects. Manifestation: target effect, tumor effect, clinical outcome. IMAGING labels: imaging as a biomarker, imaging biomarker, imaging biomarker as a surrogate endpoint.]
Imaging as a biomarker
[Figure: Normal tissue effects. Manifestation labels: local, resolution, subclinical tissue damage, organ dysfunction, clinical. IMAGING labels: imaging as a biomarker, imaging biomarker, imaging biomarker as a surrogate endpoint.]
Jeraj et al 2010, Int J Rad Oncol Biol Phys, 76(3): S140
Prentice’s criteria
For a given treatment (Z), a surrogate (S) may be validly substituted for a true endpoint (T) if and only if:
1. P(S|Z) ≠ P(S)
2. P(T|Z) ≠ P(T)
3. P(T|S) ≠ P(T)
4. P(T|S,Z) = P(T|S)
The entire effect of treatment Z on the true endpoint T is captured by the surrogate S (100% explained)
– A valid surrogate is defined as a response variable for which a test of the null hypothesis of no relationship to the treatment groups under comparison is also a valid test of the corresponding null hypothesis based on the true endpoint (Prentice, 1989)
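The four criteria can be checked directly on a small discrete model. The sketch below is only an illustration: the probability tables are hypothetical numbers, not data from the talk, and the helper names `joint_from` and `prentice_checks` are made up for this example. Model A lets Z act on T only through S; model B adds a direct effect of Z on T, which should break criterion 4.

```python
import numpy as np

# Hypothetical binary model; all probabilities are illustrative only.
p_z = np.array([0.5, 0.5])                  # P(Z=z)
p_s_given_z = np.array([[0.8, 0.2],         # P(S=s | Z=0)
                        [0.3, 0.7]])        # P(S=s | Z=1)
p_t_given_s = np.array([[0.9, 0.1],         # P(T=t | S=0)
                        [0.4, 0.6]])        # P(T=t | S=1)


def joint_from(p_z, p_s_given_z, p_t_given_sz):
    """Build the joint P(Z, S, T), indexed [z, s, t], from its factors."""
    return p_z[:, None, None] * p_s_given_z[:, :, None] * p_t_given_sz


def prentice_checks(joint):
    """Return [criterion 1, 2, 3, 4] as booleans for a joint P(Z, S, T)."""
    p_zs = joint.sum(axis=2)                # P(Z, S)
    p_zt = joint.sum(axis=1)                # P(Z, T)
    p_st = joint.sum(axis=0)                # P(S, T)
    pz = joint.sum(axis=(1, 2))
    ps = joint.sum(axis=(0, 2))
    pt = joint.sum(axis=(0, 1))
    s_given_z = p_zs / pz[:, None]
    t_given_z = p_zt / pz[:, None]
    t_given_s = p_st / ps[:, None]
    t_given_sz = joint / p_zs[:, :, None]
    return [
        not np.allclose(s_given_z, ps[None, :]),            # 1. P(S|Z) != P(S)
        not np.allclose(t_given_z, pt[None, :]),            # 2. P(T|Z) != P(T)
        not np.allclose(t_given_s, pt[None, :]),            # 3. P(T|S) != P(T)
        np.allclose(t_given_sz, t_given_s[None, :, :]),     # 4. P(T|S,Z) = P(T|S)
    ]


# Model A: T depends on Z only through S (same P(T|S) for both arms).
chain = joint_from(p_z, p_s_given_z, np.broadcast_to(p_t_given_s, (2, 2, 2)))

# Model B: Z also acts on T directly, so P(T|S,Z) differs between arms.
p_t_given_sz = np.array([[[0.9, 0.1], [0.4, 0.6]],
                         [[0.7, 0.3], [0.2, 0.8]]])
direct = joint_from(p_z, p_s_given_z, p_t_given_sz)

chain_result = prentice_checks(chain)
direct_result = prentice_checks(direct)
print("chain model :", chain_result)
print("direct model:", direct_result)
```

In model B the surrogate still responds to treatment and still predicts the endpoint, yet it fails as a surrogate in Prentice's sense because part of the treatment effect bypasses it.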
Prentice’s criteria vs. Real world
[Diagram: Intervention; Disease, Surrogate, Clinically Meaningful Endpoint.]
[Diagram, real world: Intervention; Disease, Surrogate, Clinically Meaningful Endpoint; with added labels: unintended intervention consequences, other causal factors, other causal factors (disease related), other causal factors ("noise").]
Real world biomarkers
Proportion of treatment effect explained (PE): PE = (β − βs)/β = 1 − βs/β
Relative effectiveness (RE): RE = β/α
Z: Treatment, S: Surrogate, T: Endpoint
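A small numeric sketch of PE and RE follows. The slide does not define the coefficients, so this example assumes the usual regression reading: α is the effect of Z on S, β is the effect of Z on T, and βs is the effect of Z on T after adjusting for S. The data are synthetic and noise-free by construction, and the helper `ols` is a made-up name for this illustration.

```python
import numpy as np


def ols(columns, y):
    """Least-squares coefficients for y ~ intercept + columns."""
    X = np.column_stack([np.ones(len(y))] + list(columns))
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Synthetic data (hypothetical): treatment arm z, surrogate s, endpoint t.
z = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
e = np.array([-1, 1, -1, 1, -1, 1, -1, 1], dtype=float)  # balanced within each arm
s = 2.0 * z + e              # treatment shifts the surrogate by 2
t = 1.5 * s + 0.5 * z        # endpoint driven mostly by s, plus a direct effect of z

alpha = ols([z], s)[1]       # assumed meaning: effect of Z on S
beta = ols([z], t)[1]        # assumed meaning: effect of Z on T
beta_s = ols([z, s], t)[1]   # assumed meaning: effect of Z on T, adjusted for S

pe = 1.0 - beta_s / beta     # proportion of treatment effect explained
re = beta / alpha            # relative effectiveness

print(f"alpha={alpha:.3f} beta={beta:.3f} beta_s={beta_s:.3f}")
print(f"PE={pe:.3f} RE={re:.3f}")
```

Here the direct term `0.5 * z` is what keeps PE below 1; setting it to zero would give PE = 1, the situation Prentice's fourth criterion describes.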
Biomarker characteristics
Clinical relevance: Existence of a strong mechanistic molecular or biochemical basis for the biomarker to be influenced by exposure to treatment
Sensitivity and specificity: Ability to detect the intended measurement or change in target population via a given mechanism
Reliability: Ability to measure the biomarker with accuracy, precision, robustness, and reproducibility
Practicality: Ability to measure the biomarker in a minimally invasive way
Simplicity: Feasibility of widespread clinical adoption
Biomarker validation/qualification
Individual validation (measurement) – Successfully measures a quantifiable characteristic both objectively and reproducibly
Internal validation (study) – Correlates with clinical endpoint, adds accuracy to precision and reproducibility
External validation – Demonstrates similar predictive power in other populations or in other related treatment studies
Broad qualification – Can be used as a surrogate in evaluating other classes of disease
Imaging biomarkers
Imaging biomarker validation
– What imaging biomarkers are available?
– What is the uncertainty of imaging biomarkers?
Imaging biomarker qualification
– What should be correlated to the clinical events?
– How far in biomarker qualification are we?
MICAD: Molecular Imaging and Contrast Agent Database
1260 agents listed (July 2012)
But can we really use them all?
[Figure: agent development stages: 1. Credentialing, 2. Modality creation, 3. Supporting tools, 4. Development, 5. Clinical trials. Figure labels: 100%, 10%, 1%; Regulatory approval (eIND, RDRC); Regulatory approval (full IND); Multicenter trial infrastructure (NCI CIP, ACRIN).]
What do biomarkers really show?
What does the FDG show?
[Figure: Pre FDG SUV vs. Pre FLT SUV, color scale Pre Cu-ATSM SUV. Labels: Proliferation, Hypoxia, Inflammation.]
Specificity
[Figure: Proliferation ([18F]FLT) and hypoxia ([64Cu]Cu-ATSM) images; region labels: proliferative, hypoxic, proliferative and hypoxic.]
Sensitivity and specificity
[Figure: sensitivity (LOW to HIGH) vs. specificity (LOW to HIGH).]
Weissleder 2001, Radiology 219, 316
Extraction of biological information
[Figure: FLT PET/CT images at 1 min, 15 min, and 60 min.]
PET imaging uncertainties
Technical factors
– Relative calibration between PET scanner and dose calibrator (10%)
– Residual activity in syringe (5%)
– Incorrect synchronization of clocks (10%)
– Injection vs calibration time (10%)
– Quality of administration (50%)
Physical factors
– Scan acquisition parameters (15%)
– Image reconstruction parameters (30%)
– Use of contrast agents (15%)
Analytical factors
– Region of interest (ROI) definition (50%)
– Image processing (25%)
Biological factors
– Patient positioning (15%)
– Patient breathing (30%)
– Uptake period (15%)
– Blood glucose levels (15%)
Jeraj et al 2011, in Uncertainties in ext. beam RT
Boellaard et al 2009, J Nucl Med 50: 11S
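To make two of the technical factors concrete, the sketch below computes a body-weight SUV and shows how a clock mismatch or unaccounted syringe residual propagates into it. This is an assumed, simplified formulation (body-weight normalization, 18F decay correction to scan time, residual treated as measured at the assay time); the function names and input values are hypothetical and are not taken from the cited references.

```python
F18_HALF_LIFE_MIN = 109.77  # physical half-life of fluorine-18 in minutes


def decay_factor(elapsed_min, half_life_min=F18_HALF_LIFE_MIN):
    """Fraction of activity remaining after elapsed_min minutes."""
    return 0.5 ** (elapsed_min / half_life_min)


def suv_bw(conc_kbq_per_ml, assayed_mbq, residual_mbq, assay_to_scan_min, weight_kg):
    """Body-weight SUV (g/mL): tissue concentration over injected activity per gram.

    Simplification: the syringe residual is assumed to be expressed at the
    same reference time as the assayed activity.
    """
    injected_kbq = (assayed_mbq - residual_mbq) * 1000.0 * decay_factor(assay_to_scan_min)
    return conc_kbq_per_ml / (injected_kbq / (weight_kg * 1000.0))


# Hypothetical reference case.
reference = suv_bw(5.0, 370.0, 0.0, 60.0, 70.0)

# Clocks off by 10 minutes: the elapsed time is recorded as 50 min instead of 60 min.
clock_error = suv_bw(5.0, 370.0, 0.0, 50.0, 70.0)

# 5% of the assayed activity actually stayed in the syringe; this is the SUV
# obtained once that residual is accounted for.
with_residual = suv_bw(5.0, 370.0, 0.05 * 370.0, 60.0, 70.0)

print(f"reference SUV      : {reference:.3f}")
print(f"10 min clock error : {100 * (clock_error / reference - 1):+.1f}%")
print(f"residual corrected : {100 * (with_residual / reference - 1):+.1f}%")
```

Under these assumptions a 10 minute timing error alone shifts the SUV by roughly 6%, purely through the decay correction.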
Efforts in quantitative imaging
– Organized efforts started on behalf of pharma, FDA
– Initial efforts started within professional societies: RSNA, AAPM, ACR, SNM, ISMRM
– 2004: Image Response Assessment Teams (IRAT): first organized initiative by NCI/AACI
– 2006: Imaging as a biomarker: large workshop at NIST including pharma, FDA, NIH, academia, societies
– 2008: Quantitative Imaging Biomarkers Alliance (QIBA): drug and equipment industries, imaging, societies focusing on CT, PET, MRI
– 2008: Clinical Translational Science Awards Imaging Working Group (CTSA IWG): initiatives like UPICT
– 2008: Quantitative imaging for evaluation of responses to cancer therapies: funding initiative (U01 RFA 08 255)
– 2010: Coming to Consensus on Standards for Imaging Endpoints: NIH workshop between FDA, SNM, RSNA
– 2010: ITART 2010: specialized conference on treatment assessment and quantitative imaging (AAPM, ASTRO, ESTRO, RSNA, NCI)
Repeatability
• Repeatability results of double baseline 18F-FDG PET scans were similar for all SUV parameters assessed
• Centralized QA and centralized data analysis improved intra-subject CV from 15.9% to 10.7% for averaged SUVmax
Velasquez et al 2009, J Nucl Med 50: 1646
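An intra-subject coefficient of variation can be estimated from double baseline (test/retest) scans. The sketch below uses one common estimator based on paired relative differences; it is not necessarily the estimator used by Velasquez et al, and the SUV values are hypothetical.

```python
import numpy as np


def within_subject_cv(scan1, scan2):
    """Within-subject CV from paired test/retest measurements.

    For each subject, the variance of two measurements is d**2 / 2, so the
    squared CV is (d / mean)**2 / 2; the result is the root mean of these.
    """
    a = np.asarray(scan1, dtype=float)
    b = np.asarray(scan2, dtype=float)
    pair_mean = (a + b) / 2.0
    rel_diff = (a - b) / pair_mean
    return float(np.sqrt(np.mean(rel_diff ** 2) / 2.0))


# Hypothetical SUVmax values from two baseline scans of the same lesions.
baseline_1 = [11.0, 6.3, 8.1, 4.9]
baseline_2 = [9.0, 5.7, 8.8, 5.3]

cv = within_subject_cv(baseline_1, baseline_2)
print(f"intra-subject CV: {100 * cv:.1f}%")
```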
Imaging biomarkers
Imaging biomarker validation
– What imaging biomarkers are available?
– What is the uncertainty of imaging biomarkers?
Imaging biomarker qualification
– What should be correlated to the clinical events?
– How far in biomarker qualification are we?
PET based response assessment
EORTC, NCI Recommendations (1999, 2005) [1,2]
– SUV based approach
– SUVmean and SUVmax
– Response categories with thresholds (CR, PR, SD, PD)
– Problems
• SUVmean: collapses information, sensitivity issues
• SUVmax: noise contamination
• fails to use all available functional data
• distribution
• heterogeneity
• no response threshold validation
• few sensitivity studies
• alternative measures
PET Response Criteria in Solid Tumors (PERCIST) (2009) [3]
– SUVpeak
[1] Young et al 1999, [2] Shankar et al 2006, [3] Wahl et al 2009
Definition of the measures
[Figure: SUVpeak and tumor response vs. ROI diameter (7.5 to 20.0 mm) for different SUVpeak definitions: circle or sphere, centered on SUVmax or on the highest uptake region.]
Vanderhoek et al 2012, J Nucl Med 53: 4-11
Images are more than just one number!
Size measures:
– Volume
– 1D size (axial)
Standardized Uptake Value (SUV) measures:
– SUVmean
– SUVtotal
– SUVmax
– SUVpeak
Uptake non-uniformity measure:
– SUVsd
[Figure: tumor image annotated with volume and 1D size (axial), and a histogram of number of voxels vs. standardized uptake value annotated with SUVmean, SUVmax, SUVpeak, SUVtotal, and SUVsd.]
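The measures listed above can all be derived from the same set of voxels. The sketch below is a minimal illustration under stated assumptions: SUVtotal is taken as SUVmean times volume, and SUVpeak as the mean of a 3x3x3 voxel neighbourhood centered on the hottest voxel (one of several possible definitions, and it may include voxels outside the ROI). The function name and the toy image are hypothetical.

```python
import numpy as np


def suv_measures(suv, mask, voxel_volume_ml):
    """Summary measures for the voxels of `suv` selected by boolean `mask`."""
    suv = np.asarray(suv, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    vals = suv[mask]

    # Location of the hottest voxel inside the ROI.
    idx = np.unravel_index(np.argmax(np.where(mask, suv, -np.inf)), suv.shape)
    # 3x3x3 neighbourhood around it, clipped at the image border (assumed definition).
    box = tuple(slice(max(i - 1, 0), i + 2) for i in idx)

    volume_ml = vals.size * voxel_volume_ml
    return {
        "volume_ml": volume_ml,
        "SUVmean": float(vals.mean()),
        "SUVmax": float(vals.max()),
        "SUVpeak": float(suv[box].mean()),
        "SUVtotal": float(vals.mean() * volume_ml),   # assumed: mean uptake x volume
        "SUVsd": float(vals.std(ddof=1)),             # uptake non-uniformity
    }


# Toy image: uniform background uptake of 1 with a single hot voxel of 10.
image = np.ones((5, 5, 5))
image[2, 2, 2] = 10.0
roi = np.ones_like(image, dtype=bool)

measures = suv_measures(image, roi, voxel_volume_ml=0.5)
for name, value in measures.items():
    print(f"{name:10s} {value:.3f}")
```

Even in this toy case the single hot voxel dominates SUVmax, barely moves SUVmean, and contributes to SUVpeak only in proportion to the neighbourhood size, which is why the choice of measure matters.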
Ambiguity of response
[Figure: FLT PET/CT, pre treatment and post treatment; SUV color scale 0 to 18.]
[Figure: Response (%) by tumor for SUVmean, SUVmax, SUVpeak, and SUVtotal, with Progressive Disease, Stable Disease, and Partial Response regions; tumors whose category depends on the measure are marked as ambiguous response.]
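The ambiguity can be reproduced with a few lines of code: compute the percent change of each SUV measure and assign a category from thresholds. The thresholds below (±30%) are placeholders chosen for illustration, not the values used in the figure, and the pre/post values are hypothetical. Complete response is left out for brevity.

```python
def percent_change(pre, post):
    """Relative change from pre to post treatment, in percent."""
    return 100.0 * (post - pre) / pre


def classify(change_pct, pr_threshold=-30.0, pd_threshold=30.0):
    """Map a percent change to PR / SD / PD using illustrative thresholds."""
    if change_pct <= pr_threshold:
        return "PR"
    if change_pct >= pd_threshold:
        return "PD"
    return "SD"


# Hypothetical single lesion measured four ways, before and after treatment.
pre = {"SUVmean": 4.0, "SUVmax": 10.0, "SUVpeak": 8.0, "SUVtotal": 100.0}
post = {"SUVmean": 3.2, "SUVmax": 6.0, "SUVpeak": 5.2, "SUVtotal": 140.0}

categories = {m: classify(percent_change(pre[m], post[m])) for m in pre}
ambiguous = len(set(categories.values())) > 1

for measure, category in categories.items():
    print(f"{measure:9s} {percent_change(pre[measure], post[measure]):+6.1f}%  {category}")
print("ambiguous response:", ambiguous)
```

In this made-up lesion the uptake falls while the volume grows, so the same scan pair reads as stable, responding, or progressing depending on the measure.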
FDG PET as a potential biomarker
HNSCC: Negative FDG PET results post chemoRT have a high NPV (95%), but low PPV (50%) (Schöder et al 2009, J Nucl Med, 50:74S)
NSCLC: 80% decrease in FDG PET SUVmax post chemoRT has 90% sensitivity, 100% specificity, and 96% accuracy for predicting pathologic response (Cerfolio et al 2004, Ann Thorac Surg, 78:1903)
Rectal cancer: 70% decrease in FDG PET SUVmax post chemoRT has 79% specificity, 81% sensitivity, 77% PPV, 89% NPV and 80% accuracy for predicting pathological response (Capirci et al 2007, Eur J Nucl Med Mol Imaging, 34:1583)
Esophageal cancer: Mixed results: in adenocarcinomas negative FDG PET post chemoRT has a high PPV, elsewhere inconclusive (Krause et al 2009, J Nucl Med, 50:89S)
Except for FDG PET (still debatable), no other imaging biomarker has got far in the qualification process
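For reference, the accuracy figures quoted above all derive from a 2x2 table of imaging result against the reference standard. The sketch below shows the arithmetic; the counts are hypothetical, chosen only so that NPV is high and PPV is low, and are not data from any of the cited studies.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures from a 2x2 confusion table."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),   # positives correctly detected
        "specificity": tn / (tn + fp),   # negatives correctly excluded
        "PPV": tp / (tp + fp),           # positive scans that are truly positive
        "NPV": tn / (tn + fn),           # negative scans that are truly negative
        "accuracy": (tp + tn) / total,
    }


# Hypothetical counts for 50 patients.
metrics = diagnostic_metrics(tp=5, fp=5, fn=2, tn=38)
for name, value in metrics.items():
    print(f"{name:12s} {100 * value:.0f}%")
```

With a low prevalence of true positives, a few false positive scans are enough to pull PPV down while NPV stays high.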
FDG PET vs Time to progression
FDG SUV measure   Pre treatment    3 months post RT   6 months post RT
                  p val (N=19)     p val (N=16)       p val (N=11)
SUVmean           0.94             0.005              0.0002
SUVmax            0.86             0.017              0.003
SUVpeak           0.90             0.046              0.004
SUVtotal          0.51             0.047              0.006
[Figure: Days to progression vs. FDG SUVmean in three panels: pre treatment, 3 months post RT, 6 months post RT.]
Imaging normal tissue
Anatomical imaging of structural changes (e.g., CT):
– Often identified months to years following RT, too late to ameliorate effects of radiation injury
– Even then often not related to symptomatic injury
Functional imaging of organ function (e.g., DCE MRI):
– Evaluation of the organ function, or its reduction in response to radiation therapy
– Many organ specific choices (e.g., brain, heart), but not much used
Molecular imaging of cellular processes (e.g., FDG PET):
– Reflects pathophysiological processes
– Early detection of normal tissue injury, which would allow further intervention to ameliorate radiation injury
Imaging normal tissue damage
Hart et al 2008, Int J Rad Oncol Biol Phys, 71: 967.
Conclusions
Imaging biomarkers should not be taken lightly!
– They should be subject to the same stringent criteria as other biomarkers
We are just at the beginning of biomarker validation:
– Large uncertainties of the molecular imaging assays:
• Scanner harmonization
• Uniform protocols and definition of assays
• Central review
• Extensive test/retest studies
– Poor exploration of available imaging information:
• Comprehensive imaging metrics
• Multiple molecular imaging agents/modalities
– Need randomized Phase III studies with imaging endpoints
“A correlate does not a surrogate make”
– Complexity of correlations (e.g., histology)