Health Evidence Review Commission's Genetics Advisory Panel

October 13, 2015 9:00 AM

Barbara Roberts Human Services Building, Room 559
500 Summer St NE
Salem, OR 97301

Section 1.0 Call to Order

AGENDA
Genetics Advisory Panel (GAP)
October 13, 2015, 9:00 am – 11:00 am
Teleconference
Public location: Human Services Building, Room 559, 500 Summer Street NE, Salem, OR 97301

(All agenda items are subject to change and times listed are approximate)

# Time Item Presenter

1 9:00 AM Call to Order & Introductions Karen Kovak

2 9:05 AM Purpose of Meeting Ariel Smits

3 9:10 AM Review of New Genetics CPT Codes for 2016 Ariel Smits, Karen Kovak

4 10:10 AM Revisions to the non-prenatal genetic testing guideline Ariel Smits

5 10:50 AM Public Comment

6 11:00 AM Adjournment Karen Kovak

Highlights

Genetics Advisory Panel
Conference call hosted at: General Services Building, Bachelor Butte Conference Room, 1225 Ferry Street, Salem, Oregon
10/16/2014, 10:00-11:30 am

Members Present: Karen Kovak; Kathryn Murray; Sudge Budden, MD; Sue Richards, PhD.

Staff Present: Darren Coffman; Ariel Smits, MD, MPH; Denise Taray.

Also Attending: Devki Saraiya, Myriad Genetics

Review of New Genetics CPT Codes for 2015

The following recommendations were suggested for staff to present to the Value-based Benefits Subcommittee at their November 13, 2014 meeting:

Columns: CPT Code; Descriptor; Recommendation (Cover, Don't Cover); Impact of detecting a mutation (Change treatment, Change health monitoring, Provide prognosis, Provide genetic counseling)

81410 Aortic dysfunction or dilation (eg, Marfan syndrome, Loeys Dietz syndrome, Ehler Danlos syndrome type IV, arterial tortuosity syndrome); genomic sequence analysis panel, must include sequencing of at least 9 genes, including FBN1, TGFBR1, TGFBR2, COL3A1,
Marked: X X X X X

GAP Highlights 10-16-2014 Page 1

81411 Aortic dysfunction or dilation (eg, Marfan syndrome, Loeys Dietz syndrome, Ehler Danlos syndrome type IV, arterial tortuosity syndrome); duplication/deletion analysis panel, must include analyses for TGFBR1, TGFBR2, MYH11, and COL3A1
Marked: X X X X X

81415 Exome (eg, unexplained constitutional or heritable disorder or syndrome); sequence analysis
Marked: X X X X

81416 Exome (eg, unexplained constitutional or heritable disorder or syndrome); sequence analysis, each comparator exome (eg, parents, siblings) (List separately in addition to code for primary procedure)
Marked: X X X X

81417 Exome (eg, unexplained constitutional or heritable disorder or syndrome); re-evaluation of previously obtained exome sequence (eg, updated knowledge or unrelated condition/syndrome)
Marked: X X X X

81425 Genome (eg, unexplained constitutional or heritable disorder or syndrome); sequence analysis
Marked: X

81426 Genome (eg, unexplained constitutional or heritable disorder or syndrome); sequence analysis, each comparator genome (eg, parents, siblings) (List separately in addition to code for primary procedure)
Marked: X

81427 Genome (eg, unexplained constitutional or heritable disorder or syndrome); re-evaluation of previously obtained genome sequence (eg, updated knowledge or unrelated condition/syndrome)
Marked: X

81430 Hearing loss (eg, nonsyndromic hearing loss, Usher syndrome, Pendred syndrome); genomic sequence analysis panel, must include sequencing of at least 60 genes, including CDH23, CLRN1, GJB2, GPR98, MTRNR1, MYO7A, MYO15A, PCDH15, OTOF, SLC26A4, TMC1, TMPRSS3
Marked: X X X X X

81431 Hearing loss (eg, nonsyndromic hearing loss, Usher syndrome, Pendred syndrome); duplication/deletion analysis panel, must include copy number analyses for STRC and DFNB1 deletions in GJB2 and GJB6 genes
Marked: X X X X X

81435 Hereditary colon cancer syndromes (eg, Lynch syndrome, familial adenomatosis polyposis); genomic sequence analysis panel, must include analysis of at least 7 genes, including APC, CHEK2, MLH1, MSH2, MSH6, MUTYH, and PMS2
Marked: X X

81436 Hereditary colon cancer syndromes (eg, Lynch syndrome, familial adenomatosis polyposis); duplication/deletion analysis panel, must include analysis of at least 8 genes, including APC, MLH1, MSH2, MSH6, PMS2, EPCAM, CHEK2, and MUTYH
Marked: X X

81440 Nuclear encoded mitochondrial genes (eg, neurologic or myopathic phenotypes), genomic sequence panel, must include analysis of at least 100 genes, including BCS1L, C10orf2, COQ2, COX10, DGUOK, MPV17, OPA1, PDSS2, POLG, POLG2, RRM2B, SCO1, SCO2, SLC25A4, S
Marked: X X X X X

81460 Whole mitochondrial genome (eg, Leigh syndrome, mitochondrial encephalomyopathy, lactic acidosis, and stroke-like episodes [MELAS], myoclonic epilepsy with ragged-red fibers [MERFF], neuropathy, ataxia, and retinitis pigmentosa [NARP], Leber hereditary op
Marked: X X X X X

81465 Whole mitochondrial genome large deletion analysis panel (eg, Kearns-Sayre syndrome, chronic progressive external ophthalmoplegia), including heteroplasmy detection, if performed
Marked: X X X X X

81470 X-linked intellectual disability (XLID) (eg, syndromic and non-syndromic XLID); genomic sequence analysis panel, must include sequencing of at least 60 genes, including ARX, ATRX, CDKL5, FGD1, FMR1, HUWE1, IL1RAPL, KDM5C, L1CAM, MECP2, MED12, MID1, OCRL,

81471 X-linked intellectual disability (XLID) (eg, syndromic and non-syndromic XLID); duplication/deletion gene analysis, must include analysis of at least 60 genes, including ARX, ATRX, CDKL5, FGD1, FMR1, HUWE1, IL1RAPL, KDM5C, L1CAM, MECP2, MED12, MID1, OCRL,


Staff will follow up on the following additional items and present any additional input directly to the VbBS for consideration:
• Should the suggestion to cover 81417 be reconsidered in light of Cary Harding’s comment that “If any validation of an exome result is necessary, that should be done by another method rather than repeating an exome analysis”?
• Solicit member comments on additional potential changes to the non-prenatal genetic testing guideline.
• Should there be any restrictions on mitochondrial genome testing?
• Contact experts regarding the X-linked intellectual disability testing:
  o Need more information regarding recommendations to cover or not cover, and any restrictions in coverage
  o Non-prenatal genetic testing guideline section on intellectual disability testing would need to be changed to accommodate this type of testing if covered
• Review USPSTF vs NCCN guidelines on what defines a high risk patient for breast cancer testing.


Section 2.0 New Codes 2016

Non-Prenatal Genetic Testing CPT Code Recommendations
Date: 10/13/15

Columns: CPT Code; Descriptor; Recommendation (Cover, Don't Cover); Impact of detecting a mutation (Change treatment, Change health monitoring, Provide prognosis, Provide genetic counseling); Limitations & Comments

81162 BRCA1, BRCA2 (breast cancer 1 and 2) (eg, hereditary breast and ovarian cancer) gene analysis; full sequence analysis and full duplication/deletion analysis

81412 Ashkenazi Jewish associated disorders (eg, Bloom syndrome, Canavan disease, cystic fibrosis, familial dysautonomia, Fanconi anemia group C, Gaucher disease, Tay-Sachs disease), genomic sequence analysis panel, must include sequencing of at least 9 genes, including ASPA, BLM, CFTR, FANCC, GBA, HEXA, IKBKAP, MCOLN1, and SMPD1

81432 Hereditary breast cancer-related disorders (eg, hereditary breast cancer, hereditary ovarian cancer, hereditary endometrial cancer); genomic sequence analysis panel, must include sequencing of at least 14 genes, including ATM, BRCA1, BRCA2, BRIP1, CDH1, MLH1, MSH2, MSH6, NBN, PALB2, PTEN, RAD51C, STK11, and TP53

81433 Hereditary breast cancer-related disorders (eg, hereditary breast cancer, hereditary ovarian cancer, hereditary endometrial cancer); duplication/deletion analysis panel, must include analyses for BRCA1, BRCA2, MLH1, MSH2, and STK11

81434 Hereditary retinal disorders (eg, retinitis pigmentosa, Leber congenital amaurosis, cone-rod dystrophy), genomic sequence analysis panel, must include sequencing of at least 15 genes, including ABCA4, CNGA1, CRB1, EYS, PDE6A, PRPF31, PRPH2, RDH12, RHO, RP1, RP2, RPE65, RPGR, and USH2A

81437 Hereditary neuroendocrine tumor disorders (eg, medullary thyroid carcinoma, parathyroid carcinoma, malignant pheochromocytoma or paraganglioma); genomic sequence analysis panel, must include sequencing of at least 6 genes, including MAX, SDHB, SDHC, SDHD, TMEM127, and VHL

81438 Hereditary neuroendocrine tumor disorders (eg, medullary thyroid carcinoma, parathyroid carcinoma, malignant pheochromocytoma or paraganglioma); duplication/deletion analysis panel, must include analyses for SDHB, SDHC, SDHD, and VHL

81442 Noonan spectrum disorders (eg, Noonan Syndrome, cardio-facio-cutaneous syndrome, Costello syndrome, LEOPARD Syndrome, Noonan-like syndrome), genomic sequence analysis panel, must include sequencing of at least 12 genes, including BRAF, CBL, HRAS, KRAS, MAP2K1, MAP2K2, NRAS, PTPN11, RAF1, RIT1, SHOC2, and SOS1

81490 Autoimmune (rheumatoid arthritis), analysis of 12 biomarkers using immunoassays, utilizing serum, prognostic algorithm reported as a disease activity score

81493 Coronary artery disease, mRNA, gene expression profiling by real-time RT-PCR of 23 genes, utilizing whole peripheral blood, algorithm reported as a risk score

81595 Cardiology (heart transplant), mRNA, gene expression profiling by real-time quantitative PCR of 20 genes (11 content and 9 housekeeping), utilizing subfraction of peripheral blood, algorithm reported as rejection risk score

2016 Genetic Testing CPT Code Review

81162
1) BRCA1, BRCA2 (breast cancer 1 and 2) (eg, hereditary breast and ovarian cancer) gene analysis; full sequence analysis and full duplication/deletion analysis
2) Other BRCA gene testing codes (81211, 81212, 81213) are Diagnostic
3) Addressed in the 2015 NCCN Clinical Practice Guidelines in Oncology: Genetic/Familial High-Risk Assessment: Breast and Ovarian. V2.2015 (6/25/15)
4) HERC staff recommendation:
   a. Diagnostic Procedures File
   b. Add to the non-prenatal genetic testing guideline

81412
1) Ashkenazi Jewish associated disorders (eg, Bloom syndrome, Canavan disease, cystic fibrosis, familial dysautonomia, Fanconi anemia group C, Gaucher disease, Tay-Sachs disease), genomic sequence analysis panel, must include sequencing of at least 9 genes, including ASPA, BLM, CFTR, FANCC, GBA, HEXA, IKBKAP, MCOLN1, and SMPD1
2) Indicated for preconception/early pregnancy testing when both parents are of Ashkenazi Jewish ancestry, when the spouse is a known carrier of an Ashkenazi Jewish genetic disorder, or when there is a known family history of an Ashkenazi Jewish genetic disorder. Also used for suspected clinical diagnosis of an individual disorder in a patient
3) The prenatal genetic testing guideline includes coverage for:
   a. Item 12: Screening for cystic fibrosis carrier status once in a lifetime (CPT 81220-81224)
   b. Item 15: Screening those with Ashkenazi Jewish heritage for Canavan disease (CPT 81200), familial dysautonomia (CPT 81260), and Tay-Sachs carrier status (CPT 81255)
4) HERC staff recommendation:
   a. Consider adding coverage for screening those with Ashkenazi Jewish heritage for Ashkenazi Jewish associated disorders (CPT 81412) if this test would replace 81200, 81260, 81255, and 81220-81224 and be of similar or lower cost than the individual tests.
   b. If adopted, would need to amend the prenatal genetic testing guideline

81432-81433
1) 81432: Hereditary breast cancer-related disorders (eg, hereditary breast cancer, hereditary ovarian cancer, hereditary endometrial cancer); genomic sequence analysis panel, must include sequencing of at least 14 genes, including ATM, BRCA1, BRCA2, BRIP1, CDH1, MLH1, MSH2, MSH6, NBN, PALB2, PTEN, RAD51C, STK11, and TP53
2) 81433: Hereditary breast cancer-related disorders (eg, hereditary breast cancer, hereditary ovarian cancer, hereditary endometrial cancer); duplication/deletion analysis panel, must include analyses for BRCA1, BRCA2, MLH1, MSH2, and STK11
3) Addressed in the 2015 NCCN Clinical Practice Guidelines in Oncology: Genetic/Familial High-Risk Assessment: Breast and Ovarian. V2.2015 (6/25/15)
   a. Recommends multigene testing when more than one gene mutation could explain the family history, or when single gene testing is negative in certain clinical situations, with genetic counseling input before and after testing
4) HERC staff recommendation:
   a. Diagnostic Procedures File
   b. Add to non-prenatal genetic testing guideline

81437-81438
1) 81437: Hereditary neuroendocrine tumor disorders (eg, medullary thyroid carcinoma, parathyroid carcinoma, malignant pheochromocytoma or paraganglioma); genomic sequence analysis panel, must include sequencing of at least 6 genes, including MAX, SDHB, SDHC, SDHD, TMEM127, and VHL
2) 81438: Hereditary neuroendocrine tumor disorders (eg, medullary thyroid carcinoma, parathyroid carcinoma, malignant pheochromocytoma or paraganglioma); duplication/deletion analysis panel, must include analyses for SDHB, SDHC, SDHD, and VHL
3) HERC staff recommendation:
   a. Need GAP input

81442

1) Noonan spectrum disorders (eg, Noonan Syndrome, cardio-facio-cutaneous syndrome, Costello syndrome, LEOPARD Syndrome, Noonan-like syndrome), genomic sequence analysis panel, must include sequencing of at least 12 genes, including BRAF, CBL, HRAS, KRAS, MAP2K1, MAP2K2, NRAS, PTPN11, RAF1, RIT1, SHOC2, and SOS1
   a. Lepri 2014
      i. Validation study of multi gene sequencing for Noonan Syndrome
      ii. Conclusion: Here we show how molecular testing of RASopathies by targeted NGS could allow an early and accurate diagnosis for all enrolled patients, enabling a prompt diagnosis especially for those patients with mild, non-specific or atypical features, in whom the detection of the causative mutation usually requires prolonged diagnostic timings when using standard routine. This approach strongly improved genetic counselling and clinical management.
2) HERC staff recommendation:
   a. Need GAP input

81490
1) Autoimmune (rheumatoid arthritis), analysis of 12 biomarkers using immunoassays, utilizing serum, prognostic algorithm reported as a disease activity score
2) Evidence
   a. Centola 2015
      i. Development of the 12 biomarker algorithm
      ii. Multibiomarker statistical models outperformed individual biomarkers at estimating disease activity. Biomarker-based scores were significantly correlated with DAS28-CRP and could discriminate patients with low vs. moderate/high clinical disease activity. Such scores were also able to track changes in DAS28-CRP and were significantly associated with both joint inflammation measured by ultrasound and damage progression measured by radiography. The final MBDA algorithm uses 12 biomarkers to generate an MBDA score between 1 and 100. No significant effects on the MBDA score were found for common comorbidities.
      iii. Conclusion: We followed a stepwise approach to develop a quantitative serum-based measure of RA disease activity, based on 12 biomarkers, which was consistently associated with clinical disease activity levels.
   b. Michaud 2015
      i. Study of the effect of the RA multibiomarker disease activity (MBDA) blood test on clinical progression and cost of treatment
      ii. Use of the MBDA test is projected to improve HAQ scores by 0.09 units in year 1, declining to 0.02 units after 10 years. Over the 10 year time horizon, quality-adjusted life years increased by 0.08 years and costs decreased by US$457 (cost savings in disability-related medical costs, US$659; in productivity costs, US$2137). The most influential variable in the analysis was the effect of the MBDA test on clinician treatment recommendations and subsequent HAQ changes.
      iii. Conclusion: The MBDA test aids in the assessment of disease activity in patients with RA by changing treatment decisions, improving the functional status of patients and cost savings. Further validation is ongoing and future longitudinal studies are warranted.
   c. Hambardzumyan 2015
      i. Study of multi-biomarker disease activity (MBDA) score, based on 12 serum biomarkers, as a baseline predictor for 1-year radiographic progression (RP) in early RA
      ii. MBDA score was an independent predictor of RP as a continuous (OR=1.05, 95% CI 1.02 to 1.08) and dichotomised variable (high versus low/moderate, OR=3.86, 95% CI 1.04 to 14.26).
      iii. Conclusions: In patients with early RA, the MBDA score at baseline was a strong independent predictor of 1-year RP. These results suggest that when choosing initial treatment in early RA the MBDA test may be clinically useful to identify a subgroup of patients at low risk of RP.
3) HERC staff recommendation:
   a. Services Recommended for Non-Coverage Table
      i. Experimental
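The Michaud 2015 projection quoted above (quality-adjusted life years up by 0.08, costs down by US$457 over 10 years) describes a strategy that is both cheaper and more effective. Health economic analyses usually call such a strategy "dominant" and do not report an incremental cost-effectiveness ratio (ICER) for it. The Python sketch below illustrates that classification. The function name, the labels, and the default willingness-to-pay threshold are illustrative assumptions and are not taken from the study.

```python
from typing import Optional, Tuple


def classify_strategy(delta_cost: float, delta_qaly: float,
                      threshold: float = 100_000.0) -> Tuple[str, Optional[float]]:
    """Classify a new strategy against its comparator.

    delta_cost: incremental cost (new minus comparator), in dollars.
    delta_qaly: incremental QALYs (new minus comparator).
    threshold:  ASSUMED willingness-to-pay per QALY (illustrative default).
    Returns (label, icer); icer is None when a ratio is not meaningful.
    """
    if delta_qaly > 0 and delta_cost <= 0:
        return "dominant", None        # more effective and no more costly
    if delta_qaly <= 0 and delta_cost >= 0:
        return "dominated", None       # no health gain and no savings
    if delta_qaly == 0:
        return "cost-saving, no health change", None
    icer = delta_cost / delta_qaly
    if delta_qaly > 0:
        label = "cost-effective" if icer <= threshold else "not cost-effective"
    else:
        # Cheaper but less effective: the ratio is savings per QALY forgone.
        label = "trade-off"
    return label, icer


# Incremental figures as quoted above for the MBDA test (Michaud 2015).
print(classify_strategy(delta_cost=-457.0, delta_qaly=0.08))
```

A projected result of this kind still rests on the model's inputs, which is why the summary above singles out the effect of the test on treatment recommendations as the most influential variable.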

81493
1) Coronary artery disease, mRNA, gene expression profiling by real-time RT-PCR of 23 genes, utilizing whole peripheral blood, algorithm reported as a risk score
2) Proprietary algorithm test which can help predict whether a patient’s symptoms are due to obstructive coronary artery disease
3) Evidence
   a. Rosenberg 2010
      i. Validation study
      ii. N=526 non-diabetic patients clinically indicated for invasive coronary angiography
         1. Excluded patients with chronic inflammatory disorders, elevated white blood counts or cardiac protein markers, and diabetes.
      iii. Results: At a score threshold corresponding to 20% obstructive CAD likelihood (14.75), the sensitivity and specificity were 85% and 43%, yielding NPV of 83% and PPV of 46%, with 33% of patient scores below this threshold.
      iv. Conclusions: This non-invasive whole blood test, based on gene expression and demographics, may be useful for assessment of obstructive CAD in non-diabetic patients without known CAD.
   b. Phelps 2015
      i. Cost effectiveness modeling for mRNA algorithm for CAD
      ii. The 2-threshold gene expression score (GES) strategy is the most cost-effective strategy at a threshold of $100,000 per QALY gained, with an ICER of approximately $72,000 per QALY gained relative to no testing. Myocardial perfusion imaging (MPI) alone and the 1-threshold strategy are weakly dominated. In sensitivity analysis, ICERs fall as the probability of obstructive CAD (oCAD) increases from the base case value of 15%. The ranking of ICERs among strategies is sensitive to test costs, including the time cost for testing. The analysis reveals ways to improve on prespecified GES thresholds.
      iii. Conclusions: Diagnostic testing for oCAD with a novel GES strategy in a 2-threshold model is cost effective by conventional standards. This diagnostic approach is more efficient than usual care of MPI alone or a 1-threshold GES strategy in most scenarios.
4) HERC staff recommendation:
   a. Services Recommended for Non-Coverage Table
      i. Experimental
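The predictive values quoted above for Rosenberg 2010 depend on how common obstructive CAD is in the tested population, not only on sensitivity and specificity. The Python sketch below applies Bayes' rule to show that relationship. The study prevalence is not stated in the summary above, so the value of 0.36 used here is an assumption chosen for illustration, and the function name is hypothetical.

```python
from typing import Tuple


def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> Tuple[float, float]:
    """Return (PPV, NPV) for a binary test via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity as quoted above; the prevalence is ASSUMED.
ppv, npv = predictive_values(sensitivity=0.85, specificity=0.43, prevalence=0.36)
print(f"prevalence 36%: PPV = {ppv:.2f}, NPV = {npv:.2f}")

# Same test at 15% prevalence, the base-case oCAD probability quoted for Phelps 2015.
ppv_low, npv_low = predictive_values(sensitivity=0.85, specificity=0.43, prevalence=0.15)
print(f"prevalence 15%: PPV = {ppv_low:.2f}, NPV = {npv_low:.2f}")
```

With the assumed 36% prevalence the results land close to the reported PPV of 46% and NPV of 83%. At the lower prevalence the NPV rises and the PPV falls, which is one reason a test of this kind is discussed mainly as a rule-out tool.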

81595
1) Cardiology (heart transplant), mRNA, gene expression profiling by real-time quantitative PCR of 20 genes (11 content and 9 housekeeping), utilizing subfraction of peripheral blood, algorithm reported as rejection risk score
2) Proprietary genetic test (AlloMap). AlloMap testing is intended to aid in the identification of heart transplant recipients who have a low probability of moderate/severe acute cellular rejection (ACR) at the time of testing.
3) Evidence
   a. CTAF 2010 review
      i. The AlloMap gene expression profile has a high negative predictive value, but a low positive predictive value. Thus it may be useful to avoid biopsy in stable patients, but the high false positive rate precludes its use to definitively diagnose acute cellular rejection. Endomyocardial biopsies will still need to be performed in all patients with elevated AlloMap scores and all patients with clinical signs of rejection.
      ii. The data only support strategies utilizing AlloMap in patients more than a year post-transplant.
      iii. Recommendation: It is recommended that the use of gene expression profiling meets Technology Assessment Criterion 1 through 5 for safety, effectiveness and improvement in health outcomes when used to manage heart transplant patients at least one year post-transplant.
4) Other policies
   a. Anthem BCBS 2015: AlloMap molecular expression testing is considered medically necessary as a non-invasive method of determining the risk of rejection in heart transplant recipients between 1 and 5 years post-transplant.
   b. Aetna 2015: Aetna considers the AlloMap gene expression profile medically necessary for monitoring rejection in heart transplant recipients more than six months post-heart transplant.
5) HERC staff recommendations:
   a. Add 81595 (Cardiology (heart transplant), mRNA, gene expression profiling) to the heart transplant lines
      i. 245 CONDITIONS REQUIRING HEART-LUNG AND LUNG TRANSPLANTATION
      ii. 268 CONGESTIVE HEART FAILURE, CARDIOMYOPATHY, MALIGNANT ARRHYTHMIAS, AND COMPLEX CONGENITAL HEART DISEASE
   b. Add a new guideline to lines 245 and 268

GUIDELINE NOTE XXX CARDIAC TRANSPLANT GENETIC TESTING FOR TRANSPLANT REJECTION
Lines 245, 268
Genetic testing for cardiac transplant rejection (CPT 81595) is included on these lines only for patients at least 1 year post transplant who are without clinical signs of rejection.
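As a rough illustration, the proposed guideline note reduces to two conditions on the patient. The Python sketch below encodes them. The function and parameter names are hypothetical, and "at least 1 year" is approximated as 365 days, which is an assumption because the note does not define the interval in days.

```python
from datetime import date


def cpt_81595_on_line(transplant_date: date, test_date: date,
                      clinical_signs_of_rejection: bool) -> bool:
    """True if the request fits the proposed guideline note for CPT 81595.

    Both conditions from the note must hold:
      1. the patient is at least 1 year post transplant (ASSUMED: 365 days), and
      2. the patient has no clinical signs of rejection.
    """
    days_post_transplant = (test_date - transplant_date).days
    return days_post_transplant >= 365 and not clinical_signs_of_rejection


# Hypothetical patient: transplanted 1 March 2014, tested 13 October 2015, stable.
print(cpt_81595_on_line(date(2014, 3, 1), date(2015, 10, 13), False))
```

This mirrors the CTAF 2010 reading summarized above: patients with clinical signs of rejection still proceed to endomyocardial biopsy rather than gene expression testing.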

OHP Medical Directors Comments
Non-Prenatal Genetic Testing Guideline

Right now, if a code doesn’t show up on guideline note D1 or D17, it’s not clear whether it was actually addressed and it is intended that it not be covered, or if it simply wasn’t addressed. It would be extremely helpful to list all the codes, what they are used for and when they are covered. I’ve started such a document for my staff with the assistance of a geneticist, but have no idea how evidence based this is, or even accurate. At this point, it is expert opinion.

The private OBs and surgeons are the ones who are ordering most of these tests, although I get an occasional one from primary care. The company that the private doctors use is Myriad, and they are the ones who do these big panels. When I ask couldn’t we just do the BRCA test, the answer I get is that that is not good medicine, you need to do these other tests too because sometimes they will uncover things that the BRCA test alone would not show. I am not sure what the science is in all of this but I am very concerned about the amount of money we are spending on these tests.

This is also an issue for us. Not so much regarding the genetic counseling, as we have access to that here in Portland, but regarding the advances in technology in genetic testing such that it’s very difficult to ascertain the necessity or appropriateness of some of the requests we get, primarily from OHSU genetics. We get a lot of requests where they want to send the testing to Baylor instead of doing locally, or request proprietary testing from a company that only offers the test as a package of 10 tests, so even if we want just one, they won’t do anything but the entire panel. But we haven’t been reviewing most genetic testing recently, in order to allocate review resources elsewhere

My struggle with genetic testing, especially for BRCA or Lynch syndrome is getting the genetic counseling. Myriad is telling our doctors that we are the only CCO that requires genetic counseling, but I think it is important for people to understand the test. Myriad has offered to provide genetic counseling or train our providers, but I don’t think that the test manufacturer is the right entity to be involved in the counseling. Portland is really the only location where our members have been able to access the counseling, and access is difficult. I have been pushing my hospital and oncologists to develop a telemedicine process for genetic counseling. The requirement for masters level genetic counselors to bill incident to a physician has also been an issue for us. There are genetic counselors in Eugene, but they work for the perinatologists. That is fine for prenatal genetics, but problematic for cancer genetics.

Non-Prenatal Genetic Testing Guideline October, 2015 Genetics Advisory Panel

Issues:
1) The NCCN guideline references need to be updated
2) OHP medical directors are requesting that the non-prenatal genetic testing guideline have a section added listing the CPT codes for genetic tests that are currently covered as diagnostic without restrictions
   a. Option 1: include a list of included genetic tests
   b. Option 2: add a clause at the beginning of the guideline noting that all codes not restricted or excluded are diagnostic and covered if they meet the criteria in the genetic testing algorithm
3) Consider deleting the genetic testing algorithm
   a. It appears to be unconnected to D1 and difficult for the CCOs to use
   b. Consider adding the intent to the non-prenatal genetic testing guideline
   c. If not deleted, then the current guideline reference to “C1” needs to be changed
4) OHP medical directors are requesting that the non-prenatal genetic testing guideline list the training requirements for the clinician ordering the genetic test, as well as for the genetic counseling providers
   a. Concern about OB/Gyns and primary care physicians ordering tests without appropriate training
   b. Genetic counseling “should precede genetic testing for hereditary cancer whenever possible” but is not required
   c. Genetic counseling is only recommended prior to genetic testing for developmental issues
5) OHP medical directors are requesting consideration of adding wording requiring the most cost-effective testing (a single test instead of a panel if the panel is not indicated, or panel testing rather than a consecutive series of single tests)
   a. See clause at end of guideline; it may just need to move to the front


HERC staff recommendations:
1) Modify the non-prenatal genetic testing guideline as shown below
   a. Update the NCCN references
   b. Update the BRCA testing clauses with CPT 81162
   c. Move the restriction for the least costly test from the end of the guideline to the beginning
   d. Consider adding requirements for the provider ordering the test
2) Delete the genetic testing algorithm and incorporate the intent into the beginning of the non-prenatal genetic testing guideline
   a. Specify that pre- and post-test genetic counseling is required in most cases
   b. Add a clause that genetic tests not otherwise excluded or restricted are covered if the genetic testing algorithm criteria (as now incorporated into this guideline) are met
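The general coverage test that these recommendations fold into the guideline (section A of the draft D1 text) can be read as a two-part check: initial screening must put the chance of a genetic abnormality above 10%, and the result must serve at least one of four listed purposes. A minimal Python sketch of that reading follows. All names are hypothetical, the exclusion set is abbreviated to a few codes from section F1 for illustration, and the additional restrictions in sections B through F are not modeled.

```python
from dataclasses import dataclass

# A few of the codes listed as not covered in section F1 (abbreviated, illustrative).
EXCLUDED_CODES = {"81225", "81226", "81227", "81417", "81425", "81426", "81427"}


@dataclass
class GeneticTestRequest:
    cpt_code: str
    chance_of_abnormality: float          # from initial screening, 0.0 to 1.0
    changes_treatment: bool = False
    changes_health_monitoring: bool = False
    provides_prognosis: bool = False
    informs_genetic_counseling: bool = False


def meets_section_a(req: GeneticTestRequest) -> bool:
    """Apply only the section A criteria; other sections are not modeled."""
    if req.cpt_code in EXCLUDED_CODES:
        return False
    has_impact = any([
        req.changes_treatment,
        req.changes_health_monitoring,
        req.provides_prognosis,
        req.informs_genetic_counseling,
    ])
    # The draft text requires a chance strictly greater than 10%.
    return req.chance_of_abnormality > 0.10 and has_impact


# Hypothetical request using a placeholder code that is not on the exclusion list.
example = GeneticTestRequest("00000", 0.25, changes_health_monitoring=True)
print(meets_section_a(example))
```

Writing the rule out this way also shows why the medical directors asked for an explicit clause: a code that is absent from the exclusion list is evaluated on the general criteria rather than being left undetermined.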

DIAGNOSTIC GUIDELINE D1, NON-PRENATAL GENETIC TESTING GUIDELINE

Coverage of genetic testing in a non-prenatal setting shall be determined by the algorithm shown in Figure C.1 unless otherwise specified below.

A) Genetic tests are covered as diagnostic, unless they are listed below in section F1 as excluded or have other restrictions listed below. To be covered, initial screening (e.g. physical exam, medical history, family history, laboratory studies, imaging studies) must indicate that the chance of genetic abnormality is > 10% and results would do at least one of the following:
   1) Change treatment,
   2) Change health monitoring,
   3) Provide prognosis, or
   4) Provide information needed for genetic counseling for patient; or patient’s parents, siblings, or children
B) Pretest and posttest genetic counseling is required for presymptomatic and predisposition genetic testing. Pretest and posttest genetic evaluation (which includes genetic counseling) is covered when provided by a suitably trained health professional with expertise and experience in cancer genetics.
   1) “Suitably trained” is defined as board certified or active candidate status from the American Board of Medical Genetics, American Board of Genetic Counseling, or Genetic Nursing Credentialing Commission.
C) A more expensive genetic test (generally one with a wider scope or more detailed testing) is not covered if a cheaper (smaller scope) test is available and has, in this clinical context, a substantially similar sensitivity. For example, do not cover CFTR gene sequencing as the first test in a person of Northern European Caucasian ancestry because the gene panels are less expensive and provide substantially similar sensitivity in that context.

D) Related to genetic testing for patients with breast/ovarian and colon/endometrial cancer or other related cancers suspected to be hereditary, or patients at increased risk due to family history.
   1) Services are provided according to the Comprehensive Cancer Network Guidelines.
      a) Lynch syndrome (hereditary colorectal, endometrial and other cancers associated with Lynch syndrome) services (CPT 81288, 81292-81300, 81317-81319, 81435, 81436) and familial adenomatous polyposis (FAP) services (CPT 81201-81203) should be provided as defined by the NCCN Clinical Practice Guidelines in Oncology. Genetic/Familial High-Risk Assessment: Colorectal V.2.2015 (5/4/15) [previously V.1.2014 (5/19/14)]. www.nccn.org.
      b) BRCA1/BRCA2 Breast and ovarian cancer syndrome genetic testing services (CPT 81162, 81211-81217, 81432-81433) for women without a personal history of breast, ovarian and other associated cancers should be provided to high risk women as defined by the US Preventive Services Task Force or according to the NCCN Clinical Practice Guidelines in Oncology: Genetic/Familial High-Risk Assessment: Breast and Ovarian. V2.2015 (6/25/15) [previously V2.2014 (9/23/2014)]. www.nccn.org.
      c) BRCA1/BRCA2 Breast and ovarian cancer syndrome genetic testing services (CPT 81162, 81211-81217, 81432-81433) for women with a personal history of breast, ovarian, and other associated cancers and for men with breast cancer should be provided according to the NCCN Clinical Practice Guidelines in Oncology. Genetic/Familial High-Risk Assessment: Breast and Ovarian. V2.2015 (6/25/15) [previously V2.2014 (9/23/2014)]. www.nccn.org.
      d) PTEN (Cowden syndrome) services (CPT 81321-81323) should be provided as defined by the NCCN Clinical Practice Guidelines in Oncology. Colorectal Screening. V.1.2015 (5/1/15) [previously V.1.2013 (5/13/13)]. www.nccn.org.
   2) Genetic counseling should precede genetic testing for hereditary cancer whenever possible.
      a) Pre and post-test genetic counseling should be covered when provided by a suitably trained health professional with expertise and experience in cancer genetics.
         i) “Suitably trained” is defined as board certified or active candidate status from the American Board of Medical Genetics, American Board of Genetic Counseling, or Genetic Nursing Credentialing Commission.
      b) If timely pre-test genetic counseling is not possible for time-sensitive cases, appropriate genetic testing accompanied by pre- and post-test informed consent and post-test disclosure performed by a board-certified physician with experience in cancer genetics should be covered.
         i) Post-test genetic counseling should be performed as soon as is practical.
   3) If the mutation in the family is known, only the test for that mutation is covered. For example, if a mutation for BRCA 1 has been identified in a family, a single site mutation analysis for that mutation is covered (CPT 81215), while a full sequence BRCA 1 and 2 (CPT 81211) analyses is not. There is one exception: for individuals of Ashkenazi Jewish ancestry with a known mutation in the family, the panel for Ashkenazi Jewish BRCA mutations is covered (CPT 81212).
   4) Costs for rush genetic testing for hereditary breast/ovarian and colon/endometrial cancer are not covered.
E) Related to diagnostic evaluation of individuals with intellectual disability (defined as a full scale or verbal IQ < 70 in an individual > age 5), developmental delay (defined as a cognitive index <70 on a standardized test appropriate for children < 5 years of age), Autism Spectrum Disorder, or multiple congenital anomalies:
   1) CPT 81228, Cytogenomic constitutional microarray analysis for copy number variants for chromosomal abnormalities: Cover for diagnostic evaluation of individuals with intellectual disability/developmental delay; multiple congenital anomalies; or, Autism Spectrum Disorder accompanied by at least one of the following: dysmorphic features including macro or microcephaly, congenital anomalies, or intellectual disability/developmental delay in addition to those required to diagnose Autism Spectrum Disorder.
   2) CPT 81229, Cytogenomic constitutional microarray analysis for copy number variants for chromosomal abnormalities; plus cytogenetic constitutional microarray analysis for single nucleotide polymorphism (SNP) variants for chromosomal abnormalities: Cover for diagnostic evaluation of individuals with intellectual disability/developmental delay; multiple congenital anomalies; or, Autism Spectrum Disorder accompanied by at least one of the following: dysmorphic features including macro or microcephaly, congenital anomalies, or intellectual disability/developmental delay in addition to those required to diagnose Autism Spectrum Disorder; only if (a) consanguinity and recessive disease is suspected, or (b) uniparental disomy is suspected, or (c) another mechanism is suspected that is not detected by the copy number variant test alone.
3) CPT 81243, 81244, Fragile X genetic testing is covered for individuals with intellectual disability/developmental delay. Although the yield of Fragile X is 3.5-10%, this is included because of additional reproductive implications. 4) A visit with the appropriate specialist (often genetics, developmental pediatrics, or child neurology), including physical exam, medical history, and family history is covered. Physical exam, medical history, and family history by the appropriate specialist, prior to any genetic testing, is often the most cost-effective strategy and is encouraged. F) Related to other tests with specific CPT codes: 1) The following tests are not covered: a) CPT 81225, CYP2C19 (cytochrome P450, family 2, subfamily C, polypeptide 19) (eg, drug metabolism), gene analysis, common variants (eg, *2, *3, *4, *8, *17) b) CPT 81226, CYP2D6 (cytochrome P450, family 2, subfamily D, polypeptide 6) (eg, drug metabolism), gene analysis, common variants (eg, *2, *3, *4, *5, *6, *9, *10, *17, *19, *29, *35, *41, *1XN, *2XN, *4XN). c) CPT 81227, CYP2C9 (cytochrome P450, family 2, subfamily C, polypeptide 9) (eg, drug metabolism), gene analysis, common variants (eg, *2, *3, *5, *6)
d) CPT 81287, MGMT (O-6-methylguanine-DNA methyltransferase) (eg, glioblastoma multiforme), methylation analysis e) CPT 81291, MTHFR (5,10-methylenetetrahydrofolate reductase) (eg, hereditary hypercoagulability) gene analysis, common variants (eg, 677T, 1298C) f) CPT 81330, SMPD1 (sphingomyelin phosphodiesterase 1, acid lysosomal) (eg, Niemann-Pick disease, Type A) gene analysis, common variants (eg, R496L, L302P, fsP330) g) CPT 81350, UGT1A1 (UDP glucuronosyltransferase 1 family, polypeptide A1) (eg, irinotecan metabolism), gene analysis, common variants (eg, *28, *36, *37) h) CPT 81355, VKORC1 (vitamin K epoxide reductase complex, subunit 1) (eg, warfarin metabolism), gene analysis, common variants (eg, -1639/3673) i) CPT 81417, re-evaluation of whole exome sequencing j) CPT 81425-81427, Genome sequence analysis k) CPT 81470, 81471, X-linked intellectual disability (XLID) genomic sequence panels l) CPT 81504, Oncology (tissue of origin), microarray gene expression profiling of > 2000 genes, utilizing formalin-fixed paraffin-embedded tissue, algorithm reported as tissue similarity scores 2) The following tests are covered only if they meet the criteria in section A above for the Non-Prenatal Genetic Testing Algorithm AND the specified situations: a) CPT 81205, BCKDHB (branched-chain keto acid dehydrogenase E1, beta polypeptide) (eg, Maple syrup urine disease) gene analysis, common variants (eg, R183P, G278S, E422X): Cover only when the newborn screening test is abnormal and serum amino acids are normal b) Diagnostic testing for cystic fibrosis (CF) i) CFTR, cystic fibrosis transmembrane conductance regulator tests.
CPT 81220, 81223, 81222: For infants with a positive newborn screen for cystic fibrosis or who are symptomatic for cystic fibrosis, or for clients that have previously been diagnosed with cystic fibrosis but have not had genetic testing, CFTR gene analysis of a panel containing at least the mutations recommended by the American College of Medical Genetics* (CPT 81220) is covered. If two mutations are not identified, CFTR full gene sequencing (CPT 81223) is covered. If two mutations are still not identified, duplication/deletion testing (CPT 81222) is covered. These tests may be ordered as reflex testing on the same specimen. c) Carrier testing for cystic fibrosis i) CFTR gene analysis of a panel containing at least the mutations recommended by the American College of Medical Genetics* (CPT 81220) is covered. d) CPT 81240, F2 (prothrombin, coagulation factor II) (eg, hereditary hypercoagulability) gene analysis, 20210G>A variant: Factor 2 20210G>A testing should not be covered for adults with idiopathic venous thromboembolism; for asymptomatic family members of patients with venous thromboembolism and a
Factor V Leiden or Prothrombin 20210G>A mutation; or for determining the etiology of recurrent fetal loss or placental abruption. e) CPT 81241, F5 (coagulation Factor V) (eg, hereditary hypercoagulability) gene analysis, Leiden variant: Factor V Leiden testing should not be covered for adults with idiopathic venous thromboembolism; for asymptomatic family members of patients with venous thromboembolism and a Factor V Leiden or Prothrombin 20210G>A mutation; or for determining the etiology of recurrent fetal loss or placental abruption. f) CPT 81256, HFE (hemochromatosis) (eg, hereditary hemochromatosis) gene analysis, common variants (eg, C282Y, H63D): Covered for diagnostic testing of patients with elevated transferrin saturation or ferritin levels. Covered for predictive testing ONLY when a first degree family member has treatable iron overload from HFE. g) CPT 81332, SERPINA1 (serpin peptidase inhibitor, clade A, alpha-1 antiproteinase, antitrypsin, member 1) (eg, alpha-1-antitrypsin deficiency), gene analysis, common variants (eg, *S and *Z): The alpha-1-antitrypsin protein level should be the first line test for a suspected diagnosis of AAT deficiency in symptomatic individuals with unexplained liver disease or obstructive lung disease that is not asthma, or in a middle-aged individual with unexplained dyspnea. Genetic testing or the alpha-1 phenotype test is appropriate if the protein test is abnormal or borderline. The genetic test is appropriate for siblings of people with AAT deficiency regardless of the AAT protein test results. h) CPT 81415-81416, exome testing: A genetic counseling/geneticist consultation is required prior to ordering the test. i) CPT 81430-81431, Hearing loss (eg, nonsyndromic hearing loss, Usher syndrome, Pendred syndrome); genomic sequence analysis panel: Testing for mutations in GJB2 and GJB6 needs to be done first and be negative in non-syndromic patients prior to panel testing.
j) CPT 81440, 81460, 81465, mitochondrial genome testing: A genetic counseling/geneticist or metabolic consultation is required prior to ordering test. k) Do not cover a more expensive genetic test (generally one with a wider scope or more detailed testing) if a cheaper (smaller scope) test is available and has, in this clinical context, a substantially similar sensitivity. For example, do not cover CFTR gene sequencing as the first test in a person of Northern European Caucasian ancestry because the gene panels are less expensive and provide substantially similar sensitivity in that context.
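The diagnostic CFTR reflex sequence described in item F.2.b above (panel first, then full gene sequencing, then duplication/deletion testing, stopping once two mutations are found) can be sketched as a small lookup. This is only an illustration under stated assumptions: the function name `next_cftr_test` and its arguments are hypothetical, and the sketch ignores the eligibility criteria (positive newborn screen, symptoms, or prior clinical diagnosis) that the guideline applies before any test is ordered.

```python
from typing import List, Optional

# CPT codes as named in the guideline text for diagnostic CFTR testing.
CFTR_PANEL = "81220"       # panel of recommended common mutations
CFTR_SEQUENCING = "81223"  # full gene sequencing
CFTR_DUP_DEL = "81222"     # duplication/deletion testing

# Order in which the guideline covers the tests.
REFLEX_ORDER = (CFTR_PANEL, CFTR_SEQUENCING, CFTR_DUP_DEL)


def next_cftr_test(mutations_found: int, tests_done: List[str]) -> Optional[str]:
    """Return the next covered CPT code in the reflex cascade, or None.

    The cascade stops as soon as two mutations have been identified or
    when all three tests have already been performed.
    """
    if mutations_found >= 2:
        return None
    for code in REFLEX_ORDER:
        if code not in tests_done:
            return code
    return None


if __name__ == "__main__":
    print(next_cftr_test(0, []))                  # first step: panel
    print(next_cftr_test(1, ["81220"]))           # one mutation: sequencing
    print(next_cftr_test(2, ["81220", "81223"]))  # two mutations: stop
```

Keeping the order in a single tuple makes the "cheaper, narrower test first" principle of item k explicit in one place.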

* American College of Medical Genetics Standards and Guidelines for Clinical Genetics Laboratories. 2008 Edition, Revised 3/2011 and found at https://www.acmg.net/StaticContent/SGs/CFTR%20Mutation%20Testing.pdf

ANCILLARY/DIAGNOSTIC GUIDELINE NOTES FOR THE OCTOBER 1, 2015 PRIORITIZED LIST OF HEALTH SERVICES

FIGURE D1 NON-PRENATAL GENETIC TESTING ALGORITHM (See Guideline Note D1)

[Flowchart, summarized as text]

Step 1. Initial screening indicates genetic testing may be indicated (see note 1).
- No: Genetic test is not covered.
- Yes: Go to Step 2.

Step 2. Pretest genetic risk assessment and/or clinical evidence indicate chance of genetic abnormality is > 10% and results would do at least one of the following:
- Change treatment,
- Change health monitoring,
- Provide prognosis, or
- Provide information needed for genetic counseling for patient; or patient’s parents, siblings, or children.

- No: Genetic test is not covered.
- Yes:
  - Pretest and posttest genetic counseling is required for presymptomatic and predisposition genetic testing.
  - Pretest and posttest genetic evaluation (which includes genetic counseling) is covered.
  - Genetic test is covered.

1. Examples of initial screening: physical exam, medical history, family history, laboratory studies, imaging studies.
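The two decision points of Figure D1 can be written as a short function. The sketch below is an illustration only, not the Commission's implementation: the `Case` record and its field names are hypothetical, and the pretest risk is assumed to be available as a single probability between 0 and 1.

```python
from dataclasses import dataclass


@dataclass
class Case:
    """Hypothetical summary of one request for a non-prenatal genetic test."""
    initial_screening_indicates_testing: bool
    pretest_risk: float  # estimated chance of a genetic abnormality, 0.0 to 1.0
    changes_treatment: bool = False
    changes_health_monitoring: bool = False
    provides_prognosis: bool = False
    informs_genetic_counseling: bool = False  # for patient, parents, siblings, or children


def genetic_test_covered(case: Case) -> bool:
    """Apply the two steps of Figure D1 and return whether the test is covered."""
    # Step 1: initial screening must indicate that testing may be warranted.
    if not case.initial_screening_indicates_testing:
        return False
    # Step 2: risk must exceed 10% AND the result must have at least one use.
    result_is_useful = any([
        case.changes_treatment,
        case.changes_health_monitoring,
        case.provides_prognosis,
        case.informs_genetic_counseling,
    ])
    return case.pretest_risk > 0.10 and result_is_useful


if __name__ == "__main__":
    example = Case(True, 0.25, changes_treatment=True)
    print(genetic_test_covered(example))
```

The strict `>` mirrors the figure's "> 10%" wording, so a risk of exactly 10% does not qualify in this sketch.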

DIAGNOSTIC GUIDELINE D17, PRENATAL GENETIC TESTING

The following types of prenatal genetic testing and genetic counseling are covered for pregnant women:
1. Genetic counseling (CPT 96040, HCPCS S0265) for high risk women who have family history of inheritable disorder or carrier state, ultrasound abnormality, previous pregnancy with aneuploidy, or elevated risk of neural tube defect.
2. Genetic counseling (CPT 96040, HCPCS S0265) prior to consideration of CVS, amniocentesis, microarray testing, Fragile X, and spinal muscular atrophy screening
3. Validated questionnaire to assess genetic risk in all pregnant women
4. Screening high risk ethnic groups for hemoglobinopathies (CPT 83020, 83021)
5. Screening for aneuploidy with any of five screening strategies [first trimester (nuchal translucency, beta-HCG and PAPP-A), integrated, serum integrated, stepwise sequential, and contingency] (CPT 76813, 76814, 81508-81511)
6. Cell free fetal DNA testing (CPT 81507) for evaluation of aneuploidy in women who have an elevated risk of a fetus with aneuploidy (maternal age >34, family history or elevated risk based on screening).
7. Ultrasound for structural anomalies between 18 and 20 weeks gestation (CPT 76811, 76812)
8. CVS or amniocentesis (CPT 59000, 59015) for a positive aneuploidy screen, maternal age >34, fetal structural anomalies, family history of inheritable chromosomal disorder or elevated risk of neural tube defect.
9. Array CGH (CPT 81228) when major fetal congenital anomalies are apparent on imaging, and karyotype is normal
10. FISH testing (CPT 88271, 88275) only if karyotyping is not possible due to a need for rapid turnaround for reasons of reproductive decision-making (i.e. at 22w4d gestation or beyond)
11. Screening for Tay-Sachs carrier status (CPT 81255) in high risk populations. The first step is Hex A testing, followed by additional DNA analysis in individuals with ambiguous Hex A test results, suspected variant form of TSD or suspected pseudodeficiency of Hex A
12. Screening for cystic fibrosis carrier status once in a lifetime (CPT 81220-81224)
13. Screening for fragile X status (CPT 81243, 81244) in patients with a personal or family history of
   a. fragile X tremor/ataxia syndrome
   b. premature ovarian failure
   c. unexplained early onset intellectual disability
   d. fragile X intellectual disability
   e. unexplained autism through the pregnant woman’s maternal line
14. Screening for spinal muscular atrophy (CPT 81401) once in a lifetime
15. Screening those with Ashkenazi Jewish heritage for Canavan disease (CPT 81200), familial dysautonomia (CPT 81260), and Tay-Sachs carrier status (CPT 81255)
16. Expanded carrier screening only for those genetic conditions identified above

The following genetic screening tests are not covered:
1. Serum triple screen
2. Screening for thrombophilia in the general population or for recurrent pregnancy loss
3. Expanded carrier screening which includes results for conditions not explicitly recommended for coverage

The development of this guideline note was informed by a HERC coverage guidance. See http://www.oregon.gov/oha/herc/Pages/blog-prenatal-genetic.aspx

Lepri et al. BMC Medical Genetics 2014, 15:14 http://www.biomedcentral.com/1471-2350/15/14

RESEARCH ARTICLE Open Access

Diagnosis of Noonan syndrome and related disorders using target next generation sequencing

Francesca Romana Lepri1*, Rossana Scavelli2, Maria Cristina Digilio1, Maria Gnazzo1, Simona Grotta1, Maria Lisa Dentici1, Elisa Pisaneschi1, Pietro Sirleto1, Rossella Capolino1, Anwar Baban1, Serena Russo1, Tiziana Franchin1, Adriano Angioni1 and Bruno Dallapiccola1

Abstract

Background: Noonan syndrome is an autosomal dominant developmental disorder with high phenotypic variability, which shares clinical features with other rare conditions, including LEOPARD syndrome, cardiofaciocutaneous syndrome, Noonan-like syndrome with loose anagen hair, and Costello syndrome. This group of related disorders, the so-called RASopathies, is caused by germline mutations in distinct genes encoding components of the RAS-MAPK signalling pathway. Because of the high number of genes associated with these disorders, standard diagnostic testing requires expensive and time-consuming approaches using Sanger sequencing. In this study we show how a targeted Next Generation Sequencing (NGS) technique can enable accurate, faster and cost-effective diagnosis of RASopathies.

Methods: We used a validation set of 10 patients (6 positive controls previously characterized by Sanger sequencing and 4 negative controls) to assess the analytical sensitivity and specificity of the targeted NGS. As a second step, a training set of 80 enrolled patients with a clinical suspicion of RASopathy was tested. Targeted NGS was successfully applied over 92% of the regions of interest, including exons of the following genes: PTPN11, SOS1, RAF1, BRAF, HRAS, KRAS, NRAS, SHOC2, MAP2K1, MAP2K2, CBL.

Results: All expected variants in patients belonging to the validation set were identified by targeted NGS, providing a detection rate of 100%. Furthermore, all the newly detected mutations in patients from the training set were confirmed by Sanger sequencing. False negative events were excluded by testing some of the negative patients, randomly selected, with Sanger sequencing.

Conclusion: Here we show how molecular testing of RASopathies by targeted NGS could allow an early and accurate diagnosis for all enrolled patients, enabling a prompt diagnosis especially for those patients with mild, non-specific or atypical features, in whom the detection of the causative mutation usually requires a prolonged diagnostic process when using the standard routine. This approach strongly improved genetic counselling and clinical management.

Keywords: Noonan syndrome, Next generation sequencing, Molecular diagnosis, RASopathies

* Correspondence: [email protected]
1Cytogenetics, Medical Genetics and Pediatric Cardiology, Bambino Gesù Children Hospital, IRCCS, Rome, Italy
Full list of author information is available at the end of the article

© 2014 Lepri et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Background

Noonan syndrome (NS, OMIM 163950) is an autosomal dominant developmental disorder [1] with a prevalence ranging between 1:1,000 and 1:2,500 live births [2]. This disorder is characterized by wide phenotypic variability and shares some clinical features, such as facial dysmorphisms, congenital heart defect (CHD), postnatal growth retardation, ectodermal and skeletal defects, and variable cognitive deficits [1,2], with other rare conditions, including LEOPARD syndrome (LS, OMIM 151100) [3], cardiofaciocutaneous syndrome (CFCS, OMIM 115150) [4], Noonan-like syndrome with loose anagen hair (NS/LAH, OMIM 607721) [5], and Costello syndrome (CS, OMIM 218040) [6]. This group of related disorders is caused by germline mutations in distinct genes encoding components of the RAS-MAPK signalling pathway. Based on common pathogenetic mechanisms and clinical overlap, these diseases have been grouped into a single family, the so-called neuro-cardio-facial-cutaneous syndromes (NCFCS), recently termed RASopathies [7,8]. NS is associated with PTPN11, SOS1, KRAS, NRAS, RAF1, BRAF, SHOC2, MEK1 and CBL gene mutations [9-19], LS with PTPN11, RAF1 and BRAF gene mutations [13,17,20,21], NS/LAH with SHOC2 gene mutations [22], CFCS with KRAS, BRAF, MEK1 and MEK2 gene mutations [23,24], and CS with HRAS gene mutations [25].

So far, molecular characterization can be reached in approximately 75-90% of affected individuals. Some distinct phenotypes have emerged in association with specific gene mutations.

Because of the high genetic heterogeneity of these disorders, which affect genes that together span about 30 kb of genomic DNA, the standard diagnostic testing protocol requires a multi-step approach using Sanger sequencing. The selection of the genes to investigate at the first diagnostic level depends on the frequency of their association with the disorder and their relationship with a distinct phenotype. For this reason, accurate clinical evaluation and close interaction between clinical and molecular geneticists are mandatory for selecting the genes to be studied first. With this approach, the causative mutations can be identified in most cases. Some mutations cannot be identified at the first screening level, because some phenotypes may be related to mutations in different causative genes, some clinical features associated with NS related disorders may not be evident at younger ages, or some extremely rare mutations are not routinely screened at first analysis. To detect these mutations, an additional screening level is required with a second panel of genes, which again should be guided by a clinical geneticist. In these latter cases the molecular diagnosis requires a longer time before the pathogenic mutation is identified. Moreover, standard Sanger sequencing of multiple genes is an expensive technique. For these reasons, genetically heterogeneous disorders demand innovative diagnostic protocols that can identify disease-causing mutations rapidly and routinely.

Here we report our experience with the use of targeted Next Generation Sequencing (NGS) for diagnosis of RASopathies. Our study suggests that this protocol can be easily used as a standard diagnostic tool to identify disease-causing mutations, with a straightforward workflow from genomic DNA to identification of genomic variants.

Methods

Subjects

Between June 2012 and June 2013, 80 patients (35 males and 45 females) with a clinical suspicion of any RASopathy were consecutively enrolled in this study. Mean age was 8 years (range 2 months - 16 years). All patients had a complete physical examination for major and minor anomalies by trained clinical geneticists (MCD, BD, RC). Two-dimensional Color-Doppler echocardiography, renal ultrasonography, and neurological/neuropsychiatric assessment for developmental delay or cognitive impairment were routinely performed. Clinical inclusion criteria were facial anomalies suggestive of RASopathies (presence of six or more features among hypertelorism, downslanting palpebral fissures, epicanthal folds, short broad nose, deeply grooved philtrum, high wide peaks of the vermilion, micrognathia, low-set and/or posteriorly angulated ears with thick helices, and low posterior hairline) [26], associated with at least one of the following clinical features: short stature, organ malformation (congenital heart defect or renal anomaly), developmental delay or cognitive deficit. All patients had normal standard chromosome analysis and array-CGH at a resolution of 75 kb.

A total of 10 DNA samples, including 6 positive controls and 4 negative controls previously characterized by standard Sanger sequencing, were used as a validation set for establishing the amplicon resequencing workflow and assessing the analytical sensitivity and specificity of the targeted NGS. A second group of 80 DNA samples, extracted from patients manifesting the RASopathies phenotype, was used as training set. The patient’s genomic DNA was extracted from circulating leukocytes according to standard procedures and quantified with a fluorescence-based method. Informed consent was obtained from the patients’ parents. The study was approved by the institutional scientific board of Bambino Gesù Children Hospital and was conducted in accordance with the Helsinki Declaration.

Targeted resequencing

Targeted resequencing was performed using a uniquely customized design: TruSeq® Custom Amplicon (Illumina, San Diego, CA) with the MiSeq® sequencing platform (Illumina, San Diego, CA). TruSeq Custom Amplicon (TSCA) is a fully integrated DNA-to-data solution, including online probe design and ordering through the Illumina website, sequencing assay, automated data analysis, and offline software for reviewing results.

Probe design

Online probe design was performed by entering target genomic regions into Design Studio (DS) software (Illumina, San Diego, CA). Probe design (Locus Specific Oligos) was automatically performed by DS using a proprietary algorithm that considers a range of factors, including GC content, specificity, probe interaction and coverage. Once the design was completed, a list of 500 bp candidate amplicons (short regions of amplified DNA) was generated and the quality of each amplicon design assessed based on the predicted success score provided by DS.

For some targets, when required, DS has been used by the operator to edit and improve the predicted success score to a minimum value of 60%. All exons with a lower success score have been removed from the design and excluded from the final TSCA panel. The design was performed over a cumulative target region of 57,932 bp and generated a panel of 244 amplicons with a coverage of 98% of the cumulative region (Figure 1). The choice of genes investigated in this panel has been made based on scientific evidence for a causative role in the disease [9-25]. The list of the 11 genes, for a total of 132 exons, is reported in Table 1.

Figure 1 Screenshots of the designed panel within DS software.

Table 1 List of genes analyzed in this study and coverage percentage of the investigated exons
Gene | PTPN11 | SOS1 | BRAF | RAF1 | KRAS | NRAS | HRAS | SHOC2 | MAP2K1 | MAP2K2 | CBL
Number of exons uploaded into DS | 15 | 23 | 18 | 16 | 5 | 4 | 5 | 8 | 11 | 11 | 16
Number of exons entirely covered by DS with predicted success score >60% | 14 | 23 | 18 | 16 | 5 | 3 | 5 | 8 | 11 | 11 | 16
Total exons covered by DS / total exons uploaded into DS (%) | 98.5
Number of exons successfully sequenced with coverage > 30 | 13 | 22 | 16 | 16 | 5 | 3 | 5 | 8 | 9 | 9 | 14
Total exons successfully sequenced / total exons covered by DS (%) | 92.4%

Library preparation and sequencing

The TSCA kit generates the desired targeted amplicons with the necessary sequencing adapter and indices for sequencing on the MiSeq® system without any additional processing. Library preparation and sequencing runs have been performed according to the manufacturer’s procedure.

Data analysis

The MiSeq® system provides fully integrated on-instrument data analysis software. MiSeq Reporter software performs secondary analysis on the base calls and Phred-like quality score (Qscore) generated by Real Time Analysis software (RTA) during the sequencing run. The TSCA workflow in MiSeq Reporter evaluates short regions of amplified DNA (amplicons) for variants through the alignment of reads against a “manifest file” specified while starting the sequencing run. The manifest file is provided by Illumina and contains all the information on the custom assay. The TSCA workflow requires the reference genome specified in the manifest file (Homo sapiens, hg19, build 37.2). The reference genome provides variant annotations and sets the sizes in the BAM file output. The TSCA workflow performs demultiplexing of indexed reads, generates FASTQ files, aligns reads to a reference, identifies variants, and writes output files to the Alignment folder. SNPs and short indels are identified using the Genome Analysis Toolkit (GATK), by default. GATK calls raw variants for each sample, analyzes variants against known variants, and then calculates a false discovery rate for each variant. Variants are flagged as homozygous (1/1) or heterozygous (0/1) in the Variant Call File sample column. Because a SNP database (dbSNP, http://www.ncbi.nlm.nih.gov/projects/SNP) is available in the Annotation subfolder of the reference genome folder, any known SNPs or indels are flagged in the VCF output file. A reference gene database is available in the Annotation subfolder of the reference genome folder and any SNPs or indels that occur within known genes are annotated. Each single variant reported in the VCF output file has been evaluated for the coverage and the Qscore and visualized via Integrative Genome Viewer (IGV) [27,28]. Based on the guidelines of the American College of Medical Genetics and Genomics [29], all regions that have been sequenced with a sequencing depth <30 have been considered not suitable for analysis. Furthermore, we established a minimum threshold in Qscore of 30 (base call accuracy of 99.9%).

Sanger sequencing validation

All mutations identified by MiSeq Reporter have been validated by Sanger sequencing using standard protocols and, where possible, family members were tested to detect the “de novo” origin of the mutation. Figure 2 shows the flowchart of the above described method.

Figure 2 Flowchart of how the analysis was carried out.

Results

TSCA performance

All coding regions for genes reported in Table 1 have been uploaded into DS for a total of 132 exons (cumulative target region of 57,932 bp). Of the exons uploaded, 98.5% were covered by the amplicon design, with a predicted success score ≥60%. The remaining exons not entirely covered by DS or with a predicted success score <60% have been excluded from the final TSCA content panel. TSCA sequencing runs generated 120 exons successfully and steadily sequenced (sequencing depth >30, Qscore >30), providing a total coverage of 91% of all the exons uploaded into DS, and a coverage of 92% when referring to the number of exons covered by DS (Table 1). The TSCA approach reduced to 12 the number of exons requiring standard Sanger sequencing analysis.

Validation set

TSCA sequencing of the 4 negative controls confirmed the absence of any variant, and the analysis of the 6 positive control samples confirmed both the expected mutations and the allele state. All variants were identified with a mean coverage of 318 and a mean Qscore = 38, providing a detection rate of 100% for the validation set (Table 2). Neither positive nor negative control samples showed any further unexpected variant, confirming the absence of any unreported variant in the validation set, and of any false positive result.

Table 2 List of patients with known mutations included in the validation set
Patient ID | Gene | Mutation | Allele state | Mutation detected by TSCA sequencing | Coverage | Qscore
1 | PTPN11 | Y63C | het | Y63C | 374 | 38
2 | PTPN11 | N308D | het | N308D | 519 | 39
3 | PTPN11 | T468M | het | T468M | 525 | 40
4 | SOS1 | M279R | het | M279R | 390 | 39
5 | SOS1 | I733N | het | I733N | 78 | 37
6 | HRAS | G12A | het | G12A | 20 | 37

Training set

Samples from the training set were investigated in three different sequencing runs, with an average coverage of 200x, as set with DS. Among the patients, 38 mutations were identified in 6 of the 11 RAS pathway genes analyzed: PTPN11 (22/38 = 58%), SOS1 (9/38 = 23%), BRAF (2/38 = 5%), MEK2 (2/38 = 5%), RAF1 (2/38 = 5%), CBL (1/38 = 3%). The 38 variants identified from MiSeq Reporter had an average coverage of 595x and an average Qscore of 38 (Table 3).

Table 3 Mutations identified by TSCA sequencing in patients enrolled in the training set
Case | Phenotype | Gene | Mutation | Protein substitution | Allele state | Variant frequency | Coverage | Qscore | Reference
1 | NS | PTPN11 | c.184T>G | Y62D | het | 0.448 | 460 | 39 | [30]
2 | NS | PTPN11 | c.188A>G | Y63C | het | 0.521 | 190 | 39 | [9]
3 | NS | PTPN11 | c.188A>G | Y63C | het | 0.54 | 512 | 39 | [9]
4 | NS | PTPN11 | c.188A>G | Y63C | het | 0.481 | 1046 | 38 | [9]
5 | NS | PTPN11 | c.317A>C | D106A | het | 0.495 | 632 | 39 | [31]
6 | NS | PTPN11 | c.328G>A | E110K | het | 0.486 | 702 | 36 | [31]
7 | NS | PTPN11 | c.417G>C | E139D | het | 0.498 | 406 | 38 | [30]
8 | NS | PTPN11 | c.661A>G | I221V | het | 0.482 | 737 | 39 | p.s
9 | NS | PTPN11 | c.767A>G | Q256R | het | 0.514 | 290 | 36 | p.s
10 | NS | PTPN11 | c.854T>C | F285S | het | 0.406 | 64 | 39 | [30]
11 | NS | PTPN11 | c.922A>G | N308D | het | 0.526 | 812 | 38 | [9]
12 | NS | PTPN11 | c.922A>G | N308D | het | 0.508 | 1174 | 39 | [9]
13 | NS | PTPN11 | c.922A>G | N308D | het | 0.505 | 3126 | 39 | [9]
14 | NS | PTPN11 | c.922A>G | N308D | het | 0.486 | 3111 | 39 | [9]
15 | NS | PTPN11 | c.923A>G | N308S | het | 0.555 | 119 | 40 | [30]
16 | NS | PTPN11 | c.1183G>T | D395Y | het | 0.556 | 561 | 38 | p.s
16 | NS | PTPN11 | c.1186T>C | Y396H | het | 0.557 | 560 | 37 | p.s
17 | NS | PTPN11 | c.1226G>C | G409A | het | 0.444 | 178 | 38 | [32]
18 | NS | PTPN11 | c.1282G>T | V428L | het | 0.502 | 416 | 38 | p.s
19 | LS | PTPN11 | c.1403C>T | T468M | het | 0.467 | 319 | 40 | [20]
20 | LS | PTPN11 | c.1492C>T | R498W | het | 0.573 | 185 | 35 | [33]
21 | LS | PTPN11 | c.1492C>T | R498W | het | 0.521 | 142 | 38 | [33]
22 | NS | SOS1 | c.755T>C | I252T | het | 0.528 | 212 | 39 | [34]
23 | NS | SOS1 | c.806T>G | M269R | het | 0.564 | 140 | 38 | [34]
24 | NS | SOS1 | c.806T>G | M269R | het | 0.496 | 391 | 38 | [34]
25 | NS | SOS1 | c.1310T>A | I437N | het | 0.46 | 302 | 40 | [34]
26 | NS | SOS1 | c.1649T>C | L550P | het | 0.516 | 275 | 39 | [34]
27 | NS | SOS1 | c.1649T>C | L550P | het | 0.428 | 428 | 39 | [34]
28 | NS | SOS1 | c.2104T>C | Y702H | het | 0.52 | 421 | 37 | [34]
29 | NS | SOS1 | c.2371C>A | L791I | het | 0.576 | 363 | 37 | p.s
30 | NS | SOS1 | c.2371C>A | L791I | het | 0.546 | 108 | 39 | p.s
31 | NS/CFCS | BRAF | c.1694A>G | D565G | het | 0.463 | 341 | 39 | p.s
32 | CFCS | BRAF | c.1802A>T | K601I | het | 0.538 | 1120 | 37 | [17]
33 | CFC | MEK2 | c.326C>T | A110T | het | 0.505 | 299 | 37 | p.s
34 | CFC | MEK2 | c.395T>G | G132D | het | 0.533 | 227 | 38 | [35]
35 | NS | RAF1 | c.785A>T | N262I | het | 0.504 | 135 | 39 | p.s
36 | NS | RAF1 | c.781C>T | P261S | het | 0.524 | 143 | 39 | [13]
37 | NS | CBL | c.2350G>A | V784M | het | 0.428 | 173 | 36 | p.s
p.s: present study.

All variants have been confirmed by Sanger sequencing and IGV, indicating the absence of any false positive result in the training set group (Figure 3). Moreover, to exclude any possible false negative event, 10 randomly selected negative samples have been further analyzed by Sanger sequencing (only “hot spot” exons) and 30 additional samples have been analyzed for PTPN11 using NGS and Sanger sequencing, and all of them provided negative results.

Figure 3 An example of three different mutations (A. PTPN11: Y63C; B. SOS1: M269R; C. BRAF: K601I) identified by MiSeq.

Reproducibility

TSCA sequencing showed 100% reproducibility for all 120 exons, independently of DNA samples and sequencing runs, making this approach compatible with a diagnostic purpose. Figure 4 illustrates the performance of the same target region through three sequencing runs.

Discussion

The term RASopathy applies to a group of genetic disorders characterized by similar phenotypes, caused by mutations in the RAS MAPK pathway. These phenotypes are characterized by a high degree of genetic heterogeneity, since individual diseases can arise from mutations in different genes. In addition, since different RASopathies share similar clinical features, their molecular characterization is complex, time consuming and expensive.
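The coverage percentages and the Qscore threshold quoted in the Methods and Results can be re-derived from Table 1 and from the standard Phred relationship (error probability = 10^(-Q/10)). The sketch below is an illustration, not the authors' pipeline: the helper names `percent`, `phred_accuracy` and `passes_qc` are hypothetical, and the per-gene counts are transcribed from Table 1.

```python
# Per-gene exon counts transcribed from Table 1, in the table's gene order.
GENES = ["PTPN11", "SOS1", "BRAF", "RAF1", "KRAS", "NRAS",
         "HRAS", "SHOC2", "MAP2K1", "MAP2K2", "CBL"]
UPLOADED = [15, 23, 18, 16, 5, 4, 5, 8, 11, 11, 16]
COVERED = [14, 23, 18, 16, 5, 3, 5, 8, 11, 11, 16]
SEQUENCED = [13, 22, 16, 16, 5, 3, 5, 8, 9, 9, 14]


def percent(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


def phred_accuracy(qscore: float) -> float:
    """Base call accuracy implied by a Phred quality score."""
    return 1.0 - 10.0 ** (-qscore / 10.0)


def passes_qc(depth: int, qscore: float,
              min_depth: int = 30, min_qscore: float = 30.0) -> bool:
    """Thresholds stated in the Methods: depth below 30 is unsuitable, minimum Qscore 30."""
    return depth >= min_depth and qscore >= min_qscore


if __name__ == "__main__":
    uploaded, covered, sequenced = sum(UPLOADED), sum(COVERED), sum(SEQUENCED)
    print("exons uploaded/covered/sequenced:", uploaded, covered, sequenced)
    print("covered / uploaded (%):", percent(covered, uploaded))
    print("sequenced / covered (%):", percent(sequenced, covered))
    print("sequenced / uploaded (%):", percent(sequenced, uploaded))
    print("accuracy at Q30:", phred_accuracy(30))
```

Summing the rows gives 132, 130 and 120 exons. The second ratio evaluates to 92.3% with ordinary rounding, slightly below the 92.4% printed in Table 1; the text rounds both to 92%.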

Figure 4 Performance of the same target (PTPN11_exon8) region through 3 different sequencing runs (A. first run; B. second run; C. third run).
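The per-gene shares of the 38 mutations reported in the Results can be checked with a few lines of arithmetic. This is a minimal sketch using only the counts given in the text; the variable names are illustrative, and the count of 37 mutation-positive patients is an assumption read off the case numbering in Table 3 (case 16 carries two variants).

```python
# Mutation counts per gene as reported in the Results.
MUTATION_COUNTS = {"PTPN11": 22, "SOS1": 9, "BRAF": 2, "MEK2": 2, "RAF1": 2, "CBL": 1}

TOTAL_MUTATIONS = sum(MUTATION_COUNTS.values())

# Share of all mutations per gene, rounded to the nearest whole percent.
SHARES = {gene: round(100.0 * n / TOTAL_MUTATIONS) for gene, n in MUTATION_COUNTS.items()}

ENROLLED_PATIENTS = 80
MUTATION_POSITIVE_PATIENTS = 37  # assumption: cases 1 to 37 in Table 3
NEGATIVE_PATIENTS = ENROLLED_PATIENTS - MUTATION_POSITIVE_PATIENTS

if __name__ == "__main__":
    print("total mutations:", TOTAL_MUTATIONS)
    for gene, share in SHARES.items():
        print(f"{gene}: {MUTATION_COUNTS[gene]}/{TOTAL_MUTATIONS} = {share}%")
    print("patients without a detected mutation:", NEGATIVE_PATIENTS)
```

With conventional rounding, 9/38 comes out at 24% rather than the 23% printed in the article; the other shares match the text.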

Table 4 Clinical features of the patients carrying rare PTPN11 mutations Case n°8 Case n°9 Case n°10 Case n°16 Father case n°16 Case n°18 Mother case n°18 Sex Short stature Macrocephaly +- + - - Hypertelorism Downslanting palpebral fissures Palpebral ptosis --+ - + + + Epicanthal folds ++ - + - Short broad nose ++ - + - - Deeply grooved philtrum ++ + + + High wide peaks of the vermilion ++ + + + + + Micrognathia -+ - Low-set and/or posteriorly angulated ears with thick helices Low posterior hairline -++ - - + - Thorax anomalies -++ - - + - Cardiac defect ++ - - - - PVS -++---- - ASD -- - - VSD ------PDA -+ - - - Arrhythmia - - - - - WPW - Renal anomaly ------Cryptorchidism NA - - + NA Developmental delay or cognitive deficit Alopecia --+ + - Pancreatic cyst +- - - - Angioma -- - - - Inheritance NT pat NT mat NT ASD, atrial septal defect; VSD, ventricular septal defect; HCM, hypertrophic cardiomyopathy; mat, maternal; NA, not applicable; NT, not tested; pat, paternal; PDA, patent ductus arteriosus; PVS, pulmonary valve stenosis; WPW, Wolf-Parkinson-White. +present; -not present.

In order to improve the molecular testing of RASopathies, in this study we investigated a protocol based on targeted NGS using the Illumina MiSeq platform, enabling the analysis of all known causative genes in up to 96 patients in a single sequencing run. In particular, we analyzed 80 patients and identified 38 mutations in 6 of 11 RAS-pathway genes, including PTPN11 (22/38 = 58%), SOS1 (9/38 = 23%), BRAF (2/38 = 5%), MEK2 (2/38 = 5%), RAF1 (2/38 = 5%), CBL (1/38 = 3%). The relative frequency of mutations in the tested genes was in agreement with published results [30,36,37]. As shown in Table 3, in many patients the causal mutation was identified in the gene considered the most suitable candidate, based on frequency of mutations and the phenotypic characteristics. As expected, most NS patients had a PTPN11 mutation, while the second most frequently mutated gene was SOS1, followed by RAF1. Three patients with LEOPARD syndrome had one of the recurrent PTPN11 mutations previously associated with this phenotype (T468M or R498W) [20,33]. Among patients heterozygous for a pathogenic BRAF mutation, one was clinically diagnosed as being affected by CFCS, while in the other case clinical evaluation was unable to conclude whether he was affected by NS or CFCS. The patient's age at diagnosis is obviously important: this latter subject was 2 years old at the time of clinical evaluation, when he displayed only some features of CFCS [38]. Two other patients with CFCS had a mutation in the MEK2 gene, which is less commonly mutated in this disorder. In addition, one NS patient had a mutation in CBL, a gene rarely associated with this disorder. Since all genes were analyzed in one run, the present protocol allowed the simultaneous identification of mutations affecting both the most frequently and the rarely mutated genes, with a significant reduction of the time needed to reach the molecular characterization of the patient. This point is important for early diagnosis and classification of the different RASopathies, allowing a more appropriate management and counseling.

Figure 5 Comparison of time (A) and cost (B) between NGS and Sanger sequencing.

A wide spectrum of PTPN11 mutations has been associated with NS, including some rare mutations: I221V, Q256R, F285S, V428L (Table 4). All these variants are associated with the distinct facial gestalt of NS, which was markedly expressed in one patient (case no.10) with F285S and mildly expressed in the others. Familial transmission from an affected individual has been found in one instance, and suspected in another case (no.8), where the affected proband's brother was reported to have membranous subaortic stenosis and cryptorchidism. Mental retardation or cognitive deficit was not associated with these mutations, with the exception of the F285S mutation. Interestingly, the patient heterozygous for this latter mutation had a congenital pancreatic cyst, an unusual malformation in the RASopathies. In one patient (case no.16) we identified two unpublished mutations affecting two consecutive PTPN11 amino acids, D395Y and Y396H. Both mutations were inherited from the affected father. Variability of clinical expression in this family was recognizable, since facial anomalies of NS were associated with developmental delay in the son only. The father had congenital total alopecia as a distinctive feature. It is evident that the phenotype related to these mutations is quite atypical, showing common Noonan-like facial anomalies associated with variable additional neural and ectodermal features.

The other 43 patients enrolled in this study were negative for the investigated RAS genes. The proportion of negative patients was higher compared to previous reports, likely because clinical inclusion criteria were less stringent compared to other studies [39-41], being based on NS facial anomalies and almost only one additional feature. Interestingly, the mother of patient n°18 was included in this study as the mutated parent of an affected child, although the minimal clinical criteria for diagnosis of NS were not present.

All the variations have been confirmed by Sanger sequencing, as have the 4 negative control patients. These data indicate a 100% detection rate of mutations involved in RASopathies and, most importantly, all these results were obtained with MiSeq on-board analysis, the bioinformatics examination being performed using only the user-friendly MiSeq Reporter software.

The absence of any false negative and false positive results and the possibility of an easy and accurate data analysis make this approach a good diagnostic tool. Furthermore, the library preparation workflow is easy and the TSCA kit performance is stable. In fact, all the different experiments for the RAS pathway genes in each run provided the same results in terms of coverage and quality (Figure 4).
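The per-gene shares quoted in the Discussion can be rechecked from the raw counts. The following is a minimal Python sketch, assuming only the counts given in the text; the names `MUTATION_COUNTS` and `gene_shares` are illustrative and are not part of the study's analysis pipeline. With conventional rounding the SOS1 share comes out at 24% (9/38 is about 23.7%), one point above the 23% quoted, and the rounded shares then sum to 100%.

```python
# Mutation counts per gene as quoted in the text (38 mutations in total).
MUTATION_COUNTS = {
    "PTPN11": 22,
    "SOS1": 9,
    "BRAF": 2,
    "MEK2": 2,
    "RAF1": 2,
    "CBL": 1,
}


def gene_shares(counts):
    """Return each gene's share of all mutations, as a percentage
    rounded to the nearest integer."""
    total = sum(counts.values())
    return {gene: round(100 * n / total) for gene, n in counts.items()}


shares = gene_shares(MUTATION_COUNTS)
total_mutations = sum(MUTATION_COUNTS.values())
for gene, pct in shares.items():
    print(f"{gene}: {MUTATION_COUNTS[gene]}/{total_mutations} = {pct}%")
```

Keeping the counts rather than the rounded percentages as the primary record avoids this kind of one-point rounding drift when the figures are quoted in more than one place.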

Two major points to be considered in diagnostic protocols are the time needed to complete the entire workflow and the costs that the approach requires. Targeted NGS analysis of the complete coding sequences of the 11 genes in the RAS pathway for 96 patients takes about two months, including ten days for library preparation and data analysis and about 45 days to characterize uncovered regions using Sanger sequencing. Conversely, the use of Sanger sequencing to analyze the full coding sequence of the 11 genes would take about 16-18 months for the same number of patients. We also calculated that NGS analysis applied to the 92% of the regions of interest, plus Sanger sequencing for the remaining regions, would cost 6 times less than a protocol entirely based on Sanger sequencing (Figure 5). However, time and cost could be further reduced by designing a new panel which also includes RIT1, a gene recently associated with NS [42], resulting in 100% coverage of the cumulative region. This result would likely make Sanger sequencing unnecessary for the analysis, further reducing the time and the cost of the entire process.

Conclusion

This study demonstrates that NGS can be successfully applied to the molecular testing of RASopathies with a remarkable gain in time and lower cost, while maintaining the high quality of the results. Consistent with available records, our data confirm that the genetic mechanism underlying the RASopathies is due to germline mutations in different genes encoding components of the RAS-MAPK signalling pathway, with PTPN11, followed by SOS1, being the most frequently mutated genes in our cohort. Moreover, the use of the NGS protocol has allowed, with a high standard in terms of coverage and quality, an early detection of rare mutations in other RAS-MAPK genes, avoiding the use of the standard Sanger sequencing approach and the related increase in cost and time. Taken all together, these data highlight the usefulness of a molecular characterization that leads to an early diagnosis, especially for patients with mild, nonspecific or atypical features, and might direct a more appropriate genetic counselling and clinical management.

Abbreviations

NS: Noonan syndrome; LS: LEOPARD syndrome; CFCS: Cardiofaciocutaneous syndrome; NS/LAH: Noonan-like syndrome with loose anagen hair; CS: Costello syndrome; NGS: Next generation sequencing; M.C.D: Maria Cristina Digilio; B.D: Bruno Dallapiccola; R.C: Rossella Capolino; TSCA: TruSeq custom amplicon; DS: Design studio software; GATK: Genome analysis Toolkit; RTA: Real time analysis software; IGV: Integrative genome viewer; p.s.: Present study.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

FRL: carried out the molecular genetic studies, the analysis and interpretation of data and drafted the manuscript. RS: participated in the analysis and interpretation of data and drafted the manuscript. MCD, BD and MLD performed the clinical examination of the patients and have been involved in drafting the manuscript. MG, SG, EP, PS and SR: contributed to the molecular genetic studies and to the analysis and interpretation of data. RC and AB: performed the clinical examination. TF: contributed to the analysis and interpretation of data. AA: contributed to the conception of the study and has been involved in drafting the manuscript. All authors read and approved the final manuscript.

Acknowledgements

This study was supported by grants from the Italian Ministry of Health, Ricerca Corrente 2013.

Author details

1 Cytogenetics, Medical Genetics and Pediatric Cardiology, Bambino Gesù Children Hospital, IRCCS, Rome, Italy. 2 Illumina, Inc., San Diego, CA 92122, USA.

Received: 25 September 2013 Accepted: 20 January 2014 Published: 23 January 2014

References

1. Noonan JA: Hypertelorism with Turner phenotype. A new syndrome with associated congenital heart disease. Am J Dis Child 1968, 116:373–380.
2. Nora JJ, Nora AH, Sinha AK, Spangler RD, Lubs HA: The Ullrich-Noonan syndrome (Turner phenotype). Am J Dis Child 1974, 127:48–55.
3. Gorlin RJ, Anderson RC, Blaw M: Multiple lentigenes syndrome. Am J Dis Child 1969, 117:652–662.
4. Reynolds JF, Neri G, Herrmann JP, Blumberg B, Coldwell JG, Miles PV, Opitz JM: New multiple congenital anomalies/mental retardation syndrome with cardio-facio-cutaneous involvement-the CFC syndrome. Am J Med Genet 1986, 25(3):413–427.
5. Mazzanti L, Cacciari E, Cicognani A, Bergamaschi R, Scarano E, Forabosco A: Noonan-like syndrome with loose anagen hair: a new syndrome? Am J Med Genet A 2003, 118A(3):279–286.
6. Costello JM: A new syndrome: mental subnormality and nasal papillomata. Aust Paediatr J 1977, 13(2):114–118.
7. Bentires-Alj M, Kontaridis MI, Neel BG: Stops along the RAS pathway in human genetic disease. Nat Med 2006, 12:283–285.
8. Tidyman WE, Rauen KA: The RASopathies: developmental syndromes of Ras/MAPK pathway dysregulation. Curr Opin Genet Dev 2009, 19:230–236.
9. Tartaglia M, Mehler EL, Goldberg R, Zampino G, Brunner HG, Kremer H, van der Burgt I, Crosby AH, Ion A, Jeffery S, Kalidas K, Patton MA, Kucherlapati RS, Gelb BD: Mutations in PTPN11, encoding the protein tyrosine phosphatase SHP-2, cause Noonan syndrome. Nat Genet 2001, 29:465–468.
10. Tartaglia M, Pennacchio LA, Zhao C, Yadav KK, Fodale V, Sarkozy A, Pandit B, Oishi K, Martinelli S, Schackwitz W, Ustaszewska A, Martin J, Bristow J, Carta C, Lepri F, Neri C, Vasta I, Gibson K, Curry CJ, Siguero JP, Digilio MC, Zampino G, Dallapiccola B, Bar-Sagi D, Gelb BD: Gain-of-function SOS1 mutations cause a distinctive form of Noonan syndrome. Nat Genet 2007, 39(1):75–79.
11. Carta C, Pantaleoni F, Bocchinfuso G, Stella L, Vasta I, Sarkozy A, Digilio C, Palleschi A, Pizzuti A, Grammatico P, Zampino G, Dallapiccola B, Gelb BD, Tartaglia M: Germline missense mutations affecting KRAS isoform B are associated with a severe Noonan syndrome phenotype. Am J Hum Genet 2006, 79:129–135.
12. Schubbert S, Zenker M, Rowe SL, Böll S, Klein C, Bollag G, van der Burgt I, Musante L, Kalscheuer V, Wehner LE, Nguyen H, West B, Zhang KY, Sistermans E, Rauch A, Niemeyer CM, Shannon K, Kratz CP: Germline KRAS mutations cause Noonan syndrome. Nat Genet 2006, 38:331–336.
13. Pandit B, Sarkozy A, Pennacchio LA, Carta C, Oishi K, Martinelli S, Pogna EA, Schackwitz W, Ustaszewska A, Landstrom A, Bos JM, Ommen SR, Esposito G, Lepri F, Faul C, Mundel P, López Siguero JP, Tenconi R, Selicorni A, Rossi C, Mazzanti L, Torrente I, Marino B, Digilio MC, Zampino G, Ackerman MJ, Dallapiccola B, Tartaglia M, Gelb BD: Gain-of-function RAF1 mutations cause Noonan and LEOPARD syndromes with hypertrophic cardiomyopathy. Nat Genet 2007, 39:1007–1012.
14. Razzaque MA, Nishizawa T, Komoike Y, Yagi H, Furutani M, Amo R, Kamisago M, Momma K, Katayama H, Nakagawa M, Fujiwara Y, Matsushima M, Mizuno K, Tokuyama M, Hirota H, Muneuchi J, Higashinakagawa T, Matsuoka R: Germline gain-of-function mutations in RAF1 cause Noonan syndrome. Nat Genet 2007, 39:1013–1017.
15. Roberts AE, Araki T, Swanson KD, Montgomery KT, Schiripo TA, Joshi VA, Li L, Yassin Y, Tamburino AM, Neel BG, Kucherlapati RS: Germline gain-of-function mutations in SOS1 cause Noonan syndrome. Nat Genet 2007, 39:70–74.
16. Nava C, Hanna N, Michot C, Pereira S, Pouvreau N, Niihori T, Aoki Y, Matsubara Y, Arveiler B, Lacombe D, Pasmant E, Parfait B, Baumann C, Héron D, Sigaudy S, Toutain A, Rio M, Goldenberg A, Leheup B, Verloes A, Cavé H: Cardio-facio-cutaneous and Noonan syndromes due to mutations in the RAS/MAPK signaling pathway: genotype-phenotype relationships and overlap with Costello syndrome. J Med Genet 2007, 44(12):763–771.
17. Sarkozy A, Carta C, Moretti S, Zampino G, Digilio MC, Pantaleoni F, Scioletti AP, Esposito G, Cordeddu V, Lepri F, Petrangeli V, Dentici ML, Mancini GM, Selicorni A, Rossi C, Mazzanti L, Marino B, Ferrero GB, Silengo MC, Memo L, Stanzial F, Faravelli F, Stuppia L, Puxeddu E, Gelb BD, Dallapiccola B, Tartaglia M: Germline BRAF mutations in Noonan, LEOPARD, and cardiofaciocutaneous syndromes: molecular diversity and associated phenotypic spectrum. Hum Mutat 2009, 30:695–702.
18. Cirstea IC, Kutsche K, Dvorsky R, Gremer L, Carta C, Horn D, Roberts AE, Lepri F, Merbitz-Zahradnik T, König R, Kratz CP, Pantaleoni F, Dentici ML, Joshi VA, Kucherlapati RS, Mazzanti L, Mundlos S, Patton MA, Silengo MC, Rossi C, Zampino G, Digilio C, Stuppia L, Seemanova E, Pennacchio LA, Gelb BD, Dallapiccola B, Wittinghofer A, Ahmadian MR, Tartaglia M, Zenker M: A restricted spectrum of NRAS mutations causes Noonan syndrome. Nat Genet 2010, 42:27–29.
19. Martinelli S, De Luca A, Stellacci E, Rossi C, Checquolo S, Lepri F, Caputo V, Silvano M, Buscherini F, Consoli F, Ferrara G, Digilio MC, Cavaliere ML, van Hagen JM, Zampino G, van der Burgt I, Ferrero GB, Mazzanti L, Screpanti I, Yntema HG, Nillesen WM, Savarirayan R, Zenker M, Dallapiccola B, Gelb BD, Tartaglia M: Heterozygous germline mutations in the CBL tumor suppressor gene cause a Noonan syndrome-like phenotype. Am J Hum Genet 2010, 87:250–257.
20. Digilio MC, Conti E, Sarkozy A, Mingarelli R, Dottorini T, Marino B, Pizzuti A, Dallapiccola B: Grouping of multiple-lentigines/LEOPARD and Noonan syndromes on the PTPN11 gene. Am J Hum Genet 2002, 71:389–394.
21. Legius E, Schrander-Stumpel C, Schollen E, Pulles-Heintzberger C, Gewillig M, Fryns JP: PTPN11 mutations in LEOPARD syndrome. J Med Genet 2002, 39:571–574.
22. Cordeddu V, Di Schiavi E, Pennacchio LA, Ma'ayan A, Sarkozy A, Fodale V, Cecchetti S, Cardinale A, Martin J, Schackwitz W, Lipzen A, Zampino G, Mazzanti L, Digilio MC, Martinelli S, Flex E, Lepri F, Bartholdi D, Kutsche K, Ferrero GB, Anichini C, Selicorni A, Rossi C, Tenconi R, Zenker M, Merlo D, Dallapiccola B, Iyengar R, Bazzicalupo P, Gelb BD, Tartaglia M: Mutation of SHOC2 promotes aberrant protein N-myristoylation and causes Noonan-like syndrome with loose anagen hair. Nat Genet 2009, 41(9):1022–1026.
23. Niihori T, Aoki Y, Narumi Y, Neri G, Cavé H, Verloes A, Okamoto N, Hennekam RC, Gillessen-Kaesbach G, Wieczorek D, Kavamura MI, Kurosawa K, Ohashi H, Wilson L, Heron D, Bonneau D, Corona G, Kaname T, Naritomi K, Baumann C, Matsumoto N, Kato K, Kure S, Matsubara Y: Germline KRAS and BRAF mutations in cardio-facio-cutaneous syndrome. Nat Genet 2006, 38(3):294–296.
24. Rodriguez-Viciana P, Tetsu O, Tidyman WE, Estep AL, Conger BA, Cruz MS, McCormick F, Rauen KA: Germline mutations in genes within the MAPK pathway cause cardio-facio-cutaneous syndrome. Science 2006, 311(5765):1287–1290.
25. Aoki Y, Niihori T, Kawame H, Kurosawa K, Ohashi H, Tanaka Y, Filocamo M, Kato K, Suzuki Y, Kure S, Matsubara Y: Germline mutations in HRAS proto-oncogene cause Costello syndrome. Nat Genet 2005, 37(10):1038–1040.
26. Allanson JE, Bohring A, Dörr HG, Dufke A, Gillessen-Kaesbach G, Horn D, König R, Kratz CP, Kutsche K, Pauli S, Raskin S, Rauch A, Turner A, Wieczorek D, Zenker M: The face of Noonan syndrome: Does phenotype predict genotype. Am J Med Genet A 2010, 152A(8):1960–1966.
27. Helga T, James T, Robinson JT, Mesirov JP: Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Brief Bioinform 2013, 14(2):178–192.
28. Robinson JT, Thorvaldsdóttir H, Winckler W, Guttman M, Lander ES, Getz G, Mesirov JP: Integrative Genomics Viewer. Nat Biotechnol 2011, 29:24–26.
29. Rehm HL, Bale SJ, Bayrak-Toydemir P, Berg JS, Brown KK, Deignan JL, Friez MJ, Funke BH, Hegde MR, Lyon E, Working Group of the American College of Medical Genetics and Genomics Laboratory Quality Assurance Committee: ACMG clinical laboratory standards for next-generation sequencing. Genet Med 2013, 15(9):733–747.
30. Lee BH, Kim JM, Jin HY, Kim GH, Choi JH, Yoo HW: Spectrum of mutations in Noonan syndrome and their correlation with phenotypes. J Pediatr 2011, 159(6):1029–1035.
31. Tartaglia M, Martinelli S, Stella L, Bocchinfuso G, Flex E, Cordeddu V, Zampino G, Burgt I, Palleschi A, Petrucci TC, Sorcini M, Schoch C, Foa R, Emanuel PD, Gelb BD: Diversity and functional consequences of germline and somatic PTPN11 mutations in human disease. Am J Hum Genet 2006, 78(2):279–290.
32. Zenker M, Voss E, Reis A: Mild variable Noonan syndrome in a family with a novel PTPN11 mutation. Eur J Med Genet 2007, 50(1):43–47.
33. Sarkozy A, Conti E, Digilio MC, Marino B, Morini E, Pacileo G, Wilson M, Calabrò R, Pizzuti A, Dallapiccola B: Clinical and molecular analysis of 30 patients with multiple lentigines LEOPARD syndrome. J Med Genet 2004, 41(5):e68.
34. Lepri F, De Luca A, Stella L, Rossi C, Baldassarre G, Pantaleoni F, Cordeddu V, Williams BJ, Dentici ML, Caputo V, Venanzi S, Bonaguro M, Kavamura I, Faienza MF, Pilotta A, Stanzial F, Faravelli F, Gabrielli O, Marino B, Neri G, Silengo MC, Ferrero GB, Torrente I, Selicorni A, Mazzanti L, Digilio MC, Zampino G, Dallapiccola B, Gelb BD, Tartaglia M: SOS1 mutations in Noonan syndrome: molecular spectrum, structural insights on pathogenic effects, and genotype-phenotype correlations. Hum Mutat 2011, 32(7):760–772.
35. Dentici ML, Sarkozy A, Pantaleoni F, Carta C, Lepri F, Ferese R, Cordeddu V, Martinelli S, Briuglia S, Digilio MC, Zampino G, Tartaglia M, Dallapiccola B: Spectrum of MEK1 and MEK2 gene mutations in cardio-facio-cutaneous syndrome and genotype-phenotype correlations. Eur J Hum Genet 2009, 17(6):733–740.
36. Zenker M: Noonan syndrome and related disorders: A matter of deregulated Ras signaling. Monogr Hum Genet 2010, Vol. 17.
37. Tartaglia M, Gelb BD, Zenker M: Noonan syndrome and clinically related disorders. Best Pract Res Clin Endocrinol Metab 2011, 25(1):161–179.
38. Digilio MC, Lepri F, Baban A, Dentici ML, Versacci P, Capolino R, Ferese R, De Luca A, Tartaglia M, Marino B, Dallapiccola B: RASopathies: Clinical Diagnosis in the First Year of Life. Mol Syndromol 2011, 1(6):282–289.
39. Van der Burgt I: Noonan syndrome. Orphanet J Rare Dis 2007, 2:4.
40. Voron DA, Hatfield HH, Kalkhoff RK: Multiple lentigines syndrome. Case report and review of the literature. Am J Med 1976, 60:447–456.
41. Kavamura NI, Peres CA, Alchhorne MMA, Brunoni D: CFC index for the diagnosis of cardiofaciocutaneous syndrome. Am J Med Genet 2002, 112:12–16.
42. Aoki Y, Niihori T, Banjo T, Okamoto N, Mizuno S, Kurosawa K, Ogata T, Takada F, Yano M, Ando T, Hoshika T, Barnett C, Ohashi H, Kawame H, Hasegawa T, Okutani T, Nagashima T, Hasegawa S, Funayama R, Nagashima T, Nakayama K, Inoue S, Watanabe Y, Ogura T, Matsubara Y: Gain-of-Function Mutations in RIT1 Cause Noonan Syndrome, a RAS/MAPK Pathway Syndrome. Am J Hum Genet 2013, 11:93.

doi:10.1186/1471-2350-15-14
Cite this article as: Lepri et al.: Diagnosis of Noonan syndrome and related disorders using target next generation sequencing. BMC Medical Genetics 2014 15:14.

Development of a Multi-Biomarker Disease Activity Test for Rheumatoid Arthritis

Michael Centola1., Guy Cavet2*., Yijing Shen3, Saroja Ramanujan2, Nicholas Knowlton4, Kathryn A. Swan2, Mary Turner1, Chris Sutton1, Dustin R. Smith1, Douglas J. Haney2, David Chernoff5, Lyndal K. Hesterberg6, John P. Carulli7, Peter C. Taylor8, Nancy A. Shadick9, Michael E. Weinblatt9, Jeffrey R. Curtis10

1 Arthritis and Immunology, Oklahoma Medical Research Foundation, Oklahoma City, Oklahoma, United States of America
2 Department of Informatics, Crescendo Bioscience Inc., South San Francisco, California, United States of America
3 Department of Biostatistics and Bioinformatics, Crescendo Bioscience, Inc., South San Francisco, California, United States of America
4 Biomarker & Proteomic Core Facility, Oklahoma Medical Research Foundation, Oklahoma City, Oklahoma, United States of America
5 Department of Medicine, Crescendo Bioscience, Inc., South San Francisco, California, United States of America
6 Department of Development, Crescendo Bioscience, Inc., South San Francisco, California, United States of America
7 Genetics and Genomics Group, Biogen Idec, Cambridge, Massachusetts, United States of America
8 Kennedy Institute of Rheumatology, University of Oxford, Oxford, United Kingdom
9 Division of Rheumatology, Immunology and Allergy, Brigham and Women’s Hospital, Boston, Massachusetts, United States of America
10 Division of Clinical Immunology and Rheumatology, University of Alabama at Birmingham, Birmingham, Alabama, United States of America

Abstract

Background: Disease activity measurement is a key component of rheumatoid arthritis (RA) management. Biomarkers that capture the complex and heterogeneous biology of RA have the potential to complement clinical disease activity assessment.

Objectives: To develop a multi-biomarker disease activity (MBDA) test for rheumatoid arthritis.

Methods: Candidate serum protein biomarkers were selected from extensive literature screens, bioinformatics databases, mRNA expression and protein microarray data. Quantitative assays were identified and optimized for measuring candidate biomarkers in RA patient sera. Biomarkers with qualifying assays were prioritized in a series of studies based on their correlations to RA clinical disease activity (e.g. the Disease Activity Score 28-C-Reactive Protein [DAS28-CRP], a validated metric commonly used in clinical trials) and their contributions to multivariate models. Prioritized biomarkers were used to train an algorithm to measure disease activity, assessed by correlation to DAS and area under the receiver operating characteristic curve for classification of low vs. moderate/high disease activity. The effect of comorbidities on the MBDA score was evaluated using linear models with adjustment for multiple hypothesis testing.

Results: 130 candidate biomarkers were tested in feasibility studies and 25 were selected for algorithm training. Multi-biomarker statistical models outperformed individual biomarkers at estimating disease activity. Biomarker-based scores were significantly correlated with DAS28-CRP and could discriminate patients with low vs. moderate/high clinical disease activity. Such scores were also able to track changes in DAS28-CRP and were significantly associated with both joint inflammation measured by ultrasound and damage progression measured by radiography. The final MBDA algorithm uses 12 biomarkers to generate an MBDA score between 1 and 100. No significant effects on the MBDA score were found for common comorbidities.

Conclusion: We followed a stepwise approach to develop a quantitative serum-based measure of RA disease activity, based on 12 biomarkers, which was consistently associated with clinical disease activity levels.
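The two performance measures named in the abstract, correlation of a biomarker-based score with DAS28-CRP and the area under the receiver operating characteristic curve (AUROC) for low vs. moderate/high disease activity, can be illustrated with a short Python sketch. Everything in it is an assumption made for illustration: the eight data points are invented and are not study data, the function and variable names are hypothetical, and the DAS28-CRP cutoff of 3.2 for low disease activity is a commonly used convention rather than a value stated in this abstract. It is not the MBDA algorithm itself.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def auroc(scores, is_positive):
    """AUROC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as one half)."""
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Invented example data (hypothetical patients, not from the study).
das28_crp = [1.9, 2.4, 2.8, 3.5, 4.1, 4.8, 5.6, 6.2]
biomarker_score = [18, 30, 25, 28, 38, 55, 62, 70]

LOW_CUTOFF = 3.2  # assumed threshold for low disease activity
moderate_or_high = [value > LOW_CUTOFF for value in das28_crp]

r = pearson_r(biomarker_score, das28_crp)
auc = auroc(biomarker_score, moderate_or_high)
print(f"Pearson r = {r:.2f}, AUROC = {auc:.2f}")
```

The pairwise definition of AUROC is used here because it needs no external library and makes the interpretation explicit; for real cohorts a rank-based implementation from a statistics package would be the more usual choice.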

Citation: Centola M, Cavet G, Shen Y, Ramanujan S, Knowlton N, et al. (2013) Development of a Multi-Biomarker Disease Activity Test for Rheumatoid Arthritis. PLoS ONE 8(4): e60635. doi:10.1371/journal.pone.0060635

Editor: Oliver Frey, University Hospital Jena, Germany

Received August 12, 2012; Accepted March 1, 2013; Published April 9, 2013

Copyright: © 2013 Centola et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: This work was partly supported by Crescendo Bioscience and Biogen Idec. Crescendo Bioscience was partly responsible for study design, data collection, data analysis and preparation of the manuscript. No additional external funding received for this study.

Competing Interests: The authors have the following interests. This work was partly supported by Crescendo Bioscience and Biogen Idec. GC, YS, SR, KS, DS, DH, LH: employees of Crescendo Bioscience; MC, NK, CS, DC, PT, MW, JC: consultants to Crescendo Bioscience; GC, YS, SR, KS, DS, DH, LH, MC, CS, DC: stock options in Crescendo Bioscience. GC, MC, NK, YS: inventors on US patent application #20110137851 ("Biomarkers and Methods for Measuring and Monitoring Inflammatory Disease Activity") based on this work. JPC is employed by Biogen Idec. This paper describes the development of a diagnostics technology currently marketed by Crescendo Bioscience. There are no further patents, products in development or marketed products to declare. This does not alter the authors’ adherence to all the PLOS ONE policies on sharing data and materials, as detailed online in the guide for authors.

* E-mail: [email protected]

. These authors contributed equally to this work.

PLOS ONE | www.plosone.org 1 April 2013 | Volume 8 | Issue 4 | e60635 Multi-Biomarker Disease Activity Test for RA

Introduction and develop such a multi-biomarker disease activity (MBDA) score for the assessment of RA disease activity. This score has RA is a common, chronic, idiopathic autoimmune disease with subsequently been tested and validated in additional patients over 1.3 million people diagnosed in the US and over 4 million [21,22]. worldwide. RA is characterized by synovitis, inflammatory joint fluid, degradation of articular cartilage, erosion of the marginal Methods bone, and systemic immune and inflammatory manifestations. Despite recent advances in treatment including the introduction of Ethics Statement potent biologic agents, substantial disease activity persists in many Clinical studies used as the source of biomarker samples were patients, with accompanying progressive bone and soft tissue approved by institutional review board (Partners Institutional damage, extra-articular consequences, disability, and increased Review Board for BRASS, Oklahoma Medical Research Foun- mortality. dation Institutional Review Board for the Oklahoma City cohort, Several studies, such as TICORA, CAMERA, BeSt, and and Quorum Institutional Review Board for InFoRM) and all FinRACO, have demonstrated improved outcomes with tight patients gave written informed consent. control of disease activity, a strategy employing frequent disease activity measurement and treatment adjustment to reach a specific Overview target disease activity level [1–3]. Treat to Target guidelines codify A multi-stage approach to biomarker discovery and algorithm these results into specific recommendations for optimal care development was used (Table 1). In Stage 1, Screening, candidate including frequent disease activity monitoring for all patients [4]. biomarkers were identified and corresponding assays were selected ACR guidelines also recommend regular disease activity testing and optimized. Stage 2, Feasibility, involved two parts: Stage 2A [5]. 
However there is no current gold standard for disease activity included four studies to assess and prioritize biomarkers based on assessment in RA. Multiple measures are used, each with varying their relationships to clinical disease activity; Stage 2B was a pilot strengths and weaknesses, such that no single ‘best’ measure of imaging study to verify that multi-biomarker disease activity scores disease activity could be recommended in U.S. or international could capture critical aspects of disease activity. Stage 3, Test RA guidelines. Development, involved further assay optimization, biomarker Current disease activity indices are typically composite scores selection, and the training and selection of the final MBDA that can include physician assessment of symptoms, patient algorithm. Once the algorithm was finalized, the impact of reported measures, and laboratory measurements. The Disease comorbidities on the MBDA score was assessed. Activity Score (DAS), the Simplified Disease Activity Index (SDAI) and the Clinical Disease Activity Index (CDAI), for example, rely Patient Cohorts & Samples on joint counts, patient self-assessment and (with the exception of Biomarkers were assayed from stored serum samples obtained CDAI) laboratory tests, while the Routine Assessment of Patient from patients from multiple clinical studies/cohorts. Serum was Index Data-3 is based solely on PROs [6–9]. Although physician collected in standard Serum Separator Tubes in accordance with evaluation and patient self-reporting are critical components of manufacturer’s instructions and frozen at 280 Celsius within 72 patient assessment and management, they are influenced by intra- hours. 
Material was maintained between 2 and 8 degrees Celsius and inter-assessor variability and can be confounded by comor- until freezing except in the BRASS cohort, for which SST tubes bidities or accumulated joint damage resulting from long-standing were shipped at ambient temperature prior to separation of serum disease [10–13]. by centrifugation. In all studies except the Stage 2B Pilot Imaging Protein biomarkers can provide complementary, objective, and study, observational cohorts were used. The objective was to reliable measurements reflecting underlying pathophysiological evaluate the intended use population for the MBDA test: diverse processes. Erythrocyte sedimentation rate (ESR) and CRP patients representative of the RA population in the United States measurements are currently incorporated into clinical disease and Western Europe, treated according to current practice norms. activity measures, including the DAS and SDAI. However, these The reasons for using multiple observational cohorts were 1) to biomarkers are non-specific indicators of inflammation that can be ensure that only biomarkers that behaved consistently across elevated due to age, anemia and the presence of immunoglobulins, different patient populations would be selected for use in the final and that can be unexpectedly low or even normal in patients with MBDA algorithm, and 2) to access sufficiently large numbers of active disease, possibly due to underlying genetics [14–16]. patients for adequate statistical power. In the Stage 2B Pilot Therefore, ESR and CRP measurement may not be useful in all Imaging study the objective was to examine the relationship RA patients, and other biomarkers may provide important between disease activity biomarkers and disease measures based information about disease state. 
Previous research studies have on joint imaging, and the cohort was selected because of the reported that other protein biomarkers implicated in the availability of high-quality ultrasound and X-ray image data. In all pathophysiology of joint disease, such as vascular endothelial studies the determination of clinical disease activity was carried out growth factor-A (VEGF-A) and matrix metalloproteinase 3 using standard methods (refs) and the assessment was carried out (MMP3), are also correlated with disease activity [17–20]. We without knowledge of biomarker concentrations (which were hypothesized that measurement of multiple serum protein determined later). biomarkers combined into a single score could quantitatively All patients fulfilled at least 4 of 7 of the 1987 revised ACR and objectively characterize RA disease activity and enhance criteria for RA [23]. Exclusion criteria applicable to all source current disease activity assessment. Periodic monitoring of this cohorts were: oral (.10 mg/day) or parenteral (any) corticosteroid score could complement existing approaches to patient care, use within the last 4 weeks; women who were pregnant, nursing, or facilitating quantitative tracking of patient status and treatment planning pregnancy within 6 months of study enrollment; signs or impact and supporting management of difficult cases such as symptoms of severe, progressive or uncontrolled renal, hepatic, patients with comorbidities or conflicting physician vs. patient hematologic, gastrointestinal, endocrine, pulmonary, cardiac, assessment. We applied a multi-step development process using neurologic, or cerebral disease; concomitant diagnosis or history multiple diverse cohorts to prioritize biomarkers of disease activity of congestive heart failure; a known history of a demyelinating

PLOS ONE | www.plosone.org 2 April 2013 | Volume 8 | Issue 4 | e60635 Multi-Biomarker Disease Activity Test for RA

Table 1. Staged approach used in biomarker discovery and prioritization and algorithm development.

| Stage | Study | Objectives | Biomarkers | Patients | Samples |
|---|---|---|---|---|---|
| SCREENING | 1 | Candidate marker identification; Initial assay optimization | 130* | 20 | 20 |
| FEASIBILITY | 2A Study I | Prioritization | 113 | 128 | 128 |
| FEASIBILITY | 2A Study II | Prioritization | 75 | 320 | 320 |
| FEASIBILITY | 2A Study III | Prioritization | 65 | 85 | 255 |
| FEASIBILITY | 2A Study IV | New marker evaluation & Prioritization | 16 | 119† | 119† |
| FEASIBILITY | 2B Pilot Imaging | Assessment of capabilities of biomarker-based disease activity scores | >25‡ | 24 | 107 |
| DEVELOPMENT | 3 Training | Analytical validation; Development & testing of candidate algorithms | 25 | 708 | 708 |

*130 biomarkers had adequate measurability to advance to studies of clinical disease activity.
†Patients and samples in Study IV represent a subset of those evaluated in Study II.
‡In addition to the 25 biomarkers that were subsequently advanced to model development (Stage 3), this study also examined other serum biomarkers of potential interest to prediction of structural damage progression, some of which overlapped with biomarkers considered for disease activity prediction.
doi:10.1371/journal.pone.0060635.t001

disease; any known malignancy currently or within the previous 5 years (with the exception of basal cell or squamous cell carcinoma of the skin that had been fully excised with no evidence of recurrence); seropositivity for HIV; active infection or active substance abuse. See Table 2 for cohort characteristics.

Table 2. Patient characteristics in Feasibility (Stage 2) studies.

| | Study I | Study II | Study III | Study IV | PoC Study |
|---|---|---|---|---|---|
| Number of patients/samples | 128/128 | 320/320 | 85/255* | 119/119** | 24/107* |
| Female, % | 82 | 80 | 91 | 77 | 75 |
| CCP+, % | 63 | 62 | 62 | 61 | n/a |
| RF+, % | 83 | 83 | 64 | 97 | n/a |
| Smoker, % | n/a | 13 | 4 | 22 | n/a |
| Methotrexate, % | 53 | 61 | 48 | 64 | 100 |
| Non-biologic DMARDs, % | 69 | 76 | 64 | 81 | 100 |
| Biologics, % | 65 | 53 | 43 | 50 | 50 |
| Corticosteroids, % | 24 | 27 | 27 | 33 | n/a |
| Age, mean±SD (min,max) | 60±13 | 59±14 | 59±13 | 60±14 | 56±13 |
| DAS28-CRP, median (IQR) | 5.8 (4.7–6.5) | 4.0 (2.9–5.3) | 3.8 (2.7–5.0) | 5.2 (4.1–6.2) | 3.3 (2.2–4.4) |
| TJC28, median (IQR) | 12 (4.8–20) | 2.0 (0–8.3) | 7.0 (2.0–14) | 8.0 (3.0–15) | 3.0 (0.0–8.0) |
| SJC28, median (IQR) | 16 (12–21) | 6.5 (2–13) | 2.0 (0.0–10) | 14 (8.0–20) | 4.0 (1.0–7.0) |
| CRP mg/L, median (IQR) | 14 (4.0–32) | 14 (5.1–45) | 14 (4.0–47) | 18 (6.9–47) | 25 (7.6–70) |
| PG, median (IQR) | 5.0 (2.9–7.0) | 2.5 (1.0–5.0) | 3.0 (1.0–5.0) | 5.0 (2.0–6.5) | n/a |

DMARD, disease-modifying anti-rheumatic drug; IQR, inter-quartile range.
*For studies with multiple samples per patient, sex, age, and serological status (when available) statistics are based on unique patients. Other statistics are based on all samples.
**All studies used independent patients and samples, except Study IV, which used a subset of Study II samples.
doi:10.1371/journal.pone.0060635.t002

Feasibility studies. Serum samples examined in Feasibility Studies I–IV were derived from the Oklahoma City cohort, an observational study of patients seen at community clinics located in and around Oklahoma City, OK, and from the Brigham and Women's Rheumatoid Arthritis Sequential Study (BRASS) Registry [24,25], an observational cohort study of patients seen at the Brigham and Women's Hospital in Boston, MA. For Study I, single visit samples were obtained from 128 patients with RA from the Oklahoma City cohort. For Study II, single visit samples were obtained from an additional 140 patients from the Oklahoma City cohort and 180 patients from the BRASS registry. For Study III, serum samples at baseline, 1 year, and 3 years were obtained from each of 85 patients in the BRASS registry, in order to evaluate the utility of longitudinal disease activity data. For Study IV, single visit samples were analyzed from 119 patients among the 140 Oklahoma City cohort patients from Study II for whom sufficient residual sample volume was available, in order to examine new candidate biomarkers.

Pilot Imaging Study. Serum samples at multiple time-points (0, 6, 18, 52, 110 weeks) were obtained from 24 patients followed in a 2-year blinded study in the UK comparing methotrexate+infliximab combined therapy with methotrexate monotherapy in aggressive early RA [26,27]. Patients were evaluated with ultrasound (US) at 0, 18, 54 and 110 weeks and scored for synovial thickening (ST) and for vascularity by power Doppler area (PDA). Details of imaging and scoring have been previously described [26]. Briefly, each MCP was scored for ST using grayscale images on a 0–5 scale, and the overall ST score was the total of the scores for the individual joints. The numbers of pixels showing Doppler signal were summed across the 10 MCPs to give the overall PDA score. Radiography was performed at 0, 54, and 110 weeks and used to determine van der Heijde-modified total Sharp scores (TSS).

Algorithm training. Single samples were obtained from each of a total of 703 patients from the BRASS cohort and from the Index for Rheumatoid Arthritis Measurement (InFoRM) study [28], a multicenter North American observational study conducted by Crescendo Bioscience, South San Francisco, CA. Five hundred and twelve patients from InFoRM were selected (from ~1300 total) to be representative of the overall disease activity distribution in the study and also of the broader North American RA population from which the InFoRM cohort was drawn. An additional 29 InFoRM patients were selected to enrich for high and low disease activity. One hundred and sixty-seven patients were selected from the BRASS registry to capture the full range of disease activity, enriching for very low and high disease activity (although 5 were subsequently excluded from analysis due to incomplete clinical data). Representation of low and high disease activity patients was enriched to increase the power to detect associations between biomarkers and clinical disease activity, and to ensure that the MBDA algorithm worked across the full disease activity range. All patients used were independent from those assessed in the prioritization studies and pilot imaging study listed above. The 703 samples were used to further prioritize the individual biomarkers based on their associations with clinical measures of disease activity. A subset of these patients was used for the final fitting of the statistical models used in the MBDA algorithm. In order to fit models optimally, it was important to have substantial variation in disease activity levels; therefore, the subset of 249 samples was selected to have similar numbers of patients in low, moderate and high disease activity. The performance of the most promising models from training was evaluated in an independent 70-sample subset of the 703 samples, selected to have similar disease activity composition to the final 249-sample training set. The best models were further evaluated in patients from the computer-assisted management in early rheumatoid arthritis (CAMERA) study [21].

Comorbidities study. Samples from the 512 representative InFoRM patients used in algorithm training were analyzed to evaluate the impact of common comorbidities on the MBDA score. The presence or absence of comorbidities was recorded by study investigators based on their individual clinical knowledge. The case report forms did not specify diagnostic or classification criteria for diseases other than RA.

Candidate Biomarker Selection

Literature analysis. Candidate biomarkers previously reported to be related to RA disease activity or underlying processes were identified through manual review of scientific articles and bioinformatics databases of findings extracted from the literature. Manual literature searches were conducted using PubMed (http://www.ncbi.nlm.nih.gov/pubmed/). Bioinformatics approaches employed IRIDESCENT [29] and Ingenuity Pathways Analysis (Ingenuity Systems, Redwood City, CA).

Assays & Optimization

Detection of anti-cyclic citrullinated peptide (CCP) and rheumatoid factor (RF) antibody activity. For the Oklahoma cohort samples, anti-CCP was measured using a commercially available ELISA kit (Quanta Lite CCP 3.1 IgG/IgA Kit, INOVA Diagnostics Inc., San Diego, CA) and RF was measured with the EL-RF/3 kit (Theratest Laboratories, Lombard IL). In the BRASS and InFoRM cohort samples both RF and anti-CCP were measured using Cobas analyzers (Roche Diagnostics, Indianapolis, IN) according to manufacturer's instructions.

Candidate biomarker assays. All assays were run in 96-well plates with 8 point standard curves (7 standards and a blank). Both standards and patient samples were run in duplicate in adjacent wells on the same plate in all studies. A single lot of each assay reagent was used in each study wherever possible to minimize plate to plate variation. Pools of commercially available human sera (Bioreclamation Inc., Nassau, NY) from rheumatoid arthritis patients, osteoarthritis patients, systemic lupus erythematosus patients and unaffected controls were run as process controls on each plate. Luminex-based assays were run on either a Luminex 100 or Luminex 200 device and analyzed using either Bioplex curve fitting (Biorad Inc. Hercules, CA) and 4-parameter logistic (4PL) curve fitting with Power Law Variance weighting or Xponent software (Luminex Inc. Austin TX) with 4PL curve fitting and 1/y² weighting. Meso Scale Discovery (MSD) assays were evaluated using a Sector Imager 6000 device (Meso Scale Discovery, Gaithersburg, MD) using MSD Discovery Workbench software with 4 parameter logistic curve fitting and 1/y² weighting. ELISA-based assays were evaluated using a BioTek ELx800 plate reader (BioTek Inc. Winooski, VT) and TERIS software with 4 parameter logistic curve fitting and 1/y weighting.

Sample and standard/calibrator dilutions were adjusted such that marker levels were best positioned in the linear portions of standard curves. For Stages 1 and 2, serum from 20 RA patients with a representative distribution of DAS scores was used to optimize assay sensitivity and dynamic range, to best map RA patient protein measurements onto the linear portion of the standard curve, and to minimize serum volume requirements. Tested assays are listed in Table S1, which also indicates how assays were multiplexed and the suppliers of assay materials including standards. Final sample dilutions, assay precision and measurability limits are reported in Table S2. For Stage 3 (Development), assays were optimized and characterized as described previously so as to be highly consistent across studies and suitable for use in a clinical diagnostic test [30]. Details of assay multiplexing, sample dilutions, limits and sources of standards are given in Table S3.

Heterophilic antibody and RF interference. Assays sensitive to heterophilic antibody activity were identified by evaluating whether blocking reagent (Heteroblock, Omega Biologicals, Bozeman, MT) reduced signal in 5 samples with high RF titers but not in 5 samples with low RF. If an assay showed evidence of interference, the optimal Heteroblock concentration for that assay was determined by identifying the minimum concentration that suppressed spurious signal in high RF samples, as previously described [31]. Heteroblock was then added to all subsequent plates for that assay.

Biomarker data quality control. Quality control of each assay plate was conducted by monitoring the performance of the process controls (the serum pools diluted alongside the samples on each plate) and, for Stage 3, also of run controls (made from standard material). In stages 1 and 2, consistency of control recovery was compared to the other plates within a study and across studies in order to identify any outlier plates that warranted

repeat assay runs. When individual sample signal in a study was too low to fall on the standard curve, the concentration was set to the lowest value observed for any sample in that study. Conversely if the signal was too high to fall on the standard curve, the concentration was set to the highest value observed. In Stage 3, control tables were established for control samples with acceptability ranges being defined as ±3 standard deviations (SD) of the expected value. If multiple run controls or process controls on a plate had observed concentrations outside the acceptability range for any given assay, the plate was flagged for review and repeated if necessary. For plates that were flagged for review, standard curves were examined (i.e. % recovery versus expected concentrations and consistency among duplicate wells) and edited if necessary to remove outlier wells. If the signal coefficient of variation (CV) between a sample's duplicate wells was greater than 20%, the sample was flagged for review. If either a plate or sample was determined to be unacceptable, a data review team decided whether to remove the data from the analysis or re-run if additional sample volume was available.

Statistical Analysis & Modeling

Biomarker evaluation & prioritization. In each study, biomarkers were assessed and prioritized based on univariate and multivariate association with 8 clinical measures (DAS28-CRP4, DAS28-ESR4, CDAI, SDAI, swollen joint count [SJC], tender joint count [TJC], and patient global assessment [PGA]), both in the entire study cohort and in subgroups defined by autoantibody status, sex, and therapy. Data for each study was analyzed separately. The need to adjust data for plate to plate variation was assessed for each study. Univariate associations between marker levels and clinical measures were assessed using Pearson and Spearman correlations. Statistical significance was assessed after correction for multiple testing (FDR<20%, [32]). Multivariate modeling was performed using Ordinary Least Squares (OLS) regression, Least Absolute Shrinkage and Selection Operator (LASSO) [33], and elastic net [34] modeling approaches, each with forward stepwise selection of biomarkers.

In each study, a univariate rank was calculated for each biomarker based on the number of times the marker passed statistical significance criteria in univariate analyses using different disease activity metrics and subgroups. For multivariate analysis, a priority was assigned for each biomarker for each multivariate model based on order of entry into the model (the first biomarker included in a model received a priority of 1, etc.). A score was calculated for each biomarker as the sum of the inverse of the biomarker's priority values across all models. Ranks for multivariate analysis were determined by sorting biomarkers in descending order of this score. Biomarkers not included in any model were assigned a common lowest rank. To combine ranking information from univariate and multivariate analyses, a combined score was calculated as the sum of the inverse univariate rank and the inverse multivariate rank. Combined ranks were determined by sorting the biomarkers in descending order of combined score. The combined rank was used to inform biomarker advancement from one Feasibility study to the next. At the end of Stage 2, the inverse combined ranks from all of the Stage 2 studies were summed to determine an overall score. Biomarkers were sorted in descending order of the overall score to determine a "grand rank" that summarized evidence from multiple studies. This grand rank was used, along with other criteria as described in Results, to inform advancement to algorithm training in Stage 3.

Pilot imaging analyses. The pilot imaging study included evaluation of association between disease activity levels predicted by multivariate biomarker models and US synovitis, Power Doppler area, and subsequent change in TSS, as well as the association between change in biomarker-based disease activity score and change in DAS28-CRP. Leave one out cross-validation [35] was used to generate biomarker-based disease activity scores, whereby the score for each patient was generated using a model trained on DAS28-CRP data from all the other patients. Association with other measures was assessed by Spearman correlation. Cross-sectional correlation to ultrasound was determined at all time-points for which paired biomarker/US measurements were available. Longitudinal correlation to damage progression was evaluated for baseline biomarker-based disease activity scores vs. change in TSS from baseline to year 1, and for year 1 disease activity scores vs. change in TSS from year 1 to year 2. Correlation between change in biomarker-based disease activity scores and change in DAS28-CRP was evaluated for changes from baseline to year 1 and from year 1 to year 2.

Algorithm training. Multivariate methods considered in algorithm development included OLS, LASSO, elastic net, and also approaches combining the Curds & Whey (CW) multivariate response method [36] with OLS or LASSO (CW-OLS and CW-LASSO respectively). Algorithm performance was measured by Pearson correlation to DAS28-CRP and the area under the receiver operating characteristic curve (AUROC) for classifying patients into low vs. moderate to high disease activity using a DAS28-CRP cutoff of 2.67. To minimize variability of performance estimates due to unequal numbers of patients in low and moderate/high disease activity groups, the AUROC was also assessed using an alternate threshold equal to the study median DAS28-CRP. The performance of the regression methods was compared in 70/30 cross-validation (repeatedly training in a randomly selected 70% of the data and testing in the remaining 30%). The number of biomarkers in each regression model was chosen using nested 10-fold cross validation: within a training set, samples were divided into 10 equal groups, models were built using data from 9 groups, and performance was tested in the one group that was not used for model building. This process was repeated, each time with a different group set aside for independent testing, and then the performance was averaged across all models. This procedure was used for models with all possible numbers of variables, and the number of variables with the best performance was chosen. In the CW approaches, nested 10-fold cross-validation was used for each sub-model corresponding to each component of DAS to identify the optimal number of variables for that sub-model. The performance of the MBDA algorithms identified in training was evaluated by Pearson correlation to DAS28-CRP in the 70-patient test data set, and the top performing models were selected.

Comorbidities analysis. Comorbidities were characterized according to physicians' responses on InFoRM case report forms. The median MBDA score, CRP, CDAI, and DAS28-CRP were compared in subgroups of RA subjects with and without various comorbid conditions, using the ratio of the median values in the two groups. A ratio of 1.0 indicated that the value of the score was not affected by the comorbidity; ratios meaningfully higher or lower than 1.0 suggest that the comorbidity may inappropriately raise or lower the score, respectively. Multiple linear regression was used to control for age and sex. The effect of multiple testing was controlled using the false discovery rate method of Benjamini and Hochberg [32].

Statistical analysis was carried out with the R programming language (http://www.r-project.org) unless otherwise specified.
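The rank-combination scheme described under "Biomarker evaluation & prioritization" can be illustrated with a short sketch. It is written in Python for illustration only (the authors report using R), and it is not the authors' code: the function names, the input layout (each model given as a list of biomarkers in order of entry), and the alphabetical tie-breaking rule are all assumptions made here, since the text does not specify how ties were handled.

```python
from collections import defaultdict


def multivariate_scores(models):
    """Sum of inverse entry priorities across models.

    `models` is a list of lists; each inner list holds biomarker names in
    their order of entry into one model (first entry has priority 1).
    """
    scores = defaultdict(float)
    for entry_order in models:
        for priority, marker in enumerate(entry_order, start=1):
            scores[marker] += 1.0 / priority
    return dict(scores)


def ranks_from_scores(scores, all_markers):
    """Rank biomarkers by descending score (rank 1 = best).

    Ties are broken alphabetically (an assumption). Markers in
    `all_markers` that have no score share a common lowest rank.
    """
    ordered = sorted(scores, key=lambda m: (-scores[m], m))
    ranks = {m: i for i, m in enumerate(ordered, start=1)}
    lowest = len(ordered) + 1
    for marker in all_markers:
        ranks.setdefault(marker, lowest)
    return ranks


def combined_ranks(univariate_rank, multivariate_rank):
    """Combined score = 1/univariate rank + 1/multivariate rank."""
    combined = {
        m: 1.0 / univariate_rank[m] + 1.0 / multivariate_rank[m]
        for m in univariate_rank
    }
    return ranks_from_scores(combined, combined)


def grand_ranks(per_study_combined_ranks):
    """Overall score = sum of inverse combined ranks across studies."""
    totals = defaultdict(float)
    for ranks in per_study_combined_ranks:
        for marker, rank in ranks.items():
            totals[marker] += 1.0 / rank
    return ranks_from_scores(dict(totals), totals)
```

For example, with two hypothetical models `[["IL6", "SAA", "EGF"], ["SAA", "IL6"]]`, IL6 and SAA each score 1 + 1/2 = 1.5 and EGF scores 1/3; a marker that entered neither model receives the shared lowest rank.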


Results

Stage 1: Screening

The objective of the Screening stage was to identify proteins that could be measured in serum and that had potential to show association to RA disease activity. We focused on serum testing due to the standard use of serum in clinical rheumatology testing as well as existing literature suggesting the association of multiple serum proteins with RA disease activity [17,20,37–41]; see also Table S1. A total of 130 candidate biomarkers were selected based on comprehensive literature review, previous experiments, the availability of immunoassay components, and evaluation of measurability in serum (see Methods). These candidate biomarkers were then evaluated by examining their association with clinical disease activity measures in a series of Feasibility studies.

Stage 2: Feasibility

Patient characteristics for the 4 Feasibility studies are provided in Table 2.

Feasibility stage 2A – biomarker prioritization. In Stage 2A, four successive studies were used to prioritize and reduce the candidate biomarker list to a final set of biomarkers for algorithm development. Table S4 provides the list of biomarkers evaluated in each study, along with examples of univariate correlation results and biomarker rankings.

Study I was an initial assessment of the utility of candidate biomarkers for predicting disease activity, in order to prioritize roughly 50 biomarkers for further testing. Testing was done with 113 biomarkers with satisfactory assay performance in the Screening stage. Ninety-one different assays for 72 unique biomarkers received a rank reflecting their contribution to multivariate disease activity models. Fifty-two of these biomarkers were selected for advancement based on their combined rank from univariate and multivariate analysis.

Studies II, III, and IV were used for further refinement of the biomarker set. Seventy-five biomarkers were assessed in Study II: in addition to the 52 markers that were selected based on combined rank in Study I, another 23 were included due to strong univariate relationship to disease activity in Study I or because the assays were included on the same multiplex assay as other prioritized biomarkers. Of the 75 biomarkers tested in Study II, 65 were also analyzed in Study III (10 were eliminated due to low performance). In Study IV, we then assessed 16 new biomarkers that we had not previously tested, either because they were newly identified as candidate biomarkers or because assays had recently become available.

In Study II, we used results from multivariate analysis to review the potential value of prototype multi-biomarker models for disease activity. In cross-validation, multi-biomarker models had a mean Pearson correlation with DAS28-CRP of r = 0.60, compared to the highest correlation of any individual biomarker to DAS28-CRP of r = 0.38, observed in this study for CRP. In an AUROC analysis examining the ability of the models to classify patients into either low disease activity or moderate/high disease activity categories using the clinical cutoff of 2.67 [42], the average AUROC was 0.89 (Figure 1). Using the median DAS28-CRP as threshold the average AUROC was 0.77.

Figure 1. Receiver operating characteristic curve for biomarker-based multivariate models of disease activity in Study II. Curve shows the average true positive rate across 100 folds of cross validation. In each fold a model was trained on a randomly selected 70% of the data and performance was tested on the remaining 30%. doi:10.1371/journal.pone.0060635.g001

Combined univariate/multivariate biomarker ranks were determined in each of the 4 prioritization studies and guided decisions on marker advancement between studies. Because Study IV was performed in a subset of patients previously analyzed in Study II, data on the 16 new biomarkers was combined with measurements of the previous 75 biomarkers to determine the ranks in this study.

The results of the 4 biomarker prioritization studies were then analyzed together to determine a final biomarker set for advancement to algorithm training. The grand ranks calculated from the integrated results of the 4 studies (see Methods) are provided in Table S4. Biomarkers with grand rank ≤24 were prioritized for advancement to training, with the following exceptions: apolipoprotein-B was eliminated due to poor assay performance, BAFF was eliminated because of evidence that the B-cell depleting therapy rituximab led directly to large increases in marker level [43], and CXCL10 and GM-CSF were eliminated because their high ranking resulted entirely from good performance in a single study. Furthermore, four lower-ranking biomarkers were selected due to their representation of key biological functions or pathways in RA, especially those implicated by other prioritized biomarkers: VCAM1 was included as complementary to ICAM1 (adhesion molecules), IL1β was included for its relation to IL1Ra (IL-1 pathway), MMP1 was included for its similarity to MMP3 (both MMPs) and CCL22 (MDC) was included to represent monocyte/macrophage biology. Thus the sequence of prioritization studies yielded a set of 24 biomarkers.

Feasibility stage 2B – pilot imaging study. In parallel with the later stages of biomarker prioritization, we assessed whether prototype multivariate models of disease activity based on serum biomarkers demonstrated various critical aspects of an effective disease activity measure, specifically: an association with clinically-assessed disease activity and changes therein; an association with imaging-based assessment of joint inflammation; and an association with subsequent radiographically assessed damage progression.

The 24 prioritized disease activity biomarkers were assessed alongside other biomarkers being evaluated for prediction of structural damage progression in samples from a 2-year study of clinical, ultrasound, and radiographic outcomes in early aggressive RA [26,27]. Using data for all the tested biomarkers, multivariate models were trained to estimate disease activity as measured by the


Figure 2. Multivariate biomarker-based disease activity predictions in relationship to other measurements in Pilot Imaging study. Prototype biomarker-based disease activity scores vs. DAS28-CRP (for all time-points with both DAS28-CRP and biomarker measurements) (a); baseline and year 1 values of biomarker-predicted disease activity vs. change in TSS from baseline to year 1 and from year 1 to year 2 (b); and change in biomarker-predicted disease activity vs. change in DAS28-CRP from baseline to year 1 and from year 1 to year 2 (c). Biomarker-based models were trained against DAS28-CRP and produce scores on a similar scale. doi:10.1371/journal.pone.0060635.g002

DAS28-CRP, and resulting biomarker-based scores were evaluated for relationships to ultrasound and radiographic measurements of joint status and damage progression. In leave one out cross-validation, the biomarker-based disease activity scores were significantly correlated with DAS28-CRP (Spearman's rho = 0.82, P<0.001, Figure 2a). The biomarker-based disease activity scores were also significantly correlated with both synovial thickening (rho = 0.46, P<0.001) and joint vascularity (rho = 0.47, P<0.001) measured by power Doppler ultrasound. Furthermore, the biomarker-based disease activity scores were significantly associated with subsequent changes in total Sharp score. For example, biomarker-based scores calculated at baseline and year 1 had a Spearman correlation with 12 month change in total Sharp score (baseline to year 1, year 1 to year 2, respectively) of 0.52 (P<0.001, Figure 2b) compared to a correlation of 0.43 (P = 0.006) between the DAS28-CRP and changes in total Sharp Score. Biomarker-based disease activity scores and subsequent 12-month change in Sharp score were also correlated when calculated separately for the two years of the study (baseline scores vs. first year change correlation = 0.44, P = 0.05; year 1 scores vs. second year change correlation = 0.61, P = 0.005). Because this study included serial measurements of disease activity by DAS28-CRP in each patient, we were able to evaluate whether the biomarker-based disease activity scores could also track changes in clinical disease activity. Indeed, changes in the biomarker-based scores were significantly correlated to changes in DAS28-CRP, with a Spearman correlation of 0.62 (P<0.001, Figure 2c).

Finally, in the process of evaluating the relationship of the different biomarkers with disease activity, it became apparent that pyridinoline, one of the biomarkers being investigated for structural damage prediction in a separate study, was strongly correlated to DAS28-CRP (rho = 0.47, P<0.001). Thus, pyridinoline was added to the marker set for Stage 3 studies, yielding a set of 25 prioritized biomarkers.

Development: Stage 3

Assay optimization. To ensure that the prioritized biomarker assays were sufficiently precise and specific for use in a clinical diagnostic test, we optimized the individual biomarker assays to function in a multiplex environment with precision across time, instruments, operators, and reagent lots [44]. Of the 25 biomarkers entering development efforts in Stage 3, seven were eliminated at various points in Stage 3 due to practical considerations: interleukin 8 (IL-8) was highly sensitive to variation in sample collection conditions (e.g. shipping at ambient temperature prior to serum separation); interleukin 1 beta (IL-1β) levels were below the limit of detection in many patient samples; interleukin 6 receptor (IL-6R) was eliminated because its levels are dramatically affected by treatment with tocilizumab (anti-IL-6R antibody); and pyridinoline, calprotectin, and apolipoproteins AI and CIII were eliminated because their assays did not meet the performance criteria required for clinical testing.

Algorithm training. In the first part of algorithm training, the entire data set of 703 patients was used to characterize the relationships between the biomarkers and disease activity, while a 249-patient subset was used for final training. Demographics of the cohorts used in algorithm training are presented in Table 3.

Various statistical approaches were evaluated for algorithm training (see Methods). A combination of "Curds and Whey" [36] and LASSO [33] approaches (CW-LASSO) yielded the best performance in cross-validation in training and algorithm testing (data not shown). LASSO modeling with forward stepwise variable selection and 10-fold cross-validation was used to select biomarkers and optimize a linear model for each DAS28-CRP component (except CRP). Then the Curds and Whey method was applied to improve upon the LASSO-derived predictions by exploiting correlations between the components, through application of a shrinkage matrix [36]. This approach improved performance of LASSO-derived joint count predictions, whereas analysis from cross-validation showed that inclusion of predicted PGA or CRP in the shrinkage matrix would have impaired performance.

The final algorithm was a 12-biomarker model for the multi-biomarker disease activity (MBDA) score. In this algorithm, 11 biomarkers (tumor necrosis factor receptor I (TNF-RI), interleukin 6 (IL-6), vascular cell adhesion molecule 1 (VCAM-1), epidermal growth factor (EGF), VEGF-A, cartilage glycoprotein 39 (YKL40), matrix metalloproteinase 1 (MMP1), MMP3, serum amyloid A (SAA), leptin, and resistin) are used for prediction of the TJC28, SJC28 and PGA, with different biomarkers and weightings used to predict each component (Figure 3). Marker concentrations were transformed to the power 1/10 to produce approximate normality and improve the robustness of the multivariate models. The component predictions are then combined with CRP in an equation analogous to that used to calculate the DAS28-CRP, and results are rounded and scaled to produce integer-valued MBDA scores in the range 1–100 (Equations 1–7). The contribution of CRP to the MBDA score is similar to its contribution to the DAS28-CRP. In the 512 InFoRM patients used in Development, CRP contributed an average of 16% of the overall score (SD 6%), and 25% of the non-constant portion of the score (SD 13%; the portion remaining when the contribution of the 0.96 constant from the DAS28-CRP formula is removed). Thresholds for MBDA-based disease activity classification are shown in Table 4; category cutoffs were determined from the corresponding DAS28-CRP values [45] using the relation specified in the MBDA algorithm: MBDA = round(DAS28-CRP × 10.53 + 1).

DAS28-CRP and MBDA formulas:

\[ \mathrm{DAS28\text{-}CRP} = 0.56\sqrt{\mathrm{TJC28}} + 0.28\sqrt{\mathrm{SJC28}} + 0.14\,\mathrm{PG} + 0.36\ln(\mathrm{CRP}+1) + 0.96 \tag{1} \]

\[ \mathrm{MBDA} = \mathrm{round}\Big[\max\Big(\min\big((0.56\sqrt{\mathrm{PTJC28}} + 0.28\sqrt{\mathrm{PSJC28}} + 0.14\,\mathrm{PPGA} + 0.36\ln(\mathrm{CRP}+1) + 0.96) \times 10.53 + 1,\ 100\big),\ 1\Big)\Big] \tag{2} \]

LASSO-derived component models:

\[ \mathrm{PTJC28}^{(0)} = -38.564 + 3.997[\mathrm{SAA}]^{1/10} + 17.331[\mathrm{IL6}]^{1/10} + 4.665[\mathrm{YKL40}]^{1/10} - 15.236[\mathrm{EGF}]^{1/10} + 2.651[\mathrm{TNFRI}]^{1/10} + 2.641[\mathrm{Leptin}]^{1/10} + 4.026[\mathrm{VEGFA}]^{1/10} - 1.470[\mathrm{VCAM1}]^{1/10} \tag{3} \]

\[ \mathrm{PSJC28}^{(0)} = -25.444 + 4.051[\mathrm{SAA}]^{1/10} + 16.154[\mathrm{IL6}]^{1/10} - 11.847[\mathrm{EGF}]^{1/10} + 3.091[\mathrm{YKL40}]^{1/10} + 0.353[\mathrm{TNFRI}]^{1/10} \tag{4} \]

\[ \mathrm{PPGA} = -13.489 + 5.474[\mathrm{IL6}]^{1/10} + 0.486[\mathrm{SAA}]^{1/10} + 2.246[\mathrm{MMP1}]^{1/10} + 1.684[\mathrm{Leptin}]^{1/10} + 4.14[\mathrm{TNFRI}]^{1/10} + 2.292[\mathrm{VEGFA}]^{1/10} - 1.898[\mathrm{EGF}]^{1/10} + 0.028[\mathrm{MMP3}]^{1/10} - 2.892[\mathrm{VCAM1}]^{1/10} - 0.506[\mathrm{Resistin}]^{1/10} \tag{5} \]

CW-derived improved models:

\[ \mathrm{PTJC28} = \max\big(0.174 \times \mathrm{PTJC28}^{(0)} + 0.787 \times \mathrm{PSJC28}^{(0)},\ 0\big) \tag{6} \]

\[ \mathrm{PSJC28} = \max\big(0.173 \times \mathrm{PTJC28}^{(0)} + 0.784 \times \mathrm{PSJC28}^{(0)},\ 0\big) \tag{7} \]

PTJC28, predicted 28 tender joint count; PSJC28, predicted 28 swollen joint count; PPGA, predicted patient global assessment.

Impact of comorbidities on MBDA score. Because comorbidities could influence clinical and biomarker-based disease activity assessment, we evaluated the impact of various comorbidities on the MBDA score as well as on clinical disease activity

PLOS ONE | www.plosone.org 8 April 2013 | Volume 8 | Issue 4 | e60635 Multi-Biomarker Disease Activity Test for RA

Table 3. Baseline characteristics of patients used for Training Table 4. Disease activity category definitions for DAS28-CRP and Comorbidities studies. and MBDA.

Training Comorbidities Disease activity DAS28-CRP (final fitting) (InFoRM 512) category definition MBDA definition

Number of samples (n = 249) (n = 512) Remission ,2.3 #25 % Female 75 76 Low #2.7 #29 %RF+ 61a 77b Moderate .2.7 & #4.1 .29 & #44 % anti-CCP+ 58c 65d High .4.1 .44 Median age (IQR) 58 (49–67) 59 (50–68) doi:10.1371/journal.pone.0060635.t004 Median DAS28-CRP (IQR) 3.8 (1.6–6.4) 3.3 (2.3–4.7) Median TJC (IQR) 5 (0–18) 2 (0–8) measures (DAS28-CRP, CDAI and CRP) in the subset of 512 representative InFoRM patients that was used in algorithm Median SJC (IQR) 4 (0–17) 2 (0–6) development. The following comorbidities were present in 10% Median CRP, mg/L (IQR) 3.8 (1.3- 20.5) 4.3 (1.9–12) or more of the subjects and were assessed in this study: Mean PG (IQR) 3.9 (1–7) 3.5 (1–5.5) hypertension, osteoporotic fracture, degenerative joint disease, Mean MBDA (IQR) 42 (33–50) 40 (30–59) diabetes, and asthma. There were no statistically significant associations between any of these comorbidities and any of the RF and anti-CCP status was not available for all patients, evaluable patients measures of disease activity when adjusting for age and sex, and noted: an = 198; accounting for multiple comparisons (Table 5). bn = 505; cn = 232; Discussion dn = 511. IQR, inter-quartile range. Advances in biomarker analysis in recent years have enabled the doi:10.1371/journal.pone.0060635.t003 development of multi-biomarker based diagnostics tests that are impacting patient care and outcomes in various therapeutic areas. Tests based on tissue and peripheral blood gene expression, and serum protein levels, for example, have been introduced for application in breast cancer, heart transplant, and type 2 diabetes, respectively [46–48]. In order to develop clinically useful tests for other clinical applications it is important to use adequately

Figure 3. MBDA score algorithm. The MBDA score algorithm uses an equation analogous to that for the DAS28-CRP, with biomarkers used to predict the Swollen Joint Count (SJC28), Tender Joint Count (TJC28), and Patient Global Assessment (PG) components of the equation. The Venn diagram lists the MBDA score biomarkers used to predict each MBDA score component. YKL-40, human cartilage glycoprotein-39; IL-6, interleukin-6; SAA, serum amyloid A; EGF, epidermal growth factor; TNF-RI, tumor necrosis factor receptor 1; VEGF-A, vascular endothelial growth factor-A; MMP, matrix metalloproteinase. doi:10.1371/journal.pone.0060635.g003
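The MBDA equations given under Algorithm training, together with the category cut-offs in Table 4, can be sketched in code. This is a minimal illustration rather than the test's actual implementation: the function and dictionary names are hypothetical, the concentration units expected by the coefficients are not stated in this excerpt, and Python's built-in round (which rounds halves to even) stands in for a rounding rule the text does not specify.

```python
import math

# Intercepts and weights transcribed from the LASSO-derived component models.
# Each biomarker concentration is raised to the power 1/10 before weighting.
PTJC0_MODEL = (-38.564, {"SAA": 3.997, "IL6": 17.331, "YKL40": 4.665,
                         "EGF": -15.236, "TNFRI": 2.651, "Leptin": 2.641,
                         "VEGFA": 4.026, "VCAM1": -1.470})
PSJC0_MODEL = (-25.444, {"SAA": 4.051, "IL6": 16.154, "EGF": -11.847,
                         "YKL40": 3.091, "TNFRI": 0.353})
PPGA_MODEL = (-13.489, {"IL6": 5.474, "SAA": 0.486, "MMP1": 2.246,
                        "Leptin": 1.684, "TNFRI": 4.14, "VEGFA": 2.292,
                        "EGF": -1.898, "MMP3": 0.028, "VCAM1": -2.892,
                        "Resistin": -0.506})


def _component(model, conc):
    """Linear model on 1/10-power-transformed concentrations (must be positive)."""
    intercept, weights = model
    return intercept + sum(w * conc[name] ** 0.1 for name, w in weights.items())


def scale_to_mbda(das_like):
    """Scale a DAS28-CRP-like value, clamp to [1, 100], and round to an integer."""
    return round(max(min(das_like * 10.53 + 1, 100), 1))


def mbda_score(conc, crp):
    """Compute the MBDA score from a dict of 11 biomarker levels plus CRP."""
    tjc0 = _component(PTJC0_MODEL, conc)
    sjc0 = _component(PSJC0_MODEL, conc)
    ppga = _component(PPGA_MODEL, conc)
    # Curds-and-Whey step: shrink the two joint-count predictions, floor at 0.
    ptjc = max(0.174 * tjc0 + 0.787 * sjc0, 0)
    psjc = max(0.173 * tjc0 + 0.784 * sjc0, 0)
    das_like = (0.56 * math.sqrt(ptjc) + 0.28 * math.sqrt(psjc)
                + 0.14 * ppga + 0.36 * math.log(crp + 1) + 0.96)
    return scale_to_mbda(das_like)


def mbda_category(score):
    """Map an MBDA score to the Table 4 categories (remission is within low)."""
    if score <= 25:
        return "remission"
    if score <= 29:
        return "low"
    if score <= 44:
        return "moderate"
    return "high"
```

Keeping the scaling step in its own function makes the clamping visible: any DAS28-CRP-like value at or below 0 maps to an MBDA score of 1, and values above roughly 9.4 saturate at 100.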


powered studies from diverse clinical cohorts, technically optimized and validated protein assays, rigorous statistical analysis and multiple sequential studies. We have applied this approach to develop a multi-biomarker disease activity (MBDA) test to aid in the assessment of RA patients.

Table 5. Ratios of median disease activity measures* between RA patients with and without common comorbidities.

Comorbidity                    n (%)      DAS28-CRP   CDAI    CRP     MBDA Score
Hypertension                   223 (44)   0.98        1.32†   1.14†   1.05
Osteoarthritis‡                172 (34)   0.88        1.17    1.13    1.05
Osteoporotic bone fractures    131 (26)   0.91        1.05    1.02    1.05
Degenerative joint disease‡    113 (22)   1.20        1.18    1.11    1.07†
Diabetes                       73 (14)    1.01        1.09    1.04    1.07†
Asthma                         50 (10)    1.28        1.11    1.05    1.05

*Values close to 1.0 indicate that the measurement or test is not affected by the comorbidity.
†Nominal P<0.05 adjusted for age and sex; when adjusted for multiple comparisons, none was statistically significant.
‡Osteoarthritis and degenerative joint disease were listed as separate conditions on the case report forms.
doi:10.1371/journal.pone.0060635.t005

Testing in multiple diverse clinical cohorts was a critical aspect of the development process. Biomarker performance may vary between cohorts because of a range of factors including ethnicity, recruitment biases, practice patterns and technical differences. Patients were selected from multiple North American and European sites and studies to capture diversity in disease activity, disease duration, treatment approaches, sex, serological status and other patient characteristics. This diversity is needed to ensure that the resulting algorithm will be applicable to diverse RA patients despite the extensive clinical heterogeneity seen in RA.

Technical assay optimization and validation were essential to ensure that assay results are consistent and reproducible over time. In RA patients it was also necessary to ensure that there was no direct interference due to interaction from RA drugs, heterophilic antibodies such as RF, or other substances. Only assays with acceptable performance were carried forward into clinical studies and ultimately, algorithm development.

Markers were evaluated based on relationship to disease activity and contribution to models thereof in the serial biomarker prioritization studies. Furthermore, both univariate and multivariate statistical analyses of biomarker-based disease activity prediction were used for biomarker prioritization, so as not to miss biomarkers that add value in multivariate models despite weaker individual relationships. Multiple statistical modeling methodologies were also applied in order to identify those that performed best on this problem. The ranking of biomarkers was based on both univariate and multivariate analyses and on a range of disease activity metrics, in order to reduce unintentional elimination of potentially useful biomarkers. Assessment in multiple, sequential studies was used to reduce the overall false discovery rate and to test the robustness of findings across studies.

Although relationships between biomarkers and various disease activity measures were assessed during the development process, the DAS28-CRP was chosen as the reference for final algorithm training because of its established validation against the DAS28-ESR [6], a disease activity measure shown to predict RA outcomes in clinical trials [49], and because measurement of CRP is more readily standardized than ESR in banked samples from multiple studies and sites. We found that multi-biomarker based models outperformed individual biomarkers at prediction of DAS28-CRP. Pilot imaging analyses indicated that disease activity scores from multivariate models trained to predict DAS28-CRP were also associated with synovitis and vascularity, assessed by power Doppler ultrasound, and with radiographically-determined joint damage progression. These results demonstrate that a multi-biomarker algorithm trained specifically against the DAS28-CRP can nonetheless provide information more broadly on disease activity and, critically, on disease outcomes. Clinically assessed disease activity measures exhibit significant inter-assessor and inter-subject variability [10,13], whereas the median coefficient of variation of the MBDA score in repeat runs of the same sample is <2% [44]. Clinical measures, while clearly valuable, are nonetheless imperfect standards for comparison, and independent measures such as ultrasound and radiographic imaging are important indicators of the activity of the fundamental disease processes.

Because comorbidities can conceivably influence clinical assessment and biomarker levels, we assessed the impact of several common comorbidities on the MBDA score and observed no confounding effects. Larger studies are needed to confirm and further define these observations. It is especially important to examine other comorbidities with inflammatory components that could not be analyzed in our data, such as infections and malignancies. Infections in particular are associated with rapid elevation of cytokines and acute phase reactants, so MBDA score results should be interpreted with caution in patients with active infections. In addition, the effects of vaccination merit examination since it can cause acute inflammation.

Although they were selected primarily via a statistics-driven process, the prioritized biomarkers reflect key biological pathways, cells, and features of RA. Figure 4 summarizes known roles of the 12 biomarkers included in the final algorithm in cellular interactions and processes important to the disease. The diagram illustrates a critical premise driving this multi-biomarker development effort: that a biologically diverse set of quantitative biomarkers might be able to provide more information about underlying disease biology than any single biomarker.

Figure 4. Network map of MBDA biomarker roles in cellular communication in RA. doi:10.1371/journal.pone.0060635.g004

Despite the scope and rigor of the work described, some important limitations should be noted. For practical considerations, we restricted our evaluation to serum biomarkers that were measurable in RA patient serum using commercially available assays. As a result, cytokines and mediators that may be highly expressed within the rheumatoid joint, but that are difficult to detect in serum either due to low levels or inadequate assays, were not considered. In addition, the clinical studies that formed the basis for this work excluded some patients for a combination of ethical and practical reasons (e.g. around steroid use, pregnancy and serious medical condition). As a consequence, this development program cannot be considered to establish the applicability of the biomarker findings and MBDA test to these patient types, and further studies are required. Finally, the MBDA test is by design an exclusively biomarker-based assessment. While this allows for certain advantages, including efficiency and objectivity, it does not include physical examination and should not be seen as a replacement for clinician or patient assessment, but rather as an additional, complementary tool to provide objective, quantitative data to inform clinical decision-making.

The MBDA algorithm developed as described here has subsequently been evaluated and validated in independent cohorts [21,22]. Additional studies are underway to further evaluate the relationship between the MBDA score and other measures of disease activity, and the ability of the score to indicate risk of critical patient outcomes such as joint damage progression. Ultimately, prospective studies incorporating the MBDA score will be required to fully elucidate its clinical utility and its appropriate use alongside other approaches to patient assessment in routine practice.

Supporting Information

Table S1 Candidate biomarkers and assays used. (XLS)

Table S2 Details of Assays used in Feasibility studies. All summary statistics are presented as the mean values across study plates. ’sample dilution’ is the dilution of serum added to the assay after any optimization. ’avg sample CV’ is mean % coefficient of variation, calculated using duplicate measurements of all samples on each plate. The Lower Measurability Limit (LML) is the standard of lowest concentration for which all replicates have higher results than all replicates of the blank and all standards of lower concentrations. Successive standards above the LML are eligible to represent the Higher Measurability Limit (HML) so long as all their replicates have higher results than all replicates for standards of lower concentration. The HML is the highest of these eligible standards for which all replicates have lower results than all standards of higher concentrations. (XLS)

Table S3 Details of assays used in Development studies. (XLS)

Table S4 Results for individual biomarkers in Feasibility (Stage 2) studies. (XLS)

Author Contributions

Conceived and designed the experiments: MC GC MT CS DS DC LH J. Carulli PT NS MW J. Curtis. Performed the experiments: MC MT CS DS PT J. Curtis MW KS. Analyzed the data: MC GC YS SR NK DH PT KS. Wrote the paper: MC GC YS SR NK MT CS DS DH DC LH J. Carulli PT NS MW J. Curtis KS.


References

1. Bakker MF, Jacobs JW, Verstappen SM, Bijlsma JW (2007) Tight control in the treatment of rheumatoid arthritis: efficacy and feasibility. Ann Rheum Dis 66 Suppl 3: iii56–60. Available: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17934098.
2. Verstappen SMM, Jacobs JWG, van der Veen MJ, Heurkens AH, Schenk Y, et al. (2007) Intensive treatment with methotrexate in early rheumatoid arthritis: aiming for remission. Computer Assisted Management in Early Rheumatoid Arthritis (CAMERA, an open-label strategy trial). Annals of the rheumatic diseases 66: 1443–1449. Available: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2111604&tool=pmcentrez&rendertype=abstract. Accessed 22 July 2011.
3. Mease PJ (2010) Improving the routine management of rheumatoid arthritis: the value of tight control. The Journal of rheumatology 37: 1570–1578. Available: http://www.ncbi.nlm.nih.gov/pubmed/20595287. Accessed 23 July 2012.
4. Smolen JS, Aletaha D, Bijlsma JWJ, Breedveld FC, Boumpas D, et al. (2010) Treating rheumatoid arthritis to target: recommendations of an international task force. Annals of the rheumatic diseases 69: 631–637. Available: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3015099&tool=pmcentrez&rendertype=abstract. Accessed 24 June 2011.
5. Singh JA, Furst DE, Bharat A, Curtis JR, Kavanaugh AF, et al. (2012) 2012 update of the 2008 American College of Rheumatology recommendations for the use of disease-modifying antirheumatic drugs and biologic agents in the treatment of rheumatoid arthritis. Arthritis care & research 64: 625–639. Available: http://www.ncbi.nlm.nih.gov/pubmed/22473917. Accessed 15 July 2012.
6. Wells G, Becker J-C, Teng J, Dougados M, Schiff M, et al. (2009) Validation of the 28-joint Disease Activity Score (DAS28) and European League Against Rheumatism response criteria based on C-reactive protein against disease progression in patients with rheumatoid arthritis, and comparison with the DAS28 based on erythrocyte sedimentation rate. Annals of the rheumatic diseases 68: 954–960. Available: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2674547&tool=pmcentrez&rendertype=abstract. Accessed 18 July 2011.
7. Fransen J, van Riel PLCM (n.d.) The Disease Activity Score and the EULAR response criteria. Clinical and experimental rheumatology 23: S93–9. Available: http://www.ncbi.nlm.nih.gov/pubmed/16273792. Accessed 22 October 2012.
8. Zatarain E, Strand V (2006) Monitoring disease activity of rheumatoid arthritis in clinical practice: contributions from clinical trials. Nature clinical practice Rheumatology 2: 611–618. Available: http://www.ncbi.nlm.nih.gov/pubmed/17075600. Accessed 25 January 2012.
9. Pincus T, Swearingen CJ, Bergman M, Yazici Y (2008) RAPID3 (Routine Assessment of Patient Index Data 3), a rheumatoid arthritis index without formal joint counts for routine care: proposed severity categories compared to disease activity score and clinical disease activity index categories. The Journal of rheumatology 35: 2136–2147. Available: http://www.ncbi.nlm.nih.gov/pubmed/18793006. Accessed 7 November 2012.
10. Uhlig T, Kvien TK, Pincus T (2009) Test-retest reliability of disease activity core set measures and indices in rheumatoid arthritis. Annals of the rheumatic diseases 68: 972–975. Available: http://www.ncbi.nlm.nih.gov/pubmed/18957489. Accessed 22 July 2011.
11. Ranzolin A, Brenol JCT, Bredemeier M, Guarienti J, Rizzatti M, et al. (2009) Association of concomitant fibromyalgia with worse disease activity score in 28 joints, health assessment questionnaire, and short form 36 scores in patients with rheumatoid arthritis. Arthritis and rheumatism 61: 794–800. Available: http://www.ncbi.nlm.nih.gov/pubmed/19479706. Accessed 23 August 2011.
12. Leeb BF, Andel I, Sautner J, Nothnagl T, Rintelen B (2004) The DAS28 in rheumatoid arthritis and fibromyalgia patients. Rheumatology (Oxford, England) 43: 1504–1507. Available: http://www.ncbi.nlm.nih.gov/pubmed/15252215. Accessed 25 January 2012.
13. Marhadour T, Jousse-Joulin S, Chalès G, Grange L, Hacquard C, et al. (2010) Reproducibility of joint swelling assessments in long-lasting rheumatoid arthritis: influence on Disease Activity Score-28 values (SEA-Repro study part I). The Journal of rheumatology 37: 932–937. Available: http://www.ncbi.nlm.nih.gov/pubmed/20360184. Accessed 25 January 2012.
14. Sokka T, Pincus T (2009) Erythrocyte sedimentation rate, C-reactive protein, or rheumatoid factor are normal at presentation in 35%–45% of patients with rheumatoid arthritis seen between 1980 and 2004: analyses from Finland and the United States. J Rheumatol 36: 1387–1390. Available: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=19411389.
15. Rhodes B, Merriman ME, Harrison A, Nissen MJ, Smith M, et al. (2010) A genetic association study of serum acute-phase C-reactive protein levels in rheumatoid arthritis: implications for clinical interpretation. PLoS Med 7: e1000341. Available: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=20877716.
16. Keenan RT, Swearingen CJ, Yazici Y (2008) Erythrocyte sedimentation rate and C-reactive protein levels are poorly correlated with clinical measures of disease activity in rheumatoid arthritis, systemic lupus erythematosus and osteoarthritis patients. Clin Exp Rheumatol 26: 814–819. Available: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=19032813.
17. Mease PJ (1998) The potential roles for novel biomarkers in rheumatoid arthritis assessment. Clinical and experimental rheumatology 29: 567–574. Available: http://www.ncbi.nlm.nih.gov/pubmed/21640052. Accessed 18 July 2012.
18. Chambers RE, MacFarlane DG, Whicher JT, Dieppe PA (1983) Serum amyloid-A protein concentration in rheumatoid arthritis and its role in monitoring disease activity. Annals of the rheumatic diseases 42: 665–667. Available: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1001325&tool=pmcentrez&rendertype=abstract. Accessed 18 July 2012.
19. Klimiuk PA, Sierakowski S, Latosiewicz R, Cylwik JP, Cylwik B, et al. (2002) Soluble adhesion molecules (ICAM-1, VCAM-1, and E-selectin) and vascular endothelial growth factor (VEGF) in patients with distinct variants of rheumatoid synovitis. Annals of the rheumatic diseases 61: 804–809. Available: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1754213&tool=pmcentrez&rendertype=abstract.
20. Hammer HB, Odegard S, Fagerhol MK, Landewé R, van der Heijde D, et al. (2007) Calprotectin (a major leucocyte protein) is strongly and independently correlated with joint inflammation and damage in rheumatoid arthritis. Annals of the rheumatic diseases 66: 1093–1097. Available: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1954700&tool=pmcentrez&rendertype=abstract. Accessed 18 July 2012.
21. Bakker M, Cavet G, Jacobs J, Bijlsma J, Haney D, et al. (2012) Performance of a Multi-Biomarker Score Measuring Rheumatoid Arthritis Disease Activity in the CAMERA Tight Control Study. Annals of Rheumatic Disease.
22. Curtis JR, van der Helm-van Mil AH, Knevel R, Huizinga TW, Haney DJ, et al. (2012) Validation of a novel multi-biomarker test to assess rheumatoid arthritis disease activity. Arthritis care & research. Available: http://www.ncbi.nlm.nih.gov/pubmed/22736476. Accessed 22 July 2012.
23. Arnett FC, Edworthy SM, Bloch DA, McShane DJ, Fries JF, et al. (1988) The American Rheumatism Association 1987 revised criteria for the classification of rheumatoid arthritis. Arthritis and rheumatism 31: 315–324. Available: http://www.ncbi.nlm.nih.gov/pubmed/3358796. Accessed 18 July 2012.
24. Iannaccone CK, Lee YC, Cui J, Frits ML, Glass RJ, et al. (2011) Using genetic and clinical data to understand response to disease-modifying anti-rheumatic drug therapy: data from the Brigham and Women’s Hospital Rheumatoid Arthritis Sequential Study. Rheumatology (Oxford, England) 50: 40–46. Available: http://www.ncbi.nlm.nih.gov/pubmed/20847201. Accessed 18 July 2012.
25. Karlson EW, Chibnik LB, Cui J, Plenge RM, Glass RJ, et al. (2008) Associations between human leukocyte antigen, PTPN22, CTLA4 genotypes and rheumatoid arthritis phenotypes of autoantibody status, age at diagnosis and erosions in a large cohort study. Annals of the rheumatic diseases 67: 358–363. Available: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2945890&tool=pmcentrez&rendertype=abstract. Accessed 18 July 2012.
26. Taylor PC, Steuer A, Gruber J, Cosgrove DO, Blomley MJK, et al. (2004) Comparison of ultrasonographic assessment of synovitis and joint vascularity with radiographic evaluation in a randomized, placebo-controlled study of infliximab therapy in early rheumatoid arthritis. Arthritis and rheumatism 50: 1107–1116. Available: http://www.ncbi.nlm.nih.gov/pubmed/15077292. Accessed 18 July 2012.
27. Taylor PC, Steuer A, Gruber J, McClinton C, Cosgrove DO, et al. (2006) Ultrasonographic and radiographic results from a two-year controlled trial of immediate or one-year-delayed addition of infliximab to ongoing methotrexate therapy in patients with erosive early rheumatoid arthritis. Arthritis and rheumatism 54: 47–53. Available: http://www.ncbi.nlm.nih.gov/pubmed/16385521. Accessed 18 July 2012.
28. Fleischmann RM, Curtis JR, Hamburger MH, Blumstein H, Swan K, et al. (2010) RA population characteristics in InFoRM, a longitudinal observational study. Ann Rheum Dis 69: 657.
29. Wren JD, Garner HR (2004) Shared relationship analysis: ranking set cohesion and commonalities within a literature-derived relationship network. Bioinformatics (Oxford, England) 20: 191–198. Available: http://bioinformatics.oxfordjournals.org/cgi/doi/10.1093/bioinformatics/btg390. Accessed 18 July 2012.
30. Eastman PS, Manning WC, Qureshi F, Haney D, Cavet G, et al. (2012) Characterization of a multiplex, 12-biomarker test for rheumatoid arthritis. Journal of pharmaceutical and biomedical analysis 70: 415–424. Available: http://www.ncbi.nlm.nih.gov/pubmed/22749821. Accessed 22 October 2012.
31. Todd DJ, Knowlton N, Amato M, Frank MB, Schur PH, et al. (2011) Erroneous augmentation of multiplex assay measurements in patients with rheumatoid arthritis due to heterophilic binding by serum rheumatoid factor. Arthritis and rheumatism 63: 894–903. Available: http://www.ncbi.nlm.nih.gov/pubmed/21305505. Accessed 18 July 2012.
32. Benjamini Y, Hochberg M (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 57: 289–300. Available: http://www.jstor.org/stable/10.2307/2346101. Accessed 19 July 2012.
33. Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 58: 267–288. Available: http://www.jstor.org/stable/10.2307/2346178. Accessed 19 July 2012.
34. Zou H, Hastie T (2005) Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 67: 301–320. Available: http://doi.wiley.com/10.1111/j.1467-9868.2005.00503.x.
35. Hastie T, Tibshirani R, Friedman J (2009) The Elements of Statistical Learning: Data Mining, Inference, and Prediction. New York: Springer. 746 p. Available: http://www.springer.com/statistics/statistical+theory+and+methods/book/978-0-387-84857-0. Accessed 22 July 2012.
36. Breiman L, Friedman JH (1997) Predicting Multivariate Responses in Multiple Linear Regression. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 59: 3–54. Available: http://onlinelibrary.wiley.com/doi/10.1111/1467-9868.00054/full. Accessed 19 July 2012.
37. Targońska-Stepniak B, Majdan M, Dryglewska M (2008) Leptin serum levels in rheumatoid arthritis patients: relation to disease duration and activity. Rheumatology international 28: 585–591. Available: http://www.ncbi.nlm.nih.gov/pubmed/17968549. Accessed 18 July 2012.
38. Rioja I, Hughes FJ, Sharp CH, Warnock LC, Montgomery DS, et al. (2008) Potential novel biomarkers of disease activity in rheumatoid arthritis patients: CXCL13, CCL23, transforming growth factor alpha, tumor necrosis factor receptor superfamily member 9, and macrophage colony-stimulating factor. Arthritis and rheumatism 58: 2257–2267. Available: http://www.ncbi.nlm.nih.gov/pubmed/18668547. Accessed 18 July 2012.
39. Melis L, Vandooren B, Kruithof E, Jacques P, De Vos M, et al. (2010) Systemic levels of IL-23 are strongly associated with disease activity in rheumatoid arthritis but not spondyloarthritis. Annals of the rheumatic diseases 69: 618–623. Available: http://www.ncbi.nlm.nih.gov/pubmed/19196728. Accessed 19 July 2012.
40. Park M-C, Chung SJ, Jung S-J, Park Y-B, Lee S-K (2004) Relationship of serum TWEAK level to cytokine level, disease activity, and response to anti-TNF treatment in patients with rheumatoid arthritis. Scandinavian journal of rheumatology 37: 173–178. Available: http://www.ncbi.nlm.nih.gov/pubmed/18465450. Accessed 19 July 2012.
41. Ehrchen JM, Sunderkötter C, Foell D, Vogl T, Roth J (2009) The endogenous Toll-like receptor 4 agonist S100A8/S100A9 (calprotectin) as innate amplifier of infection, autoimmunity, and cancer. Journal of leukocyte biology 86: 557–566. Available: http://www.ncbi.nlm.nih.gov/pubmed/19451397. Accessed 19 July 2012.
42. Inoue E, Yamanaka H, Hara M, Tomatsu T, Kamatani N (2007) Comparison of Disease Activity Score (DAS)28-erythrocyte sedimentation rate and DAS28-C-reactive protein threshold values. Annals of the rheumatic diseases 66: 407–409. Available: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1856019&tool=pmcentrez&rendertype=abstract. Accessed 25 January 2012.
43. Lavie F, Miceli-Richard C, Ittah M, Sellam J, Gottenberg J-E, et al. (2007) Increase of B cell-activating factor of the TNF family (BAFF) after rituximab treatment: insights into a new regulating system of BAFF production. Annals of the rheumatic diseases 66: 700–703. Available: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1954605&tool=pmcentrez&rendertype=abstract. Accessed 19 July 2012.
44. Eastman PS, Manning WC, Qureshi F, Haney D, Cavet G, et al. (2012) Characterization of a multiplex, 12-biomarker test for rheumatoid arthritis. Journal of pharmaceutical and biomedical analysis. Available: http://www.ncbi.nlm.nih.gov/pubmed/22749821. Accessed 22 July 2012.
45. Inoue E, Yamanaka H, Hara M, Tomatsu T, Kamatani N (2007) Comparison of Disease Activity Score (DAS)28-erythrocyte sedimentation rate and DAS28-C-reactive protein threshold values. Annals of the rheumatic diseases 66: 407–409. Available: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1856019&tool=pmcentrez&rendertype=abstract. Accessed 25 January 2012.
46. Paik S, Tang G, Shak S, Kim C, Baker J, et al. (2006) Gene expression and benefit of chemotherapy in women with node-negative, estrogen receptor-positive breast cancer. Journal of clinical oncology: official journal of the American Society of Clinical Oncology 24: 3726–3734. Available: http://www.ncbi.nlm.nih.gov/pubmed/16720680. Accessed 13 July 2012.
47. Paik S, Shak S, Tang G, Kim C, Baker J, et al. (2004) A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. The New England journal of medicine 351: 2817–2826. Available: http://www.ncbi.nlm.nih.gov/pubmed/15591335.
48. Kolberg JA, Jørgensen T, Gerwien RW, Hamren S, McKenna MP, et al. (2009) Development of a type 2 diabetes risk model from a panel of serum biomarkers from the Inter99 cohort. Diabetes care 32: 1207–1212. Available: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2699726&tool=pmcentrez&rendertype=abstract. Accessed 13 July 2012.
49. Prevoo ML, van’t Hof MA, Kuper HH, van Leeuwen MA, van de Putte LB, et al. (1995) Modified disease activity scores that include twenty-eight-joint counts. Development and validation in a prospective longitudinal study of patients with rheumatoid arthritis. Arthritis Rheum 38: 44–48. Available: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=7818570.

Rheumatology 2015;54:1640–1649
doi:10.1093/rheumatology/kev023
Advance Access publication 15 April 2015

Original article

Outcomes and costs of incorporating a multibiomarker disease activity test in the management of patients with rheumatoid arthritis

Kaleb Michaud¹,², Vibeke Strand³, Nancy A. Shadick⁴, Irina Degtiar⁵, Kerri Ford⁶, Steven N. Michalopoulos⁵ and John Hornberger⁵,⁷

Abstract

Objective. The multibiomarker disease activity (MBDA) blood test has been clinically validated as a measure of disease activity in patients with RA. We aimed to estimate the effect of the MBDA test on physical function for patients with RA (based on HAQ), quality-adjusted life years and costs over 10 years.

Methods. A decision analysis was conducted to quantify the effect of using the MBDA test on RA-related outcomes and costs to private payers and employers. Results of a clinical management study reporting changes to anti-rheumatic drug recommendations after use of the MBDA test informed clinical utility. The effect of treatment changes on HAQ was derived from 5 tight-control and 13 treatment-switch trials. Baseline HAQ scores and the HAQ score relationship with medical costs and quality of life were derived from published National Data Bank for Rheumatic Diseases data.

Results. Use of the MBDA test is projected to improve HAQ scores by 0.09 units in year 1, declining to 0.02 units after 10 years. Over the 10 year time horizon, quality-adjusted life years increased by 0.08 years and costs decreased by US$457 (cost savings in disability-related medical costs, US$659; in productivity costs, US$2137). The most influential variable in the analysis was the effect of the MBDA test on clinician treatment recommendations and subsequent HAQ changes.

Conclusion. The MBDA test aids in the assessment of disease activity in patients with RA by changing treatment decisions, improving the functional status of patients and cost savings. Further validation is ongoing and future longitudinal studies are warranted.

Key words: rheumatoid arthritis, biomarkers, outcome assessment, quality of life, health economics.

Rheumatology key messages

- MBDA test is projected to improve functional status and reduce costs for patients with RA.
- MBDA test remained below willingness-to-pay levels found among therapies reported in other cost-effectiveness analyses.

1University of Nebraska Medical Center, Omaha, NE, 2National Data Bank for Rheumatic Diseases, Wichita, KS, 3Division of Immunology/Rheumatology, Stanford University, Palo Alto, CA, 4Brigham & Women’s Hospital, Division of Rheumatology, Immunology and Allergy, Boston, MA, 5Cedar Associates, Menlo Park, CA, 6Crescendo Bioscience, San Francisco, CA, USA and 7Department of Internal Medicine, Stanford University School of Medicine, Palo Alto, CA, USA

Submitted 23 June 2014; revised version accepted 3 February 2015

Correspondence to: John Hornberger, Cedar Associates, 3715 Haven Avenue, Suite 100, Menlo Park, CA 94025, USA. E-mail: [email protected]

© The Author 2015. Published by Oxford University Press on behalf of the British Society for Rheumatology. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected]

Introduction

RA is a chronic, debilitating disease affecting 1.1% of the entire US population and between 0.2% and 0.9% of populations elsewhere [1]. RA can result in joint erosion, loss of physical function and premature death [2–4]. Early detection and appropriate treatment of RA increases the probability of achieving remission, delays or halts radiographic progression and preserves physical function [5, 6]. For patients with active disease, treatment guidelines recommend regular assessments (i.e. every 1–3 months) and treatment to achieve remission and sustained low disease activity [7–9]. Frequent evaluations of patient disease activity, with appropriate adjustments in treatment regimens (tight-control), have been shown to improve clinical outcomes and health-related quality of life (QoL) [9, 10].

The ACR recommends using at least one of the five clinical measures to assess disease activity [9]. However, current measures have limitations, including interreader variability [11], inability to detect subclinical synovitis and structural damage [12] and inability to account for the potential influence of co-morbidities (e.g. FM, joint infections, obesity) [13–15]. Because joint destruction occurs rapidly in the first 2 years of disease, guidelines have stressed the importance of accurate measurement of disease activity to guide the use of treatments that can limit joint damage [16]. Despite the benefits of this strategy, the assessment of disease activity and risk of radiographic progression are performed inconsistently, which has been attributed to ambivalence about the predictive qualities of current measurements and time constraints [17].

Researchers have been seeking serum biomarkers that can complement clinical disease measures to improve the evaluation of disease activity in patients with RA [15, 18]. A multibiomarker disease activity (MBDA) blood test for RA (Vectra DA; Crescendo Bioscience, South San Francisco, CA, USA) measures serum concentrations of 12 proteins and combines them in a validated algorithm to produce a score of 1–100 that indicates the level of disease activity in patients with RA. MBDA test results are significantly associated with RA disease activity and treatment response [19–21]. In clinical studies, patients with RA who were in remission (according to the 28-joint DAS using CRP) and had high MBDA scores had increased radiographic progression compared with those with lower MBDA scores [22]. The MBDA test was subsequently shown to provide information that altered therapeutic recommendations for patients seen in routine clinical practice [23]. The primary aim of this study was to evaluate the outcomes and costs of the MBDA test when used as an adjunct to current clinical practice for the management of treatment in RA patients.

Methods

Analytical framework

A decision analysis was conducted to quantify the outcomes and costs of the MBDA test when used by physicians to help guide patient-specific RA treatment decisions compared with current clinical practice. The analysis was conducted from a US perspective, including third-party payers and employers, using a 10-year time horizon. The analytical framework was constructed to be consistent with independently validated analyses that reported long-term outcomes in RA [24–26].

Direct costs that were evaluated in this analysis included those associated with use of the MBDA test, RA drugs (including administration costs for intravenously administered drugs), treatment of drug-related adverse events and other direct costs (e.g. outpatient, hospitalization, disease monitoring). Indirect costs were those related to labour force participation and the percentage of work activity while at work, referred to as presenteeism. Benefits were assessed using quality-adjusted life years (QALYs). All costs were inflated to 2014 US dollars (USD) using the medical component of the US Bureau of Labor Statistics Consumer Price Index [27]. See supplementary Fig. S1, available at Rheumatology Online, for a summary of the structural framework.

Patient characteristics, treatment effects and disease progression were derived from published clinical trial and registry data (Table 1 and supplementary Table S1, available at Rheumatology Online, for data sources and sensitivity analysis ranges). A fixed annual discount rate of 3% was applied to all costs and benefits.

Target patient population

The analysis quantified the effect of the MBDA test on outcomes for representative patients with RA (see supplementary data, section on patient population, available at Rheumatology Online). A post hoc analysis of data from the clinical management study of RA treatment recommendations was performed to describe the distribution of patients with early (i.e. ≤1 year disease duration, 28%) or established (i.e. >1 year, 72%) RA at baseline (data on file, Crescendo Bioscience, South San Francisco, CA, USA).

Treatment changes and resulting changes in outcome

Treatment changes following use of the MBDA test were based on results of a clinical management study in which 38.0% of treatment decisions (regarding the use of MTX, synthetic DMARDs and biologics) were adjusted following knowledge of the MBDA test results [23]. Results of a subanalysis of these data found almost 30% of treatment strategies were increased in intensity, which was defined as increasing the drug dosage, changing the formulation of the drug (e.g. oral to injectable) or recommending a drug in a more intensive drug class (e.g. MTX monotherapy to one comprising a biologic). Nine per cent of treatment strategies decreased in intensity.

Physicians ordered 1.91 tests per patient (range 1–3; data on file, Crescendo Bioscience), and most were ordered in the first year. This amount and timing of test ordering is consistent with an incentive to identify as early as possible the regimen that will achieve guideline-recommended targets for response and remission. Tests ordered after the first year were presumably related to patients who had not yet achieved a response after 1 year or the physician wanted to assess disease status in patients whose signs and symptoms suggested a relapse. We assumed that only those tests ordered within the first 5 years had an effect on disease progression. The cumulative effect of changes in disease progression on HAQ, QoL and costs were analysed over 10 years.
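The analytical framework applies a fixed 3% annual discount rate to costs and benefits accumulated over the 10-year horizon. The following Python sketch illustrates that arithmetic only; it is not the authors' model. The function name, the example stream of 100 per year and the choice of whether year 1 is discounted are all assumptions, since the text does not state which convention was used.

```python
def present_value(yearly_values, rate=0.03, first_year_discounted=False):
    """Sum a stream of yearly values after discounting at a fixed annual rate.

    Assumption (not stated in the paper): when first_year_discounted is False,
    year 1 is counted at face value, year 2 is divided by (1 + rate), and so on.
    """
    total = 0.0
    for index, value in enumerate(yearly_values):
        exponent = index + 1 if first_year_discounted else index
        total += value / (1.0 + rate) ** exponent
    return total


# Hypothetical stream: 100 per year for 10 years (not a figure from the paper).
hypothetical_stream = [100.0] * 10
pv_year1_at_face_value = present_value(hypothetical_stream)
pv_year1_discounted = present_value(hypothetical_stream, first_year_discounted=True)
```

Under either convention the discounted total is lower than the undiscounted sum of 1000, and the two conventions differ from each other by exactly one factor of 1.03.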

www.rheumatology.oxfordjournals.org 1641 Kaleb Michaud et al.

TABLE 1 Patient characteristics and treatment effect cost estimates for decision analysis model input

Variable                                                                Estimate

Direct costs
  Contract price of MBDA (per test)                                     $789
  Drug costs(a)
    Annual drug costs with MBDA (reflects impact on drug costs)         $17 978
    Commercial markup, %                                                18
    Reduction for patient adherence(b), %                               21
    Moderate-to-severe adverse event                                    $24 149
  Other direct medical costs
    Annual cost per unit increase in HAQ score                          $1637
  Percentage of patients receiving MBDA with early RA(c)                28
  Current clinical practice
    Baseline HAQ score, early RA                                        1.53
    Baseline HAQ score, established RA                                  1.03
    HAQ score standard deviation                                        0.71
    Annual change in HAQ score                                          0.005
  With MBDA
    MBDA’s average effect on HAQ score in early RA, year 1(d)           0.29
    MBDA’s average effect on HAQ score in established RA, year 1(d)     0.02
    Years until half of HAQ score difference remains                    4
Indirect costs
  Absenteeism
    HAQ <0.25                                                           $7719
    HAQ 0.25–0.75                                                       $8393
    HAQ 0.75–1.25                                                       $12 369
    HAQ ≥1.25                                                           $16 636
  Presenteeism
    HAQ <0.25                                                           $2297
    HAQ 0.25–0.75                                                       $5429
    HAQ 0.75–1.25                                                       $8542
    HAQ ≥1.25                                                           $7650
  Average annual income                                                 $25 734
Utilities
  Baseline utility                                                      0.91
  Utility decrement
    Per unit increase in HAQ score                                      0.17
    Mild adverse event                                                  0.10
    Moderate-to-severe adverse event                                    0.45
Rates and risks
  Rate of mild adverse events, %
    MTX                                                                 25
    sDMARDs                                                             45
    Biologic                                                            30
  Rate of moderate to severe adverse events, %
    MTX                                                                 2
    sDMARDs                                                             3
    Biologic                                                            1
  Change in mortality risk
    Due to RA for each unit change in HAQ score                         1.33
Other parameters
  Average age at baseline, years
    Prevalent RA                                                        60
    Early RA                                                            50
  Average duration of treatment effect, years                           10
  Average number of eligible visits per RA patient                      4
  Average number of tests per patient                                   1.91
  Time horizon, years                                                   10
  Discount rate, %                                                      3

(a) Current drug costs reflect CMS reference prices with a commercial markup and an adjustment for patient adherence.
(b) The reduction for patient adherence accounts for imperfect adherence to prescribed therapies and is based on the medical possession ratio for a cross section of patients on a mixture of biologic and synthetic DMARDs.
(c) Early RA is defined as ≤1 year duration.
(d) For early RA, this number is adjusted for baseline HAQ score. For established RA, this number is adjusted by the proportion of patients with established RA who experience an escalation in therapy.
See supplementary Table S1, available at Rheumatology Online, for input data sources and sensitivity analysis ranges. All costs reported in 2014 US dollars. sDMARDs: synthetic DMARDs.
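Table 1 lists a baseline utility of 0.91 and a utility decrement of 0.17 per unit increase in HAQ score. One simple way to read those two inputs together is as a linear mapping from HAQ score to utility. The sketch below assumes that linear form, which the paper does not spell out, and the function name is hypothetical; adverse-event decrements and mortality are deliberately left out.

```python
BASELINE_UTILITY = 0.91        # Table 1: baseline utility
DECREMENT_PER_HAQ_UNIT = 0.17  # Table 1: utility decrement per unit increase in HAQ


def utility_from_haq(haq_score):
    """Map a HAQ score to a utility value.

    Assumption: a straight-line relationship built from the two Table 1 inputs.
    The published model may combine them differently.
    """
    if haq_score < 0:
        raise ValueError("HAQ score cannot be negative")
    return BASELINE_UTILITY - DECREMENT_PER_HAQ_UNIT * haq_score


# Under this assumed mapping, a 0.09-unit lower HAQ score corresponds to a
# utility gain of 0.17 * 0.09 for that year.
gain_for_one_year = utility_from_haq(1.08) - utility_from_haq(1.17)
```

Because the assumed mapping is linear, the utility gain depends only on the size of the HAQ difference, not on the starting score.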


FIG. 1 HAQ distribution: current clinical practice vs MBDA at year 1
Dark grey: current clinical practice; light grey: with Vectra DA. Shift of the curve, to the left, denotes improvement in the mean HAQ score with the use of the MBDA test (year 1). MBDA: multibiomarker disease activity; PDF: probability density function.

Progression of RA

For this decision analysis, functional status was assessed with the HAQ score. The mean baseline HAQ score for patients with early RA was 1.53 based on a similar patient population enrolled in the GO-BEFORE study; for the established RA population we used findings from the National Data Bank (see supplementary Fig. S2A, available at Rheumatology Online) [28, 29]. Changes in HAQ score of patients with early and established RA following the MBDA test were estimated by combining the results of 5 tight-control and 13 treatment-switch trials (see supplementary data, section on changes in outcome and supplementary Fig. S2, available at Rheumatology Online). Half of the difference in HAQ score between current clinical practice and adjunctive use of MBDA was expected to be maintained up to 4 years after the initial treatment change [16, 30]. This waning, on average, in the treatment effect accounts for incomplete control of the disease through worsening of symptoms. Radiographic damage was expected to be correlated with HAQ score as reported in longitudinal clinical studies [16, 30, 31].
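The waning described above (half of the HAQ difference remaining after 4 years) can be illustrated with a half-life calculation. The sketch below assumes exponential decay from the year-1 difference; the paper states only the 4-year half-life, so the functional form, the function name and the choice of year 1 as the starting point are assumptions. The default of 0.09 is the overall year-1 HAQ difference reported in the abstract.

```python
def haq_difference(year, year1_difference=0.09, half_life_years=4.0):
    """Estimated HAQ difference between current practice and MBDA in a given year.

    Assumption: the difference decays exponentially from its year-1 value,
    halving every half_life_years. This is an illustrative form only.
    """
    if year < 1:
        raise ValueError("year must be 1 or later")
    elapsed = year - 1
    return year1_difference * 0.5 ** (elapsed / half_life_years)


# Difference in each year of a 10-year horizon under the assumed decay.
yearly_differences = [haq_difference(year) for year in range(1, 11)]
```

Under this assumed curve the year-10 difference is a little under 0.02, which is in line with the 0.02 reported at 10 years, though that agreement does not confirm the model used this exact form.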

QoL adjustments and costs

QoL adjustments and costs are presented in Table 1. Lower QoL scores were associated with increased HAQ scores (i.e. worsening disability) and the occurrence of treatment-related adverse events.

Costs were assigned to treatment strategies (drug dosing and frequency) reported in the MBDA test clinical management study before and after physicians reviewed the MBDA test results (data on file, Crescendo Bioscience). Annual drug prices were obtained from the Centers for Medicare & Medicaid Services (CMS) National Average Drug Acquisition Costs database [32, 33]. An 18% markup was applied to account for differences in reimbursement between payers (i.e. private and public). We included a 21% reduction in costs to account for treatment non-adherence among patients with RA [34].

The analysis incorporated changes in disability-related costs associated with changes in health status. Each unit increase in HAQ score has been reported to increase direct medical costs by $1637 [35]. As mentioned previously, the difference in costs diminished over time; half of the benefit remained after 4 years [16, 30].

Changes in functional status were directly related to labour force participation and work productivity (supplementary Table S2, available at Rheumatology Online). Hourly wages and the number of hours worked per week were estimated using the February 2014 US Bureau of Labor Statistics data for private workers [27, 36].

Sensitivity analyses

Sensitivity analyses were run to assess the effect of uncertainty in costs and outcomes with use of the MBDA test. The lower and upper bounds of parameter values were established using the maximum and minimum values reported in the literature. We used the study estimate’s 95% CI for the range, if available; when no range was reported in the literature, we relied on expert opinion or varied the estimate by ±15% (supplementary Table S1, available at Rheumatology Online). The univariate analyses provided the best- and worst-case results when altering individual parameters. We conducted probabilistic analyses, which simultaneously varied parameters using randomly selected estimates that fell within the upper and lower bound of each parameter.

Results

Use of the MBDA test was projected to increase the number of patients with a lower HAQ score, as observed in Fig. 1 (the curve shifts to the left based on benefits received with treatment changes following use of the MBDA test). Overall, patients’ HAQ scores decreased by a mean of 0.09 units in year 1 compared with current clinical practice (Table 2); a 0.02 difference remained at 10 years (Fig. 2). This improvement in patient health status resulted in a cumulative 0.08 QALYs gained per patient tested over 10 years. Use of the MBDA test was projected to have a modest effect on the rate of adverse events, and thus on QALYs (Table 2).

Over a 10-year period, use of MBDA was projected to save $457 per patient in direct and indirect costs. Annual drug costs totalled $17 284 in current clinical practice and $17 455 with MBDA testing (year 1). Changes in treatment recommendations thus led to an increase in RA drug costs of $171 per patient in year 1 and $881 over the 10-year time horizon. Adoption of the MBDA test accounted for an increase of $146 per year in costs (Table 2).

Improvements in patient health status resulted in mean per-patient savings in other direct costs of $150 in the first year and $661 over the 10-year time horizon. Moreover, treatment changes following the MBDA test contributed to increased labour force participation and


TABLE 2 Results of cost-effectiveness analysis of the MBDA test from a US health care perspective

Outcomes over budget horizon          Current clinical practice(a)   MBDA        Difference

HAQ score
  Year 1                              1.17                           1.08        −0.09
  Time horizon                        1.22                           1.20        −0.02
Quality-adjusted life years
  Disability-related                  5.57                           5.65        0.08
  Adverse-event related               −0.04                          −0.04       −0.00002
  Total                               5.54                           5.61        0.08
Costs
  Direct medical costs
    MBDA test                         $0                             $1460       $1460
    RA drug costs                     $140 577                       $141 458    $881
    Other direct costs for RA         $15 426                        $14 765     −$661
    Adverse-event related             $370                           $371        $0.24
    Total                             $156 373                       $158 053    $1680
  Indirect costs
    Labour force participation        $67 344                        $65 866     −$1478
    Work productivity                 $37 632                        $36 973     −$659
    Total                             $104 976                       $102 839    −$2137
  Total direct and indirect costs     $261 349                       $260 892    −$457
Incremental cost per QALY gained
  Payer perspective                                                              $22 088
  Payer and employer perspective                                                 Cost-saving

(a) Current clinical practice refers to outcome measures currently used in the practice setting to ascertain disease activity among patients diagnosed with RA. Guidelines recommend using outcome measures that combine information from the patient, provider and/or laboratory results to measure disease progression in RA (e.g. DAS28, CDAI, RAPID-3) [9].
All costs reported in 2014 US dollars. CDAI: Clinical Disease Activity Index; DAS28: 28-joint DAS; MBDA: multibiomarker disease activity; QALY: quality-adjusted life year; RAPID: Routine Assessment of Patient Index Data.
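The totals and the incremental cost per QALY in Table 2 can be checked with simple arithmetic. The Python sketch below sums the reported cost differences and divides the payer-perspective cost difference by the QALY gain. Variable and function names are hypothetical. The reported QALY gain of 0.08 is rounded, so the ratio computed from it will not reproduce the published $22 088 exactly; the implied unrounded gain shown at the end is an inference, not a figure from the paper.

```python
# Differences (MBDA minus current clinical practice) from Table 2, 2014 USD.
direct_differences = {
    "MBDA test": 1460.0,
    "RA drug costs": 881.0,
    "Other direct costs for RA": -661.0,
    "Adverse-event related": 0.24,
}
indirect_differences = {
    "Labour force participation": -1478.0,
    "Work productivity": -659.0,
}

direct_total = sum(direct_differences.values())      # about 1680
indirect_total = sum(indirect_differences.values())  # -2137
overall_total = direct_total + indirect_total        # about -457


def incremental_cost_per_qaly(cost_difference, qaly_difference):
    """Incremental cost-effectiveness ratio: extra cost per extra QALY."""
    if qaly_difference == 0:
        raise ValueError("QALY difference must be non-zero")
    return cost_difference / qaly_difference


# Using the rounded QALY gain printed in the table.
icer_from_rounded_qaly = incremental_cost_per_qaly(1680.0, 0.08)

# Inference only: the QALY gain that would yield the published $22 088.
implied_qaly_gain = 1680.0 / 22088.0
```

The negative overall total is what makes the test cost saving from the combined payer and employer perspective, while the payer-only view shows a positive cost per QALY gained.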

work productivity, resulting in savings of $514 in the first year and $2137 over the time horizon (Table 2).

The MBDA test costs $22 088 per QALY gained from the perspective of the third-party payer. When including both third-party payer costs and costs related to work productivity, use of the MBDA test was cost saving (−$6011) (Table 2). From the payer’s perspective, for patients with early RA, the MBDA test is cost saving (−$3073) and improves QALYs by 0.23 per patient; in established RA there is an improvement in QALYs of 0.02, while adding $275 in overall costs.

The MBDA test improved patient health across all variations of input parameters of the one-way sensitivity analysis. Factors that most influenced overall costs were the effect of the MBDA test on clinician treatment recommendations and changes in HAQ score over time (Fig. 3). The probabilistic sensitivity analysis showed the MBDA test was projected to increase QALYs in all scenarios; it was cost saving in 55% of scenarios. Ninety-three per cent of analyses resulted in a cost per QALY gained of <$50 000 (Fig. 4).

FIG. 2 Mean HAQ progression over a 10-year time horizon with and without MBDA test adjunct
Dark grey: current clinical practice; light grey: with Vectra DA. MBDA: multibiomarker disease activity.

Discussion

This decision analysis assessed the outcomes and costs of using the MBDA test in the management of patients with RA. Improved control of disease activity derived from treatment changes with use of the MBDA test resulted in an improvement in HAQ scores of 0.09 units in year 1, declining to 0.02 after 10 years. Cumulatively, QALYs increased by 0.08 years. Overall, costs decreased


by $457, with a $1680 cost increase to third-party payers offset by savings in labour force participation and work productivity (−$2137). Adoption of the MBDA test to inform disease activity was cost saving in >50% of all scenarios of the probabilistic sensitivity analysis; QALYs increased in all scenarios.

FIG. 3 One-way sensitivity analysis (effect on costs)
Lower and upper bounds for parameters presented in the supplementary data, available at Rheumatology Online. All variables were analysed in the one-way sensitivity analysis; however, only those showing a difference >US$10 (between the upper and lower bound) were included in the figure. All costs reported in 2014 US dollars. AEs: adverse events; MBDA: multibiomarker disease activity; sDMARDs: synthetic DMARDs.

Guideline panels recommend patients be treated early in the disease because long-term disability is a potential consequence of underrecognized, cumulative joint damage [7–10]. They also recommend patients have frequent assessments of disease activity so that treatment regimens may be adjusted to increase the proportion of patients in disease remission [5, 7–10]. Current disease activity measures have several limitations that seem to contribute to slow or variable uptake in their use [15, 17, 37, 38].

The MBDA test may be a useful adjunct to clinical assessment to evaluate the effectiveness of a treatment regimen when trying to reach targeted treatment goals, given its ability to identify progression-free remission and assess subclinical disease [22, 23]. Based on the results of the test, physicians may recommend appropriate treatments to limit joint damage when the risk of progression is high or avoid intensification of treatment when the risk is low, as it provides an objective measure that may help improve the likelihood of achieving guideline-recommended goals. Other technologies such as CT scanning, ultrasonography, computerized image analysis and MRI have been proposed as potential methods for measuring disease activity. The role and affordability of these approaches in clinical practice remain unresolved [39–41]. Costs and outcomes of MRI use in early RA were analysed by Suter et al. [24], reporting the cost per unit increase in health benefit over a 10-year time horizon as $167 783. The reason for this high cost was directly related to the high false-positive rate, which led to more treatment-intensive strategies. In contrast, use of the MBDA test was projected to lead to only modest increases of biologic therapies (by 2%) in patients with established RA, and no increase among patients with


early RA [23]. Furthermore, our analyses showed that the cost of using the MBDA test to monitor disease activity remains below the willingness-to-pay threshold ($50 000), which is used to assess the value of biologic therapies (vs standard treatment) [42].

FIG. 4 Probabilistic sensitivity analysis
(A) Incremental costs vs incremental QALYs. (B) Probability that cost per QALY gained is less than or equal to the willingness-to-pay threshold per QALY gained. MBDA: multibiomarker disease activity; QALY: quality-adjusted life years.

The MBDA test was projected to increase QALYs by 0.08, or 1 month, over the 10-year time horizon. The implications of adoption of this test in the management of treatment strategies for patients with RA may seem trivial at first glance. However, it is important to note the magnitude of the effect, with and without use of the test, in the context of different RA patient populations and its effectiveness compared with other medical interventions. Progression of the disease worsens with increasing age, thus early use of the MBDA test (i.e. at a younger age) would likely be of greater benefit, allowing patients to experience the maximized effect of prescribed pharmacotherapy regimens to achieve the treatment goal. These changes may prevent and delay the occurrence of preliminary radiographic damage and lead to the retention of functionality, the loss of which hampers QoL, for a longer period of time. When extrapolating the QALY benefit in RA in the context of other medical interventions, the benefit of the MBDA test in the management of RA is greater than that derived from the use of ticlopidine (vs aspirin) for prevention of stroke among high-risk patients and testing of the blood supply for HIV prior to use in surgical patients ≥70 years of age [43].

Use of the MBDA test is expected to improve the quality of care for patients with RA, as quantified by changes in HAQ score and QALYs, through more informed therapy selection. This analysis indicated it can do so while also


being affordable. Affordability is part of the Triple Aim initiative, along with outcomes and patient experiences, required for certification by CMS for status as an Accountable Care Organization (ACO) under the Affordable Care Act of 2010 [44]. The MBDA test appears to meet the three criteria and align with ACO goals.

This analysis relied on an analytical framework and parameters used in earlier studies [24–26]. We sought to assess how outcomes reported herein align with costs and outcomes reported from other sources, referred to as external validity [45]. Specifically, compared with a recent US registry study, the treatment patterns observed in pre-MBDA treatment recommendations are consistent with the trend towards increasing utilization of biologics [59% in the decision impact study vs 40% in the Consortium of Rheumatology Researchers of North America (CORRONA) for patients with established RA] [46]; utilization of biologics increased from 3% to 26% between 1999 and 2006 [47].

A number of limitations should be considered when interpreting the results of this analysis. First, the influence of the MBDA test on clinicians’ treatment decisions contributed the greatest variation in outcomes and costs. Further research is needed to validate this estimate and its effect on treatment cost changes. Prospective studies of MBDA test use in other settings are warranted to increase the precision and generalizability of findings on medical resource utilization (e.g. lab tests, imaging procedures, patient treatment adherence). Further stratifying clinical utility by patient features will extend the breadth of future analyses of health care utilization. Furthermore, long-term studies will be useful to monitor whether the predicted effects on outcomes and cost persist under different conditions than those analysed herein.

The MBDA test provides an objective assessment of disease activity in RA that addresses well-reported limitations of existing clinical measures constraining their widespread adoption [12, 13, 15]. The information provided by the MBDA test has been shown to aid monitoring of disease activity, thus allowing for more informed selection of treatments in RA, potentially helping to slow disease progression and/or preserve joint integrity. This measure is projected to influence RA-related disability and reduce overall combined costs to third-party payers and in labour force participation and work productivity.

Funding: This work was supported by Crescendo Bioscience Inc.

Disclosure statement: K.M. has been a consultant for Crescendo Bioscience and has not received funding from the sponsor. K.F. has been a consultant for Crescendo Bioscience and has received honoraria from the sponsor. I.D. was an employee of Cedar Associates LLC, which received research support from Crescendo Bioscience Inc. for the work that was performed. N.A.S. currently receives research grant support from the BRASS registry which is funded by Crescendo Biosciences, UCB and BMS. N.A.S. has received research grant support in the past two years from Amgen, Abbvie and Genentech. J.H. and S.N.M. are employees of Cedar Associates LLC. Crescendo Bioscience Inc. funded Cedar Associates LLC. for the research detailed in this study. V.S. is a consultant to and serves on the Scientific Advisory Board of Crescendo Bioscience; she has also received honoraria from UCB Pharma, Abbott Immunology, Amgen, Anthera, AstraZeneca, Bristol-Myers Squibb, Genentech/Roche, GlaxoSmithKline, Sciences, Idera, Janssen, Lilly, Medimmune, Novartis Pharmaceuticals, Novo Nordisk, Orbimed, Pfizer, Rigel, Sanofi and Takeda.

Supplementary data

Supplementary data are available at Rheumatology Online.

References

1 Alamanos Y, Voulgari PV, Drosos AA. Incidence and prevalence of rheumatoid arthritis, based on the 1987 American College of Rheumatology criteria: a systematic review. Semin Arthritis Rheum 2006;36:182–8.
2 Wasserman AM. Diagnosis and management of rheumatoid arthritis. Am Fam Physician 2011;84:1245–52.
3 Michaud K, Vera-Llonch M, Oster G. Mortality risk by functional status and health-related quality of life in patients with rheumatoid arthritis. J Rheumatol 2012;39:54–9.
4 Crowson CS, Liang KP, Therneau TM, Kremers HM, Gabriel SE. Could accelerated aging explain the excess mortality in patients with seropositive rheumatoid arthritis? Arthritis Rheum 2010;62:378–82.
5 Plant MJ, Williams AL, O’Sullivan MM et al. Relationship between time-integrated C-reactive protein levels and radiologic progression in patients with rheumatoid arthritis. Arthritis Rheum 2000;43:1473–7.
6 Mottonen T, Hannonen P, Korpela M et al. Delay to institution of therapy and induction of remission using single-drug or combination-disease-modifying antirheumatic drug therapy in early rheumatoid arthritis. Arthritis Rheum 2002;46:894–8.
7 Albrecht K, Kruger K, Wollenhaupt J et al. German guidelines for the sequential medical treatment of rheumatoid arthritis with traditional and biologic disease-modifying antirheumatic drugs. Rheumatol Int 2014;34:1–9.
8 Bykerk VP, Akhavan P, Hazlewood GS et al. Canadian Rheumatology Association recommendations for pharmacological management of rheumatoid arthritis with traditional and biologic disease-modifying antirheumatic drugs. J Rheumatol 2012;39:1559–82.
9 Singh JA, Furst DE, Bharat A et al. 2012 update of the 2008 American College of Rheumatology recommendations for the use of disease-modifying antirheumatic drugs and biologic agents in the treatment of rheumatoid arthritis. Arthritis Care Res 2012;64:625–39.
10 Smolen JS, Landewe R, Breedveld FC et al. EULAR recommendations for the management of rheumatoid arthritis with synthetic and biological disease-modifying antirheumatic drugs: 2013 update. Ann Rheum Dis 2014;73:492–509.
11 Marhadour T, Jousse-Joulin S, Chales G et al. Reproducibility of joint swelling assessments in long-lasting rheumatoid arthritis: influence on disease activity score-28 values (SEA-Repro study part I). J Rheumatol 2010;37:932–7.
12 Brown AK, Conaghan PG, Karim Z et al. An explanation for the apparent dissociation between clinical remission and continued structural deterioration in rheumatoid arthritis. Arthritis Rheum 2008;58:2958–67.
13 Pollard LC, Kingsley GH, Choy EH, Scott DL. Fibromyalgic rheumatoid arthritis and disease assessment. Rheumatology 2010;49:924–8.
14 Wolfe F, Michaud K, Busch RE et al. Polysymptomatic distress in patients with rheumatoid arthritis: understanding disproportionate response and its spectrum. Arthritis Care Res 2014;66:1465–71.
15 Hobbs KF, Cohen MD. Rheumatoid arthritis disease measurement: a new old idea. Rheumatology 2012;51(Suppl 6):vi21–7.
16 van der Heijde D, Breedveld FC, Kavanaugh A et al. Disease activity, physical function, and radiographic progression after longterm therapy with adalimumab plus methotrexate: 5-year results of PREMIER. J Rheumatol 2010;37:2237–46.
17 Pincus T, Segurado OG. Most visits of most patients with rheumatoid arthritis to most rheumatologists do not include a formal quantitative joint count. Ann Rheum Dis 2006;65:820–2.
18 Centola M, Cavet G, Shen Y et al. Development of a multi-biomarker disease activity test for rheumatoid arthritis. PLoS One 2013;8:e60635.
19 Curtis JR, van der Helm-van Mil AH, Knevel R et al. Validation of a novel multibiomarker test to assess rheumatoid arthritis disease activity. Arthritis Care Res 2012;64:1794–803.
20 Bakker MF, Cavet G, Jacobs JW et al. Performance of a multi-biomarker score measuring rheumatoid arthritis disease activity in the CAMERA tight control study. Ann Rheum Dis 2012;71:1692–7.
21 Hirata S, Dirven L, Shen Y et al. A multi-biomarker score measures rheumatoid arthritis disease activity in the BeSt study. Rheumatology 2013;52:1202–7.
22 van der Helm-van Mil AH, Knevel R, Cavet G, Huizinga TW, Haney DJ. An evaluation of molecular and clinical remission in rheumatoid arthritis by assessing radiographic progression. Rheumatology 2013;52:839–46.
23 Li W, Sasso EH, Emerling D, Cavet G, Ford K. Impact of a multi-biomarker disease activity test on rheumatoid arthritis treatment decisions and therapy use. Curr Med Res Opin 2013;29:85–92.
24 Suter LG, Fraenkel L, Braithwaite RS. Cost-effectiveness of adding magnetic resonance imaging to rheumatoid arthritis management. Arch Intern Med 2011;171:657–67.
25 Finckh A, Bansback N, Marra CA et al. Treatment of very early rheumatoid arthritis with symptomatic therapy, disease-modifying antirheumatic drugs, or biologic agents: a cost-effectiveness analysis. Ann Intern Med 2009;151:612–21.
26 Wailoo AJ, Bansback N, Brennan A et al. Biologic drugs for rheumatoid arthritis in the Medicare program: a cost-effectiveness analysis. Arthritis Rheum 2008;58:939–46.
27 US Bureau of Labor Statistics. Division of Consumer Prices and Price Indexes. Consumer Price Index, 2013. US Bureau of Labor Statistics. Washington, DC, USA, 2013.
28 Emery P, Fleischmann R, van der Heijde D et al. The effects of golimumab on radiographic progression in rheumatoid arthritis: results of randomized controlled studies of golimumab before methotrexate therapy and golimumab after methotrexate therapy. Arthritis Rheum 2011;63:1200–10.
29 Michaud K, Wallenstein G, Wolfe F. Treatment and nontreatment predictors of health assessment questionnaire disability progression in rheumatoid arthritis: a longitudinal study of 18,485 patients. Arthritis Care Res 2011;63:366–72.
30 Aletaha D, Smolen J, Ward MM. Measuring function in rheumatoid arthritis: identifying reversible and irreversible components. Arthritis Rheum 2006;54:2784–92.
31 Plant MJ, O’Sullivan MM, Lewis PA et al. What factors influence functional ability in patients with rheumatoid arthritis. Do they alter over time? Rheumatology 2005;44:1181–5.
32 Centers for Medicare & Medicaid Services. Survey of drug acquisition costs paid by retail community pharmacies 2013 (part II) (May 16, 2013). Baltimore, MD, USA: Centers for Medicare & Medicaid Services, 2013.
33 Centers for Medicare & Medicaid Services. Healthcare Common Procedure Coding System. Payment allowance limits for Medicare part B drugs (June 7, 2013). Baltimore, MD, USA: Centers for Medicare & Medicaid Services, 2013.
34 Grijalva CG, Chung CP, Arbogast PG et al. Assessment of adherence to and persistence on disease-modifying antirheumatic drugs (DMARDs) in patients with rheumatoid arthritis. Med Care 2007;45(10 Suppl 2):S66–76.
35 Wailoo A, Brennan A, Bansback N et al. Modeling the cost effectiveness of etanercept, adalimumab and anakinra compared to infliximab in the treatment of patients with rheumatoid arthritis in the Medicare program: final report. Rockville, MD, USA: Agency for Health Care Research and Quality, 2006.
36 US Bureau of Labor Statistics. Current employment statistics (CES) survey: real earnings. Washington, DC, USA: Bureau of Labor Statistics. February 2014.
37 Cush JJ. Trends in rheumatology practice: a survey of US rheumatologists. Paper presented at the American College of Rheumatology Annual Scientific Meeting. San Francisco, CA, 24–28 October 2008.
38 Demaria L, Acelajado MC, Luck J et al. Variations and practice in the care of patients with rheumatoid arthritis: quality and cost of care. J Clin Rheumatol 2014;20:79–86.
39 Wakefield RJ, Gibbon WW, Conaghan PG et al. The value of sonography in the detection of bone erosions in patients with rheumatoid arthritis: a comparison with conventional radiography. Arthritis Rheum 2000;43:2762–70.
40 Sharp JT, Gardner JC, Bennett EM. Computer-based methods for measuring joint space and estimating erosion volume in the finger and wrist joints of patients with rheumatoid arthritis. Arthritis Rheum 2000;43:1378–86.
41 McGonagle D, Conaghan PG, O’Connor P et al. The relationship between synovitis and bone changes in early untreated rheumatoid arthritis: a controlled magnetic resonance imaging study. Arthritis Rheum 1999;42:1706–11.
42 Schoels M, Wong J, Scott DL et al. Economic aspects of treatment options in rheumatoid arthritis: a systematic literature review informing the EULAR recommendations for the management of rheumatoid arthritis. Ann Rheum Dis 2010;69:995–1003.
43 Wright JC, Weinstein MC. Gains in life expectancy from medical interventions—standardizing data on outcomes. N Engl J Med 1998;339:380–6.
44 Berwick DM, Nolan TW, Whittington J. The Triple Aim: care, health, and cost. Health Affairs 2008;27:759–69.
45 Caro JJ, Briggs AH, Siebert U, Kuntz KM. Modeling good research practices—overview: a report of the ISPOR-SMDM Modeling Good Research Practices Task Force—1. Value Health 2012;15:796–803.
46 Lee SJ, Chang H, Yazici Y et al. Utilization trends of tumor necrosis factor inhibitors among patients with rheumatoid arthritis in a United States observational cohort study. J Rheumatol 2009;36:1611–7.
47 Yazici Y, Shi N, John A. Utilization of biologic agents in rheumatoid arthritis in the United States: analysis of prescribing patterns in 16,752 newly diagnosed patients and patients new to biologic therapy. Bull NYU Hosp Jt Dis 2008;66:77–85.

Downloaded from http://ard.bmj.com/ on September 30, 2015 - Published by group.bmj.com
Clinical and epidemiological research

EXTENDED REPORT

Pretreatment multi-biomarker disease activity score and radiographic progression in early RA: results from the SWEFOT trial

Karen Hambardzumyan,1 Rebecca Bolce,2 Saedis Saevarsdottir,1,3 Scott E Cruickshank,4 Eric H Sasso,2 David Chernoff,2 Kristina Forslind,5,6 Ingemar F Petersson,5,7 Pierre Geborek,5 Ronald F van Vollenhoven1

Handling editor Tore K Kvien

Additional material is published online only. To view please visit the journal online (http://dx.doi.org/10.1136/annrheumdis-2013-204986).

For numbered affiliations see end of article.

Correspondence to Karen Hambardzumyan, Department of Medicine, Clinical Therapy Research, Inflammatory Diseases (ClinTRID), Karolinska Institute, D1:00 Karolinska University Hospital, Solna, Stockholm 17176, Sweden; [email protected]

Received 27 November 2013
Revised 4 April 2014
Accepted 13 April 2014
Published Online First 8 May 2014

To cite: Hambardzumyan K, Bolce R, Saevarsdottir S, et al. Ann Rheum Dis 2015;74:1102–1109.

ABSTRACT
Objectives Prediction of radiographic progression (RP) in early rheumatoid arthritis (eRA) would be very useful for optimal choice among available therapies. We evaluated a multi-biomarker disease activity (MBDA) score, based on 12 serum biomarkers as a baseline predictor for 1-year RP in eRA.
Methods Baseline disease activity score based on erythrocyte sedimentation rate (DAS28-ESR), disease activity score based on C-reactive protein (DAS28-CRP), CRP, MBDA scores and DAS28-ESR at 3 months were analysed for 235 patients with eRA from the Swedish Farmacotherapy (SWEFOT) clinical trial. RP was defined as an increase in the Van der Heijde-modified Sharp score by more than five points over 1 year. Associations between baseline disease activity measures, the MBDA score, and 1-year RP were evaluated using univariate and multivariate logistic regression, adjusted for potential confounders.
Results Among 235 patients with eRA, 5 had low and 29 moderate MBDA scores at baseline. None of the former and only one of the latter group (3.4%) had RP during 1 year, while the proportion of patients with RP among those with high MBDA score was 20.9% (p=0.021). Among patients with low/moderate CRP, moderate DAS28-CRP or moderate DAS28-ESR at baseline, progression occurred in 14%, 15%, 14% and 15%, respectively. MBDA score was an independent predictor of RP as a continuous (OR=1.05, 95% CI 1.02 to 1.08) and dichotomised variable (high versus low/moderate, OR=3.86, 95% CI 1.04 to 14.26).
Conclusions In patients with eRA, the MBDA score at baseline was a strong independent predictor of 1-year RP. These results suggest that when choosing initial treatment in eRA the MBDA test may be clinically useful to identify a subgroup of patients at low risk of RP.
Trial registration number WHO database at the Karolinska Institute: CT20080004; and clinicaltrials.gov: NCT00764725.

INTRODUCTION
The course of rheumatoid arthritis (RA) can vary from mild and non-destructive to severe and rapidly destructive.1 2 Although some clinical parameters at diagnosis, including inflammatory markers, baseline erosions, smoking, and in some studies, auto-antibody status, have been shown to be associated with the risk of radiographic progression (RP),3–14 they have limited predictive power on an individual basis. Therefore, identification of new predictors would be beneficial for establishing the prognosis at an early stage and for optimally choosing therapy.

Various serum biomarkers have been studied as predictors of RP. For example, bone and cartilage metabolism turnover are found to be associated with RP of joint damage in patients with RA,15–17 whereas high leptin and eotaxin levels, although being pro-inflammatory, are associated with better radiographic outcomes.18 19 To date, no single biomarker has proven to be highly reliable for predicting RP.20 21 Therefore, the use of combinations of biomarkers may be a more promising approach.

The multiple-biomarker disease activity (MBDA; Crescendo Bioscience Inc, South San Francisco, California, USA) score (range from 1 to 100) is based on serum levels of several biomarkers. The development of the MBDA score started with screening 396 candidate biomarkers and ended up with 12 that were combined into a score and shown to correlate well with disease activity.22–24 This test is validated for clinical use in the USA as a disease activity marker in RA. Its value as a predictor of clinical and radiographic outcomes is currently the subject of several studies. Bakker et al25 showed in the CAMERA study that the MBDA score correlated significantly (r=0.72; p<0.001) with disease activity score based on C-reactive protein (DAS28-CRP). Hirata et al26 observed an association of the MBDA score and its 1-year change with different clinical outcomes and Van der Helm-Van Mil et al23 demonstrated that remission based on the MBDA score was associated with limited RP in patients with established RA on disease-modifying antirheumatic drug (DMARD) therapy compared with other clinical measures of remission.

We report a post hoc analysis of the Swedish Farmacotherapy (SWEFOT) randomised clinical trial in DMARD-naïve early RA (eRA), which featured an initial 3-month treatment with methotrexate (MTX) monotherapy. In patients whose disease did not respond to initial therapy, this was followed by a randomised comparison between non-biological triple DMARD therapy and MTX plus biological (anti-tumour necrosis factor (TNF)) therapy.27 28 The MBDA score was measured in

baseline serum samples from patients included in the SWEFOT clinical trial and studied as a predictor of RP after 1 year.

METHODS
Study population
This study was performed with data from the SWEFOT clinical trial, in which 487 DMARD-naïve patients with eRA (duration<1 year) from 15 different clinics in Sweden started 3 months of MTX treatment. After 3 months of MTX monotherapy, those whose disease did not respond (DAS28>3.2) were randomised into two groups: group A (n=130) received MTX combined with sulfasalazine (SSZ) and hydroxychloroquine (HCQ) (triple therapy), and group B (n=128) received MTX combined with infliximab. Approximately one-third of patients (n=145) had a good response after 3 months of MTX monotherapy (DAS28≤3.2) and they continued the treatment for 2 years. The trial was described in detail elsewhere.27

Clinical and radiographic outcomes
For this study complete sets of baseline demographic, serological and radiographic data, and clinical measures from 235 patients were analysed. Identification of clinical response to MTX monotherapy was done by using DAS28 based on ESR at 3-month follow-up (≤3.2: response; >3.2: non-response).29 We also analysed CRP (mg/L), ESR and DAS28-CRP. The thresholds for disease activity levels according to these measures were as follows: for DAS28-ESR, low ≤3.2, moderate 3.3–5.1 and high >5.1 (ref. 30); for CRP, low ≤10 mg/L, moderate >10–30 mg/L and high >30 mg/L (ref. 31); and for DAS28-CRP, low ≤2.7, moderate 2.8–4.1 and high >4.1 (ref. 32). Categorisation of patients in ESR low, moderate and high disease activity groups was done by using tertiles of the measure (for results based on continuous variables and tertiles using other disease activity measures, see online supplementary figures S1 and S2, respectively). X-rays of the hands and feet were done at baseline and after 1 year, and the van der Heijde modified Sharp score (SHS) was calculated.33 Patients whose SHS increased by more than five points from baseline to 1 year (ΔSHS>5) were considered to have rapid RP (RRP).34 35 In addition, two other thresholds (ΔSHS>0 and ΔSHS>3) were analysed for comparison. In the analyses of RRP that follow, these 235 patients were treated as a single group because their results define the overall outcome of the SWEFOT tight control strategy for patients with recent onset RA.

Biomarker measurement and MBDA score
The MBDA score was measured in baseline serum samples from the SWEFOT participants and is based on the following 12 biomarkers: vascular cell adhesion molecule 1, epidermal growth factor, vascular endothelial growth factor, interleukin 6, TNF receptor I, matrix metalloproteinases 1 and 3, bone glycoprotein 39 (YKL-40), leptin, resistin, serum amyloid A and CRP. These biomarkers were measured by electrochemiluminescence-based multiplexed immunoassays on the Meso Scale Discovery Multi-Array platform.36 The measured levels for each of the 12 biomarkers were weighted and combined using a validated formula to derive the MBDA score (Vectra DA score), which ranges from 1 to 100. In this study, the following disease activity categories according to the MBDA score were used: low (<30), moderate (30–44) and high (>44).22 23

Statistical analysis
Descriptive statistics were prepared for demographics and baseline disease-related characteristics, including measures of disease activity. The association between RP at 1 year and each baseline disease activity measure was evaluated using univariate logistic regression. Wald's χ2 test (p<0.05) and the estimated OR and corresponding 95% CI were used from the logistic model to assess the strength and direction of the association, respectively. Additionally, bivariate and multivariate logistic regression models were used to assess the association between RP and baseline MBDA score, while simultaneously accounting for potential confounders at baseline. For the multivariate analyses, we adjusted for all significant univariate predictors as in our recent report based on the same study populations.37 The reported p values from these additional analyses were not adjusted for multiple testing. The difference in proportion of RP between patients with low/moderate and high MBDA score groups was compared by Fisher's Exact test. Probability plots were used to depict graphically the occurrence of RP over 1 year with patients stratified by the aforementioned baseline disease activity categories (low, moderate and high). Measures of sensitivity and specificity (positive predictive value and negative predictive value) were calculated to determine the degree to which the baseline MBDA score accurately predicts RP at 1 year.

RESULTS
Description of the study cohort
A total of 235 patients had complete radiographic, clinical and serological data for evaluation in this study ('study cohort'). Demographic and clinical data at baseline for these patients were similar to those for the overall SWEFOT trial population (table 1). Overall, the patients in the study cohort had a mean symptom duration of 6.1 months from diagnosis and moderate to high disease activity, as expected in an early-onset RA population.

Following 3 months of MTX therapy, 78 (33%) of the 235 patients in the study cohort responded to treatment and continued to receive MTX monotherapy per protocol and 157 (67%) did not respond and were randomised to receive triple DMARD therapy (group A) or MTX with infliximab (group B; table 2). RRP, defined as ΔSHS>5 from baseline to 1 year, was observed for 43 of the 235 patients in the study cohort.

Baseline characteristics and RP
Among baseline parameters, MBDA, ESR and CRP values were significantly higher in patients with RP versus those without (p<0.001, p=0.001 and p=0.018 respectively; table 1). Mean changes in SHS from baseline to 1 year were 2.1 and 3.6 for the responder and non-responder groups, respectively and 13% of the responder group had RRP (ΔSHS>5), compared with 21% in the non-responder group. Other thresholds including ΔSHS>0 and ΔSHS>3 were also tested (see online supplementary table S1).

Relationship between RP and baseline level of MBDA score, CRP, ESR or DAS28
The discriminative capacity of the baseline MBDA score, CRP, DAS28 and ESR for RP is illustrated by cumulative probability plots of ΔSHS from baseline to 1 year (figure 1 and see online supplementary figures S1 and S2). The curve for the high MBDA group was markedly different from curves for the low or moderate MBDA groups (figure 1A). By contrast, curves for the three baseline CRP groups, two DAS28 groups and three ESR groups were more similar, with RRP being relatively frequent in all categories of these baseline measures (figure 1B–D, respectively). Mean ΔSHS values and frequencies of progression for other thresholds of ΔSHS followed the same trends across categories of MBDA score as observed for ΔSHS>5 (table 2).
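The category cut-offs stated in the Methods can be written down as a small set of lookups. The Python sketch below is illustrative only and is not the authors' code: the function names are hypothetical, and treating values that fall between two listed bounds (for example a DAS28-ESR of 3.25) as belonging to the higher category is an assumption made here so that every value is classified.

```python
def mbda_category(score: float) -> str:
    """MBDA score (1-100): low <30, moderate 30-44, high >44."""
    if score < 30:
        return "low"
    if score <= 44:
        return "moderate"
    return "high"


def das28_esr_category(das28_esr: float) -> str:
    """DAS28-ESR: low <=3.2, moderate up to 5.1, high >5.1."""
    if das28_esr <= 3.2:
        return "low"
    if das28_esr <= 5.1:
        return "moderate"
    return "high"


def das28_crp_category(das28_crp: float) -> str:
    """DAS28-CRP: low <=2.7, moderate up to 4.1, high >4.1."""
    if das28_crp <= 2.7:
        return "low"
    if das28_crp <= 4.1:
        return "moderate"
    return "high"


def crp_category(crp_mg_per_l: float) -> str:
    """CRP in mg/L: low <=10, moderate >10-30, high >30."""
    if crp_mg_per_l <= 10:
        return "low"
    if crp_mg_per_l <= 30:
        return "moderate"
    return "high"


def is_rapid_progression(shs_baseline: float, shs_year1: float) -> bool:
    """Rapid radiographic progression: SHS increase of more than 5 points."""
    return (shs_year1 - shs_baseline) > 5


if __name__ == "__main__":
    print(mbda_category(59.6))           # mean MBDA score of the study cohort
    print(is_rapid_progression(4, 10))   # delta of 6 points
```

Note that the strict inequality in `is_rapid_progression` follows the ΔSHS>5 definition used in this report, so an increase of exactly five points does not count as rapid progression.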


Table 1 Baseline characteristics and demographic data of patients from SWEFOT trial

The last four data columns describe the subset of patients with clinical measures at baseline and radiographs at baseline and 1 year.

Baseline characteristics, mean (±SD) | All patients (n=487)* | Radiographic subset (n=235) | Without progression (ΔSHS≤5) (n=192) | With progression (ΔSHS>5) (n=43) | p Value
Female, N (%) | 344 (70) | 169 (72) | 137 (71) | 32 (74) | 0.686
Symptom duration (months) | 6.2 (4.57) | 6.1 (5.1) | 6.0 (5.38) | 6.6 (3.61) | 0.502
Anti-CCP status, N (%) | | | | | 0.075
  Positive | 275 (57) | 133 (57) | 103 (53) | 30 (70) |
  Negative | 157 (32) | 92 (39) | 80 (42) | 12 (28) |
  Not available | 55 (11) | 10 (4) | 9 (5) | 1 (2) |
RF status, N (%) | | | | | 0.094
  Positive | 330 (68) | 153 (65) | 120 (63) | 33 (77) |
  Negative | 152 (31) | 80 (34) | 70 (36) | 10 (23) |
  Not available | 5 (1) | 2 (1) | 2 (1) | 0 (0) |
28 swollen joint count | 10.8 (5.28) | 10.8 (5.31) | 10.7 (5.30) | 11.0 (5.43) | 0.807
28 tender joint count | 9.6 (6.07) | 9.3 (5.86) | 9.4 (5.99) | 8.77 (5.25) | 0.518
ESR (mm/h) | 39.9 (25.9) | 41.3 (26.9) | 38.5 (24.46) | 53.9 (33.52) | 0.001
CRP level (mg/L) | 33.8 (36.81) | 35.4 (38.37) | 32.5 (36.41) | 48.3 (44.31) | 0.018
Patient's global assessment of disease activity (VAS 0–100 mm) score | 56 (23.9) | 55.4 (24.67) | 54.1 (24.96) | 61.3 (22.70) | 0.082
DAS28 | 5.7 (1.01) | 5.7 (1.02) | 5.7 (1.00) | 5.9 (1.14) | 0.107
DAS28-CRP | 6.5 (1.22) | 5.4 (0.99) | 5.3 (0.97) | 5.5 (1.04) | 0.237
MBDA score | 58.6 (15.08) | 59.6 (14.71) | 57.9 (14.68) | 67.2 (12.38) | <0.001
SHS mean (median) | 4.5 (2) | 4.7 (2) | 4.3 (1) | 6.5 (3) | 0.126

*For "All patients" column the number of missing patients: 28 swollen and tender joint count (n=2), ESR (n=5), CRP and patient's global assessment (n=3), DAS28 and DAS28-ESR (n=8), MBDA (n=185) and SHS (n=57).
anti-CCP, anti-cyclic citrullinated peptide; CRP, C-reactive protein; DAS, disease activity score; ESR, erythrocyte sedimentation rate; MBDA, multi-biomarker disease activity; RF, rheumatoid factor; SHS, Sharp–van der Heijde score; VAS, visual analogue scale.

Discordance between MBDA scores and clinical assessments: relationship to RP
As illustrated in figure 2, none of the patients had low DAS28-ESR or DAS28-CRP at baseline (because of the trial inclusion criteria), but among those with moderate DAS28-ESR/CRP and low/moderate CRP, approximately 15% developed RP during 1 year (figure 2A–C, respectively). While all patients with low MBDA score had low CRP and no RP, a high MBDA score was observed in 59% (42/71) of patients with low CRP and all rapid progression associated with low CRP (n=10) occurred in the high MBDA subgroup (figure 2C). Thus, almost all patients with RP (42 of 43 cases) belonged to the high MBDA group (n=201) and represented 21% of that group versus only one case of progression (3.4%) among patients with moderate (n=29, p=0.021), and none among patients with low MBDA score (figure 2D).
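The comparison of proportions above relies on Fisher's Exact test. As a hedged illustration, the sketch below implements a two-sided Fisher test from first principles using only the Python standard library and applies it to the moderate (1 of 29) versus high (42 of 201) MBDA groups. Which groups enter the reported p=0.021 is an assumption made here from the wording of the paragraph, and the function name is hypothetical; the two-sided rule used (summing all tables no more probable than the observed one) is one common convention and may differ from the software the authors used.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p value for the 2x2 table [[a, b], [c, d]].

    Rows are groups, columns are outcome present / absent. The p value is
    the total hypergeometric probability of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def pmf(k: int) -> float:
        # Probability that k of the col1 events fall in the first row.
        return comb(row1, k) * comb(row2, col1 - k) / total_tables

    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    p_observed = pmf(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        p for p in (pmf(k) for k in range(k_min, k_max + 1))
        if p <= p_observed * (1 + 1e-9)
    )


if __name__ == "__main__":
    # Moderate MBDA: 1 progressor, 28 non-progressors.
    # High MBDA: 42 progressors, 159 non-progressors.
    p_value = fisher_exact_two_sided(1, 28, 42, 159)
    print(f"two-sided p = {p_value:.4f}")
```

Under these assumptions the result should land close to the reported p=0.021, which suggests the published value refers to the moderate versus high comparison rather than the pooled low/moderate group.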

Table 2 Radiographic progression over 1 year stratified by clinical response at 3 months of MTX monotherapy

Group | ΔSHS from baseline, mean (±SD) | ΔSHS from baseline, median | ΔSHS≤0, n (%) | ΔSHS>0, n (%) | ΔSHS>3, n (%) | ΔSHS>5, n (%)
Baseline MBDA score | | | | | |
  Low (MBDA <30, N=5) | 0.8 (1.79) | 0 | 4 (80) | 1 (20) | 1 (20) | 0
  Moderate (MBDA 30–44, N=29) | 1.1 (2.07) | 0 | 19 (66) | 10 (34) | 4 (14) | 1 (3)
  High (MBDA >44, N=201) | 3.4 (6.44) | 1 | 92 (46) | 109 (54) | 67 (33) | 42 (21)
Radiographic assessment at 1 year by response to MTX at 3 months* | | | | | |
  Response to MTX (N=78) | 2.1 (4.36) | 0 | 40 (51) | 38 (49) | 15 (19) | 10 (13)
  Non-response to MTX (N=157) | 3.6 (6.70) | 1 | 75 (48) | 82 (52) | 57 (36) | 33 (21)
    Group A (N=77)† | 4.0 (6.90) | 1 | 36 (47) | 41 (53) | 28 (36) | 18 (23)
    Group B (N=75)† | 3.2 (6.71) | 0 | 38 (51) | 37 (49) | 27 (36) | 15 (20)
Total cohort (N=235) | 3.1 (6.05) | 1 | 115 (49) | 120 (51) | 72 (31) | 43 (18)

The proportions represent patients within a certain ΔSHS range out of respective baseline MBDA subgroups or treatment groups.
*Based on 235 patients with MBDA, DAS28-ESR, DAS28-CRP and CRP values at baseline plus radiographs at baseline and 1 year.
†Five of the 157 patients whose disease did not respond to treatment at 3 months did not undergo randomisation to group A (triple DMARD therapy) or group B (MTX+infliximab therapy).
CRP, C-reactive protein; DAS28-CRP, disease activity score based on C-reactive protein; DAS28-ESR, disease activity score based on erythrocyte sedimentation rate; MBDA, multi-biomarker disease activity; MTX, methotrexate; SD, standard deviation; SHS, Sharp–van der Heijde score.
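The ΔSHS>5 column of table 2 is enough to work out rough predictive measures for a high baseline MBDA score. The sketch below is an illustration rather than the authors' analysis: it treats "high MBDA (>44)" as a positive test and "ΔSHS>5" as the outcome, the function name is hypothetical, and the results may not match online supplementary table S2, which is not reproduced in this document.

```python
def predictive_measures(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard 2x2 diagnostic accuracy measures."""
    return {
        "sensitivity": tp / (tp + fn),  # progressors flagged as high MBDA
        "specificity": tn / (tn + fp),  # non-progressors not flagged
        "ppv": tp / (tp + fp),          # flagged patients who progressed
        "npv": tn / (tn + fn),          # unflagged patients who did not
    }


# Counts taken from table 2 (outcome: delta SHS > 5 at 1 year).
high_total, high_rrp = 201, 42            # high MBDA (>44)
low_mod_total, low_mod_rrp = 5 + 29, 0 + 1  # low plus moderate MBDA

measures = predictive_measures(
    tp=high_rrp,
    fp=high_total - high_rrp,
    fn=low_mod_rrp,
    tn=low_mod_total - low_mod_rrp,
)

if __name__ == "__main__":
    for name, value in measures.items():
        print(f"{name}: {value:.3f}")
```

The pattern is a high sensitivity and negative predictive value alongside a low specificity and positive predictive value, which is consistent with the report's emphasis on identifying patients at low risk rather than confirming who will progress.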


Figure 1 Probability plots of radiographic progression at year 1 for patients (N=235) with high, moderate and low disease activity, grouped according to baseline MBDA (A), CRP (B), DAS28 (C) and ESR (D). Each black circle represents a patient with low disease activity, each red triangle a patient with moderate disease activity and each blue square a patient with high disease activity. The horizontal dashed line represents ΔSHS=5 from baseline to 1 year, above which the change is considered rapid radiographic progression (ΔSHS>5). DAS28, disease activity score; ESR, erythrocyte sedimentation rate; MBDA, multi-biomarker disease activity; SHS, Sharp–van der Heijde score.

The accuracy of the baseline MBDA score to predict RP at year 1 was assessed by calculating measures of sensitivity and specificity (see online supplementary table S2 and text S1). Additionally, the relationship between RP and baseline MBDA score was further examined in the subgroup of patients with high baseline scores (>44) (see online supplementary figures S3 and S4).

Disease activity at baseline for predicting RRP
Univariate analyses of the radiographic subgroup (n=235) demonstrated significant associations with RRP, defined as ΔSHS>5 units in 1 year, for baseline MBDA score (the odds of RP increased by 5% for each 1-unit increase in the MBDA score: OR=1.05, p<0.001) and baseline CRP (OR=1.10, p=0.018) but not for baseline DAS28-ESR (OR=1.31, p=0.107) or DAS28-CRP (OR=1.22, p=0.237) (table 3). Further analyses of the high MBDA subset also confirmed that odds for RP is doubled in patients whose MBDA score is above 65 compared with those whose MBDA score is >44–65 (see online supplementary figure S4).

In bivariate analyses that adjusted the MBDA scores for 11 different clinical variables and for sex, one at a time, the baseline MBDA score was always an independent predictor of RRP (OR values: 1.04–1.06, p values: 0.021 to <0.001).

Furthermore, MBDA score as a continuous variable was a strong independent predictor of RRP after 1 year (OR=1.05, p<0.001; table 3), using a multivariate logistic regression model with adjustment for all significant baseline predictors from univariate analyses (sex, symptom duration, current smoking status, erosions, Health Assessment Questionnaire score), as in our recent publication.37 When dichotomised into high MBDA score versus not, the adjusted OR for RRP after 1 year was 3.86 (p=0.04).

DISCUSSION
In this post hoc analysis of the SWEFOT trial we demonstrated that in DMARD-naïve patients with eRA, baseline serum levels of the 12-biomarker MBDA score may predict those that are at low versus relatively higher risk of RP (0%, 3.4% and 21% RP among patients with low, moderate and high MBDA score, respectively). Our results also indicate that baseline MBDA scores discriminate risk for subsequent RP in SWEFOT more effectively than the baseline CRP or DAS28. Furthermore, MBDA score, both on a continuous and dichotomised (high vs low/moderate) scale, was found to be an independent predictor of RP after adjustments for other predictors in this study population.

Early identification of patients with RA whose condition is likely to have a good or poor response to the treatment is very important for the optimal choice of the therapy. However, good clinical response does not guarantee good radiographic outcome.38–42 Therefore, predictors of clinical and radiographic response are vital for patients' long-term outcome.

We evaluated baseline MBDA score as a predictor of 1 year RP, which was measured as the change in SHS. The definition of RP according to ΔSHS varies in different clinical studies. Van der Helm-van Mil et al23 used ΔSHS>3 as the main definition for progression, though ΔSHS>0 and ΔSHS>5 were also applied for comparison. Vastesaeger et al34 tested different


Figure 2 Cross tabulation of all analysed patients (N=235) and subset (n=43) with rapid radiographic progression (ΔSHS>5) over 1 year, by baseline disease activity measures. The denominator in each cell represents the number of patients cross classified by baseline MBDA score and DAS28-ESR (A), baseline MBDA score and DAS28-CRP (B) and baseline MBDA score and CRP (C) disease activity scores. The numerator in each cell represents the number of patients with radiographic progression at 1 year. (D) Radiographic progression for MBDA low, moderate and high score groups (%). Radiographic progression at 1 year is defined by increase in SHS>5 compared with baseline. CRP, C-reactive protein; DAS28-CRP, disease activity score based on C-reactive protein; DAS28-ESR, disease activity score based on erythrocyte sedimentation rate; MBDA, multi-biomarker disease activity; SHS, Sharp–van der Heijde score.

threshold values for RP from ΔSHS>0 to ΔSHS≥9 and found that ΔSHS≥5 was a suitable definition for RRP. Bruynesteyn et al35 showed that ΔSHS≥5 had 83% specificity for the smallest detectable difference in RP. Therefore we applied a threshold of ΔSHS>5 for RRP.

Previously, the MBDA score was evaluated and cut-offs were established for 'molecular remission' (≤25), low (<30), moderate (30–44) and high (>44) disease activity scores.22 23 As it was designed, the MBDA score is significantly associated with such disease activity measures as DAS28-CRP, DAS28-ESR, ESR, CRP, simple disease activity index and clinical disease activity index.22 25 26

The relationship between MBDA score and RP has been investigated in other settings. In the CAMERA trial, the MBDA score was predictive of RP with borderline significance after adjustment for rheumatoid factor and baseline erosions.25 In that study, baseline MBDA score was compared with RP over 2 years and ΔSHS>0 was used as the cut-off. Perhaps most importantly, the sample size (n=72) in that study was smaller. Van der Helm-van Mil et al23 showed that a greater proportion (93%) of patients with MBDA≤25 ('molecular remission') had no progression (SHS≤3) compared with patients in DAS28-CRP (<2.32; 80%) or American College of Rheumatology (ACR)/European League Against Rheumatism (EULAR) (28 tender joint count, 28 swollen joint count, patient's global assessment and CRP≤1; 83%) remission. Moreover, the difference in the proportion without progression was only significant (p=0.001) for remission vs non-remission groups based on the MBDA score (but not when based on the DAS28-CRP or ACR/EULAR definitions of remission). Furthermore, their study showed that patients with high MBDA score were at sixfold higher risk of RP (ΔSHS>3) than those in MBDA remission. Finally, the proportion with RP in the DAS28-CRP remission group who had a high MBDA score (47%) was twice as high as it was in all patients that met DAS28-CRP remission (20%) criteria.23 Although that study was based on the Leiden Early Arthritis Cohort, the samples were obtained at different time points during the disease course, while patients were already on established DMARD therapy, and it therefore conceptually addresses a different question compared with our study. Moreover, 20% of patients from the former study, compared with only 2% from our study cohort, had remission or low disease activity by MBDA, consistent with the fact that the former were on stable DMARD therapy while patients in SWEFOT were DMARD naïve at inclusion. However, one important finding in these two studies is similar, namely that the MBDA score is a stronger predictor of 1-year RP than DAS28-CRP.

In the current study, patients with low or moderate MBDA score (≤44) were shown to be at low risk of RP. Furthermore, when adjusted for commonly used markers and gender in bivariate and multivariate logistic regression analyses, the findings remained, indicating independence of the MBDA association with RP. We also demonstrated that MBDA score differentiated patients without progression from those with progression better than CRP. Sensitivity and specificity analysis revealed a strong negative prediction (radiographic non-progression). However, positive predictive value and specificity were very low, indicating that, though having relatively higher risk, the majority of patients with high MBDA score still did not progress radiographically over 1 year. These data suggest that baseline MBDA score might be used for identification of patients at


Table 3 Univariate, bivariate and multivariate analyses of baseline MBDA score, DAS28 and CRP as predictors of 1-year radiographic progression

Model | OR* | 95% CI | p Value†
Univariate analyses | | |
  Baseline MBDA score | 1.05 | (1.02 to 1.08) | <0.001
  Baseline DAS28-ESR | 1.31 | (0.94 to 1.81) | 0.107
  Baseline DAS28-CRP | 1.22 | (0.88 to 1.71) | 0.237
  Baseline CRP (mg/L) | 1.10 | (1.02 to 1.18) | 0.018
Bivariate models | | |
  Baseline MBDA adjusted for DAS28-ESR | 1.05 | (1.02 to 1.08) | <0.001
  Baseline MBDA adjusted for DAS28-CRP | 1.05 | (1.02 to 1.09) | <0.001
  Baseline MBDA adjusted for CRP | 1.06 | (1.02 to 1.10) | 0.002
  Baseline MBDA adjusted for ESR | 1.04 | (1.01 to 1.07) | 0.021
  Baseline MBDA adjusted for rheumatoid factor | 1.05 | (1.02 to 1.08) | <0.001
  Baseline MBDA adjusted for CCP status | 1.05 | (1.03 to 1.08) | <0.001
  Baseline MBDA adjusted for total swollen joint count | 1.05 | (1.02 to 1.08) | <0.001
  Baseline MBDA adjusted for total tender joint count | 1.05 | (1.02 to 1.08) | <0.001
  Baseline MBDA adjusted for global assessment of disease activity | 1.05 | (1.02 to 1.08) | <0.001
  Baseline MBDA adjusted for SHS | 1.05 | (1.02 to 1.08) | <0.001
  Baseline MBDA adjusted for symptom duration | 1.05 | (1.02 to 1.08) | <0.001
  Baseline MBDA adjusted for sex | 1.05 | (1.02 to 1.08) | <0.001
Multivariate model‡ | | |
  Baseline MBDA adjusted for sex, symptom duration, baseline erosions, current smoking status, HAQ score | 1.05 | (1.02 to 1.08) | <0.001
  High (>44) baseline MBDA score adjusted for sex, symptom duration, baseline erosions, current smoking status, HAQ score | 3.86 | (1.04 to 14.26) | 0.04

*The OR was estimated from a logistic regression model. The logistic model is estimating the probability of radiographic progression at 1 year. For the univariate model, the odds of radiographic progression increases by 5% for every 1-unit increase in the baseline MBDA score. When accounting for other disease activity measures individually (bivariate models), the odds of radiographic progression increase in a cumulative manner, approximately 4–6% for every 1-unit increase in the baseline MBDA score.
†p Value was calculated using Wald's χ2 test.
‡Multivariate model adjusted for significant univariate predictors of 1-year radiographic progression (n=207), as in Saevarsdottir et al.37
CCP, cyclic citrullinated peptide; CRP, C-reactive protein; DAS28-CRP, disease activity score based on C-reactive protein; DAS28-ESR, disease activity score based on erythrocyte sedimentation rate; HAQ, Health Assessment Questionnaire; MBDA, multi-biomarker disease activity; SHS, Sharp–van der Heijde score.
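A per-unit OR such as 1.05 can look small, so it may help to see how it compounds over a larger difference in MBDA score. In a logistic model the log-odds are linear in the predictor, which means an OR per 1-unit increase becomes OR^k for a k-unit increase. The sketch below applies that identity to the univariate estimate and its confidence limits; the function name is hypothetical, and because the published OR is rounded to two decimals the scaled figures are approximate.

```python
import math


def scale_odds_ratio(or_per_unit: float, units: float) -> float:
    """Odds ratio implied by a `units`-point increase in the predictor.

    Uses exp(units * ln(OR)), which equals OR ** units for a logistic
    model with a linear term for the predictor.
    """
    return math.exp(units * math.log(or_per_unit))


if __name__ == "__main__":
    # Univariate estimate from table 3: OR 1.05 (95% CI 1.02 to 1.08).
    for label, value in (("estimate", 1.05), ("lower", 1.02), ("upper", 1.08)):
        print(f"{label}: OR per 10 points = {scale_odds_ratio(value, 10):.2f}")
```

Read this way, a 10-point higher baseline MBDA score corresponds to roughly a 60% increase in the odds of rapid progression, with the usual caveat that linearity on the log-odds scale is a modelling assumption.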

lower risk of progression and will help in appropriate choice of therapy for patients with a high MBDA score and at risk of RP.

Our study has some limitations. Since patients with low DAS28 were not included in the SWEFOT trial, it was not possible to analyse the predictive value of the MBDA score in this patient group. Also it should be noted that as a randomised controlled trial, the SWEFOT study does not fully represent the RA population. However, it was designed to be as close to a real-life eRA population as possible, with the only major inclusion criterion being DAS28>3.2. The study is a post hoc analysis of the SWEFOT trial, which was designed primarily for comparison of biological and non-biological combination DMARD therapies. During the trial some patients had to switch from one drug to another for different reasons (lack of efficacy, side effects), and such switches could affect radiographic outcomes. However, any changes made in response to a lack of efficacy would most likely attenuate any true differences between the groups.

The strengths of this study were that it was based on a prospective, randomised trial, with a generous sample size, and that all the analyses presented here are based on the baseline clinical characteristics and the baseline MBDA score, information that could in principle be available to the clinician when making the first decision regarding therapy.

It will be important to study the predictive value of the MBDA score at additional time points for even longer follow-up times of clinical and radiographic data.

In conclusion, in DMARD-naïve patients with eRA, low/moderate MBDA score at baseline was shown to be associated with a very low risk of RP after 1 year. If confirmed in other studies, these data suggest that MBDA score can be useful in risk assessment for RP in eRA.

Author affiliations
1 Unit of Clinical Therapy Research, Inflammatory Diseases (ClinTRID), Karolinska Institute, Stockholm, Sweden
2 Crescendo Bioscience Inc., South San Francisco, California, USA
3 Rheumatology Unit, Department of Medicine, Karolinska University Hospital and Karolinska Institute, Stockholm, Sweden
4 Scott Cruickshank and Associates Inc., Santa Barbara, California, USA
5 Section of Rheumatology, Institution of Clinical Sciences, University Hospital, Lund, Sweden
6 Section of Rheumatology, Department of Medicine, Helsingborg Hospital, Helsingborg, Sweden
7 Department of Orthopaedics, Institution of Clinical Sciences, Lund University, Lund, Sweden

Correction notice This article has been corrected twice since it was published Online First. Minor changes have been made to the Results section of the Abstract. In addition, a minor correction has been made to a value in table 1.

Acknowledgements The SWEFOT Trial Investigators Group, besides the authors: Johan Bratt, Stockholm; Kristina Albertsson, Stockholm; Lars Cöster, Linköping; Eva Waltbrand, Borås; Agneta Zickert, Stockholm; Jan Theander, Kristianstad; Åke Thörner, Eskilstuna; Helena Hellström, Falun; Annika Teleman, Halmstad; Christina Dackhammar, Mölndal; Finn Akre, Örebro; Lotta Ljung, Umeå; Rolf Oding, Västerås; Katerina Chatzidionysiou, Stockholm; Margareta Wörnert, Stockholm. We would like to thank all participating patients and the study nurses, co-investigators and colleagues who made this trial possible.

Contributors KH had the lead role in management of the merged clinical and serological datasets and analyses thereof, discussions with collaborators, drafting of the manuscript and its final approval for submission. SS contributed to the management of the merged clinical and serological datasets and had a lead role in the statistical analyses thereof, participated in discussions with collaborators, critically

Hambardzumyan K, et al. Ann Rheum Dis 2015;74:1102–1109. doi:10.1136/annrheumdis-2013-204986 1107 Downloaded from http://ard.bmj.com/ on September 30, 2015 - Published by group.bmj.com Clinical and epidemiological research reviewed the manuscript and approved the final version for submission. RB, DC, ES prospective cohort of patients with early rheumatoid arthritis. Arthritis Res Ther and SC were responsible for the serum analyses and the MBDA score generation, 2006;8:R40. quality control of these data, and contributed to analysis and interpretations of these 12 Saag KG, Cerhan JR, Kolluri S, et al. Cigarette smoking and rheumatoid arthritis data. They all reviewed the manuscript critically and approved the final version for severity. Ann Rheum Dis 1997;56:463–9. submission. KF had the main responsibility of the radiological scoring in the SWEFOT 13 Masdottir B, Jonsson T, Manfredsdottir V, et al. Smoking, rheumatoid factor trial and supported the statistical analyses and interpretations thereof, participated in isotypes and severity of rheumatoid arthritis. Rheumatology (Oxford) discussions with collaborators, critically reviewed the manuscript and approved the 2000;39:1202–5. final version for submission. IP and PG were key investigators and steering committee 14 Visser K, Goekoop-Ruiterman YP, de Vries-Bouwstra JK, et al. A matrix risk model members in the SWEFOT trial. They supported the statistical analyses and for the prediction of rapid radiographic progression in patients with rheumatoid interpretations of the data in the current study, participated in discussions with arthritis receiving different dynamic treatment strategies: post hoc analyses from the collaborators, critically reviewed the manuscript and approved the final version for BeSt study. Ann Rheum Dis 2010;69:1333–7. submission. RvV was an investigator in the SWEFOT trial and responsible investigator 15 Bakker MF, Verstappen SM, Welsing PM, et al. 
The relation between cartilage of the study reported here. He designed the study and organised the collaboration biomarkers (C2C, C1, 2C, CS846, and CPII) and the long-term outcome of with Crescendo Biosciences Inc for the patients’ serum analyses and the MBDA score rheumatoid arthritis patients within the CAMERA trial. Arthritis Res Ther 2011;13: generation. He also discussed and reviewed the manuscript critically and gave his final R70. approval for submission of the manuscript. 16 Niki Y, Takeuchi T, Nakayama M, et al. Clinical significance of cartilage biomarkers for monitoring structural joint damage in rheumatoid arthritis patients treated with Funding The study was supported in part by a grant from the Swedish Rheumatism anti-TNF therapy. PloS One 2012;7:e37447. Association. Some of the authors were supported by clinical research funds from 17 Andersson ML, Svensson B, Petersson IF, et al. Early increase in serum-COMP is Stockholm County (ALF funds). An annual unrestricted grant was provided by associated with joint damage progression over the first five years in patients with Schering-Plough Sweden that was used to support a study coordinator and a rheumatoid arthritis. BMC Musculoskelet Disord 2013;14:229. medical monitor for the original clinical trial. The analyses of the MBDA score were 18 Rho YH, Solus J, Sokka T, et al. Adipocytokines are associated with radiographic done by Crescendo Bioscience Inc (South San Francisco, California, USA), at no cost joint damage in rheumatoid arthritis. Arthritis Rheum 2009;60: to the investigators. No financial support was provided by Crescendo Bioscience Inc 1906–14. or other companies for this study. 19 Syversen SW, Goll GL, Haavardsholm EA, et al. A high serum level of eotaxin Competing interests RvV received research support and/or honoraria from (CCL 11) is associated with less radiographic progression in early rheumatoid AbbVie, Biotest, BMS, GSK, Lilly, Merck, Pfizer, Roche, UCB, Vertex. 
IP received arthritis patients. Arthritis Res Ther 2008;10:R28. research funding and/or honoraria from AbbVie, UCB Pharma and Pfizer, not related 20 Ortea I, Roschitzki B, Ovalles JG, et al. Discovery of serum proteomic biomarkers to this study. ES is an employee of and holds stock options from Crescendo for prediction of response to infliximab (a monoclonal anti-TNF antibody) treatment Bioscienc Inc. DC receives consulting fees and stock options from Crescendo in rheumatoid arthritis: an exploratory analysis. J Proteomics 2012;77: Bioscience Inc. KF has received honoraria from AbbVie and BMS, not related to this 372–82. study. RB is an employee of and receives stock options form Crescendo Bioscience 21 Wagner C, Chen D, Fan H, et al. Evaluation of serum biomarkers associated with Inc. SC is an independent contractor to Crescendo Bioscience In and is paid on an radiographic progression in methotrexate-naive rheumatoid arthritis patients treated hourly basis. KH, SS and PG: none declared. with methotrexate or golimumab. J Rheumatol 2013;40:590–8. Ethics approval This study was approved by the regional ethics committees of all 22 Curtis JR, van der Helm-van Mil AH, Knevel R, et al. Validation of a novel participating units. multibiomarker test to assess rheumatoid arthritis disease activity. Arthritis Care Res 2012;64:1794–803. Provenance and peer review Not commissioned; externally peer reviewed. 23 van der Helm-van Mil AH, Knevel R, Cavet G, et al. An evaluation of molecular and Open Access This is an Open Access article distributed in accordance with the clinical remission in rheumatoid arthritis by assessing radiographic progression. Creative Commons Attribution Non Commercial (CC BY-NC 3.0) license, which Rheumatology (Oxford) 2013;52:839–46. permits others to distribute, remix, adapt, build upon this work non-commercially, 24 Centola M, Cavet G, Shen Y, et al. 
Development of a multi-biomarker disease and license their derivative works on different terms, provided the original work is activity test for rheumatoid arthritis. PloS One 2013;8:e60635. properly cited and the use is non-commercial. See: http://creativecommons.org/ 25 Bakker MF, Cavet G, Jacobs JW, et al. Performance of a multi-biomarker score licenses/by-nc/3.0/ measuring rheumatoid arthritis disease activity in the CAMERA tight control study. Ann Rheum Dis 2012;71:1692–7. 26 Hirata S, Dirven L, Shen Y, et al. A multi-biomarker score measures rheumatoid arthritis disease activity in the BeSt study. Rheumatology (Oxford) 2013;52: REFERENCES 1202–7. 1 Mclnnes IB, Schett G. The pathogenesis of rheumatoid arthritis. N Engl J Med 27 van Vollenhoven RF, Ernestam S, Geborek P, et al. Addition of infliximab compared 2011;365:2205–19. with addition of sulfasalazine and hydroxychloroquine to methotrexate in patients 2 Uhlig T, Kvien TK. Is rheumatoid arthritis really getting less severe? Nat Rev with early rheumatoid arthritis (SWEFOT trial): 1-year results of a randomised trial. Rheumatol 2009;5:461–4. Lancet 2009;374:459–66. 3 Liao KP, Weinblatt ME, Cui J, et al. Clinical predictors of erosion-free status in 28 van Vollenhoven RF, Geborek P, Forslind K, et al. Conventional combination rheumatoid arthritis: a prospective cohort study. Rheumatology (Oxford) treatment versus biological treatment in methotrexate-refractory early rheumatoid 2011;50:1473–9. arthritis: 2 year follow-up of the randomised, non-blinded, parallel-group SWEFOT 4 Smolen JS, Van Der Heijde DM, St Clair EW, et al. Predictors of joint damage in trial. Lancet 2012;379:1712–20. patients with early rheumatoid arthritis treated with high-dose methotrexate with or 29 Prevoo ML, van’t Hof MA, Kuper HH, et al. Modified disease activity scores that without concomitant infliximab: results from the ASPIRE trial. Arthritis Rheum include twenty-eight-joint counts. 
Development and validation in a prospective 2006;54:702–10. longitudinal study of patients with rheumatoid arthritis. Arthritis Rheum 5 Garnero P, Landewe R, Boers M, et al. Association of baseline levels of markers of 1995;38:44–8. bone and cartilage degradation with long-term progression of joint damage in 30 Van Gestel AM, Prevoo ML, van‘t Hof MA, et al. Development and validation of the patients with early rheumatoid arthritis—the COBRA Study. Arthritis Rheum European League Against Rheumatism response criteria for rheumatoid arthritis. 2002;46:2847–56. Comparison with the preliminary American College of Rheumatology and the World 6 Quinn MA, Gough AKS, Green MJ, et al. Anti-CCP antibodies measured at disease Health Organization/International League Against Rheumatism Criteria. Arthritis onset help identify seronegative rheumatoid arthritis and predict radiological and Rheum 1996;39:34–40. functional outcome. Rheumatology (Oxford) 2006;45:478–80. 31 Takahashi N, Kojima T, Kaneko A, et al. Clinical efficacy of abatacept compared to 7 Goronzy JJ, Matteson EL, Fulbright JW, et al. Prognostic markers of radiographic adalimumab and tocilizumab in rheumatoid arthritis patients with high disease progression in early rheumatoid arthritis. Arthritis Rheum 2004;50:43–54. activity. Clin Rheumatol 2013;33:39–47. 8 Avouac J, Gossec L, Dougados M. Diagnostic and predictive value of anti-cyclic 32 Inoue E, Yamanaka H, Hara M, et al. Comparison of Disease Activity Score (DAS) citrullinated protein antibodies in rheumatoid arthritis: a systematic literature review. 28- erythrocyte sedimentation rate and DAS28- C-reactive protein threshold values. Ann Rheum Dis 2006;65:845–51. Ann Rheum Dis 2007;66:407–9. 9 Nell VPK, Machold KP, Stamm TA, et al. Autoantibody profiling as early diagnostic 33 van der Heijde D. How to read radiographs according to the Sharp/van der Heijde and prognostic tool for rheumatoid arthritis. Ann Rheum Dis 2005;64:1731–6. method. J Rheumatol 2000;27:261–3. 
10 Forslind K, Ahlmen M, Eberhardt K, et al. Prediction of radiological outcome in 34 Vastesaeger N, Xu S, Aletaha D, et al. A pilot risk model for the prediction of rapid early rheumatoid arthritis in clinical practice: role of antibodies to citrullinated radiographic progression in rheumatoid arthritis. Rheumatology (Oxford) peptides (anti-CCP). Ann Rheum Dis 2004;63:1090–5. 2009;48:1114–21. 11 Meyer O, Nicaise-Roland P, Santos MD, et al. Serial determination of cyclic 35 Bruynesteyn K, van der Heijde D, Boers M, et al. Determination of the minimal citrullinated peptide autoantibodies predicted five-year radiological outcomes in a clinically important difference in rheumatoid arthritis joint damage of the Sharp/van

1108 Hambardzumyan K, et al. Ann Rheum Dis 2015;74:1102–1109. doi:10.1136/annrheumdis-2013-204986 Downloaded from http://ard.bmj.com/ on September 30, 2015 - Published by group.bmj.com Clinical and epidemiological research

der Heijde and Larsen/Scott scoring methods by clinical experts and comparison radiological progression is not fully prevented: data from the methotrexate responders with the smallest detectable difference. Arthritis Rheum 2002;46:913–20. population in the SWEFOT trial. Ann Rheum Dis 2012;71:186–91. 36 Eastman PS, Manning WC, Qureshi F, et al. Characterization of a multiplex, 40 Landewe R, Geusens P, Boers M, et al. Markers for type II collagen breakdown 12-biomarker test for rheumatoid arthritis. J Pharm Biome Anal 2012;70: predict the effect of disease-modifying treatment on long-term radiographic 415–24. progression in patients with rheumatoid arthritis. Arthritis Rheum 2004;50: 37 Saevarsdottir S, Rezaei H, Geborek P, et al. Current smoking status is a strong predictor 1390–9. of RP in early rheumatoid arthritis: results from the SWEFOT trial. Ann Rheum Dis 41 Lillegraven S, Prince FH, Shadick NA, et al. Remission and radiographic outcome in Published Online First: 15 March 2014. doi:10.1136/annrheumdis2013-204601. rheumatoid arthritis: application of the 2011 ACR/EULAR remission criteria in an 38 Molenaar ET, Voskuyl AE, Dinant HJ, et al. Progression of radiologic damage in observational cohort. Ann Rheum Dis 2012;71:681–6. patients with rheumatoid arthritis in clinical remission. Arthritis Rheum 42 Klarenbeek NB, Koevoets R, van der Heijde DM, et al. Association with joint 2004;50:36–42. damage and physical functioning of nine composite indices and the 2011 ACR/ 39 Rezaei H, Saevarsdottir S, Forslind K, et al. In early rheumatoid arthritis, patients with EULAR remission criteria in rheumatoid arthritis. Ann Rheum Dis 2011;70: a good initial response to methotrexate have excellent 2-year clinical outcomes, but 1815–21.


Pretreatment multi-biomarker disease activity score and radiographic progression in early RA: results from the SWEFOT trial Karen Hambardzumyan, Rebecca Bolce, Saedis Saevarsdottir, Scott E Cruickshank, Eric H Sasso, David Chernoff, Kristina Forslind, Ingemar F Petersson, Pierre Geborek and Ronald F van Vollenhoven

Ann Rheum Dis 2015 74: 1102-1109 originally published online May 8, 2014 doi: 10.1136/annrheumdis-2013-204986

Updated information and services can be found at: http://ard.bmj.com/content/74/6/1102

These include:

Supplementary Material: Supplementary material can be found at: http://ard.bmj.com/content/suppl/2014/05/02/annrheumdis-2013-204986.DC1.html

References: This article cites 41 articles, 17 of which you can access for free at: http://ard.bmj.com/content/74/6/1102#BIBL

Open Access: This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 3.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/3.0/


NIH Public Access Author Manuscript Ann Intern Med. Author manuscript; available in PMC 2013 September 30.

Published in final edited form as: Ann Intern Med. 2010 October 5; 153(7): 425–434. doi:10.7326/0003-4819-153-7-201010050-00005.

Multi-Center Validation of the Diagnostic Accuracy of a Blood-based Gene Expression Test for Assessment of Obstructive Coronary Artery Disease in Non-Diabetic Patients

Steven Rosenberg, PhD, CardioDx, Inc., 2500 Faber Place, Palo Alto, CA 94303
Michael R. Elashoff, PhD, CardioDx, Inc., 2500 Faber Place, Palo Alto, CA 94303
Philip Beineke, CardioDx, Inc., 2500 Faber Place, Palo Alto, CA 94303
Susan E. Daniels, PhD, CardioDx, Inc., 2500 Faber Place, Palo Alto, CA 94303
James A. Wingrove, PhD, CardioDx, Inc., 2500 Faber Place, Palo Alto, CA 94303
Whittemore G. Tingley, MD PhD1, Division of Cardiology, University of California, San Francisco, San Francisco, CA 94143
Philip T. Sager, MD2, Gilead Sciences, Inc. 3172 Porter Drive, Palo Alto, CA 94304
William E. Kraus, MD, Duke Center for Living, 3475 Erwin Road, Box 3022, Rm. 254, Aesthetics Building, Durham, NC 27705
L. Kristin Newby, MD, Duke Clinical Research Institute, P.O. Box 17969, Durham, NC 27715-7969
Robert S. Schwartz, MD, Minneapolis Heart Institute Foundation, Abbott Northwestern Hospital, 920 E. 28th Street, Suite 620, Minneapolis, MN 55407

*Reprint requests should be addressed to: Steven Rosenberg, CardioDx, Inc., 2500 Faber Place, Palo Alto, CA 94303.
1 Present address: Division of Cardiology, UCSF, San Francisco, CA 94143
2 Present address: Gilead Sciences, Inc. 3172 Porter Drive, Palo Alto, CA 94304.
3 PREDICT, Personalized Risk Evaluation and Diagnosis in the Coronary Tree, www.clinicaltrials.gov, NCT00500617, see Appendix 1 for PREDICT investigators list

“This is the prepublication, author-produced version of a manuscript accepted for publication in Annals of Internal Medicine. This version does not include post-acceptance editing and formatting. The American College of Physicians, the publisher of Annals of Internal Medicine, is not responsible for the content or presentation of the author-produced accepted version of the manuscript or any version that a third party derives from it. Readers who wish to access the definitive published version of this manuscript and any ancillary material related to this manuscript (e.g., correspondence, corrections, editorials, linked articles) should go to www.annals.org or to the print issue in which the article appears. Those who cite this manuscript should cite the published version, as it is the official version of record.”

RSS, SZ, RW, JM, and NT report no conflicts of interest with respect to the contents of this manuscript.

Szilard Voros, MD, Piedmont Heart Institute, 95 Collier Road, NW Suite 2035, Atlanta, GA 30309
Stephen Ellis, MD, The Cleveland Clinic, 9500 Euclid Avenue, F25, Cleveland, OH 44195
Naeem Tahirkheli, MD, 4221 S. Western, Suite 4000, Oklahoma City, OK 73109
Ron Waksman, MD, Cardiovascular Research Institute, Medstar Research Institute, Washington Hospital Center, 110 Irving St. NW, Suite 6B-5, Washington, DC 20010
John McPherson, MD, 1215 21st Ave S. MCE 5th Fl S. Tower, Nashville, TN 37232
Alexandra Lansky, MD, Cardiovascular Research Foundation, 111 East 59th Street, New York, NY 10022-1122
Nicholas J. Schork, PhD, Scripps Translational Science Institute, 3344 North Torrey Pines Court, Suite 300, La Jolla, CA 92037
Mary E. Winn, Scripps Translational Science Institute, 3344 North Torrey Pines Court, Suite 300, La Jolla, CA 92037
and Eric J. Topol, MD*, Scripps Translational Science Institute, 3344 North Torrey Pines Court, Suite 300, La Jolla, CA 92037
for PREDICT Investigators3

Abstract

Background: Diagnosis of significant coronary artery disease (CAD) in at risk patients can be challenging, typically including non-invasive imaging modalities and ultimately the gold standard of coronary angiography. Previous studies suggested that peripheral blood gene expression can reflect the presence of CAD.

Objective: To validate a previously developed 23-gene expression-based classifier for diagnosis of obstructive CAD in non-diabetic patients.

Design: Multi-center prospective trial with blood samples drawn prior to coronary angiography.

Setting: Thirty-nine US centers.

Patients: An independent validation cohort of 526 non-diabetic patients clinically indicated for coronary angiography.

Intervention: None.

Measurements: Receiver-operator characteristics (ROC) analysis of classifier score measured by real-time polymerase chain reaction (RT-PCR), additivity to clinical factors, and reclassification of patient disease likelihood vs disease status defined by quantitative coronary angiography (QCA). Obstructive CAD defined as ≥50% stenosis in ≥1 major coronary artery by QCA.

Results: The overall ROC curve area (AUC) was 0.70 ±0.02 (p<0.001); the classifier added to clinical variables (Diamond-Forrester method) (AUC 0.72 with classifier vs 0.66 without, p = 0.003). Net reclassification was improved by the classifier over Diamond-Forrester and an expanded clinical model (both p<0.001). At a score threshold corresponding to 20% obstructive CAD likelihood (14.75), the sensitivity and specificity were 85% and 43%, yielding NPV of 83% and PPV of 46%, with 33% of patient scores below this threshold.

Limitations: The study excluded patients with chronic inflammatory disorders, elevated white blood counts or cardiac protein markers, and diabetes.

Conclusions: This non-invasive whole blood test, based on gene expression and demographics, may be useful for assessment of obstructive CAD in non-diabetic patients without known CAD.

Primary Funding Source: CardioDx, Inc.

Chronic coronary artery disease (CAD), including chronic stable angina, afflicts 16.5 million patients in the United States, with approximately 500,000 new patients diagnosed annually (1). Substantially more patients are evaluated for chest pain or other symptoms suggestive of CAD, but only a minority are ultimately diagnosed with CAD (2–4). Clinical evaluation of patients with suspected CAD is variable and includes diagnostic tests of varied accuracy, reproducibility, ease of use and potential for patient morbidity (5). Many patients undergoing invasive diagnostic coronary angiography do not have obstructive CAD, despite widespread availability of non-invasive diagnostic modalities (6).

No simple blood-based biomarker has been validated for diagnosis of obstructive CAD. Biomarkers such as C-reactive protein (CRP) have been associated with future cardiovascular event risk (7, 8), but there is no well-defined role for biomarkers in current assessment of patients with symptoms suggestive of CAD (9). We recently identified differential blood cell gene expression levels in patients with CAD (10), suggesting that CAD detection from a peripheral blood sample might be possible. The PREDICT multi-center study was designed to develop and validate blood-based gene expression tests for CAD, enrolling both diabetic and non-diabetic patients clinically indicated for invasive angiography. Differences in plaque morphology have been observed for CAD patients with and without diabetes (11, 12), and these differences were also reflected at the level of gene expression (Elashoff et al., submitted). Thus, we have derived an algorithm specifically relating non-diabetic patient CAD status to expression levels of 23 genes and sex-specific age functions (Elashoff et al., submitted).

Herein we report initial prospective validation of this gene expression algorithm for likelihood of obstructive CAD, defined as one or more coronary atherosclerotic lesions causing ≥50% luminal diameter stenosis, in non-diabetic patients with suspected CAD.

Methods

General Study Design and Study Population

Subjects were enrolled in PREDICT, a 39-center prospective study, between July 2007 and April 2009. The study was approved by institutional review boards at all centers and all patients gave written informed consent. Subjects referred for diagnostic coronary angiography were eligible with a history of chest pain, suspected anginal-equivalent symptoms, or a high risk of CAD, and no known prior myocardial infarction (MI), revascularization, or obstructive CAD. Subjects were ineligible if, at catheterization, they had acute MI, high risk unstable angina, severe non-coronary heart disease (congestive heart failure, cardiomyopathy or valve disease), systemic infectious or inflammatory conditions, or were taking immunosuppressive or chemotherapeutic agents. Detailed eligibility criteria are in Appendix 2.


From 2186 enrolled subjects who met inclusion criteria, 606 diabetic patients were excluded, as this initial algorithm development and validation was focused on non-diabetics. The limitation to non-diabetic patients was based on the significant differences observed in CAD classifier gene sets dependent on diabetic status (Elashoff et al., submitted). Of the remaining 1580 patients, 5 had angiographic images unsuitable for QCA and 6 had unusable blood samples. Of the remaining 1569 subjects, 226 were used in gene discovery; the other 1343 were divided sequentially, based on date of enrollment, into independent algorithm development and validation cohorts (Figure 1).

Clinical Evaluation and Quantitative Coronary Angiography

Pre-specified clinical data, including demographics, medications, clinical history and presentation, and myocardial perfusion imaging results, were obtained by research study coordinators using standardized data collection methods, and the data were verified by independent study monitors.

Coronary angiograms were analyzed by computer-assisted QCA. Specifically, clinically indicated coronary angiograms performed according to site protocols were digitized, de-identified and analyzed with a validated quantitative protocol at Cardiovascular Research Foundation, New York, NY (13). Trained technicians, blinded to clinical and gene expression data, visually identified all lesions >10% diameter stenosis (DS) in vessels with diameter >1.5mm. Using the CMS Medis system (Medis, version 7.1, Leiden, the Netherlands), technicians traced the vessel lumen across the lesion between the nearest proximal and distal non-diseased locations. The minimal lumen diameter (MLD), reference lumen diameter (RLD = average diameter of normal segments proximal and distal of lesion) and %DS (%DS = (1 - MLD/RLD) x 100) were then calculated.
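The %DS definition above can be sketched in a few lines. This is only an illustration of the stated formula, not the CMS Medis implementation; the function names and the example diameters are hypothetical.

```python
def reference_lumen_diameter(proximal_mm: float, distal_mm: float) -> float:
    """RLD as the average of the normal segments proximal and distal to the lesion."""
    return (proximal_mm + distal_mm) / 2.0


def percent_diameter_stenosis(mld_mm: float, rld_mm: float) -> float:
    """%DS = (1 - MLD/RLD) x 100, following the definition in the text."""
    if rld_mm <= 0:
        raise ValueError("reference lumen diameter must be positive")
    if mld_mm < 0:
        raise ValueError("minimal lumen diameter cannot be negative")
    return (1.0 - mld_mm / rld_mm) * 100.0


# Hypothetical measurements (mm), chosen only to illustrate the arithmetic.
rld = reference_lumen_diameter(proximal_mm=3.5, distal_mm=2.5)  # 3.0 mm
ds = percent_diameter_stenosis(mld_mm=1.5, rld_mm=rld)          # 50.0 %
print(f"RLD = {rld:.1f} mm, %DS = {ds:.1f}")
```

With these made-up values the lesion sits exactly at the 50% threshold used in the study to define obstructive CAD.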

The Diamond-Forrester (D–F) risk score, comprising age, sex, and chest pain type, was prospectively chosen to evaluate the added value of the gene expression score to clinical factors (14). D–F classifications of chest pain type (typical angina, atypical angina and non-anginal chest pain) were assigned based on subject interviews as described (Appendix 2) (14), and D–F scores assigned (15). For this classification, subjects without chest pain symptoms were classified as non-anginal chest pain. Myocardial perfusion imaging was performed as clinically indicated, with local protocols, and interpreted by local readers with access to clinical data but not gene expression or catheterization data. Imaging results were defined as positive if ≥1 reversible or fixed defect consistent with obstructive CAD was reported. Indeterminate or intermediate defects were considered negative.

Obstructive CAD and Disease Group Definitions

Patients with obstructive CAD (N=192) were defined prospectively as subjects with ≥1 atherosclerotic plaque in a major coronary artery (≥1.5mm lumen diameter) causing ≥50% luminal diameter stenosis by QCA; non-obstructive CAD (N=334) had no lesions >50%.
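As a consistency check, the group sizes above (192 obstructive, 334 non-obstructive) can be combined with the sensitivity and specificity reported in the abstract (85% and 43%) to recover the predictive values by Bayes' rule. The sketch below is illustrative only; the function name is hypothetical, and because the inputs are rounded percentages the results are approximate.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV, fraction testing negative) from test characteristics."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    fraction_negative = true_neg + false_neg  # scores below the threshold
    return ppv, npv, fraction_negative


# Prevalence of obstructive CAD in the validation cohort, from the group sizes.
prevalence = 192 / (192 + 334)

ppv, npv, frac_below = predictive_values(0.85, 0.43, prevalence)
print(f"PPV={ppv:.0%} NPV={npv:.0%} below threshold={frac_below:.0%}")
# prints: PPV=46% NPV=83% below threshold=33%
```

The values agree with the abstract (PPV 46%, NPV 83%, 33% of scores below the threshold), which suggests the reported figures are internally consistent.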

Blood Samples

Prior to coronary angiography, venous blood samples were collected in PAXgene® RNA-preservation tubes. Samples were treated according to manufacturer’s instructions, then frozen at −20°C.

RNA Purification and RT-PCR

Automated RNA purification from whole blood samples using the Agencourt RNAdvance system, cDNA synthesis, and RT-PCR were performed as described (Elashoff et al., submitted). All PCR reactions were run in triplicate and median values used for analysis. Genomic DNA contamination was detected by comparison of expression values for splice-junction spanning and intronic ADORA3 assays normalized to values of TFCP2 and HNRPF. The RPS4Y1 assay was run as confirmation of sex for all patients; patients were excluded if there was an apparent mismatch with clinical data. Sample QC metrics and pass-fail criteria were pre-defined and applied prior to evaluation of results as described (Elashoff et al., submitted).

Statistical Methods

The analyses for comparison of demographic and clinical factors (Table 1) used SAS Version 9.1 (SAS Institute Inc, Cary, NC, USA). All other analysis was performed using R Version 2.7 (R Foundation for Statistical Computing, Vienna, Austria). Unless otherwise specified, univariate comparisons for continuous variables were done by t-test and categorical variables by Chi-square test. All reported p-values are two-sided.

Gene Expression Algorithm Score

The gene expression algorithm was developed with obstructive CAD defined by QCA as ≥50% stenosis in ≥1 major coronary artery, corresponding approximately to 65–70% stenosis based on clinical angiographic read. The algorithm was locked prior to the validation study. Raw algorithm scores were computed from median expression values for the 23 algorithm genes, age and sex as described (Appendix 3) and used in all statistical analyses; scores were linearly transformed to a 0–40 scale for ease of reporting.

ROC Estimation and AUC Comparisons

The prospectively defined primary endpoint was the ROC curve area for algorithm score prediction of disease status. ROC curves were estimated for a) the gene expression algorithm score, b) the D–F risk score, c) a combined model of algorithm score and D–F risk score, d) myocardial perfusion imaging, and e) a combined model of algorithm score and imaging. Standard methods (16) were used to estimate empirical ROC curves and associated AUCs and AUC standard errors. The Z-test was used to test AUCs versus random (AUC = 0.50).
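The AUC estimation and Z-test described above can be sketched as follows. The text cites "standard methods (16)" without detailing them here, so this example makes two assumptions: the empirical AUC is computed as the Mann-Whitney statistic, and the standard error uses the widely known Hanley-McNeil approximation. The function names and the toy scores are hypothetical.

```python
import math


def empirical_auc(scores_pos, scores_neg):
    """Mann-Whitney estimate: P(pos score > neg score) + 0.5 * P(tie)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def hanley_mcneil_se(auc, n_pos, n_neg):
    """Approximate AUC standard error (assumed method, not confirmed by the text)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(variance)


def z_versus_chance(auc, se):
    """Z statistic for the null hypothesis AUC = 0.50."""
    return (auc - 0.5) / se


# Toy scores, invented purely to exercise empirical_auc.
toy_auc = empirical_auc([22, 30, 18, 27], [10, 19, 25, 8, 14])  # 17/20 = 0.85

# Plugging in the reported AUC and group sizes (192 obstructive, 334 not).
se = hanley_mcneil_se(0.70, n_pos=192, n_neg=334)  # roughly 0.024
z = z_versus_chance(0.70, se)
print(f"toy AUC={toy_auc:.2f}, SE={se:.3f}, Z={z:.1f}")
```

Under these assumptions the standard error for the reported AUC of 0.70 comes out near 0.024, which is in line with the ±0.02 quoted in the abstract.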

Paired AUC comparisons were performed by bootstrap for i) gene expression algorithm score plus D–F risk score vs D–F risk score alone, and ii) gene expression algorithm score plus myocardial perfusion imaging vs imaging alone. For each comparison, 10,000 bootstrap iterations were run, and observed AUC differences computed. The median bootstrapped AUC difference was used to estimate the AUC difference, and the p-value estimated using the empirical distribution of bootstrapped AUC differences (i.e. the observed quantile for 0 AUC difference in the empirical distribution).

Logistic Regression

A series of logistic regression models were fit with disease status as the binary dependent variable, and compared using a likelihood ratio test between nested models. Comparisons were: i) gene expression algorithm score plus D–F risk score versus D–F risk score alone; ii) gene expression algorithm score plus myocardial perfusion imaging versus imaging alone; iii) gene expression algorithm score versus the demographic component of the gene expression algorithm score; iv) algorithm score plus expanded clinical model vs expanded clinical model alone.
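The bootstrap comparison of paired AUCs can be sketched as below. This is a minimal illustration under stated assumptions, not the study's R code: subjects are resampled with replacement as whole records (so both scores stay paired), degenerate resamples containing only one disease class are redrawn, and the toy data are invented. The example runs 2,000 iterations to stay quick; the study used 10,000.

```python
import random
import statistics


def auc(scores, labels):
    """Empirical (Mann-Whitney) AUC for binary labels coded 1/0."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_difference(scores_new, scores_ref, labels, n_boot=2000, seed=0):
    """Return (median bootstrapped AUC difference, quantile of 0 in that distribution)."""
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample subjects
        y = [labels[i] for i in idx]
        if sum(y) in (0, n):
            continue  # assumption: redraw resamples lacking one class
        a_new = auc([scores_new[i] for i in idx], y)
        a_ref = auc([scores_ref[i] for i in idx], y)
        diffs.append(a_new - a_ref)
    median_diff = statistics.median(diffs)
    quantile_at_zero = sum(d <= 0 for d in diffs) / n_boot
    return median_diff, quantile_at_zero


# Invented toy data: 5 diseased (1) and 5 non-diseased (0) subjects.
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
reference_scores = [3, 5, 1, 6, 2, 4, 2.5, 1.5, 5.5, 0.5]  # AUC 0.60
combined_scores = [7, 9, 6, 10, 8, 4, 3, 2, 5, 1]          # AUC 1.00

median_diff, q0 = bootstrap_auc_difference(combined_scores, reference_scores, labels)
print(f"median AUC difference = {median_diff:.2f}, quantile at 0 = {q0:.4f}")
```

The quantile at zero is a one-sided quantity; how it was converted to the two-sided p-values reported in the paper is not specified in this section.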

Correlation of Algorithm Score with Maximum Percent Stenosis

The correlation between algorithm score and percent maximum stenosis as continuous variables was assessed by linear regression. Stenosis values were grouped into five increasing categories (no measurable disease, 1–24%, 25–49% in ≥1 vessel, 1 vessel ≥50%, and >1 vessel ≥50%) and ANOVA was used to test for a linear trend in algorithm score across categories.

Expanded Clinical Model

An expanded clinical factor model was developed that incorporated the 11 clinical factors that showed univariate significance (p<0.05) between obstructive CAD and no obstructive CAD patients in the development set (sex, age, chest pain type, race, statin, aspirin, anti-platelet, and ACE inhibitor use, systolic blood pressure, hypertension, and dyslipidemia). A logistic regression model was fit using disease status as the dependent variable and these 11 clinical factors as predictor variables. A subject’s ‘expanded clinical model score’ was the subject’s predicted value from this model.
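To make the idea of a model-based score concrete, here is a minimal sketch of fitting a logistic regression and reading off each subject's predicted value. It is not the study's model: the fitting routine (plain gradient ascent), the function names, and the single binary predictor in the toy data are all assumptions for illustration, whereas the expanded clinical model used 11 clinical factors.

```python
import math


def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, learning_rate=0.1, n_iter=5000):
    """Fit logistic regression by gradient ascent on the log-likelihood.
    X holds predictor rows without an intercept; weights[0] is the intercept."""
    n, k = len(y), len(X[0])
    weights = [0.0] * (k + 1)
    for _ in range(n_iter):
        grad = [0.0] * (k + 1)
        for row, outcome in zip(X, y):
            z = weights[0] + sum(w * x for w, x in zip(weights[1:], row))
            err = outcome - _sigmoid(z)
            grad[0] += err
            for j, x in enumerate(row):
                grad[j + 1] += err * x
        weights = [w + learning_rate * g / n for w, g in zip(weights, grad)]
    return weights


def predicted_value(weights, row):
    """Predicted probability of disease for one subject (the model 'score')."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], row))
    return _sigmoid(z)


# Hypothetical toy data: one binary risk factor, six subjects.
X = [[0], [0], [0], [1], [1], [1]]
y = [0, 0, 1, 0, 1, 1]
weights = fit_logistic(X, y)
score_without_factor = predicted_value(weights, [0])
score_with_factor = predicted_value(weights, [1])
```

With a single binary predictor the fitted probabilities reproduce the observed disease fractions in each group (1/3 and 2/3 in this toy example), which is a convenient sanity check on the fit.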

Reclassification of Disease Status

Gene expression algorithm score and D–F risk scores were defined as low (0% to <20%), intermediate (≥20% to <50%), and high risk (≥50%) obstructive CAD likelihoods. Myocardial perfusion imaging results were classified as negative (no defect/possible fixed or reversible defect) or positive (fixed or reversible defect). For the D–F risk score analysis, a reclassified subject was defined as i) D–F intermediate risk to low or high algorithm score, ii) D–F high risk to algorithm low, or iii) D–F low risk to algorithm high. For the myocardial perfusion imaging analysis, a reclassified subject included i) imaging positive to algorithm score low risk, or ii) imaging negative to algorithm score high risk.

Net reclassification improvement of the gene expression algorithm score (and associated p-value) compared to the D–F risk score, expanded clinical model, or myocardial perfusion imaging result was computed as described in Supplementary methods, with the definition of reclassifications shown above (17). Net reclassification improvement is a measure of reclassification clinical benefit, and is sensitive to both the fraction and accuracy of reclassification. Conceptually, it is the difference between a) the fraction of subjects who are reclassified correctly from an incorrect initial classification, and b) the fraction of subjects who are reclassified incorrectly from a correct initial classification.
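The reclassification rules for the D–F comparison and the resulting net reclassification improvement can be sketched as below. This assumes the standard categorical form from reference 17 (upward moves count as correct in subjects with disease, downward moves as correct in subjects without); the Supplementary methods are not reproduced here, so the function names, the 0/1/2 category coding, and the toy data are hypothetical.

```python
def is_reclassified(old, new):
    """Reclassification as defined in the text for the D-F comparison,
    with 0 = low, 1 = intermediate, 2 = high risk: intermediate to low or
    high, high to low, or low to high. Moves into intermediate do not count."""
    return (old == 1 and new != 1) or (old == 2 and new == 0) or (old == 0 and new == 2)


def net_reclassification_improvement(old_cat, new_cat, diseased):
    """Return (component with disease, component without disease, total NRI)."""
    up = {True: 0, False: 0}
    down = {True: 0, False: 0}
    total = {True: 0, False: 0}
    for old, new, d in zip(old_cat, new_cat, diseased):
        d = bool(d)
        total[d] += 1
        if is_reclassified(old, new):
            if new > old:
                up[d] += 1
            else:
                down[d] += 1
    with_disease = (up[True] - down[True]) / total[True]
    without_disease = (down[False] - up[False]) / total[False]
    return with_disease, without_disease, with_disease + without_disease


# Hypothetical toy data: eight subjects, four with obstructive CAD.
old_cat = [1, 1, 0, 2, 1, 2, 0, 1]
new_cat = [2, 2, 2, 0, 0, 1, 1, 1]
diseased = [True, True, True, True, False, False, False, False]
nri_cad, nri_no_cad, nri_total = net_reclassification_improvement(old_cat, new_cat, diseased)
```

Returning the two components separately mirrors how the Discussion reports contributions from subjects with and without obstructive CAD that sum to the overall improvement.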

Results

A total of 1343 non-diabetic patients from the PREDICT trial, enrolled between July 2007 and April 2009, were sequentially allocated to independent development (N=694) and validation (N=649) sets, as shown in Figure 1. The clinical characteristics of the development and validation sets were similar. Overall, subjects were 57% male, 37% had obstructive CAD and 26% had no detectable CAD. Significant clinical or demographic variables that were associated with obstructive CAD in both cohorts were increased age, male sex, chest pain type, elevated systolic blood pressure (all p<0.001), hypertension (p=0.001), and white ethnicity (p=0.015), as summarized in Table 1.

The final algorithm, consisting of 23 genes grouped into 6 terms (4 sex-independent and 2 sex-specific), is shown schematically in Figure 2. The subsequent analyses are for the independent validation set only.

ROC Analysis

The primary endpoint AUC was 0.70 ± 0.02 (p<0.001), with independently significant performance in male (0.66) and female (0.65) subsets (p<0.001 for each). For the primary clinical comparator of the Diamond-Forrester (D–F) risk score, ROC analysis showed a higher AUC for the algorithm score and D–F risk score combination, compared to D–F risk score alone (AUC 0.72 versus 0.66, p=0.003, Figure 3).

The most prevalent form of non-invasive imaging in PREDICT was myocardial perfusion imaging. In the validation set 310 patients had clinically-indicated imaging performed, of which 72% were positive. Comparative ROC analysis showed an increased AUC for the combined algorithm score and myocardial perfusion imaging results versus imaging alone (AUC 0.70 versus 0.54, p<0.001).

Sensitivity, Specificity

We calculated the sensitivity and specificity for a score threshold of 14.75, corresponding to a disease likelihood of 20% from the validation set data. At this threshold, the sensitivity was 85% with a specificity of 43%, corresponding to negative and positive predictive values of 83% and 46%, respectively, with 33% of patients having scores below this value.
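The reported predictive values follow from the sensitivity, specificity, and disease prevalence by Bayes' rule. The short sketch below is an arithmetic check, not part of the study's analysis; the function name is hypothetical, and the prevalence is taken as 192/526 from the validation-set sizes in Table 1.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV, fraction testing negative) from test characteristics."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    fraction_below_threshold = true_neg + false_neg
    return ppv, npv, fraction_below_threshold


# Sensitivity 85%, specificity 43%, prevalence 192 of 526 validation patients.
ppv, npv, fraction_below = predictive_values(0.85, 0.43, 192 / 526)
```

With these rounded inputs the sketch gives a positive predictive value near 46%, a negative predictive value near 83%, and about 33% of patients below the threshold, matching the figures in the text.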

Association with Disease Severity

The algorithm score was moderately correlated with maximum percent stenosis (R=0.34, p<0.001), and the average algorithm score increased monotonically with increasing percent maximum stenosis (p<0.001, Figure 4). The average scores for patients with and without obstructive CAD were 25 and 17, respectively.

Reclassification

Reclassification may be a more clinically relevant measure of comparative predictor performance than standard measures such as AUC (18). Table 2A shows reclassification results for the gene expression algorithm compared to D–F risk score. In the validation cohort 27% of patients were reclassified and the net reclassification improvement for the gene expression algorithm score was 20% (p<0.001). The majority of reclassified subjects (75/141) were those with intermediate D–F risk scores. The gene expression algorithm reclassified 78% (75/96) of these patients, with 47 reclassified correctly to low or high risk versus 28 reclassified incorrectly; the incorrect reclassifications were predominantly to high risk (21/28). Additionally, 38 D–F low risk subjects (15%) were reclassified as high risk, and 28 high risk subjects (16%) reclassified as low risk.

Classification by the expanded clinical model alone and with the addition of the gene expression score was also analyzed. A total of 22% of the patients were reclassified by the gene expression score and the net reclassification improvement was 16% (p<0.001, Table 2B). The vast majority of reclassified patients were intermediate risk by the expanded clinical model alone (112/118) and of these 74 were reclassified correctly and 38 incorrectly; incorrect reclassifications were preferentially to high risk (22/38). The AUC of the expanded clinical model alone was 0.732, and the AUC for the gene expression algorithm plus the full clinical model was 0.745 (p=0.089). For both clinical models overall, when reclassification errors occurred they were more likely to the high risk category, consistent with the gene expression algorithm having a higher negative predictive value than positive predictive value at this threshold.

A comparison of myocardial perfusion imaging versus gene expression algorithm results yielded a net reclassification improvement of 21% (p<0.001, Table 2C).

Discussion

This study prospectively validates in non-diabetic patients, clinically referred for invasive angiography, a non-invasive test for obstructive CAD defined by QCA, that is based on gene expression in circulating whole blood cells, age and gender. This study extends our previous work on correlation of gene expression changes in blood with CAD (10) to prospective validation of a classifier for non-diabetic patients with obstructive CAD by ROC analysis.


The test yields a numeric score (0–40) with higher scores corresponding to higher likelihood of obstructive CAD and higher maximum percent stenosis (Supplementary Figure 1). The gene expression score increases classification accuracy by ROC analysis compared to clinical factors (Diamond-Forrester), which has been challenging to achieve with genetic or biomarker approaches, at least for cardiovascular event prognosis (19, 20). It has also been suggested that reclassification of patient clinical risk or status, as captured by net reclassification improvement, may be a more appropriate measure than comparative ROC analysis for evaluating potential biomarkers (17, 18). The gene expression algorithm score improves the accuracy of clinical CAD assessment as shown by a net reclassification improvement of 20% relative to D–F score and 16% relative to an expanded clinical model (Tables 2A,B). For the most prevalent non-invasive test, myocardial perfusion imaging, the improvement was 21% (Table 2C), although these results are likely exaggerated in this angiographically referred population. The contributions to the reclassification improvements were 2%, 20%, and 6% from subjects without obstructive CAD and 18%, 1%, and 10% for subjects with obstructive CAD for Diamond-Forrester, myocardial perfusion, and the expanded clinical model, respectively. Overall, independent of imaging result or clinical risk category, increasing gene expression score leads to monotonically increased obstructive CAD risk. This is at least partially a reflection of the correlation of gene expression score with the extent of CAD, as measured here by maximum percent stenosis.

This gene-expression test could have clinical advantages over current non-invasive CAD diagnostic modalities since it requires only a standard venous blood draw, with no need for radiation, intravenous contrast, or physiologic and pharmacologic stressors.
In the validation cohort, for example, only 37% of patients undergoing invasive angiography had obstructive CAD and the rate was particularly low in women (26%). A similar overall rate of obstructive CAD in an angiography registry for patients without prior known CAD was recently reported, with little sensitivity to the exact definition of obstructive CAD (6). The gene-expression test described here identified a low likelihood (<20%) of obstructive CAD in 33% of patients referred for invasive angiography, although the majority of these patients were also at low risk by clinical factor analysis (Table 2A). After excluding low risk D–F score patients, an additional 11% (56/525) were classified as low risk by the gene expression algorithm. These patients had an observed risk of 23% as compared to 49% overall for the D–F intermediate and high risk groups.

Relationship of the Gene Expression Algorithm to Prior Studies

The algorithm consists of two types of terms: sex-specific age functions of obstructive CAD likelihood and gene expression terms that reflect changes in gene expression within a cell type, changes in cell type proportions, or a combination of the two. The sex-specific differences in cardiovascular risk and presentation are well known and largely reflect reduced risk in pre-menopausal women (21, 22). The gene expression terms appear to reflect an innate immune response, as illustrated by the preponderance of up-regulated genes preferentially expressed in granulocytes/neutrophils and natural killer cells. This is at first view surprising as the cell types most consistently found in atherosclerotic plaque are monocytes and T-cells. However, roles for a variety of circulating cells and both innate and adaptive immunity in atherosclerosis have been described (23, 24).

The significance of changes in specific cell-type distributions in whole blood with respect to cardiovascular events has been investigated in both angiographically evaluated and post-PCI populations, with neutrophil/lymphocyte ratio being the most significant predictor (25, 26). Algorithm Term 2 consists of three genes expressed predominantly in granulocytes/neutrophils (27, 28). In men, Term 2 is normalized to RPL28, one of the ribosomal proteins, which are preferentially expressed in lymphocytes (27). Thus, this term reflects the neutrophil/lymphocyte ratio in men. In women, this term is normalized to genes highly correlated with neutrophil count, rather than RPL28, consistent with reduced significance of neutrophil count in predicting CAD in women (29). The common upregulated genes in Term 2 (S100A8, S100A12, and CLEC4E) are highly correlated with the 11 gene signature we described previously, including the most significant gene from that study, S100A12 (10).

Limitations

This study has a number of limitations. First, the gene expression changes observed likely represent disease-correlated effects or disease-responsive effects, and not a causal role in disease pathogenesis. These changes may reflect overall disease burden, the inflammatory activity of the patient, or both, and not a specific level of stenosis.

From a clinical perspective, the validation study patient population is limited to a non-diabetic US-based population with chest pain or asymptomatic high-risk presentation who have been referred for invasive angiography, and does not address test performance in low-risk, asymptomatic individuals, patients with high risk unstable angina or acute MI, or diabetics. Further studies will be needed to refine the test negative predictive value, and to examine directly test performance relative to non-invasive imaging modalities as the current population of patients clinically referred for angiography likely over-estimates disease prevalence. In particular, the present myocardial perfusion imaging comparisons are affected by referral bias, especially for the negative patients. Thus, the clinical utility of the gene expression algorithm will need additional validation in lower risk populations.

The maximum stenosis endpoint is anatomical rather than functional; correlation to fractional flow reserve might be more informative. In addition, the influence of non-coronary atherosclerosis on the test score has not been determined, although this is less likely to have impact in a chest-pain population.

Finally, prognosis of future cardiac events has not been evaluated. While coronary inflammation is widely accepted as a cause of plaque progression, rupture, and MI (30, 31), a possible relationship between the test result and future events remains unexplored. It is intriguing that some of the algorithm terms may reflect cell-type ratios that have been implicated in major coronary event prediction (25, 32).

From a molecular and cellular perspective, defining disease risk from circulating blood cell RNA levels and reported age and gender only partially reflects the molecular changes in coronary atherosclerosis observable in blood. Analysis of protein or lipid biomarker levels, secreted by smooth muscle, endothelial, and inflammatory cells in the diseased vessel wall, or other sources of inflammatory markers (such as liver) might yield complementary information (33). In addition, gene expression based measures of physiological rather than chronological age may yield improved predictive information (34).

Conclusions

We describe the prospective multi-center validation of a peripheral blood-based gene expression test to determine the likelihood of obstructive CAD in non-diabetic patients as defined by invasive angiography. This test provides a statistically significant but modest improvement in patient classification as compared to clinical factors and non-invasive imaging as defined by patient CAD status. Further studies are needed to define the performance characteristics and clinical utility in populations with lower pre-test probability.

Supplementary Material

Refer to Web version on PubMed Central for supplementary material.

Acknowledgments

The corresponding author had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. An independent statistical analysis was completed by Dr. Nicholas Schork at the Scripps Research Institute without any funding from the sponsor. The primary endpoint and secondary endpoints were prospectively defined and transmitted for external statistical analysis prior to completion of the validation study data set.

The authors gratefully acknowledge all the patients who provided samples for the PREDICT study as well as the study site research coordinators and those who contributed to patient recruitment, clinical data acquisition and verification, validation study experimental work and data analysis. We thank Michael Walker, Richard Lawn, and Fred Cohen for helpful suggestions on the manuscript.

This work was funded by CardioDx, Inc. SR, MRE, PB, SED, JAW, AJS, and MY are employees of CardioDx, Inc and have equity interests and/or stock options in CardioDx. WGT and PTS are former employees and have equity or stock options in CardioDx. SR, MRE, JAW, AJS, PB and WGT have filed patent applications on behalf of CardioDx, Inc. WEK reports research support from CardioDx. NJS, MW, and EJT are supported in part by the Scripps Translational Science Institute Clinical Translational Science Award (NIH UL1RR025774). AL reports funding from CardioDx to complete the QCA studies reported herein.

References

1. Lloyd-Jones D, Adams R, Carnethon M, et al. Heart disease and stroke statistics--2009 update: a report from the American Heart Association Statistics Committee and Stroke Statistics Subcommittee. Circulation. 2009; 119(3):480–486. [PubMed: 19171871]
2. Martina B, Bucheli B, Stotz M, Battegay E, Gyr N. First Clinical Judgment by Primary Care Physicians Distinguishes Well between Nonorganic and Organic Causes of Abdominal or Chest Pain. J Gen Intern Med. 1997; 12:459–465. [PubMed: 9276650]
3. Svavarsdottir AE, Jonasson MR, Gudmundsson GH, Fjeldsted K. Chest pain in family practice. Diagnosis and long-term outcome in a community setting. Can Fam Physician. 1996; 42:1122–1128. [PubMed: 8704488]
4. Klinkman MS, Stevens D, Gorenflo DW. Episodes of care for chest pain: a preliminary report from MIRNET. Michigan Research Network. J Fam Pract. 1994; 38(4):345–352. [PubMed: 8163958]
5. Gibbons RJ, Abrams J, Chatterjee K, et al. ACC/AHA 2002 guideline update for the management of patients with chronic stable angina--summary article: a report of the American College of Cardiology/American Heart Association Task Force on practice guidelines (Committee on the Management of Patients With Chronic Stable Angina). J Am Coll Cardiol. 2003; 41(1):159–168. [PubMed: 12570960]
6. Patel MR, Peterson ED, Dai D, et al. Low diagnostic yield of elective coronary angiography. N Engl J Med. 2010; 362(10):886–895. [PubMed: 20220183]
7. Ridker PM, Paynter NP, Rifai N, Gaziano JM, Cook NR. C-reactive protein and parental history improve global cardiovascular risk prediction: the Reynolds Risk Score for men. Circulation. 2008; 118(22):2243–2251. 4p following 2251. [PubMed: 18997194]
8. Melander O, Newton-Cheh C, Almgren P, et al. Novel and conventional biomarkers for prediction of incident cardiovascular events in the community. JAMA. 2009; 302(1):49–57. [PubMed: 19567439]
9. Zebrack JS, Muhlestein JB, Horne BD, Anderson JL. C-reactive protein and angiographic coronary artery disease: independent and additive predictors of risk in subjects with angina. J Am Coll Cardiol. 2002; 39(4):632–637. [PubMed: 11849862]
10. Wingrove JA, Daniels SE, Sehnert AJ, et al. Correlation of Peripheral-Blood Gene Expression With the Extent of Coronary Artery Stenosis. Circulation: Cardiovascular Genetics. 2008; 1(1):31–38. [PubMed: 20031539]
11. Burke AP, Kolodgie FD, Zieske A, et al. Morphologic findings of coronary atherosclerotic plaques in diabetics: a postmortem study. Arterioscler Thromb Vasc Biol. 2004; 24(7):1266–1271. [PubMed: 15142859]
12. Ibebuogu UN, Nasir K, Gopal A, et al. Comparison of atherosclerotic plaque burden and composition between diabetic and non diabetic patients by non invasive CT angiography. Int J Cardiovasc Imaging. 2009; 25(7):717–723. [PubMed: 19633998]
13. Lansky AJ, Popma JJ. Qualitative and quantitative angiography. In: Text Book of Interventional Cardiology. Philadelphia, PA: Saunders; 1998.
14. Diamond GA, Forrester JS. Analysis of probability as an aid in the clinical diagnosis of coronary-artery disease. N Engl J Med. 1979; 300(24):1350–1358. [PubMed: 440357]
15. Chaitman BR, Bourassa MG, Davis K, et al. Angiographic prevalence of high-risk coronary artery disease in patient subsets (CASS). Circulation. 1981; 64(2):360–367. [PubMed: 7249303]
16. Newson R. Confidence intervals for rank statistics: Somers' D and extensions. Stata Journal. 2006; 6:309–334.
17. Pencina MJ, D’Agostino RB Sr, D’Agostino RB Jr, Vasan RS. Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Stat Med. 2008; 27(2):157–172. discussion 207-12. [PubMed: 17569110]
18. Cook NR, Ridker PM. Advances in measuring the effect of individual predictors of cardiovascular risk: the role of reclassification measures. Ann Intern Med. 2009; 150(11):795–802. [PubMed: 19487714]
19. Paynter NP, Chasman DI, Buring JE, Shiffman D, Cook NR, Ridker PM. Cardiovascular disease risk prediction with and without knowledge of genetic variation at chromosome 9p21.3. Ann Intern Med. 2009; 150(2):65–72. [PubMed: 19153409]
20. Wilson PW, Pencina M, Jacques P, Selhub J, D’Agostino R Sr, O’Donnell CJ. C-reactive protein and reclassification of cardiovascular risk in the Framingham Heart Study. Circ Cardiovasc Qual Outcomes. 2008; 1(2):92–97. [PubMed: 20031795]
21. Gopalakrishnan P, Ragland MM, Tak T. Gender differences in coronary artery disease: review of diagnostic challenges and current treatment. Postgrad Med. 2009; 121(2):60–68. [PubMed: 19332963]
22. Stangl V, Witzel V, Baumann G, Stangl K. Current diagnostic concepts to detect coronary artery disease in women. Eur Heart J. 2008; 29(6):707–717. [PubMed: 18272503]
23. Vanderlaan PA. Thematic review series: the immune system and atherogenesis. The unusual suspects: an overview of the minor leukocyte populations in atherosclerosis. J Lipid Res. 2005; 46(5):829–838. [PubMed: 15772419]
24. Packard RR, Lichtman AH, Libby P. Innate and adaptive immunity in atherosclerosis. Semin Immunopathol. 2009; 31(1):5–22. [PubMed: 19449008]
25. Horne BD, Anderson JL, John JM, et al. Which white blood cell subtypes predict increased cardiovascular risk? J Am Coll Cardiol. 2005; 45(10):1638–1643. [PubMed: 15893180]
26. Duffy BK, Gurm HS, Rajagopal V, Gupta R, Ellis SG, Bhatt DL. Usefulness of an elevated neutrophil to lymphocyte ratio in predicting long-term mortality after percutaneous coronary intervention. Am J Cardiol. 2006; 97(7):993–996. [PubMed: 16563903]
27. Palmer C, Diehn M, Alizadeh AA, Brown PO. Cell-type specific gene expression profiles of leukocytes in human peripheral blood. BMC Genomics. 2006; 7:115. [PubMed: 16704732]
28. Watkins NA, Gusnanto A, de Bono B, et al. A HaemAtlas: characterizing gene expression in differentiated human blood cells. Blood. 2009; 113(19):e1–e9. [PubMed: 19228925]
29. Rana JS, Boekholdt SM, Ridker PM, et al. Differential leucocyte count and the risk of future coronary artery disease in healthy men and women: the EPIC-Norfolk Prospective Population Study. J Intern Med. 2007; 262(6):678–689. [PubMed: 17908163]
30. Libby P, Theroux P. Pathophysiology of coronary artery disease. Circulation. 2005; 111(25):3481–3488. [PubMed: 15983262]
31. Packard RR, Libby P. Inflammation in atherosclerosis: from vascular biology to biomarker discovery and risk prediction. Clin Chem. 2008; 54(1):24–38. [PubMed: 18160725]
32. Dragu R, Huri S, Zuckerman R, et al. Predictive value of white blood cell subtypes for long-term outcome following myocardial infarction. Atherosclerosis. 2008; 196(1):405–412. [PubMed: 17173924]
33. Ardigo D, Assimes TL, Fortmann SP, et al. Circulating chemokines accurately identify individuals with clinically significant atherosclerotic heart disease. Physiol Genomics. 2007; 31(3):402–409. [PubMed: 17698927]
34. Hong MG, Myers AJ, Magnusson PK, Prince JA. Transcriptome-wide assessment of human brain and lymphocyte senescence. PLoS One. 2008; 3(8):e3024. [PubMed: 18714388]

Figure 1. Allocation of Patients from the PREDICT trial for algorithm development and validation. From a total of 1569 subjects meeting the study inclusion/exclusion criteria, 226 were used for gene discovery. The remaining 1343 were divided into independent cohorts for algorithm development (694) and validation (649) as shown; 94% of patients in these cohorts came from the same centers. For algorithm development a total of 640 patient samples were used; 54 were excluded due to incomplete data (13), inadequate blood volume (19), sex mismatch between experimental and clinical records (5), or statistical outlier assessment (17) (see Supplement for details). For the validation cohort a total of 123 samples were excluded based on: inadequate blood volume or RNA yield (43), significant contamination with genomic DNA (78), or prespecified statistical outlier assessment (2).

Figure 2. Schematic of the Algorithm Structure and Genes. The algorithm consists of overlapping gene expression functions for males and females with a sex-specific linear age function for the former and a non-linear age function for the latter. For the gene expression components shown, 16/23 genes in 4 terms are gender independent: term 1 – neutrophil activation and apoptosis, term 3 – NK cell activation to T cell ratio, term 4 – B to T cell ratio, and term 5 – expression of gene AF289562 normalized to the mean of TFCP2 and HNRPF. In addition, Term 2 consists of 3 sex-independent neutrophil/innate immunity genes normalized in a sex-specific way to neutrophil gene expression (AQP9, NCF4) for females and to RPL28 (lymphocytes) in males. The final male specific term is the normalized expression of TSPAN16. The raw algorithm score is calculated from RT-PCR data as described (Appendix 3); for clinical use, the raw score was converted to a 0–40 scale by linear transformation.

Figure 3. ROC analysis of Validation Cohort Performance For Algorithm and Clinical Variables. Algorithm performance adds to Clinical Factors by Diamond-Forrester. Comparison of the combination of D–F score and algorithm score (heavy solid line) to D–F score alone (---) in ROC analysis is shown. The AUC=0.50 line (light solid line) is shown for reference. A total of 525 of the 526 validation cohort patients had information available to calculate D–F scores. The AUCs for the two ROC curves are 0.721 ± 0.023 and 0.663 ± 0.025, p = 0.003.

Figure 4. Dependence of Algorithm Score on % Maximum Stenosis in the Validation Cohort. The extent of disease for each patient was quantified by QCA maximum % stenosis and grouped into 5 categories: no measurable disease, 1–24%, 25–49% in ≥1 vessel, 1 vessel ≥50%, and >1 vessel ≥50%. The average algorithm score for each group is illustrated; error bars correspond to 95% confidence intervals. The complete relationship of algorithm score to obstructive CAD likelihood is depicted in Supplementary Figure 1; in Figure 4 scores of 10, 20, and 30 correspond to 15, 30, and 57% disease likelihood. A score of 15, corresponding to a 20% likelihood, was used for dichotomous analyses as described in the text.

Table 1. Clinical and Demographic Characteristics of the Final Development and Validation Patient Sets¹

Characteristic | Development: Obstructive CAD² (N=230) | Development: No Obstructive CAD (N=410) | Development: P-value | Validation: Obstructive CAD (N=192) | Validation: No Obstructive CAD (N=334) | Validation: P-value
Age, mean (SD), y | 63.7 (11.1) | 57.2 (11.8) | <0.001 | 64.7 (9.8) | 57.7 (11.7) | <0.001
Men, No. (%) | 180 (78.3%) | 193 (47.1%) | <0.001 | 134 (69.8%) | 165 (49.4%) | <0.001
Chest pain type | | | <0.001 | | | <0.001
  Typical | 61 (26.5%) | 66 (16.1%) | | 42 (21.9%) | 41 (12.3%) |
  Atypical | 28 (12.2%) | 56 (13.7%) | | 42 (21.9%) | 49 (14.7%) |
  Non-cardiac | 47 (20.4%) | 137 (33.4%) | | 50 (26.0%) | 134 (40.1%) |
  None | 91 (39.6%) | 143 (34.9%) | | 58 (30.2%) | 109 (32.6%) |
Blood pressure, mean (SD), mm Hg
  Systolic | 138 (17.7) | 133 (18.3) | <0.001 | 140 (17.7) | 132 (18.1) | <0.001
  Diastolic | 79.7 (11.0) | 79.6 (11.7) | 0.94 | 79.2 (11.3) | 77.5 (10.9) | 0.086
Hypertension | 163 (70.9%) | 237 (57.8%) | 0.002 | 142 (74.0%) | 203 (60.8%) | 0.001
Dyslipidemia | 170 (73.9%) | 225 (54.9%) | <0.001 | 133 (69.3%) | 208 (62.3%) | 0.109
Current smoking | 53 (23.2%) | 99 (24.3%) | 0.75 | 38 (19.8%) | 68 (20.4%) | 0.70
BMI, mean (SD), kg/m² | 30.5 (6.0) | 31.0 (7.5) | 0.35 | 29.8 (5.5) | 31.3 (7.0) | 0.009
Ethnicity, White not Hispanic | 210 (91.3%) | 347 (84.6%) | 0.016 | 181 (94.3%) | 293 (87.7%) | 0.015
Clinical syndrome
  Stable angina | 123 (53.5%) | 214 (52.2%) | 0.78 | 107 (55.7%) | 176 (52.7%) | 0.46
  Unstable angina | 35 (15.2%) | 81 (19.8%) | 0.149 | 31 (16.1%) | 58 (17.4%) | 0.74
  Asymptomatic, high risk | 72 (31.3%) | 113 (27.6%) | 0.32 | 53 (27.6%) | 100 (29.9%) | 0.60
Medications
  Aspirin and salicylates | 153 (66.5%) | 232 (56.6%) | 0.029 | 139 (72.4%) | 205 (61.4%) | 0.012
  Statins | 109 (47.4%) | 142 (34.6%) | 0.003 | 93 (48.4%) | 127 (38.0%) | 0.021
  Beta blockers | 82 (35.7%) | 133 (32.4%) | 0.52 | 85 (44.3%) | 124 (37.1%) | 0.113
  ACE inhibitors | 57 (24.8%) | 67 (16.3%) | 0.013 | 47 (24.5%) | 64 (19.2%) | 0.155
  Angiotensin receptor blockers | 29 (12.6%) | 39 (9.5%) | 0.26 | 18 (9.4%) | 34 (10.2%) | 0.76
  Calcium channel blockers | 33 (14.3%) | 46 (11.2%) | 0.29 | 26 (13.5%) | 34 (10.2%) | 0.25
  Antiplatelet agents | 27 (11.7%) | 21 (5.1%) | 0.003 | 16 (8.3%) | 17 (5.1%) | 0.142
  Steroids, not systemic | 23 (10.0%) | 33 (8.0%) | 0.45 | 19 (9.9%) | 38 (11.4%) | 0.59
  NSAIDs | 47 (20.4%) | 78 (19.0%) | 0.76 | 30 (15.6%) | 58 (17.4%) | 0.60

¹ Characteristics of the 640 subjects in the Algorithm Development and 526 subjects in the Validation sets. P values were calculated by t-tests for continuous variables and using chi-square tests for discrete variables. Significant p values in both sets are bolded and underlined and are bolded if significant in single sets.

² Obstructive CAD is defined as ≥50% luminal stenosis in ≥1 major vessel by QCA.

Table 2A. Reclassification analysis of Gene Expression Algorithm with Diamond-Forrester (D-F) Clinical Model

| Clinical risk group | GE algorithm: Low | GE algorithm: Int. | GE algorithm: High | Total | % Reclassified: Lower | % Reclassified: Higher | % Reclassified: Total |
|---|---|---|---|---|---|---|---|
| D-F Low Risk | | | | | | | |
| Total | 118 | 96 | 38 | 252 | -- | 15.1 | 15.1 |
| Obstructive CAD | 16 | 19 | 22 | 57 | -- | 38.6 | 38.6 |
| No Obstructive CAD | 102 | 77 | 16 | 195 | -- | 8.2 | 8.2 |
| Observed risk | 14% | 20% | 58% | 23% | -- | -- | -- |
| D-F Intermediate Risk | | | | | | | |
| Total | 28 | 21 | 47 | 96 | 29.2 | 49.0 | 78.1 |
| Obstructive CAD | 7 | 11 | 26 | 44 | 15.9 | 59.1 | 75.0 |
| No Obstructive CAD | 21 | 10 | 21 | 52 | 40.4 | 40.4 | 80.8 |
| Observed risk | 25% | 52% | 55% | 46% | -- | -- | -- |
| D-F High Risk | | | | | | | |
| Total | 28 | 60 | 89 | 177 | 15.8 | -- | 15.8 |
| Obstructive CAD | 6 | 29 | 56 | 91 | 6.6 | -- | 6.6 |
| No Obstructive CAD | 22 | 31 | 33 | 86 | 38.4 | -- | 38.4 |
| Observed risk | 21% | 48% | 63% | 51% | -- | -- | -- |
| Total Patients | 174 | 177 | 174 | 525 | | | |
| Obstructive CAD | 29 | 59 | 104 | 192 | | | |
| No Obstructive CAD | 145 | 118 | 70 | 333 | | | |
| Observed risk | 17% | 33% | 60% | 37% | | | |

Risk categories: Low = 0–<20%, Intermediate = ≥20–50%, High = ≥50%.

Classification improved in 18.2% of disease patients and improved in 1.8% of non disease patients for a net reclassification improvement of 20.0% (p<.001)

Reclassification categories which are included in this calculation are bolded.

Table 2B. Reclassification analysis of Gene Expression Algorithm with Expanded Clinical Model (ECM)

| Clinical risk group | GE algorithm: Low | GE algorithm: Int | GE algorithm: High | Total | % Reclassified: Lower | % Reclassified: Higher | % Reclassified: Total |
|---|---|---|---|---|---|---|---|
| ECM Low Risk | | | | | | | |
| Total | 121 | 37 | 4 | 162 | -- | 2.5 | 2.5 |
| Obstructive CAD | 13 | 7 | 3 | 23 | -- | 13.0 | 13.0 |
| No Obstructive CAD | 108 | 30 | 1 | 139 | -- | 0.8 | 0.8 |
| Observed risk | 11% | 19% | 75% | 14% | -- | -- | -- |
| ECM Int Risk | | | | | | | |
| Total | 58 | 94 | 54 | 206 | 28.2 | 26.2 | 54.4 |
| Obstructive CAD | 16 | 26 | 32 | 74 | 21.6 | 43.2 | 64.8 |
| No Obstructive CAD | 42 | 68 | 22 | 132 | 31.8 | 16.7 | 48.5 |
| Observed risk | 28% | 28% | 59% | 36% | -- | -- | -- |
| ECM High Risk | | | | | | | |
| Total | 2 | 41 | 115 | 158 | 1.3 | -- | 1.3 |
| Obstructive CAD | 0 | 24 | 71 | 63 | 0.0 | -- | 0.0 |
| No Obstructive CAD | 2 | 17 | 44 | 95 | 2.1 | -- | 2.1 |
| Observed risk | 0% | 59% | 62% | 60% | -- | -- | -- |
| Total | 181 | 172 | 173 | 526 | | | |
| Obstructive CAD | 29 | 57 | 106 | 192 | | | |
| No Obstructive CAD | 152 | 115 | 67 | 334 | | | |
| Observed risk | 15% | 33% | 61% | 37% | | | |

Risk categories: Low = 0–<20%, Intermediate = ≥20–50%, High = ≥50%.

Classification improved in 9.9% of disease patients and improved in 6.3% of non disease patients for a net reclassification improvement of 16.2% (p<.001)

Reclassification categories which are included in this calculation are bolded.

Table 2C. Reclassification analysis of Gene Expression Algorithm with Myocardial Perfusion Imaging Results

| Imaging result | GE algorithm: Low | GE algorithm: Int. | GE algorithm: High | Total | % Reclassified: Lower | % Reclassified: Higher | % Reclassified: Total |
|---|---|---|---|---|---|---|---|
| Imaging Negative | | | | | | | |
| Total | 41 | 31 | 15 | 87 | -- | 17.4 | 17.4 |
| Obstructive CAD | 7 | 8 | 7 | 22 | -- | 31.8 | 31.8 |
| No Obstructive CAD | 34 | 23 | 8 | 65 | -- | 12.3 | 12.3 |
| Observed risk | 17% | 26% | 47% | 25% | -- | -- | -- |
| Imaging Positive | | | | | | | |
| Total | 57 | 78 | 88 | 223 | 25.6 | -- | 25.6 |
| Obstructive CAD | 6 | 21 | 49 | 76 | 7.9 | -- | 7.9 |
| No Obstructive CAD | 51 | 57 | 39 | 147 | 34.7 | -- | 34.7 |
| Observed risk | 11% | 27% | 56% | 34% | -- | -- | -- |
| Total | 98 | 109 | 103 | 310 | | | |
| Obstructive CAD | 13 | 29 | 56 | 98 | | | |
| No Obstructive CAD | 85 | 80 | 47 | 212 | | | |
| Observed risk | 13% | 27% | 54% | 32% | | | |

Risk categories: Low = 0–<20%, Intermediate = ≥20–50%, High = ≥50%.

Classification improved in 1.0% of disease patients and improved in 20.3% of non disease patients for a net reclassification improvement of 21.3% (p<.001)

Reclassification categories which are included in this calculation are bolded.
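The net reclassification improvement (NRI) quoted in the notes to Tables 2A through 2C can be reproduced from the cell counts. The following is a minimal sketch, not code from the study: the function and argument names are hypothetical, and it assumes that the counted moves are Low to High, Intermediate to Low or High, and High to Low, because the bolding that marks the included categories is not visible in this copy. The counts shown are those of Table 2A.

```python
def net_reclassification(up_disease, down_disease, n_disease,
                         up_no_disease, down_no_disease, n_no_disease):
    """Return (gain in disease, gain in non-disease, NRI) as fractions.

    Moving up is an improvement for patients with obstructive CAD;
    moving down is an improvement for patients without it.
    """
    disease_gain = (up_disease - down_disease) / n_disease
    no_disease_gain = (down_no_disease - up_no_disease) / n_no_disease
    return disease_gain, no_disease_gain, disease_gain + no_disease_gain


# Table 2A counts (assumed set of counted moves, see the note above).
disease, no_disease, nri = net_reclassification(
    up_disease=22 + 26,       # Low->High, Int->High, obstructive CAD
    down_disease=7 + 6,       # Int->Low, High->Low, obstructive CAD
    n_disease=192,
    up_no_disease=16 + 21,    # Low->High, Int->High, no obstructive CAD
    down_no_disease=21 + 22,  # Int->Low, High->Low, no obstructive CAD
    n_no_disease=333,
)
print(f"{disease:.1%} {no_disease:.1%} {nri:.1%}")  # prints: 18.2% 1.8% 20.0%
```

Under the same assumption, the counts in Tables 2B and 2C give the 16.2% and 21.3% figures reported in their notes.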

Cost effectiveness of a gene expression score and myocardial perfusion imaging for diagnosis of coronary artery disease

Charles E. Phelps, PhD (a), Amy K. O’Sullivan, PhD (b), Joseph A. Ladapo, MD, PhD (c), Milton C. Weinstein, PhD (d), Kevin Leahy, BA (b), and Pamela S. Douglas, MD (e)
Rochester, and New York, NY; Medford, and Boston, MA; and Durham, NC

Background Over 3 million patients annually present with symptoms suggestive of obstructive coronary artery disease (oCAD) in the United States (US), but a cardiac etiology is found in as few as 10% of cases. Usual care may include advanced cardiac testing with myocardial perfusion imaging (MPI), with attendant radiation risks and increased costs of care. We estimated the cost effectiveness of CAD diagnostic strategies including “no test,” a gene expression score (GES) test, MPI, and sequential strategies combining GES and MPI.

Methods We developed a Markov-based decision analysis model to simulate outcomes and costs in patients presenting to clinicians with symptoms suggestive of oCAD in the US. We estimated quality-adjusted life years (QALYs), total costs, and incremental cost-effectiveness ratios (ICERs) for each strategy.

Results In our base case, the 2-threshold GES strategy is the most cost-effective strategy at a threshold of $100,000 per QALY gained, with an ICER of approximately $72,000 per QALY gained relative to no testing. Myocardial perfusion imaging alone and the 1-threshold strategy are weakly dominated. In sensitivity analysis, ICERs fall as the probability of oCAD increases from the base case value of 15%. The ranking of ICERs among strategies is sensitive to test costs, including the time cost for testing. The analysis reveals ways to improve on prespecified GES thresholds.

Conclusions Diagnostic testing for oCAD with a novel GES strategy in a 2-threshold model is cost effective by conventional standards. This diagnostic approach is more efficient than usual care of MPI alone or a 1-threshold GES strategy in most scenarios. (Am Heart J 2014;167:697-706.e2.)

Approximately 3 million United States (US) patients visit a clinician’s office each year for evaluation of symptoms suggestive of obstructive coronary artery disease (oCAD), leading to over $6.7 billion in direct medical costs [1-5]. In selecting a diagnostic option to evaluate patients with suspected oCAD, clinicians balance test accuracy with the risks of testing (including iatrogenic injury and disease from invasive coronary angiography [ICA] and ionizing radiation from some noninvasive tests), the cost of testing, and the value of lost time for the patient [6]. Despite the significant medical costs incurred in the work-up of suspected oCAD, typically only 10% to 20% of newly presenting patients ultimately receive a diagnosis of oCAD [7-11]. Diagnostic pathways that improve patient selection for referral to noninvasive testing and ICA may improve the quality of care and contain health care costs.

Common diagnostic pathways involve stress electrocardiography, stress myocardial perfusion imaging (MPI), and referral for ICA. Myocardial perfusion imaging is a high-volume test in the US, with over 10 million procedures annually [4]. This study focuses on pathways involving MPI and a new commercially available gene expression score (GES) diagnostic test for oCAD (Corus CAD; CardioDx, Inc, Palo Alto, CA). The GES reports the likelihood of a patient having oCAD on a scale of 1 (low) to 40 (high) and has a negative predictive value of 96% for scores of ≤15, as previously described [12]. Obstructive CAD is defined throughout this article as the patient having ≥1 coronary arteries with stenosis of ≥50% as determined by quantitative coronary angiography or core laboratory computed tomographic angiography.

From the (a) University of Rochester, Rochester, NY, (b) Optum Insight, Medford, MA, (c) New York University School of Medicine, New York, NY, (d) Center for Health Decision Science, Harvard School of Public Health, Harvard University, Boston, MA, and (e) Duke Clinical Research Institute, Duke University, Durham, NC.
Sean van Diepen, MD, served as guest editor for this article.
Submitted August 17, 2013; accepted February 13, 2014.
Reprint requests: Charles E. Phelps, PhD, 30250 South Highway 1, Gualala, CA 95445.
E-mail: [email protected]
0002-8703 © 2014, The Authors. Published by Elsevier Inc. Open access under CC BY-NC-ND license.
http://dx.doi.org/10.1016/j.ahj.2014.02.005

American Heart Journal 698 Phelps et al May 2014

Figure 1

Strategies examined. The various diagnostic strategies considered are outlined. Patients with positive ICA are treated with PCI (bare metal or drug-eluting stent) or CABG and then receive ongoing medical therapy. Patients with negative test results receive no further cardiac testing or treatment.

Methods

Model overview

We estimated the cost effectiveness of various diagnostic strategies using MPI, the GES test, and sequential combinations thereof, using a Markov-based decision analysis model of diagnosis, treatment, costs, and outcomes, measured by quality-adjusted life years (QALYs). The model represents a population of nondiabetic US adults presenting for diagnosis, generally in a primary care setting, with typical and atypical symptoms suggestive of oCAD. Usual care for such patients may include “watchful waiting” or referral to cardiology for diagnostic testing, typically MPI as the initial test. We exclude patients with a previous history of revascularization or myocardial infarction (MI).

We assume that at any time, a patient resides in one of a finite number of discrete Markov health states defined by their CAD status, whether it has been diagnosed, and their MI history. Patients with diagnosed oCAD received revascularization upon diagnosis and receive daily medical therapy thereafter.

We used published data on the diagnostic accuracy of MPI and the GES to stratify patients across the health states based on true-positive, true-negative, false-positive, and false-negative diagnostic results depending on the strategy chosen. Each year, patients can move between states as dictated by transition probabilities that may change over time. We discounted costs (in 2012 US dollars) and QALYs at 3% annually and used a societal perspective in calculating direct medical care and test-related time costs over a lifetime horizon [13].

Figure 1 shows the diagnostic strategies and the associated clinical pathways that we analyzed in addition to “no test.” We label these strategies as GES alone, MPI alone, 1-threshold strategy, and 2-threshold strategy. Myocardial perfusion imaging alone represents routine care for those receiving any testing. The GES strategies use score thresholds to determine which patients move on to MPI and/or ICA. “No test” is the reference strategy against which the others are compared. In current practice, patients such as those in our model may receive a cardiology referral, at which point MPI and/or ICA may be ordered. In contrast, the blood sample for the GES test may be drawn in primary care settings and may remove the need for a cardiology referral and further cardiac testing for patients with low likelihood of oCAD.

In every pathway examined, our model specifies that a positive diagnostic test or test sequence leads to ICA, which, if positive, leads to revascularization (coronary artery bypass graft surgery [CABG] or percutaneous coronary intervention [PCI]). Patients with diagnosed oCAD receive daily medical therapy of a statin, a β blocker, an angiotensin-converting enzyme inhibitor, and aspirin [14,15]. A negative diagnostic test or sequence leads to no further cardiac testing.

In the Markov model, patients transition across a set of mutually exclusive health states (see online Appendix Supplementary Figure A1). All patients begin in one of the following health states:

(a) No oCAD, off treatment, no history of MI;
(b) Obstructive CAD, on treatment, no history of MI; or
(c) Obstructive CAD, off treatment, no history of MI.
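The yearly cohort mechanics described above can be illustrated with a small sketch. This is not the authors' model: it follows only patients who start in state (b), uses the Table I values for annual MI probabilities, mortality multipliers, utilities, and the 3% discount rate, and replaces the age-specific life tables with a flat background non-MI mortality of 1% per year, which is a purely hypothetical placeholder. Function and variable names are likewise hypothetical, and rewards are credited at the start of each yearly cycle as a simplification.

```python
DISCOUNT_RATE = 0.03  # Table I

# Table I values for treated oCAD, without and with a history of MI.
NO_MI = {"nonfatal_mi": 0.0285, "fatal_mi": 0.0044,
         "mortality_multiplier": 1.5, "utility": 0.78}
POST_MI = {"nonfatal_mi": 0.0424, "fatal_mi": 0.0065,
           "mortality_multiplier": 3.7, "utility": 0.65}

# ASSUMPTION: flat background non-MI mortality; the article uses
# age-specific US life tables instead.
BASE_NON_MI_MORTALITY = 0.01


def discounted_qalys(years):
    """Discounted QALYs per person starting in 'oCAD, on treatment, no MI'."""
    no_mi, post_mi = 1.0, 0.0  # share of the cohort alive in each state
    total = 0.0
    for year in range(years):
        weight = 1.0 / (1.0 + DISCOUNT_RATE) ** year
        total += weight * (no_mi * NO_MI["utility"]
                           + post_mi * POST_MI["utility"])
        death_no_mi = (NO_MI["fatal_mi"]
                       + BASE_NON_MI_MORTALITY * NO_MI["mortality_multiplier"])
        death_post_mi = (POST_MI["fatal_mi"]
                         + BASE_NON_MI_MORTALITY * POST_MI["mortality_multiplier"])
        newly_post_mi = no_mi * NO_MI["nonfatal_mi"]
        no_mi *= 1.0 - death_no_mi - NO_MI["nonfatal_mi"]
        # A further nonfatal MI leaves the patient in the post-MI state.
        post_mi = post_mi * (1.0 - death_post_mi) + newly_post_mi
    return total


print(round(discounted_qalys(30), 2))
```

The same loop could accumulate discounted costs by replacing the utilities with annual cost terms; the published model additionally tracks states (a) and (c) and the annual chance of diagnosis for undiagnosed patients.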

Table I. Summary of key model parameters

| Parameter | Value | Source |
|---|---|---|
| Demographic parameters | | |
| Starting age | 57 | 12 |
| oCAD prevalence | 15% | 12 |
| Test characteristics | | |
| GES at threshold of 15: Sensitivity | 89% | 12 |
| GES at threshold of 15: Specificity | 52% | 12 |
| GES at threshold of 28: Sensitivity | 43% | 12 |
| GES at threshold of 28: Specificity | 89% | 12 |
| MPI: Sensitivity | 81% | 16 |
| MPI: Specificity | 65% | 16 |
| Transition probabilities | | |
| Annual probability of nonfatal MI | | |
| oCAD, on treatment, no history of MI | 2.85% | 14 |
| oCAD, off treatment, no history of MI | 3.69% | 14,17 |
| oCAD, on treatment, history of MI | 4.24% | 14,18 |
| Annual probability of fatal MI | | |
| oCAD, on treatment, no history of MI | 0.44% | 14 |
| oCAD, off treatment, no history of MI | 0.57% | 14,17 |
| oCAD, on treatment, history of MI | 0.65% | 14,18 |
| Mortality multipliers for non-MI death | | |
| oCAD, on treatment, no history of MI | 1.5 | 18 |
| oCAD, off treatment, no history of MI | 1.5 | 18 |
| oCAD, on treatment, history of MI | 3.7 | 18 |
| Annual probability of receiving diagnosis for persons in oCAD, no history of MI, off-treatment state | 10.4% | 19 |
| Quality-of-life factors | | |
| oCAD, on treatment, no history of MI | 0.78 | 20 |
| oCAD, off treatment, no history of MI | 0.77 | 20 |
| oCAD, on treatment, history of MI | 0.65 | 20 |
| No oCAD, off treatment, no history of MI (age-specific) | 0.75-0.92 | 21 |
| Cost parameters (2012 US $) | | |
| Diagnostic test costs | | |
| GES | 1050 | Medicare rate per CMS letter of determination dated October 10, 2012 |
| MPI: Procedure cost | 899 | 4,22-24 |
| MPI: Patient time cost | 100 | 25 |
| ICA: Procedure cost | 5757 | 5,22,26 |
| ICA: Patient time cost | 200 | 25 |
| Treatment and events costs | | |
| Medical therapy for oCAD (annual supply) | 1000 | 27-30 |
| Revascularization | 17169 | 22,26,31,32 |
| Nonfatal MI | 20231 | 22 |
| Fatal MI | 5885 | 22 |
| Annual discount rate | 3% | 13 |

State (a) includes patients with a true-negative result on their index test and those with a false-positive on their index test followed by a negative ICA. State (b) consists of patients who had ICA confirming a true-positive index test result and hence receiving treatment for oCAD. State (c) consists of patients with undiagnosed oCAD due to a false-negative result on their index test.

Patients in state (a) are assumed to remain in this state until they die of non-MI causes; incidence of new oCAD is not included in the model. Patients in states (b) and (c) are at risk of nonfatal MI, fatal MI, or death from non-MI causes. Patients in state (c) have an annual probability of receiving an ICA and getting diagnosed, at which point they undergo a revascularization and move to state (b).

Model parameterization

Table I lists all model inputs; we describe each type of data here.

Test characteristics. A meta-analysis summarizes the accuracy of MPI as having sensitivity of 81% and specificity of 65% [16]. Myocardial perfusion imaging readings are treated as positive or negative. The GES test characteristics, including sensitivity and specificity at positivity thresholds of 15 and 28, come from a prospective validation study with full quantitative angiographic verification with either ICA or computed tomographic angiography [12].

Prior probability of oCAD (p(oCAD)). We used the p(oCAD) from the GES validation study as our baseline scenario [12]. This value is midrange between studies in similar patient populations in the literature reporting p(oCAD) from 10% to 34.5%, centering around the 10% to 20% range [7-11].

Transition probabilities. Probabilities for transitioning between health states in the model depend on the underlying risk of events and whether the patient has been accurately diagnosed with oCAD. Transition probabilities were calculated from published data (see Table I). Annual probabilities of a patient with oCAD on treatment experiencing fatal or nonfatal MI were estimated from the COURAGE trial [14]. For patients with undiagnosed oCAD, risk reduction estimates for CABG and PCI were applied to the event probabilities for patients on treatment to derive the event probabilities without revascularization [17]. For oCAD patients with a history of MI, the ratio of nonfatal to fatal MIs in the COURAGE trial was applied to the probability of having a second MI to determine the probability of a fatal or nonfatal MI [14,18].

Non-MI risk of death among oCAD patients was calculated using US life expectancy tables and mortality multipliers to account for increased risk of death from oCAD [18,33]. Age-specific life table mortality rates were decomposed into age-specific non-cardiovascular disease (CVD) mortality rates using the prevalence of oCAD and the mortality multiplier for oCAD patients. Mortality multipliers for non-MI death among oCAD patients on treatment, oCAD patients off treatment, and oCAD patients with a history of MI were applied to these non-CVD mortality rates to estimate age-specific non-MI mortality rates in the model.

Health care outcomes and costs. We simulated the population over time and calculated discounted costs and QALYs for each diagnostic strategy. We obtained quality of life utility weights for each health state from a national catalog of EQ-5D index scores of chronic conditions that was developed using data from the Medical Expenditure Panel Survey [20]. Age-specific utility weights for patients without oCAD were obtained from a separate report on EQ-5D index scores [21]. Our model includes costs for diagnostic test strategies, revascularization, treatment for oCAD, and treatment for an MI. Costs of MPI, GES, ICA, revascularization, fatal MI, and nonfatal MI were estimated using the Centers for Medicare and Medicaid Services (CMS) 2012 approved fees and 2010 Drug Topics Red Book as well as other available literature [22,23]. We assumed that revascularization includes PCI and CABG at approximately a 3:1 ratio [31]. The annual cost of medical therapy for oCAD, assumed as 40 mg of simvastatin, 100 mg metoprolol tartrate, 10 mg lisinopril, and 75 to 325 mg aspirin daily, was estimated from the 2013 Red Book Online and the available literature [27-30]. The Appendix provides detailed assumptions and sources for these calculations.

MPI and ICA time cost. Many cardiac diagnostic tests, such as MPI and ICA, require significant patient time. As appropriate when using a societal perspective, we account for the cost of lost patient time from diagnostic testing, which contains 2 components: time consumed and the value of that time. Consistent with standard practice [13], we include the time cost only of the diagnostic strategies under consideration, not time later spent in treatment of diagnosed oCAD. The GES examination requires only a single blood sample, done as a part of the initial physician visit, so we add no time cost for its use.

An MPI generally requires 3 to 4 hours of patient time. Allowing for patient travel and waiting time, we used 4 hours as the minimum patient time for an MPI examination. This does not account for possible 2-day protocols (as some centers use) or for the time of companions, drivers, or assistants possibly required by patients and hence represents a lower bound on time costs. For a value per hour, we used the average hourly wage as reported by the US Bureau of Labor Statistics [25], estimated for December 2012 at $23.75 per hour. Rounding so as not to convey overly high accuracy, we use $100 per MPI examination as our base estimate. In sensitivity analysis, we varied the time cost of MPI from $0 to $200.

Invasive coronary angiography involves a larger time cost, typically nearly a full day in the hospital and ICA laboratory, plus travel time. Accordingly, we assumed 8 hours of patient time for ICA. Most ICA providers urge that patients not drive after the procedure, hence requiring a companion. We did not include companion time in our base case estimate, and thus, it represents a lower bound on time cost for ICA. Applying the average hourly wage described above, we estimated ICA time cost to be $200. We varied this from $0 to $400 in sensitivity analysis.

The authors are solely responsible for the design and conduct of this study, all study analyses, the drafting and editing of the manuscript, and its final contents.

Results

Base case

The results for the primary study outcomes (base case costs, QALYs, and incremental cost-effectiveness ratios [ICERs]) for each strategy are shown in Figure 2. Both the 1-threshold and 2-threshold GES strategies lead to lower costs and fewer QALYs than MPI alone, whereas GES alone leads to both increased costs and QALYs compared with MPI alone. At a cost-effectiveness threshold of $100,000 per QALY gained, the 2-threshold strategy is the most cost-effective strategy, with an ICER of approximately $72,000 per QALY gained relative to “no test.” Both MPI alone and the 1-threshold strategy are weakly dominated by the 2-threshold strategy. GES alone has an ICER of approximately $152,000 relative to the 2-threshold strategy.

Table II shows, for each strategy, the percentage of patients receiving diagnostic testing with MPI and/or ICA and the yield of positive disease diagnosis for patients receiving ICA. As compared with MPI alone, both the 1-threshold and the 2-threshold strategies resulted in fewer patients receiving MPI and ICA and higher diagnostic yield at ICA. The 1-threshold strategy resulted in the highest diagnostic yield, with 43% of patients demonstrating oCAD at ICA.
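The testing percentages reported in Table II can be approximated directly from the Table I test characteristics. The sketch below is an illustration rather than the authors' calculation: it assumes that GES and MPI results are independent given true disease status and that a GES result counts as positive when the score is above the relevant threshold. Neither assumption is stated in the article, and the function name and data layout are hypothetical.

```python
def strategy_flows(prevalence, ges_low, ges_high, mpi):
    """Each test is (sensitivity, specificity).

    Returns {strategy: (% receiving MPI, % receiving ICA, % with oCAD at ICA)}.
    ASSUMPTION: test results are independent given true disease status.
    """
    def by_stratum(diseased):
        def p_positive(test):
            sensitivity, specificity = test
            return sensitivity if diseased else 1.0 - specificity

        above_low = p_positive(ges_low)    # GES above the low threshold
        above_high = p_positive(ges_high)  # GES at or above the high threshold
        mpi_pos = p_positive(mpi)
        middle = above_low - above_high    # intermediate GES, sent to MPI
        # (fraction receiving MPI, fraction receiving ICA) in this stratum
        return {
            "GES alone": (0.0, above_low),
            "MPI alone": (1.0, mpi_pos),
            "2-threshold strategy": (middle, above_high + middle * mpi_pos),
            "1-threshold strategy": (above_low, above_low * mpi_pos),
        }

    sick, well = by_stratum(True), by_stratum(False)
    results = {}
    for name in sick:
        mpi_frac = prevalence * sick[name][0] + (1 - prevalence) * well[name][0]
        ica_true = prevalence * sick[name][1]
        ica_frac = ica_true + (1 - prevalence) * well[name][1]
        results[name] = (round(100 * mpi_frac), round(100 * ica_frac),
                         round(100 * ica_true / ica_frac))
    return results


# Base case inputs from Table I.
base_case = strategy_flows(0.15, ges_low=(0.89, 0.52),
                           ges_high=(0.43, 0.89), mpi=(0.81, 0.65))
for name, row in base_case.items():
    print(name, row)
```

With the base case inputs, the rounded results match the Table II entries, which suggests the independence assumption is at least consistent with the published figures. The function can be called with other sensitivity and specificity pairs to explore different thresholds.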

Figure 2

Baseline results. Per-person costs and QALYs are plotted for each strategy in adults in the US presenting with symptoms suggestive of oCAD. The lines show the ICER relative to the next best strategy. Prevalence of oCAD in the population is assumed to be 15%. Thresholds for GES strategies are ≤15 for low scores and for the 2-threshold strategy, ≥28 for high scores. Costs and QALYs are calculated over the lifetime of the population and are discounted to the present at 3% annually.
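The efficient frontier logic behind this figure (ICER relative to the next best strategy, with weakly dominated strategies removed) can be sketched as follows. The per-person costs and QALYs in the example are hypothetical: the figure's underlying values are not given in the text, so the numbers were chosen only so that the resulting ICERs match the approximately $72,000 and $152,000 reported for the base case. The function name is also hypothetical.

```python
def efficient_frontier(strategies):
    """strategies: {name: (cost, qalys)}.

    Returns [(name, icer)] along the frontier; the cheapest strategy has
    icer None. Each ICER is relative to the previous frontier strategy.
    """
    ordered = sorted(strategies.items(), key=lambda kv: (kv[1][0], -kv[1][1]))
    frontier = []
    for name, (cost, qalys) in ordered:
        # Strong dominance: costs at least as much, yields no more QALYs.
        if frontier and qalys <= frontier[-1][2]:
            continue
        frontier.append((name, cost, qalys))

    def icer(a, b):
        return (b[1] - a[1]) / (b[2] - a[2])

    # Weak (extended) dominance: ICERs must rise along the frontier.
    changed = True
    while changed:
        changed = False
        for i in range(1, len(frontier) - 1):
            if icer(frontier[i - 1], frontier[i]) > icer(frontier[i], frontier[i + 1]):
                del frontier[i]
                changed = True
                break

    result = [(frontier[0][0], None)]
    for previous, current in zip(frontier, frontier[1:]):
        result.append((current[0], icer(previous, current)))
    return result


# HYPOTHETICAL values for illustration only, not taken from the article.
example = {
    "no test": (10000, 10.000),
    "1-threshold": (10640, 10.008),
    "2-threshold": (10720, 10.010),
    "MPI alone": (10920, 10.011),
    "GES alone": (11024, 10.012),
}
for name, value in efficient_frontier(example):
    print(name, None if value is None else round(value))
```

In this illustration the 1-threshold strategy and MPI alone drop out as weakly dominated, leaving "no test", the 2-threshold strategy, and GES alone on the frontier.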

Table II. Diagnostic testing: base case

| Strategy | Percent receiving test*: MPI | Percent receiving test*: ICA | Percent with oCAD at ICA |
|---|---|---|---|
| GES alone | 0 | 54 | 25 |
| MPI alone | 100 | 42 | 29 |
| 2-threshold strategy | 38 | 32 | 37 |
| 1-threshold strategy | 54 | 25 | 43 |

* Results shown are for the diagnostic testing that occurs in the initial stratification of patients for each strategy. Patients who have false-negative test results and later receive ICA upon diagnosis are not included in this analysis. In the base case, thresholds for GES are ≤15 for low scores and for the 2-threshold strategy, ≥28 for high scores. Probability of oCAD = 15%.

Sensitivity to p(oCAD)

Results are sensitive to the underlying patient likelihood of having oCAD. As with all diagnostic tests, total costs rise, and ICERs fall as the pretest p(oCAD) rises, as seen in Figure 3. At the highest disease prevalence tested, p(oCAD) = 25%, the ICER of the 2-threshold strategy falls to approximately $54,000, GES alone to approximately $91,000, and MPI alone is no longer weakly dominated, costing approximately $83,000 per QALY gained relative to the 2-threshold strategy. At p(oCAD) = 20%, ICERs of all efficient frontier strategies fall between those of p(oCAD) of 15% and 25%, but MPI alone remains weakly dominated. At p(oCAD) = 10%, MPI alone and the 1-threshold strategy remain weakly dominated, and the ICERs of the 2-threshold and GES alone strategies rise to approximately $94,000 and $229,000, respectively.

Figure 3

Sensitivity to p(oCAD). Per-person costs and QALYs are plotted for each strategy in adults in the US presenting with symptoms suggestive of oCAD. The lines show the ICER relative to the next best strategy. Prevalence of oCAD in the population is ranged from 10% to 25%. Thresholds for GES strategies are ≤15 for low scores and for the 2-threshold strategy, ≥28 for high scores. Costs and QALYs are calculated over the lifetime of the population and are discounted to the present at 3% annually.

Sensitivity to test costs

MPI and ICA time cost. Our base case included a $100 time cost for MPI. If we ignore that time cost, MPI cost then simply becomes the Medicare reimbursement rate. In this case, MPI remains weakly dominated at p(oCAD) = 10%, but for the baseline value of p(oCAD) of 15%, MPI alone enters the efficient frontier and has an ICER of approximately $140,000 incremental to the 2-threshold strategy. As p(oCAD) increases, MPI alone remains on the efficient frontier, and all strategies have increasingly smaller ICERs. When p(oCAD) reaches 25%, MPI costs approximately $54,000 per QALY gained and weakly dominates both the 1-threshold and 2-threshold strategies, whereas GES alone costs approximately $101,000 per QALY gained incremental to MPI alone. If MPI time cost is doubled to $200, the results importantly differ from the base case only when p(oCAD) = 25%, at which point MPI alone is weakly dominated, as it was at all lower values of p(oCAD) with a time cost of $100.

We included a $200 time cost for ICA in our base case. If ICA time costs are ignored, results are qualitatively unchanged, but ICERs for all strategies decrease very slightly. If ICA time cost is doubled to $400, ICERs for all strategies increase very slightly, but results are qualitatively unchanged until p(oCAD) reaches 25%. In the base case, at p(oCAD) = 25%, MPI alone entered the efficient frontier, but when ICA time cost is increased to $400, MPI alone remains weakly dominated even at this level of oCAD prevalence.

GES cost. Holding other values constant, once the cost of the GES exceeds $1,100, MPI alone enters the efficient frontier. The 2-threshold strategy costs less per QALY gained than MPI alone until the cost of the GES exceeds $1,150, at which point MPI alone weakly dominates the 2-threshold strategy. At lower costs of GES, the ICERs of the 2-threshold strategy and GES alone improve relative to MPI alone, although the 1-threshold strategy remains weakly dominated until the cost of the GES falls to ≤$410.

Sensitivity to GES threshold selection

Baseline threshold values for the GES test were analyzed in the initial assessment of the GES, and the lower threshold (≤15) was validated in a second validation trial [12,34]. These values were based on findings from physician market research, in which physicians indicated they would consider a 20% or lower likelihood of disease to be low risk of oCAD being the cause of a patient’s symptoms.

Another approach identifies optimal thresholds based on p(oCAD) and costs and benefits of treatment (and avoidance of costs and side effects for nondiseased patients) [35,36]. To understand the consequences of threshold selection, we varied the GES cutoff values upward and downward, in units of 4 on the scale of 1 to 40, from the baseline values (≤15 defined as “low” and ≥28 as “high”). The lowest ICERs for strategies involving the GES occurred using thresholds of ≤19 for “low” and ≥32 for “high”. Figure 4 shows the cost-effectiveness results for the various strategies with these new thresholds.

The general ranking of testing strategies remains similar to the base case, but strategies involving the GES have lower ICERs than when using the base-case cutoffs for all values of p(oCAD). The cost and QALY outcomes of GES alone become more similar to those of MPI alone and that strategy weakly or fully dominates MPI alone. At p(oCAD) = 25%, MPI alone remains weakly dominated, rather than entering the efficient frontier as in the base case. Incremental cost-effectiveness ratios for the 2-threshold strategy improve by approximately 5% to 10% and for GES alone by approximately 35% to 40% across all tested values of p(oCAD).

Table III, similarly to Table II, shows the percentage of patients receiving MPI and ICA and positive disease diagnosis yield from ICA with each testing strategy at baseline p(oCAD) of 15%. Shifting to the higher GES thresholds reduces the use of both MPI and ICA in both the 1-threshold and 2-threshold strategies, with the largest effects being reduction of MPI use in the 1-threshold strategy and of ICA in the 2-threshold strategy. Use of MPI in the 2-threshold strategy changes only slightly because the number of GES scores leading to MPI testing remains similar in both analyses. Diagnostic yield at ICA increases for both the 1-threshold and 2-threshold strategies as well.

Figure 4

Revised GES thresholds. Per-person costs and QALYs are plotted for each strategy in adults in the US presenting with symptoms suggestive of oCAD. The lines show the ICER relative to the next best strategy. Prevalence of oCAD in the population is ranged from 10% to 25%. Thresholds for GES strategies are ≤19 for low scores and for the 2-threshold strategy, ≥32 for high scores. Costs and QALYs are calculated over the lifetime of the population and are discounted to the present at 3% annually.

Table III. Diagnostic testing: alternative GES thresholds

                         Percent receiving test*
Strategy                 MPI     ICA     Percent with oCAD at ICA
GES alone                  0      41     31
MPI alone                100      42     29
2-threshold strategy      37      22     49
1-threshold strategy      41      20     51

* Results shown are for the diagnostic testing that occurs in the initial stratification of patients for each strategy. Patients who have false-negative test results and later receive ICA upon diagnosis are not included in this analysis. In this analysis, thresholds for GES are ≤19 for low scores (sensitivity = 84%, specificity = 67%) and for the 2-threshold strategy, ≥32 for high scores (sensitivity = 16%, specificity = 98%). Probability of oCAD = 15%.

Discussion

Our analysis demonstrates that using the GES as a first-line modality for the assessment of suspected oCAD could offer a cost-effective alternative to the usual practice of assessment with MPI. The GES has already been shown to be feasibly incorporated into clinical practice by primary care physicians to rule out oCAD in patients with typical and atypical presentations of chest pain.37,38 As reported in the COMPASS trial, the GES has been shown to accurately exclude oCAD as a cause of the patient’s symptoms in almost 50% of presenting patients, and 6-month follow-up of the study patients showed a major adverse cardiovascular event rate of 0% among patients with low GES.12 Our analysis adds the cost-effectiveness metric to the previously published evidence on the GES.

Generally, at a cost-effectiveness threshold of $100,000 per QALY gained, the 2-threshold strategy was the most cost-effective strategy for the assessment of oCAD in this study. The 2-threshold strategy improved the diagnostic yield of ICA in comparison with MPI alone and GES alone and was a more efficient strategy than the 1-threshold strategy.

In comparison with MPI alone and GES alone, the 2-threshold strategy resulted in fewer patients going to ICA and higher yield at ICA (Tables II and III). In comparison with the 1-threshold strategy, the 2-threshold strategy replaced the diagnostic capabilities of MPI with those of high GES scores for high-risk patients, thus bypassing MPI to go immediately to ICA. This resulted in more testing than in the 1-threshold strategy (32% of patients in the base case receive ICA in the 2-threshold strategy versus 25% in the 1-threshold strategy). However, by sending more patients to ICA, the 2-threshold strategy resulted in fewer false-negatives than did the 1-threshold strategy. The greater number of false-negative test results and their health consequences in the 1-threshold strategy lead to

fewer estimated QALYs. Balancing these differences in outcomes and cost, the 2-threshold strategy weakly dominates the 1-threshold strategy in most scenarios.

As Figures 2 to 4 show, some of these test strategies (particularly MPI alone and the 2-threshold strategy) have similar cost and QALY outcomes. Thus, although MPI alone is often weakly dominated in our analyses, the differences between MPI alone and strategies involving GES are not large and could change with different baseline assumptions. Sensitivity analysis revealed these results to be sensitive to the underlying patient risk of oCAD, with all test strategies being less cost effective at lower probabilities of oCAD. In addition, the cost of GES and the time costs associated with MPI also impact the relative cost effectiveness of each testing strategy. The 2-threshold strategy only remains cost effective in comparison with MPI at GES costs <$1,150.

Our calculated ICERs for all of these tests are generally considered cost effective using current commentary and research regarding the proper ICER threshold values for societal decision making. In this analysis, we use a threshold of $100,000 per QALY gained, as this is a commonly used threshold.39,40 The World Health Organization suggests using 3 times per-capita Gross Domestic Product (GDP) as an upper bound for cost effectiveness.41 The 2012 US per-capita Gross Domestic Product (GDP) of approximately $50,000 equates to a cutoff of $150,000 per QALY gained. One source estimates a threshold upward of $180,000 per QALY gained.42 The 2-threshold strategy meets all of these thresholds for patient populations with oCAD prevalence of ≥10%, and with the improved cutoff criteria (Figure 4), all efficient frontier tests involving the GES have ICER values well <$150,000.

We compared our results with previous estimates of the cost effectiveness of diagnostic testing for oCAD. One previous analysis reported ICERs for exercise Single Photon Emission Computed Tomography (SPECT, i.e., MPI) as compared with “no test” of $27,600 for “mild chest pain, typical angina” and $33,300 for “mild chest pain with atypical angina,” values notably lower than ours.17 Two important factors account for this differential. Those results are reported in 1996 US dollars, and the population modeled in that analysis had a 70% prevalence of CAD, much higher than our base case. Our results are comparable or below their estimates with adjustments to account for these differences. A separate study found that MPI had ICERs in the range of $60,000 to $130,000 incremental to echocardiography in 1999 US dollars in populations with p(oCAD) of 25% to 75%.43 Consumer price index adjusted equivalents are in line with our results.

Limitations of our study largely arise from exclusion of certain testing risks from the model and verification bias occurring in the estimates of MPI accuracy. First, our model did not incorporate the consequences of patient exposure to radiation from MPI. This biases our results, most strongly in favor of the MPI alone strategy and, to a lesser degree, in favor of the 1-threshold strategy and (lesser again) the 2-threshold strategy. Our omission of risks associated with ICA creates biases in favor of those strategies with higher proportions of patients sent to ICA (see Tables II and III).

Because ICA carries inherent risks to patients, almost all studies of MPI accuracy use only patients already referred for ICA based on MPI results, thus study patients have a high pretest p(oCAD). In general, this verification bias will lead to reported estimates of MPI accuracy that overstate sensitivity and understate specificity.44 Given that these biases work in opposite directions, their effect on ICER values and their rank order across our tested strategies cannot be established a priori.

Other limitations include modeling assumptions. We did not include incidence of new oCAD in the model. We also did not incorporate the possibility of retesting with either MPI or GES. We assumed an abnormal MPI would always lead to a referral to ICA, although other work shows this is not always the case.44 Lastly, we assumed revascularization rates upon discovery of oCAD at ICA would be 100%; although they are likely lower; this is a standard assumption in models of oCAD.

Conclusions

Testing strategies involving the GES test for oCAD are cost effective by usual standards in patients presenting to primary care settings with typical and atypical symptoms suggestive of oCAD because the GES test improves patient selection for MPI and ICA. The 2-threshold strategy costs approximately $72,000 per QALY gained in the base case and, at a cost-effectiveness threshold of $100,000 per QALY gained, is the most cost effective among tested strategies in many evaluated scenarios. Strategies involving the GES, particularly the 2-threshold strategy, appear to provide new approaches to diagnosis of oCAD that will generally create health outcomes in a more cost-effective manner compared with current practice involving referral to MPI or other advanced cardiac testing.

References

1. National Ambulatory Medical Care Survey: 2010 Summary Tables. CDC/NCHS, 2013. (Accessed July 8, 2013, at http://www.cdc.gov/nchs/data/ahcd/namcs_summary/2010_namcs_web_tables.pdf.).
2. Woodwell DA, Cherry DK. National Ambulatory Medical Care Survey: 2002 summary. Adv Data 2002;2004:1-44.
3. Cayley Jr WE. Diagnosing the cause of chest pain. Am Fam Physician 2005;72:2012-21.
4. Nuclear Medicine Market Outlook Report. Des Plaines, IL: IMV Medical Information Division, Inc.; 2011.
5. Cardiac Catheterization Lab Market Summary Report. Des Plaines, IL: IMV Medical Information Division, Inc.; 2008.

6. Shaw LJ, Hachamovitch R, Berman DS, et al. The economic consequences of available diagnostic and prognostic strategies for the evaluation of stable angina patients: an observational assessment of the value of precatheterization ischemia. Economics of Noninvasive Diagnosis (END) multicenter study group. J Am Coll Cardiol 1999;33:661-9.
7. An exploratory report of chest pain in primary care. A report from ASPN. J Am Board Fam Pract 1990;3:143-50.
8. Bosner S, Becker A, Haasenritter J, et al. Chest pain in primary care: epidemiology and pre-work-up probabilities. Eur J Gen Pract 2009;15:141-6.
9. Klinkman MS, Stevens D, Gorenflo DW. Episodes of care for chest pain: a preliminary report from MIRNET. Michigan Research Network. J Fam Pract 1994;38:345-52.
10. Cheng VY, Berman DS, Rozanski A, et al. Performance of the traditional age, sex, and angina typicality-based approach for estimating pretest probability of angiographically significant coronary artery disease in patients undergoing coronary computed tomographic angiography: results from the multinational coronary CT angiography evaluation for clinical outcomes: an international multicenter registry (CONFIRM). Circulation 2011;124(2423-32):1-8.
11. Verdon F, Herzig L, Burnand B, et al. Chest pain in daily practice: occurrence, causes and management. Swiss Med Wkly 2008;138:340-7.
12. Thomas GS, Voros S, McPherson JA, et al. A blood-based gene expression test for obstructive coronary artery disease tested in symptomatic nondiabetic patients referred for myocardial perfusion imaging the COMPASS study. Circ Cardiovasc Genet 2013;6:154-62.
13. Gold MR, Siegel JE, Russell LB, et al. Cost-effectiveness in health and medicine. New York: Oxford University Press. 1996.
14. Boden WE, O'Rourke RA, Teo KK, et al. Optimal medical therapy with or without PCI for stable coronary disease. N Engl J Med 2007;356:1503-16.
15. Qaseem A, Fihn SD, Dallas P, et al. Management of stable ischemic heart disease: summary of a clinical practice guideline from the American College of Physicians/American College of Cardiology Foundation/American Heart Association/American Association for Thoracic Surgery/Preventive Cardiovascular Nurses Association/Society of Thoracic Surgeons. Ann Intern Med 2012;157:735-43.
16. Mowatt G, Vale L, Brazzelli M, et al. Systematic review of the effectiveness and cost-effectiveness, and economic evaluation, of myocardial perfusion scintigraphy for the diagnosis and management of angina and myocardial infarction. Health Technol Assess 2004;8:iii-iv, 1-207.
17. Kuntz KM, Fleischmann KE, Hunink MG, et al. Cost-effectiveness of diagnostic strategies for patients with chest pain. Ann Intern Med 1999;130:709-18.
18. Taylor DC, Pandya A, Thompson D, et al. Cost-effectiveness of intensive atorvastatin therapy in secondary cardiovascular prevention in the United Kingdom, Spain, and Germany, based on the Treating to New Targets study. Eur J Health Econ 2009;10:255-65.
19. Metz LD, Beattie M, Hom R, et al. The prognostic value of normal exercise myocardial perfusion imaging and exercise echocardiography: a meta-analysis. J Am Coll Cardiol 2007;49:227-37.
20. Sullivan PW, Lawrence WF, Ghushchyan V. A national catalog of preference-based scores for chronic conditions in the United States. Med Care 2005;43:736-49.
21. Hanmer J, Lawrence WF, Anderson JP, et al. Report of nationally representative values for the noninstitutionalized US adult population for 7 health-related quality-of-life scores. Med Decis Making 2006;26:391-400.
22. Medicare Information for Providers, Partners and Health Care Professionals. Centers for Medicare & Medicaid Services, 2012. (Accessed March 30, 2012, at http://www.cms.gov/home/medicare.asp).
23. Drug Topics Red Book. Montvale, NJ: Thomson Healthcare; 2010.
24. Miyamoto MI, Vernotico SL, Majmundar H, et al. Pharmacologic stress myocardial perfusion imaging: a practical approach. J Nucl Cardiol 2007;14:250-5.
25. Economic News Release, Table B-3. Bureau of Labor Statistics, 2012. (Accessed March 11, 2013, at http://www.bls.gov/news.release/empsit.t19.htm.).
26. Bottner RK, Blankenship JC, Klein LW, et al. Current usage and attitudes among interventional cardiologists regarding the performance of percutaneous coronary intervention (PCI) in the outpatient setting. Catheter Cardiovasc Interv 2005;66:455-61.
27. Jick H, Wilson A, Wiggins P, et al. Comparison of prescription drug costs in the United States and the United Kingdom, part 1: statins. Pharmacotherapy 2012;32:1-6.
28. Heart Protection Study Collaborative Group. Statin cost-effectiveness in the United States for people at different vascular risk levels. Circ Cardiovasc Qual Outcomes 2009;2:65-72.
29. Curtiss FR, Fairman KA. Tough questions about the value of statin therapy for primary prevention: did JUPITER miss the moon? J Manag Care Pharm 2010;16:417-23.
30. Red Book Online. Truven Health Analytics, 2013. (Accessed April 1, 2013, at http://www.redbook.com/redbook/online/.).
31. Epstein AJ, Polsky D, Yang F, et al. Coronary revascularization trends in the United States, 2001-2008. JAMA 2011;305:1769-76.
32. Birkmeyer JD, Gust C, Baser O, et al. Medicare payments for common inpatient procedures: implications for episode-based payment bundling. Health Serv Res 2010;45:1783-95.
33. Arias E. United States life tables, 2006. Natl Vital Stat Rep 2010;58:1-40.
34. Rosenberg S, Elashoff MR, Beineke P, et al. Multicenter validation of the diagnostic accuracy of a blood-based gene expression test for assessing obstructive coronary artery disease in nondiabetic patients. Ann Intern Med 2010;153:425-34.
35. Phelps CE, Mushlin AI. Focusing technology assessment using medical decision theory. Med Decis Making 1988;8:279-89.
36. Felder S, Mayrhofer T. Medical decision making: a health economics primer. Berlin: Springer-Verlag. 2011.
37. Herman L, Froelich J, Kanelos D, et al. Improved diagnostic testing patterns using a genomic-based personalized medicine test among patients presenting to primary care clinicians with symptoms of suspected obstructive coronary artery disease: results from the IMPACT-PCP (Investigation of a Molecular Personalized Coronary Gene Expression Test on Primary Care Practice Pattern) trial. J Am Board Fam Med In Press.
38. Conlin M, Herman L, Mouton M, et al. The use of a personalized gene expression test to improve decision making in the evaluation of patients with suspected coronary artery disease. J Gen Inter Med 2012;27:S540-1.
39. Cutler DM, McClellan M. Is technological change in medicine worth it? Health Aff (Millwood) 2001;20:11-29.
40. Ubel PA, Hirth RA, Chernew ME, et al. What is the price of life and why doesn’t it increase at the rate of inflation? Arch Intern Med 2003;163:1637-41.

41. Murray CJ, Evans DB, Acharya A, et al. Development of WHO guidelines on generalized cost-effectiveness analysis. Health Econ 2000;9:235-51.
42. Braithwaite RS, Meltzer DO, King Jr JT, et al. What does the value of modern medicine say about the $50,000 per quality-adjusted life-year decision rule? Med Care 2008;46:349-56.
43. Garber AM, Solomon NA. Cost-effectiveness of alternative test strategies for the diagnosis of coronary artery disease. Ann Intern Med 1999;130:719-28.
44. Ladapo JA, Blecker S, Elashoff MR, et al. Clinical implications of referral bias in the diagnostic performance of exercise testing for coronary artery disease. Journal of the American Heart Association 2013;2:e000505.

Appendix

The model represents the health states that patients can occupy for 1-year intervals. During each 1-year period, a patient in any of the obstructive CAD states is at risk for nonfatal MI, fatal MI, or noncardiac death. Patients in an obstructive CAD state may not return to the No oCAD state. Patients in the oCAD, Off Tx state have a 10.4% annual probability of being diagnosed (involving an ICA and a revascularization) and moving to the oCAD, On Tx state. Patients in an On Tx state are assumed to be receiving daily drug therapy (a statin, a beta blocker, an ACE inhibitor, and aspirin) in addition to having received revascularization.

Note: ICA = invasive coronary angiography; MI = myocardial infarction; oCAD = obstructive coronary artery disease; Tx = treatment.
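To illustrate what a 10.4% annual transition probability implies over several 1-year cycles, the sketch below computes the cumulative chance that an undiagnosed patient has been diagnosed after a given number of years. It is a simplification and not the published model: it assumes the same probability applies every year and ignores the competing events (MI and death) that the full Markov model includes. The function name is hypothetical.

```python
def cumulative_diagnosis_probability(years, annual_probability=0.104):
    """Probability of having left the 'oCAD, Off Tx' state via diagnosis
    within `years` cycles.

    Assumptions (not from the source): constant annual probability and
    no competing risks such as MI or death.
    """
    if years < 0:
        raise ValueError("years must be non-negative")
    return 1.0 - (1.0 - annual_probability) ** years
```

Under these assumptions, roughly 42% of initially undiagnosed patients would be diagnosed within 5 years.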

Supplementary Figure A1. Markov Model Diagram.

Cost Parameters: Assumptions and Sources

Each entry lists the parameter, its value in 2012 US $, and the assumptions behind it, with source reference numbers in brackets.

MPI procedure cost: 899
  - Medicare reimbursement rates calculated from CMS 2012 fee schedules: CPT codes 78451, 78452, 93015, 93016, APCs 0100, 0377 [22]
  - Assume split for setting of care is 49% hospital outpatient and 51% physician office [4]
  - Assume split for exercise stress vs. pharmacological stress is 50%/50% [24]
  - For pharmacological stress, assume market share split between regadenoson and adenosine is 80%/20% [23]

Invasive coronary angiography procedure cost: 5,757
  - Medicare reimbursement rates calculated from CMS 2012 fee schedules: CPT codes 93454-93461, APC 0080 for outpatient, and DRGs 286-287 for inpatient [22]
  - Assume split between inpatient and outpatient procedures is 50%/50% [5,26]

Medical therapy for oCAD (annual supply): 1,000
  - Assumed as 40 mg of simvastatin, 100 mg metoprolol tartrate, 10 mg lisinopril and 75-325 mg aspirin daily [14,15]
  - Costs estimated from 2013 Red Book Online and published literature [27-30]

Revascularization: 17,169
  - Assume split is 77% PCI and 23% CABG [31]
  - PCI Medicare reimbursement rates calculated from CMS 2012 fee schedules: CPT codes 92980, 92982, 92995, APC 0088, 0104O, 0083O, 0082O, 0656O for outpatient, and DRGs 245-249, 251 for inpatient [22]
  - Assume split between inpatient and outpatient PCI procedures is 75%/25% [26,31]
  - Assume 90% of PCI procedures by patient case include a stent and that 75% of those stent cases include a drug eluting stent [31]
  - CABG costs taken from published literature [32]

Non-fatal MI: 20,231
  - Medicare reimbursement rates calculated from CMS 2012 fee schedules: CPT codes 99223, 99239, 99291, and DRGs 231, 233, 235, 246, 248, 250, 280-282 [22]

Fatal MI: 5,885
  - Medicare reimbursement rates calculated from CMS 2012 fee schedules: CPT codes 99223, 99239, 99291, and DRGs 283-285 [22]

*CABG = coronary artery bypass graft surgery, CMS = Centers for Medicare and Medicaid Services, MI = myocardial infarction, MPI = myocardial perfusion imaging, oCAD = obstructive coronary artery disease, PCI = percutaneous coronary intervention.
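Several of the values above are described as blends of component costs weighted by an assumed split (for example, 77% PCI and 23% CABG). A minimal sketch of that kind of weighted average follows. The component costs are not reported in this table, so the figures in the usage note are purely hypothetical, as is the function name.

```python
def blended_cost(components):
    """Weighted-average cost from (share, cost) pairs.

    `components` is a list of (share, cost) tuples whose shares must
    sum to 1, e.g. a 77%/23% split between two procedures.
    """
    total_share = sum(share for share, _ in components)
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return sum(share * cost for share, cost in components)
```

For instance, with hypothetical costs of $10,000 for one procedure and $30,000 for the other, `blended_cost([(0.77, 10000.0), (0.23, 30000.0)])` gives a blended cost of about $14,600. These inputs are illustrative only and are not the costs used in the analysis.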

TITLE: Gene Expression Profiling for the Diagnosis of Heart Transplant Rejection

AUTHOR: Jeffrey A. Tice M.D.

Assistant Professor of Medicine

Division of General Internal Medicine

Department of Medicine

University of California San Francisco

PUBLISHER: California Technology Assessment Forum

DATE OF PUBLICATION: October 13, 2010

PLACE OF PUBLICATION: San Francisco, CA


GENE EXPRESSION PROFILING FOR THE DIAGNOSIS OF HEART TRANSPLANT REJECTION

A Technology Assessment

INTRODUCTION

The California Technology Assessment Forum (CTAF) has been asked to update its review of the scientific literature on the safety and efficacy of gene expression profiling for the diagnosis of heart transplant rejection. The topic was last reviewed in October 2006. A randomized trial comparing use of the Food and Drug Administration (FDA) approved test, AlloMap, to endomyocardial biopsies was published in 2010.1 AlloMap remains the only FDA approved gene-expression test for heart transplant rejection monitoring, but there is another potential test under development in Canada.2 This update of the CTAF review will focus on AlloMap.

BACKGROUND

Heart transplantation

Outcomes following heart transplant have improved markedly since the procedure was first performed in 1967. Mortality rates in the year following transplant were as high as 44%.3 More recently, one year mortality rates in the Registry of the International Society for Heart and Lung Transplantation (ISHLT) have decreased from approximately 23% in the 1980s to approximately 15% and continue to steadily improve.4 Median survival over the same period has increased from approximately eight years to greater than 11 years.4 Acute transplant rejection is a common problem resulting in significant morbidity and mortality. It accounts for approximately 7% of deaths in the first 30 days following transplant, 12% of deaths from day 31 through one year and approximately 10% from years one through three.4 Acute cellular rejection is the primary form of transplant rejection, although antibody-mediated or humoral rejection also contributes to morbidity in transplant recipients.5-7 Much of the improvement in long term mortality is thought to be due to refinements in the immunosuppressive medications that prevent transplant rejection.8

After heart transplantation, patients are carefully monitored for signs of rejection. The incidence of rejection peaks at about one month after transplant and then rapidly declines.9 Biopsy evidence of rejection usually is present before other signs and symptoms of myocardial compromise. Cardiac transplant rejection is often asymptomatic. Thus, routine surveillance biopsies have been the mainstay of early detection and treatment of acute rejection. The usual surveillance course includes endomyocardial biopsy of the right ventricle weekly for the first month, once or twice monthly for six months, and then on an annual basis. Because late rejection is a rare event, some centers do not perform routine endomyocardial biopsies after one year post-transplant in clinically stable patients.

In 1990, a consensus conference devised a standard system for evaluating rejection in heart biopsy specimens, the International Society for Heart and Lung Transplantation grading system.10 This scale ranges from 0 to 4, with 0 indicating no evidence of rejection and higher scores reflecting greater degrees of lymphocyte infiltration and myocyte necrosis (Table 1). Low grade rejection (ISHLT grade 1A, 1B, or 2) is generally not treated unless there is evidence of a decline in cardiac function. Pulse steroids are usually used to treat higher grades of rejection (ISHLT grade 3A, 3B, or 4) and more aggressive immunosuppressive therapies are reserved for rejection associated with hemodynamic compromise. A revised scale was agreed upon in 2004 and published in 2005,7 but it has not been used in most of the published studies evaluating alternatives to endomyocardial biopsy.

Table 1: ISHLT Grading System for Acute Cellular Rejection

Grade (1990)   Revision (2004)   Mononuclear cell infiltrate            Myocyte injury
0              0                 None                                   Absent
1A             1R                Focal perivascular or interstitial     Absent
1B             1R                Multifocal or diffuse, sparse          Absent
2              1R                Single focus, dense                    Present
3A             2R                Multifocal, dense                      Present
3B             3R                Diffuse and dense                      Present
4              3R                Diffuse and extensive; hemorrhage,     Present
                                 edema, and vascular injury may be
                                 present

Unfortunately, there is a high degree of inter-observer variability in the grading of the biopsy results11-13 and transplant rejection can occur in the setting of apparently normal biopsy results.14 Furthermore, sampling error from random biopsies can miss areas with the most severe/significant rejection. Thus, endomyocardial biopsy is an invasive and imperfect measure of rejection with risks for significant adverse events. The overall risk of complications is low (<2%), but serious complications such as cardiac perforation with tamponade, vascular injury, arrhythmias, and death can occur.15-17 Tricuspid regurgitation, occasionally requiring valve replacement, is a long term complication reported in transplant patients who undergo multiple right ventricular biopsies.18 There is active research attempting to identify less invasive and potentially more accurate methods to identify transplant rejection.

Echocardiographic19-24, electrocardiographic25-27, and magnetic resonance imaging (MRI)28 measures have been studied as early indicators of rejection, but none have proved sensitive or specific enough in validation studies when compared with endomyocardial biopsy. A number of biomarkers have been investigated in the search for a reliable serologic marker for transplant rejection. These include elevated troponin levels29-32, brain natriuretic peptide (BNP)33, 34, C-reactive protein35-38 and soluble interleukin-2 receptor levels39. Recently, gene expression profiling of peripheral blood lymphocytes has generated the most excitement.40-42

Gene expression profiling

Gene expression profiling refers to a number of different technologies that attempt to quantify the relative levels of messenger RNA (mRNA) for large numbers of genes in specific cells or tissues. The goal is to measure differences in the level of transcription (expression) of different genes and utilize patterns of differential gene expression in order to characterize different biological states of the tissue. One potential value of this approach is the identification of genes and gene products associated with a disease process that were not previously known. In cancer biology, the technology has been used to try to differentiate between different subtypes of cancers43-49, to identify tumors with good and bad prognoses43, 48, 50-56, and to identify subgroups of tumors with a high likelihood of responding to one therapeutic regimen compared with another.57, 58 In transplant medicine, the focus has been on profiling gene expression in circulating white blood cells in order to identify early changes in the immune system that correlate with rejection of the transplanted organ.59

The most common approach to gene expression profiling utilizes arrays of deoxyribonucleic acid (DNA) sequences bound to a surface like a glass slide. Often, tens of thousands of DNA sequences are organized on an individual microarray in an attempt to profile all of the 20-30,000 genes in the human genome. DNA from a test sample (tumor, white blood cells, normal tissue) is bound to fluorescent dye. Then, it is exposed to the surface of the microarray. Any sample DNA that matches DNA on the microarray (complementary sequences) is bound to the microarray at a specific location. The remaining sample is then washed away. The amount of DNA binding at each site is measured by the intensity of the fluorescent signal. Since the identity of the DNA at each site on the microarray is known, the degree of fluorescence can be correlated with the relative amount of RNA in the original sample.

Another approach to the measurement of gene expression is known as real-time, reverse-transcriptase polymerase chain reaction (RT-PCR). This approach uses the reverse transcriptase enzyme to generate complementary DNA (cDNA) from the mRNA in a sample. The cDNA is then amplified using PCR. This approach is most commonly used to quantify the relative amounts of a smaller set of genes as it is more precise and reproducible.

Gene expression experiments usually start with microarrays containing many thousands of genes and compare the profiles of tissue with and without certain characteristics in order to identify a smaller subset of genes that differentiate between the two states (rejection/no rejection; metastases/no metastases). This smaller subset of genes is then validated using new patient samples. Additional candidate genes based on known biological associations may also be included.

These experiments generate tens of thousands of data points, but because of the expense of microarrays and the difficulty obtaining appropriate tissue, the number of patients evaluated is often quite low. Much has been written about the statistical dangers of evaluating thousands of predictor variables in small datasets (multiple hypothesis testing, overfitting).60-62 It is essential that any pattern identified by such experiments be independently validated. Unfortunately, excitement about the results from initial experiments has often overwhelmed statistical caution. One recent paper re-evaluated the data from seven gene expression profiles of cancer prognosis and showed that five of them were likely to predict outcome no better than chance.63
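The multiple hypothesis testing problem described above can be made concrete with a short calculation. The sketch below is illustrative only: it assumes every gene is tested independently, that no gene is truly associated with the outcome, and it uses 10,000 genes and a 0.05 significance level as hypothetical inputs. The function names are likewise hypothetical.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of tests significant by chance alone."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests, alpha):
    """Family-wise error rate, assuming independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests, alpha):
    """Per-test significance level that keeps the family-wise rate near alpha."""
    return alpha / n_tests
```

With 10,000 genes tested at 0.05, about 500 would appear significant by chance, which is why independent validation of any identified pattern matters. The Bonferroni correction shown here is one common, conservative adjustment; it is not stated in the source which corrections the cited studies used.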

AlloMap

AlloMap is the only gene expression profile currently commercially available for heart transplant patients. The developers of the diagnostic test hypothesized that peripheral blood mononuclear cells may contain information on the host response to the heart transplant and that this could be detected by measuring gene expression levels in these cells. A complex series of experiments (described below) identified eleven genes that distinguish transplant rejection from quiescence. The assay also measures expression levels of an additional nine housekeeping genes that serve as reference standards. The informative genes are primarily involved in T-cell activation and trafficking, the response to corticosteroids, and hematopoiesis. RT-PCR is used to measure the relative expression of these twenty genes in peripheral blood mononuclear cells. Then a proprietary algorithm is applied to the results to generate a score ranging from 0 to 40. The value of the score is then used to predict the likelihood of rejection. The exact cut-point for low risk of rejection varies depending on the time since the initial transplant. The test is marketed as identifying the absence of acute cellular rejection. Thus, the primary diagnostic test statistic of interest is the negative predictive value of the test.
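Because the negative predictive value depends on how common rejection is as well as on test accuracy, a brief sketch of the standard calculation may help. The formula is the usual Bayes' rule expression; the function name and the input values in the usage note are hypothetical and are not AlloMap performance figures.

```python
def negative_predictive_value(sensitivity, specificity, prevalence):
    """Probability that a patient with a negative (low-score) result
    truly has no rejection."""
    true_negatives = specificity * (1.0 - prevalence)
    false_negatives = (1.0 - sensitivity) * prevalence
    return true_negatives / (true_negatives + false_negatives)
```

As an illustration with made-up inputs, a test with 80% sensitivity and 60% specificity applied where 5% of samples show rejection would have a negative predictive value of about 98%. A low prevalence of rejection by itself pushes the negative predictive value up, so a high value does not necessarily indicate a highly accurate test.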

Technology Assessment (TA)

TA Criterion 1: The technology must have the appropriate regulatory approval.

Until recently, AlloMap and other Gene Expression Profiles were considered laboratory-developed tests, developed by a single clinical laboratory for use only in that laboratory. Laboratory-developed tests are exempt from FDA oversight. AlloMap was available as a laboratory-developed test from the manufacturer's CLIA certified laboratory starting in 2005.

However, on September 7, 2006, the FDA published draft guidance on planned regulation of In Vitro Diagnostic Multivariate Index Assays (IVDMIA). Complex tests combining data from multiple laboratory tests using a complex algorithm, like those derived from gene expression profiles, will be subject to FDA review in the future. In particular, those with direct implications for medical therapy will be considered Class III devices and will be subject to the Pre-Market Approval (PMA) process. AlloMap was approved by the FDA using the 510(k) process and as an IVDMIA in August 2008 for use in heart transplant recipients 15 years of age and older who are at least 55 days post-transplant to aid in the identification of patients with stable allograft function who have a low probability of moderate/severe acute cellular rejection at the time of testing in conjunction with standard clinical assessment.

TA Criterion 1 is met.

TA Criterion 2: The scientific evidence must permit conclusions concerning the effectiveness of the technology regarding health outcomes. For diagnostic tests, there is evidence that use of the test would result in improved medical management in a way that will benefit the patient.

The Medline database, EMBASE, Cochrane clinical trials database, Cochrane reviews database and the Database of Abstracts of Reviews of Effects (DARE) were searched using the key words 'AlloMap,' 'xdx,' 'heart transplantation,' and 'gene expression profiling'. The updated search was performed for the period from January 2006 through September 2010 and identified 136 articles. The bibliographies of systematic reviews and key articles were manually searched for additional references. The abstracts of citations were reviewed for relevance, and all potentially relevant articles were reviewed in full. In order to be included in this systematic review, articles had to compare the results of gene expression profiling using AlloMap with the results of endomyocardial biopsies or describe clinical care guided by AlloMap. Ideally, randomized clinical trials would compare the clinical outcomes of cardiac transplant patients managed using standard monitoring, including endomyocardial biopsy, to those of patients managed using information from gene expression profiling. The search identified six new publications of observational studies64-69 and one randomized trial.1, 70 Of note, three of the studies reported on subsets of patients from the CARGO study and two reported on the same series of patients from the Cleveland Clinic.

Level of evidence: 1, 3

TA Criterion 2 is met.

TA Criterion 3: The technology must improve the net health outcomes. The primary outcomes of interest should be overall survival and intermediate measures such as ejection fraction, New York Heart Association functional status, Minnesota Living with Heart Failure quality of life questionnaire, and the 6-minute walking distance. An alternative method to screen for transplant rejection that significantly reduces the need to perform endomyocardial biopsies, while preserving survival and quality of life, would be a great advance. In the absence of such studies evaluating these outcomes, the sensitivity and specificity of gene expression profiling for the detection of rejection using the pathology results of endomyocardial biopsy specimens as the gold standard may be helpful.

Cardiac Allograft Rejection Gene Expression Observational (CARGO) Study

The Cardiac Allograft Rejection Gene Expression Observational (CARGO) study59 was a complex series of studies designed to develop and validate a parsimonious set of mRNA markers that could be measured in peripheral blood mononuclear cells to predict acute heart transplant rejection. The AlloMap test is based on the results of this study. Patients were enrolled prospectively beginning in September 2001. Slides from each patient were sent to a central pathology department for interpretation by a panel of three pathologists blinded to clinical data, though it is not clear if they were blinded to the original interpretation or to the results of the gene expression profiling. Out of 4917 samples from 629 patients, 827 samples from 273 patients were used in the development and validation of the diagnostic test. Patients were required to be at least five years old, at least 21 days post-transplant and post-rejection therapy, and at least 30 days post blood transfusion. The authors included all samples exhibiting rejection (ISHLT grade ≥ 3A by at least two of four pathologists) and a “representative” sample of ISHLT Grade 0 samples frequency matched on age, sex, race, use of induction center, use of cyclosporine or FK506, time since transplant, and clinical center. There were three phases to the study. Phase One used a custom microarray to evaluate the relative expression of 7,370 genes in 285 samples from 98 patients. Based on prior knowledge from literature reviews and correlation of gene expression levels with biopsy results, a set of 252 genes was selected for further evaluation. Phase Two used quantitative PCR to evaluate the 252 candidate genes in an additional 145 samples from 107 patients (36 grade ≥ 3A rejections and 109 grade 0 quiescent samples). Phases One and Two were not completely independent. A total of 39 samples from 31 patients were used in both Phase One and Phase Two. 
Through a complex series of machine learning algorithms, a linear discriminant classifier was developed that utilized 11 genes. The final algorithm gives a score between 0 and 40, with higher scores reflecting a higher likelihood of transplant rejection. Phase Three validated the classifier in two additional sets of samples. The primary validation used 63 samples from 63 patients not included in Phases One or Two of the study. The secondary validation included these 63 samples (31 grade ≥ 3A; 32 grade 0) and an additional 184 samples, 30 of which were also used in Phase One of the study. The primary objective of the validation study was to test the hypothesis that the diagnostic score distinguishes between quiescence (ISHLT grade 0) and moderate to severe rejection (ISHLT grade ≥ 3A).

This review will focus on the validation study results. For additional details of the methods used in Phase One and Two, please see the original article59 and the supplemental methods that are available online. In the primary validation study, the score from the classifier was significantly higher in samples from patients with at least ISHLT Grade 3A rejection on biopsy compared with samples from patients with Grade 0 rejection (values not reported, p=0.0018). The investigators prospectively defined a score ≥20 as the threshold for rejection. Using this threshold, the test had a sensitivity of 84% (95% CI 66%-94%) and a specificity of 38% (95% CI 22%-56%). The larger secondary validation set gave similar results (sensitivity 76%, specificity 41%). In a post-hoc analysis, the investigators noted that the scores increased with time post-transplant, usually in association with decreasing the intensity of steroid therapy. The investigators suggest that optimal thresholds should be a score of 28 in the period from six months to one year and a score of 30 for patients more than one year post-transplant.
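As a rough check on the precision of these estimates, the sketch below computes Wilson score intervals for the primary validation set. The counts (26 of 31 rejection samples at or above the threshold, 12 of 32 quiescent samples below it) are assumptions inferred from the reported percentages and group sizes, not figures stated by the investigators. The interval method the investigators used is also not stated, so the output should be close to the published intervals but need not match them.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Two-sided Wilson score interval for a binomial proportion (z=1.96 ~ 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# ASSUMED counts, inferred from 84% of 31 and 38% of 32 (not given in the source).
sens_lo, sens_hi = wilson_interval(26, 31)
spec_lo, spec_hi = wilson_interval(12, 32)

print(f"sensitivity 26/31: {sens_lo:.0%} to {sens_hi:.0%}")
print(f"specificity 12/32: {spec_lo:.0%} to {spec_hi:.0%}")
```

With only about 30 samples in each group, either interval spans roughly 25 to 30 percentage points, which is why the point estimates should be read with caution.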

In an unplanned analysis, the investigators then evaluated the performance of the test in a representative set of 281 samples from 166 patients at least one year post-transplant (prevalent population study). This sample was chosen to avoid spectrum bias: the distribution of ISHLT grade in these biopsies was representative of patients at least one year post-transplant. Only nine of the 281 samples had ISHLT scores ≥ 3A. Using a threshold of 30, the test had a positive predictive value of 6.8% and a negative predictive value of 99.6%. It is instructive to note that a test that classifies everyone as not having rejection would have a negative predictive value of 96.8% in this test set. Positive and negative predictive values are very sensitive to the prevalence of disease in the population studied. By focusing on a group of patients with a very low probability of rejection, the investigators could be assured of a very high negative predictive value. This says very little about the clinical utility of the test.
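The dependence of predictive values on prevalence can be illustrated with a short calculation. The sketch below applies Bayes' rule to a fixed sensitivity and specificity across several prevalences. The prevalence of 9/281 comes from the text; the sensitivity and specificity are the primary validation figures at a threshold of 20 and are used here only as illustrative inputs, since the test characteristics at the threshold of 30 are not reported in this review. The other prevalences are hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from test characteristics and prevalence via Bayes' rule."""
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)

# From the text: 9 of 281 samples had ISHLT grade >= 3A.
prevalence = 9 / 281

# A "test" that calls every sample negative has an NPV of 1 - prevalence.
trivial_npv = 1 - prevalence
print(f"NPV of calling everyone negative: {trivial_npv:.1%}")

# Illustrative inputs only: 0.84 and 0.38 are the threshold-20 validation
# figures, not the (unreported) values at the threshold of 30.
# Prevalences other than 9/281 are hypothetical.
for prev in (prevalence, 0.10, 0.30, 0.50):
    ppv, npv = predictive_values(0.84, 0.38, prev)
    print(f"prevalence {prev:.1%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

At low prevalence the NPV is high almost regardless of how well the test performs, which is the point made above about the prevalent population analysis.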

The study offers hope that expression profiling of peripheral blood may be useful in some heart transplant patients. However, the results of the primary validation study, using the a priori threshold of 20, were disappointing. The sensitivity of the test for rejection was 84% (95% CI 66%-94%) and the specificity was only 38% (95% CI 22%-56%). Both estimates had wide confidence intervals reflecting the small sample size in each group. Furthermore, the validation study suffered from significant spectrum bias. Biopsy specimens with ISHLT Grades 1A, 1B, and 2 were excluded from the initial validation study. These grades represent approximately 40% of the specimens used in the prevalent population study described in the paper. Because the AlloMap test measures white blood cell activation, test results are likely to be altered by infection and by the form of immunosuppressive therapy used by the patient. Further studies will need to evaluate the test characteristics of AlloMap in patients with active infection and in patients treated with different immunosuppressive regimens. Finally, in a post-hoc analysis, the investigators realized that a threshold of 20 was too low and that different thresholds were needed for patients at different time points in their post-transplant course. Additional validation studies are needed using thresholds that are defined prior to the start of the study before we can have confidence in the clinical utility of the test.

Six new observational studies

None of the six new observational studies prospectively evaluated the sensitivity and specificity of AlloMap compared with endomyocardial biopsy. Three were additional publications using subsets of patients from the CARGO study exploring associations of the AlloMap score with Grade 1B acute cellular rejection or with the prediction of future rejection.64, 66, 67 Two other studies likely used the same set of patients from the Cleveland Clinic to explore potential associations of the AlloMap score with post-transplant ischemic injury and with coronary allograft vasculopathy.65, 68 The sixth explored associations between the AlloMap score and routinely collected measures in cardiac transplant patients.69 These studies suggest potential additional uses for the AlloMap score, but do not give us any additional information about the negative predictive value of the test or whether it can safely reduce the use of endomyocardial biopsy in the follow-up monitoring of patients who have received a heart transplant.

The first substudy64 using data from the CARGO participants included only patients with blood drawn at least 55 days post-transplant and more than 21 days since treatment for rejection: 265 of the 737 patients from the CARGO study centers. They reported the mean AlloMap scores for patients in each of the acute cellular rejection grades as defined by the three pathologist consensus panel. The scores for patients with severe rejection (Grade ≥ 3A) were significantly higher than the scores for Grades 0, 1A, and 2 (32.0 versus 25.3, 23.8, and 26.9 respectively, p<0.01 for each comparison). However, the AlloMap score for Grade 1B, 29.8, was not significantly lower than that for Grades 3A and higher (p=0.25). The authors suggest that some specimens graded as 1B may represent false negatives with diffuse monocyte infiltration and myocyte injury that is not clinically evident with light microscopy. Using the 2004 revision of the grading system, there was a stepwise increase in the AlloMap score: 25.3 for Grade 0; 26.9 for Grade 1R, and 32.0 for Grades ≥ 2R.

The second CARGO substudy67 was a nested case control study that included patients who were at least 30 days from either transplantation or treatment for rejection and were without signs or symptoms of rejection in the past 30 days. Both cases and controls had Grade 0 or 1A rejection at baseline. Cases were patients who developed severe acute cellular rejection within 12 weeks of their baseline visit. Controls remained free of Grade ≥ 2 rejection for at least 12 weeks and were frequency matched on "demographic and clinical factors" that were not further defined in the paper. There were 39 cases and 65 controls. There was a highly significant difference in the baseline ISHLT biopsy grade (Grade 1A in 69% of cases versus 34% of controls, p=0.006). There was also a significant difference in the AlloMap score at baseline (27.4 versus 23.9, p=0.01) that remained significant after adjusting for race and biopsy grade. In the subset of patients who were less than 180 days post-transplant, the difference was more marked (28.4 versus 22.4, p = 0.0004) and none of the patients who went on to rejection had AlloMap scores less than twenty. The authors suggest that the AlloMap score may be useful in predicting future episodes of acute cellular rejection and in tailoring patients' immunosuppressive medications. However, the results need validation in larger prospective studies.

The same research group expanded on these results using 127 patients from the CARGO study.66 They found that among patients with ISHLT Grade 0 or 1A rejection on endomyocardial biopsy and AlloMap scores ≤ 20, none progressed to severe rejection. Additionally, a gene score ≥ 30 was associated with progression to severe rejection for 58% of patients. They estimated that approximately 44% of heart transplant patients would have gene expression scores ≤ 20 or ≥ 30 during the first six months post-transplant. Again, these results require prospective validation.

The investigators at Cleveland Clinic65, 68 explored the association between the AlloMap score and both post-transplant ischemia and transplant-associated coronary allograft vasculopathy. The AlloMap score was higher in patients with early post-transplant ischemia (31.5, n=19) than in controls without evidence of ischemia (21, n=48, p<0.001). Similarly, the AlloMap score was higher in patients with coronary allograft vasculopathy (32.2, n=20) than in controls without evidence of coronary allograft vasculopathy (26, n=49, p<0.001). Both studies had too few patients to adequately control for potential confounding, but they do suggest possible reasons for false positive AlloMap scores.

Most recently, investigators at Columbia presented an exploratory analysis looking at 35 parameters routinely collected on 76 post-cardiac transplant patients and correlated them with the AlloMap score.69 Using a one-sided p-value, ten of the parameters were correlated with the AlloMap score. Of these, only the platelet count and the corrected QT interval (QTc) from electrocardiography remained significant in multivariate regression. The study looked at a large number of associations with a lenient (one-sided) p-value threshold, so many of the associations that were identified may represent Type I errors. However, the QTc has been associated with acute cellular rejection in prior studies, so this is likely to be a true correlation. It is unclear what impact these findings will have on clinical practice.
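The concern about Type I errors can be made concrete with a short calculation. The sketch below estimates how many spurious associations would be expected when 35 parameters are tested. The per-test significance level of 0.05 is an assumption (the review does not state the level the investigators used), and the calculation treats the tests as independent, which routinely collected clinical measures are unlikely to be.

```python
n_tests = 35   # parameters examined (from the text)
alpha = 0.05   # ASSUMED per-test significance level; not stated in the source

# If no parameter were truly associated with the score and tests were independent:
expected_false_positives = n_tests * alpha
p_at_least_one = 1 - (1 - alpha) ** n_tests

# Bonferroni-adjusted per-test threshold that holds the family-wise error rate at alpha.
bonferroni_alpha = alpha / n_tests

print(f"expected false positives: {expected_false_positives:.2f}")
print(f"probability of at least one: {p_at_least_one:.1%}")
print(f"Bonferroni per-test threshold: {bonferroni_alpha:.4f}")
```

Under these assumptions one or two of the ten reported correlations could arise by chance alone, which is consistent with only two surviving multivariate adjustment.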

The Invasive Monitoring Attenuation Through Gene Expression (IMAGE) Randomized Trial

The Invasive Monitoring Attenuation Through Gene Expression (IMAGE) study was a non-blinded randomized trial at 13 U.S. cardiac transplant centers designed to evaluate whether monitoring heart transplant patients with the AlloMap gene expression score was not inferior to monitoring based on endomyocardial biopsy. Patients were followed for two years after randomization. Non-inferiority required that the upper bound of the one-sided 95% confidence interval for the hazard ratio for the AlloMap group be less than 2.054. The initial inclusion criteria were patients ages 18 years and older with an ejection fraction of at least 45% who were between 12 and 60 months post-transplant and who were without evidence of severe cardiac allograft vasculopathy or antibody-mediated rejection. Patients were excluded if they had current signs or symptoms of cardiac dysfunction, therapy for ISHLT Grade 3A or higher rejection in the past two months, recent changes in immunosuppressive medications, blood transfusion in the past four weeks, or current use of corticosteroids at a dose equivalent to ≥ 20 mg / day of prednisone. The study was active from January 2005 through October 2009. Because of slow enrollment, in November 2007, the entry criteria were expanded to include patients from six to twelve months post-transplant.

The primary endpoint was a composite of a 25% or greater reduction in ejection fraction on echocardiography, rejection with hemodynamic compromise, retransplantation, or death. Patients randomized to the endomyocardial biopsy group received biopsies according to each center's usual protocol and had gene expression testing done at each study visit, but the treating physicians were blinded to the gene expression profiling results. Patients randomized to the AlloMap group had their gene expression profile performed at each rejection surveillance visit. Patients with clinical or echocardiographic evidence of allograft dysfunction also underwent endomyocardial biopsy and other testing according to each center's protocol. If the AlloMap score was greater than or equal to 30, then the patient underwent biopsy. If the biopsy result was ≥ 2R, the patient was treated for acute cellular rejection. Otherwise, the patient continued to receive routine surveillance with gene expression profiling. In November 2005, the protocol was amended to require biopsy for a score of 34 or greater. The original threshold score was based on data from the CARGO study described above. In comments received from the manufacturer, they note that the new test had not yet been used for any prospective patient management decisions. Early data from the prospective IMAGE trial found that the suggested thresholds from CARGO were too low to achieve the desired high negative predictive value. This led to the protocol amendment and highlights the fact that the most appropriate way to use the test for clinical management of patients continues to evolve as better quality studies are performed.

The study randomized 602 subjects and the subjects had a median follow-up of 19 months. Their average age was 54 years and 82% were men. The two groups were well matched except for a lower proportion of black patients in the AlloMap group (8% versus 15%, p=0.01). The two-year rate of the primary outcome was similar in both groups (14.5% AlloMap versus 15.3% biopsy, HR 1.04, 95% CI 0.67 to 1.68). Thus, the study's prespecified criterion for non-inferiority was met. However, there was a higher rate of the primary outcome in black patients (18.3% versus 10.2%, p=0.07). The hazard ratio in the primary analysis was 1.13, 95% CI 0.70 to 1.84, after adjusting for black race. The two-year overall mortality did not differ between the two groups (6.3% versus 5.5%, p=0.82).
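The non-inferiority comparison can be sketched numerically. The code below backs out the standard error of the log hazard ratio from the reported two-sided 95% confidence interval and uses it to approximate the one-sided 95% upper bound that the prespecified criterion refers to. This assumes the log hazard ratio is approximately normally distributed and relies on rounded published figures, so it is an approximation rather than a reproduction of the trial's own analysis.

```python
import math

NI_MARGIN = 2.054               # non-inferiority margin for the hazard ratio (from the text)
hr, lo, hi = 1.04, 0.67, 1.68   # reported unadjusted HR and two-sided 95% CI

# Back out the standard error of log(HR) from the width of the two-sided 95% CI.
se_log_hr = (math.log(hi) - math.log(lo)) / (2 * 1.96)

# Approximate one-sided 95% upper bound (z = 1.645), ASSUMING log(HR) is ~normal.
upper_one_sided = math.exp(math.log(hr) + 1.645 * se_log_hr)

print(f"approx. one-sided 95% upper bound: {upper_one_sided:.2f}")
print("non-inferiority criterion met:", upper_one_sided < NI_MARGIN)
```

Because the one-sided bound is always at or below the two-sided upper limit, the reported limit of 1.68 is already enough to show the margin of 2.054 was not crossed. Note that the margin itself permits up to a doubling of the hazard.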

There were fewer endomyocardial biopsies performed in the AlloMap group (409 versus 1249; 0.5 biopsies per year versus 3.0 biopsies per year, p<0.001). Biopsy-related complications occurred in one patient in the AlloMap group (medication error – formalin given rather than lidocaine at the cannulation site: wound required debridement) and four patients in the biopsy group (two tricuspid valve insufficiency, one symptomatic pericardial effusion, one bleeding episode). Patient satisfaction scores in the AlloMap group increased from 6.86 at baseline to 8.15 at one year and 8.74 at two years. Patient satisfaction scores in the biopsy group remained stable from 6.74 at baseline to 6.64 at one year and 6.66 at two years (p for the difference between groups <0.001 at one and two years). There were no differences on the mental-health summary scores of the short-form 12 (SF-12) quality of life measures at any time point. However the AlloMap group had a lower physical health summary score at one year (44.7 versus 47.3, p=0.03). The difference was no longer significant at two years (45.1 versus 46.2, p=0.52).

It is commendable that the investigators undertook a randomized trial to test the utility of a cardiac transplant management strategy based on the AlloMap gene expression profile despite the difficulties, given that only 3500 heart transplants are done annually in the world and that the patients' lives depend on diagnosing severe rejection early enough to treat it successfully. That said, there are several concerns about the quality of the trial. The baseline imbalance in the distribution of race suggests that allocation concealment was inadequate and that there may have been selection bias during the randomization process. The investigators adjusted for this in their analyses, but the presence of potential selection bias in a randomized trial is concerning. In addition, the study was unblinded. This can impact both outcome identification and final endpoint adjudication. The magnitude of bias introduced by lack of blinding tends to be largest for subjective outcomes and may explain part of the difference in the patient satisfaction scores between the two groups. Lack of blinding can also affect co-interventions, such as the adjustment of patients' immunosuppressive medications and the decision to biopsy based on borderline clinical symptoms. Data in the supplementary appendix to the NEJM article suggest that there were baseline differences in immunosuppression between the two groups, but that these differences diminished through the trial. The definition of a "negative" test continued to evolve in this study. When the test was initially developed the threshold was 20, then it was increased to 28 for patients six to twelve months post-transplant and 30 for patients more than a year post-transplant. In this study, the threshold was increased again to 34. This varying threshold across studies makes it difficult to interpret the entire body of literature on the test and highlights the continued evolution in thinking about the clinical utility of the test. Finally, the confidence intervals surrounding the primary estimate were wide: the point estimate from the adjusted analysis was a 13% increase in the primary outcome with reasonable estimates falling between a 30% decrease and an 84% increase in the risk of death, retransplantation, or significant cardiovascular compromise. If the true value is a 50% increased risk for the above events, many patients and their treating physicians would not choose monitoring with gene expression profiling.

The results should not be generalized to patients six to twelve months from transplantation. These patients represent less than 15% of the sample in the study and none of them had two years follow-up as the protocol modification including them in the trial occurred less than two years before the close of the trial.

TA Criterion 3 is met for patients at least one year post-transplantation.

TA Criterion 4: The technology must be as beneficial as any established alternatives. The established alternative to gene expression profiling is endomyocardial biopsy. Both are used in the context of the patient's clinical history, physical exam findings, and regular echocardiographic evaluation. The right heart catheterization performed as part of the biopsy procedure provides important hemodynamic information, such as cardiac filling pressures and cardiac output, which is also used to guide patient management. These data would not be available to clinicians making therapeutic decisions based on the AlloMap score. The high negative predictive value of the CARGO prevalent population sub-study suggested that patients at least a year post-transplant may be adequately assessed by gene expression profiling, but the majority of biopsies have been performed in transplant patients by one year. In addition, some authors have questioned the need for routine surveillance biopsies in this population and some sites do not routinely perform biopsies in patients who are clinically stable.

The IMAGE trial described above demonstrated non-inferiority of a management strategy guided by gene expression profiling to one guided by endomyocardial biopsy. There were significantly fewer biopsies performed and patients reported higher satisfaction with the gene expression profiling strategy. However, there were many concerns about the trial as detailed under TA criterion 3. The results should not be applied to patients less than one year post-transplantation. Furthermore, patients and treating clinicians need to be informed about the uncertainties surrounding the relative benefits and harms associated with a monitoring strategy that incorporates gene expression profiling. Finally, clinically stable patients may not require routine gene expression profiling or endomyocardial biopsy beyond one year in addition to the usual clinical monitoring for signs and symptoms of rejection as well as routine echocardiography. Given those caveats, the evidence supports gene expression profiling being as beneficial as the current standard, endomyocardial biopsy, when used as part of the routine post-transplant monitoring in stable patients at least one year post-transplant.

TA Criterion 4 is met.

TA Criterion 5: The improvement must be attainable outside the investigational setting.

The large CARGO trial reported data from eight transplant centers and the IMAGE trial from 13 centers, suggesting that the sample collection process can be handled reliably at centers with the expertise to perform heart transplants. All of the tests are required to be performed at one central laboratory. No data was presented on the reliability of the final test itself. Measures of the consistency of the test results when performed on the same sample would be useful.

TA Criterion 5 is met.

CONCLUSION

Heart transplant patients face significant risks for life-threatening rejection, particularly during the first year after transplant. Endomyocardial biopsies are performed according to a strict schedule in order to diagnose significant rejection as early as possible. The search for a less invasive marker of rejection has been a research priority for decades. Gene expression profiling offers the potential for a non-invasive test that may replace endomyocardial biopsy as the gold standard for transplant rejection in stable patients. The AlloMap gene expression profile has a high negative predictive value, but a low positive predictive value. Thus it may be useful to avoid biopsy in stable patients, but the high false positive rate precludes its use to definitively diagnose acute cellular rejection. Endomyocardial biopsies will still need to be performed in all patients with elevated AlloMap scores and all patients with clinical signs of rejection. The IMAGE trial provides data supporting the non-inferiority of a monitoring strategy for heart transplant patients incorporating the AlloMap gene expression profile in lieu of routine endomyocardial biopsy. However, the data only support such strategies in patients more than a year post-transplant. More data are needed to confirm the test's utility earlier in the post-transplant period when the majority of endomyocardial biopsies are performed.

RECOMMENDATION

It is recommended that the use of gene expression profiling meets Technology Assessment Criterion 1 through 5 for safety, effectiveness and improvement in health outcomes when used to manage heart transplant patients at least one year post-transplant.

October 18, 2006 This is the second CTAF review of this technology

The CTAF panel voted to accept the recommendation as presented.


RECOMMENDATIONS OF OTHERS

Blue Cross Blue Shield Association (BCBSA) The BCBSA Technology Evaluation Center (TEC) has not conducted a review of this technology.

Centers for Medicare and Medicaid Services (CMS) At this time CMS does not have a published National Coverage Decision regarding the use of this technology. However, the lab providing the services does have a provider number.

American College of Cardiology California Chapter (CA ACC) The CA ACC was asked to provide an opinion regarding this technology and to have a representative attend the meeting.

International Society of Heart and Lung Transplant (ISHLT) The ISHLT 2010 Guidelines for the Care of Heart Transplant Recipients, Task Force 2: Immunosuppression and Rejection are available at: https://www.ishlt.org/ContentDocuments/ISHLT_GL_Task_Force_2_080510.pdf

ABBREVIATIONS USED IN THIS ASSESSMENT:

CTAF: California Technology Assessment Forum
FDA: Food and Drug Administration
ISHLT: International Society of Heart and Lung Transplant
MRI: Magnetic resonance imaging
BNP: Brain Natriuretic Peptide
mRNA: Messenger RNA
DNA: Deoxyribonucleic acid
RT-PCR: Reverse-transcriptase polymerase chain reaction
cDNA: Complementary DNA
IVDMIA: In Vitro Diagnostic Multivariate Assays
PMA: Pre-Market Approval
DARE: Database of Abstracts of Reviews of Effects
SAM: Statistical Analysis of Microarrays
CARGO: Cardiac Allograft Rejection Gene Expression Observational
IMAGE: Invasive Monitoring Attenuation Through Gene Expression
PCR: Polymerase chain reaction
QTc: Corrected QT interval
SF-12: Short form 12


REFERENCES

1. Pham MX, Teuteberg JJ, Kfoury AG, et al. Gene-expression profiling for rejection surveillance after cardiac transplantation. N Engl J Med. May 20 2010;362(20):1890-1900.

2. Lin D, Hollander Z, Ng RT, et al. Whole blood genomic biomarkers of acute cardiac allograft rejection. J Heart Lung Transplant. Sep 2009;28(9):927-935.

3. Caves PK, Stinson EB, Griepp RB, Rider AK, Dong E, Jr., Shumway NE. Results of 54 cardiac transplants. Surgery. Aug 1973;74(2):307-314.

4. Taylor DO, Edwards LB, Boucek MM, et al. Registry of the International Society for Heart and Lung Transplantation: twenty-third official adult heart transplantation report--2006. J Heart Lung Transplant. Aug 2006;25(8):869-879.

5. Behr TM, Feucht HE, Richter K, et al. Detection of humoral rejection in human cardiac allografts by assessing the capillary deposition of complement fragment C4d in endomyocardial biopsies. J Heart Lung Transplant. Sep 1999;18(9):904-912.

6. Grauhan O, Muller J, v Baeyer H, et al. Treatment of humoral rejection after heart transplantation. J Heart Lung Transplant. Dec 1998;17(12):1184-1194.

7. Stewart S, Winters GL, Fishbein MC, et al. Revision of the 1990 working formulation for the standardization of nomenclature in the diagnosis of heart rejection. J Heart Lung Transplant. Nov 2005;24(11):1710-1720.

8. Patel JK, Kobashigawa JA. Immunosuppression, diagnosis, and treatment of cardiac allograft rejection. Semin Thorac Cardiovasc Surg. Winter 2004;16(4):378-385.

9. Kubo SH, Naftel DC, Mills RM, Jr., et al. Risk factors for late recurrent rejection after heart transplantation: a multiinstitutional, multivariable analysis. Cardiac Transplant Research Database Group. J Heart Lung Transplant. May-Jun 1995;14(3):409-418.

10. Billingham ME, Cary NR, Hammond ME, et al. A working formulation for the standardization of nomenclature in the diagnosis of heart and lung rejection: Heart Rejection Study Group. The International Society for Heart Transplantation. J Heart Transplant. Nov-Dec 1990;9(6):587-593.

11. Marboe CC, Billingham M, Eisen H, et al. Nodular endocardial infiltrates (Quilty lesions) cause significant variability in diagnosis of ISHLT Grade 2 and 3A rejection in cardiac allograft recipients. J Heart Lung Transplant. Jul 2005;24(7 Suppl):S219-226.

12. Winters GL, McManus BM. Consistencies and controversies in the application of the International Society for Heart and Lung Transplantation working formulation for heart transplant biopsy specimens. Rapamycin Cardiac Rejection Treatment Trial Pathologists. J Heart Lung Transplant. Jul 1996;15(7):728-735.

13. Nielsen H, Sorensen FB, Nielsen B, Bagger JP, Thayssen P, Baandrup U. Reproducibility of the acute rejection diagnosis in human cardiac allografts. The Stanford Classification and the International Grading System. J Heart Lung Transplant. Mar-Apr 1993;12(2):239-243.

14. Fishbein MC, Kobashigawa J. Biopsy-negative cardiac transplant rejection: etiology, diagnosis, and therapy. Curr Opin Cardiol. Mar 2004;19(2):166-169.

15. Talwar KK, Varma S, Chopra P, Wasir HS. Endomyocardial biopsy--technical aspects experience and current status. An Indian perspective. Int J Cardiol. Mar 1 1994;43(3):327-334.

16. Cooper DK, Fraser RC, Rose AG, et al. Technique, complications, and clinical value of endomyocardial biopsy in patients with heterotopic heart transplants. Thorax. Oct 1982;37(10):727-731.

17. Sekiguchi M, Take M. World survey of catheter biopsy of the heart. In: Sekiguchi M, Olsen EG, eds. Cardiomyopathy: Clinical, Pathological and Theoretical Aspects. Baltimore: University Park Press; 1980.

18. Alharethi R, Bader F, Kfoury AG, et al. Tricuspid valve replacement after cardiac transplantation. J Heart Lung Transplant. Jan 2006;25(1):48-52.

19. Dandel M, Hummel M, Meyer R, et al. Left ventricular dysfunction during cardiac allograft rejection: early diagnosis, relationship to the histological severity grade, and therapeutic implications. Transplant Proc. Sep 2002;34(6):2169-2173.

20. Moran AM, Lipshultz SE, Rifai N, et al. Non-invasive assessment of rejection in pediatric transplant patients: serologic and echocardiographic prediction of biopsy-proven myocardial rejection. J Heart Lung Transplant. Aug 2000;19(8):756-764.

21. Mankad S, Murali S, Kormos RL, Mandarino WA, Gorcsan J, 3rd. Evaluation of the potential role of color-coded tissue Doppler echocardiography in the detection of allograft rejection in heart transplant recipients. Am Heart J. Oct 1999;138(4 Pt 1):721-730.

22. Moidl R, Chevtchik O, Simon P, et al. Noninvasive monitoring of peak filling rate with acoustic quantification echocardiography accurately detects acute cardiac allograft rejection. J Heart Lung Transplant. Mar 1999;18(3):194-201.

23. StGoar FG, Gibbons R, Schnittger I, Valantine HA, Popp RL. Left ventricular diastolic function. Doppler echocardiographic changes soon after cardiac transplantation. Circulation. Sep 1990;82(3):872-878.

24. Valantine HA, Yeoh TK, Gibbons R, et al. Sensitivity and specificity of diastolic indexes for rejection surveillance: temporal correlation with endomyocardial biopsy. J Heart Lung Transplant. Sep-Oct 1991;10(5 Pt 1):757-765.

25. Bourge R, Eisen H, Hershberger R, et al. Noninvasive rejection monitoring of cardiac transplants using high resolution intramyocardial electrograms: initial US multicenter experience. Pacing Clin Electrophysiol. Nov 1998;21(11 Pt 2):2338-2344.

26. Graceffo MA, O'Rourke RA. Cardiac transplant rejection is associated with a decrease in the high-frequency components of the high-resolution, signal-averaged electrocardiogram. Am Heart J. Oct 1996;132(4):820-826.

27. Hetzer R, Potapov EV, Muller J, et al. Daily noninvasive rejection monitoring improves long-term survival in pediatric heart transplantation. Ann Thorac Surg. Oct 1998;66(4):1343-1349.

28. Marie PY, Angioi M, Carteaux JP, et al. Detection and prediction of acute heart transplant rejection with the myocardial T2 determination provided by a black-blood magnetic resonance imaging sequence. J Am Coll Cardiol. Mar 1 2001;37(3):825-831.

29. Gleissner CA, Klingenberg R, Nottmeyer W, et al. Diagnostic efficiency of rejection monitoring after heart transplantation with cardiac troponin T is improved in specific patient subgroups. Clin Transplant. Jun 2003;17(3):284-291.

30. Mullen JC, Bentley MJ, Scherr KD, et al. Troponin T and I are not reliable markers of cardiac transplant rejection. Eur J Cardiothorac Surg. Aug 2002;22(2):233-237.

31. Dengler TJ, Zimmermann R, Braun K, et al. Elevated serum concentrations of cardiac troponin T in acute allograft rejection after human heart transplantation. J Am Coll Cardiol. Aug 1998;32(2):405-412.

32. Labarrere CA, Nelson DR, Cox CJ, Pitts D, Kirlin P, Halbrook H. Cardiac-specific troponin I levels and risk of coronary artery disease and graft failure following heart transplantation. Jama. Jul 26 2000;284(4):457-464.

33. Claudius I, Lan YT, Chang RK, Wetzel GT, Alejos J. Usefulness of B-type natriuretic peptide as a noninvasive screening tool for cardiac allograft pathology in pediatric heart transplant recipients. Am J Cardiol. Dec 1 2003;92(11):1368-1370.

34. Masters RG, Davies RA, Veinot JP, Hendry PJ, Smith SJ, de Bold AJ. Discoordinate modulation of natriuretic peptides during acute cardiac allograft rejection in humans. Circulation. Jul 20 1999;100(3):287-291.

35. Eisenberg MS, Chen HJ, Warshofsky MK, et al. Elevated levels of plasma C-reactive protein are associated with decreased graft survival in cardiac transplant recipients. Circulation. Oct 24 2000;102(17):2100-2104.

36. Hognestad A, Endresen K, Wergeland R, et al. Plasma C-reactive protein as a marker of cardiac allograft vasculopathy in heart transplant recipients. J Am Coll Cardiol. Aug 6 2003;42(3):477-482.

37. Labarrere CA, Lee JB, Nelson DR, Al-Hassani M, Miller SJ, Pitts DE. C-reactive protein, arterial endothelial activation, and development of transplant coronary artery disease: a prospective study. Lancet. Nov 9 2002;360(9344):1462-1467.

38. Pethig K, Heublein B, Kutschka I, Haverich A. Systemic inflammatory response in cardiac allograft vasculopathy: high-sensitive C-reactive protein is associated with progressive luminal obstruction. Circulation. Nov 7 2000;102(19 Suppl 3):III233-236.

39. Windsor NT, Lloyd KS, Young JB, et al. Dynamics of soluble interleukin-2 receptor levels immediately after heart transplantation. Attenuation of increase by OKT3 therapy. Transplantation. Jul 1991;52(1):78-82.

40. Evans RW, Williams GE, Baron HM, et al. The economic implications of noninvasive molecular testing for cardiac allograft rejection. Am J Transplant. Jun 2005;5(6):1553-1558.

41. Mehra MR. The emergence of genomic and proteomic biomarkers in heart transplantation. J Heart Lung Transplant. Jul 2005;24(7 Suppl):S213-218.

42. Seo D, Ginsburg GS, Goldschmidt-Clermont PJ. Gene expression analysis of cardiovascular diseases: novel insights into biology and clinical applications. J Am Coll Cardiol. Jul 18 2006;48(2):227-235.

43. Bhattacharjee A, Richards WG, Staunton J, et al. Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. Proc Natl Acad Sci U S A. Nov 20 2001;98(24):13790-13795.

44. Blaveri E, Simko JP, Korkola JE, et al. Bladder cancer outcome and subtype classification by gene expression. Clin Cancer Res. Jun 1 2005;11(11):4044-4055.

45. Maxwell GL, Chandramouli GV, Dainty L, et al. Microarray analysis of endometrial carcinomas and mixed mullerian tumors reveals distinct gene expression profiles associated with different histologic types of uterine cancer. Clin Cancer Res. Jun 1 2005;11(11):4056-4066.

46. Sorlie T, Perou CM, Tibshirani R, et al. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci U S A. Sep 11 2001;98(19):10869-10874.

47. Wilson CS, Davidson GS, Martin SB, et al. Gene expression profiling of adult acute myeloid leukemia identifies novel biologic clusters for risk classification and outcome prediction. Blood. Jul 15 2006;108(2):685-696.

48. Yeoh EJ, Ross ME, Shurtleff SA, et al. Classification, subtype discovery, and prediction of outcome in pediatric acute lymphoblastic leukemia by gene expression profiling. Cancer Cell. Mar 2002;1(2):133-143.

49. Alizadeh AA, Eisen MB, Davis RE, et al. Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature. Feb 3 2000;403(6769):503-511.

50. Beer DG, Kardia SL, Huang CC, et al. Gene-expression profiles predict survival of patients with lung adenocarcinoma. Nat Med. Aug 2002;8(8):816-824.

51. Pomeroy SL, Tamayo P, Gaasenbeek M, et al. Prediction of central nervous system embryonal tumour outcome based on gene expression. Nature. Jan 24 2002;415(6870):436-442.

52. Ramaswamy S, Ross KN, Lander ES, Golub TR. A molecular signature of metastasis in primary solid tumors. Nat Genet. Jan 2003;33(1):49-54.

53. Rosenwald A, Staudt LM. Clinical translation of gene expression profiling in lymphomas and leukemias. Semin Oncol. Jun 2002;29(3):258-263.

54. Rosenwald A, Wright G, Chan WC, et al. The use of molecular profiling to predict survival after chemotherapy for diffuse large-B-cell lymphoma. N Engl J Med. Jun 20 2002;346(25):1937-1947.

55. van 't Veer LJ, Dai H, van de Vijver MJ, et al. Expression profiling predicts outcome in breast cancer. Breast Cancer Res. 2003;5(1):57-58.

56. van 't Veer LJ, Dai H, van de Vijver MJ, et al. Gene expression profiling predicts clinical outcome of breast cancer. Nature. Jan 31 2002;415(6871):530-536.

57. Ayers M, Symmans WF, Stec J, et al. Gene expression profiles predict complete pathologic response to neoadjuvant paclitaxel and fluorouracil, doxorubicin, and cyclophosphamide chemotherapy in breast cancer. J Clin Oncol. Jun 15 2004;22(12):2284-2293.

58. Chang JC, Wooten EC, Tsimelzon A, et al. Gene expression profiling for the prediction of therapeutic response to docetaxel in patients with breast cancer. Lancet. Aug 2 2003;362(9381):362-369.

59. Deng MC, Eisen HJ, Mehra MR, et al. Noninvasive discrimination of rejection in cardiac allograft recipients using gene expression profiling. Am J Transplant. Jan 2006;6(1):150-160.

60. Halloran PF, Reeve J, Kaplan B. Lies, damn lies, and statistics: the perils of the P value. Am J Transplant. Jan 2006;6(1):10-11.

61. Ransohoff DF. Discovery-based research and fishing. Gastroenterology. Aug 2003;125(2):290.

62. Ransohoff DF. Rules of evidence for cancer molecular-marker discovery and validation. Nat Rev Cancer. Apr 2004;4(4):309-314.

63. Michiels S, Koscielny S, Hill C. Prediction of cancer outcome with microarrays: a multiple random validation strategy. Lancet. Feb 5-11 2005;365(9458):488-492.

64. Bernstein D, Williams GE, Eisen H, et al. Gene expression profiling distinguishes a molecular signature for grade 1B mild acute cellular rejection in cardiac allograft recipients. J Heart Lung Transplant. Dec 2007;26(12):1270-1280.

65. Yamani MH, Taylor DO, Haire C, Smedira N, Starling RC. Post-transplant ischemic injury is associated with up-regulated AlloMap gene expression. Clin Transplant. Jul-Aug 2007;21(4):523-525.

66. Mehra MR, Kobashigawa JA, Deng MC, et al. Clinical implications and longitudinal alteration of peripheral blood transcriptional signals indicative of future cardiac allograft rejection. J Heart Lung Transplant. Mar 2008;27(3):297-301.

67. Mehra MR, Kobashigawa JA, Deng MC, et al. Transcriptional signals of T-cell and corticosteroid-sensitive genes are associated with future acute cellular rejection in cardiac allografts. J Heart Lung Transplant. Dec 2007;26(12):1255-1263.

68. Yamani MH, Taylor DO, Rodriguez ER, et al. Transplant vasculopathy is associated with increased AlloMap gene expression score. J Heart Lung Transplant. Apr 2007;26(4):403-406.

69. Cadeiras M, Shahzad K, John MM, et al. Relationship between a validated molecular cardiac transplant rejection classifier and routine organ function parameters. Clin Transplant. May 2010;24(3):321-327.

70. Pham MX, Deng MC, Kfoury AG, Teuteberg JJ, Starling RC, Valantine H. Molecular testing for long-term rejection surveillance in heart transplant recipients: design of the Invasive Monitoring Attenuation Through Gene Expression (IMAGE) trial. J Heart Lung Transplant. Aug 2007;26(8):808-814.

TRANS.00025 Laboratory Testing as an Aid in the Diagnosis of Heart Transplant Rejecti... Page 1 of 7

Medical Policy

Subject: Laboratory Testing as an Aid in the Diagnosis of Heart Transplant Rejection Policy #: TRANS.00025 Current Effective Date: 07/07/2015 Status: Reviewed Last Review Date: 05/07/2015

Description/Scope

This document addresses specific noninvasive laboratory tests for the early detection of rejection following a heart transplant. This includes the Heartsbreath test (Menssana Research, Inc. Fort Lee, NJ), which measures the chemical byproducts of allograft rejection and has been investigated to potentially make the process of monitoring heart transplant recipients safer and less complicated. Also addressed in this document is the AlloMap® molecular expression testing (XDx, Inc., South San Francisco, CA) which has also been investigated as a noninvasive method of determining the risk of rejection in heart transplant recipients.

Even with modern drug therapy, rejection remains a constant hazard, and transplant recipients must be tested repeatedly for signs of renewed rejection. Currently, the gold standard to detect heart transplant rejection is endomyocardial biopsy. This is typically performed weekly for the first 6 weeks, biweekly until the third month, monthly to 6 months and then every 1 to 3 months, as indicated.

Position Statement

Medically Necessary:

AlloMap molecular expression testing is considered medically necessary as a non-invasive method of determining the risk of rejection in heart transplant recipients between 1 and 5 years post-transplant.

Investigational and Not Medically Necessary:

Breath testing with the Heartsbreath test is considered investigational and not medically necessary for use as an aid in the diagnosis of heart transplant rejection.

AlloMap molecular expression testing is considered investigational and not medically necessary when the criteria above are not met.
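The position statement above can be read as a simple decision rule. The following is a minimal sketch only, not the payer's adjudication logic: the function names are hypothetical, the 1-year and 5-year endpoints are assumed to be inclusive (the policy says only "between 1 and 5 years"), and no clinical judgment is modeled.

```python
# Illustrative sketch of the position statement; names are hypothetical.
NOT_NECESSARY = "investigational and not medically necessary"


def allomap_position(years_post_transplant: float) -> str:
    """Return the policy position for AlloMap at a given time since transplant.

    Assumption: the 1- and 5-year endpoints are treated as inclusive.
    """
    if 1 <= years_post_transplant <= 5:
        return "medically necessary"
    return NOT_NECESSARY


def heartsbreath_position() -> str:
    """The Heartsbreath test is investigational in all instances under this policy."""
    return NOT_NECESSARY
```

For example, `allomap_position(2)` returns "medically necessary", while `allomap_position(0.5)` falls outside the stated window.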

Rationale

Breath Test

Heartsbreath (Breath test for Grade 3 heart transplant rejection), manufactured by Menssana Research, Inc., (Fort Lee, NJ) received U.S. Food and Drug Administration (FDA) clearance on February 24, 2004 under the Humanitarian Device Exemption (HDE)* program with the following indications for use:

The Heartsbreath test is indicated for use as an aid in the diagnosis of grade 3 heart transplant rejection in patients who have received heart transplants within the preceding year. The Heartsbreath test is intended for use as an adjunct to, and not as a substitute for, endomyocardial biopsy. The use of the device is limited to patients who have had endomyocardial biopsy (EMB) within the previous month (FDA, 2004).

The Heartsbreath test works on the principle that rejection of the transplanted heart is accompanied by oxidative stress that degrades membrane polyunsaturated fatty acids, evolving alkanes and methylalkanes that are excreted in the breath as volatile organic compounds (VOCs). The individual breathes for 2 minutes through a disposable mouthpiece attached to a breath collecting device, which then analyzes the VOCs in alveolar and room air and interprets the values, using a proprietary algorithm to predict the probability of Grade 3 heart transplant rejection.

The Heartsbreath test should not be used for individuals who have received a heart transplant more than 1 year ago, or have a Grade 4 heart transplant rejection, because Heartsbreath has not been evaluated in these groups.

file://dhs.sdc.pvt/HSB/OHPR%20HERC%20Public/DBIssues/2015/2016%20CPT%20cod... 10/6/2015

FDA clearance was based on the results of the Heart Allograft Rejection: Detection with Breath Alkanes in Low Levels (HARDBALL) Study, which was sponsored by the National Heart Lung and Blood Institute. In this 3 year multicenter study, investigators evaluated a new marker of heart transplant rejection, the breath methylated alkane contour (BMAC). In the HARDBALL study, 1061 breath VOC samples were collected from 539 heart transplant recipients at seven sites on the day of scheduled EMB. The gold standard of rejection was the concordant set of International Society for Heart and Lung Transplantation (ISHLT) grades in biopsies read by two cardiac pathologists. Results included concordant biopsies of:

• Grade 0, 645 of 1061 (60.8%);
• Grade 1A, 197 (18.6%);
• Grade 1B, 84 (7.9%);
• Grade 2, 93 (8.8%);
• Grade 3A, 42 (4.0%).

A combination of 9 VOCs in the BMAC identified Grade 3 rejection (sensitivity 78.6%; specificity 62.4%; cross-validated sensitivity 59.5%; cross-validated specificity 58.8%; positive predictive value 5.6%; negative predictive value 97.2%). Site pathologists identified the same cases with sensitivity of 42.4%, specificity 97.0%, positive predictive value 45.2% and negative predictive value 96.7%. The authors concluded that a breath test for markers of oxidative stress was more sensitive and less specific for Grade 3 heart transplant rejection than a biopsy reading by a single on-site pathologist, but the negative predictive values of the two tests were similar. They concluded that a negative screening breath test could potentially identify transplant recipients at low risk of Grade 3 rejection and obviate the need for EMB in this group, reducing the overall number of EMBs performed by an estimated 50% at most (Phillips, 2004).
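To show how the predictive values relate to sensitivity, specificity, and prevalence, the sketch below applies Bayes' rule to the breath-test figures. It assumes the prevalence of Grade 3 rejection is the 42 of 1061 concordant Grade 3A biopsies reported above and that the predictive values were derived from the cross-validated sensitivity and specificity; the source does not state how they were calculated, and the function name is hypothetical.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Assumption: prevalence taken from the concordant Grade 3A biopsies (42 of 1061).
prevalence = 42 / 1061

# Cross-validated sensitivity and specificity reported for the breath test.
ppv, npv = predictive_values(0.595, 0.588, prevalence)
print(f"PPV {ppv:.1%}, NPV {npv:.1%}")  # PPV 5.6%, NPV 97.2%
```

Under these assumptions the calculation reproduces the reported 5.6% and 97.2%. It also illustrates why the negative predictive value is high even for a test of modest accuracy: with a prevalence near 4%, most negative results are true negatives.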

Currently, there is inadequate evidence in the published literature to demonstrate the safety, efficacy, and clinical utility of the Heartsbreath test in the management of rejection surveillance following heart transplant. Large trials are needed to further define the role of this technology and demonstrate how use of this test will impact treatment management.

AlloMap Molecular Expression Testing

The AlloMap molecular expression blood test was developed by XDx, Inc. (South San Francisco, CA). In 2008, the FDA granted 510(k) clearance, as a Class II device, for AlloMap Molecular Expression Testing as an in vitro diagnostic multivariate index assay test service, which assesses the gene expression profile of RNA isolated from peripheral blood mononuclear cells for the following indication:

To aid in the identification of heart transplant recipients with stable allograft function who have a low probability of moderate/severe acute cellular rejection (ACR) at the time of testing in conjunction with standard clinical assessment. AlloMap is indicated for use in heart transplant recipients who are 15 years of age or older and at least 2 months (greater than or equal to 55 days) post-transplantation (FDA, 2008).
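The age and timing limits in the FDA indication can be expressed as a small check. This is a sketch under stated assumptions: the function name is hypothetical, and the clinical requirement of stable allograft function is not modeled because it requires clinical assessment.

```python
def within_allomap_indication(age_years: float, days_post_transplant: int) -> bool:
    """Check only the age and timing limits of the 2008 FDA indication.

    Limits from the indication: 15 years of age or older and at least
    55 days post-transplantation. Stable allograft function is NOT checked.
    """
    return age_years >= 15 and days_post_transplant >= 55
```

A 40-year-old tested 90 days after transplant satisfies both limits; a 12-year-old, or any recipient tested at 30 days, does not.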

The test assesses the expression of 20 genes, about half of which are directly involved in rejection while the remainder provide other information needed for rejection risk assessment. It is hoped the results of this test will decrease the number of necessary EMBs. Among the proposed benefits are the AlloMap test's ability to differentiate mild rejection, for which histologic findings may be the least accurate, and the potential for monitoring physiologic responses to steroid weaning. It has been recognized that the test is not effective at monitoring rejection within the first 6 months of transplantation, and it is yet unclear what a high AlloMap score might mean in the setting of no histologic rejection.

These patterns of gene expression, detected in peripheral blood by the AlloMap testing, were studied in the Cardiac Allograft Rejection Gene Expression Observation Study (CARGO), which included eight U.S. cardiac transplant centers where 650 heart transplant recipients were tested. Results of the CARGO study have appeared in abstracts presented at the 2005 annual meeting of the International Society of Heart and Lung Transplantation. While the results were promising, the data was considered inadequate to permit firm scientific conclusions regarding how use of this test will impact the management of heart transplant recipients (Deng, 2006). There have been subsequent validation studies and sub-study analyses of the CARGO results which provided additional data regarding the potential utility of the AlloMap test in detecting transplant rejection (Bernstein, 2007; Mehra, 2007b; Mehra, 2008).

Results of another trial were published in 2010. The Invasive Monitoring Attenuation through Gene Expression (IMAGE) trial, which was sponsored by the manufacturer of AlloMap (XDx, Inc.), was a randomized, event-driven, noninferiority trial conducted at 13 U.S. transplant centers between January 2005 and October 2009 (with median follow-up of 19 months). This trial included 602 selected transplant recipients who had undergone a transplant more than 6 months prior and who were considered at low risk for rejection. The purpose of this study was to compare rejection outcomes between those who underwent routine EMB and those who were monitored with the AlloMap gene expression profiling test. The primary outcome was the first occurrence of rejection with hemodynamic compromise, graft dysfunction due to other causes, death, or retransplantation. Results indicated that a strategy of monitoring for rejection that involved gene-expression profiling, as compared with routine biopsies, was not associated with an increased risk of serious adverse outcomes and resulted in the performance of significantly fewer biopsies. During the median follow-up period (19 months), subjects who were monitored with AlloMap and those who underwent routine EMB had similar 2-year cumulative rates of the composite primary outcome (14.5% and 15.3%, respectively; hazard ratio with gene-expression profiling, 1.04; 95% confidence interval, 0.67 to 1.68). The 2-year rates of death from any cause were also similar in the two groups (6.3% and 5.5%, respectively; p=0.82). Although the limited power of the study did not allow for firm conclusions regarding the utility of AlloMap as a substitute for EMB, the authors concluded that gene expression profiling of peripheral blood specimens may offer a reasonable alternative to routine biopsies for monitoring cardiac transplant rejection, if the interval since transplantation is at least 6 months and the individual is considered to be at low risk for rejection (Pham, 2010).
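In a noninferiority design, the new strategy is typically accepted if the upper confidence bound of the hazard ratio stays below a prespecified margin. The sketch below illustrates that comparison with the reported interval. The margin value is a hypothetical placeholder, since this policy does not state the margin used in the IMAGE trial, and the function name is likewise an assumption.

```python
def is_noninferior(ci_upper: float, margin: float) -> bool:
    """Noninferiority holds when the upper confidence bound is below the margin."""
    return ci_upper < margin


# Reported hazard ratio and 95% confidence interval for the composite outcome.
hazard_ratio, ci_lower, ci_upper = 1.04, 0.67, 1.68

# Hypothetical margin for illustration only; not taken from the policy text.
ILLUSTRATIVE_MARGIN = 2.0

noninferior = is_noninferior(ci_upper, ILLUSTRATIVE_MARGIN)
# The interval also contains 1.0, so no difference in either direction is shown.
includes_no_difference = ci_lower <= 1.0 <= ci_upper
```

The width of the interval (0.67 to 1.68) is consistent with the policy's caution about the limited power of the study: the data are compatible with both a meaningful reduction and a meaningful increase in risk.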

In 2010, the International Society of Heart and Lung Transplantation (ISHLT) issued guidelines for the care of heart transplant recipients which included the following:

• The standard of care for adult heart transplant recipients is to perform periodic endomyocardial biopsy (EMB) during the first 6-12 months after transplant for rejection surveillance; (Class IIa, Level of Evidence: C)
• After the first year post-transplant, EMB surveillance every 4-6 months is recommended for patients at higher risk of late acute rejection; (Class IIa, Level of Evidence: C)
• Gene expression profiling using the AlloMap test can be used to rule out acute heart rejection (grade 2 or greater) in appropriate low-risk patients between 6 months and 5 years post-transplant (Class IIa, Level of Evidence: B).

The recommendation for AlloMap is based on the results of the CARGO and IMAGE trials (Costanzo, 2010).

In summary, the current ISHLT recommendations for the use of AlloMap in limited clinical protocols, the results of the IMAGE trial, and input from the transplant practice community support the use of AlloMap to assess risk for rejection in clinically stable heart transplant recipients between 1 and 5 years post-transplant.

Background/Overview

Breath Test

Although the current gold standard test for detecting rejection is EMB, this is limited in accuracy, has a high degree of inter-observer variability, and may yield tissue that is not representative of the overall pathology. It is also invasive and can lead to infections, arrhythmias, or ventricular perforation. Despite these limitations, the breath test is currently not established as a substitute for EMB.

*Note: A Humanitarian Use Device (HUD) is a device that has been given special approval by the FDA under the Humanitarian Device Exemption (HDE) regulations and is utilized in special circumstances where a condition is so rare (fewer than 4000 individuals in the U.S. per year) that testing of large numbers of subjects is not feasible. In these special situations, the FDA may grant an HDE provided that: the device does not pose an unreasonable or significant risk of illness or injury; and the probable benefit to health outweighs the risk of injury or illness from its use, taking into account the probable risks and benefits of currently available devices or alternative forms of treatment. Additionally, the FDA notes that the applicant must demonstrate that no comparable devices are available to treat or diagnose the disease or condition, and that they could not otherwise bring the device to market. The labeling for an HUD must state that the device is a Humanitarian Use Device and that, although the device is authorized by federal law, the effectiveness of the device for the specific indication has not been demonstrated. (FDA, 2004)

AlloMap Molecular Expression Testing

The California Technology Assessment Forum (CTAF) conducted a technology assessment of gene-expression profiling for the diagnosis of heart transplant rejection in 2006, at which time it was determined that the use of gene-expression profiling did not meet CTAF criteria when used to manage heart transplant patients. The CTAF assessment stated that:

Gene expression profiling offers the potential for a non-invasive test that may replace endomyocardial biopsy as the gold standard for transplant rejection. However, given the history of poor reproducibility of other gene expression profiles in the recent past, it is prudent to require independent confirmation of the CARGO study results before widespread adoption of the AlloMap gene expression profile to monitor heart transplant patients for early detection of rejection occurs (CTAF, 2006).

This initial CTAF determination was based on concerns around the post-hoc change in the threshold used to define a positive test result in the CARGO study and the small size of this primary validation study, in addition to the fact that there were no studies, published to date, comparing the clinical outcomes of individuals monitored with gene expression profiling to those monitored with EMB (CTAF, 2006).

In 2010, the CTAF conducted a systematic re-review of the available published evidence on the AlloMap test, which included six observational studies and one randomized trial. Three of the publications included in this review reported on subsets of participants from the CARGO study; the review also considered the results of the IMAGE trial. It concluded that, "This technology meets CTAF's assessment criteria for safety, effectiveness and improvement in health outcomes when used to manage heart transplant patients at least one year post-transplant." The CTAF assessment included the following conclusions:

The AlloMap gene expression profile has a high negative predictive value, but a low positive predictive value. Thus, it may be useful to avoid biopsy in stable patients, but the high false positive rate precludes its use to definitively diagnose acute cellular rejection. Endomyocardial biopsies will still need to be performed in all patients with elevated AlloMap scores and all patients with clinical signs of rejection. The IMAGE trial provides data supporting the non-inferiority of a monitoring strategy for heart transplant patients incorporating the AlloMap gene expression profile in lieu of routine endomyocardial biopsy. However, the data only support such strategies in patients more than a year post-transplant. More data are needed to confirm the test's utility earlier in the post-transplant period when the majority of endomyocardial biopsies are performed (CTAF/Tice, 2010).

In 2011, the Blue Cross Blue Shield Association published a Technology Assessment Report of gene expression profiling as a noninvasive method to monitor for cardiac allograft rejection. This review and analysis of the available published evidence concluded that the use of gene expression profiling as a noninvasive method to monitor for cardiac allograft rejection does not meet the TEC criteria. The following are some summarized conclusions:

Although a higher score is associated with a greater likelihood of rejection class 3A or higher, the diagnostic characteristics of AlloMap® testing are uncertain. Study methods are unclear, study samples are incompletely described, numbers of cases of rejection are apparently small, and cutoff scores appear to have been determined post hoc. The sensitivity of the test for detecting rejection is uncertain (TEC, 2011).

Definitions

Allograft rejection: The recipient's immune system rejects the donor heart.

Endomyocardium: The innermost lining of the heart.

Endomyocardial biopsy: A tissue sample of the endomyocardium.

Heart transplant: Removal of a human heart and replacing it with a donor heart.

Coding

The following codes for treatments and procedures applicable to this document are included below for informational purposes. Inclusion or exclusion of a procedure, diagnosis or device code(s) does not constitute or imply member coverage or provider reimbursement policy. Please refer to the member's contract benefits in effect at the time of service to determine coverage or non-coverage of these services as it applies to an individual member.

When services may be Medically Necessary when criteria are met:

CPT
81599 Unlisted multianalyte assay with algorithmic analysis [when specified as AlloMap test]
86849 Unlisted immunology procedure [when specified as AlloMap test]

ICD-9 Diagnosis [For dates of service prior to 10/01/2015]
V42.1 Organ or tissue replaced by transplant, heart

ICD-10 Diagnosis [For dates of service on or after 10/01/2015]
Z48.21 Encounter for aftercare following heart transplant
Z94.1 Heart transplant status
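The date-of-service rule for choosing between the ICD-9 and ICD-10 diagnosis codes listed above can be sketched as follows. The function and constant names are hypothetical, and the sketch covers only the heart transplant diagnosis codes named in this section. As the coding note states, the codes are informational and do not determine coverage.

```python
from datetime import date

# Cutover stated in the policy: ICD-10 applies on or after 10/01/2015.
ICD10_START = date(2015, 10, 1)


def heart_transplant_dx_codes(date_of_service: date):
    """Return (code set, diagnosis codes) applicable on the date of service."""
    if date_of_service < ICD10_START:
        return ("ICD-9", ["V42.1"])
    return ("ICD-10", ["Z48.21", "Z94.1"])
```

A service on September 30, 2015 maps to V42.1; a service on October 1, 2015 or later maps to Z48.21 and Z94.1.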

When services are Investigational and Not Medically Necessary: For the AlloMap test when medically necessary criteria are not met.

When services are also Investigational and Not Medically Necessary: For the procedure code listed below in all instances, or when the code describes a procedure indicated in the Position Statement section as investigational and not medically necessary.

CPT
0085T Breath test for heart transplant rejection [Heartsbreath test]

ICD-9 Diagnosis [For dates of service prior to 10/01/2015]
All diagnoses

ICD-10 Diagnosis [For dates of service on or after 10/01/2015]
All diagnoses

References

Peer Reviewed Publications:

1. Bernstein D, Williams GE, Eisen H, et al. Gene expression profiling distinguishes a molecular signature for grade 1B mild acute cellular rejection in cardiac allograft recipients. J Heart Lung Transplant. 2007; 26(12):1270-1280.
2. Cadeiras M, Shahzad K, John MM, et al. Relationship between a validated molecular cardiac transplant rejection classifier and routine organ function parameters. Clin Transplant. 2010; 24(3):321-327.
3. Cadeiras M, von Bayern M, Sinha A, et al. Noninvasive diagnosis of acute cardiac allograft rejection. Curr Opin Organ Transplant. 2007; 12(5):543-550.
4. Crespo-Leiro MG, Zuckermann A, Bara C, et al. Concordance among pathologists in the second Cardiac Allograft Rejection Gene Expression Observational Study (CARGO II). Transplantation. 2012; 94(11):1172-1177.
5. Deng MC, Eisen HJ, Mehra MR, et al.; CARGO Investigators. Noninvasive discrimination of rejection in cardiac allograft recipients using gene expression profiling. Am J Transplant. 2006; 6(1):150-160.
6. Deng MC, Elashoff B, Pham MX, et al.; IMAGE Study Group. Utility of gene expression profiling score variability to predict clinical events in heart transplant recipients. Transplantation. 2014; 97(6):708-714.
7. Fang KC. Clinical utilities of peripheral blood gene expression profiling in the management of cardiac transplant patients. J Immunol. 2007; 4(3):209-217.
8. Jarcho JA. Fear of rejection--monitoring the heart-transplant recipient. N Engl J Med. 2010; 362(20):1932-1933.
9. Marboe CC, Lal PG, Chu K, et al. Distinctive peripheral blood gene expression profiles in patients forming nodular endocardial infiltrates (Quilty lesions) following heart transplantation. J Heart Lung Transplant. 2005; 24(2 suppl):S97.
10. Mehra MR, Kobashigawa JA, Deng MC, et al.; CARGO Investigators. Transcriptional signals of T-cell and corticosteroid-sensitive genes are associated with future acute cellular rejection in cardiac allografts. J Heart Lung Transplant. 2007b; 26(12):1255-1263.
11. Mehra MR, Kobashigawa JA, Deng MC, et al.; CARGO Investigators. Clinical implications and longitudinal alteration of peripheral blood transcriptional signals indicative of future cardiac allograft rejection. J Heart Lung Transplant. 2008; 27(3):297-301.
12. Mehra MR, Uber PA. Genomic biomarkers and heart transplantation. Heart Fail Clin. 2007a; 3(1):83-86.
13. Pham MX, Deng MC, Kfoury AG, et al. Molecular testing for long-term rejection surveillance in heart transplant recipients: design of the Invasive Monitoring Attenuation through Gene Expression (IMAGE) trial. J Heart Lung Transplant. 2007; 26(8):808-814.
14. Pham MX, Teuteberg JJ, Kfoury AG, et al.; IMAGE Study Group. Gene-expression profiling for rejection surveillance after cardiac transplantation. N Engl J Med. 2010; 362(20):1890-1900.
15. Phillips M, Boehmer JP, Cataneo RN, et al. Heart allograft rejection: detection with breath alkanes in low levels (the HARDBALL study). J Am Coll Cardiol. 2002; 40(1):12-13.
16. Phillips M, Boehmer JP, Cataneo RN, et al. Heart allograft rejection: detection with breath alkanes in low levels (the HARDBALL study). J Heart Lung Transplant. 2004a; 23(6):701-708.
17. Phillips M, Boehmer JP, Cataneo RN, et al. Prediction of heart transplant rejection with a breath test for markers of oxidative stress. Am J Cardiol. 2004b; 94(12):1593-1594.
18. Sobotka PA, Gupta DK, Lansky DM, et al. Breath pentane is a marker of acute cardiac allograft rejection. J Heart Lung Transplant. 1994; 13(2):224-229.
19. Starling RC, Pham M, Valantine H, et al.; Working Group on Molecular Testing in Cardiac Transplantation. Molecular testing in the management of cardiac transplant recipients: initial clinical experience. J Heart Lung Transplant. 2006; 25(12):1389-1395.
20. Strecker T, Rösch J, Weyand M, Agaimy A. Endomyocardial biopsy for monitoring heart transplant patients: 11-years-experience at a German heart center. Int J Clin Exp Pathol. 2013; 6(1):55-65.
21. Yamani MH, Taylor DO, Haire C, et al. Post-transplant ischemic injury is associated with up-regulated AlloMap gene expression. Clin Transplant. 2007a; 21(4):523-525.
22. Yamani MH, Taylor DO, Rodriguez ER, et al. Transplant vasculopathy is associated with increased AlloMap gene expression score. J Heart Lung Transplant. 2007b; 26(4):403-406.

Government Agency, Medical Society, and Other Authoritative Publications:

1. Berry GJ, Burke MM, Andersen C, et al. The 2013 International Society for Heart and Lung Transplantation Working Formulation for the standardization of nomenclature in the pathologic diagnosis of antibody-mediated rejection in heart transplantation. J Heart Lung Transplant. 2013; 32(12):1147-1162.
2. Blue Cross Blue Shield Association. Gene Expression Profiling as a Noninvasive Method to Monitor for Cardiac Allograft Rejection. TEC Assessment, 2011; 26(8).
3. Centers for Medicare and Medicaid Services. National Coverage Determination. Heartsbreath Test for Heart Transplant Rejection. NCD #260.10. December 8, 2008. Available at: http://www.cms.gov/medicare-coverage-database/details/ncd-details.aspx?NCDId=325&ncdver=1&DocID=260.10&from2=index_chapter_list.asp&list_type=&bc=gAAAAAgAAAAAAA%3d%3d&. Accessed on March 18, 2015.
4. Costanzo MR, Dipchand A, Starling R, et al. The International Society of Heart and Lung Transplantation guidelines for the care of heart transplant recipients. J Heart Lung Transplant. 2010; 29(8):914-956. Available at: http://www.jhltonline.org/article/S1053-2498(10)00358-X/abstract. Accessed on March 18, 2015.
5. Francis GS, Greenberg BH, Hsu DT, et al. ACCF/AHA/ACP/HFSA/ISHLT 2010 clinical competence statement on management of patients with advanced heart failure and cardiac transplant: a report of the ACCF/AHA/ACP Task Force on Clinical Competence and Training. Circulation. 2010; 122(6):644-672. Available at: http://circ.ahajournals.org/content/122/6/644.full.pdf. Accessed on March 18, 2015.
6. Society for Cardiovascular Pathology. International Society for Heart and Lung Transplantation (ISHLT) revised grading criteria. 2012.
7. Tice JA. California Technology Assessment Forum (CTAF). Gene expression profiling for the diagnosis of heart transplant rejection. A Technology Assessment. San Francisco, CA: CTAF; October 13, 2010. Available at: http://www.ctaf.org/sites/default/files/assessments/1208_file_AlloMap_2010_W.pdf. Accessed on March 18, 2015.
8. U.S. Food and Drug Administration (FDA). Center for Devices and Radiological Health (CDRH). New Humanitarian Device Approval. Heartsbreath No. H030004. Rockville, MD: FDA. February 24, 2004. Available at: http://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cftopic/pma/pma.cfm?num=H030004. Accessed on March 18, 2015.
9. U.S. Food and Drug Administration (FDA). Center for Devices and Radiological Health (CDRH). 510(k) Substantial Equivalence Determination Decision Summary for AlloMap® Molecular Expression Testing. No. K073482. Rockville, MD: FDA. Nov. 4, 2008. Available at: http://www.accessdata.fda.gov/cdrh_docs/reviews/K073482.pdf. Accessed on March 18, 2015.
10. U.S. Food and Drug Administration (FDA). Center for Devices and Radiological Health. Humanitarian Use Device Exemptions. Available at: http://www.fda.gov/MedicalDevices/ProductsandMedicalProcedures/DeviceApprovalsandClearances/HDEApprovals/ucm161827.htm. Accessed on March 18, 2015.
11. XDx, Inc. A Comparison of AlloMap Molecular Testing and Traditional Biopsy-based Surveillance for Heart Transplant Rejection Early Post-transplantation (EIMAGE). NLM Identifier: NCT00962377. Last updated December 20, 2010. Available at: http://www.clinicaltrials.gov/ct2/show/NCT00962377?term=AlloMap&rank=1. Accessed on March 18, 2015.
12. XDx, Inc. Cardiac Allograft Rejection Gene Expression Observational (CARGO) II STUDY (CARGOII). NLM Identifier: NCT00761787. Last updated March 5, 2009. Available at: http://www.clinicaltrials.gov/ct2/show/NCT00761787?term=AlloMap&rank=2. Accessed on March 18, 2015.
13. XDx, Inc. IMAGE: A Comparison of AlloMap Molecular Testing and Traditional Biopsy-based Surveillance for Heart Transplant Rejection. NLM Identifier: NCT00351559. Last updated November 18, 2009. Available at: http://www.clinicaltrials.gov/ct2/show/NCT00351559?term=00351559&rank=1. Accessed on March 18, 2015.

Index

AlloMap
Breath Test as an Aid for Diagnosis of Heart Transplant Rejection
Gene Expression Molecular Profiling
Heartsbreath

The use of specific product names is illustrative only. It is not intended to be a recommendation of one product over another, and is not intended to represent a complete listing of all products available.

Document History

Status    Date        Action
Reviewed  05/07/2015  Medical Policy & Technology Assessment Committee (MPTAC) review. References were updated.
Reviewed  05/15/2014  MPTAC review. The Rationale and References were updated.
Reviewed  05/09/2013  MPTAC review. References were updated.
Reviewed  05/10/2012  MPTAC review. The Background and References were updated.
Revised   05/19/2011  MPTAC review. The position on AlloMap molecular expression testing has been changed to now consider medically necessary when criteria are met. The Rationale, Background, Coding and Reference sections were updated.
Reviewed  05/13/2010  MPTAC review. The Background and Reference sections were updated.
Reviewed  05/21/2009  MPTAC review. Updated Reference section.
Reviewed  05/15/2008  MPTAC review. References were updated.
          02/21/2008  The phrase "investigational/not medically necessary" was clarified to read "investigational and not medically necessary." This change was approved at the November 29, 2007 MPTAC meeting.
Reviewed  05/17/2007  MPTAC review. Reference section was updated.
Reviewed  06/08/2006  MPTAC review. References were updated and information was added about the CARGO Study of AlloMap testing.
Revised   07/14/2005  MPTAC review. AlloMap® molecular testing added as investigational/not medically necessary.
Revised   04/28/2005  MPTAC review. Revision based on Pre-merger Anthem and Pre-merger WellPoint Harmonization.

Pre-Merger Organizations         Last Review Date  Document Number  Title
Anthem, Inc.                                                        No prior document
WellPoint Health Networks, Inc.  12/02/2004        2.04.32          Breath Test for Use as an Aid in the Diagnosis of Heart Transplant Rejection

file://dhs.sdc.pvt/HSB/OHPR%20HERC%20Public/DBIssues/2015/2016%20CPT%20cod... 10/6/2015 Heart Transplantation Page 1 of 21


www.aetna.com


Heart Transplantation


Number: 0586

Policy

I. Human Heart Transplantation

Aetna considers heart transplantation medically necessary for any of the following conditions (not an all-inclusive list) when the member meets the transplanting institution's protocol eligibility criteria. In the absence of a protocol, Aetna considers heart transplantation medically necessary for heart failure with irreversible underlying etiology, including the following indications when the selection criteria listed below are met and none of the absolute contraindications is present:

◾ Cardiac arrhythmia
◾ Cardiac re-transplantation due to graft failure
◾ Cardiomyopathy due to nutritional, metabolic, hypertrophic or restrictive etiologies
◾ Congenital heart disease
◾ End-stage ventricular failure
◾ Idiopathic dilated cardiomyopathy
◾ Inability to be weaned from temporary cardiac-assist devices after myocardial infarction or non-transplant cardiac surgery
◾ Intractable coronary artery disease
◾ Myocarditis
◾ Post-partum cardiomyopathy
◾ Right ventricular dysplasia/cardiomyopathy
◾ Valvular heart disease.

Selection Criteria for Human Heart Transplantation (for members off protocol, all criteria listed below must be met):

A. New York Heart Association (NYHA) classification of heart failure III or IV (see Note below); this criterion does not apply to pediatric members; and
B. Member has potential for conditioning and rehabilitation after transplant (i.e., member is not moribund); and
C. Life expectancy (in the absence of cardiovascular disease) is greater than 2 years; and
D. No malignancy (except for non-melanomatous skin cancers) or malignancy has been completely resected or (upon individual case review) malignancy has been adequately treated with no substantial likelihood of recurrence with acceptable future risks; and
E. Adequate pulmonary, liver and renal function; and
F. Absence of active infections that are not effectively treated; and
G. Absence of uncontrolled HIV infection, defined as:

1. CD4 count greater than 200 cells/mm3 for greater than 6 months; and
2. HIV-1 RNA (viral load) undetectable; and
3. On stable anti-viral therapy greater than 3 months; and
4. No other complications from AIDS, such as opportunistic infections (e.g., aspergillus, tuberculosis, coccidioidomycosis, resistant fungal infections) or neoplasms (e.g., Kaposi's sarcoma, non-Hodgkin's lymphoma); and

H. Absence of active or recurrent pancreatitis; and
I. Absence of diabetes with severe end-organ damage (neuropathy, nephropathy with declining renal function and proliferative retinopathy); and
J. No uncontrolled and/or untreated psychiatric disorders that interfere with compliance to a strict treatment regimen; and
K. No active alcohol or chemical dependency that interferes with compliance to a strict treatment regimen.

Note: NYHA Class III and Class IV for heart failure are defined as follows:

Class III: Persons with cardiac disease resulting in marked limitation of physical activity. They are comfortable at rest. Less than ordinary activity (i.e., mild exertion) causes fatigue, palpitation, dyspnea, or anginal pain.
Class IV: Persons with cardiac disease resulting in inability to carry on any physical activity without discomfort. Symptoms of cardiac insufficiency or of the anginal syndrome may be present even at rest. If any physical activity is undertaken, discomfort is increased.

Contraindications: Heart transplant is considered not medically necessary for persons with any of the following contraindications:

◾ Presence of irreversible end-organ diseases (e.g., renal, hepatic, pulmonary) (unless person is to undergo dual organ transplantation, e.g., heart-lung, heart-kidney, etc.); or
◾ Presence of severe pulmonary hypertension with irreversibly high pulmonary vascular resistance; or
◾ Presence of a recent intra-cranial cerebrovascular event with significant persistent deficit; or
◾ Presence of bleeding peptic ulcer; or
◾ Presence of hepatitis B antigen; or
◾ Presence of diverticulitis; or
◾ Presence of immediately life-threatening neuromuscular disorders; or
◾ Presence of HIV/AIDS with profound immunosuppression (CD4 count of less than 200 cells/mm3); or
◾ Presence of amyloidosis (although amyloidosis is considered a contraindication to heart transplantation, exceptions may be made in circumstances where curative therapy of amyloidosis has been performed or is planned, e.g., stem cell transplantation in primary amyloidosis, liver transplantation in familial amyloidosis).

II. Xenotransplantation of the Heart

Aetna considers cardiac xenotransplantation (e.g., porcine xenografts) experimental and investigational because its safety and effectiveness have not been established.

III. Left Ventricular Assist Device as Destination Therapy

For Aetna's CPB policy on left ventricular assist devices as destination therapy for persons with severe heart failure, see CPB 0654 - Ventricular Assist Devices.

IV. Total Artificial Heart

Aetna considers the use of a total artificial heart (e.g., ABIOCOR Total Artificial Heart, SynCardia™ temporary Total Artificial Heart (formerly known as CardioWest Total Artificial Heart)) as permanent treatment (destination therapy) (i.e., as an alternative to heart transplantation) experimental and investigational because its safety and effectiveness for this indication have not been established.

Aetna considers a Food and Drug Administration-approved total artificial heart (e.g., CardioWest Total Artificial Heart, SynCardia Systems, Tucson, AZ) medically necessary when used as a bridge to transplant for transplant-eligible members who are awaiting heart transplantation and are at imminent risk of death (NYHA Class IV) due to biventricular failure. See CPB 0654 - Ventricular Assist Devices.

V. Breath Test for Heart Transplant Rejection

Aetna considers the Heartsbreath Test (Menssana Research, Inc, Fort Lee, NJ) experimental and investigational for diagnosing heart transplant rejection and for all other indications because its clinical value has not been established.

VI. AlloMap™ Molecular-Expression Blood Test

Aetna considers the AlloMap gene expression profile medically necessary for monitoring rejection in heart transplant recipients more than six months post-heart transplant.

Aetna considers the AlloMap gene expression profile experimental and investigational for all other indications because its clinical value has not been established.

VII. Cytokine Gene Polymorphism Testing


Aetna considers cytokine gene polymorphism testing experimental and investigational for evaluating graft rejection following heart transplantation because of insufficient evidence.

Background

Heart transplantation has become a commonly used therapeutic option for the treatment of end-stage heart disease. It has been projected that patients who receive cardiac transplants have an in-hospital mortality rate of less than 5 %, a 1-year survival rate of about 85 %, and a 5-year survival rate of 75 % to 80 %. Moreover, 90 % of cardiac transplant patients lead a relatively normal lifestyle having no limitations in their activity and 40 % return to work.

In adults, cardiac transplantation is most frequently performed for patients with cardiomyopathy (about 50 %), coronary artery disease (about 40 %), valvular disease (about 4 %), re-transplantation following a failed primary transplantation (about 2 %) and congenital heart disease (about 2 %).

In children, the most common indications for cardiac transplantation are congenital heart disease (about 47 %), dilated cardiomyopathy (about 45 %), and re-transplantation (about 3 %). Moreover, survival in children with dilated cardiomyopathy relies on accurate diagnosis and aggressive treatment. The literature indicates that patients may respond to conventional treatment for heart failure or may deteriorate, requiring mechanical support. Extracorporeal membrane oxygenation (see CPB 0546 - Extracorporeal Membrane Oxygenation (ECMO)) has been used effectively for mechanical support in children until improvement occurs or as a bridge to transplantation. For individuals who are listed to receive a heart transplant, the mortality rate while waiting for a donor organ averages approximately 20 %. Survival after transplantation is good, with an intermediate survival rate of about 70 %.

The New York Heart Association (NYHA) classification of heart failure is one of the many parameters used for selecting heart recipients. It is a 4-tier system that categorizes patients based on subjective impression of the degree of functional compromise. The 4 NYHA functional classes are as follows:

Class I: Patients with cardiac disease but without resulting limitation of physical activity. Ordinary physical activity does not cause undue fatigue, palpitation, dyspnea, or anginal pain. Symptoms only occur on severe exertion.
Class II: Patients with cardiac disease resulting in slight limitation of physical activity. They are comfortable at rest. Ordinary physical activity (e.g., moderate physical exertion such as carrying shopping bags up several flights of stairs) results in fatigue, palpitation, dyspnea, or anginal pain.
Class III: Patients with cardiac disease resulting in marked limitation of physical activity. They are comfortable at rest. Less than ordinary activity (i.e., mild exertion) causes fatigue, palpitation, dyspnea, or anginal pain.
Class IV: Patients with cardiac disease resulting in inability to carry on any physical activity without discomfort. Symptoms of cardiac insufficiency or of the anginal syndrome may be present even at rest. If any physical activity is undertaken, discomfort is increased.

Contraindications to cardiac transplantation include irreversible end-organ diseases (e.g., renal, hepatic, pulmonary), active malignancy or infections, systemic diseases (e.g., autoimmune, vascular), chronic gastro-intestinal disease (e.g., diverticulitis, active or recurrent pancreatitis, bleeding peptic ulcer), psychiatric disorders, and intra-cranial cerebrovascular disease. Amyloidosis has also been considered a contraindication to cardiac transplantation due to the high likelihood of development of amyloid in the transplanted organ. However, good outcomes of cardiac transplantation have been reported after curative liver transplantation for familial amyloidosis or stem cell transplantation for primary amyloidosis. HIV infection is not an absolute contraindication to cardiac transplantation if the HIV infection is well-controlled. Because of the potential impact of transplant-related immunosuppression, it is especially important for HIV-infected transplant recipients to be followed by an HIV-AIDS multi-disciplinary team with expertise in this area.

Cardiac transplantation is currently the only proven curative treatment for end-stage heart disease, but the supply of donor hearts has not kept pace with the demand. Therefore, surgical techniques such as reduction ventriculoplasty, transmyocardial laser revascularization (see CPB 0163 - Transmyocardial and Endovascular Laser Revascularization), myoreduction operations (see CPB 0182- Ventricular Remodeling Operation (Batista Operation) and Surgical Ventricular Restoration (Dor Procedure)) or dynamic cardiomyoplasty are employed to maintain heart function or provide a bridge to heart transplantation. In addition, ventricular assist devices (see CPB 0654 - Ventricular Assist Devices) and the total artificial heart have been approved by the Food and Drug Administration (FDA) for use as a bridge to transplant in selected persons who are awaiting heart transplantation.

The FDA approval of the CardioWest Total Artificial Heart (TAH) (SynCardia Systems, Inc., Tucson, AZ) as a bridge to heart transplantation in transplant-eligible patients at imminent risk of death from non-reversible biventricular failure was based on the results of a controlled multi-center clinical study that found that such patients who were implanted with the CardioWest TAH did better than similar control patients who underwent emergency cardiac transplantation (SynCardia, 2004; Copeland et al, 2004). In this study, 95 patients were implanted with the CardioWest TAH and 35 patients were controls. Of the 95 patients implanted, 81 met all inclusion criteria and were designated the core implant group. All patients were in NYHA Class IV at time of enrollment. The control group did not receive the TAH but met study inclusion criteria. Both groups were on maximal medical therapy and were at imminent risk of death before a donor heart could be obtained. Treatment success was defined as patients who, at 30 days post-transplant, were (i) alive, (ii) NYHA Class I or II, (iii) not bedridden, (iv) not ventilator dependent, and (v) not requiring dialysis. Treatment success was achieved in 56 (69 %) of the 81 core patients and in 13 (37 %) of the 35 control patients, a difference that was statistically significant (p = 0.0019). There were also statistically significant differences in favor of the core patients with respect to survival to transplant (p = 0.0008) and survival to 30 days post-transplant (p = 0.0018). Of the core patients, 64 of the 81 (79 %) reached transplant after an average of 79 days (range of 1 to 414); whereas 16 of the 35 (46 %) controls reached transplant after an average of 9 days (range of 1 to 44). Fifty-eight (72 %) core patients and 14 (40 %) controls survived to 30 days post-transplant.
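The percentages quoted for the CardioWest study can be checked directly from the patient counts in the paragraph above. The short Python sketch below is only an arithmetic check; the variable and function names are illustrative and do not come from the study, and it does not attempt to reproduce the reported p-values.

```python
# Counts quoted in the text: 81 core implant patients, 35 control patients.
CORE_N, CONTROL_N = 81, 35

# (core count, control count) for each outcome named in the text.
outcomes = {
    "treatment success": (56, 13),
    "survived to transplant": (64, 16),
    "survived 30 days post-transplant": (58, 14),
}


def percent(count, total):
    """Return count/total as a whole-number percentage."""
    return round(100 * count / total)


# Map each outcome to its (core %, control %) pair.
rates = {
    name: (percent(core, CORE_N), percent(control, CONTROL_N))
    for name, (core, control) in outcomes.items()
}

for name, (core_pct, control_pct) in rates.items():
    print(f"{name}: core {core_pct} %, control {control_pct} %")
```

The computed pairs (69/37, 79/46 and 72/40) agree with the figures reported in the text.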

Renlund (2004) explained that a variety of devices can be used as a bridge to heart transplant. The selection of a device depends on the type of heart failure, as well as the size of the patient, the surgeon's experience, and the institutional preference. Implantable left ventricular assist devices, which channel blood from the left ventricle to the pump and back to the aorta, are generally inadequate for bridging to transplantation in patients with severe biventricular heart failure. The replacement of both ventricles with a TAH may be warranted in severe biventricular failure (Renlund, 2004). Such circumstances frequently arise in patients with severe aortic insufficiency, intractable ventricular arrhythmias, an aortic prosthesis, an acquired ventricular septal defect, or irreversible biventricular failure requiring a high pump output. Paracorporeal devices, with the pump placed outside of the body, can provide an alternative to either the ventricular assist device for supporting 1 ventricle, or to the TAH for supporting both ventricles.

Xenotransplantation of the Heart:

The scarcity of donor organs has also resulted in intense research on xenotransplantation. As a consequence of physiological compatibility as well as infectious considerations, the pig is the most likely source of organs for xenotransplantation. The advent of transgenic pigs expressing human complement regulatory proteins and new immunosuppressive therapies have provided early promising results in the laboratory. However, more research is needed to advance porcine xenotransplantation to clinical trials.

The Heartsbreath Test:

Menssana Research, Inc. (Fort Lee, NJ) has received a humanitarian device approval (see note below) for the Heartsbreath Test for evaluation of heart transplant rejection. According to the FDA-approved product labeling, the product is to be used as an aid in diagnosis of grade 3 heart transplant rejection in patients who have received heart transplants within the preceding year (FDA, 2004). The labeling states that the Heartsbreath test is intended to be used as an adjunct to, and not as a substitute for endomyocardial biopsy. The labeling states that the use of the Heartsbreath Test is limited to patients who have had endomyocardial biopsy within the previous month.

The Heartsbreath test assesses heart transplant rejection by measuring the amount of methylated alkanes, a marker of oxidative stress, in the patient's breath. Heart transplant rejection appears to be accompanied by oxidative stress which degrades membrane polyunsaturated fatty acids, creating methylated alkanes, which are excreted in the breath as volatile organic compounds. The value generated by the Heartsbreath Test is compared to the results of a biopsy performed the previous month to measure the probability of the implant being rejected.

According to the FDA (2004), the Heartsbreath test's greatest potential value may be in helping to separate less severe organ rejection (grades 0, 1, and 2) from more severe rejection (grade 3). The FDA-approved labeling states that the Heartsbreath test should not be used for patients who have received a heart transplant more than 1 year ago, or who have grade 4 heart transplant rejection because the Heartsbreath test has not been evaluated in these patients.

The FDA's Humanitarian Device Approval of the Heartsbreath Test was based on the results of a multi-center clinical study entitled Heart Allograft Rejection: Detection with Breath Alkanes in Low Levels (HARDBALL), which compared the sensitivity and specificity of the Heartsbreath Test with myocardial biopsy reading by a single pathologist at the transplant site (usually a general pathologist) in distinguishing grade 3 heart transplant rejection from lesser grades of rejection, using biopsy reading by 2 cardiac pathologists as the gold standard for comparison (Phillips et al, 2004; FDA, 2004). In this study, 1,061 breath samples were collected from 539 heart transplant recipients prior to scheduled endomyocardial biopsy. Compared to the gold standard, the Heartsbreath Test had a sensitivity of 59.5 %, a specificity of 58.8 %, a positive-predictive value of 5.6 % and a negative-predictive value of 97.2 %. The biopsy reading by the general pathologist had a sensitivity of 42.4 %, a specificity of 97.0 %, a positive-predictive value of 45.2 %, and a negative-predictive value of 96.7 %. The investigators concluded that the Heartsbreath Test was more sensitive but less specific for grade 3 heart transplant rejection than a biopsy reading by a single general pathologist, but the negative-predictive values of the 2 tests are similar. Therefore, a screening breath test may provide supportive information to help identify heart transplant recipients who are at low-risk for grade 3 rejections (Phillips et al, 2004).
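The low positive-predictive value and high negative-predictive value reported for the Heartsbreath Test follow from its sensitivity and specificity once grade 3 rejection is rare. The Python sketch below applies Bayes' rule to illustrate this. The prevalence of about 4 % (42 grade 3 biopsies out of 1,061 samples) is an assumption made here by combining figures quoted in this section; it is not a value reported by the study, and the function name is illustrative.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# ASSUMPTION: prevalence of grade 3 rejection taken as 42 of 1,061 samples.
ASSUMED_PREVALENCE = 42 / 1061

# Sensitivity and specificity of the Heartsbreath Test as quoted in the text.
ppv, npv = predictive_values(0.595, 0.588, ASSUMED_PREVALENCE)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

Under this assumed prevalence the sketch gives a positive-predictive value of about 5.6 % and a negative-predictive value of about 97.2 %, in line with the reported figures: when few samples show grade 3 rejection, most positive results are false positives even though nearly all negative results are correct.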

In a report of the HARDBALL study results published in the New England Journal of Medicine, the investigators explained that the major potential benefit of the Heartsbreath test is in reducing the number of heart biopsies (Phillips et al, 2004). If the breath analysis is negative, a biopsy is not needed because, with a negative-predictive value of 97 %, this test accurately predicts the absence of organ rejection. If the breath analysis is positive, however, the patient will need a biopsy to determine whether there is rejection, because the Heartsbreath test, with a positive-predictive value of 6 %, does not accurately predict the presence of rejection. The investigators explained that the low positive-predictive value of this test means that it does not predict the presence of rejection.

A commentary on the HARDBALL study (Williams and Miller, 2002) noted that the study results are “difficult to evaluate” because of a “surprising inconsistency” between the biopsy interpretations of the general pathologist at the transplant site and the biopsy interpretation by the 2 cardiac pathologists used as the gold standard. The commentary also noted that only 9 of 42 biopsies with grade 3 rejection were predicted by the Heartsbreath test. Finally, the commentary stated that there needs to be further study of the effect of concurrent illness, such as hemodynamic compromise and infection, on the Heartsbreath test, because such illnesses could theoretically decrease the sensitivity and specificity of the Heartsbreath or any other test that is a marker of oxidative stress.

The FDA-approved product labeling of the Heartsbreath test states that the effectiveness of this device for diagnosis of grade 3 heart transplant rejection “has not been demonstrated” (FDA, 2004). The FDA, however, approved this device based on the Center for Devices and Radiological Health conclusion that the probable benefit of this test outweighs the risk. The FDA approval also was based on the assumption that this test would not be used as a substitute for a heart biopsy, as has been suggested by the HARDBALL study investigators (Phillips et al, 2004), but to be used as a confirmatory test in combination with myocardial biopsy to detect grade 3 heart transplant rejection (FDA, 2004). The Humanitarian Device Exemption for the Heartsbreath was not referred to the FDA's Clinical Chemistry and Clinical Toxicology Devices Panel for review and recommendation because the Heartsbreath is used as an adjunct to myocardial biopsy rather than replacing myocardial biopsy.

According to the FDA, the major benefit of the Heartsbreath test is that it may reduce the risk of a patient getting the wrong treatment because of an erroneous biopsy report:

The benefits are of 2 kinds: (i) the Heartsbreath test may help identify patients with grade 3 rejections and a false-negative biopsy report, which may help protect them from under-treatment of a life-threatening condition, and (ii) the Heartsbreath test may help identify patients with a false-positive biopsy report who do not have grade 3 rejections, and may help protect them from the hazards of unnecessary treatment with steroids and other immunosuppressant medications.

The FDA states that the major risk of the Heartsbreath Test is a result that conflicts with a biopsy report. According to the FDA, this risk, however, can be minimized by recommending secondary biopsy review of any discordant results by a 2nd pathologist prior to considering any change in treatment.

Note: A Humanitarian Use Device (HUD) is a device that has been given special approval by the FDA under the Humanitarian Device Exemption (HDE) regulations. The standard approval process for devices requires that companies demonstrate that the devices are safe and effective (better than medicine or another procedure). However, the FDA recognizes that sometimes a condition is so unusual that it would be difficult for a company to scientifically demonstrate effectiveness of its device in the large number of patients that usually must be tested. In these special situations, the FDA may grant an HDE provided that: (i) the device does not pose an unreasonable or significant risk of illness or injury; and (ii) the probable benefit to health outweighs the risk of injury or illness from its use, taking into account the probable risks and benefits of currently available devices or alternative forms of treatment.

A HUD may only be used in facilities that have an Institutional Review Board (IRB) to supervise clinical testing of devices and after the IRB has approved the use of the device to treat or diagnose the specific disease.

On December 8, 2008, the Centers for Medicare and Medicaid Services (CMS) issued a decision memorandum in response to a formal request from Menssana Research, Inc., to consider national coverage of the Heartsbreath test as an adjunct to the heart biopsy to detect grade 3 heart transplant rejection in patients who have had a heart transplant within the last year and an endomyocardial biopsy in the prior month. The CMS determined that the evidence does not adequately define the technical characteristics of the test nor demonstrate that Heartsbreath testing to predict heart transplant rejection improves health outcomes.

AlloMap Molecular Expression Blood Test:

The AlloMap molecular expression blood test was developed by XDx Expression Diagnostics. The test evaluates the expression of 20 genes, about half of which are directly involved in rejection while the remainder provide other information needed for rejection risk assessment. It is hoped that the results of this test will reduce the number of endomyocardial biopsies. Among the proposed benefits are the AlloMap test's ability to differentiate mild rejection, for which histological findings may be the least accurate, and the potential for monitoring physiological responses to steroid weaning. It has been recognized that the test is not effective in monitoring rejection within the first 6 months of transplantation, and it is not yet clear what a high AlloMap score might mean in the setting of no histological rejection.

In a multi-center study called CARGO (Cardiac Allograft Rejection Gene Expression Observational study), Deng et al (2006) examined gene expression profiling of peripheral blood mononuclear cells to discriminate International Society of Heart and Lung Transplantation (ISHLT) grade 0 rejection (quiescence) from moderate/severe rejection (ISHLT greater than or equal to 3A). Patients were followed prospectively with blood sampling at post-transplant visits. Biopsies were graded by ISHLT criteria locally and by 3 independent pathologists blinded to clinical data. Known alloimmune pathways and leukocyte microarrays identified 252 candidate genes for which real-time polymerase chain reaction (PCR) assays were developed. An 11-gene real-time PCR test was derived from a training set (n = 145 samples, 107 patients) using linear discriminant analysis, converted into a score (0 to 40), and validated prospectively in an independent set (n = 63 samples, 63 patients). The test distinguished biopsy-defined moderate/severe rejection from quiescence (p = 0.0018) in the validation set, and had agreement of 84 % (95 % confidence interval [CI]: 66 % to 94 %) with grade ISHLT greater than or equal to 3A rejection. Patients over 1 year post-transplant with scores below 30 (approximately 68 % of the study population) are very unlikely to have grade greater than or equal to 3A rejection (negative-predictive value = 99.6 %). Gene expression testing can detect absence of moderate/severe rejection, thus avoiding biopsy in certain clinical settings. The authors concluded that more research is needed to establish the role of molecular testing for prediction of clinical events and management of immunosuppression. Furthermore, an editorial (Halloran et al, 2006) that accompanied the CARGO study questioned the biological plausibility of this technology and emphasized the need for replication of these findings.

In a subsequent study, the investigators from the CARGO study (Starling et al, 2006) provided recommendations regarding the use of the gene expression profiling (GEP) test. However, none of the recommendations received Class I classification and/or Level A evidence.

Candidates for GEP Testing:

Class IIa:

◾ GEP testing can be used in clinically stable cardiac transplant recipients who are 15 years of age or older and 6 months or more post-transplant to identify patients at low-risk for moderate/severe (Grade greater than or equal to 3A/2R) cellular rejection. (Level of Evidence: B)
◾ At the time of GEP testing, a thorough history and physical examination should be obtained/performed by an appropriately trained transplant physician, and a non-invasive assessment of cardiac allograft function utilizing echocardiography should be performed to evaluate allograft function. (Level of Evidence: C)

Class III:

◾ GEP testing should not be used in patients at high-risk for acute rejection or graft failure, including those with (a) signs/symptoms of cardiac allograft dysfunction or hemodynamic compromise (including LVEF less than 40 % and cardiac index less than 2 L/min), (b) recurrent Grade greater than or equal to 3A/2R cellular rejection (greater than or equal to 2 episodes within the past year), or (c) a history of Grade greater than or equal to 3A/2R cellular rejection within the preceding 6 months or antibody-mediated rejection within the preceding 12 months. (Level of Evidence: C)
◾ GEP testing should not be performed in pregnant women, in patients who have had a blood transfusion in the previous 30 days, or in patients who have received hematopoietic growth factors affecting leukocytes within the previous 30 days. (Level of Evidence: C)
◾ GEP testing should not be used to rule out rejection in patients who have received high-dose steroids (intravenous bolus or oral augmentation) within the past 21 days or who are currently on greater than or equal to 20 mg/day of prednisone equivalent. (Level of Evidence: C)
◾ Molecular testing should not be used in patients less than 15 years of age. (Level of Evidence: C)
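The Class IIa and Class III recommendations above read as a set of inclusion and exclusion rules, which can be sketched as a simple screening function. This is an illustrative sketch only: the `Recipient` record, its field names and default values, and the handling of boundary values (for example, counting day 30 as falling within "the previous 30 days") are assumptions made here and are not part of the Starling et al recommendations.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Recipient:
    """Hypothetical patient record; all field names are illustrative."""
    age_years: float = 40.0
    months_post_transplant: float = 12.0
    clinically_stable: bool = True
    lvef_percent: float = 60.0
    cardiac_index: float = 3.0  # units as written in the recommendation
    high_grade_rejections_past_year: int = 0  # Grade >= 3A/2R cellular
    months_since_high_grade_rejection: Optional[float] = None
    months_since_antibody_rejection: Optional[float] = None
    pregnant: bool = False
    days_since_transfusion: Optional[float] = None
    days_since_growth_factor: Optional[float] = None
    days_since_high_dose_steroids: Optional[float] = None
    prednisone_mg_per_day: float = 5.0


def _within(elapsed: Optional[float], window: float) -> bool:
    """True if the event happened (not None) and falls inside the window.

    Treating the boundary as inclusive is an assumption.
    """
    return elapsed is not None and elapsed <= window


def gep_exclusions(r: Recipient) -> List[str]:
    """Return the reasons a recipient falls outside the listed criteria."""
    reasons = []
    if r.age_years < 15:
        reasons.append("younger than 15 years")
    if r.months_post_transplant < 6:
        reasons.append("less than 6 months post-transplant")
    if not r.clinically_stable:
        reasons.append("not clinically stable")
    if r.lvef_percent < 40:
        reasons.append("LVEF below 40 %")
    if r.cardiac_index < 2:
        reasons.append("cardiac index below 2")
    if r.high_grade_rejections_past_year >= 2:
        reasons.append("recurrent high-grade cellular rejection")
    if _within(r.months_since_high_grade_rejection, 6):
        reasons.append("high-grade cellular rejection within 6 months")
    if _within(r.months_since_antibody_rejection, 12):
        reasons.append("antibody-mediated rejection within 12 months")
    if r.pregnant:
        reasons.append("pregnancy")
    if _within(r.days_since_transfusion, 30):
        reasons.append("blood transfusion within 30 days")
    if _within(r.days_since_growth_factor, 30):
        reasons.append("hematopoietic growth factor within 30 days")
    if _within(r.days_since_high_dose_steroids, 21):
        reasons.append("high-dose steroids within 21 days")
    if r.prednisone_mg_per_day >= 20:
        reasons.append("prednisone equivalent of 20 mg/day or more")
    return reasons
```

An empty list means none of the listed exclusions applies; it does not replace the history, physical examination, and echocardiographic assessment that the Class IIa recommendation also calls for.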

Classification of Recommendations:

Class I: Conditions for which there is evidence and/or general agreement that a given procedure/therapy is beneficial, useful, and/or effective.

Class II: Conditions for which there is conflicting evidence and/or a divergence of opinion about the usefulness/efficacy of a procedure/therapy.

Class IIa: Weight of evidence/opinion is in favor of usefulness/efficacy.

Class III: Conditions for which there is evidence and/or general agreement that a procedure/therapy is not useful/effective and in some cases may be harmful.

Level of Evidence:

A: Data are derived from multiple randomized clinical trials or meta-analyses.

B: Data are derived from a single randomized trial, or nonrandomized studies.

C: Only consensus opinion of experts, case studies, or standard of care.

Starling and colleagues (2006) noted that while the performance of the GEP test has been validated in a large number of transplant recipients, the clinical outcomes associated with using a GEP-based strategy to monitor for rejection are currently unknown. A multi-center randomized clinical study is currently underway to assess a GEP-based strategy, compared to a biopsy-based strategy, for evaluating rejection in cardiac transplant patients who are 2 to 5 years post-transplant. This study will examine the impact of these 2 strategies with respect to clinical outcomes (e.g., graft dysfunction, death, and clinically apparent rejection), incidence of biopsy-related complications, quality of life, as well as resource utilization.

The AlloMap was assessed by the California Technology Assessment Forum (CTAF, 2006), which concluded that this technology does not meet CTAF's assessment criteria. The CTAF assessment stated that GEP offers the potential for a non-invasive test that may replace endomyocardial biopsy as the gold standard for transplant rejection. However, given the history of poor reproducibility of other GEP in the recent past, it is prudent to require independent confirmation of the CARGO Study results before widespread adoption of the AlloMap gene expression profile to detect early rejection in cardiac transplant recipients. This is particularly true given the post-hoc change in the threshold used to define a positive test result in the study and the small size of the primary validation study. Additionally, there are no studies published to date comparing the clinical outcomes of patients monitored with GEP to those of patients monitored with endomyocardial biopsies.

A subsequent randomized, controlled study of the AlloMap GEP concluded that, among selected patients who had received a cardiac transplant more than 6 months previously and who were at a low-risk for rejection, a strategy of monitoring for rejection that involved AlloMap GEP, as compared with routine biopsies, was not associated with an increased risk of serious adverse outcomes and resulted in the performance of significantly fewer biopsies. In the Invasive Monitoring Attenuation Through Gene Expression (IMAGE) study (Pham et al, 2010), investigators randomly assigned 602 patients who had undergone cardiac transplantation 6 months to 5 years previously to be monitored for rejection with the use of GEP or with the use of routine endomyocardial biopsies, in addition to clinical and echocardiographic assessment of graft function. The investigators performed a non-inferiority comparison of the 2 approaches with respect to the composite primary outcome of rejection with hemodynamic compromise, graft dysfunction due to other causes, death, or re-transplantation. During a median follow-up period of 19 months, patients who were monitored with GEP and those who underwent routine biopsies had similar 2-year cumulative rates of the composite primary outcome (14.5 % and 15.3 %, respectively; hazard ratio with GEP, 1.04; 95 % CI: 0.67 to 1.68). The 2-year rates of death from any cause were also similar in the 2 groups (6.3 % and 5.5 %, respectively; p = 0.82). Patients who were monitored with the use of GEP underwent fewer biopsies per person-year of follow-up than did patients who were monitored with the use of endomyocardial biopsies (0.5 versus 3.0, p < 0.001).

An editorial accompanying the IMAGE trial (Jarcho, 2010) commented that the most notable implication of the IMAGE trial may be the evidence it offers that calls into question the importance of any form of routine screening for the early detection of rejection in the longer term after transplantation. The editorialist explained that, of 34 rejection episodes identified in the GEP group in the trial, only 6 were detected solely on the basis of the GEP test. All other episodes of rejection were associated with clinical manifestations of heart failure or echocardiographic evidence of allograft dysfunction. "This observation suggests that, even if rejection is not identified until graft dysfunction is present, the clinical outcomes may not be substantially worse than when rejection is detected early." Other limitations of the trial include the fact that the investigators only enrolled patients who had undergone transplantation at least 6 months previously, a group that was at much lower risk of rejection than patients within 6 months of transplantation. In addition, the non-inferiority margin was wide; the actual 95 % CI was consistent with as much as a 68 % increase in risk with the GEP strategy.
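The "68 % increase in risk" follows directly from the upper bound of the reported hazard ratio interval (1.04; 95 % CI: 0.67 to 1.68). A minimal sketch of that arithmetic is given here; the function and variable names are illustrative assumptions, and only the three hazard ratio values come from the text.

```python
def percent_change_in_hazard(hazard_ratio: float) -> float:
    """Express a hazard ratio as a percent change relative to the comparator."""
    return round((hazard_ratio - 1.0) * 100.0, 1)


# Hazard ratio with GEP versus routine biopsy, as reported for the IMAGE trial.
image_hr = {"point": 1.04, "lower": 0.67, "upper": 1.68}

# Negative values indicate a lower hazard with GEP, positive values a higher one.
changes = {name: percent_change_in_hazard(hr) for name, hr in image_hr.items()}
```

Read this way, the interval spans from roughly a one-third reduction in hazard to a 68 % increase, which is why the editorial describes the result as compatible with a substantially worse outcome under the GEP strategy.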

A re-assessment of the AlloMap by the California Technology Assessment Forum (Tice, 2010), considering the results of the IMAGE trial, concluded that this technology meets CTAF's assessment criteria. The CTAF assessment stated that the AlloMap GEP has a high negative-predictive value, but a low positive-predictive value. Thus, it may be useful to avoid biopsy in stable patients, but the high false-positive rate precludes its use to definitively diagnose acute cellular rejection. The assessment states that endomyocardial biopsies will still need to be performed in all patients with elevated AlloMap scores and all patients with clinical signs of rejection. CTAF found that the IMAGE trial provides data supporting the non-inferiority of a monitoring strategy for heart transplant patients incorporating the AlloMap GEP in lieu of routine endomyocardial biopsy. However, the data only support such strategies in patients more than 1 year post-transplant. CTAF stated that more data are needed to confirm the test's utility earlier in the post-transplant period when the majority of endomyocardial biopsies are performed.
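The combination of a high negative-predictive value with a low positive-predictive value is what one would expect of a screening test applied where the target condition is uncommon. The sketch below illustrates this with Bayes' rule. The sensitivity, specificity, and prevalence figures are hypothetical values chosen for illustration; they are not AlloMap performance data and do not appear in the CTAF assessment.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical inputs (assumptions, not study data): a moderately accurate
# test used in a population where only 3 % of samples show rejection.
ppv, npv = predictive_values(sensitivity=0.80, specificity=0.70, prevalence=0.03)
```

With these assumed inputs the PPV comes out below 10 % while the NPV exceeds 99 %, so a negative result is reassuring but a positive result still requires confirmation by biopsy.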

Mehra and Uber (2007) stated that clinicians have entered a new era for managing heart transplant recipients with the use of multi-marker GEP. Early after transplantation, when steroid modification is the main concern, gene expression testing might aid in optimizing the balance of immunosuppression, defraying the occurrence of rejection, and avoiding crisis intervention. Late after transplantation, the reliance on endomyocardial biopsy could be reduced. These advances, if continually validated in practice, could result in decreased immunosuppression complications, lesser need for invasive surveillance, and more clinical confidence in immunosuppressive strategies.

Total Artificial Heart:

Slepian et al (2013) stated that the SynCardia total artificial heart (TAH; SynCardia Systems Inc., Tucson, AZ) is the only FDA-approved TAH in the world. The SynCardia™ TAH is a pneumatically driven, pulsatile system capable of flows of greater than 9 L/min. The TAH is indicated for use as a bridge to transplantation (BTT) in patients at imminent risk of death from non-reversible bi-ventricular failure. In the pivotal U.S. approval trial the TAH achieved a BTT rate of greater than 79 %. Recently a multi-center, post-market approval study similarly demonstrated a comparable BTT rate. A major milestone was recently achieved for the TAH, with over 1,100 TAHs having been implanted to date, with the bulk of implantation occurring at an ever increasing rate in the past few years. The TAH is most commonly utilized to save the lives of patients dying from end-stage bi-ventricular heart failure associated with ischemic or non-ischemic dilated cardiomyopathy. Beyond progressive chronic heart failure, the TAH has demonstrated great efficacy in supporting patients with acute irreversible heart failure associated with massive acute myocardial infarction. In recent years several diverse clinical scenarios have also proven to be well served by the TAH, including severe heart failure associated with advanced congenital heart disease, failed or burned-out transplants, infiltrative and restrictive cardiomyopathies, and failed ventricular assist devices. Looking to the future, a major unmet need remains in providing total heart support for children and small adults. As such, the present TAH design must be scaled to fit the smaller patient, while providing equivalent, if not superior, flow characteristics, shear profiles and overall device thrombogenicity. To aid in the development of a new "pediatric" TAH, an engineering methodology known as "Device Thrombogenicity Emulation" (DTE), which these researchers have recently developed and described, is being employed.
Recently, to further their engineering understanding of the TAH, as steps towards next generation designs, these investigators had: (i) assessed the degree of platelet reactivity induced by the present clinical 70 cc TAH using a closed loop platelet activity state assay, (ii) modeled the motion of the TAH pulsatile mobile diaphragm, and (iii) performed fluid-structure interactions and assessment of the flow behavior through inflow and outflow regions of the TAH fitted with modern bi-leaflet heart valves. Developing a range of TAH devices will afford bi-ventricular replacement therapy to a wide range of patients, for both short- and long-term therapy.

Cytokine Gene Polymorphism Testing:

Yongcharoen et al (2013) performed a systematic review and meta-analysis with the aim of assessing the association between cytokine gene polymorphisms and graft rejection in heart transplantation. These researchers identified relevant studies from Medline and Embase using PubMed and Ovid search engines, respectively. Allele frequencies and allele and genotypic effects were pooled. Heterogeneity and publication bias were explored. Four to 5 studies were included in pooling of 3 gene polymorphisms. The prevalence of the minor alleles for TNF-α -308, TGF-β1-c10, and TGF-β1-c25 were 0.166 (95 % CI: 0.129 to 0.203), 0.413 (95 % CI: 0.363 to 0.462), and 0.082 (95 % CI: 0.054 to 0.111) in the control groups, respectively. Carrying the A allele of TNF-α -308 was associated with an 18 % higher risk of graft rejection than the G allele, but this was not statistically significant (95 % CI of OR: 0.46 to 3.01). Conversely, carrying the minor alleles for both TGF-β1-c10 and c25 had non-significantly lower odds of graft rejection than major alleles, with the pooled ORs of 0.87 (95 % CI: 0.65 to 1.18) and 0.70 (95 % CI: 0.40 to 1.23), respectively. The authors concluded that there was no evidence of publication bias for all pooling; an updated meta-analysis is needed when more studies are published to increase the power of detection for the association between these polymorphisms and allograft rejection.
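All three pooled estimates are described as non-significant because each 95 % CI for the odds ratio spans 1, the value that corresponds to no association. A short sketch of that check follows. The dictionary labels and function name are illustrative assumptions, and the point estimate of 1.18 for TNF-α -308 is inferred here from the stated 18 % increase rather than quoted in the text.

```python
def ci_includes_null(lower: float, upper: float, null_value: float = 1.0) -> bool:
    """True when the interval contains the no-effect value (OR = 1)."""
    return lower <= null_value <= upper


# (pooled OR, lower 95 % bound, upper 95 % bound) for the minor allele.
pooled_ors = {
    "TNF-alpha -308": (1.18, 0.46, 3.01),  # point estimate inferred from "18 %"
    "TGF-beta1 c10": (0.87, 0.65, 1.18),
    "TGF-beta1 c25": (0.70, 0.40, 1.23),
}

non_significant = {
    name: ci_includes_null(lower, upper)
    for name, (_, lower, upper) in pooled_ors.items()
}
```

Each entry evaluates to `True`, which matches the authors' reading that none of the three polymorphisms showed a statistically significant association with graft rejection.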

Furthermore, an UpToDate review on “Acute cardiac allograft rejection: Diagnosis” (Eisen and Jessup, 2014) does not mention cytokine gene polymorphism testing as a management tool.

Statin for the Management of Graft Vessel Disease:

Som and colleagues (2014) noted that graft vessel disease (GVD) is a significant cause of morbidity and mortality in cardiac allograft recipients. Hyperlipidemia is a risk factor for GVD, and the majority of patients will display abnormal lipid profiles in the years following transplant. This systematic review aimed to establish the clinical impact of statins in cardiac allograft recipients, critically appraising the literature on this subject. These investigators performed a literature search for randomized studies assessing statin use in cardiac allograft recipients. The Cochrane Central Registry of Controlled Trials, MEDLINE, EMBASE, clinicaltrials.gov, and the Transplant Library from the Centre for Evidence in Transplantation were searched. The primary outcome was presence of GVD. Secondary outcomes included graft and patient survival, acute rejection, and adverse events. Meta-analysis was precluded by heterogeneity in outcome reporting and therefore narrative synthesis was undertaken. A total of 7 randomized controlled trials (RCTs) were identified. The majority of RCTs demonstrated some risk of bias, and methods of outcome measurement were variable. Studies reporting incidence or severity of GVD suggested that statins do confer benefit. Survival benefit from statin use is modest. There is a low incidence of adverse events attributable to statins. There was no difference in the overall number of episodes of rejection. The authors concluded that while the methodological quality of evidence describing the use of statins in cardiac allograft recipients is limited, the available evidence suggested benefit from their use. These findings need to be validated by well-designed studies.

CPT Codes / HCPCS Codes / ICD-9 Codes

CPT codes covered if selection criteria are met:
0051T: Implantation of a total replacement heart system (artificial heart) with recipient cardiectomy
0052T: Replacement or repair of thoracic unit of a total replacement heart system (artificial heart)
0053T: Replacement or repair of implantable component or components of total replacement heart system (artificial heart), excluding thoracic unit
33940: Donor cardiectomy (including cold preservation)
33945: Heart transplant, with or without recipient cardiectomy

CPT codes not covered for indications listed in the CPB:
0085T: Breath test for heart transplant rejection

Other CPT codes related to the CPB:
+ 0049T: Prolonged extracorporeal percutaneous transseptal ventricular assist device, greater than 24 hours, each subsequent 24 hour period (List separately in addition to code for primary procedure)
33990: Insertion of ventricular assist device, percutaneous including radiological supervision and interpretation; arterial access only
33991: Insertion of ventricular assist device, percutaneous including radiological supervision and interpretation; both arterial and venous access, with transseptal puncture
33992: Removal of percutaneous ventricular assist device at separate and distinct session from insertion
33993: Repositioning of percutaneous ventricular assist device with imaging guidance at separate and distinct session from insertion
33975: Insertion of ventricular assist device; extracorporeal, single ventricle
33976: extracorporeal, biventricular
33977: Removal of ventricular assist device; extracorporeal, single ventricle
33978: extracorporeal, biventricular
33979: Insertion of ventricular assist device, implantable intracorporeal, single ventricle
93015 - 93018: Cardiovascular stress test using maximal or submaximal treadmill or bicycle exercise, continuous electrocardiographic monitoring, and/or pharmacological stress
93451 - 93454: Cardiac catheterization
93798: Physician services for outpatient cardiac rehabilitation; with continuous ECG monitoring (per session)

Other HCPCS codes related to the CPB:
G0422: Intensive cardiac rehabilitation; with or without continuous ECG monitoring with exercise, per session
S9472: Cardiac rehabilitation program, non-physician provider, per diem

ICD-9 codes covered if selection criteria are met (not all-inclusive):
410.00 - 411.89: Acute myocardial infarction and other acute and subacute forms of ischemic heart disease
414.00 - 414.07: Coronary atherosclerosis
414.8 - 414.9: Other specified and unspecified chronic ischemic heart disease
424.0 - 424.99: Other diseases of the endocardium
425.1: Hypertrophic obstructive cardiomyopathy
425.4: Other primary cardiomyopathies
425.7: Nutritional and metabolic cardiomyopathy
427.0 - 427.9: Cardiac dysrhythmias
428.0 - 428.9: Heart failure
429.0: Myocarditis, unspecified
674.80, 674.82, 674.84: Other complications of the puerperium, unspecified as to episode of care or not applicable, delivered, with mention of postpartum complication, and postpartum condition or complication [postpartum cardiomyopathy]
745.0 - 746.9: Bulbus cordis anomalies and anomalies of cardiac septal closure, endocardial cushion defects and other congenital anomalies of heart
996.83: Complications of transplanted organ, heart
V42.1: Organ or tissue replaced by transplant, heart

Other ICD-9 codes related to the CPB:
140.0 - 209.36, 209.75: Malignant neoplasms
250.40 - 250.63: Diabetes mellitus with renal, ophthalmic, or neurological manifestations
279.00 - 279.09: Disorders involving the immune mechanism
290.0 - 316: Psychoses, schizophrenic disorders, neurotic disorders, personality disorders, and other nonpsychotic mental disorders
356.0 - 356.9: Hereditary and idiopathic peripheral neuropathy
362.01 - 362.02: Diabetic retinopathy
430 - 437.9: Cerebrovascular disease
440.0 - 459.9: Diseases of arteries, arterioles, and capillaries, and diseases of veins and lymphatics, and other diseases of circulatory system
518.81 - 518.89: Other diseases of lung
577.0 - 577.9: Diseases of pancreas
584.5 - 587: Acute renal failure
V10.0 - V10.9: Personal history of malignant neoplasm

ICD-9 codes contraindicated for this CPB (not all-inclusive):
001.0 - 139.8: Infectious and parasitic diseases
277.30 - 277.39: Amyloidosis
358.0 - 359.99: Myoneural disorders, muscular dystrophies and other myopathies
416.0 - 416.9: Chronic pulmonary heart disease [severe]
438.0 - 438.9: Late effects of cerebrovascular disease [significant persistent deficit]
496: Chronic airway obstruction, not elsewhere classified [unless person is to undergo dual organ transplantation, e.g., heart-lung, heart-kidney]
533.00, 533.01, 533.20 - 533.21, 533.40 - 533.41, 533.60 - 533.61: Peptic ulcer with hemorrhage
571.0 - 571.9: Chronic liver disease and cirrhosis [unless person is to undergo dual organ transplantation, e.g., heart-lung, heart-kidney]
585.6: End stage renal disease [unless person is to undergo dual organ transplantation, e.g., heart-lung, heart-kidney]

CPT Codes / HCPCS Codes / ICD-10 Codes

Information in the [brackets] below has been added for clarification purposes. Codes requiring a 7th character are represented by "+":

ICD-10 codes will become effective as of October 1, 2015:

CPT codes covered if selection criteria are met:

0051T: Implantation of a total replacement heart system (artificial heart) with recipient cardiectomy
0052T: Replacement or repair of thoracic unit of a total replacement heart system (artificial heart)
0053T: Replacement or repair of implantable component or components of total replacement heart system (artificial heart), excluding thoracic unit
33940: Donor cardiectomy (including cold preservation)
33945: Heart transplant, with or without recipient cardiectomy

CPT codes not covered for indications listed in the CPB:
0085T: Breath test for heart transplant rejection

Other CPT codes related to the CPB:
33975: Insertion of ventricular assist device; extracorporeal, single ventricle
33976: extracorporeal, biventricular
33977: Removal of ventricular assist device; extracorporeal, single ventricle
33978: extracorporeal, biventricular
33979: Insertion of ventricular assist device, implantable intracorporeal, single ventricle
33990: Insertion of ventricular assist device, percutaneous including radiological supervision and interpretation; arterial access only
33991: both arterial and venous access, with transseptal puncture
33992: Removal of percutaneous ventricular assist device at separate and distinct session from insertion
33993: Repositioning of percutaneous ventricular assist device with imaging guidance at separate and distinct session from insertion
93015 - 93018: Cardiovascular stress test using maximal or submaximal treadmill or bicycle exercise, continuous electrocardiographic monitoring, and/or pharmacological stress
93451 - 93454: Cardiac catheterization
93798: Physician services for outpatient cardiac rehabilitation; with continuous ECG monitoring (per session)

Other HCPCS codes related to the CPB:
G0422: Intensive cardiac rehabilitation; with or without continuous ECG monitoring with exercise, per session
S9472: Cardiac rehabilitation program, non-physician provider, per diem

ICD-10 codes covered if selection criteria are met (not all-inclusive):
I21.01 - I24.9: Acute myocardial infarction and other acute forms of ischemic heart disease
I25.10 - I25.799: Chronic ischemic heart disease
I25.810 - I25.9: Other and unspecified forms of chronic ischemic heart disease
I34.0 - I39: Nonrheumatic mitral valve, aortic valve, tricuspid valve and pulmonary valve disorders
I42.0, I42.2, I42.5, I42.8, I42.9: Other cardiomyopathies
I42.1: Obstructive hypertrophic cardiomyopathy
I43: Cardiomyopathy in diseases classified elsewhere
I47.0 - I49.9: Cardiac dysrhythmias
I50.1 - I50.9: Heart failure
I51.4: Myocarditis, unspecified
O90.81 - O90.9: Other and unspecified complications of the puerperium, not elsewhere classified [postpartum cardiomyopathy]
Q20.0 - Q24.9: Bulbus cordis anomalies and anomalies of cardiac septal closure, endocardial cushion defects and other congenital anomalies of heart
T86.20 - T86.298: Complications of heart transplant
Z94.1: Heart transplant status

ICD-10 codes contraindicated for this CPB (not all-inclusive):
A00.0 - B99.9: Infectious and parasitic diseases
E85.0 - E85.9: Amyloidosis
G70.0 - G73.7: Diseases of myoneural junction and muscle
I27.0 - I27.9: Other pulmonary heart diseases [severe]
I69.00 - I69.998: Sequelae of cerebrovascular disease [significant persistent deficit]
J44.9: Chronic obstructive pulmonary disease, unspecified [unless person is to undergo dual organ transplantation, e.g., heart-lung, heart-kidney, etc]
K27.0, K27.2, K27.4, K27.6: Peptic ulcer with hemorrhage
K57.00 - K57.93: Diverticular disease of intestine
K70.0 - K74.69, K76.89: Diseases of liver [unless person is to undergo dual organ transplantation, e.g., heart-lung, heart-kidney, etc]
N18.6: End stage renal disease [unless person is to undergo dual organ transplantation, e.g., heart-lung, heart-kidney, etc]

The above policy is based on the following references:

1. Steinman TI, Becker BN, Frost AE, et al. Guidelines for the referral and management of patients eligible for solid organ transplantation. Transplantation. 2001;71(9):1189-1204.
2. Magliato KE, Trento A. Heart transplantation -- surgical results. Heart Fail Rev. 2001;6(3):213-219.
3. Jayakar DV. Surgical treatment of chronic heart failure. What to tell patients about heart-saving options. Postgrad Med. 2001;109(3):61-70.
4. Francis GS, et al. Pathophysiology and diagnosis of heart failure. In: Hurst's The Heart. V Fuster, et al., eds. Ch. 20. 10th ed. New York, NY: McGraw Hill; 2001; 655-685.
5. Morrow WR. Cardiomyopathy and heart transplantation in children. Curr Opin Cardiol. 2000;15(4):216-223.
6. DeRose JJ Jr, Oz MC. Surgical alternatives to transplantation and assist devices in the treatment of heart failure. Curr Cardiol Rep. 2000;2(6):564-571.
7. Olivari MT, Windle JR. Cardiac transplantation in patients with refractory ventricular arrhythmias. J Heart Lung Transplant. 2000;19(8 Suppl):S38-S42.
8. Odim J, Laks H, Burch C, et al. Transplantation for congenital heart disease. Adv Card Surg. 2000;12:59-76.
9. Adams DH, Chen RH, Kadner A. Cardiac xenotransplantation: Clinical experience and future direction. Ann Thorac Surg. 2000;70(1):320-326.
10. Allen MD, Fishbein DP, McBride M, et al. Who gets a heart? Rationing and rationalizing in heart transplantation. West J Med. 1997;166(5):326-336.

11. Frigerio M, Gronda EG, Mangiavacchi M, et al. Restrictive criteria for heart transplantation candidacy maximize survival of patients with advanced heart failure. J Heart Lung Transplant. 1997;16(2):160-168.
12. Johnson MR, Naftel DC, Hobbs RE, et al. The incremental risk of female sex in heart transplantation: A multiinstitutional study of peripartum cardiomyopathy and pregnancy. Cardiac Transplant Research Database Group. J Heart Lung Transplant. 1997;16(8):801-812.
13. Shaddy RE, Naftel DC, Kirklin JK, et al. Outcome of cardiac transplantation in children. Survival in a contemporary multi-institutional experience. Pediatric Heart Transplant Study. Circulation. 1996;94(9 Suppl):II69-II73.
14. Stevenson LW, Warner SL, Steimle AE, et al. The impending crisis awaiting cardiac transplantation. Modeling a solution based on selection. Circulation. 1994;89(1):450-457.
15. Rickenbacher PR, Rizeq MN, Hunt SA, et al. Long-term outcome after heart transplantation for peripartum cardiomyopathy. Am Heart J. 1994;127(5):1318-1323.
16. Sarris GE, Smith JA, Bernstein D, et al. Pediatric cardiac transplantation. The Stanford experience. Circulation. 1994;90(5 Pt 2):II51-II55.
17. Slaughter MS, Braunlin E, Bolman RM 3rd, et al. Pediatric heart transplantation: Results of 2- and 5-year follow-up. J Heart Lung Transplant. 1994;13(4):624-630.
18. Addonizio LJ, Hsu DT, Douglas JF, et al. Decreasing incidence of coronary disease in pediatric cardiac transplant recipients using increased immunosuppression. Circulation. 1993;88(5 Pt 2):II224-II229.
19. Mudge GH, Goldstein S, Addonizio LJ, et al. 24th Bethesda Conference: Cardiac transplantation. Task Force 3: Recipient guidelines/prioritization. J Am Coll Cardiol. 1993;22(1):21-31.
20. Muirhead J. Heart transplantation in children: Indications, complications, and management considerations. J Cardiovasc Nurs. 1992;6(3):44-55.
21. Benson L, Freedom RM, Gersony W, et al. Session II: Cardiac replacement in infants and children: Indication and limitations. J Heart Lung Transplant. 1991;10(5 Pt 2):791-801.
22. Pennington DG, Noedel N, McBride LR, et al. Heart transplantation in children: An international survey. Ann Thorac Surg. 1991;52(3):710-715.
23. Deng MC, Smits JM, Packer M. Selecting patients for heart transplantation: Which patients are too well for transplant? Curr Opin Cardiol. 2002;17(2):137-144.
24. Hunt SA. Comment--the REMATCH trial: Long-term use of a left ventricular assist device for end-stage heart failure. J Card Fail. 2002;8(2):59-60.
25. Rose EA, Gelijns AC, Moskowitz AJ, et al. Long-term mechanical left ventricular assistance for end-stage heart failure. N Engl J Med. 2001;345(20):1435-1443.
26. Alpert JS. Left ventricular assist devices reduced the risk for death and increased 1-year survival in chronic end-stage heart failure. ACP J Club. 2002;136(3):88.
27. National Institutes of Health, National Heart, Lung & Blood Institute. Expert Panel Review of the NHLBI Total Artificial Heart (TAH) Program. June 1998 - November 1999. Bethesda, MD: NHLBI, April 2000.
28. Copeland JG, Arabia FA, Banchy ME, et al. The CardioWest total artificial heart bridge to transplantation: 1993 to 1996 national trial. Ann Thorac Surg. 1998;66(5):1662-1669.
29. Copeland JG, Pavie A, Duveau D, et al. Bridge to transplantation with the CardioWest total artificial heart: the international experience 1993 to 1995. J Heart Lung Transplant. 1996;15(1 Pt 1):94-99.
30. Arabia FA, Copeland JG, Pavie A, Smith RG. Implantation technique for the CardioWest total artificial heart. Ann Thorac Surg. 1999;68:698-704.

31. Jouveshomme S, Baffert S, Fay A-F. Artificial heart (systematic review, expert panel). Paris, France: Comite d'Evaluation et de Diffusion des Innovations Technologiques (CEDIT), 1998:46.
32. Noorani HZ, McGahan L. Criteria for selection of adult recipients for heart, cadaveric kidney and liver transplantation. Technology Report. Issue 6. Ottawa, ON: Canadian Coordinating Office for Health Technology Assessment (CCOHTA); July 1999.
33. Remme WJ, Swedberg K. Guidelines for the diagnosis and treatment of chronic heart failure. Eur Heart J. 2001;22(17):1527-1560.
34. Cooley DA. The total artificial heart. Nat Med. 2003;9(1):108-111.
35. Nose Y. Totally implantable total artificial heart for clinical application. Artif Organs. 2002;26(3):214-215.
36. Arabia FA. Update on the total artificial heart. J Card Surg. 2001;16(3):222-227.
37. Nose Y. Implantable total artificial heart developed by Abiomed gets FDA approval for clinical trials. Artif Organs. 2001;25(6):429.
38. Sharma P, Perri RE, Sirven JE, et al. Outcome of liver transplantation for familial amyloidotic polyneuropathy. Liver Transpl. 2003;9(12):1273-1280.
39. Grazi GL, Cescon M, Salvi F, et al. Combined heart and liver transplantation for familial amyloidotic neuropathy: Considerations from the hepatic point of view. Liver Transpl. 2003;9(9):986-992.
40. Suhr OB, Svendsen IH, Andersson R, et al. Hereditary transthyretin amyloidosis from a Scandinavian perspective. J Intern Med. 2003;254(3):225-235.
41. Arpesella G, Chiappini B, Marinelli G, et al. Combined heart and liver transplantation for familial amyloidotic polyneuropathy. J Thorac Cardiovasc Surg. 2003;125(5):1165-1166.
42. Razonable RR, Patel R, Wilhelm MP, et al. Fatal disseminated aspergillosis following sequential heart and stem cell transplantation for systemic amyloidosis. Am J Transplant. 2001;1(1):93-95.
43. Comenzo RL. Primary systemic amyloidosis. Curr Treat Options Oncol. 2000;1(1):83-89.
44. Ruygrok PN, Gane EJ, McCall JL, et al. Combined heart and liver transplantation for familial amyloidosis. Intern Med J. 2001;31(1):66-67.
45. Mohty M, Albat B, Fegueux N, Rossi JF. Autologous peripheral blood stem cell transplantation following heart transplantation for primary systemic amyloidosis. Leuk Lymphoma. 2001;41(1-2):221-223.
46. Dubrey SW, Burke MM, Khaghani A, et al. Long term results of heart transplantation in patients with amyloid heart disease. Heart. 2001;85(2):202-207.
47. Mundy L, Merlin T. Thoratec heartmate (R) left ventricular assist device for patients with heart failure who are ineligible for heart transplantation. Horizon Scanning Prioritising Summary - Volume 2. Adelaide, SA: Adelaide Health Technology Assessment (AHTA) on behalf of National Horizon Scanning Unit (HealthPACT and MSAC); 2003.
48. Mundy L, Merlin T, Parrella A. Heartsbreath: Diagnostic test of grade III heart transplant rejection in heart transplant recipients. Horizon Scanning Prioritising Summary - Volume 5. Adelaide, SA: Adelaide Health Technology Assessment (AHTA) on behalf of National Horizon Scanning Unit (HealthPACT and MSAC); 2004.
49. Center for Medicare and Medicaid Services (CMS). NCA Tracking Sheet for Autologous Stem Cell Transplantation (AuSCT) for Amyloidosis (CAG-00050R). Baltimore, MD: CMS; July 26, 2004. Available at: http://www.cms.hhs.gov/mcd/viewtrackingsheet.asp?id=126. Accessed September 9, 2004.
50. SynCardia Systems, Inc. CardioWest Total Artificial Heart (TAH). Directions for Use. Tucson, AZ: SynCardia; 2004. Available at: www.fda.gov/ohrms/dockets/ac/04/briefing/4029b1_FINAL.pdf. Accessed October 27, 2004.



Policy History

• Last Review: 08/07/2015
• Effective: 02/12/2002
• Next Review: 06/24/2016

Copyright Aetna Inc. All rights reserved. Clinical Policy Bulletins are developed by Aetna to assist in administering plan benefits and constitute neither offers of coverage nor medical advice. This Clinical Policy Bulletin contains only a partial, general description of plan or program benefits and does not constitute a contract. Aetna does not provide health care services and, therefore, cannot guarantee any results or outcomes. Participating providers are independent contractors in private practice and are neither employees nor agents of Aetna or its affiliates. Treating providers are solely responsible for medical advice and treatment of members. This Clinical Policy Bulletin may be updated and therefore is subject to change.
