ACCURACY OF ONTARIO HEALTH ADMINISTRATIVE DATABASES IN IDENTIFYING PATIENTS WITH RHEUMATOID ARTHRITIS

by

Jessica Widdifield

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy in Health Services Research
Institute of Health Policy, Management & Evaluation
University of Toronto

© Copyright by Jessica Widdifield 2013

Accuracy of Ontario Health Administrative Databases in Identifying Patients with Rheumatoid Arthritis (RA): Creation of the Ontario RA administrative Database (ORAD)

Jessica Widdifield

Doctor of Philosophy

Institute of Health Policy, Management and Evaluation
University of Toronto

2013

Abstract

Rheumatoid arthritis (RA) is a chronic, destructive, inflammatory arthritis that places a significant burden on the individual and society. This thesis represents the most comprehensive effort to date to determine the accuracy of administrative data for detecting RA patients, and describes the development and validation of an administrative data algorithm to establish a province-wide RA database. Beginning with a systematic review to guide the conduct of this research, two independent, multicentre, retrospective chart abstraction studies were performed amongst two random samples of patients from rheumatology clinics and primary care family physician practices, respectively. While a diagnosis by a rheumatologist remains the gold standard for establishing a RA diagnosis, the high prevalence of RA in rheumatology clinics can falsely elevate positive predictive values. It was therefore important that we also perform a validation study in a primary care setting, where the prevalence of RA more closely approximates that observed in the general population. The algorithm of [1 hospitalization RA code] OR [3 physician RA diagnosis codes (claims) with ≥1 by a specialist in a 2-year period] demonstrated a high degree of accuracy, minimizing the number of false positives (moderately good PPV: 78%; high specificity: 100%). Moreover, this algorithm has excellent sensitivity (>96%) at capturing contemporary RA patients under active rheumatology care. Application of this algorithm to Ontario health administrative data to establish the Ontario RA administrative Database (ORAD) identified 97,499 Ontarians with RA as of 2010, yielding a cumulative prevalence of 0.9%. Age/sex-standardized RA prevalence increased from 473 per 100,000 in 1996 to 784 per 100,000 in 2010, with approximately 50 new cases of RA emerging per 100,000 Ontarians each year.

Our findings will inform future population-based research and will serve to improve arthritis surveillance activities across Canada and abroad.


Acknowledgments

I am tremendously appreciative of my thesis committee for the myriad ways in which they have actively supported me in my determination to find and realize my potential. To Claire Bombardier, my thesis supervisor, for providing financial support and stimulating my interest in pursuing post-graduate training in health services research, clinical epidemiology and the field of rheumatology. To Sasha Bernatsky, thesis committee member, for your methodological expertise, sharing your valuable time, and friendship. My interest in administrative data research stems from the joy of being able to work with you. To Michael Paterson, thesis committee member, for your sound advice, expertise, careful guidance, and patience. Under your careful guidance and trust, the Ontario Rheumatoid Arthritis administrative Database (ORAD) is in excellent hands at the Chronic Disease and Pharmacotherapy Program at the Institute for Clinical Evaluative Sciences (ICES). To Karen Tu, thesis committee member, I am profoundly grateful for inviting me to be a part of your research team, enabling access to your data, and opening up the world of primary care research to me. To Carter Thorne, for your clinical expertise; if only all patients had a champion like you. Thank you to Janet Pope and George Tomlinson for your expertise and critique of this work; Vandana Ahluwalia for disseminating the results; and all rheumatologists who provided data.

To ICES staff and scientists (Simon Hollands, Ryan Ng, Alexander Kopp, Peter Gozdyra, Nadia Gunraj, Ping Li, Nelson Chong) and to the EMRALD research team [Liisa Jaakkimainen, Noah Ivers, Debra Butt, Myra Wang, Jacqueline Young, Diane Green, Robert Turner, William Oud] and to EMRALD chart abstractors and contracted staff (Nancy Cooper, Abayomi Fowora, Diane Kerbel, Anne Marie Mior and Barbara Thompson).

To the Canadian Institutes of Health Research (CIHR) for financial support; members of the Ontario Biologics Research Initiative/Ontario Best Practices Research Initiative (OBRI), particularly Annette Wilkins, Angela Cesta, Xiuying Li and Chris Sammut (for data applications development); the CANRAD (Canadian Rheumatology Administrative Data) Network (especially Lisa Lix and Jeremy Labrecque); and, for their support, the Public Health Agency of Canada (PHAC), particularly Siobhan O’Donnell; the Ontario Rheumatology Association (ORA); the Canadian Rheumatology Association (CRA); The Arthritis Society (TAS); the Arthritis Alliance of Canada (AAC); the Canadian Arthritis Patient Alliance (CAPA), particularly Anne Lyddiatt and Catherine Hofstetter; and the Canadian Arthritis Network Consumer Committee – all of whom supported this work.

I also wish to thank the external reviewers of this thesis, Drs. Jeffrey Curtis and Susan Jaglal.

Finally, to my friends, family, peers, and personal champions, who pushed me at every stage. This thesis is dedicated to all individuals with Rheumatoid Arthritis. My hope is that this research (and all future research arising from it) will improve their lives.


Table of Contents

Acknowledgments ...... iv
Table of Contents ...... v
List of Tables ...... ix
List of Figures ...... x
List of Appendices ...... xi

Chapter 1 Introduction ...... 1
1.1 Thesis Overview ...... 2
1.2 Hypotheses and Research Questions ...... 3
1.2.1 General Goal and Study Design ...... 3
1.2.2 Hypotheses ...... 3
1.2.3 Research Questions ...... 4
1.3 Background ...... 5
1.3.1 Rheumatoid Arthritis (RA): Epidemiology and Burden ...... 5
1.3.2 Clinical Characteristics ...... 5
1.3.3 Classification Criteria ...... 6
1.3.4 Clinical Management of RA ...... 6
1.4 Using Canadian Health Administrative Databases for RA Research and Surveillance ...... 8
1.4.1 Overview of Health Administrative Databases ...... 8
1.4.2 Ontario Health Administrative Data ...... 8
1.4.3 Limitations of Health Administrative Data ...... 9
1.4.4 Optimizing Health Administrative Data for RA Research ...... 10
1.4.5 Health Administrative Data Validation Studies ...... 11

Chapter 2 : A Systematic Review and Critical Appraisal of Validation Studies for the Diagnosis of Rheumatic Diseases using Health Administrative Databases ...... 13
2.1 Abstract ...... 14
2.2 Introduction ...... 15
2.3 Materials and Methods ...... 16
2.4 Results ...... 18
2.5 Discussion ...... 23
2.6 Tables and Figures ...... 29
2.7 Appendix ...... 41

Chapter 3 Accuracy of Ontario Health Administrative Databases in Identifying Patients with Rheumatoid Arthritis (PART I): A Validation Study using the Medical Records of Rheumatologists ...... 51
3.1 Abstract ...... 52
3.2 Introduction ...... 53
3.3 Subjects and Methods ...... 54
3.4 Results ...... 60
3.5 Discussion ...... 63
3.6 Tables and Figures ...... 69
3.7 Appendix ...... 74

Chapter 4 Accuracy of Ontario Health Administrative Databases in Identifying Patients with Rheumatoid Arthritis (PART II): A validation study using medical records of Primary Care Physicians ...... 83
4.1 Abstract ...... 84
4.2 Introduction ...... 85
4.3 Subjects and Methods ...... 86
4.4 Results ...... 91
4.5 Discussion ...... 93
4.6 Tables and Figures ...... 98
4.7 Appendix ...... 103

Chapter 5 ...... 105
5.1 Abstract ...... 106
5.2 Introduction ...... 107
5.3 Subjects and Methods ...... 108
5.4 Results ...... 110
5.5 Discussion ...... 112
5.6 Tables and Figures ...... 116
5.7 Appendix ...... 128

Chapter 6 Discussion ...... 129
6.1 Chapter overview ...... 129
6.2 Summary of Research ...... 130
6.3 Best Practices in conducting administrative data validation studies for rheumatic conditions ...... 132
6.3.1 Bridging the concepts of measurement validity and disease ascertainment with administrative data ...... 132
6.3.2 Best Practices for Administrative Data Validation Study Design and Reporting ...... 135
6.3.3 Relationship among accuracy measures: The impact of disease prevalence, spectrum of disease and type of comparator group on measures of diagnostic accuracy ...... 139
6.3.4 … Administrative Data Validation ...... 140
6.5 Limitations of ORAD and Generalizability ...... 148
6.6 Health Administrative Data for Secondary Research: Future Directions ...... 153
6.7 Appendix ...... 156

References ...... 158


List of Tables

Chapter 2
2.6.2 Table 1: Characteristics of included studies ...... 31
2.6.3 Table 2: List of included articles and their individual characteristics ...... 32
2.6.4 Table 3: Number of Studies meeting individual data quality and reporting items ...... 37

Chapter 3
3.6.2 Table 1: Characteristics of RA patients and non-RA patients ...... 70
3.6.3 Table 2: Test characteristics of multiple algorithms: Results for Patients aged > 20y ...... 71
3.6.4 Table 3: Test characteristics of multiple algorithms: Results for Patients aged ≥ 65y ...... 72

Chapter 4
4.6.2 Table 1: Clinical Characteristics and Drug Exposures for RA patients ...... 100
4.6.3 Table 2: Test characteristics of multiple algorithms among Patients > 20y ...... 101
4.6.4 Table 3: Test characteristics of multiple algorithms among Patients aged ≥ 65y ...... 102

Chapter 5
5.6.1 Table 1: Crude and Age/Sex-standardized prevalence and incidence of RA by year ...... 117
5.6.11 Table 2: Crude and age/sex-standardized rates by area of patient residence in 2010 ...... 127


List of Figures

Chapter 2
2.6.1 Figure 1: Flow chart for studies evaluated for inclusion in the systematic review ...... 30
2.6.5 Figure 2: Various approaches to conducting administrative data validation studies ...... 39

Chapter 3
3.6.1 Figure 1: Flow diagram of selection of study participants ...... 69
3.6.5 Figure 2: Forest Plot of Specificity estimates by Rheumatology Site ...... 73

Chapter 4
4.6.1 Figure 1: Flow diagram of selection of study participants ...... 99

Chapter 5
5.6.2 Figure 1: Age and Sex-Standardized prevalence of RA from 1996 to 2010 ...... 118
5.6.3 Figure 2: Age Standardized prevalence of RA Overall and by Sex from 1996 to 2010 ...... 119
5.6.4 Figure 3: Sex-standardized Prevalence of RA by Age from 1996 to 2010 ...... 120
5.6.5 Figure 4: Prevalence of RA by Sex and Age in 2010 ...... 121
5.6.6 Figure 5: Age and sex-standardized incidence of RA over 1996-2010 ...... 122
5.6.7 Figure 6: Age-standardized incidence of RA by Sex from 1996 to 2010 ...... 123
5.6.8 Figure 7: Sex-standardized incidence of RA by Age from 1996-2010 ...... 124
5.6.9 Figure 8: Incidence of RA by Sex and Age as of 2010 ...... 125
5.6.10 Figure 9: Map of age/sex-standardized rates by area of patient residence in 2010 ...... 126


List of Appendices

Chapter 2
2.7.1 Systematic literature search strategies: MEDLINE and EMBASE ...... 41

Chapter 3
3.7.1 Sample Size Estimation ...... 74
3.7.2 Data Abstraction Forms ...... 77

Chapter 4
4.7.1 Impact of Levels of Evidence on Classifying RA by the Reference Standard ...... 103
4.7.2 Impact of Levels of Evidence on algorithm accuracy ...... 104

Chapter 5
5.7.1 A map of the distribution of Rheumatologists in Ontario ...... 128

Chapter 6
6.7.1 Physician Specialty ...... 156


Chapter 1 Introduction

The purposes of this chapter are to:

1. Provide an overview of the thesis and describe each chapter.

2. Present the research hypotheses, study questions and an overview of the study designs.

3. Provide an introduction to the epidemiology, burden, clinical features and management of Rheumatoid Arthritis.

4. Discuss the Rheumatoid Arthritis surveillance and research efforts in Canada, including the use of health administrative databases, their limitations, and approaches to optimizing their use.


1.1 Thesis Overview

This thesis contains six chapters, beginning with this Chapter (1), which describes the research hypotheses; study questions; overview of study designs; background information on the epidemiology, burden, clinical features and care management of rheumatoid arthritis (RA); and the use of health administrative databases for surveillance and research efforts in Canada. This chapter also discusses limitations and approaches to optimizing the use of administrative data for secondary research.

Chapter 2 contains a systematic literature review to identify studies that validate administrative databases for rheumatic disease case ascertainment. The quality of the methods and reporting of each study was assessed. The review highlights important heterogeneity with respect to the study design, diseases evaluated, administrative data sources, types of reference standard definitions, sample sizes, algorithms that were tested, measures of diagnostic accuracy reported, and reporting of prevalence. We propose a methodological framework and recommendations for validation study conduct and reporting. This framework guided the conduct of the validation studies described in Chapters 3 and 4.

Chapter 3 reports Part I of our efforts to determine the accuracy of administrative database algorithms for identifying patients with RA. This study was performed amongst a random sample of patients seen at rheumatology clinics.

Chapter 4 reports Part II of our efforts to determine the accuracy of administrative database algorithms for identifying patients with RA. This study was performed amongst a random sample of patients seen at primary care clinics (where the prevalence of RA more closely approximates that observed in the general population).

Chapter 5 describes the application of the optimal case-finding algorithm to Ontario's health administrative data to develop the Ontario RA administrative Database (ORAD) and the linkage of ORAD with population-based census data to describe the epidemiology of RA.

Chapter 6 summarizes the findings of Chapters 2-5, and discusses the implications of this work and future directions.


1.2 Hypotheses and Research Questions

1.2.1 General Goal and Study Design

The ultimate aim of this thesis is to create a validated population-based cohort of all individuals with RA in the Canadian province of Ontario. To accomplish this, a systematic review of the literature was conducted to identify the best approaches for conducting administrative data validation studies for identifying rheumatic diseases. Subsequently, two independent, multicentre, retrospective chart abstraction studies were performed amongst random samples of patients from rheumatology and primary care clinics. The first step was to identify rheumatology clinic patients with and without RA, followed by primary care clinic patients with and without RA. These two cohorts were then (separately) linked to administrative data to test and validate administrative data algorithms comprised of different combinations of physician billing, hospitalization and pharmacy data to identify true RA cases within administrative data. Results of these validation studies were then used to select the best algorithm to establish a province-wide RA cohort and determine the incidence and prevalence of RA for the entire province of Ontario.

1.2.2 Hypotheses

1. Despite the widespread use of administrative data in rheumatology research, empirical research on the validity of RA case ascertainment is sparse, and the studies that do exist vary widely in their conceptual sophistication and methodological rigor.

2. Health administrative data can be used to accurately identify patients with Rheumatoid Arthritis.

a. While health care utilization data do not record the onset of diseases, we hypothesize that the diagnosis date derived from health administrative data will occur later than the date of disease onset documented in medical charts; however, health administrative data can be used to identify incident cases of Rheumatoid Arthritis.

b. Variation in coding practices exists and the true sensitivity and specificity will vary across practices.

3. The prevalence of Rheumatoid Arthritis is increasing in Ontario.


1.2.3 Research Questions

The study questions that follow these hypotheses are:

1. What are the best approaches to conducting validation studies to identify rheumatic diseases within health administrative data?

2. What is the optimal algorithm to identify patients with rheumatoid arthritis within health administrative databases?

a. Does the reference standard affect algorithm performance?

i. Using a random sample of patients from rheumatology clinics as the reference standard?

ii. Using a random sample of patients from primary care clinics as the reference standard?

3. What are the incidence and prevalence of rheumatoid arthritis in Ontario?


1.3 Background

1.3.1 Rheumatoid Arthritis (RA): Epidemiology and Burden

Rheumatic diseases contribute greatly to the medical burden in Canada, with approximately 1% of the population suffering from RA.1 Surveillance data have consistently shown that RA prevalence is highest in older age groups. One in 12 women and one in 20 men will develop inflammatory autoimmune rheumatic disease during their lifetime. At the age of fifty, 2.7% of women and 1.4% of men will go on to develop RA.2 As our population ages, there will also be a greater number of individuals with RA and with multiple other health conditions, which will add to the complexity of their chronic disease management. Current Canadian estimates suggest that as many as 300,000 Canadians suffer from RA. By 2040, the prevalence of RA will reach 550,000 patients and the total cumulative economic burden (both direct and indirect costs) will reach $9.7 billion.3

1.3.2 Clinical Characteristics

Rheumatoid arthritis is a disease characterized by symmetrical joint swelling in addition to systemic symptoms such as morning stiffness and fatigue. The pain and loss of function associated with the disease are attributable to the combined effect of continuing synovitis and progressive, irreversible destruction, or erosion, of joints. Patients experience a chronic fluctuating course of disease that, despite therapy, may result in joint deformity and disability, and even premature death.4 Patients are greatly burdened, not only because of the pain and limitations caused by RA, but also because of an increased risk of co-morbidities related both to RA and its treatment.

Assessment of joint-specific symptoms is critical to the diagnosis of RA. Joint-specific inflammation (swollen and tender joints), symmetrical distribution of affected joints across both sides of the body, and principal involvement of proximal interphalangeal (PIP), metacarpophalangeal (MCP) and metatarsophalangeal (MTP) joints are all characteristic of RA. Subjects presenting with a greater number and severity of the above symptoms may be more likely to be correctly classified with RA.5


Examining patients for potential RA is fraught with difficulty because there is no single laboratory test to confirm the diagnosis.6 Laboratory markers proven to have diagnostic utility include rheumatoid factor (RF), erythrocyte sedimentation rate (ESR), C-reactive protein (CRP), and anti-cyclic citrullinated peptide antibodies (anti-CCP).7 Most diagnostic tests are performed only during the early stages of the disease; while RF is absent in 15-30% of RA patients early in their disease course, once patients have a high titre of RF or anti-CCP, these tests are not usually repeated.6 Acute phase reactants (ESR and CRP), however, may be routinely performed to measure underlying disease activity.

1.3.3 Classification Criteria

The standard for identifying RA is a clinical diagnosis by a rheumatologist. RA classification criteria were developed primarily for clinical research and are not routinely used for diagnostic purposes. Until recently, the 1987 American College of Rheumatology (ACR) Classification Criteria5 were widely used to verify patient homogeneity for clinical studies, displacing the 1958 American Rheumatism Association (ARA) criteria.8 Morning stiffness, inflammation in three or more joints, inflammation in the hands, and symmetrical distribution of affected joints on both sides of the body must all be present for more than six weeks and patients must fulfill at least four of the seven criteria to be classified as RA. The criteria perform well in subsets of RA but some criteria (e.g. radiographic changes) may only be present in the long-term.9,10 In an effort to accurately identify a larger group of RA patients, including those with earlier disease, the ACR criteria were revised in 2010.11 These new criteria also include a newer diagnostic test, anti-CCP, which was not available when the 1987 Criteria were created.
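The 1987 rule described above (at least four of seven criteria, with the time-dependent criteria persisting for more than six weeks) can be sketched as a toy check. The criterion names follow the published 1987 set, but this encoding is a simplification for illustration only, not a clinical instrument.

```python
# Illustrative encoding of the 1987 ACR "at least 4 of 7" rule; not a clinical tool.
ACR_1987_CRITERIA = [
    "morning stiffness",            # 1 (time-dependent)
    "arthritis of 3+ joint areas",  # 2 (time-dependent)
    "arthritis of hand joints",     # 3 (time-dependent)
    "symmetric arthritis",          # 4 (time-dependent)
    "rheumatoid nodules",           # 5
    "serum rheumatoid factor",      # 6
    "radiographic changes",         # 7
]

def classifies_as_ra(criteria_present, duration_weeks):
    """criteria_present: set of criterion names observed in the patient.
    duration_weeks: how long the time-dependent criteria (1-4) have been present."""
    met = {c for c in criteria_present if c in ACR_1987_CRITERIA}
    # The first four criteria count only if present for more than six weeks
    if duration_weeks <= 6:
        met -= set(ACR_1987_CRITERIA[:4])
    return len(met) >= 4  # classified as RA if at least 4 of 7 criteria are met

print(classifies_as_ra({"morning stiffness", "arthritis of hand joints",
                        "symmetric arthritis", "serum rheumatoid factor"},
                       duration_weeks=10))  # True
```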

1.3.4 Clinical Management of RA

The clinical management of RA patients differs substantially from that of patients with other chronic diseases,12 such as hypertension or diabetes. RA is a complex disease with a multitude of manifestations. Quality of care and health outcomes are better for RA patients who have contact with relevant specialists than for those who do not.13 Current guidelines recommend prompt referral to rheumatology specialists.14 Timely access to specialists, physiotherapists and pharmacologic interventions is not only essential to preventing disability, but should also help reduce the

economic burden of RA. Recent research has also shown that disease remission is an achievable goal (especially early in the disease course) with timely access to treatment.15,16,17,18 In 2004, the Canadian Rheumatology Association (CRA) convened an expert panel regarding optimal care in early RA. This led to the development and dissemination of a consensus statement, which reinforced the importance of early referral and treatment for patients with RA. The heterogeneous clinical presentations of RA, diagnostic inaccuracies, over-reliance on laboratory tests, limited physician confidence, and lack of, or insufficient, residency training for carrying out musculoskeletal exams may impair a clinical diagnosis of RA in the primary care setting.19,20,21,22,23 As such, it is highly recommended that a rheumatologist assess patients with suspected RA.


1.4 Using Canadian Health Administrative Databases for RA Research and Surveillance

Health administrative databases are an appealing data source for population-based research and surveillance. The databases are relatively inexpensive to access and analyze, cover the entire population, are available in all provinces and territories, and contain data for multiple years. These comprehensive administrative health databases potentially represent a powerful tool for health policy and epidemiological research, and are increasingly being used to study real-world patterns of disease, outcomes of disease and/or treatments24, variations in access and quality of care25, costs of care26, and effectiveness of care. At the same time, these databases were designed for purposes of health system management and provider remuneration, and accordingly their use for chronic disease research and surveillance requires careful and ongoing evaluation.

1.4.1 Overview of Health Administrative Databases

Administrative claims databases record information for payers about outpatient and/or inpatient care for beneficiaries. Individual beneficiaries of a governmental or private healthcare system (e.g. health maintenance organizations) are assigned unique identifiers and details about all health care contacts reimbursed by the payer are recorded.27 These records most often include information about hospitalizations and outpatient physician visits, including details of diagnoses and procedures. Frequently, data on filled drug prescriptions are available as well.

1.4.2 Ontario Health Administrative Data

In Canada’s health care system, provincial health insurance plans provide universal coverage for hospital and physician services with no copayments or other patient charges. The majority of physicians are paid on a fee-for-service basis, where a claim is submitted for each patient encounter (although alternative payment plans do co-exist). Although health administrative data exist for all Canadians, Canada’s health care system is a provincial responsibility. As such, each province is the data custodian of their administrative data, with the exception of federally funded hospitals for which the Canadian Institute for Health Information (CIHI) oversees the collection of hospitalization data and dissemination of data back to the provinces.27


A copy of Ontario’s health administrative databases is located at the Institute for Clinical Evaluative Sciences (ICES). Each database is comprised of individual data files for each beneficiary, including hospital separations (inpatient records), physician billings (in- and out-patient physician services), and prescription drug claims, all fully linkable via a unique encrypted health insurance number. The Ontario Health Insurance Plan (OHIP)28 physician billing database maintains information for physician services including diagnosis codes, based on International Classification of Diseases (ICD) codes.29 The pharmacy claims database maintains prescription data on residents aged 65 years or older who receive publicly funded medications through the Ontario Drug Benefit Program (ODB).30 The ODB database includes drug identification numbers and information about the quantity of drug prescribed. The Discharge Abstract Database (DAD) and National Ambulatory Care Reporting System (NACRS) contain all hospitalization data and emergency department visits, respectively, including detailed clinical and demographic data for acute hospital admissions and day surgeries.
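As an illustration of how these holdings link at the person level, the sketch below joins toy billing, hospitalization and drug records on a shared (encrypted) identifier. The record layouts, field names and drug identification number are hypothetical simplifications, not the actual OHIP/DAD/ODB schemas.

```python
from collections import defaultdict

# Toy records; layouts are hypothetical, not the real OHIP/DAD/ODB schemas.
ohip_claims = [          # physician billings: (patient_id, service_date, diagnosis)
    ("A1", "2009-03-02", "714"),   # 714 = rheumatoid arthritis (ICD-9)
    ("A1", "2009-06-15", "714"),
    ("B2", "2010-01-20", "250"),   # 250 = diabetes mellitus (ICD-9)
]
dad_records = [("A1", "2010-02-11", "714.0")]    # hospital discharge abstracts
odb_claims = [("A1", "2010-02-20", "00000001")]  # drug claims (hypothetical DIN)

# Link the three sources person-by-person on the shared identifier
profiles = defaultdict(lambda: {"claims": [], "hospitalizations": [], "drugs": []})
for pid, svc_date, dx in ohip_claims:
    profiles[pid]["claims"].append((svc_date, dx))
for pid, adm_date, dx in dad_records:
    profiles[pid]["hospitalizations"].append((adm_date, dx))
for pid, fill_date, din in odb_claims:
    profiles[pid]["drugs"].append((fill_date, din))

print(profiles["A1"]["claims"])  # -> [('2009-03-02', '714'), ('2009-06-15', '714')]
```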

1.4.3 Limitations of Health Administrative Data

While administrative data provide an efficient source of population-based data, the databases were designed for administrative purposes and not for secondary research.31 Information that is unnecessary for physician or hospital remuneration may not be provided or recorded accurately. Potential limitations include limited clinical detail and incomplete data on drug use. The accuracy of physicians’ diagnosis coding can be an issue, particularly since there are few incentives for physicians to code diagnoses accurately.32 Moreover, in Ontario, physician service claims only require one diagnosis code per patient visit, limiting what is reported for patients who have multiple health problems. ‘False-positive’ administrative data diagnoses may arise when physicians use diagnosis codes when a disease is being ‘ruled out’. In some jurisdictions, physicians may also be paid under a system other than fee-for-service, and in that event, even when the physicians are asked to provide ‘shadow’ billing claims, it is not clear how consistently this is done. Lastly, fee-for-service remuneration can drive coding practice when incentives exist for the recording of specific diagnostic codes.


1.4.4 Optimizing Health Administrative Data for RA Research

The optimal method to identify patients with RA within health administrative databases is a topic of ongoing investigation. Researchers often apply case definitions involving multiple data sources (algorithms) to improve the validity of case ascertainment. Such algorithms can include diagnosis or prescription drug codes, procedure codes, the timing of diagnosis codes, and the source of diagnosis (e.g., physician specialty).
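As a concrete sketch, an algorithm of the kind validated later in this thesis ([1 hospitalization RA code] OR [3 physician RA claims within 2 years, with ≥1 by a specialist]) could be encoded as follows. The input structures are hypothetical simplifications of real claims data.

```python
from datetime import date, timedelta

def meets_ra_algorithm(ra_hospitalizations, ra_claims):
    """ra_hospitalizations: list of dates carrying an RA hospitalization code.
    ra_claims: list of (service_date, billed_by_specialist) RA physician claims.
    Implements: [1 hospitalization RA code] OR [3 physician RA claims
    within a 2-year window, with >= 1 claim by a specialist]."""
    if ra_hospitalizations:          # a single RA hospitalization code qualifies
        return True
    claims = sorted(ra_claims)
    for i in range(len(claims) - 2):
        # claims falling within ~2 years (730 days) of the anchor claim
        window = [c for c in claims[i:]
                  if c[0] <= claims[i][0] + timedelta(days=730)]
        if len(window) >= 3 and any(is_spec for _, is_spec in window):
            return True
    return False

# Three RA claims over 17 months, one billed by a rheumatologist -> case
print(meets_ra_algorithm([], [(date(2010, 1, 5), False),
                              (date(2010, 9, 1), True),
                              (date(2011, 6, 1), False)]))  # True
```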

However, a wide variety of choices for administrative data algorithms in chronic disease research and surveillance may result in incomparable estimates of disease prevalence, incidence, or health outcomes. This may result in confusion for the end users of studies based on administrative databases, such as decision-makers seeking to implement population-based disease management or intervention programs.

The use of many different administrative data algorithms may arise, in part, because of differences in administrative databases across provinces and territories. Although Canada has a universal health care system, the delivery and administration of the system is a provincial and territorial responsibility. Consequently, the contents of health administrative databases, particularly physician billing claims and prescription drug databases may vary across jurisdictions (e.g., prescription drug databases may not include drugs filled for patients of all ages). Case definitions defined by investigators in one province or territory may not be applicable or useful in other jurisdictions.

Algorithms are not selected based only on the databases available to researchers; the choice of case definition may also vary as a function of the purpose of the study. The relative importance of different measures of accuracy [i.e., sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV)] depends on the intended use of the case definition (i.e., study purpose). For example, case definitions with high sensitivity to detect positive cases may be chosen to estimate the potential burden of disease at the population level, while others may choose to maximize the combination of sensitivity and specificity to improve accuracy. Case definitions with high specificity are often needed to create a homogeneous sample of patients for evaluating outcomes of disease and/or treatments. A high PPV will help researchers to avoid detecting false disease cases, which may be important for studies evaluating processes and quality of care.
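The four measures discussed above can be made concrete with a toy 2×2 table (all counts invented for illustration):

```python
def accuracy_measures(tp, fp, fn, tn):
    """Standard 2x2-table measures: algorithm result vs. true disease status."""
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases the algorithm flags
        "specificity": tn / (tn + fp),  # share of non-cases it correctly excludes
        "ppv": tp / (tp + fp),          # share of flagged patients who are true cases
        "npv": tn / (tn + fn),          # share of unflagged patients who are non-cases
    }

# Invented numbers: 100 true cases and 900 non-cases in the validation sample
m = accuracy_measures(tp=90, fp=10, fn=10, tn=890)
print(m["sensitivity"], m["ppv"])  # 0.9 0.9
```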


Thus, confirming the accuracy of RA case definitions through a validation study is essential to conducting credible research and surveillance activities using administrative data.

1.4.5 Health Administrative Data Validation Studies

Validation of Canadian health administrative data diagnoses has been performed for certain common chronic conditions, such as hypertension33 and diabetes.34 Validation studies on the use of administrative data for RA are much less common. Only two Canadian studies have been done, using data from Saskatchewan35 and Manitoba.36 The Saskatchewan validation study used hospitalization data from 1978-80. Hospitalized patients with at least one inpatient diagnostic code (primary or secondary) for RA were assessed regarding whether they could be confirmed according to the 1958 American Rheumatism Association (ARA) criteria for RA.5 One limitation of this study was that hospitalization of RA patients may only occur in the later stages of the disease, as initial diagnosis and treatment of this condition typically occur in the outpatient setting. In addition, these ARA criteria have since been superseded by newer criteria (e.g., the 19875 and 2010 RA criteria11). The Manitoba validation study used patient-reported diagnosis from the Canadian Community Health Survey (CCHS) as the reference standard. This reference standard has limitations, since patients responding to the CCHS self-report question often cannot differentiate RA from other forms of arthritis. More than 8.0% of CCHS respondents in the study cohort reported an RA diagnosis, significantly higher than prevalence estimates based on other surveys and from clinical registries. Thus, both RA validation studies of Canadian administrative databases have important limitations. There have also been international administrative data validation studies of rheumatic disease diagnoses (the focus of Chapter 2); however, most have been performed in patients seen within rheumatology clinics,35-48 which limits the generalizability of study findings.

Accurate administrative data algorithms are needed to provide accurate estimates of RA incidence and prevalence, which are in turn necessary to identify areas of need. In the absence of accurate algorithms, estimates of disease prevalence and incidence have differed greatly.49,50 For example, an algorithm with 100% sensitivity will ascertain all true disease cases, whereas an algorithm with only 50% sensitivity will miss half of the true disease cases and thus underestimate disease prevalence. Recently, the Public Health Agency of Canada (PHAC) and Canadian rheumatology researchers convened a workshop to identify and discuss case definitions for 11 musculoskeletal diseases using Canadian administrative data. One conclusion of the workshop was the crucial need for more validation work.51
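To make this arithmetic concrete, the short sketch below (with purely hypothetical sensitivity, specificity and prevalence values, not estimates from any study reviewed in this thesis) shows how an algorithm's error rates distort the prevalence estimate obtained from administrative data.

```python
# Illustrative only: how an algorithm's sensitivity and specificity
# bias prevalence estimates from administrative data.
# All numbers below are hypothetical.

def observed_prevalence(true_prev, sensitivity, specificity):
    """Apparent prevalence = flagged patients (true + false positives) / population."""
    return true_prev * sensitivity + (1 - true_prev) * (1 - specificity)

true_prev = 0.01  # assume ~1% true population prevalence of RA

# 100% sensitivity and 100% specificity recover the true prevalence exactly.
assert observed_prevalence(true_prev, 1.0, 1.0) == true_prev

# 50% sensitivity (with perfect specificity) misses half the true cases,
# halving the apparent prevalence.
print(round(observed_prevalence(true_prev, 0.5, 1.0), 4))  # 0.005

# Even 99% specificity adds enough false positives to roughly double
# the apparent prevalence of a rare disease.
print(round(observed_prevalence(true_prev, 1.0, 0.99), 4))  # 0.0199
```

The second case illustrates why specificity matters so much for rare diseases: the false positives are drawn from the much larger disease-free population.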


Chapter 2 : A Systematic Review and Critical Appraisal of Validation Studies for the Diagnosis of Rheumatic Diseases using Health Administrative Databases

Authors: Jessica Widdifield1 BSc, PhD(c); Jeremy Labrecque2 MSc; Lisa Lix3 PhD; J. Michael Paterson1,4,5 MSc; Sasha Bernatsky2 MD, FRCPC, PhD; Karen Tu1,4 MD, MSc; Noah Ivers1 MD, PhD(c); Claire Bombardier1 MD, FRCPC.

Affiliations: 1. University of Toronto, Toronto, ON; 2. McGill University, Montreal, PQ; 3. University of Saskatchewan, Saskatoon, SK; 4. Institute for Clinical Evaluative Sciences, Toronto, ON; 5. McMaster University, Hamilton, ON;

Financial Support: Canadian Arthritis Network Rapid Impact Platform Program: Administrative Data in Rheumatic Disease Research and Surveillance

Acknowledgements: We wish to thank the Canadian Rheumatology Administrative Data (CANRAD) Network members, especially Dr. Diane Lacaille (University of British Columbia, Vancouver, BC); information specialists Amy Faulkner, Rouhi Fazelzad and Marina Englesakis (University Health Network); and Dr. Debra Butt (University of Toronto, ON) for her review of the manuscript. Dr. Bernatsky holds a CIHR New Investigator Award (2005-2010); Dr. Tu holds a Canadian Institutes of Health Research fellowship award in the Area of Primary Care (2011-2013); Dr. Ivers holds a CIHR Fellowship Award in Clinical Research and a Fellowship Award from the Department of Family and Community Medicine, University of Toronto; Dr. Bombardier holds a Canada Research Chair in Knowledge Transfer for Musculoskeletal Care (2002-2016) and a Pfizer Research Chair in Rheumatology.

Manuscript word count: 3478/3800


2.1 Abstract

Objective:

To evaluate the quality of the methods and reporting of published studies that validate administrative database algorithms for rheumatic disease case ascertainment.

Methods:

We systematically searched MEDLINE, Embase and the reference lists of articles published from 1980 to 2011. We included studies that validated administrative data algorithms for rheumatic disease case ascertainment using medical record or patient-reported diagnoses as the reference standard. Each study was evaluated using published standards for the reporting and quality assessment of diagnostic accuracy, which informed the development of a methodological framework to help critically appraise and guide research in this area.

Results:

Twenty-three studies met the inclusion criteria. Administrative database algorithms to identify cases were most frequently validated against diagnoses in medical records (83%). Almost two-thirds of the studies (61%) used diagnosis codes in administrative data to identify potential cases, and then reviewed medical records to confirm the diagnoses. The remaining studies did the reverse, identifying patients using a reference standard, and then testing algorithms to identify cases in administrative data. Many authors (61%) described the patient population, but few (26%) reported key measures of diagnostic accuracy (sensitivity, specificity, positive and negative predictive values). Only one-third of studies reported disease prevalence in the validation study sample.

Conclusion:

Methods used in administrative data validation studies of rheumatic diseases are highly variable. Few studies report key measures of diagnostic accuracy, despite their importance for drawing conclusions about the validity of administrative database algorithms. We developed a methodological framework and recommendations for validation study conduct and reporting.

14

2.2 Introduction

Health administrative databases are potentially an efficient source of data for population-based rheumatology research and are increasingly being used to study disease burden, disease and treatment outcomes24 and quality of care.25,52 The value of studies that use administrative databases for secondary research rests heavily upon the accuracy of data for ascertaining disease cases. To reduce misclassification error in case ascertainment, researchers often make use of case definitions (usually in the form of algorithms based on diagnosis codes and/or other information such as pharmacy dispensations). However, estimates of disease prevalence using different algorithms may vary greaty.50,49 Therefore, confirming the accuracy of case ascertainment algorithms through a validation study (see Box 1) is an important step to improving rheumatology surveillance and research using administrative databases.

Box 1. Steps in performing an administrative database validation study53

PARTICIPANT SAMPLING:
- Sample potential patients to comprise a validation cohort.

PARTICIPANT SELECTION (TO CLASSIFY PATIENTS AS CASES AND NON-CASES):
- Identify, develop or define a reference standard to identify patients with and without the disease within the validation cohort.

METHODS:
- Develop one or more case ascertainment algorithms to apply to the administrative database.
- Test each administrative data algorithm against the reference standard for its ability to accurately identify patients with the disease (similar to testing the accuracy of a diagnostic test).

RESULTS:
- Report measures of diagnostic accuracy: sensitivity, specificity and predictive values.
- Interpret results, recognizing tradeoffs between these measures.


Complete and accurate reporting of the methods used in validation studies is important for assessing the potential biases and generalizability of results. Benchimol and colleagues54 recently developed consensus criteria for the reporting of studies that validate administrative database algorithms, but a methodological framework to guide the conduct of such studies has not been established. We performed a systematic review to identify studies that validate administrative database algorithms for rheumatic diseases and to evaluate the quality of the methods and reporting of these studies. Here we summarize the various approaches to performing administrative data validation studies, illustrate the outcome measures associated with each approach, and provide practical advice on how to achieve reliable and meaningful results.

2.3 Materials and Methods

Our systematic review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines and a protocol that pre-specified study selection, eligibility criteria, quality assessment, and data abstraction.55

Search Strategy. A systematic literature search of Ovid MEDLINE and Embase was conducted, covering the period of January 1980 to May 2011, to identify all validation studies using administrative data for rheumatology diagnoses. As the term "health administrative data" is not recognized as a Medical Subject Heading (MeSH) by the National Library of Medicine56 or as an Embase subject heading57, we developed a sensitive search strategy with the assistance of a health librarian and adapted it to each database. A complete list of the search terms is available in the appendix. We additionally hand-searched reference lists and performed a "grey literature" review, which included the websites of health policy units, for relevant articles not captured by the electronic searches.

Study Selection. Two reviewers (JL and JW) independently screened the titles and abstracts of all studies for eligibility. We included studies that (a) addressed the validation of health administrative databases or health information systems for case ascertainment of rheumatology diagnoses, using medical records and/or patient-reported diagnoses as the reference standard, and (b) were written in English. 'Health administrative data' was defined as information passively collected, often by government and health care providers, for the purpose of managing the health care of patients,58 and 'health information system' was defined as administrative data supplemented with detailed clinical information.59 Rheumatology diagnoses included all diagnoses according to Medical Subject Headings.56 There was no geographic restriction on included studies. Studies evaluating the agreement between two or more administrative data sources were excluded.

Data Abstraction for Reporting and Quality Assessment. For data abstraction, we used the STAtement for Reporting of Diagnostic accuracy (STARD)60 and the QUality Assessment of Diagnostic Accuracy Studies (QUADAS)61 tools. The purpose of the STARD criteria is to evaluate the reporting of diagnostic accuracy studies, whereas the purpose of the QUADAS tool is to assess the quality of diagnostic accuracy studies. Both sets of criteria were harmonized and modified to be applicable to the administrative database setting. Each individual item was adapted for this review by consensus of three authors (JL, JW and LL) and pilot tested. Items were re-phrased to increase their clarity and to be action-oriented, with the goal of improving validation protocol development for future research.

Consensus on all issues was established prior to commencing quality assessment. Data were abstracted by two of the authors (JL and JW), and any disagreement between the two reviewers was resolved by consensus or, if necessary, by a third party. In addition, we abstracted details of the data sources [country of origin, type of administrative data (e.g., inpatient, outpatient)], the specific rheumatic disease that was studied, the choice of reference standard, sample sizes, and measures of diagnostic accuracy for the algorithms tested. The data were analyzed descriptively.

Methodological Framework Development. The results of the data abstraction for reporting and quality assessment were used to develop a framework to help critically appraise and guide research in this area.

Using basic epidemiologic principles and the consensus criteria for the reporting of diagnostic accuracy studies, we identified several factors that threaten internal and external validity. First, we assessed methodological merit (internal validity) by classifying the studies according to the method of patient sampling and the presence or absence of a comparator group without the disease, and identified the measures of diagnostic accuracy that can be computed with each approach. Second, we report the strengths and weaknesses of the various approaches with respect to whether their results are generalizable to the target population (i.e., external validity).

2.4 Results

Studies Included. Our search identified 486 and 1063 references in MEDLINE and Embase, respectively. The number of articles assessed for inclusion and the reasons for exclusion are detailed in Figure 1. Sixteen studies were identified in the bibliographic databases and seven further studies were identified from reference lists and health policy research unit websites.

For the 23 studies identified in the published literature, Table 1 summarizes the details of the administrative data sources, diseases and reference standards. Most studies were conducted in the United States (n=15; 65%), examined rheumatoid arthritis (RA) (n=13; 57%), and used a combination of medical records sampled from hospitalized, ambulatory and rheumatology clinic settings (n=14; 61%). Most authors (n=18; 78%) evaluated algorithms that were derived from various linked data sources (inpatient, outpatient and/or prescription data). Reference standard definitions to classify individuals as true cases and non-cases came from various sources: (a) strict clinical classification criteria (e.g., the 1987 RA classification criteria5) (n=9; 39%); (b) clinical case definitions involving diagnoses documented in medical records (n=7; 30%); (c) both clinical classification criteria and a clinical case definition (n=3; 13%); and (d) patient-reported data from surveys (n=4; 17%).

Table 2 describes the characteristics of the included studies. There is important heterogeneity with respect to the diseases evaluated, administrative data sources, types of reference standard definitions (previously described), and sample sizes. For example, sources of data included health maintenance organizations (HMOs), Medicare, Medicaid, Veterans Affairs databases, the clinical information system of Rochester, Minnesota (Mayo Clinic) in the United States, Canadian administrative claims databases, Scandinavian population registers, and the comprehensive record linkages of the General Practice Research Database in the United Kingdom. Sample sizes ranged from 151 to 18,464 patients.


The tested algorithms differed in the number (and timing) of diagnosis codes, the source of diagnoses (e.g., specialist versus general practice physician), and the use of prescription drug and procedure codes. The estimates of diagnostic accuracy used to evaluate the algorithms also varied considerably and appear to depend on methodology (both study design and study population). For example, studies that produced high estimates of sensitivity and PPV selected their subjects from rheumatology specialty clinics41,44,62,63 (highlighted in Table 2), which implies that these estimates may not be representative across all populations. In general, increasing the number of diagnosis codes improved algorithm specificity; the addition of pharmacy information to diagnosis codes also improved specificity slightly, but at the cost of a dramatic reduction in sensitivity.

Quality Assessment for Reporting and Methodological Conduct. Table 3 lists the number of studies that met each of the data quality and reporting criteria (modified STARD/QUADAS criteria). Most authors (n=21; 91%) identified their research as validating administrative data to identify rheumatic diseases, and all described the data source, the setting and locations where the data were collected, and the data abstraction method. All studies described participant selection methods, and just over half of the studies (n=14; 61%) reported patient clinical and/or demographic characteristics, the most common being age and sex (n=12; 52%). Very few studies reported patients' duration of disease (n=3; 13%) or co-morbid conditions (n=2; 9%).

Few studies provided study flow diagrams (n=3; 13%), statistical justification for the sample size (n=1; 4%), or confirmed that abstractors were blind to the diagnosis codes of patients (n=5; 26%). The most common statistics used to estimate diagnostic accuracy were positive predictive value (PPV) (n=14; 61%), sensitivity (n=11; 48%), specificity (n=9; 39%), and negative predictive value (NPV) (n=7; 30%). Most authors (n=16; 70%) reported results for multiple algorithms tested, but only one-quarter of studies (n=6; 26%) reported at least four measures of diagnostic accuracy, and only a third (n=8; 35%) reported disease prevalence within their samples (pre-test prevalence).

Methodological Framework. The accuracy of an administrative database algorithm is measured on a binary scale: each patient's classification by the algorithm, compared against the reference standard, is a true positive (TP), a true negative (TN), a false positive (FP) or a false negative (FN). In order to properly evaluate a diagnostic test, both cases and non-cases are needed to populate all four cells of a 2 x 2 contingency table.
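As a concrete illustration, the sketch below populates such a 2 x 2 table with hypothetical counts (not data from any reviewed study) and derives the four key accuracy measures from it.

```python
# Diagnostic accuracy measures from a 2x2 contingency table comparing
# an administrative data algorithm against a reference standard.
# The counts used below are hypothetical, for illustration only.

def diagnostic_accuracy(tp, fp, fn, tn):
    return {
        "sensitivity": tp / (tp + fn),  # true cases the algorithm flags
        "specificity": tn / (tn + fp),  # non-cases it correctly leaves out
        "ppv": tp / (tp + fp),          # flagged patients who are true cases
        "npv": tn / (tn + fn),          # unflagged patients who are non-cases
    }

# Hypothetical validation cohort: 1,000 patients, 100 true cases.
# The algorithm flags 90 of the 100 cases and 45 of the 900 non-cases.
m = diagnostic_accuracy(tp=90, fp=45, fn=10, tn=855)
print(m["sensitivity"])    # 0.9
print(m["specificity"])    # 0.95
print(round(m["ppv"], 2))  # 0.67
print(round(m["npv"], 2))  # 0.99
```

When only known cases, or only patients with positive algorithm results, are sampled, some cells of the table are empty and only a subset of these measures can be computed, which mirrors the sampling approaches discussed in the following paragraphs.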

Our review identified in the published studies two main approaches to conducting administrative data validation studies (Figure 2). The defining characteristic of each approach is the manner in which patients are sampled (either by the reference standard or by diagnosis codes in administrative data) and the corresponding absence or presence of a comparator group (non-cases).

Nine studies (39%) sampled patients using the reference standard prior to testing administrative data algorithms (Figure 2: Diagram 1 A-B). Of these studies, seven applied diagnostic criteria to a random sample of patients to develop a reference standard that included cases and non-cases prior to analysis (Diagram 1A).40,64,65,36,62,66,67 Only four studies reported the four key measures of diagnostic accuracy (sensitivity, specificity, PPV and NPV) that can be computed using this approach. Authors commonly reported kappa in place of these key measures, and one study performed multivariable logistic regression analyses to identify predictors of discordance between the reference standard and the administrative database diagnosis. Two of the nine studies that sampled from a reference standard (Diagram 1B) tested their administrative data algorithms using only cases (i.e., a sample of patients with known disease status) and no comparator group without the disease.68,41 With this approach only sensitivity can be computed.

In contrast, 16 studies (61%) initially identified patients using administrative data algorithms prior to confirming the diagnoses within the reference standard (Diagram 2A-B). Of these studies, six (26%) sampled patients with positive and negative test results in the administrative data (i.e., patients with and without the specific diagnosis codes that fulfill the initial administrative data case definition), and then applied diagnostic criteria to this sample to develop a reference standard of true cases and non-cases (Diagram 2A).69,39,70,71,72,63 Sensitivity, specificity, and predictive values can all be computed using this approach. The remaining ten studies sampled only patients with a positive test result in the administrative data source (those who fulfill the initial administrative data case definition), and patients were subsequently classified as true cases and non-cases by the reference standard (Diagram 2B).68,73,74,41,43,75,76,77,78,79 With this approach only the false positive fraction can be computed.


2.5 Discussion

Despite the widespread use of health administrative databases for epidemiological research in rheumatology, few studies have rigorously evaluated the accuracy of administrative data algorithms for rheumatic disease case ascertainment. We conducted a systematic review, identified 23 studies, and used a modified version of the STARD/QUADAS criteria to assess the quality of their methods and reporting. Based on the variable methods used to conduct validation studies, we propose a methodological framework to guide the conduct of such studies.

Thorough assessment of the internal and external validity of individual validation studies is important for assessing risk of bias. However, our quality assessment identified important heterogeneity with regards to patient sampling, reference standards to classify patients, and the measures of diagnostic accuracy that were reported. Our methodological framework also identified important heterogeneity with regards to study conduct, including the direction of patient sampling (patients are either initially sampled from the reference standard or, alternatively, from diagnosis codes in the administrative database) and the inclusion or exclusion of a comparator group without the disease.

The usefulness of validation studies depends greatly upon how potential patients are initially identified, as this can affect the disease prevalence, the generalizability of patient characteristics, and the measures of diagnostic accuracy that can be computed, all of which influence the apparent performance of the algorithms tested.

When an appropriate reference standard is applied (to accurately classify cases by the reference standard) and patients are randomly sampled (ideally from a general or generalizable population), the disease prevalence approximates the population prevalence and provides unbiased estimates of sensitivity, specificity, PPV and NPV (Diagram 1A). Unfortunately, our review did not find any studies that randomly sampled patients from the general population and reported all four key measures of diagnostic accuracy. Rather, several studies randomly selected patients from specialty clinics, which can generate falsely elevated PPVs due to the high prevalence of case patients. Recognizing that the study of randomly sampled patients from the general population is not always feasible (especially for diseases of low prevalence), it remains critical for authors to report the pre-test disease prevalence ascertained from their study population to avoid errors in interpretation. As previously stated, in order to properly evaluate the characteristics of a diagnostic test and be able to report the pre-test disease prevalence, both cases and non-cases are needed to populate all four cells of a 2 x 2 contingency table.

Thus, for diseases of low prevalence, strategically sampling from a source population that has a high concentration of case patients may be the only viable option. Even if the disease prevalence is falsely elevated in the validation cohort, the pre-test prevalence should approximate the post-test prevalence for the administrative data algorithm to perform well.

While the alternative approach of sampling patients by the presence or absence of diagnosis codes in administrative data (Diagram 2A) also enables computation of important parameters (true and false positives, true and false negatives), unbiased estimates of accuracy cannot be generated because the underlying prevalence is unknown. Furthermore, very few studies randomly sampled patients, which may have introduced verification bias and reduced external validity by affecting the spectrum of disease in the validation cohort. Sensitivity and specificity depend on the spectrum of patients in the study sample and may vary among subpopulations defined by patient age, sex, disease duration, co-morbidity or drug exposures. Unfortunately, such characteristics were not consistently reported. Therefore, it may not be appropriate to generalize findings about sensitivity and specificity without accurate reporting of the characteristics of both cases and non-cases. In addition, because predictive values are dependent on the disease prevalence80, future studies that wish to generalize findings regarding PPV and NPV estimates should provide accurate information on disease prevalence in the study cohort. In sum, future validation studies should follow the modified STARD recommendations54 and provide a complete description of the patients under study (spectrum of disease). This would allow investigators to assess the effect of specific patient characteristics and disease prevalence on their results.
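The dependence of predictive values on prevalence follows directly from Bayes' theorem. The sketch below (using purely hypothetical sensitivity, specificity and prevalence values, not estimates from any reviewed study) shows how the same algorithm can yield a very different PPV in a rheumatology clinic than in the general population.

```python
# PPV as a function of disease prevalence (Bayes' theorem).
# Sensitivity, specificity and prevalence values are hypothetical.

def ppv(prevalence, sensitivity, specificity):
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)

sens, spec = 0.90, 0.95  # one fixed algorithm

# Rheumatology clinic sample: suppose ~40% of patients truly have RA.
print(round(ppv(0.40, sens, spec), 2))  # 0.92

# General population sample: suppose ~1% true prevalence.
print(round(ppv(0.01, sens, spec), 2))  # 0.15
```

This is why PPVs estimated in specialty clinics cannot be assumed to hold in population-based data, and why reporting the pre-test prevalence of the validation cohort is essential.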

Different reference standards were used to classify rheumatic diseases, and this influenced the study results. In our review, medical records were the most frequently used reference source. However, their use assumes that the records contain complete information to determine a patient's disease status81. A related challenge in studies of rheumatic disease is that diagnoses may evolve over time: for example, a patient who initially fulfills RA criteria may later meet clinical criteria for systemic lupus. A separate problem is the use of patient-reported diagnoses (such as patient surveys) as a reference standard. Studies that tested algorithms against patient-reported diagnoses had poor estimates of sensitivity (<50%) and substantially higher pre-test prevalence estimates, suggesting that patients may not be aware of their specific underlying diagnosis or arthritis subtype82. Sensitivity of self-report is generally highest for medical conditions that are well-defined (from the perspective of both the layperson and the physician) and relatively easily diagnosed83. Therefore, clinical classification criteria and clinical case definitions derived from medical records should be encouraged as reference standards, as opposed to patient-reported diagnoses.


Our review identified a lack of explicit reporting of statistical methods, and all but one study failed to provide statistical justification for the sample size. As there is no single statistic that summarizes diagnostic accuracy, researchers should ideally report all relevant measures.84 Only a quarter of the studies reported four or more measures of diagnostic accuracy, the most commonly reported being PPV and sensitivity, as studies commonly sampled patients by their diagnosis codes or did not include patients without the disease to act as true negatives.

The majority of authors tested and reported results for multiple algorithms. As the selection of algorithms for future research will vary according to their application,85 authors of administrative data validation studies should continue to test and report results for multiple algorithms.86 Depending on the research question, algorithms can be selected for high sensitivity to optimize detection of cases (e.g., studying population-level burden of disease), for high specificity and/or PPV to create a more homogeneous sample and avoid detecting false disease cases (e.g., evaluating quality of care and/or outcomes), or for the maximum combination of sensitivity and specificity. Generally, additional criteria in algorithms are expected to increase specificity at the expense of sensitivity. For example, in our review, the addition of pharmacy claim dispensations or diagnosis codes by specialists improved algorithm specificity, but at the cost of dramatic reductions in sensitivity.

Limitations of this review include the inclusion of English-only studies and the lack of a standardized approach for identifying administrative data validation studies in the scientific literature. We did not include abstracts presented at scientific meetings because they do not contain sufficient information to properly assess study quality. Finally, we did not address the ethical issues associated with each study, as ethical considerations vary by jurisdiction; however, these principles may guide the conduct of research using administrative data.87,88 Feasibility, practicality, or ethical considerations may have played a role in the different methodological approaches that we identified, and future work is required to fully understand and address these real-world problems that may impede optimal administrative data validation methodology.

This review highlights important gaps with respect to both reporting and methodology in administrative data validation studies. Due to these gaps, each study published to date has to be interpreted individually in light of its potential for bias and generalizability. We identified strengths and weaknesses in the published literature and provide a framework to guide future study conduct in this field. Box 2 lists several recommendations for improving the design and reporting of administrative data validation studies. Our best practice statements can be used by investigators in the planning and reporting of administrative data validation studies, and by reviewers, editors, and readers to evaluate the studies to avoid errors in interpretation. Additional high quality studies, employing more rigorous methodology, would be an essential step towards improving rheumatic disease surveillance and research using administrative data.


BOX 2: SUMMARY OF RECOMMENDATIONS

1. The optimal approach to patient selection includes developing a reference standard among a random sample of patients to classify patients as cases and non-cases.

2. Authors should provide a complete description of the validation cohort, including age, sex, a description of the disease or health condition under study, the distribution of disease severity, co-morbidities (if applicable) and the setting from which patients are sampled. Ideally, patients should be drawn from a sampling frame that is otherwise similar, such that the disease prevalence in the sample can approximate the disease prevalence in the administrative data source.

3. Clinical classification criteria and clinical case definitions derived from medical records should be encouraged as a reference standard and readers of the reference standard should be blinded to the results of the classification by administrative data for that patient.

4. Authors should test and report multiple measures of diagnostic accuracy (sensitivity, specificity and predictive values) for multiple administrative algorithms and report information on study prevalence in order to provide comprehensive information about their study.


2.6 Tables and Figures


2.6.1 Figure 1: Flow chart for studies evaluated for inclusion in the systematic review.

References identified by electronic search N=1549 [MEDLINE n=486; Embase n=1063]

Excluded n=1522:
a. Used administrative data but no validation (n=99)
b. Validated against another administrative data source (n=10)
c. Administrative data validation but non-rheumatic disease diagnosis (n=32)
d. Non-administrative data source (n=877)
e. Other (literature reviews, editorials, abstracts) (n=502)

Studies Identified from electronic search after duplicates removed n=16

Reference list search and grey literature search added an additional 7 studies

Studies included in final analysis: n=23


2.6.2 Table 1: Characteristics of included studies. Number of studies by country of data source, type of secondary data source, sampling population source, source of records from which diagnoses for case definitions were selected, specific rheumatic diseases evaluated, and choice of reference standard.

Characteristics — Frequency (N=23)

Studies by country of data source:
- USA: 15 (65%)
- Canada: 4 (17%)
- UK/Europe: 4 (17%)

Type of secondary automated data source:
- Health administrative claims database*: 11 (48%)
- Clinical information systems**: 12 (52%)

Sampling population source:
- Hospitalized patients only: 3 (13%)
- Rheumatology clinic patients only: 4 (17%)
- Ambulatory patients only: 2 (9%)
- Across all 3 domains (above): 14 (61%)

Source of data for case definition:
- Inpatient diagnosis only: 3 (13%)
- Outpatient diagnosis only: 2 (9%)
- Linked records (inpatient, outpatient +/- pharmacy/laboratory): 18 (78%)

Diagnoses***:
- Rheumatoid arthritis: 13 (57%)
- 6 (26%)
- Diseases: 3 (13%)
- 2 (9%)
- Spondylarthropathies: 2 (9%)
- Fibromyalgia: 1 (4%)
- Unspecified arthritis: 2 (9%)

Reference standard definitions:
- Clinical classification criteria: 9 (39%)
- Medical record diagnoses: 7 (30%)
- Both classification and medical record diagnoses: 3 (13%)
- Patient-reported diagnoses: 4 (17%)

* Defined as information passively collected, often by government and health care providers, for the purpose of managing the health care of patients (e.g., claims data); ** clinical or health information systems (administrative data incorporating electronic health records); *** totals do not add up to 23 as several studies evaluated >1 diagnosis.


2.6.3 Table 2: List of included articles and their individual characteristics. Entries give: citation; diagnosis; data source; record type; reference standard definition(s); sample size; administrative data algorithm (case definition); measures of diagnostic accuracy.

Allebeck 198368 — RA; Stockholm County Medical Information System, Sweden; inpatient; clinical classification criteria; n=276; ≥1 inpatient diagnosis; SENS: 65-91%

Hakala 199373 — RA; Sickness Insurance Register, Finland; insurance + inpatient; clinical classification criteria; n=151; ≥1 diagnosis; SENS: 56%

Tennis 199369 — RA; Saskatchewan Health, Canada; inpatient + outpatient; clinical classification criteria; n=432; ≥1 inpatient diagnosis; SENS: 84%

Gabriel 199439 — RA; REP, USA (health information system); inpatient + outpatient; clinical classification criteria; n=1602; ≥1 diagnosis; SENS: 89%; SPEC: 74%; PPV: 57%; NPV: 94%; κ = 0.54

Fowles 199540 — RA; Medicare Part B, USA (ambulatory clinics only); outpatient; case definition†; n=1596; ≥1 outpatient diagnosis; κ: 0.44

Gabriel 199674 — OA; REP, USA (health information system); inpatient + outpatient; case definition; n=387; ≥1 diagnosis: PPV: 60%; classification tree: SENS: 75%; SPEC: 86%; PPV: 89%; NPV: 70%

Katz 199741 — RA, OA, FM, SLE; Medicare Part B, USA (rheumatology clinics only); outpatient; clinical classification criteria; n=378; ≥1 diagnosis (rheumatologist only); SENS: 90%; PPV: 95%

Harrold 200070 — OA; HMO, USA (health information system); inpatient + outpatient; clinical classification criteria; n=599; ≥1 diagnosis: PPV: 62%; ≥2 diagnoses: PPV: 67%; ≥1 diagnosis AND ≥1 rheumatology/orthopedic surgeon visit: PPV: 83%

Losina 200371 — RA, OA, AVN; Medicare Part A and Part B, USA; inpatient (recipients of total hip replacement); case definition; n=922; ≥1 inpatient diagnosis; SENS: 65% (RA), 54% (AVN), 96% (OA); PPV: 86-89%; κ: 0.73

Pedersen 200443 — RA; National Patient Registry, Denmark; inpatient + outpatient; clinical classification criteria and case definition; n=217; ≥1 diagnosis; SENS: 59% (clinical case definition), 46% (ACR criteria)

32

Citation Diagnosis Data Source Record Type Reference Sample Administrative Data Algorithm Measures of Diagnostic Accuracy Standard Size (Case Definition) Definition(s) Rector Arthritis" Medicare + HMO, Inpatient + Patient- 3633 > 1 diagnosis SENS: 43%; SPEC: 87% 2004 64 USA outpatient + reported > 2 diagnosis SENS: 28%; SPEC: 94% pharmacy diagnoses > 1 outpatient diagnosis SENS: 29%; SPEC: 93% > 1 outpatient diagnosis SENS: 23%; SPEC: 95% (primary only) > 1 Rx SENS: 32%; SPEC: 87% > 1 diagnosis OR > 1 Rx SENS: 55%; SPEC: 77% > 1 diagnosis AND > 1 Rx SENS: 20%; SPEC: 96% Singh RA VHA, USA Outpatient + Case definition 184 !1 outpatient diagnosis SENS: 100%; SPEC: 55.1%; PPV: 66.2%; 2004 65 (health information pharmacy + NPV: 100% system) laboratory !1 outpatient diagnosis AND ! SENS: 84.9%; SPEC: 82.7%; PPV: 81.1%; (Rheumatology 1 Rx (> 3 month duration) NPV: 86.2% clinics only) !1 outpatient diagnosis AND SENS: 88.2%; SPEC: 91.4%; PPV: 92.6%; RF positive NPV:86.5% ! 1 Rx (> 3 month duration) SENS: 76.5%; SPEC: 95.7%; PPV: 95.6%; AND RF positive NPV: 77% !1 outpatient diagnosis AND ! SENS: 76.5%; SPEC: 97.1%; PPV: 97%; 1 Rx AND RF positive NPV: 77.3% Lix RA, OA Manitoba Population Inpatient + Patient- 5589 > 1 outpatient diagnosis < 5 SENS: 78.1% (OA), 11.3% (RA); SPEC: 58.6% 2006# Health Research Data outpatient + reported years (OA), 99.2% (RA); PPV: 37.4% (OA), 55.9% 36,66 Repository, Canada pharmacy diagnosis (RA); NPV: 89.4%(OA), 92.6% (RA); !: 0.27 (OA). 
0.17 (RA) ; Youden: 0.37 (OA), 0.11(RA) > 2 outpatient diagnosis < 5 SENS: 63.1% (OA), 8.3% (RA); SPEC: 76.2% years (OA), 99.7% (RA); PPV: 45.7% (OA), 69.1% (RA); NPV: 86.7% (OA), 92.4% (RA); !: 0.35 (OA), 0.13 (RA); Youden: 0.39 (OA), 0.08 (RA) > 1 inpatient diagnosis or > 2 SENS: 63.7% (OA), 8.9% (RA); SPEC: 75.9% outpatient diagnosis < 5 years (OA), 99.7% (RA); PPV: 45.6% (OA), 70.7% (RA); NPV: 88.4% (OA), 92.4% (RA); !: 0.35 (OA), 0.14 (RA); Youden: 0.40 (OA), 0.09 (RA) > 1 inpatient diagnosis or > 2 SENS: 71.1% (OA), 9.4% (RA); SPEC : 70.1% outpatient diagnosis; OR >1 (OA), 99.6% (RA); PPV: 42.9% (OA), 68.3% outpatient diagnosis and > 2 Rx (RA); NPV: 88.4% (OA), 92.5% (RA); ! : 0.34 < 5 years (OA), 0.17 (RA); Youden: 0.41 (OA), 0.11(RA) 33

Citation Diagnosis Data Source Record Type Reference Standard Sample Administrative Data Measures of Diagnostic Accuracy Definition(s) Size Algorithm (Case Definition) Harrold Gout HMO database, Outpatient + Clinical 200 > 2 outpatient PPV: 61% (case definition) 2007 75 USA (health pharmacy classification diagnosis > 30 days information criteria and case apart system) definition > 3 outpatient PPV: 64% (case definition) diagnosis > 4 outpatient PPV: 67% (case definition) diagnosis > 1 Rx PPV: 39% (case definition) Visit to rheumatologist PPV: 92% (case definition) Singh SpA VHA, USA Inpatient + Case definition 184 > 1 Diagnosis SENS: 91% (AS), 100% (PsA), 71% (ReA) ; SPEC: 99% 2007 62 (health outpatient + (AS), 100% (PsA), 100% (ReA); PPV: 83% (AS), 100% information pharmacy (PsA), 100% (ReA); NPV: 99% (AS), 100% (PsA), 99% system) (rheumatology (ReA); !: 0.82 (AS), 1.0 (PsA), 0.83 (ReA) clinics only) > 1 Diagnosis AND > SENS: 27% (AS), 65% (PsA), 57% (ReA); SPEC: 99% 1 Rx (AS), 100% (PsA), 100% (ReA); PPV: 75% (AS), 100% (PsA), 100% (ReA); NPV: 96% (AS), 97% (PsA), 98% (ReA); !: 0.34 (AS), 0.77 (PsA), 0.72 (ReA) > 2 Diagnosis SENS: 82% (AS), 94% (PsA), 57% (ReA); SPEC: 100% (AS), 100% (PsA), 100% (ReA); PPV: 100% (AS), 100% (PsA), 100% (ReA); NPV: 99% (AS), 99% (PsA), 98% (ReA); !: 0.89 (AS), 0.97 (PsA), 0.72 (ReA) > 2 Diagnosis AND > SENS: 27% (AS), 59% (PsA), 57% (ReA); SPEC: 100% 1 Rx (AS), 100% (PsA), 100% (ReA); PPV: 100% (AS), 100% (PsA), 100% (ReA); NPV: 96% (AS), 96% (PsA), 98% (ReA); !: 0.41 (AS), 0.72 (PsA), 0.72 (ReA) Icen PsO REP, USA Inpatient + Case definition 2556 > 1 Diagnosis PPV: 68.7% (PsO), 94.0% (PsO, vulgaris), 18.7% (PsO, 2008 76 (health information outpatient dermatitis), 77.8% (PsO, guttate), 90.2% (PsO, pustular), system) 84.2% (seborrhiasis/sebopsoriasis), Thomas RA GPRD, UK Inpatient + Clinical 224 >3 Diagnosis SENS: 80%; SPEC: 81% 2008 (health outpatient + classification > 1 Diagnosis AND >2 SENS: 93%; SPEC: 27% 72 information pharmacy 
criteria Rx (NSAID) <6mo. system) (ambulatory > 1 Diagnosis AND > SENS: 78%; SPEC: 96% records) 1 Rx (DMARD) > 1 Diagnosis AND > SENS: 37%; SPEC: 82% 1 Rx (oral steroid) > 1 Diagnosis AND > SENS: 21%; SPEC: 86% 1 steroid injection 34

Citation Diagnosis Data Source Record Type Reference Sample Administrative Data Algorithm Measures of Diagnostic Accuracy Standard Size (Case Definition) Definition(s) Malik Gout VHA, USA Inpatient + Clinical 289 > 2 outpatient diagnosis OR > 1 PPV: 36% (ACR criteria), 30% 2009 77 (health outpatient classification inpatient diagnosis AND > 1 outpatient (Rome criteria), 33% (New York information criteria diagnosis criteria) system) Singh Arthritis Inpatient + Patient-reported 18464 > 1 diagnosis one year prior to survey !: 0.25 2009 67 VHA, USA outpatient + diagnosis > 1 diagnosis one year after survey !: 0.23 (health pharmacy > 1 diagnosis < 2 years !: 0.28 information > 1 diagnosis OR > 1 Rx < 2 years !: 0.32 system) > 1 inpatient diagnosis AND > 1 !: 0.19 outpatient diagnosis AND Rx < 2 years Chibnik SLE Medicaid, USA Inpatient + Clinical 234 >2 diagnosis AND > 2 nephrologist PPV 92% (SLE), 86% (nephritis) 2010 78 outpatient classification visits criteria >2 diagnosis AND > 2 renal diagnosis PPV: 89% (SLE), 80% (nephritis) >2 diagnosis AND > 2 nephrologist PPV: 91% (SLE), 88% (nephritis) visits OR > 2 renal Diagnosis >2 diagnosis AND > 2 nephrologist PPV: 89% (SLE), 79% (nephritis) visits AND > 2 renal Diagnosis Kim 2011 RA Medicare, USA Inpatient + Case definition 325 > 2 outpatient diagnosis, >7 days apart PPV: 55.7% (clinical case definition), 79 outpatient + + clinical 33.6% (> 4 ACR criteria), 42.8% (> 3 pharmacy classification ACR criteria) criteria > 2 outpatient diagnosis, >7 days apart PPV: 86.2% (clinical case definition), AND > 1 Rx (DMARD) 58.6% (> 4 ACR criteria), 72.4% (> 3 ACR criteria) > 3 outpatient diagnosis, >7 days apart PPV: 65.5% (clinical case definition), 40.0% (> 4 ACR criteria), 50.9% (> 3 ACR criteria) > 3 outpatient diagnosis, >7 days apart PPV: 87.5% (clinical case definition), AND > 1 Rx (DMARD) 60.7% (> 4 ACR criteria), 75.0% (> 3 ACR criteria) > 2 outpatient diagnosis from a PPV: 66.7% (clinical case definition), rheumatologist, >7 
days apart 39.3% (> 4 ACR criteria), 50.0% (> 3 ACR criteria) > 2 outpatient diagnosis from a PPV: 88.9% (clinical case definition), rheumatologist, >7 days apart AND > 1 55.6% (> 4 ACR criteria), 73.3% (> 3 Rx (DMARD) ACR criteria)

35

Citation Diagnosis Data Source Record Type Reference Sample Administrative Data Measures of Diagnostic Accuracy Standard Size Algorithm Definition(s) (Case Definition) Bernatsky SLE, SSc, Nova Scotia Inpatient + Case definition 824 > 1 inpatient diagnosis or > 2 SENS: 98.2% (SLE), 80.5% (SSc) 2011 63 myositis, SS, Population outpatient outpatient diagnosis >8 weeks 88.4% (myositis), 95.5% (SS), 93.5% vasculitis, and Health Research (rheumatology apart but <2 years, or >1 (vasculitis), 99.5% (PMR); SPEC: PMR Unit, Canada clinics only) outpatient diagnosis by a 72.5% (SLE), 94.9% (SSc), 96.4% rheumatologist (myositis), 95.8% (SS), 95.4% (vasculitis), 92.2% (PMR) Abbreviations – AS: ; AVN: avascular ; DMARD: Disease-modifying anti-rheumatic drug; Dx: Diagnosis FM: fibromyalgia, GPRD: General Practice Research Database; !: kappa; NPV: Negative Predictive Value; NSAID: Non-steroidal anti-inflammatory drug; OA: osteoarthritis; PMR: polymyalgia rheumatica; PPV: Positive Predictive Value; PsA: ; PsO: Psoriasis; RA: Rheumatoid Arthritis; ReA: ; REP: Rochester Epidemiology Project; RF: Rheumatoid Factor; Rx: pharmacy claim; SENS: Sensitivity; SLE: systemic lupus erythematosus; SpA: spondylarthritides; SPEC: Specificity; SS: Sjögren’s syndrome; SSc: systemic sclerosis; VHA: Veterans Health Administration Case Definitions – ! Prescription (Rx) classes are not defined † Clinical case definition based on medical record review and not as stringent as clinical classification criteria " Diagnosis codes: 714.x-720.x; except 720.1 # Lix and colleagues re-ran analysis in 2008 and only results of the initial study are presented.
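Throughout Tables 2 and 3, algorithm performance is summarized with sensitivity, specificity, PPV, NPV, the kappa coefficient and Youden's index. All six follow directly from the four cells of a validation 2x2 table (case definition versus reference standard). The following is a minimal sketch with hypothetical counts, not taken from any included study:

```python
# Illustrative only: the diagnostic accuracy measures reported in Table 2,
# computed from the four cells of a validation 2x2 table. The counts used
# below are hypothetical.

def accuracy_measures(tp, fp, fn, tn):
    """Accuracy of a case definition against a reference standard."""
    n = tp + fp + fn + tn
    sens = tp / (tp + fn)          # true cases correctly flagged
    spec = tn / (tn + fp)          # non-cases correctly not flagged
    ppv = tp / (tp + fp)           # flagged patients who are true cases
    npv = tn / (tn + fn)           # unflagged patients who are true non-cases
    # Cohen's kappa: observed agreement corrected for chance agreement,
    # where chance agreement is computed from the table margins
    p_obs = (tp + tn) / n
    p_chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (p_obs - p_chance) / (1 - p_chance)
    youden = sens + spec - 1       # Youden's index J
    return {"SENS": sens, "SPEC": spec, "PPV": ppv, "NPV": npv,
            "kappa": kappa, "Youden": youden}

# Hypothetical validation sample of 400 patients
m = accuracy_measures(tp=90, fp=20, fn=10, tn=280)
```

The PPV's dependence on prevalence (noted elsewhere in this thesis) is visible here: holding sensitivity and specificity fixed while shrinking the proportion of true cases inflates `fp` relative to `tp` and drives the PPV down.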


2.6.4 Table 3: Number of Studies meeting individual data quality and reporting items.

Section/Topic | # | Item | Frequency*

TITLE/ABSTRACT/KEYWORDS
  1 | Identifies article as a study of diagnostic accuracy | 100%
  2 | Identifies article as a study of health administrative data (HAD) | 100%
INTRODUCTION
  3 | States disease ascertainment or estimating diagnostic accuracy from HAD as one of the goals of the study | 91%
METHODS
  4 | Describes the data source | 100%
    | Describes type of records (inpatient, outpatient, linked records) | 100%
    | Describes setting and locations where the data were collected | 100%
Participants
  5 | Reports a priori sample size | 17%
    | Provides statistical justification for the sample size | 4%
  6 | PARTICIPANT SAMPLING: description of how patients were identified for data collection | 100%
    | a) Patients were first identified by diagnosis codes within the administrative data | 61%
    | b) Patients were first identified by clinical records or self-reported diagnosis irrespective of diagnosis codes | 39%
    | c) Describes a systematic sampling method | 44%
    | d) Describes a non-systematic sampling method | 9%
    | e) All patients within the study population were sampled | 48%
  7 | PARTICIPANT SELECTION: how patients were chosen for data collection and analysis | 100%
    | Describes inclusion/exclusion criteria | 100%
    | Describes who identified patients (for patients identified from medical records, n=4) | 100%
  8 | Describes data collection | 100%
    | Describes use of an a priori data collection form | 74%
  9 | Reports use of a split sample or an independent sample (re-validation using a separate cohort) | 30%
Test methods
  10 | Describes the reference standard | 100%
  11 | Reports the number of persons reading the reference standard (n=19) | 95%
     | Describes training or expertise of persons reading the reference standard (medical records) (n=19) | 90%
  12 | Reports a measure of concordance if >1 person read the reference standard (n=11) | 55%
  13 | Readers of the reference standard were blinded to the results of the classification by administrative data for that patient (reference standard: medical records) (n=19) | 24%
Statistical methods
  14 | Describes explicit methods for calculating or comparing measures of diagnostic accuracy, and the statistical methods used to quantify uncertainty | 65%
RESULTS
  15 | Reports the number of participants satisfying the inclusion/exclusion criteria | 100%
  16 | Provides study flow diagram | 13%
Participants
  17 | If patients are sampled by reference standard, reports the number of records unable to link (n=9) | 33%
     | Reports missing medical records or reports the number of patients unwilling to participate | 52%
     | Reports incomplete records | 30%
  18 | Reports clinical and demographic characteristics of the study population | 61%
     | a) Reports age | 52%
     | b) Reports sex | 52%
     | c) Reports disease duration | 13%
     | d) Reports a measure of disease severity | 0%
     | e) Reports co-morbid conditions | 9%
Test results
  19 | Describes the characteristics of misclassified patients (false positives) | 30%
  20 | Presents a cross tabulation of the results of the index tests by the results of the reference standard | 30%
  21 | Reports the pre-test prevalence in the study sample | 35%
Measures of diagnostic accuracy
  22 | Tests and reports results of multiple algorithms | 70%
  23 | Reports estimates of diagnostic accuracy | 83%
     | a) Reports sensitivity | 48%
     | b) Reports specificity | 39%
     | c) Reports PPV | 61%
     | d) Reports NPV | 30%
     | e) Reports >4 measures of diagnostic accuracy | 26%
     | f) Reports Youden's index | 9%
     | g) Reports kappa | 30%
     | h) Reports likelihood ratio(s) | 4%
     | i) Reports area under the receiver operating characteristic (ROC) curve | 9%
     | Reports 95% confidence intervals | 57%
  24 | Reports estimates of test reproducibility of the split or independent sample(s), if done (n=7) | 57%
DISCUSSION
  25 | Discusses the applicability of the study findings | 100%

*N=23 unless otherwise stated.
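Item 23 also tracks whether studies report 95% confidence intervals around these proportions (57% did). One standard way to attach such an interval to a proportion like sensitivity or PPV is the Wilson score method; the following is a minimal sketch with hypothetical counts, offered only to illustrate the kind of interval the reporting item asks for:

```python
# Illustrative only: 95% Wilson score interval for a binomial proportion
# such as sensitivity (true cases detected / all true cases). The counts
# below are hypothetical.
import math

def wilson_ci(successes, n, z=1.96):
    """Wilson score confidence interval for a proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return centre - half, centre + half

# e.g. a sensitivity of 90% observed in 100 true cases
lo, hi = wilson_ci(90, 100)
```

Unlike the simple Wald interval, the Wilson interval behaves sensibly near 0% and 100%, which matters here because several validation studies report sensitivities or specificities at or near those boundaries.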


2.6.5 Figure 2: Various approaches to conducting administrative data validation studies


2.7 Appendix

2.7.1 Systematic literature search strategies: MEDLINE and EMBASE

Search Strategy: MEDLINE

Rheumatic Diseases + Database + Validation

1 exp Rheumatic Diseases/ (153145)
2 (rheum$ adj3 disease$).mp. (38440)
3 rheumatism.mp. (7351)
4 enthesopath$.mp. (456)
5 rheumatoid inflammation$.mp. (150)
6 (rheum$ adj2 syndrome$).mp. (793)
7 Arthritis, Rheumatoid/ (72765)
8 (Rheum$ adj3 arthr$).mp. (93570)
9 (arthr$ adj3 deformans).mp. (358)
10 beauvais disease.mp. (0)
11 (chronic adj2 polyarthritis).mp. (2113)
12 (chronic adj2 poly arthritis).mp. (3)
13 (rheumatic adj2 polyarthritis).mp. (115)
14 (rheumatic adj2 poly arthritis).mp. (0)
15 rheumarthritis.mp. (1)
16 Arthritis, Juvenile Rheumatoid/ (7473)
17 (arthr$ adj2 juvenil$).mp. (8613)
18 (chauffard$ adj2 (syndrome$ or disease$)).mp. (48)
19 (juvenile adj2 polyarthritis).mp. (147)
20 (progressive splenoadenomegalic adj2 polyarthritis).mp. (0)
21 (stiel$ adj2 (disease$ or syndrome$)).mp. (0)
22 (Still adj (disease$ or syndrome$)).mp. (190)
23 (stills adj (disease$ or syndrome$)).mp. (1304)
24 caplan syndrome/ (137)
25 (Caplan$ adj2 (Syndrome$ or disease$)).mp. (180)
26 Felty's Syndrome/ (604)
27 (felty$ adj2 (syndrome$ or disease$)).mp. (785)
28 Rheumatoid Nodule/ (800)
29 (rheumat$ adj2 nodule$).mp. (1317)
30 fereol node.mp. (0)
31 meynet node.mp. (0)
32 rheumatic skin disorder$.mp. (1)
33 Rheumatoid Vasculitis/ (5)
34 (rheumatoid adj2 vasculit$).mp. (381)
35 Vascular Diseases/ (24178)
36 limit 35 to yr="1975 - 1976" (870)

37 Vasculitis/ (10012)
38 limit 37 to yr="1977 - 2009" (9685)
39 Sjogren's Syndrome/ (9087)
40 (Sjogren$ adj2 (Syndrome$ or disease$)).mp. (11539)
41 (sicca$ adj2 (syndrome$ or disease$)).mp. (685)
42 (Sjoegren$ adj2 (syndrome$ or disease$)).mp. (67)
43 dacryosialoadenopathia atrophican$.mp. (1)
44 (mucoserous adj2 dyssecretosis$).mp. (1)
45 mikulicz radecki syndrome$.mp. (0)
46 mucoserous dyssecretosis.mp. (1)
47 mukilicz radecki syndrome$.mp. (0)
48 oculobuccopharyngeal dryness.mp. (0)
49 (rheumatic adj2 sialosis$).mp. (0)
50 Still's Disease, Adult-Onset/ (740)
51 gout/ (8007)
52 gout$.mp. (11102)
53 arthragra.mp. (0)
54 arthritis urica.mp. (79)
55 urate inflammation.mp. (0)
56 uric arthritis.mp. (37)
57 ARTHRITIS, GOUTY/ (679)
58 Hyperostosis, Sternocostoclavicular/ (112)
59 (sternocostoclavicular adj2 hyperostos?s).mp. (152)
60 Bone Diseases/ (16530)
61 limit 60 to yr="1975 - 1988" (4726)
62 Clavicle/ (3806)
63 61 and 62 (53)
64 Sternum/ (7154)
65 61 and 64 (51)
66 Osteoarthritis/ (24729)
67 osteoarth$.mp. (45862)
68 osteo-arth$.mp. (463)
69 (degenerative adj2 arthr$).mp. (1283)
70 (noninflammatory adj2 arthrit$).mp. (48)
71 (non-inflammatory adj2 arthrit$).mp. (28)
72 arthrosis.mp. (3921)
73 degenerative joint disease$.mp. (1679)
74 rheumatoid arthrosis.mp. (4)
75 Osteoarthritis, Hip/ (4353)
76 coxarthros?s.mp. (1269)
77 (hip adj2 arthrosis).mp. (95)
78 coxartherosis.mp. (0)
79 malum coxae senilis.mp. (5)
80 OSTEOARTHRITIS, HIP/ (4353)

81 Osteoarthritis, Knee/ (7013)
82 (knee adj2 arthrosis).mp. (111)
83 femorotibial arthrosis.mp. (5)
84 gonarthrosis.mp. (756)
85 Osteoarthritis, Spine/ (31)
86 Polymyalgia Rheumatica/ (1985)
87 (polymyalgia adj2 rheumatic$).mp. (2342)
88 forestier-certonciny ad2 syndrome.mp. (0)
89 (rhizomelic adj2 pseudopolyarthr$).mp. (51)
90 Rheumatic Fever/ (8670)
91 (rheumatic adj2 fever$).mp. (10000)
92 (acute adj2 rheumatic adj2 arthr$).mp. (28)
93 (polyarthritis adj2 rheumatic$).mp. (140)
94 Rheumatic Nodule/ (211)
95 aschoff bod$.mp. (46)
96 Wissler's Syndrome/ (131)
97 (Wissler$ adj2 (Syndrome$ or disease$)).mp. (156)
98 Fibromyalgia/ (5008)
99 fibromyalgia$.mp. (5842)
100 fibrosit$.mp. (426)
101 diffuse myofascial pain syndrome.mp. (0)
102 (fibro adj2 myalgia$).mp. (3)
103 Lupus Erythematosus, Cutaneous/ (1305)
104 (lupus adj3 erythemato$).mp. (49026)
105 Lupus Erythematosus, Discoid/ (2299)
106 (discoid adj2 lupus).mp. (2628)
107 Panniculitis, Lupus Erythematosus/ (149)
108 (lupus adj3 panniculiti$).mp. (203)
109 (lupus adj2 systematic).mp. (63)
110 PANNICULITIS/ (1125)
111 limit 110 to yr="1989 - 1990" (64)
112 Panniculitis, Nodular Nonsuppurative/ (832)
113 limit 112 to yr="1966 - 1988" (575)
114 (lupus adj3 profundus).mp. (139)
115 (lupus adj3 erythemato$).mp. (49026)
116 (lupus adj3 syndrome$).mp. (1924)
117 exp Lupus Erythematosus, Systemic/ (41980)
118 (lupus adj2 systematic).mp. (63)
119 libman sacks disease.mp. (8)
120 (malignant adj2 dermatovisceritism).mp. (0)
121 disseminated lupus.mp. (577)
122 (erythematodes adj3 visceralis).mp. (73)
123 lupovisceritis.mp. (0)
124 Lupus Nephritis/ (3748)

125 Glomerulonephritis/ (22622)
126 limit 125 to yr="1966 - 1986" (11834)
127 Nephritis/ (6346)
128 limit 127 to yr="1966 - 1986" (3227)
129 (lupus adj3 nephriti$).mp. (5487)
130 (lupus adj3 glomerulonephriti$).mp. (564)
131 (lupoid adj3 glomerulonephriti$).mp. (6)
132 (lupoid adj3 nephriti$).mp. (11)
133 (lupus adj3 kidn$).mp. (174)
134 (lupus adj3 nephropath$).mp. (374)
135 (lupus adj3 vasculiti$).mp. (761)
136 Lupus Vasculitis, Central Nervous System/ (442)
137 (central nervous system adj3 lupus).mp. (583)
138 (lupus adj3 meningoencephaliti$).mp. (2)
139 brain vasculitis.mp. (7)
140 brain angiitis.mp. (1)
141 brain arteritis.mp. (0)
142 cerebral arteritis.mp. (84)
143 cerebral vasculitis.mp. (408)
144 or/1-34,36,38-59,63,65-109,111,113-124,126,128-143 (261205)
145 Databases, Factual/ (31428)
146 Information Systems/ (17117)
147 limit 146 to yr="1966 - 1997" (12393)
148 ((database or databases) adj2 factual).mp. (31437)
149 ((data base or data bases) adj2 factual).mp. (1)
150 ((databank or databanks) adj2 factual).mp. (2)
151 ((data bank or data banks) adj2 factual).mp. (7)
152 Databases as Topic/ (7288)
153 ((database or databases) adj2 topic?).mp. (7302)
154 Database Management Systems/ (6348)
155 SOFTWARE/ (63212)
156 limit 155 to yr="1987 - 1990" (5542)
157 (database? adj2 system?).mp. (8118)
158 "Insurance Claim Review"/ (3495)
159 INSURANCE, HEALTH/ (25921)
160 limit 159 to yr="1986-1990" (2823)
161 Insurance, Health/ut (594)
162 limit 161 to yr="1966-1985" (122)
163 Insurance, Hospitalization/ut (80)
164 limit 163 to yr="1966-1985" (42)
165 Utilization Review/ (6476)
166 limit 165 to yr="1968-1985" (1284)
167 (claim? adj2 review?).mp. (3623)
168 (claim$ adj2 analys$).mp. (433)

169 (insurance adj2 audit$).mp. (7)
170 "International Classification of Diseases"/ (3284)
171 (international adj3 classification).mp. (10153)
172 icd cod$.mp. (337)
173 Medical Records Systems, Computerized/ (16799)
174 (medical record? adj2 system?).mp. (17378)
175 (computer$ adj2 patient$ adj2 record$).mp. (802)
176 ((database or databases) adj2 administrat*).mp. (1628)
177 ((data base or data bases) adj2 administrat*).mp. (59)
178 ((database or databases) adj2 claim?).mp. (1003)
179 ((data base or data bases) adj2 claim?).mp. (18)
180 (data adj2 analys#s).mp. (43048)
181 (regist?r* adj3 database?).mp. (1614)
182 (regist?r* adj3 data base?).mp. (115)
183 Patient Discharge/ (15418)
184 (discharge adj2 data).mp. (1930)
185 (physician? adj2 claim?).mp. (378)
186 ((pharmacy or drug) adj2 claim?).mp. (795)
187 database$.ti. (12567)
188 exp Registries/ (42769)
189 registr$.mp. (92399)
190 or/145,147-154,156-158,160,162,164,166-189 (245734)
191 144 and 190 (2898)
192 exp animals/ not (humans/ and exp animals/) (3579240)
193 191 not 192 (2868)
194 exp Diagnostic Errors/ (81835)
195 Diagnosis, Differential/ (339877)
196 "Predictive Value of Tests"/ (109696)
197 "Sensitivity and Specificity"/ (233062)
198 ROC Curve/ (19348)
199 Area under Curve/ (19528)
200 Bayes Theorem/ (12745)
201 diagnostic standard*.mp. (563)
202 (positive adj2 predictive).mp. (22168)
203 (negative adj2 predictive).mp. (20882)
204 sensitivit*.mp. (639331)
205 specificit*.mp. (706018)
206 accurac*.mp. (156488)
207 validit*.mp. (78439)
208 reliabilit*.mp. (73466)
209 agree*.mp. (172567)
210 concord*.mp. (30339)
211 misclass*.mp. (5371)
212 (case adj2 ascertain*).mp. (909)

213 Algorithms/ (129541)
214 algorithm?.mp. (161479)
215 or/194-214 (1974522)
216 193 and 215 (492)
217 limit 216 to yr="1980 -Current" (486)

Search Terminology: EMBASE

Rheumatic Diseases + Database + Validation

1 exp Rheumatic Disease/ (143043)
2 (rheum$ adj3 disease$).mp. (66905)
3 rheumatism.mp. (9744)
4 rheumatoid.mp. (130624)
5 enthesopath$.mp. (773)
6 rheumatoid inflammation$.mp. (177)
7 (rheum$ adj2 syndrome$).mp. (1440)
8 exp rheumatoid arthritis/ (113201)
9 (Rheum$ adj3 arthr$).mp. (126806)
10 (arthr$ adj3 deformans).mp. (373)
11 beauvais disease.mp. (0)
12 (chronic adj2 polyarthritis).mp. (2064)
13 (chronic adj2 poly arthritis).mp. (4)
14 (rheumatic adj2 polyarthritis).mp. (121)
15 (rheumatic adj2 poly arthritis).mp. (0)
16 rheumarthritis.mp. (4)
17 juvenile rheumatoid arthritis/ (10698)
18 (arthr$ adj2 juvenil$).mp. (12070)
19 (chauffard$ adj2 (syndrome$ or disease$)).mp. (45)
20 (juvenile adj2 polyarthritis).mp. (145)
21 (progressive splenoadenomegalic adj2 polyarthritis).mp. (0)
22 (stiel$ adj2 (disease$ or syndrome$)).mp. (0)
23 (Still adj (disease$ or syndrome$)).mp. (839)
24 (stills adj (disease$ or syndrome$)).mp. (1415)
25 / (6165)
26 (Caplan$ adj2 (Syndrome$ or disease$)).mp. (126)
27 Felty syndrome/ (710)
28 (felty$ adj2 (syndrome$ or disease$)).mp. (861)
29 Rheumatoid Nodule/ (1163)
30 (rheumat$ adj2 nodule$).mp. (1504)
31 fereol node.mp. (0)

32 meynet node.mp. (0)
33 rheumatic skin disorder$.mp. (1)
34 (rheumatoid adj2 vasculit$).mp. (437)
35 Sjogren Syndrome/ (14073)
36 (Sjogren$ adj2 (Syndrome$ or disease$)).mp. (11841)
37 (sicca$ adj2 (syndrome$ or disease$)).mp. (826)
38 (Sjoegren$ adj2 (syndrome$ or disease$)).mp. (14423)
39 dacryosialoadenopathia atrophican$.mp. (1)
40 (mucoserous adj2 dyssecretosis$).mp. (1)
41 mikulicz radecki syndrome$.mp. (0)
42 mucoserous dyssecretosis.mp. (1)
43 mukilicz radecki syndrome$.mp. (0)
44 oculobuccopharyngeal dryness.mp. (0)
45 (rheumatic adj2 sialosis$).mp. (0)
46 adult onset Still disease/ (665)
47 gout/ (11775)
48 gout$.mp. (13851)
49 arthragra.mp. (0)
50 arthritis urica.mp. (80)
51 urate inflammation.mp. (0)
52 uric arthritis.mp. (37)
53 Hyperostosis/ (2207)
54 (sternocostoclavicular adj2 hyperostos?s).mp. (116)
55 exp Osteoarthritis/ (58775)
56 osteoarth$.mp. (65084)
57 osteo-arth$.mp. (576)
58 (degenerative adj2 arthr$).mp. (1491)
59 (noninflammatory adj2 arthrit$).mp. (54)
60 (non-inflammatory adj2 arthrit$).mp. (38)
61 arthrosis.mp. (4952)
62 degenerative joint disease$.mp. (1878)
63 rheumatoid arthrosis.mp. (3)
64 coxarthros?s.mp. (1473)
65 (hip adj2 arthrosis).mp. (150)
66 coxartherosis.mp. (0)
67 malum coxae senilis.mp. (5)
68 hip osteoarthritis/ (5107)
69 knee osteoarthritis/ (11168)
70 exp spondylosis/ (4806)
71 (knee adj2 arthrosis).mp. (162)
72 femorotibial arthrosis.mp. (6)
73 gonarthrosis.mp. (1078)
74 Polymyalgia Rheumatica/ (2993)
75 (polymyalgia adj2 rheumatic$).mp. (3306)

76 forestier-certonciny ad2 syndrome.mp. (0)
77 (rhizomelic adj2 pseudopolyarthr$).mp. (59)
78 rheumatic fever/ (9856)
79 (rheumatic adj2 fever$).mp. (10973)
80 (acute adj2 rheumatic adj2 arthr$).mp. (32)
81 (polyarthritis adj2 rheumatic$).mp. (553)
82 Rheumatic Nodule/ (1163)
83 aschoff bod$.mp. (41)
84 (Wissler$ adj2 (Syndrome$ or disease$)).mp. (94)
85 Fibromyalgia/ (9509)
86 fibromyalgia$.mp. (10101)
87 fibrosit$.mp. (699)
88 diffuse myofascial pain syndrome.mp. (0)
89 (fibro adj2 myalgia$).mp. (3)
90 exp lupus erythematosus/ (61475)
91 skin lupus erythematosus/ (1548)
92 (lupus adj3 erythemato$).mp. (65490)
93 discoid lupus erythematosus/ (2560)
94 (discoid adj2 lupus).mp. (2879)
95 Panniculitis/ (2438)
96 (lupus adj3 panniculiti$).mp. (168)
97 (lupus adj2 systematic).mp. (126)
98 (lupus adj3 profundus).mp. (183)
99 (lupus adj3 erythemato$).mp. (65490)
100 (lupus adj3 syndrome$).mp. (3142)
101 systemic lupus erythematosus/ (50825)
102 (lupus adj2 systematic).mp. (126)
103 libman sacks disease.mp. (15)
104 (malignant adj2 dermatovisceritism).mp. (0)
105 disseminated lupus.mp. (582)
106 (erythematodes adj3 visceralis).mp. (56)
107 lupovisceritis.mp. (0)
108 lupus erythematosus nephritis/ (6213)
109 Glomerulonephritis/ (24471)
110 (lupus adj3 nephriti$).mp. (7357)
111 (lupus adj3 glomerulonephriti$).mp. (614)
112 (lupoid adj3 glomerulonephriti$).mp. (7)
113 (lupoid adj3 nephriti$).mp. (12)
114 (lupus adj3 kidn$).mp. (201)
115 (lupus adj3 nephropath$).mp. (407)
116 (lupus adj3 vasculiti$).mp. (423)
117 brain vasculitis/ (1353)
118 (central nervous system adj3 lupus).mp. (189)
119 (lupus adj3 meningoencephaliti$).mp. (2)

120 brain vasculitis.mp. (1358)
121 brain angiitis.mp. (2)
122 brain arteritis.mp. (0)
123 cerebral arteritis.mp. (90)
124 cerebral vasculitis.mp. (521)
125 enthesopath*.mp. (773)
126 factual database/ (10955)
127 ((database or databases) adj2 factual).mp. (10978)
128 ((data base or data bases) adj2 factual).mp. (2)
129 ((databank or databanks) adj2 factual).mp. (4)
130 ((data bank or data banks) adj2 factual).mp. (8)
131 ((database or databases) adj2 topic?).mp. (18)
132 data base/ (78758)
133 ((database or databases) adj2 topic?).mp. (18)
134 (database adj2 system?).mp. (2576)
135 (claim? adj2 review?).mp. (200)
136 (database? adj2 system?).mp. (2799)
137 (claim? adj2 review?).mp. (200)
138 (claim$ adj2 analys$).mp. (667)
139 (insurance adj2 audit$).mp. (21)
140 "International Classification of Diseases"/ (5016)
141 (international adj3 classification).mp. (13118)
142 icd cod$.mp. (460)
143 Medical record/ (92948)
144 (medical record? adj2 system?).mp. (1216)
145 (computer$ adj2 patient$ adj2 record$).mp. (1205)
146 ((database or databases) adj2 administrat*).mp. (2162)
147 ((data base or data bases) adj2 administrat*).mp. (153)
148 ((database or databases) adj2 claim?).mp. (1620)
149 ((data base or data bases) adj2 claim?).mp. (20)
150 (data adj2 analys#s).mp. (131410)
151 (regist?r* adj3 database?).mp. (2200)
152 (regist?r* adj3 data base?).mp. (3069)
153 hospital Discharge/ (44867)
154 (discharge adj2 data).mp. (2372)
155 (physician? adj2 claim?).mp. (438)
156 ((pharmacy or drug) adj2 claim?).mp. (1153)
157 database$.ti. (15013)
158 exp Registries/ (34917)
159 (registry or registries).mp. [mp=title, abstract, subject headings, heading word, drug trade name, original title, device manufacturer, drug manufacturer] (56929)
160 or/126-159 (424969)
161 or/1-125 (349872)
162 160 and 161 (7448)

163 exp animals/ not (humans/ and exp animals/) (1255866)
164 162 not 163 (7435)
165 exp diagnostic error/ (43621)
166 differential diagnosis/ (269680)
167 "prediction and forecasting"/ (24081)
168 "sensitivity and specificity"/ (138446)
169 roc curve/ (5496)
170 area under the curve/ (48098)
171 bayes theorem/ (14254)
172 diagnostic standard*.mp. (864)
173 (positive adj2 predictive).mp. (27395)
174 (negative adj2 predictive).mp. (26247)
175 sensitivit*.mp. (679659)
176 specificit*.mp. (475701)
177 accuranc*.mp. (42)
178 validit*.mp. (102325)
179 reliabilit*.mp. (129321)
180 agree*.mp. (211379)
181 validity/ (12754)
182 reliability/ (71445)
183 concord*.mp. (37353)
184 misclass*.mp. (6190)
185 (case adj2 ascertain*).mp. (1064)
186 algorithm/ (106982)
187 algorithm?.mp. (148036)
188 or/165-187 (1790631)
189 162 and 188 (1067)
190 limit 189 to yr="1980 -Current" (1063)


Chapter 3 Accuracy of Ontario Health Administrative Databases in Identifying Patients with Rheumatoid Arthritis [PART I]: A Validation Study using the Medical Records of Rheumatologists

Jessica Widdifield1, BSc, PhD(c); Sasha Bernatsky2, MD, FRCPC, PhD; J. Michael Paterson1,3,4, MSc; Karen Tu1,3, MD, MSc; Ryan Ng3, MSc; J. Carter Thorne5, MD, FRCPC; Janet E. Pope6; and Claire Bombardier1, MD, FRCPC

Affiliations: 1. University of Toronto, Toronto, ON; 2. McGill University, Montreal, QC; 3. Institute for Clinical Evaluative Sciences, Toronto, ON; 4. McMaster University, Hamilton, ON; 5. Southlake Regional Health Centre, Newmarket, ON; 6. Western University, London, ON

Acknowledgements: First and foremost, we wish to thank all the rheumatologists who participated in this study. We would also like to individually acknowledge and thank Diane Green (ICES) for analyst expertise, Dr. George Tomlinson, University of Toronto, and Dr. Debra Butt, University of Toronto.

This study was supported by the Institute for Clinical Evaluative Sciences (ICES), a non-profit research corporation funded by the Ontario Ministry of Health and Long-Term Care (MOHLTC). The opinions, results and conclusions are those of the authors and are independent from the funding sources. No endorsement by ICES or the Ontario MOHLTC is intended or should be inferred. We also wish to thank Brogan Inc., Ottawa for use of their Drug Product and Therapeutic Class Database. This study was also supported by the Ontario Biologics Research Initiative/Ontario Best Practices Research Initiative (OBRI).

Dr. Bernatsky holds a career award from the Fonds de la recherche en santé du Québec (FRSQ), and Dr. Tu holds a Canadian Institutes of Health Research (CIHR) fellowship award in the Area of Primary Care (2011-2013). Dr. Bombardier holds a Canada Research Chair in Knowledge Transfer for Musculoskeletal Care (2002-2016) and a Pfizer Research Chair in Rheumatology.

Abstract: 248/250 words
Manuscript word count: 3751/3800 (excluding abstract)
No. of Tables: 3
No. of Figures: 2


3.1 Abstract

Background:

In a universal single-payer health system, health administrative data can be a valuable tool for disease surveillance and research. Few studies have rigorously evaluated the accuracy of administrative databases for identifying rheumatoid arthritis (RA) patients. Our aim was to validate administrative data algorithms to identify RA in Ontario, Canada.

Methods:

We performed a retrospective review of a random sample of 450 patients from 18 rheumatology clinics. Using rheumatologist-reported diagnosis as the reference standard, we tested and validated different combinations of physician billing, hospitalization and pharmacy data.

Results:

149 rheumatology patients were classified as having RA and 301 as not having RA based on our reference standard definition. Overall, algorithms that included physician billings had excellent sensitivity (94-100%). Specificity and positive predictive value (PPV) were modest to excellent, and increased when algorithms included multiple physician claims or specialist claims. The addition of RA drugs did not significantly improve algorithm performance. The algorithm of "(≥1 hospitalization RA code) OR (≥3 physician RA claims with ≥1 by a specialist in a 2-year period)" had a sensitivity of 97%, specificity of 85%, PPV of 76% and NPV of 98%. Most RA patients (84%) had an RA diagnosis code present in the administrative data within +/- 1 year of a rheumatologist's documented diagnosis date.

Conclusion:

We demonstrate that administrative data can be used to identify RA patients with a high degree of accuracy. RA diagnosis date and disease duration are fairly well estimated from administrative data in jurisdictions of universal health care.



3.2 Introduction

Currently, much of disease surveillance, quality improvement, and epidemiological research relies on health administrative databases such as physician billing claims, hospitalization records, and prescription drug claims. In Canada’s health care system, provincial health insurance plans provide universal coverage for hospital and physician services, creating a large repository of health administrative data for all Canadians. Compared to primary clinical data, administrative data provide a cost-efficient means to perform epidemiologic and evaluative studies among large populations.

A number of population-based studies of rheumatoid arthritis (RA) have been performed in recent years using various algorithms (or case definitions) to identify patients with RA in administrative data.

However, few studies have examined the accuracy of diagnosis coding for RA within administrative databases, and the optimal approach to identifying RA cases using the available resources is undetermined.

Our primary objective was to validate administrative data algorithms for accurate identification of RA patients using the clinical diagnoses documented in the medical records of rheumatologists as the reference standard. The overarching aim is to inform the optimal method(s) of ascertaining RA patients from administrative data, and to use this information to create a provincial-wide research database of all individuals with RA living in Ontario, Canada.


3.3 Subjects and Methods

Setting and Design. A retrospective chart abstraction study was performed among a random sample of patients seen in Ontario rheumatology clinics to identify patients with and without RA. These patients were then linked to health administrative data to test and validate different combinations (algorithms) of physician billing, hospitalization and pharmacy data to identify RA patients within administrative data. The study was approved by the Research Ethics Board of Sunnybrook Health Sciences Centre, Toronto, and by the Research Ethics Boards associated with all institutions involved.

Participant Selection. Of the 160 rheumatologists who submit electronic reimbursement claims, we selected eighteen (>10%) rheumatologists based on demographic, geographic and clinical practice representation (9 male and 9 female; 7 Toronto sites and 11 sites dispersed throughout Ontario, 9 community and 9 academic centers).

Patients were randomly sampled (unselected by diagnosis) from medical records at each rheumatologist’s clinic and charts were reviewed to obtain their underlying clinical diagnoses.89 Our sampling frame included all active, adult (aged 20+) patients seen prior to March 31st 2011. We defined active patients as those with a visit in the preceding two years [between March 2009 - 2011]. Patients were also required to have a health insurance number documented in their medical record and to have been seen by the rheumatologist at least twice in order to ensure that there was enough recent data in the medical record to allow for our review.


Data Abstraction. For all randomly selected patients meeting the inclusion criteria, a single trained abstractor (who was blinded to patients’ diagnosis codes in the administrative data) used a standardized data collection form to abstract information from the entire medical record. Data elements included documented rheumatology diagnoses at each visit, disease onset date for RA patients, and previous encounters with other rheumatologists. Age; sex; prescriptions for non-steroidal anti-inflammatory drugs (NSAIDs), glucocorticosteroids, disease-modifying anti-rheumatic drugs (DMARDs), and biologics; and clinical characteristics (including diagnostic test results and data required to apply clinical RA classification criteria90,11) were abstracted for all patients. As the newer 2010 RA classification criteria11 are intended for newly presenting patients with at least one active joint, these criteria were only applied to the subset of patients with initial documentation present in the medical record. As the individual 2010 criteria are weighted differently, patients were also required to have data on all four domains (number and site of involved joints, serologic abnormality, elevated acute-phase response, and symptom duration) in order to calculate individual scores.

Reference Standard. Patients were classified as having or not having RA based on the main diagnosis (in the opinion of the rheumatologist) that was documented in the medical record. When the diagnosis was either unclear or changed over the course of care, an adjudication team of four rheumatologists (CB, JCT, SB, JEP) classified patients by consensus based on a review of clinical characteristics.

Health Administrative Data Sources. Once patients were classified as having or not having RA as per our reference standard definition, corresponding administrative data were obtained for each patient covering the period of April 1, 1991 to March 31, 2011 (the years in which administrative data were available during the study period). Physician claims data were identified in the Ontario Health Insurance Plan (OHIP) Database.28 OHIP is the provincial insurance plan, covering health care for all Ontario citizens. Physicians are reimbursed by submitting claims to OHIP for medical services provided. A diagnosis is provided with each claim, which represents the main ‘reason for the visit’. These diagnoses are coded in the 8th or 9th revision of the International Classification of Diseases (ICD).29 Hospital visits were identified using the Canadian Institute for Health Information Discharge Abstract Database (CIHI-DAD), which contains detailed information regarding all hospital admissions, and the National Ambulatory Care Reporting System (NACRS), which records all hospital-based and community-based ambulatory care for day surgery and emergency rooms (ERs).91 Hospital data prior to 2002 have diagnoses coded in ICD-9,29 with up to 16 diagnoses recorded per hospital encounter. Hospitalizations and emergency department encounters after 2002 are coded using ICD-10, and each record can contain between 1 and 25 diagnoses per encounter. For seniors, medication exposures for glucocorticosteroids, DMARDs and biologics were determined using the pharmacy claims database of the Ontario Drug Benefit (ODB) Program, which covers residents aged 65 years or older.30 Physician specialty was obtained from the Institute for Clinical Evaluative Sciences Physician Database (IPDB).92 The quality of data in the IPDB is routinely validated against the Ontario Physician Human Resource Data Centre database, which verifies this information through periodic telephone interviews with physicians. Using a combination of physician service claims and the IPDB, we identified the specialty of each physician who treated a study patient. Physicians were classified as musculoskeletal specialists [which included rheumatologists, orthopedic surgeons, and internal medicine specialists (general internists)] or other. In Canada, general internists are specialists who have hospital-based and referral practices, and do very little primary care. Some general internists further specialize in areas such as arthritis and practice effectively as rheumatologists. These datasets are linked in an anonymous fashion using encrypted health insurance numbers and have very little missing information.32

Sample Size. We aimed for a sample size that would allow us to detect a sensitivity and specificity of approximately 85% with 95% confidence intervals (CI) within +/- 5% of the estimate. We determined that this would require at least 50 patients with rheumatologist-confirmed RA. Current Ontario estimates suggest that at least 25% of patients seen in rheumatology clinics have RA.93,94 Therefore, identifying 50 RA cases would require a random sample of 200 patients. This target sample size was then doubled in order to permit identification of at least 50 additional RA patients aged 65 years and older. Adding seniors, who are covered by public drug insurance, would allow us to properly evaluate the benefit of including prescription drug claims in our case-finding algorithms. We further maximized our precision by taking into account between-cluster variation when estimating our sample size.95 Within each cluster (rheumatology site) the sensitivity and specificity are assumed to be the same, but are expected to differ among clusters by an amount dependent upon the intraclass correlation (ICC). We set our ICC to 0.05 and our cluster size (n) to 25 patients, as we estimated this to be the number of charts we could abstract in one day (25 patients per rheumatologist). Our final sample size needs were estimated to be 440 patients. Therefore, from our sampling frame, we randomly selected 25 patient records from each of the 18 rheumatology clinics (N=450).
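The cluster adjustment described above follows the standard design-effect inflation for equal-size clusters, DEFF = 1 + (m − 1) × ICC. As a minimal sketch (using the cluster size and ICC stated in the text; the reading that the 200-patient target is the quantity being inflated is an assumption, though it reproduces the stated total of 440):

```python
def design_effect(cluster_size: int, icc: float) -> float:
    """Variance inflation factor (DEFF) for equal-size cluster sampling."""
    return 1 + (cluster_size - 1) * icc

deff = design_effect(cluster_size=25, icc=0.05)  # 1 + 24 * 0.05 = 2.2
# One reading of the text: inflating the initial 200-patient target by the
# design effect gives 200 * 2.2 = 440, matching the stated final sample size.
adjusted_n = 200 * deff
print(round(deff, 2), round(adjusted_n))
```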


Test Methods. Results are reported using the modified Standards for Reporting of Diagnostic Accuracy (STARD) criteria.96,54

Algorithm Testing. We pre-specified a set of over 130 case-finding algorithms using combinations of RA-coded physician billings and primary and secondary hospital discharge diagnoses (using ICD-9 and ICD-10 diagnosis codes 714 and M05-M06), prescription drug claims (for glucocorticosteroids, DMARDs, and biologics), various time intervals between the RA-coded health services, and whether the services were rendered by a musculoskeletal specialist (rheumatologist, internist or orthopedic surgeon). Additionally, we tested algorithms with exclusion criteria in which diagnosis codes for ‘other rheumatology conditions’ appeared after an RA diagnosis code, or for which the RA diagnosis code was not provided when a patient was seen by a rheumatologist. The ‘other rheumatology diagnoses’ included: osteoarthritis, gout, polymyalgia rheumatica, other seronegative spondyloarthropathies, ankylosing spondylitis, connective tissue disorder, psoriasis, synovitis/tenosynovitis/bursitis, and vasculitis. All algorithms were tested on the entire sample except those requiring prescription drug claims, which were limited to patients aged 65 years and older. The algorithms were applied to all administrative data for the study period (up to March 31, 2011).
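To illustrate how such case-finding algorithms can be operationalized against per-patient claims, the sketch below uses a hypothetical `Claim` record (not the actual ICES data structures) and applies one plausible reading of “(1 hospitalization RA code ever) OR (3 physician RA claims with ≥1 by a specialist in a 2 year period)”: three consecutive RA-coded physician claims spanning at most two years, at least one rendered by a specialist.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Claim:
    service_date: date
    source: str        # "hospital" or "physician" (RA-coded claims only)
    specialist: bool   # rendered by a musculoskeletal specialist

def meets_ra_algorithm(claims: list[Claim], window_days: int = 730) -> bool:
    """One reading of '(1 hospitalization RA code ever) OR
    (3 physician RA claims, >=1 by a specialist, within 2 years)'."""
    if any(c.source == "hospital" for c in claims):
        return True
    phys = sorted((c for c in claims if c.source == "physician"),
                  key=lambda c: c.service_date)
    # examine each run of 3 consecutive RA-coded physician claims
    for i in range(len(phys) - 2):
        trio = phys[i:i + 3]
        span = (trio[2].service_date - trio[0].service_date).days
        if span <= window_days and any(c.specialist for c in trio):
            return True
    return False

# hypothetical patient: three physician RA claims in 14 months, one by a specialist
claims = [
    Claim(date(2009, 1, 10), "physician", False),
    Claim(date(2009, 8, 3), "physician", True),
    Claim(date(2010, 3, 21), "physician", False),
]
print(meets_ra_algorithm(claims))  # True
```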

Statistical Analysis. Descriptive statistics were used to characterize the study population. We computed the sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) of each algorithm with corresponding 95% CIs. Details of how these measures were computed are illustrated below:


The measures were derived from the 2×2 cross-classification of the administrative data algorithm result (positive/negative) against the reference standard (cases/non-cases), with cells TP (true positives), FP (false positives), FN (false negatives) and TN (true negatives):

Pre-test prevalence = (TP + FN) / (TP + FP + FN + TN)
Post-test prevalence = (TP + FP) / (TP + FP + FN + TN)
Sensitivity = TP / (TP + FN)
Specificity = TN / (FP + TN)
PPV = TP / (TP + FP)
NPV = TN / (FN + TN)
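These definitions translate directly into code. As a sketch (point estimates only, no confidence intervals), using the cell counts reported in Table 2 for the preferred algorithm:

```python
def accuracy_measures(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Standard 2x2 diagnostic accuracy measures (point estimates only)."""
    total = tp + fp + fn + tn
    return {
        "pretest_prevalence": (tp + fn) / total,
        "posttest_prevalence": (tp + fp) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (fp + tn),
        "ppv": tp / (tp + fp),
        "npv": tn / (fn + tn),
    }

# Cell counts reported in Table 2 for the preferred algorithm:
# (1 H ever) OR (3 P with >=1 P by a specialist in 2 YR)
m = accuracy_measures(tp=143, fp=45, fn=5, tn=254)
print({k: round(v * 100) for k, v in m.items()})
# sensitivity 97, specificity 85, PPV 76, NPV 98 -- matching the text
```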

For patients for whom an RA diagnosis date was present in the medical record, we also assessed the timing of the first RA diagnosis code in relation to the rheumatologist-reported diagnosis date, and compared the mean (standard deviation; SD) disease duration across the two data sources (medical records and administrative data) using a paired t test.
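The duration comparison above rests on a paired t test of the per-patient difference between the two duration estimates. A minimal pure-Python sketch (the durations shown are illustrative, not study data):

```python
import math

def paired_t(x: list[float], y: list[float]) -> tuple[float, int]:
    """Paired t statistic and degrees of freedom for matched samples,
    e.g. per-patient disease duration from charts vs. administrative data."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((v - mean_d) ** 2 for v in d) / (n - 1)  # sample variance
    se = math.sqrt(var_d / n)                             # SE of the mean diff
    return mean_d / se, n - 1

# illustrative durations (years): chart-reported vs. administrative estimates
chart = [12.0, 3.5, 8.0, 20.0, 6.0]
admin = [11.0, 3.0, 7.5, 17.0, 6.5]
t, df = paired_t(chart, admin)
```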

Additionally, we reviewed the false positive cases using an algorithm that performed well (details below) in the primary analysis. A false positive was defined as a patient who was coded as having RA by the administrative data algorithm but who did not have RA according to the reference standard. False negative cases were also reviewed.


Our a priori criterion for selecting the optimal algorithm for use in Ontario was to target one that obtained the highest possible PPV (>75%) while maximizing sensitivity (>95%) and specificity (>85%) over the shortest possible duration (in years).

Finally, we assessed the extent of variation in the accuracy of the optimal algorithm across the 18 rheumatology clinics. For this analysis, each individual rheumatologist was treated as a separate entity and forest plots of specificity were generated.

All analyses were performed at the Institute for Clinical Evaluative Sciences (ICES) using SAS version 9.2 (SAS Institute, Cary, North Carolina).

3.4 Results

We randomly selected 556 patients from 18 rheumatology clinics in order to meet our target sample size of 450 patients. Of the 106 patients excluded, the majority (75; 71%) had only one visit to the rheumatologist during the study time frame (Figure 1). Among the 450 patients included, only 5 required adjudication by the rheumatology team. In total, 149 patients (33%) were classified as having RA as per our reference standard definition. Characteristics of RA cases and non-RA patients are presented in Table 1. Most patients were female (77% RA; 65% non-RA), and the mean (SD) age for RA cases and non-RA patients was 61.8 (14.1) and 57.9 (16.7) years, respectively. Almost half of the RA cases (42%) had documentation of being seen by another rheumatologist at some point in time.


Among patients without RA, the most prevalent rheumatologist-reported diagnoses were osteoarthritis (45%), seronegative spondyloarthropathies (19%) and connective tissue diseases (19%).

Most RA patients (91%) had documentation of having RA for over two years, 71% had documentation of meeting the 1987 RA criteria, and 80% met the 2010 RA criteria. The frequency of rheumatologic drug prescriptions was higher among all drug classes for RA patients in comparison to non-RA patients and almost all RA cases (96%) had documentation of at least one DMARD prescription (Table 1).

We successfully linked 447 (99% of 450) rheumatology clinic patients to the administrative databases. The unsuccessful linkages were due to invalid health insurance numbers. Sensitivity, specificity, and predictive values are reported in Tables 2 (for all patients) and 3 (for patients aged 65 and older).

Overall, sensitivity was very high (94-100%) for any algorithm that included RA diagnosis codes derived from physician billing claims. Specificity and PPV were modest to excellent and increased when algorithms required multiple RA codes or codes by a specialist. There was a slight increase in sensitivity, and reductions in specificity and PPV, as the observation window for multiple diagnosis codes increased from 1 to 2 years.

Among seniors, the addition of any RA drugs (glucocorticosteroids, DMARDs, or biologics) to an algorithm had little impact on sensitivity, but decreased both specificity and PPV (Table 3). When RA drugs were restricted to DMARDs or biologics, there was a slight reduction in sensitivity but an increase in PPV.

The addition of hospitalization data to physician-billing algorithms had little impact on sensitivity or PPV. Varying the window between RA codes and exclusion criteria also did not improve algorithm performance.

Among the 116 RA patients who had a documented diagnosis date in their medical record, 17 (15%) had a diagnosis date prior to 1991, the earliest year available in the administrative data. For each patient, the diagnosis code of RA in the OHIP physician claims database predated any RA hospitalization codes. Excluding the 17 patients with established RA prior to the onset of administrative data, 67% of patients had their first RA code appear in administrative data within the same calendar year as the rheumatologist-reported RA diagnosis, and 84% within ±1 year. The mean (SD) disease duration estimated from administrative data [9.6 (6.6) years] was approximately one year shorter than the rheumatologist-diagnosed disease duration taken from the reference standard [10.8 (10.6) years]; a difference that was not statistically significant on a paired t test (p = 0.0544).

Based upon our a priori definition, the optimal algorithm to identify RA patients was: “(1 hospitalization RA code ever) OR (3 physician RA claims with ≥1 RA code by a specialist in a 2 year period)”, which had a sensitivity of 97%, specificity of 85%, PPV of 76% and NPV of 98%. Using this algorithm, there were 45 patients classified as false positive cases, where the administrative data algorithm mistakenly identified the patient as having RA. For these false positive patients, the most common reference standard diagnoses included seronegative spondyloarthropathies (49%) and severe osteoarthritis with overlapping conditions (27%). For the false negative cases, all 5 patients appeared to be incident RA cases who had only two of the required three RA physician claims in the administrative data.

A forest plot of specificity estimates by rheumatology clinic (figure 2) illustrates overlapping confidence intervals for all but one rheumatologist.

3.5 Discussion

In a single-payer healthcare system, administrative data provide a powerful resource for population- based rheumatology research. We demonstrate that with the use of an algorithm, administrative data can be used to accurately identify RA patients who have had their diagnosis clinically confirmed by a rheumatologist.

We identified 12 peer-reviewed studies evaluating the accuracy of administrative data algorithms for identifying RA patients.35-39,41-46,97 While each study has to be interpreted individually due to inconsistent methods98, all of the physician-billing algorithms we tested outperformed previous validation studies in terms of sensitivity (>95%). Our specificity estimates were modest overall, which reflects our comparator group of non-RA patients: other rheumatology patients, who resemble RA cases more closely than a sample drawn from the general population would. Singh and colleagues44 used a similar approach to validation by sampling 184 patients from one rheumatology clinic in the Veterans Administration Database and observed similar specificity. However, they were able to improve the specificity by including laboratory test (rheumatoid factor) results, which were not available to us in administrative data. Considering that our comparator group (non-RA patients) had a high proportion of patients with other inflammatory arthritides, administrative data algorithms are likely to perform very well at detecting the presence of RA at a population level. These conclusions are supported by the results of an independent evaluation of the performance of these same algorithms using primary care medical records as the reference standard (Chapter 4).

The addition of drug exposure information to administrative data algorithms has been shown to improve both specificity and PPV.44,46 We defined RA drug exposure two ways: grouping glucocorticosteroids, DMARDs, and biologics; and then excluding glucocorticosteroids. Our results suggest that prescription drug claims may slightly improve the accuracy in identifying RA patients, albeit with overlapping confidence intervals. While requiring a DMARD or biologic dispensation increased specificity and PPV, it reduced sensitivity, as not all RA patients have consistent DMARD use throughout their entire disease course.99,100 Also, adding rheumatology drug exposure data to the algorithm may not have improved specificity among a rheumatology clinic population, because a large proportion of our rheumatology patients are prescribed these drugs for similar autoimmune rheumatologic conditions.


On their own, hospital diagnosis codes demonstrated excellent specificity (96%) but poor sensitivity (22%). Interestingly, a previous Canadian study35 reported a sensitivity of 84% for patients with ‘at least one inpatient diagnostic code’; however, the reference standard was comprised of hospitalized RA patients. At the other extreme, our study had substantially higher sensitivity than that found by another Canadian validation study using patient self-report as a reference standard. Lix and colleagues36 estimated a sensitivity of <10% for patients with ‘at least one inpatient or two outpatient RA claims’; however, many of their self-reported cases of RA would not be true RA, as verified by a physician.82 Additionally, we found that adding hospitalization data to physician-billing data added little value. However, our study was performed in a setting where all patients had access to rheumatologists. Information about hospitalization of RA patients may be more important for case ascertainment in settings with poor access to rheumatologists.

While requiring a single RA code to identify RA patients was 100% sensitive in capturing RA, it only slightly improved sensitivity over algorithms requiring multiple RA codes, at a dramatic cost in false positives. Others101,102,103 have used this approach to identify RA patients; however, the high proportion of false positives identified would substantially over-report the burden of RA. False positives among non-RA patients with at least one RA diagnosis code may be related to testing for disease rather than confirmed disease, coding errors, or the evolution of rheumatology diagnoses over time. Finally, of the false positives identified using the algorithm of ‘at least one outpatient RA claim’, over half (58%) were due to specialist codes. A recent survey on billing practices among Canadian rheumatologists identified that almost half (48%) of rheumatologists use a diagnosis code for billing even if they are not certain that the patient has the disease,104 which further emphasizes the importance of requiring cumulative diagnosis codes for disease ascertainment.

When we evaluated the ability of administrative data to detect incident RA cases, 84% of RA patients had an RA diagnosis code present within ±1 year of the rheumatologist-documented RA diagnosis.

However, we may be underestimating the ability of administrative data to detect incident RA cases (especially for patients with longstanding RA), as 42% of RA patients were seen by other rheumatologists, who may have a more accurate date of diagnosis. Additionally, disease duration will be underestimated for patients with a diagnosis established prior to the onset of administrative data or for those who move into the jurisdiction. Previous administrative data studies have estimated disease duration as means for risk adjustment105,106 and our results suggest that this variable is fairly well estimated from administrative data in jurisdictions of universal health care.

This study has both strengths and limitations. Patients were randomly sampled (unselected by diagnoses) from rheumatology clinics. We performed purposeful sampling of rheumatologists primarily for feasibility, but also because previous administrative database validation studies have shown low participation rates107 from randomly selected physicians, which could actually bias participant selection towards good performers; hence, our strategic sampling may have produced results that are as (or more) valid than random sampling. Initially, 20 rheumatologists were approached to participate and only one declined. Furthermore, over 10% of all Ontario rheumatologists participated.

Our inclusion criteria required patients to be active in the clinic; therefore, our approach may not capture quiescent RA patients who are no longer seeking rheumatology care, or patients who do not have access to a rheumatologist. As our sample was drawn from rheumatology clinics and our validation sample had a higher RA prevalence than the general population, our estimates may not be generalizable at the population level. While our sensitivity and specificity estimates can be applied to other populations that have different prevalence rates, predictive values are dependent on disease prevalence. However, this is not a limitation per se, as we concurrently validated these algorithms in a random sample of patients seen in primary care (in which the study prevalence closely approximates the RA prevalence of the general population) to further support our findings (reported in Chapter 4).

Algorithms are not only selected based on the databases available to researchers; the choice of algorithm may also vary by study purpose. For example, we identified the optimal algorithm by prioritizing PPV, which is important when creating a cohort of patients in order to avoid detecting false disease cases.85 Researchers may choose to prioritize higher sensitivity to detect positive cases in order to estimate the potential burden of disease at the population level, while others may choose to maximize the combination of sensitivity and specificity to improve accuracy. Algorithms with high specificity are often needed to create a homogeneous sample of patients for evaluating outcomes of disease and/or treatments. Generally, additional criteria in algorithms are expected to increase specificity at the expense of sensitivity. Choosing the wrong algorithm or prioritizing the wrong accuracy measure can lead to misclassification, which can reduce statistical power and generalizability, and increase bias and possibly study cost.85

In conclusion, this study has demonstrated the accuracy of administrative data algorithms for identifying RA patients among a rheumatology clinic population. We found that for RA patients who have seen a rheumatologist, physician-billing algorithms are highly sensitive in identifying these patients. Our findings suggest that pharmacy data do not significantly improve the accuracy in identifying RA. The results of this work support the selection of an optimal algorithm for identifying RA patients in administrative data in order to establish a province-wide database of all patients with RA. Overall, the algorithm of “[1 hospitalization RA code] OR [3 physician RA diagnosis codes (claims) with ≥1 by a specialist in a 2 year period]”, with the first RA diagnosis code defining disease onset, was selected as the preferred algorithm.


3.6 Tables and Figures

3.6.1 Figure 1: Flow diagram of selection of study participants

Patients randomly sampled: n=556
Excluded: n=106*
  Health card missing or void: n=5
  Only 1 recent visit: n=75
  Not seen at least twice between 2000 and 2011: n=51
  Disease onset < 20 years of age: n=9
  *34 patients failed > 1 inclusion criterion
Patients meeting inclusion criteria: n=450
Review by the rheumatology adjudication team: n=5 (1 RA; 4 non-RA)
Patients included: n=450 (RA = 149; non-RA = 301)
Linkage with administrative data unsuccessful: n=3 (1 RA; 2 non-RA)
Patients included in analyses: n=447 (RA = 148; non-RA = 299)


3.6.2 Table 1: Characteristics of RA patients and non-RA patients

Characteristics | RA (n=149) | Non-RA (n=301)
Age, mean (SD) years | 61.8 (14.1) | 57.9 (16.7)
Female, n (%) | 115 (77.2) | 195 (64.8)
Seen by another rheumatologist, n (%) | 62 (41.6) | 60 (19.8)
Documentation of disease onset, n (%) | 116 (77.9) | -
Disease duration > 2 years+, n (%) | 106 (90.6) | -
Disease duration, mean (SD) years | 10.8 (10.6) | -
Rheumatoid factor positive++, n (%) | 81 (61.8) | 20 (16.3)
1987 ACR criteria*, n (%)
  ≥ 3 out of 7 criteria | 121 (81.2) | 38 (12.6)
  ≥ 4 out of 7 criteria | 106 (71.1) | 18 (6.0)
2010 ACR/EULAR criteria**, n (%)
  ≥ 5 out of 10 points | 101 (96.2) | 21 (25.0)
  ≥ 6 out of 10 points | 84 (80.0) | 14 (16.7)
Documented pharmacologic exposures (ever), n (%)
  ≥ 1 NSAID | 86 (57.8) | 125 (41.5)
  ≥ 1 COX-2 inhibitor | 48 (32.2) | 83 (27.6)
  ≥ 1 glucocorticosteroid (any route) | 110 (73.8) | 161 (53.5)
  ≥ 1 DMARD | 143 (96.0) | 123 (40.9)
    Methotrexate | 121 (81.2) | 51 (16.9)
    Hydroxychloroquine | 93 (62.4) | 55 (18)
    Sulfasalazine | 45 (30.2) | 27 (9.0)
    Leflunomide | 41 (27.5) | 9 (3.0)
    Other DMARDs# | 28 (18.8) | 35 (15.0)
  ≥ 1 biologic& | 35 (23.5) | 24 (8.0)

+ Date of RA diagnosis of 2009 or prior, for those RA patients with documentation of date of disease onset (n=116).
++ Rheumatoid factor status was unavailable for 18 (12%) RA patients and 178 (59%) non-RA patients.
* 1987 criteria were summed even if a patient had missing data; for classification of ‘definite RA’, a total score of 4 or greater (of a possible 7) is required.
** 2010 criteria were only summed for patients with new-onset arthritis who had data on all four domains: number and site of involved joints (score range 0-5), serologic abnormality (0-3), elevated acute-phase response (0-1), and symptom duration (2 levels; 0-1). Results of the 2010 criteria are reported for 105 (70.5%) RA patients and 84 non-RA patients who had data on all four domains. A total score of 6 or greater (of a possible 10) is required for classification of ‘definite RA’. As the individual criteria within each domain are weighted differently, with placement into the highest possible category, only the total scores are presented (not the individual criteria).
# Other DMARDs include: azathioprine, chloroquine, cyclophosphamide, cyclosporine, gold, minocycline.
& Etanercept, adalimumab, infliximab, certolizumab, golimumab, abatacept, anakinra, rituximab, tocilizumab.


3.6.3 Table 2: Test characteristics of multiple algorithms: results for patients aged ≥20y&

Algorithm [pretest prevalence: 33%] | TP | TN | FN | FP | Post-test prev.# | Sensitivity [95% CI] | Specificity [95% CI] | PPV [95% CI] | NPV [95% CI]
1 H ever | 34 | 288 | 114 | 11 | 10% | 23 (16-30) | 96 (94-99) | 76 (63-88) | 72 (67-76)
1 H ever OR 1 ER visit ever | 40 | 286 | 108 | 13 | 12% | 27 (20-34) | 96 (93-98) | 76 (64-87) | 73 (68-77)
1 P ever | 148 | 179 | 0 | 120 | 60% | 100 (100-100) | 60 (54-65) | 55 (49-61) | 100 (100-100)
1 P ever by a specialist | 147 | 230 | 1 | 69 | 48% | 99 (98-100) | 77 (72-82) | 68 (62-74) | 100 (99-100)
2 P by any physician in 1 YR | 145 | 231 | 3 | 68 | 48% | 98 (96-100) | 77 (73-82) | 68 (62-74) | 99 (97-100)
2 P by any physician in 2 YR | 146 | 228 | 2 | 71 | 49% | 99 (97-100) | 76 (71-81) | 67 (61-74) | 99 (98-100)
2 P by any physician in 3 YR | 146 | 225 | 2 | 74 | 49% | 99 (97-100) | 75 (70-80) | 66 (60-73) | 99 (98-100)
3 P by any physician in 1 YR | 140 | 256 | 8 | 43 | 41% | 95 (91-98) | 86 (82-90) | 77 (70-83) | 97 (95-99)
3 P by any physician in 2 YR | 143 | 245 | 5 | 54 | 44% | 97 (94-100) | 82 (78-86) | 73 (66-79) | 98 (96-100)
3 P by any physician in 3 YR | 143 | 243 | 5 | 56 | 45% | 97 (94-100) | 81 (77-86) | 72 (67-78) | 98 (96-100)
2 P with ≥1 P by a specialist in 1 YR | 145 | 248 | 3 | 51 | 44% | 98 (96-100) | 83 (79-87) | 74 (68-80) | 99 (98-100)
2 P with ≥1 P by a specialist in 2 YR | 146 | 246 | 2 | 53 | 45% | 99 (97-100) | 82 (78-87) | 73 (67-80) | 99 (98-100)
2 P with ≥1 P by a specialist in 3 YR | 146 | 244 | 2 | 55 | 45% | 99 (97-100) | 82 (77-86) | 73 (67-79) | 99 (98-100)
3 P with ≥1 P by a specialist in 1 YR | 139 | 264 | 9 | 35 | 39% | 94 (90-98) | 88 (85-92) | 80 (74-86) | 97 (95-99)
3 P with ≥1 P by a specialist in 2 YR | 143 | 257 | 5 | 42 | 41% | 97 (94-100) | 86 (82-90) | 77 (71-83) | 98 (96-99)
3 P with ≥1 P by a specialist in 3 YR | 143 | 256 | 5 | 43 | 42% | 97 (94-100) | 86 (82-90) | 77 (71-83) | 98 (96-100)
(1 H ever) OR (2 P with ≥1 P by a specialist in 1 YR) | 145 | 245 | 3 | 54 | 45% | 98 (96-100) | 82 (78-86) | 73 (67-79) | 99 (97-100)
(1 H ever) OR (2 P with ≥1 P by a specialist in 2 YR) | 146 | 243 | 2 | 56 | 45% | 99 (97-100) | 81 (77-86) | 72 (66-79) | 99 (98-100)
(1 H ever) OR (2 P with ≥1 P by a specialist in 3 YR) | 146 | 241 | 2 | 58 | 46% | 99 (97-100) | 81 (76-85) | 72 (65-78) | 99 (98-100)
(1 H ever) OR (3 P with ≥1 P by a specialist in 1 YR) | 139 | 261 | 9 | 38 | 40% | 94 (90-98) | 87 (84-91) | 79 (73-85) | 97 (95-99)
(1 H ever) OR (3 P with ≥1 P by a specialist in 2 YR) | 143 | 254 | 5 | 45 | 42% | 97 (94-100) | 85 (81-89) | 76 (70-82) | 98 (96-100)
(1 H ever) OR (3 P with ≥1 P by a specialist in 3 YR) | 143 | 253 | 5 | 46 | 42% | 97 (94-100) | 85 (81-89) | 76 (70-82) | 98 (96-100)
(1 H ever) OR (2 P ≥8 weeks apart in 2 YR), no exclusions* | 144 | 229 | 4 | 70 | 48% | 97 (95-100) | 77 (72-81) | 67 (61-74) | 98 (97-100)
(1 H ever) OR (2 P ≥8 weeks apart in 2 YR), excluding Case A or Case B | 104 | 243 | 44 | 56 | 36% | 70 (63-78) | 81 (77-86) | 65 (58-72) | 85 (81-89)
(1 H ever) OR (2 P ≥8 weeks apart in 3 YR), excluding Case A or Case B | 106 | 239 | 42 | 60 | 37% | 72 (64-79) | 80 (75-85) | 64 (57-71) | 85 (81-89)
(1 H ever) OR (2 P ≥8 weeks apart in 4 YR), excluding Case A or Case B | 108 | 237 | 40 | 62 | 38% | 73 (66-80) | 79 (75-84) | 64 (56-71) | 86 (81-90)
(1 H ever) OR (2 P ≥8 weeks apart in 5 YR), excluding Case A or Case B | 108 | 235 | 40 | 64 | 38% | 73 (66-80) | 79 (74-83) | 63 (56-70) | 86 (81-90)
(1 H ever) OR (2 P ≥8 weeks apart in 2 YR), excluding Case A | 104 | 244 | 44 | 55 | 36% | 70 (63-78) | 82 (77-86) | 65 (58-73) | 85 (81-89)

& All ages: N = 447 patients (148 RA, 299 non-RA). # Post-test prevalence. TP = true positive; TN = true negative; FN = false negative; FP = false positive; H = hospitalization code; P = physician diagnostic code; specialist = rheumatologist, internal medicine, orthopedic surgeon.
* Exclusions: Case A: patients with ≥2 P with another rheumatology diagnosis subsequent to an RA diagnosis; Case B: patients for whom the diagnosis of RA was not confirmed by a specialist. Other rheumatology diagnoses include: osteoarthritis (715), gout (274, 712), polymyalgia rheumatica (725), other seronegative spondyloarthropathy (721), ankylosing spondylitis (720), psoriasis (696), synovitis/tenosynovitis/bursitis (727), connective tissue disorder (710), vasculitis (446), others (716; 718; 728; 727; 728; 729; 781)

3.6.4 Table 3: Test characteristics of multiple algorithms: Results for Patients aged ≥ 65 years

Algorithms [Pretest Prevalence: 36%] | TP | TN | FN | FP | Post-test Prev. | Sensitivity % (95% CI) | Specificity % (95% CI) | PPV % (95% CI) | NPV % (95% CI)
1 P AND ≥1 Rx ever | 55 | 69 | 2 | 33 | 55% | 97 (92-100) | 68 (59-77) | 63 (52-73) | 97 (93-100)
1 P AND ≥1 DMARD or Biologic | 53 | 86 | 4 | 16 | 43% | 93 (86-100) | 84 (77-91) | 77 (67-87) | 96 (91-100)
2 P AND ≥1 Rx ever | 55 | 78 | 2 | 24 | 50% | 97 (92-100) | 77 (68-85) | 70 (60-80) | 98 (94-100)
2 P AND ≥1 DMARD or Biologic | 53 | 90 | 4 | 12 | 41% | 93 (86-100) | 88 (82-95) | 82 (72-91) | 96 (92-100)
2 P > 60 days apart AND ≥1 Rx ever | 55 | 78 | 2 | 24 | 50% | 97 (92-100) | 77 (68-85) | 70 (60-80) | 98 (94-100)
2 P > 60 days apart AND ≥1 DMARD or Biologic | 53 | 90 | 4 | 12 | 41% | 93 (86-100) | 88 (82-95) | 82 (72-91) | 96 (92-100)
(1 H ever) OR (2 P AND ≥1 Rx in 1 YR) | 54 | 88 | 3 | 14 | 43% | 95 (89-100) | 86 (80-93) | 79 (70-89) | 97 (93-100)
(1 H ever) OR (2 P AND ≥1 DMARD or Biologic in 1 YR) | 52 | 94 | 5 | 8 | 38% | 91 (84-99) | 92 (87-97) | 87 (78-95) | 95 (91-99)
(1 H ever) OR (2 P AND ≥1 Rx in 2 YR) | 54 | 86 | 3 | 16 | 44% | 95 (89-100) | 84 (77-91) | 77 (67-87) | 97 (93-100)
(1 H ever) OR (2 P AND ≥1 DMARD or Biologic in 2 YR) | 52 | 93 | 5 | 9 | 38% | 91 (84-99) | 91 (86-97) | 85 (76-94) | 95 (91-99)
(1 H ever) OR (2 P with ≥1 P by specialist AND ≥1 Rx in 1 YR) | 54 | 91 | 3 | 11 | 41% | 95 (89-100) | 89 (83-95) | 83 (74-92) | 97 (93-100)
(1 H ever) OR (2 P with ≥1 P by specialist AND ≥1 DMARD or Biologic in 1 YR) | 52 | 95 | 5 | 7 | 37% | 91 (84-99) | 93 (88-98) | 88 (80-96) | 95 (91-99)
(1 H ever) OR (2 P with ≥1 P by specialist AND ≥1 Rx in 2 YR) | 54 | 90 | 3 | 12 | 42% | 95 (89-100) | 88 (82-95) | 82 (73-91) | 97 (93-100)
(1 H ever) OR (2 P with ≥1 P by specialist AND ≥1 DMARD or Biologic in 2 YR) | 52 | 94 | 5 | 8 | 38% | 91 (84-99) | 92 (87-97) | 87 (78-95) | 95 (91-99)
(1 H ever) OR (3 P AND ≥1 Rx in 1 YR) | 52 | 95 | 5 | 7 | 37% | 91 (84-99) | 93 (88-98) | 88 (80-96) | 95 (91-99)
(1 H ever) OR (3 P AND ≥1 DMARD or Biologic in 1 YR) | 51 | 98 | 6 | 4 | 35% | 90 (82-97) | 96 (92-100) | 93 (86-100) | 94 (90-99)
(1 H ever) OR (3 P AND ≥1 Rx in 2 YR) | 54 | 87 | 3 | 15 | 43% | 95 (89-100) | 85 (78-92) | 78 (69-88) | 97 (93-100)
(1 H ever) OR (3 P AND ≥1 DMARD or Biologic in 2 YR) | 52 | 93 | 5 | 9 | 38% | 91 (84-99) | 91 (86-97) | 85 (76-94) | 95 (91-99)
(1 H ever) OR (3 P AND ≥1 Rx in 3 YR) | 54 | 87 | 3 | 15 | 43% | 95 (89-100) | 85 (78-92) | 78 (69-88) | 97 (93-100)
(1 H ever) OR (3 P AND ≥1 DMARD or Biologic in 3 YR) | 52 | 93 | 5 | 9 | 38% | 91 (84-99) | 91 (86-97) | 85 (76-94) | 95 (91-99)
(1 H ever) OR (3 P with ≥1 P by specialist AND ≥1 Rx in 1 YR) | 52 | 96 | 5 | 6 | 36% | 91 (84-99) | 94 (90-99) | 90 (82-98) | 95 (91-99)
(1 H ever) OR (3 P with ≥1 P by specialist AND ≥1 DMARD or Biologic in 1 YR) | 51 | 98 | 6 | 4 | 35% | 90 (82-97) | 96 (92-100) | 93 (86-100) | 94 (90-99)
(1 H ever) OR (3 P with ≥1 P by specialist AND ≥1 Rx in 2 YR) | 54 | 90 | 3 | 12 | 42% | 95 (89-100) | 88 (82-95) | 82 (73-91) | 97 (93-100)
(1 H ever) OR (3 P with ≥1 P by specialist AND ≥1 DMARD or Biologic in 2 YR) | 52 | 94 | 5 | 8 | 38% | 91 (84-99) | 92 (87-97) | 87 (78-95) | 95 (91-99)
(1 H ever) OR (3 P with ≥1 P by specialist AND ≥1 Rx in 3 YR) | 54 | 90 | 3 | 12 | 42% | 95 (89-100) | 88 (82-95) | 82 (73-91) | 97 (93-100)
(1 H ever) OR (3 P with ≥1 P by specialist AND ≥1 DMARD or Biologic in 3 YR) | 52 | 94 | 5 | 8 | 38% | 91 (84-99) | 92 (87-97) | 87 (78-95) | 95 (91-99)

Seniors 65+: N = 159 patients: 57 RA and 102 non-RA; TP = True Positive; TN = True Negative; FN = False Negative; FP = False Positive; H = hospitalization code; P = physician diagnostic code; Specialist = rheumatologist, internal medicine, orthopedic surgeon; Rx = oral corticosteroid, disease-modifying anti-rheumatic drug (DMARD) or biologic. DMARDs included: Azathioprine, Chloroquine, Cyclophosphamide, Cyclosporine, Gold, Minocycline; Biologics included: Etanercept, Adalimumab, Infliximab, Certolizumab, Golimumab, Abatacept, Anakinra, Rituximab, Tocilizumab.

3.6.5 Figure 2: Forest Plot of Specificity estimates by Rheumatology Site

Using the algorithm of “(1 hospitalization RA code) OR (3 physician RA claims with ≥1 by a specialist in a 2 year period)”. n=25 for each site (except for 3 sites with n=24).


3.7 Appendix

3.7.1 Sample Size Estimation for Administrative Data Validation Studies

The proposed validation study design is aimed at estimating the precision of coding algorithms for case ascertainment, in order to obtain accurate estimates of sensitivity and specificity. When the focus is on estimation, the sample size is determined by the width of the confidence interval: by specifying how precise (how narrow) the confidence interval must be, the sample size can be determined.108,109 The number of subjects needed to attain an expected level of sensitivity and/or specificity, together with the precision of that estimate (that is, the confidence interval), can be calculated. When we are equally interested in a test with a certain sensitivity and specificity, the greater of the two sample sizes is required.

We determined the sample size needed to obtain sensitivity and specificity estimates of approximately 85% with a precision ensuring that the 95% confidence intervals were within 10% (±5%) of the estimate. We estimated that we would require at least 50 individuals with RA, comprising approximately 25% of the study population, for a total sample size of 200. Current Ontario estimates suggest that, on average, 25% of rheumatology patients have RA. Because we required both adequate sensitivity and specificity, we took the greater of the two sample size estimates (200 rather than 67 patients). However, we proposed inflating our sample size to preserve precision, as we will require at least 50 cases of RA who are under 65 and 65 cases of RA who are > 65, since a sub-analysis will evaluate the use of drug prescriptions as a component of algorithms. Thus, we require at least 400 eligible patients.


For this study, we propose inflating our pre-determined sample size to maximize precision by taking into account between-cluster variation when estimating the sample size. If the clustering effect is not taken into account, the calculated variance (and thus the calculated confidence limits for sensitivity and specificity) will be inappropriately small, because the calculation ignores the positive correlation among patients nested within the practice of each rheumatologist. Even if the amount of correlation among patients nested within a rheumatology practice is small, the variance (and confidence limits) will be biased if the clustering effect is ignored.110 Correlated sampling units in diagnostic test evaluation studies imply that the sensitivity and specificity are not equal among clusters. Within each cluster (rheumatology site) the sensitivity and specificity are assumed to be the same, but they are expected to differ among clusters by an amount dependent on the intraclass correlation coefficient (ICC). The sample size requirement must be inflated by (multiplied by) the design effect, or variance inflation factor, which is a function of the average number of individuals sampled per cluster and the ICC:

Deff = 1 + (n − 1)ρ

where n is the average cluster size and ρ is the ICC of the outcome. We set our cluster size at 25 patients, as we estimated this to be the number of charts we could abstract in one day, so that we would only be required to visit each rheumatologist once (25 patients per rheumatologist). An ICC for rheumatology diagnoses could not be identified from previous studies, so we used an ICC determined from chronic disease diagnoses (diabetes and cardiovascular diagnoses) in the GPRD.95

Deff = 1 + (25 − 1) × 0.05 = 2.2
Estimated sample size = 200 × 2.2 = 440
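The worked calculation above can be checked in a few lines. This is an illustrative sketch of the arithmetic only (the values 25, 0.05 and 200 come from the text); it is not code used in the thesis.

```python
def design_effect(cluster_size: int, icc: float) -> float:
    # Variance inflation factor for cluster sampling: Deff = 1 + (n - 1) * ICC
    return 1 + (cluster_size - 1) * icc

# 25 charts per rheumatologist; ICC of 0.05 borrowed from GPRD chronic-disease studies
deff = design_effect(cluster_size=25, icc=0.05)
inflated_n = 200 * deff  # base sample size of 200 from the precision calculation

print(round(deff, 2))     # 2.2
print(round(inflated_n))  # 440
```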

To simplify, we decided to maximize our sample size by targeting 500 patients cared for by rheumatologists. To reach this sample size, we would require 25 rheumatology patients per rheumatologist from 20 rheumatologists.


3.7.2 Data Abstraction Forms


Patient Master List (Page ___ of ____) — one row per enrolled patient, with columns: Site ID, Patient #, RA / Non-RA, DOB (dd/mm/yyyy), Health Card Number. (Blank data-entry rows follow.)


Inclusion Criteria — Site ID: ______ Patient ID: __ __

Patient Information:
Medical record could be located: Y / N
Gender: M / F
OHIP No. in medical records: Y / N
Previously seen by another Rheum: Y / N / NA
> 1 visit in the last 2 years (March 2009 – 2011): Y / N
For RA patients: > 2 visits between April 2000 and Mar 2011: Y / N
Date symptoms began: ____ / ____ / ____ (dd/mm/yyyy)
Disease onset after > 20 years of age: NA / Y / N
Date of 1st RA Diagnosis: ____ / ____ / ____ (dd/mm/yyyy)

Visit Info — Rheumatic Diagnosis (check all diagnoses documented at each visit). For each visit date (dd/mm/yyyy), rows 1-27, the following diagnoses could be checked: RA; OA; Gout; PMR; MSK Unspecified arthritis; Other arthritis; AS; Other seronegative SpAs (PsA, ReA); PsO; Synovitis/Tenosynovitis/bursitis; Vasculitis (giant cell/temporal arteritis, polyarteritis nodosa); Connective Tissue (lupus, scleroderma, dermatomyositis, polymyositis); Other [text field].


Supplemental Visit Info Sheet — Site ID: ______ Patient ID: __ __. Identical layout to the Visit Info sheet, with visit date rows 31-64 and the same diagnosis checkboxes: RA; OA; Gout; PMR; MSK Unspecified arthritis; Other arthritis; AS; Other seronegative SpAs (PsA, ReA); PsO; Synovitis/Tenosynovitis/bursitis; Vasculitis (giant cell/temporal arteritis, polyarteritis nodosa); Connective Tissue (lupus, scleroderma, dermatomyositis, polymyositis); Other [text field].


1987 Classification Criteria — Site ID: ______ Patient ID: __ __ Date: ____ / ____ / ____ (dd/mm/yyyy). Each criterion recorded as Yes / No / N/A+:
1. Morning Stiffness: morning stiffness in and around joints, lasting at least 1 hour before maximal improvement.
2. > 3 involved joints: at least 3 joint areas simultaneously have had soft tissue swelling or fluid (not bony overgrowth alone) observed by a physician. The 14 possible areas are right or left PIP, MCP, wrist, elbow, knee, ankle, and MTP joints.
3. Hand joint involvement: at least one area swollen (as described above) in a wrist, MCP, or PIP joint.
4. Symmetric arthritis: simultaneous involvement of the same joint areas (as defined in 2) on both sides of the body. (Bilateral involvement of PIPs, MCPs, or MTPs is acceptable without absolute symmetry.)
5. Rheumatoid nodules: subcutaneous nodules over bony prominences or extensor surfaces, or in juxtaarticular regions, observed by a physician.
6. Radiographic changes: radiographic changes typical of rheumatoid arthritis on posteroanterior hand and wrist radiographs, which must include erosions or unequivocal bony decalcification localized in, or most marked adjacent to, the involved joints. (Osteoarthritis changes alone do not qualify.)
7. Rheumatoid factor positive: demonstration of abnormal amounts of serum rheumatoid factor by any method for which the result has been positive in <5% of normal control subjects.

2010 Classification Criteria — each item recorded as Yes / No / N/A+:
Joint involvement: 2-10 medium-large joints (1 point); 1-3 small joints (2 points); 4-10 small joints (3 points); more than 10 small joints (5 points).
Serology: negative RF and negative ACPA (0 points); low-positive RF or low-positive ACPA (2 points); high-positive RF or high-positive ACPA (3 points). Rheumatoid factor: ______ Anti-citrullinated protein antibody: ______
Duration of symptoms/synovitis (pain, swelling, tenderness): < 6 weeks (0 points); > 6 weeks (1 point).
Acute phase reactants: normal CRP and normal ESR (0 points); abnormal CRP or abnormal ESR (1 point). ESR: ______ CRP: ______
Erosive arthritis: Yes / No / N/A+
+Not done/not reported/missing.


Prescription Drug History — Has patient ever had… (check each, and record date of 1st record, dd/mm/yyyy):
NSAID: Diclofenac (Arthrotec, Voltaren), Diflunisal (Dolobid), Etodolac (Ultradol), Fenoprofen calcium (Nalfon), Floctafenine (Idarac), Flurbiprofen (Ansaid, Froben), Ibuprofen (Motrin, Advil), Indomethacin (Indocid), Ketoprofen (Orudis, Oruvail, Rhodis), Ketorolac tromethamine (Toradol), Mefenamic acid (Ponstan), Meloxicam (Mobicox), Nabumetone (Relafen), Naproxen (Naprosyn), Oxaprozin (Daypro), Phenylbutazone (Butazolidine), Piroxicam (Feldene), Sulindac (Clinoril), Tenoxicam (Mobiflex), Tiaprofenic acid (Surgam), Tolmetin sodium (Tolectin)
COX-2 Inhibitor: Celecoxib (Celebrex), Rofecoxib (Vioxx), Valdecoxib (Bextra)
Corticosteroids — Oral: Prednisone (Deltasone, Winpred), Cortisone (Cortone), Dexamethasone, Hydrocortisone (Cortef, Cortate, A-Hydrocort), Prednisolone; Intravenous: Methylprednisolone (Solumedrol); Intramuscular: Methylprednisolone (Depo-medrol), Triamcinolone (Kenalog, Aristospan); Intra-articular: Betamethasone (Celestone Soluspan, Betnesol), Methylprednisolone (Depo-medrol), Triamcinolone (Kenalog, Aristospan)
DMARDs: Azathioprine (Imuran), Chloroquine (Aralen), Cyclophosphamide (Cytoxan), Cyclosporine (Neoral, Sandimmune), D-Penicillamine (D-Penamine), Gold sodium thiomalate (Myochrysine), Hydroxychloroquine (Plaquenil), Leflunomide (Arava), Minocycline (Minocin), Methotrexate (MTX, Rheumatrex), Sulfasalazine (Salazopyrine)
Biologics: Abatacept (Orencia), Adalimumab (Humira), Anakinra (Kineret), Certolizumab (Cimzia), Etanercept (Enbrel), Golimumab (Simponi), Infliximab (Remicade), Rituximab (Rituxan), Tocilizumab (Actemra)
Other: ______

Chapter 4 Accuracy of Ontario Health Administrative Databases in Identifying Patients with Rheumatoid Arthritis [PART II]: A validation study using medical records of Primary Care Physicians

Jessica Widdifield,1 BSc, PhD(c); Claire Bombardier,1 MD, FRCPC; Sasha Bernatsky,2 MD, FRCPC, PhD; J. Michael Paterson,1,3,4 MSc; Diane Green,3 BSc; Jacqueline Young,3 MSc; Noah Ivers,1,3 MD, PhD(c); Debra Butt,1 MD, MSc; R. Liisa Jaakkimainen,1,3 MD, MSc; J. Carter Thorne,5 MD, FRCPC; Vandana Ahluwalia,6 MD, FRCPC; Karen Tu,1,3 MD, MSc. Affiliations: 1. University of Toronto, Toronto, ON; 2. McGill University, Montreal, QC; 3. Institute for Clinical Evaluative Sciences, Toronto, ON; 4. McMaster University, Hamilton, ON; 5. Southlake Regional Health Centre, Newmarket, ON; 6. William Osler Health Center, Brampton, ON.

Acknowledgements: First and foremost, we wish to thank all the family physicians who provide data to the Electronic Medical Record Administrative data Linked Database (EMRALD). We would like to thank our five chart abstractors for their work: Nancy Cooper, Abayomi Fowora, Diane Kerbel, Anne Marie Mior and Barbara Thompson, and to acknowledge the assistance of Myra Wang and Robert Turner at the Institute for Clinical Evaluative Sciences (ICES). Financial support was provided by the Canadian Institutes of Health Research (operating grant 119348). This study was also supported by ICES, a non-profit research corporation funded by the Ontario Ministry of Health and Long-Term Care (MOHLTC). The opinions, results and conclusions are those of the authors and are independent from the funding sources. No endorsement by ICES or the Ontario MOHLTC is intended or should be inferred. We also wish to thank Brogan Inc., Ottawa for use of their Drug Product and Therapeutic Class Database. Dr. Bombardier holds a Canada Research Chair in Knowledge Transfer for Musculoskeletal Care (2002-2016) and a Pfizer Research Chair in Rheumatology. Dr. Bernatsky holds a career award from the Fonds de la recherche en santé du Québec (FRSQ). Dr. Ivers holds a CIHR Fellowship Award in Clinical Research and a Fellowship Award from the Department of Family and Community Medicine, University of Toronto; and Dr. Tu holds a CIHR award in the Area of Primary Care (2011-2013).

Abstract word count: 250 Manuscript word count: 3229/3800 (excluding abstract) No. of Tables: 3 No. of Figures: 1 Key words: Rheumatoid arthritis, health administrative databases, validation study, sensitivity and specificity


4.1 Abstract

Previous validation studies of health administrative data for rheumatoid arthritis (RA) case ascertainment have sampled patients primarily from rheumatology clinics, which may limit the generalizability of the results. Our aim was to validate administrative data algorithms to identify RA patients drawn from primary care physician records in Ontario, Canada.

Methods:

We performed a retrospective chart abstraction study using a random sample of 7500 adult patients among 84 family physicians within the Electronic Medical Record Administrative data Linked Database (EMRALD). Using physician-reported diagnoses as the reference standard, we validated administrative data algorithms for RA case ascertainment.

Results:

We identified 69 patients with physician-reported RA, for a lifetime RA prevalence of 0.9%. Among RA cases, 86% had a documented diagnosis by a specialist. All algorithms tested had excellent specificity (>97%); however, sensitivity varied (75-90%) among physician billing algorithms. Despite the low RA prevalence, algorithms for identifying RA patients had adequate positive predictive value (PPV). The algorithm of “[1 hospitalization RA code] OR [3 physician RA diagnosis codes (claims) with ≥1 by a specialist in a 2 year period]” had a sensitivity of 78% (95% confidence interval [CI] 69-88), specificity of 100% (95% CI 100-100), PPV of 78% (95% CI 69-88) and negative predictive value (NPV) of 100% (95% CI 100-100).

Conclusion:

We rigorously evaluated the accuracy of RA administrative data algorithms in a random sample from primary care physician records. We demonstrate that with the use of a validated algorithm, administrative data can be used to accurately identify patients with RA within a general population.

Word count: 250/250

4.2 Introduction

Validation of health administrative data algorithms for identifying patients with different health states is an important step in using these data for chronic disease surveillance.111 Unfortunately, evaluating the accuracy of health administrative data for rheumatic diseases such as rheumatoid arthritis (RA) may be particularly challenging, as there is no gold standard test for the diagnosis of RA.12 While rheumatologists are important care providers for RA, primary care physicians also play a key role. To date, previous validation studies of administrative data for RA case ascertainment have sampled patients primarily from rheumatology clinics. This may limit the usefulness of the results due to the high prevalence of RA in rheumatology practices. In order to establish the optimal approach to identifying RA patients using administrative data at the population level, the prevalence of disease in a validation cohort should approximate the prevalence of disease in the population.54 Primary care medical records are likely to provide a closer approximation of population-based disease prevalence in Canada, as primary care physicians act as ‘gate-keepers’ for access to specialists and are the point of entry for patients into the health care system. Patients do not see specialists on an outpatient basis unless referred by another physician.

Our primary objective was to validate administrative data algorithms for accurate identification of RA patients from provincial administrative data, using diagnoses documented within the medical records of primary care physicians as the reference standard. The overarching aim is to inform the optimal method(s) of ascertaining RA patients from administrative data, and to use this information to create a province-wide research database of all individuals with RA living in Ontario, Canada.


4.3 Subjects and Methods

Setting and Design. In the province of Ontario, all 13 million residents are covered by universal, single-payer public health insurance, including access to hospital care and physicians’ services. A retrospective chart abstraction study was performed among a random sample of patients seen in primary care physician clinics to identify patients with and without RA. These patients were then linked to health administrative data to test and validate different combinations (algorithms) of physician billing, hospitalization and pharmacy data for identifying RA patients within administrative data.

This study was approved by the Sunnybrook Health Sciences Centre Research Ethics Board.

Participant Selection. We used the Electronic Medical Record Administrative data Linked Database (EMRALD) at the Institute for Clinical Evaluative Sciences (ICES), which is currently one of Canada’s largest primary care computerized databases and continues to be a rich data source for Canadian administrative data validation studies.112,113,114 All clinically relevant information contained in the primary care physician patient chart is transferred into EMRALD twice a year. This includes all physician office visits, information on the patient’s current and past medical history, family history, risk factors, allergies and immunizations, laboratory test results, prescriptions, specialist consultation letters, discharge summaries and diagnostic tests. At the time of study, 84 physicians had voluntarily contributed to EMRALD and their data met quality standards. These physicians are at diverse stages of clinical practice and represent both urban and rural Ontario. A random sample of 7500 adult patients aged 20 years or older and 2,000 senior patients aged 65 years or older as of December 31, 2010 was drawn from 73,014 qualifying patients. Patients were included if they had a valid health insurance number and date of birth, had at least one visit in the previous year, and were rostered (enrolled) to an EMRALD physician. This means that all physicians have a list of patients ‘rostered’ within their practice, for whom both the physician and the patient have signed an agreement with the provincial government identifying the physician responsible for the patient’s primary care service delivery.115

Data Abstraction. Each patient’s entire medical record was screened by one of five trained abstractors to identify whether any had evidence of possible inflammatory arthritis. For this, a standardized abstraction manual was developed based on the investigators’ clinical experience and the literature. A 10% sample of charts was abstracted twice by the same abstractor and a second time by a different abstractor to assess intra- and inter-rater reliability, respectively. Kappa scores for inter- and intra-rater reliability exceeded 0.85, indicating good agreement for all five chart abstractors. One experienced chart abstractor (JW) then reviewed the records of over 700 of these patients who, based on the initial screen, were identified as possibly having inflammatory arthritis. The purpose of the second screen was to verify whether these patients had a diagnosis of RA and whether a rheumatologist, orthopedic surgeon or internist confirmed it. In addition, the abstractor verified drug history [noting use of non-steroidal anti-inflammatory drugs (NSAIDs), glucocorticosteroids, disease-modifying anti-rheumatic drugs (DMARDs), and biologics], and whether patients satisfied elements of both the 1987 and 2010 classification criteria for RA.90,11 Results of serology (i.e. rheumatoid factor, RF, and anti-CCP antibodies) and acute phase reactant tests (i.e. erythrocyte sedimentation rate, ESR, and C-reactive protein, CRP) were also obtained from both the laboratory test fields within the EMR and specialist consultation notes. All trained chart abstractors were blinded to the administrative data diagnosis codes for all patients.
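As an illustration of the agreement statistic reported above, Cohen’s kappa for two raters’ paired categorical labels can be computed as follows. This is a minimal sketch; the thesis does not specify the software used for the reliability analysis.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    p_observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    p_chance = sum((rater_a.count(c) / n) * (rater_b.count(c) / n)
                   for c in categories)
    return (p_observed - p_chance) / (1 - p_chance)

# Perfect agreement on whether a chart shows possible inflammatory arthritis
print(cohens_kappa(["IA", "IA", "no", "no"], ["IA", "IA", "no", "no"]))  # 1.0
```

A kappa of 1.0 indicates perfect agreement, 0 indicates chance-level agreement, and values above 0.85 (as observed for all five abstractors) are conventionally read as very good agreement.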


Reference Standard. Patients were classified as RA cases and non-RA patients based on the level of evidence in the chart. The highest levels of evidence to support an RA diagnosis were: i) diagnosis by a rheumatologist, orthopedic surgeon, or internal medicine specialist; or ii) a primary care physician-documented RA diagnosis with supporting evidence (e.g., serology, joint involvement, or treatment) but without a supporting specialist consultation note. Additionally, patients were flagged as possible RA cases if the record mentioned RA but lacked supporting evidence, or if the record had a query RA diagnosis by both a specialist and the primary care physician. These ‘possible RA’ patients were used in a sensitivity analysis surrounding our reference standard definition of RA, in which both definite RA and possible RA patients were grouped together. While these additional ‘possible RA’ patients may be true RA patients, their medical history was not sufficiently complete to confirm the diagnosis.

Health Administrative Data Sources. Once patients were classified as having or not having RA according to our reference standard definitions, administrative data (physician billing and hospitalization diagnoses) were obtained for these patients for the period April 1, 1991 to March 31, 2011 (the years in which administrative data were available during the study period). We used the Ontario Health Insurance Plan (OHIP) Database28 to identify physician billing diagnosis codes. OHIP is the provincial insurance plan, covering health care for all Ontario citizens. Physicians are reimbursed by submitting claims to OHIP for medical services provided. A diagnosis is provided with each claim, which represents the main ‘reason for the visit’. These diagnoses are coded in the 8th or 9th revision of the International Classification of Diseases (ICD).29 Hospital visits were identified using the Canadian Institute for Health Information Discharge Abstract Database (CIHI-DAD), which contains detailed information regarding all hospital admissions, and the National Ambulatory Care Reporting System (NACRS), which records all hospital-based and community-based ambulatory care for day surgery and emergency rooms (ERs).91 Hospital data prior to 2002 have diagnoses coded in ICD-9,29 with up to 16 diagnoses recorded per hospital encounter. Hospitalizations and emergency department encounters after 2002 are coded using ICD-10, and each record can contain between 1 and 25 diagnoses per encounter. For subjects aged 65 years or older, medication exposures for DMARDs, biologics, and glucocorticosteroids were determined using the pharmacy claims database of the Ontario Drug Benefit (ODB) Program, which covers seniors.30 Information on physician specialty (for the billing claims) was obtained by linking the Institute for Clinical Evaluative Sciences Physician Database (IPDB) with the OHIP database.92 The quality of data in the IPDB is routinely validated against the Ontario Physician Human Resource Data Centre database, which verifies this information through periodic telephone interviews with physicians. Physician specialty was classified as primary care physicians and musculoskeletal specialists, the latter confined to rheumatologists, orthopedic surgeons, and internal medicine specialists (general internists). The OHIP Registered Persons Database (RPDB) contains a single unique record for each health care beneficiary, enabling linkage of these datasets in an anonymous fashion using encrypted health insurance numbers.32

Test Methods. Results are reported using the modified Standards for Reporting of Diagnostic Accuracy (STARD) criteria.96,54

Algorithm Testing. Over 130 algorithms were derived using combinations of physician billing diagnosis codes (ICD 714), primary and secondary hospital discharge diagnoses (ICD-9 714; ICD-10 M05-M06), and prescription drug claims (for DMARDs, biologics, and glucocorticosteroids), by varying the windows between diagnosis codes or the period in which diagnosis codes appeared, and whether the services were rendered by a musculoskeletal specialist (rheumatologist, internist or orthopedic surgeon). Additionally, we tested algorithms with exclusion criteria in which diagnosis codes for ‘other rheumatology conditions’ appeared after a diagnosis of RA, or for which an RA diagnosis code was not provided when a patient was seen by a rheumatologist. The ‘other rheumatology diagnoses’ included: osteoarthritis, gout, polymyalgia rheumatica, other seronegative spondyloarthropathy, ankylosing spondylitis, connective tissue disorder, psoriasis, synovitis/tenosynovitis/bursitis, and vasculitis. All algorithms were tested on the entire sample except those requiring prescription drug claims, which were limited to patients aged 65 years and older. The algorithms were applied to all administrative data for the study period (up to March 31, 2011).
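To make the algorithm logic concrete, the case definition “(1 hospitalization RA code ever) OR (3 physician RA claims with ≥1 by a specialist in a 2-year period)” can be sketched as follows. The record layout (tuples of service date, diagnosis code, specialist flag) is illustrative only and does not reflect the actual OHIP or CIHI-DAD schemas.

```python
from datetime import date

RA_HOSPITAL_CODES = {"714", "M05", "M06"}  # ICD-9 714; ICD-10 M05-M06

def meets_ra_algorithm(hospital_codes, physician_claims):
    """hospital_codes: iterable of discharge diagnosis codes (any encounter, ever).
    physician_claims: list of (service_date, diagnosis_code, is_msk_specialist)."""
    # Arm 1: any RA hospitalization code, ever
    if any(code in RA_HOSPITAL_CODES for code in hospital_codes):
        return True
    # Arm 2: >= 3 RA billing claims within a 2-year window, >= 1 by a specialist
    ra_claims = sorted((d, spec) for d, code, spec in physician_claims
                       if code == "714")
    for start, _ in ra_claims:
        window = [c for c in ra_claims if 0 <= (c[0] - start).days <= 730]
        if len(window) >= 3 and any(spec for _, spec in window):
            return True
    return False

claims = [(date(2005, 1, 10), "714", False),
          (date(2005, 6, 1), "714", True),   # claim billed by a rheumatologist
          (date(2006, 3, 1), "714", False)]
print(meets_ra_algorithm([], claims))  # True
```

In production, the same scan would run over the full 1991-2011 claims history for each patient; the sliding 2-year window is the key design choice, since it requires the three claims to be clustered in time rather than scattered over decades.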

Statistical Analysis. Descriptive statistics were calculated for patient clinical characteristics. The parameters of accuracy [sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV)] of each algorithm were calculated with the 95% confidence interval. We define sensitivity as the proportion of patients with RA documented in their medical record that were also identified as having RA using the administrative data algorithm; and specificity as the proportion of patients who were identified as not having RA by the medical record that were also identified as not having RA by the administrative data algorithm. We define PPV as the proportion of patients identified as having RA by the administrative data algorithm who were identified as having RA by the medical record; and NPV as the proportion of patients identified as not having RA by the administrative data algorithm who were identified as not having RA by the medical record. 90

All analyses were performed at ICES on de-identified data using SAS version 9.2 (SAS Institute, Cary, North Carolina).

Sample Size. We estimated that we would require at least 50 individuals with RA in order to obtain sensitivity and specificity estimates of approximately 85% with a precision that ensured the 95% confidence intervals were within +/- 5% of the estimate. Assuming an overall RA prevalence of 0.7-1% in the adult population, we required a study cohort comprising at least 7000 adult patients. Taking into account that RA prevalence increases in a mature population (>65 years of age),2 we estimated that we would need a random sample of approximately 2000 additional seniors in order to have enough RA cases aged >65 to ensure precise results when we restricted our analysis to seniors and incorporated drug prescriptions in our administrative algorithms. Therefore, stratified random sampling was performed to obtain 7500 patients aged 20 years and older for our primary analysis and an additional 2000 patients aged 65 and older for our analyses restricted to seniors.
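The cohort-size arithmetic above follows directly from the assumed prevalence. A minimal sketch, restating only the figures given in the text (50 required cases; 0.7% as the conservative end of the prevalence range):

```python
import math

cases_needed = 50        # minimum RA cases required for precise estimates
prevalence_low = 0.007   # conservative assumed adult RA prevalence (0.7%)

# Cohort size needed so the expected number of RA cases reaches 50
cohort_size = math.ceil(cases_needed / prevalence_low)
print(cohort_size)  # 7143, consistent with "at least 7000 adult patients"
```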

4.4 Results

Amongst the 84 physicians included in this study, the average length of time in practice was 18.4 years and the average length of time using their EMR was 5.6 years. Analyses were performed on the entire sample of adult patients aged 20 and older (n=7500), and subsequently restricted to all patients aged 65 and older (n=3426). The average age of the 7500 patients in the adult cohort was 51.5 years.


Based on our reference standard definition, the random sample of 7500 medical records identified 69 RA patients, for a lifetime RA prevalence of 0.9% for all adults aged 20 and older (figure 1). Amongst these adult RA cases, almost two thirds of patients were female (64%) and the average age was 61.9 (standard deviation, SD 13.7) years. Amongst the 3426 patients aged 65 and older who were sampled, 63 (1.8%) patients had RA. Most were female (65%) and their average age was 76.3 (SD 6.8) years.

Clinical characteristics of the 69 RA patients are shown in Table 1. Most RA patients (86%) had documentation of an RA diagnosis by a specialist. Most RA patients also had documentation of bilateral joint involvement and small joint synovitis. Seropositivity (for RF and/or anti-CCP) was documented in 39% of RA patients, and elevated acute phase reactants (i.e., ESR or C-reactive protein) were documented in 45%. DMARDs were the most commonly reported RA drug exposures, documented in 80% of patients.

Sensitivity, specificity, and predictive values for selected administrative data algorithms are reported in Table 2. All algorithms had excellent specificity (97-100%). However, the sensitivity of the physician billing algorithms varied (77-90%). Despite the low prevalence of RA in the primary care records, algorithms for identifying RA patients had modest PPV, which improved substantially with the requirement for musculoskeletal specialist billings (51-83%). Varying the duration of the observation window for diagnosis codes had little impact on the accuracy of the algorithms. The addition of time restrictions between diagnosis codes (e.g., diagnosis codes ≥ 8 weeks apart) and exclusion criteria also did not improve algorithm performance.

Among seniors (table 3), the requirement for having an RA drug exposure (glucocorticosteroid, DMARD, or biologic) slightly improved the PPV, although the 95% confidence intervals overlapped with those of algorithms that did not include RA drugs.

When we varied the definition of our reference standard to include all possible RA patients, the prevalence increased from 0.9% to 1.3%. When all algorithms were retested against this broader reference standard, there was a trend toward decreasing sensitivity (see Appendix).

4.5 Discussion

Our study demonstrates that administrative data algorithms can detect RA patients who receive regular primary care with a high degree of accuracy. The algorithm of “[1 hospitalization RA code] OR [3 physician RA diagnosis codes (claims) with ≥1 RA code by a specialist in a 2 year period]” had a sensitivity of 78%, specificity of 100%, PPV of 78% and NPV of 100%. This algorithm was selected as the preferred case definition when using administrative data for defining RA in Ontario. When we independently validated this algorithm among a random sample of 450 patients seen in rheumatology clinics (reported elsewhere), it demonstrated a sensitivity of 97%, specificity of 85%, PPV of 76% and NPV of 98%. As all algorithms tested had excellent specificity at the population level, the optimal algorithm was selected based on the highest possible PPV while maximizing sensitivity over the shortest possible duration to achieve accurate diagnosis of RA.
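The preferred case definition can be expressed procedurally. The actual implementation was in SAS over Ontario claims data; the sketch below is a hypothetical Python illustration of the same rule, with assumed record structures (claim dates and a specialist flag) and a 2-year window approximated as 730 days.

```python
from datetime import date, timedelta

def meets_ra_definition(hosp_ra_dates, claim_records):
    """Illustrative ORAD-style case definition:
    [>=1 hospitalization RA code] OR
    [>=3 physician RA claims within a 2-year window, >=1 from a specialist].
    claim_records: list of (service_date, is_specialist) tuples for RA claims.
    """
    if hosp_ra_dates:                     # any hospitalization with an RA code
        return True
    claims = sorted(claim_records)
    window = timedelta(days=730)          # approximate 2-year period
    for i, (start, _) in enumerate(claims):
        in_window = [c for c in claims[i:] if c[0] - start <= window]
        if len(in_window) >= 3 and any(spec for _, spec in in_window):
            return True
    return False

# Example: three RA claims over 14 months, one from a rheumatologist
claims = [(date(2008, 1, 10), False),
          (date(2008, 6, 2), True),
          (date(2009, 3, 15), False)]
print(meets_ra_definition([], claims))  # True
```

Sliding the window start over each claim ensures any qualifying 2-year cluster of claims is found, mirroring the “in a 2 year period” requirement.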

Our findings extend those of previous research performed using the General Practice Research Database in the United Kingdom, which also found a sensitivity of 80% for patients with 3 RA diagnosis codes.72 That study also identified challenges in extracting diagnostic information from primary care records, and patients with longstanding RA were subsequently excluded from its analysis. While our sensitivity estimate was similar among patients seen in primary care in Ontario, our clinical definition of RA was able to capture quiescent RA patients who had not sought or were no longer seeking specialty care.

The specificity of the administrative data algorithms for RA patients followed by primary care physicians was high. This indicates that individuals without RA are unlikely to be misclassified at the general population level, where the prevalence of RA is low. However, there may be some degree of misclassification among non-RA patients seen in rheumatology care, where patients may have other inflammatory conditions that more closely resemble RA.

Despite the low prevalence of RA in primary care records (similar to population-based RA prevalence), algorithms for identifying RA patients had moderately good PPV (78%). This is likely owing to the nature of RA management, which often involves referral to a specialist, frequent physician visits and a long, non-curative disease course. These observations suggest that patients with prevalent (long-term) chronic conditions may have a higher probability of being identified by similar administrative data algorithms owing to the disease course, management practices, and frequency of physician visits. This finding is further supported by the similarity of our results to those of the optimal administrative data algorithms for case ascertainment of diabetes,34 hypertension,33 and other chronic diseases with substantially higher prevalence than RA.

Others have applied exclusion criteria, specifically excluding patients if they had diagnosis codes for other rheumatology conditions subsequent to an RA diagnosis, or if their RA diagnosis was not confirmed when the patient saw a rheumatologist.99 These exclusions increase the complexity of algorithms, and as we observed no improvement in algorithm performance, our findings suggest that this additional complexity is unnecessary.

This study has both strengths and limitations. Patients were randomly sampled from primary care physician records and we tested many more permutations of administrative data algorithms than other studies. We also conducted rigorous chart reviews. However, misclassification of RA is a potential risk if there is a lack of documentation in the medical record, such as a failure to capture all specialist consultation notes. Further, clinical characteristics of the RA patients (e.g., seropositivity) in our study sample may be under-documented in primary care clinical records. Recognizing this challenge, we opted to include all physician-reported diagnoses to define our reference standard, as the retrospective design of most validation studies can make it difficult to determine true disease status. Another potential limitation is that our findings are derived from patients who have a regular source of primary care. Consequently, our results may not be generalizable to patients who do not have a regular primary care physician. Although almost ten million Ontarians (that is, over 80% of the population) are now rostered to a primary care physician,115,116 we acknowledge this limitation and opted to include inpatient RA diagnosis codes in our final preferred administrative algorithm, even though alone these codes had low sensitivity (22%) and offered little improvement over physician-billing algorithms. The addition of an inpatient RA code to “3 physician RA claims with ≥1 by a specialist in a 2 year period” may subsequently increase the sensitivity of our algorithm when it is applied to the entire population, since hospitalization data may be needed to pick up RA cases who either have no regular primary care physician or arthritis specialist, or who are followed by the approximately 5% of Ontario physicians who are salaried (and who do not necessarily contribute to billing data).117

To our knowledge, this study represents the most comprehensive effort to determine the accuracy of administrative data for detecting RA patients. As the selection of algorithms for future research will vary according to their application,85,1 we have reported results for multiple algorithms. However, our recommendation is for researchers to use the algorithm of “[1 hospitalization RA code] OR [3 physician RA diagnosis codes (claims) with ≥1 by a specialist in a 2 year period]” as a base algorithm to allow comparisons across future population-based studies. This algorithm has demonstrated a high degree of accuracy both in terms of minimizing the number of false positives (high PPV) and excluding true negatives (high specificity). Moreover, this algorithm has excellent sensitivity (>96%) at capturing contemporary RA patients under active rheumatology care. We also found that more complicated algorithms (e.g., those that incorporate various exclusion criteria) were not necessarily better than the simpler ones. Our findings will inform future population-based research and will serve to improve arthritis surveillance activities across Canada and abroad.


4.6 Tables and Figures


4.6.1 Figure 1: Flow diagram of selection of study participants


4.6.2 Table 1: Clinical Characteristics and Drug Exposures for RA patients

CHARACTERISTICS (aged 20 and older, n=69)

Age (years), mean (SD): 61.9 (13.7)
Female gender: 64%
Diagnosis provided by a specialist: 86%
Morning stiffness: 65%
Hand joint involvement (≥ 1 swollen wrist, MCP, or PIP joint): 67%
Symmetric arthritis (PIPs, MCPs, or MTPs): 65%
Rheumatoid nodules: 19%
Radiographic changes typical of RA: 22%
2-10 medium-large joint involvement (shoulders, elbows, hips, knees, ankles): 39%
1-3 small joint involvement (MCPs, PIPs, 2nd-5th MTPs, thumb IPs, wrists): 54%
4-10 small joint involvement (as above): 51%
>10 joint involvement (as above): 26%
RF or ACPA positive: 39%
Elevated ESR or CRP: 45%
NSAID/COXIB use*: 77%
Glucocorticosteroid use (oral, intra-articular or intramuscular)*: 61%
DMARD use*: 80%
Biologic use*: 17%

*Prescribed by any physician (specialist or primary care physician). Abbreviations: MCPs: metacarpophalangeal joints; PIPs: proximal interphalangeal joints; MTPs: metatarsophalangeal joints; IPs: interphalangeal joints; DMARD: disease-modifying anti-rheumatic drug; NSAID: non-steroidal anti-inflammatory drug; COXIB: COX-2 selective inhibitor; RF: rheumatoid factor; ACPA: anti-citrullinated protein antibody; ESR: erythrocyte sedimentation rate; CRP: C-reactive protein


4.6.3 Table 2: Test characteristics of multiple algorithms among patients aged ≥ 20y

Algorithms [pretest prevalence: 0.9%] | TP | TN | FN | FP | Post-test prev.# | Sensitivity [95% CI] | Specificity [95% CI] | PPV [95% CI] | NPV [95% CI]
1 H ever | 15 | 7429 | 54 | 2 | 0.2% | 22 (12-32) | 100 (100-100) | 88 (73-100) | 99 (99-100)
1 H ever OR 1 ER ever | 16 | 7427 | 53 | 4 | 0.3% | 23 (13-33) | 100 (100-100) | 80 (63-98) | 99 (99-100)
1 P ever | 62 | 7188 | 7 | 243 | 4.1% | 90 (83-97) | 97 (96-97) | 20 (16-25) | 100 (100-100)
1 P ever by a specialist | 56 | 7377 | 13 | 54 | 1.5% | 81 (72-90) | 99 (99-100) | 51 (42-60) | 100 (100-100)
2 P by any physician in 1 YR | 58 | 7363 | 11 | 68 | 1.7% | 84 (75-93) | 99 (99-99) | 46 (37-55) | 100 (100-100)
2 P by any physician in 2 YR | 58 | 7359 | 11 | 72 | 1.7% | 84 (75-93) | 99 (99-99) | 45 (36-53) | 100 (100-100)
2 P by any physician in 3 YR | 58 | 7352 | 11 | 79 | 1.8% | 84 (75-93) | 99 (99-99) | 42 (34-51) | 100 (100-100)
3 P by any physician in 1 YR | 55 | 7398 | 14 | 33 | 1.2% | 80 (70-89) | 100 (99-100) | 63 (52-73) | 100 (100-100)
3 P by any physician in 2 YR | 55 | 7395 | 14 | 54 | 1.2% | 80 (70-89) | 100 (99-100) | 60 (50-71) | 100 (100-100)
3 P by any physician in 3 YR | 55 | 7393 | 14 | 38 | 1.2% | 80 (70-89) | 100 (99-100) | 59 (49-69) | 100 (100-100)
2 P with ≥ 1 P by a specialist in 1 YR | 54 | 7404 | 15 | 27 | 1.1% | 78 (69-88) | 100 (100-100) | 67 (56-77) | 100 (100-100)
2 P with ≥ 1 P by a specialist in 2 YR | 54 | 7404 | 15 | 27 | 1.1% | 78 (69-88) | 100 (100-100) | 67 (56-77) | 100 (100-100)
2 P with ≥ 1 P by a specialist in 3 YR | 54 | 7401 | 15 | 30 | 1.1% | 78 (69-88) | 100 (100-100) | 64 (54-75) | 100 (100-100)
3 P with ≥ 1 P by a specialist in 1 YR | 53 | 7419 | 16 | 12 | 0.9% | 77 (67-87) | 100 (100-100) | 82 (72-91) | 100 (100-100)
3 P with ≥ 1 P by a specialist in 2 YR | 53 | 7418 | 16 | 13 | 0.9% | 77 (67-87) | 100 (100-100) | 80 (71-90) | 100 (100-100)
3 P with ≥ 1 P by a specialist in 3 YR | 53 | 7418 | 16 | 13 | 0.9% | 77 (67-87) | 100 (100-100) | 80 (71-90) | 100 (100-100)
(1 H ever) OR (2 P with ≥ 1 P by a specialist in 1 YR) | 55 | 7403 | 14 | 28 | 1.1% | 80 (70-89) | 100 (100-100) | 66 (56-76) | 100 (100-100)
(1 H ever) OR (2 P with ≥ 1 P by a specialist in 2 YR) | 55 | 7403 | 14 | 28 | 1.1% | 80 (70-89) | 100 (100-100) | 66 (56-76) | 100 (100-100)
(1 H ever) OR (2 P with ≥ 1 P by a specialist in 3 YR) | 55 | 7400 | 14 | 31 | 1.1% | 80 (70-89) | 100 (99-100) | 64 (54-74) | 100 (100-100)
(1 H ever) OR (3 P with ≥1 P by a specialist in 1 YR) | 54 | 7417 | 15 | 14 | 0.9% | 78 (69-88) | 100 (100-100) | 79 (70-89) | 100 (100-100)
(1 H ever) OR (3 P with ≥1 P by a specialist in 2 YR) | 54 | 7416 | 15 | 15 | 0.9% | 78 (69-88) | 100 (100-100) | 78 (69-88) | 100 (100-100)
(1 H ever) OR (3 P with ≥1 P by a specialist in 3 YR) | 54 | 7416 | 15 | 15 | 0.9% | 78 (69-88) | 100 (100-100) | 78 (69-88) | 100 (100-100)
(1 H ever) OR (2 P ≥ 8 weeks apart in 2 YR), no exclusions* | 57 | 7378 | 12 | 53 | 1.5% | 83 (74-92) | 99 (99-100) | 52 (43-61) | 100 (100-100)
(1 H ever) OR (2 P ≥ 8 weeks apart in 2 YR) excluding Case A or Case B | 41 | 7401 | 28 | 30 | 0.9% | 59 (48-71) | 100 (100-100) | 58 (46-69) | 100 (100-100)
(1 H ever) OR (2 P ≥ 8 weeks apart in 3 YR) excluding Case A or Case B | 41 | 7396 | 28 | 35 | 1.0% | 59 (48-71) | 100 (99-100) | 54 (43-65) | 100 (100-100)
(1 H ever) OR (2 P ≥ 8 weeks apart in 4 YR) excluding Case A or Case B | 43 | 7394 | 26 | 37 | 1.1% | 62 (51-74) | 100 (99-100) | 54 (43-65) | 100 (100-100)
(1 H ever) OR (2 P ≥ 8 weeks apart in 5 YR) excluding Case A or Case B | 44 | 7390 | 25 | 41 | 1.1% | 64 (52-75) | 99 (99-100) | 52 (41-62) | 100 (100-100)
(1 H ever) OR (2 P ≥ 8 weeks apart in 2 YR) excluding Case A | 41 | 7402 | 28 | 29 | 0.9% | 59 (48-71) | 100 (100-100) | 59 (47-70) | 100 (100-100)

All ages: N = 7500 patients (69 RA and 7431 non-RA). # = post-test prevalence; TP = true positive; TN = true negative; FN = false negative; FP = false positive. H: hospitalization diagnosis code; P: physician diagnosis code; Specialist: rheumatologist, internal medicine, orthopedic surgeon; Rx: corticosteroid, disease-modifying anti-rheumatic drug (DMARD) or biologic. *Exclusions: Case A: patients with ≥ 2 P with another rheumatology diagnosis subsequent to an RA diagnosis; Case B: patients for whom the diagnosis of RA was not confirmed by a specialist. Other rheumatology diagnoses include: osteoarthritis (715), gout (274, 712), polymyalgia rheumatica (725), other seronegative spondyloarthropathy (721), ankylosing spondylitis (720), psoriasis (696), synovitis/tenosynovitis/bursitis (727), connective tissue disorder (710), vasculitis (446), others (716; 718; 728; 729; 781)

4.6.4 Table 3: Test characteristics of multiple algorithms among patients aged ≥ 65y

Algorithms [pretest prevalence: 1.8%] | TP | TN | FN | FP | Post-test prev.# | Sensitivity [95% CI] | Specificity [95% CI] | PPV [95% CI] | NPV [95% CI]
1 P AND ≥1 Rx ever | 53 | 3248 | 10 | 115 | 4.9% | 84 (75-93) | 97 (96-97) | 32 (25-39) | 100 (100-100)
2 P AND ≥1 Rx ever | 51 | 3318 | 12 | 45 | 2.8% | 81 (71-91) | 99 (98-99) | 53 (43-63) | 100 (99-100)
2 P > 60 days apart AND ≥1 Rx ever | 49 | 3320 | 14 | 43 | 2.7% | 78 (68-88) | 99 (98-99) | 53 (43-64) | 100 (99-100)
(1 H ever) OR (2 P AND ≥1 Rx in 1 YR) | 52 | 3337 | 11 | 26 | 2.3% | 83 (73-92) | 99 (99-100) | 67 (56-77) | 100 (100-100)
(1 H ever) OR (2 P AND ≥1 Rx in 2 YR) | 52 | 3335 | 11 | 28 | 2.3% | 83 (73-92) | 99 (99-100) | 65 (55-76) | 100 (100-100)
(1 H ever) OR (2 P AND ≥1 Rx in 3 YR) | 52 | 3332 | 11 | 31 | 2.4% | 83 (73-92) | 99 (99-99) | 63 (52-73) | 100 (100-100)
(1 H ever) OR (2 P with ≥ 1 P by specialist AND ≥1 Rx in 1 YR) | 52 | 3347 | 11 | 16 | 2.0% | 83 (73-92) | 100 (99-100) | 77 (66-87) | 100 (100-100)
(1 H ever) OR (2 P with ≥ 1 P by specialist AND ≥1 Rx in 2 YR) | 52 | 3346 | 11 | 17 | 2.0% | 83 (73-92) | 100 (99-100) | 75 (65-86) | 100 (100-100)
(1 H ever) OR (2 P with ≥ 1 P by specialist AND ≥ 1 Rx in 3 YR) | 52 | 3345 | 11 | 18 | 2.0% | 83 (73-92) | 100 (99-100) | 74 (64-85) | 100 (100-100)
(1 H ever) OR (3 P AND ≥ 1 Rx in 1 YR) | 47 | 3346 | 16 | 17 | 1.9% | 75 (64-85) | 100 (99-100) | 73 (63-84) | 100 (99-100)
(1 H ever) OR (3 P AND ≥ 1 Rx in 2 YR) | 48 | 3342 | 15 | 21 | 2.0% | 76 (66-87) | 99 (99-100) | 70 (59-80) | 100 (99-100)
(1 H ever) OR (3 P AND ≥ 1 Rx in 3 YR) | 48 | 3342 | 15 | 21 | 2.0% | 76 (66-87) | 99 (99-100) | 70 (59-80) | 100 (99-100)
(1 H ever) OR (3 P with ≥ 1 P by specialist AND ≥ 1 Rx in 1 YR) | 47 | 3351 | 16 | 12 | 1.7% | 75 (64-85) | 100 (99-100) | 80 (69-90) | 100 (99-100)
(1 H ever) OR (3 P with ≥ 1 P by specialist AND ≥ 1 Rx in 2 YR) | 48 | 3349 | 15 | 14 | 1.8% | 76 (66-87) | 100 (99-100) | 77 (67-88) | 100 (99-100)
(1 H ever) OR (3 P with ≥ 1 P by specialist AND ≥ 1 Rx in 3 YR) | 48 | 3348 | 15 | 15 | 1.8% | 76 (66-87) | 100 (99-100) | 76 (66-87) | 100 (99-100)

Seniors 65+: N = 3426 patients (63 RA and 3363 non-RA). # = post-test prevalence; TP = true positive; TN = true negative; FN = false negative; FP = false positive. H: hospitalization code; P: physician diagnosis code; Specialist: rheumatologist, internal medicine, orthopedic surgeon; Rx: oral corticosteroid, disease-modifying anti-rheumatic drug (DMARD) or biologic. DMARDs included: azathioprine, chloroquine, cyclophosphamide, cyclosporine, gold, minocycline. Biologics included: etanercept, adalimumab, infliximab, certolizumab, golimumab, abatacept, anakinra, rituximab, tocilizumab.


4.7 Appendix

4.7.1 Impact of Levels of Evidence on Classifying RA by the Reference Standard

Recognizing that it may be challenging to classify RA patients according to the medical records of family physicians, patients were initially classified as RA based on the levels of evidence in the chart. The highest levels of evidence to support an RA diagnosis were a documented diagnosis by a specialist (rheumatologist, orthopedic surgeon, internal medicine) or a family physician-reported RA diagnosis with supporting evidence (diagnostic screening, joint involvement, treatment). Additionally, patients were flagged as possible RA patients if they had a brief mention of RA but lacked any supporting evidence, or if they had a query RA diagnosis (by either specialists or family physicians); these patients were used in a sensitivity analysis. Overall, 72% of potential RA patients were classified according to Levels I and II (described below).

I: Diagnosis documented by a specialist (rheumatologist, orthopedic surgeon, internal medicine/hospitalist);

II: Diagnosis documented by the family physician (no specialist notes in the chart) AND there was evidence to support the diagnosis;

III: Diagnosis documented by any physician but lacking any supporting evidence in the chart;

IV: Query RA diagnosis (by specialists and family physicians)

4.7.2 Impact of Levels of Evidence on algorithm accuracy

Separate analyses were performed after patients were re-classified according to the levels of evidence in their medical record. There is a trend toward decreasing sensitivity with decreasing levels of evidence in classifying RA patients; i.e., the percentage of true positives the algorithms detect among all positive RA cases decreases as the reference standard definition of RA becomes more sensitive.

Example: 3 P in 3 YR, at least 2 P by any specialist

Level of Evidence | Sensitivity (95% CI) | Specificity (95% CI) | PPV (95% CI) | NPV (95% CI)
I+II | 75 (65-86) | 100 (100-100) | 81 (72-91) | 100 (100-100)
I+II+III | 75 (65-85) | 100 (100-100) | 81 (72-91) | 100 (100-100)
I+II+III+IV | 62 (52-71) | 100 (100-100) | 84 (76-93) | 99 (99-100)

Example: 1 P

Level of Evidence | Sensitivity (95% CI) | Specificity (95% CI) | PPV (95% CI) | NPV (95% CI)
I+II | 90 (83-97) | 97 (96-97) | 20 (16-25) | 100 (100-100)
I+II+III | 85 (77-93) | 97 (96-97) | 21 (16-26) | 100 (100-100)
I+II+III+IV | 75 (66-84) | 97 (96-97) | 23 (19-28) | 100 (100-100)


Chapter 5 Epidemiology of rheumatoid arthritis (RA) in a universal public health care system: Results from the Ontario RA administrative Database (ORAD)

Jessica Widdifield1,2, BSc, PhD(c); J. Michael Paterson1,2,3, MSc; Sasha Bernatsky4, MD, FRCPC, PhD; Karen Tu1,2, MD, MSc; J. Carter Thorne,5 MD, FRCPC, Vandana Ahluwalia6 MD, FRCPC, Noah Ivers1,2, MD, PhD(c), Debra Butt1, MD, MSc; R. Liisa Jaakkimainen1,2 MD, MSc, and Claire Bombardier1, MD, FRCPC.

Affiliations: 1. University of Toronto, Toronto, ON; 2. Institute for Clinical Evaluative Sciences, Toronto, ON; 3. McMaster University, Hamilton, ON; 4. McGill University, Montreal, QC; 5. Southlake Regional Health Centre, Newmarket, ON; 6. William Osler Health Center, Brampton, ON;

Acknowledgements: Financial support provided by the Canadian Institutes of Health Research (CIHR operating grant 119348). This study was also supported by the Institute for Clinical Evaluative Sciences (ICES), a non-profit research corporation funded by the Ontario Ministry of Health and Long- Term Care (MOHLTC). The opinions, results and conclusions are those of the authors and are independent from the funding sources. No endorsement by ICES or the Ontario MOHLTC is intended or should be inferred. The authors also wish to thank Simon Hollands and Peter Gozdyra of ICES for their contribution. Dr. Bernatsky holds a career award from the Fonds de la recherche en santé du Québec (FRSQ). Dr. Ivers holds a CIHR Fellowship Award in Clinical Research and a Fellowship Award from the Department of Family and Community Medicine, University of Toronto. Dr. Tu holds a CIHR Fellowship Award in Primary Care Research. Dr. Bombardier holds a Canada Research Chair in Knowledge Transfer for Musculoskeletal Care (2002-2016) and a Pfizer Research Chair in Rheumatology.

Manuscript word count: 2258 (excluding abstract) Abstract word count: 247


5.1 Abstract

Objectives:

To describe the incidence and prevalence of rheumatoid arthritis (RA) in Ontario, Canada over the past 15 years.

Methods:

We used the Ontario Rheumatoid Arthritis administrative Database (ORAD), a validated population-based research cohort of all Ontarians with RA. ORAD records were linked with census data to calculate crude and age/sex-standardized incidence and prevalence rates among men and women, and by age group, from 1996 to 2010. Age/sex-standardized rates by area of patient residence were also determined.

Results:

As of 2010, there were 97,499 Ontarians with RA, corresponding to a cumulative prevalence of 0.9% (females 1.3%, males 0.5%). Age/sex-standardized RA prevalence increased steadily over time, from 473 (95% CI 469-478) per 100,000 population (0.5%) in 1996 to 784 (95% CI 779-789) per 100,000 (0.9%) in 2010. Prevalence increased with age and was higher in northern rural communities [e.g., 1038 (95% CI 1011-1066) per 100,000] than in southern urban areas [e.g., Toronto: 726 (95% CI 710-743) per 100,000]. Age/sex-standardized incidence decreased gradually over time, from 62 (95% CI 60-63) per 100,000 in 1996 to 54 (95% CI 52-55) per 100,000 in 2010. Despite increasing prevalence among seniors (>65y), incidence was relatively stable among adults <65y after adjustment for sex.

Conclusion:

Our work suggests that RA prevalence has been increasing over time, while incidence appears to be declining. Regional prevalence rates illustrate the high burden of RA, particularly in northern Ontario. Our findings highlight the need for regional approaches to the planning and delivery of RA care.



5.2 Introduction

Rheumatoid arthritis (RA) affects 0.5% to 1% of the general population. However, both incidence and prevalence may vary substantially according to various factors, including geographic area, patient sex and age, and calendar time.118 Some reports have indicated that RA prevalence may be decreasing,119,120,121,122,123,124 particularly among women.125,126 On the other hand, the prevalence has also been reported to be increasing (and projected to continue to increase) in Europe and North America as a function of an underlying ageing population.3 Conflicting data also exist with regard to secular trends in incidence, with some studies reporting a rising incidence,127 and others a declining one.128,126,129,124 Some authors have reported a shift toward a more elderly age at onset,130,131 but such shifts have not been observed by all.132 Further, given the relatively low prevalence of RA (0.5-1%), study populations must be sufficiently large to obtain robust estimates, particularly when stratifying by age and sex.

Valid population-based assessments of the epidemiological trends in RA are critical in helping health- care providers and decision makers to anticipate the burden of RA and to optimize clinical and public health strategies for disease management. To date, Canadian RA surveillance activities have relied upon health administrative data algorithms of unknown validity.102,93

Our primary aim was to estimate the incidence and prevalence of RA in a universal public health care system over the past 15 years by using a validated population-based research cohort that includes all individuals with RA living in Ontario, which is Canada’s most populous province. We also aimed to quantify changes in RA incidence and prevalence by age and sex, and to map the regional distribution of RA.


5.3 Subjects and Methods

Setting and Design. Canada’s publicly funded healthcare system is universal and comprehensive for both hospital care and physicians’ services. We performed a population-based cohort study in the Canadian province of Ontario, which has a large, diverse, multicultural population that constitutes more than one-third of Canada’s population.

Subjects and Data Sources. We used the Ontario Rheumatoid Arthritis administrative Database (ORAD), a validated population-based research database that aims to include all individuals with RA living in Ontario. Patients are included in ORAD if they are admitted to hospital with an RA diagnosis or have at least three Ontario Health Insurance Plan (OHIP)28 physician service claims (over two years) with RA as the recorded diagnosis, with at least one of these claims originating from a musculoskeletal specialist (rheumatologist, orthopedic surgeon or internist). ORAD has high sensitivity (80%), specificity (100%) and positive predictive value (78%) for identifying RA patients according to reviews of both primary care133 and rheumatology medical records.134 Records for individuals in ORAD are also linked to the Ontario Registered Persons Database to obtain information regarding patients’ age, sex, place of residence, and vital status. Censal and inter-censal population estimates were obtained from Statistics Canada for 1991 through 2011.135 These datasets are linked in an anonymous fashion using encrypted health insurance numbers and have very little missing information.32

Analysis. We used ORAD to calculate crude and age/sex-standardized incidence and prevalence rates (with corresponding 95% confidence intervals, CIs)136 among men and women aged 15 years or older over the period 1991-2011 (the years of data available in ORAD). Disease onset is defined as the first qualifying health services contact for which a diagnosis of RA is provided.134 Only individuals with no such previous contacts for RA were counted as incident cases for the relevant year, and the incident population was calculated as the census population minus prevalent cases from the previous year. Prevalent cases were carried forward for each year, and patients who died or moved out of the province were excluded from the numerator and denominator. Individuals who were less than 15 years of age were also excluded from both the numerator and the denominator.
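The incident/prevalent bookkeeping described above can be sketched as follows. This is an illustration only, with hypothetical field names ('onset_year' for the first qualifying RA contact, 'exit_year' for death or out-migration), not the actual ORAD data model.

```python
def annual_counts(cases, years):
    """Count incident and prevalent cases per year.
    cases: list of dicts with 'onset_year' and optional 'exit_year'.
    Incident cases are those whose first qualifying contact falls in the
    given year; prevalent cases are carried forward until death/exit."""
    incident, prevalent = {}, {}
    for y in years:
        active = [c for c in cases
                  if c["onset_year"] <= y
                  and (c.get("exit_year") is None or c["exit_year"] > y)]
        incident[y] = sum(1 for c in active if c["onset_year"] == y)
        prevalent[y] = len(active)   # incident cases carried forward
    return incident, prevalent

# Toy example: three cases, one of whom leaves the cohort in 1999
cases = [{"onset_year": 1996},
         {"onset_year": 1997, "exit_year": 1999},
         {"onset_year": 1998}]
inc, prev = annual_counts(cases, range(1996, 2001))
print(prev[1998], prev[2000])  # 3 2
```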

Annual age-specific RA rates were computed for ten-year age bins and expressed per 100,000 population. Age-standardized rates reflect the number of RA patients who would have been diagnosed if the age-specific rates observed in the given population had occurred in a standard population. The 1991 Ontario population was used as the standard population for direct age/sex standardization. The use of this standard population allows for rate trends and comparisons across jurisdictions by adjusting for variations in population age distributions over time and across geographic areas. Age/sex-standardized rates by area of patient residence were computed and mapped to illustrate the burden of RA among Ontario health services planning areas.
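Direct standardization weights each age-specific rate by the standard population's share of that age group. The sketch below illustrates the calculation with made-up age groups, rates, and population counts; it is not the actual 1991 Ontario standard.

```python
def direct_standardized_rate(age_rates, standard_pop):
    """Direct standardization: weighted average of age-specific rates,
    with weights equal to each group's share of the standard population.
    age_rates: rate per 100,000 by age group; standard_pop: counts by group."""
    total = sum(standard_pop.values())
    return sum(age_rates[g] * standard_pop[g] / total for g in age_rates)

# Hypothetical inputs for demonstration only
age_rates = {"15-64": 400.0, "65+": 2100.0}            # rates per 100,000
standard_pop = {"15-64": 8_000_000, "65+": 2_000_000}  # standard counts

print(direct_standardized_rate(age_rates, standard_pop))  # 740.0
```

Because the weights are fixed by the standard population, rates standardized this way are comparable across years and regions even when the underlying age distributions differ.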

For our primary analysis, we used a 5-year ‘run-in’ (wash out) period (as administrative data are only available from 1991 onward) to distinguish between incident and prevalent cases, allowing rates to be reported from 1996 onwards. Results are reported up until 2010 to allow for a 2-year ‘look forward’ period to meet the terms of the case definition. Sensitivity analyses were performed to assess the impact upon incidence and prevalence of varying the length of the run-in/case ascertainment period (years of data).


All analyses were performed at the Institute for Clinical Evaluative Sciences (ICES) using SAS version 9.2 (SAS Institute, Cary, North Carolina). The study was approved by the Sunnybrook Health Sciences Centre Research Ethics Board in Toronto, Canada.

5.4 Results

As of 2010, there were 97,499 Ontarians with RA (72% were female and 44% were aged 65 years and older). During the study period (1996-2010), the number of individuals aged 15 years and older (the population denominator) increased from 8,720,499 in 1996 to 10,851,140 in 2010; however, the number of patients with RA more than doubled, from 42,734 to 97,499, during the same time frame (table 1). The crude prevalence rate doubled during the study period, from 490 to 899 per 100,000 population. The age/sex-standardized RA prevalence per 100,000 population also increased steadily over time, from 473 (95% CI 469-478) in 1996 to 784 (95% CI 779-789) in 2010 (0.5% to 0.9%) (table 1, figure 1).

The age-standardized prevalence of RA overall and by sex from 1996 to 2010 is illustrated in figure 2. The prevalence increased amongst both sexes over time [females: 637 (95% CI 630-644) in 1996 to 1062 (95% CI 1054-1070) in 2010; males: 291 (95% CI 286-298) in 1996 to 472 (95% CI 466-478) in 2010]. The sex-standardized prevalence of RA by age over time is illustrated in figure 3. The prevalence also increased with age: 15-24y (0.1%), 25-34y (0.2%), 35-44y (0.5%), 45-54y (0.9%), 55-64y (1.5%), 65-74y (2.1%), 75-84y (2.6%) to ≥85y (2.7%) as of 2010 (figure 4); and females had a higher prevalence than males in all age groups.


The crude number of cases identified each year varied from 5523 patients in 1996 to 6395 patients in 2010 (table 1). The crude incidence per 100,000 population was relatively stable, from 64 in 1996 to 59 in 2010 (0.06% per year overall), as shown in Table 2. Regarding age/sex-standardized incidence per 100,000, there was a slight downward trend over time, from 62 (95% CI 60-63) in 1996 to 54 (95% CI 52-55) in 2010, although incidence appeared to stabilize from 2000 onwards (figure 5). After adjusting for age, figure 6 illustrates a declining incidence per 100,000 population amongst both sexes over time [females: 81 (95% CI 78-83) in 1996 to 72 (95% CI 70-75) in 2010; males: 41 (95% CI 39-43) in 1996 to 34 (95% CI 32-35) in 2010].

Figure 7 gives the sex-standardized incidence of RA by age over time. After adjusting for sex, incidence was stable among adults <65y but decreased among seniors over time.

As of 2010, incidence increased with age: 15-24y (0.01%), 25-34y (0.02%), 35-44y (0.04%), 45-54y (0.07%), 55-64y (0.10%), 65-74y (0.12%) and 75-84y (0.13%); women had a higher incidence than men in all age groups (figure 8).

Geographic variation was evident in age/sex-standardized rates by area of patient residence (figure 9). Prevalence was higher in northern rural communities [e.g., 1038 (95% CI 1011-1066) per 100,000] than in southern urban areas [e.g., Toronto: 726 (95% CI 710-743) per 100,000] (table 2).


Sensitivity analyses showed that using 5, 10, and 15 years of data as a “run-in period”, the incidence in 2010 was 58, 55, and 54 per 100,000, respectively. Increasing the case ascertainment period over 5, 10, and 15 years of data varied the prevalence estimates in 2010 from 555 to 664 to 748 per 100,000, respectively.

5.5 Discussion

In the universal public health insurance system of Ontario, we studied trends in the incidence and prevalence of RA – overall and by sex and age group – from 1996 to 2010. To our knowledge, ORAD represents the largest validated population-based cohort of patients with RA.

As of 2010 the prevalence of RA was 0.9%, with a steady increase over time, consistent with other reports.127 Our results are similar to RA prevalence estimates reported in British Columbia, Canada.99 However, recent data from Swedish RA population registers estimated a prevalence of 0.77%.137 The slightly different results may reflect different methodologies to ascertain RA, and the Swedish authors acknowledge their estimate may underestimate the true prevalence by not including RA patients who are not under the care of rheumatologists.

Overall, we found that there are currently approximately 50 new cases of RA per 100,000 at risk each year. While this is mostly in keeping with population-based studies from other jurisdictions, which on average report that 25-50 new cases of RA develop per 100,000,138 our data indicate that Ontario is at the upper extreme.

Recent data from the United States report an overall age- and sex-adjusted annual RA incidence of 40/100,000 population, with incidence increasing moderately in women but not in men.127 One might expect their results to capture slightly fewer individuals with RA than our report, as they required individuals to meet the 1987 RA classification criteria,5 which are not routinely employed for diagnostic purposes. Despite having more relaxed criteria than other studies, we also found that incidence may be on the decline among both men and women over the past 15 years, even though one might expect incidence to increase over time with better diagnostics and recognition of RA. While others have also observed a declining incidence over time,129,124 several studies have reported a shift toward a more elderly age at onset.130,131 Such a shift was found neither in our study nor in that by Doran et al in Rochester, Minnesota.132 Given that RA incidence increases with age and that the average age of the background population has itself increased, the declining secular trend among seniors may be partly due to prevalent cases being misclassified as incident cases during the early years of the study.

Our regional prevalence rates illustrate the high burden of RA in all locales in Ontario. The highest prevalence estimates were observed in northern, rural communities, which have less dense populations and few practicing musculoskeletal specialists.139,140 Ontario is a large province covering 1,076,395 square kilometers; its northern, rural communities constitute nearly 90% of its area but, with a population of roughly 800,000, only 6% of its total population.141

Regional differences may reflect underlying patient demographics (ethnicity, socioeconomic status) and environmental exposures associated with RA risk (such as cigarette smoking, other air pollutants, occupational exposures, and latitude)142,143,144 that may be driving rates of RA in specific locales. For example, Toronto comprises a healthy immigrant population,145,146 whereas northern communities have poor access to preventative health care services,147 many aboriginal communities,148,149 higher rates of chronic disease (e.g., diabetes and cardiovascular disease),150 and lower vitamin D levels (due to the higher latitude),151,144 all of which are associated with an elevated risk of RA development.152,153

Finally, regional variation has implications for planning health care provision for RA: over the past 15 years the number of RA patients has more than doubled (from 42,734 in 1996 to 97,499 in 2010), yet the number of practicing rheumatologists has remained relatively stable.140 The overall provincial per capita provision in 2006 was 1.20 rheumatologists per 100,000 population (1 full-time equivalent per 100,000).140 In addition, a map of the distribution of rheumatologists (Appendix 5) illustrates that there are few rheumatologists in areas of the province with a particularly high burden of RA (such as northern communities). Thus, health human resources shortages and geographic variation in the supply of rheumatology services relative to RA burden mean that access to quality care for RA continues to be a challenge in Ontario.

Our study has both strengths and potential limitations. We applied rigorous approaches to the validation and creation of ORAD,133,134 and this is the largest epidemiological study of RA incidence and prevalence performed to date. We also used methods consistent with those of other studies examining incidence and prevalence trends of chronic diseases using validated population-based databases in Ontario.154,155,156 Supporting our findings, we found the same prevalence of RA when we previously performed a chart audit amongst a random sample of 7,500 patients seen in primary care.133 We attempted to accurately define incident cases; however, some of our incident cases may in fact have been prevalent cases (especially during the early years of the study). To mitigate this, we ensured a five-year run-in period to prevent possible misclassification of incident and prevalent patients and tested variations of the run-in period. Furthermore, as part of our previous validation exercises, we found that disease onset is fairly well defined in health services data.134 Our main limitation is that health services data can only assess RA patients who sought and had access to health care providers. We were therefore unable to assess the population that rarely accesses care, and it is important to view our data as physician-identified prevalence, particularly specialist-identified RA.

Additionally, when using health services data, fee-for-service remuneration can drive coding practices when incentives exist for recording specific diagnostic codes. While there have been modifications to billing policies affecting rheumatology-specific billing practices in Ontario over the past decade (e.g., in 2005), we did not observe significant increases in prevalence around these time points. Furthermore, ORAD currently does not contain information on important risk factors for RA, such as ethnicity or education, so we were unable to explore the effects of these factors on disease burden.

In conclusion, RA prevalence increases with age and is highest among females and in northern rural communities. RA prevalence has also increased significantly over time. Factors contributing to this apparent increase may include the increasing time available to ascertain cases (which may be latent in the population during earlier years of study), increasing survival, and/or the aging of the background population. Incidence appears to be slowly declining over time (or stabilizing over the past decade, with a shift toward fewer patients with elderly age at onset); however, increasing the observation period used to distinguish between incident and prevalent cases decreases estimated incidence, as fewer prevalent cases are misclassified as incident over time. Regional prevalence rates illustrate the high burden of RA in all locales, especially in northern communities, highlighting the importance of regional differences when planning health care provision for RA.


5.6 Tables and Figures


5.6.1 Table 1: Crude and Age/Sex-standardized prevalence and incidence of RA by year

| Year | Prevalent count | Population | Crude rate* | % | Standardized rate* (95% CI) | Incident count | Population | Crude rate* | % | Standardized rate* (95% CI) |
|---|---|---|---|---|---|---|---|---|---|---|
| 1996 | 42,734 | 8,720,499 | 490 | 0.49 | 473 (469-478) | 5,523 | 8,682,077 | 64 | 0.06% | 62 (60-63) |
| 1997 | 46,961 | 8,828,425 | 532 | 0.53 | 509 (505-514) | 5,546 | 8,785,669 | 63 | 0.06% | 61 (59-62) |
| 1998 | 51,248 | 8,959,209 | 572 | 0.57 | 544 (540-549) | 5,731 | 8,912,234 | 64 | 0.06% | 62 (60-63) |
| 1999 | 55,398 | 9,085,331 | 610 | 0.61 | 576 (571-581) | 5,662 | 9,034,053 | 63 | 0.06% | 60 (58-61) |
| 2000 | 59,129 | 9,220,621 | 641 | 0.64 | 602 (597-607) | 5,429 | 9,165,200 | 59 | 0.06% | 56 (55-58) |
| 2001 | 62,795 | 9,390,567 | 669 | 0.67 | 625 (620-630) | 5,495 | 9,331,423 | 59 | 0.06% | 56 (54-57) |
| 2002 | 66,537 | 9,588,554 | 694 | 0.69 | 646 (641-651) | 5,614 | 9,525,743 | 59 | 0.06% | 56 (54-57) |
| 2003 | 69,997 | 9,779,736 | 716 | 0.72 | 662 (657-667) | 5,414 | 9,713,187 | 56 | 0.06% | 52 (51-54) |
| 2004 | 73,575 | 9,939,997 | 740 | 0.74 | 679 (674-684) | 5,641 | 9,869,989 | 57 | 0.06% | 53 (52-55) |
| 2005 | 77,330 | 10,100,741 | 766 | 0.77 | 697 (692-702) | 5,815 | 10,027,150 | 58 | 0.06% | 54 (52-44) |
| 2006 | 81,614 | 10,257,323 | 796 | 0.80 | 719 (714-724) | 6,431 | 10,179,982 | 63 | 0.06% | 58 (57-60) |
| 2007 | 85,706 | 10,410,695 | 823 | 0.82 | 738 (733-743) | 6,220 | 10,329,070 | 60 | 0.06% | 55 (54-57) |
| 2008 | 89,420 | 10,556,974 | 847 | 0.85 | 753 (747-758) | 6,046 | 10,471,253 | 58 | 0.06% | 53 (52-54) |
| 2009 | 93,558 | 10,708,605 | 874 | 0.87 | 769 (764-775) | 6,490 | 10,619,171 | 61 | 0.06% | 56 (54-57) |
| 2010 | 97,499 | 10,851,140 | 899 | 0.90 | 784 (779-789) | 6,395 | 10,757,575 | 59 | 0.06% | 54 (52-55) |

Crude and standardized rates are per 100,000 population.
*Standardized by age and sex based on the 1991 census population.


5.6.2 Figure 1: Age and Sex-Standardized prevalence of RA from 1996 to 2010


5.6.3 Figure 2: Age Standardized prevalence of RA Overall and by Sex from 1996 to 2010


5.6.4 Figure 3: Sex-standardized Prevalence of RA by Age from 1996 to 2010


5.6.5 Figure 4: Prevalence of RA by Sex and Age in 2010


5.6.6 Figure 5: Age and sex-standardized incidence of RA over 1996-2010


5.6.7 Figure 6: Age-standardized incidence of RA by Sex from 1996 to 2010


5.6.8 Figure 7: Sex-standardized incidence of RA by Age from 1996-2010


5.6.9 Figure 8: Incidence of RA by Sex and Age as of 2010


5.6.10 Figure 9: Map of age/sex-standardized rates per 100,000 by area of patient residence in 2010


5.6.11 Table 2: Crude and age/sex-standardized rates by area of patient residence in 2010

Crude and age/sex-standardized* rates per 100,000 by area of patient residence according to Local Health Integration Network (LHIN) in 2010.

| LHIN No. | LHIN Description | Count | Population | Crude rate | Standardized rate* (95% CI) |
|---|---|---|---|---|---|
| 1 | Erie St Clair | 5,063 | 533,985 | 948 | 789 (767-812) |
| 2 | South West | 7,088 | 789,224 | 898 | 752 (734-770) |
| 3 | Waterloo Wellington | 4,654 | 605,992 | 768 | 710 (689-731) |
| 4 | Hamilton Niagara | 11,927 | 1,165,405 | 1023 | 842 (826-858) |
| 5 | Central West | 5,474 | 649,231 | 843 | 846 (823-869) |
| 6 | Mississauga Halton | 7,906 | 922,141 | 857 | 821 (803-840) |
| 7 | Toronto Central | 7,935 | 967,942 | 820 | 726 (710-743) |
| 8 | Central | 10,611 | 1,419,577 | 748 | 688 (674-701) |
| 9 | Central East | 11,528 | 1,299,053 | 887 | 770 (755-784) |
| 10 | South East | 4,074 | 414,221 | 984 | 783 (757-809) |
| 11 | Champlain | 9,497 | 1,027,281 | 925 | 805 (789-822) |
| 12 | North Simcoe Muskoka | 3,040 | 376,987 | 806 | 659 (635-684) |
| 13 | North East | 6,256 | 482,452 | 1297 | 1038 (1011-1066) |
| 14 | North West | 2,314 | 197,649 | 1171 | 1015 (972-1060) |
| Overall | All Ontario | 97,367** | 10,851,140 | 897 | 786 (778-788) |

*Standardized by age and sex based on the 1991 census population
**Note: 132 patients who were prevalent for RA in 2010 had no LHIN


5.7 Appendix

5.7.1 A map of the distribution of Rheumatologists by LHIN in 2010.


Chapter 6 Discussion

6.1 Chapter overview

The purposes of this chapter are to summarize and discuss the implications of our results.

After summarizing the research conducted in this thesis, we discuss best practices in conducting administrative data validation studies, including concepts of measurement validity and disease ascertainment related to administrative data. A detailed comparison of the results of the two validation studies is provided, along with a discussion of the impact of disease prevalence, spectrum of disease, and type of comparator group on measures of diagnostic accuracy. We also discuss the importance of our RA prevalence estimates for health practitioners and policy-makers, and the strengths and limitations of ORAD. Finally, we conclude with a discussion of future research directions.


6.2 Summary of Research

The preceding studies describe the development and validation of an algorithm to identify Ontario residents with RA, and the application of this algorithm to Ontario's health administrative data to develop the Ontario RA administrative Database (ORAD) in order to describe epidemiological trends in RA over the past decade.

As part of the development phase, a systematic literature review was conducted to evaluate the quality of the methods and reporting of published studies that validate administrative database algorithms for rheumatic diseases. This review highlighted important gaps in best practices with respect to both reporting and methodology in administrative data validation studies. We identified strengths and weaknesses in the published literature that allowed us to develop a framework to guide the validation study phase of this thesis.

As part of this validation phase, two independent validation studies were performed. We first examined the accuracy of administrative data algorithms for identifying RA patients among rheumatology clinic patients. Amongst our rheumatology sample, algorithms using one or more physician RA diagnosis codes were found to be highly sensitive in identifying RA patients. Adding prescription drug claims did not significantly improve the accuracy of the algorithms. We further evaluated the ability of administrative data to detect incident RA cases by comparing the timing of the first administrative data RA diagnosis code to the rheumatologist-reported diagnosis date. We determined that RA diagnosis date (and hence disease duration) could be fairly well estimated from administrative data in jurisdictions with universal health care (such as the Canadian health care system). To select the optimal case definition for RA, we aimed to identify one that obtained the highest possible PPV (>75%) while maximizing sensitivity (>95%) and specificity (>85%) over the shortest possible duration (in years). Based upon our a priori definition, the optimal algorithm to identify RA patients was: “(1 hospitalization RA code ever) OR (≥3 physician RA claims with ≥1 RA code by a specialist in a 2-year period)”, which had a sensitivity of 97%, specificity of 85%, PPV of 76% and NPV of 98%. This work supports the selection of an optimal algorithm for identifying RA patients in administrative data to establish a database of all Ontarians with RA.
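For illustration, the selected case definition could be applied to claims data along the following lines. This is a sketch only: the function name and record layout are hypothetical, it assumes the 2-year window applies to the three physician claims collectively, and it is not the actual ICES implementation.

```python
from datetime import date, timedelta

def meets_ra_definition(hospitalization_dates, physician_claims):
    """(1 hospitalization RA code ever) OR (>=3 physician RA claims,
    >=1 by a specialist, within a 2-year period).

    hospitalization_dates: dates of hospitalizations carrying an RA code
    physician_claims: (date, is_specialist) tuples for RA billing claims
    """
    if hospitalization_dates:              # a single inpatient RA code suffices
        return True
    claims = sorted(physician_claims)      # order claims chronologically
    window = timedelta(days=730)           # assumed 2-year window
    for i in range(len(claims)):
        start = claims[i][0]
        in_window = [c for c in claims[i:] if c[0] - start <= window]
        if len(in_window) >= 3 and any(is_spec for _, is_spec in in_window):
            return True
    return False

# Three claims within two years, one by a specialist -> meets the definition
print(meets_ra_definition([], [(date(2005, 1, 10), False),
                               (date(2005, 8, 2), True),
                               (date(2006, 3, 15), False)]))  # True
```

Scanning a sliding window anchored at each claim ensures that any qualifying 2-year cluster of claims is detected, wherever it falls in the patient's history.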


The next part of the validation phase repeated the validation of the administrative data algorithms amongst a random sample of patients seen in a primary care population, in which the study prevalence closely approximates RA prevalence in the general population. Amongst our primary care sample, algorithms using one or more physician RA diagnosis codes were highly specific in identifying RA patients. As the overall purpose of this thesis was to establish a population-based cohort of Ontarians with physician-diagnosed RA, our a priori criteria for selecting the optimal definition in this sample mirrored those of the first validation study: to select an algorithm with the highest possible PPV with maximum sensitivity and specificity, over the shortest possible observation period for identifying diagnosis codes, to achieve accurate case ascertainment of RA. Based on data from both validation studies, the preferred algorithm was [1 hospitalization RA diagnosis code ever] OR [≥3 physician RA diagnosis codes (billing claims) with ≥1 by a specialist in a 2-year period], with the first RA diagnosis code defining disease onset. Amongst our primary care sample, this algorithm had a sensitivity of 78%, specificity of 100%, PPV of 78% and NPV of 100%. Thus, this algorithm was demonstrated both to minimize false positives (high PPV) and to correctly classify true negatives (high specificity). Moreover, this algorithm had excellent sensitivity (>96%) in capturing RA patients under active rheumatology care, defined as having at least one rheumatology visit between 2009 and 2011 and at least two visits over the period 2000-2011 (however, this latter criterion served more to ensure sufficient clinical documentation in the medical chart than to ensure active rheumatology care).

Finally, ORAD records were linked with census data to describe trends in the incidence and prevalence of RA in Ontario over the past 15 years. After applying our preferred algorithm to physician and hospitalization data over the period 1991-2011, we identified 97,499 Ontarians with RA as of 2010, corresponding to a cumulative prevalence of 0.9% (females 1.3%, males 0.5%). Secular trends over time were assessed, illustrating a rising prevalence of RA. Regional prevalence rates illustrate the high burden of RA, particularly in northern Ontario. Our findings highlight the need for regional approaches to the planning and delivery of RA care.

Overall, the findings of this thesis will inform future population-based research and serve to improve arthritis surveillance activities across Canada and abroad.


6.3 Best Practices in conducting administrative data validation studies for rheumatic conditions

This section synthesizes the results of our systematic review (Chapter 2) as they pertain to the meaning of measurement validity; highlights our best practices for conducting administrative data validation studies as a means to mitigate bias (i.e., reduce threats to validity); and discusses the implications of poorly designed health administrative data validation studies. In addition, we draw on the methods and results reported in Chapters 3 and 4 (the validation studies) to illustrate how disease prevalence, spectrum of disease, and type of comparator group influence estimates of diagnostic accuracy.

6.3.1 Bridging the concepts of measurement validity and disease ascertainment with administrative data

As the goal of using administrative data algorithms for epidemiologic and health services research is to correctly classify individuals with a condition, measurement validity is critical to reliable, valid and accurate disease ascertainment. Since diagnoses in administrative data are not usually clinically validated when they are entered, those who use these data sources for research or surveillance must recognize and acknowledge that not all recorded diagnosis codes are accurate or reflect a true disease state.

Minimizing measurement error [systematic error (bias) and random error] is a central goal in research. With respect to disease ascertainment from administrative data, systematic error manifests as misclassification bias. This type of error exists when we consistently err in the same direction. This would happen, for example, if one were to estimate disease prevalence using the administrative data algorithm of ‘one inpatient diagnosis code’, which would always miss patients who were never hospitalized (although it would reliably identify patients hospitalized for their RA during the study period). Based on our medical record reviews of rheumatology (Chapter 3) and primary care practices (Chapter 4), we found the sensitivity of the algorithm ‘one inpatient diagnosis code’ to be 22%. Random error, on the other hand, occurs unpredictably, sometimes leading to overestimation and sometimes to underestimation. For example, by chance, the physicians sampled for our validation


studies (Chapters 3 and 4) may differ from other physicians in the population. A principal assumption in epidemiology is that we can draw inferences about an entire population from the evaluation of a sample of that population. However, a problem with drawing such inferences is that the play of chance may affect the results through random variation from sample to sample.157 Thus, random error may have resulted in either an underestimation or overestimation of the true accuracy of the administrative data algorithms tested amongst our two samples. To lessen the role that chance may play in influencing the results, however, one must take care in planning for an adequate number of physician clusters for optimal power, given the resources at hand (refer to Chapter 3 Appendix).

Reliability and validity are considered the foundations of measurement. By reducing measurement error, we are more likely to have reliable and valid results. Although the terminology is not homogeneously defined in the literature, reliability and validity are distinct forms of measurement for assessing the quality of health administrative data.

Reliability refers to the reproducibility of a measurement in terms of the consistency of results. Validity, on the other hand, refers to the degree to which the measurement correctly represents a true value. Accuracy is a related term, which can be considered ‘an approximate measure of validity’. According to this line of thought, strictly speaking, validity reflects the truth as measured by a perfect gold standard, whereas accuracy is an approximation of the truth as measured by an imperfect reference standard. Recognizing that administrative data contain physician-reported diagnoses, the assessment of administrative algorithms for disease ascertainment reflects diagnostic accuracy and not necessarily diagnostic validity (since we did not specifically assess the true validity of the physician’s clinical diagnosis itself). In the remaining discussion of ‘validity’, however, we will assume that even though our validation studies produced estimates of ‘accuracy’, these estimates are good approximations of ‘validity’.


Consortiums of researchers have delineated three forms of validity, listed below.158,159 These concepts are applicable to administrative data validation studies.

(a) Content validity reflects the qualitative assessment or sensibility of an algorithm derived from administrative data, that is, the extent to which a measure includes all relevant items. For example, developing and testing administrative data algorithms that incorporate only inpatient diagnosis codes would have poor content validity, as RA is more frequently treated in outpatient settings.

(b) Criterion validity describes the agreement of administrative data algorithms with the reference standard, such as diagnoses recorded in a medical record. In general, the process of validating administrative data algorithms for disease case ascertainment refers to estimating their criterion validity: sensitivity, specificity, and predictive values are estimated in order to establish the validity of the administrative data algorithm against the reference standard.

(c) Construct validity pertains to how well a measure indicates the concept (or construct) of interest. For example, the association of disease onset with the first RA diagnosis code is a measure of construct validity. Recognizing that both symptom onset and diagnosis date are absent from administrative data, we demonstrated that, for patients identified by administrative data algorithms, the first health services contact with an RA diagnosis code is a reasonably valid way of capturing the construct of interest (physician recognition of RA) (refer to Chapter 3).

Additionally, the OMERACT (Outcome Measures in Rheumatology) group has provided a three-step framework for the validation of outcome measures. This framework can also be extrapolated to the administrative data validation setting. OMERACT advocates that, before it is used, a measure (such as an administrative data algorithm) should be: (1) truthful (valid, as previously discussed), (2) discriminatory, and (3) feasible.160 Discrimination incorporates concepts of both validity and reliability: Is the measure able to consistently discriminate between situations or groups of subjects (e.g., does the algorithm distinguish individuals who have RA from those who do not)? Does the algorithm perform similarly in different settings (e.g., in a rheumatology setting versus a primary care setting)?

Discrimination also addresses the ability to measure change. However, it does not make sense to test sensitivity to change when using administrative data for RA case ascertainment, as RA typically does not resolve (it is an incurable chronic disease). Feasibility refers to the practical usefulness of a measure, which is important in determining its success. For example, the administrative data algorithms selected for testing (Chapters 3 and 4) include data elements available in all Canadian provinces and territories. Thus, our results can be used to enhance national surveillance and research efforts across Canada.

6.3.2 Best Practices for Administrative Data Validation Study Design and Reporting

Since validation and assessment of data quality play a central role when using administrative data for secondary research, proper validation methodology is imperative. Anyone reading a validation paper should be able to assess potential bias in measurement validity. Our systematic literature review (Chapter 2) provides information about the methodology and quality of published studies validating rheumatic disease case ascertainment approaches within health administrative data. Using basic epidemiologic principles and the consensus criteria for reporting diagnostic accuracy studies,60,54,61 we identified several factors that threaten internal and external validity. A methodological framework was developed as an aid to improve future health administrative data validation efforts. Although it is impossible to eliminate all error, the systematic review and critical appraisal of validation studies (Chapter 2) aims to provide a fuller understanding of measurement error in designing research, analyzing and interpreting data, and acknowledging limitations. This, in turn, can inform the planning and reporting of administrative data validation studies, and aid reviewers, editors, and readers in evaluating and interpreting them.

The implications of poorly designed health administrative data validation studies are numerous. Use of administrative data case definition algorithms that are not optimal, especially when their limitations are not emphasized, can lead to contradictory findings that confuse and/or mislead physicians and policy-makers. This may be one reason why some journals and policy-makers view statistics generated from administrative data with suspicion. As well, poorly designed validation


studies could produce exaggerated results, which could trigger premature adoption of an administrative data algorithm and lead physicians and policy-makers to make incorrect decisions about the care of individual patients. Specifically, pharmacoepidemiology studies performed in the wrong patient population could distort evaluations of real-world drug safety and effectiveness. In studies that use administrative data for surveillance purposes, incorrect methods may likewise have adverse effects: decision-makers seeking to implement population-based disease management or intervention programs may be misinformed about the true disease burden, which could impact health system efficiency. In general, sub-optimal validation methodology can lead to misclassification, which in turn can lead to reduced power in some situations, loss of generalizability in others, and increased bias in certain settings.161,85,162 Thus, end-users of administrative data validation studies must be aware of the potential for bias and a possible lack of applicability.

To summarize the heterogeneity of validation study methodology: our systematic literature review identified variable methods of patient sampling and variable reference standards for classifying patients, which can create subject pools with differing demographic or clinical features that may influence the conclusions drawn. Most importantly, we identified substantial heterogeneity in study conduct that determines which measures of diagnostic accuracy can be computed. The direction of patient sampling (whereby patients are initially sampled either from the reference standard or from diagnosis codes in the administrative database) and the inclusion or exclusion of a comparator group without the disease play a key role in determining the usefulness of each study. To assess accuracy optimally, it is important to determine the two types of error that can occur, false positives and false negatives, in order to compute the standard epidemiologic measures (sensitivity, specificity, PPV, NPV) that describe the accuracy of algorithms. Furthermore, the spectrum of disease and the disease prevalence are also critical to proper interpretation of the relationships among accuracy measures.
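The four standard measures are simple functions of these two error types. A minimal sketch, with the example counts chosen (hypothetically) so that the output mirrors the Chapter 3 estimates:

```python
def accuracy_measures(tp, fp, fn, tn):
    """Diagnostic accuracy from a 2x2 table: algorithm classification
    cross-tabulated against the reference standard."""
    return {
        "sensitivity": tp / (tp + fn),  # proportion of true cases detected
        "specificity": tn / (tn + fp),  # proportion of non-cases excluded
        "ppv": tp / (tp + fp),          # probability of disease given a positive
        "npv": tn / (tn + fn),          # probability of no disease given a negative
    }

# Hypothetical counts: 97 true positives, 3 missed cases, 30 false positives
m = accuracy_measures(tp=97, fp=30, fn=3, tn=170)
print({k: round(v, 2) for k, v in m.items()})
# -> sensitivity 0.97, specificity 0.85, ppv 0.76, npv 0.98
```

Note that sensitivity and specificity depend only on the column totals (disease status), whereas PPV and NPV depend on the row totals (test result) and hence on how many true cases are in the sample.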

Therefore, below we describe the characteristics of the RA patients in our two study samples (Chapters 3 and 4), followed by a discussion of how disease prevalence, spectrum of disease, and type of comparator group influenced the estimates of diagnostic accuracy. Here, we discuss and contrast the findings using the algorithm “[1 hospitalization RA code] OR [≥3 physician RA diagnosis codes (claims) with ≥1 by a specialist in a 2-year period]”.

Following the best practice statements set forth in Chapter 2, we randomly sampled patients from two separate settings (Chapters 3 and 4). The defining characteristics that differed between the rheumatology clinic sample and the primary care sample, respectively, were:

(1) Disease prevalence: 33% of patients had RA in the rheumatology sample vs. 0.9% in the primary care sample

(2) Spectrum of disease (clinical characteristics): contemporary RA patients under active rheumatology care and treatment vs. patients with a lifetime RA diagnosis

(3) Type of comparator group: rheumatology patients with other rheumatologic diagnoses vs. primary care patients who may be healthy or have any diagnosis
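The influence of prevalence on PPV follows directly from Bayes' rule. Holding the Chapter 3 sensitivity (97%) and specificity (85%) fixed, an illustrative calculation shows how PPV collapses at low prevalence:

```python
def ppv(sensitivity, specificity, prevalence):
    """P(disease | positive algorithm result) by Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Same algorithm performance, two settings:
print(round(ppv(0.97, 0.85, 0.33), 2))    # 0.76 at rheumatology-clinic prevalence
print(round(ppv(0.97, 0.85, 0.009), 2))   # 0.06 at general-population prevalence
```

This is why near-perfect specificity mattered in the primary care validation: at 0.9% prevalence, even a modest false positive rate among the large non-diseased majority overwhelms the true positives.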

Below, Table 1 summarizes the characteristics of RA patients in the two samples.

Table 1: A comparison of characteristics of RA patients defined in Chapter 3 and Chapter 4.

| Characteristics | Rheumatology clinics (n=149; 33%) | Primary care (n=69; 0.9%) |
|---|---|---|
| Age, mean (SD) years | 62 (14) | 62 (14) |
| Female | 77% | 64% |
| Diagnosis provided by a specialist | 100% | 86% |
| Hand joint involvement | 81% | 67% |
| Morning stiffness | 60% | 62% |
| Symmetric arthritis | 76% | 63% |
| Rheumatoid nodules | 18% | 19% |
| Radiographic changes typical of RA | 34% | 22% |
| Rheumatoid factor positive | 62% | 39% |
| Glucocorticosteroid use | 74% | 61% |
| DMARD use | 96% | 80% |
| Biologic use | 24% | 17% |

Definition of RA:
Rheumatology sample: rheumatologist-documented RA in the medical record.
Primary care sample: patients were classified as having RA based on a diagnosis by a musculoskeletal (MSK) specialist (rheumatologist, orthopedic surgeon, or internal medicine specialist) via consultation letters, or if a patient was labeled as having RA by the family physician with supporting evidence (e.g., serology, joint involvement, or treatment).


The mean (SD) age of RA patients was identical in the two settings: 62 (14) years. The non-RA patients were younger: 58 (17) years in the rheumatology clinic sample and 49 (17) years in the primary care sample, consistent with the expectation that a primary care population is younger than a rheumatology population. In terms of documentation of seropositive status, RA patients from the rheumatology sample had higher documented seropositivity (that is, for rheumatoid factor or anti-CCP, two serologic markers of RA) than RA patients from the primary care sample: 54% vs. 39%, respectively. While this may be a true difference, it is more likely attributable to the retrospective design of our validation studies: as expected, rheumatology charts contained greater documentation of RA-specific characteristics than the charts of family physicians. Similarly, documentation of pharmacotherapy showed that 96% of RA patients in the rheumatology sample vs. 80% in the primary care sample were prescribed DMARDs, and 24% vs. 17%, respectively, were prescribed biologics. These prescriptions are most often correlated with seeing a rheumatologist, and all patients in our rheumatology sample had access to a rheumatologist, whereas only 86% of RA patients seen by primary care physicians had documentation of the diagnosis being made by an MSK specialist. Additionally, 90% of the RA patients in the primary care sample had at least one outpatient RA diagnosis code provided by any physician, and 81% had at least one outpatient RA diagnosis code provided by an MSK specialist, meaning they had a visit to a specialist with a corresponding RA code since the onset of administrative data (1991 onwards).
In contrast, all RA patients under active rheumatology care had at least one RA diagnosis code provided by a rheumatologist. Below, Table 2 summarizes the results of Chapter 3 and Chapter 4.


Table 2: A comparison of the algorithm results from Chapter 3 and Chapter 4.

Rheumatology Clinic Sample (pre-test prevalence: 33%)
  Algorithm positive:   TP = 143   FP = 45      PPV = 76%
  Algorithm negative:   FN = 5     TN = 254     NPV = 98%
  Sensitivity = 97%     Specificity = 85%       Post-test prevalence = 42%

Primary Care Sample (pre-test prevalence: 0.9%)
  Algorithm positive:   TP = 54    FP = 15      PPV = 78%
  Algorithm negative:   FN = 15    TN = 7,416   NPV = 100%
  Sensitivity = 78%     Specificity = 100%      Post-test prevalence = 0.9%

Pre-test prevalence = (TP + FN) / (TP + FP + FN + TN)
Sensitivity = TP / (TP + FN)
Specificity = TN / (FP + TN)
PPV = TP / (TP + FP) = (prevalence × sensitivity) / [(prevalence × sensitivity) + (1 − prevalence) × (1 − specificity)]
NPV = TN / (FN + TN) = [specificity × (1 − prevalence)] / [specificity × (1 − prevalence) + prevalence × (1 − sensitivity)]
Post-test prevalence = (TP + FP) / (TP + FP + FN + TN)
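The accuracy measures reported in Table 2 follow directly from the 2×2 counts. As a check on the arithmetic, a minimal Python sketch (not part of the thesis analysis) recomputes them for both validation samples:

```python
# Sketch: recomputing the diagnostic accuracy measures in Table 2
# from the raw 2x2 counts reported for each validation sample.

def accuracy_measures(tp, fp, fn, tn):
    """Return the accuracy measures defined beneath Table 2."""
    total = tp + fp + fn + tn
    return {
        "pre_test_prevalence": (tp + fn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (fp + tn),
        "ppv": tp / (tp + fp),
        "npv": tn / (fn + tn),
        "post_test_prevalence": (tp + fp) / total,
    }

# Rheumatology clinic sample (Chapter 3)
rheum = accuracy_measures(tp=143, fp=45, fn=5, tn=254)
# Primary care sample (Chapter 4)
primary = accuracy_measures(tp=54, fp=15, fn=15, tn=7416)

print(f"Rheumatology PPV: {rheum['ppv']:.0%}")    # ~76%
print(f"Primary care PPV: {primary['ppv']:.0%}")  # ~78%
```

Note that despite very different pre-test prevalences (33% vs. 0.9%), the computed PPVs of the preferred algorithm are nearly identical in the two samples.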

6.3.3 Relationship among accuracy measures: The impact of disease prevalence, spectrum of disease and type of comparator group on measures of diagnostic accuracy.

To understand the relationship among accuracy measures, it is important to distinguish quantitative measures from performance measures of validity. Sensitivity and specificity are quantitative measures of validity: they quantify the inherent accuracy of an administrative data algorithm, because they quantify how well the algorithm reflects true disease status. The predictive values (PPV and NPV) are performance measures of validity: they quantify how the algorithm performs in a given population. Clinicians and epidemiologists also often interpret the predictive values differently (Table 3).


Table 3: Interpreting Predictive Values

       Clinician                                  Epidemiologist
PPV    Probability that an individual with a     Proportion of positive tests corresponding to
       positive test really has RA               true patients = proportion with RA among
                                                 those testing positive
NPV    Probability that an individual with a     Proportion of negative tests corresponding to
       negative test is really healthy           healthy subjects = proportion of non-affected
                                                 among those testing negative

In this thesis, algorithm accuracy was measured relative to the diagnoses documented within the medical charts of rheumatologists and family physicians (reference standards). In both samples, algorithm sensitivity was computed only among study subjects with RA, and specificity was computed only among those without RA. Sensitivity and specificity do not depend on the prevalence of RA in the study population, but they can vary across populations.163 For example, sensitivity of our administrative data algorithm was excellent (>97%) at identifying contemporary RA patients under active rheumatology care. In contrast, sensitivity was moderately good (78%) at identifying patients with a lifetime diagnosis of RA (who include patients under active rheumatology care, but also patients whose symptoms may have resolved and who are no longer seeking care for their RA). When we varied our definition of RA based on levels of evidence in the primary care charts (i.e., varied the spectrum of disease in our cohort), more specific definitions of RA according to the reference standard increased sensitivity, whereas more liberal definitions (such as allowing any mention of RA with no supporting evidence, or a query RA diagnosis) decreased the sensitivity of the administrative data algorithm. Thus, defining an a priori reference standard to classify individuals with RA has implications for validation methodology. To elucidate these findings with a clinical scenario: patients who fulfill strict classification criteria, such as the 1987 RA criteria, may have more advanced disease (prevalent cases with longer disease duration, or more active disease requiring multiple physician visits) and are therefore more likely to be detected by administrative data algorithms.
This finding was also observed in our literature review (Chapter 2), in which studies that tested administrative algorithms amongst RA patients required to meet strict classification criteria (e.g., the 1987 RA criteria) reported higher sensitivity than studies in which patients were classified by more liberal criteria (such as an RA diagnosis documented in the medical record).


On the other hand, as specificity is computed only among those without RA, the type of patients included in the non-RA comparator group (as defined by the reference standard) can greatly influence estimates of specificity. The lower specificity observed amongst non-RA patients under rheumatology care (85%, vs. 100% in the primary care sample) reflects the fact that patients without RA in the rheumatology clinic sample had other rheumatologic diagnoses, which may resemble RA, or may at one point have been considered possible RA before evolving into a clearer diagnosis (such as systemic lupus). In contrast, our comparator group of individuals without RA within our primary care sample included healthy individuals and patients with other conditions unrelated to RA, which improved specificity. This finding has implications for researchers conducting administrative data validation studies that seek to identify population-based algorithms with high specificity. Furthermore, it emphasizes the importance of reporting the characteristics of the types of patients in the comparator group to inform proper interpretation of specificity estimates.

While sensitivity and specificity are dependent on the characteristics of patients with and without the disease, respectively, predictive values (PPV and NPV) depend on disease prevalence, in addition to sensitivity and specificity. Table 4 highlights the impact of sensitivity, specificity and prevalence on predictive values.

Table 4: Impact of sensitivity, specificity and prevalence on predictive values

PPV = (prevalence × sensitivity) / [(prevalence × sensitivity) + (1 − prevalence) × (1 − specificity)]
    Interpretation: increasing specificity → increasing PPV; increasing prevalence → increasing PPV

NPV = [specificity × (1 − prevalence)] / [specificity × (1 − prevalence) + prevalence × (1 − sensitivity)]
    Interpretation: increasing sensitivity → increasing NPV; decreasing prevalence → increasing NPV

An important finding of our validation studies, which were performed in two separate settings with different underlying disease prevalence, was that our optimal algorithm had virtually the same PPV in both settings. Our primary care sample had a relatively low RA prevalence of 0.9% (reflective of the general population). Despite the low RA prevalence in this sample, there was great variation among the PPV estimates across algorithms, ranging from 51% to 83%. The PPV estimates improved substantially with the requirement for musculoskeletal specialist billing codes for RA. In contrast, specificity was very high (>97%) amongst all algorithms tested within our primary care sample. In general, for conditions that are present in a minority of the study population (such as our primary care sample), specificity has a greater impact than sensitivity on PPV. However, specificity is not the sole factor driving PPV in this sample (otherwise all algorithms with high specificity would have had moderately good PPV). A more likely explanation is that the preferred algorithm identified fewer false positives in both settings (as fewer false positives increase PPV). This is likely owing to the nature of RA management, which often involves referral to a specialist, frequent physician visits, and a non-curative, long disease course. These observations suggest that patients with prevalent (long-term) chronic conditions may have a higher probability of being identified by similar administrative data algorithms owing to the disease course, management practices, and frequency of physician visits. This finding may be further supported by the concordance between our results and those from algorithms for case ascertainment of diabetes,34 hypertension,33 and other chronic diseases with substantially higher prevalence than RA.

In general, for researchers conducting administrative data validation studies, and for those applying algorithms that were validated among rheumatology clinic patients (where the prevalence of RA is substantially higher than in the general population), the limited generalizability of these validation studies needs to be emphasized. Unlike sensitivity and specificity, predictive values are highly influenced by disease prevalence in the study population; thus, predictive values are not readily transferred from one setting to another. Notably, prevalence affects PPV and NPV differently: PPV increases, while NPV decreases, as disease prevalence increases. The change in PPV is more substantial; NPV is less influenced by disease prevalence, as shown by the high NPV estimates in both of our validation studies (98% vs. 100%), regardless of the algorithm tested.
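The prevalence dependence of PPV can be made concrete with the formula from Table 4. The sketch below (illustrative, not from the thesis) holds sensitivity and specificity fixed at the rheumatology-sample operating point (97% and 85%) and varies only prevalence; the intermediate 10% prevalence value is a hypothetical "enriched" setting added for illustration:

```python
# Sketch: PPV as a function of disease prevalence, with sensitivity and
# specificity held fixed, using the formula from Table 4.

def ppv(prevalence, sensitivity, specificity):
    true_pos_rate = prevalence * sensitivity
    false_pos_rate = (1 - prevalence) * (1 - specificity)
    return true_pos_rate / (true_pos_rate + false_pos_rate)

sens, spec = 0.97, 0.85
# Clinic (33%), hypothetical enriched (10%), general population (0.9%)
for prev in (0.33, 0.10, 0.009):
    print(f"prevalence {prev:6.1%} -> PPV {ppv(prev, sens, spec):5.1%}")
```

At 33% prevalence this reproduces the rheumatology sample's PPV of roughly 76%, while at general-population prevalence (0.9%) the same sensitivity and specificity yield a PPV of only about 6%, which is exactly why predictive values validated in high-prevalence clinic samples do not transfer to population-based settings.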

Finally, when developing an algorithm, epidemiologists must often weigh the relative importance of sensitivity, specificity, PPV, and NPV, and prioritize the accuracy measure that is most important to a particular study.


Below, we summarize the characteristics of different components of administrative data algorithms and their influence on diagnostic accuracy (Tables 5 and 6).

Table 5: Influence of different components of administrative data algorithms on measures of diagnostic accuracy: Rheumatology Sample

Source of Diagnosis Codes
  Any physician codes                  Excellent SENS; modest to excellent SPEC and PPV
  Codes by specialists                 Increases SPEC and PPV
  Hospitalization codes                Poor SENS; high SPEC
No. of Diagnosis Codes
  Requiring multiple codes             Increases SPEC and PPV
Timing of Diagnosis Codes
  Increasing the observation window    From 1 to 2 yrs: increases SENS, decreases SPEC and PPV; no benefit beyond 2 yrs
  Varying time between codes           Little impact
Drugs
  Adding steroids/DMARDs/BRMs          Little impact on SENS but decreased SPEC and PPV
  Adding just DMARDs/BRMs              Decreased SENS but increased PPV
Exclusion Criteria
  Pros                                 Improved post-test prevalence
  Cons                                 Decreased SENS and PPV; moderate SPEC


Table 6: Influence of different components of administrative data algorithms on measures of diagnostic accuracy: Primary Care Sample

Source of Diagnosis Codes
  Any physician codes                  Good SENS, excellent SPEC, good PPV
  Codes by specialists                 Increases PPV (51-83%)
  Hospitalization codes                Poor SENS; high SPEC
No. of Diagnosis Codes
  Requiring multiple codes             Increases SPEC and PPV
Timing of Diagnosis Codes
  Increasing the observation window    Little impact
  Varying time between codes           Little impact
Drugs
  Adding steroids/DMARDs/BRMs          Slightly improves PPV
Exclusion Criteria
  Pros                                 Improved post-test prevalence
  Cons                                 Decreased SENS and PPV


6.3.4 Future Directions for Health Administrative Data Validation

This thesis describes the validation of administrative data algorithms in two independent samples. Pre-defined algorithms were selected for testing based on algorithms already being employed by researchers; testing pre-defined algorithms enabled us to report on the accuracy of algorithms currently in use. Future efforts may consider classification and regression tree analysis to develop algorithms that better predict disease status. Regression analysis could also identify predictors of concordance (and discordance) for the different components included in administrative data algorithms.
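The core of the classification-tree idea is to choose, at each node, the algorithm component (e.g., number of RA codes, presence of a specialist code) whose split best separates RA from non-RA patients. The sketch below shows a single Gini-impurity split search on toy data; a full CART analysis would recurse on each resulting subset and typically use dedicated software. Feature names and records are hypothetical:

```python
# Sketch: finding the best first split of a classification tree by
# weighted Gini impurity. Toy data only; a full CART analysis would
# recurse and prune.

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(records, labels, features):
    """Return (feature, weighted Gini) for the best binary split."""
    best = None
    for f in features:
        yes = [l for r, l in zip(records, labels) if r[f]]
        no = [l for r, l in zip(records, labels) if not r[f]]
        score = (len(yes) * gini(yes) + len(no) * gini(no)) / len(labels)
        if best is None or score < best[1]:
            best = (f, score)
    return best

# Hypothetical chart-abstraction data: which algorithm components are
# present for each patient, plus the reference-standard diagnosis.
records = [
    {"ge3_codes": True,  "specialist_code": True,  "hosp_code": False},
    {"ge3_codes": True,  "specialist_code": False, "hosp_code": False},
    {"ge3_codes": False, "specialist_code": False, "hosp_code": False},
    {"ge3_codes": True,  "specialist_code": True,  "hosp_code": True},
    {"ge3_codes": False, "specialist_code": False, "hosp_code": False},
    {"ge3_codes": True,  "specialist_code": False, "hosp_code": False},
]
labels = [1, 0, 0, 1, 0, 0]  # 1 = RA per the reference standard

feature, score = best_split(records, labels,
                            ["ge3_codes", "specialist_code", "hosp_code"])
print(f"Best first split: {feature} (weighted Gini {score:.2f})")
```

In this toy example the specialist-code feature yields the cleanest split, loosely mirroring the thesis finding that specialist codes drive PPV; on real chart-abstraction data the tree structure would of course be determined empirically.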

Furthermore, both the selection of our samples and the absence of an internal revalidation may have yielded overly optimistic estimates of the algorithm's performance (due to "testimation" bias). A bootstrap method could be used to estimate this optimism, by repeating the algorithm-selection process in bootstrap resamples and revalidating the chosen algorithms against the original data.

Finally, algorithms could be tested among specific sub-populations (e.g., RA patients without specialty care or treatment), to determine the optimal algorithm for detecting specific subtypes of patients.


6.4 Rising Prevalence of Rheumatoid Arthritis: Implications for practice and policy

Valid population-based assessments of the epidemiological trends in RA are critical in helping health- care providers and decision makers to anticipate the burden of RA and to optimize clinical and public health strategies for disease management.

The significant rise in the number of people with RA in Ontario, and the rapidity of this increase, lends urgency to tackling this issue. RA places a significant burden on the individual and society, and with the increasing costs related to disability and treatment, RA has become one of the most costly chronic diseases in the developed world.164 Despite its rising prevalence and the associated cost burden on the system, RA has received relatively little attention compared to other chronic diseases with respect to optimizing models of care. Currently, rheumatologists are the predominant care providers for RA, and access to rheumatologists depends on referral at the primary care level (or by other specialists). The benefit of early access to rheumatologists is early treatment to prevent joint damage and disability and, hopefully, improve quality of life. Although the evidence is clear that rheumatologists should treat RA early, it remains a challenge to put this paradigm into practice. Current evidence suggests that Ontarians living with RA are not receiving the right care at the right time.100 Despite the fact that the number of RA patients has more than doubled over the past 15 years (from 42,734 in 1996 to 93,558 in 2010), the number of practicing rheumatologists has not been increasing,140 which is likely to cause further strain on our health care system. Health human resources shortages and geographic variation in the supply of rheumatology services mean that access to quality care for RA continues to be a challenge in Ontario, and elsewhere in Canada. Given that RA is a chronic, life-long disease, rheumatologists are required to provide decades of care for individual patients.
Spontaneous remission is rare, and most patients will experience fluctuating symptoms with slow progression of damage over time.165 Although drug treatment slows the disease process, few patients go into lasting remission.166 Thus, many rheumatologists are unable to accept new patients into their practice. Since health human resources have not increased alongside the growing burden of RA, newer, more efficient models of care, allowing the transfer of stable RA patients from rheumatology back to their primary care physicians, may need to be developed to free up time for rheumatologists to assess new patients. Primary care structures could also be expanded to include other care providers (such as nurses, nurse practitioners, physiotherapists, and pharmacists) to provide education and care in a coherent, organized way.

Our data indicate not only that RA prevalence has increased significantly over time, but also that the prevalence of RA rises with increasing age, and that around twice as many women as men are affected. We have speculated on the reasons for this increase (e.g., aging of the background population), but cannot discern the etiology using health administrative data alone. Furthermore, the changing demographics of the RA population also have implications for new models of care. Senior rheumatology patients have special considerations compared to the younger population: older patients tend to have more co-morbidities (for example, cardiovascular disease), and the resultant polypharmacy can make management of rheumatologic conditions all the more challenging. As the population ages, there will be a greater number of seniors with RA who are also developing multiple comorbidities, further complicating chronic disease management. Additionally, RA is not recognized as a significant chronic condition (unlike diabetes or renal disease) in most provinces. Never has there been a more urgent need for a coordinated, multi-partnered approach to integrate RA into a broad chronic disease management strategy.

Regional prevalence rates also illustrate the high burden of RA in all locales, especially in northern communities, highlighting the need for regional considerations when planning health care provision for RA. Further study is required to elucidate the reasons for regional variation. One interesting avenue for future research is to determine whether RA prevalence in certain regions is driven by the high proportion of Aboriginal residents living there (since there are multiple known genetic risk factors for RA, which could be driving risk in these populations).

One principal advantage of the newly developed ORAD as a database resource is that it not only quantifies disease burden, but also defines a population in which the process and outcomes of disease management may be explored.


6.5 Limitations of ORAD and Generalizability

Despite rigorous approaches to the validation and creation of ORAD, there are some limitations associated with both the methodology that was used, as well as the inherent limitations of using administrative data for health services research and epidemiology.

First and foremost, health administrative data provide information only about patients who have sought out and received health care. Therefore, we are unable to study people who rarely access care. The number of patients with subclinical disease, or who have not sought or obtained care from a physician, could not be determined, and such patients would not be identified as having RA within the database.

While we sought to determine the optimal administrative algorithm for detecting RA in Ontario, the task was hampered by the trade-off between sensitivity and specificity. Prioritizing PPV further complicated the selection of the optimal algorithm for assembling our population-based cohort. However, prioritizing PPV is appropriate in studies where the cohort must be limited to persons with a particular condition but need not include, or be representative of, all persons with the condition of interest.85 To maximize PPV, diagnosis codes provided by MSK specialists were required. Thus, all individuals in ORAD either have at least three diagnosis codes for RA within a two-year period, of which at least one must be from a MSK specialist (rheumatologist, internist, or orthopedic surgeon), OR were hospitalized for their RA. This should not be misinterpreted to mean that ORAD includes only RA patients who receive specialty care within two years of disease onset. Rather, patients can accumulate multiple RA claims (>3 RA codes); it is the specialist diagnosis codes that flag patients into ORAD, but all RA codes that accumulate prior to the specialist RA codes are equally important in defining RA in our cohort. To this end, while PPV was prioritized, we still achieved 100% specificity. Ideally, algorithms for identifying RA would be 100% accurate, with perfect sensitivity and specificity; however, ORAD may miss some RA patients, such as those whose disease was active prior to the onset of administrative data, or those whose symptoms have resolved and who, for these or other reasons, are no longer seeking care for their RA.
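The ORAD case definition can be sketched in code. The following is a minimal illustration, assuming a hypothetical claims data model (field names, the 730-day approximation of two years, and the sample records are all illustrative, not ORAD's actual implementation):

```python
# Sketch: applying the ORAD case definition to one patient's claims
# history. A patient qualifies with any RA hospitalization code, OR with
# >=3 physician RA diagnosis codes, at least one from an MSK specialist,
# within a two-year window. Data model is hypothetical.
from datetime import date, timedelta

MSK_SPECIALTIES = {"rheumatology", "internal medicine", "orthopedic surgery"}
TWO_YEARS = timedelta(days=730)

def meets_orad_definition(hospitalizations, claims):
    """claims: list of (service_date, physician_specialty) tuples,
    one per RA diagnosis code."""
    if hospitalizations:  # any RA hospitalization code qualifies
        return True
    claims = sorted(claims)
    for i, (start, _) in enumerate(claims):
        # All RA codes falling within two years of this anchor claim.
        window = [c for c in claims[i:] if c[0] <= start + TWO_YEARS]
        if len(window) >= 3 and any(s in MSK_SPECIALTIES for _, s in window):
            return True
    return False

claims = [
    (date(2005, 3, 1), "family medicine"),
    (date(2005, 9, 15), "family medicine"),
    (date(2006, 6, 2), "rheumatology"),
]
print(meets_orad_definition(hospitalizations=[], claims=claims))  # True
```

Note how the family-physician codes preceding the specialist code count toward the three-code requirement, matching the point above that earlier RA claims are equally important in defining cases.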

Despite a high degree of accuracy in identifying RA patients, misclassification may still exist. Thus, the ORAD cohort likely captures some patients with conditions that resemble RA, such as seronegative inflammatory arthritides (e.g., psoriatic arthritis, which does not have a disease-specific diagnosis code in Ontario).

Additionally, the limitations of using ORAD for health services and epidemiologic research are primarily related to its reliance on health administrative data to follow patients longitudinally (e.g., different observation periods for patients). ORAD represents a strong resource for population-based research: it does not require informed patient consent and is, therefore, less prone to selection bias from non-response. However, ORAD does not contain information on potentially important variables, such as disease severity, smoking or alcohol use, laboratory values or other diagnostic results, other important clinical variables, or over-the-counter drug use and in-hospital medication, and prescription drug data are limited to persons aged 65 years and older. The lack of these variables may be important, particularly when some measure of disease severity is required, or when an outcome could be confounded by an unmeasured covariate, such as smoking. However, proxies or surrogate measures to account for potential confounding variables in future research using ORAD are possible. For example, possible surrogates of disease severity include the number of physician visits, acute care hospitalizations for RA, joint replacement surgeries, or dispensing of different DMARDs or biologics, all of which are typically indicative of more severe RA. Thus, partial adjustment for disease severity via proxies or surrogate measures is possible in ORAD.106 Furthermore, ORAD cannot provide adequate insight regarding clinical decision-making and/or patient preference, which limits the detailed study of disease management and care.

It is also important to emphasize that the algorithm validated in this thesis to identify adult patients with RA is limited in its applications. For example, ORAD only contains individuals aged 15 years and older, and a paediatric-specific algorithm has not been validated. While ORAD will be updated yearly by ICES in order to continue surveillance of RA in Ontario, we are uncertain whether the algorithm will perform with similar accuracy in the future. Changes to physician billing patterns or coding may alter the properties of the preferred algorithm over time; thus, re-validation of algorithms at regular intervals is needed to ensure consistency. In addition, algorithm properties may vary across settings. Applying ORAD's algorithm, which was developed in Ontario, to a different setting (such as other provinces or territories in Canada) therefore requires caution and an understanding of similarities and differences in coding practices and related issues (e.g., shadow billing).

Similarly, the generalizability of our findings to other jurisdictions outside of Canada warrants careful consideration. Ideally, algorithm portability could eliminate a significant amount of redundant effort (e.g., external validation) and allow the collection of larger, more homogeneous disease cohorts from multiple administrative databases, by using a standard algorithm that all researchers employ. However, the features of different administrative databases, as well as the delivery of health care outside of Canada, are important considerations in determining the feasibility of applying ORAD's algorithm elsewhere. For example, ORAD requires the linkage of three administrative databases: outpatient physician diagnosis codes, hospitalization data, and physician specialty. The quality and completeness of outpatient and inpatient diagnosis codes may differ outside of Canada (such as in the United States, where patients do not have universal health care).

Additionally, our definition of a MSK specialist (rheumatologist, internist, orthopedic surgeon) may not be suitable outside of Canada due to different training requirements, practice styles, or the quality of data defining specialists in administrative data. For Ontario, we chose to classify a specialist in broad terms as an MSK specialist (someone most likely to be able to recognize RA) for two reasons: 1) the quality of data defining specialists (e.g., rheumatologists may be misclassified as internists in physician specialty databases), and 2) to increase algorithm sensitivity, as not all patients in Ontario have access to rheumatologists. (Appendix 6 provides additional information on the database used to define physician specialty.)

On the other hand, including non-rheumatologists such as orthopedic surgeons could potentially diminish the specificity of our case definition. Still, using our definition, we found an RA prevalence, based on administrative data, that does not appear grossly over-inflated compared to what would be expected. Moreover, the Public Health Agency of Canada is currently using an approach similar to ours, which in fact does not differentiate among specialties at all. In the end, though, we admit the possibility that for some purposes (e.g., if an extremely specific definition is needed, such as when evaluating the rates of certain outcomes particular to RA), our definition of specialist physician may not be optimal. However, as both validation samples were comprised of either all or most patients with rheumatologist-confirmed RA, our results are quite generalizable to researchers who opt to classify specialty by rheumatology alone. Furthermore, Appendix 6 describes the source of data in which patients are first identified as having RA in ORAD (inpatient or outpatient data), as well as the physician specialty assigning the first RA diagnosis code. These data indicate that outpatient physician billing claims initially identify most RA patients, and that rheumatologists and primary care physicians contribute the most initial RA coding.

Another potential caveat of ORAD's algorithm is the two-year observation period used to identify cases. One of the strengths of using health administrative databases that reside in a single-payer health care system (e.g., Canada), over other types of databases [e.g., health maintenance organization (HMO) data], is the comprehensive longitudinal information available. In other settings (e.g., the USA), it may be more suitable (and feasible) to select an algorithm with a shorter observation period (e.g., one year). We chose a two-year observation period because we found that accumulating diagnosis codes improves both specificity and PPV in both of our validation samples. As our aim was to select the algorithm that introduced the fewest false positives into our cohort, we found that requiring 3 diagnosis codes met our criteria. In Ontario, where access to health human resources remains a challenge, requiring 3 diagnosis codes within 1 year had poor sensibility (face validity). Relatedly, researchers who opt to select an algorithm with an observation period of two years or more must remain cautious in the most recent years of study, as data availability bias can occur if not all patients have an equal 'look forward' period in which to meet the terms of the case definition.
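One simple guard against this data availability bias is to stop case accrual early enough that every patient's full look-forward window fits inside the available data. A minimal sketch, with hypothetical dates and a 730-day approximation of the two-year window:

```python
# Sketch: restricting case accrual so that every patient has a complete
# two-year look-forward window before the administrative data cut-off.
from datetime import date, timedelta

DATA_END = date(2010, 12, 31)        # hypothetical last date of data
LOOKFORWARD = timedelta(days=730)    # two-year observation window

def has_full_lookforward(first_ra_code_date):
    """Accrue a patient only if their full window fits in the data."""
    return first_ra_code_date + LOOKFORWARD <= DATA_END

print(has_full_lookforward(date(2008, 6, 1)))  # True: full window available
print(has_full_lookforward(date(2010, 3, 1)))  # False: window is truncated
```

Patients first coded near the data cut-off are excluded from accrual rather than being misclassified as non-cases for lack of follow-up time.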

In addition, while the overall goal was to recommend the optimal algorithm for use in Ontario, we report the results of numerous algorithms (Chapters 3 and 4) so that researchers can be better informed in choosing the case definition best suited to their study populations and study purpose. Due to the different characteristics inherent in different administrative databases, it would be imprudent to suggest preferred algorithms for use outside of Canada, where other researchers may be better informed about the characteristics of their own databases under study. Rather, we re-iterate throughout Chapters 2-4 that algorithms should be selected based on study purpose and feasibility, and we make recommendations as to which measures of accuracy (e.g., sensitivity, specificity, PPV) should be prioritized for different study purposes.


Despite its limitations, ORAD is a powerful tool to examine RA in Ontario. It represents the largest population-based cohort of RA patients in a universal health care system and the advantages far outweigh the disadvantages. Researchers should be aware of the limitations of using ORAD and hypotheses must be limited to those answerable using the data collected. In addition, we report on the results of numerous algorithms, recommendations on how to choose algorithms, and supporting data to enhance the generalizability of findings outside of Canada.


6.6 Health Administrative Data for Secondary Research: Future Directions

Decision-makers everywhere are endeavoring to enhance their ability to evaluate disease burden, quality of care, and outcomes such as drug effectiveness, safety and cost in the real world, to better understand “what works” in health care.

The future directions of ORAD and the overarching goals for future projects are to produce knowledge and tools to enhance the timeliness and accuracy of Canadian administrative data for research and surveillance. A number of steps are currently underway to (A) expand, strengthen and reinforce research conducted with ORAD, and (B) develop better tools to evaluate data collected in usual care.

Initially, ORAD will be strengthened through linkage to a variety of clinical data sources. By combining data sources, linking across databases, and supplementing the administrative data, ORAD will serve as a valuable tool for strengthening the ongoing surveillance of RA in Ontario, offering the added advantage of being able to study patient-level process and outcome measures, which are not available from administrative data alone. The linkage of clinical cohorts to ORAD will enable us to follow patients throughout Ontario for long periods of time at reasonable cost while minimizing loss to follow-up. There are currently two cohorts being prepared for linkage to ORAD: a clinical registry of patients with RA followed by rheumatologists [Ontario Biologics Research Initiative/Ontario Best Practices Research Initiative (OBRI)], and a clinical cohort of RA patients assembled from the electronic medical records (EMRs) of family physicians (EMRALD-RA cohort).

Building on my current contributions in the area of validation, we are in the process of developing methodologies (via an automated case-finding algorithm) to accurately identify RA patients within Canada’s only primary care EMR administrative data linked database, the Electronic Medical Record Administrative data Linked Database (EMRALD). Upon completion of validation, we will apply this algorithm to create a ‘primary care’ RA cohort that will enable us to improve rheumatology surveillance, measure quality of care, and describe care pathways by linking EMR and health administrative data.


Supplementing ORAD with clinical information derived from these clinical cohorts will overcome some of ORAD’s limitations, and these clinical cohorts will also serve as ‘reference standards’ for developing better tools to evaluate data collected in usual care. One area of unmet need relates to assessing health outcomes within longitudinal data. In any longitudinal dataset (such as ORAD), patients contribute observations over time, but individual patients provide quite different longitudinal information in terms of the number and timing of observations (e.g., physician visits). A patient with a cluster of observations at the beginning and end of a long observation period is very different from a patient with the same number of observations spaced evenly over the same period, and it may be invalid to derive treatment outcomes from such dissimilar patients. Incomplete information in electronic healthcare data is a further major challenge; indeed, the availability of data may reflect both health status and the quality of follow-up.

For pharmacoepidemiology studies, appropriate adjustment for confounding is also challenging, because exposure is determined by a complex interaction of patient, physician, and healthcare system factors. Further methodological work is required to adjust for the effect of biases inherent in observational data, such as confounding by indication. In observational studies, confounding by indication represents a unique challenge because both the outcomes of interest and the drug exposures of interest are often potentially associated with disease severity and co-morbidities.
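One simple way to make the clustered-versus-evenly-spaced distinction above concrete is to summarize the variability of the gaps between successive visits. The coefficient-of-variation measure below is an illustrative sketch, not a method used in this thesis.

```python
from statistics import mean, pstdev

def gap_dispersion(visit_days):
    """Coefficient of variation of the gaps between successive visits
    (visit_days: sorted day offsets from the start of observation).
    Near 0 for evenly spaced follow-up; large when observations cluster
    at the ends of the observation period."""
    gaps = [b - a for a, b in zip(visit_days, visit_days[1:])]
    avg = mean(gaps)
    return pstdev(gaps) / avg if avg else float("inf")
```

For example, five visits spaced 100 days apart give a dispersion of 0, whereas six visits clustered at the start and end of the same 400-day period give a value above 1 — a quick flag that the two follow-up patterns carry very different longitudinal information.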
Several approaches have been proposed to statistically adjust for confounding by indication (in order to better model the effects of drugs on the outcome of interest), such as using (a) clinical or proxy measures to adjust for disease severity, (b) propensity scores, (c) instrumental variables, or (d) disease risk scores, although no single method is clearly superior. Co-morbidity and disease severity may also function as effect modifiers. Researchers have also employed various co-morbidity indices (e.g., the Charlson Index), which place relatively little emphasis (or none at all) on the specific co-morbidities and underlying complications relevant to patients with RA.
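As a hedged illustration of approach (b), the sketch below stratifies patients on a precomputed propensity score and averages the within-stratum difference in outcome between treated and untreated patients. The record layout, the quintile scheme, and the assumption that scores are already estimated (e.g., from a logistic model of treatment on severity and co-morbidity) are all simplifications for the example.

```python
from statistics import mean

# Each record: (propensity_score, treated (bool), outcome (0 or 1)).
# Scores are assumed to be precomputed; values used here are synthetic.

def stratified_effect(records, n_strata=5):
    """Average within-stratum treated-minus-untreated outcome difference,
    stratifying on quantiles of the propensity score."""
    ranked = sorted(records, key=lambda r: r[0])
    size = len(ranked) // n_strata
    diffs = []
    for s in range(n_strata):
        # The last stratum absorbs any remainder records.
        stratum = (ranked[s * size:(s + 1) * size]
                   if s < n_strata - 1 else ranked[s * size:])
        treated = [r[2] for r in stratum if r[1]]
        control = [r[2] for r in stratum if not r[1]]
        if treated and control:  # skip strata lacking a comparison group
            diffs.append(mean(treated) - mean(control))
    return mean(diffs) if diffs else None
```

When the outcome depends only on the score (no true drug effect), the stratified estimate is zero even though a crude comparison might not be; this is the intuition behind using the score to balance disease severity across exposure groups.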

Given the complexity of the co-morbidities and disease severity that determine exposure to drugs, as well as the limitations of healthcare databases, there is substantial uncertainty about how to specify the statistical models necessary to control confounding. Further effort is needed to define, develop, and validate a widely applicable measure to adjust for confounding by indication specific to RA using electronic healthcare databases. Optimally evaluating the accuracy of methods and tools applied to ORAD requires access to a reference standard, such as the clinical cohorts being linked with ORAD. Linkage will enable us to i) establish tools to quantify the longitudinal value of electronic healthcare data for pharmacoepidemiology, ii) assess the magnitude of missing information in electronic healthcare data, iii) investigate the magnitude of potential confounding factors related to the evaluation of drug effectiveness, safety, and cost of treatment; and iv) develop and validate a widely applicable measure to adjust for confounding by indication for electronic healthcare databases.

Finally, based on our experiences with de-identified EMRs of family physicians in Ontario at ICES, we can envision that one day all rheumatologists will provide a copy of their EMRs to establish a similar EMR-based research database (like EMRALD) capable of linking with ORAD at ICES.

Developing tools and expanding existing data platforms to enhance the accuracy of Canadian research on the real-life comparative effectiveness and safety of drugs, and on disease surveillance, are essential for building globally relevant real-world evidence, particularly for understanding outcomes in routine clinical practice.

Furthermore, ORAD will play a central role in the development and evaluation of new models of RA care in Ontario to improve the timeliness and quality of care and health outcomes of patients living with RA.


6.7 Appendix Physician Specialty

In Ontario, there is a systematic way to identify physician specialty that involves linkage of multiple data sources and is validated by periodic interviews with all physicians. Other provinces do not have a systematic process, and the accuracy of defining physician specialty in those provinces is therefore unknown. A problem common to all provinces is that some rheumatologists may not change their status from internal medicine to rheumatology in a consistent fashion. Below, we detail how physician specialty is captured and validated in Ontario.

The ICES Physician Database (IPDB) comprises information from the Ontario Health Insurance Plan (OHIP) Corporate Provider Database (CPDB), the Ontario Physician Human Resource Data Centre (OPHRDC) database and the OHIP database of physician billings. The CPDB contains information about physician demographics, specialty training and certification and practice location. This information is validated against the OPHRDC database, which verifies this information through periodic telephone interviews with all physicians practicing in Ontario.

The IPDB is updated once a year, usually in November or December. OPHRDC surveys a portion (usually about one third) of all active physicians in Ontario each year; thus, each physician in the province is surveyed about every three years. It takes OPHRDC about 8 to 9 months to conduct the survey and process the results. In this way, validation of the OPHRDC ‘best’ specialty estimate is a continuous process.

In addition, the OPHRDC receives all certification information from the College of Physicians and Surgeons of Ontario and this information also informs their ‘best’ estimate of each physician’s practice specialty. Both OPHRDC and ICES also have access to the Corporate Provider Database of physician information, which includes all the specialty designations each physician is permitted to use when billing OHIP. These are called their ‘billing specialties’. ICES receives an updated CPDB monthly, and the IPDB is updated yearly.


References

1. Silman A. Epidemiology of the rheumatic diseases. Oxford: Oxford University Press; 2001.

2. Crowson CS, Matteson EL, Myasoedova E, et al. The lifetime risk of adult-onset rheumatoid arthritis and other inflammatory autoimmune rheumatic diseases. Arthritis Rheum 2011;63:633-9.

3. Bombardier C, Hawker G, Mosher D, et al. The Impact of Arthritis in Canada: Today and Over the Next 30 Years; 2011.

4. Myasoedova E, Davis JM, 3rd, Crowson CS, Gabriel SE. Epidemiology of rheumatoid arthritis: rheumatoid arthritis and mortality. Curr Rheumatol Rep 2010;12:379-85.

5. Arnett F, Edworthy S, Bloch D, et al. The American Rheumatism Association 1987 revised criteria for the classification of rheumatoid arthritis. Arthritis Rheum 1988;31:315-24.

6. Castro C, Gourley M. Diagnostic testing and interpretation of tests for autoimmunity. J Allergy Clin Immunol 2010;125:S238-47.

7. Schoels M, Bombardier C, Aletaha D. Diagnostic and prognostic value of antibodies and soluble biomarkers in undifferentiated peripheral inflammatory arthritis: a systematic review. J Rheumatol Suppl 2011;87:20-5.

8. Ropes MW, Bennett GA, Cobb S, Jacox R, Jessar RA. 1958 Revision of diagnostic criteria for rheumatoid arthritis. Bull Rheum Dis 1958;9:175-6.

9. Hulsemann JL, Zeidler H. Diagnostic evaluation of classification criteria for rheumatoid arthritis and reactive arthritis in an early synovitis outpatient clinic. Ann Rheum Dis 1999;58:278-80.

10. Harrison BJ, Symmons DP, Barrett EM, Silman AJ. The performance of the 1987 ARA classification criteria for rheumatoid arthritis in a population based cohort of patients with early inflammatory polyarthritis. American Rheumatism Association. J Rheumatol 1998;25:2324-30.

11. Aletaha D, Neogi T, Silman AJ, et al. 2010 Rheumatoid arthritis classification criteria: an American College of Rheumatology/European League Against Rheumatism collaborative initiative. Arthritis Rheum 2010;62:2569-81.

12. Pincus T, Yazici Y, Sokka T. Complexities in assessment of rheumatoid arthritis: absence of a single gold standard measure. Rheum Dis Clin North Am 2009;35:687-97, v.

13. Yelin EH, Such CL, Criswell LA, Epstein WV. Outcomes for persons with rheumatoid arthritis with a rheumatologist versus a non-rheumatologist as the main physician for this condition. Med Care 1998;36:513-22.


14. Bykerk V, et al. Canadian Consensus Statement on Early Optimal Therapy in Early Rheumatoid Arthritis. The Journal of the Canadian Rheumatology Association 2004.

15. Mierau M, Schoels M, Gonda G, et al. Assessing remission in clinical practice. Rheumatology (Oxford) 2007;46:975–9.

16. Emery P, Breedveld F, Hall S, et al. Comparison of methotrexate monotherapy with a combination of methotrexate and etanercept in active, early, moderate to severe rheumatoid arthritis (COMET): a randomised, double-blind, parallel treatment trial. Lancet 2008;372:375–82.

17. Emery P, Salmon M. Early rheumatoid arthritis: time to aim for remission? Ann Rheum Dis 1995;54:944–7.

18. Sokka T, Hetland M, Mäkinen H, et al. Remission and rheumatoid arthritis: Data on patients receiving usual care in twenty-four countries. Arthritis Rheum 2008;58:2642–51.

19. Glazier R, Dalby D, Badley E, et al. Management of the early and late presentations of rheumatoid arthritis: a survey of Ontario primary care physicians. CMAJ 1996;155:679-87.

20. Gamez-Nava J, Gonzalez-Lopez L, Davis P, Suarez-Almazor M. Referral and diagnosis of common rheumatic diseases by primary care physicians. Br J Rheumatol 1998;37(11):1215-9.

21. Bolumar F, Ruiz M, Hernandez I, Pascual E. Reliability of the diagnosis of rheumatic conditions at the primary health care level. J Rheumatol 1994;21(12):2344-8.

22. Badley E, Lee J. The consultant's role in continuing medical education of general practitioners: the case of rheumatology. Br Med J (Clin Res Ed) 1987;294:100-3.

23. Glazier RH, Badley EM, Wright JG, et al. Patient and provider factors related to comprehensive arthritis care in a community setting in Ontario, Canada. J Rheumatol 2003;30:1846-50.

24. Schneeweiss S, Avorn J. A review of uses of health care utilization databases for epidemiologic research on therapeutics. J Clin Epidemiol 2005;58:323-37.

25. Schwartz RM, Gagnon DE, Muri JH, Zhao QR, Kellogg R. Administrative data for quality improvement. Pediatrics 1999;103:291-301.

26. Jacobs P, Yim R. Using Canadian administrative databases to derive economic data for health technology assessments. Ottawa: Canadian Agency for Drugs and Technologies in Health; 2009.

27. Iglehart JK. Revisiting the Canadian health care system. N Engl J Med 2000;342:2007-12.

28. Ontario Health Insurance Plan: the program. Toronto: Ontario Ministry of Health and Long-Term Care, 2008. (Accessed at http://www.health.gov.on.ca/en/public/programs/ohip/.)

29. World Health Organization. International Classification of Diseases, Ninth Revision, Clinical Modification. Geneva, Switzerland: World Health Organization.

30. Ontario Drug Benefit: the program. Toronto: Ontario Ministry of Health and Long-Term Care, 2008. (Accessed April 2012, at http://www.health.gov.on.ca/english/providers/program/drugs/odbf_mn.html.)

31. Romano P. Can administrative data be used to compare the quality of health care? Med Care Rev 1993;50:451-77.

32. Improving health care data in Ontario. ICES Investigative Report. Toronto: Institute for Clinical Evaluative Sciences; 2005.

33. Tu K, Campbell NR, Chen ZL, Cauch-Dudek KJ, McAlister FA. Accuracy of administrative databases in identifying patients with hypertension. Open Med 2007;1:e18-26.

34. Hux JE TM. Patterns of prevalence and incidence of diabetes. Toronto: Institute for Clinical Evaluative Sciences; 2003.

35. Tennis P, Bombardier C, Malcolm E, Downey W. Validity of rheumatoid arthritis diagnoses listed in the Saskatchewan Hospital Separations Database. J Clin Epidemiol 1993;46:675–83.

36. Lix L, Yogendran M, Burchill C, et al. Defining and Validating Chronic Diseases: An Administrative Data Approach. Winnipeg, MB: Manitoba Centre for Health Policy; 2006.

37. Allebeck P, Ljungstrom K, Allander E. Rheumatoid arthritis in a medical information system: How valid is the diagnosis? Scand J Soc Med 1983;11:27-32.

38. Hakala M, Pollanen R, Nieminen P. The ARA 1987 Revised Criteria Select Patients with Clinical Rheumatoid Arthritis from a Population Based Cohort of Subjects with Chronic Rheumatic Diseases Registered for Drug Reimbursement. J Rheum 1993;20:1674-8.

39. Gabriel S. The sensitivity and specificity of computerized databases for the diagnosis of rheumatoid arthritis. Arthritis Rheum 1994;37:821-3.

40. Fowles JB, Lawthers AG, Weiner JP, Garnick DW, Petrie DS, Palmer RH. Agreement between physicians' office records and Medicare Part B claims data. Health Care Financ Rev 1995;16:189-99.

41. Katz J, Barrett J, Liang M, al e. Sensitivity and positive predictive value of medicare part B physician claims for rheumatologic diagnoses and procedures. Arthritis Rheum 1997;40:1594-600.

42. Losina E, Barrett J, Baron JA, Katz JN. Accuracy of Medicare claims data for rheumatologic diagnoses in total hip replacement recipients. J Clin Epidemiol 2003;56:515-9.

43. Pedersen M, Klarlund M, Jacobsen S, Svendsen A, Frisch M. Validity of rheumatoid arthritis diagnoses in the Danish National Patient Registry. Eur J Epidemiol 2004;19:1097-103.

44. Singh JA, Holmgren AR, Noorbaloochi S. Accuracy of veterans administration databases for a diagnosis of rheumatoid arthritis. Arthritis Rheum 2004;51:952-7.


45. Thomas SL, Edwards CJ, Smeeth L, Cooper C, Hall AJ. How accurate are diagnoses for rheumatoid arthritis and juvenile idiopathic arthritis in the general practice research database? Arthritis Rheum 2008;59:1314-21.

46. Kim S, Servi A, Polinski J, et al. Validation of rheumatoid arthritis diagnoses in health care utilization data. Arthritis Res Ther 2011;13:R32.

47. Hoy D, Brooks P, Woolf A, et al. Assessing risk of bias in prevalence studies: modification of an existing tool and evidence of interrater agreement. J Clin Epidemiol 2012;65:934-9.

48. Ng B, Aslam F, Petersen NJ, Yu HJ, Suarez-Almazor ME. Identification of rheumatoid arthritis patients using an administrative database: A veterans affairs study. Arthritis Care Res (Hoboken) 2012.

49. Lin KJ, Garcia Rodriguez LA, Hernandez-Diaz S. Systematic review of peptic disease incidence rates: do studies without validation provide reliable estimates? Pharmacoepidemiol Drug Saf 2011;20:718-28.

50. Ladouceur M, Rahme E, Pineau CA, Joseph L. Robustness of prevalence estimates derived from misclassified data from administrative databases. Biometrics 2007;63:272-9.

51. Bernatsky S, Lix L, O'Donnell S, Lacaille D, Network tC. Consensus Statements for the Use of Administrative Health Data in Rheumatic Disease Research and Surveillance. J Rheumatol 2012.

52. Goldfield N, Villani J. The use of administrative data as the first step in the continuous quality improvement process. Am J Med Qual 1996;11:S35-8.

53. Sorensen HT, Sabroe S, Olsen J. A framework for evaluation of secondary data sources for epidemiological research. Int J Epidemiol 1996;25:435-42.

54. Benchimol EI, Manuel DG, To T, Griffiths AM, Rabeneck L, Guttmann A. Development and use of reporting guidelines for assessing the quality of validation studies of health administrative data. J Clin Epidemiol 2011;64:821-9.

55. Moher D, Liberati A, Tetzlaff J, Altman DG, Group P. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. J Clin Epidemiol 2009;62:1006-12.

56. Medical Subject Headings. 1999. (Accessed at http://www.nlm.nih.gov/mesh/.)

57. EMBASE. Elsevier Inc., 2010. (Accessed at http://www.info.embase.com/.)

58. Spasoff R. Epidemiologic methods for health policy. New York, NY; 1999.

59. Shortliffe E, Cimino J. Biomedical Informatics: Computer Applications in Health Care and Biomedicine (3rd edition) New York Springer; 2006.

60. Bossuyt PM, Reitsma JB, Bruns DE, et al. The STARD statement for reporting studies of diagnostic accuracy: explanation and elaboration. Ann Intern Med 2003;138:W1-12.

61. Whiting PF, Weswood ME, Rutjes AW, Reitsma JB, Bossuyt PN, Kleijnen J. Evaluation of QUADAS, a tool for the quality assessment of diagnostic accuracy studies. BMC Med Res Methodol 2006;6:9.

62. Singh JA, Holmgren AR, Krug H, Noorbaloochi S. Accuracy of the diagnoses of spondylarthritides in veterans affairs medical center databases. Arthritis Rheum 2007;57:648-55.

63. Bernatsky S, Linehan T, Hanly JG. The accuracy of administrative data diagnoses of systemic autoimmune rheumatic diseases. J Rheumatol 2011;38:1612-6.

64. Rector TS, Wickstrom SL, Shah M, et al. Specificity and sensitivity of claims-based algorithms for identifying members of Medicare+Choice health plans that have chronic medical conditions. Health Serv Res 2004;39:1839-57.

65. Singh JA, Holmgren AR, Noorbaloochi S. Accuracy of Veterans Administration databases for a diagnosis of rheumatoid arthritis. Arthritis Rheum 2004;51:952-7.

66. Lix L, Yogendran M, Mann J. Defining and Validating Chronic Diseases: An Administrative data Approach: An Update with ICD-10-CA. Winnipeg: Manitoba Centre for Health Policy, University of Manitoba; 2008.

67. Singh J. Discordance Between Self-report of Physician Diagnosis and Administrative Database Diagnosis of Arthritis and Its Predictors. J Rheum 2009;36:1858-60.

68. Allebeck P, Ljungstrom K, Allander E. Rheumatoid arthritis in a medical information system: how valid is the diagnosis? Scand J Soc Med 1983;11:27-32.

69. Tennis P, Bombardier C, Malcolm E, Downey W. Validity of rheumatoid arthritis diagnoses listed in the Saskatchewan Hospital Separations Database. J Clin Epidemiol 1993;46:675-83.

70. Harrold LR, Yood RA, Andrade SE, et al. Evaluating the predictive value of osteoarthritis diagnoses in an administrative database. Arthritis Rheum 2000;43:1881-5.

71. Losina E, Barrett J, Baron JA, Katz JN. Accuracy of Medicare claims data for rheumatologic diagnoses in total hip replacement recipients. J Clin Epidemiol 2003;56:515-9.

72. Thomas SL, Edwards CJ, Smeeth L, Cooper C, Hall AJ. How accurate are diagnoses for rheumatoid arthritis and juvenile idiopathic arthritis in the general practice research database? Arthritis Rheum 2008;59:1314-21.

73. Hakala M, Pollanen R, Nieminen P. The ARA 1987 revised criteria select patients with clinical rheumatoid arthritis from a population based cohort of subjects with chronic rheumatic diseases registered for drug reimbursement. J Rheumatol 1993;20:1674-8.

74. Gabriel SE, Crowson CS, O'Fallon WM. A mathematical model that improves the validity of osteoarthritis diagnoses obtained from a computerized diagnostic database. J Clin Epidemiol 1996;49:1025-9.

75. Harrold LR, Saag KG, Yood RA, et al. Validity of gout diagnoses in administrative data. Arthritis Rheum 2007;57:103-8.

76. Icen M, Crowson CS, McEvoy MT, Gabriel SE, Maradit Kremers H. Potential misclassification of patients with psoriasis in electronic databases. J Am Acad Dermatol 2008;59:981-5.

77. Malik A, Dinnella JE, Kwoh CK, Schumacher HR. Poor validation of medical record ICD-9 diagnoses of gout in a veterans affairs database. J Rheumatol 2009;36:1283-6.

78. Chibnik LB, Massarotti EM, Costenbader KH. Identification and validation of lupus nephritis cases using administrative data. Lupus 2010;19:741-3.

79. Kim SY, Servi A, Polinski JM, et al. Validation of rheumatoid arthritis diagnoses in health care utilization data. Arthritis Res Ther 2011;13:R32.

80. Pepe M. The Statistical Evaluation of Medical Tests for Classification and Prediction: Oxford University Press; 2003.

81. Worster A, Haines T. Medical Record Review Studies: an Overview Israel Journal of Trauma, Intensive Care and Emergency Medicine 2002;2:18-23.

82. Gariepy G, Rossignol M, Lippman A. Characteristics of subjects self-reporting arthritis in a population health survey: distinguishing between types of arthritis. Can J Public Health 2009;100:467-71.

83. Kehoe R, Wu SY, Leske MC, Chylack LT, Jr. Comparing self-reported and physician-reported medical history. Am J Epidemiol 1994;139:813-8.

84. Chen G, Faris P, Hemmelgarn B, Walker R, Quan H. Measuring agreement of administrative data with chart data using prevalence unadjusted and adjusted kappa. BMC Med Res Methodol 2009;9:5.

85. Chubak J, Pocobelli G, Weiss NS. Tradeoffs between accuracy measures for electronic health care data algorithms. J Clin Epidemiol 2012;65:343-9 e2.

86. Carnahan R. Mini-Sentinel's systematic reviews of validated methods for identifying health outcomes using administrative data: summary of findings and suggestions for future research. Pharmacoepidemiology and Drug Safety 2012;21:90–9.

87. Stiles P, Boothroyd R, Robst J, Ray J. Ethically using administrative data in research: Medicaid administrators current practices and best practices recommendations. Administration & Society 2011;43:171-92.

88. Stiles P, Boothroyd R. Ethical Use of Administrative Data for Research Purposes; 2012.

89. Hogg W, Gyorfi-Dyke E, Johnston S, et al. Conducting chart audits in practice-based primary care research: a user's guide. Can Fam Physician 2010;56(5):495-6.

90. Arnett FC, Edworthy SM, Bloch DA, et al. The American Rheumatism Association 1987 revised criteria for the classification of rheumatoid arthritis. Arthritis Rheum 1988;31:315-24.

91. Canadian Institute for Health Information. The CIHI Data Quality Framework. Ottawa, Ont.: CIHI; 2009.

92. Chan B, Schultz S. Supply and utilization of general practitioner and family physician services in Ontario. Toronto, ON: Institute for Clinical Evaluative Sciences; 2005.

93. Hawker G, Badley E, Jaglal S, et al. Project for an Ontario Women’s Health Evidence-Based Report: Musculoskeletal Conditions. Toronto; 2010.

94. Badley E, Glazier R. Arthritis and related conditions in Ontario. Toronto Ontario.; 2004.

95. Gulliford MC. Intraclass correlation coefficient and outcome prevalence are associated in clustered binary data. J Clin Epidemiol 2005;58(3):246-51.

96. Bossuyt PM, Reitsma JB, Bruns DE, et al. Standards for Reporting of Diagnostic Accuracy. The STARD statement for reporting studies of diagnostic accuracy: explanation and elaboration. Ann Intern Med 2003;138:W1-12.

97. Fowles JB, Lawthers AG. Agreement between physicians' office records and Medicare Part B claims data. Health Care Financing Review 1995;16:189-200.

98. Widdifield J, Labrecque J, Lix L, et al. A Systematic Review and Critical Appraisal of Validation Studies to Identify Rheumatic Diseases in Health Administrative Databases. Arthritis Care and Research [in press] 2013.

99. Lacaille D, Anis A, Guh D, Esdaile J. Gaps in care for rheumatoid arthritis: a population study. Arthritis Rheum 2005;53:241-8.

100. Widdifield J, Bernatsky S, Paterson JM, et al. Quality care in seniors with new-onset rheumatoid arthritis: a Canadian perspective. Arthritis Care Res (Hoboken) 2011;63:53-7.

101. MacKay C, Canizares M, Davis AM, Badley EM. Health care utilization for musculoskeletal disorders. Arthritis Care Res (Hoboken) 2010;62:161-9.

102. Life with Arthritis in Canada: A personal and public health challenge; 2010.

103. Rodriguez LA, Tolosa LB, Ruigomez A, Johansson S, Wallander MA. Rheumatoid arthritis in UK primary care: incidence and prior morbidity. Scand J Rheumatol 2009;38:173-7.

104. Lacaille D, Hanly J, Lix LM, Stringer E, Bernatsky S. Rheumatology Billing Practices in Canada: A Survey. In: Canadian Arthritis Network Annual Scientific Conference. Quebec City, QC; 2011.


105. Lacaille D, Guh DP, Abrahamowicz M, Anis AH, Esdaile JM. Use of nonbiologic disease- modifying antirheumatic drugs and risk of infection in patients with rheumatoid arthritis. Arthritis Rheum 2008;59:1074-81.

106. Vinet E, Kuriya B, Widdifield J, Bernatsky S. Rheumatoid arthritis disease severity indices in administrative databases: a systematic review. J Rheumatol 2011;38:2318-25.

107. To T, Dell S, Dick PT, et al. Case verification of children with asthma in Ontario. Pediatr Allergy Immunol 2006;17:69-76.

108. Carley S, Dosman S, Jones SR, Harrison M. Simple nomograms to calculate sample size in diagnostic studies. Emerg Med J 2005;22:180-1.

109. Jones SR, Carley S, Harrison M. An introduction to power and sample size estimation. Emerg Med J 2003;20:453-8.

110. McCarthy W, Guo N. The Estimation of Sensitivity and Specificity of Clustered Binary Data. In: SAS.

111. Bernatsky S, Lix L, Lacaille D, O'Donnell S, Bombardier C. Consensus statements for the use of administrative databases in rheumatic disease research and surveillance. [abstract]. Arthritis Rheum 2011;63 Suppl 10:1881.

112. Tu K, Mitiku T, Guo H, Lee DS, Tu JV. Myocardial infarction and the validation of physician billing and hospitalization data using electronic medical records. Chronic Diseases in Canada 2010;30:141-6.

113. Tu K, Mitiku T, Lee DS, Guo H, Tu JV. Validation of physician billing and hospitalization data to identify patients with ischemic heart disease using data from the Electronic Medical Record Administrative data Linked Database (EMRALD). Can J Cardiol 2010;26:e225-8.

114. Tu K, Manuel D, Lam K, Kavanagh D, Mitiku TF, Guo H. Diabetics can be identified in an electronic medical record using laboratory tests and prescriptions. Journal of Clinical Epidemiology 2011;64:431-5.

115. Glazier R, Zagorski B, Rayner J. Comparison of Primary Care Models in Ontario by Demographics, Case Mix and Emergency Department Use, 2008/09 to 2009/10. Toronto: Institute for Clinical Evaluative Sciences; 2012.

116. Kralj B, Kantarevic J. Primary Care in Ontario: reforms, investments and achievements. Ontario Medical Review 2012. Available online: https://www.oma.org/Resources/Documents/PrimaryCareFeature.pdf.

117. MD Physician Services Inc. Canadian Medical Association: Physician remuneration options. Available Online: md.cma.ca. Ottawa; 2011.


118. Alamanos Y, Voulgari PV, Drosos AA. Incidence and prevalence of rheumatoid arthritis, based on the 1987 American College of Rheumatology criteria: a systematic review. Semin Arthritis Rheum 2006;36:182-8.

119. Hochberg MC. Changes in the incidence and prevalence of rheumatoid arthritis in England and Wales, 1970-1982. Semin Arthritis Rheum 1990;19:294-302.

120. Dugowson CE, Koepsell TD, Voigt LF, Bley L, Nelson JL, Daling JR. Rheumatoid arthritis in women. Incidence rates in group health cooperative, Seattle, Washington, 1987-1989. Arthritis Rheum 1991;34:1502-7.

121. Jacobsson LT, Hanson RL, Knowler WC, et al. Decreasing incidence and prevalence of rheumatoid arthritis in Pima Indians over a twenty-five-year period. Arthritis Rheum 1994;37:1158-65.

122. Kaipiainen-Seppanen O, Aho K, Isomaki H, Laakso M. Incidence of rheumatoid arthritis in Finland during 1980-1990. Ann Rheum Dis 1996;55:608-11.

123. Gabriel SE, Crowson CS, O'Fallon WM. The epidemiology of rheumatoid arthritis in Rochester, Minnesota, 1955-1985. Arthritis Rheum 1999;42:415-20.

124. Shichikawa K, Inoue K, Hirota S, et al. Changes in the incidence and prevalence of rheumatoid arthritis in Kamitonda, Wakayama, Japan, 1965-1996. Ann Rheum Dis 1999;58:751-6.

125. Spector TD, Hart DJ, Powell RJ. Prevalence of rheumatoid arthritis and rheumatoid factor in women: evidence for a secular decline. Ann Rheum Dis 1993;52:254-7.

126. Symmons D, Turner G, Webb R, et al. The prevalence of rheumatoid arthritis in the United Kingdom: new estimates for a new century. Rheumatology (Oxford) 2002;41:793-800.

127. Myasoedova E, Crowson CS, Kremers HM, Therneau TM, Gabriel SE. Is the incidence of rheumatoid arthritis rising?: results from Olmsted County, Minnesota, 1955-2007. Arthritis Rheum 2010;62:1576-82.

128. Linos A, Worthington JW, O'Fallon WM, Kurland LT. The epidemiology of rheumatoid arthritis in Rochester, Minnesota: a study of incidence, prevalence, and mortality. Am J Epidemiol 1980;111:87-98.

129. Isomaki H, Raunio J, von Essen R, Hameenkorpi R. Incidence of inflammatory rheumatic diseases in Finland. Scand J Rheumatol 1978;7:188-92.

130. Imanaka T, Shichikawa K, Inoue K, Shimaoka Y, Takenaka Y, Wakitani S. Increase in age at onset of rheumatoid arthritis in Japan over a 30 year period. Ann Rheum Dis 1997;56:313-6.

131. Kaipiainen-Seppanen O, Aho K, Isomaki H, Laakso M. Shift in the incidence of rheumatoid arthritis toward elderly patients in Finland during 1975-1990. Clin Exp Rheumatol 1996;14:537-42.


132. Doran M, Pond G, Crowson C, O’Fallon W, Gabriel S. Trends in incidence and mortality in rheumatoid arthritis in Rochester, Minnesota, over a forty-year period. Arthritis Rheum 2002;46:625–31.

133. Widdifield J, Bombardier C, Bernatsky S, et al. Accuracy of Canadian Health Administrative Databases in Identifying Patients with Rheumatoid Arthritis: A validation study using the Medical Records of Primary Care Physicians. 2012 ACR Annual Scientific Meeting [Abstract #926]. Arthritis Care and Research 2012.

134. Widdifield J, Bernatsky S, Paterson J, et al. Accuracy of Canadian Health Administrative Databases in Identifying Patients with Rheumatoid Arthritis [Part I]: A validation study using the Medical Records of Rheumatologists. Arthritis Care and Research 2012;[submitted].

135. Statistics Canada. Population counts for Canada, provinces and territories, census divisions and census subdivisions (municipalities), by urban and rural, 2011 Census: 100% data (table). Population and Dwelling Count Highlight Tables, 2011 Census. Ottawa (ON): Statistics Canada; 2011.

136. Fay M, Feuer E. Confidence intervals for directly standardized rates: a method based on the gamma distribution. Stat Med 1997;16:791–801.

137. Neovius M, Simard JF, Askling J, group As. Nationwide prevalence of rheumatoid arthritis and penetration of disease-modifying drugs in Sweden. Ann Rheum Dis 2011;70:624-9.

138. Uhlig T, Kvien TK. Is rheumatoid arthritis disappearing? Ann Rheum Dis 2005;64:7–10.

139. Badley EM, Canizares M, Mahomed N, Veinot P, Davis AM. Provision of orthopaedic workforce and implications for access to orthopaedic services in Ontario. J Bone Joint Surg Am 2011;93:863-70.

140. Badley E, Veinot P, Ansari H, Mackay C, Thorne C. Arthritis Community Research and Evaluation Unit 2007 survey of rheumatologists in Ontario. http://www.acreu.ca/pdf/pub5/08-03.pdf. 2008.

141. Statistics Canada. Land and freshwater area, by province and territory; 2005.

142. Hoovestol RA, Mikuls TR. Environmental exposures and rheumatoid arthritis risk. Curr Rheumatol Rep 2011;13:431-9.

143. Costenbader KH, Chang SC, Laden F, Puett R, Karlson EW. Geographic variation in rheumatoid arthritis incidence among women in the United States. Arch Intern Med 2008;168:1664-70.

144. Vieira VM, Hart JE, Webster TF, et al. Association between residences in U.S. northern latitudes and rheumatoid arthritis: A spatial analysis of the Nurses' Health Study. Environ Health Perspect 2010;118:957-61.


145. Kobayashi KM, Prus SG. Examining the gender, ethnicity, and age dimensions of the healthy immigrant effect: factors in the development of equitable health policy. Int J Equity Health 2012;11:8.

146. Cott CA, Gignac MA, Badley EM. Determinants of self rated health for Canadians with chronic disease and disability. J Epidemiol Community Health 1999;53:731-6.

147. Rural and northern health care framework. 2011. (Accessed at http://www.health.gov.on.ca/en/public/programs/ruralnorthern/docs/report_rural_northern_EN.pdf.)

148. Aboriginal Affairs and Northern Development Canada.

149. Chronic Diseases in the Métis Nation of Ontario; 2012.

150. Glazier R, Moineddin R, Agha M, et al. The impact of not having a primary care physician among people with chronic conditions. Toronto (ON): Institute for Clinical Evaluative Sciences; 2008.

151. Schwalfenberg GK, Genuis SJ, Hiltz MN. Addressing vitamin D deficiency in Canada: a public health innovation whose time has come. Public Health 2010;124:350-9.

152. Wasko MC, Kay J, Hsia EC, Rahman MU. Diabetes mellitus and insulin resistance in patients with rheumatoid arthritis: risk reduction in a chronic inflammatory disease. Arthritis Care Res (Hoboken) 2011;63:512-21.

153. Song GG, Bae SC, Lee YH. Association between vitamin D intake and the risk of rheumatoid arthritis: a meta-analysis. Clin Rheumatol 2012.

154. Tu K, Chen Z, Lipscombe LL; Canadian Hypertension Education Program Outcomes Research Team. Prevalence and incidence of hypertension from 1995 to 2005: a population-based study. CMAJ 2008;178:1429-35.

155. Lipscombe L, Hux J. Trends in diabetes prevalence, incidence, and mortality in Ontario, Canada 1995–2005: a population-based study. Lancet 2007;369:750–6.

156. Gershon A, Wang C, Wilton A, et al. Trends in chronic obstructive pulmonary disease prevalence, incidence, and mortality in Ontario, Canada, 1996 to 2007: a population-based study. Arch Intern Med 2010;170:560–5.

157. Hennekens C, Buring J. Epidemiology in Medicine. Lippincott Williams & Wilkins; 1987.

158. Cronbach L, Meehl P. Construct validity in psychological tests. Psychological Bulletin 1955;52:281-302.

159. Goodwin L. Changing Conceptions of Measurement Validity: An Update on the New Standards. Journal of Nursing Education 2002;41:100-6.


160. Brooks P, Boers M, Simon LS, Strand V, Tugwell P. Outcome measures in rheumatoid arthritis: the OMERACT process. Expert Rev Clin Immunol 2007;3:271-5.

161. White E, Armstrong B. Principles of exposure measurement in epidemiology: Collecting, evaluating, and improving measures of disease risk factors. 2nd ed. New York, NY: Oxford University Press; 2008.

162. Winkelmayer WC, Schneeweiss S, Mogun H, Patrick AR, Avorn J, Solomon DH. Identification of individuals with CKD from Medicare claims data: a validation study. Am J Kidney Dis 2005;46:225-32.

163. Koepsell T, Weiss N. Epidemiologic methods: studying the occurrence of illness. New York, NY: Oxford University Press; 2003.

164. Ozminkowski R, Burton W, Goetzel R, Maclean R, Wang S. The impact of rheumatoid arthritis on medical expenditures, absenteeism, and short-term disability benefits. J Occup Environ Med 2006;48(2):135-48.

165. Ahern M, Smith M. Rheumatoid arthritis. Medical Journal of Australia 1997;166:156-61.

166. Prevoo M, et al. Remission in a prospective study of patients with rheumatoid arthritis. American Rheumatism Association preliminary remission criteria in relation to the disease activity score. British Journal of Rheumatology 1996;35:1101-5.
