THE DIAGNOSTIC INTERVAL OF COLORECTAL CANCER PATIENTS IN ONTARIO BY DEGREE OF RURALITY

by

Leah Hamilton

A thesis submitted to the Graduate Program in the Department of Public Health Sciences

in conformity with the requirements for the

Degree of Master of Science

Queen’s University

Kingston, Ontario, Canada

(September, 2015)

Copyright © Leah Hamilton, 2015

Abstract

Background: Wait times while moving through the cancer diagnostic process are a public health concern. Rural populations may experience more challenges in accessing cancer care, which could translate into a longer diagnostic interval and represent a healthcare inequity. This project analyzed the association between rurality of residence and the diagnostic interval of colorectal cancer (CRC) patients in Ontario, Canada.

Methods: This was a retrospective population-based cohort study. We used administrative databases available through the Institute for Clinical Evaluative Sciences (ICES) to identify incident CRC cases diagnosed from Jan 1, 2007 to May 31, 2012. We assigned each patient a rurality score, based on their census subdivision, and calculated the length of their diagnostic interval. We defined the diagnostic interval as the time (in days) from a patient’s first diagnostic-related encounter with the health care system to the CRC diagnosis date. Data linkage through ICES allowed us to describe variations in cancer stage and the diagnostic interval by degree of rurality of patient residence and to analyze associations through multivariable models taking into account potential confounders.

Results: Overall, the median diagnostic interval of the CRC cohort was 64 (IQR: 22-159) days and the 90th percentile was 288 days. Patients with stage I CRC had a longer median diagnostic interval than patients with stage IV CRC. Across rurality categories, a significant difference in median diagnostic interval was detected in the stage I stratum only, ranging from 58.5 to 108 days (p=0.0005), but with the most rural group having the shortest diagnostic interval. Results from adjusted multivariable models suggested that patients in mid-ranged rural categories had similar or longer diagnostic intervals compared to patients in the least rural category while patients in the most rural category maintained the shortest diagnostic intervals. Important covariates included age, comorbidities and CRC sub-site.

Conclusion: Our results do not support a rurality effect on stage or the diagnostic interval in the hypothesized direction. Estimates of a shorter interval in the most rural category, especially for stage I disease, call for a deeper analysis to better understand care delivery in those areas and patient characteristics that might affect the interval.

Co-Authorship

This thesis is the work of Leah Hamilton in collaboration with her supervisor Dr. Patti A. Groome and collaborator Ms. Colleen E. Webber. Thesis design represents the work of Leah Hamilton, Dr. Patti A. Groome and Ms. Colleen E. Webber. The clinical consultant, Dr. Geoffrey A. Porter, provided advice and clinical insights throughout the thesis. Dr. Jennifer A. Flemming was consulted while creating the CRC symptom status algorithm and Dr. Hugh Langley provided insight on fecal occult blood test (FOBT) Ontario Health Insurance Plan (OHIP) fee codes. Ms. Marlo Whitehead, senior analyst at the Institute for Clinical Evaluative Sciences (ICES) Health Services Research Facility at Queen’s University, performed the database linkages and data cuts. Thesis writing, statistical analyses and interpretations were the work of Leah Hamilton with guidance and suggestions from Patti A. Groome and Colleen E. Webber. Editorial feedback was also provided by Patti A. Groome, Colleen E. Webber and Dr. Geoffrey A. Porter.


Acknowledgements

It has been a true pleasure to work with and learn from so many individuals throughout the duration of this thesis project. The efforts of numerous people went into this final piece of work and I am sincerely grateful.

Firstly, I would like to thank my supervisor, Dr. Patti Groome, for making this project possible. Her consistent guidance and support as well as her epidemiological expertise and dedication to student mentorship allowed me to conceptualize and complete this thesis. Dr. Groome has a gift for teaching / “academic translating” that I have benefitted from greatly. I will take much learning from both the thesis and weekly lab meetings with me in the future. I would also like to thank Ms. Colleen Webber, my thesis collaborator-extraordinaire. Ms. Webber was a part of every meeting, e-mail and draft of the thesis project and the end product truly benefitted from her input and perspective starting from day one. I will always appreciate being able to wheel my chair over to her desk to recap on the latest decisions and feedback.

This project also greatly benefited from the clinical expertise and insight of Dr. Geoffrey Porter, the clinical consultant for the thesis. Dr. Porter provided invaluable feedback throughout the thesis process.

I would also like to thank Dr. Jennifer Flemming and Dr. Hugh Langley for their clinical feedback.

This project would not have been possible without the help of the Institute for Clinical Evaluative Sciences (ICES) staff, in particular Ms. Marlo Whitehead and Ms. Susan Rohland. I thank Ms. Whitehead for her patience through the many iterations of the project’s dataset creation plan and with my daily interruptions during the analysis phase with requests for data transfers and coding questions. I thank Ms. Rohland for her encyclopedic knowledge of all things ICES-privacy and ethics-related as well as her dedication to following up on ongoing processes/approvals.

A big thank you to the Groome Lab (Ally Mahar, Colleen Webber, Li Jiang, Kim Foley, John Queenan and Graham Smith) at Cancer Care and Epidemiology (CCE) for creating such a great work environment. I truly valued the support, enthusiasm and feedback for my project both during lab meetings and in our pod. Thank you to my classmates in the Department of Public Health Sciences, who made my time in Kingston so enjoyable and on whom I relied during all phases of the program.

I would also like to acknowledge the financial support from the Department of Public Health Sciences/Queen’s Graduate Program and from Dr. Patti Groome.

Finally, I need to thank my parents, sisters, housemates and friends for their endless support, love and encouragement.


Table of Contents

Abstract
Co-Authorship
Acknowledgements
List of Figures
List of Tables
List of Abbreviations
Chapter 1 Introduction
  1.1 Background and Rationale
  1.2 Study Design Overview
  1.3 Study Objectives
  1.4 Thesis Outline
Chapter 2 Literature Review
  2.1 Colorectal Cancer
    2.1.1 Epidemiology
    2.1.2 Biology
    2.1.3 Risk Factors
    2.1.4 Signs and Symptoms
    2.1.5 Diagnostic Investigations and the Diagnostic Pathway
    2.1.6 CRC Stage
  2.2 CRC Diagnostic Interval
    2.2.1 Definition
    2.2.2 Quantifying the CRC Diagnostic Interval
    2.2.3 Factors Related to the CRC Diagnostic Interval
    2.2.4 Waiting Time Paradox
    2.2.5 Impact of the Diagnostic Interval
    2.2.6 Wait Time Targets
  2.3 Conceptual Framework
  2.4 Rurality and the CRC Diagnostic Interval
    2.4.1 Definitions for Rurality
    2.4.2 Rural Patients as a Vulnerable Population
  2.5 Summary
Chapter 3 Methods
  3.1 Purpose
  3.2 Empirical Objectives and Hypotheses
  3.3 Overview
  3.4 Student’s Contribution
  3.5 Study Population
  3.6 Study Time Frame
  3.7 Data Sources
    3.7.1 Ontario Cancer Registry
    3.7.2 Ontario Health Insurance Plans Claims Database (OHIP)
    3.7.3 Registered Persons Database Files (RPDB)
    3.7.4 The Canadian Institute for Health Information Discharge Abstract Database (CIHI-DAD)
    3.7.5 National Ambulatory Care Reporting System (NACRS)
    3.7.6 Ontario Census Area Profiles
    3.7.7 ICES Physician Database (IPDB)
  3.8 Study Variables
  3.9 Rurality (exposure)
  3.10 Stage at Diagnosis (outcome objective one, stratification variable objective two)
  3.11 Diagnostic Interval (outcome objective two)
    3.11.1 Identifying Key CRC Tests
    3.11.2 Identifying the Index Contact Date from Key CRC Tests
    3.11.3 Using Control Charts to Determine Look-Back Periods for Key CRC Tests
    3.11.4 Identification and Retrieval of Missing Index Contact Dates
    3.11.5 Refining the Colonoscopy and Sigmoidoscopy Capture
  3.12 Potential Causal Pathway Variables: CRC symptom status, Emergency Department (ED) Presentation and First Test
    3.12.1 CRC Symptom Status at First Test
    3.12.2 Emergency Department (ED) Presentation
    3.12.3 First Test
  3.13 Covariates- Rationale for Inclusion and Variable Description
  3.14 Code Determination Summary
  3.15 Summary of Variables
  3.16 Statistical Analysis
    3.16.1 Cohort Description and Representativeness
    3.16.2 Objective One
    3.16.3 Objective Two
    3.16.4 Regression Diagnostics
  3.17 Minimum Detectable Effect
  3.18 Ethical Considerations
Chapter 4 Results
  4.1 Overview
  4.2 The CRC Cohort
    4.2.1 CRC Cohort Selection
    4.2.2 Description of the CRC Cohort by RIO Groups
    4.2.3 Cohort Representativeness
  4.3 Objective One: Comparison of the Ontario CRC Stage Distribution across Regions Grouped by RIO Categories
    4.3.1 CRC Stage Distribution across Regions Grouped by RIO Categories
    4.3.2 Multivariable Analysis: Logistic Regression
    4.3.3 Description of the stage distribution by RIO group stratified by potential causal pathway variables
  4.4 Objective Two: Comparison of the CRC Diagnostic Interval Across Regions Grouped by RIO Categories and Stratified by Stage
    4.4.1 The CRC Cohort Diagnostic Interval
    4.4.2 The CRC Cohort Diagnostic Interval Distribution across RIO Categories Stratified by Stage
    4.4.3 Multivariable Analysis: Quantile regression (50th/median and 90th percentiles)
    4.4.4 Description of the diagnostic interval distribution by RIO group stratified by potential causal pathway variables
  4.5 Regression Diagnostics
Chapter 5 Discussion
  5.1 Summary of Key Findings
  5.2 Discussion of Key Findings
  5.3 Strengths and Limitations
    5.3.1 Strengths
    5.3.2 Limitations
  5.4 Study Contribution
  5.5 Public Health Implications and Future Research Directions
  5.6 Conclusions
Appendix A Dataset Creation Plan
Appendix B List of CSDs and Respective RIO Scores
Appendix C Comparison of the Diagnostic Interval between Patients with and without a Referral for First Test
Appendix D Control Charts
Appendix E Rules for Interpreting Control Charts
Appendix F Signal Strength Calculation for Relevant Look-Back Periods
Appendix G Linearity Assumption Tests for Age, Comorbidities and Deprivation Quintile (Quantile Regression Models)
Appendix H Linearity Assumption Tests for Comorbidities (logistic regression model)
Appendix I Ethics Approval
Appendix J Description of the CRC Cohort by Stage (I-IV)
Appendix K Median Quantile Regression Results for Factors Associated with the Diagnostic Interval
Appendix L 90th Quantile Regression Results for Factors Associated with the Diagnostic Interval
Appendix M Median Quantile Regression Diagnostics
Appendix N 90th Quantile Regression Diagnostics
Appendices References

List of Figures

Figure 2-1: Anatomical sub-sites of the colon and rectum with corresponding ICD-9 codes
Figure 2-2: Time intervals along the pathway from first symptoms to start of treatment
Figure 2-3: Conceptual framework adapted from Zapka and the Chronic Care Model
Figure 3-1: Data sources linked through the Institute for Clinical Evaluative Sciences (ICES)
Figure 3-2: Study variable conceptual model for objectives 1 and 2
Figure 3-3: Generalized diagnostic interval derivation
Figure 3-4: The diagnostic interval is the time from the index contact date to the CRC diagnosis
Figure 3-5: Algorithm to assign CRC symptom status: asymptomatic or symptomatic
Figure 4-1: Flow chart describing the CRC cohort selection
Figure 4-2: CRC stage distribution by Rurality Index for Ontario (RIO) group
Figure 4-3: Distribution of the diagnostic interval (days) by Rurality Index for Ontario (RIO) groups stratified by stage
Figure 4-4: Distribution of the diagnostic interval (days) by Rurality Index for Ontario (RIO) groups stratified by stage
Figure 4-5: Distribution of the diagnostic interval (days) by Rurality Index for Ontario (RIO) groups stratified by stage

List of Tables

Table 2-1: Tumour, node, metastasis (TNM) CRC stage description, distribution and 5-year relative survival rate
Table 2-2: Examples of studies that quantified a measure of the diagnostic interval
Table 3-1: Relevant look-back periods and signal strength (%) for the eight key CRC tests determined by control charts
Table 3-2: OHIP colonoscopy and sigmoidoscopy fee codes used to create the CRC symptom status algorithm
Table 3-3: Brief rationale for covariate inclusion
Table 3-4: CRC sub-site grouping scheme
Table 3-5: Code determination summary
Table 3-6: Summary of study variables
Table 4-1: Description of CRC patients by RIO group (Jan 01, 2007 to May 31, 2012 N=27,942)
Table 4-2: Comparison of study variable distributions between patients in the final cohort (N=27,942) and patients who were excluded from the final cohort due to a missing index contact date (N=1,189)
Table 4-3: Comparison of study variable distributions between patients in the final cohort (N=27,942) and patients who were excluded from the final cohort due to a missing RIO score (N=305)
Table 4-4: Comparison of select study variable distributions between patients with a rectal cancer (N=6,818) and patients with a rectosigmoid junction cancer (N=2,140), who were erroneously omitted from the CRC cohort
Table 4-5: Unadjusted and adjusted odds ratios (ORs) for late stage (III, IV) CRC, statistically significant estimates are in bold
Table 4-6: Comparison of CRC symptom status (asymptomatic or symptomatic) distribution and stage (I-IV) across RIO groups
Table 4-7: Comparison of emergency department (ED) presentation (yes or no) distribution and stage (I-IV) across RIO groups
Table 4-8: Description of the CRC cohort by median and 90th percentile diagnostic interval (days)
Table 4-9: Adjusted difference (days) in the median and 90th percentile diagnostic interval (95% CI) by RIO group stratified by stage
Table 4-10: Comparison of CRC symptom status (asymptomatic or symptomatic) distribution and diagnostic interval (days) (50th/median and 90th percentiles) across RIO groups
Table 4-11: Comparison of emergency department (ED) presentation (yes or no) distribution and diagnostic interval (days) (50th/median and 90th percentiles) across RIO groups
Table 4-12: Comparison of first test distribution and diagnostic interval (days) (50th/median and 90th percentiles) across RIO groups

List of Abbreviations

ACG the Johns Hopkins Adjusted Clinical Group
ADG the Johns Hopkins Aggregated Diagnosis Groups
AJCC American Joint Committee on Cancer
CAN-Marg Canadian Marginalization Index
CCI Canadian Classification of Health Interventions
CCO Cancer Care Ontario
CIHI/DAD the Canadian Institute for Health Information Discharge Abstract Database
CSD Census Subdivision
CT Computed Tomography
DX Diagnosis
ED Emergency Department
ED NOS Emergency Department Not Otherwise Specified
FAP Familial Adenomatous Polyposis
FP Family Physician
gFOBT Guaiac Fecal Occult Blood Test
ICD-9 International Classification of Diseases, Ninth Revision
ICD-10-CA International Classification of Diseases and Related Health Problems, Tenth Revision, Canada
ICES Institute for Clinical Evaluative Sciences
IKN ICES key number
IPDB ICES Physician Database
MOHLTC Ontario Ministry of Health and Long-Term Care
NACRS National Ambulatory Care Reporting System
OCR Ontario Cancer Registry
OHIP Ontario Health Insurance Plan
OMA Ontario Medical Association
ON-Marg Ontario Marginalization Index
RIO Rurality Index for Ontario
RPDB Registered Persons Database
SES Socioeconomic Status
TNM Tumour, Node, Metastasis
UICC Union for International Cancer Control
US Ultrasound

Chapter 1

Introduction

1.1 Background and Rationale

In Canada, colorectal cancer (CRC) is the second most common type of cancer diagnosed in men and the third most common type of cancer diagnosed in women (1). CRC accounts for approximately 14% of new cancer cases in men, 12% of new cancer cases in women and 12% of cancer deaths in Canada (1). CRC stage at diagnosis is one of the most important prognostic factors (2) and 5-year relative survival rates vary greatly from about 87-92% for stage I to 11-12% for stage IV (3).

The CRC diagnostic pathway is made up of a series of steps which generally consist of primary care appointments, family physician-directed diagnostic tests, referrals to specialists, consultations with specialists and specialist-directed diagnostic tests (4). Individual patient pathways will vary and CRC can also be diagnosed through screening and through the emergency department (ED). CRC can be particularly difficult to diagnose as symptoms can be vague and can appear to be self-limiting (5), which represents a challenge for family physicians (FPs) who are often the first point of contact. Some element of delay while moving through the cancer diagnostic process may be inevitable; however, there is likely room for considerable improvement for many patients (6). Currently, there is conflicting literature as to whether the length of the diagnostic process can influence CRC outcomes like stage at diagnosis and survival, since CRC is a slowly progressing disease (7–17). However, efforts to make this process more efficient should not be abandoned, as this is often a time of increased psychological stress for patients and families (15) and longer diagnostic intervals may be indicative of a less efficient diagnostic process (18). Furthermore, evidence from another Canadian province indicates that the length of the CRC diagnostic process is increasing and may reach a threshold at which the delay begins to affect clinical outcomes (15). The length of this CRC diagnostic process at a provincial population level is unknown in Ontario.

Rural patients are vulnerable to poor health outcomes (19,20), and experience inequities in access to cancer services along the cancer care trajectory, which includes the diagnostic process (21). Geographic access to services may play an important role. Given that equity in health care is a pillar of the Canadian healthcare system, it is important to study the diagnostic interval across the degree of rurality of a patient’s residence. If the length of the diagnostic process is longer in rural areas, it may be associated with, and therefore ameliorated by, changes in the health care services that lead to a CRC diagnosis.

1.2 Study Design Overview

This was a retrospective population-based cohort study of all Ontario incident CRC cases diagnosed from Jan 1, 2007 to May 31, 2012. We used administrative databases available through the Institute for Clinical Evaluative Sciences (ICES) Health Services Research Facility at Queen’s University to assign each patient a score describing the degree of rurality of their residence (Rurality Index for Ontario score) and to calculate their CRC diagnostic interval. Fee codes that represented key CRC medical tests in the administrative databases were used to identify CRC diagnosis-related encounters. The earliest of these CRC diagnosis-related encounters, for a given patient, identified their index contact date with the health care system. We defined the diagnostic interval as the time between that index contact date and the date the patient was diagnosed with CRC. Data linkage through ICES allowed us to describe variation in cancer stage and the diagnostic interval by degree of rurality of patient’s residence (Rurality Index for Ontario score) and to analyze those associations controlling for potential confounders. We also explored potential causal pathway variables that might influence those associations.
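The interval definition above can be illustrated with a minimal Python sketch. It is not the study's algorithm: the patient identifiers, dates and the function name diagnostic_intervals are hypothetical, the test-specific look-back periods described in Chapter 3 are not applied, and ignoring encounters dated after the diagnosis is an assumption made only for this illustration.

```python
from datetime import date
from statistics import median

# Hypothetical diagnosis-related encounters: (patient_id, encounter_date).
# These records are invented for illustration and are not study data.
encounters = [
    ("A", date(2010, 1, 5)),
    ("A", date(2010, 2, 20)),
    ("B", date(2011, 6, 1)),
    ("B", date(2011, 7, 1)),   # after diagnosis, ignored (assumption)
    ("C", date(2009, 3, 10)),
    ("C", date(2009, 3, 25)),
]

# Hypothetical CRC diagnosis dates per patient.
diagnosis_dates = {
    "A": date(2010, 3, 6),
    "B": date(2011, 6, 15),
    "C": date(2009, 9, 6),
}


def diagnostic_intervals(encounters, diagnosis_dates):
    """Return {patient_id: days from index contact date to diagnosis}.

    The index contact date is taken as the earliest encounter on or
    before the diagnosis date; no look-back window is applied here.
    """
    index_dates = {}
    for patient_id, encounter_date in encounters:
        dx_date = diagnosis_dates.get(patient_id)
        if dx_date is None or encounter_date > dx_date:
            continue
        current = index_dates.get(patient_id)
        if current is None or encounter_date < current:
            index_dates[patient_id] = encounter_date
    return {
        patient_id: (diagnosis_dates[patient_id] - index_date).days
        for patient_id, index_date in index_dates.items()
    }


intervals = diagnostic_intervals(encounters, diagnosis_dates)
median_interval = median(list(intervals.values()))
print(intervals)        # {'A': 60, 'B': 14, 'C': 180}
print(median_interval)  # 60
```

In this toy example each patient contributes one interval, and the cohort is summarized by the median, in line with the percentile-based summaries reported in this thesis.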

1.3 Study Objectives

The purpose of this study was to describe the variation in the length of the diagnostic interval for colorectal cancer (CRC) patients in Ontario by degree of rurality of the patient’s residence. This study also refined an existing CRC diagnostic interval algorithm by Singh and colleagues (22) and a CRC symptom status algorithm (23), thereby creating two fundamental resources for future research in this area. More specifically, the two study objectives were:

1. To compare the Ontario CRC stage distribution across census subdivisions grouped by Rurality Index for Ontario (RIO) categories.

2. To determine whether patients living in more rural census subdivisions (higher RIO scores) have a longer system-related diagnostic interval than patients living in less rural census subdivisions (lower RIO scores) after stratifying by stage and controlling for potential confounders.

1.4 Thesis Outline

This thesis is organized into five chapters. Following the introduction is Chapter 2, the literature review. The literature review provides context for the thesis and covers basic CRC epidemiology and biology, a description of the general CRC diagnostic pathway, a definition of the diagnostic interval, factors influencing the diagnostic interval, rurality and the diagnostic interval as well as a conceptual framework for the study. Chapter 3 describes the study methods in detail and includes sections on the study population, data sources, study variables and derivation as well as a description of the statistical analyses. Results are presented in Chapter 4 and focus on a description of the cohort, cohort representativeness, findings from objectives 1 and 2 as well as regression diagnostics. The final segment of the thesis is Chapter 5, which presents the discussion and conclusions. This last chapter includes a discussion of the key findings in the context of the literature, the study strengths and limitations as well as the public health implications and future research directions.

Chapter 2

Literature Review

2.1 Colorectal Cancer

2.1.1 Epidemiology

Colorectal cancer (CRC) is the second most common type of cancer diagnosed in Canada in men and the third most common type of cancer diagnosed in women (1). In Canada, CRC accounts for approximately 14% of new cancer cases in men and 12% in women (1). The age-standardized incidence rate in Canada for males and females is 59 per 100,000 and 40 per 100,000, respectively (1). The estimated age-standardized incidence rate in Ontario is slightly below the national estimate and corresponds to approximately 5,100 incident cases of CRC diagnosed in men and 4,100 diagnosed in women, during 2015 (1). CRC is the second most common cause of death from cancer in men and the third most common cause of death from cancer in women, representing 12.4% and 11.5% of estimated cancer deaths in Canada, respectively (1). The age-standardized mortality rate of CRC in Canada in 2015 was 20 per 100,000 in men and 13 per 100,000 in women (1). In 2015 there will be approximately 3,350 CRC deaths in Ontario. The 10-year person-based CRC prevalence in Canada (as of Jan 1, 2009) was 105,195 people (1). The age-standardized five-year relative survival ratio for CRC is 64% (95% CI: 64, 65) overall, although survival is highly dependent upon stage at diagnosis (1). Five-year relative survival rates for colon cancer by stage range from 92% for stage I to 11% for stage IV and from 87% for stage I to 12% for stage IV rectal cancer (3). The stage distribution is relatively equal across the four CRC stages. In Ontario, the population-based distribution of CRC stage in 2011 was: 22% stage I, 26% stage II, 28% stage III and 20% stage IV (the remaining 4% was stage unknown) (24). A very similar stage distribution is seen in other provinces (25).

There are some fascinating time and geographic trends in CRC incidence and mortality. Starting in the mid-1980s, CRC incidence declined until the mid-1990s. The incidence then increased through to 2000, and has since been decreasing by about 0.7% per year (1,26). Although the incidence rate is decreasing, the number of new cases has increased markedly: there were approximately 12,500 incident CRC cases in Canada in 1983, compared with 25,100 in 2015 (1,26). This increase in cases is due to both a growing and an aging population.
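The contrast between a falling rate and rising case counts can be made concrete with a short arithmetic sketch in Python. The case counts and the 0.7% annual decline are taken from the text above; treating that decline as constant and compounding over the 15 years from 2000 to 2015 is a simplifying assumption for illustration, and the helper compound_change is hypothetical.

```python
def compound_change(annual_rate, years):
    """Cumulative proportional change from a constant annual rate."""
    return (1 + annual_rate) ** years - 1


# Incident CRC cases in Canada quoted in the text.
cases_1983 = 12_500
cases_2015 = 25_100
case_ratio = cases_2015 / cases_1983

# Assumption: a constant 0.7% annual decline in the age-standardized
# incidence rate, compounded over 2000-2015 (15 years).
rate_change = compound_change(-0.007, 15)

print(f"Cases in 2015 relative to 1983: {case_ratio:.2f}x")
print(f"Cumulative rate change, 2000-2015: {rate_change:.1%}")
```

Under these assumptions the rate falls by roughly a tenth while the number of cases about doubles, which is consistent with population growth and aging driving the case counts.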

There is a vast geographic difference in the incidence and mortality of CRC. Canada, Australia, New Zealand, the United States and parts of Europe have the highest age-standardized incidence rates in the world, while much lower rates are found in Asia, Africa and South America (26,27). This trend is likely linked to differences in diet and physical activity. Interestingly, a dramatic increase in the incidence rate of CRC can be found in countries that have recently transitioned from a traditional diet to a more “westernized” diet, as seen in Japan (26,28,29). Even within Canada a geographic trend is apparent. CRC incidence increases when travelling from the West Coast to the East Coast. British Columbia has the lowest incidence rate of 51 per 100,000 for men and 36 per 100,000 for women while Newfoundland has the highest rates at 85 per 100,000 for men and 53 per 100,000 for women (1). The same trend is observed in mortality rates. The range of incidence and mortality rates along the west-east gradient is striking and is more prominent than for any of the other major cancers (26). It is hard to pinpoint the reasons for this geographic pattern, but variations in risk factors (both lifestyle and genetic) are likely a contributing feature (26).

2.1.2 Biology

CRC is often further described by being grouped into one of three anatomical sub-sites of the large intestine: proximal colon, distal colon and rectum (see Figure 2-1). Cancers of the cecum, ascending colon, hepatic flexure, transverse colon and the splenic flexure are proximal colon cancers (or right-sided colon cancers). Continuing along the anatomy of the large intestine, cancers in the descending colon and the sigmoid colon are distal colon cancers (or left-sided colon cancers) and cancer in the rectosigmoid junction and the rectum are rectal cancers (30) (although there is variation depending on the source as to which specific sub-sites belong in each group; proximal, distal or rectal). Cancers of the appendix are often grouped into an “other and not otherwise specified” category or excluded. Cancers of the anus are not usually included as a CRC because they are biologically a different disease (26). The CRC sub-site distribution varies somewhat by sex. For CRCs diagnosed in Ontario from 2002-2006, the sub-site distribution in males was: 31% proximal, 25% distal and 38% rectal. The sub-site distribution for females was: 42% proximal, 22% distal and 28% rectal (30). The term “large intestine” is used interchangeably with “large bowel” (which is used especially in studies from the United Kingdom and Australia). For clarity, only “large intestine” will be used in this project.

[Figure 2-1 labels with ICD-9 codes. Proximal colon: cecum (153.4), ascending colon (153.6), hepatic flexure (153.0), transverse colon (153.1), splenic flexure (153.7). Distal colon: descending colon (153.2), sigmoid colon (153.3). Rectum: rectosigmoid junction (154.0), rectum (154.1). The appendix is also labelled.]

Figure 2-1: Anatomical sub-sites of the colon and rectum with corresponding ICD-9 codes (adapted from CCO) (30). The proximal (right) colon consists of the cecum, ascending colon, hepatic flexure, transverse colon and the splenic flexure. The distal (left) colon consists of the descending colon and the sigmoid colon. The rectum consists of the rectosigmoid junction and the rectum.

At the histologic level there are several different types of tumour cells that can develop in the colon and rectum. However, more than 94% are adenocarcinomas (31–33) with the remaining 6% comprised of carcinoid tumours, gastrointestinal stromal tumours, lymphomas and sarcomas. Adenocarcinomas are a type of cancer that starts in glandular cells in epithelial tissues. An adenoma, also called an adenomatous polyp in the large intestine, is a benign tumour that can transform into an adenocarcinoma (malignant). A well-established multi-step progression from abnormal epithelium of the large intestine, to polyp, to cancer has been found in CRC (34). It is important to note that polyps in the large intestine are quite prevalent in the adult population (35,36). Not all polyps will develop into CRC, but almost all CRCs come from polyps. CRC is a slowly progressing disease (37) and has a relatively long latency period of five to ten years (the time for benign adenomas to progress to malignancy) (38–40). The average tumour doubling time for a primary CRC is approximately 1.2-1.6 years (median 0.5-0.7 years) (33,41,42) (recurrent and metastatic colorectal tumours grow faster) (33,42). The most common site of distant metastasis is the liver; however, pulmonary and bony metastases are also frequent (43,44). CRC can spread through tissue, the lymph system and blood (31), but the rate of spread is largely unpredictable.

2.1.3 Risk Factors

There are many factors that affect the risk of developing CRC. CRC can develop in individuals who have inherited or acquired genetic predispositions and are exposed to risk factors (45). It is estimated that approximately 75%-95% of CRCs are sporadic (34,45). This emphasizes the role of lifestyle factors in the development of CRC. As noted earlier, there is a much greater incidence of CRC in geographic areas with a Western culture, and many of the risk factors are related to this lifestyle. Risk factors can be described as either non-modifiable or modifiable. Non-modifiable risk factors that are known to increase the risk of developing CRC include: age over 50, colorectal polyps, family history of CRC, certain genetic conditions like familial adenomatous polyposis and Lynch syndrome, inflammatory bowel disease (ulcerative colitis and Crohn’s disease), certain racial and ethnic backgrounds (e.g. African American, Ashkenazi descent), tall adult height, personal history of breast, ovarian or uterine cancer and exposure to ionizing radiation (38,46–48). Modifiable risk factors that are known to increase the risk of developing CRC include: diets high in red and processed meats, cooking meat at high temperatures, alcohol consumption, smoking, obesity, physical inactivity, diets low in fiber and type II diabetes (38,46–48). Several modifiable risk factors that have an association with CRC, but do not have definitive evidence, include: sedentary behavior, diets high in fat, removal of the gallbladder, Helicobacter pylori infection, by-products from chlorinated drinking water and night shift work (46,48).

It is hard to determine what proportion of CRC incidence is accounted for by each risk factor or combination of risk factors; however, several important risk factors are highlighted here. Approximately 95% of incident cases of CRC occur after age 50 in the Canadian population (26). Incidence and mortality are also generally higher in males for most age groups (26). Familial adenomatous polyposis (FAP) is a rare inherited genetic condition (only about 1% of CRCs are due to FAP) (48). In the most common type of FAP an individual will develop hundreds to thousands of colorectal polyps. Without removal of the polyps, CRC will almost certainly develop by age 40 (48).

A growing body of research indicates that lifestyle has a major role in CRC incidence. A cohort study of middle-aged American men (49) found that a third to a half of colon cancer risk (a fourth to a third in the distal colon sub-site) in the cohort could be prevented if exposure to six risk factors (obesity, physical inactivity, alcohol consumption, early adulthood cigarette smoking, red meat consumption and low intake of folic acid from supplements) were similar to that of men in the bottom 20% or 5% of the risk score distribution. In a prospective study on CRC risk factors in white American men (50), increased risk of colon cancer was found in: heavy cigarette smokers (≥30/day, RR=2.3, 95% CI: 0.9, 5.7), heavy beer drinkers (≥14 times/month, RR=1.9, 95% CI: 1.0, 3.8), white collar workers (RR=1.7, 95% CI: 1.0, 3.0), craft workers within service and trade industries (RR=2.6, 95% CI: 1.1, 5.8) and those with high red meat consumption (>2 times/day, RR=1.8, 95% CI: 0.8, 4.4).

As with some other types of cancer and chronic diseases, maintaining a healthy diet, being physically active, not smoking and limiting alcohol consumption are key factors in reducing the risk for CRC. Participation in CRC screening programs can also reduce both mortality and incidence of CRC (CRC screening in Ontario will be described in section 2.1.5).

2.1.4 Signs and Symptoms

The majority of CRCs are diagnosed symptomatically (51,52), although this proportion will likely start to decline now that population-based screening programs are active (in various stages of implementation) in all provinces. Diagnosing CRC can be difficult because symptoms can be quite vague (53). The most commonly reported symptoms include: rectal bleeding, abdominal pain, a change in bowel habits, anemia, occult bleeding, bloody stool, narrow stool, mucus in stools, feeling tired, unexplained weight loss, anorexia, constipation, diarrhea, nausea or vomiting, obstructive symptoms and tenesmus (4,51,54,55). A study in North Carolina (51) analyzed the hospital and office records kept by primary physicians of non-screened CRC patients before there was a suspicion or diagnosis of CRC and found that the majority of patients had anemia (57%) and occult bleeding (77%). Other common symptoms found in this study were: rectal bleeding (58%), abdominal pain (52%) and a change in bowel habits (51%). At the time of diagnosis patients presented with an average of 3.4 symptoms. The researchers were also able to identify three clusters of symptoms through factor analysis. These clusters are statistically intercorrelated and likely represent a shared pathophysiology. The clusters were: 1) anorexia, nausea, vomiting, abdominal pain or fatigue; 2) constipation or obstructive symptoms; and 3) diarrhea, mucus in stools, rectal pain or tenesmus.

Clinical presentation can vary by anatomical sub-site of the tumour. Patients with tumours located in the proximal colon often present with anemia and weight loss (56). Patients with tumours located in the distal colon and rectum (where the lumen of the intestine is narrower and stool is more solid) typically present with rectal bleeding, tenesmus and a change in bowel habits (56). Similar results were found in the study by Majumdar and colleagues (51); the researchers were able to use multiple logistic regression to create a rule for predicting a distal location of the cancer (sensitivity of 93%, specificity of 47%). The relationship between symptoms and CRC stage will be described in section 2.1.6.

The majority of CRC cases are still diagnosed symptomatically. While there are common symptoms for the disease, the diagnosis is often not straightforward as many of the symptoms are also present in the seemingly well population. Furthermore, there is a great deal of overlap between the symptoms of CRC and those of other, less severe gastrointestinal conditions. CRC symptoms can be generally related to the anatomical sub-site.

2.1.5 Diagnostic Investigations and the Diagnostic Pathway

CRC can either be diagnosed symptomatically or through CRC screening. The recommended diagnostic pathways for these two routes begin differently and it is important to have a general understanding of the diagnostic process when thinking about the diagnostic interval. The diagnostic pathway for symptomatic patients will be explored first followed by a brief description of Ontario’s provincial CRC screening program.

There are many steps in the diagnosis of symptomatic CRC and variation will exist amongst individual patients’ diagnostic pathways. The following description is a brief summary of Cancer Care Ontario’s (CCO) recommended CRC diagnosis pathway (4). The diagnostic process begins when the patient visits their FP and presents with one or more CRC symptoms (listed earlier). A focused history and physical exam would then be done by the FP. If the FP finds certain key signs or symptoms (such as a palpable rectal mass, unexplained iron-deficiency or a specific combination of rectal bleeding and other symptoms) the patient would be referred for a consult with an endoscopist (within a diagnostic assessment program if available). It is important to highlight that this pathway requires the FP to recognize that symptoms could be due to CRC and that a referral to a specialist is needed. If the patient’s symptoms do not meet the referral criteria the FP would use their clinical judgment to determine if there still remains a high degree of suspicion for CRC, in which case the patient would be referred for a consult with an endoscopist. If there is a low degree of suspicion, the symptoms would be treated and the patient would be seen again in 4-6 weeks to ensure symptom resolution (if unresolved after this time period, the patient is referred for an endoscopist consult) or the FP may order investigations such as abdominal x-ray, abdominal ultrasound, abdominal CT, barium enema, CT colonography or a Fecal Occult Blood Test (FOBT) to substantiate or expedite the referral. After consultation with an endoscopist the patient would have a colonoscopy (colonoscopy is a key diagnostic and screening test). If the colonoscopy was complete and abnormal results were found, a tissue sample would be sent to pathology, where it would be determined whether the sample was benign (patient enters colonoscopy surveillance) or malignant (cancer staged). In some cases the colonoscopy is not completed, either because a tumour prevents cecal intubation (patient referred to a surgeon) or for a technical reason (colonoscopy to be repeated or another form of investigation such as CT colonography completed). The endoscopist would determine if the mass/sample is greater than or less than 15 cm above the anal verge. If it is more than 15 cm above the anal verge it is considered a colon cancer; if it is 15 cm or less it is considered a rectal cancer. Different investigations would be completed during preoperative staging depending on whether the cancer is considered colon or rectal. These investigations include: CT pelvis, CT abdomen, CT chest, X-Ray chest, ultrasound liver, MRI liver, MRI pelvis and transrectal ultrasound. The clinical stage of the cancer would be determined from here. Treatment would be based on staging.
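The 15 cm rule described above can be written as a one-line decision. The following Python sketch is illustrative only; the function name and its argument are hypothetical and are not part of the CCO pathway document.

```python
def classify_subsite(distance_above_anal_verge_cm: float) -> str:
    """Hypothetical helper: label a tumour as colon or rectal using the 15 cm rule."""
    if distance_above_anal_verge_cm < 0:
        raise ValueError("distance must be non-negative")
    # More than 15 cm above the anal verge: colon cancer.
    # 15 cm or less above the anal verge: rectal cancer.
    return "colon" if distance_above_anal_verge_cm > 15 else "rectum"
```

Note that the boundary value of exactly 15 cm falls in the rectal category, as stated in the pathway description.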

The colorectal screening program in Ontario is the ColonCancerCheck Program. It was Canada’s first organized, population-based, CRC screening program and was launched in April 2008. ColonCancerCheck (2010) (57) recommendations are as follows: biennial screening using the guaiac fecal occult blood test (gFOBT) in average risk Ontarians aged 50-74, with positive results followed up by colonoscopy (the patient then joins the previously described “symptomatic” pathway). In Ontario, the FOBT kits are handed out by a patient’s FP (if an individual within the guidelines does not have a FP they can receive the kit through a pharmacy or Telehealth Ontario; unattached patients who go on to have positive FOBT results will then be assigned a care provider by the program). Those with an increased risk of CRC have a different screening recommendation. Increased risk is defined by ColonCancerCheck as having one or more first-degree relatives (parent, sibling or child) with CRC. Increased risk individuals should have a colonoscopy at age 50, or 10 years before the age at which the relative was diagnosed (whichever happens first). This screening strategy is supported by a meta-analysis (58) of three randomized controlled trials, which showed a 15% decrease in CRC mortality (RR 0.85, 95% CI: 0.78-0.92) with the screening regime. The ColonCancerCheck recommendations for screening also agree with the recommendations of the Canadian Association of Gastroenterology (59) (although these recommendations are more detailed).
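The "whichever happens first" rule for increased risk individuals amounts to taking a minimum. The Python sketch below is a minimal illustration, not program logic from ColonCancerCheck; the function names are hypothetical, and using the youngest diagnosis age when several relatives are affected is an assumption made here for the example.

```python
from typing import Iterable


def average_risk_gfobt_eligible(age: int) -> bool:
    """Average risk: biennial gFOBT is recommended for ages 50-74."""
    return 50 <= age <= 74


def first_colonoscopy_age(relative_diagnosis_ages: Iterable[int]) -> int:
    """Increased risk: age 50, or 10 years before the relative's age at diagnosis.

    Assumption (not stated in the text): with several affected first-degree
    relatives, the youngest age at diagnosis is used.
    """
    ages = list(relative_diagnosis_ages)
    if not ages:
        raise ValueError("at least one affected first-degree relative is required")
    # "Whichever happens first" is the smaller of the two candidate ages.
    return min(50, min(ages) - 10)
```

For example, a person whose parent was diagnosed at age 55 would be recommended a colonoscopy at 45, while a person whose sibling was diagnosed at 70 would still start at 50.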

Of the 3.4 million Ontarians who were in the target age group (50-74 years old) for colorectal screening in 2009-2010, 27% had an FOBT in the previous two years (57). Approximately 4.1% of those screened with an FOBT had an abnormal test result (positive test) and, following the ColonCancerCheck guidelines, should have a follow-up colonoscopy. About 5.4% of average risk participants with an abnormal FOBT were diagnosed with CRC (positive predictive value or PPV) (57). The sensitivity of the FOBT is 15-50% (44,60) whereas the sensitivity of a colonoscopy is about 95% (44,60).
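To give a sense of scale, the percentages above can be chained into approximate counts. The Python sketch below is a rough back-of-the-envelope calculation, not a figure reported by the program: it assumes that the PPV for average risk participants can be applied to every abnormal result, which overlooks participants who were not average risk or who did not complete a follow-up colonoscopy.

```python
# Figures quoted in the text (57); variable names are illustrative.
target_population = 3_400_000   # Ontarians aged 50-74 in 2009-2010
fobt_participation = 0.27       # had an FOBT in the previous two years
abnormal_rate = 0.041           # abnormal (positive) FOBT among those screened
ppv = 0.054                     # CRC diagnosed among abnormal FOBT results

screened = target_population * fobt_participation
abnormal = screened * abnormal_rate
# Assumption: the average-risk PPV is applied to all abnormal results.
expected_crc = abnormal * ppv

print(f"Screened with FOBT:  {screened:,.0f}")      # about 918,000
print(f"Abnormal FOBT:       {abnormal:,.0f}")      # about 37,638
print(f"CRC cases (approx.): {expected_crc:,.0f}")  # about 2,032
```

Under these assumptions, roughly two thousand cancers would be detected among just over nine hundred thousand people screened.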

After CRC has been diagnosed, either through the symptomatic or screening pathway, cancer stage must be determined; this is summarized in the following section.

2.1.6 CRC Stage

Accurate and uniform staging is critical in determining the treatment process for cancer patients. The standard staging system used for CRC is the TNM (tumour, node, metastasis) system (61) of the American Joint Committee on Cancer (AJCC) and the Union for International Cancer Control (UICC) (2). T, N and M components of the cancer are evaluated using standardized tables produced and continually updated by the AJCC and UICC (TNM scores can be assigned clinically as well as pathologically). Briefly, the T component describes how far the primary tumour has spread through the layers of the colon and rectum. The N component defines the extent of spread (if any) to nearby lymph nodes and the M component indicates if the cancer has metastasized to distant organs (2,61) (see Table 2-1 for a further description). In a process referred to as stage grouping, individual T, N and M scores are combined (using a standardized guide) to determine the stage of the cancer (61). Stage is expressed in Roman numerals from I (earliest stage) to IV (most advanced stage). Two older staging systems for CRC also exist: the Dukes system and the Astler-Coller system. Three advantages of the TNM staging system over other systems are that it is data-driven, that it has the capacity to change to allow for continual improvement and that its definitions are comprehensive to ensure uniformity (2). Stage is the single strongest predictor of survival for CRC patients (2) (see Table 2-1). Stage at diagnosis will also largely dictate the patient’s treatment protocol. The exact treatment path, however, is decided on a case by case basis. Surgery, chemotherapy and radiation are the three major treatments.

Table 2-1: Tumour, node, metastasis (TNM) CRC stage description, distribution and 5-year relative survival rate. Table adapted from Ballinger and Anggiansah (62)

TNM Stage         Description (2,61)                                Stage Distribution      5-year Relative Survival Rate (3)** (%)
                                                                    in Ontario (24)* (%)    Colon     Rectum

I                 Tumour invades the mucosa and/or the submucosa    22                      92        87
                  (innermost linings of the large intestine)

IIA, IIB          Tumour invades into or through the muscle layer,  26                      63-87     49-80
                  no lymph node involvement

IIIA, IIIB, IIIC  Lymph node involvement                            28                      53-89     58-84
                  Letter corresponds to the number of lymph nodes
                  involved (1-3 or 4+) and specifics of tumour
                  invasion

IV                Distant metastasis(es) present                    20                      11        12

*2011 (remaining 4% are stage unknown)
**2004-2010 (USA, SEER database)

There are also some trends in stage at diagnosis by anatomical sub-site of CRC. A large American study using administrative data found that cancers in the proximal region of the colon were less likely to be diagnosed at an early stage than cancers of the distal colorectum (defined here as: sigmoid colon, rectosigmoid junction and rectum); the proportion of early stage cancers increased from 31.9% in the proximal colon to 41.5% in the distal colorectum (63). A study conducted in Italy using older CRC data found a similar trend (64). The researchers hypothesized that these results could be attributed to many different factors including: differences in screening practices, differences in sub-site specific risk factors and the possibility of differences in tumour pathology by sub-site, although this relationship is not clear.

The relationship between CRC stage and specific symptoms is more ambiguous. Early stage CRC may or may not cause any symptoms (there is ample room for a tumour to grow in the abdomen) (65). Symptom severity is often greater in advanced stage CRC, when the tumour can be bigger and may have spread to other organs (65). A study in Denmark (55) found that patients were significantly less likely to have an advanced stage cancer (Dukes C, D) if the first symptom was rectal bleeding, for both colon and rectal cancer sites.

2.2 CRC Diagnostic Interval

2.2.1 Definition

Wait times while moving through the cancer diagnosis and treatment pathways are a public health concern (22). There is a lack of consistent and clearly defined variables used to describe patient pathways to cancer diagnosis (66). Terms used in the literature to describe the period of time it takes to reach a definitive CRC diagnosis include: diagnostic delay (22,55), time to diagnosis (8,15), lag time in diagnosis (11), wait time (67) and diagnostic interval (8,9). These terms can often be further disaggregated, creating many inconsistencies in research on early cancer diagnosis and making it difficult to compare studies in this field. We have chosen the term “diagnostic interval” to describe the time frame measured in this research. This decision was made in accordance with the Aarhus Statement (66). The Aarhus Statement (66) was created by an international Consensus Working Group to formulate standard definitions for early cancer diagnosis research. We define the diagnostic interval as the time from the patient’s first diagnostic-related encounter with the health care system to the definitive diagnosis of CRC. Figure 2-2 illustrates time intervals in the CRC care pathway and was adapted from Olesen and colleagues (68), whose model helped inform the Aarhus Statement (66). The total time interval can be broken down into multiple units including the patient, doctor and system intervals. This project examines the diagnostic interval only but Figure 2-2 is presented here to provide an understanding as to where the diagnostic interval fits within the pathway from first symptoms to start of treatment and to provide clear definitions.
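The definition above reduces to a date difference per patient, which can then be summarized by a median and a 90th percentile. The Python sketch below is a minimal illustration under stated assumptions: the dates are hypothetical (not thesis data), the function name is invented for the example, and the "inclusive" percentile interpolation is an arbitrary choice rather than the method used in this research.

```python
from datetime import date
from statistics import median, quantiles


def diagnostic_interval_days(first_encounter: date, diagnosis: date) -> int:
    """Days from the first diagnostic-related encounter to the definitive diagnosis."""
    if diagnosis < first_encounter:
        raise ValueError("diagnosis date precedes first encounter date")
    return (diagnosis - first_encounter).days


# Hypothetical (first encounter, diagnosis) date pairs for five patients.
patients = [
    (date(2010, 1, 4), date(2010, 1, 18)),
    (date(2010, 2, 1), date(2010, 3, 3)),
    (date(2010, 3, 1), date(2010, 4, 30)),
    (date(2010, 5, 10), date(2010, 8, 18)),
    (date(2010, 6, 1), date(2011, 1, 17)),
]

intervals = sorted(diagnostic_interval_days(first, dx) for first, dx in patients)

median_days = median(intervals)
# quantiles(n=10) returns the nine deciles; the last one is the 90th percentile.
p90_days = quantiles(intervals, n=10, method="inclusive")[-1]

print(intervals, median_days, p90_days)
```

Reporting the 90th percentile alongside the median captures the long right tail of the distribution, which the median alone would hide.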

[Figure: timeline of the pathway milestones (onset of symptoms; first diagnostic-related encounter with the health care system; first symptom investigation; initiate referral to specialist; first specialist visit; definitive diagnosis; start of treatment), overlaid with the total interval, the patient, doctor and system intervals, and the diagnostic and treatment intervals.]

Figure 2-2: Time intervals along the pathway from first symptoms to start of treatment, diagnostic interval in bold. Figure adapted from Olesen et al. (68) and the Aarhus Statement (66)

2.2.2 Quantifying the CRC Diagnostic Interval

A few Canadian studies have quantified the diagnostic interval in CRC patients, but the majority of research in this field has come from Western Europe. The methodology for this thesis was based on research by Singh and colleagues (2010) (22). They found that, in the Manitoba CRC population, the overall health system wait time (defined as the diagnostic plus treatment intervals using the Aarhus definition) increased significantly from 61 days in 2001 to 95 days in 2005 (p<0.001). This increase in overall wait time was mostly due to the rise in the diagnostic interval portion, which increased from a median of 44 days in 2001 to 64 days in 2005 (p<0.001). The 90th percentile also increased from 184 days in 2001 to 264 days in 2005. Recent examples of quantitative information on a measure of the diagnostic interval are presented in Table 2-2. This table gives a basic summary of the scale of the diagnostic interval but, as mentioned above, it is often hard to directly compare studies due to different definitions of this interval. Furthermore, it is important to consider the substantial heterogeneity between studies, for example in data collection methods.

Table 2-2: Examples of studies that quantified a measure of the diagnostic interval

First Author              Quantitative Information on a Measure of the Diagnostic Interval                    Location
(reference number)        (data collection method)                                                            (years of CRC diagnoses)
(year published)

Barrett (69)              Median diagnostic interval: 69 (IQR: 36, 188) days                                  3 cities, United Kingdom
(2006)                    (chart abstractions)                                                                (2002)

Grunfeld (70)             Median interval to diagnosis, referral to CRC dx: 13 days                           Ottawa hospital, Canada
(2009)                    (chart abstractions)                                                                (2004-2005)

Hansen (71)               Median delay, first FP contact to first investigation of CRC symptoms: 0 days       Aarhus, Denmark
(2011)                    Median delay, first investigation of CRC symptoms to hospital referral: 0 days      (2004-2005)
                          Median delay, hospital referral to CRC dx: 30 days
                          (administrative data, FP questionnaires)

Korsgaard (72)            Colon cancer: median delay, first FP contact to referral: 3 days                    3 counties, Denmark
(2008)                    Colon cancer: median delay, FP referral to CRC dx: 16 days                          (2001-2002)
                          Rectal cancer: median delay, first FP contact to referral: 5 days
                          Rectal cancer: median delay, FP referral to CRC dx: 12 days
                          (patient questionnaire-interviews)

Rupassara (13)            Early group (< 50 days): median delay, FP referral to CRC dx: 15 days               A surgical unit, United Kingdom
(2006)                    Late group (≥ 50 days): median delay, FP referral to CRC dx: 108 days               (1993-2002)
                          *patients who were initially thought to have quick diagnoses due to the nature
                          of their cancer were excluded (e.g. advanced stage and emergency management)
                          (administrative data, case notes)

Singh (22)                Median diagnostic interval (2005 only), first FP contact to CRC dx: 64 days         Manitoba, Canada
(2010)                    90th percentile (2005 only), first FP contact to CRC dx: 264 days                   (2001-2005)
                          (administrative data)

Singh (67)                Median wait time, referral for colonoscopy to colonoscopy: 57 days                  Texas, USA
(2010)                    *veterans, excluded cases with colonoscopy solely for screening                     (2001-2007)
                          (integrated electronic medical records)

Singh (15)                Median diagnostic interval (2009 only), first FP contact to CRC dx: 105 days        Manitoba, Canada
(2012)                    90th percentile (2009 only), first FP contact to CRC dx: 276 days                   (2004-2009)
                          (administrative data)

Singh (18)                Median time, referral for colonoscopy to colonoscopy: 123 days                      USA
(2012)                    *safety-net health care system                                                      (2008-2009)
                          (electronic medical records and FP semi-structured open-ended interviews)

Terhaar sive Droste (14)  Early stage: median delay, first FP contact to referral: 7 days                     Northern Holland, Netherlands
(2010)                    Early stage: median delay, FP referral to CRC dx: 21 days                           (2005)
                          Late stage: median delay, first FP contact to referral: 14 days
                          Late stage: median delay, FP referral to CRC dx: 14 days
                          *symptomatic patients only
                          (patient questionnaire-interviews, FP-interviews, medical charts, administrative data)

Tørring (8)               Alarm symptoms: median diagnostic interval, first FP contact to CRC dx: 37 days     Aarhus, Denmark
(2011)                    Vague symptoms: median diagnostic interval, first FP contact to CRC dx: 74 days     (2004-2005)
                          (FP questionnaire and administrative data)

Van Hout (73)             Median delay, first FP contact to referral: 14 days                                 Utrecht, Netherlands
(2011)                    Median delay, referral to CRC dx: 27 days                                           (1997-2007)
                          (administrative data, electronic medical records)

Wattacheril (11)          Median lag time, referral for colonoscopy to dx: 41 days                            Texas, USA
(2008)                    *veterans                                                                           (2000-2005)
                          (administrative data, medical records)

Abbreviations: IQR interquartile range; dx diagnosis; FP family physician

Several other informative studies were not directly included in this table because, for example, they reported only a section of the diagnostic interval, could not disaggregate the diagnostic interval from a measure of the total interval, had unclear interval definitions, reported the diagnostic interval only for specific sub-groups or treated the diagnostic interval as a broad categorical variable (10,12,16,17,53,55,74–78).

Please note that there are two separate researchers with the same first initial, last name and qualifications “H Singh, MD MPH”. Please follow reference numbers to distinguish these authors throughout the thesis.

2.2.3 Factors Related to the CRC Diagnostic Interval

A systematic review published in 2008 (5) identified factors that influenced the patient interval (onset of symptoms to first contact with the FP) and the practitioner interval (first contact with the patient to referral for secondary care) in CRC patients. The authors appraised the evidence of all included studies and created three grades of evidence: strong, moderate or insufficient. Studies with strong evidence had a sufficient sample size and high quality data collection methods, and reported statistically significant findings for factors that they determined to be delay-related. The authors also used two previously published studies to guide their assessment of the literature (79,80). Factors identified as increasing the practitioner interval in studies providing strong to moderate evidence included: initial misdiagnosis, failure to examine the patient, receiving negative or false-negative test results, lack of continuity of care (single study), frequent patient attendance and lower patient SES (single study). Factors identified as decreasing the practitioner interval included: rectal cancer sub-site, comorbidity (single study), older age and use of referral guidelines (single study). The direction of influence of similar symptom presentations differed between studies. It was also noted that physicians in rural areas were less likely to refer patients due to the travel distance to secondary care services, and rural residence was identified as a factor that increased delay in terms of the patient interval.

Wahls and Peleg (77) identified patient and system factors related to the diagnostic delay of CRC in an American veterans health care setting by review of medical records (the Veterans Administration electronic medical records system, which consolidates patients’ clinical notes and laboratory, imaging and pathology results). They determined that 46% of cases had delays associated with system-level factors. The most common system-level factors included: missed lesions from colonoscopy/sigmoidoscopy and scheduling delay for colonoscopy due to backlog. The researchers also found that delayed responses to abnormal findings, which were partly due to lost follow-up, may have also contributed to a delay in diagnosis. This was especially relevant for anemia as an abnormal finding. In another study (75), the odds of a missed opportunity to begin CRC work-up (endoscopy) were increased for patients with iron deficiency anemia (OR=2.2, 95% CI: 1.3, 3.6) and anemia was associated with the longest time to colonoscopy/sigmoidoscopy referral (75).

Singh and colleagues (18) used a mixed methods analysis to identify reasons for delays between referral and the colonoscopy procedure in an American safety-net health care system (using electronic health record reviews and semi-structured interviews with FPs). The most common reason for delay was a delayed appointment with gastroenterology. Results from the FP interviews suggested that this delay in scheduling was caused by uncertainty in the referral guidelines (inefficient process) and endoscopic capacity. In another study by Singh and colleagues (67), which examined the interval between referral for colonoscopy and performance of the colonoscopy, details of the electronic request for referral were examined. Notation of urgency, having three diagnostic clues and outpatient referrals which indicated verbal contact with the consultant were identified as significant independent predictors of a reduced time from referral to colonoscopy.

The International Cancer Benchmarking Partnership (ICBP) is a collaboration of researchers, clinicians, scientists and policy makers from six countries with a mandate to describe and compare cancer survival (for specific sites) between participating jurisdictions as well as to investigate causes for survival differences (81). The ICBP recently published a study (82) in which the readiness of FPs to investigate possible cancer symptoms (via a referral to secondary care or by ordering specific diagnostic tests in the primary care setting) was assessed through the use of online clinical vignettes. Physician readiness to investigate was then correlated with cancer survival in the respective jurisdiction. The authors theorized that differences in the management of the diagnostic interval may explain the inter-jurisdictional survival variation. For CRC, there was a significant positive correlation between readiness to investigate and 1-year survival as well as 5-year conditional survival. The authors acknowledged the ecologic nature of their findings, and suggested that further research on specific factors that affect FPs’ decisions to investigate (e.g. clinical guidelines, relationships between primary and secondary care and access to investigations) is warranted. The study generally had low response rates from the FPs. There was feedback that the vignettes represented clinical practice well, although the FPs were aware that the surveys were related to cancer.

2.2.4 Waiting Time Paradox

An interesting observation from the analysis of the diagnostic interval is that of the “waiting time paradox” which, counterintuitively, shows CRC patients with the shortest diagnostic interval having the highest mortality rates. This paradox is likely explained by the accelerated investigation of patients who have alarm symptoms or who are admitted to the hospital as an emergency (83). This paradox is an illustration of confounding by indication where the seriously ill, who have a higher inherent mortality, are given priority through the diagnostic pathway (8). This paradox has been observed in many CRC data sets (8,9,15,16) and for other cancer sites (84,85). Tørring and colleagues have found that the waiting time paradox is then followed by an increase in mortality with longer diagnostic intervals, creating a U-shaped relationship between CRC mortality and the length of the diagnostic interval (decreasing then subsequently increasing mortality as the diagnostic interval increases) (9). Generally, the purpose of many studies that quantify the diagnostic interval is to assess if there are negative cancer outcomes associated with an increased diagnostic interval. The evidence on this question is reviewed in the next section.

2.2.5 Impact of the Diagnostic Interval

There are conflicting results in the literature as to whether having a delayed diagnosis of CRC will actually impact stage and patient survival, as CRC is a slowly progressing disease (7–17). In a study by Singh et al. (2012) (15), a longer time to CRC diagnosis in the Manitoba population did not adversely affect stage at diagnosis or survival after diagnosis. Conversely, Tørring et al. (9) found results supporting the hypothesis that the diagnostic interval does have an impact on mortality in CRC patients. For patients who presented with alarm symptoms, the risk of dying within 3 years decreased up to diagnostic intervals of about 35 days (waiting time paradox), after which mortality increased. The opposite trend was found for patients presenting with vague symptoms; however, it was not statistically significant. Korsgaard et al. (7) found that treatment delay (onset of symptoms to treatment) was associated with late stage rectal cancer, but not colon cancer.

A recent systematic review by Neal et al. (86) aimed to determine if an increased time to diagnosis and treatment in symptomatic cancers was associated with poorer outcomes. For CRC, the majority of studies (15 papers) found no association while seven studies found a positive association (longer intervals, poorer outcomes in terms of survival or stage) and one study found a negative association. It was noted that, of the four studies that addressed the waiting time paradox, three found a positive association (all by Tørring et al.) (8,9,87) and one found no association (88). The authors concluded that the evidence was not clear due to heterogeneity in time interval definitions and study quality. The authors did suggest that efforts to decrease the diagnostic interval for symptomatic patients will likely have benefits in terms of stage at diagnosis, survival and quality of life. Breast, colorectal, head and neck, testicular and melanoma cancers have the most evidence to date of benefits from a quick diagnosis (86). Neal and colleagues suggested that future research should adhere to the Aarhus Statement interval definitions and address the waiting time paradox where appropriate.

Even if, ultimately, longer prevailing diagnostic intervals are not found to have an impact on stage at diagnosis or survival, efforts to reduce the time to diagnosis should not be abandoned for two important reasons that have been emphasized by Singh and colleagues (2012) (15). The first is the importance of reducing patient anxiety by prompt evaluations (15). The second is that signals that extreme wait times may affect clinical outcomes have been observed, although no threshold has been established (15). In Singh’s work, patients who had the longest diagnostic intervals (at or greater than the 90th percentile) had an increased hazard of death, although the estimate was only marginally statistically significant [hazard ratio (95% CI): 1.40 (1.00, 1.96)] (15).

The importance of patient psychological health during the diagnostic interval should not be overlooked. In a study (54) examining the needs of CRC patients from three cancer centres in Ontario, 84% of participants felt that their needs for support were met. However, 57% of participants felt shocked or overwhelmed upon learning of their diagnosis and 40% reported having high levels of anxiety during the diagnostic interval. Most patients stated that they were not made aware of resources to help with their anxiety. Also, the rationale for the two-week waiting time for endoscopy when there is a high likelihood of cancer, set by the Canadian Association of Gastroenterology, is based on mitigating patient stress and anxiety during this time (89), despite the absence of strong evidence that a delay in diagnosis (or the start of therapy) of several weeks would alter clinical outcomes of CRC (89). These wait time targets for the CRC diagnostic interval will be described next.

2.2.6 Wait Time Targets

In 2006, the Canadian Association of Gastroenterology published a consensus statement on the maximal medically appropriate wait times for specialist consultation and procedures (89). Three wait time targets related to the CRC diagnostic interval: 1) patients referred to a specialist due to a high likelihood of cancer based on clinical or radiological investigations should be seen and endoscoped, if needed, within two weeks; 2) patients referred to a specialist due to a positive FOBT should be seen and endoscoped, if needed, within two months; and 3) patients referred for a screening colonoscopy should be seen and endoscoped, if needed, within six months. Ontario’s ColonCancerCheck has adopted the two-month benchmark set by the Canadian Association of Gastroenterology for receiving a colonoscopy following a positive FOBT (90). Similar to the Canadian statement, the United Kingdom has implemented a national two-week maximum waiting time target for referral (from a FP to a specialist) of urgent suspected cancer cases (91). This target was developed after the EUROCARE project (92) found that the UK and Denmark had a lower 5-year survival rate for CRC than comparable European countries, with one explanation for this poorer survival being more late diagnoses (93).

2.3 Conceptual Framework

This research was guided by a conceptual framework adapted from Zapka (94) and the Chronic Care Model (95,96) (see Figure 2-3). The interval between CRC onset and diagnosis is a complex variable that is influenced by disease characteristics, patient-related characteristics and health-system-related factors. This model helped us to understand influences on the diagnostic interval and organize the health-system-related variables along a patient’s diagnostic pathway, starting from the index contact a patient has with the health care system and ending with the definitive diagnosis of CRC. The principal goal outlined by this conceptual model, Figure 2-3, is to have productive encounters between informed, motivated patients and prepared practice teams in order to achieve an optimal diagnostic interval (days) through improved elements of the diagnostic pathway. The focus of this project lies within the organization and/or practice setting component of the model [e.g. structural arrangements (hours, on site specialists), information systems] as well as in the additional sectors of influence component (e.g. rurality, specialist capacity and access, provider guidelines). Rurality and the diagnostic interval is the next topic discussed.

[Figure 2-3 is not reproduced here. Elements shown: additional sectors of influence; organization and/or practice setting; provider characteristics; family and social supports; patient characteristics; disease characteristics; productive interactions and encounters between informed, activated patients and a prepared, proactive practice team; an improved diagnostic pathway (e.g. reduced time in between encounters and tests, fewer encounters and tests) running from the index contact through the 1st to nth related encounters and tests to the definitive diagnosis; and an optimal diagnostic interval.]

Figure 2-3: Conceptual framework adapted from Zapka (94) and the Chronic Care Model (95,96)

2.4 Rurality and the CRC Diagnostic Interval

2.4.1 Definitions for Rurality

There are multiple definitions for rurality, with no clear consensus as to which is the best (97). Many definitions of rurality rely on combinations of: population size, population density, travel time/distance to certain key resources and travel time/distance to more urban centres (97,98). In Canada, there are at least six different definitions of rural/urban areas at a national level, with the proportion of the population defined as rural ranging from 22%-38% (98) depending on the definition. What is clear, amongst these multiple definitions, is that the choice of rurality measurement needs to be based on the specifics of the study purpose (97,98).

This thesis used the Rurality Index for Ontario (RIO-RIO2008_BASIC) developed by Kralj (99,100) to quantify the degree of rurality of each CRC patient’s place of residence. The RIO was chosen to measure rurality over other methods because it: was created for health-care planning purposes, is a broad measure of rurality (versus dichotomous rural/urban), was available through the administrative databases that we had access to and has been used in various other health care studies in Ontario examining rurality as a covariate (101–103). Using the RIO score to assess rurality, we were able to determine if populations that were more rural were disadvantaged in terms of their CRC diagnostic interval. This is important in creating a health care system with an equitable opportunity for timely cancer diagnosis.

2.4.2 Rural Patients as a Vulnerable Population

Rural populations are generally older and less healthy than their urban counterparts (19,20). A report by CIHI on the health status of rural Canadians determined that rural areas had: a greater proportion of people who had less than secondary school graduation, lower SES, a greater proportion of people who exhibited unhealthy behaviours (smoking, diet) and a higher overall mortality than those who lived in urban areas (19). Positive health factors associated with those who lived in rural areas were: reporting a stronger sense of community, a lower incidence of most cancers and a lower proportion of people reporting a high stress life.

Rural populations are considered vulnerable to poor cancer care and experience inequities in cancer services along the cancer care trajectory, which includes the diagnostic process (21). If the diagnostic interval is longer in rural areas, it may be associated with, and therefore ameliorated by, changes in health care services that are related to the diagnostic process. Such changes would address an important health care inequity. Several studies that examine rural- and urban-residing CRC patients in relation to geographic access to cancer services, the diagnostic interval, stage and screening will be described next.

Rurality and Geographic Access to Cancer Services

It is important to note that geographic access to care encompasses multiple dimensions. Guagliardo (104) describes two terms to characterize access to healthcare services: spatial availability (ratios of health care providers to populations) and travel impedance (distance/time to services). For example, in an Australian study (105) researchers noted a lower colonoscopy rate in small rural towns versus the state capital (10.5 colonoscopies/1000 population versus 18.5 colonoscopies/1000 population, respectively). Explanations for this difference included a lack of physicians who could perform these procedures in rural areas as well as travel time needed for some rural patients to seek care in larger city centres.

Cancer care requires medical specialists and resources that are usually located in tertiary care hospitals in urban centres (106). Geographic access to cancer services is an underlying theme in the next three sections contrasting the diagnostic interval, stage at diagnosis and screening between rural and urban cancer patients.

Rurality and the CRC Diagnostic Interval

Little research has examined the association between rural living and the length of the diagnostic interval. When Singh and colleagues (2010) (22) studied the overall wait time (index contact with the healthcare system to the first definitive treatment) for CRC patients in Manitoba, they found that rural versus urban residence, among other covariates, was a significant predictive factor of the overall wait time and that those living in urban areas actually had a longer overall wait (hazard ratio urban=0.88, 95% CI: 0.81, 0.96 where lower hazard ratios represented longer overall wait times).

Research from other countries was examined to relate rurality specifically to the diagnostic interval as we have not found examples in a Canadian context.

A qualitative study in rural Western Australia conducted structured discussions with rural FPs to identify factors that were believed to affect the cancer diagnostic interval based on reviews of their clinical cases (107). Identified factors included: a demographic shift in rural areas to a frailer, older population with comorbidities making it harder to identify early cancer symptoms, “patient delay” due to seasonal and demanding work patterns, unaccommodating scheduling with specialists and the impact of informal social networks (107). The authors identified limited access to health care services as a principal barrier to receiving a timely cancer diagnosis. A longer travel distance for rural patients to receive diagnostic services has financial implications in terms of taking time off to travel as well as for accommodations in more urban centres. This may be unacceptable for some patients and can lead to a later diagnosis. Furthermore, poor scheduling of patient appointments can lead to the need for multiple trips into a more urban area, again discouraging patients from presenting with symptoms. This can also play a role in the FP’s decision to refer rural patients, as FPs recognize the inconvenience that a trip may cause and could be more selective and/or wait longer to refer patients. The FPs also noted the tendency for their patients to “save up” health issues until a convenient time for an appointment, making it more difficult to identify early cancer symptoms. Social relationships between patients and their healthcare providers in rural areas were also identified as having the potential to either help or hinder the diagnostic process. Patients may be able to obtain an appointment more quickly and feel more comfortable with the FP, and the FP may know the patient’s health history better, but on the other hand, patients may not want to relay symptoms perceived as embarrassing (107).

A qualitative study conducted in Northern Scotland interviewed patients and family members to document their perceived care through CRC presentation, diagnosis and treatment, categorizing patients as urban or rural (108). Factors that may have influenced the diagnostic process included the consensus that rural-living patients were less insistent in pursuing their cancer care while urban-living patients were intolerant of delays and more likely to do anything in their power to speed up the process. A second factor was related to FP continuity of care versus availability of second opinions. In small rural areas, where there were very few doctors, sometimes a second opinion by a visiting FP was important in generating a referral to a specialist. Conversely, in urban and larger rural areas it was sometimes difficult for patients to see their regular FP quickly, so they sought out care from another FP in the meantime and felt that, in the end, this added delay to the process (108).

In another Scottish study, Robertson and colleagues (74) studied factors that influenced the time interval between patient presentation and treatment in both breast cancer and CRC patients in rural and urban areas (measured by the straight line distance between the centroid of a patient’s postcode area and the nearest cancer centre). In the CRC subset they did not find a significant difference in the presentation to treatment time interval between rural and urban patients. The average time between presentation and treatment was much greater for CRC cases than breast cancer cases; however, this time interval was similar between the two sites when patients presented with a palpable mass. The authors suggest that the delay seen in CRC is therefore likely due to the vague symptoms at presentation.

An older study in France based on Cancer Registry data surmised that the loneliness of rural living, especially for women, contributed to a delay in diagnosis (109).

Rurality and CRC Stage at Diagnosis & Other Outcomes

Multiple studies have shown that rural CRC patients are diagnosed at a later stage than their urban counterparts (110,111). Rural-urban differences in cancer care from four cancer sites were studied in the Lake Superior Rural Cancer Care Project (USA) (112). The study compared nine endpoints between rural and urban cancer patients. Significant differences for the CRC subset after adjustment for age and oncology consult (yes/no) between the two groups included: stage at diagnosis, initial management, surveillance testing and clinical trials participation, where rural patients were disadvantaged in each of these findings (112). Explanations for these disadvantaged cancer care outcomes included: poorer access to and availability of cancer specialists and diagnostic services, fewer screening activities, greater patient delay/patient behaviour, lower ratios of health care providers to population, insufficient documentation and the rural health care system at large (111–114) (some of these explanations were based on research studying other cancer sites). However, these results must be interpreted in light of the limitations of the study, as socioeconomic, racial, educational and health insurance data were not included in the analysis.

In a large national cohort study (USA) (115), it was determined that colon cancer patients who traveled ≥ 80.5 km (50 miles) to a diagnosis facility compared to < 20 km (12.5 miles) were more likely to be diagnosed with metastatic colon cancer [OR (95% CI): 1.18 (1.12, 1.24)]. Similarly, a study in Maine (USA) found that increased distance to a primary care provider in CRC patients (≥ 24.1 km versus < 16.1 km) was associated with late stage (116). However, distances to cancer treatment and to gastroenterologists were not significant.

Results from a study by Fazio et al. (117), which used data from the Ontario Familial Colon Cancer Registry, determined that rural residence (rural versus non-rural) was associated with an increased risk of late stage colon cancer [OR (95% CI): 1.48 (1.01, 2.17)].

Access to CRC treatment is a common outcome measure in studies examining rural/urban differences in cancer care. Briefly, a large American study by Baldwin and colleagues using administrative data found that less than 50% of patients living in small and isolated rural areas were within 30 miles of medical or radiation oncologist services (106). Furthermore, they found that about a fifth of all rural patients bypassed their closest options for medical and radiation oncology care. The authors thought that this could be related to multiple reasons such as: patient concerns about local quality of care compared to that of urban centres, family connections in urban centres, availability of other cancer services in one location, referral patterns of physicians and capacity issues of more local services (106).

Rurality and CRC Screening

Multiple studies have shown that rural residents are less likely to be up to date with CRC screening than urban residents (118). A study in rural Kansas (119) determined that the main barrier to CRC screening was not having enough time to discuss CRC screening between patients and FPs. The authors concluded that access to and availability of services was not a major factor in CRC screening utilization. Conversely, a different American study recommended improving geographic access to health care providers in light of lower preventive healthcare services use (including CRC screening) in rural areas. The authors noted physician shortages as one of the challenges in delivery of healthcare services in rural areas. Attitudes/cultural barriers of rural residents were also identified as a challenge, where rural residents may be more self-reliant and hesitant to seek medical care unless seriously ill. Access to and availability of specialists who perform colonoscopies for screening was an issue identified in a rural Ontario community (120). The researchers monitored the safety of colonoscopies performed by non-specialists to enhance CRC screening because there were not enough specialists to otherwise screen the target population.

2.5 Summary

CRC is one of the most common types of cancer diagnosed in Canada and poses a great burden on the health care system. CRC survival is highly dependent on the stage at diagnosis; however, it remains unclear whether an extended diagnostic interval is associated with stage at diagnosis or survival. The diagnostic interval is a complex variable influenced by tumour biology, patient factors and health care system organization factors. Rural populations are considered vulnerable to poor cancer care and experience inequities in cancer services along the cancer care pathway. The association between rurality, as a main exposure, and the diagnostic interval has not been studied in the Canadian context. If the diagnostic interval is longer in rural areas, it may be associated with, and therefore ameliorated by, changes in health care services that are related to the diagnostic process. These changes would address an important health care inequity.

Chapter 3

Methods

3.1 Purpose

The purpose of this study was to describe the variation in the length of the diagnostic interval for colorectal cancer (CRC) patients in Ontario by degree of rurality of a patient’s residence. This study also refined an existing CRC diagnostic interval algorithm by Singh and colleagues (22) and a CRC symptom status at first test algorithm (asymptomatic versus symptomatic) (23); thereby creating two fundamental resources for future research in this area.

3.2 Empirical Objectives and Hypotheses

1. To compare the Ontario CRC stage distribution across census subdivisions (CSDs) grouped by Rurality Index for Ontario (RIO) categories.

Hypothesis: In Ontario, at a provincial level, there is a fairly even distribution of CRC stage (I-IV) (24), but there is variation in degree of rurality across the province. We hypothesized that more rural census subdivisions (higher RIO scores) would have a greater proportion of advanced stage CRC than less rural census subdivisions (lower RIO scores). We based this hypothesis on previous observations that some rural areas have poorer geographic access to specialty health care services (106), lower screening rates (110) and increased patient-related diagnostic delay (110).

2. To determine whether patients living in more rural census subdivisions (higher RIO scores) have a longer system-related diagnostic interval than patients living in less rural census subdivisions (lower RIO scores) after stratifying by stage and controlling for potential confounders.

Hypothesis: We hypothesized that, on average, patients living in more rural CSDs would have a longer diagnostic interval than patients living in less rural CSDs. Three dimensions of geographic access to care are important to consider when thinking about the diagnostic interval in relation to rurality (104). The first, accessibility, relates to travel impedance to health care services. Longer distances to specialists and diagnostic resources may affect time to access. The second dimension, availability, relates to the number of health care providers and diagnostic resources per capita. In this instance, resource availability may be better, or worse, in more rural areas compared to less rural areas (121,122). Third, populations in rural areas are older (123), possibly putting more demand on per capita resources than in urban areas.

3.3 Overview

This was a retrospective population-based cohort study. We used administrative databases available through the Institute for Clinical Evaluative Sciences (ICES) Health Services Research Facility at Queen’s University. Incident CRC cases from Jan 1, 2007 to May 31, 2012 were identified using the Ontario Cancer Registry (OCR) and a RIO score and diagnostic interval were calculated for each patient. Fee codes that represented key CRC medical tests in the administrative databases were used to identify CRC diagnosis-related encounters. The earliest of these CRC diagnosis-related encounters, for a given patient, identified their index contact date with the health care system. We defined the diagnostic interval as the time between that index contact date and the date the patient was diagnosed with CRC. A tool to help assess variation in the data, called a control chart, was used to determine how far back in time to search for these key CRC tests. We also modified an algorithm to classify the symptom status of a patient’s first key CRC test as asymptomatic or symptomatic using data from the 2012 CRC patient subset, where symptom status was distinguishable using new fee codes. We then applied this algorithm to our study cohort. Data linkage through ICES allowed us to describe cancer stage variation and diagnostic interval variation by degree of rurality of patient residence (RIO score) and to analyze associations of interest taking into account many potential confounders. Three variables were also identified as potential causal pathway variables and were not included in the multivariable models, but were analyzed in a bivariate manner by RIO group.
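The interval definition in this overview can be sketched in a few lines of Python. This is a minimal illustration, not the ICES algorithm: the function name, the list-of-dates input and the `lookback_days` parameter are assumptions made for the example, and the actual look-back period (chosen with the control chart) is not reproduced here.

```python
from datetime import date, timedelta
from typing import Iterable, Optional


def diagnostic_interval_days(
    encounter_dates: Iterable[date],
    diagnosis_date: date,
    lookback_days: int,
) -> Optional[int]:
    """Days from the index contact to the CRC diagnosis date.

    lookback_days is a hypothetical stand-in for the look-back period.
    Returning None when no encounter falls in the window is an assumption;
    the text does not say how such patients were handled.
    """
    window_start = diagnosis_date - timedelta(days=lookback_days)
    # Keep only diagnosis-related encounters inside the look-back window.
    in_window = [d for d in encounter_dates if window_start <= d <= diagnosis_date]
    if not in_window:
        return None
    # The earliest remaining encounter is the index contact.
    index_contact = min(in_window)
    return (diagnosis_date - index_contact).days


# Illustrative dates only; the 2008 encounter lies outside the window.
example = diagnostic_interval_days(
    [date(2010, 2, 1), date(2010, 1, 5), date(2008, 1, 1)],
    diagnosis_date=date(2010, 3, 1),
    lookback_days=365,
)
print(example)
```

Passing the look-back period as a parameter keeps the sketch independent of whichever value the control chart analysis supports.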

3.4 Student’s Contribution

This project complements the work being done for Dr. Patti Groome’s study on understanding diagnostic episodes of care in three cancer sites. The student was responsible for the study design, dataset creation plan (see Appendix A), statistical analysis, interpretation and reporting. The dataset creation plan is a technical document that is a record of the methods used for variable derivation from the multiple datasets at ICES. The dataset creation plan was used by the senior analyst at ICES, Ms. Marlo Whitehead, to conduct ICES database cuts, linkages and to create a working dataset for the student. The student and the senior analyst worked closely together to finalize this working dataset. The student then used this final working dataset for all study analyses. Dr. Geoff Porter, a breast and CRC surgeon, provided clinical insight to the student’s work throughout the duration of the project and Dr. Jennifer Flemming, a gastroenterologist, was consulted while creating the CRC symptom status algorithm. Dr. Hugh Langley, a family physician, also provided insight on fecal occult blood test (FOBT) Ontario Health Insurance Plan (OHIP) fee codes.

3.5 Study Population

All histologically confirmed diagnoses of first primary CRC (International Classification of Disease, ICD-9 codes: 153.0 to 153.9 and 154.1, excluding 153.5, neoplasm of the appendix) from January 1, 2007 to May 31, 2012 with a valid ICES key number (IKN) were included in the study population unless one or more of the following exclusion criteria applied: 1) dead upon diagnosis (death certificate only); 2) less than or equal to 18 years old or greater than or equal to 105 years old at diagnosis; 3) did not have an Ontario postal code at the time of diagnosis (either missing or outside Ontario); 4) subsequent cancer diagnosis within six months of the first primary CRC diagnosis date; 5) OHIP coverage for less than 36 months before diagnosis date; 6) did not have the histologic classification of adenocarcinoma; 7) stage ‘0’ cancer. Note that cancer of the rectosigmoid junction (ICD-9 code: 154.0) was not included in the study population due to a coding oversight recognized after analysis; we investigated what impact this had on our findings in section 4.2.3 of the Results Chapter.
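The inclusion and exclusion rules above can be expressed as a simple filter. The Python sketch below is illustrative only: the `Case` fields and reason labels are hypothetical names, not ICES variables, and the real cohort was built from linked administrative data by the ICES analyst rather than with code like this.

```python
from dataclasses import dataclass
from typing import List

# ICD-9 sites named in the text: 153.0-153.9 except 153.5, plus 154.1.
INCLUDED_ICD9 = {f"153.{i}" for i in range(10) if i != 5} | {"154.1"}


@dataclass
class Case:
    # All field names are hypothetical, chosen for this example.
    icd9_site: str
    death_certificate_only: bool
    age_at_diagnosis: int
    has_ontario_postal_code: bool
    subsequent_cancer_within_6_months: bool
    ohip_coverage_months: int
    is_adenocarcinoma: bool
    stage: str  # "0", "I", "II", "III", "IV" or "unknown"


def exclusion_reasons(case: Case) -> List[str]:
    """Return every criterion the case fails (empty list means eligible)."""
    reasons: List[str] = []
    if case.icd9_site not in INCLUDED_ICD9:
        reasons.append("site outside study definition")
    if case.death_certificate_only:
        reasons.append("death certificate only")
    if case.age_at_diagnosis <= 18 or case.age_at_diagnosis >= 105:
        reasons.append("age out of range")
    if not case.has_ontario_postal_code:
        reasons.append("no Ontario postal code")
    if case.subsequent_cancer_within_6_months:
        reasons.append("subsequent cancer within six months")
    if case.ohip_coverage_months < 36:
        reasons.append("OHIP coverage under 36 months")
    if not case.is_adenocarcinoma:
        reasons.append("not adenocarcinoma")
    if case.stage == "0":
        reasons.append("stage 0")
    return reasons


eligible = Case("153.3", False, 64, True, False, 48, True, "II")
print(exclusion_reasons(eligible))
```

Returning all failed criteria, rather than stopping at the first, makes it easier to tabulate how many cases each exclusion removes.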

3.6 Study Time Frame

The time frame of this retrospective study was anchored around an individual patient’s CRC diagnosis date. The accrual window in which to identify incident cases of first primary adenocarcinoma of the colon or rectum was January 1, 2007 to May 31, 2012. In order to satisfy exclusion criteria, the study time frame spanned 36 months prior to the date of a patient’s CRC diagnosis to six months after the date of CRC diagnosis (to ensure no subsequent cancer diagnoses within six months).

3.7 Data Sources

Administrative databases used in this project included: 1) Ontario Cancer Registry (OCR) 2) Ontario Health Insurance Plan Claims Database (OHIP) 3) Registered Persons Database files (RPDB) 4) The Canadian Institute for Health Information Discharge Abstract Database (CIHI/DAD) 5) National Ambulatory Care Reporting System (NACRS) 6) Census 2006 (Ontario Census Area Profiles) and 7) Institute for Clinical Evaluative Sciences Physician Database (IPDB). All databases for this project were housed at ICES. The databases can be grouped into four broad categories: population and demographics, health services, cohort and registry and lastly, care providers (see Figure 3-1). All the databases, except for geo-coded data, are linkable at the individual level using the ICES Key Number (IKN) (each unique health card number is converted to an anonymous IKN when the data arrives at ICES). A further description of each database is found in the following sub-sections (3.7.1-3.7.7).

[Figure 3-1 is not reproduced here. Databases shown: Ontario Cancer Registry; Ontario Health Insurance Plan Claims database; ICES Physician Database; Registered Persons Database Files; Census 2006; Canadian Institute for Health Information Discharge Abstract Database; National Ambulatory Care Reporting System. Legend categories: Population and Demographics Databases; Health Services Databases; Cohort and Registry Databases; Care Providers Databases.]

Figure 3-1: Data sources linked through the Institute for Clinical Evaluative Sciences (ICES)

3.7.1 Ontario Cancer Registry

The Ontario Cancer Registry (OCR) is a population-based tumour registry that collects information about all incident cases of cancer in Ontario (except non-melanoma skin cancer). The OCR uses a passive registry system and captures linkable information from four main sources: pathology reports, electronic patient records, electronic hospital discharge records from the Canadian Institute for Health Information (CIHI) and electronic reports of death from the Registrar General of Ontario (124). Some important variables that the OCR holds are: the date of cancer diagnosis, CRC stage, patient demographics, the primary site of cancer (using the International Classification of Diseases 9th edition, ICD-9 codes), the histological diagnosis, date of death and cause of death (ICD-9 codes) (124).

3.7.2 Ontario Health Insurance Plan Claims Database (OHIP)

There are two main physician compensation models in Ontario, the fee-for-service model and the alternative payment plan model (125). The OHIP claims database at ICES holds health care providers’ claims for services that are paid for by the Ontario Health Insurance Plan (125). About 94% of all Ontario physicians use the fee-for-service model (125) in which they submit their service claims to OHIP in order to be reimbursed. In almost all cases, physicians who work under an alternative payment plan model complete what is called “shadow billing” (125). With shadow billing the physician submits the service claim as if it were fee-for-service billing. Both fee-for-service and shadow billing are recorded in the OHIP claims database (although the fee paid in shadow billing records may be shown as $0.00) (125). After correspondence with the senior biostatistician at ICES Queen’s, we did not think that missing claims from alternative payment plans would be an issue for the time frame of this study. The OHIP claims database contains encrypted patient and physician identifiers, codes for services provided, date of the service, diagnosis codes and the fee paid to the provider. The OHIP claims database does not include information on: some lab services, services provided by provincial psychiatric hospitals, inpatient diagnostic procedures and some services from alternative payment plan models such as health service organizations and community health centres (125).

It is worth giving a short explanation about OHIP diagnostic codes and OHIP fee codes, as this project relied heavily on these systems. An OHIP diagnostic code is a three-digit number assigned to a diagnosis or a group of diagnoses (for example, OHIP diagnostic code 564: spastic colon, irritable colon, mucous colitis, constipation). There are approximately 700 OHIP diagnostic codes. The OHIP diagnostic codes are very similar to the ICD-9 coding system, but do not match exactly. An OHIP fee code is a letter and three-digit number, extracted from an OHIP claim, that represents a specific health service that was provided to a patient. There are more than 7,000 OHIP fee codes described in the Schedule of Benefits for Physician Services booklet produced by the Ontario Ministry of Health and Long-Term Care (MOHLTC) (126) (for example, OHIP fee code X409: computed tomography of the abdomen-without IV contrast).
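The two code formats described above are simple enough to check with regular expressions. The Python sketch below follows only the description given in this section (one letter plus three digits for a fee code, three digits for a diagnostic code); the function names are hypothetical, uppercase letters are assumed, and codes as stored in the claims data may carry additional characters that this sketch does not handle.

```python
import re

# Patterns follow the description in the text only (assumption: uppercase letter).
FEE_CODE = re.compile(r"[A-Z][0-9]{3}")
DIAGNOSTIC_CODE = re.compile(r"[0-9]{3}")


def is_fee_code(code: str) -> bool:
    """True if the string is one letter followed by three digits, e.g. X409."""
    return FEE_CODE.fullmatch(code) is not None


def is_diagnostic_code(code: str) -> bool:
    """True if the string is exactly three digits, e.g. 564."""
    return DIAGNOSTIC_CODE.fullmatch(code) is not None


print(is_fee_code("X409"), is_diagnostic_code("564"))
```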

3.7.3 Registered Persons Database Files (RPDB)

The Registered Persons Database files hold demographic information about any individual who has ever had an Ontario health card number (127). The data is provided by the MOHLTC. The main data elements held in the RPDB files are: demographic information (sex, date of birth/death), geographic information (postal code of residence) and eligibility of OHIP coverage over time (127).

3.7.4 The Canadian Institute for Health Information Discharge Abstract Database (CIHI-DAD)

This database holds information on all hospital discharges in Ontario including: rehabilitation, chronic care, acute care and day surgeries (128). Clinical data, such as the diagnosis (up to 16 diagnosis codes for each discharge) (129), the procedures and the physician are collected along with demographic data and administrative data (institution, length of stay, admission category) (128). Diagnoses are coded using the International Classification of Diseases-10, Canada (ICD-10-CA) system and procedures are recorded with the International Classification of Disease-10 Canadian Classification of Health Interventions (ICD-10 CCI) codes (128). In a validation study using chart abstractions it was found that there was a 97% agreement with the DAD for administrative data, an 85% exact match for diagnostic codes and a 77% agreement with procedure codes (130).

3.7.5 National Ambulatory Care Reporting System (NACRS)

The National Ambulatory Care Reporting System captures data on outpatient visits to health care institutions (e.g. day surgeries, emergency department visits, some cancer care). Main variables include: clinical data, the type of ambulatory care setting, administrative data and patient demographics (131).

3.7.6 Ontario Census Area Profiles

This database holds the 2006 Statistics Canada census data. The data is presented at six geographic levels (e.g. Dissemination Area, Census Division, Census Subdivision) and includes summary demographic data such as: age, sex and income (132). The 2011 Statistics Canada census data was not available at ICES and, since the long form became voluntary for that iteration, data representativeness may be an issue.

3.7.7 ICES Physician Database (IPDB)

The ICES physician database (IPDB) was created at ICES and contains information about all Ontario physicians (133). Principal data elements include: physician demographics, specialty, workload, services provided and location. The IPDB is updated at ICES on a yearly basis from the Ontario Physician Human Resource Data Centre.

3.8 Study Variables

Figure 3-2 displays the study variable relationships between the exposure and outcome for objectives one and two. Our understanding of these relationships was based on the literature of the CRC diagnostic process and will be further described below.

Objective 1
Exposure: Rurality (RIO group)
Outcome: Stage at diagnosis (late (III, IV) versus early (I, II))
Potential causal pathway variables: CRC symptom status; ED presentation
Covariates: Age at diagnosis; sex; deprivation quintile; recent immigrant status; major and minor comorbidities; CRC sub-site

Objective 2
Exposure: Rurality (RIO group)
Outcome: Diagnostic interval (days), stratified by stage (I, II, III, IV, unknown)
Potential causal pathway variables: CRC symptom status; ED presentation; first test type
Covariates: Age at diagnosis; sex; deprivation quintile; recent immigrant status; major and minor comorbidities; CRC sub-site

Figure 3-2: Study variable conceptual model for objectives one and two


3.9 Rurality (exposure)

Rurality was defined using the Rurality Index for Ontario (RIO2008_BASIC) (99). Each patient was assigned a RIO score, an ordinal measure on a 0-100 scale where a higher score represents a higher degree of rurality. The score is assigned by units of Statistics Canada CSDs and was calculated through a macro at ICES (%getdemo) (134). Postal codes for each patient were retrieved from the RPDB (PSTL yearfile, pstlcode) and represent the best estimate of the patient’s postal code as of July 1st of the year of their CRC diagnosis. The postal code was then matched to the corresponding CSD to be assigned a RIO score.

The RIO score is based on three weighted components: a measure of the CSD population (Pop) by count and density, a measure of the travel time to the nearest advanced referral centre (Timea) and a measure of the travel time to the nearest basic referral centre (Timeb) (starting/ending from the centroid of the CSD). RIO = Pop + Timea + Timeb. Weights are: Pop=28.6%, Timea=23.8%, Timeb=47.6%. The RIO is used by the MOHLTC and the Ontario Medical Association (OMA) to administer programs that primarily develop policies and incentives for physician recruitment. The RIO score was chosen to describe rurality over other methods because it was made for health services planning.

The MOHLTC uses a RIO score of 40 or greater to define rural CSDs where physicians may be eligible (among other criteria) for financial incentives (135). ICES also uses a RIO score of 40 or greater to define rural versus urban areas (136). Based on Chapter 2, the Literature Review, and the purpose of this project, a dichotomous split of urban versus rural using the RIO score of 40 as a cut-off point may have been too crude and may have failed to recognize diversity amongst rural areas or masked the degree to which a rural population was disadvantaged (137). The developers advised that the RIO should not be analyzed as a continuous variable as the scores 0-100 represent relative ranks (scores were transformed to a 0 to 100 scale) rather than cardinal values (99). We therefore categorized the RIO scores into six groups: 0-9, 10-30, 31-45, 46-55, 56-75 and 76+. These divisions were based on the five groupings made by the developers of the RIO score (99) (see Appendix B for a list of all CSDs by RIO groups). The first RIO category created by the developers was “0-30”; however, we split this category into two groupings, “0-9” and “10-30”. We made this decision because of the high number of observations that fall into each of these two groups and because we wanted to separate the urban cores from moderately urban/suburban areas. The rest of the categories follow those used by the developers of the RIO.
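The six-group categorization described above can be illustrated with a short sketch. This is not the code used at ICES (the study relied on the %getdemo macro); the function name rio_group and the handling of missing scores are assumptions made for illustration, and scores are assumed to be integers between 0 and 100.

```python
from bisect import bisect_left

# Inclusive upper bounds of the first five RIO groups; 76+ is the open-ended sixth group.
RIO_UPPER_BOUNDS = [9, 30, 45, 55, 75]
RIO_LABELS = ["0-9", "10-30", "31-45", "46-55", "56-75", "76+"]


def rio_group(score):
    """Map an integer RIO score (0-100) to one of the six study groups.

    Returns None for a missing score (hypothetical convention; the thesis
    compared patients missing a RIO score to the final cohort separately).
    """
    if score is None:
        return None
    if not 0 <= score <= 100:
        raise ValueError(f"RIO score out of range: {score}")
    # bisect_left finds the first group whose upper bound is >= score.
    return RIO_LABELS[bisect_left(RIO_UPPER_BOUNDS, score)]


if __name__ == "__main__":
    for s in (0, 9, 10, 30, 31, 45, 46, 55, 56, 75, 76, 100):
        print(s, rio_group(s))
```

Keeping the cut-points in one list makes it straightforward to check that the boundaries match the groups stated in the text.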

Approximately 200 of the 580 CSDs (using the 2001 Census definitions) are excluded from the RIO (2008) measure. Excluded areas were: First Nations reserves and settlements, CSDs with a population of less than 500 and Unorganized Areas (UNO), which are areas that have no legal, political or jurisdictional authority (99). Issues with using the RIO variable identified by ICES include: it may only be applied to the CSDs that existed in 2001, it has only been computed for CSDs with a physician and new postal codes may be difficult to map to old CSDs. CRC patients who were missing a RIO score were compared to the final cohort.

3.10 Stage at Diagnosis (outcome objective one, stratification variable objective two)

Rationale

Our first objective, looking at rurality and stage, addressed issues of access at a higher level: are people living in rural areas seriously compromised regarding access to an early diagnosis, possibly through lower screening rates leading to higher numbers with symptomatic disease? Our second objective, looking at rurality and the diagnostic interval, was stratified by CRC stage (I, II, III, IV, unknown). We thought of stage as an effect modifier for the rurality-diagnostic interval relationship. This stratification was done because stage is associated with the length of the diagnostic interval, likely due to differences in symptom severity (11,138,139), and because stage I CRC patients may be more vulnerable to variations in access (possibly by rurality) due to the symptom-based triaging that takes place. Our stratified diagnostic interval analyses therefore refined our design beyond what had been previously done by looking within each stage group to see if rurality was associated with the interval. We expected a stronger association in the lower staged groups.

Definition and derivation

Stage describes the extent of a patient’s cancer and is one of the most important factors in determining the outcome of CRC (2,17,140). The variable “best stage” contained in the OCR database was used to define stage at diagnosis. Staging information in the OCR comes from cancer clinic electronic medical records or collaborative stage from CCO’s Stage Capture Project. The population-based stage capture rate of CRC in Ontario was about 90% in 2011 (141). Once all stage information for a patient is collected, best stage is determined by a hierarchy that prioritizes the most informative data available (pathologic stage, then clinical stage if pathologic stage is not available). Best stage follows the tumour, node, metastasis (TNM) system of the AJCC and the UICC, the global standard staging system (142,143).

For objective one we grouped stage I and II to form “early stage” and grouped stage III and IV to form “late stage” as the outcome. In objective two, stage was a stratification variable. We used 5 stage strata: I, II, III, IV and stage unknown.

3.11 Diagnostic Interval (outcome objective two)

We defined the diagnostic interval as the time (in days) between the patient’s first CRC diagnosis-related encounter with the health care system (index contact) and the definitive diagnosis of CRC, which is most often the date of a positive biopsy (144). In order to identify the patient’s index contact date, we started at the end of the diagnostic interval with the CRC diagnosis date and looked back through the patient’s trajectory of care for key CRC tests (set look-back periods were used to identify each key CRC test). From each of those key CRC tests we identified the last visit with the test-referring physician (looking up to a maximum of 12 months prior to the test). The index contact date was the earliest of those physician visits (see Figure 3-3 for a generalized overview of the diagnostic interval and Figure 3-4 for a more detailed explanation of the interval). Thus, the diagnostic interval was the time in days between the index contact date and the CRC diagnosis date and was analyzed as a continuous variable (we added one day to the diagnostic interval for every patient so the diagnostic interval was never equal to zero days). Derivation of the diagnostic interval was based on work by Singh and colleagues (22) and is further described in the following paragraphs.

[Figure 3-3 schematic: a timeline (in days) running from the index contact date, through the first test date, to the CRC diagnosis date, with the diagnostic interval spanning the index contact date to the CRC diagnosis date. Key CRC tests (marked X) are identified in test-specific look-back periods, and each is traced back to the last visit with its referring physician.]

Figure 3-3: Generalized diagnostic interval derivation

3.11.1 Identifying Key CRC Tests

Once the CRC diagnosis date was determined from the OCR, we used the OHIP claims database, the CIHI-DAD and the NACRS to capture all occurrences, within a certain time frame, of key CRC tests and their corresponding test dates. Key CRC tests included 1) abdominal X-ray, 2) abdominal computed tomography (CT), 3) abdominal ultrasound (US), 4) double contrast barium enema, 5) sigmoidoscopy, 6) colonoscopy, 7) fecal occult blood tests (FOBTs) and 8) emergency department presentation, not otherwise specified (ED NOS) (4,22) (see Figure 3-4). CT colonography was not included as a key CRC test as the OHIP fee code to capture the test was not available for all years of data collection. Furthermore, CT colonography is unlikely to be used as an early test in the diagnosis of CRC as it is often used after a failed colonoscopy or when there are contraindications to colonoscopy; thus, its exclusion would not likely affect our diagnostic interval calculation.

OHIP fee codes were used to capture all key CRC tests, except some ED cases, and were found primarily through the Schedule of Benefits: Physician Services Under the Health Insurance Act document produced by the MOHLTC (126). CIHI codes (Canadian Classification of Health Interventions, CCI codes) were also used to capture abdominal radiological imaging tests (145). ED presentation was determined by using an OHIP location variable as well as ED admissions identified in the NACRS database that were coupled with CRC-related ED diagnostic codes (see ED paragraph in section 3.12.2).

To ensure that all relevant CRC tests were captured, we examined a frequency table of all OHIP fee codes registered from our cohort in the three months directly preceding the date of CRC diagnosis.

Not all of these key CRC tests are used exclusively to diagnose CRC, therefore it was important to set relevant look-back windows in which to search for these tests before diagnosis. The look-back windows were determined separately for each of the eight tests using data-informed logic from control charts (146). This control chart technique is explained in sub-section 3.11.3.

3.11.2 Identifying the Index Contact Date from Key CRC Tests

The physician number of the ordering/referring physician for each of the key CRC tests in a patient’s diagnostic pathway was used to determine the date of the patient’s last visit with that physician.

The earliest date of all of a patient’s visits (A-H in Figure 3-4) preceding the CRC test was defined as the index contact date. Thus, diagnostic interval was defined by the period, in days, between the index contact date (first CRC-relevant encounter with the health care system) and the definitive diagnosis of CRC.
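The derivation described in this sub-section can be sketched as follows. This is a minimal illustration rather than the study's code: the function name, the argument names and the choice to ignore candidate dates after the diagnosis date are assumptions. The one-day offset follows the rule stated in section 3.11 that the interval is never equal to zero days.

```python
from datetime import date


def diagnostic_interval(diagnosis_date, candidate_contact_dates):
    """Return (index_contact_date, interval_days) for one patient.

    candidate_contact_dates holds the dates labelled A-H in Figure 3-4
    (None where a pathway produced no date). If no usable date exists the
    index contact is missing and (None, None) is returned.
    """
    # Assumption: only contacts on or before the diagnosis date are eligible.
    usable = [d for d in candidate_contact_dates
              if d is not None and d <= diagnosis_date]
    if not usable:
        return None, None
    index_contact = min(usable)  # earliest CRC-relevant encounter
    # One day is added so that same-day contact and diagnosis gives 1, not 0.
    interval_days = (diagnosis_date - index_contact).days + 1
    return index_contact, interval_days


if __name__ == "__main__":
    dx = date(2010, 6, 30)
    print(diagnostic_interval(dx, [date(2010, 3, 1), date(2010, 5, 15), None]))
```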


No referring physician visit was searched for FOBTs, as seen in Figure 3-4. This was due to the nature of the test, as patients would complete the FOBT on their own time and then submit it to a laboratory. We did not want to include this “patient interval” in our measure as we did not think it was reflective of the system-related diagnostic interval. This subtle difference in measurement logic was based on the study purpose. Furthermore, it is possible to receive an FOBT kit without a visit to an FP (e.g. through a pharmacy); consequently, there would be no physician visit for the index contact date (although the number of patients using this route would likely be small). The date of the FOBT represented the date on which the testing laboratory received the FOBT kit.

The diagnostic interval derivation for colonoscopies and sigmoidoscopies included an extra step (see Figure 3-4). Quantifying the diagnostic interval for ED cases involved a different methodology, which is described in section 3.12.2.

As described above, the earliest date of all of a patient’s visits (A-H in Figure 3-4) preceding the CRC test was defined as the index contact date. However, for a small portion of patients, there was no identifiable referral leading to any CRC tests and therefore no identifiable patient visit with the ordering MD. For these cases the index contact date was the next earliest contact, typically the first test. In Appendix C we compared the median and 90th percentile diagnostic intervals between patients with and without a referral for their first test, as we assumed that patients without a referral for their first test would have a shorter diagnostic interval.


A. FOBT: date of first FOBT in ‘a’ amount of time prior to CRC diagnosis.

B. Abdominal X-ray: date of first abdominal X-ray in ‘b’ amount of time prior to CRC diagnosis; physician # of the ordering MD; date of the last visit with the ordering MD prior to the date of the abdominal X-ray.

C. Abdominal CT: date of first abdominal CT in ‘c’ amount of time prior to CRC diagnosis; physician # of the ordering MD; date of the last visit with the ordering MD prior to the date of the abdominal CT.

D. Abdominal US: date of first abdominal US in ‘d’ amount of time prior to CRC diagnosis; physician # of the ordering MD; date of the last visit with the ordering MD prior to the date of the abdominal US.

E. Barium enema: date of first barium enema in ‘e’ amount of time prior to CRC diagnosis; physician # of the ordering MD; date of the last visit with the ordering MD prior to the date of the barium enema.

F. Colonoscopy: collect colonoscopies in ‘f’ amount of time prior to CRC diagnosis; endoscopist claim #; collect visits with the endoscopist (who performed the test) within 12 months prior to the first colonoscopy/sigmoidoscopy test date; physician # of the referring MD; date of the last visit with the referring MD prior to the earliest colonoscopy/endoscopist visit with the referral code. If not available, use first colonoscopy.

G. Sigmoidoscopy: collect sigmoidoscopies in ‘g’ amount of time prior to CRC diagnosis; endoscopist claim #; collect visits with the endoscopist (who performed the test) within 12 months prior to the first colonoscopy/sigmoidoscopy test date; physician # of the referring MD; date of the last visit with the referring MD prior to the earliest sigmoidoscopy/endoscopist visit with the referral code. If not available, use first sigmoidoscopy.

H. ED: date of visit to the ED in ‘h’ amount of time.

*Index contact date = earliest of A, B, C, D, E, F, G or H. End point: date of CRC diagnosis.

Figure 3-4: The diagnostic interval is the time from the index contact date to the CRC diagnosis. Figure adapted from Singh and colleagues (22). *Note if index contact is through the ED. See Appendix A, for same figure (within the DCP) with corresponding OHIP and CCI codes. Abbreviations: FOBT fecal occult blood test; CRC colorectal cancer; CT computed tomography; US ultrasound; MD medical doctor; ED emergency department; endoscopist includes: general surgeons, gastroenterologists, internists and family physicians

3.11.3 Using Control Charts to Determine Look-Back Periods for Key CRC Tests

For each key CRC test we used a control chart to identify the point in time when patients’ test encounter frequencies surpassed their background rate (except ED presentation). This indicated the point in time, leading up to the CRC diagnosis, in which tests were more likely to be related to the cancer diagnosis. The use of these charts refined previous approaches to determining the diagnostic interval that used arbitrary look-back windows (22).

Control charts (also called Shewhart charts) were originally developed in 1931 by Walter A. Shewhart for use in industrial quality assurance (147). Control charts are a technique used in statistical process control and aid in the understanding of variation. At a deconstructed level, the charts consist of data points plotted over time, a centre line (mean) and upper and lower control limit lines (commonly 3 standard deviations from the mean) (148). The centre line and control limits are determined from historical data and are then plotted over the time period of interest. Established rules are then used to interpret the charts. Control charts are now used in several disciplines, including health sciences, as a tool to assess variation in data (149,150).

We plotted weekly test counts over two time periods for each of the key CRC tests (see Appendix D for the control charts). Period one consisted of the months immediately preceding the CRC diagnosis (0-18 months) and period two, the background (historical data), was the 18-24 month interval preceding diagnosis. We then compared the counts in period one against the corresponding background rate, which was the average weekly count in period two. The 18-24 month interval was chosen to allow for stable baseline estimates of key CRC test counts. Established rules (151) were then used to interpret the control charts in order to determine the specific point in time at which the test count exceeded the background rate (the data were no longer “in control”). The rules for interpretation were applied by two independent reviewers (the student and a research associate, Li Jiang) (see Appendix E for the rules used to interpret the control charts and Appendix D for the seven control charts). The relevant look-back period for each key CRC test was the interval from the date of CRC diagnosis back to this control-chart determined cut-point.
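The general idea of the control chart approach can be sketched as below. This is a simplified illustration only: the actual interpretation rules used in the study are those listed in Appendix E and were applied by two reviewers, whereas this sketch assumes a single rule (a weekly count above the upper control limit of the background mean plus 3 standard deviations). The function name, the week indexing and the 78-104 week approximation of the 18-24 month background period are assumptions.

```python
from statistics import mean, stdev


def look_back_weeks(weekly_counts, background_start=78, background_end=104, n_sd=3.0):
    """Estimate a look-back period (in weeks) from weekly test counts.

    weekly_counts[k] is the cohort-wide number of tests in the k-th week
    before diagnosis (k = 0 is the week closest to diagnosis). Weeks
    background_start..background_end-1 (roughly 18-24 months before
    diagnosis) form the background period.
    Returns (weeks, centre_line, upper_control_limit).
    """
    background = weekly_counts[background_start:background_end]
    if len(background) < 2:
        raise ValueError("need at least two background weeks")
    centre = mean(background)
    upper_limit = centre + n_sd * stdev(background)

    # Walk back from diagnosis while counts stay above the upper control limit.
    weeks = 0
    for count in weekly_counts[:background_start]:
        if count <= upper_limit:
            break
        weeks += 1
    return weeks, centre, upper_limit


if __name__ == "__main__":
    # Synthetic counts: elevated for the 36 weeks nearest diagnosis, flat before.
    counts = [50] * 36 + [10] * 42 + [9, 11] * 13
    print(look_back_weeks(counts))
```

A single-threshold rule of this kind is more permissive than a full rule set, so it should be read as a way to see the mechanics rather than as a reproduction of the cut-points in Table 3-1.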

Once the relevant look-back periods were determined for each key CRC test (see Table 3-1) we collected all such encounters within that timeframe. For example, we collected all colonoscopy tests from the date of the patient’s CRC diagnosis to 36 weeks prior to their diagnosis.

Table 3-1: Relevant look-back periods and signal strength (%) for the eight key CRC tests determined by control charts

Key CRC test       Relevant look-back period from date of CRC diagnosis (weeks)   Signal strength (%)
Abdominal x-ray    53                                                             86.7
Abdominal CT       60                                                             88.4
Abdominal US       54                                                             63.8
Barium enema       28                                                             91.3
Sigmoidoscopy      32                                                             94.7
Colonoscopy        36                                                             97.1
FOBT               45                                                             68.0
ED presentation    30 days                                                        NA

We calculated the “signal strength” for each key CRC test. This signal strength represented the percentage of encounters that we believed to have been truly CRC-related within the chosen look-back period. A high signal strength (%) provides justification for using a specific cut-off week determined by the control chart exercise. The signal strength percentage was calculated using an equation that parallels the positive predictive value, such that signal strength equals the “true positives” divided by “all positives” multiplied by 100. “All positives” was the total number of encounters for a specific CRC test in the relevant look-back period (weeks) determined from the control chart. “True positives” was calculated by subtracting “true negatives” from “all positives”, where “true negatives” was the mean background rate of a respective CRC test multiplied by the number of weeks included in the look-back period (see Appendix F for full calculations). We considered the signal strength (%) for each test; the low signal strength of abdominal US is further described in the next section.
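The signal strength calculation described above can be written out as a small function. The full calculations are in Appendix F; this sketch only restates the formula given in the text, and the function name, argument names and example numbers are hypothetical.

```python
def signal_strength(all_positives, background_rate_per_week, look_back_weeks):
    """Signal strength (%) = true positives / all positives * 100.

    all_positives: total encounters for a test within its look-back period.
    "True negatives" (the thesis's term) is the mean background rate
    multiplied by the number of weeks in the look-back period.
    """
    if all_positives <= 0:
        raise ValueError("all_positives must be greater than zero")
    true_negatives = background_rate_per_week * look_back_weeks
    true_positives = all_positives - true_negatives
    return 100.0 * true_positives / all_positives


if __name__ == "__main__":
    # Hypothetical numbers: 1,000 encounters, background of 5 tests/week, 40 weeks.
    print(signal_strength(1000, 5, 40))
```

With these hypothetical inputs, 200 of the 1,000 encounters are attributed to background testing. The same arithmetic shows why a high background rate lowers the signal strength for a fixed number of encounters.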

3.11.4 Identification and Retrieval of Missing Index Contact Dates

This sub-section and the following sub-section (3.11.5) were included to fully document the thesis process and to provide an understanding of the complexity and iterative nature of the work.

Upon the initial cut of the dataset, 8,322 patients in the cohort (23%) were missing an index contact date and therefore could not be assigned a diagnostic interval. We compared several key covariates between the “with index contact” group (n=28,307) and the “missing index contact” group (n=8,322). The covariates were: stage distribution, age at diagnosis, sex, and best source. We found that the missing index contact group had a greater proportion of advanced and unknown stage cancer and was slightly older than the with index contact group. This indicated that the index contact dates were not likely missing at random.

A missing index contact date meant that we had not been able to identify any key CRC test(s) for these patients. We investigated two possible explanations for the problem: 1) the relevant look-back periods were too short or 2) not all codes for key CRC tests had been included.

1) Relevant look-back periods

The first step in the process to capture patients missing an index contact date was to ensure that the look-back periods in which key CRC tests were collected were not too restrictive. We compared the relative frequency of key CRC tests in the three months (12 weeks) directly prior to the respective look-back periods in the with index contact group and the missing index contact group. For example, the look-back period for colonoscopy tests was the 0-36 weeks prior to CRC diagnosis; therefore, we compared the number of colonoscopies in weeks 36-48 prior to diagnosis between the two groups. Our hypothesis was that if the relative number of tests in the missing index contact group was greater than in the with index contact group, then our look-back periods were too short. We found no difference in the relative frequencies between the two groups and thus could further trust our control chart method for determining the look-back periods. It became apparent that the underlying issue was the lack of a comprehensive set of codes and logic to pick up key CRC tests.

2) Codes for key CRC tests

Codes for abdominal ultrasounds (US), ED investigations and radiological imaging were further examined based on clues from the literature and results from the initial comparison between the with index contact group and the missing index contact group.

Abdominal US was initially not included as a key CRC test because it had a low signal strength at 63.8% (see Table 3-1). However, upon realizing the problem with missing index contact dates, we revisited the control chart data to take a closer look at the background mean rate of abdominal US tests in comparison to the other key CRC tests. We also created an additional abdominal US control chart using only data from the missing index contact group. The background mean rate of abdominal US tests (86.9 tests/week) was quite high compared to the other key CRC tests (FOBT was also high), which makes sense clinically as abdominal US is an imaging test used for a variety of purposes (source) and FOBT is used for CRC screening. A high mean background rate can contribute to a lower signal strength as it will reduce the numerator in the signal strength equation, yielding a smaller quotient. The abdominal US control chart using only data from the missing index contact group exhibited the characteristic spike near the CRC diagnosis date. With the use of the additional control chart and advice from the clinician consultant we decided to add abdominal US as a key CRC investigation. Of note, abdominal US was also included as a key CRC test in work by Singh and colleagues (22).

We examined how patients who presented to the ED were being captured. The initial proportion of cases with ED presentation was very low (5.5%) compared to what we were anticipating from the literature (approximately 20%) (129). Furthermore, the results from our comparison of the with index contact group and the missing index contact group suggested that the missing index contact group had characteristics of CRC patients who would enter the system through the ED (e.g. higher stage at diagnosis). We had been using an OHIP variable that identified the OHIP claim location to capture the ED presentation group. However, we decided our ED presentation algorithm needed to be more inclusive as we suspected this was another major reason for missing index contacts. We therefore added NACRS ED admissions within 30 days prior to CRC diagnosis when the admission contained specific diagnostic codes (see ED presentation section 3.12.2 for a full description of our final methodology).

Lastly, for radiological imaging tests (abdominal US, abdominal CT, abdominal x-ray and barium enema), we incorporated Canadian Classification of Health Interventions (CCI) codes (145) from CIHI in addition to the OHIP codes because in some institutions radiologists are paid a salary by the hospital and may not make test-related OHIP claims.

After the application of these steps, a new dataset was created and the number of patients missing an index contact date went from 8,322 (23%) to 1,616 (4.4%). We considered this an acceptable number and it was similar to the proportion excluded for this reason in another MSc. thesis from our research group which studied a different cancer site (152).

3.11.5 Refining the Colonoscopy and Sigmoidoscopy Capture

Upon the first cut of the dataset a large proportion of patients had the date of their colonoscopy as their index contact date. This meant that no other relevant look-back data from the date of the colonoscopy were identified in the administrative databases for these cases. This would have shortened the diagnostic interval. The look-back logic for colonoscopy and sigmoidoscopy differed slightly from the other CRC tests (see Figure 3-4) as there was an intermediate visit with the endoscopist. Initially the look-back sequence was to identify the first colonoscopy/sigmoidoscopy, then identify the endoscopist claim number, then identify the referring physician for the endoscopist consult visit to find the patient’s last visit with the referring physician. If the look-back logic to capture the index contact date was not reflective of the true colonoscopy/sigmoidoscopy diagnostic process, it was possible that the index contact date was being assigned to the actual colonoscopy rather than an earlier colonoscopy/sigmoidoscopy-related encounter.

To resolve this issue we examined frequency tables of the number of visits (0, 1 or >1) a patient had with the endoscopist who performed their colonoscopy/sigmoidoscopy in three timeframes (6, 9 and 12 months) preceding the colonoscopy/sigmoidoscopy. We also created a frequency table of the number of visits (0, 1 or >1) a patient had with the endoscopist preceding the colonoscopy/sigmoidoscopy where there was also a referring physician on record (generally an FP). The results did not coincide with our initial understanding of the colonoscopy/sigmoidoscopy diagnostic pathway. In the 6 month period prior to the colonoscopy/sigmoidoscopy, 39.2% of recipients had zero visits with the endoscopist, 52.7% of recipients had one visit with their endoscopist and 8.1% of recipients had more than one visit with their endoscopist. Furthermore, if a patient did in fact have a visit to the endoscopist prior to the test date they almost always had a record of a referring physician. Therefore, it was evident that the problem with our look-back logic was that we were not allowing the query for the initial referring physician if the patient went directly to the colonoscopy/sigmoidoscopy without a consult visit to their endoscopist (39.2%).

We consulted a gastroenterologist who is a member of the Cancer Care and Epidemiology group at the Cancer Research Institute at Queen’s University to help us understand these results. We learned that certain patients are triaged directly to colonoscopy/sigmoidoscopy and do not have an intermediate endoscopy consult clinic visit. This would happen in cases where there was a straightforward referral, for example, a positive FOBT or a mass seen on imaging. The endoscopist may just book these patients directly for a colonoscopy and do the consult at the same time. We modified the look-back logic to allow the search for the referring physician directly from the colonoscopy/sigmoidoscopy and were able to decrease the number of patients who had a colonoscopy without a referral (from approximately 25.9% to 3.1%). This refinement would not have changed the number of patients with/without an index contact date, but it instead improved the accuracy of the index contact assignment and the diagnostic interval length.
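The refined look-back logic for colonoscopy and sigmoidoscopy can be sketched roughly as follows. This is an illustrative reading of the text and of Figure 3-4, not the study's code: the function and argument names, the in-memory dictionary of visits, the 365-day approximation of 12 months and the inclusive boundaries are all assumptions.

```python
from datetime import date, timedelta


def endoscopy_contact_date(procedure_date, procedure_referrer,
                           consult_date, consult_referrer, visits):
    """Candidate contact date for a colonoscopy/sigmoidoscopy pathway.

    visits maps a physician id to the list of dates on which this patient
    saw that physician. The referral on the endoscopist consult is used when
    a consult exists; otherwise the search starts directly from the procedure
    (the refinement described above). If no referring-physician visit is
    found, the procedure date itself is used.
    """
    if consult_date is not None and consult_referrer is not None:
        anchor, referrer = consult_date, consult_referrer
    elif procedure_referrer is not None:
        anchor, referrer = procedure_date, procedure_referrer
    else:
        return procedure_date

    # Assumption: "12 months" approximated as 365 days, boundaries inclusive.
    window_start = anchor - timedelta(days=365)
    prior = [d for d in visits.get(referrer, []) if window_start <= d <= anchor]
    return max(prior) if prior else procedure_date


if __name__ == "__main__":
    fp_visits = {"FP1": [date(2010, 4, 10), date(2010, 5, 20)]}
    # Patient triaged directly to colonoscopy (no endoscopist consult).
    print(endoscopy_contact_date(date(2010, 6, 1), "FP1", None, None, fp_visits))
```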

3.12 Potential Causal Pathway Variables: CRC symptom status, Emergency Department (ED) Presentation and First Test

In review, three important variables have been described so far: rurality (exposure), stage at diagnosis (outcome and stratification variable) and the diagnostic interval (outcome). Potential causal pathway variables are considered next.

A causal pathway variable lies directly on the path between an exposure and outcome. In other words, the outcome could be a direct consequence of the causal pathway variable (see Figure 3-2). It was important to identify possible causal pathway variables to ensure that we did not adjust for them in our models, thereby potentially removing part of the total effect of rurality on the outcome. In objective one (rurality and stage), CRC symptom status and emergency department (ED) presentation were identified as potential causal pathway variables. In objective two (rurality and the diagnostic interval), CRC symptom status, ED presentation and a patient’s first test were identified as potential causal pathway variables; the rationale, definition and derivation for each are described below.

3.12.1 CRC Symptom Status at First Test

Rationale

It is important to distinguish CRC symptom status (asymptomatic or symptomatic) as there are different recommended time frames in which a patient should travel through the diagnostic pathway depending on their symptom status. As described in the literature review, the Canadian Association of Gastroenterology published a consensus statement on appropriate wait times for specialist consultation and procedures that were related to the CRC diagnostic interval (89). The statement indicated that: 1) patients referred to a specialist due to a high likelihood of cancer based on clinical or radiological investigations should be seen and endoscoped, if needed, within two weeks, 2) patients referred to a specialist due to a positive FOBT should be seen and endoscoped, if needed, within two months and 3) patients referred for a screening colonoscopy should be seen and endoscoped, if needed, within six months. Patient triaging by CRC symptom status (asymptomatic versus symptomatic) may play an important role in the chain of encounters and tests leading to a patient’s diagnostic interval. Furthermore, whether a patient is screened or symptomatic may be related to their rurality of residence. It has been documented that there is a lower CRC screening rate (using FOBT) in rural areas versus urban areas in Ontario (153).

Definition and derivation

The CRC symptom status was based on a patient’s first CRC test and was categorized as either asymptomatic or symptomatic. It was likely that the majority of the asymptomatic patients were screening cases, but without specific screening codes it is difficult to know with complete certainty. Of the eight CRC tests, we considered FOBTs, colonoscopies and sigmoidoscopies as possible asymptomatic tests. This decision was based on the ColonCancerCheck screening guidelines (Ontario population-based screening program established in 2008) (57) and expert opinion from two clinicians. The symptomatic category comprised patients whom we believed to be symptomatic at the time of the test. All eight of the tests were considered to be possible tests for evaluation of symptomatic patients.

Assigning symptom status to the first test was a challenge because the same test, such as a colonoscopy, could be used for both asymptomatic and symptomatic cases. To categorize the first test as asymptomatic or symptomatic we revised an algorithm to apply to the cohort based on work by El-Serag and colleagues (23) (see Figure 3-5).


1. First test = abdominal X-ray, abdominal CT, abdominal US or barium enema, OR ED presentation = yes: Symptomatic.

2. First test = sigmoidoscopy or colonoscopy AND ED presentation = no: are any dxcodes 1-16 present in the 1 yr prior to the test date? Yes: Symptomatic. No: Asymptomatic.

3. First test = FOBT (L179) AND ED presentation = no: Asymptomatic.

4. First test = FOBT (L181 or G004) AND ED presentation = no: are any dxcodes 1-16 present from the last encounter with the ordering physician? Yes: Symptomatic. No: Asymptomatic.

Figure 3-5: Algorithm adapted from El-Serag and colleagues (23) to assign CRC symptom status: asymptomatic or symptomatic. The list of accompanying diagnostic (dx) codes follows.

Dxcodes 1) 787: Anorexia, nausea and vomiting, heartburn, dysphagia, hiccough, hematemesis, jaundice, ascites, abdominal pain, melena, masses 2) 280: Iron deficiency anaemia 3) 569: Anal or rectal polyp, rectal prolapse, anal or rectal stricture, rectal bleeding, other disorders of intestine 4) 009: Diarrhea, gastro-enteritis, viral gastro-enteritis 5) 564: Spastic colon, irritable colon, mucous colitis, constipation 6) 455: Haemorrhoids

7) 285: Other anaemias 8) 562: Diverticulitis or diverticulosis of large or small intestine 9) 535: Gastritis 10) 281: Pernicious anaemia 11) 153: Large intestine - excluding rectum 12) 565: Anal fissure, anal fistula 13) 154: Rectum, rectosigmoid and anus 14) 555: Regional enteritis, Crohn's disease 15) 560: Intestinal obstruction, intussusception, paralytic ileus, volvulus, impaction of intestine 16) 556: Ulcerative colitis
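The decision rules in Figure 3-5 can be sketched in code. The following is a minimal illustration only, not the SAS code used in the study: the function name, argument names and lower-case test labels are hypothetical, and the caller is assumed to supply the dx codes from the relevant look-back (one year before an endoscopy, or the last encounter with the ordering physician for FOBT codes L181 or G004).

```python
# Illustrative sketch of the Figure 3-5 rules; all names are hypothetical.
IMAGING_TESTS = {"abdominal x-ray", "abdominal ct", "abdominal us", "barium enema"}
ENDOSCOPY_TESTS = {"colonoscopy", "sigmoidoscopy"}
# The 16 OHIP dx codes listed with Figure 3-5.
SYMPTOM_DXCODES = {"787", "280", "569", "009", "564", "455", "285", "562",
                   "535", "281", "153", "565", "154", "555", "560", "556"}


def symptom_status(first_test, ed_presentation, dxcodes_in_window, fobt_fee_code=None):
    """Return 'symptomatic' or 'asymptomatic' for a patient's first test.

    dxcodes_in_window: dx codes already restricted by the caller to the
    look-back that applies to the test type (an assumption of this sketch).
    """
    has_symptom_code = bool(SYMPTOM_DXCODES & set(dxcodes_in_window))

    # ED presentation or a first imaging test is always symptomatic.
    if ed_presentation or first_test in IMAGING_TESTS:
        return "symptomatic"

    # Endoscopy without ED presentation depends on the dx codes.
    if first_test in ENDOSCOPY_TESTS:
        return "symptomatic" if has_symptom_code else "asymptomatic"

    if first_test == "fobt":
        if fobt_fee_code == "L179":
            return "asymptomatic"
        if fobt_fee_code in {"L181", "G004"}:
            return "symptomatic" if has_symptom_code else "asymptomatic"

    raise ValueError("first test / fee code combination not covered by Figure 3-5")
```

For example, a first colonoscopy without ED presentation and with dx code 280 in the look-back would be classified as symptomatic under this sketch.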

First tests that we considered as indicative of one purpose only were straightforward to categorize: abdominal X-ray, abdominal CT, abdominal US, barium enema and ED presentation were all grouped as having a symptomatic status only. Colonoscopy and sigmoidoscopy could be used for both patients who were asymptomatic and patients who were symptomatic because the OHIP billing codes during most years of the study (2007-2011) were not purpose-specific. However, starting in September of 2011, several new OHIP colonoscopy fee codes were released that were purpose-specific (see Table 3-2).


Table 3-2: OHIP colonoscopy and sigmoidoscopy fee codes used to create the CRC symptom status algorithm

OHIP fee code | Description | Code creation date
Z496 | Colonoscopy-For diagnosis or ongoing management: Presence of signs or symptoms | Sept 1, 2011
Z497 | Colonoscopy-For risk evaluation: Confirmatory colonoscopy (for a positive: FOBT, FIT, sigmoidoscopy, barium enema, CT abdomen/pelvis and CT colonography) | Sept 1, 2011
Z498 | Colonoscopy-For diagnosis or ongoing management: Follow up of an abnormal colonoscopy | Sept 1, 2011
Z499 | Colonoscopy-For risk evaluation: Absence of signs or symptoms, family history associated with an increased risk of malignancy | Sept 1, 2011
Z555 | Colonoscopy-For diagnosis or ongoing management: Absence of signs or symptoms or risk factors, 50 years of age or older | July 1, 1991 (*although we believe that the definition was changed in Sept, 2011)
Z580 | Sigmoidoscopy: Sigmoidoscopy (using 60cm. flexible endoscope) | July 1, 1991

We first examined the frequency of these new OHIP fee codes for all CRC patients from 2007-2012, by year, to ensure that what we had read about in the fee code manual (126) was reflected in our data. Using El-Serag and colleagues’ (23) algorithm as a starting point, we used the new fee codes to help create the algorithm to assign symptom status (asymptomatic versus symptomatic) to colonoscopies and sigmoidoscopies. To make our algorithm we created a 2012 CRC cohort and examined frequency tables of all OHIP diagnostic codes attached to colonoscopies and sigmoidoscopies for this cohort. The diagnostic codes must have been recorded prior to the colonoscopy or sigmoidoscopy test date and ordered from the referring doctor of the patient’s first colonoscopy or sigmoidoscopy (some examples of these diagnostic codes were conditions like gastritis, iron deficiency anaemia and influenza). The idea was that we could use these work-up diagnostic codes to identify a particular group of codes that were indicative of a colonoscopy for a particular status, for example colonoscopy for symptoms versus colonoscopy for asymptomatic cases (screening), and then apply this information to the entire cohort (2007-2012).


We further organized the diagnostic code frequency table outputs into two time periods, the 0-6 months directly prior to CRC diagnosis and the 6-12 months prior to CRC diagnosis. We examined six separate frequency tables (one for each new colonoscopy/sigmoidoscopy code, see Table 3-2). In each frequency table we highlighted diagnostic codes that had markedly different absolute numbers in the 0-6 month column compared to the 6-12 month column. This helped to identify diagnostic codes that were likely related to the CRC. Next we plotted the frequencies of these highlighted diagnostic codes from one particular colonoscopy fee code against a different colonoscopy fee code, for example colonoscopy for symptoms (Z496) versus colonoscopy for screening (Z499). With input from two clinicians and the use of our graphs we were able to first eliminate diagnostic codes that were not relevant to CRC, and second, create a list of diagnostic codes that were unique to a purpose. We then compared our diagnostic code list to El-Serag and colleagues’ (23), refined their algorithm and reviewed it with the two clinicians. In summary, using logic from El-Serag et al. (23) we revised an algorithm to classify the symptom status for diagnostic investigations as asymptomatic or symptomatic (see Figure 3-5).
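The highlighting step described above can be illustrated with a short sketch. This is an assumption-laden example rather than the procedure actually run at ICES: the function name and the numeric threshold are hypothetical, since the text describes the differences only as "markedly different" and the final selection relied on clinician review.

```python
from collections import Counter


def flag_codes(codes_0_6_months, codes_6_12_months, min_abs_diff=50):
    """Flag dx codes whose counts differ markedly between the two windows.

    min_abs_diff is a hypothetical cut-off chosen for illustration only.
    Returns {code: (count in 0-6 months, count in 6-12 months)}.
    """
    recent = Counter(codes_0_6_months)
    earlier = Counter(codes_6_12_months)
    flagged = {}
    for code in set(recent) | set(earlier):
        if abs(recent[code] - earlier[code]) >= min_abs_diff:
            flagged[code] = (recent[code], earlier[code])
    return flagged
```

Codes flagged in this way would still need clinical review, as in the study, before being treated as CRC-related.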

3.12.2 Emergency Department (ED) Presentation

Rationale

It was important to be able to consider ED presentation as a causal pathway variable because CRC cases diagnosed through an ED have shorter diagnostic intervals (22) and ED presentation may occur more often in rural areas. These cases generally represent a more complicated group of patients with worse cancer outcomes (154) including: advanced stage at diagnosis, higher mortality and longer length of in-hospital stay compared to non-emergency room cases (155). We expected that approximately 20% of the CRC cases (129) would have an ED presentation as their first indication of CRC. This is a higher percentage than seen in some other cancer sites (for example breast) and meant that excluding these ED cases was not a suitable option. The diagnostic process is much more accelerated for ED patients as symptoms may be severe, warranting immediate tests, and/or these patients receive expedited referral to colonoscopy or surgery. Furthermore, ED services may be used more frequently and differently in rural areas where there could be less access to non-emergency healthcare services.

Definition and derivation

A patient was considered as an ED presentation case if 1) the index contact with the health care system had ED for the OHIP variable “location” or 2) if the index contact was through an ED admission record in the NACRS database coupled with a specific ED diagnostic code (from a list we developed) within 30 days prior to the CRC diagnosis date. The list for CRC-relevant NACRS ED diagnostic codes was created by examining a frequency table of all ED diagnostic codes from the entire cohort divided into different time periods. We compared the absolute and relative difference of the frequencies in the 0-6 months prior to CRC diagnosis and the 18-24 months prior to CRC diagnosis in a subset of this list. After highlighting the ED diagnostic codes that had a large difference in frequencies between the two time periods, the list was sent to the clinician consultant on the project to examine for face validity and to provide insight. The final list of CRC-relevant NACRS ED dxcodes contained 53 dxcodes and can be found in Appendix A (within the DCP).

In some areas, patients may have had a scheduled appointment with the doctor on call in the emergency department. This could be particularly relevant in rural settings, and we felt that these patients should not be considered as true emergency department cases. The NACRS database variable “schededvisit” was used to identify these patients and exclude them from the ED presentation group.
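The ED presentation definition can be sketched as a small classification function. This is a hedged illustration: the function and argument names are hypothetical, the 30-day window is assumed to be inclusive, and the scheduled-visit exclusion is assumed to apply only to the NACRS route because “schededvisit” is a NACRS variable.

```python
from datetime import date


def is_ed_presentation(index_source, ohip_location=None, nacrs_dxcode=None,
                       visit_date=None, diagnosis_date=None,
                       scheduled_visit=False, relevant_dxcodes=frozenset()):
    """Classify an index contact as an ED presentation (illustrative only)."""
    # Route 1: OHIP index contact whose "location" variable indicates the ED.
    if index_source == "OHIP":
        return ohip_location == "ED"

    # Route 2: NACRS ED record with a CRC-relevant dx code within the
    # 30 days before the CRC diagnosis date, excluding scheduled ED visits.
    if index_source == "NACRS":
        if scheduled_visit:
            return False
        days_before = (diagnosis_date - visit_date).days
        return nacrs_dxcode in relevant_dxcodes and 0 <= days_before <= 30

    return False
```

Here `relevant_dxcodes` stands in for the 53-code list in Appendix A, which is not reproduced in this section.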

3.12.3 First Test

Rationale

Previous studies have found that CRC wait times were longer if a patient’s first test was radiological imaging (22). The researchers postulated that this was due to a perceived notion among FPs that abdominal radiological imaging tests were faster and easier to obtain than a colonoscopy or sigmoidoscopy. However, if CRC is suspected from the radiological imaging tests, most patients will then go on to have lower GI endoscopy for confirmation, which adds time to the diagnostic interval. We therefore thought that it would be important to describe this causal pathway variable because having a particular type of CRC test first might have direct consequences on the diagnostic interval (22) and the first test distribution might vary by rurality.

Definition and derivation

A patient’s first test was the CRC-related test with the earliest date, within the respective look-back windows, in a patient’s diagnostic pathway (refer back to Figure 3-3). Options for first test were: 1) FOBT, 2) colonoscopy, 3) sigmoidoscopy, 4) barium enema, 5) abdominal CT, 6) abdominal X-ray, 7) abdominal US and 8) ED NOS (emergency department not otherwise specified). If there was more than one test on the earliest date for a patient a hierarchy was used to assign first test. This was particularly relevant for patients that were ED presentation cases. For example, if on the earliest date that a patient had tests, they presented to the ED and then had a subsequent abdominal x-ray on the same day, their first test would be abdominal x-ray and the patient would also be assigned as an ED presentation case. In another scenario, if a patient presented to the ED and there were no other tests found on that day (or earlier) the patient’s first test would be ED NOS and they would also be assigned as an ED presentation case.
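The same-day hierarchy can be sketched as follows. The full ordering of the hierarchy is not given in this section; the text confirms only that a same-day test outranks ED NOS. The order among the other tests below is therefore a hypothetical placeholder, as are the function and label names.

```python
from datetime import date

# Hypothetical priority order (highest first). Only the position of "ed nos"
# at the bottom is supported by the text; the rest is an assumption.
HIERARCHY = ["colonoscopy", "sigmoidoscopy", "barium enema", "abdominal ct",
             "abdominal us", "abdominal x-ray", "fobt", "ed nos"]


def assign_first_test(events):
    """events: list of (date, test_label) within the look-back window.

    Returns (first_test, ed_on_earliest_date).
    """
    earliest = min(event_date for event_date, _ in events)
    same_day = [label for event_date, label in events if event_date == earliest]
    # Pick the highest-priority test recorded on the earliest date.
    first_test = min(same_day, key=HIERARCHY.index)
    # An ED record on the earliest date marks an ED presentation case.
    return first_test, "ed nos" in same_day
```

With an ED visit and an abdominal x-ray on the same earliest date, the sketch returns the x-ray as the first test and flags the patient as an ED presentation case, matching the example in the text.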

3.13 Covariates- Rationale for Inclusion and Variable Description

We were able to control for several potential confounders in this study. Covariate selection was based on previous literature, with a focus on variables that were associated with the diagnostic interval (the main study outcome). A short rationale for covariate inclusion is found in Table 3-3; variable description and derivation follow.


Table 3-3: Brief rationale for covariate inclusion

Covariate | Rationale for Inclusion
Age at diagnosis | The average age tends to be greater in rural areas (123) and CRC incidence is greater in older people. More than 90% of new CRC cases occur in those aged 50 years and greater (156). The total delay interval and referral interval in CRC has been documented to be longer in younger people (157).
Sex | Females have been found to have a longer referral delay (157).
Deprivation Quintile | Rural areas generally have lower measures of SES than urban areas in Canada (19). However, there is conflicting evidence as to whether SES is associated with cancer stage at diagnosis or time intervals in the CRC diagnostic/treatment pathways (22,116,117,158,159). Varying results are likely attributed, in part, to: the wide array of measures used to quantify SES, differences in variable level collection (many use area-level scores derived from census data versus individual-level data) and differences in health care system coverage (universal versus private) between countries (117).
Recent immigrant status | There will be a greater proportion of recent immigrants living in urban areas in Canada as compared to rural areas (160). In Canada, recent immigrants are less likely to be up to date with CRC screening (161), this may in turn decrease the diagnostic interval (although the overall effect would likely be very small). Contrarily, it is possible that recent immigrants may have a more difficult time navigating the health care system, which may increase the diagnostic interval.
Comorbidity | In general, those living in rural Canada are less healthy than their urban counterparts (19). Those living in the most rural areas were disadvantaged in several health indicators including cardiovascular disease and diabetes, although there was a lower incidence of cancer (19). In a Canadian population-based study, the overall wait time for CRC patients (to be diagnosed and treated) was longer with multiple comorbidities (22). Patients with chronic heart failure, coronary artery disease and diabetes had a significantly longer median wait time to colonoscopy than patients without those comorbidities (67) and in a similar American study, chronic heart failure and coronary artery disease were significantly associated with missed opportunities (75). Medical comorbidity was also identified as a factor associated with diagnostic delay (77).
CRC sub-site | There are few studies examining the relationship between rurality and CRC sub-site distribution. Two older studies from France found no significant difference in the distribution of CRC sub-sites by rural/urban locale (109,162), however, the effect of environmental factors (such as diet) on CRC incidence may be different by CRC sub-site (163), which could possibly be linked to rural/urban living. Symptoms of CRC can vary by anatomical sub-site of the tumour (56), for example, having blood in bowel movements is often considered as an alarm symptom (164) and is more typical of distal colon and rectal cancers versus proximal colon cancer. Several studies have suggested that there is a tendency for proximal colon cancers to be diagnosed at a more advanced stage compared to distal colon or rectal cancers (63,165,166). It is unclear if the diagnostic interval is influenced by anatomical sub-site. One study found a trend of increasing age the more proximal a tumour was located and explained this finding by a diagnostic delay in proximal colon cancers due to less alarming symptoms (167).


Age at diagnosis and sex: The RPDB was used to assign age at diagnosis and sex. Age was grouped into six categories after preliminary analysis ruled out age as a continuous variable because the assumption of a linear relationship with the diagnostic interval was not met (see Appendix G for the linear assumption test).

Deprivation quintile: Deprivation quintile was used as a measure of neighbourhood-level socioeconomic status (SES). Deprivation quintile was created from the material deprivation index, which is a dimension of the Ontario Marginalization Index (ON-Marg), the provincial version of the Canadian Marginalization Index (CAN-Marg)(168). The material deprivation index is based on scores from six indicators collected at the dissemination area (DA)-level of the 2006 Canadian Census. A DA is the smallest geographic area at which all census information is disseminated and is comprised of 400-700 people in one or more neighbouring dissemination blocks (168,169). The six indicators included in the material deprivation index are: the percentage of people aged 25+ without a certificate, diploma or degree, the percentage of lone parent families, the percentage of people receiving government transfer payments, the percentage of people aged 15+ who are unemployed, the percentage of people living below the low income cut-off and the percentage of homes needing major repair (168,170). Scores have been standardized to create a mean of 0 and a standard deviation of 1 when the full CAN-Marg dataset is used.

Quintiles are then created by ranking scores and grouping them into five groups such that each group contains a fifth of the scores (each score characterizes a DA). Quintile 1 represents the least deprived DAs while Quintile 5 represents the most deprived DAs. To create this variable each patient in our dataset was assigned their respective DA using an ICES macro based on postal codes. Then the deprivation quintile was assigned for each patient using the ICES electronic look-up table (2006) for material deprivation quintile at the DA-level(169). In terms of validation, the analysis for this measure has been repeated for the 2001 and 2006 Census at the DA-level and was found to be reliable and stable (171). Details on the analysis used to create ON-Marg can be found in Matheson and colleagues (170).
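The ranking-and-grouping step that produces quintiles can be illustrated briefly. In this study the quintile was taken from the ICES look-up table rather than computed, so the sketch below only illustrates the general idea; the function name is hypothetical, and it assumes that a higher score means greater deprivation and that ties are broken arbitrarily.

```python
def assign_quintiles(scores):
    """scores: dict mapping a DA identifier to its deprivation score.

    Returns a dict mapping each DA to a quintile from 1 (least deprived)
    to 5 (most deprived), each holding roughly a fifth of the DAs.
    """
    ranked = sorted(scores, key=scores.get)  # lowest score first
    n = len(ranked)
    return {da: (position * 5) // n + 1 for position, da in enumerate(ranked)}
```

With five DAs, for instance, each DA falls into its own quintile in order of increasing score.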


Recent Immigrant Status: There is no specific variable for immigrant status in the databases we used. As a proxy for immigrant status, we considered all patients who had registered with the provincial health insurance plan (OHIP) within the 10 years prior to their CRC diagnosis as recent immigrants to Canada. Registration date with OHIP as a proxy for immigrant status was the same method used by Lofters et al. (172) in a study using similar databases. Lofters et al. (172) stated that over 70% of those who registered with OHIP 5-10 years before the study period were immigrants based on Census data. The remaining 30% would have been made up of new registrants moving to Ontario from other Canadian provinces (of whom a certain number would also be immigrants). We considered the accuracy of this variable adequate for use as a covariate and we applied Lofters and colleagues’ (172) approach to create a dichotomous (yes/no) recent immigrant status variable from the RPDB.

Comorbidities: Aggregated Diagnosis Groups (ADGs) from the Johns Hopkins Adjusted Clinical Group (ACG) Case-Mix System were used as a measure of comorbidities. This system is a method of categorizing illnesses based on diagnosis clusters to predict resource utilization (173). Each ICD-9/10 code (which indicates a patient’s specific disease or condition), within a certain time period, is assigned to one of 32 ADGs based on five factors that include: duration of the condition, severity of the condition, diagnostic certainty, etiology of the condition and specialty care involvement (a single patient can have between 0 and 32 ADGs)(174). It is important to note that patients with very different conditions can be in the same ADG as they are not disease-specific. With permission from the Johns Hopkins ACG software administrators, a macro at ICES (175) obtained patients’ ICD codes from the OHIP and CIHI-DAD databases within the 12-24 month period before CRC diagnosis date and assigned patients’ ADGs; a one or two year collection time period is typical (173,174,176,177). There are multiple comorbidity indices in use, for example, the Charlson Comorbidity Index, the Elixhauser Comorbidity Index (for hospitalized patients) and the ACG/ADG System. The ADG comorbidity measurement was the best choice for our study purposes because it is designed to predict health services resource utilization (173,174), is efficient to compute through ICES databases and has been validated in the Canadian population (174,176). After preliminary analyses (frequency distributions), we created two continuous variables from the ADG data: number of major comorbidities (173) (which included the adult major ADGs: 3, 4, 9, 11, 16, 22, 25, 32) and number of minor comorbidities (which included ADGs: 1, 2, 5-8, 10-14, 17, 18, 20, 21, 23, 24, 26-31, 33, 34). See the Johns Hopkins ACG System Technical Reference Guide (173) for a list of ADGs and common conditions assigned to them. The linearity assumption was met for use of the continuous form in our stage (early versus late) and diagnostic interval analyses (see Appendix H and G).

CRC sub-site: CCO defines four sub-site groupings of CRCs: the proximal (right) colon, the distal (left) colon, the rectum and other/not otherwise specified (178). These definitions and ICD-9 codes from the OCR were used to determine CRC sub-site. The grouping scheme is detailed below in Table 3-4; refer back to Figure 2-1 in the Literature Review Chapter for an anatomical image.

Table 3-4: CRC sub-site grouping scheme

ICD-9 code | CRC sub-site | CRC sub-site group
153.4 | Cecum | Proximal colon
153.6 | Ascending colon | Proximal colon
153.0 | Hepatic flexure | Proximal colon
153.1 | Transverse colon | Proximal colon
153.7 | Splenic flexure | Proximal colon
153.2 | Descending colon | Distal colon
153.3 | Sigmoid colon | Distal colon
154.0 | Rectosigmoid junction | Rectum
154.1 | Rectum | Rectum
153.8 | Other specified sites of large intestine | Unspecified CRC
153.9 | Colon unspecified | Unspecified CRC
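The grouping scheme in Table 3-4 is essentially a look-up from ICD-9 code to sub-site group. A minimal sketch follows; the dictionary and function names are hypothetical, and the assignment of each code to a group follows our reading of the table layout.

```python
# ICD-9 topography code -> CRC sub-site group, following Table 3-4.
SUBSITE_GROUP = {
    "153.4": "Proximal colon",   # cecum
    "153.6": "Proximal colon",   # ascending colon
    "153.0": "Proximal colon",   # hepatic flexure
    "153.1": "Proximal colon",   # transverse colon
    "153.7": "Proximal colon",   # splenic flexure
    "153.2": "Distal colon",     # descending colon
    "153.3": "Distal colon",     # sigmoid colon
    "154.0": "Rectum",           # rectosigmoid junction
    "154.1": "Rectum",           # rectum
    "153.8": "Unspecified CRC",  # other specified sites of large intestine
    "153.9": "Unspecified CRC",  # colon unspecified
}


def subsite_group(icd9_code):
    """Return the sub-site group; raises KeyError for codes outside the table."""
    return SUBSITE_GROUP[icd9_code]
```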

3.14 Code Determination Summary

Administrative billing and procedure codes were a central component to this project. Several coding systems from multiple data sources were used. One of the challenges of this project was ensuring that the correct coding was used to describe the intended group of patients. A summary table (Table 3-5) is provided below outlining: several scenarios in which specific codes were needed, the system of coding that was used, the relevant ICES database(s), and the key resources we consulted to inform our variable definitions and derivations.


Table 3-5: Code determination summary

Group/item to identify | Coding system | ICES database(s) | Key Resources Consulted
CRC patients | ICD-9 | OCR | -Singh and colleagues (22) -ICES-derived CRC screening algorithm (179) -Clinician consultant
Colorectal adenocarcinoma (histological classification) | ICD-O | OCR | -Stewart and colleagues (32) -International Classification of Diseases for Oncology 2nd edition (180) -WHO histological classification of tumours of the colon and rectum (181)
Key CRC tests | OHIP fee codes, CCI codes | OHIP, CIHI-DAD, NACRS | -Singh and colleagues (22) -Toronto Community Health Profiles, prevention variable definitions (182) -Clinician consultant -Ontario MOHLTC Schedule of Benefits for Physician Services (126) -Ontario MOHLTC Schedule of Benefits for Laboratory Services (183) -ICES-derived CRC screening algorithm (179) -CCI, volume three (145) -Clinician consultant, FOBT codes specifically -Ontario MOHLTC Bulletin #4471 (184) (ColonCancerCheck Fecal Occult Blood Testing) -FOBT code frequency tables and timing relative to one another
CRC symptom status | OHIP dx code | OHIP | -El-Serag and colleagues (23) -Clinician consultants
ED-presentation | ICD-10 | OHIP, NACRS | -Singh and colleagues (22) -Rabeneck and colleagues (129) -Clinician consultant


3.15 Summary of Variables

Table 3-6 lists all study variables as well as the type of data, data source from ICES and the analysis format.

Table 3-6: Summary of study variables

Variable | Type | Source | Analysis Format
Rurality | Categorical | RPDB, OCR | RIO score: 0-9 (least rural), 10-30, 31-45, 46-55, 56-75, 76+ (most rural)
Stage at diagnosis | Categorical | OCR | Stage I, Stage II, Stage III, Stage IV
Stage known/unknown | Dichotomous | OCR | Known, Unknown
Diagnostic Interval | Continuous | OHIP, OCR, CIHI-DAD, NACRS | Days

Potential Causal Pathway variables
Symptom status | Dichotomous | OHIP, NACRS, Algorithm | Asymptomatic, Symptomatic
ED presentation | Dichotomous | OHIP, NACRS | Yes, No
First test | Categorical | OHIP, NACRS | Colonoscopy, FOBT, Abdominal US, Abdominal X-ray, Abdominal CT, ED NOS, Sigmoidoscopy, Barium enema

Covariates
Age at diagnosis (years) | Categorical | RPDB | <45, 45-54, 55-64, 65-74, 75-84, 85+
Sex | Dichotomous | RPDB | Male, Female
Deprivation quintile | Categorical | RPDB, OCR, ICES macro | Deprivation Quintile: 1 (least deprived), 2, 3, 4, 5 (most deprived)
Recent Immigrant Status | Dichotomous | RPDB | Yes, No
Major Comorbidities | Continuous | OHIP, CIHI-DAD, ICES macro | Number of major aggregated diagnosis groups (ADGs)
Minor Comorbidities | Continuous | OHIP, CIHI-DAD, ICES macro | Number of minor aggregated diagnosis groups (ADGs)
CRC Sub-site | Categorical | OCR | Proximal, Distal, Rectal, Unspecified CRC

3.16 Statistical Analysis

The study used both descriptive and multivariable analyses to accomplish the thesis objectives. Statistical analyses were performed using the program SAS version 9.3 (SAS Institute Inc., Cary, North Carolina) at the ICES Health Services Research Facility at Queen’s University.

Descriptive statistics for all study variables were first analyzed to understand baseline characteristics of the cohort. Measures of central tendency and distribution, including the mean, standard deviation, median and interquartile range were generated for continuous variables while frequency tables were created for categorical variables. The conventional two-sided alpha (α) of 0.05 was used to ascertain significance throughout. We also considered clinical significance when interpreting our results; details are in the Results Chapter.

3.16.1 Cohort Description and Representativeness

The CRC cohort was described by RIO group and the chi-square test for independence was calculated to test for differences in proportions amongst a variable across RIO groups (the Cochran-Armitage trend test for linear association was also calculated for ordinal dichotomous variables across RIO groups). The chi-square test for independence was used for categorical variables and the Kruskal-Wallis test (non-parametric analysis of variance), which compares median values across categories, was used for the continuous variable, median diagnostic interval. Excluded cases from the eligible cohort (missing index contact date or RIO score) were compared to the final cohort to assess cohort representativeness and to ensure that there were no systematic differences between the main study variables.

3.16.2 Objective One

The CRC stage distribution (I-IV) across RIO groups was displayed graphically with a stacked bar graph and a chi-square test for independence. A multivariable logistic regression was performed to estimate the adjusted odds ratios (OR) of having a late stage (III, IV) cancer versus an early stage cancer (I, II) for the six RIO groups as well as study covariates. Unadjusted (bivariate) ORs were also calculated. Confounder selection was not performed as there was a large enough sample size to control for all available potential confounders.
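To illustrate what an unadjusted OR represents, the sketch below computes the odds ratio for late versus early stage in one RIO group relative to a reference group from a 2x2 table, with a Woolf (log-based) 95% confidence interval. This is a generic textbook calculation and not the SAS procedure used in the study; the function name and the example counts are hypothetical.

```python
from math import exp, log, sqrt


def unadjusted_odds_ratio(late_exposed, early_exposed, late_reference, early_reference):
    """Return (OR, lower 95% limit, upper 95% limit) from 2x2 cell counts."""
    odds_ratio = (late_exposed * early_reference) / (early_exposed * late_reference)
    # Woolf standard error of log(OR); requires all four cells to be non-zero.
    se_log_or = sqrt(1 / late_exposed + 1 / early_exposed
                     + 1 / late_reference + 1 / early_reference)
    lower = exp(log(odds_ratio) - 1.96 * se_log_or)
    upper = exp(log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only (not study data).
or_estimate, ci_lower, ci_upper = unadjusted_odds_ratio(20, 80, 10, 90)
```

An interval that includes 1 would be read as no significant difference in the odds of late stage between the two groups at α=0.05.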

The linearity assumption was tested for the two continuous variables in the model by plotting the beta estimates/coefficients from the bivariate model (see Appendix H). The variables were kept as continuous measures as the linearity assumption was met.

Possible causal pathway variable associations with stage were identified through our understanding of the literature (8,22); these variables were not included in the multivariable model. The distributions of the potential causal pathway variables by RIO group were reported and the chi-square test for independence was calculated. The association between stage (I-IV) and RIO group stratified by the causal pathway variables was also tested with the chi-square test for independence.

3.16.3 Objective Two

The diagnostic interval distribution (days) by RIO group and other study covariates was first analyzed by percentiles (25th, 50th, 75th and 90th) and the Kruskal-Wallis test. The diagnostic interval distribution by RIO group was then displayed graphically stratified by stage (I-IV and stage unknown) with side-by-side box plots and the Kruskal-Wallis test. For the multivariable analysis, quantile regression was performed at the 50th (median) and 90th percentiles for adjusted and unadjusted models stratified by stage, creating a total of 10 adjusted models (5 stage groups x 2 different percentiles) and 10 unadjusted models. Median quantile regression was chosen because the distribution of the diagnostic interval is known to be right-skewed from previous literature (9,67,101), and studying the 90th percentile allowed for the characterization of patients who took the longest to be diagnosed. The adjusted difference in days in the median and 90th percentile diagnostic intervals with 95% CI by RIO group stratified by stage was the principal interest of the objective. Confounder selection was not performed as there was a large enough sample size to control for all available potential confounders.
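Quantile regression estimates a chosen percentile by minimizing an asymmetric absolute loss (the "check" loss) rather than squared error. The sketch below shows this idea for the simplest case with no covariates, where the minimizer is the sample quantile itself. It is an illustration of the loss function only, not the SAS procedure used for the study models; the function names and the diagnostic interval values are hypothetical.

```python
def check_loss(residual, tau):
    """Asymmetric absolute loss used in quantile regression."""
    return residual * (tau - (1 if residual < 0 else 0))


def total_loss(values, candidate, tau):
    return sum(check_loss(value - candidate, tau) for value in values)


def quantile_by_check_loss(values, tau):
    """Return the observed value that minimizes the total check loss."""
    return min(sorted(set(values)), key=lambda c: total_loss(values, c, tau))


# Hypothetical right-skewed diagnostic intervals in days (not study data).
intervals = [5, 10, 20, 64, 300]
median_days = quantile_by_check_loss(intervals, 0.5)
p90_days = quantile_by_check_loss(intervals, 0.9)
```

Because the loss grows linearly rather than quadratically, a single very long interval has limited influence on the median estimate, which is why the approach suits right-skewed outcomes.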

The linearity assumption was tested for the continuous variables in the model by plotting the diagnostic interval distribution with side-by-side boxplots for each variable (see Appendix G). Variables were kept as a continuous measure if the linearity assumption was met. If variables did not meet the linearity assumption they were categorized based on the diagnostic interval distribution (side-by-side boxplots).

Potential causal pathway variable associations with the diagnostic interval were identified through our understanding of the literature (8,22); these variables were not included in the multivariable model. The distributions of the potential causal pathway variables by RIO group were reported and the chi-square test for independence and Cochran-Armitage trend test (where appropriate) were calculated. The association between rurality and the median diagnostic interval stratified by the causal pathway variables was also tested with the Kruskal-Wallis test.

3.16.4 Regression Diagnostics

Basic regression diagnostics were performed for the fully adjusted multivariable models in both objectives. In objective one, three types of diagnostic regression statistics were assessed graphically for the logistic regression model as well as a Hosmer and Lemeshow goodness-of-fit test. For objective two we did not think there would be issues of model fit due to the robustness of quantile regression (185); however, we created histograms of the standardized residuals for each of the 10 adjusted models.


3.17 Minimum Detectable Effect

Calculations for the minimum detectable effect were completed for the study design component of the thesis proposal. Calculations were based on a log-transformed linear regression rearrangement of the difference in means formula for required sample size from Kelsey and colleagues (186) (since there is not an established method to calculate minimum detectable median differences). The formula was solved at four levels of study power (60%, 80%, 90% and 95%) for a two-tailed test with alpha (α=0.05). At 80% power, we estimated a minimum detectable difference of 1.2 days.

The following assumptions were made for the required variables: 1) the standard deviation of the log-transformed diagnostic interval was estimated to be 1.18 using values from a study by Singh and colleagues in Manitoba (22); 2) unexposed individuals were considered as CRC patients living in CSDs with RIO scores from 10-30 from 2007-2011 (estimated as 4,558 cases) and the exposed individuals were considered as CRC patients living in CSDs with RIO scores of 76+ from 2007-2011 (estimated as 265 cases); CSD populations were from Statistics Canada (2011) (187); 3) the number of expected CRC cases was estimated using an incidence rate of 58/100,000 for men and 39/100,000 for women (156); note that this assumed that the CRC incidence rate was uniform across exposed and unexposed individuals.
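A rough sketch of this kind of calculation is given below. It assumes the standard two-sample difference-in-means rearrangement with unequal group sizes; the exact formulation from Kelsey and colleagues (186) is not reproduced in this section, so the function below is an approximation under that assumption and its name is hypothetical. The inputs are the standard deviation and group sizes stated above.

```python
from math import exp, sqrt
from statistics import NormalDist


def minimum_detectable_difference(sd, n_unexposed, n_exposed, power=0.80, alpha=0.05):
    """Smallest detectable difference in means for a two-tailed test.

    With a log-transformed outcome, the result is on the log scale.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return (z_alpha + z_beta) * sd * sqrt(1 / n_unexposed + 1 / n_exposed)


# Inputs taken from the assumptions listed above.
log_scale_difference = minimum_detectable_difference(1.18, 4558, 265, power=0.80)
ratio_scale_difference = exp(log_scale_difference)  # ratio of geometric means
```

Because the outcome is log-transformed, exponentiating the result expresses it as a ratio between the two groups rather than as an absolute number of days.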

The sample size of our final cohort was smaller than expected considering exclusions and missing the rectosigmoid junction cases. However, estimates were based on a CRC cohort from 2007-2011 and by the time of the data cut we were able to also include early 2012 CRC cases. Our original estimates (described above) were similar to the values observed (unexposed=4,945, exposed=329), and the sample was sufficient to detect our clinically meaningful difference (further described in the results section).

3.18 Ethical Considerations

The thesis project proposal was approved by the Health Sciences Research Ethics Board at Queen’s University (see Appendix I) as well as the ICES institutional review board at Sunnybrook Health Sciences Centre, Toronto, Canada. There was no direct patient contact in this research project and individual patients could not be identified in the administrative databases held at ICES. The datasets were linked using unique encoded identifiers (IKN) and analyzed at ICES. The student received privacy training and signed a confidentiality agreement in order to access the ICES data. Confidentiality was upheld by complying with all ICES policies and procedures. The senior analyst at ICES Queen’s created a dataset for the student to analyze from the dataset creation plan written by the student. Data were kept in a secure location at all times.

This study was supported by the Institute for Clinical Evaluative Sciences (ICES), which is funded by an annual grant from the Ontario Ministry of Health and Long-Term Care (MOHLTC). The opinions, results and conclusions reported in this paper are those of the authors and are independent from the funding sources. No endorsement by ICES or the Ontario MOHLTC is intended or should be inferred.

Parts of this material are based on data and information compiled and provided by CIHI.

However, the analyses, conclusions, opinions and statements expressed herein are those of the author, and not necessarily those of CIHI.


Chapter 4

Results

4.1 Overview

The CRC (colorectal cancer) cohort description and multivariable analyses are presented in this chapter. The chapter is divided into four principal sections: the CRC cohort, objective one, objective two and regression diagnostics.

The CRC cohort section is made up of a cohort selection flow chart, a description of the cohort by Rurality Index for Ontario (RIO) groups and a description of the cohort representativeness (comparison of the variable distributions between the final cohort and the eligible cohort).

Objective one results include a description of the stage distribution by RIO groups, results from a multivariable logistic regression model with late versus early stage CRC as the outcome and RIO group as the exposure, and a description of the stage distribution by RIO group stratified by potential causal pathway variables. Proportional differences of at least 5% in the bivariate results are highlighted. This study was often statistically powered to detect smaller, less meaningful differences.

Objective two results contain sub-sections describing the overall CRC cohort diagnostic interval, the length of the diagnostic interval across RIO groups stratified by stage, results from multivariable quantile regression models performed at the median/50th and 90th percentiles of the diagnostic interval, stratified by stage, with RIO group as the exposure, and a description of the diagnostic interval distribution by RIO group stratified by potential causal pathway variables. Differences in the diagnostic interval of at least 12 days in the bivariate results are highlighted as this study was often statistically powered to detect smaller, less meaningful differences.
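The quantile regression models mentioned above target a chosen percentile of the diagnostic interval rather than its mean. As a minimal sketch of the underlying idea, the snippet below shows that the value minimizing the "check" (pinball) loss at tau = 0.5 or tau = 0.9 is the sample median or 90th percentile. The data are hypothetical, chosen only to be right-skewed, and the function names are illustrative; this is not the thesis's modelling code.

```python
def pinball_loss(candidate, values, tau):
    """Total check (pinball) loss of `candidate` as the tau-th quantile of `values`."""
    return sum(
        tau * (y - candidate) if y >= candidate else (1 - tau) * (candidate - y)
        for y in values
    )


def best_quantile(values, tau):
    """Return the observed value that minimizes the pinball loss for quantile tau."""
    return min(values, key=lambda candidate: pinball_loss(candidate, values, tau))


# Hypothetical right-skewed diagnostic intervals in days (NOT study data).
hypothetical_intervals = [1, 5, 12, 22, 40, 64, 90, 120, 159, 288, 700]

median_days = best_quantile(hypothetical_intervals, 0.5)
p90_days = best_quantile(hypothetical_intervals, 0.9)
print(median_days, p90_days)
```

A quantile regression model applies the same loss, but minimizes it over regression coefficients rather than over a single constant, which is why it suits a skewed outcome such as the diagnostic interval.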

The final section of this chapter presents results from some basic regression diagnostic tests for the multivariable models.


4.2 The CRC Cohort

4.2.1 CRC Cohort Selection

Figure 4-1 outlines the CRC cohort selection process described in the Methods Chapter (3.5). There were 37,704 patients diagnosed with a first primary CRC between Jan 1, 2007 and Dec 31, 2012 who had a valid ICES key number (IKN). Nine exclusions were applied to this group to create an eligible study cohort of 31,558 patients. Two exclusions accounted for 83% of these excluded cases: having a CRC diagnosis date after May 31, 2012 (excluded due to incomplete stage data after that date) and having a non-adenocarcinoma CRC. Our final study cohort contained 89% of the eligible cohort (n=27,942), with 1,189 cases excluded from the eligible cohort due to missing data needed to define the diagnostic interval, 287 cases excluded from the eligible cohort because of a missing RIO score and 2,140 cases with a rectosigmoid junction cancer that belonged in the eligible cohort but were not included due to a coding oversight. In the following sub-section, the final study cohort will be described by RIO group.


Patients who met the inclusion criteria (n=37,704):
• First primary CRC diagnosis in the Ontario Cancer Registry (ICD-9 codes: 153.0, 153.1, 153.2, 153.3, 153.4, 153.6, 153.7, 153.8, 153.9, 154.1, 159.0)
• Diagnosed between Jan 1, 2007 and Dec 31, 2012 (inclusive)
• Valid ICES key number

Excluded patients:
• Dead upon diagnosis (death certificate only) (n=134)
• ≤18 or ≥105 years old at diagnosis (n=15)
• Did not have an Ontario postal code at the time of diagnosis (n=36)
• Another cancer diagnosis within 6 months of the first primary CRC diagnosis date (n=598)
• Did not have OHIP coverage for at least 3 years before the diagnosis date (n=292)
• Diagnosed after May 31, 2012 (n=3,448)
• ICD-9 code of 159.0 (n=157)
• Did not have the histologic classification of adenocarcinoma (n=3,423)
• Stage ‘0’ cancer (n=183)

Eligible cohort (n=31,558), including *cancer of the rectosigmoid junction (ICD-9 code: 154.0) (n=2,140)

Excluded patients:
• Missing an index contact date (n=1,189)
• Missing a RIO score (n=287)
• *Cancer of the rectosigmoid junction (ICD-9 code: 154.0) (n=2,140)

Final cohort (N=27,942)

Figure 4-1: Flow chart describing the CRC cohort selection

Abbreviations: ICD-9 International Classification of Diseases 9th revision; ICES Institute for Clinical Evaluative Sciences, OHIP Ontario Health Insurance Plan; RIO Rurality Index for Ontario

*Cancer of the rectosigmoid junction was not included in the study due to a coding oversight recognized after analysis. In the flow chart, cancers of the rectosigmoid junction were added into the eligible study cohort and then excluded to reflect this oversight. We investigated what impact this had on our findings in Section 4.2.3.
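The counts in Figure 4-1 can be checked with simple arithmetic. The sketch below transcribes the reported numbers (the dictionary keys and variable names are illustrative labels, not ICES field names) and reproduces the eligible cohort, the final cohort and the two percentages quoted in the text.

```python
# Counts transcribed from Figure 4-1; key names are illustrative only.
included = 37_704
exclusions = {
    "dead_upon_diagnosis": 134,
    "age_out_of_range": 15,
    "no_ontario_postal_code": 36,
    "other_cancer_within_6_months": 598,
    "insufficient_ohip_coverage": 292,
    "diagnosed_after_may_31_2012": 3_448,
    "icd9_159_0": 157,
    "non_adenocarcinoma": 3_423,
    "stage_0": 183,
}
rectosigmoid = 2_140          # added to the eligible cohort, then excluded
missing_index_contact = 1_189
missing_rio = 287

total_excluded = sum(exclusions.values())
# The flow chart adds the rectosigmoid junction cases to the eligible cohort.
eligible = included - total_excluded + rectosigmoid
final = eligible - missing_index_contact - missing_rio - rectosigmoid

top_two_share = (
    exclusions["diagnosed_after_may_31_2012"] + exclusions["non_adenocarcinoma"]
) / total_excluded
final_share = final / eligible

print(f"eligible={eligible:,} final={final:,}")
print(f"top two exclusions: {top_two_share:.1%}; final/eligible: {final_share:.1%}")
```

Under this reading, the 89% figure uses the eligible cohort that includes the rectosigmoid junction cases as its denominator.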


4.2.2 Description of the CRC Cohort by RIO Groups

Table 4-1 presents the CRC cohort (N=27,942) description by RIO groups. The majority of the cohort, 65.6%, resided in the RIO 0-9 group (the least rural RIO group), which included all major cities in Ontario. The RIO 10-30 group contained 17.7% of the cohort, the RIO 31-45 group contained 9.9% of the cohort, the RIO 46-55 group contained 2.7% of the cohort, the RIO 56-75 group contained 2.9% of the cohort and the most rural group, RIO 76+, contained 1.2% of the cohort.

The mean age of the cohort was 68.5 [standard deviation (SD): 12.7] years. The age distribution of the CRC cohort was consistent with the literature (1): approximately half of the CRC cases occurred in patients aged 70 or older and more than 90% occurred in patients aged 50 or older. The age distribution varied across the RIO groups, with the lowest mean age in the RIO 76+ group (the most rural group) at 66.6 (SD: 11.9) years and the highest mean age in the RIO 56-75 group at 69.7 (SD: 11.0) years (p<0.0001). In the RIO 56-75 group there was a smaller proportion of patients in the 55-64 age range and a greater proportion of patients in the 65-74 age range. In the RIO 76+ group, the proportion of CRC cases aged 85+ was about half that of the two least rural groups.

Deprivation quintile one, the least deprived group, had the greatest number of CRC patients overall. From quintile one to five, the number of CRC patients in each quintile decreased. There was a statistically significant difference in the deprivation quintile distribution across RIO categories (p<0.0001). The proportion of patients in the least deprived category decreased markedly as rurality increased; however, the proportion in the most deprived category was fairly similar across groups, with the exception of the RIO 56-75 group, where it was relatively high.

A small percentage (4.2%) of the cohort were recent immigrants. As expected, the largest proportion (5.2%) of recent immigrants resided in the RIO 0-9 group (the least rural group). There was a significant decreasing trend in the proportion of recent immigrants as rurality increased (p<0.0001).
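The trend p-values in Table 4-1 come from a two-sided Cochran-Armitage test. The sketch below implements the standard form of that test with the standard library only. Two assumptions are made here and should be read as such: the RIO groups are scored 0 to 5 (equally spaced), and the event counts are reconstructed by rounding the percentages reported in Table 4-1, so they approximate rather than reproduce the ICES data.

```python
import math


def cochran_armitage_trend(events, totals, scores=None):
    """Two-sided Cochran-Armitage test for a linear trend in proportions."""
    if scores is None:
        scores = list(range(len(totals)))  # assumed equally spaced scores
    n_total = sum(totals)
    p_bar = sum(events) / n_total
    # Score-weighted sum of observed minus expected events under no trend.
    t_stat = sum(s * (r - n * p_bar) for s, r, n in zip(scores, events, totals))
    sum_ns = sum(n * s for s, n in zip(scores, totals))
    sum_ns2 = sum(n * s * s for s, n in zip(scores, totals))
    variance = p_bar * (1 - p_bar) * (sum_ns2 - sum_ns ** 2 / n_total)
    z = t_stat / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return z, p_value


# Group sizes and recent-immigrant percentages as reported in Table 4-1.
group_sizes = [18_341, 4_945, 2_774, 754, 799, 329]
pct_recent_immigrant = [5.2, 2.6, 2.6, 1.6, 2.3, 2.1]
# Approximate counts reconstructed from rounded percentages (an assumption).
approx_events = [round(n * p / 100) for n, p in zip(group_sizes, pct_recent_immigrant)]

z, p_value = cochran_armitage_trend(approx_events, group_sizes)
print(f"z = {z:.2f}, two-sided p = {p_value:.2e}")
```

A negative z indicates that the proportion falls as the rurality score rises, which agrees with the decreasing trend described above.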


About 86% of the CRC cohort had 0 or 1 major Aggregated Diagnosis Groups (ADGs), used to assess comorbidity. The ADG distribution was similar throughout the RIO categories although small differences did render a significant p-value (p=0.02). About 24% of the CRC patients had 0 or 1 minor ADGs. The majority of patients had between 2 and 5 minor ADGs. The distribution of minor ADGs was significantly different across RIO categories (p<0.0001). The RIO 76+ group had the largest proportion of patients with 0-3 minor ADGs (less comorbidity).

The proximal colon was the most common tumour sub-site in the cohort, accounting for 38.6% of the CRC cases. Distal colon and rectal sub-sites had similar distributions, accounting for 26.3% and 24.4% of CRC cases, respectively. The remaining cases were from unspecified CRC sites. There were proportionally fewer cases with an unspecified CRC site in the RIO 76+ group versus the other RIO groups, especially the RIO 0-9 group. While the distributions by sub-site were fairly similar across RIO groups, the differences were highly statistically significant (p<0.0001) due to our large sample size. In the next sub-section, the distributions of all variables are compared between the final study cohort and the eligible study cohort to assess whether the final study cohort was reflective of the eligible cohort.


Table 4-1: Description of CRC patients by RIO group (Jan 01, 2007 to May 31, 2012 N=27,942)

| | Total (N=27,942) n | RIO 0-9, least rural (n=18,341) % | RIO 10-30 (n=4,945) % | RIO 31-45 (n=2,774) % | RIO 46-55 (n=754) % | RIO 56-75 (n=799) % | RIO 76+, most rural (n=329) % | p-value (χ² unless otherwise noted) |
|---|---|---|---|---|---|---|---|---|
| Patient Characteristics | | | | | | | | |
| Age at diagnosis (years) | | | | | | | | <0.0001 |
| <45 | 1,034 | 4.3 | 2.9 | 2.3 | 2.5 | 1.3 | 4.6 | |
| 45-54 | 3,048 | 11.5 | 10.1 | 9.7 | 9.2 | 9.0 | 10.0 | |
| 55-64 | 6,080 | 21.6 | 21.3 | 23.5 | 24.0 | 18.8 | 26.4 | |
| 65-74 | 7,784 | 26.4 | 29.3 | 31.5 | 30.2 | 36.8 | 31.3 | |
| 75-84 | 7,282 | 26.3 | 26.4 | 24.5 | 24.5 | 26.5 | 22.5 | |
| 85+ | 2,714 | 10.0 | 10.0 | 8.5 | 9.6 | 7.6 | 5.2 | |
| Male sex | 15,408 | 54.8 | 54.4 | 57.9 | 57.2 | 56.2 | 56.5 | 0.03, 0.02* |
| Deprivation quintile (n=27,714) | | | | | | | | <0.0001 |
| 1 (least deprived) | 6,472 | 26.6 | 21.3 | 14.7 | 12.2 | 8.8 | 4.4 | |
| 2 | 6,347 | 21.9 | 24.8 | 28.7 | 21.0 | 19.1 | 14.3 | |
| 3 | 5,893 | 19.9 | 21.4 | 28.0 | 26.2 | 21.7 | 28.6 | |
| 4 | 5,020 | 17.0 | 17.9 | 17.6 | 28.3 | 29.1 | 37.9 | |
| 5 (most deprived) | 3,982 | 14.6 | 14.7 | 11.0 | 12.3 | 21.4 | 14.9 | |
| Recent immigrant status | 1,183 | 5.2 | 2.6 | 2.6 | 1.6 | 2.3 | 2.1 | <0.0001, <0.0001* |
| Major comorbidity | | | | | | | | 0.02 |
| 0 ADGs | 15,982 | 57.9 | 55.4 | 55.6 | 58.9 | 55.1 | 60.5 | |
| 1 ADGs | 8,027 | 28.5 | 28.8 | 30.3 | 29.4 | 29.9 | 24.9 | |
| 2 ADGs | 2,729 | 9.5 | 10.9 | 10.0 | 7.6 | 9.6 | 10.6 | |
| 3 ADGs | 874 | 3.0 | 3.6 | 3.0 | 2.7 | 4.4 | 2.4 | |
| 4+ ADGs | 330 | 1.2 | 1.3 | 1.1 | 1.5 | 1.0 | 1.5 | |
| Minor comorbidity | | | | | | | | <0.0001 |
| 0 ADGs | 2,972 | 10.1 | 10.6 | 12.2 | 14.5 | 12.8 | 17.3 | |
| 1 ADGs | 3,798 | 12.6 | 15.2 | 16.3 | 15.1 | 13.8 | 16.1 | |
| 2 ADGs | 4,607 | 16.0 | 16.6 | 19.0 | 15.4 | 17.9 | 19.5 | |
| 3 ADGs | 4,496 | 16.2 | 16.4 | 15.3 | 16.2 | 16.7 | 11.6 | |
| 4 ADGs | 3,817 | 13.9 | 13.9 | 12.5 | 13.1 | 12.0 | 9.1 | |
| 5 ADGs | 2,910 | 10.8 | 9.8 | 9.6 | 10.3 | 10.6 | 7.0 | |
| 6 ADGs | 2,060 | 7.7 | 7.2 | 6.3 | 5.7 | 5.5 | 6.1 | |
| 7 ADGs | 1,414 | 5.3 | 4.7 | 4.3 | 4.5 | 4.1 | 4.3 | |
| 8 ADGs | 855 | 3.3 | 2.7 | 2.4 | 2.5 | 2.4 | 3.3 | |
| 9 ADGs | 495 | 1.9 | 1.5 | 1.2 | 1.2 | 1.6 | 3.3 | |
| 10+ ADGs | 518 | 2.1 | 1.5 | 0.9 | 1.5 | 2.6 | 2.4 | |
| Disease Characteristics | | | | | | | | |
| CRC sub-site | | | | | | | | <0.0001 |
| Proximal colon | 10,776 | 37.8 | 40.6 | 38.8 | 39.7 | 41.4 | 41.0 | |
| Distal colon | 7,360 | 27.1 | 24.9 | 24.4 | 26.8 | 24.5 | 24.9 | |
| Rectum | 6,818 | 23.8 | 24.8 | 27.7 | 24.7 | 23.3 | 28.6 | |
| Unspecified CRC | 2,988 | 11.4 | 9.7 | 9.1 | 8.9 | 10.8 | 5.5 | |

Abbreviations: RIO Rurality Index for Ontario; ADGs Johns Hopkins Aggregated Diagnosis Groups; CRC colorectal cancer

*Cochran-Armitage Trend Test (2-sided)


4.2.3 Cohort Representativeness

After the eligible study cohort was created, patients who were missing an index contact date (used to calculate the diagnostic interval) and patients who were missing a RIO score were excluded. Table 4-2 compares the distribution of all applicable study variables between the final cohort (N=27,942) and the missing index contact group (N=1,189). Table 4-3 compares the distribution of all applicable study variables between the final cohort (N=27,942) and the missing RIO group (N=305). Note that 18 patients were missing both an index contact date and a RIO score. These 18 patients were included in both tables, which explains the discrepancy in the missing RIO score sample size between Figure 4-1 (n=287) and Table 4-3 (N=305).

There was not a statistically significant difference (p=0.78) in the RIO distribution between the missing index contact group and the final cohort (see Table 4-2). Patients in the missing index contact group had a smaller proportion of stage I cancer (15.3% versus 23.7%) and a greater proportion of stage IV cancer (29.1% versus 18.0%) compared to the final cohort (p<0.0001). A greater proportion of patients also had an unknown stage of cancer in the missing index contact group versus the final cohort (14.6% versus 7.3%); this difference was statistically significant (p<0.0001). The mean age of the final cohort and the missing index contact group was the same (mean [SD]: final cohort, 68.5 [12.7] years; missing index contact date, 68.5 [14.0] years; p=0.93). There was a statistically significant difference in the age category proportions (p=0.0008), although all variations were less than 4%. There was a statistically significant difference in the distribution of minor comorbidities between the missing index contact group and the final cohort (p<0.0001), with a greater proportion of patients having 0 or 1 ADGs in the missing index contact group. There was a statistically significant difference in the CRC anatomical sub-site distribution (p<0.0001). Proportionally more patients had a rectal cancer in the missing index contact group than in the final cohort (32.6% versus 24.4%).


Table 4-2: Comparison of study variable distributions between patients in the final cohort (N=27,942) and patients who were excluded from the final cohort due to a missing index contact date (N=1,189). Variables of principal importance are in bold

| | Final Cohort (N=27,942) % | Missing Index Contact Date (N=1,189) % | p-value (χ²) |
|---|---|---|---|
| RIO group | | n=1,171 | 0.78 |
| 0-9 (least rural) | 65.6 | 65.5 | |
| 10-30 | 17.7 | 17.3 | |
| 31-45 | 9.9 | 10.6 | |
| 46-55 | 2.7 | 2.3 | |
| 56-75 | 2.9 | 2.7 | |
| 76+ (most rural) | 1.2 | 1.5 | |
| Stage at diagnosis | n=25,909 | n=1,015 | <0.0001 |
| I | 23.7 | 15.3 | |
| II | 28.1 | 24.8 | |
| III | 30.2 | 30.8 | |
| IV | 18.0 | 29.1 | |
| Stage unknown | 7.3 | 14.6 | <0.0001 |
| Age at diagnosis (years) | | | 0.0008 |
| <45 | 3.7 | 4.7 | |
| 45-54 | 10.9 | 11.2 | |
| 55-64 | 21.8 | 21.6 | |
| 65-74 | 27.9 | 24.6 | |
| 75-84 | 26.1 | 24.9 | |
| 85+ | 9.7 | 13.0 | |
| Male sex | 55.1 | 58.8 | 0.01 |
| Deprivation quintile | n=27,714 | n=1,172 | 0.24 |
| 1 (least deprived) | 23.4 | 24.9 | |
| 2 | 22.9 | 20.3 | |
| 3 | 21.3 | 21.1 | |
| 4 | 18.1 | 19.4 | |
| 5 (most deprived) | 14.4 | 14.3 | |
| Recent immigrant status | 4.2 | 5.1 | 0.17 |
| Major comorbidity | | | 0.06 |
| 0 ADGs | 57.2 | 60.5 | |
| 1 ADGs | 28.7 | 27.8 | |
| 2 ADGs | 9.8 | 7.4 | |
| 3 ADGs | 3.1 | 3.1 | |
| 4+ ADGs | 1.2 | 1.3 | |
| Minor comorbidity | | | <0.0001 |
| 0 ADGs | 10.6 | 16.5 | |
| 1 ADGs | 13.6 | 16.7 | |
| 2 ADGs | 16.5 | 16.1 | |
| 3 ADGs | 16.1 | 14.8 | |
| 4 ADGs | 13.7 | 11.3 | |
| 5 ADGs | 10.4 | 9.8 | |
| 6 ADGs | 7.4 | 5.9 | |
| 7 ADGs | 5.1 | 4.1 | |
| 8 ADGs | 3.1 | 1.9 | |
| 9 ADGs | 1.8 | 1.5 | |
| 10+ ADGs | 1.9 | 1.4 | |
| CRC sub-site | | | <0.0001 |
| Proximal colon | 38.6 | 33.2 | |
| Distal colon | 26.3 | 22.2 | |
| Rectum | 24.4 | 32.6 | |
| Unspecified CRC | 10.7 | 12.0 | |

Abbreviations: RIO Rurality Index for Ontario; ADGs Johns Hopkins Aggregated Diagnosis Groups; CRC colorectal cancer

Please note that the following variables were not included in this table as they could not be calculated without an index contact date: first test, emergency department presentation, symptom status at first test and diagnostic interval

The distribution of the diagnostic interval was very similar between the final cohort and the missing RIO group (see Table 4-3). There was not a statistically significant difference in the stage distribution (I-IV) between the two groups (p=0.40), but the proportion of patients with an unknown stage of cancer was significantly greater in the missing RIO group compared to the final cohort (13.1% versus 7.3%) (p=0.0001). The final cohort was slightly older on average than the missing RIO group (mean [SD]: final cohort, 68.5 [12.7] years; missing RIO group, 66.4 [12.4] years; p=0.002). The proportion of patients in the 55-64 age category was greater in the missing RIO group than in the final cohort (29.2% versus 21.8%). However, in the three categories above age 64 the proportion in the missing RIO group was smaller than in the final cohort (p=0.02). Similar to the trend seen when comparing the missing index contact date group and the final cohort, there was a greater proportion of patients with 0 and 1 minor ADGs in the missing RIO group than in the final cohort (p<0.0001). Slight variations in the proportions of a patient’s first test occurred between the two groups (p=0.03). The proportion of patients who had an abdominal x-ray as their first test was greater in the missing RIO group than in the final cohort (16.4% versus 11.0%), while the proportion who had an abdominal ultrasound as their first test was lower in the missing RIO group than in the final cohort (12.5% versus 17.9%). A greater proportion of patients in the missing RIO group had their CRC diagnosed via the emergency department (ED) (ED presentation=yes) compared to the final cohort (27.9% versus 19.5%) (p=0.0004).


Table 4-3: Comparison of study variable distributions between patients in the final cohort (N=27,942) and patients who were excluded from the final cohort due to a missing RIO score (N=305). Variables of principal importance are in bold

| | Final Cohort (N=27,942) % unless otherwise noted | Missing RIO Score (N=305) % unless otherwise noted | p-value (χ² unless otherwise noted) |
|---|---|---|---|
| Diagnostic interval (days) | | n=287 | |
| Mean (SD)‡ | 107 (113) | 106 (111) | |
| Median (IQR) | 64 (22-159) | 63 (21-163) | 0.91* |
| 90th percentile | 288 | 283 | |
| Range | 1-759 | 1-569 | |
| Stage at diagnosis | n=25,909 | n=265 | 0.40 |
| I | 23.7 | 24.2 | |
| II | 28.1 | 26.4 | |
| III | 30.2 | 34.3 | |
| IV | 18.0 | 15.1 | |
| Stage unknown | 7.3 | 13.1 | 0.0001 |
| Age at diagnosis (years) | | | 0.02 |
| <45 | 3.7 | 4.3 | |
| 45-54 | 10.9 | 11.8 | |
| 55-64 | 21.8 | 29.2 | |
| 65-74 | 27.9 | 26.9 | |
| 75-84 | 26.1 | 19.7 | |
| 85+ | 9.7 | 8.2 | |
| Male sex | 55.1 | 54.1 | 0.72 |
| Recent immigrant status | 4.2 | 3.0 | 0.27 |
| Major comorbidity | | | 0.24 |
| 0 ADGs | 57.2 | 62.3 | |
| 1 ADGs | 28.7 | 26.9 | |
| 2 ADGs | 9.8 | 8.5 | |
| 3 ADGs | 3.1 | 1.3 | |
| 4+ ADGs | 1.2 | 1.0 | |
| Minor comorbidity | | | <0.0001 |
| 0 ADGs | 10.6 | 17.4 | |
| 1 ADGs | 13.6 | 19.3 | |
| 2 ADGs | 16.5 | 17.1 | |
| 3 ADGs | 16.1 | 10.2 | |
| 4 ADGs | 13.7 | 11.2 | |
| 5 ADGs | 10.4 | 10.5 | |
| 6 ADGs | 7.4 | 4.6 | |
| 7 ADGs | 5.1 | 5.6 | |
| 8 ADGs | 3.1 | 2.3 | |
| 9 ADGs | 1.8 | 0.3 | |
| 10+ ADGs | 1.9 | 1.6 | |
| CRC sub-site | | | 0.04 |
| Proximal colon | 38.6 | 35.7 | |
| Distal colon | 26.3 | 29.8 | |
| Rectum | 24.4 | 27.9 | |
| Unspecified CRC | 10.7 | 6.6 | |
| First test | | n=287 | 0.03 |
| Colonoscopy | 32.1 | 30.0 | |
| FOBT | 19.5 | 17.8 | |
| Abdominal US | 17.9 | 12.5 | |
| Abdominal X-Ray | 11.0 | 16.4 | |
| Abdominal CT | 7.9 | 8.7 | |
| ED NOS | 5.6 | 7.7 | |
| Sigmoidoscopy | 4.1 | 4.9 | |
| Barium Enema | 1.9 | 2.1 | |
| ED presentation (n=287 in missing RIO group) | 19.5 | 27.9 | 0.0004 |
| Asymptomatic (n=287 in missing RIO group) | 30.6 | 27.9 | 0.31 |

Abbreviations: RIO Rurality Index for Ontario; SD standard deviation; IQR interquartile range; ADGs Johns Hopkins Aggregated Diagnosis Groups; CRC colorectal cancer; FOBT fecal occult blood test; US ultrasound; CT computed tomography; ED NOS emergency department not otherwise specified; ED emergency department

Please note that the following variables were not included in this table as they could not be calculated without a RIO score: deprivation quintile and RIO group

‡ Standard deviation included as a convention; warning, distribution not normal

* Kruskal-Wallis Test, compares median values across categories

In summary, there was not a statistically significant difference in the distributions of the exposure or main outcome (diagnostic interval) between the final and eligible CRC cohorts. Furthermore, we had data on 89% of the eligible cohort. Variables with the largest differences in proportions (5.9% to 11.1%) between the final cohort and the missing index contact group included: stage, stage unknown, CRC sub-site and minor comorbidities. Variables with the largest differences in proportions (5.4% to 8.4%) between the final cohort and the missing RIO group included: stage unknown, age group, minor comorbidities, first test and ED presentation. Some comparisons rendered significant p-values with smaller proportional differences.
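The proportional differences summarized above are absolute differences in percentage points between two groups. As a minimal sketch, the snippet below recomputes them for the stage distribution in Table 4-2 and flags those of at least 5 points. Using 5 points as the cut-off is an assumption borrowed from the threshold this chapter applies to the bivariate results for objective one.

```python
# Stage percentages transcribed from Table 4-2.
final_cohort = {"I": 23.7, "II": 28.1, "III": 30.2, "IV": 18.0}
missing_index = {"I": 15.3, "II": 24.8, "III": 30.8, "IV": 29.1}

THRESHOLD = 5.0  # percentage points; assumed cut-off for a meaningful difference

# Difference = missing index contact group minus final cohort.
differences = {
    stage: round(missing_index[stage] - final_cohort[stage], 1)
    for stage in final_cohort
}
flagged = {stage: diff for stage, diff in differences.items() if abs(diff) >= THRESHOLD}

for stage, diff in differences.items():
    marker = " <- flagged" if stage in flagged else ""
    print(f"Stage {stage}: {diff:+.1f} points{marker}")
```

The stage IV difference of 11.1 points is the upper end of the range quoted for the missing index contact group.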


Erroneous exclusion of the rectosigmoid junction cancer sub-site (detected post data-cut) and potential impact on findings

Cancers of the rectosigmoid junction (ICD-9 code: 154.0), N=2,140, were not included in the CRC cohort due to a coding oversight. These rectosigmoid junction cases belong in the eligible CRC cohort and would have later been included in the rectal sub-site group (ICD-9 codes: 154.0 and 154.1).

We compared the distribution of key variables between the rectal sub-site group and the omitted rectosigmoid junction cases (see Table 4-4). There was a greater proportion of rectosigmoid junction cases in the RIO 0-9 group compared to the rectal cases (difference of 5.5%). Similar distributions were found for the other RIO groups, although there was a statistically significant difference overall (p<0.0001). Importantly, the diagnostic interval distribution was similar between the two groups and there was no statistically significant difference in the median diagnostic interval (p=0.11). There was a greater proportion of stage IV cancer in the rectosigmoid junction cases compared to the rectal cases (21.3% versus 15.8%) (p<0.0001).


Table 4-4: Comparison of select study variable distributions between patients with a rectal cancer (N=6,818) and patients with a rectosigmoid junction cancer (N=2,140), who were erroneously omitted from the CRC cohort

| | Rectum (N=6,818) % unless otherwise noted | Rectosigmoid Junction (N=2,140) % unless otherwise noted | p-value (χ² unless otherwise noted) |
|---|---|---|---|
| RIO group | | | <0.0001 |
| 0-9 (least rural) | 63.9 | 69.4 | |
| 10-30 | 18.0 | 15.4 | |
| 31-45 | 11.3 | 8.5 | |
| 46-55 | 2.7 | 2.8 | |
| 56-75 | 2.7 | 2.2 | |
| 76+ (most rural) | 1.4 | 1.8 | |
| Diagnostic interval (days) | | | |
| Mean (SD)‡ | 100 (108) | 103 (108) | |
| Median (IQR) | 58 (22-141) | 63 (24-150.5) | 0.11* |
| 90th percentile | 271 | 278 | |
| Range | 1-723 | 1-609 | |
| Stage at diagnosis | N=6,216 | N=1,956 | <0.0001 |
| I | 26.2 | 21.7 | |
| II | 22.7 | 23.9 | |
| III | 35.3 | 33.0 | |
| IV | 15.8 | 21.3 | |
| Stage unknown | 8.8 | 8.6 | 0.74 |

Abbreviations: RIO Rurality Index for Ontario; SD standard deviation; IQR interquartile range

‡ Standard deviation included as a convention; warning, distribution not normal

* Kruskal-Wallis Test, compares median values across categories

If the rectosigmoid junction cases were added to the final cohort (N=2,140 + 27,942= 30,082) there would have been almost no change in the distributions of RIO, diagnostic interval or stage. The greatest change in proportions for RIO group would have been in the RIO 0-9 group where the rectosigmoid junction plus final cohort was 0.3% greater than the final cohort (65.9% versus 65.6%, respectively). The 25th, median and 90th percentile diagnostic intervals (days) would have been the same between the two groups and the 75th percentile would have been one day greater in the final cohort. The greatest change in proportions for stage group would have been in stage II where the rectosigmoid junction plus final cohort was 0.3% less than the final cohort (27.8% versus 28.1%, respectively).
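The pooled percentages quoted above follow from a size-weighted average of the two groups. The sketch below reproduces the two reported examples from the rounded percentages in Tables 4-2 and 4-4; because the inputs are rounded, it approximates the calculation rather than recomputing it from the underlying counts. The function name is illustrative.

```python
def pooled_percent(pct_a, n_a, pct_b, n_b):
    """Pool two group percentages, weighting each by its group size."""
    return (pct_a * n_a + pct_b * n_b) / (n_a + n_b)


# RIO 0-9: final cohort (65.6% of 27,942) plus rectosigmoid cases (69.4% of 2,140).
rio_0_9 = pooled_percent(65.6, 27_942, 69.4, 2_140)

# Stage II uses the staged denominators: 25,909 (final cohort) and 1,956 (rectosigmoid).
stage_ii = pooled_percent(28.1, 25_909, 23.9, 1_956)

print(f"RIO 0-9 pooled: {rio_0_9:.1f}%  Stage II pooled: {stage_ii:.1f}%")
```

Because the rectosigmoid junction group is about 7% of the combined cohort, even its 5.5-point difference in the RIO 0-9 proportion moves the pooled figure by only about 0.3 points.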

In summary, a significant difference was found when comparing the distribution of RIO and stage between the rectal sub-site group and the rectosigmoid junction group; however, the greatest difference in proportions was 5.5%. No statistically significant difference in the main outcome, the diagnostic interval, was detected between these two groups. The percent change in variable distributions in the final cohort if the rectosigmoid junction cases were added would have been marginal. For any future publications of this work the 2,140 rectosigmoid junction cases will be added to the cohort for completeness; however, we do not anticipate that there will be a significant impact on the results.


4.3 Objective One: Comparison of the Ontario CRC Stage Distribution across Regions Grouped by RIO Categories

4.3.1 CRC Stage Distribution across Regions Grouped by RIO Categories

Overall, the CRC stage distribution (I-IV) for the cohort was 23.7%, 28.1%, 30.2% and 18.0%, respectively, which was very similar to what has been reported for the Ontario population (24). Contrary to our hypothesis, there was no statistically significant difference in the CRC stage distribution (I-IV) across RIO categories (p=0.28) (see Figure 4-2). There was only one instance in which the stage distribution across RIO groups exceeded a 5% difference: stage II between the RIO 0-9 group (27.6%) and the RIO 56-75 group (32.9%). Stage III had the largest proportion among the four least rural RIO groups while stage II had the largest proportion among the two most rural RIO groups. Stage IV made up the smallest portion of the stage distribution across all RIO groups. The RIO 56-75 group had the lowest proportion of stage IV cancers at 16.4% and the RIO 46-55 group had the highest proportion of stage IV cancers at 19.1%. A description of the CRC cohort patient and disease characteristics by stage (I-IV) can be found in Appendix J.


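[Stacked bar chart. x-axis: Rurality Index for Ontario (RIO) group, from least rural (0-9) to most rural (76+); y-axis: percent of total frequency (0-100). Values shown on the bars:]

| RIO group | n | Stage I (%) | Stage II (%) | Stage III (%) | Stage IV (%) |
|---|---|---|---|---|---|
| 0-9 (least rural) | 16,982 | 23.8 | 27.6 | 30.3 | 18.3 |
| 10-30 | 4,622 | 24.2 | 28.5 | 30.4 | 16.9 |
| 31-45 | 2,594 | 23.3 | 28.6 | 30.1 | 18.0 |
| 46-55 | 692 | 22.5 | 28.5 | 29.9 | 19.1 |
| 56-75 | 733 | 22.9 | 32.9 | 27.8 | 16.4 |
| 76+ (most rural) | 286 | 22.4 | 30.8 | 30.1 | 16.8 |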

Figure 4-2: CRC stage distribution by Rurality Index for Ontario (RIO) group (p=0.28)


4.3.2 Multivariable Analysis: Logistic Regression

Logistic regression was performed modeling the odds of having a late stage cancer (III, IV) (versus an early stage cancer [I, II]). This regression model was chosen to further explore the relationship between rurality and CRC stage. Results are presented in Table 4-5. Our principal interest in this model was the adjusted odds ratios (ORs) for the RIO groups (with the least rural group, RIO 0-9, as the reference). Only the RIO 56-75 group had a statistically significant OR: living in a RIO 56-75 area, versus a RIO 0-9 area, decreased the odds of having a late stage cancer by a factor of 0.83 [95% confidence interval (CI): 0.72, 0.97] or 17%. This protective effect is in the opposite direction to what we had hypothesized. Although not statistically significant, all other RIO categories had adjusted ORs that were less than or very close to one, demonstrating a slightly protective or neutral effect of living in areas more rural than the RIO 0-9 group when comparing late versus early stage. There was almost no change in estimates between the unadjusted and adjusted results. We also conducted a secondary logistic regression analysis with RIO as a trend variable. In that analysis, the adjusted odds of having a late stage CRC decreased by a factor of 0.97 (95% CI: 0.95, 1.0), or 3% with each RIO group increase.
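To make the odds ratio concrete, the sketch below computes a crude (unadjusted) OR with a Wald 95% confidence interval for late versus early stage in the RIO 56-75 group relative to the RIO 0-9 group. The counts are an assumption: they are reconstructed by rounding the stage percentages and group sizes shown in Figure 4-2, whose sample differs slightly from the regression sample, so the result only approximates the unadjusted estimate in Table 4-5. The function name is illustrative.

```python
import math


def odds_ratio_wald(exposed_events, exposed_total, ref_events, ref_total, z=1.96):
    """Crude odds ratio with a Wald confidence interval from a 2x2 table."""
    a, b = exposed_events, exposed_total - exposed_events  # exposed: late, early
    c, d = ref_events, ref_total - ref_events              # reference: late, early
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Late stage = stage III + stage IV; percentages and n from Figure 4-2 (rounded).
late_rio_56_75 = round((27.8 + 16.4) / 100 * 733)
late_rio_0_9 = round((30.3 + 18.3) / 100 * 16_982)

or_crude, ci_low, ci_high = odds_ratio_wald(late_rio_56_75, 733, late_rio_0_9, 16_982)
print(f"crude OR = {or_crude:.2f} (95% CI: {ci_low:.2f}, {ci_high:.2f})")
```

The adjusted ORs in Table 4-5 come from a multivariable model and cannot be reproduced from a single 2x2 table; this sketch illustrates only the unadjusted comparison.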

Our model contains several other statistically significant results. All three age groups below the 65-74 reference category had significantly increased odds of having a late stage CRC. For example, being less than 45 years old, versus 65-74 years old, when diagnosed with CRC increased the odds of having a late stage cancer by a factor of 1.61 (95% CI: 1.40, 1.85) or 61%. Patients in the most deprived quintile versus the least deprived quintile had significantly increased odds of late stage cancer, by a factor of 1.13 (95% CI: 1.04, 1.23) or 13%. Interestingly, having more comorbidities significantly decreased the odds of having a late stage cancer: with each addition of one major ADG, the adjusted odds of having a late stage CRC decreased by a factor of 0.95 (95% CI: 0.92, 0.98), or 5%. With each addition of one minor ADG, the adjusted odds of having a late stage CRC decreased by a factor of 0.97 (95% CI: 0.96, 0.98), or 3% (see Appendix H for the linear assumption tests of the comorbidity variables). The rectum was chosen as the reference category for CRC sub-site. Having a distal colon cancer, versus a rectal cancer, significantly decreased the odds of having a late stage cancer by a factor of 0.84 (95% CI: 0.78, 0.90) or 16%. Having an unspecified CRC, versus a rectal cancer, significantly decreased the odds of having a late stage cancer by a factor of 0.83 (95% CI: 0.76, 0.91) or 17%. An unadjusted effect in the proximal colon was diminished and not statistically significant with adjustment.
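The percentages quoted alongside each OR are the percent change in odds, (OR - 1) x 100. The short sketch below applies that conversion to several estimates from Table 4-5. It also compounds the per-group trend OR across the five steps from RIO 0-9 to RIO 76+; this is an illustration only, because it assumes a log-linear trend across groups.

```python
def percent_change_in_odds(odds_ratio):
    """Convert an odds ratio to the percent change in odds versus the reference."""
    return (odds_ratio - 1) * 100


# Adjusted ORs transcribed from Table 4-5.
adjusted_ors = {
    "RIO 56-75 vs RIO 0-9": 0.83,
    "Age <45 vs 65-74": 1.61,
    "Most vs least deprived quintile": 1.13,
    "Distal colon vs rectum": 0.84,
}
for label, odds_ratio in adjusted_ors.items():
    print(f"{label}: {percent_change_in_odds(odds_ratio):+.0f}% change in odds")

# Trend model: OR of 0.97 per one-group increase in rurality.
trend_or_per_group = 0.97
steps_least_to_most_rural = 5  # RIO 0-9 up to RIO 76+
cumulative_or = trend_or_per_group ** steps_least_to_most_rural
print(f"Implied OR across five steps (linear-trend assumption): {cumulative_or:.2f}")
```

Per-unit ORs such as those for the ADG counts compound in the same multiplicative way, so k additional ADGs correspond to the per-ADG OR raised to the power k under the model's linearity assumption.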

In summary, rurality was not associated with an increased odds of having a late stage cancer and one RIO group (56-75) had a significant protective association. There is some evidence of a protective trend effect, although the point estimates shown in Table 4-5 do not strongly support the linear trend required for such an analysis and the result was marginally statistically significant. Other covariates with significant protective effects included major and minor comorbidities as well as the distal colon and unspecified CRC sub-sites. Covariates that significantly increased the odds of late stage cancer included a young age at diagnosis and living in deprived neighborhoods. Confounding at the 10% threshold was only present for recent immigrant status, although there was also a meaningful change for the proximal colon sub-site.


Table 4-5: Unadjusted and adjusted odds ratios (ORs) for late stage (III, IV) CRC; statistically significant estimates are in bold

| | n (total 25,697) | Unadjusted OR (95% CI) | Adjusted‡ OR (95% CI) |
|---|---|---|---|
| RIO group | | | |
| 0-9 (least rural) | 16,841 | Ref | Ref |
| 10-30 | 4,603 | 0.95 (0.89, 1.01) | 0.94 (0.88, 1.01) |
| 31-45 | 2,575 | 0.98 (0.90, 1.07) | 0.97 (0.89, 1.05) |
| 46-55 | 686 | 1.02 (0.87, 1.18) | 0.99 (0.85, 1.16) |
| 56-75 | 713 | 0.84 (0.72, 0.97) | 0.83 (0.72, 0.97) |
| 76+ (most rural) | 279 | 0.93 (0.74, 1.18) | 0.89 (0.70, 1.13) |
| Age at diagnosis (years) | | | |
| <45 | 946 | 1.67 (1.45, 1.91) | 1.61 (1.40, 1.85) |
| 45-54 | 2,811 | 1.42 (1.31, 1.55) | 1.37 (1.25, 1.50) |
| 55-64 | 5,582 | 1.18 (1.10, 1.26) | 1.15 (1.07, 1.23) |
| 65-74 | 7,207 | Ref | Ref |
| 75-84 | 6,726 | 0.94 (0.88, 1.00) | 0.97 (0.90, 1.03) |
| 85+ | 2,424 | 0.97 (0.89, 1.06) | 1.01 (0.92, 1.11) |
| Sex | | | |
| Male | 14,153 | 1.02 (0.97, 1.07) | 1.00 (0.95, 1.05) |
| Female | 11,544 | Ref | Ref |
| Deprivation quintile | | | |
| 1 (least deprived) | 5,991 | Ref | Ref |
| 2 | 5,874 | 1.02 (0.95, 1.10) | 1.03 (0.96, 1.11) |
| 3 | 5,482 | 1.00 (0.93, 1.07) | 1.02 (0.95, 1.10) |
| 4 | 4,643 | 0.99 (0.92, 1.07) | 1.01 (0.94, 1.10) |
| 5 (most deprived) | 3,707 | 1.10 (1.01, 1.19) | 1.13 (1.04, 1.23) |
| Recent immigrant status | | | |
| Yes | 1,078 | 1.0 (0.88, 1.13) | 0.89 (0.79, 1.01) |
| No | 24,619 | Ref | Ref |
| Major comorbidity (ADGs) | / | 0.88 (0.86, 0.91) | 0.95 (0.92, 0.98) |
| Minor comorbidity (ADGs) | / | 0.95 (0.94, 0.96) | 0.97 (0.96, 0.98) |
| CRC sub-site | | | |
| Proximal Colon | 10,075 | 0.91 (0.85, 0.97) | 0.98 (0.92, 1.05) |
| Distal Colon | 6,761 | 0.82 (0.76, 0.88) | 0.84 (0.78, 0.90) |
| Rectum | 6,161 | Ref | Ref |
| Unspecified CRC | 2,700 | 0.78 (0.71, 0.85) | 0.83 (0.76, 0.91) |

Abbreviations: OR odds ratio; CI confidence interval; RIO Rurality Index for Ontario; ADGs Johns Hopkins Aggregated Diagnosis Groups; CRC colorectal cancer

‡Adjusted for: RIO, age at diagnosis, sex, deprivation quintile, major/minor comorbidities (ADGs Johns Hopkins Aggregated Diagnosis Groups), recent immigrant status and CRC sub-site

4.3.3 Description of the stage distribution by RIO group stratified by potential causal pathway variables

As outlined in the Methods Chapter, two variables (symptom status at first test and ED presentation) were considered causal pathway variables for the RIO – stage association and therefore were not included in the statistical models for objective one. Distributions of the causal pathway variables by RIO group and the RIO groups’ stage distribution (I-IV) stratified by the causal pathway variables are presented in Tables 4-6 and 4-7.

As determined by our algorithm, 30.6% of patients in the cohort were asymptomatic at their first test and 69.4% were symptomatic (see Table 4-6). There was a significant difference in the symptom status at first test distribution across RIO groups (p<0.0001) as well as a significant trend (p=0.004). Patients living in the RIO 76+ group had the smallest proportion of asymptomatic CRC at 25.5%. Asymptomatic CRC ranged from 29.6% to 34.2% amongst the other RIO groups.

Referring to Table 4-6, the stage distribution in the asymptomatic group was different from the stage distribution in the symptomatic group, with a greater proportion of asymptomatic cases diagnosed in stage I in the former group. No statistically significant difference was found in the stage distribution across RIO categories within the asymptomatic group (p=0.36). However, corresponding with the asymptomatic distribution, there was a smaller proportion of stage I CRCs and a greater proportion of stage II CRCs in the RIO 76+ group versus the less rural groups. No statistically significant difference was found within the symptomatic group stage distribution across RIO categories (p=0.33).

Table 4-6: Comparison of CRC symptom status (asymptomatic or symptomatic) distribution (χ² p<0.0001, trend* p=0.004) and stage (I-IV) across RIO groups

| | RIO 0-9 (least rural) | RIO 10-30 | RIO 31-45 | RIO 46-55 | RIO 56-75 | RIO 76+ (most rural) | p-value (χ²) |
|---|---|---|---|---|---|---|---|
| % Asymptomatic (n=8,562) | 29.6 | 32.8 | 34.2 | 30.2 | 31.5 | 25.5 | |
| Stage distribution (n=7,908) | n=5,002 | n=1,512 | n=882 | n=209 | n=230 | n=73 | 0.36 |
| Stage I | 34.9 | 33.1 | 32.8 | 32.5 | 30.0 | 26.0 | |
| Stage II | 25.5 | 28.1 | 29.7 | 28.2 | 30.9 | 34.3 | |
| Stage III | 28.9 | 27.8 | 27.7 | 28.7 | 28.3 | 27.4 | |
| Stage IV | 10.7 | 11.0 | 9.9 | 10.5 | 10.9 | 12.3 | |
| % Symptomatic (n=19,380) | 70.4 | 67.2 | 65.8 | 69.8 | 68.5 | 74.5 | |
| Stage distribution (n=18,001) | n=11,980 | n=3,110 | n=1,712 | n=483 | n=503 | n=213 | 0.33 |
| Stage I | 19.1 | 19.9 | 18.3 | 18.2 | 19.7 | 21.1 | |
| Stage II | 28.5 | 28.8 | 28.1 | 28.6 | 33.8 | 29.6 | |
| Stage III | 30.9 | 31.7 | 31.3 | 30.4 | 27.6 | 31.0 | |
| Stage IV | 21.5 | 19.7 | 22.3 | 22.8 | 18.9 | 18.3 | |

Abbreviations: RIO Rurality Index for Ontario

*Cochran-Armitage Trend Test (2-sided)


Overall, 19.5% of patients were diagnosed through an emergency department (ED) (see Table 4-

7). ED presentation generally increased with rurality and showed a significant trend (p<0.0001). The proportion of patients with an ED presentation ranged from 18.7% in the least rural RIO group to 26.4% in the most rural group.

The stage distribution differed noticeably between ED presentation cases and non-ED presentation cases, with a smaller proportion of stage I cases (about 12% versus 26%) and a greater proportion of stage IV cases (about 25% versus 15%) in the ED presentation group. There was no statistically significant difference in the stage distributions across RIO groups within the ED presentation group (p=0.30), but this analysis may have been underpowered to detect some potentially meaningful differences, such as the relatively high proportion of stage II CRC in the RIO 56-75 group at 38.0% and, in the RIO 76+ group, a relatively high proportion of stage III and a low proportion of stage IV CRC at 40.3% and 18.2%, respectively. There was no statistically significant difference in the stage distribution across RIO groups within the non-ED presentation group (p=0.57).

Table 4-7: Comparison of emergency department (ED) presentation (yes or no) distribution (χ² p<0.0001, trend* p<0.0001) and stage (I-IV) across RIO groups

| | RIO 0-9 (least rural) | RIO 10-30 | RIO 31-45 | RIO 46-55 | RIO 56-75 | RIO 76+ (most rural) | p-value (χ²) |
|---|---|---|---|---|---|---|---|
| % ED presentation (n=5,439) | 18.7 | 20.6 | 18.8 | 24.0 | 25.0 | 26.4 | |
| Stage distribution (n=5,049) | n=3,184 | n=946 | n=494 | n=164 | n=184 | n=77 | |
| Stage I (%) | 12.1 | 12.4 | 11.7 | 8.5 | 12.0 | 11.7 | 0.30 |
| Stage II (%) | 29.6 | 29.6 | 28.3 | 29.9 | 38.0 | 29.9 | |
| Stage III (%) | 31.3 | 33.4 | 32.0 | 29.9 | 28.3 | 40.3 | |
| Stage IV (%) | 27.1 | 24.6 | 27.9 | 31.7 | 21.7 | 18.2 | |
| % non-ED presentation (n=22,503) | 81.3 | 79.5 | 81.2 | 76.0 | 75.0 | 73.6 | |
| Stage distribution (n=20,860) | n=13,798 | n=3,676 | n=2,100 | n=528 | n=549 | n=209 | |
| Stage I (%) | 26.5 | 27.2 | 26.0 | 26.9 | 26.6 | 26.3 | 0.57 |
| Stage II (%) | 27.2 | 28.3 | 28.7 | 28.0 | 31.2 | 31.1 | |
| Stage III (%) | 30.1 | 29.6 | 29.6 | 29.9 | 27.7 | 26.3 | |
| Stage IV (%) | 16.3 | 14.9 | 15.7 | 15.2 | 14.6 | 16.3 | |

Abbreviations: RIO Rurality Index for Ontario; ED emergency department
*Cochran-Armitage Trend Test (2-sided)


4.4 Objective Two: Comparison of the CRC Diagnostic Interval Across Regions Grouped by RIO Categories and Stratified by Stage

4.4.1 The CRC Cohort Diagnostic Interval

Overall, the median diagnostic interval for the CRC cohort was 64 (IQR: 22-159) days, the 90th percentile was 288 days and the range was 1-759 days. The diagnostic interval distribution exhibited the same right-skewed shape seen in other work in this field. This right skew can also be illustrated by the discrepancy found between the median diagnostic interval (64 days) and the mean diagnostic interval of 107 (SD: 113) days.

Table 4-8 presents the 25th, 50th (median), 75th and 90th percentile diagnostic interval values by RIO group, patient characteristics and disease characteristics for the CRC cohort. There was a statistically significant difference in the medians across RIO groups (p=0.002). However, median diagnostic intervals for the first five RIO groups were fairly similar, ranging from 62 to 69 days, while in the RIO 76+ group the interval was shorter at 53 days, which met our minimum clinically meaningful difference criterion of 12 days when compared to RIO groups 10-30 and 46-55. We found no explicit statements in the literature about a minimum clinically important effect on the diagnostic interval, although some studies (138) have used less than a month as the smallest category in which to examine the CRC diagnostic interval, based on time intervals that would be meaningful in clinical practice.

Where stage was known, the diagnostic interval could be analyzed by stage I-IV groupings. As expected, there was a statistically significant difference in the median diagnostic interval between these stage groupings (p<0.0001). Stage I had the longest median and 90th percentile diagnostic intervals, stages II and III had almost identical diagnostic interval distributions and stage IV CRC had the shortest median and 90th percentile diagnostic intervals.

Patients in the youngest age group, less than 45 years old when diagnosed with CRC, had the shortest median diagnostic interval (51.5 days) as compared to the other age groupings. Patients in the oldest age group, 85 years plus, had the second shortest diagnostic interval (55 days), although their 90th percentile diagnostic interval (297 days) was one of the longest. The age group with the longest median diagnostic interval was 65-74 years at 69 days. Overall, there was a significant difference in the median diagnostic interval across age groups (p<0.0001).

The major and minor comorbidity (Johns Hopkins ADGs) variables displayed a monotonic increase in the median and 90th percentile diagnostic interval as the number of ADGs increased. Major comorbidity medians ranged from 56 days (0 major ADGs) to 118 days (4+ major ADGs) (p<0.0001). Minor comorbidity medians ranged from 35 days (0 minor ADGs) to 124 days (10+ minor ADGs) (p<0.0001).

We found a statistically significant difference in the median diagnostic interval between CRC sub-sites (p=0.003), ranging from 58 days for rectal cancers to 65 days and 67 days, respectively, for proximal and distal cancers.

In conclusion, variables that showed a statistically significant difference in median diagnostic intervals were: RIO group, stage at diagnosis, age at diagnosis, deprivation quintile, major and minor comorbidity and CRC sub-site. The next section will describe the diagnostic interval by RIO group stratified by stage.


Table 4-8: Description of the CRC cohort by median and 90th percentile diagnostic interval (days) (N=27,942)

| Characteristic | n | 25th | 50th | 75th | 90th | P-value (Kruskal-Wallis Test*) |
|---|---|---|---|---|---|---|
| RIO group | | | | | | 0.002 |
| 0-9 (least rural) | 18,341 | 21 | 63 | 158 | 287 | |
| 10-30 | 4,945 | 24 | 69 | 169 | 297 | |
| 31-45 | 2,774 | 23 | 62 | 148 | 279 | |
| 46-55 | 754 | 24 | 69 | 161 | 290 | |
| 56-75 | 799 | 20 | 62 | 149 | 297 | |
| 76+ (most rural) | 329 | 14 | 53 | 154 | 273 | |
| Stage at diagnosis | | | | | | <0.0001 |
| I | 6,147 | 43 | 98 | 200 | 315 | |
| II | 7,277 | 21 | 60 | 150 | 284 | |
| III | 7,830 | 21 | 60 | 150 | 283 | |
| IV | 4,655 | 9 | 37 | 107 | 252 | |
| Stage known/unknown | | | | | | 0.33 |
| Unknown | 2,033 | 22 | 65 | 168 | 295 | |
| Known | 25,909 | 22 | 64 | 158 | 288 | |
| Age at diagnosis (years) | | | | | | <0.0001 |
| <45 | 1,034 | 14 | 51.5 | 134 | 257 | |
| 45-54 | 3,048 | 20 | 56 | 139 | 250 | |
| 55-64 | 6,080 | 23 | 64 | 149 | 278 | |
| 65-74 | 7,784 | 26 | 69 | 167.5 | 290 | |
| 75-84 | 7,282 | 22 | 65 | 172 | 304 | |
| 85+ | 2,714 | 11 | 55 | 162 | 297 | |
| Sex | | | | | | 0.48 |
| Male | 15,408 | 22 | 63 | 157 | 288 | |
| Female | 12,534 | 22 | 65 | 162 | 288 | |
| Deprivation quintile | | | | | | 0.04 |
| 1 (least deprived) | 6,472 | 23 | 65 | 160 | 287 | |
| 2 | 6,347 | 22 | 64 | 161 | 291 | |
| 3 | 5,893 | 23 | 65 | 157 | 280 | |
| 4 | 5,020 | 22 | 61 | 158 | 291 | |
| 5 (most deprived) | 3,982 | 18 | 60 | 161 | 294 | |
| Recent immigrant status | | | | | | 0.25 |
| Yes | 1,183 | 20 | 57 | 157 | 281 | |
| No | 26,759 | 22 | 64 | 159 | 289 | |
| Major comorbidity | | | | | | 0.0001 |
| 0 ADGs | 15,982 | 19 | 56 | 136 | 255 | |
| 1 ADGs | 8,027 | 26 | 71 | 178 | 304 | |
| 2 ADGs | 2,729 | 27 | 85 | 217 | 338 | |
| 3 ADGs | 874 | 30 | 90 | 235 | 358 | |
| 4+ ADGs | 330 | 28 | 118 | 279 | 372 | |
| Minor comorbidity | | | | | | 0.0001 |
| 0 ADGs | 2,972 | 8 | 35 | 91.5 | 195 | |
| 1 ADGs | 3,798 | 17 | 55 | 128 | 244 | |
| 2 ADGs | 4,607 | 21 | 57 | 141 | 260 | |
| 3 ADGs | 4,496 | 24 | 67 | 157 | 282 | |
| 4 ADGs | 3,817 | 27 | 69 | 164 | 295 | |
| 5 ADGs | 2,910 | 27 | 72 | 186 | 303 | |
| 6 ADGs | 2,060 | 27 | 79 | 197 | 331 | |
| 7 ADGs | 1,414 | 30 | 91.5 | 227 | 344 | |
| 8 ADGs | 855 | 33 | 99 | 244 | 364 | |
| 9 ADGs | 495 | 35 | 120 | 275 | 367 | |
| 10+ ADGs | 518 | 35 | 124 | 288 | 375 | |
| CRC sub-site | | | | | | 0.003 |
| Proximal colon | 10,776 | 21 | 65 | 171 | 298 | |
| Distal colon | 7,360 | 23 | 67 | 158 | 283 | |
| Rectum | 6,818 | 22 | 58 | 141 | 271 | |
| Unspecified CRC | 2,988 | 21 | 62 | 163 | 295 | |

Abbreviations: RIO Rurality Index for Ontario; ADGs Johns Hopkins Aggregated Diagnosis Groups; CRC colorectal cancer
* Kruskal-Wallis Test, compares median values across categories
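The p-values in Table 4-8 come from the Kruskal-Wallis test, which ranks all observations together and compares rank sums between categories. The following is a minimal sketch of the tie-corrected H statistic, not the software used for the thesis analysis; the function name `kruskal_wallis_h` and the toy day values are hypothetical. Under the null hypothesis, H is referred to a chi-square distribution with (number of groups minus one) degrees of freedom.

```python
from collections import Counter


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of samples."""
    pooled = sorted(value for group in groups for value in group)
    n = len(pooled)

    # Assign midranks: tied values share the average of the ranks they occupy.
    rank_of = {}
    start = 0
    while start < n:
        end = start
        while end < n and pooled[end] == pooled[start]:
            end += 1
        rank_of[pooled[start]] = (start + 1 + end) / 2
        start = end

    rank_term = sum(
        sum(rank_of[value] for value in group) ** 2 / len(group)
        for group in groups
    )
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)

    # Standard tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    tie_sizes = Counter(pooled).values()
    correction = 1 - sum(t**3 - t for t in tie_sizes) / (n**3 - n)
    return h / correction


# Hypothetical diagnostic intervals (days) for three made-up groups.
toy_groups = [[14, 53, 154], [21, 63, 158], [24, 69, 169]]
h_toy = kruskal_wallis_h(toy_groups)
print(round(h_toy, 3))
```

Because the test works on ranks, it is unaffected by the long right tail of the diagnostic interval, which is one reason it suits these comparisons.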

4.4.2 The CRC Cohort Diagnostic Interval Distribution across RIO Categories Stratified by Stage

The median diagnostic interval for CRC stages I, II, III and IV was 98 (IQR: 43, 200) days, 60 (IQR: 21, 150) days, 60 (IQR: 21, 150) days and 37 (IQR: 9, 107) days, respectively (p<0.0001). The median diagnostic interval for CRC stage unknown was 65 (IQR: 22, 168) days. Figures 4-3, 4-4 and 4-5 present the diagnostic interval distribution by RIO group stratified by stage. In stage I, there was a significant difference in median diagnostic interval by RIO group (p=0.0005) (see Figure 4-3). Median diagnostic intervals ranged from 58.5 (IQR: 14, 157) days in the RIO 76+ group to 108 (IQR: 56, 202) days in the RIO 46-55 group.

The differences in the median diagnostic interval across RIO groups were not statistically significant in the stage II stratum (p=0.88) (see Figure 4-3) or the stage III stratum (p=0.15) (see Figure 4-4).

The differences in the median diagnostic interval across RIO groups were also not statistically significant in the stage IV stratum (p=0.89) (see Figure 4-4), but the RIO 76+ interval was shorter than that of the RIO 46-55 group by a clinically meaningful margin, at 32 (IQR: 10, 113) and 48.5 (IQR: 15, 99) days, respectively. The stage unknown group behaved most similarly to stages II and III (see Figure 4-5).

Overall, there was a significant difference in median diagnostic interval by RIO group (p=0.0005) in the stage I stratum only. Throughout, the RIO 10-30 and/or 46-55 group(s) often had the longest median diagnostic interval (with the exception of stage II) and the RIO 76+ group often had the shortest median diagnostic interval. It was also of interest to note that the interquartile ranges decreased as stage increased. The greatest difference in median diagnostic intervals within a stage, across RIO groups, was found in stage I, at 49 days and stage unknown at 44 days. The smallest difference in median diagnostic intervals within a stage, across RIO groups, was found in stage II at 10 days.
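The within-stage spreads quoted above can be checked directly from the medians tabulated under Figures 4-3 to 4-5. The short Python sketch below does that arithmetic; the dictionary layout and the helper name `median_spread` are illustrative choices, not part of the thesis analysis. The stage I spread works out to 49.5 days, which the text reports as 49.

```python
# Median diagnostic interval (days) by RIO group, copied from the tables
# printed under Figures 4-3, 4-4 and 4-5.
RIO_GROUPS = ["0-9", "10-30", "31-45", "46-55", "56-75", "76+"]
MEDIAN_DI = {
    "I": [97, 106.5, 91, 108, 85.5, 58.5],
    "II": [60, 58, 59, 64, 55, 65],
    "III": [58, 68, 60, 68, 59.5, 46.5],
    "IV": [36, 38, 36.5, 48.5, 41.5, 32],
    "unknown": [65, 79, 64, 53.5, 69, 35],
}


def median_spread(medians):
    """Return (longest minus shortest median, shortest group, longest group)."""
    shortest = min(medians)
    longest = max(medians)
    return (
        longest - shortest,
        RIO_GROUPS[medians.index(shortest)],
        RIO_GROUPS[medians.index(longest)],
    )


spreads = {stage: median_spread(values) for stage, values in MEDIAN_DI.items()}
for stage, (spread, shortest_group, longest_group) in spreads.items():
    print(f"Stage {stage}: spread {spread} days "
          f"(shortest RIO {shortest_group}, longest RIO {longest_group})")
```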


[Figure 4-3: box plots of the diagnostic interval (days; y-axis 0 to 800) by Rurality Index for Ontario (RIO) group, in two panels: Stage I (p=0.0005) and Stage II (p=0.88)]

| Stage | Measure | RIO 0-9 | RIO 10-30 | RIO 31-45 | RIO 46-55 | RIO 56-75 | RIO 76+ |
|---|---|---|---|---|---|---|---|
| Stage I | n | 4,038 | 1,118 | 603 | 156 | 168 | 64 |
| Stage I | Median DI | 97 | 106.5 | 91 | 108 | 85.5 | 58.5 |
| Stage I | 90th DI | 311 | 336 | 308 | 306 | 329 | 212 |
| Stage II | n | 4,689 | 1,319 | 743 | 197 | 241 | 88 |
| Stage II | Median DI | 60 | 58 | 59 | 64 | 55 | 65 |
| Stage II | 90th DI | 286 | 281 | 273 | 316 | 287 | 342 |

Figure 4-3: Distribution of the diagnostic interval (days) by Rurality Index for Ontario (RIO) groups stratified by stage. Table displays the sample size (n), median (50th percentile) and 90th percentile diagnostic intervals (DI) (days) across RIO groups. P-value from Kruskal-Wallis test compares median diagnostic interval values across RIO groups. Box plot specifics: horizontal line within box represents the median (50th percentile); “+” marker within box represents the mean; bottom and top edges of box represent the 25th percentile (quartile 1) and 75th percentile (quartile 3), respectively, or the interquartile range; whiskers represent the minimum and maximum observations.


[Figure 4-4: box plots of the diagnostic interval (days; y-axis 0 to 800) by Rurality Index for Ontario (RIO) group, in two panels: Stage III (p=0.15) and Stage IV (p=0.89)]

| Stage | Measure | RIO 0-9 | RIO 10-30 | RIO 31-45 | RIO 46-55 | RIO 56-75 | RIO 76+ |
|---|---|---|---|---|---|---|---|
| Stage III | n | 5,148 | 1,405 | 780 | 207 | 204 | 86 |
| Stage III | Median DI | 58 | 68 | 60 | 68 | 59.5 | 46.5 |
| Stage III | 90th DI | 286 | 295 | 261.5 | 281 | 268 | 261 |
| Stage IV | n | 3,107 | 780 | 468 | 132 | 120 | 48 |
| Stage IV | Median DI | 36 | 38 | 36.5 | 48.5 | 41.5 | 32 |
| Stage IV | 90th DI | 249 | 252.5 | 259 | 247 | 299 | 276 |

Figure 4-4: Distribution of the diagnostic interval (days) by Rurality Index for Ontario (RIO) groups stratified by stage. Table displays the sample size (n), median (50th percentile) and 90th percentile diagnostic intervals (DI) (days) across RIO groups. P-value from Kruskal-Wallis test compares median diagnostic interval values across RIO groups. Box plot specifics: horizontal line within box represents the median (50th percentile); “+” marker within box represents the mean; bottom and top edges of box represent the 25th percentile (quartile 1) and 75th percentile (quartile 3), respectively, or the interquartile range; whiskers represent the minimum and maximum observations.


[Figure 4-5: box plots of the diagnostic interval (days; y-axis 0 to 800) by Rurality Index for Ontario (RIO) group, single panel: Stage Unknown (p=0.14)]

| Stage | Measure | RIO 0-9 | RIO 10-30 | RIO 31-45 | RIO 46-55 | RIO 56-75 | RIO 76+ |
|---|---|---|---|---|---|---|---|
| Stage unknown | n | 1,359 | 323 | 180 | 62 | 66 | 43 |
| Stage unknown | Median DI | 65 | 79 | 64 | 53.5 | 69 | 35 |
| Stage unknown | 90th DI | 287 | 299 | 319 | 269 | 333 | 327 |

Figure 4-5: Distribution of the diagnostic interval (days) by Rurality Index for Ontario (RIO) groups stratified by stage. Table displays the sample size (n), median (50th percentile) and 90th percentile diagnostic intervals (DI) (days) across RIO groups. P-value from Kruskal-Wallis test compares median diagnostic interval values across RIO groups. Box plot specifics: horizontal line within box represents the median (50th percentile); “+” marker within box represents the mean; bottom and top edges of box represent the 25th percentile (quartile 1) and 75th percentile (quartile 3), respectively, or the interquartile range; whiskers represent the minimum and maximum observations.


4.4.3 Multivariable Analysis: Quantile regression (50th/median and 90th percentiles)

Quantile regression was performed at the 50th/median and 90th percentiles stratified by stage. The focus of these analyses was the relationship between rurality (with the RIO 0-9 group as the reference category) and the diagnostic interval (at the median and 90th percentiles). We report our fully adjusted RIO associations below, with the covariate results and unadjusted results presented in Appendices K and L.
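For readers less familiar with the method, quantile regression estimates its coefficients by minimising the check (pinball) loss rather than squared error. The sketch below is a minimal, intercept-only illustration in Python and is not the thesis analysis: the function names and the eleven interval values are hypothetical. It shows that minimising the check loss at levels 0.5 and 0.9 recovers the sample median and 90th percentile.

```python
def check_loss(residual, tau):
    """Check (pinball) loss for one residual at quantile level tau."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def total_check_loss(values, candidate, tau):
    """Sum of check losses when every observation is predicted by candidate."""
    return sum(check_loss(v - candidate, tau) for v in values)


def quantile_by_check_loss(values, tau):
    """Observed value that minimises the total check loss (intercept-only fit)."""
    return min(sorted(values), key=lambda c: total_check_loss(values, c, tau))


# Hypothetical diagnostic intervals (days); illustrative only, not thesis data.
intervals = [2, 9, 22, 37, 53, 64, 98, 159, 200, 288, 759]
median_fit = quantile_by_check_loss(intervals, 0.5)
p90_fit = quantile_by_check_loss(intervals, 0.9)
print(median_fit, p90_fit)
```

With a single categorical covariate and no adjustment, the fitted coefficient for a group is simply the difference between that group's quantile and the reference group's quantile, which is how the "difference (days)" estimates in this section can be read.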

Median Regression

Table 4-9 presents the adjusted difference (days) in the median and 90th percentile diagnostic interval by RIO groups stratified by stage. At the median, or 50th percentile, stage I CRC patients residing in the RIO 10-30 group took 10.8 (95% CI: 1.5, 20.1) days longer to be diagnosed than patients residing in the RIO 0-9 group (p=0.023). Although not statistically significant, at the median, stage I CRC patients residing in the RIO 76+ group were diagnosed 24.8 (95% CI: -53.0, 3.4) days faster than patients residing in the RIO 0-9 group (p=0.08). In the stage I median regression model, all RIO associations changed by more than 10% with adjustment for covariates. No statistically significant results were observed for stage II CRC patients, and estimates were generally closer to zero than the stage I median differences. In the stage II median regression model, all RIO group associations except the RIO 31-45 group changed by more than 10% with adjustment for covariates. For stage III CRC there were two statistically significant differences at the median. Patients residing in the RIO 10-30 group took 13.3 (95% CI: 6.7, 19.8) days longer to be diagnosed than patients residing in the RIO 0-9 group (p<0.0001) and patients residing in the RIO 46-55 group took 16.1 (95% CI: 1.0, 31.1) days longer to be diagnosed than patients residing in the RIO 0-9 group (p=0.04). In the stage III median regression model, all RIO associations changed by more than 10% with adjustment for covariates. In stage IV, CRC patients living in the RIO 46-55 group took 13.0 (95% CI: 3.6, 22.4) days longer to be diagnosed than patients residing in the RIO 0-9 group (p=0.007) at the median. Other estimates for the stage IV group were close to zero. Only the RIO 76+ group had a change in estimate of more than 10% with adjustment for covariates. No statistically significant results were found in the stage unknown group, but median differences most resembled those seen in stage I. In the stage unknown median regression model, all RIO associations changed by more than 10% with adjustment for covariates.

Generally, at the median, patients in the RIO 10-55 groups had diagnostic intervals that were similar to or longer than those of patients in the RIO 0-9 group. No clear pattern emerged for patients in the RIO 56-75 group or the RIO 76+ group, although estimates suggested a shorter diagnostic interval for stage I and stage unknown groups as compared to the RIO 0-9 group.

90th Percentile Regression

When studying the diagnostic interval, researchers are often also interested in the upper tail of the distribution. By studying the upper tail it is possible to determine if a variable has a lesser or greater association with the diagnostic interval at the extreme of the distribution versus the centre. Furthermore, it allows for the characterization of patients who took the longest to be diagnosed. The 90th percentile diagnostic interval estimates found in Table 4-9 can be compared by RIO group as well as contrasted with estimates from the median regression. The results suggest that rurality has a greater effect on the diagnostic interval for stage I CRC patients who wait the longest versus those who are at the median. At the 90th percentile, stage I CRC patients residing in the RIO 10-30 group took 25.0 (95% CI: 8.5, 41.5) days longer to be diagnosed than patients residing in the RIO 0-9 group (p=0.003). Contrast this with the 10.8 days longer for the RIO 10-30 group found in the median regression. At the 90th percentile, stage I CRC patients residing in the RIO 76+ group were diagnosed 67.3 (95% CI: -129.4, -5.1) days quicker than patients residing in the RIO 0-9 group (p=0.03). Compare this to the 24.8 day median advantage in the RIO 76+ group. In the stage I 90th percentile regression model, there were changes in estimates of more than 10% with adjustment for covariates for all RIO groups except the RIO 0-9 group. The 90th percentile estimates for all other stages were not statistically significant, but 90th percentile differences were generally larger than the median differences in all stage strata except stage III. In the stage II, III, IV and stage unknown 90th percentile regression models, there were changes in estimates of more than 10% with adjustment for covariates for almost all RIO groups; changes in the estimates were generally more pronounced at the 90th percentile.


Table 4-9: Adjusted‡ difference (days) in the median and 90th percentile diagnostic interval (95% CI) by RIO group stratified by stage, statistically significant estimates are in bold

| Stage | Quantile | RIO 0-9 (n=18,191) | RIO 10-30 (n=4,923) | RIO 31-45 (n=2,755) | RIO 46-55 (n=747) | RIO 56-75 (n=776) | RIO 76+ (n=322) |
|---|---|---|---|---|---|---|---|
| Stage I | Median | Ref | **10.8 (1.5, 20.1)** | -1.4 (-12.5, 9.7) | 12.9 (-5.0, 30.7) | -2.0 (-28.5, 24.5) | -24.8 (-53.0, 3.4) |
| Stage I | 90th percentile | Ref | **25.0 (8.5, 41.5)** | -3.9 (-28.3, 20.4) | 6.7 (-23.4, 36.7) | 17.1 (-17.7, 52.0) | **-67.3 (-129.4, -5.1)** |
| Stage II | Median | Ref | -0.05 (-5.9, 5.8) | -1.0 (-8.9, 6.9) | 4.6 (-8.7, 17.8) | -6.7 (-16.9, 3.4) | 1.8 (-32.5, 36.1) |
| Stage II | 90th percentile | Ref | 1.8 (-19.2, 15.6) | -9.9 (-33.5, 13.7) | 16.9 (-22.8, 56.6) | 11.5 (-39.2, 62.1) | -7.3 (-100.9, 86.4) |
| Stage III | Median | Ref | **13.3 (6.7, 19.8)** | 5.2 (-1.0, 11.4) | **16.1 (1.0, 31.1)** | 8.0 (-9.3, 25.3) | -5.9 (-25.7, 13.9) |
| Stage III | 90th percentile | Ref | 3.8 (-14.2, 21.9) | 4.6 (-19.6, 28.9) | -7.9 (-58.0, 42.3) | 1.2 (-52.1, 54.4) | -12.8 (-62.1, 36.6) |
| Stage IV | Median | Ref | 2.1 (-2.7, 7.0) | -0.7 (-7.0, 5.6) | **13.0 (3.6, 22.4)** | 4.0 (-10.1, 18.1) | -1.2 (-26.8, 24.3) |
| Stage IV | 90th percentile | Ref | -6.6 (-30.1, 17.0) | 4.2 (-27.9, 36.4) | -21.6 (-102.7, 59.5) | 50.2 (-9.9, 110.2) | 62.8 (-72.9, 198.5) |
| Stage unknown | Median | Ref | 14.3 (-0.2, 28.8) | -1.4 (-19.5, 16.8) | -6.8 (-31.7, 18.0) | -6.8 (-38.2, 24.6) | -26.4 (-61.2, 8.4) |
| Stage unknown | 90th percentile | Ref | 3.0 (-29.9, 35.9) | 16.4 (-27.1, 59.9) | 18.3 (-61.6, 98.2) | 75.1 (-15.3, 165.4) | 71.0 (-97.6, 239.6) |

Abbreviation: RIO Rurality Index for Ontario
‡Separate regressions for each stage group adjusted for: RIO, age at diagnosis, sex, deprivation quintile, major/minor comorbidities (ADGs Johns Hopkins Aggregated Diagnosis Groups), recent immigrant status and CRC sub-site


Median and 90th percentile differences across all other variables are presented in Appendices K and L. Some significant findings are highlighted below.

Age: For stages I, II and unknown, the median diagnostic interval for patients in the oldest age group (85+) was significantly shorter (by between 24 and 31 days) compared to patients aged 65-74 (reference group). In stage III CRC, patients in the two youngest age groups as well as the two oldest age groups were diagnosed significantly faster (9 to 17 days) than the reference group and in stage IV CRC, patients in the three youngest age groups were diagnosed significantly faster (7 to 9 days) than the reference group. At the 90th percentile, there were fewer statistically significant results. Overall, all statistically significant differences found in the median and 90th percentile regressions pointed to a shorter diagnostic interval compared to the reference age group of 65-74 years old. There were changes in estimates of more than 10% with adjustment for covariates for almost all age groups in each model. Appendix G contains the linear assumption tests for age, comorbidities and deprivation quintile.

Comorbidities: The effect of major and minor comorbidities was consistent throughout. Increasing comorbidity was consistently associated with a longer median and 90th percentile diagnostic interval across stage groups. At the median, statistically significant increases ranged from 4 to 9 days with each single increase in the comorbid disease count (number of ADGs). At the 90th percentile, statistically significant increases ranged from 12 to 18 days with each single increase in the comorbid disease count (number of ADGs). In the median regression models there were changes in estimates of more than 10% with adjustment for covariates for major comorbidity at all stages and for minor comorbidity at stages I and IV. All adjusted estimates were lower than unadjusted estimates. In the 90th percentile regression models there were changes in estimates of more than 10% with adjustment for covariates for major comorbidity and minor comorbidity at all stages. All adjusted estimates were lower than unadjusted estimates, especially for major comorbidity. Appendix G contains the linear assumption tests for age, comorbidities and deprivation quintile.


CRC sub-site: In the median regression there was a change in the direction of the estimates for CRC sub-site across the stage strata. In CRC stages I and II, proximal and distal colon cancer had significantly longer diagnostic intervals than the rectum (reference site), ranging from 11 to 13 days longer. There were no statistically significant results in stage III cases, but in stage IV, the proximal colon had a statistically significantly shorter (about 7 days) diagnostic interval than the rectum. At the 90th percentile, for stages I, II and unknown, the proximal colon had a significantly longer diagnostic interval than the rectum, ranging from 16 to 40 days longer. However, there were no statistically significant sub-site effects for stages III and IV. There were changes in estimates of more than 10% with adjustment for covariates for almost all sub-sites in each model.

Although the deprivation quintile variable did not have statistically significant results in most of the models there was a spot finding for stage III CRC at the median. Patients in the more deprived quintiles had faster diagnostic intervals with statistically significant estimates ranging from 7 to 9 days faster as compared to the least deprived quintile.

In summary, fully adjusted quantile regression models were produced at the median and 90th percentiles to identify relationships between rurality and the diagnostic interval stratified by CRC stage at diagnosis. At the median, patients in the RIO 10-55 groups generally had diagnostic intervals that were similar to or longer than those of patients in the RIO 0-9 group. There was no clear pattern for patients residing in the RIO 56-75 group or the RIO 76+ group, although estimates suggested a shorter diagnostic interval for stage I and stage unknown groups as compared to the RIO 0-9 group. At the 90th percentile, for stage I CRC, both the RIO 10-30 association and the RIO 76+ association were statistically significant and larger than the median estimates. Estimates for all other stages at the 90th percentile were not statistically significant, but they were generally farther from zero than the median estimates in all stage strata (except stage III). Age, comorbidity and CRC sub-site demonstrated associations throughout CRC stage groups and across both the median and 90th percentiles. It was important to adjust for covariates as confounding (more than a 10% change in estimate between unadjusted and adjusted models) was detected for almost all variables in each of the 10 models and some of the exposure estimates changed in terms of statistical significance with adjustment.

4.4.4 Description of the diagnostic interval distribution by RIO group stratified by potential causal pathway variables

As outlined in the Methods Chapter, three variables (symptom status at first test, ED presentation and first test) were considered to lie on the causal pathway between rurality and the length of the diagnostic interval. Distributions of the causal pathway variables by RIO group and the diagnostic interval (median and 90th percentiles) are presented in Tables 4-10, 4-11 and 4-12.

As described in section 4.3.3, 30.6% of patients were asymptomatic at their first test (see Table 4-10). There was a statistically significant difference in symptom status at first test across RIO groups (p<0.0001). Patients living in the RIO 76+ areas had the smallest proportion of asymptomatic CRC at 25.5%. Asymptomatic CRC ranged from 29.6% to 34.2% amongst the other RIO groups.

In every RIO group the median diagnostic interval was longer for asymptomatic patients than symptomatic patients. This corresponded with our understanding of the triaging that takes place during the recommended CRC diagnostic pathway (89). The difference in the median diagnostic interval between asymptomatic and symptomatic CRC was 35 to 38 days in the RIO 0-55 groups. This difference in median diagnostic interval was greater for the RIO 56-75 group and the RIO 76+ group at 44 days and 55 days, respectively.

Amongst asymptomatic patients (n=8,562) the median diagnostic interval was relatively stable across all six RIO groups, although there was a significant difference (p=0.04). Amongst symptomatic CRC patients (n=19,380), the RIO 76+ group had the shortest median diagnostic interval (36 days) compared to the other RIO groups (next shortest diagnostic interval was 47 days in the RIO 31-45 group and the RIO 56-75 group) (p=0.02).

Although the median diagnostic interval was always considerably shorter for the symptomatic patients versus the asymptomatic patients, the opposite trend was found for the 90th percentile. In all RIO groups, except the RIO 56-75 group, the 90th percentile was 11 to 47 days longer for symptomatic patients than asymptomatic patients. For asymptomatic patients the 90th percentile diagnostic interval ranged from 252 to 301 days across RIO groups and was similar across RIO groups except for the RIO 56-75 group, which was noticeably longer (301 days). For symptomatic patients the 90th percentile diagnostic interval ranged from 276 to 313 days across RIO groups and generally declined as rurality increased.

Table 4-10: Comparison of CRC symptom status (asymptomatic or symptomatic) distribution (χ² p<0.0001, trend* p=0.004) and diagnostic interval (days) (50th/median and 90th percentiles) across RIO groups

| | RIO 0-9 (least rural) | RIO 10-30 | RIO 31-45 | RIO 46-55 | RIO 56-75 | RIO 76+ (most rural) | p-value (Kruskal-Wallis Test**) |
|---|---|---|---|---|---|---|---|
| % Asymptomatic (n=8,562) | 29.6 | 32.8 | 34.2 | 30.2 | 31.5 | 25.5 | |
| Diagnostic interval (days) | n=5,426 | n=1,623 | n=949 | n=228 | n=252 | n=84 | |
| Median | 86 | 92 | 85 | 90.5 | 91 | 91 | 0.04 |
| 90th percentile | 260 | 268 | 268 | 252 | 301 | 265 | |
| % Symptomatic (n=19,380) | 70.4 | 67.2 | 65.8 | 69.8 | 68.5 | 74.5 | |
| Diagnostic interval (days) | n=12,915 | n=3,322 | n=1,825 | n=526 | n=547 | n=245 | |
| Median | 50 | 54.5 | 47 | 55.5 | 47 | 36 | 0.02 |
| 90th percentile | 303 | 313 | 292 | 299 | 295 | 276 | |

Abbreviations: RIO Rurality Index for Ontario; DI diagnostic interval
* Cochran-Armitage Trend Test (2-sided)
** Kruskal-Wallis Test, compares median values across categories

As described in section 4.3.3, 19.5% of patients, overall, were diagnosed through an ED (see Table 4-11). ED presentation generally increased with increasing rurality and showed a statistically significant trend (p<0.0001). The proportion of patients with an ED presentation ranged from 18.7% in the RIO 0-9 group to 26.4% in the RIO 76+ group. As expected, the median diagnostic interval was much shorter for ED presentation cases versus non-ED presentation cases (on the order of 47 to 75 days shorter) for all RIO groups. Amongst the ED presentation group, the median diagnostic interval increased as rurality increased (p<0.0001) while amongst the non-ED presentation group the median diagnostic interval was the lowest in the RIO 76+ group (p=0.0006).


Table 4-11: Comparison of emergency department (ED) presentation (yes or no) distribution (χ² p<0.0001, trend* p<0.0001) and diagnostic interval (days) (50th/median and 90th percentiles) across RIO groups

| | RIO 0-9 (least rural) | RIO 10-30 | RIO 31-45 | RIO 46-55 | RIO 56-75 | RIO 76+ (most rural) | p-value (Kruskal-Wallis Test**) |
|---|---|---|---|---|---|---|---|
| % ED presentation (n=5,439) | 18.7 | 20.6 | 18.8 | 24.0 | 25.0 | 26.4 | |
| Diagnostic interval (days) | n=3,433 | n=1,016 | n=522 | n=181 | n=200 | n=87 | |
| Median DI | 3 | 9 | 8 | 16 | 17 | 17 | <0.0001 |
| 90th percentile DI | 194 | 219 | 185 | 199 | 239.5 | 203 | |
| % non-ED presentation (n=22,503) | 81.3 | 79.5 | 81.2 | 76.0 | 75.0 | 73.6 | |
| Diagnostic interval (days) | n=14,908 | n=3,929 | n=2,252 | n=573 | n=599 | n=242 | |
| Median DI | 78 | 83 | 75 | 84 | 76 | 64 | 0.0006 |
| 90th percentile DI | 298 | 308 | 289 | 306 | 309 | 276 | |

Abbreviations: RIO Rurality Index for Ontario; ED emergency department; DI diagnostic interval
* Cochran-Armitage Trend Test (2-sided)
** Kruskal-Wallis Test, compares median values across categories
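The trend p-value reported for ED presentation can be approximated from the group counts in Table 4-11. The Python sketch below implements the two-sided Cochran-Armitage trend test with a normal approximation. It assumes equally spaced scores (0 to 5) for the six RIO groups, which the thesis does not state, and the function name `cochran_armitage` is hypothetical; it is an illustration rather than a reproduction of the original analysis.

```python
import math


def cochran_armitage(events, totals, scores=None):
    """Two-sided Cochran-Armitage trend test; returns (z, p_value)."""
    if scores is None:
        scores = list(range(len(totals)))  # assumed equally spaced scores
    n = sum(totals)
    p_bar = sum(events) / n
    # Score-weighted excess of events over what the pooled proportion predicts.
    t_stat = sum(s * (r - m * p_bar) for s, r, m in zip(scores, events, totals))
    s1 = sum(m * s for s, m in zip(scores, totals))
    s2 = sum(m * s * s for s, m in zip(scores, totals))
    variance = p_bar * (1 - p_bar) * (s2 - s1 * s1 / n)
    z = t_stat / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


# Counts by RIO group (0-9, 10-30, 31-45, 46-55, 56-75, 76+) from Table 4-11.
ed_cases = [3433, 1016, 522, 181, 200, 87]
non_ed_cases = [14908, 3929, 2252, 573, 599, 242]
group_totals = [a + b for a, b in zip(ed_cases, non_ed_cases)]

z_stat, p_value = cochran_armitage(ed_cases, group_totals)
print(f"z = {z_stat:.2f}, two-sided p = {p_value:.2e}")
```

Under these assumptions the statistic is positive, meaning the proportion of ED presentations rises with rurality, and the p-value falls below 0.0001, in line with the reported trend.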

There was a statistically significant difference in the first test distribution (p<0.0001) (see Table 4-12). Colonoscopy was the most common first test in the CRC cohort (32.1%) followed by FOBT (19.5%), abdominal ultrasound (17.9%), abdominal X-ray (11.0%), abdominal CT (7.9%), ED NOS (5.6%), sigmoidoscopy (4.1%) and barium enema (1.9%). For the most part, the distribution of a first test across RIO groups was consistent. Notable differences included a higher proportion of abdominal US as a first test in the RIO 0-9 group (19.3%) versus the RIO 56-75 group (13.6%), a higher proportion of colonoscopy as a first test in the more rural groups (about 39%) than the less rural groups (about 32%) and a monotonic decrease in the proportion of FOBTs as a first test from the RIO 0-9 group (20.0%) to the RIO 76+ group (12.8%).

The median diagnostic interval varied greatly depending on the first test. Abdominal US as a first test had the longest median diagnostic intervals (ranging from 118 to 151 days across RIO groups) and, as expected, ED NOS had the shortest median diagnostic intervals (ranging from 2 to 3.5 days across RIO groups). Abdominal X-ray as a first test also had relatively short median diagnostic intervals (ranging from 18 to 27 days across RIO groups). There was no statistically significant difference in the median diagnostic interval across RIO categories within any first test group except colonoscopy (p<0.0001). At the 90th percentile, abdominal CT had the longest diagnostic intervals (ranging from 342 to 393 days across RIO groups). At the 90th percentile abdominal US also had long diagnostic intervals (ranging from 337 to 379 days across RIO groups) and ED NOS had, by far, the shortest diagnostic intervals (ranging from 21 to 30 days across RIO groups).


Table 4-12: Comparison of first test distribution (χ² p<0.0001) and diagnostic interval (days) (50th/median and 90th percentiles) across RIO groups

| First test | RIO 0-9 (least rural) | RIO 10-30 | RIO 31-45 | RIO 46-55 | RIO 56-75 | RIO 76+ (most rural) | p-value (Kruskal-Wallis Test**) |
|---|---|---|---|---|---|---|---|
| % Colonoscopy (n=8,978) | 30.8 | 33.4 | 35.8 | 33.2 | 39.7 | 38.6 | |
| Diagnostic interval (days) | n=5,640 | n=1,652 | n=992 | n=250 | n=317 | n=127 | |
| Median DI | 59 | 73 | 67 | 72 | 69 | 60 | <0.0001 |
| 90th percentile DI | 241.5 | 296 | 268 | 248 | 295 | 271 | |
| % FOBT (n=5,442) | 20.0 | 19.7 | 19.5 | 15.4 | 13.4 | 12.8 | |
| Diagnostic interval (days) | n=3,662 | n=974 | n=541 | n=116 | n=107 | n=42 | |
| Median DI | 85 | 83 | 78 | 87.5 | 87 | 87 | 0.80 |
| 90th percentile DI | 248 | 246 | 246 | 243 | 266 | 235 | |
| % Abdominal US (n=5,004) | 19.3 | 15.3 | 15.0 | 17.1 | 13.6 | 16.1 | |
| Diagnostic interval (days) | n=3,543 | n=754 | n=416 | n=129 | n=109 | n=53 | |
| Median DI | 133 | 138.5 | 128 | 140 | 118 | 151 | 0.32 |
| 90th percentile DI | 347 | 354 | 351 | 379 | 337 | 360 | |
| % Abdominal X-ray (n=3,064) | 10.4 | 12.5 | 10.1 | 12.7 | 14.8 | 12.2 | |
| Diagnostic interval (days) | n=1,912 | n=617 | n=281 | n=96 | n=118 | n=40 | |
| Median DI | 27 | 24 | 22 | 21.5 | 23 | 18 | 0.67 |
| 90th percentile DI | 269 | 277 | 234 | 264 | 247 | 213 | |
| % Abdominal CT (n=2,199) | 7.8 | 7.9 | 8.5 | 8.4 | 6.0 | 7.3 | |
| Diagnostic interval (days) | n=1,437 | n=391 | n=236 | n=63 | n=48 | n=24 | |
| Median DI | 44 | 53 | 46 | 37 | 89.5 | 31 | 0.19 |
| 90th percentile DI | 385 | 389 | 342 | 393 | 367 | 377 | |
| % ED NOS (n=1,576) | 5.8 | 5.2 | 5.1 | 6.5 | 5.9 | 8.5 | |
| Diagnostic interval (days) | n=1,056 | n=255 | n=141 | n=49 | n=47 | n=28 | |
| Median DI | 2 | 2 | 2 | 2 | 2 | 3.5 | 0.25 |
| 90th percentile DI | 21 | 21 | 23 | 30 | 25 | 23 | |
| % Sigmoidoscopy (n=1,148) | 4.0 | 4.4 | 4.1 | * | * | * | |
| Diagnostic interval (days) | n=733 | n=218 | n=115 | n<35* | n<40* | n<20* | |
| Median DI | 52 | 50 | 55 | 81 | 65 | 92.5 | 0.41 |
| 90th percentile DI | 325 | 328 | 310 | 310 | 278 | 299 | |
| % Barium enema (n=531) | 2.0 | 1.7 | 1.9 | * | * | * | |
| Diagnostic interval (days) | n=358 | n=84 | n=52 | n<20* | n<20* | n<20* | |
| Median DI | 45 | 40 | 43.5 | 40.5 | 69.5 | - | 0.80*** |
| 90th percentile DI | 144 | 149 | 174 | 190 | 162 | - | |

Abbreviations: RIO Rurality Index for Ontario; FOBT fecal occult blood test; US ultrasound; CT computed tomography; ED NOS emergency department not otherwise specified
* Small cell size, suppressed due to ICES confidentiality regulations
** Kruskal-Wallis Test, compares median values across categories
*** Small cell size, therefore, the p-value was an imperfect approximation

4.5 Regression Diagnostics

Regression diagnostics were completed for the fully adjusted logistic regression model and the 10 fully adjusted quantile regression models to assess model adequacy. In the logistic regression model three types of diagnostic statistics were assessed graphically: Pearson residuals, influence and leverage. No observations were identified as strongly influential or as warranting further scrutiny. An overall summary measure, the Hosmer and Lemeshow goodness-of-fit test, was also computed (p=0.35) and further supported our graphical findings. The adjusted logistic model fit the data adequately and no observations were deleted.
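For readers unfamiliar with the Hosmer and Lemeshow test, the following is a minimal sketch of how the statistic is formed from observed outcomes and fitted probabilities. It is an illustration only and not the code used in this study; the function name, the equal-size grouping rule and the toy data are assumptions. The statistic is referred to a chi-square distribution with (number of groups - 2) degrees of freedom to obtain a p-value, a step that is not shown here.

```python
def hosmer_lemeshow(y, p, groups=10):
    """Return (statistic, degrees_of_freedom) for the Hosmer-Lemeshow test.

    y: observed 0/1 outcomes; p: fitted probabilities from a logistic model.
    Observations are sorted by fitted probability and split into roughly
    equal-sized groups (an assumed grouping rule for illustration).
    """
    pairs = sorted(zip(p, y), key=lambda pair: pair[0])  # stable sort on p only
    n = len(pairs)
    stat = 0.0
    used = 0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        n_g = len(chunk)
        expected = sum(prob for prob, _ in chunk)      # expected events
        observed = sum(outcome for _, outcome in chunk)  # observed events
        mean_p = expected / n_g
        denom = n_g * mean_p * (1 - mean_p)
        if denom > 0:
            stat += (observed - expected) ** 2 / denom
        used += 1
    return stat, used - 2


# Hypothetical toy data: 20 patients, all with fitted probability 0.5.
toy_y = [0, 1] * 10
toy_p = [0.5] * 20
hl_stat, hl_df = hosmer_lemeshow(toy_y, toy_p, groups=4)
```

A small statistic relative to its degrees of freedom (and hence a large p-value, such as the p=0.35 reported above) indicates no evidence of poor calibration.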

Quantile regression models are thought to be quite robust. In contrast to linear regression models, which are also used with a continuous dependent variable, quantile regression is insensitive to outliers and is free from many underlying model assumptions such as normality, homoscedasticity and the one-model assumption (that the regression model used is suitable for all data) (185). Quantile regression models are useful in the presence of skewed distributions, such as the diagnostic interval. Their insensitivity to outliers is essential, as deleting outliers is an unfavorable model-fitting solution when studying population inequities; these extreme data points are important to study (185). Based on the robustness of the quantile regression model we did not expect issues with model fit; however, diagnostics for each of the 10 quantile regression models were analyzed separately by examining histograms of the standardized residuals with two fitted density curves (normal and kernel) overlaid (see Appendix M and N) (188). Histograms from the median regression exhibited a slightly right-skewed distribution of the residuals and histograms from the 90th percentile regression were relatively normal. This suggested that the models fit the data adequately because the individual data points were falling equally about the predicted model.
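The robustness of the median to outliers can be illustrated with the check (pinball) loss that quantile regression minimizes. The sketch below is not the study's code: it is an intercept-only example, the interval values are hypothetical, and the brute-force search over observed values is used only for transparency.

```python
def check_loss(residual, tau):
    """Check (pinball) loss for one residual at quantile level tau."""
    return residual * (tau - (1 if residual < 0 else 0))


def total_loss(values, candidate, tau):
    """Sum of check losses if `candidate` is used as the fitted quantile."""
    return sum(check_loss(v - candidate, tau) for v in values)


def best_quantile(values, tau):
    """Observed value that minimizes the total check loss (brute force)."""
    return min(sorted(values), key=lambda c: total_loss(values, c, tau))


# Hypothetical diagnostic intervals in days (not study data).
intervals = [2, 20, 37, 60, 64, 98, 150]
# Same data with the longest interval replaced by an extreme value.
with_outlier = intervals[:-1] + [900]

median_plain = best_quantile(intervals, 0.5)
median_outlier = best_quantile(with_outlier, 0.5)
mean_plain = sum(intervals) / len(intervals)
mean_outlier = sum(with_outlier) / len(with_outlier)
```

In this toy example the fitted median is unchanged by the extreme value while the mean shifts substantially, which is the property that makes deleting long intervals unnecessary.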

Chapter 5

Discussion

5.1 Summary of Key Findings

There were 27,942 CRC patients included in the analyses of this retrospective cohort study using administrative data. The majority of patients (65.6%) lived in the RIO 0-9 group (least rural), which included all major cities in Ontario. Using our symptom status algorithm we found that 30.6% of the cohort was asymptomatic at the time of their first test and using our ED presentation logic, we determined that 19.5% of patients were diagnosed through an ED. Colonoscopy was the most common first test in the CRC cohort (32.1%), followed by FOBT (19.5%).

In objective one we compared the CRC stage distribution across RIO categories. Contrary to our hypothesis, there was no statistically significant difference in CRC stage distribution across RIO categories (p=0.28). In a multivariable analysis we used logistic regression to model the odds of having late stage CRC (III, IV) versus early stage CRC (I, II). Rurality was not associated with increased odds of late stage cancer; in fact, one rural RIO category (56-75) had a significant protective effect (adjusted OR: 0.83, 95% CI: 0.72, 0.97). Covariates that had significant protective effects included major and minor comorbidities as well as distal colon and unspecified CRC sub-sites. Covariates that significantly increased the odds of having late stage CRC included younger age at CRC diagnosis and living in the most deprived neighbourhoods.

In objective two we compared the CRC diagnostic interval across RIO categories stratified by stage. The diagnostic interval distribution exhibited the characteristic right-skewed shape. Overall, the median diagnostic interval was 64 (IQR: 22-159) days and the 90th percentile was 288 days. As expected, the median diagnostic interval was longer for stage I CRCs than for stage IV CRCs. There was a statistically significant difference in median diagnostic interval by RIO category only for stage I CRCs (p=0.0005), with the RIO 76+ group (most rural) having the shortest diagnostic interval. Two mid-ranged RIO categories (RIO 10-30 and RIO 46-55) often had the longest diagnostic interval throughout the five stage strata (except stage II). A multivariable quantile regression analysis was performed at the 50th/median and 90th percentiles, stratified by stage. At the median, the adjusted difference (in days) for RIO categories 10-55 generally suggested a similar or longer diagnostic interval compared to the RIO 0-9 group. At the 90th percentile, the only estimates that were statistically significant were for stage I CRCs, in which the RIO 10-30 group had a longer interval and the RIO 76+ group had a shorter interval compared to the RIO 0-9 group. Statistically significant covariates included age, comorbidity and CRC sub-site. In each RIO group the median diagnostic interval was longer for asymptomatic patients than for symptomatic patients. Furthermore, in each RIO group, the median diagnostic interval was much shorter for ED presentation cases than for non-ED presentation cases.

In summary, our study found that more rural RIO categories were not disadvantaged in terms of CRC stage distribution. The median diagnostic interval for CRC patients in Ontario was 64 (IQR: 22-159) days and patients with stage I CRC had longer median diagnostic intervals than patients with stage IV CRC. Increasing rurality did not have an increasing effect on the CRC diagnostic interval. Within a stage stratum, some rural categories demonstrated longer intervals (generally the mid-ranged RIO groups) and some demonstrated shorter intervals (generally the most rural RIO groups), especially in stage I disease.

5.2 Discussion of Key Findings

The discussion of key findings is presented under the following sub-headings: diagnostic interval, stratifying the diagnostic interval by stage, rurality, important covariates, potential causal pathway variables and statistical considerations.

Diagnostic Interval

Methods to calculate the diagnostic interval were based on work by Singh and colleagues (22), who conducted a population-based study using administrative data in Manitoba. In 2005, the median diagnostic interval for CRC patients in Manitoba was 64 days and the 90th percentile was 264 days (22). This was very similar to the overall results from our study (2007-2012) which also found a median diagnostic interval of 64 days and a 90th percentile of 288 days. When Singh and colleagues subsequently studied the relationship between the diagnostic interval and CRC outcomes (15) the median diagnostic interval had continued to increase with a median of 92 days in 2008, which may be more reflective of the time interval we examined. Although these studies (15,22) represented the closest methodology to our own study, in terms of the diagnostic interval calculation, it should not be concluded that the median CRC diagnostic interval in Ontario was shorter than the median diagnostic interval in Manitoba, as a portion of this difference likely arose from differences in our methods. One such difference, meant to improve the accuracy of our study, was the use of control charts to determine the look-back periods for key CRC tests.

Singh and colleagues (22) used a one year look-back for the collection of key CRC tests. In our study, the control chart-based look-back period for radiological imaging tests (with the exception of barium enema) was close to one year (52 weeks); however, the look-back period for lower endoscopy tests (colonoscopy and sigmoidoscopy) was shorter at about 34 weeks. Lower endoscopy is an important test in the diagnosis of CRC (4) and was the first test for 40% of the cases included in Singh and colleagues' study (22) and 36% of cases in our study. Further, the 2012 study by Singh and colleagues (15) included CRC cases residing in the two main cities of Manitoba only (where two-thirds of the population live). This was because several key CRC tests in rural areas were not consistently recorded in their administrative databases. For diagnostic interval results to be directly compared with other studies a more similar methodology would be needed. This point can be further emphasized by the range of median diagnostic intervals found in Table 2-2 in the Literature Review Chapter. Ongoing work by a team of pan-Canadian investigators, called CanIMPACT (189), may be able to provide more insight on the diagnostic interval across different provinces as they are all using similar methods. Adoption of the definitions outlined in the Aarhus Statement (66) will also help to make studies in this field more comparable.
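To show how test-specific look-back periods change the calculated interval, the sketch below computes a diagnostic interval as the number of days from the earliest eligible encounter to the diagnosis date. It is a simplified illustration and not the algorithm used in this study: the test names, the dictionary of look-back weeks (34 for lower endoscopy and 52 for imaging, approximating the values quoted above) and the rule of ignoring encounters outside their window are assumptions.

```python
from datetime import date, timedelta

# Assumed look-back windows in weeks, approximating the values quoted above.
LOOKBACK_WEEKS = {
    "colonoscopy": 34,
    "sigmoidoscopy": 34,
    "abdominal_ct": 52,
    "abdominal_us": 52,
    "abdominal_xray": 52,
}


def diagnostic_interval(diagnosis_date, encounters):
    """Days from the earliest eligible encounter to diagnosis, or None.

    encounters: list of (test_name, encounter_date) tuples. An encounter is
    eligible only if it falls inside the look-back window for its test type.
    """
    eligible = []
    for test, when in encounters:
        weeks = LOOKBACK_WEEKS.get(test)
        if weeks is None:
            continue  # unknown test type: ignored in this sketch
        window_start = diagnosis_date - timedelta(weeks=weeks)
        if window_start <= when <= diagnosis_date:
            eligible.append(when)
    if not eligible:
        return None
    return (diagnosis_date - min(eligible)).days


# Hypothetical patient (not study data).
dx = date(2010, 6, 30)
history = [
    ("colonoscopy", dx - timedelta(days=60)),    # inside the 34-week window
    ("abdominal_ct", dx - timedelta(days=302)),  # inside the 52-week window
    ("colonoscopy", dx - timedelta(days=333)),   # outside the 34-week window
]
interval_days = diagnostic_interval(dx, history)
```

With a uniform one year look-back the earliest colonoscopy would have been counted and the interval would be 333 days rather than 302, which illustrates why intervals from studies with different look-back rules are not directly comparable.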

Stratifying the Diagnostic Interval by Stage

There was a statistically significant difference in the median diagnostic interval by stage (I-IV): 98 (IQR: 43, 200) days, 60 (IQR: 21, 150) days, 60 (IQR: 21, 150) days and 37 (IQR: 9, 107) days, respectively (p<0.0001). A possible explanation for this variation in results was that FPs were better able to identify stage IV CRCs (due to signs and symptoms), then convey this message of increased urgency to specialists and ultimately expedite the diagnostic process (9,84,85,87). Our results would seem to support the concept of the waiting time paradox as patients with late stage cancer, who would have a lower survival rate, had a faster median diagnostic interval. However, mortality/survival was not assessed in this study so we cannot comment about the U-shaped relationship between the diagnostic interval and mortality for CRC observed in the work by Tørring et al. (9,87) who caution against assuming a monotonic relationship between the diagnostic interval and mortality.

Stratifying objective two results by stage at diagnosis (effect modifier) allowed us to study whether the rurality-diagnostic interval association was different across stage groups. As expected, stage was a strong predictor of the diagnostic interval length, and rurality effects may have only been present in stage I cancer, which we hypothesized was most likely to be influenced by access issues. The direction of the stage I result did not, however, accord with the idea that more rural areas had worse access. Overall, some evidence of effect modification was seen, although no formal statistical test for interaction was completed. At the median (see Table 4-9), a similar pattern emerged for the adjusted difference in days estimates (of the diagnostic interval) for the RIO 10-55 groups across stages I-IV; however, there were differences in estimates within a RIO group across stage for RIO groups 56-75 and 76+. In the RIO 56-75 group stage III and IV estimates were greater than the reference group (RIO 0-9) while stage I and II estimates were shorter than the reference group. In the RIO 76+ group the stage I and stage unknown estimates were markedly shorter than estimates from the other stages. At the 90th percentile, differences in estimates across stage were most pronounced within the RIO 76+ group where the stage I estimate was -67.3 (95% CI: -129.4, -5.1) days and the stage IV estimate was 62.8 (95% CI: -72.9, 198.5) days (stage unknown was similar). Stages II and III had comparable estimates at -7.3 and -12.8 days. Although stage at presentation seems to be associated with differences in access to a timely diagnosis, how that is affected by access issues in more and less rural areas is unclear. Further understanding of the resources available in the areas represented by our RIO categories is warranted.

Rurality

Overall, the median diagnostic intervals for the first five RIO categories (RIO 0-75) were quite similar, ranging from 62-69 days, while the RIO 76+ category had a shorter median diagnostic interval at 53 days. The RIO 76+ category often had a shorter diagnostic interval throughout the objective two analyses, which led us to examine that group more closely. Contrary to our general understanding of rural populations, the most rural group had the youngest mean age of all RIO categories (66.6 years versus a cohort mean of 68.5 years) and had the greatest proportion of patients with zero minor comorbidities (17.3% versus a cohort mean of 10.6%). We were expecting the most rural group to be the oldest, have the most comorbidities and to be more deprived than the other RIO categories (19,20,123). Considering the CSDs included in the RIO 76+ category (see Appendix B), perhaps a portion of this most rural group is a special population representing a younger, resource-sector workforce (e.g. mining, logging, forestry). Another explanation could be that of an inherent well-being required to live in the most isolated rural areas. There could also be variations in resources across CSDs within a given RIO group that may be important to study. The RIO 56-75 category (second most rural) had the oldest mean age (69.7 years), had the greatest proportion of the most deprived neighbourhoods and had a comorbidity distribution similar to the mid-ranged RIO groups. Perhaps the RIO 56-75 group is more reflective of our concept of rural-living patients. Even so, this would still not support our hypothesis that patients living in more rural areas would have longer diagnostic intervals as the median diagnostic interval for the RIO 56-75 group was similar to the other less rural RIO groups and was in fact one of the next fastest median intervals at 62 (IQR: 20-149) days. A more in-depth analysis of rurality is warranted to better identify and characterize pockets of communities to understand which may have better or worse access to health care services. There was also a greater proportion of patients in the most rural group who did not have a referral for their first test (22.8% versus 14.2% overall). This could contribute to the shorter interval in the RIO 76+ group.

There are very few studies examining the association between rurality as a primary exposure and the diagnostic interval. A Scottish study by Robertson and colleagues (74) may have been the most similar in nature to our research question, although they studied the time from patient presentation to treatment. Robertson and colleagues (74) did not find a statistically significant difference in the time from presentation to treatment between rural and urban areas.

We expected that stage I CRCs would be the most vulnerable to variations across RIO groups and it was the only stratum in which there was a statistically significant difference in the median diagnostic intervals (medians ranged from 58.5 days in the RIO 76+ category to 108 days in the mid-ranged RIO 46-55 category); however, the direction of the effect was not what we hypothesized. An underlying mechanism for the presence/absence of statistically significant interval variation by stage could be related to symptom presentation, whereby it may be both easier for health care providers to identify more alarming symptoms associated with later stage CRC and for the patient to then move through a more uniform/structured diagnostic pathway (9,84,85,87). Thus, in stage II-IV CRC there was not a statistically significant difference in the median diagnostic interval detected across RIO categories. Additionally, the diagnostic interval interquartile range for each RIO category generally decreased from stage I to IV. The advantage, in terms of the reduced diagnostic interval, for stage I CRC patients living in the most rural areas could be associated with shorter wait times for some key CRC tests or perhaps fewer possible diagnostic options, streamlining decision making. It is also important to consider that there was a smaller proportion of asymptomatic CRCs in the most rural RIO category (25.5% versus 30.6% overall). A portion of the asymptomatic cases would represent screen-detected CRCs, which generally have a longer diagnostic interval than symptomatic cases. Therefore, this could also play a part in the reduced diagnostic interval for the RIO 76+ category for stage I CRCs.

It is also important to consider the implications of our RIO categorization scheme, which resulted in small numbers in the most rural categories (especially the RIO 76+ group) as compared to the less rural groups. For example, when studying the rurality-diagnostic interval association for stage I, it is possible that the observed shorter diagnostic interval in the RIO 76+ group was partially attributable to a statistical artifact of the small sample size. Further, it is important to note that the patients who were missing a RIO score (n=287) would have likely belonged to the RIO 76+ group considering where the RIO excluded areas are located (CSDs with populations less than 500, Unorganized Areas and First Nations reserves and settlements).

Some patients in our study may have had colonoscopies in private clinics that bill OHIP, especially for screening purposes. This would be more likely in the RIO 0-9 areas and might decrease the diagnostic interval for screen-detected patients in these areas. However, our results show that while the median diagnostic interval for asymptomatic cases in the RIO 0-9 group was second shortest, there was not a great deal of variation between the RIO groups among the asymptomatic population (the maximum difference between the median diagnostic intervals was seven days).

Important Covariates

Patients in the youngest and oldest age groups had the fastest median diagnostic intervals; the trend followed a negative quadratic shape with patients aged 65-74 having the longest median diagnostic interval. A systematic review by Mitchell and colleagues (5) found that older patient age generally decreased practitioner delay (interval between first consult and referral), which is a component of the diagnostic interval. Contrary to our results, Van Hout and colleagues (73) reported that patients less than 50 years old had a higher risk of delay between referral and CRC diagnosis, although they cautioned that their numbers were small. The authors hypothesized that this result could be due to physicians assigning a lower chance of CRC in those younger patients. Several other studies did not find a statistically significant difference by age (11,13,67), although the intervals measured did not exactly match our study.

The association between comorbidity and the diagnostic interval was our most consistent finding. There was a positive linear trend between the number of major and minor ADGs and the diagnostic interval; the more comorbidities a patient had, the longer the diagnostic interval. This was found throughout all stage groups. At the median, the addition of one major ADG increased the diagnostic interval by 5-9 days and the addition of one minor ADG increased the diagnostic interval by 4-6 days. This effect was even greater at the 90th percentile.

The direction of the association between comorbidities and the diagnostic interval is not consistent throughout the literature. A systematic review by Mitchell and colleagues (5) identified one study (190) which found that co-existing disease decreased the practitioner delay (consultation to referral), a finding opposite to ours. In three studies that characterized the time period between referral and diagnosis (11,67) or referral to colonoscopy (18), two found no statistically significant differences with or without comorbidities and one found significant results for select comorbidities only. When Van Hout and colleagues (73) studied the period from first FP contact to referral with a specialist, multivariate analysis revealed that patients with psychiatric comorbidity were at a four times greater risk of having a delay of more than 60 days (OR: 4.0, 95% CI: 1.1, 13.3). Singh and colleagues (22) identified comorbidity as a significant predictor of the overall wait-time (patient first contact with healthcare system to start of treatment) in the Manitoba population, with patients who had a Charlson comorbidity index greater than three waiting longer (HR: 0.86, 95% CI: 0.76, 0.96, where lower HR represented longer wait times). With a range of methods to measure comorbidities plus the variation in time intervals it is hard to draw a clear conclusion from the literature. An explanation for our results could be related to the difficulty an FP may have in recognizing and investigating CRC-relevant symptoms, which can already be vague, in patients with multiple other existing conditions. In an American study (75), patients who had congestive heart failure and coronary heart disease were more likely to have a “missed opportunity” to initiate endoscopic tests for CRC.

We found that proximal and distal colon cancers had longer diagnostic intervals compared to rectal cancer for early stage CRC but, for late stage CRC, proximal colon cancer had a faster diagnostic interval compared to rectal cancer. Generally, previous research has found that rectal cancers have shorter diagnostic intervals (or parts of the diagnostic interval) than other sub-sites of the large intestine (5) or that there is not a statistically significant difference (11). Most research did not present results by stage, but findings are likely linked to site-specific symptoms.

Possible Causal Pathway Variables

In objective one, CRC symptom status and ED presentation were considered as potential causal pathway variables between rurality and stage. In objective two, CRC symptom status, ED presentation and first test type were considered as potential causal pathway variables between rurality and the diagnostic interval.

Using our ED presentation logic we found that 19.5% of the cohort was diagnosed through an ED. This proportion was similar to the 18.1% of CRC cases that presented with obstruction, perforation or emergency admission in Ontario from a study using administrative data (129) and the 24% estimated from cases who were undergoing surgical resection for CRC in another Canadian study (191). In terms of obtaining a proportion of CRC cases who presented to the ED, our study may have achieved a more accurate estimate as we had access to a new administrative dataset (NACRS) that has become available since the publication of the Ontario estimate (which used a CIHI “admission type” code and would not have included outpatient records) and we also tried to account for patients who had a scheduled appointment in the ED.

Emergency cases presenting at a later stage and having a much faster diagnostic interval (or overall interval) was consistent with the literature (22,191). Our causal pathway thinking was that ED presentation may be more likely to happen in rural areas where there could be less access to FPs; this could manifest as a shorter interval than would have otherwise been seen in those areas. There was a greater proportion of ED presentation cases in the RIO 76+ group (26.4% versus 19.5% overall) which may have contributed to the shorter median diagnostic interval observed in this group. However, amongst the ED presentation group, the median diagnostic interval was longest for the RIO 76+ category, making it difficult to predict an overall effect. If there was less access to FPs in rural areas, more patients may ultimately use the ED to be diagnosed with a late stage CRC. While we did observe a greater proportion of ED presentation cases in the RIO 76+ group, this did not translate into a greater proportion of late stage CRC in the RIO 76+ group.

We revised an algorithm based on work from El-Serag and colleagues (23) to determine if a patient was asymptomatic or symptomatic at their first test. Approximately 30.6% of the cohort was asymptomatic at the time of their first test, according to our algorithm. We believe that the majority of these cases would be screen-detected (ColonCancerCheck FOBT or increased risk colonoscopy); however, it is difficult to know with complete certainty without specialized screening codes. An effort to validate our algorithm will be made by our lab group in the future by comparing results with 2013 data, where screening-specific codes exist. To further investigate the proportion of true screening individuals within the asymptomatic group various sub-analyses could be done on the current dataset. These could include examining percent asymptomatic by diagnosis year (where we would expect an increasing asymptomatic percent with year as well as a spike in the proportion from 2007-2009 as ColonCancerCheck was initiated in April 2008) and comparing the percent asymptomatic between the age groups that are and are not eligible for FOBT screening (i.e. between 50-74 years and not between 50-74 years). Due to time constraints we could not complete these investigations, but they may be of interest for future work.

Clinical triage plays an important role in the diagnostic interval. Our causal pathway thinking was that there would be a smaller proportion of asymptomatic patients in the RIO 76+ group, where there are generally lower rates of screening. This could manifest as a shorter diagnostic interval in those areas due to fewer cases traveling on a “slower” screening colonoscopy pathway, which is likely due to a lower probability of CRC in screening cases (67,89). There was a smaller proportion of asymptomatic patients in the RIO 76+ group (25.5% versus 30.6% overall) which may have partially contributed to the shorter median diagnostic interval observed in this group. Screening can detect CRC at an early stage; if there were fewer screening opportunities in more rural areas there could be more late stage CRCs. While we did observe a smaller proportion of asymptomatic patients in the RIO 76+ group, there was not a greater proportion of stage IV CRCs (16.8% versus 18.0% overall).

Having a particular test first for suspected CRC may influence a patient’s diagnostic interval. Our causal pathway thinking was that the availability of particular tests could differ by rurality and therefore influence the diagnostic interval. The median diagnostic interval varied greatly by first test; however, the proportion of a first test across RIO groups and the median diagnostic interval of a test across RIO groups were quite similar. We did not think that first test would lie on the path to CRC stage at diagnosis.

General Statistical Considerations

In objective one, where we examined rurality and the association with late stage versus early stage CRC, a logistic regression multivariable model was chosen although late stage CRC was a common outcome (>10%). It was therefore important to interpret results from the logistic regression model as an odds ratio and not as a relative risk because the rare disease assumption would not hold true (192). A log-binomial model could have been used to directly estimate a relative risk with our common outcome data (192), but we felt that the conventional logistic regression model, interpreted as an odds ratio only, was sufficient.
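The reason an odds ratio should not be read as a relative risk when the outcome is common can be seen in a small worked example. The counts below are hypothetical and chosen only for illustration; they are not study data.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a 2x2 table.

    a: exposed with outcome,   b: exposed without outcome,
    c: unexposed with outcome, d: unexposed without outcome.
    """
    return (a / b) / (c / d)


def relative_risk(a, b, c, d):
    """Relative risk from the same 2x2 table."""
    return (a / (a + b)) / (c / (c + d))


# Hypothetical common outcome: 60% versus 50% of patients with the outcome.
common = (60, 40, 50, 50)
# Hypothetical rare outcome: 6% versus 5% of patients with the outcome.
rare = (6, 94, 5, 95)

or_common, rr_common = odds_ratio(*common), relative_risk(*common)
or_rare, rr_rare = odds_ratio(*rare), relative_risk(*rare)
```

Both tables have a relative risk of 1.2, but the odds ratio is 1.5 for the common outcome and about 1.21 for the rare one, so the odds ratio approximates the relative risk only when the outcome is rare.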

The issue of multiple comparisons should be addressed for objective two. Perhaps the most relevant example is the relationship explored in Table 4-9 (which compares the diagnostic interval adjusted difference in days across RIO groups stratified by stage) where there are 10 comparisons per stage group (five RIO groups for the median and five RIO groups for the 90th percentile). As more indicator variables are tested, it becomes more likely that rurality and stage will appear to have a significant relationship for at least one test by chance alone. However, since the analysis was stratified, each stage group can be considered as a separate population and using an alpha of 0.05 (or 1/20) we would expect fewer than one significant result to occur by chance for each stratum (because each stage stratum contains 10 comparisons).
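The arithmetic behind this expectation can be written out as follows. The sketch assumes that the 10 comparisons within a stratum are independent and that every null hypothesis is true, which is a simplification; the function names are illustrative.

```python
def expected_false_positives(alpha, comparisons):
    """Expected number of chance findings when all null hypotheses hold."""
    return alpha * comparisons


def prob_at_least_one(alpha, comparisons):
    """Probability of one or more chance findings, assuming independence."""
    return 1 - (1 - alpha) ** comparisons


per_stratum_expected = expected_false_positives(0.05, 10)
per_stratum_any = prob_at_least_one(0.05, 10)
```

Under these assumptions the expected number of chance findings per stratum is 0.5, while the probability of at least one chance finding is roughly 0.40, so isolated significant estimates should still be interpreted with caution.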

We considered the issue of clustering and/or the need for hierarchical modeling. For example, when Massarweh and colleagues (115) studied the association between travel distance and late stage colon cancer they considered that patients living within the same geographic area would have similar access to healthcare providers and facilities and therefore would have similar travel issues when accessing healthcare. Those researchers therefore used a two-level model, at the patient and geographic region levels, to take this correlation into account. In our study, rurality categories contained regions dispersed throughout the province, thereby reducing the correlation among observations within a category. Clustering at the referring physician-level is highly unlikely as each individual FP has, at most, one or two new cases of CRC per year. There is a possibility that clustering could occur at other levels (e.g. LHIN) and this could be a limitation, although, again, the six rurality categories contained CSDs dispersed throughout the province.

5.3 Strengths and Limitations

Results should be interpreted within the context of the study strengths and limitations. Study strengths will be outlined first followed by the limitations.

5.3.1 Strengths

Many of the study strengths were related to the large network of administrative databases that we were able to access through ICES Queen’s. For the most part, the specific databases used in this project, like the OCR, are routinely used for research and have been evaluated for completeness. Using administrative data allowed us to conduct a population-based study of CRC in Ontario. This was a major study strength and made us confident in the generalizability (external validity) of our findings to Ontarians, helping to minimize selection bias. We were not, however, able to include those living in First Nations reserves and settlements so our results do not generalize to that group (see missing data section below). Another major strength associated with the use of ICES data was the link-ability between databases, which permitted us to control for many covariates. The use of routinely collected administrative data also avoided the issue of recall bias as we did not rely on patient self-reported data. Recall bias is often an important limitation considered in studies that quantify the CRC diagnostic interval using self-reported data, such as patient questionnaires or interviews, because it can be especially difficult to recall when CRC symptoms, which are often vague, truly began (14,55). Furthermore, there can be a long period of time in between diagnosis and administration of the study questionnaire (78). Access to this administrative data provided us with a large sample size of CRC patients and allowed us to detect small differences in estimates. Without a large sample size the number of patients living in the most rural areas would be too small and further stratification of the results into five stage groups (I-IV and unknown) would not likely be an option. This was an early use of Ontario’s population-based stage data and the use of multiple (six) categories of rurality, versus an urban/rural dichotomization, addressed a need described in the literature (100,137,193,194). Further, the rurality index used was designed specifically for the province of Ontario. This may be the only province in Canada where such a study could currently be conducted considering the population size and availability/quality of administrative data.

In terms of study design and methods there were several strengths. Firstly, the retrospective cohort design allowed us to be efficient in terms of identifying a large group of CRC patients. A methodological improvement from other studies that measured the diagnostic interval using administrative data (15,22) was the use of control charts. Control charts helped us to define the relevant look-back periods for each key CRC test versus choosing a somewhat arbitrary time window. The control charts increased the accuracy of the outcome measurement, making the diagnostic interval estimates more reflective of the truth, thus enhancing internal validity. Methodological decisions to define the diagnostic interval were made in a way that would favour a more conservative estimate. Another strength of the study was consideration for possible causal pathway variables (ED presentation, symptom status and first test).

The distribution of the diagnostic interval was right-skewed and using quantile regression over a log-transformed model was a study strength. Quantile regression is a robust model and allowed us to present results in terms of absolute differences in days, which is important from a public health and knowledge translation perspective. Further, quantile regression allowed us to study both the median and 90th percentiles of the diagnostic interval distribution.
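
To illustrate why quantile-based estimates read naturally as differences in days, the following minimal sketch compares the median and 90th percentile of two simulated right-skewed samples. It is an assumption-laden simplification: with a single binary exposure and no other covariates, a quantile regression coefficient corresponds to the difference between the two group quantiles, whereas the models in this thesis adjusted for several covariates. The data, group labels and function names are hypothetical.

```python
import random
import statistics


def percentile(values, q):
    """Return the q-th percentile (1 <= q <= 99) by linear interpolation."""
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    return cuts[q - 1]


def quantile_difference(reference, comparison, q):
    """Absolute difference (comparison minus reference) at the q-th percentile."""
    return percentile(comparison, q) - percentile(reference, q)


# Simulated, hypothetical diagnostic intervals in days (log-normal, right-skewed).
random.seed(0)
reference = [random.lognormvariate(4.0, 1.0) for _ in range(2000)]
comparison = [random.lognormvariate(4.2, 1.0) for _ in range(2000)]

for q in (50, 90):
    diff = quantile_difference(reference, comparison, q)
    print(f"Difference at percentile {q}: {diff:.1f} days")
```

The differences are expressed directly in days at each percentile, whereas a coefficient from a log-transformed model would need to be interpreted as a ratio.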

5.3.2 Limitations

This study also had several limitations which are important to consider and are addressed below in sub-headings.

Study Design

Like many retrospective cohorts, the data used were not collected for the express purpose of answering our study question. As a result, there were some covariates for which we had no information, and for some patients we may not have had the postal code that best reflected their residence (both issues are expanded upon in the following paragraphs).

Under the strictest definition, studying a cancer cohort as we have done may not be considered a true retrospective cohort, for two reasons. Firstly, we studied only patients who were ultimately diagnosed with CRC, versus all patients who were suspected of having CRC, and secondly, we assigned our exposure (RIO score) at the time of diagnosis (resembling a cross-sectional design). Grunfeld and colleagues (70) point out that the “gold standard” study design would be to have prospectively collected data (on all suspected CRC cases) where the diagnostic interval is unknown at the study outset and where having access to multiple patient charts (family physician, oncology, cancer centre) would provide data not available through administrative datasets. Singh et al. (15) highlight the need for prospective databases in this field of study, especially for the diagnostic interval (versus treatment interval), and hoped that the increasing use of electronic health records would make this a possibility in the future.

Exposure Assessment

Assigning a rurality score to each patient was subject to some measurement error. This could have happened if a patient moved during the year of their CRC diagnosis and the postal code on July 1st of that year was not reflective of the CSD where the patient went through the diagnostic process. However, the more appropriate CSD of residence would have had to belong to a different RIO category (six levels of rurality) for measurement error to occur. This misclassification would most likely have been non-differential, but because our rurality measurement had more than two levels, results may have been biased towards or away from the null. Further, it is possible that some postal codes cross CSD boundaries and therefore a patient may sometimes have been assigned to the incorrect CSD. Another point to consider was that the distance component of the RIO score was measured from the centroid of the CSD, which may not perfectly reflect the distance travelled by an individual patient to basic and advanced referral centres. This could under- or overestimate the distance travelled. However, this would not likely change the ultimate RIO category that a patient was grouped into.
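
The point that non-differential misclassification of an exposure with more than two levels can bias results in either direction can be shown with a small worked sketch. All numbers are hypothetical and are not estimates from this study. For simplicity the sketch uses three exposure levels of equal size and mean intervals rather than medians, and the misclassification proportions do not depend on the outcome.

```python
def observed_means(true_means, reclass):
    """Mean outcome within each recorded exposure level.

    true_means[i] is the true mean interval (days) in exposure level i.
    reclass[i][j] is the proportion of people truly in level i who are
    recorded as level j (each row sums to 1). Equal numbers of people
    in each true level are assumed.
    """
    levels = range(len(true_means))
    observed = []
    for j in levels:
        weight = sum(reclass[i][j] for i in levels)
        total = sum(reclass[i][j] * true_means[i] for i in levels)
        observed.append(total / weight)
    return observed


# Hypothetical true means: least rural, mid-range, most rural.
true_means = [60.0, 60.0, 100.0]
# 20% of the mid-range and most rural groups are swapped with each other.
reclass = [
    [1.0, 0.0, 0.0],
    [0.0, 0.8, 0.2],
    [0.0, 0.2, 0.8],
]

obs = observed_means(true_means, reclass)
print("mid-range vs least rural:", obs[1] - obs[0])
print("most rural vs least rural:", obs[2] - obs[0])
```

In this illustration the true mid-range contrast is 0 days but the recorded contrast is about 8 days (bias away from the null), while the true most-rural contrast of 40 days is recorded as about 32 days (bias towards the null).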

A small proportion of study-eligible patients were missing a RIO score and were excluded. We compared study variable distributions between this group and the final cohort and concluded that there was not a statistically significant difference in the diagnostic interval distribution or stage (I-IV); however, there was a greater proportion of stage unknown in the missing RIO group (13.1% versus 7.3% in the final cohort). This is further discussed in the Missing Data sub-section.

Outcome Assessment

Difficulties in measuring accurate diagnostic intervals were important to consider. This difficulty in measurement lies in the process of identifying the index contact date. Using our method, it was possible that we measured the diagnostic interval conservatively. For example, the true index contact date may have happened before the physician visit that led to the referral for a key CRC test. This method may have caused us to see less of an effect than the truth, as the interval lengths reported may have been shorter than reality.

Another consideration related to the outcome measurement was the inclusion of two key CRC tests, abdominal US and FOBT, that had control chart signal strengths below 80%. This meant that if either of these two tests generated a patient’s index contact date we were slightly less confident that these tests were truly related to a patient’s CRC diagnosis within the given control chart look-back period (as compared to the other key CRC tests). This may have extended the diagnostic interval for some cases. It was important to consider the pros and cons of including abdominal US and FOBT. Unfortunately, once the dataset was cut we could not determine the proportion of patients who would not have had an index contact date if these two tests were not included. However, indication from the retrieval of missing index contact dates outlined in the methods section suggested that abdominal US was important to include. Furthermore, 19% of the cohort had an FOBT as their first test and 18% of the cohort had abdominal US as their first test, also signifying to us that these tests were important to keep. The low signal strength of abdominal US adds something new to the CRC literature and should be considered in future studies.

A minor, but interesting, point to consider is the difference between the definitions for “cancer date of diagnosis” in administrative data versus patient records. In studies using administrative data, such as our own, the cancer date of diagnosis is often the date that a tissue sample was received by pathology, which may or may not be synonymous with the date that a patient was informed of the diagnosis. For CRC this is often the same day; however, for other cancer sites such as prostate and lung there can be variation in the two definitions (70), and this discrepancy should especially be considered if patient anxiety during the diagnostic interval is of main interest.

Covariate Assessment

While the use of administrative databases was a study strength in most regards, there were certain limitations associated with this data source. One of these was the possibility of unmeasured confounding, as not all covariates identified in the literature review as having an association with the diagnostic interval were included in this study because they were not accessible through the administrative databases. Variables not accounted for in this project, but identified as significant in other research, included: patient’s presenting symptoms (8,55,67,75,195), lost follow-up of abnormal findings (77), specifics of referral notes (67), patient non-adherence (67), marital status (157) and ethnicity (157). Having definitive screening versus symptomatic fee codes would have also enhanced the study, although we did have our symptom status algorithm. Further, there is the possibility for residual confounding from imperfect adjustment of the variables that were included in the study. This may be of most relevance for immigrant status, for which we used a proxy, and for deprivation quintile, which was an area-level variable used to represent individual-level deprivation.

Missing Data

There were patients who belonged in the eligible (target) cohort for this study who were not included in the final analyses due to missing data. The two sources of missing data that led to exclusion from the eligible cohort were: missing an index contact date (therefore impossible to calculate the diagnostic interval) and missing a RIO score which, together, represented about 5% of the eligible cohort. In an effort to explore the possibility of (systematic) selection bias occurring here, we compared study variable distributions between patients in the final cohort and patients who were excluded from the eligible cohort. For example, was it possible that patients living in more rural areas had a greater proportion of cases with missing index contact dates? Results from our comparison tables were reassuring; there were not statistically significant differences in the distributions of the exposure or main outcome between the final and eligible cohorts. Given the small percentage of excluded cases and the results from our comparison tables, missing data was likely not a major limitation to this study, but some differences in covariates did appear between the two groups. Missing CSDs due to the RIO score exclusions (First Nations reserves and settlements, CSDs with a population of less than 500 and Unorganized Areas) may have implications for the external validity of the study. These are likely all CSDs that would fall into our most rural category. Given its small size, inclusion of patients with missing RIO in that category may have had an impact on the magnitude or direction of our results.

The exclusion of rectosigmoid junction cancers from the eligible cohort was a slightly different case of missing data, as the entire group was erroneously not included. These cases would have been later categorized into the rectal sub-site grouping, and a comparison of select study variable distributions between patients with a rectal cancer and a rectosigmoid junction cancer was made. There was not a statistically significant difference detected in the main outcome between these two groups. Furthermore, the percent change in variable distributions in the final cohort if the rectosigmoid junction cases were added would have been small.

There were two other issues of missing data that were of lesser importance. Firstly, there was a small amount of missing data (<1% of the cohort) for the deprivation quintile variable. Since this represented such a small number of cases and was for a confounder, we were not worried about it biasing results. Secondly, stage data after May 31st, 2012 was found to be of poor quality in the ICES database, likely due to a lag in data transfer, and we therefore ended our study period at this point versus the full 2012 calendar year. This would only affect our sample size, but is mentioned here for future ICES researchers to check early on.

Sample Size

While this study did have a large sample size, a small cell size was apparent when examining first test (barium enema) across rurality. The variable “first test” was not included in our models and sample size considerations were outlined in the Methods Chapter.

5.4 Study Contribution

This study enhanced an existing method to calculate the CRC diagnostic interval from administrative databases and revised a symptom status algorithm to create two fundamental resources for future research. The study was also the first to examine rurality as a primary exposure in relation to the diagnostic interval of CRC patients in the Canadian context. Given that equitable access to healthcare services is a pillar of the publicly funded Canadian health care system, our research question was important to try to answer. This study was the first to quantify the CRC diagnostic interval at an Ontario population-level and represents an early use of Ontario population-based stage data. It is difficult to improve upon the diagnostic process if it is not first measured and characterized; this study has provided preliminary methodology and initial estimates of the CRC diagnostic interval in Ontario for a future body of work.

5.5 Public Health Implications and Future Research Directions

Wait times while moving through the cancer diagnosis and treatment pathways are a public health concern (22). The diagnostic interval is often a time of anxiety for patients and their families that would be eased by prompt testing (15). Although timeliness is an essential component of patient-centered care (196) and is of great political and public health interest (70,197), there could be more emphasis and study on the wait time benchmarks set by the Canadian Association of Gastroenterology (89) for navigation through the peri-diagnostic interval. This may be an important area for future policy development. A study by Jiwa and colleagues (107) in Western Australia aimed to identify factors that influenced the speed of cancer diagnosis. For one component of the study, a framework that was first developed for examining industrial accidents was used to identify and classify “error producing circumstances” and “organizational factors” in a clinical context (107,198). It would be interesting to do a similar study in Ontario involving key stakeholders.

The difference in the diagnostic interval by stage was interesting, especially that a statistically significant difference in the median diagnostic interval across RIO categories was only found for stage I CRCs. Perhaps this highlights an opportunity to study the diagnostic process in CSDs contained in the second most rural RIO category (56-75), where the screening proportion was similar to less rural categories, yet the median diagnostic interval was shorter. With Canada’s ageing population it is estimated that by 2028-2032 the average annual number of new CRC cases in Canada will increase by 78% (86.6% for males and 69.4% for females) as compared to 2003-2007 (1). Further, Singh and colleagues (15) have documented an increasing trend in the CRC median diagnostic interval in Manitoba (2004-2009). Considering these two points, it will be important to achieve the most efficient CRC diagnosis pathway within the limits of the healthcare system.

The proportion of CRC patients who were diagnosed through the ED (19.5%) was concerning, as were the proportional differences in ED presentation across rurality, with the greatest proportion of ED cases in the most rural RIO category (26.4%). Emergency cases often represent a more complicated group of CRCs, usually presenting at a later stage. From a policy perspective this represents an area for improvement and in the long term would be aided by increased uptake of CRC screening, especially in the most rural areas where the proportion detected asymptomatically was the lowest. CRC screening of average-risk individuals has been shown to be cost-effective in Canada (199). It will be important for each province to monitor CRC screening uptake and to communicate the most effective strategies with other provinces, as there are interprovincial variations in the program designs. The need for additional resources (especially endoscopy capacity) with the roll-out of province-wide screening programs must also be taken into consideration. Several ideas have been proposed in the literature to address poor access to cancer care (screening, diagnostics and treatment) in more rural areas. These include: creating satellite clinics closer to patients who travel the longest distances (115), outreach services including cancer screening and diagnostics (200), mobile cancer screening staff (115) and specialty oncology nurses (200).

While it was reassuring that a statistically significant difference was not detected in the CRC stage distribution across RIO categories, approximately 18% of CRC cases (2007-2012) were detected at stage IV. Stage IV CRC has a low survival rate as compared to stages I-III and this again would encourage screening to both reduce the incidence of CRC and to detect CRC at an earlier stage.

Our research has answered some primary questions about the diagnostic interval for CRC patients across Ontario, and in doing so, has naturally generated more questions for future study in this field. With the current dataset, questions about temporal trends in the diagnostic interval could be studied, similar to the work by Singh and colleagues in Manitoba (15,22). As well, questions pertaining to asymptomatic and symptomatic patients could be further explored with the algorithm that we revised.

A refinement of this study would be to analyze the diagnostic interval for all suspected cases of CRC, that is, both patients who were diagnosed with CRC and patients who had the diagnostic work-up for CRC but did not have CRC. This question has been studied by Grunfeld and colleagues (70) using data from patients at one Ontario hospital and would be of interest to expand to a province-wide population. There would be significant methodological challenges in identifying the study cohort using administrative data, since the researcher could not restrict to cases who had a CRC diagnosis. Despite the challenge, it may be a worthwhile endeavor as the psychological stress during the diagnostic process would apply to both patients with and without an ultimate CRC diagnosis. Further, it would be an interesting opportunity to look for any possible clues to differentiate the two groups.

Another future development to our study would be to identify all possible patient pathways to a CRC diagnosis, with corresponding diagnostic intervals in a population-based cohort. A study by Barrett and colleagues (69) examined the pathways for CRC patients from primary to secondary care in three UK cities. An Ontario equivalent with respective diagnostic intervals would be of interest to help further inform an optimal CRC diagnosis pathway and to create standard timeliness guidelines.

There are many contradictory results in studies that have examined the relationship between the diagnostic interval and CRC outcomes (e.g. mortality). Future research on this association is warranted taking into consideration the points brought forth by Tørring and colleagues (8,9,87), such as confounding by indication, and by Singh and colleagues (15,22) such as the need to establish a diagnostic interval threshold for wait times to become clinically significant. These ideas should be taken into consideration as the importance of the diagnostic interval is often minimized when a study finds that there is no statistically significant difference in the outcomes measured for patients with or without untimely diagnostic intervals.

Finally, as highlighted in the Literature Review Chapter, there are multiple definitions used in Canada to define rurality. A study that directly compared access to health care indicators using multiple definitions of rurality would provide insight on variability and possibly identify a definition that is the most appropriate.

5.6 Conclusions

In Ontario, there is a very similar CRC stage distribution across degrees of rurality. Patients with stage I CRC have a longer diagnostic interval than patients with stage IV CRC. Rurality may only have an effect on the CRC diagnostic interval for stage I CRC patients, although a deeper analysis is needed to better understand patient characteristics and access to cancer diagnostic services in rural Ontario. Future studies are also required to assess the impact of the CRC diagnostic interval on clinical outcomes including psychological health.

References

1. Canadian Cancer Society’s Advisory Committee on Cancer Statistics. Canadian Cancer Statistics 2015. Toronto, ON: Canadian Cancer Society; 2015.

2. Compton CC, Greene FL. The staging of colorectal cancer: 2004 and beyond. CA: a cancer journal for clinicians [Internet]. 2004;54(6):295–308. Available from: http://www.ncbi.nlm.nih.gov/pubmed/15537574

3. American Cancer Society. Survival rates for colon cancer/rectal cancer, by stage [Internet]. 2015 [cited 2015 Sep 3]. Available from: http://www.cancer.org/cancer/colonandrectumcancer/detailedguide/colorectal-cancer-survival-rates

4. Cancer Care Ontario. Colorectal Cancer Diagnosis Pathway version 2013.5 [Internet]. Cancer Care Ontario; Available from: https://www.cancercare.on.ca/ocs/qpi/dispathmgmt/pathways/colopath/

5. Mitchell E, Macdonald S, Campbell NC, Weller D, Macleod U. Influences on pre-hospital delay in the diagnosis of colorectal cancer: a systematic review. British Journal of Cancer [Internet]. 2008 Jan 15 [cited 2013 May 29];98(1):60–70. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2359711&tool=pmcentrez&rendertype=abstract

6. Neal RD. Do diagnostic delays in cancer matter? British Journal of Cancer [Internet]. 2009 Dec 3 [cited 2013 Dec 22];101 Suppl (2009):S9–S12. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2790696&tool=pmcentrez&rendertype=abstract

7. Korsgaard M, Pedersen L, Sørensen HT, Laurberg S. Delay of treatment is associated with advanced stage of rectal cancer but not of colon cancer. Cancer Detection and Prevention [Internet]. 2006 Jan [cited 2015 Jul 18];30(4):341–6. Available from: http://www.ncbi.nlm.nih.gov/pubmed/16965875

8. Tørring ML, Frydenberg M, Hansen RP, Olesen F, Hamilton W, Vedsted P. Time to diagnosis and mortality in colorectal cancer: a cohort study in primary care. British Journal of Cancer [Internet]. 2011 Mar 15 [cited 2013 Jun 7];104(6):934–40. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3065288&tool=pmcentrez&rendertype=abstract

9. Tørring ML, Frydenberg M, Hamilton W, Hansen RP, Lautrup MD, Vedsted P. Diagnostic interval and mortality in colorectal cancer: U-shaped association demonstrated for three different datasets. Journal of Clinical Epidemiology [Internet]. Elsevier Inc; 2012 Jun [cited 2013 Jun 6];65(6):669–78. Available from: http://www.ncbi.nlm.nih.gov/pubmed/22459430

10. Viiala CH, Tang KW, Lawrance IC, Murray K, Olynyk JK. Waiting times for colonoscopy and colorectal cancer diagnosis. Medical Journal of Australia. 2007;186(6):282–5.

11. Wattacheril J, Kramer JR, Richardson P, Havemann BD, Green LK, Le A, et al. Lagtimes in diagnosis and treatment of colorectal cancer: determinants and association with cancer stage and survival. Alimentary Pharmacology & Therapeutics [Internet]. 2008 Nov 1 [cited 2015 Jul 16];28(9):1166–74. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2596579&tool=pmcentrez&rendertype=abstract

12. Gonzalez-Hermoso F, Perez-Palma J, Marchena-Gomez J, Lorenzo-Rocha N, Medina-Arana V. Can Early Diagnosis of Symptomatic Colorectal Cancer Improve the Prognosis? World Journal of Surgery. 2004;28(7):716–20.

13. Rupassara KS, Ponnusamy S, Withanage N, Milewski PJ. A paradox explained? Patients with delayed diagnosis of symptomatic colorectal cancer have good prognosis. Colorectal Disease [Internet]. 2006 Jun [cited 2015 Jul 16];8(5):423–9. Available from: http://www.ncbi.nlm.nih.gov/pubmed/16684087

14. Terhaar sive Droste JS, Oort FA, van der Hulst RWM, Coupé VMH, Craanen ME, Meijer GA, et al. Does delay in diagnosing colorectal cancer in symptomatic patients affect tumor stage and survival? A population-based observational study. BMC Cancer [Internet]. 2010 Jan;10:332. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2907342&tool=pmcentrez&rendertype=abstract

15. Singh H, Shu E, Demers A, Bernstein CN, Griffith J, Fradette K. Trends in time to diagnosis of colon cancer and impact on clinical outcomes. Canadian Journal of Gastroenterology. 2012;26(12):877–80.

16. Cerdán-Santacruz C, Cano-Valderrama O, Cárdenas-Crespo S, Torres-García AJ, Cerdán-Miguel J. Colorectal cancer and its delayed diagnosis: have we improved in the past 25 years? Revista Española de Enfermedades Digestivas [Internet]. 2011 Sep;103(9):458–63. Available from: http://www.ncbi.nlm.nih.gov/pubmed/21951114

17. Roncoroni L, Pietra N, Violi V, Sarli L, Choua O, Peracchia A. Delay in the diagnosis and outcome of colorectal cancer: a prospective study. European Journal of Surgical Oncology [Internet]. 1999 Apr;25(2):173–8. Available from: http://www.ncbi.nlm.nih.gov/pubmed/10218461

18. Singh H, Khan R, Giardina TD, Paul LW, Daci K, Gould M, et al. Postreferral colonoscopy delays in diagnosis of colorectal cancer: a mixed-methods analysis. Quality Management in Health Care [Internet]. 2012 [cited 2013 Jul 14];21(4):252–61. Available from: http://www.ncbi.nlm.nih.gov/pubmed/23011072

19. Canadian Institute for Health Information. How Healthy Are Rural Canadians? An Assessment of Their Health Status and Health Determinants. Ottawa, ON: Canadian Institute for Health Information; 2006.

20. Statistics Canada. 2006 Census: portrait of the Canadian population in 2006, by age and sex: subprovincial population dynamics [Internet]. 2009 [cited 2015 Jul 12]. Available from: http://www12.statcan.ca/census-recensement/2006/as-sa/97-551/p17-eng.cfm

21. Ahmed S, Shahid RK. Disparity in cancer care: a Canadian perspective. Current Oncology [Internet]. 2012 Dec;19(6):e376–82. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3503668&tool=pmcentrez&rendertype=abstract

22. Singh H, De Coster C, Shu E, Fradette K, Latosinsky S, Pitz M, et al. Wait times from presentation to treatment for colorectal cancer: A population-based study. Canadian Journal of Gastroenterology. 2010;24(1):33–9.

23. El-Serag HB, Petersen L, Hampel H, Richardson P, Cooper G. The use of screening colonoscopy for patients cared for by the Department of Veterans Affairs. Archives of Internal Medicine. 2006;166:2202–8.

24. Cancer Quality Council of Ontario. Population-based distribution of cancer stage, colorectal cancer, patients diagnosed from 2009 to 2011, Ontario [Internet]. 2014 [cited 2014 Oct 5]. Available from: http://www.csqi.on.ca/cms/one.aspx?portalId=289784&pageId=296178

25. Canadian Partnership Against Cancer. Colorectal Cancer Staging and Survival. Canadian Partnership Against Cancer; 2010.

26. Canadian Cancer Society’s Steering Committee on Cancer Statistics. Canadian Cancer Statistics 2011. Toronto, ON: Canadian Cancer Society; 2011.

27. Center M, Jemal A, Smith RA, Ward E. Worldwide Variations in Colorectal Cancer. CA: a cancer journal for clinicians. 2009;59:366–78.

28. Koyama Y, Kotake K. Overview of Colorectal Cancer in Japan. Diseases of the Colon & Rectum. 1997;40(10):S2–S9.

29. Marchand L Le. Combined Influence of Genetic and Dietary Factors on Colorectal Cancer Incidence in Japanese Americans. Journal of the National Cancer Institute Monographs. 1999;(26):101–5.

30. Cancer Care Ontario. Colorectal cancer subsite distribution differs by sex [Internet]. 2010 [cited 2014 Dec 6]. Available from: https://www.cancercare.on.ca/cms/one.aspx?portalId=1377&pageId=67772

31. American Cancer Society. What is colorectal cancer [Internet]. 2015 [cited 2015 Jul 22]. Available from: http://www.cancer.org/cancer/colonandrectumcancer/detailedguide/colorectal-cancer-what-is-colorectal-cancer

32. Stewart SL, Wike JM, Kato I, Lewis DR, Michaud F. A population-based study of colorectal cancer histology in the United States, 1998-2001. Cancer [Internet]. 2006 Sep 1 [cited 2014 Oct 30];107(5 Suppl):1128–41. Available from: http://www.ncbi.nlm.nih.gov/pubmed/16802325

33. Boyer K, Ford M, Judkins A, Levin B. Primary Care Oncology. Saunders Company; 1999. p. 142–55.

34. Kitisin K, Mishra L. Molecular biology of colorectal cancer: new targets. Seminars in Oncology [Internet]. 2006 Dec [cited 2015 Jul 26];33(6 Suppl 11):S14–23. Available from: http://www.ncbi.nlm.nih.gov/pubmed/17178280

35. Giacosa A, Frascio F, Munizzi F. Epidemiology of colorectal polyps. Techniques in Coloproctology [Internet]. 2004 Dec [cited 2015 Jul 27];8 Suppl 2(2004):s243–7. Available from: http://www.ncbi.nlm.nih.gov/pubmed/15666099

36. Neugut A, Jacobson J, Rella V. Prevalence and incidence of colorectal adenomas and cancer in asymptomatic persons. Gastrointestinal Endoscopy Clinics of North America. 1997;7(3):387–99.

37. Frank SA. Dynamics of Cancer: Incidence, Inheritance, and Evolution. Orr A, editor. Chapter 3: Princeton University Press; 2007.

38. Haggar FA, Boushey RP. Colorectal Cancer Epidemiology: Incidence, Mortality, Survival, and Risk Factors. Clinics in Colon and Rectal Surgery. 2009;22(4):191–7.

39. Kozuka S, Nogaki M, Ozeki T, Masumori S. Premalignancy of the mucosal polyp in the large intestine. Diseases of the Colon & Rectum [Internet]. 1975 Sep;18(6):494–500. Available from: http://content.wkhealth.com/linkback/openurl?sid=WKPTLP:landingpage&an=00003453-197518060-00011

40. Davies RJ, Miller R, Coleman N. Colorectal cancer screening: prospects for molecular stool analysis. Nature Reviews Cancer [Internet]. 2005 Mar [cited 2013 May 24];5(3):199–209. Available from: http://www.nature.com/doifinder/10.1038/nrc1569

41. Choi SJ, Kim H, Ahn S, Jeong YM, Choi H. Evaluation of the growth pattern of carcinoma of colon and rectum by MDCT. Acta Radiologica. 2013;54(5):487–92.

42. Mackillop W, Short S, Feldman-Stewart D, King W, Brouwers M, Brundage M, et al. Review on Tumour Doubling Time, Report No 3, Appendix 2. 2006.

43. Fong BY, Cohen AM, Fortner JG, Enker WE, Turnbull AD, Colt DG, et al. Liver Resection for Colorectal Metastases. Journal of Clinical Oncology. 1997;15(3):938–46.

44. Perera F, Dingle B. Colorectal Carcinoma [Internet]. Available from: http://www.schulich.uwo.ca/oncology/docs/Colorectal Carcinoma Student Handbook 2007.pdf

45. Watson A, Collins P. Colon cancer a civilization disorder. Digestive Diseases. 2011;29(2):222–8.

46. Canadian Cancer Society. Risk factors for colorectal cancer [Internet]. [cited 2015 Jul 27]. Available from: http://www.cancer.ca/en/cancer-information/cancer-type/colorectal/risks/?region=on

47. Mayo Clinic. Colon Cancer Risk Factors [Internet]. 2013 [cited 2015 Jul 27]. Available from: http://www.mayoclinic.org/diseases-conditions/colon-cancer/basics/risk-factors/con-20031877

48. American Cancer Society. What are the risk factors for colorectal cancer [Internet]. 2015 [cited 2015 Jul 27]. Available from: http://www.cancer.org/cancer/colonandrectumcancer/detailedguide/colorectal-cancer-risk-factors

49. Platz EA, Willett W, Colditz G, Rimm E, Spiegelman D, Giovannucci E. Proportion of colon cancer risk that might be preventable in a cohort of middle-aged US men. Cancer Causes and Control. 2000;11(7):579–88.

50. Hsing A, McLaughlin J, Chow W, Chuman LMS, Co Chien H, Gridley G, et al. Risk Factors for Colorectal Cancer in a Prospective Study Among U.S. White Men. International Journal of Cancer. 1998;77(4):549–53.

51. Majumdar SR, Fletcher RH, Evans AT. How Does Colorectal Cancer Present? Symptoms, Duration, and Clues to Location. The American Journal of Gastroenterology. 1999;94(10):3039–45.

52. Del Giudice ME, Vella ET, Hey A, Simunovic M, Harris W, Levitt C. Guideline for referral of Patients with suspected colorectal cancer by family physicians and other primary care providers. Canadian Family Physician. 2014;60:717–23.

53. Tomlinson C, Wong C, Au H-J, Schiller D. Factors associated with delays to medical assessment and diagnosis for patients with colorectal cancer. Canadian Family Physician. 2012;58:e495–501.

54. Wiljer D, Walton T, Gilbert J, Boucher A, Ellis PM, Schiff S, et al. Understanding the Needs of Colorectal Cancer Patients during the Pre-diagnosis Phase. Journal of Cancer Education [Internet]. 2013 May 21 [cited 2013 Jul 14]; Available from: http://www.ncbi.nlm.nih.gov/pubmed/23690171

55. Korsgaard M, Pedersen L, Sørensen HT, Laurberg S. Reported symptoms, diagnostic delay and stage of colorectal cancer: a population-based study in Denmark. Colorectal Disease [Internet]. 2006 Oct [cited 2013 Jul 14];8(8):688–95. Available from: http://www.ncbi.nlm.nih.gov/pubmed/16970580

56. Richman S, Adlard J. Left and right sided large bowel cancer have significant genetic differences in addition to well known clinical differences. British Medical Journal. 2002;324:931–2.

57. Cancer Care Ontario. ColonCancerCheck 2010 Program Report. Toronto, Canada: Cancer Care Ontario; 2010.

58. Hewitson P, Glasziou P, Watson E, Towler B, Irwig L. Cochrane systematic review of colorectal cancer screening using the fecal occult blood test (hemoccult): an update. The American Journal of Gastroenterology [Internet]. 2008 Jun [cited 2013 Apr 8];103(6):1541–9. Available from: http://www.ncbi.nlm.nih.gov/pubmed/18479499

59. Leddin D, Hunt R, Champion M, Cockeram A, Flook N, Gould M, et al. Canadian Association of Gastroenterology and the Canadian Digestive Health Foundation: Guidelines on colon cancer screening. Canadian Journal of Gastroenterology [Internet]. 2004 Feb;18(2):93–9. Available from: http://www.ncbi.nlm.nih.gov/pubmed/15085796

60. Mohammad A. Barriers to Timely Screening Colonoscopy: The Role of Health Insurance. Public Health, University of Connecticut; 2008.

61. American Joint Committee on Cancer. AJCC Cancer Staging Manual. 6th ed. Greene F, Page DL, Fleming I, Fritz A, Balch C, Haller D, et al., editors. Springer; 2002.

62. Ballinger AB, Anggiansah C. Colorectal cancer. British Medical Journal (Clinical research ed.) [Internet]. 2007 Oct 6 [cited 2015 Jun 25];335(7622):715–8. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2001051&tool=pmcentrez&rendertype=abstract

63. Wu X, Chen V, Steele B, Ruiz B, Fulton J, Liu L, et al. Subsite-specific incidence rate and stage of disease in colorectal cancer by race, gender, and age group in the United States, 1992-1997. Cancer [Internet]. 2001 Nov 15;92(10):2547–54. Available from: http://www.ncbi.nlm.nih.gov/pubmed/11745188

64. Ponz de Leon M. Trend of incidence, subsite distribution and staging of colorectal neoplasms in the 15-year experience of a specialised cancer registry. Annals of Oncology [Internet]. 2004 Jun 1 [cited 2015 May 30];15(6):940–6. Available from: http://annonc.oupjournals.org/cgi/doi/10.1093/annonc/mdh224

65. Canadian Cancer Society. Signs and symptoms of colorectal cancer [Internet]. 2015. [cited 2015 Jul 28]. Available from: http://www.cancer.ca/en/cancer-information/cancer-type/colorectal/signs-and-symptoms/?region=bc

66. Weller D, Vedsted P, Rubin G, Walter FM, Emery J, Scott S, et al. The Aarhus statement: improving design and reporting of studies on early cancer diagnosis. British Journal of Cancer [Internet]. Nature Publishing Group; 2012 Mar 27 [cited 2013 Sep 26];106:1262–7. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3314787&tool=pmcentrez&rendertype=abstract

67. Singh H, Petersen LA, Daci K, Collins C, Khan M, El-Serag HB. Reducing referral delays in colorectal cancer diagnosis: is it about how you ask? Quality & Safety in Health Care [Internet]. 2010 Oct [cited 2013 Jul 14];19(5):e27. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2965264&tool=pmcentrez&rendertype=abstract

68. Olesen F, Hansen RP, Vedsted P. Delay in diagnosis: the experience in Denmark. British Journal of Cancer [Internet]. Nature Publishing Group; 2009 Dec 3 [cited 2015 Jul 15];101 Suppl(S2):S5–8. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2790711&tool=pmcentrez&rendertype=abstract

69. Barrett J, Jiwa M, Rose P, Hamilton W. Pathways to the diagnosis of colorectal cancer: an observational study in three UK cities. Family Practice [Internet]. 2006 Feb [cited 2015 Jul 15];23(1):15–9. Available from: http://www.ncbi.nlm.nih.gov/pubmed/16286462

70. Grunfeld E, Watters JM, Urquhart R, O’Rourke K, Jaffey J, Maziak DE, et al. A prospective study of peri-diagnostic and surgical wait times for patients with presumptive colorectal, lung, or prostate cancer. British Journal of Cancer [Internet]. 2009 Jan 13 [cited 2013 Jul 14];100(1):56–62. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2634695&tool=pmcentrez&rendertype=abstract

71. Hansen RP, Vedsted P, Sokolowski I, Søndergaard J, Olesen F. Time intervals from first symptom to treatment of cancer: a cohort study of 2,212 newly diagnosed cancer patients. BMC Health Services Research [Internet]. BioMed Central Ltd; 2011 Jan [cited 2015 Jul 16];11(1):284. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3217887&tool=pmcentrez&rendertype=abstract

72. Korsgaard M, Pedersen L, Laurberg S. Delay of diagnosis and treatment of colorectal cancer--a population-based Danish study. Cancer Detection and Prevention [Internet]. 2008 Jan [cited 2015 Jul 15];32(1):45–51. Available from: http://www.ncbi.nlm.nih.gov/pubmed/18406067

73. Van Hout AMGH, de Wit NJ, Rutten FH, Peeters PHM. Determinants of patient’s and doctor's delay in diagnosis and treatment of colorectal cancer. European Journal of Gastroenterology & Hepatology [Internet]. 2011 Nov [cited 2013 Jul 14];23(11):1056–63. Available from: http://www.ncbi.nlm.nih.gov/pubmed/21941190

74. Robertson R, Campbell NC, Smith S, Donnan PT, Sullivan F, Duffy R, et al. Factors influencing time from presentation to treatment of colorectal and breast cancer in urban and rural areas. British Journal of Cancer [Internet]. 2004 Apr 19 [cited 2013 Mar 8];90(8):1479–85. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2409724&tool=pmcentrez&rendertype=abstract

75. Singh H, Daci K, Petersen LA, Collins C, Petersen NJ, Shethia A, et al. Missed opportunities to initiate endoscopic evaluation for colorectal cancer diagnosis. The American Journal of Gastroenterology [Internet]. 2009 Oct [cited 2015 Jun 28];104(10):2543–54. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2758321&tool=pmcentrez&rendertype=abstract

76. Potter MA, Wilson RG. Audit: Diagnostic delay in colorectal cancer. Journal of the Royal College of Surgeons of Edinburgh. 1999;44(5):313–6.

77. Wahls TL, Peleg I. Patient- and system-related barriers for the earlier diagnosis of colorectal cancer. BMC Family Practice [Internet]. 2009 Jan [cited 2013 Jun 6];10:65. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2758830&tool=pmcentrez&rendertype=abstract

78. Allgar VL, Neal RD. Delays in the diagnosis of six cancers: analysis of data from the National Survey of NHS Patients: Cancer. British Journal of Cancer [Internet]. 2005 Jun 6 [cited 2014 Jan 10];92(11):1959–70. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2361797&tool=pmcentrez&rendertype=abstract

79. Macdonald S, Macleod U, Campbell NC, Weller D, Mitchell E. Systematic review of factors influencing patient and practitioner delay in diagnosis of upper gastrointestinal cancer. British Journal of Cancer [Internet]. 2006 May 8 [cited 2015 Jul 21];94(9):1272–80. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2361411&tool=pmcentrez&rendertype=abstract

80. Mitchell E, Sullivan F. A descriptive feast but an evaluative famine: systematic review of published articles on primary care computing during 1980-97. British Medical Journal. 2001;322(7281):279–82.

81. Coleman MP, Forman D, Bryant H, Butler J, Rachet B, Maringe C, et al. Cancer survival in Australia, Canada, Denmark, Norway, Sweden, and the UK, 1995-2007 (the International Cancer Benchmarking Partnership): an analysis of population-based cancer registry data. Lancet [Internet]. Elsevier Ltd; 2011 Jan 8 [cited 2014 Jan 29];377:127–38. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3018568&tool=pmcentrez&rendertype=abstract

82. Rose PW, Rubin G, Perera-Salazar R, Almberg SS, Barisic A, Dawes M, et al. Explaining variation in cancer survival between 11 jurisdictions in the International Cancer Benchmarking Partnership: a primary care vignette survey. British Medical Journal open [Internet]. 2015 Jan [cited 2015 May 31];5(5):e007212. Available from: http://www.ncbi.nlm.nih.gov/pubmed/26017370

83. National Cancer Intelligence Network. Routes to Diagnosis: NCIN Data Briefing. National Cancer Intelligence Network; 2010.

84. Crawford SC, Davis JA, Siddiqui NA, de Caestecker L, Gillis CR, Hole D, et al. The waiting time paradox: population based retrospective study of treatment delay and survival of women with endometrial cancer in Scotland. British Medical Journal (Clinical research ed.) [Internet]. 2002 Jul 27;325:196. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=117451&tool=pmcentrez&rendertype=abstract

85. Neal RD, Allgar VL, Ali N, Leese B, Heywood P, Proctor G, et al. Stage, survival and delays in lung, colorectal, prostate and ovarian cancer: comparison between diagnostic routes. The British Journal of General Practice [Internet]. 2007 Mar;57(536):212–9. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2042569&tool=pmcentrez&rendertype=abstract

86. Neal RD, Tharmanathan P, France B, Din NU, Cotton S, Fallon-Ferguson J, et al. Is increased time to diagnosis and treatment in symptomatic cancer associated with poorer outcomes? Systematic review. British Journal of Cancer [Internet]. 2015 Mar 31 [cited 2015 Jul 24];112 Suppl S92–107. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=4385982&tool=pmcentrez&rendertype=abstract

87. Tørring ML, Frydenberg M, Hansen RP, Olesen F, Vedsted P. Evidence of increasing mortality with longer diagnostic intervals for five common cancers: a cohort study in primary care. European Journal of Cancer (Oxford, England: 1990) [Internet]. 2013 Jun [cited 2015 Jul 24];49(9):2187–98. Available from: http://www.ncbi.nlm.nih.gov/pubmed/23453935

88. Pruitt SL, Harzke AJ, Davidson NO, Schootman M. Do diagnostic and treatment delays for colorectal cancer increase risk of death? Cancer Causes & Control [Internet]. 2013 May [cited 2015 Jul 25];24(5):961–77. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3708300&tool=pmcentrez&rendertype=abstract

89. Paterson W, Depew W, Paré P, Petrunia D, Switzer C, Veldhuyzen van Zanten S, et al. Canadian consensus on medically acceptable wait times for digestive health care. Canadian Journal of Gastroenterology. 2006;20(6):411–23.

90. Cancer Care Ontario. Wait Times for Colonoscopy after a Positive Fecal Occult Blood Test [Internet]. 2009 [cited 2015 Jul 21]. Available from: https://www.cancercare.on.ca/cms/one.aspx?pageID=41101

91. National Institute for Health and Care Excellence (NICE). Suspected cancer: recognition and referral. National Institute for Health and Care Excellence; 2015.

92. Berrino F, De Angelis R, Sant M, Rosso S, Bielska-Lasota M, Lasota MB, et al. Survival for eight major cancers and all cancers combined for European adults diagnosed in 1995-99: results of the EUROCARE-4 study. The Lancet Oncology [Internet]. 2007 Sep [cited 2013 Jun 30];8(9):773–83. Available from: http://www.ncbi.nlm.nih.gov/pubmed/17714991

93. Richards M. The size of the prize for earlier diagnosis of cancer in England. British Journal of Cancer [Internet]. Nature Publishing Group; 2009 Dec 3 [cited 2013 May 23];101 Suppl(S2):S125–9. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2790715&tool=pmcentrez&rendertype=abstract

94. Zapka J, Taplin SH, Price RA, Cranos C, Yabroff R. Factors in Quality Care — The Case of Follow-Up to Abnormal Cancer Screening Tests — Problems in the Steps and Interfaces of Care. Journal of the National Cancer Institute Monographs. 2010;40:58–71.

95. Wagner EH. Chronic disease management: What will it take to improve care for chronic illness? Effective Clinical Practice. 1998;1:2–4.

96. Groome P. Understanding diagnostic episodes of care in patients with early versus late cancers. Operating Grant for the Canadian Institute of Health Research; 2011.

97. Minore B, Hill ME, Pugliese I, Gauld T. Rurality literature review: Prepared for the North West local Health Integration Network. Centre for Rural and Northern Health Research, Lakehead University, Ontario; 2008.

98. Chartrand D, Vani R, Hillary N, Burroughs R, Clemenson H, Mogan A, et al. Rural and Small Town Canada Analysis Bulletin. Statistics Canada. 2001;3(3).

99. Kralj B. Measuring Rurality - RIO2008_BASIC: Methodology and Results. Ontario Medical Association Economics Department; 2009.

100. Kralj B. Measuring “rurality” for purposes of health-care planning: an empirical measure for Ontario. Ontario Medical Review. 2000;October:33–52.

101. Jiang L, Gilbert J, Langley H, Moineddin R, Groome PA. Effect of specialized diagnostic assessment units on the time to diagnosis in screen-detected breast cancer patients. British Journal of Cancer [Internet]. Nature Publishing Group; 2015 May 26 [cited 2015 May 28];112(11):1744–50. Available from: http://www.ncbi.nlm.nih.gov/pubmed/25942395

102. Jaglal SB, Munce SEP, Guilcher SJ, Couris CM, Fung K, Craven BC, et al. Health system factors associated with rehospitalizations after traumatic spinal cord injury: a population-based study. Spinal cord [Internet]. Nature Publishing Group; 2009 Aug [cited 2013 Sep 23];47:604–9. Available from: http://www.ncbi.nlm.nih.gov/pubmed/19274059

103. Glazier R, Klein-Geltink J, Kopp A, Sibley L. Capitation and enhanced fee-for-service models for primary care reform: a population-based evaluation. Canadian Medical Association Journal. 2009;180(11):E72–E81.

104. Guagliardo MF. Spatial accessibility of primary care: concepts, methods and challenges. International Journal of Health Geographics [Internet]. 2004 Feb 26;3(1):3–15. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=394340&tool=pmcentrez&rendertype=abstract

105. Azzopardi J, DeWitt DE. Quality and safety issues in procedural rural practice: a prospective evaluation of current quality and safety guidelines in 3000 colonoscopies. Rural and Remote Health [Internet]. 2012 Jan;12:1949. Available from: http://www.ncbi.nlm.nih.gov/pubmed/22985075

106. Baldwin L-M, Cai Y, Larson EH, Dobie SA, Wright GE, Goodman DC, et al. Access to cancer services for rural colorectal cancer patients. The Journal of Rural Health [Internet]. 2008 Jan;24(4):390–9. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3103393&tool=pmcentrez&rendertype=abstract

107. Jiwa M, Halkett G, Aoun S, Arnet H, Smith M, Pilkington M, et al. Factors influencing the speed of cancer diagnosis in rural Western Australia: a General Practice perspective. BMC Family Practice [Internet]. 2007 Jan [cited 2013 Jul 14];8(27). Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1868737&tool=pmcentrez&rendertype=abstract

108. Bain NSC, Campbell NC, Ritchie LD, Cassidy J. Striking the right balance in colorectal cancer care--a qualitative study of rural and urban patients. Family Practice [Internet]. 2002 Aug;19(4):369–74. Available from: http://www.ncbi.nlm.nih.gov/pubmed/12110557

109. Launoy G, Le Coutour X, Gignoux M, Pottier D, Dugleux G. Influence of rural environment on diagnosis, treatment and prognosis of colorectal cancer. Journal of Epidemiology and Community Health. 1992;46(4):365–7.

110. Campbell NC, Elliott AM, Sharp L, Ritchie LD, Cassidy J, Little J. Rural and urban differences in stage at diagnosis of colorectal and lung cancers. British Journal of Cancer [Internet]. 2001 Apr 6;84(7):910–4. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2363829&tool=pmcentrez&rendertype=abstract

111. Liff JM, Chow W, Greenberg RS. Rural-Urban differences in Stage at Diagnosis: possible relationships to cancer screening. Cancer. 1991;67(5):1454–9.

112. Elliott TE, Elliott BA, Renier CM, Haller I V. Rural urban differences in Cancer Care Results from the Lake Superior Rural Cancer Care Project. Minnesota Medicine. 2004;87(9):44–50.

113. Monroe AC, Ricketts TC, Savitz LA. Cancer in Rural versus Urban Populations: A Review. Journal of Rural Health. 1992;8(3).

114. Higginbotham JC, Medicine R, Moulder J, Central M, Registry C, Currier M, et al. Rural v. Urban Aspects of Cancer: First-Year Data from the Mississippi. Family & Community Health. 2001;24(2):1–9.

115. Massarweh NN, Chiang Y-J, Xing Y, Chang GJ, Haynes AB, You YN, et al. Association Between Travel Distance and Metastatic Disease at Diagnosis Among Patients With Colon Cancer. Journal of Clinical Oncology [Internet]. 2014 Feb 10 [cited 2015 Jun 26];32(9):942–8. Available from: http://jco.ascopubs.org/cgi/doi/10.1200/JCO.2013.52.3845

116. Parsons MA, Askland KD. Cancer of the colorectum in Maine, 1995-1998: determinants of stage at diagnosis in a rural state. The Journal of Rural Health [Internet]. 2007 Jan;23(1):25–32. Available from: http://www.ncbi.nlm.nih.gov/pubmed/17300475

117. Fazio L, Cotterchio M, Manno M, McLaughlin J, Gallinger S. Association between colonic screening, subject characteristics, and stage of colorectal cancer. The American Journal of Gastroenterology [Internet]. 2005 Nov [cited 2013 Jul 14];100(11):2531–9. Available from: http://www.ncbi.nlm.nih.gov/pubmed/16279911

118. Casey MM, Thiede Call K, Klingner JM. Are rural residents less likely to obtain recommended preventive healthcare services? American Journal of Preventive Medicine [Internet]. 2001 Oct;21(3):182–8. Available from: http://www.ncbi.nlm.nih.gov/pubmed/11567838

119. Greiner KA, Engelman KK, Hall MA, Ellerbeck EF. Barriers to colorectal cancer screening in rural primary care. Preventive Medicine [Internet]. 2003 Mar [cited 2013 Jul 14];38(2004):269–75. Available from: http://www.ncbi.nlm.nih.gov/pubmed/14766108

120. Cotterill M, Gasparelli R, Kirby E. Colorectal cancer detection in a rural community: Development of a colonoscopy screening program. Canadian Family Physician. 2005;51:1224–8.

121. Rural and Northern Health Care Panel. Rural and Northern Health Care Report Executive Summary [Internet]. Available from: http://www.health.gov.on.ca/en/public/programs/ruralnorthern/report.aspx

122. Ontario Ministry of Health and Long Term Care. Ontario Wait Times [Internet]. 2014 [cited 2015 Sep 8]. Available from: http://www.health.gov.on.ca/en/public/programs/waittimes/edrs/faq.aspx#8

123. McDonald S. Ontario’s Aging Population: Challenges & Opportunities. Ontario Trillium Foundation; 2011.

124. ICES Data Dictionary. Ontario Cancer Registry [Internet]. 2015 [cited 2015 Aug 17]. Available from: https://ssl.ices.on.ca/dataprog/Data Holdings/Acquired Cohorts or Registries/OCR/,DanaInfo=.aioulhjFpkn2K00Nrq+index.htm

125. ICES Data Dictionary. Ontario Health Insurance Plan Claims Database [Internet]. 2014 [cited 2015 Aug 17]. Available from: https://ssl.ices.on.ca/dataprog/Data Holdings/Health Services/ohip/,DanaInfo=.aioulhjFpkn2K00Nrq+index.htm

126. Ontario Ministry of Health and Long Term Care. Schedule of Benefits Physician Services Under the Health Insurance Act. Ontario Ministry of Health and Long Term Care; 2014.

127. ICES Data Dictionary. Registered Persons Data Base files [Internet]. 2015 [cited 2015 Aug 17]. Available from: https://ssl.ices.on.ca/dataprog/Data Holdings/Population and Demographics/RPDB/,DanaInfo=.aioulhjFpkn2K00Nrq+index.htm

128. ICES Data Dictionary. Discharge Abstract Database [Internet]. 2015 [cited 2015 Aug 17]. Available from: https://ssl.ices.on.ca/dataprog/Data Holdings/Health Services/dad/,DanaInfo=.aioulhjFpkn2K00Nrq+index.htm

129. Rabeneck L, Paszat LF, Rothwell DM, He J. Temporal trends in new diagnoses of colorectal cancer with obstruction, perforation, or emergency admission in Ontario: 1993-2001. The American Journal of Gastroenterology [Internet]. 2005 Mar [cited 2014 Oct 23];100(3):672–6. Available from: http://www.ncbi.nlm.nih.gov/pubmed/15743367

130. Juurlink D, Preyra C, Croxford R, Chong N, Austin P, Tu J, et al. Canadian Institute for Health Information Discharge Abstract Database: A Validation Study. Toronto, ON: Institute for Clinical Evaluative Sciences; 2006.

131. ICES Data Dictionary. National Ambulatory Care Reporting System [Internet]. 2015 [cited 2015 Aug 17]. Available from: https://ssl.ices.on.ca/dataprog/Data Holdings/Health Services/nacrs-sds/,DanaInfo=.aioulhjFpkn2K00Nrq+index.htm

132. ICES Data Dictionary. 2006 Census Area Profiles [Internet]. 2013 [cited 2014 Jun 17]. Available from: https://ssl.ices.on.ca/Applications/DataDictionary/,DanaInfo=.adbvdhni0qxxl3.Nxsv-S88Vzy,SSL+Library.aspx?Library=CENSUS

133. ICES Data Dictionary. ICES Physician Database [Internet]. 2014 [cited 2014 Dec 5]. Available from: https://ssl.ices.on.ca/Applications/DataDictionary/,DanaInfo=.adbvdhni0qxxl3.Nxsv-S88Vzy,SSL+Library.aspx?Library=IPDB

134. ICES Macros and Scripts. Macros [Internet]. [cited 2015 Sep 8]. Available from: https://ssl.ices.on.ca/dataprog/,DanaInfo=.aioulhjFpkn2K00Nrq+index.html

135. Ontario Ministry of Health and Long Term Care. Northern Health Programs [Internet]. 2013. [cited 2014 Jun 17]. Available from: http://www.health.gov.on.ca/en/pro/programs/northernhealth/nrrr.aspx

136. ICES wikipedia of definitions. Urban and Rural [Internet]. 2010. [cited 2014 Jun 17]. Available from: https://ssl.ices.on.ca/dataprog/Data Definitions/wikipedia/,DanaInfo=.aioulhjFpkn2K00Nrq+urban.htm

137. Hall SA, Kaufman JS, Ricketts TC. Defining urban and rural areas in U.S. epidemiologic studies. Journal of Urban Health: Bulletin of the New York Academy of Medicine [Internet]. 2006 Mar [cited 2013 Nov 30];83(2):162–75. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2527174&tool=pmcentrez&rendertype=abstract

138. Redaniel MT, Martin RM, Ridd MJ, Wade J, Jeffreys M. Diagnostic intervals and its association with breast, prostate, lung and colorectal cancer survival in England: historical cohort study using the Clinical Practice Research Datalink. PloS one [Internet]. 2015 Jan [cited 2015 Aug 28];10(5):e0126608. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=4416709&tool=pmcentrez&rendertype=abstract

139. Porter GA, Inglis KM, Wood LA, Veugelers PJ. Access to care and satisfaction in colorectal cancer patients. World Journal of Surgery [Internet]. 2005 Nov [cited 2015 Aug 16];29(11):1444–51. Available from: http://www.ncbi.nlm.nih.gov/pubmed/16240060

140. Boyle P, Langman JS. ABC of colorectal cancer Epidemiology. British Medical Journal. 2000;321:805–8.

141. Cancer Quality Council of Ontario. Reporting of Cancer Stage at diagnosis: prepared by Cancer Care Ontario, Informatics Centre for Excellence [Internet]. 2014 Feb [cited 2013 Sept 8]. Figure 2, Population-based stage capture rate by disease site, cancer cases diagnosed, Ontario, 2009-2011. Available from: http://www.csqi.on.ca/cms/one.aspx?portalId=289784&pageId=296178

142. AJCC. AJCC Cancer staging manual Vol.649. Byrd DR, Compton CC, Fritz AG, Greene FL, Trotti A, editors. New York: Springer; 2010.

143. Union for International Cancer Control (UICC). TNM Classification of Malignant Tumours. 7th ed. Sobin LH, Gospodarowicz MK, Wittekind C, editors. Wiley-Blackwell; 2009.

144. Canadian Cancer Society. Diagnosing colorectal cancer [Internet]. [cited 2015 Sep 8]. Available from: http://www.cancer.ca/en/cancer-information/cancer-type/colorectal/diagnosis/?region=on#biopsy

145. Canadian Institute for Health Information. Canadian classification of Health Interventions Volume Three. Canadian Institute for Health Information; 2012.

146. Groome P, Whitehead M, Grunfeld E, Moineddin R. Initial peri-diagnostic encounter leading to a cancer diagnosis: development of an administration data base approach to its identification (Ca-PRI). 2012.

147. Shewhart W. Economic control of quality of manufactured product. New York: Van Nostrand; 1931.

148. American Society for Quality. Control Charts [Internet]. [cited 2014 Dec 5]. Available from: http://asq.org/learn-about-quality/data-collection-analysis-tools/overview/control-chart.html

149. Mohammed MA, Worthington P, Woodall WH. Plotting basic control charts: tutorial notes for healthcare practitioners. Quality & safety in health care [Internet]. 2008 Apr [cited 2015 Sep 7];17(2):137–45. Available from: http://www.ncbi.nlm.nih.gov/pubmed/18385409

150. Woodall WH. The Use of Control Charts in Health-care and Public-Health Surveillance. Journal of Quality Technology. 2006;38(2):89–104.

151. Tague NR. Quality Toolbox. 2nd ed. Milwaukee, WI: American Society for Quality (ASQ) Press; 2005. p. 155–61.

152. Jiang L. Association between use of specialized diagnostic assessment units and the diagnostic interval in Ontario breast cancer patients. Graduate Program in Epidemiology, Queen’s University; 2013.

153. Honein-AbouHaidar GN, Baxter NN, Moineddin R, Urbach DR, Rabeneck L, Bierman AS. Trends and inequities in colorectal cancer screening participation in Ontario, Canada, 2005-2011. Cancer Epidemiology [Internet]. Elsevier Ltd; 2013 Dec [cited 2015 Jun 23];37(6):946–56. Available from: http://www.ncbi.nlm.nih.gov/pubmed/23702337

154. Hwang H. Emergency presentation of colorectal cancer at a regional hospital: An alarming trend? British Columbia Medical Journal. 2012;54(2):83–7.

155. Mitchell AD, Inglis KM, Murdoch JM, Porter GA. Emergency room presentation of colorectal cancer: a consecutive cohort study. Annals of Surgical Oncology [Internet]. 2007 Mar [cited 2014 Dec 6];14(3):1099–104. Available from: http://www.ncbi.nlm.nih.gov/pubmed/17211732

156. Canadian Cancer Society’s Steering Committee on Cancer Statistics. Canadian Cancer Statistics 2012. Toronto, ON: Canadian Cancer Society; 2012.

157. Neal RD, Allgar VL. Sociodemographic factors and delays in the diagnosis of six cancers: analysis of data from the “National Survey of NHS Patients: Cancer”. British Journal of Cancer [Internet]. 2005 Jun 6 [cited 2013 Jun 13];92:1971–5. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2361785&tool=pmcentrez&rendertype=abstract

158. Sankaranarayanan J, Watanabe-Galloway S, Sun J, Qiu F, Boilesen E, Thorson AG. Rurality and Other Determinants of Early Colorectal Cancer Diagnosis in Nebraska: A 6-year cancer registry study, 1998-2003. The Journal of Rural Health. 2009;25(4):358–65.

159. Singh SM, Paszat LF, Li C, He J, Vinden C, Rabeneck L. Association of socioeconomic status and receipt of colorectal cancer investigations: a population-based retrospective cohort study. Canadian Medical Association Journal. 2004;171(5):461–5.

160. Di Biase S, Bauder H. Immigrant Settlement in Ontario: Location and Local Labour Markets. Canadian Ethnic Studies. 2005;37(3).

161. Murphy C. Access to Colorectal Cancer Screening in Canada: Does Immigrant Status Matter? Graduate Institute of Health Policy, Management and Evaluation University of Toronto; 2010.

162. Faivre J, Bedenne L, Boutron MC, Milan C, Collonges R, Arveux P. Epidemiological evidence for distinguishing subsites of colorectal cancer. Journal of Epidemiology and Community Health. 1989;43(4):356–61.

163. Tajima K, Hirose K, Nakagawa N, Kuroishi T, Tominaga S. Urban-rural differences in the trend of colo-rectal cancer mortality with special reference to the subsites of colon cancer in Japan. Japanese Journal of Cancer Research (Gann). 1985;76(8):717–28.

164. Svendsen RP, Støvring H, Hansen BL, Kragstrup J, Søndergaard J, Jarbøl DE. Prevalence of cancer alarm symptoms: a population-based cross-sectional study. Scandinavian Journal of Primary Health Care [Internet]. 2010 Sep [cited 2014 Feb 7];28:132–7. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3442327&tool=pmcentrez&rendertype=abstract

165. Cheng L, Eng C, Nieman LZ, Kapadia AS, Du XL. Trends in colorectal cancer incidence by anatomic site and disease stage in the United States from 1976 to 2005. American Journal of Clinical Oncology [Internet]. 2011 Dec [cited 2014 Jan 10];34(6):573–80. Available from: http://www.ncbi.nlm.nih.gov/pubmed/21217399

166. Caldarella A, Crocetti E, Messerini L, Paci E. Trends in colorectal incidence by anatomic subsite from 1985 to 2005: a population-based study. International Journal of Colorectal Disease [Internet]. 2013 May [cited 2014 Jan 10];28:637–41. Available from: http://www.ncbi.nlm.nih.gov/pubmed/23478843

167. Jess P, Hansen IO, Gamborg M, Jess T. A nationwide Danish cohort study challenging the categorisation into right-sided and left-sided colon cancer. British Medical Journal open [Internet]. 2013 Jan [cited 2014 Feb 7];3(5). Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3657647&tool=pmcentrez&rendertype=abstract

168. Matheson FI, Dunn JR, Smith KL, Moineddin R, Glazier RH. Ontario Marginalization Index user guide version 1.0. Centre for Research on Inner City Health, St Michael’s Hospital, Toronto (ON).

169. Statistics Canada. Dissemination Area (DA) [Internet]. 2012 [cited 2015 Jun 30]. Available from: https://www12.statcan.gc.ca/census-recensement/2011/ref/dict/geo021-eng.cfm

170. Matheson FI, Moineddin R, Dunn JR, Creatore MI, Gozdyra P, Glazier RH. Urban neighborhoods, chronic stress, gender and depression. Social Science & Medicine [Internet]. 2006 Nov [cited 2014 Feb 7];63:2604–16. Available from: http://www.ncbi.nlm.nih.gov/pubmed/16920241

171. Dunn J, Matheson F, Smith K. Overview of the Ontario Marginalization index (ON-Marg) [Internet]. 2012. Available from: http://www.torontohealthprofiles.ca/onmarg/additionalResources/OverviewOfONMarg06july2012.pdf

172. Lofters AK, Gozdyra P, Lobb R. Using geographic methods to inform cancer screening interventions for South Asians in Ontario, Canada. BMC public health [Internet]. 2013 Jan;13(395). Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3640962&tool=pmcentrez&rendertype=abstract

173. Johns Hopkins University Bloomberg School of Public Health. The Johns Hopkins ACG System: Excerpt from Technical Reference Guide Version 10.0. Johns Hopkins University Bloomberg School of Public Health; 2011.

174. Austin PC, van Walraven C, Wodchis WP, Newman A, Anderson GM. Using the Johns Hopkins Aggregated Diagnosis Groups (ADGs) to predict mortality in a general adult population cohort in Ontario, Canada. Medical Care [Internet]. 2011 Oct;49(10):932–9. Available from: http://www.ncbi.nlm.nih.gov/pubmed/21478773

175. ICES Data Repository. Using Johns Hopkins ACG Software at ICES [Internet]. [cited 2015 Jul 3]. Available from: https://ssl.ices.on.ca/dataprog/Things You Should Know/Rules/,DanaInfo=.aioulhjFpkn2K00Nrq+using_johns_hopkins_acg_software.htm

176. Reid RJ, Roos NP, MacWilliam L, Frohlich N, Black C. Assessing population health care need using a claims-based ACG morbidity measure: a validation analysis in the Province of Manitoba. Health Services Research [Internet]. 2002 Oct;37(5):1345–64. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1464032&tool=pmcentrez&rendertype=abstract

177. Glazier RH, Agha MM, Moineddin R, Sibley LM. Universal Health Insurance and Equity in Primary Care and Specialist Office Visits: a population-based study. Annals of Family Medicine. 2009;7(5):396–405.

178. Cancer Care Ontario. Colorectal cancer subsite distribution differs by sex [Internet]. 2010 [cited 2014 Dec 6]. Available from: https://www.cancercare.on.ca/cms/one.aspx?portalId=1377&pageId=67772

179. Jaakkimainen L, Del Giudice L, Saskin R. Wikipedia of ICES definitions: colorectal cancer screening [Internet]. 2011 [cited 2014 Dec 6]. Available from: https://ssl.ices.on.ca/dataprog/Data Definitions/wikipedia/,DanaInfo=.aioulhjFpkn2K00Nrq+Colorectal Cancer Screening.htm

180. Percy C, Van Holten V, Muir C, editors. International Classification of Diseases for Oncology. 2nd ed. World Health Organization; 1990.

181. Hamilton S, Aaltonen L. World health organization classification of tumours. Pathology and genetics of tumours of the digestive system. Lyon: IARC Press; 2000.

182. Toronto Community Health Profiles. About the data: Prevention of diabetes & cancer prevention [Internet]. Available from: http://www.torontohealthprofiles.ca/a_documents/aboutTheData/7_1_AboutTheData_Prevention_2008-2011.pdf

183. Ontario Ministry of Health and Long Term Care. Schedule of Benefits for Laboratory Services. Ontario Ministry of Health and Long Term Care; 1999.

184. Ontario Ministry of Health and Long Term Care. ColonCancerCheck Fecal Occult Blood Testing (FOBT): Bulletin 4471. 2008.

185. Hao L, Naiman D. Quantile Regression. Sage Publications; 2007.

186. Kelsey J, Whittemore A, Evans A, Thompson W. Methods in Observational Epidemiology. 2nd ed. New York: Oxford University Press; 1996. p. 327–34.

187. Statistics Canada. Population and dwelling counts, for Canada and census subdivisions (municipalities), 2011 and 2006 censuses [Internet]. 2015 [cited 2015 Sep 8]. Available from: http://www12.statcan.ca/census-recensement/2011/dp-pd/hlt-fst/pd-pl/Table-Tableau.cfm?LANG=Eng&T=301&SR=1&S=3&O=D&RPP=25&PR=0&CMA=0

188. Chen C. An Introduction to Quantile Regression and the QUANTREG Procedure. Thirtieth Annual SAS User Group International Conference. Cary, NC:SAS Institute Inc; 2005.

189. CanIMPACT [Internet]. [cited 2015 Aug 11]. Available from: http://canimpact.utoronto.ca/

190. Mariscal M, Llorca J, Prieto D, Delgado-Rodriguez M. Determinants of the interval between the onset of symptoms and diagnosis in patients with digestive tract cancers. Cancer Detect Prevent. 2011;25(5):420–9.

191. Mitchell AD, Inglis KM, Murdoch JM, Porter GA. Emergency room presentation of colorectal cancer: a consecutive cohort study. Annals of Surgical Oncology [Internet]. 2007 Mar [cited 2014 Dec 6];14(3):1099–104. Available from: http://www.ncbi.nlm.nih.gov/pubmed/17211732

192. McNutt L, Chuntao W, Xue X, Hafner J. Estimating the Relative Risk in Cohort Studies and Clinical Trials of Common Outcomes. American Journal of Epidemiology [Internet]. 2003 May 15 [cited 2015 May 18];157(10):940–3. Available from: http://aje.oupjournals.org/cgi/doi/10.1093/aje/kwg074

193. McKendry R. Physicians for Ontario: Too many? Too few? For 2000 and beyond: Report of the fact finder on physician resources in Ontario. 1999.

194. Parikh-Patel A, Bates JH, Campleman S. Colorectal cancer stage at diagnosis by socioeconomic and urban/rural status in California, 1988-2000. Cancer [Internet]. 2006 Sep 1 [cited 2013 Mar 21];107(5 Suppl):1189–95. Available from: http://www.ncbi.nlm.nih.gov/pubmed/16835910

195. Harris GJC, Simson JNL. Causes of late diagnosis in cases of colorectal cancer seen in a district general hospital over a 2-year period. Annals of the Royal College of Surgeons of England. 1998;80(4):246–8.

196. Institute of Medicine (US). Crossing the quality chasm: a new health system for the 21st century. National Academy Press; 2001.


197. Globerman S. Reducing wait times for health care: what Canada can learn from theory and international experience. Globerman S, editor. 2013.

198. Clinical risk unit [Internet]. 2002 [cited 2015 Aug 19]. Available from: http://www.patientsafety.ucl.ac.uk/caseanalysis.htm

199. Telford JJ, Levy AR, Sambrook JC, Zou D, Enns RA. The cost-effectiveness of screening for colorectal cancer. Canadian Medical Association Journal [Internet]. 2010 Sep 7 [cited 2015 Aug 18];182(12):1307–13. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2934796&tool=pmcentrez&rendertype=abstract

200. Jong KE, Vale PJ, Armstrong BK. Rural inequalities in cancer care and outcome. The Medical Journal of Australia [Internet]. 2005 Jan 3;182(1):13–4. Available from: http://www.ncbi.nlm.nih.gov/pubmed/15651940


Appendix A Dataset Creation Plan

Name and Number of Study: The Diagnostic Interval of Colorectal Cancer Patients in Ontario by Degree of Rurality (Project #: 2015 0800 156 000)

Research Program: MSc Epidemiology (Cancer Research Institute, Division of Cancer Care and Epidemiology)

Contacts: Leah Hamilton, Patti Groome, Colleen Webber, Marlo Whitehead, Geoff Porter

Who will be responsible for DCP updates? Leah Hamilton

PIA Approved? Yes

DCP update history: Created: Feb 24, 2014. Updated: Dec 09, 2014.

Short Description of Research Question: Wait times while moving through the cancer diagnosis and treatment pathways are a persistent public health concern. This study will describe the diagnostic interval for colorectal cancer patients in Ontario by degree of rurality of their residence. This study will also describe the association between diagnostic interval and geographic access to key diagnostic resources. Do patients living in a more rural area have a different diagnostic interval compared to those living in a less rural area? How is geographic access to key resources associated with the diagnostic interval? A delay in the diagnosis of colorectal cancer could impact stage and patient survival. Furthermore, it is important to reduce patient anxiety during the diagnostic process by prompt evaluations. If the diagnostic interval is higher in rural areas, it may be associated with, and therefore ameliorated by, changes in health care services that are related to the diagnostic process. Such changes would address an important health care inequity.

Study Objectives:
1) To compare the Ontario colorectal cancer stage distribution across regions grouped by Rurality Index for Ontario (RIO) categories.
2) To determine whether patients living in census subdivisions with a higher degree of rurality, based on RIO categories, have a longer system-related diagnostic interval across RIO scores stratified by stage.
3) To map the relationships between the system-related diagnostic interval, rurality and key colorectal cancer diagnostic resources through a geographic information system (GIS) mapping exercise.

List of Datasets Used: RPDB, CIHI/DAD (2004-2012), OHIP (2004-2012), OCR (2006-2012), NACRS (2004-2012), Census Area Profiles, IPDB (2007-2012), Stage (2006-2012)


Defining the Cohort

Index Event: Diagnosis of primary adenocarcinoma of the colon or rectum from Jan 1, 2007 to Dec 31, 2012 (OCR). DXCODE = ICD-9 codes 153.0 to 154.1 and 159.0 (excluding 153.5 neoplasm of the appendix)

Exclusions: Exclude from OCR if (in order):
1) Dead at diagnosis: BESTSOURCE=D
2) Less than or equal to 18 years old OR greater than or equal to 105 years old at diagnosis: AGEDX ≤18 years OR AGEDX ≥105 years (pediatric cases of CRC different for multiple reasons and probably an error in database if case is older than 105)
3) Living outside Ontario when diagnosed: VALIKN≠V
4) Colorectal cancer is not the first primary cancer OR there is a subsequent cancer diagnosis within 6 months of the first primary colorectal cancer diagnosis date: N_INCPRIM≠1-CRC up to 6 months after DXDATE
5) OHIP coverage for less than 36 months before diagnosis date: in the RPDB, STRTELIG ≤36 months before DXDATE or CONTACTdate ≤36 months before DXDATE (needed for diagnostic interval and comorbidities determination)

Size of Cohort: N = about 49,000 (will be less than this number)

Colorectal Cancer Cohort: Dataset for all eligible colorectal cancer patients

Time Frame Definitions

[Figure: study timeline showing the look-back window, the accrual window and the observation window (in which to look for outcomes) relative to the index event date and the max follow-up date.]

Accrual Start/End Dates: Jan 1, 2007 to Dec 31, 2012

Max Follow-up Date: Dec 31, 2012

When does observation window terminate? DXDATE (date of colorectal cancer diagnosis for each patient)
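The ordered exclusions listed under Defining the Cohort can be sketched in Python as follows. This is only an illustration and not the SAS code run at ICES: the record fields (best_source, age_dx, valid_ontario, crc_only_primary, months_eligible) are hypothetical stand-ins for the OCR and RPDB variables named in the plan, and the 36-month rule is read here as "fewer than 36 months of eligibility", which is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # Hypothetical field names standing in for OCR/RPDB variables.
    best_source: str        # "D" means dead at diagnosis (BESTSOURCE=D)
    age_dx: int             # age at diagnosis (AGEDX)
    valid_ontario: bool     # True when VALIKN = V
    crc_only_primary: bool  # CRC is first primary, no other cancer within 6 months
    months_eligible: int    # months of OHIP eligibility before DXDATE


# Exclusions in the order given in the plan; only the first failed rule is counted.
EXCLUSIONS = [
    ("dead at diagnosis", lambda p: p.best_source == "D"),
    ("age <=18 or >=105", lambda p: p.age_dx <= 18 or p.age_dx >= 105),
    ("not living in Ontario", lambda p: not p.valid_ontario),
    ("not first primary", lambda p: not p.crc_only_primary),
    ("OHIP coverage < 36 months", lambda p: p.months_eligible < 36),  # assumed reading
]


def first_exclusion(p):
    """Return the label of the first exclusion that applies, or None if eligible."""
    for label, rule in EXCLUSIONS:
        if rule(p):
            return label
    return None


def build_cohort(patients):
    """Split patients into the eligible cohort and exclusion counts by reason."""
    cohort, dropped = [], {}
    for p in patients:
        reason = first_exclusion(p)
        if reason is None:
            cohort.append(p)
        else:
            dropped[reason] = dropped.get(reason, 0) + 1
    return cohort, dropped
```

Counting each patient under the first rule that fails mirrors the "(in order)" instruction, so the exclusion counts add up to the number of patients removed.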


Lookback Window(s): Control charts used to define a look-back period for each key diagnostic resource separately. The background rate will be quantified as the average weekly count of health care encounters in the 18-24 month interval before DXDATE. See “Identifying the index date” section.

Variable Definitions

Main Exposure or Risk Factor: Rurality of residence [RPDB variable: rio2008]
1. Please create rio2008 using %getdemo macro
2. I will be grouping rio2008 scores into 6 categories (0-9, 10-30, 31-45, 46-55, 56-75, 76+)

Baseline Characteristics

Patient Related
1. Age at diagnosis [OCR variable: AGE]
2. Sex [OCR variable: SEX]
3. Area-level material deprivation/socioeconomic status (SES) [variable: deprivation_DA06]
- Assign Census Dissemination Area (2006) to each patient based on most updated postal code
- Link DA to ONmarg deprivation_DA06 using Moineddin’s look-up table (may now be a macro)
4. Comorbidities (ADGs: Aggregated Diagnosis Groups) within 12 months-24 months before the DXDATE
- Create ADG variables using ICD diagnosis codes from OHIP and CIHI-DAD data within time window
- Use ICES macro
- Create new variables ADG1-ADG32 (with a summary score as well)
- I will then look at the frequencies to determine the best way to categorize these variables (e.g. distinguish major ADGs-3,4,9,11,16,22,32 and minor ADGs)
5. Recent immigrant status (within 10 years)
- OHIP eligibility, new in the last 10 years [Yes/No]

Cancer Related
1. Stage at diagnosis [OCR variable: BEST_STAGE] used to assign stage (I, II, III, IV)
2. Anatomical Sub-site [OCR variable: DXCODE ICD-9] track 4 digits (e.g. 153.0) of all included cases
- I will categorize codes into proximal, distal and rectal sub-sites

Physician Related
1. Referring physician specialty
- Earliest doctor in the diagnostic pathway is the referring physician (earliest of doctors B-H in Appendix-1A)
- Get referring physician specialty* [IPDB variable: mainspecialty]
*if the earliest visit to a doctor in the diagnostic pathway was through the emergency department (ED=yes), please let referring physician specialty = EDdiagnosed

Investigation Related
1. Emergency Department (ED) Presentation [yes/no]
This variable is assigned to the index contact once the index contact has been determined.
ED presentation=yes if:
- OHIP variable LOCATION = E, or
- Index contact = box H (NACRS ED), referring to Appendix-1A (record of any listed NACRS ED dxcodes within 30 days prior to CRCdx date where schededvisit ≠Y)
2. First Test
- Please create a “first_test” variable to identify the CRC test with the earliest date (within the respective look-back windows) in a patient’s pathway. The options for first_test are listed below. In the case of a tie, where there is more than one test on the earliest date for a particular patient, please use the test hierarchy denoted in brackets.
- FOBT (1)
- Colonoscopy (2)
- Sigmoidoscopy (3)
- Barium enema (4)
- Abdominal CT (5)
- Abdominal x-ray (6)
- Abdominal US (7)
- ED NOS (not otherwise specified) = ED presentation, but no other tests on record that day (8)
Note: The hierarchy is for the first test regardless of whether the test has a referring physician or not.

As an example, on the earliest date that a patient has investigations on, they presented to the ED and then had a subsequent abdominal x-ray on the same day. The first test=abdominal x-ray and the patient would also be ED presentation=yes. In another scenario, if a patient presented to the ED and there were no other tests to be found on that day (or earlier) the first test=ED NOS and the patient will also have ED presentation=yes.
3. First Encounter
- The “first_encounter” variable is the date of the index contact. Please see the Outcome Definitions section.
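The first_test tie-break described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function and argument names are hypothetical, the tests passed in are assumed to have already been restricted to their look-back windows, and ED presentation is flagged when an ED visit falls on the earliest date, which simplifies the plan's rule of assigning it to the index contact.

```python
from datetime import date

# Ranks from the plan: the lower rank wins when tests share the earliest date.
TEST_RANK = {
    "FOBT": 1, "Colonoscopy": 2, "Sigmoidoscopy": 3, "Barium enema": 4,
    "Abdominal CT": 5, "Abdominal x-ray": 6, "Abdominal US": 7, "ED NOS": 8,
}


def first_test(tests, ed_dates=()):
    """tests: list of (test_name, test_date) pairs inside the look-back windows.
    ed_dates: dates of ED presentations.
    Returns (first_test, test_date, ed_presentation)."""
    # An ED visit competes as "ED NOS", which loses to any real test that day.
    events = list(tests) + [("ED NOS", d) for d in ed_dates]
    if not events:
        return None, None, False
    earliest = min(d for _, d in events)
    same_day = [name for name, d in events if d == earliest]
    chosen = min(same_day, key=TEST_RANK.__getitem__)
    return chosen, earliest, earliest in set(ed_dates)
```

With an ED visit and an abdominal x-ray on the same earliest date, the sketch returns abdominal x-ray with ED presentation set, matching the worked example in the plan.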

Other Variables

Investigation Related
1. Purpose of first test (diagnostic or screening)
- After the first_test variable has been determined, see Appendix-3 to apply the algorithm.
NOTE: The algorithm is applied to the CRC investigation (FOBT, X-ray, CT, US, BE, sigmoidoscopy, colonoscopy or ED presentation) and therefore has to be applied after the first_test has been determined for each patient.


Outcome Definitions

Diagnostic Interval: The time (days) between the patient’s first diagnostic-related encounter with the health care system (index date) and the definitive diagnosis of colorectal cancer.

Strategy: Please use control charts (see “Identifying the index date” section) to determine the look-back window for each key diagnostic resource. Use 30 days as the look-back window for the ED NACRS cases. Work backwards X days from the date of diagnosis (include the date of diagnosis) for each key diagnostic resource to find the first investigation of that procedure (Appendix-1, Appendix-2).
1. Identify the physician number of the ordering physician
2. Identify the date of the last visit with the ordering physician for each diagnostic procedure
3. Find the earliest date of visit amongst ALL diagnostic procedures; this will be the index contact date (look back a maximum of one year from the CRC TEST date)
4. Now that the index contact date is determined, decide if this contact was through the ED (see baseline characteristics section, investigation related #1)

Identifying the index date

The time window in which I will look back for relevant CRC diagnostic encounters (to identify the index contact date) will be determined separately for each type of key diagnostic resource using control charts. The 8 key diagnostic resources include: FOBT, abdominal x-ray, abdominal CT, abdominal US, barium enema, sigmoidoscopy, colonoscopy and ED presentation (see Appendix-1).
1. Identify CRC cohort (see defining cohort section). Make groups for each of the previously determined key diagnostic resources from this cohort. For example, group A) all patients who have had an FOBT, group B) all patients who have had an abdominal x-ray… NOTE: an individual patient can be in more than one group.
2. Use OHIP claims to look at the weekly healthcare encounters in the months immediately preceding the CRC diagnosis against the background rate, which will be quantified as the average weekly count in the 18-24 month interval before CRC diagnosis. This will be done separately for each of the key diagnostic resources, therefore there could be up to 8 different look-back windows.
3. I will use established rules for interpreting the control charts to end up with a definitive cutoff time for each of the key diagnostic resources.
4. I will supply Marlo with cutoff values for the control charts (see Appendix-7)
5. Conduct a signal strength test
6. Apply the look-back windows to the cohort and then follow the Outcome Definitions section to determine the earliest diagnostic-related encounter for each patient (Appendix-1, Appendix-2 more for my own understanding).

Preliminary Analysis

1. In order to ensure that correct and exhaustive OHIP codes are used to identify the six key diagnostic resources (Appendix-1) I would like to examine a frequency distribution of all OHIP procedure fee codes that the cohort had in the 3 months directly preceding the date of CRC diagnosis. Once I am confident that all codes for key diagnostic resources are present, then we can apply them to the entire cohort to determine the control charts leading to determinations of the diagnostic interval.
- Cohort: Diagnosis of colorectal cancer from Jan 1-2007 to Dec 31-2011 (ICD-9 codes 153.0 to 154.1 and 159.0 excluding 153.5 neoplasm of the appendix)
- Time frame: 3 months directly preceding the date of CRC diagnosis (including CRC diagnosis date)
- N/10,000 output
2. Examine the stage distribution in the cohort using different variables for stage (cross tabulation)
- Cohort: Diagnosis of colorectal cancer from Jan 1-2007 to Dec 31-2012 (ICD-9 codes 153.0 to 154.1 and 159.0 excluding 153.5 neoplasm of the appendix)
- OCR variable: BEST GROUP STAGE at DIAGNOSIS
- OCR variable (collaborative stage table): DerivedAJCC6StGrp

3. More detailed OHIP colonoscopy codes are available after my study time frame (2007-2011) that are able to distinguish screening colonoscopies from symptomatic colonoscopies. Screening may be an important part of my study, but is difficult to assign on an individual patient basis. To examine this we could use CRC patients diagnosed in 2012 (where there are detailed OHIP colonoscopy codes) to help inform us of work-up codes that are indicative of colonoscopy for screening versus colonoscopy for symptoms. If there is a pattern we may apply an algorithm to our study cohort (2007-2011) to identify those that were likely screening colonoscopies versus those that were likely symptomatic colonoscopies (2012 would be the “gold standard”). A preliminary analysis would need to be done to determine this, make the algorithm and apply to the 2007-2011 cohort if applicable. New colonoscopy codes of interest in the 2012 CRC patients include:
Z497 Confirmatory colonoscopy (from a positive FOBT, FIT, sigmoidoscopy, barium enema, CT abdomen or CT colonography)
Z499 Colonoscopy in absence of signs or symptoms, family history of increased risk malignancy
Z492 Five year follow-up of a normal colonoscopy (Z499)
Z493 Ten year follow-up of a normal colonoscopy (Z497, Z555)
Z496 Colonoscopy presence of signs and symptoms
Z494 Colonoscopy hereditary or other bowel disorders
Z498 Colonoscopy follow-up from abnormal colonoscopy
Z555 Colonoscopy absence of signs or symptoms or risk factors

A) Please identify all colorectal cancer patients diagnosed between Jan 1, 2012 and Dec 31, 2012 where DXCODE = ICD-9 codes 153.0 to 154.1 and 159.0 (excluding 153.5 neoplasm of the appendix). Use this cohort for step B.

B) Create 6 separate frequency tables, one for each of the colonoscopy/sigmoidoscopy codes (Z496, Z497, Z498, Z499, Z555, Z580) that contain all OHIP billing dxcodes from the referring doctors for the first colonoscopy/sigmoidoscopy. In addition, please organize output into two time periods, the 0-6 months directly prior to cancer diagnosis and the 6-12 months prior to cancer diagnosis (colonoscopy/sigmoidoscopy date not included). A N/10,000 output for the dxcodes may be the easiest to interpret.

C) Examine each frequency table separately and highlight dxcodes that signal a difference between the 0-6 months and 6-12 months columns.


D) Plot highlighted dxcodes from step C comparing various colonoscopy codes and time intervals. The goal of this is to identify dxcodes that are strongly suggestive of being related to colonoscopy for diagnostic purposes vs. colonoscopy for screening purposes vs. surveillance purposes.

E) At this date a bar graph comparing Z496 (colonoscopy presence signs and symptoms) and Z499 (screening colonoscopy for high risk individuals) looks promising in terms of teasing out relevant dxcodes for diagnostic purposes vs. screening purposes.

F) Once key dxcodes have been identified an algorithm similar to Appendix-3 can be applied to the 2007-2012 cohort.

G) Test algorithm data. Algorithm will be applied to full cohort (2007-2012); we can then pull the new OHIP codes to do a comparison for the group of patients that have both the algorithm-determined purpose of investigation as well as the detailed new code. We will wait to see the cross tabulation before making a decision as to what method should be used for the group of patients who have both. (Crosstab to help with results interpretation also could calculate error rates).

4. The look-back logic for the diagnostic interval of colonoscopy and sigmoidoscopy tests is more involved (appendix 1). As an initial step could you please find all endoscopy visits within the 6 months prior to the patients’ colonoscopy or sigmoidoscopy tests (not including the actual visit for the test) and create a frequency table to check how many patients had: 0, 1, >1 visits (3 separate categories) with the endoscopist who performed their colonoscopy/sigmoidoscopy in this time frame. As well, create a frequency table to check how many patients had 0, 1, >1 referring doctors (to the endoscopist) identified by their physician number (interested in unique number of doctors if more than 1). We will not be using endoscopist consult codes (sorry for the confusion). Before we can move ahead to find the intervals for these patients we need to do this preliminary test and then I will update appendix 1 with our plan.

5. In order to ensure that we are examining all relevant NACRS ED dxcodes could you please extend the ED dxcodes frequency table to include the top 80 dxcodes, where the top 80 are determined by the frequencies in the 0-3 month column for both the pre-diagnostic and control periods (18-24 months). Once we choose the relevant codes from this list (see Appendix-4 for the list) the next step will be to determine our look-back period for the ED cases. You will be creating a control chart looking for cases of the specified dxcodes from the ED NACRS data for the whole cohort, excluding those patients with the NACRS variable schededvisit=Y. We will use the same methods as the previous control charts in this project (background control period is 18-24 months prior to CRCdx). The cut-off signal was too long, so the strategy was changed to using any listed NACRS ED dxcodes within 30 days prior to CRCdx date where NACRS variable SCHEDEDVISIT≠Y to define this group. Apply the ED category to the cohort.
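The control chart idea used throughout this section (weekly encounter counts before diagnosis compared with a background rate from the 18-24 month interval) can be sketched as below. The plan's "established rules" (Rules 1-4) are not reproduced here; the threshold of background mean plus three standard deviations, the week boundaries 78 and 104, and the function name are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def lookback_weeks(weekly_counts, bg_start=78, bg_end=104, n_sd=3.0):
    """weekly_counts[i] is the cohort-wide encounter count in week i+1 before diagnosis.

    Background = weeks bg_start..bg_end (roughly 18-24 months before diagnosis).
    Returns the number of consecutive weeks, counted back from diagnosis, whose
    count exceeds background mean + n_sd * SD (an assumed rule, not Rules 1-4).
    """
    background = weekly_counts[bg_start - 1:bg_end]
    limit = mean(background) + n_sd * pstdev(background)
    weeks = 0
    for count in weekly_counts[:bg_start - 1]:
        if count <= limit:
            break  # first week back at background level ends the window
        weeks += 1
    return weeks
```

Running this once per diagnostic resource would give one candidate look-back window per resource, which is the shape of the output reported in Appendix-7.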


Outline of Analysis Plan

Objective 1: RIO scores will first be grouped into 6 categories (0-9, 10-30, 31-45, 46-55, 56-75 and 76+). Descriptive statistics will examine differences in stage distribution across RIO categories, and differences will be tested using a chi-square test for trend and logistic regression, categorizing stage as either early (I and II) or late (III and IV).

Objective 2: Median diagnostic interval (days) will be plotted against RIO scores stratified by stage. Stratification will be done because stage has a known influence on the diagnostic interval. The median diagnostic interval will be used because the diagnostic interval distribution is known from previous literature to be right-skewed; the 90th percentile diagnostic interval will also be used. A multivariate median (50th) and 90th percentile quantile regression model will be performed to analyze differences. The fully adjusted model will be used as there will be sufficient statistical power and not a great number of possible confounders. Attention will be paid to identifying any collinear relationships in the covariates.

Objective 3: Basic GIS mapping using ArcGIS software will be used to overlay the combination of median diagnostic interval and RIO score by census subdivision. Two key diagnostic resources, geographic access to a colonoscopy and the presence of diagnostic assessment programs will be mapped and interpreted.
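The descriptive step behind Objectives 1 and 2 (grouping RIO scores into the six categories, then summarizing the diagnostic interval by category and stage) can be sketched as follows. The function names are hypothetical, RIO scores are assumed to be whole numbers, and the nearest-rank percentile is one of several common definitions, so values may differ slightly from SAS output. The quantile regression itself is not shown.

```python
import math
from collections import defaultdict
from statistics import median

# Upper bounds of the first five RIO categories; anything above 75 is "76+".
RIO_BINS = [(9, "0-9"), (30, "10-30"), (45, "31-45"), (55, "46-55"), (75, "56-75")]


def rio_category(score):
    """Map an integer RIO score to one of the six categories in the plan."""
    for upper, label in RIO_BINS:
        if score <= upper:
            return label
    return "76+"


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(records):
    """records: iterable of (rio_score, stage, interval_days).
    Returns {(rio_category, stage): (median, 90th percentile)}."""
    groups = defaultdict(list)
    for rio, stage, days in records:
        groups[(rio_category(rio), stage)].append(days)
    return {key: (median(v), percentile(v, 90)) for key, v in groups.items()}
```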

Quality assurance activities (To be updated at project closure)

UNIX directory of SAS programs:

Date that DCP and SAS code were reconciled:

Were there any changes made to the SAS code that are not reflected in this DCP? If yes, please explain briefly:

The UNIX README file is up to date as of (date):

Have the results of the following data quality assurance tools been shared with the research team, as appropriate? assign, evolution, dinexplore, track / exclude, other (specify)

Additional comments:


DCP APPENDIX 1A- Diagnostic Interval


DCP APPENDIX 1B- Diagnostic Interval, Additional Logic

Colonoscopy/Sigmoidoscopy Logic
1) Collect all colonoscopies within the control chart look-back.
2) Collect all visits with an endoscopist (who performed the test) within 12 months prior to the first colonoscopy.
3) Look for referring doctors' codes in steps 1 and 2.
4) Take the last visit with a referring doctor prior to any of the visits in step 1 or 2 (i.e. if >1 referring doctor, take the earliest).
5) If no referring doctor, use the earliest visit (farthest from CRCdxdate) in step 1 or 2.

REPEAT steps for sigmoidoscopy

OHIP/CCI logic For the radiological imaging tests there are OHIP and CCI codes. In the event that there is both an OHIP and CCI code present, use the OHIP record to find the date of the last visit with the ordering MD for the OHIP investigation. Compare this date to the CCI investigation date. Use the earlier of the two (farther away from the CRCdx date).


DCP APPENDIX 2- Index Contact Date and Control Chart Logic

Example of index contact date determination for an individual patient. Look-back windows set for each diagnostic resource using control charts which are created using data from all patients in the cohort with that particular investigation.

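[Figure: timeline for an individual patient showing the index contact date, the set x-ray, CT and endoscopy look-back windows (each with an encounter marked X), and the CRC diagnosis date, with time measured as Z months prior to the diagnosis date.]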
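The index contact date determination illustrated in this appendix can be sketched as follows. The window lengths are the control chart cut signals (in weeks) reported in Appendix-7 plus the 30-day ED window; everything else is an assumption for illustration: the names are hypothetical, each encounter is assumed to be already resolved to its contact date (for example the last visit with the ordering physician), the window is applied directly to that contact date, and a patient with no qualifying encounter is given the diagnosis date as index date.

```python
from datetime import date, timedelta

# Look-back windows in days: Appendix-7 cut signals (weeks) and the 30-day ED rule.
LOOKBACK_DAYS = {
    "FOBT": 45 * 7, "Abdominal x-ray": 53 * 7, "Abdominal CT": 60 * 7,
    "Abdominal US": 54 * 7, "Barium enema": 28 * 7, "Colonoscopy": 36 * 7,
    "Sigmoidoscopy": 32 * 7, "ED": 30,
}


def index_contact_date(encounters, dx_date):
    """encounters: list of (resource, contact_date) pairs for one patient.
    Returns the earliest contact that falls inside its resource's window,
    or dx_date when no encounter qualifies (an assumed fallback)."""
    eligible = [
        d for resource, d in encounters
        if dx_date - timedelta(days=LOOKBACK_DAYS[resource]) <= d <= dx_date
    ]
    return min(eligible, default=dx_date)


def diagnostic_interval(encounters, dx_date):
    """Days from the index contact date to the diagnosis date."""
    return (dx_date - index_contact_date(encounters, dx_date)).days
```

Because each resource has its own window, an old barium enema can fall outside its 28-week window while an equally old CT still counts, which is the behaviour the figure is meant to convey.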


DCP APPENDIX 3- Algorithm to determine the purpose of CRC investigation: symptomatic or screening. Adapted from El-Serag and colleagues4

First test = Abdominal X-ray, Abdominal CT, Abdominal US or Barium Enema, OR ED presentation = yes: Symptomatic

First test = Sigmoidoscopy or Colonoscopy, AND ED presentation = no: Are any dxcodes 1-16 present in the 1yr prior to test date?
Yes: Symptomatic
No: Asymptomatic

First test = FOBT (L179) AND ED presentation = no: Asymptomatic

First test = FOBT (L181 or G004) AND ED presentation = no: *Are any dxcodes 1-16 present from the last encounter with the ordering physician?
Yes: Symptomatic
No: Asymptomatic


1) 787: Anorexia, nausea and vomiting, heartburn, dysphagia, hiccough, hematemesis, jaundice, ascites, abdominal pain, melena, masses
2) 280: Iron deficiency anaemia
3) 569: Anal or rectal polyp, rectal prolapse, anal or rectal stricture, rectal bleeding, other disorders of intestine
4) 009: Diarrhea, gastro-enteritis, viral gastro-enteritis
5) 564: Spastic colon, irritable colon, mucous colitis, constipation
6) 455: Haemorrhoids
7) 285: Other anaemias
8) 562: Diverticulitis or diverticulosis of large or small intestine
9) 535: Gastritis
10) 281: Pernicious anaemia
11) 153: Large intestine - excluding rectum
12) 565: Anal fissure, anal fistula
13) 154: Rectum, rectosigmoid and anus
14) 555: Regional enteritis, Crohn's disease
15) 560: Intestinal obstruction, intussusception, paralytic ileus, volvulus, impaction of intestine
16) 556: Ulcerative colitis

*Further explanation: We put in this box as we have heard from some of the clinicians that FOBTs are not solely done for screening purposes, although they are most of the time.

If FOBT is the first_test for a patient, and is therefore going through the algorithm, look-back from the FOBT test date for the physician # of the ordering MD and then back to the date of the last visit with that ordering MD prior to the date of the FOBT or on the same date of the FOBT (similar to our diagnostic interval calculation for the other investigations). On that visit with the ordering MD were the dxcodes 1-16 present? Look-back a maximum of 6 months from the test date for the respective patient visit. (We don’t think it would require the full year look-back from test date like our diagnostic interval methods. Patients should receive the results of their FOBT tests within one month if going through the screening program; however there is dependency on when the patient sends in the test in the first place).

If there is no ordering physician number attached to the FOBT, then I think it is safe to classify these FOBTs as 'screening' since some patients may receive their kits from a pharmacy. I am not exactly certain how the billing would work for this, but would believe that they were ‘screening’ for the purpose of the algorithm.

Algorithm adapted from El-Serag and colleagues4
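The decision flow in this appendix can be sketched in Python as below. It is an illustration under stated assumptions rather than the code used in the study: the function and argument names are hypothetical, the caller is assumed to have already collected the relevant dxcodes for the correct time window (1 year before an endoscopy, or the last ordering-physician visit within 6 months of an FOBT), and an FOBT with no ordering physician is treated as asymptomatic, following the explanatory note above.

```python
IMAGING = {"Abdominal x-ray", "Abdominal CT", "Abdominal US", "Barium enema"}
ENDOSCOPY = {"Sigmoidoscopy", "Colonoscopy"}
# The sixteen three-digit diagnosis codes listed in this appendix.
SYMPTOM_DXCODES = {"787", "280", "569", "009", "564", "455", "285", "562",
                   "535", "281", "153", "565", "154", "555", "560", "556"}


def purpose_of_first_test(first_test, ed_presentation, fobt_fee_code=None,
                          dxcodes_prior_year=(), dxcodes_last_ordering_visit=(),
                          has_ordering_physician=True):
    """Return 'Symptomatic' or 'Asymptomatic' for a patient's first CRC test."""
    # Imaging or any ED presentation is always treated as symptomatic.
    if ed_presentation or first_test in IMAGING:
        return "Symptomatic"
    if first_test in ENDOSCOPY:
        hit = SYMPTOM_DXCODES.intersection(dxcodes_prior_year)
        return "Symptomatic" if hit else "Asymptomatic"
    if first_test == "FOBT":
        # L179 is the ColonCancerCheck tracking code; no ordering MD implies a kit.
        if fobt_fee_code == "L179" or not has_ordering_physician:
            return "Asymptomatic"
        hit = SYMPTOM_DXCODES.intersection(dxcodes_last_ordering_visit)
        return "Symptomatic" if hit else "Asymptomatic"
    raise ValueError(f"unexpected first test: {first_test!r}")
```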


DCP APPENDIX 4- CRC-related procedure codes

Procedure fee codes

FOBT: OHIP: G004, L181, L179
Abdominal X-ray: OHIP: X100, X101; CCI: 3OT10^^
Abdominal CT: OHIP: X126, X409, X410; CCI: 3OT20^^
Abdominal US: OHIP: J128, J428, J135, J435; CCI: 3OT30^^
Barium enema: OHIP: X112, X113; CCI: 3NM10^^
Colonoscopy: OHIP: Z491, Z492, Z493, Z494, Z495, Z496, Z497, Z498, Z499, Z570, Z571, or Z555 plus one of E740, E741, E747, E705 on the same day
Sigmoidoscopy: OHIP: Rigid- Z535, Z536; OHIP: Flexible- Z580, or Z555 without E740, E741, E747, E705 on the same day
Emergency Department Presentation: Record of any listed NACRS ED dxcodes within 30 days prior to CRCdx date where schededvisit ≠Y (Patients with a scheduled appointment in the ED are not considered as true ED patients, therefore SCHEDEDVISIT≠Y. This may be particularly relevant in rural areas).

NACRS ED dxcodes:

ICD-10 code Description

R104 Other and unspecified abdominal pain

K566 Other and unspecified intestinal obstruction

D649 Anaemia, unspecified

K922 Gastrointestinal haemorrhage, unspecified

K625 Haemorrhage of anus and rectum

K590 Constipation

R53 Malaise and fatigue

A099 Gastroenteritis and colitis of unspecified origin

C189 Malignant neoplasm colon, unspecified

K529 Noninfective gastroenteritis and colitis, unspecified

K579 Diverticular disease of intestine, part unspecified, without perforation or abscess

R190 Intra-abdominal and pelvic swelling, mass and lump

K921 Melaena


R1030 Right lower quadrant pain

R1039 Lower abdominal pain, unspecified

C20 Malignant neoplasm of rectum

K631 Perforation of intestine (nontraumatic)

R112 Vomiting alone

D509 Iron deficiency anaemia, unspecified

R1012 Epigastric pain

D508 Other iron deficiency anaemias

E860 Dehydration

C787 Secondary malignant neoplasm of liver and intrahepatic bile duct

Z513 Blood transfusion (without reported diagnosis)

C180 Malignant neoplasm of caecum

I802 Phlebitis and thrombophlebitis of other deep vessels of lower extremities

R1031 Left lower quadrant pain

R509 Fever, unspecified

R1010 Right upper quadrant pain

C187 Malignant neoplasm of sigmoid colon

A419 Sepsis, unspecified

D374 Neoplasm of uncertain or unknown behaviour of the colon

K573 Diverticular disease of large intestine without perforation or abscess

K37 Unspecified appendicitis

K639 Disease of intestine, unspecified

K628 Other specified diseases of anus and rectum

N179 Acute renal failure, unspecified

R18 Ascites

R113 Nausea with vomiting

R64 Cachexia

K219 Gastro-oesophageal reflux disease without oesophagitis

K297 Gastritis, unspecified

K650 Acute peritonitis


E871 Hypo-osmolality and hyponatraemia

I269 Pulmonary embolism without mention of acute cor pulmonale

K8050 Calculus of bile duct without cholangitis or cholecystitis without mention of obstruction

K561 Intussusception

R634 Abnormal weight loss

D375 Neoplasm of uncertain or unknown behaviour of the rectum

C260 Malignant neoplasm intestinal tract, part unspecified

J90 Pleural effusion, not elsewhere classified

K769 Liver disease, unspecified

R318 Other and unspecified hematuria


DCP APPENDIX 5- CRC-related procedure code definitions

G004= Occult blood (miscellaneous tests), in office
L181= Lab. Med.-Biochem.-Occult Blood
L179= Tracking code-colon cancer check
X100= Diagnostic radiology, abdomen-single view
X101= Diagnostic radiology, abdomen-two or more views
X126= Diagnostic radiology, computed tomography (CT), abdomen-with and without IV contrast
X409= Diagnostic radiology, computed tomography (CT), abdomen-without IV contrast
X410= Diagnostic radiology, computed tomography (CT), abdomen-with IV contrast
J128= Diagnostic ultrasound, abdomen and retroperitoneum abdominal scan-limited study
J428= Diagnostic ultrasound, abdomen and retroperitoneum abdominal scan-limited study
J135= Diagnostic ultrasound, abdomen and retroperitoneum abdominal scan-complete
J435= Diagnostic ultrasound, abdomen and retroperitoneum abdominal scan-complete
X112= Diagnostic radiology, gastrointestinal tract, colon- barium enema including survey film, if taken
X113= Diagnostic radiology, gastrointestinal tract, colon- air contrast, primary or secondary, including survey films, if taken
Z491= Digestive system surgical procedures, intestines (except rectum), Colonoscopy – for diagnosis or ongoing management, follow up of incomplete polyp resection
Z492= Digestive system surgical procedures, intestines (except rectum), Colonoscopy – for risk evaluation, five year follow up of normal colonoscopy (Z499), absence of intervening signs or symptoms – sigmoid to descending
Z493= Digestive system surgical procedures, intestines (except rectum), Colonoscopy – for risk evaluation, ten year follow up of normal colonoscopy (Z497, Z555), absence of intervening signs or symptoms – sigmoid to descending
Z494= Digestive system surgical procedures, intestines (except rectum), Colonoscopy – for diagnosis or ongoing management, hereditary (e.g. Familial adenomatous Polyposis or Hereditary Non-Polyposis Colorectal Cancer) or other bowel disorders (e.g. inflammatory bowel disease) associated with increased risk of malignancy
Z495= Digestive system surgical procedures, intestines (except rectum), Colonoscopy – for diagnosis or ongoing management, follow up of unsatisfactory colonoscopy
Z496= Digestive system surgical procedures, intestines (except rectum), Colonoscopy – for diagnosis or ongoing management, presence of signs or symptoms – sigmoid to descending colon
Z497= Digestive system surgical procedures, intestines (except rectum), Colonoscopy – for risk evaluation, confirmatory colonoscopy – sigmoid to descending colon
Z498= Digestive system surgical procedures, intestines (except rectum), Colonoscopy – for diagnosis or ongoing management, follow up of abnormal colonoscopy – sigmoid to descending colon
Z499= Digestive system surgical procedures, intestines (except rectum), Colonoscopy – for risk evaluation, Absence of signs or symptoms, family history associated with an increased risk of malignancy (e.g. a first degree relative or at least two second degree relatives with colorectal cancer or a premalignant lesion) – sigmoid to descending colon
Z555= Digestive system surgical procedures, intestines (except rectum), Colonoscopy – for diagnosis or ongoing management, absence of signs or symptoms or risk factors, 50 years of age or older – sigmoid to descending colon
Z570= Digestive system surgical procedures, intestines (except rectum), Colonoscopy- Fulguration of first polyp through colonoscope
Z571= Digestive system surgical procedures, intestines (except rectum), Colonoscopy- Excision of first polyp greater than or equal to 3mm through colonoscope
E740= Digestive system surgical procedures, intestines (except rectum), Colonoscopy- to splenic flexure, to Z491, Z492, Z493, Z494, Z495, Z496, Z497, Z498, Z499 or Z555, add
E741= Digestive system surgical procedures, intestines (except rectum), Colonoscopy- to hepatic flexure, to Z491, Z492, Z493, Z494, Z495, Z496, Z497, Z498, Z499 or Z555, add
E747= Digestive system surgical procedures, intestines (except rectum), Colonoscopy- to cecum, to Z491, Z492, Z493, Z494, Z495, Z496, Z497, Z498, Z499 or Z555, add
E705= Digestive system surgical procedures, intestines (except rectum), Colonoscopy- into terminal ileum, to Z491, Z492, Z493, Z494, Z495, Z496, Z497, Z498, Z499 or Z555
Z535= Digestive system surgical procedures, rectum, Endoscopy, Sigmoidoscopy with or without anoscopy- with rigid scope
Z536= Digestive system surgical procedures, rectum, Endoscopy, Sigmoidoscopy with or without anoscopy- with biopsy (ies)
Z580= Digestive system surgical procedures, intestines (except rectum), Sigmoidoscopy- sigmoidoscopy (using 60 cm. flexible endoscope)


DCP APPENDIX-6 –General Order of Operations

1- Define cohort, make exclusions
2- RIO scores determined, most covariates
3- Control charts created to define look-back period for each of the 8 diagnostic resources
4- Apply cutoff times from the control charts and find the first CRC test for each patient
5- Apply algorithm to the first CRC test to differentiate between screening and diagnostic purposes
6- Diagnostic interval calculated for each patient
7- Each patient will have a RIO score group, a diagnostic interval (days) and other covariates calculated
8- Analysis
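Steps 4 and 6 can be sketched in Python. This is a minimal illustration under stated assumptions, not the code used in the project. The function name, the layout of the encounter records and the inclusive window boundaries are hypothetical. The look-back windows are the control chart cut signals (in weeks) from DCP Appendix 7, with the 30-day window for emergency department presentations. Step 5 (separating screening from diagnostic tests) and the use of referral dates as the index contact are not modelled.

```python
from datetime import date, timedelta

# Look-back windows in weeks, from the control chart cut signals (DCP Appendix 7).
LOOKBACK_WEEKS = {
    "FOBT": 45,
    "Abdominal X-ray": 53,
    "Abdominal CT": 60,
    "Abdominal US": 54,
    "Barium Enema": 28,
    "Colonoscopy": 36,
    "Sigmoidoscopy": 32,
}
ED_LOOKBACK_DAYS = 30  # emergency department presentations


def diagnostic_interval(encounters, diagnosis_date):
    """Days from the earliest in-window encounter to diagnosis.

    encounters: list of (test_group, encounter_date) pairs (hypothetical layout).
    Returns None when no encounter falls inside its look-back window.
    """
    eligible = []
    for group, when in encounters:
        if group == "Emergency Department":
            window = timedelta(days=ED_LOOKBACK_DAYS)
        else:
            window = timedelta(weeks=LOOKBACK_WEEKS[group])
        # Assumption: both ends of the window are inclusive.
        if diagnosis_date - window <= when <= diagnosis_date:
            eligible.append(when)
    if not eligible:
        return None
    return (diagnosis_date - min(eligible)).days


# Hypothetical patient: the FOBT lies outside its 45-week window and is ignored.
patient = [
    ("FOBT", date(2009, 5, 1)),
    ("Abdominal CT", date(2010, 3, 1)),
    ("Colonoscopy", date(2010, 4, 20)),
]
print(diagnostic_interval(patient, date(2010, 6, 1)))  # 92
```

Keeping the windows in one dictionary means a revised control chart cut-off only has to be changed in one place.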


DCP APPENDIX-7-Control Chart Cut Signals

Dxgroup | Description | Fee Codes | Control Chart Cut Signal (weeks), Rules 1-4 | Signal Strength (%)
1 | FOBT | OHIP: G004, L181, L179 | 45 | 67.9
2 | Abdominal X-ray | OHIP: X100, X101; CCI: 3OT10^^ | 53 | 86.7
3 | Abdominal CT | OHIP: X126, X409, X410; CCI: 3OT20^^ | 60 | 88.4
4 | Abdominal US | OHIP: J128, J428, J135, J435; CCI: 3OT30^^ | 54 | 63.8
5 | Barium Enema | OHIP: X112, X113; CCI: 3NM10^^ | 28 | 91.3
6 | Colonoscopy | OHIP: Z491, Z492, Z493, Z494, Z495, Z496, Z497, Z498, Z499, Z570, Z571 or Z555 plus one of E740, E741, E747, E705 on the same day | 36 | 97.1
7 | Sigmoidoscopy | OHIP: Rigid- Z535, Z536; OHIP: Flexible- Z580 or Z555 without E740, E741, E747, E705 on the same day | 32 | 94.7
8 | Emergency Department Presentation | Record of any listed NACRS ED dxcodes within 30 days prior to CRCdx date where schededvisit ≠ Y | N/A | N/A

Calculating Signal Strength

Mean background rate = “true” negatives per week
Signal Strength = “true” positives / all positives

All positives = C = total # of encounters from 0 to X weeks, where X is the week we choose as the cut-off
“True” negatives = B = (mean background encounter rate)*(# of weeks included)
“True” positives = A = C-B

Example:
Let the mean background encounter rate/wk = 10
Let the cutoff week that we chose = 12 wks
Let all positive encounters = 600 (from 0-12 wks)
Therefore, C = 600
B = (10)*(12 wks) = 120
A = (600)-(120) = 480

Signal Strength = (480/600)*(100) = 80%
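The calculation above can be written as a short Python function. This is a sketch only: the function and argument names are hypothetical and are not taken from the project's code.

```python
def signal_strength(all_positives, mean_background_rate, cutoff_weeks):
    """Return the signal strength as a percentage.

    all_positives        -- C, encounters counted from week 0 to the cut-off week
    mean_background_rate -- mean background encounters per week
    cutoff_weeks         -- number of weeks included before the cut-off
    """
    true_negatives = mean_background_rate * cutoff_weeks  # B
    true_positives = all_positives - true_negatives       # A = C - B
    return 100.0 * true_positives / all_positives


# Worked example from the text: rate 10/week, cut-off 12 weeks, 600 encounters.
print(signal_strength(all_positives=600, mean_background_rate=10, cutoff_weeks=12))  # 80.0
```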


DCP References

1. Singh H, De Coster C, Shu E, Fradette K, Latosinky S, Pitz M, Cheang M, Turner D. Wait times from presentation to treatment for colorectal cancer: a population-based study. Canadian Journal of Gastroenterology. 2010; 24(1):33-39.
2. Paterson WG, Depew WT, Pare P, Petrunia D, Switzer C, Veldhuyzen van Zanten SJ, Daniels S. Canadian consensus on medically acceptable wait times for digestive health care for the Canadian Association of Gastroenterology Wait Times Consensus Group. Membership of the Consensus Group. Canadian Journal of Gastroenterology. 2006; 20(6):411-423.
3. Stewart SL, Wike JM, Kato I, Lewis DR, Michaud F. A population-based study of colorectal cancer histology in the United States, 1998-2001. Cancer (supplement). 2006; 107(5):1128-1141.
4. El-Serag HB, Petersen L, Hampel H, Richardson P, Cooper G. The use of screening colonoscopy for patients cared for by the Department of Veterans Affairs. Archives of Internal Medicine. 2006; 166(20):2202-2208.
5. Rabeneck L, Paszat LF, Rothwell DM, He J. Temporal trends in new diagnoses of colorectal cancer with obstruction, perforation, or emergency admission in Ontario: 1993-2001. American Journal of Gastroenterology. 2005; 100(3):672-676.


Appendix B List of CSDs and Respective RIO Scores

B-1: List of CSDs and Respective RIO Scores (1), Organized by RIO Category

RIO 0-9
Ajax: 3; Aurora: 5; Barrie: 8; Belleville: 6; Brampton: 2; Brantford: 3; Burlington: 2; Caledon: 8; Cambridge: 4; Greater Sudbury/Grand Sudbury: 3; Guelph: 4; Halton Hills: 7; Hamilton: 0; Kingston: 0; Kitchener: 5; London: 0; Markham: 2; Milton: 7; Mississauga: 0; Newmarket: 4; Niagara Falls: 8; Oakville: 2; Oshawa: 5; Ottawa: 0; Pickering: 5; Richmond Hill: 2; St. Catharines: 6; St. Thomas: 7; Thunder Bay: 0; Toronto: 0; Vaughan: 6; Waterloo: 7; Welland: 7; Whitby: 6; Windsor: 0

RIO 10-30
Amherstburg: 20; Aylmer: 27; Bradford West Gwillimbury: 20; Brant: 16; Brockville: 20; Carleton Place: 24; Central Elgin: 23; Centre Wellington: 25; Chatham-Kent: 11; Clarington: 10; Cornwall: 12; East Gwillimbury: 21; Essa: 29; Essex: 23; Fort Erie: 21; Georgina: 14; Greater Napanee: 29; Grimsby: 20; Guelph/Eramosa: 28; Haldimand County: 17; Ingersoll: 28; Innisfil: 17; Kawartha Lakes: 28; King: 24; Kingsville: 24; Lakeshore: 18; LaSalle: 13; Leamington: 24; Lincoln: 20; Loyalist: 24; Malahide: 30; Middlesex Centre: 28; New Tecumseth: 22; Niagara-on-the-Lake: 25; Norfolk County: 20; North Bay: 14; North Dumfries: 29; Orangeville: 22; Orillia: 27; Owen Sound: 27; Pelham: 24; Peterborough: 11; Port Colborne: 25; Prince Edward: 28; Puslinch: 29; Quinte West: 13; Sarnia: 10; Sault Ste. Marie: 24; Scugog: 26; South Frontenac: 25; Stratford: 23; Strathroy-Caradoc: 24; Tecumseh: 14; Thames Centre: 30; Thorold: 23; Tillsonburg: 29; Timmins: 29; Uxbridge: 27; Whitchurch-Stouffville: 18; Wilmot: 26; Woodstock: 18; Woolwich: 26


RIO 31-45
Adelaide Metcalfe: 40; Adjala-Tosorontio: 37; Alfred and Plantagenet: 44; Amaranth: 44; Arnprior: 32; Asphodel-Norwood: 44; Athens: 37; Augusta: 34; Bayham: 36; Beckwith: 31; Blandford-Blenheim: 36; Brighton: 39; Brock: 40; Brooke-Alvinston: 43; Casselman: 39; Cavan-Millbrook-North Monaghan: 36; Central Frontenac: 44; Centre Hastings: 39; Chatsworth: 43; Clarence-Rockland: 34; Clearview: 37; Cobourg: 34; Collingwood: 37; Conmee: 40; Cramahe: 45; Dawn-Euphemia: 45; Deseronto: 33; Douro-Dummer: 42; Drummond/North Elmsley: 37; Dutton/Dunwich: 37; East Ferris: 45; East Garafraxa: 40; East Luther Grand Valley: 45; East Zorra-Tavistock: 37; Edwardsburgh/Cardinal: 38; Elizabethtown-Kitley: 34; Enniskillen: 40; Erin: 31; Front of Yonge: 36; Frontenac Islands: 33; Gananoque: 32; Georgian Bluffs: 41; Gillies: 41; Grey Highlands: 45; Hamilton: 39; Hanover: 45; Hawkesbury: 45; Lambton Shores: 38; Lanark Highlands: 43; Leeds and the Thousand Islands: 35; Lucan Biddulph: 37; Mapleton: 39; Markstay-Warren: 44; McNab/Braeside: 38; Meaford: 40; Merrickville-Wolford: 39; Midland: 38; Mississippi Mills: 32; Mono: 38; Montague: 38; Mulmur: 44; Neebing: 40; North Dundas: 40; North Glengarry: 43; North Grenville: 33; North Middlesex: 40; North Perth: 42; North Stormont: 40; Norwich: 35; O'Connor: 39; Oil Springs: 40; Oliver Paipoonge: 31; Oro-Medonte: 33; Otonabee-South Monaghan: 39; Penetanguishene: 43; Perth East: 38; Perth South: 42; Perth: 36; Petrolia: 34; Plympton-Wyoming: 36; Point Edward: 34; Port Hope: 36; Prescott: 34; Ramara: 44; Renfrew: 39; Rideau Lakes: 38; Russell: 37; Shelburne: 40; Shuniah: 42; Smith-Ennismore-Lakefield: 32; Smiths Falls: 33; South Dundas: 40; South Glengarry: 40; South Huron: 40; South Stormont: 35; Southwest Middlesex: 40; South-West Oxford: 35; Southwold: 32; Springwater: 32; St. Clair: 35; St. Marys: 35; Stirling-Rawdon: 41; Stone Mills: 37; Tay Valley: 44; Tay: 42; The Nation/La Nation: 40; Tiny: 43; Trent Hills: 41; Tweed: 45; Tyendinaga: 34; Wainfleet: 34; Warwick: 40; Wasaga Beach: 34; Wellesley: 35; Wellington North: 42; West Elgin: 40; West Grey: 44; West Lincoln: 31; West Nipissing/Nipissing Ouest: 45; West Perth: 43; Westport: 39; Zorra: 36


RIO 46-55
Admaston/Bromley: 52; Alnwick/Haldimand: 47; Arran-Elderslie: 48; Baldwin: 50; Blue Mountains: 47; Bluewater: 48; Bonfield: 50; Brockton: 50; Callander: 47; Central Huron: 51; Champlain: 46; Chisholm: 53; East Hawkesbury: 53; Espanola: 46; French River/Rivière des Français: 55; Galway-Cavendish and Harvey: 50; Goderich: 52; Gravenhurst: 50; Greater Madawaska: 55; Havelock-Belmont-Methuen: 52; Horton: 47; Howick: 53; Huron East: 47; Kincardine: 52; Madoc: 47; Marmora and Lake: 52; Mattawa: 55; Melancthon: 47; Minto: 46; Morris-Turnberry: 54; Nipissing: 55; North Huron: 54; North Kawartha: 55; Pembroke: 51; Powassan: 51; Prince: 54; Saugeen Shores: 46; Severn: 46; South Bruce Peninsula: 48; South Bruce: 55; South River: 55; Southgate: 48; St.-Charles: 50; Tudor and Cashel: 54; Whitewater Region: 53

RIO 56-75
Addington Highlands: 56; Armour: 69; Ashfield-Colborne-Wawanosh: 58; Assiginack: 75; Bancroft: 62; Black River-Matheson: 66; Bonnechere Valley: 59; Bracebridge: 57; Bruce Mines: 61; Brudenell, Lyndoch and Raglan: 70; Burk's Falls: 63; Calvin: 56; Carlow/Mayo: 69; Cobalt: 74; Cochrane: 69; Deep River: 70; Dysart and Others: 56; Elliot Lake: 71; Faraday: 63; Georgian Bay: 64; Hastings Highlands: 69; Highlands East: 63; Huntsville: 62; Huron Shores: 67; Huron-Kinloss: 57; Iroquois Falls: 60; Johnson: 63; Killaloe, Hagarty and Richards: 69; Kirkland Lake: 70; Laird: 60; Laurentian Hills: 73; Laurentian Valley: 60; Macdonald, Meredith and Aberdeen Additional: 59; Machar: 60; Madawaska Valley: 73; Magnetawan: 71; McDougall: 71; McKellar: 73; McMurrich/Monteith: 73; Minden Hills: 61; Muskoka Lakes: 69; Nipigon: 58; North Algona Wilberforce: 61; North Frontenac: 56; North Shore: 71; Northeastern Manitoulin and the Islands: 68; Northern Bruce Peninsula: 64; Papineau-Cameron: 68; Parry Sound: 65; Perry: 70; Petawawa: 57; Plummer Additional: 64; Red Rock: 57; Ryerson: 71; Sables-Spanish Rivers: 56; Seguin: 70; Spanish: 64; St. Joseph: 64; Strong: 62; Sundridge: 57; Temagami: 66; The Archipelago: 74; Thessalon: 62; Timiskaming Shores: 74; Whitestone: 73; Wollaston: 60


RIO 76+
Alberton: 100; Algonquin Highlands: 77; Armstrong: 84; Atikokan: 83; Billings: 78; Blind River: 78; Carling: 76; Central Manitoulin: 77; Chapleau: 87; Chapple: 100; Charlton and Dack: 83; Dawson: 100; Dryden: 91; Dubreuilville: 100; Ear Falls: 100; Emo: 99; Englehart: 76; Fauquier-Strickland: 86; Fort Frances: 91; Gore Bay: 78; Greenstone: 79; Harley: 85; Harris: 83; Hearst: 95; Hornepayne: 100; Ignace: 96; Kapuskasing: 84; Kearney: 76; Kenora: 80; La Vallee: 100; Lake of Bays: 76; Larder Lake: 83; Machin: 100; Manitouwadge: 99; Marathon: 97; Mattice-Val Côté: 100; McGarry: 84; Michipicoten: 93; Moonbeam: 89; Rainy River: 95; Red Lake: 98; Schreiber: 86; Sioux Lookout: 97; Sioux Narrows-Nestor Falls: 96; Smooth Rock Falls: 79; South Algonquin: 78; Terrace Bay: 92; Val Rita-Harty: 95; White River: 100


Appendix C Comparison of the Diagnostic Interval between Patients with and without a Referral for First Test

A small portion of the cohort (n=3,955 or 14.2%) did not have an identifiable referral for their first test. Because the referral for the first test is typically the index contact date, we expected the median diagnostic interval of patients without a referral to be considerably shorter than that of patients with a referral. Table C-1 shows that this was true for all tests except sigmoidoscopy, and highlights that our reported diagnostic interval was a conservative measure (cases without referrals would have had an overall shortening effect on the diagnostic interval). The test with the greatest difference in median diagnostic interval between cases with and without a referral was colonoscopy (the median was 44 days shorter for cases without a referral). Although the median diagnostic interval was shorter for colonoscopies without a referral, their 90th percentile diagnostic interval was longer (271 days versus 255 days) than for colonoscopies with a referral. Colonoscopy was the only test with this kind of discordance between the median and 90th percentile diagnostic intervals.

The proportion of cases without referrals was very similar across the first five RIO groups (13.1% to 16.5%), but it was higher in the last RIO group (22.8%) (p<0.0001), which partially explains the shorter intervals in that group.


C-1: The median and 90th percentile diagnostic interval (days) of patients with and without a referral for their first test

First Test | Statistic | With Referral (n=23,987) | Without Referral (n=3,955)
Colonoscopy | Median | 66 | 22
Colonoscopy | 90th percentile | 255 | 271
FOBT | Median | 85 | NA
FOBT | 90th percentile | 248 | NA
Abdominal US | Median | 134 | 115
Abdominal US | 90th percentile | 352 | 323
Abdominal X-Ray | Median | 30 | 8
Abdominal X-Ray | 90th percentile | 275 | 241
Abdominal CT | Median | 49.5 | 16
Abdominal CT | 90th percentile | 387 | 350
ED NOS | Median | NA | 2
ED NOS | 90th percentile | NA | 22
Sigmoidoscopy | Median | 55 | 65
Sigmoidoscopy | 90th percentile | 278 | 387
Barium Enema | Median | 44 | 40
Barium Enema | 90th percentile | 150 | 80

Abbreviations: FOBT fecal occult blood test; US ultrasound; CT computed tomography; ED NOS emergency department not otherwise specified
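The comparisons drawn from Table C-1 can be checked with a few lines of Python. This is an illustrative sketch: the dictionary layout and function name are hypothetical, and the numbers are copied from the table. FOBT and ED NOS are left out because one of their columns is reported as NA.

```python
# (with referral, without referral) values in days, copied from Table C-1.
TABLE_C1 = {
    "Colonoscopy": {"median": (66, 22), "p90": (255, 271)},
    "Abdominal US": {"median": (134, 115), "p90": (352, 323)},
    "Abdominal X-Ray": {"median": (30, 8), "p90": (275, 241)},
    "Abdominal CT": {"median": (49.5, 16), "p90": (387, 350)},
    "Sigmoidoscopy": {"median": (55, 65), "p90": (278, 387)},
    "Barium Enema": {"median": (44, 40), "p90": (150, 80)},
}


def referral_gaps(table):
    """Return {test: (median gap, 90th percentile gap)}.

    Each gap is the with-referral value minus the without-referral value,
    so a positive gap means the interval is shorter without a referral.
    """
    return {
        test: (v["median"][0] - v["median"][1], v["p90"][0] - v["p90"][1])
        for test, v in table.items()
    }


gaps = referral_gaps(TABLE_C1)
largest_median_gap = max(gaps, key=lambda t: gaps[t][0])
not_shorter = [t for t, (m, _) in gaps.items() if m <= 0]
discordant = [t for t, (m, p) in gaps.items() if m * p < 0]

print(largest_median_gap, gaps[largest_median_gap][0])  # Colonoscopy 44
print(not_shorter)                                      # ['Sigmoidoscopy']
print(discordant)                                       # ['Colonoscopy']
```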


Appendix D Control Charts

D-1: Control charts used to determine the relevant look-back periods for each key CRC test. FOBT is shown here; the other key CRC tests are on the following pages. The y-axis (“weekly count”) is the weekly count of the respective CRC test (FOBT here) using data from the entire study cohort. The x-axis (“week”) is the number of weeks prior to CRC diagnosis, i.e. week 0 is the week of the CRC diagnosis. The panel on the left is period one (approximately the 0-18 months, or 0-78 weeks, directly preceding the CRC diagnosis) and the panel on the right is period two, the background (historical) data (approximately the 18-24 months, or 78-104 weeks, preceding the CRC diagnosis). Dashed blue lines represent three standard deviations from the background mean (determined from period two) and the purple line is the weekly test count. Rules for interpreting the control charts were applied to these data.


Appendix E Rules for Interpreting Control Charts

Control charts were used in this project to determine the relevant look-back period in which to collect key CRC tests. The cut-off point (how far back to look) was determined by applying the rules below, which identify whether a data point or set of data points is “out of control”. The first week to which rules 1-4 could no longer be applied was taken as the cut-off point, signaling that the data were now “in control” and similar to the background rate.

The rules were applied to the data in a spreadsheet format (starting with week zero, the closest time point to the CRC diagnosis date) for each key CRC test.

Rules for interpreting control charts (2)

1. Whenever a single point falls outside 3 standard deviations from the centre line

2. Whenever at least 2 of 3 successive points fall on the same side of the centre line and are more than 2 standard deviations away from the centre line

3. Whenever at least 4 of 5 successive points fall on the same side of the centre line and are more than 1 standard deviation away from the centre line

4. Whenever at least 8 successive points fall on the same side of the centre line
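The four rules and the search for the cut-off week can be sketched in Python. This is one possible reading, not the spreadsheet procedure used in the project. It assumes that the weekly counts are ordered from week 0 backwards, and that each rule is evaluated on the window that starts at the week being tested and extends further back in time. The function names and the example counts are hypothetical. The centre line and standard deviation would come from the background data (period two).

```python
def is_out_of_control(counts, i, centre, sd):
    """Return True if any of rules 1-4 applies to the point at index i."""

    def same_side(window, k):
        # Largest number of points lying more than k standard deviations
        # from the centre line on one side (above or below).
        above = sum(1 for c in window if c > centre + k * sd)
        below = sum(1 for c in window if c < centre - k * sd)
        return max(above, below)

    # (window size, minimum number of points, distance in standard deviations)
    rules = [(1, 1, 3), (3, 2, 2), (5, 4, 1), (8, 8, 0)]
    for size, minimum, k in rules:
        window = counts[i:i + size]
        if len(window) == size and same_side(window, k) >= minimum:
            return True
    return False


def find_cutoff_week(counts, centre, sd):
    """First week (index) to which none of rules 1-4 applies, else None."""
    for week in range(len(counts)):
        if not is_out_of_control(counts, week, centre, sd):
            return week
    return None


# Hypothetical weekly counts, week 0 first, with background mean 10 and SD 2.
weekly_counts = [50, 40, 30, 20, 17, 15, 15, 12, 11, 9, 10, 11, 9, 10, 12, 8]
print(find_cutoff_week(weekly_counts, centre=10, sd=2))  # 6
```

In this example weeks 0-4 are flagged by rule 1 and week 5 by rule 2, so week 6 is the first week that is "in control".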


Appendix F Signal Strength Calculation for Relevant Look-Back Periods

$$\text{Signal strength} = \frac{\text{``true'' positives}}{\text{all positives}}$$

Where,
all positives = total number of encounters from 0 - X weeks determined from the control chart relevant look-back period
“true” negatives = (mean background rate)*(number of weeks determined from the control chart relevant look-back period)
“true” positives = all positives - “true” negatives

Use Table 3-1 for relevant look-back periods
Use Table F-1 (below) for mean background rate

Example calculation of the signal strength for colonoscopy:
all positives = total number of encounters from 0 - 36 weeks = 24313 (data not shown)
“true” negatives = (19.63)*(36) = 706.68
“true” positives = 24313 - 706.68 = 23606.3

$$\text{Signal strength (colonoscopy)} = \frac{23606.3}{24313} = 97.1\%$$

Table F-1: A comparison of the mean number of investigations/week between period 1 and 2 using the control chart data. FOBT and abdominal US are marked with an asterisk (*) to show the elevated mean background rates.

Key CRC Test | Mean number of investigations/week in Period 1 (0-18 months prior to CRC diagnosis) | Mean number of investigations/week in Period 2 (background) (18-24 months prior to CRC diagnosis)
FOBT* | 170.79 | 76.71
Abdominal X-ray | 146.89 | 27.44
Abdominal CT | 194.80 | 28.44
Abdominal US* | 195.86 | 86.85
Barium Enema | 28.57 | 5.96
Colonoscopy | 316.32 | 19.63
Sigmoidoscopy | 41.71 | 5.00
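One way to see how the background rate compares with the pre-diagnostic rate is to divide the Period 1 mean by the Period 2 mean for each test. The sketch below is illustrative only. The ratio is not a measure used in the thesis, and the names are hypothetical. The numbers are copied from Table F-1.

```python
# (Period 1 mean, Period 2 background mean) investigations per week, from Table F-1.
TABLE_F1 = {
    "FOBT": (170.79, 76.71),
    "Abdominal X-ray": (146.89, 27.44),
    "Abdominal CT": (194.80, 28.44),
    "Abdominal US": (195.86, 86.85),
    "Barium Enema": (28.57, 5.96),
    "Colonoscopy": (316.32, 19.63),
    "Sigmoidoscopy": (41.71, 5.00),
}


def period_ratios(table):
    """Return {test: Period 1 mean divided by Period 2 (background) mean}."""
    return {test: p1 / p2 for test, (p1, p2) in table.items()}


ratios = period_ratios(TABLE_F1)
for test in sorted(ratios, key=ratios.get):
    print(f"{test}: {ratios[test]:.1f}")
```

FOBT and abdominal US have the two smallest ratios (both a little above 2), in line with their elevated background rates, while colonoscopy has the largest.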


Appendix G Linearity Assumption Tests for Age, Comorbidities and Deprivation Quintile (Quantile Regression Models)

Four continuous measures (age, major comorbidities, minor comorbidities and deprivation quintile) were each plotted against the outcome (diagnostic interval) using side-by-side boxplots to assess whether the linearity assumption held. When the median diagnostic interval was plotted by age (Figure G-1), there were variations at the extremes (youngest and oldest); for this reason, grouping age was more appropriate. Six age categories (<45, 45-54, 55-64, 65-74, 75-84 and 85+) were created based on this figure. Side-by-side boxplots of the diagnostic interval by the number of major comorbidities (ADGs) showed a monotonic increase in the median diagnostic interval with increasing number of ADGs (Figure G-2). The interquartile range also followed this increasing trend. The same relationship was found between the median diagnostic interval and the number of minor comorbidities (ADGs) (Figure G-3); because a linear trend was present, both variables were kept as continuous measures. In Figure G-4, there was almost no difference in the median diagnostic interval across deprivation quintiles 1 to 5. Interquartile ranges and maximum/minimum values were also similar. Since there was no linear effect, the deprivation quintile variable was used as a categorical variable (quintiles 1 to 5).
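The six age categories can be expressed as a small Python helper. This is a sketch with hypothetical names, assuming that age is recorded in years and that each category includes its lower bound.

```python
from bisect import bisect_right

AGE_BREAKS = [45, 55, 65, 75, 85]  # lower bounds of the second to sixth categories
AGE_LABELS = ["<45", "45-54", "55-64", "65-74", "75-84", "85+"]


def age_category(age_years):
    """Map an age in years to one of the six age categories."""
    return AGE_LABELS[bisect_right(AGE_BREAKS, age_years)]


print([age_category(a) for a in (44, 45, 64, 65, 84, 85)])
# ['<45', '45-54', '55-64', '65-74', '75-84', '85+']
```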


[Box plot. Y-axis: diagnostic interval (days); x-axis: age at diagnosis]

Figure G-1: Distribution of the diagnostic interval (days) by age. Box plot specifics: horizontal line within box represents the median (50th percentile); “◊” marker within box represents the mean; bottom and top edges of box represent the 25th percentile (quartile 1) and 75th percentile (quartile 3), respectively, or the interquartile range; whiskers represent the minimum and maximum observations

[Box plot. Y-axis: diagnostic interval (days); x-axis: number of major Aggregated Diagnosis Groups]

Figure G-2: Distribution of the diagnostic interval (days) by number of major ADGs (comorbidities). Box plot specifics: horizontal line within box represents the median (50th percentile); “◊” marker within box represents the mean; bottom and top edges of box represent the 25th percentile (quartile 1) and 75th percentile (quartile 3), respectively, or the interquartile range; whiskers represent the minimum and maximum observations

[Box plot. Y-axis: diagnostic interval (days); x-axis: number of minor Aggregated Diagnosis Groups (ADGs)]

Figure G-3: Distribution of the diagnostic interval (days) by number of minor ADGs (comorbidities). Box plot specifics: horizontal line within box represents the median (50th percentile); “◊” marker within box represents the mean; bottom and top edges of box represent the 25th percentile (quartile 1) and 75th percentile (quartile 3), respectively, or the interquartile range; whiskers represent the minimum and maximum observations

[Box plot. Y-axis: diagnostic interval (days); x-axis: deprivation quintile]

Figure G-4: Distribution of the diagnostic interval (days) by deprivation quintile. Box plot specifics: horizontal line within box represents the median (50th percentile); “◊” marker within box represents the mean; bottom and top edges of box represent the 25th percentile (quartile 1) and 75th percentile (quartile 3), respectively, or the interquartile range; whiskers represent the minimum and maximum observations


Appendix H Linearity Assumption Tests for Comorbidities (logistic regression model)

Major and minor comorbidity variables were assessed for the linearity assumption to ensure that they could be analyzed as continuous measures. Beta (β) estimates (coefficients) from the unadjusted model were plotted, and a linear equation and R² value were generated (see Figures H-1 and H-2). The linearity assumption was more clearly met for the major comorbidity variable (R²=0.98) than for the minor comorbidity variable (R²=0.69). To confirm that the minor comorbidity variable could be analyzed as continuous, we determined that the maximum discrepancy between the expected OR (from the linear equation) and the observed OR (from the model) was 0.05. We were willing to accept this difference and analyzed the minor comorbidity variable as continuous.
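The check described above (fit a line to the β estimates, read off R², then compare expected and observed ORs) can be sketched in plain Python. This is an illustration under assumptions, not the analysis code. The function name is hypothetical, and the β values in the example are made-up placeholders, because the individual estimates are not tabulated here.

```python
import math


def linearity_check(xs, betas):
    """Least-squares line through (x, beta) points.

    Returns (slope, intercept, r_squared, max_or_gap), where max_or_gap is the
    largest absolute difference between exp(fitted beta) and exp(observed beta).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(betas) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, betas))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * x for x in xs]
    ss_res = sum((y - f) ** 2 for y, f in zip(betas, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in betas)
    r_squared = 1 - ss_res / ss_tot
    max_or_gap = max(abs(math.exp(f) - math.exp(y)) for y, f in zip(betas, fitted))
    return slope, intercept, r_squared, max_or_gap


# Placeholder values for illustration only (not the thesis estimates).
adg_counts = [1, 2, 3, 4, 5]
beta_estimates = [-0.15, -0.27, -0.33, -0.44, -0.50]
slope, intercept, r_squared, max_or_gap = linearity_check(adg_counts, beta_estimates)
print(f"y = {slope:.4f}x + {intercept:.4f}, R^2 = {r_squared:.2f}, max OR gap = {max_or_gap:.3f}")
```

A small maximum OR gap indicates that treating the variable as continuous changes the implied odds ratios very little, which is the criterion applied to the minor comorbidity variable.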

[Plot. Y-axis: β estimates; x-axis: number of major ADGs (unadjusted model). Fitted line: y = -0.0856x - 0.0827, R² = 0.9806]

H-1: Beta (β) estimates from the unadjusted logistic regression model for major comorbidity (number of major ADGs). The linearity assumption is met (R²=0.98)


[Plot. Y-axis: β estimates; x-axis: number of minor ADGs (unadjusted model). Fitted line: y = -0.0269x - 0.2606, R² = 0.6946]

H-2: Beta (β) estimates from the unadjusted logistic regression model for minor comorbidity (number of minor ADGs). The linearity assumption is met (R²=0.69)


Appendix I Ethics Approval


Appendix J Description of the CRC Cohort by Stage (I-IV)

J-1: Description of the CRC cohort by stage (I–IV) (Jan 01, 2007 to May 31, 2012; n=25,909). Values are % unless otherwise noted; p-values are from the χ² test.

Characteristic | Stage I (n=6,147) | Stage II (n=7,277) | Stage III (n=7,830) | Stage IV (n=4,655) | p-value
RIO group | | | | | 0.28
  0-9 (least rural) | 65.7 | 64.4 | 65.7 | 66.7 |
  10-30 | 18.2 | 18.1 | 17.9 | 16.8 |
  31-45 | 9.8 | 10.2 | 10.0 | 10.1 |
  46-55 | 2.5 | 2.7 | 2.6 | 2.8 |
  56-75 | 2.7 | 3.3 | 2.6 | 2.6 |
  76+ (most rural) | 1.0 | 1.2 | 1.1 | 1.0 |
Age at diagnosis (years) | | | | | <0.0001
  <45 | 2.7 | 3.1 | 4.4 | 4.7 |
  45-54 | 10.5 | 8.5 | 12.0 | 13.5 |
  55-64 | 23.6 | 18.4 | 22.4 | 23.2 |
  65-74 | 29.8 | 28.3 | 27.6 | 25.9 |
  75-84 | 26.1 | 29.5 | 24.3 | 24.3 |
  85+ | 7.4 | 12.1 | 9.3 | 8.5 |
Male sex | 56.8 | 53.2 | 54.8 | 56.0 | 0.0003
Deprivation quintile (n=25,697) | | | | | 0.0013
  1 (least deprived) | 24.7 | 22.5 | 23.5 | 22.6 |
  2 | 23.2 | 22.5 | 23.1 | 22.6 |
  3 | 21.6 | 21.5 | 21.4 | 20.7 |
  4 | 17.6 | 18.8 | 17.3 | 18.7 |
  5 (most deprived) | 12.8 | 14.8 | 14.8 | 15.4 |
Recent immigrant status | 4.1 | 4.3 | 4.2 | 4.2 | 0.89, 0.82*
Major comorbidity | | | | | <0.0001
  0 ADGs | 53.1 | 56.4 | 58.0 | 63.0 |
  1 ADGs | 30.7 | 28.9 | 29.1 | 25.5 |
  2 ADGs | 11.1 | 10.2 | 9.1 | 8.2 |
  3 ADGs | 3.7 | 3.2 | 2.9 | 2.3 |
  4+ ADGs | 1.4 | 1.4 | 1.0 | 0.9 |
Minor comorbidity | | | | | <0.0001
  0 ADGs | 7.4 | 10.2 | 11.0 | 14.8 |
  1 ADGs | 12.4 | 13.5 | 13.6 | 15.7 |
  2 ADGs | 15.5 | 16.3 | 16.8 | 17.7 |
  3 ADGs | 16.7 | 16.5 | 15.9 | 15.0 |
  4 ADGs | 14.2 | 14.4 | 13.3 | 12.4 |
  5 ADGs | 11.3 | 10.6 | 10.6 | 9.4 |
  6 ADGs | 8.9 | 7.3 | 7.1 | 6.1 |
  7 ADGs | 5.7 | 5.1 | 5.0 | 3.7 |
  8 ADGs | 3.6 | 2.8 | 3.2 | 2.2 |
  9 ADGs | 2.0 | 1.8 | 1.6 | 1.6 |
  10+ ADGs | 2.3 | 1.7 | 1.8 | 1.4 |
CRC sub-site | | | | | <0.0001
  Proximal Colon | 31.3 | 45.1 | 39.3 | 40.0 |
  Distal Colon | 31.7 | 23.8 | 24.2 | 26.9 |
  Rectum | 26.5 | 19.4 | 28.0 | 21.1 |
  Unspecified CRC | 10.5 | 11.8 | 8.4 | 12.0 |

Abbreviations: RIO Rurality Index for Ontario; ADGs Johns Hopkins Aggregated Diagnosis Groups; CRC colorectal cancer
*Cochran-Armitage Trend Test (2-sided)


Appendix K Median Quantile Regression Results for Factors Associated with the Diagnostic Interval

Table K-1: Stage unknown (n=2,017) median quantile regression results for factors associated with the diagnostic interval for unadjusted and adjusted models

Factor | Unadjusted difference (days) in median diagnostic interval (95% CI) | Adjusted‡ difference (days) in median diagnostic interval (95% CI)
RIO
  0-9 (least rural) | Ref | Ref
  10-30 | 13 (-0.88, 26.9) | 14.3 (-0.16, 28.8)
  31-45 | -1 (-17.3, 15.3) | -1.4 (-19.5, 16.8)
  46-55 | -9 (-38.3, 20.3) | -6.8 (-31.7, 18.0)
  56-75 | 4 (-31.7, 39.7) | -6.8 (-38.2, 24.6)
  76+ (most rural) | -30 (-62.6, 2.6) | -26.4 (-61.2, 8.4)
Age at diagnosis (years)
  <45 | 1 (-19.2, 21.2) | 7.2 (-10.4, 24.9)
  45-54 | -8 (-25.4, 9.4) | 5.3 (-11.7, 22.3)
  55-64 | -4 (-15.3, 7.3) | 0.23 (-13.2, 13.6)
  65-74 | Ref | Ref
  75-84 | -8 (-21.7, 5.7) | -13.0 (-26.3, 0.2)
  85+ | -22 (-44.7, 0.66) | -28.7 (-49.0, -8.4)
Sex
  Male | -7 (-16.5, 2.5) | 0.88 (-8.2, 10.0)
  Female | Ref | Ref
Deprivation quintile
  1 (least deprived) | Ref | Ref
  2 | 4 (-10.7, 18.7) | -0.67 (-15.0, 13.7)
  3 | 10 (-3.4, 23.4) | 11.1 (-3.0, 25.2)
  4 | 11 (-0.78, 22.8) | 12.7 (-1.2, 26.6)
  5 (most deprived) | 14 (-5.5, 33.5) | 3.7 (-11.8, 19.2)
Major comorbidity | 11 (3.6, 18.4) | 5.5 (-2.8, 13.9)
Minor comorbidity | 7 (4.5, 9.5) | 7.5 (5.2, 9.9)
Recent immigrant status
  Yes | -4 (-33.6, 25.6) | 5.6 (-18.2, 29.4)
  No | Ref | Ref
CRC sub-site
  Proximal Colon | 13 (-1.9, 27.9) | 10.8 (-2.6, 24.2)
  Distal Colon | 7 (-4.2, 18.2) | 9.0 (-3.1, 21.0)
  Rectum | Ref | Ref
  Unspecified CRC | -4 (-22.0, 14.0) | -5.3 (-20.5, 9.9)

‡Adjusted for: RIO, age at diagnosis, sex, deprivation quintile, major/minor comorbidities (ADGs: Johns Hopkins Aggregated Diagnosis Groups), recent immigrant status and CRC sub-site


Table K-2: Stage I (n=6,096) median quantile regression results for factors associated with the diagnostic interval for unadjusted and adjusted models

Factor | Unadjusted difference (days) in median diagnostic interval (95% CI) | Adjusted‡ difference (days) in median diagnostic interval (95% CI)
RIO
  0-9 (least rural) | Ref | Ref
  10-30 | 9 (0.45, 17.5) | 10.8 (1.5, 20.1)
  31-45 | -6 (-15.4, 3.4) | -1.4 (-12.5, 9.7)
  46-55 | 10 (-9.6, 29.6) | 12.9 (-5.0, 30.7)
  56-75 | -5 (-29.9, 19.9) | -2.0 (-28.5, 24.5)
  76+ (most rural) | -38 (-61.9, -14.1) | -24.8 (-53.0, 3.4)
Age at diagnosis (years)
  <45 | -23 (-50.2, 4.2) | -15.4 (-36.0, 5.2)
  45-54 | -7 (-20.4, 6.4) | 0.43 (-13.2, 14.0)
  55-64 | 2 (-8.4, 12.4) | 4.9 (-4.1, 13.8)
  65-74 | Ref | Ref
  75-84 | -6 (-15.9, 3.9) | -8.5 (-18.0, 1.0)
  85+ | -28 (-45.2, -10.8) | -30.8 (-45.8, -15.8)
Sex
  Male | 1 (-6.0, 8.0) | 0.83 (-6.3, 7.9)
  Female | Ref | Ref
Deprivation quintile
  1 (least deprived) | Ref | Ref
  2 | 0 (-10.3, 10.3) | -0.49 (-10.5, 9.5)
  3 | -6 (-16.6, 4.6) | -2.9 (-13.8, 8.0)
  4 | -4 (-14.4, 6.4) | -5.6 (-17.3, 6.1)
  5 (most deprived) | -3 (-16.1, 10.1) | -2.7 (-15.6, 10.1)
Major comorbidity | 13 (9.1, 16.9) | 8.7 (3.6, 13.8)
Minor comorbidity | 5.3 (3.9, 6.8) | 3.9 (2.2, 5.6)
Recent immigrant status
  Yes | -1 (-19.4, 17.4) | 3.5 (-16.3, 23.4)
  No | Ref | Ref
CRC sub-site
  Proximal Colon | 18 (8.4, 27.6) | 12.2 (2.7, 21.7)
  Distal Colon | 13 (3.5, 22.5) | 11.5 (2.3, 20.8)
  Rectum | Ref | Ref
  Unspecified CRC | -1 (-11.0, 9.0) | -0.45 (-12.9, 12.0)

‡Adjusted for: RIO, age at diagnosis, sex, deprivation quintile, major/minor comorbidities (ADGs: Johns Hopkins Aggregated Diagnosis Groups), recent immigrant status and CRC sub-site


Table K-3: Stage II (n=7,223) median quantile regression results for factors associated with the diagnostic interval for unadjusted and adjusted models

Factor | Unadjusted difference (days) in median diagnostic interval (95% CI) | Adjusted‡ difference (days) in median diagnostic interval (95% CI)
RIO
  0-9 (least rural) | Ref | Ref
  10-30 | -2 (-7.8, 3.8) | -0.05 (-5.9, 5.8)
  31-45 | -1 (-8.8, 6.8) | -1.0 (-8.9, 6.9)
  46-55 | 3 (-9.6, 15.6) | 4.6 (-8.7, 17.8)
  56-75 | -5 (-19.2, 9.2) | -6.7 (-16.9, 3.4)
  76+ (most rural) | 4 (-25.8, 33.8) | 1.8 (-32.5, 36.1)
Age at diagnosis (years)
  <45 | -12 (-27.9, 3.9) | -3.3 (-19.0, 12.4)
  45-54 | -15 (-23.7, -6.3) | -6.4 (-14.1, 1.3)
  55-64 | -7 (-12.9, -1.1) | -0.9 (-7.0, 5.2)
  65-74 | Ref | Ref
  75-84 | 0 (-6.4, 6.4) | -3.8 (-9.6, 2.1)
  85+ | -17 (-24.7, -9.3) | -23.5 (-29.8, -17.1)
Sex
  Male | -1 (-6.0, 4.0) | 2.4 (-1.9, 6.7)
  Female | Ref | Ref
Deprivation quintile
  1 (least deprived) | Ref | Ref
  2 | 0 (-6.8, 6.8) | -1.9 (-8.3, 4.6)
  3 | 2 (-5.1, 9.1) | -0.41 (-6.8, 6.0)
  4 | -1 (-8.1, 6.1) | 0.14 (-7.3, 7.6)
  5 (most deprived) | -3 (-11.0, 5.0) | -1.4 (-8.4, 5.7)
Major comorbidity | 10.3 (6.4, 14.3) | 4.8 (1.4, 8.2)
Minor comorbidity | 6.0 (4.7, 7.3) | 5.8 (4.5, 7.1)
Recent immigrant status
  Yes | -11 (-19.6, -2.4) | -7.8 (-19.2, 3.5)
  No | Ref | Ref
CRC sub-site
  Proximal Colon | 15 (8.8, 21.2) | 12.7 (6.9, 18.5)
  Distal Colon | 13 (5.6, 20.4) | 11.4 (5.1, 17.8)
  Rectum | Ref | Ref
  Unspecified CRC | 11 (2.5, 19.5) | 11.1 (4.0, 18.3)

‡Adjusted for: RIO, age at diagnosis, sex, deprivation quintile, major/minor comorbidities (ADGs: Johns Hopkins Aggregated Diagnosis Groups), recent immigrant status and CRC sub-site


Table K-4: Stage III (n=7,769) median quantile regression results for factors associated with the diagnostic interval for unadjusted and adjusted models

Factor | Unadjusted difference (days) in median diagnostic interval (95% CI) | Adjusted‡ difference (days) in median diagnostic interval (95% CI)
RIO
  0-9 (least rural) | Ref | Ref
  10-30 | 10 (3.5, 16.5) | 13.3 (6.7, 19.8)
  31-45 | 2 (-5.0, 9.0) | 5.2 (-1.0, 11.4)
  46-55 | 10 (-3.9, 23.9) | 16.1 (1.0, 31.1)
  56-75 | 2 (-13.1, 17.1) | 8.0 (-9.3, 25.3)
  76+ (most rural) | -9 (-33.1, 15.1) | -5.9 (-25.7, 13.9)
Age at diagnosis (years)
  <45 | -21 (-28.1, -13.9) | -10.8 (-21.4, -0.21)
  45-54 | -14 (-21.4, -6.6) | -8.9 (-16.2, -1.7)
  55-64 | -5 (-11.3, 1.3) | 0.0 (-6.0, 6.0)
  65-74 | Ref | Ref
  75-84 | -9 (-15.6, -2.4) | -12.5 (-18.6, -6.4)
  85+ | -8 (-20.2, 4.2) | -16.8 (-27.3, -6.4)
Sex
  Male | -4 (-8.8, 0.77) | -0.56 (-5.2, 4.1)
  Female | Ref | Ref
Deprivation quintile
  1 (least deprived) | Ref | Ref
  2 | -5 (-11.0, 1.0) | -6.7 (-13.0, -0.41)
  3 | -1 (-7.8, 5.8) | -2.1 (-8.5, 4.3)
  4 | -9 (-15.9, -2.1) | -9.3 (-15.7, -2.8)
  5 (most deprived) | -7 (-15.5, 1.5) | -8.6 (-16.8, -0.38)
Major comorbidity | 14.5 (10.2, 18.8) | 8.2 (4.2, 12.2)
Minor comorbidity | 6.6 (5.4, 7.8) | 6.1 (4.5, 7.6)
Recent immigrant status
  Yes | -1 (-14.4, 12.4) | 11.3 (-0.89, 23.6)
  No | Ref | Ref
CRC sub-site
  Proximal Colon | 7 (1.3, 12.7) | 2.0 (-3.9, 7.8)
  Distal Colon | 6 (0.55, 11.5) | 5.3 (-0.86, 11.6)
  Rectum | Ref | Ref
  Unspecified CRC | 3 (-7.3, 13.3) | 2.2 (-6.7, 11.1)

‡Adjusted for: RIO, age at diagnosis, sex, deprivation quintile, major/minor comorbidities (ADGs: Johns Hopkins Aggregated Diagnosis Groups), recent immigrant status and CRC sub-site


Table K-5: Stage IV (n=4,609) median quantile regression results for factors associated with the diagnostic interval for unadjusted and adjusted models

Factor | Unadjusted difference (days) in median diagnostic interval (95% CI) | Adjusted‡ difference (days) in median diagnostic interval (95% CI)
RIO
  0-9 (least rural) | Ref | Ref
  10-30 | 2 (-2.8, 6.8) | 2.1 (-2.7, 7.0)
  31-45 | 0 (-7.3, 7.3) | -0.67 (-7.0, 5.6)
  46-55 | 13 (0.56, 25.4) | 13.0 (3.6, 22.4)
  56-75 | 4 (-17.1, 25.1) | 4.0 (-10.1, 18.1)
  76+ (most rural) | -5 (-39.5, 29.5) | -1.2 (-26.8, 24.3)
Age at diagnosis (years)
  <45 | -14 (-23.5, -4.5) | -8.9 (-17.2, -0.60)
  45-54 | -10 (-15.8, -4.2) | -7.0 (-12.6, -1.5)
  55-64 | -12 (-16.8, -7.2) | -8.4 (-13.3, -3.6)
  65-74 | Ref | Ref
  75-84 | 0 (-6.6, 6.6) | -1.7 (-7.3, 4.0)
  85+ | 0 (-10.1, 10.1) | -2.1 (-12.3, 8.2)
Sex
  Male | -2 (-6.5, 2.5) | -2.3 (-5.9, 1.4)
  Female | Ref | Ref
Deprivation quintile
  1 (least deprived) | Ref | Ref
  2 | 3 (-2.6, 8.6) | 3.3 (-1.9, 8.4)
  3 | 0 (-6.2, 6.2) | 0.12 (-5.8, 6.0)
  4 | 2 (-4.4, 8.4) | 0.12 (-6.5, 6.8)
  5 (most deprived) | 1 (-6.5, 8.5) | -1.9 (-7.9, 4.1)
Major comorbidity | 15 (10.8, 19.2) | 5.4 (1.2, 9.6)
Minor comorbidity | 6.2 (0.50, 5.2) | 5.0 (3.7, 6.4)
Recent immigrant status
  Yes | -5 (-11.8, 1.8) | -2.4 (-10.9, 6.1)
  No | Ref | Ref
CRC sub-site
  Proximal Colon | -3 (-7.9, 1.9) | -7.3 (-12.1, -2.6)
  Distal Colon | -1 (-6.3, 4.3) | -4.9 (-10.0, 0.2)
  Rectum | Ref | Ref
  Unspecified CRC | 4 (-3.5, 11.5) | -0.73 (-9.0, 7.6)

‡Adjusted for: RIO, age at diagnosis, sex, deprivation quintile, major/minor comorbidities (ADGs: Johns Hopkins Aggregated Diagnosis Groups), recent immigrant status and CRC sub-site


Appendix L 90th Quantile Regression Results for Factors Associated with the Diagnostic Interval

Table L-1: Stage unknown (n=2,017) 90th quantile regression results for factors associated with the diagnostic interval for unadjusted and adjusted models

Factor | Unadjusted difference (days) in 90th percentile diagnostic interval (95% CI) | Adjusted‡ difference (days) in 90th percentile diagnostic interval (95% CI)
RIO
  0-9 (least rural) | Ref | Ref
  10-30 | 9.0 (-29.5, 47.5) | 3.0 (-29.9, 35.9)
  31-45 | 28.0 (-23.0, 79.0) | 16.4 (-27.1, 59.9)
  46-55 | -18.0 (-99.0, 63.0) | 18.3 (-61.6, 98.2)
  56-75 | 46.0 (-73.2, 165.2) | 75.1 (-15.3, 165.4)
  76+ (most rural) | 40.0 (-126.0, 206.0) | 71.0 (-97.6, 239.6)
Age at diagnosis (years)
  <45 | -75.0 (-172.4, 22.4) | -34.0 (-108.8, 40.8)
  45-54 | -50.0 (-100.6, 0.63) | -13.6 (-62.7, 35.4)
  55-64 | -16.0 (-52.1, 20.1) | -1.1 (-35.7, 33.5)
  65-74 | Ref | Ref
  75-84 | 12.0 (-17.8, 41.8) | -4.9 (-43.5, 33.6)
  85+ | 16.0 (-20.9, 52.9) | -1.1 (-47.3, 45.1)
Sex
  Male | -6.0 (-30.4, 18.4) | 15.0 (-11.4, 41.4)
  Female | Ref | Ref
Deprivation quintile
  1 (least deprived) | Ref | Ref
  2 | 20.0 (-81.1, 58.1) | 26.2 (-7.5, 59.9)
  3 | 20.0 (-18.6, 58.6) | -1.9 (-45.9, 42.2)
  4 | 39.0 (4.7, 73.3) | 16.8 (-21.4, 55.0)
  5 (most deprived) | 46.0 (6.3, 85.7) | 47.5 (3.8, 91.2)
Major comorbidity | 37.0 (23.3, 50.7) | 8.9 (-6.7, 24.5)
Minor comorbidity | 17.0 (12.6, 21.4) | 15.5 (10.1, 20.9)
Recent immigrant status
  Yes | 22.0 (-34.6, 78.6) | 31.9 (-46.2, 110.0)
  No | Ref | Ref
CRC sub-site
  Proximal Colon | 36.0 (8.8, 63.2) | 39.5 (4.2, 74.8)
  Distal Colon | 22.0 (-19.2, 63.2) | 26.2 (-7.2, 59.6)
  Rectum | Ref | Ref
  Unspecified CRC | 9.0 (-38.7, 56.7) | 17.2 (-26.7, 61.1)

‡Adjusted for: RIO, age at diagnosis, sex, deprivation quintile, major/minor comorbidities (ADGs: Johns Hopkins Aggregated Diagnosis Groups), recent immigrant status and CRC sub-site


Table L-2: Stage I (n=6,096) 90th quantile regression results for factors associated with the diagnostic interval for unadjusted and adjusted models

Values are the difference (days) in the 90th percentile diagnostic interval (95% CI).

Factor                      Unadjusted                Adjusted‡
RIO
  0-9 (least rural)         Ref                       Ref
  10-30                     23.0 (10.0, 36.0)         25.0 (8.5, 41.5)
  31-45                     -2.0 (-24.0, 20.1)        -3.9 (-28.3, 20.4)
  46-55                     -4.0 (-59.0, 51.0)        6.7 (-23.4, 36.7)
  56-75                     20.0 (-21.1, 61.1)        17.1 (-17.7, 52.0)
  76+ (most rural)          -98.0 (-189.5, -6.5)      -67.3 (-129.4, -5.1)
Age at diagnosis (years)
  <45                       -41.0 (-85.3, 3.3)        -26.5 (-59.0, 6.1)
  45-54                     -26.0 (-50.8, -1.2)       -25.9 (-55.0, 3.1)
  55-64                     -7.0 (-24.1, 10.1)        0.68 (-15.9, 17.3)
  65-74                     Ref                       Ref
  75-84                     4.0 (-13.2, 21.2)         -17.4 (-33.1, -1.6)
  85+                       8.0 (-22.4, 38.4)         -10.8 (-42.2, 20.5)
Sex
  Male                      -10.0 (-22.8, 2.8)        -5.3 (-17.4, 6.9)
  Female                    Ref                       Ref
Deprivation quintile
  1 (least deprived)        Ref                       Ref
  2                         6.0 (-13.4, 25.4)         7.8 (-12.7, 28.4)
  3                         -9.0 (-27.4, 9.4)         -11.0 (-29.0, 7.0)
  4                         1.0 (-18.4, 20.4)         2.3 (-17.7, 22.2)
  5 (most deprived)         -2.0 (-23.5, 19.5)        -9.5 (-29.1, 10.1)
Major comorbidity           25.0 (18.7, 31.3)         12.7 (6.6, 18.8)
Minor comorbidity           14.0 (11.4, 16.6)         11.7 (9.1, 14.4)
Recent immigrant status
  Yes                       -17.0 (-66.4, 32.4)       8.8 (-28.2, 45.7)
  No                        Ref                       Ref
CRC sub-site
  Proximal Colon            13.0 (-7.0, 33.0)         15.6 (0.72, 30.5)
  Distal Colon              4.0 (-14.3, 22.3)         8.8 (-7.2, 24.8)
  Rectum                    Ref                       Ref
  Unspecified CRC           7.0 (-20.9, 34.9)         2.7 (-25.1, 30.6)

‡Adjusted for: RIO, age at diagnosis, sex, deprivation quintile, major/minor comorbidities (Johns Hopkins Aggregated Diagnosis Groups, ADGs), recent immigrant status and CRC sub-site


Table L-3: Stage II (n=7,223) 90th quantile regression results for factors associated with the diagnostic interval for unadjusted and adjusted models

Values are the difference (days) in the 90th percentile diagnostic interval (95% CI).

Factor                      Unadjusted                Adjusted‡
RIO
  0-9 (least rural)         Ref                       Ref
  10-30                     -5.0 (-23.6, 13.6)        -1.8 (-19.2, 15.6)
  31-45                     -17.0 (-44.2, 10.2)       -9.9 (-33.5, 13.7)
  46-55                     16.0 (-33.4, 65.4)        16.9 (-22.8, 56.6)
  56-75                     9.0 (-39.1, 57.1)         11.5 (-39.2, 62.1)
  76+ (most rural)          38.0 (-56.0, 132.0)       -7.3 (-100.9, 86.4)
Age at diagnosis (years)
  <45                       -21.0 (-69.2, 27.2)       16.4 (-19.5, 52.3)
  45-54                     -19.0 (-49.2, 11.2)       4.1 (-22.9, 31.0)
  55-64                     -5.0 (-30.8, 20.8)        5.5 (-12.6, 23.6)
  65-74                     Ref                       Ref
  75-84                     17.0 (-2.7, 36.7)         5.4 (-11.4, 22.1)
  85+                       2.0 (-21.0, 25.0)         -14.7 (-41.9, 12.5)
Sex
  Male                      12.0 (-3.5, 27.5)         6.4 (-7.7, 20.6)
  Female                    Ref                       Ref
Deprivation quintile
  1 (least deprived)        Ref                       Ref
  2                         7.0 (-14.1, 28.1)         0.54 (-19.1, 20.2)
  3                         -10.0 (-34.2, 14.2)       -5.3 (-26.9, 16.3)
  4                         -8.0 (-27.6, 11.6)        -6.3 (-27.6, 15.0)
  5 (most deprived)         -8.0 (-35.0, 19.0)        5.2 (-15.5, 25.8)
Major comorbidity           34.5 (25.1, 43.9)         18.0 (9.4, 26.6)
Minor comorbidity           16.7 (13.7, 19.6)         14.1 (11.3, 16.9)
Recent immigrant status
  Yes                       -23.0 (-64.9, 18.9)       -4.2 (-38.2, 29.8)
  No                        Ref                       Ref
CRC sub-site
  Proximal Colon            38.0 (17.7, 58.3)         21.5 (3.2, 39.8)
  Distal Colon              23.0 (-1.2, 47.2)         17.3 (-5.8, 40.4)
  Rectum                    Ref                       Ref
  Unspecified CRC           32.0 (-0.39, 64.4)        18.2 (-7.8, 44.1)

‡Adjusted for: RIO, age at diagnosis, sex, deprivation quintile, major/minor comorbidities (Johns Hopkins Aggregated Diagnosis Groups, ADGs), recent immigrant status and CRC sub-site


Table L-4: Stage III (n=7,769) 90th quantile regression results for factors associated with the diagnostic interval for unadjusted and adjusted models

Values are the difference (days) in the 90th percentile diagnostic interval (95% CI).

Factor                      Unadjusted                Adjusted‡
RIO
  0-9 (least rural)         Ref                       Ref
  10-30                     12.0 (-5.4, 29.4)         3.8 (-14.2, 21.9)
  31-45                     -23.0 (-41.8, -4.2)       4.6 (-19.6, 28.9)
  46-55                     -4.0 (-50.9, 42.9)        -7.9 (-58.0, 42.3)
  56-75                     -8.0 (-70.4, 54.4)        1.2 (-52.1, 54.4)
  76+ (most rural)          -24.0 (-95.9, 47.9)       -12.8 (-62.1, 36.6)
Age at diagnosis (years)
  <45                       -53.0 (-95.1, -10.9)      -24.0 (-68.9, 20.8)
  45-54                     -54.0 (-79.3, -28.7)      -35.8 (-61.3, -10.3)
  55-64                     -17.0 (-35.7, 1.7)        -3.4 (-22.4, 15.6)
  65-74                     Ref                       Ref
  75-84                     12.0 (-6.7, 30.7)         -13.7 (-33.0, 5.6)
  85+                       15.0 (-9.6, 39.6)         -10.3 (-35.3, 14.7)
Sex
  Male                      -1.0 (-14.4, 12.4)        13.7 (-0.5, 27.9)
  Female                    Ref                       Ref
Deprivation quintile
  1 (least deprived)        Ref                       Ref
  2                         -1.0 (-20.5, 18.5)        -3.8 (-24.2, 16.6)
  3                         -7.0 (-28.1, 14.1)        3.6 (-16.5, 23.6)
  4                         0.0 (-24.1, 24.1)         0.0 (-21.6, 21.6)
  5 (most deprived)         22.0 (0.84, 43.2)         22.5 (-1.7, 46.6)
Major comorbidity           43.8 (38.5, 49.0)         19.7 (11.9, 27.6)
Minor comorbidity           20.3 (17.9, 22.8)         16.3 (13.4, 19.3)
Recent immigrant status
  Yes                       16.0 (-17.0, 49.0)        36.0 (-6.0, 78.0)
  No                        Ref                       Ref
CRC sub-site
  Proximal Colon            34.0 (12.9, 55.1)         13.7 (-4.1, 31.5)
  Distal Colon              11.0 (-6.8, 28.8)         6.8 (-12.3, 25.9)
  Rectum                    Ref                       Ref
  Unspecified CRC           32.0 (4.0, 60.0)          15.7 (-12.0, 43.4)

‡Adjusted for: RIO, age at diagnosis, sex, deprivation quintile, major/minor comorbidities (Johns Hopkins Aggregated Diagnosis Groups, ADGs), recent immigrant status and CRC sub-site


Table L-5: Stage IV (n=4,609) 90th quantile regression results for factors associated with the diagnostic interval for unadjusted and adjusted models

Values are the difference (days) in the 90th percentile diagnostic interval (95% CI).

Factor                      Unadjusted                Adjusted‡
RIO
  0-9 (least rural)         Ref                       Ref
  10-30                     3.0 (-27.7, 33.7)         -6.6 (-30.1, 17.0)
  31-45                     7.0 (-30.9, 44.9)         4.2 (-27.9, 36.4)
  46-55                     2.0 (-48.0, 52.0)         -21.6 (-102.7, 59.5)
  56-75                     51.0 (3.2, 98.8)          50.2 (-9.9, 110.2)
  76+ (most rural)          -12.0 (-218.1, 194.1)     62.8 (-72.9, 198.5)
Age at diagnosis (years)
  <45                       -5.0 (-55.6, 45.6)        15.8 (-35.9, 67.6)
  45-54                     -50.0 (-79.5, -20.5)      -29.7 (-63.0, 3.7)
  55-64                     -51.0 (-85.6, -16.4)      -30.9 (-59.2, -2.7)
  65-74                     Ref                       Ref
  75-84                     29.0 (7.9, 50.1)          22.3 (-0.3, 44.9)
  85+                       21.0 (-8.4, 50.4)         17.7 (-20.9, 56.3)
Sex
  Male                      -12.0 (-31.1, 7.1)        -0.20 (-18.1, 17.7)
  Female                    Ref                       Ref
Deprivation quintile
  1 (least deprived)        Ref                       Ref
  2                         5.0 (-22.2, 32.2)         1.1 (-23.9, 26.1)
  3                         -1.0 (-28.1, 26.1)        -5.5 (-31.0, 19.9)
  4                         24.0 (-5.5, 53.5)         10.4 (-18.6, 39.4)
  5 (most deprived)         -23.0 (-59.5, 13.5)       -5.6 (-37.9, 26.8)
Major comorbidity           41.0 (29.6, 52.4)         16.3 (6.0, 26.6)
Minor comorbidity           21.5 (16.4, 26.6)         15.9 (11.7, 20.2)
Recent immigrant status
  Yes                       -32.0 (-86.8, 22.8)       -15.2 (-57.4, 27.1)
  No                        Ref                       Ref
CRC sub-site
  Proximal Colon            29.0 (3.6, 54.4)          8.4 (-16.5, 33.2)
  Distal Colon              -1.0 (-34.2, 32.2)        -4.6 (-30.0, 20.8)
  Rectum                    Ref                       Ref
  Unspecified CRC           43.0 (9.6, 76.4)          10.4 (-23.4, 44.1)

‡Adjusted for: RIO, age at diagnosis, sex, deprivation quintile, major/minor comorbidities (Johns Hopkins Aggregated Diagnosis Groups, ADGs), recent immigrant status and CRC sub-site
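
As an aid to reading the unadjusted columns for the categorical factors: when a quantile regression contains a single categorical factor, each coefficient reduces to the difference between that group's sample quantile and the reference group's sample quantile, because the check (pinball) loss is minimized separately within each group. The Python sketch below illustrates this under stated assumptions. It is not the procedure used to produce the tables; the function names and the ten diagnostic intervals are hypothetical and chosen only for illustration.

```python
import numpy as np


def pinball_loss(y, q, tau):
    """Check (pinball) loss of a candidate quantile q at level tau."""
    u = np.asarray(y, dtype=float) - q
    return float(np.sum(np.where(u >= 0, tau * u, (tau - 1.0) * u)))


def loss_minimizing_quantile(y, tau):
    """Smallest observed value that minimizes the pinball loss at level tau."""
    candidates = np.sort(np.asarray(y, dtype=float))
    losses = [pinball_loss(y, c, tau) for c in candidates]
    return float(candidates[int(np.argmin(losses))])


def unadjusted_quantile_differences(intervals, groups, reference, tau):
    """Difference in the tau-th quantile between each group and the reference."""
    intervals = np.asarray(intervals, dtype=float)
    groups = np.asarray(groups)
    ref_q = loss_minimizing_quantile(intervals[groups == reference], tau)
    return {
        str(g): loss_minimizing_quantile(intervals[groups == g], tau) - ref_q
        for g in np.unique(groups)
        if g != reference
    }


# Hypothetical diagnostic intervals (days) for two rurality groups.
days = [10, 20, 30, 40, 50, 15, 35, 55, 75, 95]
rio = ["0-9"] * 5 + ["10-30"] * 5

median_diff = unadjusted_quantile_differences(days, rio, "0-9", tau=0.5)
p90_diff = unadjusted_quantile_differences(days, rio, "0-9", tau=0.9)
```

With these made-up values, the "10-30" group differs from the reference by 25 days at the median and by 45 days at the 90th percentile, which shows how the same factor can have a different estimated effect at different points of the distribution. Adjusted estimates do not simplify in this way, since they condition on the other covariates.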


Appendix M Median Quantile Regression Diagnostics

Histograms of the Standardized Residuals

One graphical technique used to help assess the fit of a quantile regression model is a histogram of the standardized residuals; if the regression model fits the data well, the histogram will be approximately normally distributed (3,4). A residual is the distance between an observed data point and the corresponding value estimated by the regression model. Each residual is standardized by dividing it by the standard deviation of the residuals, which places the residuals on a common scale with a standard deviation of one. We would then expect the standardized residuals to follow an approximately normal distribution, falling symmetrically about the fitted model. Normal and kernel density curves are superimposed on each histogram to aid comparison with the normal distribution (kernel density estimation is non-parametric and may help reveal the underlying distribution).
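
The steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used to produce the figures in this appendix: the function names are hypothetical, the residuals are simulated rather than taken from the study data, and the kernel bandwidth uses a normal-reference rule of thumb chosen here for convenience.

```python
import numpy as np


def standardize(residuals):
    """Divide each residual by the sample standard deviation of the residuals."""
    r = np.asarray(residuals, dtype=float)
    return r / r.std(ddof=1)


def histogram_percent(z, bins=20):
    """Histogram heights as a percent of observations, plus the bin edges."""
    counts, edges = np.histogram(z, bins=bins)
    return 100.0 * counts / counts.sum(), edges


def normal_density(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    x = np.asarray(x, dtype=float)
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def kernel_density(x, sample, bandwidth=None):
    """Gaussian kernel density estimate of `sample`, evaluated at `x`."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    sample = np.asarray(sample, dtype=float)
    if bandwidth is None:
        # Normal-reference rule of thumb (an assumption made for this sketch).
        bandwidth = 1.06 * sample.std(ddof=1) * sample.size ** (-0.2)
    u = (x[:, None] - sample[None, :]) / bandwidth
    scale = sample.size * bandwidth * np.sqrt(2.0 * np.pi)
    return np.exp(-0.5 * u ** 2).sum(axis=1) / scale


# Simulated right-skewed residuals standing in for observed minus fitted days.
rng = np.random.default_rng(0)
residuals = rng.exponential(scale=60.0, size=500) - 60.0 * np.log(2.0)

z = standardize(residuals)
percent, edges = histogram_percent(z)

# Curves to overlay on the histogram for visual comparison.
grid = np.linspace(z.min() - 1.0, z.max() + 1.0, 400)
normal_curve = normal_density(grid, z.mean(), z.std(ddof=1))
kernel_curve = kernel_density(grid, z)
```

Plotting `percent` against `edges` as bars and overlaying the two curves (rescaled to percent) would give a display of the same kind as the figures in this appendix. A kernel curve that departs visibly from the normal curve suggests the residuals are not normally distributed.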

The histograms for the median quantile regression of the diagnostic interval (see Figures M-1 to M-5) exhibit a right skew. The histograms for the 90th quantile regression of the diagnostic interval (see Figures N-1 to N-5) are approximately normal.
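
A visual judgment of skew can be complemented with a numerical summary. The sketch below computes the moment-based sample skewness coefficient, which is near zero for a symmetric distribution and positive for a right-skewed one. It was not part of the analysis reported here; the function name and the example values are hypothetical.

```python
import numpy as np


def sample_skewness(values):
    """Moment-based skewness: third central moment over the 1.5 power of the second."""
    v = np.asarray(values, dtype=float)
    d = v - v.mean()
    m2 = np.mean(d ** 2)
    m3 = np.mean(d ** 3)
    return float(m3 / m2 ** 1.5)


# Hypothetical standardized residuals: one symmetric set and one right-skewed set.
symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]
right_skewed = [0.0, 0.0, 0.0, 0.0, 10.0]

skew_symmetric = sample_skewness(symmetric)
skew_right = sample_skewness(right_skewed)
```

Here the symmetric set has a skewness of zero and the right-skewed set a skewness of 1.5, so a clearly positive value for a set of residuals is consistent with the right skew seen in a histogram.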


Figure M-1: Median quantile regression, stage I. Distribution of the residuals for the respective adjusted diagnostic interval model. Fitted normal and kernel density curves are overlaid. [Histogram; x-axis: Standardized Residual, y-axis: Percent (%)]

Figure M-2: Median quantile regression, stage II. Distribution of the residuals for the respective adjusted diagnostic interval model. Fitted normal and kernel density curves are overlaid. [Histogram; x-axis: Standardized Residual, y-axis: Percent (%)]


Figure M-3: Median quantile regression, stage III. Distribution of the residuals for the respective adjusted diagnostic interval model. Fitted normal and kernel density curves are overlaid. [Histogram; x-axis: Standardized Residual, y-axis: Percent (%)]

Figure M-4: Median quantile regression, stage IV. Distribution of the residuals for the respective adjusted diagnostic interval model. Fitted normal and kernel density curves are overlaid. [Histogram; x-axis: Standardized Residual, y-axis: Percent (%)]


Figure M-5: Median quantile regression, stage unknown. Distribution of the residuals for the respective adjusted diagnostic interval model. Fitted normal and kernel density curves are overlaid. [Histogram; x-axis: Standardized Residual, y-axis: Percent (%)]


Appendix N 90th Quantile Regression Diagnostics

Histograms of the Standardized Residuals

Figure N-1: 90th quantile regression, stage I. Distribution of the residuals for the respective adjusted diagnostic interval model. Fitted normal and kernel density curves are overlaid. [Histogram; x-axis: Standardized Residual, y-axis: Percent (%)]


Figure N-2: 90th quantile regression, stage II. Distribution of the residuals for the respective adjusted diagnostic interval model. Fitted normal and kernel density curves are overlaid. [Histogram; x-axis: Standardized Residual, y-axis: Percent (%)]

Figure N-3: 90th quantile regression, stage III. Distribution of the residuals for the respective adjusted diagnostic interval model. Fitted normal and kernel density curves are overlaid. [Histogram; x-axis: Standardized Residual, y-axis: Percent (%)]


Figure N-4: 90th quantile regression, stage IV. Distribution of the residuals for the respective adjusted diagnostic interval model. Fitted normal and kernel density curves are overlaid. [Histogram; x-axis: Standardized Residual, y-axis: Percent (%)]

Figure N-5: 90th quantile regression, stage unknown. Distribution of the residuals for the respective adjusted diagnostic interval model. Fitted normal and kernel density curves are overlaid. [Histogram; x-axis: Standardized Residual, y-axis: Percent (%)]


Appendices References

1. Kralj B. Measuring Rurality - RIO2008_BASIC: Methodology and Results. OMA Economics Department; 2009.
2. Tague NR. Quality Toolbox. 2nd ed. Milwaukee, WI: American Society for Quality (ASQ) Press; 2005. p. 155–61.
3. Chen C. An Introduction to Quantile Regression and the QUANTREG Procedure. Thirtieth Annual SAS User Group International Conference. Cary, NC: SAS Institute Inc; 2005.
4. Jalali N, Babanezhad M. Quantile Regression due to Skewness and Outliers. Applied Mathematical Sciences. 2011;5(39):1947–51.
