Iowa State University Capstones, Theses and Graduate Theses and Dissertations Dissertations

2016 An examination of epidemiological study designs in veterinary science Jonah Nelson Cullen Iowa State University

Follow this and additional works at: https://lib.dr.iastate.edu/etd Part of the Commons, and the Veterinary Commons

Recommended Citation Cullen, Jonah Nelson, "An examination of epidemiological study designs in veterinary science" (2016). Graduate Theses and Dissertations. 15687. https://lib.dr.iastate.edu/etd/15687

This Thesis is brought to you for free and open access by the Iowa State University Capstones, Theses and Dissertations at Iowa State University Digital Repository. It has been accepted for inclusion in Graduate Theses and Dissertations by an authorized administrator of Iowa State University Digital Repository. For more information, please contact [email protected].

An examination of epidemiological study designs in veterinary science

by

Jonah N. Cullen

A thesis submitted to the graduate faculty

in partial fulfillment of the requirements for the degree of

MASTER OF SCIENCE

Major: Veterinary Preventive Medicine

Program of Study Committee: Annette M. O’Connor, Major Professor Duck-chul Lee Kenneth J. Koehler Jeffrey J. Zimmerman

Iowa State University

Ames, Iowa

2016

Copyright © Jonah N. Cullen, 2016. All rights reserved. ii

DEDICATION

This thesis is dedicated to my amazing wife Kelly, and always-supportive parents

Dan and Jeanne. iii

TABLE OF CONTENTS

Page

ACKNOWLEDGMENTS ...... vi

CHAPTER 1 INTRODUCTION ...... 1

Background ...... 1 Objectives ...... 2 Thesis Organization ...... 2 References...... 4

CHAPTER 2 A REVIEW FOR CLINCIANS PLANNING HOSPITAL-BASED OBSERVATIONAL STUDIES TO INVESTIGATE DISEASE ETIOLOGY USING THE CASE-CONTROL DESIGN ...... 5

Purpose and Scope ...... 5 What Constitutes a Case-Control Design? ...... 6 Why Choose a Case-Control Design to Investigate Etiology? ...... 9 When is the Case-Control Design the “Best” Option? ...... 11 How to Select Cases and Controls? ...... 13 Source population of cases and control ...... 13 The “nature” of the cases ...... 17 The number of control groups ...... 18 Size of the control group ...... 19 Population type ...... 19 Timing of control selection ...... 21 Design Consideration for the Practical Management of Common Biases ...... 23 Restriction ...... 24 Matching ...... 25 Analytical methods ...... 27 Interpretation of the ...... 29 Describing the odds ratio ...... 29 Cross-product ratio estimation ...... 31 Conclusions ...... 34 Glossary ...... 35 References ...... 36

CHAPTER 3 A SURVEY OF CASE-CONTROL STUDIES IN VETERINARY SCIENCE ...... 41

Abstract ...... 41 Introduction ...... 42 iv

Background and rationale ...... 42 Objectives ...... 44 Methods ...... 44 Study design ...... 44 Information sources and electronic search ...... 45 Study selection ...... 45 Data collection process ...... 46 Data items ...... 47 Bias ...... 49 Study size ...... 49 Quantitative variables ...... 50 Statistical methods ...... 50 Results ...... 50 Study population ...... 50 Descriptive data ...... 51 Approach to case selection ...... 53 Approach to control selection ...... 53 Effect measure reported ...... 54 Discussion ...... 56 Key results ...... 56 Limitations ...... 59 Generalizability ...... 61 Footnotes ...... 63 Figures ...... 64 Tables ...... 65 References ...... 73

CHAPTER 4 A SYSTEMATIC REVIEW AND META-ANALYSIS OF THE ANTIBIOTIC TREATMENT FOR INFECTIOUS BOVINE KERATOCONJUNCTIVITIS: AND UPDATE ...... 75

Abstract ...... 75 Introduction...... 76 Rationale ...... 76 Methods ...... 78 Protocol and registration ...... 79 Eligibility criteria ...... 79 Information sources ...... 81 Search ...... 82 Study selection ...... 82 Data extraction process ...... 84 Data items ...... 84 Geometry of the network studies ...... 86 Risk of bias within individual studies ...... 87 Summary measures ...... 89 Planned method of statistical analysis ...... 90 v

Risk of bias assessment ...... 92 Ancillary analysis ...... 92 Results ...... 92 Study selection ...... 92 Study characteristics ...... 93 Summary of the treatment network geometry ...... 96 Analysis of the treatment network geometry ...... 96 Risk of bias within studies ...... 97 Synthesis of pairwise meta-analysis ...... 98 Assessing sources of systematic bias ...... 100 Risk of bias across studies ...... 101 Active-to-active comparisons of interest ...... 102 Mixed treatment comparisons meta-analysis ...... 102 Ancillary analysis ...... 103 Discussion ...... 104 Acknowledgements ...... 107 References ...... 108 Footnotes ...... 117 Tables ...... 118 Illustrations ...... 121

CHAPTER 5 GENERAL CONCLUSIONS ...... 128

General Discussion ...... 128 Recommendations for Future Research ...... 129 vi

ACKNOWLEDGMENTS

I would like to thank my committee chair, Annette M. O’Connor, and my committee members, Duck-chul Lee, Kennth J. Koehler, and Jeffrey J. Zimmerman, for their guidance and support throughout the course of this research.

1

CHAPTER 1. INTRODUCTION

Background

Appropriate design, implementation, and analysis of any study design, experimental or observational, are paramount to making accurate and valid etiologic or associative inference. In the absence of thoughtful design and reporting, studies may suggest a spurious relationship or be misclassified and valid information lost. For the former, this has potentially serious implications in human or veterinary health. If underreported or inaccurately designed studies suggested an association between a risk factor and disease outcome that did not actually exist, substantial resources may be wasted in an attempt to manage the wrong disease cause. As a result, individuals would remain ill as additional cases develop and the population burden rises. Studies that are misclassified (erroneously indexed in reference databases) as a result of being incorrectly titled have major implications for the conducting of systematic reviews and meta-analyses. For example, if a systematic review were designed in such a way to include only trials that randomize study subjects to treatment, trials in the searched reference databases that successfully randomized during study implementation but failed to suitably report the process in the manuscript would be wrongly excluded. Consequently, potentially valuable data would be lost from a meta-analysis and the resulting summary effect may be less accurate. It is thus of the utmost importance that experimental and observational research is not only designed properly but reported accurately as well.

In recent years there has been a strong push in both human and veterinary research to establish guidelines for the reporting of research. For example, in human research, guidelines have been established and adopted by a number of peer-reviewed journals covering randomized 2 controlled trials (RCT) (Schulz et al., 2010), systematic reviews (Moher et al., 2009), and observational studies (von Elm et al., 2007). Similarly in the veterinary science there has been increased interest in adopting or modifying these same principles of reporting guidelines to ensure accuracy and reproducibility (O'Connor, 2010; Sargeant and O'Connor, 2014b). As of today, guidelines have been established and adopted in numerous veterinary journals for the reporting of RCTs in livestock and (O'Connor et al., 2010). In the veterinary sciences when design specific guidelines are unavailable, some diligent researchers have provided in- depth design descriptions from inception through analysis. For example, a recent series of manuscripts covered the application of systematic reviews, a relatively new or uncommon design strategy in the veterinary science (O'Connor et al., 2014a; O'Connor et al., 2014b; Sargeant and

O'Connor, 2014a). It is likely that additional design-specific guidelines in the veterinary science will be considered as more thorough examinations of the quality of the literature are conducted.

Objectives

The aim of this thesis is to examine epidemiological study designs in the veterinary science for study design considerations and the reporting of study characteristics. This was executed using two approaches for two study designs: 1) a survey of veterinary case-control studies and 2) an investigation of RCTs within the framework of a systematic review and meta- analysis.

Thesis Organization

The format of this thesis is based on the use of journal papers as chapters. The second chapter contains a literature review of hospital-based case-control studies in the veterinary science. Chapter three is a survey of the veterinary literature for author designated case-control studies. Both the second and third chapters will be submitted to the Journal of Veterinary 3

Internal Medicine for publication. Chapter three comprises a systematic review and meta- analysis for the treatment of infectious bovine keratoconjunctivitis. This chapter has been submitted and accepted for publication in Animal Health Research Reviews (Cullen et al., In press). Co-author contributions to these manuscripts will be described on the first page of each chapter. Finally, the last chapter contains the thesis conclusions and recommendations for future research. 4

References

Cullen, J.N., Yuan, C., Totton, S., Dzikamunhenga, ., Coetzee, J.F., da Silva, N., Wang, C., O'Connor, A.M., In press. A systematic review and meta-analysis of the antibiotic treatment for infectious bovine keratoconjunctivitis: an update. Anim Health Res Rev. Moher, D., Liberati, A., Tetzlaff, J., Altman, D.G., Group, P., 2009. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med 6, e1000097. O'Connor, A., 2010. Reporting guidelines for primary research: Saying what you did. Prev Vet Med 97, 144-149. O'Connor, A.M., Anderson, K.M., Goodell, C.K., Sargeant, J.M., 2014a. Conducting systematic reviews of intervention questions I: Writing the review protocol, formulating the question and searching the literature. Zoonoses 61 Suppl 1, 28-38. O'Connor, A.M., Sargeant, J.M., Gardner, I.A., Dickson, J.S., Torrence, M.E., Dewey, C.E., Dohoo, I.R., Evans, R.B., Gray, J.T., Greiner, M., Keefe, G., Lefebvre, S.L., Morley, P.S., Ramirez, A., Sischo, W., Smith, D.R., Snedeker, K., Sofos, J., Ward, M.P., Wills, R., 2010. The REFLECT statement: methods and processes of creating reporting guidelines for randomized controlled trials for livestock and food safety. Prev Vet Med 93, 11-18. O'Connor, A.M., Sargeant, J.M., Wang, C., 2014b. Conducting systematic reviews of intervention questions III: Synthesizing data from intervention studies using meta- analysis. Zoonoses Public Health 61 Suppl 1, 52-63. Sargeant, J.M., O'Connor, A.M., 2014a. Conducting systematic reviews of intervention questions II: Relevance screening, data extraction, assessing risk of bias, presenting the results and interpreting the findings. Zoonoses Public Health 61 Suppl 1, 39-51. Sargeant, J.M., O'Connor, A.M., 2014b. Issues of reporting in observational studies in veterinary medicine. Prev Vet Med 113, 323-330. Schulz, K.F., Altman, D.G., Moher, D., Group, C., 2010. CONSORT 2010 Statement: Updated guidelines for reporting parallel group randomised trials. J Clin Epidemiol 63, 834-840. von Elm, E., Altman, D.G., Egger, M., Pocock, S.J., Gotzsche, P.C., Vandenbroucke, J.P., Initiative, S., 2007. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies. Epidemiology 18, 800-804.

5

CHAPTER 2. A REVIEW OF THE APPROACH TO CONDUCTING HOSPITAL- BASED CASE-CONTROL STUDIES IN VETERINARY SETTINGS

A paper to be submitted to the Journal of Veterinary Internal Medicine

Purpose and Scope

Many designs can be used to investigate disease etiology, and these can broadly be classified into experimental or observational. The three most common observational designs are cross-sectional, cohort, and case-control. Of the three observational designs, the case-control design is often considered simple to design and the most cost effective for investigating disease etiology. The inexpensive nature is related to financial and time costs. As a result, clinicians wanting to examine an association between a disease or outcome of interest and an exposure often conduct a case-control study. Despite, its popularity, the case-control study is also considered of lesser value when estimating an exposure-disease association, as illustrated by its lower location on many statistical evidence pyramids. It is likely true that case-control studies are cheaper to conduct than cohort studies, however the common perceptions of simplicity and evidentiary value can be false. The view that case-control studies are simple to design can lead to important design considerations ignored (if not unknown) and misinterpretation of the results.

Moreover when properly designed (and comprehensively reported), a case-control study may be of more evidentiary value than traditionally held. The purpose of this review is therefore to help clinicians better understand this important design. The aim is to describe the proper planning, necessary design descriptions, and suitable interpretation of the 6 results for case-control studies using hospital-based records. We focus on hospital-based designs as these are commonly used in veterinary settings.

This review is targeted at companion animal clinicians interested in studying disease etiology at the individual level but will be applicable to any clinician, companion or food producing animal, planning to conduct a hospital-based case-control study based on hospital records.

The review will include a general description of the observational designs, compelling reasons for choosing a case-control design, appropriate selection of cases and controls, methods for controlling common biases, and proper interpretation of the associations observed. Additionally, when possible, the discussion will be limited to dichotomous or nominal variables. Discussion of group level (e.g. herd, pen, or liter) case and control selection, and interpretation will be limited.

What Constitutes a Case-Control Design?

Before describing the defining features of the case-control design, it is worth briefly relating how this design fits within the framework of the other analytic (or explanatory) observational study designs. The primary purpose of these observational study designs is to investigate disease etiology or the causal association between an exposure and disease with or without a temporal component. Note that exposure and risk factor may be used interchangeably.

Arguably, the simplest observational study design, the cross-sectional study is often used by veterinary clinicians to examine factors associated with disease prevalence within a population (Dohoo et al., 2009). The cross-sectional design is suited to this purpose because disease and exposure information are collected from a population at a 7 single point in time. Disease prevalence may then be compared between different exposure subgroups. Cross-sectional studies can be conducted quickly and the investigator may examine multiple exposures and diseases simultaneously (Morgenstern and Thomas, 1993). Cross-sectional studies are however limited in making causal inference as it difficult to distinguish cause from effect (i.e. exposure leads to disease).

This is due to exposure and disease ascertainment occurring at a single point in time

(Mann, 2003). It is possible that researchers actually identify an association between exposures and survival, rather than disease, so the inference about etiology based on cross-sectional studies should be carefully considered.

Compared to the cross-sectional design, the explicitly employs a temporal component. This temporal component is in fact one of defining features of the cohort design: the study group or cohort, sampled based on exposure status, is followed over time for the development of an outcome of interest [see Grimes and Schulz (2002) for a concise review or (Rothman and Greenland, 1998)]. Dekkers et al. (2012) recently proposed that in addition to subjects sampled on exposure and followed forward in time, the ability to calculate an absolute risk, which may be done using a single group, is a defining feature of the cohort design. Thus a traditional cohort design consists of one or more study groups based on an exposure of interest that are followed over time to outcome. Follow-up from exposure to outcome may be conducted prospectively or retrospectively using historical records (Euser et al., 2009). It should be noted the terms

“prospective” and “retrospective” are not ideal and have historically been a source of confusion (Vandenbroucke, 1991). Ideally, the source population from which the study groups are derived should be the same (e.g. veterinary records from a single teaching 8 hospital). Additional information on the design characteristics, implementation, and analysis may be found elsewhere (Breslow and Day, 1987; Prentice, 1995; Rothman and

Greenland, 1998; Dohoo et al., 2009).

Where the cohort design is concerned with selecting and comparing study groups based on exposure status, the case-control design is based on the selection of study groups or individuals with or without the outcome of interest [see Schulz and Grimes

(2002) for a general review]. Individuals with the outcome are referred to as cases and individuals without the outcome are referred to as controls. The previous exposure status

(i.e. exposure data or risk factors ascertained prior to development of the outcome) of

“cases” is then compared to the previous exposure status of individuals of “non-cases” or

“controls”. Most importantly, the control group should be randomly selected in such a way that the exposure distribution is representative of the exposure distribution in the source population (Miettinen, 1985a; Wacholder et al., 1992b). There are multiple potential source populations from which the cases and controls may be drawn, and this review will focus specifically on hospital-based records. The hospital-based source population and the representative nature of the control series will be discussed in more detail below.

Sometimes, investigators think of the terms “cases”, “diseased animals” and “sick animals” as synonyms, while also thinking of the term “controls” as synonymous with

“non-diseased” or “healthy”. Therefore, any study with diseased and non-diseased animals is a case-control study. However, this approach to naming is incorrect and produces much of the confusion observed in case-control studies. The true defining 9 feature of a case control study is selection based on the outcome, not the use of healthy and non-healthy animals, as we will see.

Why Choose a Case-Control Design to Investigate Etiology?

Before planning and implementation of a case-control study, the first question a clinician should ask is why choose the case-control as opposed to the other observational or experimental design choices. The most common answer for choosing the case-control design is the relative efficiency in terms of money and time.

One of the main motivations for choosing a case-control study, as opposed to a cohort, is the savings in money for the researcher particularly for a hospital-based case- control study. When hospital-records are used to identify the cases, ideally well- maintained databases with detailed records can be queried to identify all patients diagnosed with a disease of interest. Similarly, controls are identified from the same databases and disease etiology may be investigated using proper analysis at relative low cost. If cases (and controls) are recruited from a population at large without hospital records, the cost required identifying a suitable number of patients might be significantly higher. Alternatively, if a cohort study (hospital or population based) is designed to investigate disease etiology, patients with the risk factor thought to cause disease, will need to be followed over time for development of the disease either through additional record searching or through patient owner contact.

Case-control studies are relatively more efficient than cohorts in terms of time required for the data collection phase. This is due to the defining feature of the case- control design where subjects are enrolled based on outcome, and only after enrollment is the exposure status determined. Thus the follow-up time for outcome incidence as needed 10 in cohort studies is not required. For the clinician using available hospital-based records, this timesaving could be significant compared to a cohort study.

Of course such efficiency must be balanced against the accuracy and validity of the exposure-disease association obtained from a case control study. The trade off between the case-control design efficiency and accuracy and validity of the exposure- disease association measure is a common concern amongst clinicians and epidemiologists. The validity and accuracy concerns are related to some common sources of bias associated with the case-control design (discussed in more detail below) (Sackett,

1979). Wacholder et al. (1992a) reviews how case-control design choices related to aspects of efficiency are often times at odds with design principles that reduce these biases. As a result, case-control designs are often considered of lesser inferential value.

Indeed, a hierarchical ranking of study designs places case-control in an the intermediate region (Phillips et al., 2001). Concato et al. (2000) compared meta-analyses of randomized controlled trials (RCT) and meta-analyses of cohort or case-control studies for five different human clinical topics. Their results suggested that when well designed, either cohort or case-control studies resulted in highly similar effect estimates compared to RCTs assessing the same intervention. For a review on the inclusion of observational studies (as related to their inherent biases) in meta-analyses for animal disease outcomes see O'Connor and Sargeant (2014).

To summarize, a clinician wishing to assess disease etiology should consider a case-control design if time and money are constraining factors. The trade off between efficiency, accuracy and validity should be carefully considered as well. The following 11 section will discuss under what conditions the case-control design can be employed to maximize this efficiency.

When is the Case-Control Design the “Best” Option?

The case-control design is typically the most efficient design option for investigating the etiology of a particular disease outcome under two scenarios, the disease is rare or there is a long latency period for disease onset.

When the outcome is rare, identifying individuals based on the outcome is an advantage compared to identification based on exposure. Rare in this case the incidence of the particular outcome is uncommon in the source population (e.g. patient records at a veterinary referral hospital) from which the study population is drawn.

Consider a situation of attempting a cohort study compared to a case-control where the outcome of interest in rare. Since the cohort design selects study groups based on an exposure of interest, the cohort would potentially require a much larger study population in order to identify individuals experiencing the outcome over time. However as the case- control study selects individuals based on the outcome, it is a much more efficient design choice to assess the exposure-disease association in this situation. Put another way, if the cost of identifying exposed and diseased individuals is equal for both the case-control and cohort, the most efficient design will identify a number of exposed individuals with less cost compared to the other option. Therefore when the outcome is rare compared to the proportion of exposed individuals within the study population (i.e. fewer animals with disease than animals exposed), it is more efficient to identify individuals that have the outcome and are exposed among the case group (case-control design) compared to identifying the same individuals among an exposed group (cohort design) (Wacholder, 12

1995). For example, a case-control design was used to investigate different endocrinopathies as risk factors for development of the uncommon outcome gall bladder mucoceles in dogs (Mesich et al., 2009).

The second situation, in which the case-control design is potentially more efficient than a cohort, is one in which the outcome of interest has a long latency period.

Similar to the above reasons, by selecting on the outcome with the case-control study, long follow-up times are not required in order to observe the outcome of interest among the study groups. A cohort study examining an exposure-outcome association where the outcome has a long latency period would require a longer follow-up time in order to observe subjects with the outcome. Thus a case-control design would be more efficient in this situation. Sallander et al. (2012) recently utilized a case-control design to demonstrate that obese cats were more likely to have diabetes mellitus (long latency period) compared to non-obese cats. This study might have been much more inefficient if a cohort design had been chosen by the researchers, where obese non-diabetic cats and normal-weight non-diabetic cats would have been followed for years to determine the incidence of diabetes development in each group.

Additionally, when exposure ascertainment is expensive compared to diagnosing the outcome, the case-control may be the cheaper option. For example, Minozzi et al.

(2010) examined the potential genetic risk factors for Johne’s susceptibility. Here the outcome (Johne’s disease) was assessed during routine screening via an enzyme-linked immunosorbent assay (ELISA) and the exposure (genomic features) was measured on a subset of serologically positive and negative animals. Clearly the cost of genomic analyses is far more expensive than ELISA per sample. It would be costly to measure the 13 exposure status of a large number of individuals for which only a fraction develop the outcome (as with a cohort design) compared to a subset of cases and non-cases as with a case-control study.

How to Select Cases and Controls?

The following section discusses important considerations for the selection of cases (individuals with the outcome of interest) and controls (individuals without the outcome): the source population for cases and controls, the “nature” of the cases, number of control groups, size of the control group, source population type, and the timing of control selection.

Source population of cases and controls

The source population is the population from which the cases and controls arise.

This population may be a primary or secondary study base. A primary study base is one in which the source population is clearly defined (or enumerated) for example by geographic region during a specific time period such as a farm with accurate record keeping (Miettinen, 1985b; Miettinen, 1985a; Dohoo et al., 2009). This may also be referred to as a population-based case-control study (Wacholder et al., 1992b). The advantage of designing a population-based study is the source population from which the controls are drawn is guaranteed to be the same as that of the cases. The main concern with employing a primary case base in a case-control design is the potential for incomplete case identification (Miettinen, 1985b). This population base is not discussed further because as the focus is on hospital based case control studies, which are considered a secondary base. 14

If a referral hospital (the focus of this review) or disease registry is employed to identify cases and controls over a specific time period, it is considered a secondary study base (Miettinen, 1985b; Miettinen, 1985a). The primary advantage of using a secondary study base such as a referral hospital, is the case series (the cases in a case-control study) is complete, i.e. incomplete case identification is less likely to occur as with a primary study base (Wacholder et al., 1992a). Complete case identification is a result of using hospital databases to identify potential cases as opposed to attempting to identify patients with the outcome of interest from population-based study. Complete case ascertainment will increase effect measure precision and decrease case selection influenced by the exposure of interest or disease severity (Wacholder, 1995). For additional information regarding other sources of controls see Wacholder et al. (1992b).

The difference between employing a primary or secondary study base is an important consideration. Miettinen (1985a) described the distinction between primary and secondary bases as related to the order in which cases or the base population is defined.

When the base is defined before or independent of the cases it is a primary study base.

When the base is defined secondary to cases it is a secondary study base (i.e. the population the cases likely came from). It is this distinction that leads to a fundamental issue when using hospital records (or any other secondary study base) for a case-control study, what is the study population (the case base) from which suitable controls may be identified? Recall that the control series should be selected with an exposure distribution representative of the non-diseased in the source population from which the cases were identified. The fact that the source population is “unknown” may create the potential to get that wrong. 15

When using hospital-based records to select controls, the base population may be difficult to define, as it is not explicitly based on a geographical region. Instead, the base population consists of patient owners that would potentially visit or be referred to the hospital and thus could have been considered as part of the case series if the disease or outcome of interest had occurred (Miettinen, 1985b). By identifying the case series via hospital records, it is a reasonable assumption that other patient owners without the outcome of interest at the same hospital would be members of the same secondary base.

These patient owners may therefore be considered for the control group, denoted hospital controls. This is in fact one of the main advantages of using a hospital-based secondary study base to identify controls (Miettinen, 1985b; Wacholder et al., 1992a).

Hospital controls may be healthy animals (e.g. hospital staff owned animals volunteered for inclusion in the study) or they may be affected by another disease.

Employing animals with other diseases could present a potential issue if distribution of exposure in the “sick” control group is not be representative of the exposure distribution in the case base, the fundamental comparison of the case-control design (Miettinen,

1985b). To avoid this potential pitfall, two assumptions must be made: 1) for the “sick” controls, the outcome that caused hospital admittance must be unrelated to the exposure of interest, and 2) admitted cases could have been enrolled as “sick” controls and vice versa (Wacholder et al., 1992a). It is considered best practice to exclude any hospital patients as controls with other diseases that may be related to the exposure being investigated (Wacholder and Silverman, 1990). For example, a case-control study at a veterinary teaching hospital evaluated the association between Bartonella spp and chronic rhinosinusitis (CRS) in cats. Cats were excluded if they had outcomes thought to 16 be associated with Bartonella infection such as lymphadenopathy or uveitis (Berryessa et al., 2008). The rationale for this is that such controls would likely have a higher prevalence of Bartonella spp than cats without CRS from the source population because the exposure (Bartonella spp infections) was a factor associated with admittance. If correct, the use of such a “sick” control group would have biased the association towards the null hypothesis.

The “sick” control group may consist of multiple outcomes or a single disease outcome. The use of a single disease control group should be avoided due to potential for future research suggesting an association between the exposure and control group disease, as well as the difficulty in providing satisfactory evidence that the exposure and control group disease are not associated. A control group consisting of multiple diseases is preferred (Wacholder et al., 1992b; Rothman and Greenland, 1998). As an example of a multi-disease control group, a case-control design was used to investigate if recent was a risk factor for immune-mediated thrombocytopenia (ITP) in dogs. The disease control group was restricted only in excluding dogs with immune-mediated illnesses thus the control group consisted of dogs with an assortment of diseases (Huang et al., 2012).

As mentioned previously, controls should be randomly sampled from the same source population that gave rise to the cases to ensure the exposure distribution of the controls is representative of the exposure distribution of the non-diseased in the source population (Miettinen, 1985b). However when a secondary base is used, as with the hospital based case-control study, the study base is not explicitly defined and the selected controls are considered a non-random subset of the base. They are considered nonrandom 17 as these subjects are admitted patients and not truly random subjects from the source population (Wacholder et al., 1992a). It is important to stress that the subset is representative only of the study base and not of the entirety of animals without the disease or outcome of interest (Rothman and Greenland, 1998).

The “nature” of the cases

When designing a case-control study, a decision must be made whether to include prevalent, incident, or both outcome types as the cases. Prevalent cases are patients that already have the outcome of interest at enrollment. The use of prevalent cases is most applicable to assessing disease burden within a population as opposed to disease etiology

(Pearce, 2004). Prevalent cases are also most suited to examining nonfatal chronic conditions, degenerative diseases, or congenital malformations (Rothman and Greenland,

1998). For example, prevalent cases were enrolled in case-control studies to examine chronic gingivostomatitis in cats (Farcas et al., 2014), bovine cysticercosis (Calvo-

Artavia et al., 2013), or Cryptosporidium infection in calves (Silverlas et al., 2010). The main concern with employing prevalent cases is the difficulty in disentangling whether the exposure represents a cause or survival determinant in the population. Where prevalence suggests a current disease status, incident cases are patients with a “newly” diagnosed disease or outcome of interest and are thus better suited for examining disease etiology in case-control studies (Pearce, 2012). When deciding to employ incident cases, it is important to clearly distinguish between whether only animals experiencing the first occurrence or those with subsequent occurrences of the same disease will be included

(Dohoo et al., 2009). As with studies based on prevalent cases, incident cases may be identified and exposure history ascertained via hospital records or prospectively as 18 admitted to the hospital or clinic. Finally, incident and prevalent cases may be enrolled in a case-control study. The authors of a survey on human case-control studies considered this situation to be analogous to the use of prevalent cases only but further discussion will not be considered here (Knol et al., 2008).

As the focus of this review is on examining disease etiology, the use of incident cases will be the primary consideration throughout the remainder of this review.

The number of control groups

While a case-control study is typically thought of a comparison between one case group and one control group, multiple control groups may be considered. Observing multiple control groups may strengthen effect estimates when the results are consistent or assist in controlling potential biases (Ibrahim and Spitzer, 1979). For example if the different control groups are included to account for different potential confounders

(Wacholder et al., 1992b). The different control groups may reflect different study base types (primary versus secondary) or different disease categories (e.g. animals with respiratory diseases as one control and animals with gastrointestinal disorders as a second control group) selected from hospital records. A human case-control study, observed similar effect measures when comparing controls from a primary study base (based on random digit dialing) versus controls from a secondary study base (based on “healthy” partners) (Pomp et al., 2010). Other human studies have suggested hospital controls may not be representative of the source population as the controls may be more similar to the cases than would otherwise be expected leading to differences in effect measure estimates. This situation was observed when patients with cervical cancer were enrolled as cases and a comparison was made between controls identified from the population (via 19 random digit dialing) and controls recruited from the hospital with gynecological disorders other than cancer (West et al., 1984). However, it is usually suggested to choose the control series most suitable to the association of interest ahead of time (Wacholder et al., 1992b; Moritz et al., 1997).

Size of the control group

Control group size or more specifically the ratio of controls to cases impacts the precision of the exposure-disease association. When there is an abundance of available cases and controls and the cost of data ascertainment is equal, it is suggested to use a ratio of one to one (Breslow and Day, 1980). In this scenario, the sample size should be calculated based on standard calculation methods beyond the scope of this review

(Schlesselman and Stolley, 1982). If there is a cost difference between obtaining case and control patients, a ratio should be used. A ratio of three to four controls per case is generally considered to be the maximum in terms of precision, i.e. including more than four controls per case does not increase effect measure precision. The statistical rationale is beyond the scope of this review but has been described elsewhere (Ury, 1975; Gail et al., 1976; Walter, 1977).

Population type

Based on the source population, the population may be considered one of two types: fixed cohort or a dynamic population. Case-control studies are often considered to be “nested” within a larger hypothetical cohort study or more generally within the source population from which the cases arose (Rothman and Greenland, 1998). When the cases and controls are selected from a specifically defined population where no additional individuals could conceivably be enrolled it is considered a fixed cohort. Fixed cohorts 20

(also known as closed populations) are not stable with regard to the exposure distribution and other group characteristics as members will die or leave the geographical area

(Greenland et al., 1986; Rothman and Greenland, 1998). As an example, a case-control study was undertaken to examine risk factors for stillbirths in pigs. Farms in Brazil were recruited between a specific calendar time and all farrowing pigs were monitored for stillbirths (as the outcome) (Silva et al., 2015). This is a fairly common type of fixed cohort in veterinary and human research, namely a birth cohort. For an example other than a birth cohort, Lefebvre et al. (2009) defined fixed cohorts based on dogs that visited human health care facilities versus other animal assisted activities.

Alternatively, a dynamic population (also known as an open population) may be considered where individuals can enter or leave the population during the study period.

This state of flux will occur through immigration, emigration, birth, or death (Rothman and Greenland, 1998). The use of hospital based records to identify cases and controls would typically be considered from a dynamic population. Patient owners may enter or leave the surrounding area that feeds the hospital or decide they prefer a neighboring hospital for any number of reasons. These owners would no longer be considered as part of the source population. For example, to investigate risk factors for cranial cruciate rupture in dogs, medical records spanning 12 years from a small animal hospital were examined to identify cases and controls (Adams et al., 2011). This population served by the animal hospital would have to be considered dynamic. There are however exceptions and a fixed cohort may be described from a hospital based case-control design. For example, a hypothetical case-control study could be used to examine the association between number of days a patient spends in the hospital and methicillin resistant 21

Staphylococcus pseudintermedius (MRSP) infection during an outbreak within the hospital. Here the source population would be considered fixed or closed if the controls were selected from patients admitted to the hospital during the outbreak. The population type will ultimately play a role in additional considerations that effect how the odds ratio may be interpreted such as the timing of control selection.

Timing of control selection

There are three options for the timing of control selection: at the beginning of the study period, concurrently as cases are enrolled, and at the end of follow-up.

Controls may be selected from the source population at the beginning of follow- up. This type of sampling scheme is referred to most commonly as a case-cohort design

(Prentice, 1986). This design choice has also been termed an inclusive design (Rodrigues and Kirkwood, 1990), case-base (Miettinen, 1982), or hybrid retrospective (Kupper et al.,

1975). This control sampling method identifies controls from a fixed cohort that were considered at risk before incident cases of disease were selected. That is a subcohort is randomly selected from the entire cohort and exposure history and other relevant characteristics are assessed (Prentice, 1986). The primary objective is again to sample controls such that their exposure distribution estimates the exposure distribution of the source population, which in this case is the fixed cohort. The primary advantage of this sampling strategy is it allows for the investigation of multiple disease outcomes using the same cohort of animals. For a review of the advantages and disadvantages of this sampling design see (Wacholder, 1991). As an example of this sampling approach,

O'Connor et al. (2012) obtained eye swabs from all calves born on two different farms in a single year (fixed cohort) and followed the calves for the development of infectious 22 bovine keratoconjunctivitis (IBK). The authors of this study employed a population- based cohort design to assess the association between IBK and different Moraxella spp.

However if a randomly selected subcohort of calves that became cases (infected with

IBK) were to be compared (i.e. isolation of Moraxella spp.) to calves that did not, this would have been considered a case-cohort design.

A second control sampling strategy is to sample the controls concurrently as cases are enrolled (Miettinen, 1976; Greenland et al., 1986). In other words, the controls are selected from the source population at risk at the same calendar time a case is selected.

For instance if computer records were used to find cases, controls could be selected among patients admitted or diagnosed on the same day as the case. With this type of sampling approach, controls may become cases but not vice versa. That is, a patient selected as a control may develop the disease during the study period but a patient with the disease is no longer at risk and thus cannot be considered as a control (Rodrigues and

Kirkwood, 1990). This approach may be employed when the source population is either a fixed cohort or dynamic. When controls are selected in this manner from dynamic populations it is typically referred to as incidence density sampling (Miettinen, 1976). For example, a case-control study examining risk factors for bovine neonatal pancytopenia

(BNP) used a fixed cohort and enrolled age- and farm-matched controls at the time a case was identified (Jones et al., 2013).

The third sampling approach entails selecting controls from non-cases at the end of follow-up. That is, patients that remain disease free and are thus still considered at risk at the study end. This is the traditional way in which the case-control design is taught.

This type of sampling is referred to as cumulative-incidence (Greenland and Thomas, 23

1982), traditional or exclusive design (Rodrigues and Kirkwood, 1990). It is with the teaching of this sampling strategy that the often mistaken “rare-disease” assumption is included (discussed below) (Cornfield, 1951; Rothman and Greenland, 1998).

Discussion of how the sampling strategy and additional assumptions impact the effect measure interpretation is included below.

Design Considerations for the Practical Management of Common Biases

An important goal of all study designs, observational or experimental, is to have internal validity. That is, except for random error, the estimated effect measure based on the study sample is accurate for the source population. Biases of some form or another will affect the internal validity of all study designs. There are numerous biases in observational research (Sackett, 1979). It is thus important to control for potential biases in the design of a case-control study in order to minimize a possibly distorted effect measure. A common way to classify biases is in three categories: selection, information, and (Dohoo et al., 2009). As confounding is a common concern in case- control studies it will be briefly introduced below however extensive discussions of selection and information biases are beyond the scope of this review but are described in numerous places (Schlesselman and Stolley, 1982; Wacholder, 1995; Rothman and

Greenland, 1998; Hernan et al., 2004; Dohoo et al., 2009).

One of the most important concepts in observational study design is the control of confounding. A confounding factor can distort the effect measure between the exposure and disease of interest when not controlled. Where randomization in a controlled trial attempts to set baseline factors to be equal between study groups, case-control studies

(and other observational designs) do not enjoy such luxury. As a result, differences 24 related to disease risk may inherently exist between study groups. Put another way, the cases and controls identified from the source population in a case-control study may have underlying disease risk differences due to some additional factor (the confounder) that when compared will result in a distorted or confounded association (Greenland and

Robins, 1986). Although not sufficient, there are three criteria that suggest a factor is a confounder as described by Rothman and Greenland (1998): the confounding factor is either itself a cause or proxy for a different cause of the disease, is associated with the exposure in the source population such that it “causes” the exposure [for a case-control study this means an association in the control group (Miettinen and Cook, 1981)], and it may not be intermediary between exposure and disease. For additional information on how to deal with the situation in which the confounder is an intermediate step between exposure and disease see (Robins and Greenland, 1992).

There are two main methods in which to address the issue of confounding in the case-control design: restriction, matching, and analytical methods.

Restriction

One of the simplest and most effective design choices to control for confounding is restriction (Greenland and Morgenstern, 2001). Restriction or restricted sampling refers to the exclusion of study subjects based on one level of a potential dichotomous or categorical confounding factor. If the confounder is measured on a continuous scale, suitable ranges can be determined as exclusionary criteria. As confounding is the result of an uneven distribution of an exposure-disease associated variable across study groups

(here cases and controls), excluding subjects possessing one level or narrow range of that variable will effectively control the impact of that confounding variable. Proper restricted 25 sampling will thus result in all study subjects having the same level of the confounding variable. When restriction is based on dichotomous or categorical variables, Dohoo et al.

(2009) suggest including subjects from the low-risk group (i.e. individuals with the level of the variable believed to be lower risk). Restriction may be based on any measureable confounding variable such as age, weight, breed, dam parity, or even hospital attended in a multicenter case-control study. For example, Wakeling et al. (2009) restricted their investigation of risk factors for hyperthyroidism to cats in the United Kingdom that were eight years of age and older.

Thoughtful consideration should be given before implementing restriction however. Restricted sampling will result in a decrease in the number of available potential study subjects. This decrease will become more exaggerated if more than one exclusionary criterion is implemented. Additionally, as all subjects will have the same value for the restricted factor, the effect of factor cannot be examined in the analysis.

Furthermore, restriction of potential subjects may result in reduced generalizability of study results (external validity). If a case-control study were restricted to overweight

Himalayan cats for example, careful discussion would be required in order to generalize any observed exposure-disease association to cats at large.

Matching

Matching is the pairing of individual controls to cases based on one or more measurable variables (e.g. breed, weight, age). Matching in a case-control study is the selection of the control series such that the distribution of a potential confounder is identical (or approximately identical) to that of the cases (Schlesselman and Stolley,

1982; Rothman and Greenland, 1998). Matching may be based on one or more potential 26 confounding factors. A hospital based case-control study may take advantage of a large database of potential controls that could be matched to cases based on a number of factors (e.g. age, weight, breed, reproductive status). Multiple matched controls may be selected for each case with careful considerations (Miettinen, 1968). This type of matching is also referred to as individual matching.

Frequency or category matching on the other hand consists of selecting the controls such the distribution of the matching factor is equal to that of the cases (Karon and Kupper, 1982). For example, if 60% of the case series consisted of Miniature

Pinschers and 40% German Shepard, controls would be selected to ensure approximately

60% were Miniature Pinscher and 40% German Shepard. Frequency matching is seemingly less common in the veterinary sciences but for case-control studies it has been argued to be more efficient than pair matching (Karon and Kupper, 1982).

Matching does not prevent confounding per se, but rather increases the efficiency of confounding control by ensuring approximately equal distribution of cases and controls across a confounding factor strata (Breslow and Day, 1980; Kupper et al., 1981;

Rothman and Greenland, 1998). As a result of the balanced (or equal distribution) of cases and controls across a matched factor, the effect measure of interest may have decreased variance (i.e. increased efficiency) (Kupper et al., 1981). There are however exceptions to this and matching may result in decreased efficiency when there is no confounding of the association of interest (Thomas and Greenland, 1983). Practical recommendations based on statistical efficiency have been previously outlined by

Thomas and Greenland (1983). 27

Importantly, if matching is implemented for the enrollment of controls, stratified or matched analysis must be conducted in order to ensure validity of the reported effect measure (Schlesselman and Stolley, 1982). For example, Stavisky et al. (2011) investigated risk factor for diarrhea in dogs where controls were matched to cases for both the timing of case presentation and clinic visited, and the analysis was conducted in a matched fashion (conditional logistic regression).

Analytical methods

The two most common analytical methods employed to control for confounding in case-control studies are stratification and regression models. These two approaches will be briefly described in practical terms; more thorough discussions may be found elsewhere (Breslow and Day, 1980; Rothman and Greenland, 1998; Dohoo et al., 2009;

Kleinbaum and Klein, 2010).

Stratified analysis is the simplest method in which to control for confounding in the association between exposure and disease. This analytical technique relies on separating or rather stratifying the data based on the confounding variables that were measured or assessed from the study subjects. When confounding variables are homogenous within each stratum, the impact of those variables in distorting the association is removed. To implement a stratified analysis, the following steps may be considered: 1) calculate and examine the odds ratios of each defined stratum, 2) conduct a statistical test for homogeneity (i.e. are the stratum-specific odds ratios equal at significance level α?), 3a) if the null hypothesis of homogeneity is rejected report each stratum-specific odds ratio separately, 3b) if homogeneity is considered reasonable, a single pooled estimate may be calculated. Typically, when the variables under study are 28 dichotomous, the pooled estimate will be a Mantel-Haenszel estimator or Mantel-

Haenszel odds ratio, which is a weighted average that can be calculated by hand or with user-friendly open source software such OpenEpi (Open Source Epidemiologic for Public Health) (Mantel and Haenszel, 1959; Sullivan et al., 2009). Alternatively, a logistic regression model may be used with conditional maximum likelihood estimators which are not based on stratum-specific weights and require more involved calculations

(Breslow and Day, 1980; Kleinbaum and Klein, 2010). The main advantage of employing a logistic model for analyzing matched data from a case-control study is the ability to control for other measured variables that may have an effect on the exposure-disease association (Kleinbaum and Klein, 2010).

It should be noted that statistical tests of homogeneity are of lower power, and some authors recommend the assumption homogeneity should be based on the judgment and biological knowledge of the researcher (Greenland, 1983; Rothman and Greenland,

1998).

When multiple confounders are measured and the dataset is stratified across those measured confounders, a “sparse-data” issue may arise. This problem occurs when some strata have fewer individuals as a result of the multiple stratifications and comparing exposed and unexposed is not valid or even possible (Greenland et al., 2000). Conditional logistic regression models can be used to minimize this issue but should be carefully considered as this method is not guaranteed to be effective and other methods (well beyond the scope of this review) might be considered (Greenland et al., 2000). As with a stratified analysis, the decision to utilize a regression model in the analysis requires the 29 measuring of all potentially relevant covariates a priori to beginning the study.

Additional information may be found elsewhere as suggested above.

Interpretation of the Odds Ratio

The follow section describes the odds ratio (or cross-product ratio) in two ways: proper interpretation compared to risk or rate ratios and what the cross-product ratio estimates based on the above control sampling methods.

Describing the odds ratio

The odds ratio is the most common effect measure derived from case-control studies. Where probability is the proportion of event (or disease outcome) occurrence if observed many times (ranging from zero to one), odds are the ratio of the probability of event occurrence to the probability of no event (ranging from zero to infinity) (Grimes and Schulz, 2008). An odds ratio is thus the ratio of the odds calculated for two different groups. For a case-control study, the odds ratio is typically the odds of exposure in the cases divided by the odds of exposure in the controls, termed an exposure odds ratio. As the defining feature of the case-control design is the sampling on an outcome or disease status, only the exposure odds ratio can be directly calculated. This ratio is mathematically equivalent to the ratio of the odds of disease in the exposed to the odds of disease in the non-exposed which is more valuable in making estimates about disease

“risk” (Sackett et al., 1996). However, the odds (or odds ratio) may not be interpreted as a risk and is often regarded as non-intuitive owing to the difficulty in the proper interpretation. As an effect measure, odds ratios may be calculated for most any study design. 30

Importantly, odds are different than risk or rate. The risk refers to the probability

(expressed as a proportion or percentage) that a person develops the outcome over the time period under study. The rate on other hand refers to the rate at which at risk study subjects develop the outcome. The rate is calculated as the number of study subjects at risk over the combined person years of all participants (Rothman and Greenland, 1998;

Vandenbroucke and Pearce, 2012b). The risk (or rate) ratio from an appropriately designed study is thus the risk proportion in one group (e.g. exposed individuals from a cohort study) divided by the risk proportion in a second group (e.g. unexposed individuals). For clinicians, this relative measure is much more intuitive and easier to understand as it describes the likelihood for developing a disease outcome in terms of a probability.

The distinction between relative effect measures, specifically the odds and risk ratios, may be often confused or misinterpreted (O'Connor, 2013). Altman et al. (1998) reported on an example where the odds ratio calculation was described as “88 times more likely”. While the calculation was indeed correct, the risk ratio was only 7 and thus interpreting 88 as a probability is wholly misinforming. Holcomb et al. (2001) searched for studies reporting odds ratios from two human medicine journals over the course of a single year and observed the odds ratio erroneously interpreted as a risk ratio in a quarter of those studies. Importantly, interpreting the odds ratio as a risk ratio will always overestimate (i.e. further from the null value of one) the true association (Davies et al.,

1998). Interpreting the odds ratio as a risk ratio will rarely have an impact qualitatively

(i.e. direction remains the same but the magnitude differs) (Davies et al., 1998). However under certain modeled conditions the odds and risk ratio may indeed be in opposite 31 directions of the null (Brumback and Berg, 2008). As suggested by O'Connor (2013), when reporting the odds ratio, it is important to avoid the terms risk or probability.

There are however compelling reasons for which the non-intuitive odds ratio is a useful effect measure including; examining the association between two dichotomous variables (as in a 2x2 table), using logistic regression modeling techniques to control extraneous variables produces odds ratios, and the odds ratio has appealing statistical properties for use in pairwise- and network meta-analyses.

Cross-product ratio estimation

As mentioned above, the odds ratio is rather difficult to interpret properly. The odds ratio is itself a characteristic of the study population, thus suggestions that the odds ratio will estimate a risk or rate ratio are not entirely accurate as described by (O'Connor,

2014). However, the calculation of the odds ratio, may estimate a risk or rate ratio depending on study design considerations. The final section of this review will thus consider those design options that were introduced earlier and how they impact the interpretation of the odds ratios calculation.

The calculation of the odds ratio from a standard 2x2 table is commonly referred to as a cross-product ratio. It is called a cross-product ratio as the odds of the outcome in the exposed divided by the odds of the outcome in the non-exposed is mathematically equivalent to multiplying and dividing the corners from a 2x2 table. The odds ratio may also be derived from the exponent of the covariate of interest from a regression model

[see Kleinbaum and Klein (2010) for more detailed discussion]. For simplicity, the remainder of this section will describe how the cross-product ratio may estimate other relative effect measures but these considerations hold for model based calculations as 32 well. The information of the following sections has been wonderfully illustrated (Figure

1) by Knol et al. (2008).

Incident cases from a fixed cohort

When incident cases are chosen and the source population is considered fixed

(e.g. an infectious disease outbreak in a hospital over the course of six months), the three main choices for the timing of control sampling and additional assumptions will impact the interpretation of the cross-product ratio. If controls are sampled from the beginning of follow-up (i.e. a case-cohort design), the cross-product ratio will estimate a risk ratio if study subject losses to follow-up are not related to exposure status (Kupper et al., 1975;

Greenland et al., 1986; Vandenbroucke and Pearce, 2012a). As a hypothetical example, in a hospital based case-control study of a MRSP outbreak, cases could be selected from dogs that tested positive for MRSP during the defined outbreak period. The controls could theoretically be selected at random from all dogs that entered or were already in the hospital at the start of the outbreak. The cross-product ratio calculated from such a study would on average be a good estimate of the risk ratio, a far more easily interpreted effect measure.

Alternatively, if controls are sampled concurrently (i.e. control subjects selected from the hospital records on or approximately the same day as each case is enrolled) and the analysis is matched on time, the cross-product ratio will estimate a rate ratio

(Greenland and Thomas, 1982; Langholz and Thomas, 1990; Vandenbroucke and Pearce,

2012a). The cross-product ratio will estimate a rate ratio as opposed to a risk ratio because the time matched control subjects (patients at risk for the outcome) represent the combined person years which are used to calculate in a rate ratio in a cohort study for 33 example (Rodrigues and Kirkwood, 1990). Jones et al. (2013) employed this design strategy to examine risk factors for bovine neonatal pancytopenia (BNP). As their analysis was matched on time, the resulting odds ratio from the logistic conducted by the authors would estimate the rate of BNP incidence in the calves under study.

Finally, when controls are selected from a fixed cohort at the end of the study period, and the disease being investigated is considered rare in the population, the cross- product ratio will estimate a risk ratio (Miettinen, 1976; Vandenbroucke and Pearce,

2012a). This is the traditional way in which case-control studies have been taught and specifically with the notion of the “rare disease assumption” introduced by Cornfield

(1951). Indeed, a recent survey of case-control studies in the veterinary science observed that of seven studies employing incident cases from a fixed cohort, four sampled controls from subject at the end of follow-up.

Incident cases from a dynamic population

When a case-control study is designed such that incident cases and controls are selected from a dynamic population (i.e. a population continually in flux such as patients for a large referral hospital), the main consideration in interpretation of the cross-product ratio is the matching of controls to cases on time. When controls are matched to cases on time, the cross-product ratio will estimate a rate ratio without additional assumptions

(Greenland and Thomas, 1982). If controls are not matched to cases on time, the cross- product ratio will still estimate a rate ratio if the population can be considered stable. That is, the distributions of the variables under study, in particular the exposure, are not likely 34 to change meaningfully over the course of the study (Greenland et al., 1986; Knol et al.,

2008).

Conclusions

This review summarized the relevant design and interpretation aspects of the hospital-based case-control study. Specifically, the population source, the nature of the cases, selection and timing of valid control enrollment, considerations of matching and restriction, proper interpretation of the odds ratio, and design strategies to enable a more interpretable effect measure were described. Importantly, when the case-control study is properly described and reported in the literature, more value from the study can be obtained. For example, often times systematic reviews will exclude case-control studies as they are believed to be of less inferential value. If an excluded case-control study estimated a risk ratio comparable to what would have been observed had a cohort study been conducted, excluding that case-control study would result in the loss of potential valuable information in understanding a disease etiology. 35

Glossary

Accuracy: How close a measurement mirrors the true value.

Efficiency: The value of information from a study considering both time and money cost considerations or statistical efficiency related to estimate precision.

Precision: The lack of random error as related to reproducibility.

Validity: Inference derived from a study is accurate with regards to the source population from which study subjects were drawn (internal validity) or to individuals not included in the source population (external validity). Also relates to a lack of bias. 36

References

Adams, P., Bolus, R., Middleton, S., Moores, A.P., Grierson, J., 2011. Influence of signalment on developing cranial cruciate rupture in dogs in the UK. Journal of Small Animal Practice 52, 347-352. Altman, D.G., Deeks, J.J., Sackett, D.L., 1998. Odds ratios should be avoided when events are common. BMJ 317, 1318. Berryessa, N.A., Johnson, L.R., Kasten, R.W., Chomel, B.B., 2008. Microbial culture of blood samples and serologic testing for bartonellosis in cats with chronic rhinosinusitis. Journal of the American Veterinary Medical Association 233, 1084-1089. Breslow, N., Day, N., 1987. Statistical Methods in Cancer Research Volume II: The Design and Analysis of Cohort Studies—IARC Scientific Publications No. 82. Lyon: International Agency for Research on Cancer. Breslow, N.E., Day, N.E., 1980. Statistical methods in cancer research. Volume I - The analysis of case-control studies. IARC Sci Publ, 5-338. Brumback, B., Berg, A., 2008. On effect-measure modification: Relationships among changes in the , odds ratio, and risk difference. Stat Med 27, 3453- 3465. Calvo-Artavia, F.F., Nielsen, L.R., Dahl, J., Clausen, D.M., Graumann, A.M., Alban, L., 2013. A case-control study of risk factors for bovine cysticercosis in Danish cattle herds. Zoonoses and Public Health 60, 311-318. Concato, J., Shah, N., Horwitz, R.I., 2000. Randomized, controlled trials, observational studies, and the hierarchy of research designs. N Engl J Med 342, 1887-1892. Cornfield, J., 1951. A method of estimating comparative rates from clinical data; applications to cancer of the lung, breast, and cervix. J Natl Cancer Inst 11, 1269- 1275. Davies, H.T., Crombie, I.K., Tavakoli, M., 1998. When can odds ratios mislead? BMJ 316, 989-991. Dekkers, O.M., Egger, M., Altman, D.G., Vandenbroucke, J.P., 2012. Distinguishing case series from cohort studies. Ann Intern Med 156, 37-40. Dohoo, I.R., Martin, W., Stryhn, H.E., 2009. Veterinary Epidemiologic Research. AVC Inc. Prince Edward Island, Canada. Euser, A.M., Zoccali, C., Jager, K.J., Dekker, F.W., 2009. Cohort studies: prospective versus retrospective. Nephron Clin Pract 113, c214-217. Farcas, N., Lommer, M.J., Kass, P.H., Verstraete, F.J., 2014. Dental radiographic findings in cats with chronic gingivostomatitis (2002-2012). J Am Vet Med Assoc 244, 339-345. Gail, M., Williams, R., Byar, D.P., Brown, C., 1976. How many controls? J Chronic Dis 29, 723-731. Greenland, S., 1983. Tests for interaction in epidemiologic studies: a review and a study of power. Stat Med 2, 243-251. Greenland, S., Morgenstern, H., 2001. Confounding in health research. Annu Rev Public Health 22, 189-212. 37

Greenland, S., Robins, J.M., 1986. Identifiability, exchangeability, and epidemiological confounding. Int J Epidemiol 15, 413-419. Greenland, S., Schwartzbaum, J.A., Finkle, W.D., 2000. Problems due to small samples and sparse data in conditional logistic regression analysis. Am J Epidemiol 151, 531-539. Greenland, S., Thomas, D.C., 1982. On the need for the rare disease assumption in case- control studies. Am J Epidemiol 116, 547-553. Greenland, S., Thomas, D.C., Morgenstern, H., 1986. The rare-disease assumption revisited. A critique of "estimators of relative risk for case-control studies". Am J Epidemiol 124, 869-883. Grimes, D.A., Schulz, K.F., 2002. Cohort studies: marching towards outcomes. Lancet 359, 341-345. Grimes, D.A., Schulz, K.F., 2008. Making sense of odds and odds ratios. Obstet Gynecol 111, 423-426. Hernan, M.A., Hernandez-Diaz, S., Robins, J.M., 2004. A structural approach to selection bias. Epidemiology 15, 615-625. Holcomb, W.L., Jr., Chaiworapongsa, T., Luke, D.A., Burgdorf, K.D., 2001. An odd measure of risk: use and misuse of the odds ratio. Obstet Gynecol 98, 685-688. Huang, A.A., Moore, G.E., Scott-Moncrieff, J.C., 2012. Idiopathic immune-mediated thrombocytopenia and recent vaccination in dogs. Journal of Veterinary Internal Medicine 26, 142-148. Ibrahim, M.A., Spitzer, W.O., 1979. The case control study: the problem and the prospect. J Chronic Dis 32, 139-144. Jones, B.A., Sauter-Louis, C., Henning, J., Stoll, A., Nielen, M., Schaik, G.v., Smolenaars, A., Schouten, M., Uijl, I.d., Fourichon, C., Guatteo, R., Madouasse, A., Nusinovici, S., Deprez, P., Vliegher, S.d., Laureyns, J., Booth, R., Cardwell, J.M., Pfeiffer, D.U., 2013. Calf-level factors associated with bovine neonatal pancytopenia - a multi-country case-control study. PLoS ONE 8, e80619. Karon, J.M., Kupper, L.L., 1982. In defense of matching. Am J Epidemiol 116, 852-866. Kleinbaum, D.G., Klein, M., 2010. Logistic regression: a self-learning text. Springer Science & Business Media. Knol, M.J., Vandenbroucke, J.P., Scott, P., Egger, M., 2008. What do case-control studies estimate? Survey of methods and assumptions in published case-control research. Am J Epidemiol 168, 1073-1081. Kupper, L., McMichael, A., Spirtas, R., 1975. A hybrid epidemiologic study design useful in estimating relative risk. Journal of the American Statistical Association 70, 524-528. Kupper, L.L., Karon, J.M., Kleinbaum, D.G., Morgenstern, H., Lewis, D.K., 1981. Matching in epidemiologic studies: validity and efficiency considerations. Biometrics 37, 271-291. Langholz, B., Thomas, D.C., 1990. Nested case-control and case-cohort methods of sampling from a cohort: a critical comparison. Am J Epidemiol 131, 169-176. Lefebvre, S.L., Reid-Smith, R.J., Waltner-Toews, D., Weese, J.S., 2009. Incidence of acquisition of methicillin-resistant Staphylococcus aureus, Clostridium difficile, and other health-care-associated pathogens by dogs that participate in animal- 38

assisted interventions. Journal of the American Veterinary Medical Association 234, 1404-1417. Mann, C.J., 2003. Observational research methods. Research design II: cohort, cross sectional, and case-control studies. Emerg Med J 20, 54-60. Mantel, N., Haenszel, W., 1959. Statistical aspects of the analysis of data from retrospective studies of disease. J Natl Cancer Inst 22, 719-748. Mesich, M.L., Mayhew, P.D., Paek, M., Holt, D.E., Brown, D.C., 2009. Gall bladder mucoceles and their association with endocrinopathies in dogs: a retrospective case-control study. J Small Anim Pract 50, 630-635. Miettinen, O., 1976. Estimability and estimation in case-referent studies. Am J Epidemiol 103, 226-235. Miettinen, O., 1982. Design options in epidemiologic research. An update. Scand J Work Environ Health 8 Suppl 1, 7-14. Miettinen, O.S., 1968. The matched pairs design in the case of all-or-none responses. Biometrics 24, 339-352. Miettinen, O.S., 1985a. The "case-control" study: valid selection of subjects. J Chronic Dis 38, 543-548. Miettinen, O.S., 1985b. Theoretical epidemiology: principles of occurrence research in medicine. Wiley New York. Miettinen, O.S., Cook, E.F., 1981. Confounding: essence and detection. Am J Epidemiol 114, 593-603. Minozzi, G., Buggiotti, L., Stella, A., Strozzi, F., Luini, M., Williams, J.L., 2010. Genetic loci involved in antibody response to Mycobacterium avium ssp. paratuberculosis in cattle. PLoS ONE, e11117. Morgenstern, H., Thomas, D., 1993. Principles of study design in environmental epidemiology. Environ Health Perspect 101 Suppl 4, 23-38. Moritz, D.J., Kelsey, J.L., Grisso, J.A., 1997. Hospital controls versus community controls: differences in inferences regarding risk factors for hip fracture. Am J Epidemiol 145, 653-660. O'Connor, A.M., 2013. Interpretation of odds and risk ratios. Journal of veterinary internal medicine / American College of Veterinary Internal Medicine 27, 600- 603. O'Connor, A.M., 2014. Letter to the editor. Journal of veterinary internal medicine / American College of Veterinary Internal Medicine 28, 1382-1383. O'Connor, A.M., Sargeant, J.M., 2014. Meta-analyses including data from observational studies. Prev Vet Med 113, 313-322. O'Connor, A.M., Shen, H.G., Wang, C., Opriessnig, T., 2012. Descriptive epidemiology of Moraxella bovis, Moraxella bovoculi and Moraxella ovis in beef calves with naturally occurring infectious bovine keratoconjunctivitis (Pinkeye). Veterinary microbiology 155, 374-380. Pearce, N., 2004. Effect measures in prevalence studies. Environ Health Perspect 112, 1047-1050. Pearce, N., 2012. Classification of epidemiological study designs. Int J Epidemiol 41, 393-397. 39

Phillips, B., Ball, C., Sackett, D., Badenoch, D., Straus, S., Haynes, B., Dawes, M., 2001. Oxford centre for evidence-based medicine levels of evidence. Verfügbar unter: http://www. cebm. net/levels_of_evidence. asp. Pomp, E.R., Van Stralen, K.J., Le Cessie, S., Vandenbroucke, J.P., Rosendaal, F.R., Doggen, C.J., 2010. Experience with multiple control groups in a large population-based case-control study on genetic and environmental risk factors. Eur J Epidemiol 25, 459-466. Prentice, R.L., 1986. A case-cohort design for epidemiologic cohort studies and disease prevention trials. Biometrika 73, 1-11. Prentice, R.L., 1995. Design issues in cohort studies. Stat Methods Med Res 4, 273-292. Robins, J.M., Greenland, S., 1992. Identifiability and Exchangeability for Direct and Indirect Effects. Epidemiology 3, 143-155. Rodrigues, L., Kirkwood, B.R., 1990. Case-control designs in the study of common diseases: updates on the demise of the rare disease assumption and the choice of sampling scheme for controls. Int J Epidemiol 19, 205-213. Rothman, K., Greenland, S., 1998. Modern Epidemiology. Lippincott Williams & Wilkins. Sackett, D.L., 1979. Bias in analytic research. J Chronic Dis 32, 51-63. Sackett, D.L., Deeks, J.J., Altman, D.G., 1996. Down with odds ratios! Evidence Based Medicine 1, 164-166. Sallander, M., Eliasson, J., Hedhammar, A., 2012. Prevalence and risk factors for the development of diabetes mellitus in Swedish cats. Acta veterinaria Scandinavica 54, 61. Schlesselman, J.J., Stolley, P.D., 1982. Case-control studies : design, conduct, analysis. Oxford University Press New York. Schulz, K.F., Grimes, D.A., 2002. Case-control studies: research in reverse. Lancet 359, 431-434. Silva, G.S., da Costa Lana, M.V., Dias, G.B.G., da Cruz, R.A.S., Lopes, L.L., Machado, G., Corbellini, L.G., Gava, D., Souza, M.A., Pescador, C.A., 2015. Case-control study evaluating the sow's risk factors associated with stillbirth piglets in Midwestern in Brazil. Tropical animal health and production 47, 445-449. Silverlas, C., de Verdier, K., Emanuelson, U., Mattsson, J.G., Bjorkman, C., 2010. Cryptosporidium infection in herds with and without calf diarrhoeal problems. Parasitology research 107, 1435-1444. Stavisky, J., Radford, A.D., Gaskell, R., Dawson, S., German, A., Parsons, B., Clegg, S., Newman, J., Pinchbeck, G., 2011. A case-control study of pathogen and lifestyle risk factors for diarrhoea in dogs. Preventive Veterinary Medicine 99, 185-192. Sullivan, K.M., Dean, A., Soe, M.M., 2009. OpenEpi: a web-based epidemiologic and statistical calculator for public health. Public Health Rep 124, 471-474. Thomas, D.C., Greenland, S., 1983. The relative efficiencies of matched and independent sample designs for case-control studies. J Chronic Dis 36, 685-697. Ury, H.K., 1975. Efficiency of case-control studies with multiple controls per case: continuous or dichotomous data. Biometrics 31, 643-649. Vandenbroucke, J.P., 1991. Prospective or retrospective: what's in a name? BMJ 302, 249-250. 40

Vandenbroucke, J.P., Pearce, N., 2012a. Case-control studies: basic concepts. Int J Epidemiol 41, 1480-1489. Vandenbroucke, J.P., Pearce, N., 2012b. Incidence rates in dynamic populations. Int J Epidemiol 41, 1472-1479. Wacholder, S., 1991. Practical considerations in choosing between the case-cohort and nested case-control designs. Epidemiology 2, 155-158. Wacholder, S., 1995. Design issues in case-control studies. Stat Methods Med Res 4, 293-309. Wacholder, S., McLaughlin, J.K., Silverman, D.T., Mandel, J.S., 1992a. Selection of controls in case-control studies. I. Principles. Am J Epidemiol 135, 1019-1028. Wacholder, S., Silverman, D.T., 1990. Re: "Case-control studies using other diseases as controls: problems of excluding exposure-related diseases". Am J Epidemiol 132, 1017-1019. Wacholder, S., Silverman, D.T., McLaughlin, J.K., Mandel, J.S., 1992b. Selection of controls in case-control studies. II. Types of controls. Am J Epidemiol 135, 1029- 1041. Wakeling, J., Everard, A., Brodbelt, D., Elliott, J., Syme, H., 2009. Risk factors for feline hyperthyroidism in the UK. Journal of Small Animal Practice 50, 406-414. Walter, S.D., 1977. Determination of significant relative risks and optimal sampling procedures in prospective and retrospective comparative studies of various sizes. Am J Epidemiol 105, 387-397. West, D.W., Schuman, K.L., Lyon, J.L., Robison, L.M., Allred, R., 1984. Differences in risk estimations from a hospital and a population-based case-control study. Int J Epidemiol 13, 235-239.

41

CHAPTER 3. THE CASE-CONTROL DESIGN IN VETERINARY SCIENCES: A SURVEY

A paper submitted to Preventive Veterinary Medicine

Jonah N. Cullen, Jan M. Sargeant, Kelly M. Makielski, Annette M. O’Connor

Abstract

Background: The case-control study design is deceptively simple. A number of design considerations affect the effect measure estimated by the design. An investigation of case-control studies in the human literature suggested a number of these considerations are not described in reports of case-control studies.

Hypothesis/Objectives: The hypothesis was that the majority of veterinary studies labeled case-control would be incident density designs and many would not interpret the effect measure obtained from those studies as the rate ratio rather than the odds ratio.

Methods: Reference databases were searched for author designated case-control studies.

A survey of 100 of these studies was conducted to examine the different design options described and estimated effect measures.

Results: Of the 100 case-control studies, 83 assessed a disease etiology or causal association and of those only 54 (65.1%) sampled the study population based on an outcome and would thus be considered case-control designs. Twelve studies were incidence density designs but none used this terminology. Of the studies that reported an odds ratio as the effect measure, none reported on additional considerations that would have enabled a more interpretable result. 42

Conclusions: The survey indicated many case-control labeled studies were not actual case-control designs and among case-control studies key design aspects were not often described. The absence of information about study design elements and underlying assumptions in case-control studies limits the ability to establish the effect measured by the study and their evidentiary value might be underestimated.

Introduction

Background and rationale

In 2008, a survey was conducted of 150 case-control studies in human medicine that assessed the study design elements, underlying assumptions, and interpretation of the results

(Knol et al., 2008). The survey findings suggested that many authors did not interpret the calculation of the odds ratio (OR) as a rate or risk rate when possible. For example, the stable population assumption, required for the OR calculation to estimate a rate ratio, was not stated by any of the authors in their sample. Knol et al. (2008) results may have been associated with recent case-control studies appropriately reporting necessary assumptions, additional surveys of case-control studies in specific topic areas, and rethinking of the use of case-control designs in human health (Fullerton et al., 2012; Goldberg et al., 2013; Touvier et al., 2013; Adam et al.,

2015). Although debate about the principles, conduct, reporting, and interpretation is not new

(Greenland and Thomas, 1982; Wacholder et al., 1992a; Wacholder et al., 1992c; von Elm et al.,

2007; O'Connor, 2013).

The case-control study is one of the “big three” observational study designs: cohort, case- control, and cross-sectional designs (von Elm et al., 2007). These designs are frequently used to assess exposure and outcome associations, i.e. etiology. Often the outcome is a disease, although not always. For simplicity, disease etiology will be used throughout this manuscript instead of 43 exposure-outcome association when possible. The case-control design is described as one of the most efficient choices for a research question regarding disease etiology. This is due to the relative inexpensive nature of the design in both assessment of exposure (as a result of assessing only a sample of the population) and the lack of required follow-up time (Rothman and

Greenland, 1998). These benefits are even more pronounced when the outcome of interest is rare. For these reasons, the case-control design is often conducted in the early stages of investigating a disease etiology. However, while comparatively inexpensive and efficient, the case-control design can be complex to conduct, requiring close attention to detail, suitable case and control selection, and interpretation of the effect measure obtained from case control studies.

Case-control designs are characterized by the selection approach into the study based on the outcome rather than exposure. This selection approach is the defining feature of the case- control study design. Although often the outcome of interest in a case-control study is the disease status of the cases, hence the name, this is not a requirement of the case-control design. For example, what would be considered an exposure in a cohort study may be used as an outcome in a case-control design. That is, if investigators were interested in examining the association between a disease (exposure) and the development of a secondary disease (outcome), a case- control study could be designed such that cases are enrolled based on the secondary disease.

Subjects included in the study with the outcome of interest are considered the case population

(cases) and subjects from the same source population without the defined outcome are considered the control population (controls).

There are four main considerations when designing, reporting, and properly interpreting a case control study: 1) nature of the enrolled cases (incident or prevalent) (Pearce, 2012), 2) type of source population (fixed cohort or dynamic) (Miettinen, 1985b; Rothman and Greenland, 44

1998), 3) the source and timing of control selection (Miettinen, 1985a; Wacholder et al., 1992b) and 4) necessary assumptions and analyses considerations (Greenland and Thomas, 1982;

Schlesselman and Stolley, 1982). The choice of incident or prevalent case enrollment will be referred to as “case type” throughout this manuscript. The interactions of these four design features affect the inference that can be obtained from a case-control study, and importantly the evidentiary value of the results. Figure 1, gratefully reproduced from Knol et al. (2008), depicts a flow diagram of these design features and will be referenced throughout this manuscript.

Objectives

Given the common use of the case-control design in veterinary clinical science and the findings of Knol et al. (2008), our objective was to determine if similar reporting issues occur in veterinary sciences with the case-control design. Our approach to addressing this objective was to replicate the Knol et al. (2008) study in veterinary science with small modifications. We examined case-control studies for the source population type, nature of the selected outcome, timing of control selection, and reporting of assumptions. Our working hypothesis was that the majority of veterinary studies using the case-control label would be incident density studies and that many authors would not report the effect measure obtained from those studies as the rate ratio but rather than the odds ratio.

Methods

Study design

The study was envisaged as a descriptive hypothesis-generating study. To address the objective, a survey was conducted to characterize a sample of studies, designated by the original authors as case-control studies, by their “true” design based on the described characteristics.

Both Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) and 45

Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) guidelines were considered in preparing this manuscript (von Elm et al., 2007; Moher et al., 2009).

Information sources and electronic search

The Center for Biosciences and Agriculture International (CABI) and MEDLINE® databases were searched to identify potentially relevant manuscripts published in English since

2006 via the Web of Science interface through Iowa State University. The search was designed to capture manuscripts that were described by the authors as "case control" in either the title or abstract using the Web of Science “Topic” and “Title” field tags (Table 1). The “Title” field tag specifically searches within the title field for both CABI and MEDLINE. Relevant to our search, the “Topic” field tag for CABI searches within the title (including foreign titles), abstract, descriptors (including organism and “up-posted” descriptors), identifiers, and CABICODES. For

MEDLINE, the relevant fields in which the “Topic” field tag searches are the title, abstract,

Medical Subject Heading (MeSH) terms, and keyword list. Database searches were conducted in

June of 2015. The species chosen were based on convenience, relevance to the veterinary science, and anticipated size of the published literature available. The search used two concepts

“case control” and species. The species chosen were cats, dogs, pigs, cows, and horses. The time limit (post 2006) was imposed because it was considered more relevant to study recently published studies and more of such studies would be available electronically.

Study selection – refining the results of the electronic search

Citations retrieved by the electronic searches were downloaded and duplicate manuscripts were identified and removed via EndNote X7.5a. The remaining citations were imported into the online systematic review software DistillerSRb, which was used to manage all manuscripts and 46 associated data. After importing into DistillerSR, duplicate studies not identified in EndNote were removed using the DistillerSR “Duplicate Detection” tool.

A single reviewer (JC) conducted the initial screening to identify all eligible manuscripts that were author designated as case-control in the title or abstract. Manuscripts not specifically described as some form of case-control were excluded. To be eligible, the abstract or title needed to include the terms “case” and “control” adjacently; descriptions of cases and controls only were not considered. Of the manuscripts eligible following the initial screening, 100 manuscripts were then randomly selected (using the DistillerSR random sorting function) for further evaluation and the full texts obtained. After obtaining the full text, manuscripts were removed for any of the following reasons: upon closer examination “case control” was not described in the title or abstract, the manuscript was a preliminary report, did not assess an animal health outcome, or was not available in English. Manuscripts were replaced (randomly) as needed, in order to end up with a total of 100.

Data collection process – quality control approaches for data extraction

The data extraction form was pilot tested by three reviewers to ensure completeness of assessment and agreement. Example studies were extracted by all reviewers and results compared. The form was modified until all reviewers agreed.

Two reviewers reviewed all studies, and conflicts were resolved by discussion. If two reviewers could not come to a consensus for any item of a manuscript, a third reviewer was included for adjudication. As part of the pilot test, the following guidelines were established to assess specific design features. If the two reviewers decided the presumed purpose of the study was a diagnostic test accuracy (DTA) study rather than assessment of an exposure-disease 47 relationship, no design elements were examined. Only if the study described using newly diagnosed patients were cases considered incident and we made no assumptions about the "lag time" between disease onset and case presentation. As an example, dogs might have Cushing’s disease (hyperadrenocorticism) for extended periods (perhaps months) prior to presentation at the veterinary clinic; however, we considered these incident cases when presented for the first time. Cases were considered prevalent when the authors stated patients were previously diagnosed or were described as having chronic disease symptoms. Note for hospital-based studies, authors rarely specifically indicated that cases were the first presentation, but often did indicate that cases had been previously seen. Diseases known to be recurrent or chronic in livestock were considered prevalent unless it was specifically stated as a newly diagnosed herd or animal. Congenital diseases were considered prevalent (Mason et al., 2005). For studies involving cats or dogs, an author (KM) was consulted if this case type was ultimately unclear. In deciding between population type (fixed cohort and dynamic), if no other patients could conceivably enter the source population it was considered a fixed cohort. Within dynamic populations, we made no assumptions regarding the validity of the timing of case to control selection.

Data items – data extracted from manuscripts

For the 100 manuscripts we extracted the species [later grouped by animal type: small animal (cat, dog) and large animal (cow, pig, horse)], year of publication, journal, and year(s) from which the study population was drawn. We then assessed the presumed purpose of the study based on the author description. Options included an etiology assessment, DTA, or unclear. The presumed purpose was considered to be a DTA based upon descriptions of novel diagnostic techniques or clinical scoring systems, reports of sensitivity and specificity, ROC 48 curves or likelihood ratios, or if the authors specifically stated the purpose was a test evaluation.

Etiology assessment was considered to be the purpose if the authors appeared to be making inference about causation. The purpose was considered unclear if was not possible to differentiate between etiology assessment or DTA.

For studies that were considered to have an etiology purpose, the next step was to extract information that would allow determination of the design based on the Knol et al (2008) schematic (Figure 1). This design information included sampling from the source population based on the outcome (yes, no, not discernable), unit of observation (individual, group, mixed) and the case type (incident, prevalent, mixed). For studies that did not state, or were unclear, as to whether sampling was on an outcome, inclusion of a longitudinal or temporal component was assessed. If the authors examined the exposure-disease association at a single point in time, it was considered a cross-sectional design and if one of these variables preceded the other, it was considered a cohort design. These studies were not investigated further as they were not considered to be case-control designs. Additionally, studies with unclear case type were not examined further as determining subsequent design considerations would be of little value.

We also extracted the author specific name for the design (e.g. case-control design, retrospective case-control). Author specific design titles were summarized in order to produce a reasonable number of design descriptions. For example, a design title of “prospective, multicenter, unmatched case-control study” was summarized as “prospective unmatched case- control”. Similarly, a “multi-country matched case-control study” was summarized as “matched case-control study”. 49

Finally for studies that did sample from the source population based on a clearly described case type [i.e. case-control studies with incident, prevalent, or both (mixed) case types], we extracted the following information to determine the effect measure that could be estimated: the presumed nature of the source population type (fixed cohort or dynamic source), the timing of control selection (beginning, end, or concurrently), the author stated underlying assumptions (censoring of study subjects not related to exposure, rare disease in the source population, matched analysis, stable exposure distribution), and the level of analysis (individual or group). Often authors did not report these factors but an interpretation based on the author’s description of the study was made. The author reported effect measures (e.g. odds ratio, rate ratio, risk ratio, regression coefficients) were also recorded for each study. If the disease state (or outcome) was not dichotomous, the reported effect measure was considered not applicable. For examining whether the OR calculation may estimate a rate or risk ratio, only studies investigating incident diseases from either a fixed cohort or dynamic population that provided

OR (i.e. studies reporting prevalence OR or regression coefficients without transformation to an

OR were excluded) were considered for the reporting of necessary assumptions. This was determined in conjunction with Figure 1, reproduced from Knol et al. (2008).

Bias

We did not expect any major sources of bias in our survey. We did not restrict our search by journal or species and randomly selected our sample population of studies from the source. To reduce information bias in data extraction, we used pretested forms and duplicate data extraction.

Study size

The rationale for the sample size was pragmatic as no prior information was available on which to base a sample size calculation. A sample size of 100 was chosen by one author (AOC) 50 because previously conducted studies of similar exploratory style used this size and it seems sufficiently large to provide a good “picture” of the topic (Sargeant et al., 2009; Sargeant et al.,

2010; Sargeant et al., 2011). Further, as we did not test any hypotheses, we did not need to evaluate power of the study to test a hypothesis.

Quantitative variables

We recorded no quantitative variables.

Statistical methods and approach to summarization of results

No statistical hypotheses were tested, and basic descriptive statistics were calculated to describe the frequency of design characteristics. Although statistical tests were not conducted to address the working hypotheses, descriptive statistics were used subjectively to report the frequencies of different sampling strategies (e.g. exclusive sampling versus incidence density sampling) and the assumptions stated for the effect measure estimation of the OR calculation.

Results

Study population

The search resulted in 744 citations from CABI and 2,153 from MEDLINE. Following removal of duplicate manuscripts, 2,289 citations remained. Five hundred and fifty three were identified as some form of case-control study in the title or abstract by the authors. One hundred manuscripts were randomly selected, 12 of which were excluded and randomly replaced due to not being described as case-control upon examination of the full text (n=2), considered to be a preliminary report (n=2), not an animal health outcome (n=1), or were not available in English

(n=7).

51

Descriptive data for general characteristics of studies

The 100 self-described case control manuscripts were from 45 different journals with the

Journal of the American Veterinary Medical Association most represented (n=16), followed by the Journal of Veterinary Internal Medicine (n=11), PLoS One (n=9), Equine Vet Journal (n=7),

Preventive Veterinary Medicine (n=6) and Journal of Small Animal Practice (n=5) (Table 2).

The most frequent year of publication was 2011. Of the 100 manuscripts, the presumed primary purpose was determined to be a DTA (n=14) or unclear (n=3). These 17 manuscripts were excluded from further characterization. The remaining 83 papers were considered to have an etiology purpose.

Table 2 contains the design titles given by the authors of the 83 manuscripts. “Case- control” (without additional descriptors) was the given design title for 30 of 54 (55.6%) studies that sampled on the outcome and 9 of 25 (36%) that did not. “Prospective” was used nearly equally between studies that did or did not sample on an outcome [compare 4 of 54 (7.4%) sampled on an outcome to 2 of 25 (8%) that did not]. The term “retrospective” was employed as a design title in 9 of 54 (16.7%) studies that sampled on the outcome and 12 of 25 (48%) that did not.

Table 3 summarizes the general characteristics of the 83 manuscripts with the primary purpose of assessing a disease etiology, grouped by animal type [small animal (cat, dog) and large animal (cow, pig, horse)]. Of the 83 etiology studies, 54 (65.1%) sampled from the source population based on the outcome, suggesting a case-control design. Of the 54 studies, 37 (68.5%) were large animal and 17 (31.5%) were small animal. The remaining 29 studies did not sample from the source population based on the on the outcome [25 (30.1%)] and were either cohort 52 studies (longitudinal aspect included) or cross sectional studies (no longitudinal component), or the sampling method was not discernable [4 (4.8%)].

Based on our selection criteria, all of these studies had been described by the authors as some form of case-control studies. These data indicate that 34.9% (95% CI: 24.7 – 45.2) of self- described case-control studies with the purpose of assessing etiology are not correctly identified.

Also, overall (combined with DTA and unclear purpose), 46% (95% CI: 35.3 – 56.7) of the self- described case-control studies in our random sample were not actually that design.

The sample set contained 52 large animal studies with a primary purpose of assessing a disease etiology. Fifteen of the 52 (28.9%) large animal studies did not sample on the outcome.

For three of the 15 (28.9%), the sampling strategy was not discernable. Of the remaining 12

(23.1%) studies that did not sample on the outcome, six employed a temporal component and six did not. These results suggest that six studies would more likely be a cohort study (with temporal component) and the other six as a cross-sectional (without temporal aspect).

Thirty-one small animal studies were identified as having an etiology purpose. Fourteen of these 31 (45.1%) small animal studies did not sample from the source population based on the outcome (13) or it was not discernable (1). Eight of the 13 small animal studies that did not sample on outcome, 8 would be considered cohorts (with temporal component) and 5 would be considered cross-sectional studies (without temporal aspect).

For the unit of observation, 62 of 83 etiology studies (74.7%) sampled at the individual level, 19 (22.9%) at the group level, and 2 (2.4%) enrolled both individuals and groups. No small animal studies employed a group level sampling whereas 19 or 52 (36.5%) large animal studies did. 53

Incident cases were most common in the etiology studies, with 45 of 83 (54.2%) studies.

Prevalent cases were utilized in 23 of 83 (27.7%). The majority of prevalent case types were from large animal studies [19 of 23 (82.6%)], where as only 4 of 23 (17.4%) small animal studies employed prevalent cases. Of the 83, 3 (3.6%) studies used both incident and prevalent case types (referred to as “mixed”) and case type was unclear for 12 (14.5%).

Approach to case selection in case-control designs

The design features of the 54 studies (65.1%) that sampled on the outcome (thus considered case-control) are reported in Table 4. Twenty-nine studies were determined to be enrolling incident cases, and from studies utilizing incident cases, fixed cohorts with 7 of 29

(24.1%) were less represented then dynamic populations with 22 of 29 (75.9%). Seventeen of the

54 (31.5%) studies were considered to be enrolling prevalent cases. Both incident and prevalent cases were enrolled for 2 (3.7%) studies and the type of case was unclear for 6 (11.1%).

Prevalent cases were utilized for large animal topics more often [15 of 37 (40.5%)] than small animal topics [2 of 17 (11.8%)]. The level of analysis was the group level for 12 of 37 (32.4%) large animal studies and no small animal based manuscripts.

Approach to control selection in case-control designs

The timing of control group selection is presented in Table 4. For studies that used incident cases [29 of 54 (53.7%)] within a fixed cohort [7 of 29 (24.1%)], controls were sampled concurrently [2 of 7 (28.6%)], at the end of follow-up [4 of 7 (57.1%)], or it was unclear [1 of

7(14.3%)]. No studies in our sample selected controls at the beginning of follow-up; thus, none of sample studies would be considered a case-cohort. Twenty-two of 29 (75.9%) studies employed incident cases from a dynamic population. Of the 22, 10 attempted to match controls 54 to cases on time, 9 did not match on time, and for 3 it was unclear with regards to matching.

None of the incident density studies described themselves as such.

Effect measure reported by authors and inferred by reviewers

Six of the 54 (11.1%) studies were excluded from additional examination, as it was not possible to determine how the effect measure may be interpreted without distinguishing between incident and prevalent cases. Table 5 contains the effect measures reported by the authors, stratified by animal grouping (small animal and large animal). The OR was reported as the sole effect measure in 30 of the 48 (62.5%) studies. Some authors reported the OR and regression coefficient (without transformation to a standard effect measure) [3 (6.3%)], the prevalence OR

[2 (4.2%)], the regression coefficient [1 (2.1%)], or did not report an effect measure [9 (18.8%)].

Additionally, for 3 of the 48 (6.3%) studies, the reporting of the effect measure was considered

“not applicable” as the outcome was not dichotomous.

Table 6 presents the timing of control selection from fixed cohorts or dynamic populations in studies that reported the OR as the effect measure of interest (i.e. studies reporting solely the prevalence OR or regression coefficients without transformation to a standard effect measure were excluded). Only studies that solely employed incident case types were included

(i.e. prevalence and incidence/prevalence based studies were excluded). Additionally the assumptions needed for the OR calculation to estimate a risk or rate ratio are displayed.

Five studies (two small animal and three large animal) sampled cases and controls from within a fixed cohort (Table 6). Of the five studies, three sampled controls at the end of follow- up, one concurrently, and for one the timing was unclear. No studies sampled controls from the beginning of follow-up, thus none would be considered case-cohort studies. When controls are 55 sampled at the end of follow-up, the rare disease assumption is required for the calculation of the

OR to estimate a risk ratio. This assumption was not stated in any of the three studies.

Concurrent sampling of controls requires the analysis to be matched on time in order for the OR calculation to estimate a rate ratio. The analysis was matched on time for the study that sampled controls concurrently from our sample. However the authors did not describe a rate ratio within the manuscript but instead used the term matched OR.

Eighteen studies (seven small animal and eleven large animal) sampled cases and controls from a dynamic population (Table 6). Of the 18 studies, eight matched and eight did not match the timing of control sampling to cases. The timing was unclear for two studies. For studies that matched the timing of control selection to case enrollment, no further assumptions are needed for the OR calculation to estimate a rate ratio (Greenland and Thomas, 1982). When control selection timing is not matched to cases, a stable population assumption is required

(Greenland and Thomas, 1982). A population is considered or assumed to be stable when the study characteristics under study (e.g. exposure distribution) do not change significantly over the study period (Greenland et al., 1986). None of the eight studies that did not match timing of control sampling to cases suggested population stability. After extensive discussions, the authors of this manuscript determined all eight studies likely examined a disease etiology from a stable population. It should be noted, however that stability as a population characteristic is often debatable, because the assumption is a function of time, and experts may differ on the time frame for stability (Knol et al., 2008). For further discussion see Greenland and Thomas (1982) and

Greenland et al. (1986).

56

Discussion

Our survey of case-control studies in the veterinary literature suggests uncertainty as to what constitutes a case-control study and the assumptions needed for the OR to be more interpretable. Furthermore, it highlights incomplete descriptions of important methodological aspects in a number of our sampled manuscripts.

Key results

Our survey of 100 author designated case-control studies in the veterinary sciences resulted in greater than half of the studies that sampled from the source population based on the outcome (i.e. case-control design) and assessed a disease etiology or exposure-outcome association. Nearly a fifth (17) of examined manuscripts were DTA or had an unclear purpose.

Of the studies that examined a disease etiology but did not sample on the outcome, almost half used sampling methods consistent with the cohort design and greater than half with cross- sectional designs. This suggests that the approximately 30% of our sample that were not DTA but were labeled as case-control would actually better be described as cohort and cross-sectional designs. This is an issue in veterinary medicine that should be addressed. For example, a survey of observational designs for pre-harvest food safety also identified mislabeled study design descriptions (Sargeant et al., 2011). This finding has ramifications as it indicates authors, editors, or reviewers may be misinterpreting study design descriptions. As a result, if design labels are used as criteria for relevance or “quality” then mislabeled studies may be incorrectly excluded from consideration for research synthesis. This also has implications in terms of searches used to identify literature based on study design type.

Of the manuscripts that that sampled on the outcome and were therefore consistent with a case-control design, over half utilized incident cases, a third prevalent cases, and the remaining 57 were either unclear or mixed case types. These results are in contrast to the previously observed results by Knol et al. (2008) for human case-control studies. The Knol et al. (2008) survey reported nearly all manuscripts enrolled incident case types (81%) and less than 10% prevalent case types. The difference, with regards to case type, between our veterinary survey and the human survey is less pronounced for case-control studies with small animals. Nearly two-thirds of the small animal case-control studies used incident cases and approximately 10% prevalent cases. However for large animal based case-control studies the ratio of incident to prevalent case types is nearly one-to-one. This is an important observation as it suggests that although veterinary researchers may be more likely (at least for small animal studies) to use incident cases with the case-control design, the frequency is less than that for human medicine. This implies the nuance in the choice of utilizing incident versus prevalent cases may not be entirely recognized.

With such incident case based designs, if assumptions are properly reported, the estimation of the risk or rate ratio rather than the OR may be possible. This is helpful because the risk or rate ratio is more readily interpretable. Researchers understand the ratio of risks better than ratio of odds

(Sackett et al., 1996; Davies et al., 1998; Grimes and Schulz, 2008; O'Connor, 2013). Authors are often warned not to treat the two effect measures as interchangeable because the OR is further from the null and therefore it over estimates the risk ratio in non-rare situations (Davies et al., 1998). However, that statement is only applicable to the small subset of case control studies that do estimate the OR. Knowledge of the actual case-control design would ensure correct interpretation.

The relationship between fixed cohort and dynamic population usage among incident cases matched the observed relationship in human case-control studies (Knol et al., 2008). These results suggest that both human and veterinary researchers employ fixed cohort and dynamic 58 populations for case-control studies in similar proportions. This is an important observation as it implies veterinary researchers conducting case-controls are cognizant of the “fixed” nature of the source population from which the cases were drawn. This is especially true in the subset of our sample that recognized their case-control study was conducted within a cohort, as case-control studies may often be considered as “nested” within a hypothetical cohort (Rothman and

Greenland, 1998).

Where Knol et al. (2008) observed 90% of all case control studies reported an OR, only

62.5% of our sample that sampled on the outcome and had a clear population type reported the

OR as the effect measure. Prevalence ORs, regression coefficients, or no effect measure at all were reported. Unfortunately the necessary design and study population information required to determine if a study using incident cases actually estimated the risk ratio or rate ratio were not specifically stated in any of the sampled manuscripts. Among designs using incident cases and control sampling from fixed cohorts, only one study of five sampled concurrently and employed the essential assumption for the OR to estimate the rate ratio, namely the analysis be matched on time . However, the authors of this study did not report the calculation of the OR to be an estimate of the rate ratio. Among controls sampled from dynamic populations and of the manuscripts that matched controls to cases on time (thus no further assumptions needed), none reported the OR as an estimate of the rate ratio (two reported mortality rates). Additionally, the stable population assumption required when controls are sampled from dynamic populations that are not sampled on time was not specifically stated in any of the manuscripts employing this sampling strategy. As a consequence, a more interpretable effect measure of the causal association was missed in these studies. 59

The results of this survey indicate that issues exist in the designing, reporting, and implementing of the case-control study design in the veterinary science. Study subjects were not sampled on an outcome, the defining feature of the case-control design, in 35% of examined studies that assessed a disease etiology or exposure-outcome association. In many studies, the effect estimate was likely to have a more interpretable meaning than the odds ratio reported and incomplete reporting of the assumptions resulted in the loss of potentially important information from these studies. These findings have two implications: 1) filtering by author designated study design within a systematic review may result in the loss of relevant manuscripts if reported incorrectly, and 2) when conducting meta-analyses, it is unclear how to incorporate ORs if their calculation estimates a more interpretable risk or rate ratio based on design considerations.

Overall, the results suggest an opportunity exists for epidemiologists to improve the teaching of the complexities of the case-control design to maximize potential etiologic inference from this misunderstood epidemiologic study design.

Limitations

Although we did not expect major sources of biases, our survey may contain some limitations or biases. We do not expect a large bias as a result of journal publication standards as we did not limit the search to specific journals. Our survey may be limited by our interpretation of the intended purpose of studies. For example, if the authors’ intended purpose was to investigate disease etiology but more prominently described aspects of a DTA, this study may have been misclassified as a DTA. We attempted to minimize this misclassification by looking for typical features (e.g. ROC curves, sensitivity and specificity analysis) and specific language

(e.g. “to evaluate a novel diagnostic...”) of DTAs. 60

It may be argued that the chosen sample size of 100 is a limitation as our survey led to only 54 studies that were actually case-control designs. Moreover, of the 54, only 23 studies included incident cases from a fixed or dynamic population and reported an OR. As a result, the assumptions required for the OR calculation to estimate a risk or rate ratio were only assessed for less than a quarter of the original sample. This is in contrast to the results of Knol et al. (2008) where of the 150 case-control studies investigated, 135 reported an OR. Further, all 150 were examined for the reporting of effect measures and necessary assumptions.

One of the possible biases is the amount of inference we were required to draw from studies, as authors did not state most assumptions, i.e. we may have made the wrong judgment.

We attempted to minimize this incorrect assessment by using two independent reviewers per manuscripts and ensuring agreement across reviewers on the pilot set. Further, when disagreements arose with regards to outcome type (incident versus prevalent) or control selection and could not be resolved by discussion, we used "unclear" instead of guessing at the authors' intention. This may have caused us to under represent the number of studies that should have been included in the description of the effect measure and assumptions (Table 7). The lack of specific study design descriptions in a large number of our sample was not entirely unexpected.

Our survey may also be biased by our search only including "case control". Veterinary studies employing a case-cohort design (hybrid case-control with controls sampled from the population at the beginning of follow-up) may have been missed if the authors used the term

"case-cohort" and not "case control". Authors that recognize a case-cohort as a particular case- control design may be conscientious of the proper reporting and necessary assumptions. As a result, the observation that none of our study sample included case-cohort studies is likely to be an underestimate of the veterinary science at large. However, it should be noted that Knol et al. 61

(2008) observed only 1 of 150 (0.67%) case-control studies that would be considered case- cohorts.

Generalizability

Based on our survey of the veterinary literature and previous surveys in the human literature (Pocock et al., 2004; Tooth et al., 2005; Knol et al., 2008; Fullerton et al., 2012), we would suggest employing a different classification system. Historically, directionality and the use of the terms “retrospective” and “prospective” have been a source of confusion in defining study design (Vandenbroucke, 1991). With this confusion, case-control studies are often considered retrospective in nature and cohorts as prospective. In this survey, for example, nearly half of the studies labeled case-control that did not sample on an outcome (i.e. not “true” case- control designs) used the term “retrospective” to describe the design. Thus in order to alleviate this confusion and better teach the case-control design, as well as observational design in general, a different classification system may be warranted. Indeed, one such classification scheme, originally proposed by Greenland and Morgenstern (1988) and Morgenstern and

Thomas (1993), has been recently revived by Pearce (2012). This classification scheme specifically rejects the notion of directionality as a means of distinguishing between designs, and consists of two main considerations: 1) incident or prevalent case type, and 2) whether or not to sample on an outcome. These simplified design choices thus result in four basic designs: incident cases sampled on an outcome, incident cases not sampled on an outcome, prevalent cases sampled on an outcome, and prevalent cases not sampled on an outcome. Pearce (2012) argues that not only will this classification scheme emphasize that directionality is unimportant in distinguishing study design, but also highlight the importance of case type utilized, and the distinction between whether or not subjects are identified and enrolled based on the case type. 62

Our belief is that if authors were to employ this much simpler design scheme, there may be less uncertainty in adequately describing the overall study design for most observational study designs. 63

Footnotes a Endnote X7.5, Thomson Reuters, http://endnote.com b DistillerSR, Evidence Partners, https://distillercer.com 64

Figures

Figure 1. Schematic of case-control design options illustrating what the OR calculation will estimate. Reproduced with permission from Knol et al. (2008). 65

Tables

Table 1. Literature review search terms to identify case-control studies in animals for the Centre for Biosciences and Agriculture International (CABI) and MEDLINE databases conducted on

June 5th, 2015.

Population AND Study design

TS=(cat OR feline OR dog OR canine OR cow OR cattle OR bovine OR TS=(“case control”) OR TI=(“case pig OR swine OR porcine OR feedlot OR “cow-calf” OR cow-calf OR control”) herd)

TS=(horse OR equine)*

th *The search including horses was conducted on June 26 , 2015. 66

Table 2. Study design titles for studies that did or did not sample on an outcome or sampling strategy was unclear.

Sampled on an outcome (54) Not sampled on an outcome (25) Unclear sampling strategy (4)

Case-control (30) Case-control (9) Case-control (2)

Case-control association (1) Case-control association (1)

Cross-sectional case-control (1)

Genome wide case-control (2)

Matched case-control (4) Matched case-control (1)

Nested case-control (2)

Prospective case-control (1) Prospective case-control (2)

Prospective nested case-control (1)

Prospective unmatched case-control (2)

Randomized blinded case-control (1)

Retrospective case-control (9) Retrospective case-control (12)

Unmatched case-control (1) Unmatched case-control (1)

67

Table 3. Summary statistics of manuscripts author designated as case-control identified from database searches. [n (%) or ]

Manuscripts (n=100)

Journal (n= 100)

Journal of the American Veterinary Medical Association 16 (16.0)

Journal of Veterinary Internal Medicine 11 (11.0)

PLoS One 9 (9.0)

Equine Veterinary Journal 7 (7.0)

Preventive Veterinary Medicine 6 (6.0)

Journal Small Animal Practice 5 (5.0)

Other* 46 (46.0)

Presumed purpose (n= 100)

Disease etiology 83 (83.0)

Diagnostic test evaluation 14 (14.0)

Unclear 3 (3.0)

Species (n=83)

Bovine 26 (26.0)

Canine 21 (21.0)

Equine** 16 (16.0)

Feline 9 (9.0)

Porcine 4 (4.0)

Mixed 7 (7.0)

* Other manuscripts across 39 other journals at less than five per journal.

** One manuscript with donkeys as the study population. 68

Table 4. Summary statistics of 83 manuscripts with the presumed purpose to assess a disease etiology (i.e. not diagnostic test evaluation or unclear). [n (% of column subgroup)]

Small animal (n=31) Large animal (n=52) Total (n=83)

Sampling on the outcome

Yes 17 (54.8) 37 (71.2) 54 (65.1)

No 13 (41.9) 12 (23.1) 25 (30.1)

Not discernable 1 (3.2) 3 (5.8) 4 (4.8)

Unit of observation

Individual 31 (100) 31 (59.6) 62 (74.7)

Group 0 19 (36.5) 19 (22.9)

Mixed 0 2 (3.9) 2 (2.4)

Case type

Incidence 19 (61.3) 26 (50.0) 45 (54.2)

Prevalence 4 (12.9) 19 (36.5) 23 (27.7)

Mixed 3 (9.7) 0 3 (3.6)

Not discernable 5 (16.1) 7 (13.5) 12 (14.5)

69

Table 5. Summary statistics of 54 manuscripts that were case control studies (sampled on the outcome) with the presumed purpose of assessing a disease etiology. [n (%)]

Small animal (n=17) Large animal (n=37) Total (n=54)

Unit of observation

Individual 17 (100) 18 (48.6) 35 (64.8)

Group 0 18 (48.6) 18 (33.3)

Mixed 0 1 (2.7) 1 (1.9)

Control selection

Incident cases 11 (64.7) 18 (48.6) 29 (53.7)

Fixed cohort 2 (18.2) 5 (27.8) 7 (24.1)

Beginning of 0 0 0 follow-up

Concurrently 0 2 2

End of follow-up 2 2 4

Not discernable 0 1 1

Dynamic population 9 (81.8) 13 (72.2) 22 (75.9)

Attempt match on 4 6 10 time

Not matched on 3 6 9 time

Not discernable 2 1 3 regarding time

Prevalent cases 2 (11.8) 15 (40.5) 17 (31.5)

Incident & prevalent 2 (11.8) 0 2 (3.7) cases

Case type not 2 (11.8) 4 (10.8) 6 (11.1) discernable

Level of analysis

Individual 15 (88.2) 17 (45.9) 32 (59.3)

Group 0 12 (32.4) 12 (22.2) 70

Table 5. continued

Mixed* 0 3 (8.1) 2 (3.7)

No analysis 0 1 (2.7) 1 (1.9

Not applicable** 2 (11.8) 4 (10.8) 6 (11.1)

* Mixed includes "Individual, Group" (2) and "Individual, Pool of individuals" (1)

** Not applicable when case type could not be determined (i.e. unclear if incident or prevalent cases were included), or when additional design considerations, including the level of analysis, were not completed. 71

Table 6. Effect measure reported from 48 manuscripts that sampled on the outcome, were not presumed to be diagnostic test evaluations, and the population type (incidence/prevalence) was clear from the authors' description. [n (%)]

Small animal (n=15) Large animal (n=33) Total (n=48)

Effect measure

Odds ratio 10 (66.7) 20 (60.6) 30 (62.5)

Odds ratio and 1 (6.7) 2 (6.1) 3 (6.3) regression coefficient

Prevalence odds 1 (6.7) 1 (3.0) 2 (4.2) ratio

Regression 0 1 (3.0) 1 (2.1) coefficient

Not applicable 0 3* (9.1) 3 (6.3)

None reported 3 (20.0) 6 (18.2) 9 (18.8)

72

Table 7. Odds ratios and necessary assumptions by species type for manuscripts that included only incident cases and reported the odds ratios.

Small Large Total Odds ratio Assumption Assumption animal animal (n=23) may needed stated (n=9) (n=14) approximate

Control sampling – fixed cohort

End of 2 1 3 Risk Rare 0 follow-up ratio disease

Concurrently 0 1 1 Rate Analysis 1 ratio matched on time

Beginning of 0 0 0 Risk Censoring 0 follow-up ratio unrelated to exposure

Unclear 0 1 1

Control sampling – dynamic population

Matched on 4 4 8 Rate No time ratio further assumption needed

Not matched 2 6 8 Rate Stable 0 on time ratio population

Unclear 1 1 2 Rate Stable 0 regarding ratio population time

73

References

Adam, M., Kuehni, C.E., Spoerri, A., Schmidlin, K., Gumy-Pause, F., Brazzola, P., Probst- Hensch, N., Zwahlen, M., 2015. Socioeconomic Status and Childhood Leukemia Incidence in Switzerland. Front Oncol 5, 139. Davies, H.T., Crombie, I.K., Tavakoli, M., 1998. When can odds ratios mislead? BMJ 316, 989-991. Fullerton, K.E., Scallan, E., Kirk, M.D., Mahon, B.E., Angulo, F.J., de Valk, H., van Pelt, W., Gauci, C., Hauri, A.M., Majowicz, S., O'Brien, S.J., International Collaboration Working, G., 2012. Case-control studies of sporadic enteric infections: a review and discussion of studies conducted internationally from 1990 to 2009. Foodborne Pathog Dis 9, 281-292. Goldberg, R.J., McManus, D.D., Allison, J., 2013. Greater knowledge and appreciation of commonly-used research study designs. Am J Med 126, 169 e161-168. Greenland, S., Morgenstern, H., 1988. Classification schemes for epidemiologic research designs. J Clin Epidemiol 41, 715-716. Greenland, S., Thomas, D.C., 1982. On the need for the rare disease assumption in case- control studies. Am J Epidemiol 116, 547-553. Greenland, S., Thomas, D.C., Morgenstern, H., 1986. The rare-disease assumption revisited. A critique of "estimators of relative risk for case-control studies". Am J Epidemiol 124, 869-883. Grimes, D.A., Schulz, K.F., 2008. Making sense of odds and odds ratios. Obstet Gynecol 111, 423-426. Knol, M.J., Vandenbroucke, J.P., Scott, P., Egger, M., 2008. What do case-control studies estimate? Survey of methods and assumptions in published case-control research. Am J Epidemiol 168, 1073-1081. Mason, C.A., Kirby, R.S., Sever, L.E., Langlois, P.H., 2005. Prevalence is the preferred measure of frequency of birth defects. Birth Defects Res A Clin Mol Teratol 73, 690- 692. Miettinen, O.S., 1985a. The "case-control" study: valid selection of subjects. J Chronic Dis 38, 543-548. Miettinen, O.S., 1985b. Theoretical epidemiology: principles of occurrence research in medicine. Wiley New York. Moher, D., Liberati, A., Tetzlaff, J., Altman, D.G., Group, P., 2009. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med 6, e1000097. Morgenstern, H., Thomas, D., 1993. Principles of study design in environmental epidemiology. Environ Health Perspect 101 Suppl 4, 23-38. O'Connor, A.M., 2013. Interpretation of odds and risk ratios. Journal of veterinary internal medicine / American College of Veterinary Internal Medicine 27, 600-603. Pearce, N., 2012. Classification of epidemiological study designs. Int J Epidemiol 41, 393- 397. Pocock, S.J., Collier, T.J., Dandreo, K.J., de Stavola, B.L., Goldman, M.B., Kalish, L.A., Kasten, L.E., McCormack, V.A., 2004. Issues in the reporting of epidemiological studies: a survey of recent practice. BMJ 329, 883. 74

Rothman, K., Greenland, S., 1998. Modern Epidemiology. Lippincott Williams & Wilkins. Sackett, D.L., Deeks, J.J., Altman, D.G., 1996. Down with odds ratios! Evidence Based Medicine 1, 164-166. Sargeant, J.M., Elgie, R., Valcour, J., Saint-Onge, J., Thompson, A., Marcynuk, P., Snedeker, K., 2009. Methodological quality and completeness of reporting in clinical trials conducted in livestock species. Prev Vet Med 91, 107-115. Sargeant, J.M., O'Connor, A.M., Renter, D.G., Kelton, D.F., Snedeker, K., Wisener, L.V., Leonard, E.K., Guthrie, A.D., Faires, M., 2011. Reporting of methodological features in observational studies of pre-harvest food safety. Preventive Veterinary Medicine 98, 88-98. Sargeant, J.M., Thompson, A., Valcour, J., Elgie, R., Saint-Onge, J., Marcynuk, P., Snedeker, K., 2010. Quality of reporting of clinical trials of dogs and cats and associations with treatment effects. Journal of veterinary internal medicine / American College of Veterinary Internal Medicine 24, 44-50. Schlesselman, J.J., Stolley, P.D., 1982. Case-control studies : design, conduct, analysis. Oxford University Press New York. Tooth, L., Ware, R., Bain, C., Purdie, D.M., Dobson, A., 2005. Quality of reporting of observational longitudinal research. Am J Epidemiol 161, 280-288. Touvier, M., Fezeu, L., Ahluwalia, N., Julia, C., Charnaux, N., Sutton, A., Mejean, C., Latino-Martel, P., Hercberg, S., Galan, P., Czernichow, S., 2013. Association between prediagnostic biomarkers of inflammation and endothelial function and cancer risk: a nested case-control study. Am J Epidemiol 177, 3-13. Vandenbroucke, J.P., 1991. Prospective or retrospective: what's in a name? BMJ 302, 249- 250. von Elm, E., Altman, D.G., Egger, M., Pocock, S.J., Gotzsche, P.C., Vandenbroucke, J.P., Initiative, S., 2007. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies. Epidemiology 18, 800-804. Wacholder, S., McLaughlin, J.K., Silverman, D.T., Mandel, J.S., 1992a. Selection of controls in case-control studies. I. Principles. Am J Epidemiol 135, 1019-1028. Wacholder, S., Silverman, D.T., McLaughlin, J.K., Mandel, J.S., 1992b. Selection of controls in case-control studies. II. Types of controls. Am J Epidemiol 135, 1029- 1041. Wacholder, S., Silverman, D.T., McLaughlin, J.K., Mandel, J.S., 1992c. Selection of controls in case-control studies. III. Design options. Am J Epidemiol 135, 1042-1050.

75

CHAPTER 4. A SYSTEMATIC REVIEW AND META-ANALYSIS OF THE ANTIBIOTIC TREATMENT FOR INFECTIOUS BOVINE KERATOCONJUNCTIVITIS: AN UPDATE

J.N. Cullena, C. Yuanb,S.Tottonc, R. Dzikamunhengaa, J.F. Coetzeea, N. da Silvab, C. Wanga,b, A.M. O’Connora

aDepartment of Veterinary Diagnostic and Production Animal Medicine, College of Veterinary Medicine, Iowa State University, Ames, Iowa, 500010, Corresponding author: [email protected] bDepartment of Statistics, College of Agriculture and Life Sciences, Iowa State University, Ames, Iowa cNo a✏iation

Abstract

Infectious bovine keratoconjunctivitis (IBK) is a common and important disease of calves. Without e↵ective vaccines, antibiotic therapy is often implemented to minimize the impact of IBK. This review updates a previously published system- atic review regarding comparative ecacy for antibiotic treatments of IBK. Avail- able years of CABI and MEDLINE databases were searched, including non-English results. Also searched were the American Association of Bovine Practitioners and

World Buiatrics Congress conference proceedings from 1996-2016, reviews since 2013, reference lists from relevant trials, and U.S. Food and Drug Administration New An- imal Drug Application summaries. Eligible studies assessed antibiotic treatment of

Preprint submitted to Elsevier July 21, 2016 76 naturally-occurring IBK in calves randomly allocated to group at the individual level.

Outcomes of interest were clinical score, healing time, unhealed ulcer risk, and ulcer surface area. A mixed-e↵ects model comparing active drug to placebo was employed for all outcomes. Heterogeneity was assessed visually and using Cochranes Q-test.

Thirteen trials assessing nine treatments were included. Compared to placebo, most antibiotic treatments were e↵ective. There was evidence that the treatment e↵ect dif- fered by day of outcome measurement. Visually, the largest di↵erences were observed

7-14 days post-treatment. These results indicate improved IBK healing with many antibiotics and suggest the need for randomized trials comparing di↵erent antibiotic treatments. Keywords: Antibiotics, infectious bovine keratoconjunctivitis, meta-analysis, pinkeye, systematic review

1. Introduction

Rationale

Infectious bovine keratoconjunctivitis (IBK) or pinkeye is one of the most important production-limiting diseases of pre-weaned beef calves. The clinical signs of IBK include, lacrimation, photophobia, corneal edema, ocular pain, corneal ulceration, and the potential for vision loss (Alexander, 2010). Calves with IBK lesions have decreased weaning weight by 15-30 lb compared to una↵ected calves (Thrift & Over- 77

field, 1974; Funk et al., 2009, 2014). Moraxella bovis is considered the primary causal organism associated with IBK (George et al., 1984; Gould et al., 2013; Brown et al.,

1998). Moraxella bovoculi has also been suggested to play a role in IBK disease eti- ology (Angelos et al., 2007). However, available evidence suggests prevention of IBK with either commercially available or autogenous M. bovis or M. bovoculi vaccines is ine↵ective (Funk et al., 2009) (Cullen et al., in press). In the absence of evidence sup- porting an e↵ective IBK vaccine, antibiotic therapy is the best method for reducing the production and welfare impacts of IBK. In 2006, members of our group published areviewofantibiotictreatmentsofIBKthatincludedninerandomizedcontrolled trials (O’Connor et al., 2006). In that review, only one trial included an active- to-active treatment comparison of di↵erent antibiotics and no antibiotic regime was employed more than once. Therefore, in the prior review a pairwise meta-analysis was not appropriate (O’Connor et al., 2006). Further, the 2006 review focused on risk of unhealed corneal ulcers as the outcome. As ten years have passed since the conduct of the initial review, it was decided to update the review and determine if additional IBK treatment trials had been published in the ensuing decade that would enable better understanding of comparative ecacy of antibiotic treatment options for IBK. Additionally, statistical methods that allow for the comparison of multiple treatments (mixed treatment meta-analyses, MTCMA) and methods that allow for 78 comparison of di↵erent outcomes (multivariate meta-analysis MVMA) have become more accessible and it was thought these might enable better synthesis of the data

(Lu & Ades, 2004; Higgins & Whitehead, 1996; Jansen et al., 2008). Therefore, the purpose of this systematic review was to update the previous review of antibiotics available for IBK treatment (O’Connor et al., 2006) with all available relevant trials and to conduct, if feasible, either a MTCMA or a MVMA to better understand the comparative ecacy of antibiotic treatment options for IBK. The overall objective of this review was to provide veterinarians and producers with information about the comparative ecacy of antibiotic treatments for IBK in calves. Unlike the prior re- view, treatment ecacy was measured using a variety of outcomes including clinical ulcer score as described by the authors, duration until healing following antibiotic treatment, cure risk based on authors’ definition of resolution of corneal ulcers, and corneal ulcer surface area. A secondary outcome of interest was weight gain, either total weight gain or average daily gain (ADG).

Methods

This manuscript was assembled using the PRISMA guidelines (Moher et al., 2009).

The PRISMA Checklist with corresponding section page numbers is available in the supplemental materials. 79

Protocol and registration

A protocol for the update was developed by four coauthors (JNC, JFC, CW, AOC).

The time-stamped protocol (21st Dec 2015) was submitted for record keeping to the journal prior to starting the review, and any modifications that occurred after ap- proval of the protocol are noted in the manuscript. The original protocol is available in the supplemental materials.

Eligibility criteria i.e. the PICOS criteria

The review question was ”What is the comparative ecacy of antibiotics for the treatment of IBK in beef and dairy cattle?” The eligibility criteria were defined in terms of the population (P), intervention (I), comparator (C), outcomes (O) and study design (S). The eligible population of interest (P) consisted of beef or dairy cattle diagnosed with naturally occurring IBK. IBK diagnoses were defined either by bacterial isolation of Moraxella spp., Neisseria spp.,orBranhamella catarrhalis

(reclassified as Moraxella catarrhalis)orbytheauthors’clinicaldefinitionwhen bacterial isolation was not conducted. No restrictions were placed on the population age, weight, or the country where the trial was conducted. Eligible interventions

(I) included any antibiotic treatment not banned for use in cattle and non-active placebos. This restriction does not limit antibiotic interventions to those registered for the treatment of IBK. Interventions banned for use were based on Group 1 and 80

Group 2 lists from the Food Animal Residue Avoidance Databank (FARAD)a which was consulted during the review process. Multidrug interventions (i.e. antibiotic combined with anti-inflammatory component or antibiotic combined with antibiotic), non-antibiotic treatments, vitamins, or homeopathic treatments were excluded as not relevant to the review. The clarification to exclude vitamins or homeopathic treatments was a protocol change. For example, if one arm of a three-arm trial treated with penicillin plus dexamethasone, only the other two arms (penicillin and placebo) were extracted. As multiple interventions were of interest, no specific comparator

(C) was specified.

The protocol initially defined the primary relevant outcome (O) as the incidence of unhealed corneal lesions (or healed lesion) and the secondary relevant outcome as measures of weight gain. However, the outcomes relevant to the review were expanded during the review process to include clinical ulcer scores, duration until ulcer healing, or corneal ulcer surface area. The rationale for this protocol change and expansion was that many studies reporting relevant information included these outcomes.

With respect to study design (S) characteristics, only randomized controlled trials

(RCT) that allocated calves at the individual level were considered relevant to the 81 review. Trials must have been conducted under field conditions with naturally oc- curring disease.

Information sources

All available years were searched in the Centre for Biosciences and Agriculture In-

b R c ternational (CABI) and MEDLINE databases using the Iowa State University

Web of Science Interface. The proceedings table of contents from the last 20 years

(1996-2016) of the American Association of Bovine Practitioners (AABP) and World

Buiatrics Association were reviewed for potentially relevant conference abstracts.

The Food and Drug Administration (FDA) Freedom of Information (FOI) New An- imal Drug Application (NADA) summaries databased was searched for IBK trial data of approved cattle antibiotics. Searches of the Zoetise and Boehringer Ingel- heimf websites were conducted to look for potentially relevant technical bulletins. We searched these sites because Zoetis is the manufacturer of tulathromycin, a product registered for treatment of IBK, and both companies make a non-generic injectable oxytetracyline, another antibiotic for use to treat IBK. IBK reviews published after

2013 were also examined for relevant trials. After the completion of relevance screen- ing, the bibliographies of relevant manuscripts were examined for trials meeting the eligibility criteria. 82

Search

The search strategy followed the general form of ”population” AND ”disease” AND

R ”intervention”. For MEDLINE ”Drug Therapy”, ”Injections”, and ”Anti-Bacterial

Agents” were searched as Medical Subject Headings (MeSH). For CABI, ”drug ther- apy”, ”injection”, and ”antibacterial” were searched as Descriptors (DE). The search strategy for CABI is presented in Table 1.

Study selection

The online systematic review software, DistillerSRg was used to manage all literature records and associated data. The relevance assessment form was pilot tested by two reviewers for approximately 20 manuscripts and modified to ensure agreement.

After establishing the final set of questions, the two reviewers independently read all abstracts to identify potentially relevant manuscripts. One reviewer (JC) assessed the proceedings table of contents from both conferences. Only one reviewer needed to indicate potential relevance for the full text to be acquired. Full text manuscripts were assessed by two reviewers for relevance. When two reviewers could not agree on trial relevancy questions based on the full text, a third reviewer (AOC) would decide.

This decision was made as this reviewer is the most experienced with IBK. For trials considered relevant by both reviewers or decided by the third reviewer (AOC), all relevant data were extracted. 83

Originally, we intended to only include manuscripts written in English. However, a protocol modification was made such that manuscripts in languages other than

English were considered eligible. For foreign-language papers, we assessed the ab- stract (if in English) and determined if there were reasons for exclusion based on the above exclusion criteria (e.g. treatment combinations, banned treatment regi- mens, non-random allocation, etc.). If it was not possible to exclude based on the

English abstract, we attempted to obtain the full text and where possible translated the document using Google translate and again evaluated the text for exclusion- ary evidence. PDF image files written in French, Italian, Spanish and Chinese that could not be automatically translated were given to a single native speaker with a background in statistics and study design to determine if the trials were random- ized and contained at least two comparative groups. Therefore these papers did not have two reviewers and were based on the judgement of the single native speaker.

Other foreign-language papers with PDF image files were excluded as were papers that could not be obtained. Foreign-language theses were not translated and were excluded.

The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA)

flow diagram was used as a template to present the study selection results (Moher et al., 2009). 84

Data extraction process

The data extraction process was pilot tested by two authors for a single outcome, unhealed corneal ulcer risk (AOC and JNC). Upon inclusion of the other outcomes, the form was modified to include all relevant outcome data. Data extraction was completed from all eligible manuscripts by two reviewers independently. Extracted data conflicts were resolved by a single author (JNC) in consultation with the orig- inal manuscript data to ensure accuracy. When a trial was identified from multiple sources, all sources were combined in order to obtain the most complete trial de- scription.

Data items

Data extracted from each individual trial included: country in which the trial was conducted, age and weight means or ranges of calves, included genders and breed, concurrent management interventions (i.e. fly control or ), author defini- tions of eligible IBK case and ”cured”, duration of observation period, cattle weights pre- and post-treatment, whether random allocation based on IBK severity occurred, dose and route of each intervention, number of unhealed corneal ulcers and total ul- cer number per treatment, duration until healing, clinical scores, and corneal ulcer surface area. 85

For all continuous covariates and outcomes, standard deviations or standard errors of the mean (SEM) were extracted. If the manuscript authors presented a standard error it was converted to the standard deviation as this was a required input for the statistical analyses (see below). The outcome data were extracted for each time point reported when applicable. This protocol change was deemed necessary in or- der to obtain sucient outcome data. When outcome data were presented only in

figures or plots, WebPlotDigitizer (version 3.9)h was used to digitize the figures in order to extract relevant data. Plot digitizing was conducted in triplicate for contin- uous outcomes per figure by a single reviewer (JC). If corneal ulcer surface area was presented as square-root transformed, the data were back transformed to the raw mean surface area and an approximate standard deviation to ensure the scales were the same between studies (Jørgensen & Pedersen, 1998). This was accomplished as follows: x represents the square rooted value, then the mean value is defined as

n 1 mean = (SD )2 +(mean )2 n x x

where (SDx)standsforthestandarddeviationofthesquaredrootedvalue,and

(meanx) stands for the mean value of the squared rooted value. The approximate standard deviation was

SD = 2 SDxpmean 86

To assess the accuracy of transformation, samples were randomly generated from a normal distribution and the mean of x and the mean using the transformation formula were compared. This procedure was repeated multiple times using di↵erent random samples. The results were considered reasonable based on the opinion of one author (CY). No statistical tests were utilized to compare predictions.

If authors reported only the percentage of unhealed corneal ulcers (calf or eye level), counts were calculated based on the described arm’s sample size and rounded to the nearest whole number. When continuous outcome data were presented in a stratified manner (e.g. healing times for calves with moderate lesions versus severe lesions), outcomes were combined using an appropriate statistical method (Higgins et al.,

2008). Similarly when discrete data were stratified by severity (i.e. unhealed corneal ulcers for calves mild and moderate ulcers), counts were combined. When authors reported outcomes on multiple days, all were extracted and the day of measurement was recorded. Data that were presented for multiple location sites was extracted separately per site.

Geometry of the network of studies

The geometry of the treatment network for each outcome were assessed using the probability of interspecific encounter (PIE) index and C-score (Salanti et al., 2008).

The PIE index describes the network diversity where a value > 0.75, suggests the 87 network has a relatively high diversity. The C-score test measures co-occurrence, meaning particular intervention comparisons are more likely than others and a small p-value indicates there exists statistically significant co-occurrence in the network.

PIE index was calculated by customer written R script (R Core Team, 2015) and

CscoretestwasaccomplishedviaRpackageEcoSimR (Gotelli et al., 2015) version

0.1.0.

Risk of bias within individual studies

Risk of bias was assessed at study level considering nature of the outcome (subjective versus objective outcome) using the Cochrane Risk of Bias for trials (Higgins et al.,

2008). Two reviewers assessed each manuscript as ”high risk”, ”low risk”, or ”un- clear risk” for selection, detection, attrition, reporting, and other potential sources of bias. Disagreements as to the level of risk were discussed between three reviewers

(JNC, ST, and RD). If consensus could not be reached, unclear risk was chosen. For the single foreign language paper that was considered relevant, the translator was asked the risk of bias questions, the translator provided the answers and one of the authors (AOC) made the risk of bias assessment. Manuscripts that described how randomization was implemented (e.g. randomization schedule or by random number draw) or randomization was stratified by clinical severity of IBK were considered low risk for randomization. Allocation concealment was considered unclear risk unless 88 the authors described sequential allocation to ensure even numbers between treat- ment arms (high risk). Lack of caregiver blinding was considered unclear risk unless treatment selectively caused a reaction in one treatment group and not the other. For example one reviewer (ST) suggested that if application of an antibiotic caused eyelid swelling in one treatment group, there were concerns that the a↵ected group may be placed in one specific pasture over another. When blinding of outcome assessor was not described, manuscripts were considered at low risk of bias if the outcome was objective or assessed in an objective manner (e.g. fluorescein stain uptake) other- wise for subjectvie outcomes they were considered at high risk of bias. Attrition bias was considered unclear risk if outcome data were missing from the results without author explanation. If less than 10% of outcome data were missing, attrition bias was considered low risk. Selective reporting was considered to be associated with alowriskofbiasifalldescribedoutcomeswerereportedintheresults,otherwise the risk of bias was considered unclear. Additionally, pharmaceutical sponsorship was considered for each included trial with extractable data. A study was considered sponsored by a company if one of the authors was identified as being from a company or the authors indicated a company as a funding source. 89

Summary measures

The summary measure for the incidence risk of unhealed corneal ulcers was the odds ratio (OR) calculated from the log odds ratio. Data were organized such that the non-active control arm would be the referent when calculating the odds ratio. In this way an odds ratio less than one would indicate that the active treatment (numera- tor) is more e↵ective than placebo (denominator). An odds ratio greater than one would thus indicate that the placebo is more e↵ective than active treatment. For clinical scores, the standardized mean di↵erence was the principal summary mea- sure. The standardized mean di↵erence (SMD) was chosen to account for di↵erent scales employed between manuscripts. For all other outcomes (duration until heal- ing and corneal ulcer surface area), the mean di↵erence was the principal summary measure. Duration until healing and ulcer surface area (following appropriate trans- formation as described above) are on the same scales, respectively, thus the summary measure is a more interpretable e↵ect size. When calculating the SMD and mean di↵erence, active treatment is subtracted from placebo. Therefore a positive mean di↵erence implies that the placebo had a higher mean than the intervention group.

As lower levels of clinical score, shorter healing time, or smaller ulcer surface area imply improved healing, positive mean di↵erence implies the intervention is better.

For example, if calves that received placebo had on average higher clinical scores (i.e. 90 more severely a↵ected eyes) than calves receiving active treatment, the SMD would be greater than zero. Similarly, if calves that received placebo had on average larger corneal ulcer surface areas than calves receiving active treatment, the risk di↵erence would be greater than zero.

Planned method of statistical analysis - pairwise meta-analysis

Pairwise meta-analyses were conducted for each outcome separately for trials that included a non-active control arm. Meta-analyses of active-to-active comparisons were not conducted due to the lack of trials assessing the same intervention. The meta-analyses were conducted using the R package metafor and the rma function using the default restricted maximum likelihood estimation method (Viechtbauer,

2010). The rma function requires data in the form of an e↵ect size and variance for continuous outcomes and count data for dichotomous outcomes. The escalc function was used to generate this data for all analyses (Viechtbauer, 2010). For calculating the log odds ratio (unhealed ulcer risk), a continuity correction of 0.5 was included and added to each cell for instances when a cell within a ”2 X 2” table contained zero. A nested multilevel structure which included a fixed e↵ect for measurement day nested within a random e↵ect for trial was employed. The compound symmetry covariance structure was used to account for repeated measures within a single trial.

This model structure was considered appropriate since outcome data from the same 91 population assessed on multiple days should be correlated within each trial. The pairwise model equation for the log odds ratio and standardized mean di↵erence was as follows:

Let yst stands for the observed e↵ect size in the s-th study at time point t, where s =

1, 2, ,n, t =1, 2, ,m . n is the number of studies with a particular outcome, ··· ··· s and m is the total number of time points in study s. Let y =(y , ,y )T s s s1 ··· sms

ys N(µ, ⌃ms ms ) ⇠ ⇤

where variance covariance matrix ⌃ms ms follows compound symmetric structure: ⇤

1 ⇢ ⇢ 0 ··· 1 ⇢ 1 ⇢ 2 B ··· C ⌃ms ms = B C ⇤ B C B C B C B··· ··· ··· ···C B C B C B 1 C B··· ··· ··· C @ A Summary e↵ects were calculated as active treatment subgroups, all of which were compared to a non-active arm. We presented the data on forest plots for ease of interpretation; however we did not present summary e↵ect sizes across the active treatment subgroups as these would be meaningless.

We assessed heterogeneity when more than one study or time point within a study was available for a comparison of interest. When this occurred we used Cochrane’s Q homogeneity test. Cochrane’s Q tests the null hypothesis that the observed treatment 92 e↵ect between trials or repeated measures within a trial are equivalent at statistical threshold ↵ (Tate & Brown, 1970).

Risk of bias assessment

We did not conduct an analysis for small size e↵ect across the studies as there was an insucient number of studies for either a meaningful visual interpretation or a statistical analysis (Sterne et al., 2011).

Ancillary analysis

The expectation was that a number of included trials would have provided calf weight data before and after treatment. This weight data (total weight gain or ADG) would have allowed for pairwise analyses and summary measures at least for active to non- active comparisons. A mixed treatment comparison meta-analysis was planned based on the a priori belief that a suitable number of trials fitting our inclusion criteria would be recovered from the literature search.

Results

Study selection

The search results are presented in Figure 1. The citation database searches were completed on December 21, 2015. The FDA FOI NADA summaries contained 23 93 potentially relevant summaries based on the search of ”infectious bovine kerato- conjunctivitis”. Fourteen manuscripts described reviews and were set aside to be examined later if published in the last three years. The list of full texts assessed and the first reason for exclusion based on the order of questioning in the screening forms are provided in the supplemental material.

No additional relevant manuscripts were identified (that were available through an inter-library loan system) from recent IBK reviews, reference lists from the full text relevant manuscripts, or the last 20 years of AABP proceedings. Nine of the last ten World Buiatrics Association conference proceedings were assessed for potentially relevant trials. One conference proceedings (2010 World Buiatrics Congress) could not be examined as the holding library declined to loan the requested materials. One relevant randomized trial was identified from the 1998 proceedings (Cosgrove et al.,

1998).

Study characteristics

Of the 97 manuscripts, 16 were trials that described randomization to treatment arm and were included for data extraction. Thus the conference proceedings and database searches combined resulted in a total of 17 relevant trials. The trial from the conference proceedings could not be extracted as the authors did not report the number of calves allocated to each treatment arm (Cosgrove et al., 1998). One 94 manuscript could not be extracted because measures of variation for the reported outcome, clinical eye score, were not reported (Roeder et al., 1995). No data from an additional manuscript could be extracted as the authors only reported comparative ecacy for healing time without measures of variation (Edmondson et al., 1989). A

Spanish language trial could not be extracted because the authors did not report the number of calves allocated per treatment arm (Abdala et al., 2002). Thus from the

17 manuscripts, 13 contained data suitable for extraction (George et al., 1988, 1989;

Chadli, 1992; Allen et al., 1995; Eastman et al., 1998; Angelos et al., 2000; Gokce et al., 2002; Zielinski et al., 2002; Dueger et al., 2004; Starke et al., 2007; Senturk et al., 2007; Freedom of Information Summary, 2007; Quesada et al., 2010).Author definitions of the outcomes are included in the supplemental material.

Table 2 and Table 3 provide the population and study characteristics of each manuscript for which data were extracted. Since the original systematic review in 2006, four new randomized trials have been published (Starke et al., 2007; Senturk et al., 2007; Free- dom of Information Summary, 2007; Quesada et al., 2010). One of the four trials was only available in French (Senturk et al., 2007) and one was an FDA drug ap- plication (Freedom of Information Summary, 2007). The drug application results were also published as a technical bulletin found by one of the reviewers (JFC) and available on the manufacturer websitee (Pfizer Animal Health, 2008). One trial was 95 published in 2002 (Gokce et al., 2002) but was incorrectly excluded from the original systematic review. Concurrent management interventions were rarely reported and thus not included in the descriptive results.

Although 13 randomized trials had at least some extractable outcome data, this did not mean all the outcomes measured and reported by all 13 trials could be extracted.

For some studies, unhealed corneal ulcer data could not be extracted because the study population contained both a↵ected and una↵ected calves (George et al., 1988;

Eastman et al., 1998) or because the figure contained treatment arms combined for calves diagnosed with IBK (Starke et al., 2007). For another paper, the image reso- lution was too low to extract ulcer surface area data beyond days 3-4 (Dueger et al.,

2004) Clinical score data could not be extracted because the population contained both a↵ected and una↵ected calves (George et al., 1988) or authors did not provide any measures of variation (Allen et al., 1995; Zielinski et al., 2002). Ulcer surface area data could not be extracted due to the population containing both a↵ected and una↵ected calves (George et al., 1988). For three manuscripts, healing time data could not be extracted as the authors did not provide any variance measures but only a range (Angelos et al., 2000; Freedom of Information Summary, 2007; Quesada et al., 2010). 96

Summary of the treatment network geometry

The network of treatment comparisons for each outcome is presented in Figure 2.

With the exception of the unhealed ulcers network, we observed a general lack of available comparisons suitable for a MTC meta-analysis. Specifically, very few trials of the same intervention were available. A multivariate meta-analysis was also not possible due to the overall lack of multiple common treatment comparisons across outcomes.

Analysis of the treatment network geometry

For the outcome clinical score, the probability of interspecific encounter (PIE) index was 0.79, the C-score was 1.20 with an associated p-value of 0.86.

For the outcome healing time, the PIE index was 0.85, the C-score was 1.20 with an associated p-value of 0.14.

For the outcome risk of unhealed ulcers, the PIE index was 0.86, the C-score was

1.88 with an associated p-value of 0.81.

For the surface area outcome, the PIE index was 0.78, the C-score was 1.20 with p-value is 0.93.

The PIE index was greater than > 0.75 for all four outcomes suggesting relatively high network diversity. This is not surprising as the 13 studies did represent di↵erent interventions. None of the C-score p-values were significant meaning co-occurrence 97 is not occurring within the four outcome networks. These results may reflect the fact that multiple studies on the same intervention rarely occurred in this dataset and may also be due to the smaller number of trials available.

Risk of bias within studies

Risk of bias for each study are presented in Figure 3. The main concerns for bias were insucient descriptions of randomization. No manuscripts in our sample set de- scribed concealment of the allocation sequence. It should be noted that although all studies were included based on randomization, one study (Zielinski et al., 2002) was included in the meta-analysis as a randomized trial because the authors self described the paper as a randomized block design however it was considered high risk by the reviewers for selection bias due to a description of sequential allocation. However as other papers were not required to prove random allocation by documenting the method of allocation, it was decided to retain the paper and consider it randomized.

The risk of bias assessment resulted in a fairly large number of unclear risk assess- ments. This was due to the fact that these biases are often dicult to assess in older papers that frequently do not report in a manner consistent with recent recommen- dations such as the REFLECT guidelines (O’Connor et al., 2010). 98

Synthesis of pairwise meta-analysis of active treatments compared to placebo and exploration of heterogeneity

For presentation of the data, we grouped drugs together by compound and ignored the route of dose as there were insucient data for further refining the treatment protocol.

Clinical score as outcome

Figure 4 contains the summary SMD for the comparison of active treatments to non-active treatments. Notice that each antibiotic only has data from a single study assessed at multiple time points. The SMD and 95% confidence interval

(95% CI) for cloxacillin 0.71 (0.51, 0.91), florfenicol 0.70 (0.52, 0.87), and clindamycin

3.37 (2.78, 3.96) suggest lower clinical scores (i.e. treatment successful) in calves as- signed to active treatments compared to non-active treatment.

For clinical score there was no evidence of heterogeneity within the studies for each antibiotic. It should be noted that normally in meta-analysis heterogeneity assess- ment aims to assess across studies, but in this analysis the comparison is across time points within each study. The Q test statistic was 1.8 with p-value 0.99 for cloxacillin, 0.58 with p-value 0.99 for florfenicol, and 1.5 with p-value 0.48 for clin- damycin. These Q statistics and associated p-values suggest insucient evidence to reject the null hypothesis of homogeneity (Hoaglin, 2016). 99

Healing time as outcome

The durations until corneal ulcer healing are presented in Figure 5. Healing time was the only outcome that contained e↵ect measures for the same active to non-active comparison from more than one trial (Allen et al., 1995; Eastman et al., 1998). These studies assessed 300,000 IU of penicillin administered into the subconjunctiva. The mean di↵erence (MD) and 95% CI for penicillin 3.23 ( 5.85, 12.31) and cloxacillin 2.34 (0.28, 4.40) indicate active treatment reduces healing time compared to placebo.

However for the penicillin subgroup one trial suggested penicillin was ine↵ective based on the point estimate and confidence interval (Allen et al., 1995). This di↵erence in estimates across the studies for penicillin is reflected in the test for heterogeneity.

The Q test statistic was 9.1 with p-value 0.0026 for penicillin and 0.057 with p-value

0.81 for cloxacillin.

Unhealed corneal ulcers as outcome

The results of the pairwise meta-analysis for unhealed corneal ulcers by active treat- ment subgroup are presented in Figure 6. Compared to non-active arms, odds ratios

(95% CI) from the random e↵ects models were as follows, penicillin 0.19 (0.12, 0.30),

florfenicol 0.092 (0.027, 0.31), tilmicosin 0.36 (0.19, 0.69), tulathromycin 0.13 (0.085, 0.21), and ceftiofur 0.35 (0.20, 0.63). These results suggest, as expected, active treatment is overall more e↵ective than no treatment. A summary e↵ect was not calculated 100 for the oxytetracycline subgroup as only one arm of one trial used oxytetracycline to treat IBK (Zielinski et al., 2002).

The Q test statistic was 9.8 with p-value 0.45 for penicillin, 1.1 with p-value 0.29 for florfenicol, 0.024 with p-value 1 for tilmicosin, 25 with p-value 0.0035 for tu- lathromycin, and 3.8 with p-value 0.58 for ceftiofur.

Corneal ulcer surface as outcome

Figure 7 displays the summary MD for each active to non-active subgroup. The MD and 95% CI for cloxacillin 0.17 (0.11, 0.24), florfenicol 1.62 (1.22, 2.01), and ceftiofur

0.096 (0.066, 0.13) suggest active treatment results in smaller corneal ulcer surface areas compared to placebo. The summary MD for penicillin 0.34 ( 0.48, 0.19) would suggest non-treatment reduces corneal ulcer size more e↵ectively than treat- ment with penicillin in this trial population.

The Q test statistic was 5.6 with p-value 0.78 for cloxacillin, 12 with p-value 0.04 for florfenicol, 13 with p-value 0.038 for ceftiofur, and 6.6 with p-value 0.086 for penicillin.

Assessing sources of systematic bias

The typical sources of systematic bias are a lack of randomization, outcome blind- ing, and trial sponsorship. Our dataset included only randomized trials, 7 of which 101 employed random allocation stratified by IBK severity, 6 described blinding of out- come(s), and 5 were pharmaceutical company sponsored. To assess risk of bias across studies, it is necessary to have multiple studies of the same intervention; however the sparse data in this data set precluded such an analysis. We were therefore unable to assess these methodological variables further beyond reporting the numbers here.

The rationale for not assessing bias due to caregiver blinding is that no studies reported this process, so it could not be assessed. Loss-to-follow-up was also not assessed because it was not possible to accurately determine, due to the approach to reporting if authors made the decision a priori to only report for animals that com- pleted the study. Any assessment of loss-to-follow-up would be inaccurate, highly subjective and likely meaningless. These decisions were made while the protocol was being developed, and should have been included in the protocol but were not.

Risk of bias across studies

The sample set did not contain multiple trials comparing the same active treatment for one of the four outcomes thus we did not examine funnel plots. Only for duration until ulcer healing comparing penicillin to placebo was there more than one trial

(Allen et al., 1995; Eastman et al., 1998). 102

Active-to-active comparisons of interest

As can be seen from the networks, only three studies assessing active to active com- parisons have been conducted that did not just use di↵erent doses or routes of the same drugs. Gokce et al. (Gokce et al., 2002) reported that florfenicol was more e↵ective than oxytetracycline for healing time [MD=5.42 and 95% CI (-7.98, 18.82) extra days until healing for oxytetracycline] and lower risk of unhealed ulcer by day 28 [OR=0.05 and 95% CI (0.00, 1.00) - oxytetracycline as the referent]. For risk of unhealed ulcer, Chadli (Chadli, 1992) indicated oxytetracycline was more ef- fective than aureomycin [OR=17.26 (5.43, 54.86) - oxytetracycline as the referent].

Zielinski et al. (Zielinski et al., 2002) suggested that tilmicosin was more e↵ective at multiple doses than oxytetracycline. For example an OR of 0.18 (0.044, 0.76) at day 21 was observed when comparing 300 mg of either antibiotic administered sub-subconjunctivally.

Mixed treatment comparisons meta-analysis

Based on the small sample size of randomized trials with extractable data, the MTC meta-analysis failed to produce interpretable results. We did conduct the analysis for the most commonly measured outcome, unhealed ulcers, which had the largest network. However, for the resulting ranking of treatment analysis all active treat- ments and even placebo had overlapping credibility intervals around 0-11 (data not 103 shown). This was likely due to the relatively close e↵ect measure estimates and lack of enough available trial data to distinguish between treatment ecacies. As a re- sult the MTC meta-analysis was not investigated further or reported further. We also considered conducting a multivariable meta-analysis which would enable us to borrow information across outcomes, but the data were too sparse to even attempt such an analysis.

Ancillary analysis - IBK e↵ect on weight

As it has been previously observed that IBK-a↵ected calves have decreased weaning weight compared to una↵ected herd mates (Thrift & Overfield, 1974; Funk et al.,

2009; O’Connor et al., 2011; Funk et al., 2014), the expectation was that a number of trials would include ADG or final weight. However of the included trials, only two manuscripts reported weight as an outcome (Zielinski et al., 2002; Roeder et al.,

1995). Additionally the authors of one of the two manuscripts did not report mea- sures of variance for final weight or ADG and therefore the data cannot be used

(Zielinski et al., 2002). As a result ancillary analysis was not conducted.

Discussion

This systematic review update identified 17 randomized trials, 13 of which had data that could be extracted. Four of the 13 were published since the last review in 104

2006. This review expanded the outcomes of interest to include not only risk of unhealed ulcers but also clinical scores, duration until healing, and ulcer surface area. With the exception of oxytetracycline for unhealed ulcer risk and penicillin for ulcer surface area, the results indicate that overall any active intervention is more e↵ective than placebo for the treatment of IBK in cattle. This result was observed for all assessed outcomes and should be expected. Unfortunately we were not able to conduct a meaningful mixed treatment analysis that would provide additional information about the comparative ecacy of the drugs because of the sparse data for each intervention. We had initially proposed to conduct a mixed treatment comparison that would use direct and indirect information to provide information of the comparative ecacy of the active drugs. With so few trials reporting active to active comparisons this would provide critical information for treatment selection.

Of particular interest were the sideways parabola-type shapes produced from lining up the e↵ect measures for each day within an active treatment subgroup. For example the cloxacillin results from the ulcer surface area forest plot (Figure 7), appear as two (one for 250 mg and one for 375 mg) sideways parabolas starting around the null (zero) at day four, bending away from the null, and returning back by day 16.

Although not necessarily statistically relevant based on the confidence intervals, the greatest e↵ect size point estimate (or the furthest from the null on the parabola) 105 was observed on day seven or day ten depending on the dose. These results suggest that the best time in which to observe a di↵erence between treatments using surface area as an outcome should be between day 7-10, and that beyond day 16 the choice of an active antibiotic intervention or placebo makes no di↵erence. Similar shapes were observed for ceftiofur, florfenicol (largest observed e↵ect size at day seven or day 14), tulathromycin, and penicillin. For the penicillin subgroup however the parabola is orientated in the other direction indicating penicillin made no impact on surface area. Considering our summary measures are compared to placebo, we would expect the di↵erence to be even smaller for trials comparing active-to-active treatments. Although to a lesser extent, similar shapes were observed for some of the active treatment subgroups for the clinical score (Figure 4) and unhealed ulcer risk (Figure 6) forest plots.

By using multiple time points from within the same studies for our summary mea- sures, we expected the Q statistic and p-value to indicate heterogeneity. This is due to IBK self-healing without the application of antibiotics and producing the side- ways parabolas. For healing time (penicillin), unhealed ulcer risk (tilmicosin and tulathromycin), and the surface area (florfenicol and ceftiofur) outcomes, the null hypothesis of homogeneous e↵ect measures across days was rejected at an ↵ of 0.05 within active subgroups. 106

One of the reasons for conducting an MTC meta-analysis is to make active-to-active treatment comparison ecacy predictions that do not exist as publicly available data.

These comparison predictions allow for the ranking of treatments by e↵ectiveness which ultimately enable the clinician to make the best treatment decision for their clients and patients. In the absence of a suitably large enough trial data set, this type of meta-analysis is not possible. As this systematic review demonstrated, this is exactly the case for the antibiotic treatment of IBK. In these circumstances the clinician should use whatever available active-to-active data that exists realizing there is more variability based on the one or two estimates as they are single random events.

We found only one study that made this direct comparison at a single time point (28 days). That trial compared 20 mg/kg IM oxytetracycline to 20 mg/kg IM florfenicol using unhealed ulcer risk as the outcome (Gokce et al., 2002). The results suggested

florfenicol was superior (OR=0.05). This single observation suggests florfenicol might be more e↵ective than oxytetracycline. However, this does not mean that florfeni- col should be the treatment of choice, as comparative ecacy is only one factor in treatment decisions. Interestingly only oxytetracyline and tulathromycin are regis- tered for use for IBK in the US. We could not find any peer-reviewed studies of the ecacy of tulathromycin that met our PICOS criteria [one trial published in Ital- 107 ian did not have a valid comparator (Emarcora et al., 2010) and one trial employed achallengemodel(Laneetal.,2006)].Theonlydatawehaveforthistreatment option is from the FDA FOI NADA summary (Freedom of Information Summary,

2007) and a technical bulletin from the manufacturer website (Pfizer Animal Health,

2008). These are not places we would expect many producers or veterinarians to look for information about ecacy. In Canada florfenicol is also registered for use for treatment of IBK, but we could not find the equivalent of the FDA NADA FOI for this use.

In summary, the results of this systematic review and meta-analysis strongly indicate that antibiotics of many kinds improve healing compared to placebo. The trial results also suggest a need for more active-to-active randomized trials in the treatment of

IBK in calves. Finally, the maximum power of those studies to detect di↵erences between treatments is perhaps around 7 to 14 days post treatment.

Acknowledgments

Thank you to Samantha Tyner for translation assistance for the French language article. 108

2. References

Abdala, A. A., Canavesio, V. R., Zimmermann, G., Abramovich, G., Calvinho, L. F.

(2002). ‘Two treatments for bovine infections keratoconjunctivitis Evaluacion de

dos regimenes de tratamiento antibiotico topico, para Queratoconjuntivitis Infec-

ciosa Bovina’. Revista de Medicina Veterinaria (Buenos Aires) 83(3):100–102.

Alexander, D. (2010). ‘Infectious bovine keratoconjunctivitis: a review of cases in

clinical practice’. Veterinary Clinics of North America: Food Animal Practice

26(3):487–503.

Allen, L. J., George, L. W., Willits, N. H. (1995). ‘E↵ect of penicillin or penicillin

and dexamethasone in cattle with infectious bovine keratoconjunctivitis’. Journal

of the American Veterinary Medical Association 206(8):1200–1203.

Angelos, J. A., Dueger, E. L., George, L. W., Carrier, T. K., Mihalyi, J. E., Cosgrove,

S. B., Johnson, J. C. (2000). ‘Ecacy of florfenicol for treatment of naturally oc-

curring infectious bovine keratoconjunctivitis’. Journal of the American Veterinary

Medical Association 216(1):62–64.

Angelos, J. A., Spinks, P. Q., Ball, L. M., George, L. W. (2007). ‘Moraxella bovoculi

sp. nov., isolated from calves with infectious bovine keratoconjunctivitis’. Inter-

national Journal of Systematic and Evolutionary Microbiology 57(Pt 4):789–795. 109

Brown, M. H., Brightman, A. H., Fenwick, B. W., Rider, M. A. (1998). ‘Infectious

bovine keratoconjunctivitis: a review’. Journal of Veterinary Internal Medicine

12(4):259–266.

Chadli, M. (1992). ‘Clinical evaluation of infectious bovine keratoconjunctivitis af-

ter treatment with long acting oxytetracycline La keratoconjonctivite infectieuse

bovine: evaluation clinique apres traitement a l’oxytetracycline longue action’.

Maghreb Veterinaire 7(27):34–36.

Cosgrove, S. B., Simmons, R. D., Johnson, J. C., Varma, K. J. (1998). ‘In vivo

and in vitro ecacy of Nuflor (florfenicol) in the treatment of infectious bovine

keratoconjunctivitis. Schering-plough Animal Health Nuflor: new therapeutic ap-

plications’. In Proceedings of a symposium held in conjunction with the XX World

Buiatrics Congress,p.514.

Cullen, J. N., Engelken, T. J., Cooper, V. S., OConnor, A. M. (in press). ‘A random-

ized, controlled, blinded trial assessing the association between infectious bovine

keratoconjunctivitis incidence and a commercial Moraxella bovis vaccine in beef

calves’. Journal of the American Veterinary Medical Association .

Dueger, E. L., George, L. W., Angelos, J. A., Tankersley, N. S., Luiz, K. M., Meyer,

J. A., Portis, E. S., Lucas, M. J. (2004). ‘Ecacy of a long-acting formulation of cef- 110

tiofur crystalline-free acid for the treatment of naturally occurring infectious bovine

keratoconjunctivitis’. American Journal of Veterinary Research 65(9):1185–1188.

Eastman, T. G., George, L. W., Hird, D. W., Thurmond, M. C. (1998). ‘Com-

bined parenteral and oral administration of oxytetracycline for control of infec-

tious bovine keratoconjunctivitis’. Journal of the American Veterinary Medical

Association 212(4):560–563.

Edmondson, A. J., George, L. W., Farver, T. B. (1989). ‘Survival analysis for eval-

uation of corneal ulcer healing times in calves with naturally acquired infectious

bovine keratoconjunctivitis’. American Journal of Veterinary Research 50(6):838–

844.

Emarcora, M., Morassi, R., Soriolo, A., Piva, R., Bregoli, M., Schiavon, E. (2010).

‘Ecacy of tulathromycin in a outbreak of infectious bovine keratoconjunctivitis’.

Large Animal Review 16(4):163–165.

Freedom of Information Summary (2007). ‘NADA 141-244 - DRAXXIN

Injectable Solution’. http://www.fda.gov/downloads/animalveterinary/

products/approvedanimaldrugproducts/foiadrugsummaries/ucm051495.pdf.

Funk, L., O’Connor, A. M., Maroney, M., Engelken, T., Cooper, V. L., Kinyon,

J., Plummer, P. (2009). ‘A randomized and blinded field trial to assess the ef- 111

ficacy of an autogenous vaccine to prevent naturally occurring infectious bovine

keratoconjunctivis (IBK) in beef calves’. Vaccine 27(34):4585–4590.

Funk, L. D., Reecy, J. M., Wang, C., Tait R. G., J., O’Connor, A. M. (2014).

‘Associations between infectious bovine keratoconjunctivitis at weaning and ultra-

songraphically measured body composition traits in yearling cattle’. Journal of

the American Veterinary Medical Association 244(1):100–106.

George, L., Mihalyi, J., Edmondson, A., Daigneault, J., Kagonyera, G., Willits, N.,

Lucas, M. (1988). ‘Topically applied furazolidone or parenterally administered

oxytetracycline for the treatment of infectious bovine keratoconjunctivitis’. Jour-

nal of the American Veterinary Medical Association 192(10):1415–1422.

George, L. W., Keefe, T., Daigneault, J. (1989). ‘E↵ectiveness of two benzathine

cloxacillin formulations for treatment of naturally occuring infectious bovine ker-

atoconjunctivitis’. American Journal of Veterinary Research 50(7):1170–1174.

George, L. W., Wilson, W. D., Baggot, J. D., Mihalyi, J. E. (1984). ‘Antibiotic treat-

ment of Moraxella bovis infection in cattle’. Journal of the American Veterinary

Medical Association 185(10):1206–1209.

Gokce, H. I., Citil, M., Genc, O., Erdogan, H. M., Gunes, V., Kankavi, O. (2002).

‘A comparison of the ecacy of florfenicol and oxytetracycline in the treatment 112

of naturally occurring infectious bovine keratoconjunctivitis’. Irish Veterinary

Journal 55(11):573–576.

Gotelli, N. J., Hart, E. M., Ellison, A. M. (2015). ‘EcoSimR: Null model analysis for

ecological data - version 0.1.0’. http://github.com/gotellilab/EcoSimR.

Gould, S., Dewell, R., To✏emire, K., Whitley, R. D., Millman, S. T., Opriessnig, T.,

Rosenbusch, R., Trujillo, J., O’Connor, A. M. (2013). ‘Randomized blinded chal-

lenge study to assess association between Moraxella bovoculi and Infectious Bovine

Keratoconjunctivitis in dairy calves’. Veterinary Microbiology 164(1-2):108–115.

Higgins, J., Green, S., Cochrane Collaboration (2008). Cochrane handbook for sys-

tematic reviews of interventions / edited by Julian P.T. and Sally Green.,vol.5.

Chichester, England;Hoboken, NJ: Wiley-Blackwell.

Higgins, J. P., Whitehead, A. (1996). ‘Borrowing strength from external trials in a

meta-analysis’. Statistics in Medicine 15(24):2733–2749.

Hoaglin, D. C. (2016). ‘Misunderstandings about Q and ’Cochran’s Q test’ in meta-

analysis’. Statistics in Medicine 35(4):485–95.

Jansen, J. P., Crawford, B., Bergman, G., Stam, W. (2008). ‘Bayesian meta-analysis

of multiple treatment comparisons: an introduction to mixed treatment compar-

isons’. Value Health 11(5):956–964. 113

Jørgensen, E., Pedersen, A. R. (1998). ‘How to obtain those nasty standard errors

from transformed data-and why they should not be used’ http://gbi.agrsci.

dk/~ejo/publications/dinapig/intrep7.pdf.

Lane, V. M., George, L. W., Cleaver, D. M. (2006). ‘Ecacy of tulathromycin for

treatment of cattle with acute ocular Moraxella bovis infections’. Journal of the

American Veterinary Medical Association 229(4):557–561.

Lu, G., Ades, A. E. (2004). ‘Combination of direct and indirect evidence in mixed

treatment comparisons’. Statistics in Medicine 23(20):3105–3124.

Moher, D., Liberati, A., Tetzla↵, J., Altman, D. G., Prisma Group (2009). ‘Pre-

ferred reporting items for systematic reviews and meta-analyses: the PRISMA

statement’. PLoS Medicine 6(7).

O’Connor, A. M., Brace, S., Gould, S., Dewell, R., Engelken, T. (2011). ‘A random-

ized clinical trial evaluating a farm-of-origin autogenous Moraxella bovis vaccine

to control infectious bovine keratoconjunctivis (pinkeye) in beef cattle’. Journal

of Veterinary Internal Medicine 25(6):1447–1453.

O’Connor, A. M., Sargeant, J. M., Gardner, I. A., Dickson, J. S., Torrence, M. E.,

Dewey, C. E., Dohoo, I. R., Evans, R. B., Gray, J. T., Greiner, M., Keefe, G.,

Lefebvre, S. L., Morley, P. S., Ramirez, A., Sischo, W., Smith, D. R., Snedeker, 114

K., Sofos, J., Ward, M. P., Wills, R. (2010). ‘The REFLECT statement: methods

and processes of creating reporting guidelines for randomized controlled trials for

livestock and food safety’. Preventive Veterinary Medicine 93(1):11–8.

O’Connor, A. M., Wellman, N. G., Evans, R. B., Roth, D. R. (2006). ‘A review

of randomized clinical trials reporting antibiotic treatment of infectious bovine

keratoconjunctivitis in cattle’. Animal Health Research Reviews 7(1-2):119–127.

Pfizer Animal Health (2008). ‘Ecacy of DRAXXIN (tulathromycin) Injectable

Solution for treatment of naturally occurring infectious bovine keratoconjunctivitis

associated with Moraxella bovis in beef and airy calves’. https://www.zoetisus.

com/\_locale-assets/myresources/drx08001-pinkeye-tb.pdf.

Quesada, M. A. Z., A, J. A., Lopez, H. S. (2010). ‘Clinical ecacy of parenteral vs

ophthalmic florfenicol for the treatment of infectious bovine keratoconjunctivitis’.

Veterinaria Mexico 41:219–225.

RCoreTeam(2015).‘R:ALanguageandEnvironmentforStatisticalComputing’.

Roeder, B. L., Skogerboe, T. L., Clark, F. D., Kelly, E. J. (1995). ‘E↵ect of

early treatment with parenteral long-acting oxytetracycline on performance of beef

calves with acute eye lesions’. Agri-Practice 16(7):6–11. 115

Salanti, G., Kavvoura, F. K., Ioannidis, J. P. (2008). ‘Exploring the geometry of

treatment networks’. Annals of Internal Medicine 148(7):544–553.

Senturk, S., Cetin, C., Temizel, M., Ozel, E. (2007). ‘Evaluation of the clinical ecacy

of subconjunctival injection of clindamycin in the treatment of naturally occurring

infectious bovine keratoconjunctivitis’. Veterinary Ophthalmology 10(3):186–189.

Starke, A., Eule, C., Meyer, H., Im Winkel, C., Verspohl, J., Rehage, J. (2007).

‘Ecacy of intrapalpebral and intramuscular application of oxytetracycline in a

natural outbreak of infectious bovine kertoconjunctivitis (IBK) in calves’. DTW.

Deutsche tierarztliche Wochenschrift 114(6):219–224.

Sterne, J. A. C., Sutton, A. J., Ioannidis, J., Terrin, N., Jones, D. R., Lau, J., Car-

penter, J., R¨ucker, G., Harbord, R. M., Schmid, C. H., Others (2011). ‘Recommen-

dations for examining and interpreting funnel plot asymmetry in meta-analyses of

randomised controlled trials’. BMJ 343:1–8.

Tate, M. W., Brown, S. M. (1970). ‘Note on the Cochran Q test’. Journal of the

American Statistical Association 65(329):155–160.

Thrift, F. A., Overfield, J. R. (1974). ‘Impact of pinkeye (infectious bovine kerato-

conjunctivitis) on weaning and postweaning performance of Hereford calves’. Jour-

nal of Animal Science 38(6):1179–1184. 116

Viechtbauer, W. (2010). ‘Conducting meta-analyses in R with the metafor package’.

Journal of Statistical Software 36(3):1–48.

Zielinski, G. C., Piscitelli, H. G., Perez-Monti, H., Stobbs, L. A., Zimmermann, A. G.

(2002). ‘Ecacy of di↵erent dosage levels and administration routes of tilmicosin in

a natural outbreak of infectious bovine keratoconjunctivitis (pinkeye)’. Veterinary

Therapeutics 3(2):196–205. 117

3. Footnotes

(a) http://www.farad.org

(b) http://www.cabi.org

(c) http://www.ncbi.nlm.nih.gov/pubmed

(d) http://www.fda.gov/RegulatoryInformation/FOI/

(e) Zoetisus, http://www.zoetis.com/animal-health/

(f) Boehringer Ingelheim Vetmedica, http://www.bi-vetmedica.com

(g) DistillerSR (web-based software), Evidence Partners, https://distillercer.com

(h) Ankit Rohatgi, WebPlotDigitizer, Texas, http://arohatgi.info/WebPlotDigitizer/ 118

4. Tables

Table 1: Database search terms for the Centre for Biosciences and Agriculture Internatinal (CABI) and MEDLINE databases. Population AND Disease AND Intervention

TS=(beef OR bovine OR calf OR calves OR cat- TS= (keratoconjunctivitis OR conjunctivitis OR TS = (antibiotic OR antimicrobial OR amoxi-

tle OR cow OR dairy OR angus OR hereford OR pink eye OR pinkeye OR IBK OR Branhamella OR cillin OR ampicillin OR ceftiofur OR chlortetra-

holstein OR ruminant OR steer) Moraxella OR Mycoplasma bovoculi OR Neisseria) cycline OR cloxacillin OR danofloxacin OR en-

rofloxacin OR erythromycin OR florfenicol OR

gamithromycin OR gentamicin OR lincomycin OR

oxytetracycline OR penicillin OR spectinomycin

OR sulfadimethoxine OR sulfamethoxazole OR

tetracycline OR tildipirosin OR tilmicosin OR

trimethoprim OR tulathromycin OR tylosin) OR

DE= (drug therapy OR injection OR antibacterial

agents) 119

Table 2: Population characteristics for each extractable trial. Weight presented as a range or +/- the SD. Author Publication Population Number Relevant

year of sites arms

George et al. 1988 USA, Hereford, m/f, 146-312kg, 6 months 1 2

George et al. 1989 USA, Holstein (Trial I), Angus cross (Trial II), 2 4

m/f, 2-9 months

Chadlin, M. 1992 Morocco 2

Allen et al. 1995 USA, Angus and Hereford, f, 6-8 months 1 2

Eastman et al. 1998 USA, Hereford, 6 months 1 2

Angelos et al. 2000 USA, Angus, Angus-Hereford, Holstein, m/f, 4 3

125.9-364.5 kg, 4-12 months

Zielinski et al. 2002 Argentina, Hereford, m, 6 months 1 6

Gokce et al. 2002 Turkey, Swiss Brown, 6-12 months 1 2

Dueger et al. 2004 USA, Angus-Hereford (SFS), Holstein (CD), 2 2

3-9 months

Starke et al. 2007 Country not reported, Holstein, m, 119 +/- 25 1 2

kg

Senturk et al. 2007 Turkey, Holstein, 4-28 months 1 2

FOI Summary 2007 USA, 138 to 342 kg and 4-10 months (WI site), 2 2

NADA 141-244 156 to 395 kg and 6-14 months (CA site)

Quesada et al. 2010 Country not reported, f, 326 +/- 15.8 kg 1 2 120

Table 3: Trial characteristics for each extractable trial. IM = intramuscular, SC = subcutaneous, SJ = subconjunctival, TOP = topical opthalmic.

Author Publication Duration of ob- Treatment (size per arm) Extractable outcomes Blinding Stratification by Sponsorhip

year servation severity

George et al. 1988 54 days non active (n=52), oxytetracycline 20 mg/kg IM (n=52) healing time no no yes

George et al. 1989 16 days non active (Trial I n=20), cloxacillin 250 mg TOP (Trial surface area, clin score, yes no no

In=23,TrialIIn=8),cloxacillin375mgTOP(TrialI heal time

n=21, Trial II n=8)

Chadlin, M. 1992 15 days aureomycin 1% wash TOP (n=53), oxytetracycline 1 unhealed risk no not translated not translated

ml/10 kg IM (n=53)

Allen et al. 1995 84 days non active (n=14), penicillin 300,000 U SJ (n=18) surface area, healing yes yes no

time

Eastman et al. 1998 49 days non active (n=39), penicillin 300,000 U SJ (n=38) unhealed risk, healing no yes no

time

Angelos et al. 2000 20 days non active (n=52), florfenicol 20 mg/kg IM (n=49), flor- unhealed risk, surface yes yes yes

fenicol 40 mg/kg SC (n=41) area, clin score

Zielinski et al. 2002 21 days non active (n=19), oxytetracycline 300 mg SJ (n=20), unhealed risk yes yes yes

tilmicosin at 2.5 mg SC (n=20), 5 mg SC (n=21), 10 mg

SC (n=20), and 300 mg SJ (n=19)

Gokce et al. 2002 70 days florfenicol 20 mg/kg IM (n=15), oxytetracycline 20 unhealed risk, healing no no no

mg/kg IM (n=15) time

Dueger et al. 2004 21 days non active (Site I n=40, Site II n=26), ceftiofur 6.6 mg/kg unheal risk, surface yes yes yes

SC (Site I n=38, Site II n=26) area

Starke et al. 2007 108 days oxytetracycline 20 mg/kg IM (n=27), oxytetracycline 200 healing time no yes no

mg SJ (n=29)

Senturk et al. 2007 15 days non active (n=18), clindamycin 150 mg SJ (n=18) clin score no no no

FOI Summary NADA 2007 21 days non active (n=49 at CA site, n=49 at WI site), tu- unhealed risk no yes yes

141-244 lathromycin 2.5 mg/kg SC (n=50 at CA site, n=50 at

WI site)

Quesada et al. 2010 21 days florenicol 20 mg/kg IM (n=20), florfenicol 126 unhealed risk yes no no

mg/bovine/week TOP (n=21) 121

5. Illustrations

Figure 1: Flow chart illustrating the database searches and manuscript exclusions during the screen- ing and data extraction processes. 122

Clinical Score Healing Time

NON ACTIVE NON ACTIVE

oxytet_IM pen_SJ

flor_SC clox_TOP

oxytet_SJ clox_TOP

clind_SJ

flor_IM flor_IM

Surface Area Unhealed Ulcers NON ACTIVE NON ACTIVE tilm_SJ oxytet_IM

flor_SC ceft_SC tilm_SC ceft_SC

flor_SC pen_SJ

flor_IM clox_TOP oxytet_SJ flor_IM

aureo_TOP flor_TOP pen_SJ tula_SC

Figure 2: Network of treatment arms for each outcome. The size of the node and line width represent the relative number of arms and direct comparisons, respectively. IM = intramuscular,

SC = subcutaneous, SJ = subconjunctival, TOP = topical opthhalmic, aureo = aureomycin, ceft

= ceftiofur, clind = clindamycin, clox = cloxacillin, flor = florfenicol, oxytet = oxytetracycline, pen

= penicillin, tilm = tilmicosin, tula = tulathromycin. 123

Figure 3: Risk of bias per manuscript for which data were extracted. Low risk is indicated by green, high risk by red, and unclear risk is blank. 124

Author (year), treatment Arm 1 (Placebo) Arm 2 (Active) dose, route, timepoint mean SD mean SD SMD [95% CI]

George et al (1989), cloxacillin 375 mg,TOP−OPTHALM,day 16 1.90 0.80 1.30 0.60 0.84 [ 0.20 , 1.47 ] 375 mg,TOP−OPTHALM,day 13 2.00 0.90 1.50 0.70 0.61 [ −0.02 , 1.24 ] 375 mg,TOP−OPTHALM,day 10 2.00 0.70 1.50 0.60 0.75 [ 0.12 , 1.39 ] 375 mg,TOP−OPTHALM,day 7 2.10 0.70 1.50 0.60 0.90 [ 0.26 , 1.55 ] 375 mg,TOP−OPTHALM,day 4 2.30 0.70 1.70 0.70 0.84 [ 0.20 , 1.48 ] 250 mg,TOP−OPTHALM,day 16 1.90 0.80 1.40 0.50 0.75 [ 0.13 , 1.37 ] 250 mg,TOP−OPTHALM,day 13 2.00 0.90 1.40 0.60 0.78 [ 0.16 , 1.40 ] 250 mg,TOP−OPTHALM,day 10 2.00 0.70 1.50 0.80 0.65 [ 0.04 , 1.26 ] 250 mg,TOP−OPTHALM,day 7 2.10 0.70 1.70 0.80 0.52 [ −0.09 , 1.13 ] 250 mg,TOP−OPTHALM,day 4 2.30 0.70 1.90 0.90 0.48 [ −0.13 , 1.09 ] RE Model for Subgroup 0.71 [ 0.51 , 0.91 ]

Angelos et al (2000), florfenicol 40 mg/kg,SC,day 21 0.49 0.81 0.10 0.44 0.59 [ 0.15 , 1.03 ] 40 mg/kg,SC,day 14 0.70 0.89 0.13 0.43 0.79 [ 0.36 , 1.23 ] 40 mg/kg,SC,day 7 0.98 0.82 0.45 0.55 0.74 [ 0.31 , 1.16 ] 20 mg/kg,IM,day 21 0.49 0.81 0.06 0.21 0.74 [ 0.32 , 1.17 ] 20 mg/kg,IM,day 14 0.70 0.89 0.20 0.53 0.68 [ 0.27 , 1.09 ] 20 mg/kg,IM,day 7 0.98 0.82 0.49 0.69 0.64 [ 0.24 , 1.04 ] RE Model for Subgroup 0.70 [ 0.52 , 0.87 ]

Senturk et al (2007), clindamycin 150 mg,S−CONJUNC,day 15 1.77 0.42 0.23 0.42 3.55 [ 2.50 , 4.60 ] 150 mg,S−CONJUNC,day 7 2.19 0.42 0.56 0.42 3.76 [ 2.67 , 4.85 ] 150 mg,S−CONJUNC,day 3 2.27 0.42 1.00 0.42 2.93 [ 1.99 , 3.87 ] RE Model for Subgroup 3.37 [ 2.78 , 3.96 ]

−1.00 0.00 1.00 2.00 3.00 4.00 5.00

Treatment ineffective SMD Treatment effective

Figure 4: Standardized mean di↵erence (SMD) for clinical score - all available pairwise treatments that included a non-active arm. 125

Author (year), treatment Arm 1 (Placebo) Arm 2 (Active)

dose, route, timepoint mean SD mean SD Mean difference [95% CI]

George et al (1988), oxytetracycline

20 mg/kg,IM, 11.80 11.10 7.00 6.80 4.80 [ 1.26 , 8.34 ]

George et al (1989), cloxacillin

375 mg,TOP−OPTHALM, 10.50 5.20 7.90 5.00 2.60 [ −0.38 , 5.58 ]

250 mg,TOP−OPTHALM, 10.50 5.20 8.40 5.00 2.10 [ −0.74 , 4.94 ]

RE Model for Subgroup 2.34 [ 0.28 , 4.40 ]

Eastman et al(1998), Allen et al(1995), penicillin

300,000 U,S−CONJUNC, 7.30 6.20 8.80 5.40 −1.50 [ −6.15 , 3.15 ]

300,000 U,S−CONJUNC, 12.40 11.87 4.63 2.47 7.77 [ 3.92 , 11.62 ]

RE Model for Subgroup 3.23 [ −5.85 , 12.31 ]

−10.00 −5.00 0.00 5.00 10.00 15.00

Treatment ineffective Mean difference Treatment effective

Figure 5: Mean di↵erence for healing time - all available pairwise treatments that included a non- active arm. 126

Author (year), treatment Arm 1 (Active) Arm 2 (Placebo) dose, route, timepoint unheal heal unheal heal Odds ratio [95% CI] Eastman et al (1998), penicillin 300,000 U,S−CONJUNC,day 24 0 38 9 30 0.04 [ 0.00 , 0.75 ] 300,000 U,S−CONJUNC,day 21 0 38 10 29 0.04 [ 0.00 , 0.65 ] 300,000 U,S−CONJUNC,day 20 0 38 11 28 0.03 [ 0.00 , 0.57 ] 300,000 U,S−CONJUNC,day 15 0 38 12 27 0.03 [ 0.00 , 0.50 ] 300,000 U,S−CONJUNC,day 13 1 37 13 26 0.05 [ 0.01 , 0.44 ] 300,000 U,S−CONJUNC,day 11 2 36 13 26 0.11 [ 0.02 , 0.54 ] 300,000 U,S−CONJUNC,day 8 3 35 12 27 0.19 [ 0.05 , 0.75 ] 300,000 U,S−CONJUNC,day 6 4 34 15 24 0.19 [ 0.06 , 0.64 ] 300,000 U,S−CONJUNC,day 5 10 28 20 19 0.34 [ 0.13 , 0.88 ] 300,000 U,S−CONJUNC,day 4 11 27 23 16 0.28 [ 0.11 , 0.73 ] 300,000 U,S−CONJUNC,day 3 25 13 34 5 0.28 [ 0.09 , 0.90 ] RE Model for Subgroup 0.19 [ 0.12 , 0.30 ] Angelos et al (2000), florfenicol 40 mg/kg,SC,day 20 3 39 19 33 0.13 [ 0.04 , 0.49 ] 20 mg/kg,IM,day 20 1 48 19 33 0.04 [ 0.00 , 0.28 ] RE Model for Subgroup 0.09 [ 0.03 , 0.31 ] Zielinski et al (2002), tilmicosin 300 mg,S−CONJUNC,day 21 8 11 13 6 0.34 [ 0.09 , 1.27 ] 10 mg/kg,SC,day 21 9 11 13 6 0.38 [ 0.10 , 1.40 ] 5 mg/kg,SC,day 21 9 12 13 6 0.35 [ 0.09 , 1.27 ] 2.5 mg/kg,SC,day 21 9 11 13 6 0.38 [ 0.10 , 1.40 ] RE Model for Subgroup 0.36 [ 0.19 , 0.69 ] Zielinski et al (2002), oxytetracycline 300 mg,S−CONJUNC,day 21 16 4 13 6 1.85 [ 0.43 , 7.96 ]

Dueger et al (2004), ceftiofur 6.6 mg/kg,SC,day 21 0 26 0 26 1.00 [ 0.02 , 52.29 ] 6.6 mg/kg,SC,day 14 1 25 2 24 0.48 [ 0.04 , 5.65 ] 6.6 mg/kg,SC,day 7 3 23 2 24 1.57 [ 0.24 , 10.24 ] 6.6 mg/kg,SC,day 21 3 32 11 27 0.23 [ 0.06 , 0.91 ] 6.6 mg/kg,SC,day 14 5 31 17 23 0.22 [ 0.07 , 0.68 ] 6.6 mg/kg,SC,day 7 11 27 21 19 0.37 [ 0.14 , 0.94 ] RE Model for Subgroup 0.35 [ 0.20 , 0.63 ] FOI Summary NADA 141−244 (2007), tulathromycin 2.5 mg/kg,SC,day 21 10 40 31 18 0.15 [ 0.06 , 0.36 ] 2.5 mg/kg,SC,day 17 11 39 30 19 0.18 [ 0.07 , 0.43 ] 2.5 mg/kg,SC,day 13 16 34 39 10 0.12 [ 0.05 , 0.30 ] 2.5 mg/kg,SC,day 9 25 25 38 11 0.29 [ 0.12 , 0.69 ] 2.5 mg/kg,SC,day 5 35 15 45 4 0.21 [ 0.06 , 0.68 ] 2.5 mg/kg,SC,day 21 0 49 6 43 0.07 [ 0.00 , 1.23 ] 2.5 mg/kg,SC,day 17 0 49 8 41 0.05 [ 0.00 , 0.88 ] 2.5 mg/kg,SC,day 13 0 49 37 12 0.00 [ 0.00 , 0.06 ] 2.5 mg/kg,SC,day 9 0 49 45 4 0.00 [ 0.00 , 0.02 ] 2.5 mg/kg,SC,day 5 26 24 48 1 0.02 [ 0.00 , 0.18 ] RE Model for Subgroup 0.13 [ 0.09 , 0.21 ]

0.00 0.01 1.00 148.41

Treatment effective Odds ratio Treatment ineffective

Figure 6: Odds ratio for unhealed corneal ulcers - all available pairwise treatments that included a non-active arm. 127

Author (year), treatment Arm 1 (Placebo) Arm 2 (Active) dose, route, timepoint mean SD mean SD Mean difference [95% CI] George et al (1989), cloxacillin 375 mg,TOP−OPTHALM,day 16 0.13 0.21 0.06 0.17 0.08 [ −0.04 , 0.19 ] 375 mg,TOP−OPTHALM,day 13 0.29 0.46 0.11 0.23 0.17 [ −0.05 , 0.40 ] 375 mg,TOP−OPTHALM,day 10 0.42 0.63 0.16 0.39 0.27 [ −0.06 , 0.59 ] 375 mg,TOP−OPTHALM,day 7 0.60 0.90 0.24 0.52 0.36 [ −0.09 , 0.81 ] 375 mg,TOP−OPTHALM,day 4 0.73 1.16 0.44 0.82 0.29 [ −0.32 , 0.91 ] 250 mg,TOP−OPTHALM,day 16 0.13 0.21 0.04 0.12 0.09 [ −0.02 , 0.20 ] 250 mg,TOP−OPTHALM,day 13 0.29 0.46 0.09 0.24 0.20 [ −0.02 , 0.42 ] 250 mg,TOP−OPTHALM,day 10 0.42 0.63 0.11 0.26 0.31 [ 0.02 , 0.61 ] 250 mg,TOP−OPTHALM,day 7 0.60 0.90 0.33 0.71 0.27 [ −0.22 , 0.76 ] 250 mg,TOP−OPTHALM,day 4 0.73 1.16 0.52 0.97 0.21 [ −0.44 , 0.85 ] RE Model for Subgroup 0.17 [ 0.11 , 0.24 ] Allen et al (1995), penicillin 300,000 U,S−CONJUNC,day 14 0.15 0.14 0.33 0.33 −0.18 [ −0.35 , −0.01 ] 300,000 U,S−CONJUNC,day 10 0.19 0.17 0.55 0.62 −0.36 [ −0.66 , −0.06 ] 300,000 U,S−CONJUNC,day 7 0.12 0.18 0.77 0.67 −0.65 [ −0.97 , −0.33 ] 300,000 U,S−CONJUNC,day 4 0.25 0.18 0.53 0.61 −0.28 [ −0.58 , 0.02 ] RE Model for Subgroup −0.34 [ −0.48 , −0.19 ] Angelos et al (2000), florfenicol 40 mg/kg,SC,day 21 1.00 1.78 0.11 0.22 0.89 [ 0.40 , 1.37 ] 40 mg/kg,SC,day 14 2.53 4.16 0.27 0.53 2.25 [ 1.05 , 3.46 ] 40 mg/kg,SC,day 7 2.76 4.12 0.50 0.84 2.27 [ 1.12 , 3.42 ] 20 mg/kg,IM,day 21 1.00 1.78 0.03 0.06 0.97 [ 0.49 , 1.45 ] 20 mg/kg,IM,day 14 2.53 4.16 0.35 0.66 2.18 [ 0.97 , 3.38 ] 20 mg/kg,IM,day 7 2.76 4.12 1.20 2.13 1.56 [ 0.29 , 2.83 ] RE Model for Subgroup 1.62 [ 1.22 , 2.01 ] Dueger et al (2004), ceftiofur 6.6 mg/kg,SC,day 3−4 0.04 0.09 0.07 0.16 −0.03 [ −0.10 , 0.04 ] 6.6 mg/kg,SC,day 21 0.10 0.21 0.02 0.09 0.08 [ 0.01 , 0.15 ] 6.6 mg/kg,SC,day 17−18 0.19 0.38 0.06 0.20 0.13 [ 0.00 , 0.27 ] 6.6 mg/kg,SC,day 14 0.23 0.34 0.08 0.25 0.15 [ 0.02 , 0.29 ] 6.6 mg/kg,SC,day 10−11 0.26 0.35 0.10 0.26 0.16 [ 0.02 , 0.30 ] 6.6 mg/kg,SC,day 7 0.27 0.39 0.11 0.24 0.16 [ 0.02 , 0.30 ] 6.6 mg/kg,SC,day 3−4 0.28 0.35 0.17 0.24 0.11 [ −0.03 , 0.24 ] RE Model for Subgroup 0.10 [ 0.07 , 0.13 ]

−1.00 0.00 1.00 2.00 3.00 4.00

Treatment ineffective Mean difference Treatment effective

Figure 7: Mean di↵erence for surface area - all available pairwise treatments that included a non- active arm. 128

CHAPTER 5. GENERAL CONCLUSIONS

General Discussion

Overall, the results of this thesis indicate that important study design considerations or characteristics are often lacking if not entirely unknown from a number of RCTs and case- control studies in the veterinary science. Specifically, the case-control survey suggested there is a fair amount of confusion as to the conduct of this deceptively simple study design. From our sample, nearly 40% of studies that assessed a disease etiology failed to sample study subjects on the basis of an outcome – the defining feature of the case-control design. Moreover no studies reported additional design features that would enable a more interpretable outcome. The systematic review and meta-analysis did not fair much better with regards to the implementation and reporting of key design features of RCTs. From roughly 100 manuscripts examined for relevance in the systematic review, less than 20% randomized subject to treatment. Of the trials that reported the most basic of design and statistical considerations (e.g. numbers allocated to treatment arms, measures of variation provided), less than half reported the blinding of outcome assessment. Randomization and blinding of outcome assessment are typically fundamental aspect of a well-conducted RCT.

Failure to implement and thoughtfully report all design and analysis aspects of any research study have major implications: 1) the ability to make an accurate and valid inference may be hindered, 2) proper explanation of a more interpretable effect measure may be squandered, 3) misidentification of the “true” study design could lead to exclusion from systematic reviews resulting in the loss of potentially valuable data, and 4) perpetuate unsuitable or incorrect study design descriptions in the veterinary literature at large. 129

Recommendations for Future Work

The results and implications of this thesis suggest the need for multiple streams of future work and research. Firstly, based on the results of the case-control survey, it is highly likely similar misconceptions are occurring in other study designs within the veterinary science. Thus a similar or even more expansive examination of the other observational study designs (e.g. cohort, cross-sectional) would be of value. This is especially true of the cohort design as this observational design is sometimes included in systematic reviews and meta-analyses. More investigations into what is occurring with these design choices in the veterinary science may lead to the development of veterinary specific guidelines for the reporting of studies. Ultimately this will benefit improved animal health through valid inference and accurate research synthesis.

Secondly, as a result of the propensity amongst clinicians in choosing the case-control design to efficiently examine a disease etiology, in-depth training and teaching of the case-control design is crucial in maximizing the inferential value of the research. As mentioned in the discussion of

Chapter III, perhaps a more effective way in which to teach observational study design is to employ a simpler classification system that rejects directionality as a foundation. Finally, additional research is needed to determine how to properly include effect measures where the design considerations suggest a more interpretable and attractive estimation is possible.