<<

Reliability of Examination Todd A. Florin, MD, MSCE,a,​ b​ Lilliam Ambroggio, PhD, MPH,​b,​c,​d Cole Brokamp, PhD,c​ Mantosh S. Rattan, MD,​b,​e FindingsEric J. Crotty, MD,b,​ e​ Andrea Kachelmeyer,in Suspected BS, CCRP,a​ Richard M. Ruddy, Community- MD,a,​ b​ Samir S. Shah, MD, MSCEa,​b,​d,​f Acquired BACKGROUND: abstract

The authors of national guidelines emphasize the use of history and examination findings to diagnose community-acquired pneumonia (CAP) in outpatient children. Little is known about the interrater reliability of the in children with METHODS: suspected CAP. This was a prospective cohort study of children with suspected CAP presenting to a pediatric emergency department from July 2013 to May 2016. Children aged 3 months ≤ to 18 years with lower respiratory signs or symptoms who received a chest radiograph were included. We excluded children hospitalized 14 days before the study visit and those with a chronic medical condition or aspiration. Two clinicians performed independent ’ κ examinations and completed identical forms reporting examination findings. Interrater reliability for each finding was reported by using Fleiss kappa ( ) for categorical variables RESULTS: and intraclass correlation coefficient (ICC) for continuous variables.κ κ – No examination finding had substantial agreement ( /ICC > 0.8). Two findings (retractions, wheezing) had moderate to substantial agreement ( /ICC = 0.6 0.8). Nine findings (abdominal , pleuritic pain, nasal flaring, skin color, overall impression, κ – cool extremities, , , and /rales) had fair to moderate agreement ( /ICC = 0.4 0.6). Eight findings (capillary refill time, , rhonchi, head κ – bobbing, behavior, grunting, general appearance, and decreased breath sounds) had poor to fair reliability ( /ICC = 0 0.4). Only 3 examination findings had acceptable agreement, with CONCLUSIONS: the lower 95% confidence limit >0.4: wheezing, retractions, and respiratory rate. In this study, we found fair to moderate reliability of many findings used to diagnose CAP. Only 3 findings had acceptable levels of reliability. These findings must be NIH considered in the clinical management and research of pediatric CAP.

Divisions of aEmergency , cBiostatistics and Epidemiology, dHospital Medicine, and fInfectious , What’s Known on This Subject: The authors of and eDepartment of Radiology, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio; and bDepartment national guidelines emphasize the use of history and of Pediatrics, College of Medicine, University of Cincinnati, Cincinnati, Ohio examination findings to diagnose community-acquired pneumonia (CAP) in outpatient children. Little is known Dr Florin conceptualized and designed the study, interpreted the data, and drafted the initial about the interrater reliability of the physical examination manuscript; Drs Ambroggio, Ruddy, and Shah played substantial roles in study design and data in children with suspected CAP. interpretation and critically reviewed the manuscript; Dr Brokamp conducted the initial data analyses, interpreted the data, and critically reviewed the manuscript; Drs Rattan and Crotty What This Study Adds: In this study, we found fair to interpreted all chest radiographs in this study (data acquisition) and critically reviewed the moderate reliability of many findings used to diagnose manuscript; Ms Kachelmeyer coordinated and supervised data collection and critically reviewed CAP. Only 3 of the 19 findings examined had acceptable the manuscript; and all authors approved the final manuscript as submitted and agree to be reliability. The fair reliability of the examination must be accountable for all aspects of the work. Dr Florin and Dr Brokamp had full access to all the data considered in the clinical management and research of in the study and take responsibility for the integrity of the data and the accuracy of the data pediatric CAP. analysis. To cite: Florin TA, Ambroggio L, Brokamp C, et al. Reliability of Examination Findings in Suspected Community-Acquired Pneumonia. Pediatrics. 2017;140(3):e20170310

Downloaded from www.aappublications.org/news by guest on September 25, 2021 PEDIATRICS Volume 140, number 3, September 2017:e20170310 Article

Florin et al https://doi.org/10.1542/peds.2017-0310 September 2017 Reliability of Examination Findings in Suspected Community-Acquired Pneumonia 3 140 Pediatrics 2017 ROUGH GALLEY PROOF Children with a respiratory illness in pediatric CAP, our objective was or chronic medical conditions commonly require emergency to evaluate the IRR of examination known to predispose to severe department (ED) evaluation. findings specifically for children with or recurrent pneumonia (eg, Distinguishing community-acquired suspected CAP. We hypothesized that immunodeficiency, chronic pneumonia (CAP) from other examination measures frequently corticosteroid use, , causes of respiratory illness can be used to diagnose CAP in children chronic , , challenging. Consequently, there is would demonstrate only fair to , congenital substantial variation in ED evaluation moderate IRR. disease, tracheostomy-dependent of children with CAP because Methods patients, and neuromuscular clinicians attempt to distinguish disorders affecting the ) were upper from lower respiratory tract Study Design not included, nor were children with

(LRTI) 1,and2​ viral from a history of aspiration or aspiration bacterial causes. ‍ pneumonia. These criteria were This study is part of an ongoing developed to include otherwise The Pediatric Infectious Diseases prospective cohort study of children healthy children with suspected CAP. Society and Infectious Diseases with suspected CAP called Catalyzing Patients who were enrolled within 30 Society of America published Ambulatory Research in Pneumonia days before the study ED visit were consensus guidelines for CAP Etiology and Diagnostic Innovations excluded to ensure a distinct episode management in children to in Emergency Medicine (CARPE ’ of infection during the study visit. minimize unnecessary variation DIEM). Patients who presented to the Study Protocol and encourage judicious3 use of ED at Cincinnati Children s Hospital testing and antibiotics. Citing Medical Center (CCHMC) and were high-quality evidence, the authors enrolled in CARPE DIEM from July The current study was composed of the guidelines provide a strong 2013 to May 2016 were eligible of a convenience sample of patients recommendation that routine chest for inclusion in this analysis. The from the CARPE DIEM cohort for radiographs (CXRs) are not necessary study was approved by the CCHMC whom 2 clinicians were able to to confirm suspected CAP in patients ’ Institutional Review Board. Informed perform independent physical well enough to be cared for at consent was obtained from all legal examinations. The patient s primary home after office or ED evaluation, ≥ guardians of patients and assent from treating clinician (attending stressing that the clinician should Schildrentudy Population 11 years of age. physician, pediatric emergency diagnose CAP by using historical and medicine [PEM] fellow, or nurse examination findings. practitioner), the first assessor, Given the importance of the physical Children 3 months to 18 years of age completed a standardized physical examination in the diagnosis of with of LRTI examination on a form. children with suspected CAP, it is who received a CXR for suspicion A second assessor completed an critical that examination findings are of CAP were enrolled. Signs and examination on the same child reliable. Interrater reliability (IRR), symptoms of LRTI were defined as and an identical form independent the ability to obtain similar results having one or more of the following: of the first assessor. The second by different examiners under similar new or different cough, new or assessor was identified on the basis conditions, is a key component of any different production, chest of available clinicians in the ED examination measure. If a measure pain, dyspnea and/or shortness at the time of enrollment; it could has poor IRR, it cannot be used to of breath, documented tachypnea, have been a fellow working with the make an accurate diagnosis that or abnormal findings consistent first assessor or another clinician

is generalizable across providers with LRTI on physical examination9 working elsewhere in the ED. It and practice settings. Among adults (eg, rales/crackles, wheezing). was required that the examination with CAP, the IRR of examination There are no uniform standards was not discussed between the 2 findings was highly variable, with for use of CXRs with suspected assessors before completion of the

most4 demonstrating poor to fair pneumonia at our institution; case report forms. Ideally, both IRR. Studies of children with cough, in 2016, 83% of children with a examinations would be performed , and have diagnosis of pneumonia received a before knowledge of the CXR results; ≤ revealed mixed results,– ranging CXR in our ED. We excluded children however, this was challenging

from poor to good reliability5 8 of hospitalized 14 days before the for practical reasons of clinical examination findings. ‍ Given the index ED visit to exclude possible flow in a busy ED. In our study, variability in these respiratory hospital-acquired pneumonia. 71% of first assessors and 63% of processes and the paucity of evidence Children with immunocompromising second assessors knew radiograph Downloaded from www.aappublications.org/news by guest on September 25, 2021 2 Florin et al

Florin et al https://doi.org/10.1542/peds.2017-0310 September 2017 Reliability of Examination Findings in Suspected Community-Acquired Pneumonia 3 140 Pediatrics 2017 ROUGH GALLEY PROOF results before the examination. The blinded to all clinical information 122 paired examinations were

research coordinators encouraged and outcomes. A patient was required. Analyses22 were performed completion of the 2 examinations considered to have radiographic by using R (v3.3). within 20 minutes of each other. In pneumonia when both radiologists Results addition, no respiratory treatments, agreed that radiograph findings including nebulized favored pneumonia (as opposed to or nasal suctioning, should have atelectasis or normal findings). In ’ Paired assessments were performed occurred between the 2 examinations cases of disagreement, the attending on 128 children. Assessments were to mitigate the effect of these radiologist s impression on the completed within 20 minutes of each treatments on examination findings. clinical report at the time of the study Measurements other in 96.5% of children; 94.7% visit acted as a tiebreaker. Data Analysis had no treatments between evaluations. The study cohort for The case report forms recorded IRR was similar to the overall CARPE specific examination findings ’ κ DIEM cohort in age, sex, medical IRR for categorical findings was relevant to a child presenting with history, presence of radiographic assessed by using Fleiss kappa ( ), suspected CAP. These findings pneumonia, and primary billing and its 95% confidence interval were selected on the basis of – κ diagnoses (Supplemental Table 3). (CI) was estimated by using 1000 an extensive literature review 14 16 For the first assessment, 2 bootstrap replicates. ‍ ‍ accounts and expert opinions of faculty assessments (2%) were performed for variability across many raters physicians in emergency medicine, by nurse practitioners, 46 (36%) by – because many different physicians hospital medicine, and infectious κ PEM fellow physicians, 40 (31%) 3,10​ 13 performed physical examinations as diseases. ‍ ‍ Examination findings by attending physicians with <5 part of CARPE DIEM. is reported included general appearance (ie, years of practice after completion of on a 0 to 1 scale, with 0 indicating ≥ how the patient appeared to the training, and 40 (31%) by attending poor agreement and 1 indicating clinician on first observation), κ – physicians with 5 years of practice near-perfect agreement (poor to behavior, perfusion (skin color, κ – after completion of training. For the slight agreement: = 0.00 0.20; fair cool extremities, capillary refill, and κ – second assessment, 3 (2%) were agreement: = 0.21 0.40; moderate peripheral quality), respiratory κ – performed by nurse practitioners, 58 agreement: = 0.41 0.60; substantial signs (cough, nasal flaring, κ – (45%) by PEM fellow physicians, 21 agreement: = 0.61 0.80; and near- retractions, grunting, crackles/ 17 (17%) by attending physicians with perfect agreement: = 0.81 1.00). rales, rhonchi, wheezing, head <5 years of practice after completion On the basis of previous literature, ≥ bobbing, decreased breath sounds, of training, and 46 (36%) by we defined acceptable agreement as pleuritic , , physicians with 5 years of practice at least moderate agreement, with tachypnea, and respiratory rate), κ ≥ – after completion of training. ’ the lower 95% confidence limit of and overall clinical impression (ie, 17 21 0.4. ‍ ‍ IRR for continuous The prevalence of findings as the clinician s final impression after findings was assessed by using the determined by the clinician caring taking all history and examination intraclass correlation coefficient for the patient (ie, first assessor) factors into account). Respiratory (ICC). We repeated the above is reported in Table 1. Raw overall rate was counted by the individual analyses for 2 subgroups: (1) those agreement for physical examination clinicians at the time of the with and without radiographic findings ranged from 52% (behavior) examination. To capture real-world pneumonia and (2) those younger to 96% (cool extremities and interpretation of the physical than and older than 5 years of age. We peripheral ) (Fig 1). No examination, we did not provide anticipated that some examination examination finding had substantial detailed definitions of findings. For findings would rarely be abnormal to near-perfect agreement. Two certain findings, such as pleuritic (eg, peripheral pulses) and thus have findings (retractions and wheezing) chest pain, assessment is challenging κ – high measures of raw agreement had moderate to substantial because of developmental “ ” while not necessarily achieving at agreement ( = 0.6 0.8). Eight immaturity. We included options κ “ ” “ ” least moderate agreement with the findings (abdominal pain, pleuritic such as too young to assess,​ lower 95% confidence limit of . pain, nasal flaring, skin color, unable to assess,​ or unknown for κ With our sample size calculations, we overall impression, cool extremities, these findings. κ – determined that to achieve a of 0.7 tachypnea, and crackles/rales) had To define radiographic pneumonia, with a lower limit of the 95% CI of moderate agreement ( = 0.4 0.6). CXRs were independently reviewed 0.4 and conservatively assuming the The ICC for respiratory rate was 0.58 by 2 radiologists (M.S.R. and E.J.C.) prevalence of a finding is 5%, (95% CI 0.41 to 0.72), also indicating Downloaded from www.aappublications.org/news by guest on September 25, 2021 PEDIATRICS Volume 140, number 3, September 2017 3

Florin et al https://doi.org/10.1542/peds.2017-0310 September 2017 Reliability of Examination Findings in Suspected Community-Acquired Pneumonia 3 140 Pediatrics 2017 ROUGH GALLEY PROOF TABLE 1 Prevalence of Examination Findings by Primary Clinician Finding Presence of Finding by Primary Assessor, κ − n (%) on diffuse versus focal wasn 86% with a of 0.52 (95% CI, 0.06 to 0.9). For General examination κ decreased breath sounds ( = 34), General appearance − Well 35 (27) raw agreement was 71% with a of Mildly ill or distressed 40 (31) 0.33 (95% CI, 0.02 to 0.67). Once it Moderately ill or distressed 49 (38) was agreed that focal findings were Severely ill or distressed 4 (3) present, the reliability of location was Behavior n moderate to substantial. For crackles Playing and appropriate 33 (26) κ Quiet but appropriate 57 (45) ( = 21), left-sided findings had a Sleeping but easily arousable 11 (9) raw agreement of 86%n with a Fussy but consolable 20 (16) of 0.69 (95% CI, 0.3 to 1.0), and κ Irritable 6 (5) right-sided findings ( = 21) had Lethargic, confused, or reduced response to pain 1 (1) a raw agreement of 90% withn a Crackles 49 (38) of 0.81 (95% CI, 0.53 to 1.0). For Decreased breath sounds 55 (43) decreased breath sounds ( = 18), κ Wheezing 41 (32) left-sided findings had a raw Retractions 73 (57) agreement of 89% with a of 0.77 Rhonchi 43 (34) (95% CI, 0.4 to 1.0), and right-sided Tachypneic 89 (70) κ If tachypneic, respiratory rate, mean breaths per 46 (12) findings had a raw agreement of 78% minute (SD) with a of 0.54 (95% CI, 0.11 to 0.89). Nasal flaring 30 (23) Pleuritic chest pain 14 (27) ‍Table 2 illustrates the prevalence of Grunting 14 (11) examination findings by the primary Head bobbing 9 (7) Observed cough 68 (53) treating clinician (ie, first assessor) Cardiovascular and perfusion examination for the 2 subgroup analyses: those Cool extremities 4 (3) with and without radiographic ≥ Skin color pneumonia and those <5 years Pink and/or normal 109 (85) of age or 5 years of age. There Pale or dusky 18 (14) Mottled 1 (1) were no substantial differences Capillary refill time (s) in IRR (ie, there is overlap of all 1–2 109 (85) CIs) of findings in those with and 3 19 (15) without radiographic pneumonia 4 0 (0) (Supplemental Fig 2). There were ≥5 0 (0) no substantial differences in IRR Peripheral pulses ≥ Normal 124 (98) of findings in those <5 years old Bounding 3 (2) and those 5 years of age, with the exception of retractions, for which Abdominal pain or 14 (11) κ Overall impression the IRR appears substantially greater in older children ( = 0.81; 95% CI, Mild 55 (43) κ Moderate 63 (50) 0.61 to 0.96) compared with younger Severe 9 (7) children ( = 0.42; 95% CI, 0.20 to 0.63) (Supplemental Fig 3). Discussion moderate agreement. Eight findings patients; therefore, we only report (capillary refill time, cough, rhonchi, the raw agreement of 96%. head bobbing, behavior, grunting, We found fair to moderate IRR for general appearance, and decreased κ – If auscultatory findings were most physical examination findings breath sounds) had poor to fair present, examiners were asked in children who presented to the reliability ( or ICC = 0 0.4). Only 3 whether findings were diffuse or ED with suspected CAP. Three examination findings had acceptable focal. Analyses of focal findings were findings (wheezing, retractions, and agreement: wheezing, retractions, limited by sample size. For wheezing, respiratory rate) met the definition– and respiratory rate (Fig n1). One there weren no focal findings. For of acceptable reliability, 17with21 a lower finding, peripheral pulses, was rhonchi, there was 1 focal finding. For confidence limit of >0.4. ‍ IRR was rated as normal in 97% ( = 124) of crackles ( = 28), the raw agreement largely similar between the strata Downloaded from www.aappublications.org/news by guest on September 25, 2021 4 Florin et al

Florin et al https://doi.org/10.1542/peds.2017-0310 September 2017 Reliability of Examination Findings in Suspected Community-Acquired Pneumonia 3 140 Pediatrics 2017 ROUGH GALLEY PROOF reflecting the– heterogeneity of

respiratory5 findings8,23​ and populations examined. ‍‍ Our results in children with suspected CAP are generally consistent with previous studies in other respiratory κ conditions. The IRR for respiratory rate ranges from or ICC of– 0.58– to

0.95, although values as low5 8,23​ as27 0.12 have been reported. ‍‍ ‍ ‍ Similarly, retractions and wheezing κ also have large ranges of reported κ – statistics across studies – – – ( = 0.25 0.77 for retractions,5 8,23​ 27 0.31 0.78 for wheezing). ‍ ‍ ‍ ‍ ‍

Several other examination findings that are considered important in making the clinical diagnosis of pneumonia (eg, crackles/rales and decreased breath sounds) demonstrated poor to moderate IRR. In adults with suspected CAP,

the IRR for crackles/rales and 4 rhonchi ranged from 0 to 0.64. The reliability of auscultatory sounds other than wheezing is not often reported in children. The authors of 1 study of examination findings in κ preschoolers with cough found a of 0.39 (95% CI, 0.26 to 0.53)

for abnormal chest findings, which7 included wheezing or crackles. Decreased breath sounds had only κ fair IRR in our study, which is in – contrast to reported statistics in 28 children with asthma (0.67 0.76). Differences between the IRR in children with suspected CAP and those with other conditions, such as asthma, likely reflect differences in pathophysiology. Given the importance placed on these findings in clinically diagnosing CAP in children, it is concerning that they FIGURE 1 demonstrated only fair reliability. Agreement and IRR of examination findings in children with suspected CAP. Dots with bars represent the κ statistic with 95% CIs. Triangles represent raw agreement. The dotted line represents the acceptable level of agreement for the lower 95% confidence limit κ( < 0.40). There are several possible reasons for the modest IRR of examination findings in children with suspected ≥ (pneumonia versus not, age <5 years respiratory examination findings CAP. Clinical providers have vs 5 years). focused on children with asthma, different descriptions for the bronchiolitis, or wheezing. The adventitious sounds discovered Authors of previous studies who authors of these previous studies on lung (eg, coarse examined the IRR of pediatric demonstrate conflicting results, crackles versus rhonchi). For Downloaded from www.aappublications.org/news by guest on September 25, 2021 PEDIATRICS Volume 140, number 3, September 2017 5

Florin et al https://doi.org/10.1542/peds.2017-0310 September 2017 Reliability of Examination Findings in Suspected Community-Acquired Pneumonia 3 140 Pediatrics 2017 ROUGH GALLEY PROOF TABLE 2 Prevalence of Examination Findings by Radiographic Pneumonia Diagnosis and Age Finding Radiographic Pneumonia, n (%) Age, n (%) No Pneumonia (n = Pneumonia (n = 23) <5 y (n = 81) ≥ 5 y (n = 47) 105) General examination General appearance Well 31 (30) 4 (17) 19 (23) 16 (34) Mildly ill or distressed 39 (37) 10 (43) 34 (42) 15 (32) Moderately ill or distressed 32 (30) 8 (35) 25 (31) 15 (32) Severely ill or distressed 3 (3) 1 (4) 3 (4) 1 (2) Behavior Playing and appropriate 28 (27) 5 (22) 17 (21) 16 (34) Quiet but appropriate 45 (43) 12 (52) 31 (38) 26 (55) Sleeping but easily arousable 10 (10) 1 (4) 9 (11) 2 (4) Fussy but consolable 17 (16) 3 (13) 18 (22) 2 (4) Irritable 5 (5) 1 (4) 6 (7) 0 Lethargic, confused, or reduced 0 1 (4) 0 1 (2) response to pain Respiratory examination Crackles 35 (33) 14 (61) 35 (43) 14 (30) Decreased breath sounds 36 (34) 19 (83) 24 (30) 31 (66) Wheezing 36 (34) 5 (22) 30 (37) 11 (23) Retractions 60 (57) 13 (57) 55 (68) 18 (38) Rhonchi 38 (36) 5 (22) 35 (43) 8 (17) Tachypneic 72 (69) 17 (74) 61 (75) 28 (60) Pleuritic chest pain 7 (20) 7 (41) 0 (0) 14 (33) Nasal flaring 24 (23) 6 (26) 21 (26) 9 (19) Grunting 11 (10) 3 (13) 10 (12) 4 (9) Head bobbing 9 (9) 0 (0) 8 (10) 1 (2) Observed cough 51 (49) 17 (74) 44 (54) 24 (51) Cardiovascular and perfusion examination Cool extremities 2 (2) 2 (9) 4 (5) 0 (0) Skin color Pink and/or normal 91 (87) 18 (78) 69 (85) 40 (85) Pale or dusky 13 (12) 5 (22) 11 (14) 7 (15) Mottled 1 (1) 0 1 (1) 0 Capillary refill time (s) 1–2 91 (87) 18 (78) 69 (85) 40 (85) 3 14 (13) 5 (22) 12 (15) 7 (15) Peripheral pulses Normal 102 (97) 22 (96) 80 (99) 44 (96) Bounding 2 (2) 1 (4) 1 (1) 2 (4) Abdominal examination Abdominal pain 10 (10) 4 (17) 4 (5) 10 (22) Overall impression Mild 51 (49) 4 (17) 35 (43) 20 (44) Moderate 46 (44) 17 (74) 40 (49) 23 (50) Severe 8 (8) 1 (4) 6 (7) 3 (6) Findings reflect the examination of the primary treating clinician (ie, first assessor).

example, the authors of 1 study suggesting that descriptions of contributing to the observed fair to of 12 physicians classifying lung auscultation findings differ across moderate IRR. It is also likely that κ 29 audiovisual recordings found poor to physicians. To address this, we physicians use different terms (eg, “ fair agreement ( < 0.4) when using provided general descriptions of rhonchi versus coarse crackles) for ” “ detailed descriptions of adventitious examination findings (eg, crackles/ the same auscultatory findings or ” “ ” sounds, such as coarse crackles rales instead of coarse crackles/ that findings are misinterpreted or versus fine crackles. Interrater rales vs fine crackles/rales ) to indistinguishable (eg, transmitted agreement improved when findings capture real-world clinical practice. upper airway sounds versus lower κ were combined in more general Despite keeping terminology respiratory rhonchi). The poor to fair κ terms (crackles overall [ = 0.62] general, clinicians likely labeled reliability for rhonchi and crackles and overall [ = 0.59]), auscultation findings differently, suggests that this might be the case. Downloaded from www.aappublications.org/news by guest on September 25, 2021 6 Florin et al

Florin et al https://doi.org/10.1542/peds.2017-0310 September 2017 Reliability of Examination Findings in Suspected Community-Acquired Pneumonia 3 140 Pediatrics 2017 ROUGH GALLEY PROOF κ Age may also play a role in explaining national guidelines recommend of positive findings, resulting in a our results because certain findings against the use of radiography to high agreement with lower , is head κ may be more reliable on the basis confirm suspected CAP in children3 bobbing. High agreement with low of age because of differences in size treated in the outpatient setting. If is a known phenomenon in cases and pathophysiology. We found the examination is not sufficiently in which there are imbalances in the reliability of retractions was reliable, it will have limited use in the prevalence of findings, which substantially higher in older children clinical practice, thus contributing to can occur if the prevalence of a compared with those <5 years of age, the wide variation in management finding approaches 0% or 100%. It κ likely because of retractions being of children who present to the ED1, 2​ has been shown that low values of more pronounced in larger children with febrile respiratory illnesses. ‍ because of marginal imbalances and adolescents. Certain examination Furthermore, because the Infectious in the prevalence of findings, even findings were also more prevalent by Diseases Society of America and when absolute agreement is high, age, such as retractions and rhonchi Pediatric Infectious Diseases cannot be dismissed as an unfair in younger children, likely reflecting Society pneumonia guidelines do penalty and should be interpreted the higher prevalence of viral lower not recommend radiography in the as truly indicating lesser degrees of

respiratory tract disease in this age outpatient setting, the reliance of agreement36 beyond those expected by group. examination findings with limited chance. Our sample size calculation reliability to diagnose CAP has the was based on a prevalence rate of The fair reliability that we observed potential to increase antibiotic 5%; therefore, for a few findings may also be due to changes in overuse, resulting in the spread of with lower prevalence (ie, cool physical examination education, antimicrobial resistance, antibiotic- extremities, head bobbing, and practice, and precision over time. associated adverse effects, and peripheral pulses), the study was not Advanced imaging and laboratory increased cost. It is reassuring that adequately powered. For findings testing has potentially supplanted the findings of respiratory rate with prevalence <5% and those and retractions, often abnormal of our subanalyses with sample the examination for the current30 generation of physicians. in pneumonia, had acceptable sizes <122, they may have been Researchers for 1 study who reliability. However, the limited classified as unacceptable, but with examined the competency of the reliability of many other findings an increased prevalence or sample across years that are hallmarks of the clinical size, it may be that these findings are of training found that cardiac diagnosis of CAP suggests that indeed acceptable because the lower examination skills improve from either interventions to improve CI potentially may increase. the beginning to the end of medical examination skills are necessary, school, but skills do not improve after a more standardized approach to Some children may not have had the third year of medical school and the diagnosis is required, or more clear pneumonia on a CXR at time of 31 objective tools are needed to aid in visit, causing us to misclassify these may decline after years in practice. κ The conflict over advocating that CAP diagnosis in children. patients as not having pneumonia. This would not affect the overall clinical skills are of less importance ∼ because of the improvements in This study has several limitations. because all patients were included in diagnostic technologies competes The study was conducted in an those calculations. Similarly, 20% with others stressing that clinical academic pediatric ED; thus, our of our cohort had radiographic results may not be generalizable to pneumonia, limiting our ability to history and examination32,33​ findings are still critical. ‍ In the case of other clinical settings. Several facets provide definitive results of IRR in of the respiratory examination may those with radiographic pneumonia. pediatric pneumonia, the reference κ diagnostic standard (chest change, even in a brief time, which It was our objective, however, to may underestimate . Given that examine the IRR of examination radiography)34 has moderate IRR and is imprecise. Thus, the diagnosis almost all of our paired examinations findings in suspected pneumonia, occurred within 20 minutes of each which reflects real-world conditions is often a clinical one rather than κ one based on imaging or laboratory other, this is unlikely to substantially of performing the examination first testing, further emphasizing the alter our results. The statistic is and using these findings to decide affected by the prevalence of the if a radiograph or treatment of importance of a reliable physical κ examination. examination finding; for uncommon pneumonia is warranted. There were findings, low values may not 7 patients who received breathing

These findings have several necessarily 35reflect low levels of treatments between examinations, important implications for clinical agreement. An example of a which was a protocol deviation care and research. Authors of finding that had a low prevalence and not an exclusion criterion. We Downloaded from www.aappublications.org/news by guest on September 25, 2021 PEDIATRICS Volume 140, number 3, September 2017 7

Florin et al https://doi.org/10.1542/peds.2017-0310 September 2017 Reliability of Examination Findings in Suspected Community-Acquired Pneumonia 3 140 Pediatrics 2017 ROUGH GALLEY PROOF performed a sensitivity analysis to generalizability. In addition, supervision; the Clinical Research removing those 7 children, and despite the competing demands of Coordinator team, physicians, and results did not change substantively, a busy ED, nearly all examinations advanced practice and staff nurses of suggesting that these treatments did occurred within 20 minutes and the Division of Emergency Medicine not alter the IRR of the examination without intervening treatments. at CCHMC who assisted with study findings (Supplemental Fig 4). A Our ability to incorporate a large enrollment and procedures; and large proportion of assessors knew number of examination findings all the patients and families who the CXR results, which may have that are commonly used to diagnose participated in CARPE DIEM. influenced reporting of certain CAP clinically makes this the most examination findings. The proportion extensive study to our knowledge to Abbreviations of first and second assessors privy address the reliability of examination to radiograph results was similar, findings in suspected CAP. however, and thus we would not Conclusions CAP: community-acquired expect differential bias between pneumonia assessors. In addition, some second CARPE DIEM: Catalyzing assessors may have been privy We found fair to moderate reliability Ambulatory to more clinical information than of many findings to be Research in others; however, given the random important to the clinical diagnosis Pneumonia selection of the second assessor, of pneumonia. Only 3 findings Etiology and knowledge of clinical history (retractions, wheezing, and Diagnostic and radiograph results would be respiratory rate) had acceptable Innovations in randomly distributed. Thus, there levels of IRR. The reliability of these Emergency is a low likelihood that systematic findings must be considered in the ’ Medicine bias would be introduced as a result. clinical management and research of CCHMC: Cincinnati Children s Finally, we did provide detailed children with CAP. Hospital Medical Center definitions of examination findings Acknowledgments CI: confidence interval to examiners because we intended to CXR: chest radiograph capture real-world interpretation of ED: emergency department the physical examination. We acknowledge Shreya Reddy for ICC: intraclass correlation her assistance with the literature coefficient Despite these limitations, our study review; Judd Jacobs and Jessi IRR: interrater reliability has several notable strengths, Lipscomb for data management LRTI: lower respiratory tract namely its prospective, real-world and programming support; infection approach. The large number of raters Kerry Aicholtz for administrative PEM: pediatric emergency involved in our study replicates the assistance; Mekibib Altaye, PhD, and medicine real-world setting and contributes Heidi Sucharew, PhD, for statistical

This study was presented in abstract form at the annual meeting of the Pediatric Academic Societies; April 30, 2016; Baltimore, MD. DOI: https://​doi.​org/​10.​1542/​peds.​2017-​0310 Accepted for publication Jun 1, 2017 Address correspondence to Todd A. Florin, MD, MSCE, Division of Emergency Medicine, Cincinnati Children’s Hospital Medical Center, 3333 Burnet Ave, ML 2008, Cincinnati, OH 45229. E-mail: [email protected] PEDIATRICS (ISSN Numbers: Print, 0031-4005; Online, 1098-4275). Copyright © 2017 by the American Academy of Pediatrics FINANCIAL DISCLOSURE: The authors have indicated they have no financial relationships relevant to this article to disclose. FUNDING: This work was supported by a pediatric research grant from The Gerber Foundation (to T.A.F.); the National Center for Research Resources and the National Center for Advancing Translational Sciences (grant 8 KL2 TR000078-05 to T.A.F.); the National Institute for and Infectious Diseases and the National Institutes of Health (grant 1 K23 AI121325-01 to T.A.F.); and a Trustee Award from Cincinnati Children’s Hospital Medical Center (to L.A.). The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; or decision to submit the manuscript for publication. Funded by the National Institutes of Health (NIH). POTENTIAL CONFLICT OF INTEREST: The authors have indicated they have no potential conflicts of interest to disclose.

Downloaded from www.aappublications.org/news by guest on September 25, 2021 8 Florin et al

Florin et al https://doi.org/10.1542/peds.2017-0310 September 2017 Reliability of Examination Findings in Suspected Community-Acquired Pneumonia 3 140 Pediatrics 2017 ROUGH GALLEY PROOF References 1. Florin TA, French B, Zorc JJ, Alpern 10. Lynch T, Platt R, Gouin S, Larson C, Identification of children at very ER, Shah SS. Variation in emergency Patenaude Y. Can we predict which low risk of clinically-important department diagnostic testing and children with clinically suspected injuries after head trauma: a disposition outcomes in pneumonia. pneumonia will have the presence of prospective cohort study. Lancet. Pediatrics. 2013;132(2):237–244 focal infiltrates on chest radiographs? 2009;374(9696):1160–1170 Pediatrics. 2004;113(3, pt 1). Available 2. Shah S, Bourgeois F, Mannix R, Nelson 21. Yen K, Kuppermann N, Lillis K, et al; at: www.​pediatrics.​org/​cgi/​content/​ K, Bachur R, Neuman MI. Emergency Intra-abdominal Injury Study Group full/​113/​3/​e186 department management of febrile for the Pediatric Emergency Care respiratory illness in children. Pediatr 11. Mahabee-Gittens EM, Grupp-Phelan Applied Research Network (PECARN). Emerg Care. 2016;32(7):429–434 J, Brody AS, et al. Identifying children Interobserver agreement in the clinical with pneumonia in the emergency assessment of children with blunt 3. Bradley JS, Byington CL, Shah SS, department. Clin Pediatr (Phila). abdominal trauma. Acad Emerg Med. et al; Pediatric Infectious Diseases 2005;44(5):427–435 2013;20(5):426–432 Society and the Infectious Diseases Society of America. The management 12. Bilkis MD, Gorgal N, Carbone M, et al. 22. R: A Language and Environment for of community-acquired pneumonia Validation and development of a Statistical Computing [computer in infants and children older than in clinically program]. Version 3.3. Vienna, Austria: 3 months of age: clinical practice suspected community-acquired R Foundation for Statistical Computing; guidelines by the Pediatric Infectious pneumonia. Pediatr Emerg Care. 2016 Diseases Society and the Infectious 2010;26(6):399–405 23. Angelilli ML, Thomas R. Inter-rater Diseases Society of America. Clin Infect 13. neuman MI, Monuteaux MC, Scully KJ, evaluation of a clinical scoring system Dis. 2011;53(7):e25–e76 Bachur RG. Prediction of pneumonia in children with asthma. Ann Allergy Asthma Immunol. 2002;88(2):209 214 4. Wipf JE, Lipsky BA, Hirschmann JV, et al. in a pediatric emergency department. – Diagnosing pneumonia by physical Pediatrics. 2011;128(2):246–253 24. Liu LL, Gallaher MM, Davis RL, Rutter examination: relevant or relic? Arch 14. Fung KP, Lee J. Bootstrap estimate CM, Lewis TC, Marcuse EK. Use of a Intern Med. 1999;159(10):1082–1087 of the variance and confidence respiratory clinical score among interval of kappa. Br J Ind Med. different providers. Pediatr Pulmonol. 5. Biondi EA, Gottfried JA, Dutko 1991;48(7):503 504 2004;37(3):243–248 Fioravanti I, Schriefer JA, Aligne CA, – Leonard MS. Interobserver reliability 15. Marin JR, Bilker W, Lautenbach 25. Eggink H, Brand P, Reimink R, Bekhof of attending physicians and bedside E, Alpern ER. Reliability of clinical J. Clinical scores for dyspnoea severity nurses when using an inpatient examinations for pediatric skin and in children: a prospective validation paediatric respiratory score. J Clin soft-tissue . Pediatrics. study. PLoS One. 2016;11(7):e0157724 Nurs. 2015;24(9–10):1320–1326 2010;126(5):925–930 26. Bekhof J, Reimink R, Bartels IM, Eggink H, Brand PL. Large observer variation 6. Gajdos V, Beydon N, Bommenel L, et al. 16. Zapf A, Castell S, Morawietz L, Karch of clinical assessment of dyspnoeic Inter-observer agreement between A. Measuring inter-rater reliability for wheezing children. Arch Dis Child. physicians, nurses, and respiratory nominal data - which coefficients and 2015;100(7):649 653 therapists for respiratory clinical confidence intervals are appropriate? – evaluation in bronchiolitis. Pediatr BMC Med Res Methodol. 2016;16:93 27. Stevens MW, Gorelick MH, Schultz T. Interrater agreement in the clinical Pulmonol. 2009;44(8):754–762 17. Landis JR, Koch GG. The measurement evaluation of acute pediatric asthma. 7. Hay AD, Wilson A, Fahey T, Peters of observer agreement for categorical J Asthma. 2003;40(3):311–315 TJ. The inter-observer agreement of data. Biometrics. 1977;33(1):159–174 28. Ducharme FM, Chalut D, Plotnick examining pre-school children with 18. Gorelick MH, Atabaki SM, Hoyle J, et al; L, et al. The pediatric respiratory acute cough: a nested study. BMC Fam Pediatric Emergency Care Applied assessment measure: a valid clinical Pract. 2004;5:4 Research Network. Interobserver score for assessing acute asthma agreement in assessment of clinical 8. Wang EE, Milner RA, Navas L, Maj H. severity from toddlers to teenagers. variables in children with blunt Observer agreement for respiratory J Pediatr. 2008;152(4):476–480, 480.e1 signs and oximetry in infants head trauma. Acad Emerg Med. 29. Melbye H, Garcia-Marcos L, Brand hospitalized with lower respiratory 2008;15(9):812–818 P, Everard M, Priftis K, Pasterkamp infections. Am Rev Respir Dis. 19. Holmes JF. Clinical prediction rules. H. Wheezes, crackles and rhonchi: 1992;145(1):106 109 In: Li G, Baker SP, eds. Injury Research: – simplifying description of lung sounds Theories, Methods, and Approaches. 9. Jain S, Williams DJ, Arnold SR, et al; increases the agreement on their New York, NY: Springer; 2012:317–336 CDC EPIC Study Team. Community- classification: a study of 12 physicians’ acquired pneumonia requiring 20. Kuppermann N, Holmes JF, Dayan classification of lung sounds from hospitalization among U.S. children. N PS, et al; Pediatric Emergency Care video recordings. BMJ Open Respir Engl J Med. 2015;372(9):835–845 Applied Research Network (PECARN). Res. 2016;3(1):e000136

Downloaded from www.aappublications.org/news by guest on September 25, 2021 PEDIATRICS Volume 140, number 3, September 2017 9

Florin et al https://doi.org/10.1542/peds.2017-0310 September 2017 Reliability of Examination Findings in Suspected Community-Acquired Pneumonia 3 140 Pediatrics 2017 ROUGH GALLEY PROOF 30. Feddock CA. The lost art of clinical an observational study. Lancet. 35. Feinstein AR, Cicchetti DV. skills. Am J Med. 2007;120(4):374–378 2003;362(9390):1100–1105 High agreement but low kappa: I. 31. Vukanovic-Criley JM, Criley S, Warde 33. Patel N, Ngo E, Paterick TE, The problems of two paradoxes. CM, et al. Competency in cardiac Chandrasekaran K, Tajik J. Should J Clin Epidemiol. 1990;43(6): examination skills in medical students, doctors still examine patients? Int 543–549 trainees, physicians, and faculty: a J Cardiol. 2016;221:55–57 36. Gorelick MH, Yen K. The kappa statistic multicenter study. Arch Intern Med. 34. Lynch T, Bialy L, Kellner JD, et al. A was representative of empirically 2006;166(6):610–616 systematic review on the diagnosis of observed inter-rater agreement for 32. Reilly BM. Physical examination pediatric bacterial pneumonia: when gold physical findings. J Clin Epidemiol. in the care of medical inpatients: is bronze. PLoS One. 2010;5(8):e11989 2006;59(8):859–861

Downloaded from www.aappublications.org/news by guest on September 25, 2021 10 Florin et al

Florin et al https://doi.org/10.1542/peds.2017-0310 September 2017 Reliability of Examination Findings in Suspected Community-Acquired Pneumonia 3 140 Pediatrics 2017 ROUGH GALLEY PROOF Reliability of Examination Findings in Suspected Community-Acquired Pneumonia Todd A. Florin, Lilliam Ambroggio, Cole Brokamp, Mantosh S. Rattan, Eric J. Crotty, Andrea Kachelmeyer, Richard M. Ruddy and Samir S. Shah Pediatrics originally published online August 23, 2017;

Updated Information & including high resolution figures, can be found at: Services http://pediatrics.aappublications.org/content/early/2017/08/21/peds.2 017-0310 References This article cites 34 articles, 6 of which you can access for free at: http://pediatrics.aappublications.org/content/early/2017/08/21/peds.2 017-0310#BIBL Subspecialty Collections This article, along with others on similar topics, appears in the following collection(s): Emergency Medicine http://www.aappublications.org/cgi/collection/emergency_medicine_ sub Infectious Disease http://www.aappublications.org/cgi/collection/infectious_diseases_su b Permissions & Licensing Information about reproducing this article in parts (figures, tables) or in its entirety can be found online at: http://www.aappublications.org/site/misc/Permissions.xhtml Reprints Information about ordering reprints can be found online: http://www.aappublications.org/site/misc/reprints.xhtml

Downloaded from www.aappublications.org/news by guest on September 25, 2021 Reliability of Examination Findings in Suspected Community-Acquired Pneumonia Todd A. Florin, Lilliam Ambroggio, Cole Brokamp, Mantosh S. Rattan, Eric J. Crotty, Andrea Kachelmeyer, Richard M. Ruddy and Samir S. Shah Pediatrics originally published online August 23, 2017;

The online version of this article, along with updated information and services, is located on the World Wide Web at: http://pediatrics.aappublications.org/content/early/2017/08/21/peds.2017-0310

Data Supplement at: http://pediatrics.aappublications.org/content/suppl/2017/08/22/peds.2017-0310.DCSupplemental

Pediatrics is the official journal of the American Academy of Pediatrics. A monthly publication, it has been published continuously since 1948. Pediatrics is owned, published, and trademarked by the American Academy of Pediatrics, 345 Park Avenue, Itasca, Illinois, 60143. Copyright © 2017 by the American Academy of Pediatrics. All rights reserved. Print ISSN: 1073-0397.

Downloaded from www.aappublications.org/news by guest on September 25, 2021