International Journal of Health Sciences and Research www.ijhsr.org ISSN: 2249-9571

Review Article

A Comparison of Diagnostic Imaging and Vibrating Tuning Forks in the Detection of Fractures

Derek Charles1*, Ashleigh Elkins*, Andrew Kneeburg*, Kelsey Nikkila*

1Assistant Professor, *Department of Physical Therapy, Tennessee State University 3500 John A Merritt Boulevard Nashville TN.

Corresponding Author: Derek Charles

ABSTRACT

INTRODUCTION: The purpose of this review was to determine the effectiveness of tuning forks compared to diagnostic imaging in ruling out and ruling in fractures. METHODS: Multiple databases including Ebscohost, Pub Med, and Sport Discus were utilized in the literature review. Keywords included tuning fork test, fracture detection, x-ray, ultrasound, MRI, bone scan, CT scan, and diagnostic imaging. Inclusion criteria included Oxford Level of Evidence 3 or higher, statistics on sensitivity, specificity, and reliability, peer-reviewed, and English only journals. Nine articles published between 1997 and 2016 were included in the synthesis of the results. RESULTS: For the tuning fork test, sensitivity ranged from 75% to 92%, specificity ranged from 18% to 94%, positive likelihood ratios were between 1.1 and 16.5, and negative likelihood ratios were between 0.09 and 1.62. Either pain induction or reduction of sound transmission while listening with a stethoscope was used for fracture detection, but there was no standardization in training or methodology. CONCLUSIONS: MRI continues to be the preferred method of fracture detection due to its high sensitivity and specificity. Computed tomography and bone scans also demonstrated high specificity and sensitivity. While a tuning fork is simple to administer and cheap, it has low diagnostic capabilities when used alone, and therefore other imaging is still necessary for confirmation whether the test is positive or negative.

KEY WORDS: tuning forks, fractures, x-rays, MRI, diagnostic imaging, diagnostic accuracy

International Journal of Health Sciences & Research (www.ijhsr.org) Vol.7; Issue: 8; August 2017

INTRODUCTION

Stress fractures are partial or incomplete fractures caused by repetitive weight bearing activities such as running, jumping, or marching and are classified into two types: fatigue and insufficiency. [1] A fatigue fracture is more common and is typically seen in an active population, such as runners or military recruits, where training intensity rapidly increases over a short period of time. Under stress a bone will generally deform from its normal shape and recoil without damage, but if a bone is loaded beyond its ability to withstand forces, it plastically deforms, which increases the potential for a fracture. [1] Osteoclastic activity outpaces osteoblastic activity, resulting in the gradual weakening and eventual failure of bone. This results in ischemia caused by repetitive loading and micro damage to the bone's capillary supply. [2] An insufficiency fracture is usually found in older adults, whereby normal stress is placed on a bone that is mineral deficient due to osteoporosis and the bone fractures from loading, often during normal activities of daily living. [1] In a review of the literature pertaining to the etiology and diagnosis of stress fractures, Reeder and colleagues reported new bone ossification takes up to 90 days to complete, exposing weakened bone to stress for extended periods of time and ultimately resulting in a fracture. [3]

Stress fractures can occur at any location but certain areas are more susceptible, depending on the nature of the activity. For example, the navicular is small in diameter and exposed to repetitive bouts of loading during walking, and even more in running. [3] Fractures result from the coupling of a flexible foot where the arch is flattened and the navicular repeatedly comes into contact with the ground, increasing the potential for injury. Other susceptible sites include the tibia, talus, sesamoid of the 1st toe, and base of the 5th metatarsal. [1, 3, 4]

Diagnostic imaging such as radiographs, bone scintigraphy, ultrasound, computed tomography, and MRI is traditionally used in the detection of bony injuries. [5-9] Tuning forks are a diagnostic tool that can theoretically detect fractures by transmitting vibrations through a bone; if a fracture is present, the patient should complain of pain due to the percussion. [4] The purpose of this review was to determine the effectiveness of tuning forks compared to diagnostic imaging in the detection of fractures.

METHODS

The review of the literature was conducted utilizing multiple databases, including Ebscohost, Pub Med, and Sport Discus. Keyword searches included tuning fork test, stress fractures, x-ray, ultrasound, MRI, bone scan, CT scan, and diagnostic imaging. Articles from the initial search were eliminated if it was determined the study did not examine fractures, tuning forks, or diagnostic imaging, or did not include statistics on reliability, sensitivity, or specificity. The abstracts of the remaining articles were reviewed by one of the five investigators and the following inclusion criteria were applied: Oxford Level of Evidence 3 or higher, peer-reviewed, and English only journals. The method of detection was either reproduction of pain or reduction of sound transmission while listening through a stethoscope. Nine articles published between 1997 and 2016 were included in the synthesis of the results.

RESULTS

Sensitivity

Overall, the tuning fork test exhibited moderate to high sensitivity. According to a systematic review by Mugunthan et al., [10] sensitivity ranged from 75% to 92% with common lower extremity stress fractures when performed on homogeneous populations. However, while a positive tuning fork test could be sufficient to warrant the initiation of treatment while waiting for the results of imaging, it does not display consistent or sufficient sensitivity to rule out a stress fracture if no pain was present during the test.

Specificity

The results yielded a wide spread in the specificity, or the ability of a tuning fork test to rule in a fracture. According to the same review by Mugunthan et al., [10] specificity varied from 18% to 94%, meaning there is high potential for false positive results. The significant fluctuations in specificity were likely because pain can occur even without an actual fracture, making the tuning fork method unreliable for ruling in a fracture. [11, 12]

Likelihood Ratios

The likelihood ratios provided additional information about swings in probability for potential fractures. [13] Positive likelihood ratios were between 1.1 and 16.5 while negative likelihood ratios were between 0.09 and 1.62. [10-12, 14] A positive LR greater than 10 indicated a strong shift towards the probability of a fracture while a negative LR less than 0.10 indicated a strong shift towards the nonexistence of a fracture. [13] The large range in likelihood ratios does not allow any definitive clinical associations. It should be noted the study that concentrated on the use of a 128 Hz tuning fork with suspected distal fibula fractures after an inversion ankle injury in individuals with positive Ottawa ankle rules had the highest positive and negative likelihood ratios, which indicated the tuning fork test was valid in the detection of fractures at that location. [15]
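The relationship between a 2x2 result table, sensitivity, specificity, and likelihood ratios can be illustrated with a short calculation. The following is a minimal sketch, not a method taken from any of the reviewed studies. It uses the counts reported for Moore [24] in Table 1 (10 true positives, 20 true negatives, 5 false positives, 2 false negatives). The names `TwoByTwo` and `post_test_probability` are hypothetical, and the pre-test probability is assumed, for illustration only, to equal the fracture prevalence in that sample (12 of 37).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from comparing an index test with a reference standard."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives

    @property
    def sensitivity(self) -> float:
        # Proportion of fractures that the test detects
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Proportion of non-fractures that the test correctly clears
        return self.tn / (self.tn + self.fp)

    @property
    def accuracy(self) -> float:
        # Proportion of all results that agree with the reference standard
        return (self.tp + self.tn) / (self.tp + self.fp + self.fn + self.tn)

    @property
    def positive_lr(self) -> float:
        # Undefined (division by zero) when specificity is exactly 1
        return self.sensitivity / (1 - self.specificity)

    @property
    def negative_lr(self) -> float:
        # Undefined (division by zero) when specificity is exactly 0
        return (1 - self.sensitivity) / self.specificity


def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Counts as reported for Moore [24] in Table 1
moore = TwoByTwo(tp=10, fp=5, fn=2, tn=20)

# Assumption for illustration: pre-test probability = sample prevalence
pre_test = (moore.tp + moore.fn) / (moore.tp + moore.fp + moore.fn + moore.tn)

print(f"sensitivity = {moore.sensitivity:.2f}")
print(f"specificity = {moore.specificity:.2f}")
print(f"accuracy    = {moore.accuracy:.2f}")
print(f"+LR = {moore.positive_lr:.2f}, -LR = {moore.negative_lr:.2f}")
print(f"post-test probability, positive test = {post_test_probability(pre_test, moore.positive_lr):.2f}")
print(f"post-test probability, negative test = {post_test_probability(pre_test, moore.negative_lr):.2f}")
```

With these counts the sketch reproduces the 83% sensitivity, 80% specificity, and 81% accuracy listed for that study. Under the stated prevalence assumption, a positive test raises the probability of fracture from about 32% to about 67%, and a negative test lowers it to about 9%, which is consistent with a positive LR well below the threshold of 10 described above.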

Table 1. Results of the studies reviewed related to the use of tuning forks
(TF = tuning fork; LE = lower extremity; UE = upper extremity; LR = likelihood ratio; DOR = diagnostic odds ratio)

Reference: Fatima, Jeilani, Abbasi, et al. [26] J Ayub Med Coll. 2012; 24(3-4): 180-182.
Participants: 55 subjects, ages 18-28, with negative x-rays to the tibia or fibula.
Exposure: Each participant had a TF test to the painful site at 128 Hz followed by bone scintigraphy.
Outcome: 67 total stress fractures were located with scintigraphy. The TF test was positive for 53 participants.
Key Findings: Sensitivity of the TF test was 79% and specificity was 63%.
Comments: Caution should be taken when interpreting the results of a TF test to rule in or out a stress fracture.

Reference: Dissmann, Han. [15] Emerg Med J. 2006; 23(10): 788-790.
Participants: Pilot study with 49 patients, ages 12-84, who were examined by a single investigator. Patients then received radiographs for a reference standard.
Exposure: Patients with "Ottawa positive" findings had a 128 Hz TF applied to the lateral malleolus or distal fibula.
Outcome: Sensitivity was 100% and specificity was 61% for the lateral malleolus. Sensitivity was 100% and specificity was 95% for the distal fibula.
Key Findings: Diagnostic accuracy was 65% for the lateral malleolus and 96% for the distal fibula. The +LR was 2.59 for the lateral malleolus and 22 for the distal fibula.
Comments: The addition of a TF test for "Ottawa positive" patients may lead to a reduction in ankle radiographs. This may be relevant when radiological facilities are not readily available or where access has to be prioritized.

Reference: Wilder, Vincent, Stewart, et al. [11] Athletic Training and Sports Health Care. 2009; 1(1): 12-18.
Participants: 45 runners (31.2 +/- 13.1 years) with stress fracture symptoms who had already received radiographs, MRI, and a bone scan.
Exposure: Participants had a TF at 128, 256, and 512 Hz placed over the painful site and rated symptom intensity on a 0-3 scale.
Outcome: Higher fork-induced pain ratings were correlated with positive imaging (r = 0.156, P = .056). The odds risk of a positive image was 5.91 with a fork-induced pain rating of 3, compared with pain ratings ≤ 2.
Key Findings: The 256-Hz fork elicited the highest pain ratings and sensitivity for fracture detection (range, 77.7%–92.3%), and the 512-Hz fork elicited the lowest (range, 50%–76.9%) of all diagnostic tests.
Comments: While the 256-Hz tuning fork had a high sensitivity, the test had very poor specificity: only around 20%. So individuals without stress fractures will nevertheless have pain during a tuning fork test, making it difficult to use as a diagnostic aid.

Reference: Moore. [24] J Athl Train. 2009; 44(3): 272-274.
Participants: 19 males and 18 females, ranging from 7-60 years old, were evaluated for possible fractures < 7 days old.
Exposure: A 128 Hz TF was placed distal to the suspected fracture while a stethoscope was placed proximally.
Outcome: Results included 10 true positives, 20 true negatives, 5 false positives, and 2 false negatives.
Key Findings: Sensitivity was 83% and specificity was 80%. Diagnostic accuracy was 81%. Use of a stethoscope did not alter the sensitivity, but specificity increased to 92% and diagnostic accuracy to 89%.
Comments: A TF test has increased specificity and diagnostic accuracy when combined with a stethoscope.

Reference: Mugunthan, Doust, Kurz, et al. [10] BMJ Open. 2014; 4(8): e005238.
Participants: 6 studies were pooled with over 329 patients between the ages of 7-60.
Exposure: Various LE locations were examined using pain induction or reduced sound conduction with a stethoscope. 4 studies used a 128 Hz TF while 2 used different frequencies.
Outcome: Fracture prevalence ranged from 10% to 80%. Sensitivity ranged from 75% to 100% and specificity ranged from 18% to 95%.
Key Findings: The heterogeneous specificity resulted in a high proportion of false positives due to pain reproduction with a 256 Hz TF in subjects without fractures.
Comments: In the study with patients positive for the Ottawa ankle rules, sensitivity was 100%, although there were only five patients with fractures.

Reference: Schneiders, Sullivan, Hendrick, et al. [14] JOSPT. 2012; 42(9): 760-771.
Participants: 9 studies reviewed with 420 subjects, ranging from 18-45 years old. 6 studies involved both genders and 2 were men only. 5 studies examined multiple LE locations and 1 study focused on osteoporosis.
Exposure: Inclusion criteria included at least 1 radiological reference test, computation of diagnostic values, no age restrictions, published between 1950 and 2011, and only LE stress fractures.
Outcome: Sensitivity ranged from 43-100%, specificity 0-100%, +LR 1.04 to 3.67, -LR 0.06 to 1.62, DOR values 1.08-171.20, and QUADAS score from 7-23.
Key Findings: All studies examined athletes and/or military personnel, confirming these populations are at risk. But the results can't be generalized to other populations including those with underlying pathologies such as osteoporosis.
Comments: Other variances in the studies included time frames for detection, the definition of a stress fracture, methods for quantifying pain, and of the TF, and type of reference test.

Reference: Toney, Games, Winkelmann, et al. [12] J Athl Train. 2016; 51(6): 498-499.
Participants: 6 studies reviewed the accuracy of a 128 Hz TF method for pain induction and reduction of sound transmission. Patient age range was 7 to 84.
Exposure: All types of LE stress fracture were measured against an MRI, x-ray, or bone scan. Case series, case-control studies, and narrative reviews weren't assessed.
Outcome: Sensitivity ranged from 75% to 92% and specificity ranged from 18% to 94%. The +LR was 1.1 to 16.5 and the -LR was 0.09 to 0.49.
Key Findings: Overall the review demonstrated a high ability to rule out a fracture but there was much greater variability in the ability to rule in a fracture.
Comments: Further research is needed to determine if TF tests are only effective in certain physiologic adaptations, such as cortical bone discontinuity, or are affected by the timing of the injury. Additionally, larger sample sizes with various fracture types and locations using higher quality protocols, including blinding of testers to the reference standard, are recommended.

Reference: Kazemi and Roscoe. [4] International Sports Journal. 2000; 4(2): 1-8.
Participants: 46 consecutive patients, 2-91 years old, presented to an ambulatory center with acute UE, LE, or rib pain < 10 days due to trauma.
Exposure: An orthopedist examined each patient without inclusion of a TF. The 2nd examiner placed a 128 and 256 Hz TF over the involved side and a 256 Hz TF on the uninvolved side. Radiographs were then performed.
Outcome: 37/46 patients had fractures. Sensitivity was 86.8% and specificity was 50% for the TF. The Kappa was 0.35.
Key Findings: There was marginal reproducibility and poor specificity in the TF test, regardless of the frequency. Avulsion fractures and radial head fractures were especially difficult to detect.
Comments: Even in the presence of a negative TF test, follow up imaging is recommended due to the high rate of false negatives.
Reference: Lesho. [17] Mil Med. 1997; 162(12): 802-803.
Participants: 46 military personnel, average age 25 years old, with exercise related shin pain localized to the tibia.
Exposure: 128 Hz TF was applied to the anterior tibia. Pain reproduction was considered positive. Patients also had a bone scan.
Outcome: The sensitivity and specificity of the TF test were 75% and 67%.
Key Findings: The TF test had poorer diagnostic capabilities compared to x-ray and especially to the MRI's 100% and 86% sensitivity and specificity.
Comments: The sensitivity of the TF test is not enough to rule out a stress fracture with a negative or pain free test.

Reliability

Overall, the results did not show that a tuning fork test was more reliable for fracture detection based on anatomical location or type of fracture. Articles reviewed included tibial stress fractures, femoral neck fractures, ankle inversion injuries, foot and ankle stress fractures, rib fractures, and unspecified locations. [14] The fractures covered in the literature review were of numerous types, including stress fractures, avulsion fractures, greenstick fractures, comminuted fractures, and hairline fractures. [16] All studies included in this review utilized pain induction and/or reduction of sound transmission as the method for fracture detection. There did not appear to be standardization in training or use of the tuning forks, and future research should determine whether one method of detection is preferable. [10]

The clinical presentation of each fracture type may be unique with varying physiological features and this could have affected the accuracy of detection. [16] A tuning fork test is most accurate when the periosteum is completely disturbed or there is minimal soft tissue around the bone. [12] Generally, tuning fork tests were significantly less accurate (sensitivity = 75% [95% confidence interval {CI} = 57%, 87%]; specificity = 67% [95% CI = 44%, 84%]; positive likelihood ratio = 2.2 [95% CI = 1.1, 4.5]; negative likelihood ratio = 0.37 [95% CI = 0.18, 0.77]) for stress fractures, when the outer layer of bone was still intact. [17] The results of the studies reviewed are summarized in Table 1.

DISCUSSION

While MRI has the highest cost, [18] it displays the greatest detail and has the highest sensitivity and specificity of any imaging technique, [6,19] and is therefore the preferred choice for the detection of stress fractures. [20] CT scans are also more expensive than other forms of imaging but have a high level of specificity. Bone scans cost less than MRI or CT scans and have high sensitivity; [18] however, bone scans have an increased rate of false positives in the presence of an infection or a tumor, so they may not be suitable for all populations. [2, 6, 19] X-rays have a low cost and ready accessibility that make them an easy diagnostic tool to obtain, but they have less detail and lower initial sensitivity than other techniques. [18, 21] While a tuning fork is simple to administer and extremely cheap, [22] it has low sensitivity and specificity when used alone, and therefore other imaging is still necessary for confirmation even with a positive test. [17] Table 2 compares the costs and accuracy of the various diagnostic tools. [4]

Table 2. Comparison of the cost and accuracy of various diagnostic tools

Test           Cost          Sensitivity/Specificity
MRI            $750-$2200    High sensitivity and specificity
Radiographs    $90-$950      Poor initial sensitivity
Bone Scans     $150-$600     High sensitivity
CT Scan        $950-$2000    Low sensitivity/high specificity
Tuning Fork    $5-$6         Low sensitivity/low specificity when used alone

It is proposed stress fractures have two different stages: a cortical bone weakening stage and a cortical bone discontinuity stage. [1] Due to this, use of a tuning fork should be studied in the different stages because this may explain the low specificity in previous studies. If results show the tuning fork test has a greater sensitivity and/or specificity once the fracture progresses to the discontinuity stage, then the tuning fork test may help clinicians recognize the signs and symptoms associated with a stress fracture earlier and therefore assist in the determination of additional imaging as appropriate.

Minimizing delays in detection has an impact on the ability to return to activities of daily living, occupational duties, and recreational activities in a timely manner. There are financial implications due to time spent away from work, [21] and evidence suggests athletes who wait more than three weeks after the onset of symptoms ultimately have a delayed and longer timeline for return to sport. [23] Ohta-Fukushima analyzed 222 stress fractures in 208 athletes, determining there was a statistically significant difference in time for return to competition between athletes who visited a hospital within three weeks of symptoms and those who waited longer. It took an average of 14.2 weeks to return to sport for those athletes who visited the hospital within three weeks of symptoms and sometimes over six months for those who waited longer than three weeks. [23] So regardless of the method of detection, early diagnosis and intervention are keys for return to activity.

CONCLUSION

Additional research is necessary to conclude if tuning-fork tests are beneficial under particular conditions, such as a lack of cortical bone continuity, [24] in conjunction with other tests, [25] or are influenced by the timing of the injury. [26] Also, investigators should use greater sample sizes with different types of fractures, different locations, standardized protocols, and blinding of testers. The method of pain induction using the 128 Hz tuning fork requires less training than use of a stethoscope to listen for sound reduction and therefore could be the focus of future research. In summary, caution should be taken when interpreting the results of a TF test to rule in or rule out a fracture. Diagnostic imaging should continue to be the standard for identifying fractures regardless of the suspected type or location.

REFERENCES

1. Romani WA, Gieck JH, Perrin DH, et al. Mechanisms and management of stress fractures in physically active persons. J Athl Train. 2002; 37(3): 306–314.
2. Otter MW, Qin YX, Rubin CT, et al. Does bone perfusion/reperfusion initiate bone remodeling and the stress fracture syndrome? Med Hypotheses. 1999; 53(5): 363-368.
3. Reeder MT, Dick BH, Atkins JK, et al. Stress fractures. Current concepts of diagnosis and treatment. Sports Med. 1996; 22(3): 198-212.
4. Kazemi M, Roscoe M. Is the tuning fork test a reliable tool in detecting acute simple fractures? International Sports Journal. 2000; 4(2): 1-8.
5. Ishibashi Y, Okamura Y, Otsuka H, et al. Comparison of scintigraphy and magnetic resonance imaging for stress injuries of bone. Clin J Sport Med. 2002; 12: 79-84.
6. Zwart A, Beeres F, Ring D, et al. MRI as a reference standard for suspected scaphoid fractures. Br J Radiol. 2012; 1098-1101.
7. Bretlau T, Christensen OM, Edstrom P, et al. Diagnosis of scaphoid fracture and dedicated extremity MRI. ACTA Orthopedics of Scandinavia. 1999; 70(5): 504-508.
8. Cabarrus MC, Ambekar A, Lu Y, et al. MRI and CT of insufficiency fractures of the pelvis and the proximal femur. Am J Roentgenol. 2008; 191: 995-1001.
9. Gaeta M, Minutoli F, Scribano E, et al. CT and MR imaging findings in athletes with early tibial stress injuries: comparison with bone scintigraphy findings and emphasis on cortical abnormalities. Radiology. 2005; 235(2): 553-61.
10. Mugunthan K, Doust J, Kurz B, et al. Is there sufficient evidence for tuning fork tests in diagnosing fractures? A systematic review. BMJ Open. 2014; 4(8): e005238.
11. Wilder RP, Vincent HK, Stewart J, et al. Clinical use of tuning forks to identify running-related stress fractures: a pilot study. Athletic Training and Sports Health Care. 2009; 1(1): 12-18.
12. Toney CM, Games KE, Winkelmann ZK, et al. Using tuning-fork tests in diagnosing fractures. J Athl Train. 2016; 51(6): 498-499.
13. Grimes DA, Schulz KF. Refining clinical diagnosis with likelihood ratios. Lancet. 2005; 365(9469): 1500–1505.
14. Schneiders AG, Sullivan SJ, Hendrick PA, et al. The ability of clinical tests to diagnose stress fractures: a systematic review and meta-analysis. J Orthop Sports Phys Ther. 2012; 42(9): 760-771.
15. Dissmann PD, Han KH. The tuning fork test: a useful tool for improving specificity in "Ottawa positive" patients after ankle inversion injury. Emerg Med J. 2006; 23(10): 788-790.
16. Müller ME, Koch P, Nazarian S, et al. The comprehensive classification of fractures of long bones. New York, NY: Springer Science and Business Media; 1990.
17. Lesho EP. Can Tuning Forks Replace Bone Scans For Identification of Tibial Stress Fractures? Mil Med. 1997; 162(12): 802-803.
18. Average Out-of-Pocket Costs for Uninsured and Self-Pay Patients | El Camino Hospital. 2015; Retrieved from http://www.elcaminohospital.org/Patient_Services/Billing_Insurance/Patient_Services_Pricing_and_Estimates/Average_Out-of-Pocket_Costs_for_Uninsured_Patients
19. Fredericson M, Bergman AG, Hoffman KL, et al. Tibial stress reaction in runners. Correlation of clinical symptoms and scintigraphy with a new magnetic resonance imaging grading system. Am J Sports Med. 1995; 23(4): 472-481.
20. Liong SY, Whitehouse RW. Lower extremity and pelvic stress fractures in athletes. Br J Radiol. 2012; 85: 1148-1156.
21. Matheson GO, Clement DB, McKenzie DC, et al. Stress fractures in athletes. A study of 320 cases. Am J Sports Med. 1987; 15(1): 46-58.
22. TENSnet - Over 25,000 health and medical products at everyday low prices. 2015; Retrieved from http://TENSNET.COM
23. Ohta-Fukushima M, Mutoh Y, Takasugi S, et al. Characteristics of stress fractures in young athletes under 20 years. J Sports Med Phys Fitness. 2002; 42: 198-206.
24. Moore M. The use of a tuning fork and stethoscope to identify fractures. J Athl Train. 2009; 44(3): 272-274.
25. Bachmann LM, Kolb E, Koller MT, et al. Accuracy of Ottawa ankle rules to exclude fractures of the ankle and mid-foot: systematic review. BMJ: British Medical Journal. 2003; 326(7386): 417.
26. Fatima ST, Jeilani A, Abassi N, et al. Validation of tuning fork test in stress fractures and its comparison with radionuclide bone scan. J Ayub Med Coll. 2012; 24(3-4): 180-182.

How to cite this article: Charles D, Elkins A, Kneeburg A et al. A comparison of diagnostic imaging and vibrating tuning forks in the detection of fractures. Int J Health Sci Res. 2017; 7(8):473-478.
