European Journal of Trauma Focus on Multiple Trauma

A Comparison Study of the Score Models

Bernice Dillon1, Wenbin Wang2, Omar Bouamra3

Abstract

Background: A central component of the statistical analysis of trauma care is the probability of survival model, which predicts the outcome of the trauma event, taking into account various anatomical and physiological factors. One of the key inputs to the survival model is the injury score, which forms the cornerstone of trauma epidemiology. There are many scoring systems currently in use, and the Injury Severity Score (ISS) is widely used as the anatomical component of the injury in the probability of survival model. This paper examines the possibility of representing the anatomical component of the trauma using different injury severity scoring methods described in the literature.

Material and methods: The dataset used consists of 75,371 cases from the Trauma Audit and Research Network (TARN). TARN brings together 110 hospitals in the UK and is the largest European trauma registry. Various limitations of ISS have been described in the literature, and an investigation into other scoring methods that could be calculated from the available data was proposed. Using the available database, the alternative injury scoring methods can be calculated and their use within a Trauma Score and Injury Severity Score (TRISS) probability of survival model is assessed.

Results: The current score performs reasonably well, but there is some improvement in calibration associated with introducing a score that takes into account the body-region locations of all injuries.

Key Words
Trauma · Injury · Severity · Score · ISS · TRISS

Eur J Trauma 2006;32:538–47
DOI 10.1007/s00068-006-5102-9

Introduction
Injury severity scoring is a cornerstone of trauma epidemiology. An anatomical injury score is an essential constituent of any risk-adjustment model used to predict the probability of survival. This score must accurately assess the extent and severity of the injuries endured. Anatomical injury is determined on the basis of physical examination, investigative procedures, surgical intervention and, in fatal cases, post-mortem examination. The injuries for each case are catalogued once and do not change in time, unlike the physiological scores. The TRISS models used at the Trauma Audit & Research Network (TARN) [1] are all based on the use of the Injury Severity Score (ISS) as the anatomical component. This widely used injury scoring system involves classifying all patient injuries using the Abbreviated Injury Scale (AIS) [2] system described below. Specialist coders within the TARN group characterize each injury in terms of anatomic description and injury severity using the AIS system, based on a catalogue of injuries provided by the hospitals. The aim of this paper was to investigate some alternative injury scoring systems and the advantages or disadvantages of their use in the TRISS model used by TARN. All the data analysis

1 Medical Statistician at Wythenshawe Hospital, Manchester, UK
2 Centre for OR and Applied Statistics, School of Accounting, Economics and Management Science, University of Salford, Salford, UK
3 Medical Statistician at the Trauma Audit & Research Network, Manchester, UK

Received: August 26, 2005; revision accepted: June 12, 2006

European Journal of Trauma 2006 · No. 6 © Urban & Vogel 538 Dillon B, et al. Injury Scoring

and model evaluation described in this paper were done using a dataset obtained from the TARN database. The dataset used contains all the trauma events in the TARN database for the 5 years from 1998 to 2002. Over 150 variables covering all aspects of care within the trauma system are stored for each of the 75,371 trauma patients treated. The software used for all statistical modelling and testing is SPSS (version 12). Excel was also used to produce some of the tables and graphs. The next section reviews and describes some of the scoring systems currently in use for modelling outcome in trauma. Logistic regression is used to model outcome and the model performance is tested by means of the area under the receiver operating characteristic (AROC) curve. The goodness-of-fit test is the Hosmer-Lemeshow (H-L) test, which compares observed to expected outcome using the chi-squared statistic. A detailed description of the statistical method can be found in Hosmer and Lemeshow [3].

Material And Methods

Injury Classification and Scaling System - AIS
Injury scaling is the assessment of the severity of trauma-related tissue damage. The Abbreviated Injury Scale (AIS) is the most widely used anatomical scale for rating the severity of injuries worldwide. It was developed by a joint Committee on Injury Scaling, composed of members of the American Association for Automotive Medicine [2], the American Medical Association and the Society of Automotive Engineers, to provide motor vehicle crash teams with an accurate method of rating and comparing injuries. The first version, published in 1971, was a rudimentary classification of 73 injuries. Since then, the AIS classification system has been revised and extended six times. The latest 1990 version distinguishes over 2000 different injuries, including penetrating as well as blunt injuries, and is regarded as the injury coding 'industry standard' with respect to its injury-specific descriptive abilities. In AIS classification, each injury is assigned a six-digit code based on its anatomical site, nature and severity. There are nine AIS body regions and over 2000 codes describing different injuries. The ordinal scale 1–6 used to characterize the severity of the injury is given in Table 1. This AIS severity scale is not a linear progression, i.e. the intervals between the scores are not equal. The AIS 1–6 numeric code is simply a means of distinguishing between categories of injuries within a similar range of severity. The AIS focuses on classifying each individual injury accurately; however, it provides no mechanism to summarize a single patient's multiple injuries into a single score. That is the purpose of the injury scoring systems described in the next section.

Table 1. Ordinal scale used for AIS severity code classification.

AIS Severity Score   Injury Severity Description
1                    Minor
2                    Moderate
3                    Serious, not life-threatening
4                    Severe, life-threatening
5                    Critical, survival uncertain
6                    Maximum, actually untreatable

Review of Scoring Systems
Injury scoring systems are used to provide a valid numerical measure of the overall severity of injury in patients with multiple injuries. Four different scoring systems, which are all functions of AIS injury severity codes, are reviewed here.

Injury Severity Score (ISS)
This is the most widely used score for evaluating patients with multiple injuries. The ISS was introduced by Baker et al. [4] to provide a summary measure of injury severity. It was based on a study of 2,128 motor vehicle crash patients, in which Baker and her colleagues showed a non-linear relationship between mortality and the AIS rating of the most severe injury. In addition, the mortality rate depended on the second most severe injury as well as the most severe injury. When the age of the patient is taken into account, the ISS has been shown to correlate well with mortality, length of stay and, to a lesser extent, hospital charges. It was proposed as the anatomical component in the original TRISS model [5] and has been the most commonly used injury severity score ever since. However, as described in the rest of this section, other severity scores have been developed and some studies show that these correlate better than ISS with mortality rate and other outcomes.

Evaluation of ISS
Each injury is given an AIS severity code and classified into one of six ISS body regions. The ISS is defined as the sum of the squares of the highest AIS scores in each of the three most severely injured ISS body regions. If there is more than one injury in a particular body region, only the highest AIS score is used. The maximum ISS score is 75 and any patient with an injury of AIS severity score 6 is automatically given an ISS score of 75. An ISS of 16 or more defines major trauma, with a typical average mortality rate of more than 10% [6].
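The ISS rule described above can be illustrated with a minimal sketch. It is not TARN's implementation: the function name `injury_severity_score` and the input format, a list of hypothetical (ISS body region, AIS severity) pairs, are assumptions made only for this example.

```python
from collections import defaultdict


def injury_severity_score(injuries):
    """Compute the ISS from (iss_body_region, ais_severity) pairs.

    Assumed input format (hypothetical, for illustration only):
    region is an integer 1-6 and severity is an integer 1-6.
    """
    if not injuries:
        return 0  # assumption: a patient with no coded injuries scores zero
    # Any AIS 6 injury automatically gives the maximum ISS of 75.
    if any(severity == 6 for _, severity in injuries):
        return 75
    # Keep only the highest AIS severity within each ISS body region.
    worst_by_region = defaultdict(int)
    for region, severity in injuries:
        worst_by_region[region] = max(worst_by_region[region], severity)
    # Sum the squares of the three highest regional maxima.
    top_three = sorted(worst_by_region.values(), reverse=True)[:3]
    return sum(severity ** 2 for severity in top_three)


# Two injuries in region 1 (AIS 4 and 3), one in region 3 (AIS 3), one in region 5 (AIS 2).
example = [(1, 4), (1, 3), (3, 3), (5, 2)]
print(injury_severity_score(example))  # 4^2 + 3^2 + 2^2 = 29
```

In this example the second injury in region 1 (AIS 3) does not contribute, because only the highest AIS score in each body region is used.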


Advantages and disadvantages of ISS
The ISS was shown to correlate substantially better with mortality rate than does the AIS rating for the single most severe injury [6]. It is relatively easy to calculate if the AIS codes are available. However, in various comparative studies it does not do as well as other scoring systems [7], [8] and [9]. Its major limitations relate to three factors:
(1) It does not take into account multiple injuries in the same body region, since it only includes the highest AIS score in each body region. This may lead to an underestimation of the patient's overall anatomic injury severity, because some of the patient's most severe injuries may not be included.
(2) ISS gives equal weight to AIS injury severity codes for each body region and this may not be appropriate in calculating mortality. An AIS 4 injury to another region can carry a higher mortality risk than an AIS 4 injury to the extremities.
(3) It has also been shown that diverse combinations of specific injuries and injury severities will give the same ISS value but have very different risks of mortality associated with them [10].

New Injury Severity Score (NISS)
A new scoring method was proposed in 1997 to improve the accuracy of TRISS models [8]. Based on a univariate logistic analysis of two independent small datasets (3,136 and 3,449), Osler et al. [11] showed that NISS was better than ISS at predicting survival (area under the receiver operating characteristic (AROC) curve, ISS: 0.869, NISS: 0.896) and that NISS provides a better fit throughout its entire range of prediction (Hosmer-Lemeshow (H-L) statistic, ISS: 29.12, NISS: 8.88). The improved accuracy of NISS for prediction of short-term mortality was also confirmed in a different study of 2,328 patients [12]. This focussed on the mortality rates in patients with discrepant NISS and ISS scores (68% of patients). This group had a significantly higher mortality rate compared with those who had identical ISS and NISS scores. Moreover, as the difference between the ISS and NISS scores increases, the patient's likelihood of survival decreases. The AROC was greater with NISS than with ISS (0.852 vs 0.799; p < 0.001). Another study found that NISS describes postinjury multiple organ failure better than ISS [13]. In a subsequent study, Balogh et al. found that patients with multiple orthopaedic injuries were more likely to have NISS > ISS [14]. They also found that NISS was a better predictor of length of stay and ICU admission than ISS. However, a recent study on a database of 6,231 patients of a Level I trauma centre found little difference between the two scoring systems [15]. The AROC values were similar (ISS: 0.940, NISS: 0.936) and neither the ISS nor the NISS was well calibrated, with H-L statistics of 36.11 and 49.28 respectively. Also, in two comparative studies NISS did less well than ISS [7] and [9].

Evaluation of NISS
NISS is defined as the sum of the squares of the AIS scores of a patient's three most severe injuries. It is more straightforward to calculate than ISS because the location of the injuries does not have to be considered: the three highest AIS scores are used regardless of body region.

Advantages and disadvantages of NISS
Once the AIS severity codes are available, NISS is straightforward to calculate. It does take account of multiple injuries in the same region; however, it does not discriminate between injury severities in different locations. Its performance compared to ISS seems to depend to some extent on the characteristics of the database studied.

Anatomic Profile (AP) Score
This score was developed to overcome or mitigate the limitations of ISS. This four-component score characterizes anatomic injury by more precisely describing patient injuries and more accurately relating them to the likelihood of patient survival. It takes into account all serious injuries (AIS severity ≥ 3) and also includes a categorization based on injury locations. This means that 'high-risk' injuries to the head and spinal cord can be weighted more heavily in the score evaluation. However, it is more complex to evaluate than either ISS or NISS. It was proposed in 1990 by Copes et al. [16], who analysed 20,946 blunt-injured adult patients submitted to the Major Trauma Outcome Study (MTOS) database before 1987. Performance comparisons of ISS and AP were obtained from a test dataset of 5,939 patients submitted to the MTOS database in 1987. An advisory group of trauma surgeons and health service researchers guided the development of the score and, based on clinical judgements and the results of statistical analysis, they assigned injuries into categories for calculating the four AP components A-D. The evaluation of


AP is described in the section below and Table 2 shows the categorization of injuries. The results obtained by Copes et al. [16] show improved discrimination and sensitivity for AP compared to ISS; however, neither score alone was well calibrated when used to predict patient survival. AP was incorporated into a probability of survival model, which uses the Revised Trauma Score (RTS) values and patient's age as predictors in addition to the AP score. This model, known as ASCOT (A Severity Characterization of Trauma), was proposed by Champion et al. [17]. Calculation of ASCOT models is discussed later. A variation of the AP score, the modified anatomic profile (mAP), was introduced by Sacco et al. [8] in a comparative study for assessing injury severity scores. The mAP is used to define a single-value anatomic profile score (APS) instead of the multi-component AP score. Similar to AP, this score relies on specification of the AIS and ICD codes for each injury and is based on component values, which summarize the number, location and severity of all serious injuries. APS also recognises the importance of the maximum AIS severity value and uses it as a component. In independent comparative calculations to predict mortality using univariate logistic models based on the various injury severity scores alone, APS performs better than NISS and ISS [7] and [9].

Evaluation of APS and AP score
The AP of a patient's injuries is based on four anatomic profile components. These components are calculated by classifying all the patient injuries into groups as shown in Table 2. The assignment of injury to AP components was based on an analysis of MTOS data and clinical judgements by an advisory committee of trauma surgeons and health service researchers [16]. Each component A-D is calculated by evaluating a summary value of all the injuries associated with that component. The summary value is obtained by calculating the square root of the sum of the squares of the AIS codes of all injuries of a specified severity in particular body region groups (Table 2). The component value is zero if none of its associated injuries exist. Copes et al. [16] do not specify how patients with any AIS 6 injury are scored; however, it is assumed that these are placed in a set-aside group as described later for ASCOT evaluation. Treatment of serious injuries (AIS > 2) to the external region and critical injuries (AIS > 4) to the face is also not specified in Table 2. However, Copes et al. [16] found that component D was insignificant in the evaluation of the score and, since these injuries are usually associated with injuries to the head or other regions, it was assumed that they do not contribute to the evaluation of AP directly.

Table 2. Classification of injuries to AP components (from Copes et al. [16]). ICD-9-CM: International Classification of Diseases, Ninth Revision.

Component  Injury                 AIS severity  ISS body region  ICD-9-CM codes
A          Head/brain             3–5           1                800, 801, 803, 850–854
A          Spinal cord            3–5           1, 3, 4          806, 950, 952, 953
B          Thoracic               3–5           3                807, 839.61/.71, 860–862, 902
B          Front of neck          3–5           1                807.5/.6, 874, 900
C          Abdomen/pelvis         3–5           4                863–868, 902
C          Spine w/o cord         3             1, 3, 4          805, 839
C          Pelvic fracture        4–5           5                808, 839.42/.52/.69/.79
C          Femoral artery         4–5           5                904.0/.1
C          Crush above knee       4–5           5                928.00/.01, 928.8
C          Amputation above knee  4–5           5                897.2/.3/.6/.7
C          Popliteal artery       4             5                904.41
D          Face                   1–4           2                802, 830
D          All others             1–2           1–6              –

To calculate APS, Sacco et al. [8] give

APS = 0.3199(mA) + 0.4381(mB) + 0.1406(mC) + 0.7961(Max AIS)    (1)

where mA-mC are the mAP component values specified in Table 3 and 'Max AIS' is the maximum overall AIS score. The weights in equation (1) were derived from a logistic regression analysis of an MTOS dataset [17].

Limitation of TARN database for APS/AP score evaluation
The categorization of the patient's injuries in Table 2 depends on the AIS and ICD-9-CM classification of each injury. Unfortunately only the AIS codes are stored for the patient injuries in the TARN database and so neither the AP score nor the APS could be calculated accurately. The only location-dependent information for each injury that can be easily calculated from the TARN database is the ISS body region associated with each injury. Using this information, we constructed two approximate scores so that some location-weighted scores could be examined in this study.
1. 'AP-like' score: This six-component score is simply evaluated by calculating a summary value for each of the six ISS body regions. The summary value used is the same as for the APS/AP scores:


   the square root of the sum of the squares of the severity codes for all serious injuries (AIS 3, 4, 5).
2. 'APS-like' score: This single-value score (APS*) is the weighted sum of the summary values from three groups of ISS regions:

   APS* = 0.3199(mA*) + 0.4381(mB*) + 0.1406(mC*) + 0.7961(Max AIS)    (2)

   where mA* is a summary value of all ISS region 1 serious injuries (AIS 3–6), mB* is the same for ISS region 3 and mC* is a summary value for serious injuries found in all other regions. The same summary value is used here.

Clearly these approximate scores are not as precise as the definitions given in Tables 2 and 3. In particular, the weights used for APS* have been derived from an MTOS database relating to USA trauma patients, and weights for a UK-based database would have been more appropriate.

Table 3. Component definitions for mAP (from Sacco et al. [8]).

Component (a)  Injured body region  AIS severity
mA             Head/brain           3–6
mA             Spinal cord          3–6
mB             Thorax               3–6
mB             Front of neck        3–6
mC             All others           3–6

(a) mA, mB, mC scores are derived by taking the square root of the sum of squares for all injuries defined by each component.

Development of ASCOT – a probability of survival model
AP was incorporated into a TRISS-like model, which uses RTS values and patient's age as predictors in addition to the AP score. This model, known as ASCOT (A Severity Characterization of Trauma), was proposed in 1990 [17]. Calculation of ASCOT models is described in the next section. Results from a study using MTOS data showed that ASCOT for blunt-injured patients had similar discrimination to TRISS; however, its H-L statistic is substantially lower than that of TRISS (24.8 vs. 43.9) [17]. A further study to validate ASCOT analysed 5,685 patients submitted to the Institute for Trauma and Emergency Care, New York Medical College (ITEC) database over two years from July 1987 [18]. The results showed little difference between the TRISS and ASCOT models and it was concluded 'that the relatively small gain in predictive accuracy by ASCOT over TRISS is largely offset by its complexity and computer processing requirements'. A subsequent paper pointed out that Markle et al. [18] used the MTOS coefficients to predict the outcomes rather than generating new coefficients using the ITEC data [19]. Using the same dataset split into a design dataset (to predict the model coefficients) and a test dataset, Hannan et al. [19] showed that ASCOT performed better than TRISS (Disparity: 0.37 vs. 0.34, H-L statistic: 10.3 vs. 96.8). Results from another independent evaluation of ASCOT and TRISS were reported in 1996 by Champion et al. [20]. They analysed 14,296 patient cases submitted over two years to four level 1 trauma centres which contribute to MTOS. These hospitals are classified as MTOS 'controlled sites' because of their emphasis on controlled data collection, which results in a high percentage of complete and accurate patient records. Only 64% of the cases were categorized as blunt-injured adults. The AROC for this sample was similar using the two models (TRISS 0.911; ASCOT 0.916). However, the H-L value of 13.3 with ASCOT indicates that it provides a statistically good fit and was a substantial improvement on the TRISS value of 30.7.

Construction of ASCOT models
The ASCOT model developed by Champion et al. [17] differs from TRISS in three significant ways.
1 It uses AP instead of ISS to characterize the anatomic severity of the injuries.
2 Patient's age is categorized into 5 levels instead of the two used by TRISS.
3 Each of the RTS components (Glasgow Coma Score (GCS), respiratory rate and systolic blood pressure) is used as a predictor in the model, instead of forming the weighted sum equation (1) to give an RTS value as the predictor.
With ASCOT, as in TRISS, patients are separated into blunt and penetrating injury datasets for analysis. In ASCOT, patients with extremely good or poor prognoses are excluded from the logistic function modelling. Table 4 shows the four set-aside groups and their application to a test dataset. The first three groups contain patients with a very small chance of survival; they have sustained an AIS 6 injury and/or arrive at A&E in cardiac arrest. The fourth set-aside group comprises patients with minor injuries whose prognoses are good. In the calculation of ASCOT models, the Ps value for each patient in a set-aside group is set equal to the survival rate of the same group in the design dataset.

Other scoring systems
The AIS classification of injuries, which forms the basis of all the above scores, requires a manual review


of each patient's record, which typically takes 10 to 20 minutes. To avoid this expense, alternative methods of coding injury have been proposed. These scoring methods are based on the International Classification of Diseases Ninth Revision (ICD-9) codes, which are used by the vast majority of hospitals to codify clinical diagnoses (American Medical Association, 1999). Two different approaches have been developed to generate injury scores from the ICD-9 codes. A software package, ICDMAP-85, was released in 1985 by MacKenzie. This provided a means of converting ICD-9 codes to AIS codes. An updated version, ICDMAP-90, was released in 1997 [21]. This software is particularly useful in the analysis of large databases which do not contain AIS codes. In comparative studies of the injury scores, these types of mapped scores generally performed less well than those based on directly coded AIS values or even the ICD-9-based Injury Severity Score (ICISS) [7], [8] and [9]. The development of ICD-10 may provide more flexibility for injury classification and description and so provide a better match with the AIS codes. The ICISS was proposed in 1996 [22]. The ICISS methodology uses survival risk ratios derived for each ICD-9 code to formulate an overall severity score, without the need for any AIS coding. This score has been very successful and in comparative studies it often out-performs all the other scores on discriminatory ability [7], [8] and [9].

Table 4. Set-aside groups for ASCOT models in a test dataset.

Set-aside  Patient AIS and             Design dataset (n = 25,891)  Test dataset (n = 12,606)
group      RTS description             N        Survival rate       N
1          AIS 6, RTS = 0              8        0                   5
2          Max AIS < 6, RTS = 0        55       0                   40
3          AIS 6, RTS > 0              20       0.05                8
4          Max AIS = 1 or 2, RTS > 0   6,073    0.988               2,982
Total cases                            6,156                        3,035

Results
The different scoring systems were first calculated and evaluated by checking the ability of each injury scoring system to relate to mortality. TARN uses some exclusion criteria for the TRISS model. The following cases are excluded: Children (< 16 years), Intubated/Ventilated, , referrals and patients with penetrating injuries. After the exclusions, only 53,286 patients out of the 75,371 previously selected are considered. A mortality rate of 4.1% (2,193 deaths) after 30 days is observed. The ISS and NISS values could be easily compared with each other and in their relationship to the survival outcome. The AP-like score used had six components and so it could not easily be compared with mortality without doing a logistic regression analysis. The APS-like score was a single value and it is compared here with ISS and NISS. The NISS values were calculated for each patient using the stored AIS severity codes. The NISS value is always equal to or greater than the ISS. In 52.7% of cases in the dataset the ISS and NISS scores are identical. Of the remaining 25,227 cases, the difference between the ISS and NISS scores varies between 1 and 50, with 61% of these cases having a difference of 4 or less.

The mortality rate in the cases with NISS > ISS is 5.8%, which is higher than that found in the dataset as a whole and agrees with the study reported in [12]. An analysis by body region shows that 67% of the cases with NISS > ISS have a severity score of 3 or more in the extremities region (ISS region 5). For both scores the median value for the survivors was 9, but the median value for the non-survivors was 34 with NISS compared to 25 with ISS. Figures 1 and 2 show the separation of survivors and non-survivors for both cases. The broadening effect of the NISS is clear. The main peak at a score of 9 is reduced from 54.5% of cases with ISS to 31.1% with NISS. The ROC curves can be evaluated for the two scores and the results are similar, with ISS AROC = 0.832 and NISS AROC = 0.827. To calculate APS*, the AIS severity codes must be assigned to particular body regions and hence component values (mA*-mC*) as in equation (2). The number of injuries per patient in the core dataset varies from 1 to 26, with a mean of 2.31 injuries. Altogether there were 123,586 injuries and these were assigned to each of the six ISS body regions. The largest proportion of them are in the extremities region, as shown in Figure 3. Only 45% of these injuries are classed as serious (AIS 3, 4, 5 or 6) and contribute to the APS* score. The maximum AIS value was 3 in 68.7% of the patients and 2 in 23.2%. Each of the other maximum AIS values occurred in less than 5% of the patients.

The range of the calculated APS* values was 12.25 but, unlike ISS and NISS, they are not restricted to specific integer values. The mean value for survivors is 2.74 with a standard deviation of 0.93 and for non-survivors it is 5.24 with a standard deviation of 2.19. Figure 4 shows the separation of survivors and non-survivors for this score. It has a peak at 2.81 with 53.2% of cases having


that score. This is a very similar proportion to that found at ISS 9 in Figure 2. Calculation of the AROC for APS* gave a value of 0.837. This is slightly better than ISS or NISS, indicating that this score, even with the approximations used in the calculation, has good discriminatory abilities.

[Figure 1. Separation of survivors and non-survivors in the dataset with NISS. Bar chart of % of cases for survivors and non-survivors by NISS band: 1-5, 6-10, 11-15, 16-25, 26-74, 75.]

[Figure 2. Separation of survivors and non-survivors in the dataset with ISS. Bar chart of % of cases for survivors and non-survivors.]

[Figure 3. Proportion of all injuries in the dataset for each ISS body region. Pie chart over the regions 1 - Head, 2 - Face, 3 - Thorax, 4 - Abdomen, 5 - Extremities and 6 - External; Extremities is the largest segment at 59%.]

[Figure 4. Separation of survivors and non-survivors in the dataset with APS*. Bar chart of % of cases for survivors and non-survivors by APS band: 0-3, 3-6, 6-9, 9-14.]

Inclusion in TRISS and ASCOT models
The different injury scoring systems could be compared using a univariate logistic model, which would examine the relationship between mortality and each scoring system alone. However, in TARN, TRISS models, which include RTS and age predictors, are always employed, and so for the evaluation here each injury score (ISS, NISS, APS* and AP-like) was used as the anatomic component in a TRISS model. Since ASCOT was developed specifically for use with AP, an approximation to ASCOT using the available data was also tested for each score.

The dataset described in Table 4 was used for these tests because complete RTS and age records are required. The dataset was split into a prediction (design) dataset and a validation (test) dataset. The logistic regression model in SPSS was used to calculate the coefficients. The models were evaluated by looking at AROC for their discriminatory ability and the H-L statistic to assess their calibration and predictive reliability. The TRISS model used has age described by 5 levels rather than the two-level age variable used


in the MTOS TRISS model. Table 5 shows the results tory rate at the scene of the accident and this variable for all the models. The results for ISS and NISS are was missing in a high proportion of the test dataset very similar, with NISS having a slightly higher H-L cases (34%). The results from a trial calculation using statistic. The lack of improvement in AROC is prob- ISS, Age, GCS and blood pressure on admission and ably due to the tendency of NISS to overstate severity respiratoy rate at the scene as predictors are shown for less severely injured patients. The test dataset had in Table 4. These discrimination results are signifi- a mortality rate of 4.4% and most of the patients are cantly worse and so the RTS value was used here in not severely injured: 88.1% patients have an ISS score an approximation to the ASCOT model. With this < 16 and 75.1% have a NISS score < 16. The AP-like approximation to ASCOT, the only real difference and APS scores have similar discrimination to ISS but between the TRISS and ASCOT models for a partic- ISS had the best H- L value of the TRISS models. A ular injury severity score is the treatment of the set- more ASCOT-like model was then investigated with aside groups. The low H-L values achieved with the each injury severity score. This model used the four AP-like and APS* scores in the ASCOT model shows set-aside groups along with a 5-level age variable and that the use of set-aside groups works well with these the injury severity score. Cases in the four set-aside types of injury scores. The effect of injury location on groups in the design dataset were identified and their the patient prognoses must be most significant in the survival rates calculated. These were used as the pre- severity levels 3–5. Although the H-L values are lower dicted probability of survival for each of the cases in for these scores, the AROC values shows little varia- the related group in the test dataset. The number of tion between the scores. 
This result agrees with [10]. In cases and the survival rates for each set-aside group general all the AROC values for the ASCOT models in each dataset are shown in Table 5. In each case, are lower than those obtained for the TRISS models. approximately 24% of the dataset are set-aside from This indicates that the definition and use of the set the logistic model. The set-aside groups are independ- aside groups decreases the discrimination of the mod- ent of the injury severity score used and so all the el. One reason for this can be seen by examining the scores had the same observed or predicted survival largest group, Set-Aside 4. In ASCOT, all these cases rate for these cases. The RTS values should have been are assigned a Ps = 0.988 however Table 6 shows the individual components for the ASCOT model but an change in mortality rate with both age and ISS value examination of the database revealed that the respi- across this group. In TRISS models age, variation in ratory rate on admission to A&E was not included ISS and RTS within this group would result in a range in the database. The closest variable was the respira- of Ps values. For example, with ISS the Ps values cal-

Table 5. Performance evaluation of Ps models using different injury severity scores.

Model    Injury severity  H–L statistic          AROC, all data    AROC, design data AROC, test data
type     score                                   (n = 38,497)      (n = 25,891)      (n = 12,606)
                          Value    Significance  Value    Error    Value    Error    Value    Error
TRISS    ISS              11.465   0.120         0.938    0.003    0.937    0.004    0.939    0.005
TRISS    NISS             19.412   0.013         0.937    0.003    0.935    0.004    0.941    0.005
TRISS    AP-like          14.399   0.045         0.938    0.003    0.936    0.004    0.940    0.005
TRISS    APS*             29.706   0.000         0.936    0.003    0.935    0.004    0.937    0.006
ASCOT    ISS              13.704   0.057         0.921    0.003    0.920    0.005    0.923    0.007
ASCOT    NISS             21.489   0.006         0.918    0.004    0.916    0.005    0.923    0.007
ASCOT    AP-like          9.690    0.207         0.920    0.004    0.919    0.005    0.923    0.007
ASCOT    APS*             5.534    0.354         0.920    0.004    0.918    0.005    0.923    0.007
ASCOTa   ISS              16.358   0.038         0.917    0.005    0.912    0.006    0.926    0.008
                          (13,249)               (13,249)          (8,942)           (4,307)

Values of H–L ≤ 15.5 do not reject, at the 5% significance level, the hypothesis that the model provides an adequate fit of the data, and an AROC value of 1 corresponds to a model that perfectly separates the two sub-populations (survivors and non-survivors).
a This model used GCS and systolic blood pressure on admission and respiratory rate at the scene instead of the RTS value. The number of cases with missing data is shown in brackets.
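The two performance measures reported in Table 5 can be illustrated with a minimal Python sketch. It is not the code used for the analysis: the function names (`aroc`, `hosmer_lemeshow`, `adequate_fit`), the use of ten equal-size risk groups, and the synthetic data are all assumptions made for illustration. The threshold of 15.5 is taken from the table note; it matches the 5% critical value of a chi-square distribution with 8 degrees of freedom (15.507).

```python
import random

# Threshold quoted in the Table 5 note; 15.507 is the 5% critical value of a
# chi-square distribution with 8 degrees of freedom (ten groups minus two).
HL_THRESHOLD = 15.5


def aroc(probs, outcomes):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    probs    -- predicted probabilities of survival (Ps)
    outcomes -- 1 for survivors, 0 for non-survivors
    """
    order = sorted(range(len(probs)), key=lambda i: probs[i])
    ranks = [0.0] * len(probs)
    i = 0
    while i < len(order):
        # Tied predictions share the average of the ranks they span.
        j = i
        while j + 1 < len(order) and probs[order[j + 1]] == probs[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(outcomes)
    n_neg = len(outcomes) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, outcomes) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def hosmer_lemeshow(probs, outcomes, n_groups=10):
    """Hosmer-Lemeshow statistic over equal-size groups ordered by Ps.

    Assumes at least n_groups cases and predictions strictly between 0 and 1.
    """
    pairs = sorted(zip(probs, outcomes), key=lambda t: t[0])
    n = len(pairs)
    stat = 0.0
    for g in range(n_groups):
        group = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        size = len(group)
        expected = sum(p for p, _ in group)  # expected number of survivors
        observed = sum(y for _, y in group)  # observed number of survivors
        # (O - E)^2 / (n * pbar * (1 - pbar)), with pbar = E / n
        stat += (observed - expected) ** 2 / (expected * (1 - expected / size))
    return stat


def adequate_fit(hl_stat, threshold=HL_THRESHOLD):
    """Rule from the table note: an H-L value <= threshold is not rejected."""
    return hl_stat <= threshold


if __name__ == "__main__":
    # Synthetic, well-calibrated data: each outcome is drawn with its own Ps.
    rng = random.Random(0)
    ps = [rng.uniform(0.5, 0.999) for _ in range(5000)]
    survived = [1 if rng.random() < p else 0 for p in ps]
    hl = hosmer_lemeshow(ps, survived)
    print(f"AROC = {aroc(ps, survived):.3f}")
    print(f"H-L  = {hl:.3f}, adequate fit: {adequate_fit(hl)}")
```

Applied to the H–L values in Table 5, `adequate_fit(11.465)` holds for the TRISS model with ISS, whereas `adequate_fit(29.706)` fails for the TRISS model with APS*.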

European Journal of Trauma 2006 · No. 6 © Urban & Vogel 545 Dillon B, et al. Injury Scoring

culated for this group range from 0.100 to 0.999, with a mean of 0.989 and a standard deviation of 0.025.

Table 6. Mortality rate by age and ISS for the 9,055 cases in set-aside group 4.

Age category   Dead              Alive               Total
16–54          4      0.10%      5,423   99.90%      5,427
55–65          4      0.40%      1,111   99.60%      1,115
65–75          16     1.60%      1,016   98.40%      1,032
75–85          40     4.20%      905     95.80%      945
≥ 85           49     9.10%      487     90.90%      536
Total          113    1.20%      8,942   98.80%      9,055

ISS            Dead              Alive               Total
1              3      5.30%      54      94.70%      57
2              4      7.10%      52      92.90%      56
3              0      0.00%      9       100%        9
4              50     0.90%      5,682   99.10%      5,732
5              28     1.60%      1,733   98.40%      1,761
6              3      1.90%      152     98.10%      155
8              8      1.20%      666     98.80%      674
9              14     2.90%      470     97.10%      484
12             3      2.40%      124     97.60%      127
Total          113    1.20%      8,942   98.80%      9,055

Summary
The major conclusion from the analysis described here is that the ISS is reasonably well suited to the TARN database. The discriminatory ability of all the injury scores studied was similar. NISS tends to overstate injury severity in less severely injured patients and, since they form a high proportion of the TARN data (88.1% in the core dataset), it does marginally less well than ISS. It was not possible to calculate the AP and APS scores precisely because of the limitations of the database; however, approximations to these scores highlighted some of their characteristics. In particular, the low H-L values achieved with the ASCOT models provide insight into why the set-aside groups were introduced. Using the same predictors on the same database, the H-L values for the TRISS model are much higher; treatment of the set-aside groups is the only difference between the two models. It would be interesting to compare more accurately calculated AP/APS scores with ISS for a TARN database if they become available.

ASCOT differs from TRISS in the injury scoring method, the use of RTS components as predictors rather than the RTS value, and the definition of four set-aside subgroups, which are excluded from the logistic regression. The results shown here indicate that there are no advantages to using the set-aside groups for the TARN database. Unfortunately, it was not possible to check the effect of using the RTS components instead of the value calculated from equation (1), because the respiratory rate on admission and at the scene was missing from the database for a large proportion of cases. Currently TARN is using GCS in its modelling instead of RTS [1].

Acknowledgements
The first two authors wish to thank the staff at TARN and the participating hospitals for their help, advice and the provision of the database.

References
1. The Trauma Audit & Research Network (TARN). http://www.TARN.ac.uk.
2. Committee on Injury Scaling, Association for the Advancement of Automotive Medicine (AAAM). The abbreviated injury scale 1990 revision. Des Plaines, Chicago: Association for the Advancement of Automotive Medicine.
3. Hosmer DW, Lemeshow S. Applied Logistic Regression. John Wiley & Sons, New York 1989;25, 64, pp 140–5.
4. Baker SP, O'Neill B, Haddon W, Long WB. The injury severity score: development and potential usefulness. Proceedings American Association for Automotive Medicine 1974;18:58–74.
5. Boyd CR, Tolson MA, Copes WS. Evaluating trauma care: the TRISS method. J Trauma 1987;27:370–8.
6. Baker SP, O'Neill B. The injury severity score: an update. J Trauma 1976;16:882–5.
7. Meredith JW, Evans G, Kilgo PD, MacKenzie EJ, Osler T, McGwin G, Cohn S, Esposito T, Gennarelli TA, Hawkins M, Lucas C, Mock C, Rotondo M, Rue LW. A comparison of the abilities of nine scoring algorithms in predicting mortality. J Trauma 2002;53:621–9.
8. Sacco WJ, MacKenzie EJ, Champion HR, Davis EG, Buckman RF. Comparison of alternative ways of assessing injury severity based on anatomic descriptors. J Trauma 1999;47:441–6.
9. Stephenson SCR, Langley JD, Civil ID. Comparing measures of injury severity for use with large databases. J Trauma 2002;53:326–32.
10. Russell R, Halcomb E, Caldwell E, Sugrue M. Differences in mortality predictions between injury severity score triplets: a significant flaw. J Trauma 2004;56:1321–4.
11. Osler T, Baker SP, Long W. A modification of the Injury Severity Score that both improves accuracy and simplifies scoring. J Trauma 1997;43:922–6.
12. Brenneman FD, Boulanger BR, McLellan BA, Redelmeier DA. Measuring injury severity: time for a change. J Trauma 1998;44:580–2.
13. Balogh Z, Offner PJ, Moore EE, Biffl WL. NISS predicts postinjury multiple organ failure better than the ISS. J Trauma 2000;48:624–8.
14. Balogh ZJ, Varga E, Tomka J, Suveges G, Toth L, Simonka JA. The New Injury Severity Score is a better predictor of extended hospitalization and intensive care unit admission than the Injury Severity Score in patients with multiple orthopaedic injuries. J Orthop Trauma 2003;17:508–12.
15. Tay SY, Sloan E, Zun L, Zaret P. Comparison of the New Injury Severity Score and the Injury Severity Score. J Trauma 2004;56:162–4.
16. Copes WS, Champion HR, Sacco WJ, Lawnick MM, Gann DS, Gennarelli TA, MacKenzie EJ, Schwaitzberg S. Progress in characterizing anatomic injury. J Trauma 1990;30:1200–7.


17. Champion HR, Copes WS, Sacco WJ, Lawnick MM, Bain LW, Gann DS, Gennarelli TA, MacKenzie EJ, Schwaitzberg S. A new characterization of injury severity. J Trauma 1990;30:539–46.
18. Markle J, Cayten CG, Byrne DW, Moy F, Murphy JG. Comparison between TRISS and ASCOT methods in controlling for injury severity. J Trauma 1992;33:326–32.
19. Hannan EL, Mendeloff J, Farrell LS, Cayten CG, Murphy JG. Validation of TRISS and ASCOT using a non-MTOS trauma registry. J Trauma 1995;38:83–8.
20. Champion HR, Copes WS, Sacco WJ, Charles F, Holcroft JW, Hoyt DB, Weigelt JA. Improved predictions from ASCOT over TRISS: results of an independent evaluation. J Trauma 1996;40:42–9.
21. MacKenzie EJ, Sacco WJ, and colleagues. ICDMAP-90: A user's guide. Baltimore: The Johns Hopkins University School of Public Health and TriAnalytics Inc, 1997.
22. Osler T, Rutledge R, Deis J, Bedrick E. ICISS: an international classification of disease-based injury severity score. J Trauma 1996;41:380–8.

Address for Correspondence
Wenbin Wang, PhD
Centre for OR and Applied Statistics
School of Accounting, Economics and Management Science
University of Salford
M5 4WT Salford
UK
Phone (+44/161) 2954124, Fax 2954947
e-mail: [email protected]
