<<

Predicting Recidivism Risk: New Tool in Philadelphia Shows Great Promise by Nancy Ritter

Tool uses random forest modeling to identify probationers likely to reoffend within two years of returning to the community.

or some time to come, our between dwindling dollars and cities, counties and states will the lowest possible risk to pub- Fface the tremendous challenge lic safety. The good news is of trying to do their work with fewer that researchers and officials in resources. That challenge is perhaps Philadelphia, Pa., believe they have no more pressing than in the nation’s developed a tool that helps find system, where fiscal that balance. realities demand the downsizing of populations. Seven years ago, criminologists from the University of Pennsylvania In 2009, the Pew Center on the and officials with Philadelphia’s States estimated that 1 in 45 adults Adult and in the U.S. was under some form Department (APPD) teamed up of community correctional supervi- to create a computerized system sion.1 As ever-increasing numbers that predicts — with a high degree of offenders are supervised in the of accuracy — which probationers community — witness the mas- are likely to violently reoffend sive “realignment” of in within two years of returning to — parole and probation the community. departments must find the balance

4 NIJ JOURNAL / ISSUE NO. 271 ■ FEBRUARY 2013

Parole and probation departments must find the “We were asked to develop a new risk-forecasting tool to help balance between dwindling dollars and the lowest the financially strapped probation department tailor their officers’ possible risk to public safety. Researchers and caseloads to the risk level of proba- tioners,” said Geoffrey Barnes, who, with fellow researcher Jordan Hyatt officials in Philadelphia believe they have from Penn’s Jerry Lee Center of , created and evaluated developed a tool that helps find that balance. the tool. “The goal was to ensure that officers who were supervising probationers with a high risk of recidi- vating would have a smaller caseload offender saw his or her probation “When they contacted us, the than officers who were supervising officer about once a month for department’s leaders expressed a folks with a lower risk.” 20-30 minutes. strong desire to reform this policy — to focus more supervision on those The tool — which has been success- “Most of APPD’s probationers were with the largest risk of future vio- fully used in Philadelphia for four supervised under a strategy that lence and devote far fewer resources years — assesses each new proba- mandated only two and a half hours on those who presented little or no tion case at its outset and assigns of interaction per year,” said Barnes. risk of reoffending.” the probationer to a high-, moderate- or low-risk category. Although this is not a new concept, what is unique is that the tool uses “random forest What Is Random Forest Modeling? modeling,” a sophisticated statisti- cal approach that considers the nonlinear effects of a large number andom forest modeling is the technique used by Richard Berk — of variables with complex interac- Rworking with NIJ-funded researchers Geoffrey Barnes and tions (see sidebar, “What Is Random Jordan Hyatt — to build the risk prediction tool for Philadelphia’s Forest Modeling?” on this page). Adult Probation and Parole Department. Random forest modeling Historically, corrections officials — in could best be described as hundreds of individual decision trees. Philadelphia and elsewhere around the country — have used simpler In the simplest statistical terms, here is how it works: Data are orga- statistical methods, such as linear nized using a technique called “classification and regression trees.” regression models, to try to get a The computer then runs an algorithm that selects predictors at random handle on the risk that a probationer and repeats and repeats this process to build several hundred may pose to the community. trees — which then allow the randomly selected predictors to average themselves into a single outcome. In the case of the Philadelphia tool, Random forest modeling, as applied this outcome was assignment to one of three risk categories (high, to criminal , was pioneered by moderate or low) for probation-supervision purposes. criminologist Richard Berk, also at The final NIJ report describes random forest modeling — and the Penn, who acted as a consultant on fine-tuning that the research partnership went through as they built the NIJ-funded project. three iterations of the risk prediction tool — in much more detail (http://www.ncjrs.gov/pdffiles1/nij/grants/238082.pdf). Pre-Random Forest Times Is random forest modeling an improvement over more traditional Before the creation and implementa- actuarial prediction analyses? Barnes and Hyatt say yes. tion of the new risk-forecasting tool, “It allows for the inclusion of a large number of predictors, the use Philadelphia — like many of of a variety of data sources, the expansion of assessments beyond the nation’s parole and probation binary outcomes, and taking the costs of different types of forecasting departments — used a one-size- errors into account,” Barnes said. fits-all supervising strategy. Every

Predicting Recidivism Risk: New Tool in Philadelphia Shows Great Promise | 5 NIJ JOURNAL / ISSUE NO. 271 ■ FEBRUARY 2013

To develop the tool they had in discover it has far more data than it mind, Barnes and Hyatt were “The real achievement realizes — criminal histories in actually “embedded” in APPD. Over of the final model in the court system, local prison the next few years, they built three records and separate police iterations of a model that makes Philadelphia is not that records. virtually instantaneous forecasts regarding offenders who are due it is right two-thirds “Every jurisdiction probably has to be released to the community. access to data that they haven’t even of the time but that it thought about,” said Barnes. “We Since APPD began using on-demand capture so many types of informa- risk forecasting, the agency has han- produces this accuracy tion as a matter of course, as part dled well over 120,000 new “case of the day-to-day routine. I suspect starts,” referring to the time when by balancing the relative few people realize how enormously an offender begins probation. (Note powerful it could be to — with just a that about one-third of the offenders costs of the different few manipulations — convert it into have had more than one probation numbers that could forecast future “case start,” so this number actu- kinds of errors.” behavior.” ally reflects about 72,000 individual offenders.) As they developed the risk prediction tool in Philadelphia, Barnes and Hyatt In 10-15 seconds, the tool assigns mined raw data from six different a new probationer to one of three databases. The team then tested categories. The lowest level of risk is set their own parameters based on hundreds of different predictors assigned to those who are predicted resources and every manner using many different approaches, to not commit any new offense in of policy, operational and political all the while fine-tuning the delicate the next two years. The moderate- reality that the tool is asked balance between APPD’s resources risk level identifies those who are to consider. and the forecasting accuracy that likely to commit a , but not a was achievable. Eventually, three serious one. The high-risk level is for Hyatt offers this analogy: You models went live. The third, Model those who are most likely to commit could take the engine out of a C, has been in operation since a serious crime, which APPD defines custom sports car, but it probably November 2011 and uses 12 of as murder, attempted murder, aggra- wouldn’t work the same way in the strongest predictors of risk of vated assault, rape and arson. another car — and it might not reoffending, including prior jail stays, even work at all. Therefore, another the probationer’s ZIP code and Community supervision is based on jurisdiction using random forest the number of years since the the determined risk level. Probation modeling to build a risk-prediction last serious offense. officers who are supervising high- tool would need its own statisti- risk individuals are given the smallest cians, computer whizzes and Every jurisdiction would be looking caseloads. agency officials working in concert. at its own very unique data set Although many of the same that reflects decisions, made by Getting Started questions would be asked, the people who have long since retired, “answers” — specifically, the out- about what should and should not Although the random forest model comes around which the tool would be rolled over into their next system. developed in Philadelphia can be be designed — would be different. Therefore, it would be especially adapted by other jurisdictions, it is helpful for one of the team members not an off-the-shelf tool. Obviously, The first thing a jurisdiction interested to understand what data were the data are unique to the probation- in creating a random forest risk- taken from an older system — ers who are under APPD supervision. forecasting tool must do is determine be they paper records from jails, And the “outcomes,” or risk-level what data already exist in electronic courts or police, or old computer assignments, are also unique to form. It is very possible, say Barnes records — and used in newer Philadelphia because APPD officials and Hyatt, that a jurisdiction will systems.

6 | Predicting Recidivism Risk: New Tool in Philadelphia Shows Great Promise NIJ JOURNAL / ISSUE NO. 271 ■ FEBRUARY 2013

“You definitely need a computer beginning point can be any moment the grant could be a bit overwhelmed professional on the team from the in the lifespan of an offender’s by random forest modeling. The beginning,” said Barnes. “Ideally, case — when is set, when forecasting tool now being used in this would be someone familiar with charges are filed, at sentencing, Philadelphia, for example, looks at the way the jurisdiction has kept its when the offender enters the correc- 500 decision “trees” (hence random records.” tional system or when the offender “forest”) as it runs a risk assess- first reports for probation. ment of a new probationer. But, they Finally, it is important to be mindful insist, there is no reason that criminal of simple geography. It will come as In Philadelphia, officials chose the justice practitioners should shy away no surprise that data-sharing start of probation and a time hori- from the technology. among jurisdictions in the U.S. zon of two years. The APPD tool is quite limited, particularly in terms therefore predicts the likelihood of “If you think about it,” said Barnes, of the kind of instantaneous fore- a probationer committing a violent “private companies do this every casting that this tool is designed crime within two years of returning day — they crunch data to decide to perform. For example, the APPD to the community. Although any time who’s likely to buy peanut butter, for tool uses criminal history data only period can be used, it is important example, and they send coupons to from Philadelphia; data from other to understand that the accuracy of those folks.” states, and even other parts of forecasting a longer period depends Pennsylvania, are not used, which on the depth of data available. Of course, both researchers are quick means that the forecasts do not to point out that forecasting criminal necessarily indicate each probation- “If, for example, you want to fore- behavior is not coupon clipping, but er’s universal level of risk. cast what is going to happen over the principles of data analysis, they the next five years, you have to use say, are the same. “Offenders who represent a seri- data from at least five years ago and ous danger outside the city of before,” Hyatt said. Determining an Acceptable Philadelphia could very easily be Error Rate forecasted as low risk within these Once the unit of prediction and boundaries, particularly if they usually time horizon are determined, the No prediction tool is perfect. Anyone live, work and offend elsewhere,” next step is to decide what “fore- who has watched a weather fore- Hyatt explained. casting outcomes” the tool should caster predict 8 inches of snow be set up to predict. Researchers — then dealt with crying children The bottom line is that, as in every such as Barnes and Hyatt can guide who have to go to school when only scientifically based endeavor, data practitioners through this process, a dusting falls — knows that predic- are paramount. but the practitioners themselves tions are occasionally wrong. must ultimately make the decisions “The key,” added Barnes, “is to because resources, personnel, The key in building a random forest ensure that all of the data sources operational and even political prediction tool for any aspect of the are immediately available through realities must be considered. In system is balanc- the agency’s data network, although Philadelphia — after weeks of ing the risk of getting it wrong. This it is important to note that the data examining caseloads and staff- process involves determining, in do not need to be up-to-the-minute ing levels — officials decided that advance, an acceptable error rate. accurate to be useful.” approximately 15 percent of their And this demands intensive col- probation population should be laboration between researchers and Forecast Begin- and End-Points classified as high risk, 25-30 percent practitioners, one in which agency as moderate risk, and 55-60 percent officials — not statisticians — must After dealing with the availability as low risk. make crucial policy decisions. In of data, the next step is to determine particular, this means determining when the forecasting begins (called Barnes and Hyatt acknowledge that prespecified levels of “false posi- the “unit of prediction”) and when someone picking up the final report tives” to “false negatives.” it ends (the “time horizon”). The they submitted to NIJ at the end of

Predicting Recidivism Risk: New Tool in Philadelphia Shows Great Promise | 7 NIJ JOURNAL / ISSUE NO. 271 ■ FEBRUARY 2013

A false negative is an actual high- a new serious offense within the risk person who was mistakenly “Random forest two-year forecast period than either identified as moderate or low risk. modeling allows you to low- or moderate-risk probation- A false positive is an actual low- or ers. The NIJ report is available at moderate-risk person who was add different variables http://www.ncjrs.gov/pdffiles1/nij/ identified, and therefore supervised, grants/238082.pdf. as high risk. without sacrificing your All three iterations of the Philadelphia As the practitioners work side-by- ability to make accurate model were validated using a sample side with the researchers to set of probation cases from 2001, which these parameters, they will inevi- predictions.” gave the researchers a 10-year tably encounter the need to make period in which to assess the long- tradeoffs they can live with. This term offending of the probationers. is referred to as the “cost ratio.” That said, of course, any forecasting Before the risk prediction tool can or the cost ratio. Every jurisdiction tool, including this one built using be built, the numerical ratio of these that wants to build a random forest random forest modeling, will make costs must be approximated. It is model prediction tool must commit mistakes. not enough to simply say that false to this very delicate balancing act — negatives are generally more costly one in which researchers can assist, But, said the researchers, when it than false positives. Rather, an actual but that, in the end, requires practi- comes to figuring out how to be value must be provided. tioners to do the heavy lifting. more effective in using corrections system dollars, everyone should Here is how Hyatt explained the “I cannot emphasize this enough,” understand that choices will always process in Philadelphia: “Basically, Barnes added. “Balancing these have to be made — and the goal is we had to determine precisely how different types of errors with the to make the most accurate choices much more costly it would be to model’s overall accuracy rate is not in as cost-effective a manner as mistakenly classify a probationer in a the job of the team’s statisticians. possible. lower-risk category who then went Because an agency’s leadership has on to commit a serious crime than to live with the consequences of any As Barnes put it, “The real it would be to intensely supervise error that occurs once the forecast- achievement of the final model someone who is actually a low-risk ing tool goes live, they must decide in Philadelphia is not that it is probationer because the tool had what level of accuracy they can live right two-thirds of the time but assessed him as high risk.” with and the balance of potential that it produces this accuracy by errors they prefer.” balancing the relative costs of the Most jurisdictions that contemplate different kinds of errors.” building a random forest risk- Accuracy prediction tool would likely do “The point,” Hyatt added, “is that what they did in Philadelphia: set The model that has been used in random forest modeling allows a higher relative cost for false Philadelphia for just over a year you to add different variables with- negatives than for false positives. (Model C) has an accuracy rate of out sacrificing your ability to make Philadelphia’s APPD decided on a 66 percent when considering all accurate predictions. By working cost ratio where false negatives three (high-, moderate- and low-risk) hand-in-hand with their practitioner were 2.6 times more costly than categories. In their final report to NIJ, and policymaker partners, research- false positives. But any jurisdiction Barnes and Hyatt offer a detailed ers can come up with the right ratio that wishes to design and implement account of the development and of variables that work in their own a similar tool would have to deter- accuracy of the three generations unique jurisdiction, both from a mine its own cost ratio or error rate. of risk-prediction models, including practical standpoint in terms of the much more detail about the sepa- data that are available and from a As Barnes and Hyatt noted, there is rate accuracy rates for the three risk standpoint of political and policy no single ‘right answer’ in choosing categories; for example, probationers exigencies which decision-makers the unit of prediction, the time hori- who were categorized as high risk are comfortable putting into a zon, the definition of outcomes are 13 times more likely to commit forecast tool.”

8 | Predicting Recidivism Risk: New Tool in Philadelphia Shows Great Promise NIJ JOURNAL / ISSUE NO. 271 ■ FEBRUARY 2013

It is important to understand that the Another advantage of random for- NIJ-funded project discussed in this Proponents say one est modeling is its ability to identify article looked only at the creation of the most compelling highly nonlinear effects for each and effectiveness of the prediction individual predictor. Consider, for tool itself — not at the effectiveness reasons for building a risk example, the bivariate relationship of the subsequent supervision or between a soon-to-be-probationer’s treatment of APPD probationers. In analysis prediction tool age and the likelihood that the tool other words, the project did not, for would forecast him to be high risk. example, consider whether (and to using random It is not surprising that the youngest what extent) intense supervision and probationers in Philadelphia were exposure to more aggressive inter- forest modeling is the forecast to present the greatest dan- ventions may have caused a high-risk ger of a serious-crime reoffending. probationer to not commit another simple matter of However, the random forest analysis serious crime. fiscal resources. also showed something else. The Benefits of Random “A bit more surprising is how quickly Forest Modeling the probability of a high-risk forecast without losing predictive capacity. dropped as the offender got just a One of the most compelling attri- Indeed, it is this feature that can help few years older,” said Hyatt. “By butes of random forest modeling bring researchers, practitioners and the time the incoming probationer is that — unlike linear regression even politicians to the same table turns 27, the likelihood of receiving a analyses — it is not necessary to while the tool is being developed. high-risk forecast is not appreciably know in advance what data will be It helps garner buy-in, as it were, different from that of a 40-year-old — useful in predicting behavior or which from skeptics. and, after the age of 40, the amount variables will affect the predictive of risk seems to drop once again until power. In more traditional statistical “Adding variables that individual it reaches a level that is effectively procedures, only a limited number of stakeholders cared about — even zero at age 50 and beyond.” predictors are used to try to forecast if we, as criminologists, didn’t think future behavior. But random forest they would have much predictive Resources, Equitability modeling does not require users to power — helped our APPD partners and Fairness be so choosy. feel that we were hearing them and responding to their concerns,” said Why should officials in the criminal The tool can be programmed to Barnes. “This feature helped them justice system think about building simply not consider a factor based get behind what we were trying to a risk analysis prediction tool using on other variables. In other words, do as we built the forecasting tool, random forest modeling? Proponents data can be “over-included,” and the and, importantly, it helped everyone say one of the most compelling tool will simply filter them out. For understand the risks that the policy- reasons is the simple matter of example, the tool may say, “I don’t makers, in particular, faced.” fiscal resources. see much of a juvenile record for this individual, but I do see, from an The bottom line is that any data can “We just do not have the ability earlier branch in the tree, that this be used in a random forest tool, to pay for the most intensive level person is 60 years old, so I wouldn’t depending on the wishes of officials of supervision for every probationer,” expect to find much of a juvenile and other key players. Data that may said Barnes. “We don’t have the record; but, regardless, now that be statistically unimportant — but ability to every he is 60, this is probably not a very politically important — can be built to life. We have to be very careful important factor now.” into the tool. For example, a juris- about how we allocate precious diction might want to consider the resources and, for public-sector This is not the case with regression number of a probationer’s violent workers — be they probation equations, where every time another co-offenders; although APPD ended officers, police officers or correc- variable or predictor is added, some- up not using that data in its tool, tions officers — the most precious thing is lost. With random forest another jurisdiction may find such resource is time.” modeling, variables can be added data predictive.

Predicting Recidivism Risk: New Tool in Philadelphia Shows Great Promise | 9 NIJ JOURNAL / ISSUE NO. 271 ■ FEBRUARY 2013

The random forest model prediction analysis,” Barnes said. “Two tool, he said, allows agencies to base The random forest offenders with the same data their personnel and policy decisions model prediction tool values — even if they come from on a scientifically proven method. different parts of the city, even if allows agencies to they are different kinds of people — Another reason to consider con- will go through the same scoring structing such a sophisticated base their personnel process in the same way.” prediction tool is that, quite frankly, “prediction,” in some form or and policy decisions on “And that,” Barnes argued, “is a another, is already occurring. far sight more equitable than a proba- Everyone involved in the criminal a scientifically proven tion officer perhaps taking a dislike justice system — from judges to pro- to you and deciding that you need bation officers, from police leaders method. to come in more frequently because to politicians who write the laws and you remind him of somebody who determine budgets — is making judg- victimized a close relative a few ments, essentially predictions, about a similar risk of reoffending could be months ago.” the relative risk of an offender. — and therefore likely are — treated in disparate ways based on who their This is not to say, however, that Researchers Hyatt and Barnes probation officer is and any number human judgments don’t play a role. believe that by using random forest of other factors. modeling to build the actuarial risk- “Human judgments are important,” assessment tool for Philadelphia’s However, in addition to ensuring that Hyatt added. “But one thing that has APPD, they have ensured that offenders are assessed consistently been consistently found every time those predictions are being made in terms of their risk level, the tool that this sort of technology has been in the fairest, most equitable way being used in Philadelphia — and used to forecast human behavior is possible. the policy decisions that APPD has that these actuarial decision-making put into place to operationalize the models do a better job — and “Using random forest modeling gave results — ensures that offenders produce more accuracy in a more us the assurance that we made use who are identified as being at a consistent fashion — than human of the best science available to iden- certain risk level are all treated the gut reactions ever could.” tify the most dangerous offenders,” same. Every probationer whom the said Barnes. “It has ensured that tool scores as high risk is treated As with any kind of new technology- we’re preserving resources and that under the same high-risk protocol; based tool, however, there is an the people who are subject to the this standardizes both their report- inevitable intersection of science and policy decisions based on those ing requirements and the rules that human nature — including ethics — risk assessments are being treated they have to follow — including, that must be grappled with. in a fair and consistent way.” of course, any likelihood that they will be sanctioned for a technical For example, some have argued that “You may not like being on high-risk violation. using some variables, such as an probation,” he added, “but from a offender’s ZIP code — particularly procedural justice standpoint, you This equitability is something that in a city as highly segregated as at least know that the decision was researchers Barnes and Hyatt — and Philadelphia — can be a proxy for made the same way for everybody.” the probation professionals who have race. Others note that individuals been successfully using the tool — who are categorized as high risk and Under one-size-fits-all procedures believe in. therefore more intensely supervised being used in many jurisdictions are probably going to incur more around the country, probation “Because every probationer is technical violations of the terms of officers are given an enormous put into the same model, the same their parole. Certainly, just as any amount of discretion. This means decision points will be hit as the policy decision that has moral and that probationers who actually have model produces the risk-category ethical ramifications (and most do),

10 | Predicting Recidivism Risk: New Tool in Philadelphia Shows Great Promise NIJ JOURNAL / ISSUE NO. 271 ■ FEBRUARY 2013

“One thing that has The Role of Ethics in Statistical Forecasting been consistently

he ethical considerations prohibit their use in “higher- found every time that Tinherent in trying to predict stakes” decision-making such as this sort of technology future events — such as criminal sentencing? offending — are not new. Indeed, Furthermore, some note, aren’t has been used to as the NIJ-funded researchers the age of criminal-behavior onset, who worked on the Philadelphia possession of a juvenile record or forecast human behavior risk assessment tool point out, the neighborhood a person resides one of the reasons some offend- in (factors that could be used as is that these actuarial ers are sentenced to longer prison prediction variables) all “extraju- terms is to prevent that dicial” factors? As such, should decision-making models they might commit if they were they be considered in an individual not incarcerated. criminal justice decision? do a better job — Geoffrey Barnes and Jordan Considering potential “collateral and produce more Hyatt, from the University of consequences” of decision- Pennsylvania, believe that ran- making based on a forecasting accuracy in a more dom forest modeling offers a tool is also an important part of different — and potentially more the process. As mentioned in consistent fashion — accurate — approach for building a the main article, for example, prediction tool. Nonetheless, they Philadelphia’s Adult Probation than human gut recognize the ethical crux that and Parole Department used the lies at the heart of building such a random forest prediction tool to reactions ever could.” tool: deciding which “predictors,” identify offenders who were at a or fact variables, are acceptable high risk of committing a serious to use. crime in the two years following it is important that these issues are In their final report, for example, return to the community — and clearly understood and squarely they ask, “Would it ever be these people were supervised addressed (see sidebar, “The Role permissible … to include an more closely, under more strin- of Ethics in Statistical Forecasting,” offender’s racial background gent parole terms and conditions. on this page). as a predictor variable in one This could increase the likelihood of these models? If not, what that technical violations of their The Key: A Strong Partnership about the use of predictors such parole would be more likely to be Barnes and Hyatt emphasize that as residential location or familial detected and punished, includ- building the random forest prediction circumstances, which could indi- ing imposing additional custodial tool in Philadelphia was a tremen- rectly communicate the offender’s sanctions. dously iterative process — and one racial identity into the forecasting There are no easy answers that required day-to-day collaboration model?” to these questions, but they with APPD. Would it be permissible to use will have to be addressed controversial predictors in “lower- head-on as increasingly tech- “You don’t put all the data into the stakes” forecasting models — to nologically advanced forecasting computer the first time and hit the control admission into a treatment methods become available for button and say, ‘OK, we're done,’” program or govern supervision use in our nation’s criminal Barnes said. “The model comes decisions, for example — but justice system. out and you look at it. Everyone sits down around a table and discusses

Predicting Recidivism Risk: New Tool in Philadelphia Shows Great Promise | 11 NIJ JOURNAL / ISSUE NO. 271 ■ FEBRUARY 2013

it. The statisticians describe the 5. Choose the predictor variables problems they faced. The database A tool like the one to be used, based on theo- guys look at it and say, ‘Well, yes, developed in Philadelphia retical, practical and policy but you are using this variable in the considerations. wrong way,’ and the practitioners provides an opportunity 6. Build a single database file. look at it and say, ‘We really can’t have 35 percent of our caseload to advance the 7. Estimate the relative costs on high-risk supervision. It’s not of false positives and false going to work. That number has capabilities of the negatives, keeping in mind that got to come down.’” agency leadership must value criminal justice system the relative weight of these This gets at the constantly evolving inaccuracies. nature of the random forest tool. to protect communities, 8. Build an initial model and evalu- ate the results. “You constantly are building new particularly for things to try to deal with changes 9. Adjust the model to reflect in the environment, changes in the jurisdictions with large policy-based concerns regard- data, changes in what people think ing accuracy and proportional are predictive, changes in chronologi- probation populations assignment to risk categories; cal theory over time,” Hyatt noted. construct additional test models that must be managed where required. Recommendations from with fewer dollars. 10. Produce forecasts for offenders the Research already in the agency’s caseload. Given the need to balance fiscal able to concentrate resources on a 11. Create the user interface and realities with an overarching mission small number of probationers who back-end software to produce to protect public safety, criminal require more active supervision, live forecasts. justice professionals are beginning rather than on those who are unlikely to look — with the same creativity 12. Continuously monitor the results to reoffend regardless of how they and vigor as private-sector of the live forecasts. are supervised. professionals — at sophisticated statistical tools to solve problems. Again, it is important to under- In their final report, Barnes and Hyatt Therefore, it is likely that risk- stand that the Philadelphia tool was recommend 12 steps that could prediction tools using random based on probationers who live in serve as a blueprint for a jurisdiction forest modeling may play an Philadelphia. Needless to say, people that is considering building a random important role in the future of in other jurisdictions may be differ- forest model risk prediction tool: our criminal justice system. ent in key ways — and crime trends vary in different parts of the country 1. Obtain access to reliable data A tool like the one developed in and even in different parts of a state. that are consistently and elec- Philadelphia provides an opportunity Therefore, a tool that uses random tronically available. to advance the capabilities of the forest modeling must be based on criminal justice system to protect 2. Define the unit of prediction the best available data about the communities, particularly for jurisdic- and time horizon. population whose behavior is being tions with large probation populations predicted. 3. Define the outcome risk that must be managed with fewer categories. dollars. Indeed, as the Philadelphia Finally, say proponents, because project demonstrated, the random 4. Consider the practical implica- random forest modeling can be forest-based risk-prediction tool has tions for a risk-based supervision tailored to specific needs, research- helped probation officials manage strategy and ensure adequate ers and practitioners should not limit cases more efficiently. For nearly resources based on the distribu- their thinking to urban probationers, four years now, they have been tion of risk scores. such as those with whom the team worked in Philadelphia. Random

12 | Predicting Recidivism Risk: New Tool in Philadelphia Shows Great Promise NIJ JOURNAL / ISSUE NO. 271 ■ FEBRUARY 2013

forest modeling may prove useful certainly have already employed, in managing prison populations, for convert the data into a usable format, example. Or, said Barnes, perhaps and go ahead and build the model,” officials in another jurisdiction are he added. “Give it a shot.” interested in looking at the pretrial behavior of people who have merely been charged with an offense. About the author: Nancy Ritter is a writer and editor at NIJ. These would present entirely differ- ent environments, of course. NCJ 240696

“But,” Barnes noted, “the chances are that a jurisdiction has the data Notes Watch researchers Geoffrey to build other kinds of prediction 1. Pew Center on the States, One in Barnes and Jordan Hyatt models.” 31: The Long Reach of American discuss the Philadelphia risk Corrections, Washington, D.C.: assessment tool: http://www. “You just have to make the contact Author, 2009, http://www. nij.ncjrs.gov/multimedia/ with somebody with reasonable pewstates.org/uploadedFiles/ video-barnes-hyatt.htm. PCS_Assets/2009/PSPP_1in31_ statistical skills, use the database report_FINAL_WEB_3-26-09.pdf professionals who you almost (accessed November 27, 2012).

Software Tools, Apps and Databases

IJ has funded research and development that has resulted in more than 50 free or low-cost software tools, apps and databases to assist with investigations and research — and now Nthey are all gathered in one place on NIJ.gov. Tools available in NIJ’s online catalog include:

▼ Video Previewer: A program that assists investigations involving time-consuming video review by quickly processing the video and showing key frames in a PDF.

▼ U.S. Y-STR Database: An online searchable listing of 11- to 17-locus Y-STR haplotypes, which are required to provide a statistical estimate of a match’s significance.

▼ 3D-ID: A Java program that provides geometric morphometric tools to help assess the sex and ancestry of unidentified cranial remains.

▼ CrimeStat 3.4: A spatial statistical program used to analyze crime locations and identify hot spots.

To find other software tools, apps and databases, browse our catalog at http://www.nij.gov/topics/ technology/software-tools.htm.

Predicting Recidivism Risk: New Tool in Philadelphia Shows Great Promise | 13