Based on the work of two winning teams in the U.S. Justice Department’s National Institute of Justice’s (NIJ’s) Recidivism Challenge, this report presents and discusses findings pertinent to forecasting recidivism for a group of parolees (male and female) in Georgia for each of 3 years on parole.
One report is from Team Crime Free, which produced the winning solution for male, female, and overall categories of parolees in Year One. It notes that the key to the competition was using all available information about each parolee and using large ensembles to stack these predictions forward. Models that predicted other years, beyond each round’s target, extracted meaningful and relevant combinations of features that are also predictive of the target for each round. The overall ensemble for each round consisted primarily of gradient-boosted trees and neural networks. Massive ensembles of gradient-boosted trees, using wide and distinct sets of hyperparameters, were the most important models in predicting recidivism in years 2 and 3. The second report is by a representative of Team TrueFit, which was awarded the winning solution for male, female, and overall categories in the second year. This report advises that classification based on quantitative data is primarily about feature engineering and model ensembling. The former encodes and enriches patterns in the data, while the latter produces robust predictions even from a limited amount of data. The winning solution included heavy amounts of each. A roadmap is provided to this solution, along with source code and guidance on how to produce similar results on future datasets. The third report provides insight from Team TrueFit’s winning solution for female, runner-up for male, and overall in year three recidivism prediction solution for male, female, and overall categories. The final ensemble in each round included neural networks, with multi-path MLPs and gradient-boosted trees, with a wide variety of hyperparameter settings. All models were fed into a linear model for averaging. in particular. an elastic net model where all weights must be greater than or equal to zero.
