Using AI to Predict Criminal Offending: What Makes it ‘Accurate’, and What Makes it ‘Ethical’.
Jonathan Pugh
Tom Douglas
The Durham Police force plans to use an artificial intelligence system to inform decisions about whether or not to keep a suspect in custody.
Developed using data collected by the force, the Harm Assessment Risk Tool (HART) has already undergone a two-year trial period to monitor its accuracy. According to media reports, predictions of low risk were accurate 98% of the time over the trial period, whilst predictions of high risk were accurate 88% of the time. HART has not so far been used to inform custody sergeants’ decisions during this trial period, but the police force now plans to take the system live.
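To make concrete what figures of this kind measure, here is a minimal sketch in Python. It assumes a simplified two-category outcome (high risk versus low risk), and the counts are hypothetical, chosen only so that the predictive values come out at the reported 98% and 88%; they are not HART's trial data. The function name `predictive_values` is ours, for illustration only.

```python
def predictive_values(tp, fp, tn, fn):
    """Return (ppv, npv, sensitivity, specificity) for a binary risk prediction.

    tp: predicted high risk, did go on to offend
    fp: predicted high risk, did not offend
    tn: predicted low risk, did not offend
    fn: predicted low risk, did go on to offend
    """
    ppv = tp / (tp + fp)            # how often a "high risk" prediction is right
    npv = tn / (tn + fn)            # how often a "low risk" prediction is right
    sensitivity = tp / (tp + fn)    # share of actual offenders flagged as high risk
    specificity = tn / (tn + fp)    # share of non-offenders flagged as low risk
    return ppv, npv, sensitivity, specificity


# Hypothetical counts (NOT from the HART trial), picked to reproduce 88% and 98%.
tp, fp, tn, fn = 88, 12, 980, 20
ppv, npv, sensitivity, specificity = predictive_values(tp, fp, tn, fn)

print(f"High-risk predictions correct: {ppv:.0%}")
print(f"Low-risk predictions correct:  {npv:.0%}")
print(f"Offenders correctly flagged:   {sensitivity:.1%}")
print(f"Non-offenders correctly cleared: {specificity:.1%}")
```

On these made-up numbers, the same tool that is right about 88% of its high-risk predictions flags only around 81% of those who go on to offend. The two figures answer different questions, so a headline accuracy rate does not by itself tell us which kind of error a tool tends to make.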
Given the high stakes involved in the criminal justice system, and the way in which artificial intelligence is beginning to surpass human decision-making capabilities in a wide array of contexts, it is unsurprising that criminal justice authorities have sought to harness AI. However, the use of algorithmic decision-making in this context also raises ethical issues. In particular, some have been concerned about the potentially discriminatory nature of the algorithms employed by criminal justice authorities.
These issues are not new. In the past, offender risk assessment often relied heavily on psychiatrists’ judgements. However, partly due to concerns about inconsistency and poor accuracy, criminal justice authorities already use algorithmic risk assessment tools. Based on studies of past offenders, these tools use forensic history, mental health diagnoses, demographic variables and other factors to produce a statistical assessment of re-offending risk.
Beyond concerns about discrimination, algorithmic risk assessment tools raise a wide range of ethical questions, as we have discussed with colleagues in the linked paper. Here we address one that is particularly apposite with respect to HART: how should we balance the conflicting moral values at stake in deciding the kind of accuracy we want such tools to prioritise?