Study finds Minority Report-style AI used by courts for the last 20 years to predict criminal repeat offenders is no more accurate than untrained people guessing

Monday, January 29, 2018 by

A new research paper, which studies the effectiveness of artificial intelligence (AI) in determining the likelihood of a convicted felon committing another crime, suggests that the software-based solutions available might not be so intelligent after all. To be more specific, the research shows that dedicated AI programs designed to calculate the risk posed by convicted individuals can perform worse than untrained humans when it comes to judging their so-called reoffending risk. A co-author of the study says that this should call the use of such programs into question.

According to Hany Farid, a professor of computer science at Dartmouth College in New Hampshire and a co-author of the study, there’s just too much on the line for the authorities as well as the public to rely on an evidently ineffective algorithm. “The cost of being wrong is very high and at this point there’s a serious question over whether it should have any part in these decisions,” he said in a statement published in The Guardian. He conducted the research with undergraduate student Julia Dressel.

The algorithm in question is Compas (Correctional Offender Management Profiling for Alternative Sanctions), which is used in U.S. courts to determine whether defendants who are awaiting trial or sentencing are at “too much risk of reoffending” to be released on bail.

Compas was first developed in 1998, and in the almost two decades since, it has been used on more than one million defendants. Until now, the efficacy of its calculations and predictions has hardly been called into question. However, the evidence put forward by the researchers is hard to ignore.

The researchers analyzed the performance of Compas as follows. They let it perform its routine procedure, which involves combining a total of 137 different measures for every individual being assessed. They then recorded the scores it gave to more than 7,000 pretrial defendants whose records were taken from a Broward County, Florida database. These scores were then compared with predictions made by an unspecified number of untrained workers, all of whom were contracted through Amazon’s online crowdsourcing marketplace, Mechanical Turk. Unlike Compas, the untrained workers were given only seven variables instead of 137.

What the researchers found was rather shocking. In the first pass, they found that the humans were accurate in 67 percent of the cases assessed, compared with Compas, which was accurate in only 65 percent of the cases assessed. Meanwhile, a second analysis showed that Compas’s accuracy in predicting recidivism could be matched by a simple calculation that required only an offender’s age and number of prior convictions.
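To give a rough sense of what a "simple calculation" based on age and prior convictions could look like, and how an accuracy percentage is tallied, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Defendant` record, the `predict_reoffend` rule, its weights and cutoff, and the six toy records are all invented, and none of it is taken from the study or from Compas itself.

```python
from dataclasses import dataclass


@dataclass
class Defendant:
    age: int          # age at assessment
    priors: int       # number of prior convictions
    reoffended: bool  # observed outcome (invented for this sketch)


def predict_reoffend(age: int, priors: int) -> bool:
    """Hypothetical two-variable rule, NOT the study's model or Compas.

    More priors push the score up; each decade past 18 pulls it down.
    The weights and the 1.0 cutoff are arbitrary illustrative choices.
    """
    score = priors - (age - 18) / 10
    return score > 1.0


def accuracy(records: list[Defendant]) -> float:
    """Fraction of records where the prediction matches the outcome."""
    correct = sum(
        predict_reoffend(r.age, r.priors) == r.reoffended for r in records
    )
    return correct / len(records)


# Invented toy records, used only to exercise the functions above.
toy_records = [
    Defendant(age=19, priors=4, reoffended=True),
    Defendant(age=45, priors=0, reoffended=False),
    Defendant(age=23, priors=2, reoffended=True),
    Defendant(age=52, priors=3, reoffended=False),
    Defendant(age=30, priors=1, reoffended=True),
    Defendant(age=38, priors=0, reoffended=False),
]

if __name__ == "__main__":
    print(f"accuracy on toy records: {accuracy(toy_records):.2f}")
```

The figures reported in the study (67 percent for the human workers, 65 percent for Compas) are the same kind of tally: the share of cases in which a prediction matched what actually happened.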

To make a long story short, untrained humans perform at least as well as the software designed to do this particular job, and it doesn’t look like the software is going to get better any time soon. But the mere fact that its effectiveness has now been called into question should go a long way towards fixing the situation.

“When you boil down what the software is actually doing,” said Farid, “it comes down to two things: your age and number of prior convictions.” So it should be no surprise that untrained workers were able to match it and even outperform it. However, he also says, “As we peel the curtain away on these proprietary algorithms, the details of which are closely guarded, it doesn’t look that impressive. It doesn’t mean we shouldn’t use it, but judges and courts and prosecutors should understand what is behind this.”

And maybe with that, the ones who are responsible for making these algorithms can do a better job and make police work that much easier.
