Risk assessment algorithms in the New Zealand criminal justice system

31 March 2022 05:07


This article was originally published in the New Zealand Law Journal – Issue 328


Across the world, police officers, judges, and parole boards are turning to sophisticated algorithmic risk assessments to guide increasingly important decisions. Unknown to many, New Zealand already uses a risk assessment algorithm, the ROC*ROI measure, to assist with sentencing and parole decisions (Colin Gavaghan and others Government Use of Artificial Intelligence in New Zealand (New Zealand Law Foundation, 2019)). This article will assess the use of the ROC*ROI measure from both a normative and practical perspective. This is a difficult task as the arguments for and against risk assessment tools in the criminal justice system are plentiful and, ironically, often cover the same issues: biases, accuracy, consistency, and transparency.

This article will begin by introducing risk assessment tools, in particular the ROC*ROI measure and also the United States analogue, COMPAS (Correctional Offender Management Profiling for Alternative Sanctions). In doing so, it will explore the risks and benefits of including ‘dynamic variables’ into a risk assessment model. Although dynamic variables provide a more complete picture of the offender, commentators have cautioned that some dynamic factors can act as a proxy for impermissible variables such as race or class. However, this can be equally true for static variables and, despite their risks, many dynamic variables do inform the likelihood of reoffending without acting as a proxy for race or class. The focus of this analysis should not be “should we include dynamic or static variables?” but instead “what variables should we include, regardless of their nature?”.

As static and dynamic variables both reflect human actions, they are equally capable of perpetuating biases that occur at each stage of the criminal justice process. However, removing the ROC*ROI measure will not remove these biases, as judges often look to the exact same factors. The correct approach is to embrace the ambitious task of addressing these biases at their source, thereby decontaminating those inputs that compromise the efficacy of ROC*ROI. This is a difficult task, of course, but one that will have a lasting, meaningful impact.

Risk assessment tools claim to provide consistency by ensuring that two defendants with identical inputs will receive an identical risk assessment score. If properly understood and used correctly, this can improve consistency across the criminal justice system. However, such consistency can only be obtained if the tool accurately predicts future offending. The international experience illustrates that some risk assessment tools prove superior to judges at this task, whereas others are no better than random volunteers recruited from the internet. The Department of Corrections, which developed and now operates the ROC*ROI, does not release any data demonstrating ROC*ROI’s accuracy, leaving the public to trust blindly that it is accurately predicting future reoffending. Despite this lack of transparency, positive trends are afoot. In July 2020, the New Zealand Government released the Algorithm Charter for Aotearoa New Zealand (the Algorithm Charter), with the Department of Corrections quickly becoming a signatory (James Shaw “New Algorithm Charter a world-first” (press release, 28 July 2020)). A world first, the Charter will require the Department of Corrections to make a number of commitments relating to transparency, partnership, people, data, privacy, ethics and human rights, and human oversight (New Zealand Government “Algorithm Charter for Aotearoa New Zealand”).

Although the ROC*ROI variables are publicly available, they are difficult to locate. Further, due to its complexity, the ROC*ROI measure operates beyond the comprehension of most New Zealanders, including defendants, defence counsel, and judges. The Algorithm Charter will require the Department of Corrections to provide plain English information about how the data is collected and processed and how the score is calculated.

This will allow judges to be better informed when using ROC*ROI and ensure that defendants understand what factors have been used to predict their likelihood of reoffending.


Risk predictions are nothing new in the criminal justice system. In fact, there have been approximately four generations of risk assessment tools over the past century (Danielle Kehl, Priscilla Guo, and Samuel Kessler Algorithms in the Criminal Justice System: Assessing the Use of Risk Assessments in Sentencing (Harvard Law School, 2017)). In the first generation, risk assessment was conducted on a case-by-case basis, with correctional staff generally relying on their own professional judgement. This was convenient and flexible but lacked transparency and consistency, while also being highly susceptible to biases. The second and third generations then introduced static and dynamic variables respectively. The fourth and current generation embraces a more systematic and comprehensive approach with the assistance of ‘big data’. Contemporary risk assessment tools use computer modelling to combine factors that correlate strongly with reoffending, weighting them with increasingly complex algorithms (Melissa Hamilton “The Biased Algorithm: Evidence of Disparate Impact on Hispanics” (2019) 56 Am Crim L Rev 1553).

For many people, these algorithms bring a sense of unease as they remove, at least in part, the human element of some of society’s most important decisions. This aversion to algorithms is rooted in the strong preference that many people have for “the natural over the synthetic”. This prejudice against algorithms is magnified when the decisions are particularly consequential (Daniel Kahneman Thinking, Fast and Slow (Penguin Books, London, 2011) at 228). Judges are not infallible but, in many circumstances, there is some solace to be gained by experiencing the human process of a criminal trial. At the very least, the offender can pinpoint exactly who is deciding their fate. By contrast, there is something innately unsettling about removing a fellow individual’s liberty based, at least in part, on the outcome of an impersonal and foreign algorithm.

These concerns must then be reconciled with the claims of some commentators who believe that algorithms hold “great promise for making our criminal justice system more efficient, equitable, and just” (Arthur Rizer and Caleb Watney “Artificial Intelligence Can Make our Jail System More Efficient, Equitable, and Just” (2018) Texas Review of Law and Politics 182). This article assesses the validity of these claims with particular reference to New Zealand’s own risk assessment tool: the ROC*ROI measure.


In 2001, the Department of Corrections introduced the ROC*ROI measure. The ROC*ROI measure aims to predict whether a particular offender will, in the next five years, be convicted of a crime that will result in imprisonment. The algorithm assesses 32 inputs and uses a statistical method called logistic regression to calculate the “Risk of ReConviction” and “Risk of Imprisonment” (Leon Bakker, David Riley and James O’Malley Risk of Reconviction: Statistical Models which predict four types of reoffending (Department of Corrections, 1999)). These two measures are then multiplied, generating the final likelihood that the individual will be reconvicted in the future and sentenced to a term of imprisonment for that offence. For example, if the individual has a very high “Risk of ReConviction” (say 90 per cent) but a very low “Risk of Imprisonment” (say 10 per cent), the chance of being both reconvicted for an offence, and being sent to prison for that offence, would be 9 per cent. (The full ROC*ROI algorithm is available at: www.fyi.org.nz/request/501/response/4002/attach/4/Attachment%20C5931.PDF.pdf ). The ROC*ROI score is then used alongside traditional tools such as probation reports and psychologist reports to determine the offender’s sentence or suitability for parole (Gavaghan and others, above, at 20).
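The arithmetic of the combined score can be sketched briefly. The following Python snippet is purely illustrative: the linear scores are invented (the real ROC*ROI model weights 32 inputs with coefficients that are not reproduced here), and it shows only how two logistic-regression probabilities are multiplied into a single measure.

```python
import math

def logistic(z):
    # Standard logistic function: maps a linear score to a probability in (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical linear scores from two separately fitted logistic regressions.
# These numbers are illustrative only, not the actual ROC*ROI coefficients.
risk_of_reconviction = logistic(2.2)   # ~0.90
risk_of_imprisonment = logistic(-2.2)  # ~0.10

# The two probabilities are multiplied to give the combined ROC*ROI score
roc_roi = risk_of_reconviction * risk_of_imprisonment
print(round(roc_roi, 2))  # ~0.09, i.e. a 9 per cent combined risk
```

As the example shows, an offender can be almost certain to be reconvicted yet still receive a low combined score if the predicted offending is unlikely to attract imprisonment.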

There are two primary approaches to developing algorithms that predict reoffending. The first is to focus solely on static variables relating to the offender and their criminal history, as it is argued that past behaviour is the best predictor of future behaviour. The second approach looks at static variables and also dynamic variables that may have an association with offending (for example, gang associations, employment status, or location of residence). The ROC*ROI measure adopts the first approach, incorporating only static variables, including: number of convictions, age of first offence, time since last conviction, and total time spent in prison. On the other hand, many international risk assessment tools include both static and dynamic variables. First developed in 1998, COMPAS is the most widely used risk assessment algorithm in the United States (Hamilton, above, at 1554). COMPAS uses static variables similar to those used by ROC*ROI, but also data collected through a questionnaire completed by the defendant. The questionnaire includes 137 questions canvassing the following topics: criminal history, family history, peers, residence, education, vocation, recreation, criminal personality, and anger (Northpointe “COMPAS Risk Assessment” (COMPAS Questionnaire, 2016)).

Many commentators support the use of dynamic factors as they give “the best indication of whether an offender is likely to reoffend soon” and avoid an over-reliance on ‘tombstone’ predictors that are out of the control of the defendant (Edward Latessa and Brian Lovins “The Role of Offender Risk Assessment: A Policy Maker Guide” (2010) 5 International Journal of Evidence-based Research, Policy, and Practice 203 at 205). However, other commentators have criticised the way many risk assessment tools, such as COMPAS, go about including dynamic factors (Ignacio Cofone “Algorithmic Discrimination Is an Information Problem” (2019) 70(6) Hastings LJ 1389).

Dynamic factors and racial biases

Dynamic factors which assess the socioeconomic or familial background of the offender run the risk of acting as a proxy for impermissible considerations such as race or class (see Cofone and Hamilton, above). For example, the COMPAS questionnaire asks: “how many of your friends/acquaintances have ever been incarcerated?”. Research shows that African-American men are significantly more likely to have an acquaintance incarcerated than their white counterparts (37 per cent compared to 22 per cent) (Hedwig Lee and others “Racial Inequalities in Connectedness to Imprisoned Individuals in the United States” (2015) 12 Du Bois Review: Social Science Research on Race 269 at 275). As a result, the sole fact that a defendant is African American would directly increase their risk assessment score. Similar results arise for questions about gang affiliation, prior criminal history, education, frequency of moving to a new house, and financial means. It follows that an African-American individual taking the COMPAS questionnaire is inherently more likely to obtain a higher-risk score purely because their race will affect the answers to these questions.

In 2016, independent journalism organisation ProPublica analysed more than 10,000 criminal defendants in Broward County, Florida and compared the offenders’ COMPAS predictions with the reoffending rate that actually occurred. Ultimately the study found that African American defendants were “far more likely” than white defendants to be incorrectly judged to be at a high risk of reoffending, while white defendants were more likely than African American defendants to be incorrectly classified as low risk (Julia Angwin, Jeff Larson, Surya Mattu, and Lauren Kirchner “Machine Bias” (May 2016) ProPublica www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing ).
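The error rates at issue in this kind of analysis can be computed mechanically. The sketch below uses entirely fabricated toy records (it does not reproduce the Broward County data) simply to show what a group-specific false positive rate, the proportion of non-reoffenders wrongly labelled high risk, looks like in code.

```python
# Toy records: (group, predicted_high_risk, actually_reoffended).
# Invented for illustration only; not ProPublica's figures.
records = [
    ("A", True,  False), ("A", True,  False), ("A", False, False),
    ("A", True,  True),  ("A", False, True),
    ("B", True,  True),  ("B", False, False), ("B", False, False),
    ("B", False, True),
]

def false_positive_rate(group):
    # Share of non-reoffenders in the group wrongly labelled high risk
    non_reoffenders = [r for r in records if r[0] == group and not r[2]]
    return sum(1 for r in non_reoffenders if r[1]) / len(non_reoffenders)

print(false_positive_rate("A"))  # 2 of 3 non-reoffenders flagged high risk
print(false_positive_rate("B"))  # 0 of 2 flagged
```

In this toy data, group A bears a much higher false positive rate than group B, which is the shape of disparity ProPublica reported between African American and white defendants.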

Northpointe, which developed COMPAS, “strongly rejected” these results. After running its own statistical analysis on the same dataset, Northpointe concluded that COMPAS had actually achieved “predictive parity” for African-Americans and whites and asserted that “ProPublica made several statistical and technical errors” (Northpointe “Response to ProPublica: Demonstrating Accuracy, Equity, and Predictive Parity” (December 2018) https://www.equivant.com/response-to-propublica-demonstrating-accuracy-equity-and-predictive-parity/ ).

Impartial commentators have stated that “contrasting measures of algorithmic fairness” explain the diverging results (Tafari Mbadiwe “Algorithmic Injustice” (2018) The New Atlantis at 18). As Sam Corbett-Davies explains, algorithms are often assessed against different definitions of “fairness” which may be in conflict. For example, the “anti-classification” definition merely requires that risk assessment algorithms do not consider protected characteristics, such as race, when deriving estimates. On this metric, COMPAS succeeds. However, the equity-seeking “calibration” definition requires that outcomes are independent of protected characteristics, after controlling for estimated risk (Samuel Corbett-Davies and Sharad Goel “The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning” (2018) arXiv1808.00023). By this standard, ProPublica’s analysis reveals COMPAS’s failings. Although the algorithm does not explicitly prejudice ethnic minorities, it nevertheless has this effect. This is at least in part due to the inclusion of dynamic factors, such as those discussed above, which perpetuate biases that exist by virtue of the offender’s race or class.
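The tension between these fairness definitions is arithmetic rather than merely rhetorical. The sketch below, with invented numbers, shows how a score that is equally well calibrated for two groups (the same proportion of those labelled high risk go on to reoffend) can still yield different false positive rates whenever the groups' underlying reoffending rates differ.

```python
# Illustrative numbers only; these are not COMPAS's actual figures.

def fpr_given_calibration(base_rate, share_high, precision_high):
    # base_rate: the group's overall rate of reoffending
    # share_high: fraction of the group labelled high risk
    # precision_high: P(reoffend | labelled high risk) -- holding this
    # identical across groups is the "calibration"-style condition
    false_positives = share_high * (1 - precision_high)
    non_reoffenders = 1 - base_rate
    return false_positives / non_reoffenders

# Same calibrated precision (0.6) for both groups, different base rates
print(fpr_given_calibration(base_rate=0.5, share_high=0.6, precision_high=0.6))  # 0.48
print(fpr_given_calibration(base_rate=0.3, share_high=0.3, precision_high=0.6))  # ≈0.17
```

Even though both hypothetical groups receive the same calibrated score, the group with the higher base rate of reoffending ends up with nearly three times the false positive rate, which is why COMPAS can pass one fairness test while failing another.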

However, despite their risks, it would be rash to discard dynamic variables out of hand. Some dynamic variables do not act as a proxy for race or class and may inform the likelihood of reoffending, such as whether or not the offender has completed a rehabilitation programme. For example, if an offender in New Zealand has completed the Special Treatment Unit Rehabilitation Programme, the Department of Corrections reports that the likelihood of them being reconvicted in the next 12 months reduces by 13.2 per cent (Department of Corrections Annual Report 2018–2019 (June 2019) at 156–157).

The better focus, therefore, should not be “should we include dynamic or static variables?” but instead “what variables should we include, regardless of their nature?”. Characterising all dynamic variables as inequitable fails to recognise permissible developments that may occur over time which do accurately inform the offender’s likelihood of reoffending. Similarly, it ignores the fact that biases may also arise from static factors.

Static factors, such as an offender’s number of prior convictions or their total years spent in prison, are all a result of human decisions. Police decide where to deploy resources and who to arrest, prosecutors decide whether to charge, juries decide guilt, and judges decide the sentence. Biases that occur at each stage of the criminal justice process will be reflected in the offender’s static variables and, therefore, their risk assessment score. Unfortunately, the New Zealand criminal justice system is plagued with such biases. From 2005–2014, police apprehended roughly the same number of Māori and Pākehā, despite the Pākehā population being three times larger. Of those arrested, Pākehā are nearly twice as likely to be let off with a warning compared with Māori (Independent Police Conduct Authority Review of Pre-charge Warnings (14 September 2016)). Of those who are prosecuted, Māori are also more likely to receive a conviction. Comparing young Māori and non-Māori with similar histories and social backgrounds, Māori are between 1.6 and 2.4 times more likely to have a criminal conviction (D M Fergusson, L J Horwood, and N R Swain-Campbell “Ethnicity and Criminal Convictions: Results of a 21-year Longitudinal Study” (2003) 36 Australian and New Zealand Journal of Criminology 366). The upshot of these depressing statistics is that a Māori offender will likely receive a higher ROC*ROI score, at least in part due to their race. Despite the potentially disparate impact on Māori, the Department of Corrections did not seek or obtain any input from Māori leaders during the development of the ROC*ROI tool (Waitangi Tribunal The Offender Assessment Policies Report (Wai 1024, 2005) at 122). The Algorithm Charter will require the Department of Corrections to rectify this oversight.
In an attempt to facilitate partnership, the Charter requires signatories to honour Treaty commitments by “embedding a Te Ao Māori perspective in the development and use of algorithms consistent with the principles of the Treaty of Waitangi”. This commitment will require the Department of Corrections to ensure that the ROC*ROI measure is used in a manner that promotes partnership with and active protection of Māori. As outlined above, the current ROC*ROI measure fails to do this. Instead, it indirectly disadvantages Māori by assigning them higher ROC*ROI scores. To meet these obligations, the Department of Corrections will have to squarely address the way the ROC*ROI deals with race and ensure that its variables do not act as a proxy for race.

If an individual’s risk assessment score is influenced by race, it naturally follows that ROC*ROI, and other similar risk assessment algorithms, will continue to perpetuate these racial biases. However, if the algorithm is removed, it does not leave the criminal justice system free from the pernicious effect of these biases. When assessing the likelihood of reoffending, judges often look to the same factors (number of convictions, total years in prison and so on) as ROC*ROI, leaving the defendant in no better position. Therefore, the ROC*ROI measure should not be evaluated against some perfect ideal, but against the very imperfect status quo. There is nothing inherently biased about risk assessment algorithms; in fact, one of the primary arguments in favour of such tools is their ability to remove biases that may influence judges (either consciously or unconsciously). The issue is that ROC*ROI’s inputs are contaminated by biases that permeate every aspect of the justice system.

One approach is to adjust the inputs entirely. If data such as the number of convictions or total years in prison are merely manifestations of existing biases, maybe these factors should not be considered when assessing the likelihood of reoffending? Such an approach may help to remove racial inequities; however, it will also remove the most accurate indicia of future offending, thus making our communities less safe and detaining more individuals with a low likelihood of reoffending.

The better approach is to address these racial biases at the source. Once the biases are removed at the source, the relevant inputs can be computed without such biases occurring. The only way to actually remove such biases is from the bottom up: police stops, prosecutorial discretion, sentencing and so on. To reject ROC*ROI entirely would simply throw the baby out with the bath water and ignore the racial biases that currently exist. The better view is to resist treating the algorithm as a scapegoat and to instead acknowledge that racial biases exist in the current system. However, until these biases no longer contaminate the ROC*ROI inputs, judges should be educated about the potential biases and limitations within an offender’s ROC*ROI score and exercise caution accordingly. This will prevent judges from relying upon the ROC*ROI score as an example of perfectly objective and impartial risk assessment.


Recent literature provides a litany of arguments for and against the use of risk assessment tools. Ironically, the very arguments which supporters put forward in favour of risk assessment tools are often the same arguments that opponents provide against their use: consistency, accuracy, and transparency.


If humans were perfect arbiters of truth and reason, we would not need help administering justice (Arthur Rizer and Caleb Watney, above). Unfortunately, human flaws are numerous and well-known. International studies have found that extraneous factors such as how recently a parole board member ate lunch or how the local college football team is doing may affect the outcome of judicial decisions (Ozkan Eren and Naci Mocan Emotional Judges and Unlucky Juveniles (National Bureau of Economic Research, NBER Working Paper No 22611, September 2016) and Shai Danziger, Jonathan Levav, and Liora Avnaim-Pesso “Extraneous factors in judicial decisions” (2011) PNAS 108 6889). At any rate, human judgement inherently brings human failings. Although algorithms do not remove the subjective element entirely, as some subjective decision-making is built into every risk assessment tool, there is benefit in knowing that identical inputs will produce identical outputs.

In 2006, the Law Commission cautioned that “one of the core problems with New Zealand’s current sentencing and parole arrangements is their highly discretionary nature” (Law Commission Sentencing Guidelines and Parole Reform (NZLC R94, 2006) at 9). Although the Court of Appeal attempts to achieve consistency by providing sentencing guideline judgments, these judgments allow significant judicial discretion and rarely address offences at the lower end of the spectrum of seriousness.

Risk assessment tools such as ROC*ROI and COMPAS seek to provide consistent and reliable predictions of future offending, thereby encouraging fair and equitable application of the law. However, this lofty goal can only be achieved if judges apply the risk assessment scores consistently. In New Zealand, it is unclear how much guidance is provided to judges about how the ROC*ROI score is calculated or how it should be utilised. Without consistent and comprehensive guidance, it is left to judges to decide how much weight to give an offender’s ROC*ROI measure. This undermines ROC*ROI’s ability to improve the consistency of sentencing in New Zealand.

In State v Loomis, the Wisconsin Supreme Court considered an appeal against a trial court’s use of COMPAS in determining Mr Loomis’s sentence (881 NW 2d 749 (Wis 2016)). Despite rejecting the appeal, the Court prescribed how risk assessment scores must be presented to trial courts. The Court required all pre-sentencing investigation reports (equivalent to a PAC report in New Zealand) that incorporated a COMPAS assessment score to include five written warnings for judges. The warnings relevant to ROC*ROI and its application in New Zealand include: first, COMPAS scores are unable to identify specific high-risk individuals because the scores rely on group data; second, although COMPAS relies on a national data sample, there has been “no cross-validation study for a Wisconsin population”; and third, studies “have raised questions about whether COMPAS scores disproportionately classify minority offenders as having a higher risk of recidivism” (State v Loomis, above, at 769).

In New Zealand, it is unclear whether judges receive warnings of this kind. Without such information, there is a risk of judges inconsistently applying ROC*ROI risk predictions. Some judges may place little weight on an offender’s ROC*ROI score while others may rely heavily on the “magic of computers and automation” (Thomas Sheridan “Speculations on Future Relations Between Humans and Automation” in Raja Parasuraman and Mustapha Mouloua (eds) Automation and Human Performance: Theory and Applications (CRC Press, Florida, 1996) 449 at 458). Studies have shown that machine results, such as the ROC*ROI measure, are often seen as “error-resistant” and “infallible” (see Danielle Keats Citron “Technological Due Process” (2008) 85 Wash U L Rev 1249). This can lead some individuals to place too much weight on machine predictions, creating an “automation bias” (Ric Simmons “Big Data, Machine Judges, and the Legitimacy of the Criminal Justice System” (2018) 52 UC Davis L Rev 1067 at 1100).

If the ROC*ROI measure is used to influence an offender’s sentence, judges should be acutely aware of its potential shortcomings. As a signatory to the Algorithm Charter, the Department of Corrections is required to clearly explain to the public how algorithms are calculated and influence decisions. To this end, judges should be provided with information about how an offender’s ROC*ROI score is calculated and the potential issues that may arise. The warnings provided by the Court in Loomis may be a good starting point for this, while other warnings particular to the New Zealand context may also be necessary.

By minimising the influence of extraneous and idiosyncratic factors that may cause judges to differ in opinion, algorithms have the ability to achieve consistency across the criminal justice system. However, this consistency will only occur if judges are sufficiently informed about the benefits, risks, and limitations of the ROC*ROI measure.


The utility of a risk assessment tool is entirely negated if it is not accurate. An algorithm that cannot accurately determine which individuals are likely to reoffend will let loose recidivist criminals and lock up individuals who would not have reoffended. The consequence will be a more expensive and unjust criminal justice system.

In 1954, clinical psychologist Paul Meehl famously claimed that statistical predictions generated by algorithms are often superior to the subjective impressions of trained professionals (Paul Meehl Clinical vs Statistical Prediction: A Theoretical Analysis and a Review of the Evidence (University of Minnesota Press, 1954)). These comments were the genesis for the evolving contemporary belief that algorithms are more accurate than humans. This belief, along with a growing body of empirical evidence, has led to algorithms now influencing a variety of important decisions across society.

However, as the ProPublica exposé demonstrated, it can be difficult to identify exactly how accurately risk assessment measures are performing in the criminal justice system. If an offender is given an 85 per cent chance of being re-imprisoned in the next five years, and is sentenced on that basis, there is no way of knowing whether that offender would have been one of the 15 per cent of offenders who would not have re-offended (Hugh Magee “The Criminal Character: A Critique of Contemporary Risk Assessment and Preventive Detention of Criminal Offenders in New Zealand” (2013) 19 AULR 76 at 83). Despite these difficulties, some studies have attempted to test the accuracy of risk assessment algorithms, with results that are somewhat unsettling.

In 2018, Dressel and Farid found that COMPAS was “only fractionally” better at predicting an individual’s risk of reoffending than random volunteers recruited from the internet (Julia Dressel and Hany Farid “The accuracy, fairness, and limits of predicting recidivism” (2018) Science Advances 1 at 1). In the study, each volunteer read a short description of the defendant, highlighting seven pieces of information that summarised the defendant’s prior criminal history. Based on that information alone, the volunteers predicted whether the defendant would be convicted of another crime within two years. On average, the volunteers took 10 seconds per selection and were correct 63 per cent of the time. In comparison, COMPAS had an accuracy of only 65 per cent. In February 2020, Lin and others replicated the study and found “nearly identical results” (Zhiyuan Lin and others “The limits of human predictions of recidivism” (2020) Science Advances 1 at 1). These results are concerning. As Farid notes, “these are non-experts, responding to an online survey with a fraction of the amount of information that the software has… so what exactly is software like COMPAS doing?” Other studies have reached similar results (Megan Stevenson “Assessing Risk Assessment in Action” (2017) 103 Minn L Rev 303).

By contrast, a study of New York City’s pre-trial bail applications found that an algorithm could identify which defendants are likely to commit crime while they await trial much more accurately than judges. As a result, the study boldly estimated that 42 per cent of detainees could be released without any increase in people skipping trial or committing crimes before trial (Jon Kleinberg and others Human Decisions and Machine Predictions (National Bureau of Economic Research, Working Paper 23180, February 2017)).

These studies demonstrate that risk assessment tools are not inherently more accurate than judges and that their accuracy must be positively proven, not blindly presumed. The accuracy of a particular risk assessment tool will depend on the particular inputs and how they are weighted.

The ROC*ROI measure is based on a data set of 133,000 offenders convicted of imprisonable offences during the years 1983, 1988, and 1989. The Algorithm Charter will now require the Department of Corrections to ensure that the data contributing to the ROC*ROI measure is “fit for purpose” by understanding its limitations and identifying and managing bias (New Zealand Government, above, at 3). This current set of data, which is now more than 30 years old, fails this test. The data set is significantly out of date, given the vast changes in age, ethnic, and socio-economic demographics of New Zealand in that time (Magee, above, at 80). In addition, some crimes which were imprisonable between 1983 and 1989, such as consensual sex between men (decriminalised in 1986), are rightfully lawful in 2020. To meet its new obligations under the Charter, the Department of Corrections will likely need to compile a new data set. This will require active engagement and consultation with people, communities, and groups who are likely to be impacted by the use of the ROC*ROI measure.

Unfortunately, the Department of Corrections does not release any data to demonstrate the accuracy of ROC*ROI. However, it appears that such data does exist. In 2019, Peter Johnson, Research and Analysis Manager at the Department of Corrections, said he had “high confidence in [ROC*ROI’s] accuracy, given repeated exercises which show a high degree of correlation between predicted rates of reimprisonment, and actual rates of reimprisonment” (Joel McManus “Why a pastor who abused children served less prison time than a low-level cannabis dealer” Stuff (online ed, 13 August 2019)). Under the Charter, the Department of Corrections will need to regularly peer review the ROC*ROI measure to assess for unintended consequences. Presumably, inaccuracy would be an unintended consequence. To fulfil the spirit of the Charter, the results of this peer review should be released to the public. This will allow the general public, including leading academics, to scrutinise the accuracy of the ROC*ROI measure. If these results validate ROC*ROI’s accuracy, it will provide the public with certainty that these impersonal algorithms are, in fact, providing judges with accurate information. If it identifies flaws in ROC*ROI’s accuracy, it will prompt the government to improve ROC*ROI or discontinue its use.


One of the benefits of a risk assessment tool is that you can “look under the hood” and identify exactly what factors lead to an offender’s risk assessment score. Although judges are required to give reasons, defendants may take a cynical perspective and believe that the decision was influenced by some nefarious, ulterior motive. By contrast, the full ROC*ROI algorithm is publicly available. The availability of the ROC*ROI algorithm allows impartial experts to audit the algorithm and identify the variables that it considers, and the weight attributed to each variable. This should not be taken for granted. In the United States, most risk assessment tools, including COMPAS, are developed by private companies who then prevent defendants, defence lawyers, and even judges from viewing the algorithm. However, the complexity of the ROC*ROI measure currently prevents defendants and defence counsel from meaningfully interpreting its operation. The Algorithm Charter holds great potential for improving the transparency, and thereby accessibility, of the ROC*ROI measure. By signing the Charter, the Department of Corrections has committed to maintaining transparency by “clearly explaining how decisions are informed by algorithms” (New Zealand Government, above, at 3). The Charter suggests that this may include plain English documentation of the algorithm and publishing information about how the data is collected, secured, and stored. Further, the Department of Corrections will be required to retain human oversight by:

  1. Nominating a point of contact for public inquiries about algorithms;
  2. Providing a channel for challenging or appealing decisions informed by algorithms;
  3. Clearly explaining the role of humans in decisions informed by algorithms.

Once implemented, these changes will greatly improve the transparency, accessibility, and integrity of the ROC*ROI measure. They will allow judges to be better informed when using ROC*ROI and ensure that defendants and defence counsel understand what factors have been used to predict their likelihood of reoffending.


Largely outside the view of the general public, ROC*ROI has been influencing judicial decisions in New Zealand for more than 20 years. Only more recently, as artificial intelligence begins to encroach on all areas of life, have commentators begun to question whether risk assessment algorithms of this nature should play a part in the criminal justice process. In theory, risk assessment tools can help to make the criminal justice system more consistent, accurate, and equitable. However, whether or not this occurs depends on the implementation and application of the particular tool. Given the enormity of the decisions it influences, the lack of data supporting ROC*ROI’s efficacy is unfortunate. However, the signing of the Algorithm Charter is certainly a step in the right direction. The Charter will require the Department of Corrections to reassess the ROC*ROI measure and make important changes to ensure it meets its new commitments. These include:

  1. embedding a Te Ao Māori perspective in the use of the ROC*ROI measure;
  2. clearly explaining to judges and the general public how the ROC*ROI measure works and how it influences decisions;
  3. updating the ROC*ROI measure’s data set to accurately reflect Aotearoa New Zealand in 2020;
  4. publicly reporting on the accuracy of the ROC*ROI measure;
  5. increasing transparency and accessibility by nominating a point of contact for public inquiries about the ROC*ROI measure.

Since ProPublica published its exposé in 2016, there has been a significant amount of literature exploring the relationship between risk assessment algorithms and racial bias. Although there is no literature analysing the effect of ROC*ROI, many of its variables will likely act as a proxy for the defendant’s race. However, these biases exist independently of the algorithm, and the same result would likely occur if it were a judge assessing the defendant’s social, demographic and criminal history. Risk assessment algorithms should not be evaluated against some perfect ideal, but against the very imperfect status quo. Used effectively, these algorithms do have the potential to help the criminal justice system become more transparent, equitable, and accurate. However, in order to do so, meaningful changes must occur at all levels of the criminal justice system. This is an enormous task, but one that is supported by the recent judicial and political dialogue. In the meantime, judges must be cautious when using ROC*ROI to influence such consequential decisions and be conscious of the effect that embedded biases may have on the calculation of an offender’s risk assessment score.
