Algorithms in the Criminal Justice System: Assessing the Use of Risk Assessments in Sentencing

Citation: Kehl, Danielle, Priscilla Guo, and Samuel Kessler. 2017. Algorithms in the Criminal Justice System: Assessing the Use of Risk Assessments in Sentencing. Responsive Communities Initiative, Berkman Klein Center for Internet & Society, Harvard Law School.

Citable link: http://nrs.harvard.edu/urn-3:HUL.InstRepos:33746041

Terms of Use: This article was downloaded from Harvard University’s DASH repository, and is made available under the terms and conditions applicable to Other Posted Material, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of-use#LAA
Danielle Kehl, Priscilla Guo, and Samuel Kessler
Responsive Communities

I. Introduction

II. The History of Risk Assessment in the Criminal Justice System
A. The Past as Prelude: The Selective Incapacitation Movement of the 1980s
B. Rehabilitation: A Shift Toward Individual Sentencing and Its Discriminatory Effects
C. Retributivism and the Rise of Evidence-Based Sentencing
D. The Evolution of Risk Assessment Tools
E. Enter the Algorithms: Risk Assessment Software
F. Risk-Assessment Validity and Adoption

III. Algorithms and Criminal Sentencing
A. The Move from Parole and Pre-Trial to Sentencing Risk Assessments
B. The Sentencing Process
C. Evidence-Based Sentencing and the Embrace of Risk Scores

IV. Legal Issues Raised By Risk Assessments in Sentencing
A. COMPAS Considered in Wisconsin: The Loomis Case
B. Constitutional Issues Implicated By Risk Assessment Algorithms
C. Related Sentencing Issues: Managing Risk in the Criminal Justice System

V. Challenges Presented by the Use of Risk Assessment Algorithms in Sentencing
A. Opacity
B. Bias and Lack of Reliability
C. Diverging Concepts of Fairness

VI. Recommendations for the Use of Risk Assessment Algorithms
A. Transparency
B. Accountability and Oversight
C. Robust and Holistic Approach to Fairness

VII. Further Areas for Research

VIII. Conclusion

I. Introduction

In the summer of 2016, some unusual headlines began appearing in news outlets across the United States. “Secret Algorithms That Predict Future Criminals Get a Thumbs Up From the Wisconsin Supreme Court,” read one.1 Another declared: “There’s software used across the country to predict future criminals. And it’s biased against blacks.”2 These news stories (and others like them) drew attention to a previously obscure but fast-growing area in the field of criminal justice: the use of risk assessment software, powered by sophisticated and sometimes proprietary algorithms, to predict whether individual criminals are likely candidates for recidivism. In recent years, these programs have spread like wildfire throughout the American judicial system. They are now being used in a broad capacity, in areas ranging from pre-trial risk assessment to sentencing and probation hearings.

This paper focuses on the latest, and perhaps most concerning, use of these risk assessment tools: their incorporation into the criminal sentencing process, a development which raises fundamental legal and ethical questions about fairness, accountability, and transparency. The goal is to provide an overview of these issues and offer a set of key considerations and questions for further research that can help local policymakers who are currently implementing or considering implementing similar systems. We start by putting this trend in context: the history of actuarial risk in the American legal system and the evolution of algorithmic risk assessments as the latest incarnation of a much broader trend. We go on to discuss how these tools are used in sentencing specifically and how that differs from other contexts like pre-trial risk assessment. We then delve into the legal and policy questions raised by the use of risk assessment software in sentencing decisions, including the potential for constitutional challenges under the Due Process and Equal Protection clauses of the Fourteenth Amendment. Finally, we summarize the challenges that these systems create for law and policymakers in the United States, and outline a series of possible best practices to ensure that these systems are deployed in a manner that promotes fairness, transparency, and accountability in the criminal justice system.

1 Ethan Chiel, Secret Algorithms That Predict Future Criminals Get a Thumbs Up From Wisconsin Supreme Court, Fusion (July 27, 2016), http://fusion.net/story/330672/algorithms-recidivism-loomis-wisconsin-court/.
2 Julia Angwin et al., Machine Bias: There’s Software Used Across the Country to Predict Future Criminals. And It’s Biased Against Blacks., ProPublica (May 23, 2016), https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing.

Suggested Citation: Kehl, Danielle, Guo, Priscilla, Kessler, Samuel. Algorithms in the Criminal Justice System: Assessing the Use of Risk Assessments in Sentencing. (July 2017). Responsive Communities. Available at: https://cyber.harvard.edu/publications/2017/07/Algorithms.
II. The History of Risk A. The Past as Prelude: The Selective
Incapacitation Movement of the 1980s
Assessment in the Criminal The modern debate about risk assessment algo-
Justice System rithms in sentencing bears a striking similarity
to a 1980s movement that the New York Times
described as a “quiet revolution” in the crimi-
The past decade has witnessed an explosion in
nal justice system: the selective incapacitation
the use of algorithms in the public sphere in the
movement.5 Selective incapacitation theory
United States. The rapid and unprecedented rise
was based on the premise that the justice sys-
of predictive algorithms has been fueled by a
tem should seek to identify, or “select,” a sub-
number of factors, including the vast amounts
set of individuals who are particularly prone
of data generated by ubiquitous use of the in-
to violence or recidivism—colloquially known
ternet and smart devices and a growing empha-
as “career criminals”6—and incapacitate them
sis on data-driven decision-making in both our
by keeping them in prison for longer periods of
private lives and public policy.3 Unsurprisingly,
time.7 Removing these criminals from the gener-
this emphasis on the use of data in government
al population, in theory, would lead to an over-
has permeated many stages of the criminal
all reduction in the crime rate.8 Although it was
justice system as well, from predictive policing
ultimately short-lived, the theory of selective in-
to risk assessment in the corrections system.4
capacitation and the controversy surrounding
But while data-driven approaches may explain
its practical and ethical implications offer some
the recent expansion in the use of risk assess-
critical insights into today’s debate about risk
ment tools, the algorithmic revolution was not
assessment instruments that similarly purport
responsible for their conception. Risk assess-
to identify individuals who are at high risk for
ment tools—and the principles underlying their
both general and violent recidivism and inform
development—have actually been a part of the
judges of those characteristics during the sen-
criminal justice system for decades. In this sec-
tencing process. The selective incapacitation
tion, we discuss the predecessors to modern
debate also suggests that policymakers should
risk assessment software and how their use has
proceed cautiously and deliberately when em-
evolved and shifted in response to various com-
bracing the use of modern risk assessment soft-
peting theories of criminal punishment. In par-
ware, balancing their interest in reducing future
ticular, we highlight the parallels between the
crimes against concerns about accuracy and
modern emphasis on these tools in sentencing
individual fairness.
and a controversial (and ultimately unsuccess-
ful) movement in the 1980s known as “selective
incapacitation.” Crime prediction has been a feature of the
United States criminal justice system since the
early 1920s.9 Beginning in the late 1960s and
early 1970s, crime prediction research focused

5 Tamar Lewin, Making Punishment Fit Future Crimes,


N.Y. Times (Nov. 14, 1982), http://www.nytimes.
com/1982/11/14/weekinreview/making-punishment-fit-fu-
ture-crimes.html.
6 Selective Incapacitation: Reducing Crime Through Pre-
dictions of Recidivism, 96 Harv. L. Rev., 511, 511 (1982).
7 Thomas Mathiesen, Selective Incapacitation Revisited,
22 L. and Human Behavior, 455, 455 (1998), www.jstor.
org/stable/1394595.
3 See, e.g., Digital Decisions, Ctr. for Dem. & Tech. 8 Selective incapacitation is different from collective
(2016), https://cdt.org/issue/privacy-data/digital-deci- incapacitation, which is used to punish all persons con-
sions/. victed of similar offenses in the same way. The strategy
4 Mara Hvistendahl, Can ‘Predictive Policing’ Prevent is used on broad categories of criminals, such as those
Crime Before It Happens?, Science (Sept. 28, 2016), who committed major felonies or those who have the
http://www.sciencemag.org/news/2016/09/can-predic- same number of petty crimes. Id. at 455-56.
tive-policing-prevent-crime-it-happens. 9 Id. at 458.

3
primarily on identifying an element of “danger- serious crimes in America and (2) these career
ousness” in offenders—namely, the capacity criminals can be identified through characteris-
to commit violent crimes.10 However, predicting tics like their personal and criminal history.16 The
dangerousness turned out to be quite complex, first assumption was proved through a variety
and early attempts resulted in a striking number of studies, most notably the 1972 study entitled
of false positives. For example, some studies in “Delinquency in a Birth Cohort.”17 After careful
the late 1960s and early 1970s mistakenly identi- analysis of the criminal records of 10,000 males
fied between 54 and 99 percent of participating in Philadelphia, the study found that 51.9 per-
individuals as “dangerous.”11 cent of the total offenses were committed by
just 18 percent of the group, otherwise known as
Nonetheless, despite the difficulty of predicting the chronic offenders.18 The second assumption,
dangerousness, proponents of the selective in- however, never substantiated the correlations
capacitation movement proposed to punish cer- that it drew between personal and criminal his-
tain individuals more severely based entirely on tories and the potential for recidivism.
a predicted future rate of offending.12 This con-
cept of punishing criminals not for what they In 1982, a report from the RAND Corporation ex-
had done in the past but for what they could pounded on the potential benefits of selective
do in the future represented a radical shift in incapacitation theory and stimulated discus-
theories of sentencing in criminal justice. The sion in the academic and criminal justice com-
ethical considerations underlying the selective munity on the validity of the theory. The report’s
incapacitation strategy embodied a conflict be- authors, Peter Greenwood and Allan Abraha-
tween utilitarianism and the idea that criminals mse, surveyed 2100 male inmates in California,
should get their “just deserts.” Under a utilitar- Texas, and Michigan prisons and jails over a six-
ian approach, selective incapacitation could year period.19 They gathered information direct-
be justified if it would reduce crime overall and ly from prisoners through interviews about their
ultimately protect the most number of people crimes and data compiled into self-reports, and
from danger. By contrast, the idea of just des- then included information from their official
erts focuses entirely on punishing criminals for crime records in the report. As evidence that a
past conduct and emphasizes that “it is unfair small percentage of the prison population was
to punish for choices expected which have not particularly prone to criminal activity, the re-
yet been made — that is, for expected crimes searchers noted that “[a]mong active burglars,
that might never be committed.”13 50 percent committed fewer than 6 per year,
while 10 percent committed more than 230 per
The utilitarian goals were the focus of the selec- year.”20 Furthermore, they found strong correla-
tive incapacitation movement: decreasing the tions between recidivism and factors such as
crime rate by imprisoning the most dangerous “juvenile convictions, heroin or barbiturate use,
felons14 and reducing mass incarceration.15 But unemployment and prior imprisonment.”21
the theory relied on two major assumptions: (1)
Career criminals are responsible for the bulk of Yet there were significant limitations to the
RAND report. First, the assumptions were based
10 Mathieson, Selective Incapacitation: Reducing Crime
Through Predictions of Recidivism, supra note 7, at 515. on robbery and burglary crimes only. Moreover,
11 John Monahan, Predicting Violent Behavior: An
Assessment of Clinical Technique 244, 246-50 (1981). 16 Cohen, Incapacitation as a Strategy for Crime Con-
12 Jacqueline Cohen, Incapacitation as a Strategy for trol, supra note 12, at 8-9.
Crime Control: Possibilities and Pitfalls, 5 Crime and 17 Marvin E. Wolfgang, Robert M. Figilio & Thorsten
Justice, 1, 12 (1983), www.jstor.org/stable/1147469. Sellin, Delinquency in a Birth Cohort 327 (1972).
13 Mathiesen, Selective Incapacitation Revisited, supra 18 Id.
note 7, at 460 (emphasis added). 19 Peter W. Greenwood with Allan Abrahamse, Selective
14 Stephen D. Gottfredson & Don Gottfredson, Selective Incapacitation, RAND Corporation (Aug. 1982), https://
Incapacitation?, 478 The Annals of the American Acad- www.rand.org/content/dam/rand/pubs/reports/2007/
emy of Political and Social Science 135, 142 (Mar. R2815.pdf (hereinafter “RAND Report”).
1985). 20 Id. at xiii.
15 Lewin, Making Punishment Fit Future Crimes, supra 21 Lewin, Making Punishment Fit Future Crimes, supra
note 5. note 5.

4
the report was a retrospective analysis of past cause it undermined the foundational presump-
crimes committed. There was no actual test tion of innocence until proven guilty. The study
of predicted future behavior. Greenwood and acknowledged that “[a]s long as our ability to
Abrahamse did develop a predictive scale for discriminate between high and low-rate offend-
identifying risk in offenders, labeling those most ers is imprecise, there will be legitimate concern
likely to reoffend as “high-rate” and the rest as about those who are improperly classified…
“medium-rate” or “low-rate.” Their predictive Furthermore, there will be differences of opinion
scale was fairly accurate in predicting which as to the legitimacy of using some of the factors
criminals would be low-rate offenders, with a that are correlated with rates of offending (e.g.
rate of 76 percent correctness.22 However, the juvenile record, drug use, employment) for sen-
scale was extremely inaccurate for high-rate tencing purposes.”25
offenders. According to researcher Jacqueline
Cohen, only 45 percent of the criminals cate- Proponents of the theory, however, argued that
gorized as high-rate offenders were correctly selective incapacitation was still an improve-
identified, resulting in a false-positive rate of 55
ment over relying solely on human judgment in
percent of survey respondents.23 In other words criminal sentencing, citing the need for guide-
more than half of supposed high-rate offenders lines and “orderly assessment schemes.”26 In
were incorrectly labeled. other words, they argued that these predictive
instruments were much more accurate than our
The RAND report faced other criticisms as well. intuitive methods.27 Predictive instruments could
There were no validation tests conducted on the help judges identify who was truly risk for re-
report. Moreover, the data was highly specula- cidivism, thereby limiting the imposition of long
tive since the researchers obtained much of the prison sentences that human judges tend to
personal and criminal history through interviews dole out somewhat arbitrarily.28
with the offenders themselves. In 1986, research-
er Christy Visher reanalyzed the report and con- Ultimately, selective incapacitation never be-
cluded that “reduction of crime would decline came a mainstream concept, largely due to con-
further” if the model was completed with more cerns about predictive accuracy and individual
official criminal records.24 fairness.29 Yet its principles have lingered on in
the criminal justice system. Many states today
More importantly, predicting recidivism was have “repeat-offender laws, prosecutorial units
susceptible to the risk of false negatives and devoted to career criminals and sentencing pol-
false positives that could undermine the entire icies that consider prior offenses, job stability
purpose of the theory. In false negative cases, and other personal data.”30 And despite the crit-
individuals were mistakenly predicted as unlike- icisms of its methods, the RAND report inspired
ly to recommit but subsequently did. In these the precursors to various modern-day predic-
cases, predictive failure allowed individuals tion tools, including the INSLAW instrument (de-
back into a society where they could commit veloped for U.S. federal prosecutors to carry out
additional crimes. False positives, on the other risk assessment of offenders), the Salient Fac-
hand, represented an error that threatened in- tor Score (developed as a risk assessment scale
dividual liberty. Individuals would be mistakenly for U.S. Parole Commission), and the Canadian
identified to be recidivists and imprisoned for
crimes that they had no intention of actually 25 RAND Report, supra note 19, at 22-23.
committing. Notably, the RAND report acknowl- 26 C.D. Webster, Comment on Thomas Mathiesen's
edged that the problem of false positives raised Selective Incapacitation Revisited, 22 L. and Human
concerns about selective incapacitation be- Behavior, 473, 473 (1998), available at www.jstor.org/
stable/1394596.
22 Gottfredson & Gottfredson, Selective Incapacitation?, 27 R.A. Wright, In Defense of Prisons (1994).
supra note 14, at 140. 28 Mathiesen, Selective Incapacitation Revisited, supra
23 Cohen, Incapacitation as a Strategy for Crime Con- note 7, at 466.
trol, supra note 12, at 48-49. 29 Kathleen Auerhahn, Selective Incapacitation and the
24 Christy Visher, The Rand Inmate Survey: A Reanalysis, Problem of Prediction, 37 Criminology 703, 703 (1999).
in A. Blumstein et al., Criminal Careers and “Career 30 Lewin, Making Punishment Fit the Future Crime, supra
Criminals” 205-226 (1986). note 5.

5
Dangerous Behavior Rating Scale for Metropoli- ply punishing people in proportion to the severi-
tan Toronto Forensic Service.31 ty of their crimes, individuals were given unique
sentences and treatment with the ultimate goal
Although selective incapacitation is primarily of rehabilitation, in order to prepare them for
seen as a historical footnote today, the move- safe reentry into society. With rehabilitation as
ment sheds light on today’s discussion of risk the central goal, strict guidelines and sentenc-
assessment algorithms in sentencing, which is es were not considered appropriate. Thus, in or-
plagued by many of the same concerns about der to ensure individual treatment, judges were
accuracy and fairness toward the individual granted extraordinary discretion in regards to
defendant. Much like proponents of selective sentencing decisions.33
incapacitation in the 1980s, advocates for the
widespread use of risk assessments today ap- Yet greater sentencing discretion may have had
pear to be doing so out of a genuine desire to negative effects. In particular, it quickly became
reduce mass incarceration without increasing clear that minorities were being treated dispro-
the crime rate and to use data and technical portionately compared to their white peers in
analysis to improve upon untethered human sentencing.34 In 1977, Senator Edward Kennedy
judgment.32 But doing so successfully and fairly explained the disparate impact of contempo-
may be a far more difficult task than it seems, rary sentencing practices on minorities:
particularly in the sentencing context, where
risk assessment could ultimately turn into a res- During the past few years a quiet but con-
urrection of the ideas behind selective incapac- structive debate has ensued over the issue
itation theory. of comprehensive criminal sentencing re-
form. The debate has involved judges, law-
B. Rehabilitation: A Shift Toward yers, corrections officials, law enforcement
Individual Sentencing and Its officers, members of the academic com-
munity and others. It has focused primarily
Discriminatory Effects on two interrelated problems–the total ab-
Beyond the selective incapacitation context,
sence of any prescribed guidelines to aid
today’s risk-assessment algorithms are the
judges during the sentencing process and
product of broader philosophical debate in the
the wide disparity in the sentences actually
United States regarding the objectives of our imposed in criminal cases.... The result has
criminal justice system. In the late nineteenth
been chaotic–all too often two defendants
century, the American criminal justice system
with similar backgrounds, convicted of the
began to shift away from capital and corporal
same crime, receive widely disparate sen-
punishment and towards rehabilitation. This
tences.35
rehabilitative focus dominated criminal justice
discussions until the 1970s and it emphasized
Although increased judicial discretion was in-
assigning punishments based on an individual’s
tended to serve a rehabilitative end, the dis-
characteristics rather than just the crimes that
parate impact that it has had on minorities
they committed. In other words, instead of sim-
suggested the approach also had a discrimina-
31 Mathiesen, Selective Incapacitation Revisited, supra 33 Douglas A. Berman, Re-balancing Fitness, Fairness,
note 7, at 460. and Finality for Sentences, 4 Wake Forest J. Law & Poli-
32 See, e.g., CSG Justice Center Staff, Risk and Needs cy 151, 157-8 (2014).
Assessment and Race in the Criminal Justice System, 34 Joshua B. Fischman & Max M. Schanzenbach, Racial
The Council of State Governments (May 31, 2016), Disparities under the Federal Sentencing Guidelines:
https://csgjusticecenter.org/reentry/posts/risk-and- The Role of Judicial Discretion and Mandatory Mini-
needs-assessment-and-race-in-the-criminal-justice-sys- mums, 9 J. of Empirical Legal Stud. 729, 729-64 (2012)
tem/ (noting that “validated risk and needs assessment (noting that United States Sentencing Guidelines were
is necessary to more accurately determine the risk of introduced to mitigate the disparate impact of judicial
recidivism and criminogenic needs of people involved discretion on judges, but might have had a counterpro-
in the criminal justice system—and to inform how the ductive effect).
system responds to reduce that risk and address those 35 Pierce O'Donnell et al., Toward a Just and Ef-
needs—than by relying on subjective, individual judg- fective Sentencing System: Agenda for Legislative
ment.”). Reform (1977).

6
tory effect. Many more policymakers eventually While the extraordinary discretion granted to
joined Senator Kennedy in questioning whether judges under the rehabilitative approach may
better guidelines might be necessary to assist have produced discriminatory effects, strict
in sentencing decisions in order to mitigate the retributivism was criticized for ushering in the
system's disproportionate sentencing practic- era of mass incarceration, which arguably had
es.36 its own discriminatory impact.41 Policymakers
soon began to grapple with the problems cre-
C. Retributivism and the Rise of ated by America’s ever-expanding prison popu-
Evidence-Based Sentencing lation and the harsh realities of these new sen-
With these concerns in mind, the sentencing re- tencing requirements.
form movement of the 1970s and 1980s shifted
back towards the retributive notion that crimi- In recent years, there has been a move towards
nal sentences should be based primarily on the evidence-based practices (EBP), which strive
crime committed rather than on the criminal to improve sentencing decisions by incorpo-
himself.37 The primary result of this shift was the rating scientific and quantitative methods. Ev-
establishment of clearer sentencing practices idence-based practices take an actuarial ap-
and increased use of sentencing guidelines. As a proach to assessing and treating risk, using the
part of this reform movement, Congress passed scientific method to predict future behavior.
42

the Sentencing Reform Act (SRA) in 1984.38 The Although the EBP movement has received some
SRA was predicated on the idea that sentenc- criticism for having had little effect on the mass
ing practices had become unfair and uncer- incarceration problem (or potentially making it
tain under the prevailing rehabilitative model, worse), 43
EBP is intended to improve sentencing
and it formalized federal sentencing through outcomes by using empirical assessment to in-
the establishment of the U.S. Sentencing Com- form sentencing decisions. 44

mission.39 The SRA also prescribed a clear sen-


tencing structure under the federal sentencing In the context of the criminal justice system, ev-
guidelines.40 idence-based practices utilize data to assess
the risk of re-offense, or recidivism. The goal
of these methods is to reduce recidivism rates
by focusing on particular offender character-
36 The Sentencing Reform Act of 1984: Principle Features
Affecting Guideline Construction, U.S. Sentencing
Commission (n.d.), http://www.ussc.gov/research/re- 41 See, e.g., The Moral Failures of America’s Prison-Indus-
search-and-publications/simplification-draft-paper-2. trial Complex, The Economist (Jul. 8, 2015), http://www.
37 Letter from Jonathan Wroblewski, Director, Office economist.com/blogs/democracyinamerica/2015/07/
of Policy and Legislation, U.S. Department of Justice’s criminal-justice-and-mass-incarceration.
Criminal Division, to the Honorable Patti B. Saris, Chair, 42 Actuarial assessment is “a formal method . . . [that
U.S. Sentencing Commission (Jul. 29, 2014), https:// provides] a probability, or expected value, of some
www.justice.gov/sites/default/files/criminal/lega- outcome. It uses empirical research to relate numerical
cy/2014/08/01/2014annual-letter-final-072814.pdf (here- predictor variables to numerical outcomes. The sine qua
inafter “DOJ Letter”). non of actuarial assessment involves using an objective,
38 Sentencing Reform Act, P.L. §98-473 (1984). mechanistic, reproducible combination of predictive fac-
39 An Overview of the United States Sentencing Com- tors, selected and validated through empirical research,
mission, United States Sentencing Commission (Jan. against known outcomes that have also been quanti-
5, 2011), http://www.ussc.gov/sites/default/files/pdf/ fied.” Kirk Heilbrun, Risk Assessment in Evidence-based
about/overview/USSC_Overview.pdf. Sentencing: Context and Promising Uses, 1 Chapman J.
40 Nathan James, Risk and Needs Assessment in the Crim. Just. 1, 134 (2009).
Criminal Justice System, Congressional Research 43 Cecelia Kingele, The Promises and Perils of Evi-
Services 1, 13 (Oct. 13, 2015), https://fas.org/sgp/crs/ dence-Based Corrections, University of Wisconsin
misc/R44087.pdf. It should be noted that although the Law School, Legal Studies Research Paper Series No.
SRA established the U.S. Sentencing Commission to 1368 (Nov. 30, 2015), http://www.wisconsinappeals.net/
create clearer sentencing guidelines, these guidelines wp-content/uploads/2015/12/Klingele-article.pdf.
only apply on a federal level. Most criminal prosecution 44 Richard E. Redding, Evidence-Based Sentencing: The
still happens in state courts, where these federal guide- Science of Sentencing Policy and Practice, 1 Chapman
lines have some persuasive influence but are not legally J. Crim. Just. 1, 3-4 (2009), http://works.bepress.com/
binding. richard_redding/11.

7
istics and criminogenic needs—factors which dence-based practices also place a renewed
are believed to increase a person’s propensity emphasis on recidivism because of its central
to commit crimes in the future.45 Criminals are role in decisions about how to treat offenders,
generally grouped by their risk, and assigned particularly when trying to balance public safe-
a high, medium, or low risk score. Consistent ty against a desire to reduce mass incarcera-
with rehabilitative approaches, this risk score is tion and prison overcrowding.49 This shift has
supposed to help determine the treatment and led to the development of risk assessment tools
interventions an offender will receive in prison.46 that are aimed at predicting an individual’s like-
Factors that increase and decrease the likeli- lihood of recidivism.
hood of recidivism are both considered, and
sentencing as well as treatment are assigned D. The Evolution of Risk Assessment
with these factors in mind.47 Tools
There have been roughly four different gener-
Some experts have praised evidence-based ations of risk assessment tools over the course
practices for their potential to find a construc- of the past century.50 The focus on rehabilita-
tive middle ground between the extreme results tion from the first half of the twentieth century
produced by placing a stronger emphasis on can be seen in the first generation, where risk
either rehabilitation or retributivism alone. As assessment was conducted on a case-by-case
Chapman University’s Dr. Richard E. Redding basis by correctional staff and clinical profes-
explains: sionals working in prisons.51 These actors would
generally rely on their own professional judg-
[T]he evidence-based approach will like- ment when making decisions for individuals
ly result in sentencing decisions that more about sentencing, supervision, and treatment.
comprehensively consider relevant utilitar- But over time, the way in which risk is measured
ian and retributive considerations. ‘[R]etri- has evolved considerably.
bution-oriented judges may concern them-
selves with the story of crime, and perhaps 49 According to the National Institute of Justice, recidi-
proceed to construct a narrative about the vism is important due its interplay with incapacitation,
offender’s criminal history, but they are un- specific deterrence, and rehabilitation: “Incapacitation
likely to construct a story of the offender’s refers to the effect of a sanction to stop people from
life as a rehabilitation oriented judge would committing crime by removing the offender from the
community. Specific deterrence is the terminology used
be likely to do.’ Risk and needs assessments to denote whether a sanction stops people from commit-
force judges to focus on both stories–the of- ting further crime, once the sanction has been imposed
fense and offense history as well as the risk or completed. Rehabilitation refers to the extent to which
and protective factors relevant to rehabili- a program is implicated in the reduction of crime by "re-
tation, all in a more precise and accurate pairing" the individual in some way by addressing his or
way.”48 her needs or deficits.” Why Recidivism is a Core Criminal
Justice Concern, National Institute of Justice (Oct.
Far from a complete departure from rehabilita- 3, 2008), https://www.nij.gov/topics/corrections/recidi-
tion and retributivism, the evidence-based risk/ vism/pages/core-concern.aspx.
50 Susan Turner et al., Development of the California
needs assessment model, which we describe in
Static Risk Assessment (CSRA): Recidivism Risk Predic-
the next section, embraces the principles of re-
tion in the California Department of Corrections and
habilitation while attempting to preserve some Rehabilitation, Center for Evidence-Based Correc-
of the standardization provided by retributive tions, University of California-Irvine (2013), http://
approaches. As mentioned above, these evi- ucicorrections.seweb.uci.edu/files/2013/12/Develop-
ment-of-the-CSRA-Recidivism-Risk-Prediction-in-the-CD-
45 Id. CR.pdf; James Bonta & D.A. Andrews, Risk-Need-Respon-
46 Melissa Hamilton, Risk-Needs Assessment: Consti- sivity Model for Offender Assessment and Rehabilitation,
tutional and Ethical Challenges, U. of Houston Law Public Safety Canada (2006-07), https://www.pub-
Center No. 2014-W-2 (Jan. 26, 2015), https://ssrn.com/ licsafety.gc.ca/cnt/rsrcs/pblctns/rsk-nd-rspnsvty/rsk-nd-
abstract=2506397. rspnsvty-eng.pdf.
47 Redding, Evidence-Based Sentencing: The Science of 51 Bonta & Andrews, Risk-Need-Responsivity Model for
Sentencing Policy and Practice, supra note 44, at 3-4. Offender Assessment and Rehabilitation, supra note 50,
48 Id. at 9. at 3.

The largest evolution of risk assessment came with the aforementioned shift towards evidence-based practices and the development of sophisticated tools to measure risk. Evidence-based risk/needs assessment instruments consider the interplay between static and dynamic risk factors. Dynamic risk factors are any factors that contribute to recidivism risk and can change over time. For rehabilitative tools, these factors, which include current age, employment status, and whether a person is in treatment for substance/alcohol abuse, are treated through targeted interventions that are intended to decrease the likelihood of recidivism.52 These dynamic factors are also referred to as “criminogenic needs” since they can be addressed via treatment.53 For example, an offender with alcohol problems might be placed in programming aimed at treating his addiction, which could ultimately decrease his likelihood of reoffending. On the other hand, static risk factors, which include criminal history, age at first arrest, and gender, are also correlated with risk, but they are not targeted for treatment since they cannot be changed. Static factors are, however, often used alongside dynamic factors to evaluate risk of recidivism.54

The second generation of risk assessment tools, which emerged in the 1970s, primarily embraced static factors for measuring risk.55 Many second-generation tools abandon dynamic risk factors altogether,56 and the immutable nature of static factors makes it difficult (if not impossible) for these tools to account for positive changes or progress.57 Since the offender cannot alter static factors, tools which rely upon them might have a discriminatory effect, judging people for factors over which they have no control. The third generation of risk assessments attempted to address the shortcomings of static risk factors by considering static and dynamic factors in tandem with one another.58 This generation, of which risk/needs assessment is a part, is especially useful to rehabilitative models where changing offender characteristics matter. Finally, the fourth generation of risk assessment tools builds on the third generation but embraces a more “systematic and comprehensive” approach to measuring recidivism and treating offenders based on their specific risk factors and characteristics.59

E. Enter the Algorithms: Risk Assessment Software

Today’s fourth-generation risk-assessment tools are far more technically sophisticated and widely available than the rudimentary tools that had been used in the United States to inform parole decisions since the 1920s.60 A number of modern risk-assessment tools take advantage of machine learning algorithms, which generate risk models based on vast quantities of data. As these algorithms are used over time, their models often dynamically adjust to new data. Risk assessment tools and software, many of which incorporate machine learning, are now being used in a variety of contexts, including prison rehabilitation programs, pretrial risk assessment, and sentencing. In this subsection, we describe the primary tools and models used in these three areas.

i. Rehabilitation-Specific Risk Assessment Tools

The foundation of most rehabilitative risk/needs assessment (RNA) tools is the risk-needs-responsivity (RNR) model, which rests on the aforementioned concept of responding to recidivism

52 James, Risk and Needs Assessment in the Criminal Justice System, supra note 40, at 3. For example, if an offender has a history of alcohol or drug abuse (a dynamic factor), they may receive some kind of addiction treatment.
53 D.J. Simourd, Use of Dynamic Risk/Need Assessment Instruments Among Long-Term Incarcerated Offenders, 31 Crim. Just. and Behav., 306, 306-323 (2004).
54 James, Risk and Needs Assessment in the Criminal Justice System, supra note 40, at 3.
55 Bonta & Andrews, Risk-Need-Responsivity Model for Offender Assessment and Rehabilitation, supra note 50, at 3.
56 Turner et al., Development of the California Static Risk Assessment, supra note 50, at 5.
57 Bonta & Andrews, Risk-Need-Responsivity Model for Offender Assessment and Rehabilitation, supra note 50, at 4.
58 Id.
59 Id.
60 Richard A. Berk & Justin Bleich, Statistical Procedures for Forecasting Criminal Behavior, 12 Criminology & Pub. Policy 1, 2 (2013). See also Howard G. Borden, Factors For Predicting Parole Success, 19 J. Crim. Law & Criminology 328, 328-36 (1928), http://scholarlycommons.law.northwestern.edu/cgi/viewcontent.cgi?article=2101&context=jclc.

risk and criminogenic needs through the most appropriate treatment.61 The RNR model, which rose to prominence in the third and fourth generations of risk assessment, is based on three principles:

1. The risk principle, which asserts that risk is predictable and that high-risk offenders should receive different and more intensive treatment than low-risk offenders.62
2. The needs principle, which suggests that rehabilitative treatment and sentencing decisions should respond to the criminogenic needs that contribute to criminal behavior.63
3. The responsivity principle, which describes how treatment should be tailored to the specific offender.64

Many RNA instruments are used in prison rehabilitation programs, and these tools use the RNR model to rehabilitate as well as incapacitate offenders. Canada was a trailblazer in using evidence-based methods for rehabilitation,65 but California and other states in the U.S. have followed suit by incorporating RNA and rehabilitation into treatment and sentencing. Rehabilitative tools like those developed for use in Canada and California target dynamic risk factors for treatment, and use static risk factors to measure risk.66

ii. Pretrial Detention and Release

Another use of risk-assessment tools is for pre-trial detention and release decisions, which generally place more focus on static risk factors. One such pre-trial tool, the Public Safety Assessment (PSA), is used in 29 American jurisdictions, including three entire states: Arizona, Kentucky, and New Jersey. The PSA, which was developed by the Laura and John Arnold Foundation, was built using data from 1.5 million crimes spanning 300 U.S. jurisdictions, and it measures risk using a very narrow set of static risk factors relating primarily to the defendant’s age and criminal history. The PSA does not seek to identify rehabilitative treatments for offenders, but rather was built to help make decisions about whether an individual should be detained or released before going to trial.67 The instrument makes a risk determination based on the aforementioned static risk factors, and this risk classification is used to determine whether a person is low-risk, and can therefore safely be released, or is high-risk, and should be detained.68

iii. Sentencing

Although there has been considerable focus on using risk assessment algorithms in rehabilitation and pretrial decision-making, they have recently drawn attention for their use in sentencing, the primary focus of this paper.69 In 1994, Virginia was the first state to implement a risk assessment instrument for use in sentencing. The instrument, which was created by the Virginia Criminal Sentencing Commission, was designed to identify low-risk felons in order to assign them a more suitable type of punishment.70 These alternative punishments include diversion from prison to jail, diversion from jail to community service or home-arrest, and fines.71 Virginia remains unique in its approach to developing risk assessment tools. While a handful of states

61 Bonta & Andrews, Risk-Need-Responsivity Model for Offender Assessment and Rehabilitation, supra note 50, at 1.
62 James, Risk and Needs Assessment in the Criminal Justice System, supra note 40, at 3.
63 Id.
64 Id. at 6-7.
65 Bonta & Andrews, Risk-Need-Responsivity Model for Offender Assessment and Rehabilitation, supra note 50, at 3.
66 James, Risk and Needs Assessment in the Criminal Justice System, supra note 40, at 1, 13.
67 See infra Part III.A.
68 Public Safety Assessment, Laura and John Arnold Foundation, http://www.arnoldfoundation.org/initiative/criminal-justice/crime-prevention/public-safety-assessment/ (accessed Dec. 15, 2016).
69 See, e.g., Julia Angwin, Make Algorithms Accountable, N.Y. Times (Aug. 1, 2016), http://www.nytimes.com/2016/08/01/opinion/make-algorithms-accountable.html?_r=0; Sari Horwitz, Eric Holder: Basing Sentences on Data Analysis Could Prove Unfair to Minorities, Washington Post (Aug. 1, 2014), https://www.washingtonpost.com/world/national-security/us-attorney-general-eric-holder-urges-against-data-analysis-in-criminal-sentencing/2014/08/01/92d0f7ba-1990-11e4-85b6-c1451e622637_story.html?utm_term=.18af89d61814.
70 Brian Ostrom, Offender Risk Assessment in Virginia: A Three-stage Evaluation: Process of Sentencing Reform, Empirical Study of Diversion and Recidivism, Benefit-cost Analysis, National Center for State Courts: Virginia Criminal Sentencing Commission (2002), http://www.vcsc.virginia.gov/risk_off_rpt.pdf.
71 Id. at 19.

like Virginia and Pennsylvania use risk-assessment tools that have been developed by (or in partnership with) the state government, many more states and jurisdictions have implemented or adapted one of several existing commercial systems.72

One of the first and most popular commercial risk-assessment tools to be used in sentencing is called the Level of Service Inventory – Revised (LSI-R). LSI-R, which was developed by the Canadian company Multi-Health Systems, pulls information from a survey containing a wide set of static and dynamic factors. These factors, which range from criminal history to personality patterns, are used to determine a person’s risk for recidivism as well as the best sentencing options. The tool was initially developed for use in rehabilitation, but it subsequently has been adapted for use in sentencing. LSI-R and adapted versions of it are used to assist sentencing in a number of states and jurisdictions, including Washington73 and California.74

Another popular tool, COMPAS, was created by the company Northpointe. COMPAS assesses variables under five main areas: criminal involvement, relationships/lifestyles, personality/attitudes, family, and social exclusion. It uses a combination of static and dynamic factors in order to assess recidivism risk, and it can be programmed for a variety of use cases.75 Although COMPAS can be employed for purposes beyond sentencing, a number of states, including Wisconsin, Florida, and Michigan, use COMPAS to assist judges with sentencing decisions.76 Because COMPAS is proprietary software, it is not subject to federal oversight and there is almost no transparency about its inner workings, including how it weighs certain variables. COMPAS has created a considerable amount of controversy for this very reason.

F. Risk-Assessment Validity and Adoption

Accuracy is of paramount concern when it comes to using risk assessment instruments, especially in the sentencing context. A 2006 study in the Journal of Criminal Justice that examined the importance of implementation integrity for LSI-R noted that while it is important that high-risk offenders receive more severe sentences, it is equally important that low-risk offenders receive less severe sentences.77 Risk-assessment algorithms are useful for identifying these high- and low-risk offenders, but it is important that they are identified accurately since inaccuracies would not only be unjust, but could actually make individuals more likely to recidivate.78

Research has generally confirmed that risk assessment instruments can predict who is at risk to recidivate with at least some degree of accuracy.79 Furthermore, a number of academics like James Bonta argue that actuarial assessment, which is at work in risk-assessment algorithms, is preferable to clinical assessment.80 Bonta notes that studies have generally credited greater accuracy and predictive validity to the objectivity

72 Algorithms in the Criminal Justice System, Electronic Privacy Information Center (n.d.), https://epic.org/algorithmic-transparency/crim-justice/ (accessed Dec. 15, 2016).
73 Sex Offender Sentencing in Washington State: Predicting Recidivism Based on the LSI-R, Washington State Institute for Public Policy (2006), http://www.wsipp.wa.gov/ReportFile/935/Wsipp_Predicting-Recidivism-Based-on-the-LSI-R_Predicting-Recidivism-Based-on-the-LSI-R.pdf.
74 Turner et al., Development of the California Static Risk Assessment (CSRA), supra note 50.
75 Practitioner’s Guide to COMPAS Core, Northpointe (Mar. 19, 2015), https://assets.documentcloud.org/documents/2840784/Practitioner-s-Guide-to-COMPAS-Core.pdf.
76 Algorithms in the Criminal Justice System, supra note 72.
77 Anthony W. Flores et al., Predicting Outcome with the Level of Service Inventory-Revised: The Importance of Implementation Integrity, 34 J. Crim. Just. 523, 523-29 (2006), http://www.sciencedirect.com.ezp-prod1.hul.harvard.edu/science/article/pii/S0047235206000833 (noting that “the incorporation of the risk principle of offender classification dictates that higher risk individuals warrant the majority of correctional attention, including the most intensive levels of both rehabilitative service and supervision. Conversely, and arguably as important, is the need to leave lower risk individuals free from intense levels of intervention to avoid interference with the protective factors that are likely present in their environment and within themselves.”).
78 Id.
79 James, Risk and Needs Assessment in the Criminal Justice System, supra note 41, at 3.
80 James Bonta, Offender Assessment: General Issues and Considerations, Compendium on Effective Correctional Programming (2000), http://www.csc-scc.gc.ca/005/008/compendium/2000/chap_4-eng.shtml.

of actuarial tools compared to the theoretical nature of professional clinical judgment. Nevertheless, no instrument is completely accurate, and it has even been suggested that there might be some “natural limit” to the accuracy of risk-assessment algorithms.81 Yet risk assessment has received widespread support and is generally considered to be a valid method for predicting risk.

81 John Monahan and Jennifer L. Skeem, Risk Redux: The Resurgence of Risk Assessment in Criminal Sanctioning, 26 Fed. Sent'g Rep. 162 (Feb. 2014).

III. Algorithms and Criminal Sentencing

While the previous section described the history of risk assessment in the criminal justice system broadly, this section focuses specifically on the use of modern risk assessment tools in sentencing decisions. We discuss the inherent challenges of adapting these tools from the parole and pre-trial context to sentencing, and then explain the mechanics of how the scores are currently incorporated into the sentencing process.

A. The Move from Parole and Pre-Trial to Sentencing Risk Assessments

The fact that these algorithms have been successfully used in other parts of the criminal justice system may help explain why lawmakers and judges have been relatively quick to embrace them in the sentencing context. But these risk assessment tools may be better suited and easier to assess in other contexts, such as during pre-trial release, when a judge is evaluating whether a criminal defendant should be held in jail prior to her scheduled appearances in court. The goals of a pre-trial risk evaluation are relatively well defined. A judge is trying to predict whether the defendant will appear in court when she is supposed to, and whether she is likely to commit any crimes in the meantime. If a defendant poses a significant flight risk or a danger to the public, the judge will likely recommend against release, whereas a defendant who appears low risk in both categories is likely to be set free before trial. It is not surprising, therefore, that it has become increasingly common to augment judicial decision-making with risk assessment software like the Public Safety Assessment in order to help reduce the number of individuals behind bars before trial without increasing risk to the public.82

Sentencing, by contrast, involves a much broader range of considerations. A sentencing decision involves first deciding how to punish someone and then, if a judge chooses incarceration, how long a sentence should be. Determining the severity and length of punishment often draws upon a number of different theories of punishment, including individual retribution, rehabilitation, deterrence, and incapacitation.83 Judges often base their decisions on multiple theories, despite their varied goals.84

As discussed above, there is a clear relationship between recidivism and the goals of rehabilitation and incapacitation: individuals who are unlikely to reoffend are typically considered good candidates for rehabilitation and less severe forms of punishment, whereas a high risk of recidivism may support an argument for long-term or permanent incapacitation to protect society against the defendant’s future dangerousness.85 But the links between recidivism and the punishment goals of deterrence and retribution are more tenuous.86 To the extent that a longer

82 Matthew Conlen, Reuben Fischer-Baum & Andy Rossback, Should Prison Sentences Be Based on Crimes that Haven’t Been Committed Yet?, FiveThirtyEight Politics (Aug. 4, 2015), http://fivethirtyeight.com/features/prison-reform-risk-assessment/ (noting that “[t]here is little question that well-designed risk assessment tools ‘work,’ in that they predict behavior better than unaided expert opinion.”). See, e.g., J.C. Oleson et al., Training To See Risk: Measuring the Accuracy of Clinical and Actuarial Risk Assessment Among Federal Probation Officers, 75 Fed. Prob. 52 (Sept. 2011) (finding that federal probation officers “made more consistent and accurate assessments of offender risk when using [a risk assessment tool] than when using unstructured clinical judgment” or relying on professional experience). Several professors that we interviewed for this paper also indicated that forthcoming studies find similar results in the pre-trial risk assessment context, where decisions aided by algorithmic risk assessment tools are more accurate at predicting which offenders are likely to commit crimes if released than decisions made solely relying on a judge’s unguided human judgment.
83 See Model Penal Code § 1.02(2), which notes that the general purposes of sentencing include, among others, “prevent[ing] the commission of offenses,” “promot[ing] the correction and rehabilitation of offenders,” and “differentiat[ing] among offenders with a view to a just individualization in their treatment.”
84 In one study, eighteen judges were asked to report information about their decisions on 1000 adult offenders, and the results suggested they rarely attributed their decision to any one goal. Gottfredson, Selective Incapacitation?, supra note 14.
85 Bernard Harcourt, Against Prediction 31-34 (2005).
86 See, e.g., Paul Gendreau et al., The Effects of Prison Sentences on Recidivism, Department of the Solicitor General of Canada (1999), https://www.prisonpolicy.org/scans/gendreau.pdf (concluding that “[p]risons should not be used with the expectation of reducing criminal behavior” and “[t]he primary justification of prison should be to incapacitate offenders (particularly

prison sentence deters the individual who is receiving the sentence from committing future crimes, it arguably has an impact on recidivism. But we tend to think of deterrence in terms of society more broadly, and how the decision to punish an individual for a crime will impact others who might be inclined to commit the same crime. This broader conception of deterrence bears little relation to an individual’s risk of committing future crimes. Finally, retribution, although focused entirely on the individual criminal, is a backward-looking assessment of his blameworthiness. A criminal’s future dangerousness has little relevance to ensuring that he gets his “just deserts” for the crime he previously committed. Thus, in sentencing decisions, although recidivism may be a relevant factor, it is hardly the only consideration, and may not even be a central or determinative one.87

Moreover, regardless of a judge’s primary theory of punishment, it is less clear how he or she should use a risk assessment score to inform a sentencing decision as opposed to the pre-trial release context. Before trial, a judge faces a decision that is essentially binary: should the prisoner stay in jail for the duration of the pre-trial period or not? But at sentencing, a judge also has to decide how long the punishment should be. There is little positive evidence supporting the notion that a longer criminal sentence has a significant impact on an individual’s recidivism.88 And so it does not necessarily follow that a longer sentence will decrease the likelihood that a criminal will commit crimes again in the future.89 A judge therefore faces a more complicated question about how to use the information provided in the risk assessment score, and his answer may be highly dependent on his own primary theory of punishment. Or, he may simply take a risk-averse approach and impose more stringent sentences on criminals who are labeled high risk in order to avoid potential blame for a high-risk criminal who received a less severe sentence and ultimately did reoffend.90

These differences do not necessarily suggest that these tools should only be used in the pre-trial risk assessment context, but rather that expanding to other, more complicated areas like sentencing requires a great deal of thought. In particular, sentencing authorities need to consider which goals of punishment they are trying to achieve and how algorithmic tools could help maximize for those goals, if possible.91 Part of this process may also involve thoughtful deliberation about how to quantify effects like deterrence and retribution, which are harder to measure mathematically than recidivism but may be valuable ends. Moreover, this highlights the need for research to inform our understanding of how factors like the type and length of the sentence impact future outcomes.

B. The Sentencing Process

Despite the complexity of using these instruments in sentencing, as noted above, states are increasingly recommending or mandating their use. In this subsection we provide some context about how sentencing works generally and how these risk scores are specifically being incorporated into that process today.

those of a chronic, high-risk nature) for reasonable periods and to exact retribution.”).
87 For further discussion of these concepts, see Harcourt, Against Prediction, supra note 85, at 31-34, 188-89.
88 Sonja B. Starr, Evidence-Based Sentencing and the Scientific Rationalization of Discrimination, 66 Stan. L. Rev. 803, 855-56 (2014) (noting that “[t]he instruments tell us, at best, who has the highest risk of recidivism…. not… whose risk of recidivism will be reduced the most by incarceration” and that EBS “predictions are not conditional on the sentence.”).
89 We should note that there is potential for a false positive here: a criminal who is identified as high risk for recidivism and subsequently given a longer sentence may therefore appear as a recidivism “success” because the fact that he is in prison for longer deprives him of the opportunity to commit future crimes. However, that does not mean that the punishment itself lowers his risk of recidivism, but rather that his incapacitation makes it difficult or impossible to commit crimes. For further discussion, see Jennifer L. Doleac & Megan Stevenson, Are Risk Assessment Scores Racist?, Brookings Inst. (Aug. 22, 2016), https://www.brookings.edu/blog/up-front/2016/08/22/are-criminal-risk-assessment-scores-racist/.
90 Judges are likely to overcorrect and err on the side of a higher rate of false positives rather than bear the personal and societal risk of a recidivist committing a crime.
91 Interview with Jim Greiner, Professor, Harvard Law School, and Chris Griffin, Research Director, Harvard Law School’s Access to Justice Lab (Nov. 7, 2016). Greiner and Griffin noted that in order for a risk assessment tool to “work,” it has to know how success is defined and maximize toward that goal.

A criminal sentencing typically unfolds as follows: after a defendant has been convicted, the judge or sentencing authority requests a pre-sentence investigation report (PSI) with pertinent information about the defendant’s life and background. This report is usually prepared by an officer of the court with a background in social work (not a lawyer) and may include information about a defendant’s criminal record, details from interviews with the defendant’s family, friends, and former employers, and other personal and biographical details. From a legal standpoint, there are few restrictions on what this pre-sentence investigation report may contain. Although strict rules govern what evidence can be introduced during the guilt phase of a trial, at sentencing a judge is free to consider a wide range of additional evidence without running afoul of a defendant’s right to due process.92 The rationale for the distinction is that sentencing is not just about the narrow issue of guilt, but is also informed by a defendant’s life and characteristics. In our system, not every offense in a particular legal category calls for an identical punishment absent consideration of the past life, habits, and prior criminal record of a particular offender.93

Once the pre-sentence investigation report has been compiled, it is provided to the judge for review. Although the information in the PSI is generally made available to the defendant or his counsel as well, certain information or parts of the report may be considered confidential and kept from the defendant. The justification for this selective redaction is that the individuals speaking with the social worker compiling the report may wish to do so in confidence, especially if they fear reprisal from the defendant. Without a guarantee that the defendant will not be able to see that information, they might be hesitant or altogether unwilling to talk, thereby reducing the amount of information upon which a judge can base her sentencing decision.94 Once the judge receives the PSI and any additional evidence presented at a sentencing hearing, she is free to use that information however she sees fit in making a final determination.

C. Evidence-Based Sentencing and the Embrace of Risk Scores

In recent years, as legal experts and legislatures have embraced the idea of evidence-based sentencing (EBS), they have aggressively encouraged judges to consider broader studies and risk assessments at sentencing. For example, the latest proposed revision of the sentencing sections of the Model Penal Code (MPC) explicitly endorses the use of risk assessment instruments in the shift to EBS.95 The Conference of Chief Justices, the Conference of State Court Administrators, and the National Center for State Courts have also begun working together on a project to develop evidence-based sentencing practices.96

Like proponents of selective incapacitation in the 1980s, EBS advocates' goals are largely framed in progressive terms: to reduce incarceration and save money by identifying low-risk offenders who can be punished without going to jail.97 Yet, like many of the state statutes discussed below, the current draft language in the MPC is relatively broad in its endorsement of

92 Williams v. New York, 337 U.S. 241, 251 (1949). Although Williams has been overruled in the death penalty context, see Gardner v. Florida, 430 U.S. 349 (1977) (imposing heightened evidentiary requirements for the punishment phase of a capital trial), the holding remains intact for other criminal cases.
93 Williams, 337 U.S. at 251-52.
94 See Gardner, 430 U.S. at 358-59 (noting that “an assurance of confidentiality to potential sources of information is essential to enable investigators to obtain relevant but sensitive disclosures from persons unwilling to comment publicly about a defendant's background or character. The availability of such information… provides the person who prepares the report with greater detail on which to base a sentencing recommendation and, in turn, provides the judge with a better basis for his sentencing decision.”). In Gardner, the Supreme Court ruled that this confidentiality is unconstitutional in the capital sentencing context, but did not impose any such requirement on ordinary criminal trials.
95 According to the latest available draft, § 6B.09 of the revised MPC will endorse the use of “actuarial instruments or processes, supported by current and ongoing recidivism research, that will estimate the relative risks that individual offenders pose to public safety,” including their formal incorporation into sentencing guidelines “[w]hen these instruments or processes prove sufficiently reliable.”
96 Redding, Evidence-Based Sentencing: The Science of Sentencing Policy and Practice, supra note 44, at 7-8.
97 Starr, Evidence-Based Sentencing, supra note 88, at 816.

risk assessment tools. Only the advisory notes indicate any caution or need for “adequate protections” in order to ensure these tools are used fairly or highlight the importance of validity studies and other research to ensure accuracy.98 This embrace of EBS is also at odds with other voices in the criminal justice system, including the Department of Justice, which has taken a more skeptical approach toward the use of algorithms in sentencing.99 In 2014, the Department of Justice noted that “experience and analysis of current risk assessment tools demonstrate that utilizing such tools for determining prison sentences to be served will have a disparate and adverse impact on offenders from poor communities already struggling with many social ills.”100

The statutory language that currently authorizes, and in some cases requires, the use of these tools varies widely across jurisdictions.101 At least five states now require the use of risk assessments in criminal sentencing, but in different ways. Arizona, for example, specifically requires that the presentence reports in all probation-eligible cases “contain case information related to criminogenic risk and needs as documented by the standardized risk assessment and other file and collateral information.”102 Similarly, Oklahoma requires the use of an assessment and evaluation instrument designed to predict risk of recidivism to determine eligibility for any community punishment.103 The Kentucky statute requires that the pre-sentence investigation report include a defendant’s risk and needs assessment, and that sentencing judges “consider” the results and “likely impact of a potential sentence on the reduction of the defendant's potential future criminal behavior.”104 The Ohio legislature took the approach of mandating that the Ohio Department of Rehabilitation and Correction “select a single validated risk assessment tool for adult offenders” that will be used for a variety of purposes that include sentencing.105 Accordingly, the state uses the Ohio Risk Assessment System (ORAS), which it developed in partnership with the University of Cincinnati.106 Similarly, Pennsylvania required that its sentencing commission adopt a risk assessment instrument to help determine appropriate sentences,107 which resulted in a lengthy process undertaken by the state Sentencing Commission to develop its own custom tool and a series of guidelines for its use.108 The extensive processes undertaken in Ohio and Pennsylvania in consultation with a wide range of experts and academics provide a stark contrast to those states which embraced these tools with just a few lines in a statute and have largely left it to individual judges to sort out.

A number of other states merely permit the use of risk assessments in criminal sentencing, acknowledging their potential to guide judicial decision-making and reduce mass incarceration. In Idaho, for example, if a court orders a presentence investigation, the report for all offenders sentenced to prison time and for certain offenders receiving probation must include information about current recidivism rates, differentiated based on whether the offender risk level is low, moderate, or high.109 Louisiana similarly allows courts to use a presentence investigation validated risk and needs assessment tool prior to sentencing an adult offender who is eligible for assessment.110 In Indiana, the state supreme court has recommended that evidence-based offender assessment instruments be used at criminal sentencing.111 The West Virginia Supreme Court has indicated in an unpublished decision that although the legislature requires probation officers to conduct standardized risk and needs assessments,112 the court retains discretion to decide how to use these tools to inform sentencing decisions.113

98 Id.
99 DOJ Letter, supra note 37, at 7.
100 Id.
101 For a broader overview of how various risk assessment algorithms are used state-by-state, see Algorithms in the Criminal Justice System, supra note 72.
102 Ariz. Code of Judicial Admin. § 6–201.01(J)(3).
103 Okla. Stat. tit. 22, § 988.18(B).
104 Ky. Rev. Stat. Ann. § 532.007(3)(a)–(b).
105 Oh. Rev. Code Ann. § 5120.114(A)(1)–(3).
106 Ohio Risk Assessment System, Ohio Department of Rehabilitation and Correction, http://www.drc.ohio.gov/oras.
107 42 Pa. Cons. Stat. § 2154.7(a)
108 See supra note 72 and accompanying text.
109 See Idaho Code § 19–2517.
110 Louisiana Stat. Ann. § 15:326(A).
111 Malenchik v. State, 928 N.E.2d 564, 575 (Ind. 2010) (holding that “trial courts are encouraged to employ evidence-based offender assessment instruments… as supplemental considerations in crafting a penal program tailored to each individual defendant.”).
112 See W. Va. Code § 62–12–6(a)(2).
113 State v. Rogers, No. 14–0373, 2015 WL 869323, at *4 (W.Va. Jan. 9, 2015).

IV. Legal Issues Raised By Risk Assessments in Sentencing

When critics discuss these risk assessment tools, one of the first questions that comes up often centers on the legality of their use. In this section, we explore the primary legal issues raised by the use of risk assessments in sentencing. We begin with an analysis of the leading case on this issue from the Supreme Court of Wisconsin. We then discuss the broader constitutional issues implicated by these algorithms and related sentencing questions.

A. COMPAS Considered in Wisconsin: The Loomis Case

In the summer of 2016, the Supreme Court of Wisconsin considered the legality of using risk-assessment software in criminal sentencing.114 State v. Loomis is one of the first major cases in the United States to address concerns about whether a judge’s consideration of a software-generated risk assessment score during sentencing constitutes a violation of due process or overt discrimination.115 The decision generated mixed reactions from both academics and the public for its endorsement of the use of risk assessment scores in sentencing despite clear hesitation on the part of all three judges in the panel about the potential for bias and other troubling implications of the use of these algorithms.116

Eric Loomis, the defendant in the case, was arrested for operating the vehicle during a drive-by shooting and pled guilty to lesser charges of fleeing the police and driving a stolen car.117 After he pled guilty, the court requested a pre-sentence investigation report, which included among other information a risk score calculated using COMPAS. Loomis was designated by the COMPAS algorithm as high risk for all three types of recidivism measured by the program: pre-trial recidivism, general recidivism, and violent recidivism. The fact that he was a registered sex offender likely contributed to that score,118 although the proprietary nature of the software makes it difficult to pinpoint exactly why he was designated high risk. Nonetheless, the state argued that the court should consider all three high-risk scores when determining the appropriate sentence.119 Loomis received a six-year prison sentence, and at the hearing Judge Scott Horne told him: “The risk assessment tools that have been utilized suggest that you’re extremely high risk to reoffend.”120

Loomis challenged his sentence, arguing that the judge’s use of the risk assessment score violated his right to due process, that is, his constitutional right to a fair trial. Specifically, he argued that it violated due process for three reasons: (1) it violated his right to be sentenced based on accurate information because the proprietary nature of the COMPAS software prevented him from assessing the accuracy of the score; (2) it violated his right to an individualized sentence because it relied on information about the characteristics of a larger group to make an inference about his personal likelihood to commit future crimes; and (3) it improperly used “gendered assessments” in calculating the score.121 Ultimately, the court rejected Loomis’s claims and held that COMPAS could be used at sentencing, although it made several recommendations about limiting COMPAS’s use in future cases.122

In response to the accuracy argument, the court acknowledged that the proprietary nature of COMPAS prevented Loomis from seeing exactly how his score was calculated. However, since most of the information the algorithm used came from a questionnaire that he completed and public records, the court concluded that he had an opportunity to ensure that the information was accurate.123 “[T]o the extent that Loomis's risk assessment is based upon his answers to questions and publicly available data about his criminal history, Loomis had the opportunity to verify that the questions and answers listed on the COMPAS report were accurate.”124

The court responded to his argument about his right to an individualized sentence by distinguishing a hypothetical case where the risk assessment score was either the only factor or the determinative factor in a sentencing decision from the present case, where the risk score was simply one piece of information among many that the judge considered in the sentencing decision.125 The court suggested that a due process challenge might succeed if the risk assessment score was the determinative or sole factor the judge considered, but rejected Loomis’s argument that considering it at all constituted a due process violation.126 The court emphasized: “COMPAS has the potential to provide sentencing courts with more complete information to address [the] enhanced need [for more complete information up front].”127 In support of this assertion, the court cited Malenchik v. State, a 2010 Indiana Supreme Court decision that looked at similar risk assessment tools and found that they help judges “more effectively evaluate and weigh several express statutory sentencing considerations such as criminal history, the likelihood of affirmative response to probation or short term imprisonment, and the character and attitudes indicating that a defendant is unlikely to commit another crime.”128

Finally, the court considered Loomis’s challenge to the use of gender as a variable that can change a defendant’s risk score. This issue was complicated by the fact that the COMPAS algorithm is proprietary, and the parties in the case disputed the mechanics of how COMPAS takes gender into account.129 Loomis argued that the algorithm considered gender as a criminogenic factor, whereas the state argued that it is used solely for “statistical norming,” that is, to compare each offender to a “norming” group of his or her own gender. Nonetheless, Loomis objected to any use of gender in calculating the scores; the state, in response, argued that gender needs to be considered in a risk assessment to achieve statistical accuracy because men and women have different rates of recidivism and different rehabilitation potential.130 The court, rejecting Loomis’s argument, found that “if the inclusion of gender promotes accuracy, it serves the interests of institutions and defendants, rather than a discriminatory purpose.”131

Loomis further argued that even if the statistical generalizations based on gender were accurate, they were unconstitutional. In support of this claim, he cited Craig v. Boren, a 1976 case where the Supreme Court held that an Oklahoma law that treated men and women differently was unconstitutional even though it was based on empirical data that supported the gender-based difference in the law.132 The Supreme Court reasoned in Craig that “the principles embodied in the Equal Protection Clause [of the Fourteenth Amendment] are not to be rendered inapplicable by statistically measured but loose-fitting generalities concerning the… tendencies of aggregate groups.”133 Loomis, however, failed to raise his claim as an Equal Protection violation, as the court had found in Craig v. Boren, instead arguing that the use of gender violated his right to due process. But the Wisconsin court found that Loomis had not met the burden of proving that the court actually relied on gender as a factor in imposing his sentence, especially since the judge did not mention it in explaining his rationale.134

Having rejected all three of Loomis’s due process claims, the Wisconsin court approved the use of COMPAS in this particular case, but it did express some hesitation about its future use absent clear limitations. The court first outlined permissible uses for the software, noting that while COMPAS cannot be determinative, the risk scores can be considered a relevant factor in several circumstances, including: (1) “diverting low-risk prison-bound offenders to a non-prison alternative,” (2) assessing the public safety risk an offender poses and whether he can be safely and effectively supervised in the community rather than in prison, and (3) informing decisions about the terms and conditions of probation and supervision.135

The court went on to prescribe key limitations for its use. While the risk score can help a judge better understand a defendant’s unique situation and relevant factors, the court held that it should not be used to determine the length or severity of the punishment, and certainly should not be counted as an official aggravating or mitigating factor in a sentencing decision.136 The court acknowledged that COMPAS was not designed with all of the goals of punishment in mind, but rather with a focus on recidivism alone. Its lack of relevance to other important sentencing aims like retribution (which is a backward-looking assessment of an individual’s blameworthiness) and deterrence (a broader concept that goes beyond the individual) makes it a “poor fit” for determining the length of the sentence.137 In order to ensure that these limitations are being followed, the court mandated that a judge must explain at sentencing “the factors in addition to a COMPAS risk assessment that independently support the sentence imposed.”138

The court also addressed the information that should be included in any pre-sentence investigation report containing a COMPAS score. This “written advisement of its limitations” should explain that:

1. COMPAS is a proprietary tool, which has prevented the disclosure of specific information about the weights of the factors or how risk scores are calculated;
2. COMPAS scores are based on group data, and therefore identify groups with characteristics that make them high-risk offenders, not particular high-risk individuals;
3. Several studies have suggested the COMPAS algorithm may be biased in how it classifies minority offenders;
4. COMPAS compares defendants to a national sample, but has not completed a cross-validation study for a Wisconsin population, and tools like this must be constantly monitored and updated for accuracy as populations change; and
5. COMPAS was not originally developed for use at sentencing.139

The concurring opinions reiterated the note of caution about relying on the COMPAS score in a meaningful way. Chief Judge Patience Drake Roggensack wrote separately to clarify that while a sentencing judge may consider a COMPAS score, he may not rely on it in making his sentencing decision.140 Judge Shirley Abrahamson also wrote separately to emphasize that in considering COMPAS or other tools in sentencing, a judge “must set forth on the record a meaningful process of reasoning addressing the relevance, strengths, and weaknesses of the risk assessment tool” as a means to address concerns about their use.141 She also noted that the lack of understanding about COMPAS and how it works was a “significant problem” in this case.142

The Loomis case was a landmark decision, since it was the first time a U.S. court evaluated these algorithms head on. The post-decision headlines made sweeping declarations like “Secret algorithms that predict future criminals get a thumbs up from Wisconsin Supreme Court”143

114 State v. Loomis, 881 N.W.2d 749 (Wisc. 2016).
115 Joe Palazzolo, Wisconsin Supreme Court to Rule on Predictive Algorithms Used in Sentencing, Wall St. J. (Jun. 5, 2016), http://www.wsj.com/articles/wisconsin-supreme-court-to-rule-on-predictive-algorithms-used-in-sentencing-1465119008.
116 See, e.g., Chiel, Secret Algorithms That Predict Future Criminals Get a Thumbs Up from the Wisconsin Supreme Court, supra note 1; Interview with Sonja Starr, Professor, University of Michigan Law School (Oct. 28, 2016); Interview with Jim Greiner, Professor, Harvard Law School, and Chris Griffin, Research Director, Harvard Law School’s Access to Justice Lab (Nov. 7, 2016).
117 Megan Garber, When Algorithms Take the Stand, The Atlantic (Jun. 30, 2016), http://www.theatlantic.com/technology/archive/2016/06/when-algorithms-take-the-stand/489566/.
118 See Mitch Smith, In Wisconsin, A Backlash Against Using Data to Foretell Defendants’ Futures, N.Y. Times (Jun. 22, 2016), http://www.nytimes.com/2016/06/23/us/backlash-in-wisconsin-against-using-data-to-foretell-defendants-futures.html.
119 Loomis, 881 N.W.2d at 755.
120 Palazzolo, Wisconsin Supreme Court to Rule on Predictive Algorithms Used in Sentencing, supra note 115.
121 Loomis, 881 N.W.2d at 757.
122 Id.
123 Id. at 761-62.
124 Id.
125 Id. at 765.
126 Id.
127 Id.
128 Malenchik v. State, 928 N.E.2d 564, 574 (Ind. 2010).
129 Loomis, 881 N.W.2d at 765.
130 Id.
131 Id. at 767.
132 Craig v. Boren, 429 U.S. 190, 208-10 (1976).
133 Id. at 208-09.
134 Loomis, 881 N.W.2d at 767 (noting that the judge specifically referenced “your history, your history on supervision, and the risk assessment tools that have been utilized” when explaining to Loomis why he was at a high risk to reoffend).
135 Id. at 767-78.
136 Id. at 768.
137 Id. at 769.
138 Id.
139 Id. at 769-70.
140 Id. at 772 (Roggensack, C.J., concurring).
141 Id. at 774-75 (Abrahamson, J., concurring).
142 Id. at 774.
143 Chiel, Secret Algorithms that Predict Future Criminals Get a Thumbs Up From Wisconsin Supreme Court, supra note 1.

and “(Un)fairness of Risk Scores in Criminal Although it is not a given that elected judges
Sentencing.”144 Yet legal experts noted that the will impose harsher sentences, when campaign-
court’s analysis was more sophisticated and did ing they may find it extraordinarily difficult to
not simply rubber stamp the use of programs like defend a decision to give a light sentence to a
COMPAS without any safeguards whatsoever.145 “high risk” offender, especially if that individual
Moreover, its implications are limited, not only actually does commit future crimes.
because it is binding only in the state of Wiscon-
sin, but also because Loomis chose not to bring The court arguably addressed the surface is-
all possible claims challenging its constitution- sues by adding caveats and mandating certain
ality. First, he opted not to contest any of the so- disclosures accompany COMPAS scores in pre-
cioeconomic variables used in COMPAS. And, as sentence investigation reports, but it was silent
the opinion noted, while Loomis argued that the on the underlying question of why the scores are
use of gender as a variable was problematic, he being included in the report at all if they should
did not frame it as an Equal Protection violation, not affect that length of the sentence.150 As the
altering the court’s analysis of the issue.146 highest court in Wisconsin, the judges deciding
this case certainly had the authority to take a
Some critics have also pointed out flaws in the stronger stand and tell lower courts in the state
court’s analysis of the issues that were before it. not to consider these scores at all, but they did
University of Michigan Law Professor Sonja Starr not.
argues that the court erred in its analysis of the
gender issue, and that, under existing constitu- As one of the first cases to meaningfully address
tional doctrine, saying that the inclusion of gen- the most recent incarnation of risk assessment
der makes the instrument more accurate is sim- scores, Loomis is significant but by no means
ply not enough to justify including it.147 Moreover, determinative. The opinion demonstrated not
by expressing concerns about the potential for only the challenges that the court faced in un-
unfairness and discrimination in using COMPAS derstanding how programs like COMPAS work,
but still approving it in this case, the court may but also the fact that there is little helpful prec-
ultimately fail to meaningfully restrict the use of edent to guide judges’ decision-making when it
the instrument. It is unclear, for example, how a comes to assessing their legality and crafting
judge might use a risk score if he cannot change meaningful restrictions. And it is clearly not the
the length of the sentence based on that num- end of the discussion. In the spring of 2017, the
ber.148 Nor does the opinion acknowledge that U.S. Supreme Court asked the federal govern-
in jurisdictions like Wisconsin, where judges are ment to weigh in on the question of whether it
elected, it is difficult to imagine that a “high should hear Loomis’ petition for a writ of certio-
risk” label will not result in a longer sentence.149 rari, an indication of some interest in the issue—
although the high court has yet to decide if it
144 Danielle Citron, (Un)Fairness of Risk Scores in will allow the appeal to go forward.151
Criminal Sentencing, Forbes (Jul. 13, 2016), http://
www.forbes.com/sites/daniellecitron/2016/07/13/
unfairness-of-risk-scores-in-criminal-sentenc- B. Constitutional Issues Implicated By
ing/#386c67d24479. Risk Assessment Algorithms
145 Lauren Kirchner, Wisconsin Court: Warning Labels Broadly speaking, risk-assessment systems
Are Needed for Scores Rating Defendants’ Risk of Future raise two primary constitutional concerns: their
Crime, ProPublica (Jul. 14, 2016), https://www.propubli- impact on an individual’s right to due process,
ca.org/article/wisconsin-court-warning-labels-needed- and the potential that the inclusion of certain
scores-rating-risk-future-crime. variables constitutes an equal protection viola-
146 Starr describes why this argument fits into the Equal
Protection Clause of the Fourteenth Amendment in a Michigan Law School (Oct. 28, 2016).
2014 Stanford Law Review article. See infra Part IV.B. 150 State v. Loomis: Wisconsin Supreme Court Requires
147 Interview with the Sonja Starr, Professor, University of Warning Before Use of Algorithmic Risk Assessments in
Michigan Law School (Oct. 28, 2016). Sentencing, 130 Harv. L. Rev. 1530 (Mar. 10, 2017).
148 Id.; Interview with Jim Greiner, Professor, Harvard Law 151 Adam Liptak, Sent to Prison by a Software Program’s
School, and Chris Griffin, Research Director, Harvard Secret Algorithms, N.Y. Times (May 1, 2017), https://www.
Law School’s Access to Justice Lab (Nov. 7, 2016). nytimes.com/2017/05/01/us/politics/sent-to-prison-by-a-
149 Interview with the Sonja Starr, Professor, University of software-programs-secret-algorithms.html?_r=2.

21
tion. It is worth noting that many of these same Court held that Loomis’s challenge did not clear
issues came up during the selective incapacita- the constitutional hurdles. Importantly, the case
tion movement in the 1980s, which raised con- relies on two prior state court decisions: State
cerns about both individualized sentencing and v. Samsa, which considered the court’s reliance
fairness.152 Although the Due Process and Equal on COMPAS scores provided in pre-sentence
Protection claims are related, we treat them investigation reports (but did not address due
separately here in order to make the argument process considerations),156 and State v. Skaff,
clearer. a 1989 decision which held that the right to be
sentenced based on accurate information in-
i. Due Process Challenges: The Right cludes the right to review and verify information
to Review and Verify Sentencing contained in the pre-sentence investigation re-
port.157 The crux of the court’s reasoning in the
Information Loomis decision was the fact that the COMPAS
As explained above, the information that a judge score cannot be the only thing the sentence
considers at sentencing is not constrained by is based on, or even the determinative factor,
traditional evidentiary rules. Judges tradition- thereby arguably ensuring that the judge will
ally have discretion to consider a wide range consider other information about the particular
of factors about the defendant’s personal his- case and assign an individual sentence based
tory, prior criminal record, and other details as on the totality of the circumstances.158 The court
part of the decision-making process; in Williams also reasoned that the right to review and verify
v. New York, the Supreme Court explained why information in the PSI was satisfied because the
such information, typically provided through a defendant could review and correct the public
pre-sentence investigation report, might be use- records upon which COMPAS relies, and the rest
ful to a sentencing judge and why it is not un- of the information was provided by the defen-
fair to the defendant to rely on such information dant himself in a questionnaire.159
even if it not admissible during the guilt phase
of the trial.153 At the same time, however, the Su-
Loomis’s failure to succeed in his due process
preme Court has subsequently recognized in
claim does not, of course, foreclose this line of
Gardner v. Florida that the sentencing process
argument in the future. It remains to be seen
itself must satisfy the requirements of the Due whether the limitations described by the court
Process clause of the Fourteenth Amendment.154 are actually sufficient to protect a defendant’s
Although Gardner is a capital case—and there- right to an individual sentence. It is particu-
fore is subject to certain heightened restrictions larly difficult to assess whether the risk score
compared to ordinary criminal sentencing cas- was a determinative factor in the judge’s deci-
es—it raises the question about whether a de- sion-making process, which the court suggests
fendant has a meaningful opportunity to refute, would rise to the level of a due process viola-
supplement, or explain the information upon
which his sentencing decision is based.155 156 State v. Samsa, 359 Wis.2d 580, 590 (Ct. App. Wisc.
2014) (rejecting a challenge to a sentence on the basis
Because the use of risk assessment algorithms that “COMPAS is merely one tool available to a court at
is so new, there have not been many legal chal- the time of sentencing and a court is free to rely on por-
lenges under the Due Process clause. The Loom- tions of the assessment while rejecting other portions.”).
is case, discussed above, challenged the use 157 State v. Skaff, 152 Wis.2d 48, 57-58 (Ct. App. Wisc.
of COMPAS as a violation of the defendant’s 1989).
due process rights. But the Wisconsin Supreme 158 It is likely significant that the judge told Loomis at the
sentencing hearing that the COMPAS score was one of
152 Mathieson, Selective Incapacitation: Reducing Crimes multiple factors that he weighed when ruling out pro-
Through Predictions of Recidivism, supra note 7. bation and assigning a six-year prison term: “In terms
153 See Williams v. New York, 337 U.S. 241, 251 (1949). of weighing the various factors, I'm ruling out probation
154 Gardner v. Florida, 430 U.S. 349, 359 (1977) (noting because of the seriousness of the crime and because
that “[t]he defendant has a legitimate interest in the your history, your history on supervision, and the risk
character of the procedure which leads to the imposition assessment tools that have been utilized, suggest that
of sentence even if he may have no right to object to a you're extremely high risk to re-offend.” State v. Loomis,
particular result of the sentencing process.”). 881 N.W.2d 749, 755 (Wisc. 2016).
155 Id. 159 Id. at 765.

22
tion,160 but without identifying a specific test too common in our criminal justice system
that could be used to help make that determina- and in our society.163
tion. Nor does the decision address the fact that
there is a plausible distinction between being Holder went on to urge extreme caution when
able to review and rebut the individual pieces of sentencing criminals not based on the facts of
information that are fed into the algorithm and the crimes committed and the defendant’s crim-
being able to actual review how the score itself inal history, but also on factors outside his or her
was calculated. control “or on the possibility of a future crime
that has not taken place.”164 A few days prior
ii. Equal Protection: Are We to Holder’s speech, the Department of Justice’s
Embracing Explicit Discrimination Criminal Division had sent a letter to the Chair
of the U.S. Sentencing Commission expressing
Under Technocratic Framing? similar concerns about the use of predictive
In Griffin v. Illinois, a seminal 1956 case about analysis in criminal sentencing.165 The letter not-
the rights of indigent defendants, Justice Hugo ed that these risk assessment instruments “raise
Black wrote that “providing equal justice to poor constitutional questions because of the use of
and rich, weak and powerful alike” as “the cen- group based characteristics and suspect classi-
tral aim of our entire judicial system—all people
fications in the analytics.”166 Although the Equal
charged with crime must, so far as the law is Protection Clause does not require that the gov-
concerned, stand on an equality before the bar ernment treat every person exactly the same, it
of justice in every American court.”161 The con- does prohibit discrimination if it is based upon
cept of individualism is at the heart of the Su- impermissible classifications.
preme Court’s Equal Protection jurisprudence,
which flows from the clause in the Fourteenth
In a 2014 Stanford Law Review article published
Amendment of the U.S. Constitution providing
the same year, Professor Starr lays out a de-
that no state shall “deny to any person within its
tailed strategy for challenging the use of these
jurisdiction the equal protection of the laws.”162
risk-assessment instruments under the Equal
Protection Clause. Her basic thesis is that us-
In a 2014 speech to the National Association ing risk assessment scores in criminal sentenc-
of Criminal Defense Lawyers, former Attorney ing represents “an explicit embrace of other-
General Eric Holder expressed serious concerns wise-condemned discrimination, sanitized by
about the use of risk assessment software and scientific language.”167 By including variables
its potential to undermine this central tenet of like age and gender as well as socioeconomic
the criminal justice system. He told the audi- factors like employment and education, these
ence:
systems are enabling judges to consider fac-
tors that have long been considered inappropri-
Although these measures were crafted with ate to bring into criminal sentencing. While we
the best of intentions, I am concerned that would object to the idea of judges systematical-
they may inadvertently undermine our ef- ly imposing harsher sentences on defendants
forts to ensure individualized and equal who are poor or uneducated or from a certain
justice. By basing sentencing decisions on demographic group, we are essentially sanc-
static factors and immutable characteris-
tics – like the defendant’s education level,
163 Attorney General Eric Holder Speaks at the National
socioeconomic background, or neighbor- Association of Criminal Defense Lawyers 57th Annual
hood – they may exacerbate unwarranted Meeting and 13th State Criminal Justice Network Con-
and unjust disparities that are already far ference, U.S. Dept. of Justice (Aug. 1, 2014), https://
www.justice.gov/opa/speech/attorney-general-eric-hold-
160 Id. (noting that “[i]f a COMPAS risk assessment were er-speaks-national-association-criminal-defense-law-
the determinative factor considered at sentencing this yers-57th.
would raise due process challenges regarding whether a 164 Id.
defendant received an individualized sentence.”). 165 DOJ Letter, supra note 37, at 4-8.
161 Griffin v. Illinois, 351 U.S. 12, 16-17 (1956) (quotation 166 Id. at 7.
marks omitted). 167 Starr, Evidence-Based Sentencing, supra note 88, at
162 U.S. Const. amend. XIV, § 1. 803, 806.

23
tioning the practice by encouraging the use of risk systems that, despite their “technocratic framing,” take these variables into account. Moreover, Starr argues, these systems are unconstitutional, because the Supreme Court has consistently held that otherwise-impermissible discrimination cannot be justified by statistical generalizations about groups, such as a particular race or gender, even if those generalizations are, on average, accurate.168 Our criminal justice system is premised on the idea that people have a right to be treated as individuals under the law.

iii. Race as a Variable
Virtually everyone agrees that race would be a constitutionally impermissible factor to include, and thus it is not included as an explicit variable in any of these systems.169 Explicit race-based classifications are subjected to the highest level of scrutiny by the courts, and when strict scrutiny applies it is virtually always fatal to the law or regulation being challenged. Thus if race were explicitly included as an input in the COMPAS algorithm, its use in sentencing criminal defendants would almost certainly constitute an Equal Protection violation.

However, excluding race itself does not necessarily mean that factors that correlate heavily with an individual’s race (serving essentially as proxies for race) are excluded from these algorithms.170 Nor are factors that have disparate impact based on the race of the individual, such as a question that asks a criminal defendant the number of times he or she has been stopped by the police.171 As data scientist Cathy O’Neil writes in her book, Weapons of Math Destruction:

    [I]t’s easy to imagine how inmates from a privileged background would answer one way and those from tough inner-city streets another. Ask a criminal who grew up in comfortable suburbs about “the first time you were ever involved with the police,” and he might not have a single incident to report other than the one that brought him to prison. Young black males, by contrast, are likely to have been stopped by police dozens of times, even when they’ve done nothing wrong… So if early “involvement” with the police signals recidivism, poor people and racial minorities look far riskier.172

Unfortunately, while O’Neil and other critics correctly point out that using factors which correlate with race may be troubling, existing constitutional doctrine does not suggest that their inclusion in a risk assessment instrument would constitute an Equal Protection violation. The current standard for evaluating whether a facially neutral law (or in this case, the use of a facially neutral factor, like the number of reported contacts with the police) that has a racially disparate impact violates the Equal Protection Clause comes from Washington v. Davis.173 Washington v. Davis held that while disproportionate impact on the members of a particular racial group is not irrelevant, strict scrutiny is only triggered if the individuals challenging the law can show that it was also adopted with a racially discriminatory intent. If not, rational basis review applies, a highly deferential standard. In the case of risk assessment algorithms, a criminal defendant challenging his or her sentence would have to be able to prove that the variable that correlated heavily with race was included for the purpose of racial discrimination, which is an extraordinarily difficult burden to meet.174 Only a handful of cases in the forty years since Washington v. Davis was decided have successfully proven racially discriminatory intent, and they

168 See, e.g., Craig v. Boren, 429 U.S. 190, 210 (1976), discussed supra note 132 and accompanying text.
169 Luis Daniel, The Dangers of Evidence-Based Sentencing, GovLab Blog (Oct. 31, 2014), http://thegovlab.org/the-dangers-of-evidence-based-sentencing/ (noting that “[o]verwhelmingly, states do not include race in the risk assessments since there seems to be a general consensus that doing so would be unconstitutional.”).
170 Michal Kosinski et al., Private Traits and Attributes are Predictable from Digital Records of Human Behavior, 110 Proceedings of the National Academy of Sciences of the United States of America, 5802, 5802-05 (2012) (finding that easily accessible digital records such as Facebook “likes” can be used to automatically and accurately predict highly sensitive personal information, including ethnicity).
171 Id.
172 Cathy O’Neil, Weapons of Math Destruction 25-26 (2016).
173 Washington v. Davis, 426 U.S. 229 (1976).
174 See Personnel Administrator v. Feeney, 442 U.S. 256 (1979) (holding that in order to find discriminatory intent, a state legislature has to have acted “because of,” not “in spite of,” the effects of a statute in relatively disadvantaging members of a particular minority group).

have mostly occurred in the jury selection context. It is therefore unlikely that a constitutional challenge against factors that correlate heavily to race will succeed under current doctrine.

A recent Supreme Court case, however, offers a sliver of hope. In early 2017, the Supreme Court ruled in Buck v. Davis, a case which addresses a related topic: the constitutionality of a death sentence in a case where an expert witness testified at sentencing that a black defendant was more likely to be dangerous in the future (which is an aggravating factor in the Texas death sentencing scheme) because of his race.175 Although the procedural history of the case is complex, and the question before the Supreme Court focused on whether Buck’s counsel gave him ineffective assistance in not objecting to the testimony, the issue of whether the racial nature of the expert witness’s testimony tainted the sentencing decision remains at the heart of the case. As many experts predicted,176 the Court ruled in Buck’s favor in February, allowing him to appeal his death sentence.177 Chief Justice Roberts, writing for the majority, openly acknowledged that “[a]s an initial matter, this is a disturbing departure from a basic premise of our criminal justice system: Our law punishes people for what they do, not who they are. Dispensing punishment on the basis of an immutable characteristic flatly contravenes this guiding principle.”178 Although the case does not overrule any precedent that relates directly to claims of racially discriminatory impact, certain language in the opinion suggests that the Supreme Court might be uncomfortable with sentences that are clearly based on unchangeable characteristics like race, which some risk assessment tools arguably do.

iv. Gender as a Variable
In contrast to race, systems like COMPAS and LSI-R do take gender into account, despite the fact that gender classifications are subject to an intermediate level of scrutiny179 that requires an “exceedingly persuasive justification” to hold up under the Equal Protection Clause.180 In the few instances where the issue has been raised directly, courts have generally held that it is impermissible to base sentences on gender,181 which does not bode well for risk-assessment systems that produce different results based on the gender of the defendant.182 The issue is by no means settled, so it is entirely possible that in a future challenge the courts would find that the gender classification in risk assessment algorithms constitutes a constitutional violation, especially because, as mentioned above, the defendant in State v. Loomis failed to raise his gender discrimination claim under the Equal Protection Clause, and as such the court did not have to address it.

175 Amy Howe, Argument Preview: Justices to Consider Role of Racial Bias in Death Penalty Case, SCOTUS Blog (Sept. 28, 2016), http://www.scotusblog.com/2016/09/argument-preview-justices-to-consider-role-of-racial-bias-in-death-penalty-case/; Nina Totenberg, Supreme Court To Hear Death Penalty Case Based On Racially Tainted Testimony, NPR (Oct. 5, 2016), http://www.npr.org/2016/10/05/496630474/supreme-court-to-hear-death-penalty-case-based-on-racially-tainted-testimony.
176 See, e.g., Amy Howe, Argument Analysis: Justices Appear Inclined to Rule in Favor of Texas Death Row Inmate in Racial Bias Case, SCOTUS Blog (Oct. 5, 2016), http://www.scotusblog.com/2016/10/argument-analysis-justices-appear-inclined-to-rule-in-favor-of-texas-death-row-inmate-in-racial-bias-case/.
177 Buck v. Davis, 580 U.S. ___ (2017).
178 Id. at 21.
179 Intermediate scrutiny falls in between the heavily burdensome strict scrutiny that applies to race-based classifications and almost always results in invalidation, and the highly deferential rational basis review, under which very few laws are declared unconstitutional. Under this standard of review, the burden falls on the government to prove that a classification is substantially related to the achievement of an important government purpose, Craig v. Boren, 429 U.S. 190 (1976), which the court later suggested required an “exceedingly persuasive justification,” United States v. Virginia, 518 U.S. 515, 531 (1996).
180 United States v. Virginia, 518 U.S. at 531 (noting that “equal protection principles, as applied to gender classifications, mean state actors may not rely on “overbroad” generalizations to make “judgments about people that are likely to ... perpetuate historical patterns of discrimination”).
181 See, e.g., United States v. Maples, 501 F.2d 985, 989 (4th Cir. 1974); Williams v. Currie, 103 F. Supp. 2d 858, 868 (M.D.N.C. 2000). See also Carissa Byrne Hessick, Race and Gender as Explicit Sentencing Factors, 14 J. Gender Race & Just. 127, 137 (2010).
182 It is worth noting that empirical research suggests that female defendants on average receive more lenient treatment than male defendants, but judges and prosecutors do not explicitly endorse such differential treatment. See Sonja B. Starr, Estimating Gender Disparities in Federal Criminal Cases, Univ. of Mich. L. Sch. L. & Econ. Research Paper Series, Paper No. 12-018, 3-4, 17 (2012), available at http://ssrn.com/abstract=2144002.

In Craig v. Boren, moreover, the U.S. Supreme Court rejected a defense of a gender classification that was grounded in statistical generalizations about women, even if those generalizations were empirically supported.183 The challenged Oklahoma statute allowed women to buy certain types of beer once they reached the age of 18 but prohibited men from buying it until they were over 21 because statistical evidence suggested that men between the ages of 18 and 21 were over ten times more likely than their female peers to drive drunk.184 There is a plausible case, therefore, to argue that the incorporation of gender into risk assessment calculations is unconstitutional, even if the government argues that it has a substantial interest in the inclusion of gender because it improves the accuracy of the algorithm. This important distinction highlights an underlying tension between the legal and technical approaches to these issues. In the world of machine learning, accuracy is valued above all else, whereas our legal system tends to place a greater emphasis on the principle of fairness, even if it requires eschewing empirical results.185

v. Socioeconomic Status as a Variable
The use of socioeconomic variables might also qualify as an impermissible wealth classification, although the argument is not quite as clear-cut as those that apply to race or gender. A number of these risk assessment instruments incorporate data about a defendant’s employment status, income, education, and job skills. Despite indicating in Griffin v. Illinois that the Supreme Court might subject wealth-related classifications to the strict scrutiny that applies to discrimination based on race and national origin,186 the Supreme Court later held that poverty is not inherently suspect.187 Even so, there is ample case law that recognizes that we should not place special burdens on indigent defendants, especially in the sentencing context.

In Bearden v. Georgia, the Supreme Court rejected the argument that a defendant’s poverty could be considered a factor that increased his likelihood of recidivism and therefore justified additional incapacitation.188 The court held that a sentence increase cannot be based on “lumping [the defendant] together with other poor persons and thereby classifying him as dangerous. It would be little more than punishing a person for his poverty.”189 Although financial background is not considered completely irrelevant to sentencing (judges have traditionally been allowed to consider financial history and employment background at sentencing), there is a distinction when those factors are used to trigger “extra, unequal punishment” for poor defendants.190 There is also general support for the idea that lower socioeconomic status should not be considered an aggravating factor justifying a higher sentence.191

Thus, while the argument that using factors related to socioeconomic status is unconstitutional is not frivolous, it is by no means a clear-cut one. Most of the relevant precedent involves situations where an individual cannot pay for something (such as bail or court fees) because of his poverty and is therefore subjected to greater punishment as a result, which is distinct from independently using socioeconomic status as a dynamic factor in a risk assessment. For the Supreme Court to invalidate a factor like employment status or income based on this line of cases would certainly be a novel (albeit not unprecedented) application.

C. Related Sentencing Issues: Managing Risk in the Criminal Justice System
While we wait for the constitutional challenges to unfold, however, there are other, more immediate legal and policy reasons to scrutinize these systems and the factors upon which they rely. In general, the rush by state legislatures

183 Craig v. Boren, 429 U.S. 190, 210 (1976).
184 Id.
185 We owe this point to Ben Green, a fellow at the Berkman Klein Center for Internet & Society and a PhD candidate in Applied Mathematics at Harvard University. We discuss this tension in greater depth in Part V.C.
186 Griffin v. Illinois, 351 U.S. 12 (1956).
187 See, e.g., Maher v. Roe, 432 U.S. 464, 471 (1977) (noting that the Court has not held that “financial need alone identifies a suspect class for purposes of equal protection analysis”).
188 Bearden v. Georgia, 461 U.S. 660, 661-62 (1983).
189 Id. at 671.
190 Starr, Evidence-Based Sentencing, supra note 88, at 831-32.
191 The Federal Sentencing Guidelines, for example, forbid consideration of socioeconomic status.

and the scholars revising the MPC to embrace the idea of evidence-based sentencing raises the question of whether managing risk should be so heavily emphasized among the multiple purposes of criminal sentencing.192 Although it has long been clear that managing risk is a part of the sentencing consideration, the use of these algorithms almost certainly increases the prominence of risk assessments in the decision-making process. Yet judges might not be the best or most appropriate actors to try to manage these risks. Nor is there a significant body of evidence at this point that suggests we are actually good at predicting or managing risk, or that longer sentences, for example, might decrease the risk of recidivism.193

Moreover, these algorithms likely do not consider the fact that many of the factors that increase a risk score might also be considered mitigating evidence. A young, poor, or uneducated defendant might be at a higher risk for recidivism, but those same circumstances might also diminish his culpability and justify a more lenient sentence, rather than a harsher one. The Supreme Court confronted this very issue in Penry v. Lynaugh, a capital case where the Court called the defendant’s intellectual disabilities a “two-edged sword.”194 Because the defendant’s mental handicap prevented him from learning from his mistakes, it arguably increased his future dangerousness and could be considered an aggravating factor.195 At the same time, it was also a mitigating factor because it reduced his blameworthiness for the crime he committed.196 Penry highlights an inherent tension in the justice system between competing concerns for public safety and individual liberty, concerns that are equally implicated by the rise of risk assessment instruments used in sentencing.

192 DOJ Letter, supra note 37, at 8 (emphasizing that “[d]etermining imprisonment terms should be primarily about accountability for past criminal behavior. While any effective sentencing and corrections policy will take account of future behavior to some extent – incapacitating those more likely to recidivate and utilizing effective reentry efforts to reduce the likelihood of recidivism – we believe the length of imprisonment terms should mostly be about accounting for past conduct. As analytics evolve, we are concerned about the implications of sentencing policy moving away from this precept.”).
193 See Gendreau et al, The Effects of Prison Sentences on Recidivism, supra note 86.
194 Penry v. Lynaugh, 492 U.S. 302, 324 (1989).
195 Id. at 323.
196 Id. at 324.

V. Challenges Presented by the Use of Risk Assessment Algorithms in Sentencing

Drawing on the analysis of the history of risk assessments and the legal and ethical concerns that they raise, this section attempts to summarize key concerns related to the use of these tools. In particular, we focus on three issues: opacity, bias and unreliability, and diverging concepts of fairness.

A. Opacity
In her concurring opinion in the Loomis case, Wisconsin Judge Shirley Abrahamson lamented that “this court's lack of understanding of COMPAS was a significant problem in the instant case. At oral argument, the court repeatedly questioned both the State's and defendant's counsel about how COMPAS works. Few answers were available.”197 Abrahamson’s concurrence highlights one of the critical challenges identified by both legal and technical experts: the lack of transparency about how these tools work.198 Although the details vary widely among the different systems, the broad concerns relate to: (1) the inputs themselves, (2) how those inputs are weighted by the algorithm, and (3) whether specific factors (or combinations of factors) may end up serving as proxies for problematic or impermissible variables like race and poverty. These challenges can be compounded by a lack of information about the underlying assumptions made by the computer scientists developing the algorithms or conflicting purposes when a tool is developed for one context, such as pre-trial risk assessment, and then adapted for another like sentencing.

The challenges presented by this opacity are two-fold. First, they make it difficult for researchers and outside experts to evaluate and audit the algorithms in order to test for accuracy and bias.199 The lack of information about how inputs are weighted also makes it harder to bring legal challenges to the use of these tools, since criminal defendants cannot say for sure whether or how suspect factors like gender or racial proxies may have influenced the risk assessment score or the judge’s ultimate sentencing decision.200 In the Loomis case, for example, the court dismissed the gender claim because the sentencing judge did not mention it specifically when explaining his decision,201 a distinction which seems to ignore the fact that a judge may never explicitly mention a factor like gender when it is quietly incorporated into an opaque risk score rather than considered openly in the pre-sentence investigation report or at a hearing.

It is also worth noting the distinction here between algorithms developed by for-profit companies like Northpointe and Multi-Health Systems and those created by or in conjunction with non-profits, researchers, and academics, like Public Safety Assessment and the state of Pennsylvania’s risk assessment algorithm. While all of these tools may look like “black boxes” to outsiders and are susceptible to concerns about opacity, the proprietary tools developed by for-profit companies present unique challenges. Those companies have both a greater interest in shrouding their products in secrecy in order to remain competitive and more legal tools at their disposal to keep their algorithms away from public scrutiny.202 Academic researchers and governments, by contrast, tend to have more incentives to make the details of their algorithms publicly available and ensure that they are subject to appropriate scrutiny and oversight.

B. Bias and Lack of Reliability
In May 2016, ProPublica released an in-depth report about COMPAS suggesting that it was both racially biased and inaccurate.203 According to

197 State v. Loomis, 881 N.W.2d 749, 774 (Wisc. 2016).
198 See, e.g., Nicholas Diakopoulos, We Need to Know the Algorithms the Government Uses to Make Important Decisions About Us, The Conversation (May 23, 2016), https://theconversation.com/we-need-to-know-the-algorithms-the-government-uses-to-make-important-decisions-about-us-57869.
199 George Joseph, Justice By Algorithm, CityLab (Dec. 8, 2016), http://www.citylab.com/crime/2016/12/justice-by-algorithm/505514/.
200 Interview with Sonja Starr, Professor, University of Michigan Law School (Oct. 28, 2016).
201 Loomis, 881 N.W.2d at 767.
202 Companies like Northpointe can argue that the details of their algorithms constitute trade secrets that shield them from disclosure. O’Neil, Weapons of Math Destruction, supra note 172, at 29.
203 Angwin et al., Machine Bias: There’s Software Used

ProPublica’s analysis, the scores not only proved “remarkably unreliable” in forecasting violent crime, but they also contained significant racial disparities, even though the formula does not officially take race into account. COMPAS incorrectly labeled black defendants as more likely to commit crimes again than they actually were, while also frequently mislabeling white defendants as low risk.204 The study was cited by the court in the Loomis case in the discussion of the controversy surrounding these tools, even though it did not ultimately factor into the court’s analysis in the case.205 Although the findings of the study have been disputed by Northpointe,206 the research nonetheless highlights growing discomfort among members of the legal and academic communities that these tools, which have been embraced for ostensibly progressive reasons like reducing mass incarceration, may inadvertently reinforce or even exacerbate existing racial disparities.207 As a group of computer science researchers wrote in the Washington Post in response to the debate between ProPublica and Northpointe: “Algorithms have the potential to dramatically improve the efficiency and equity of consequential decisions, but their use also prompts complex ethical and scientific questions…. We must continue to investigate and debate these issues as algorithms play an increasingly prominent role in the criminal justice system.”208

The risk of bias may be compounded by algorithms that rely on other potentially biased data sets, such as those that are used for predictive policing.209 The interaction between these algorithms is one of the central concerns expressed by O’Neil in Weapons of Math Destruction. O’Neil argues that police essentially respond to two types of crimes: (1) crimes that are “reported,” which usually refers to violent crimes (such as assault, homicide, and rape) and property crimes, and (2) crimes that are “found,” such as when individuals are stopped and found to possess a small quantity of drugs or be engaged in otherwise illegal activity. Because of historic policing patterns, many of which are reinforced by new predictive tools, predominantly poor and minority neighborhoods tend to face a disproportionate amount of police activity with respect to “found” crimes.210 Consequently, O’Neil argues, the data sets concerning “found” crimes are likely biased to suggest that poor and minority communities commit a higher proportion of these crimes than they actually do.211 If that information is then incorporated into a recidivism risk calculation, it might falsely suggest that a poor or minority defendant is at a greater risk to commit future crimes and therefore assign that individual a higher risk score.

Of course, we should not pretend that inadvertent (and potentially overt) bias has not always played a role in judges’ sentencing decisions.

Across the Country to Predict Future Criminals. And It’s Biased Against Blacks., supra note 2.
204 Id. The study found that black defendants were almost twice as likely as white defendants to be labeled a higher risk but not actually reoffend, whereas white defendants were much more likely to be labeled lower risk but ultimately commit other crimes.
205 Loomis, 881 N.W.2d at 749 n. 2.
206 William Dieterich et al., COMPAS Risk Scales: Demonstrating Accuracy Equity and Predictive Parity, Northpointe (Jul. 8, 2016), http://go.volarisgroup.com/rs/430-MBX-989/images/ProPublica_Commentary_Final_070616.pdf (explaining that “[b]ased on our examination of the work of Angwin et al. and on results of our analysis of their data, we strongly reject the conclusion that the COMPAS risk scales are racially biased against blacks.”).
207 See, e.g., Chris Griffin, Fear and Loathing Over Risk Assessments, Part 2, Harvard Law School’s Access to Justice Lab (Oct. 14, 2016), http://a2jlab.org/fear-and-loathing-over-risk-assessments-part-2/ (noting that the ProPublica piece “focuses on the troubling implications of racial imbalances in scores and predictive accuracy.”).
208 Sam Corbett-Davies et al., A Computer Program Used for Bail and Sentencing Decisions was Labeled as Biased Against Blacks. It’s Actually Not that Clear., The Washington Post (Oct. 17, 2016), https://www.washingtonpost.com/news/monkey-cage/wp/2016/10/17/can-an-algorithm-be-racist-our-analysis-is-more-cautious-than-propublicas/
209 See, e.g., Jack Smith IV, Crime Prediction Tool PredPol Amplifies Racially Biased Policing, Study Shows, Mic (Oct. 9, 2016), https://mic.com/articles/156286/crime-prediction-tool-pred-pol-only-amplifies-racially-biased-policing-study-shows#.Xp0PSJZA1.
210 Jacob Metcalf, Ethics Review for Pernicious Feedback Loops: Reading Weapons of Math Destruction, Data & Society Inst. (Nov. 7, 2016), https://points.datasociety.net/ethics-review-for-pernicious-feedback-loops-9a7ede4b610e#.3pfok1602.
211 O’Neil, Weapons of Math Destruction, supra note 172, at 26-29; Interview with Cathy O’Neil, Author, Weapons of Math Destruction (Oct. 25, 2016).
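
The disparity ProPublica reported is a difference in error rates between groups: how often people who did not reoffend were labeled higher risk (false positives), and how often people who did reoffend were labeled lower risk (false negatives). The following is a minimal sketch of that kind of comparison, not ProPublica's actual methodology. The record layout, the function name, and the toy numbers are all assumptions made for illustration and do not come from any real data set.

```python
from collections import defaultdict


def error_rates_by_group(records):
    """Return {group: (false_positive_rate, false_negative_rate)}.

    Each record is a tuple (group, labeled_high_risk, reoffended).
    This layout is a hypothetical one chosen for the example.
    """
    counts = defaultdict(lambda: {"fp": 0, "tn": 0, "fn": 0, "tp": 0})
    for group, labeled_high_risk, reoffended in records:
        if reoffended:
            counts[group]["tp" if labeled_high_risk else "fn"] += 1
        else:
            counts[group]["fp" if labeled_high_risk else "tn"] += 1

    rates = {}
    for group, c in counts.items():
        did_not_reoffend = c["fp"] + c["tn"]
        did_reoffend = c["fn"] + c["tp"]
        # Share of non-reoffenders who were labeled high risk.
        fpr = c["fp"] / did_not_reoffend if did_not_reoffend else float("nan")
        # Share of reoffenders who were labeled low risk.
        fnr = c["fn"] / did_reoffend if did_reoffend else float("nan")
        rates[group] = (fpr, fnr)
    return rates


# Toy records invented purely for illustration.
toy_records = (
    [("A", True, False)] * 4 + [("A", False, False)] * 6
    + [("A", True, True)] * 7 + [("A", False, True)] * 3
    + [("B", True, False)] * 2 + [("B", False, False)] * 8
    + [("B", True, True)] * 5 + [("B", False, True)] * 5
)
rates = error_rates_by_group(toy_records)
for group, (fpr, fnr) in sorted(rates.items()):
    print(f"group {group}: false positive rate {fpr:.2f}, "
          f"false negative rate {fnr:.2f}")
```

In this toy data, group A has the higher false positive rate and group B the higher false negative rate, which is the shape of disparity described above. A tool can show this pattern even when race is never an explicit input.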

A recent study found that judges in Florida, for example, sentence black defendants to 68 percent more time in prison for serious first-degree crimes even when they score the same as their white counterparts on the formula used to determine sentences.212 But the fact that bias exists in the current system does not justify reinforcing, or even institutionalizing, bias by using risk assessment tools.213

Moreover, bias aside, many of these algorithms have not been evaluated for their accuracy in the specific contexts or geographic areas in which they are being deployed. According to the Electronic Privacy Information Center (EPIC), which has compiled an overview of state-by-state adoption of risk assessment algorithms, although some states have conducted validity studies that examine how well these algorithms perform with respect to their specific populations, many have yet to do so.214 Indeed, the Wisconsin Supreme Court noted this summer that the state had not conducted a cross-validity study regarding COMPAS’s accuracy, and recommended that the tools be constantly monitored and updated.215 In states where validity studies have been conducted, it is similarly unclear whether any of those studies have been or will be repeated regularly in order to ensure ongoing accuracy as the population changes.

C. Diverging Concepts of Fairness
To argue that risk assessment algorithms should be crafted fairly is uncontroversial, but the precise definition of “fairness” is hard to nail down. Whether an algorithm is technically fair is heavily reliant upon the precise objectives of that algorithm, and it raises a number of normative considerations. One might argue that an algorithm is technically fair as long as it makes accurate and consistent predictions. But in addition to the fact that the academic community has still not reached a consensus on an exact definition for fairness in the statistical context,216 Dr. Jeremy Kun points out that an algorithm’s training data may itself be flawed, indicating that the inputs themselves may not be “trustworthy.”217 Even if a perfectly accurate algorithm does exist, the fairness-as-accuracy definition might still come up short in the event that an algorithm leads to generalizations about particular groups. Consider an accurate algorithm that comes to the blanket conclusion that men tend to deserve higher risk scores than women. Whether or not the algorithm is accurate, would it be fair for individuals to be judged based on immutable characteristics such as gender? Such a circumstance is reminiscent of the one in the aforementioned case of Craig v. Boren,218 which grappled with the tension between statistical generalizations and empirical validity. To avoid the possible unfairness that comes from pure “accuracy,” one could argue instead that an algorithm is only fair if its outcomes favor no particular group. While this suggestion sounds reasonable on its face, it can lead to its own complex set of questions, such as whether this could inadvertently lead to reverse discrimination.

The legal concept of fairness is also nebulous, but in a different way. Legal fairness encompasses the idea that every individual is entitled to certain procedural rights designed to give them a “fair” shot in the justice system. In McCleskey v. Kemp, for example, the Supreme Court considered this issue when a death row prisoner argued that his sentence was unconstitutional because the process through which he was convicted was administered in a racially discriminatory manner.219 The Court articulated its concept of fairness not necessarily in terms of reaching the correct outcome, but rather

212 Josh Salman et al., Florida’s Broken Sentencing System: Designed for Fairness, Herald Tribune, http://projects.heraldtribune.com/bias/sentencing/. The Herald Tribune reviewed millions of records in the state of Florida and found that, across the board, “[w]hen defendants score the same points in the formula used to set criminal punishments – indicating they should receive equal sentences – blacks spend far longer behind bars” compared to white defendants.
213 See, e.g., Starr, Evidence-Based Sentencing, supra note 88, at 806 (explaining that “[t]he technocratic framing of [evidence-based sentencing] should not obscure an inescapable truth: sentencing based on such instruments amounts to overt discrimination based on demographics and socioeconomic status.”)
214 Algorithms in the Criminal Justice System, supra note 72.
215 State v. Loomis, 881 N.W.2d 749, 769-70 (Wisc. 2016).
216 Jeremy Kun, One Definition of Algorithmic Fairness: Statistical Parity, MATH ∩ PROGRAMMING (Oct. 19, 2015), https://jeremykun.com/2015/10/19/one-definition-of-algorithmic-fairness-statistical-parity/.
217 See id.
218 Craig v. Boren, 429 U.S. 190, 208-10 (1976).
219 McCleskey v. Kemp, 481 U.S. 279 (1987).

reaching a (hopefully) correct outcome through a process that gave the individual a fair opportunity and guaranteed his or her rights to due process. In the opinion, Justice Powell explained that “our consistent rule has been that constitutional guarantees are met when ‘the mode [for determining guilt or punishment] itself has been surrounded with safeguards to make it as fair as possible.’”220 In other words, legal fairness tends to prioritize parity in the process by which an outcome is reached rather than the outcome itself.

220 Id. at 313 (quoting Singer v. United States, 380 U.S. 24,
35 (1965)).
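
The two technical notions of fairness discussed in this section, fairness as accuracy and fairness as outcomes that favor no particular group, can pull in different directions. The sketch below illustrates this with invented toy data: a hypothetical predictor that is perfectly accurate still labels one group high risk more often than the other, because the toy groups have different underlying rates. The "parity gap" here follows the general idea of statistical parity. The function names and data are assumptions made for illustration and do not describe any deployed tool.

```python
def accuracy(predictions, outcomes):
    """Fraction of predictions that match the observed outcomes."""
    matches = sum(p == y for p, y in zip(predictions, outcomes))
    return matches / len(outcomes)


def high_risk_rate(predictions, groups, group):
    """Fraction of one group's members who are labeled high risk (1)."""
    labels = [p for p, g in zip(predictions, groups) if g == group]
    return sum(labels) / len(labels)


def statistical_parity_gap(predictions, groups, group_a, group_b):
    """Absolute difference in high-risk labeling rates between two groups."""
    return abs(
        high_risk_rate(predictions, groups, group_a)
        - high_risk_rate(predictions, groups, group_b)
    )


# Toy data invented for illustration: 1 means reoffended / labeled high risk.
groups = ["M"] * 10 + ["F"] * 10
outcomes = [1] * 6 + [0] * 4 + [1] * 3 + [0] * 7

# A hypothetical predictor that happens to match every outcome exactly.
predictions = list(outcomes)

acc = accuracy(predictions, outcomes)
gap = statistical_parity_gap(predictions, groups, "M", "F")
print(f"accuracy: {acc:.2f}, statistical parity gap: {gap:.2f}")
```

Forcing the gap to zero in this example would require changing some correct predictions, which is one concrete form of the trade-off between accuracy and group-level parity described above. Neither quantity says anything about whether the process behind a sentence was procedurally fair, which is the focus of the legal concept.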

VI. Recommendations for the Use of Risk Assessment Algorithms

Despite the concerns described above, we assume that risk assessment tools will continue to be used in the criminal justice system, including at sentencing, in light of both their widespread embrace in the United States and the potential benefits they offer if correctly implemented. Nonetheless, given the myriad challenges, we believe that policymakers should proceed cautiously and deliberately in implementing these systems. The goal of this section is to identify overall concepts to guide policymakers that ensure transparency, accountability, and fairness are given central importance. While these recommendations are not comprehensive, we believe that they represent a valuable starting point for conversations around the use of these tools.

A. Transparency
One of the central themes emphasized by both legal and technical experts is the need for greater transparency about how these algorithms were developed, the assumptions that were made in their design, how their factors are weighted, and how frequently they are assessed and updated. While transparency alone will not necessarily reduce the likelihood of bias, it remains valuable for a number of reasons. First and foremost, greater transparency can help facilitate audits by outside researchers.221 It can also help increase the general understanding of these systems, how they work, and the tradeoffs involved in implementing them. More information about inputs and the weights of variables is also critical for any future constitutional challenges based on the use of impermissible or potentially impermissible factors.222

Law Professor Danielle Keats Citron and others have also developed and advocated for a concept known as technological due process (which aims to ensure that there is ample opportunity to challenge the decisions made by algorithms) that can be instructive in this context as well. The core values of the technological due process concept are transparency, accuracy, accountability, participation, and fairness.223 Citron and Frank Pasquale call for increased federal regulatory oversight over scoring systems that collect data about individuals, generate scores from that data, distribute scores to decision makers, and use those scores in decision making.224 They argue that individuals should have the “right to inspect, correct, and dispute inaccurate data and to know the sources (furnishers) of the data.”225 Furthermore, they believe that the algorithm that generates a score from said data needs to be public so that each process can be inspected. Finally, they emphasize that policymakers need to ensure that a score is fair, accurate, and replicable.226

The key mechanism behind technological due process is the requirement of audit trails that record correlations between rules and decisions made in algorithms. The audit trail would include a map of the facts and rules that were applied to each decision made in an algorithm.227 Vendors should also make the source code for the algorithms available to the public, which will enable outsiders to test these algorithms, a standard practice among software developers.228 Testing can detect patterns of problematic classifications based on race, nationality, sexual orientation, and gender.229 Making the data public will also allow academics to comment on the scoring systems and help ensure that they are infused with public values rather than

221 See, e.g., Diakopoulos, We Need to Know The Algorithms The Government Uses to Make Important Decisions About Us, supra note 198.
222 See Starr, Evidence-Based Sentencing, supra note 88.
223 Danielle Keats Citron & Frank Pasquale, The Scored Society: Due Process for Automated Predictions, 90 Wash. U. L. Rev. 1, 20 (2014).
224 Id.
225 Id.
226 Id. at 22.
227 Danielle Keats Citron, Technological Due Process, 85 Wash. U. L. Rev. 1249, 1254 (2008).
228 Richard Berk, a statistician from the University of Pennsylvania who played a central role in the development of Pennsylvania’s risk assessment program, also argues that all companies should be required to disclose the complete contents of their algorithms. At the very least, Berk believes some government entity should be created or tasked with evaluating the full contents of risk-assessment algorithms, even if they are proprietary like COMPAS. Interview with Richard Berk, Professor, University of Pennsylvania (Oct. 31, 2016).
229 Citron & Pasquale, The Scored Society: Due Process for Automated Predictions, supra note 223, at 25.

dictated solely by the whims of the programmer.230 Although some opponents of disclosure have argued that it will threaten innovation or make it easier for participants to “game the system,” these concerns are tempered by the fact that there is little evidence of such behavior in other instances.231 While full public disclosure would be ideal, policymakers can work with industry on a case-by-case basis to determine if more limited forms of disclosure would be more appropriate.232

Transparency should also inform a government’s decisions about whether to use proprietary risk assessment software or work with academics or non-profits to develop tools specifically for a particular jurisdiction.233 As noted earlier, proprietary tools like COMPAS are inherently subject to less scrutiny and oversight than their public counterparts might be. As a group of computer science researchers candidly explained, “Northpointe has refused to disclose the details of its proprietary algorithm, making it impossible to fully assess the extent to which it may be unfair, however inadvertently. That’s understandable: Northpointe needs to protect its bottom line. But it raises questions about relying on for-profit companies to develop risk assessment tools.”234 The tension between the legitimate business interests of a private company that wants to protect and sell its product and the need for public accountability may not be easy to resolve.235

B. Accountability and Oversight
While transparency is a foundational step, it is just the beginning. In order to promote maximum accountability, policymakers need to ensure that the systems they deploy have been designed for the purpose for which they are being used, that they are appropriate for the particular jurisdiction or geographic area, and that they are continually monitored and assessed for accuracy and reliability.236 Any tools they adopt should be built with integrity, based on the best available science, and calibrated to minimize potential negative effects, such as the inclusion of problematic variables.

The need to conduct validity studies on a state-by-state (or potentially even more granular) level is clear.237 A tool that has been tested on the national population or in other states may not be appropriate for a particular location. Local policymakers should possess that information before deciding to implement any risk assessment system, which requires validity studies and other research as a prerequisite to making any decisions. Moreover, testing and validity studies should not simply be completed once and then forgotten about. States should require regular repetition of validity studies and develop procedures to make appropriate alterations based on any changes in the population or new information that emerges about these tools. Policymakers should also talk to their peers in other jurisdictions to share best practices and look for opportunities for standardization among jurisdictions, so that an individual’s protection against biased or unreliable algorithms is less dependent on what jurisdiction he or she happens to be in.

In addition to validity studies, facilitating outside research and auditing is also critical. Greater transparency will have little impact if outside researchers do not have access to the data and tools to evaluate and test the algorithms for bias. These tools should also be rigorously evaluated in comparison to existing mechanisms in the justice system to ensure that they actually represent an improvement over the status quo.238 The Access to Justice Lab at Harvard Law School, for example, recently started a project to evaluate the efficacy of risk assessment scores in pre-trial assessment, using randomized control trials in several jurisdictions around the United States.239 Research projects like this can offer critical insights into how these tools work, how judges actually use them, and how they might be deployed on a large scale in the best and most appropriate manner.240 It may make sense to initiate similar efforts on a wider scale.

Finally, where risk assessment algorithms are concerned, maintaining oversight of implementation and ongoing use of these tools should not be a hands-off process.241 Policymakers should be involved at all stages of the process, asking difficult questions and forcing their partners (whether they are for-profit companies, academic institutions, or non-profit organizations) to explain and justify any assumptions or decisions that they make in developing and using these tools. Especially in light of the serious constitutional concerns raised by scholars like Professor Starr, governments should not simply “outsource” risk assessments to private companies and assume that they will help guide judicial decision-making in a way that is both accurate and fair to the individual defendant. Rather, policymakers should maintain an active role in overseeing their use and ensuring that both the developers and those individuals employing them are aligned with the overall goals of the system and aware of any potential pitfalls.

C. Robust and Holistic Approach to Fairness
Based on the diverging concepts of technical and legal fairness described above, policymakers need to engage in a thorough dialogue about how to reconcile or prioritize these competing values. Absent such a conversation, it will be difficult to resolve disputes like the one between ProPublica and Northpointe about whether COMPAS is biased.242 Legal scholars and technical experts need to engage with one another about the appropriate technical and legal measures that should be in place to guarantee that the algorithms do not inappropriately prioritize one type of fairness over another.

Critical issues also need to be addressed in the development phase of these algorithms, particularly with regard to the inputs and how they are used.243 O’Neil, for example, argues that these risk assessment algorithms should eliminate as many unnecessary variables as possible, especially those that are potential proxies for race or rely on historically-biased data sets (such as the “found” crimes described in the previous section).244 Indeed, some researchers have found that it is possible to duplicate the results of a system like COMPAS using far fewer variables, and far fewer problematic variables at that.245 This research suggests that some of the most problematic variables could be removed from these systems without sacrificing accuracy, although more studies are clearly required. O’Neil takes an even more controversial position: that troubling variables should be excluded even if their exclusion decreases the accuracy of the algorithm. “Are we going to sacrifice the accuracy of the model for fairness? Do we have to dumb down our algorithms?” she writes. “In some cases, yes. If we’re going to be equal before the law, or be treated equally as voters, we cannot stand for systems that drop us into different castes and treat us differently.”246

230 Id. at 26.
231 Id.
232 For example, one potential compromise is to provide limited public transparency but full disclosure to government agencies.
233 Professor Berk argues, for example, that the goals of a company with proprietary software may be fundamentally incompatible with these transparency requirements. Interview with Richard Berk, Professor, University of Pennsylvania (Oct. 31, 2016).
234 Corbett-Davies et al., A computer program used for bail and sentencing decisions was labeled as biased against blacks. It’s actually not that clear., supra note 208.
235 Some experts, like Berk, suggest that the financial goals of private companies and the fairness requirements of the criminal justice system may ultimately be mutually exclusive.
236 Mark Ackerman, Safety Checklists for Sociotechnical Design, Data & Society Inst. (Oct. 26, 2016), https://points.datasociety.net/safety-checklists-for-sociotechnical-design-2cb9192e9e3b#.iipwibu0o.
237 Algorithms in the Criminal Justice System, supra note 72.
238 O’Neil, Weapons of Math Destruction, supra note 172, at 208.
239 See Pre-Trial Release, Access to Justice Lab at Harvard Law School, http://a2jlab.org/current-projects/signature-studies/pre-trial-release/.
240 O’Neil, Weapons of Math Destruction, supra note 172, at 209-10.
241 Michael Luca et al., Algorithms Need Managers, Too, Harvard Business Review (Jan./Feb. 2016), https://hbr.org/2016/01/algorithms-need-managers-too.
242 Corbett-Davies et al., A computer program used for bail and sentencing decisions was labeled as biased against blacks. It’s actually not that clear., supra note 208.
243 Ackerman, Safety Checklists for Sociotechnical Design, supra note 236.
244 O’Neil, Weapons of Math Destruction, supra note 172, at 210.
245 Sheldon X. Zhang et al., An Analysis of Prisoner Reentry and Parole Risk Using COMPAS and Traditional Criminal History Measures, 60 Crime & Delinquency 167, 187 (2014) (finding that a model assessing four static variables, gender, age, age of first arrest, and number of prior arrests, performed just as well as COMPAS in predicting prior arrests).
246 O’Neil, Weapons of Math Destruction, supra note 172, at 210.
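
The difficulty of choosing among competing fairness definitions, discussed in this section, can be made concrete with a small calculation. The following is only an illustrative sketch in Python: the counts are invented, the two groups are hypothetical, and the names (GroupCounts, high_risk_rate, false_positive_rate, positive_predictive_value) are our own and do not come from COMPAS or any other tool. It compares three commonly discussed measures: the share of each group labeled high risk (the quantity that statistical parity asks to be equal), the false positive rate, and the positive predictive value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupCounts:
    """Outcome counts for one hypothetical group (all values are invented)."""
    true_pos: int   # labeled high risk, later reoffended
    false_pos: int  # labeled high risk, did not reoffend
    false_neg: int  # labeled low risk, later reoffended
    true_neg: int   # labeled low risk, did not reoffend

    @property
    def total(self) -> int:
        return self.true_pos + self.false_pos + self.false_neg + self.true_neg


def high_risk_rate(g: GroupCounts) -> float:
    """Share of the group labeled high risk (compared under statistical parity)."""
    return (g.true_pos + g.false_pos) / g.total


def false_positive_rate(g: GroupCounts) -> float:
    """Share of non-reoffenders who were nonetheless labeled high risk."""
    return g.false_pos / (g.false_pos + g.true_neg)


def positive_predictive_value(g: GroupCounts) -> float:
    """Share of high-risk labels that turned out to be correct."""
    return g.true_pos / (g.true_pos + g.false_pos)


# Invented counts for two groups of 200 people with different base rates.
group_a = GroupCounts(true_pos=60, false_pos=40, false_neg=20, true_neg=80)
group_b = GroupCounts(true_pos=30, false_pos=20, false_neg=20, true_neg=130)

for name, g in (("A", group_a), ("B", group_b)):
    print(
        f"group {name}: "
        f"high-risk rate={high_risk_rate(g):.3f}, "
        f"false positive rate={false_positive_rate(g):.3f}, "
        f"positive predictive value={positive_predictive_value(g):.3f}"
    )
```

On these invented counts, a high-risk label is equally reliable in both groups, yet members of group A who do not reoffend are labeled high risk more than twice as often as their counterparts in group B, and the two groups are labeled high risk at different overall rates. The same scores can therefore satisfy one definition of fairness while failing others, which is why the choice among definitions is a policy question as much as a technical one.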
Robust procedural safeguards will also help ensure that, once they enter the criminal justice system, these scores are used properly and that their inadvertent impact is minimized. The guidelines laid out by the Loomis case represent a decent first attempt to guide the use of these scores, but the holes in the court’s analysis likely undermine their effectiveness. Much greater precision is required. In particular, it would be valuable to develop standards for the types of information provided to judges and sentencing authorities in the PSI regarding the risk assessment tools, how scores were calculated, and so on. These guidelines should include specific recommendations about how the information is actually presented to judges in the PSI.247 These rules should also address a defendant’s right to review and challenge this information, in light of precedent established in cases like Gardner.248 Furthermore, policymakers need to think creatively about how to feasibly restrict judges from lengthening sentences based on the scores in the PSI, which is prohibited by the Loomis court but practically quite difficult given the amount of discretion that judges have in sentencing.

VII. Further Areas for Research
Beyond the recommendations described above, more scholarship is clearly needed to answer critical questions about the legality, fairness, and long-term impact of using risk-assessment algorithms in the sentencing context. There has been a substantial amount of research on the use of these risk-assessment algorithms in rehabilitation and pre-trial assessment,249 but their use in sentencing is far newer and still warrants additional inquiry. This section identifies some key research questions about the technical and legal issues that are ripe for further inquiry.

The Science of Sentencing
• How does the length of a sentence impact prisoner behavior, particularly with regard to an individual’s propensity for recidivism?
• Do risk assessment algorithms represent an improvement over unguided human judgment?

Technical Fairness and Accuracy
• Is there a particular accuracy threshold that should be required before a risk assessment tool can be used in sentencing? How should that threshold be established?
• Beyond statistical parity, how can we reconcile the concepts of technical and legal fairness for use in sentencing? How should fairness and accuracy be balanced against one another?
• Should fairness be defined differently in a sentencing context as compared to a rehabilitative or pre-trial context?
• What kind of data should jurisdictions be collecting or maintaining for use in sentencing algorithms, in order to ensure accuracy and fairness?
• Is there any advantage to using tools that emphasize static factors over dynamic factors (or vice versa)?

Legality and Transparency
• Is the incorporation of certain variables (e.g. race, gender, socioeconomic status) into these algorithms unconstitutional? In particular, does it violate the Due Process or Equal Protection clauses of the Fourteenth Amendment?
• Is the use of variables that correlate heavily with impermissible factors like race unconstitutional?
• How much information can private companies be required to disclose about their algorithms? How much information should they be required to disclose?
• What is the appropriate administrative agency or other government institution to whom the contents of these algorithms should be disclosed?
• Should the rules for decision-making incorporated into these algorithms be available for public comment and input?
• Should more explicit rules be developed to govern the form of risk assessment information provided to judges before sentencing? What should these rules look like?

Validity Testing
• What metrics should states and jurisdictions use when conducting validity tests?
• Can guidelines be established that are transferable across jurisdictions in order to supplement validity testing?
• What level of transparency is necessary in order for jurisdictions to conduct validity tests?

While this is by no means a complete list, these questions are intended to serve as a useful jumping-off point for those who are in a position to conduct research on the use of risk-assessment algorithms in sentencing, or can provide funding for said research.

VIII. Conclusion

The growing use of risk assessment software in criminal sentencing is a cause for both optimism and skepticism. While these tools have the potential to improve sentencing accuracy in the criminal justice system and reduce the risk of human error and bias, they also have the potential to reinforce or exacerbate existing biases and to undermine certain basic tenets of fairness that are central to our justice system. In this report, we have tried to canvass a wide range of these legal and technical challenges in order to help policymakers make more informed decisions about whether and how to implement these systems in the future. Ultimately, we believe that the current trend toward greater use of these tools is likely to continue, and therefore we would urge policymakers to maintain a focus on fairness, accountability, and transparency when deploying these tools. There are important ethical and normative decisions that need to be made as these risk assessment tools are integrated into the existing system, and those decisions should not be made lightly or with insufficient information.

247 Communicating Risk at Sentencing, Pennsylvania Commission on Sentencing, Risk Need Assessment Project, Interim Report, 1, 8 (2014), http://pcs.la.psu.edu/publications-and-research/research-and-evaluation-reports/risk-assessment/phase-i-reports/interim-report-8-communicating-risk-at-sentencing/view (finding that the manner in which risk-assessment scores are presented to judges can have an effect on the degree to which these assessments are considered in sentencing decisions and that judges in Pennsylvania tend to prefer as much information as possible).
248 See supra notes 154-55 and accompanying text.
249 See supra Part II.D.

