Edwina L. Rissland†
those seeking to understand and model legal argument, and they exemplify
and illustrate the progress, concerns, and style of its research.
12. Gentner, Structure-Mapping: A Theoretical Framework for Analogy, 7 COGNITIVE
SCI. 155 (1983). Reasoning by analogy was the subject of one of the earliest programs in AI. Evans, A
Program for the Solution of a Class of Geometric-Analogy Intelligence-Test Questions, in SEMANTIC
INFORMATION PROCESSING, supra note 4, at 271.
13. See, e.g., E. CHARNIAK, TOWARD A MODEL OF CHILDREN'S STORY COMPREHENSION
271-74 (MIT AI Lab Technical Report No. 266, 1972) (even understanding "simple" children's
stories can be quite messy).
14. For instance, the rules of chess completely define the game. Furthermore, in the case of
games, solutions can be found using high-powered, specialized "search" techniques. This is how
highly successful chess programs, such as Deep Thought, work. Deep Thought can examine 720,000
chess positions per second. See Newborn & Kopec, supra note 5, at 1225. However, it is quite a
different matter whether such programs work in the same way human chess experts do and whether
they can shed light upon human thought processes. See Chase & Simon, The Mind's Eye in Chess, in
VISUAL INFORMATION PROCESSING 215, 278 (W. Chase ed. 1973).
15. See D. DENNETT, BRAINSTORMS: PHILOSOPHICAL ESSAYS ON MIND AND PSYCHOLOGY 112
(1978); MIND DESIGN (J. Haugeland ed. 1981).
16. For two interesting theories of human cognition, see M. MINSKY, THE SOCIETY OF MIND
(1986); A. NEWELL, UNIFIED THEORIES OF COGNITION (forthcoming 1990).
17. Even if psychological validity is not usually paramount, it is often helpful. The MYCIN
Project illustrates this point. The goal of the MYCIN Project was to build a system that could diag-
nose bacterial blood infections at an expert level. Although the goal was not to model closely the
diagnostic behavior of expert physicians, observations of medical experts were critical during the early
phases of the project, when the AI researchers (known as "knowledge engineers") gathered,
structured, and encoded the experts' medical knowledge for use by the program. Later, having the
program operate in a comprehensible manner was critical for debugging and refining it. See generally
Buchanan & Shortliffe, supra note 11. It is usually the case that if there is no point of contact
between the program's processing style and the human's, the program behavior appears inscrutable,
impeding its development. Some similarity between the program's and the experts' processing also
enhances one's belief in the correctness of the output of the program; sometimes this is so because it is
easier for the program to explain its own reasoning in the user's terms. With respect to the issue of
capturing the style of expert reasoning, a chess-playing program like Deep Thought is an extreme
case of a high-performance program where there is no claim to cognitive validity. See Newborn &
Kopec, supra note 5. There was no attempt to make Deep Thought think like a grand master. See
Leithauser, Kasparov Beats Deep Thought, N.Y. Times, Jan. 14, 1990, § 6 (Magazine), at 33, 74
(discussion of grand master Kasparov's thoughts on ramifications of some of these issues).
18. Actually building a program is quite different from speculating about it. Programming makes
abundantly clear the weaknesses or difficulties of the model. Cf. McDermott, Artificial Intelligence
Meets Natural Stupidity, in MIND DESIGN, supra note 15, at 143, 156-59 (highlighting risk of
theorizing about implementation without actually implementing).
19. See, e.g., B. CARDOZO, THE NATURE OF THE JUDICIAL PROCESS (1921); E. LEVI, AN IN-
TRODUCTION TO LEGAL REASONING (1949); K. LLEWELLYN, THE BRAMBLE BUSH (1930); K.
LLEWELLYN, THE CASE LAW SYSTEM IN AMERICA (1989); Radin, Case Law and Stare Decisis:
Concerning Präjudizienrecht in Amerika, 33 COLUM. L. REV. 199 (1933).
20. Although this was obviously not their purpose, some discussions have come remarkably close.
For instance, some of the analyses done by Llewellyn and Radin capture the spirit of the sort of
description desired in AI. See, e.g., K. LLEWELLYN, supra note 19, at 50 n.1 (describing doctrine of
precedent, particularly concerning broad and narrow reading of rule of case); Radin, supra note 19,
at 206-09 (describing concept evolution in law). Radin's description is uncannily similar to an al-
gorithm, called the "candidate elimination algorithm," used in machine learning. See 3 HANDBOOK
OF ARTIFICIAL INTELLIGENCE, supra note 2, at 385-91. For a comparison of this algorithm and
Radin's analysis, see Rissland & Collins, The Law as Learning System, in PROCEEDINGS OF THE
FOURTH ANNUAL CONFERENCE OF THE COGNITIVE SCIENCE SOCIETY 500, 501 (1986) (examining
evolution of concept of "inherently dangerous" from Radin).
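To make the comparison concrete, the candidate elimination algorithm can be sketched in a few lines of Python. What follows is a deliberately simplified, illustrative rendering (a single specific hypothesis over conjunctive attribute descriptions), not the HANDBOOK's formulation, and the "inherently dangerous" training cases are invented:

```python
# Simplified candidate elimination: hypotheses are attribute tuples,
# where "?" matches anything. S is the most specific hypothesis
# consistent with the positives; G is the most general boundary.
WILD = "?"

def covers(h, x):
    """A hypothesis covers a case if every fixed attribute matches."""
    return all(a == WILD or a == b for a, b in zip(h, x))

def candidate_elimination(examples, domains):
    n = len(domains)
    S = None                      # most specific hypothesis so far
    G = [tuple([WILD] * n)]       # most general boundary
    for x, positive in examples:
        if positive:
            G = [g for g in G if covers(g, x)]   # drop over-narrow g's
            S = tuple(x) if S is None else tuple(
                a if a == b else WILD for a, b in zip(S, x))
        else:
            new_G = []
            for g in G:
                if not covers(g, x):
                    new_G.append(g)
                    continue
                for i in range(n):
                    if g[i] != WILD:
                        continue
                    for v in domains[i]:
                        # minimal specializations that exclude the
                        # negative case yet still cover S
                        if v != x[i] and (S is None or S[i] == v):
                            new_G.append(g[:i] + (v,) + g[i + 1:])
            G = new_G
    return S, G

# Invented cases: (explosive?, marketed_to_public?), labeled True when
# the instrument was held "inherently dangerous."
domains = [("yes", "no"), ("yes", "no")]
cases = [(("yes", "no"), True), (("no", "yes"), False)]
print(candidate_elimination(cases, domains))
# (('yes', 'no'), [('yes', '?'), ('?', 'no')])
```

The specific boundary S broadens as favorable decisions accumulate while the general boundary G narrows to exclude unfavorable ones; Radin's picture of a legal concept being alternately widened and cabined by successive cases tracks exactly this two-sided movement.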
21. E. LEVI, supra note 19, at 1.
The law offers abundant opportunities for developing analytic and com-
putational AI models. Law also has unique characteristics that make it a
particularly challenging field for AI:
purposes, and there is a lot more to legal knowledge than the "book
knowledge" of traditional legal materials. Still, an AI study of the law has
been provided with a good beginning, an epistemological leg up.
The law also differs from other disciplines in that what counts as an
"answer" is not clear-cut. In law there is usually no unique right
answer; rather, there are reasonable alternative answers, more a matter of
degree than of extremes. The answers are highly contextual, depend on
goals and points of view, and change as the law evolves. Even the rule-
based aspects of legal reasoning cannot be modeled with purely deductive
methods. This also means that, unfortunately, there is never the comfort
of a quod erat demonstrandum at the end of a reasoning episode to sanc-
tion it as sound and beyond reproach, as there is in mathematics. From
the legal point of view, this is no real deficit; it is a feature and not a bug,
to use the computer scientists' phrase, since it allows law to accommodate
changes in circumstance and purpose. But, computationally, the nature
of legal answers adds complexity and difficulty as well as richness and
flexibility.
These observations all suggest that the law is an exceedingly challenging
domain for AI. Research in AI and law will impel AI in new directions
and thus benefit AI. In turn, law will also benefit from AI, both
analytically and practically. As an analytical medium, AI forces meticu-
lous attention to details and precise testing of theoretical ideas. This, in
turn, facilitates the unmasking of flaws, furthers the understanding of as-
sumptions, and leads to proposals for refinements. AI focuses a spotlight
on issues of knowledge and process to a degree not found in non-
computational approaches, which often assume that some key task, like
analogical reasoning, will be accomplished magically without a hint as to
how,22 or with too many critical details left underspecified.23 The practi-
22. The philosopher Dan Dennett calls this the problem of a "loan of intelligence" or the hidden
homunculus. See D. DENNETT, supra note 15, at 12. In an AI model, some process, somewhere, must
actually do the work or else as Dennett puts it, the theory is "in the red." Id. A great danger is in
"designing a system whose stipulated capacities are miraculous." Id. at 112; see also McDermott,
supra note 18, at 143.
23. Some of Dworkin's models are intriguing in this regard, such as his model of hard cases. See,
e.g., R. DWORKIN, TAKING RIGHTS SERIOUSLY (1977). He argues that there may not really be any
"hard" cases since one can use a set of relatively weighted principles to resolve certain "hard" ques-
tions, which arise because the on point cases are in conflict or there is a tie in the principles. The
principles and weights are generated from a collection of relevant precedents. Dworkin omits the
details of how to decide which precedents to use and how to induce principles from them. He is not
necessarily wrong, but it would be instructive to extract a more detailed description of how his model
works. By declining to instruct further on how to develop the weighting system, Dworkin has simply
moved the problem of analysis back one step. Regarding the assignment of relative weights, he has
walked headlong into the "credit assignment" quagmire, well known to workers in machine learning,
where the problem is to assign credit or blame for the overall success or failure of a problem solution
to an individual step or aspect of it. For instance, is the credit for a win or the blame for a loss in a
game of checkers to be given to the penultimate move of a game, the first move, or some intermediate
move or moves? How can one tell which features or principles "caused" a case to be resolved for one
party or the other? See Minsky, Steps Toward Artificial Intelligence, in COMPUTERS AND THOUGHT,
supra note 6, at 432 (1963); Samuel, Some Studies in Machine Learning Using the Game of Checkers, in COMPUTERS AND THOUGHT, supra note 6, at 71.
cal benefits to law from AI are intelligent computational tools. The relationship between AI and law is truly synergistic.
We are a long way from such an ideal. There is, however, activity on all
of these important fronts, and impressive progress on a few.
In fact, one can today speculate about which of these desiderata can be
expected now, in the near future, someday, or probably never. Presently,
AI is actively pushing back the boundaries on case-based reasoning (goal
1), has well-understood methodologies in rule-based reasoning (goal 2),
and is exploring multi-paradigm reasoning (goal 3). Reasoning with
open-textured predicates (goal 4) has had an admirable first cut, but it
will require further contributions from other AI specialties like case-
based reasoning (CBR)25 and machine learning.26 Major contributions
INTELLIGENCE 83 (1988).
30. This is not so much because of any special difficulties with opinions, but because understand-
ing text is just plain hard. See supra notes 13, 29. Legal opinions use specialized vocabulary and
conventions, so they might be harder because those conventions require extra processing; on the
other hand, the extra constraints they impose might make legal opinions more amena-
ble to computational analysis.
One of the earliest steps toward a model of legal reasoning was the use
of expert systems31 to model certain rule-based aspects of law.32 This step
reflects the development of AI: Rule-based expert systems were the first
type of AI system to become widely available and employed beyond the AI
research community.33 Furthermore, their underlying computational
mechanisms are conceptually clear and they have many computational
strengths. While from the legal standpoint there is a variety of opinions as
to the validity, usefulness, and status of rules, and there are acknowledged
difficulties in representing them,34 it is still quite natural to take some
body of legal rules and embed them in a standard rule-based computa-
tional framework.
In the rule-based approach, a rule is encoded in a simple, stylized if-
then format: If certain conditions are known to hold, then take the stated
action or draw the stated conclusion.35 Rule-based systems work by chain-
ing these rules together.36
31. An "expert system" is a special-purpose computer program, which can be said to be expert in
a narrow problem area. Typically, such a program uses rules to represent its knowledge and to rea-
son. See, e.g., P. HARMON & D. KING, EXPERT SYSTEMS (1985); D. WATERMAN, A GUIDE TO
EXPERT SYSTEMS (1986); B. BUCHANAN & R. SMITH, Fundamentals of Expert Systems, in 4 HAND-
BOOK OF ARTIFICIAL INTELLIGENCE, supra note 2, at 149.
32. An even earlier effort related to rule-based aspects of law was Layman Allen's work on "nor-
malization." His emphasis was on eliminating syntactic ambiguity in statutes and legal documents
rather than on using computational programs to reason with them. See Allen, Symbolic Logic: A
Razor-Edged Tool for Drafting and Interpreting Legal Documents, 66 YALE L.J. 833 (1957). Of
course, if certain ambiguities, for instance those about the scope of logical connectives and the meaning
of words like "if," "except," and "unless," were eliminated from legal sources, encoding legal rules for
use by an expert system would be easier and less open to debate. Allen & Saxon, Some Problems in
Designing Expert Systems to Aid Legal Reasoning, in THE SECOND INTERNATIONAL CONFERENCE
ON ARTIFICIAL INTELLIGENCE AND LAW: PROCEEDINGS OF THE CONFERENCE 94 (1987) [hereinaf-
ter ICAIL-87] (discussing 48 alternative interpretations of structure of proposed limitations on exclu-
sionary rule).
33. For example, the expert system DENDRAL has been widely used by organic chemists, see R.
LINDSAY, B. BUCHANAN, E. FEIGENBAUM & J. LEDERBERG, APPLICATIONS OF ARTIFICIAL INTEL-
LIGENCE FOR ORGANIC CHEMISTRY: THE DENDRAL PROJECT (1980); sources cited supra note 31.
34. Besides "syntactic" difficulties, see Allen, supra note 32, there are "semantic" difficulties such
as the presence of conflicting rules, imprecise terms, and incompleteness. See Berman & Hafner,
Obstacles to the Development of Logic-Based Models of Legal Reasoning, in COMPUTER POWER AND
LEGAL LANGUAGE 183 (C. Walter ed. 1988) (discussing difficulties with logic-based approaches);
Berman, Cutting Legal Loops, in ICAIL-89, supra note 1, at 251 (discussing definitional circularity,
recursion, and self-referencing in statutes).
35. For instance, Rule 1, concerning "responsibility for use of the product," from the case settle-
ment system of Waterman and Peterson states, "IF the use of [the product] at the time of the plain-
tiff's loss is foreseeable and (that use is reasonable-and-proper or that use is an emergency or (there is
a description by the defendant of that use and that description is improper) or there is not a descrip-
tion by the defendant of that use) THEN assert the defendant is responsible for the use of the prod-
uct." D. WATERMAN & M. PETERSON, MODELS OF LEGAL DECISION MAKING 37 (Institute for
Civil Justice of The Rand Corporation Memo R-2717-ICJ 1981) (parentheses used to denote scope
and distribution of logical connectives; thus, here there are two principal antecedents, second of which
can be satisfied in alternative ways).
36. The systems can work either "forward" by reasoning from facts to a desired conclusion sup-
ported by them, or "backward" from a desired conclusion to find facts supporting it. Forward chain-
ing simply is the repeated application of the logical inference rule modus ponens: If one has a rule "If
A then B" and the fact A, conclude the fact B. Alternatively, in backward chaining, to establish the
fact B, one looks for such a rule and verifies that one has information to satisfy its precondition, A; if
A were not satisfied, one then would look for rules establishing A, and so on until the necessary
factual basis were reached and the desired conclusion logically supported. See W. SALMON, LOGIC (2d
ed. 1973); sources cited supra note 31.
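Schematically, both chaining regimes fit in a few lines of Python. This is only a sketch: the fact labels are invented and merely paraphrase the Waterman and Peterson rule quoted in note 35, and none of the cited systems is implemented this way.

```python
# A toy rule base: each rule pairs a set of antecedent facts with a
# conclusion. Labels loosely paraphrase the rule quoted in note 35.
RULES = [
    ({"use_foreseeable", "use_reasonable_and_proper"},
     "defendant_responsible_for_use"),
    ({"use_foreseeable", "use_was_emergency"},
     "defendant_responsible_for_use"),
]

def forward_chain(facts, rules):
    """Repeatedly apply modus ponens until no new facts emerge."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for antecedents, conclusion in rules:
            if antecedents <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

def backward_chain(goal, facts, rules, _seen=frozenset()):
    """Work from a desired conclusion back to supporting facts."""
    if goal in facts:
        return True
    if goal in _seen:            # guard against circular rule sets
        return False
    return any(
        all(backward_chain(a, facts, rules, _seen | {goal})
            for a in antecedents)
        for antecedents, conclusion in rules if conclusion == goal)

# Either direction reaches the same conclusion from the same facts:
facts = {"use_foreseeable", "use_was_emergency"}
assert "defendant_responsible_for_use" in forward_chain(facts, RULES)
assert backward_chain("defendant_responsible_for_use", facts, RULES)
```

The `_seen` parameter is one simple answer to the definitional circularity Berman discusses, supra note 34; a production system would need subtler machinery.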
37. A "heuristic" is a rule of thumb, a bit of encapsulated wisdom. Much problem-solving behav-
ior is guided by heuristic rules of thumb, such as "If it is fourth down and long yardage is required,
then punt," or "If you need to get your citations in correct law journal form, then ask a law journal
editor." Heuristics are methods that past experience has shown to be "good" or "wise" things to do;
they do not necessarily guarantee a solution, as might algorithms or mathematical theorems, and
occasionally they might even produce wrong answers or lead in counter-productive directions. The
word "heuristic" stems from the Greek for invention or discovery. The mathematician George Polya
discussed heuristic reasoning and the use of heuristics in mathematical reasoning and problem solving.
See G. POLYA, HOW TO SOLVE IT (1973).
38. See generally D. WATERMAN & M. PETERSON, supra note 35.
39. See, e.g., supra note 35 (example of one of their rules).
40. See D. WATERMAN & M. PETERSON, supra note 35, at 13-15.
41. Id. at 45.
42. See id. at 26. Waterman, a pioneer in machine learning, understood how learning issues were
also critical to expert systems. He was well ahead of his time and his early death in 1987 was a great
loss.
43. Latent Damage Act, 1986.
44. P. CAPPER & R. SUSSKIND, LATENT DAMAGE LAW-THE EXPERT SYSTEM (1988). The
book contains diskettes for installing and running the system on a machine like the IBM PC or X
45. See R. SUSSKIND, EXPERT SYSTEMS IN LAW (1987).
46. British Nationality Act, 1981. See Sergot, Sadri, Kowalski, Kriwaczek, Hammond & Cory,
The British Nationality Act as a Logic Program, 29 COMM. ACM 370 (1986) [hereinafter Sergot].
47. See, e.g., Sergot, supra note 46, at 379 (defining "negation as failure" as the conclusion that
something is false if all known ways of showing it true fail); id. at 382 ("counterfactual conditionals"
as in the statutory phrase "became a British citizen by descent or would have done so but
for ...."); id. at 382 (discretion as in the phrase "If ... the Secretary of State sees fit"); Berman,
supra note 34; Berman & Hafner, supra note 34.
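Negation as failure, as defined in note 47, is essentially a one-line wrapper around any backward chainer such as the sketch accompanying note 36; the fragment below, with invented fact labels, is illustrative only:

```python
def naf(goal, facts, rules, prove):
    """Negation as failure: treat a goal as false when every known way
    of showing it true fails. `prove` is any backward chainer, e.g. the
    backward_chain sketch given earlier."""
    return not prove(goal, facts, rules)

# E.g., with invented labels: a person is treated as not a citizen by
# descent whenever no rule chain establishes that status from the
# recorded facts -- failure to prove is taken as falsity.
```

Note what this buys and what it risks: the program need not enumerate every way a predicate can fail, but an incomplete rule base silently converts "unproven" into "false."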
48. See, e.g., Grady & Patil, An Expert System for Screening Employee Pension Plans for the
Internal Revenue Service, in ICAIL-87, supra note 32, at 137 (describing Internal Revenue Service
project to process employee pension plans); Pethe, Rippey & Kale, A Specialized Expert System for
Judicial Decision Support, in ICAIL-89, supra note 1, at 190-94 (system for processing claims under
The Federal Black Lung Benefits Act); Weiner, CACE: Computer-Assisted Case Evaluation in the
Brooklyn District Attorney's Office, in ICAIL-89, supra note 1, at 215-23 (describing system for
post-arrest/pre-trial processing of drug busts by police).
49. See, e.g., Wiehl, Computers Assuming New Roles at Law Firms, N.Y. Times, Jan. 20, 1989,
at B4, col. 3; Blodgett, Artificial Intelligence Comes of Age, 73 A.B.A. J. 68 (1987).
50. For example, Paul Brest has commented that constitutional law would be "wildly unsuited for
an expert system because the principles are vague . . . . Expert systems are best for areas of law that
are rule-bound." Blodgett, supra note 49, at 70 (quoting Brest).
51. See generally A. GARDNER, AN ARTIFICIAL INTELLIGENCE APPROACH TO LEGAL REASON-
ING (1987). Gardner holds a J.D. from Stanford and practiced law before returning to Stanford to
obtain a Ph.D. in computer science.
52. For a discussion of this framework, see Hart, Positivism and the Separation of Law and
Morals, 71 HARV. L. REV. 593 (1958), reprinted in H.L.A. HART, ESSAYS IN JURISPRUDENCE AND
PHILOSOPHY (1983); R. DWORKIN, supra note 23, at 81.
53. For a discussion of "open textured" legal concepts, see H.L.A. HART, THE CONCEPT OF
LAW 121-32 (1961).
several specifics must be provided, such as who are the experts to be used,
what counts as a disagreement among them, and how to tell if a disagree-
ment exists. From a practical standpoint, if one could sift the easy from
the hard, one could employ rule-based techniques, for instance, to solve
the easy questions, and attack the hard questions with other methods.64
Gardner built a computational model for the hard/easy paradigm and
for reasoning with open-textured legal concepts. The problem area of her
program was classic offer-and-acceptance law, and its task was to analyze
issue-spotter questions, taken from sources such as bar exams and Gil-
bert's Law Summaries. The program analyzed what the legal questions
were, spotted which were hard, and answered the easy ones. Gardner re-
quired that her program be computationally reasonable; it should not take
an unduly hard computation to decide which questions are easy and hard,
or else nothing much would have been gained, at least computationally, if
not theoretically.
In Gardner's model, hard questions arise from problems with rules in
two ways: either the rules relevant to the domain, such as those from the
Restatement of Contracts,55 are incomplete, circular, or contradictory; or
the legal predicates used in the rules are indeterminate and their interpre-
tation cannot be resolved.56 When there is a mixture in the disposition of
cases used to resolve these problems, Gardner's program categorizes the
problem as hard. Typically it is not possible to handle indeterminate con-
cepts because, as Gardner puts it, the rules "run out"; that is, a needed
rule's premise uses an undefined term.57
Gardner's approach to these problems was to give her program a rich
body of knowledge and several powerful heuristics. Her program's knowl-
edge included (1) Restatement-like rules for the doctrine of offer and ac-
ceptance,58 (2) a "network" to represent various states in which parties
54. This is of course not the only way to model the hard/easy problem; for instance, case-based
techniques could be used for both types of questions. However, given that rule-based methods are
well-understood, Gardner's approach is quite natural.
55. RESTATEMENT (SECOND) OF CONTRACTS (1981).
56. Note that a typical rule-based approach to handling difficulties with the rule set is to resolve
conflicts by hand before encoding the rules in the program. Of course, one cannot always eliminate
conflicts in the rule set because even the experts disagree. For example, Gardner's program has pairs
of conflicting rules about a rejection that revokes a previous acceptance of an offer, and about simulta-
neous acceptance and proposal to modify the contract. See A. GARDNER, supra note 51, at 134. Gard-
ner deliberately leaves conflicting rules in her system because she wants her model to be able to
handle the fact that legal experts, and thus their rules, can be in conflict. See id. at 3-4. A typical
rule-based way to finesse problems with open-textured legal predicates is either simply to ask the user
to make the judgment or to engage in what might be called "definitional backchaining": write rules
whose pre-conditions specify whether a legal predicate obtains, and then write rules defining what
those pre-conditions mean, and so on. See Sergot, supra note 46, at 378 (implementing this approach);
D. WATERMAN & M. PETERSON, supra note 35, at 45 (suggesting this approach).
57. A. GARDNER, supra note 51, at 33-34 (discussing problem of inadequacy of rules). This
brings the traditional rule-based technique (of backward chaining) to a complete standstill before an
adequate definitional basis is reached and application of the ambiguous term can be resolved.
58. For example, her program uses about twenty rules covering contract doctrines such as offer
and acceptance. See id. at 133.
59. For example, there can be an offer pending, and an acceptance will enable one to make a
transition to the state of being in a contractual relationship. See id. at 123-25; id. at ? 6.2.
60. For instance, that a telegram is a kind of document, which in turn is a kind of inanimate
object, which in turn is a kind of physical object. See id. at 90-91. See generally id. at 85-117 (very
detailed presentation of representation issues).
61. For example, her program has two prototypical fact patterns for the legal predicate "produces
a manifestation with content": making an utterance with some content, and sending a document with
some content. See id. at 156-57.
62. The determination of hard/easy cases is made as follows. If a problem such as the application
of a legal predicate to the current facts can be (tentatively) answered using the program's non-
exemplar domain knowledge, primarily rule-like in character, and if there are no opposing case exem-
plars, then the question is deemed easy and its answer is the one derived. If the case exemplars used
to check the tentative answer all point the opposite way, then the question is also considered easy but
the answer is that supported by the case exemplars. If, on the other hand, there is a mixture in the
disposition of the case exemplars, then the question is flagged as hard and the program does not
attempt to provide an answer. If a (tentative) answer cannot be derived with the domain rules be-
cause, for example, a predicate cannot be resolved, but all the relevant cases point the same way and
thus can be used to posit and support an answer, then the question is considered easy and the answer
is that indicated by the case exemplars. On the other hand, if an answer cannot be derived using the
domain rules and there is a mixture in the cases, then the question is also deemed hard. See id. at
54-55, 160-61 (abbreviated descriptions of her program's algorithm).
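The decision procedure of note 62 is crisp enough to restate directly as code. The Python below is a paraphrase for exposition, not Gardner's implementation; `rule_answer` stands for the tentative answer derived from the non-exemplar domain knowledge and is `None` when the rules "run out."

```python
def classify_question(rule_answer, case_answers):
    """Easy/hard triage paraphrasing note 62. case_answers lists the
    dispositions of the relevant case exemplars, e.g. "pro" / "con"."""
    unanimous = len(set(case_answers)) == 1
    if rule_answer is not None:
        opposed = [a for a in case_answers if a != rule_answer]
        if not opposed:
            return "easy", rule_answer       # rules and cases agree
        if unanimous:
            return "easy", case_answers[0]   # cases unanimously oppose
        return "hard", None                  # mixed dispositions
    if case_answers and unanimous:
        return "easy", case_answers[0]       # cases alone posit an answer
    return "hard", None                      # rules ran out, cases mixed

print(classify_question("pro", ["pro", "con"]))  # ('hard', None)
print(classify_question(None, ["con", "con"]))   # ('easy', 'con')
```

Seen this way, note 64's worry is a one-line observation: whenever the case dispositions are mixed, the question is flagged hard, and mixtures are very common.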
63. One might ask, why not go to the cases right off? My interpretation is that the reasoning
involving the other knowledge, the rules, the transition network, and the common sense knowledge,
provides a means by which to index relevant cases. Furthermore, even in the easy cases, one would
need to derive the answer by some means, and in her model this is attempted with the rule-guided
reasoning, which would need to be done at some point anyway. Alternatively, one could omit looking
at the rules altogether and just consider cases. Of course, one would need to say which cases to
consider and what exactly to do with them. See infra Section II.C (discussing work of Ashley).
64. For instance, too many questions might be categorized as "hard" in this model since there is
very often a mixture in the disposition of the cases. Perhaps one needs to be more discriminating in
the use of cases.
The next landmark AI project was Kevin Ashley's, which was done in
this author's research group at the University of Massachusetts.66 Ashley
developed a program called HYPO to model certain aspects of case-based
reasoning as exemplified by appellate-style argumentation with prece-
dents. The problem area of this project was trade secret law. The task
was to produce elements of precedent-based argument. The HYPO pro-
ject focused solely on reasoning with cases, and was the first project, not
only in AI and law but also in AI in general, to attack squarely the
problem of reasoning with cases and hypotheticals in a precedent-based
manner.
HYPO performs as follows: Given a fact situation, HYPO analyzes it
according to its model of trade secret law and then retrieves relevant cases
from its knowledge base of cases. It then determines which relevant cases
are most on point, or potentially so, for whose point of view, and from
which analytic approach. HYPO then generates the skeleton of an argu-
ment. In such an argument snippet, HYPO first argues for side one
(plaintiff or defendant) by making a legal point and citing its best, most
on point cases; then it argues for side two by responding with a counter-
point, citing a most on point case supporting side two's point of view or
also by distinguishing the current facts from side one's cases; and finally,
HYPO argues again for side one with a rebuttal of side two's position,
which may include distinguishing side two's cases and strengthening the
65. For instance, in Gardner's program cases are prototypical fact patterns rather than actual
cases. What about using real fact patterns instead, or in addition? Which ones? How should the
program's memory of cases be organized? Are certain cases to be preferred to others?
66. See generally K. ASHLEY, MODELING LEGAL ARGUMENT: REASONING WITH CASES AND
HYPOTHETICALS (forthcoming 1990) (originally completed as Ph.D. dissertation, Department of
Computer and Information Science of the University of Massachusetts, Amherst, February, 1988,
under this author's direction); Ashley & Rissland, A Case-Based Approach To Modeling Legal Ex-
pertise, IEEE EXPERT, Fall 1988, at 70-77 (short overview and example). Ashley holds a J.D. from
Harvard Law School and practiced as a litigator with White & Case in New York before returning to
study computer science at the University of Massachusetts.
67. For several examples, see K. ASHLEY, supra note 66; Ashley & Rissland, supra note 66, at
74-76.
68. K. ASHLEY, supra note 66.
69. For instance, in HYPO's domain of trade secret misappropriation law, knowing facts about
the relative costs incurred by the plaintiff and defendant in developing their putative secrets into a
product enables one to make arguments about the gain of competitive advantage, and knowing facts
about the disclosures made by the plaintiff enables one to make arguments about how well the plain-
tiff kept his knowledge secret. See, e.g., Gilburne & Johnston, Trade Secret Protection for Software
Generally and in the Mass Market, 3 COMPUTER L.J. 211 (1982). For further explanation and
examples, see K. ASHLEY, supra note 66; Rissland, Valcarce & Ashley, Explaining and Arguing
with Examples, in PROCEEDINGS OF THE NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE
288, 291-93 (1984).
70. In HYPO, for instance, the greater the disparity between the development costs of plaintiff
and defendant, the easier it is to argue that, all other things being equal, the defendant gained an
unfair competitive advantage through misappropriation of plaintiff's trade secret. Thus, for example,
since there is a case in which the plaintiff prevails with a ratio of two to one for plaintiff's to defend-
ant's costs, see Telex Corp. v. IBM Corp., 367 F. Supp. 258 (N.D. Okla. 1973), then a new case with
a ratio of four to one would represent an even stronger position for the plaintiff in HYPO.
71. See K. ASHLEY, supra note 66.
72. So, for instance, suppose the dimensions that apply to the current fact situation (CFS) are W, X, Y, and Z. Suppose Case A shares X, Y, and Z with CFS, and Case B shares just X and Y. Then Case A is more on point than Case B because the set that B shares with CFS is a subset of the set that A shares. Suppose there is a third relevant case, Case C, sharing dimensions W and X with CFS; Case C is neither more nor less on point than A or B. Both Case A and Case C are most on point cases since each shares maximally with respect to a subset of dimensions.
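The ordering in this footnote is simply proper-subset comparison on sets of shared dimensions. A sketch, with HYPO's actual case representation abstracted away to sets of dimension names:

```python
def most_on_point(cfs_dims, cases):
    """Cases whose set of dimensions shared with the current fact
    situation (CFS) is not a proper subset of any other case's shared
    set -- a simplification of HYPO's claim-lattice ordering."""
    shared = {name: cfs_dims & dims for name, dims in cases.items()}
    return [name for name, s in shared.items()
            if s and not any(s < t for t in shared.values())]

cfs = {"W", "X", "Y", "Z"}
cases = {"A": {"X", "Y", "Z"}, "B": {"X", "Y"}, "C": {"W", "X"}}
print(most_on_point(cfs, cases))   # ['A', 'C']: B is dominated by A
```

Running the sketch on the footnote's example yields Case A and Case C as most on point, with Case B eliminated because its shared set is a proper subset of A's.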
By then taking into account which cases support or cut against a party,
their relative strength along shared dimensions, and the dimensions not
shared, HYPO is able to analogize, distinguish, and otherwise manipulate
cases to construct snippets of precedent-citing argument. For instance, to
distinguish a case that the opposing party, say the plaintiff, has cited as
controlling the outcome of the current fact situation, HYPO looks at the
cited case for the existence of dimensions unshared with the current situa-
tion. HYPO then argues, for the defendant, that the presence or absence
in the current factual setting of these dimensions diminishes the applica-
bility of the cited case; in other words, these are differences that really
make a difference.73
To summarize, the HYPO project models some of the key ingredients
of reasoning with precedents. The model provides computational defini-
tions of relevant cases, on point cases, and best cases to cite.74 Because of
the precision of HYPO's model, a researcher can examine specific details
and assumptions.75 By providing an analysis and computational model for
reasoning with cases, perhaps the most vital aspect of legal reasoning,
Ashley's work is a giant step toward the goal of understanding legal
argumentation.
An earlier effort addressing some of the same concerns as HYPO was
the work of Thorne McCarty on legal argument.76 One of McCarty's
longstanding goals has been the construction of knowledge representation
mechanisms with which to model certain aspects of legal argumentation.
For instance, McCarty has proposed a mechanism called "prototypes and
deformations" to model certain aspects of appellate argument. He has
used it to examine the sort of situation in which one incrementally steps
from a desirable precedent to the current case, through a sequence of in-
termediate real and hypothetical cases, in order either to construct an ar-
gument by analogy for the result one desires, or else to show the inconsis-
tency of the result argued for by one's opponent.77 His model requires a
73. For instance, if the case cited by the plaintiff is very strong on a pro-plaintiff dimension not
present in the current case, the defendant can argue that that particular dimension is responsible for
the pro-plaintiff outcome. If, in addition, the defendant can point to a different case, the same in all
respects to the plaintiff's cited case except for that one dimension, in which the defendant prevailed,
then the assignment of credit for the outcome in the plaintiff's case to the missing dimension is all the
more convincing.
74. For instance, the best cases for a side to cite are defined as those most on point cases for that
side that share at least one applicable dimension favoring that side.
75. For instance, an alternative way to measure relevancy is to minimize the sets of dimensions
not shared. One could then swap this alternative definition for relevancy in the HYPO model and
observe the repercussions.
76. See generally McCarty, Reflections on TAXMAN: An Experiment in Artificial Intelligence
and Legal Reasoning, 90 HARV. L. REV. 837 (1977).
77. See McCarty & Sridharan, The Representation of an Evolving System of Legal Concepts: II
84. See, e.g., Branting, Representing and Reusing Explanations of Legal Precedents, in ICAIL-
89, supra note 1, at 103 (developing model of how past arguments can be modified for re-use in new
case).
85. See K. BELLAIRS, CONTEXTUAL RELEVANCE IN ANALOGICAL REASONING: A MODEL OF
LEGAL ARGUMENT (submitted to the University of Minnesota, 1989) (Ph.D. thesis) (how relevance
can be defined in terms of analogy). This work continues the examination begun by Ashley of what it
means for one case to be relevant to another. It draws on work such as Gentner's, supra note 12,
which views analogy as a mapping between structured objects like legal concepts and cases.
86. I.R.C. § 280A(c)(1) (1988).
87. For example, cases examining "exclusive use" include Frankel v. Commissioner, 82 T.C. 318
(1984), Baie v. Commissioner, 74 T.C. 105 (1980), and Chauls v. Commissioner, 41 T.C.M. (CCH)
234 (1980).
fit B, one must satisfy requirements R1, R2, and R3." Suppose that R1
and R2 are clearly satisfied. To argue for receiving the benefit, one has
several strategies to consider. For instance, one can try to find on point
precedents addressing the interpretation of antecedent R3, and then argue
that the facts actually do satisfy R3, and thus the rule. Alterna-
tively, one can try to find on point precedents in which only the first two
requirements were met and the benefit was awarded nonetheless, and then
argue that the first two are sufficient by themselves. These are examples
of "near miss" strategies. To carry them out requires not only the use of
both the rule and the cases, but also that both types of reasoning be done
in concert. It is not good enough to confine one's attention to the cases; the
case-based aspects must speak to the rule. Similarly, one gets nowhere
using the rules alone, since the rules "run out." Figuring out how to coor-
dinate the reasoning activities of the rule-based and case-based aspects is a
question of "control."88
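A sketch of the bookkeeping such coordination requires, with invented task and predicate names: detect that exactly one antecedent blocks the rule, then spawn the two case-based "near miss" tasks just described.

```python
def near_miss_tasks(antecedents, satisfied):
    """If exactly one antecedent of a rule is unresolved, spawn the two
    'near miss' argument strategies from the text. Illustrative only."""
    missing = [a for a in antecedents if a not in satisfied]
    if len(missing) != 1:
        return []
    (r,) = missing
    return [("find_cases_satisfying", r),   # argue the facts satisfy R3
            ("find_cases_omitting", r)]     # argue R3 is not really needed

print(near_miss_tasks(["R1", "R2", "R3"], {"R1", "R2"}))
# [('find_cases_satisfying', 'R3'), ('find_cases_omitting', 'R3')]
```

Each spawned task is case-based work in service of a rule-based goal, which is exactly why neither mode of reasoning can be run to completion on its own.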
Gardner's research suggests one model for integrating rule-based and
case-based reasoning: use cases when reasoning with rules comes to an
impasse, and also use cases to validate the conclusions reached using
rules.89 In this approach, case-based reasoning is subjugated to rule-based
reasoning. Of course, selecting the dominant mode of reasoning should
really depend on the context or circumstances: sometimes the rules should
drive the reasoning, and sometimes the cases.
A different approach for an AI program would be to have independent
processes-one for case-based tasks, one for rule-based tasks-where each
works from its own point of view independently. Later, an executive pro-
cess would integrate the results. Metaphorically, this is like sending off
associates to do specific tasks, such as library work on case law or statu-
tory law, and having the senior associate integrate the information and the
reasoning.
For a computational model, one must spell out when and how the
processes interact. In particular, how should the associate processes com-
municate their results with one another? Should they wait until they are
done with their individual tasks before communicating? Should they share
intermediate results and insights?
One AI model for integrating the work of the associate processes would
be the following: each process could have access to a common blackboard,
on which it could write anything interesting or useful it learns, and from
which it could read any piece of information it found to be interesting or
useful. In this model, the processes reason largely independently of each
other, and yet they can capitalize opportunistically on each other's results.
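In code, the blackboard itself can be remarkably simple: a shared list of findings that each knowledge source reads and posts to in rounds. The sketch below is schematic, with invented finding labels; it corresponds to no particular system.

```python
class Blackboard:
    """A shared workspace: sources post findings and read what others
    have posted. Purely schematic."""
    def __init__(self):
        self.entries = []

    def post(self, source, finding):
        self.entries.append((source, finding))

    def findings(self):
        return [f for _, f in self.entries]

def run(board, knowledge_sources, rounds=3):
    # Each round, every source reacts to the current board contents.
    for _ in range(rounds):
        for ks in knowledge_sources:
            for finding in ks(board.findings()):
                if finding not in board.findings():
                    board.post(ks.__name__, finding)

def rule_based(findings):
    # toy: fire a rule once a supporting case analogy is on the board
    return ["rule_conclusion"] if "case_analogy" in findings else []

def case_based(findings):
    # toy: retrieve an analogy once an issue has been spotted
    return ["case_analogy"] if "issue_spotted" in findings else []

board = Blackboard()
board.post("user", "issue_spotted")
run(board, [rule_based, case_based])
print(board.findings())
# ['issue_spotted', 'case_analogy', 'rule_conclusion']
```

Neither source calls the other; each reacts only to what appears on the board, which is what lets the processes capitalize opportunistically on one another's results.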
88. "Control" refers to issues concerning how a program is organized, how its parts interact, and,
in general, how it decides what to do next.
89. See supra Section II.B.
90. See Nii, Blackboard Systems, in 4 HANDBOOK OF ARTIFICIAL INTELLIGENCE, supra note 2, at 3. The blackboard
model was first used in the 1970's in the HEARSAY II project for understanding continuous speech
(this has nothing to do with the hearsay rule in evidence). See, e.g., Erman, Hayes-Roth, Lesser &
Reddy, The HEARSAY-II Speech-Understanding System: Integrating Knowledge To Resolve Uncertainty, 12 COMPUTING SURVEYS 213 (1980).
91. For example, one would need to determine which "knowledge sources," as the rule-based and
case-based associates would be called, have access to what information on the blackboard. For in-
stance, can some access more information than others? What is the relative import of their results?
92. The program AM used an agenda. See generally R. DAVIS & D. LENAT, supra note 7.
93. For instance, in our benefits rule example, the first task of applying the benefit rule would
spawn several near-miss tasks concerning the third antecedent, R3, such as finding cases to show R3
is satisfied or not needed.
94. The overall behavior of agenda-based systems is to work in a "best first" manner. Since the
decisions as to what counts as "best" are often based on evaluative heuristics, and at any given mo-
ment the tasks on the agenda represent alternatives, agenda-based systems perform "heuristic best first
search."
95. The management scheme can be used to bias the system to behave in a certain way. For
instance, one can bias the system always to look at rule-based tasks in preference to case-based ones.
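An agenda, by contrast, is essentially a priority queue of tasks ordered by heuristic ratings, and the management scheme of note 95 can be expressed as a bias term. A schematic sketch, with invented task descriptions:

```python
import heapq

class Agenda:
    """Tasks ordered by heuristic rating: 'heuristic best-first search.'
    The bias table implements a management scheme, e.g. preferring
    rule-based tasks over case-based ones. Schematic only."""
    def __init__(self, bias=None):
        self._heap, self._n = [], 0
        self._bias = bias or {}

    def add(self, kind, task, rating):
        score = rating + self._bias.get(kind, 0)
        # negate score: heapq pops smallest; _n breaks ties by insertion
        heapq.heappush(self._heap, (-score, self._n, kind, task))
        self._n += 1

    def pop(self):
        return heapq.heappop(self._heap)[2:] if self._heap else None

agenda = Agenda(bias={"rule-based": 10})
agenda.add("case-based", "find near-miss precedents for R3", rating=8)
agenda.add("rule-based", "try to establish R3 by backward chaining", rating=5)
print(agenda.pop())
# ('rule-based', 'try to establish R3 by backward chaining') -- the
# bias makes the rule-based task win despite its lower raw rating
```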
96. See, e.g., Oskamp, Knowledge Representation and Legal Expert Systems, in ADVANCED TOPICS
OF LAW AND INFORMATION TECHNOLOGY 195 (G. Vandenberghe ed. 1989); Oskamp, Walker,
Schrickx & van den Berg, PROLEXS Divide and Rule: A Legal Application, in ICAIL-89, supra
note 1, at 54.
97. See supra text accompanying notes 85-87; Rissland & Skalak, Interpreting Statutory Predi-
cates, in ICAIL-89, supra note 1, at 46; Rissland & Skalak, Combining Case-Based and Rule-Based
Reasoning: A Heuristic Approach, in 1 PROCEEDINGS OF THE ELEVENTH INTERNATIONAL JOINT
CONFERENCE ON ARTIFICIAL INTELLIGENCE 524 (1989) [hereinafter Rissland & Skalak, A Heuristic
Approach].
98. For some of the computational details, see Rissland & Skalak, A Heuristic Approach, supra
note 97.
99. Id. at 526.
100. See, e.g., W. TWINING & D. MIERS, HOW TO DO THINGS WITH RULES (2d ed. 1982)
(discussing various problems, examples, and approaches for reasoning with statutes and rules).
101. By contrast, in most of the systems discussed in this Comment, perhaps with the exception of
Gardner's system, there is no method by which the program may reason about such knowledge ex-
plicitly. While questions of time and number come up, for instance, in HYPO's reasoning about
disclosure events, see supra notes 69, 82, HYPO cannot reason explicitly about time or numeracy as
topics in their own right. However, note that since HYPO knows that 10,000 disclosures is a far
worse number of disclosures from the plaintiff's point of view than two would be, in a sense HYPO
implicitly knows the difference between big and small numbers. One might even say that since, with
respect to the dimensions about disclosures, fact situations with the same number of disclosures are
treated as being the same (all other things aside), HYPO could be said to know what 2 or 10,000
means. However, numbers and their absolute magnitudes are not topics HYPO can reason about in
an explicit way. If HYPO were redesigned so that it used a deep model containing information about
numbers, it would be able to do so.
discussed in this Comment and that research seeking to develop rich rep-
resentation schemes for legal knowledge.
Work on representation of legal knowledge is exemplified by the re-
search projects of Thorne McCarty102 and Tom Gordon.103 Deep models
stand in contrast to situations where one simply uses representation items
to encode domain knowledge, but cannot reason further about their mean-
ing. If one were to employ deep knowledge in a system like HYPO, for
instance, one would represent in much more detail information about such
things as disclosure events, companies, time periods of employment and
product development, obligations of employers and employees, and prod-
ucts. For instance, in a deep model an AI researcher could reify a relation
such as employer-employee at a deeper level, perhaps in terms of relations
of permission or obligation, so that the program could reason about the
employer-employee relation as a topic in itself.104
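A hint, in Python, of what such reification buys: the employer-employee relation becomes a structured object whose parts the program can inspect, rather than an atomic token. The attribute names below are invented, loosely echoing the Hohfeld-style permissions and obligations McCarty employs.

```python
from dataclasses import dataclass, field

@dataclass
class Obligation:
    holder: str       # who owes the duty
    beneficiary: str  # to whom it is owed
    content: str      # e.g., "keep the employer's process secret"

@dataclass
class Employment:
    """The employer-employee relation reified as an object: a program
    can now reason about the relation's structure, not just match the
    token 'employee-of'. Names are invented for illustration."""
    employer: str
    employee: str
    obligations: list = field(default_factory=list)

rel = Employment("Acme", "Jones",
                 [Obligation("Jones", "Acme", "keep the process secret")])
# A deep model can ask, e.g., which duties run to the employer -- a
# question a flat token cannot even express.
duties = [o for o in rel.obligations if o.beneficiary == rel.employer]
print(duties)
```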
III. CONCLUSIONS
102. See, e.g., McCarty, Permissions and Obligations, 1 PROCEEDINGS OF THE EIGHTH INTER-
NATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE 287 (1983) (using "permissions" and
"obligations" to represent certain relations between parties in case); cf. Hohfeld, Some Fundamental
Legal Conceptions as Applied in Legal Reasoning, 23 YALE L.J. 16 (1913) (developing similar
relationship-type structures). The need for representing such deontic relationships in his scheme re-
quired McCarty to develop a great deal of logical apparatus, which has been the focus of his work for
many years. Of late, McCarty has been coming back to some of his primary concerns: development of
a comprehensive representational theory for legal knowledge and ultimately, legal argument. See, e.g.,
McCarty, A Language for Legal Discourse I. Basic Features, in ICAIL-89, supra note 1, at 180.
103. See Gordon, Issue Spotting in a System for Searching Interpretation Spaces, in ICAIL-89,
supra note 1, at 157 (1989); Gordon, OBLOG-2: A Hybrid Knowledge Representation System for
Defeasible Reasoning, in ICAIL-87, supra note 32.
104. Hohfeld, supra note 102.
105. See supra Section II.A.
106. For instance, I predict that work on case-based reasoning will lead to practical tools for
creating and managing case data bases of individual practitioners and firms, which can then be used
in preparation of new cases. A beneficial side-effect of such CBR tools, and of course, of traditional
expert systems, will be the capturing and preservation of a firm's "institutional memory" and its use
to leverage new or inexperienced attorneys in the areas of the firm's expertise to higher levels of performance, at the very least by keeping them from asking "obvious" questions and making "silly" mistakes.
I have tried to show how AI and law researchers are pursuing their
twin goals of analytic and practical advances, and how past and ongoing
research can be viewed as a coherent attempt to model legal reasoning,
particularly argumentation. Even though we may be a long way from
some vision of the ideal legal reasoning Al program, and in fact may
never be able to achieve certain aspects of human expertise, we can al-
ready accomplish very interesting and useful projects. We can use the fine
lens of AI to explicate the process of legal reasoning, for instance the crea-
tion of case-based arguments; to shed light on questions in legal philoso-
phy, such as the nature of open-textured predicates; and to provide practi-
cal, even socially beneficial, applications, such as expert systems in certain
administrative areas. The insistence on using computational methods in
the AI approach provides a useful discipline for considering longstanding
issues in legal reasoning.
Although I have not discussed it here, this body of research can also be
fruitfully applied to the law school curriculum. For instance, we can pro-
vide our law students with environments in which to examine their own
legal knowledge.107 The conceptual framework of AI can also provide a
way to describe our own expertise, such as the posing of artful hypotheticals,
and show students how they might acquire it. AI provides a
set of tools, based on detailed models of representation and process.
By this discussion, I hope both to encourage those wishing to attempt
AI-style projects and to reassure those unsure of whether we should.
For instance, some might be concerned that the use of AI models will
somehow trivialize legal reasoning by making it seem too simple, under-
mine the importance of lawyers and judges by relegating them to the role
of mere users of systems which do all of the interesting reasoning, or de-
humanize us by describing intelligent behavior in well-defined terms. I
think AI research shows just the opposite: The more we understand
human reasoning, the more we marvel at its richness and flexibility, the
more questions we ask as we try to understand its workings, and the more
we require of a computer program exhibiting intelligence.
Legal reasoning is complex. Our current AI models, albeit too simple,
are but steps to more subtle and complete models, and at each step we
understand more. There will always be a need for human lawyers and
judges. The goal is to assist, not to replace. Demystification of some of the
107. For example, one can use currently available commercial expert systems shells to allow stu-
dents to build their own small applications and then to experiment with them. The very exercise of
developing a rule base forces students to develop a theory of the application area and to think about
general issues regarding the validity and appropriateness of using rule-based approaches. For in-
stance, developing rule sets for consumer tort law or offer-and-acceptance law requires understanding
of the specifics of the law as well as general issues about rules and legal predicates and problematic
aspects of them. See supra Sections II.A-B.