Ebook: Legal Knowledge and Information Systems
In recent years, interest within the research community and the legal industry regarding technological advances in legal knowledge representation and processing has been growing. This relates to areas such as computational models of legal reasoning, cybersecurity, privacy, trust and blockchain methods, among other things.
This book presents the proceedings of JURIX 2022, the 35th International Conference on Legal Knowledge and Information Systems, held from 14–16 December in Saarbrücken, Germany, under the auspices of the Dutch Foundation for Legal Knowledge Based Systems and hosted by Saarland University. The annual JURIX conference has become an international forum for academics and professionals to exchange knowledge and experiences at the intersection of law and artificial intelligence (AI). For this edition, 62 submissions were received from 163 authors in 24 countries. Following a rigorous review process, carried out by a programme committee of 72 experts recognised in the field, 14 submissions were selected for publication as long papers, 22 as short papers and 5 as demo papers, making a total of 41 papers and representing a 22.5% acceptance rate for long papers (66.1% overall). The broad array of topics covered includes argumentation and legal reasoning, legal ontologies and the semantic web, machine and deep learning and natural language processing for legal knowledge extraction, as well as argument mining, translation of legal texts, defeasible logic, legal compliance, explainable AI, alternative dispute resolution, legal drafting and smart contracts.
Providing an overview of recent advances, the book will be of interest to all those working at the interface between the law and AI.
We are pleased to present to you the proceedings of the 35th International Conference on Legal Knowledge and Information Systems – JURIX 2022. For more than three decades, the JURIX conferences have been held under the auspices of the Dutch Foundation for Legal Knowledge Based Systems (www.jurix.nl). Traditionally based in Europe, JURIX has over time become an international forum for academics and professionals to exchange knowledge and experiences at the intersection of Law and Artificial Intelligence. Over the years, JURIX has witnessed the growing interest of the research community and the industry in technological advances in legal knowledge representation, computational models of legal reasoning, evidential reasoning, argumentation, case-based and rule-based reasoning, machine learning and natural language processing for legal knowledge acquisition, big data and data analysis, open data and the semantic web in the legal domain, online dispute resolution, legal document management and information retrieval, knowledge discovery, data mining, as well as cybersecurity, privacy, trust and blockchain methods.
The 2022 edition of JURIX, which runs from 14–16 December, is hosted by Saarland University in Saarbrücken, Germany. This edition marks an important milestone because it coincides with a relative return to normal after the challenges of the Covid-19 crisis: it is expected to be the first in the series of AI&Law community events to be held fully in person again.
For this edition, we received 62 submissions from 163 authors in 24 countries. Of these, 14 were selected for publication as long papers (10 pages), 22 as short papers (6 pages) and 5 as demo papers (4 pages), for a total of 41 presentations. This is the result of a balance between inclusiveness and a very competitive and rigorous review process, which was carried out by a Program Committee composed of 72 recognised experts in the field. The result was a total acceptance rate (long, short and demo papers) of 66.1%, which testifies to the overall good quality of the submissions, with a 22.5% acceptance rate for long papers.
The accepted papers cover a broad array of topics, from argumentation and legal reasoning, to legal ontologies and semantic web; from machine and deep learning to natural language processing for legal knowledge extraction, as well as argument mining and translation of legal texts; from defeasible logic, legal compliance and explainable AI, to alternative dispute resolution, legal drafting and smart contracts.
Three invited speakers from different and complementary areas (industry, European institutions and academia) have honored JURIX 2022 by kindly agreeing to deliver keynote lectures: Martin Rollinger, Paul Nemitz and Mireille Hildebrandt.
Martin Rollinger is the CEO of Sinc, a leading company in Germany in the implementation of AI and legal informatics in the judiciary. Over the past 20 years, he has been involved in numerous e-Government projects in domains such as education, law enforcement, intelligence and environmental management. He has been responsible for one of the largest projects in the digitization of German courts and public prosecutors' offices and has been actively involved in building tools and products in the AI and law space for the past 10+ years.
Paul Nemitz is the Principal Advisor in the Directorate General for Justice and Consumers of the European Commission. He was appointed in April 2017, following a 6-year appointment as Director for Fundamental Rights and Citizen’s Rights in the same Directorate General. As Director, Paul Nemitz led the reform of Data Protection legislation in the EU, the negotiations of the EU–US Privacy Shield and the negotiations with major US Internet Companies of the EU Code of Conduct against incitement to violence and hate speech on the Internet. He is a Member of the Commission for Media and Internet Policy of the Social Democratic Party of Germany (SPD), Berlin, and a visiting Professor of Law at the College of Europe in Bruges. Paul is also a Member of the Board of the Verein Gegen Vergessen – Für Demokratie e.V., Berlin, and a Trustee of the Leo Baeck Institute, New York. He chairs the Board of Trustees of the Arthur Langerman Foundation, Berlin.
Mireille Hildebrandt is Research Professor on ‘Interfacing Law and Technology’ at Vrije Universiteit Brussel (VUB), appointed by the VUB Research Council, and co-Director of the Research Group on Law Science Technology and Society studies (LSTS) at the Faculty of Law and Criminology. Mireille also holds the part-time Chair of Smart Environments, Data Protection and the Rule of Law at the Science Faculty, at the Institute for Computing and Information Sciences (iCIS) at Radboud University Nijmegen. Her research interests concern the implications of automated decisions, machine learning and mindless artificial agency for law and the rule of law in constitutional democracies.
As is tradition, JURIX is again accompanied this year by co-located satellite events, including workshops, tutorials and a Doctoral Consortium. We thank the workshop and tutorial organizers for their excellent proposals and for the effort involved in organizing the events. This year’s edition comprises 1 tutorial, “AI and Machine Learning – their benefits and drawbacks for use in ODR”, and 6 workshops: AICOL 2022 – “AI Approaches to the Complexity of Legal Systems”; SORO – “Interdisciplinary Workshop on the Governance for Social Robots”; WAICOM – “Workshop on AI Compliance Mechanism”; LDA 2022 – “CEILI Workshop on Legal Data Analysis”; LN2FR – “Methodologies for Translating Legal Norms into Formal Representations”; AMPM 2022 – “2nd Workshop in Agent-based Modeling & Policy-Making”. AICOL is in its 14th edition and traditionally offers a space to discuss models of legal knowledge better suited to the complexity of contemporary legal systems. SORO is a new workshop that stems from a Japanese project on the governance of social robots through interdisciplinary approaches, including ethical considerations such as deception, privacy and safety that arise from their close interactions with humans. WAICOM is a somewhat related workshop that discusses legal and technical solutions for ensuring that the behaviour of AI adheres to legal and ethical principles. LDA, in its 8th edition, focuses on representation, analysis and reasoning with legal data. LN2FR explores the various challenges connected with the task of using formal languages and models to represent legal norms in a machine-readable manner. Finally, AMPM, in its 2nd edition, aims to create space for agent-based modeling, exploiting computation to investigate the factual underpinnings of the legal phenomenon, such as the intricate networks of cognitive, social, technological, and legal mechanisms through which law emerges, is applied, and exerts its effects.
Moreover, since 2013 JURIX has offered young researchers entering the AI&Law field as Ph.D. students the opportunity to present their work during the Doctoral Consortium session, which provides an effective environment for growth and mentoring, and it does so again this year. For the coordination of this event, our special thanks go to Monica Palmirani, Doctoral Consortium Chair and untiring coordinator of the Joint International Doctoral Degree in Law, Science and Technology (LAST-JD).
Organizing this edition of the conference would not have been possible without the support of many people and institutions. Special thanks are due to the local organizing team of the Institute of Legal Informatics at Saarland University, chaired by Georg Borges and Christoph Sorge, and to the Faculty of Law at Saarland University for sponsoring the event.
Moreover, we are particularly grateful to the 72 members of the Program Committee for their commitment and excellent work in a rigorous review process, including their active participation in the discussions concerning borderline papers. Finally, we would like to thank the former and current JURIX executive and steering committee members for their continuous support and advice, as well as for sharing experiences and suggestions about JURIX organization challenges.
Enrico Francesconi, JURIX 2022 Program Chair
Georg Borges, JURIX 2022 Conference Co-Chair
Christoph Sorge, JURIX 2022 Conference Co-Chair
This paper extends the existing account of statutory interpretation based on argument schemes theory. It points out that the preference relations among statutory canons are not always determined by some predefined rules, but in certain systems of law or legal domains, it is necessary to argue these preference relations on the basis of case law. A set of factors favouring linguistic arguments and teleological arguments is presented, and a case-based argument scheme for the assignment of preference relations is reconstructed.
We describe a system for constructing, evaluating and visualising arguments based on a theory of a legal domain, developed using the Angelic methodology and the Carneades argumentation system. The visualisation can be used to explain particular cases and to refine and maintain the theory. A full implementation of the well-known US Trade Secrets Domain is used to illustrate the process.
I explore a factor-based model of precedential constraint that, unlike existing models, does not rely on the assumption that the background set of precedent cases is consistent. The model I consider is a generalization of the reason model of precedential constraint that was suggested by Horty. I show that, within this framework, inconsistent case bases behave in a sensible and interesting way, both from a logical and a more practical perspective.
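For orientation, the consistent-case-base version of the reason model that this abstract generalizes can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's model: it covers only the standard setting in which the case base is assumed consistent, and the names `Case`, `conflicts`, `is_consistent` and `permitted`, as well as the factor labels, are hypothetical. A case records the factors present, the reason (the winning side's factors that the court relied on) and the outcome; two cases with opposite outcomes conflict when each one's reason also holds in the other's fact situation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    factors: frozenset  # all factors present in the fact situation
    reason: frozenset   # winner-favouring factors used as the rule's premise
    outcome: str        # "p" (plaintiff) or "d" (defendant)


def conflicts(a: Case, b: Case) -> bool:
    """Opposite outcomes, and each case's reason holds in the other's facts."""
    return (a.outcome != b.outcome
            and a.reason <= b.factors
            and b.reason <= a.factors)


def is_consistent(case_base) -> bool:
    """A case base is consistent if no pair of cases conflicts."""
    cases = list(case_base)
    return not any(conflicts(a, b)
                   for i, a in enumerate(cases)
                   for b in cases[i + 1:])


def permitted(case_base, new_case: Case) -> bool:
    """Assuming a consistent case base, a new decision is permitted
    if adding it introduces no conflict."""
    return not any(conflicts(new_case, old) for old in case_base)


# Hypothetical factors: p1 favours the plaintiff; d1 and d2 favour the defendant.
precedent = Case(frozenset({"p1", "d1"}), frozenset({"p1"}), "p")
facts = frozenset({"p1", "d1", "d2"})

blocked = Case(facts, frozenset({"d1"}), "d")        # d1 was already outweighed by p1
distinguished = Case(facts, frozenset({"d2"}), "d")  # relies on the new factor d2

print(permitted([precedent], blocked))        # False
print(permitted([precedent], distinguished))  # True
```

In this sketch a court facing the new fact situation may still find for the defendant, but only by relying on a factor (`d2`) that was absent from the precedent. Under the consistency assumption, any conflicting pair makes the constraint ill-defined, which is the limitation the abstract sets out to remove.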
The typical judicial pathway consists of a judgment by a tribunal followed by a decision of an appellate court. However, the link between the two documents is sometimes difficult to establish because of missing, incorrect or badly formatted references, pseudonymization, or poor drafting specific to each jurisdiction. This paper first shows that it is possible to link court decisions related to the same case using manual rules, even though they come from different jurisdictions. The subsequent use of deep learning significantly reduces the error rate in this task. The experiments are conducted between the Commercial Court of Paris and Appellate Courts.
Modelling the concept of explanation is a central matter in AI systems, as it provides methods for developing eXplainable AI (XAI). When explanation applies to normative reasoning, XAI aims at promoting normative trust in the decisions of AI systems: such trust depends on understanding whether a system's predictions correspond to legally compliant scenarios. This paper extends to normative reasoning a work by Governatori et al. (2022) on the notion of stable explanations in a non-monotonic setting: when an explanation is stable, it can be used to infer the same normative conclusion independently of other facts that are found afterwards.
In making legal decisions, courts apply relevant law to facts. While the law typically changes slowly over time, facts vary from case to case. Nevertheless, underlying patterns of fact may emerge. This research focuses on underlying fact patterns commonly present in cases where motorists are stopped for a traffic violation and subsequently detained while a police officer conducts a canine sniff of the vehicle for drugs. We present a set of underlying patterns of fact, that is, factors of suspicion, that police and courts apply in determining reasonable suspicion. We demonstrate how these fact patterns can be identified and annotated in legal cases and how these annotations can be employed to fine-tune a transformer model to identify the factors in previously unseen legal opinions.
Contemporary legal digital libraries such as Lexis Nexis and WestLaw allow users to search case law using sophisticated search tools. At their core, these tools use various forms of keyword search and indexing to find documents of interest. While newer search engines leveraging semantic technologies such as knowledge bases, natural language processing, and knowledge graphs are becoming available, legal databases have yet to take full advantage of them. In this paper, we introduce an experimental legal document search engine, called Prism, that is capable of supporting legal argument based search to support legal claims.
Clause recommendation is the problem of recommending a clause to a legal contract, given the context of the contract in question and the clause type to which the clause should belong. With not much prior work being done toward the generation of legal contracts, this problem was proposed as a first step toward the bigger problem of contract generation. As an open-ended text generation problem, the distinguishing characteristics of this problem lie in the nature of legal language as a sublanguage and the considerable similarity of textual content within the clauses of a specific type. This similarity aspect in legal clauses drives us to investigate the importance of similar contracts’ representation for recommending clauses. In our work, we experiment with generating clauses for 15 commonly occurring clause types in contracts expanding upon the previous work on this problem and analyzing clause recommendations in varying settings using information derived from similar contracts.
This paper brings together factor-based models of case-based reasoning (CBR) and the logical specification of classifiers. Horty [8] has developed the factor-based models of precedent into a theory of precedential constraint. In this paper we combine the binary-input classifier logic (BCL) for classifiers and their explanations, given by Liu & Lorini [13, 14], with Horty’s account of factor-based CBR, since both a classifier and CBR map sets of features to decisions or classifications. We reformulate case bases in the language of BCL, and give several representation results. Furthermore, we show how notions of CBR can be analyzed by notions of classifier explanation.
Reasoning with legal cases has long been modelled using symbolic methods. In recent years, the increased availability of legal data together with improved machine learning techniques has led to an explosion of interest in data-driven methods being applied to the problem of predicting outcomes of legal cases. Although encouraging results have been reported, they are unable to justify the outcomes produced in satisfactory legal terms and do not exploit the structure inherent within legal domains; in particular, with respect to the issues and factors relevant to the decision. In this paper we present the technical foundations of a novel hybrid approach to reasoning with legal cases, using Abstract Dialectical Frameworks (ADFs) in conjunction with hierarchical BERT. ADFs are used to represent the legal knowledge of a domain in a structured way to enable justifications and improve performance. The machine learning is targeted at the task of factor ascription; once factors present in a case are ascribed, the outcome follows from reasoning over the ADF. To realise this hybrid approach, we present a new hybrid system to enable factor ascription, envisioned for use in legal domains, such as the European Convention on Human Rights that is used frequently in modelling experiments.
Translating often has the meaning of converting from one human language to another. However, in a broader sense, it means transforming a message from one form of communication to another form. Logic is an important form of communication and the ability to translate natural language into logic is important in many different fields, in which logical reasoning and logical arguments are used. In the legal field, for example, judges must often reason from facts and arguments presented in natural language to logical conclusions. In this paper, toward the goal of support for this kind of reasoning with machines, we propose a method for translating natural language into logical representations using a combination of deep learning methods. Our approach contributes methodologies and insights to the development of computational methods for converting natural language into logical representations.
Topic modeling is widely used in various domains for extracting latent topics underlying large corpora, including judicial texts. In the latter, topics tend to be made by and for domain experts, but remain unintelligible for laymen. Working with housing law court decisions in French, which mix abstract legal terminology with real-life situations described in common language, we aim, similarly to [1], to apply topic models to identify the different situations that can cause a tenant to take their landlord to court. Upon quantitative evaluation, LDA and BERTopic deliver the best results, but a closer manual analysis reveals that the latter, embedding-based approach is much better at producing and even uncovering topics that describe a tenant’s real-life issues and situations.
Legal text summarization is generally formalized as an extractive text summarization task applied to court decisions, from which the most relevant sentences are identified and returned as a gist meant to be read by legal experts. However, such summaries are not suitable for laymen seeking intelligible legal information. In the scope of the JusticeBot, a question-answering system in French that provides information about housing law, we intend to generate summaries of court decisions that are, on the one hand, conditioned by a question-answer-decision triplet, and on the other hand, intelligible for ordinary citizens not familiar with legal documents. So far, our best model, a further pre-trained BARThez, achieves an average ROUGE-1 score of 37.7, and an in-depth manual evaluation of the summaries reveals that there is still room for improvement.
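As background on the metric mentioned above, ROUGE-1 measures unigram overlap between a generated summary and a reference summary. The sketch below is only illustrative: the abstract does not say which ROUGE-1 variant (recall, precision or F1), tokenizer or preprocessing was used, so the whitespace tokenization, the lowercasing, the function name `rouge1` and the example sentences are all assumptions. Scores are returned on a 0–1 scale, whereas published figures are often reported multiplied by 100.

```python
from collections import Counter


def rouge1(candidate: str, reference: str) -> dict:
    """Unigram-overlap precision, recall and F1 (whitespace tokens, lowercased)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Clipped overlap: each token counts at most as often as it appears in both.
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical summaries: 5 of the 6 unigrams match on each side.
scores = rouge1("the landlord must fix the heating",
                "the landlord must repair the heating")
print(round(scores["f1"], 3))  # 0.833
```

A corpus-level figure such as the one quoted would typically be an average of per-summary scores, which is why a single number says little about intelligibility and why the manual evaluation matters.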
We propose an adaptive environment (CABINET) to support caselaw analysis (identifying key argument elements) based on a novel cognitive computing framework that carefully matches various machine learning (ML) capabilities to the proficiency of a user. CABINET supports law students in their learning as well as professionals in their work. The results of our experiments, which focused on the feasibility of the proposed framework, are promising. We show that the system is capable of identifying a potential error in the analysis with a very low false-positive rate (2.0–3.5%), as well as of predicting the key argument element type (e.g., an issue or a holding) with a reasonably high F1-score (0.74).
Although argumentation is often studied in AI using abstract frameworks, actual debate often shows a dynamic interaction between argument structure and attack. Intermediate steps in the reasoning are often omitted, but these intermediate steps may be the vulnerable parts of the argument. Inspired by Loui and Norman’s work on the rationale of arguments, we study the relation between argument structure and attack in terms of the unpacking of arguments. The paper provides an analysis of two kinds of rationales discussed by Loui and Norman. Example dialogues inspired by Dutch tort law are used for illustration.
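For readers less familiar with the two metrics quoted above, the following sketch shows how a false-positive rate and an F1-score are derived from confusion-matrix counts. The counts are hypothetical and chosen only so that the results fall in the reported range; the abstract does not give the underlying counts, and for a multi-class task such as element-type prediction the reported F1 is presumably an average over classes, which this single-class sketch does not reproduce.

```python
def false_positive_rate(fp: int, tn: int) -> float:
    """Share of actual negatives that were wrongly flagged: FP / (FP + TN)."""
    return fp / (fp + tn) if (fp + tn) else 0.0


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall: 2TP / (2TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical counts, not taken from the paper.
fpr = false_positive_rate(fp=3, tn=97)
f1 = f1_score(tp=74, fp=26, fn=26)
print(f"{fpr:.1%}")  # 3.0%
print(f"{f1:.2f}")   # 0.74
```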
We introduce the Illinois Intentional Tort Qualitative Dataset, a set of Illinois Common Law cases in Assault, Battery, Trespass, and Self-Defense, machine-translated into qualitative predicate representations. We discuss the cases involved, the natural language understanding system used to translate the cases into predicate logic, and validation measures that serve as performance baselines for future AI research using the dataset.
Deontic logics have long been the tool of choice for the formal analysis of normative texts. While various such logics have been proposed, many deal with time only in a qualitative sense, i.e., they reason about the ordering but not the timing of events; only in the past few years have real-time deontic logics been developed to reason about time quantitatively. In this paper we present timed contract automata, an automata-based deontic modelling approach complementing these logics with a more operational view of such normative clauses and providing a computational model more amenable to automated analysis and monitoring.
Machine learning has improved significantly during the past decades. Computers perform remarkably in formerly difficult tasks. This article reports the preliminary results on the prediction of two characteristics of judgments of the European Court of Justice, which require the knowledge of concepts and doctrines of European Union law and judicial decision-making: the legal importance (doctrinal outcome) and the leeway given to the national courts and legislators (deference). The analysis relies on 1704 manually labelled judgments and trains a set of classifiers based on word embedding, LSTM, and convolutional neural networks. While all classifiers exceed simple baselines, the overall performance is weak. This suggests, first, that the models learn meaningful representations of the judgments and, second, that machine learning encounters significant challenges in the legal domain. These arise due to the small training data, significant class imbalance, and the characteristics of the variables requiring external knowledge. The article also outlines directions for future research.
With the enforcement of the European Union’s General Data Protection Regulation, the users of Web services powered by the intensive usage of personal data (the ‘data subjects’) have seen their rights strengthened, and the same can be said about the obligations imposed on the ‘data controllers’ responsible for these services. In particular, the ‘Right of Access’, which gives users the option to obtain a copy of their personal data as well as relevant details such as the categories of personal data being processed or the purposes and duration of said processing, is putting increasing pressure on controllers, as its execution often requires a manual response effort, and the wait time is negatively affecting the data subjects. In this context, the main goal of this work is the development of an API, which builds on the previously mentioned structured information, to assist controllers in the automation of replies to right of access requests. The implemented API method is then used in the implementation of a Solid application whose main goal is to assist users in exercising their right of access to data stored in Solid Pods.
With the uptake of digital services in public and private sectors, the formalization of laws is attracting increasing attention. Yet, non-compliant fraudulent behaviours (money laundering, tax evasion, etc.)—practical realizations of violations of law—remain very difficult to formalize, as one does not know the exact formal rules that define such violations. The present work introduces a methodological framework aiming to discover non-compliance through compressed representations of behaviour, considering a fraudulent agent that explores via simulation the space of possible non-compliant behaviours in a given social domain. The framework is founded on a combination of utility maximization and active learning. We illustrate its application on a simple social domain. The results are promising, and seemingly reduce the gap on fundamental questions in AI and Law, although this comes at the cost of developing complex models of the simulation environment, and sophisticated reasoning models of the fraudulent agent.
This paper studies constraint hierarchies for ethical norms, which are unwritten and may be relaxed if they conflict with stronger norms. Since such ethical norms are unwritten, initial representations of ethical norms may contain errors. To correct those errors, this paper examines fundamental revisions of constraint hierarchies for ethical norms. Although some revisions of representations of ethical norms have been suggested, revisions of constraint hierarchies for ethical norms have not been completely investigated. In this paper, we categorize two fundamental types of revisions of such constraint hierarchies, namely preference revision and content revision. We also compare the effects of those revisions in terms of syntactic and semantic changes, which are common criteria for revisions of legal theories. From the comparison, we found that preference revision tentatively makes fewer syntactic changes. However, its computation is intractable and incomplete, and it potentially makes a large number of semantic changes. On the other hand, we show that content revision of constraint hierarchies can make a small number of semantic changes. However, content revision tentatively produces a large number of syntactic changes. This comparison leads to the possibility of optimization between preference revision and content revision, which we consider an interesting direction for future work.
This study aims at predicting the outcomes of legal cases based on the textual content of judicial decisions. We present a new corpus of Italian documents, consisting of 226 annotated decisions on Value Added Tax by Regional Tax law commissions. We address the task of predicting whether a request is upheld or rejected in the final decision. We employ traditional classifiers and NLP methods to assess which parts of the decision are more informative for the task.