TULSI: an NLP system for extracting legal modificatory provisions

Published in: Artificial Intelligence and Law

Abstract

In this work we present the TULSI system (so named after Turin University Legal Semantic Interpreter), a system to produce automatic annotations of normative documents through the extraction of modificatory provisions. TULSI relies on a deep syntactic analysis and a shallow semantic interpreter that are illustrated in detail. We report the results of an experimental evaluation of the system and discuss them, also suggesting future directions for further improvement.

Notes

  1. E.g., the technical syntagm “digital signature” is at the same time a legal concept and also a definition of a particular technology.

  2. E.g., the term “privacy” is a well-known legal term that represents a specific legal concept in all EU jurisdictions; in the US it has a different legal meaning, and in Italy the term is not used in any legal document on this topic.

  3. In particular, it is sometimes necessary to apply modern principles of legal interpretation that go beyond textualism and that integrate the objective literal interpretation with teleological scope, socio-economic goals and historical-cultural elements.

  4. “We can now define the concept of a normative system as the set of all the propositions that are consequences of the explicitly commanded propositions” (Alchourròn and Bulygin 1971).

  5. Also, since the Italian standard NormeInRete was being devised, the paper by Bolioli et al. (2002) provides the first description of a software system for the automated mark-up of Italian legal texts, funded by AIPA (the Italian authority for promoting information technologies in the Italian Public Administration).

  6. The files containing the DTD are available at the URL: http://www.digitpa.gov.it/standard-normeinrete.

  7. Further details about the input format are provided in Sect. 5.1.

  8. Corpo is the Italian word for body.

  9. Rif stands for riferimento, the Italian word for reference.

  10. Vir stands for virgolette, the Italian word for quotes.

  11. http://www.di.unito.it/~radicion/AI_LAW_2012/s2507280.xml.

  12. Excerpt from the file http://www.di.unito.it/~radicion/AI_LAW_2012/S2603265.xml.

  13. Chunks can be defined as groups of syntactically related and adjacent words (Abney 1991).

  14. The English pseudo-translation aims to keep the ordering of the Italian words. Conjunctions are labelled with subscripts for reference purposes in the following description.

  15. Some inputs contain no verbs. In this case, either the analysis consists of a single chunk (and the task is complete) or one of the chunks is chosen as the head and the others are attached to it. This choice is still made via heuristics.

  16. The <alinea> tag is an element used in the annotation of legal texts to govern an enumeration of elements.

  17. The dataset is available for download at the URL: http://www.di.unito.it/~radicion/AI_LAW_2012/.

  18. http://www.di.unito.it/~radicion/AI_LAW_2012/S2504725.xml.

  19. http://www.di.unito.it/~radicion/AI_LAW_2012/S2507927.xml.

  20. http://www.di.unito.it/~radicion/AI_LAW_2012/S2503953.xml.

  21. http://www.di.unito.it/~radicion/AI_LAW_2012/S2603483.xml.

  22. http://www.di.unito.it/~radicion/AI_LAW_2012/S2504283.xml.

  23. http://www.di.unito.it/~radicion/AI_LAW_2012/S2503953.xml; http://www.di.unito.it/~radicion/AI_LAW_2012/S2507829.xml.

  24. http://www.di.unito.it/~radicion/AI_LAW_2012/S2504283.xml.

  25. http://www.di.unito.it/~radicion/AI_LAW_2012/S2500670.xml.

  26. http://www.di.unito.it/~radicion/AI_LAW_2012/S2507829.xml.

  27. E.g., a sentence like ‘dalla data del RIF32, il RIF33 abrogato’ (from the date of RIF32, RIF33 is repealed) is ambiguous in Italian: the parser wrongly interprets the dependent ‘dalla data’ (translated as ‘from the date’) as the agent of a passive action, which is introduced by the same preposition.

  28. http://www.di.unito.it/~radicion/AI_LAW_2012/S2500947.xml.
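
Notes 9 and 27 mention placeholder tokens such as RIF32, which stand in for normative references in the text given to the parser. As a rough illustration only, and not the TULSI parser or its semantic interpreter, the following Python sketch uses a regular expression to pull the reference placeholders out of the sentence quoted in note 27. The function name `find_placeholders` and the pattern are hypothetical; the sketch assumes a placeholder is the literal RIF followed by digits, which is inferred from that single example.

```python
import re

# Assumption: a reference placeholder is "RIF" followed by one or more digits,
# as in the example sentence of note 27. The real input format may differ.
PLACEHOLDER = re.compile(r"\bRIF\d+\b")


def find_placeholders(sentence):
    """Return the RIF placeholders found in the sentence, in order of appearance."""
    return PLACEHOLDER.findall(sentence)


# Sentence quoted in note 27, copied as it appears there.
sentence = "dalla data del RIF32, il RIF33 abrogato"
placeholders = find_placeholders(sentence)
```

A surface pattern of this kind only lists the placeholders. It cannot tell which reference supplies the date and which one is repealed, and that is the kind of attachment decision note 27 attributes to the syntactic analysis.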

References

  • Abney SP (1991) Principle-based parsing: computation and psycholinguistics. In: Berwick RC, Abney SP, Tenny C (eds) Parsing by Chunks. Kluwer, Dordrecht

  • AIPA (2002) Formato per la rappresentazione elettronica dei provvedimenti normativi tramite il linguaggio di marcatura XML. Circolare n. AIPA/CR/40, 22 aprile

  • Alchourròn CE, Bulygin E (1971) Normativity and norms: critical perspectives on Kelsenian themes. In: Paulson SL, Litschewski-Paulson B (eds) The expressive conception of norms. Clarendon Press, Oxford

  • Alicante A, Bosco C, Corazza A, Lavelli A (2012) A treebank-based study on the influence of Italian word order on parsing performance. In: Calzolari N (Conference Chair), Choukri K, Declerck T, Uğur Doğan M, Maegaard B, Mariani J, Odijk J, Piperidis S (eds) Proceedings of the eighth international conference on language resources and evaluation (LREC’12), Istanbul, Turkey, May 2012. European Language Resources Association (ELRA)

  • Appelt DE, Israel D (1999) Introduction to information extraction technology. In: Proceedings of 16th international joint conference on artificial intelligence IJCAI-99, Tutorial

  • Arnold-Moore T (1995) Automatically processing amendments to legislation. In: ICAIL, pp 297–306

  • Arnold-Moore T (1997) Automatic generation of amendment legislation. In: Proceedings of the international conference on artificial intelligence and law (ICAIL), pp 56–62

  • Bartolini R, Lenci A, Montemagni S, Pirrelli V, Soria C (2004) Semantic mark-up of Italian legal texts through NLP-based techniques. In: Proceedings of LREC 2004, pp 795–798

  • Biagioli C, Francesconi E, Spinosa P, Taddei M (2003) The NIR project: standards and tools for legislative drafting and legal document web publication. In: Proceedings of ICAIL workshop on e-government: modelling norms and concepts as key issues, pp 69–78

  • Biagioli C, Francesconi E, Passerini A, Montemagni S, Soria C (2005) Automatic semantics extraction in law documents. In: ICAIL ’05: Proceedings of the 10th international conference on artificial intelligence and law. New York, NY, USA. ACM, pp 133–140

  • Bolioli A, Dini L, Mercatali P, Romano F (2002) For the automated mark-up of Italian legislative texts in XML. In: Bench-Capon T, Daskalopulu A, Winkels R (eds) Legal knowledge and information systems. Proceedings of Jurix 2002: the fifteenth annual conference. IOS Press

  • Bosco C, Montemagni S, Mazzei A, Lombardo V, Dell’Orletta F, Lenci A (2009) Evalita’09 parsing task: comparing dependency parsers and treebanks. In: Proceedings of Evalita’09. Reggio Emilia, Italy

  • Brighi R, Palmirani M (2009) Legal text analysis of the modification provisions: a pattern oriented approach. In: Proceedings of the international conference on artificial intelligence and law (ICAIL)

  • Cherubini M, Giardiello G, Marchi S, Montemagni S, Spinosa PL, Venturi G (2008) NLP-based metadata annotation of textual amendments. In: Proceedings of workshop on legislative XML 2008, Jurix

  • Collins M (1997) Three generative, lexicalised models for statistical parsing. In: Proceedings of the 35th annual meeting of the association for computational linguistics, pp 16–23

  • de Maat E, Winkels R, van Engers TM (2006) Automated detection of reference structures in law. In: van Engers TM (ed) Proceedings of the JURIX 2006 on legal knowledge and information systems: the nineteenth annual conference. IOS Press, Amsterdam, pp 41–50

  • de Maat E, Krabben K, Winkels R (2010) Machine learning versus knowledge based classification of legal texts. In: Proceedings of the 2010 conference on legal knowledge and information systems: JURIX 2010: the twenty-third annual conference. IOS Press, Amsterdam, pp 87–96

  • De Salvo Braz R, Girju R, Punyakanok V, Dan R, Sammons M (2005) An inference model for semantic entailment in natural language. In: AAAI’05: Proceedings of the 20th national conference on artificial intelligence. AAAI Press, pp 1043–1049

  • Domingos P (1999) The role of Occam’s razor in knowledge discovery. Data Min Knowl Discov 3:409–425

  • Haghighi AD, Ng AY, Manning CD (2005) Robust textual inference via graph matching. In: HLT ’05: Proceedings of the conference on human language technology and empirical methods in NLP, Morristown, NJ, USA, 2005. ACL, pp 387–394

  • Jackson P, Moulinier I (2002) Natural language processing for online applications. Text retrieval, extraction and categorization, vol 5 of natural language processing. Benjamins, Amsterdam, Philadelphia

  • Lesmo L (2007) The rule-based parser of the NLP group of the University of Torino. Intell Artif 2(4):46–47

  • Lesmo L, Lombardo V (2002) Transformed subcategorization frames in chunk parsing. In: Proceedings of the 3rd international conference on language resources and evaluation (LREC 2002), Las Palmas, pp 512–519

  • Lupo C, Vitali F, Francesconi E, Palmirani M, Winkels R, de Maat E, Boer A, Mascellani P (2007) General XML format(s) for legal Sources—ESTRELLA European project. Deliverable 3.1, Faculty of Law, University of Amsterdam, Amsterdam

  • McCarty LT (2007) Deep semantic interpretations of legal texts. In: ICAIL ’07: Proceedings of the 11th international conference on Artificial intelligence and law. New York, NY, USA, ACM, pp 217–224

  • Ogawa Y, Inagaki S, Toyama K (2008) Automatic consolidation of Japanese statutes based on formalization of amendment sentences. In: Proceedings of the 2007 conference on New frontiers in artificial intelligence, JSAI’07, Berlin, Heidelberg, 2008. Springer, pp 363–376

  • Palmirani M (2011) Legislative change management with Akoma-Ntoso. In: Sartor G, Palmirani M, Francesconi E, Angela Biasiotti MA (eds) Legislative XML for the Semantic Web. Springer, Berlin

  • Palmirani M, Benigni F (2007) Norma-system: a legal information system for managing time. In: Biagioli C, Francesconi E, Sartor G (eds) Proceedings of the V legislative XML workshop. European Press Academic Publishing, Feb 2007, pp 205–223

  • Palmirani M, Brighi R (2003) An XML editor for legal information management. In: Traunmüller R (ed) Electronic government, vol 2739 of LNCS. Springer, Berlin, pp 421–429

  • Palmirani M, Brighi R (2006) Time model for managing the dynamic of normative system. Electron Gov. Lecture notes in computer science, vol 4084. Springer, pp 207–218

  • Palmirani M, Brighi R (2010) Model regularity of legal language in active modifications. In: Biasiotti M et al (eds) AICOL workshops 2009. Springer, Berlin, pp 54–73

  • Palmirani M, Brighi R, Massini M (2004) Processing normative references on the basis of natural language questions. In: DEXA ’04 Proceedings of the database and expert systems applications, 15th international workshop. IEEE Computer Society, pp 9–12

  • Rodotà S (1998) La tecnica legislativa per clausole generali in Italia. In: Cabella Pisu L, Nanni L (eds) Clausole e principi generali nell’argomentazione giurisprudenziale degli anni novanta. Cedam, Padova

  • Sacco R (2000) Lingua e diritto. Ars Interpretandi. Annuario di ermeneutica giuridica. Traduzione e diritto 5:117–134

  • Sagri MT, Tiscornia D (2009) Le peculiarità del linguaggio giuridico. Problemi e prospettive nel contesto multilingue Europeo. MediAzioni 7. http://mediazioni.sitlec.unibo.it. ISSN 1974-4382

  • Saias J, Quaresma P (2004) A methodology to create legal ontologies in a logic programming based web information retrieval system. Artif Intell Law 12(4):397–417

  • Sartor G (1996) Riferimenti normativi e dinamica dei nessi normativi. In: Il procedimento normativo regionale. Cedam, Padova, pp 151–164

  • Soria C, Bartolini R, Lenci A, Montemagni S, Pirrelli V (2007) Automatic extraction of semantics in law documents. In: Biagioli C, Francesconi E, Sartor G (eds) Proceedings of the V legislative XML workshop. European Press Academic Publishing, pp 253–266

  • Spinosa PL, Giardiello G, Cherubini M, Marchi S, Venturi G, Montemagni S (2009) NLP-based metadata extraction for legal text consolidation. In: Proceedings of the 12th international conference on artificial intelligence and law, ICAIL ’09. New York, NY, USA, 2009. ACM, pp 40–49

  • Wyner A (2011) Towards annotating and extracting textual legal case elements. In: Francesconi E (ed) Informatica e Diritto: special issue on legal ontologies and artificial intelligent techniques 19(1–2):9–18 ESI

  • Zanchetta E, Baroni M (2005) Morph-it! A free corpus-based morphological resource for the Italian language. Corpus Linguistics 2005 1(1). http://www.corpus.bham.ac.uk/PCLC/

Author information

Corresponding author

Correspondence to Daniele P. Radicioni.

About this article

Cite this article

Lesmo, L., Mazzei, A., Palmirani, M. et al. TULSI: an NLP system for extracting legal modificatory provisions. Artif Intell Law 21, 139–172 (2013). https://doi.org/10.1007/s10506-012-9127-6

  • DOI: https://doi.org/10.1007/s10506-012-9127-6
