DOI: 10.1145/3600160.3605069
research-article
Open access

Towards Grammatical Tagging for the Legal Language of Cybersecurity

Published: 29 August 2023

Abstract

Legal language can be understood as the language typically used by those engaged in the legal profession and, as such, it may come in either spoken or written form. Recent legislation on cybersecurity uses legal language in writing, thus inheriting all its interpretative complications, which stem from the typical abundance of cases and sub-cases as well as from the general richness in detail. This paper addresses the challenge of the essential interpretation of the legal language of cybersecurity, namely the extraction of the essential Parts of Speech (POS) from legal documents concerning cybersecurity.
The challenge is overcome by our methodology for POS tagging of legal language. It leverages state-of-the-art open-source tools for Natural Language Processing (NLP) as well as manual analysis to validate the outcomes of the tools. As a result, the methodology is automated and, arguably, general for any legal language following minor tailoring of the preprocessing step. It is demonstrated on the most relevant EU legislation on cybersecurity, namely the NIS 2 directive, producing the first, albeit essential, structured interpretation of such a relevant document. Moreover, our findings indicate that tools such as spaCy and ClausIE reach their limits on the legal language of NIS 2.
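
The abstract does not describe what the preprocessing step does, so the following is only a hypothetical sketch of the kind of tailoring it might involve: flattening an enumerated legal provision into plain clauses before they are passed to a tagger. The function name `preprocess`, the two regular expressions and the sample text are all assumptions made for illustration. The sample is invented and is not quoted from the NIS 2 directive.

```python
import re

# Assumed list markers at the start of a line: "1.", "(a)", "(iv)".
_ENUM = re.compile(r"^\s*(?:\(\w{1,4}\)|\d+\.)\s+")
# Assumed clause endings to drop: ";", ":", "," optionally followed by "and"/"or".
_TAIL = re.compile(r"[;:,]\s*(?:and|or)?$")


def preprocess(text: str) -> list[str]:
    """Flatten an enumerated provision into one clause per list item."""
    clauses = []
    for line in text.splitlines():
        line = _ENUM.sub("", line)                # drop the list marker
        line = re.sub(r"\s+", " ", line).strip()  # normalise whitespace
        line = _TAIL.sub("", line)                # drop the trailing connector
        if line:
            clauses.append(line)
    return clauses


# Invented sample text, used only to exercise the function.
SAMPLE = """1. Member States shall ensure that essential entities:
   (a) notify the competent authority   without undue delay; and
   (b) submit a final report."""

clauses = preprocess(SAMPLE)
```

Each resulting clause could then be handed to a POS tagger such as spaCy. Sub-clauses that lose their governing subject this way ("notify the competent authority ...") are one plausible reason why manual validation of the tool output remains necessary.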


Cited By

  • (2025) SecOnto. Computers and Security 148:C. DOI: 10.1016/j.cose.2024.104150. Online publication date: 1-Jan-2025
  • (2025) An effective anonymous authentication and key negotiation protocol for legal system network. Alexandria Engineering Journal 115 (434-442). DOI: 10.1016/j.aej.2024.12.032. Online publication date: Mar-2025
  • (2024) A behaviouristic semantic approach to blockchain-based e-commerce. Semantic Web 15:5 (1863-1914). DOI: 10.3233/SW-243543. Online publication date: 9-Oct-2024

Published In

ARES '23: Proceedings of the 18th International Conference on Availability, Reliability and Security
August 2023
1440 pages
ISBN:9798400707728
DOI:10.1145/3600160
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Act
  2. Data Protection
  3. NLP
  4. POS tagging
  5. Privacy
  6. Pronouncement

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ARES 2023

Acceptance Rates

Overall Acceptance Rate 228 of 451 submissions, 51%
