Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
research-article

Contrastive Learning for Legal Judgment Prediction

Published: 21 April 2023 Publication History

Abstract

Legal judgment prediction (LJP) is a fundamental task of legal artificial intelligence. It aims to automatically predict the judgment results of legal cases. Three typical subtasks are relevant law article prediction, charge prediction, and term-of-penalty prediction. Due to the wide range of potential applications, LJP has attracted a great deal of interest, prompting the development of numerous approaches. These methods mainly focus on building a more accurate representation of a case’s fact description in order to improve the performance of judgment prediction. They overlook, however, the practical judicial scenario in which human judges often compare similar law articles or possible charges before making a final decision. To this end, we propose a supervised contrastive learning framework for the LJP task. Specifically, we train the model to distinguish (1) various law articles within the same chapter of a Law and (2) similar charges of the same law article or related law articles. By this means, the fine-grained differences between similar articles/charges can be captured, which are important for making a judgment. Besides, we optimize our model by identifying cases with the same article/charge labels, allowing it to more effectively model the relationship between the case’s fact description and its associated labels. By jointly learning the LJP task with the aforementioned contrastive learning tasks, our model achieves better performance than the state-of-the-art models on two real-world datasets.

References

[1]
Huajie Chen, Deng Cai, Wei Dai, Zehui Dai, and Yadong Ding. 2019. Charge-based prison term prediction with deep gating network. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP’19). Association for Computational Linguistics, 6361–6366.
[2]
Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey E. Hinton. 2020. A simple framework for contrastive learning of visual representations. In Proceedings of the 37th International Conference on Machine Learning (ICML’20), Virtual Event(Proceedings of Machine Learning Research, Vol. 119). PMLR, 1597–1607. http://proceedings.mlr.press/v119/chen20j.html.
[3]
Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, and Ziqing Yang. 2021. Pre-training with whole word masking for Chinese BERT. IEEE ACM Trans. Audio Speech Lang. Process. 29 (2021), 3504–3514.
[4]
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT’19), Volume 1 (Long and Short Papers). Association for Computational Linguistics, 4171–4186.
[5]
Qian Dong and Shuzi Niu. 2021. Legal judgment prediction via relational learning. In The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’21), Virtual Event. ACM, 983–992.
[6]
Tianyu Gao, Xingcheng Yao, and Danqi Chen. 2021. SimCSE: Simple contrastive learning of sentence embeddings. CoRR abs/2104.08821 (2021). arxiv:2104.08821https://arxiv.org/abs/2104.08821.
[7]
Anne von der Lieth Gardner. 1984. An Artificial Intelligence Approach to Legal Reasoning. Ph. D. Dissertation. Stanford University.
[8]
Raia Hadsell, Sumit Chopra, and Yann LeCun. 2006. Dimensionality reduction by learning an invariant mapping. In 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06). IEEE Computer Society, 1735–1742.
[9]
Kaiming He, Haoqi Fan, Yuxin Wu, Saining Xie, and Ross B. Girshick. 2020. Momentum contrast for unsupervised visual representation learning. In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR’20). Computer Vision Foundation/IEEE, 9726–9735.
[10]
Zikun Hu, Xiang Li, Cunchao Tu, Zhiyuan Liu, and Maosong Sun. 2018. Few-shot charge prediction with discriminative legal attributes. In Proceedings of the 27th International Conference on Computational Linguistics (COLING’18). Association for Computational Linguistics, 487–498. https://aclanthology.org/C18-1041/.
[11]
Prannay Khosla, Piotr Teterwak, Chen Wang, Aaron Sarna, Yonglong Tian, Phillip Isola, Aaron Maschinot, Ce Liu, and Dilip Krishnan. 2020. Supervised contrastive learning. In Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020 (NeurIPS’20), virtual. https://proceedings.neurips.cc/paper/2020/hash/d89a66c7c80a29b1bdbab0f2a1a94af8-Abstract.html.
[12]
Yoon Kim. 2014. Convolutional neural networks for sentence classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP’14), A meeting of SIGDAT, a Special Interest Group of the ACL, Alessandro Moschitti, Bo Pang, and Walter Daelemans (Eds.). ACL, 1746–1751.
[13]
Fred Kort. 1957. Predicting supreme court decisions mathematically: A quantitative analysis of the “right to counsel” cases. American Political Science Review 51, 1 (1957), 1–12.
[14]
Siwei Lai, Liheng Xu, Kang Liu, and Jun Zhao. 2015. Recurrent convolutional neural networks for text classification. In Proceedings of the 29th AAAI Conference on Artificial Intelligence, Blai Bonet and Sven Koenig (Eds.). AAAI Press, 2267–2273. http://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/view/9745.
[15]
Ilya Loshchilov and Frank Hutter. 2019. Decoupled weight decay regularization. In 7th International Conference on Learning Representations (ICLR’19). OpenReview.net. https://openreview.net/forum?id=Bkg6RiCqY7.
[16]
Bingfeng Luo, Yansong Feng, Jianbo Xu, Xiang Zhang, and Dongyan Zhao. 2017. Learning to predict charges for criminal cases with legal basis. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP’17). Association for Computational Linguistics, 2727–2736.
[17]
Luyao Ma, Yating Zhang, Tianyi Wang, Xiaozhong Liu, Wei Ye, Changlong Sun, and Shikun Zhang. 2021. Legal judgment prediction with multi-stage case representation learning in the real court setting. In The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’21), Virtual Event. ACM, 993–1002.
[18]
Yu Meng, Chenyan Xiong, Payal Bajaj, Saurabh Tiwary, Paul Bennett, Jiawei Han, and Xia Song. 2021. COCO-LM: Correcting and contrasting text sequences for language model pretraining. CoRR abs/2102.08473 (2021). arxiv:2102.08473https://arxiv.org/abs/2102.08473.
[19]
Tomás Mikolov, Ilya Sutskever, Kai Chen, Gregory S. Corrado, and Jeffrey Dean. 2013. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013. 3111–3119. https://proceedings.neurips.cc/paper/2013/hash/9aa42b31882ec039965f3c4923ce901b-Abstract.html.
[20]
Stuart S. Nagel. 1963. Applying correlation analysis to case prediction. Tex. L. Rev. 42 (1963), 1006.
[21]
Jeffrey A. Segal. 1984. Predicting supreme court cases probabilistically: The search and seizure cases, 1962-1981. American Political Science Review 78, 4 (1984), 891–900.
[22]
Zhan Su, Zhicheng Dou, Yutao Zhu, Xubo Qin, and Ji-Rong Wen. 2021. Modeling intent graph for search result diversification. In The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’21), Virtual Event, Fernando Diaz, Chirag Shah, Torsten Suel, Pablo Castells, Rosie Jones, and Tetsuya Sakai (Eds.). ACM, 736–746.
[23]
Maosong Sun, Xinxiong Chen, Kaixu Zhang, Zhipeng Guo, and Zhiyuan Liu. 2016. Thulac: An efficient lexical analyzer for Chinese.
[24]
Johan A. K. Suykens and Joos Vandewalle. 1999. Least squares support vector machine classifiers. Neural Process. Lett. 9, 3 (1999), 293–300.
[25]
S. Sidney Ulmer. 1963. Quantitative analysis of judicial processes: Some practical and theoretical applications. Law and Contemporary Problems 28, 1 (1963), 164–184.
[26]
Josef Valvoda, Ryan Cotterell, and Simone Teufel. 2022. On the role of negative precedent in legal outcome prediction. CoRR abs/2208.08225 (2022). arXiv:2208.08225
[27]
Aäron van den Oord, Yazhe Li, and Oriol Vinyals. 2018. Representation learning with contrastive predictive coding. CoRR abs/1807.03748 (2018). arXiv:1807.03748http://arxiv.org/abs/1807.03748.
[28]
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017. 5998–6008. https://proceedings.neurips.cc/paper/2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html.
[29]
Pengfei Wang, Yu Fan, Shuzi Niu, Ze Yang, Yongfeng Zhang, and Jiafeng Guo. 2019. Hierarchical matching network for crime classification. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’19). ACM, 325–334.
[30]
Tongzhou Wang and Phillip Isola. 2020. Understanding contrastive representation learning through alignment and uniformity on the hypersphere. In Proceedings of the 37th International Conference on Machine Learning (ICML’20), Virtual Event(Proceedings of Machine Learning Research, Vol. 119). PMLR, 9929–9939. http://proceedings.mlr.press/v119/wang20k.html.
[31]
Yiquan Wu, Kun Kuang, Yating Zhang, Xiaozhong Liu, Changlong Sun, Jun Xiao, Yueting Zhuang, Luo Si, and Fei Wu. 2020. De-biased court’s view generation with causality. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP’20), Online, Bonnie Webber, Trevor Cohn, Yulan He, and Yang Liu (Eds.). Association for Computational Linguistics, 763–780.
[32]
Zhuofeng Wu, Sinong Wang, Jiatao Gu, Madian Khabsa, Fei Sun, and Hao Ma. 2020. CLEAR: Contrastive learning for sentence representation. CoRR abs/2012.15466 (2020). arxiv:2012.15466https://arxiv.org/abs/2012.15466.
[33]
Chaojun Xiao, Haoxi Zhong, Zhipeng Guo, Cunchao Tu, Zhiyuan Liu, Maosong Sun, Yansong Feng, Xianpei Han, Zhen Hu, Heng Wang, and Jianfeng Xu. 2018. CAIL2018: A large-scale legal dataset for judgment prediction. CoRR abs/1807.02478 (2018). arXiv:1807.02478http://arxiv.org/abs/1807.02478.
[34]
Nuo Xu, Pinghui Wang, Long Chen, Li Pan, Xiaoyan Wang, and Junzhou Zhao. 2020. Distinguish confusing law articles for legal judgment prediction. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL’20), Online. Association for Computational Linguistics, 3086–3095.
[35]
Wenmian Yang, Weijia Jia, Xiaojie Zhou, and Yutao Luo. 2019. Legal judgment prediction via multi-perspective bi-feedback network. In Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI’19). ijcai.org, 4085–4091.
[36]
Linan Yue, Qi Liu, Binbin Jin, Han Wu, Kai Zhang, Yanqing An, Mingyue Cheng, Biao Yin, and Dayong Wu. 2021. NeurJudge: A circumstance-aware neural framework for legal judgment prediction. In The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’21), Virtual Event. ACM, 973–982.
[37]
Han Zhang, Zhicheng Dou, Yutao Zhu, and Jirong Wen. 2021. Few-shot charge prediction with multi-grained features and mutual information. In Chinese Computational Linguistics - 20th China National Conference (CCL’21), Proceedings(Lecture Notes in Computer Science, Vol. 12869), Sheng Li, Maosong Sun, Yang Liu, Hua Wu, Kang Liu, Wanxiang Che, Shizhu He, and Gaoqi Rao (Eds.). Springer, 387–403.
[38]
Haoxi Zhong, Zhipeng Guo, Cunchao Tu, Chaojun Xiao, Zhiyuan Liu, and Maosong Sun. 2018. Legal judgment prediction via topological learning. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 3540–3549.
[39]
Haoxi Zhong, Zhengyan Zhang, Zhiyuan Liu, and Maosong Sun. 2019. Open Chinese Language Pre-trained Model Zoo. Technical Report. https://github.com/thunlp/openclap.
[40]
Yujia Zhou, Zhicheng Dou, Yutao Zhu, and Ji-Rong Wen. 2021. PSSL: Self-supervised learning for personalized search with contrastive sampling. In The 30th ACM International Conference on Information and Knowledge Management (CIKM’21), Virtual Event. ACM, 2749–2758.
[41]
Yutao Zhu, Jian-Yun Nie, Zhicheng Dou, Zhengyi Ma, Xinyu Zhang, Pan Du, Xiaochen Zuo, and Hao Jiang. 2021. Contrastive learning of user behavior sequence for context-aware document ranking. In The 30th ACM International Conference on Information and Knowledge Management (CIKM’21), Virtual Event. ACM, 2780–2791.
[42]
Yutao Zhu, Jian-Yun Nie, Zhicheng Dou, Zhengyi Ma, Xinyu Zhang, Pan Du, Xiaochen Zuo, and Hao Jiang. 2021. Contrastive learning of user behavior sequence for context-aware document ranking. In The 30th ACM International Conference on Information and Knowledge Management (CIKM’21), Virtual Event, Gianluca Demartini, Guido Zuccon, J. Shane Culpepper, Zi Huang, and Hanghang Tong (Eds.). ACM, 2780–2791.
[43]
Yutao Zhu, Kun Zhou, Jian-Yun Nie, Shengchao Liu, and Zhicheng Dou. 2021. Neural sentence ordering based on constraint graphs. In 35th AAAI Conference on Artificial Intelligence (AAAI’21), 33rd Conference on Innovative Applications of Artificial Intelligence (IAAI’21), 11th Symposium on Educational Advances in Artificial Intelligence (EAAI’21), Virtual Event. AAAI Press, 14656–14664. https://ojs.aaai.org/index.php/AAAI/article/view/17722.

Cited By

View all
  • (2024)Using Bidirectional Encoder Representations from Transformers (BERT) to predict criminal charges and sentences from Taiwanese court judgmentsPeerJ Computer Science10.7717/peerj-cs.184110(e1841)Online publication date: 31-Jan-2024
  • (2024)Invisible Black-Box Backdoor Attack against Deep Cross-Modal Hashing RetrievalACM Transactions on Information Systems10.1145/365020542:4(1-27)Online publication date: 26-Apr-2024
  • (2024)Decentralized Federated Recommendation with Privacy-aware Structured Client-level GraphACM Transactions on Intelligent Systems and Technology10.1145/364128715:4(1-23)Online publication date: 22-Jan-2024
  • Show More Cited By

Index Terms

  1. Contrastive Learning for Legal Judgment Prediction

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image ACM Transactions on Information Systems
    ACM Transactions on Information Systems  Volume 41, Issue 4
    October 2023
    958 pages
    ISSN:1046-8188
    EISSN:1558-2868
    DOI:10.1145/3587261
    Issue’s Table of Contents

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 21 April 2023
    Online AM: 18 January 2023
    Accepted: 05 January 2023
    Revised: 04 November 2022
    Received: 02 August 2022
    Published in TOIS Volume 41, Issue 4

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. Deep learning
    2. legal judgment prediction
    3. supervised contrastive learning
    4. legal artificial intelligence
    5. law

    Qualifiers

    • Research-article

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)679
    • Downloads (Last 6 weeks)62
    Reflects downloads up to 10 Oct 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)Using Bidirectional Encoder Representations from Transformers (BERT) to predict criminal charges and sentences from Taiwanese court judgmentsPeerJ Computer Science10.7717/peerj-cs.184110(e1841)Online publication date: 31-Jan-2024
    • (2024)Invisible Black-Box Backdoor Attack against Deep Cross-Modal Hashing RetrievalACM Transactions on Information Systems10.1145/365020542:4(1-27)Online publication date: 26-Apr-2024
    • (2024)Decentralized Federated Recommendation with Privacy-aware Structured Client-level GraphACM Transactions on Intelligent Systems and Technology10.1145/364128715:4(1-23)Online publication date: 22-Jan-2024
    • (2024)Should Fairness be a Metric or a Model? A Model-based Framework for Assessing Bias in Machine Learning PipelinesACM Transactions on Information Systems10.1145/364127642:4(1-41)Online publication date: 22-Mar-2024
    • (2024)Cross-domain Recommendation via Dual Adversarial AdaptationACM Transactions on Information Systems10.1145/363252442:3(1-26)Online publication date: 22-Jan-2024
    • (2024)Legal Statute Identification: A Case Study using State-of-the-Art Datasets and MethodsProceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval10.1145/3626772.3657879(2231-2240)Online publication date: 10-Jul-2024
    • (2024)DDPO: Direct Dual Propensity Optimization for Post-Click Conversion Rate EstimationProceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval10.1145/3626772.3657817(1179-1188)Online publication date: 10-Jul-2024
    • (2024)Universal Adversarial Perturbations for Vision-Language Pre-trained ModelsProceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval10.1145/3626772.3657781(862-871)Online publication date: 10-Jul-2024
    • (2024)Exploring the Individuality and Collectivity of Intents behind Interactions for Graph Collaborative FilteringProceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval10.1145/3626772.3657738(1253-1262)Online publication date: 10-Jul-2024
    • (2024)Event Grounded Criminal Court View Generation with Cooperative (Large) Language ModelsProceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval10.1145/3626772.3657698(2221-2230)Online publication date: 10-Jul-2024
    • Show More Cited By

    View Options

    Get Access

    Login options

    Full Access

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Full Text

    View this article in Full Text.

    Full Text

    HTML Format

    View this article in HTML Format.

    HTML Format

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media