DOI: 10.1145/3594536.3595146
research-article

Beyond Readability with RateMyPDF: A Combined Rule-based and Machine Learning Approach to Improving Court Forms

Published: 07 September 2023 Publication History

Abstract

In this paper, we describe RateMyPDF, a web application that helps authors measure and improve the usability of court forms. It offers a score together with automated suggestions for improving the form, drawn from both traditional machine learning approaches and the general-purpose GPT-3 large language model. We worked with form authors and usability experts to determine the set of features we measure, and validated them by gathering a dataset of approximately 24,000 PDF forms from 46 U.S. states and the District of Columbia. Our tool and automated measures allow a form author, or a court tasked with improving a large library of forms, to work at scale.
This paper describes the features that we find improve form usability, the results of our analysis of the large form dataset, details of the tool, and the implications of our tool for access to justice for self-represented litigants. We found that the RateMyPDF score correlates significantly with the scores of expert reviewers.
While the current version of the tool allows automated analysis of Microsoft Word and PDF court forms, the findings of our research apply equally to the growing number of automated, wizard-driven interactive legal applications that replace paper forms with interactive websites.
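
To make the idea of an automated, rule-based text measure concrete, the sketch below computes the classic Flesch Reading Ease score, the kind of readability baseline the title says the tool goes beyond. This is an illustration only and is not taken from RateMyPDF: the function names `count_syllables` and `flesch_reading_ease` are hypothetical, the syllable counter is a crude vowel-group heuristic, and the sample sentences are made up.

```python
import re


def count_syllables(word: str) -> int:
    """Crude heuristic (an assumption): one syllable per run of vowels, minimum 1."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores indicate easier text.

    score = 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)
    """
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word and one sentence")
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )


# Made-up sample sentences, for illustration only.
plain = "You must file this form. Sign it first."
legalese = (
    "The petitioner hereinafter designated shall execute the "
    "aforementioned instrument notwithstanding any contrary stipulation."
)

if __name__ == "__main__":
    print(round(flesch_reading_ease(plain), 1))
    print(round(flesch_reading_ease(legalese), 1))
```

A single formula like this looks only at sentence and word length, which is why a usability score would plausibly combine it with other measures of the form.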

Cited By

  • (2024) AI, Law and beyond. A transdisciplinary ecosystem for the future of AI & Law. Artificial Intelligence and Law. DOI: 10.1007/s10506-024-09404-y. Online publication date: 16-May-2024

Published In

ICAIL '23: Proceedings of the Nineteenth International Conference on Artificial Intelligence and Law
June 2023
499 pages
ISBN:9798400701979
DOI:10.1145/3594536
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

  • IAAIL: International Association for Artificial Intelligence and Law

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Accessibility
  2. Administrative Burden
  3. Automated Analysis
  4. Court Forms
  5. Law
  6. Readability

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICAIL 2023
Sponsor:
  • IAAIL

Acceptance Rates

Overall Acceptance Rate 69 of 169 submissions, 41%
