All in One Place: Ensuring Usable Access to Online Shopping Items for Blind Users

Authors:

Akshay Kolgar Nayak,

Sampath Jayarathna,

Vikas Ashok

Proceedings of the ACM on Human-Computer Interaction, Volume 8, Issue EICS

Article No.: 257, Pages 1 - 25

https://doi.org/10.1145/3664639

Published: 17 June 2024

Abstract

Perusing web data items such as shopping products is a core online user activity. To prevent information overload, the content associated with data items is typically dispersed across multiple sections of multiple webpages. However, this distribution has the unintended side effect of significantly increasing the interaction burden for blind users, since navigating back and forth between different sections in different pages is tedious and cumbersome with a screen reader. While existing works have proposed methods for the context of a single webpage, solutions enabling usable access to content distributed across multiple webpages are few and far between. In this paper, we present InstaFetch, a browser extension that dynamically generates an alternative screen reader-friendly user interface in real time. Blind users can leverage this interface to almost instantly access different item-related information, such as the description, full specification, and user reviews, all in one place, without having to navigate to different sections in different webpages. InstaFetch also supports natural language queries about any item, a feature blind users can exploit to quickly obtain desired information without manually trudging through reams of text. In a study with 14 blind users, we observed that participants needed significantly less time to peruse data items with InstaFetch than with a state-of-the-art solution.

Index Terms

1. Human-centered computing
  1. Accessibility
    1. Accessibility technologies
    2. Empirical studies in accessibility

Published In

Proceedings of the ACM on Human-Computer Interaction Volume 8, Issue EICS

EICS

June 2024

589 pages

EISSN:2573-0142

DOI:10.1145/3673909

Copyright © 2024 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 17 June 2024

Accepted: 01 April 2024

Revised: 01 April 2024

Received: 01 February 2024

Published in PACMHCI Volume 8, Issue EICS

Qualifiers

Research-article
Research
Refereed
