DOI: 10.1145/3377811.3380327
research-article

Unblind your apps: predicting natural-language labels for mobile GUI components by deep learning

Published: 01 October 2020

    Abstract

    According to the World Health Organization (WHO), an estimated 1.3 billion people worldwide live with some form of vision impairment, of whom 36 million are blind. Because of their disability, including this minority in society is a challenging problem. The recent rise of smartphones offers a new solution by giving blind users convenient access to information and services for understanding the world. Users with vision impairment can use the screen reader embedded in the mobile operating system to read the content of each screen within an app, and use gestures to interact with the phone. However, screen readers work only if developers add natural-language labels to image-based components when developing the app. Unfortunately, more than 77% of apps have missing-label issues, according to our analysis of 10,408 Android apps. Most of these issues are caused by developers' lack of awareness and knowledge of this minority's needs. Even developers who want to add labels to UI components may not come up with concise and clear descriptions, as most of them have no visual impairment. To overcome these challenges, we develop a deep-learning-based model, called LabelDroid, that automatically predicts the labels of image-based buttons by learning from large-scale commercial apps in Google Play. The experimental results show that our model makes accurate predictions and that the generated labels are of higher quality than those written by real Android developers.
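
    To make the "missing label" problem concrete, the following is a minimal sketch of how an image-based component without a label could be spotted in an Android layout file. It is an illustration only, not the analysis pipeline or the LabelDroid model from the paper. It assumes that the label is stored in the standard `android:contentDescription` attribute and that only `ImageButton` and `ImageView` elements are of interest; the function name `find_unlabeled_image_components` and the sample layout are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List

# Standard Android XML namespace; ElementTree exposes namespaced
# attributes as "{namespace}name".
ANDROID_NS = "http://schemas.android.com/apk/res/android"
CONTENT_DESC = "{%s}contentDescription" % ANDROID_NS
VIEW_ID = "{%s}id" % ANDROID_NS

# Assumption for this sketch: only these two widget types are checked.
IMAGE_TAGS = {"ImageButton", "ImageView"}


def find_unlabeled_image_components(layout_xml: str) -> List[str]:
    """Return the ids of image-based components with no usable label."""
    root = ET.fromstring(layout_xml)
    missing = []
    for elem in root.iter():
        if elem.tag in IMAGE_TAGS:
            # A missing or blank contentDescription gives a screen reader
            # nothing meaningful to announce.
            label = elem.get(CONTENT_DESC, "").strip()
            if not label:
                missing.append(elem.get(VIEW_ID, "<no id>"))
    return missing


# Hypothetical layout: one labeled button, one unlabeled button,
# and one image whose label is an empty string.
SAMPLE_LAYOUT = """
<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android">
    <ImageButton android:id="@+id/play" android:contentDescription="Play" />
    <ImageButton android:id="@+id/share" />
    <ImageView android:id="@+id/logo" android:contentDescription="" />
</LinearLayout>
"""

if __name__ == "__main__":
    print(find_unlabeled_image_components(SAMPLE_LAYOUT))
```

    A check like this only reports that a label is absent; writing a concise, clear label for each reported component is the step the paper proposes to automate.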

    Published In

    ICSE '20: Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering
    June 2020
    1640 pages
    ISBN:9781450371216
    DOI:10.1145/3377811

    Sponsors

    In-Cooperation

    • KIISE: Korean Institute of Information Scientists and Engineers
    • IEEE CS

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. accessibility
    2. content description
    3. image-based buttons
    4. neural networks
    5. user interface

    Qualifiers

    • Research-article

    Conference

    ICSE '20

    Acceptance Rates

    Overall Acceptance Rate 276 of 1,856 submissions, 15%

    Article Metrics

    • Downloads (Last 12 months): 296
    • Downloads (Last 6 weeks): 19
    Reflects downloads up to 27 Jul 2024

    Cited By

    • (2024) Towards Automated Accessibility Report Generation for Mobile Apps. ACM Transactions on Computer-Human Interaction. DOI: 10.1145/3674967. Online publication date: 2-Jul-2024
    • (2024) Effective, Platform-Independent GUI Testing via Image Embedding and Reinforcement Learning. ACM Transactions on Software Engineering and Methodology. DOI: 10.1145/3674728. Online publication date: 21-Jun-2024
    • (2024) Semi-supervised Crowdsourced Test Report Clustering via Screenshot-Text Binding Rules. Proceedings of the ACM on Software Engineering 1:FSE (1540-1563). DOI: 10.1145/3660776. Online publication date: 12-Jul-2024
    • (2024) Improving Web Accessibility through Artificial Intelligence: A Focus on Image Description Generation: Améliorer l'Accessibilité des Sites Web grâce à l'Intelligence Artificielle : Focus sur la Génération de Descriptions d'Images. Proceedings of the 35th International Francophone Conference on Human-Computer Interaction (1-13). DOI: 10.1145/3650104.3652908. Online publication date: 25-Mar-2024
    • (2024) Unblind Text Inputs: Predicting Hint-text of Text Input in Mobile Apps via LLM. Proceedings of the CHI Conference on Human Factors in Computing Systems (1-20). DOI: 10.1145/3613904.3642939. Online publication date: 11-May-2024
    • (2024) Empirical Investigation of Accessibility Bug Reports in Mobile Platforms: A Chromium Case Study. Proceedings of the CHI Conference on Human Factors in Computing Systems (1-17). DOI: 10.1145/3613904.3642508. Online publication date: 11-May-2024
    • (2024) "I tend to view ads almost like a pestilence": On the Accessibility Implications of Mobile Ads for Blind Users. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering (1-13). DOI: 10.1145/3597503.3639228. Online publication date: 20-May-2024
    • (2024) MotorEase: Automated Detection of Motor Impairment Accessibility Issues in Mobile App UIs. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering (1-13). DOI: 10.1145/3597503.3639167. Online publication date: 20-May-2024
    • (2024) The Power of Positionality: Why Accessibility? An Interview With Kevin Moran and Arun Krishnavajjala. IEEE Software 41:3 (91-94). DOI: 10.1109/MS.2024.3360650. Online publication date: May-2024
    • (2024) The Impact of Smartphone Applications on Visual Disabilities People. 2024 ASU International Conference in Emerging Technologies for Sustainability and Intelligent Systems (ICETSIS) (350-354). DOI: 10.1109/ICETSIS61505.2024.10459704. Online publication date: 28-Jan-2024
