DOI: 10.1145/3490099.3511113

Opportunities for Human-AI Collaboration in Remote Sighted Assistance

Published: 22 March 2022

Abstract

Remote sighted assistance (RSA) has emerged as a conversational assistive technology for people with visual impairments (VI), in which remote sighted agents provide real-time navigational assistance via video-chat-like communication. In this paper, we conducted a literature review and interviewed 12 RSA users to comprehensively understand the technical and navigational challenges in RSA for both agents and users. Technical challenges are organized into four categories: agents' difficulties in orienting and localizing users; acquiring users' surroundings and detecting obstacles; delivering information and understanding user-specific situations; and coping with poor network connections. Navigational challenges are presented in 15 real-world scenarios (8 outdoor, 7 indoor) for the users. Prior work indicates that computer vision (CV) technologies, especially interactive 3D maps and real-time localization, can address a subset of these challenges. However, we argue that addressing the full spectrum of these challenges warrants new developments in human-CV collaboration, which we formalize as five emerging problems: making object recognition and obstacle avoidance algorithms blind-aware; localizing users under poor network conditions; recognizing digital content on LCD screens; recognizing text on irregular surfaces; and predicting the trajectories of out-of-frame pedestrians or objects. Addressing these problems can advance computer vision research and usher in the next generation of RSA services.
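The first of the five emerging problems, making object recognition and obstacle avoidance "blind-aware", can be illustrated with a minimal sketch: rather than announcing every detected object, a blind-aware layer would keep only detections inside the user's walking corridor and within acting distance, reporting the nearest first. Everything below (the `Detection` record, `blind_aware_filter`, the 30° corridor, the 5 m horizon) is a hypothetical illustration of the idea, not an implementation from the paper.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    """One recognized object (hypothetical schema for illustration)."""
    label: str
    bearing_deg: float   # angle relative to the user's heading, in degrees
    distance_m: float    # estimated distance to the object, in meters

def blind_aware_filter(detections, corridor_deg=30.0, horizon_m=5.0):
    """Keep only detections relevant to a walking blind user:
    objects inside a narrow corridor ahead and close enough to act on.
    Results are sorted nearest-first so an agent can prioritize warnings.
    The 30-degree corridor and 5 m horizon are assumed example thresholds."""
    relevant = [d for d in detections
                if abs(d.bearing_deg) <= corridor_deg / 2
                and d.distance_m <= horizon_m]
    return sorted(relevant, key=lambda d: d.distance_m)

# Example scene: only the on-path, nearby objects survive the filter.
scene = [
    Detection("trash can", bearing_deg=5.0, distance_m=2.0),
    Detection("parked car", bearing_deg=60.0, distance_m=3.0),  # off the path
    Detection("pedestrian", bearing_deg=-10.0, distance_m=4.5),
    Detection("sign", bearing_deg=0.0, distance_m=12.0),        # too far away
]
print([d.label for d in blind_aware_filter(scene)])  # ['trash can', 'pedestrian']
```

The design choice this sketches is prioritization: a generic detector treats all objects equally, while a blind-aware one ranks them by relevance to the user's immediate path, which is what reduces the agent's narration burden.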



Published In

IUI '22: Proceedings of the 27th International Conference on Intelligent User Interfaces
March 2022, 888 pages
ISBN: 9781450391443
DOI: 10.1145/3490099
          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          Published: 22 March 2022

          Author Tags

          1. 3D maps
          2. RSA
          3. artificial intelligence
          4. augmented reality
          5. blind
          6. camera
          7. computer vision
          8. conversational assistance
          9. navigation
          10. people with visual impairments
          11. remote sighted assistance
          12. smartphone

          Qualifiers

          • Research-article
          • Research
          • Refereed limited

Conference

IUI '22
          Acceptance Rates

          Overall Acceptance Rate 114 of 581 submissions, 20%

          Article Metrics

• Downloads (Last 12 months): 240
• Downloads (Last 6 weeks): 19
Reflects downloads up to 30 Jan 2025.

          Cited By

• (2024) Emotional Intelligence and Collaborative Dynamics in Industry 5.0 for Human-Machine Interactions. In Human-Machine Collaboration and Emotional Intelligence in Industry 5.0, 190–204. DOI: 10.4018/979-8-3693-6806-0.ch010. Online publication date: 30-Jun-2024.
• (2024) A Simulation and Training Platform for Remote-Sighted Assistance. Sensors 24, 23 (7773). DOI: 10.3390/s24237773. Online publication date: 4-Dec-2024.
• (2024) Human–AI Collaboration for Remote Sighted Assistance: Perspectives from the LLM Era. Future Internet 16, 7 (254). DOI: 10.3390/fi16070254. Online publication date: 18-Jul-2024.
• (2024) Investigating Use Cases of AI-Powered Scene Description Applications for Blind and Low Vision People. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1–21. DOI: 10.1145/3613904.3642211. Online publication date: 11-May-2024.
• (2024) BubbleCam: Engaging Privacy in Remote Sighted Assistance. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1–16. DOI: 10.1145/3613904.3642030. Online publication date: 11-May-2024.
• (2024) NeRF-Enhanced Outpainting for Faithful Field-of-View Extrapolation. In 2024 IEEE International Conference on Robotics and Automation (ICRA), 16826–16833. DOI: 10.1109/ICRA57147.2024.10611328. Online publication date: 13-May-2024.
• (2023) Formal Algebraic Model of an Edge Data Center with a Redundant Ring Topology. Network 3, 1 (142–157). DOI: 10.3390/network3010007. Online publication date: 30-Jan-2023.
• (2023) “Dump it, Destroy it, Send it to Data Heaven”: Blind People’s Expectations for Visual Privacy in Visual Assistance Technologies. In Proceedings of the 20th International Web for All Conference, 134–147. DOI: 10.1145/3587281.3587296. Online publication date: 30-Apr-2023.
• (2023) Are Two Heads Better than One? Investigating Remote Sighted Assistance with Paired Volunteers. In Proceedings of the 2023 ACM Designing Interactive Systems Conference, 1810–1825. DOI: 10.1145/3563657.3596019. Online publication date: 10-Jul-2023.
• (2023) Exploration of verbal descriptions and dynamic indoors environments for people with sight loss. In Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems, 1–6. DOI: 10.1145/3544549.3585883. Online publication date: 19-Apr-2023.
