DOI: 10.1145/3490099.3511113

Opportunities for Human-AI Collaboration in Remote Sighted Assistance

Published: 22 March 2022

Abstract

Remote sighted assistance (RSA) has emerged as a conversational assistive technology for people with visual impairments (VI), in which remote sighted agents provide real-time navigational assistance via video-chat-like communication. In this paper, we conducted a literature review and interviewed 12 RSA users to comprehensively understand the technical and navigational challenges in RSA for both agents and users. Technical challenges are organized into four categories: agents' difficulties in orienting and localizing users; acquiring users' surroundings and detecting obstacles; delivering information and understanding user-specific situations; and coping with poor network connections. Navigational challenges are presented in 15 real-world scenarios (8 outdoor, 7 indoor) for the users. Prior work indicates that computer vision (CV) technologies, especially interactive 3D maps and real-time localization, can address a subset of these challenges. However, we argue that addressing the full spectrum of these challenges warrants new developments in human-CV collaboration, which we formalize as five emerging problems: making object recognition and obstacle avoidance algorithms blind-aware; localizing users under poor network conditions; recognizing digital content on LCD screens; recognizing text on irregular surfaces; and predicting the trajectories of out-of-frame pedestrians or objects. Addressing these problems can advance computer vision research and usher in the next generation of RSA services.
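The first of the five emerging problems, making object recognition and obstacle avoidance "blind-aware", can be illustrated with a minimal sketch: rather than announcing every detected object, a blind-aware layer would keep only detections inside the user's walking corridor and within acting distance, reporting the nearest first. Everything below (the `Detection` record, `blind_aware_filter`, the 30° corridor, the 5 m horizon) is a hypothetical illustration of the idea, not an implementation from the paper.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    """One recognized object (hypothetical schema for illustration)."""
    label: str
    bearing_deg: float   # angle relative to the user's heading, in degrees
    distance_m: float    # estimated distance to the object, in meters

def blind_aware_filter(detections, corridor_deg=30.0, horizon_m=5.0):
    """Keep only detections relevant to a walking blind user:
    objects inside a narrow corridor ahead and close enough to act on.
    Results are sorted nearest-first so an agent can prioritize warnings.
    The 30-degree corridor and 5 m horizon are assumed example thresholds."""
    relevant = [d for d in detections
                if abs(d.bearing_deg) <= corridor_deg / 2
                and d.distance_m <= horizon_m]
    return sorted(relevant, key=lambda d: d.distance_m)

# Example scene: only the on-path, nearby objects survive the filter.
scene = [
    Detection("trash can", bearing_deg=5.0, distance_m=2.0),
    Detection("parked car", bearing_deg=60.0, distance_m=3.0),  # off the path
    Detection("pedestrian", bearing_deg=-10.0, distance_m=4.5),
    Detection("sign", bearing_deg=0.0, distance_m=12.0),        # too far away
]
print([d.label for d in blind_aware_filter(scene)])  # ['trash can', 'pedestrian']
```

The design choice this sketches is prioritization: a generic detector treats all objects equally, while a blind-aware one ranks them by relevance to the user's immediate path, which is what reduces the agent's narration burden.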



Published In

IUI '22: Proceedings of the 27th International Conference on Intelligent User Interfaces
March 2022, 888 pages
ISBN: 9781450391443
DOI: 10.1145/3490099
          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          Published: 22 March 2022

          Author Tags

          1. 3D maps
          2. RSA
          3. artificial intelligence
          4. augmented reality
          5. blind
          6. camera
          7. computer vision
          8. conversational assistance
          9. navigation
          10. people with visual impairments
          11. remote sighted assistance
          12. smartphone

          Qualifiers

          • Research-article
          • Research
          • Refereed limited

Conference

IUI '22
          Acceptance Rates

          Overall Acceptance Rate 114 of 581 submissions, 20%

          Article Metrics

• Downloads (Last 12 months): 240
• Downloads (Last 6 weeks): 19
Reflects downloads up to 30 Jan 2025.

          Cited By

• (2024) Emotional Intelligence and Collaborative Dynamics in Industry 5.0 for Human-Machine Interactions. In Human-Machine Collaboration and Emotional Intelligence in Industry 5.0, 190–204. DOI: 10.4018/979-8-3693-6806-0.ch010. Online publication date: 30-Jun-2024.
• (2024) A Simulation and Training Platform for Remote-Sighted Assistance. Sensors 24, 23 (7773). DOI: 10.3390/s24237773. Online publication date: 4-Dec-2024.
• (2024) Human–AI Collaboration for Remote Sighted Assistance: Perspectives from the LLM Era. Future Internet 16, 7 (254). DOI: 10.3390/fi16070254. Online publication date: 18-Jul-2024.
• (2024) Investigating Use Cases of AI-Powered Scene Description Applications for Blind and Low Vision People. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1–21. DOI: 10.1145/3613904.3642211. Online publication date: 11-May-2024.
• (2024) BubbleCam: Engaging Privacy in Remote Sighted Assistance. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1–16. DOI: 10.1145/3613904.3642030. Online publication date: 11-May-2024.
• (2024) NeRF-Enhanced Outpainting for Faithful Field-of-View Extrapolation. In 2024 IEEE International Conference on Robotics and Automation (ICRA), 16826–16833. DOI: 10.1109/ICRA57147.2024.10611328. Online publication date: 13-May-2024.
• (2023) Formal Algebraic Model of an Edge Data Center with a Redundant Ring Topology. Network 3, 1 (142–157). DOI: 10.3390/network3010007. Online publication date: 30-Jan-2023.
• (2023) “Dump it, Destroy it, Send it to Data Heaven”: Blind People’s Expectations for Visual Privacy in Visual Assistance Technologies. In Proceedings of the 20th International Web for All Conference, 134–147. DOI: 10.1145/3587281.3587296. Online publication date: 30-Apr-2023.
• (2023) Are Two Heads Better than One? Investigating Remote Sighted Assistance with Paired Volunteers. In Proceedings of the 2023 ACM Designing Interactive Systems Conference, 1810–1825. DOI: 10.1145/3563657.3596019. Online publication date: 10-Jul-2023.
• (2023) Exploration of verbal descriptions and dynamic indoors environments for people with sight loss. In Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems, 1–6. DOI: 10.1145/3544549.3585883. Online publication date: 19-Apr-2023.
