Learning Semantic Traversability with Egocentric Video and Automated Annotation Strategy

Kim, Yunho; Lee, Jeong Hyun; Lee, Choongin; Mun, Juhyeok; Youm, Donghoon; Park, Jeongsoo; Hwangbo, Jemin

arXiv:2406.02989 (cs)

[Submitted on 5 Jun 2024 (v1), last revised 28 Sep 2024 (this version, v2)]

Abstract: For reliable autonomous navigation in urban settings, a robot must be able to identify semantically traversable terrain in an image based on a semantic understanding of the scene. This reasoning ability is based on semantic traversability, which is frequently achieved using semantic segmentation models fine-tuned on the testing domain. This fine-tuning process often involves manual data collection with the target robot and annotation by human labelers, which is prohibitively expensive and unscalable. In this work, we present an effective methodology for training a semantic traversability estimator using egocentric videos and an automated annotation process. Egocentric videos are collected from a camera mounted on a pedestrian's chest. The dataset for training the semantic traversability estimator is then generated automatically by extracting the semantically traversable regions in each video frame using a recent foundation model for image segmentation and its prompting technique. Extensive experiments with videos taken across several countries and cities, covering diverse urban scenarios, demonstrate the high scalability and generalizability of the proposed annotation method. Furthermore, performance analysis and real-world deployment for autonomous robot navigation show that the trained semantic traversability estimator is highly accurate, handles diverse camera viewpoints, is computationally light, and is applicable in the real world. The summary video is available at this https URL.
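
The automated annotation step described in the abstract (prompt a segmentation model on each egocentric frame, keep the returned region as the traversability label) can be sketched roughly as below. This is only an illustrative outline, not the authors' implementation: the abstract does not name the foundation model or its prompting technique, so the `Segmenter` interface, the `bottom_row_prompt` heuristic (assuming the ground directly ahead of a chest-mounted camera is where the wearer walks), and the color-threshold `toy_segmenter` stand-in are all hypothetical.

```python
from typing import Callable, List, Tuple

Pixel = Tuple[int, int, int]
Frame = List[List[Pixel]]   # rows of RGB pixels
Mask = List[List[int]]      # 1 = traversable, 0 = not traversable
Point = Tuple[int, int]     # (x, y) pixel coordinate used as a prompt

# Hypothetical interface for a promptable segmentation model.
Segmenter = Callable[[Frame, List[Point]], Mask]


def bottom_row_prompt(frame: Frame) -> List[Point]:
    """Assumed heuristic: prompt at the bottom-center pixel of the frame."""
    height, width = len(frame), len(frame[0])
    return [(width // 2, height - 1)]


def toy_segmenter(frame: Frame, points: List[Point], tol: int = 30) -> Mask:
    """Stand-in for a real model: mark pixels whose color is close to the
    color at the first prompt point (sum of absolute channel differences)."""
    x, y = points[0]
    seed = frame[y][x]
    return [
        [int(sum(abs(a - b) for a, b in zip(px, seed)) < tol) for px in row]
        for row in frame
    ]


def auto_annotate(
    frames: List[Frame],
    segmenter: Segmenter,
    prompt_fn: Callable[[Frame], List[Point]] = bottom_row_prompt,
) -> List[Tuple[Frame, Mask]]:
    """Build (frame, mask) training pairs without human labeling."""
    return [(frame, segmenter(frame, prompt_fn(frame))) for frame in frames]


# Tiny synthetic frame: two bright "sky" rows above two dark "road" rows.
SKY, ROAD = (200, 200, 200), (50, 50, 50)
example_frame = [[SKY] * 4 for _ in range(2)] + [[ROAD] * 4 for _ in range(2)]
example_dataset = auto_annotate([example_frame], toy_segmenter)
```

In a real pipeline the `segmenter` argument would wrap an actual promptable segmentation model, and the resulting pairs would be used to train the lightweight traversability estimator; passing the model in as a callable keeps the annotation loop independent of that choice.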

Comments:	Accepted to IEEE Robotics and Automation Letters (RA-L) 2024, First two authors contributed equally
Subjects:	Robotics (cs.RO); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2406.02989 [cs.RO]
	(or arXiv:2406.02989v2 [cs.RO] for this version)
	https://doi.org/10.48550/arXiv.2406.02989

Submission history

From: Kim Yunho
[v1] Wed, 5 Jun 2024 06:40:04 UTC (3,918 KB)
[v2] Sat, 28 Sep 2024 16:31:58 UTC (5,977 KB)
