DOI: 10.1145/3638550.3641122
Research Article

Mobile AR Depth Estimation: Challenges & Prospects

Published: 28 February 2024

Abstract

Accurate metric depth can help achieve more realistic user interactions, such as object placement and occlusion detection, in mobile augmented reality (AR). However, obtaining metrically accurate depth estimates is challenging in practice. We tested four state-of-the-art (SOTA) monocular depth estimation models on a recently introduced dataset (ARKitScenes) and observed clear performance gaps on this real-world mobile dataset. We categorize these challenges as hardware-, data-, and model-related, and propose promising future directions, including (i) using more hardware-related information from the mobile device's camera and other available sensors, (ii) capturing high-quality data that reflects real-world AR scenarios, and (iii) designing model architectures that utilize the new information.
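The distinction the abstract draws between metrically accurate and merely relative depth can be made concrete with the standard evaluation protocol used for monocular depth models. Below is a minimal, illustrative sketch (not the paper's actual evaluation code): it computes the conventional AbsRel, RMSE, and δ<1.25 metrics, and applies the median scaling commonly used to align relative-depth predictions to metric ground truth. All function names and the toy data are invented for illustration.

```python
import numpy as np

def depth_metrics(pred, gt, min_depth=0.1, max_depth=10.0):
    """Standard monocular depth metrics (AbsRel, RMSE, delta < 1.25).

    pred, gt: per-pixel depth arrays in meters, same shape.
    Ground-truth pixels outside [min_depth, max_depth] are masked out.
    """
    mask = (gt > min_depth) & (gt < max_depth)
    pred, gt = pred[mask], gt[mask]
    abs_rel = np.mean(np.abs(pred - gt) / gt)
    rmse = np.sqrt(np.mean((pred - gt) ** 2))
    ratio = np.maximum(pred / gt, gt / pred)
    delta1 = np.mean(ratio < 1.25)
    return abs_rel, rmse, delta1

def median_scale(pred, gt):
    """Align a scale-ambiguous (relative) prediction to metric ground truth.

    Relative-depth models (MiDaS-style) need this alignment before the
    metrics above are meaningful; metric models (ZoeDepth-style) do not,
    which is exactly what makes metric accuracy the harder target.
    """
    return pred * (np.median(gt) / np.median(pred))

# Toy example: a "prediction" that is ground truth at half scale.
rng = np.random.default_rng(0)
gt = rng.uniform(0.5, 5.0, size=(192, 256))
pred = 0.5 * gt
abs_rel, rmse, d1 = depth_metrics(pred, gt)
print(f"unscaled:      AbsRel={abs_rel:.2f}")  # AbsRel=0.50 (wrong scale)
abs_rel, rmse, d1 = depth_metrics(median_scale(pred, gt), gt)
print(f"median-scaled: AbsRel={abs_rel:.2f}")  # AbsRel=0.00 (scale recovered)
```

The toy example shows why zero-shot relative models can look strong on benchmarks yet fail in AR: median scaling erases a global scale error that a real mobile AR system, with no ground truth available at run time, cannot correct.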

References

[1] Apple. https://developer.apple.com/augmented-reality/, 2017.
[2] G. Baruch, Z. Chen, A. Dehghan, T. Dimry, Y. Feigin, P. Fu, T. Gebauer, B. Joffe, D. Kurz, A. Schwartz, and E. Shulman. ARKitScenes - A Diverse Real-World Dataset for 3D Indoor Scene Understanding Using Mobile RGB-D Data. In NeurIPS Datasets and Benchmarks Track, 2021.
[3] S. F. Bhat, I. Alhashim, and P. Wonka. LocalBins: Improving Depth Estimation by Learning Local Distributions. In ECCV, 2022.
[4] S. F. Bhat, R. Birkl, D. Wofk, P. Wonka, and M. Müller. ZoeDepth: Zero-Shot Transfer by Combining Relative and Metric Depth. arXiv:2302.12288, 2023.
[5] R. Birkl, D. Wofk, and M. Müller. MiDaS v3.1 - A Model Zoo for Robust Monocular Relative Depth Estimation. arXiv:2307.14460, 2023.
[6] G. Brazil, A. Kumar, J. Straub, N. Ravi, J. Johnson, and G. Gkioxari. Omni3D: A Large Benchmark and Model for 3D Object Detection in the Wild. In CVPR, 2023.
[7] J. Cho, D. Min, Y. Kim, and K. Sohn. DIML/CVL RGB-D Dataset: 2M RGB-D Images of Natural Indoor and Outdoor Scenes. arXiv:2110.11590, 2021.
[8] A. Dai, A. X. Chang, M. Savva, M. Halber, T. Funkhouser, and M. Nießner. ScanNet: Richly-Annotated 3D Reconstructions of Indoor Scenes. In CVPR, 2017.
[9] S. F. Bhat, I. Alhashim, and P. Wonka. AdaBins: Depth Estimation Using Adaptive Bins. In CVPR, 2021.
[10] Y. Fujimura, M. Iiyama, T. Funatomi, and Y. Mukaigawa. Deep Depth from Focal Stack with Defocus Model for Camera-Setting Invariance. arXiv:2202.13055, 2022.
[11] A. Ganj, Y. Zhao, F. Galbiati, and T. Guo. Toward Scalable and Controllable AR Experimentation. In ImmerCom, 2023.
[12] A. Geiger, P. Lenz, C. Stiller, and R. Urtasun. Vision Meets Robotics: The KITTI Dataset. IJRR, 2013.
[13] V. Guizilini, I. Vasiljevic, D. Chen, R. Ambrus, and A. Gaidon. Towards Zero-Shot Scale-Aware Monocular Depth Estimation. In ICCV, 2023.
[14] S. Hwang, J. Lee, W. J. Kim, S. Woo, K. Lee, and S. Lee. LiDAR Depth Completion Using Color-Embedded Information via Knowledge Distillation. IEEE Transactions on Intelligent Transportation Systems, 2022.
[15] Intel. Intel RealSense D400 Series Datasheet. https://www.intelrealsense.com/wp-content/uploads/2023/07/Intel-RealSense-D400-Series-Datasheet-July-2023.pdf, 2023.
[16] M. Maximov, K. Galim, and L. Leal-Taixé. Focus on Defocus: Bridging the Synthetic to Real Domain Gap for Depth Estimation. In CVPR, 2020.
[17] N. Silberman, D. Hoiem, P. Kohli, and R. Fergus. Indoor Segmentation and Support Inference from RGBD Images. In ECCV, 2012.
[18] M. Norman, V. Kellen, S. Smallen, B. DeMeulle, S. Strande, E. Lazowska, N. Alterman, R. Fatland, S. Stone, A. Tan, K. Yelick, E. Van Dusen, and J. Mitchell. CloudBank: Managed Services to Simplify Cloud Access for Computer Science Research and Education. In PEARC, 2021.
[19] R. Ranftl, A. Bochkovskiy, and V. Koltun. Vision Transformers for Dense Prediction. In ICCV, 2021.
[20] R. Ranftl, K. Lasinger, D. Hafner, K. Schindler, and V. Koltun. Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-Shot Cross-Dataset Transfer. TPAMI, 2020.
[21] M. Sayed, J. Gibson, J. Watson, V. Prisacariu, M. Firman, and C. Godard. SimpleRecon: 3D Reconstruction Without 3D Convolutions. In ECCV, 2022.
[22] J. Sturm, N. Engelhard, F. Endres, W. Burgard, and D. Cremers. A Benchmark for the Evaluation of RGB-D SLAM Systems. In IROS, 2012.
[23] F. Tapia Benavides, A. Ignatov, and R. Timofte. PhoneDepth: A Dataset for Monocular Depth Estimation on Mobile Devices. In CVPRW, 2022.
[24] N.-H. Wang, R. Wang, Y.-L. Liu, Y.-H. Huang, Y.-L. Chang, C.-P. Chen, and K. Jou. Bridging Unsupervised and Supervised Depth from Focus via All-in-Focus Supervision. In ICCV, 2021.
[25] C.-Y. Wu, J. Wang, M. Hall, U. Neumann, and S. Su. Toward Practical Monocular Indoor Depth Estimation. In CVPR, 2022.
[26] W. Yin, C. Zhang, H. Chen, Z. Cai, G. Yu, K. Wang, X. Chen, and C. Shen. Metric3D: Towards Zero-Shot Metric 3D Prediction from a Single Image. In ICCV, 2023.
[27] J. Zhang, H. Yang, J. Ren, D. Zhang, B. He, T. Cao, Y. Li, Y. Zhang, and Y. Liu. MobiDepth: Real-Time Depth Estimation Using On-Device Dual Cameras. In MobiCom, 2022.
[28] Y. Zhang, T. Scargill, A. Vaishnav, G. Premsankar, M. Di Francesco, and M. Gorlatova. InDepth: Real-Time Depth Inpainting for Mobile Augmented Reality. IMWUT, 2022.

Cited By

  • (2025) Boosting Depth Estimation for Self-Driving in a Self-Supervised Framework via Improved Pose Network. IEEE Open Journal of the Computer Society, 6, 109-118. DOI: 10.1109/OJCS.2024.3505876
  • (2024) Enhancing Visual Perception in Immersive VR and AR Environments: AI-Driven Color and Clarity Adjustments Under Dynamic Lighting Conditions. Technologies, 12(11), 216. DOI: 10.3390/technologies12110216
  • (2024) Towards In-context Environment Sensing for Mobile Augmented Reality. In Proceedings of the 30th Annual International Conference on Mobile Computing and Networking, 2091-2097. DOI: 10.1145/3636534.3696211
  • (2024) Toward Robust Depth Fusion for Mobile AR With Depth from Focus and Single-Image Priors. In 2024 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct), 517-520. DOI: 10.1109/ISMAR-Adjunct64951.2024.00149


Information

Published In

HOTMOBILE '24: Proceedings of the 25th International Workshop on Mobile Computing Systems and Applications
February 2024, 167 pages
ISBN: 9798400704970
DOI: 10.1145/3638550

Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall acceptance rate: 96 of 345 submissions, 28%


Article Metrics

  • Downloads (last 12 months): 251
  • Downloads (last 6 weeks): 20
Reflects downloads up to 27 Jan 2025.
