
🚀 Sascha’s Paper Club

Depth Anything — A Foundation Model for Monocular Depth Estimation

Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data by L. Yang et al.

Sascha Kirch
Towards Data Science
11 min read · Mar 20, 2024


“Depth Anything” paper illustration, created from the publication by Sascha Kirch

Monocular depth estimation is the prediction of distance in 3D space from a single 2D image. This “ill-posed and inherently ambiguous problem”, as stated in literally every paper on depth estimation, is a fundamental problem in computer vision and robotics. At the same time, foundation models dominate the scene in deep-learning-based NLP and computer vision. Wouldn’t it be awesome if we could leverage their success for depth estimation too?

In today’s paper walkthrough we’ll dive into Depth Anything, a foundation model for monocular depth estimation. We will discover its architecture, the tricks used to train it and how it is used for metric depth estimation.

Paper: Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data, Lihe Yang et al., 19 Jan. 2024

Resources: GitHub · Project Page · Demo · Checkpoints

Conference: CVPR 2024

Category: foundation models, monocular depth estimation

Other Walkthroughs:
[BYOL] — [CLIP] — [GLIP] — [Segment Anything] — [DINO] — [DDPM]
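
Before we dig into the details, here is a quick taste of the model in action. The snippet below is a minimal sketch, assuming the Hugging Face transformers depth-estimation pipeline and the small Depth Anything checkpoint; the model identifier “LiheYoung/depth-anything-small-hf” and the file names are illustrative, so check the checkpoints linked above for the exact variant you want.

```python
# Minimal sketch: relative depth estimation with Depth Anything via the
# Hugging Face transformers "depth-estimation" pipeline.
# The checkpoint name below is an assumption; see the linked checkpoints.
from transformers import pipeline
from PIL import Image

depth_estimator = pipeline(
    task="depth-estimation",
    model="LiheYoung/depth-anything-small-hf",  # assumed checkpoint id
)

image = Image.open("example.jpg")  # any RGB input image (hypothetical file name)
result = depth_estimator(image)

predicted_depth = result["predicted_depth"]  # torch.Tensor with the raw prediction
depth_map = result["depth"]                  # PIL image, normalized for viewing
depth_map.save("example_depth.png")
```

Note that this base checkpoint predicts relative depth; the metric depth variants discussed later in the walkthrough are fine-tuned separately.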

