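Monocular depth estimation predicts a dense depth map from images captured by a single camera. Recent work has made progress with self-supervised learning from video, where the training signal comes from synthesizing one frame from its neighbors rather than from ground-truth depth. Key methods include SfMLearner, which pioneered this approach; struct2depth, which models object motion explicitly; and Depth from Videos in the Wild, which learns camera intrinsics from unconstrained footage such as YouTube videos. PackNet replaces conventional downsampling and upsampling with 3D packing and unpacking blocks that better preserve spatial detail, and it can estimate depth in metric units when camera velocity is available as weak supervision during training. Toyota Research Institute (TRI), which developed PackNet, has reported state-of-the-art results using these techniques.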
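The intuition behind packing can be illustrated with a space-to-depth rearrangement, which folds each 2x2 block of pixels into separate channels so that halving the resolution discards no values. The sketch below is a minimal pure-Python illustration of that rearrangement only, not PackNet's actual block, which also applies learned convolutions; the function names space_to_depth and depth_to_space and the toy 4x4 image are assumptions made for this example.

```python
def space_to_depth(image, r=2):
    """Fold each r x r block of a 2D image (list of rows) into r*r channels.

    Illustrative helper (hypothetical name): channel a*r + b holds the pixels
    at row offset a and column offset b inside every block.
    """
    h, w = len(image), len(image[0])
    if h % r or w % r:
        raise ValueError("image height and width must be divisible by r")
    return [
        [[image[i * r + a][j * r + b] for j in range(w // r)]
         for i in range(h // r)]
        for a in range(r)
        for b in range(r)
    ]


def depth_to_space(channels, r=2):
    """Inverse of space_to_depth: interleave r*r channels back into one image."""
    h, w = len(channels[0]), len(channels[0][0])
    return [
        [channels[(i % r) * r + (j % r)][i // r][j // r] for j in range(w * r)]
        for i in range(h * r)
    ]


# Toy input, chosen only for illustration.
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]

packed = space_to_depth(image)      # 4 channels, each 2 x 2
restored = depth_to_space(packed)   # back to the original 4 x 4 image
assert restored == image            # the round trip is lossless
```

Because the round trip is exact, every input value remains available to later layers. Pooling or strided subsampling, by contrast, merges or drops values when it reduces resolution.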