
Starred repositories
Genome modeling and design across all domains of life
The repo for "Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image" and "Metric3Dv2: A Versatile Monocular Geometric Foundation Model..."
This is a simple demonstration of more advanced, agentic patterns built on top of the Realtime API.
A Self-adaptation Framework🐙 that adapts LLMs for unseen tasks in real-time!
LLaVA-Mini is a unified large multimodal model (LMM) that can support the understanding of images, high-resolution images, and videos in an efficient manner.
Open source impl of **MV-DUSt3R+ Single-Stage Scene Reconstruction from Sparse Views In 2 Seconds** from Meta Reality Labs. Project page https://mv-dust3rp.github.io/
A summary of efficient Segment Anything models
Official Implementation of CVPR24 highlight paper: Matching Anything by Segmenting Anything
A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.
Efficient LLM Inference over Long Sequences
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
OpenStereo: A Comprehensive Benchmark for Stereo Matching and Strong Baseline
Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"
Source code for CVPR2022 paper "Abandoning the Bayer-Filter to See in the Dark"
[ICRA2025] Integrates the vision, touch, and common-sense information of foundational models, customized to the agent's perceptual needs.
[ECCV'24] FisherRF: Active View Selection and Uncertainty Quantification for Radiance Fields using Fisher Information
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.
Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
Quick implementation of nGPT, learning entirely on the hypersphere, from NvidiaAI
Official Implementation of paper "MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion"
Jobs_Applier_AI_Agent_AIHawk aims to ease the job hunt by automating the job application process. Using artificial intelligence, it enables users to apply for multiple jobs in a tailored way.
Entropy Based Sampling and Parallel CoT Decoding
Depth Pro: Sharp Monocular Metric Depth in Less Than a Second.
Deep Learning tools and applications for NVIDIA AGX platforms.