- Hong Kong
- https://yaomarkmu.github.io/
- @YaoMarkMu1
Stars
RoboTwin: Dual-Arm Robot Benchmark with Generative Digital Twins
A generalized policy for robotic manipulation
DOZE: A Dataset for Open-Vocabulary Zero-Shot Object Navigation in Dynamic Environments
SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4, or your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges.
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
Code for MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World
https://www.shoufachen.com/Awesome-Diffusion-Transformers/
Octo is a transformer-based robot policy trained on a diverse mix of 800k robot trajectories.
[CVPR 2023] Official implementation of the paper "Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation"
Summary of key papers and blogs about diffusion models to learn about the topic. Detailed list of all published diffusion robotics papers.
ControlLLM: Augment Language Models with Tools by Searching on Graphs
[NeurIPS 2023] InsActor: Instruction-driven Physics-based Characters
ICLR2024 Spotlight: curation/training code, metadata, distribution and pre-trained models for MetaCLIP; CVPR 2024: MoDE: CLIP Data Experts via Clustering
FreeU: Free Lunch in Diffusion U-Net (CVPR2024 Oral)
Carrie-lin / BlenderToolbox
Forked from HTDerekLiu/BlenderToolbox
Some simple Blender scripts for rendering paper figures
Xwin-LM: Powerful, Stable, and Reproducible LLM Alignment
The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images from an image prompt.
The official implementation of "CityDreamer: Compositional Generative Model of Unbounded 3D Cities". (Xie et al., CVPR 2024)
[ICLR 2024 & ECCV 2024] The All-Seeing Projects: Towards Panoptic Visual Recognition & Understanding and General Relation Comprehension of the Open World
This repository is a collection of research papers on World Models.
Code for the ICML 2023 paper "Learning Dynamic Query Combinations for Transformer-based Object Detection and Segmentation"
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
Proof of concept applying the SayCan project to a real UR5 robot
Code repository for the CVPR2023 publication "HoloDiffusion: Training a 3D diffusion model using 2D Images"
ShoufaChen / img2dataset
Forked from rom1504/img2dataset
Easily turn large sets of image URLs into an image dataset. Can download, resize and package 100M URLs in 20h on one machine.
Generative Models by Stability AI
Implementation of Parti, Google's pure attention-based text-to-image neural network, in PyTorch
Quadruped manipulator controller using model predictive control and whole body control based on OCS2