Stars
⏰ Collaboratively track deadlines of conferences recommended by CCF (Website, Python CLI, WeChat Applet) / If you find it useful, please star this project, thanks~
Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation".
“FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching”. FlowAR employs the simplest scale design and is compatible with any VAE.
Latency and Memory Analysis of Transformer Models for Training and Inference
Official PyTorch implementation of paper "CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up".
A hands-on introduction to video technology: image, video, codec (av1, vp9, h265) and more (ffmpeg encoding). Translations: 🇺🇸 🇨🇳 🇯🇵 🇮🇹 🇰🇷 🇷🇺 🇧🇷 🇪🇸
FastVideo is a lightweight framework for accelerating large video diffusion models.
qianlima-lab / awesome-lifelong-learning-methods-for-llm
Forked from zzz47zzz/awesome-lifelong-learning-methods-for-llm
This repository collects awesome surveys, resources, and papers for Lifelong Learning for Large Language Models. (Updated Regularly)
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
Make PyTorch models up to 40% faster! Thunder is a source to source compiler for PyTorch. It enables using different hardware executors at once; across one or thousands of GPUs.
Helpful tools and examples for working with flex-attention
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
Efficient vision foundation models for high-resolution generation and perception.
Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
Official PyTorch Implementation of "SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers"
In 2024, the strongest open-source implementation of asymmetric magvit_v2 supports inference code but excludes VQVAE. It supports the joint encoding of images and videos, accommodating arbitrary vi…
Official Implementation of Rectified Flow (ICLR2023 Spotlight)
SGLang is a fast serving framework for large language models and vision language models.
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
DALL·E Mini - Generate images from a text prompt
Official inference repo for FLUX.1 models
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838
Framework for benchmarking vector search engines
Open-Sora: Democratizing Efficient Video Production for All
QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving