Stars
calflops is designed to calculate FLOPs, MACs, and parameters for a wide range of neural networks, such as Linear, CNN, RNN, GCN, and Transformer models (BERT, LLaMA, and other large language models).
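A minimal usage sketch: the call below follows the `calculate_flops` entry point shown in the calflops README; the exact keyword names and the human-readable return format are assumptions and may differ between versions.

```python
import torchvision.models as models
from calflops import calculate_flops  # import path per the calflops README (assumption)

# Profile a standard CNN; input_shape includes the batch dimension (assumption).
model = models.resnet50()
flops, macs, params = calculate_flops(
    model=model,
    input_shape=(1, 3, 224, 224),
    output_as_string=True,  # return readable strings instead of raw counts (assumption)
)
print(f"FLOPs: {flops}  MACs: {macs}  Params: {params}")
```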
GRASS: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients
A trainable PyTorch reproduction of AlphaFold 3.
Development repository for the Triton language and compiler
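For orientation, a minimal Triton kernel follows the pattern of the official vector-addition tutorial; the kernel and wrapper names below are illustrative, not part of the library itself.

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one contiguous block of elements.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard against the ragged final block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = out.numel()
    grid = lambda meta: (triton.cdiv(n, meta["BLOCK_SIZE"]),)
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out
```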
Official code for ReLoRA from the paper Stack More Layers Differently: High-Rank Training Through Low-Rank Updates
Puzzles for learning Triton; play them with minimal environment configuration!
Code for Adam-mini: Use Fewer Learning Rates To Gain More https://arxiv.org/abs/2406.16793
Accessible large language models via k-bit quantization for PyTorch.
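A minimal sketch of using bitsandbytes' 8-bit optimizer as a drop-in replacement for `torch.optim.Adam`; the toy model and tensor shapes are illustrative only.

```python
import torch
import torch.nn as nn
import bitsandbytes as bnb

model = nn.Linear(1024, 1024).cuda()  # illustrative toy model

# Drop-in replacement for torch.optim.Adam that keeps optimizer state in 8 bits.
optimizer = bnb.optim.Adam8bit(model.parameters(), lr=1e-3)

x = torch.randn(16, 1024, device="cuda")
loss = model(x).pow(2).mean()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```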
A platform for building proxies to bypass network restrictions.
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
[NeurIPS 2024 Spotlight] EMR-Merging: Tuning-Free High-Performance Model Merging
Fira: Can We Achieve Full-rank Training of LLMs Under Low-rank Constraint?
Mirage: Automatically Generating Fast GPU Kernels without Programming in Triton/CUDA
[ICML 2024 Oral] This project is the official implementation of our paper "Accurate LoRA-Finetuning Quantization of LLMs via Information Retention"
Tips for Writing a Research Paper using LaTeX
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
Cold Compress is a hackable, lightweight, and open-source toolkit for creating and benchmarking cache compression methods built on top of GPT-Fast, a simple, PyTorch-native generation codebase.
Writing AI Conference Papers: A Handbook for Beginners
A fusion of a linear layer and a cross-entropy loss, written for PyTorch in Triton.
Code for visualizing the loss landscape of neural nets
This is the official repository for the paper "Flora: Low-Rank Adapters Are Secretly Gradient Compressors" in ICML 2024.
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.