calflops is designed to calculate FLOPs, MACs, and parameters for a wide range of neural networks, such as Linear, CNN, RNN, GCN, and Transformer models (BERT, LLaMA, and other large language models).

Python 562 21 Updated Jun 27, 2024
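A minimal usage sketch, assuming the `calculate_flops` entry point advertised in the calflops README; the model and input shape below are arbitrary examples.

```python
import torchvision.models as models
from calflops import calculate_flops  # assumed entry point from the calflops README

model = models.resnet18()
flops, macs, params = calculate_flops(
    model=model,
    input_shape=(1, 3, 224, 224),   # batch of one 224x224 RGB image
    output_as_string=True,
)
print(flops, macs, params)
```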

GRASS: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients

9 Updated Jun 19, 2024

A trainable PyTorch reproduction of AlphaFold 3.

Python 596 47 Updated Nov 14, 2024

Development repository for the Triton language and compiler

C++ 13,396 1,638 Updated Nov 14, 2024
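For orientation, a self-contained vector-add kernel written in the Triton language; this is a generic beginner example, not code taken from the repository.

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)                       # one program per block of elements
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements                       # guard the ragged last block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x, y):
    # x and y must be CUDA tensors of the same shape
    out = torch.empty_like(x)
    n = x.numel()
    grid = (triton.cdiv(n, 1024),)
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out
```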

Official code for ReLoRA from the paper "Stack More Layers Differently: High-Rank Training Through Low-Rank Updates"

Jupyter Notebook 434 39 Updated Apr 21, 2024
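A conceptual sketch of the idea named in the paper title (not the repository's code): train a low-rank update, periodically fold it into the frozen weight, then restart the factors so the accumulated change can become high-rank. The shapes here are hypothetical.

```python
import torch

d, r = 1024, 8                          # hypothetical layer width and low-rank dimension
W = torch.randn(d, d)                   # frozen base weight
A = torch.randn(r, d) * 0.01            # trainable low-rank factors
B = torch.zeros(d, r)

def merge_and_restart(W, A, B):
    """Fold the current low-rank update into the base weight, then
    re-initialize the factors so the next cycle learns a fresh update."""
    W = W + B @ A
    A = torch.randn(r, d) * 0.01
    B = torch.zeros(d, r)
    return W, A, B
```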

Puzzles for learning Triton; play them with minimal environment configuration!

Python 99 1 Updated Nov 12, 2024

Puzzles for learning Triton

Jupyter Notebook 1,117 80 Updated Sep 25, 2024
Python 32 Updated Nov 8, 2024

Code for Adam-mini: Use Fewer Learning Rates To Gain More (https://arxiv.org/abs/2406.16793)

Python 325 10 Updated Oct 30, 2024
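A conceptual sketch of the "fewer learning rates" idea, not the official Adam-mini implementation: keep a per-coordinate first moment but share one second-moment scalar per parameter block (bias correction omitted for brevity).

```python
import torch

def adam_mini_like_step(param, grad, m, v_block, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    # Per-coordinate first moment, as in Adam.
    m.mul_(b1).add_(grad, alpha=1 - b1)
    # One shared second-moment scalar for the whole block instead of one per coordinate.
    v_block = b2 * v_block + (1 - b2) * grad.pow(2).mean()
    param.add_(m / (v_block.sqrt() + eps), alpha=-lr)
    return v_block

# Per-block state: m = torch.zeros_like(param), v_block = torch.zeros(())
```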

Python logging made (stupidly) simple

Python 19,957 699 Updated Nov 3, 2024
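This is loguru; the whole pitch is that the logger works with zero configuration.

```python
from loguru import logger

logger.info("Training started")                      # preconfigured stderr sink
logger.add("run.log", rotation="10 MB")              # optional rotating file sink
logger.warning("Learning rate {} may be too high", 1e-2)
```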

Accessible large language models via k-bit quantization for PyTorch.

Python 6,282 630 Updated Nov 14, 2024
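A minimal sketch of the common transformers integration for 4-bit loading with bitsandbytes; the model name is just an example, and the exact options should be checked against the current docs.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16, store weights in 4 bit
)
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",                    # example model, swap in your own
    quantization_config=bnb_config,
)
```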

A platform for building proxies to bypass network restrictions.

Go 29,540 4,659 Updated Nov 14, 2024

DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

2,185 107 Updated Sep 24, 2024

[NeurIPS 2024 Spotlight] EMR-Merging: Tuning-Free High-Performance Model Merging

Python 30 3 Updated Oct 22, 2024
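For context only, a generic model-merging baseline (plain parameter averaging across checkpoints of the same architecture); EMR-Merging itself is a tuning-free method with its own procedure described in the paper, which this sketch does not reproduce.

```python
import torch

def average_state_dicts(state_dicts):
    """Average several checkpoints of the same architecture, key by key."""
    merged = {}
    for key in state_dicts[0]:
        merged[key] = torch.stack([sd[key].float() for sd in state_dicts]).mean(dim=0)
    return merged
```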
Python 199 19 Updated Jun 11, 2024

Awesome-Low-Rank-Adaptation

34 6 Updated Oct 13, 2024
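The common thread of the LoRA repositories in this list is the same parameterization: a frozen dense weight plus a trainable low-rank update scaled by alpha/r. A minimal, generic layer:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen dense weight plus a trainable low-rank update (generic sketch)."""
    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad_(False)               # freeze the pretrained weight
        self.A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_features, r))  # zero-init so the update starts at 0
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + (x @ self.A.t() @ self.B.t()) * self.scaling
```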

Fira: Can We Achieve Full-rank Training of LLMs Under Low-rank Constraint?

Python 82 2 Updated Oct 21, 2024

Mirage: Automatically Generating Fast GPU Kernels without Programming in Triton/CUDA

C++ 627 36 Updated Nov 5, 2024

[ICML 2024 Oral] The official implementation of our paper "Accurate LoRA-Finetuning Quantization of LLMs via Information Retention"

Python 59 5 Updated Apr 15, 2024

Tips for Writing a Research Paper using LaTeX

TeX 3,112 367 Updated May 4, 2023

batched loras

Python 336 15 Updated Sep 6, 2023

Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs

Python 2,192 143 Updated Nov 14, 2024
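A hedged sketch of the underlying pattern, adapter hot-swapping on one shared base model, shown here with the peft library rather than the server's own API; the adapter paths are placeholders.

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")     # example base model
model = PeftModel.from_pretrained(base, "path/to/adapter_a", adapter_name="a")
model.load_adapter("path/to/adapter_b", adapter_name="b")
model.set_adapter("b")        # route the next request through adapter "b"
```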

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

3,587 150 Updated Sep 25, 2024

Cold Compress is a hackable, lightweight, and open-source toolkit for creating and benchmarking cache compression methods built on top of GPT-Fast, a simple, PyTorch-native generation codebase.

Python 87 8 Updated Aug 9, 2024
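As a point of reference, the simplest cache-compression baseline is recency-based truncation of the KV cache; Cold Compress's own policies are more sophisticated and are not reproduced here.

```python
def truncate_kv_cache(k, v, window=1024):
    """Keep only the most recent `window` positions along the sequence axis
    (a naive recency baseline, not Cold Compress's methods)."""
    return k[..., -window:, :], v[..., -window:, :]
```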

Writing AI Conference Papers: A Handbook for Beginners

1,359 47 Updated Nov 11, 2024

A fusion of a linear layer and a cross-entropy loss, written for PyTorch in Triton.

Python 54 6 Updated Aug 2, 2024
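The unfused PyTorch reference that such a kernel replaces; fusing the two steps avoids materializing the full [batch, vocab] logits tensor.

```python
import torch.nn.functional as F

def linear_cross_entropy_reference(hidden, weight, targets):
    """Unfused reference: builds the full [batch, vocab] logits matrix,
    which is exactly the memory a fused kernel saves."""
    logits = hidden @ weight.t()          # [batch, hidden] x [hidden, vocab]
    return F.cross_entropy(logits, targets)
```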

Code for visualizing the loss landscape of neural nets

Python 2,833 399 Updated Apr 5, 2022
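A generic 1D slice of the loss surface, evaluating the loss as the weights move along a fixed direction; the repository's filter-normalized 2D contours refine this idea, which this sketch does not implement. The loss_fn(model, data) signature is an assumption.

```python
import torch

@torch.no_grad()
def loss_along_direction(model, direction, loss_fn, data, alphas):
    """Evaluate the loss at weights + alpha * direction for each alpha.
    `direction` is a list of tensors shaped like model.parameters()."""
    base = [p.clone() for p in model.parameters()]
    losses = []
    for a in alphas:
        for p, p0, d in zip(model.parameters(), base, direction):
            p.copy_(p0 + a * d)
        losses.append(loss_fn(model, data).item())
    for p, p0 in zip(model.parameters(), base):    # restore the original weights
        p.copy_(p0)
    return losses
```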
Python 76 8 Updated Jul 6, 2024

The official repository for the ICML 2024 paper "Flora: Low-Rank Adapters Are Secretly Gradient Compressors".

Python 69 4 Updated Jul 1, 2024
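A toy illustration of the idea named in the title, compressing a gradient with a shared random projection and expanding it back; this is not the repository's API, and the shapes are hypothetical.

```python
import torch

out_dim, in_dim, r = 1024, 1024, 16
P = torch.randn(in_dim, r) / r ** 0.5      # shared random projection

def compress(grad):                        # (out_dim, in_dim) -> (out_dim, r)
    return grad @ P

def decompress(compressed):                # (out_dim, r) -> (out_dim, in_dim)
    return compressed @ P.t()
```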

The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.

Python 56,512 5,989 Updated Nov 14, 2024