Stars
An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)
[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios (awesome MCTS)
Codes and Data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models
This is the repository that contains the source code for the Self-Evaluation Guided MCTS for online DPO.
Code and data for "MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning" (ICLR 2024)
A series of math-specific large language models of our Qwen2 series.
Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs"
LOFT: A 1 Million+ Token Long-Context Benchmark
A modular graph-based Retrieval-Augmented Generation (RAG) system
Official repository for paper "Weak-to-Strong Extrapolation Expedites Alignment"
[NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward
Curated tutorials and resources for Large Language Models, Text2SQL, Text2DSL, Text2API, Text2Vis and more.
Code for the curation of The Stack v2 and StarCoder2 training data
Reference implementation for DPO (Direct Preference Optimization)
Question and Answer based on Anything.
SuperCLUE-Math6: Exploring a new generation of native Chinese multi-turn, multi-step mathematical reasoning datasets
[ICML 2024] TrustLLM: Trustworthiness in Large Language Models
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)
The user home repository for the Mathematics in Lean tutorial.
Daily updated LLM papers. Subscriptions welcome 👏 If you like it, please give it a 🌟