Context Awesome

awesome-llm-post-training

github.com/mbzuai-oryx/awesome-llm-post-training

Awesome Reasoning LLM Tutorial/Survey/Guide

GitHub Stars: 2.5k

Curated Resources: 187

Categories: 14

Last Refreshed: 1 hour ago

🔍 Survey
🤖 LLMs-in-RL
🏆 Reward Learning (Process Reward Models)
Policy Optimization
MCTS/Tree Search
Explainability
Multimodal Agent related Slow-Fast System
Benchmark and Datasets
Reasoning and Safety
🚀 RL & LLM Fine-Tuning Repositories
⚡ Applications & Benchmarks
📚 Tutorials & Courses
🛠️ Libraries & Implementations
🔗 Other Resources

Use this list with your AI agent

Add the Context Awesome MCP server to Claude, Cursor, or any MCP client, then ask:

"Show me 🚀 rl & llm fine-tuning repositories resources from awesome-llm-post-training"

Installation instructions →

What's inside

🚀 RL & LLM Fine-Tuning Repositories

1. Offers code for fine-tuning large vision-language models as decision-making agents via RL. Includes implementations for training models with task-specific rewards and evaluating them in various environments.
10. A high-throughput, distributed architecture for integrating LLMs into interactive environments. It is not specialized for RL or RLHF by default, but it supports custom implementations and suits users who need maximum flexibility.
11. Implements the ICML 2024 paper "Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning". Focuses on enhancing LLM reasoning capabilities with a reverse curriculum RL approach.
12. A flexible, efficient, and production-ready RL training library for large language models (LLMs). Serves as the open-source implementation of the HybridFlow framework. Supports several RL algorithms (PPO, GRPO) and advanced resource utilization, and scales to 70B-parameter models on hundreds of GPUs. Integrates with Hugging Face models, supervised fine-tuning, and RLHF with multiple reward types.
13. A distributed training framework for fine-tuning LLMs with reinforcement learning. Supports both Accelerate and NVIDIA NeMo backends, allowing training of models with 20B+ parameters. Implements PPO and ILQL, and integrates with CHEESE for human-in-the-loop data collection.
14. A framework for instruction tuning of LLMs with RLHF, supporting 26 languages. Provides multilingual resources such as ChatGPT prompts, instruction datasets, and response ranking data, along with both BLOOM-based and LLaMA-based models and evaluation benchmarks.

🔍 Survey

A Survey on Bridging VLMs and Synthetic Data
16 May 2025
A Survey on Foundation Models for Decision Making
9 Jan 2023
A Survey on Large Language Model Alignment Techniques
6 May 2023
A Survey on Large Language Models for Reinforcement Learning
10 Dec 2023
A Survey on Multimodal Large Language Models
10 Feb 2025
A Survey on Post-training of Large Language Models
8 Mar 2025

🛠️ Libraries & Implementations

Resources

Komal Kumar

🔗 Other Resources

⚡ Applications & Benchmarks

OpenAI (2023) [Paper]
Wu et al. (2023) [Paper]

MCTS/Tree Search

Explainability

Showing a sample of 187 resources. View the full list on GitHub →