Context Awesome

awesome-rlvr

github.com/opendilab/awesome-rlvr

A curated list of resources on reinforcement learning with verifiable rewards (RLVR), continually updated

GitHub Stars: 263
Curated Resources: 271
Categories: 4
Last Refreshed: 4 hours ago

Categories: Surveys & Tutorials, Codebases, Papers, Other Awesome Lists

Use this list with your AI agent

Add the Context Awesome MCP server to Claude, Cursor, or any MCP client, then ask:

"Show me 2026 resources from awesome-rlvr"

What's inside

Papers

Surveys & Tutorials

Codebases

AReaL
Ant Reasoning Reinforcement Learning for LLMs
NeMo-Aligner
Scalable toolkit for efficient model alignment
open-r1
Fully open reproduction of the DeepSeek-R1 pipeline (SFT, distillation, GRPO, evaluation)
Open-Reasoner-Zero
An open-source implementation of large-scale reasoning-oriented RL training, focusing on scalability, simplicity, and accessibility
OpenRLHF
An easy-to-use, scalable, and high-performance RLHF framework based on Ray (PPO, GRPO, REINFORCE++, vLLM, Ray, dynamic sampling, async agentic RL)
PRIME
PRIME (Process Reinforcement through IMplicit REwards), an open-source solution for online RL with process rewards

Other Awesome Lists

Showing a sample of 271 resources. The full list is on GitHub.