Toward Learning RL
GRPO
- https://www.interconnects.ai/p/papers-im-reading-base-model-rl-grpo?open=false#%C2%A7kimi-k-scaling-reinforcement-learning-with-llms
- https://www.youtube.com/watch?v=grpc-Wyy-Zg
- https://aiengineering.academy/LLM/TheoryBehindFinetuning/GRPO/
- https://www.youtube.com/watch?v=Yi1UCrAsf4o&t=1334s
- https://www.coursera.org/specializations/reinforcement-learning
PPO
- https://aiengineering.academy/LLM/TheoryBehindFinetuning/PPO/
- https://yugeten.github.io/posts/2025/01/ppogrpo/