Trust: 55/100 (Fair)
fine-tuning-with-trl Guide
TRL: SFT, DPO, PPO, GRPO, reward modeling for LLM RLHF.
170,110 stars, by nousresearch
When to use fine-tuning-with-trl
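Use this skill when fine-tuning an LLM with TRL: supervised fine-tuning (SFT), preference optimization with DPO, reinforcement learning with PPO or GRPO, or reward modeling for RLHF.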
How to use fine-tuning-with-trl
fine-tuning-with-trl is a Claude skill in the SKILL.md format. Add it to your Claude environment from the source repository below. Once added, it is user-invocable and activates when your task matches its description.
Details
Platform: Claude
Category: Code & Development
Invocation: user-invocable
Model: any
Maintainer: nousresearch
License: MIT