miles-rl-training

Community

Provides guidance for enterprise-grade RL training using miles, a production-ready fork of slime. Use when training large MoE models with FP8/INT4, needing train-inference alignment, or requiring speculative RL for maximum throughput.

Claude

8,991 stars Updated 1 months ago

Allowed Tools

This skill does not declare a tool allowlist. The agent host applies whatever default tools are available at runtime.

Source

SKILL.md / Manifest

https://raw.githubusercontent.com/zechenzhangagi/ai-research-skills/main/06-post-training/miles/SKILL.md

Registry

github (via claudemarketplaces.com)

Trust Score

53Fair

Verification10/30

Scope Tightness

miles-rl-training

Allowed Tools

Source

Trust Score

Details