llm-obs-eval-pipeline Guide
An end-to-end pipeline that takes unlabeled ml_app traces and produces a bootstrapped evaluator suite. It runs trace classification → root cause analysis (RCA) → eval bootstrap in sequence, with user checkpoints along the way.
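The sequential flow with user checkpoints described above can be pictured with a small sketch. This is only an illustration under stated assumptions, not the skill's actual implementation: the names run_pipeline, classify, root_cause, bootstrap, and the confirm callback are all hypothetical, and the stage bodies are placeholders.

```python
from typing import Callable, List, Tuple

# Hypothetical stage signature: takes the previous stage's output, returns its own.
Stage = Callable[[object], object]


def run_pipeline(
    traces: object,
    stages: List[Tuple[str, Stage]],
    confirm: Callable[[str, object], bool],
) -> Tuple[object, List[str]]:
    """Run stages in order, pausing for user confirmation after each one."""
    data = traces
    completed: List[str] = []
    for name, stage in stages:
        data = stage(data)
        completed.append(name)
        # Checkpoint: stop early if the user does not approve this stage's output.
        if not confirm(name, data):
            break
    return data, completed


# Placeholder stages (illustrative only, not the skill's real logic).
def classify(traces):
    return [{"trace": t, "label": "fail" if "error" in t else "pass"} for t in traces]


def root_cause(labeled):
    return [row for row in labeled if row["label"] == "fail"]


def bootstrap(failures):
    return [f"evaluator_for:{row['trace']}" for row in failures]


STAGES = [("classify", classify), ("rca", root_cause), ("bootstrap", bootstrap)]

# Auto-approve every checkpoint; a real session would ask the user instead.
result, done = run_pipeline(["ok response", "tool error"], STAGES, lambda name, out: True)
```

Passing a confirm callback that returns False at any stage halts the run there, so the user can correct that stage's output before later stages build on it.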
When to use llm-obs-eval-pipeline
Use this skill when the user says "run the eval pipeline", "go from traces to evals", "bootstrap evals end to end", "classify then RCA then bootstrap", or "build an eval set from scratch", or when they want a guided walkthrough from production data to evaluator code.
How to use llm-obs-eval-pipeline
llm-obs-eval-pipeline is a Claude skill in the SKILL.md format. Add it to your Claude environment from the source repository below; it then activates as a user-invocable skill whenever your task matches its description.