Trust: 55/100 (Fair)
llm-evaluation Guide
Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or establishing evaluation frameworks.
36,031 stars, by wshobson
When to use llm-evaluation
Use llm-evaluation when you are testing LLM performance, measuring the quality of an AI application, or establishing an evaluation framework that combines automated metrics, human feedback, and benchmarking.
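To make "automated metrics" concrete, here is a minimal sketch of two common reference-based scores, exact match and token-overlap F1, averaged over a small set of prediction/reference pairs. It is an illustration only and is not taken from the skill itself; the function names (`exact_match`, `token_f1`, `evaluate`) and the whitespace tokenization are assumptions made for this example.

```python
from collections import Counter


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 when the case- and whitespace-normalized strings match."""
    return float(prediction.strip().lower() == reference.strip().lower())


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a prediction and a reference answer.

    Tokens are lowercase whitespace-separated words (an assumption made
    for this sketch; real evaluations often normalize more carefully).
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both sequences.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def evaluate(pairs):
    """Average both metrics over a list of (prediction, reference) pairs."""
    if not pairs:
        raise ValueError("evaluate() needs at least one pair")
    n = len(pairs)
    return {
        "exact_match": sum(exact_match(p, r) for p, r in pairs) / n,
        "token_f1": sum(token_f1(p, r) for p, r in pairs) / n,
    }
```

In practice the pairs would come from running the application under test against a held-out set of reference answers; scores like these are usually read alongside human feedback rather than on their own.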
How to use llm-evaluation
llm-evaluation is a Claude skill in the SKILL.md format. Add it to your Claude environment from the source repository below. Once installed, it is user-invocable and activates when your task matches its description.
Details
Platform: Claude
Category: Code & Development
Invocation: user-invocable
Model: any
Maintainer: wshobson
License: MIT