llm-evaluation Guide

Name: llm-evaluation
Author: wshobson

Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or establishing evaluation frameworks.
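As an illustration of the "automated metrics" part of that description, here is a minimal sketch of two common reference-based scores: exact match and token-level F1. It is not taken from the skill itself. The function names (`normalize`, `exact_match`, `token_f1`, `evaluate`) and the whitespace tokenizer are assumptions chosen for this example, and a real evaluation would likely need stronger normalization and more metrics.

```python
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase and split on whitespace (a deliberately simple, assumed tokenizer)."""
    return text.lower().split()


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized token sequences are identical, else 0.0."""
    return 1.0 if normalize(prediction) == normalize(reference) else 0.0


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall against the reference."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred == ref)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def evaluate(pairs: list[tuple[str, str]]) -> dict[str, float]:
    """Average both metrics over (prediction, reference) pairs."""
    if not pairs:
        raise ValueError("need at least one (prediction, reference) pair")
    n = len(pairs)
    return {
        "exact_match": sum(exact_match(p, r) for p, r in pairs) / n,
        "token_f1": sum(token_f1(p, r) for p, r in pairs) / n,
    }
```

Calling `evaluate` on a list of (model output, expected answer) pairs gives one averaged number per metric, which makes it easy to compare prompt or model versions. Scores like these capture surface overlap only, which is one reason the description pairs automated metrics with human feedback and benchmarking.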

36,031 stars, by wshobson

When to use llm-evaluation

How to use llm-evaluation

llm-evaluation is a Claude skill in the SKILL.md format. Add it to your Claude environment from the source listed below. Once added, it is user-invocable and activates when your task matches its description.

Skill source

https://raw.githubusercontent.com/wshobson/agents/main/plugins/llm-application-dev/skills/llm-evaluation/SKILL.md

Details

Platform: Claude

Category: AI & ML

Invocation: user-invocable

Model: any

Maintainer: wshobson

License: MIT
