Claude
Code & Development
Trust: 55/100 (Fair)behavioral-evals Guide
Guidance for creating, running, fixing, and promoting behavioral evaluations. Use when verifying agent decision logic, debugging failures, debugging prompt steering, or adding workspace regression tests.
104,658 starsby google-gemini
When to use behavioral-evals
Guidance for creating, running, fixing, and promoting behavioral evaluations. Use when verifying agent decision logic, debugging failures, debugging prompt steering, or adding workspace regression tests.
How to use behavioral-evals
behavioral-evals is a Claude skill in the SKILL.md format. Add it to your Claude environment from the source repository below, then it activates as a user-invocable skill when your task matches its description.
Details
PlatformClaude
CategoryCode & Development
Invocationuser-invocable
Modelany
Maintainergoogle-gemini
LicenseApache-2.0