behavioral-evals

Community

Guidance for creating, running, fixing, and promoting behavioral evaluations. Use when verifying agent decision logic, debugging failures, debugging prompt steering, or adding workspace regression tests.

Claude

104,658 stars, updated 1 month ago

Allowed Tools

This skill does not declare a tool allowlist. The agent host applies whatever default tools are available at runtime.

Source

SKILL.md / Manifest

https://raw.githubusercontent.com/google-gemini/gemini-cli/main/.gemini/skills/behavioral-evals/SKILL.md

Registry

github (via claudemarketplaces.com)

Trust Score

53 (Fair)

Verification: 10/30

Scope Tightness
