Continuous improvement system for the ToolUniverse tools, skills, and plugin. Run benchmarks, diagnose failures, route fixes to devtu skills, then retest. Use after skill optimization or tool additions, or as a regression check.
This skill does not declare a tool allowlist. The agent host applies whatever default tools are available at runtime.
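The benchmark, diagnose, route, retest cycle in the description can be sketched as a small loop. This is an illustrative sketch only, not the skill's actual implementation: `BenchmarkResult`, `diagnose`, `improvement_cycle`, the failure categories, and the keyword rules are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class BenchmarkResult:
    """One benchmark case outcome (hypothetical shape)."""
    case_id: str
    passed: bool
    error: str = ""


def diagnose(result: BenchmarkResult) -> str:
    """Map a failure to a coarse category using assumed keyword rules."""
    text = result.error.lower()
    if "timeout" in text:
        return "tool"
    if "schema" in text:
        return "skill"
    return "unknown"


def improvement_cycle(
    run_benchmarks: Callable[[], List[BenchmarkResult]],
    apply_fix: Callable[[str, BenchmarkResult], None],
    max_rounds: int = 3,
) -> List[int]:
    """Run benchmarks, route each failure to a fixer, and retest.

    Returns the number of failures seen in each round, so a caller can
    check whether the count is trending toward zero.
    """
    failures_per_round: List[int] = []
    for _ in range(max_rounds):
        results = run_benchmarks()
        failures = [r for r in results if not r.passed]
        failures_per_round.append(len(failures))
        if not failures:
            break  # everything passes, nothing left to route
        for failure in failures:
            # In the real skill this step would hand off to a devtu skill;
            # here it is just a caller-supplied callback.
            apply_fix(diagnose(failure), failure)
    return failures_per_round
```

Capping the loop with `max_rounds` keeps a fix that never lands from retesting forever; the per-round failure counts double as a simple regression signal.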
SKILL.md / Manifest
https://raw.githubusercontent.com/mims-harvard/tooluniverse/main/skills/devtu-benchmark-harness/SKILL.md
Registry
github (via claudemarketplaces.com)