Automated testing for Alibaba Cloud content moderation and AI guardrails. The skill tests sample content against moderation APIs, compares multiple services, and tracks requestId/traceId for each call. It supports manual annotation, deep false-negative (miss) analysis, cross-batch comparison, and AI guardrails testing (prompt injection, sensitive data, jailbreak), and it generates alignment reports. Use it when the user asks about content safety, moderation testing, moderation strategy, label configuration, content review, batch safety checks, miss analysis, AI guardrails, prompt injection detection, or safety guardrails testing.
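To make the false-negative analysis concrete, here is a minimal sketch of how misses could be tallied per service from manually annotated samples. It does not call any Alibaba Cloud API and is not taken from the skill itself; the `Sample` type, its field names, and `miss_report` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One test sample (hypothetical structure, not the skill's own schema)."""
    sample_id: str
    annotated_unsafe: bool       # manual annotation, treated as ground truth
    verdicts: dict               # service name -> True if the service flagged it
    request_ids: dict            # service name -> requestId, kept for tracing


def miss_report(samples):
    """Count false negatives per service among samples annotated as unsafe."""
    report = {}
    for s in samples:
        if not s.annotated_unsafe:
            continue  # safe samples cannot produce a false negative
        for service, flagged in s.verdicts.items():
            entry = report.setdefault(service, {"unsafe_total": 0, "missed": []})
            entry["unsafe_total"] += 1
            if not flagged:
                # Keep the requestId so the miss can be traced back later.
                entry["missed"].append((s.sample_id, s.request_ids.get(service)))
    for entry in report.values():
        entry["miss_rate"] = len(entry["missed"]) / entry["unsafe_total"]
    return report
```

Comparing the `miss_rate` values across services gives a rough side-by-side view, and the stored requestIds point at the individual calls worth inspecting.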
This skill does not declare a tool allowlist. The agent host applies whatever default tools are available at runtime.
SKILL.md / Manifest
https://raw.githubusercontent.com/aliyun/alibabacloud-aiops-skills/master/skills/security/lvwang/alibabacloud-safety-checker/SKILL.md
Registry
github (via claudemarketplaces.com)