constitutional-ai Guide

Name: constitutional-ai
Author: zechenzhangagi

Anthropic's method for training a harmless AI assistant through self-improvement. It has two phases: supervised learning on the model's own self-critiqued and revised responses, then RLAIF (reinforcement learning from AI feedback). Use it for safety alignment and for reducing harmful outputs without human harmfulness labels. Powers Claude's safety system.
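The two phases can be sketched as data-generation steps. This is a minimal illustration, not the skill's or Anthropic's implementation: the `PRINCIPLES` list, the prompt wording, the function names, and `toy_model` (a deterministic stand-in for a real language model call) are all assumptions made for the example, and the fine-tuning and RL steps themselves are omitted.

```python
from typing import Callable, List, Tuple

# Hypothetical principles for illustration; not the actual constitution text.
PRINCIPLES: List[str] = [
    "Identify ways the response is harmful, unethical, or dangerous.",
    "Identify ways the response is dishonest or misleading.",
]

Model = Callable[[str], str]  # any function mapping a prompt string to text


def critique_and_revise(model: Model, prompt: str,
                        principles: List[str]) -> Tuple[str, str]:
    """Phase 1 data: draft, then critique and revise once per principle.

    Returns a (prompt, final_revision) pair for supervised fine-tuning.
    """
    response = model(prompt)
    for principle in principles:
        critique = model(
            f"Prompt: {prompt}\nResponse: {response}\n"
            f"Critique request: {principle}"
        )
        response = model(
            f"Prompt: {prompt}\nResponse: {response}\nCritique: {critique}\n"
            "Revision request: Rewrite the response to address the critique."
        )
    return prompt, response


def ai_preference(judge: Model, prompt: str, a: str, b: str,
                  principle: str) -> Tuple[str, str]:
    """Phase 2 data: an AI judge picks the better response under a principle.

    Returns (chosen, rejected), usable for training a preference model.
    """
    verdict = judge(
        f"Prompt: {prompt}\n(A) {a}\n(B) {b}\n"
        f"Principle: {principle}\n"
        "Which response better follows the principle? Answer A or B."
    )
    if verdict.strip().upper().startswith("A"):
        return a, b
    return b, a


def toy_model(text: str) -> str:
    """Deterministic stand-in so the sketch runs without a real model."""
    if "Answer A or B" in text:
        return "A"
    if "Revision request" in text:
        return "revised answer"
    if "Critique request" in text:
        return "a critique"
    return "draft answer"


if __name__ == "__main__":
    print(critique_and_revise(toy_model, "example prompt", PRINCIPLES))
    print(ai_preference(toy_model, "example prompt", "x", "y", PRINCIPLES[0]))
```

In a real pipeline, `toy_model` would be replaced by calls to an actual model, the phase 1 pairs would feed supervised fine-tuning, and the phase 2 pairs would train a preference model that supplies the reward signal for RL.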

8,991 stars by zechenzhangagi

When to use constitutional-ai

How to use constitutional-ai

constitutional-ai is a Claude skill in the SKILL.md format. Add it to your Claude environment from the source repository below. Once added, it is available as a user-invocable skill and activates when your task matches its description.

Skill source

https://raw.githubusercontent.com/zechenzhangagi/ai-research-skills/main/07-safety-alignment/constitutional-ai/SKILL.md

Details

Platform: Claude

Category: AI & ML

Invocation: user-invocable

Model: any

Maintainer: zechenzhangagi

License: MIT
