pxi-eval-dataset Guide

Name: pxi-eval-dataset
Author: arize-ai

Generate synthetic evaluation datasets for the PXI eval harness (evals/pxi/). Use whenever the user asks to create, author, draft, expand, or audit an eval dataset for a PXI tool, skill, or behavior — including phrases like "write evals for <tool>", "test PXI behavior", "synthetic dataset for PXI", "cover this tool with eval examples", or "find gaps in our PXI eval coverage". Inspects whichever evaluators currently live under evals/pxi/evaluators/ at use time and pauses to recommend a new evaluator if the behavior under test can't be scored by what already exists.
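The evaluator-inspection step in the description can be pictured with a short sketch. It assumes the repository is checked out locally and that each evaluator is a file somewhere under evals/pxi/evaluators/. The helper name `list_evaluators` and the choice to skip hidden files are illustrative assumptions, not part of the skill itself.

```python
from pathlib import Path


def list_evaluators(repo_root: str) -> list[str]:
    """List files under evals/pxi/evaluators/ as sorted relative paths.

    Hypothetical helper: the skill's own inspection logic is not shown in
    the source, so this only reports whatever files exist at call time.
    """
    evaluators_dir = Path(repo_root) / "evals" / "pxi" / "evaluators"
    if not evaluators_dir.is_dir():
        # No evaluators directory: nothing can score the behavior yet.
        return []

    found = []
    for path in evaluators_dir.rglob("*"):
        relative = path.relative_to(evaluators_dir)
        # Skip directories and anything inside a hidden file or folder.
        if path.is_file() and not any(part.startswith(".") for part in relative.parts):
            found.append(relative.as_posix())
    return sorted(found)
```

An empty result would correspond to the case where the skill pauses and recommends adding an evaluator before any dataset is written.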

9,866 stars, by arize-ai

When to use pxi-eval-dataset

How to use pxi-eval-dataset

pxi-eval-dataset is a Claude skill in the SKILL.md format. Add it to your Claude environment from the source listed below. Once installed, it is available as a user-invocable skill and activates when your task matches its description.

Skill source

https://raw.githubusercontent.com/arize-ai/phoenix/main/.agents/skills/pxi-eval-dataset/SKILL.md
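One way to add the skill is to download the SKILL.md file into a skills directory. The sketch below assumes a project-level layout of `.claude/skills/<skill-name>/SKILL.md`; that path and the helper names `skill_destination` and `install_skill` are assumptions for illustration, so adjust them to wherever your Claude environment loads skills from.

```python
from pathlib import Path
from urllib.request import urlopen

# URL copied from the "Skill source" entry above.
SKILL_URL = (
    "https://raw.githubusercontent.com/arize-ai/phoenix/main/"
    ".agents/skills/pxi-eval-dataset/SKILL.md"
)


def skill_destination(project_root: str, skill_name: str = "pxi-eval-dataset") -> Path:
    """Build the target path for the skill file.

    Assumed layout: <project_root>/.claude/skills/<skill_name>/SKILL.md
    """
    return Path(project_root) / ".claude" / "skills" / skill_name / "SKILL.md"


def install_skill(project_root: str, url: str = SKILL_URL) -> Path:
    """Download SKILL.md from `url` and write it to the assumed skills path."""
    dest = skill_destination(project_root)
    dest.parent.mkdir(parents=True, exist_ok=True)
    with urlopen(url, timeout=30) as response:
        dest.write_bytes(response.read())
    return dest
```

Calling `install_skill(".")` from a project root would fetch the file over the network and return the path it was written to.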

Details

Platform: Claude
Category: Testing & QA
Invocation: user-invocable
Model: any
Maintainer: arize-ai
License: NOASSERTION
