Evals With Memory

Verified

Three concrete, working ways to run **Mastra evals** against an agent that has **memory** turned on, including observational memory in thread scope. That is the configuration that triggers the error `ObservationalMemory (scope: 'thread') requires a threadId, but none was found in RequestContext or MessageList.` Every script uses only Mastra evals primitives (`runEvals`, `createScorer`, `Dataset.startExperiment`), with no custom evaluation harness. The agent in every script uses `@mastra/memory` with `@mastra/libsql` for storage, and observational memory in thread scope. Each script writes to a fresh temporary database and cleans up after itself. A deterministic mock model is used, so no API key is required and runs are reproducible in CI.
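The failure mode described above can be illustrated without any Mastra dependency. The sketch below is not the Mastra API: `ThreadScopedMemory`, `mockAgent`, `exactMatch`, and `runEvalSketch` are hypothetical stand-ins, written on the assumption that thread-scoped memory keys its state by thread id and rejects calls that carry none. It only shows the general idea of giving every eval item its own thread id while scoring a deterministic mock model.

```typescript
// Hypothetical stand-ins for illustration only; none of these are Mastra APIs.
type EvalItem = { input: string; expected: string };
type RunContext = { threadId?: string; resourceId?: string };

// Thread-scoped memory: stored messages are keyed by threadId.
class ThreadScopedMemory {
  private threads = new Map<string, string[]>();

  remember(ctx: RunContext, message: string): string[] {
    if (!ctx.threadId) {
      // Assumed behavior, modeled on the error quoted in the description.
      throw new Error("thread-scoped memory requires a threadId");
    }
    const history = this.threads.get(ctx.threadId) ?? [];
    history.push(message);
    this.threads.set(ctx.threadId, history);
    return history;
  }
}

// Deterministic mock "model": uppercases its input, so no API key is needed.
function mockAgent(memory: ThreadScopedMemory, input: string, ctx: RunContext): string {
  memory.remember(ctx, input);
  return input.toUpperCase();
}

// Exact-match scorer: 1 when output equals the expected string, otherwise 0.
function exactMatch(output: string, expected: string): number {
  return output === expected ? 1 : 0;
}

// Runs every item against the agent, giving each item its own thread id.
function runEvalSketch(items: EvalItem[]): number[] {
  const memory = new ThreadScopedMemory();
  return items.map((item, i) => {
    const ctx: RunContext = { threadId: `eval-thread-${i}`, resourceId: "eval-user" };
    const output = mockAgent(memory, item.input, ctx);
    return exactMatch(output, item.expected);
  });
}

const scores = runEvalSketch([
  { input: "hello", expected: "HELLO" },   // matches, scores 1
  { input: "memory", expected: "memory" }, // mock uppercases, scores 0
]);
console.log(scores);
```

Using a separate thread id per item keeps one item's remembered messages from leaking into the next, which is one reasonable choice for reproducible runs; sharing a single thread across items would instead test accumulated memory.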

Mastra

Semi-autonomous

Evaluation & Benchmarking

24,625 2,177 Updated 1 month ago

Safety & access

Autonomy

Semi-autonomous

Sandbox-aware

No declared sandbox guidance

Network access

Unspecified

Composition

Models

gpt-4o, claude-3-5-sonnet, gpt-3.5-turbo

Trust Score

64 (Good)

Verification: 12/20

Popularity: 15/15

Maintenance: 8/10

License Clarity: 5/10

Documentation: 10/10

Composition Transparency: 6/10

Sandbox Awareness: 0/10

Autonomy Safety: 8/15

Details

Framework: Mastra

Pattern: Single agent