Topic

Evals

8 items across the graph, tagged with Evals.

From the graph · 8

Mastra is the modern TypeScript framework for AI-powered applications and agents.

Build, Evaluate, and Optimize AI Systems. Includes evals, RAG, agents, fine-tuning, synthetic data generation, dataset management, MCP, and more.

→repo

ombharatiya/ai-system-design-guide

AI system design guide for engineers building production AI systems and evals.

→repo

future-agi/future-agi

Open-source, end-to-end platform for evaluating, observing, and improving LLM and AI agent applications. Tracing · Evals · Simulations · Datasets · Gateway · Gu…

→repo

hud-evals/hud-python

RL environments + evals for AI agents. Define once, train anything.

→repo

spences10/my-pi

Composable pi coding agent with MCP, LSP, agent chains, prompt presets, and local eval telemetry

→repo

dustalov/evalica

Evalica, your favourite evaluation toolkit

→repo

METR/hawk

Run Inspect AI evals in the cloud

→
