Evals
8 items across the graph tagged with Evals.
From the graph · 8
repo
mastra-ai/mastra
→ repo · Mastra is the modern TypeScript framework for AI-powered applications and agents.
Kiln-AI/Kiln
→ repo · Build, Evaluate, and Optimize AI Systems. Includes evals, RAG, agents, fine-tuning, synthetic data generation, dataset management, MCP, and more.
ombharatiya/ai-system-design-guide
→ repo · AI system design guide for engineers building production AI systems and evals.
future-agi/future-agi
→ repo · Open-source, end-to-end platform for evaluating, observing, and improving LLM and AI agent applications. Tracing · Evals · Simulations · Datasets · Gateway · Gu…
hud-evals/hud-python
→ repo · RL environments + evals for AI agents. Define once, train anything.
spences10/my-pi
→ repo · Composable pi coding agent with MCP, LSP, agent chains, prompt presets, and local eval telemetry
dustalov/evalica
→ repo · Evalica, your favourite evaluation toolkit
METR/hawk
→ Run Inspect AI evals in the cloud
