Topic

LLM Eval

2 items across the graph, tagged with LLM Eval.

From the graph · 2

Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, DeepSeek, and more. Simpl…


Giskard-AI/giskard-oss

🐢 Open-Source Evaluation & Testing library for LLM Agents


