2 items across the graph, tagged with LLM Eval.
Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, DeepSeek, and more. Simpl…
🐢 Open-Source Evaluation & Testing library for LLM Agents