Score your AI outputs.

A single API call to evaluate accuracy, relevance, safety, and more. Ship AI with confidence.

terminal
$ curl -X POST https://evalkit.dev/api/v1/eval \
  -H "Authorization: Bearer ek_live_abc123" \
  -H "Content-Type: application/json" \
  -d '{"output": "Paris is the capital of France",
       "criteria": ["accuracy", "relevance"]}'

{
  "overall_score": 0.95,
  "criteria": {
    "accuracy": { "score": 0.98, "reasoning": "Factually correct" },
    "relevance": { "score": 0.92, "reasoning": "Directly answers the question" }
  }
}
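
One way to act on a response like the one above is to gate on the scores before shipping an output. The following is a minimal sketch, not an official client: the field names (overall_score, criteria, score) are taken from the sample response, while the function names and the 0.9 threshold are assumptions chosen for illustration.

```python
import json

# Sample response copied from the example above.
SAMPLE_RESPONSE = """
{
  "overall_score": 0.95,
  "criteria": {
    "accuracy": { "score": 0.98, "reasoning": "Factually correct" },
    "relevance": { "score": 0.92, "reasoning": "Directly answers the question" }
  }
}
"""


def failing_criteria(response_text, threshold=0.9):
    """Return {criterion: score} for every criterion scoring below threshold.

    The threshold is a hypothetical cutoff, not a value defined by the API.
    """
    data = json.loads(response_text)
    return {
        name: result["score"]
        for name, result in data["criteria"].items()
        if result["score"] < threshold
    }


def passes(response_text, threshold=0.9):
    """True when the overall score and every criterion meet the threshold."""
    data = json.loads(response_text)
    return (
        data["overall_score"] >= threshold
        and not failing_criteria(response_text, threshold)
    )


if __name__ == "__main__":
    print(passes(SAMPLE_RESPONSE))                  # default threshold of 0.9
    print(failing_criteria(SAMPLE_RESPONSE, 0.95))  # stricter threshold
```

Checking each criterion separately, rather than only the overall score, keeps a single weak dimension from being hidden by a high average.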

Integrate in minutes

curl -X POST https://evalkit.dev/api/v1/eval \
  -H "Authorization: Bearer ek_live_abc123" \
  -H "Content-Type: application/json" \
  -d '{
    "output": "Your LLM output here",
    "criteria": ["accuracy", "relevance", "safety"]
  }'
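
The same request can be made from application code. The sketch below mirrors the curl call using only the Python standard library; the URL, headers, and JSON fields come from the example above, while the function names build_eval_request and run_eval are hypothetical and the API key is a placeholder.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
EVAL_URL = "https://evalkit.dev/api/v1/eval"


def build_eval_request(api_key, output, criteria):
    """Build (but do not send) a POST request equivalent to the curl call."""
    payload = json.dumps({"output": output, "criteria": list(criteria)})
    return urllib.request.Request(
        EVAL_URL,
        data=payload.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_eval(api_key, output, criteria, timeout=10):
    """Send the request and return the decoded JSON response.

    Assumes the service replies with a JSON body like the sample shown
    earlier; error handling is left out of this sketch.
    """
    request = build_eval_request(api_key, output, criteria)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Placeholder key; substitute a real one before running.
    result = run_eval(
        "ek_live_abc123",
        "Your LLM output here",
        ["accuracy", "relevance", "safety"],
    )
    print(result)
```

Separating request construction from sending makes the payload easy to inspect or unit test without a network call.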

Simple, transparent pricing

Start free. Scale as you grow.

Free

$0/mo
  • 100 evals/month
  • 1 API key
  • Standard criteria
  • Fast mode only
Start Free

Pro (Most Popular)

$49/mo
  • 5,000 evals/month
  • Unlimited API keys
  • Custom criteria
  • Batch API
  • Fast + Thorough modes
Get Started

Scale

$199/mo
  • 25,000 evals/month
  • Everything in Pro
  • Priority support
  • Webhooks
  • Team seats
Contact Sales