Atla is an AI evaluator trained to test and evaluate generative AI applications. Our models help developers find AI mistakes at scale and build more reliable GenAI applications. Atla is available via our simple API.
Atla supports evaluations for a wide range of generative AI applications, including chat, RAG, agents, and content generation. You can use our models to evaluate your application’s performance, whether you’re building on OpenAI, Anthropic, Mistral, Meta, or your own LLM.
Creating an evaluation is as simple as passing the user’s input and your model’s response, along with the optional context and reference fields. For full details, see the API reference.
from atla import Atla

client = Atla()

client.evaluation.create(
    input="What is the capital of the United Kingdom?",
    response="The capital of the United Kingdom is London.",
    context="The UK is a country in Europe, with a population of 66 million and its capital is London.",
    reference="The capital of the United Kingdom is London.",
    metrics=["precision"],
)

# Score: 5
# Critique: The response 'The capital of the United Kingdom is London.' directly and precisely answers the instruction...
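Since context and reference are optional, it can help to assemble the arguments in one place and include those fields only when you have them. The following is a minimal sketch and not part of the Atla SDK: the helper name `build_evaluation_kwargs` and its validation rules are hypothetical, and the field names are taken from the example above.

```python
from typing import Any, Dict, List, Optional


def build_evaluation_kwargs(
    user_input: str,
    response: str,
    metrics: List[str],
    context: Optional[str] = None,
    reference: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for an evaluation request.

    Hypothetical helper: it only builds a dict and does not call Atla.
    """
    # Assumption: input and response are always needed, so reject empty values early.
    if not user_input:
        raise ValueError("user_input must be a non-empty string")
    if not response:
        raise ValueError("response must be a non-empty string")

    kwargs: Dict[str, Any] = {
        "input": user_input,
        "response": response,
        "metrics": list(metrics),
    }

    # Include the optional fields only when they were provided.
    if context is not None:
        kwargs["context"] = context
    if reference is not None:
        kwargs["reference"] = reference

    return kwargs


example_kwargs = build_evaluation_kwargs(
    user_input="What is the capital of the United Kingdom?",
    response="The capital of the United Kingdom is London.",
    metrics=["precision"],
    reference="The capital of the United Kingdom is London.",
)
```

The resulting dictionary could then be unpacked into the call shown earlier, for example `client.evaluation.create(**example_kwargs)`, assuming the client accepts these fields as keyword arguments in the way the example suggests.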