POST /v1/eval
from atla import Atla

client = Atla()

# Direct evaluation
evaluation = client.evaluation.create(
    model_id="atla-selene",
    model_input="What is the capital of France?",
    model_output="Paris",
    evaluation_criteria="Assign a score of 1 if the answer is factually correct, otherwise assign a score of 0.",
).result.evaluation

print(evaluation)

# Metric-based evaluation
evaluation = client.evaluation.create(
    model_id="atla-selene",
    model_input="What is the capital of France?",
    model_output="Paris",
    metric_name="my_custom_correctness_metric",
).result.evaluation

print(evaluation)
{
  "request_id": "123e4567-e89b-12d3-a456-426614174000",
  "status": "success",
  "result": {
    "model_id": "atla-selene-20250214",
    "evaluation": {
      "score": "1",
      "critique": "The model output is factually correct and well-reasoned. It does not provide any additional information not directly supported by the input or context provided."
    }
  }
}

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
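As a minimal sketch, the header can be built like so. The `ATLA_API_KEY` environment variable name and the fallback value are illustrative placeholders, not part of the API specification.

```python
import os

# Read the auth token from the environment (placeholder variable name).
token = os.environ.get("ATLA_API_KEY", "my-token")

# Bearer authentication header of the form described above.
headers = {
    "Authorization": f"Bearer {token}",
    "Content-Type": "application/json",
}

print(headers["Authorization"])
```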

Body

application/json

A request to an Atla evaluator via the /eval endpoint.

model_input
string
required

The input given to a model which produced the model_output to be evaluated.

model_output
string
required

The output of the model being evaluated, i.e. the model's response to the model_input.

model_id
string
required

The ID or name of the Atla evaluator model to use. This may point to a specific model version or a model family. If a model family is provided, the default model version for that family will be used.

evaluation_criteria
string

The criteria used to evaluate the model_output. Only one of evaluation_criteria or metric_name can be provided.

metric_name
string

The name of the metric to use for the evaluation. Only one of evaluation_criteria or metric_name can be provided.

model_context
string

Any additional context provided to the model which received the model_input and produced the model_output.

expected_model_output
string

An optional reference ("ground-truth" / "gold standard") answer against which to evaluate the model_output.

few_shot_examples
object[]

A list of few-shot examples for the evaluation.
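Putting the fields above together, a request body might look like the following sketch. Field names are taken from the documentation; the example values (including the model_context and expected_model_output strings) are illustrative. Note that evaluation_criteria is used here, so metric_name is omitted, since only one of the two may be provided.

```python
import json

# Request body for /eval, using the documented field names.
# Exactly one of evaluation_criteria / metric_name may be set.
payload = {
    # Required fields
    "model_id": "atla-selene",
    "model_input": "What is the capital of France?",
    "model_output": "Paris",
    # One of evaluation_criteria / metric_name
    "evaluation_criteria": (
        "Assign a score of 1 if the answer is factually correct, "
        "otherwise assign a score of 0."
    ),
    # Optional fields
    "model_context": "France is a country in Western Europe. Its capital is Paris.",
    "expected_model_output": "Paris",
}

body = json.dumps(payload)
print(body)
```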

Response

200
application/json
Success

A response from an Atla evaluator via the /eval endpoint.

request_id
string
required

The ID of the request the response is for.

result
object
required

The result of the evaluation.

status
enum<string>

Response status enum.

Available options:
success,
error
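Since status can be either success or error, a caller working with the raw JSON response should check it before reading result. A minimal sketch, using the example response from above (critique shortened for brevity):

```python
import json

# Example /eval response, copied from the documentation above.
raw = """
{
  "request_id": "123e4567-e89b-12d3-a456-426614174000",
  "status": "success",
  "result": {
    "model_id": "atla-selene-20250214",
    "evaluation": {
      "score": "1",
      "critique": "The model output is factually correct and well-reasoned."
    }
  }
}
"""

response = json.loads(raw)

# Only read the result once the status is confirmed to be "success".
if response["status"] == "success":
    evaluation = response["result"]["evaluation"]
    print(evaluation["score"])
else:
    raise RuntimeError(f"Evaluation failed for request {response['request_id']}")
```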