Create Evaluation
Run an evaluation directly via the Atla evaluation service.
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
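As a minimal sketch, the header can be built like this in Python; the ATLA_API_TOKEN environment variable name is an illustrative assumption, not part of this reference:

```python
import os

# Auth token, read from an illustratively named environment variable.
token = os.environ["ATLA_API_TOKEN"]

# Bearer authentication header of the form described above.
headers = {"Authorization": f"Bearer {token}"}
```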
Body
A request to an Atla evaluator via the /eval endpoint.
The input given to a model which produced the model_output to be evaluated.
The output of the model which is being evaluated. This is the model_output produced from the model_input.
The ID or name of the Atla evaluator model to use. This may point to a specific model version or a model family. If a model family is provided, the default model version for that family will be used.
The criteria used to evaluate the model_output. Only one of evaluation_criteria or metric_name can be provided.
Example: "Give a score of 1 if the answer is correct, 0 otherwise."
The name of the metric to use for the evaluation. Only one of evaluation_criteria or metric_name can be provided.
Example: "my_metric"
The version of the prompt to use for the evaluation. If not provided, the active prompt version will be used.
Required range: x > 0
Example: 1
Any additional context provided to the model which received the model_input and produced the model_output.
An optional reference ("ground-truth" / "gold standard") answer against which to evaluate the model_output.
A list of few-shot examples for the evaluation.
Each item in the list is a few-shot example for a metric.
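Putting the body fields together, here is a hedged request sketch using Python's requests library. Only model_input, model_output, evaluation_criteria, and metric_name are field names confirmed by this reference; the base URL, the model_id key, and the example values are assumptions to verify against the full API schema:

```python
import os

import requests

# Assumed base URL; substitute the host of your Atla deployment.
BASE_URL = "https://api.atla-ai.com"

body = {
    # The input that produced the output under evaluation.
    "model_input": "What is the capital of France?",
    # The model output being evaluated.
    "model_output": "The capital of France is Paris.",
    # Evaluator model ID or family name (key and value are assumptions).
    "model_id": "atla-selene",
    # Exactly one of evaluation_criteria or metric_name may be provided.
    "evaluation_criteria": "Give a score of 1 if the answer is correct, 0 otherwise.",
}

resp = requests.post(
    f"{BASE_URL}/eval",
    headers={"Authorization": f"Bearer {os.environ['ATLA_API_TOKEN']}"},
    json=body,
)
resp.raise_for_status()
```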
Response
A response from an Atla evaluator via the /eval endpoint.
The ID of the request the response is for.
Example: "123e4567-e89b-12d3-a456-426614174000"
The result of the evaluation.
Example: "success"