API Documentation
Get Metric
Get a metric by ID.
{
"request_id": "123e4567-e89b-12d3-a456-426614174000",
"status": "success",
"metric": {
"name": "my_metric",
"description": "An example metric demonstrating functionality.",
"metric_type": "binary",
"required_fields": [
"model_input",
"model_output",
"model_context",
"expected_model_output"
],
"active_prompt_version": 1,
"prompts": {
"1": {
"content": "This is an example prompt for the metric. It is active.",
"created_at": "2025-01-01T12:34:56.789000Z",
"updated_at": "2025-01-01T12:34:56.789000Z",
"version": 1
},
"2": {
"content": "This is an updated example prompt for the metric.",
"created_at": "2025-01-01T12:34:56.789000Z",
"updated_at": "2025-01-01T12:34:56.789000Z",
"version": 2
}
},
"few_shot_examples": [
{
"model_input": "Few-shot `model_input`.",
"model_output": "Few-shot `model_output`.",
"model_context": "Few-shot `model_context`.",
"expected_model_output": "Few-shot `expected_model_output`.",
"score": "1",
"critique": "Critique for the few-shot example explaining why the score is 1."
}
],
"_id": "<string>",
"project_id": "<string>",
"created_at": "2025-01-01T12:34:56.789Z",
"updated_at": "2025-01-01T12:34:56.789Z"
}
}
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
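As a rough sketch, the header could be attached to a request like this using only Python's standard library. The base URL and path below are placeholders and are not taken from this page; substitute the real endpoint. The request is built but not sent.

```python
import urllib.request

# Hypothetical placeholder: this page does not state the endpoint URL.
BASE_URL = "https://api.example.com"


def build_get_metric_request(metric_id: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request carrying a Bearer token."""
    # The path layout is an assumption made for illustration only.
    url = f"{BASE_URL}/metrics/{metric_id}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


request = build_get_metric_request("example-metric-id", "my-auth-token")
```

Passing the resulting object to `urllib.request.urlopen` would send it; that step is left out here because the real URL is not known.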
Path Parameters
The ID of the metric to get.
Response
A response containing a single retrieved metric.
The ID of the request the response is for.
"123e4567-e89b-12d3-a456-426614174000"
The metric retrieved.
The name of the metric. Metric names must contain only lowercase letters, numbers, hyphens, or underscores, and must start with a lowercase letter and end with either a lowercase letter or number. Metric names must be unique within a project.
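The naming rules above can be approximated client-side with a regular expression. This is a sketch, not the service's own validation: it assumes "letters" and "numbers" mean ASCII `a-z` and `0-9`, and the helper name `is_valid_metric_name` is made up for illustration. Uniqueness within a project cannot be checked locally.

```python
import re

# Starts with a lowercase letter, ends with a lowercase letter or digit,
# and contains only lowercase letters, digits, hyphens, or underscores.
METRIC_NAME_PATTERN = re.compile(r"[a-z](?:[a-z0-9_-]*[a-z0-9])?")


def is_valid_metric_name(name: str) -> bool:
    """Return True if the name satisfies the documented character rules."""
    return METRIC_NAME_PATTERN.fullmatch(name) is not None
```

Under these assumptions, `my_metric` passes, while `My_metric` (uppercase letter), `1metric` (starts with a digit), and `metric_` (ends with an underscore) do not.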
"my_metric"
The type of metric.
binary, likert_1_to_5
The ID of the project that the metric belongs to. If the metric is shared, this field will be null.
An optional description of the metric.
"An example metric demonstrating functionality."
The fields that are required for the metric. All metrics must require at least model_input and model_output, which are the default values.
An enum for the fields that can be used as inputs to an Atla evaluator.
model_input, model_output, model_context, expected_model_output
[
"model_input",
"model_output",
"model_context",
"expected_model_output"
]
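The constraints on required_fields described above can be checked with a small helper. This is an illustrative sketch; the function name and the wording of the messages are assumptions, and the service may validate differently.

```python
# Enum values and mandatory fields as listed in the field description.
ALLOWED_FIELDS = {
    "model_input",
    "model_output",
    "model_context",
    "expected_model_output",
}
MANDATORY_FIELDS = {"model_input", "model_output"}


def check_required_fields(fields: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the fields look valid."""
    problems = []
    unknown = sorted(set(fields) - ALLOWED_FIELDS)
    if unknown:
        problems.append(f"unknown fields: {', '.join(unknown)}")
    missing = sorted(MANDATORY_FIELDS - set(fields))
    if missing:
        problems.append(f"missing mandatory fields: {', '.join(missing)}")
    return problems
```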
The version of the prompt that is currently active for the metric.
x > 0
1
The prompts for the metric, keyed by version.
A prompt for a metric.
The version of the prompt.
x > 0
1
The content of the prompt.
"Assign a score of 1 for a funny response, 0 for a boring response."
The creation time of the prompt.
"2025-01-01T12:34:56.789Z"
The last update time of the prompt.
"2025-01-01T12:34:56.789Z"
{
"1": {
"content": "This is an example prompt for the metric. It is active.",
"created_at": "2025-01-01T12:34:56.789000Z",
"updated_at": "2025-01-01T12:34:56.789000Z",
"version": 1
},
"2": {
"content": "This is an updated example prompt for the metric.",
"created_at": "2025-01-01T12:34:56.789000Z",
"updated_at": "2025-01-01T12:34:56.789000Z",
"version": 2
}
}
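Because the prompts object is keyed by version as JSON strings while active_prompt_version is an integer, looking up the active prompt needs a conversion. A minimal sketch, with a hypothetical helper name and a trimmed copy of the example payload:

```python
def get_active_prompt(metric: dict) -> dict:
    """Return the prompt whose version matches active_prompt_version."""
    # JSON object keys are strings, so convert the integer version first.
    version_key = str(metric["active_prompt_version"])
    return metric["prompts"][version_key]


# Trimmed copy of the example metric shown on this page.
metric = {
    "active_prompt_version": 1,
    "prompts": {
        "1": {
            "content": "This is an example prompt for the metric. It is active.",
            "version": 1,
        },
        "2": {
            "content": "This is an updated example prompt for the metric.",
            "version": 2,
        },
    },
}

active = get_active_prompt(metric)
```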
The few-shot examples for the metric. At most 3 examples are allowed.
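The limit of 3 can be checked before sending examples to the API. The sketch below also flags examples that lack one of the metric's required fields; that second check is an assumption about how examples relate to required_fields, not something this page states, and the function name is made up for illustration.

```python
MAX_FEW_SHOT_EXAMPLES = 3  # limit stated in the field description


def check_few_shot_examples(examples: list[dict], required_fields: list[str]) -> list[str]:
    """Return a list of problems; an empty list means no issue was found."""
    problems = []
    if len(examples) > MAX_FEW_SHOT_EXAMPLES:
        problems.append(
            f"too many examples: {len(examples)} > {MAX_FEW_SHOT_EXAMPLES}"
        )
    for index, example in enumerate(examples):
        # Assumption: each example should supply the metric's required fields.
        missing = [field for field in required_fields if field not in example]
        if missing:
            problems.append(f"example {index} is missing: {', '.join(missing)}")
    return problems
```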
A few-shot example for a metric.
The input to the model for the few-shot example.
"Few-shot `model_input`."
The output from the model for the few-shot example.
"Few-shot `model_output`."
The score for the few-shot example.
"1"
The context for the few-shot example.
"Few-shot `model_context`."
The expected output from the model for the few-shot example.
"Few-shot `expected_model_output`."
The critique for the few-shot example.
"Critique for the few-shot example explaining why the score is 1."
The ID of the metric in the database.
The creation time of the metric.
"2025-01-01T12:34:56.789Z"
The last update time of the metric.
"2025-01-01T12:34:56.789Z"
"success"