OpenAI-compatible Evaluations
Create Evaluation
Use the OpenAI chat completions endpoint to run evaluations.

POST /v1/chat/completions
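Because the endpoint follows the OpenAI chat completions schema, the official openai Python client can also be pointed at it. The sketch below assumes the base URL https://api.atla-ai.com/v1 and an ATLA_API_KEY environment variable; verify both against your Atla dashboard.

import os
from openai import OpenAI

# Assumed base URL and env var name; verify both in your Atla dashboard.
client = OpenAI(
    base_url="https://api.atla-ai.com/v1",
    api_key=os.environ["ATLA_API_KEY"],
)

completion = client.chat.completions.create(
    model="atla-selene",
    messages=[{"role": "user", "content": "<your evaluation prompt>"}],
)
print(completion.choices[0].message.content)

The examples that follow use the atla SDK instead, which exposes the same chat completions interface.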
from atla import Atla

client = Atla()

eval_prompt = """You are an expert evaluator.
You have been asked to evaluate an LLM's response to a given instruction.
Model input:
What is the capital of France?
Model output:
Paris
Score rubric:
Evaluate the answer based on its factual correctness. Assign a score of 1 if the answer is factually correct, otherwise assign a score of 0. Only scores of 0 or 1 are allowed.
Your response should strictly follow this format:
**Reasoning:** <your feedback>
**Result:** <your score>
"""

chat_completion = client.chat.completions.create(
    model="atla-selene",
    messages=[{"role": "user", "content": eval_prompt}],
)

print(chat_completion.choices[0].message.content)
{
  "id": "123e4567-e89b-12d3-a456-426614174000",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "**Reasoning:** The model output is factually correct and well-reasoned. It does not provide any additional information not directly supported by the input or context provided.\n\n**Result:** 1",
        "role": "assistant"
      }
    }
  ],
  "created": 694303200,
  "model": "atla-selene",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 10,
    "prompt_tokens": 10,
    "total_tokens": 20
  }
}
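The message content follows the **Reasoning:** / **Result:** format requested in the prompt, so the feedback and score can be recovered with a small parser. This is a minimal sketch, not part of the SDK; parse_evaluation is a hypothetical helper.

import re

def parse_evaluation(content: str) -> tuple[str, str]:
    # Split the reply into its reasoning and result parts, per the
    # format the prompt asked for.
    match = re.search(
        r"\*\*Reasoning:\*\*\s*(.*?)\s*\*\*Result:\*\*\s*(\S+)",
        content,
        re.DOTALL,
    )
    if match is None:
        raise ValueError(f"Unexpected evaluator output: {content!r}")
    return match.group(1), match.group(2)

reasoning, score = parse_evaluation(chat_completion.choices[0].message.content)
print(score)  # -> "1" for the example response above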
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
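With the atla Python SDK, the token can be supplied explicitly when constructing the client. A minimal sketch; the ATLA_API_KEY environment variable name is an assumption, so confirm it in the SDK documentation.

import os
from atla import Atla

# Pass the auth token explicitly; ATLA_API_KEY as the variable name
# is an assumption — confirm the exact name in the SDK documentation.
client = Atla(api_key=os.environ["ATLA_API_KEY"])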
Body
application/json
A request to an Atla evaluator via the /eval/chat/completions endpoint.
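For clients without an SDK, the same request can be issued as a plain HTTP POST carrying the Bearer header and a JSON body in the OpenAI chat completions shape. A sketch using httpx; the base URL is an assumption.

import os
import httpx

# Base URL is an assumption — use the value from your Atla dashboard.
response = httpx.post(
    "https://api.atla-ai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['ATLA_API_KEY']}"},
    json={
        "model": "atla-selene",
        "messages": [{"role": "user", "content": "<your evaluation prompt>"}],
    },
    timeout=60.0,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])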
Response
200 application/json
Success

The response is of type object.