About DeepEval

DeepEval is an open-source framework for evaluating and testing large language model systems. It is similar to Pytest, but specialized for unit testing LLM outputs.
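For example, an evaluation can be written as an ordinary test function and run with DeepEval's Pytest-style runner. The sketch below is illustrative; the metric choice and threshold are assumptions, not part of this integration.

# A minimal sketch of DeepEval's Pytest-style workflow: save as
# test_example.py and run with `deepeval test run test_example.py`.
# The metric and threshold here are illustrative.
from deepeval import assert_test
from deepeval.metrics import AnswerRelevancyMetric
from deepeval.test_case import LLMTestCase

def test_answer_relevancy():
    test_case = LLMTestCase(
        input="What if these shoes don't fit?",
        actual_output="We offer a 30-day full refund at no extra cost.",
    )
    # assert_test fails the test if any metric scores below its threshold
    assert_test(test_case, [AnswerRelevancyMetric(threshold=0.7)])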

Use Atla With DeepEval

Evaluate your GenAI application with DeepEval metrics, using Selene Mini as the evaluator model.

1. Install Atla and DeepEval

pip install atla deepeval
2. Set Up Atla Selene Mini

We recommend using Ollama to run Selene Mini locally. Note that the ollama pip package provides only the Python client; if you have not already, install the Ollama application itself from https://ollama.com/download.

pip install ollama

ollama pull atla/selene-mini

Then point DeepEval at Ollama's OpenAI-compatible endpoint:

deepeval set-local-model \
  --model-name=atla/selene-mini \
  --base-url="http://localhost:11434/v1/" \
  --api-key="ollama"
Alternatively, you can download Selene Mini from Hugging Face and use it through DeepEval's custom LLM interface.
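If you take that route, DeepEval lets you subclass DeepEvalBaseLLM to wrap any Hugging Face model as the judge. The sketch below is illustrative: the repository id AtlaAI/Selene-1-Mini-Llama-3.1-8B and the generation settings are assumptions to verify against the model card.

# A minimal sketch of a DeepEval custom LLM wrapping Selene Mini from
# Hugging Face. Repo id and generation settings are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from deepeval.models import DeepEvalBaseLLM

class SeleneMini(DeepEvalBaseLLM):
    def __init__(self):
        model_id = "AtlaAI/Selene-1-Mini-Llama-3.1-8B"  # assumed HF repo id
        self.tokenizer = AutoTokenizer.from_pretrained(model_id)
        self.model = AutoModelForCausalLM.from_pretrained(
            model_id, torch_dtype=torch.bfloat16, device_map="auto"
        )

    def load_model(self):
        return self.model

    def generate(self, prompt: str) -> str:
        inputs = self.tokenizer(prompt, return_tensors="pt").to(self.model.device)
        outputs = self.model.generate(**inputs, max_new_tokens=512)
        # Return only the newly generated tokens, not the echoed prompt
        new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
        return self.tokenizer.decode(new_tokens, skip_special_tokens=True)

    async def a_generate(self, prompt: str) -> str:
        return self.generate(prompt)

    def get_model_name(self):
        return "Atla Selene Mini"

An instance can then be passed to any DeepEval metric as the judge, e.g. FaithfulnessMetric(model=SeleneMini()).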
3. Run DeepEval Evaluation

from deepeval.metrics import FaithfulnessMetric
from deepeval.test_case import LLMTestCase

actual_output = "We offer a 30-day full refund at no extra cost."
retrieval_context = ["All customers are eligible for a 30 day full refund at no extra cost."]

# Selene Mini (configured above via `deepeval set-local-model`) judges
# whether the output is faithful to the retrieval context.
metric = FaithfulnessMetric(
    threshold=0.7,        # minimum score for the test case to pass
    include_reason=True,  # ask the judge to explain its score
    async_mode=False      # run synchronously so measure() blocks until scored
)
test_case = LLMTestCase(
    input="What if these shoes don't fit?",
    actual_output=actual_output,
    retrieval_context=retrieval_context
)

metric.measure(test_case)
print(metric.score)   # a score between 0 and 1
print(metric.reason)  # the judge's explanation
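To score several test cases at once, DeepEval also provides an evaluate helper that takes lists of test cases and metrics. A minimal sketch, reusing the metric and test case defined above:

# Batch evaluation with DeepEval's evaluate() helper.
from deepeval import evaluate

evaluate(test_cases=[test_case], metrics=[metric])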
Atla Selene API integration is coming soon to DeepEval.