Learn about Selene
Overview
Selene is a family of state-of-the-art LLM Judge models built by Atla. Selene is trained specifically to evaluate generative AI responses, and excels at evaluating LLM outputs across tasks involving language, coding, math, chat, RAG contexts, and more. You can use Selene to evaluate your LLM outputs - whether you’re building on OpenAI, Anthropic, Mistral, Meta, or your own LLM.
Selene evaluates your outputs against your scoring criteria and generates a score and a Chain-of-Thought critique. We have built Selene with scale and flexibility in mind. You can use the generated score and critique for a variety of use cases:
- To select which model works best for your use case
- To improve your prompts
- To fine-tune your own LLM
- As a guardrail in production to block low-quality outputs. You can go a step further and use the critiques to regenerate failed outputs on the fly.
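The guardrail pattern above can be sketched as a simple retry loop. This is a minimal illustration, not Atla's API: the `generate` and `judge` functions are hypothetical stand-ins (in practice, `generate` would call your LLM and `judge` would call Selene, which returns a score and a critique).

```python
def generate(prompt: str) -> str:
    # Hypothetical LLM call; returns a draft answer.
    return f"Draft answer to: {prompt}"

def judge(prompt: str, response: str) -> tuple[int, str]:
    # Hypothetical Selene call; returns (score, critique) on a 1-5 scale.
    # Here we flag short responses, just to exercise the retry path.
    if len(response) < 40:
        return 2, "The response is too brief to address the question."
    return 5, "The response fully addresses the question."

def guarded_generate(prompt: str, threshold: int = 4, max_retries: int = 2) -> str:
    """Generate a response; if the judge scores it below the threshold,
    feed the critique back into the prompt and regenerate."""
    response = generate(prompt)
    for _ in range(max_retries):
        score, critique = judge(prompt, response)
        if score >= threshold:
            return response
        # Regenerate, steering the model with the judge's critique.
        response = generate(f"{prompt}\n\nRevise to address this feedback: {critique}")
    return response
```

The same loop works whether the judge blocks the output entirely (raise instead of retry) or steers regeneration, as shown here.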