About Langfuse

Langfuse is an open-source LLM engineering platform that helps teams collaboratively debug, analyze, and iterate on their LLM applications.

Set up Selene as an ‘LLM-as-a-Judge’ on Langfuse

This native integration lets you configure Atla through the Langfuse interface to evaluate your LLM application. Follow the steps below to set this up.

Navigate to your project on cloud.langfuse.com:

Add your Atla API key to your Langfuse project:

  1. Head to Settings > LLM Connections and select + Add new LLM API key.
  2. Set atla as the Provider name and select atla from the LLM adapter dropdown.
  3. The API Base URL will automatically be filled in. Paste your Atla API key beginning with “pk-…” into the API Key field.
  4. Leave Enable default models on.
  5. Click Save new LLM API key.

Use cases

Monitor your app by running evals over traces

Get started with our RAG app example.

This cookbook builds a Gradio application with a complete RAG pipeline. The app is a simple chatbot that answers questions based on a single webpage, which is set to Google’s Q4 2024 earnings call transcript.

Traces are automatically sent to Langfuse and scored by Selene. The example in this cookbook evaluates the retrieval component of the RAG app by assessing ‘context relevance.’

The demo video walks through the same example but evaluates the output of the RAG app by assessing ‘faithfulness.’

Conduct experiments by running evals over datasets

Get started with our experiment that compares model performance to help you choose a base model. (Other experiments you can run include comparing prompts and retrieval logic.)

This cookbook compares the performance of various models (o1-mini, o3-mini, and gpt-4o) on function calling tasks using the Salesforce ShareGPT dataset. The notebook uploads the dataset to Langfuse and sets up experiment runs on different models. The various outputs are automatically evaluated by Selene.

The demo video walks through the same example.
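
As a rough illustration of the shape of such an experiment, the sketch below runs every dataset item through each candidate model, scores each output with a judge, and averages the scores per model. It is not the cookbook's code: `run_experiment`, `fake_generate`, `fake_judge`, the model names, and the dataset items are all hypothetical stand-ins. In the real setup the dataset lives in Langfuse, the outputs come from the models being compared, and the scores come from Selene.

```python
from statistics import mean


def run_experiment(dataset, models, generate, judge):
    """Return the mean judge score per model over every item in the dataset."""
    results = {}
    for model in models:
        scores = []
        for item in dataset:
            output = generate(model, item["input"])
            scores.append(judge(item["input"], output, item["expected"]))
        results[model] = mean(scores)
    return results


# Hypothetical data: a tiny dataset and canned outputs for two made-up models.
dataset = [
    {"input": "What is 2 + 2?", "expected": "4"},
    {"input": "Capital of France?", "expected": "Paris"},
]
canned_outputs = {
    "model-a": {"What is 2 + 2?": "4", "Capital of France?": "Paris"},
    "model-b": {"What is 2 + 2?": "4", "Capital of France?": "Lyon"},
}


def fake_generate(model, prompt):
    # Stand-in for a real model call.
    return canned_outputs[model][prompt]


def fake_judge(prompt, output, expected):
    # Stand-in for an evaluator such as Selene: exact match scores 1.0, else 0.0.
    return 1.0 if output == expected else 0.0


results = run_experiment(dataset, ["model-a", "model-b"], fake_generate, fake_judge)
print(results)
```

Passing the generator and the judge in as functions keeps the loop independent of any particular SDK, so the stand-ins can be swapped for real calls without changing the aggregation.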

Alternative: [API] Attach Selene scores to your Langfuse traces

If you’d like to stay in your codebase, this method lets you programmatically attach Atla scores to your Langfuse traces. You’ll get Selene scores and critiques in your trace data. Check out our API quickstart for Langfuse here.
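
To make the idea concrete, here is a minimal sketch of the Langfuse half of that flow: building the HTTP request that attaches a score and critique to an existing trace. It assumes the Langfuse public REST endpoint `/api/public/scores` with Basic authentication (public key as username, secret key as password) and the body fields `traceId`, `name`, `value`, and `comment`. The helper name `build_score_request`, the keys, the trace ID, and the score values are placeholders, and the Selene call that would produce the score and critique is left out. Treat the API quickstart as the authoritative reference.

```python
import base64
import json
import urllib.request


def build_score_request(host, public_key, secret_key, trace_id, name, value, comment):
    """Build (but do not send) a POST request that attaches a score to a trace."""
    payload = {
        "traceId": trace_id,  # the Langfuse trace to annotate
        "name": name,         # score name, e.g. the evaluation metric
        "value": value,       # numeric score (assumed here to come from Selene)
        "comment": comment,   # free-text critique (assumed here to come from Selene)
    }
    # Basic auth: base64 of "public_key:secret_key".
    token = base64.b64encode(f"{public_key}:{secret_key}".encode()).decode()
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/public/scores",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values for illustration only.
request = build_score_request(
    host="https://cloud.langfuse.com",
    public_key="YOUR_PUBLIC_KEY",
    secret_key="YOUR_SECRET_KEY",
    trace_id="trace-123",
    name="faithfulness",
    value=4,
    comment="The answer is supported by the retrieved context.",
)
print(request.full_url)
```

Sending the request is then a matter of passing it to `urllib.request.urlopen(request)` with real credentials; the sketch stops short of that so it can be run without network access.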