You will often want to tailor evaluations to your specific use case. Before you create custom metrics, we recommend taking a quick look at our default metrics to see whether they already meet your needs. If they do not, we’ve got you covered.


Metrics are a key component of an evaluation. Before we jump into creating custom metrics, it is important to understand how metrics are structured.

[Figure: Metrics architecture]

Each metric has a name, a description, and a unique identifier. Additionally, metrics have:

  1. Prompts - Prompts hold all the details of the criteria being evaluated. Prompts are versioned so that you can iterate toward the right criteria. At any point, exactly one version is set as the ‘active version’, and when you run an evaluation using a metric, the active version is the one that gets used.
  2. Few-shot examples - Few-shot examples are test cases that Selene can learn from. If there are edge cases that are difficult to score, we recommend adding them to the metric. However, do not add too many, as you may overfit Selene to the data. You can read more about developing your test cases.
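
To make the structure above concrete, here is a minimal sketch of a metric as plain Python dataclasses. This is not the SDK's API: every class, field, and method name below (`Metric`, `PromptVersion`, `FewShotExample`, `add_prompt`, `set_active`, `active_prompt`, and the fields of a few-shot example) is a hypothetical name chosen for illustration. It only models what is described above: a name, a description, a unique identifier, versioned prompts with one active version, and a list of few-shot examples.

```python
import uuid
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FewShotExample:
    """A scored test case attached to a metric (hypothetical fields)."""
    model_input: str
    model_output: str
    expected_score: float


@dataclass
class PromptVersion:
    """One version of the evaluation criteria."""
    version: int
    content: str


@dataclass
class Metric:
    """Illustrative model of a metric; not the SDK's actual class."""
    name: str
    description: str
    metric_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    prompts: List[PromptVersion] = field(default_factory=list)
    active_version: Optional[int] = None
    few_shot_examples: List[FewShotExample] = field(default_factory=list)

    def add_prompt(self, content: str, activate: bool = False) -> PromptVersion:
        """Append a new prompt version; the first one becomes active by default."""
        prompt = PromptVersion(version=len(self.prompts) + 1, content=content)
        self.prompts.append(prompt)
        if activate or self.active_version is None:
            self.active_version = prompt.version
        return prompt

    def set_active(self, version: int) -> None:
        """Mark an existing prompt version as the active one."""
        if not any(p.version == version for p in self.prompts):
            raise ValueError(f"unknown prompt version: {version}")
        self.active_version = version

    def active_prompt(self) -> PromptVersion:
        """Return the prompt version an evaluation run would pick up."""
        for p in self.prompts:
            if p.version == self.active_version:
                return p
        raise LookupError("metric has no prompt versions yet")


# Example: iterate on the criteria, then promote the second version.
metric = Metric(name="conciseness", description="How concise is the response?")
metric.add_prompt("Score 1-5 for how concise the response is.")
metric.add_prompt("Score 1-5 for conciseness. Penalize filler and repetition.")
metric.set_active(2)
```

In this sketch, adding a new prompt version does not change which version is active unless you ask for it, which mirrors the idea of iterating on criteria while evaluations keep using the active version.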


You can create custom metrics using either the SDK or the Eval Copilot.

Note that custom metrics created and aligned in the Eval Copilot are currently optimized for Selene. If you need custom metrics for the Selene Mini model, we recommend creating them with the SDK. Read here for more details on how to do that.