Development Workflow Steps
Create custom metrics
Navigate to your Metrics screen and create custom metrics that matter to your domain. While we automatically flag error types and failure patterns, you can add domain-specific metrics like tool_call_efficiency.
Configure metadata for test runs
Set up metadata to track different test configurations. For example, you might run three versions of a prompt: concise, balanced, and verbose.
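One way to express these configurations is as plain key-value tags attached to each test run. The following is a minimal sketch, not the product's actual API: the helper build_run_metadata and the keys experiment and prompt_version are hypothetical names chosen for illustration, and how the tags are attached to a run depends on your SDK.

```python
# Hypothetical metadata for three prompt variants.
# Key names and the helper below are illustrative assumptions, not a real SDK API.
PROMPT_VARIANTS = ("concise", "balanced", "verbose")


def build_run_metadata(variant: str, experiment: str = "prompt-length-study") -> dict[str, str]:
    """Return key-value tags that identify one test configuration."""
    if variant not in PROMPT_VARIANTS:
        raise ValueError(f"unknown prompt variant: {variant!r}")
    return {"experiment": experiment, "prompt_version": variant}


# One metadata dict per configuration; attach each to its corresponding test run.
run_configs = [build_run_metadata(v) for v in PROMPT_VARIANTS]
```

Using the same keys across all runs is what lets results be grouped and compared by configuration later.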
Compare results
Navigate to the Compare screen to analyze the relative error rates and performance on your custom metrics across different configurations.
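To make the comparison concrete, the sketch below shows the kind of per-configuration summary such a view presents: an error rate and an average custom-metric value for each metadata value. It is an offline illustration under stated assumptions. The record fields (prompt_version, has_error, tool_call_efficiency) and the summarize helper are hypothetical, and the trace values are placeholders, not real results.

```python
from collections import defaultdict
from statistics import mean

# Placeholder per-trace records; field names and values are illustrative only.
traces = [
    {"prompt_version": "concise", "has_error": True, "tool_call_efficiency": 0.5},
    {"prompt_version": "concise", "has_error": False, "tool_call_efficiency": 1.0},
    {"prompt_version": "balanced", "has_error": False, "tool_call_efficiency": 1.0},
    {"prompt_version": "balanced", "has_error": False, "tool_call_efficiency": 0.5},
    {"prompt_version": "verbose", "has_error": True, "tool_call_efficiency": 0.25},
    {"prompt_version": "verbose", "has_error": True, "tool_call_efficiency": 0.75},
]


def summarize(records):
    """Group traces by prompt_version and compute error rate and mean metric."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["prompt_version"]].append(record)
    return {
        version: {
            # Fraction of traces in this configuration flagged with an error.
            "error_rate": sum(r["has_error"] for r in rows) / len(rows),
            "mean_tool_call_efficiency": mean(r["tool_call_efficiency"] for r in rows),
        }
        for version, rows in grouped.items()
    }


summary = summarize(traces)
```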
Deep dive into issues
Click “View” on any column to inspect the step-level errors in its traces and identify the underlying issues.