Evaluation Runs

An evaluation run represents a single execution of an evaluation at a specific point in time.

An evaluation run captures the full set of query executions, results, and metrics generated when an evaluation is executed. Each run is tied to a particular evaluation configuration, which makes it possible to compare runs against one another. Runs provide the data needed to analyze performance trends, validate changes, and support reproducible testing.
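To make the shape of a run concrete, the following is a minimal sketch of how a run and a run-to-run comparison might be modeled. It is illustrative only: the names `QueryResult`, `EvaluationRun`, `compare_runs`, and all of their fields are hypothetical and do not reflect the product's actual schema or API. It also assumes that metrics are numeric and that only runs sharing a configuration are compared.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class QueryResult:
    """One query execution inside a run (hypothetical shape)."""
    query: str
    results: list[str]  # assumed: identifiers of the returned items


@dataclass(frozen=True)
class EvaluationRun:
    """A snapshot of one evaluation execution (hypothetical shape)."""
    config_id: str        # the evaluation configuration this run is tied to
    started_at: datetime  # the point in time the run was executed
    query_results: list[QueryResult] = field(default_factory=list)
    metrics: dict[str, float] = field(default_factory=dict)


def compare_runs(baseline: EvaluationRun, candidate: EvaluationRun) -> dict[str, float]:
    """Return candidate-minus-baseline deltas for metrics present in both runs."""
    # Assumption: a comparison is only meaningful within one configuration.
    if baseline.config_id != candidate.config_id:
        raise ValueError("runs belong to different evaluation configurations")
    shared = baseline.metrics.keys() & candidate.metrics.keys()
    return {name: candidate.metrics[name] - baseline.metrics[name]
            for name in sorted(shared)}


# Usage: two runs of the same (hypothetical) configuration, before and after a change.
before = EvaluationRun(
    config_id="cfg-1",
    started_at=datetime(2024, 1, 1, tzinfo=timezone.utc),
    query_results=[QueryResult("red shoes", ["a", "b"])],
    metrics={"precision": 0.5, "recall": 0.25},
)
after = EvaluationRun(
    config_id="cfg-1",
    started_at=datetime(2024, 2, 1, tzinfo=timezone.utc),
    query_results=[QueryResult("red shoes", ["a", "c"])],
    metrics={"precision": 0.75, "recall": 0.25},
)
deltas = compare_runs(before, after)
```

Because each run in this sketch is an immutable record keyed by its configuration and timestamp, an older run can be re-read later and compared against a newer one without being affected by subsequent changes.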