Evaluations

An Evaluation ties together the core building blocks of a search relevance assessment:

A Search Endpoint — the search system to query
A Query Set — the queries to run
A Query Template — how queries are formatted into requests

Once you create an evaluation, you can run it repeatedly to track how your search relevance changes over time.

Creating an Evaluation

In the UI

1. Navigate to Evaluations and click Create Evaluation.
2. Enter a Name and an optional Description.
3. Select the Endpoint, Query Set, and Query Template to use.
4. Click Create.

Using the API

curl -X POST "https://${RELEVAL_HOST}/api/v1/evaluations" \
-H 'Content-Type: application/json' \
-H "Authorization: Bearer ${TOKEN}" \
--data @- <<EOF
{
  "name": "Product Search v2",
  "description": "Evaluate new ranking model",
  "endpoint_id": "${ENDPOINT_ID}",
  "query_set_id": "${QUERY_SET_ID}",
  "query_template_id": "${QUERY_TEMPLATE_ID}"
}
EOF
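
The request above expands five shell variables, and an unset one silently becomes an empty string in the URL, header, or JSON body. The following is a minimal sketch of a pre-flight check you could run first. The helper names `missing_vars` and `check_evaluation_env` are hypothetical and not part of Releval; the variable names are the ones used in the request above.

```shell
# Hypothetical helper (not part of Releval): print the names of any listed
# variables that are unset or empty. The ( ... ) body runs in a subshell, so
# the helper's own variables do not leak into the calling shell.
missing_vars() (
  missing=""
  for name in "$@"; do
    # Indirect lookup via eval; only pass trusted, literal variable names.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="${missing:+$missing }$name"
    fi
  done
  printf '%s\n' "$missing"
)

# Succeeds only if every variable used by the create-evaluation request is set.
check_evaluation_env() (
  missing="$(missing_vars RELEVAL_HOST TOKEN ENDPOINT_ID QUERY_SET_ID QUERY_TEMPLATE_ID)"
  [ -z "$missing" ] || { echo "Unset or empty: $missing" >&2; false; }
)
```

Calling `check_evaluation_env && curl ...` sends the request only when all five values are present; otherwise the missing names are printed to standard error.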

Evaluation Workflow

The typical workflow for evaluating search relevance:

1. Create an Evaluation: define what you're evaluating
2. Create an Evaluation Run: choose the relevance scale and metrics
3. Start the run: Releval executes each query against your endpoint and collects results
4. Judge the results: rate how relevant each returned candidate is
5. Review metrics: Releval automatically calculates metrics from your judgments
6. Iterate: create new runs to track improvements as you adjust your search configuration
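
To make the "judgments in, metrics out" step concrete, here is a small sketch of how one common metric, precision@k, can be derived from a ranked list of judgment grades. This is an illustration only: this page does not say which metrics or relevance scales Releval supports or how it computes them, and the function name `precision_at_k`, the numeric grades, and the relevance threshold are all assumptions.

```shell
# Illustrative only, not Releval's implementation.
# Usage: precision_at_k K THRESHOLD GRADE...
# GRADE values are the judgments for one query's results, in ranked order.
# A result counts as relevant when its grade is >= THRESHOLD.
precision_at_k() (
  k="$1"
  threshold="$2"
  shift 2
  printf '%s\n' "$@" | awk -v k="$k" -v t="$threshold" '
    NR <= k && $1 >= t { relevant++ }                    # count relevant results in the top k
    END { printf "%.2f\n", (k > 0 ? relevant / k : 0) }  # fraction of the top k that is relevant
  '
)
```

For example, `precision_at_k 5 2 3 0 2 1 3` treats grades of 2 or higher as relevant, finds three such results in the top five, and prints `0.60`.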

Comparing Over Time

Because each evaluation can have multiple runs, you can track how metrics evolve:

Run before and after changing a ranking model
Run against different endpoints (production vs. staging)
Run with different query templates to compare search strategies

Each run preserves its results and metrics independently, giving you a historical record of search quality.
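
Comparing two runs comes down to comparing the same metric from each. The sketch below assumes you have already read one metric value from each run, for example from the UI; `metric_delta` is a hypothetical helper, not a Releval command.

```shell
# Hypothetical helper: print the signed change in a metric between two runs.
# Usage: metric_delta BEFORE AFTER
metric_delta() {
  # A positive result means the later run scored higher on this metric.
  awk -v before="$1" -v after="$2" 'BEGIN { printf "%+.3f\n", after - before }'
}
```

With made-up values, `metric_delta 0.50 0.25` prints `-0.250`, which would flag a regression between the two runs.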
