Evaluations

An Evaluation ties together the core building blocks of a search relevance assessment: an Endpoint to search against, a Query Set to run, and a Query Template that shapes each request.

Once you create an evaluation, you can run it repeatedly to track how your search relevance changes over time.

Creating an Evaluation

In the UI

  1. Navigate to Evaluations and click Create Evaluation
  2. Enter a Name and optional Description
  3. Select the Endpoint, Query Set, and Query Template to use
  4. Click Create

Using the API

curl -X POST "https://${RELEVAL_HOST}/api/v1/evaluations" \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer ${TOKEN}" \
  --data @- <<EOF
{
  "name": "Product Search v2",
  "description": "Evaluate new ranking model",
  "endpoint_id": "${ENDPOINT_ID}",
  "query_set_id": "${QUERY_SET_ID}",
  "query_template_id": "${QUERY_TEMPLATE_ID}"
}
EOF
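
When the request is issued from a script, it can help to check that every variable is set before calling the API. The following is a minimal sketch, not part of Releval: the helper names require_vars, evaluation_payload, and create_evaluation are hypothetical, and the payload builder assumes the name and description contain no double quotes or backslashes. The endpoint URL and field names are taken from the request above.

```shell
#!/bin/sh
# Hypothetical helpers wrapping the POST /api/v1/evaluations request.

require_vars() {
  # Report each unset or empty variable on stderr; return 1 if any is missing.
  missing=0
  for var in "$@"; do
    eval "value=\${$var:-}"
    if [ -z "$value" ]; then
      echo "missing: $var" >&2
      missing=1
    fi
  done
  return "$missing"
}

evaluation_payload() {
  # $1 = name, $2 = description.
  # Assumption: neither contains double quotes or backslashes (no JSON escaping is done).
  printf '{"name":"%s","description":"%s","endpoint_id":"%s","query_set_id":"%s","query_template_id":"%s"}\n' \
    "$1" "$2" "$ENDPOINT_ID" "$QUERY_SET_ID" "$QUERY_TEMPLATE_ID"
}

create_evaluation() {
  # Refuse to call the API when any required variable is missing.
  require_vars RELEVAL_HOST TOKEN ENDPOINT_ID QUERY_SET_ID QUERY_TEMPLATE_ID || return 1
  # --fail makes curl exit non-zero on HTTP error responses.
  evaluation_payload "$1" "$2" |
    curl --fail --silent --show-error -X POST "https://${RELEVAL_HOST}/api/v1/evaluations" \
      -H 'Content-Type: application/json' \
      -H "Authorization: Bearer ${TOKEN}" \
      --data @-
}
```

With the variables exported, a call such as create_evaluation "Product Search v2" "Evaluate new ranking model" sends the same request as the curl command above, and fails early with a list of missing variables otherwise.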

Evaluation Workflow

A typical workflow for evaluating search relevance has six steps:

  1. Create an Evaluation — define what you're evaluating
  2. Create an Evaluation Run — choose the relevance scale and metrics
  3. Start the run — Releval executes each query against your endpoint and collects results
  4. Judge the results — rate how relevant each returned candidate is
  5. Review metrics — Releval automatically calculates metrics from your judgments
  6. Iterate — create new runs to track improvements as you adjust your search configuration
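
To make step 5 concrete, here is a minimal sketch of how one common metric, precision at k, can be derived from judgments. It is an illustration only and not Releval's implementation: the 0 to 3 relevance scale, the threshold of 2, and the function name precision_at_k are all assumptions.

```shell
#!/bin/sh
# precision_at_k K THRESHOLD GRADE...
# Prints the fraction of the top K results whose grade is >= THRESHOLD.
# Grades are passed in rank order. If fewer than K grades are given,
# the missing positions count as not relevant.
precision_at_k() {
  k=$1
  threshold=$2
  shift 2
  printf '%s\n' "$@" | awk -v k="$k" -v t="$threshold" '
    NR <= k && $1 >= t { hits++ }
    END { printf "%.2f\n", hits / k }'
}

# Judgments for one query on a hypothetical 0-3 scale (made-up sample values).
# Three of the top five grades are >= 2, so this prints 0.60.
precision_at_k 5 2 3 0 2 1 3
```

Averaging this value over every query in the query set gives a single number per run, which is the kind of figure that can then be compared from one run to the next.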

Comparing Over Time

Because each evaluation can have multiple runs, you can track how metrics evolve:

  • Run before and after changing a ranking model
  • Run against different endpoints (production vs. staging)
  • Run with different query templates to compare search strategies

Each run preserves its results and metrics independently, giving you a historical record of search quality.
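
As a sketch of a before-and-after comparison, the snippet below prints the change in each metric between two runs. It assumes the metrics of each run have been saved to a text file with one "metric value" pair per line; that file format, the function name compare_runs, and the sample numbers are hypothetical and not something Releval is known to produce.

```shell
#!/bin/sh
# compare_runs BASELINE_FILE CANDIDATE_FILE
# Each file holds "metric value" lines (a hypothetical export format).
# Prints: metric, baseline value, candidate value, signed difference.
compare_runs() {
  awk '
    NR == FNR { base[$1] = $2; next }   # first file: remember baseline values
    $1 in base { printf "%s %.3f %.3f %+.3f\n", $1, base[$1], $2, $2 - base[$1] }
  ' "$1" "$2"
}

# Made-up sample values, for illustration only.
tmpdir=$(mktemp -d)
printf 'ndcg@10 0.612\nprecision@5 0.540\n' > "$tmpdir/before.txt"
printf 'ndcg@10 0.655\nprecision@5 0.520\n' > "$tmpdir/after.txt"

compare_runs "$tmpdir/before.txt" "$tmpdir/after.txt"
# prints:
#   ndcg@10 0.612 0.655 +0.043
#   precision@5 0.540 0.520 -0.020

rm -r "$tmpdir"
```

A positive difference means the candidate run scored higher on that metric. Metrics that appear in only one of the two files are skipped.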