Evaluations
An Evaluation ties together the core building blocks of a search relevance assessment:
- A Search Endpoint — the search system to query
- A Query Set — the queries to run
- A Query Template — how queries are formatted into requests
Once you create an evaluation, you can run it repeatedly to track how your search relevance changes over time.
Creating an Evaluation
In the UI
- Navigate to Evaluations and click Create Evaluation
- Enter a Name and optional Description
- Select the Endpoint, Query Set, and Query Template to use
- Click Create
Using the API
curl -X POST "https://${RELEVAL_HOST}/api/v1/evaluations" \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer ${TOKEN}" \
  --data @- <<EOF
{
  "name": "Product Search v2",
  "description": "Evaluate new ranking model",
  "endpoint_id": "${ENDPOINT_ID}",
  "query_set_id": "${QUERY_SET_ID}",
  "query_template_id": "${QUERY_TEMPLATE_ID}"
}
EOF
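The request above interpolates several shell variables, and an unset one silently becomes an empty string in the URL, header, or JSON body. A minimal pre-flight sketch is shown here. The helper name `check_required_vars` is hypothetical and is not part of Releval. It only inspects the variables already used in the example.

```shell
# check_required_vars NAME...
# Hypothetical helper (not part of Releval): reports every named variable
# that is unset or empty, and returns non-zero if at least one is missing.
# Note: uses the global scratch variables _missing, _name and _value.
check_required_vars() {
  _missing=0
  for _name in "$@"; do
    # Look up the value of the variable whose name is stored in $_name.
    eval "_value=\${${_name}:-}"
    if [ -z "$_value" ]; then
      echo "missing: ${_name}" >&2
      _missing=1
    fi
  done
  return "$_missing"
}
```

Running `check_required_vars RELEVAL_HOST TOKEN ENDPOINT_ID QUERY_SET_ID QUERY_TEMPLATE_ID || exit 1` before the curl command stops a script early instead of sending a request with blank IDs.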
Evaluation Workflow
Evaluating search relevance typically follows these steps:
- Create an Evaluation — define what you're evaluating
- Create an Evaluation Run — choose the relevance scale and metrics
- Start the run — Releval executes each query against your endpoint and collects results
- Judge the results — rate how relevant each returned candidate is
- Review metrics — Releval automatically calculates metrics from your judgments
- Iterate — create new runs to track improvements as you adjust your search configuration
Comparing Over Time
Because each evaluation can have multiple runs, you can track how metrics evolve:
- Run before and after changing a ranking model
- Run against different endpoints (production vs. staging)
- Run with different query templates to compare search strategies
Each run preserves its results and metrics independently, giving you a historical record of search quality.
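To make a before/after comparison concrete, the sketch below computes per-metric deltas between two runs. It assumes you have copied each run's metrics into a plain text file with one `metric_name value` pair per line. That file format and the function name `compare_metrics` are illustrative assumptions, not a Releval export format or command.

```shell
# compare_metrics BEFORE_FILE AFTER_FILE
# Hypothetical helper: each file holds one "metric_name value" pair per line.
# Prints "metric before after delta" for every metric present in both files,
# where delta = after - before, in the order metrics appear in AFTER_FILE.
compare_metrics() {
  awk '
    # First file: remember the baseline value of each metric.
    NR == FNR { before[$1] = $2; next }
    # Second file: report only metrics that also exist in the baseline.
    $1 in before {
      printf "%s %.3f %.3f %+.3f\n", $1, before[$1], $2, $2 - before[$1]
    }
  ' "$1" "$2"
}
```

For example, `compare_metrics before.txt after.txt` prints one line per shared metric. A positive delta means the later run scored higher on that metric.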