AI Providers

An AI Provider is a saved connection to an LLM service — credentials, endpoint, region — that any number of AI Judges can share. Creating a provider is the first step before you can create a judge; the judge holds the model name, prompt template, and concurrency settings, but the credentials live on the provider.

The following provider types are supported:

| Name | Description |
| --- | --- |
| OpenAI | OpenAI Chat Completions API |
| Anthropic | Anthropic Messages API |
| Amazon Bedrock | A model on Amazon Bedrock via the Bedrock Runtime API |
| Azure OpenAI | A model deployment in an Azure OpenAI resource |
| Ollama | A local or self-hosted Ollama server |
| Custom | Any OpenAI-compatible Chat Completions endpoint |

Visibility and ownership

A provider is either personal or shared, depending on whether it has an owner:

  • Personal providers — owned by the member who created them. Visible only to that member and to admins.
  • Shared providers — created by an Admin with no owner set. Visible to all members.

Credentials are stored encrypted and are never returned by the API after creation.

OpenAI

Uses the OpenAI Chat Completions API. GPT, o-series, and other OpenAI chat models are supported. You provide your OpenAI API key when creating the provider.

curl -X POST "https://${RELEVAL_HOST}/api/v1/ai-providers" \
-H "Authorization: Bearer ${TOKEN}" \
-H 'Content-Type: application/json' \
-d '{
"type": "openai",
"name": "OpenAI Production",
"api_key": "sk-..."
}'

Anthropic

Uses the Anthropic Messages API. Choose from the Claude family of models. You provide your Anthropic API key when creating the provider.

curl -X POST "https://${RELEVAL_HOST}/api/v1/ai-providers" \
-H "Authorization: Bearer ${TOKEN}" \
-H 'Content-Type: application/json' \
-d '{
"type": "anthropic",
"name": "Anthropic Production",
"api_key": "sk-ant-..."
}'

Amazon Bedrock

Targets a foundation model on Amazon Bedrock using the Bedrock Runtime API. You provide an AWS region together with an access key ID and secret access key; the IAM principal behind those credentials determines which Bedrock models the judge can invoke.

curl -X POST "https://${RELEVAL_HOST}/api/v1/ai-providers" \
-H "Authorization: Bearer ${TOKEN}" \
-H 'Content-Type: application/json' \
-d '{
"type": "amazon_bedrock",
"name": "Bedrock us-east-1",
"aws_region": "us-east-1",
"access_key_id": "'"${AWS_ACCESS_KEY_ID}"'",
"secret_access_key": "'"${AWS_SECRET_ACCESS_KEY}"'"
}'

Azure OpenAI

Targets an OpenAI model deployment in your Azure OpenAI resource. You provide the resource URL shown in the Azure portal and the API key for the resource.

curl -X POST "https://${RELEVAL_HOST}/api/v1/ai-providers" \
-H "Authorization: Bearer ${TOKEN}" \
-H 'Content-Type: application/json' \
-d '{
"type": "azure_openai",
"name": "Azure OpenAI East US",
"endpoint_url": "https://my-resource.openai.azure.com/",
"api_key": "'"${AZURE_OPENAI_KEY}"'"
}'

When you later create a judge that references this provider, set the judge's model to the deployment name configured in your Azure OpenAI resource, not the underlying model ID.

Ollama

Targets a local or self-hosted Ollama server running open-source models. No API key is required. Use this for on-premises judging or for cheap iteration on prompt design without spending paid API credits.

curl -X POST "https://${RELEVAL_HOST}/api/v1/ai-providers" \
-H "Authorization: Bearer ${TOKEN}" \
-H 'Content-Type: application/json' \
-d '{
"type": "ollama",
"name": "Local Ollama",
"endpoint_url": "http://localhost:11434"
}'

The endpoint URL should point to the Ollama native API root (e.g. http://localhost:11434, with no /v1 suffix). Slow local hardware is accommodated by a generous per-request timeout, so larger models can run without judging runs failing on long inferences.
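A URL copied from an OpenAI-compatible client configuration may carry a trailing /v1 that the Ollama provider type does not expect. The following is a minimal sketch of a normalizing helper for a POSIX shell; the function name ollama_root_url is hypothetical and not part of Releval.

```shell
# Hypothetical helper (not part of Releval): reduce a URL to the Ollama API root.
# The function body runs in a subshell, so its variables do not leak to the caller.
ollama_root_url() (
  url="$1"
  url="${url%/}"     # drop one trailing slash, if present
  url="${url%/v1}"   # drop an OpenAI-compatibility /v1 suffix, if present
  printf '%s\n' "$url"
)
```

For example, `ollama_root_url "http://localhost:11434/v1/"` prints `http://localhost:11434`, which can then be passed as the provider's endpoint_url.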

Custom

Targets any OpenAI-compatible Chat Completions endpoint. Use this for self-hosted models behind vLLM, LM Studio, llama.cpp, or hosted gateways such as OpenRouter, Together, or Groq. The endpoint URL is the OpenAI-compatible base URL (including the /v1 suffix the OpenAI SDK expects). An API key is optional — omit it for unauthenticated local servers.

curl -X POST "https://${RELEVAL_HOST}/api/v1/ai-providers" \
-H "Authorization: Bearer ${TOKEN}" \
-H 'Content-Type: application/json' \
-d '{
"type": "custom",
"name": "OpenRouter",
"endpoint_url": "https://openrouter.ai/api/v1",
"api_key": "'"${OPENROUTER_KEY}"'"
}'
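Because the API key is optional for this provider type, a script that registers both authenticated gateways and unauthenticated local servers can leave the field out when no key is set. The sketch below assumes a POSIX shell; the helper names custom_provider_payload and create_custom_provider are hypothetical, and the values are inserted without JSON escaping, so it only suits names and URLs that contain no quotes or backslashes.

```shell
# Hypothetical helper: build the JSON body for a "custom" provider.
# The api_key field is emitted only when a non-empty third argument is given.
# Assumption: values contain no characters that need JSON escaping.
custom_provider_payload() (
  name="$1"
  endpoint_url="$2"
  api_key="${3:-}"
  if [ -n "$api_key" ]; then
    printf '{"type":"custom","name":"%s","endpoint_url":"%s","api_key":"%s"}\n' \
      "$name" "$endpoint_url" "$api_key"
  else
    printf '{"type":"custom","name":"%s","endpoint_url":"%s"}\n' \
      "$name" "$endpoint_url"
  fi
)

# Hypothetical wrapper: send the body to the create endpoint shown above.
# curl reads the request body from stdin via "-d @-".
create_custom_provider() {
  custom_provider_payload "$@" |
    curl -X POST "https://${RELEVAL_HOST}/api/v1/ai-providers" \
      -H "Authorization: Bearer ${TOKEN}" \
      -H 'Content-Type: application/json' \
      -d @-
}
```

A call such as `create_custom_provider "Local vLLM" "http://localhost:8000/v1"` would then register an unauthenticated server, where the name and URL are placeholders for your own deployment.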

Choosing a provider type

| You want… | Use |
| --- | --- |
| Fastest setup, well-documented results | OpenAI or Anthropic |
| Existing Azure procurement / data residency | Azure OpenAI |
| Free local iteration on prompts | Ollama |
| Self-hosted models or cheaper third-party gateways | Custom |
| AWS-aligned billing, IAM-scoped access | Amazon Bedrock |

Testing a provider

A test action sends a minimal probe prompt against a model of your choice to confirm that credentials, endpoint reachability, and model availability are all in order. The response reports whether the probe succeeded and, on failure, the underlying error message from the provider.

curl -X POST "https://${RELEVAL_HOST}/api/v1/ai-providers/${PROVIDER_ID}/test" \
-H "Authorization: Bearer ${TOKEN}" \
-H 'Content-Type: application/json' \
-d '{ "model": "gpt-4o" }'

Rotating credentials

Updating a provider rotates its secret when you supply a new value and keeps the existing secret when you omit the field. Non-secret fields (name, endpoint URL, region, access key ID) are replaced wholesale on every update, so resend each one you want to keep. The provider type is fixed at creation time; to change it, create a new provider.

curl -X PUT "https://${RELEVAL_HOST}/api/v1/ai-providers/${PROVIDER_ID}" \
-H "Authorization: Bearer ${TOKEN}" \
-H 'Content-Type: application/json' \
-d '{
"type": "openai",
"name": "OpenAI Production",
"api_key": "sk-new-..."
}'
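The two update modes differ only in whether the secret field is present in the body. The sketch below contrasts them for an OpenAI provider, assuming a POSIX shell; the variable names, the update_provider wrapper, and the example provider name are illustrative and not part of Releval.

```shell
# Rename only: api_key is omitted, so the stored secret is kept as-is.
rename_body='{"type":"openai","name":"OpenAI Production (EU)"}'

# Rotate: api_key is supplied, so the stored secret is replaced.
# The name is sent again because non-secret fields are replaced on every update.
rotate_body='{"type":"openai","name":"OpenAI Production (EU)","api_key":"sk-new-..."}'

# Hypothetical wrapper around the PUT request shown above.
# $1 = provider ID, $2 = JSON body
update_provider() {
  curl -X PUT "https://${RELEVAL_HOST}/api/v1/ai-providers/$1" \
    -H "Authorization: Bearer ${TOKEN}" \
    -H 'Content-Type: application/json' \
    -d "$2"
}
```

Calling `update_provider "${PROVIDER_ID}" "$rename_body"` would rename the provider without touching its key, while passing `"$rotate_body"` would rotate the key as well.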

Deleting a provider

A provider can only be deleted if no AI judges still reference it. If any do, the delete fails and reports which judges depend on the provider so you can reassign or delete them first.

Vision-capable models

Enabling image judging on a judge instructs Releval to embed candidate images in the prompt sent to the model. This requires a model that accepts image inputs. If you enable image judging on a text-only model, the provider will reject the request and the judging run will fail.

Common vision-capable choices per provider:

| Provider | Examples |
| --- | --- |
| OpenAI | gpt-4o, gpt-4o-mini, gpt-4-turbo |
| Anthropic | All current Claude 4 models |
| Amazon Bedrock | Claude on Bedrock (e.g. anthropic.claude-sonnet-4-20250514-v1:0), Nova multimodal models |
| Azure OpenAI | A deployment of any vision-capable OpenAI model (e.g. a GPT-4o deployment) |
| Ollama | llama3.2-vision, other vision-tagged models in the Ollama library |
| Custom | Whatever the upstream endpoint's model supports |

Image judging multiplies token usage substantially: a vision model evaluating ten image-bearing candidates per query can cost 5–10x as much as a text-only run. For details on how images flow through the prompt template, see Prompt templates.
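For budgeting, the 5–10x range can be applied to a known text-only cost to get rough bounds before enabling image judging. The following is a minimal sketch in POSIX shell; the function name vision_cost_range is hypothetical, the input is whatever text-only cost you have measured yourself, and amounts are in cents to stay within integer arithmetic.

```shell
# Hypothetical estimator: lower and upper bounds for an image-judging run,
# derived from a measured text-only cost and the 5-10x range quoted above.
# $1 = text-only cost in cents; prints "<low> <high>" in cents.
vision_cost_range() (
  text_only_cents="$1"
  low=$(( text_only_cents * 5 ))
  high=$(( text_only_cents * 10 ))
  printf '%s %s\n' "$low" "$high"
)
```

With an illustrative text-only cost of 1200 cents, `vision_cost_range 1200` prints `6000 12000`. The real multiplier depends on image sizes and the model's image tokenization, so treat the result as a planning range rather than a quote.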

Once your provider is in place, define how it grades candidates in Prompt templates, then create an AI Judge that references the provider.