Authorizations
Bearer authentication header of the form `Bearer <token>`, where `<token>` is your auth token.
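As a minimal sketch, the header can be built like this (the token value is a placeholder, not a real key):

```python
# Sketch: constructing the Bearer authentication header described above.
# The token below is a placeholder; substitute your actual auth token.
auth_token = "YOUR_AUTH_TOKEN"

headers = {
    "Authorization": f"Bearer {auth_token}",
}

print(headers["Authorization"])
```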
Path Parameters
The ID of the evaluation to retrieve.
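A sketch of interpolating the evaluation ID into the request path; the base URL and `/evals/{eval_id}` path shape are assumptions for illustration, and the ID itself is a placeholder:

```python
# Sketch: building the request URL from the path parameter.
# Base URL and path layout are assumptions; the eval ID is a placeholder.
base_url = "https://api.openai.com/v1"
eval_id = "eval_abc123"  # placeholder evaluation ID

url = f"{base_url}/evals/{eval_id}"
print(url)
```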
Response
The evaluation
An Eval object with a data source config and testing criteria. An Eval represents a task to be done for your LLM integration, such as:
- Improve the quality of my chatbot
- See how well my chatbot handles customer support
- Check if o3-mini is better at my use case than gpt-4o
The object type, which is always `eval`.
Unique identifier for the evaluation.
The name of the evaluation.
"Chatbot effectiveness Evaluation"
Configuration of data sources used in runs of the evaluation.
A CustomDataSourceConfig, which specifies the schema of your `item` and optionally `sample` namespaces.
The response schema defines the shape of the data that will be:
- Used to define your testing criteria, and
- Required when creating a run
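One hypothetical shape for such a config, assuming a JSON-Schema-style definition of the `item` namespace; the field names inside `properties` are illustrative, not taken from this reference:

```python
# Illustrative sketch of a custom data source config whose item schema
# describes the fields testing criteria can reference and that each run
# must supply. Property names ("question", "expected_answer") are
# assumptions for illustration only.
data_source_config = {
    "type": "custom",
    "schema": {
        "type": "object",
        "properties": {
            "item": {
                "type": "object",
                "properties": {
                    "question": {"type": "string"},
                    "expected_answer": {"type": "string"},
                },
                "required": ["question", "expected_answer"],
            },
        },
        "required": ["item"],
    },
}

print(data_source_config["schema"]["required"])
```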
A list of testing criteria.
The Unix timestamp (in seconds) for when the eval was created.
Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard.
Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
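These limits can be checked client-side before sending a request; a small sketch (the helper name is ours, not part of the API):

```python
# Sketch: validating metadata against the documented limits:
# at most 16 pairs, keys up to 64 characters, values up to 512 characters.
def validate_metadata(metadata: dict) -> None:
    if len(metadata) > 16:
        raise ValueError("metadata may hold at most 16 key-value pairs")
    for key, value in metadata.items():
        if len(key) > 64:
            raise ValueError(f"key exceeds 64 characters: {key!r}")
        if len(value) > 512:
            raise ValueError(f"value exceeds 512 characters for key {key!r}")

validate_metadata({"project": "chatbot-eval"})  # passes silently
```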
Indicates whether the evaluation is shared with OpenAI.