POST /evals
curl --request POST \
  --url https://api.openai.com/v1/evals \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "name": "<string>",
  "metadata": {},
  "data_source_config": {
    "type": "custom",
    "item_schema": "{\n  \"type\": \"object\",\n  \"properties\": {\n    \"name\": {\"type\": \"string\"},\n    \"age\": {\"type\": \"integer\"}\n  },\n  \"required\": [\"name\", \"age\"]\n}\n",
    "include_sample_schema": false
  },
  "testing_criteria": [
    {
      "type": "label_model",
      "name": "<string>",
      "model": "<string>",
      "input": [
        {
          "role": "<string>",
          "content": "<string>"
        }
      ],
      "labels": [
        "<string>"
      ],
      "passing_labels": [
        "<string>"
      ]
    }
  ],
  "share_with_openai": false
}'
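
The same request can be sent from Python. The sketch below uses the requests library and mirrors the curl payload above; the concrete grader name, model, labels, and grader prompt are illustrative placeholders, not prescribed values.

import os
import requests

# Payload mirroring the curl example above; the grader details are illustrative only.
payload = {
    "name": "Chatbot effectiveness Evaluation",
    "metadata": {},
    "data_source_config": {
        "type": "custom",
        "item_schema": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "age": {"type": "integer"}
            },
            "required": ["name", "age"]
        },
        "include_sample_schema": False
    },
    "testing_criteria": [
        {
            "type": "label_model",
            "name": "helpfulness_grader",   # illustrative grader name
            "model": "gpt-4o",              # illustrative model choice
            "input": [
                {"role": "system", "content": "Label the item as helpful or unhelpful."}
            ],
            "labels": ["helpful", "unhelpful"],
            "passing_labels": ["helpful"]
        }
    ],
    "share_with_openai": False
}

resp = requests.post(
    "https://api.openai.com/v1/evals",
    headers={
        "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
        "Content-Type": "application/json"
    },
    json=payload,
)

A successful call returns a 201 response with an Eval object like: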
{
  "object": "eval",
  "id": "<string>",
  "name": "Chatbot effectiveness Evaluation",
  "data_source_config": {
    "type": "custom",
    "schema": "{\n  \"type\": \"object\",\n  \"properties\": {\n    \"item\": {\n      \"type\": \"object\",\n      \"properties\": {\n        \"label\": {\"type\": \"string\"},\n      },\n      \"required\": [\"label\"]\n    }\n  },\n  \"required\": [\"item\"]\n}\n"
  },
  "testing_criteria": "eval",
  "created_at": 123,
  "metadata": {},
  "share_with_openai": true
}
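
Continuing the Python sketch above, a caller can confirm the eval was created by checking the status code and reading the fields shown in the example response:

# Continuation of the request sketch above.
if resp.status_code == 201:
    eval_obj = resp.json()
    # Fields shown in the example response: object, id, name, created_at.
    print(eval_obj["id"], eval_obj["name"], eval_obj["created_at"])
else:
    raise RuntimeError(f"Eval creation failed: {resp.status_code} {resp.text}")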

Authorizations

Authorization
string · header · required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json

Response

201 - application/json

OK

An Eval object with a data source config and testing criteria. An Eval represents a task to be done for your LLM integration, for example:

  • Improve the quality of my chatbot
  • See how well my chatbot handles customer support
  • Check if o3-mini is better at my use case than gpt-4o