curl --request POST \
  --url https://api.openai.com/v1/evals \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
    "name": "<string>",
    "metadata": {},
    "data_source_config": {
      "type": "custom",
      "item_schema": "{\n \"type\": \"object\",\n \"properties\": {\n \"name\": {\"type\": \"string\"},\n \"age\": {\"type\": \"integer\"}\n },\n \"required\": [\"name\", \"age\"]\n}\n",
      "include_sample_schema": false
    },
    "testing_criteria": [
      {
        "type": "label_model",
        "name": "<string>",
        "model": "<string>",
        "input": [
          {
            "role": "<string>",
            "content": "<string>"
          }
        ],
        "labels": [
          "<string>"
        ],
        "passing_labels": [
          "<string>"
        ]
      }
    ],
    "share_with_openai": false
  }'
{
  "object": "eval",
  "id": "<string>",
  "name": "Chatbot effectiveness Evaluation",
  "data_source_config": {
    "type": "custom",
    "schema": "{\n \"type\": \"object\",\n \"properties\": {\n \"item\": {\n \"type\": \"object\",\n \"properties\": {\n \"label\": {\"type\": \"string\"}\n },\n \"required\": [\"label\"]\n }\n },\n \"required\": [\"item\"]\n}\n"
  },
  "testing_criteria": [
    {
      "type": "label_model",
      "name": "<string>",
      "model": "<string>",
      "input": [
        {
          "role": "<string>",
          "content": "<string>"
        }
      ],
      "labels": [
        "<string>"
      ],
      "passing_labels": [
        "<string>"
      ]
    }
  ],
  "created_at": 123,
  "metadata": {},
  "share_with_openai": true
}
Create the structure of an evaluation that can be used to test a model's performance. An evaluation is a set of testing criteria and a data source. After creating an evaluation, you can run it against different models and model parameters. We support several types of graders and data sources. For more information, see the Evals guide.
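The request body in the curl example above can also be assembled programmatically. Below is a minimal Python sketch of that assembly; the function name `build_eval_payload`, the grader name, the model, and the prompt template are all illustrative, not part of the API.

```python
import json

def build_eval_payload(name, item_schema, grader):
    # Assemble the POST /v1/evals request body shown in the curl example.
    # item_schema is sent as a JSON-encoded string, matching the example above.
    return {
        "name": name,
        "metadata": {},
        "data_source_config": {
            "type": "custom",
            "item_schema": json.dumps(item_schema),
            "include_sample_schema": False,
        },
        "testing_criteria": [grader],
        "share_with_openai": False,
    }

# Illustrative label_model grader; names and model are placeholders.
grader = {
    "type": "label_model",
    "name": "sentiment-grader",
    "model": "gpt-4o-mini",
    "input": [{"role": "user", "content": "Label the sentiment of this item."}],
    "labels": ["positive", "negative"],
    "passing_labels": ["positive"],
}

payload = build_eval_payload(
    "Chatbot effectiveness Evaluation",
    {
        "type": "object",
        "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
        "required": ["name", "age"],
    },
    grader,
)
print(json.dumps(payload, indent=2))
```

The resulting `payload` serializes to the same structure as the `--data` argument in the curl request, ready to POST with your HTTP client of choice.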
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
OK
An Eval object with a data source config and testing criteria. An Eval represents a task to be done for your LLM integration.
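Note that in the returned Eval object, the data source config's `schema` field is itself a JSON-encoded string (as in the response example above), so it must be decoded before inspection. A minimal sketch, using a trimmed copy of that response with an illustrative `id`:

```python
import json

# Trimmed version of the response body above; the id is illustrative.
response = {
    "object": "eval",
    "id": "eval_abc123",
    "name": "Chatbot effectiveness Evaluation",
    "data_source_config": {
        "type": "custom",
        "schema": (
            '{"type": "object", "properties": {"item": {"type": "object", '
            '"properties": {"label": {"type": "string"}}, '
            '"required": ["label"]}}, "required": ["item"]}'
        ),
    },
}

# Decode the nested JSON Schema string into a dict before inspecting it.
schema = json.loads(response["data_source_config"]["schema"])
print(schema["required"])  # → ['item']
```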