Create a new evaluation run. This is the endpoint that will kick off grading.

Example request:

curl --request POST \
  --url https://api.openai.com/v1/evals/{eval_id}/runs \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "data_source": {
    "type": "jsonl",
    "source": {
      "type": "file_content",
      "content": [
        {
          "item": {},
          "sample": {}
        }
      ]
    }
  },
  "name": "<string>",
  "metadata": {}
}
'

Example response:

{
"object": "eval.run",
"id": "<string>",
"eval_id": "<string>",
"status": "<string>",
"model": "<string>",
"name": "<string>",
"created_at": 123,
"report_url": "<string>",
"result_counts": {
"total": 123,
"errored": 123,
"failed": 123,
"passed": 123
},
"per_model_usage": [
{
"model_name": "<string>",
"invocation_count": 123,
"prompt_tokens": 123,
"completion_tokens": 123,
"total_tokens": 123,
"cached_tokens": 123
}
],
"per_testing_criteria_results": [
{
"testing_criteria": "<string>",
"passed": 123,
"failed": 123
}
],
"data_source": {
"type": "jsonl",
"source": {
"type": "file_content",
"content": [
{
"item": {},
"sample": {}
}
]
}
},
"metadata": {},
"error": {
"code": "<string>",
"message": "<string>"
}
}Create a new evaluation run. This is the endpoint that will kick off grading.
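As an illustration, a populated request might look like the following. The eval ID, run name, and row contents are hypothetical placeholders; the keys inside each item must match the schema the eval was created with, and sample is assumed here to carry the pre-generated output being graded.

curl --request POST \
  --url https://api.openai.com/v1/evals/eval_abc123/runs \
  --header "Authorization: Bearer $OPENAI_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '
{
  "data_source": {
    "type": "jsonl",
    "source": {
      "type": "file_content",
      "content": [
        {
          "item": {"question": "What is 2 + 2?", "expected": "4"},
          "sample": {"output_text": "4"}
        }
      ]
    }
  },
  "name": "arithmetic-smoke-test",
  "metadata": {"team": "qa"}
}
'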
Authorizations

Authorization: Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Path Parameters

eval_id: The ID of the evaluation to create a run for.
Body

data_source: A JsonlRunDataSource object specifying a JSONL file that matches the eval.
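A minimal sketch of such an object, assuming each JSONL row is supplied inline as one entry of content (the field names inside item are illustrative):

{
  "type": "jsonl",
  "source": {
    "type": "file_content",
    "content": [
      {"item": {"question": "Capital of France?", "expected": "Paris"}},
      {"item": {"question": "Capital of Japan?", "expected": "Tokyo"}}
    ]
  }
}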
name: The name of the run.
metadata: Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard. Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
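For example, a metadata object that respects those limits (keys and values are illustrative):

{
  "dataset_version": "2024-06-01",
  "owner": "eval-team"
}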
Response

200: Successfully created a run for the evaluation.
A schema representing an evaluation run.
object: The type of the object. Always "eval.run".
id: Unique identifier for the evaluation run.
eval_id: The identifier of the associated evaluation.
status: The status of the evaluation run.
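Since runs are graded asynchronously, callers typically poll this field. A minimal sketch, assuming a run can be retrieved via GET /v1/evals/{eval_id}/runs/{run_id} and that the terminal statuses include completed, failed, and canceled (both assumptions; the IDs below are placeholders):

# Poll the hypothetical run until it reaches an assumed terminal status.
while true; do
  status=$(curl -s \
    --url https://api.openai.com/v1/evals/eval_abc123/runs/evalrun_xyz789 \
    --header "Authorization: Bearer $OPENAI_API_KEY" | jq -r '.status')
  echo "run status: $status"
  case "$status" in
    completed|failed|canceled) break ;;   # assumed terminal statuses
  esac
  sleep 10
done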
model: The model that is evaluated, if applicable.
name: The name of the evaluation run.
created_at: Unix timestamp (in seconds) when the evaluation run was created.
report_url: The URL to the rendered evaluation run report on the UI dashboard.
result_counts: Counters summarizing the outcomes of the evaluation run.
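Once a run completes, a pass rate can be derived from these counters with jq, for example (placeholder IDs again):

curl -s \
  --url https://api.openai.com/v1/evals/eval_abc123/runs/evalrun_xyz789 \
  --header "Authorization: Bearer $OPENAI_API_KEY" \
  | jq '.result_counts | .passed / .total'   # fraction of rows that passed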
per_model_usage: Usage statistics for each model during the evaluation run.
per_testing_criteria_results: Results per testing criteria applied during the evaluation run.
data_source: A JsonlRunDataSource object specifying a JSONL file that matches the eval.
metadata: Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard. Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
error: An object representing an error response from the Eval API.
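A small sketch for surfacing that object when a run fails, assuming error is null or absent on success (an assumption; placeholder IDs):

curl -s \
  --url https://api.openai.com/v1/evals/eval_abc123/runs/evalrun_xyz789 \
  --header "Authorization: Bearer $OPENAI_API_KEY" \
  | jq -r 'if .error then "\(.error.code): \(.error.message)" else "no error" end'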