> ## Documentation Index
> Fetch the complete documentation index at: https://openai-hd4n6.mintlify.site/llms.txt
> Use this file to discover all available pages before exploring further.

# Generate speech

> Generates audio from the input text.



## OpenAPI

````yaml api-definition.yaml post /audio/speech
openapi: 3.0.0
info:
  title: OpenAI API
  description: >-
    The OpenAI REST API. Please see
    https://platform.openai.com/docs/api-reference for more details.
  version: 2.3.0
  termsOfService: https://openai.com/policies/terms-of-use
  contact:
    name: OpenAI Support
    url: https://help.openai.com/
  license:
    name: MIT
    url: https://github.com/openai/openai-openapi/blob/master/LICENSE
servers:
  - url: https://api.openai.com/v1
security:
  - ApiKeyAuth: []
tags:
  - name: Assistants
    description: Build Assistants that can call models and use tools.
  - name: Audio
    description: Turn audio into text or text into audio.
  - name: Chat
    description: >-
      Given a list of messages comprising a conversation, the model will return
      a response.
  - name: Completions
    description: >-
      Given a prompt, the model will return one or more predicted completions,
      and can also return the probabilities of alternative tokens at each
      position.
  - name: Embeddings
    description: >-
      Get a vector representation of a given input that can be easily consumed
      by machine learning models and algorithms.
  - name: Evals
    description: Manage and run evals in the OpenAI platform.
  - name: Fine-tuning
    description: Manage fine-tuning jobs to tailor a model to your specific training data.
  - name: Batch
    description: Create large batches of API requests to run asynchronously.
  - name: Files
    description: >-
      Files are used to upload documents that can be used with features like
      Assistants and Fine-tuning.
  - name: Uploads
    description: Use Uploads to upload large files in multiple parts.
  - name: Images
    description: Given a prompt and/or an input image, the model will generate a new image.
  - name: Models
    description: List and describe the various models available in the API.
  - name: Moderations
    description: >-
      Given text and/or image inputs, classifies if those inputs are potentially
      harmful.
  - name: Audit Logs
    description: List user actions and configuration changes within this organization.
paths:
  /audio/speech:
    post:
      tags:
        - Audio
      summary: Generate speech
      description: Generates audio from the input text.
      operationId: createSpeech
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/CreateSpeechRequest'
      responses:
        '200':
          description: OK
          headers:
            Transfer-Encoding:
              schema:
                type: string
              description: chunked
          content:
            application/octet-stream:
              schema:
                type: string
                format: binary
components:
  schemas:
    CreateSpeechRequest:
      type: object
      additionalProperties: false
      properties:
        model:
          description: >
            One of the available [TTS models](/docs/models#tts): `tts-1`,
            `tts-1-hd` or `gpt-4o-mini-tts`.
          anyOf:
            - type: string
            - type: string
              enum:
                - tts-1
                - tts-1-hd
                - gpt-4o-mini-tts
          x-oaiTypeLabel: string
        input:
          type: string
          description: >-
            The text to generate audio for. The maximum length is 4096
            characters.
          maxLength: 4096
        instructions:
          type: string
          description: >-
            Control the voice of your generated audio with additional
            instructions. Does not work with `tts-1` or `tts-1-hd`.
          maxLength: 4096
        voice:
          $ref: '#/components/schemas/VoiceIdsShared'
          description: >-
            The voice to use when generating the audio. Supported voices are
            `alloy`, `ash`, `ballad`, `coral`, `echo`, `fable`, `onyx`, `nova`,
            `sage`, `shimmer`, and `verse`. Previews of the voices are available
            in the [Text to speech
            guide](/docs/guides/text-to-speech#voice-options).
        response_format:
          description: >-
            The format to audio in. Supported formats are `mp3`, `opus`, `aac`,
            `flac`, `wav`, and `pcm`.
          default: mp3
          type: string
          enum:
            - mp3
            - opus
            - aac
            - flac
            - wav
            - pcm
        speed:
          description: >-
            The speed of the generated audio. Select a value from `0.25` to
            `4.0`. `1.0` is the default.
          type: number
          default: 1
          minimum: 0.25
          maximum: 4
      required:
        - model
        - input
        - voice
    VoiceIdsShared:
      example: ash
      anyOf:
        - type: string
        - type: string
          enum:
            - alloy
            - ash
            - ballad
            - coral
            - echo
            - fable
            - onyx
            - nova
            - sage
            - shimmer
            - verse
  securitySchemes:
    ApiKeyAuth:
      type: http
      scheme: bearer

````