Speech To Text

Convert speech into text using an API powered by the best of AI technologies.

The Speech To Text (STT) API converts spoken language in an audio file into written text. Developers can use it to add speech recognition to their applications, for example to transcribe recordings or to accept spoken input from users.

Pricing

Requests made to the Speech To Text (STT) API are billed per request.

The pricing for API requests is as follows:

  • Per Request Cost: 3 units base cost per request.

POST https://api.autogon.ai/api/v1/services/speech-to-text/

Headers

Name            Type     Description
Content-Type*   String   application/json

Request Body

Name            Type     Description
audio*          File     Audio to be processed and converted to text
language_code   String   Specifies the language spoken in the audio, defaults to "en"
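The following is a minimal sketch of how a request to this endpoint might be assembled in Python, using only the standard library. The endpoint URL and the field names audio and language_code come from this page. Everything else is an assumption: the Headers table lists application/json, but audio is typed as File, so this sketch assumes a multipart/form-data upload, which is the usual way to send a file over HTTP. The helper names build_stt_request and send, and the extra_headers parameter (a placeholder for whatever authentication header your account requires), are hypothetical and not part of the API.

```python
import json
import urllib.request
import uuid

STT_URL = "https://api.autogon.ai/api/v1/services/speech-to-text/"


def build_stt_request(audio_bytes, filename, language_code="en", extra_headers=None):
    """Build (but do not send) a POST request for the STT endpoint.

    Assumption: the file is uploaded as multipart/form-data.
    """
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode()
    parts = [
        # File part: the field name "audio" comes from the Request Body table.
        delimiter,
        f'Content-Disposition: form-data; name="audio"; filename="{filename}"'.encode(),
        b"Content-Type: application/octet-stream",
        b"",
        audio_bytes,
        # Text part: optional language_code; "en" is the documented default.
        delimiter,
        b'Content-Disposition: form-data; name="language_code"',
        b"",
        language_code.encode(),
        # Closing delimiter ends the multipart body.
        delimiter + b"--",
        b"",
    ]
    body = b"\r\n".join(parts)
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    # Hypothetical hook for any authentication header the service requires.
    headers.update(extra_headers or {})
    return urllib.request.Request(STT_URL, data=body, headers=headers, method="POST")


def send(request):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

To use it, read the audio file as bytes, pass them to build_stt_request together with the file name, and hand the result to send. If the service turns out to expect a JSON body instead, only the body and Content-Type construction would need to change.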

Response

{
    "success":true,
    "data":{
        "results":[
            {
                "alternatives":[
                    {
                        "transcript":"example audio transcription by Autogon AI",
                        "confidence":0.922834,
                        "words":[]
                    }],
                "resultEndTime":"19.110s",
                "languageCode":"en-gb",
                "channelTag":0
            }],
        "totalBilledTime":"20s",
        "requestId":"7571580858796951372"
    }
}
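As a sketch of how the response above could be consumed, the Python snippet below pulls the transcript and confidence out of each result. The field names are taken from the sample response. The function name extract_transcripts is hypothetical, and choosing the alternative with the highest confidence is an assumption about how multiple alternatives should be ranked; the sample contains only one.

```python
import json

# Copy of the sample response shown above.
SAMPLE_RESPONSE = """
{
  "success": true,
  "data": {
    "results": [
      {
        "alternatives": [
          {"transcript": "example audio transcription by Autogon AI",
           "confidence": 0.922834,
           "words": []}
        ],
        "resultEndTime": "19.110s",
        "languageCode": "en-gb",
        "channelTag": 0
      }
    ],
    "totalBilledTime": "20s",
    "requestId": "7571580858796951372"
  }
}
"""


def extract_transcripts(response):
    """Return a (transcript, confidence) pair for each result in the response."""
    if not response.get("success"):
        raise ValueError("request was not successful")
    pairs = []
    for result in response.get("data", {}).get("results", []):
        alternatives = result.get("alternatives", [])
        if not alternatives:
            continue  # nothing recognised for this result
        # Assumption: prefer the alternative with the highest confidence.
        best = max(alternatives, key=lambda alt: alt.get("confidence", 0.0))
        pairs.append((best["transcript"], best.get("confidence")))
    return pairs


if __name__ == "__main__":
    for transcript, confidence in extract_transcripts(json.loads(SAMPLE_RESPONSE)):
        print(transcript, confidence)
```

Checking success before reading data keeps a failed request from surfacing later as a confusing missing-key error.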
