Transcriptions (Speech to Text)
POST
/v1/audio/transcriptions
Transcribe audio into the input language.
The transcription API accepts the audio file you want to transcribe as input, along with the desired output file format for the transcription. We currently support multiple input and output file formats.
Price: 0.003 PTC / min
Request Parameters
API Key from 302.AI backend
Audio files to be transcribed in one of the following formats: mp3, mp4, mpeg, mpga, m4a, wav or webm.
Model to use: whisper-large-v3.
Optional text to guide the model's style or to continue a previous audio segment. The prompt should match the audio language.
The format of the transcript output. Choose from: json, text, srt, verbose_json, or vtt.
The sampling temperature ranges from 0 to 1. Higher values (e.g., 0.8) increase randomness in the output, while lower values (e.g., 0.2) make it more focused and deterministic. If set to 0, the model uses log probability to automatically adjust the temperature until specific thresholds are reached.
Sample Code
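The parameters described above can be assembled into a multipart POST request. The following is a minimal sketch using only the Python standard library, not an official client. Several things in it are assumptions that the text above does not state: the base URL `https://api.302.ai`, Bearer-token authentication, and the form field names `file`, `model`, `prompt`, `response_format` and `temperature` (taken from the common OpenAI-compatible convention). The helper names `build_multipart`, `build_request` and `transcribe` are hypothetical.

```python
import urllib.request
import uuid

# Assumption: the host is not given in the text above; replace it with the
# base URL shown in your 302.AI account.
BASE_URL = "https://api.302.ai"
ENDPOINT = "/v1/audio/transcriptions"

# Output formats listed in the parameter description.
ALLOWED_FORMATS = {"json", "text", "srt", "verbose_json", "vtt"}


def build_multipart(fields, filename, file_bytes, boundary=None):
    """Encode text fields plus one file part as multipart/form-data."""
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; '
            f'name="{name}"\r\n\r\n{value}\r\n'.encode("utf-8")
        )
    # Assumption: the audio is sent in a form field named "file".
    file_header = (
        f'--{boundary}\r\nContent-Disposition: form-data; '
        f'name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    parts.append(file_header + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_request(api_key, filename, file_bytes,
                  response_format="json", temperature=0.0, prompt=None):
    """Validate the parameters and return an unsent urllib Request."""
    if response_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format}")
    if not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")
    fields = {
        "model": "whisper-large-v3",
        "response_format": response_format,
        "temperature": temperature,
    }
    if prompt:
        fields["prompt"] = prompt
    body, content_type = build_multipart(fields, filename, file_bytes)
    return urllib.request.Request(
        BASE_URL + ENDPOINT,
        data=body,
        method="POST",
        headers={
            # Assumption: the API key is passed as a Bearer token.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": content_type,
        },
    )


def transcribe(request):
    """Send the request and return the raw response body as bytes."""
    with urllib.request.urlopen(request) as resp:
        return resp.read()
```

To use it, read the audio file as bytes, pass it to `build_request` together with your API key, and hand the result to `transcribe`. Building the request separately from sending it keeps the validation and encoding easy to check without network access.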
Responses
{
  "text": "Imagine the wildest idea that you've ever had, and you're curious about how it might scale to something that's a 100, a 1,000 times bigger. This is a place where you can get to do that."
}
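Reading the transcript out of a response like the one above might look as follows. This is a sketch under two assumptions not confirmed by the text: that `verbose_json` responses also carry a top-level `text` key, and that the `text`, `srt` and `vtt` formats return the transcript as a plain-text body. The helper name `extract_transcript` is hypothetical.

```python
import json

# Formats assumed to return a JSON object with a top-level "text" key.
JSON_FORMATS = {"json", "verbose_json"}


def extract_transcript(body: bytes, response_format: str = "json") -> str:
    """Return the transcript from a raw response body."""
    decoded = body.decode("utf-8")
    if response_format in JSON_FORMATS:
        return json.loads(decoded)["text"]
    # Assumption: text, srt and vtt bodies are already the transcript.
    return decoded
```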