whisper-tiny-en
Model ID: @cf/openai/whisper-tiny-en
Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. Trained on 680k hours of labelled data, Whisper models generalize well to many datasets and domains without the need for fine-tuning. This is the English-only version of the Whisper Tiny model, which was trained on the task of speech recognition.
Properties
Task Type: Automatic Speech Recognition
Code Examples
Workers - TypeScript
export interface Env {
  AI: Ai;
}

export default {
  async fetch(request, env): Promise<Response> {
    const res = await fetch(
      "https://github.com/Azure-Samples/cognitive-services-speech-sdk/raw/master/samples/cpp/windows/console/samples/enrollment_audio_katie.wav"
    );
    const blob = await res.arrayBuffer();

    const input = {
      audio: [...new Uint8Array(blob)],
    };

    const response = await env.AI.run(
      "@cf/openai/whisper-tiny-en",
      input
    );

    return Response.json({ input: { audio: [] }, response });
  },
} satisfies ExportedHandler<Env>;
curl
curl https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run/@cf/openai/whisper-tiny-en \
  -X POST \
  -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  --data-binary "@talking-llama.mp3"
Response
Automatic speech recognition responses contain a single string text property with the audio transcription and, if the model supports it, an optional words array with a start and end timestamp for each word.
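As a minimal sketch of consuming such a response, the TypeScript below declares types that mirror the fields described above and formats each word with its time range. The names AsrWord, AsrResponse, and formatWords are hypothetical and are not part of the Workers AI API; the timestamps are assumed to be in seconds, which the page does not state.

```typescript
// Hypothetical types mirroring the response fields described above.
interface AsrWord {
  word: string;
  start: number; // assumed to be seconds
  end: number; // assumed to be seconds
}

interface AsrResponse {
  text: string;
  word_count?: number;
  words?: AsrWord[]; // only present if the model supports timestamps
}

// Turn the optional words array into lines such as "0.56-1.00 It".
function formatWords(response: AsrResponse): string[] {
  // words is optional, so fall back to an empty list.
  return (response.words ?? []).map(
    (w) => `${w.start.toFixed(2)}-${w.end.toFixed(2)} ${w.word}`
  );
}
```

Because words is optional, formatWords returns an empty array when a response carries only the text property.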
{
  "text": "It is a good day",
  "word_count": 5,
  "words": [
    {
      "word": "It",
      "start": 0.5600000023841858,
      "end": 1
    },
    {
      "word": "is",
      "start": 1,
      "end": 1.100000023841858
    },
    {
      "word": "a",
      "start": 1.100000023841858,
      "end": 1.2200000286102295
    },
    {
      "word": "good",
      "start": 1.2200000286102295,
      "end": 1.3200000524520874
    },
    {
      "word": "day",
      "start": 1.3200000524520874,
      "end": 1.4600000381469727
    }
  ]
}
API Schema
The following schemas are based on JSON Schema.
Input JSON Schema
{
  "oneOf": [
    {
      "type": "string",
      "format": "binary"
    },
    {
      "type": "object",
      "properties": {
        "audio": {
          "type": "array",
          "items": {
            "type": "number"
          }
        }
      },
      "required": [
        "audio"
      ]
    }
  ]
}
Output JSON Schema
{
  "type": "object",
  "contentType": "application/json",
  "properties": {
    "text": {
      "type": "string"
    },
    "word_count": {
      "type": "number"
    },
    "words": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "word": {
            "type": "string"
          },
          "start": {
            "type": "number"
          },
          "end": {
            "type": "number"
          }
        }
      }
    },
    "vtt": {
      "type": "string"
    }
  },
  "required": [
    "text"
  ]
}
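The output schema above can be checked at runtime before a value is used. The following is a rough sketch only: WhisperOutput and isWhisperOutput are hypothetical names, and the guard checks just the top-level properties (text is required; word_count, words, and vtt are optional), not the shape of each entry in words.

```typescript
// Hypothetical type derived from the output schema above.
interface WhisperOutput {
  text: string;
  word_count?: number;
  words?: { word?: string; start?: number; end?: number }[];
  vtt?: string;
}

// Shallow type guard: validates top-level property types only.
function isWhisperOutput(value: unknown): value is WhisperOutput {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  // text is the only required property in the schema.
  if (typeof v.text !== "string") return false;
  if (v.word_count !== undefined && typeof v.word_count !== "number") return false;
  if (v.vtt !== undefined && typeof v.vtt !== "string") return false;
  if (v.words !== undefined && !Array.isArray(v.words)) return false;
  return true;
}
```

A caller might run this guard on the parsed JSON body and fall back to an error response when it returns false.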