Question 1

Is my audio uploaded anywhere?

Accepted Answer

No. Transcription runs entirely in your browser using Whisper via Transformers.js. The only network request downloads the model from Hugging Face; your audio file never leaves your device.

Question 2

What languages are supported?

Accepted Answer

English, Chinese, Japanese, Korean, Spanish, French, German, Russian, Portuguese, Italian, Hindi, and Arabic. Select the spoken language before transcribing; if none is chosen, Whisper defaults to English.
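As a rough illustration of the fallback described above, the sketch below maps a user's selection to the options that might be handed to the transcription call. The names `SUPPORTED_LANGUAGES`, `TranscribeOptions`, and `buildTranscribeOptions` are hypothetical, and the `language`/`task` option shape is an assumption about how the tool passes settings to Whisper, not something confirmed by this page.

```typescript
// The twelve languages listed in the answer, as lowercase English names.
// Assumption: the transcription call accepts a `language` option in this form.
const SUPPORTED_LANGUAGES = [
  "english", "chinese", "japanese", "korean", "spanish", "french",
  "german", "russian", "portuguese", "italian", "hindi", "arabic",
] as const;

type SupportedLanguage = (typeof SUPPORTED_LANGUAGES)[number];

// Hypothetical option shape for illustration only.
interface TranscribeOptions {
  language: SupportedLanguage;
  task: "transcribe";
}

function buildTranscribeOptions(selected?: string): TranscribeOptions {
  const normalized = (selected ?? "").trim().toLowerCase();
  const match = SUPPORTED_LANGUAGES.find((lang) => lang === normalized);
  // Mirror the documented behavior: fall back to English when nothing valid is chosen.
  return { language: match ?? "english", task: "transcribe" };
}
```

With this sketch, `buildTranscribeOptions("Chinese")` yields a `language` of `"chinese"`, while calling it with no argument yields `"english"`, which is why audio in another language transcribes poorly if the selector is left untouched.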

Question 3

Why does Chinese transcription come out in Traditional characters?

Accepted Answer

Whisper's Chinese output tends toward Traditional characters regardless of the speaker's accent. Use the 简体/繁體 toggle above the transcript to switch between Simplified and Traditional after transcription.
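A toggle like this can work as a character-by-character substitution applied to the finished transcript. The sketch below shows the idea with a deliberately tiny table. `PAIRS` and `convertScript` are hypothetical names, and how the tool itself converts text is not stated on this page. A real converter needs a full dictionary (for example, one from the OpenCC project) and has to handle Simplified characters that map to more than one Traditional form.

```typescript
type Script = "simplified" | "traditional";

// Tiny illustrative table of [simplified, traditional] pairs; NOT a complete dictionary.
const PAIRS: ReadonlyArray<readonly [string, string]> = [
  ["简", "簡"], ["体", "體"], ["语", "語"],
  ["说", "說"], ["这", "這"], ["个", "個"],
];

const toTraditional = new Map<string, string>(PAIRS);
const toSimplified = new Map<string, string>(
  PAIRS.map(([s, t]): [string, string] => [t, s]),
);

function convertScript(text: string, target: Script): string {
  const table = target === "traditional" ? toTraditional : toSimplified;
  // Characters missing from the table (Latin letters, punctuation, shared
  // characters) pass through unchanged.
  return Array.from(text, (ch) => table.get(ch) ?? ch).join("");
}
```

Because the conversion runs on text that already exists, switching scripts does not require transcribing the audio again.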

Question 4

What are the file limits?

Accepted Answer

MP3, WAV, M4A, OGG, or WebM files, up to 25 MB in size and 10 minutes long.
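The limits above could be checked before transcription starts. This is a minimal sketch, not the tool's actual validation: `AudioInfo` and `validateAudio` are hypothetical names, the check uses the file extension rather than inspecting file contents, and it assumes "25 MB" means 25 × 1024 × 1024 bytes.

```typescript
const MAX_BYTES = 25 * 1024 * 1024; // assumption: 25 MB counted as 25 MiB
const MAX_SECONDS = 10 * 60;        // 10 minutes
const ALLOWED_EXTENSIONS = ["mp3", "wav", "m4a", "ogg", "webm"];

// Hypothetical summary of a selected file.
interface AudioInfo {
  name: string;
  sizeBytes: number;
  durationSeconds: number;
}

// Returns a list of problems; an empty list means the file is within limits.
function validateAudio(info: AudioInfo): string[] {
  const errors: string[] = [];
  const dot = info.name.lastIndexOf(".");
  const ext = dot >= 0 ? info.name.slice(dot + 1).toLowerCase() : "";
  if (ALLOWED_EXTENSIONS.indexOf(ext) === -1) {
    errors.push("Unsupported format: use MP3, WAV, M4A, OGG, or WebM.");
  }
  if (info.sizeBytes > MAX_BYTES) {
    errors.push("File is larger than 25 MB.");
  }
  if (info.durationSeconds > MAX_SECONDS) {
    errors.push("Audio is longer than 10 minutes.");
  }
  return errors;
}
```

In a browser, the size would typically come from `File.size` and the duration from an audio element once its metadata has loaded.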

AI Speech to Text
