Voxtral Speech Recognition Demo

Try Voxtral, an open-source speech understanding model. Upload an audio or video file, record live speech, or use our sample content to see Voxtral's transcription and multilingual capabilities in action.

Input Audio

Upload Audio File

📁

Drag and drop audio or video files here, or click to browse

Supported formats: MP3, WAV, M4A, FLAC, MP4, AVI, MOV (max 50MB)
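
The format and size limits above can be checked before a file is sent anywhere. The following is a minimal Python sketch of such a check, not the demo's actual implementation. The names `check_upload`, `SUPPORTED_EXTENSIONS`, and `MAX_SIZE_BYTES` are hypothetical, and reading "50MB" as 50 * 1024 * 1024 bytes is an assumption, since the page does not say which convention it uses.

```python
from pathlib import Path

# Formats taken from the upload notice above.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".mp4", ".avi", ".mov"}
# Assumption: "50MB" means 50 * 1024 * 1024 bytes.
MAX_SIZE_BYTES = 50 * 1024 * 1024


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = Path(filename).suffix.lower()  # compare case-insensitively
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(none)'}")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append(f"file too large: {size_bytes} bytes (limit {MAX_SIZE_BYTES})")
    return problems


if __name__ == "__main__":
    print(check_upload("lecture.MP3", 12_000_000))
    print(check_upload("notes.ogg", 60 * 1024 * 1024))
```

This sketch looks only at the file extension and the reported size. It does not inspect the file contents, so a renamed file would still pass.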

Record Live Audio

Try Sample Content

Voxtral Results

âŗ Ready to process audio

Upload or record audio to see Voxtral's transcription results here.

Language: -
Confidence: -
Duration: -

Ask a question about the audio content to see Voxtral's understanding capabilities.

Upload audio to see Voxtral's automatic summarization capabilities.

Export Results

Voxtral Demo Features

Experience the full range of Voxtral's capabilities through our interactive demonstration.

🎯

High Accuracy Transcription

State-of-the-art word error rate (WER), lower than that of Whisper and other leading models.
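
For readers unfamiliar with the metric, WER is the number of word-level substitutions, deletions, and insertions needed to turn the model's transcript into the reference transcript, divided by the number of words in the reference. The Python sketch below illustrates that standard definition. The function name `word_error_rate` is hypothetical, and the normalization (lowercasing and whitespace splitting only) is a simplifying assumption; published benchmarks typically normalize text more carefully.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Assumption: lowercase + whitespace split is the only normalization applied.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # deletion (reference word missing)
                current[j - 1] + 1,      # insertion (extra hypothesis word)
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Lower is better, and the value can exceed 1.0 when the hypothesis contains many extra words.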

🌍

Multilingual Support

Automatic language detection and superior performance across 8+ languages, including English, Spanish, French, Portuguese, Hindi, German, Dutch, and Italian.

❓

Direct Q&A

Ask questions directly about audio content without additional LLM chaining. Voxtral understands context and provides relevant answers.

📝

Automatic Summarization

Generate concise summaries of long audio content automatically. Perfect for podcasts, lectures, and meetings.

⚡

Real-time Processing

Experience live transcription and processing capabilities with minimal latency for interactive applications.

🔓

Open Source

All Voxtral models are open source under the Apache 2.0 license. Deploy locally, customize, and build without vendor lock-in.

Performance Comparison

See how Voxtral compares to other speech recognition solutions in real-world scenarios.

Feature                   Voxtral Small  Voxtral Mini  Whisper Large v3  GPT-4o Mini
English WER               SOTA           Excellent     Good              Good
Multilingual Performance  Superior       Superior      Good              Good
API Cost (per minute)     $0.001         $0 (local)    $0.002            $0.003
Built-in Q&A              Yes            Yes           No                No
Local Deployment          Advanced       Easy          Possible          No
License                   Apache 2.0     Apache 2.0    MIT               Proprietary
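
To put the per-minute prices in the comparison above into perspective, the Python sketch below multiplies them out for a given amount of audio. The prices are copied from the table as listed. The names `PRICE_PER_MINUTE_USD` and `estimate_cost` are hypothetical, and the "$0 (local)" entry is treated as zero API cost, which leaves out whatever hardware and electricity a local deployment needs.

```python
from decimal import Decimal

# Per-minute API prices as listed in the comparison table (USD).
PRICE_PER_MINUTE_USD = {
    "Voxtral Small": Decimal("0.001"),
    "Voxtral Mini": Decimal("0"),  # listed as "$0 (local)"; hardware cost not included
    "Whisper Large v3": Decimal("0.002"),
    "GPT-4o Mini": Decimal("0.003"),
}


def estimate_cost(minutes: int) -> dict[str, Decimal]:
    """Return the estimated API cost in USD for each model."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return {name: price * minutes for name, price in PRICE_PER_MINUTE_USD.items()}


if __name__ == "__main__":
    # Example: 1,000 hours of audio is 60,000 minutes.
    for name, cost in estimate_cost(60_000).items():
        print(f"{name}: ${cost:.2f}")
```

`Decimal` is used instead of `float` so that small per-minute prices multiply out without rounding drift.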

Ready to Use Voxtral in Your Projects?

Get started with Voxtral's open-source models and build the next generation of speech AI applications.