CTranslate2-based reimplementation of Whisper that runs up to 4x faster with comparable accuracy. Uses INT8 quantization and optimized compute kernels for efficient inference on CPU and GPU with reduced memory footprint.
Up to 4x faster than the original Whisper, using INT8 quantization
Significantly reduced memory footprint enables larger batch sizes
Drop-in replacement producing identical output format
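To make the INT8 claim concrete, here is a minimal sketch of symmetric per-tensor INT8 quantization. It illustrates the general technique only and is not CTranslate2's actual implementation, whose scaling scheme and kernels may differ. The names `quantizeInt8`, `dequantizeInt8`, and `QuantizedTensor` are hypothetical and exist only for this example.

```typescript
// Generic symmetric INT8 quantization, for illustration only.
// Assumption: one scale per tensor; real engines may scale differently.

interface QuantizedTensor {
  values: Int8Array; // 1 byte per weight instead of 4
  scale: number;     // multiply by this to recover approximate floats
}

function quantizeInt8(weights: Float32Array): QuantizedTensor {
  // The largest magnitude in the tensor is mapped to 127.
  let maxAbs = 0;
  for (let i = 0; i < weights.length; i++) {
    maxAbs = Math.max(maxAbs, Math.abs(weights[i]));
  }
  const scale = maxAbs > 0 ? maxAbs / 127 : 1;

  const values = new Int8Array(weights.length);
  for (let i = 0; i < weights.length; i++) {
    // Round to the nearest integer step and clamp to [-127, 127].
    const q = Math.round(weights[i] / scale);
    values[i] = Math.max(-127, Math.min(127, q));
  }
  return { values, scale };
}

function dequantizeInt8(t: QuantizedTensor): Float32Array {
  const out = new Float32Array(t.values.length);
  for (let i = 0; i < out.length; i++) {
    out[i] = t.values[i] * t.scale;
  }
  return out;
}

const weights = new Float32Array([0.5, -1.0, 0.25, 0.0]);
const quantized = quantizeInt8(weights);
const restored = dequantizeInt8(quantized);

console.log("FP32 bytes:", weights.byteLength);          // 16
console.log("INT8 bytes:", quantized.values.byteLength); // 4
console.log("Restored:", Array.from(restored));
```

Each weight shrinks from four bytes to one, which is where the reduced memory footprint comes from. The restored values differ from the originals by a small rounding error, which is why accuracy is described as comparable and not identical.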
Transcribe meetings in real-time with speaker identification and punctuation.
Analyze customer calls at scale with sentiment detection and keyword spotting.
Convert audio and video libraries into searchable text archives.
Provide real-time captions for broadcasts, presentations, and live events.
// Faster Whisper: Speech-to-Text
import { transcribe } from "@arkitekton/voice";

const result = await transcribe({
  model: "vm-hf-007",
  vendor: "huggingface",
  audio: audioFile,
  language: "en",
  options: {
    punctuate: true,
    diarize: true,
    smart_format: true,
  },
});

console.log("Transcript:", result.text);
console.log("Confidence:", result.confidence);

Community-driven open-source speech models and toolkits