Nova-3

Deepgram | Speech-to-Text | Multilingual | Generally Available | Proprietary | vm-dg-001

About

Automatic speech recognition model built on end-to-end deep learning. It reports benchmark-leading accuracy and supports real-time streaming, speaker diarization, topic detection, sentiment analysis, and entity recognition.

Capabilities (5)

Industry-leading accuracy

Real-time streaming

Speaker diarization

Sentiment analysis

Entity recognition

Key Highlights

Benchmark-leading word error rate across multiple domains

Built-in sentiment analysis and entity recognition, with no separate processing pipeline required

Sub-300ms streaming latency for real-time voice applications

Use Cases

Meeting Transcription

Transcribe meetings in real time with speaker identification and punctuation.

Call Center Analytics

Analyze customer calls at scale with sentiment detection and keyword spotting.

Content Indexing

Convert audio and video libraries into searchable text archives.

Live Captioning

Provide real-time captions for broadcasts, presentations, and live events.

Code Example

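// Nova-3: Speech-to-Text
import { readFile } from "node:fs/promises";
import { transcribe } from "@arkitekton/voice";

// Assumption: the SDK accepts raw audio bytes; "meeting.wav" is a placeholder path.
const audioFile = await readFile("meeting.wav");

const result = await transcribe({
  model: "vm-dg-001", // catalog ID for Nova-3
  vendor: "deepgram",
  audio: audioFile,
  language: "en",
  options: {
    punctuate: true, // add punctuation and capitalization
    diarize: true, // label which speaker said each word
    smart_format: true, // format dates, numbers, and similar entities
  },
});

console.log("Transcript:", result.text);
console.log("Confidence:", result.confidence);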
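With diarize enabled, a common next step for meeting transcripts is to merge consecutive words from the same speaker into readable turns. The sketch below shows one way to do that. The DiarizedWord shape (word, speaker, start, end) is an assumption made for illustration, and the response returned by transcribe may be structured differently, so adapt the field names to what the SDK actually provides.

```typescript
// Hypothetical per-word shape; the real SDK response may differ.
interface DiarizedWord {
  word: string;
  speaker: number;
  start: number; // seconds from the start of the audio
  end: number; // seconds from the start of the audio
}

interface SpeakerTurn {
  speaker: number;
  start: number;
  end: number;
  text: string;
}

// Merge consecutive words spoken by the same speaker into a single turn.
function groupIntoTurns(words: DiarizedWord[]): SpeakerTurn[] {
  const turns: SpeakerTurn[] = [];
  for (const w of words) {
    const last = turns[turns.length - 1];
    if (last !== undefined && last.speaker === w.speaker) {
      // Same speaker keeps talking: extend the current turn.
      last.text += " " + w.word;
      last.end = w.end;
    } else {
      // Speaker changed (or first word): start a new turn.
      turns.push({ speaker: w.speaker, start: w.start, end: w.end, text: w.word });
    }
  }
  return turns;
}

// Illustrative sample data, not real model output.
const sampleWords: DiarizedWord[] = [
  { word: "Hello", speaker: 0, start: 0.0, end: 0.4 },
  { word: "everyone.", speaker: 0, start: 0.5, end: 1.0 },
  { word: "Hi.", speaker: 1, start: 1.2, end: 1.5 },
];

for (const turn of groupIntoTurns(sampleWords)) {
  console.log(`Speaker ${turn.speaker}: ${turn.text}`);
}
// Speaker 0: Hello everyone.
// Speaker 1: Hi.
```

Keeping the start and end times on each turn makes it possible to link a line of the transcript back to its position in the recording.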

Related Models

NeMo ASR (NVIDIA)

Riva (NVIDIA)

Parakeet (NVIDIA)

gpt-4o-realtime (OpenAI)

gpt-4o-mini-realtime (OpenAI)

gpt-4o-mini-transcribe (OpenAI)

Quick Stats

Latency: <300ms streaming

Languages: 36 supported

License: Proprietary

Pricing: $0.0043 / minute (Pay-as-you-go)

Status: Generally Available
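To get a rough sense of what the pay-as-you-go rate means for a workload, the arithmetic can be sketched as below. The function name and the billing model are assumptions: the sketch charges proportionally to audio duration with no rounding, minimum charge, or volume discount, which may not match how usage is actually billed.

```typescript
// Rate from the Quick Stats above, in USD per minute of audio.
const PRICE_PER_MINUTE_USD = 0.0043;

// Assumption: cost scales linearly with duration, with no rounding
// or minimum charge. Actual billing rules may differ.
function estimateCostUsd(audioSeconds: number): number {
  if (!Number.isFinite(audioSeconds) || audioSeconds < 0) {
    throw new RangeError("audioSeconds must be a non-negative finite number");
  }
  return (audioSeconds / 60) * PRICE_PER_MINUTE_USD;
}

// Example: 1,000 hours of call recordings.
const hoursOfAudio = 1000;
console.log(estimateCostUsd(hoursOfAudio * 3600).toFixed(2)); // "258.00"
```

At this rate, 1,000 hours (60,000 minutes) of audio works out to about $258.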

Vendor

Deepgram

End-to-end deep learning ASR and TTS built for developers
