Aura-2

Deepgram · Text-to-Speech · Generally Available · Proprietary · vm-dg-002

About

Enterprise text-to-speech model offering 40+ voices with domain-specific pronunciation tuning. Optimized for contact centers, IVR systems, and voice agents with adjustable speaking rate and emphasis controls.

Capabilities (5)

40+ voices

Domain pronunciation tuning

Adjustable speaking rate

Emphasis controls

Streaming output

Key Highlights

Domain-specific pronunciation models for finance, healthcare, and tech

40+ voices tuned for contact center and IVR deployments

Sub-250ms time-to-first-audio for responsive voice agents
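The time-to-first-audio figure above can be checked against your own deployment. Below is a minimal sketch, assuming the audio arrives as an async iterable of chunks. `timeToFirstChunkMs` and `fakeAudioStream` are hypothetical names introduced for illustration; this page does not document a streaming call, so the stand-in generator must be replaced with whatever streaming API you actually use.

```typescript
// Hypothetical helper: measures the delay between requesting a stream
// and receiving its first chunk. Not part of any documented API.
async function timeToFirstChunkMs<T>(
  start: () => AsyncIterable<T>,
): Promise<number> {
  const t0 = Date.now();
  for await (const _chunk of start()) {
    // Returning here also closes the underlying iterator.
    return Date.now() - t0;
  }
  throw new Error("stream ended before producing any audio");
}

// Stand-in stream for illustration only: waits, then yields one dummy chunk.
async function* fakeAudioStream(delayMs: number): AsyncGenerator<Uint8Array> {
  await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  yield new Uint8Array([0]);
}

// Example: time the stand-in stream and compare with the 250 ms target.
timeToFirstChunkMs(() => fakeAudioStream(50)).then((ms) => {
  console.log(`time to first chunk: ${ms} ms, under 250 ms: ${ms < 250}`);
});
```

A measurement taken this way includes network round-trip time, so it will generally read higher than a server-side figure.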

Use Cases

Audiobook Narration

Generate natural-sounding narration for long-form content with consistent voice quality.

Notification Systems

Deliver voice alerts and notifications with expressive, human-like speech synthesis.

Multilingual Content

Produce audio content in multiple languages from a single text source.

Real-Time Voice Chat

Power low-latency voice responses in interactive applications and games.

Code Example

// Aura-2 — Text-to-Speech
import { synthesize } from "@arkitekton/voice";

const audio = await synthesize({
  model: "vm-dg-002",
  vendor: "deepgram",
  input: "Hello, welcome to Arkitekton.",
  voice: "alloy",
  response_format: "mp3",
  speed: 1.0,
});

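// Play the audio ("audio/mpeg" is the registered MIME type for MP3)
const blob = new Blob([audio], { type: "audio/mpeg" });
const url = URL.createObjectURL(blob);
const player = new Audio(url);
// Release the object URL once playback finishes
player.addEventListener("ended", () => URL.revokeObjectURL(url));
await player.play();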

Related Models

PersonaPlex 7B (NVIDIA)

NeMo TTS (NVIDIA)

Riva (NVIDIA)

ACE (Avatar Cloud Engine) (NVIDIA)

OpenAI TTS (OpenAI)

WaveNet (Google)

Quick Stats

Latency: <250ms TTFA

Languages: 5 supported

License: Proprietary

Pricing: $0.015 / 1K characters

Status: Generally Available
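As a rough guide to what the listed price means in practice, the sketch below estimates cost from input length. It assumes billing is strictly linear in characters, with no minimum charge or rounding, which this page does not confirm. `estimateCostUsd` is a hypothetical helper, not part of any documented API.

```typescript
// Rate from the Quick Stats: $0.015 per 1,000 characters.
const USD_PER_1K_CHARS = 0.015;

// Hypothetical helper. Assumes linear per-character billing with no
// minimum charge. Note that String.length counts UTF-16 code units,
// which can differ from how a vendor counts characters.
function estimateCostUsd(
  text: string,
  ratePer1k: number = USD_PER_1K_CHARS,
): number {
  return (text.length / 1000) * ratePer1k;
}

// Example: a 100,000-character manuscript.
const manuscript = "a".repeat(100_000);
console.log(`estimated cost: $${estimateCostUsd(manuscript).toFixed(2)}`);
```

Under these assumptions, 100,000 characters comes to about $1.50.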

Vendor

Deepgram

End-to-end deep learning ASR and TTS built for developers

