ACE (Avatar Cloud Engine)

NVIDIA · Conversational · Text-to-Speech · Generally Available · NVIDIA EULA · vm-nv-006

About

Multimodal avatar platform combining real-time voice AI with animated 3D and 2D digital humans. Integrates ASR, TTS, NLU, and facial animation into unified interactive agents for kiosks, games, and virtual assistants.

Capabilities (5)

3D avatar animation

Voice + visual sync

Real-time lip sync

Emotion expression

Multi-turn dialogue

Key Highlights

End-to-end embodied AI combining voice, NLU, and visual animation

Real-time lip sync and emotion-driven facial expressions

Deployed in customer service kiosks and gaming NPCs at scale

Use Cases

Customer Support Agents

Deploy AI voice agents that handle customer inquiries with natural conversation flow and real-time responses.

Virtual Assistants

Build always-on voice assistants for enterprise applications with full-duplex capabilities.

Telehealth & Consultation

Enable voice-first healthcare consultations with HIPAA-compliant conversational AI.

Interactive Voice Response

Replace traditional IVR menus with natural language voice agents that understand intent.

Code Example

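// ACE (Avatar Cloud Engine): Conversational Voice Session
// Uses top-level await, so run this as an ES module in a browser context.
import { VoiceSession } from "@arkitekton/voice";

// Open a full-duplex session against the ACE model entry (vm-nv-006).
const session = await VoiceSession.create({
  model: "vm-nv-006",
  vendor: "nvidia",
  config: {
    fullDuplex: true,
    language: "en-US",
    turnDetection: "server_vad",
  },
});

// Fired when the agent starts speaking.
session.on("speech_started", () => {
  console.log("Agent is speaking...");
});

// Fired with the recognized text of the user's speech.
session.on("transcript", (text) => {
  console.log("User said:", text);
});

// Request microphone access (resolves to a MediaStream) and attach it to the session.
const mic = await navigator.mediaDevices.getUserMedia({ audio: true });
session.connect(mic);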
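
The `turnDetection: "server_vad"` setting implies that voice activity detection decides where a user's turn starts and ends. This page does not describe how ACE does that, so the following is only a generic energy-threshold sketch of the idea and not the ACE or `@arkitekton/voice` implementation. The names `Turn`, `rms`, and `detectTurns`, and the `threshold` and `hangoverFrames` parameters, are all hypothetical and chosen for illustration.

```typescript
// Hypothetical illustration of energy-based turn detection.
// This is NOT how ACE or @arkitekton/voice is known to work.

interface Turn {
  startFrame: number; // index of the first frame judged as speech
  endFrame: number;   // index of the last frame judged as speech (inclusive)
}

// Root-mean-square energy of one audio frame (samples assumed in [-1, 1]).
function rms(frame: number[]): number {
  if (frame.length === 0) return 0;
  let sumSquares = 0;
  for (const sample of frame) sumSquares += sample * sample;
  return Math.sqrt(sumSquares / frame.length);
}

// Group loud frames into turns. A turn is closed once `hangoverFrames`
// consecutive quiet frames follow it, so brief pauses do not split a turn.
function detectTurns(
  frames: number[][],
  threshold = 0.05,
  hangoverFrames = 2,
): Turn[] {
  const turns: Turn[] = [];
  let start = -1;    // -1 means "not currently inside a turn"
  let lastLoud = -1; // most recent frame at or above the threshold
  let quietRun = 0;  // consecutive quiet frames seen inside the current turn

  frames.forEach((frame, i) => {
    if (rms(frame) >= threshold) {
      if (start === -1) start = i;
      lastLoud = i;
      quietRun = 0;
    } else if (start !== -1) {
      quietRun += 1;
      if (quietRun >= hangoverFrames) {
        turns.push({ startFrame: start, endFrame: lastLoud });
        start = -1;
        quietRun = 0;
      }
    }
  });

  // Close a turn that is still open when the audio ends.
  if (start !== -1) turns.push({ startFrame: start, endFrame: lastLoud });
  return turns;
}
```

With the default `hangoverFrames` of 2, a single quiet frame between two loud frames stays inside the same turn, while two or more quiet frames in a row end it. A production detector would typically use a trained model rather than a fixed energy threshold.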

Related Models

PersonaPlex 7B · NVIDIA

NeMo ASR · NVIDIA

NeMo TTS · NVIDIA

Riva · NVIDIA

Parakeet · NVIDIA

gpt-4o-realtime · OpenAI

Quick Stats

Latency: <250 ms end-to-end

Languages: 10 supported

License: NVIDIA EULA

Pricing: Enterprise license

Status: Generally Available

Vendor

NVIDIA

GPU-accelerated speech AI and conversational frameworks
