Full-duplex multimodal voice interaction powered by Gemini. Supports real-time bidirectional audio streaming with natural turn-taking, tool use during conversation, and grounding with Google Search across 45+ languages.
Talk naturally with Gemini Live
Start a conversation and speak freely. The AI listens and responds naturally, with no buttons to press between turns.
Multimodal grounding combines voice with vision and search
Native tool calling during live conversation without interruption
45+ language support with automatic language detection
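As a rough illustration of tool calling during a live session, the sketch below shows one way an application might route tool requests to local handlers while audio keeps streaming. The `ToolCall`, `ToolResult`, and `ToolRegistry` names are hypothetical and are not part of the Gemini API or the `@arkitekton/voice` package; how a real session delivers tool requests and accepts results is an assumption here.

```typescript
// Hypothetical shapes for illustration only; a real SDK will define its own.
type ToolCall = { id: string; name: string; args: Record<string, unknown> };
type ToolResult = { id: string; ok: boolean; output: unknown };
type ToolHandler = (args: Record<string, unknown>) => Promise<unknown> | unknown;

class ToolRegistry {
  private handlers = new Map<string, ToolHandler>();

  // Associate a tool name with the function that implements it.
  register(name: string, handler: ToolHandler): void {
    this.handlers.set(name, handler);
  }

  // Run the matching handler and always resolve with a result object.
  // Errors are captured rather than thrown, so a failing tool does not
  // reject the promise the caller is awaiting alongside the audio stream.
  async dispatch(call: ToolCall): Promise<ToolResult> {
    const handler = this.handlers.get(call.name);
    if (!handler) {
      return { id: call.id, ok: false, output: `unknown tool: ${call.name}` };
    }
    try {
      return { id: call.id, ok: true, output: await handler(call.args) };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return { id: call.id, ok: false, output: message };
    }
  }
}
```

In this sketch an application would register handlers once at startup, call `dispatch` whenever the session reports a tool request, and send the resolved `ToolResult` back to the model. Returning errors as data instead of throwing is a design choice that lets the agent explain the failure to the caller in speech.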
Deploy AI voice agents that handle customer inquiries with natural conversation flow and real-time responses.
Build always-on voice assistants for enterprise applications with full-duplex capabilities.
Enable voice-first healthcare consultations with HIPAA-compliant conversational AI.
Replace traditional IVR menus with natural language voice agents that understand intent.
// Gemini Live — Conversational Voice Session
import { VoiceSession } from "@arkitekton/voice";

const session = await VoiceSession.create({
  model: "vm-ggl-001",
  vendor: "google",
  config: {
    fullDuplex: true,
    language: "en-US",
    turnDetection: "server_vad",
  },
});

session.on("speech_started", () => {
  console.log("Agent is speaking...");
});

session.on("transcript", (text) => {
  console.log("User said:", text);
});

// Connect to audio stream
const mic = await navigator.mediaDevices.getUserMedia({ audio: true });
session.connect(mic);

Cloud-scale speech services with multilingual reach