MMS (Massively Multilingual Speech)

Meta | Speech-to-Text | Text-to-Speech | Multilingual | Generally Available | CC BY-NC 4.0 | vm-meta-003

About

A speech model covering more than 1,100 languages for automatic speech recognition (ASR) and more than 1,400 languages for language identification. It is trained on recordings of religious texts plus unlabeled audio, which substantially extends coverage to low-resource languages.

Capabilities (5)

1,100+ language ASR

1,400+ language ID

Low-resource languages

CTC-based architecture

Wav2Vec 2.0 backbone
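
To make "CTC-based architecture" concrete: a CTC model emits one score vector per audio frame, and the simplest way to turn those into text is greedy decoding (take the best token per frame, merge consecutive repeats, drop the blank token). The following is a minimal sketch of that generic technique, not MMS's actual decoder; the three-symbol vocabulary, the blank index, and the function names are illustrative assumptions.

```typescript
// Greedy CTC decoding sketch. The vocabulary and blank index are
// illustrative assumptions, not the real MMS token set.
const BLANK_INDEX = 0;

// Index of the largest value in a row of per-token scores.
function argmax(row: number[]): number {
  let best = 0;
  for (let i = 1; i < row.length; i++) {
    if (row[i] > row[best]) best = i;
  }
  return best;
}

// Pick the best token per frame, merge consecutive repeats, drop blanks.
function ctcGreedyDecode(
  frames: number[][],
  vocab: string[],
  blank: number = BLANK_INDEX,
): string {
  const out: string[] = [];
  let prev = -1;
  for (const frame of frames) {
    const id = argmax(frame);
    if (id !== prev && id !== blank) out.push(vocab[id]);
    prev = id;
  }
  return out.join("");
}

// Per-frame best tokens are: a a _ a b b _
const exampleVocab = ["_", "a", "b"];
const exampleFrames = [
  [0.1, 0.8, 0.1],
  [0.1, 0.7, 0.2],
  [0.9, 0.05, 0.05],
  [0.2, 0.6, 0.2],
  [0.1, 0.2, 0.7],
  [0.1, 0.1, 0.8],
  [0.8, 0.1, 0.1],
];
console.log(ctcGreedyDecode(exampleFrames, exampleVocab)); // prints "aab"
```

The blank between the second and third "a" frames is what lets CTC emit a genuine double letter; without it the repeats would be merged into one.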

Key Highlights

Covers about 10x more languages than earlier speech systems, which typically supported around 100

Enables ASR for endangered and low-resource languages worldwide

Language identification across 1,400+ languages with high accuracy
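
As an illustration of how a language identification result is typically consumed, the sketch below converts raw per-language scores (logits) into probabilities with a softmax and returns the top candidates. It assumes the model's output is available as a plain array of numbers aligned with a list of language codes; the function name, the interface, and the sample codes are hypothetical and not part of any published MMS or Arkitekton API.

```typescript
// Hypothetical shape for one ranked prediction.
interface LanguageGuess {
  language: string;
  probability: number;
}

// Softmax over logits, then return the topK most likely languages.
function rankLanguages(
  logits: number[],
  labels: string[],
  topK: number = 3,
): LanguageGuess[] {
  if (logits.length !== labels.length) {
    throw new Error("logits and labels must have the same length");
  }
  // Subtract the maximum before exponentiating for numerical stability.
  const max = Math.max(...logits);
  const exps = logits.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps
    .map((e, i) => ({ language: labels[i], probability: e / sum }))
    .sort((a, b) => b.probability - a.probability)
    .slice(0, topK);
}

// Example with three made-up scores.
const guesses = rankLanguages([2.0, 0.5, 1.0], ["eng", "fra", "swh"], 2);
console.log(guesses.map((g) => g.language)); // prints [ 'eng', 'swh' ]
```

Returning a short ranked list rather than a single label lets the caller fall back to the second guess when the top probability is low.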

Use Cases

Audiobook Narration

Generate natural-sounding narration for long-form content with consistent voice quality.

Notification Systems

Deliver voice alerts and notifications with expressive, human-like speech synthesis.

Multilingual Content

Produce audio content in multiple languages from a single text source.

Real-Time Voice Chat

Power low-latency voice responses in interactive applications and games.

Code Example

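// MMS (Massively Multilingual Speech): Text-to-Speech
import { synthesize } from "@arkitekton/voice";

// Request synthesized speech from the MMS model.
const audio = await synthesize({
  model: "vm-meta-003",
  vendor: "meta",
  input: "Hello, welcome to Arkitekton.",
  voice: "alloy",
  response_format: "mp3",
  speed: 1.0,
});

// Play the audio in the browser ("audio/mpeg" is the registered MIME type for MP3).
const blob = new Blob([audio], { type: "audio/mpeg" });
const url = URL.createObjectURL(blob);
const player = new Audio(url);
// Release the object URL once playback finishes.
player.addEventListener("ended", () => URL.revokeObjectURL(url));
await player.play();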
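
When wrapping the returned bytes in a Blob, the type should be a real MIME type; for MP3 that is "audio/mpeg". A small lookup helper keeps the Blob type in step with the requested response_format. This is only a sketch: "mp3" is the one format the example above uses, the other entries are assumptions about what the API might accept, and the helper name is hypothetical.

```typescript
// Map a response_format value to a MIME type for the Blob.
// Only "mp3" appears in the example; "wav" and "ogg" are assumed formats.
const MIME_BY_FORMAT: Record<string, string> = {
  mp3: "audio/mpeg",
  wav: "audio/wav",
  ogg: "audio/ogg",
};

function mimeTypeFor(format: string): string {
  const mime = MIME_BY_FORMAT[format.toLowerCase()];
  if (!mime) {
    throw new Error(`Unsupported audio format: ${format}`);
  }
  return mime;
}

console.log(mimeTypeFor("mp3")); // prints "audio/mpeg"
```

With this in place the Blob can be built as new Blob([audio], { type: mimeTypeFor("mp3") }), so changing the requested format in one place does not leave a stale MIME type behind.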

Related Models

PersonaPlex 7B

NVIDIA

NeMo ASR

NVIDIA

NeMo TTS

NVIDIA

Riva

NVIDIA

Parakeet

NVIDIA

ACE (Avatar Cloud Engine)

NVIDIA

Quick Stats

Languages: 1,100+

License: CC BY-NC 4.0

Pricing: Open-source / self-hosted

Status: Generally Available

Vendor

Meta
