Gradium TTS
About this model
Gradium TTS is a cloud-hosted text-to-speech service from Gradium, exposed through a low-latency streaming WebSocket API designed for real-time voice agents. It generates speech incrementally, returning base64-encoded PCM audio chunks at a 24 kHz sample rate, which suits interruptible, conversational pipelines rather than batch file rendering. Authentication uses a Gradium API key. Callers select a voice by its identifier and can optionally pass a model name and JSON configuration settings.
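As a rough illustration of what consuming that output could involve, the sketch below decodes base64-encoded PCM chunks and wraps them in a WAV container using only the Python standard library. It does not call the Gradium API. The 24 kHz sample rate comes from the description above, but the 16-bit sample width and mono channel count are assumptions, and the helper names (decode_chunks, pcm_to_wav_bytes, duration_seconds) are hypothetical. The chunks here are locally generated silence standing in for payloads that would arrive over the WebSocket.

```python
import base64
import io
import wave

SAMPLE_RATE_HZ = 24_000   # stated in the service description
SAMPLE_WIDTH_BYTES = 2    # ASSUMPTION: 16-bit samples (not confirmed by the source)
CHANNELS = 1              # ASSUMPTION: mono audio (not confirmed by the source)


def decode_chunks(b64_chunks):
    """Decode base64 audio chunks and join them into one raw PCM byte string."""
    return b"".join(base64.b64decode(chunk) for chunk in b64_chunks)


def pcm_to_wav_bytes(pcm):
    """Wrap raw PCM bytes in a WAV container held in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav_file:
        wav_file.setnchannels(CHANNELS)
        wav_file.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav_file.setframerate(SAMPLE_RATE_HZ)
        wav_file.writeframes(pcm)
    return buf.getvalue()


def duration_seconds(pcm):
    """Playback length implied by the byte count and the assumed format."""
    return len(pcm) / (SAMPLE_RATE_HZ * SAMPLE_WIDTH_BYTES * CHANNELS)


# Stand-in for streamed payloads: two chunks of 0.5 s of silence each.
half_second_of_silence = b"\x00\x00" * 12_000
chunks = [base64.b64encode(half_second_of_silence).decode("ascii")] * 2

pcm = decode_chunks(chunks)
wav_bytes = pcm_to_wav_bytes(pcm)
print(f"{duration_seconds(pcm):.2f} s of audio, {len(wav_bytes)} WAV bytes")
```

In a live agent, each chunk would more likely be decoded and handed to the audio output as it arrives rather than collected first, since incremental playback is what makes interruption possible.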
The service is integrated into common agent stacks: plugin support exists for both Pipecat and LiveKit, where Gradium can act as the TTS provider inside an agent session or as a standalone speech generator. Pipecat exposes runtime-configurable settings that can be updated mid-conversation, reflecting the model's focus on live interaction.
This page describes the first cataloged release in the Gradium TTS family, so there is no prior same-family version available here for a direct generational comparison.
Separately, Gradium positions its cloud API for use cases that need broad language and speaker coverage, while its on-device model, Phonon, targets offline, privacy-sensitive, and high-volume consumer deployments where cloud synthesis is not the right architecture.
This About section is AI-generated from public sources (Claude Opus 4.8), with no human editing. It may contain inaccuracies, so verify critical details against the data sources listed below.
Data sources: Venice API · HuggingFace