Industrial-grade open-source TTS

GLM-TTS Use Cases & Resources

Common production scenarios and official download links.

Reads mixed Chinese/English text, formulas, and polyphonic characters correctly, with phoneme-level control.

Multi-role narration with a wide emotional range (crying, laughing, shouting).

Warm, professional speech with prosody that stays stable even when variable fields are inserted.

Clone timbre and prosody from ~3 seconds of prompt audio.

Nuanced emotions (happy/sad/angry) plus natural laughter/breathing.

Hybrid phoneme + text input for polyphones and rare words.
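As a rough illustration of the hybrid phoneme + text idea, the sketch below tags selected words with an explicit pronunciation and leaves the rest of the sentence as plain text. The angle-bracket markup, the function name annotate_polyphones, and the pinyin strings are hypothetical and chosen only for this example; the actual input format expected by GLM-TTS may differ, so check the project documentation before relying on it.

```python
import re


def annotate_polyphones(text: str, overrides: dict[str, str]) -> str:
    """Append a pronunciation hint after every word listed in `overrides`.

    The `word<phonemes>` markup is a hypothetical format used for
    illustration only; it is not taken from the GLM-TTS documentation.
    """
    if not overrides:
        return text
    # Try longer words first so a short key cannot split a longer one.
    words = sorted(overrides, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(w) for w in words))
    # A single pass over the text, so inserted hints are never re-scanned.
    return pattern.sub(lambda m: f"{m.group(0)}<{overrides[m.group(0)]}>", text)


hybrid = annotate_polyphones(
    "银行行长",
    {"银行": "yin2 hang2", "行长": "hang2 zhang3"},
)
```

Words without an override pass through unchanged, so only the polyphones and rare words that need control carry phoneme hints.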

Download checkpoints from zai-org/GLM-TTS.
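One possible way to fetch the checkpoints from a script is sketched below. It assumes that zai-org/GLM-TTS is a Hugging Face Hub repository and that the huggingface_hub package is installed; the helper name download_checkpoints and the ckpt target directory are illustrative choices, not names from the project.

```python
REPO_ID = "zai-org/GLM-TTS"  # repository identifier as given above


def download_checkpoints(local_dir: str = "ckpt") -> str:
    """Download every file in the repository and return the local folder path.

    Assumes REPO_ID is hosted on the Hugging Face Hub. `local_dir` is an
    illustrative default, not a path required by the project.
    """
    # Imported lazily so this module loads even without the dependency.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
```

Calling download_checkpoints() needs network access and returns the directory that holds the downloaded files.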

Recommended mirror for users in China.

Run an interactive web demo locally via tools/gradio_app.py.