GLM-4-Voice
  • is an end-to-end spoken large language model
  • developed by Zhipu AI (Z.ai)
  • unlike traditional “pipeline” systems that chain separate STT (speech-to-text), LLM, and TTS (text-to-speech) models, it understands and generates speech directly
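
The contrast between the two designs can be sketched in a few lines of Python. This is only a structural illustration under stated assumptions: CascadedPipeline, EndToEndSpeechModel, and the Audio alias are hypothetical names invented here, not part of any GLM-4-Voice API, and the "models" are toy lambdas standing in for real networks.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stand-in: raw bytes play the role of an audio clip.
Audio = bytes


@dataclass
class CascadedPipeline:
    """Traditional design: three separate models chained through text."""
    stt: Callable[[Audio], str]   # speech-to-text
    llm: Callable[[str], str]     # text-only language model
    tts: Callable[[str], Audio]   # text-to-speech

    def respond(self, audio_in: Audio) -> Audio:
        transcript = self.stt(audio_in)    # only the text survives this step
        reply_text = self.llm(transcript)
        return self.tts(reply_text)


@dataclass
class EndToEndSpeechModel:
    """End-to-end design: one model maps input speech to output speech."""
    model: Callable[[Audio], Audio]

    def respond(self, audio_in: Audio) -> Audio:
        return self.model(audio_in)


# Toy stand-ins so the sketch runs; real systems would call trained models.
pipeline = CascadedPipeline(
    stt=lambda audio: audio.decode("utf-8"),
    llm=lambda text: text.upper(),
    tts=lambda text: text.encode("utf-8"),
)
e2e = EndToEndSpeechModel(model=lambda audio: audio[::-1])

pipeline_reply = pipeline.respond(b"hello")
e2e_reply = e2e.respond(b"abc")
```

Both classes expose the same respond method, so a caller sees no difference. What changes is where information can be lost: the cascaded version passes everything through a text transcript, while the end-to-end version has no such hand-off.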