What is Sesame CSM?
Conversational Speech Model (CSM) is a speech generation model from Sesame that generates RVQ audio codes from text and audio inputs. The model architecture employs a Llama backbone and a smaller audio decoder that produces Mimi audio codes.
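For readers new to the term, RVQ (residual vector quantization) encodes a vector as a short list of integer codes, where each stage quantizes whatever error the previous stage left behind. The sketch below is a toy illustration only: the two tiny codebooks and the names `rvq_encode` and `rvq_decode` are made up for this example and are not Mimi's actual codebooks, sizes, or API.

```python
from typing import List, Sequence

Vector = Sequence[float]
Codebook = Sequence[Vector]

# Toy codebooks (hypothetical values): a coarse stage and a fine stage.
CODEBOOKS: List[Codebook] = [
    [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]],
    [[0.0, 0.0], [0.25, -0.25], [-0.25, 0.25]],
]


def nearest(codebook: Codebook, v: Vector) -> int:
    """Return the index of the codebook entry closest to v (squared distance)."""
    best_index, best_dist = 0, float("inf")
    for i, entry in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(v, entry))
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index


def rvq_encode(v: Vector, codebooks: Sequence[Codebook]) -> List[int]:
    """Quantize v stage by stage; each stage encodes the remaining residual."""
    residual = list(v)
    codes = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        codes.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return codes


def rvq_decode(codes: Sequence[int], codebooks: Sequence[Codebook]) -> List[float]:
    """Reconstruct an approximation by summing the selected entry of each stage."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(codes, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


codes = rvq_encode([1.2, 0.8], CODEBOOKS)
print(codes, rvq_decode(codes, CODEBOOKS))
```

In a real audio codec the vectors are learned frame embeddings and the codebooks are far larger, but the idea of stacking coarse-to-fine codes is the same.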
I just released the Sesame CSM Gradio UI, a 100% local, free text-to-speech tool with superior voice cloning! No cloud processing, no API keys – just high-quality AI-generated speech on your own machine. It works on CUDA, Apple MLX, and CPU, so anyone can try it.
Listen to a sample conversation generated by CSM.
🔥 Features:
- ✅ Runs 100% Locally – No internet connection required!
- ✅ Free & Open Source – No subscriptions, no paywalls.
- ✅ Superior Voice Cloning – Built directly into the UI.
- ✅ Gradio UI – Simple, interactive, and user-friendly.
- ✅ Supports CUDA, Apple MLX, and CPU – Works on NVIDIA GPUs, Apple Silicon, and regular CPUs.
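If you are not sure which of these backends your machine can use, a quick check like the one below may help. It is a minimal sketch, not the project's own detection logic: the function name `pick_backend` and the returned labels are hypothetical, and it only looks at whether PyTorch reports a CUDA device or whether the `mlx` package is importable on Apple Silicon.

```python
import importlib.util
import platform


def pick_backend() -> str:
    """Guess a backend label: 'cuda', 'mlx', or 'cpu' (labels are illustrative)."""
    # NVIDIA GPU: only checked when PyTorch is installed.
    if importlib.util.find_spec("torch") is not None:
        try:
            import torch

            if torch.cuda.is_available():
                return "cuda"
        except ImportError:
            pass  # broken or partial install; fall through to the other checks

    # Apple Silicon: assume MLX is usable when the package is importable.
    on_apple_silicon = platform.system() == "Darwin" and platform.machine() == "arm64"
    if on_apple_silicon and importlib.util.find_spec("mlx") is not None:
        return "mlx"

    # Everything else falls back to the CPU.
    return "cpu"


print("Detected backend:", pick_backend())
```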
Below is a video showing how to use the voice cloning feature.
Note: The video has no audio; it only shows how to use the UI.
Getting Started
1. Clone the Repository
git clone https://github.com/akashjss/sesame-csm.git
cd sesame-csm
2. Install Dependencies (use a virtual environment to isolate them, as shown below)
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
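To confirm that the packages went into the virtual environment rather than the system Python, you can run a short check with the same `python` you just used. This is a small sketch; the helper name `in_virtualenv` is made up for illustration.

```python
import sys


def in_virtualenv() -> bool:
    """True when the running interpreter belongs to a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("Virtual environment active:", in_virtualenv())
print("Interpreter:", sys.executable)
```

If the first line prints `False`, activate the environment again and rerun the install step.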
3. Run Sesame CSM
python run_csm_gradio.py
Once the server is running, open the Gradio UI in your browser to start generating speech!
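The model can take a while to load, so the page may not respond right away. The sketch below polls a local URL until it answers and then opens it in your default browser. It assumes Gradio's default address, `http://127.0.0.1:7860`; the script may be configured differently, so use the URL printed in your terminal if it differs. The function names are hypothetical helpers, not part of the project.

```python
import time
import urllib.error
import urllib.request
import webbrowser

# Assumption: Gradio's default local address. Check your terminal output.
ASSUMED_URL = "http://127.0.0.1:7860"


def wait_for_server(url: str, timeout_s: float = 60.0, poll_s: float = 1.0) -> bool:
    """Poll url until it responds or timeout_s elapses; return True on success."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2):
                return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status code.
            return True
        except (urllib.error.URLError, OSError):
            # Not listening yet; wait and try again.
            time.sleep(poll_s)
    return False


def open_ui_when_ready(url: str = ASSUMED_URL) -> bool:
    """Open the UI in the default browser once the server responds."""
    if wait_for_server(url):
        webbrowser.open(url)
        return True
    print(f"No response from {url}; check the terminal for the actual URL.")
    return False
```

Call `open_ui_when_ready()` from a second terminal after starting `run_csm_gradio.py`.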
🎙️ How to Use Voice Cloning
One of the most exciting features of Sesame CSM is its built-in voice cloning. You can record your own voice and use it to generate AI speech.
Steps to Clone Your Voice:
- Click the microphone icon in the UI.
- Press the record button and read the Speaker Prompt.
- Stop recording when finished.
- Click ‘Generate Conversation’ to create AI-generated speech using your recorded voice.
Here’s a visual guide to help you out:

💡 Why Use Sesame CSM?
If you’re looking for a fast, free, and high-quality text-to-speech tool with voice cloning, Sesame CSM is the perfect choice. Whether you’re a developer, content creator, or just experimenting with AI-generated speech, this tool gives you full control without any restrictions.
🔗 Try it Now!
I’d love to hear your thoughts! Try it out and feel free to share your feedback, report issues, or contribute to the project on GitHub: https://github.com/akashjss/sesame-csm
Akash Gupta
Senior VoIP Engineer and AI Enthusiast
