Active

claude-voice

The missing half of Claude Code's voice mode. You talk to Claude with /voice. Now Claude talks back — with real-time word-by-word highlighting in your terminal. Fully local. Zero API keys. One Python file.

Language: Python
TTS model params: 82M
Time to first audio: ~1s
Files: 1

The problem

Claude Code shipped voice mode in March 2026. You hold spacebar, speak, it transcribes. But the loop is only half complete — Claude's responses are still silent text. You talk to it. It doesn't talk back.

Every existing solution I found either required an API key (ElevenLabs, OpenAI TTS), ran a full MCP server (VoiceMode), or had no visual feedback at all. None of them had word-level highlighting. None of them worked as a simple drop-in.

What I tried first

ElevenLabs — free tier is useless for API

Signed up, got an API key, hit a 402 immediately. Free tier can't use library voices via the API. You need a paid plan to do anything beyond the web playground. Not viable for a tool I want anyone to install for free.

VoiceMode (893 stars) — too much

VoiceMode is a full MCP server with 100+ source files, a DJ mode, sound fonts, team connect, credential stores, systemd services, and a FastAPI Kokoro wrapper. It does two-way voice but requires significant setup and runs as an always-on service. I wanted something that's one file, zero config, and just works as a hook.

OpenAI TTS — cloud dependency

Works, sounds good, costs money per character. Also means every Claude response gets sent to OpenAI's servers. Defeats the purpose of running local infrastructure.

What claude-voice does instead

One Python file. Installs as a Claude Code Stop hook — fires automatically after every response. Uses Kokoro TTS (82M parameters, runs on CPU) to generate speech locally. Plays audio while highlighting the current word in real-time with a karaoke-style sliding window.
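
The karaoke-style sliding window can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's actual code: the function name `render_window`, the `radius` parameter, and the specific ANSI escape codes are hypothetical, chosen only to match the described behaviour (spoken words dim, current word bright cyan with underline, upcoming words muted).

```python
# Hypothetical ANSI styles; the real tool's exact codes are not documented here.
DIM = "\033[2m"            # words already spoken
CURRENT = "\033[1;4;96m"   # bold + underline + bright cyan for the current word
MUTED = "\033[90m"         # grey for upcoming words
RESET = "\033[0m"


def render_window(words, current, radius=4):
    """Return one ANSI-styled line showing words around words[current]."""
    start = max(0, current - radius)
    end = min(len(words), current + radius + 1)
    parts = []
    for i in range(start, end):
        if i < current:
            style = DIM
        elif i == current:
            style = CURRENT
        else:
            style = MUTED
        parts.append(f"{style}{words[i]}{RESET}")
    return " ".join(parts)
```

In a playback loop, a line like this would typically be redrawn in place (for example by printing `"\r"` plus the rendered text) each time the current word index advances with the audio.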

Features

Karaoke highlighting
Real-time word-by-word highlighting with a 3-color gradient: spoken words dim, current word bright cyan with underline, upcoming words muted. Sliding window keeps focus.
Fully local
Kokoro 82M runs on CPU. No API keys, no cloud calls, no internet required. Audio never leaves your machine.
12 voices
American and British, male and female. af_heart (warm), am_fenrir (deep), bm_george (polished), and 9 more. Switch with --voice or config file.
Smart filtering
Skips code-heavy responses, strips markdown tables, URLs, and formatting. Only speaks actual conversational text. Dev terms like CLI, API, JSON are pronounced correctly.
Interrupt on keypress
Press any key while Claude is speaking and audio stops immediately. Terminal cleans up. No waiting.
One-command setup
claude-voice setup adds the Stop hook to your Claude Code settings automatically. No manual JSON editing.
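
As a rough idea of what the smart filtering step involves, here is a small sketch. It is an assumption-laden illustration rather than the tool's real logic: the regular expressions, the 50% code threshold, the function names, and the `SPOKEN` pronunciation map are all hypothetical.

```python
import re

# Hypothetical patterns; the real filter may differ.
FENCE = re.compile(r"`{3}.*?`{3}", re.DOTALL)                  # fenced code blocks
TABLE_ROW = re.compile(r"^[ \t]*\|.*\|[ \t]*$", re.MULTILINE)  # markdown table rows
LINK = re.compile(r"\[([^\]]+)\]\([^)]*\)")                    # [label](target)
URL = re.compile(r"https?://\S+")                              # bare URLs
MARKUP = re.compile(r"[*`]+|^[ \t]{0,3}#{1,6}[ \t]+", re.MULTILINE)

# Hypothetical spellings that nudge a TTS engine toward the usual pronunciation.
SPOKEN = {"CLI": "C L I", "API": "A P I", "JSON": "jay-son"}


def is_code_heavy(text, threshold=0.5):
    """True when fenced code makes up more than `threshold` of the text."""
    if not text:
        return False
    code_chars = sum(len(m.group(0)) for m in FENCE.finditer(text))
    return code_chars / len(text) > threshold


def clean_for_speech(text):
    """Strip code, tables, URLs, and markdown markup; keep conversational text."""
    text = FENCE.sub(" ", text)
    text = TABLE_ROW.sub(" ", text)
    text = LINK.sub(r"\1", text)   # keep the link label, drop the target
    text = URL.sub(" ", text)
    text = MARKUP.sub("", text)
    for term, spoken in SPOKEN.items():
        text = re.sub(rf"\b{term}\b", spoken, text)
    return " ".join(text.split())  # collapse leftover whitespace
```

A hook built this way would skip the response entirely when `is_code_heavy` returns true, and otherwise pass the output of `clean_for_speech` to the TTS engine.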

Commands

claude-voice setup              # install hook into Claude Code
claude-voice demo               # polished demo for screen recording
claude-voice benchmark          # measure latency, print shareable stats
claude-voice on / off           # toggle without removing the hook
claude-voice --voices           # list all 12 voices
claude-voice --voice am_fenrir "text"   # speak with a specific voice
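
For readers curious what a setup command of this kind has to do, the sketch below registers a Stop hook in a Claude Code settings file. It is a hedged approximation, not the project's implementation: the function name `add_stop_hook` is hypothetical, the command string that the real hook runs is not stated in this page, and the file location (commonly `~/.claude/settings.json`) and the nested `hooks` layout follow Claude Code's documented hook configuration as I understand it.

```python
import json
from pathlib import Path


def add_stop_hook(settings_path, command):
    """Add a Stop hook running `command`, leaving other settings untouched."""
    path = Path(settings_path)
    raw = path.read_text() if path.exists() else ""
    settings = json.loads(raw) if raw.strip() else {}

    # Assumed layout: {"hooks": {"Stop": [{"hooks": [{"type": "command", ...}]}]}}
    stop_groups = settings.setdefault("hooks", {}).setdefault("Stop", [])

    already_installed = any(
        hook.get("command") == command
        for group in stop_groups
        for hook in group.get("hooks", [])
    )
    if not already_installed:
        stop_groups.append({"hooks": [{"type": "command", "command": command}]})
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(settings, indent=2) + "\n")
    return settings
```

Calling it twice with the same command leaves a single entry, so re-running setup stays safe. Checking for an existing entry before writing is the main design point: it avoids the hook firing more than once per response.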

Stack

Python 3.11+, Kokoro TTS, sounddevice, numpy, Claude Code Hooks


Open source on github.com/Null-Phnix/claude-voice. Read the full story in the blog post.