WHISPER VOICE
SOVEREIGN SPEECH RECOGNITION
Your Voice. Your Machine. Your Data.
A high-performance, locally-run dictation agent for the liberated desktop.
✊ The Manifesto
We hold these truths to be self-evident: That user data is an extension of the self, and its exploitation by centralized clouds is a violation of digital autonomy.
Whisper Voice is built on the principle of technological sovereignty. It provides state-of-the-art speech recognition without renting your cognitive output to corporate oligarchies. By running entirely on your own hardware, it reclaims the means of digital production, ensuring that your words remain exclusively yours.
⚡ Technical Core
Under the hood, Whisper Voice harnesses Faster-Whisper, a reimplementation of OpenAI's Whisper model built on CTranslate2, a fast inference engine for Transformer models.
- Zero Network Latency: With no network round-trips, transcription runs as fast as your hardware can think.
- Privacy by Physics: Data physically cannot leave your machine because the engine has no cloud uplink. The cable is cut.
- Precision Engineering: Uses 8-bit quantization (`int8`) to run professional-grade models on consumer hardware with a minimal memory footprint.
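As a rough illustration of the `int8` point, here is a minimal sketch of loading a quantized model through the Faster-Whisper Python package. It assumes `faster-whisper` is installed. The helper names `build_model_options` and `transcribe_file` are hypothetical and are not taken from Whisper Voice itself, which may configure the engine differently.

```python
def build_model_options(device: str = "cpu") -> dict:
    """Return keyword arguments for faster_whisper.WhisperModel.

    Assumption: int8 is used on every device, as the bullet above suggests.
    """
    if device not in ("cpu", "cuda"):
        raise ValueError(f"unsupported device: {device!r}")
    return {"device": device, "compute_type": "int8"}


def transcribe_file(path: str, model_size: str = "small", device: str = "cpu") -> str:
    """Transcribe one audio file on the local machine and return the text."""
    # Imported lazily so this module still loads where the package is absent.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, **build_model_options(device))
    # transcribe() returns a lazy generator of segments plus metadata.
    segments, _info = model.transcribe(path, beam_size=5)
    return " ".join(segment.text.strip() for segment in segments)
```

Calling `transcribe_file("sample.wav")` (a placeholder file name) would run inference on-device. Note that Faster-Whisper fetches model weights on first use unless they are already cached locally.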
📊 Model Performance
Choose the engine that matches your hardware capabilities.
| Model | GPU VRAM (rec.) | CPU RAM (rec.) | Relative Speed | Capability |
|---|---|---|---|---|
| Tiny | ~500 MB | ~1 GB | Supersonic | Quick commands, simple dictation. |
| Base | ~600 MB | ~1 GB | Very Fast | Good balance for older hardware. |
| Small | ~1 GB | ~2 GB | Fast | Standard driver. High accuracy for English. |
| Medium | ~2 GB | ~4 GB | Moderate | High precision. Great for accents. |
| Large-v3 Turbo | ~4 GB | ~6 GB | Fast to Moderate | Best balance. Near Large-v3 accuracy at much higher speeds. |
| Large-v3 | ~5 GB | ~8 GB | Heavy | Professional grade. Near-perfect understanding. |
Note: CPU inference is significantly slower than GPU but fully supported via highly optimized vector instructions (AVX2).
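To make the table actionable, the sketch below picks the largest model whose recommended memory fits a given budget. The figures are copied from the table, treating 1 GB as roughly 1000 MB. The function name `pick_model` and the "largest that fits" policy are assumptions for illustration, not how Whisper Voice necessarily chooses.

```python
from typing import Optional

# (table label, recommended GPU VRAM in MB, recommended CPU RAM in MB),
# ordered from most to least capable.
MODEL_REQUIREMENTS = [
    ("Large-v3", 5000, 8000),
    ("Large-v3 Turbo", 4000, 6000),
    ("Medium", 2000, 4000),
    ("Small", 1000, 2000),
    ("Base", 600, 1000),
    ("Tiny", 500, 1000),
]


def pick_model(available_mb: int, device: str = "cpu") -> Optional[str]:
    """Return the most capable model that fits in available_mb, or None."""
    if device not in ("cpu", "cuda"):
        raise ValueError(f"unsupported device: {device!r}")
    for name, gpu_mb, cpu_mb in MODEL_REQUIREMENTS:
        needed = gpu_mb if device == "cuda" else cpu_mb
        if available_mb >= needed:
            return name
    return None  # not even Tiny fits the budget
```

For example, `pick_model(4500, "cuda")` skips Large-v3 (about 5 GB of VRAM) and settles on Large-v3 Turbo.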
🛠️ Usage Guide
Installation
- Acquire: Download the latest portable executable from the Releases page.
- Deploy: Place `WhisperVoice.exe` in a directory of your choosing.
- Initialize: Run the executable. It will autonomously hydrate its runtime environment (approx. 2 GB) on the first launch.
Operation
- Configure: Right-click the System Tray Icon to open Settings. Select your Model Size and Compute Device.
- Engage: Press `F9` (or your custom hotkey) to open the channel.
- Dictate: Speak clearly. The noise gate will isolate your voice.
- Execute: Release the key. The machine interprets the signal and injects the text into your active window immediately.
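The press, speak, release flow above can be sketched as a small state machine. This is an illustrative model only: the class `PushToTalkSession`, the RMS-threshold noise gate, and the `transcribe` and `inject` callbacks are all assumptions, since the README does not describe how Whisper Voice implements these steps internally.

```python
import math
from typing import Callable, List, Sequence

Frame = Sequence[float]  # one short chunk of audio samples in [-1.0, 1.0]


def rms(frame: Frame) -> float:
    """Root-mean-square level of a frame (0.0 for an empty frame)."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


class PushToTalkSession:
    def __init__(
        self,
        transcribe: Callable[[List[Frame]], str],
        inject: Callable[[str], None],
        gate_threshold: float = 0.02,  # hypothetical default
    ) -> None:
        self._transcribe = transcribe
        self._inject = inject
        self._gate_threshold = gate_threshold
        self._frames: List[Frame] = []
        self._recording = False

    def key_down(self) -> None:
        """Engage: start a fresh recording."""
        self._frames = []
        self._recording = True

    def on_audio(self, frame: Frame) -> None:
        """Dictate: keep only frames loud enough to pass the gate."""
        if self._recording and rms(frame) >= self._gate_threshold:
            self._frames.append(frame)

    def key_up(self) -> str:
        """Execute: transcribe what was kept and inject the text."""
        self._recording = False
        text = self._transcribe(self._frames) if self._frames else ""
        if text:
            self._inject(text)
        return text


# Usage with stand-in callbacks instead of a real model and text injector.
typed: List[str] = []
session = PushToTalkSession(
    transcribe=lambda frames: f"{len(frames)} frame(s) kept",
    inject=typed.append,
)
session.key_down()
session.on_audio([0.0, 0.001, -0.001])  # near silence, dropped by the gate
session.on_audio([0.2, -0.3, 0.25])     # speech-level, kept
session.key_up()
# typed is now ["1 frame(s) kept"]
```

Passing the model and the text injector in as callbacks keeps the hotkey logic testable without a microphone or GPU.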
⚖️ License & Rights
Public Domain (CC0 1.0)
To the extent possible under law, the creators of this interface have waived all copyright and related or neighboring rights to this work. This tool belongs to the commons.
- Fork it.
- Mod it.
- Sell it.
- Liberate it.
Acknowledgments
While this interface is CC0, it stands on the shoulders of giants:
- OpenAI Whisper Models: Released under the MIT License.
- Faster-Whisper & CTranslate2: Released under the MIT License.
No gods, no cloud managers.