WHISPER VOICE

SOVEREIGN SPEECH RECOGNITION

Your Voice. Your Machine. Your Data.
A high-performance, locally-run dictation agent for the liberated desktop.

✊ The Manifesto

We hold these truths to be self-evident: That user data is an extension of the self, and its exploitation by centralized clouds is a violation of digital autonomy.

Whisper Voice is built on the principle of technological sovereignty. It provides state-of-the-art speech recognition without renting your cognitive output to corporate oligarchies. By running entirely on your own hardware, it reclaims the means of digital production, ensuring that your words remain exclusively yours.

⚡ Technical Core

Under the hood, Whisper Voice harnesses Faster-Whisper, a reimplementation of OpenAI's Whisper model built on CTranslate2, a fast inference engine for Transformer models.

Zero Network Latency: With no network round-trips, transcription runs as fast as your hardware allows.
Privacy by Physics: Your audio and text never leave your machine because the engine has no cloud uplink. The cable is cut.
Precision Engineering: 8-bit quantization (int8) runs professional-grade models on consumer hardware with a minimal memory footprint.
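As a rough illustration of the core described above, the sketch below loads a Faster-Whisper model with an 8-bit compute type and transcribes a file locally. It is not Whisper Voice's actual source: the helper names (`compute_type_for`, `join_segments`, `transcribe_file`) and the device-to-compute-type mapping are assumptions made for this example. Only `WhisperModel` and its `transcribe` method come from the faster-whisper package.

```python
from typing import Iterable


def compute_type_for(device: str) -> str:
    """Pick an 8-bit CTranslate2 compute type for a device.

    Assumption for this sketch: "int8" on CPU, "int8_float16" on CUDA GPUs.
    The mapping Whisper Voice really uses is not documented here.
    """
    if device == "cpu":
        return "int8"
    if device == "cuda":
        return "int8_float16"
    raise ValueError(f"unsupported device: {device!r}")


def join_segments(texts: Iterable[str]) -> str:
    """Join per-segment text into one dictation string, skipping blanks."""
    return " ".join(t.strip() for t in texts if t.strip())


def transcribe_file(path: str, model_size: str = "small", device: str = "cpu") -> str:
    """Transcribe an audio file on the local machine with faster-whisper."""
    # Imported lazily so the helpers above work without the package installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(
        model_size,
        device=device,
        compute_type=compute_type_for(device),
    )
    # transcribe() returns a lazy generator of segments plus audio metadata.
    segments, _info = model.transcribe(path, beam_size=5)
    return join_segments(segment.text for segment in segments)
```

Calling `transcribe_file("note.wav")` (a hypothetical file name) would return the recognized text. faster-whisper fetches model weights on first use if they are not already cached; the inference itself runs offline.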

📊 Model Performance

Choose the engine that matches your hardware capabilities.

Model	GPU VRAM (rec.)	CPU RAM (rec.)	Relative Speed	Capability
Tiny	~500 MB	~1 GB	Supersonic	Quick commands, simple dictation.
Base	~600 MB	~1 GB	Very Fast	Good balance for older hardware.
Small	~1 GB	~2 GB	Fast	Daily driver. High accuracy for English.
Medium	~2 GB	~4 GB	Moderate	High precision. Great for accents.
Large-v3 Turbo	~4 GB	~6 GB	Fast to Moderate	Best balance. Near Large-v3 accuracy at much higher speed.
Large-v3	~5 GB	~8 GB	Heavy	Professional grade. Near-perfect understanding.

Note: CPU inference is significantly slower than GPU inference, but it is fully supported through optimized vector instructions (AVX2).
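If you would rather let a script choose, the table above can be turned into a small lookup. This is a minimal sketch, not part of Whisper Voice: `MODELS` and `pick_model` are hypothetical names, the memory figures are copied from the recommendations above, and the lowercase model identifiers are assumed to match the names Faster-Whisper accepts.

```python
from typing import Optional

# (model name, recommended GPU VRAM in GB, recommended CPU RAM in GB),
# ordered from smallest to largest, taken from the table above.
MODELS = [
    ("tiny", 0.5, 1.0),
    ("base", 0.6, 1.0),
    ("small", 1.0, 2.0),
    ("medium", 2.0, 4.0),
    ("large-v3-turbo", 4.0, 6.0),
    ("large-v3", 5.0, 8.0),
]


def pick_model(available_gb: float, device: str = "cpu") -> Optional[str]:
    """Return the largest model whose recommended memory fits the budget.

    `available_gb` is free VRAM for device="cuda" or free RAM for device="cpu".
    Returns None when even the Tiny model does not fit.
    """
    if device not in ("cpu", "cuda"):
        raise ValueError(f"unsupported device: {device!r}")
    column = 1 if device == "cuda" else 2
    best = None
    for row in MODELS:
        if row[column] <= available_gb:
            best = row[0]  # rows are sorted, so later matches are larger models
    return best
```

For example, `pick_model(4.0, "cuda")` selects `large-v3-turbo`, while the same 4 GB budget on CPU selects `medium`.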

🛠️ Usage Guide

Installation

Acquire: Download the latest portable executable from the Releases page.
Deploy: Place WhisperVoice.exe in a directory of your choosing.
Initialize: Run the executable. On first launch it automatically sets up its runtime environment (approx. 2 GB).

Operation

Configure: Right-click the System Tray Icon to open Settings. Select your Model Size and Compute Device.
Engage: Press and hold F9 (or your custom hotkey) to open the channel.
Dictate: Speak clearly. The noise gate will isolate your voice.
Execute: Release the key. The machine interprets the signal and injects the text into your active window immediately.
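The hold-to-talk loop above can be pictured with a small model. This README does not document how the noise gate, hotkey hook, or text injection work internally, so the sketch assumes a simple RMS threshold gate over audio frames. Every name in it (`rms`, `noise_gate`, `PushToTalk`) is hypothetical, and the real implementation may differ.

```python
import math
from typing import List, Sequence


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square level of one audio frame (samples in -1.0..1.0)."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def noise_gate(frames: Sequence[Sequence[float]], threshold: float) -> List[List[float]]:
    """Keep only frames whose RMS level reaches the threshold."""
    return [list(f) for f in frames if rms(f) >= threshold]


class PushToTalk:
    """Buffers audio while the hotkey is held and gates it on release."""

    def __init__(self, threshold: float = 0.02) -> None:
        self.threshold = threshold
        self.recording = False
        self._frames: List[List[float]] = []

    def press(self) -> None:
        """Hotkey down: start a fresh recording."""
        self.recording = True
        self._frames = []

    def feed(self, frame: Sequence[float]) -> None:
        """Called for every captured frame; ignored unless the key is held."""
        if self.recording:
            self._frames.append(list(frame))

    def release(self) -> List[float]:
        """Hotkey up: stop recording and return the gated samples."""
        self.recording = False
        kept = noise_gate(self._frames, self.threshold)
        self._frames = []
        return [sample for frame in kept for sample in frame]
```

In a real agent, the samples returned by `release()` would go to the speech model, and the resulting text would be typed into the active window.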

⚖️ License & Rights

Public Domain (CC0 1.0)

To the extent possible under law, the creators of this interface have waived all copyright and related or neighboring rights to this work. This tool belongs to the commons.

Fork it.
Mod it.
Sell it.
Liberate it.

Acknowledgments

While this interface is CC0, it stands on the shoulders of giants:

OpenAI Whisper Models: Released under the MIT License.
Faster-Whisper & CTranslate2: Released under the MIT License.

No gods, no cloud managers.
