# đī¸ W H I S P E R V O I C E
### SOVEREIGN SPEECH RECOGNITION

[](https://git.lashman.live/lashman/whisper_voice/releases/latest)
[](https://creativecommons.org/publicdomain/zero/1.0/)
> *"The master's tools will never dismantle the master's house."* â Audre Lorde
>
> **Build your own tools. Run them locally.**
[Report Issue](https://git.lashman.live/lashman/whisper_voice/issues) âĸ [View Source](https://git.lashman.live/lashman/whisper_voice) âĸ [Releases](https://git.lashman.live/lashman/whisper_voice/releases)
## â The Manifesto
**We hold these truths to be self-evident:** That user data is an extension of the self, and its exploitation by centralized clouds is a violation of digital autonomy.
**Whisper Voice** is built on the principle of **technological sovereignty**. It provides state-of-the-art speech recognition without renting your cognitive output to corporate oligarchies. By running entirely on your own hardware, it reclaims the means of digital production, ensuring that your words remain exclusively yours.
---
## ⥠Technical Architecture
This operates on the metal. It is not a wrapper. It is an engine.
| Component | Technology | Benefit |
| :--- | :--- | :--- |
| **Inference Core** | **Faster-Whisper** | Hyper-optimized implementation of OpenAI's Whisper using **CTranslate2**. Delivers **4x speedups** over PyTorch. |
| **Quantization** | **INT8** | 8-bit quantization enables Pro-grade models (`Large-v3`) to run on consumer GPUs with minimal VRAM. |
| **Sensory Gate** | **Silero VAD** | Enterprise-grade Voice Activity Detection filters out silence and background noise, conserving compute. |
| **Interface** | **Qt 6 / QML** | Hardware-accelerated, glassmorphic UI that feels native yet remains OS-independent. |
---
## đ Native Translation Engine
Whisper Voice v1.0.1 introduces a powerful **Universal Translator** built directly into the core. This is not a web-request to Google Translate. This is a neural network running on your GPU that understands the semantic meaning of speech and reconstructs it in fluent English.
* **Any Language Source**: Speak in French, Japanese, Russian, or 96 other languages.
* **English Output**: The engine instantly transcribes the audio into English text.
* **Zero Latency**: Translation happens in real-time as you speak (sentence-by-sentence).
### Dual-Channel Operation
You do not need to switch modes manually. The application listens on two separate channels simultaneously.
* **F9 (Default)** -> **Transcribe**: Types exactly what you say, in the language you speak.
* **F10 (Default)** -> **Translate**: Translates whatever you say in *any* language into English.
This allows for seamless bilingual workflows. Dictate a message to a local friend on `F9`, then instantly reply to an international colleague on `F10` without touching a single setting.
---
## đšī¸ Controls & Configuration
### Global Hotkeys
The system runs silently in the background. Control it via global shortcuts:
* **Transcribe (Default: F9)**: Use this for normal speech-to-text. It respects the language set in Settings (or Auto-Detect).
* **Translate (Default: F10)**: Use this to force translation to English.
* **Customization**: Both keys can be remapped in the Settings menu. The recorder supports complex combinations (e.g., `Ctrl + Alt + Space`).
### Input Modes
* **Clipboard Paste**: Injects text via OS clipboard. Instant, but some games disable paste.
* **Simulate Typing**: Mimics physical keystrokes. Bypasses anti-cheat and anti-paste blocks. Configurable speed (default 6000 CPM) to prevent game kicks.
---
## đ Intelligence Matrix (Models)
Select the model that aligns with your hardware capabilities.
| Model | VRAM (GPU) | RAM (CPU) | Velocity | Designation |
| :--- | :--- | :--- | :--- | :--- |
| `Tiny` | **~500 MB** | ~1 GB | ⥠**Supersonic** | Command & Control, older hardware. |
| `Base` | **~600 MB** | ~1 GB | đ **Very Fast** | Daily driver for low-power laptops. |
| `Small` | **~1 GB** | ~2 GB | ⊠**Fast** | High accuracy English dictation. |
| `Medium` | **~2 GB** | ~4 GB | âī¸ **Balanced** | Complex vocabulary, foreign accents. |
| `Large-v3 Turbo` | **~4 GB** | ~6 GB | ⨠**Optimal** | **Sweet Spot.** Near-Large smarts, Medium speed. |
| `Large-v3` | **~5 GB** | ~8 GB | đ§ **Maximum** | Professional transcription. Uncompromised. |
> *Note: Acceleration requires you to manually select your Compute Device (CUDA GPU or CPU) in Settings.*
---
## đ ī¸ Operations
### đĨ Deployment
1. **Download**: Grab `WhisperVoice.exe` from [Releases](https://git.lashman.live/lashman/whisper_voice/releases).
2. **Deploy**: Place it anywhere. It is portable.
3. **Bootstrap**: Run it. The agent will self-provision an isolated Python environment (~2GB) on first launch.
4. **Updates**: Simply replace the `.exe`. The **Smart Bootstrapper** will detect the update and sync only the changed files, preserving your settings and skipping unnecessary downloads.
### đ§ Troubleshooting
* **App crashes on start**: Ensure you have [Microsoft Visual C++ Redistributable 2015-2022](https://learn.microsoft.com/en-us/cpp/windows/latest-supported-vc-redist) installed.
* **"Simulate Typing" is slow**: Some applications (remote desktops, older games) choke on super-fast input. Lower the typing speed in Settings to ~1200 CPM.
* **No Audio**: The agent listens to the **Default Communication Device**. Check your Windows Sound Control Panel.
---
## đ Supported Languages
The engine supports 99 languages. You can lock the engine to a specific language in Settings to improve accuracy, or leave it on **Auto-Detect** for multilingual usage.
([See full language list below](#full-language-list))
---
## đ Full Language List
| | | | | |
| :--- | :--- | :--- | :--- | :--- |
| Afrikaans đŋđĻ | Albanian đĻđą | Amharic đĒđš | Arabic đ¸đĻ | Armenian đĻđ˛ |
| Assamese đŽđŗ | Azerbaijani đĻđŋ | Bashkir đˇđē | Basque đĒđ¸ | Belarusian đ§đž |
| Bengali đ§đŠ | Bosnian đ§đĻ | Breton đĢđˇ | Bulgarian đ§đŦ | Burmese đ˛đ˛ |
| Castilian đĒđ¸ | Catalan đĒđ¸ | Chinese đ¨đŗ | Croatian đđˇ | Czech đ¨đŋ |
| Danish đŠđ° | Dutch đŗđą | English đēđ¸ | Estonian đĒđĒ | Faroese đĢđ´ |
| Finnish đĢđŽ | Flemish đ§đĒ | French đĢđˇ | Galician đĒđ¸ | Georgian đŦđĒ |
| German đŠđĒ | Greek đŦđˇ | Gujarati đŽđŗ | Haitian đđš | Hausa đŗđŦ |
| Hawaiian đēđ¸ | Hebrew đŽđą | Hindi đŽđŗ | Hungarian đđē | Icelandic đŽđ¸ |
| Indonesian đŽđŠ | Italian đŽđš | Japanese đ¯đĩ | Javanese đŽđŠ | Kannada đŽđŗ |
| Kazakh đ°đŋ | Khmer đ°đ | Korean đ°đˇ | Lao đąđĻ | Latin đģđĻ |
| Latvian đąđģ | Lingala đ¨đŠ | Lithuanian đąđš | Luxembourgish đąđē | Macedonian đ˛đ° |
| Malagasy đ˛đŦ | Malay đ˛đž | Malayalam đŽđŗ | Maltese đ˛đš | Maori đŗđŋ |
| Marathi đŽđŗ | Moldavian đ˛đŠ | Mongolian đ˛đŗ | Myanmar đ˛đ˛ | Nepali đŗđĩ |
| Norwegian đŗđ´ | Occitan đĢđˇ | Panjabi đŽđŗ | Pashto đĻđĢ | Persian đŽđˇ |
| Polish đĩđą | Portuguese đĩđš | Punjabi đŽđŗ | Romanian đˇđ´ | Russian đˇđē |
| Sanskrit đŽđŗ | Serbian đˇđ¸ | Shona đŋđŧ | Sindhi đĩđ° | Sinhala đąđ° |
| Slovak đ¸đ° | Slovenian đ¸đŽ | Somali đ¸đ´ | Spanish đĒđ¸ | Sundanese đŽđŠ |
| Swahili đ°đĒ | Swedish đ¸đĒ | Tagalog đĩđ | Tajik đšđ¯ | Tamil đŽđŗ |
| Tatar đˇđē | Telugu đŽđŗ | Thai đšđ | Tibetan đ¨đŗ | Turkish đšđˇ |
| Turkmen đšđ˛ | Ukrainian đēđĻ | Urdu đĩđ° | Uzbek đēđŋ | Vietnamese đģe |
| Welsh đ´ķ §ķ ĸķ ˇķ Ŧķ ŗķ ŋ | Yiddish đŽđą | Yoruba đŗđŦ | | |
### âī¸ PUBLIC DOMAIN (CC0 1.0)
*No Rights Reserved. No Gods. No Managers.*
Credit to **OpenAI** (Whisper), **Systran** (Faster-Whisper), and **Silero** (VAD).