# đī¸ W H I S P E R V O I C E
### SOVEREIGN SPEECH RECOGNITION

[](https://git.lashman.live/lashman/whisper_voice/releases/latest)
[](https://creativecommons.org/publicdomain/zero/1.0/)
> *"The master's tools will never dismantle the master's house."*
>
> **Build your own tools. Run them locally. Free your mind.**
[View Source](https://git.lashman.live/lashman/whisper_voice) âĸ [Report Issue](https://git.lashman.live/lashman/whisper_voice/issues)
## đĄ The Transmission
We live in an era of enclosure. Our words, our thoughts, and our digital footprints are strip-mined by centralized giants, turned into capital, and sold back to us as "convenience."
**Whisper Voice** is a rejection of that contract.
It is built on the axiom that **your voice belongs to you**. By bringing state-of-the-art inference down from the server farms and running it on your own metal, we reclaim a small piece of the digital commons. This software answers to no one but you. It has no telemetry, no subscription, and no masters.
---
## ⥠The Engine
This operates on the silicon. It is not a wrapper. It is a machine.
| Component | Technology | Benefit |
| :--- | :--- | :--- |
| **Inference Core** | **Faster-Whisper** | Hyper-optimized implementation via **CTranslate2**. Delivers **4x velocity** over standard PyTorch execution. |
| **Compression** | **INT8 quantization** | Enables Pro-grade models (`Large-v3`) to run on consumer hardware, democratizing access to high-fidelity AI. |
| **Sensory Gate** | **Silero VAD** | Enterprise-grade Voice Activity Detection filters out the noise, ensuring only intent is captured. |
| **Interface** | **Qt 6 / QML** | Hardware-accelerated, glassmorphic UI that is fluid, responsive, and sovereign. |
## đ Universal Translator
Whisper Voice v1.0.1 introduces a **Neural Translation Engine** built directly into the core. It bypasses the need for corporate translation APIs entirely.
* **Input**: Speak in French, Japanese, Russian, or **96 other languages**.
* **Output**: The engine instantly reconstructs the semantic meaning in fluent **English**.
* **Local Execution**: No API keys. No data leaks. The translation happens on your GPU.
### Dual-Channel Workflow
The application listens on two separate channels simultaneously, allowing for seamless fluid switching between local and international communication.
* **F9 (Default)** -> **Transcribe**: Types exactly what you say, in the language you speak.
* **F10 (Default)** -> **Translate**: Translates whatever you say in *any* language into English.
---
## đšī¸ Command & Control
### Global Hotkeys
The agent runs silently in the background, waiting for your signal.
* **Transcribe (F9)**: Opens the channel for standard speech-to-text.
* **Translate (F10)**: Opens the channel for neural translation.
* **Customization**: Remap these keys in Settings. The recorder supports complex chords (e.g. `Ctrl + Alt + Space`) to fit your workflow.
### Injection Protocols
* **Clipboard Paste**: Standard text injection. Instant, reliable.
* **Simulate Typing**: Mimics physical keystrokes at superhuman speed (6000 CPM). Bypasses anti-paste restrictions and "protected" windows.
## đ Intelligence Matrix
Select the model that aligns with your available resources.
| Model | VRAM (GPU) | RAM (CPU) | Designation | Capability |
| :--- | :--- | :--- | :--- | :--- |
| `Tiny` | **~500 MB** | ~1 GB | ⥠**Supersonic** | Command & Control, older hardware. |
| `Base` | **~600 MB** | ~1 GB | đ **Very Fast** | Daily driver for low-power laptops. |
| `Small` | **~1 GB** | ~2 GB | ⊠**Fast** | High accuracy English dictation. |
| `Medium` | **~2 GB** | ~4 GB | âī¸ **Balanced** | Complex vocabulary, foreign accents. |
| `Large-v3 Turbo` | **~4 GB** | ~6 GB | ⨠**Optimal** | **The Sweet Spot.** Near-Large intelligence, Medium speed. |
| `Large-v3` | **~5 GB** | ~8 GB | đ§ **Maximum** | Professional grade. Uncompromised. |
> *Note: Acceleration requires you to manually select your Compute Device (CUDA GPU or CPU) in Settings.*
---
## đ ī¸ Deployment
### đĨ Installation
1. **Acquire**: Download `WhisperVoice.exe` from [Releases](https://git.lashman.live/lashman/whisper_voice/releases).
2. **Deploy**: Place it anywhere. It is portable.
3. **Bootstrap**: Run it. The agent will self-provision an isolated Python runtime (~2GB) on first launch.
4. **Sync**: Future updates are handled by the **Smart Bootstrapper**, which surgically updates only changed files, respecting your bandwidth and your settings.
### đ§ Troubleshooting
* **App crashes on start**: Ensure you have [Microsoft Visual C++ Redistributable 2015-2022](https://learn.microsoft.com/en-us/cpp/windows/latest-supported-vc-redist) installed.
* **"Simulate Typing" is slow**: Some applications (remote desktops, legacy games) cannot handle the data stream. Lower the typing speed in Settings to ~1200 CPM.
* **No Audio**: The agent listens to the **Default Communication Device**. Verify your Windows Sound Control Panel.
---
## īŋŊ Supported Languages
The engine understands the following 99 languages. You can lock the focus to a specific language in Settings to improve accuracy, or rely on **Auto-Detect** for fluid multilingual usage.
| | | | | | |
| :--- | :--- | :--- | :--- | :--- | :--- |
| Afrikaans đŋđĻ | Albanian đĻđą | Amharic đĒđš | Arabic đ¸đĻ | Armenian đĻđ˛ | Assamese đŽđŗ |
| Azerbaijani đĻđŋ | Bashkir đˇđē | Basque đĒđ¸ | Belarusian đ§đž | Bengali đ§đŠ | Bosnian đ§đĻ |
| Breton đĢđˇ | Bulgarian đ§đŦ | Burmese đ˛đ˛ | Castilian đĒđ¸ | Catalan đĒđ¸ | Chinese đ¨đŗ |
| Croatian đđˇ | Czech đ¨đŋ | Danish đŠđ° | Dutch đŗđą | English đēđ¸ | Estonian đĒđĒ |
| Faroese đĢđ´ | Finnish đĢđŽ | Flemish đ§đĒ | French đĢđˇ | Galician đĒđ¸ | Georgian đŦđĒ |
| German đŠđĒ | Greek đŦđˇ | Gujarati đŽđŗ | Haitian đđš | Hausa đŗđŦ | Hawaiian đēđ¸ |
| Hebrew đŽđą | Hindi đŽđŗ | Hungarian đđē | Icelandic đŽđ¸ | Indonesian đŽđŠ | Italian đŽđš |
| Japanese đ¯đĩ | Javanese đŽđŠ | Kannada đŽđŗ | Kazakh đ°đŋ | Khmer đ°đ | Korean đ°đˇ |
| Lao đąđĻ | Latin đģđĻ | Latvian đąđģ | Lingala đ¨đŠ | Lithuanian đąđš | Luxembourgish đąđē |
| Macedonian đ˛đ° | Malagasy đ˛đŦ | Malay đ˛đž | Malayalam đŽđŗ | Maltese đ˛đš | Maori đŗđŋ |
| Marathi đŽđŗ | Moldavian đ˛đŠ | Mongolian đ˛đŗ | Myanmar đ˛đ˛ | Nepali đŗđĩ | Norwegian đŗđ´ |
| Occitan đĢđˇ | Panjabi đŽđŗ | Pashto đĻđĢ | Persian đŽđˇ | Polish đĩđą | Portuguese đĩđš |
| Punjabi đŽđŗ | Romanian đˇđ´ | Russian đˇđē | Sanskrit đŽđŗ | Serbian đˇđ¸ | Shona đŋđŧ |
| Sindhi đĩđ° | Sinhala đąđ° | Slovak đ¸đ° | Slovenian đ¸đŽ | Somali đ¸đ´ | Spanish đĒđ¸ |
| Sundanese đŽđŠ | Swahili đ°đĒ | Swedish đ¸đĒ | Tagalog đĩđ | Tajik đšđ¯ | Tamil đŽđŗ |
| Tatar đˇđē | Telugu đŽđŗ | Thai đšđ | Tibetan đ¨đŗ | Turkish đšđˇ | Turkmen đšđ˛ |
| Ukrainian đēđĻ | Urdu đĩđ° | Uzbek đēđŋ | Vietnamese đģe | Welsh đ´ķ §ķ ĸķ ˇķ Ŧķ ŗķ ŋ | Yiddish đŽđą |
| Yoruba đŗđŦ | | | | | |
### âī¸ PUBLIC DOMAIN (CC0 1.0)
*No Rights Reserved. No Gods. No Masters. No Managers.*
Credit to **OpenAI** (Whisper), **Systran** (Faster-Whisper), and **Silero** (VAD).