Your Name 4b84a27a67 v1.0.1 Feature Update and Polish

Full Changelog:

[New Features]
- Added Native Translation Mode:
  - Whisper model now fully supports Translating any language to English
  - Added 'task' and 'language' parameters to Transcriber core
- Dual Hotkey Support:
  - Added separate Global Hotkeys for Transcribe (default F8) and Translate (default F10)
  - Both hotkeys are fully customizable in Settings
  - Engine dynamically switches modes based on which key is pressed
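
A minimal sketch of how two hotkeys could select the engine mode. Only the F8/F10 defaults and the 'task' and 'language' parameter names come from the changelog above; the function names and the shape of the settings dict are hypothetical.

```python
# Defaults taken from the changelog; everything else here is illustrative.
DEFAULT_HOTKEYS = {"transcribe": "f8", "translate": "f10"}


def build_hotkey_map(settings=None):
    """Map each configured key to the task it should trigger."""
    hotkeys = dict(DEFAULT_HOTKEYS)
    hotkeys.update(settings or {})  # user overrides from Settings
    return {key.lower(): task for task, key in hotkeys.items()}


def options_for_key(key, hotkey_map, language=None):
    """Return task/language options for a pressed key, or None if unbound."""
    task = hotkey_map.get(key.lower())
    if task is None:
        return None
    # language=None leaves language detection to the model.
    return {"task": task, "language": language}


hotkey_map = build_hotkey_map()
print(options_for_key("F10", hotkey_map, language="de"))
```

The returned dict is meant to be passed to the transcriber as keyword arguments, so the same engine serves both modes.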

[UI/UX Improvements]
- Settings Window:
  - Widened Hotkey Input fields (240px) to accommodate long combinations
  - Added Pretty-Printing for hotkey sequences (e.g. 'ctrl+f9' is displayed as 'Ctrl + F9')
  - Replaced Country Code dropdown with Full Language Names (99+ languages)
  - Made Language Dropdown scrollable (max height 300px) to prevent screen overflow
  - Removed redundant 'Task' selector (replaced by dedicated hotkeys)
- System Tray:
  - Tooltip now displays both Transcribe and Translate hotkeys
  - Tooltip hotkeys are formatted readably
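
The pretty-printing of hotkey sequences can be sketched in a few lines. The function name `format_hotkey` is hypothetical, and the actual formatter may treat special keys differently.

```python
def format_hotkey(sequence):
    """Turn a raw sequence such as 'ctrl+f9' into a display string."""
    # Split on '+', drop empty pieces, capitalize each key name.
    parts = [part.strip() for part in sequence.split("+") if part.strip()]
    return " + ".join(part.capitalize() for part in parts)


print(format_hotkey("ctrl+f9"))  # Ctrl + F9
```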

[Core & Performance]
- Bootstrapper:
  - Implemented Smart Incremental Sync
  - Now checks filesize and content hash before copying files
  - Drastically reduces startup time for subsequent runs
  - Preserves user settings.json during updates
- Backend:
  - Fixed HotkeyManager to support dynamic configuration keys
  - Fixed Language Lock: selecting a language now correctly forces the model to use it
  - Refactored bridge/main connection for language list handling
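
A sketch of the incremental sync idea described under Bootstrapper: compare file size first, fall back to a content hash, and never overwrite an existing settings.json. The function names, the choice of SHA-256, and matching settings.json by file name only are assumptions, not the bootstrapper's actual code.

```python
import hashlib
import shutil
from pathlib import Path

PRESERVED = {"settings.json"}  # assumption: matched by file name only


def file_hash(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are not loaded at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_copy(src, dst):
    """True if dst is missing or differs from src."""
    if not dst.exists():
        return True
    if src.stat().st_size != dst.stat().st_size:
        return True  # cheap size check avoids hashing most changed files
    return file_hash(src) != file_hash(dst)


def sync_tree(src_root, dst_root):
    """Copy only new or changed files; return the relative paths copied."""
    src_root, dst_root = Path(src_root), Path(dst_root)
    copied = []
    for src in sorted(src_root.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(src_root)
        dst = dst_root / rel
        if src.name in PRESERVED and dst.exists():
            continue  # keep the user's existing settings
        if needs_copy(src, dst):
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)
            copied.append(rel.as_posix())
    return copied
```

On a second run with nothing changed, `sync_tree` copies no files, which is where the startup saving comes from.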

2026-01-24 18:29:10 +02:00

assets

Initial commit of WhisperVoice

2026-01-24 17:03:52 +02:00

src

v1.0.1 Feature Update and Polish

2026-01-24 18:29:10 +02:00

.gitignore

Initial commit of WhisperVoice

2026-01-24 17:03:52 +02:00

app_icon.ico

Initial commit of WhisperVoice

2026-01-24 17:03:52 +02:00

bootstrapper.py

v1.0.1 Feature Update and Polish

2026-01-24 18:29:10 +02:00

build_bootstrapper.py

Initial commit of WhisperVoice

2026-01-24 17:03:52 +02:00

build_exe.bat

Initial commit of WhisperVoice

2026-01-24 17:03:52 +02:00

convert_icon.py

Initial commit of WhisperVoice

2026-01-24 17:03:52 +02:00

download_icons.py

Initial commit of WhisperVoice

2026-01-24 17:03:52 +02:00

main.py

v1.0.1 Feature Update and Polish

2026-01-24 18:29:10 +02:00

portable_build.py

Initial commit of WhisperVoice

2026-01-24 17:03:52 +02:00

README.md

Aesthetic overhaul of documentation

2026-01-24 17:29:59 +02:00

requirements.txt

Initial commit of WhisperVoice

2026-01-24 17:03:52 +02:00

run_source.bat

Initial commit of WhisperVoice

2026-01-24 17:03:52 +02:00

run.bat

Initial commit of WhisperVoice

2026-01-24 17:03:52 +02:00

README.md

🎙️ W H I S P E R V O I C E

SOVEREIGN SPEECH RECOGNITION

"The master's tools will never dismantle the master's house." — Audre Lorde
Build your own tools. Run them locally.

Report Issue • View Source • Releases

✊ The Manifesto

We hold these truths to be self-evident: That user data is an extension of the self, and its exploitation by centralized clouds is a violation of digital autonomy.

Whisper Voice is built on the principle of technological sovereignty. It provides state-of-the-art speech recognition without renting your cognitive output to corporate oligarchies. By running entirely on your own hardware, it reclaims the means of digital production, ensuring that your words remain exclusively yours.

⚡ Technical Architecture

This operates on the metal. It is not a wrapper. It is an engine.

Component	Technology	Benefit
Inference Core	Faster-Whisper	Hyper-optimized implementation of OpenAI's Whisper using CTranslate2. Up to 4x faster than the reference PyTorch implementation.
Quantization	INT8	8-bit quantization enables Pro-grade models (`Large-v3`) to run on consumer GPUs with minimal VRAM.
Sensory Gate	Silero VAD	Enterprise-grade Voice Activity Detection filters out silence and background noise, conserving compute.
Interface	Qt 6 / QML	Hardware-accelerated, glassmorphic UI that feels native yet remains OS-independent.
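
A minimal sketch of how the first three rows fit together using the faster-whisper package. `WhisperModel`, `compute_type`, and `vad_filter` are part of that library; the helper names and the choice of plain `int8` on both devices are assumptions, not WhisperVoice's actual configuration.

```python
def model_settings(use_gpu):
    """Pick the device and INT8 compute type (illustrative defaults)."""
    return {"device": "cuda" if use_gpu else "cpu", "compute_type": "int8"}


def transcribe_file(path, model_size="large-v3", use_gpu=False):
    """Transcribe one audio file and return the joined text."""
    # Imported lazily so model_settings() works without the package installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, **model_settings(use_gpu))
    # vad_filter=True lets the bundled Silero VAD skip non-speech audio.
    segments, info = model.transcribe(path, vad_filter=True)
    return " ".join(segment.text.strip() for segment in segments)
```

Calling `transcribe_file` downloads the model on first use, so it needs network access and enough disk space for the chosen model size.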

📊 Intelligence Matrix

Select the model that aligns with your hardware capabilities.

Model	VRAM (GPU)	RAM (CPU)	Velocity	Designation
`Tiny`	~500 MB	~1 GB	⚡ Supersonic	Command & Control, older hardware.
`Base`	~600 MB	~1 GB	🚀 Very Fast	Daily driver for low-power laptops.
`Small`	~1 GB	~2 GB	⏩ Fast	High accuracy English dictation.
`Medium`	~2 GB	~4 GB	⚖️ Balanced	Complex vocabulary, foreign accents.
`Large-v3 Turbo`	~4 GB	~6 GB	✨ Optimal	Sweet Spot. Near-Large smarts, Medium speed.
`Large-v3`	~5 GB	~8 GB	🧠 Maximum	Professional transcription. Uncompromised.

Note: Acceleration requires you to manually select your Compute Device (CUDA GPU or CPU) in Settings.
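
The table can be read as a lookup: pick the largest model whose memory estimate fits your hardware. A sketch under that assumption, with the figures copied from the table, 1 GB treated as 1000 MB, and a hypothetical function name:

```python
# (name, approx. VRAM in MB, approx. RAM in MB), ordered smallest to largest.
MODELS = [
    ("Tiny", 500, 1000),
    ("Base", 600, 1000),
    ("Small", 1000, 2000),
    ("Medium", 2000, 4000),
    ("Large-v3 Turbo", 4000, 6000),
    ("Large-v3", 5000, 8000),
]


def largest_model_that_fits(available_mb, on_gpu=True):
    """Return the biggest model within the memory budget, or None."""
    column = 1 if on_gpu else 2  # VRAM column for GPU, RAM column for CPU
    fitting = [row[0] for row in MODELS if row[column] <= available_mb]
    return fitting[-1] if fitting else None


print(largest_model_that_fits(4500))  # Large-v3 Turbo
```

The estimates are approximate, so leave headroom for other applications sharing the same memory.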

🛠️ Operations

📥 Deployment

Download: Grab WhisperVoice.exe from Releases.
Deploy: Place it anywhere. It is portable.
Bootstrap: Run it. The agent will self-provision an isolated Python environment (~2GB) on first launch.

🕹️ Controls

Global Hook: F9 (Default). Press to open the channel. Release to inject text.
Tray Agent: Retracts to the system tray. Right-click for Settings or File Transcription.
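
The press-to-record, release-to-inject behaviour can be sketched as a small state machine. The class and callback names are hypothetical; a real implementation would wire `on_press` and `on_release` to a global keyboard hook.

```python
class PushToTalk:
    """Start recording on key press, inject the text on key release."""

    def __init__(self, start_recording, stop_and_transcribe, inject_text):
        self._start = start_recording
        self._stop = stop_and_transcribe
        self._inject = inject_text
        self.recording = False

    def on_press(self):
        if self.recording:
            return  # ignore key auto-repeat while the key is held
        self.recording = True
        self._start()

    def on_release(self):
        if not self.recording:
            return  # release without a matching press
        self.recording = False
        text = self._stop()
        if text:
            self._inject(text)  # skip injection when nothing was recognized
```

The `recording` flag matters because most operating systems send repeated key-down events while a key is held.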

📡 Input Modes

Mode	Description	Speed
Clipboard Paste	Standard text injection via OS clipboard.	Instant
Simulate Typing	Mimics physical keystrokes. Bypasses anti-paste blocks.	Up to 6000 CPM
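
The CPM (characters per minute) figure maps directly to a per-keystroke delay: 60 seconds divided by the rate. A sketch of that arithmetic, with hypothetical function names and a caller-supplied `send_char` callback standing in for the real keystroke injection:

```python
import time


def delay_for_cpm(cpm):
    """Seconds to wait between keystrokes at a given characters-per-minute rate."""
    if cpm <= 0:
        raise ValueError("cpm must be positive")
    return 60.0 / cpm


def type_text(text, cpm, send_char, sleep=time.sleep):
    """Send text one character at a time, pausing between keystrokes."""
    delay = delay_for_cpm(cpm)
    for char in text:
        send_char(char)
        sleep(delay)
```

At 6000 CPM the delay is 10 ms per character; at 1200 CPM it is 50 ms, which gives slow targets more time to process each keystroke.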

🌐 Universal Translation

The model listens in 99 languages and translates them to English or transcribes them natively.

Supported languages:

Afrikaans 🇿🇦	Albanian 🇦🇱	Amharic 🇪🇹	Arabic 🇸🇦
Armenian 🇦🇲	Assamese 🇮🇳	Azerbaijani 🇦🇿	Bashkir 🇷🇺
Basque 🇪🇸	Belarusian 🇧🇾	Bengali 🇧🇩	Bosnian 🇧🇦
Breton 🇫🇷	Bulgarian 🇧🇬	Burmese 🇲🇲	Castilian 🇪🇸
Catalan 🇪🇸	Chinese 🇨🇳	Croatian 🇭🇷	Czech 🇨🇿
Danish 🇩🇰	Dutch 🇳🇱	English 🇺🇸	Estonian 🇪🇪
Faroese 🇫🇴	Finnish 🇫🇮	Flemish 🇧🇪	French 🇫🇷
Galician 🇪🇸	Georgian 🇬🇪	German 🇩🇪	Greek 🇬🇷
Gujarati 🇮🇳	Haitian 🇭🇹	Hausa 🇳🇬	Hawaiian 🇺🇸
Hebrew 🇮🇱	Hindi 🇮🇳	Hungarian 🇭🇺	Icelandic 🇮🇸
Indonesian 🇮🇩	Italian 🇮🇹	Japanese 🇯🇵	Javanese 🇮🇩
Kannada 🇮🇳	Kazakh 🇰🇿	Khmer 🇰🇭	Korean 🇰🇷
Lao 🇱🇦	Latin 🇻🇦	Latvian 🇱🇻	Lingala 🇨🇩
Lithuanian 🇱🇹	Luxembourgish 🇱🇺	Macedonian 🇲🇰	Malagasy 🇲🇬
Malay 🇲🇾	Malayalam 🇮🇳	Maltese 🇲🇹	Maori 🇳🇿
Marathi 🇮🇳	Moldavian 🇲🇩	Mongolian 🇲🇳	Myanmar 🇲🇲
Nepali 🇳🇵	Norwegian 🇳🇴	Occitan 🇫🇷	Panjabi 🇮🇳
Pashto 🇦🇫	Persian 🇮🇷	Polish 🇵🇱	Portuguese 🇵🇹
Punjabi 🇮🇳	Romanian 🇷🇴	Russian 🇷🇺	Sanskrit 🇮🇳
Serbian 🇷🇸	Shona 🇿🇼	Sindhi 🇵🇰	Sinhala 🇱🇰
Slovak 🇸🇰	Slovenian 🇸🇮	Somali 🇸🇴	Spanish 🇪🇸
Sundanese 🇮🇩	Swahili 🇰🇪	Swedish 🇸🇪	Tagalog 🇵🇭
Tajik 🇹🇯	Tamil 🇮🇳	Tatar 🇷🇺	Telugu 🇮🇳
Thai 🇹🇭	Tibetan 🇨🇳	Turkish 🇹🇷	Turkmen 🇹🇲
Ukrainian 🇺🇦	Urdu 🇵🇰	Uzbek 🇺🇿	Vietnamese 🇻🇳
Welsh 🏴󠁧󠁢󠁷󠁬󠁳󠁿	Yiddish 🇮🇱	Yoruba 🇳🇬

🔧 Troubleshooting

🔥 App crashes on start

The underlying engine requires standard C++ libraries. Install the Microsoft Visual C++ Redistributable (2015-2022).

🐌 "Simulate Typing" is slow

Some apps (games, RDP) can't handle supersonic input. Go to Settings and lower the Typing Speed to ~1200 CPM.

🎤 No Audio / Silence

The agent listens to the Default Communication Device. Ensure your microphone is set correctly in Windows Sound Settings.

⚖️ PUBLIC DOMAIN (CC0 1.0)

No Rights Reserved. No Gods. No Managers.

Credit to OpenAI (Whisper), Systran (Faster-Whisper), and Silero (VAD).

Releases 6

v1.2.0 Latest

2026-02-18 22:30:48 +02:00

Languages

Python 51.8%

QML 44.7%

GLSL 3.1%

Batchfile 0.4%