# Compare commits

9 commits

| SHA1 |
| --- |
| 4b84a27a67 |
| f184eb0037 |
| 306bd075ed |
| a1cc9c61b9 |
| e627e1b8aa |
| eaa572b42f |
| e900201214 |
| 0d426aea4b |
| b15ce8076f |
### README.md

```diff
@@ -1,71 +1,155 @@
-# Whisper Voice
-
-**Reclaim Your Voice from the Cloud.**
-
-Whisper Voice is a high-performance, strictly local speech-to-text tool designed for the desktop. It provides instant, high-accuracy dictation anywhere on your system—no internet connection required, no corporate servers, and absolutely no data harvesting.
-
-We believe that the tools of production—and communication—should belong to the individual, not rented from centralized tech giants.
-
----
-
-## ✊ Core Principles
-
-### 1. Total Autonomy (Local-First)
-Your voice data is yours alone. Unlike commercial alternatives that siphon your words to remote data centers for processing and profiling, Whisper Voice runs entirely on your hardware. **No masters, no servers.** You retain full sovereignty over your digital footprint.
-
-### 2. Decentralized Power
-By leveraging optimized local processing, we strip away the need for reliance on massive, energy-hungry corporate infrastructure. This is technology scaled to the human level—powerful, efficient, and completely under your control.
-
-### 3. Accessible to All
-High-quality speech recognition shouldn't be gated behind subscriptions or paywalls. This tool is free, open, and built to empower users to interact with their machines on their own terms.
-
----
-
-## ✨ Features
-
-* **100% Offline Processing**: Once the recognition engine is downloaded, the cable can be cut. Nothing leaves your machine.
-* **Universal Compatibility**: Works in any text field—editors, chat apps, terminals, or browsers. If you can type there, you can speak there.
-* **Adaptive Input**:
-    * *Clipboard Mode*: Standard paste injection.
-    * *High-Speed Simulation*: Simulates keystrokes at supersonic speeds (up to 6000 CPM) for apps that block pasting.
-* **System Integration**: Minimalist overlay and system tray presence. It exists when you need it and vanishes when you don't.
-* **Resource Efficiency**: Optimized to run smoothly on consumer hardware without monopolizing your system resources.
-
----
-
-## 🚀 Getting Started
-
-### Installation
-1. Download the latest release.
-2. Run `WhisperVoice.exe`.
-3. On the first run, the bootstrapper will autonomously provision the necessary runtime environment. This ensures your system remains clean and dependencies are self-contained.
-
-### Usage
-1. **Set Your Trigger**: Configure a global hotkey (default: `F9`) in the settings.
-2. **Speak Freely**: Hold the hotkey (or toggle it) and speak.
-3. **Direct Action**: Your words are instantly transcribed and injected into your active window.
-
----
-
-## ⚙️ Configuration
-
-The **Settings** panel puts the means of configuration in your hands:
-
-* **Recognition Engine**: Choose the size of the model that fits your hardware capabilities (Tiny to Large). Larger models offer greater precision but require more computing power.
-* **Input Method**: Switch between "Clipboard Paste" and "Simulate Typing" depending on target application restrictions.
-* **Typing Speed**: Adjust the keystroke injection rate. Crank it up to 6000 CPM for instant text delivery.
-* **Run on Startup**: Configure the agent to be ready the moment your session begins.
-
----
-
-## 🤝 Mutual Aid
-
-This project thrives on community collaboration. If you have improvements, fixes, or ideas, you are encouraged to contribute. We build better systems when we build them together, horizontally and transparently.
-
-* **Report Issues**: If something breaks, let us know.
-* **Contribute Code**: The source is open. Fork it, improve it, share it.
-
----
-
-*Built with local processing libraries and Qt.*
-*No gods, no cloud managers.*
+<div align="center">
+
+# 🎙️ W H I S P E R V O I C E
+### SOVEREIGN SPEECH RECOGNITION
+
+<br>
+
+
+[![Download](https://img.shields.io/badge/DOWNLOAD-LATEST_RELEASE-F72585?style=for-the-badge&logo=windows&logoColor=white)](https://git.lashman.live/lashman/whisper_voice/releases/latest)
+[![License: CC0](https://img.shields.io/badge/LICENSE-PUBLIC_DOMAIN-brightgreen?style=for-the-badge)](https://creativecommons.org/publicdomain/zero/1.0/)
+
+<br>
+
+> *"The master's tools will never dismantle the master's house."* — Audre Lorde
+> <br>
+> **Build your own tools. Run them locally.**
+
+[Report Issue](https://git.lashman.live/lashman/whisper_voice/issues) • [View Source](https://git.lashman.live/lashman/whisper_voice) • [Releases](https://git.lashman.live/lashman/whisper_voice/releases)
+
+</div>
+
+<br>
+
+## ✊ The Manifesto
+
+**We hold these truths to be self-evident:** That user data is an extension of the self, and its exploitation by centralized clouds is a violation of digital autonomy.
+
+**Whisper Voice** is built on the principle of **technological sovereignty**. It provides state-of-the-art speech recognition without renting your cognitive output to corporate oligarchies. By running entirely on your own hardware, it reclaims the means of digital production, ensuring that your words remain exclusively yours.
+
+---
+
+## ⚡ Technical Architecture
+
+This operates on the metal. It is not a wrapper. It is an engine.
+
+| Component | Technology | Benefit |
+| :--- | :--- | :--- |
+| **Inference Core** | **Faster-Whisper** | Hyper-optimized implementation of OpenAI's Whisper using **CTranslate2**. Delivers **4x speedups** over PyTorch. |
+| **Quantization** | **INT8** | 8-bit quantization enables Pro-grade models (`Large-v3`) to run on consumer GPUs with minimal VRAM. |
+| **Sensory Gate** | **Silero VAD** | Enterprise-grade Voice Activity Detection filters out silence and background noise, conserving compute. |
+| **Interface** | **Qt 6 / QML** | Hardware-accelerated, glassmorphic UI that feels native yet remains OS-independent. |
+
+---
+
+## 📊 Intelligence Matrix
+
+Select the model that aligns with your hardware capabilities.
+
+| Model | VRAM (GPU) | RAM (CPU) | Velocity | Designation |
+| :--- | :--- | :--- | :--- | :--- |
+| `Tiny` | **~500 MB** | ~1 GB | ⚡ **Supersonic** | Command & Control, older hardware. |
+| `Base` | **~600 MB** | ~1 GB | 🚀 **Very Fast** | Daily driver for low-power laptops. |
+| `Small` | **~1 GB** | ~2 GB | ⏩ **Fast** | High accuracy English dictation. |
+| `Medium` | **~2 GB** | ~4 GB | ⚖️ **Balanced** | Complex vocabulary, foreign accents. |
+| `Large-v3 Turbo` | **~4 GB** | ~6 GB | ✨ **Optimal** | **Sweet Spot.** Near-Large smarts, Medium speed. |
+| `Large-v3` | **~5 GB** | ~8 GB | 🧠 **Maximum** | Professional transcription. Uncompromised. |
+
+> *Note: Acceleration requires you to manually select your Compute Device (CUDA GPU or CPU) in Settings.*
+
+---
+
+## 🛠️ Operations
+
+### 📥 Deployment
+1. **Download**: Grab `WhisperVoice.exe` from [Releases](https://git.lashman.live/lashman/whisper_voice/releases).
+2. **Deploy**: Place it anywhere. It is portable.
+3. **Bootstrap**: Run it. The agent will self-provision an isolated Python environment (~2GB) on first launch.
+
+### 🕹️ Controls
+* **Global Hook**: `F9` (Default). Press to open the channel. Release to inject text.
+* **Tray Agent**: Retracts to the system tray. Right-click for **Settings** or **File Transcription**.
+
+### 📡 Input Modes
+| Mode | Description | Speed |
+| :--- | :--- | :--- |
+| **Clipboard Paste** | Standard text injection via OS clipboard. | Instant |
+| **Simulate Typing** | Mimics physical keystrokes. Bypasses anti-paste blocks. | Up to **6000** CPM |
+
+---
+
+## 🌐 Universal Translation
+
+The model listens in **99 languages** and translates them to English or transcribes them natively.
+
+<details>
+<summary><b>Click to view supported languages</b></summary>
+<br>
+
+| | | | |
+| :--- | :--- | :--- | :--- |
+| Afrikaans 🇿🇦 | Albanian 🇦🇱 | Amharic 🇪🇹 | Arabic 🇸🇦 |
+| Armenian 🇦🇲 | Assamese 🇮🇳 | Azerbaijani 🇦🇿 | Bashkir 🇷🇺 |
+| Basque 🇪🇸 | Belarusian 🇧🇾 | Bengali 🇧🇩 | Bosnian 🇧🇦 |
+| Breton 🇫🇷 | Bulgarian 🇧🇬 | Burmese 🇲🇲 | Castilian 🇪🇸 |
+| Catalan 🇪🇸 | Chinese 🇨🇳 | Croatian 🇭🇷 | Czech 🇨🇿 |
+| Danish 🇩🇰 | Dutch 🇳🇱 | English 🇺🇸 | Estonian 🇪🇪 |
+| Faroese 🇫🇴 | Finnish 🇫🇮 | Flemish 🇧🇪 | French 🇫🇷 |
+| Galician 🇪🇸 | Georgian 🇬🇪 | German 🇩🇪 | Greek 🇬🇷 |
+| Gujarati 🇮🇳 | Haitian 🇭🇹 | Hausa 🇳🇬 | Hawaiian 🇺🇸 |
+| Hebrew 🇮🇱 | Hindi 🇮🇳 | Hungarian 🇭🇺 | Icelandic 🇮🇸 |
+| Indonesian 🇮🇩 | Italian 🇮🇹 | Japanese 🇯🇵 | Javanese 🇮🇩 |
+| Kannada 🇮🇳 | Kazakh 🇰🇿 | Khmer 🇰🇭 | Korean 🇰🇷 |
+| Lao 🇱🇦 | Latin 🇻🇦 | Latvian 🇱🇻 | Lingala 🇨🇩 |
+| Lithuanian 🇱🇹 | Luxembourgish 🇱🇺 | Macedonian 🇲🇰 | Malagasy 🇲🇬 |
+| Malay 🇲🇾 | Malayalam 🇮🇳 | Maltese 🇲🇹 | Maori 🇳🇿 |
+| Marathi 🇮🇳 | Moldavian 🇲🇩 | Mongolian 🇲🇳 | Myanmar 🇲🇲 |
+| Nepali 🇳🇵 | Norwegian 🇳🇴 | Occitan 🇫🇷 | Panjabi 🇮🇳 |
+| Pashto 🇦🇫 | Persian 🇮🇷 | Polish 🇵🇱 | Portuguese 🇵🇹 |
+| Punjabi 🇮🇳 | Romanian 🇷🇴 | Russian 🇷🇺 | Sanskrit 🇮🇳 |
+| Serbian 🇷🇸 | Shona 🇿🇼 | Sindhi 🇵🇰 | Sinhala 🇱🇰 |
+| Slovak 🇸🇰 | Slovenian 🇸🇮 | Somali 🇸🇴 | Spanish 🇪🇸 |
+| Sundanese 🇮🇩 | Swahili 🇰🇪 | Swedish 🇸🇪 | Tagalog 🇵🇭 |
+| Tajik 🇹🇯 | Tamil 🇮🇳 | Tatar 🇷🇺 | Telugu 🇮🇳 |
+| Thai 🇹🇭 | Tibetan 🇨🇳 | Turkish 🇹🇷 | Turkmen 🇹🇲 |
+| Ukrainian 🇺🇦 | Urdu 🇵🇰 | Uzbek 🇺🇿 | Vietnamese 🇻🇳 |
+| Welsh 🏴 | Yiddish 🇮🇱 | Yoruba 🇳🇬 | |
+
+</details>
+
+---
+
+## 🔧 Troubleshooting
+
+<details>
+<summary><b>🔥 App crashes on start</b></summary>
+<blockquote>
+The underlying engine requires standard C++ libraries. Install the <b>Microsoft Visual C++ Redistributable (2015-2022)</b>.
+</blockquote>
+</details>
+
+<details>
+<summary><b>🐌 "Simulate Typing" is slow</b></summary>
+<blockquote>
+Some apps (games, RDP) can't handle supersonic input. Go to <b>Settings</b> and lower the <b>Typing Speed</b> to ~1200 CPM.
+</blockquote>
+</details>
+
+<details>
+<summary><b>🎤 No Audio / Silence</b></summary>
+<blockquote>
+The agent listens to the <b>Default Communication Device</b>. Ensure your microphone is set correctly in Windows Sound Settings.
+</blockquote>
+</details>
+
+---
+
+<div align="center">
+
+### ⚖️ PUBLIC DOMAIN (CC0 1.0)
+
+*No Rights Reserved. No Gods. No Managers.*
+
+Credit to **OpenAI** (Whisper), **Systran** (Faster-Whisper), and **Silero** (VAD).
+
+</div>
```
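The README quotes typing speeds in characters per minute (CPM). The per-keystroke delay for a target CPM is simply 60 / CPM seconds, which the injection loop can sketch like this (`send_key` is a hypothetical stand-in for whatever keystroke backend the app actually uses):

```python
import time

def keystroke_delay(cpm: float) -> float:
    """Seconds to wait between simulated keystrokes for a target CPM."""
    return 60.0 / cpm

def type_text(text: str, cpm: float = 6000, send_key=print) -> float:
    # send_key is an illustrative injection callback; a real implementation
    # would call into the OS keystroke-simulation backend here.
    delay = keystroke_delay(cpm)
    for ch in text:
        send_key(ch)
        time.sleep(delay)
    return delay
```

At the README's top speed of 6000 CPM this works out to one character every 10 ms, which is why some apps (games, RDP) need the rate lowered.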
```diff
@@ -259,48 +259,72 @@ class Bootstrapper:
         process.wait()
 
     def refresh_app_source(self):
-        """Refresh app source files. Skips if already exists to save time."""
-        # Optimization: If app/main.py exists, skip update to improve startup speed.
-        # The user can delete the 'runtime' folder to force an update.
-        if (self.app_path / "main.py").exists():
-            log("App already exists. Skipping update.")
-            return True
-
-        if self.ui: self.ui.set_status("Updating app files...")
+        """
+        Smartly updates app source files by only copying changed files.
+        Preserves user settings and reduces disk I/O.
+        """
+        if self.ui: self.ui.set_status("Checking for updates...")
 
         try:
-            # Preserve settings.json if it exists
-            settings_path = self.app_path / "settings.json"
-            temp_settings = None
-            if settings_path.exists():
-                try:
-                    temp_settings = settings_path.read_bytes()
-                except:
-                    log("Failed to backup settings.json, it involves risk of data loss.")
-
-            if self.app_path.exists():
-                shutil.rmtree(self.app_path, ignore_errors=True)
-
-            shutil.copytree(
-                self.source_path,
-                self.app_path,
-                ignore=shutil.ignore_patterns(
-                    '__pycache__', '*.pyc', '.git', 'venv',
-                    'build', 'dist', '*.egg-info', 'runtime'
-                )
-            )
-
-            # Restore settings.json
-            if temp_settings:
-                try:
-                    settings_path.write_bytes(temp_settings)
-                    log("Restored settings.json")
-                except:
-                    log("Failed to restore settings.json")
-
+            # 1. Ensure destination exists
+            if not self.app_path.exists():
+                self.app_path.mkdir(parents=True, exist_ok=True)
+
+            # 2. Walk source and sync
+            # source_path is the temporary bundled folder
+            # app_path is the persistent runtime folder
+            changes_made = 0
+
+            for src_dir, dirs, files in os.walk(self.source_path):
+                # Determine relative path from source root
+                rel_path = Path(src_dir).relative_to(self.source_path)
+                dst_dir = self.app_path / rel_path
+
+                # Ensure directory exists
+                if not dst_dir.exists():
+                    dst_dir.mkdir(parents=True, exist_ok=True)
+
+                for file in files:
+                    # Skip ignored files
+                    if file in ['__pycache__', '.git', 'settings.json'] or file.endswith('.pyc'):
+                        continue
+
+                    src_file = Path(src_dir) / file
+                    dst_file = dst_dir / file
+
+                    # Check if update needed
+                    should_copy = False
+                    if not dst_file.exists():
+                        should_copy = True
+                    else:
+                        # Compare size first (fast)
+                        if src_file.stat().st_size != dst_file.stat().st_size:
+                            should_copy = True
+                        else:
+                            # Compare content (slower but accurate)
+                            # Only read if size matches to verify diff
+                            if src_file.read_bytes() != dst_file.read_bytes():
+                                should_copy = True
+
+                    if should_copy:
+                        shutil.copy2(src_file, dst_file)
+                        changes_made += 1
+                        if self.ui: self.ui.set_detail(f"Updated: {file}")
+
+            # 3. Cleanup logic (Optional: remove files in dest that are not in source)
+            # For now, we only add/update to prevent deleting generated user files (logs, etc)
+
+            if changes_made > 0:
+                log(f"Update complete. {changes_made} files changed.")
+            else:
+                log("App is up to date.")
+
             return True
         except Exception as e:
             log(f"Error refreshing app source: {e}")
+            # No wipe-and-recopy fallback here: if the sync failed, the likely
+            # cause is permissions, and a full copy would fail the same way.
             return False
 
     def run_app(self):
```
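The sync above decides whether to copy a file by comparing size first (cheap) and byte content second (exact). Isolated from the `Bootstrapper`, the check looks like this (a standard-library sketch, not the project's exact code):

```python
from pathlib import Path

def needs_copy(src: Path, dst: Path) -> bool:
    """True if dst is missing or differs from src."""
    if not dst.exists():
        return True
    # Size check is cheap and catches most changes without reading bytes.
    if src.stat().st_size != dst.stat().st_size:
        return True
    # Same size: fall back to a full byte comparison.
    return src.read_bytes() != dst.read_bytes()
```

The standard library's `filecmp.cmp(src, dst, shallow=False)` implements essentially the same size-then-content strategy and could replace the hand-rolled check.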
### main.py

```diff
@@ -118,13 +118,14 @@ class DownloadWorker(QThread):
 
 class TranscriptionWorker(QThread):
     finished = Signal(str)
-    def __init__(self, transcriber, audio_data, is_file=False, parent=None):
+    def __init__(self, transcriber, audio_data, is_file=False, parent=None, task_override=None):
         super().__init__(parent)
         self.transcriber = transcriber
         self.audio_data = audio_data
         self.is_file = is_file
+        self.task_override = task_override
     def run(self):
-        text = self.transcriber.transcribe(self.audio_data, is_file=self.is_file)
+        text = self.transcriber.transcribe(self.audio_data, is_file=self.is_file, task=self.task_override)
         self.finished.emit(text)
 
 class WhisperApp(QObject):
@@ -166,13 +167,18 @@ class WhisperApp(QObject):
         self.tray.transcribe_file_requested.connect(self.transcribe_file)
 
         # Init Tooltip
-        hotkey = self.config.get("hotkey")
-        self.tray.setToolTip(f"Whisper Voice - Press {hotkey} to Record")
+        from src.utils.formatters import format_hotkey
+        self.format_hotkey = format_hotkey  # Store ref
+
+        hk1 = self.format_hotkey(self.config.get("hotkey"))
+        hk2 = self.format_hotkey(self.config.get("hotkey_translate"))
+        self.tray.setToolTip(f"Whisper Voice\nTranscribe: {hk1}\nTranslate: {hk2}")
 
         # 3. Logic Components Placeholders
         self.audio_engine = None
         self.transcriber = None
-        self.hotkey_manager = None
+        self.hk_transcribe = None
+        self.hk_translate = None
         self.overlay_root = None
 
         # 4. Start Loader
@@ -222,12 +228,23 @@ class WhisperApp(QObject):
         self.settings_root.setVisible(False)
 
         # Install Low-Level Window Hook for Transparent Hit Test
-        # We must keep a reference to 'self.hook' so it isn't GC'd
-        # scale = self.overlay_root.devicePixelRatio()
-        # self.hook = WindowHook(int(self.overlay_root.winId()), 500, 300, scale)
-        # self.hook.install()
-        # NOTE: HitTest hook will be installed here later
+        try:
+            from src.utils.window_hook import WindowHook
+            hwnd = self.overlay_root.winId()
+            # Initial scale from config
+            scale = float(self.config.get("ui_scale"))
+
+            # Current Overlay Dimensions
+            win_w = int(460 * scale)
+            win_h = int(180 * scale)
+
+            self.window_hook = WindowHook(hwnd, win_w, win_h, initial_scale=scale)
+            self.window_hook.install()
+
+            # Initial state: Disabled because we start inactive
+            self.window_hook.set_enabled(False)
+        except Exception as e:
+            logging.error(f"Failed to install WindowHook: {e}")
 
     def center_overlay(self):
         """Calculates and sets the Overlay position above the taskbar."""
@@ -255,9 +272,16 @@ class WhisperApp(QObject):
         self.audio_engine.set_visualizer_callback(self.bridge.update_amplitude)
         self.audio_engine.set_silence_callback(self.on_silence_detected)
         self.transcriber = WhisperTranscriber()
-        self.hotkey_manager = HotkeyManager()
-        self.hotkey_manager.triggered.connect(self.toggle_recording)
-        self.hotkey_manager.start()
+
+        # Dual Hotkey Managers
+        self.hk_transcribe = HotkeyManager(config_key="hotkey")
+        self.hk_transcribe.triggered.connect(lambda: self.toggle_recording(task_override="transcribe"))
+        self.hk_transcribe.start()
+
+        self.hk_translate = HotkeyManager(config_key="hotkey_translate")
+        self.hk_translate.triggered.connect(lambda: self.toggle_recording(task_override="translate"))
+        self.hk_translate.start()
+
         self.bridge.update_status("Ready")
 
     def run(self):
@@ -275,7 +299,8 @@ class WhisperApp(QObject):
         except: pass
         self.bridge.stats_worker.stop()
 
-        if self.hotkey_manager: self.hotkey_manager.stop()
+        if self.hk_transcribe: self.hk_transcribe.stop()
+        if self.hk_translate: self.hk_translate.stop()
 
         # Close all QML windows to ensure bindings stop before Python objects die
         if self.overlay_root:
@@ -350,10 +375,14 @@ class WhisperApp(QObject):
         print(f"Setting Changed: {key} = {value}")
 
         # 1. Hotkey Reload
-        if key == "hotkey":
-            if self.hotkey_manager: self.hotkey_manager.reload_hotkey()
+        if key in ["hotkey", "hotkey_translate"]:
+            if self.hk_transcribe: self.hk_transcribe.reload_hotkey()
+            if self.hk_translate: self.hk_translate.reload_hotkey()
+
             if self.tray:
-                self.tray.setToolTip(f"Whisper Voice - Press {value} to Record")
+                hk1 = self.format_hotkey(self.config.get("hotkey"))
+                hk2 = self.format_hotkey(self.config.get("hotkey_translate"))
+                self.tray.setToolTip(f"Whisper Voice\nTranscribe: {hk1}\nTranslate: {hk2}")
 
         # 2. AI Model Reload (Heavy)
         if key in ["model_size", "compute_device", "compute_type"]:
@@ -456,6 +485,8 @@ class WhisperApp(QObject):
         file_path, _ = QFileDialog.getOpenFileName(None, "Select Audio", "", "Audio (*.mp3 *.wav *.flac *.m4a *.ogg)")
         if file_path:
            self.bridge.update_status("Thinking...")
+            # File transcriptions default to the task configured in settings.
             self.worker = TranscriptionWorker(self.transcriber, file_path, is_file=True, parent=self)
             self.worker.finished.connect(self.on_transcription_done)
             self.worker.start()
@@ -463,10 +494,13 @@ class WhisperApp(QObject):
     @Slot()
     def on_silence_detected(self):
         from PySide6.QtCore import QMetaObject, Qt
+        # Silence stops the current recording: calling toggle_recording with no
+        # argument finishes the CURRENT task (stored when recording started).
         QMetaObject.invokeMethod(self, "toggle_recording", Qt.QueuedConnection)
 
-    @Slot()
-    def toggle_recording(self):
+    @Slot()  # Modified to allow lambda override
+    def toggle_recording(self, task_override=None):
         if not self.audio_engine: return
 
         # Prevent starting a new recording while we are still transcribing the last one
@@ -474,23 +508,36 @@ class WhisperApp(QObject):
             logging.warning("Ignored toggle request: Transcription in progress.")
             return
 
+        # Determine which task we are entering
+        if task_override:
+            intended_task = task_override
+        else:
+            intended_task = self.config.get("task")
+
         if self.audio_engine.recording:
+            # STOP RECORDING
             self.bridge.update_status("Thinking...")
             self.bridge.isRecording = False
             self.bridge.isProcessing = True # Start Processing
             audio_data = self.audio_engine.stop_recording()
-            self.worker = TranscriptionWorker(self.transcriber, audio_data, parent=self)
+
+            # Use the task that started this session, not the config default.
+            final_task = getattr(self, "current_recording_task", self.config.get("task"))
+
+            self.worker = TranscriptionWorker(self.transcriber, audio_data, parent=self, task_override=final_task)
             self.worker.finished.connect(self.on_transcription_done)
             self.worker.start()
         else:
-            self.bridge.update_status("Recording")
+            # START RECORDING
+            self.current_recording_task = intended_task
+            self.bridge.update_status(f"Recording ({intended_task})...")
             self.bridge.isRecording = True
             self.audio_engine.start_recording()
 
     @Slot(bool)
     def on_ui_toggle_request(self, state):
         if state != self.audio_engine.recording:
-            self.toggle_recording()
+            self.toggle_recording()  # Default behavior for UI clicks
 
     @Slot(str)
     def on_transcription_done(self, text: str):
@@ -503,8 +550,8 @@ class WhisperApp(QObject):
 
     @Slot(bool)
     def on_hotkeys_enabled_toggle(self, state):
-        if self.hotkey_manager:
-            self.hotkey_manager.set_enabled(state)
+        if self.hk_transcribe: self.hk_transcribe.set_enabled(state)
+        if self.hk_translate: self.hk_translate.set_enabled(state)
 
     @Slot(str)
     def on_download_requested(self, size):
@@ -531,6 +578,25 @@ class WhisperApp(QObject):
         self.bridge.update_status("Error")
         logging.error(f"Download Error: {err}")
 
+    @Slot(bool)
+    def on_ui_toggle_request(self, is_recording):
+        """Called when recording state changes."""
+        # Update Window Hook to allow clicking if active
+        is_active = is_recording or self.bridge.isProcessing
+        if hasattr(self, 'window_hook'):
+            self.window_hook.set_enabled(is_active)
+
+    @Slot(bool)
+    def on_processing_changed(self, is_processing):
+        is_active = self.bridge.isRecording or is_processing
+        if hasattr(self, 'window_hook'):
+            self.window_hook.set_enabled(is_active)
+
 if __name__ == "__main__":
+    import sys
     app = WhisperApp()
-    app.run()
+
+    # Connect extra signal for processing state
+    app.bridge.isProcessingChanged.connect(app.on_processing_changed)
+
+    sys.exit(app.run())
```
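The core idea in the `toggle_recording` change is: pin the task when a recording session starts, and reuse that pinned task when it stops, so a silence-triggered stop (which passes no argument) still finishes the session with the right task. Stripped of Qt, the pattern reduces to a few lines (the `Recorder` class here is illustrative, not the app's actual API):

```python
class Recorder:
    def __init__(self, default_task="transcribe"):
        self.default_task = default_task
        self.recording = False
        self.current_task = None
        self.finished = []  # task of each completed session

    def toggle(self, task_override=None):
        if self.recording:
            # Stop: use the task that STARTED this session,
            # ignoring whatever the default is now.
            self.finished.append(self.current_task)
            self.recording = False
        else:
            # Start: pin the task for the whole session.
            self.current_task = task_override or self.default_task
            self.recording = True
```

A translate-hotkey press followed by a no-argument stop (the silence path) still finishes as a translate session.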
```diff
@@ -16,6 +16,7 @@ from src.core.paths import get_base_path
 # Default Configuration
 DEFAULT_SETTINGS = {
     "hotkey": "f8",
+    "hotkey_translate": "f10",
     "model_size": "small",
     "input_device": None, # Device ID (int) or Name (str), None = Default
     "save_recordings": False, # Save .wav files for debugging
@@ -38,6 +39,7 @@ DEFAULT_SETTINGS = {
 
     # AI - Advanced
     "language": "auto", # "auto" or ISO code
+    "task": "transcribe", # "transcribe" or "translate" (to English)
     "compute_device": "auto", # "auto", "cuda", "cpu"
     "compute_type": "int8", # "int8", "float16", "float32"
     "beam_size": 5,
```
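New keys like `hotkey_translate` and `task` only reach existing installs if settings loading merges defaults underneath the user's saved file. A common pattern for that (a sketch; the project's actual `ConfigManager` may differ):

```python
import json
from pathlib import Path

DEFAULTS = {"hotkey": "f8", "hotkey_translate": "f10", "task": "transcribe"}

def load_settings(path: Path) -> dict:
    settings = dict(DEFAULTS)          # new keys start at their defaults
    if path.exists():
        settings.update(json.loads(path.read_text()))  # user values win
    return settings
```

With this merge order, an old `settings.json` that only contains `"hotkey"` automatically picks up `"hotkey_translate": "f10"` without any migration step.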
```diff
@@ -30,15 +30,16 @@ class HotkeyManager(QObject):
 
     triggered = Signal()
 
-    def __init__(self, hotkey: str = "f8"):
+    def __init__(self, config_key: str = "hotkey"):
         """
         Initialize the HotkeyManager.
 
         Args:
-            hotkey (str): The global hotkey string description. Default: "f8".
+            config_key (str): The configuration key to look up (e.g. "hotkey").
         """
         super().__init__()
-        self.hotkey = hotkey
+        self.config_key = config_key
+        self.hotkey = "f8"  # Placeholder
         self.is_listening = False
         self._enabled = True
 
@@ -58,9 +59,9 @@ class HotkeyManager(QObject):
 
         from src.core.config import ConfigManager
         config = ConfigManager()
-        self.hotkey = config.get("hotkey")
+        self.hotkey = config.get(self.config_key)
 
-        logging.info(f"Registering global hotkey: {self.hotkey}")
+        logging.info(f"Registering global hotkey ({self.config_key}): {self.hotkey}")
         try:
             # We don't suppress=True here because we want the app to see keys during recording
             # (Wait, actually if we are recording we WANT keyboard to see it,
```
|||||||
`src/core/languages.py` — new file, 120 lines

```diff
@@ -0,0 +1,120 @@
+"""
+Supported Languages Module
+==========================
+Full list of languages supported by OpenAI Whisper.
+Maps ISO codes to display names.
+"""
+
+LANGUAGES = {
+    "auto": "Auto Detect",
+    "af": "Afrikaans",
+    "sq": "Albanian",
+    "am": "Amharic",
+    "ar": "Arabic",
+    "hy": "Armenian",
+    "as": "Assamese",
+    "az": "Azerbaijani",
+    "ba": "Bashkir",
+    "eu": "Basque",
+    "be": "Belarusian",
+    "bn": "Bengali",
+    "bs": "Bosnian",
+    "br": "Breton",
+    "bg": "Bulgarian",
+    "my": "Burmese",
+    "ca": "Catalan",
+    "zh": "Chinese",
+    "hr": "Croatian",
+    "cs": "Czech",
+    "da": "Danish",
+    "nl": "Dutch",
+    "en": "English",
+    "et": "Estonian",
+    "fo": "Faroese",
+    "fi": "Finnish",
+    "fr": "French",
+    "gl": "Galician",
+    "ka": "Georgian",
+    "de": "German",
+    "el": "Greek",
+    "gu": "Gujarati",
+    "ht": "Haitian",
+    "ha": "Hausa",
+    "haw": "Hawaiian",
+    "he": "Hebrew",
+    "hi": "Hindi",
+    "hu": "Hungarian",
+    "is": "Icelandic",
+    "id": "Indonesian",
+    "it": "Italian",
+    "ja": "Japanese",
+    "jw": "Javanese",
+    "kn": "Kannada",
+    "kk": "Kazakh",
+    "km": "Khmer",
+    "ko": "Korean",
+    "lo": "Lao",
+    "la": "Latin",
+    "lv": "Latvian",
+    "ln": "Lingala",
+    "lt": "Lithuanian",
+    "lb": "Luxembourgish",
+    "mk": "Macedonian",
+    "mg": "Malagasy",
+    "ms": "Malay",
+    "ml": "Malayalam",
+    "mt": "Maltese",
+    "mi": "Maori",
+    "mr": "Marathi",
+    "mn": "Mongolian",
+    "ne": "Nepali",
+    "no": "Norwegian",
+    "oc": "Occitan",
+    "pa": "Punjabi",
+    "ps": "Pashto",
+    "fa": "Persian",
+    "pl": "Polish",
+    "pt": "Portuguese",
+    "ro": "Romanian",
+    "ru": "Russian",
+    "sa": "Sanskrit",
+    "sr": "Serbian",
+    "sn": "Shona",
+    "sd": "Sindhi",
+    "si": "Sinhala",
+    "sk": "Slovak",
+    "sl": "Slovenian",
+    "so": "Somali",
+    "es": "Spanish",
+    "su": "Sundanese",
+    "sw": "Swahili",
+    "sv": "Swedish",
+    "tl": "Tagalog",
+    "tg": "Tajik",
+    "ta": "Tamil",
+    "tt": "Tatar",
+    "te": "Telugu",
+    "th": "Thai",
+    "bo": "Tibetan",
+    "tr": "Turkish",
+    "tk": "Turkmen",
+    "uk": "Ukrainian",
+    "ur": "Urdu",
+    "uz": "Uzbek",
+    "vi": "Vietnamese",
+    "cy": "Welsh",
+    "yi": "Yiddish",
+    "yo": "Yoruba",
+}
+
+
+def get_language_names():
+    return list(LANGUAGES.values())
+
+
+def get_code_by_name(name):
+    for code, lang in LANGUAGES.items():
+        if lang == name:
+            return code
+    return "auto"
+
+
+def get_name_by_code(code):
+    return LANGUAGES.get(code, "Auto Detect")
```
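The module above is a simple bidirectional mapping: the settings UI shows display names, while the config stores ISO codes. A minimal sketch (reproducing only a small subset of the `LANGUAGES` table) illustrates the round-trip the UI relies on:

```python
# Sketch of src/core/languages.py with a reduced table, for illustration only.
LANGUAGES = {
    "auto": "Auto Detect",
    "en": "English",
    "fr": "French",
    "ja": "Japanese",
}

def get_language_names():
    return list(LANGUAGES.values())

def get_code_by_name(name):
    # Linear reverse lookup; unknown names fall back to "auto".
    for code, lang in LANGUAGES.items():
        if lang == name:
            return code
    return "auto"

def get_name_by_code(code):
    return LANGUAGES.get(code, "Auto Detect")

# Round-trip: name -> code -> name survives for every known entry.
assert get_code_by_name(get_name_by_code("fr")) == "fr"
assert get_code_by_name("Klingon") == "auto"  # unknown -> safe default
```

At ~100 entries a linear scan is perfectly adequate here; an inverted dict would only matter if lookups were on a hot path.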
```diff
@@ -74,11 +74,11 @@ class WhisperTranscriber:
             logging.error(f"Failed to load model: {e}")
             self.model = None

-    def transcribe(self, audio_data, is_file: bool = False) -> str:
+    def transcribe(self, audio_data, is_file: bool = False, task: Optional[str] = None) -> str:
         """
         Transcribe audio data.
         """
-        logging.info(f"Starting transcription... (is_file={is_file})")
+        logging.info(f"Starting transcription... (is_file={is_file}, task={task})")

         # Ensure model is loaded
         if not self.model:
@@ -91,6 +91,10 @@ class WhisperTranscriber:
         beam_size = int(self.config.get("beam_size"))
         best_of = int(self.config.get("best_of"))
         vad = False if is_file else self.config.get("vad_filter")
+        language = self.config.get("language")
+
+        # Use task override if provided, otherwise config
+        final_task = task if task else self.config.get("task")

         # Transcribe
         segments, info = self.model.transcribe(
@@ -98,6 +102,8 @@ class WhisperTranscriber:
             beam_size=beam_size,
             best_of=best_of,
             vad_filter=vad,
+            task=final_task,
+            language=language if language != "auto" else None,
             vad_parameters=dict(min_silence_duration_ms=500),
             condition_on_previous_text=self.config.get("condition_on_previous_text"),
             without_timestamps=True
```
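The key pattern in the `transcribe()` change is the per-call override: an explicit `task` argument (set by the hotkey that fired) beats the configured default. Isolated as a sketch, with a hypothetical `Config` stub standing in for the project's `ConfigManager`:

```python
# Hypothetical stand-in for ConfigManager, for illustration only.
class Config:
    def __init__(self, settings):
        self._settings = settings

    def get(self, key):
        return self._settings.get(key)

def resolve_task(config, task=None):
    # Per-call override wins; otherwise fall back to the configured default.
    return task if task else config.get("task")

config = Config({"task": "transcribe"})
assert resolve_task(config) == "transcribe"               # config default
assert resolve_task(config, "translate") == "translate"   # hotkey override
```

This keeps a single transcription code path while letting the two global hotkeys select "transcribe" or "translate" without touching stored settings.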
```diff
@@ -245,6 +245,26 @@ class UIBridge(QObject):

     # --- Methods called from QML ---

+    @Slot(result=list)
+    def get_supported_languages(self):
+        from src.core.languages import get_language_names
+        return get_language_names()
+
+    @Slot(str)
+    def set_language_by_name(self, name):
+        from src.core.languages import get_code_by_name
+        from src.core.config import ConfigManager
+        code = get_code_by_name(name)
+        ConfigManager().set("language", code)
+        self.settingChanged.emit("language", code)
+
+    @Slot(result=str)
+    def get_current_language_name(self):
+        from src.core.languages import get_name_by_code
+        from src.core.config import ConfigManager
+        code = ConfigManager().get("language")
+        return get_name_by_code(code)
+
     @Slot(str, result='QVariant')
     def getSetting(self, key):
         from src.core.config import ConfigManager
```
```diff
@@ -100,7 +100,7 @@ ComboBox {
     popup: Popup {
         y: control.height - 1
         width: control.width
-        implicitHeight: contentItem.implicitHeight
+        implicitHeight: Math.min(contentItem.implicitHeight, 300)
         padding: 5

         contentItem: ListView {
```
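The `Math.min` clamp matters now that the combo box model holds the full ~100-entry language list instead of 9 codes; without it the popup's implicit height grows with the row count. The same arithmetic in Python (the 300px cap comes from the diff; the per-row height is an assumed illustrative value):

```python
MAX_POPUP_HEIGHT = 300  # cap applied in the QML change

def popup_height(row_count, row_height=28):
    # row_height is an assumption for illustration; QML derives it
    # from the delegate's implicit height.
    return min(row_count * row_height, MAX_POPUP_HEIGHT)

assert popup_height(9) == 252    # the old 9-entry model fit anyway
assert popup_height(100) == 300  # the full language list is clamped
```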
```diff
@@ -25,7 +25,7 @@ Rectangle {

     Text {
         anchors.centerIn: parent
-        text: control.recording ? "Listening..." : (control.currentSequence || "None")
+        text: control.recording ? "Listening..." : (formatSequence(control.currentSequence) || "None")
         color: control.recording ? SettingsStyle.accent : (control.currentSequence ? "#ffffff" : "#808080")
         font.family: "JetBrains Mono"
         font.pixelSize: 13
@@ -72,6 +72,23 @@ Rectangle {
         if (!activeFocus) control.recording = false
     }

+    function formatSequence(seq) {
+        if (!seq) return ""
+        var parts = seq.split("+")
+        for (var i = 0; i < parts.length; i++) {
+            var p = parts[i]
+            // Standardize modifiers
+            if (p === "ctrl") parts[i] = "Ctrl"
+            else if (p === "alt") parts[i] = "Alt"
+            else if (p === "shift") parts[i] = "Shift"
+            else if (p === "win") parts[i] = "Win"
+            else if (p === "esc") parts[i] = "Esc"
+            // Capitalize F-keys and others (e.g. f8 -> F8, space -> Space)
+            else parts[i] = p.charAt(0).toUpperCase() + p.slice(1)
+        }
+        return parts.join(" + ")
+    }
+
     function getKeyName(key, text) {
         // F-Keys
         if (key >= Qt.Key_F1 && key <= Qt.Key_F35) return "f" + (key - Qt.Key_F1 + 1)
```
```diff
@@ -314,15 +314,25 @@ Window {
     spacing: 0

     ModernSettingsItem {
-        label: "Global Hotkey"
-        description: "Press to record a new shortcut (e.g. Ctrl+Space)"
+        label: "Global Hotkey (Transcribe)"
+        description: "Press to record a new shortcut (e.g. F9)"
         control: ModernKeySequenceRecorder {
-            Layout.preferredWidth: 200
+            implicitWidth: 240
             currentSequence: ui.getSetting("hotkey")
             onSequenceChanged: (seq) => ui.setSetting("hotkey", seq)
         }
     }

+    ModernSettingsItem {
+        label: "Global Hotkey (Translate)"
+        description: "Press to record a new shortcut (e.g. F10)"
+        control: ModernKeySequenceRecorder {
+            implicitWidth: 240
+            currentSequence: ui.getSetting("hotkey_translate")
+            onSequenceChanged: (seq) => ui.setSetting("hotkey_translate", seq)
+        }
+    }
+
     ModernSettingsItem {
         label: "Run on Startup"
         description: "Automatically launch when you log in"
@@ -742,15 +752,17 @@ Window {

     ModernSettingsItem {
         label: "Language"
-        description: "Force language or Auto-detect"
+        description: "Spoken language to transcribe"
         control: ModernComboBox {
-            width: 140
-            model: ["auto", "en", "fr", "de", "es", "it", "ja", "zh", "ru"]
-            currentIndex: model.indexOf(ui.getSetting("language"))
-            onActivated: ui.setSetting("language", currentText)
+            Layout.preferredWidth: 200
+            model: ui.get_supported_languages()
+            currentIndex: model.indexOf(ui.get_current_language_name())
+            onActivated: (index) => ui.set_language_by_name(currentText)
         }
     }

+    // Task selector removed as per user request (Hotkeys handle this now)
+
     ModernSettingsItem {
         label: "Compute Device"
         description: "Hardware acceleration (CUDA requires NVidia GPU)"
```
`src/utils/formatters.py` — new file, 32 lines

```diff
@@ -0,0 +1,32 @@
+"""
+Formatter Utilities
+===================
+Helper functions for text formatting.
+"""
+
+def format_hotkey(sequence: str) -> str:
+    """
+    Formats a hotkey sequence string (e.g. 'ctrl+alt+f9')
+    into a pretty readable string (e.g. 'Ctrl + Alt + F9').
+    """
+    if not sequence:
+        return "None"
+
+    parts = sequence.split('+')
+    formatted_parts = []
+
+    for p in parts:
+        p = p.strip().lower()
+        if p == 'ctrl': formatted_parts.append('Ctrl')
+        elif p == 'alt': formatted_parts.append('Alt')
+        elif p == 'shift': formatted_parts.append('Shift')
+        elif p == 'win': formatted_parts.append('Win')
+        elif p == 'esc': formatted_parts.append('Esc')
+        else:
+            # Capitalize first letter
+            if len(p) > 0:
+                formatted_parts.append(p[0].upper() + p[1:])
+            else:
+                formatted_parts.append(p)
+
+    return " + ".join(formatted_parts)
```
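This Python helper mirrors the QML `formatSequence` function, so both sides render hotkeys identically. A condensed equivalent (not the file's verbatim body: the modifier branches are folded into a lookup table) with example outputs:

```python
def format_hotkey(sequence: str) -> str:
    """Format 'ctrl+alt+f9' as 'Ctrl + Alt + F9'."""
    if not sequence:
        return "None"
    # Modifier spellings match the module's if/elif chain.
    special = {'ctrl': 'Ctrl', 'alt': 'Alt', 'shift': 'Shift',
               'win': 'Win', 'esc': 'Esc'}
    formatted = []
    for p in (part.strip().lower() for part in sequence.split('+')):
        # p[:1] keeps empty parts from raising IndexError.
        formatted.append(special.get(p, p[:1].upper() + p[1:]))
    return " + ".join(formatted)

print(format_hotkey("ctrl+alt+f9"))  # Ctrl + Alt + F9
print(format_hotkey("f8"))           # F8
print(format_hotkey(""))             # None
```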
```diff
@@ -65,6 +65,10 @@ class WindowHook:
         # (Window 420x140, Pill 380x100)
         self.logical_rect = [20, 20, 20+380, 20+100]
         self.current_scale = initial_scale
+        self.enabled = True  # New flag
+
+    def set_enabled(self, enabled):
+        self.enabled = enabled

     def install(self):
         proc_address = ctypes.cast(self.new_wnd_proc, ctypes.c_void_p)
@@ -73,6 +77,10 @@ class WindowHook:
     def wnd_proc_callback(self, hwnd, msg, wParam, lParam):
         try:
             if msg == WM_NCHITTEST:
+                # If disabled (invisible/inactive), let clicks pass through (HTTRANSPARENT)
+                if not self.enabled:
+                    return HTTRANSPARENT
+
                 res = self.on_nchittest(lParam)
                 if res != 0:
                     return res
```
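The `enabled` gate on `WM_NCHITTEST` is the whole click-through trick: answering `HTTRANSPARENT` makes Windows route mouse events to whatever window sits underneath, so the hidden pill never swallows clicks. A platform-free sketch of just that decision (the Win32 constant values are standard; `hit_test_pill` is a hypothetical stand-in for the real `on_nchittest`, and falling back to `HTCLIENT` simplifies the real code's deferral to the original window proc):

```python
WM_NCHITTEST = 0x0084   # Win32: "what part of the window is under the cursor?"
HTTRANSPARENT = -1      # Win32: pass the hit through to the window below
HTCLIENT = 1            # Win32: ordinary client-area hit

def handle_nchittest(enabled, hit_test_pill):
    # Mirrors wnd_proc_callback: a disabled overlay is click-through.
    if not enabled:
        return HTTRANSPARENT
    res = hit_test_pill()
    # 0 means "no opinion"; simplified here to a plain client hit.
    return res if res != 0 else HTCLIENT

assert handle_nchittest(False, lambda: HTCLIENT) == HTTRANSPARENT
assert handle_nchittest(True, lambda: HTCLIENT) == HTCLIENT
```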