# Compare commits

9 commits

| SHA1 |
| --- |
| 4b84a27a67 |
| f184eb0037 |
| 306bd075ed |
| a1cc9c61b9 |
| e627e1b8aa |
| eaa572b42f |
| e900201214 |
| 0d426aea4b |
| b15ce8076f |
### README.md

```diff
@@ -1,71 +1,155 @@
-# Whisper Voice
-
-**Reclaim Your Voice from the Cloud.**
-
-Whisper Voice is a high-performance, strictly local speech-to-text tool designed for the desktop. It provides instant, high-accuracy dictation anywhere on your system—no internet connection required, no corporate servers, and absolutely no data harvesting.
-
-We believe that the tools of production—and communication—should belong to the individual, not rented from centralized tech giants.
-
----
-
-## ✊ Core Principles
-
-### 1. Total Autonomy (Local-First)
-Your voice data is yours alone. Unlike commercial alternatives that siphon your words to remote data centers for processing and profiling, Whisper Voice runs entirely on your hardware. **No masters, no servers.** You retain full sovereignty over your digital footprint.
-
-### 2. Decentralized Power
-By leveraging optimized local processing, we strip away the need for reliance on massive, energy-hungry corporate infrastructure. This is technology scaled to the human level—powerful, efficient, and completely under your control.
-
-### 3. Accessible to All
-High-quality speech recognition shouldn't be gated behind subscriptions or paywalls. This tool is free, open, and built to empower users to interact with their machines on their own terms.
-
----
-
-## ✨ Features
-
-* **100% Offline Processing**: Once the recognition engine is downloaded, the cable can be cut. Nothing leaves your machine.
-* **Universal Compatibility**: Works in any text field—editors, chat apps, terminals, or browsers. If you can type there, you can speak there.
-* **Adaptive Input**:
-    * *Clipboard Mode*: Standard paste injection.
-    * *High-Speed Simulation*: Simulates keystrokes at supersonic speeds (up to 6000 CPM) for apps that block pasting.
-* **System Integration**: Minimalist overlay and system tray presence. It exists when you need it and vanishes when you don't.
-* **Resource Efficiency**: Optimized to run smoothly on consumer hardware without monopolizing your system resources.
-
----
-
-## 🚀 Getting Started
-
-### Installation
-1. Download the latest release.
-2. Run `WhisperVoice.exe`.
-3. On the first run, the bootstrapper will autonomously provision the necessary runtime environment. This ensures your system remains clean and dependencies are self-contained.
-
-### Usage
-1. **Set Your Trigger**: Configure a global hotkey (default: `F9`) in the settings.
-2. **Speak Freely**: Hold the hotkey (or toggle it) and speak.
-3. **Direct Action**: Your words are instantly transcribed and injected into your active window.
-
----
-
-## ⚙️ Configuration
-
-The **Settings** panel puts the means of configuration in your hands:
-
-* **Recognition Engine**: Choose the size of the model that fits your hardware capabilities (Tiny to Large). Larger models offer greater precision but require more computing power.
-* **Input Method**: Switch between "Clipboard Paste" and "Simulate Typing" depending on target application restrictions.
-* **Typing Speed**: Adjust the keystroke injection rate. Crank it up to 6000 CPM for instant text delivery.
-* **Run on Startup**: Configure the agent to be ready the moment your session begins.
-
----
-
-## 🤝 Mutual Aid
-
-This project thrives on community collaboration. If you have improvements, fixes, or ideas, you are encouraged to contribute. We build better systems when we build them together, horizontally and transparently.
-
-* **Report Issues**: If something breaks, let us know.
-* **Contribute Code**: The source is open. Fork it, improve it, share it.
-
----
-
-*Built with local processing libraries and Qt.*
-*No gods, no cloud managers.*
+<div align="center">
+
+# 🎙️ W H I S P E R V O I C E
+### SOVEREIGN SPEECH RECOGNITION
+
+<br>
+
+
+[![Download](https://img.shields.io/badge/DOWNLOAD-LATEST_RELEASE-F72585?style=for-the-badge&logo=windows&logoColor=white)](https://git.lashman.live/lashman/whisper_voice/releases/latest)
+[![License: CC0](https://img.shields.io/badge/LICENSE-PUBLIC_DOMAIN-brightgreen?style=for-the-badge)](https://creativecommons.org/publicdomain/zero/1.0/)
+
+<br>
+
+> *"The master's tools will never dismantle the master's house."* — Audre Lorde
+> <br>
+> **Build your own tools. Run them locally.**
+
+[Report Issue](https://git.lashman.live/lashman/whisper_voice/issues) • [View Source](https://git.lashman.live/lashman/whisper_voice) • [Releases](https://git.lashman.live/lashman/whisper_voice/releases)
+
+</div>
+
+<br>
+
+## ✊ The Manifesto
+
+**We hold these truths to be self-evident:** That user data is an extension of the self, and its exploitation by centralized clouds is a violation of digital autonomy.
+
+**Whisper Voice** is built on the principle of **technological sovereignty**. It provides state-of-the-art speech recognition without renting your cognitive output to corporate oligarchies. By running entirely on your own hardware, it reclaims the means of digital production, ensuring that your words remain exclusively yours.
+
+---
+
+## ⚡ Technical Architecture
+
+This operates on the metal. It is not a wrapper. It is an engine.
+
+| Component | Technology | Benefit |
+| :--- | :--- | :--- |
+| **Inference Core** | **Faster-Whisper** | Hyper-optimized implementation of OpenAI's Whisper using **CTranslate2**. Delivers **4x speedups** over PyTorch. |
+| **Quantization** | **INT8** | 8-bit quantization enables Pro-grade models (`Large-v3`) to run on consumer GPUs with minimal VRAM. |
+| **Sensory Gate** | **Silero VAD** | Enterprise-grade Voice Activity Detection filters out silence and background noise, conserving compute. |
+| **Interface** | **Qt 6 / QML** | Hardware-accelerated, glassmorphic UI that feels native yet remains OS-independent. |
+
+---
+
+## 📊 Intelligence Matrix
+
+Select the model that aligns with your hardware capabilities.
+
+| Model | VRAM (GPU) | RAM (CPU) | Velocity | Designation |
+| :--- | :--- | :--- | :--- | :--- |
+| `Tiny` | **~500 MB** | ~1 GB | ⚡ **Supersonic** | Command & Control, older hardware. |
+| `Base` | **~600 MB** | ~1 GB | 🚀 **Very Fast** | Daily driver for low-power laptops. |
+| `Small` | **~1 GB** | ~2 GB | ⏩ **Fast** | High accuracy English dictation. |
+| `Medium` | **~2 GB** | ~4 GB | ⚖️ **Balanced** | Complex vocabulary, foreign accents. |
+| `Large-v3 Turbo` | **~4 GB** | ~6 GB | ✨ **Optimal** | **Sweet Spot.** Near-Large smarts, Medium speed. |
+| `Large-v3` | **~5 GB** | ~8 GB | 🧠 **Maximum** | Professional transcription. Uncompromised. |
+
+> *Note: Acceleration requires you to manually select your Compute Device (CUDA GPU or CPU) in Settings.*
+
+---
+
+## 🛠️ Operations
+
+### 📥 Deployment
+1. **Download**: Grab `WhisperVoice.exe` from [Releases](https://git.lashman.live/lashman/whisper_voice/releases).
+2. **Deploy**: Place it anywhere. It is portable.
+3. **Bootstrap**: Run it. The agent will self-provision an isolated Python environment (~2GB) on first launch.
+
+### 🕹️ Controls
+* **Global Hook**: `F9` (Default). Press to open the channel. Release to inject text.
+* **Tray Agent**: Retracts to the system tray. Right-click for **Settings** or **File Transcription**.
+
+### 📡 Input Modes
+| Mode | Description | Speed |
+| :--- | :--- | :--- |
+| **Clipboard Paste** | Standard text injection via OS clipboard. | Instant |
+| **Simulate Typing** | Mimics physical keystrokes. Bypasses anti-paste blocks. | Up to **6000** CPM |
+
+---
+
+## 🌐 Universal Translation
+
+The model listens in **99 languages** and translates them to English or transcribes them natively.
+
+<details>
+<summary><b>Click to view supported languages</b></summary>
+<br>
+
+| | | | |
+| :--- | :--- | :--- | :--- |
+| Afrikaans 🇿🇦 | Albanian 🇦🇱 | Amharic 🇪🇹 | Arabic 🇸🇦 |
+| Armenian 🇦🇲 | Assamese 🇮🇳 | Azerbaijani 🇦🇿 | Bashkir 🇷🇺 |
+| Basque 🇪🇸 | Belarusian 🇧🇾 | Bengali 🇧🇩 | Bosnian 🇧🇦 |
+| Breton 🇫🇷 | Bulgarian 🇧🇬 | Burmese 🇲🇲 | Castilian 🇪🇸 |
+| Catalan 🇪🇸 | Chinese 🇨🇳 | Croatian 🇭🇷 | Czech 🇨🇿 |
+| Danish 🇩🇰 | Dutch 🇳🇱 | English 🇺🇸 | Estonian 🇪🇪 |
+| Faroese 🇫🇴 | Finnish 🇫🇮 | Flemish 🇧🇪 | French 🇫🇷 |
+| Galician 🇪🇸 | Georgian 🇬🇪 | German 🇩🇪 | Greek 🇬🇷 |
+| Gujarati 🇮🇳 | Haitian 🇭🇹 | Hausa 🇳🇬 | Hawaiian 🇺🇸 |
+| Hebrew 🇮🇱 | Hindi 🇮🇳 | Hungarian 🇭🇺 | Icelandic 🇮🇸 |
+| Indonesian 🇮🇩 | Italian 🇮🇹 | Japanese 🇯🇵 | Javanese 🇮🇩 |
+| Kannada 🇮🇳 | Kazakh 🇰🇿 | Khmer 🇰🇭 | Korean 🇰🇷 |
+| Lao 🇱🇦 | Latin 🇻🇦 | Latvian 🇱🇻 | Lingala 🇨🇩 |
+| Lithuanian 🇱🇹 | Luxembourgish 🇱🇺 | Macedonian 🇲🇰 | Malagasy 🇲🇬 |
+| Malay 🇲🇾 | Malayalam 🇮🇳 | Maltese 🇲🇹 | Maori 🇳🇿 |
+| Marathi 🇮🇳 | Moldavian 🇲🇩 | Mongolian 🇲🇳 | Myanmar 🇲🇲 |
+| Nepali 🇳🇵 | Norwegian 🇳🇴 | Occitan 🇫🇷 | Panjabi 🇮🇳 |
+| Pashto 🇦🇫 | Persian 🇮🇷 | Polish 🇵🇱 | Portuguese 🇵🇹 |
+| Punjabi 🇮🇳 | Romanian 🇷🇴 | Russian 🇷🇺 | Sanskrit 🇮🇳 |
+| Serbian 🇷🇸 | Shona 🇿🇼 | Sindhi 🇵🇰 | Sinhala 🇱🇰 |
+| Slovak 🇸🇰 | Slovenian 🇸🇮 | Somali 🇸🇴 | Spanish 🇪🇸 |
+| Sundanese 🇮🇩 | Swahili 🇰🇪 | Swedish 🇸🇪 | Tagalog 🇵🇭 |
+| Tajik 🇹🇯 | Tamil 🇮🇳 | Tatar 🇷🇺 | Telugu 🇮🇳 |
+| Thai 🇹🇭 | Tibetan 🇨🇳 | Turkish 🇹🇷 | Turkmen 🇹🇲 |
+| Ukrainian 🇺🇦 | Urdu 🇵🇰 | Uzbek 🇺🇿 | Vietnamese 🇻🇳 |
+| Welsh 🏴 | Yiddish 🇮🇱 | Yoruba 🇳🇬 | |
+
+</details>
+
+---
+
+## 🔧 Troubleshooting
+
+<details>
+<summary><b>🔥 App crashes on start</b></summary>
+<blockquote>
+The underlying engine requires standard C++ libraries. Install the <b>Microsoft Visual C++ Redistributable (2015-2022)</b>.
+</blockquote>
+</details>
+
+<details>
+<summary><b>🐌 "Simulate Typing" is slow</b></summary>
+<blockquote>
+Some apps (games, RDP) can't handle supersonic input. Go to <b>Settings</b> and lower the <b>Typing Speed</b> to ~1200 CPM.
+</blockquote>
+</details>
+
+<details>
+<summary><b>🎤 No Audio / Silence</b></summary>
+<blockquote>
+The agent listens to the <b>Default Communication Device</b>. Ensure your microphone is set correctly in Windows Sound Settings.
+</blockquote>
+</details>
+
+---
+
+<div align="center">
+
+### ⚖️ PUBLIC DOMAIN (CC0 1.0)
+
+*No Rights Reserved. No Gods. No Managers.*
+
+Credit to **OpenAI** (Whisper), **Systran** (Faster-Whisper), and **Silero** (VAD).
+
+</div>
```
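The README quotes typing speeds in characters per minute (CPM). The per-keystroke delay for a target CPM is simply 60 / CPM seconds, which the injection loop can sketch like this (`send_key` is a hypothetical stand-in for whatever keystroke backend the app actually uses):

```python
import time

def keystroke_delay(cpm: float) -> float:
    """Seconds to wait between simulated keystrokes for a target CPM."""
    return 60.0 / cpm

def type_text(text: str, cpm: float = 6000, send_key=print) -> float:
    # send_key is an illustrative injection callback; a real implementation
    # would call into the OS keystroke-simulation backend here.
    delay = keystroke_delay(cpm)
    for ch in text:
        send_key(ch)
        time.sleep(delay)
    return delay
```

At the README's top speed of 6000 CPM this works out to one character every 10 ms, which is why some apps (games, RDP) need the rate lowered.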
```diff
@@ -259,48 +259,72 @@ class Bootstrapper:
         process.wait()
 
     def refresh_app_source(self):
-        """Refresh app source files. Skips if already exists to save time."""
-        # Optimization: If app/main.py exists, skip update to improve startup speed.
-        # The user can delete the 'runtime' folder to force an update.
-        if (self.app_path / "main.py").exists():
-            log("App already exists. Skipping update.")
-            return True
-
-        if self.ui: self.ui.set_status("Updating app files...")
+        """
+        Smartly updates app source files by only copying changed files.
+        Preserves user settings and reduces disk I/O.
+        """
+        if self.ui: self.ui.set_status("Checking for updates...")
 
         try:
-            # Preserve settings.json if it exists
-            settings_path = self.app_path / "settings.json"
-            temp_settings = None
-            if settings_path.exists():
-                try:
-                    temp_settings = settings_path.read_bytes()
-                except:
-                    log("Failed to backup settings.json, it involves risk of data loss.")
-
-            if self.app_path.exists():
-                shutil.rmtree(self.app_path, ignore_errors=True)
-
-            shutil.copytree(
-                self.source_path,
-                self.app_path,
-                ignore=shutil.ignore_patterns(
-                    '__pycache__', '*.pyc', '.git', 'venv',
-                    'build', 'dist', '*.egg-info', 'runtime'
-                )
-            )
-
-            # Restore settings.json
-            if temp_settings:
-                try:
-                    settings_path.write_bytes(temp_settings)
-                    log("Restored settings.json")
-                except:
-                    log("Failed to restore settings.json")
-
+            # 1. Ensure destination exists
+            if not self.app_path.exists():
+                self.app_path.mkdir(parents=True, exist_ok=True)
+
+            # 2. Walk source and sync
+            # source_path is the temporary bundled folder
+            # app_path is the persistent runtime folder
+            changes_made = 0
+
+            for src_dir, dirs, files in os.walk(self.source_path):
+                # Determine relative path from source root
+                rel_path = Path(src_dir).relative_to(self.source_path)
+                dst_dir = self.app_path / rel_path
+
+                # Ensure directory exists
+                if not dst_dir.exists():
+                    dst_dir.mkdir(parents=True, exist_ok=True)
+
+                for file in files:
+                    # Skip ignored files
+                    if file in ['__pycache__', '.git', 'settings.json'] or file.endswith('.pyc'):
+                        continue
+
+                    src_file = Path(src_dir) / file
+                    dst_file = dst_dir / file
+
+                    # Check if update needed
+                    should_copy = False
+                    if not dst_file.exists():
+                        should_copy = True
+                    else:
+                        # Compare size first (fast)
+                        if src_file.stat().st_size != dst_file.stat().st_size:
+                            should_copy = True
+                        else:
+                            # Compare content (slower but accurate)
+                            # Only read if size matches to verify diff
+                            if src_file.read_bytes() != dst_file.read_bytes():
+                                should_copy = True
+
+                    if should_copy:
+                        shutil.copy2(src_file, dst_file)
+                        changes_made += 1
+                        if self.ui: self.ui.set_detail(f"Updated: {file}")
+
+            # 3. Cleanup logic (Optional: remove files in dest that are not in source)
+            # For now, we only add/update to prevent deleting generated user files (logs, etc)
+
+            if changes_made > 0:
+                log(f"Update complete. {changes_made} files changed.")
+            else:
+                log("App is up to date.")
+
             return True
         except Exception as e:
             log(f"Error refreshing app source: {e}")
+            # No wipe-and-recopy fallback here: if the sync failed, the likely
+            # cause is permissions, and a full copy would fail the same way.
             return False
 
     def run_app(self):
```
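The sync above decides whether to copy a file by comparing size first (cheap) and byte content second (exact). Isolated from the `Bootstrapper`, the check looks like this (a standard-library sketch, not the project's exact code):

```python
from pathlib import Path

def needs_copy(src: Path, dst: Path) -> bool:
    """True if dst is missing or differs from src."""
    if not dst.exists():
        return True
    # Size check is cheap and catches most changes without reading bytes.
    if src.stat().st_size != dst.stat().st_size:
        return True
    # Same size: fall back to a full byte comparison.
    return src.read_bytes() != dst.read_bytes()
```

The standard library's `filecmp.cmp(src, dst, shallow=False)` implements essentially the same size-then-content strategy and could replace the hand-rolled check.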
### main.py

```diff
@@ -118,13 +118,14 @@ class DownloadWorker(QThread):
 
 class TranscriptionWorker(QThread):
     finished = Signal(str)
-    def __init__(self, transcriber, audio_data, is_file=False, parent=None):
+    def __init__(self, transcriber, audio_data, is_file=False, parent=None, task_override=None):
         super().__init__(parent)
         self.transcriber = transcriber
         self.audio_data = audio_data
         self.is_file = is_file
+        self.task_override = task_override
     def run(self):
-        text = self.transcriber.transcribe(self.audio_data, is_file=self.is_file)
+        text = self.transcriber.transcribe(self.audio_data, is_file=self.is_file, task=self.task_override)
         self.finished.emit(text)
 
 class WhisperApp(QObject):
@@ -166,13 +167,18 @@ class WhisperApp(QObject):
         self.tray.transcribe_file_requested.connect(self.transcribe_file)
 
         # Init Tooltip
-        hotkey = self.config.get("hotkey")
-        self.tray.setToolTip(f"Whisper Voice - Press {hotkey} to Record")
+        from src.utils.formatters import format_hotkey
+        self.format_hotkey = format_hotkey  # Store ref
+
+        hk1 = self.format_hotkey(self.config.get("hotkey"))
+        hk2 = self.format_hotkey(self.config.get("hotkey_translate"))
+        self.tray.setToolTip(f"Whisper Voice\nTranscribe: {hk1}\nTranslate: {hk2}")
 
         # 3. Logic Components Placeholders
         self.audio_engine = None
         self.transcriber = None
-        self.hotkey_manager = None
+        self.hk_transcribe = None
+        self.hk_translate = None
         self.overlay_root = None
 
         # 4. Start Loader
@@ -222,12 +228,23 @@ class WhisperApp(QObject):
         self.settings_root.setVisible(False)
 
         # Install Low-Level Window Hook for Transparent Hit Test
-        # We must keep a reference to 'self.hook' so it isn't GC'd
-        # scale = self.overlay_root.devicePixelRatio()
-        # self.hook = WindowHook(int(self.overlay_root.winId()), 500, 300, scale)
-        # self.hook.install()
-        # NOTE: HitTest hook will be installed here later
+        try:
+            from src.utils.window_hook import WindowHook
+            hwnd = self.overlay_root.winId()
+            # Initial scale from config
+            scale = float(self.config.get("ui_scale"))
+
+            # Current Overlay Dimensions
+            win_w = int(460 * scale)
+            win_h = int(180 * scale)
+
+            self.window_hook = WindowHook(hwnd, win_w, win_h, initial_scale=scale)
+            self.window_hook.install()
+
+            # Initial state: Disabled because we start inactive
+            self.window_hook.set_enabled(False)
+        except Exception as e:
+            logging.error(f"Failed to install WindowHook: {e}")
 
     def center_overlay(self):
         """Calculates and sets the Overlay position above the taskbar."""
@@ -255,9 +272,16 @@ class WhisperApp(QObject):
         self.audio_engine.set_visualizer_callback(self.bridge.update_amplitude)
         self.audio_engine.set_silence_callback(self.on_silence_detected)
         self.transcriber = WhisperTranscriber()
-        self.hotkey_manager = HotkeyManager()
-        self.hotkey_manager.triggered.connect(self.toggle_recording)
-        self.hotkey_manager.start()
+
+        # Dual Hotkey Managers
+        self.hk_transcribe = HotkeyManager(config_key="hotkey")
+        self.hk_transcribe.triggered.connect(lambda: self.toggle_recording(task_override="transcribe"))
+        self.hk_transcribe.start()
+
+        self.hk_translate = HotkeyManager(config_key="hotkey_translate")
+        self.hk_translate.triggered.connect(lambda: self.toggle_recording(task_override="translate"))
+        self.hk_translate.start()
+
         self.bridge.update_status("Ready")
 
     def run(self):
@@ -275,7 +299,8 @@ class WhisperApp(QObject):
         except: pass
         self.bridge.stats_worker.stop()
 
-        if self.hotkey_manager: self.hotkey_manager.stop()
+        if self.hk_transcribe: self.hk_transcribe.stop()
+        if self.hk_translate: self.hk_translate.stop()
 
         # Close all QML windows to ensure bindings stop before Python objects die
         if self.overlay_root:
@@ -350,10 +375,14 @@ class WhisperApp(QObject):
         print(f"Setting Changed: {key} = {value}")
 
         # 1. Hotkey Reload
-        if key == "hotkey":
-            if self.hotkey_manager: self.hotkey_manager.reload_hotkey()
+        if key in ["hotkey", "hotkey_translate"]:
+            if self.hk_transcribe: self.hk_transcribe.reload_hotkey()
+            if self.hk_translate: self.hk_translate.reload_hotkey()
+
             if self.tray:
-                self.tray.setToolTip(f"Whisper Voice - Press {value} to Record")
+                hk1 = self.format_hotkey(self.config.get("hotkey"))
+                hk2 = self.format_hotkey(self.config.get("hotkey_translate"))
+                self.tray.setToolTip(f"Whisper Voice\nTranscribe: {hk1}\nTranslate: {hk2}")
 
         # 2. AI Model Reload (Heavy)
         if key in ["model_size", "compute_device", "compute_type"]:
@@ -456,6 +485,8 @@ class WhisperApp(QObject):
         file_path, _ = QFileDialog.getOpenFileName(None, "Select Audio", "", "Audio (*.mp3 *.wav *.flac *.m4a *.ogg)")
         if file_path:
            self.bridge.update_status("Thinking...")
+            # File transcriptions default to the task configured in settings.
             self.worker = TranscriptionWorker(self.transcriber, file_path, is_file=True, parent=self)
             self.worker.finished.connect(self.on_transcription_done)
             self.worker.start()
@@ -463,10 +494,13 @@ class WhisperApp(QObject):
     @Slot()
     def on_silence_detected(self):
         from PySide6.QtCore import QMetaObject, Qt
+        # Silence stops the current recording: calling toggle_recording with no
+        # argument finishes the CURRENT task (stored when recording started).
         QMetaObject.invokeMethod(self, "toggle_recording", Qt.QueuedConnection)
 
-    @Slot()
-    def toggle_recording(self):
+    @Slot()  # Modified to allow lambda override
+    def toggle_recording(self, task_override=None):
         if not self.audio_engine: return
 
         # Prevent starting a new recording while we are still transcribing the last one
@@ -474,23 +508,36 @@ class WhisperApp(QObject):
             logging.warning("Ignored toggle request: Transcription in progress.")
             return
 
+        # Determine which task we are entering
+        if task_override:
+            intended_task = task_override
+        else:
+            intended_task = self.config.get("task")
+
         if self.audio_engine.recording:
+            # STOP RECORDING
             self.bridge.update_status("Thinking...")
             self.bridge.isRecording = False
             self.bridge.isProcessing = True # Start Processing
             audio_data = self.audio_engine.stop_recording()
-            self.worker = TranscriptionWorker(self.transcriber, audio_data, parent=self)
+
+            # Use the task that started this session, not the config default.
+            final_task = getattr(self, "current_recording_task", self.config.get("task"))
+
+            self.worker = TranscriptionWorker(self.transcriber, audio_data, parent=self, task_override=final_task)
             self.worker.finished.connect(self.on_transcription_done)
             self.worker.start()
         else:
-            self.bridge.update_status("Recording")
+            # START RECORDING
+            self.current_recording_task = intended_task
+            self.bridge.update_status(f"Recording ({intended_task})...")
             self.bridge.isRecording = True
             self.audio_engine.start_recording()
 
     @Slot(bool)
     def on_ui_toggle_request(self, state):
         if state != self.audio_engine.recording:
-            self.toggle_recording()
+            self.toggle_recording()  # Default behavior for UI clicks
 
     @Slot(str)
     def on_transcription_done(self, text: str):
@@ -503,8 +550,8 @@ class WhisperApp(QObject):
 
     @Slot(bool)
     def on_hotkeys_enabled_toggle(self, state):
-        if self.hotkey_manager:
-            self.hotkey_manager.set_enabled(state)
+        if self.hk_transcribe: self.hk_transcribe.set_enabled(state)
+        if self.hk_translate: self.hk_translate.set_enabled(state)
 
     @Slot(str)
     def on_download_requested(self, size):
@@ -531,6 +578,25 @@ class WhisperApp(QObject):
         self.bridge.update_status("Error")
         logging.error(f"Download Error: {err}")
 
+    @Slot(bool)
+    def on_ui_toggle_request(self, is_recording):
+        """Called when recording state changes."""
+        # Update Window Hook to allow clicking if active
+        is_active = is_recording or self.bridge.isProcessing
+        if hasattr(self, 'window_hook'):
+            self.window_hook.set_enabled(is_active)
+
+    @Slot(bool)
+    def on_processing_changed(self, is_processing):
+        is_active = self.bridge.isRecording or is_processing
+        if hasattr(self, 'window_hook'):
+            self.window_hook.set_enabled(is_active)
+
 if __name__ == "__main__":
+    import sys
     app = WhisperApp()
-    app.run()
+
+    # Connect extra signal for processing state
+    app.bridge.isProcessingChanged.connect(app.on_processing_changed)
+
+    sys.exit(app.run())
```
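The core idea in the `toggle_recording` change is: pin the task when a recording session starts, and reuse that pinned task when it stops, so a silence-triggered stop (which passes no argument) still finishes the session with the right task. Stripped of Qt, the pattern reduces to a few lines (the `Recorder` class here is illustrative, not the app's actual API):

```python
class Recorder:
    def __init__(self, default_task="transcribe"):
        self.default_task = default_task
        self.recording = False
        self.current_task = None
        self.finished = []  # task of each completed session

    def toggle(self, task_override=None):
        if self.recording:
            # Stop: use the task that STARTED this session,
            # ignoring whatever the default is now.
            self.finished.append(self.current_task)
            self.recording = False
        else:
            # Start: pin the task for the whole session.
            self.current_task = task_override or self.default_task
            self.recording = True
```

A translate-hotkey press followed by a no-argument stop (the silence path) still finishes as a translate session.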
```diff
@@ -16,6 +16,7 @@ from src.core.paths import get_base_path
 # Default Configuration
 DEFAULT_SETTINGS = {
     "hotkey": "f8",
+    "hotkey_translate": "f10",
     "model_size": "small",
     "input_device": None, # Device ID (int) or Name (str), None = Default
     "save_recordings": False, # Save .wav files for debugging
@@ -38,6 +39,7 @@ DEFAULT_SETTINGS = {
 
     # AI - Advanced
     "language": "auto", # "auto" or ISO code
+    "task": "transcribe", # "transcribe" or "translate" (to English)
     "compute_device": "auto", # "auto", "cuda", "cpu"
     "compute_type": "int8", # "int8", "float16", "float32"
     "beam_size": 5,
```
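New keys like `hotkey_translate` and `task` only reach existing installs if settings loading merges defaults underneath the user's saved file. A common pattern for that (a sketch; the project's actual `ConfigManager` may differ):

```python
import json
from pathlib import Path

DEFAULTS = {"hotkey": "f8", "hotkey_translate": "f10", "task": "transcribe"}

def load_settings(path: Path) -> dict:
    settings = dict(DEFAULTS)          # new keys start at their defaults
    if path.exists():
        settings.update(json.loads(path.read_text()))  # user values win
    return settings
```

With this merge order, an old `settings.json` that only contains `"hotkey"` automatically picks up `"hotkey_translate": "f10"` without any migration step.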
```diff
@@ -30,15 +30,16 @@ class HotkeyManager(QObject):
 
     triggered = Signal()
 
-    def __init__(self, hotkey: str = "f8"):
+    def __init__(self, config_key: str = "hotkey"):
         """
         Initialize the HotkeyManager.
 
         Args:
-            hotkey (str): The global hotkey string description. Default: "f8".
+            config_key (str): The configuration key to look up (e.g. "hotkey").
         """
         super().__init__()
-        self.hotkey = hotkey
+        self.config_key = config_key
+        self.hotkey = "f8"  # Placeholder
         self.is_listening = False
         self._enabled = True
 
@@ -58,9 +59,9 @@ class HotkeyManager(QObject):
 
         from src.core.config import ConfigManager
         config = ConfigManager()
-        self.hotkey = config.get("hotkey")
+        self.hotkey = config.get(self.config_key)
 
-        logging.info(f"Registering global hotkey: {self.hotkey}")
+        logging.info(f"Registering global hotkey ({self.config_key}): {self.hotkey}")
         try:
             # We don't suppress=True here because we want the app to see keys during recording
             # (Wait, actually if we are recording we WANT keyboard to see it,
```
|||||||
`src/core/languages.py` — new file, 120 lines

```diff
@@ -0,0 +1,120 @@
+"""
+Supported Languages Module
+==========================
+Full list of languages supported by OpenAI Whisper.
+Maps ISO codes to display names.
+"""
+
+LANGUAGES = {
+    "auto": "Auto Detect",
+    "af": "Afrikaans",
+    "sq": "Albanian",
+    "am": "Amharic",
+    "ar": "Arabic",
+    "hy": "Armenian",
+    "as": "Assamese",
+    "az": "Azerbaijani",
+    "ba": "Bashkir",
+    "eu": "Basque",
+    "be": "Belarusian",
+    "bn": "Bengali",
+    "bs": "Bosnian",
+    "br": "Breton",
+    "bg": "Bulgarian",
+    "my": "Burmese",
+    "ca": "Catalan",
+    "zh": "Chinese",
+    "hr": "Croatian",
+    "cs": "Czech",
+    "da": "Danish",
+    "nl": "Dutch",
+    "en": "English",
+    "et": "Estonian",
+    "fo": "Faroese",
+    "fi": "Finnish",
+    "fr": "French",
+    "gl": "Galician",
+    "ka": "Georgian",
+    "de": "German",
+    "el": "Greek",
+    "gu": "Gujarati",
+    "ht": "Haitian",
+    "ha": "Hausa",
+    "haw": "Hawaiian",
+    "he": "Hebrew",
+    "hi": "Hindi",
+    "hu": "Hungarian",
+    "is": "Icelandic",
+    "id": "Indonesian",
+    "it": "Italian",
+    "ja": "Japanese",
+    "jw": "Javanese",
+    "kn": "Kannada",
+    "kk": "Kazakh",
+    "km": "Khmer",
+    "ko": "Korean",
+    "lo": "Lao",
+    "la": "Latin",
+    "lv": "Latvian",
+    "ln": "Lingala",
+    "lt": "Lithuanian",
+    "lb": "Luxembourgish",
+    "mk": "Macedonian",
+    "mg": "Malagasy",
+    "ms": "Malay",
+    "ml": "Malayalam",
+    "mt": "Maltese",
+    "mi": "Maori",
+    "mr": "Marathi",
+    "mn": "Mongolian",
+    "ne": "Nepali",
+    "no": "Norwegian",
+    "oc": "Occitan",
+    "pa": "Punjabi",
+    "ps": "Pashto",
+    "fa": "Persian",
+    "pl": "Polish",
+    "pt": "Portuguese",
+    "ro": "Romanian",
+    "ru": "Russian",
+    "sa": "Sanskrit",
+    "sr": "Serbian",
+    "sn": "Shona",
+    "sd": "Sindhi",
+    "si": "Sinhala",
+    "sk": "Slovak",
+    "sl": "Slovenian",
+    "so": "Somali",
+    "es": "Spanish",
+    "su": "Sundanese",
+    "sw": "Swahili",
+    "sv": "Swedish",
+    "tl": "Tagalog",
+    "tg": "Tajik",
+    "ta": "Tamil",
+    "tt": "Tatar",
+    "te": "Telugu",
+    "th": "Thai",
+    "bo": "Tibetan",
+    "tr": "Turkish",
+    "tk": "Turkmen",
+    "uk": "Ukrainian",
+    "ur": "Urdu",
+    "uz": "Uzbek",
+    "vi": "Vietnamese",
+    "cy": "Welsh",
+    "yi": "Yiddish",
+    "yo": "Yoruba",
+}
+
+
+def get_language_names():
+    return list(LANGUAGES.values())
+
+
+def get_code_by_name(name):
+    for code, lang in LANGUAGES.items():
+        if lang == name:
+            return code
+    return "auto"
+
+
+def get_name_by_code(code):
+    return LANGUAGES.get(code, "Auto Detect")
```
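The module above is a simple bidirectional mapping: the settings UI shows display names, while the config stores ISO codes. A minimal sketch (reproducing only a small subset of the `LANGUAGES` table) illustrates the round-trip the UI relies on:

```python
# Sketch of src/core/languages.py with a reduced table, for illustration only.
LANGUAGES = {
    "auto": "Auto Detect",
    "en": "English",
    "fr": "French",
    "ja": "Japanese",
}

def get_language_names():
    return list(LANGUAGES.values())

def get_code_by_name(name):
    # Linear reverse lookup; unknown names fall back to "auto".
    for code, lang in LANGUAGES.items():
        if lang == name:
            return code
    return "auto"

def get_name_by_code(code):
    return LANGUAGES.get(code, "Auto Detect")

# Round-trip: name -> code -> name survives for every known entry.
assert get_code_by_name(get_name_by_code("fr")) == "fr"
assert get_code_by_name("Klingon") == "auto"  # unknown -> safe default
```

At ~100 entries a linear scan is perfectly adequate here; an inverted dict would only matter if lookups were on a hot path.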
```diff
@@ -74,11 +74,11 @@ class WhisperTranscriber:
             logging.error(f"Failed to load model: {e}")
             self.model = None

-    def transcribe(self, audio_data, is_file: bool = False) -> str:
+    def transcribe(self, audio_data, is_file: bool = False, task: Optional[str] = None) -> str:
         """
         Transcribe audio data.
         """
-        logging.info(f"Starting transcription... (is_file={is_file})")
+        logging.info(f"Starting transcription... (is_file={is_file}, task={task})")

         # Ensure model is loaded
         if not self.model:
@@ -91,6 +91,10 @@ class WhisperTranscriber:
         beam_size = int(self.config.get("beam_size"))
         best_of = int(self.config.get("best_of"))
         vad = False if is_file else self.config.get("vad_filter")
+        language = self.config.get("language")
+
+        # Use task override if provided, otherwise config
+        final_task = task if task else self.config.get("task")

         # Transcribe
         segments, info = self.model.transcribe(
@@ -98,6 +102,8 @@ class WhisperTranscriber:
             beam_size=beam_size,
             best_of=best_of,
             vad_filter=vad,
+            task=final_task,
+            language=language if language != "auto" else None,
             vad_parameters=dict(min_silence_duration_ms=500),
             condition_on_previous_text=self.config.get("condition_on_previous_text"),
             without_timestamps=True
```
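The key pattern in the `transcribe()` change is the per-call override: an explicit `task` argument (set by the hotkey that fired) beats the configured default. Isolated as a sketch, with a hypothetical `Config` stub standing in for the project's `ConfigManager`:

```python
# Hypothetical stand-in for ConfigManager, for illustration only.
class Config:
    def __init__(self, settings):
        self._settings = settings

    def get(self, key):
        return self._settings.get(key)

def resolve_task(config, task=None):
    # Per-call override wins; otherwise fall back to the configured default.
    return task if task else config.get("task")

config = Config({"task": "transcribe"})
assert resolve_task(config) == "transcribe"               # config default
assert resolve_task(config, "translate") == "translate"   # hotkey override
```

This keeps a single transcription code path while letting the two global hotkeys select "transcribe" or "translate" without touching stored settings.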
```diff
@@ -245,6 +245,26 @@ class UIBridge(QObject):

     # --- Methods called from QML ---

+    @Slot(result=list)
+    def get_supported_languages(self):
+        from src.core.languages import get_language_names
+        return get_language_names()
+
+    @Slot(str)
+    def set_language_by_name(self, name):
+        from src.core.languages import get_code_by_name
+        from src.core.config import ConfigManager
+        code = get_code_by_name(name)
+        ConfigManager().set("language", code)
+        self.settingChanged.emit("language", code)
+
+    @Slot(result=str)
+    def get_current_language_name(self):
+        from src.core.languages import get_name_by_code
+        from src.core.config import ConfigManager
+        code = ConfigManager().get("language")
+        return get_name_by_code(code)
+
     @Slot(str, result='QVariant')
     def getSetting(self, key):
         from src.core.config import ConfigManager
```
```diff
@@ -100,7 +100,7 @@ ComboBox {
     popup: Popup {
         y: control.height - 1
         width: control.width
-        implicitHeight: contentItem.implicitHeight
+        implicitHeight: Math.min(contentItem.implicitHeight, 300)
         padding: 5

         contentItem: ListView {
```
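The `Math.min` clamp matters now that the combo box model holds the full ~100-entry language list instead of 9 codes; without it the popup's implicit height grows with the row count. The same arithmetic in Python (the 300px cap comes from the diff; the per-row height is an assumed illustrative value):

```python
MAX_POPUP_HEIGHT = 300  # cap applied in the QML change

def popup_height(row_count, row_height=28):
    # row_height is an assumption for illustration; QML derives it
    # from the delegate's implicit height.
    return min(row_count * row_height, MAX_POPUP_HEIGHT)

assert popup_height(9) == 252    # the old 9-entry model fit anyway
assert popup_height(100) == 300  # the full language list is clamped
```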
```diff
@@ -25,7 +25,7 @@ Rectangle {

     Text {
         anchors.centerIn: parent
-        text: control.recording ? "Listening..." : (control.currentSequence || "None")
+        text: control.recording ? "Listening..." : (formatSequence(control.currentSequence) || "None")
         color: control.recording ? SettingsStyle.accent : (control.currentSequence ? "#ffffff" : "#808080")
         font.family: "JetBrains Mono"
         font.pixelSize: 13
@@ -72,6 +72,23 @@ Rectangle {
         if (!activeFocus) control.recording = false
     }

+    function formatSequence(seq) {
+        if (!seq) return ""
+        var parts = seq.split("+")
+        for (var i = 0; i < parts.length; i++) {
+            var p = parts[i]
+            // Standardize modifiers
+            if (p === "ctrl") parts[i] = "Ctrl"
+            else if (p === "alt") parts[i] = "Alt"
+            else if (p === "shift") parts[i] = "Shift"
+            else if (p === "win") parts[i] = "Win"
+            else if (p === "esc") parts[i] = "Esc"
+            // Capitalize F-keys and others (e.g. f8 -> F8, space -> Space)
+            else parts[i] = p.charAt(0).toUpperCase() + p.slice(1)
+        }
+        return parts.join(" + ")
+    }
+
     function getKeyName(key, text) {
         // F-Keys
         if (key >= Qt.Key_F1 && key <= Qt.Key_F35) return "f" + (key - Qt.Key_F1 + 1)
```
```diff
@@ -314,15 +314,25 @@ Window {
     spacing: 0

     ModernSettingsItem {
-        label: "Global Hotkey"
-        description: "Press to record a new shortcut (e.g. Ctrl+Space)"
+        label: "Global Hotkey (Transcribe)"
+        description: "Press to record a new shortcut (e.g. F9)"
         control: ModernKeySequenceRecorder {
-            Layout.preferredWidth: 200
+            implicitWidth: 240
             currentSequence: ui.getSetting("hotkey")
             onSequenceChanged: (seq) => ui.setSetting("hotkey", seq)
         }
     }

+    ModernSettingsItem {
+        label: "Global Hotkey (Translate)"
+        description: "Press to record a new shortcut (e.g. F10)"
+        control: ModernKeySequenceRecorder {
+            implicitWidth: 240
+            currentSequence: ui.getSetting("hotkey_translate")
+            onSequenceChanged: (seq) => ui.setSetting("hotkey_translate", seq)
+        }
+    }
+
     ModernSettingsItem {
         label: "Run on Startup"
         description: "Automatically launch when you log in"
@@ -742,15 +752,17 @@ Window {

     ModernSettingsItem {
         label: "Language"
-        description: "Force language or Auto-detect"
+        description: "Spoken language to transcribe"
         control: ModernComboBox {
-            width: 140
-            model: ["auto", "en", "fr", "de", "es", "it", "ja", "zh", "ru"]
-            currentIndex: model.indexOf(ui.getSetting("language"))
-            onActivated: ui.setSetting("language", currentText)
+            Layout.preferredWidth: 200
+            model: ui.get_supported_languages()
+            currentIndex: model.indexOf(ui.get_current_language_name())
+            onActivated: (index) => ui.set_language_by_name(currentText)
         }
     }

+    // Task selector removed as per user request (Hotkeys handle this now)
+
     ModernSettingsItem {
         label: "Compute Device"
         description: "Hardware acceleration (CUDA requires NVidia GPU)"
```
`src/utils/formatters.py` — new file, 32 lines

```diff
@@ -0,0 +1,32 @@
+"""
+Formatter Utilities
+===================
+Helper functions for text formatting.
+"""
+
+def format_hotkey(sequence: str) -> str:
+    """
+    Formats a hotkey sequence string (e.g. 'ctrl+alt+f9')
+    into a pretty readable string (e.g. 'Ctrl + Alt + F9').
+    """
+    if not sequence:
+        return "None"
+
+    parts = sequence.split('+')
+    formatted_parts = []
+
+    for p in parts:
+        p = p.strip().lower()
+        if p == 'ctrl': formatted_parts.append('Ctrl')
+        elif p == 'alt': formatted_parts.append('Alt')
+        elif p == 'shift': formatted_parts.append('Shift')
+        elif p == 'win': formatted_parts.append('Win')
+        elif p == 'esc': formatted_parts.append('Esc')
+        else:
+            # Capitalize first letter
+            if len(p) > 0:
+                formatted_parts.append(p[0].upper() + p[1:])
+            else:
+                formatted_parts.append(p)
+
+    return " + ".join(formatted_parts)
```
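This Python helper mirrors the QML `formatSequence` function, so both sides render hotkeys identically. A condensed equivalent (not the file's verbatim body: the modifier branches are folded into a lookup table) with example outputs:

```python
def format_hotkey(sequence: str) -> str:
    """Format 'ctrl+alt+f9' as 'Ctrl + Alt + F9'."""
    if not sequence:
        return "None"
    # Modifier spellings match the module's if/elif chain.
    special = {'ctrl': 'Ctrl', 'alt': 'Alt', 'shift': 'Shift',
               'win': 'Win', 'esc': 'Esc'}
    formatted = []
    for p in (part.strip().lower() for part in sequence.split('+')):
        # p[:1] keeps empty parts from raising IndexError.
        formatted.append(special.get(p, p[:1].upper() + p[1:]))
    return " + ".join(formatted)

print(format_hotkey("ctrl+alt+f9"))  # Ctrl + Alt + F9
print(format_hotkey("f8"))           # F8
print(format_hotkey(""))             # None
```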
```diff
@@ -65,6 +65,10 @@ class WindowHook:
         # (Window 420x140, Pill 380x100)
         self.logical_rect = [20, 20, 20+380, 20+100]
         self.current_scale = initial_scale
+        self.enabled = True  # New flag
+
+    def set_enabled(self, enabled):
+        self.enabled = enabled

     def install(self):
         proc_address = ctypes.cast(self.new_wnd_proc, ctypes.c_void_p)
@@ -73,6 +77,10 @@ class WindowHook:
     def wnd_proc_callback(self, hwnd, msg, wParam, lParam):
         try:
             if msg == WM_NCHITTEST:
+                # If disabled (invisible/inactive), let clicks pass through (HTTRANSPARENT)
+                if not self.enabled:
+                    return HTTRANSPARENT
+
                 res = self.on_nchittest(lParam)
                 if res != 0:
                     return res
```
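The `enabled` gate on `WM_NCHITTEST` is the whole click-through trick: answering `HTTRANSPARENT` makes Windows route mouse events to whatever window sits underneath, so the hidden pill never swallows clicks. A platform-free sketch of just that decision (the Win32 constant values are standard; `hit_test_pill` is a hypothetical stand-in for the real `on_nchittest`, and falling back to `HTCLIENT` simplifies the real code's deferral to the original window proc):

```python
WM_NCHITTEST = 0x0084   # Win32: "what part of the window is under the cursor?"
HTTRANSPARENT = -1      # Win32: pass the hit through to the window below
HTCLIENT = 1            # Win32: ordinary client-area hit

def handle_nchittest(enabled, hit_test_pill):
    # Mirrors wnd_proc_callback: a disabled overlay is click-through.
    if not enabled:
        return HTTRANSPARENT
    res = hit_test_pill()
    # 0 means "no opinion"; simplified here to a plain client hit.
    return res if res != 0 else HTCLIENT

assert handle_nchittest(False, lambda: HTCLIENT) == HTTRANSPARENT
assert handle_nchittest(True, lambda: HTCLIENT) == HTCLIENT
```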