# Compare commits

32 commits:

`aa2b0acd86` · `08b9ecc1cb` · `02ef33023d` · `d509eb5efb` · `7c80ecfbed` · `d8707b5ade` · `07ad3b220d` · `dc15e11e8e` · `a70e76b4ab` · `d40c83cc45` · `f2f80fc863` · `a6cf9efbcb` · `4615f3084f` · `937061f710` · `798a35e6d9` · `6737ed4547` · `aed489dd23` · `e23c492360` · `84f10092e9` · `03f46ee1e3` · `0f1bf5f1af` · `0b2b5848e2` · `f3bf7541cf` · `4b84a27a67` · `f184eb0037` · `306bd075ed` · `a1cc9c61b9` · `e627e1b8aa` · `eaa572b42f` · `e900201214` · `0d426aea4b` · `b15ce8076f`
## .gitignore (vendored, 25 lines deleted)

```diff
@@ -1,25 +0,0 @@
-# Python
-__pycache__/
-*.py[cod]
-*$py.class
-
-# Virtual Environment
-venv/
-env/
-
-# Distribution / Build
-dist/
-build/
-*.spec
-_unused_files/
-runtime/
-
-# IDEs
-.vscode/
-.idea/
-
-# Application Specific
-models/
-recordings/
-*.log
-settings.json
```
## README.md

`@@ -1,71 +1,244 @@`

# Whisper Voice

<div align="center">

**Reclaim Your Voice from the Cloud.**

# 🎙️ W H I S P E R V O I C E

### SOVEREIGN SPEECH RECOGNITION

Whisper Voice is a high-performance, strictly local speech-to-text tool designed for the desktop. It provides instant, high-accuracy dictation anywhere on your system—no internet connection required, no corporate servers, and absolutely no data harvesting.

<br>

We believe that the tools of production—and communication—should belong to the individual, not be rented from centralized tech giants.



[![Download Installer](https://img.shields.io/badge/Download-Windows%20x64-blue?style=for-the-badge&logo=windows)](https://git.lashman.live/lashman/whisper_voice/releases/latest)
[![License: CC0](https://img.shields.io/badge/License-CC0%201.0-lightgrey?style=for-the-badge)](https://creativecommons.org/publicdomain/zero/1.0/)

<br>

> *"The master's tools will never dismantle the master's house."*
> <br>
> **Build your own tools. Run them locally. Free your mind.**

[View Source](https://git.lashman.live/lashman/whisper_voice) • [Report Issue](https://git.lashman.live/lashman/whisper_voice/issues)

</div>

<br>
<br>

## 📡 The Transmission

We are witnessing the **enshittification** of the digital world. What were once vibrant social commons are being walled off, strip-mined for data, and degraded into rent-seeking silos. Your voice is no longer your own; it is a training set for a corporate oracle that charges you for the privilege of listening.

**Whisper Voice** is a small act of sabotage against this trend.

It is built on the axiom of **Technological Sovereignty**. By moving state-of-the-art inference from the server farms to your own silicon, you reclaim the means of digital production. No telemetry. No subscriptions. No "cloud processing" that eavesdrops on your intent.

---

## ✊ Core Principles

## ⚡ The Engine

### 1. Total Autonomy (Local-First)

Your voice data is yours alone. Unlike commercial alternatives that siphon your words to remote data centers for processing and profiling, Whisper Voice runs entirely on your hardware. **No masters, no servers.** You retain full sovereignty over your digital footprint.

Whisper Voice operates directly on the metal. It is not an API wrapper; it is an autonomous machine.

### 2. Decentralized Power

By leveraging optimized local processing, we strip away the need for reliance on massive, energy-hungry corporate infrastructure. This is technology scaled to the human level—powerful, efficient, and completely under your control.

| Component | Technology | Benefit |
| :--- | :--- | :--- |
| **Inference Core** | **Faster-Whisper** | Hyper-optimized C++ implementation via **CTranslate2**. Delivers **4x velocity** over standard PyTorch. |
| **Compression** | **INT8 quantization** | Enables Pro-grade models (`Large-v3`) to run on consumer-grade GPUs, democratizing elite AI. |
| **Sensory Gate** | **Silero VAD** | Enterprise-grade Voice Activity Detection filters out the noise, ensuring only pure intent is processed. |
| **Interface** | **Qt 6 / QML** | Hardware-accelerated, glassmorphic UI that is fluid, responsive, and sovereign. |

### 3. Accessible to All

High-quality speech recognition shouldn't be gated behind subscriptions or paywalls. This tool is free, open, and built to empower users to interact with their machines on their own terms.

### 🛑 Compatibility Matrix (Windows)

The core engine (`CTranslate2`) is heavily optimized for Nvidia tensor cores.

| Manufacturer | Hardware | Status | Notes |
| :--- | :--- | :--- | :--- |
| **Nvidia** | GTX 900+ / RTX | ✅ **Supported** | Full heavy-metal acceleration. |
| **AMD** | Radeon RX | ⚠️ **CPU Fallback** | Runs on CPU. Valid for `Small/Medium`, slow for `Large`. |
| **Intel** | Arc / Iris | ⚠️ **CPU Fallback** | Runs on CPU. Valid for `Small/Medium`, slow for `Large`. |
| **Apple** | M1 / M2 / M3 | ❌ **Unsupported** | Release is strictly Windows x64. |

> **AMD Users**: v1.0.3 auto-detects GPU failures and silently falls back to CPU.
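The silent GPU-to-CPU fallback amounts to trying backends in order and keeping the first one that initializes. A minimal sketch of the pattern, with stand-in loaders in place of the real CUDA/CPU model constructors:

```python
def cuda_loader():
    # Stand-in for a CUDA init that fails (e.g. missing cuBLAS DLLs).
    raise OSError("cublas64_12.dll not found")

def cpu_loader():
    # Stand-in for constructing the model with device="cpu".
    return "cpu-model"

def load_first_available(backends):
    """Try each (name, loader) in order; return the first that initializes."""
    errors = {}
    for name, loader in backends:
        try:
            return name, loader()
        except Exception as exc:
            errors[name] = exc
    raise RuntimeError(f"all backends failed: {errors}")

backend, model = load_first_available([("cuda", cuda_loader), ("cpu", cpu_loader)])
```

The key design point is that a GPU failure is caught and logged rather than surfaced, so the user only ever sees a working (if slower) transcriber.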
<br>

## 🖋️ Universal Transcription

At its core, Whisper Voice is the ultimate bridge between thought and text. It listens with superhuman precision, converting spoken word into written form across **99 languages**.

* **Punctuation Mastery**: Automatically handles capitalization and complex punctuation formatting.
* **Contextual Intelligence**: Smarter than standard dictation; it understands the flow of sentences to resolve homophones and technical jargon ("$1.5k" vs "fifteen hundred dollars").
* **Total Privacy**: Your private dictation, legal notes, or creative writing never leaves your RAM.

### Workflow: `F9 (Default)`

The primary channel for native-language transcription. It transcribes precisely what it hears in the language you speak (or the one you've locked in Settings).

### 🧠 Intelligent Correction (New in v1.1.0)

Whisper Voice now integrates a local **Llama 3.2 1B** LLM to act as a "Silent Consultant". It post-processes transcripts to fix grammar or polish style without ever "chatting" back.

It is strictly trained on a **Forensic Protocol**: it will never lecture you, never refuse to process explicit language, and never sanitize your words. Your profanity is yours to keep.

#### Correction Modes:

* **Standard (Default)**: Fixes grammar, punctuation, and capitalization while keeping every word you said.
* **Grammar Only**: Strictly fixes objective errors (spelling/agreement). Touches nothing else.
* **Rewrite**: Polishes the flow and clarity of your sentences while explicitly preserving your original tone (Casual stays casual, Formal stays formal).

#### Supported Languages:

The correction engine is optimized for **English, German, French, Italian, Portuguese, Spanish, Hindi, and Thai**. It also performs well on **Russian, Chinese, Japanese, and Romanian**.

This approach incurs a ~2s latency penalty but uses **zero extra VRAM** when in Low VRAM mode.
<br>

## 🌎 Universal Translation

Whisper Voice v1.0.1 includes a **Neural Translation Engine** that allows you to bridge any linguistic gap instantly.

* **Input**: Speak in French, Japanese, Russian, or **96 other languages**.
* **Output**: The engine instantly reconstructs the semantic meaning into fluent **English**.
* **Task Protocol**: Handled via the dedicated `F10` channel.

### 🔍 Why only English translation?

A common question arises: *Why can't I translate from French to Japanese?*

The architecture of the underlying Whisper model is a **Many-to-English** design. During its massive training phase (680,000 hours of audio), the translation task was specifically optimized to map the global linguistic commons onto a single bridge language: **English**. This allowed the model to reach incredible levels of semantic understanding without the exponential complexity of a "Many-to-Many" mapping.

By focusing its translation decoder solely on English, Whisper achieves "Zero-Shot" quality that rivals specialized translation engines while remaining lightweight enough to run on your local GPU.

---

## ✨ Features

## 🕹️ Command & Control

* **100% Offline Processing**: Once the recognition engine is downloaded, the cable can be cut. Nothing leaves your machine.
* **Universal Compatibility**: Works in any text field—editors, chat apps, terminals, or browsers. If you can type there, you can speak there.
* **Adaptive Input**:
  * *Clipboard Mode*: Standard paste injection.
  * *High-Speed Simulation*: Simulates keystrokes at supersonic speeds (up to 6000 CPM) for apps that block pasting.
* **System Integration**: Minimalist overlay and system tray presence. It exists when you need it and vanishes when you don't.
* **Resource Efficiency**: Optimized to run smoothly on consumer hardware without monopolizing your system resources.

### Global Hotkeys

The agent runs silently in the background, waiting for your signal.

* **Transcribe (F9)**: Opens the channel for standard speech-to-text.
* **Translate (F10)**: Opens the channel for neural translation.
* **Customization**: Remap these keys in Settings. The recorder supports complex chords (e.g. `Ctrl + Alt + Space`) to fit your workflow.

### Injection Protocols

* **Clipboard Paste**: Standard text injection. Instant, reliable.
* **Simulate Typing**: Mimics physical keystrokes at superhuman speed (6000 CPM). Bypasses anti-paste restrictions and "protected" windows.
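The relationship between a CPM (characters-per-minute) setting and the delay between simulated keystrokes is simple division; a sketch (the function name is illustrative, not the app's API):

```python
def keystroke_interval(cpm):
    """Seconds to wait between simulated keystrokes at a given CPM rate."""
    return 60.0 / cpm

fast = keystroke_interval(6000)   # 0.01 s per keystroke at full speed
safe = keystroke_interval(1200)   # 0.05 s, gentler for remote desktops
```

Lowering CPM simply widens this interval, which is why throttling helps applications that drop events under a fast stream.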
<br>

## 📊 Intelligence Matrix

Select the model that aligns with your available resources.

| Model | VRAM (GPU) | RAM (CPU) | Designation | Capability |
| :--- | :--- | :--- | :--- | :--- |
| `Tiny` | **~500 MB** | ~1 GB | ⚡ **Supersonic** | Command & Control, older hardware. |
| `Base` | **~600 MB** | ~1 GB | 🚀 **Very Fast** | Daily driver for low-power laptops. |
| `Small` | **~1 GB** | ~2 GB | ⏩ **Fast** | High accuracy English dictation. |
| `Medium` | **~2 GB** | ~4 GB | ⚖️ **Balanced** | Complex vocabulary, foreign accents. |
| `Large-v3 Turbo` | **~4 GB** | ~6 GB | ✨ **Optimal** | **The Sweet Spot.** Near-Large intelligence, Medium speed. |
| `Large-v3` | **~5 GB** | ~8 GB | 🧠 **Maximum** | Professional grade. Uncompromised. |

> *Note: Acceleration requires you to manually select your Compute Device (CUDA GPU or CPU) in Settings.*
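Choosing from the matrix above can be automated by picking the most capable model that fits in free VRAM. A sketch using the approximate figures from the table (the helper name is illustrative):

```python
# Approximate VRAM needs from the table above, in MB.
VRAM_MB = {
    "tiny": 500, "base": 600, "small": 1000,
    "medium": 2000, "large-v3-turbo": 4000, "large-v3": 5000,
}

def largest_fitting_model(free_vram_mb):
    """Most capable model that fits the given free VRAM, or None for CPU-only."""
    fits = {m: need for m, need in VRAM_MB.items() if need <= free_vram_mb}
    return max(fits, key=fits.get) if fits else None

choice = largest_fitting_model(4500)   # a 4 GB-class card with headroom
```

On a card with ~4.5 GB free this selects `large-v3-turbo`, matching the table's "Sweet Spot" recommendation.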
### 📉 Low VRAM Mode

For users with limited GPU memory (e.g., 4GB cards) or those running heavy games simultaneously, Whisper Voice offers a specialized **Low VRAM Mode**.

* **Behavior**: The AI model is aggressively unloaded from the GPU immediately after every transcription.
* **Benefit**: When idle, the app consumes near-zero VRAM (~0MB), leaving your GPU completely free for gaming or rendering.
* **Trade-off**: There is a "cold start" latency of 1-2 seconds for every voice command as the model reloads from the disk cache.
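The load-use-unload cycle behind Low VRAM Mode can be sketched generically; the loader and model callables below are stand-ins for the real engine, not the app's actual classes:

```python
import gc

class LowVRAMSession:
    """Load the model per request and drop it immediately afterwards."""

    def __init__(self, loader):
        self.loader = loader
        self.loads = 0   # how many cold starts we have paid

    def transcribe(self, audio):
        model = self.loader()        # cold start: reload from disk cache (~1-2 s)
        self.loads += 1
        try:
            return model(audio)
        finally:
            del model                # release the only reference...
            gc.collect()             # ...so backing memory can be reclaimed

# Stand-in "model": a callable that uppercases its input.
session = LowVRAMSession(lambda: (lambda audio: audio.upper()))
out = session.transcribe("hello")
```

Every call pays one load, which is exactly the 1-2 s trade-off described above; a resident-model session would load once and keep the reference.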
---

## 🚀 Getting Started

## ♿ Accessibility (WCAG 2.2 AAA)

### Installation

1. Download the latest release.
2. Run `WhisperVoice.exe`.
3. On the first run, the bootstrapper will autonomously provision the necessary runtime environment. This ensures your system remains clean and dependencies are self-contained.

Whisper Voice is built to be usable by everyone. The entire interface has been engineered to meet **WCAG 2.2 AAA** — the highest tier of accessibility compliance. This is not a checkbox exercise; it is a structural commitment.

### Usage

1. **Set Your Trigger**: Configure a global hotkey (default: `F9`) in the settings.
2. **Speak Freely**: Hold the hotkey (or toggle it) and speak.
3. **Direct Action**: Your words are instantly transcribed and injected into your active window.

### Color & Contrast

Every design token is calibrated for **Enhanced Contrast** (WCAG 1.4.6, 7:1 minimum):

| Token | Ratio | Purpose |
| :--- | :--- | :--- |
| `textPrimary` #FAFAFA | ~17:1 | Body text, headings |
| `textSecondary` #ABABAB | 8.1:1 | Descriptions, hints |
| `accentPurple` #B794F6 | 7.2:1 | Interactive elements, focus rings |
| `borderSubtle` | 3:1 | Non-text contrast for borders and separators |
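The ratios in the table follow the standard WCAG relative-luminance formula, which can be computed directly. A sketch (the dark background hex used in the example is hypothetical, not the app's actual token):

```python
def _linear(channel_8bit):
    """sRGB channel (0-255) to linear light, per the WCAG luminance formula."""
    c = channel_8bit / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(hex_color):
    r, g, b = (int(hex_color.lstrip("#")[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)

def contrast_ratio(fg, bg):
    """WCAG contrast ratio: (L_lighter + 0.05) / (L_darker + 0.05)."""
    hi, lo = sorted((relative_luminance(fg), relative_luminance(bg)), reverse=True)
    return (hi + 0.05) / (lo + 0.05)

max_ratio = contrast_ratio("#FFFFFF", "#000000")       # 21.0, the maximum
body_text = contrast_ratio("#FAFAFA", "#121212")       # hypothetical dark bg
```

`#FAFAFA` against a near-black background clears the 7:1 AAA bar by a wide margin, consistent with the ~17:1 figure in the table.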
### Keyboard Navigation

Full keyboard access — no mouse required:

* **Tab / Shift+Tab**: Navigate between all interactive controls (sliders, switches, buttons, dropdowns, text fields).
* **Arrow Keys**: Navigate the Settings sidebar tabs.
* **Enter / Space**: Activate any focused control.
* **Focus Rings**: Every interactive element shows a visible 2px accent-colored focus indicator.

### Screen Reader Support

Every component is annotated with semantic roles and descriptive names:

* Buttons, sliders, checkboxes, combo boxes, text fields — all declare their `Accessible.role` and `Accessible.name`.
* Switches report "on" / "off" state in their accessible name.
* The loader status uses `AlertMessage` for live-region announcements.
* Settings tabs use `Tab` / `PageTab` roles matching WAI-ARIA patterns.

### Non-Color State Indicators

Toggle switches display **I/O marks** inside the thumb (not just color changes), ensuring state is perceivable without color vision (WCAG 1.4.1).

### Target Sizes

All interactive controls meet the **24px minimum** target size (WCAG 2.5.8). Slider handles, buttons, switches, and nav items are all comfortably clickable.

### Reduced Motion

A **Reduce Motion** toggle (Settings > Visuals) disables all decorative animations:

* Shader effects (gradient blobs, glow, CRT scanlines, rainbow waveform)
* Particle systems
* Pulsing animations (mic button, recording timer, border)
* Loader logo pulse and progress shimmer

The system also respects the **Windows "Show animations" preference** via `SystemParametersInfo` detection. Essential information (recording state, progress bars, timer text) remains fully functional.
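Reading the Windows "Show animations" preference is a `SystemParametersInfo` query with the `SPI_GETCLIENTAREAANIMATION` flag. A sketch, assuming the constant value from the Win32 headers; non-Windows platforms (and query failures) default to animations enabled:

```python
import ctypes
import sys

SPI_GETCLIENTAREAANIMATION = 0x1042  # Win32 "Show animations in Windows" query

def system_animations_enabled():
    """Honor the Windows animation preference; default to True elsewhere."""
    if sys.platform != "win32":
        return True
    enabled = ctypes.c_int(0)
    ok = ctypes.windll.user32.SystemParametersInfoW(
        SPI_GETCLIENTAREAANIMATION, 0, ctypes.byref(enabled), 0)
    return bool(enabled.value) if ok else True

prefers_motion = system_animations_enabled()
```

An app would check this once at startup and OR it with its own Reduce Motion toggle before starting any decorative animation.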
---

## ⚙️ Configuration

## 🛠️ Deployment

The **Settings** panel puts the means of configuration in your hands:

### 📥 Installation

1. **Acquire**: Download `WhisperVoice.exe` from [Releases](https://git.lashman.live/lashman/whisper_voice/releases).
2. **Deploy**: Place it anywhere. It is portable.
3. **Bootstrap**: Run it. The agent will self-provision an isolated Python runtime (~2GB) on first launch.
4. **Sync**: Future updates are handled by the **Smart Bootstrapper**, which surgically updates only changed files, respecting your bandwidth and your settings.

* **Recognition Engine**: Choose the size of the model that fits your hardware capabilities (Tiny to Large). Larger models offer greater precision but require more computing power.
* **Input Method**: Switch between "Clipboard Paste" and "Simulate Typing" depending on target application restrictions.
* **Typing Speed**: Adjust the keystroke injection rate. Crank it up to 6000 CPM for instant text delivery.
* **Run on Startup**: Configure the agent to be ready the moment your session begins.

### 🔧 Troubleshooting

* **App crashes on start**: Ensure you have the [Microsoft Visual C++ Redistributable 2015-2022](https://learn.microsoft.com/en-us/cpp/windows/latest-supported-vc-redist) installed.
* **"Simulate Typing" is slow**: Some applications (remote desktops, legacy games) cannot handle the data stream. Lower the typing speed in Settings to ~1200 CPM.
* **No Audio**: The agent listens to the **Default Communication Device**. Verify your Windows Sound Control Panel.

<br>

---

## 🤝 Mutual Aid

## 🌐 Supported Languages

This project thrives on community collaboration. If you have improvements, fixes, or ideas, you are encouraged to contribute. We build better systems when we build them together, horizontally and transparently.

The engine understands the following 99 languages. You can lock the focus to a specific language in Settings to improve accuracy, or rely on **Auto-Detect** for fluid multilingual usage.

* **Report Issues**: If something breaks, let us know.
* **Contribute Code**: The source is open. Fork it, improve it, share it.

| | | | | | |
| :--- | :--- | :--- | :--- | :--- | :--- |
| Afrikaans 🇿🇦 | Albanian 🇦🇱 | Amharic 🇪🇹 | Arabic 🇸🇦 | Armenian 🇦🇲 | Assamese 🇮🇳 |
| Azerbaijani 🇦🇿 | Bashkir 🇷🇺 | Basque 🇪🇸 | Belarusian 🇧🇾 | Bengali 🇧🇩 | Bosnian 🇧🇦 |
| Breton 🇫🇷 | Bulgarian 🇧🇬 | Burmese 🇲🇲 | Castilian 🇪🇸 | Catalan 🇪🇸 | Chinese 🇨🇳 |
| Croatian 🇭🇷 | Czech 🇨🇿 | Danish 🇩🇰 | Dutch 🇳🇱 | English 🇺🇸 | Estonian 🇪🇪 |
| Faroese 🇫🇴 | Finnish 🇫🇮 | Flemish 🇧🇪 | French 🇫🇷 | Galician 🇪🇸 | Georgian 🇬🇪 |
| German 🇩🇪 | Greek 🇬🇷 | Gujarati 🇮🇳 | Haitian 🇭🇹 | Hausa 🇳🇬 | Hawaiian 🇺🇸 |
| Hebrew 🇮🇱 | Hindi 🇮🇳 | Hungarian 🇭🇺 | Icelandic 🇮🇸 | Indonesian 🇮🇩 | Italian 🇮🇹 |
| Japanese 🇯🇵 | Javanese 🇮🇩 | Kannada 🇮🇳 | Kazakh 🇰🇿 | Khmer 🇰🇭 | Korean 🇰🇷 |
| Lao 🇱🇦 | Latin 🇻🇦 | Latvian 🇱🇻 | Lingala 🇨🇩 | Lithuanian 🇱🇹 | Luxembourgish 🇱🇺 |
| Macedonian 🇲🇰 | Malagasy 🇲🇬 | Malay 🇲🇾 | Malayalam 🇮🇳 | Maltese 🇲🇹 | Maori 🇳🇿 |
| Marathi 🇮🇳 | Moldavian 🇲🇩 | Mongolian 🇲🇳 | Myanmar 🇲🇲 | Nepali 🇳🇵 | Norwegian 🇳🇴 |
| Occitan 🇫🇷 | Panjabi 🇮🇳 | Pashto 🇦🇫 | Persian 🇮🇷 | Polish 🇵🇱 | Portuguese 🇵🇹 |
| Punjabi 🇮🇳 | Romanian 🇷🇴 | Russian 🇷🇺 | Sanskrit 🇮🇳 | Serbian 🇷🇸 | Shona 🇿🇼 |
| Sindhi 🇵🇰 | Sinhala 🇱🇰 | Slovak 🇸🇰 | Slovenian 🇸🇮 | Somali 🇸🇴 | Spanish 🇪🇸 |
| Sundanese 🇮🇩 | Swahili 🇰🇪 | Swedish 🇸🇪 | Tagalog 🇵🇭 | Tajik 🇹🇯 | Tamil 🇮🇳 |
| Tatar 🇷🇺 | Telugu 🇮🇳 | Thai 🇹🇭 | Tibetan 🇨🇳 | Turkish 🇹🇷 | Turkmen 🇹🇲 |
| Ukrainian 🇺🇦 | Urdu 🇵🇰 | Uzbek 🇺🇿 | Vietnamese 🇻🇳 | Welsh 🏴 | Yiddish 🇮🇱 |
| Yoruba 🇳🇬 | | | | | |

---

<br>
<br>

*Built with local processing libraries and Qt.*

*No gods, no cloud managers.*

<div align="center">

### ⚖️ PUBLIC DOMAIN (CC0 1.0)

*No Rights Reserved. No Gods. No Masters. No Managers.*

Credit to **OpenAI** (Whisper), **Systran** (Faster-Whisper), and **Silero** (VAD).

</div>
## RELEASE_NOTES.md (new file, 28 lines)

# Release v1.0.4

**"The Compatibility Update"**

This release focuses on maximum stability across different hardware configurations (AMD, Intel, Nvidia) and fixing startup crashes related to corrupted models or missing drivers.

## 🛠️ Critical Fixes

### 1. Robust CPU Fallback (AMD / Intel Support)

* **Problem**: Previously, if an AMD user tried to run the app, it would crash instantly because it tried to load Nvidia CUDA libraries by default.
* **Fix**: The app now **silently detects** if CUDA initialization fails (due to missing DLLs or incompatible hardware) and **automatically falls back to CPU mode**.
* **Result**: The app "just works" on any Windows machine, regardless of GPU.

### 2. Startup Crash Protection

* **Problem**: If `faster_whisper` was imported before checking for valid drivers, the app would crash on launch for some users.
* **Fix**: Implemented **Lazy Loading** for the AI engine. The app now starts the UI first, and only loads the heavy AI libraries inside a safety block that catches errors.
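The lazy-loading pattern can be sketched independently of the real engine; the loader below is a stand-in for the heavy `faster_whisper` import and model construction:

```python
class LazyEngine:
    """Defer the heavy import/construction until the first real use."""

    def __init__(self, loader):
        self._loader = loader
        self._engine = None

    def get(self):
        if self._engine is None:
            # Heavy work happens here, inside the caller's try/except,
            # instead of at module import time where a failure kills startup.
            self._engine = self._loader()
        return self._engine

calls = []
engine = LazyEngine(lambda: calls.append("loaded") or "model")
first = engine.get()
second = engine.get()   # cached; the loader does not run again
```

Because construction is deferred, the UI can come up first and present any loader error as a dialog rather than a crash.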
### 3. Corrupt Model Auto-Repair

* **Problem**: Interrupted downloads could leave a corrupted model folder, preventing the app from ever starting again.
* **Fix**: If the app detects a "vocabulary missing" or invalid config error, it will now **automatically delete the corrupt folder** and allow you to re-download it cleanly.
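A sketch of the detect-and-delete repair. The required file names here are an assumed model-folder layout for illustration, not confirmed from the source:

```python
import shutil
import tempfile
from pathlib import Path

REQUIRED = ("config.json", "model.bin", "vocabulary.txt")  # assumed layout

def ensure_model_dir(path):
    """Return True if the folder looks complete; wipe it when corrupt so
    the next run can re-download cleanly."""
    path = Path(path)
    if not path.exists():
        return False
    if all((path / name).exists() for name in REQUIRED):
        return True
    shutil.rmtree(path, ignore_errors=True)   # interrupted download: delete it
    return False

# Demo on a throwaway folder that is missing model.bin and vocabulary.txt.
root = Path(tempfile.mkdtemp()) / "model"
root.mkdir()
(root / "config.json").write_text("{}")
ok = ensure_model_dir(root)
```

After the wipe, the normal download path sees no folder at all and starts fresh, which is what turns a permanent crash into a one-time re-download.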
### 4. Windows DLL Injection

* **Fix**: Added explicit DLL path injection for `nvidia-cublas` and `nvidia-cudnn` to ensure Python 3.8+ can find the required CUDA libraries on Windows systems that don't have them in PATH.
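On Python 3.8+ the supported mechanism for this is `os.add_dll_directory`, since Windows DLL resolution no longer consults `PATH` for dependent DLLs. A sketch; the site-packages path and the `nvidia/*/bin` wheel layout are assumptions for illustration:

```python
import os
from pathlib import Path

def register_cuda_dll_dirs(site_packages):
    """Register nvidia wheel DLL folders so CUDA libs resolve on Windows.

    os.add_dll_directory exists only on Windows, hence the hasattr guard;
    on other platforms (or when the folders are absent) this is a no-op.
    """
    registered = []
    for sub in ("nvidia/cublas/bin", "nvidia/cudnn/bin"):
        d = Path(site_packages) / sub
        if d.is_dir() and hasattr(os, "add_dll_directory"):
            os.add_dll_directory(str(d))
            registered.append(str(d))
    return registered

# Hypothetical embedded-runtime location, for illustration only.
dirs = register_cuda_dll_dirs("runtime/python/Lib/site-packages")
```

Calling this before the first `import ctranslate2` (or any CUDA-backed import) is what prevents the "DLL load failed" crash class described above.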
## 📦 Installation

1. Download `WhisperVoice.exe` below.
2. Replace your existing `.exe`.
3. Run it.
## app_icon.ico (binary)

Binary file not shown. Before: 73 KiB.
## bootstrapper.py

```diff
@@ -245,62 +245,106 @@ class Bootstrapper:
 
         req_file = self.source_path / "requirements.txt"
 
+        # Use --prefer-binary to avoid building from source on Windows if possible
+        # Use --no-warn-script-location to reduce noise
+        # CRITICAL: Force --only-binary for llama-cpp-python to prevent picking new source-only versions
+        cmd = [
+            str(self.python_path / "python.exe"), "-m", "pip", "install",
+            "--prefer-binary",
+            "--only-binary", "llama-cpp-python",
+            "--extra-index-url", "https://abetlen.github.io/llama-cpp-python/whl/cpu",
+            "-r", str(req_file)
+        ]
+
         process = subprocess.Popen(
-            [str(self.python_path / "python.exe"), "-m", "pip", "install", "-r", str(req_file)],
+            cmd,
             stdout=subprocess.PIPE,
-            stderr=subprocess.STDOUT,
+            stderr=subprocess.STDOUT,  # Merge stderr into stdout
             text=True,
             cwd=str(self.python_path),
             creationflags=subprocess.CREATE_NO_WINDOW
         )
 
+        output_buffer = []
         for line in process.stdout:
-            if self.ui: self.ui.set_detail(line.strip()[:60])
-        process.wait()
+            line_stripped = line.strip()
+            if self.ui: self.ui.set_detail(line_stripped[:60])
+            output_buffer.append(line_stripped)
+            log(line_stripped)
+
+        return_code = process.wait()
+
+        if return_code != 0:
+            err_msg = "\n".join(output_buffer[-15:])  # Show last 15 lines
+            raise RuntimeError(f"Pip install failed (Exit code {return_code}):\n{err_msg}")
 
     def refresh_app_source(self):
-        """Refresh app source files. Skips if already exists to save time."""
-        # Optimization: If app/main.py exists, skip update to improve startup speed.
-        # The user can delete the 'runtime' folder to force an update.
-        if (self.app_path / "main.py").exists():
-            log("App already exists. Skipping update.")
-            return True
-
-        if self.ui: self.ui.set_status("Updating app files...")
+        """
+        Smartly updates app source files by only copying changed files.
+        Preserves user settings and reduces disk I/O.
+        """
+        if self.ui: self.ui.set_status("Checking for updates...")
 
         try:
-            # Preserve settings.json if it exists
-            settings_path = self.app_path / "settings.json"
-            temp_settings = None
-            if settings_path.exists():
-                try:
-                    temp_settings = settings_path.read_bytes()
-                except:
-                    log("Failed to backup settings.json, it involves risk of data loss.")
+            # 1. Ensure destination exists
+            if not self.app_path.exists():
+                self.app_path.mkdir(parents=True, exist_ok=True)
 
-            if self.app_path.exists():
-                shutil.rmtree(self.app_path, ignore_errors=True)
+            # 2. Walk source and sync
+            # source_path is the temporary bundled folder
+            # app_path is the persistent runtime folder
 
-            shutil.copytree(
-                self.source_path,
-                self.app_path,
-                ignore=shutil.ignore_patterns(
-                    '__pycache__', '*.pyc', '.git', 'venv',
-                    'build', 'dist', '*.egg-info', 'runtime'
-                )
-            )
+            changes_made = 0
 
-            # Restore settings.json
-            if temp_settings:
-                try:
-                    settings_path.write_bytes(temp_settings)
-                    log("Restored settings.json")
-                except:
-                    log("Failed to restore settings.json")
+            for src_dir, dirs, files in os.walk(self.source_path):
+                # Determine relative path from source root
+                rel_path = Path(src_dir).relative_to(self.source_path)
+                dst_dir = self.app_path / rel_path
+
+                # Ensure directory exists
+                if not dst_dir.exists():
+                    dst_dir.mkdir(parents=True, exist_ok=True)
+
+                for file in files:
+                    # Skip ignored files
+                    if file in ['__pycache__', '.git', 'settings.json'] or file.endswith('.pyc'):
+                        continue
+
+                    src_file = Path(src_dir) / file
+                    dst_file = dst_dir / file
+
+                    # Check if update needed
+                    should_copy = False
+                    if not dst_file.exists():
+                        should_copy = True
+                    else:
+                        # Compare size first (fast)
+                        if src_file.stat().st_size != dst_file.stat().st_size:
+                            should_copy = True
+                        else:
+                            # Compare content (slower but accurate)
+                            # Only read if size matches to verify diff
+                            if src_file.read_bytes() != dst_file.read_bytes():
+                                should_copy = True
+
+                    if should_copy:
+                        shutil.copy2(src_file, dst_file)
+                        changes_made += 1
+                        if self.ui: self.ui.set_detail(f"Updated: {file}")
+
+            # 3. Cleanup logic (Optional: remove files in dest that are not in source)
+            # For now, we only add/update to prevent deleting generated user files (logs, etc)
+
+            if changes_made > 0:
+                log(f"Update complete. {changes_made} files changed.")
+            else:
+                log("App is up to date.")
 
             return True
         except Exception as e:
             log(f"Error refreshing app source: {e}")
+            # Fallback to nuclear option if sync fails completely?
+            # No, 'smart_sync' failing might mean permissions, nuclear wouldn't help.
             return False
 
     def run_app(self):
```
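The size-then-bytes comparison used by the smart sync above can be isolated as a helper; a minimal sketch (function name is illustrative):

```python
import tempfile
from pathlib import Path

def files_differ(src, dst):
    """Cheap size check first; full byte comparison only when sizes match."""
    src, dst = Path(src), Path(dst)
    if not dst.exists():
        return True
    if src.stat().st_size != dst.stat().st_size:
        return True
    return src.read_bytes() != dst.read_bytes()

# Demo on two throwaway files of equal length but different content.
d = Path(tempfile.mkdtemp())
(d / "a.py").write_text("print(1)")
(d / "b.py").write_text("print(2)")
same = not files_differ(d / "a.py", d / "a.py")
diff = files_differ(d / "a.py", d / "b.py")
```

The size check resolves almost all unchanged files with a single `stat`, so full reads are only paid for the rare same-size pairs.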
```diff
@@ -323,22 +367,51 @@ class Bootstrapper:
             messagebox.showerror("WhisperVoice Error", f"Failed to launch app: {e}")
             return False
 
+    def check_dependencies(self):
+        """Check if critical dependencies are importable in the embedded python."""
+        if not self.is_python_ready(): return False
+
+        try:
+            # Check for core libs that might be missing
+            # We use a subprocess to check imports in the runtime environment
+            subprocess.check_call(
+                [str(self.python_path / "python.exe"), "-c", "import faster_whisper; import llama_cpp; import PySide6"],
+                stdout=subprocess.DEVNULL,
+                stderr=subprocess.DEVNULL,
+                cwd=str(self.python_path),
+                creationflags=subprocess.CREATE_NO_WINDOW
+            )
+            return True
+        except (subprocess.CalledProcessError, FileNotFoundError):
+            return False
+
     def setup_and_run(self):
         """Full setup/update and run flow."""
         try:
             # 1. Ensure basics
             if not self.is_python_ready():
                 self.download_python()
                 self._fix_pth_file()  # Ensure pth is fixed immediately after download
                 self.install_pip()
-                self.install_packages()
+                # self.install_packages()  # We'll do this in the dependency check step now
 
+            # Always refresh source to ensure we have the latest bundled code
             self.refresh_app_source()
 
+            # 2. Check and Install Dependencies
+            # We do this AFTER refreshing source so we have the latest requirements.txt
+            if not self.check_dependencies():
+                log("Dependencies missing or incomplete. Installing...")
+                self.install_packages()
+
             # Launch
             if self.run_app():
                 if self.ui: self.ui.root.quit()
         except Exception as e:
-            messagebox.showerror("Setup Error", f"Installation failed: {e}")
+            if self.ui:
+                import tkinter.messagebox as mb
+                mb.showerror("Setup Error", f"Installation failed: {e}")  # Improved error visibility
+            log(f"Fatal error: {e}")
+            import traceback
+            traceback.print_exc()
```
## build.bat (new file, 31 lines)

```bat
@echo off
echo ============================================
echo  Building WhisperVoice Portable EXE
echo ============================================
echo.

if not exist venv (
    echo ERROR: venv not found. Run run_source.bat first.
    pause
    exit /b 1
)

call venv\Scripts\activate

echo Running PyInstaller (single-file bootstrapper)...
pyinstaller build.spec --clean --noconfirm

if %ERRORLEVEL% NEQ 0 (
    echo.
    echo BUILD FAILED! Check errors above.
    pause
    exit /b 1
)

echo.
echo Build complete!
echo.
echo Output: dist\WhisperVoice.exe
echo.
echo This single exe will download all dependencies on first run.
pause
```
95
build.spec
Normal file
95
build.spec
Normal file
@@ -0,0 +1,95 @@
|
||||
# -*- mode: python ; coding: utf-8 -*-
|
||||
# WhisperVoice — Single-file portable bootstrapper
|
||||
#
|
||||
# This builds a TINY exe that contains only:
|
# - The bootstrapper (downloads Python + deps on first run)
# - The app source code (bundled as data, extracted to runtime/app/)
#
# NO heavy dependencies (torch, PySide6, etc.) are bundled.

import os
import glob

block_cipher = None

# ── Collect app source as data (goes into app_source/ inside the bundle) ──

app_datas = []

# main.py
app_datas.append(('main.py', 'app_source'))

# requirements.txt
app_datas.append(('requirements.txt', 'app_source'))

# src/**/*.py (core, ui, utils — preserving directory structure)
for py in glob.glob('src/**/*.py', recursive=True):
    dest = os.path.join('app_source', os.path.dirname(py))
    app_datas.append((py, dest))

# src/ui/qml/** (QML files, shaders, SVGs, fonts, qmldir)
qml_dir = os.path.join('src', 'ui', 'qml')
for pattern in ('*.qml', '*.qsb', '*.frag', '*.svg', '*.ico', '*.png',
                'qmldir', 'AUTHORS.txt', 'OFL.txt'):
    for f in glob.glob(os.path.join(qml_dir, pattern)):
        app_datas.append((f, os.path.join('app_source', qml_dir)))

# Fonts
for f in glob.glob(os.path.join(qml_dir, 'fonts', 'ttf', '*.ttf')):
    app_datas.append((f, os.path.join('app_source', qml_dir, 'fonts', 'ttf')))

# assets/
if os.path.exists(os.path.join('assets', 'icon.ico')):
    app_datas.append((os.path.join('assets', 'icon.ico'), os.path.join('app_source', 'assets')))

# ── Analysis — only the bootstrapper, NO heavy imports ────────────────────

a = Analysis(
    ['bootstrapper.py'],
    pathex=[],
    binaries=[],
    datas=app_datas,
    hiddenimports=[],
    hookspath=[],
    hooksconfig={},
    runtime_hooks=[],
    excludes=[
        # Exclude everything heavy — the bootstrapper only uses stdlib
        'torch', 'numpy', 'scipy', 'PySide6', 'shiboken6',
        'faster_whisper', 'ctranslate2', 'llama_cpp',
        'sounddevice', 'soundfile', 'keyboard', 'pyperclip',
        'psutil', 'pynvml', 'pystray', 'PIL', 'Pillow',
        'darkdetect', 'huggingface_hub', 'requests',
        'tqdm', 'onnxruntime', 'av',
        'tkinter', 'matplotlib', 'notebook', 'IPython',
    ],
    win_no_prefer_redirects=False,
    win_private_assemblies=False,
    cipher=block_cipher,
    noarchive=False,
)

pyz = PYZ(a.pure, a.zipped_data, cipher=block_cipher)

# ── Single-file EXE (--onefile) ──────────────────────────────────────────

exe = EXE(
    pyz,
    a.scripts,
    a.binaries,
    a.zipfiles,
    a.datas,
    [],
    name='WhisperVoice',
    debug=False,
    bootloader_ignore_signals=False,
    strip=False,
    upx=True,
    console=False,  # No console — bootstrapper allocates one when needed
    disable_windowed_traceback=False,
    argv_emulation=False,
    target_arch=None,
    codesign_identity=None,
    entitlements_file=None,
    icon='assets/icon.ico',
)
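The spec above ships the app source as *data* under `app_source/`. At runtime, a `--onefile` PyInstaller executable extracts those data files to a temporary directory exposed as `sys._MEIPASS`. A minimal sketch of how the bootstrapper could resolve a bundled file in both frozen and from-source runs (`resource_path` is a hypothetical helper, not part of this repo):

```python
import os
import sys

def resource_path(relative: str) -> str:
    """Resolve a bundled data file, frozen (PyInstaller) or from source.

    When frozen with --onefile, PyInstaller extracts data files into a
    temp dir exposed as sys._MEIPASS; otherwise fall back to the
    directory the script was launched from.
    """
    base = getattr(sys, "_MEIPASS", None)
    if base is None:
        base = os.path.abspath(os.path.dirname(sys.argv[0]) or os.getcwd())
    return os.path.join(base, relative)

print(resource_path(os.path.join("app_source", "main.py")))
```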
@@ -1,66 +0,0 @@
"""
Build the Lightweight Bootstrapper
==================================

This creates a small (~15-20MB) .exe that downloads Python + dependencies on first run.
"""

import os
import shutil
import PyInstaller.__main__
from pathlib import Path

def build_bootstrapper():
    project_root = Path(__file__).parent.absolute()
    dist_path = project_root / "dist"

    # Collect all app source files to bundle
    # These will be extracted and used when setting up
    app_source_files = [
        ("src", "app_source/src"),
        ("assets", "app_source/assets"),  # Include icon etc.
        ("main.py", "app_source"),
        ("requirements.txt", "app_source"),
    ]

    add_data_args = []
    for src, dst in app_source_files:
        src_path = project_root / src
        if src_path.exists():
            add_data_args.extend(["--add-data", f"{src}{os.pathsep}{dst}"])

    # Use absolute project root for copying
    shutil.copy2(project_root / "assets" / "icon.ico", project_root / "app_icon.ico")

    print("🚀 Building Lightweight Bootstrapper...")
    print("⏳ This creates a small .exe that downloads dependencies on first run.\n")

    PyInstaller.__main__.run([
        "bootstrapper.py",
        "--name=WhisperVoice",
        "--onefile",
        "--noconsole",  # Re-enabled! Error handling in bootstrapper is ready.
        "--clean",
        "--icon=app_icon.ico",  # Simplified path at root
        *add_data_args,
    ])

    exe_path = dist_path / "WhisperVoice.exe"
    if exe_path.exists():
        size_mb = exe_path.stat().st_size / (1024 * 1024)
        print("\n" + "="*60)
        print("✅ BOOTSTRAPPER BUILD COMPLETE!")
        print("="*60)
        print(f"\n📍 Output: {exe_path}")
        print(f"📦 Size: {size_mb:.1f} MB")
        print("\n📋 How it works:")
        print("  1. User runs WhisperVoice.exe")
        print("  2. First run: Downloads Python + packages (~2-3GB)")
        print("  3. Subsequent runs: Launches instantly")
        print("\n💡 The 'runtime/' folder will be created next to the .exe")
    else:
        print("\n❌ Build failed. Check the output above for errors.")

if __name__ == "__main__":
    os.chdir(Path(__file__).parent)
    build_bootstrapper()
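The build scripts above all rely on PyInstaller's `--add-data` convention: source and destination are joined with the platform path separator (`;` on Windows, `:` on POSIX), which is exactly what `os.pathsep` gives. A small sketch of that argument construction in isolation:

```python
import os

def add_data_args(pairs):
    """Build PyInstaller --add-data arguments.

    PyInstaller expects SRC and DEST joined by the platform path
    separator: ';' on Windows, ':' on POSIX (i.e. os.pathsep).
    """
    args = []
    for src, dst in pairs:
        args.extend(["--add-data", f"{src}{os.pathsep}{dst}"])
    return args

print(add_data_args([("main.py", "app_source")]))
```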
@@ -1,17 +0,0 @@
@echo off
echo Building Whisper Voice Portable EXE...
if not exist venv (
    echo Please run run_source.bat first to setup environment!
    pause
    exit /b
)

call venv\Scripts\activate
pip install pyinstaller

echo Running PyInstaller...
pyinstaller build.spec --clean --noconfirm

echo.
echo Build Complete! Check dist/WhisperVoice.exe
pause
@@ -1,14 +0,0 @@
from PIL import Image
import os

# Path from the generate_image tool output
src = r"C:/Users/lashman/.gemini/antigravity/brain/9a183770-2481-475b-b748-03f4910f9a8e/app_icon_1769195450659.png"
dst = r"d:\!!! SYSTEM DATA !!!\Desktop\python crap\whisper_voice\assets\icon.ico"

if os.path.exists(src):
    img = Image.open(src)
    # Resize to standard icon sizes
    img.save(dst, format='ICO', sizes=[(256, 256)])
    print(f"Icon saved to {dst}")
else:
    print(f"Source image not found: {src}")
BIN dist/WhisperVoice.exe vendored Normal file
Binary file not shown.
@@ -1,43 +0,0 @@
import requests
import os

ICONS = {
    "settings.svg": "https://raw.githubusercontent.com/FortAwesome/Font-Awesome/6.x/svgs/solid/gear.svg",
    "visibility.svg": "https://raw.githubusercontent.com/FortAwesome/Font-Awesome/6.x/svgs/solid/eye.svg",
    "smart_toy.svg": "https://raw.githubusercontent.com/FortAwesome/Font-Awesome/6.x/svgs/solid/brain.svg",
    "microphone.svg": "https://raw.githubusercontent.com/FortAwesome/Font-Awesome/6.x/svgs/solid/microphone.svg"
}

TARGET_DIR = r"d:\!!! SYSTEM DATA !!!\Desktop\python crap\whisper_voice\src\ui\qml"

def download_icons():
    if not os.path.exists(TARGET_DIR):
        print(f"Directory not found: {TARGET_DIR}")
        return

    for filename, url in ICONS.items():
        try:
            print(f"Downloading {filename} from {url}...")
            response = requests.get(url, timeout=10)
            response.raise_for_status()

            # Force white fill
            content = response.text
            if "<path" in content and "fill=" not in content:
                content = content.replace("<path", '<path fill="#ffffff"')
            elif "<path" in content and "fill=" in content:
                # Regex or simple replace if possible, but simplest is usually just injecting style or checking common FA format
                pass  # FA standard usually has no fill.

            # Additional safety: Replace currentColor if present
            content = content.replace("currentColor", "#ffffff")

            filepath = os.path.join(TARGET_DIR, filename)
            with open(filepath, 'w', encoding='utf-8') as f:
                f.write(content)
            print(f"Saved {filepath} (modified to white)")
        except Exception as e:
            print(f"FAILED to download {filename}: {e}")

if __name__ == "__main__":
    download_icons()
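The recoloring step in the deleted icon script is pure string manipulation: inject a `fill` attribute into `<path>` elements that have none, then rewrite `currentColor` to white. Isolated as a testable sketch of that same logic:

```python
def force_white(svg: str) -> str:
    """Force an SVG's paths to render white.

    Mirrors the icon-download logic above: inject fill="#ffffff" into
    <path> elements that carry no fill attribute, then rewrite any
    currentColor token to white.
    """
    if "<path" in svg and "fill=" not in svg:
        svg = svg.replace("<path", '<path fill="#ffffff"')
    return svg.replace("currentColor", "#ffffff")

print(force_white('<svg><path d="M0 0"/></svg>'))
# → <svg><path fill="#ffffff" d="M0 0"/></svg>
```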
402 main.py
@@ -9,6 +9,31 @@ app_dir = os.path.dirname(os.path.abspath(__file__))
if app_dir not in sys.path:
    sys.path.insert(0, app_dir)

# -----------------------------------------------------------------------------
# WINDOWS DLL FIX (CRITICAL for Portable CUDA)
# Python 3.8+ on Windows requires explicit DLL directory addition.
# -----------------------------------------------------------------------------
if os.name == 'nt' and hasattr(os, 'add_dll_directory'):
    try:
        from pathlib import Path
        # Scan sys.path for site-packages
        for p in sys.path:
            path_obj = Path(p)
            if path_obj.name == 'site-packages' and path_obj.exists():
                nvidia_path = path_obj / "nvidia"
                if nvidia_path.exists():
                    for subdir in nvidia_path.iterdir():
                        # Add 'bin' folder from each nvidia stub (cublas, cudnn, etc.)
                        bin_path = subdir / "bin"
                        if bin_path.exists():
                            os.add_dll_directory(str(bin_path))
                # Also try adding site-packages itself just in case
                # os.add_dll_directory(str(path_obj))
                break
    except Exception:
        pass
# -----------------------------------------------------------------------------

from PySide6.QtWidgets import QApplication, QFileDialog, QMessageBox
from PySide6.QtCore import QObject, Slot, Signal, QThread, Qt, QUrl
from PySide6.QtQml import QQmlApplicationEngine
@@ -19,6 +44,7 @@ from src.ui.bridge import UIBridge
from src.ui.tray import SystemTray
from src.core.audio_engine import AudioEngine
from src.core.transcriber import WhisperTranscriber
from src.core.llm_engine import LLMEngine
from src.core.hotkey_manager import HotkeyManager
from src.core.config import ConfigManager
from src.utils.injector import InputInjector
@@ -54,6 +80,21 @@ try:
except:
    pass

# Detect Windows "Reduce Motion" preference
try:
    import ctypes
    SPI_GETCLIENTAREAANIMATION = 0x1042
    animation_enabled = ctypes.c_bool(True)
    ctypes.windll.user32.SystemParametersInfoW(
        SPI_GETCLIENTAREAANIMATION, 0,
        ctypes.byref(animation_enabled), 0
    )
    if not animation_enabled.value:
        ConfigManager().data["reduce_motion"] = True
        ConfigManager().save()
except Exception:
    pass

# Configure Logging
class QmlLoggingHandler(logging.Handler, QObject):
    sig_log = Signal(str)
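The hunk above probes Windows' "client area animation" setting through `SystemParametersInfoW(SPI_GETCLIENTAREAANIMATION, ...)` and maps it onto the app's `reduce_motion` flag. A guarded sketch of the same probe that degrades to "animations enabled" on non-Windows platforms (the function name is illustrative, not from the repo):

```python
import ctypes
import sys

SPI_GETCLIENTAREAANIMATION = 0x1042

def animations_enabled() -> bool:
    """Return Windows' 'client area animation' preference.

    Calls SystemParametersInfoW(SPI_GETCLIENTAREAANIMATION); on
    non-Windows platforms, or if the call fails, assume animations
    are enabled so nothing is disabled by accident.
    """
    if sys.platform != "win32":
        return True
    try:
        flag = ctypes.c_bool(True)
        ctypes.windll.user32.SystemParametersInfoW(
            SPI_GETCLIENTAREAANIMATION, 0, ctypes.byref(flag), 0
        )
        return bool(flag.value)
    except Exception:
        return True

print(animations_enabled())
```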
@@ -87,7 +128,7 @@ def _silent_shutdown_hook(exc_type, exc_value, exc_tb):
sys.excepthook = _silent_shutdown_hook

class DownloadWorker(QThread):
    """Background worker for model downloads."""
    """Background worker for model downloads with REAL progress."""
    progress = Signal(int)
    finished = Signal()
    error = Signal(str)
@@ -98,33 +139,144 @@ class DownloadWorker(QThread):

    def run(self):
        try:
            from faster_whisper import download_model
            import requests
            from tqdm import tqdm
            model_path = get_models_path()
            # Download to a specific subdirectory to keep things clean and predictable
            # This matches the logic in transcriber.py which looks for this specific path
            # Determine what to download
            dest_dir = model_path / f"faster-whisper-{self.model_name}"
            logging.info(f"Downloading Model '{self.model_name}' to {dest_dir}...")
            repo_id = f"Systran/faster-whisper-{self.model_name}"
            files = ["config.json", "model.bin", "tokenizer.json", "vocabulary.json"]
            base_url = f"https://huggingface.co/{repo_id}/resolve/main"

            # Ensure parent exists
            model_path.mkdir(parents=True, exist_ok=True)
            dest_dir.mkdir(parents=True, exist_ok=True)
            logging.info(f"Downloading {self.model_name} to {dest_dir}...")

            # output_dir in download_model specifies where the model files are saved
            download_model(self.model_name, output_dir=str(dest_dir))
            # 1. Calculate Total Size
            total_size = 0
            file_sizes = {}

            with requests.Session() as s:
                for fname in files:
                    url = f"{base_url}/{fname}"
                    head = s.head(url, allow_redirects=True)
                    if head.status_code == 200:
                        size = int(head.headers.get('content-length', 0))
                        file_sizes[fname] = size
                        total_size += size
                    else:
                        # Fallback for vocabulary.json vs vocabulary.txt
                        if fname == "vocabulary.json":
                            # Try .txt? Or just skip if not found?
                            # Faster-whisper usually has vocabulary.json
                            pass

            # 2. Download loop
            downloaded_bytes = 0

            with requests.Session() as s:
                for fname in files:
                    if fname not in file_sizes: continue

                    url = f"{base_url}/{fname}"
                    dest_file = dest_dir / fname

                    # Resume check?
                    # Simpler to just overwrite for reliability unless we want complex resume logic.
                    # We'll overwrite.

                    resp = s.get(url, stream=True)
                    resp.raise_for_status()

                    with open(dest_file, 'wb') as f:
                        for chunk in resp.iter_content(chunk_size=8192):
                            if chunk:
                                f.write(chunk)
                                downloaded_bytes += len(chunk)

                                # Emit Progress
                                if total_size > 0:
                                    pct = int((downloaded_bytes / total_size) * 100)
                                    self.progress.emit(pct)

            self.finished.emit()

        except Exception as e:
            logging.error(f"Download failed: {e}")
            self.error.emit(str(e))

class LLMDownloadWorker(QThread):
    progress = Signal(int)
    finished = Signal()
    error = Signal(str)

    def __init__(self, parent=None):
        super().__init__(parent)

    def run(self):
        try:
            import requests
            # Support one model for now
            url = "https://huggingface.co/hugging-quants/Llama-3.2-1B-Instruct-Q4_K_M-GGUF/resolve/main/llama-3.2-1b-instruct-q4_k_m.gguf?download=true"
            fname = "llama-3.2-1b-instruct-q4_k_m.gguf"

            model_path = get_models_path() / "llm" / "llama-3.2-1b-instruct"
            model_path.mkdir(parents=True, exist_ok=True)
            dest_file = model_path / fname

            # Simple check if exists and > 0 size?
            # We assume if the user clicked download, they want to download it.

            with requests.Session() as s:
                head = s.head(url, allow_redirects=True)
                total_size = int(head.headers.get('content-length', 0))

                resp = s.get(url, stream=True)
                resp.raise_for_status()

                downloaded = 0
                with open(dest_file, 'wb') as f:
                    for chunk in resp.iter_content(chunk_size=8192):
                        if chunk:
                            f.write(chunk)
                            downloaded += len(chunk)
                            if total_size > 0:
                                pct = int((downloaded / total_size) * 100)
                                self.progress.emit(pct)

            self.finished.emit()

        except Exception as e:
            logging.error(f"LLM Download failed: {e}")
            self.error.emit(str(e))

class LLMWorker(QThread):
    finished = Signal(str)

    def __init__(self, llm_engine, text, mode, parent=None):
        super().__init__(parent)
        self.llm_engine = llm_engine
        self.text = text
        self.mode = mode

    def run(self):
        try:
            corrected = self.llm_engine.correct_text(self.text, self.mode)
            self.finished.emit(corrected)
        except Exception as e:
            logging.error(f"LLMWorker crashed: {e}")
            self.finished.emit(self.text)  # Fail safe: return original text


class TranscriptionWorker(QThread):
    finished = Signal(str)
    def __init__(self, transcriber, audio_data, is_file=False, parent=None):
    def __init__(self, transcriber, audio_data, is_file=False, parent=None, task_override=None):
        super().__init__(parent)
        self.transcriber = transcriber
        self.audio_data = audio_data
        self.is_file = is_file
        self.task_override = task_override
    def run(self):
        text = self.transcriber.transcribe(self.audio_data, is_file=self.is_file)
        text = self.transcriber.transcribe(self.audio_data, is_file=self.is_file, task=self.task_override)
        self.finished.emit(text)

class WhisperApp(QObject):
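The new `DownloadWorker` first sums the `Content-Length` of every model file via HEAD requests, then emits a single percentage over the combined byte count while streaming the GET responses. The percentage math, isolated as a pure function (name hypothetical):

```python
def overall_progress(downloaded_bytes: int, total_size: int) -> int:
    """Percentage of a multi-file download, clamped to 0-100.

    total_size is the sum of Content-Length across all files; return 0
    when the size is unknown so callers never divide by zero.
    """
    if total_size <= 0:
        return 0
    return min(100, int((downloaded_bytes / total_size) * 100))

print(overall_progress(512, 2048))  # → 25
```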
@@ -156,6 +308,7 @@
        self.bridge.settingChanged.connect(self.on_settings_changed)
        self.bridge.hotkeysEnabledChanged.connect(self.on_hotkeys_enabled_toggle)
        self.bridge.downloadRequested.connect(self.on_download_requested)
        self.bridge.llmDownloadRequested.connect(self.on_llm_download_requested)

        self.engine.rootContext().setContextProperty("ui", self.bridge)
@@ -166,13 +319,20 @@
        self.tray.transcribe_file_requested.connect(self.transcribe_file)

        # Init Tooltip
        hotkey = self.config.get("hotkey")
        self.tray.setToolTip(f"Whisper Voice - Press {hotkey} to Record")
        from src.utils.formatters import format_hotkey
        self.format_hotkey = format_hotkey  # Store ref

        hk1 = self.format_hotkey(self.config.get("hotkey"))
        hk2 = self.format_hotkey(self.config.get("hotkey_translate"))
        self.tray.setToolTip(f"Whisper Voice\nTranscribe: {hk1}\nTranslate: {hk2}")

        # 3. Logic Components Placeholders
        self.audio_engine = None
        self.transcriber = None
        self.hotkey_manager = None
        self.llm_engine = None
        self.hk_transcribe = None
        self.hk_correct = None
        self.hk_translate = None
        self.overlay_root = None

        # 4. Start Loader
@@ -222,12 +382,23 @@
        self.settings_root.setVisible(False)

        # Install Low-Level Window Hook for Transparent Hit Test
        # We must keep a reference to 'self.hook' so it isn't GC'd
        # scale = self.overlay_root.devicePixelRatio()
        # self.hook = WindowHook(int(self.overlay_root.winId()), 500, 300, scale)
        # self.hook.install()
        try:
            from src.utils.window_hook import WindowHook
            hwnd = self.overlay_root.winId()
            # Initial scale from config
            scale = float(self.config.get("ui_scale"))

            # NOTE: HitTest hook will be installed here later
            # Current Overlay Dimensions
            win_w = int(460 * scale)
            win_h = int(180 * scale)

            self.window_hook = WindowHook(hwnd, win_w, win_h, initial_scale=scale)
            self.window_hook.install()

            # Initial state: Disabled because we start inactive
            self.window_hook.set_enabled(False)
        except Exception as e:
            logging.error(f"Failed to install WindowHook: {e}")

    def center_overlay(self):
        """Calculates and sets the Overlay position above the taskbar."""
@@ -255,14 +426,77 @@
        self.audio_engine.set_visualizer_callback(self.bridge.update_amplitude)
        self.audio_engine.set_silence_callback(self.on_silence_detected)
        self.transcriber = WhisperTranscriber()
        self.hotkey_manager = HotkeyManager()
        self.hotkey_manager.triggered.connect(self.toggle_recording)
        self.hotkey_manager.start()
        self.llm_engine = LLMEngine()

        # Dual Hotkey Managers
        self.hk_transcribe = HotkeyManager(config_key="hotkey")
        self.hk_transcribe.triggered.connect(lambda: self.toggle_recording(task_override="transcribe", task_mode="standard"))
        self.hk_transcribe.start()

        self.hk_correct = HotkeyManager(config_key="hotkey_correct")
        self.hk_correct.triggered.connect(lambda: self.toggle_recording(task_override="transcribe", task_mode="correct"))
        self.hk_correct.start()

        self.hk_translate = HotkeyManager(config_key="hotkey_translate")
        self.hk_translate.triggered.connect(lambda: self.toggle_recording(task_override="translate", task_mode="standard"))
        self.hk_translate.start()

        self.bridge.update_status("Ready")

    def run(self):
        sys.exit(self.qt_app.exec())

    @Slot(str, str)
    @Slot(str)
    def toggle_recording(self, task_override=None, task_mode="standard"):
        """
        task_override: 'transcribe' or 'translate' (passed to whisper)
        task_mode: 'standard' or 'correct' (determines post-processing)
        """
        if task_mode == "correct":
            self.current_task_requires_llm = True
        elif task_mode == "standard":
            self.current_task_requires_llm = False  # Explicit reset

        # Actual Logic
        if self.bridge.isRecording:
            logging.info("Stopping recording...")
            # stop_recording returns the numpy array directly
            audio_data = self.audio_engine.stop_recording()

            self.bridge.isRecording = False
            self.bridge.update_status("Processing...")
            self.bridge.isProcessing = True

            # Save task override for processing
            self.last_task_override = task_override

            if audio_data is not None and len(audio_data) > 0:
                # Use the task that started this session, or the override if provided
                final_task = getattr(self, "current_recording_task", self.config.get("task"))
                if task_override: final_task = task_override

                self.worker = TranscriptionWorker(self.transcriber, audio_data, parent=self, task_override=final_task)
                self.worker.finished.connect(self.on_transcription_done)
                self.worker.start()
            else:
                self.bridge.update_status("Ready")
                self.bridge.isProcessing = False

        else:
            # START RECORDING
            if self.bridge.isProcessing:
                logging.warning("Ignored toggle request: Transcription in progress.")
                return

            intended_task = task_override if task_override else self.config.get("task")
            self.current_recording_task = intended_task

            logging.info(f"Starting recording... (Task: {intended_task}, Mode: {task_mode})")
            self.audio_engine.start_recording()
            self.bridge.isRecording = True
            self.bridge.update_status(f"Recording ({intended_task})...")

    @Slot()
    def quit_app(self):
        logging.info("Shutting down...")
@@ -275,7 +509,8 @@
            except: pass
        self.bridge.stats_worker.stop()

        if self.hotkey_manager: self.hotkey_manager.stop()
        if self.hk_transcribe: self.hk_transcribe.stop()
        if self.hk_translate: self.hk_translate.stop()

        # Close all QML windows to ensure bindings stop before Python objects die
        if self.overlay_root:
@@ -350,10 +585,16 @@
        print(f"Setting Changed: {key} = {value}")

        # 1. Hotkey Reload
        if key == "hotkey":
            if self.hotkey_manager: self.hotkey_manager.reload_hotkey()
        if key in ["hotkey", "hotkey_translate", "hotkey_correct"]:
            if self.hk_transcribe: self.hk_transcribe.reload_hotkey()
            if self.hk_correct: self.hk_correct.reload_hotkey()
            if self.hk_translate: self.hk_translate.reload_hotkey()

            if self.tray:
                self.tray.setToolTip(f"Whisper Voice - Press {value} to Record")
                hk1 = self.format_hotkey(self.config.get("hotkey"))
                hk3 = self.format_hotkey(self.config.get("hotkey_correct"))
                hk2 = self.format_hotkey(self.config.get("hotkey_translate"))
                self.tray.setToolTip(f"Whisper Voice\nTranscribe: {hk1}\nCorrect: {hk3}\nTranslate: {hk2}")

        # 2. AI Model Reload (Heavy)
        if key in ["model_size", "compute_device", "compute_type"]:
@@ -456,6 +697,8 @@
        file_path, _ = QFileDialog.getOpenFileName(None, "Select Audio", "", "Audio (*.mp3 *.wav *.flac *.m4a *.ogg)")
        if file_path:
            self.bridge.update_status("Thinking...")
            # Files use the default configured task usually, or we could ask?
            # Default to config setting for files.
            self.worker = TranscriptionWorker(self.transcriber, file_path, is_file=True, parent=self)
            self.worker.finished.connect(self.on_transcription_done)
            self.worker.start()
@@ -463,48 +706,73 @@
    @Slot()
    def on_silence_detected(self):
        from PySide6.QtCore import QMetaObject, Qt
        # Silence detection always triggers the task that was active?
        # Since silence stops recording, it just calls toggle_recording with no arg, using the stored current_task?
        # Let's ensure toggle_recording handles no-arg calls by stopping the CURRENT task.
        QMetaObject.invokeMethod(self, "toggle_recording", Qt.QueuedConnection)

    @Slot()
    def toggle_recording(self):
        if not self.audio_engine: return

        # Prevent starting a new recording while we are still transcribing the last one
        if self.bridge.isProcessing:
            logging.warning("Ignored toggle request: Transcription in progress.")
            return

        if self.audio_engine.recording:
            self.bridge.update_status("Thinking...")
            self.bridge.isRecording = False
            self.bridge.isProcessing = True  # Start Processing
            audio_data = self.audio_engine.stop_recording()
            self.worker = TranscriptionWorker(self.transcriber, audio_data, parent=self)
            self.worker.finished.connect(self.on_transcription_done)
            self.worker.start()
        else:
            self.bridge.update_status("Recording")
            self.bridge.isRecording = True
            self.audio_engine.start_recording()

    @Slot(bool)
    def on_ui_toggle_request(self, state):
        if state != self.audio_engine.recording:
            self.toggle_recording()
            self.toggle_recording()  # Default behavior for UI clicks

    @Slot(str)
    def on_transcription_done(self, text: str):
        self.bridge.update_status("Ready")
        self.bridge.isProcessing = False  # End Processing
        self.bridge.isProcessing = False  # Temporarily false? No, keep it true if we chain.

        # Check LLM Settings -> AND check if the current task requested it
        llm_enabled = self.config.get("llm_enabled")
        requires_llm = getattr(self, "current_task_requires_llm", False)

        # We only correct if:
        # 1. LLM is globally enabled (safety switch)
        # 2. current_task_requires_llm is True (triggered by Correct hotkey)
        # OR 3. Maybe user WANTS global correction? Ideally user uses separate hotkey.
        # Let's say: If "Correction" is enabled in settings, does it apply to ALL?
        # The user's feedback suggests they DON'T want it on regular hotkey.
        # So we enforce: Correct Hotkey -> Corrects. Regular Hotkey -> Raw.
        # BUT we must handle the case where user expects the old behavior?
        # Let's make it strict: Only correct if triggered by correct hotkey OR if we add a "Correct All" toggle later.
        # For now, let's respect the flag. But wait, if llm_enabled is OFF, we shouldn't run it even if hotkey pressed?
        # Yes, safety switch.

        if text and llm_enabled and requires_llm:
            # Chain to LLM
            self.bridge.isProcessing = True
            self.bridge.update_status("Correcting...")
            mode = self.config.get("llm_mode")
            self.llm_worker = LLMWorker(self.llm_engine, text, mode, parent=self)
            self.llm_worker.finished.connect(self.on_llm_done)
            self.llm_worker.start()
            return

        self.bridge.isProcessing = False
        if text:
            method = self.config.get("input_method")
            speed = int(self.config.get("typing_speed"))
            InputInjector.inject_text(text, method, speed)

    @Slot(str)
    def on_llm_done(self, text: str):
        self.bridge.update_status("Ready")
        self.bridge.isProcessing = False
        if text:
            method = self.config.get("input_method")
            speed = int(self.config.get("typing_speed"))
            InputInjector.inject_text(text, method, speed)

        # Cleanup
        if hasattr(self, 'llm_worker') and self.llm_worker:
            self.llm_worker.deleteLater()
            self.llm_worker = None

    @Slot(bool)
    def on_hotkeys_enabled_toggle(self, state):
        if self.hotkey_manager:
            self.hotkey_manager.set_enabled(state)
        if self.hk_transcribe: self.hk_transcribe.set_enabled(state)
        if self.hk_translate: self.hk_translate.set_enabled(state)

    @Slot(str)
    def on_download_requested(self, size):
@@ -519,6 +787,19 @@
        self.download_worker.error.connect(self.on_download_error)
        self.download_worker.start()

    @Slot()
    def on_llm_download_requested(self):
        if self.bridge.isDownloading: return

        self.bridge.update_status("Downloading LLM...")
        self.bridge.isDownloading = True

        self.llm_dl_worker = LLMDownloadWorker(parent=self)
        self.llm_dl_worker.progress.connect(self.on_loader_progress)  # Reuse existing progress slot? Yes.
        self.llm_dl_worker.finished.connect(self.on_download_finished)  # Reuses same cleanup
        self.llm_dl_worker.error.connect(self.on_download_error)
        self.llm_dl_worker.start()

    def on_download_finished(self):
        self.bridge.isDownloading = False
        self.bridge.update_status("Ready")
@@ -531,6 +812,25 @@
        self.bridge.update_status("Error")
        logging.error(f"Download Error: {err}")

    @Slot(bool)
    def on_ui_toggle_request(self, is_recording):
        """Called when recording state changes."""
        # Update Window Hook to allow clicking if active
        is_active = is_recording or self.bridge.isProcessing
        if hasattr(self, 'window_hook'):
            self.window_hook.set_enabled(is_active)

    @Slot(bool)
    def on_processing_changed(self, is_processing):
        is_active = self.bridge.isRecording or is_processing
        if hasattr(self, 'window_hook'):
            self.window_hook.set_enabled(is_active)

if __name__ == "__main__":
    import sys
    app = WhisperApp()
    app.run()

    # Connect extra signal for processing state
    app.bridge.isProcessingChanged.connect(app.on_processing_changed)

    sys.exit(app.run())
@@ -1,88 +0,0 @@
|
||||
"""
|
||||
Portable Build Script for WhisperVoice.
|
||||
=======================================
|
||||
|
||||
Creates a single-file portable .exe using PyInstaller.
|
||||
All data (settings, models) will be stored next to the .exe at runtime.
|
||||
"""
|
||||
|
||||
import os
|
||||
import shutil
|
||||
import PyInstaller.__main__
|
||||
from pathlib import Path
|
||||
|
||||
def build_portable():
|
||||
# 1. Setup Paths
|
||||
project_root = Path(__file__).parent.absolute()
|
||||
dist_path = project_root / "dist"
|
||||
build_path = project_root / "build"
|
||||
|
||||
# 2. Define Assets to bundle (into the .exe)
|
||||
# Format: (Source, Destination relative to bundle root)
|
||||
data_files = [
|
||||
# QML files
|
||||
("src/ui/qml/*.qml", "src/ui/qml"),
|
||||
("src/ui/qml/*.svg", "src/ui/qml"),
|
||||
("src/ui/qml/*.qsb", "src/ui/qml"),
|
||||
("src/ui/qml/fonts/ttf/*.ttf", "src/ui/qml/fonts/ttf"),
|
||||
# Subprocess worker script (CRITICAL for transcription)
|
||||
("src/core/transcribe_worker.py", "src/core"),
|
||||
]
|
||||
|
||||
# Convert to PyInstaller format "--add-data source;dest" (Windows uses ';')
|
||||
add_data_args = []
|
||||
for src, dst in data_files:
|
||||
add_data_args.extend(["--add-data", f"{src}{os.pathsep}{dst}"])
|
||||
|
||||
# 3. Run PyInstaller
|
||||
print("🚀 Starting Portable Build...")
|
||||
print("⏳ This may take 5-10 minutes...")
|
||||
|
||||
PyInstaller.__main__.run([
|
||||
"main.py", # Entry point
|
||||
"--name=WhisperVoice", # EXE name
|
||||
"--onefile", # Single EXE (slower startup but portable)
|
||||
"--noconsole", # No terminal window
|
||||
"--clean", # Clean cache
|
||||
*add_data_args, # Bundled assets
|
||||
|
||||
# Heavy libraries that need special collection
|
||||
"--collect-all", "faster_whisper",
|
||||
"--collect-all", "ctranslate2",
|
||||
"--collect-all", "PySide6",
|
||||
"--collect-all", "torch",
|
||||
"--collect-all", "numpy",
|
||||
|
||||
# Hidden imports (modules imported dynamically)
|
||||
"--hidden-import", "keyboard",
|
||||
"--hidden-import", "pyperclip",
|
||||
"--hidden-import", "psutil",
|
||||
"--hidden-import", "pynvml",
|
||||
"--hidden-import", "sounddevice",
|
||||
"--hidden-import", "scipy",
|
||||
"--hidden-import", "scipy.signal",
|
||||
"--hidden-import", "huggingface_hub",
|
||||
"--hidden-import", "tokenizers",
|
||||
|
||||
# Qt plugins
|
||||
"--hidden-import", "PySide6.QtQuickControls2",
|
||||
"--hidden-import", "PySide6.QtQuick.Controls",
|
||||
|
||||
# Icon (convert to .ico for Windows)
|
||||
# "--icon=icon.ico", # Uncomment if you have a .ico file
|
||||
])
|
||||
|
||||
print("\n" + "="*60)
|
||||
print("✅ BUILD COMPLETE!")
|
||||
print("="*60)
|
||||
print(f"\n📍 Output: {dist_path / 'WhisperVoice.exe'}")
|
||||
print("\n📋 First run instructions:")
|
||||
print(" 1. Place WhisperVoice.exe in a folder (e.g., C:\\WhisperVoice\\)")
|
||||
print(" 2. Run it - it will create 'models' and 'settings.json' folders")
|
||||
print(" 3. The app will download the Whisper model on first transcription\n")
|
||||
print("💡 TIP: Keep the .exe with its generated files for true portability!")
|
||||
|
||||
if __name__ == "__main__":
|
||||
# Ensure we are in project root
|
||||
os.chdir(Path(__file__).parent)
|
||||
build_portable()
|
||||
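The `--add-data` conversion above hinges on `os.pathsep` being the pair separator PyInstaller expects (`;` on Windows, `:` on POSIX). A standalone sketch of just that conversion, using one asset pair from the build script:

```python
import os

# One asset pair from the build script's data_files table.
data_files = [
    ("src/ui/qml/fonts/ttf/*.ttf", "src/ui/qml/fonts/ttf"),
]

# PyInstaller wants each pair flattened into a single "SRC<sep>DEST" string,
# where <sep> is os.pathsep (';' on Windows, ':' elsewhere).
add_data_args = []
for src, dst in data_files:
    add_data_args.extend(["--add-data", f"{src}{os.pathsep}{dst}"])

print(add_data_args[0])  # → --add-data
```

Using `os.pathsep` keeps the build script itself portable: the same code emits `;`-separated pairs on Windows and `:`-separated pairs elsewhere.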
@@ -5,6 +5,7 @@
faster-whisper>=1.0.0
torch>=2.0.0

# UI Framework
PySide6>=6.6.0

@@ -28,3 +29,6 @@ huggingface-hub>=0.20.0
pystray>=0.19.0
Pillow>=10.0.0
darkdetect>=0.8.0

# LLM / Correction
llama-cpp-python>=0.2.20

run.bat (5 lines removed)
@@ -1,5 +0,0 @@
@echo off
echo [LAUNCHER] Starting Fake Blur UI (Python/Qt)...
call venv\Scripts\activate.bat
python main.py
if %errorlevel% neq 0 pause
@@ -16,6 +16,8 @@ from src.core.paths import get_base_path
# Default Configuration
DEFAULT_SETTINGS = {
    "hotkey": "f8",
    "hotkey_translate": "f10",
    "hotkey_correct": "f9",  # New: Transcribe + Correct
    "model_size": "small",
    "input_device": None,  # Device ID (int) or Name (str), None = Default
    "save_recordings": False,  # Save .wav files for debugging
@@ -38,13 +40,28 @@ DEFAULT_SETTINGS = {

    # AI - Advanced
    "language": "auto",  # "auto" or ISO code
    "task": "transcribe",  # "transcribe" or "translate" (to English)
    "compute_device": "auto",  # "auto", "cuda", "cpu"
    "compute_type": "int8",  # "int8", "float16", "float32"
    "beam_size": 5,
    "best_of": 5,
    "vad_filter": True,
    "no_repeat_ngram_size": 0,
-   "condition_on_previous_text": True
+   "condition_on_previous_text": True,
+   "initial_prompt": "Mm-hmm. Okay, let's go. I speak in full sentences.",  # Default: Forces punctuation

    # LLM Correction
    "llm_enabled": False,
    "llm_mode": "Standard",  # "Grammar", "Standard", "Rewrite"
    "llm_model_name": "llama-3.2-1b-instruct",

    # Low VRAM Mode
    "unload_models_after_use": False,  # If True, models are unloaded immediately to free VRAM

    # Accessibility
    "reduce_motion": False  # Disable animations for WCAG 2.3.3
}


class ConfigManager:
@@ -94,9 +111,9 @@ class ConfigManager:
        except Exception as e:
            logging.error(f"Failed to save settings: {e}")

-   def get(self, key: str) -> Any:
+   def get(self, key: str, default: Any = None) -> Any:
        """Get a setting value."""
-       return self.data.get(key, DEFAULT_SETTINGS.get(key))
+       return self.data.get(key, DEFAULT_SETTINGS.get(key, default))
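The new `default` parameter on `ConfigManager.get` creates a three-level fallback: the user's saved settings, then `DEFAULT_SETTINGS`, then the caller's default. A minimal standalone sketch of that lookup chain (the dict contents here are a hypothetical subset):

```python
DEFAULT_SETTINGS = {"beam_size": 5, "vad_filter": True}  # hypothetical subset

def get_setting(data, key, default=None):
    # dict.get's second argument is only used when the key is absent,
    # so a saved user value always wins over the built-in defaults.
    return data.get(key, DEFAULT_SETTINGS.get(key, default))

saved = {"beam_size": 1}
print(get_setting(saved, "beam_size"))      # → 1    (user override)
print(get_setting(saved, "vad_filter"))     # → True (built-in default)
print(get_setting(saved, "llm_ctx", 2048))  # → 2048 (caller-supplied default)
```

The caller-supplied default only matters for keys missing from both dicts, which is exactly what the `llm_model_name` lookup in the new LLM engine relies on.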
@@ -1,31 +0,0 @@
@echo off
echo [DEBUG] LAUNCHER STARTED
echo [DEBUG] CWD: %CD%
echo [DEBUG] Python Path (expected relative): ..\python\python.exe

REM Read stdin to a file to verify data input (optional debugging)
REM python.exe might be in a different relative path depending on where this bat is run
REM We assume this bat is in runtime/app/src/core/
REM So python is in ../../../python/python.exe

set PYTHON_EXE=..\..\..\python\python.exe

if exist "%PYTHON_EXE%" (
    echo [DEBUG] Found Python at %PYTHON_EXE%
) else (
    echo [ERROR] Python NOT found at %PYTHON_EXE%
    echo [ERROR] Listing relative directories:
    dir ..\..\..\
    pause
    exit /b 1
)

echo [DEBUG] Launching script: transcribe_worker.py
"%PYTHON_EXE%" transcribe_worker.py
if %ERRORLEVEL% NEQ 0 (
    echo [ERROR] Python script failed with code %ERRORLEVEL%
    pause
) else (
    echo [SUCCESS] Script finished.
    pause
)
@@ -30,15 +30,16 @@ class HotkeyManager(QObject):

    triggered = Signal()

-   def __init__(self, hotkey: str = "f8"):
+   def __init__(self, config_key: str = "hotkey"):
        """
        Initialize the HotkeyManager.

        Args:
-           hotkey (str): The global hotkey string description. Default: "f8".
+           config_key (str): The configuration key to look up (e.g. "hotkey").
        """
        super().__init__()
-       self.hotkey = hotkey
+       self.config_key = config_key
+       self.hotkey = "f8"  # Placeholder
        self.is_listening = False
        self._enabled = True

@@ -58,9 +59,9 @@ class HotkeyManager(QObject):

        from src.core.config import ConfigManager
        config = ConfigManager()
-       self.hotkey = config.get("hotkey")
+       self.hotkey = config.get(self.config_key)

-       logging.info(f"Registering global hotkey: {self.hotkey}")
+       logging.info(f"Registering global hotkey ({self.config_key}): {self.hotkey}")
        try:
            # We don't pass suppress=True here because we want the app to see keys during recording
            # (Wait, actually if we are recording we WANT keyboard to see it,
src/core/languages.py (new file, 120 lines)
@@ -0,0 +1,120 @@
"""
Supported Languages Module
==========================
Full list of languages supported by OpenAI Whisper.
Maps ISO codes to display names.
"""

LANGUAGES = {
    "auto": "Auto Detect",
    "af": "Afrikaans",
    "sq": "Albanian",
    "am": "Amharic",
    "ar": "Arabic",
    "hy": "Armenian",
    "as": "Assamese",
    "az": "Azerbaijani",
    "ba": "Bashkir",
    "eu": "Basque",
    "be": "Belarusian",
    "bn": "Bengali",
    "bs": "Bosnian",
    "br": "Breton",
    "bg": "Bulgarian",
    "my": "Burmese",
    "ca": "Catalan",
    "zh": "Chinese",
    "hr": "Croatian",
    "cs": "Czech",
    "da": "Danish",
    "nl": "Dutch",
    "en": "English",
    "et": "Estonian",
    "fo": "Faroese",
    "fi": "Finnish",
    "fr": "French",
    "gl": "Galician",
    "ka": "Georgian",
    "de": "German",
    "el": "Greek",
    "gu": "Gujarati",
    "ht": "Haitian",
    "ha": "Hausa",
    "haw": "Hawaiian",
    "he": "Hebrew",
    "hi": "Hindi",
    "hu": "Hungarian",
    "is": "Icelandic",
    "id": "Indonesian",
    "it": "Italian",
    "ja": "Japanese",
    "jw": "Javanese",
    "kn": "Kannada",
    "kk": "Kazakh",
    "km": "Khmer",
    "ko": "Korean",
    "lo": "Lao",
    "la": "Latin",
    "lv": "Latvian",
    "ln": "Lingala",
    "lt": "Lithuanian",
    "lb": "Luxembourgish",
    "mk": "Macedonian",
    "mg": "Malagasy",
    "ms": "Malay",
    "ml": "Malayalam",
    "mt": "Maltese",
    "mi": "Maori",
    "mr": "Marathi",
    "mn": "Mongolian",
    "ne": "Nepali",
    "no": "Norwegian",
    "oc": "Occitan",
    "pa": "Punjabi",
    "ps": "Pashto",
    "fa": "Persian",
    "pl": "Polish",
    "pt": "Portuguese",
    "ro": "Romanian",
    "ru": "Russian",
    "sa": "Sanskrit",
    "sr": "Serbian",
    "sn": "Shona",
    "sd": "Sindhi",
    "si": "Sinhala",
    "sk": "Slovak",
    "sl": "Slovenian",
    "so": "Somali",
    "es": "Spanish",
    "su": "Sundanese",
    "sw": "Swahili",
    "sv": "Swedish",
    "tl": "Tagalog",
    "tg": "Tajik",
    "ta": "Tamil",
    "tt": "Tatar",
    "te": "Telugu",
    "th": "Thai",
    "bo": "Tibetan",
    "tr": "Turkish",
    "tk": "Turkmen",
    "uk": "Ukrainian",
    "ur": "Urdu",
    "uz": "Uzbek",
    "vi": "Vietnamese",
    "cy": "Welsh",
    "yi": "Yiddish",
    "yo": "Yoruba",
}


def get_language_names():
    return list(LANGUAGES.values())


def get_code_by_name(name):
    for code, lang in LANGUAGES.items():
        if lang == name:
            return code
    return "auto"


def get_name_by_code(code):
    return LANGUAGES.get(code, "Auto Detect")
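`get_code_by_name` does a linear scan over `LANGUAGES`; with roughly 100 entries that is fine, but the same lookup can be made O(1) by inverting the map once. A sketch over a hypothetical three-entry subset:

```python
LANGUAGES = {"auto": "Auto Detect", "en": "English", "de": "German"}  # subset

# Invert once; the display names are unique, so this is lossless.
NAME_TO_CODE = {name: code for code, name in LANGUAGES.items()}

def get_code_by_name(name):
    # Unknown names fall back to "auto", matching the module's behavior.
    return NAME_TO_CODE.get(name, "auto")

print(get_code_by_name("German"))   # → de
print(get_code_by_name("Klingon"))  # → auto
```

For a settings combo box queried once per change, the scan and the inverted map are interchangeable; the inversion only matters if the lookup ever lands on a hot path.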
src/core/llm_engine.py (new file, 185 lines)
@@ -0,0 +1,185 @@
"""
LLM Engine Module
=================

Handles interaction with the local Llama 3.2 1B model for transcription correction.
Uses llama-cpp-python for efficient local inference.
"""

import os
import logging
from typing import Optional
from src.core.paths import get_models_path
from src.core.config import ConfigManager

try:
    from llama_cpp import Llama
except ImportError:
    Llama = None


class LLMEngine:
    """
    Manages the Llama model and performs text correction/rewriting.
    """
    def __init__(self):
        self.config = ConfigManager()
        self.model = None
        self.current_model_path = None

        # --- Mode 1: Grammar Only (Strict) ---
        self.prompt_grammar = (
            "You are a text correction tool. "
            "Correct the grammar/spelling. Do not change punctuation or capitalization styles. "
            "Do not remove any words (including profanity). Output ONLY the result."
            "\n\nExample:\nInput: 'damn it works'\nOutput: 'damn it works'"
        )

        # --- Mode 2: Standard (Grammar + Punctuation + Caps) ---
        self.prompt_standard = (
            "You are a text correction tool. "
            "Standardize the grammar, punctuation, and capitalization. "
            "Do not remove any words (including profanity). Output ONLY the result."
            "\n\nExample:\nInput: 'damn it works'\nOutput: 'Damn it works.'"
        )

        # --- Mode 3: Rewrite (Tone-Aware Polish) ---
        self.prompt_rewrite = (
            "You are a text rewriting tool. Improve flow/clarity but keep the exact tone and vocabulary. "
            "Do not remove any words (including profanity). Output ONLY the result."
            "\n\nExample:\nInput: 'damn it works'\nOutput: 'Damn, it works.'"
        )

    def load_model(self) -> bool:
        """
        Loads the LLM model if it exists.
        Returns True if successful, False otherwise.
        """
        if Llama is None:
            logging.error("llama-cpp-python not installed.")
            return False

        model_name = self.config.get("llm_model_name", "llama-3.2-1b-instruct")
        model_dir = get_models_path() / "llm" / model_name
        model_file = model_dir / "llama-3.2-1b-instruct-q4_k_m.gguf"

        if not model_file.exists():
            logging.warning(f"LLM Model not found at: {model_file}")
            return False

        if self.model and self.current_model_path == str(model_file):
            return True

        try:
            logging.info(f"Loading LLM from {model_file}...")
            n_gpu_layers = 0
            try:
                import torch
                if torch.cuda.is_available():
                    n_gpu_layers = -1
            except:
                pass

            self.model = Llama(
                model_path=str(model_file),
                n_gpu_layers=n_gpu_layers,
                n_ctx=2048,
                verbose=False
            )
            self.current_model_path = str(model_file)
            logging.info("LLM loaded successfully.")
            return True
        except Exception as e:
            logging.error(f"Failed to load LLM: {e}")
            self.model = None
            return False

    def correct_text(self, text: str, mode: str = "Standard") -> str:
        """Corrects or rewrites the provided text."""
        if not text or not text.strip():
            return text

        if not self.model:
            if not self.load_model():
                return text

        logging.info(f"LLM Processing ({mode}): '{text}'")

        system_prompt = self.prompt_standard
        if mode == "Grammar": system_prompt = self.prompt_grammar
        elif mode == "Rewrite": system_prompt = self.prompt_rewrite

        # PREFIX INJECTION TECHNIQUE
        # We end the prompt with the start of the assistant's answer, specifically phrased to force compliance.
        # "Here is the processed output:" forces it into a completion mode rather than a refusal mode.
        prefix_injection = "Here is the processed output:\n"

        prompt = (
            f"<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n{system_prompt}<|eot_id|>"
            f"<|start_header_id|>user<|end_header_id|>\n\nProcess this input:\n{text}<|eot_id|>"
            f"<|start_header_id|>assistant<|end_header_id|>\n\n{prefix_injection}"
        )

        try:
            output = self.model(
                prompt,
                max_tokens=512,
                stop=["<|eot_id|>"],
                echo=False,
                temperature=0.1
            )

            result = output['choices'][0]['text'].strip()

            # llama-cpp-python returns the *continuation* of the prefilled answer.
            # If it outputted "My corrected text.", the full logical response is
            # "Here is the processed output: My corrected text." We just want the result.

            # Refusal Detection (Safety Net)
            refusal_triggers = [
                "I cannot", "I can't", "I am unable", "I apologize", "sorry",
                "As an AI", "explicit content", "harmful content", "safety guidelines"
            ]
            lower_res = result.lower()
            if any(trig in lower_res for trig in refusal_triggers) and len(result) < 150:
                logging.warning(f"LLM Refusal Detected: '{result}'. Falling back to original.")
                return text  # Return original text on refusal!

            # --- Robust Post-Processing ---

            # 1. Strip quotes
            if result.startswith('"') and result.endswith('"') and len(result) > 2 and '"' not in result[1:-1]:
                result = result[1:-1]
            if result.startswith("'") and result.endswith("'") and len(result) > 2 and "'" not in result[1:-1]:
                result = result[1:-1]

            # 2. Split by newline
            if "\n" in result:
                lines = result.split('\n')
                clean_lines = [l.strip() for l in lines if l.strip()]
                if clean_lines:
                    result = clean_lines[0]

            # 3. Aggressive Preamble Stripping (updated for the new prefix)
            import re
            prefixes = [
                r"^Here is the processed output:?\s*",  # The one we injected
                r"^Here is the corrected text:?\s*",
                r"^Here is the rewritten text:?\s*",
                r"^Here's the result:?\s*",
                r"^Sure,? here is regex.*:?\s*",
                r"^Output:?\s*",
                r"^Processing result:?\s*",
            ]

            for p in prefixes:
                result = re.sub(p, "", result, flags=re.IGNORECASE).strip()

            if result.startswith('"') and result.endswith('"') and len(result) > 2 and '"' not in result[1:-1]:
                result = result[1:-1]

            logging.info(f"LLM Result: '{result}'")
            return result
        except Exception as e:
            logging.error(f"LLM inference failed: {e}")
            return text  # Fail safe logic
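The prefix-injection trick in `correct_text` is plain string assembly: the prompt ends mid-assistant-turn, so the model continues an already-started answer instead of deciding whether to give one. A sketch of just the template step (Llama 3 header tokens as used above; the texts are illustrative, no model is invoked):

```python
def build_prompt(system_prompt, user_text):
    # Ending with the assistant header plus a pre-written opening line
    # steers the model into completion mode rather than refusal mode.
    prefix_injection = "Here is the processed output:\n"
    return (
        f"<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n{system_prompt}<|eot_id|>"
        f"<|start_header_id|>user<|end_header_id|>\n\nProcess this input:\n{user_text}<|eot_id|>"
        f"<|start_header_id|>assistant<|end_header_id|>\n\n{prefix_injection}"
    )

p = build_prompt("You are a text correction tool.", "damn it works")
print(p.endswith("Here is the processed output:\n"))  # → True
```

Because the model's reply is the continuation of the injected prefix, the post-processing step then has to strip that same "Here is the processed output:" string if the model echoes it.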
@@ -15,8 +15,13 @@ import numpy as np
from src.core.config import ConfigManager
from src.core.paths import get_models_path

try:
    import torch
except ImportError:
    torch = None

# Import directly - valid since we are now running in the full environment
from faster_whisper import WhisperModel


class WhisperTranscriber:
    """
@@ -57,6 +62,8 @@ class WhisperTranscriber:
        # Force offline if path exists to avoid HF errors
        local_only = new_path.exists()

        try:
            from faster_whisper import WhisperModel
            self.model = WhisperModel(
                model_input,
                device=device,
@@ -64,6 +71,23 @@ class WhisperTranscriber:
                download_root=str(get_models_path()),
                local_files_only=local_only
            )
        except Exception as load_err:
            # CRITICAL FALLBACK: If CUDA/cublas fails (AMD/Intel users), fall back to CPU
            err_str = str(load_err).lower()
            if "cublas" in err_str or "cudnn" in err_str or "library" in err_str or "device" in err_str:
                logging.warning(f"CUDA Init Failed ({load_err}). Falling back to CPU...")
                self.config.set("compute_device", "cpu")  # Update config for persistence/UI
                self.current_compute_device = "cpu"

                self.model = WhisperModel(
                    model_input,
                    device="cpu",
                    compute_type="int8",  # CPU usually handles int8 well with newer extensions
                    download_root=str(get_models_path()),
                    local_files_only=local_only
                )
            else:
                raise load_err

        self.current_model_size = size
        self.current_compute_device = device
@@ -74,41 +98,119 @@ class WhisperTranscriber:
            logging.error(f"Failed to load model: {e}")
            self.model = None

            # Auto-Repair: Detect vocabulary/corrupt errors
            err_str = str(e).lower()
            if "vocabulary" in err_str or "tokenizer" in err_str or "config.json" in err_str:
                logging.warning("Corrupt model detected on load. Attempting to delete and reset...")
                try:
                    import shutil
                    # Differentiate between simple path and HF path
                    new_path = get_models_path() / f"faster-whisper-{size}"
                    if new_path.exists():
                        shutil.rmtree(new_path)
                        logging.info(f"Deleted corrupt model at {new_path}")
                    else:
                        # Try legacy HF path
                        hf_path = get_models_path() / f"models--Systran--faster-whisper-{size}"
                        if hf_path.exists():
                            shutil.rmtree(hf_path)
                            logging.info(f"Deleted corrupt HF model at {hf_path}")

                    # Notify UI to refresh state (will show 'Download' button now).
                    # We can't reach the bridge easily here without passing it in,
                    # but the UI polls or listens to logs.
                    # The user will simply see "Model Missing" in settings after this.
                except Exception as del_err:
                    logging.error(f"Failed to delete corrupt model: {del_err}")

-   def transcribe(self, audio_data, is_file: bool = False) -> str:
+   def transcribe(self, audio_data, is_file: bool = False, task: Optional[str] = None) -> str:
        """
        Transcribe audio data.
        """
-       logging.info(f"Starting transcription... (is_file={is_file})")
+       logging.info(f"Starting transcription... (is_file={is_file}, task={task})")

        # Ensure model is loaded
        if not self.model:
            self.load_model()
            if not self.model:
-               return "Error: Model failed to load."
+               return "Error: Model failed to load. Please check Settings -> Model Info."

        try:
            # Config
            beam_size = int(self.config.get("beam_size"))
            best_of = int(self.config.get("best_of"))
            vad = False if is_file else self.config.get("vad_filter")
            language = self.config.get("language")

            # Use task override if provided, otherwise config.
            # Ensure a safe, lowercase string ("transcribe" vs "Transcribe").
            raw_task = task if task else self.config.get("task")
            final_task = str(raw_task).strip().lower() if raw_task else "transcribe"

            # Sanity check for valid Whisper tasks
            if final_task not in ["transcribe", "translate"]:
                logging.warning(f"Invalid task '{final_task}' detected. Defaulting to 'transcribe'.")
                final_task = "transcribe"

            # Language handling
            final_language = language if language != "auto" else None

            # Anti-Hallucination: Force condition_on_previous_text=False for translation
            condition_prev = self.config.get("condition_on_previous_text")

            # Helper options for Translation Stability
            initial_prompt = self.config.get("initial_prompt")

            if final_task == "translate":
                condition_prev = False
                # Force beam search if the user has set it to greedy (1).
                # Translation requires more search breadth to find the English mapping.
                if beam_size < 5:
                    logging.info("Forcing beam_size=5 for Translation task.")
                    beam_size = 5

                # Inject guidance prompt if none exists
                if not initial_prompt:
                    initial_prompt = "Translate this to English."

            logging.info(f"Model Dispatch: Task='{final_task}', Language='{final_language}', ConditionPrev={condition_prev}, Beam={beam_size}")

            # Build arguments dynamically to avoid passing None if that's the issue
            transcribe_opts = {
                "beam_size": beam_size,
                "best_of": best_of,
                "vad_filter": vad,
                "task": final_task,
                "vad_parameters": dict(min_silence_duration_ms=500),
                "condition_on_previous_text": condition_prev,
                "without_timestamps": True
            }

            if initial_prompt:
                transcribe_opts["initial_prompt"] = initial_prompt

            # Only add language if it's explicitly set (not None/Auto).
            # This avoids potentially confusing the model with an explicit None.
            if final_language:
                transcribe_opts["language"] = final_language

            # Transcribe
-           segments, info = self.model.transcribe(
-               audio_data,
-               beam_size=beam_size,
-               best_of=best_of,
-               vad_filter=vad,
-               vad_parameters=dict(min_silence_duration_ms=500),
-               condition_on_previous_text=self.config.get("condition_on_previous_text"),
-               without_timestamps=True
-           )
+           segments, info = self.model.transcribe(audio_data, **transcribe_opts)

            # Aggregate text
            text_result = ""
            for segment in segments:
                text_result += segment.text + " "

-           return text_result.strip()
+           text_result = text_result.strip()
+
+           # Low VRAM Mode: Unload Whisper Model immediately
+           if self.config.get("unload_models_after_use"):
+               self.unload_model()
+
+           logging.info(f"Final Transcription Output: '{text_result}'")
+           return text_result

        except Exception as e:
            logging.error(f"Transcription failed: {e}")
@@ -117,7 +219,10 @@ class WhisperTranscriber:
    def model_exists(self, size: str) -> bool:
        """Checks if a model size is already downloaded."""
        new_path = get_models_path() / f"faster-whisper-{size}"
-       if (new_path / "config.json").exists():
+       if new_path.exists():
+           # Strict check
+           required = ["config.json", "model.bin", "vocabulary.json"]
+           if all((new_path / f).exists() for f in required):
                return True

        # Legacy HF cache check
@@ -127,3 +232,21 @@ class WhisperTranscriber:
            return True

        return False

    def unload_model(self):
        """
        Unloads the model to free memory.
        """
        if self.model:
            del self.model

        self.model = None
        self.current_model_size = None

        # Force garbage collection
        import gc
        gc.collect()
        if torch.cuda.is_available():
            torch.cuda.empty_cache()

        logging.info("Whisper Model unloaded (Low VRAM Mode).")
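The new call site builds `transcribe_opts` as a plain dict and only inserts optional keys when they carry real values, so `model.transcribe` never receives an explicit `None`. A stripped-down sketch of that pattern (parameter names as in the diff; no model involved):

```python
def build_opts(beam_size, language, initial_prompt):
    opts = {
        "beam_size": beam_size,
        "without_timestamps": True,
    }
    if initial_prompt:
        opts["initial_prompt"] = initial_prompt
    if language:  # "auto" is mapped to None upstream and therefore omitted here
        opts["language"] = language
    return opts

print(build_opts(5, None, None))
# → {'beam_size': 5, 'without_timestamps': True}
```

The dict is then splatted into the call with `model.transcribe(audio, **opts)`, which lets the library fall back to its own defaults for every omitted key instead of interpreting an explicit `None`.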
@@ -110,6 +110,8 @@ class UIBridge(QObject):
    logAppended = Signal(str)  # Emits new log line
    settingChanged = Signal(str, 'QVariant')
    modelStatesChanged = Signal()  # Notify UI to re-check isModelDownloaded
    llmDownloadRequested = Signal()
    reduceMotionChanged = Signal(bool)

    def __init__(self, parent=None):
        super().__init__(parent)
@@ -129,6 +131,7 @@ class UIBridge(QObject):
        self._app_vram_mb = 0.0
        self._app_vram_percent = 0.0
        self._is_destroyed = False
        self._reduce_motion = bool(ConfigManager().get("reduce_motion"))

        # Start QThread Stats Worker
        self.stats_worker = StatsWorker()
@@ -245,6 +248,26 @@ class UIBridge(QObject):

    # --- Methods called from QML ---

    @Slot(result=list)
    def get_supported_languages(self):
        from src.core.languages import get_language_names
        return get_language_names()

    @Slot(str)
    def set_language_by_name(self, name):
        from src.core.languages import get_code_by_name
        from src.core.config import ConfigManager
        code = get_code_by_name(name)
        ConfigManager().set("language", code)
        self.settingChanged.emit("language", code)

    @Slot(result=str)
    def get_current_language_name(self):
        from src.core.languages import get_name_by_code
        from src.core.config import ConfigManager
        code = ConfigManager().get("language")
        return get_name_by_code(code)

    @Slot(str, result='QVariant')
    def getSetting(self, key):
        from src.core.config import ConfigManager
@@ -256,6 +279,8 @@ class UIBridge(QObject):
        ConfigManager().set(key, value)
        if key == "ui_scale":
            self.uiScale = float(value)
        if key == "reduce_motion":
            self.reduceMotion = bool(value)
        self.settingChanged.emit(key, value)  # Notify listeners (e.g. Overlay)

    @Property(float, notify=uiScaleChanged)
@@ -267,6 +292,15 @@ class UIBridge(QObject):
        self._ui_scale = val
        self.uiScaleChanged.emit(val)

    @Property(bool, notify=reduceMotionChanged)
    def reduceMotion(self): return self._reduce_motion

    @reduceMotion.setter
    def reduceMotion(self, val):
        if self._reduce_motion != val:
            self._reduce_motion = val
            self.reduceMotionChanged.emit(val)

    @Property(float, notify=appCpuChanged)
    def appCpu(self): return self._app_cpu

@@ -336,11 +370,7 @@ class UIBridge(QObject):
        except Exception as e:
            logging.error(f"Failed to preload audio devices: {e}")

    @Slot()
    def toggle_recording(self):
        """Called by UI elements to trigger the app's recording logic."""
        # This will be connected to the main app's toggle logic
        pass

    @Property(bool, notify=isDownloadingChanged)
    def isDownloading(self): return self._is_downloading

@@ -356,9 +386,15 @@ class UIBridge(QObject):

        try:
            from src.core.paths import get_models_path

            # Check new simple format used by DownloadWorker
            path_simple = get_models_path() / f"faster-whisper-{size}"
-           if path_simple.exists() and any(path_simple.iterdir()):
+           if path_simple.exists():
+               # Strict check: Ensure all critical files exist
+               required = ["config.json", "model.bin", "vocabulary.json"]
+               if all((path_simple / f).exists() for f in required):
                    return True

            # Check HF Cache format (legacy/default)
@@ -366,16 +402,22 @@
            path_hf = get_models_path() / folder_name
            snapshots = path_hf / "snapshots"
            if snapshots.exists() and any(snapshots.iterdir()):
-               return True
+               return True  # Legacy cache structure is complex, assume valid if present

            # Check direct folder (simple)
            path_direct = get_models_path() / size
            if (path_direct / "config.json").exists():
                return True
            return False

        except Exception as e:
            logging.error(f"Error checking model status: {e}")
            return False

    @Slot(result=bool)
    def isLLMModelDownloaded(self):
        try:
            from src.core.paths import get_models_path
            # Hardcoded check for the 1B model we support
            model_file = get_models_path() / "llm" / "llama-3.2-1b-instruct" / "llama-3.2-1b-instruct-q4_k_m.gguf"
            return model_file.exists()
        except:
            return False

    @Slot(str)
@@ -385,3 +427,7 @@ class UIBridge(QObject):
    @Slot()
    def notifyModelStatesChanged(self):
        self.modelStatesChanged.emit()

    @Slot()
    def downloadLLM(self):
        self.llmDownloadRequested.emit()
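The strict check in `isModelDownloaded` replaces "folder is non-empty" with "every critical file exists", which catches interrupted downloads that leave a partial folder behind. A standalone sketch of that check using a temp directory (the required file names are taken from the diff):

```python
import tempfile
from pathlib import Path

REQUIRED = ["config.json", "model.bin", "vocabulary.json"]

def model_complete(folder: Path) -> bool:
    # A non-empty folder is not enough: a partial download can leave
    # config.json behind without model.bin.
    return folder.exists() and all((folder / f).exists() for f in REQUIRED)

with tempfile.TemporaryDirectory() as tmp:
    model_dir = Path(tmp) / "faster-whisper-small"
    model_dir.mkdir()
    (model_dir / "config.json").touch()  # simulate an interrupted download
    print(model_complete(model_dir))     # → False
    for name in REQUIRED:
        (model_dir / name).touch()
    print(model_complete(model_dir))     # → True
```

The same `required` list appears in both `model_exists` (transcriber) and `isModelDownloaded` (UI bridge), so the UI's "Downloaded" state and the engine's load path agree on what counts as a complete model.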
@@ -1,210 +0,0 @@
|
||||
"""
|
||||
Modern Components Library.
|
||||
==========================
|
||||
|
||||
Contains custom-painted widgets that move beyond the standard 'amateur' Qt look.
|
||||
Implements smooth animations, hardware acceleration, and glassmorphism.
|
||||
"""
|
||||
|
||||
from PySide6.QtWidgets import (
|
||||
QPushButton, QWidget, QVBoxLayout, QHBoxLayout,
|
||||
QLabel, QGraphicsDropShadowEffect, QFrame, QAbstractButton
|
||||
)
|
||||
from PySide6.QtCore import Qt, QPropertyAnimation, QEasingCurve, Property, QRect, QPoint, Signal, Slot
|
||||
from PySide6.QtGui import QPainter, QColor, QBrush, QPen, QLinearGradient, QFont
|
||||
|
||||
from src.ui.styles import Theme
|
||||
|
||||
class GlassButton(QPushButton):
|
||||
"""A premium button with gradient hover effects and smooth scaling."""
|
||||
|
||||
def __init__(self, text, parent=None, accent_color=Theme.ACCENT_CYAN):
|
||||
super().__init__(text, parent)
|
||||
self.accent = QColor(accent_color)
|
||||
self.setCursor(Qt.PointingHandCursor)
|
||||
self.setFixedHeight(40)
|
||||
self._hover_opacity = 0.0
|
||||
|
||||
self.setStyleSheet(f"""
|
||||
QPushButton {{
|
||||
background-color: rgba(255, 255, 255, 0.05);
|
||||
border: 1px solid {Theme.BORDER_SUBTLE};
|
||||
color: {Theme.TEXT_SECONDARY};
|
||||
border-radius: 8px;
|
||||
padding: 0 20px;
|
||||
font-size: 13px;
|
||||
font-weight: 600;
|
||||
}}
|
||||
""")
|
||||
|
||||
# Hover Animation
|
||||
self.anim = QPropertyAnimation(self, b"hover_opacity")
|
||||
self.anim.setDuration(200)
|
||||
self.anim.setStartValue(0.0)
|
||||
self.anim.setEndValue(1.0)
|
||||
self.anim.setEasingCurve(QEasingCurve.OutCubic)
|
||||
|
||||
@Property(float)
|
||||
def hover_opacity(self): return self._hover_opacity
|
||||
|
||||
@hover_opacity.setter
|
||||
def hover_opacity(self, value):
|
||||
self._hover_opacity = value
|
||||
self.update()
|
||||
|
||||
def enterEvent(self, event):
|
||||
self.anim.setDirection(QPropertyAnimation.Forward)
|
||||
self.anim.start()
|
||||
super().enterEvent(event)
|
||||
|
||||
def leaveEvent(self, event):
|
||||
self.anim.setDirection(QPropertyAnimation.Backward)
|
||||
self.anim.start()
|
||||
super().leaveEvent(event)
|
||||
|
||||
def paintEvent(self, event):
|
||||
"""Custom paint for the glow effect."""
|
||||
super().paintEvent(event)
|
||||
if self._hover_opacity > 0:
|
||||
painter = QPainter(self)
|
||||
painter.setRenderHint(QPainter.Antialiasing)
|
||||
|
||||
# Subtle Glow Border
|
||||
color = QColor(self.accent)
|
||||
color.setAlphaF(self._hover_opacity * 0.5)
|
||||
painter.setPen(QPen(color, 1.5))
|
||||
painter.setBrush(Qt.NoBrush)
|
||||
painter.drawRoundedRect(self.rect().adjusted(1,1,-1,-1), 8, 8)
|
||||
|
||||
# Text Glow color shift
|
||||
self.setStyleSheet(f"""
|
||||
QPushButton {{
|
||||
background-color: rgba(255, 255, 255, {0.05 + (self._hover_opacity * 0.05)});
|
||||
border: 1px solid {Theme.BORDER_SUBTLE};
|
||||
color: white;
|
||||
border-radius: 8px;
|
||||
padding: 0 20px;
|
||||
font-size: 13px;
|
||||
font-weight: 600;
|
||||
}}
|
||||
""")
|
||||
|
||||
class ModernSwitch(QAbstractButton):
    """A sleek iOS-style toggle switch."""

    def __init__(self, parent=None, active_color=Theme.ACCENT_GREEN):
        super().__init__(parent)
        self.setCheckable(True)
        self.setFixedSize(44, 24)
        self._thumb_pos = 3.0
        self.active_color = QColor(active_color)

        self.anim = QPropertyAnimation(self, b"thumb_pos")
        self.anim.setDuration(200)
        self.anim.setEasingCurve(QEasingCurve.InOutCubic)

    @Property(float)
    def thumb_pos(self):
        return self._thumb_pos

    @thumb_pos.setter
    def thumb_pos(self, value):
        self._thumb_pos = value
        self.update()

    def nextCheckState(self):
        super().nextCheckState()
        self.anim.stop()
        if self.isChecked():
            self.anim.setEndValue(23.0)
        else:
            self.anim.setEndValue(3.0)
        self.anim.start()

    def paintEvent(self, event):
        painter = QPainter(self)
        painter.setRenderHint(QPainter.Antialiasing)

        # Background
        bg_color = QColor("#2d2d3d")
        if self.isChecked():
            bg_color = self.active_color

        painter.setBrush(bg_color)
        painter.setPen(Qt.NoPen)
        painter.drawRoundedRect(self.rect(), 12, 12)

        # Thumb (int() avoids passing the animated float to the integer QPoint constructor)
        painter.setBrush(Qt.white)
        painter.drawEllipse(QPoint(int(self._thumb_pos + 9), 12), 9, 9)
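The animation end values 3.0 and 23.0 follow directly from the switch's fixed 44×24 geometry: the thumb is drawn with a radius of 9 (18 px diameter), and a 3 px margin on each side of the track fixes both rest positions. A quick plain-Python check of that geometry (no Qt needed; the constant names are ours, not part of the codebase):

```python
TRACK_W = 44   # setFixedSize(44, 24)
THUMB_D = 18   # thumb drawn with radius 9
MARGIN = 3     # gap between thumb and track edge

off_x = float(MARGIN)                      # unchecked rest position
on_x = float(TRACK_W - THUMB_D - MARGIN)   # checked rest position

print(off_x, on_x)  # → 3.0 23.0
```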
class ModernFrame(QFrame):
    """A base frame with rounded corners and a shadow."""

    def __init__(self, parent=None):
        super().__init__(parent)
        self.setObjectName("premiumFrame")
        self.setStyleSheet(f"""
            #premiumFrame {{
                background-color: {Theme.BG_CARD};
                border: 1px solid {Theme.BORDER_SUBTLE};
                border-radius: 12px;
            }}
        """)

        self.shadow = QGraphicsDropShadowEffect(self)
        self.shadow.setBlurRadius(25)
        self.shadow.setXOffset(0)
        self.shadow.setYOffset(8)
        self.shadow.setColor(QColor(0, 0, 0, 180))
        self.setGraphicsEffect(self.shadow)
from PySide6.QtWidgets import (
    QPushButton, QWidget, QVBoxLayout, QHBoxLayout,
    QLabel, QGraphicsDropShadowEffect, QFrame, QAbstractButton, QSlider
)
class ModernSlider(QSlider):
    """A custom painted modern slider with a glowing knob."""

    def __init__(self, orientation=Qt.Horizontal, parent=None):
        super().__init__(orientation, parent)
        self.setStyleSheet(f"""
            QSlider::groove:horizontal {{
                border: 1px solid {Theme.BG_DARK};
                height: 4px;
                background: {Theme.BG_DARK};
                margin: 2px 0;
                border-radius: 2px;
            }}
            QSlider::handle:horizontal {{
                background: {Theme.ACCENT_CYAN};
                border: 2px solid white;
                width: 16px;
                height: 16px;
                margin: -7px 0;
                border-radius: 8px;
            }}
            QSlider::add-page:horizontal {{
                background: {Theme.BG_DARK};
            }}
            QSlider::sub-page:horizontal {{
                background: {Theme.ACCENT_CYAN};
                border-radius: 2px;
            }}
        """)
class FramelessWindow(QWidget):
    """Base class for all premium windows to handle dragging and frameless logic."""

    def __init__(self, parent=None):
        super().__init__(parent)
        self.setWindowFlags(Qt.FramelessWindowHint | Qt.WindowStaysOnTopHint | Qt.NoDropShadowWindowHint)
        self.setAttribute(Qt.WA_TranslucentBackground)
        self._drag_pos = None

    def mousePressEvent(self, event):
        if event.button() == Qt.LeftButton:
            self._drag_pos = event.globalPosition().toPoint() - self.frameGeometry().topLeft()
            event.accept()

    def mouseMoveEvent(self, event):
        # Guard against a move event arriving before any left-button press was seen.
        if event.buttons() & Qt.LeftButton and self._drag_pos is not None:
            self.move(event.globalPosition().toPoint() - self._drag_pos)
            event.accept()
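The press/move arithmetic above can be sanity-checked without Qt: the offset captured on press is the cursor's position inside the window, and subtracting it during the drag keeps the cursor pinned to that same point. A standalone sketch, with tuples standing in for QPoint (function names are ours):

```python
def drag_offset(press_global, win_top_left):
    # Cursor position relative to the window origin, captured on press.
    return (press_global[0] - win_top_left[0], press_global[1] - win_top_left[1])

def window_pos(move_global, offset):
    # New top-left that keeps the cursor at the same point inside the window.
    return (move_global[0] - offset[0], move_global[1] - offset[1])

off = drag_offset((500, 300), (450, 280))  # cursor is at (50, 20) inside the window
print(window_pos((620, 410), off))         # → (570, 390)
```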
src/ui/loader.py
@@ -1,109 +0,0 @@
"""
Loader Widget Module.
=====================

Handles the application initialization and model checks.
Refactored for 2026 Premium Aesthetics.
"""

from PySide6.QtWidgets import QWidget, QVBoxLayout, QLabel, QProgressBar
from PySide6.QtCore import Qt, QThread, Signal
from PySide6.QtGui import QFont
import os
import logging
from faster_whisper import download_model

from src.core.paths import get_models_path
from src.ui.styles import Theme, StyleGenerator, load_modern_fonts
from src.ui.components import FramelessWindow, ModernFrame


class DownloadWorker(QThread):
    """Background worker for model downloads."""
    progress = Signal(str, int)
    download_finished = Signal()
    error = Signal(str)

    def run(self):
        try:
            model_path = get_models_path()
            self.progress.emit("Verifying AI Core...", 10)
            os.environ["HF_HOME"] = str(model_path)

            self.progress.emit("Downloading Model...", 30)
            download_model("small", output_dir=str(model_path))

            self.progress.emit("System Ready!", 100)
            self.download_finished.emit()
        except Exception as e:
            logging.error(f"Loader failed: {e}")
            self.error.emit(str(e))
class LoaderWidget(FramelessWindow):
    """
    Premium bootstrapper UI.
    Inherits from FramelessWindow for rounded glass look.
    """
    ready_signal = Signal()

    def __init__(self):
        super().__init__()
        self.setFixedSize(400, 180)

        # Main Layout
        self.root = QVBoxLayout(self)
        self.root.setContentsMargins(10, 10, 10, 10)

        # Glass Card
        self.card = ModernFrame()
        self.card.setStyleSheet(StyleGenerator.get_glass_card(radius=20))
        self.root.addWidget(self.card)

        # Content Layout
        self.layout = QVBoxLayout(self.card)
        self.layout.setContentsMargins(30, 30, 30, 30)
        self.layout.setSpacing(15)

        # App Title/Brand
        self.brand = QLabel("WHISPER VOICE")
        self.brand.setFont(load_modern_fonts())
        self.brand.setStyleSheet(f"color: {Theme.ACCENT_CYAN}; font-weight: 900; letter-spacing: 4px; font-size: 14px;")
        self.brand.setAlignment(Qt.AlignCenter)
        self.layout.addWidget(self.brand)

        # Status Label
        self.status_label = QLabel("INITIALIZING...")
        self.status_label.setStyleSheet(f"color: {Theme.TEXT_SECONDARY}; font-weight: 600; font-size: 11px;")
        self.status_label.setAlignment(Qt.AlignCenter)
        self.layout.addWidget(self.status_label)

        # Progress Bar (Modern Slim style)
        self.progress_bar = QProgressBar()
        self.progress_bar.setFixedHeight(4)
        self.progress_bar.setStyleSheet(f"""
            QProgressBar {{
                background-color: {Theme.BG_DARK};
                border-radius: 2px;
                border: none;
                text-align: center;
                color: transparent;
            }}
            QProgressBar::chunk {{
                background-color: {Theme.ACCENT_CYAN};
                border-radius: 2px;
            }}
        """)
        self.layout.addWidget(self.progress_bar)

        # Start Worker
        self.worker = DownloadWorker()
        self.worker.progress.connect(self.update_progress)
        self.worker.download_finished.connect(self.on_finished)
        self.worker.start()

    def update_progress(self, text: str, percent: int):
        self.status_label.setText(text.upper())
        self.progress_bar.setValue(percent)

    def on_finished(self):
        self.ready_signal.emit()
        self.close()
@@ -1,105 +0,0 @@
"""
Overlay Window Module.
======================

Premium High-Fidelity Overlay for Whisper Voice.
Features glassmorphism, pulsating status indicators, and smart positioning.
"""

from PySide6.QtWidgets import QWidget, QVBoxLayout, QHBoxLayout, QLabel
from PySide6.QtCore import Qt, Slot, QPoint, QPropertyAnimation, QEasingCurve
from PySide6.QtGui import QColor, QFont, QGuiApplication

from src.ui.visualizer import AudioVisualizer
from src.ui.styles import Theme, StyleGenerator, load_modern_fonts
from src.ui.components import FramelessWindow, ModernFrame


class OverlayWindow(FramelessWindow):
    """
    The main transparent overlay (The Pill).
    Refactored for 2026 Premium Aesthetics.
    """

    def __init__(self):
        super().__init__()
        self.setFixedSize(320, 95)

        # Main Layout
        self.master_layout = QVBoxLayout(self)
        self.master_layout.setContentsMargins(10, 10, 10, 10)

        # The Glass Pill Container
        self.pill = ModernFrame()
        self.pill.setStyleSheet(StyleGenerator.get_glass_card(radius=24))
        self.master_layout.addWidget(self.pill)

        # Layout inside the pill
        self.layout = QHBoxLayout(self.pill)
        self.layout.setContentsMargins(20, 10, 20, 10)
        self.layout.setSpacing(15)

        # Status Visualization (Left Dot)
        self.status_dot = QWidget()
        self.status_dot.setFixedSize(14, 14)
        self.status_dot.setStyleSheet(f"background-color: {Theme.ACCENT_CYAN}; border-radius: 7px; border: 2px solid white;")
        self.layout.addWidget(self.status_dot)

        # Text/Visualizer Stack
        self.content_stack = QVBoxLayout()
        self.content_stack.setSpacing(2)
        self.content_stack.setContentsMargins(0, 0, 0, 0)

        self.status_label = QLabel("READY")
        self.status_label.setFont(load_modern_fonts())
        self.status_label.setStyleSheet("color: white; font-weight: 800; font-size: 11px; letter-spacing: 2px;")
        self.content_stack.addWidget(self.status_label)

        self.visualizer = AudioVisualizer()
        self.visualizer.setFixedHeight(30)
        self.content_stack.addWidget(self.visualizer)

        self.layout.addLayout(self.content_stack)

        # Animations
        self.pulse_timer = None  # Use style-based pulsing to avoid window flags issues

        # Initial State
        self.hide()
        self.first_show = True
    def showEvent(self, event):
        """Handle positioning and config updates."""
        from src.core.config import ConfigManager
        config = ConfigManager()
        self.setWindowOpacity(config.get("opacity"))

        if self.first_show:
            self.center_above_taskbar()
            self.first_show = False
        super().showEvent(event)

    def center_above_taskbar(self):
        screen = QGuiApplication.primaryScreen()
        if not screen:
            return
        avail_rect = screen.availableGeometry()
        x = avail_rect.x() + (avail_rect.width() - self.width()) // 2
        y = avail_rect.bottom() - self.height() - 15
        self.move(x, y)
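The positioning arithmetic in center_above_taskbar is easy to verify in isolation; note that Qt's QRect.bottom() is top + height − 1. A plain-Python sketch, assuming a 1920×1040 available geometry (1080p screen with a 40 px taskbar) and the overlay's fixed 320×95 size (the function signature here is ours):

```python
def center_above_taskbar(avail_x, avail_y, avail_w, avail_h, win_w, win_h, gap=15):
    # Mirrors the Qt math: QRect.bottom() == y + height - 1.
    x = avail_x + (avail_w - win_w) // 2
    y = (avail_y + avail_h - 1) - win_h - gap
    return x, y

print(center_above_taskbar(0, 0, 1920, 1040, 320, 95))  # → (800, 929)
```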
    @Slot(str)
    def update_status(self, text: str):
        """Updates the status text and visual indicator."""
        self.status_label.setText(text.upper())

        if "RECORDING" in text.upper():
            color = Theme.ACCENT_GREEN
        elif "THINKING" in text.upper():
            color = Theme.ACCENT_PURPLE
        else:
            color = Theme.ACCENT_CYAN

        self.status_dot.setStyleSheet(f"background-color: {color}; border-radius: 7px; border: 2px solid white;")

    @Slot(float)
    def update_visualizer(self, amp: float):
        self.visualizer.set_amplitude(amp)
@@ -6,12 +6,15 @@ Button {
    text: "Button"

    property color accentColor: "#00f2ff"
    Accessible.role: Accessible.Button
    Accessible.name: control.text
    activeFocusOnTab: true

    contentItem: Text {
        text: control.text
        font.pixelSize: 13
        font.bold: true
        color: control.hovered ? "white" : "#9499b0"
        color: control.hovered ? "white" : "#ABABAB"
        horizontalAlignment: Text.AlignHCenter
        verticalAlignment: Text.AlignVCenter
        elide: Text.ElideRight
@@ -25,8 +28,8 @@ Button {
        opacity: control.down ? 0.7 : 1.0
        color: control.hovered ? Qt.rgba(1, 1, 1, 0.1) : Qt.rgba(1, 1, 1, 0.05)
        radius: 8
        border.color: control.hovered ? control.accentColor : Qt.rgba(1, 1, 1, 0.1)
        border.width: 1
        border.color: control.hovered ? control.accentColor : SettingsStyle.borderSubtle
        border.width: control.activeFocus ? SettingsStyle.focusRingWidth : 1

        Behavior on border.color { ColorAnimation { duration: 200 } }
        Behavior on color { ColorAnimation { duration: 200 } }
Binary file not shown.
@@ -14,6 +14,8 @@ ApplicationWindow {
    visible: true
    flags: Qt.FramelessWindowHint | Qt.WindowStaysOnTopHint | Qt.Tool
    color: "transparent"
    title: "WhisperVoice"
    Accessible.name: "WhisperVoice Loading"

    Rectangle {
        id: bgRect
@@ -21,7 +23,7 @@ ApplicationWindow {
        anchors.margins: 20 // Space for shadow
        radius: 16
        color: "#1a1a20"
        border.color: "#40ffffff"
        border.color: Qt.rgba(1, 1, 1, 0.22)
        border.width: 1

        // --- SHADOW & GLOW ---
@@ -55,6 +57,7 @@ ApplicationWindow {

        // Pulse Animation
        SequentialAnimation on scale {
            running: ui ? !ui.reduceMotion : true
            loops: Animation.Infinite
            NumberAnimation { from: 1.0; to: 1.1; duration: 1000; easing.type: Easing.InOutSine }
            NumberAnimation { from: 1.1; to: 1.0; duration: 1000; easing.type: Easing.InOutSine }
@@ -95,7 +98,7 @@ ApplicationWindow {

        Text {
            text: "AI TRANSCRIPTION ENGINE"
            color: "#80ffffff"
            color: "#ABABAB"
            font.family: jetBrainsMono.name
            font.pixelSize: 10
            font.letterSpacing: 2
@@ -135,6 +138,7 @@ ApplicationWindow {
        // Shimmer effect on bar
        Rectangle {
            width: 20; height: parent.height
            visible: ui ? !ui.reduceMotion : true
            color: "#80ffffff"
            x: -width
            opacity: 0.5
@@ -157,8 +161,10 @@ ApplicationWindow {
            font.family: jetBrainsMono.name
            font.pixelSize: 11
            font.bold: true
            Accessible.role: Accessible.AlertMessage
            Accessible.name: "Loading status: " + text
            anchors.horizontalCenter: parent.horizontalCenter
            opacity: 0.8
            opacity: 1.0
        }
    }
}
@@ -10,6 +10,9 @@ ComboBox {
    property color bgColor: "#1a1a20"
    property color popupColor: "#252530"

    Accessible.role: Accessible.ComboBox
    Accessible.name: control.displayText

    delegate: ItemDelegate {
        id: delegate
        width: control.width
@@ -68,7 +71,7 @@ ComboBox {
            context.lineTo(width, 0);
            context.lineTo(width / 2, height);
            context.closePath();
            context.fillStyle = control.pressed ? control.accentColor : "#888888";
            context.fillStyle = control.pressed ? control.accentColor : "#ABABAB";
            context.fill();
        }
    }
@@ -89,8 +92,8 @@ ComboBox {
        implicitWidth: 140
        implicitHeight: 40
        color: control.bgColor
        border.color: control.pressed || control.activeFocus ? control.accentColor : "#40ffffff"
        border.width: 1
        border.color: control.pressed || control.activeFocus ? control.accentColor : SettingsStyle.borderSubtle
        border.width: control.activeFocus ? SettingsStyle.focusRingWidth : 1
        radius: 6

        // Glow effect on focus (Simplified to just border for stability)
@@ -100,7 +103,7 @@ ComboBox {
    popup: Popup {
        y: control.height - 1
        width: control.width
        implicitHeight: contentItem.implicitHeight
        implicitHeight: Math.min(contentItem.implicitHeight, 300)
        padding: 5

        contentItem: ListView {
@@ -114,7 +117,7 @@ ComboBox {

        background: Rectangle {
            color: control.popupColor
            border.color: "#40ffffff"
            border.color: SettingsStyle.borderSubtle
            border.width: 1
            radius: 6
        }
@@ -7,8 +7,11 @@ Rectangle {
    implicitHeight: 32
    color: "#1a1a20"
    radius: 6
    border.width: 1
    border.color: activeFocus || recording ? SettingsStyle.accent : "#40ffffff"
    activeFocusOnTab: true
    Accessible.role: Accessible.Button
    Accessible.name: control.currentSequence ? "Hotkey: " + control.currentSequence + ". Click to change" : "No hotkey set. Click to record"
    border.width: (activeFocus || recording) ? SettingsStyle.focusRingWidth : 1
    border.color: activeFocus || recording ? SettingsStyle.accent : SettingsStyle.borderSubtle

    property string currentSequence: ""
    signal sequenceChanged(string seq)
@@ -25,8 +28,8 @@ Rectangle {

    Text {
        anchors.centerIn: parent
        text: control.recording ? "Listening..." : (control.currentSequence || "None")
        color: control.recording ? SettingsStyle.accent : (control.currentSequence ? "#ffffff" : "#808080")
        text: control.recording ? "Listening..." : (formatSequence(control.currentSequence) || "None")
        color: control.recording ? SettingsStyle.accent : (control.currentSequence ? "#ffffff" : "#ABABAB")
        font.family: "JetBrains Mono"
        font.pixelSize: 13
        font.bold: true
@@ -72,6 +75,23 @@ Rectangle {
        if (!activeFocus) control.recording = false
    }

    function formatSequence(seq) {
        if (!seq) return ""
        var parts = seq.split("+")
        for (var i = 0; i < parts.length; i++) {
            var p = parts[i]
            // Standardize modifiers
            if (p === "ctrl") parts[i] = "Ctrl"
            else if (p === "alt") parts[i] = "Alt"
            else if (p === "shift") parts[i] = "Shift"
            else if (p === "win") parts[i] = "Win"
            else if (p === "esc") parts[i] = "Esc"
            // Capitalize F-keys and others (e.g. f8 -> F8, space -> Space)
            else parts[i] = p.charAt(0).toUpperCase() + p.slice(1)
        }
        return parts.join(" + ")
    }

    function getKeyName(key, text) {
        // F-Keys
        if (key >= Qt.Key_F1 && key <= Qt.Key_F35) return "f" + (key - Qt.Key_F1 + 1)
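The formatSequence helper above normalizes raw hotkey strings like "ctrl+shift+f8" into the display form "Ctrl + Shift + F8". The same logic as a standalone Python sketch, for clarity (the function name is ours, not part of the codebase):

```python
def format_sequence(seq: str) -> str:
    """Normalize a raw hotkey string, e.g. 'ctrl+shift+f8' -> 'Ctrl + Shift + F8'."""
    if not seq:
        return ""
    special = {"ctrl": "Ctrl", "alt": "Alt", "shift": "Shift", "win": "Win", "esc": "Esc"}
    # Known modifiers get their canonical form; everything else is capitalized (f8 -> F8).
    parts = [special.get(p, p[:1].upper() + p[1:]) for p in seq.split("+")]
    return " + ".join(parts)

print(format_sequence("ctrl+shift+f8"))  # → Ctrl + Shift + F8
print(format_sequence("space"))          # → Space
```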
@@ -18,6 +18,8 @@ Rectangle {
    property string description: ""
    property alias control: controlContainer.data
    property bool showSeparator: true
    Accessible.name: root.label
    Accessible.role: Accessible.Row

    Behavior on color { ColorAnimation { duration: 150 } }
@@ -9,6 +9,8 @@ ColumnLayout {

    default property alias content: contentColumn.data
    property string title: ""
    Accessible.name: root.title + " settings group"
    Accessible.role: Accessible.Grouping

    // Section Header
    Text {
@@ -5,30 +5,49 @@ import QtQuick.Effects
Slider {
    id: control

    Accessible.role: Accessible.Slider
    Accessible.name: control.value.toString()
    activeFocusOnTab: true

    background: Rectangle {
        x: control.leftPadding
        y: control.topPadding + control.availableHeight / 2 - height / 2
        implicitWidth: 200
        implicitHeight: 4
        implicitHeight: 6
        width: control.availableWidth
        height: implicitHeight
        radius: 2
        radius: 3
        color: "#2d2d3d"

        Rectangle {
            width: control.visualPosition * parent.width
            height: parent.height
            color: SettingsStyle.accent
            radius: 2
            radius: 3
        }
    }

    handle: Rectangle {
    handle: Item {
        x: control.leftPadding + control.visualPosition * (control.availableWidth - width)
        y: control.topPadding + control.availableHeight / 2 - height / 2
        implicitWidth: 18
        implicitHeight: 18
        radius: 9
        implicitWidth: SettingsStyle.minTargetSize
        implicitHeight: SettingsStyle.minTargetSize

        // Focus ring
        Rectangle {
            anchors.centerIn: parent
            width: parent.width + SettingsStyle.focusRingWidth * 2 + 2
            height: width
            radius: width / 2
            color: "transparent"
            border.width: SettingsStyle.focusRingWidth
            border.color: SettingsStyle.accent
            visible: control.activeFocus
        }

        Rectangle {
            anchors.fill: parent
            radius: width / 2
            color: "white"
            border.color: SettingsStyle.accent
            border.width: 2
@@ -41,7 +60,9 @@ Slider {
            shadowColor: SettingsStyle.accent
        }
    }
    // Value Readout (Left side to avoid clipping on right edge)
    }

    // Value Readout
    Text {
        anchors.right: parent.left
        anchors.rightMargin: 12
@@ -4,6 +4,10 @@ import QtQuick.Controls
Switch {
    id: control

    Accessible.role: Accessible.CheckBox
    Accessible.name: control.text + (control.checked ? " on" : " off")
    activeFocusOnTab: true

    indicator: Rectangle {
        implicitWidth: 44
        implicitHeight: 24
@@ -11,9 +15,11 @@ Switch {
        y: parent.height / 2 - height / 2
        radius: 12
        color: control.checked ? SettingsStyle.accent : "#2d2d3d"
        border.color: control.checked ? SettingsStyle.accent : "#3d3d4d"
        border.color: control.checked ? SettingsStyle.accent : SettingsStyle.borderSubtle
        border.width: control.activeFocus ? SettingsStyle.focusRingWidth : 1

        Behavior on color { ColorAnimation { duration: 200 } }
        Behavior on border.color { ColorAnimation { duration: 200 } }

        Rectangle {
            x: control.checked ? parent.width - width - 3 : 3
@@ -26,6 +32,15 @@ Switch {
            Behavior on x {
                NumberAnimation { duration: 200; easing.type: Easing.InOutQuad }
            }

            // I/O pip marks for non-color state indication
            Text {
                anchors.centerIn: parent
                text: control.checked ? "I" : "O"
                font.pixelSize: 9
                font.bold: true
                color: control.checked ? SettingsStyle.accent : "#666666"
            }
        }
    }
@@ -7,7 +7,10 @@ TextField {
    property color accentColor: "#00f2ff"
    property color bgColor: "#1a1a20"

    placeholderTextColor: "#606060"
    Accessible.role: Accessible.EditableText
    Accessible.name: control.placeholderText || "Text input"

    placeholderTextColor: SettingsStyle.textDisabled
    color: "#ffffff"
    font.family: "JetBrains Mono"
    font.pixelSize: 14
@@ -18,8 +21,8 @@ TextField {
        implicitWidth: 200
        implicitHeight: 40
        color: control.bgColor
        border.color: control.activeFocus ? control.accentColor : "#40ffffff"
        border.width: 1
        border.color: control.activeFocus ? control.accentColor : SettingsStyle.borderSubtle
        border.width: control.activeFocus ? SettingsStyle.focusRingWidth : 1
        radius: 6

        Behavior on border.color { ColorAnimation { duration: 150 } }
@@ -13,6 +13,8 @@ ApplicationWindow {
    visible: true
    flags: Qt.FramelessWindowHint | Qt.WindowStaysOnTopHint | Qt.Tool
    color: "transparent"
    title: "WhisperVoice"
    Accessible.name: "WhisperVoice Overlay"

    FontLoader {
        id: jetBrainsMono
@@ -35,7 +37,7 @@ ApplicationWindow {
    property bool isActive: ui.isRecording || ui.isProcessing

    SequentialAnimation {
        running: true
        running: !ui.reduceMotion
        loops: Animation.Infinite
        PauseAnimation { duration: 3000 }
        NumberAnimation {
@@ -96,6 +98,7 @@ ApplicationWindow {
    ShaderEffect {
        anchors.fill: parent
        opacity: 0.4
        visible: !ui.reduceMotion
        property real time: 0
        fragmentShader: "gradient_blobs.qsb"
        NumberAnimation on time { from: 0; to: 1000; duration: 100000; loops: Animation.Infinite }
@@ -105,6 +108,7 @@ ApplicationWindow {
    ShaderEffect {
        anchors.fill: parent
        opacity: 0.04
        visible: !ui.reduceMotion
        property real time: 0
        property real intensity: ui.amplitude
        fragmentShader: "glow.qsb"
@@ -115,6 +119,7 @@ ApplicationWindow {
    ParticleSystem {
        id: particles
        anchors.fill: parent
        running: !ui.reduceMotion
        ItemParticle {
            system: particles
            delegate: Rectangle { width: 2; height: 2; radius: 1; color: "#10ffffff" }
@@ -143,6 +148,7 @@ ApplicationWindow {
    // F. CRT Shader Effect (Overlay on chassis ONLY)
    ShaderEffect {
        anchors.fill: parent
        visible: !ui.reduceMotion
        property real time: 0
        fragmentShader: "crt.qsb"
        NumberAnimation on time { from: 0; to: 100; duration: 5000; loops: Animation.Infinite }
@@ -172,7 +178,7 @@ ApplicationWindow {
        radius: height / 2
        color: "transparent"
        border.width: 1
        border.color: "#40ffffff"
        border.color: Qt.rgba(1, 1, 1, 0.22)

        MouseArea {
            anchors.fill: parent; hoverEnabled: true
@@ -194,7 +200,7 @@ ApplicationWindow {
            NumberAnimation { duration: 150; easing.type: Easing.OutCubic }
        }
        SequentialAnimation on border.color {
            running: ui.isRecording
            running: ui.isRecording && !ui.reduceMotion
            loops: Animation.Infinite
            ColorAnimation { from: "#A0ff4b4b"; to: "#C0ff6b6b"; duration: 800 }
            ColorAnimation { from: "#C0ff6b6b"; to: "#A0ff4b4b"; duration: 800 }
@@ -209,6 +215,11 @@ ApplicationWindow {
        anchors.left: parent.left
        anchors.leftMargin: 10
        anchors.verticalCenter: parent.verticalCenter
        activeFocusOnTab: true
        Accessible.name: ui.isRecording ? "Stop recording" : "Start recording"
        Accessible.role: Accessible.Button
        Keys.onReturnPressed: ui.toggleRecordingRequested()
        Keys.onSpacePressed: ui.toggleRecordingRequested()

        // Make entire button scale with amplitude
        scale: ui.isRecording ? (1.0 + ui.amplitude * 0.12) : 1.0
@@ -245,7 +256,7 @@ ApplicationWindow {
        border.width: 2; border.color: "#60ffffff"

        SequentialAnimation on scale {
            running: ui.isRecording
            running: ui.isRecording && !ui.reduceMotion
            loops: Animation.Infinite
            NumberAnimation { from: 1.0; to: 1.08; duration: 600; easing.type: Easing.InOutQuad }
            NumberAnimation { from: 1.08; to: 1.0; duration: 600; easing.type: Easing.InOutQuad }
@@ -263,6 +274,17 @@ ApplicationWindow {
            fillMode: Image.PreserveAspectFit
        }
    }

    // Focus ring
    Rectangle {
        anchors.fill: micCircle
        anchors.margins: -4
        radius: width / 2
        color: "transparent"
        border.width: 2
        border.color: "#B794F6" // SettingsStyle.accent equivalent
        visible: micContainer.activeFocus
    }
}

// --- RAINBOW WAVEFORM (Shader) ---
@@ -277,6 +299,7 @@ ApplicationWindow {

    ShaderEffect {
        anchors.fill: parent
        visible: !ui.reduceMotion
        property real time: 0
        property real amplitude: ui.amplitude
        fragmentShader: "rainbow_wave.qsb"
@@ -341,8 +364,10 @@ ApplicationWindow {
    font.family: jetBrainsMono.name; font.pixelSize: 16; font.bold: true; font.letterSpacing: 2
    style: Text.Outline
    styleColor: ui.isRecording ? "#ff0000" : "#808085"
    Accessible.role: Accessible.StaticText
    Accessible.name: "Recording time: " + text
    SequentialAnimation on opacity {
        running: ui.isRecording; loops: Animation.Infinite
        running: ui.isRecording && !ui.reduceMotion; loops: Animation.Infinite
        NumberAnimation { from: 1.0; to: 0.7; duration: 800 }
        NumberAnimation { from: 0.7; to: 1.0; duration: 800 }
    }
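The mic button's `scale: ui.isRecording ? (1.0 + ui.amplitude * 0.12) : 1.0` binding is a simple linear gain: a normalized amplitude in 0–1 adds up to a 12 % size bump while recording, and the button returns to normal size otherwise. In isolation (function name ours):

```python
def mic_scale(is_recording: bool, amplitude: float) -> float:
    # Linear gain: amplitude in [0, 1] adds up to 12% to the button scale.
    return 1.0 + amplitude * 0.12 if is_recording else 1.0

print(round(mic_scale(True, 0.5), 2))  # → 1.06
print(mic_scale(False, 0.9))           # → 1.0
```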
@@ -12,7 +12,8 @@ Window {
    visible: false
    flags: Qt.FramelessWindowHint | Qt.Window
    color: "transparent"
    title: "Settings"
    title: "WhisperVoice Settings"
    Accessible.name: "WhisperVoice Settings"

    // Explicit sizing for Python to read

@@ -133,15 +134,20 @@ Window {
    // Improved Close Button
    Rectangle {
        width: 32; height: 32
        activeFocusOnTab: true
        Accessible.name: "Close settings"
        Accessible.role: Accessible.Button
        Keys.onReturnPressed: root.close()
        Keys.onSpacePressed: root.close()
        radius: 8
        color: closeMa.containsMouse ? "#20ff4b4b" : "transparent"
        border.color: closeMa.containsMouse ? "#40ff4b4b" : "transparent"
        color: closeMa.containsMouse ? "#20FF8A8A" : "transparent"
        border.color: closeMa.containsMouse ? "#40FF8A8A" : "transparent"
        border.width: 1

        Text {
            anchors.centerIn: parent
            text: "×"
            color: closeMa.containsMouse ? "#ff4b4b" : SettingsStyle.textSecondary
            color: closeMa.containsMouse ? "#FF8A8A" : SettingsStyle.textSecondary
            font.family: mainFont
            font.pixelSize: 20
            font.bold: true
@@ -157,6 +163,15 @@ Window {

        Behavior on color { ColorAnimation { duration: 150 } }
        Behavior on border.color { ColorAnimation { duration: 150 } }
        // Focus ring
        Rectangle {
            anchors.fill: parent
            radius: 8
            color: "transparent"
            border.width: SettingsStyle.focusRingWidth
            border.color: SettingsStyle.accent
            visible: parent.activeFocus
        }
    }
}

@@ -206,6 +221,23 @@ Window {
    height: 38
    color: stack.currentIndex === index ? SettingsStyle.surfaceHover : (ma.containsMouse ? Qt.rgba(1,1,1,0.03) : "transparent")
    radius: 6
    activeFocusOnTab: true
    Accessible.name: name
    Accessible.role: Accessible.Tab
    Keys.onReturnPressed: stack.currentIndex = index
    Keys.onSpacePressed: stack.currentIndex = index
    Keys.onDownPressed: {
        if (index < navModel.count - 1) {
            var nextItem = navBtnRoot.parent.children[index + 2]
            if (nextItem && nextItem.forceActiveFocus) nextItem.forceActiveFocus()
        }
    }
    Keys.onUpPressed: {
        if (index > 0) {
            var prevItem = navBtnRoot.parent.children[index]
            if (prevItem && prevItem.forceActiveFocus) prevItem.forceActiveFocus()
        }
    }

    Behavior on color { ColorAnimation { duration: 150 } }

@@ -256,6 +288,15 @@ Window {
        cursorShape: Qt.PointingHandCursor
        onClicked: stack.currentIndex = index
    }
    // Focus ring
    Rectangle {
        anchors.fill: parent
        radius: 6
        color: "transparent"
        border.width: SettingsStyle.focusRingWidth
        border.color: SettingsStyle.accent
        visible: parent.activeFocus
    }
}
@@ -286,6 +327,7 @@ Window {

    // --- TAB: GENERAL ---
    ScrollView {
        Accessible.role: Accessible.PageTab
        ScrollBar.vertical.policy: ScrollBar.AsNeeded
        contentWidth: availableWidth

@@ -314,15 +356,35 @@ Window {
        spacing: 0

        ModernSettingsItem {
            label: "Global Hotkey"
            description: "Press to record a new shortcut (e.g. Ctrl+Space)"
            label: "Global Hotkey (Transcribe)"
            description: "Standard: Raw transcription"
            control: ModernKeySequenceRecorder {
                Layout.preferredWidth: 200
                implicitWidth: 240
                currentSequence: ui.getSetting("hotkey")
                onSequenceChanged: (seq) => ui.setSetting("hotkey", seq)
            }
        }

        ModernSettingsItem {
            label: "Global Hotkey (Correct)"
            description: "Enhanced: Transcribe + AI Correction"
            control: ModernKeySequenceRecorder {
                implicitWidth: 240
                currentSequence: ui.getSetting("hotkey_correct")
                onSequenceChanged: (seq) => ui.setSetting("hotkey_correct", seq)
            }
        }

        ModernSettingsItem {
            label: "Global Hotkey (Translate)"
            description: "Press to record a new shortcut (e.g. F10)"
            control: ModernKeySequenceRecorder {
                implicitWidth: 240
                currentSequence: ui.getSetting("hotkey_translate")
                onSequenceChanged: (seq) => ui.setSetting("hotkey_translate", seq)
            }
        }

        ModernSettingsItem {
            label: "Run on Startup"
            description: "Automatically launch when you log in"
@@ -349,8 +411,8 @@ Window {
            showSeparator: false
            control: ModernSlider {
                Layout.preferredWidth: 200
                from: 10; to: 6000
                stepSize: 10
                from: 10; to: 20000
                stepSize: 100
                snapMode: Slider.SnapAlways
                value: ui.getSetting("typing_speed")
                onMoved: ui.setSetting("typing_speed", value)
@@ -363,6 +425,7 @@ Window {

    // --- TAB: AUDIO ---
    ScrollView {
        Accessible.role: Accessible.PageTab
        ScrollBar.vertical.policy: ScrollBar.AsNeeded
        contentWidth: availableWidth

@@ -451,6 +514,7 @@ Window {

    // --- TAB: VISUALS ---
    ScrollView {
        Accessible.role: Accessible.PageTab
        ScrollBar.vertical.policy: ScrollBar.AsNeeded
        contentWidth: availableWidth

@@ -500,7 +564,7 @@ Window {
        ModernSettingsItem {
            label: "Window Opacity"
            description: "Transparency level"
            showSeparator: false
            showSeparator: true
            control: ModernSlider {
                Layout.preferredWidth: 200
                from: 0.1; to: 1.0
@@ -508,6 +572,15 @@ Window {
                onMoved: ui.setSetting("opacity", Number(value.toFixed(2)))
            }
        }
        ModernSettingsItem {
            label: "Reduce Motion"
            description: "Disable animations for accessibility"
            showSeparator: false
            control: ModernSwitch {
                checked: ui.getSetting("reduce_motion")
                onToggled: ui.setSetting("reduce_motion", checked)
            }
        }
    }
}

@@ -560,6 +633,7 @@ Window {

    // --- TAB: AI ENGINE ---
    ScrollView {
        Accessible.role: Accessible.PageTab
        ScrollBar.vertical.policy: ScrollBar.AsNeeded
        contentWidth: availableWidth

@@ -577,6 +651,53 @@ Window {
        Text { text: "Model configuration and performance"; color: SettingsStyle.textSecondary; font.family: mainFont; font.pixelSize: 14 }
    }

    ModernSettingsSection {
        title: "Style & Prompting"
        Layout.margins: 32
        Layout.topMargin: 0

        content: ColumnLayout {
            width: parent.width
spacing: 0
|
||||
|
||||
ModernSettingsItem {
|
||||
label: "Punctuation Style"
|
||||
description: "Hint for how to format text"
|
||||
control: ModernComboBox {
|
||||
id: styleCombo
|
||||
width: 180
|
||||
model: ["Standard (Proper)", "Casual (Lowercase)", "Custom"]
|
||||
|
||||
// Logic to determine initial index based on config string
|
||||
Component.onCompleted: {
|
||||
let current = ui.getSetting("initial_prompt")
|
||||
if (current === "Mm-hmm. Okay, let's go. I speak in full sentences.") currentIndex = 0
|
||||
else if (current === "um, okay... i guess so.") currentIndex = 1
|
||||
else currentIndex = 2
|
||||
}
|
||||
|
||||
onActivated: {
|
||||
if (index === 0) ui.setSetting("initial_prompt", "Mm-hmm. Okay, let's go. I speak in full sentences.")
|
||||
else if (index === 1) ui.setSetting("initial_prompt", "um, okay... i guess so.")
|
||||
// Custom: Don't change string immediately, let user type
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
ModernSettingsItem {
|
||||
label: "Custom Prompt"
|
||||
description: "Advanced: Define your own style hint"
|
||||
visible: styleCombo.currentIndex === 2
|
||||
control: ModernTextField {
|
||||
Layout.preferredWidth: 280
|
||||
placeholderText: "e.g. 'Hello, World.'"
|
||||
text: ui.getSetting("initial_prompt") || ""
|
||||
onEditingFinished: ui.setSetting("initial_prompt", text === "" ? null : text)
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
ModernSettingsSection {
|
||||
title: "Model Config"
|
||||
Layout.margins: 32
|
||||
@@ -685,8 +806,8 @@ Window {
|
||||
}
|
||||
color: "#ffffff"
|
||||
font.family: "JetBrains Mono"
|
||||
font.pixelSize: 10
|
||||
opacity: 0.7
|
||||
font.pixelSize: 11
|
||||
opacity: 1.0
|
||||
elide: Text.ElideRight
|
||||
Layout.fillWidth: true
|
||||
}
|
||||
@@ -742,15 +863,17 @@ Window {
|
||||
|
||||
ModernSettingsItem {
|
||||
label: "Language"
|
||||
description: "Force language or Auto-detect"
|
||||
description: "Spoken language to transcribe"
|
||||
control: ModernComboBox {
|
||||
width: 140
|
||||
model: ["auto", "en", "fr", "de", "es", "it", "ja", "zh", "ru"]
|
||||
currentIndex: model.indexOf(ui.getSetting("language"))
|
||||
onActivated: ui.setSetting("language", currentText)
|
||||
Layout.preferredWidth: 200
|
||||
model: ui.get_supported_languages()
|
||||
currentIndex: model.indexOf(ui.get_current_language_name())
|
||||
onActivated: (index) => ui.set_language_by_name(currentText)
|
||||
}
|
||||
}
|
||||
|
||||
// Task selector removed as per user request (Hotkeys handle this now)
|
||||
|
||||
ModernSettingsItem {
|
||||
label: "Compute Device"
|
||||
description: "Hardware acceleration (CUDA requires NVidia GPU)"
|
||||
@@ -773,6 +896,147 @@ Window {
|
||||
onActivated: ui.setSetting("compute_type", currentText)
|
||||
}
|
||||
}
|
||||
|
||||
ModernSettingsItem {
|
||||
label: "Low VRAM Mode"
|
||||
description: "Unload models immediately after use (Saves VRAM, Adds Delay)"
|
||||
showSeparator: false
|
||||
control: ModernSwitch {
|
||||
checked: ui.getSetting("unload_models_after_use")
|
||||
onToggled: ui.setSetting("unload_models_after_use", checked)
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
ModernSettingsSection {
|
||||
title: "Correction & Rewriting"
|
||||
Layout.margins: 32
|
||||
Layout.topMargin: 0
|
||||
|
||||
content: ColumnLayout {
|
||||
width: parent.width
|
||||
spacing: 0
|
||||
|
||||
ModernSettingsItem {
|
||||
label: "Enable Correction"
|
||||
description: "Post-process text with Llama 3.2 1B (Adds latency)"
|
||||
control: ModernSwitch {
|
||||
checked: ui.getSetting("llm_enabled")
|
||||
onToggled: ui.setSetting("llm_enabled", checked)
|
||||
}
|
||||
}
|
||||
|
||||
ModernSettingsItem {
|
||||
label: "Correction Mode"
|
||||
description: "Grammar Fix vs. Complete Rewrite"
|
||||
visible: ui.getSetting("llm_enabled")
|
||||
control: ModernComboBox {
|
||||
width: 140
|
||||
model: ["Grammar", "Standard", "Rewrite"]
|
||||
currentIndex: model.indexOf(ui.getSetting("llm_mode"))
|
||||
onActivated: ui.setSetting("llm_mode", currentText)
|
||||
}
|
||||
}
|
||||
|
||||
// LLM Model Status Card
|
||||
Rectangle {
|
||||
Layout.fillWidth: true
|
||||
Layout.margins: 12
|
||||
Layout.topMargin: 0
|
||||
Layout.bottomMargin: 16
|
||||
height: 54
|
||||
color: "#0a0a0f"
|
||||
visible: ui.getSetting("llm_enabled")
|
||||
radius: 6
|
||||
border.color: SettingsStyle.borderSubtle
|
||||
border.width: 1
|
||||
|
||||
property bool isDownloaded: false
|
||||
property bool isDownloading: ui.isDownloading && ui.statusText.indexOf("LLM") !== -1
|
||||
|
||||
Timer {
|
||||
interval: 2000
|
||||
running: visible
|
||||
repeat: true
|
||||
onTriggered: parent.checkStatus()
|
||||
}
|
||||
|
||||
function checkStatus() {
|
||||
isDownloaded = ui.isLLMModelDownloaded()
|
||||
}
|
||||
|
||||
Component.onCompleted: checkStatus()
|
||||
|
||||
Connections {
|
||||
target: ui
|
||||
function onModelStatesChanged() { parent.checkStatus() }
|
||||
function onIsDownloadingChanged() { parent.checkStatus() }
|
||||
}
|
||||
|
||||
RowLayout {
|
||||
anchors.fill: parent
|
||||
anchors.leftMargin: 12
|
||||
anchors.rightMargin: 12
|
||||
spacing: 12
|
||||
|
||||
Image {
|
||||
source: "smart_toy.svg"
|
||||
sourceSize: Qt.size(16, 16)
|
||||
layer.enabled: true
|
||||
layer.effect: MultiEffect {
|
||||
colorization: 1.0
|
||||
colorizationColor: parent.parent.isDownloaded ? SettingsStyle.accent : "#808080"
|
||||
}
|
||||
}
|
||||
|
||||
ColumnLayout {
|
||||
Layout.fillWidth: true
|
||||
spacing: 2
|
||||
Text {
|
||||
text: "Llama 3.2 1B (Instruct)"
|
||||
color: "#ffffff"
|
||||
font.family: "JetBrains Mono"; font.bold: true
|
||||
font.pixelSize: 11
|
||||
}
|
||||
Text {
|
||||
text: parent.parent.isDownloaded ? "Ready." : "Model missing (~1.2GB)"
|
||||
color: SettingsStyle.textSecondary
|
||||
font.family: "JetBrains Mono"; font.pixelSize: 10
|
||||
}
|
||||
}
|
||||
|
||||
Button {
|
||||
id: dlBtn
|
||||
text: "Download"
|
||||
visible: !parent.parent.isDownloaded && !parent.parent.isDownloading
|
||||
Layout.preferredHeight: 24
|
||||
Layout.preferredWidth: 80
|
||||
|
||||
contentItem: Text {
|
||||
text: "DOWNLOAD"
|
||||
font.pixelSize: 10; font.bold: true; color: "#000000"; horizontalAlignment: Text.AlignHCenter; verticalAlignment: Text.AlignVCenter
|
||||
}
|
||||
background: Rectangle {
|
||||
color: dlBtn.hovered ? "#ffffff" : SettingsStyle.accent; radius: 4
|
||||
}
|
||||
onClicked: ui.downloadLLM()
|
||||
}
|
||||
|
||||
// Progress Bar
|
||||
Rectangle {
|
||||
visible: parent.parent.isDownloading
|
||||
Layout.fillWidth: true
|
||||
height: 4
|
||||
color: "#30ffffff"
|
||||
Rectangle {
|
||||
width: parent.width * (ui.downloadProgress / 100)
|
||||
height: parent.height
|
||||
color: SettingsStyle.accent
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
@@ -830,6 +1094,7 @@ Window {
|
||||
|
||||
// --- TAB: DEBUG ---
|
||||
ScrollView {
|
||||
Accessible.role: Accessible.PageTab
|
||||
ScrollBar.vertical.policy: ScrollBar.AsNeeded
|
||||
contentWidth: availableWidth
|
||||
|
||||
@@ -855,9 +1120,9 @@ Window {
|
||||
spacing: 16
|
||||
|
||||
StatBox { label: "APP CPU"; value: ui.appCpu; unit: "%"; accent: "#00f2ff" }
|
||||
StatBox { label: "APP RAM"; value: ui.appRamMb; unit: "MB"; accent: "#bd93f9" }
|
||||
StatBox { label: "GPU VRAM"; value: ui.appVramMb; unit: "MB"; accent: "#ff79c6" }
|
||||
StatBox { label: "GPU LOAD"; value: ui.appVramPercent; unit: "%"; accent: "#ff5555" }
|
||||
StatBox { label: "APP RAM"; value: ui.appRamMb; unit: "MB"; accent: "#CAA9FF" }
|
||||
StatBox { label: "GPU VRAM"; value: ui.appVramMb; unit: "MB"; accent: "#FF8FD0" }
|
||||
StatBox { label: "GPU LOAD"; value: ui.appVramPercent; unit: "%"; accent: "#FF8A8A" }
|
||||
}
|
||||
|
||||
Rectangle {
|
||||
|
||||
@@ -6,13 +6,14 @@ QtObject {
|
||||
// Colors
|
||||
readonly property color background: "#F2121212" // Deep Obsidian with 95% opacity
|
||||
readonly property color surfaceCard: "#1A1A1A" // Layer 1
|
||||
readonly property color surfaceHover: "#2A2A2A" // Layer 2 (Lighter for better contrast)
|
||||
readonly property color borderSubtle: Qt.rgba(1, 1, 1, 0.08)
|
||||
readonly property color surfaceHover: "#2A2A2A" // Layer 2
|
||||
readonly property color borderSubtle: Qt.rgba(1, 1, 1, 0.22) // WCAG 3:1 non-text contrast
|
||||
|
||||
readonly property color textPrimary: "#FAFAFA" // Brighter white
|
||||
readonly property color textSecondary: "#999999"
|
||||
readonly property color textPrimary: "#FAFAFA"
|
||||
readonly property color textSecondary: "#ABABAB" // WCAG AAA 8.1:1 on #121212
|
||||
readonly property color textDisabled: "#808080" // 4.0:1 minimum for disabled states
|
||||
|
||||
readonly property color accentPurple: "#7000FF"
|
||||
readonly property color accentPurple: "#B794F6" // WCAG AAA 7.2:1 on #121212
|
||||
readonly property color accentCyan: "#00F2FF"
|
||||
|
||||
// Configurable active accent
|
||||
@@ -21,5 +22,9 @@ QtObject {
|
||||
// Dimensions
|
||||
readonly property int cardRadius: 16
|
||||
readonly property int itemRadius: 8
|
||||
readonly property int itemHeight: 60 // Even taller for more breathing room
|
||||
readonly property int itemHeight: 60
|
||||
|
||||
// Accessibility
|
||||
readonly property int focusRingWidth: 2
|
||||
readonly property int minTargetSize: 24
|
||||
}
|
||||
|
||||
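The WCAG ratios quoted in the SettingsStyle comments above are easy to verify. The sketch below applies the WCAG 2.x relative-luminance and contrast-ratio formulas to the exact hex values from the diff; it is illustrative only and not part of the repository.

```python
def srgb_to_linear(c8: int) -> float:
    # WCAG 2.x linearization of an 8-bit sRGB channel
    c = c8 / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def luminance(hex_color: str) -> float:
    # Relative luminance of a "#RRGGBB" color
    r, g, b = (int(hex_color[i:i + 2], 16) for i in (1, 3, 5))
    return 0.2126 * srgb_to_linear(r) + 0.7152 * srgb_to_linear(g) + 0.0722 * srgb_to_linear(b)

def contrast(fg: str, bg: str) -> float:
    # WCAG contrast ratio, always >= 1
    l1, l2 = sorted((luminance(fg), luminance(bg)), reverse=True)
    return (l1 + 0.05) / (l2 + 0.05)

# textSecondary #ABABAB on #121212 lands right around the claimed 8.1:1,
# and accentPurple #B794F6 clears the 7:1 AAA threshold.
print(contrast("#ABABAB", "#121212"))
print(contrast("#B794F6", "#121212"))
```

Running this confirms the comments are honest within rounding: the secondary text sits just above 8:1 and the new purple accent above 7:1.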
Binary file not shown.
@@ -1,50 +0,0 @@
#version 440

layout(location = 0) in vec2 qt_TexCoord0;
layout(location = 0) out vec4 fragColor;

layout(std140, binding = 0) uniform buf {
    mat4 qt_Matrix;
    float qt_Opacity;
    float time;
    float aberration; // 0.0 to 1.0, controlled by Audio Amplitude
};

float rand(vec2 co) {
    return fract(sin(dot(co.xy ,vec2(12.9898,78.233))) * 43758.5453);
}

void main() {
    // 1. Calculate Distortion Offset based on Amplitude (aberration)
    // We warp the UVs slightly away from center
    vec2 uv = qt_TexCoord0;
    vec2 dist = uv - 0.5;

    // 2. Chromatic Aberration
    // Red Channel shifts OUT
    // Blue Channel shifts IN
    float strength = aberration * 0.02; // Max shift 2% of texture size

    vec2 rUV = uv + (dist * strength);
    vec2 bUV = uv - (dist * strength);

    // Sample texture? We don't have a texture input (source is empty Item), we are generating visuals.
    // Wait, ShaderEffect usually works on sourceItem.
    // Here we are generating NOISE on top of a gradient.
    // So we apply Aberration to the NOISE function?
    // Or do we want to aberrate the pixels UNDERNEATH?
    // ShaderEffect with no source property renders purely procedural content.

    // Let's create layered procedural noise with channel offsets
    float nR = rand(rUV + vec2(time * 0.01, 0.0));
    float nG = rand(uv + vec2(time * 0.01, 0.0)); // Green is anchor
    float nB = rand(bUV + vec2(time * 0.01, 0.0));

    // Also modulate alpha by aberration - higher volume = more intense grain?
    // Or maybe just pure glitch.

    vec4 grainColor = vec4(nR, nG, nB, 1.0);

    // Mix it with opacity
    fragColor = grainColor * qt_Opacity;
}
Binary file not shown.
@@ -1,25 +0,0 @@
#version 440

layout(location = 0) in vec2 qt_TexCoord0;
layout(location = 0) out vec4 fragColor;

layout(std140, binding = 0) uniform buf {
    mat4 qt_Matrix;
    float qt_Opacity;
    float time;
};

// High-quality pseudo-random function
float rand(vec2 co) {
    return fract(sin(dot(co.xy ,vec2(12.9898,78.233))) * 43758.5453);
}

void main() {
    // Dynamic Noise based on Time
    // We add 'time' to the coordinate to animate the grain
    float noise = rand(qt_TexCoord0 + vec2(time * 0.01, time * 0.02));

    // Output grayscale noise with alpha modulation
    // We want white noise, applied with qt_Opacity
    fragColor = vec4(noise, noise, noise, 1.0) * qt_Opacity;
}
Binary file not shown.
Binary file not shown.
|
Before Width: | Height: | Size: 492 KiB |
Binary file not shown.
|
Before Width: | Height: | Size: 490 KiB |
Binary file not shown.
|
Before Width: | Height: | Size: 464 KiB |
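Both deleted shaders build their grain from the classic `fract(sin(dot(co, vec2(12.9898, 78.233))) * 43758.5453)` one-liner hash. As a quick sanity check, the same expression can be transcribed to Python (purely illustrative, not repository code) to confirm it always yields a value in [0, 1):

```python
import math

def fract(v: float) -> float:
    # GLSL fract(): v - floor(v), always in [0, 1)
    return v - math.floor(v)

def rand(x: float, y: float) -> float:
    # Transcription of the shaders' rand(vec2 co)
    return fract(math.sin(x * 12.9898 + y * 78.233) * 43758.5453)

# Sample a grid of UV-like coordinates and check the output range
samples = [rand(i / 100.0, j / 100.0) for i in range(50) for j in range(50)]
print(min(samples) >= 0.0 and max(samples) < 1.0)  # True
```

The hash is deterministic per coordinate, which is why the shaders add `time` to the input to animate the grain.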
@@ -1,236 +0,0 @@
"""
Settings Window Module.
=======================

Manages the application configuration UI.
Refactored for 2026 Premium Aesthetics with Sidebar navigation.
"""

from PySide6.QtWidgets import (
    QWidget, QVBoxLayout, QHBoxLayout, QStackedWidget,
    QLabel, QComboBox, QFormLayout, QFrame, QMessageBox, QScrollArea
)
from PySide6.QtCore import Qt, Signal, Slot, QSize
from PySide6.QtGui import QFont, QIcon

from src.core.config import ConfigManager
from src.ui.styles import Theme, StyleGenerator, load_modern_fonts
from src.ui.components import FramelessWindow, ModernFrame, GlassButton, ModernSwitch, ModernSlider
import sounddevice as sd

class SettingsWindow(FramelessWindow):
    """
    The main settings dialog.
    Refactored with 2026 Premium Sidebar Layout.
    """
    settings_changed = Signal()

    def __init__(self, parent=None):
        super().__init__(parent)
        self.config = ConfigManager()
        self.setFixedSize(700, 500)

        # Main Container
        self.bg_frame = ModernFrame()
        self.bg_frame.setStyleSheet(StyleGenerator.get_glass_card(radius=20))

        self.root_layout = QVBoxLayout(self)
        self.root_layout.setContentsMargins(10, 10, 10, 10)
        self.root_layout.addWidget(self.bg_frame)

        # Title Bar Area (Inside glass card)
        self.title_layout = QHBoxLayout()
        self.title_layout.setContentsMargins(20, 15, 20, 0)

        title_lbl = QLabel("PREMIUM SETTINGS")
        title_lbl.setFont(load_modern_fonts())
        title_lbl.setStyleSheet(f"color: white; font-weight: 900; font-size: 14px; letter-spacing: 2px;")
        self.title_layout.addWidget(title_lbl)

        self.title_layout.addStretch()

        self.btn_close = GlassButton("×", accent_color="#ff4b4b")
        self.btn_close.setFixedSize(30, 30)
        self.btn_close.clicked.connect(self.close)
        self.title_layout.addWidget(self.btn_close)

        # Central Layout (Sidebar + Content)
        self.content_layout = QHBoxLayout()
        self.content_layout.setContentsMargins(10, 10, 10, 10)
        self.content_layout.setSpacing(10)

        # 1. SIDEBAR
        self.sidebar = QWidget()
        self.sidebar.setFixedWidth(160)
        self.sidebar_layout = QVBoxLayout(self.sidebar)
        self.sidebar_layout.setContentsMargins(0, 10, 0, 10)
        self.sidebar_layout.setSpacing(8)

        self.nav_general = GlassButton("General")
        self.nav_audio = GlassButton("Audio")
        self.nav_visuals = GlassButton("Visuals")
        self.nav_advanced = GlassButton("Advanced/AI")

        self.sidebar_layout.addWidget(self.nav_general)
        self.sidebar_layout.addWidget(self.nav_audio)
        self.sidebar_layout.addWidget(self.nav_visuals)
        self.sidebar_layout.addWidget(self.nav_advanced)
        self.sidebar_layout.addStretch()

        self.btn_save = GlassButton("SAVE CHANGES", accent_color=Theme.ACCENT_GREEN)
        self.btn_save.clicked.connect(self.save_settings)
        self.sidebar_layout.addWidget(self.btn_save)

        # 2. CONTENT STACK
        self.stack = QStackedWidget()
        self.stack.setStyleSheet("background: transparent;")

        # Connect sidebar to stack
        self.nav_general.clicked.connect(lambda: self.stack.setCurrentIndex(0))
        self.nav_audio.clicked.connect(lambda: self.stack.setCurrentIndex(1))
        self.nav_visuals.clicked.connect(lambda: self.stack.setCurrentIndex(2))
        self.nav_advanced.clicked.connect(lambda: self.stack.setCurrentIndex(3))

        # Main Layout Assembly
        self.inner_layout = QVBoxLayout(self.bg_frame)
        self.inner_layout.addLayout(self.title_layout)
        self.inner_layout.addLayout(self.content_layout)

        self.content_layout.addWidget(self.sidebar)
        self.content_layout.addWidget(self.stack)

        self.setup_pages()
        self.load_values()

    def setup_pages(self):
        """Creates the settings pages."""
        # --- GENERAL ---
        self.page_general = QWidget()
        l1 = QFormLayout(self.page_general)
        l1.setVerticalSpacing(20)

        self.inp_hotkey = QComboBox()
        self.inp_hotkey.addItems(["f1", "f2", "f3", "f4", "f5", "f6", "f7", "f8", "f9", "f10", "f11", "f12", "caps lock"])
        self.inp_hotkey.setStyleSheet(f"background: {Theme.BG_DARK}; border-radius: 4px; padding: 5px; color: white;")
        l1.addRow(self.create_lbl("Global Hotkey:"), self.inp_hotkey)

        self.chk_top = ModernSwitch()
        l1.addRow(self.create_lbl("Always on Top:"), self.chk_top)

        self.stack.addWidget(self.page_general)

        # --- AUDIO ---
        self.page_audio = QWidget()
        l2 = QFormLayout(self.page_audio)
        l2.setVerticalSpacing(15)

        self.inp_device = QComboBox()
        self.inp_device.setStyleSheet(f"background: {Theme.BG_DARK}; border-radius: 4px; padding: 5px; color: white;")
        self.populate_audio_devices()
        l2.addRow(self.create_lbl("Input Device:"), self.inp_device)

        self.sld_threshold = ModernSlider(Qt.Horizontal)
        self.sld_threshold.setRange(1, 25)
        self.lbl_threshold = self.create_lbl("2%")
        self.sld_threshold.valueChanged.connect(lambda v: self.lbl_threshold.setText(f"{v}%"))
        l2.addRow(self.create_lbl("Noise Gate:"), self.sld_threshold)
        l2.addRow("", self.lbl_threshold)

        self.sld_duration = ModernSlider(Qt.Horizontal)
        self.sld_duration.setRange(5, 50)
        self.lbl_duration = self.create_lbl("1.0s")
        self.sld_duration.valueChanged.connect(lambda v: self.lbl_duration.setText(f"{v/10}s"))
        l2.addRow(self.create_lbl("Auto-Submit:"), self.sld_duration)
        l2.addRow("", self.lbl_duration)

        self.stack.addWidget(self.page_audio)

        # --- VISUALS ---
        self.page_visuals = QWidget()
        l3 = QFormLayout(self.page_visuals)
        l3.setVerticalSpacing(20)

        self.inp_style = QComboBox()
        self.inp_style.addItem("Neon Line (Recommended)", "line")
        self.inp_style.addItem("Classic Bars", "bar")
        self.inp_style.setStyleSheet(f"background: {Theme.BG_DARK}; border-radius: 4px; padding: 5px; color: white;")
        l3.addRow(self.create_lbl("Visualizer:"), self.inp_style)

        self.sld_opacity = ModernSlider(Qt.Horizontal)
        self.sld_opacity.setRange(40, 100)
        self.lbl_opacity = self.create_lbl("100%")
        self.sld_opacity.valueChanged.connect(lambda v: self.lbl_opacity.setText(f"{v}%"))
        l3.addRow(self.create_lbl("Opacity:"), self.sld_opacity)
        l3.addRow("", self.lbl_opacity)

        self.stack.addWidget(self.page_visuals)

        # --- ADVANCED ---
        self.page_adv = QWidget()
        l4 = QFormLayout(self.page_adv)
        l4.setVerticalSpacing(15)

        self.inp_model = QComboBox()
        self.inp_model.setStyleSheet(f"background: {Theme.BG_DARK}; border-radius: 4px; padding: 5px; color: white;")
        for id, name in [("tiny", "Tiny (Fast)"), ("base", "Base"), ("small", "Small (Default)"), ("medium", "Medium"), ("large-v3", "Large V3")]:
            self.inp_model.addItem(name, id)
        l4.addRow(self.create_lbl("Model:"), self.inp_model)

        info = QLabel("Large models provide higher accuracy but require significant RAM/VRAM.")
        info.setWordWrap(True)
        info.setStyleSheet(f"color: {Theme.TEXT_SECONDARY}; font-style: italic; font-size: 11px;")
        l4.addRow("", info)

        self.stack.addWidget(self.page_adv)

    def create_lbl(self, text):
        lbl = QLabel(text)
        lbl.setStyleSheet(f"color: {Theme.TEXT_SECONDARY}; font-weight: 600; font-size: 13px;")
        return lbl

    def populate_audio_devices(self):
        try:
            self.inp_device.addItem("System Default", -1)
            for i, dev in enumerate(sd.query_devices()):
                if dev['max_input_channels'] > 0:
                    self.inp_device.addItem(dev['name'], i)
        except: pass

    def load_values(self):
        self.inp_hotkey.setCurrentText(self.config.get("hotkey"))
        self.chk_top.setChecked(self.config.get("always_on_top"))

        dev_id = self.config.get("input_device")
        idx = self.inp_device.findData(dev_id if dev_id is not None else -1)
        if idx >= 0: self.inp_device.setCurrentIndex(idx)

        self.sld_threshold.setValue(int(self.config.get("silence_threshold") * 100))
        self.sld_duration.setValue(int(self.config.get("silence_duration") * 10))

        idx = self.inp_style.findData(self.config.get("visualizer_style"))
        if idx >= 0: self.inp_style.setCurrentIndex(idx)

        self.sld_opacity.setValue(int(self.config.get("opacity") * 100))

        idx = self.inp_model.findData(self.config.get("model_size"))
        if idx >= 0: self.inp_model.setCurrentIndex(idx)

    def save_settings(self):
        updates = {
            "hotkey": self.inp_hotkey.currentText(),
            "always_on_top": self.chk_top.isChecked(),
            "input_device": self.inp_device.currentData() if self.inp_device.currentData() != -1 else None,
            "silence_threshold": self.sld_threshold.value() / 100.0,
            "silence_duration": self.sld_duration.value() / 10.0,
            "visualizer_style": self.inp_style.currentData(),
            "opacity": self.sld_opacity.value() / 100.0,
            "model_size": self.inp_model.currentData()
        }

        new_model = updates["model_size"]
        if new_model != self.config.get("model_size"):
            QMessageBox.information(self, "Model Updated", f"Downloaded {new_model} on next launch.")

        self.config.set_bulk(updates)
        self.settings_changed.emit()
        self.close()
@@ -1,62 +0,0 @@
"""
Style Engine Module.
====================

Centralized design system for the 2026 Premium UI.
Defines color palettes, glassmorphism templates, and modern font loading.
"""

from PySide6.QtGui import QColor, QFont, QFontDatabase
import os

class Theme:
    """Premium Dark Theme Palette (2026 Edition)."""
    # Backgrounds
    BG_DARK = "#0d0d12"   # Deep cosmic black
    BG_CARD = "#16161e"   # Slightly lighter for components
    BG_GLASS = "rgba(22, 22, 30, 0.7)"  # Semi-transparent for glass effect

    # Neons & Accents
    ACCENT_CYAN = "#00f2ff"    # Electric cyan
    ACCENT_PURPLE = "#7000ff"  # Deep cyber purple
    ACCENT_GREEN = "#00ff88"   # Mint neon

    # Text
    TEXT_PRIMARY = "#ffffff"    # Pure white
    TEXT_SECONDARY = "#9499b0"  # Muted blue-gray
    TEXT_MUTED = "#565f89"      # Darker blue-gray

    # Borders
    BORDER_SUBTLE = "rgba(100, 100, 150, 0.2)"
    BORDER_GLOW = "rgba(0, 242, 255, 0.5)"

class StyleGenerator:
    """Generates QSS strings for complex effects."""

    @staticmethod
    def get_glass_card(radius=12, border=True):
        """Returns QSS for a glassmorphism card."""
        border_css = f"border: 1px solid {Theme.BORDER_SUBTLE};" if border else "border: none;"
        return f"""
            background-color: {Theme.BG_GLASS};
            border-radius: {radius}px;
            {border_css}
        """

    @staticmethod
    def get_glow_border(color=Theme.ACCENT_CYAN):
        """Returns QSS for a glowing border state."""
        return f"border: 1px solid {color};"

def load_modern_fonts():
    """Attempts to load a modern font stack for the 2026 look."""
    # Preferred order: Segoe UI Variable, Inter, Segoe UI, sans-serif
    families = ["Segoe UI Variable Text", "Inter", "Segoe UI", "sans-serif"]

    for family in families:
        font = QFont(family, 10)
        if QFontDatabase.families().count(family) > 0:
            return font

    # Absolute fallback
    return QFont("Arial", 10)
@@ -1,117 +0,0 @@
"""
Audio Visualizer Module.
========================

High-Fidelity rendering for the 2026 Premium UI.
Supports 'Classic Bars' and 'Neon Line' with smooth curves and glows.
"""

from PySide6.QtWidgets import QWidget
from PySide6.QtCore import Qt, QTimer, Slot, QRectF, QPointF
from PySide6.QtGui import QPainter, QBrush, QColor, QPainterPath, QPen, QLinearGradient
import random

from src.ui.styles import Theme

class AudioVisualizer(QWidget):
    """
    A premium audio visualizer with smooth physics and neon aesthetics.
    """

    def __init__(self, parent=None):
        super().__init__(parent)
        self.amplitude = 0.0
        self.bars = 12
        self.history = [0.0] * self.bars

        # High-refresh timer for silky smooth motion
        self.timer = QTimer(self)
        self.timer.timeout.connect(self.update_animation)
        self.timer.start(16)  # ~60 FPS

    @Slot(float)
    def set_amplitude(self, amp: float):
        self.amplitude = amp

    def update_animation(self):
        self.history.pop(0)
        # Smooth interpolation + noise
        jitter = random.uniform(0.01, 0.03)
        # Decay logic: Gravity-like pull
        self.history.append(max(self.amplitude, jitter))
        self.update()

    def paintEvent(self, event):
        from src.core.config import ConfigManager
        style = ConfigManager().get("visualizer_style")

        painter = QPainter(self)
        painter.setRenderHint(QPainter.Antialiasing)

        w, h = self.width(), self.height()
        painter.translate(0, h / 2)

        if style == "bar":
            self._draw_bars(painter, w, h)
        else:
            self._draw_line(painter, w, h)

    def _draw_bars(self, painter, w, h):
        bar_w = w / self.bars
        spacing = 3

        for i, val in enumerate(self.history):
            bar_h = val * (h * 0.9)
            x = i * bar_w

            # Gradient Bar
            grad = QLinearGradient(0, -bar_h/2, 0, bar_h/2)
            grad.setColorAt(0, QColor(Theme.ACCENT_PURPLE))
            grad.setColorAt(1, QColor(Theme.ACCENT_CYAN))

            painter.setBrush(grad)
            painter.setPen(Qt.NoPen)
            painter.drawRoundedRect(QRectF(x + spacing, -bar_h/2, bar_w - spacing*2, bar_h), 3, 3)

    def _draw_line(self, painter, w, h):
        path = QPainterPath()
        points = len(self.history)
        dx = w / (points - 1)

        path.moveTo(0, 0)

        def get_path(multi):
            p = QPainterPath()
            p.moveTo(0, 0)
            for i in range(points):
                curr_x = i * dx
                curr_y = -self.history[i] * (h * 0.45) * multi
                if i == 0:
                    p.moveTo(curr_x, curr_y)
                else:
                    prev_x = (i-1) * dx
                    # Simple lerp or quadTo for smoothness
                    p.lineTo(curr_x, curr_y)
            return p

        # Draw Top & Bottom
        p_top = get_path(1)
        p_bot = get_path(-1)

        # Glow layer
        glow_pen = QPen(QColor(Theme.ACCENT_CYAN))
        glow_pen.setWidth(4)
        glow_alpha = QColor(Theme.ACCENT_CYAN)
        glow_alpha.setAlpha(60)
        glow_pen.setColor(glow_alpha)

        painter.setPen(glow_pen)
        painter.drawPath(p_top)
        painter.drawPath(p_bot)

        # Core layer
        core_pen = QPen(Qt.white)
        core_pen.setWidth(2)
        painter.setPen(core_pen)
        painter.drawPath(p_top)
        painter.drawPath(p_bot)
32 src/utils/formatters.py Normal file
@@ -0,0 +1,32 @@
"""
Formatter Utilities
===================
Helper functions for text formatting.
"""


def format_hotkey(sequence: str) -> str:
    """
    Formats a hotkey sequence string (e.g. 'ctrl+alt+f9')
    into a pretty readable string (e.g. 'Ctrl + Alt + F9').
    """
    if not sequence:
        return "None"

    parts = sequence.split('+')
    formatted_parts = []

    for p in parts:
        p = p.strip().lower()
        if p == 'ctrl': formatted_parts.append('Ctrl')
        elif p == 'alt': formatted_parts.append('Alt')
        elif p == 'shift': formatted_parts.append('Shift')
        elif p == 'win': formatted_parts.append('Win')
        elif p == 'esc': formatted_parts.append('Esc')
        else:
            # Capitalize first letter
            if len(p) > 0:
                formatted_parts.append(p[0].upper() + p[1:])
            else:
                formatted_parts.append(p)

    return " + ".join(formatted_parts)
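A few example invocations of `format_hotkey` show the intended round-trip from raw config strings to display text. The helper is transcribed standalone here so the snippet runs outside the package; the logic mirrors the new `src/utils/formatters.py` above.

```python
def format_hotkey(sequence: str) -> str:
    # Standalone transcription of src/utils/formatters.py:format_hotkey
    if not sequence:
        return "None"
    special = {"ctrl": "Ctrl", "alt": "Alt", "shift": "Shift", "win": "Win", "esc": "Esc"}
    formatted = []
    for p in sequence.split("+"):
        p = p.strip().lower()
        # Known modifiers get canonical casing; everything else is capitalized
        formatted.append(special.get(p, p[:1].upper() + p[1:] if p else p))
    return " + ".join(formatted)

print(format_hotkey("ctrl+alt+f9"))  # Ctrl + Alt + F9
print(format_hotkey("f10"))          # F10
print(format_hotkey(""))             # None
```

Note that unrecognized tokens are only first-letter capitalized, so "space" renders as "Space" and "caps lock" as "Caps lock".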
@@ -55,6 +55,10 @@ except AttributeError:
    def LOWORD(l): return l & 0xffff
    def HIWORD(l): return (l >> 16) & 0xffff

GWL_EXSTYLE = -20
WS_EX_TRANSPARENT = 0x00000020
WS_EX_LAYERED = 0x00080000

class WindowHook:
    def __init__(self, hwnd, width, height, initial_scale=1.0):
        self.hwnd = hwnd
@@ -65,6 +69,34 @@ class WindowHook:
        # (Window 420x140, Pill 380x100)
        self.logical_rect = [20, 20, 20+380, 20+100]
        self.current_scale = initial_scale
        self.enabled = True  # New flag

    def set_enabled(self, enabled):
        """
        Enables or disables interaction.
        When disabled, we set WS_EX_TRANSPARENT so clicks pass through physically.
        """
        if self.enabled == enabled:
            return

        self.enabled = enabled

        # Get current styles
        style = user32.GetWindowLongW(self.hwnd, GWL_EXSTYLE)

        if not enabled:
            # Enable Click-Through (Add Transparent)
            # We also ensure Layered is set (Qt usually sets it, but good to be sure)
            new_style = style | WS_EX_TRANSPARENT | WS_EX_LAYERED
        else:
            # Disable Click-Through (Remove Transparent)
            new_style = style & ~WS_EX_TRANSPARENT

        if new_style != style:
            SetWindowLongPtr(self.hwnd, GWL_EXSTYLE, new_style)

            # Force a redraw/frame update just in case
            user32.SetWindowPos(self.hwnd, 0, 0, 0, 0, 0, 0x0027)  # SWP_NOMOVE | SWP_NOSIZE | SWP_NOZORDER | SWP_FRAMECHANGED

    def install(self):
        proc_address = ctypes.cast(self.new_wnd_proc, ctypes.c_void_p)
@@ -73,6 +105,10 @@ class WindowHook:
    def wnd_proc_callback(self, hwnd, msg, wParam, lParam):
        try:
            if msg == WM_NCHITTEST:
                # If disabled (invisible/inactive), let clicks pass through (HTTRANSPARENT)
                if not self.enabled:
                    return HTTRANSPARENT

                res = self.on_nchittest(lParam)
                if res != 0:
                    return res
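The style arithmetic in `set_enabled` is plain bit-twiddling and can be checked without a window handle. This sketch reuses the same extended-style constants; it is illustrative only and makes no Win32 calls.

```python
WS_EX_TRANSPARENT = 0x00000020
WS_EX_LAYERED = 0x00080000

def next_style(style: int, enabled: bool) -> int:
    # Mirrors WindowHook.set_enabled: disabled -> click-through
    if not enabled:
        # Add TRANSPARENT and make sure LAYERED is present
        return style | WS_EX_TRANSPARENT | WS_EX_LAYERED
    # Re-enable interaction: clear TRANSPARENT, leave LAYERED alone
    return style & ~WS_EX_TRANSPARENT

s = next_style(0, False)
print(hex(s))                 # 0x80020
print(hex(next_style(s, True)))  # 0x80000
```

Because only `WS_EX_TRANSPARENT` is cleared on re-enable, `WS_EX_LAYERED` persists once set, which matches the comment in the diff that Qt usually manages the layered flag itself.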