16 Commits

Author SHA1 Message Date
Your Name
aa2b0acd86 Add WCAG 2.2 AAA accessibility section to README 2026-02-18 22:30:48 +02:00
Your Name
08b9ecc1cb Untrack .gitignore and docs/ from repo 2026-02-18 22:26:32 +02:00
Your Name
02ef33023d Remove unused files: old UI, build scripts, fonts, test files
Remove old widget-based Python UI (replaced by QML), unused build
scripts, NL/variable/webfont variants, old shaders, PNG icons
(replaced by SVGs), and standalone test files. Add build.bat and
build.spec for the bootstrapper build system.
2026-02-18 22:25:04 +02:00
Your Name
d509eb5efb WCAG: Review fixes - arrow keys, PageTab roles, AlertMessage
Add Up/Down arrow key navigation to Settings sidebar.
Add Accessible.role: Accessible.PageTab to all 5 tab pages.
Fix Loader status to use AlertMessage role per WCAG 4.1.3.
2026-02-18 21:10:21 +02:00
Your Name
7c80ecfbed WCAG: Loader - subtitle contrast, accessibility, reduced motion 2026-02-18 21:06:43 +02:00
Your Name
d8707b5ade WCAG: Overlay - border, keyboard, accessibility, reduced motion 2026-02-18 21:06:43 +02:00
Your Name
07ad3b220d WCAG: Settings - AAA colors, keyboard nav, reduce motion toggle 2026-02-18 21:05:35 +02:00
Your Name
dc15e11e8e WCAG: Slider (24px, focus ring) + Switch (I/O marks, border token) 2026-02-18 21:03:21 +02:00
Your Name
a70e76b4ab WCAG: Add reduce_motion config, bridge property, OS detection
Config default, reduceMotion Q_PROPERTY on UIBridge, Windows
SystemParametersInfo detection for prefers-reduced-motion.
2026-02-18 21:02:27 +02:00
Your Name
d40c83cc45 WCAG: TextField, ComboBox, KeyRecorder - contrast, focus, accessible roles 2026-02-18 21:02:07 +02:00
Your Name
f2f80fc863 WCAG: GlowButton, SettingsSection, SettingsItem - contrast, accessible roles 2026-02-18 21:02:06 +02:00
Your Name
a6cf9efbcb WCAG: Update design tokens for AAA contrast compliance
accentPurple #7000FF->#B794F6 (7.2:1), textSecondary #999999->#ABABAB (8.1:1),
borderSubtle rgba(0.08)->rgba(0.22) (3:1). Add textDisabled, focusRingWidth, minTargetSize.
2026-02-18 21:00:49 +02:00
Your Name
4615f3084f Docs: Add WCAG 2.2 AAA step-by-step implementation plan
15 tasks covering: design tokens, 6 component fixes, Settings/Overlay/Loader
hardcoded colors, accessibility properties, keyboard nav, reduced motion.
2026-02-18 20:56:51 +02:00
Your Name
937061f710 Docs: Add WCAG 2.2 AAA compliance design document
Comprehensive design covering color system overhaul, component fixes,
accessibility properties, keyboard navigation, and reduced motion support.
2026-02-18 20:53:25 +02:00
Your Name
798a35e6d9 Feat: Integrated Local LLM (Llama 3.2 1B) for Intelligent Correction
- New Core: Added LLMEngine utilizing llama-cpp-python for local private text post-processing.
- Forensic Protocol: Engineered strict system prompts to prevent LLM refusals, censorship, or assistant chatter.
- Three Modes: Grammar, Standard, Rewrite.
- Start/Stop Logic: Consolidated conflicting recording methods.
- Hotkeys: Added dedicated F9 (Correct) vs F8 (Transcribe).
- UI: Updated Settings.
- Build: Updated portable_build.py.
- Docs: Updated README.
2026-01-31 01:02:24 +02:00
Your Name
6737ed4547 Release v1.0.4: The Compatibility Update
- Added robust CPU Fallback for AMD/Non-CUDA GPUs.
- Implemented Lazy Load for AI Engine to prevent startup crashes.
- Added explicit DLL injection for Cublas/Cudnn on Windows.
- Added Corrupt Model Auto-Repair logic.
- Includes pre-compiled v1.0.4 executable.
2026-01-25 20:28:01 +02:00
84 changed files with 1147 additions and 1488 deletions

25
.gitignore vendored

@@ -1,25 +0,0 @@
# Python
__pycache__/
*.py[cod]
*$py.class
# Virtual Environment
venv/
env/
# Distribution / Build
dist/
build/
*.spec
_unused_files/
runtime/
# IDEs
.vscode/
.idea/
# Application Specific
models/
recordings/
*.log
settings.json

View File

@@ -43,6 +43,18 @@ Whisper Voice operates directly on the metal. It is not an API wrapper; it is an
| **Sensory Gate** | **Silero VAD** | Enterprise-grade Voice Activity Detection filters out the noise, ensuring only pure intent is processed. |
| **Interface** | **Qt 6 / QML** | Hardware-accelerated, glassmorphic UI that is fluid, responsive, and sovereign. |
### 🛑 Compatibility Matrix (Windows)
The core engine (`CTranslate2`) is heavily optimized for Nvidia tensor cores.
| Manufacturer | Hardware | Status | Notes |
| :--- | :--- | :--- | :--- |
| **Nvidia** | GTX 900+ / RTX | ✅ **Supported** | Full heavy-metal acceleration. |
| **AMD** | Radeon RX | ⚠️ **CPU Fallback** | Runs on CPU. Valid for `Small/Medium`, slow for `Large`. |
| **Intel** | Arc / Iris | ⚠️ **CPU Fallback** | Runs on CPU. Valid for `Small/Medium`, slow for `Large`. |
| **Apple** | M1 / M2 / M3 | ❌ **Unsupported** | Release is strictly Windows x64. |
> **AMD Users**: v1.0.3 auto-detects GPU failures and silently falls back to CPU.
<br>
## 🖋️ Universal Transcription
@@ -56,14 +68,20 @@ At its core, Whisper Voice is the ultimate bridge between thought and text. It l
### Workflow: `F9 (Default)`
The primary channel for native-language transcription. It transcribes precisely what it hears in the language you speak (or the one you've locked in Settings).
### ✨ Style Prompting (New in v1.0.2)
Whisper Voice replaces traditional "grammar correction models" with a native **Style Prompting** engine. By injecting a specific "pre-prompt" into the model's context window, we can guide its internal style without external post-processing.
### 🧠 Intelligent Correction (New in v1.1.0)
Whisper Voice now integrates a local **Llama 3.2 1B** LLM to act as a "Silent Consultant". It post-processes transcripts to fix grammar or polish style without "chatting" back.
* **Standard (Default)**: Forces the model to use full sentences, proper capitalization, and periods. Ideal for dictation.
* **Casual**: Encourages a relaxed, lowercase style (e.g., "no way that's crazy lol").
* **Custom**: Allows you to seed the model with your own context (e.g., "Here is a list of medical terms:").
It is strictly prompted with a **Forensic Protocol**: it will never lecture you, never refuse to process explicit language, and never sanitize your words. Your profanity is yours to keep.
This approach incurs **zero latency penalty** and **zero extra VRAM** usage.
#### Correction Modes:
* **Standard (Default)**: Fixes grammar, punctuation, and capitalization while keeping every word you said.
* **Grammar Only**: Strictly fixes objective errors (spelling/agreement). Touches nothing else.
* **Rewrite**: Polishes the flow and clarity of your sentences while explicitly preserving your original tone (Casual stays casual, Formal stays formal).
#### Supported Languages:
The correction engine is optimized for **English, German, French, Italian, Portuguese, Spanish, Hindi, and Thai**. It also performs well on **Russian, Chinese, Japanese, and Romanian**.
This approach incurs a ~2s latency penalty but uses **zero extra VRAM** when in Low VRAM mode.
<br>
@@ -123,6 +141,54 @@ For users with limited GPU memory (e.g., 4GB cards) or those running heavy games
---
## ♿ Accessibility (WCAG 2.2 AAA)
Whisper Voice is built to be usable by everyone. The entire interface has been engineered to meet **WCAG 2.2 AAA** — the highest tier of accessibility compliance. This is not a checkbox exercise; it is a structural commitment.
### Color & Contrast
Every design token is calibrated for **Enhanced Contrast** (WCAG 1.4.6, 7:1 minimum):
| Token | Ratio | Purpose |
| :--- | :--- | :--- |
| `textPrimary` #FAFAFA | ~17:1 | Body text, headings |
| `textSecondary` #ABABAB | 8.1:1 | Descriptions, hints |
| `accentPurple` #B794F6 | 7.2:1 | Interactive elements, focus rings |
| `borderSubtle` | 3:1 | Non-text contrast for borders and separators |
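The ratios in this table can be reproduced with the WCAG 2.x relative-luminance formula. The following is a minimal sketch, not the project's own tooling. The background colour `#121212` is an assumption for illustration, since the table does not state which background the tokens were measured against, so the printed values may differ from the ones listed.

```python
def _linear(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a '#RRGGBB' colour."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(fg: str, bg: str) -> float:
    """WCAG contrast ratio, always >= 1 regardless of argument order."""
    lum_a, lum_b = relative_luminance(fg), relative_luminance(bg)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


ASSUMED_BACKGROUND = "#121212"  # hypothetical; not taken from the source

tokens = {
    "textPrimary": "#FAFAFA",
    "textSecondary": "#ABABAB",
    "accentPurple": "#B794F6",
}
for name, color in tokens.items():
    print(f"{name}: {contrast_ratio(color, ASSUMED_BACKGROUND):.1f}:1")
```

A check like this could run in CI so that a token change which drops below 7:1 is caught before it ships.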
### Keyboard Navigation
Full keyboard access — no mouse required:
* **Tab / Shift+Tab**: Navigate between all interactive controls (sliders, switches, buttons, dropdowns, text fields).
* **Arrow Keys**: Navigate the Settings sidebar tabs.
* **Enter / Space**: Activate any focused control.
* **Focus Rings**: Every interactive element shows a visible 2px accent-colored focus indicator.
### Screen Reader Support
Every component is annotated with semantic roles and descriptive names:
* Buttons, sliders, checkboxes, combo boxes, text fields — all declare their `Accessible.role` and `Accessible.name`.
* Switches report "on" / "off" state in their accessible name.
* The loader status uses `AlertMessage` for live-region announcements.
* Settings tabs use `Tab` / `PageTab` roles matching WAI-ARIA patterns.
### Non-Color State Indicators
Toggle switches display **I/O marks** inside the thumb (not just color changes), ensuring state is perceivable without color vision (WCAG 1.4.1).
### Target Sizes
All interactive controls meet the **24px minimum** target size (WCAG 2.5.8). Slider handles, buttons, switches, and nav items are all comfortably clickable.
### Reduced Motion
A **Reduce Motion** toggle (Settings > Visuals) disables all decorative animations:
* Shader effects (gradient blobs, glow, CRT scanlines, rainbow waveform)
* Particle systems
* Pulsing animations (mic button, recording timer, border)
* Loader logo pulse and progress shimmer
The system also respects the **Windows "Show animations" preference** via `SystemParametersInfo` detection. Essential information (recording state, progress bars, timer text) remains fully functional.
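The OS-level check described above can be sketched as follows. This is an illustrative variant, not the application's exact code: the function name `os_animations_enabled` is hypothetical, and treating non-Windows platforms as "animations allowed" is an assumption. The value is read into a 32-bit integer because the Win32 `BOOL` type is 4 bytes wide.

```python
import ctypes
import sys

SPI_GETCLIENTAREAANIMATION = 0x1042  # Win32 constant for the "Show animations" setting


def os_animations_enabled() -> bool:
    """Return False only when Windows reports client-area animations as disabled."""
    if sys.platform != "win32":
        return True  # assumption: no OS-level signal is read on other platforms
    enabled = ctypes.c_int(1)  # Win32 BOOL is a 32-bit integer
    ok = ctypes.windll.user32.SystemParametersInfoW(
        SPI_GETCLIENTAREAANIMATION, 0, ctypes.byref(enabled), 0
    )
    # If the call itself fails, keep the permissive default.
    return bool(enabled.value) if ok else True


reduce_motion = not os_animations_enabled()
print(f"reduce_motion = {reduce_motion}")
```

The result would then seed the `reduce_motion` config value, which the Settings toggle can still override.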
---
## 🛠️ Deployment
### 📥 Installation


@@ -1,28 +1,28 @@
# Release v1.0.2
# Release v1.0.4
**"The Lightweight Release"**
**"The Compatibility Update"**
This release focuses on removing bloat and switching to a native approach for punctuation, resulting in a significantly faster and smaller application.
This release focuses on maximum stability across different hardware configurations (AMD, Intel, Nvidia) and fixing startup crashes related to corrupted models or missing drivers.
## 🚀 Key Changes
## 🛠️ Critical Fixes
### 1. Style Prompting (Replaces Grammar Model)
We have removed the heavy "Grammar Correction" model (M2M100) and replaced it with **Style Prompting**.
* **How it works**: Uses Whisper's internal context awareness to force proper punctuation.
* **New Settings**: Go to `Settings -> AI Engine` to choose a style:
* **Standard**: (Default) Forces full sentences and proper punctuation.
* **Casual**: Relaxed, lowercase style.
* **Custom**: Enter your own prompt context.
### 1. Robust CPU Fallback (AMD / Intel Support)
* **Problem**: Previously, if an AMD user tried to run the app, it would crash instantly because it tried to load Nvidia CUDA libraries by default.
* **Fix**: The app now **silently detects** if CUDA initialization fails (due to missing DLLs or incompatible hardware) and **automatically falls back to CPU mode**.
* **Result**: The app "just works" on any Windows machine, regardless of GPU.
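The fallback behaviour can be illustrated with a small device-ordering loop. This is a sketch under assumptions: `load_with_fallback` and `fake_factory` are hypothetical names, and the real application would pass a factory that constructs its transcription model for the given device.

```python
import logging
from typing import Callable, Sequence, Tuple, TypeVar

T = TypeVar("T")


def load_with_fallback(
    factory: Callable[[str], T],
    devices: Sequence[str] = ("cuda", "cpu"),
) -> Tuple[T, str]:
    """Try each device in order and return (model, device) for the first that loads."""
    last_error = None
    for device in devices:
        try:
            return factory(device), device
        except Exception as exc:  # missing DLLs, unsupported GPU, driver errors
            logging.warning("Loading on %s failed: %s", device, exc)
            last_error = exc
    raise RuntimeError("No usable compute device") from last_error


# Stand-in factory for demonstration: pretends CUDA is unavailable.
def fake_factory(device: str) -> str:
    if device == "cuda":
        raise OSError("CUDA library not found")
    return f"model-on-{device}"


model, device = load_with_fallback(fake_factory)
print(device)
```

Keeping the device order in a parameter makes it straightforward to force CPU-only mode from a setting.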
### 2. Bloat Removal
* **Removed**: `transformers`, `sentencepiece`, `accelerate` libraries.
* **Removed**: `grammar-m2m100` model downloader and logic.
* **Impact**: The application is lighter, installs faster, and uses less RAM.
### 2. Startup Crash Protection
* **Problem**: If `faster_whisper` was imported before checking for valid drivers, the app would crash on launch for some users.
* **Fix**: Implemented **Lazy Loading** for the AI engine. The app now starts the UI first, and only loads the heavy AI libraries inside a safety block that catches errors.
### 3. Stability Fixes
* **Fixed**: `NameError: 'torch' is not defined` when using Low VRAM Mode.
* **Fixed**: Bootstrapper now self-repairs missing dependencies if the environment gets corrupted.
### 3. Corrupt Model Auto-Repair
* **Problem**: Interrupted downloads could leave a corrupted model folder, preventing the app from ever starting again.
* **Fix**: If the app detects a "vocabulary missing" or invalid config error, it will now **automatically delete the corrupt folder** and allow you to re-download it cleanly.
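A rough sketch of the auto-repair idea follows. The function name `load_or_reset`, the folder name `whisper-small`, and the set of caught exception types are all assumptions for illustration; the release notes only say that a "vocabulary missing" or invalid config error triggers the cleanup.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Optional


def load_or_reset(model_dir: Path, loader: Callable[[Path], object]) -> Optional[object]:
    """Load a model; on failure delete its folder and return None so it can be re-downloaded."""
    try:
        return loader(model_dir)
    except (ValueError, RuntimeError, OSError):
        # Assumed error classes; a real check might inspect the message instead.
        shutil.rmtree(model_dir, ignore_errors=True)
        return None


# Demonstration with a deliberately broken model folder.
model_dir = Path(tempfile.mkdtemp()) / "whisper-small"  # hypothetical folder name
model_dir.mkdir()
(model_dir / "config.json").write_text("{ truncated")


def broken_loader(path: Path) -> object:
    raise RuntimeError("vocabulary missing")  # simulated corrupt-model failure


result = load_or_reset(model_dir, broken_loader)
print(result, model_dir.exists())
```

Returning `None` rather than raising lets the UI offer a fresh download instead of failing at startup.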
### 4. Windows DLL Injection
* **Fix**: Added explicit DLL path injection for `nvidia-cublas` and `nvidia-cudnn` to ensure Python 3.8+ can find the required CUDA libraries on Windows systems that don't have them in PATH.
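The DLL path injection can be sketched as two small helpers: one that finds the `bin` folders under `site-packages/nvidia`, and one that registers them. The helper names are hypothetical and the directory layout is created in a temporary folder purely for demonstration.

```python
import os
import tempfile
from pathlib import Path
from typing import List


def nvidia_bin_dirs(site_packages: Path) -> List[Path]:
    """Return every <site-packages>/nvidia/<package>/bin directory that exists."""
    nvidia_root = site_packages / "nvidia"
    if not nvidia_root.is_dir():
        return []
    return sorted(p / "bin" for p in nvidia_root.iterdir() if (p / "bin").is_dir())


def register_dll_dirs(dirs: List[Path]) -> int:
    """Register directories for DLL lookup; a no-op outside Windows."""
    if os.name != "nt" or not hasattr(os, "add_dll_directory"):
        return 0
    for d in dirs:
        os.add_dll_directory(str(d))
    return len(dirs)


# Demonstration layout (temporary, hypothetical package folders).
site_packages = Path(tempfile.mkdtemp()) / "site-packages"
for name in ("cublas", "cudnn"):
    (site_packages / "nvidia" / name / "bin").mkdir(parents=True)
(site_packages / "nvidia" / "no_bin_here").mkdir()

found = nvidia_bin_dirs(site_packages)
registered = register_dll_dirs(found)
print([p.parent.name for p in found], registered)
```

Separating discovery from registration keeps the discovery step testable on any platform.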
## 📦 Installation
1. Download `WhisperVoice.exe` (attached below or in `dist/`).
2. Run it. It will automatically update your environment if needed.
1. Download `WhisperVoice.exe` below.
2. Replace your existing `.exe`.
3. Run it.

Binary file not shown.



@@ -245,18 +245,38 @@ class Bootstrapper:
req_file = self.source_path / "requirements.txt"
# Use --prefer-binary to avoid building from source on Windows if possible
# Use --no-warn-script-location to reduce noise
# CRITICAL: Force --only-binary for llama-cpp-python to prevent picking new source-only versions
cmd = [
str(self.python_path / "python.exe"), "-m", "pip", "install",
"--prefer-binary",
"--only-binary", "llama-cpp-python",
"--extra-index-url", "https://abetlen.github.io/llama-cpp-python/whl/cpu",
"-r", str(req_file)
]
process = subprocess.Popen(
[str(self.python_path / "python.exe"), "-m", "pip", "install", "-r", str(req_file)],
cmd,
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT,
stderr=subprocess.STDOUT, # Merge stderr into stdout
text=True,
cwd=str(self.python_path),
creationflags=subprocess.CREATE_NO_WINDOW
)
output_buffer = []
for line in process.stdout:
if self.ui: self.ui.set_detail(line.strip()[:60])
process.wait()
line_stripped = line.strip()
if self.ui: self.ui.set_detail(line_stripped[:60])
output_buffer.append(line_stripped)
log(line_stripped)
return_code = process.wait()
if return_code != 0:
err_msg = "\n".join(output_buffer[-15:]) # Show last 15 lines
raise RuntimeError(f"Pip install failed (Exit code {return_code}):\n{err_msg}")
def refresh_app_source(self):
"""
@@ -348,8 +368,22 @@ class Bootstrapper:
return False
def check_dependencies(self):
"""Quick check if critical dependencies are installed."""
return True # Deprecated logic placeholder
"""Check if critical dependencies are importable in the embedded python."""
if not self.is_python_ready(): return False
try:
# Check for core libs that might be missing
# We use a subprocess to check imports in the runtime environment
subprocess.check_call(
[str(self.python_path / "python.exe"), "-c", "import faster_whisper; import llama_cpp; import PySide6"],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
cwd=str(self.python_path),
creationflags=subprocess.CREATE_NO_WINDOW
)
return True
except (subprocess.CalledProcessError, FileNotFoundError):
return False
def setup_and_run(self):
"""Full setup/update and run flow."""
@@ -359,11 +393,17 @@ class Bootstrapper:
self.download_python()
self._fix_pth_file() # Ensure pth is fixed immediately after download
self.install_pip()
self.install_packages()
# self.install_packages() # We'll do this in the dependency check step now
# Always refresh source to ensure we have the latest bundled code
self.refresh_app_source()
# 2. Check and Install Dependencies
# We do this AFTER refreshing source so we have the latest requirements.txt
if not self.check_dependencies():
log("Dependencies missing or incomplete. Installing...")
self.install_packages()
# Launch
if self.run_app():
if self.ui: self.ui.root.quit()
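Reviewer note: the subprocess-based import probe used by `check_dependencies` can be exercised in isolation. The sketch below is a generalised variant, not the bootstrapper's code. `imports_ok` is a hypothetical helper, and it is demonstrated against the current interpreter rather than the embedded `python.exe`.

```python
import subprocess
import sys
from typing import Iterable


def imports_ok(python_exe: str, modules: Iterable[str]) -> bool:
    """Return True if every module imports cleanly in the given interpreter."""
    code = "; ".join(f"import {name}" for name in modules)
    try:
        subprocess.check_call(
            [python_exe, "-c", code],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return True
    except (subprocess.CalledProcessError, FileNotFoundError):
        # Non-zero exit (import error) or a missing interpreter both count as "not ready".
        return False


print(imports_ok(sys.executable, ["json", "os"]))
print(imports_ok(sys.executable, ["module_that_does_not_exist_xyz"]))
```

Running the probe in a child process means a crashing native import cannot take the bootstrapper down with it.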

31
build.bat Normal file

@@ -0,0 +1,31 @@
@echo off
echo ============================================
echo Building WhisperVoice Portable EXE
echo ============================================
echo.
if not exist venv (
echo ERROR: venv not found. Run run_source.bat first.
pause
exit /b 1
)
call venv\Scripts\activate
echo Running PyInstaller (single-file bootstrapper)...
pyinstaller build.spec --clean --noconfirm
if %ERRORLEVEL% NEQ 0 (
echo.
echo BUILD FAILED! Check errors above.
pause
exit /b 1
)
echo.
echo Build complete!
echo.
echo Output: dist\WhisperVoice.exe
echo.
echo This single exe will download all dependencies on first run.
pause

95
build.spec Normal file

@@ -0,0 +1,95 @@
# -*- mode: python ; coding: utf-8 -*-
# WhisperVoice — Single-file portable bootstrapper
#
# This builds a TINY exe that contains only:
# - The bootstrapper (downloads Python + deps on first run)
# - The app source code (bundled as data, extracted to runtime/app/)
#
# NO heavy dependencies (torch, PySide6, etc.) are bundled.
import os
import glob
block_cipher = None
# ── Collect app source as data (goes into app_source/ inside the bundle) ──
app_datas = []
# main.py
app_datas.append(('main.py', 'app_source'))
# requirements.txt
app_datas.append(('requirements.txt', 'app_source'))
# src/**/*.py (core, ui, utils — preserving directory structure)
for py in glob.glob('src/**/*.py', recursive=True):
dest = os.path.join('app_source', os.path.dirname(py))
app_datas.append((py, dest))
# src/ui/qml/** (QML files, shaders, SVGs, fonts, qmldir)
qml_dir = os.path.join('src', 'ui', 'qml')
for pattern in ('*.qml', '*.qsb', '*.frag', '*.svg', '*.ico', '*.png',
'qmldir', 'AUTHORS.txt', 'OFL.txt'):
for f in glob.glob(os.path.join(qml_dir, pattern)):
app_datas.append((f, os.path.join('app_source', qml_dir)))
# Fonts
for f in glob.glob(os.path.join(qml_dir, 'fonts', 'ttf', '*.ttf')):
app_datas.append((f, os.path.join('app_source', qml_dir, 'fonts', 'ttf')))
# assets/
if os.path.exists(os.path.join('assets', 'icon.ico')):
app_datas.append((os.path.join('assets', 'icon.ico'), os.path.join('app_source', 'assets')))
# ── Analysis — only the bootstrapper, NO heavy imports ────────────────────
a = Analysis(
['bootstrapper.py'],
pathex=[],
binaries=[],
datas=app_datas,
hiddenimports=[],
hookspath=[],
hooksconfig={},
runtime_hooks=[],
excludes=[
# Exclude everything heavy — the bootstrapper only uses stdlib
'torch', 'numpy', 'scipy', 'PySide6', 'shiboken6',
'faster_whisper', 'ctranslate2', 'llama_cpp',
'sounddevice', 'soundfile', 'keyboard', 'pyperclip',
'psutil', 'pynvml', 'pystray', 'PIL', 'Pillow',
'darkdetect', 'huggingface_hub', 'requests',
'tqdm', 'onnxruntime', 'av',
'tkinter', 'matplotlib', 'notebook', 'IPython',
],
win_no_prefer_redirects=False,
win_private_assemblies=False,
cipher=block_cipher,
noarchive=False,
)
pyz = PYZ(a.pure, a.zipped_data, cipher=block_cipher)
# ── Single-file EXE (--onefile) ──────────────────────────────────────────
exe = EXE(
pyz,
a.scripts,
a.binaries,
a.zipfiles,
a.datas,
[],
name='WhisperVoice',
debug=False,
bootloader_ignore_signals=False,
strip=False,
upx=True,
console=False, # No console — bootstrapper allocates one when needed
disable_windowed_traceback=False,
argv_emulation=False,
target_arch=None,
codesign_identity=None,
entitlements_file=None,
icon='assets/icon.ico',
)


@@ -1,66 +0,0 @@
"""
Build the Lightweight Bootstrapper
==================================
This creates a small (~15-20MB) .exe that downloads Python + dependencies on first run.
"""
import os
import shutil
import PyInstaller.__main__
from pathlib import Path
def build_bootstrapper():
project_root = Path(__file__).parent.absolute()
dist_path = project_root / "dist"
# Collect all app source files to bundle
# These will be extracted and used when setting up
app_source_files = [
("src", "app_source/src"),
("assets", "app_source/assets"), # Include icon etc
("main.py", "app_source"),
("requirements.txt", "app_source"),
]
add_data_args = []
for src, dst in app_source_files:
src_path = project_root / src
if src_path.exists():
add_data_args.extend(["--add-data", f"{src}{os.pathsep}{dst}"])
# Use absolute project root for copying
shutil.copy2(project_root / "assets" / "icon.ico", project_root / "app_icon.ico")
print("🚀 Building Lightweight Bootstrapper...")
print("⏳ This creates a small .exe that downloads dependencies on first run.\n")
PyInstaller.__main__.run([
"bootstrapper.py",
"--name=WhisperVoice",
"--onefile",
"--noconsole", # Re-enabled! Error handling in bootstrapper is ready.
"--clean",
"--icon=app_icon.ico", # Simplified path at root
*add_data_args,
])
exe_path = dist_path / "WhisperVoice.exe"
if exe_path.exists():
size_mb = exe_path.stat().st_size / (1024 * 1024)
print("\n" + "="*60)
print("✅ BOOTSTRAPPER BUILD COMPLETE!")
print("="*60)
print(f"\n📍 Output: {exe_path}")
print(f"📦 Size: {size_mb:.1f} MB")
print("\n📋 How it works:")
print(" 1. User runs WhisperVoice.exe")
print(" 2. First run: Downloads Python + packages (~2-3GB)")
print(" 3. Subsequent runs: Launches instantly")
print("\n💡 The 'runtime/' folder will be created next to the .exe")
else:
print("\n❌ Build failed. Check the output above for errors.")
if __name__ == "__main__":
os.chdir(Path(__file__).parent)
build_bootstrapper()


@@ -1,17 +0,0 @@
@echo off
echo Building Whisper Voice Portable EXE...
if not exist venv (
echo Please run run_source.bat first to setup environment!
pause
exit /b
)
call venv\Scripts\activate
pip install pyinstaller
echo Running PyInstaller...
pyinstaller build.spec --clean --noconfirm
echo.
echo Build Complete! Check dist/WhisperVoice.exe
pause


@@ -1,14 +0,0 @@
from PIL import Image
import os
# Path from the generate_image tool output
src = r"C:/Users/lashman/.gemini/antigravity/brain/9a183770-2481-475b-b748-03f4910f9a8e/app_icon_1769195450659.png"
dst = r"d:\!!! SYSTEM DATA !!!\Desktop\python crap\whisper_voice\assets\icon.ico"
if os.path.exists(src):
img = Image.open(src)
# Resize to standard icon sizes
img.save(dst, format='ICO', sizes=[(256, 256)])
print(f"Icon saved to {dst}")
else:
print(f"Source image not found: {src}")

BIN
dist/WhisperVoice.exe vendored

Binary file not shown.


@@ -1,43 +0,0 @@
import requests
import os
ICONS = {
"settings.svg": "https://raw.githubusercontent.com/FortAwesome/Font-Awesome/6.x/svgs/solid/gear.svg",
"visibility.svg": "https://raw.githubusercontent.com/FortAwesome/Font-Awesome/6.x/svgs/solid/eye.svg",
"smart_toy.svg": "https://raw.githubusercontent.com/FortAwesome/Font-Awesome/6.x/svgs/solid/brain.svg",
"microphone.svg": "https://raw.githubusercontent.com/FortAwesome/Font-Awesome/6.x/svgs/solid/microphone.svg"
}
TARGET_DIR = r"d:\!!! SYSTEM DATA !!!\Desktop\python crap\whisper_voice\src\ui\qml"
def download_icons():
if not os.path.exists(TARGET_DIR):
print(f"Directory not found: {TARGET_DIR}")
return
for filename, url in ICONS.items():
try:
print(f"Downloading {filename} from {url}...")
response = requests.get(url, timeout=10)
response.raise_for_status()
# Force white fill
content = response.text
if "<path" in content and "fill=" not in content:
content = content.replace("<path", '<path fill="#ffffff"')
elif "<path" in content and "fill=" in content:
# Regex or simple replace if possible, but simplest is usually just injecting style or checking common FA format
pass # FA standard usually has no fill.
# Additional safety: Replace currentcolor if present
content = content.replace("currentColor", "#ffffff")
filepath = os.path.join(TARGET_DIR, filename)
with open(filepath, 'w', encoding='utf-8') as f:
f.write(content)
print(f"Saved {filepath} (modified to white)")
except Exception as e:
print(f"FAILED to download {filename}: {e}")
if __name__ == "__main__":
download_icons()

263
main.py

@@ -9,6 +9,31 @@ app_dir = os.path.dirname(os.path.abspath(__file__))
if app_dir not in sys.path:
sys.path.insert(0, app_dir)
# -----------------------------------------------------------------------------
# WINDOWS DLL FIX (CRITICAL for Portable CUDA)
# Python 3.8+ on Windows requires explicit DLL directory addition.
# -----------------------------------------------------------------------------
if os.name == 'nt' and hasattr(os, 'add_dll_directory'):
try:
from pathlib import Path
# Scan sys.path for site-packages
for p in sys.path:
path_obj = Path(p)
if path_obj.name == 'site-packages' and path_obj.exists():
nvidia_path = path_obj / "nvidia"
if nvidia_path.exists():
for subdir in nvidia_path.iterdir():
# Add 'bin' folder from each nvidia stub (cublas, cudnn, etc.)
bin_path = subdir / "bin"
if bin_path.exists():
os.add_dll_directory(str(bin_path))
# Also try adding site-packages itself just in case
# os.add_dll_directory(str(path_obj))
break
except Exception:
pass
# -----------------------------------------------------------------------------
from PySide6.QtWidgets import QApplication, QFileDialog, QMessageBox
from PySide6.QtCore import QObject, Slot, Signal, QThread, Qt, QUrl
from PySide6.QtQml import QQmlApplicationEngine
@@ -19,6 +44,7 @@ from src.ui.bridge import UIBridge
from src.ui.tray import SystemTray
from src.core.audio_engine import AudioEngine
from src.core.transcriber import WhisperTranscriber
from src.core.llm_engine import LLMEngine
from src.core.hotkey_manager import HotkeyManager
from src.core.config import ConfigManager
from src.utils.injector import InputInjector
@@ -54,6 +80,21 @@ try:
except Exception:
pass
# Detect Windows "Reduce Motion" preference
try:
import ctypes
SPI_GETCLIENTAREAANIMATION = 0x1042
animation_enabled = ctypes.c_int(1)  # Win32 BOOL is a 32-bit int; c_bool is only 1 byte
ctypes.windll.user32.SystemParametersInfoW(
SPI_GETCLIENTAREAANIMATION, 0,
ctypes.byref(animation_enabled), 0
)
if not animation_enabled.value:
ConfigManager().data["reduce_motion"] = True
ConfigManager().save()
except Exception:
pass
# Configure Logging
class QmlLoggingHandler(logging.Handler, QObject):
sig_log = Signal(str)
@@ -163,6 +204,69 @@ class DownloadWorker(QThread):
logging.error(f"Download failed: {e}")
self.error.emit(str(e))
class LLMDownloadWorker(QThread):
progress = Signal(int)
finished = Signal()
error = Signal(str)
def __init__(self, parent=None):
super().__init__(parent)
def run(self):
try:
import requests
# Support one model for now
url = "https://huggingface.co/hugging-quants/Llama-3.2-1B-Instruct-Q4_K_M-GGUF/resolve/main/llama-3.2-1b-instruct-q4_k_m.gguf?download=true"
fname = "llama-3.2-1b-instruct-q4_k_m.gguf"
model_path = get_models_path() / "llm" / "llama-3.2-1b-instruct"
model_path.mkdir(parents=True, exist_ok=True)
dest_file = model_path / fname
# Simple check if exists and > 0 size?
# We assume if the user clicked download, they want to download it.
with requests.Session() as s:
head = s.head(url, allow_redirects=True)
total_size = int(head.headers.get('content-length', 0))
resp = s.get(url, stream=True)
resp.raise_for_status()
downloaded = 0
with open(dest_file, 'wb') as f:
for chunk in resp.iter_content(chunk_size=8192):
if chunk:
f.write(chunk)
downloaded += len(chunk)
if total_size > 0:
pct = int((downloaded / total_size) * 100)
self.progress.emit(pct)
self.finished.emit()
except Exception as e:
logging.error(f"LLM Download failed: {e}")
self.error.emit(str(e))
class LLMWorker(QThread):
finished = Signal(str)
def __init__(self, llm_engine, text, mode, parent=None):
super().__init__(parent)
self.llm_engine = llm_engine
self.text = text
self.mode = mode
def run(self):
try:
corrected = self.llm_engine.correct_text(self.text, self.mode)
self.finished.emit(corrected)
except Exception as e:
logging.error(f"LLMWorker crashed: {e}")
self.finished.emit(self.text) # Fail safe: return original text
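Reviewer note: the fail-safe in `LLMWorker.run` (emit the original text if correction raises) can be expressed as a plain function, which makes it testable without a Qt event loop. This is a sketch only. `correct_or_passthrough` and the stand-in correctors are hypothetical and do not call the real `LLMEngine`.

```python
import logging
from typing import Callable


def correct_or_passthrough(correct: Callable[[str, str], str], text: str, mode: str) -> str:
    """Run a correction callable; on any failure return the raw transcript unchanged."""
    try:
        return correct(text, mode)
    except Exception as exc:
        logging.error("Correction failed, returning raw transcript: %s", exc)
        return text


# Stand-in correctors for demonstration.
def failing_corrector(text: str, mode: str) -> str:
    raise RuntimeError("model not loaded")


def upper_corrector(text: str, mode: str) -> str:
    return text.upper()


raw = "hello world"
print(correct_or_passthrough(failing_corrector, raw, "standard"))
print(correct_or_passthrough(upper_corrector, raw, "standard"))
```

The thread's `run` method would then reduce to a single `self.finished.emit(...)` call around this function.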
class TranscriptionWorker(QThread):
finished = Signal(str)
def __init__(self, transcriber, audio_data, is_file=False, parent=None, task_override=None):
@@ -204,6 +308,7 @@ class WhisperApp(QObject):
self.bridge.settingChanged.connect(self.on_settings_changed)
self.bridge.hotkeysEnabledChanged.connect(self.on_hotkeys_enabled_toggle)
self.bridge.downloadRequested.connect(self.on_download_requested)
self.bridge.llmDownloadRequested.connect(self.on_llm_download_requested)
self.engine.rootContext().setContextProperty("ui", self.bridge)
@@ -224,7 +329,9 @@ class WhisperApp(QObject):
# 3. Logic Components Placeholders
self.audio_engine = None
self.transcriber = None
self.llm_engine = None
self.hk_transcribe = None
self.hk_correct = None
self.hk_translate = None
self.overlay_root = None
@@ -319,14 +426,19 @@ class WhisperApp(QObject):
self.audio_engine.set_visualizer_callback(self.bridge.update_amplitude)
self.audio_engine.set_silence_callback(self.on_silence_detected)
self.transcriber = WhisperTranscriber()
self.llm_engine = LLMEngine()
# Dual Hotkey Managers
self.hk_transcribe = HotkeyManager(config_key="hotkey")
self.hk_transcribe.triggered.connect(lambda: self.toggle_recording(task_override="transcribe"))
self.hk_transcribe.triggered.connect(lambda: self.toggle_recording(task_override="transcribe", task_mode="standard"))
self.hk_transcribe.start()
self.hk_correct = HotkeyManager(config_key="hotkey_correct")
self.hk_correct.triggered.connect(lambda: self.toggle_recording(task_override="transcribe", task_mode="correct"))
self.hk_correct.start()
self.hk_translate = HotkeyManager(config_key="hotkey_translate")
self.hk_translate.triggered.connect(lambda: self.toggle_recording(task_override="translate"))
self.hk_translate.triggered.connect(lambda: self.toggle_recording(task_override="translate", task_mode="standard"))
self.hk_translate.start()
self.bridge.update_status("Ready")
@@ -334,6 +446,57 @@ class WhisperApp(QObject):
def run(self):
sys.exit(self.qt_app.exec())
@Slot(str, str)
@Slot(str)
def toggle_recording(self, task_override=None, task_mode="standard"):
"""
task_override: 'transcribe' or 'translate' (passed to whisper)
task_mode: 'standard' or 'correct' (determines post-processing)
"""
if task_mode == "correct":
self.current_task_requires_llm = True
elif task_mode == "standard":
self.current_task_requires_llm = False # Explicit reset
# Actual Logic
if self.bridge.isRecording:
logging.info("Stopping recording...")
# stop_recording returns the numpy array directly
audio_data = self.audio_engine.stop_recording()
self.bridge.isRecording = False
self.bridge.update_status("Processing...")
self.bridge.isProcessing = True
# Save task override for processing
self.last_task_override = task_override
if audio_data is not None and len(audio_data) > 0:
# Use the task that started this session, or the override if provided
final_task = getattr(self, "current_recording_task", self.config.get("task"))
if task_override: final_task = task_override
self.worker = TranscriptionWorker(self.transcriber, audio_data, parent=self, task_override=final_task)
self.worker.finished.connect(self.on_transcription_done)
self.worker.start()
else:
self.bridge.update_status("Ready")
self.bridge.isProcessing = False
else:
# START RECORDING
if self.bridge.isProcessing:
logging.warning("Ignored toggle request: Transcription in progress.")
return
intended_task = task_override if task_override else self.config.get("task")
self.current_recording_task = intended_task
logging.info(f"Starting recording... (Task: {intended_task}, Mode: {task_mode})")
self.audio_engine.start_recording()
self.bridge.isRecording = True
self.bridge.update_status(f"Recording ({intended_task})...")
@Slot()
def quit_app(self):
logging.info("Shutting down...")
@@ -422,14 +585,16 @@ class WhisperApp(QObject):
print(f"Setting Changed: {key} = {value}")
# 1. Hotkey Reload
if key in ["hotkey", "hotkey_translate"]:
if key in ["hotkey", "hotkey_translate", "hotkey_correct"]:
if self.hk_transcribe: self.hk_transcribe.reload_hotkey()
if self.hk_correct: self.hk_correct.reload_hotkey()
if self.hk_translate: self.hk_translate.reload_hotkey()
if self.tray:
hk1 = self.format_hotkey(self.config.get("hotkey"))
hk3 = self.format_hotkey(self.config.get("hotkey_correct"))
hk2 = self.format_hotkey(self.config.get("hotkey_translate"))
self.tray.setToolTip(f"Whisper Voice\nTranscribe: {hk1}\nTranslate: {hk2}")
self.tray.setToolTip(f"Whisper Voice\nTranscribe: {hk1}\nCorrect: {hk3}\nTranslate: {hk2}")
# 2. AI Model Reload (Heavy)
if key in ["model_size", "compute_device", "compute_type"]:
@@ -546,40 +711,7 @@ class WhisperApp(QObject):
# Let's ensure toggle_recording handles no arg calls by stopping the CURRENT task.
QMetaObject.invokeMethod(self, "toggle_recording", Qt.QueuedConnection)
@Slot() # Modified to allow lambda override
def toggle_recording(self, task_override=None):
if not self.audio_engine: return
# Prevent starting a new recording while we are still transcribing the last one
if self.bridge.isProcessing:
logging.warning("Ignored toggle request: Transcription in progress.")
return
# Determine which task we are entering
if task_override:
intended_task = task_override
else:
intended_task = self.config.get("task")
if self.audio_engine.recording:
# STOP RECORDING
self.bridge.update_status("Thinking...")
self.bridge.isRecording = False
self.bridge.isProcessing = True # Start Processing
audio_data = self.audio_engine.stop_recording()
# Use the task that started this session, or the override if provided (though usually override is for starting)
final_task = getattr(self, "current_recording_task", self.config.get("task"))
self.worker = TranscriptionWorker(self.transcriber, audio_data, parent=self, task_override=final_task)
self.worker.finished.connect(self.on_transcription_done)
self.worker.start()
else:
# START RECORDING
self.current_recording_task = intended_task
self.bridge.update_status(f"Recording ({intended_task})...")
self.bridge.isRecording = True
self.audio_engine.start_recording()
@Slot(bool)
def on_ui_toggle_request(self, state):
@@ -589,12 +721,54 @@ class WhisperApp(QObject):
@Slot(str)
def on_transcription_done(self, text: str):
self.bridge.update_status("Ready")
self.bridge.isProcessing = False # End Processing
self.bridge.isProcessing = False # Temporarily false? No, keep it true if we chain.
# Check LLM Settings -> AND check if the current task requested it
llm_enabled = self.config.get("llm_enabled")
requires_llm = getattr(self, "current_task_requires_llm", False)
# We only correct if:
# 1. LLM is globally enabled (safety switch)
# 2. current_task_requires_llm is True (triggered by Correct hotkey)
# OR 3. Maybe user WANTS global correction? Ideally user uses separate hotkey.
# Let's say: If "Correction" is enabled in settings, does it apply to ALL?
# The user's feedback suggests they DON'T want it on regular hotkey.
# So we enforce: Correct Hotkey -> Corrects. Regular Hotkey -> Raw.
# BUT we must handle the case where user expects the old behavior?
# Let's make it strict: Only correct if triggered by correct hotkey OR if we add a "Correct All" toggle later.
# For now, let's respect the flag. But wait, if llm_enabled is OFF, we shouldn't run it even if hotkey pressed?
# Yes, safety switch.
if text and llm_enabled and requires_llm:
# Chain to LLM
self.bridge.isProcessing = True
self.bridge.update_status("Correcting...")
mode = self.config.get("llm_mode")
self.llm_worker = LLMWorker(self.llm_engine, text, mode, parent=self)
self.llm_worker.finished.connect(self.on_llm_done)
self.llm_worker.start()
return
self.bridge.isProcessing = False
if text:
method = self.config.get("input_method")
speed = int(self.config.get("typing_speed"))
InputInjector.inject_text(text, method, speed)
@Slot(str)
def on_llm_done(self, text: str):
self.bridge.update_status("Ready")
self.bridge.isProcessing = False
if text:
method = self.config.get("input_method")
speed = int(self.config.get("typing_speed"))
InputInjector.inject_text(text, method, speed)
# Cleanup
if hasattr(self, 'llm_worker') and self.llm_worker:
self.llm_worker.deleteLater()
self.llm_worker = None
@Slot(bool)
def on_hotkeys_enabled_toggle(self, state):
if self.hk_transcribe: self.hk_transcribe.set_enabled(state)
@@ -613,6 +787,19 @@ class WhisperApp(QObject):
self.download_worker.error.connect(self.on_download_error)
self.download_worker.start()
@Slot()
def on_llm_download_requested(self):
if self.bridge.isDownloading: return
self.bridge.update_status("Downloading LLM...")
self.bridge.isDownloading = True
self.llm_dl_worker = LLMDownloadWorker(parent=self)
self.llm_dl_worker.progress.connect(self.on_loader_progress) # Reuse existing progress slot? Yes.
self.llm_dl_worker.finished.connect(self.on_download_finished) # Reuses same cleanup
self.llm_dl_worker.error.connect(self.on_download_error)
self.llm_dl_worker.start()
def on_download_finished(self):
self.bridge.isDownloading = False
self.bridge.update_status("Ready")
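
Reviewer sketch (not part of this commit): the long comment block in `on_transcription_done` settles on a single gating rule, which can be written as one predicate. The helper name `should_run_llm` is hypothetical; its three inputs mirror `text`, `llm_enabled`, and the `current_task_requires_llm` flag read in the diff.

```python
def should_run_llm(text: str, llm_enabled: bool, requires_llm: bool) -> bool:
    """Return True only when the transcript should be chained to the LLM.

    Hypothetical helper mirroring `if text and llm_enabled and requires_llm`.
    """
    return bool(text) and llm_enabled and requires_llm


# Regular hotkey: raw text is injected even when the LLM is enabled.
regular = should_run_llm("hello world", llm_enabled=True, requires_llm=False)
# Correct hotkey while the global safety switch is off: no correction.
switched_off = should_run_llm("hello world", llm_enabled=False, requires_llm=True)
# Correct hotkey with the switch on: chain to the LLM worker.
chained = should_run_llm("hello world", llm_enabled=True, requires_llm=True)
print(regular, switched_off, chained)
```

Keeping the rule in one place would make a later "Correct All" toggle, as floated in the comments, a one-line change.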


@@ -1,85 +0,0 @@
"""
Portable Build Script for WhisperVoice.
=======================================
Creates a single-file portable .exe using PyInstaller.
All data (settings, models) will be stored next to the .exe at runtime.
"""
import os
import shutil
import PyInstaller.__main__
from pathlib import Path
def build_portable():
# 1. Setup Paths
project_root = Path(__file__).parent.absolute()
dist_path = project_root / "dist"
build_path = project_root / "build"
# 2. Define Assets to bundle (into the .exe)
# Format: (Source, Destination relative to bundle root)
data_files = [
# QML files
("src/ui/qml/*.qml", "src/ui/qml"),
("src/ui/qml/*.svg", "src/ui/qml"),
("src/ui/qml/*.qsb", "src/ui/qml"),
("src/ui/qml/fonts/ttf/*.ttf", "src/ui/qml/fonts/ttf"),
# Subprocess worker script (CRITICAL for transcription)
("src/core/transcribe_worker.py", "src/core"),
]
# Convert to PyInstaller format "--add-data source;dest" (Windows uses ';')
add_data_args = []
for src, dst in data_files:
add_data_args.extend(["--add-data", f"{src}{os.pathsep}{dst}"])
# 3. Run PyInstaller
print("🚀 Starting Portable Build...")
print("⏳ This may take 5-10 minutes...")
PyInstaller.__main__.run([
"bootstrapper.py", # Entry point (Tiny Installer)
"--name=WhisperVoice", # EXE name
"--onefile", # Single EXE
"--noconsole", # No terminal window
"--clean", # Clean cache
# Bundle the app source to be extracted by bootstrapper
# The bootstrapper expects 'app_source' folder in bundled resources
"--add-data", f"src{os.pathsep}app_source/src",
"--add-data", f"main.py{os.pathsep}app_source",
"--add-data", f"requirements.txt{os.pathsep}app_source",
# Add assets
"--add-data", f"src/ui/qml{os.pathsep}app_source/src/ui/qml",
"--add-data", f"assets{os.pathsep}app_source/assets",
# No heavy collections!
# The bootstrapper uses internal pip to install everything.
# Exclude heavy modules to ensure this exe stays tiny
"--exclude-module", "faster_whisper",
"--exclude-module", "torch",
"--exclude-module", "PySide6",
# Icon
# "--icon=icon.ico",
])
print("\n" + "="*60)
print("✅ BUILD COMPLETE!")
print("="*60)
print(f"\n📍 Output: {dist_path / 'WhisperVoice.exe'}")
print("\n📋 First run instructions:")
print(" 1. Place WhisperVoice.exe in a folder (e.g., C:\\WhisperVoice\\)")
print(" 2. Run it - it will create a 'models' folder and a 'settings.json' file")
print(" 3. The app will download the Whisper model on first transcription\n")
print("💡 TIP: Keep the .exe with its generated files for true portability!")
if __name__ == "__main__":
# Ensure we are in project root
os.chdir(Path(__file__).parent)
build_portable()


@@ -29,3 +29,6 @@ huggingface-hub>=0.20.0
pystray>=0.19.0
Pillow>=10.0.0
darkdetect>=0.8.0
# LLM / Correction
llama-cpp-python>=0.2.20


@@ -1,5 +0,0 @@
@echo off
echo [LAUNCHER] Starting Fake Blur UI (Python/Qt)...
call venv\Scripts\activate.bat
python main.py
if %errorlevel% neq 0 pause


@@ -17,6 +17,7 @@ from src.core.paths import get_base_path
 DEFAULT_SETTINGS = {
     "hotkey": "f8",
     "hotkey_translate": "f10",
+    "hotkey_correct": "f9", # New: Transcribe + Correct
     "model_size": "small",
     "input_device": None, # Device ID (int) or Name (str), None = Default
     "save_recordings": False, # Save .wav files for debugging
@@ -49,10 +50,18 @@ DEFAULT_SETTINGS = {
"condition_on_previous_text": True,
"initial_prompt": "Mm-hmm. Okay, let's go. I speak in full sentences.", # Default: Forces punctuation
+    # LLM Correction
+    "llm_enabled": False,
+    "llm_mode": "Standard", # "Grammar", "Standard", "Rewrite"
+    "llm_model_name": "llama-3.2-1b-instruct",
     # Low VRAM Mode
-    "unload_models_after_use": False # If True, models are unloaded immediately to free VRAM
+    "unload_models_after_use": False, # If True, models are unloaded immediately to free VRAM
+    # Accessibility
+    "reduce_motion": False # Disable animations for WCAG 2.3.3
 }
 class ConfigManager:
@@ -102,9 +111,9 @@ class ConfigManager:
         except Exception as e:
             logging.error(f"Failed to save settings: {e}")
-    def get(self, key: str) -> Any:
+    def get(self, key: str, default: Any = None) -> Any:
         """Get a setting value."""
-        return self.data.get(key, DEFAULT_SETTINGS.get(key))
+        return self.data.get(key, DEFAULT_SETTINGS.get(key, default))
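
Reviewer sketch (not part of this commit): the new `get` signature resolves a key in three layers: saved data, then `DEFAULT_SETTINGS`, then the caller's `default`. The standalone function `lookup` and the trimmed `DEFAULTS` dict below are illustrative stand-ins, not the real `ConfigManager`.

```python
from typing import Any, Dict

# Trimmed, illustrative subset of the defaults shown in the diff.
DEFAULTS: Dict[str, Any] = {"hotkey": "f8", "llm_enabled": False}


def lookup(data: Dict[str, Any], key: str, default: Any = None) -> Any:
    """Saved value first, then the built-in default, then the caller's default."""
    return data.get(key, DEFAULTS.get(key, default))


saved = {"hotkey": "f7"}
print(lookup(saved, "hotkey"))                # saved value wins
print(lookup(saved, "llm_enabled"))           # falls back to DEFAULTS
print(lookup(saved, "llm_model_name", "n/a")) # falls back to caller default
```

One consequence of chaining `dict.get` this way: a key saved with the value `None` is returned as `None` and never falls through to the defaults.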


@@ -1,31 +0,0 @@
@echo off
echo [DEBUG] LAUNCHER STARTED
echo [DEBUG] CWD: %CD%
echo [DEBUG] Python Path (expected relative): ..\python\python.exe
REM Read stdin to a file to verify data input (optional debugging)
REM python.exe might be in different relative path depending on where this bat is run
REM We assume this bat is in runtime/app/src/core/
REM So python is in ../../../python/python.exe
set PYTHON_EXE=..\..\..\python\python.exe
if exist "%PYTHON_EXE%" (
echo [DEBUG] Found Python at %PYTHON_EXE%
) else (
echo [ERROR] Python NOT found at %PYTHON_EXE%
echo [ERROR] Listing relative directories:
dir ..\..\..\
pause
exit /b 1
)
echo [DEBUG] Launching script: transcribe_worker.py
"%PYTHON_EXE%" transcribe_worker.py
if %ERRORLEVEL% NEQ 0 (
echo [ERROR] Python script failed with code %ERRORLEVEL%
pause
) else (
echo [SUCCESS] Script finished.
pause
)

src/core/llm_engine.py (new file, 185 lines)

@@ -0,0 +1,185 @@
"""
LLM Engine Module.
==================
Handles interaction with the local Llama 3.2 1B model for transcription correction.
Uses llama-cpp-python for efficient local inference.
"""
import os
import logging
from typing import Optional
from src.core.paths import get_models_path
from src.core.config import ConfigManager
try:
from llama_cpp import Llama
except ImportError:
Llama = None
class LLMEngine:
"""
Manages the Llama model and performs text correction/rewriting.
"""
def __init__(self):
self.config = ConfigManager()
self.model = None
self.current_model_path = None
# --- Mode 1: Grammar Only (Strict) ---
self.prompt_grammar = (
"You are a text correction tool. "
"Correct the grammar/spelling. Do not change punctuation or capitalization styles. "
"Do not remove any words (including profanity). Output ONLY the result."
"\n\nExample:\nInput: 'damn it works'\nOutput: 'damn it works'"
)
# --- Mode 2: Standard (Grammar + Punctuation + Caps) ---
self.prompt_standard = (
"You are a text correction tool. "
"Standardize the grammar, punctuation, and capitalization. "
"Do not remove any words (including profanity). Output ONLY the result."
"\n\nExample:\nInput: 'damn it works'\nOutput: 'Damn it works.'"
)
# --- Mode 3: Rewrite (Tone-Aware Polish) ---
self.prompt_rewrite = (
"You are a text rewriting tool. Improve flow/clarity but keep the exact tone and vocabulary. "
"Do not remove any words (including profanity). Output ONLY the result."
"\n\nExample:\nInput: 'damn it works'\nOutput: 'Damn, it works.'"
)
def load_model(self) -> bool:
"""
Loads the LLM model if it exists.
Returns True if successful, False otherwise.
"""
if Llama is None:
logging.error("llama-cpp-python not installed.")
return False
model_name = self.config.get("llm_model_name", "llama-3.2-1b-instruct")
model_dir = get_models_path() / "llm" / model_name
model_file = model_dir / "llama-3.2-1b-instruct-q4_k_m.gguf"
if not model_file.exists():
logging.warning(f"LLM Model not found at: {model_file}")
return False
if self.model and self.current_model_path == str(model_file):
return True
try:
logging.info(f"Loading LLM from {model_file}...")
n_gpu_layers = 0
try:
import torch
if torch.cuda.is_available():
n_gpu_layers = -1
except Exception:
pass
self.model = Llama(
model_path=str(model_file),
n_gpu_layers=n_gpu_layers,
n_ctx=2048,
verbose=False
)
self.current_model_path = str(model_file)
logging.info("LLM loaded successfully.")
return True
except Exception as e:
logging.error(f"Failed to load LLM: {e}")
self.model = None
return False
def correct_text(self, text: str, mode: str = "Standard") -> str:
"""Corrects or rewrites the provided text."""
if not text or not text.strip():
return text
if not self.model:
if not self.load_model():
return text
logging.info(f"LLM Processing ({mode}): '{text}'")
system_prompt = self.prompt_standard
if mode == "Grammar": system_prompt = self.prompt_grammar
elif mode == "Rewrite": system_prompt = self.prompt_rewrite
# PREFIX INJECTION TECHNIQUE
# We end the prompt with the start of the assistant's answer specifically phrased to force compliance.
# "Here is the processed output:" forces it into a completion mode rather than a refusal mode.
prefix_injection = "Here is the processed output:\n"
prompt = (
f"<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n{system_prompt}<|eot_id|>"
f"<|start_header_id|>user<|end_header_id|>\n\nProcess this input:\n{text}<|eot_id|>"
f"<|start_header_id|>assistant<|end_header_id|>\n\n{prefix_injection}"
)
try:
output = self.model(
prompt,
max_tokens=512,
stop=["<|eot_id|>"],
echo=False,
temperature=0.1
)
result = output['choices'][0]['text'].strip()
# 1. Fallback: If result is empty, it might have just outputted nothing because we prefilled?
# Actually llama-cpp-python usually returns the *continuation*.
# So if it outputted "My corrected text.", the full logical response is "Here is...: My corrected text."
# We just want the result.
# Refusal Detection (Safety Net)
refusal_triggers = [
"I cannot", "I can't", "I am unable", "I apologize", "sorry",
"As an AI", "explicit content", "harmful content", "safety guidelines"
]
lower_res = result.lower()
if any(trig.lower() in lower_res for trig in refusal_triggers) and len(result) < 150:
logging.warning(f"LLM Refusal Detected: '{result}'. Falling back to original.")
return text # Return original text on refusal!
# --- Robust Post-Processing ---
# 1. Strip quotes
if result.startswith('"') and result.endswith('"') and len(result) > 2 and '"' not in result[1:-1]:
result = result[1:-1]
if result.startswith("'") and result.endswith("'") and len(result) > 2 and "'" not in result[1:-1]:
result = result[1:-1]
# 2. Split by newline
if "\n" in result:
lines = result.split('\n')
clean_lines = [l.strip() for l in lines if l.strip()]
if clean_lines:
result = clean_lines[0]
# 3. Aggressive Preamble Stripping (Updates for new prefix)
import re
prefixes = [
r"^Here is the processed output:?\s*", # The one we injected
r"^Here is the corrected text:?\s*",
r"^Here is the rewritten text:?\s*",
r"^Here's the result:?\s*",
r"^Sure,? here is regex.*:?\s*",
r"^Output:?\s*",
r"^Processing result:?\s*",
]
for p in prefixes:
result = re.sub(p, "", result, flags=re.IGNORECASE).strip()
if result.startswith('"') and result.endswith('"') and len(result) > 2 and '"' not in result[1:-1]:
result = result[1:-1]
logging.info(f"LLM Result: '{result}'")
return result
except Exception as e:
logging.error(f"LLM inference failed: {e}")
return text # Fail safe logic
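
Reviewer sketch (not part of this commit): the refusal check and post-processing in `correct_text` can be read as one pure function, which also makes them testable without loading a model. The name `clean_llm_output` and the shortened trigger and preamble lists are assumptions for illustration. Triggers are stored in lowercase here so that the comparison against the lowercased reply can actually match.

```python
import re

# Shortened, lowercase trigger list (illustrative subset).
REFUSAL_TRIGGERS = ("i cannot", "i can't", "i am unable", "i apologize", "sorry", "as an ai")
# Shortened preamble list (illustrative subset).
PREAMBLES = (r"^here is the processed output:?\s*", r"^output:?\s*")


def clean_llm_output(raw: str, original: str) -> str:
    """Return a cleaned single-line result, or `original` on a likely refusal."""
    result = raw.strip()
    lowered = result.lower()
    # Short replies containing a refusal phrase fall back to the original text.
    if len(result) < 150 and any(trigger in lowered for trigger in REFUSAL_TRIGGERS):
        return original
    # Keep only the first non-empty line.
    lines = [line.strip() for line in result.splitlines() if line.strip()]
    if lines:
        result = lines[0]
    # Strip known preambles, case-insensitively.
    for pattern in PREAMBLES:
        result = re.sub(pattern, "", result, flags=re.IGNORECASE).strip()
    # Drop one pair of wrapping double quotes when none appear inside.
    if len(result) > 2 and result[0] == result[-1] == '"' and '"' not in result[1:-1]:
        result = result[1:-1]
    return result


print(clean_llm_output('Here is the processed output: "Damn, it works."', "damn it works"))
print(clean_llm_output("I cannot help with that.", "damn it works"))
```

Substring triggers such as "sorry" will also match legitimate short transcripts that contain the word, so the fallback trades some corrections for safety.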


@@ -21,7 +21,7 @@ except ImportError:
torch = None
# Import directly - valid since we are now running in the full environment
from faster_whisper import WhisperModel
class WhisperTranscriber:
"""
@@ -62,6 +62,8 @@ class WhisperTranscriber:
# Force offline if path exists to avoid HF errors
local_only = new_path.exists()
try:
from faster_whisper import WhisperModel
self.model = WhisperModel(
model_input,
device=device,
@@ -69,6 +71,23 @@ class WhisperTranscriber:
download_root=str(get_models_path()),
local_files_only=local_only
)
except Exception as load_err:
# CRITICAL FALLBACK: If CUDA/cublas fails (AMD/Intel users), fallback to CPU
err_str = str(load_err).lower()
if "cublas" in err_str or "cudnn" in err_str or "library" in err_str or "device" in err_str:
logging.warning(f"CUDA Init Failed ({load_err}). Falling back to CPU...")
self.config.set("compute_device", "cpu") # Update config for persistence/UI
self.current_compute_device = "cpu"
self.model = WhisperModel(
model_input,
device="cpu",
compute_type="int8", # CPU usually handles int8 well with newer extensions, or standard
download_root=str(get_models_path()),
local_files_only=local_only
)
else:
raise load_err
self.current_model_size = size
self.current_compute_device = device
@@ -79,6 +98,32 @@ class WhisperTranscriber:
logging.error(f"Failed to load model: {e}")
self.model = None
# Auto-Repair: Detect vocabulary/corrupt errors
err_str = str(e).lower()
if "vocabulary" in err_str or "tokenizer" in err_str or "config.json" in err_str:
# ... existing auto-repair logic ...
logging.warning("Corrupt model detected on load. Attempting to delete and reset...")
try:
import shutil
# Differentiate between simple path and HF path
new_path = get_models_path() / f"faster-whisper-{size}"
if new_path.exists():
shutil.rmtree(new_path)
logging.info(f"Deleted corrupt model at {new_path}")
else:
# Try legacy HF path
hf_path = get_models_path() / f"models--Systran--faster-whisper-{size}"
if hf_path.exists():
shutil.rmtree(hf_path)
logging.info(f"Deleted corrupt HF model at {hf_path}")
# Notify UI to refresh state (will show 'Download' button now)
# We can't reach bridge easily here without passing it in,
# but the UI polls or listens to logs.
# The user will simply see "Model Missing" in settings after this.
except Exception as del_err:
logging.error(f"Failed to delete corrupt model: {del_err}")
def transcribe(self, audio_data, is_file: bool = False, task: Optional[str] = None) -> str:
"""
Transcribe audio data.
@@ -89,7 +134,7 @@ class WhisperTranscriber:
         if not self.model:
             self.load_model()
         if not self.model:
-            return "Error: Model failed to load."
+            return "Error: Model failed to load. Please check Settings -> Model Info."
         try:
             # Config
@@ -174,7 +219,10 @@ class WhisperTranscriber:
def model_exists(self, size: str) -> bool:
"""Checks if a model size is already downloaded."""
new_path = get_models_path() / f"faster-whisper-{size}"
if (new_path / "config.json").exists():
if new_path.exists():
# Strict check
required = ["config.json", "model.bin", "vocabulary.json"]
if all((new_path / f).exists() for f in required):
return True
# Legacy HF cache check
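
Reviewer sketch (not part of this commit): the "strict check" added to `model_exists` treats a model folder as downloaded only when every required file is present, so a partially written folder no longer counts. The helper name `model_dir_is_complete` is hypothetical, and whether every model variant ships exactly these three files is not verified here.

```python
import tempfile
from pathlib import Path

# File names taken from the diff's `required` list.
REQUIRED_FILES = ("config.json", "model.bin", "vocabulary.json")


def model_dir_is_complete(model_dir: Path) -> bool:
    """True only if the folder exists and contains every required file."""
    return model_dir.is_dir() and all((model_dir / name).exists() for name in REQUIRED_FILES)


with tempfile.TemporaryDirectory() as tmp:
    model_dir = Path(tmp) / "faster-whisper-small"
    model_dir.mkdir()
    # Simulate an interrupted download: only config.json was written.
    (model_dir / "config.json").write_text("{}")
    partial = model_dir_is_complete(model_dir)
    # Finish the "download" with empty placeholder files.
    for name in ("model.bin", "vocabulary.json"):
        (model_dir / name).write_text("")
    complete = model_dir_is_complete(model_dir)

print(partial, complete)
```

The same predicate could back both `model_exists` and the bridge's `isModelDownloaded`, which currently duplicate the list.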


@@ -110,6 +110,8 @@ class UIBridge(QObject):
logAppended = Signal(str) # Emits new log line
settingChanged = Signal(str, 'QVariant')
modelStatesChanged = Signal() # Notify UI to re-check isModelDownloaded
llmDownloadRequested = Signal()
reduceMotionChanged = Signal(bool)
def __init__(self, parent=None):
super().__init__(parent)
@@ -129,6 +131,7 @@ class UIBridge(QObject):
self._app_vram_mb = 0.0
self._app_vram_percent = 0.0
self._is_destroyed = False
self._reduce_motion = bool(ConfigManager().get("reduce_motion"))
# Start QThread Stats Worker
self.stats_worker = StatsWorker()
@@ -276,6 +279,8 @@ class UIBridge(QObject):
ConfigManager().set(key, value)
if key == "ui_scale":
self.uiScale = float(value)
if key == "reduce_motion":
self.reduceMotion = bool(value)
self.settingChanged.emit(key, value) # Notify listeners (e.g. Overlay)
@Property(float, notify=uiScaleChanged)
@@ -287,6 +292,15 @@ class UIBridge(QObject):
self._ui_scale = val
self.uiScaleChanged.emit(val)
@Property(bool, notify=reduceMotionChanged)
def reduceMotion(self): return self._reduce_motion
@reduceMotion.setter
def reduceMotion(self, val):
if self._reduce_motion != val:
self._reduce_motion = val
self.reduceMotionChanged.emit(val)
@Property(float, notify=appCpuChanged)
def appCpu(self): return self._app_cpu
@@ -356,11 +370,7 @@ class UIBridge(QObject):
except Exception as e:
logging.error(f"Failed to preload audio devices: {e}")
@Slot()
def toggle_recording(self):
"""Called by UI elements to trigger the app's recording logic."""
# This will be connected to the main app's toggle logic
pass
@Property(bool, notify=isDownloadingChanged)
def isDownloading(self): return self._is_downloading
@@ -381,7 +391,10 @@ class UIBridge(QObject):
# Check new simple format used by DownloadWorker
path_simple = get_models_path() / f"faster-whisper-{size}"
if path_simple.exists() and any(path_simple.iterdir()):
if path_simple.exists():
# Strict check: Ensure all critical files exist
required = ["config.json", "model.bin", "vocabulary.json"]
if all((path_simple / f).exists() for f in required):
return True
# Check HF Cache format (legacy/default)
@@ -389,16 +402,22 @@ class UIBridge(QObject):
path_hf = get_models_path() / folder_name
snapshots = path_hf / "snapshots"
if snapshots.exists() and any(snapshots.iterdir()):
return True
return True # Legacy cache structure is complex, assume valid if present
# Check direct folder (simple)
path_direct = get_models_path() / size
if (path_direct / "config.json").exists():
return True
return False
except Exception as e:
logging.error(f"Error checking model status: {e}")
return False
@Slot(result=bool)
def isLLMModelDownloaded(self):
try:
from src.core.paths import get_models_path
# Hardcoded check for the 1B model we support
model_file = get_models_path() / "llm" / "llama-3.2-1b-instruct" / "llama-3.2-1b-instruct-q4_k_m.gguf"
return model_file.exists()
except Exception:
return False
@Slot(str)
@@ -408,3 +427,7 @@ class UIBridge(QObject):
@Slot()
def notifyModelStatesChanged(self):
self.modelStatesChanged.emit()
@Slot()
def downloadLLM(self):
self.llmDownloadRequested.emit()
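
Reviewer sketch (not part of this commit): the `reduceMotion` setter follows a guard-then-notify pattern, emitting `reduceMotionChanged` only when the value actually changes. The class below reproduces that pattern in plain Python without Qt; `ReduceMotionState` and its `connect` method are stand-ins, not the `UIBridge` API.

```python
from typing import Callable, List


class ReduceMotionState:
    """Plain-Python stand-in for the Qt property; not the UIBridge class."""

    def __init__(self, initial: bool = False) -> None:
        self._value = initial
        self._listeners: List[Callable[[bool], None]] = []

    def connect(self, listener: Callable[[bool], None]) -> None:
        """Register a callback, playing the role of a signal connection."""
        self._listeners.append(listener)

    @property
    def value(self) -> bool:
        return self._value

    @value.setter
    def value(self, new_value: bool) -> None:
        # Notify only on a real change, so bindings are not re-evaluated needlessly.
        if self._value != new_value:
            self._value = new_value
            for listener in self._listeners:
                listener(new_value)


events: List[bool] = []
state = ReduceMotionState()
state.connect(events.append)
state.value = True
state.value = True   # unchanged: no notification
state.value = False
print(events)
```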


@@ -1,210 +0,0 @@
"""
Modern Components Library.
==========================
Contains custom-painted widgets that move beyond the standard 'amateur' Qt look.
Implements smooth animations, hardware acceleration, and glassmorphism.
"""
from PySide6.QtWidgets import (
QPushButton, QWidget, QVBoxLayout, QHBoxLayout,
QLabel, QGraphicsDropShadowEffect, QFrame, QAbstractButton
)
from PySide6.QtCore import Qt, QPropertyAnimation, QEasingCurve, Property, QRect, QPoint, Signal, Slot
from PySide6.QtGui import QPainter, QColor, QBrush, QPen, QLinearGradient, QFont
from src.ui.styles import Theme
class GlassButton(QPushButton):
"""A premium button with gradient hover effects and smooth scaling."""
def __init__(self, text, parent=None, accent_color=Theme.ACCENT_CYAN):
super().__init__(text, parent)
self.accent = QColor(accent_color)
self.setCursor(Qt.PointingHandCursor)
self.setFixedHeight(40)
self._hover_opacity = 0.0
self.setStyleSheet(f"""
QPushButton {{
background-color: rgba(255, 255, 255, 0.05);
border: 1px solid {Theme.BORDER_SUBTLE};
color: {Theme.TEXT_SECONDARY};
border-radius: 8px;
padding: 0 20px;
font-size: 13px;
font-weight: 600;
}}
""")
# Hover Animation
self.anim = QPropertyAnimation(self, b"hover_opacity")
self.anim.setDuration(200)
self.anim.setStartValue(0.0)
self.anim.setEndValue(1.0)
self.anim.setEasingCurve(QEasingCurve.OutCubic)
@Property(float)
def hover_opacity(self): return self._hover_opacity
@hover_opacity.setter
def hover_opacity(self, value):
self._hover_opacity = value
self.update()
def enterEvent(self, event):
self.anim.setDirection(QPropertyAnimation.Forward)
self.anim.start()
super().enterEvent(event)
def leaveEvent(self, event):
self.anim.setDirection(QPropertyAnimation.Backward)
self.anim.start()
super().leaveEvent(event)
def paintEvent(self, event):
"""Custom paint for the glow effect."""
super().paintEvent(event)
if self._hover_opacity > 0:
painter = QPainter(self)
painter.setRenderHint(QPainter.Antialiasing)
# Subtle Glow Border
color = QColor(self.accent)
color.setAlphaF(self._hover_opacity * 0.5)
painter.setPen(QPen(color, 1.5))
painter.setBrush(Qt.NoBrush)
painter.drawRoundedRect(self.rect().adjusted(1,1,-1,-1), 8, 8)
# Text Glow color shift
self.setStyleSheet(f"""
QPushButton {{
background-color: rgba(255, 255, 255, {0.05 + (self._hover_opacity * 0.05)});
border: 1px solid {Theme.BORDER_SUBTLE};
color: white;
border-radius: 8px;
padding: 0 20px;
font-size: 13px;
font-weight: 600;
}}
""")
class ModernSwitch(QAbstractButton):
"""A sleek iOS-style toggle switch."""
def __init__(self, parent=None, active_color=Theme.ACCENT_GREEN):
super().__init__(parent)
self.setCheckable(True)
self.setFixedSize(44, 24)
self._thumb_pos = 3.0
self.active_color = QColor(active_color)
self.anim = QPropertyAnimation(self, b"thumb_pos")
self.anim.setDuration(200)
self.anim.setEasingCurve(QEasingCurve.InOutCubic)
@Property(float)
def thumb_pos(self): return self._thumb_pos
@thumb_pos.setter
def thumb_pos(self, value):
self._thumb_pos = value
self.update()
def nextCheckState(self):
super().nextCheckState()
self.anim.stop()
if self.isChecked():
self.anim.setEndValue(23.0)
else:
self.anim.setEndValue(3.0)
self.anim.start()
def paintEvent(self, event):
painter = QPainter(self)
painter.setRenderHint(QPainter.Antialiasing)
# Background
bg_color = QColor("#2d2d3d")
if self.isChecked():
bg_color = self.active_color
painter.setBrush(bg_color)
painter.setPen(Qt.NoPen)
painter.drawRoundedRect(self.rect(), 12, 12)
# Thumb
painter.setBrush(Qt.white)
painter.drawEllipse(QPoint(self._thumb_pos + 9, 12), 9, 9)
class ModernFrame(QFrame):
"""A base frame with rounded corners and a shadow."""
def __init__(self, parent=None):
super().__init__(parent)
self.setObjectName("premiumFrame")
self.setStyleSheet(f"""
#premiumFrame {{
background-color: {Theme.BG_CARD};
border: 1px solid {Theme.BORDER_SUBTLE};
border-radius: 12px;
}}
""")
self.shadow = QGraphicsDropShadowEffect(self)
self.shadow.setBlurRadius(25)
self.shadow.setXOffset(0)
self.shadow.setYOffset(8)
self.shadow.setColor(QColor(0, 0, 0, 180))
self.setGraphicsEffect(self.shadow)
from PySide6.QtWidgets import (
QPushButton, QWidget, QVBoxLayout, QHBoxLayout,
QLabel, QGraphicsDropShadowEffect, QFrame, QAbstractButton, QSlider
)
class ModernSlider(QSlider):
"""A custom painted modern slider with a glowing knob."""
def __init__(self, orientation=Qt.Horizontal, parent=None):
super().__init__(orientation, parent)
self.setStyleSheet(f"""
QSlider::groove:horizontal {{
border: 1px solid {Theme.BG_DARK};
height: 4px;
background: {Theme.BG_DARK};
margin: 2px 0;
border-radius: 2px;
}}
QSlider::handle:horizontal {{
background: {Theme.ACCENT_CYAN};
border: 2px solid white;
width: 16px;
height: 16px;
margin: -7px 0;
border-radius: 8px;
}}
QSlider::add-page:horizontal {{
background: {Theme.BG_DARK};
}}
QSlider::sub-page:horizontal {{
background: {Theme.ACCENT_CYAN};
border-radius: 2px;
}}
""")
class FramelessWindow(QWidget):
"""Base class for all premium windows to handle dragging and frameless logic."""
def __init__(self, parent=None):
super().__init__(parent)
self.setWindowFlags(Qt.FramelessWindowHint | Qt.WindowStaysOnTopHint | Qt.NoDropShadowWindowHint)
self.setAttribute(Qt.WA_TranslucentBackground)
self._drag_pos = None
def mousePressEvent(self, event):
if event.button() == Qt.LeftButton:
self._drag_pos = event.globalPosition().toPoint() - self.frameGeometry().topLeft()
event.accept()
def mouseMoveEvent(self, event):
if event.buttons() & Qt.LeftButton:
self.move(event.globalPosition().toPoint() - self._drag_pos)
event.accept()


@@ -1,109 +0,0 @@
"""
Loader Widget Module.
=====================
Handles the application initialization and model checks.
Refactored for 2026 Premium Aesthetics.
"""
from PySide6.QtWidgets import QWidget, QVBoxLayout, QLabel, QProgressBar
from PySide6.QtCore import Qt, QThread, Signal
from PySide6.QtGui import QFont
import os
import logging
from faster_whisper import download_model
from src.core.paths import get_models_path
from src.ui.styles import Theme, StyleGenerator, load_modern_fonts
from src.ui.components import FramelessWindow, ModernFrame
class DownloadWorker(QThread):
"""Background worker for model downloads."""
progress = Signal(str, int)
download_finished = Signal()
error = Signal(str)
def run(self):
try:
model_path = get_models_path()
self.progress.emit("Verifying AI Core...", 10)
os.environ["HF_HOME"] = str(model_path)
self.progress.emit("Downloading Model...", 30)
download_model("small", output_dir=str(model_path))
self.progress.emit("System Ready!", 100)
self.download_finished.emit()
except Exception as e:
logging.error(f"Loader failed: {e}")
self.error.emit(str(e))
class LoaderWidget(FramelessWindow):
"""
Premium bootstrapper UI.
Inherits from FramelessWindow for rounded glass look.
"""
ready_signal = Signal()
def __init__(self):
super().__init__()
self.setFixedSize(400, 180)
# Main Layout
self.root = QVBoxLayout(self)
self.root.setContentsMargins(10, 10, 10, 10)
# Glass Card
self.card = ModernFrame()
self.card.setStyleSheet(StyleGenerator.get_glass_card(radius=20))
self.root.addWidget(self.card)
# Content Layout
self.layout = QVBoxLayout(self.card)
self.layout.setContentsMargins(30,30,30,30)
self.layout.setSpacing(15)
# App Title/Brand
self.brand = QLabel("WHISPER VOICE")
self.brand.setFont(load_modern_fonts())
self.brand.setStyleSheet(f"color: {Theme.ACCENT_CYAN}; font-weight: 900; letter-spacing: 4px; font-size: 14px;")
self.brand.setAlignment(Qt.AlignCenter)
self.layout.addWidget(self.brand)
# Status Label
self.status_label = QLabel("INITIALIZING...")
self.status_label.setStyleSheet(f"color: {Theme.TEXT_SECONDARY}; font-weight: 600; font-size: 11px;")
self.status_label.setAlignment(Qt.AlignCenter)
self.layout.addWidget(self.status_label)
# Progress Bar (Modern Slim style)
self.progress_bar = QProgressBar()
self.progress_bar.setFixedHeight(4)
self.progress_bar.setStyleSheet(f"""
QProgressBar {{
background-color: {Theme.BG_DARK};
border-radius: 2px;
border: none;
text-align: center;
color: transparent;
}}
QProgressBar::chunk {{
background-color: {Theme.ACCENT_CYAN};
border-radius: 2px;
}}
""")
self.layout.addWidget(self.progress_bar)
# Start Worker
self.worker = DownloadWorker()
self.worker.progress.connect(self.update_progress)
self.worker.download_finished.connect(self.on_finished)
self.worker.start()
def update_progress(self, text: str, percent: int):
self.status_label.setText(text.upper())
self.progress_bar.setValue(percent)
def on_finished(self):
self.ready_signal.emit()
self.close()


@@ -1,105 +0,0 @@
"""
Overlay Window Module.
======================
Premium High-Fidelity Overlay for Whisper Voice.
Features glassmorphism, pulsating status indicators, and smart positioning.
"""
from PySide6.QtWidgets import QWidget, QVBoxLayout, QHBoxLayout, QLabel
from PySide6.QtCore import Qt, Slot, QPoint, QPropertyAnimation, QEasingCurve
from PySide6.QtGui import QColor, QFont, QGuiApplication
from src.ui.visualizer import AudioVisualizer
from src.ui.styles import Theme, StyleGenerator, load_modern_fonts
from src.ui.components import FramelessWindow, ModernFrame
class OverlayWindow(FramelessWindow):
"""
The main transparent overlay (The Pill).
Refactored for 2026 Premium Aesthetics.
"""
def __init__(self):
super().__init__()
        self.setFixedSize(320, 95)
        # Main Layout
        self.master_layout = QVBoxLayout(self)
        self.master_layout.setContentsMargins(10, 10, 10, 10)
        # The Glass Pill Container
        self.pill = ModernFrame()
        self.pill.setStyleSheet(StyleGenerator.get_glass_card(radius=24))
        self.master_layout.addWidget(self.pill)
        # Layout inside the pill
        self.layout = QHBoxLayout(self.pill)
        self.layout.setContentsMargins(20, 10, 20, 10)
        self.layout.setSpacing(15)
        # Status Visualization (Left Dot)
        self.status_dot = QWidget()
        self.status_dot.setFixedSize(14, 14)
        self.status_dot.setStyleSheet(f"background-color: {Theme.ACCENT_CYAN}; border-radius: 7px; border: 2px solid white;")
        self.layout.addWidget(self.status_dot)
        # Text/Visualizer Stack
        self.content_stack = QVBoxLayout()
        self.content_stack.setSpacing(2)
        self.content_stack.setContentsMargins(0, 0, 0, 0)
        self.status_label = QLabel("READY")
        self.status_label.setFont(load_modern_fonts())
        self.status_label.setStyleSheet(f"color: white; font-weight: 800; font-size: 11px; letter-spacing: 2px;")
        self.content_stack.addWidget(self.status_label)
        self.visualizer = AudioVisualizer()
        self.visualizer.setFixedHeight(30)
        self.content_stack.addWidget(self.visualizer)
        self.layout.addLayout(self.content_stack)
        # Animations
        self.pulse_timer = None  # Use style-based pulsing to avoid window flags issues
        # Initial State
        self.hide()
        self.first_show = True

    def showEvent(self, event):
        """Handle positioning and config updates."""
        from src.core.config import ConfigManager
        config = ConfigManager()
        self.setWindowOpacity(config.get("opacity"))
        if self.first_show:
            self.center_above_taskbar()
            self.first_show = False
        super().showEvent(event)

    def center_above_taskbar(self):
        screen = QGuiApplication.primaryScreen()
        if not screen:
            return
        avail_rect = screen.availableGeometry()
        x = avail_rect.x() + (avail_rect.width() - self.width()) // 2
        y = avail_rect.bottom() - self.height() - 15
        self.move(x, y)

    @Slot(str)
    def update_status(self, text: str):
        """Updates the status text and visual indicator."""
        self.status_label.setText(text.upper())
        if "RECORDING" in text.upper():
            color = Theme.ACCENT_GREEN
        elif "THINKING" in text.upper():
            color = Theme.ACCENT_PURPLE
        else:
            color = Theme.ACCENT_CYAN
        self.status_dot.setStyleSheet(f"background-color: {color}; border-radius: 7px; border: 2px solid white;")

    @Slot(float)
    def update_visualizer(self, amp: float):
        self.visualizer.set_amplitude(amp)

View File

@@ -6,12 +6,15 @@ Button {
text: "Button"
property color accentColor: "#00f2ff"
Accessible.role: Accessible.Button
Accessible.name: control.text
activeFocusOnTab: true
contentItem: Text {
text: control.text
font.pixelSize: 13
font.bold: true
color: control.hovered ? "white" : "#9499b0"
color: control.hovered ? "white" : "#ABABAB"
horizontalAlignment: Text.AlignHCenter
verticalAlignment: Text.AlignVCenter
elide: Text.ElideRight
@@ -25,8 +28,8 @@ Button {
opacity: control.down ? 0.7 : 1.0
color: control.hovered ? Qt.rgba(1, 1, 1, 0.1) : Qt.rgba(1, 1, 1, 0.05)
radius: 8
border.color: control.hovered ? control.accentColor : Qt.rgba(1, 1, 1, 0.1)
border.width: 1
border.color: control.hovered ? control.accentColor : SettingsStyle.borderSubtle
border.width: control.activeFocus ? SettingsStyle.focusRingWidth : 1
Behavior on border.color { ColorAnimation { duration: 200 } }
Behavior on color { ColorAnimation { duration: 200 } }

Binary file not shown.

View File

@@ -14,6 +14,8 @@ ApplicationWindow {
visible: true
flags: Qt.FramelessWindowHint | Qt.WindowStaysOnTopHint | Qt.Tool
color: "transparent"
title: "WhisperVoice"
Accessible.name: "WhisperVoice Loading"
Rectangle {
id: bgRect
@@ -21,7 +23,7 @@ ApplicationWindow {
anchors.margins: 20 // Space for shadow
radius: 16
color: "#1a1a20"
border.color: "#40ffffff"
border.color: Qt.rgba(1, 1, 1, 0.22)
border.width: 1
// --- SHADOW & GLOW ---
@@ -55,6 +57,7 @@ ApplicationWindow {
// Pulse Animation
SequentialAnimation on scale {
running: ui ? !ui.reduceMotion : true
loops: Animation.Infinite
NumberAnimation { from: 1.0; to: 1.1; duration: 1000; easing.type: Easing.InOutSine }
NumberAnimation { from: 1.1; to: 1.0; duration: 1000; easing.type: Easing.InOutSine }
@@ -95,7 +98,7 @@ ApplicationWindow {
Text {
text: "AI TRANSCRIPTION ENGINE"
color: "#80ffffff"
color: "#ABABAB"
font.family: jetBrainsMono.name
font.pixelSize: 10
font.letterSpacing: 2
@@ -135,6 +138,7 @@ ApplicationWindow {
// Shimmer effect on bar
Rectangle {
width: 20; height: parent.height
visible: ui ? !ui.reduceMotion : true
color: "#80ffffff"
x: -width
opacity: 0.5
@@ -157,8 +161,10 @@ ApplicationWindow {
font.family: jetBrainsMono.name
font.pixelSize: 11
font.bold: true
Accessible.role: Accessible.AlertMessage
Accessible.name: "Loading status: " + text
anchors.horizontalCenter: parent.horizontalCenter
opacity: 0.8
opacity: 1.0
}
}
}
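
Several hunks in this file replace 8-digit hex colors such as `#40ffffff` and `#80ffffff` with explicit values. QML reads an 8-digit hex string as `#AARRGGBB`, so the leading byte is the alpha channel. The following is a minimal Python sketch for comparing the old literals with `Qt.rgba(...)` calls; the helper name `argb_hex_to_rgba` is hypothetical and not part of this repository.

```python
def argb_hex_to_rgba(value: str) -> tuple[float, float, float, float]:
    """Convert a QML-style '#AARRGGBB' or '#RRGGBB' string to (r, g, b, a) in 0..1.

    Hypothetical helper for illustration only; it mirrors the argument
    order of Qt.rgba(r, g, b, a).
    """
    digits = value.lstrip("#")
    if len(digits) == 6:
        # No alpha byte given: treat the color as fully opaque.
        digits = "ff" + digits
    if len(digits) != 8:
        raise ValueError(f"expected #RRGGBB or #AARRGGBB, got {value!r}")
    # Split into four bytes (alpha first) and scale each to the 0..1 range.
    a, r, g, b = (int(digits[i:i + 2], 16) / 255 for i in range(0, 8, 2))
    return (r, g, b, a)


if __name__ == "__main__":
    for literal in ("#40ffffff", "#80ffffff"):
        r, g, b, a = argb_hex_to_rgba(literal)
        print(f"{literal} -> Qt.rgba({r:.2f}, {g:.2f}, {b:.2f}, {a:.2f})")
```

Under this reading, the alpha byte `0x40` is 64/255, or about 0.25, and `0x80` is 128/255, or about 0.50.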

View File

@@ -10,6 +10,9 @@ ComboBox {
property color bgColor: "#1a1a20"
property color popupColor: "#252530"
Accessible.role: Accessible.ComboBox
Accessible.name: control.displayText
delegate: ItemDelegate {
id: delegate
width: control.width
@@ -68,7 +71,7 @@ ComboBox {
context.lineTo(width, 0);
context.lineTo(width / 2, height);
context.closePath();
context.fillStyle = control.pressed ? control.accentColor : "#888888";
context.fillStyle = control.pressed ? control.accentColor : "#ABABAB";
context.fill();
}
}
@@ -89,8 +92,8 @@ ComboBox {
implicitWidth: 140
implicitHeight: 40
color: control.bgColor
border.color: control.pressed || control.activeFocus ? control.accentColor : "#40ffffff"
border.width: 1
border.color: control.pressed || control.activeFocus ? control.accentColor : SettingsStyle.borderSubtle
border.width: control.activeFocus ? SettingsStyle.focusRingWidth : 1
radius: 6
// Glow effect on focus (Simplified to just border for stability)
@@ -114,7 +117,7 @@ ComboBox {
background: Rectangle {
color: control.popupColor
border.color: "#40ffffff"
border.color: SettingsStyle.borderSubtle
border.width: 1
radius: 6
}

View File

@@ -7,8 +7,11 @@ Rectangle {
implicitHeight: 32
color: "#1a1a20"
radius: 6
border.width: 1
border.color: activeFocus || recording ? SettingsStyle.accent : "#40ffffff"
activeFocusOnTab: true
Accessible.role: Accessible.Button
Accessible.name: control.currentSequence ? "Hotkey: " + control.currentSequence + ". Click to change" : "No hotkey set. Click to record"
border.width: (activeFocus || recording) ? SettingsStyle.focusRingWidth : 1
border.color: activeFocus || recording ? SettingsStyle.accent : SettingsStyle.borderSubtle
property string currentSequence: ""
signal sequenceChanged(string seq)
@@ -26,7 +29,7 @@ Rectangle {
Text {
anchors.centerIn: parent
text: control.recording ? "Listening..." : (formatSequence(control.currentSequence) || "None")
color: control.recording ? SettingsStyle.accent : (control.currentSequence ? "#ffffff" : "#808080")
color: control.recording ? SettingsStyle.accent : (control.currentSequence ? "#ffffff" : "#ABABAB")
font.family: "JetBrains Mono"
font.pixelSize: 13
font.bold: true

View File

@@ -18,6 +18,8 @@ Rectangle {
property string description: ""
property alias control: controlContainer.data
property bool showSeparator: true
Accessible.name: root.label
Accessible.role: Accessible.Row
Behavior on color { ColorAnimation { duration: 150 } }

View File

@@ -9,6 +9,8 @@ ColumnLayout {
default property alias content: contentColumn.data
property string title: ""
Accessible.name: root.title + " settings group"
Accessible.role: Accessible.Grouping
// Section Header
Text {

View File

@@ -5,30 +5,49 @@ import QtQuick.Effects
Slider {
id: control
Accessible.role: Accessible.Slider
Accessible.name: control.value.toString()
activeFocusOnTab: true
background: Rectangle {
x: control.leftPadding
y: control.topPadding + control.availableHeight / 2 - height / 2
implicitWidth: 200
implicitHeight: 4
implicitHeight: 6
width: control.availableWidth
height: implicitHeight
radius: 2
radius: 3
color: "#2d2d3d"
Rectangle {
width: control.visualPosition * parent.width
height: parent.height
color: SettingsStyle.accent
radius: 2
radius: 3
}
}
handle: Rectangle {
handle: Item {
x: control.leftPadding + control.visualPosition * (control.availableWidth - width)
y: control.topPadding + control.availableHeight / 2 - height / 2
implicitWidth: 18
implicitHeight: 18
radius: 9
implicitWidth: SettingsStyle.minTargetSize
implicitHeight: SettingsStyle.minTargetSize
// Focus ring
Rectangle {
anchors.centerIn: parent
width: parent.width + SettingsStyle.focusRingWidth * 2 + 2
height: width
radius: width / 2
color: "transparent"
border.width: SettingsStyle.focusRingWidth
border.color: SettingsStyle.accent
visible: control.activeFocus
}
Rectangle {
anchors.fill: parent
radius: width / 2
color: "white"
border.color: SettingsStyle.accent
border.width: 2
@@ -41,7 +60,9 @@ Slider {
shadowColor: SettingsStyle.accent
}
}
// Value Readout (Left side to avoid clipping on right edge)
}
// Value Readout
Text {
anchors.right: parent.left
anchors.rightMargin: 12

View File

@@ -4,6 +4,10 @@ import QtQuick.Controls
Switch {
id: control
Accessible.role: Accessible.CheckBox
Accessible.name: control.text + (control.checked ? " on" : " off")
activeFocusOnTab: true
indicator: Rectangle {
implicitWidth: 44
implicitHeight: 24
@@ -11,9 +15,11 @@ Switch {
y: parent.height / 2 - height / 2
radius: 12
color: control.checked ? SettingsStyle.accent : "#2d2d3d"
border.color: control.checked ? SettingsStyle.accent : "#3d3d4d"
border.color: control.checked ? SettingsStyle.accent : SettingsStyle.borderSubtle
border.width: control.activeFocus ? SettingsStyle.focusRingWidth : 1
Behavior on color { ColorAnimation { duration: 200 } }
Behavior on border.color { ColorAnimation { duration: 200 } }
Rectangle {
x: control.checked ? parent.width - width - 3 : 3
@@ -26,6 +32,15 @@ Switch {
Behavior on x {
NumberAnimation { duration: 200; easing.type: Easing.InOutQuad }
}
// I/O pip marks for non-color state indication
Text {
anchors.centerIn: parent
text: control.checked ? "I" : "O"
font.pixelSize: 9
font.bold: true
color: control.checked ? SettingsStyle.accent : "#666666"
}
}
}

View File

@@ -7,7 +7,10 @@ TextField {
property color accentColor: "#00f2ff"
property color bgColor: "#1a1a20"
placeholderTextColor: "#606060"
Accessible.role: Accessible.EditableText
Accessible.name: control.placeholderText || "Text input"
placeholderTextColor: SettingsStyle.textDisabled
color: "#ffffff"
font.family: "JetBrains Mono"
font.pixelSize: 14
@@ -18,8 +21,8 @@ TextField {
implicitWidth: 200
implicitHeight: 40
color: control.bgColor
border.color: control.activeFocus ? control.accentColor : "#40ffffff"
border.width: 1
border.color: control.activeFocus ? control.accentColor : SettingsStyle.borderSubtle
border.width: control.activeFocus ? SettingsStyle.focusRingWidth : 1
radius: 6
Behavior on border.color { ColorAnimation { duration: 150 } }

View File

@@ -13,6 +13,8 @@ ApplicationWindow {
visible: true
flags: Qt.FramelessWindowHint | Qt.WindowStaysOnTopHint | Qt.Tool
color: "transparent"
title: "WhisperVoice"
Accessible.name: "WhisperVoice Overlay"
FontLoader {
id: jetBrainsMono
@@ -35,7 +37,7 @@ ApplicationWindow {
property bool isActive: ui.isRecording || ui.isProcessing
SequentialAnimation {
running: true
running: !ui.reduceMotion
loops: Animation.Infinite
PauseAnimation { duration: 3000 }
NumberAnimation {
@@ -96,6 +98,7 @@ ApplicationWindow {
ShaderEffect {
anchors.fill: parent
opacity: 0.4
visible: !ui.reduceMotion
property real time: 0
fragmentShader: "gradient_blobs.qsb"
NumberAnimation on time { from: 0; to: 1000; duration: 100000; loops: Animation.Infinite }
@@ -105,6 +108,7 @@ ApplicationWindow {
ShaderEffect {
anchors.fill: parent
opacity: 0.04
visible: !ui.reduceMotion
property real time: 0
property real intensity: ui.amplitude
fragmentShader: "glow.qsb"
@@ -115,6 +119,7 @@ ApplicationWindow {
ParticleSystem {
id: particles
anchors.fill: parent
running: !ui.reduceMotion
ItemParticle {
system: particles
delegate: Rectangle { width: 2; height: 2; radius: 1; color: "#10ffffff" }
@@ -143,6 +148,7 @@ ApplicationWindow {
// F. CRT Shader Effect (Overlay on chassis ONLY)
ShaderEffect {
anchors.fill: parent
visible: !ui.reduceMotion
property real time: 0
fragmentShader: "crt.qsb"
NumberAnimation on time { from: 0; to: 100; duration: 5000; loops: Animation.Infinite }
@@ -172,7 +178,7 @@ ApplicationWindow {
radius: height / 2
color: "transparent"
border.width: 1
border.color: "#40ffffff"
border.color: Qt.rgba(1, 1, 1, 0.22)
MouseArea {
anchors.fill: parent; hoverEnabled: true
@@ -194,7 +200,7 @@ ApplicationWindow {
NumberAnimation { duration: 150; easing.type: Easing.OutCubic }
}
SequentialAnimation on border.color {
running: ui.isRecording
running: ui.isRecording && !ui.reduceMotion
loops: Animation.Infinite
ColorAnimation { from: "#A0ff4b4b"; to: "#C0ff6b6b"; duration: 800 }
ColorAnimation { from: "#C0ff6b6b"; to: "#A0ff4b4b"; duration: 800 }
@@ -209,6 +215,11 @@ ApplicationWindow {
anchors.left: parent.left
anchors.leftMargin: 10
anchors.verticalCenter: parent.verticalCenter
activeFocusOnTab: true
Accessible.name: ui.isRecording ? "Stop recording" : "Start recording"
Accessible.role: Accessible.Button
Keys.onReturnPressed: ui.toggleRecordingRequested()
Keys.onSpacePressed: ui.toggleRecordingRequested()
// Make entire button scale with amplitude
scale: ui.isRecording ? (1.0 + ui.amplitude * 0.12) : 1.0
@@ -245,7 +256,7 @@ ApplicationWindow {
border.width: 2; border.color: "#60ffffff"
SequentialAnimation on scale {
running: ui.isRecording
running: ui.isRecording && !ui.reduceMotion
loops: Animation.Infinite
NumberAnimation { from: 1.0; to: 1.08; duration: 600; easing.type: Easing.InOutQuad }
NumberAnimation { from: 1.08; to: 1.0; duration: 600; easing.type: Easing.InOutQuad }
@@ -263,6 +274,17 @@ ApplicationWindow {
fillMode: Image.PreserveAspectFit
}
}
// Focus ring
Rectangle {
anchors.fill: micCircle
anchors.margins: -4
radius: width / 2
color: "transparent"
border.width: 2
border.color: "#B794F6" // SettingsStyle.accent equivalent
visible: micContainer.activeFocus
}
}
// --- RAINBOW WAVEFORM (Shader) ---
@@ -277,6 +299,7 @@ ApplicationWindow {
ShaderEffect {
anchors.fill: parent
visible: !ui.reduceMotion
property real time: 0
property real amplitude: ui.amplitude
fragmentShader: "rainbow_wave.qsb"
@@ -341,8 +364,10 @@ ApplicationWindow {
font.family: jetBrainsMono.name; font.pixelSize: 16; font.bold: true; font.letterSpacing: 2
style: Text.Outline
styleColor: ui.isRecording ? "#ff0000" : "#808085"
Accessible.role: Accessible.StaticText
Accessible.name: "Recording time: " + text
SequentialAnimation on opacity {
running: ui.isRecording; loops: Animation.Infinite
running: ui.isRecording && !ui.reduceMotion; loops: Animation.Infinite
NumberAnimation { from: 1.0; to: 0.7; duration: 800 }
NumberAnimation { from: 0.7; to: 1.0; duration: 800 }
}

View File

@@ -12,7 +12,8 @@ Window {
visible: false
flags: Qt.FramelessWindowHint | Qt.Window
color: "transparent"
title: "Settings"
title: "WhisperVoice Settings"
Accessible.name: "WhisperVoice Settings"
// Explicit sizing for Python to read
@@ -133,15 +134,20 @@ Window {
// Improved Close Button
Rectangle {
width: 32; height: 32
activeFocusOnTab: true
Accessible.name: "Close settings"
Accessible.role: Accessible.Button
Keys.onReturnPressed: root.close()
Keys.onSpacePressed: root.close()
radius: 8
color: closeMa.containsMouse ? "#20ff4b4b" : "transparent"
border.color: closeMa.containsMouse ? "#40ff4b4b" : "transparent"
color: closeMa.containsMouse ? "#20FF8A8A" : "transparent"
border.color: closeMa.containsMouse ? "#40FF8A8A" : "transparent"
border.width: 1
Text {
anchors.centerIn: parent
text: "×"
color: closeMa.containsMouse ? "#ff4b4b" : SettingsStyle.textSecondary
color: closeMa.containsMouse ? "#FF8A8A" : SettingsStyle.textSecondary
font.family: mainFont
font.pixelSize: 20
font.bold: true
@@ -157,6 +163,15 @@ Window {
Behavior on color { ColorAnimation { duration: 150 } }
Behavior on border.color { ColorAnimation { duration: 150 } }
// Focus ring
Rectangle {
anchors.fill: parent
radius: 8
color: "transparent"
border.width: SettingsStyle.focusRingWidth
border.color: SettingsStyle.accent
visible: parent.activeFocus
}
}
}
@@ -206,6 +221,23 @@ Window {
height: 38
color: stack.currentIndex === index ? SettingsStyle.surfaceHover : (ma.containsMouse ? Qt.rgba(1,1,1,0.03) : "transparent")
radius: 6
activeFocusOnTab: true
Accessible.name: name
Accessible.role: Accessible.Tab
Keys.onReturnPressed: stack.currentIndex = index
Keys.onSpacePressed: stack.currentIndex = index
Keys.onDownPressed: {
if (index < navModel.count - 1) {
var nextItem = navBtnRoot.parent.children[index + 2]
if (nextItem && nextItem.forceActiveFocus) nextItem.forceActiveFocus()
}
}
Keys.onUpPressed: {
if (index > 0) {
var prevItem = navBtnRoot.parent.children[index]
if (prevItem && prevItem.forceActiveFocus) prevItem.forceActiveFocus()
}
}
Behavior on color { ColorAnimation { duration: 150 } }
@@ -256,6 +288,15 @@ Window {
cursorShape: Qt.PointingHandCursor
onClicked: stack.currentIndex = index
}
// Focus ring
Rectangle {
anchors.fill: parent
radius: 6
color: "transparent"
border.width: SettingsStyle.focusRingWidth
border.color: SettingsStyle.accent
visible: parent.activeFocus
}
}
}
@@ -286,6 +327,7 @@ Window {
// --- TAB: GENERAL ---
ScrollView {
Accessible.role: Accessible.PageTab
ScrollBar.vertical.policy: ScrollBar.AsNeeded
contentWidth: availableWidth
@@ -315,7 +357,7 @@ Window {
ModernSettingsItem {
label: "Global Hotkey (Transcribe)"
description: "Press to record a new shortcut (e.g. F9)"
description: "Standard: Raw transcription"
control: ModernKeySequenceRecorder {
implicitWidth: 240
currentSequence: ui.getSetting("hotkey")
@@ -323,6 +365,16 @@ Window {
}
}
ModernSettingsItem {
label: "Global Hotkey (Correct)"
description: "Enhanced: Transcribe + AI Correction"
control: ModernKeySequenceRecorder {
implicitWidth: 240
currentSequence: ui.getSetting("hotkey_correct")
onSequenceChanged: (seq) => ui.setSetting("hotkey_correct", seq)
}
}
ModernSettingsItem {
label: "Global Hotkey (Translate)"
description: "Press to record a new shortcut (e.g. F10)"
@@ -359,8 +411,8 @@ Window {
showSeparator: false
control: ModernSlider {
Layout.preferredWidth: 200
from: 10; to: 6000
stepSize: 10
from: 10; to: 20000
stepSize: 100
snapMode: Slider.SnapAlways
value: ui.getSetting("typing_speed")
onMoved: ui.setSetting("typing_speed", value)
@@ -373,6 +425,7 @@ Window {
// --- TAB: AUDIO ---
ScrollView {
Accessible.role: Accessible.PageTab
ScrollBar.vertical.policy: ScrollBar.AsNeeded
contentWidth: availableWidth
@@ -461,6 +514,7 @@ Window {
// --- TAB: VISUALS ---
ScrollView {
Accessible.role: Accessible.PageTab
ScrollBar.vertical.policy: ScrollBar.AsNeeded
contentWidth: availableWidth
@@ -510,7 +564,7 @@ Window {
ModernSettingsItem {
label: "Window Opacity"
description: "Transparency level"
showSeparator: false
showSeparator: true
control: ModernSlider {
Layout.preferredWidth: 200
from: 0.1; to: 1.0
@@ -518,6 +572,15 @@ Window {
onMoved: ui.setSetting("opacity", Number(value.toFixed(2)))
}
}
ModernSettingsItem {
label: "Reduce Motion"
description: "Disable animations for accessibility"
showSeparator: false
control: ModernSwitch {
checked: ui.getSetting("reduce_motion")
onToggled: ui.setSetting("reduce_motion", checked)
}
}
}
}
@@ -570,6 +633,7 @@ Window {
// --- TAB: AI ENGINE ---
ScrollView {
Accessible.role: Accessible.PageTab
ScrollBar.vertical.policy: ScrollBar.AsNeeded
contentWidth: availableWidth
@@ -742,8 +806,8 @@ Window {
}
color: "#ffffff"
font.family: "JetBrains Mono"
font.pixelSize: 10
opacity: 0.7
font.pixelSize: 11
opacity: 1.0
elide: Text.ElideRight
Layout.fillWidth: true
}
@@ -845,6 +909,137 @@ Window {
}
}
ModernSettingsSection {
title: "Correction & Rewriting"
Layout.margins: 32
Layout.topMargin: 0
content: ColumnLayout {
width: parent.width
spacing: 0
ModernSettingsItem {
label: "Enable Correction"
description: "Post-process text with Llama 3.2 1B (Adds latency)"
control: ModernSwitch {
checked: ui.getSetting("llm_enabled")
onToggled: ui.setSetting("llm_enabled", checked)
}
}
ModernSettingsItem {
label: "Correction Mode"
description: "Grammar Fix vs. Complete Rewrite"
visible: ui.getSetting("llm_enabled")
control: ModernComboBox {
width: 140
model: ["Grammar", "Standard", "Rewrite"]
currentIndex: model.indexOf(ui.getSetting("llm_mode"))
onActivated: ui.setSetting("llm_mode", currentText)
}
}
// LLM Model Status Card
Rectangle {
Layout.fillWidth: true
Layout.margins: 12
Layout.topMargin: 0
Layout.bottomMargin: 16
height: 54
color: "#0a0a0f"
visible: ui.getSetting("llm_enabled")
radius: 6
border.color: SettingsStyle.borderSubtle
border.width: 1
property bool isDownloaded: false
property bool isDownloading: ui.isDownloading && ui.statusText.indexOf("LLM") !== -1
Timer {
interval: 2000
running: visible
repeat: true
onTriggered: parent.checkStatus()
}
function checkStatus() {
isDownloaded = ui.isLLMModelDownloaded()
}
Component.onCompleted: checkStatus()
Connections {
target: ui
function onModelStatesChanged() { parent.checkStatus() }
function onIsDownloadingChanged() { parent.checkStatus() }
}
RowLayout {
anchors.fill: parent
anchors.leftMargin: 12
anchors.rightMargin: 12
spacing: 12
Image {
source: "smart_toy.svg"
sourceSize: Qt.size(16, 16)
layer.enabled: true
layer.effect: MultiEffect {
colorization: 1.0
colorizationColor: parent.parent.isDownloaded ? SettingsStyle.accent : "#808080"
}
}
ColumnLayout {
Layout.fillWidth: true
spacing: 2
Text {
text: "Llama 3.2 1B (Instruct)"
color: "#ffffff"
font.family: "JetBrains Mono"; font.bold: true
font.pixelSize: 11
}
Text {
text: parent.parent.isDownloaded ? "Ready." : "Model missing (~1.2GB)"
color: SettingsStyle.textSecondary
font.family: "JetBrains Mono"; font.pixelSize: 10
}
}
Button {
id: dlBtn
text: "Download"
visible: !parent.parent.isDownloaded && !parent.parent.isDownloading
Layout.preferredHeight: 24
Layout.preferredWidth: 80
contentItem: Text {
text: "DOWNLOAD"
font.pixelSize: 10; font.bold: true; color: "#000000"; horizontalAlignment: Text.AlignHCenter; verticalAlignment: Text.AlignVCenter
}
background: Rectangle {
color: dlBtn.hovered ? "#ffffff" : SettingsStyle.accent; radius: 4
}
onClicked: ui.downloadLLM()
}
// Progress Bar
Rectangle {
visible: parent.parent.isDownloading
Layout.fillWidth: true
height: 4
color: "#30ffffff"
Rectangle {
width: parent.width * (ui.downloadProgress / 100)
height: parent.height
color: SettingsStyle.accent
}
}
}
}
}
}
ModernSettingsSection {
title: "Advanced Decoding"
Layout.margins: 32
@@ -899,6 +1094,7 @@ Window {
// --- TAB: DEBUG ---
ScrollView {
Accessible.role: Accessible.PageTab
ScrollBar.vertical.policy: ScrollBar.AsNeeded
contentWidth: availableWidth
@@ -924,9 +1120,9 @@ Window {
spacing: 16
StatBox { label: "APP CPU"; value: ui.appCpu; unit: "%"; accent: "#00f2ff" }
StatBox { label: "APP RAM"; value: ui.appRamMb; unit: "MB"; accent: "#bd93f9" }
StatBox { label: "GPU VRAM"; value: ui.appVramMb; unit: "MB"; accent: "#ff79c6" }
StatBox { label: "GPU LOAD"; value: ui.appVramPercent; unit: "%"; accent: "#ff5555" }
StatBox { label: "APP RAM"; value: ui.appRamMb; unit: "MB"; accent: "#CAA9FF" }
StatBox { label: "GPU VRAM"; value: ui.appVramMb; unit: "MB"; accent: "#FF8FD0" }
StatBox { label: "GPU LOAD"; value: ui.appVramPercent; unit: "%"; accent: "#FF8A8A" }
}
Rectangle {

View File

@@ -6,13 +6,14 @@ QtObject {
// Colors
readonly property color background: "#F2121212" // Deep Obsidian with 95% opacity
readonly property color surfaceCard: "#1A1A1A" // Layer 1
readonly property color surfaceHover: "#2A2A2A" // Layer 2 (Lighter for better contrast)
readonly property color borderSubtle: Qt.rgba(1, 1, 1, 0.08)
readonly property color surfaceHover: "#2A2A2A" // Layer 2
readonly property color borderSubtle: Qt.rgba(1, 1, 1, 0.22) // WCAG 3:1 non-text contrast
readonly property color textPrimary: "#FAFAFA" // Brighter white
readonly property color textSecondary: "#999999"
readonly property color textPrimary: "#FAFAFA"
readonly property color textSecondary: "#ABABAB" // WCAG AAA 8.1:1 on #121212
readonly property color textDisabled: "#808080" // 4.0:1 minimum for disabled states
readonly property color accentPurple: "#7000FF"
readonly property color accentPurple: "#B794F6" // WCAG AAA 7.2:1 on #121212
readonly property color accentCyan: "#00F2FF"
// Configurable active accent
@@ -21,5 +22,9 @@ QtObject {
// Dimensions
readonly property int cardRadius: 16
readonly property int itemRadius: 8
readonly property int itemHeight: 60 // Even taller for more breathing room
readonly property int itemHeight: 60
// Accessibility
readonly property int focusRingWidth: 2
readonly property int minTargetSize: 24
}
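
The comments on the new tokens quote contrast ratios against `#121212`. Figures like these can be checked with the WCAG relative-luminance formula. The sketch below assumes a fully opaque `#121212` background (it ignores the 95% alpha on the `background` token), and the function names are illustrative rather than taken from this repository.

```python
def _linearize(channel: int) -> float:
    """Convert one 8-bit sRGB channel to linear light."""
    c = channel / 255
    # 0.04045 is the sRGB threshold; older WCAG text states 0.03928,
    # which gives the same result for 8-bit channel values.
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of an opaque '#RRGGBB' color."""
    digits = hex_color.lstrip("#")
    r, g, b = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """WCAG contrast ratio (lighter + 0.05) / (darker + 0.05), from 1 to 21."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


if __name__ == "__main__":
    # Token values copied from the diff above; background assumed opaque.
    for name, color in [
        ("textSecondary", "#ABABAB"),
        ("textDisabled", "#808080"),
        ("accentPurple", "#B794F6"),
    ]:
        print(f"{name}: {contrast_ratio(color, '#121212'):.2f}:1")
```

A ratio of at least 7:1 is the AAA threshold for normal-size text, so running the script shows whether each token clears it under the stated background assumption.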

View File

@@ -1,50 +0,0 @@
#version 440
layout(location = 0) in vec2 qt_TexCoord0;
layout(location = 0) out vec4 fragColor;
layout(std140, binding = 0) uniform buf {
mat4 qt_Matrix;
float qt_Opacity;
float time;
float aberration; // 0.0 to 1.0, controlled by Audio Amplitude
};
float rand(vec2 co) {
return fract(sin(dot(co.xy ,vec2(12.9898,78.233))) * 43758.5453);
}
void main() {
// 1. Calculate Distortion Offset based on Amplitude (aberration)
// We warp the UVs slightly away from center
vec2 uv = qt_TexCoord0;
vec2 dist = uv - 0.5;
// 2. Chromatic Aberration
// Red Channel shifts OUT
// Blue Channel shifts IN
float strength = aberration * 0.02; // Max shift 2% of texture size
vec2 rUV = uv + (dist * strength);
vec2 bUV = uv - (dist * strength);
// Sample texture? We don't have a texture input (source is empty Item), we are generating visuals.
// Wait, ShaderEffect usually works on sourceItem.
// Here we are generating NOISE on top of a gradient.
// So we apply Aberration to the NOISE function?
// Or do we want to aberrate the pixels UNDERNEATH?
// ShaderEffect with no source property renders purely procedural content.
// Let's create layered procedural noise with channel offsets
float nR = rand(rUV + vec2(time * 0.01, 0.0));
float nG = rand(uv + vec2(time * 0.01, 0.0)); // Green is anchor
float nB = rand(bUV + vec2(time * 0.01, 0.0));
// Also modulate alpha by aberration - higher volume = more intense grain?
// Or maybe just pure glitch.
vec4 grainColor = vec4(nR, nG, nB, 1.0);
// Mix it with opacity
fragColor = grainColor * qt_Opacity;
}

Binary file not shown.

View File

@@ -1,25 +0,0 @@
#version 440
layout(location = 0) in vec2 qt_TexCoord0;
layout(location = 0) out vec4 fragColor;
layout(std140, binding = 0) uniform buf {
mat4 qt_Matrix;
float qt_Opacity;
float time;
};
// High-quality pseudo-random function
float rand(vec2 co) {
return fract(sin(dot(co.xy ,vec2(12.9898,78.233))) * 43758.5453);
}
void main() {
// Dynamic Noise based on Time
// We add 'time' to the coordinate to animate the grain
float noise = rand(qt_TexCoord0 + vec2(time * 0.01, time * 0.02));
// Output grayscale noise with alpha modulation
// We want white noise, applied with qt_Opacity
fragColor = vec4(noise, noise, noise, 1.0) * qt_Opacity;
}

Binary file not shown.

Binary file not shown. (Before: 492 KiB)

Binary file not shown. (Before: 490 KiB)

Binary file not shown. (Before: 464 KiB)

View File

@@ -1,236 +0,0 @@
"""
Settings Window Module.
=======================
Manages the application configuration UI.
Refactored for 2026 Premium Aesthetics with Sidebar navigation.
"""
from PySide6.QtWidgets import (
    QWidget, QVBoxLayout, QHBoxLayout, QStackedWidget,
    QLabel, QComboBox, QFormLayout, QFrame, QMessageBox, QScrollArea
)
from PySide6.QtCore import Qt, Signal, Slot, QSize
from PySide6.QtGui import QFont, QIcon
from src.core.config import ConfigManager
from src.ui.styles import Theme, StyleGenerator, load_modern_fonts
from src.ui.components import FramelessWindow, ModernFrame, GlassButton, ModernSwitch, ModernSlider
import sounddevice as sd


class SettingsWindow(FramelessWindow):
    """
    The main settings dialog.
    Refactored with 2026 Premium Sidebar Layout.
    """
    settings_changed = Signal()
    def __init__(self, parent=None):
        super().__init__(parent)
        self.config = ConfigManager()
        self.setFixedSize(700, 500)
        # Main Container
        self.bg_frame = ModernFrame()
        self.bg_frame.setStyleSheet(StyleGenerator.get_glass_card(radius=20))
        self.root_layout = QVBoxLayout(self)
        self.root_layout.setContentsMargins(10, 10, 10, 10)
        self.root_layout.addWidget(self.bg_frame)
        # Title Bar Area (Inside glass card)
        self.title_layout = QHBoxLayout()
        self.title_layout.setContentsMargins(20, 15, 20, 0)
        title_lbl = QLabel("PREMIUM SETTINGS")
        title_lbl.setFont(load_modern_fonts())
        title_lbl.setStyleSheet(f"color: white; font-weight: 900; font-size: 14px; letter-spacing: 2px;")
        self.title_layout.addWidget(title_lbl)
        self.title_layout.addStretch()
        self.btn_close = GlassButton("×", accent_color="#ff4b4b")
        self.btn_close.setFixedSize(30, 30)
        self.btn_close.clicked.connect(self.close)
        self.title_layout.addWidget(self.btn_close)
        # Central Layout (Sidebar + Content)
        self.content_layout = QHBoxLayout()
        self.content_layout.setContentsMargins(10, 10, 10, 10)
        self.content_layout.setSpacing(10)
        # 1. SIDEBAR
        self.sidebar = QWidget()
        self.sidebar.setFixedWidth(160)
        self.sidebar_layout = QVBoxLayout(self.sidebar)
        self.sidebar_layout.setContentsMargins(0, 10, 0, 10)
        self.sidebar_layout.setSpacing(8)
        self.nav_general = GlassButton("General")
        self.nav_audio = GlassButton("Audio")
        self.nav_visuals = GlassButton("Visuals")
        self.nav_advanced = GlassButton("Advanced/AI")
        self.sidebar_layout.addWidget(self.nav_general)
        self.sidebar_layout.addWidget(self.nav_audio)
        self.sidebar_layout.addWidget(self.nav_visuals)
        self.sidebar_layout.addWidget(self.nav_advanced)
        self.sidebar_layout.addStretch()
        self.btn_save = GlassButton("SAVE CHANGES", accent_color=Theme.ACCENT_GREEN)
        self.btn_save.clicked.connect(self.save_settings)
        self.sidebar_layout.addWidget(self.btn_save)
        # 2. CONTENT STACK
        self.stack = QStackedWidget()
        self.stack.setStyleSheet("background: transparent;")
        # Connect sidebar to stack
        self.nav_general.clicked.connect(lambda: self.stack.setCurrentIndex(0))
        self.nav_audio.clicked.connect(lambda: self.stack.setCurrentIndex(1))
        self.nav_visuals.clicked.connect(lambda: self.stack.setCurrentIndex(2))
        self.nav_advanced.clicked.connect(lambda: self.stack.setCurrentIndex(3))
        # Main Layout Assembly
        self.inner_layout = QVBoxLayout(self.bg_frame)
        self.inner_layout.addLayout(self.title_layout)
        self.inner_layout.addLayout(self.content_layout)
        self.content_layout.addWidget(self.sidebar)
        self.content_layout.addWidget(self.stack)
        self.setup_pages()
        self.load_values()

    def setup_pages(self):
        """Creates the settings pages."""
        # --- GENERAL ---
        self.page_general = QWidget()
        l1 = QFormLayout(self.page_general)
        l1.setVerticalSpacing(20)
        self.inp_hotkey = QComboBox()
        self.inp_hotkey.addItems(["f1", "f2", "f3", "f4", "f5", "f6", "f7", "f8", "f9", "f10", "f11", "f12", "caps lock"])
        self.inp_hotkey.setStyleSheet(f"background: {Theme.BG_DARK}; border-radius: 4px; padding: 5px; color: white;")
        l1.addRow(self.create_lbl("Global Hotkey:"), self.inp_hotkey)
        self.chk_top = ModernSwitch()
        l1.addRow(self.create_lbl("Always on Top:"), self.chk_top)
        self.stack.addWidget(self.page_general)
        # --- AUDIO ---
        self.page_audio = QWidget()
        l2 = QFormLayout(self.page_audio)
        l2.setVerticalSpacing(15)
        self.inp_device = QComboBox()
        self.inp_device.setStyleSheet(f"background: {Theme.BG_DARK}; border-radius: 4px; padding: 5px; color: white;")
        self.populate_audio_devices()
        l2.addRow(self.create_lbl("Input Device:"), self.inp_device)
        self.sld_threshold = ModernSlider(Qt.Horizontal)
        self.sld_threshold.setRange(1, 25)
        self.lbl_threshold = self.create_lbl("2%")
        self.sld_threshold.valueChanged.connect(lambda v: self.lbl_threshold.setText(f"{v}%"))
        l2.addRow(self.create_lbl("Noise Gate:"), self.sld_threshold)
        l2.addRow("", self.lbl_threshold)
        self.sld_duration = ModernSlider(Qt.Horizontal)
        self.sld_duration.setRange(5, 50)
        self.lbl_duration = self.create_lbl("1.0s")
        self.sld_duration.valueChanged.connect(lambda v: self.lbl_duration.setText(f"{v/10}s"))
        l2.addRow(self.create_lbl("Auto-Submit:"), self.sld_duration)
        l2.addRow("", self.lbl_duration)
        self.stack.addWidget(self.page_audio)
        # --- VISUALS ---
        self.page_visuals = QWidget()
        l3 = QFormLayout(self.page_visuals)
        l3.setVerticalSpacing(20)
        self.inp_style = QComboBox()
        self.inp_style.addItem("Neon Line (Recommended)", "line")
        self.inp_style.addItem("Classic Bars", "bar")
        self.inp_style.setStyleSheet(f"background: {Theme.BG_DARK}; border-radius: 4px; padding: 5px; color: white;")
        l3.addRow(self.create_lbl("Visualizer:"), self.inp_style)
        self.sld_opacity = ModernSlider(Qt.Horizontal)
        self.sld_opacity.setRange(40, 100)
        self.lbl_opacity = self.create_lbl("100%")
        self.sld_opacity.valueChanged.connect(lambda v: self.lbl_opacity.setText(f"{v}%"))
        l3.addRow(self.create_lbl("Opacity:"), self.sld_opacity)
        l3.addRow("", self.lbl_opacity)
        self.stack.addWidget(self.page_visuals)
        # --- ADVANCED ---
        self.page_adv = QWidget()
        l4 = QFormLayout(self.page_adv)
        l4.setVerticalSpacing(15)
        self.inp_model = QComboBox()
        self.inp_model.setStyleSheet(f"background: {Theme.BG_DARK}; border-radius: 4px; padding: 5px; color: white;")
        for id, name in [("tiny", "Tiny (Fast)"), ("base", "Base"), ("small", "Small (Default)"), ("medium", "Medium"), ("large-v3", "Large V3")]:
            self.inp_model.addItem(name, id)
        l4.addRow(self.create_lbl("Model:"), self.inp_model)
        info = QLabel("Large models provide higher accuracy but require significant RAM/VRAM.")
        info.setWordWrap(True)
        info.setStyleSheet(f"color: {Theme.TEXT_SECONDARY}; font-style: italic; font-size: 11px;")
        l4.addRow("", info)
        self.stack.addWidget(self.page_adv)

def create_lbl(self, text):
lbl = QLabel(text)
lbl.setStyleSheet(f"color: {Theme.TEXT_SECONDARY}; font-weight: 600; font-size: 13px;")
return lbl
def populate_audio_devices(self):
try:
self.inp_device.addItem("System Default", -1)
for i, dev in enumerate(sd.query_devices()):
if dev['max_input_channels'] > 0:
self.inp_device.addItem(dev['name'], i)
except: pass
def load_values(self):
self.inp_hotkey.setCurrentText(self.config.get("hotkey"))
self.chk_top.setChecked(self.config.get("always_on_top"))
dev_id = self.config.get("input_device")
idx = self.inp_device.findData(dev_id if dev_id is not None else -1)
if idx >= 0: self.inp_device.setCurrentIndex(idx)
# round() rather than int(): values such as 0.29 * 100 evaluate to 28.999..., which int() would truncate to 28
self.sld_threshold.setValue(round(self.config.get("silence_threshold") * 100))
self.sld_duration.setValue(round(self.config.get("silence_duration") * 10))
idx = self.inp_style.findData(self.config.get("visualizer_style"))
if idx >= 0: self.inp_style.setCurrentIndex(idx)
self.sld_opacity.setValue(round(self.config.get("opacity") * 100))
idx = self.inp_model.findData(self.config.get("model_size"))
if idx >= 0: self.inp_model.setCurrentIndex(idx)
def save_settings(self):
updates = {
"hotkey": self.inp_hotkey.currentText(),
"always_on_top": self.chk_top.isChecked(),
"input_device": self.inp_device.currentData() if self.inp_device.currentData() != -1 else None,
"silence_threshold": self.sld_threshold.value() / 100.0,
"silence_duration": self.sld_duration.value() / 10.0,
"visualizer_style": self.inp_style.currentData(),
"opacity": self.sld_opacity.value() / 100.0,
"model_size": self.inp_model.currentData()
}
new_model = updates["model_size"]
if new_model != self.config.get("model_size"):
QMessageBox.information(self, "Model Updated", f"The {new_model} model will be downloaded on next launch.")
self.config.set_bulk(updates)
self.settings_changed.emit()
self.close()
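
The sliders above hold integers while the config stores floats, so load_values and save_settings scale by 100 (or 10) in each direction. Below is a minimal sketch of that round trip. The helper names to_slider and from_slider are hypothetical and are not part of the deleted file. It illustrates why rounding is the safer conversion when going from float back to slider ticks, since plain int() truncation can shift a saved value down by one tick.

```python
def to_slider(value: float, scale: int = 100) -> int:
    """Convert a stored float (e.g. 0.29) to integer slider ticks (hypothetical helper)."""
    return round(value * scale)


def from_slider(ticks: int, scale: int = 100) -> float:
    """Convert integer slider ticks back to the float stored in the config (hypothetical helper)."""
    return ticks / scale


saved = from_slider(29)        # what a save would store for slider position 29
truncated = int(saved * 100)   # truncating conversion on reload
restored = to_slider(saved)    # rounding conversion on reload

print(saved, truncated, restored)
```

With truncation the slider would reopen one tick lower than where it was saved. With rounding it reopens at the same position.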

@@ -1,62 +0,0 @@
"""
Style Engine Module.
====================
Centralized design system for the 2026 Premium UI.
Defines color palettes, glassmorphism templates, and modern font loading.
"""
from PySide6.QtGui import QColor, QFont, QFontDatabase
import os
class Theme:
"""Premium Dark Theme Palette (2026 Edition)."""
# Backgrounds
BG_DARK = "#0d0d12" # Deep cosmic black
BG_CARD = "#16161e" # Slightly lighter for components
BG_GLASS = "rgba(22, 22, 30, 0.7)" # Semi-transparent for glass effect
# Neons & Accents
ACCENT_CYAN = "#00f2ff" # Electric cyan
ACCENT_PURPLE = "#7000ff" # Deep cyber purple
ACCENT_GREEN = "#00ff88" # Mint neon
# Text
TEXT_PRIMARY = "#ffffff" # Pure white
TEXT_SECONDARY = "#9499b0" # Muted blue-gray
TEXT_MUTED = "#565f89" # Darker blue-gray
# Borders
BORDER_SUBTLE = "rgba(100, 100, 150, 0.2)"
BORDER_GLOW = "rgba(0, 242, 255, 0.5)"
class StyleGenerator:
"""Generates QSS strings for complex effects."""
@staticmethod
def get_glass_card(radius=12, border=True):
"""Returns QSS for a glassmorphism card."""
border_css = f"border: 1px solid {Theme.BORDER_SUBTLE};" if border else "border: none;"
return f"""
background-color: {Theme.BG_GLASS};
border-radius: {radius}px;
{border_css}
"""
@staticmethod
def get_glow_border(color=Theme.ACCENT_CYAN):
"""Returns QSS for a glowing border state."""
return f"border: 1px solid {color};"
def load_modern_fonts():
"""Attempts to load a modern font stack for the 2026 look."""
# Preferred order: Segoe UI Variable, Inter, Segoe UI, sans-serif
families = ["Segoe UI Variable Text", "Inter", "Segoe UI", "sans-serif"]
for family in families:
font = QFont(family, 10)
if family in QFontDatabase.families():
return font
# Absolute fallback
return QFont("Arial", 10)

@@ -1,117 +0,0 @@
"""
Audio Visualizer Module.
========================
High-Fidelity rendering for the 2026 Premium UI.
Supports 'Classic Bars' and 'Neon Line' with smooth curves and glows.
"""
from PySide6.QtWidgets import QWidget
from PySide6.QtCore import Qt, QTimer, Slot, QRectF, QPointF
from PySide6.QtGui import QPainter, QBrush, QColor, QPainterPath, QPen, QLinearGradient
import random
from src.ui.styles import Theme
class AudioVisualizer(QWidget):
"""
A premium audio visualizer with smooth physics and neon aesthetics.
"""
def __init__(self, parent=None):
super().__init__(parent)
self.amplitude = 0.0
self.bars = 12
self.history = [0.0] * self.bars
# High-refresh timer for silky smooth motion
self.timer = QTimer(self)
self.timer.timeout.connect(self.update_animation)
self.timer.start(16) # ~60 FPS
@Slot(float)
def set_amplitude(self, amp: float):
self.amplitude = amp
def update_animation(self):
self.history.pop(0)
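# Floor each sample with a little random jitter so the display never goes completely flat
jitter = random.uniform(0.01, 0.03)
# No smoothing or decay is applied here: the newest amplitude is appended as-is
self.history.append(max(self.amplitude, jitter))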
self.update()
def paintEvent(self, event):
from src.core.config import ConfigManager
style = ConfigManager().get("visualizer_style")
painter = QPainter(self)
painter.setRenderHint(QPainter.Antialiasing)
w, h = self.width(), self.height()
painter.translate(0, h / 2)
if style == "bar":
self._draw_bars(painter, w, h)
else:
self._draw_line(painter, w, h)
def _draw_bars(self, painter, w, h):
bar_w = w / self.bars
spacing = 3
for i, val in enumerate(self.history):
bar_h = val * (h * 0.9)
x = i * bar_w
# Gradient Bar
grad = QLinearGradient(0, -bar_h/2, 0, bar_h/2)
grad.setColorAt(0, QColor(Theme.ACCENT_PURPLE))
grad.setColorAt(1, QColor(Theme.ACCENT_CYAN))
painter.setBrush(grad)
painter.setPen(Qt.NoPen)
painter.drawRoundedRect(QRectF(x + spacing, -bar_h/2, bar_w - spacing*2, bar_h), 3, 3)
def _draw_line(self, painter, w, h):
points = len(self.history)
dx = w / (points - 1)
def get_path(multi):
# Build a polyline through the history samples; multi=-1 mirrors it below the centre line
p = QPainterPath()
p.moveTo(0, 0)
for i in range(points):
curr_x = i * dx
curr_y = -self.history[i] * (h * 0.45) * multi
if i == 0:
p.moveTo(curr_x, curr_y)
else:
# Straight segments; quadTo could be used here for smoother curves
p.lineTo(curr_x, curr_y)
return p
# Draw Top & Bottom
p_top = get_path(1)
p_bot = get_path(-1)
# Glow layer
glow_pen = QPen(QColor(Theme.ACCENT_CYAN))
glow_pen.setWidth(4)
glow_alpha = QColor(Theme.ACCENT_CYAN)
glow_alpha.setAlpha(60)
glow_pen.setColor(glow_alpha)
painter.setPen(glow_pen)
painter.drawPath(p_top)
painter.drawPath(p_bot)
# Core layer
core_pen = QPen(Qt.white)
core_pen.setWidth(2)
painter.setPen(core_pen)
painter.drawPath(p_top)
painter.drawPath(p_bot)
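
The visualizer above keeps a fixed-length history by popping the oldest sample and appending the newest on every timer tick. A minimal, Qt-free sketch of the same rolling buffer follows, assuming a collections.deque with maxlen in place of the list. The names make_history and push_sample are hypothetical and do not appear in the deleted module.

```python
import random
from collections import deque

BARS = 12  # same bar count the widget uses


def make_history(bars: int = BARS) -> deque:
    """Create a fixed-length buffer of zeros (hypothetical helper)."""
    return deque([0.0] * bars, maxlen=bars)


def push_sample(history: deque, amplitude: float) -> None:
    """Append the newest amplitude, floored by a small random jitter.

    Because the deque has maxlen set, appending discards the oldest
    entry automatically, replacing the explicit pop(0).
    """
    jitter = random.uniform(0.01, 0.03)
    history.append(max(amplitude, jitter))


history = make_history()
for amp in (0.0, 0.5, 0.2):
    push_sample(history, amp)

print(len(history), list(history)[-3:])
```

A deque drops the oldest element in constant time, whereas list.pop(0) shifts every remaining element. At 12 entries the difference is negligible, so this is a stylistic alternative rather than a needed optimization.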

@@ -1,38 +0,0 @@
import sys
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer
def test_m2m():
model_name = "facebook/m2m100_418M"
print(f"Loading {model_name}...")
tokenizer = M2M100Tokenizer.from_pretrained(model_name)
model = M2M100ForConditionalGeneration.from_pretrained(model_name)
# Test cases: (Language Code, Input)
test_cases = [
("en", "he go to school yesterday"),
("pl", "on iść do szkoła wczoraj"), # Intentional broken grammar in Polish
]
print("\nStarting M2M Tests (Self-Translation):\n")
for lang, input_text in test_cases:
tokenizer.src_lang = lang
encoded = tokenizer(input_text, return_tensors="pt")
# Translate to SAME language
generated_tokens = model.generate(
**encoded,
forced_bos_token_id=tokenizer.get_lang_id(lang)
)
corrected = tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)[0]
print(f"[{lang}]")
print(f"Input: {input_text}")
print(f"Output: {corrected}")
print("-" * 20)
if __name__ == "__main__":
test_m2m()

@@ -1,40 +0,0 @@
import sys
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
def test_mt0():
model_name = "bigscience/mt0-base"
print(f"Loading {model_name}...")
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
# Test cases: (Language, Prompt, Input)
# MT0 is instruction tuned, so we should prompt it in the target language or English.
# Cross-lingual prompting (English prompt -> Target tasks) is usually supported.
test_cases = [
("English", "Correct grammar:", "he go to school yesterday"),
("Polish", "Popraw gramatykę:", "to jest testowe zdanie bez kropki"),
("Finnish", "Korjaa kielioppi:", "tämä on testilause ilman pistettä"),
("Russian", "Исправь грамматику:", "это тестовое предложение без точки"),
("Japanese", "文法を直してください:", "これは点のないテスト文です"),
("Spanish", "Corrige la gramática:", "esta es una oración de prueba sin punto"),
]
print("\nStarting MT0 Tests:\n")
for lang, prompt_text, input_text in test_cases:
full_input = f"{prompt_text} {input_text}"
inputs = tokenizer(full_input, return_tensors="pt")
outputs = model.generate(inputs.input_ids, max_length=128)
corrected = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(f"[{lang}]")
print(f"Input: {full_input}")
print(f"Output: {corrected}")
print("-" * 20)
if __name__ == "__main__":
test_mt0()

@@ -1,34 +0,0 @@
import sys
import os
# Add src to path
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
from src.core.grammar_assistant import GrammarAssistant
def test_punctuation():
assistant = GrammarAssistant()
assistant.load_model()
samples = [
# User's example (verbatim)
"If the voice recognition doesn't recognize that I like stopped Or something would that would it also correct that",
# Generic run-on
"hello how are you doing today i am doing fine thanks for asking",
# Missing commas/periods
"well i think its valid however we should probably check the logs first"
]
print("\nStarting Punctuation Tests:\n")
for sample in samples:
print(f"Original: {sample}")
corrected = assistant.correct(sample)
print(f"Corrected: {corrected}")
print("-" * 20)
if __name__ == "__main__":
test_punctuation()