# Compare commits

1 commit: `3137770742`

## README.md (12 changed lines)
```diff
@@ -43,6 +43,18 @@ Whisper Voice operates directly on the metal. It is not an API wrapper; it is an
 | **Sensory Gate** | **Silero VAD** | Enterprise-grade Voice Activity Detection filters out the noise, ensuring only pure intent is processed. |
 | **Interface** | **Qt 6 / QML** | Hardware-accelerated, glassmorphic UI that is fluid, responsive, and sovereign. |
 
+### 🛑 Compatibility Matrix (Windows)
+
+The core engine (`CTranslate2`) is heavily optimized for Nvidia tensor cores.
+
+| Manufacturer | Hardware | Status | Notes |
+| :--- | :--- | :--- | :--- |
+| **Nvidia** | GTX 900+ / RTX | ✅ **Supported** | Full heavy-metal acceleration. |
+| **AMD** | Radeon RX | ⚠️ **CPU Fallback** | Runs on CPU. Valid for `Small/Medium`, slow for `Large`. |
+| **Intel** | Arc / Iris | ⚠️ **CPU Fallback** | Runs on CPU. Valid for `Small/Medium`, slow for `Large`. |
+| **Apple** | M1 / M2 / M3 | ❌ **Unsupported** | Release is strictly Windows x64. |
+
+> **AMD Users**: v1.0.3 auto-detects GPU failures and silently falls back to CPU.
 
 <br>
 
 ## 🖋️ Universal Transcription
```
## RELEASE_NOTES.md

```diff
@@ -1,28 +1,28 @@
-# Release v1.0.2
+# Release v1.0.4
 
-**"The Lightweight Release"**
+**"The Compatibility Update"**
 
-This release focuses on removing bloat and switching to a native approach for punctuation, resulting in a significantly faster and smaller application.
+This release focuses on maximum stability across different hardware configurations (AMD, Intel, Nvidia) and fixing startup crashes related to corrupted models or missing drivers.
 
-## 🚀 Key Changes
+## 🛠️ Critical Fixes
 
-### 1. Style Prompting (Replaces Grammar Model)
-We have removed the heavy "Grammar Correction" model (M2M100) and replaced it with **Style Prompting**.
-* **How it works**: Uses Whisper's internal context awareness to force proper punctuation.
-* **New Settings**: Go to `Settings -> AI Engine` to choose a style:
-    * **Standard**: (Default) Forces full sentences and proper punctuation.
-    * **Casual**: Relaxed, lowercase style.
-    * **Custom**: Enter your own prompt context.
+### 1. Robust CPU Fallback (AMD / Intel Support)
+* **Problem**: Previously, if an AMD user tried to run the app, it would crash instantly because it tried to load Nvidia CUDA libraries by default.
+* **Fix**: The app now **silently detects** if CUDA initialization fails (due to missing DLLs or incompatible hardware) and **automatically falls back to CPU mode**.
+* **Result**: The app "just works" on any Windows machine, regardless of GPU.
 
-### 2. Bloat Removal
-* **Removed**: `transformers`, `sentencepiece`, `accelerate` libraries.
-* **Removed**: `grammar-m2m100` model downloader and logic.
-* **Impact**: The application is lighter, installs faster, and uses less RAM.
+### 2. Startup Crash Protection
+* **Problem**: If `faster_whisper` was imported before checking for valid drivers, the app would crash on launch for some users.
+* **Fix**: Implemented **Lazy Loading** for the AI engine. The app now starts the UI first, and only loads the heavy AI libraries inside a safety block that catches errors.
 
-### 3. Stability Fixes
-* **Fixed**: `NameError: 'torch' is not defined` when using Low VRAM Mode.
-* **Fixed**: Bootstrapper now self-repairs missing dependencies if the environment gets corrupted.
+### 3. Corrupt Model Auto-Repair
+* **Problem**: Interrupted downloads could leave a corrupted model folder, preventing the app from ever starting again.
+* **Fix**: If the app detects a "vocabulary missing" or invalid config error, it will now **automatically delete the corrupt folder** and allow you to re-download it cleanly.
+
+### 4. Windows DLL Injection
+* **Fix**: Added explicit DLL path injection for `nvidia-cublas` and `nvidia-cudnn` to ensure Python 3.8+ can find the required CUDA libraries on Windows systems that don't have them in PATH.
 
 ## 📦 Installation
-1. Download `WhisperVoice.exe` (attached below or in `dist/`).
-2. Run it. It will automatically update your environment if needed.
+1. Download `WhisperVoice.exe` below.
+2. Replace your existing `.exe`.
+3. Run it.
```
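The lazy-loading idea in fix 2 (start the UI immediately, import the heavy backend later inside a safety block) can be sketched with a small wrapper. This is an illustrative pattern, not the app's actual class; `LazyEngine` and its attribute names are hypothetical:

```python
import importlib


class LazyEngine:
    """Defer importing a heavy backend until first use (sketch).

    If the import fails (missing DLLs, bad drivers), the error is
    captured instead of crashing the process at startup.
    """

    def __init__(self, module_name: str):
        self.module_name = module_name
        self._module = None
        self.load_error = None

    def get(self):
        # Import lazily, once; record any failure for the UI to report.
        if self._module is None and self.load_error is None:
            try:
                self._module = importlib.import_module(self.module_name)
            except Exception as e:
                self.load_error = e
        return self._module


# A missing backend no longer crashes startup; the error is just recorded:
broken = LazyEngine("definitely_not_a_real_backend_xyz")
assert broken.get() is None and broken.load_error is not None

# A working module imports normally on first access:
ok = LazyEngine("json")
assert ok.get() is not None
```

The UI can then poll `load_error` and show a friendly message instead of dying on import.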
## dist/WhisperVoice.exe (vendored)

Binary file not shown.
## main.py (25 changed lines)

```diff
@@ -9,6 +9,31 @@ app_dir = os.path.dirname(os.path.abspath(__file__))
 if app_dir not in sys.path:
     sys.path.insert(0, app_dir)
 
+# -----------------------------------------------------------------------------
+# WINDOWS DLL FIX (CRITICAL for Portable CUDA)
+# Python 3.8+ on Windows requires explicit DLL directory addition.
+# -----------------------------------------------------------------------------
+if os.name == 'nt' and hasattr(os, 'add_dll_directory'):
+    try:
+        from pathlib import Path
+        # Scan sys.path for site-packages
+        for p in sys.path:
+            path_obj = Path(p)
+            if path_obj.name == 'site-packages' and path_obj.exists():
+                nvidia_path = path_obj / "nvidia"
+                if nvidia_path.exists():
+                    for subdir in nvidia_path.iterdir():
+                        # Add 'bin' folder from each nvidia stub (cublas, cudnn, etc.)
+                        bin_path = subdir / "bin"
+                        if bin_path.exists():
+                            os.add_dll_directory(str(bin_path))
+                # Also try adding site-packages itself just in case
+                # os.add_dll_directory(str(path_obj))
+                break
+    except Exception:
+        pass
+# -----------------------------------------------------------------------------
+
 from PySide6.QtWidgets import QApplication, QFileDialog, QMessageBox
 from PySide6.QtCore import QObject, Slot, Signal, QThread, Qt, QUrl
 from PySide6.QtQml import QQmlApplicationEngine
```
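The DLL fix above reduces to a pure path question: find every `nvidia/<pkg>/bin` directory under the first `site-packages` entry on `sys.path`. That logic can be factored into a testable helper (the function name `find_nvidia_bin_dirs` is hypothetical, not part of main.py):

```python
import tempfile
from pathlib import Path


def find_nvidia_bin_dirs(search_paths):
    """Return each nvidia/<pkg>/bin directory under the first existing
    'site-packages' entry in search_paths (sketch of the main.py scan)."""
    found = []
    for p in search_paths:
        path_obj = Path(p)
        if path_obj.name == "site-packages" and path_obj.exists():
            nvidia_path = path_obj / "nvidia"
            if nvidia_path.exists():
                for subdir in sorted(nvidia_path.iterdir()):
                    bin_path = subdir / "bin"
                    if bin_path.exists():
                        found.append(bin_path)
            break  # mirror main.py: stop at the first site-packages hit
    return found


# Demo on a fake site-packages tree:
with tempfile.TemporaryDirectory() as tmp:
    sp = Path(tmp) / "site-packages"
    (sp / "nvidia" / "cublas" / "bin").mkdir(parents=True)
    (sp / "nvidia" / "cudnn" / "bin").mkdir(parents=True)
    dirs = find_nvidia_bin_dirs([str(sp)])
    assert [d.parent.name for d in dirs] == ["cublas", "cudnn"]
```

On Windows, each returned path would then be handed to `os.add_dll_directory`, which is how Python 3.8+ extends the DLL search path.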
## publish_release.py (new file, 73 lines)

```python
import os
import requests
import mimetypes

# Configuration
API_URL = "https://git.lashman.live/api/v1"
OWNER = "lashman"
REPO = "whisper_voice"
TAG = "v1.0.4"
TOKEN = "6153890332afff2d725aaf4729bc54b5030d5700"  # Extracted from git config
EXE_PATH = r"dist\WhisperVoice.exe"

headers = {
    "Authorization": f"token {TOKEN}",
    "Accept": "application/json"
}

def create_release():
    print(f"Creating release {TAG}...")

    # Read Release Notes
    with open("RELEASE_NOTES.md", "r", encoding="utf-8") as f:
        notes = f.read()

    # Create Release
    payload = {
        "tag_name": TAG,
        "name": TAG,
        "body": notes,
        "draft": False,
        "prerelease": False
    }

    url = f"{API_URL}/repos/{OWNER}/{REPO}/releases"
    resp = requests.post(url, json=payload, headers=headers)

    if resp.status_code == 201:
        print("Release created successfully!")
        return resp.json()
    elif resp.status_code == 409:
        print("Release already exists. Fetching it...")
        # Get by tag
        resp = requests.get(f"{API_URL}/repos/{OWNER}/{REPO}/releases/tags/{TAG}", headers=headers)
        if resp.status_code == 200:
            return resp.json()

    print(f"Failed to create release: {resp.status_code} - {resp.text}")
    return None

def upload_asset(release_id, file_path):
    print(f"Uploading asset: {file_path}...")
    filename = os.path.basename(file_path)

    with open(file_path, "rb") as f:
        data = f.read()

    url = f"{API_URL}/repos/{OWNER}/{REPO}/releases/{release_id}/assets?name={filename}"

    # Gitea API expects raw body
    resp = requests.post(url, data=data, headers=headers)

    if resp.status_code == 201:
        print(f"Uploaded {filename} successfully!")
    else:
        print(f"Failed to upload asset: {resp.status_code} - {resp.text}")

def main():
    release = create_release()
    if release:
        upload_asset(release["id"], EXE_PATH)

if __name__ == "__main__":
    main()
```
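The create-then-fetch-on-409 flow in `publish_release.py` can be isolated from the network for testing. In this sketch `post` and `get` are injectable stand-ins for the `requests` calls, each returning a `(status_code, json_body)` pair; the helper name `create_or_get_release` is hypothetical:

```python
def create_or_get_release(post, get, api_url, owner, repo, tag, notes):
    """Create a release; on 409 (tag already released), fetch it by tag.

    Sketch of the publish_release.py flow with HTTP injected for testing.
    """
    payload = {"tag_name": tag, "name": tag, "body": notes,
               "draft": False, "prerelease": False}
    status, body = post(f"{api_url}/repos/{owner}/{repo}/releases", payload)
    if status == 201:
        return body
    if status == 409:
        # Release exists: look it up by tag instead.
        status, body = get(f"{api_url}/repos/{owner}/{repo}/releases/tags/{tag}")
        if status == 200:
            return body
    return None


# Simulate a server where the tag already has a release:
fake_post = lambda url, payload: (409, {})
fake_get = lambda url: (200, {"id": 7, "tag_name": "v1.0.4"})
rel = create_or_get_release(fake_post, fake_get,
                            "https://git.example/api/v1", "o", "r",
                            "v1.0.4", "notes")
assert rel["id"] == 7
```

Separating the control flow from `requests` like this makes the 201/409/failure branches trivially testable, which the inline script version is not.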
## Transcriber module (filename not shown in the capture)

```diff
@@ -21,7 +21,7 @@ except ImportError:
     torch = None
 
 # Import directly - valid since we are now running in the full environment
-from faster_whisper import WhisperModel
 
 class WhisperTranscriber:
     """
```
```diff
@@ -62,6 +62,8 @@ class WhisperTranscriber:
         # Force offline if path exists to avoid HF errors
         local_only = new_path.exists()
 
+        try:
+            from faster_whisper import WhisperModel
             self.model = WhisperModel(
                 model_input,
                 device=device,
```
```diff
@@ -69,6 +71,23 @@ class WhisperTranscriber:
                 download_root=str(get_models_path()),
                 local_files_only=local_only
             )
+        except Exception as load_err:
+            # CRITICAL FALLBACK: If CUDA/cublas fails (AMD/Intel users), fall back to CPU
+            err_str = str(load_err).lower()
+            if "cublas" in err_str or "cudnn" in err_str or "library" in err_str or "device" in err_str:
+                logging.warning(f"CUDA Init Failed ({load_err}). Falling back to CPU...")
+                self.config.set("compute_device", "cpu")  # Update config for persistence/UI
+                self.current_compute_device = "cpu"
+
+                self.model = WhisperModel(
+                    model_input,
+                    device="cpu",
+                    compute_type="int8",  # CPUs usually handle int8 well with newer extensions
+                    download_root=str(get_models_path()),
+                    local_files_only=local_only
+                )
+            else:
+                raise load_err
 
         self.current_model_size = size
         self.current_compute_device = device
```
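The fallback hinges on a substring heuristic that decides whether an exception looks like a CUDA/driver failure (recoverable by retrying on CPU) rather than a genuine model error. That predicate can be factored out and exercised directly; `is_cuda_init_error` is an illustrative name, not a function in the codebase:

```python
def is_cuda_init_error(err: Exception) -> bool:
    """Heuristic from the fallback above: cublas/cudnn/library/device
    errors are treated as 'GPU unavailable'; anything else is fatal."""
    err_str = str(err).lower()
    return any(k in err_str for k in ("cublas", "cudnn", "library", "device"))


# Typical messages seen when CUDA DLLs are missing or the GPU is unsupported:
assert is_cuda_init_error(RuntimeError("Library cublas64_12.dll is not found"))
assert is_cuda_init_error(OSError("could not load library cudnn_ops64_9.dll"))

# A plain configuration mistake should still surface to the user:
assert not is_cuda_init_error(ValueError("unsupported model size"))
```

Note that `"library"` and `"device"` are broad keywords, so unrelated errors containing those words would also trigger the CPU retry; the diff accepts that trade-off in favor of never crashing on AMD/Intel machines.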
```diff
@@ -79,6 +98,32 @@ class WhisperTranscriber:
             logging.error(f"Failed to load model: {e}")
             self.model = None
 
+            # Auto-Repair: Detect vocabulary/corrupt errors
+            err_str = str(e).lower()
+            if "vocabulary" in err_str or "tokenizer" in err_str or "config.json" in err_str:
+                logging.warning("Corrupt model detected on load. Attempting to delete and reset...")
+                try:
+                    import shutil
+                    # Differentiate between simple path and HF path
+                    new_path = get_models_path() / f"faster-whisper-{size}"
+                    if new_path.exists():
+                        shutil.rmtree(new_path)
+                        logging.info(f"Deleted corrupt model at {new_path}")
+                    else:
+                        # Try legacy HF path
+                        hf_path = get_models_path() / f"models--Systran--faster-whisper-{size}"
+                        if hf_path.exists():
+                            shutil.rmtree(hf_path)
+                            logging.info(f"Deleted corrupt HF model at {hf_path}")
+
+                    # Notify UI to refresh state (will show 'Download' button now).
+                    # We can't reach the bridge easily here without passing it in,
+                    # but the UI polls or listens to logs.
+                    # The user will simply see "Model Missing" in settings after this.
+                except Exception as del_err:
+                    logging.error(f"Failed to delete corrupt model: {del_err}")
 
     def transcribe(self, audio_data, is_file: bool = False, task: Optional[str] = None) -> str:
         """
         Transcribe audio data.
```
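The delete-and-reset step can be demonstrated on a throwaway directory. The folder names mirror the two layouts checked above (simple and legacy Hugging Face cache); the helper name `purge_corrupt_model` is hypothetical:

```python
import shutil
import tempfile
from pathlib import Path


def purge_corrupt_model(models_path: Path, size: str) -> bool:
    """Remove a model folder in either the simple or the legacy HF layout.

    Returns True if something was deleted (sketch of the auto-repair above).
    """
    for name in (f"faster-whisper-{size}",
                 f"models--Systran--faster-whisper-{size}"):
        target = models_path / name
        if target.exists():
            shutil.rmtree(target)
            return True
    return False


with tempfile.TemporaryDirectory() as tmp:
    models = Path(tmp)
    (models / "faster-whisper-small").mkdir()
    assert purge_corrupt_model(models, "small")          # folder removed
    assert not (models / "faster-whisper-small").exists()
    assert not purge_corrupt_model(models, "small")      # nothing left to delete
```

After the purge, the next `model_exists` check fails cleanly and the UI offers a fresh download instead of crashing on the corrupt folder.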
```diff
@@ -89,7 +134,7 @@ class WhisperTranscriber:
         if not self.model:
             self.load_model()
         if not self.model:
-            return "Error: Model failed to load."
+            return "Error: Model failed to load. Please check Settings -> Model Info."
 
         try:
             # Config
```
```diff
@@ -174,7 +219,10 @@ class WhisperTranscriber:
     def model_exists(self, size: str) -> bool:
         """Checks if a model size is already downloaded."""
         new_path = get_models_path() / f"faster-whisper-{size}"
-        if (new_path / "config.json").exists():
+        if new_path.exists():
+            # Strict check
+            required = ["config.json", "model.bin", "vocabulary.json"]
+            if all((new_path / f).exists() for f in required):
                 return True
 
         # Legacy HF cache check
```
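The tightened check (all three critical files must exist, not just the folder or a lone `config.json`) is easy to verify in isolation. A minimal sketch, with the hypothetical helper name `model_is_complete`:

```python
import tempfile
from pathlib import Path

# Same critical-file list as the strict check in model_exists:
REQUIRED = ["config.json", "model.bin", "vocabulary.json"]


def model_is_complete(model_dir: Path) -> bool:
    """True only if the folder exists and every critical file is present."""
    return model_dir.exists() and all((model_dir / f).exists() for f in REQUIRED)


with tempfile.TemporaryDirectory() as tmp:
    d = Path(tmp) / "faster-whisper-small"
    d.mkdir()
    (d / "config.json").touch()
    (d / "model.bin").touch()
    assert not model_is_complete(d)   # vocabulary.json missing: incomplete
    (d / "vocabulary.json").touch()
    assert model_is_complete(d)       # all files present: complete
```

This is exactly the situation the old check got wrong: an interrupted download could leave `config.json` behind without `model.bin`, and the app would try to load a half-downloaded model.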
```diff
@@ -381,7 +381,10 @@ class UIBridge(QObject):
 
         # Check new simple format used by DownloadWorker
         path_simple = get_models_path() / f"faster-whisper-{size}"
-        if path_simple.exists() and any(path_simple.iterdir()):
+        if path_simple.exists():
+            # Strict check: Ensure all critical files exist
+            required = ["config.json", "model.bin", "vocabulary.json"]
+            if all((path_simple / f).exists() for f in required):
                 return True
 
         # Check HF Cache format (legacy/default)
```
```diff
@@ -389,16 +392,12 @@ class UIBridge(QObject):
         path_hf = get_models_path() / folder_name
         snapshots = path_hf / "snapshots"
         if snapshots.exists() and any(snapshots.iterdir()):
-            return True
+            return True  # Legacy cache structure is complex, assume valid if present
 
-        # Check direct folder (simple)
-        path_direct = get_models_path() / size
-        if (path_direct / "config.json").exists():
-            return True
+        return False
 
     except Exception as e:
         logging.error(f"Error checking model status: {e}")
         return False
 
     @Slot(str)
```