Add local text correction engine

This commit is contained in:
2026-01-31 01:02:24 +02:00
parent 6a98142c1d
commit 32d4e328ff
10 changed files with 601 additions and 61 deletions

View File

@@ -68,14 +68,20 @@ Whisper Voice is the bridge between thought and text. It listens with superhuman
### Workflow: `F9 (Default)`
The primary channel for native-language transcription. It transcribes precisely what it hears in the language you speak (or the one you've locked in Settings).
### ✨ Style Prompting (New in v1.0.2)
Whisper Voice replaces traditional "grammar correction models" with a native **Style Prompting** engine. By injecting a specific "pre-prompt" into the model's context window, we can guide its internal style without external post-processing.
### 🧠 Intelligent Correction (New in v1.1.0)
Whisper Voice now integrates a local **Llama 3.2 1B** LLM to act as a "Silent Consultant". It post-processes transcripts to fix grammar or polish style without effectively "chatting" back.
* **Standard (Default)**: Forces the model to use full sentences, proper capitalization, and periods. Ideal for dictation.
* **Casual**: Encourages a relaxed, lowercase style (e.g., "no way that's crazy lol").
* **Custom**: Allows you to seed the model with your own context (e.g., "Here is a list of medical terms:").
It is strictly trained on a **Forensic Protocol**: it will never lecture you, never refuse to process explicit language, and never sanitize your words. Your profanity is yours to keep.
This approach incurs **zero latency penalty** and **zero extra VRAM** usage.
#### Correction Modes:
* **Standard (Default)**: Fixes grammar, punctuation, and capitalization while keeping every word you said.
* **Grammar Only**: Strictly fixes objective errors (spelling/agreement). Touches nothing else.
* **Rewrite**: Polishes the flow and clarity of your sentences while explicitly preserving your original tone (Casual stays casual, Formal stays formal).
#### Supported Languages:
The correction engine is optimized for **English, German, French, Italian, Portuguese, Spanish, Hindi, and Thai**. It also performs well on **Russian, Chinese, Japanese, and Romanian**.
This approach incurs a ~2s latency penalty but uses **zero extra VRAM** when in Low VRAM mode.
<br>