Docs: Detailed explanation of Low VRAM Mode and Style Prompting
README.md
@@ -56,6 +56,15 @@ At its core, Whisper Voice is the ultimate bridge between thought and text. It l
### Workflow: `F9 (Default)`
The primary channel for native-language transcription. It transcribes precisely what it hears in the language you speak (or the one you've locked in Settings).
### ✨ Style Prompting (New in v1.0.2)
Whisper Voice replaces traditional "grammar correction models" with a native **Style Prompting** engine. By injecting a style-specific "pre-prompt" into the model's context window, it steers the output style directly, with no external post-processing step.
* **Standard (Default)**: Forces the model to use full sentences, proper capitalization, and periods. Ideal for dictation.
* **Casual**: Encourages a relaxed, lowercase style (e.g., "no way that's crazy lol").
* **Custom**: Allows you to seed the model with your own context (e.g., "Here is a list of medical terms:").
This approach incurs **zero latency penalty** and **zero extra VRAM** usage.
<br>
## 🌎 Universal Translation
@@ -105,6 +114,13 @@ Select the model that aligns with your available resources.
> *Note: Acceleration requires you to manually select your Compute Device (CUDA GPU or CPU) in Settings.*
### 📉 Low VRAM Mode
For users with limited GPU memory (e.g., 4GB cards) or those running heavy games simultaneously, Whisper Voice offers a specialized **Low VRAM Mode**.
* **Behavior**: The AI model is aggressively unloaded from the GPU immediately after every transcription.
* **Benefit**: When idle, the app consumes near-zero VRAM (~0MB), leaving your GPU completely free for gaming or rendering.
* **Trade-off**: Every voice command incurs a "cold start" latency of 1-2 seconds while the model reloads from the disk cache.
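The load-transcribe-unload cycle described above can be sketched as follows. This is a hypothetical outline, not Whisper Voice's implementation: the class and its placeholder strings stand in for the real framework calls (e.g. `whisper.load_model` for loading and `torch.cuda.empty_cache` for releasing VRAM).

```python
# Sketch of the Low VRAM Mode lifecycle: the model is resident on the
# GPU only for the duration of a single transcription.
class LowVramTranscriber:
    def __init__(self, model_name: str):
        self.model_name = model_name
        self.model = None  # nothing resident on the GPU while idle

    def _load(self):
        # Cold start: reload weights from the disk cache (1-2 s on real hardware).
        self.model = f"loaded:{self.model_name}"  # placeholder for load_model()

    def _unload(self):
        # Aggressively release the model and its VRAM right away.
        self.model = None  # placeholder for `del model` + empty_cache()

    def transcribe(self, audio) -> str:
        self._load()
        try:
            return f"transcript of {audio}"  # placeholder for model.transcribe()
        finally:
            self._unload()  # runs even if transcription fails
```

The `try/finally` guarantees the unload happens even on error, so a failed transcription never leaves the model pinned in VRAM between commands.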
---
## 🛠️ Deployment