Docs: Detailed explanation of Low VRAM Mode and Style Prompting
README.md
@@ -56,6 +56,15 @@ At its core, Whisper Voice is the ultimate bridge between thought and text. It l
### Workflow: `F9 (Default)`
The primary channel for native-language transcription. It transcribes precisely what it hears in the language you speak (or the one you've locked in Settings).
### ✨ Style Prompting (New in v1.0.2)
Whisper Voice replaces traditional "grammar correction models" with a native **Style Prompting** engine. By injecting a style-specific "pre-prompt" into the model's context window, it steers the output style directly, with no external post-processing step.
* **Standard (Default)**: Forces the model to use full sentences, proper capitalization, and periods. Ideal for dictation.
* **Casual**: Encourages a relaxed, lowercase style (e.g., "no way that's crazy lol").
* **Custom**: Allows you to seed the model with your own context (e.g., "Here is a list of medical terms:").
This approach incurs **zero latency penalty** and **zero extra VRAM** usage.
<br>
## 🌎 Universal Translation
@@ -105,6 +114,13 @@ Select the model that aligns with your available resources.
> *Note: Acceleration requires you to manually select your Compute Device (CUDA GPU or CPU) in Settings.*
### 📉 Low VRAM Mode
For users with limited GPU memory (e.g., 4GB cards) or those running heavy games simultaneously, Whisper Voice offers a specialized **Low VRAM Mode**.
* **Behavior**: The AI model is aggressively unloaded from the GPU immediately after every transcription.
* **Benefit**: When idle, the app consumes near-zero VRAM (~0MB), leaving your GPU completely free for gaming or rendering.
* **Trade-off**: Every voice command incurs a "cold start" latency of 1-2 seconds while the model reloads from the disk cache.
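The load-transcribe-unload cycle described above can be sketched as follows. This is a hypothetical outline, not Whisper Voice's implementation: the class and its placeholder strings stand in for the real framework calls (e.g. `whisper.load_model` for loading and `torch.cuda.empty_cache` for releasing VRAM).

```python
# Sketch of the Low VRAM Mode lifecycle: the model is resident on the
# GPU only for the duration of a single transcription.
class LowVramTranscriber:
    def __init__(self, model_name: str):
        self.model_name = model_name
        self.model = None  # nothing resident on the GPU while idle

    def _load(self):
        # Cold start: reload weights from the disk cache (1-2 s on real hardware).
        self.model = f"loaded:{self.model_name}"  # placeholder for load_model()

    def _unload(self):
        # Aggressively release the model and its VRAM right away.
        self.model = None  # placeholder for `del model` + empty_cache()

    def transcribe(self, audio) -> str:
        self._load()
        try:
            return f"transcript of {audio}"  # placeholder for model.transcribe()
        finally:
            self._unload()  # runs even if transcription fails
```

The `try/finally` guarantees the unload happens even on error, so a failed transcription never leaves the model pinned in VRAM between commands.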
---
## 🛠️ Deployment