Docs: Detailed expansion of README with Translation features and open layout
113
README.md
@@ -42,7 +42,40 @@ This operates on the metal. It is not a wrapper. It is an engine.

---

## 📊 Intelligence Matrix
## 🌎 Native Translation Engine

Whisper Voice v1.0.1 introduces a powerful **Universal Translator** built directly into the core. This is not a web request to Google Translate. This is a neural network running on your GPU that understands the semantic meaning of speech and reconstructs it in fluent English.

* **Any Language Source**: Speak in French, Japanese, Russian, or 96 other languages.
* **English Output**: The engine translates the audio directly into English text.
* **Low Latency**: Translation runs in real time as you speak, sentence by sentence.

### Dual-Channel Operation
You do not need to switch modes manually. The application listens for two separate hotkeys simultaneously.

* **F9 (Default)** -> **Transcribe**: Types exactly what you say, in the language you speak.
* **F10 (Default)** -> **Translate**: Translates whatever you say in *any* language into English.

This allows for seamless bilingual workflows. Dictate a message to a local friend on `F9`, then instantly reply to an international colleague on `F10` without touching a single setting.
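
As a rough sketch of how two hotkeys could share one model: Whisper's decoding options include a `task` (`transcribe` or `translate`) and an optional `language`. The `HOTKEY_TASKS` table and the `decode_options` helper below are hypothetical names for illustration, not Whisper Voice's actual code.

```python
from typing import Optional

# Hypothetical mapping from hotkey name to Whisper task (README defaults).
HOTKEY_TASKS = {"F9": "transcribe", "F10": "translate"}


def decode_options(hotkey: str, language: Optional[str] = None) -> dict:
    """Build keyword options for a Whisper-style transcribe() call.

    language=None stands for Auto-Detect; a code such as "fr" locks
    the source language.
    """
    try:
        task = HOTKEY_TASKS[hotkey]
    except KeyError:
        raise ValueError(f"no task bound to hotkey {hotkey!r}") from None
    return {"task": task, "language": language}


if __name__ == "__main__":
    print(decode_options("F9", language="fr"))
    print(decode_options("F10"))
```

The resulting dictionary could then be passed as keyword arguments to the model call, so both hotkeys reuse the same loaded model and differ only in the task.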

---

## 🕹️ Controls & Configuration

### Global Hotkeys
The system runs silently in the background. Control it via global shortcuts:

* **Transcribe (Default: F9)**: Use this for normal speech-to-text. It respects the language set in Settings (or Auto-Detect).
* **Translate (Default: F10)**: Use this to force translation to English.
* **Customization**: Both keys can be remapped in the Settings menu. The recorder supports complex combinations (e.g., `Ctrl + Alt + Space`).
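
A minimal sketch of how a recorded combination such as `Ctrl + Alt + Space` might be normalized before it is registered. The `parse_hotkey` function and the `MODIFIERS` set are assumptions made for this example; the README does not describe the recorder's internal format.

```python
from typing import FrozenSet, Tuple

# Assumed modifier names; the real recorder may use different ones.
MODIFIERS = frozenset({"ctrl", "alt", "shift", "win"})


def parse_hotkey(combo: str) -> Tuple[FrozenSet[str], str]:
    """Split 'Ctrl + Alt + Space' into (modifiers, key), lower-cased."""
    parts = [part.strip().lower() for part in combo.split("+")]
    if not all(parts):
        raise ValueError(f"empty key in combination {combo!r}")
    modifiers = frozenset(p for p in parts if p in MODIFIERS)
    keys = [p for p in parts if p not in MODIFIERS]
    if len(keys) != 1:
        raise ValueError("expected exactly one non-modifier key")
    return modifiers, keys[0]


if __name__ == "__main__":
    print(parse_hotkey("Ctrl + Alt + Space"))
    print(parse_hotkey("F9"))
```

Storing the modifiers as a set makes `Ctrl + Alt + Space` and `Alt + Ctrl + Space` compare equal, which avoids duplicate bindings that differ only in order.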

### Input Modes
* **Clipboard Paste**: Injects text via the OS clipboard. Instant, but some games disable paste.
* **Simulate Typing**: Mimics physical keystrokes. Bypasses anti-cheat and anti-paste blocks. The speed is configurable (default 6000 CPM) to prevent game kicks.
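
To make the CPM setting concrete: characters per minute converts to a per-keystroke delay of `60 / CPM` seconds, so 6000 CPM is 10 ms per character and 1200 CPM is 50 ms. The helper names below are hypothetical and only illustrate the arithmetic.

```python
def keystroke_delay(cpm: int) -> float:
    """Seconds to wait between simulated keystrokes at a given CPM."""
    if cpm <= 0:
        raise ValueError("cpm must be positive")
    return 60.0 / cpm


def typing_duration(text: str, cpm: int) -> float:
    """Approximate seconds needed to type `text` at `cpm`."""
    return len(text) * keystroke_delay(cpm)


if __name__ == "__main__":
    for cpm in (6000, 1200):
        ms = keystroke_delay(cpm) * 1000
        print(f"{cpm} CPM -> {ms:.0f} ms per keystroke")
```

At the default speed a 100-character sentence takes about one second to type; at 1200 CPM the same sentence takes about five.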

---

## 📊 Intelligence Matrix (Models)

Select the model that aligns with your hardware capabilities.

@@ -65,82 +98,20 @@ Select the model that aligns with your hardware capabilities.
1. **Download**: Grab `WhisperVoice.exe` from [Releases](https://git.lashman.live/lashman/whisper_voice/releases).
2. **Deploy**: Place it anywhere. It is portable.
3. **Bootstrap**: Run it. The agent will self-provision an isolated Python environment (~2GB) on first launch.
4. **Updates**: Simply replace the `.exe`. The **Smart Bootstrapper** will detect the update and sync only the changed files, preserving your settings and skipping unnecessary downloads.
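
The README does not say how the Smart Bootstrapper decides which files changed. One common approach, sketched here purely as an assumption, is to compare content hashes against a manifest saved from the previous version. All function names below are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path) -> str:
    """SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to its digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(old: Dict[str, str], new: Dict[str, str]) -> List[str]:
    """Files that are new or whose digest differs from the old manifest."""
    return sorted(name for name, digest in new.items() if old.get(name) != digest)


if __name__ == "__main__":
    old = {"app/main.py": "aaa", "app/ui.py": "bbb"}
    new = {"app/main.py": "aaa", "app/ui.py": "ccc", "app/i18n.py": "ddd"}
    print(changed_files(old, new))
```

Only the files returned by `changed_files` would need to be copied, which is what lets an update leave settings and already-downloaded files alone.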

### 🕹️ Controls
* **Global Hook**: `F9` (Default). Press to open the channel. Release to inject text.
* **Tray Agent**: Retracts to the system tray. Right-click for **Settings** or **File Transcription**.

### 📡 Input Modes
| Mode | Description | Speed |
| :--- | :--- | :--- |
| **Clipboard Paste** | Standard text injection via OS clipboard. | Instant |
| **Simulate Typing** | Mimics physical keystrokes. Bypasses anti-paste blocks. | Up to **6000** CPM |
### Troubleshooting
* **App crashes on start**: Ensure you have [Microsoft Visual C++ Redistributable 2015-2022](https://learn.microsoft.com/en-us/cpp/windows/latest-supported-vc-redist) installed.
* **"Simulate Typing" is slow**: Some applications (remote desktops, older games) choke on super-fast input. Lower the typing speed in Settings to ~1200 CPM.
* **No Audio**: The agent listens to the **Default Communication Device**. Check your Windows Sound Control Panel.

---

## 🌐 Universal Translation
## 🌐 Supported Languages

The model listens in **99 languages** and translates them to English or transcribes them natively.
The engine supports 99 languages. You can lock the engine to a specific language in Settings to improve accuracy, or leave it on **Auto-Detect** for multilingual usage.

<details>
<summary><b>Click to view supported languages</b></summary>
<br>

| | | | |
| :--- | :--- | :--- | :--- |
| Afrikaans 🇿🇦 | Albanian 🇦🇱 | Amharic 🇪🇹 | Arabic 🇸🇦 |
| Armenian 🇦🇲 | Assamese 🇮🇳 | Azerbaijani 🇦🇿 | Bashkir 🇷🇺 |
| Basque 🇪🇸 | Belarusian 🇧🇾 | Bengali 🇧🇩 | Bosnian 🇧🇦 |
| Breton 🇫🇷 | Bulgarian 🇧🇬 | Burmese 🇲🇲 | Castilian 🇪🇸 |
| Catalan 🇪🇸 | Chinese 🇨🇳 | Croatian 🇭🇷 | Czech 🇨🇿 |
| Danish 🇩🇰 | Dutch 🇳🇱 | English 🇺🇸 | Estonian 🇪🇪 |
| Faroese 🇫🇴 | Finnish 🇫🇮 | Flemish 🇧🇪 | French 🇫🇷 |
| Galician 🇪🇸 | Georgian 🇬🇪 | German 🇩🇪 | Greek 🇬🇷 |
| Gujarati 🇮🇳 | Haitian 🇭🇹 | Hausa 🇳🇬 | Hawaiian 🇺🇸 |
| Hebrew 🇮🇱 | Hindi 🇮🇳 | Hungarian 🇭🇺 | Icelandic 🇮🇸 |
| Indonesian 🇮🇩 | Italian 🇮🇹 | Japanese 🇯🇵 | Javanese 🇮🇩 |
| Kannada 🇮🇳 | Kazakh 🇰🇿 | Khmer 🇰🇭 | Korean 🇰🇷 |
| Lao 🇱🇦 | Latin 🇻🇦 | Latvian 🇱🇻 | Lingala 🇨🇩 |
| Lithuanian 🇱🇹 | Luxembourgish 🇱🇺 | Macedonian 🇲🇰 | Malagasy 🇲🇬 |
| Malay 🇲🇾 | Malayalam 🇮🇳 | Maltese 🇲🇹 | Maori 🇳🇿 |
| Marathi 🇮🇳 | Moldavian 🇲🇩 | Mongolian 🇲🇳 | Myanmar 🇲🇲 |
| Nepali 🇳🇵 | Norwegian 🇳🇴 | Occitan 🇫🇷 | Panjabi 🇮🇳 |
| Pashto 🇦🇫 | Persian 🇮🇷 | Polish 🇵🇱 | Portuguese 🇵🇹 |
| Punjabi 🇮🇳 | Romanian 🇷🇴 | Russian 🇷🇺 | Sanskrit 🇮🇳 |
| Serbian 🇷🇸 | Shona 🇿🇼 | Sindhi 🇵🇰 | Sinhala 🇱🇰 |
| Slovak 🇸🇰 | Slovenian 🇸🇮 | Somali 🇸🇴 | Spanish 🇪🇸 |
| Sundanese 🇮🇩 | Swahili 🇰🇪 | Swedish 🇸🇪 | Tagalog 🇵🇭 |
| Tajik 🇹🇯 | Tamil 🇮🇳 | Tatar 🇷🇺 | Telugu 🇮🇳 |
| Thai 🇹🇭 | Tibetan 🇨🇳 | Turkish 🇹🇷 | Turkmen 🇹🇲 |
| Ukrainian 🇺🇦 | Urdu 🇵🇰 | Uzbek 🇺🇿 | Vietnamese 🇻🇳 |
| Welsh 🏴 | Yiddish 🇮🇱 | Yoruba 🇳🇬 | |

</details>

---

## 🔧 Troubleshooting

<details>
<summary><b>🔥 App crashes on start</b></summary>
<blockquote>
The underlying engine requires standard C++ libraries. Install the <b>Microsoft Visual C++ Redistributable (2015-2022)</b>.
</blockquote>
</details>

<details>
<summary><b>🐌 "Simulate Typing" is slow</b></summary>
<blockquote>
Some apps (games, RDP) can't handle supersonic input. Go to <b>Settings</b> and lower the <b>Typing Speed</b> to ~1200 CPM.
</blockquote>
</details>

<details>
<summary><b>🎤 No Audio / Silence</b></summary>
<blockquote>
The agent listens to the <b>Default Communication Device</b>. Ensure your microphone is set correctly in Windows Sound Settings.
</blockquote>
</details>
Afrikaans, Albanian, Amharic, Arabic, Armenian, Assamese, Azerbaijani, Bashkir, Basque, Belarusian, Bengali, Bosnian, Breton, Bulgarian, Burmese, Castilian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Faroese, Finnish, Flemish, French, Galician, Georgian, German, Greek, Gujarati, Haitian, Hausa, Hawaiian, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Lao, Latin, Latvian, Lingala, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Moldavian, Mongolian, Myanmar, Nepali, Norwegian, Occitan, Panjabi, Pashto, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Sanskrit, Serbian, Shona, Sindhi, Sinhala, Slovak, Slovenian, Somali, Spanish, Sundanese, Swahili, Swedish, Tagalog, Tajik, Tamil, Tatar, Telugu, Thai, Tibetan, Turkish, Turkmen, Ukrainian, Urdu, Uzbek, Vietnamese, Welsh, Yiddish, Yoruba.

---