Whisper Voice

Reclaim Your Voice from the Cloud.

Whisper Voice is a high-performance, strictly local speech-to-text tool for the desktop. It provides fast, high-accuracy dictation anywhere on your system. Once the recognition engine is downloaded, it needs no internet connection, uses no corporate servers, and harvests no data.

We believe that the tools of production, and of communication, should belong to the individual rather than be rented from centralized tech giants.

✊ Core Principles

1. Total Autonomy (Local-First)

Your voice data is yours alone. Unlike commercial alternatives that siphon your words to remote data centers for processing and profiling, Whisper Voice runs entirely on your hardware. No masters, no servers. You retain full sovereignty over your digital footprint.

2. Decentralized Power

By relying on optimized local processing, we remove the dependence on massive, energy-hungry corporate infrastructure. This is technology scaled to the human level: powerful, efficient, and completely under your control.

3. Accessible to All

High-quality speech recognition shouldn't be gated behind subscriptions or paywalls. This tool is free, open, and built to empower users to interact with their machines on their own terms.

✨ Features

100% Offline Processing: Once the recognition engine is downloaded, the cable can be cut. Nothing leaves your machine.
Universal Compatibility: Works in any text field, including editors, chat apps, terminals, and browsers. If you can type there, you can speak there.
Adaptive Input:
- Clipboard Mode: Standard paste injection.
- High-Speed Simulation: Simulates keystrokes at up to 6000 CPM (characters per minute, roughly 100 characters per second) for apps that block pasting.
System Integration: Minimalist overlay and system tray presence. It exists when you need it and vanishes when you don't.
Resource Efficiency: Optimized to run smoothly on consumer hardware without monopolizing your system resources.

🚀 Getting Started

Installation

Download the latest release.
Run WhisperVoice.exe.
On the first run, the bootstrapper will autonomously provision the necessary runtime environment. This ensures your system remains clean and dependencies are self-contained.

Usage

Set Your Trigger: Configure a global hotkey (default: F9) in the settings.
Speak Freely: Hold the hotkey (or toggle it) and speak.
Direct Action: Your words are instantly transcribed and injected into your active window.
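
The difference between holding the hotkey and toggling it can be sketched as a small state machine. This is only an illustration, not Whisper Voice's actual code: `TriggerMode`, `HotkeyRecorder`, and the `on_press`/`on_release` callbacks are hypothetical names, and the sketch assumes the host system delivers key-down and key-up events (including auto-repeat) for the configured hotkey.

```python
from dataclasses import dataclass, field
from enum import Enum


class TriggerMode(Enum):
    HOLD = "hold"      # record only while the key is held down
    TOGGLE = "toggle"  # first press starts recording, next press stops it


@dataclass
class HotkeyRecorder:
    """Hypothetical tracker for whether recording is active."""
    mode: TriggerMode = TriggerMode.HOLD
    recording: bool = False
    _key_down: bool = field(default=False, init=False, repr=False)

    def on_press(self) -> bool:
        # Ignore auto-repeat events sent while the key stays down.
        if self._key_down:
            return self.recording
        self._key_down = True
        if self.mode is TriggerMode.HOLD:
            self.recording = True
        else:
            self.recording = not self.recording
        return self.recording

    def on_release(self) -> bool:
        self._key_down = False
        # Only hold mode stops recording when the key comes back up.
        if self.mode is TriggerMode.HOLD:
            self.recording = False
        return self.recording
```

In hold mode, recording follows the key state directly. In toggle mode, releasing the key changes nothing and the next distinct press stops the recording.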

⚙️ Configuration

The Settings panel puts the means of configuration in your hands:

Recognition Engine: Choose the model size that fits your hardware (Tiny to Large). Larger models are more accurate but need more computing power.
Input Method: Switch between "Clipboard Paste" and "Simulate Typing" depending on target application restrictions.
Typing Speed: Adjust the keystroke injection rate. Raise it to the 6000 CPM maximum (roughly 100 characters per second) for near-instant text delivery.
Run on Startup: Configure the agent to be ready the moment your session begins.
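
To make the typing-speed setting concrete, here is a minimal sketch of how a CPM value maps to a per-keystroke delay. `InputSettings` and its field names are assumptions made for illustration and do not come from the Whisper Voice source. Only the two input methods and the 6000 CPM ceiling are taken from this README.

```python
from dataclasses import dataclass

MAX_CPM = 6000  # upper limit quoted in this README


@dataclass(frozen=True)
class InputSettings:
    """Hypothetical settings record; field names are illustrative only."""
    method: str = "clipboard"   # "clipboard" or "typing"
    typing_cpm: int = MAX_CPM   # characters per minute

    def __post_init__(self) -> None:
        if self.method not in ("clipboard", "typing"):
            raise ValueError(f"unknown input method: {self.method!r}")
        if not 1 <= self.typing_cpm <= MAX_CPM:
            raise ValueError(f"typing_cpm must be between 1 and {MAX_CPM}")

    def keystroke_delay(self) -> float:
        """Seconds to wait between simulated keystrokes."""
        return 60.0 / self.typing_cpm

    def delivery_time(self, text: str) -> float:
        """Approximate seconds needed to deliver `text`."""
        if self.method == "clipboard":
            return 0.0  # a paste lands in one step; its duration is not modeled
        return len(text) * self.keystroke_delay()
```

At 6000 CPM the delay is about 10 ms per keystroke, so a 300-character dictation takes roughly 3 seconds to type out. Halving the rate to 3000 CPM doubles that time.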

🤝 Mutual Aid

This project thrives on community collaboration. If you have improvements, fixes, or ideas, you are encouraged to contribute. We build better systems when we build them together, horizontally and transparently.

Report Issues: If something breaks, let us know.
Contribute Code: The source is open. Fork it, improve it, share it.

Built with local processing libraries and Qt. No gods, no cloud managers.
