Minor updates

pull/27/head
Georgi Gerganov 2 years ago
parent 167324584b
commit b8f713482e
No known key found for this signature in database
GPG Key ID: 449E073F9DC10735

@@ -7,13 +7,12 @@ High-performance inference of [OpenAI's Whisper](https://github.com/openai/whisp
 - Mixed F16 / F32 precision
 - Low memory usage (Flash Attention + Flash Forward)
 - Zero memory allocations at runtime
-- Runs on the CPU (Mac and Linux)
+- Runs on the CPU
 - [C-style API](https://github.com/ggerganov/whisper.cpp/blob/master/whisper.h)
+- Supported platforms: Linux, Mac OS (Intel and Arm), Raspberry Pi, Android
 Incoming features:
 - [Realtime audio input transcription](https://github.com/ggerganov/whisper.cpp/issues/10#issuecomment-1264665959)
-- [Raspberry Pi support](https://github.com/ggerganov/whisper.cpp/issues/7)
-- [Android support](https://github.com/ggerganov/whisper.cpp/issues/8)
 ## Usage
@@ -220,10 +219,16 @@ $ ./stream -m models/ggml-small.en.bin -t 8
 https://user-images.githubusercontent.com/1991296/193465125-c163d304-64f6-4f5d-83e5-72239c9a203e.mp4
+## Implementation details
+- The core tensor operations are implemented in C (`ggml.h` / `ggml.c`)
+- The high-level C-style API is implemented in C++ (`whisper.h` / `whisper.cpp`)
+- Simple usage is demonstrated in `main.cpp`
+- Sample real-time audio transcription from the microphone is demonstrated in `stream.cpp`
 ## Limitations
-- Very basic greedy sampling scheme - always pick up the top token
+- Very basic greedy sampling scheme - always pick up the top token. You can implement your own strategy
+- Only 16-bit WAV at 16 kHz is supported
 - Inference only
 - No GPU support
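
The greedy scheme named under Limitations can be sketched in a few lines. This is a minimal illustration, not the code from `whisper.cpp`: `sample_greedy` is a hypothetical helper, and the `logits` vector is an assumed input holding one score per vocabulary token.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical helper (not part of whisper.h): greedy sampling returns the
// index of the largest logit, i.e. the single most likely next token.
// On ties, the first maximal index is returned.
int sample_greedy(const std::vector<float> & logits) {
    assert(!logits.empty());
    const auto it = std::max_element(logits.begin(), logits.end());
    return static_cast<int>(std::distance(logits.begin(), it));
}
```

A custom strategy would replace the `std::max_element` call, for example by drawing a token at random from the softmax of the logits.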

@@ -265,6 +265,11 @@ int main(int argc, char ** argv) {
 wparams.print_progress = false;
 wparams.print_special_tokens = params.print_special_tokens;
+wparams.print_realtime = false;
+wparams.print_timestamps = !params.no_timestamps;
+wparams.translate = params.translate;
+wparams.language = params.language.c_str();
+wparams.n_threads = params.n_threads;
 if (whisper_full(ctx, wparams, pcmf32.data(), pcmf32.size()) != 0) {
     fprintf(stderr, "%s: failed to process audio\n", argv[0]);
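
The `pcmf32` buffer passed to `whisper_full` holds float samples, while the README limits input to 16-bit WAV. A minimal sketch of that conversion follows. The helper name `pcm16_to_f32` is hypothetical, and scaling by 32768 is an assumption about the expected range, not something this diff shows.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: map signed 16-bit PCM samples to floats in [-1, 1).
// The 1/32768 scale is an assumed convention, not taken from this commit.
std::vector<float> pcm16_to_f32(const std::vector<int16_t> & pcm16) {
    std::vector<float> pcmf32(pcm16.size());
    for (std::size_t i = 0; i < pcm16.size(); ++i) {
        pcmf32[i] = static_cast<float>(pcm16[i]) / 32768.0f;
    }
    return pcmf32;
}
```

The result could then be handed to `whisper_full` through `data()` and `size()`, as in the hunk above.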
