whisper.cpp

Commit Graph

Author	SHA1	Message	Date
Georgi Gerganov	be16dfa038	whisper.wasm : do not block page while processing (close #86)	2 years ago
Georgi Gerganov	b8ce25dec1	refactoring : more readable code	2 years ago
Georgi Gerganov	d7024cf9dc	main, stream : remove --verbose flag (#178)	2 years ago
Georgi Gerganov	385236d1d3	stream : "-kc" now enables context keeping from previous segment (#90). By default, the context keeping is disabled.	2 years ago
M. Eren Akbiyik	63ae03b8e0	Prompt previous tokens for streaming (#163) * feat: prompt previous tokens for streaming. I used a vector pointer instead of vector itself because it gave weird errors, and why not * convert vector to use with C api * feat: remove old refs, check for prompt size * feat: use better way of getting the pointer	2 years ago
Georgi Gerganov	f2df9bd768	stream : add "max_tokens" cli arg. Controls the max tokens per segment for the stream example.	2 years ago
Georgi Gerganov	fb8d77f760	stream : add "audio_ctx" parameter. Used to overwrite the audio context size of the Encoder. For example, setting "audio_ctx = 512" will make it run about 3 times faster, processing about 10s of audio, instead of 30s. The transcription quality drops, but this can be used for real-time streaming purposes where performance is important.	2 years ago
Georgi Gerganov	62b5ff875c	stream : add "max_tokens" parameter. Used to limit the number of tokens in a segment. Useful to battle with word repetition when using partial encoder context.	2 years ago
Georgi Gerganov	d351771a4b	stream : add "single_segment" option. Force the entire audio chunk to be transcribed into a single segment.	2 years ago
Georgi Gerganov	c058aaf22e	stream : partial encoder experiments	2 years ago
Georgi Gerganov	83c742f1a7	whisper : add option to speed up the audio tempo by x2. Using a Phase Vocoder for speeding up the audio tempo by scaling down the frequencies in the frequency domain. This reduces the computation in the Encoder by a factor of 2. The transcription accuracy is degraded, but for slow to normal speech - it seems to be still very good. I think this can find application for real-time transcription - i.e. the "stream" example.	2 years ago
Georgi Gerganov	5a9e4260a6	stream : add "--capture" option to select capture device (ref #10)	2 years ago
Georgi Gerganov	8347a7bb6a	stream : few updates to make it compatible for Vim usage (#99)	2 years ago
Georgi Gerganov	c6710efde2	refactoring : move main + stream in examples + other stuff	2 years ago

14 Commits (093c840deef894cf38729785825cd4cc05e7cec0)
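
Note on the "audio_ctx" figures in commit fb8d77f760: the arithmetic behind "about 10s of audio" and "about 3 times faster" can be sketched as below. This is a rough illustration, not code from the repository. It assumes the full Whisper encoder context is 1500 positions covering a 30 s window; the helper names audio_seconds and context_ratio are hypothetical. The context ratio is only a proxy for the speed-up reported in the commit message, since actual timings depend on the model and hardware.

```python
# Assumption: the full Whisper encoder context is 1500 positions,
# and that context covers a 30-second audio window.
FULL_AUDIO_CTX = 1500
FULL_WINDOW_S = 30.0


def _check(audio_ctx: int) -> None:
    """Reject context sizes outside the assumed valid range."""
    if not 1 <= audio_ctx <= FULL_AUDIO_CTX:
        raise ValueError(f"audio_ctx must be in 1..{FULL_AUDIO_CTX}")


def audio_seconds(audio_ctx: int) -> float:
    """Seconds of audio covered by a reduced encoder context (hypothetical helper)."""
    _check(audio_ctx)
    return FULL_WINDOW_S * audio_ctx / FULL_AUDIO_CTX


def context_ratio(audio_ctx: int) -> float:
    """How many times smaller the reduced context is than the full one.

    This is only a rough proxy for encoder speed-up, not a benchmark.
    """
    _check(audio_ctx)
    return FULL_AUDIO_CTX / audio_ctx


if __name__ == "__main__":
    ctx = 512
    print(f"audio_ctx={ctx}: {audio_seconds(ctx):.2f} s of audio, "
          f"context ratio {context_ratio(ctx):.2f}x")
```

With audio_ctx = 512 this gives 10.24 s of audio and a context ratio of about 2.93, which is consistent with the rounded figures in the commit message.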