venkr
b597c5a779
qual-bench.sh : add quality comparison tool, and update main.cpp to allow using a font file ( #569 )
2 years ago
sandrohanea
59fdcd19c8
whisper : add whisper_state + default state on the whisper_context ( #523 )
* Added whisper state + default state on the whisper_context
* Fixed some examples and bindings
* Fixed whisper_n_len (which was used in some binding) and added whisper_n_len_from_state
* Fixed comments
* whisper : reuse kv_cache_free() and fix compiler warnings
* whisper : clean-up the API comments
---------
Co-authored-by: Sandro Hanea <sandrohanea@microsoft.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2 years ago
HY. Kelvin Lee
72af0f5697
main : add csv header ( #552 )
2 years ago
Georgi Gerganov
f254e78737
yt-wsp.sh : print help on empty args
2 years ago
conradg
69e6e4644a
main : fix stdin input ( #503 )
if we don't add this as an explicit check, then we get an "error: unknown argument: -" later on
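The explicit check described above can be sketched roughly as follows. This is only an illustration, not the code from main.cpp: `parsed_args` and `parse_args` are hypothetical names, and the convention that a lone "-" means "read from stdin" is taken from the commit message.

```cpp
#include <string>
#include <vector>

// Hypothetical result type: inputs to process and arguments we did not recognize.
struct parsed_args {
    std::vector<std::string> inputs;
    std::vector<std::string> unknown;
};

// Hypothetical parser: a lone "-" is accepted as an input (meaning stdin)
// before the generic "starts with '-'" test can flag it as unknown.
parsed_args parse_args(const std::vector<std::string> & args) {
    parsed_args result;
    for (const std::string & arg : args) {
        if (arg == "-") {
            result.inputs.push_back(arg); // explicit check: stdin marker
            continue;
        }
        if (!arg.empty() && arg[0] == '-') {
            result.unknown.push_back(arg); // would trigger "unknown argument"
            continue;
        }
        result.inputs.push_back(arg);
    }
    return result;
}
```

Without the first branch, "-" would fall through to the option test and be reported as an unknown argument.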
2 years ago
Georgi Gerganov
09d7d2b68e
examples : refactor in order to reuse code and reduce duplication ( #482 )
* examples : refactor common code into a library
* examples : refactor common SDL code into a library
* make : update Makefile to use common libs
* common : fix MSVC M_PI ..
* addon.node : link common lib
2 years ago
Matija Pevec
d012b5c7e4
whisper : add "split_on_word" flag when using "max_len" option ( #455 )
* Update whisper.cpp
* fix: trim function
* feat: added flag to split on word
* fix: arguments for main
2 years ago
Georgi Gerganov
f3ee4a9673
whisper : reduce memory usage during inference ( #431 )
* ggml : add "scratch" buffer support
* ggml : support for scratch ring-buffer
* ggml : bug fix in ggml_repeat()
* ggml : error on scratch buffer overflow
* whisper : use scratch buffers during inference (base model only)
* whisper : update memory usage for all models
* whisper : fix encoder memory usage
* whisper : use whisper_context functions instead of macros
* whisper : fix FF + remove it from README
* ggml : reuse ggml_new_i32
* ggml : refactor the scratch buffer storage
* whisper : reorder scratch buffers in the decoder
* main : add option to disable temp fallback
* Update README.md
2 years ago
Alex Bacart
3b1960520a
main : CSV format export trimmed spaces fix ( #444 )
* Update main.cpp
Removed string trimming
* Update main.cpp
* Update main.cpp
* Revert "Update main.cpp"
This reverts commit d8924fdcfe.
* Revert "Update main.cpp"
This reverts commit 252e508d85.
2 years ago
Georgi Gerganov
f583e2d2f5
main : we had accidentally disabled the temperature fallback .. ( #291 )
2 years ago
Chia-Hsiang Cheng
472a473fd1
main : add an option to accept optional output filenames ( #424 )
* Add an option to accept optional output filenames
* Format the file
Co-authored-by: Chia-Hsiang Cheng <gary.chiahsiang.cheng@gmail.com>
2 years ago
Georgi Gerganov
8de452c18b
Improve decoding ( #291 )
* whisper : prepare infra for new decoding strategies
* whisper : apply logit filters and compute logprobs
* whisper : add whisper_get_logits()
* whisper : separate self and cross attention memory
Initial step needed for supporting parallel decoders
* whisper : move probs_id buffer to whisper_context
* whisper : refactor kv cache into separate struct
* whisper : move self-attention kv cache to whisper_decoder
* whisper : wip decoding parameters + strategies
* whisper : wip decoding parameters + strategies (part 2)
* whisper : wip decoding parameters + strategies (part 3)
* whisper : wip decoding parameters + strategies (part 4)
* whisper : fix prompt_past update to not include prompt_init
* whisper : temperature + best_of support
* whisper : support for compression_ratio_threshold
We actually use entropy, but it is similar
* command : fix example to use logits instead of obsolete probs
* whisper : handle empty sequence ranking
* whisper : add WHISPER_DEBUG + diagnostic prints + new main args
* whisper : minor fixes
* whisper : add beam-search support
* whisper : bug fix when there is no previous context
* whisper : add comments
* stream : disable temperature fallback
For real-time processing, we always want a single decoder running at T=0
* whisper.swiftui : update example - fix paths + add empty folders
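The list above notes that entropy is used in place of a compression ratio. One way such a repetition measure could look is sketched below. This is an assumption for illustration only: `token_entropy` is a hypothetical helper, and the commit message does not state how whisper.cpp computes or thresholds the value.

```cpp
#include <cmath>
#include <map>
#include <vector>

// Hypothetical helper: Shannon entropy (in nats) of the empirical token
// distribution of a decoded sequence. Highly repetitive output uses few
// distinct tokens, so its entropy is low, much like highly compressible text.
double token_entropy(const std::vector<int> & tokens) {
    if (tokens.empty()) {
        return 0.0;
    }
    std::map<int, int> counts;
    for (int t : tokens) {
        counts[t]++;
    }
    double entropy = 0.0;
    for (const auto & kv : counts) {
        const double p = static_cast<double>(kv.second) / tokens.size();
        entropy -= p * std::log(p);
    }
    return entropy;
}
```

A decoder could compare this value against a threshold and retry at a higher temperature when the output looks degenerate; the threshold itself is not given in the source.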
2 years ago
Syahmi Azhar
1512545149
whisper : add loader class to allow loading from buffer and others ( #353 )
* whisper : add loader to allow loading from other than file
* whisper : rename whisper_init to whisper_init_from_file
* whisper : add whisper_init_from_buffer
* android : Delete local.properties
* android : load models directly from assets
* whisper : adding <stddef.h> needed for size_t + code style
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2 years ago
Georgi Gerganov
b3c865083e
ci : add emscripten build
2 years ago
Georgi Gerganov
a0d4f8e65c
main : make whisper_print_segment_callback() more readable ( close #371 )
2 years ago
Georgi Gerganov
196d738974
minor : close #370 + Makefile build info print change
2 years ago
Andy Maloney
84c6b42e65
cmake : update to 3.19 ( #351 )
- update from 3.0 (from 2014) to 3.19 (from 2020)
- move some global settings onto the targets (through a cmake include)
2 years ago
Niels Mayer
a593b932e4
main : add -ocsv, aka --output-csv to output a CSV file
Adds -ocsv, aka --output-csv feature to examples/main, which outputs a CSV file containing lines formatted as follows <startTime-in-integer-milliseconds>, <endTime-in-integer-milliseconds>, "<transcript-line-including-commas>".
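The line format described above can be sketched as follows. `csv_line` is a hypothetical helper, not the function used in examples/main, and it makes no attempt to escape double quotes inside the transcript because the commit message does not say how those are handled.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: builds one CSV line of the form
//   <start-ms>, <end-ms>, "<transcript text>"
// Quoting the text keeps commas inside the transcript from splitting the field.
std::string csv_line(int64_t t0_ms, int64_t t1_ms, const std::string & text) {
    return std::to_string(t0_ms) + ", " + std::to_string(t1_ms) + ", \"" + text + "\"";
}
```

For example, `csv_line(0, 1500, "Hello, world")` yields `0, 1500, "Hello, world"`.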
2 years ago
Andy Maloney
dc90efd504
examples : small code cleanups ( #322 )
- remove unnecessary initialization of string to ""
- use empty() instead of checking size()
- use emplace_back instead of push_back
- use nullptr instead of NULL
- remove unnecessary call to .data() on string
- use character overload of find_first_of() instead of passing a string
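The cleanups listed above might look like this in practice. The functions are hypothetical and written only to show the idioms side by side; they do not come from the examples directory.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using kv_list = std::vector<std::pair<std::string, std::string>>;

// Hypothetical function: splits "key=value" items into pairs.
kv_list parse_key_values(const std::vector<std::string> & items) {
    kv_list out;
    for (const std::string & item : items) {
        if (item.empty()) {                 // empty() instead of size() == 0
            continue;
        }
        std::string value;                  // default-constructed, no = ""
        const std::size_t pos = item.find_first_of('='); // char overload, not "="
        if (pos != std::string::npos) {
            value = item.substr(pos + 1);
        }
        out.emplace_back(item.substr(0, pos), value); // emplace_back builds the pair in place
    }
    return out;
}

// Hypothetical lookup: returns a pointer to the value, or nullptr if absent.
const std::string * find_value(const kv_list & kv, const std::string & key) {
    for (const auto & entry : kv) {
        if (entry.first == key) {
            return &entry.second;
        }
    }
    return nullptr;                         // nullptr instead of NULL
}
```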
2 years ago
Georgi Gerganov
99da1e5cc8
cmake : enable and fix -Wall -Wextra -Wpedantic C++ warnings
2 years ago
Matheus de Sousa
8e3f129b4d
minor : resolves some of the warnings when compiling with clang/clang++ ( #294 )
* Resolves some of the warnings when compiling with clang/clang++
Mostly nit stuff that clang catches when compiling with -Wall -Wextra
-pedantic.
- Fix comparison between signed/unsigned integers.
- Passes a constant reference (const&) instead of copying each time.
* minor : normalize coding style
* minor : fix warning
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2 years ago
Georgi Gerganov
fba10a4c68
whisper : language auto-detect ( #59 )
2 years ago
Georgi Gerganov
32fbc8cd04
main : add option to print the progress ( #276 )
2 years ago
Georgi Gerganov
b8065d90f5
main : add "--prompt" command line argument ( #90 )
This allows providing an initial prompt to be used at the start of the
processing.
2 years ago
Lexevolution
6ed786957e
Add newline per segment for text output ( #254 )
2 years ago
Georgi Gerganov
4698dcdb52
whisper : add mechanism for aborting the whisper_full() computation
2 years ago
Georgi Gerganov
0f619b52ce
main : add stereo-channel-based diarization ( #64 )
Not tested - I don't have stereo dialog audio
2 years ago
Georgi Gerganov
bc88eb13c6
examples : add "command" tool ( #171 )
2 years ago
Georgi Gerganov
b8ce25dec1
refactoring : more readable code
2 years ago
Georgi Gerganov
454b91de16
main : fix dangling pointer when using stdin for input ( #65 )
2 years ago
Georgi Gerganov
d7024cf9dc
main, stream : remove --verbose flag ( #178 )
2 years ago
Georgi Gerganov
e5dcdabbb8
unicode : fix character replacement (thanks to @tamo)
2 years ago
Georgi Gerganov
83c742f1a7
whisper : add option to speed up the audio tempo by x2
Using a Phase Vocoder for speeding up the audio tempo by scaling down
the frequencies in the frequency domain.
This reduces the computation in the Encoder by a factor of 2.
The transcription accuracy is degraded, but for slow to normal speech -
it seems to be still very good.
I think this can find application for real-time transcription - i.e. the
"stream" example.
2 years ago
Alan
7519eabf65
Adds support for stdin wav input
2 years ago
Georgi Gerganov
c30bffc8a5
ref #22 : add "duration" option
Can be used to partially process a recording
2 years ago
Georgi Gerganov
ef47d77492
main : fix generated bash script
2 years ago
Georgi Gerganov
d5afebd37c
whisper : token-level timestamp refactoring ( #49 , #120 )
This turned out pretty good overall. The algorithm has been moved from
main.cpp to whisper.cpp and can be reused for all subtitle types. This
means that now you can specify the maximum length of the generated
lines. Simply provide the "-ml" argument specifying the max length in
number of characters.
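To give a feel for what a maximum line length means for the output, here is a simplified sketch that wraps plain text on word boundaries. It is not the whisper.cpp algorithm, which works on tokens and their timestamps; `wrap_words` is a hypothetical helper, and keeping an over-long word whole on its own line is an assumption.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: greedily packs whitespace-separated words into lines of
// at most max_len characters. A single word longer than max_len is not split.
std::vector<std::string> wrap_words(const std::string & text, std::size_t max_len) {
    std::vector<std::string> lines;
    std::istringstream iss(text);
    std::string word;
    std::string cur;
    while (iss >> word) {
        // Start a new line if adding " word" would exceed the limit.
        if (!cur.empty() && cur.size() + 1 + word.size() > max_len) {
            lines.push_back(cur);
            cur.clear();
        }
        if (!cur.empty()) {
            cur += ' ';
        }
        cur += word;
    }
    if (!cur.empty()) {
        lines.push_back(cur);
    }
    return lines;
}
```

With a limit of 10 characters, "the quick brown fox" becomes two lines: "the quick" and "brown fox".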
2 years ago
Georgi Gerganov
6fb98370ba
main : add some comments for the word-level timestamp algorithm
2 years ago
Georgi Gerganov
0729da9a3b
main : fix some edge cases for word-level timestamps
2 years ago
Georgi Gerganov
dc12994603
Update README.md
2 years ago
Georgi Gerganov
57fb46f307
main : add option for word-level timestamps (very experimental)
2 years ago
Georgi Gerganov
2827cbbbe8
main : merge parallel example in main
2 years ago
Georgi Gerganov
0b2dc3c82c
parallel : working
2 years ago
Georgi Gerganov
85d6e1e1e7
main : fix sampling time + add max_context parameter
2 years ago
Georgi Gerganov
ebb01b9e33
Print system info at start of program
2 years ago
Georgi Gerganov
2400660f3f
Print system info in main
2 years ago
Georgi Gerganov
47e78b7288
Update README.md
2 years ago
Georgi Gerganov
c6710efde2
refactoring : move main + stream in examples + other stuff
2 years ago