Georgi Gerganov
c30bffc8a5
ref #22 : add "duration" option
...
Can be used to partially process a recording
2 years ago
Georgi Gerganov
d5afebd37c
whisper : token-level timestamp refactoring ( #49 , #120 )
...
This turned out pretty good overall. The algorithm has been moved from
main.cpp to whisper.cpp and can be reused for all subtitle types. This
means that you can now specify the maximum length of the generated
lines. Simply provide the "-ml" argument specifying the max length in
number of characters.
2 years ago
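To illustrate the maximum line length idea from d5afebd37c, here is a minimal sketch of how tokens carrying their own timestamps could be grouped into lines of bounded length. It is not the code from whisper.cpp: `TimedToken`, `Line` and `split_by_max_len` are hypothetical names, and treating `max_len == 0` as "no limit" is an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical types for illustration only (not the whisper.cpp structs).
struct TimedToken {
    std::string text; // token text, e.g. " world"
    int64_t t0;       // token start time
    int64_t t1;       // token end time
};

struct Line {
    std::string text; // concatenated token text
    int64_t t0;       // start time of the first token in the line
    int64_t t1;       // end time of the last token in the line
};

// Group tokens into lines of at most max_len characters.
// Assumption: max_len == 0 means "no limit". A single token that is longer
// than max_len is kept whole on its own line.
std::vector<Line> split_by_max_len(const std::vector<TimedToken> & tokens, std::size_t max_len) {
    std::vector<Line> lines;
    for (const auto & tok : tokens) {
        const bool start_new =
            lines.empty() ||
            (max_len > 0 && !lines.back().text.empty() &&
             lines.back().text.size() + tok.text.size() > max_len);
        if (start_new) {
            lines.push_back({tok.text, tok.t0, tok.t1});
        } else {
            lines.back().text += tok.text;
            lines.back().t1 = tok.t1; // the line now ends where this token ends
        }
    }
    return lines;
}
```

Because every line keeps the start time of its first token and the end time of its last, the same grouping can feed any subtitle writer that needs a time range per line.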
Georgi Gerganov
02dfd5b8c3
whisper : fix extra memory usage after recent processor changes
...
Had increased the memory buffer to the size of the model and forgot to
bring it down.
2 years ago
Georgi Gerganov
57fb46f307
main : add option for word-level timestamps (very experimental)
2 years ago
Georgi Gerganov
eba62e0fa1
close #113 : fix struct whisper_token_data
2 years ago
Georgi Gerganov
014a119052
minor : fix multiple definitions of to_timestamp()
2 years ago
Georgi Gerganov
dec40be58f
parallel : print time of audio boundaries + fix timings
2 years ago
Georgi Gerganov
0b2dc3c82c
parallel : working
2 years ago
Georgi Gerganov
85d6e1e1e7
main : fix sampling time + add max_context parameter
2 years ago
Georgi Gerganov
72e9cdd6bf
parallel : adding tool for parallel transformer inference
2 years ago
Borislav Stanimirov
c565c569e7
Define WHISPER_BUILD so as to export symbols on Windows
2 years ago
Georgi Gerganov
34bb3ab0cf
ggml : add system info functions
2 years ago
Georgi Gerganov
5f7e9fa2dc
ref #68 , #79 : fix segment time output
2 years ago
Georgi Gerganov
7affd309d3
whisper : add new-segment callback
...
Can be used to process new segments as they are being generated.
Sample usage in main, for printing the resulting segments during the
inference.
2 years ago
Georgi Gerganov
31ff0c6a1f
wip : experimental color coding of tokens based on probabilities
2 years ago
Georgi Gerganov
8d15a1c635
ci : fix and re-enable tests (2nd try)
2 years ago
Georgi Gerganov
692aa0784f
Revert "ci : fix and re-enable tests"
...
This reverts commit 80aefc9514.
2 years ago
Georgi Gerganov
80aefc9514
ci : fix and re-enable tests
2 years ago
Georgi Gerganov
7eeef0358a
ref #52 : improve greedy sampling strategy
...
Force timestamp token to be sampled if the probability sum over all
timestamp tokens is above the probability of any other token
2 years ago
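The rule described in 7eeef0358a can be sketched as follows. This is an illustrative reading of the commit message, not the whisper.cpp implementation: the function name is hypothetical, and the layout in which every token id at or above `timestamp_begin` is a timestamp token is an assumption.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only.
// probs[i] is the probability of token id i.
// Assumption: ids >= timestamp_begin are timestamp tokens, all others are text tokens.
// Returns the id of the selected token.
std::size_t sample_greedy_with_timestamps(const std::vector<float> & probs, std::size_t timestamp_begin) {
    // Most probable non-timestamp token.
    std::size_t best_text = 0;
    float max_text = -1.0f;
    for (std::size_t i = 0; i < probs.size() && i < timestamp_begin; ++i) {
        if (probs[i] > max_text) {
            max_text = probs[i];
            best_text = i;
        }
    }

    // Total probability mass of the timestamp tokens, and the most probable one.
    std::size_t best_ts = timestamp_begin;
    float max_ts = -1.0f;
    float sum_ts = 0.0f;
    for (std::size_t i = timestamp_begin; i < probs.size(); ++i) {
        sum_ts += probs[i];
        if (probs[i] > max_ts) {
            max_ts = probs[i];
            best_ts = i;
        }
    }

    // Force a timestamp token when the summed timestamp probability
    // exceeds the probability of every other single token.
    const bool has_timestamps = timestamp_begin < probs.size();
    if (has_timestamps && sum_ts > max_text) {
        return best_ts;
    }
    return best_text;
}
```

The point of comparing the sum rather than the maximum is that timestamp probability is often spread over several neighbouring timestamp tokens, so no single one of them would win a plain argmax.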
Georgi Gerganov
e30cf83158
ref #57 , #62 , #63 : remove unions in C-api + remove designated initializers
...
We are not ready for designated initializers - many compilers do not
support this C++ feature yet, so removing its non-trivial usages.
2 years ago
Georgi Gerganov
d6b84b2a23
ref #62 : fix build for some compilers
...
For some reason, new versions of GCC panic when the struct type is not
specified explicitly
2 years ago
Georgi Gerganov
b4a3875b2c
Revert recent sampling change
...
It does not actually help and seems to produce worse results on some of
the samples
2 years ago
Georgi Gerganov
cf67bfffa0
Fix EOT token handling
...
If it is the end of the audio, pick all sampled tokens.
Otherwise, print error message.
2 years ago
Georgi Gerganov
d14823582d
Try to improve the sampling strategy a bit
...
It still fails sometimes when it does not sample a timestamp token for
the entire segment. We now print a message in such cases
2 years ago
Georgi Gerganov
20d8e7a309
Fix memory sizes
2 years ago
Georgi Gerganov
72d967bce4
Use Accelerate framework on Apple silicon
...
Huge performance improvement in the Encode (almost x2 on MacBook M1 Pro)
Also various extra optimizations:
- Multi-threaded NORM operator
- Faster GELU via F16 cast
2 years ago
Georgi Gerganov
0ad085f5e8
ref #48 : clear results at the start of whisper_full
...
This way, even if the input audio is empty, the previous results will be
removed.
2 years ago
0/0
b799226973
check if spectrogram length is <100 before doing anything else
...
fixes #39
2 years ago
Borislav Stanimirov
0b45d25151
Building with MSVC
2 years ago
Georgi Gerganov
63b6786767
Minor
2 years ago
lnyan
4bbb8a587b
Add MinGW support
2 years ago
Georgi Gerganov
2ca8cc77b2
ref #17 : print whisper logs to stderr
...
Only the transcribed/translated text is printed to stdout.
This way, one can redirect the result to a file.
2 years ago
Georgi Gerganov
8c7c018893
ref #17 : add options to output result to file
...
Support for:
- plain text
- VTT
- SRT
2 years ago
Georgi Gerganov
b43b36e006
Update tests
2 years ago
Georgi Gerganov
2f069335ab
Adding sanitizer tests
2 years ago
Georgi Gerganov
332c9d77fe
whisper : fix bug in token sampling logic
...
Could overflow buffer
2 years ago
Georgi Gerganov
481cd685d5
ref #10 : option to keep context in "stream" example
...
Seems the results become worse when we keep the context, so by default
this is not enabled
2 years ago
Georgi Gerganov
7787b878e1
ref #16 , #22 : add "offset" argument
...
Allows processing of the input audio to start at some offset from the
beginning. Useful for splitting a long job into multiple tasks.
2 years ago
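As a rough sketch of the "splitting a long job into multiple tasks" use case from 7787b878e1, the helper below computes one (offset, duration) pair per task. The names `Chunk` and `split_job` are hypothetical, and the use of milliseconds is an assumption; how the values are passed to the tool (the "offset" argument here, the "duration" option from c30bffc8a5) is not shown.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical type for illustration: one task's slice of the recording.
struct Chunk {
    int64_t offset_ms;   // where this task starts (assumed to be in milliseconds)
    int64_t duration_ms; // how much audio this task processes
};

// Split a recording of total_ms into up to n_tasks contiguous chunks.
// The remainder is spread over the first chunks so the durations sum to total_ms.
std::vector<Chunk> split_job(int64_t total_ms, int n_tasks) {
    std::vector<Chunk> chunks;
    if (total_ms <= 0 || n_tasks <= 0) {
        return chunks;
    }
    const int64_t base = total_ms / n_tasks;
    const int64_t rem  = total_ms % n_tasks;
    int64_t offset = 0;
    for (int i = 0; i < n_tasks; ++i) {
        const int64_t dur = base + (i < rem ? 1 : 0);
        if (dur == 0) {
            continue; // more tasks than milliseconds: skip empty chunks
        }
        chunks.push_back({offset, dur});
        offset += dur;
    }
    return chunks;
}
```

Cutting at fixed boundaries like this can split a word between two tasks, so in practice some overlap or boundary handling may be needed; the sketch ignores that.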
Georgi Gerganov
167324584b
wip : rpi4 support
2 years ago
Georgi Gerganov
ce1fe95902
wip : improve makefile
2 years ago
Georgi Gerganov
6814cc9b02
Improve result printing
2 years ago
Georgi Gerganov
eba33adadd
Extend C-style API with full inference methods
2 years ago
Georgi Gerganov
6b77124e01
Initial C-style interface for whisper.cpp
2 years ago