Georgi Gerganov
5f7e9fa2dc
ref #68, #79 : fix segment time output
2 years ago
Georgi Gerganov
7affd309d3
whisper : add new-segment callback
...
Can be used to process new segments as they are being generated.
Sample usage in main, for printing the resulting segments during the
inference.
2 years ago
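The new-segment callback described above can be pictured with a small, self-contained sketch. Every name here (`Segment`, `NewSegmentCallback`, `run_inference`, `collect_text`) is a hypothetical stand-in chosen for illustration, not the whisper.cpp API; the loop only simulates an inference that yields one segment at a time.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a transcribed segment (not a whisper.cpp type).
struct Segment {
    long t0_ms;
    long t1_ms;
    std::string text;
};

// Hypothetical callback type: receives all segments so far, the number of
// segments added since the previous call, and an opaque user pointer.
using NewSegmentCallback = void (*)(const std::vector<Segment> & all, int n_new, void * user_data);

// Simulated inference loop: appends one segment at a time and notifies the
// caller right away instead of waiting for the whole run to finish.
void run_inference(const std::vector<Segment> & produced, NewSegmentCallback cb, void * user_data) {
    std::vector<Segment> results;
    for (const Segment & s : produced) {
        results.push_back(s);
        if (cb) {
            cb(results, 1, user_data);
        }
    }
}

// Example callback: appends the text of each new segment to a std::string
// passed through user_data.
void collect_text(const std::vector<Segment> & all, int n_new, void * user_data) {
    assert(n_new >= 0 && static_cast<std::size_t>(n_new) <= all.size());
    auto * out = static_cast<std::string *>(user_data);
    for (std::size_t i = all.size() - static_cast<std::size_t>(n_new); i < all.size(); ++i) {
        *out += all[i].text;
        *out += '\n';
    }
}
```

A caller that wants to print during inference would write to stdout inside the callback instead of accumulating into a string; the string is used here only so the behaviour is easy to check.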
Georgi Gerganov
31ff0c6a1f
wip : experimental color coding of tokens based on probabilities
2 years ago
Georgi Gerganov
8d15a1c635
ci : fix and re-enable tests (2nd try)
2 years ago
Georgi Gerganov
692aa0784f
Revert "ci : fix and re-enable tests"
...
This reverts commit 80aefc9514.
2 years ago
Georgi Gerganov
80aefc9514
ci : fix and re-enable tests
2 years ago
Georgi Gerganov
7eeef0358a
ref #52 : improve greedy sampling strategy
...
Force timestamp token to be sampled if the probability sum over all
timestamp tokens is above the probability of any other token
2 years ago
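The rule in the commit above can be sketched as follows. This is a minimal illustration, not the code from the commit: the function name and the assumption that all timestamp tokens occupy the ids from `first_timestamp_id` to the end of the vocabulary are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper. probs holds the per-token probabilities for one
// decoding step; ids >= first_timestamp_id are assumed to be timestamp tokens.
// Returns the id of the token to emit.
std::size_t sample_greedy_with_timestamps(const std::vector<float> & probs,
                                          std::size_t first_timestamp_id) {
    assert(!probs.empty());
    assert(first_timestamp_id <= probs.size());

    // Most probable non-timestamp token.
    std::size_t best_text = 0;
    float max_text = -1.0f;
    for (std::size_t i = 0; i < first_timestamp_id; ++i) {
        if (probs[i] > max_text) {
            max_text = probs[i];
            best_text = i;
        }
    }

    // Most probable timestamp token and the total timestamp probability mass.
    std::size_t best_ts = first_timestamp_id;
    float max_ts = -1.0f;
    float sum_ts = 0.0f;
    for (std::size_t i = first_timestamp_id; i < probs.size(); ++i) {
        sum_ts += probs[i];
        if (probs[i] > max_ts) {
            max_ts = probs[i];
            best_ts = i;
        }
    }

    // Force a timestamp when the summed timestamp probability beats every
    // individual non-timestamp token; otherwise fall back to the greedy pick.
    const bool has_ts = first_timestamp_id < probs.size();
    if (has_ts && sum_ts > max_text) {
        return best_ts;
    }
    return best_text;
}
```

With probabilities {0.4, 0.1, 0.3, 0.2} and timestamps starting at id 2, plain argmax would pick id 0, while this rule picks id 2 because the timestamp mass (0.5) exceeds 0.4.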
Georgi Gerganov
e30cf83158
ref #57, #62, #63 : remove unions in C-api + remove designated initializers
...
We are not ready for designated initializers - many compilers do not
support this C++ feature yet, so removing its non-trivial usages.
2 years ago
Georgi Gerganov
d6b84b2a23
ref #62 : fix build for some compilers
...
For some reason, new versions of GCC panic when the struct type is not
specified explicitly
2 years ago
Georgi Gerganov
b4a3875b2c
Revert recent sampling change
...
It does not actually help and seems to produce worse results on some of
the samples
2 years ago
Georgi Gerganov
cf67bfffa0
Fix EOT token handling
...
If it is the end of the audio, pick all sampled tokens.
Otherwise, print error message.
2 years ago
Georgi Gerganov
d14823582d
Try to improve the sampling strategy a bit
...
It still fails sometimes when it does not sample a timestamp token for
the entire segment. We now print a message in such cases
2 years ago
Georgi Gerganov
20d8e7a309
Fix memory sizes
2 years ago
Georgi Gerganov
72d967bce4
Use Accelerate framework on Apple silicon
...
Huge performance improvement in the Encode (almost x2 on MacBook M1 Pro)
Also various extra optimizations:
- Multi-threaded NORM operator
- Faster GELU via F16 cast
2 years ago
Georgi Gerganov
0ad085f5e8
ref #48 : clear results at the start of whisper_full
...
This way, even if the input audio is empty, the previous results will be
removed.
2 years ago
0/0
b799226973
check if spectrogram length is <100 before doing anything else
...
fixes #39
2 years ago
Borislav Stanimirov
0b45d25151
Building with MSVC
2 years ago
Georgi Gerganov
63b6786767
Minor
2 years ago
lnyan
4bbb8a587b
Add MinGW support
2 years ago
Georgi Gerganov
2ca8cc77b2
ref #17 : print whisper logs to stderr
...
Only the transcribed/translated text is printed to stdout.
This way, one can redirect the result to a file.
2 years ago
Georgi Gerganov
8c7c018893
ref #17 : add options to output result to file
...
Support for:
- plain text
- VTT
- SRT
2 years ago
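The visible difference between the two subtitle formats listed above is the cue timestamp: SRT writes `HH:MM:SS,mmm` and numbers each cue, while WebVTT writes `HH:MM:SS.mmm`. The sketch below shows that difference only; the function names are hypothetical and are not taken from the commit.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats a non-negative time in milliseconds as HH:MM:SS<sep>mmm.
// SRT separates the milliseconds with a comma, WebVTT with a dot.
std::string format_timestamp(std::int64_t ms, bool srt) {
    assert(ms >= 0);
    const int hr   = static_cast<int>(ms / 3600000);
    const int min  = static_cast<int>((ms / 60000) % 60);
    const int sec  = static_cast<int>((ms / 1000) % 60);
    const int msec = static_cast<int>(ms % 1000);

    char buf[32];
    std::snprintf(buf, sizeof(buf), "%02d:%02d:%02d%c%03d",
                  hr, min, sec, srt ? ',' : '.', msec);
    return buf;
}

// Builds one subtitle cue. SRT cues start with a 1-based index line;
// WebVTT cues do not need one.
std::string format_cue(int index, std::int64_t t0_ms, std::int64_t t1_ms,
                       const std::string & text, bool srt) {
    std::string cue;
    if (srt) {
        cue += std::to_string(index) + "\n";
    }
    cue += format_timestamp(t0_ms, srt) + " --> " + format_timestamp(t1_ms, srt) + "\n";
    cue += text + "\n\n";
    return cue;
}
```

A complete WebVTT file additionally starts with a `WEBVTT` header line followed by a blank line, which a writer would emit once before the first cue.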
Georgi Gerganov
b43b36e006
Update tests
2 years ago
Georgi Gerganov
2f069335ab
Adding sanitizer tests
2 years ago
Georgi Gerganov
332c9d77fe
whisper : fix bug in token sampling logic
...
Could overflow buffer
2 years ago
Georgi Gerganov
481cd685d5
ref #10 : option to keep context in "stream" example
...
Seems the results become worse when we keep the context, so by default
this is not enabled
2 years ago
Georgi Gerganov
7787b878e1
ref #16, #22 : add "offset" argument
...
Allows processing of the input audio to start at some offset from the
beginning. Useful for splitting a long job into multiple tasks.
2 years ago
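One way to use such an offset for splitting a long job is to precompute a start offset per task. The sketch below is an assumption-laden illustration: the millisecond unit, the `Task` struct, and the per-task duration are hypothetical (the commit only mentions an offset), so how each task decides where to stop is left to the caller.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one unit of work.
struct Task {
    std::int64_t offset_ms;   // where processing starts, from the beginning
    std::int64_t duration_ms; // how much audio this task is meant to cover
};

// Splits total_ms of audio into consecutive chunks of at most chunk_ms.
// The last chunk is shorter when total_ms is not a multiple of chunk_ms.
std::vector<Task> split_into_tasks(std::int64_t total_ms, std::int64_t chunk_ms) {
    assert(total_ms >= 0);
    assert(chunk_ms > 0);

    std::vector<Task> tasks;
    for (std::int64_t off = 0; off < total_ms; off += chunk_ms) {
        tasks.push_back({off, std::min(chunk_ms, total_ms - off)});
    }
    return tasks;
}
```

Each resulting `offset_ms` would be passed as the offset of a separate run. Cutting at fixed boundaries can split a word in half, so a real pipeline may prefer overlapping chunks or cuts at silences.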
Georgi Gerganov
167324584b
wip : rpi4 support
2 years ago
Georgi Gerganov
ce1fe95902
wip : improve makefile
2 years ago
Georgi Gerganov
6814cc9b02
Improve result printing
2 years ago
Georgi Gerganov
eba33adadd
Extend C-style API with full inference methods
2 years ago
Georgi Gerganov
6b77124e01
Initial C-style interface for whisper.cpp
2 years ago