Georgi Gerganov
7affd309d3
whisper : add new-segment callback
...
Can be used to process new segments as they are being generated.
Sample usage in main, for printing the resulting segments during the
inference.
2 years ago
Georgi Gerganov
8f95c25aed
main : refactor subtitle output
2 years ago
Georgi Gerganov
31ff0c6a1f
wip : experimental color coding of tokens based on probabilities
2 years ago
Georgi Gerganov
7d0dee7a8a
ref #68 : add option "-on" to specify segment index offset for SRT
...
Also, change option "-o" to "-ot"
2 years ago
Georgi Gerganov
e30cf83158
ref #57 , #62 , #63 : remove unions in C-api + remove designated initializers
...
We are not ready for designated initializers - many compilers do not
support this C++ feature yet, so removing its non-trivial usages.
2 years ago
Georgi Gerganov
72d967bce4
Use Accelerate framework on Apple silicon
...
Huge performance improvement in the Encode (almost x2 on MacBook M1 Pro)
Also various extra optimizations:
- Multi-threaded NORM operator
- Faster GELU via F16 cast
2 years ago
Topping1
50b5fe964c
Update main.cpp
2 years ago
Georgi Gerganov
4a6bf11db3
Minor
2 years ago
Georgi Gerganov
9bbca3110f
ref #9 : add API documentation in whisper.h
2 years ago
Georgi Gerganov
2ca8cc77b2
ref #17 : print whisper logs to stderr
...
Only the transcribed/translated text is printed to stdout.
This way, one can redirect the result to a file.
2 years ago
Georgi Gerganov
8c7c018893
ref #17 : add options to output result to file
...
Support for:
- plain text
- VTT
- SRT
2 years ago
Georgi Gerganov
7787b878e1
ref #16 , #22 : add "offset" argument
...
Allows processing of the input audio to start at some offset from the
beginning. Useful for splitting a long job into multiple tasks.
2 years ago
Georgi Gerganov
700898e6ed
ref #22 : add option to provide multiple input .wav files
2 years ago
Georgi Gerganov
ce1fe95902
wip : improve makefile
2 years ago
Артём Земляк
495b81b367
Fix: main get n_threads from cli
2 years ago
Артём Земляк
f007e186fe
Fix: main get language from cli args
2 years ago
Georgi Gerganov
6814cc9b02
Improve result printing
2 years ago
Georgi Gerganov
eba33adadd
Extend C-style API with full inference methods
2 years ago
Georgi Gerganov
6b77124e01
Initial C-style interface for whisper.cpp
2 years ago
Georgi Gerganov
77d929f603
Fix bug in FFT
...
The FFT routine does not work for odd N
Solution is to add DFT and use it when N is odd
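A minimal sketch of the idea described in this commit message, not the code from the commit itself: a recursive radix-2 FFT can only split its input in half when N is even, so for odd N it hands off to a direct O(N^2) DFT. The names `cvec`, `dft_naive` and `fft_with_fallback` are hypothetical and chosen for illustration only.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical alias for this sketch.
using cvec = std::vector<std::complex<double>>;

// Direct O(N^2) DFT. Valid for any N, including odd N.
cvec dft_naive(const cvec & in) {
    const std::size_t N = in.size();
    const double pi = std::acos(-1.0);
    cvec out(N);
    for (std::size_t k = 0; k < N; ++k) {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t n = 0; n < N; ++n) {
            const double angle = -2.0 * pi * double(k) * double(n) / double(N);
            sum += in[n] * std::polar(1.0, angle);
        }
        out[k] = sum;
    }
    return out;
}

// Recursive radix-2 FFT. The even/odd split needs an even N,
// so an odd N (at any recursion depth) falls back to dft_naive.
cvec fft_with_fallback(const cvec & in) {
    const std::size_t N = in.size();
    if (N <= 1) {
        return in;
    }
    if (N % 2 == 1) {
        return dft_naive(in);
    }

    const std::size_t half = N / 2;
    cvec even(half), odd(half);
    for (std::size_t i = 0; i < half; ++i) {
        even[i] = in[2 * i];
        odd[i]  = in[2 * i + 1];
    }

    const cvec E = fft_with_fallback(even);
    const cvec O = fft_with_fallback(odd);

    const double pi = std::acos(-1.0);
    cvec out(N);
    for (std::size_t k = 0; k < half; ++k) {
        // Twiddle factor exp(-2*pi*i*k/N) applied to the odd half.
        const std::complex<double> t =
            std::polar(1.0, -2.0 * pi * double(k) / double(N)) * O[k];
        out[k]        = E[k] + t;
        out[k + half] = E[k] - t;
    }
    return out;
}
```

With this layout an input of length 6 is split once into two length-3 halves, each of which is handled by the direct DFT, so the fallback can trigger below the top level as well as on it.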
2 years ago
Georgi Gerganov
6d654d192a
Fix reading of stereo WAV files
2 years ago
Georgi Gerganov
15b49e8baf
Bug fix
...
Longer prompts could cause out-of-bounds access
2 years ago
Georgi Gerganov
3bcdbdfc32
Reduce memory usage even more + better sampling
...
- The encode/decode memory buffers are now reused
- If the 30-sec segment goes for too long without a timestamp token, we
force one. Improves transcription for large model
- Stereo support
- Add "micro-machines.wav" sample
2 years ago
Georgi Gerganov
5877c3578e
ref #4 : added transcription timestamps
...
Can be turned off with "-nt" argument.
Performance has also improved.
2 years ago
Georgi Gerganov
f888c2373d
Flash + language support (ref #2 )
...
- Achieved big performance improvement + memory usage reduction
- Can now translate / transcribe different languages
2 years ago
Georgi Gerganov
476182e439
Update README.md and simplify usage
2 years ago
Georgi Gerganov
b0a11594ae
Initial release
2 years ago