Georgi Gerganov | 06e2a3b721 | ggml : bugfix in new soft max computation | 2 years ago
Georgi Gerganov | 78af1420bf | tests : change test2 eps | 2 years ago
Georgi Gerganov | 1af4cf0102 | ggml : sync with latest whisper.cpp | 2 years ago
Georgi Gerganov | 73a7916d30 | tests : some more quantization experiments | 2 years ago
Georgi Gerganov | e0abac1be7 | sync : forgot to sync ggml.h | 2 years ago
Georgi Gerganov | 45fc4fed0b | sync : latest changes from whisper.cpp | 2 years ago
Georgi Gerganov | deb0c486c7 | tests : wip quantized matrix multiplication method 2 | 2 years ago
Georgi Gerganov | d677c7f61d | tests : minor fixes for x86 | 2 years ago
Georgi Gerganov | 446ccf3ab1 | tests : experiments with n-bit quantized matrix multiplication | 2 years ago
Georgi Gerganov | bd9f710a45 | sync : latest changes from whisper.cpp | 2 years ago
Georgi Gerganov | 1dcbe86a0c | gpt-2 : experimenting with attention mask | 2 years ago
Georgi Gerganov | 99f1afb613 | gpt-2 : fix off-by-one error in batching logic | 2 years ago
Georgi Gerganov | 64efeceabd | examples : redirect download scripts to HF | 2 years ago
Georgi Gerganov | ed09c7190e | gpt : add support for gpt-jt + fix unicode support | 2 years ago
Georgi Gerganov | f56828ed78 | ggml : sync with latest code from whisper.cpp | 2 years ago
Georgi Gerganov | 90ee5c6358 | sync : latest changes from whisper.cpp (Documentation; whisper : token-level timestamps; ggml : Windows build fixes; etc.) | 2 years ago
Georgi Gerganov | db13973820 | Update README.md | 2 years ago
Georgi Gerganov | 6feeca262f | sync : latest changes from whisper.cpp | 2 years ago
Georgi Gerganov | 624e4f5313 | whisper : fix timestamp sampling | 2 years ago
Georgi Gerganov | 7094be1f37 | sync : whisper.cpp (Add MSVC header; FP16 GELU; C interface fixes (no unions); Minor CMake updates) | 2 years ago
Georgi Gerganov | 270829aa9f | sync : whisper.cpp | 2 years ago
Georgi Gerganov | 7b70c5a561 | Minor fixes | 2 years ago
Georgi Gerganov | d8f64bce3d | Improve mul_mat performance for big matrices using Accelerate framework (Also: Speedup GELU operator via F16 cast; Multi-thread NORM operator; Disable FLASH_FF in whisper example) | 2 years ago
Georgi Gerganov | ea0ef2a41e | Performance tests - trying to optimize mul_mat | 2 years ago
Georgi Gerganov | 67ac34fcfa | sync : whisper.cpp | 2 years ago
Georgi Gerganov | e2f39f4b52 | whisper : sync with whisper.cpp | 2 years ago
Georgi Gerganov | 8e3c634b27 | whisper : various improvements | 2 years ago
Georgi Gerganov | 8ca553add4 | whisper : add C-style API | 2 years ago
Georgi Gerganov | dd1f4dfbab | whisper : various fixes | 2 years ago
Georgi Gerganov | 0116c03fb7 | whisper : various updates and improvements | 2 years ago
Georgi Gerganov | 787efb4d2e | Adding Whisper inference example | 2 years ago
Georgi Gerganov | f21b84cd21 | Update README.md + minor stuff (Changed default threads to 4; Added GGML_PERF for enabling runtime performance timings) | 2 years ago
Georgi Gerganov | 0f4e99b1cc | Update README.md | 2 years ago
Georgi Gerganov | fb558f78d9 | Initial release | 2 years ago