Georgi Gerganov | 6309a60bac | ggml : vectorized quantize_row_q4_0 (ARM) | 2 years ago
Georgi Gerganov | ea97a5f469 | ggml : vectorized mad q4_0 (ARM) | 2 years ago
Georgi Gerganov | 8ce6d1e492 | gq : add method 6 (ARM) | 2 years ago
Georgi Gerganov | cc94fdafe7 | ggml : 4-bit quantization works (only scalar for now) | 2 years ago
Georgi Gerganov | b48b09c37f | gpt-2 : add gpt-2-quantize tool for quantizing f32 GPT-2 models | 2 years ago
Georgi Gerganov | a366dd31cc | ggml : q4_1 quantization support (seems to work for bigger models) | 2 years ago
Georgi Gerganov | a37776ddc0 | ggml : q4_0 quantization support | 2 years ago
Georgi Gerganov | 751aa84f1a | gpt-2 : loading Q4_0 quantized model | 2 years ago
Georgi Gerganov | 38faca7efe | ggml : Q4_0 quantization support (ggml_get_rows()) | 2 years ago
Georgi Gerganov | ca2714384b | gpt-2 : model conversion for Q4_0 quantization | 2 years ago
Georgi Gerganov | 1ca898f94b | gq : method 5 (ARM) | 2 years ago
Georgi Gerganov | 5a96c91bea | gq : method 4 (AVX2 attempt) + method 5 (no min) | 2 years ago
Georgi Gerganov | cde7c22ab1 | gq : method 4 (ARM) | 2 years ago
Georgi Gerganov | 054d97e0e1 | gq : method 4 (AVX2) | 2 years ago
Georgi Gerganov | 37dcfad83b | gq : progress on method 2 | 2 years ago
Georgi Gerganov | bf709e45de | gq : add amax based method 3 | 2 years ago
Georgi Gerganov | 0a7debb7bf | gq : attempt at n-bit quantization | 2 years ago
katsu560 | 4c2f924553 | cmake : update CMakeLists.txt to add correct flags (#26); modify src/CMakeLists.txt from whisper.cpp; cmake : remove OpenBLAS stuff; Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> | 2 years ago
Georgi Gerganov | ba3e8a3d7f | readme : update Roadmap | 2 years ago
Georgi Gerganov | 2546cb7780 | readme : add Roadmap section | 2 years ago
Georgi Gerganov | 8f8a5aca99 | sync : latest whisper.cpp | 2 years ago
Georgi Gerganov | efa2cc36a2 | tests : fix cblas_sgemm call | 2 years ago
Georgi Gerganov | 3b3ad42906 | tests : add SVD experiments | 2 years ago
Georgi Gerganov | a6acb3318a | sync : latest whisper.cpp (scratch buffers in ggml) | 2 years ago
Georgi Gerganov | 47b297224e | Update README.md | 2 years ago
Takuya Takeuchi | 0467385010 | cmake : configure CMAKE_C_FLAGS and target_link_libraries for MSVC (#15) | 2 years ago
Georgi Gerganov | fb64edddb7 | gpt : fix sampling to use the temperature (close #16) | 2 years ago
Georgi Gerganov | c40a5b51a0 | ggml : sync latest whisper.cpp | 2 years ago
Georgi Gerganov | a0f2f68cdb | gpt-2 : fix broken prompt due to recent experiments; No idea why I commited that!? | 2 years ago
Georgi Gerganov | dee3684fec | ggml : sync latest whisper.cpp | 2 years ago
Georgi Gerganov | 6ed4da0b03 | cmake : disable warnings about unused functions | 2 years ago
Georgi Gerganov | 06e2a3b721 | ggml : bugfix in new soft max computation | 2 years ago
Georgi Gerganov | 78af1420bf | tests : change test2 eps | 2 years ago
Georgi Gerganov | 1af4cf0102 | ggml : sync with latest whisper.cpp | 2 years ago
Georgi Gerganov | 73a7916d30 | tests : some more quantization experiments | 2 years ago
Georgi Gerganov | e0abac1be7 | sync : forgot to sync ggml.h | 2 years ago
Georgi Gerganov | 45fc4fed0b | sync : latest changes from whisper.cpp | 2 years ago
Georgi Gerganov | deb0c486c7 | tests : wip quantized matrix multiplication method 2 | 2 years ago
Georgi Gerganov | d677c7f61d | tests : minor fixes for x86 | 2 years ago
Georgi Gerganov | 446ccf3ab1 | tests : experiments with n-bit quantized matrix multiplication | 2 years ago
Georgi Gerganov | bd9f710a45 | sync : latest changes from whisper.cpp | 2 years ago
Georgi Gerganov | 1dcbe86a0c | gpt-2 : experimenting with attention mask | 2 years ago
Georgi Gerganov | 99f1afb613 | gpt-2 : fix off-by-one error in batching logic | 2 years ago
Georgi Gerganov | 64efeceabd | examples : redirect download scripts to HF | 2 years ago
Georgi Gerganov | ed09c7190e | gpt : add support for gpt-jt + fix unicode support | 2 years ago
Georgi Gerganov | f56828ed78 | ggml : sync with latest code from whisper.cpp | 2 years ago
Georgi Gerganov | 90ee5c6358 | sync : latest changes from whisper.cpp (Documentation; whisper : token-level timestamps; ggml : Windows build fixes; etc.) | 2 years ago
Georgi Gerganov | db13973820 | Update README.md | 2 years ago
Georgi Gerganov | 6feeca262f | sync : latest changes from whisper.cpp | 2 years ago
Georgi Gerganov | 624e4f5313 | whisper : fix timestamp sampling | 2 years ago