Georgi Gerganov | b48b09c37f | gpt-2 : add gpt-2-quantize tool for quantizing f32 GPT-2 models | 2 years ago
Georgi Gerganov | a366dd31cc | ggml : q4_1 quantization support (seems to work for bigger models) | 2 years ago
Georgi Gerganov | 751aa84f1a | gpt-2 : loading Q4_0 quantized model | 2 years ago
Georgi Gerganov | fb64edddb7 | gpt : fix sampling to use the temperature (close #16) | 2 years ago
Georgi Gerganov | a0f2f68cdb | gpt-2 : fix broken prompt due to recent experiments. No idea why I committed that!? | 2 years ago
Georgi Gerganov | 1dcbe86a0c | gpt-2 : experimenting with attention mask | 2 years ago
Georgi Gerganov | 99f1afb613 | gpt-2 : fix off-by-one error in batching logic | 2 years ago
Georgi Gerganov | 787efb4d2e | Adding Whisper inference example | 2 years ago
Georgi Gerganov | fb558f78d9 | Initial release | 2 years ago