Sebastián Aedo
e1415de535
Initial support for CMake
2 years ago
Matvey Soloviev
460c482540
Fix token count accounting
2 years ago
Georgi Gerganov
c80e2a8f2a
Revert "10% performance boost on ARM"
This reverts commit 113a9e83eb.
There are some reports for illegal instruction.
Moved this stuff to vdotq_s32 branch until resolve
2 years ago
Georgi Gerganov
54a0e66ea0
Check for vdotq_s32 availability
2 years ago
Georgi Gerganov
543c57e991
Amend to previous commit - forgot to update non-QRDMX branch
2 years ago
Georgi Gerganov
113a9e83eb
10% performance boost on ARM
2 years ago
Matvey Soloviev
404fac0d62
Fix color getting reset before prompt output done (#65)
(cherry picked from commit 7eb2987619feee04c40eff69b604017d09919cb6)
2 years ago
Georgi Gerganov
1a0a74300f
Update README.md
2 years ago
Matvey Soloviev
96ea727f47
Add interactive mode (#61)
* Initial work on interactive mode.
* Improve interactive mode. Make rev. prompt optional.
* Update README to explain interactive mode.
* Fix OS X build
2 years ago
Marc Köhlbrugge
9661954835
Fix typo in README (#45)
2 years ago
Ben Garney
f385f8dee8
Allow using prompt files (#59)
2 years ago
beiller
02f0c6fe7f
Add back top_k (#56)
* Add back top_k
* Update utils.cpp
* Update utils.h
---------
Co-authored-by: Bill Hamilton <bill.hamilton@shopify.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2 years ago
Sebastián A
eb062bb012
Windows fixes (#31)
* Apply fixes suggested to build on windows
Issue: https://github.com/ggerganov/llama.cpp/issues/22
* Remove unsupported VLAs
* MSVC: Remove features that are only available on MSVC C++20.
* Fix zero initialization of the other fields.
* Change the use of vector for stack allocations.
2 years ago
Georgi Gerganov
7027a97837
Update README.md
2 years ago
Georgi Gerganov
2d555e5b42
Add CI (#60)
2 years ago
Georgi Gerganov
7c9e54e55e
Revert "weights_only" arg - this causing more trouble than help
2 years ago
Oleksandr Nikitin
b9bd1d0141
python/pytorch compat notes (#44)
2 years ago
beiller
129c7d1ea8
Add repetition penalty (#20)
* Adding repeat penalization
* Update utils.h
* Update utils.cpp
* Numeric fix
Should probably still scale by temp even if penalized
* Update comments, more proper application
I see that numbers can go negative so a fix from a referenced commit
* Minor formatting
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2 years ago
Georgi Gerganov
702fddf5c5
Clarify meaning of hacking
2 years ago
Georgi Gerganov
7d86e25bf6
README: add "Supported platforms" + update hot topics
2 years ago
deepdiffuser
a93120236f
use weights_only in conversion script (#32)
this restricts malicious weights from executing arbitrary code by restricting the unpickler to only loading tensors, primitive types, and dictionaries
2 years ago
Pavol Rusnak
6a9a67f0be
Add LICENSE (#21)
2 years ago
Georgi Gerganov
da1a4ff01f
Update README.md
2 years ago
Juraj Bednar
6b2cb6302f
Fix a typo in model name (#16)
2 years ago
Georgi Gerganov
4235e3d5b3
Update README.md
2 years ago
Georgi Gerganov
f1eaff4721
Add AVX2 support for x86 architectures thanks to @Const-me!
2 years ago
Georgi Gerganov
a9e58529ea
Fix un-initialized FP16 tables on x86 (#15, #2)
2 years ago
Georgi Gerganov
7d9ed7b25f
Bump memory buffer
2 years ago
Georgi Gerganov
0c6803321c
Update README.md
2 years ago
Georgi Gerganov
f60fa9e50a
.gitignore models/
2 years ago
Georgi Gerganov
7211862c94
Update Makefile var + add comment
2 years ago
Georgi Gerganov
a5c5ae2f54
Update README.md
2 years ago
Georgi Gerganov
ea977e85ec
Update README.md
2 years ago
Georgi Gerganov
007a8f6f45
Support all LLaMA models + change Q4_0 quantization storage
2 years ago
Simon Willison
5f2f970d51
Include Python dependencies in README (#6)
2 years ago
Georgi Gerganov
73c6ed5e87
Update README.md
2 years ago
Georgi Gerganov
01eeed8fb1
Update README.md
2 years ago
Georgi Gerganov
6da2df34ee
Update README.md
2 years ago
Jean-Michaël Celerier
9dcf4dba45
Add missing headers for memcpy and assert (#3)
2 years ago
Georgi Gerganov
920a7fe2d9
Update README.md
2 years ago
Georgi Gerganov
3a57ee59de
Update README.md
2 years ago
Georgi Gerganov
b85028522d
Update README.md
2 years ago
Georgi Gerganov
8a01f565ff
Update README.md
2 years ago
Georgi Gerganov
70bc0b8b15
Fix a bug in the rope calculation
2 years ago
Georgi Gerganov
18ebda34d6
Update README.md
2 years ago
Georgi Gerganov
319cdb3e1f
Final touches
2 years ago
Georgi Gerganov
775328064e
Create README.md
2 years ago
Georgi Gerganov
26c0846629
Initial release
2 years ago