Commit Graph

70 Commits (02c7516c575de5bf999c306e855bdfda120b01ff)

Author SHA1 Message Date
fitzsim ae16c21e9c
whisper : PPC64 big-endian support (#398)
2 years ago
Georgi Gerganov 1290fc6457
bench : add memcpy and ggml_mul_mat benchmarks
2 years ago
Georgi Gerganov 4ef3398e8f
ggml : remove obsolete zeroing + comment fixes (#390)
2 years ago
Abitofevrything 8d7b29cedd
ggml : correct behaviour of ggml_vec_sum_f32 (#390)
2 years ago
Georgi Gerganov 52a3e0c92a
ggml : improve vec_dot_f16 unrolling in flash_attn_f16
2 years ago
Georgi Gerganov f30b5d322c
ggml : fix bug in new soft max computation
2 years ago
Georgi Gerganov d347a59a5f
ggml : when using BLAS start only 1 CPU thread
2 years ago
Georgi Gerganov 6394c906af
ggml : fix running tasks with variable number of threads
2 years ago
Georgi Gerganov 74ffa14e1d
ggml : unroll ggml_vec_dot_f16 in ggml_compute_forward_flash_attn_f16
2 years ago
Georgi Gerganov 65fdcbbbbb
whisper : revert accidental MB change
2 years ago
Georgi Gerganov d61d55cd4b
ggml : speed-up soft max via Accelerate + unroll
2 years ago
Georgi Gerganov d51fc3ee0a
ggml : use vDSP_sve and vDSP_maxv from Accelerate
2 years ago
Georgi Gerganov f82a7dd019
ggml : make gcc happy (minor)
2 years ago
Abitofevrything a62170c656
ggml : add SSE3 and fp16 conversion lookup table (#368)
2 years ago
Thomas Fitzsimmons 1944e7c33e
whisper : document POWER VSX support
2 years ago
Thomas Fitzsimmons 49a8dd6732
ggml : reorganize POWER9 ppc64le SIMD code
2 years ago
Thomas Fitzsimmons 8c7f642286
ggml : change f16 load and store macro arguments
2 years ago
Georgi Gerganov 0a0cfa7985
ggml : add void to argument-less functions
2 years ago
Georgi Gerganov d51c5eb906
ggml : define MIN / MAX only if not defined (minor)
2 years ago
Thomas Fitzsimmons 424c410c42
ggml : improve f16 acceleration for POWER9 ppc64le
2 years ago
Georgi Gerganov 4e0b2069e7
ggml : barrier refactor + static functions
2 years ago
Georgi Gerganov ac521a566e
ggml : simplify the SIMD code (#324)
2 years ago
Georgi Gerganov 7282e2109e
ggml : use vaddvq_f32 for slightly more efficient reduce
2 years ago
Thomas Fitzsimmons 466ceebb78
ggml : add f16 acceleration for POWER9 ppc64le
2 years ago
Andy Maloney 493d94130d
ggml : make consts static (#317)
2 years ago
Andy Maloney fa463313ad
minor : small code cleanups (#302)
2 years ago
Kevin Brothaler e1432dd91a
Check for both __ARM_NEON and __ARM_FEATURE_FMA so that the project can be compiled for armv7a.
2 years ago
katsu560 419b8a6402
Add AVX,AVX2 support for ggml_vec_scale_f32
2 years ago
Georgi Gerganov a7047b2a28
ggml : implement ggml_compute_forward_dup_f16() special cases
2 years ago
Georgi Gerganov 0f11759406
ggml : make more compatible with c99 (#262)
2 years ago
Georgi Gerganov f66ac6dc4f
ggml : fix indentation
2 years ago
Georgi Gerganov 9955fa4ed7
ggml : make compatible with c99 (#262)
2 years ago
Roland Rabien e70d47baab
Remove C++20 requirement (#257)
2 years ago
Georgi Gerganov 3b1aacbe6d
talk : talk with AI in the terminal
2 years ago
Georgi Gerganov 50a061b313
ggml : add alternative cblas_sgemm call
2 years ago
Al Hoang 04a16bbf11
fix compilation on haiku
2 years ago
Georgi Gerganov b6597539f9
ggml : fix typo in previous commit
2 years ago
Georgi Gerganov 9a4b7a916e
ggml : use macros to inline FP16 <-> FP32 conversions
2 years ago
Georgi Gerganov f8ec718b76
ggml : add F16C CPU flag check
2 years ago
katsu560 35b40a93b9
add fp16/fp32 convert intrinsics
2 years ago
Georgi Gerganov 061fc81bd6
ggml : remove inline specifier from fp16 <-> fp32 converters
2 years ago
Georgi Gerganov 388e9f79ad
ggml : fix the fix
2 years ago
Georgi Gerganov 35cd29ce1f
ggml : fix cross-compile Linux -> Window with mingw (#168)
2 years ago
katsu560 804f36aa2c
ggml: change inline ggml_fp16_to_fp32, ggml_fp16_t ggml_fp32_to_fp16
2 years ago
katsu560 83456076f0
add AVX support
2 years ago
Georgi Gerganov 2065572a11
ggml : fix Windows build
2 years ago
boolemancer 0bfe728b84
Fix the Windows pthread_create shim
2 years ago
Georgi Gerganov 75171c2b79
ggml : multi-thread the ggml_add operator
2 years ago
Georgi Gerganov 137321915f
ggml : fix the check for NEON support (#7)
2 years ago
Syed Jafri 24cd12f647
Cross compilation (#121)
2 years ago