models : change default hosting to Hugging Face

My Linode is running out of monthly bandwidth due to the big interest in
the project
pull/147/head
Georgi Gerganov 2 years ago
parent 83c742f1a7
commit 864a78a8d0
GPG Key ID: 449E073F9DC10735

@@ -428,11 +428,14 @@ The original models are converted to a custom binary format. This allows to pack
 - vocabulary
 - weights
 
-You can download the converted models using the [models/download-ggml-model.sh](models/download-ggml-model.sh) script or from here:
+You can download the converted models using the [models/download-ggml-model.sh](models/download-ggml-model.sh) script
+or manually from here:
 
-https://ggml.ggerganov.com
+- https://huggingface.co/datasets/ggerganov/whisper.cpp
+- https://ggml.ggerganov.com
 
-For more details, see the conversion script [models/convert-pt-to-ggml.py](models/convert-pt-to-ggml.py) or the README in [models](models).
+For more details, see the conversion script [models/convert-pt-to-ggml.py](models/convert-pt-to-ggml.py) or the README
+in [models](models).
 
 ## Bindings

@@ -1,10 +1,13 @@
 ## Whisper model files in custom ggml format
 
 The [original Whisper PyTorch models provided by OpenAI](https://github.com/openai/whisper/blob/main/whisper/__init__.py#L17-L27)
-have been converted to custom `ggml` format in order to be able to load them in C/C++. The conversion has been performed using the
-[convert-pt-to-ggml.py](convert-pt-to-ggml.py) script. You can either obtain the original models and generate the `ggml` files
-yourself using the conversion script, or you can use the [download-ggml-model.sh](download-ggml-model.sh) script to download the
-already converted models from https://ggml.ggerganov.com
+have been converted to custom `ggml` format in order to be able to load them in C/C++. The conversion has been performed
+using the [convert-pt-to-ggml.py](convert-pt-to-ggml.py) script. You can either obtain the original models and generate
+the `ggml` files yourself using the conversion script, or you can use the [download-ggml-model.sh](download-ggml-model.sh)
+script to download the already converted models. Currently, they are hosted at the following locations:
+- https://huggingface.co/datasets/ggerganov/whisper.cpp
+- https://ggml.ggerganov.com
 
 Sample usage:

@@ -3,6 +3,12 @@
 # This script downloads Whisper model files that have already been converted to ggml format.
 # This way you don't have to convert them yourself.
 
+#src="https://ggml.ggerganov.com"
+#pfx="ggml-model-whisper"
+
+src="https://huggingface.co/datasets/ggerganov/whisper.cpp"
+pfx="resolve/main/ggml"
+
 # get the path of this script
 function get_script_path() {
   if [ -x "$(command -v realpath)" ]; then
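The two new variables introduced above are later joined with the model name at the wget/curl call sites. A minimal sketch of how they compose into the final download URL (the model name `base.en` is an illustrative assumption, not part of this diff):

```shell
# Sketch: compose src/pfx from the hunk above into a full download URL.
# "base.en" is an example model name chosen for illustration.
src="https://huggingface.co/datasets/ggerganov/whisper.cpp"
pfx="resolve/main/ggml"
model="base.en"

url="$src/$pfx-$model.bin"
echo "$url"
# → https://huggingface.co/datasets/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin
```

Note that the Hugging Face path segment `resolve/main` ends up embedded in `$pfx`, so the existing `$src/$pfx-$model.bin` interpolation works unchanged for both the old and the new host.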
@@ -46,7 +52,7 @@ fi
 
 # download ggml model
 
-printf "Downloading ggml model $model ...\n"
+printf "Downloading ggml model $model from '$src' ...\n"
 
 cd $models_path
@@ -56,9 +62,9 @@ if [ -f "ggml-$model.bin" ]; then
 fi
 
 if [ -x "$(command -v wget)" ]; then
-    wget --quiet --show-progress -O ggml-$model.bin https://ggml.ggerganov.com/ggml-model-whisper-$model.bin
+    wget --quiet --show-progress -O ggml-$model.bin $src/$pfx-$model.bin
 elif [ -x "$(command -v curl)" ]; then
-    curl --output ggml-$model.bin https://ggml.ggerganov.com/ggml-model-whisper-$model.bin
+    curl --output ggml-$model.bin $src/$pfx-$model.bin
 else
     printf "Either wget or curl is required to download models.\n"
     exit 1
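The unchanged control flow in this hunk probes for `wget` first, falls back to `curl`, and aborts if neither is installed. The same selection pattern can be isolated as a small sketch (the function name `pick_downloader` is illustrative, not part of the script):

```shell
# Pick a downloader the way the script does: wget first, then curl,
# otherwise signal failure to the caller.
pick_downloader() {
    if [ -x "$(command -v wget)" ]; then
        echo "wget"
    elif [ -x "$(command -v curl)" ]; then
        echo "curl"
    else
        return 1
    fi
}

dl=$(pick_downloader) || {
    printf "Either wget or curl is required to download models.\n"
    exit 1
}
```

Using `command -v` with an `-x` test keeps the check POSIX-friendly, which matters since the script may run under shells other than bash.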
