From b0f2aa0ea67a40597a7a8390cb0e279f9fac00aa Mon Sep 17 00:00:00 2001
From: Georgi Gerganov
Date: Sun, 30 Oct 2022 17:10:46 +0200
Subject: [PATCH] Update README.md

---
 README.md | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/README.md b/README.md
index a975238..fd8f6db 100644
--- a/README.md
+++ b/README.md
@@ -273,6 +273,45 @@ to highlight words with high or low confidence:
 
 image
 
+## Word-level timestamps (experimental)
+
+The [main](examples/main) example has experimental support for word-level timestamp generation. The accuracy
+is not great, but might be improved in the future.
+
+To use it, simply add the `-owts` command-line argument. There is a free parameter `-wt` that should be around `0.01`.
+
+Here are a few *"typical"* examples:
+
+```java
+./main -m ./models/ggml-base.en.bin -f ./samples/jfk.wav -owts -wt 0.01
+source ./samples/jfk.wav.wts
+ffplay ./samples/jfk.wav.mp4
+```
+
+https://user-images.githubusercontent.com/1991296/198885665-b34b6845-11b8-4449-a255-d9ec2eab1344.mp4
+
+---
+
+```java
+./main -m ./models/ggml-base.en.bin -f ./samples/mm0.wav -owts -wt 0.1
+source ./samples/mm0.wav.wts
+ffplay ./samples/mm0.wav.mp4
+```
+
+https://user-images.githubusercontent.com/1991296/198885703-0547ba17-c288-4827-8361-84cc440f2901.mp4
+
+---
+
+```java
+./main -m ./models/ggml-base.en.bin -f ./samples/gb0.wav -owts -wt 0.01
+source ./samples/gb0.wav.wts
+ffplay ./samples/gb0.wav.mp4
+```
+
+https://user-images.githubusercontent.com/1991296/198885729-3fc9028c-a50c-4549-a11f-3306ef97e0c4.mp4
+
+---
+
 ## Implementation details
 
 - The core tensor operations are implemented in C ([ggml.h](ggml.h) / [ggml.c](ggml.c))