Release v1.1.0 · ggerganov/whisper.cpp

Overview

The major change in this pre-release is the improved decoding implementation in whisper.cpp:

Support for average logprob and entropy based criteria for fallback
Support for temperature T > 0
Improved Greedy decoder via best_of parameter for T > 0
Beam search decoding via the beam_size parameter
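
The fallback criteria listed above can be pictured as a retry loop over increasing temperatures. The following is a minimal sketch, not the whisper.cpp implementation: the names `decode_result`, `decode_fn`, `decode_with_fallback` and `fake_decode` are hypothetical, and both the threshold directions (an attempt is rejected when its average logprob or its entropy falls below a threshold) and the numeric values are assumptions made for illustration. The actual behaviour is described in #291.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Quality metrics for one decoding attempt (hypothetical struct). */
struct decode_result {
    double avg_logprob; /* mean log-probability of the sampled tokens */
    double entropy;     /* entropy of the decoded token sequence */
};

/* Hypothetical callback: decode one segment at the given temperature. */
typedef struct decode_result (*decode_fn)(double temperature, void *user_data);

/* Assumption: an attempt is rejected when it looks unconfident
 * (low average logprob) or repetitive (low entropy). */
bool attempt_failed(struct decode_result r, double logprob_thold, double entropy_thold) {
    return r.avg_logprob < logprob_thold || r.entropy < entropy_thold;
}

/* Try T = 0, then raise T by temperature_inc until an attempt is accepted
 * or T would exceed 1.0. Returns the temperature of the last attempt made
 * and stores that attempt's metrics in *out. */
double decode_with_fallback(decode_fn decode, void *user_data,
                            double temperature_inc,
                            double logprob_thold, double entropy_thold,
                            struct decode_result *out) {
    assert(temperature_inc > 0.0);
    /* Step by integer index to avoid accumulating rounding error. */
    int n_steps = (int) (1.0 / temperature_inc + 1e-9);
    double t = 0.0;
    for (int i = 0; i <= n_steps; ++i) {
        t = i * temperature_inc;
        *out = decode(t, user_data);
        if (!attempt_failed(*out, logprob_thold, entropy_thold)) {
            break;
        }
    }
    return t;
}

/* Stand-in decoder for demonstration: poor metrics at low temperature,
 * acceptable metrics once the temperature reaches 0.4. */
struct decode_result fake_decode(double temperature, void *user_data) {
    (void) user_data;
    struct decode_result r;
    if (temperature < 0.4) {
        r.avg_logprob = -1.5;
        r.entropy = 1.0;
    } else {
        r.avg_logprob = -0.5;
        r.entropy = 3.0;
    }
    return r;
}
```

With a step of 0.5 and illustrative thresholds of -1.0 (logprob) and 2.4 (entropy), the stub is rejected at T = 0 and accepted at T = 0.5, so `decode_with_fallback(fake_decode, NULL, 0.5, -1.0, 2.4, &r)` returns 0.5.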

More information about the decoding changes can be found in #291.
Additionally, there are a few performance improvements for Apple Silicon, WASM and non-F16C platforms.
Support for POWER9 architectures has been added.

This is a pre-release rather than an official release because the new implementation has not yet been sufficiently tested and the existing bindings for other languages have not been updated to support the API changes. The official 1.1.x release will be created once there is enough feedback on the new decoding implementation and the bindings have been updated, so please send your feedback in the discussion created for this pre-release. For now, the 1.0.4 release should be considered more stable.

What's Changed

Core `ggml` / `whisper`

ggml : POWER9 support by @fitzsim in #320, #349, #369
ggml : simplify the SIMD code by @ggerganov in #324
ggml : add SSE3 and fp16 conversion lookup table by @abitofevrything in #368
ggml : utilise Accelerate's vDSP for some computations d51fc3e
ggml : speed-up softmax compute via Accelerate and loop unrolling d61d55c
ggml : do not start extra threads when using BLAS d347a59
whisper : do sample_to_timestamp calculation with 64 bit precision to avoid overflow by @boolemancer in #388
whisper : various code clean-up and improvements by @asmaloney in #317, #318, #319, #322, etc.
whisper : improve decoding by @ggerganov in #291
whisper : account for speed_up flag for short audio #405
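
To illustrate the sample_to_timestamp item above, here is a minimal sketch of why the calculation needs 64-bit arithmetic. It assumes 16 kHz audio and timestamps expressed in units of 10 ms; the `SAMPLE_RATE` macro is a name chosen for this example, and the code is not copied from #388.

```c
#include <assert.h>
#include <stdint.h>

#define SAMPLE_RATE 16000 /* assumption: 16 kHz input audio */

/* Convert a sample index to a timestamp in units of 10 ms.
 * The multiplication is widened to 64 bits first: in 32-bit arithmetic,
 * 100 * i_sample overflows once i_sample exceeds INT32_MAX / 100,
 * which is roughly 22 minutes of audio at 16 kHz. */
int64_t sample_to_timestamp(int i_sample) {
    return (100 * (int64_t) i_sample) / SAMPLE_RATE;
}
```

For example, sample 30,000,000 (31 minutes 15 seconds into the audio) gives an intermediate product of 3,000,000,000, which does not fit in a signed 32-bit integer but converts cleanly to 187500 here.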

C-style API

Add loader class to allow loading from buffer and others by @prsyahmi in #353
Add whisper_token_data::plog
Add whisper_init_from_file()
Add whisper_init_from_buffer()
Change whisper_init()
Remove whisper_sample_best()
Remove whisper_sample_timestamp()
Add whisper_n_audio_ctx()
Add whisper_get_logits()
Remove whisper_get_probs()
Change struct whisper_full_params
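
The loader addition above suggests a callback-based design in which model data is pulled through a small set of function pointers, so the same reading code can serve files, memory buffers and other sources. The sketch below shows that general pattern for an in-memory buffer. All names in it (`model_loader`, `buffer_source`, `loader_from_buffer` and the callbacks) are hypothetical and chosen for illustration; consult whisper.h for the actual definitions and signatures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical loader interface: the model reader only uses these callbacks. */
struct model_loader {
    void  *context;
    size_t (*read)(void *ctx, void *output, size_t read_size);
    bool   (*eof)(void *ctx);
    void   (*close)(void *ctx);
};

/* State for reading from a caller-owned memory buffer. */
struct buffer_source {
    const unsigned char *data;
    size_t size;
    size_t offset;
};

/* Copy up to read_size bytes and advance; returns the number of bytes copied. */
size_t buffer_read(void *ctx, void *output, size_t read_size) {
    struct buffer_source *src = ctx;
    size_t remaining = src->size - src->offset;
    size_t n = read_size < remaining ? read_size : remaining;
    memcpy(output, src->data + src->offset, n);
    src->offset += n;
    return n;
}

/* True once every byte of the buffer has been consumed. */
bool buffer_eof(void *ctx) {
    const struct buffer_source *src = ctx;
    return src->offset >= src->size;
}

/* Nothing to release: the caller owns the buffer. */
void buffer_close(void *ctx) {
    (void) ctx;
}

/* Wire the buffer callbacks into a loader. */
struct model_loader loader_from_buffer(struct buffer_source *src) {
    struct model_loader loader = { src, buffer_read, buffer_eof, buffer_close };
    return loader;
}
```

A file-backed loader would supply the same three callbacks on top of `fread`, `feof` and `fclose`, leaving the model-reading code unchanged.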

Bindings

Golang bindings by @djthorpe in #287, #379, #384

Examples

whisper.android : remove android ABI constraint by @Digipom in #301
whisper.swiftui : SwiftUI example by @Digipom in #308
main : add -ocsv, aka --output-csv for writing CSV file containing millisecond timestamps by @NielsMayer in #340
command : refactor to split command list & general transcription modes by @asmaloney in #331
command : always-prompt mode by @dnhkng in #383
stream : fix data race on bool + avoid division-by-zero a466c34
stream : fix a bug that inserted a lot of empty audio at the start a6dbd91
bench.wasm : print system info fafd789

New Contributors

@djthorpe made their first contribution in #287
@0xmohit made their first contribution in #296
@asmaloney made their first contribution in #298
@fitzsim made their first contribution in #320
@NielsMayer made their first contribution in #340
@aviks made their first contribution in #345
@eltociear made their first contribution in #346
@abitofevrything made their first contribution in #368
@Mike-Bell made their first contribution in #381
@dnhkng made their first contribution in #383
@prsyahmi made their first contribution in #353
@ianb made their first contribution in #391

Full Changelog: v1.0.4...v1.1.0

Highlights

Sample SwiftUI application: examples/whisper.swiftui
