speculative-decoding

Here are 13 public repositories matching this topic...

intel / intel-extension-for-transformers

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡

retrieval chatbot rag habana large-language-model chatpdf llm-inference 4-bits speculative-decoding llm-cpu streamingllm intel-optimized-llamacpp neural-chat neural-chat-7b autoround gaudi3

Updated Jul 19, 2024
Python

SafeAILab / EAGLE

Star

Official Implementation of EAGLE-1 and EAGLE-2

large-language-models llm-inference speculative-decoding

Updated Jul 19, 2024
Python

Infini-AI-Lab / Sequoia

Star

scalable and robust tree-based speculative decoding algorithm

efficiency inference llm speculative-decoding

Updated Jun 7, 2024
Python

Infini-AI-Lab / TriForce

Star

[COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

acceleration efficiency inference llm long-context llm-inference speculative-decoding

Updated Jul 4, 2024
Python

kssteven418 / BigLittleDecoder

Star

[NeurIPS'23] Speculative Decoding with Big Little Decoder

decoding efficient-inference speculative-execution fast-inference llm speculative-decoding

Updated Feb 6, 2024
Python

hemingkx / SpecDec

Star

Codes for our paper "Speculative Decoding: Exploiting Speculative Execution for Accelerating Seq2seq Generation" (EMNLP 2023 Findings)

non-autoregressive speculative-decoding

Updated Dec 9, 2023
Python

mscheong01 / speculative_decoding.c

Star

minimal C implementation of speculative decoding based on llama2.c

c artificial-intelligence llm llama2 speculative-decoding

Updated Jul 15, 2024
C

romsto / Speculative-Decoding

Star

Implementation of the paper Fast Inference from Transformers via Speculative Decoding, Leviathan et al. 2023.

fast-inference llm llm-inference speculative-decoding llm-optimization

Updated Jul 16, 2024
Python

AutonomicPerfectionist / PipeInfer

Sponsor

Star

PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation

inference llm llamacpp speculative-decoding

Updated Jul 5, 2024
C++

pinqian77 / Dynasurge

Star

Dynasurge: Dynamic Tree Speculation for Prompt-Specific Decoding

large-language-models speculative-decoding

Updated Apr 29, 2024
Python

u-hyszk / japanese-speculative-decoding

Star

Verification of the effect of speculative decoding in Japanese.

nlp japanese fast-inference speculative-decoding

Updated Mar 4, 2024
Python

kinshukdua / SpecDec

Star

Some experiments aimed at increasing LLM throughput and efficiency via Speculative Decoding.

inference llm speculative-decoding

Updated Jul 31, 2023
Python

PopoDev / BiLD

Star

Reproducibility Project for [NeurIPS'23] Speculative Decoding with Big Little Decoder

reproducibility fast-inference llm speculative-decoding

Updated May 30, 2024
Python

Improve this page

Add a description, image, and links to the speculative-decoding topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the speculative-decoding topic, visit your repo's landing page and select "manage topics."

Learn more

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

speculative-decoding

Here are 13 public repositories matching this topic...

intel / intel-extension-for-transformers

SafeAILab / EAGLE

Infini-AI-Lab / Sequoia

Infini-AI-Lab / TriForce

kssteven418 / BigLittleDecoder

hemingkx / SpecDec

mscheong01 / speculative_decoding.c

romsto / Speculative-Decoding

AutonomicPerfectionist / PipeInfer

pinqian77 / Dynasurge

u-hyszk / japanese-speculative-decoding

kinshukdua / SpecDec

PopoDev / BiLD

Improve this page

Add this topic to your repo