📚 Related Papers:

- Fast Inference from Transformers via Speculative Decoding. Available at: https://arxiv.org/abs/2211.17192

- Accelerating Large Language Model Decoding with Speculative Sampling. Available at: https://arxiv.org/abs/2302.01318

- DistillSpec: Improving Speculative Decoding via Knowledge Distillation. Available at: https://arxiv.org/abs/2310.08461

- Accelerating Speculative Decoding with Block Diffusion Draft Trees. Available at: https://arxiv.org/abs/2604.12989
