DeepSeek just released DSpark for V4 Flash & Pro, a new speculative decoding method boosting throughput by 51% to 400%!
DeepSeek also showed that DSpark works well for other models, such as Gemma & Qwen.
GitHub: https://github.com/deepseek-ai/DeepSpec
Paper: https://github.com/deepseek-ai/DeepSpec/blob/main/DSpark_paper.pdf
HF: https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro-DSpark