May 03, 2026 Speculative Decoding: 3× Faster LLM Inference for Free
Apr 27, 2026 LoRA and QLoRA: Fine-Tuning 70 B Models on a Consumer GPU
Apr 18, 2026 Knowledge Distillation: Teaching Small Models to Think Big
Apr 13, 2026 Speculative Decoding: 3× Faster LLM Inference for Free
Apr 07, 2026 LoRA and QLoRA: Fine-Tuning 70 B Models on a Consumer GPU