efficiency | Sadjad Alikhani

May 03, 2026	Speculative Decoding: 3× Faster LLM Inference for Free
Apr 27, 2026	LoRA and QLoRA: Fine-Tuning 70 B Models on a Consumer GPU
Apr 18, 2026	Knowledge Distillation: Teaching Small Models to Think Big
Apr 13, 2026	Speculative Decoding: 3× Faster LLM Inference for Free
Apr 07, 2026	LoRA and QLoRA: Fine-Tuning 70 B Models on a Consumer GPU