al-folio

a simple whitespace theme for academics

a distill-style blog post

an example of a distill-style blog post and main elements

25 min read · 2021

a post with code

an example of a blog post with some code

4 min read · 2015

Graph Neural Networks and Foundation Models for Science

How GNNs and graph-aware Transformers are enabling breakthroughs in drug discovery, materials science, and protein structure prediction.

7 min read · April 17, 2026

2026 · gnn graph molecular drug-discovery alphafold · applications

Contrastive Self-Supervised Learning: CLIP, SimCLR, and DINO

SimCLR, MoCo, BYOL, and DINO — the elegant mathematics of learning powerful representations by contrasting augmented views, without any labels.

7 min read · April 16, 2026

2026 · contrastive ssl simclr moco dino clip · foundation-models

The Transformer Architecture: A First-Principles Deep Dive

A rigorous technical walkthrough of every sublayer in the original Transformer — the architecture underpinning virtually all modern AI.

7 min read · April 15, 2026

2026 · transformers attention architecture foundational · foundation-models

Mechanistic Interpretability: Reverse-Engineering the Transformer

How researchers use circuits, activation patching, and the logit lens to understand exactly what computations happen inside Transformer models.

6 min read · April 14, 2026

2026 · interpretability circuits induction-heads features · interpretability

Speculative Decoding: 3× Faster LLM Inference for Free

How speculative decoding uses a small draft model and one parallel verification pass to dramatically accelerate autoregressive inference.

7 min read · April 13, 2026

2026 · inference efficiency speculative-decoding latency · efficiency
