nana
  • projects
  • writing
  • now
  • connect

Writing

What I Learned Profiling PyTorch Memory Leaks Across Two Backends

June 3, 2026 · 15 min read

Eight leak patterns on MPS and CUDA — CUDA dampens leak signals but does not hide them, and the bugs that matter show up everywhere.

Diagnosing a PyTorch Memory Leak on Apple MPS

May 15, 2026 · 2 min read

How Stormlog helped us find and fix a subtle GPU memory leak that only appeared on Apple Silicon.

I Helped Build a GPU Memory Profiler. Then I Had to Learn What GPU Memory Actually Is.

March 17, 2026 · 8 min read

Walking through the Stormlog tutorial after shipping the tool — from PyTorch counters to a deliberate leak, OOM evidence, and the fix.

OOM Flight Recorder Ring-Buffer: Deep Dive

February 14, 2026 · 11 min read

Why a ring-buffer flight recorder for GPU OOM fills a gap PyTorch snapshots and NCCL tracing do not — temporal context, automatic dumps, and structured artifacts.

Prince Agyei Tuffour © 2026
  • email
  • github
  • linkedin
