BPE on Thamme Gowda

https://gowda.ai/tags/bpe/

Recent content in BPE on Thamme Gowda

From O(N) to O(log N): A Faster BPE Training Algorithm, Buried and Rediscovered
https://gowda.ai/posts/2026/03/faster-bpe-learn/
Mon, 30 Mar 2026 20:30:00 +0000

I wrote a fast BPE training algorithm in 2020, buried it in a Python codebase, and forgot about it. Five years later, I rewrote it in C++ and benchmarked it: up to 11× faster than SentencePiece. The trick? A max-heap with lazy deletion instead of periodic linear scans.
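
For readers unfamiliar with the idea, here is a minimal Python sketch of a max-heap with lazy deletion used to track the most frequent symbol pair. It is an illustration only, not the code from the post: the class name `LazyMaxHeap` and its methods `update` and `pop_max` are hypothetical. The assumption is that pair counts live in a dictionary, every count change pushes a fresh heap entry, and outdated entries are discarded only when they surface at the top.

```python
import heapq


class LazyMaxHeap:
    """Hypothetical helper: tracks the highest-count key, skipping stale entries on pop."""

    def __init__(self, counts=None):
        # Source of truth for the current count of each key (e.g. a symbol pair).
        self.counts = dict(counts or {})
        # heapq is a min-heap, so counts are stored negated to get max-first order.
        self.heap = [(-c, k) for k, c in self.counts.items() if c > 0]
        heapq.heapify(self.heap)

    def update(self, key, delta):
        """Change a count in O(log N); the old heap entry is left behind as stale."""
        new = self.counts.get(key, 0) + delta
        if new > 0:
            self.counts[key] = new
            heapq.heappush(self.heap, (-new, key))
        else:
            self.counts.pop(key, None)

    def pop_max(self):
        """Return (key, count) for the largest live count, or None if empty."""
        while self.heap:
            neg, key = heapq.heappop(self.heap)
            # Lazy deletion: an entry is valid only if it matches the current count.
            if self.counts.get(key) == -neg:
                del self.counts[key]
                return key, -neg
        return None


if __name__ == "__main__":
    pairs = LazyMaxHeap({("a", "b"): 5, ("b", "c"): 3})
    pairs.update(("a", "b"), -3)  # a merge elsewhere lowered this pair's count
    print(pairs.pop_max())
```

The design trade-off in this sketch is that the heap may hold more entries than there are live pairs, in exchange for never having to search for and remove an entry in the middle of the heap.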