Go 1.25 & 1.26 Compiler Magic — How the Stack Is Eating the Heap

1. Introduction
Every Go developer knows the rule of thumb: stack allocations are fast, heap allocations are slow (and cause GC pressure). The Go compiler has always tried to keep allocations on the stack via escape analysis, but until Go 1.25, there were several common patterns that forced heap allocations even when logically unnecessary.
Go 1.25 and 1.26 bring significant improvements to stack allocation. In this article, we’ll walk through exactly what changed, why, and how it affects your code — sometimes even outperforming hand-optimized alternatives.
2. Background: Escape Analysis
Before diving into the optimizations, let’s briefly recap how Go decides where to allocate.
When you write make([]T, n), the compiler runs escape analysis to determine whether the slice can live entirely on the stack (local to the function) or must be promoted to the heap (because it outlives the function, is passed to a goroutine, etc.).
```go
func stackExample() []int {
	local := make([]int, 8) // stays on the stack: never leaves this function
	sum := 0
	for i := range local {
		local[i] = i
		sum += local[i]
	}
	escaped := make([]int, 1) // escapes to the heap: returned to the caller
	escaped[0] = sum
	return escaped
}
```
You can check escape decisions with:
```sh
go build -gcflags="-m" ./...
```
3. Go 1.25: Variable-Sized Slice Optimization
Prior to Go 1.25, a slice created with a non-constant capacity was always heap-allocated, even if the actual size was tiny:
```go
func process(lengthGuess int) {
	// Before Go 1.25: non-constant size → always heap-allocated,
	// even when lengthGuess is tiny at runtime.
	buf := make([]byte, lengthGuess)
	for i := range buf {
		buf[i] = byte(i)
	}
}
```
Go 1.25 introduced a clever optimization: the compiler allocates a small 32-byte backing store on the stack speculatively. At runtime:
- If lengthGuess * sizeof(T) fits in 32 bytes → use the stack buffer, zero heap allocations
- If it's larger → fall back to a regular heap allocation
```go
func process(lengthGuess int) {
	// Go 1.25: the compiler speculatively reserves a 32-byte stack buffer.
	// If lengthGuess fits, buf's backing store is that stack buffer;
	// otherwise the runtime falls back to a regular heap allocation.
	buf := make([]byte, lengthGuess)
	for i := range buf {
		buf[i] = byte(i)
	}
}
```
For the common case of small dynamic slices — processing a handful of items, building short request batches — this eliminates allocation entirely.
4. Go 1.26: Append-Site Stack Allocation
Go 1.26 extends this idea to the most common slice growth pattern: append-based accumulation.
Case 1: Non-escaping slices
```go
func processLocal(c chan int) {
	var result []int
	for v := range c {
		result = append(result, v)
	}
	// result never leaves this function: it does not escape.
	fmt.Println(len(result))
}
```
Before Go 1.26, the very first append call would heap-allocate a backing array of capacity 1, then 2, then 4, then 8 — the standard doubling pattern. Each growth step is a separate heap allocation.
Go 1.26 allocates a small stack-based backing store before the loop begins, so the first several appends use the stack buffer with zero heap involvement.
Case 2: Escaping slices (the surprising one)
```go
func extract(c chan int) []int {
	var result []int
	for v := range c {
		result = append(result, v)
	}
	return result // escapes: the slice outlives the function
}
```
Even when the slice must eventually escape to the heap (because we return it), Go 1.26 still uses a stack buffer during accumulation. The compiler inserts a call to runtime.move2heap() that copies the final data to the heap only once, when returning.
```go
// Conceptually, the compiler transforms the above to:
func extract(c chan int) []int {
	var stackBuf [N]int // small fixed-size stack backing store
	result := stackBuf[:0]
	for v := range c {
		result = append(result, v) // spills to the heap only if N is exceeded
	}
	return runtime.move2heap(result) // one copy to the heap, at return
}
```
The key insight: instead of 3+ startup heap allocations (size 1, 2, 4…), you get exactly 1 heap allocation at the end, and only if the data actually exceeds the stack buffer.
5. Benchmark: Better Than Hand-Optimized
The Go blog notes that these optimizations can actually outperform manually optimized code. Here’s why:
If you pre-allocate with a fixed capacity to avoid the append growth pattern:
```go
func manualOpt(c chan int, hint int) []int {
	result := make([]int, 0, hint) // pre-allocate based on a capacity guess
	for v := range c {
		result = append(result, v)
	}
	return result
}
```
When hint is larger than needed, you’ve over-allocated. When it’s smaller, you still get growth copies. The Go 1.26 stack optimization handles both cases more efficiently: it starts on the stack and only pays the heap cost for what’s actually needed.
```go
// run benchmarks to compare:
//   go test -bench=. -benchmem
func BenchmarkExtract(b *testing.B) {
	for i := 0; i < b.N; i++ {
		c := make(chan int, 8)
		for j := 0; j < 8; j++ {
			c <- j
		}
		close(c)
		_ = extract(c)
	}
}
```
6. Opting Out
If you encounter issues with these optimizations (rare, but possible in edge cases with unsafe code):
```sh
# Disable Go 1.25 variable-make optimization
go build -gcflags=all=-d=variablemakehash=n ./...
```
Note that disabling these optimizations should only be necessary if you’re doing something unusual with unsafe. For typical Go code, the compiler’s decisions are correct.
7. Conclusion
Go 1.25 and 1.26 bring meaningful, zero-effort performance improvements to one of the most common patterns in Go code: slice accumulation. By speculatively allocating on the stack and delaying heap promotion, the compiler eliminates multiple early heap allocations — sometimes outperforming even carefully hand-optimized code.
You don’t need to change a single line of code to benefit. Just upgrade to Go 1.25 or 1.26 and let the compiler do its job.
Have you profiled your Go applications and found allocation hotspots in slice-heavy code? I’d love to hear what patterns you’ve found most impactful!
More in the “You Should Know In Golang” series:
https://wesley-wei.medium.com/list/you-should-know-in-golang-e9491363cd9a