TL;DR
Ken Jin reports that an experimental tail-calling interpreter produces measurable speedups for CPython on several platforms. On Windows x86-64 with an internal MSVC build and Visual Studio 2026, pyperformance geometric-mean results indicate roughly a 15% improvement over the switch-case interpreter.
What happened
In a December 2025 blog post, Ken Jin described performance experiments showing that a tail-calling interpreter design improves CPython throughput on some platforms. On macOS AArch64 (Xcode Clang), the tail-calling interpreter beat the computed-goto implementation by about 5% on pyperformance. On Windows x86-64, tests with an experimental internal MSVC build and with Visual Studio 2026 showed a geometric-mean speedup near 15% over the traditional switch-case interpreter; individual benchmarks ranged from slowdowns in a few outliers to speedups of up to 78%. The CPython 3.15 'What’s New' notes now mention that builds using Visual Studio 2026 may opt into the new tail-calling interpreter. Credited contributors include Chris Eibl, Brandt Bucher and the MSVC team. Jin cautions that the MSVC features used are undocumented and that the change could be reverted during development.
Why it matters
- Faster interpreter loops can reduce runtime for many pure-Python workloads on Windows x86-64 without rewriting code.
- If broadly adopted, the change could shift performance expectations for CPython 3.15 builds compiled with Visual Studio 2026.
- The change highlights how compiler behavior and large interpreter functions interact and can affect real-world performance.
- MSVC implementation details are currently experimental and undocumented, so long-term availability is not guaranteed.
Key facts
- Author: Ken Jin, blog post published 24 December 2025.
- Two platform comparisons: a ~5% geomean speedup on macOS AArch64 (Xcode Clang) versus computed gotos, and a ~15% geomean speedup on Windows x86-64 (MSVC) versus switch-case.
- Benchmarks were run using pyperformance; reported Windows sample improvements include spectralnorm (1.48x), nbody (1.35x), bm_django_template (1.18x), xdsl (1.14x) on a Visual Studio 2026 build.
- Tail-calling interpreters implement each bytecode handler as a function and use tail calls to transfer control to the next handler.
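- Computed gotos (labels-as-values) and switch-case are alternative interpreter dispatch methods; computed gotos historically needed fewer jumps per instruction, because each handler jumps straight to the next one instead of returning to a central switch.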
- Clang’s __attribute__((musttail)) enforces a tail call at compile time; this kind of mechanism made tail-calling interpreters practical on some toolchains.
- The Windows tail-calling results cited were produced with an experimental internal MSVC compiler; the MSVC features used are described as undocumented.
- CPython 3.15 'What’s New' now notes that builds using Visual Studio 2026 may use the tail-calling interpreter and reports early speedup figures.
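The tail-calling design and the role of musttail described above can be sketched in a few lines of C. This is a toy three-opcode stack machine, not CPython's interpreter: the opcodes, the `VM` struct, the `MUSTTAIL` and `DISPATCH` macros and `run_tailcall` are all hypothetical names made up for illustration. The attribute is applied only where the compiler reports support for it; elsewhere the sketch falls back to a plain `return`, which leaves tail-call elimination to the optimizer with no guarantee.

```c
#include <assert.h>

/* Use musttail where the compiler reports support for it (e.g. recent Clang).
   Otherwise fall back to a plain return, which does NOT guarantee a tail call. */
#if defined(__has_attribute)
#  if __has_attribute(musttail)
#    define MUSTTAIL __attribute__((musttail))
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL
#endif

/* Hypothetical toy bytecode: PUSH <imm>, ADD, HALT. */
enum { OP_PUSH, OP_ADD, OP_HALT, OP_COUNT };

typedef struct {
    int stack[16];
    int sp;          /* number of values currently on the stack */
} VM;

/* Every handler shares one signature, which musttail requires. */
typedef int (*Handler)(VM *vm, const unsigned char *pc);

static int op_push(VM *vm, const unsigned char *pc);
static int op_add(VM *vm, const unsigned char *pc);
static int op_halt(VM *vm, const unsigned char *pc);

static const Handler handlers[OP_COUNT] = { op_push, op_add, op_halt };

/* Transfer control to the handler for the opcode at pc via a tail call. */
#define DISPATCH(vm, pc) MUSTTAIL return handlers[*(pc)]((vm), (pc))

static int op_push(VM *vm, const unsigned char *pc) {
    assert(vm->sp < 16);
    vm->stack[vm->sp++] = pc[1];   /* operand follows the opcode */
    pc += 2;
    DISPATCH(vm, pc);
}

static int op_add(VM *vm, const unsigned char *pc) {
    assert(vm->sp >= 2);
    int rhs = vm->stack[--vm->sp];
    vm->stack[vm->sp - 1] += rhs;
    pc += 1;
    DISPATCH(vm, pc);
}

static int op_halt(VM *vm, const unsigned char *pc) {
    (void)pc;
    assert(vm->sp >= 1);
    return vm->stack[vm->sp - 1];  /* result is the top of the stack */
}

/* Entry point: start at the first opcode and let handlers chain onward. */
int run_tailcall(const unsigned char *code) {
    VM vm = { {0}, 0 };
    return handlers[code[0]](&vm, code);
}
```

Calling `run_tailcall` on the program `{ OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_HALT }` pushes two values, adds them and returns the sum. Because each handler is a small separate function, the compiler optimizes it in isolation rather than as part of one very large interpreter function.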
What to watch next
- Whether the tail-calling interpreter remains enabled in final CPython 3.15 builds (the author notes the result assumes the change is not reverted).
- Stability and official support for the undocumented MSVC features relied upon—these features may change or be removed (not confirmed in the source).
- How broadly the Visual Studio 2026 / MSVC 18 toolchain is adopted for Windows CPython builds, and whether downstream distributors enable the interpreter by default (not confirmed in the source).
Quick glossary
- Tail call / tail-calling: A function call performed as the final action of another function so control can be transferred without growing the call stack; in C, compilers may optimize such calls into jumps.
- Computed goto (labels-as-values): A GCC/Clang extension where code jumps to an address held in a table of labels, often used to implement fast interpreter dispatch.
- Switch-case interpreter: A dispatch technique where a central switch statement selects the handler for each bytecode opcode.
- pyperformance: A benchmark suite commonly used to measure Python interpreter performance across a variety of workloads.
- MSVC: Microsoft Visual C++, the C/C++ compiler and toolchain provided with Visual Studio.
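To make the two older dispatch styles in the glossary concrete, here is a minimal sketch of the same toy stack machine written both ways. The opcodes and the function names `run_switch` and `run_goto` are hypothetical and exist only for this illustration; CPython's real loop is far larger. The computed-goto version is guarded by `__GNUC__` on the assumption that only GCC-compatible compilers accept the labels-as-values extension.

```c
#include <assert.h>

/* Hypothetical toy bytecode: PUSH <imm>, ADD, HALT. */
enum { OP_PUSH, OP_ADD, OP_HALT };

/* Switch-case dispatch: every instruction passes through one central switch. */
int run_switch(const unsigned char *code) {
    int stack[16];
    int sp = 0;
    const unsigned char *pc = code;
    for (;;) {
        switch (*pc) {
        case OP_PUSH:
            assert(sp < 16);
            stack[sp++] = pc[1];
            pc += 2;
            break;                 /* back to the top of the loop */
        case OP_ADD:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] += stack[sp];
            pc += 1;
            break;
        case OP_HALT:
            assert(sp >= 1);
            return stack[sp - 1];
        default:
            return -1;             /* unknown opcode */
        }
    }
}

#if defined(__GNUC__)  /* defined by GCC and Clang */
/* Computed-goto dispatch: each handler jumps directly to the next handler
   through a table of label addresses, with no central switch. */
int run_goto(const unsigned char *code) {
    static void *const targets[] = { &&do_push, &&do_add, &&do_halt };
    int stack[16];
    int sp = 0;
    const unsigned char *pc = code;

    goto *targets[*pc];
do_push:
    assert(sp < 16);
    stack[sp++] = pc[1];
    pc += 2;
    goto *targets[*pc];
do_add:
    assert(sp >= 2);
    sp--;
    stack[sp - 1] += stack[sp];
    pc += 1;
    goto *targets[*pc];
do_halt:
    assert(sp >= 1);
    return stack[sp - 1];
}
#endif
```

Both functions return the same result for the same program; the difference is only in how control reaches the next handler. Unlike `run_switch`, the `run_goto` sketch does not validate opcodes, so it assumes well-formed input.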
Reader FAQ
Will all Windows users see a 15% speedup in Python 3.15?
The source reports a roughly 15% geometric-mean speedup for builds using an experimental MSVC and Visual Studio 2026; results may vary by workload and compiler, and are not guaranteed for all distributions.
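As a worked illustration of why a geometric-mean figure does not apply uniformly, the geometric mean G of n per-benchmark speedups s_i is defined below. Plugging in only the four Windows sample figures listed under Key facts gives about 1.28x. This is an aggregate of a highlighted subset, not of the full pyperformance suite, so it is not the source's 15% figure.

```latex
\[
G = \Bigl(\prod_{i=1}^{n} s_i\Bigr)^{1/n},
\qquad
G_{\text{sample}} = (1.48 \times 1.35 \times 1.18 \times 1.14)^{1/4} \approx 1.28
\]
```

Individual benchmarks sit both above and below any such aggregate, which is why a given workload may see more or less than the headline number.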
Is the tail-calling interpreter already enabled in CPython 3.15 releases?
CPython 3.15 'What’s New' notes that builds using Visual Studio 2026 may now use the tail-calling interpreter, but the author also warns the change could be reverted during development.
Does this rely on documented MSVC features?
No. The post cautions that the MSVC features used are, to the author’s knowledge, undocumented and not guaranteed to persist.
Did the author confirm this on macOS as well?
Yes — the post reports about a 5% geomean speedup on macOS AArch64 using Xcode Clang versus computed gotos.
Sources
- Python 3.15’s interpreter for Windows x86-64 should hopefully be 15% faster
- What's new in Python 3.15
- Python Releases for Windows
- Despite 30 months work, core developer says Python's JIT …