TL;DR
The author implemented a small x86-64 just-in-time compiler that turns simple recurrence operations into native machine code instead of interpreting them. The piece explains allocating executable memory, handling platform differences, assembling machine opcodes, and calling the generated function from C.
What happened
Faced with a programming challenge that evaluates recurrence relations (u(n+1) = f(u(n))) expressed as a short sequence of arithmetic operations, the author built a tiny JIT that emits x86-64 machine code for the sequence and executes it directly. Rather than repeatedly interpreting operators, the program writes instruction bytes into a page-sized buffer allocated with mmap (or VirtualAlloc on Windows), changes the page protection from read+write to read+execute with mprotect (or VirtualProtect), casts the buffer to a function pointer, and invokes it. The write-up covers the System V AMD64 ABI (first argument in rdi, return value in rax) and notes the difference in the Windows convention (first argument in rcx). The author also shows how to derive raw opcode bytes by assembling short snippets with nasm and disassembling them with ndisasm, and demonstrates emitting code for add, sub, imul, and idiv.
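For reference, the interpreted baseline that the JIT replaces can be sketched in C as follows. This is a minimal sketch and not the author's code: the function name recurrence_step and the space-separated operator string format (for example "+2 *3 -5") are assumptions made for illustration.

```c
#include <stdlib.h>

/* Hypothetical helper: apply one pass of an operator string such as
 * "+2 *3 -5" to u and return the next term of the recurrence.
 * Parsing stops at the first malformed or unknown operator. */
long recurrence_step(const char *ops, long u)
{
    const char *p = ops;
    while (*p) {
        while (*p == ' ')
            p++;                      /* skip separators */
        if (*p == '\0')
            break;
        char op = *p++;               /* one of + - * / */
        char *end;
        long operand = strtol(p, &end, 10);
        if (end == p)
            break;                    /* operator without an operand */
        p = end;
        switch (op) {
        case '+': u += operand; break;
        case '-': u -= operand; break;
        case '*': u *= operand; break;
        case '/': if (operand != 0) u /= operand; break;
        default:  return u;           /* unknown operator: stop */
        }
    }
    return u;
}
```

Every call re-parses the string and branches on each operator. That per-term overhead is what compiling the sequence once into native code is meant to remove.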
Why it matters
- JIT compilation can convert interpreted loops into native code, potentially reducing runtime overhead.
- Page-level protections and the W^X policy require careful switching between writable and executable memory, which impacts JIT design and security.
- Platform calling conventions affect the generated entry code; portability requires handling OS-specific ABI differences.
- Learning to emit machine code directly demystifies low-level execution and can inform more advanced compiler or VM work.
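The W^X and calling-convention points can be illustrated with a minimal POSIX sketch. It assumes an x86-64 System V target; the names jit_fn, load_code, and run_identity are hypothetical and do not come from the article. The page is mapped read+write, filled, and only then switched to read+execute, so it is never writable and executable at the same time.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc in strict modes */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

typedef long (*jit_fn)(long);

/* Hypothetical helper: copy code into a fresh page, then replace write
 * permission with execute permission. Returns NULL on failure. */
jit_fn load_code(const unsigned char *code, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || len > (size_t)page)
        return NULL;
    void *buf = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return NULL;
    memcpy(buf, code, len);                      /* page is writable here */
    if (mprotect(buf, (size_t)page, PROT_READ | PROT_EXEC) != 0) {
        munmap(buf, (size_t)page);
        return NULL;
    }
    /* Object-to-function pointer cast: outside ISO C, but it works on
     * POSIX systems. */
    return (jit_fn)buf;
}

/* Runs a generated identity function: mov rax, rdi ; ret.
 * On other targets, or if the page cannot be set up, it simply
 * returns x without executing generated code. */
long run_identity(long x)
{
#if defined(__x86_64__)
    static const unsigned char code[] = { 0x48, 0x89, 0xF8, 0xC3 };
    jit_fn f = load_code(code, sizeof code);
    return f ? f(x) : x;
#else
    return x;
#endif
}
```

Under the Windows x64 convention the argument would arrive in rcx, so the first instruction would read from rcx instead, and VirtualAlloc and VirtualProtect would take the place of mmap and mprotect. The sketch never unmaps the page, which a longer-lived program would need to do.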
Key facts
- The example recurrence expresses operations as a sequence like "+2 *3 -5" applied repeatedly to produce successive terms.
- The code buffer is allocated with mmap on POSIX and VirtualAlloc on Windows; in both cases the example requests pages that start out writable but not executable.
- After filling the buffer, protections are changed to read+execute via mprotect (POSIX) or VirtualProtect (Win32) to comply with W^X.
- The minimal emitted function conforms to the long recurrence(long) prototype: input arrives in rdi, result returned in rax under the System V AMD64 ABI.
- The article demonstrates extracting machine-code byte sequences by assembling snippets with nasm and disassembling with ndisasm, then inserting those bytes into the buffer.
- Supported integer operators shown include +, -, *, and /, with specific instruction sequences for add, sub, imul, and idiv (plus zeroing rdx before division).
- On Windows the first argument is in rcx instead of rdi, which only changes the initial move instruction emitted.
- The generated code is invoked by casting the buffer to a function pointer and calling it; the author notes the approach is intentionally simple and constrained.
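The emission step described in the facts above can be sketched as a function that writes the instruction bytes into a caller-supplied buffer. The name emit_recurrence, its parameters, and the operator string format are assumptions for illustration. The encodings are the standard x86-64 ones for the instructions the article names, with each operand first loaded into rdi via mov rdi, imm64.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

static void put(unsigned char *out, size_t *len, const unsigned char *b, size_t n)
{
    memcpy(out + *len, b, n);
    *len += n;
}

/* Hypothetical emitter: translate an operator string such as "+2 *3 -5"
 * into x86-64 code for long f(long). Returns the number of bytes written,
 * or 0 on a malformed input, a zero divisor, or insufficient capacity.
 * win64 != 0 selects rcx as the argument register instead of rdi. */
size_t emit_recurrence(const char *ops, unsigned char *out, size_t cap, int win64)
{
    static const unsigned char MOV_SYSV[] = { 0x48, 0x89, 0xF8 };       /* mov rax, rdi  */
    static const unsigned char MOV_WIN[]  = { 0x48, 0x89, 0xC8 };       /* mov rax, rcx  */
    static const unsigned char MOV_IMM[]  = { 0x48, 0xBF };             /* mov rdi, imm64 */
    static const unsigned char ADD[]      = { 0x48, 0x01, 0xF8 };       /* add rax, rdi  */
    static const unsigned char SUB[]      = { 0x48, 0x29, 0xF8 };       /* sub rax, rdi  */
    static const unsigned char IMUL[]     = { 0x48, 0x0F, 0xAF, 0xC7 }; /* imul rax, rdi */
    static const unsigned char XOR_RDX[]  = { 0x48, 0x31, 0xD2 };       /* xor rdx, rdx  */
    static const unsigned char IDIV[]     = { 0x48, 0xF7, 0xFF };       /* idiv rdi      */
    size_t len = 0;
    const char *p = ops;

    if (cap < 4)
        return 0;                                 /* entry move + ret */
    put(out, &len, win64 ? MOV_WIN : MOV_SYSV, 3);
    for (;;) {
        while (*p == ' ')
            p++;
        if (*p == '\0')
            break;
        char op = *p++;
        char *end;
        long long operand = strtoll(p, &end, 10);
        /* Worst case per operator: 10 (mov) + 6 (xor + idiv) + 1 (final ret). */
        if (end == p || len + 17 > cap)
            return 0;
        p = end;
        put(out, &len, MOV_IMM, 2);
        for (int i = 0; i < 8; i++)               /* little-endian immediate */
            out[len++] = (unsigned char)((unsigned long long)operand >> (8 * i));
        switch (op) {
        case '+': put(out, &len, ADD, 3);  break;
        case '-': put(out, &len, SUB, 3);  break;
        case '*': put(out, &len, IMUL, 4); break;
        case '/':
            if (operand == 0)
                return 0;                         /* idiv by zero would fault */
            /* Zeroing rdx, as in the article, is only correct for
             * non-negative dividends; cqo would sign-extend rax. */
            put(out, &len, XOR_RDX, 3);
            put(out, &len, IDIV, 3);
            break;
        default:
            return 0;
        }
    }
    out[len++] = 0xC3;                            /* ret */
    return len;
}
```

The bytes produced here would still need to be copied into a page that is then made executable and called through a long (*)(long) function pointer. Keeping emission separate from execution also makes the encoder checkable on any host by comparing bytes.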
What to watch next
- The author suggests extending the JIT to support modulus, XOR, and bit shifts to enable more complex recurrences (e.g., PRNGs).
- A follow-up challenge mentioned uses Reverse Polish notation and the author wrote another JIT for that variant.
- Switching the implementation to floating-point arithmetic is proposed as an alternative direction.
Quick glossary
- Just-In-Time (JIT) compiler: A system that generates and executes native machine code at runtime instead of interpreting source or bytecode.
- mmap / VirtualAlloc: OS calls that allocate page-aligned memory regions with caller-specified protections; mmap is used on POSIX systems and VirtualAlloc on Windows.
- W^X: A security policy under which a memory page may be writable or executable, but never both at the same time, which limits code-injection attacks.
- Calling convention: Platform rules that determine how function arguments are passed (which registers or stack locations) and where return values appear.
- mprotect / VirtualProtect: OS calls that change the permissions of existing memory pages, for example to make a page executable after code has been written into it.
Reader FAQ
Will this approach run on Windows?
Yes; the article shows equivalent allocation and protection calls using VirtualAlloc and VirtualProtect and notes the first-argument register difference (rcx).
Does the JIT support branching and complex control flow?
Not in this simplified example; the article explicitly describes a straightforward, branch-free sequence and stops before adding branching or intermediate-value management.
Can the same technique be used for floating-point operations?
The author mentions switching to floating point as an alternative, but does not provide an implementation in this article.
Is this production-ready and safe to enable on all platforms?
Not confirmed in the source.
A Basic Just-In-Time Compiler, March 19, 2015. This article was discussed on Hacker News and on reddit.
Sources
- A Basic Just-In-Time Compiler (2015)
- What does a just-in-time (JIT) compiler do?
- Verified Just-In-Time Compiler on x86
- spencertipping/jit-tutorial: How to write a very simple JIT …