TL;DR

A bachelor thesis describes a technique that uses a self-reference in the x86 page-table root to expose page tables in the virtual address space, simplifying page-table management. The work was ported to the teaching OS eduOS and submitted (but not accepted) to the ASPLOS student research competition.

What happened

In a bachelor thesis and an accompanying extended abstract, the author presents an implementation technique for x86 paging that inserts a self-reference into the root page table (PML4 in 64-bit mode, PGD in 32-bit mode). When a virtual address selects the self-referencing entry, the MMU's walk passes through the root table a second time, so the walk is effectively shifted by one level and the access lands on a page table rather than on an ordinary page frame. The operating system can therefore read and modify its page tables directly through the virtual address space, without mapping each table by hand. The implementation is reported as compatible with both 32-bit protected mode and 64-bit long mode on Intel x86, and the author ported code from an earlier research kernel to the open-source teaching OS eduOS. The submission to the ASPLOS student research competition was not accepted; the author published the extended abstract and made the full thesis and slides available, along with notes on implementation choices and practical constraints.
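
The effect of the self-reference can be illustrated with a small software model of a four-level walk. This is a sketch for illustration only, not code from the thesis or from eduOS: the names `walk`, `compose`, `entry`, and `frames`, the frame numbers, and the choice of the last root slot are all assumptions, and canonical sign extension of addresses is ignored.

```python
PAGE_SHIFT = 12              # 4 KiB pages
INDEX_BITS = 9               # 512 entries per table (x86-64 style)
ENTRIES = 1 << INDEX_BITS
PRESENT = 0x1
SELF_INDEX = ENTRIES - 1     # assumption: self-reference lives in the last root slot


def entry(pfn):
    """Build a present table entry pointing at physical frame `pfn`."""
    return (pfn << PAGE_SHIFT) | PRESENT


def compose(i4, i3, i2, i1, offset=0):
    """Assemble a virtual address from four table indices, root index first."""
    addr = offset
    for level, idx in enumerate((i1, i2, i3, i4)):
        addr |= idx << (PAGE_SHIFT + INDEX_BITS * level)
    return addr


def walk(frames, root_pfn, vaddr):
    """Model the hardware walk: four lookups, return the PFN finally reached."""
    pfn = root_pfn
    for level in (3, 2, 1, 0):
        idx = (vaddr >> (PAGE_SHIFT + INDEX_BITS * level)) & (ENTRIES - 1)
        e = frames[pfn][idx]
        if not e & PRESENT:
            raise LookupError(f"page fault at level {level + 1}")
        pfn = e >> PAGE_SHIFT
    return pfn


# Hypothetical physical frame numbers for one mapping chain.
ROOT, PDPT, PD, PT, DATA = 1, 2, 3, 4, 0x99
frames = {pfn: [0] * ENTRIES for pfn in (ROOT, PDPT, PD, PT)}
frames[ROOT][0] = entry(PDPT)
frames[PDPT][1] = entry(PD)
frames[PD][2] = entry(PT)
frames[PT][3] = entry(DATA)
frames[ROOT][SELF_INDEX] = entry(ROOT)   # the self-reference

data_vaddr = compose(0, 1, 2, 3)
# An ordinary address reaches the data frame.
print(hex(walk(frames, ROOT, data_vaddr)))
# Prefixing the indices with SELF_INDEX stops the walk one level early:
# the "page" reached is the page table that maps data_vaddr.
print(walk(frames, ROOT, compose(SELF_INDEX, 0, 1, 2)))
# Four self-references reach the root table itself.
print(walk(frames, ROOT, compose(SELF_INDEX, SELF_INDEX, SELF_INDEX, SELF_INDEX)))
```

In this model every use of the self-referencing index spends one lookup on the root table, so each additional self-reference exposes a table one level higher.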

Why it matters

  • Reduces the bookkeeping required to map page tables into the virtual address space, simplifying kernel code for page-table manipulation.
  • According to the thesis, enables a single codebase to support both 32-bit and 64-bit x86 paging, which may lower maintenance cost for teaching or research OSes.
  • Reclaims physical memory otherwise used for manual mappings and reduces complexity when allocating or accessing page tables.
  • Reserves only a small, fixed fraction of the virtual address space for page-table access (negligible compared to the full VAS), according to the thesis.
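
The size of the reserved region follows from the paging geometry: one root-table slot covers the total virtual address space divided by the number of root entries. The short calculation below is a sketch under stated assumptions (4 KiB pages, 48-bit virtual addresses with four 9-bit levels for 64-bit mode, and non-PAE paging with two 10-bit levels for 32-bit mode); the function name `self_map_region` is hypothetical.

```python
GIB = 1 << 30
MIB = 1 << 20


def self_map_region(index_bits, levels, page_shift=12):
    """Return (reserved_bytes, total_bytes) when one root slot is self-referencing."""
    total = 1 << (page_shift + index_bits * levels)   # whole virtual address space
    reserved = total >> index_bits                    # one of 2**index_bits root slots
    return reserved, total


r64, t64 = self_map_region(index_bits=9, levels=4)    # assumed x86-64 geometry
r32, t32 = self_map_region(index_bits=10, levels=2)   # assumed x86-32 geometry

print(r64 // GIB, "GiB reserved, 1 /", t64 // r64, "of the address space")
print(r32 // MIB, "MiB reserved, 1 /", t32 // r32, "of the address space")
```

Under these assumptions the reserved share is 1/512 of the 64-bit address space and 1/1024 of the 32-bit one, which is consistent with the sizes the author gives.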

Key facts

  • The technique adds a self-reference entry in the root page table (PML4 on x86-64, PGD on x86-32).
  • A self-reference shifts the MMU page-table walk by one level, so the final entry resolved holds the PFN of a page table rather than of a regular page frame.
  • Two manipulation strategies are discussed: top-down traversal from the root, and bottom-up creation via the page-fault handler.
  • The thesis reports compatibility with both 32-bit protected mode and 64-bit long mode of x86, allowing a unified implementation.
  • Requirements for the approach include homogeneous encoding of paging flags across levels and equal table sizes; the author states x86 meets these prerequisites.
  • Using the last root-table entry (the 512th PML4 entry on x86-64) for the self-reference reserves one root slot's worth of virtual address space; the author gives its size as 512 GiB for 64-bit and 4 MiB for 32-bit.
  • The author ported most of the code to the open-source teaching OS eduOS and notes more documentation will follow in a separate post.
  • Intel and AMD manuals do not document this self-referencing technique, and the author cites limited prior public references; Microsoft may have used a similar approach per a 2010 reference.
  • Linux cannot benefit from this method in general, the author argues, because its paging implementation must support a range of virtual memory architectures that do not all meet the needed prerequisites.
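
The facts above imply a simple rule for locating any table entry: shift the virtual address right by one index width per level and insert the self-referencing index at the top each time. The sketch below illustrates that rule for both modes from one function. It is not taken from the thesis or eduOS; `PagingMode`, `entry_vaddr`, and `sign_extend` are hypothetical names, and the geometry (4 KiB pages, 48-bit addresses in long mode, non-PAE 32-bit paging, self-reference in the last root slot) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PagingMode:
    levels: int        # number of table levels
    index_bits: int    # bits per table index
    entry_size: int    # bytes per table entry
    word_bits: int     # width of a pointer
    page_shift: int = 12

    @property
    def va_bits(self):
        return self.page_shift + self.levels * self.index_bits


X86_64 = PagingMode(levels=4, index_bits=9, entry_size=8, word_bits=64)
X86_32 = PagingMode(levels=2, index_bits=10, entry_size=4, word_bits=32)


def sign_extend(addr, va_bits, word_bits):
    """Copy the top implemented address bit into the unused upper bits."""
    if (addr >> (va_bits - 1)) & 1:
        addr |= ((1 << word_bits) - 1) ^ ((1 << va_bits) - 1)
    return addr


def entry_vaddr(mode, vaddr, level):
    """Virtual address of the level-`level` entry that translates `vaddr`.

    Level 1 is the lowest table; level == mode.levels is the root table.
    """
    if not 1 <= level <= mode.levels:
        raise ValueError("level out of range")
    self_index = (1 << mode.index_bits) - 1          # assumption: last root slot
    top_shift = mode.va_bits - mode.index_bits
    addr = vaddr & ((1 << mode.va_bits) - 1)
    for _ in range(level):
        addr = (addr >> mode.index_bits) | (self_index << top_shift)
    addr &= ~(mode.entry_size - 1)                   # align to an entry boundary
    return sign_extend(addr, mode.va_bits, mode.word_bits)


# Top-down order: root entry first, lowest-level entry last.
for mode in (X86_64, X86_32):
    for level in range(mode.levels, 0, -1):
        print(mode.word_bits, level, hex(entry_vaddr(mode, 0x40403000, level)))
```

The final alignment step works because a table of `2**index_bits` entries of `entry_size` bytes fills exactly one page in both modes, which is the equal-table-size prerequisite named above. A bottom-up variant would instead touch the lowest-level address first and let the page-fault handler create missing tables.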

What to watch next

  • The full thesis and slides, which the author has made available, and any follow-up blog posts about the eduOS port that the author has announced.
  • Further write-ups or code demonstrating the eduOS port and a practical walkthrough of the implementation details.

Quick glossary

  • Page table: A data structure used by the OS and MMU to translate virtual page numbers to physical frame numbers.
  • MMU (Memory Management Unit): Hardware that performs address translation and access control for virtual memory.
  • PML4 / PGD: Names for the top-level page-table directory structures in x86-64 (PML4) and x86-32 (PGD) used during page-table walks.
  • PFN (Page Frame Number): The index of a physical page frame in RAM (its physical address divided by the page size), as stored in page-table entries.

Reader FAQ

Was this work accepted at ASPLOS?
No. According to the author, the submission to the ASPLOS student research competition was rejected.

Does the technique work on both 32-bit and 64-bit x86?
The thesis reports compatibility with both 32-bit protected mode and 64-bit long mode of Intel x86.

Do Intel or AMD document this self-referencing trick?
The author notes that Intel and AMD do not mention the technique in their x86 manuals.

Can Linux use this approach?
According to the author, Linux cannot benefit from it in general, because its paging code must support a variety of virtual memory architectures, not all of which meet the prerequisites.

Bachelor Thesis: Extended Abstract Almost fourteen months ago, I started working on my bachelor thesis. Although I finished it half a year ago, it’s still part of my work as…
