TL;DR

The lead maintainer of LLVM published a critique listing persistent design and process problems across the project, from reviewer shortages and API churn to slow builds and flaky CI. Some long-standing IR issues have been resolved or are in progress, but many operational and testing gaps remain.

What happened

In a reflective post, the lead maintainer of the LLVM project cataloged structural and technical problems that affect contributors and downstream users. The author notes that one long-standing IR problem has been fixed (opaque pointers) and two others are progressing (constant expression removal is mostly done; the ptradd migration is underway), but broader issues persist.

Key operational pain points include limited qualified code review capacity, frequent API and IR churn that burdens integrators, long build times for the large C++ codebase, and a flaky post-commit CI fleet of more than 200 buildbots that is often not fully green. The post also highlights weak end-to-end and executable test coverage: llvm-test-suite is used separately from routine development and lacks comprehensive coverage across data types and target combinations. Backends are also diverging, driven by target-specific fixes.

The author suggests several potential improvements (for example, better reviewer assignment, precompiled headers, and reducing test overhead) but focuses mainly on describing the problems rather than prescribing detailed solutions.

Why it matters

  • Insufficient review capacity can delay contributor progress and allow low-quality changes to land, harming code health.
  • Ongoing API and IR churn forces downstream integrators to continually adapt, increasing maintenance costs.
  • Slow builds and an unpredictable CI signal slow development turnaround and obscure real regressions.
  • Lack of comprehensive end-to-end and executable tests increases the risk of regressions from pass interactions and backend-specific fixes.
  • Backend divergence leads to duplicated effort and inconsistent behavior across targets.

Key facts

  • Author: the lead maintainer of the LLVM project.
  • Some IR issues have been addressed: opaque pointers migration is complete, constant expression removal is mostly done, and ptradd migration is well on the way.
  • LLVM codebase size: the LLVM component exceeds 2.5 million lines of C++; the full monorepo is about 9 million lines.
  • CI: LLVM operates over 200 post-commit buildbots; the system is often not fully green and exhibits flakiness.
  • Commit volume: more than 150 commits land on a typical workday, which makes it hard for CI to keep up.
  • Contribution model: authors must request reviewers; many more people contribute code than perform qualified reviews.
  • Build-time mitigations mentioned include precompiled headers and moving to a dylib build by default to reduce link times and disk usage for debug builds.
  • Executable tests live in a separate llvm-test-suite repo, which is typically not used during routine development and has fewer tests than the lit-based suite.
  • Compile time: compile times are especially poor at -O0, where compilation remains expensive even though optimizations are disabled.
  • There is no single official LLVM performance-tracking infrastructure according to the author; many organizations track performance downstream independently.
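
To see why a fleet of this size is rarely fully green, a rough back-of-the-envelope calculation helps. The sketch below is illustrative only: the per-bot spurious failure rates, the assumption that bots fail independently, and the 2-hour build time are all hypothetical and do not come from the post. Only the figures of 200 bots and 150 commits per day are taken from the facts above.

```python
# Illustrative arithmetic only. The flake rates, the independence assumption,
# and the build duration are hypothetical, not figures from the post.


def prob_all_green(num_bots: int, spurious_failure_rate: float) -> float:
    """Chance that every bot passes a bug-free commit, assuming each bot
    fails spuriously and independently with the given probability."""
    return (1.0 - spurious_failure_rate) ** num_bots


def commits_per_build(commits_per_day: float, build_hours: float,
                      active_hours_per_day: float = 24.0) -> float:
    """Average number of commits a bot picks up in one build cycle,
    assuming commits arrive evenly over the active hours."""
    return commits_per_day / active_hours_per_day * build_hours


if __name__ == "__main__":
    for rate in (0.001, 0.01, 0.05):
        p = prob_all_green(200, rate)
        print(f"flake rate {rate:.1%}: P(all 200 bots green) = {p:.3f}")
    print(f"commits per 2-hour build: {commits_per_build(150, 2.0):.1f}")
```

Under these assumptions, even a 1% spurious failure rate per bot leaves only about a 13% chance that all 200 bots are green for a given change, and a single slow build cycle covers a dozen or so commits, so attributing a failure to one change takes extra work.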

What to watch next

  • Progress on the ptradd migration and any follow-up IR cleanup (author reports migration is well underway).
  • Whether LLVM adopts a Rust-style PR assignment system to reduce reviewer discovery friction — not confirmed in the source.
  • Rollout and impact of precompiled headers and a default dylib build to improve build and debug-time performance — not confirmed in the source.
  • Any move to formalize an official LLVM performance-tracking infrastructure or consolidated downstream telemetry — not confirmed in the source.
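
As a concrete picture of the reviewer-assignment item above, here is a minimal sketch of automatic assignment: find the reviewers registered for the most specific matching path, then pick the one with the fewest open reviews. Everything in it is hypothetical (the ownership map, the reviewer names, the load-based tie-breaking); it does not describe how the Rust project's tooling or any LLVM infrastructure actually works.

```python
from collections import Counter

# Hypothetical ownership map: path prefix -> reviewers. Names are placeholders.
OWNERS = {
    "llvm/lib/Transforms/": ["alice", "bob"],
    "llvm/lib/Transforms/Vectorize/": ["carol"],
    "llvm/lib/Target/": ["dave", "erin"],
}


def candidate_reviewers(path: str, owners: dict[str, list[str]]) -> list[str]:
    """Return the reviewers registered for the longest prefix matching the path."""
    matches = [prefix for prefix in owners if path.startswith(prefix)]
    if not matches:
        return []
    return owners[max(matches, key=len)]


def assign_reviewer(changed_paths, owners, open_reviews, author):
    """Pick the least-loaded candidate across all changed paths, never the author."""
    candidates = set()
    for path in changed_paths:
        candidates.update(candidate_reviewers(path, owners))
    candidates.discard(author)
    if not candidates:
        return None
    # Iterate in sorted order so that ties on load break alphabetically.
    return min(sorted(candidates), key=lambda reviewer: open_reviews[reviewer])


if __name__ == "__main__":
    load = Counter({"alice": 3, "bob": 1, "carol": 0})
    print(assign_reviewer(["llvm/lib/Transforms/Scalar/GVN.cpp"], OWNERS, load, "zed"))
```

The point of the sketch is only that assignment can be mechanical once ownership and load are recorded; it does nothing about the underlying shortage of qualified reviewers described in the post.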

Quick glossary

  • LLVM IR: An intermediate representation used by LLVM as a target-independent, low-level programming language for compiler optimizations and code generation.
  • Opaque pointers: An IR change that replaced typed pointers (such as i32*) with a single ptr type that carries no pointee type; the type being accessed is instead specified by the instruction that uses the pointer, such as a load or store.
  • CI (Continuous Integration): Automated systems that build and test software on commits to detect regressions and ensure integration quality.
  • Precompiled headers: A build technique that caches the result of parsing frequently included headers to reduce compile times.
  • dylib build: A build mode in which LLVM's tools link dynamically against a shared LLVM library instead of each tool statically linking its own copy of the LLVM libraries, which can reduce link time and disk usage for large projects.

Reader FAQ

Who wrote the critique?
The post was written by the lead maintainer of the LLVM project.

Have any of the listed IR problems been fixed?
According to the author, the opaque pointers migration is complete, constant expression removal is mostly complete, and the ptradd migration is in progress.

Is llvm-test-suite used in everyday development?
No; the author states llvm-test-suite is typically not used during routine development and is run by buildbots instead.

Does LLVM have an official performance-tracking system?
The author says LLVM lacks an official performance-tracking infrastructure; many organizations track performance downstream independently.

Original post: "LLVM: The bad parts", 11 January 2026. Excerpt: "A few years ago, I wrote a blog post on design issues in LLVM IR. Since then, one…"
