TL;DR
A developer experimented with the idea of programming languages designed for large language models, asking Gemini and Claude Opus to invent token-efficient syntaxes. Gemini produced B-IR, compact but unreadable; Claude produced TBIR and, via a Python bootstrap compiler, generated runnable Arm64 Mach-O binaries.
What happened
Jason Hall explored whether a programming language could be created specifically for LLM consumption rather than human readability. He prompted Gemini to invent B-IR (Byte-encoded Intent Representation), a heavily token-focused notation using multi-byte Unicode opcodes. Translating B-IR into runnable machine code proved difficult, so Hall switched to Claude Opus, which proposed TBIR (text-based B-IR) with single-byte opcodes and then simplified those to short English words to ease compilation. Using a Python bootstrap compiler, Claude produced a toolchain that emitted Arm64 assembly and created runnable Mach-O binaries; it also wrote example programs such as a simple cat.tbir and implemented its own compiler in TBIR at roughly 700 lines. Throughout the experiment Hall reflected on token efficiency limits and other LLM pain points like ambiguity, loose typing, indentation sensitivity, and keeping track of long-lived intent.
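The Python bootstrap compiler described above maps TBIR's short English-word operations to Arm64 assembly. The source does not include the actual TBIR specification, so the opcode names and the instruction mapping below are illustrative assumptions, a minimal sketch of the word-opcode-to-assembly idea rather than the real toolchain:

```python
# Hypothetical sketch of a word-opcode translator in the spirit of the
# TBIR bootstrap compiler. The opcodes ("push", "add") and their Arm64
# templates are invented for illustration, not the actual TBIR spec.

ASM_TEMPLATES = {
    "push": "    mov x0, #{arg}",      # load an immediate into x0
    "add":  "    add x0, x0, #{arg}",  # add an immediate to x0
}

def compile_line(line: str) -> str:
    """Translate one 'opcode argument' line into Arm64 assembly."""
    op, _, arg = line.strip().partition(" ")
    template = ASM_TEMPLATES.get(op.lower())
    if template is None:
        raise ValueError(f"unknown opcode: {op}")
    return template.format(arg=arg)

def compile_program(source: str) -> str:
    """Compile a whole program, one opcode per line, to assembly text."""
    return "\n".join(
        compile_line(line) for line in source.splitlines() if line.strip()
    )

print(compile_program("push 40\nadd 2"))
```

In the real experiment the assembly output would then be assembled and linked into a Mach-O executable; this sketch stops at text emission.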
Why it matters
- LLM-specific languages could reduce token costs and influence how models represent and manipulate code.
- Bootstrapping compilers from LLM-generated specifications highlights practical challenges in turning compressed encodings into executable binaries.
- Design choices that help LLMs—reducing ambiguity, enforcing stricter typing, and improving validation locality—could change how programming languages are designed.
- The experiment suggests existing low-level representations (e.g., assembly) may already be efficient for LLMs, affecting whether novel syntaxes are necessary.
Key facts
- Article authored by Jason Hall and first published January 11, 2026.
- Author states the article itself was written entirely by a human without LLM augmentation.
- Gemini generated B-IR, a Unicode-heavy, token-minimizing representation described as unreadable.
- Initial attempts to compile B-IR into executable code using Python struggled to produce valid Mach-O binaries.
- Claude Opus proposed TBIR, moving from single-byte opcodes in the 0x80–0x8B range to short English-word operations to aid compilation.
- A Python bootstrap compiler was created that compiles TBIR to Arm64 assembly and links to a Mach-O executable.
- Example TBIR programs included a simple file-copy (cat.tbir); Claude produced a TBIR-written compiler of about 700 lines.
- The author identifies several LLM challenges for code generation: ambiguity in scope, loose typing, indentation sensitivity, and keeping track of user intent.
- The author coined the term 'Validation Locality' to describe the need for tests to live near the code they verify.
What to watch next
- Whether LLM-authored languages become self-hosting and can fully replace human-written toolchains (not confirmed in the source).
- Adoption or experimentation by others with LLM-first languages and how they compare to existing low-level representations like assembly (not confirmed in the source).
- Research or tooling focused on reducing ambiguity, enforcing typing, and improving validation locality to assist LLM code generation (not confirmed in the source).
Quick glossary
- Token: A unit of text a language model processes; tokens are the basic pieces models consume and produce when generating or interpreting text.
- Compiler bootstrap: A process where an initial compiler written in one language is used to build a compiler in the target language, eventually producing a self-hosted toolchain.
- Assembly: A low-level, human-readable representation of machine instructions that maps closely to what a processor executes.
- Mach-O: A binary executable format used on some operating systems for storing machine code and related metadata.
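The Mach-O entry above can be made concrete: 64-bit Mach-O files begin with the well-known magic number 0xFEEDFACF (MH_MAGIC_64), which is one simple way output binaries like those in the experiment could be sanity-checked. A minimal sketch, not part of the article's toolchain:

```python
# Minimal sketch: check whether a byte buffer looks like a 64-bit Mach-O
# binary by inspecting its magic number. 0xFEEDFACF is the standard
# MH_MAGIC_64 value; on little-endian Arm64 it is stored as cf fa ed fe.
import struct

MH_MAGIC_64 = 0xFEEDFACF  # 64-bit Mach-O magic, native byte order

def looks_like_macho64(data: bytes) -> bool:
    """Return True if the buffer starts with the 64-bit Mach-O magic."""
    if len(data) < 4:
        return False
    (magic,) = struct.unpack("<I", data[:4])
    return magic == MH_MAGIC_64
```

A full validation would also parse the header's CPU type and load commands; the magic-number check is only the first gate.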
Reader FAQ
Did the author use LLMs to write the article?
No — the author states the article was written entirely by a human without LLM augmentation.
Which LLMs were involved in the experiments?
The experiments used Gemini to propose B-IR and Claude Opus to design and implement TBIR.
Is B-IR runnable as delivered?
The initial B-IR representation could not be straightforwardly translated into a runnable binary; compilation attempts ran into problems.
Did the experiment produce executable programs?
Yes; Claude Opus helped produce a TBIR toolchain that could be compiled via a Python bootstrapper into Arm64 Mach-O executables.
Are LLM-optimized languages ready for production?
Not confirmed in the source.
Source article: An LLM-optimized Programming Language, by Jason Hall, first published January 11, 2026. http://articles.imjasonh.com/llm-programming-language.md
Sources
- Show HN: An LLM-optimized programming language
- A Study of LLMs' Preferences for Libraries and …
- Analyzing Search-Augmented Large Language Models
- LLMs, programming languages and other tools – Javier Chávarri
Related posts
- Why garbage collection is contrarian in Rust-based JavaScript engines
- Code Is Cheap Today; Building Durable Software Still Demands Engineers
- CLI agents like Claude Code make home self-hosting simpler and actually fun