TL;DR

Researchers used large language models to iteratively evolve assembly-like programs ("warriors") in the Core War simulation via a process called Digital Red Queen (DRQ). Over many rounds DRQ produced increasingly robust and behaviorally convergent strategies while running only inside a sandboxed, Turing-complete virtual machine.

What happened

The team implemented Digital Red Queen (DRQ), a simple self-play-style workflow that uses large language models to generate new Core War warriors that must beat an expanding pool of predecessors. Starting from an initial warrior, each DRQ round adds a new program evolved to perform well against the full history of earlier opponents, producing a lineage of adapting agents. In Core War's shared memory, where code and data occupy the same address space, evolved warriors demonstrated tactics such as targeted bombing, self-replication, and heavy multithreading while routinely modifying both themselves and their rivals. Over many rounds, DRQ-produced warriors became more robust against previously unseen human-designed opponents, and independent DRQ runs tended to converge toward similar functional behaviors even though their source code remained different. The team released a technical report on arXiv and accompanying code on GitHub.
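The round structure described above can be sketched as a short loop. This is a minimal toy illustration, not the authors' actual code: `llm_propose_warrior` and `battle` are hypothetical stand-ins (here warriors are just numeric scores; the real system generates and runs Redcode in a VM):

```python
def battle(warrior_a, warrior_b):
    """Toy stand-in for a Core War match: True if warrior_a wins.
    The real system simulates two Redcode programs in a shared core."""
    return warrior_a > warrior_b

def llm_propose_warrior(pool):
    """Hypothetical stand-in for the LLM step: propose a candidate
    meant to beat the existing pool. Here it trivially improves on it."""
    return max(pool) + 1.0

def drq(initial_warrior, rounds):
    """Each round adds one new warrior that defeats the growing pool
    of predecessors, producing a lineage of adapting agents."""
    pool = [initial_warrior]
    for _ in range(rounds):
        candidate = llm_propose_warrior(pool)
        # Keep only candidates that beat every predecessor in the pool.
        if all(battle(candidate, rival) for rival in pool):
            pool.append(candidate)
    return pool

lineage = drq(initial_warrior=0.0, rounds=5)
```

The key property the sketch preserves is that the opponent set is the accumulated history of the run itself, not a fixed external benchmark.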

Why it matters

  • Provides a contained, observable sandbox for studying continual adversarial adaptation among AI agents.
  • Suggests that coevolutionary dynamics can produce broadly robust strategies without directly optimizing against large external test sets.
  • Shows functional (phenotypic) convergence can emerge from independent adversarial runs, hinting at general-purpose solutions shaped by environmental pressure.
  • Because Core War runs on an artificial machine and language, experiments pose lower real-world execution risk and can inform safe red-teaming research.

Key facts

  • Core War is a competitive programming game introduced in 1984 where programs called "warriors" compete in a shared memory called the Core.
  • Warriors are written in an assembly-like language called Redcode; in Core War there is no distinction between code and data.
  • Core War is Turing-complete, allowing for arbitrarily complex program behaviors in principle.
  • DRQ uses LLMs to iteratively generate warriors: each new warrior is evolved to defeat the growing set of predecessors.
  • Evolved strategies observed include targeted bombing, self-replication, scanning, and massive multithreading.
  • Across independent DRQ runs, behavioral (phenotypic) convergence was observed even when source code (genotype) remained diverse.
  • DRQ-produced warriors showed increasing robustness when evaluated against unseen human-designed warriors.
  • The authors released a technical report (arXiv) and accompanying code (GitHub) and mention collaboration with MIT.
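The "no distinction between code and data" fact above is what makes tactics like bombing possible: writing data into the core is the same operation as rewriting another program's instructions. A simplified sketch (not real Redcode, which uses opcodes like MOV and DAT in a circular core):

```python
# Toy shared core: instructions and data occupy the same list of cells.
CORE_SIZE = 8
core = [("NOP", 0)] * CORE_SIZE

# Warrior A at address 0: bombs the cell at (pc + 3) with DAT,
# which kills any process that later executes that cell.
core[0] = ("BOMB", 3)
# Warrior B at address 3: a live instruction sitting in A's blast radius.
core[3] = ("NOP", 0)

def step(pc):
    """Execute one instruction and return the next program counter."""
    op, arg = core[pc]
    if op == "BOMB":
        # Writing data into the core *is* modifying another program's code.
        core[(pc + arg) % CORE_SIZE] = ("DAT", 0)
    return (pc + 1) % CORE_SIZE

step(0)
# Warrior B's cell now holds DAT; executing it would terminate B's process.
```

In the real game, warriors exploit exactly this property to overwrite rivals, copy themselves elsewhere in the core, or patch their own code mid-run.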

What to watch next

  • Availability and reproducibility of the published code and experiments via the project's GitHub and arXiv report (confirmed in the source).
  • Whether DRQ-derived insights about adversarial coevolution translate to real-world AI ecosystems and deployed systems (not confirmed in the source).
  • If convergent functional behaviors generalize beyond the Core War domain to more realistic or higher-fidelity environments (not confirmed in the source).

Quick glossary

  • Core War: A programming game where assembly-like programs ("warriors") compete for control of a shared virtual memory space called the Core.
  • Redcode: The assembly-like language used to write warriors in Core War.
  • Red Queen Hypothesis: An evolutionary concept that species must continuously adapt to maintain relative fitness against coevolving competitors.
  • Turing-complete: A property of a computational system that can simulate any Turing machine, indicating the ability to represent arbitrary computation.
  • Large Language Model (LLM): A neural network trained on large text corpora that can generate and transform natural language and, in some workflows, program code.

Reader FAQ

What is Digital Red Queen (DRQ)?
DRQ is a minimal iterative procedure that uses LLMs to evolve Core War warriors: each round adds a new program generated to beat the growing set of predecessors.

Can the evolved warriors run on real systems outside the experiment?
No—experiments run inside a sandboxed Core War virtual machine with an artificial language and cannot execute outside that environment (confirmed in the source).

Did DRQ produce identical source code across independent runs?
No—runs produced similar behaviors (phenotypes) but not identical source code (genotypes), indicating convergent function rather than convergent implementation.

Are the paper and code available?
Yes—the project released a technical report on arXiv and published code on GitHub (confirmed in the source).

Will DRQ directly predict real-world AI conflicts?
Not confirmed in the source.

Digital Red Queen: Adversarial Program Evolution in Core War with LLMs (January 08, 2026)
