TL;DR

While building a small resilient LL-style parser, the author ran into infinite loops when a subparser returned without consuming input. Instead of the previous 'fuel' heuristic and a mental map of advancing functions, they added explicit advance assertions to the parser API so failures are detected immediately and the code documents which calls must make progress.

What happened

During a holiday exercise implementing a toy parser that follows a resilient LL approach, the author encountered a hard-to-debug failure: certain parsing paths could bail out without consuming any tokens, causing loops or deep recursion that eventually exhausted memory. Previously, the author relied on a combination of a "fuel" counter (topped up whenever tokens were consumed and decremented by lookahead) and a mental checklist of functions guaranteed to advance the input. To make the problem visible and the guarantees explicit, they introduced an advances stack inside the Parser and three helper methods (advance_push, advance_pop, advance_drop). Call sites push before a speculative parse and either pop after confirming progress or drop the expectation when no advance is needed. Applying this to a Pratt-style expression parser revealed the exact spot that did not consume a token; the concrete fix was to call the token-consuming bump action so the parser makes forward progress and the assertion machinery stays consistent.

Why it matters

  • Makes non-progressing parsing paths fail immediately instead of causing long, opaque OOM or infinite-loop crashes.
  • Encodes which parser operations must advance input into the code, reducing reliance on an error-prone mental model.
  • Improves diagnostics by converting distant memory errors into local assertion failures with clearer stack traces.
  • Helps maintain resilience while ensuring loops and recursive descent steps make measurable progress.

Key facts

  • The author was implementing a resilient LL-style parser and produced a syntax tree plus diagnostics rather than bailing on the first error.
  • Infinite loops occurred when a parsing function returned without consuming any token while the caller expected progress.
  • A prior mitigation called 'fuel' used a Cell<u32> that was decremented by lookahead and replenished on token consumption.
  • The new approach adds an advances array on the Parser and three methods: advance_push, advance_pop, and advance_drop.
  • Call sites push an expectation before speculative parses; advance_pop asserts that the parser index increased, advance_drop removes the expectation without asserting.
  • Applying the new assertions to a Pratt expression parser turned an out-of-memory crash into a local assertion failure during tests.
  • The concrete fix in the example was to explicitly consume the operator token (via a bump) before recursing, ensuring forward progress.

What to watch next

  • How broadly the advance-assertion pattern is applied across other parser entry points and recursive constructs.
  • Whether the approach replaces or complements fuel-style safeguards in larger parsing codebases.
  • not confirmed in the source

Quick glossary

  • Resilient LL parsing: A parsing style that attempts to continue producing a parse tree and diagnostics after encountering errors rather than stopping at the first problem.
  • Pratt parser: A technique for parsing expressions with operator precedence using recursive calls guided by token binding strengths.
  • Fuel: A heuristic counter used to detect non-progressing parsing work by decrementing on lookahead and replenishing on token consumption.
  • Advance assertion: A runtime check that verifies a parser has moved its input index forward when a call site expects progress.

Reader FAQ

What caused the original crash?
A parsing loop recursed or iterated without consuming tokens, eventually exhausting memory and triggering a fatal out-of-memory condition.

How did the author previously try to prevent non-progressing parses?
By using a 'fuel' counter that decremented on lookahead and was topped up when tokens were consumed, plus a mental map of which functions always advance.

How do the new advance_push/pop/drop methods help?
They let a caller record an expectation of progress, assert when the parser index has moved, and explicitly drop the expectation when no advance is required, surfacing failures immediately.

Is this approach ready for production use?
not confirmed in the source

Parsing Advances Dec 28, 2025 I find myself writing yet another toy parser, as one does during a Christmas break. It roughly follows Resilient LL Parsing Tutorial. Not because I…

Sources

Related posts

By

Leave a Reply

Your email address will not be published. Required fields are marked *