TL;DR

An engineering team added support for both LLM-orchestrated and code-driven workflows after finding that model-based automation sometimes produced nondeterministic, harmful behavior. They now run a handler that can delegate orchestration either to an LLM or to checked-in Python scripts that have access to the same tools and can call an LLM as a subagent when needed.

What happened

The author describes building an internal agent platform that supports two coordination modes: an LLM-led workflow and a script-led workflow. An early LLM automation that added a :merged: reaction to Slack messages about GitHub pull requests misclassified some PRs as merged, which discouraged human reviewers from checking those PRs. To address this, the team extended their handler configuration so triggers can select either coordinator: llm (the default) or coordinator: script with a reference to a Python script (for example, scripts/pr_merged.py). The handler still gathers trigger data, approved tools, and virtual files, and it enforces termination conditions; when the script coordinator is used, custom code directly invokes tools and may call an LLM via a subagent only when explicitly desired. The team treats code-driven workflows as a progressive enhancement for cases where LLM prompts are insufficiently reliable or fast.
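Based on that description, a trigger configuration selecting the script coordinator might look roughly like this. This is a sketch: only the `coordinator` and `coordinator_script` keys (and the `scripts/pr_merged.py` path) are named in the source; every other key here is an assumption for illustration.

```yaml
# Hypothetical trigger configuration; only coordinator and
# coordinator_script are confirmed by the source.
trigger: slack_message          # assumed trigger name
coordinator: script             # default is "llm"
coordinator_script: scripts/pr_merged.py
tools:                          # assumed shape of the approved-tools list
  - github
  - slack
```

With `coordinator: llm` (the default), the handler would instead send the prompt and tools to the model as before.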

Why it matters

  • Determinism matters: nondeterministic model outputs can introduce real operational regressions (e.g., false :merged: indicators).
  • Giving engineers a code path restores predictable behavior and standard software controls like code review and dependency management.
  • Hybrid design preserves LLM strengths while limiting model autonomy to tasks that truly require intelligence.
  • Same toolset access for scripts and LLMs keeps capability parity while shifting control to audited code when needed.

Key facts

  • Published December 31, 2025; part of a series on building an internal agent.
  • Initial LLM workflow fetched recent Slack messages, extracted single PR URLs, checked status via GitHub, and added a :merged: reacji when a PR was merged or closed.
  • The LLM sometimes added :merged: incorrectly, which reduced human review of those PRs and undermined the automation's purpose.
  • System handler flow: select configuration for trigger, load prompt and approved tools, generate virtual files, send prompt/tools to LLM, coordinate tool calls, apply termination conditions, then use or discard the final LLM response based on config.
  • Configuration supports coordinator: llm (default) and coordinator: script with coordinator_script pointing to a Python file.
  • Scripts have access to the same tools, trigger data, and virtual files as the LLM-handling code and may call an LLM via a subagent when explicitly needed.
  • Scripts are written and checked in by engineers and subject to code review; code-driven workflows are used when LLMs are unreliable or slow.
  • The team still initiates workflows with the LLM and uses code-driven approaches as a progressive enhancement; Claude Code reportedly can often convert prompts into code in one shot.
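The script-coordinator path described above can be sketched in a few lines. The `Tools` stand-in and the trigger fields below are illustrative assumptions, not the team's actual API; the point is that the merged-or-closed check becomes deterministic code rather than a model judgment.

```python
# Sketch of a script coordinator in the style of scripts/pr_merged.py.
# The Tools protocol and trigger fields are assumptions for illustration.

class Tools:
    """Stand-in for the approved tool set the handler passes to a script."""

    def __init__(self, pr_status: str):
        self._pr_status = pr_status
        self.reactions: list[tuple[str, str, str]] = []

    def github_pr_status(self, pr_url: str) -> str:
        # A real implementation would query the GitHub API.
        return self._pr_status

    def slack_add_reaction(self, channel: str, ts: str, name: str) -> None:
        # A real implementation would call Slack's reactions API.
        self.reactions.append((channel, ts, name))


def run(trigger: dict, tools: Tools) -> None:
    """Add a :merged: reaction only when GitHub confirms the PR is done."""
    pr_url = trigger.get("pr_url")
    if not pr_url:
        return  # no single PR URL extracted from the message; do nothing
    if tools.github_pr_status(pr_url) in ("merged", "closed"):
        tools.slack_add_reaction(trigger["channel"], trigger["ts"], "merged")


tools = Tools(pr_status="merged")
run({"pr_url": "https://github.com/org/repo/pull/1",
     "channel": "C1", "ts": "1.0"}, tools)
print(tools.reactions)  # [('C1', '1.0', 'merged')]
```

Because the decision logic is ordinary Python, it is reviewable and repeatable; a subagent LLM call would only appear where the script explicitly needs one.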

What to watch next

  • Adoption rate of coordinator: script for workflows that previously relied on LLMs.
  • Incidence of incorrect automated actions (like false :merged: reacji) after transitioning to script-driven coordination.
  • Whether the team formalizes additional safeguards or monitoring for script commits (not confirmed in the source).

Quick glossary

  • LLM: Large language model; a neural model trained to generate or analyze text and sometimes coordinate tool use.
  • Handler: The software component that receives triggers, selects configuration, loads prompts and tools, and orchestrates LLM or script execution.
  • Coordinator: The configured mode that determines whether orchestration is driven by an LLM (coordinator: llm) or by custom code (coordinator: script).
  • Subagent: A capability that lets code or workflows invoke an LLM as a subordinate agent with access to approved tools.
  • Virtual files: Attachments or ephemeral file representations (for example, files attached to a Jira issue or Slack message) made available to tools during workflow execution.

Reader FAQ

Why add a code-driven option if LLMs can orchestrate tools?
The team found LLMs sometimes behave nondeterministically on routine checks, so code-driven workflows provide predictable, auditable behavior for those cases.

Do scripts lose access to the tools LLMs use?
No — scripts have access to the same approved tools, trigger data, and virtual files, and can call an LLM via a subagent when needed.

Are these scripts code-reviewed?
Yes; scripts are written and checked in by engineers and go through the team's usual code review process.

Has this change eliminated the need for LLMs?
Not confirmed in the source; the team still initiates workflows with the LLM and treats code-driven coordination as a progressive enhancement.

Sources

  • "Building an internal agent: Code-driven vs LLM-driven workflows," published December 31, 2025. Tags: llm, agents, internal-agent.
