harness/
intro
A talk by Peter Eysermans
Building the AI Agent Harness.
From hands-on-the-wheel agentic engineering to autonomous teams that actually ship.
Levels of AI in Software Development.
L1
Spicy Autocomplete
Cruise control
Inline suggestions.
L2
Vibe Coding
Lane-keeping assist
Prompt → accept → ship.
L3 · You are here
Agentic Engineering
Hands off, eyes on
Tailored agents and skills.
L4
Autonomous Teams
Eyes off, nap allowed
This talk.
L5
Software Factory
No steering wheel
→ Adapted from Dan Shapiro, “The Five Levels: From Spicy Autocomplete to the Software Factory” · danshapiro.com
harness/
context
How Do We Become This Guy?
harness/
tension
Trust me bro
"I don't have employees. I have agents. That's not a flex, it's just the truth."
harness/
tension
They Are Not Running This in Production.
harness/
tension
What happens
Agents Forget.
harness/
turn
How Do We Get to the Next Level?
harness/
diagnosis
Attempt one
Let the LLM Manage the Team.
Claude Code's Agent Teams. One orchestrator spawns teammates, shares a todo list, delegates the work.
Source: code.claude.com/docs/en/agent-teams
harness/
diagnosis
First run
It Stopped Halfway.
The engineer agent opened the PR. The orchestrator said "done" and forgot the review step.
harness/
diagnosis
The diagnosis
Context ≠ Comprehension.
The window still has room. The comprehension does not.
harness/
evidence
Context Rots.
Model performance degrades as context grows. Your orchestration rules? Effectively forgotten.
→ Chroma Research, “Context Rot: How Increasing Input Tokens Impacts LLM Performance” · trychroma.com
harness/
diagnosis
It's an Architecture Problem.
harness/
turn
LLMs Are Workers. Not Managers.
Great at creative work. Terrible state machines.
harness/
architecture
The architecture
Three Components.
01 · Orchestration
Deterministic.
Doesn't forget.
02 · Agents
Creative.
Clean context. One job. Exits.
03 · Memory
Shared.
Long-term context storage.
harness/
architecture
The stack
Three Components.
01 · Orchestration
Deterministic.
Doesn't forget.
Bash
n8n
02 · Agents
Creative.
Clean context. One job. Exits.
Claude Code
03 · Memory
Shared.
Long-term context storage.
GitHub
harness/
architecture
It Works Where Your Team Already Works.
Issue = spec. PR = implementation. Comments = feedback. Labels = state.
harness/
architecture
Labels Are the State.
ai-ready → ai-implementing → ai-testing → ai-needs-review → ai-reviewing → ai-done
Stuck? Label ai-stuck. A human looks.
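The label chain above can be read as a small state machine. A minimal sketch in Python, assuming a strictly linear pipeline in which any failed step moves the issue to ai-stuck. The label names come from the slide; the function `next_label` and the failure rule are illustrative assumptions, not the talk's actual code.

```python
from typing import Optional

# Label order as shown on the slide.
PIPELINE = [
    "ai-ready",
    "ai-implementing",
    "ai-testing",
    "ai-needs-review",
    "ai-reviewing",
    "ai-done",
]
STUCK = "ai-stuck"


def next_label(current: str, ok: bool = True) -> Optional[str]:
    """Return the label an issue should move to, or None if it is terminal.

    Assumption: a failed step always hands the issue to a human via ai-stuck.
    """
    if current == STUCK or current == PIPELINE[-1]:
        return None  # nothing left for the orchestrator to do
    if current not in PIPELINE:
        raise ValueError(f"unknown state label: {current}")
    if not ok:
        return STUCK
    return PIPELINE[PIPELINE.index(current) + 1]
```

Because the transition is a plain lookup, the pipeline cannot skip the review step the way an LLM orchestrator did in the first run.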
harness/
evidence
The orchestrator
180 Lines of Bash.
Polls every 3 minutes. grep, jq, gh, flock. Zero AI in the orchestration.
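The polling idea can be sketched as follows. The talk's orchestrator is Bash; this is a Python rendering for illustration only, under stated assumptions. It uses `gh issue list --json` and `gh issue edit` from the GitHub CLI and a file lock in place of `flock`. The lock path, the function names, and the choice to relabel before handing off are hypothetical.

```python
import fcntl
import json
import subprocess
import time
from typing import Callable, List

POLL_SECONDS = 180  # "every 3 minutes", per the slide
LOCK_PATH = "/tmp/harness.lock"  # hypothetical path


def gh_json(args: List[str]) -> list:
    """Run a gh command that prints JSON and return the parsed result."""
    done = subprocess.run(["gh", *args], check=True, capture_output=True, text=True)
    return json.loads(done.stdout)


def issues_with_label(label: str, run: Callable[[List[str]], list] = gh_json) -> List[int]:
    """Return the numbers of open issues carrying the given label."""
    rows = run(["issue", "list", "--label", label, "--json", "number"])
    return [row["number"] for row in rows]


def swap_label_args(number: int, old: str, new: str) -> List[str]:
    """Build the gh arguments that move an issue from one state label to the next."""
    return ["issue", "edit", str(number), "--remove-label", old, "--add-label", new]


def poll_forever(handle: Callable[[int], None]) -> None:
    """Poll for ai-ready issues; the lock keeps a second orchestrator from starting."""
    with open(LOCK_PATH, "w") as lock:
        # Raises BlockingIOError if another process already holds the lock.
        fcntl.flock(lock, fcntl.LOCK_EX | fcntl.LOCK_NB)
        while True:
            for number in issues_with_label("ai-ready"):
                args = swap_label_args(number, "ai-ready", "ai-implementing")
                subprocess.run(["gh", *args], check=True)
                handle(number)
            time.sleep(POLL_SECONDS)
```

Nothing here calls a model: the loop only reads labels, rewrites labels, and hands an issue number to whatever `handle` does.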
harness/
architecture
Every agent
One Job. Fresh Context.
Reads the issue. Does the work. Writes back. Exits. Never knows about the reviewer.
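One way to get a fresh context per job is to start each agent as its own short-lived process. A minimal sketch, assuming the agent is launched through Claude Code's non-interactive print mode (`claude -p`). The prompt wording, the function names, and the timeout are hypothetical and not taken from the talk.

```python
import subprocess
from typing import List


def agent_command(role: str, issue_number: int) -> List[str]:
    """Build the command for one short-lived agent run.

    The prompt names only this agent's own job; it says nothing about
    other agents or about the rest of the pipeline.
    """
    prompt = (
        f"Use the {role} agent. Read issue #{issue_number}, do your job, "
        "write the result back to the issue or PR, then stop."
    )
    return ["claude", "-p", prompt]


def run_agent(role: str, issue_number: int, timeout_s: int = 1800) -> bool:
    """Run the agent in a new process so no context carries over.

    Returns True when the process exits with code 0, False on a non-zero
    exit or a timeout.
    """
    try:
        result = subprocess.run(agent_command(role, issue_number), timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0
```

The boolean result is all the orchestrator needs to decide between the next label and ai-stuck.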
harness/
architecture
Agents Are Markdown Files.
.claude/agents/
├─ engineer.md   # Safety, conventions, commits
├─ reviewer.md   # Review criteria, security
└─ qa.md         # Test strategy, coverage
harness/
demo
Live demo
The Full Pipeline.
harness/
landing
The Harness Is Your Advantage.
The AI doesn't differentiate your team. The harness does. Engineering culture, encoded as infrastructure.
harness/
end
Go Build Your Harness.
Eyes off. Nap allowed.
Peter Eysermans
CTO in Residence
madewithlove
LinkedIn: linkedin.com/in/petereysermans
Email: [email protected]
Web: eysermans.com