harness/
intro
A talk by Peter Eysermans

Building the AI Agent
Harness.

From hands-on-the-wheel agentic engineering to autonomous teams that actually ship.
 

Levels of AI in Software Development.

L1
Spicy Autocomplete
Cruise control
Inline suggestions.
L2
Vibe Coding
Lane-keeping assist
Prompt → accept → ship.
L3 · You are here
Agentic Engineering
Hands off, eyes on
Tailored agents and skills.
L4
Autonomous Teams
Eyes off, nap allowed
This talk.
L5
Software Factory
No steering wheel
 
Adapted from Dan Shapiro, “The Five Levels: From Spicy Autocomplete to the Software Factory” · danshapiro.com
harness/
context
 

How Do
We Become
This Guy?

 
harness/
tension
Trust me bro

"I don't have employees. I have agents. That's not a flex, it's just the truth."
harness/
tension
 

They Are Not Running This
in Production.

harness/
tension
What happens

Agents Forget.

Claude Code: 'Oh I apologize, you are right. I forgot to run the review step and immediately created the PR. Let me retry'
harness/
turn
 

How Do We Get to
the Next Level?

harness/
diagnosis
Attempt one

Let the LLM
Manage the Team.

Claude Code's Agent Teams. One orchestrator spawns teammates, shares a todo list, delegates the work.
Agent Teams: main agent spawns teammates, who share a task list and communicate with each other.
Source: code.claude.com/docs/en/agent-teams
harness/
diagnosis
First run

It Stopped
Halfway.

Engineer opened the PR. Orchestrator said "done." Forgot the review step.
harness/
diagnosis
The diagnosis

Context Comprehension.

The window still has room. The comprehension does not.
harness/
evidence
 

Context Rots.

Model performance degrades as context grows. Your orchestration rules? Effectively forgotten.
harness/
diagnosis
 

It's an
Architecture Problem.

 
harness/
turn
 

LLMs Are Workers.
Not Managers.

Great at creative work. Terrible state machines.
harness/
architecture
The architecture

Three Components.

01 · Orchestration
Deterministic.
Doesn't forget.
02 · Agents
Creative.
Clean context. One job. Exits.
03 · Memory
Shared.
Long-term context storage.
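A minimal sketch of how the three components could fit together, assuming a fixed engineer, QA, reviewer order. The class and function names are hypothetical, the agents are stubs rather than real model calls, and the shared memory is just a dict. The point it illustrates is that plain code, not an LLM, decides which step runs next.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical shared memory: a log of outputs per work item.
Memory = Dict[str, List[str]]

# An agent is modelled as a function: it receives only the task text,
# returns its output, and keeps no state between calls.
Agent = Callable[[str], str]


@dataclass
class Orchestrator:
    """Deterministic pipeline: a fixed list of steps, run in order."""
    steps: List[str]
    agents: Dict[str, Agent]
    memory: Memory = field(default_factory=dict)

    def run(self, item_id: str, task: str) -> List[str]:
        log = self.memory.setdefault(item_id, [])
        for step in self.steps:
            # Ordinary code picks the next step, so none can be forgotten.
            output = self.agents[step](task)
            log.append(f"{step}: {output}")
        return log


def make_stub_agent(role: str) -> Agent:
    # Stand-in for an LLM call; a real agent would invoke a model here.
    return lambda task: f"{role} handled '{task}'"


roles = ["engineer", "qa", "reviewer"]  # assumed order
pipeline = Orchestrator(
    steps=roles,
    agents={role: make_stub_agent(role) for role in roles},
)
result = pipeline.run("issue-1", "add login form")
```

Because the step list lives in the orchestrator, the review step cannot be dropped, no matter what an individual agent returns.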
harness/
architecture
 

It Works Where Your
Team Already Works.

Issue = spec. PR = implementation. Comments = feedback. Labels = state.
harness/
architecture
 

Labels Are the State.

ai-ready ai-implementing ai-testing ai-needs-review ai-reviewing ai-done
Stuck? Label ai-stuck. A human looks.
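The label flow can be sketched as a small state machine. This assumes the labels advance strictly in the order they are listed above and that any failed step parks the issue on ai-stuck; the slide does not spell out the transitions, so both are assumptions, and the function name is hypothetical.

```python
from typing import Dict, List

# Assumed linear order, taken from the order in which the labels are listed.
PIPELINE: List[str] = [
    "ai-ready", "ai-implementing", "ai-testing",
    "ai-needs-review", "ai-reviewing", "ai-done",
]
STUCK = "ai-stuck"

# Each state maps to its successor; "ai-done" has none.
NEXT: Dict[str, str] = dict(zip(PIPELINE, PIPELINE[1:]))


def advance(labels: List[str], step_succeeded: bool) -> List[str]:
    """Return the issue's labels after one pipeline step."""
    states = [label for label in labels if label in PIPELINE]
    others = [label for label in labels
              if label not in PIPELINE and label != STUCK]
    if len(states) != 1 or states[0] == "ai-done":
        # Not in the pipeline, ambiguous, stuck, or finished: leave as-is.
        return list(labels)
    if not step_succeeded:
        # Hand the issue to a human instead of retrying forever.
        return others + [STUCK]
    return others + [NEXT[states[0]]]
```

Labels that are not part of the pipeline (for example a bug label) pass through untouched, so the state machine can share an issue tracker with the rest of the team.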
harness/
evidence
The orchestrator

180 Lines
of Bash.

Polls every 3 minutes. grep, jq, gh, flock. Zero AI in the orchestration.
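The talk's orchestrator is Bash; the following is only a Python sketch of one polling tick under the same ideas: take a non-blocking file lock (the role flock plays in a shell script), list issues through the gh CLI, and move a label. The function names, the lock path, and the choice to claim every ai-ready issue in one tick are assumptions, not the author's script.

```python
import fcntl
import json
import subprocess
from typing import Callable, List

Runner = Callable[[List[str]], str]


def run_cli(cmd: List[str]) -> str:
    """Run a command and return its stdout; raises if it exits non-zero."""
    return subprocess.run(
        cmd, check=True, capture_output=True, text=True
    ).stdout


def issues_with_label(label: str, run: Runner = run_cli) -> List[int]:
    # `gh issue list --label X --json number` prints a JSON array.
    out = run(["gh", "issue", "list", "--label", label, "--json", "number"])
    return [item["number"] for item in json.loads(out)]


def move_label(number: int, old: str, new: str, run: Runner = run_cli) -> None:
    run(["gh", "issue", "edit", str(number),
         "--remove-label", old, "--add-label", new])


def poll_once(lock_path: str, run: Runner = run_cli) -> List[int]:
    """Claim every ai-ready issue, unless another poll holds the lock."""
    with open(lock_path, "w") as lock_file:
        try:
            # Exclusive, non-blocking lock; released when the file closes.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return []  # a previous tick is still running; skip this one
        claimed = issues_with_label("ai-ready", run)
        for number in claimed:
            move_label(number, "ai-ready", "ai-implementing", run)
        return claimed
```

A scheduler such as cron, or a loop that sleeps 180 seconds between calls, would invoke poll_once. The runner is injectable so the logic can be exercised without a GitHub repository.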
harness/
architecture
Every agent

One Job.
Fresh Context.

Reads the issue. Does the work. Writes back. Exits. Never knows about the reviewer.
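One way to get a fresh context per job is to start a new process for every agent run and pass it nothing but its own instructions and the issue text. The sketch below assumes the agent is launched through Claude Code's non-interactive print mode (claude -p); the helper names and the prompt layout are hypothetical.

```python
import subprocess
from typing import Callable, List, Sequence

Runner = Callable[[List[str]], str]


def run_cli(cmd: List[str]) -> str:
    """Run a command and return its stdout; raises if it exits non-zero."""
    return subprocess.run(
        cmd, check=True, capture_output=True, text=True
    ).stdout


def build_prompt(role_instructions: str, issue_body: str) -> str:
    """The agent sees its own instructions and the issue, nothing else."""
    return f"{role_instructions.strip()}\n\nIssue:\n{issue_body.strip()}\n"


def run_agent(
    role_instructions: str,
    issue_body: str,
    run: Runner = run_cli,
    command: Sequence[str] = ("claude", "-p"),  # assumed launch command
) -> str:
    prompt = build_prompt(role_instructions, issue_body)
    # A new process per job: no conversation history carries over, and the
    # process exits as soon as it has produced its output.
    return run([*command, prompt])
```

Nothing in the prompt mentions the other roles, which matches the idea that the engineer never knows about the reviewer; the orchestrator alone decides who runs next.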
harness/
architecture
 

Agents Are
Markdown Files.

.claude/agents/
├─ engineer.md   # Safety, conventions, commits
├─ reviewer.md   # Review criteria, security
└─ qa.md         # Test strategy, coverage
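Since each agent is a markdown file, a harness could load the role definitions with a few lines of code. This sketch treats the file contents as opaque instruction text and uses the file name as the role name; load_agents is a hypothetical helper, not part of any tool.

```python
from pathlib import Path
from typing import Dict


def load_agents(directory: Path) -> Dict[str, str]:
    """Map each agent name (the file stem) to the text of its markdown file."""
    return {
        path.stem: path.read_text(encoding="utf-8")
        for path in sorted(directory.glob("*.md"))
    }
```

Calling load_agents(Path(".claude/agents")) on the layout above would return one entry each for engineer, qa, and reviewer. Because the definitions are plain files, they can be reviewed and versioned like any other code.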
harness/
demo
Live demo

The Full
Pipeline.

 
harness/
landing
 

The Harness
Is Your Advantage.

The AI doesn't differentiate your team. The harness does. Engineering culture, encoded as infrastructure.
harness/
end
 

Go Build Your
Harness.

Eyes off. Nap allowed.
Peter Eysermans · CTO in Residence · madewithlove