thejambot.com / Agent Capability Framework

29 Capabilities. Three Modes. One Threshold.

The question isn't whether your agents are capable. It's whether your platform has the semantic infrastructure for Execute mode to be safe.

"If your agent executes perfectly against stale definitions, how would you know?"

Wall-E compacted trash 700 years after everyone left. AUTO enforced a directive long after it stopped applying. They weren't broken. They were literal. The gap was in the maintenance, not the model.

HOW TO USE THIS MAP
1
Green = safe to Execute
Guardrails in place. Low blast radius. Agents can act without a review gate.
2
Amber = keep Propose+Review
Agents draft, humans approve. GitOps is the safety net. Don't skip the gate.
3
Blue = fix vocabulary first
Semantic debt blocks autonomous execution. The gap is language, not tooling.
Assist agent observes
Propose agent drafts, human approves
Execute agent acts, semantic infra required
0
Ready Now
0
Emerging
0
Invest to Unlock
87
Total Cells

How to Read This Map

15 seconds
The Grid
Rows = Capability domains
Columns = Agent modes (Assist → Propose → Execute)
Selector = Your platform maturity level
The Colors
Green = Go. Guardrails in place, safe to scale.
Amber = Slow. Works, but add review gates.
Blue = Stop. Fix vocabulary/docs first.
→ Start Here
  1. 1. Select your maturity level
  2. 2. Find your blue cells (your gaps)
  3. 3. Hover any cell for context
  4. 4. Use the scoring rubric below
PLATFORM MATURITY LEVEL

Select your maturity to watch the map transform

AUTONOMY:
Assist
Assist Mode

Agent watches and surfaces insights. Human drives. Low risk: the agent can't change anything.

Example: "Show me slow queries" or "Flag risky PRs"
Propose
Propose Mode

Agent drafts changes for human approval. GitOps provides the safety net. Medium risk: requires a review gate.

Example: "Draft a fix for this bug" or "Suggest config change"
Execute
Execute Mode

Agent acts autonomously. Higher risk: requires semantic coherence and rollback capability before granting this mode.

Example: "Auto-scale based on traffic" or "Auto-remediate alerts"
PACE:
Fast
Fast Pace (Weeks)

Changes frequently. Low blast radius. Safe for agent experimentation.

Medium
Medium Pace (Months)

Changes quarterly. Moderate blast radius. Agent proposals work well.

Slow
Slow Pace (Quarters)

Changes rarely. High blast radius. Agent assist valuable, execution risky.

Cross
Cross-Cutting

Spans multiple pace layers. Changes cascade unpredictably.

Hover over any term for details

The Five Ceremonies

Every product org runs on recurring rituals: standups, retros, planning, demos. These ceremonies fall into four categories. A fifth is missing from almost every org. It's the one your agents need most.

CEREMONY 01

Discovery & Planning

🤖 Agent Ready

Backlog grooming, sprint planning, user research synthesis, competitive analysis.

Why agents work here: Low stakes, reversible outputs.
CEREMONY 02

Delivery & Iteration

🛸 Emerging

Standups, code review, CI/CD, deployments, hotfixes, feature flags.

Why agents are close: GitOps provides guardrails.
CEREMONY 03

Strategy & Alignment

🧭 Captain Territory

OKR setting, roadmap reviews, resource allocation, priority calls, reorgs.

Why agents struggle: These decisions require context that lives in someone's head, not in any doc.
CEREMONY 04

Learning & Validation

🤖 Agent Ready

Retros, post-mortems, A/B test analysis, metrics reviews, customer feedback loops.

Why agents work here: Data-heavy, pattern-matching territory.
CEREMONY 05

Semantic Maintenance

The ceremony nobody runs. Explicitly maintaining the shared vocabulary: what "production-ready" means, what "customer" refers to, why this service exists, what "done" looks like for this team.

When this degrades, humans navigate by asking questions, reading between lines, pinging Slack. Agents execute against the docs as written. Stale docs mean stale execution.

📖
Glossary reviews
🔍
Doc-to-reality audits
🤝
Cross-team vocab sync

This is the unlock. Run this ceremony, and blue cells start turning amber. Amber turns green. Your agents stop being AUTO and start being Wall-E.

The Mental Model

Three modes. Three thresholds. The characters are a lens for remembering which preconditions apply.

🤖

Assist

Wall-E mode

Agent observes and surfaces insights. Reliable, consistent. Low blast radius. Safe to run without a gate.

🛸

Propose

EVE mode

Agent drafts, human approves. Focused and purposeful, but needs clear scope. Keep the review gate.

🧭

Execute

Captain territory

Agent acts autonomously. Requires semantic infrastructure, not just capable models. Invest in vocabulary before granting this.

These characters map to agent theory. Assist territory works because simple reflex agents are sufficient: no planning needed. Propose territory requires goal-based agents that understand intent. Execute territory demands model-based or learning agents that can predict and adapt, plus semantic infrastructure to validate their directives.

Mode
Readiness
Agent Class Required
Why This Pairing
Assist
🤖 Wall-E
Ready
Simple Reflex
Stable rules, low variance. "If X, do Y" is sufficient.
Propose
🛸 EVE
Emerging
Goal-Based
Needs intent awareness. "Reach state Y" requires understanding the goal, not just the rule.
Execute
🧭 Captain territory
Invest
Model-Based / Learning
Requires prediction, adaptation, and semantic infrastructure. Granting Execute without this produces Wall-E compacting trash.
🌱

The Captain Updated the Directive

At the end of Wall-E, the captain doesn't defeat AUTO with a better algorithm. He sees the plant, understands what it means, and decides the old instruction no longer applies.

That's the work. Not faster agents, not more capable models. Keeping the meaning behind the directive aligned with reality. Semantic Maintenance is the ceremony that makes Execute mode safe.

Read the Framework

The missing ceremony: Semantic Maintenance. The recurring audit that keeps your vocabulary aligned with your reality.