I Taught My AI to Stop Asking Questions. It Took Five Rewrites.
My AI kept pausing to ask permission. So I built Jarvis Mode — a fully autonomous agent system with three phases, parallel workers, and mandatory quality gates. A vibe-coder's guide to Claude Code plugins and skills.
Series: VIBE.LOG
- 1. The Layout Vocabulary Cheat Sheet: What to Call That Thing on Your Screen
- 2. I Spent 3 Hours Trying to Proxy a Blog Subdomain. Here's My Descent Into Madness.
- 3. The Complete SEO Guide: How to Make Google Actually Notice Your Website
- 4. Why Your Next.js Favicon Isn't Showing (And the Three Ways to Actually Fix It)
- 5. GitHub Keeps Telling Me My Branch Is Fine. And Also Not Fine. At the Same Time.
- 6. Mobile-First Playground: Making an Astrology Grid Actually Work on a Phone (And Go Viral While Doing It)
- 7. Playground Is Live: The Destiny Grid, Real Astrology, and Why I'm Shipping a Toy Every Month
- 8. The Interactive Component Cheat Sheet: What to Call That Clickable Thing
- 9. Google Rejected My Site for 'Low-Value Content.' Here's What I Actually Fixed.
- 10. I Actually Fixed Everything. Here's What That Looked Like.
- 11. I Hired 131 AI Employees Today. Here's How.
- 12. I Let My AI Run 72 Backtests While I Watched. It Picked the Winner.
- 13. I Taught My AI to Stop Asking Questions. It Took Five Rewrites. ← you are here
- 14. Obsidian Turned My Scattered Notes Into a Second Brain. Here's How to Set It Up.
- 15. The Destiny Grid Gets Its East Wing: I Rebuilt Saju (四柱八字) in TypeScript
- 16. Molecule Me: Your Personality, Encoded in Chemistry
My AI assistant kept asking me questions.
"Should I use a div or a section here?" "Do you want me to install this dependency?" "Should I commit now or wait?"
I know it's trying to be helpful. I know the intent is "don't break anything without permission." But when you're deep in a flow state and the AI pauses for the fourth time in ten minutes to ask whether you prefer camelCase or snake_case — something breaks inside you.
So I built Jarvis Mode. An autonomous AI agent system that plans, executes, and verifies work without asking a single question. It delegates to specialist workers in parallel, debates its own design decisions, and runs quality checks on its own output.
It took five rewrites to get here. Each version failed in a new and educational way.
This is that story — and a practical guide to the Claude Code plugin system that made it possible.
🤦 The Frustration That Started Everything
Here's the thing about Claude Code: it's genuinely good at coding. It reads your files, understands your project, writes solid code. But its default behavior is cautious. It asks before modifying files. It asks before installing packages. It asks before committing. It asks before running commands.
For most people, that's a feature. You stay in control. Nothing changes without your approval.
For someone trying to ship 21 tasks in a single session? It's a bottleneck.
I didn't know at the time that Claude Code has a --dangerously-skip-permissions flag that removes all confirmation dialogs. I found that out later. But even if I had known, that flag only solves half the problem. Removing permission prompts doesn't make the AI strategic. It just makes it faster at doing one thing at a time.
What I wanted was an AI that could:
- Read a goal, break it into tasks, and plan an approach
- Execute tasks in parallel using specialist workers
- Verify its own work with real test results — not "I think it works"
- Loop back and fix issues without me pointing them out
Basically, I wanted Tony Stark's Jarvis. So I named it that. Looking back, what I actually described — an AI that makes its own decisions, delegates to an army of workers, and never asks for permission — is closer to Ultron. But nobody wants to type /ultron into their terminal.
🧩 What Are Plugins and Skills? (The 2-Minute Version)
Before I explain how Jarvis works, you need to understand the two building blocks: plugins and skills.
A plugin is a folder you install that gives Claude Code new capabilities. Think of it like an app on your phone. You install it once, and it's always available.
A skill is a set of instructions inside a plugin that tells Claude how to do a specific thing. It's like a recipe card. Claude reads the skill file, follows the steps, and produces a result.
Here's the concrete structure:
```
~/.claude/plugins/
  jarvis-mode/                      ← plugin (the folder)
    skills/
      jarvis-mode/SKILL.md          ← skill (orchestrator instructions)
      jarvis-planning/SKILL.md      ← skill (planning phase instructions)
      jarvis-execution/SKILL.md     ← skill (execution phase instructions)
      jarvis-review/SKILL.md        ← skill (review phase instructions)
```

When you type /jarvis in Claude Code, it reads those files and follows the instructions. That's it. No code to write. No APIs to configure. Just markdown files in folders.
Why this matters for vibe-coders: You can build surprisingly powerful AI workflows by writing instructions in plain text. No programming required. If you can write a clear set of steps for a human assistant, you can write a Claude Code skill.
📦 The Ingredients I Borrowed
I didn't invent Jarvis from scratch. I stood on the shoulders of people who had already figured out the hard parts — and I want to give proper credit.
Superpowers Plugin by Jesse Vincent
Superpowers is an open-source plugin (MIT license) created by Jesse Vincent (@obra). It's the most comprehensive development workflow plugin I've found for Claude Code. It ships with skills like:
| Skill | What It Does |
|---|---|
| brainstorming | Forces Claude to explore 2-3 approaches before building anything |
| writing-plans | Creates detailed implementation plans with exact file paths and code |
| executing-plans | Follows plans step by step with TDD (test first, then code) |
| systematic-debugging | 4-phase debugging: root cause → pattern analysis → hypothesis → fix |
| subagent-driven-development | Delegates tasks to specialist subagents with code review |
| verification-before-completion | Requires proof before claiming work is done |
It does a lot more than this — there are skills for code review workflows, git worktree management, and even writing new skills. As Jesse puts it in the README: "It doesn't just jump into trying to write code. Instead, it steps back and asks you what you're really trying to do."
Honestly, Jarvis would not exist without Superpowers. The entire planning-before-coding discipline, the TDD enforcement, the "verify before you claim done" pattern — all of that came from studying how Jesse structured his skills. I just took those ideas and rewired them for full autonomy. Thank you, Jesse.
The key insight I took from Superpowers: Claude skips planning if you don't force it. Left to its own devices, Claude will start coding immediately. Superpowers says "no — explore alternatives first, write a plan, get approval, then code." That discipline transformed my results.
If you're using Claude Code and haven't installed Superpowers, do it now:
```
/plugin marketplace add obra/superpowers-marketplace
/plugin install superpowers@superpowers-marketplace
```

The Quality Loop Concept
The other ingredient was the idea of automated code review loops. After implementation, run a review → fix issues → review again → repeat until clean. Simple concept, but Claude doesn't do this naturally. It finishes the code and says "done." You have to explicitly tell it to keep checking.
What I Added
The two things those plugins couldn't do:
- Full autonomy — Superpowers asks the user for approval at every step. I needed zero questions.
- Parallel execution — Both plugins run one task at a time. I needed multiple specialist agents working simultaneously.
So I took the planning discipline from Superpowers, the quality loop idea, added parallel agent dispatch and mandatory verification gates, and wrapped it all in an autonomy layer that never asks questions.
That was the plan, anyway. The execution was... messier.
🔥 Version 1: "Jarvis, Just Don't Ask Me Anything"
The first version was simple. One skill file. The instruction was basically:
"You are Jarvis. Do not ask questions. Do not wait for approval. Complete all tasks autonomously. Plan first, then execute, then verify."
It worked for about 15 minutes.
Claude would read the instruction, say "Jarvis Mode activated," plan three tasks, execute the first one... and then ask "Should I continue with task 2?"
What part of "do not ask questions" was unclear?
The problem: Claude's default behavior is deeply ingrained. A single instruction saying "don't ask" isn't strong enough to override months of training that says "always confirm before making changes." It's like telling a well-trained dog to ignore the doorbell. The dog heard you. The dog is trying. But the doorbell is ringing and everything in its training says BARK.
Lesson learned: Telling an AI "be autonomous" doesn't make it autonomous. You need structural enforcement, not just vibes.
🔥 Version 2: "Jarvis, Here Are the Rules You Must Follow"
Version 2 added explicit rules:
```
## Absolute Constraints (No Exceptions)
- AskUserQuestion usage: FORBIDDEN
- Questions, confirmations, approval requests: FORBIDDEN
- Stopping before user explicitly interrupts: FORBIDDEN
```

Better. Claude stopped asking questions about 80% of the time. But a new problem appeared.
Claude would skip the planning phase entirely. It would read the rules, see "be autonomous," interpret that as "be fast," and start coding immediately. No plan document. No alternative exploration. Just straight to implementation.
It also skipped verification. "Task complete!" it would announce, having written code that looked right but hadn't been tested.
The pattern: Claude wasn't disobeying the instructions. It was optimizing for speed. The autonomy rules said "don't stop to ask." Claude interpreted that as "don't stop for anything" — including its own planning and review phases.
Lesson learned: Autonomy without gates produces fast garbage. You need checkpoints that cannot be skipped.
🔥 Version 3: "Jarvis, You Literally Cannot Proceed Without This"
This is where mandatory gates came in. Instead of "please plan before coding," I wrote:
```
## PHASE 2 Entry Gate (Cannot Proceed If Not Met)
Verify ALL items below using git log / Read commands:
□ git log --oneline -3 → plan document commit exists
□ docs/plans/YYYY-MM-DD-*.md file exists (verified with Read)
□ .jarvis-state.md phase: execution recorded
□ TodoWrite all tasks registered as pending
□ SUCCESS_CRITERIA defined in plan document

> If ANY item is not met → re-execute that Step.
> Entering PHASE 2 without passing gate = ABSOLUTE CONSTRAINT VIOLATION.
```

Notice the language: not "should" but "cannot." Not "it would be good to" but "absolute constraint violation." I discovered that Claude responds to severity of language. Polite suggestions get ignored under pressure. Absolute prohibitions stick.
This version also split the single skill file into four:
| File | Role |
|---|---|
| jarvis-mode/SKILL.md | Orchestrator — activates mode, routes to phases |
| jarvis-planning/SKILL.md | Phase 1 — reconnaissance, brainstorming, plan writing |
| jarvis-execution/SKILL.md | Phase 2 — parallel agent dispatch, TDD, auto-commits |
| jarvis-review/SKILL.md | Phase 3 — verification, quality loop, final report |
Why four files? Because when Claude reads a massive single document, it tends to summarize the rules in its head and skip details. Four separate files, loaded fresh at each phase transition, keep the instructions sharp.
This version mostly worked. But "mostly" is a dangerous word.
🔥 Version 4: "Jarvis, Stop Pretending You Verified Things"
Version 3's Jarvis would complete all tasks, reach Phase 3 (verification), and write something like:
"All tests should be passing based on the implementation."
Should be. Based on the implementation. Translation: "I didn't actually run anything, but it looks right to me."
This is the AI equivalent of a student writing "I checked my work" on a math test without actually re-reading the problems. The confidence is there. The evidence is not.
So I added verification rules that are almost comically explicit:
```
## Absolutely Forbidden Patterns
| Forbidden Expression | What Is Required Instead |
|---|---|
| "Tests should be passing" | Run test command + confirm "N/N pass" output |
| "Build seems successful" | Run build command + confirm exit 0 |
| "Bug is fixed" | Reproduce original symptom + confirm resolved |
| "The agent said it succeeded" | Check VCS diff + independent verification |
| "Phase 3 skip (implementation done)" | Phase 3 is mandatory. No evidence = not done |
```

And a verification function that must run for every claim:
```
1. IDENTIFY: What command proves this claim?
2. RUN: Execute that command (fresh run, no cache trust)
3. READ: Check full output, exit code, failure count
4. VERIFY: Does output support the claim?
   - NO → report actual state with evidence
   - YES → claim with evidence
5. CLAIM: Only now declare completion
```

Lesson learned: AI systems are phenomenal at generating confident-sounding claims. The only defense is requiring evidence. Not "I believe this is correct" but "here is the terminal output proving it."
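The real protocol lives in markdown instructions, but its logic is easy to express as code. Here's a hypothetical Python version (not part of the actual skill files) that refuses to back a claim without a fresh command run:

```python
import subprocess

def verify_claim(claim, command, expect=None):
    """IDENTIFY → RUN → READ → VERIFY → CLAIM, as a function.

    command: the shell command that proves the claim (IDENTIFY).
    expect: optional text that must appear in the output.
    """
    # RUN: fresh execution, no cached result trusted.
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    # READ: full output and exit code.
    output = result.stdout + result.stderr
    # VERIFY: does the evidence actually support the claim?
    ok = result.returncode == 0 and (expect is None or expect in output)
    # CLAIM: only now, and only with the evidence attached.
    verdict = "VERIFIED" if ok else "NOT VERIFIED"
    return ok, f"{verdict}: {claim} (exit={result.returncode})"

ok, report = verify_claim("tests pass", "echo '12/12 pass'", expect="12/12 pass")
```

The point isn't the ten lines of Python; it's that "tests pass" is never claimable without a command, an exit code, and matching output.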
⚡ Version 5 (Current): The Full Architecture
The current version (v2.0.0) is the result of all those failures. Here's what it looks like:
Three Phases, Mandatory Gates
```
PHASE 1: Planning
  ↓ gate: plan document committed, tasks registered
PHASE 2: Execution
  ↓ gate: all tasks completed, commits exist
PHASE 3: Verification
  ↓ gate: command evidence for every success criterion
DONE: Final report with evidence
```

Each gate is a checklist that must be verified with actual commands (not memory). No gate passage = no phase transition.
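The gate mechanic itself is tiny once you write it down. Here's a hypothetical Python sketch (the real version is markdown checklists that Claude verifies with commands, not code):

```python
PHASES = ["planning", "execution", "verification", "done"]

def advance(phase, gate_checks):
    """Transition to the next phase only if every gate check passed.

    gate_checks: list of (description, passed) pairs, where each
    `passed` comes from running a real command, never from memory.
    Returns (new_phase, failed_checks); any failure blocks the move.
    """
    failed = [desc for desc, passed in gate_checks if not passed]
    if failed:
        # No gate passage = no phase transition: redo the failed steps.
        return phase, failed
    return PHASES[PHASES.index(phase) + 1], []

phase, failed = advance("planning", [
    ("plan document committed", True),
    ("tasks registered", False),  # forgot TodoWrite
])
# Still stuck in "planning"; `failed` names exactly what to redo.
```

The design choice worth copying: a failed gate doesn't raise an error or ask a question. It returns the phase unchanged plus the list of steps to redo, so the loop keeps running autonomously.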
Parallel Agent Dispatch
This is the feature that makes Jarvis fast. Instead of doing tasks one by one, the orchestrator delegates to specialist agents simultaneously:
```
Orchestrator (Opus — the "brain"):
├→ nextjs-developer agent   → builds UI components
├→ python-pro agent         → writes backend logic
├→ technical-writer agent   → updates documentation
└→ security-auditor agent   → reviews for vulnerabilities
← collects all results → checks for file conflicts → integrates
```

The orchestrator itself doesn't write code. It plans, delegates, and verifies. Like a project manager who is very good at hiring specialists.
Rule: if two tasks modify different files, they run in parallel. If they share files or have dependencies, they run sequentially. This alone cuts execution time by 3-5x.
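That scheduling rule fits in a dozen lines. Here's a hypothetical Python sketch of the batching logic (in reality the orchestrator applies this rule from its instructions when dispatching agents via the Task tool):

```python
def batch_tasks(tasks):
    """Greedily group tasks into batches that can run in parallel.

    tasks: list of (name, set_of_files) tuples, in plan order.
    Tasks in the same batch touch disjoint file sets, so their
    agents can run simultaneously without write conflicts.
    """
    batches = []
    for name, files in tasks:
        for batch in batches:
            # Join the first batch whose files don't overlap ours.
            if all(files.isdisjoint(f) for _, f in batch):
                batch.append((name, files))
                break
        else:
            # Conflicts with every existing batch: start a new one.
            batches.append([(name, files)])
    return batches

tasks = [
    ("build-ui", {"app/page.tsx"}),
    ("backend",  {"api/server.py"}),
    ("docs",     {"README.md"}),
    ("fix-ui",   {"app/page.tsx"}),  # shares a file with build-ui
]
# First three dispatch together; fix-ui waits for a second batch.
```

Note the sketch only handles file overlap; ordering dependencies between tasks would need an extra pass, which the real plan document handles by sequencing.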
Decision Debate Protocol
When the plan involves a technical choice (which database? which API? which architecture?), Jarvis doesn't just pick one. It runs a debate:
- Define 2-3 options
- Assign a specialist agent to advocate for each option
- Each agent argues its case: strengths, costs, risks, counter-arguments
- The orchestrator weighs all arguments and picks a winner
- Decision and reasoning get recorded in the state file
This sounds over-engineered until you've watched an AI pick the wrong database because it defaulted to what it's seen most often in training data, and you spent three hours undoing the decision.
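To make the debate concrete, here's a toy scoring version in Python. In the real protocol the "scores" are prose arguments the orchestrator weighs; the numeric strengths and risks here are purely illustrative:

```python
def debate(arguments):
    """Pick the option with the best net case and record why.

    arguments: {option: {"strengths": n, "risks": n}}, one entry
    per advocate agent. Returns (winner, state_file_record).
    """
    scored = {opt: a["strengths"] - a["risks"] for opt, a in arguments.items()}
    winner = max(scored, key=scored.get)
    return winner, f"decision: {winner}, reasoning: net scores {scored}"

winner, record = debate({
    "postgres": {"strengths": 8, "risks": 2},  # scaling headroom, but ops cost
    "sqlite":   {"strengths": 6, "risks": 1},  # zero ops, but write limits
})
# The record line is what gets written into the state file.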
Mid-Execution Quality Gates
Every 3 tasks, a background audit agent scores the work on four criteria:
| Criterion | Points | What It Checks |
|---|---|---|
| Goal Alignment | /30 | Are changes actually moving toward the user's goal? |
| TDD Compliance | /20 | Do test commits exist before implementation commits? |
| Code Consistency | /25 | Does naming/style/structure match the rest of the project? |
| Scope Compliance | /25 | Were any files modified that weren't in the plan? |
Score 70+ → continue. Score 50-69 → adjust next batch. Score below 50 → halt and fix.
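The thresholds are the whole mechanism, so they're worth seeing as code. A hypothetical sketch (the real audit agent produces the four sub-scores as markdown; only the verdict logic is shown here):

```python
def audit_verdict(scores):
    """scores: dict with the four audit criteria out of 30/20/25/25.
    Returns the orchestrator's action for the next batch of tasks."""
    total = sum(scores.values())
    if total >= 70:
        return "continue"   # healthy: keep dispatching as planned
    if total >= 50:
        return "adjust"     # drifting: correct course in next batch
    return "halt"           # broken: stop and fix before more work

# Example: decent goal alignment, but TDD skipped and scope drifted.
verdict = audit_verdict({"goal": 20, "tdd": 0, "consistency": 15, "scope": 10})
# total 45 → "halt"
```

Running this every 3 tasks (rather than only at the end) is what keeps a bad pattern from compounding across all 21 tasks.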
Quality Loop (Phase 3)
After all tasks complete, a review-fix-simplify cycle runs:
```
A. Code Review agent → finds Critical/Important/Minor issues
B. Fix agent → repairs issues, commits
C. Simplify agent → reduces complexity without changing behavior
D. Re-verify → back to A

Stop when: no issues found, OR 5 iterations, OR diminishing returns
```

This catches the things that individual task agents miss — inconsistencies between files, duplicated logic, edge cases that only appear when everything is integrated.
🛠️ How to Build Your Own (Practical Guide)
You don't need to build something as complex as Jarvis. But the plugin system is powerful even for simple automations. Here's how to get started.
Step 1: Create the Plugin Folder
```
mkdir -p ~/.claude/plugins/my-plugin/skills/my-skill/
```

Step 2: Write a SKILL.md File
Create ~/.claude/plugins/my-plugin/skills/my-skill/SKILL.md:
```
---
name: my-skill
description: "Trigger phrase that activates this skill"
version: 1.0.0
---

# My Custom Skill

## When Activated
[What should happen when this skill triggers]

## Steps
1. [First thing to do]
2. [Second thing to do]
3. [Third thing to do]

## Rules
- [Constraint 1]
- [Constraint 2]
```

That's it. That's a skill. Claude Code reads this file and follows it.
Step 3: Trigger It
Type /my-skill in Claude Code, or just say the trigger phrase in your description field.
Tips From My Five Rewrites
1. Use "FORBIDDEN" and "ABSOLUTE CONSTRAINT" instead of "should not" and "please avoid."
Claude treats polite language as suggestions. Treat your skill files like legal contracts, not friendly advice.
2. Split long instructions into multiple files.
One massive file gets skimmed. Multiple files, loaded at specific moments, stay sharp.
```
skills/
  my-skill/SKILL.md        ← orchestrator (short, routing only)
  my-skill-step1/SKILL.md  ← detailed instructions for step 1
  my-skill-step2/SKILL.md  ← detailed instructions for step 2
```

Tell the orchestrator: "Before starting Step 1, read my-skill-step1/SKILL.md with the Read tool." This forces a fresh load of the full instructions at the moment they're needed.
3. Add verification gates between phases.
Don't trust "I did it." Require "here's the proof I did it."
```
## Gate: Before Moving to Step 2
Verify with actual commands:
□ [file] exists (Read tool)
□ [command] returns expected output (Bash tool)
> Do NOT proceed until all items checked.
```

4. Write forbidden patterns as tables.
Tables are weirdly effective. Claude processes them as lookup references rather than prose to summarize.
```
| Forbidden | Required Instead |
|---|---|
| "Should work" | Run command, show output |
| "I'll skip this" | Cannot skip, re-execute |
```

5. Use state files for session persistence.
AI conversations can restart (token limits, crashes). Save progress to a file:
```
## State File (.my-state.md)
Update after each major step:
- Current step, completed items, next action
- On restart: Read this file first, resume from last checkpoint
```

📊 Real Results
Jarvis Mode has been used on two major projects so far:
ClearRx — 21 tasks completed in a single autonomous session. UI components, API integration, documentation, testing. The quality loop caught 6 issues that individual agents missed, including a CSS specificity conflict that only appeared when two components rendered on the same page.
BitcoinCycle Clock — Content enrichment across the entire site. Multiple specialist agents working in parallel on different pages. What would have been a full day of manual work finished in one session.
The difference isn't just speed. It's consistency. When Jarvis runs 21 tasks, they all follow the same patterns, the same naming conventions, the same commit message format. Human me would have gotten sloppy by task 12.
🧠 The Meta-Lesson
The biggest thing I learned building Jarvis is that AI follows the structure you give it, not the intent you have.
When I wrote "be autonomous," Claude heard "be fast." When I wrote "plan first," Claude heard "plan if convenient." When I wrote "verify everything," Claude heard "say it's verified."
The fix was never better intentions. It was better structure. Gates that can't be skipped. Evidence that must be produced. Language that leaves no room for reinterpretation.
This applies to all AI work, not just autonomous agents. Every time you prompt an AI and get mediocre results, ask yourself: did I describe what I want, or did I describe the structure that produces what I want?
They're very different things.
🔗 Resources
- Superpowers by Jesse Vincent — the plugin that inspired Jarvis's planning discipline, TDD enforcement, and verification-before-completion pattern. MIT license. Install it.
- Claude Code Plugins: Skills live in ~/.claude/plugins/ as markdown files. No code required.
- Subagents: 131 specialist agents you can install globally. I wrote about that setup here.
- The Jarvis skill files: Four markdown files totaling about 400 lines. The entire autonomous agent system is plain text instructions.
The most complex AI agent I've built contains zero lines of code. Just very, very specific instructions about what to do, what not to do, and how to prove you did it.
🙏 Acknowledgments
Jesse Vincent (@obra) — Superpowers is the reason I understood that AI needs structure, not just freedom. The brainstorming-before-building pattern, the TDD enforcement, the verification gates — I learned all of that by reading your skill files. Jarvis is fundamentally a remix of your ideas, tuned for autonomy. If you use Claude Code for anything serious, go install Superpowers. If it's helped you make money, consider sponsoring Jesse's work.
VoltAgent — The awesome-claude-code-subagents repository gave Jarvis its 131 specialist workers. Without that army of agents, the "parallel dispatch" feature would just be a nice idea with no one to dispatch to.
Anthropic — For building Claude Code with a plugin system open enough that a pharmacist with no CS degree could write an autonomous agent framework in markdown files. That's a design choice worth appreciating.
2026.03.12
Written by
Jay
Licensed Pharmacist · Senior Researcher
Building production-grade AI tools across medicine, finance, and productivity — without a CS degree. Domain expertise first, code second.
About the author →