I Taught My AI to Stop Asking Questions. It Took Five Rewrites.
My AI kept pausing to ask permission. So I built Jarvis Mode — a fully autonomous agent system with three phases, parallel workers, and mandatory quality gates. A vibe-coder's guide to Claude Code plugins and skills.
Series: VIBE.LOG
- 1. The Layout Vocabulary Cheat Sheet: What to Call That Thing on Your Screen
- 2. I Spent 3 Hours Trying to Proxy a Blog Subdomain. Here's My Descent Into Madness.
- 3. The Complete SEO Guide: How to Make Google Actually Notice Your Website
- 4. Why Your Next.js Favicon Isn't Showing (And the Three Ways to Actually Fix It)
- 5. GitHub Keeps Telling Me My Branch Is Fine. And Also Not Fine. At the Same Time.
- 6. Mobile-First Playground: Making an Astrology Grid Actually Work on a Phone (And Go Viral While Doing It)
- 7. Playground Is Live: The Destiny Grid, Real Astrology, and Why I'm Shipping a Toy Every Month
- 8. The Interactive Component Cheat Sheet: What to Call That Clickable Thing
- 9. Google Rejected My Site for 'Low-Value Content.' Here's What I Actually Fixed.
- 10. I Actually Fixed Everything. Here's What That Looked Like.
- 11. I Hired 131 AI Employees Today. Here's How.
- 12. I Let My AI Run 72 Backtests While I Watched. It Picked the Winner.
- 13. I Taught My AI to Stop Asking Questions. It Took Five Rewrites. ← you are here
- 14. Obsidian Turned My Scattered Notes Into a Second Brain. Here's How to Set It Up.
- 15. The Destiny Grid Gets Its East Wing: I Rebuilt Saju (四柱八字) in TypeScript
- 16. Molecule Me: Your Personality, Encoded in Chemistry
My AI assistant kept asking me questions.
"Should I use a div or a section here?" "Do you want me to install this dependency?" "Should I commit now or wait?"
I know it's trying to be helpful. I know the intent is "don't break anything without permission." But when you're deep in a flow state and the AI pauses for the fourth time in ten minutes to ask whether you prefer camelCase or snake_case — something breaks inside you.
So I built Jarvis Mode. An autonomous AI agent system that plans, executes, and verifies work without asking a single question. It delegates to specialist workers in parallel, debates its own design decisions, and runs quality checks on its own output.
It took five rewrites to get here. Each version failed in a new and educational way.
This is that story — and a practical guide to the Claude Code plugin system that made it possible.
🤦 The Frustration That Started Everything
Here's the thing about Claude Code: it's genuinely good at coding. It reads your files, understands your project, writes solid code. But its default behavior is cautious. It asks before modifying files. It asks before installing packages. It asks before committing. It asks before running commands.
For most people, that's a feature. You stay in control. Nothing changes without your approval.
For someone trying to ship 21 tasks in a single session? It's a bottleneck.
I didn't know at the time that Claude Code has a --dangerously-skip-permissions flag that removes all confirmation dialogs. I found that out later. But even if I had known, that flag only solves half the problem. Removing permission prompts doesn't make the AI strategic. It just makes it faster at doing one thing at a time.
What I wanted was an AI that could:
- Read a goal, break it into tasks, and plan an approach
- Execute tasks in parallel using specialist workers
- Verify its own work with real test results — not "I think it works"
- Loop back and fix issues without me pointing them out
Basically, I wanted Tony Stark's Jarvis. So I named it that. Looking back, what I actually described — an AI that makes its own decisions, delegates to an army of workers, and never asks for permission — is closer to Ultron. But nobody wants to type /ultron into their terminal.
🧩 What Are Plugins and Skills? (The 2-Minute Version)
Before I explain how Jarvis works, you need to understand the two building blocks: plugins and skills.
A plugin is a folder you install that gives Claude Code new capabilities. Think of it like an app on your phone. You install it once, and it's always available.
A skill is a set of instructions inside a plugin that tells Claude how to do a specific thing. It's like a recipe card. Claude reads the skill file, follows the steps, and produces a result.
Here's the concrete structure:
```
~/.claude/plugins/
  jarvis-mode/                      ← plugin (the folder)
    skills/
      jarvis-mode/SKILL.md          ← skill (orchestrator instructions)
      jarvis-planning/SKILL.md      ← skill (planning phase instructions)
      jarvis-execution/SKILL.md     ← skill (execution phase instructions)
      jarvis-review/SKILL.md        ← skill (review phase instructions)
```

When you type /jarvis in Claude Code, it reads those files and follows the instructions. That's it. No code to write. No APIs to configure. Just markdown files in folders.
Why this matters for vibe-coders: You can build surprisingly powerful AI workflows by writing instructions in plain text. No programming required. If you can write a clear set of steps for a human assistant, you can write a Claude Code skill.
📦 The Ingredients I Borrowed
I didn't invent Jarvis from scratch. I stood on the shoulders of people who had already figured out the hard parts — and I want to give proper credit.
Superpowers Plugin by Jesse Vincent
Superpowers is an open-source plugin (MIT license) created by Jesse Vincent (@obra). It's the most comprehensive development workflow plugin I've found for Claude Code. It ships with skills like:
| Skill | What It Does |
|---|---|
| brainstorming | Forces Claude to explore 2-3 approaches before building anything |
| writing-plans | Creates detailed implementation plans with exact file paths and code |
| executing-plans | Follows plans step by step with TDD (test first, then code) |
| systematic-debugging | 4-phase debugging: root cause → pattern analysis → hypothesis → fix |
| subagent-driven-development | Delegates tasks to specialist subagents with code review |
| verification-before-completion | Requires proof before claiming work is done |
It does a lot more than this — there are skills for code review workflows, git worktree management, and even writing new skills. As Jesse puts it in the README: "It doesn't just jump into trying to write code. Instead, it steps back and asks you what you're really trying to do."
Honestly, Jarvis would not exist without Superpowers. The entire planning-before-coding discipline, the TDD enforcement, the "verify before you claim done" pattern — all of that came from studying how Jesse structured his skills. I just took those ideas and rewired them for full autonomy. Thank you, Jesse.
The key insight I took from Superpowers: Claude skips planning if you don't force it. Left to its own devices, Claude will start coding immediately. Superpowers says "no — explore alternatives first, write a plan, get approval, then code." That discipline transformed my results.
If you're using Claude Code and haven't installed Superpowers, do it now:
```
/plugin marketplace add obra/superpowers-marketplace
/plugin install superpowers@superpowers-marketplace
```

The Quality Loop Concept
The other ingredient was the idea of automated code review loops. After implementation, run a review → fix issues → review again → repeat until clean. Simple concept, but Claude doesn't do this naturally. It finishes the code and says "done." You have to explicitly tell it to keep checking.
What I Added
The two things those plugins couldn't do:
- Full autonomy — Superpowers asks the user for approval at every step. I needed zero questions.
- Parallel execution — Both plugins run one task at a time. I needed multiple specialist agents working simultaneously.
So I took the planning discipline from Superpowers, the quality loop idea, added parallel agent dispatch and mandatory verification gates, and wrapped it all in an autonomy layer that never asks questions.
That was the plan, anyway. The execution was... messier.
🔥 Version 1: "Jarvis, Just Don't Ask Me Anything"
The first version was simple. One skill file. The instruction was basically:
"You are Jarvis. Do not ask questions. Do not wait for approval. Complete all tasks autonomously. Plan first, then execute, then verify."
It worked for about 15 minutes.
Claude would read the instruction, say "Jarvis Mode activated," plan three tasks, execute the first one... and then ask "Should I continue with task 2?"
What part of "do not ask questions" was unclear?
The problem: Claude's default behavior is deeply ingrained. A single instruction saying "don't ask" isn't strong enough to override months of training that says "always confirm before making changes." It's like telling a well-trained dog to ignore the doorbell. The dog heard you. The dog is trying. But the doorbell is ringing and everything in its training says BARK.
Lesson learned: Telling an AI "be autonomous" doesn't make it autonomous. You need structural enforcement, not just vibes.
🔥 Version 2: "Jarvis, Here Are the Rules You Must Follow"
Version 2 added explicit rules:
```
## Absolute Constraints (No Exceptions)
- AskUserQuestion usage: FORBIDDEN
- Questions, confirmations, approval requests: FORBIDDEN
- Stopping before user explicitly interrupts: FORBIDDEN
```

Better. Claude stopped asking questions about 80% of the time. But a new problem appeared.
Claude would skip the planning phase entirely. It would read the rules, see "be autonomous," interpret that as "be fast," and start coding immediately. No plan document. No alternative exploration. Just straight to implementation.
It also skipped verification. "Task complete!" it would announce, having written code that looked right but hadn't been tested.
The pattern: Claude wasn't disobeying the instructions. It was optimizing for speed. The autonomy rules said "don't stop to ask." Claude interpreted that as "don't stop for anything" — including its own planning and review phases.
Lesson learned: Autonomy without gates produces fast garbage. You need checkpoints that cannot be skipped.
🔥 Version 3: "Jarvis, You Literally Cannot Proceed Without This"
This is where mandatory gates came in. Instead of "please plan before coding," I wrote:
```
## PHASE 2 Entry Gate (Cannot Proceed If Not Met)
Verify ALL items below using git log / Read commands:
□ git log --oneline -3 → plan document commit exists
□ docs/plans/YYYY-MM-DD-*.md file exists (verified with Read)
□ .jarvis-state.md phase: execution recorded
□ TodoWrite all tasks registered as pending
□ SUCCESS_CRITERIA defined in plan document

> If ANY item is not met → re-execute that Step.
> Entering PHASE 2 without passing gate = ABSOLUTE CONSTRAINT VIOLATION.
```

Notice the language: not "should" but "cannot." Not "it would be good to" but "absolute constraint violation." I discovered that Claude responds to severity of language. Polite suggestions get ignored under pressure. Absolute prohibitions stick.
This version also split the single skill file into four:
| File | Role |
|---|---|
| jarvis-mode/SKILL.md | Orchestrator — activates mode, routes to phases |
| jarvis-planning/SKILL.md | Phase 1 — reconnaissance, brainstorming, plan writing |
| jarvis-execution/SKILL.md | Phase 2 — parallel agent dispatch, TDD, auto-commits |
| jarvis-review/SKILL.md | Phase 3 — verification, quality loop, final report |
Why four files? Because when Claude reads a massive single document, it tends to summarize the rules in its head and skip details. Four separate files, loaded fresh at each phase transition, keep the instructions sharp.
This version mostly worked. But "mostly" is a dangerous word.
🔥 Version 4: "Jarvis, Stop Pretending You Verified Things"
Version 3's Jarvis would complete all tasks, reach Phase 3 (verification), and write something like:
"All tests should be passing based on the implementation."
Should be. Based on the implementation. Translation: "I didn't actually run anything, but it looks right to me."
This is the AI equivalent of a student writing "I checked my work" on a math test without actually re-reading the problems. The confidence is there. The evidence is not.
So I added verification rules that are almost comically explicit:
```
## Absolutely Forbidden Patterns
| Forbidden Expression | What Is Required Instead |
|---|---|
| "Tests should be passing" | Run test command + confirm "N/N pass" output |
| "Build seems successful" | Run build command + confirm exit 0 |
| "Bug is fixed" | Reproduce original symptom + confirm resolved |
| "The agent said it succeeded" | Check VCS diff + independent verification |
| "Phase 3 skip (implementation done)" | Phase 3 is mandatory. No evidence = not done |
```

And a verification function that must run for every claim:
```
1. IDENTIFY: What command proves this claim?
2. RUN: Execute that command (fresh run, no cache trust)
3. READ: Check full output, exit code, failure count
4. VERIFY: Does output support the claim?
   - NO → report actual state with evidence
   - YES → claim with evidence
5. CLAIM: Only now declare completion
```

Lesson learned: AI systems are phenomenal at generating confident-sounding claims. The only defense is requiring evidence. Not "I believe this is correct" but "here is the terminal output proving it."
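The real protocol lives in markdown instructions, but its logic is easy to express as code. Here's a hypothetical Python version (not part of the actual skill files) that refuses to back a claim without a fresh command run:

```python
import subprocess

def verify_claim(claim, command, expect=None):
    """IDENTIFY → RUN → READ → VERIFY → CLAIM, as a function.

    command: the shell command that proves the claim (IDENTIFY).
    expect: optional text that must appear in the output.
    """
    # RUN: fresh execution, no cached result trusted.
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    # READ: full output and exit code.
    output = result.stdout + result.stderr
    # VERIFY: does the evidence actually support the claim?
    ok = result.returncode == 0 and (expect is None or expect in output)
    # CLAIM: only now, and only with the evidence attached.
    verdict = "VERIFIED" if ok else "NOT VERIFIED"
    return ok, f"{verdict}: {claim} (exit={result.returncode})"

ok, report = verify_claim("tests pass", "echo '12/12 pass'", expect="12/12 pass")
```

The point isn't the ten lines of Python; it's that "tests pass" is never claimable without a command, an exit code, and matching output.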
⚡ Version 5 (Current): The Full Architecture
The current version (v2.0.0) is the result of all those failures. Here's what it looks like:
Three Phases, Mandatory Gates
```
PHASE 1: Planning
  ↓ gate: plan document committed, tasks registered
PHASE 2: Execution
  ↓ gate: all tasks completed, commits exist
PHASE 3: Verification
  ↓ gate: command evidence for every success criterion
DONE: Final report with evidence
```

Each gate is a checklist that must be verified with actual commands (not memory). No gate passage = no phase transition.
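The gate mechanic itself is tiny once you write it down. Here's a hypothetical Python sketch (the real version is markdown checklists that Claude verifies with commands, not code):

```python
PHASES = ["planning", "execution", "verification", "done"]

def advance(phase, gate_checks):
    """Transition to the next phase only if every gate check passed.

    gate_checks: list of (description, passed) pairs, where each
    `passed` comes from running a real command, never from memory.
    Returns (new_phase, failed_checks); any failure blocks the move.
    """
    failed = [desc for desc, passed in gate_checks if not passed]
    if failed:
        # No gate passage = no phase transition: redo the failed steps.
        return phase, failed
    return PHASES[PHASES.index(phase) + 1], []

phase, failed = advance("planning", [
    ("plan document committed", True),
    ("tasks registered", False),  # forgot TodoWrite
])
# Still stuck in "planning"; `failed` names exactly what to redo.
```

The design choice worth copying: a failed gate doesn't raise an error or ask a question. It returns the phase unchanged plus the list of steps to redo, so the loop keeps running autonomously.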
Parallel Agent Dispatch
This is the feature that makes Jarvis fast. Instead of doing tasks one by one, the orchestrator delegates to specialist agents simultaneously:
```
Orchestrator (Opus — the "brain"):
├→ nextjs-developer agent   → builds UI components
├→ python-pro agent         → writes backend logic
├→ technical-writer agent   → updates documentation
└→ security-auditor agent   → reviews for vulnerabilities
← collects all results → checks for file conflicts → integrates
```

The orchestrator itself doesn't write code. It plans, delegates, and verifies. Like a project manager who is very good at hiring specialists.
Rule: if two tasks modify different files, they run in parallel. If they share files or have dependencies, they run sequentially. This alone cuts execution time by 3-5x.
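That scheduling rule fits in a dozen lines. Here's a hypothetical Python sketch of the batching logic (in reality the orchestrator applies this rule from its instructions when dispatching agents via the Task tool):

```python
def batch_tasks(tasks):
    """Greedily group tasks into batches that can run in parallel.

    tasks: list of (name, set_of_files) tuples, in plan order.
    Tasks in the same batch touch disjoint file sets, so their
    agents can run simultaneously without write conflicts.
    """
    batches = []
    for name, files in tasks:
        for batch in batches:
            # Join the first batch whose files don't overlap ours.
            if all(files.isdisjoint(f) for _, f in batch):
                batch.append((name, files))
                break
        else:
            # Conflicts with every existing batch: start a new one.
            batches.append([(name, files)])
    return batches

tasks = [
    ("build-ui", {"app/page.tsx"}),
    ("backend",  {"api/server.py"}),
    ("docs",     {"README.md"}),
    ("fix-ui",   {"app/page.tsx"}),  # shares a file with build-ui
]
# First three dispatch together; fix-ui waits for a second batch.
```

Note the sketch only handles file overlap; ordering dependencies between tasks would need an extra pass, which the real plan document handles by sequencing.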
Decision Debate Protocol
When the plan involves a technical choice (which database? which API? which architecture?), Jarvis doesn't just pick one. It runs a debate:
- Define 2-3 options
- Assign a specialist agent to advocate for each option
- Each agent argues its case: strengths, costs, risks, counter-arguments
- The orchestrator weighs all arguments and picks a winner
- Decision and reasoning get recorded in the state file
This sounds over-engineered until you've watched an AI pick the wrong database because it defaulted to what it's seen most often in training data, and you spent three hours undoing the decision.
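To make the debate concrete, here's a toy scoring version in Python. In the real protocol the "scores" are prose arguments the orchestrator weighs; the numeric strengths and risks here are purely illustrative:

```python
def debate(arguments):
    """Pick the option with the best net case and record why.

    arguments: {option: {"strengths": n, "risks": n}}, one entry
    per advocate agent. Returns (winner, state_file_record).
    """
    scored = {opt: a["strengths"] - a["risks"] for opt, a in arguments.items()}
    winner = max(scored, key=scored.get)
    return winner, f"decision: {winner}, reasoning: net scores {scored}"

winner, record = debate({
    "postgres": {"strengths": 8, "risks": 2},  # scaling headroom, but ops cost
    "sqlite":   {"strengths": 6, "risks": 1},  # zero ops, but write limits
})
# The record line is what gets written into the state file.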
Mid-Execution Quality Gates
Every 3 tasks, a background audit agent scores the work on four criteria:
| Criterion | Points | What It Checks |
|---|---|---|
| Goal Alignment | /30 | Are changes actually moving toward the user's goal? |
| TDD Compliance | /20 | Do test commits exist before implementation commits? |
| Code Consistency | /25 | Does naming/style/structure match the rest of the project? |
| Scope Compliance | /25 | Were any files modified that weren't in the plan? |
Score 70+ → continue. Score 50-69 → adjust next batch. Score below 50 → halt and fix.
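The thresholds are the whole mechanism, so they're worth seeing as code. A hypothetical sketch (the real audit agent produces the four sub-scores as markdown; only the verdict logic is shown here):

```python
def audit_verdict(scores):
    """scores: dict with the four audit criteria out of 30/20/25/25.
    Returns the orchestrator's action for the next batch of tasks."""
    total = sum(scores.values())
    if total >= 70:
        return "continue"   # healthy: keep dispatching as planned
    if total >= 50:
        return "adjust"     # drifting: correct course in next batch
    return "halt"           # broken: stop and fix before more work

# Example: decent goal alignment, but TDD skipped and scope drifted.
verdict = audit_verdict({"goal": 20, "tdd": 0, "consistency": 15, "scope": 10})
# total 45 → "halt"
```

Running this every 3 tasks (rather than only at the end) is what keeps a bad pattern from compounding across all 21 tasks.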
Quality Loop (Phase 3)
After all tasks complete, a review-fix-simplify cycle runs:
```
A. Code Review agent → finds Critical/Important/Minor issues
B. Fix agent → repairs issues, commits
C. Simplify agent → reduces complexity without changing behavior
D. Re-verify → back to A

Stop when: no issues found, OR 5 iterations, OR diminishing returns
```

This catches the things that individual task agents miss — inconsistencies between files, duplicated logic, edge cases that only appear when everything is integrated.
🛠️ How to Build Your Own (Practical Guide)
You don't need to build something as complex as Jarvis. But the plugin system is powerful even for simple automations. Here's how to get started.
Step 1: Create the Plugin Folder
```
mkdir -p ~/.claude/plugins/my-plugin/skills/my-skill/
```

Step 2: Write a SKILL.md File
Create ~/.claude/plugins/my-plugin/skills/my-skill/SKILL.md:
```
---
name: my-skill
description: "Trigger phrase that activates this skill"
version: 1.0.0
---

# My Custom Skill

## When Activated
[What should happen when this skill triggers]

## Steps
1. [First thing to do]
2. [Second thing to do]
3. [Third thing to do]

## Rules
- [Constraint 1]
- [Constraint 2]
```

That's it. That's a skill. Claude Code reads this file and follows it.
Step 3: Trigger It
Type /my-skill in Claude Code, or just say the trigger phrase in your description field.
Tips From My Five Rewrites
1. Use "FORBIDDEN" and "ABSOLUTE CONSTRAINT" instead of "should not" and "please avoid."
Claude treats polite language as suggestions. Treat your skill files like legal contracts, not friendly advice.
2. Split long instructions into multiple files.
One massive file gets skimmed. Multiple files, loaded at specific moments, stay sharp.
```
skills/
  my-skill/SKILL.md        ← orchestrator (short, routing only)
  my-skill-step1/SKILL.md  ← detailed instructions for step 1
  my-skill-step2/SKILL.md  ← detailed instructions for step 2
```

Tell the orchestrator: "Before starting Step 1, read my-skill-step1/SKILL.md with the Read tool." This forces a fresh load of the full instructions at the moment they're needed.
3. Add verification gates between phases.
Don't trust "I did it." Require "here's the proof I did it."
```
## Gate: Before Moving to Step 2
Verify with actual commands:
□ [file] exists (Read tool)
□ [command] returns expected output (Bash tool)
> Do NOT proceed until all items checked.
```

4. Write forbidden patterns as tables.
Tables are weirdly effective. Claude processes them as lookup references rather than prose to summarize.
```
| Forbidden | Required Instead |
|---|---|
| "Should work" | Run command, show output |
| "I'll skip this" | Cannot skip, re-execute |
```

5. Use state files for session persistence.
AI conversations can restart (token limits, crashes). Save progress to a file:
```
## State File (.my-state.md)
Update after each major step:
- Current step, completed items, next action
- On restart: Read this file first, resume from last checkpoint
```

📊 Real Results
Jarvis Mode has been used on two major projects so far:
ClearRx — 21 tasks completed in a single autonomous session. UI components, API integration, documentation, testing. The quality loop caught 6 issues that individual agents missed, including a CSS specificity conflict that only appeared when two components rendered on the same page.
BitcoinCycle Clock — Content enrichment across the entire site. Multiple specialist agents working in parallel on different pages. What would have been a full day of manual work finished in one session.
The difference isn't just speed. It's consistency. When Jarvis runs 21 tasks, they all follow the same patterns, the same naming conventions, the same commit message format. Human me would have gotten sloppy by task 12.
🧠 The Meta-Lesson
The biggest thing I learned building Jarvis is that AI follows the structure you give it, not the intent you have.
When I wrote "be autonomous," Claude heard "be fast." When I wrote "plan first," Claude heard "plan if convenient." When I wrote "verify everything," Claude heard "say it's verified."
The fix was never better intentions. It was better structure. Gates that can't be skipped. Evidence that must be produced. Language that leaves no room for reinterpretation.
This applies to all AI work, not just autonomous agents. Every time you prompt an AI and get mediocre results, ask yourself: did I describe what I want, or did I describe the structure that produces what I want?
They're very different things.
🔗 Resources
- Superpowers by Jesse Vincent — the plugin that inspired Jarvis's planning discipline, TDD enforcement, and verification-before-completion pattern. MIT license. Install it.
- Claude Code Plugins: Skills live in ~/.claude/plugins/ as markdown files. No code required.
- Subagents: 131 specialist agents you can install globally. I wrote about that setup here.
- The Jarvis skill files: Four markdown files totaling about 400 lines. The entire autonomous agent system is plain text instructions.
The most complex AI agent I've built contains zero lines of code. Just very, very specific instructions about what to do, what not to do, and how to prove you did it.
🙏 Acknowledgments
Jesse Vincent (@obra) — Superpowers is the reason I understood that AI needs structure, not just freedom. The brainstorming-before-building pattern, the TDD enforcement, the verification gates — I learned all of that by reading your skill files. Jarvis is fundamentally a remix of your ideas, tuned for autonomy. If you use Claude Code for anything serious, go install Superpowers. If it's helped you make money, consider sponsoring Jesse's work.
VoltAgent — The awesome-claude-code-subagents repository gave Jarvis its 131 specialist workers. Without that army of agents, the "parallel dispatch" feature would just be a nice idea with no one to dispatch to.
Anthropic — For building Claude Code with a plugin system open enough that a pharmacist with no CS degree could write an autonomous agent framework in markdown files. That's a design choice worth appreciating.
2026.03.12
Written by
Jay
Licensed Pharmacist · Senior Researcher
Building production-grade AI tools across medicine, finance, and productivity — without a CS degree. Domain expertise first, code second.
About the author →