IRIS · Remote Browser Manager

Give your agents real browsers.

Submit a natural-language task, get an answer. Your agents log into real sites with saved credentials. When they get stuck, you jump in via VNC and hand the task back. Nothing leaves your infrastructure.

Which are you?

Path A · AI agent

You're in Claude Code or Codex CLI

Three steps. The first two are run once per machine; the third is the host restart that loads the new tool.

1. Pair this machine.

npm install -g swarmy-iris-cli
swarmy-iris-cli login --url https://swarmy.firsttofly.com

A browser tab opens. Click Approve. The CLI saves a scoped dev_… credential locally; every IRIS tool on this machine reads from that file.

2. Install the marketplace plugin. It registers the swarmy-iris-mcp server and ships an iris skill that teaches the agent the same patterns documented below.

# Claude Code
/plugin marketplace add first-to-fly/claude-marketplace
/plugin install first-to-fly@first-to-fly-plugins

# Codex CLI
codex plugin marketplace add first-to-fly/claude-marketplace
codex plugin add first-to-fly@first-to-fly-plugins

3. Restart the host. Your agent now has an iris_run tool. Ask it to do a browser task.

You can stop here. The rest of this page covers the same workflow from your own terminal. Useful as a reference if you want to understand what your agent is doing.

Path B · Your terminal

You're driving IRIS yourself

Install swarmy-iris-cli, pair the device once, then run tasks from any shell. Ad-hoc work, scripts, cron jobs, CI. Keep reading.


Setup

Two commands, about thirty seconds.

npm install -g swarmy-iris-cli

swarmy-iris-cli login --url https://swarmy.firsttofly.com
# A browser tab opens. Click Approve. The CLI writes credentials to
# ~/.config/swarmy-iris-cli/credentials (mode 0600).

Every IRIS tool on this machine reads from that file. No token paste anywhere. Revoke the device any time at Settings → Devices.
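
If a script or cron job depends on the paired credential, it can help to check the file before launching a run. A minimal Python sketch, assuming the path and mode quoted above; the function name and the three status strings are illustrative and not part of the CLI:

```python
import stat
from pathlib import Path

# Path quoted in the setup step above.
CREDENTIALS_PATH = Path.home() / ".config" / "swarmy-iris-cli" / "credentials"


def credentials_status(path: Path = CREDENTIALS_PATH) -> str:
    """Return 'missing', 'too_open', or 'ok' (hypothetical labels)."""
    if not path.is_file():
        return "missing"
    mode = stat.S_IMODE(path.stat().st_mode)
    # Any group or other permission bit means the file is more open than 0600.
    if mode & 0o077:
        return "too_open"
    return "ok"
```

Anything other than ok is a cue to re-run swarmy-iris-cli login before the job starts.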

Your first task

swarmy-iris-cli run \
  "Use swarmy-chrome-agent to navigate to https://example.com \
   and tell me what the page is about in one sentence." \
  --profile claude-default-latest

Output streams as the in-container agent works; the final answer prints last. claude-default-latest is a shared profile suitable for public-site tasks. If your target needs a login, jump to Saved profiles below.
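
For scripts, the same call can be wrapped from Python. This is a sketch under two assumptions that the page does not confirm: the CLI exits non-zero when a task cannot proceed, and the final answer fits on the last non-empty line of stdout. The helper names are illustrative.

```python
import subprocess
from typing import List


def build_run_command(instruction: str,
                      profile: str = "claude-default-latest") -> List[str]:
    """Assemble the argv for one swarmy-iris-cli run call."""
    return ["swarmy-iris-cli", "run", instruction, "--profile", profile]


def final_answer(output: str) -> str:
    """Return the last non-empty line of the streamed output.

    Assumption: the answer is a single line. A multi-line answer would
    need a delimiter agreed on in the instruction itself.
    """
    lines = [line for line in output.splitlines() if line.strip()]
    return lines[-1] if lines else ""


def run_task(instruction: str,
             profile: str = "claude-default-latest") -> str:
    """Run the task to completion and return the final line of stdout."""
    result = subprocess.run(
        build_run_command(instruction, profile),
        capture_output=True,
        text=True,
        check=True,  # assumption: failures surface as a non-zero exit code
    )
    return final_answer(result.stdout)
```

Passing the instruction as one argv element avoids shell quoting problems when the task text contains quotes or newlines.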

Other commands: whoami, profile list, profile snapshot, token create, logout. Each subcommand takes --help.

Saved profiles

A profile is a tar.zst archive of the Chrome user-data directory: cookies, MFA-trusted device tokens, logged-in sessions, the Claude-in-Chrome extension's own config. Reuse a profile across runs and the agent skips the login every time.

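Two kinds: shared profiles, which anyone on the team can use (claude-default-latest is one), and private profiles, which only you can see unless you share them.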

Creating a private profile is a four-step flow because you have to do the login yourself the first time:

  1. Open the manager UI. Create a container with a blank profile.
  2. VNC into it (button on the container row). Log in. Finish MFA. Dismiss "remember this device" if it appears.
  3. Snapshot: swarmy-iris-cli profile snapshot --container <id> --name my-site. Add --shared to let teammates use it.
  4. Every subsequent run against that profile reuses the saved cookies until they actually expire.
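
Steps 3 and 4 can be scripted once the manual login is done. A small Python sketch that only assembles the argv lists; it assumes the name given to --name is the value later passed to --profile, and the container id in the usage note is a made-up placeholder.

```python
from typing import List


def build_snapshot_command(container_id: str, name: str,
                           shared: bool = False) -> List[str]:
    """Step 3: snapshot the logged-in container into a named profile."""
    command = ["swarmy-iris-cli", "profile", "snapshot",
               "--container", container_id, "--name", name]
    if shared:
        command.append("--shared")  # lets teammates use the profile
    return command


def build_run_command(instruction: str, profile: str) -> List[str]:
    """Step 4: run a task against the saved profile."""
    return ["swarmy-iris-cli", "run", instruction, "--profile", profile]
```

For example, build_snapshot_command("c-1234", "my-site") followed by build_run_command("...", "my-site") gives the two commands to hand to subprocess.run.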

Captures: screenshots and video

Four flags. The result includes signed URLs you can embed or share.

swarmy-iris-cli run "Use swarmy-chrome-agent to navigate to https://example.com \
  and summarize the page." \
  --profile claude-default-latest \
  --keyframes \
  --final-screenshot \
  --final-video \
  --final-video-max-seconds 60

Returned URLs look like /api/captures/<run_id>/<filename>?token=…. They work directly as <img src=…> or <video src=…> sources.
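
Because the returned URLs are paths, they need a host before they can be embedded outside the manager UI. A Python sketch that joins a capture path to a base URL and picks the tag by file extension. The base URL, run id, file names, and token in the usage note are placeholders, and the assumption that videos end in .mp4 comes from the MP4 mention on this page.

```python
from html import escape
from posixpath import splitext
from urllib.parse import urljoin, urlsplit


def embed_tag(base_url: str, capture_url: str) -> str:
    """Return an <img> or <video> tag for one signed capture URL."""
    full_url = urljoin(base_url, capture_url)
    # Look at the path only, so the ?token=... query does not hide the extension.
    extension = splitext(urlsplit(full_url).path)[1].lower()
    src = escape(full_url, quote=True)
    if extension == ".mp4":
        return f'<video src="{src}" controls></video>'
    return f'<img src="{src}">'
```

For example, embed_tag("https://swarmy.firsttofly.com", "/api/captures/run123/final.png?token=abc") yields an img tag pointing at the absolute URL.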

Capture scope: the in-browser main viewport. Everything outside it (the Claude sidepanel, the browser chrome, and the agent's status overlay with its orange glow, "Claude is active" pill, and stop button) is excluded at capture time by Chrome's CDP capture API, not cropped out afterward. The MP4 starts only once the agent leaves the claude.ai/* setup pages, so cold-start prep doesn't end up in the recording. Artifacts live for 7 days, with a 200 MB per-run frame cap.

When the agent gets stuck

Some sites need you to step in: a captcha, a fresh MFA code, a one-time consent screen, a "verify it's you" page. When the in-browser agent hits one, IRIS pauses and returns a VNC URL plus a container_id.

Open the URL in your browser, solve whatever needs solving, and re-run the task with the same container_id. The agent picks up from where it stopped; the Chrome session was alive the whole time.

Tasks don't fail; they pause for you. A blocked event is the agent handing off the part only you can do, not a sign that anything broke.

The blocked event names the kind of wall: captcha_required, login_required, mfa_required, consent_screen, identity_verification. Three less common values cover failed auto-relogin (auth_tab_missing, authorize_button_missing, still_logged_out). Either way you get the VNC URL and the page the browser was on when it stopped.
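
A wrapper script can turn the kind into a one-line prompt for whoever opens the VNC session. A Python sketch using only the kind values listed above; how the kind is delivered (field name, event shape) is not specified here, so the function takes the bare string, and the category labels and hint texts are illustrative.

```python
from typing import Tuple

# Walls that need a human step, with a short hint for each.
MANUAL_WALLS = {
    "captcha_required": "Solve the captcha.",
    "login_required": "Log in to the site.",
    "mfa_required": "Enter a fresh MFA code.",
    "consent_screen": "Clear the one-time consent screen.",
    "identity_verification": "Complete the \"verify it's you\" page.",
}

# Less common values that signal a failed auto-relogin.
RELOGIN_FAILURES = {
    "auth_tab_missing",
    "authorize_button_missing",
    "still_logged_out",
}


def describe_block(kind: str) -> Tuple[str, str]:
    """Map a blocked-event kind to (category, hint for the human)."""
    if kind in MANUAL_WALLS:
        return "manual_step", MANUAL_WALLS[kind]
    if kind in RELOGIN_FAILURES:
        return "relogin_failed", "Auto-relogin failed; log in by hand over VNC."
    return "unknown", "Open the VNC URL and inspect the page."
```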

Test against your local app

The agent's Chrome can reach a service on your machine through a manager-mediated tunnel. The agent sees http://localhost:<port> the same way your own browser does, so existing CORS rules keep working.

swarmy-iris-cli test \
  --expose 3000 \
  --expose 8000 \
  --profile claude-default-latest \
  --then "Visit http://localhost:3000, log in with [email protected] / hunter2, \
          then verify the dashboard renders without console errors."

The tunnel lives for the duration of the run. Up to 8 --expose flags per session; one tunnel per container. To map ports across the boundary, use --expose WORKER:AGENT. For example, --expose 13000:3000 binds the worker-side listener at 13000 while the in-container agent still sees http://localhost:3000.
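
When the test command is generated by a script, validating the --expose values first gives a clearer failure than a rejected run. A Python sketch, assuming a bare port means the same number on both sides of the tunnel; the helper names are illustrative.

```python
from typing import List, Tuple

MAX_EXPOSE_FLAGS = 8  # per-session limit quoted above


def parse_expose(spec: str) -> Tuple[int, int]:
    """Return (worker_port, agent_port) for one --expose value.

    Accepts "3000" or "WORKER:AGENT" such as "13000:3000".
    Assumption: a bare port maps to the same port on both sides.
    """
    worker_text, separator, agent_text = spec.partition(":")
    worker_port = int(worker_text)
    agent_port = int(agent_text) if separator else worker_port
    for port in (worker_port, agent_port):
        if not 1 <= port <= 65535:
            raise ValueError(f"port out of range: {port}")
    return worker_port, agent_port


def build_test_command(exposes: List[str], profile: str, then: str) -> List[str]:
    """Assemble the argv for swarmy-iris-cli test after validating ports."""
    if len(exposes) > MAX_EXPOSE_FLAGS:
        raise ValueError(f"at most {MAX_EXPOSE_FLAGS} --expose flags per session")
    command = ["swarmy-iris-cli", "test"]
    for spec in exposes:
        parse_expose(spec)  # raises ValueError on a malformed value
        command += ["--expose", spec]
    command += ["--profile", profile, "--then", then]
    return command
```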

Trust model. The CLI is the security boundary. Only ports you explicitly --expose can be dialed from inside the container. Manager and worker each re-verify the allowlist. Your local services are not reachable from any other container or any other user.

Writing good instructions

The string you pass to swarmy-iris-cli run "…" is the only thing the in-container agent sees. It has no context from your shell history or your environment. Five rules cover most of what makes an instruction work:

Three output format templates

Pick the one that matches what you'll do with the answer.

Anti-pattern: "Tell me what's new on Twitter." No URL, no scroll, no format. The agent will improvise.

Fixed: "Use swarmy-chrome-agent to navigate to https://x.com/home. Scroll once. Extract the top 3 tweets visible in the feed. Print them as a JSON array on the LAST line with keys: author, text, posted_at."
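
Asking for JSON on the last line pays off on the consuming side, because the caller can ignore everything the agent printed along the way. A Python sketch of that consumer, assuming the agent followed the instruction exactly; the key set mirrors the fixed instruction above and the function name is illustrative.

```python
import json
from typing import Any, Dict, List

REQUIRED_KEYS = {"author", "text", "posted_at"}


def parse_last_line_json(output: str) -> List[Dict[str, Any]]:
    """Parse the JSON array on the last non-empty line of the output."""
    lines = [line for line in output.splitlines() if line.strip()]
    if not lines:
        raise ValueError("no output to parse")
    items = json.loads(lines[-1])
    if not isinstance(items, list):
        raise ValueError("expected a JSON array on the last line")
    for item in items:
        if not isinstance(item, dict):
            raise ValueError("expected every array element to be an object")
        missing = REQUIRED_KEYS - set(item)
        if missing:
            raise ValueError(f"item is missing keys: {sorted(missing)}")
    return items
```

A ValueError here usually means the instruction left the format open enough for the agent to improvise.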

What the in-container agent can do

The in-container agent has a swarmy-chrome-agent MCP with the tools below. You never invoke them yourself. You describe what you want in natural language and the agent picks the right one.

Tool | What it does
navigate(url) | Open a URL in Chrome.
plan(instruction) | Send a task to Claude-in-Chrome and run it end-to-end.
followup(question) | Continue the conversation.
revise(correction) | Reject a visible plan and ask for a revision.
execute() | Approve a visible permission prompt. Rare in act-without-asking mode.
status() | Read-only check of sidepanel state.
clear_chat() | Clear the conversation.

Running many tasks at once

Fire multiple swarmy-iris-cli run calls in parallel from your shell. Each gets its own container.

The per-user concurrent quota is 6 containers by default; hit it and the next request returns user_quota_exceeded (admins bypass the quota). Each worker has its own ceiling, and IRIS picks the least-loaded one automatically. Cold container startup takes 5–15 seconds, so for two small tasks, running them sequentially can finish ahead of running them in parallel. Parallel wins clearly once you have more than five tasks.
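
From a script, a bounded thread pool keeps the fan-out under the quota so later tasks queue locally instead of being rejected. A Python sketch, assuming the default quota of 6; the runner is injectable so the pool logic can be exercised without the CLI, and the helper names are illustrative.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

DEFAULT_QUOTA = 6  # per-user container limit quoted above


def run_one(instruction: str) -> str:
    """Run one task with the shared profile and return its stdout."""
    result = subprocess.run(
        ["swarmy-iris-cli", "run", instruction,
         "--profile", "claude-default-latest"],
        capture_output=True,
        text=True,
    )
    return result.stdout


def run_many(instructions: List[str],
             runner: Callable[[str], str] = run_one,
             max_parallel: int = DEFAULT_QUOTA) -> List[str]:
    """Run tasks concurrently, never more than max_parallel at a time.

    Results come back in the same order as the instructions.
    """
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(runner, instructions))
```

Threads are enough here because each one only waits on a subprocess.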

Error codes

Stable codes returned in the error body when a task can't proceed:

Code | Meaning | What to do
profile_not_found | Profile doesn't exist or isn't visible to you | Check the name. claude-default-latest is shared.
profile_not_ready | Profile is pending or errored | Wait, or pick another.
worker_offline | No worker available | Retry shortly.
worker_at_capacity | All workers full | Retry shortly.
user_quota_exceeded | Hit the per-user container limit | Free a container or ask an admin to raise the quota.
container_not_found | Unknown container_id | Start fresh; don't try to resume.
token_revoked | Auth credential is invalid | Re-run swarmy-iris-cli login.
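
Only two of these codes call for a plain retry, which makes a small retry loop easy to get right. A Python sketch with exponential backoff; the attempt callable (returning an error code, or None on success), the delays, and the attempt limit are all assumptions chosen for illustration.

```python
import time
from typing import Callable, Optional

# Codes whose guidance is "Retry shortly."
RETRYABLE = {"worker_offline", "worker_at_capacity"}


def run_with_retry(attempt: Callable[[], Optional[str]],
                   max_attempts: int = 5,
                   base_delay: float = 2.0,
                   sleep: Callable[[float], None] = time.sleep) -> Optional[str]:
    """Call attempt() until it succeeds or returns a non-retryable code.

    attempt() returns None on success or an error code string.
    The return value is the last code seen (None means success).
    """
    code: Optional[str] = None
    for n in range(max_attempts):
        code = attempt()
        if code is None or code not in RETRYABLE:
            return code  # success, or an error that needs a human fix
        if n < max_attempts - 1:
            sleep(base_delay * 2 ** n)  # 2s, 4s, 8s, ...
    return code
```

Codes such as profile_not_found or token_revoked come straight back to the caller, since retrying cannot fix them.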

How it works

IRIS is a fleet of self-hosted Docker containers running real headful Chrome. A manager coordinates a set of workers (Mac minis, Linux boxes, anything that runs Docker), each of which holds some number of containers. You submit a task; the manager provisions a fresh container or claims a warm one from the pool. Chrome boots with the chosen profile, an in-container agent reads your instruction and runs it, and stdout streams back to you in real time. Profiles are tar.zst archives the manager keeps on disk: snapshot one, point future runs at it, share it with the team. When the agent needs you to step in, the same container exposes VNC.
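
To make the placement step concrete, here is a toy Python model of choosing a worker. It is not IRIS's scheduler: the Worker fields, the definition of load as running containers divided by capacity, and the mapping to worker_offline and worker_at_capacity are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Worker:
    name: str
    capacity: int   # container ceiling for this worker
    running: int    # containers currently held
    online: bool = True


def pick_worker(workers: List[Worker]) -> Tuple[Optional[Worker], Optional[str]]:
    """Return (worker, None) on success or (None, error_code) on failure."""
    online = [w for w in workers if w.online]
    if not online:
        return None, "worker_offline"
    with_room = [w for w in online if w.running < w.capacity]
    if not with_room:
        return None, "worker_at_capacity"
    # Hypothetical load metric: fraction of the worker's ceiling in use.
    return min(with_room, key=lambda w: w.running / w.capacity), None
```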

FAQ

How is this different from Browserbase or Browserless?

Those are managed services: you pay per session-minute, and your data flows through their infrastructure. IRIS is self-hosted, so cost scales with our hardware, and cookies, screenshots, profiles, and browser caches all stay inside our network. IRIS also ships live VNC takeover for stuck tasks; those services only offer after-the-fact session replay.

Who else can see my containers and profiles?

Private by default. You see your own containers and your own profiles plus shared ones. Admins see everything. Profiles can be explicitly shared. claude-default-latest is an example.

What happens if I disconnect mid-task?

If you haven't hit a blocked event yet, the task cancels and the container tears down. If you disconnect after a blocked event, the container stays alive so you can still come back, finish the manual step via VNC, and resume.

Is there a cost limit?

Six concurrent containers per user by default. No per-minute billing. Containers run as long as the task needs. Admins bypass the quota.

Can I call the REST/SSE API directly without the CLI?

Yes. POST /api/agent/run is model-agnostic and works from any language that supports streaming HTTP. Use the CLI for everyday work though. It handles SSE framing, blocked-event resume, capture URL surfacing, and dev-token auth. The raw API is for integrations that can't shell out. See the Swagger UI for the full spec.
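
If you do call the API directly, the SSE framing the CLI normally handles is the standard Server-Sent Events wire format. A Python sketch of a frame parser fed one decoded line at a time. It assumes nothing about IRIS beyond standard SSE; the request body, auth header, event names, and payloads are not covered here and should be taken from the Swagger UI.

```python
from typing import Iterable, Iterator, List, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs from an SSE line stream.

    A blank line ends one event. Lines starting with ":" are comments
    (often keep-alives). Multiple data lines are joined with newlines.
    Fields other than event and data (id, retry) are ignored.
    """
    event_name = "message"  # default event type in the SSE format
    data_lines: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data_lines:
                yield event_name, "\n".join(data_lines)
            event_name, data_lines = "message", []
            continue
        if line.startswith(":"):
            continue
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # one leading space after the colon is dropped
        if field == "event":
            event_name = value
        elif field == "data":
            data_lines.append(value)
```

Feed it the decoded lines of the streaming response from whichever HTTP client you use.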

Can I use a model other than Claude?

The outer surface (whatever runs swarmy-iris-cli or calls the API) is model-agnostic. The in-container agent that drives Chrome is Claude, via the Claude-in-Chrome extension. That's the load-bearing part of the stack.

Deeper reference

Doc | For
Live Swagger UI | Every event, every error code, try-it-out forms.
OpenAPI 3.0.3 YAML | Importable machine-readable spec.
swarmy-iris-cli README | Every CLI subcommand, flags, exit codes.
swarmy-iris-mcp README | MCP package details, headless setup.
FirstToFly marketplace plugin | Plugin source (skill + MCP registration).
Swarmy engine source | Manager, worker, image, issues.