Running Scenarios

Read this first if you’re wiring Archal into any agent repo. For two related cases, jump straight to:

Use an existing agent repo — playbook for adapting an app/agent repo that already has its own runtime (Electron, web app shell, etc.)
archal run --harness — reference for the --harness flag itself and how it resolves

Add a headless harness

For a repo-local integration, Archal needs a runnable headless entrypoint. Prefer ./.archal/harness.ts for new integrations. It should read ARCHAL_ENGINE_TASK, call the repo’s real agent runtime, and print the final result without booting the full app shell. archal run performs a headless boot preflight on that entrypoint before it provisions hosted twins, so you can go straight to your first run — a boot-time failure fails fast before any remote resources are allocated.
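As a rough illustration of that contract, a harness could look like the sketch below. `runAgent` is a hypothetical placeholder for your repo's real agent entrypoint, and `readTask` and `runHarness` are names made up for this example. The guard at the bottom is a convenience of the sketch, not something Archal is known to require.

```typescript
// .archal/harness.ts: a minimal sketch, not Archal's reference implementation.

// Hypothetical stand-in for your repo's real agent runtime; swap in your own import.
async function runAgent(task: string): Promise<string> {
  return `agent finished: ${task}`;
}

// Pull the task text from the environment variable the harness is expected to read.
function readTask(env: Record<string, string | undefined>): string {
  const task = env.ARCHAL_ENGINE_TASK;
  if (task === undefined || task.trim() === "") {
    throw new Error("ARCHAL_ENGINE_TASK is not set");
  }
  return task;
}

// Run the agent headlessly (no app shell) and return its final result.
async function runHarness(env: Record<string, string | undefined>): Promise<string> {
  return runAgent(readTask(env));
}

// Only start when a task was supplied, so importing this file has no side effects.
if (process.env.ARCHAL_ENGINE_TASK !== undefined) {
  runHarness(process.env)
    .then((result) => console.log(result)) // print the final result to stdout
    .catch((err) => {
      console.error(err);
      process.exitCode = 1; // conventional failure signal; an assumption, not a documented contract
    });
}
```

To try it outside of Archal, set the variable yourself, for example `ARCHAL_ENGINE_TASK="say hi" npx tsx ./.archal/harness.ts`.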

Configure `.archal.json` (optional project-default path)

Create .archal.json if you want project-default agent wiring:

{
  "title": "my-agent-tests",
  "agent": {
    "command": "npx",
    "args": ["tsx", "./.archal/harness.ts"]
  },
  "twins": ["github"],
  "agentModel": "claude-sonnet-4-6"
}

| Field | Required | Description |
| --- | --- | --- |
| `agent` | Yes unless you pass `--harness` or Archal can discover a repo-local harness | Shell command to run your agent. Can be a string or an object with `command`, `args`, and `env`. |
| `title` | No | Display name for this project. |
| `twins` | No | Which twins to start. If omitted, inferred from the scenario. |
| `scenarios` | No | Array of scenario file paths relative to `.archal.json`. |
| `seeds` | No | Per-twin seed names, e.g. `{ "github": "small-project" }`. |
| `agentModel` | No | LLM model for the agent (e.g. `claude-sonnet-4-6`). |
| `evaluatorModel` | No | Evaluator/judge model for `[P]` criteria (e.g. `gemini-2.5-flash`). Inside a scenario’s `## Config` block the short alias `model` is also accepted, but at the `.archal.json` level only `evaluatorModel` is read. |
| `runs` | No | Default number of runs per scenario. Default: `1`. |
| `timeout` | No | Default timeout per run in seconds. Default: `180`. |
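The fields above can be summarized as a TypeScript shape. This is a sketch derived from the field descriptions, not a type exported by Archal: `AgentSpec`, `ArchalConfig`, and `withDefaults` are hypothetical names, and treating `args` and `env` as optional is an assumption based on the example config omitting `env`.

```typescript
// Sketch of the .archal.json shape; illustrative names, not part of any Archal package.
type AgentSpec =
  | string // a plain shell command
  | { command: string; args?: string[]; env?: Record<string, string> }; // optionality assumed

interface ArchalConfig {
  agent?: AgentSpec;              // needed unless --harness or a discovered harness is used
  title?: string;                 // display name
  twins?: string[];               // inferred from the scenario when omitted
  scenarios?: string[];           // paths relative to .archal.json
  seeds?: Record<string, string>; // twin name -> seed name
  agentModel?: string;
  evaluatorModel?: string;
  runs?: number;                  // documented default: 1
  timeout?: number;               // seconds, documented default: 180
}

// Fill in the two documented defaults so callers can rely on concrete numbers.
function withDefaults(config: ArchalConfig): ArchalConfig & { runs: number; timeout: number } {
  return { ...config, runs: config.runs ?? 1, timeout: config.timeout ?? 180 };
}

const example: ArchalConfig = {
  title: "my-agent-tests",
  agent: { command: "npx", args: ["tsx", "./.archal/harness.ts"] },
  twins: ["github"],
  seeds: { github: "small-project" },
  runs: 5,
};

console.log(withDefaults(example));
```

A typed shape like this is mainly useful if you generate or validate `.archal.json` from a script; Archal itself reads the JSON file directly.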

Run a task

The quickest way to test is an inline task with an explicit harness:

archal run --task "Create an issue titled 'hello world'" --harness ./.archal/harness.ts --twin github

If .archal.json has an agent field, Archal can use that instead of --harness. --task only replaces the scenario file; it does not remove the need for a runnable headless harness boundary.

Run a scenario

For repeatable tests, write a scenario file and point archal run at it:

archal run scenarios/close-stale-issues.md --harness ./.archal/harness.ts

Run it multiple times for a satisfaction score:

archal run scenarios/close-stale-issues.md --harness ./.archal/harness.ts --runs 5

How the proxy works

Local harness runs start a TLS proxy by default. Harnesses using SDKs with hardcoded service domains (e.g. googleapis calling oauth2.googleapis.com) reach the twin transparently — the harness does not need to know about twin URLs. Pass --no-proxy to disable it when your harness already wires twin base URLs through env vars and makes no calls to real service domains.

archal run --task "..." --harness ./.archal/harness.ts
  |
  +-- 1. Resolve a runnable harness path
  +-- 2. Start cloud twin session
  +-- 3. Load seed data into twins
  +-- 4. Start local TLS proxy (generates a short-lived CA cert)
  +-- 5. Spawn harness with HTTPS_PROXY + NODE_EXTRA_CA_CERTS set
  |      HTTP calls to twin domains get intercepted:
  |        api.github.com           -> github twin
  |        api.stripe.com           -> stripe twin
  |        gmail.googleapis.com     -> google-workspace twin
  |        oauth2.googleapis.com    -> google-workspace twin
  |        api.anthropic.com        -> passthrough (not a twin)
  +-- 6. Harness completes
  +-- 7. Collect trace and final state
  +-- 8. Evaluate against success criteria -> satisfaction score

Traffic to non-twin domains passes through unchanged.
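The routing decision in step 5 can be modeled in a few lines. This is an illustrative sketch of the behavior described above, not Archal's proxy code; `TWIN_HOSTS` and `routeFor` are hypothetical names, and the host-to-twin pairs are copied from the diagram.

```typescript
// Illustrative model of the proxy's routing decision; not Archal's implementation.
const TWIN_HOSTS = new Map<string, string>([
  ["api.github.com", "github"],
  ["api.stripe.com", "stripe"],
  ["gmail.googleapis.com", "google-workspace"],
  ["oauth2.googleapis.com", "google-workspace"],
]);

type Route = { kind: "twin"; twin: string } | { kind: "passthrough" };

// Decide where a request goes based only on its hostname.
function routeFor(url: string): Route {
  const host = new URL(url).hostname.toLowerCase();
  const twin = TWIN_HOSTS.get(host);
  return twin !== undefined ? { kind: "twin", twin } : { kind: "passthrough" };
}

console.log(routeFor("https://api.github.com/repos/acme/app/issues"));
console.log(routeFor("https://api.anthropic.com/v1/messages"));
```

The point of the model is that the harness keeps calling the real hostnames; only the hostname determines whether a request lands on a twin or goes out unchanged.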

Disabling the proxy

Pass --no-proxy when your harness already talks to twins via configurable REST base URLs (set via ARCHAL_<TWIN>_BASE_URL env vars Archal injects) and never calls the real service domains. Skipping the proxy saves a few hundred milliseconds of boot and avoids installing a short-lived CA cert in the harness process. See Route-mode trust and safety for runtime compatibility and troubleshooting.
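A harness that fits this mode resolves each base URL from the environment instead of hardcoding the service domain. The sketch below shows one way to do that. `twinBaseUrl` is a hypothetical helper, and the exact variable spelling is an assumption: it upper-cases the twin name and replaces `-` with `_` (for example `ARCHAL_GITHUB_BASE_URL`), so check it against the variables Archal actually injects.

```typescript
// Sketch: resolve a twin's REST base URL for use with --no-proxy.
// Assumption: <TWIN> is the twin name upper-cased with "-" replaced by "_".
function twinBaseUrl(
  twin: string,
  env: Record<string, string | undefined>,
  fallback: string,
): string {
  const key = `ARCHAL_${twin.toUpperCase().replace(/-/g, "_")}_BASE_URL`;
  const value = env[key];
  // The fallback applies outside Archal runs, when the variable is absent.
  const chosen = value !== undefined && value.length > 0 ? value : fallback;
  // Strip trailing slashes so callers can append paths consistently.
  return chosen.replace(/\/+$/, "");
}

const githubBase = twinBaseUrl("github", process.env, "https://api.github.com");
console.log(`GitHub API base: ${githubBase}`);
```

If the variable is missing during a `--no-proxy` run, a fallback like this would reach the real service, so you may prefer to throw instead when running under Archal.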

Local run artifacts

Every archal run also saves local artifacts under .archal/cache/:

.archal/cache/last-run.json
.archal/cache/runs/*.json

Use --output json when you need machine-readable stdout. Traces are saved locally whether or not you pass it.
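A small script can pick these files up after a run. The sketch below only assumes the two paths listed above; the JSON schema is not described here, so the parsed value is treated as `unknown`. `loadLastRun` and `listRunFiles` are hypothetical helper names.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Load the most recent run artifact, or undefined if no run has been recorded.
function loadLastRun(projectRoot: string): unknown {
  const file = path.join(projectRoot, ".archal", "cache", "last-run.json");
  if (!fs.existsSync(file)) {
    return undefined;
  }
  return JSON.parse(fs.readFileSync(file, "utf8"));
}

// List saved per-run artifacts (.archal/cache/runs/*.json), sorted by file name.
function listRunFiles(projectRoot: string): string[] {
  const dir = path.join(projectRoot, ".archal", "cache", "runs");
  if (!fs.existsSync(dir)) {
    return [];
  }
  return fs.readdirSync(dir).filter((name) => name.endsWith(".json")).sort();
}

const lastRun = loadLastRun(process.cwd());
if (lastRun !== null && typeof lastRun === "object") {
  // The schema is unknown here, so only the top-level keys are reported.
  console.log("last-run.json top-level keys:", Object.keys(lastRun));
} else {
  console.log("No last-run.json found (or it is not a JSON object).");
}
console.log("Saved runs:", listRunFiles(process.cwd()));
```

Run it from the project root, the directory that contains `.archal/`.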

View results

Results print to the terminal. They also appear at archal.ai/dashboard.

Related

Use an existing agent repo
Writing scenarios
archal run reference
Twin sessions for persistent twins outside the run lifecycle
CI integration
