Installation
The CLI is available as forge-sim (or agentforge) after installing the package.
Commands
forge-sim init [path]
Scaffold a simulation folder with example scenarios and configuration.
forge-sim run <scenario>
Execute a scenario file and produce artifacts.
| Flag | Description |
|---|---|
| --seed <n> | Override random seed |
| --ticks <n> | Override tick count |
| --out <dir> | Output directory for artifacts |
| --mode <mode> | deterministic (default), exploration, or replay |
| --replay-bundle <path> | Replay bundle path (for --mode replay) |
| --capture-memory | Persist agent memory snapshots |
| --live | Enable live WebSocket event stream |
| --ci | CI mode: no colors, stable naming |
| --verbose | Verbose logging |
| --json | Output results as JSON |
Modes:
- deterministic: no live LLM calls; best for baselines and CI
- exploration: LLM-enabled red-team discovery; produces replay_bundle.json
- replay: deterministic re-run of prior exploration traces
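
To show how the run flags and modes combine, here is a minimal Python sketch that assembles argument lists for an exploration run followed by a replay of it. Only the flags come from the table above. The helper name `build_run_args`, the placeholder scenario path, the output directory names, and the assumption that replay_bundle.json is written into the exploration run's --out directory are all hypothetical.

```python
from typing import List, Optional

VALID_MODES = ("deterministic", "exploration", "replay")
SCENARIO = "path/to/scenario"  # placeholder, not a real file


def build_run_args(
    scenario: str,
    mode: str = "deterministic",
    seed: Optional[int] = None,
    ticks: Optional[int] = None,
    out: Optional[str] = None,
    replay_bundle: Optional[str] = None,
    ci: bool = False,
    json_output: bool = False,
) -> List[str]:
    """Assemble an argv list for `forge-sim run` (hypothetical helper)."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode}")
    if mode == "replay" and replay_bundle is None:
        raise ValueError("--mode replay needs a replay bundle path")

    args = ["forge-sim", "run", scenario, "--mode", mode]
    if seed is not None:
        args += ["--seed", str(seed)]
    if ticks is not None:
        args += ["--ticks", str(ticks)]
    if out is not None:
        args += ["--out", out]
    if mode == "replay":
        # The bundle flag only applies to replay mode.
        args += ["--replay-bundle", replay_bundle]
    if ci:
        args.append("--ci")
    if json_output:
        args.append("--json")
    return args


# Exploration first, then a deterministic replay of the recorded trace.
# Assumption: the bundle lands inside the exploration run's --out directory.
explore = build_run_args(SCENARIO, mode="exploration", out="runs/explore")
replay = build_run_args(
    SCENARIO,
    mode="replay",
    replay_bundle="runs/explore/replay_bundle.json",
    out="runs/replay",
    ci=True,
)
```

Either list can be handed to `subprocess.run(args)` to launch the run.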
forge-sim studio
Launch the Studio dashboard for multi-run analysis.
forge-sim report <runDir>
Generate a Markdown report from run artifacts.
forge-sim dashboard <runDir>
Build a static HTML dashboard from run artifacts.
forge-sim serve <runDir>
Serve a run dashboard over HTTP.
forge-sim compare <runA> <runB>
Diff two runs — compare metrics, actions, and artifact hashes.
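
As a rough picture of what comparing artifact hashes means, the sketch below hashes every file in two run directories and lists the paths that differ or exist in only one of them. It is not how forge-sim compare is implemented; the function names and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def hash_artifacts(run_dir: str) -> Dict[str, str]:
    """Map each file's path (relative to run_dir) to its SHA-256 digest."""
    root = Path(run_dir)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def diff_artifacts(run_a: str, run_b: str) -> List[str]:
    """Return relative paths that are missing from one run or differ in content."""
    a, b = hash_artifacts(run_a), hash_artifacts(run_b)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

An empty result from `diff_artifacts` means the two directories hold byte-identical files.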
forge-sim sweep <scenario>
Multi-seed statistical analysis. Runs the same scenario across a range of seeds and aggregates results.
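
The options accepted by forge-sim sweep are not listed here, so the following Python sketch only illustrates the idea by hand: one `forge-sim run --seed` invocation per seed, then a summary of a metric collected across those runs. The helper names, the output directory layout, and the metric values are hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, List


def sweep_commands(scenario: str, seeds: range, out_root: str) -> List[List[str]]:
    """One `forge-sim run` argv per seed, each with its own output directory."""
    return [
        ["forge-sim", "run", scenario, "--seed", str(seed),
         "--out", f"{out_root}/seed-{seed}", "--json"]
        for seed in seeds
    ]


def aggregate(values: List[float]) -> Dict[str, float]:
    """Summarize one metric gathered across seeds."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "stdev": pstdev(values),  # population standard deviation
    }


commands = sweep_commands("path/to/scenario", range(1, 4), "runs/sweep")
summary = aggregate([1.0, 2.0, 3.0])  # made-up metric values
```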
forge-sim matrix <scenario>
Multi-variant matrix comparison. Runs multiple parameter combinations and produces a comparison matrix.
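
A matrix of parameter combinations is a Cartesian product over the values of each parameter. The Python sketch below expands two axes into every combination and maps each one onto a `forge-sim run` command. Using --seed and --ticks as the axes is an assumption made for illustration (forge-sim matrix may vary other parameters), and `expand_matrix` is a hypothetical helper.

```python
from itertools import product
from typing import Dict, List, Sequence


def expand_matrix(axes: Dict[str, Sequence[int]]) -> List[Dict[str, int]]:
    """Cartesian product of all axis values, one dict per combination."""
    names = list(axes)
    return [dict(zip(names, combo)) for combo in product(*(axes[n] for n in names))]


variants = expand_matrix({"seed": [1, 2], "ticks": [100, 500]})

# One command per variant; the scenario path is a placeholder.
commands = [
    ["forge-sim", "run", "path/to/scenario",
     "--seed", str(v["seed"]), "--ticks", str(v["ticks"])]
    for v in variants
]
```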
forge-sim extract-agent <bundle>
Generate a deterministic agent from a replay bundle. Useful for converting LLM exploration traces into reproducible test agents.
forge-sim doctor
Check that all dependencies (Node.js, Foundry, Anvil) are available and correctly configured.
forge-sim types
Generate TypeScript types from Foundry artifacts.
CI Integration
AgentForge is designed for CI pipelines. Exit codes:
| Code | Meaning |
|---|---|
| 0 | All assertions passed |
| 1 | One or more assertions failed |
| 2 | Infrastructure error |
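
A CI script can branch on these codes. The Python sketch below maps each code to its meaning and wraps a command so the result is printed in readable form. The function names are hypothetical, and retrying only on code 2 is a suggested policy, not documented AgentForge behavior.

```python
import subprocess
from typing import List

EXIT_MEANINGS = {
    0: "all assertions passed",
    1: "one or more assertions failed",
    2: "infrastructure error",
}


def describe_exit(code: int) -> str:
    """Translate an exit code into the meaning listed in the table."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def should_retry(code: int) -> bool:
    """Suggested policy: retry infrastructure errors, never assertion failures."""
    return code == 2


def run_in_ci(args: List[str]) -> int:
    """Run a command, print how its exit code should be read, and return the code."""
    result = subprocess.run(args)
    print(f"{' '.join(args)} -> {describe_exit(result.returncode)}")
    return result.returncode
```

For example, `run_in_ci(["forge-sim", "run", "path/to/scenario", "--ci"])` returns the exit code, so the calling script can pass it on with `sys.exit`.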