Session Lifecycle
How AGH creates, activates, stops, and classifies one durable runtime session.
- Audience: Operators running durable agent work
- Focus: Sessions guidance shaped for scanability, day-two clarity, and operator context.
A session is the runtime object AGH manages. It ties together:
- one ACP-compatible agent subprocess
- one workspace boundary
- one per-session SQLite event store
- one permission policy
- one durable AGH session ID
That AGH session ID stays stable across stop and resume cycles, even when the underlying ACP session ID changes.
State machine
| State | Meaning | Valid live transition |
|---|---|---|
| `starting` | AGH has allocated the session ID, prepared the session directory, opened the event store, and is spawning or resuming the agent subprocess. | `starting -> active` |
| `active` | The agent is ready to accept prompts and emit events. This is the only state that accepts live prompts and approvals. | `active -> stopping` |
| `stopping` | AGH has accepted a stop request and is draining the session toward a terminal state. | `stopping -> stopped` |
| `stopped` | Final metadata is written, the session recorder is closed, and the session can be listed, inspected, or resumed later. | none |
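
The table above can be read as a small transition map. The following is a minimal sketch of that reading, not AGH's implementation; the names `SessionState`, `LIVE_TRANSITIONS`, `can_transition`, and `accepts_prompts` are hypothetical and exist only for illustration.

```python
from enum import Enum


class SessionState(Enum):
    STARTING = "starting"
    ACTIVE = "active"
    STOPPING = "stopping"
    STOPPED = "stopped"


# Valid live transitions, copied from the state table.
LIVE_TRANSITIONS = {
    SessionState.STARTING: {SessionState.ACTIVE},
    SessionState.ACTIVE: {SessionState.STOPPING},
    SessionState.STOPPING: {SessionState.STOPPED},
    SessionState.STOPPED: set(),  # terminal: no live transition
}


def can_transition(current: SessionState, target: SessionState) -> bool:
    """Return True if the table lists current -> target as a live transition."""
    return target in LIVE_TRANSITIONS[current]


def accepts_prompts(state: SessionState) -> bool:
    """Only the active state accepts live prompts and approvals."""
    return state is SessionState.ACTIVE
```

The sketch covers live transitions only. Paths such as crash repair, which moves stale metadata straight to `stopped`, are outside this map.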
Creating a session
Create a new session from the CLI:

    agh session new --agent general --cwd "$PWD" --name code-review

Create the same session over HTTP:

    curl -X POST http://localhost:2123/api/sessions \
      -H "Content-Type: application/json" \
      -d '{
        "agent_name": "general",
        "name": "code-review",
        "workspace_path": "/absolute/path/to/repo"
      }'

The request must include exactly one of:
- `workspace`: a registered workspace name or ID
- `workspace_path`: an absolute filesystem path
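
A client can check the "exactly one of" rule before sending the request. The sketch below is a hypothetical client-side helper (`validate_create_request` is not part of AGH), and it assumes POSIX-style paths when testing for an absolute path.

```python
import posixpath


def validate_create_request(body: dict) -> None:
    """Raise ValueError unless exactly one workspace selector is present."""
    has_workspace = bool(body.get("workspace"))
    has_path = bool(body.get("workspace_path"))
    # Both set or both missing violates "exactly one of".
    if has_workspace == has_path:
        raise ValueError("provide exactly one of workspace or workspace_path")
    # Assumption: POSIX-style paths; the server's own check may differ.
    if has_path and not posixpath.isabs(body["workspace_path"]):
        raise ValueError("workspace_path must be an absolute filesystem path")
```

Calling it with the request body from the curl example passes silently, while a body that sets both fields or neither raises `ValueError`.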
Behind that request, AGH:
- Resolves the workspace and agent definition.
- Creates `~/.agh/sessions/<session-id>/`.
- Opens `events.db` for the session.
- Spawns the ACP subprocess and initializes the JSON-RPC connection.
- Creates or loads the ACP session.
- Transitions the AGH session from `starting` to `active`.
Running an active session
Once a session is active, it accepts prompts through the CLI or the HTTP transport:

    agh session prompt sess-1234 "Explain how the stop path works."

    curl -N -X POST http://localhost:2123/api/sessions/sess-1234/prompt \
      -H "Content-Type: application/json" \
      -d '{"message":"Explain how the stop path works."}'

Four operational details matter here:
- AGH records a `user_message` event before it hands the prompt to ACP, so persisted history keeps the prompt even if prompt startup fails later.
- One session processes one prompt turn at a time. Stop logic waits for in-flight prompt setup to finish before it asks the ACP driver to stop the subprocess.
- Long-running prompts stay healthy through runtime activity supervision. AGH updates `activity.last_activity_at` on real ACP events and metadata-only waiting heartbeats, then emits lower-frequency `runtime_progress` and `runtime_warning` events for clients.
- AGH also maintains a separate metadata-only session health record (`state`, `health`, `attachable`, `eligible_for_wake`). Activity supervision feeds health, but the two are distinct authorities: supervision owns timers and persisted progress events, while health is consumed by `HEARTBEAT.md` wake decisions and the `agh session health|status|inspect` surfaces.
Runtime activity supervision
Activity supervision is designed for prompts that may run for hours. It treats inactivity as the failure mode, not total elapsed time.
For each active prompt turn, AGH tracks:
- turn ID and source
- turn start time
- last activity time, kind, and detail
- current tool and tool call ID when known
- last progress notification time
- idle and elapsed seconds
Short heartbeats keep last_activity_at fresh in session metadata and health output. They do not
enter the persisted event stream. runtime_progress events are persisted only at
session.supervision.progress_notify_interval, and runtime_warning is persisted once when
session.supervision.inactivity_warning_after is crossed.
When session.supervision.inactivity_timeout is crossed, AGH cancels the prompt cooperatively. If
the prompt does not finish within session.supervision.timeout_cancel_grace, the session is stopped
with stop reason timeout.
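
The threshold logic described above can be sketched as a pure decision function. This is an illustrative model only: `SupervisionConfig`, `next_action`, and the returned labels are hypothetical names, the field names merely mirror the `session.supervision.*` keys, and the sketch does not track the "persist the warning once" rule.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SupervisionConfig:
    # All values are in seconds; the numbers used by callers are illustrative.
    inactivity_warning_after: float
    inactivity_timeout: float
    timeout_cancel_grace: float


def next_action(
    cfg: SupervisionConfig,
    idle_seconds: float,
    seconds_since_cancel: Optional[float] = None,
) -> str:
    """Decide what supervision does next, based on idle time, not elapsed time."""
    if seconds_since_cancel is not None:
        # A cooperative cancel was already requested after the inactivity timeout.
        if seconds_since_cancel >= cfg.timeout_cancel_grace:
            return "stop_session_timeout"
        return "await_cancel"
    if idle_seconds >= cfg.inactivity_timeout:
        return "cancel_prompt"
    if idle_seconds >= cfg.inactivity_warning_after:
        return "warn"
    return "healthy"
```

Total elapsed time never appears in the function. A prompt that has been running for hours but was last active 20 seconds ago still evaluates to `healthy`.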
Stopping a session
Stop a running session from the CLI:

    agh session stop sess-1234

Stop it over HTTP:

    curl -X DELETE http://localhost:2123/api/sessions/sess-1234

The stop path is cooperative first and forceful only when needed:
- `active -> stopping`
- wait for any in-flight prompt setup to finish
- send ACP `session/cancel`
- wait for the subprocess to exit
- escalate through the subprocess shutdown path if it does not exit cleanly
- classify the stop reason
- record a terminal `session_stopped` event
- close the recorder and write final metadata
- `stopping -> stopped`
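
The "cooperative first, forceful only when needed" pattern in the middle of that list is a common one for subprocess supervision. The sketch below shows the general shape using Python's standard `subprocess` module. It is not AGH's ACP driver: `stop_subprocess` is a hypothetical helper, and `terminate()` stands in for the cooperative ACP `session/cancel` request.

```python
import subprocess


def stop_subprocess(proc: subprocess.Popen, grace_seconds: float) -> str:
    """Ask the child to exit, then force it if the grace period expires."""
    # Stand-in for the cooperative step (AGH sends ACP session/cancel instead).
    proc.terminate()
    try:
        proc.wait(timeout=grace_seconds)
        return "clean"
    except subprocess.TimeoutExpired:
        # Escalation: the child did not exit within the grace period.
        proc.kill()
        proc.wait()  # reap the process so it does not linger as a zombie
        return "forced"
```

The return value is the kind of evidence a caller could feed into stop-reason classification, which happens only after the process is confirmed to have exited.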
Timeout behavior
AGH has separate timeout concepts:
| Timeout | What it protects |
|---|---|
| `session.limits.timeout` | Optional wall-clock session limit. `0s` disables it. |
| `session.supervision.inactivity_timeout` | Prompt inactivity limit. Long-running prompts remain healthy if activity continues. |
| `session.supervision.timeout_cancel_grace` | Grace period after inactivity timeout cancel before AGH stops the session as `timeout`. |
| ACP driver stop timeout | Subprocess shutdown escalation after cooperative stop. |
| Session lifecycle timeout | Recorder close and final cleanup work during stop. |
The important distinction is that inactivity timeout is not wall-clock timeout. A long prompt can
run beyond inactivity_timeout as long as AGH keeps observing real activity or controlled waiting
heartbeats.
Stop reasons
The session manager currently emits these persisted stop classifications:
| Stop reason | When AGH uses it |
|---|---|
| `completed` | The agent finished normally. |
| `user_canceled` | A normal user stop request. |
| `max_iterations` | A user stop request carried `max_iterations` detail. |
| `loop_detected` | A user stop request carried `loop_detected` detail. |
| `budget_exceeded` | A user stop request carried `budget_exceeded` detail. |
| `timeout` | Runtime supervision detected inactivity, prompt cancel did not complete within the configured grace, and AGH stopped the session. |
| `error` | Start or stop failed, or the process exited unexpectedly without a crash wait error. |
| `agent_crashed` | The subprocess exited with a wait error, or AGH repaired stale active/stopping metadata after a daemon crash. |
| `hook_stopped` | A required lifecycle hook denied continuation. |
| `shutdown` | The daemon shut the session down. |
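
Four of the rows above describe user stop requests that differ only in the detail they carry. A minimal sketch of that mapping follows; how the detail actually travels with the request is not specified here, so the `detail` parameter and the `classify_user_stop` helper are assumptions made for illustration.

```python
from typing import Optional

# Details that the table maps to their own stop reason.
USER_STOP_DETAILS = {"max_iterations", "loop_detected", "budget_exceeded"}


def classify_user_stop(detail: Optional[str]) -> str:
    """Map a user stop request to a stop reason, following the table rows."""
    if detail in USER_STOP_DETAILS:
        return detail
    # No recognized detail: a normal user stop request.
    return "user_canceled"
```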
Inspect the current session state and stop classification at any time:

    agh session status sess-1234

Failure diagnostics
Stop reasons answer "why did the AGH session stop?" Failure diagnostics answer "what kind of
lifecycle failure did AGH classify, and what evidence is available?" When a lifecycle failure occurs,
AGH persists a failure object with:
- `kind`: stable machine-readable failure kind
- `summary`: bounded redacted diagnostic text
- `crash_bundle_path`: path to a redacted crash bundle when AGH captured one
The same failure object is exposed through session status, session list JSON, session SSE terminal events, and observe health.

    agh session status sess-1234 -o json | jq '.session.failure'

Current failure kinds are:
| Failure kind | Meaning |
|---|---|
| `startup_failure` | AGH could not finish preparing, launching, initializing, or resuming the provider. |
| `handshake_failure` | The subprocess launched but ACP initialization did not complete. |
| `load_session_failure` | ACP `session/load` failed while resuming an existing provider session. |
| `protocol_failure` | ACP returned a structured protocol/request error outside a prompt-specific path. |
| `prompt_failure` | ACP prompt submission or prompt streaming failed. |
| `cancellation` | The operation was canceled by user action or context cancellation. |
| `permission_failure` | A lifecycle hook or permission boundary denied continuation. |
| `process_exit` | The provider subprocess exited unexpectedly or returned a wait error. |
| `transport_failure` | The ACP stdio transport closed or failed while AGH expected protocol traffic. |
| `timeout` | Runtime supervision or lifecycle timeout expired. |
| `unknown_failure` | AGH had an error but no more specific kind was available. |
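
Scripts that consume the status JSON can summarize the failure object in a few lines. The sketch below assumes only what this page states: the object lives at `.session.failure` and carries `kind`, `summary`, and `crash_bundle_path`. The sample document, its values, and the `describe_failure` helper are hypothetical.

```python
import json

# Hypothetical sample; real status output contains more fields.
SAMPLE_STATUS = """
{"session": {"failure": {
    "kind": "process_exit",
    "summary": "provider exited unexpectedly",
    "crash_bundle_path": "/tmp/example-bundle.json"
}}}
"""


def describe_failure(status_json: str) -> str:
    """Summarize the failure object from session status JSON in one line."""
    doc = json.loads(status_json)
    failure = (doc.get("session") or {}).get("failure")
    if not failure:
        return "no failure recorded"
    line = f"{failure.get('kind', 'unknown_failure')}: {failure.get('summary', '')}"
    bundle = failure.get("crash_bundle_path")
    if bundle:
        # The bundle path is present only when AGH captured a crash bundle.
        line += f" (bundle: {bundle})"
    return line
```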
Crash bundles are written under ~/.agh/logs/crash-bundles/ by default. They are JSON documents
with schema agh.session_crash_bundle.v1, session identity, provider identity, failure kind,
process metadata, error text, and captured stderr when available. Bundle contents are bounded and
redacted before they are written, and files are created with owner-only permissions.
Crash repair and restart behavior
AGH repairs stale metadata the next time it reads a stopped session after an unclean daemon exit. That repair is what makes status, list, and resume safe after a crash:
- stale `active` becomes `stopped` with `agent_crashed`, `process_exit`, and detail `daemon crashed while session active`
- stale `stopping` becomes `stopped` with `agent_crashed`, `process_exit`, and detail `stop did not complete`
- stale `starting` becomes `stopped` with `error`, `startup_failure`, and detail `start did not complete`
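
The three repair rules above amount to a lookup table. The following sketch models them that way; the metadata field names (`state`, `stop_reason`, `failure_kind`, `detail`) and the `repair_stale_metadata` helper are assumptions for illustration, not the actual `meta.json` schema.

```python
# stale state -> (stop reason, failure kind, detail), copied from the list above
STALE_REPAIRS = {
    "active": ("agent_crashed", "process_exit", "daemon crashed while session active"),
    "stopping": ("agent_crashed", "process_exit", "stop did not complete"),
    "starting": ("error", "startup_failure", "start did not complete"),
}


def repair_stale_metadata(meta: dict) -> dict:
    """Return repaired metadata; sessions that are already stopped pass through."""
    repair = STALE_REPAIRS.get(meta.get("state"))
    if repair is None:
        return meta
    stop_reason, failure_kind, detail = repair
    repaired = dict(meta)  # copy so the caller's dict is not mutated
    repaired.update(
        state="stopped",
        stop_reason=stop_reason,
        failure_kind=failure_kind,
        detail=detail,
    )
    return repaired
```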
During daemon boot, AGH also inspects stopped sessions whose stop reason is agent_crashed or
error. If the final persisted turn was interrupted, AGH appends repair events to terminalize the
transcript: dangling tool calls receive interrupted tool results, then the turn receives a terminal
error event. The repair is append-only; AGH does not truncate, delete, or resequence session events.
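
To make the append-only property concrete, here is a minimal sketch of transcript repair over an in-memory event list. Apart from `user_message`, the event types and fields (`tool_call`, `tool_result`, `turn_completed`, `turn_error`, `tool_call_id`, `status`) are hypothetical stand-ins, since the real event schema is not described on this page.

```python
def repair_events(events: list) -> list:
    """Return the repair events to append; the input list is never modified."""
    open_calls = []   # tool calls without a result, in arrival order
    turn_open = False
    for ev in events:
        kind = ev["type"]
        if kind == "user_message":
            turn_open, open_calls = True, []
        elif kind == "tool_call":
            open_calls.append(ev["tool_call_id"])
        elif kind == "tool_result" and ev["tool_call_id"] in open_calls:
            open_calls.remove(ev["tool_call_id"])
        elif kind in ("turn_completed", "turn_error"):
            turn_open, open_calls = False, []
    if not turn_open:
        return []  # the final turn already reached a terminal event
    # Dangling tool calls get interrupted results first, then the turn is closed.
    repairs = [
        {"type": "tool_result", "tool_call_id": cid, "status": "interrupted"}
        for cid in open_calls
    ]
    repairs.append({"type": "turn_error", "detail": "interrupted"})
    return repairs
```

Because the function only returns new events to append, running it on an already repaired transcript yields an empty list.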
This repair happens before resume validation. The repaired session keeps the same AGH session ID. Operators and agents can inspect or run the same transcript repair explicitly:

    agh session repair <session-id> --dry-run
    agh session repair <session-id>

What gets persisted
Every session owns a directory under ~/.agh/sessions/<session-id>/:
- `meta.json`: durable session metadata such as state, workspace, stop reason, failure diagnostics, and ACP session ID
- `events.db`: persisted event, token usage, and hook-run history
That durable store is what makes resume, event replay, and approval audit possible.
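
Because the event store is a SQLite file, it can be inspected read-only with standard tooling. The sketch below locates a session directory and lists the tables in `events.db` without assuming anything about the schema. The helper names are hypothetical, and opening the store while the daemon is writing to it may not be supported.

```python
import sqlite3
from pathlib import Path
from typing import Optional


def session_dir(session_id: str, root: Optional[str] = None) -> Path:
    """Resolve a session directory; defaults to ~/.agh/sessions/<session-id>/."""
    base = Path(root) if root else Path.home() / ".agh" / "sessions"
    return base / session_id


def list_event_store_tables(session_id: str, root: Optional[str] = None) -> list:
    """List table names in events.db, opening the file read-only."""
    db_path = session_dir(session_id, root) / "events.db"
    # mode=ro guarantees this inspection cannot modify the store.
    uri = db_path.resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [row[0] for row in rows]
```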
Session types
AGH records why a session exists:
| Type | Meaning |
|---|---|
| `user` | Normal interactive work created by a person or client. |
| `dream` | Background memory consolidation work. |
| `system` | Internal AGH-managed work. |
| `coordinator` | Managed autonomy coordinator for workspace-scoped coordinated task runs. |
| `spawned` | Child session created by an agent through safe spawn. |
Dream sessions are special for permissions: they always start with approve-all.
Coordinator sessions are root lineage rows. They are created only after executable coordinated work is enqueued, not when a task is merely created. Spawned sessions record parent, root, depth, role, TTL, and permission metadata so AGH can reap children when their TTL expires or their parent stops.
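
The reaping rule for spawned sessions (TTL expired or parent stopped) can be sketched as a simple filter. The `SpawnedSession` record and `sessions_to_reap` helper below are hypothetical and carry only the fields needed for the rule; the real lineage metadata also includes root, depth, role, and permission data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpawnedSession:
    session_id: str
    parent_id: str
    started_at: float   # seconds since some epoch
    ttl_seconds: float


def sessions_to_reap(children, stopped_parent_ids, now: float) -> list:
    """Return IDs of children whose TTL expired or whose parent has stopped."""
    return [
        child.session_id
        for child in children
        if child.parent_id in stopped_parent_ids
        or now - child.started_at >= child.ttl_seconds
    ]
```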
Next steps
- Use Resume and Replay when you need to continue a stopped session.
- Use Event Streaming when you need the exact stored and streamed session record.
- Use Runtime Autonomy for coordinator, task lease, and safe-spawn behavior.
- Use Permissions when you need to control or approve agent actions.