
Session Lifecycle

How AGH creates, activates, stops, and classifies one durable runtime session.

Audience: Operators running durable agent work
Focus: Sessions guidance shaped for scanability, day-two clarity, and operator context.

A session is the runtime object AGH manages. It ties together:

  • one ACP-compatible agent subprocess
  • one workspace boundary
  • one per-session SQLite event store
  • one permission policy
  • one durable AGH session ID

That AGH session ID stays stable across stop and resume cycles, even when the underlying ACP session ID changes.

State machine


AGH creates or resumes a session into `starting`, activates it once the ACP driver is ready, and finalizes it through `stopping` before it reaches the durable `stopped` state.
| State | Meaning | Valid live transition |
| --- | --- | --- |
| starting | AGH has allocated the session ID, prepared the session directory, opened the event store, and is spawning or resuming the agent subprocess. | starting -> active |
| active | The agent is ready to accept prompts and emit events. This is the only state that accepts live prompts and approvals. | active -> stopping |
| stopping | AGH has accepted a stop request and is draining the session toward a terminal state. | stopping -> stopped |
| stopped | Final metadata is written, the session recorder is closed, and the session can be listed, inspected, or resumed later. | none |
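The live transitions in the table above can be sketched as a small lookup. This is a minimal illustration, not AGH's implementation: the `SessionState` enum and the helper names are hypothetical, and only the state names and transitions come from this page. It covers live transitions only, not crash repair.

```python
from enum import Enum


class SessionState(str, Enum):
    STARTING = "starting"
    ACTIVE = "active"
    STOPPING = "stopping"
    STOPPED = "stopped"


# Valid live transitions, copied from the state table.
VALID_TRANSITIONS = {
    SessionState.STARTING: {SessionState.ACTIVE},
    SessionState.ACTIVE: {SessionState.STOPPING},
    SessionState.STOPPING: {SessionState.STOPPED},
    SessionState.STOPPED: set(),  # terminal: no live transition
}


def can_transition(current: SessionState, target: SessionState) -> bool:
    """Return True when the table lists current -> target as a live transition."""
    return target in VALID_TRANSITIONS[current]


def accepts_prompts(state: SessionState) -> bool:
    """Only the active state accepts live prompts and approvals."""
    return state is SessionState.ACTIVE
```

A client could use a check like `accepts_prompts` to avoid sending a prompt to a session that is still starting or already stopping.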

Creating a session

Create a new session from the CLI:

agh session new --agent general --cwd "$PWD" --name code-review

Create the same session over HTTP:

curl -X POST http://localhost:2123/api/sessions \
  -H "Content-Type: application/json" \
  -d '{
    "agent_name": "general",
    "name": "code-review",
    "workspace_path": "/absolute/path/to/repo"
  }'

The request must include exactly one of:

  • workspace: a registered workspace name or ID
  • workspace_path: an absolute filesystem path

Behind that request, AGH:

  1. Resolves the workspace and agent definition.
  2. Creates ~/.agh/sessions/<session-id>/.
  3. Opens events.db for the session.
  4. Spawns the ACP subprocess and initializes the JSON-RPC connection.
  5. Creates or loads the ACP session.
  6. Transitions the AGH session from starting to active.
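A client can enforce the "exactly one of workspace or workspace_path" rule before it sends the request. The sketch below builds the JSON body for POST /api/sessions using the field names shown in the curl example. The helper `build_create_payload` is hypothetical, and the absolute-path check assumes POSIX-style paths.

```python
import posixpath
from typing import Optional


def build_create_payload(agent_name: str,
                         name: str,
                         workspace: Optional[str] = None,
                         workspace_path: Optional[str] = None) -> dict:
    """Build the create-session body, rejecting ambiguous workspace input."""
    # Exactly one of the two workspace selectors must be present.
    if (workspace is None) == (workspace_path is None):
        raise ValueError("provide exactly one of workspace or workspace_path")
    payload = {"agent_name": agent_name, "name": name}
    if workspace is not None:
        payload["workspace"] = workspace
    else:
        # Assumption: paths are POSIX-style, so "absolute" means a leading "/".
        if not posixpath.isabs(workspace_path):
            raise ValueError("workspace_path must be an absolute path")
        payload["workspace_path"] = workspace_path
    return payload
```

Serializing the returned dict with `json.dumps` yields a body equivalent to the one in the curl example.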

Running an active session

Once a session is active, prompts are accepted through the CLI or HTTP transport:

agh session prompt sess-1234 "Explain how the stop path works."

curl -N -X POST http://localhost:2123/api/sessions/sess-1234/prompt \
  -H "Content-Type: application/json" \
  -d '{"message":"Explain how the stop path works."}'

Four operational details matter here:

  • AGH records a user_message event before it hands the prompt to ACP, so persisted history keeps the prompt even if prompt startup fails later.
  • One session processes one prompt turn at a time. Stop logic waits for in-flight prompt setup to finish before it asks the ACP driver to stop the subprocess.
  • Long-running prompts stay healthy through runtime activity supervision. AGH updates activity.last_activity_at on real ACP events and metadata-only waiting heartbeats, then emits lower-frequency runtime_progress and runtime_warning events for clients.
  • AGH also maintains a separate metadata-only session health record (state, health, attachable, eligible_for_wake). Activity supervision feeds health, but the two are distinct authorities: supervision owns timers and persisted progress events, while health is consumed by HEARTBEAT.md wake decisions and by the agh session health, agh session status, and agh session inspect surfaces.

Runtime activity supervision

Activity supervision is designed for prompts that may run for hours. It treats inactivity as the failure mode, not total elapsed time.

For each active prompt turn, AGH tracks:

  • turn ID and source
  • turn start time
  • last activity time, kind, and detail
  • current tool and tool call ID when known
  • last progress notification time
  • idle and elapsed seconds

Short heartbeats keep last_activity_at fresh in session metadata and health output. They do not enter the persisted event stream. A runtime_progress event is persisted only once per session.supervision.progress_notify_interval, and a runtime_warning event is persisted once when session.supervision.inactivity_warning_after is crossed.

When session.supervision.inactivity_timeout is crossed, AGH cancels the prompt cooperatively. If the prompt does not finish within session.supervision.timeout_cancel_grace, the session is stopped with stop reason timeout.
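The decision flow described above can be sketched as a pure function of idle time. This is an illustration under assumptions: the config keys mirror the session.supervision.* settings named on this page, while `SupervisionConfig`, `supervision_action`, the returned action labels, and the numeric values in the usage are all hypothetical. Total elapsed time is deliberately not an input, because inactivity, not duration, is the failure mode.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SupervisionConfig:
    inactivity_warning_after: float  # seconds idle before runtime_warning
    inactivity_timeout: float        # seconds idle before cooperative cancel
    timeout_cancel_grace: float      # seconds allowed for the cancel to finish


def supervision_action(idle_seconds: float,
                       seconds_since_cancel: Optional[float],
                       cfg: SupervisionConfig) -> str:
    """Pick the next supervision step for one active prompt turn."""
    if seconds_since_cancel is not None:
        # A cancel is already in flight: stop only once the grace has expired.
        if seconds_since_cancel >= cfg.timeout_cancel_grace:
            return "stop_session_timeout"
        return "await_cancel"
    if idle_seconds >= cfg.inactivity_timeout:
        return "cancel_prompt"
    if idle_seconds >= cfg.inactivity_warning_after:
        return "emit_runtime_warning"
    return "healthy"
```

With hypothetical settings of a 60-second warning, 300-second timeout, and 30-second grace, a turn that has run for hours but was last active 5 seconds ago still evaluates to `healthy`.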

Stopping a session

Stop a running session from the CLI:

agh session stop sess-1234

Stop it over HTTP:

curl -X DELETE http://localhost:2123/api/sessions/sess-1234

The stop path is cooperative first and forceful only when needed:

  1. active -> stopping
  2. wait for any in-flight prompt setup to finish
  3. send ACP session/cancel
  4. wait for the subprocess to exit
  5. escalate through the subprocess shutdown path if it does not exit cleanly
  6. classify the stop reason
  7. record a terminal session_stopped event
  8. close the recorder and write final metadata
  9. stopping -> stopped
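The "cooperative first, forceful only when needed" pattern in steps 3 through 5 can be sketched with Python's standard subprocess module. This is a generic illustration, not AGH's stop path: `request_cancel` stands in for the ACP session/cancel call, the terminate-then-kill escalation is an assumption about what a shutdown path might do, and `spawn_demo_child` exists only to make the sketch runnable.

```python
import subprocess
import sys
from typing import Callable


def stop_subprocess(proc: subprocess.Popen,
                    request_cancel: Callable[[], None],
                    exit_timeout: float) -> str:
    """Ask the process to stop, then escalate only if it does not exit."""
    request_cancel()  # cooperative step (stand-in for ACP session/cancel)
    try:
        proc.wait(timeout=exit_timeout)
        return "clean"
    except subprocess.TimeoutExpired:
        pass
    proc.terminate()  # first escalation: polite OS-level termination
    try:
        proc.wait(timeout=exit_timeout)
    except subprocess.TimeoutExpired:
        proc.kill()  # last resort: forceful kill
        proc.wait()
    return "escalated"


def spawn_demo_child() -> subprocess.Popen:
    """Start a child that exits on its own once its stdin is closed."""
    return subprocess.Popen(
        [sys.executable, "-c", "import sys; sys.stdin.read()"],
        stdin=subprocess.PIPE,
    )
```

For the demo child, closing stdin plays the role of the cooperative request: `stop_subprocess(proc, proc.stdin.close, 10.0)` returns without escalating because the child exits once it sees end of input.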

Timeout behavior

AGH has separate timeout concepts:

| Timeout | What it protects |
| --- | --- |
| session.limits.timeout | Optional wall-clock session limit. 0s disables it. |
| session.supervision.inactivity_timeout | Prompt inactivity limit. Long-running prompts remain healthy if activity continues. |
| session.supervision.timeout_cancel_grace | Grace period after inactivity timeout cancel before AGH stops the session as timeout. |
| ACP driver stop timeout | Subprocess shutdown escalation after cooperative stop. |
| Session lifecycle timeout | Recorder close and final cleanup work during stop. |

The important distinction is that inactivity timeout is not wall-clock timeout. A long prompt can run beyond inactivity_timeout as long as AGH keeps observing real activity or controlled waiting heartbeats.

Stop reasons

The session manager currently emits these persisted stop classifications:

| Stop reason | When AGH uses it |
| --- | --- |
| completed | The agent finished normally. |
| user_canceled | A normal user stop request. |
| max_iterations | A user stop request carried max_iterations detail. |
| loop_detected | A user stop request carried loop_detected detail. |
| budget_exceeded | A user stop request carried budget_exceeded detail. |
| timeout | Runtime supervision detected inactivity, prompt cancel did not complete within the configured grace, and AGH stopped the session. |
| error | Start or stop failed, or the process exited unexpectedly without a crash wait error. |
| agent_crashed | The subprocess exited with a wait error, or AGH repaired stale active/stopping metadata after a daemon crash. |
| hook_stopped | A required lifecycle hook denied continuation. |
| shutdown | The daemon shut the session down. |
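Four of these classifications come from a user stop request and differ only in the detail the request carried. A minimal sketch of that mapping follows; the function name and the idea of passing the detail as an optional string are assumptions, and only the reason names come from the table.

```python
from typing import Optional

# Details that promote a user stop request to a more specific stop reason.
DETAIL_STOP_REASONS = {"max_iterations", "loop_detected", "budget_exceeded"}


def classify_user_stop(detail: Optional[str]) -> str:
    """Map a user stop request's detail to a persisted stop reason."""
    if detail in DETAIL_STOP_REASONS:
        return detail
    # No recognized detail: treat it as a normal user stop request.
    return "user_canceled"
```

For example, `classify_user_stop("loop_detected")` returns `"loop_detected"`, while a stop request with no detail classifies as `"user_canceled"`.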

Inspect the current session state and stop classification at any time:

agh session status sess-1234

Failure diagnostics

Stop reasons answer "why did the AGH session stop?" Failure diagnostics answer "what kind of lifecycle failure did AGH classify, and what evidence is available?" When a lifecycle failure occurs, AGH persists a failure object with:

  • kind: stable machine-readable failure kind
  • summary: bounded redacted diagnostic text
  • crash_bundle_path: path to a redacted crash bundle when AGH captured one

The same failure object is exposed through session status, session list JSON, session SSE terminal events, and observe health.

agh session status sess-1234 -o json | jq '.session.failure'

Current failure kinds are:

| Failure kind | Meaning |
| --- | --- |
| startup_failure | AGH could not finish preparing, launching, initializing, or resuming the provider. |
| handshake_failure | The subprocess launched but ACP initialization did not complete. |
| load_session_failure | ACP session/load failed while resuming an existing provider session. |
| protocol_failure | ACP returned a structured protocol/request error outside a prompt-specific path. |
| prompt_failure | ACP prompt submission or prompt streaming failed. |
| cancellation | The operation was canceled by user action or context cancellation. |
| permission_failure | A lifecycle hook or permission boundary denied continuation. |
| process_exit | The provider subprocess exited unexpectedly or returned a wait error. |
| transport_failure | The ACP stdio transport closed or failed while AGH expected protocol traffic. |
| timeout | Runtime supervision or lifecycle timeout expired. |
| unknown_failure | AGH had an error but no more specific kind was available. |
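A script that consumes status JSON might read the failure object as sketched below. The `.session.failure` path and the kind, summary, and crash_bundle_path fields come from this page; the helper name, the `recognized` flag, and the sample document in the usage note are hypothetical.

```python
import json
from typing import Optional

# Failure kinds listed in the table on this page.
KNOWN_FAILURE_KINDS = {
    "startup_failure", "handshake_failure", "load_session_failure",
    "protocol_failure", "prompt_failure", "cancellation",
    "permission_failure", "process_exit", "transport_failure",
    "timeout", "unknown_failure",
}


def extract_failure(status_json: str) -> Optional[dict]:
    """Return the failure object from status output, or None if absent."""
    doc = json.loads(status_json)
    failure = (doc.get("session") or {}).get("failure")
    if not failure:
        return None
    kind = failure.get("kind")
    return {
        "kind": kind,
        # Flag kinds this script does not know, e.g. after an AGH upgrade.
        "recognized": kind in KNOWN_FAILURE_KINDS,
        "summary": failure.get("summary", ""),
        "crash_bundle_path": failure.get("crash_bundle_path"),
    }
```

Fed a document such as `{"session": {"failure": {"kind": "process_exit", "summary": "exited"}}}`, the helper reports a recognized kind and a crash_bundle_path of None, since no bundle was captured.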

Crash bundles are written under ~/.agh/logs/crash-bundles/ by default. They are JSON documents with schema agh.session_crash_bundle.v1, session identity, provider identity, failure kind, process metadata, error text, and captured stderr when available. Bundle contents are bounded and redacted before they are written, and files are created with owner-only permissions.

Crash repair and restart behavior

AGH repairs stale metadata the next time it reads a stopped session after an unclean daemon exit. That repair is what makes status, list, and resume safe after a crash:

  • stale active becomes stopped with agent_crashed, process_exit, and detail daemon crashed while session active
  • stale stopping becomes stopped with agent_crashed, process_exit, and detail stop did not complete
  • stale starting becomes stopped with error, startup_failure, and detail start did not complete

During daemon boot, AGH also inspects stopped sessions whose stop reason is agent_crashed or error. If the final persisted turn was interrupted, AGH appends repair events to terminalize the transcript: dangling tool calls receive interrupted tool results, then the turn receives a terminal error event. The repair is append-only; AGH does not truncate, delete, or resequence session events.
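The append-only property can be illustrated with a small sketch that terminalizes an interrupted turn without touching existing events. The event shapes here are entirely hypothetical (`tool_call`, `tool_result`, `turn_completed`, `turn_error`, and the `interrupted` flag are illustrative names, not AGH's event schema); only the repair order, interrupted tool results first and then a terminal error, comes from this page.

```python
TERMINAL_TYPES = {"turn_completed", "turn_error"}  # hypothetical event types


def repair_transcript(events: list) -> list:
    """Return events plus appended repairs; never edits existing events."""
    if not events or events[-1]["type"] in TERMINAL_TYPES:
        return list(events)  # final turn already terminal: nothing to do
    # The final turn starts after the last terminal event, if any.
    start = 0
    for i, ev in enumerate(events):
        if ev["type"] in TERMINAL_TYPES:
            start = i + 1
    # Collect tool calls in the final turn that never received a result.
    dangling = []
    for ev in events[start:]:
        if ev["type"] == "tool_call":
            dangling.append(ev["tool_call_id"])
        elif ev["type"] == "tool_result" and ev["tool_call_id"] in dangling:
            dangling.remove(ev["tool_call_id"])
    repairs = [
        {"type": "tool_result", "tool_call_id": call_id, "interrupted": True}
        for call_id in dangling
    ]
    repairs.append({"type": "turn_error", "detail": "turn interrupted"})
    return list(events) + repairs
```

Because the repair ends with a terminal event, running it a second time over its own output appends nothing further.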

This repair happens before resume validation. The repaired session keeps the same AGH session ID. Operators and agents can inspect or run the same transcript repair explicitly:

agh session repair <session-id> --dry-run
agh session repair <session-id>

What gets persisted

Every session owns a directory under ~/.agh/sessions/<session-id>/:

  • meta.json: durable session metadata such as state, workspace, stop reason, failure diagnostics, and ACP session ID
  • events.db: persisted event, token usage, and hook-run history

That durable store is what makes resume, event replay, and approval audit possible.

Session types

AGH records why a session exists:

| Type | Meaning |
| --- | --- |
| user | Normal interactive work created by a person or client. |
| dream | Background memory consolidation work. |
| system | Internal AGH-managed work. |
| coordinator | Managed autonomy coordinator for workspace-scoped coordinated task runs. |
| spawned | Child session created by an agent through safe spawn. |

Dream sessions are special for permissions: they always start with approve-all.

Coordinator sessions are root lineage rows. They are created only after executable coordinated work is enqueued, not when a task is merely created. Spawned sessions record parent, root, depth, role, TTL, and permission metadata so AGH can reap children when their TTL expires or their parent stops.
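The reaping rule for spawned sessions reduces to two conditions: the TTL has expired, or the parent has stopped. The sketch below assumes the TTL is measured from a start timestamp, which this page does not specify. The `SpawnedSession` type and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpawnedSession:
    session_id: str
    parent_id: str
    started_at: float   # assumption: TTL is counted from this timestamp
    ttl_seconds: float


def should_reap(child: SpawnedSession, now: float, parent_stopped: bool) -> bool:
    """Reap when the parent has stopped or the child's TTL has expired."""
    ttl_expired = (now - child.started_at) >= child.ttl_seconds
    return parent_stopped or ttl_expired
```

A child with remaining TTL is still reaped as soon as its parent stops, so the two conditions are checked independently.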
