
Troubleshooting

Diagnose and resolve common AGH daemon, socket, agent, session, and database issues.

Audience: Operators running durable agent work
Focus: Operations guidance shaped for scanability, day-two clarity, and operator context.

Use this guide when an operational command fails or the daemon is not behaving as expected. Each entry has symptoms, diagnosis, and a resolution path.

Daemon reports "already running"

Field | What to check
Symptoms | cli: daemon already running (pid=12345) or daemon: already running with pid 12345.
Diagnosis | The lock file is held by a live process, or daemon.json points at a process that is still alive.
Resolution | Use the existing daemon, or stop it with agh daemon stop. Do not remove daemon.lock while the PID is alive.

Inspect the current discovery files:

export AGH_HOME="${AGH_HOME:-$HOME/.agh}"
agh daemon status
cat "$AGH_HOME/daemon.json"
cat "$AGH_HOME/daemon.lock"

If the recorded PID is not alive, a new daemon start should acquire the lock, remove the stale socket path, and clean up orphan child processes from the old daemon.
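To confirm whether a recorded PID is still alive before touching any discovery file, a small helper like the one below may be useful. It is a minimal sketch, not part of AGH: the function name pid_alive is hypothetical, and it assumes the daemon runs as the same OS user as the shell, because kill -0 also fails for live processes owned by another user.

```shell
# Return 0 if a process with the given PID exists and can be signalled
# by the current user, non-zero otherwise.
# kill -0 delivers no signal; it only performs the existence/permission check.
pid_alive() {
  kill -0 "$1" 2>/dev/null
}

# Hypothetical usage: substitute the PID recorded in daemon.json or daemon.lock.
# if pid_alive 12345; then
#   echo "daemon is alive; leave daemon.lock in place"
# else
#   echo "recorded PID is stale"
# fi
```

A non-zero result only suggests that the PID is stale; the daemon start path described above remains the supported way to reclaim the lock.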

Detached start times out or exits before readiness

Field | What to check
Symptoms | cli: daemon did not become ready before timeout or cli: detached daemon exited before readiness.
Diagnosis | The detached child failed before /api/daemon/status became available over UDS. Common causes are invalid config, a port conflict, a socket path conflict, or database open failure.
Resolution | Read the recent log lines, then run foreground mode to see the startup error directly.

Commands:

export AGH_HOME="${AGH_HOME:-$HOME/.agh}"
tail -n 120 "$AGH_HOME/logs/agh.log"
agh daemon start --foreground

Fix the error reported by foreground mode, then start normally:

agh daemon start
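If you script around a detached start, for example in a service wrapper, a generic retry loop can poll for readiness after the start command returns. The sketch below is an illustration under assumptions: retry_until is a hypothetical helper, the timeout counts one-second sleeps rather than wall-clock time, and the curl probe in the usage comment assumes the status endpoint answers plain HTTP on the default socket path and that your curl build supports --unix-socket.

```shell
# Retry a command once per second until it succeeds or the timeout elapses.
# $1 is the timeout in seconds; the remaining arguments are the command to run.
retry_until() {
  timeout_s=$1
  shift
  elapsed=0
  while ! "$@"; do
    if [ "$elapsed" -ge "$timeout_s" ]; then
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
}

# Hypothetical usage: wait up to 30 seconds for the status endpoint over the socket.
# retry_until 30 curl -sf -o /dev/null \
#   --unix-socket "${AGH_HOME:-$HOME/.agh}/daemon.sock" \
#   http://localhost/api/daemon/status
```

When the loop times out, fall back to the log and foreground steps above, since those show the actual startup error.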

Unix socket cannot be created or opened

Field | What to check
Symptoms | The daemon fails with udsapi: existing path ".../daemon.sock" is not a unix socket, or the CLI cannot connect to the daemon socket.
Diagnosis | The configured socket path is occupied by a regular file, or the CLI user cannot read and write the socket. The live socket is chmodded to 0600.
Resolution | Stop the daemon, move the non-socket file out of the way, and run the daemon and CLI as the same OS user.

Inspect the socket path:

export AGH_HOME="${AGH_HOME:-$HOME/.agh}"
socket="$AGH_HOME/daemon.sock"

agh daemon status
ls -ld "$AGH_HOME" "$(dirname "$socket")"
ls -l "$socket"
file "$socket"

If file "$socket" shows a regular file and the daemon is stopped, move it aside:

mv "$socket" "$socket.stale"
agh daemon start

If the socket lives outside AGH_HOME, confirm the configured [daemon].socket path and parent directory ownership in config.toml.
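When the file utility is not installed, the shell's own test operators can report what occupies the socket path. This is a minimal sketch with a hypothetical helper name, path_kind. It follows symlinks, so a dangling symlink is reported as missing.

```shell
# Print what occupies a path: socket, regular-file, directory, missing, or other.
# The -S, -f, -d, and -e tests follow symlinks.
path_kind() {
  if [ -S "$1" ]; then
    echo socket
  elif [ -f "$1" ]; then
    echo regular-file
  elif [ -d "$1" ]; then
    echo directory
  elif [ ! -e "$1" ]; then
    echo missing
  else
    echo other
  fi
}

# Example: path_kind "${AGH_HOME:-$HOME/.agh}/daemon.sock"
```

Only a result of regular-file with the daemon stopped corresponds to the move-aside step above; a result of socket with a running daemon points to a permission or user mismatch instead.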

HTTP UI or API is unavailable

Field | What to check
Symptoms | The browser cannot load http://localhost:2123, or curl cannot connect to the API.
Diagnosis | The daemon is not running, the HTTP port is different, or another process prevented the daemon from binding the configured port.
Resolution | Check daemon status for the active HTTP host and port. If startup fails from an HTTP bind error, change [http].port or stop the conflicting process.

Commands:

agh daemon status
curl -s http://localhost:2123/api/daemon/status | jq '.daemon'

The default HTTP bind is localhost:2123. For local production use, keep the host on localhost unless you intentionally place AGH behind a protected reverse proxy.

Agent fails to spawn

Field | What to check
Symptoms | Session creation fails with an ACP subprocess or initialize error. Logs mention acp: start subprocess, initialize session, command not found, or permission denied.
Diagnosis | The provider command cannot be parsed or executed, the workspace path is invalid, required provider environment variables are missing, or the upstream ACP runtime failed initialization.
Resolution | Validate the agent definition, provider command, daemon environment, and workspace paths; then restart or recreate the session.

Commands:

export AGH_HOME="${AGH_HOME:-$HOME/.agh}"
agh agent info <agent-name>
tail -n 120 "$AGH_HOME/logs/agh.log"
command -v npx
command -v codex
command -v gemini
agh daemon start --foreground

Agent subprocesses inherit the daemon environment and receive provider credentials from bound credential_slots. If a provider uses an env:NAME secret ref, set that variable in the shell or service manager that starts the daemon. If it uses a vault:providers/<provider>/<slot> ref, save the credential through the settings API or web provider editor and confirm the provider status reports the credential as present.
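For env:NAME secret refs, it can help to verify that each variable is exported in the environment that launches the daemon. The helper below is a sketch, not an AGH feature: require_env and the variable name in the usage comment are hypothetical. It inspects only the current shell, so run it in the same shell or service wrapper that starts the daemon; it cannot see the environment of a daemon that is already running.

```shell
# Succeed only if every named variable is exported and non-empty.
# printenv sees exported variables only, which matches what a child
# process such as the daemon would inherit.
require_env() {
  missing=0
  for name in "$@"; do
    if [ -z "$(printenv "$name")" ]; then
      echo "missing or empty: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Hypothetical usage before starting the daemon:
# require_env EXAMPLE_PROVIDER_API_KEY && agh daemon start
```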

See Spawning for the exact launch and ACP negotiation flow.

Session is stuck after a crash

Field | What to check
Symptoms | A session appears to stay in starting, active, or stopping after a daemon or agent crash.
Diagnosis | The metadata on disk may describe an in-flight state from a previous daemon process.
Resolution | Restart the daemon, then list or inspect the session. AGH repairs stale session metadata during boot and session reads.

Commands:

agh daemon start
agh session list --all
agh session status <session-id>
agh session repair <session-id> --dry-run

The repair rules are:

Stale state | Repaired state
active | stopped with stop reason agent_crashed
stopping | stopped with stop reason agent_crashed
starting | stopped with stop reason error
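If you audit sessions from a script, the table above can be expressed as a lookup. This sketch only restates the documented mapping; the function name repaired_state is hypothetical, and reporting unchanged for any other state is an assumption, since the table lists only three stale states.

```shell
# Map a stale session state to "<repaired state> <stop reason>".
# States not listed in the repair table are reported as "unchanged"
# (an assumption made for this sketch).
repaired_state() {
  case "$1" in
    active|stopping) echo "stopped agent_crashed" ;;
    starting)        echo "stopped error" ;;
    *)               echo "unchanged" ;;
  esac
}

# Example: repaired_state stopping
```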

For sessions already stopped with agent_crashed or error, boot also repairs interrupted transcripts by appending terminal repair events. If a transcript or chat replay still shows a dangling tool call or streaming assistant message after restart, run:

agh session repair <session-id> --dry-run
agh session repair <session-id>

If resume still fails, check that the workspace directory, agent definition, and $AGH_HOME/sessions/<session-id>/events.db still exist.
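The file checks in the previous paragraph can be run together. The sketch below uses a hypothetical helper, check_resume_files, and covers only the workspace directory and events.db; the agent definition is easier to confirm with agh agent info <agent-name>.

```shell
# Check that a session's workspace directory and events.db still exist.
# $1 is the session ID; $2 is the workspace directory recorded for the session.
check_resume_files() {
  session_id=$1
  workspace_dir=$2
  home=${AGH_HOME:-$HOME/.agh}
  status=0
  if [ ! -d "$workspace_dir" ]; then
    echo "missing workspace directory: $workspace_dir" >&2
    status=1
  fi
  if [ ! -f "$home/sessions/$session_id/events.db" ]; then
    echo "missing events.db for session $session_id" >&2
    status=1
  fi
  return "$status"
}

# Example: check_resume_files <session-id> /path/to/workspace
```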

Database is locked or corrupted

Field | What to check
Symptoms | SQLite reports database is locked, database disk image is malformed, or the daemon cannot open agh.db or events.db.
Diagnosis | A live daemon may still hold the database, a filesystem copy may have missed WAL sidecars, or SQLite detected a recoverable corruption marker.
Resolution | Stop the daemon before manual inspection. Restore from a backup if the database is corrupt. Preserve any .corrupt.<timestamp> files for diagnosis.

Commands:

export AGH_HOME="${AGH_HOME:-$HOME/.agh}"
agh daemon stop
sqlite3 "$AGH_HOME/agh.db" "pragma integrity_check;"
sqlite3 "$AGH_HOME/sessions/<session-id>/events.db" "pragma integrity_check;"

For backup and restore details, see Database Operations.
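Because a filesystem copy can miss WAL sidecars, any manual copy taken for diagnosis should include the -wal and -shm files that SQLite keeps next to a database in WAL mode. The helper below is a sketch under assumptions: copy_sqlite_files is a hypothetical name, and the copy is only consistent while the daemon is stopped and no other process has the database open. It is not a substitute for the procedures in Database Operations.

```shell
# Copy a SQLite database and its -wal/-shm sidecars (when present)
# into a destination directory, preserving timestamps and modes.
# Run this only while no process has the database open.
copy_sqlite_files() {
  src=$1
  dest_dir=$2
  mkdir -p "$dest_dir" || return 1
  cp -p "$src" "$dest_dir/" || return 1
  for suffix in -wal -shm; do
    if [ -f "$src$suffix" ]; then
      cp -p "$src$suffix" "$dest_dir/" || return 1
    fi
  done
}

# Example, after agh daemon stop:
# copy_sqlite_files "$AGH_HOME/agh.db" "$HOME/agh-diagnosis"
```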

Permission errors in AGH_HOME

Field | What to check
Symptoms | Startup logs show errors creating the home layout, lock file, log file, socket parent, or database directory.
Diagnosis | The daemon user does not own AGH_HOME, or a service manager starts AGH with a different home path than the CLI.
Resolution | Use one stable AGH_HOME, ensure the daemon user owns it, and run the CLI with the same AGH_HOME when managing that daemon.

Commands:

export AGH_HOME="${AGH_HOME:-$HOME/.agh}"
ls -ld "$AGH_HOME" "$AGH_HOME/logs" "$AGH_HOME/sessions"
agh daemon status

For systemd or launchd installs, define AGH_HOME in the service environment and keep provider API keys in the same environment.
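To check the home layout from the account that runs the daemon, a short script can test each directory listed in the commands above. This is a sketch with a hypothetical helper name, check_home_layout. It tests write and search permission for the current user rather than ownership, so run it as the daemon user.

```shell
# Verify that AGH_HOME, logs, and sessions exist and are writable
# by the user running this script.
# $1 optionally overrides the home path; the default is $AGH_HOME or ~/.agh.
check_home_layout() {
  home=${1:-${AGH_HOME:-$HOME/.agh}}
  status=0
  for dir in "$home" "$home/logs" "$home/sessions"; do
    if [ ! -d "$dir" ]; then
      echo "missing directory: $dir" >&2
      status=1
    elif [ ! -w "$dir" ] || [ ! -x "$dir" ]; then
      echo "not writable by $(id -un): $dir" >&2
      status=1
    fi
  done
  return "$status"
}

# Example: check_home_layout "$AGH_HOME"
```

If the script passes for your login user but the daemon still reports permission errors, the service manager is probably running AGH as a different user or with a different AGH_HOME.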
