Borealis-Github-Replica/Docs/agent-runtime.md

# Agent Runtime
[Back to Docs Index](index.md) | [Index (HTML)](index.html)

## Purpose
Describe the Borealis agent runtime, its roles, service modes, and how it communicates with the Engine.

## Runtime Summary
- Main entry: `Data/Agent/agent.py` (Python agent service).
- Service modes: SYSTEM and CURRENTUSER (controlled by `--system-service` or environment).
- Role system: `Data/Agent/role_manager.py` auto-loads `Data/Agent/Roles/role_*.py`.
- Networking: REST to Engine APIs + Socket.IO for realtime job dispatch and VPN orchestration.
- Security: Ed25519 identity keys, pinned TLS, signed script payloads, encrypted token storage.

## Role Catalog (Current)
- `role_DeviceAudit.py` (ROLE_NAME: `device_audit`) - inventory and audit data capture.
- `role_Macro.py` (ROLE_NAME: `macro`) - macro automation.
- `role_PlaybookExec_SYSTEM.py` (ROLE_NAME: `playbook_exec_system`) - Ansible playbook runner (unfinished).
- `role_RDP.py` (ROLE_NAME: `RDP`) - RDP readiness hooks.
- `role_RemotePowershell.py` (ROLE_NAME: `RemotePowershell`) - TCP PowerShell server over WireGuard.
- `role_Screenshot.py` (ROLE_NAME: `screenshot`) - screenshot capture.
- `role_ScriptExec_CURRENTUSER.py` (ROLE_NAME: `script_exec_currentuser`) - interactive PowerShell execution.
- `role_ScriptExec_SYSTEM.py` (ROLE_NAME: `script_exec_system`) - SYSTEM PowerShell execution.
- `role_WireGuardTunnel.py` (ROLE_NAME: `WireGuardTunnel`) - WireGuard client lifecycle.

## Agent Settings and Storage
- Settings root: `Agent/Borealis/Settings/` (runtime).
- Server URL: `Agent/Borealis/Settings/server_url.txt`.
- GUID and token storage: `Agent/Borealis/Settings/Agent_GUID.txt`, `access.jwt`, `refresh.token`.

## API Endpoints (Engine-facing)
- `POST /api/agent/enroll/request` (No Authentication) - start enrollment.
- `POST /api/agent/enroll/poll` (No Authentication) - finalize enrollment after approval.
- `POST /api/agent/token/refresh` (Refresh Token) - mint a new access token.
- `POST /api/agent/heartbeat` (Device Authenticated) - heartbeat + metrics.
- `POST /api/agent/details` (Device Authenticated) - hardware/inventory payloads.
- `POST /api/agent/script/request` (Device Authenticated) - request work or receive idle signal.

## Related Documentation
- [Security and Trust](security-and-trust.md)
- [Device Management](device-management.md)
- [VPN and Remote Access](vpn-and-remote-access.md)

## Codex Agent (Detailed)
### Source vs runtime
- Edit only in `Data/Agent/`.
- Runtime copy lives in `Agent/` and is regenerated by `Borealis.ps1`.

### Service modes and context
- SYSTEM mode is used for elevated tasks (scheduled tasks, VPN, system scripts).
- CURRENTUSER mode handles interactive tasks and UI-scoped execution.
- The agent includes `X-Borealis-Agent-Context` in headers to label context.
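
A minimal sketch of how the context header could be attached to a device-authenticated call such as the heartbeat. The function name, the `Bearer` authorization scheme, the header value `"SYSTEM"`, and the payload fields are assumptions for illustration; the real `AgentHttpClient` may differ.

```python
import json
import urllib.request


def build_heartbeat_request(server_url: str, access_token: str,
                            context: str, metrics: dict) -> urllib.request.Request:
    """Build (but do not send) a POST to /api/agent/heartbeat.

    Hypothetical helper: the payload shape and the Bearer scheme are assumed.
    """
    body = json.dumps(metrics).encode("utf-8")
    return urllib.request.Request(
        url=server_url.rstrip("/") + "/api/agent/heartbeat",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",  # assumed scheme
            "X-Borealis-Agent-Context": context,        # labels the service mode
        },
    )


# Example: a SYSTEM-mode agent preparing a heartbeat (values are placeholders).
request = build_heartbeat_request(
    "https://engine.example", "access-token-placeholder", "SYSTEM", {"cpu": 12.5}
)
```

Building the request separately from sending it keeps the header logic easy to inspect when troubleshooting context mix-ups.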

### Role discovery and extension
- Roles are discovered dynamically from `Data/Agent/Roles/`.
- Each role must define:
  - `ROLE_NAME` (string)
  - `ROLE_CONTEXTS` (list: `['system']`, `['interactive']`, or both)
  - `Role` class with optional `register_events`, `on_config`, and `stop_all`.
- To add a role:
  1) Create `Data/Agent/Roles/role_<Name>.py`.
  2) Export `ROLE_NAME`, `ROLE_CONTEXTS`, and `Role`.
  3) Re-stage the agent runtime (`Borealis.ps1 -Agent`).
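
The contract above can be sketched as a skeleton role file, for example `Data/Agent/Roles/role_Example.py`. The role name, the constructor argument, and the hook signatures are assumptions; check an existing role such as `role_Screenshot.py` for the signatures the loader actually expects.

```python
# Hypothetical role module: names and signatures are illustrative only.
ROLE_NAME = "example_echo"
ROLE_CONTEXTS = ["interactive"]  # or ["system"], or both


class Role:
    def __init__(self, ctx=None):
        # `ctx` stands in for whatever shared state the loader passes (assumed).
        self.ctx = ctx
        self.config = {}
        self.handlers = {}
        self.running = False

    def register_events(self):
        # Optional hook: record the event handlers this role provides.
        self.handlers["example_echo_request"] = self.handle_echo
        self.running = True

    def on_config(self, config):
        # Optional hook: accept updated configuration for this role.
        self.config = dict(config)

    def stop_all(self):
        # Optional hook: release resources when the agent shuts down.
        self.handlers.clear()
        self.running = False

    def handle_echo(self, payload):
        # Trivial handler that returns the payload unchanged.
        return payload
```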

### Networking and authentication
- All REST calls flow through `AgentHttpClient` in `Data/Agent/agent.py`.
- `AgentHttpClient.ensure_authenticated()` handles enrollment and refresh.
- Socket.IO is used for:
  - `quick_job_run` dispatch (script execution payloads).
  - `vpn_tunnel_start` and `vpn_tunnel_stop` (WireGuard lifecycle).
  - `connect_agent` registration (agent socket registry).

### Token storage
- Refresh tokens are stored encrypted (DPAPI on Windows) in `refresh.token`.
- Access tokens are stored in `access.jwt` with expiry metadata.
- GUID is stored in `Agent_GUID.txt`.
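- When the access token expires, the agent requests a new one through `POST /api/agent/token/refresh`; when the refresh token is invalid or expired, the agent re-enrolls.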
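
One plausible reading of how stored tokens drive the next authentication step, given the refresh and enrollment endpoints listed earlier. This is a sketch, not the logic inside `AgentHttpClient.ensure_authenticated()`; the function name, the epoch-seconds expiry, and the 60-second skew are assumptions.

```python
import time
from enum import Enum


class AuthAction(Enum):
    USE_ACCESS_TOKEN = "use_access_token"
    REFRESH = "refresh"
    ENROLL = "enroll"


def next_auth_action(access_expires_at, has_refresh_token, now=None, skew_seconds=60):
    """Pick the next step from stored token state (hypothetical helper).

    access_expires_at: expiry as epoch seconds, or None if no access token.
    has_refresh_token: True if a refresh token could be read and decrypted.
    """
    now = time.time() if now is None else now
    # Treat tokens that expire within the skew window as already expired.
    if access_expires_at is not None and access_expires_at - skew_seconds > now:
        return AuthAction.USE_ACCESS_TOKEN
    if has_refresh_token:
        return AuthAction.REFRESH
    return AuthAction.ENROLL
```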

### Logging
- Primary log: `Agent/Logs/agent.log` with daily rotation.
- Error log: `Agent/Logs/agent.error.log`.
- VPN logs: `Agent/Logs/VPN_Tunnel/tunnel.log` and `remote_shell.log`.
- Role-specific logs may write to `Agent/Logs/<service>.log`.

### Troubleshooting flow
- If enrollment fails, check:
  - `Agent/Logs/agent.log` for enrollment errors.
  - `Engine/Logs/engine.log` for approval or auth failures.
- If scripts do not run:
  - Confirm `quick_job_run` events and the correct role context.
  - Verify signatures with `signature_utils` logs.
- If VPN fails:
  - Check agent WireGuard role logs and ensure the Engine emitted `vpn_tunnel_start`.

### Borealis Agent Codex (Full)
Use this section for agent-only work (Borealis agent runtime under `Data/Agent` -> `/Agent`). Shared guidance is consolidated in `ui-and-notifications.md` and the Engine runtime notes.

#### Scope and runtime paths
- Purpose: outbound-only connectivity, device telemetry, scripting, UI helpers.
- Bootstrap: `Borealis.ps1` preps dependencies, activates the agent venv, and co-launches the Engine.
- Edit in `Data/Agent`, not `/Agent`; runtime copies are ephemeral and wiped regularly.

#### Logging
- Primary log: `Agent/Logs/agent.log` with daily rotation to `agent.log.YYYY-MM-DD` (never auto-delete rotated files).
- Subsystems: log to `Agent/Logs/<service>.log` with the same rotation policy.
- Install/diagnostics: `Agent/Logs/install.log`; keep ad-hoc traces (for example, `system_last.ps1`, ansible) under `Agent/Logs/` to keep runtime state self-contained.
- Troubleshooting: prefix lines with `<timestamp>-<service-name>-<log-data>`; ask operators whether verbose logging should stay after resolution.
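
The rotation and prefix conventions above map onto the standard library roughly as follows. This is a sketch, not the agent's actual logging setup; `build_service_logger` and its defaults are hypothetical.

```python
import logging
from logging.handlers import TimedRotatingFileHandler
from pathlib import Path


def build_service_logger(service: str, log_dir: str = "Agent/Logs") -> logging.Logger:
    """Return a logger writing to <log_dir>/<service>.log with daily rotation."""
    directory = Path(log_dir)
    directory.mkdir(parents=True, exist_ok=True)
    logger = logging.getLogger(service)
    if logger.handlers:
        return logger  # already configured; avoid duplicate handlers
    handler = TimedRotatingFileHandler(
        str(directory / f"{service}.log"),
        when="midnight",   # rotated files get a .YYYY-MM-DD suffix
        backupCount=0,     # 0 means rotated files are never deleted
        encoding="utf-8",
    )
    # <timestamp>-<service-name>-<log-data>
    handler.setFormatter(logging.Formatter("%(asctime)s-%(name)s-%(message)s"))
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.propagate = False
    return logger
```

Usage: `build_service_logger("agent").info("enrollment approved")` appends a prefixed line to `Agent/Logs/agent.log`.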

#### Security
- Generates device-wide Ed25519 keys on first launch (`Certificates/Agent/Identity/`; DPAPI on Windows, `chmod 600` elsewhere).
- Refresh/access tokens are encrypted and pinned to the Engine certificate fingerprint; mismatches force re-enrollment.
- Uses a dedicated `ssl.SSLContext` seeded with the Engine TLS bundle for REST and Socket.IO traffic.
- Validates script payloads with backend-issued Ed25519 signatures before execution.
- Outbound-only; API/WebSocket calls flow through `AgentHttpClient.ensure_authenticated` for proactive refresh. Logs bootstrap, enrollment, token refresh, and signature events in `Agent/Logs/`.
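
The fingerprint pinning described above can be illustrated with a small check. The choice of SHA-256 over the DER-encoded certificate and both function names are assumptions; the agent may compute and store the fingerprint differently.

```python
import hashlib
import hmac


def certificate_fingerprint(cert_der: bytes) -> str:
    """Hex SHA-256 digest of a DER-encoded certificate (assumed pin format)."""
    return hashlib.sha256(cert_der).hexdigest()


def token_matches_engine(pinned_fingerprint: str, current_cert_der: bytes) -> bool:
    """Return True if the fingerprint stored with the token matches the Engine cert.

    A False result corresponds to the "mismatch forces re-enrollment" case.
    """
    current = certificate_fingerprint(current_cert_der)
    # Constant-time comparison; normalise case so stored pins may be uppercase.
    return hmac.compare_digest(
        pinned_fingerprint.strip().lower().encode("utf-8"),
        current.encode("utf-8"),
    )
```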

#### Reverse VPN tunnels
- WireGuard reverse VPN design and lifecycle are documented in `vpn-and-remote-access.md`.
- The original references were `REVERSE_TUNNELS.md` and `Reverse_VPN_Tunnel_Deployment.md` (now consolidated into this knowledgebase).
- Agent roles:
  - `Data/Agent/Roles/role_WireGuardTunnel.py` (tunnel lifecycle)
  - `Data/Agent/Roles/role_RemotePowershell.py` (VPN PowerShell TCP server)

#### Execution contexts and roles
- Auto-discovers roles from `Data/Agent/Roles/`; no loader changes needed.
- Naming: `role_<Purpose>.py` with `ROLE_NAME`, `ROLE_CONTEXTS`, and optional hooks (`register_events`, `on_config`, `stop_all`).
- Standard roles: `role_DeviceAudit.py`, `role_Screenshot.py`, `role_ScriptExec_CURRENTUSER.py`, `role_ScriptExec_SYSTEM.py`, `role_Macro.py`.
- SYSTEM tasks depend on scheduled-task creation rights; failures should surface through Engine logging.
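
Auto-discovery of `role_*.py` files can be sketched with `importlib`. This is not the code in `role_manager.py`; `discover_roles` and its filtering by context are assumptions that follow the `ROLE_NAME` / `ROLE_CONTEXTS` / `Role` contract described above.

```python
import importlib.util
from pathlib import Path


def discover_roles(roles_dir: str, context: str) -> dict:
    """Map ROLE_NAME -> Role class for role_*.py files valid in `context`."""
    roles = {}
    for path in sorted(Path(roles_dir).glob("role_*.py")):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # runs the role file's top-level code
        name = getattr(module, "ROLE_NAME", None)
        contexts = getattr(module, "ROLE_CONTEXTS", [])
        role_cls = getattr(module, "Role", None)
        # Skip files that do not export the full contract or target another context.
        if name and role_cls is not None and context in contexts:
            roles[name] = role_cls
    return roles
```

Usage: `discover_roles("Data/Agent/Roles", "system")` would return only roles that list `'system'` in `ROLE_CONTEXTS`.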

#### Platform parity
- Windows is the reference. Linux (`Borealis.sh`) lags in venv setup, supervision, and role loading; align Linux before macOS work continues.

#### Ansible support (unfinished)
- Agent and Engine scaffolding exists but is unreliable: expect stalled or silent failures, inconsistent recap, missing collections.
- Windows blockers: `ansible.windows.*` usually needs PSRP/WinRM; SYSTEM context lacks loopback remoting guarantees; interpreter paths vary.
- Treat Ansible features as disabled until the packaging/controller story is complete. Future direction: credential management, selectable connections, reliable live output/cancel, packaged collections.