Documentation Overhaul

2026-01-27 01:09:20 -07:00
parent e23daf69b5
commit 287d3b1cf7
26 changed files with 2062 additions and 801 deletions

@@ -1,14 +1,17 @@
# Borealis Codex Engagement Index
Use this file as the entrypoint for Codex instructions. Domain-specific guidance lives in `/Docs/Codex` so we can scale without bloating this page.
Use this file as the entrypoint for Codex instructions. The full knowledgebase now lives under `Docs/` and includes both human-facing guidance and **Codex Agent** sections with deep, agent-ready details. There is no separate Codex folder anymore.
## Where to Read
- Agent: `Docs/Codex/BOREALIS_AGENT.md` (runtime paths, logging, security, roles, platform parity, Ansible status).
- Engine: `Docs/Codex/BOREALIS_ENGINE.md` (migration tracker, architecture, logging, security/API parity, platform parity, Ansible state).
- Shared: `Docs/Codex/SHARED.md` with UI guidance at `Docs/Codex/USER_INTERFACE.md`.
- Start here: `Docs/index.md` (table of contents and documentation rules).
- Agent runtime: `Docs/agent-runtime.md` (runtime paths, logging, security, roles, platform parity, Ansible status).
- Engine runtime: `Docs/engine-runtime.md` (architecture, logging, security/API parity, platform parity, migration notes).
- UI and notifications: `Docs/ui-and-notifications.md` (MagicUI styling, AG Grid rules, toast notifications, UI handoffs).
- VPN and remote access: `Docs/vpn-and-remote-access.md` (WireGuard tunnels, remote shell, RDP, troubleshooting context).
- Security and trust: `Docs/security-and-trust.md` (enrollment, tokens, code signing, sequence diagrams).
Precedence: follow domain docs first; fall back to Shared when there is overlap. If domain and Shared disagree, domain wins.
Precedence: follow domain docs first; where overlap exists, the domain page wins. The Codex Agent sections inside each page are the authoritative agent guidance.
## UI / AG Grid
- MagicUI styling language and AG Grid rules are consolidated in `Docs/Codex/USER_INTERFACE.md`.
- Visual example: `Data/Engine/web-interface/src/Admin/Page_Template.jsx` (reference only, no business logic). Use it to mirror layout, spacing, and selection column behavior.
- MagicUI styling language and AG Grid rules are consolidated in `Docs/ui-and-notifications.md`.
- Visual example: `Data/Engine/web-interface/src/Admin/Page_Template.jsx` (reference only - no business logic). Use it to mirror layout, spacing, and selection column behavior.

@@ -1,60 +0,0 @@
# Borealis Agent Refresh Tokens
This page explains what an agent refresh token is, how it is issued, where it is stored, how long it lives, how sliding expiry works, and how the agent uses it to stay authenticated.
## What a Refresh Token Is
- A long-lived credential the agent gets during enrollment; it represents device trust and is bound to the agent's key/certificate fingerprint.
- Stored locally under the agent settings directory as an encrypted blob (`refresh.token`) alongside token metadata (`access.meta.json`) and the agent GUID.
- Not presented to normal APIs; it is only sent to the Engine to mint new short-lived access tokens.
## How the Agent Obtains It
1) Enrollment (`/api/agent/enroll/request` → `/api/agent/enroll/poll`):
- The agent proves possession of its Ed25519 identity and an operator-approved enrollment code.
- The Engine issues:
- `guid` (device identity),
- `access_token` (EdDSA JWT, ~15 minutes),
- `refresh_token` (random 48-byte urlsafe string),
- Engine TLS bundle and signing key.
- The agent persists the GUID, access token, refresh token, and expiry metadata via `AgentKeyStore` (`Data/Agent/security.py`).
## How Long It Lasts (Sliding Expiry)
- Base TTL: 90 days. Enrollment stores `expires_at = now + 90 days` in the Engine DB (`Data/Engine/services/API/enrollment/routes.py`).
- Sliding refresh: every successful call to `/api/agent/token/refresh` resets `expires_at` to `now + 90 days` and updates `last_used_at` (`Data/Engine/services/API/tokens/routes.py`). This favors recent activity rather than absolute age.
- If the Engine is offline, the agent simply keeps the stored refresh token; it will retry when connectivity returns. Expiry is enforced by the Engine clock, not the agent.
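The sliding-expiry rule above can be sketched as a small pure function. This is only an illustration under stated assumptions: `refreshTokenRecord`, `REFRESH_TTL_MS`, and the camelCase record fields are hypothetical names, not taken from the Engine code, which is described as living in `Data/Engine/services/API/tokens/routes.py`.

```javascript
// Hypothetical sketch of the sliding 90-day expiry; names are illustrative only.
const REFRESH_TTL_MS = 90 * 24 * 60 * 60 * 1000; // 90-day base TTL in milliseconds

// Returns an updated record on success, or an error code when the token has expired.
// Expiry is judged against the caller-supplied Engine clock (nowMs).
function refreshTokenRecord(record, nowMs) {
  if (record.expiresAt < nowMs) {
    return { ok: false, error: "refresh_token_expired" };
  }
  return {
    ok: true,
    record: {
      ...record,
      lastUsedAt: nowMs,                   // mirrors last_used_at
      expiresAt: nowMs + REFRESH_TTL_MS,   // window slides from the moment of use
    },
  };
}
```

A token refreshed on day 80 would therefore stay valid until day 170, while one left unused past its `expiresAt` would be rejected.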
## Access Tokens vs. Refresh Tokens
- Access tokens: EdDSA JWTs with a ~15 minute lifetime (`expires_in: 900`). Used for all authenticated REST/WebSocket calls and renewed proactively.
- Refresh tokens: never presented to device APIs; only used to obtain a new access token. If absent/expired/invalid, the agent falls back to re-enrollment.
## How the Agent Uses It
- Every authenticated call runs through `AgentHttpClient.ensure_authenticated()` (`Data/Agent/agent.py`):
- Reloads tokens from disk.
- If no GUID/refresh token, performs enrollment.
- If no access token or it is about to expire, posts `{guid, refresh_token}` to `/api/agent/token/refresh`.
- On refresh:
- Success: stores the new access token, updates expiry metadata, and continues.
- 401/403 with specific errors (e.g., expired, fingerprint mismatch, token_version bump) trigger token clear + re-enrollment.
- Token storage:
- Refresh token: DPAPI-protected on Windows (or stored locally with restricted permissions elsewhere) at `refresh.token`.
- Access token: `access.jwt` plus expiry in `access.meta.json`.
- GUID: `Agent_GUID.txt`.
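The decision flow described for `ensure_authenticated()` can be summarized in a short sketch. The function names, field names, the 60-second margin, and the `retry-later` outcome are assumptions made for illustration; the real agent is Python, and it only re-enrolls on specific 401/403 error codes, which this simplified version does not distinguish.

```javascript
// Hypothetical decision helpers; not the agent's actual implementation.
const REFRESH_MARGIN_S = 60; // assumed "about to expire" margin; the source gives no value

// Decide what the agent should do before making an authenticated call.
function nextAuthAction(tokens, nowS) {
  if (!tokens.guid || !tokens.refreshToken) return "enroll";
  const expiringSoon = tokens.accessExpiresAt - nowS <= REFRESH_MARGIN_S;
  if (!tokens.accessToken || expiringSoon) return "refresh";
  return "proceed";
}

// Map a refresh response status to the follow-up step (simplified).
function afterRefresh(status) {
  if (status === 200) return "store-access-token";
  if (status === 401 || status === 403) return "clear-and-re-enroll";
  return "retry-later"; // assumption: other failures are retried later
}
```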
## When It Stops Working
- Engine-side expiry: if `expires_at` (in Engine DB) is older than “now,” refresh attempts return `refresh_token_expired` (401) and the agent re-enrolls.
- Revocation: device status `revoked/decommissioned` or token_version bumps invalidate the refresh token and force re-enrollment.
- Certificate/key changes: mismatched fingerprint bindings also force a re-enrollment path.
## Operational Notes
- Short outages (days/weeks) are tolerated: the 90-day sliding window resets on the first successful refresh after the Engine is back.
- Very long inactivity (>90 days without refresh) will require re-enrollment; the agent will reuse the last installer code if available, otherwise operator action is needed.
- Logs for token activity live under `Agent/Logs/` (`agent.log`, `agent.error.log`). Engine-side changes are recorded in the Engine DB `refresh_tokens` table with `last_used_at` and `expires_at`.
## Relevant Files (relative paths)
- `Docs/Agent/Refresh_Tokens.md` (this document)
- `Data/Agent/agent.py` (agent token lifecycle: ensure/authenticate/refresh/enroll)
- `Data/Agent/security.py` (token persistence: `access.jwt`, `refresh.token`, `access.meta.json`, GUID)
- `Data/Engine/services/API/enrollment/routes.py` (issues initial access/refresh tokens; sets refresh `expires_at`)
- `Data/Engine/services/API/tokens/routes.py` (refresh endpoint; sliding 90-day extension)
- `Data/Engine/auth/jwt_service.py` (access token issuance; 15-minute `expires_in`)
- `Data/Engine/database_migrations.py` (defines `refresh_tokens` table schema and indexes)
- `Data/Engine/Unit_Tests/test_tokens_api.py` (coverage for refresh behavior and expiry updates)

@@ -1,39 +0,0 @@
# Codex Guide: Borealis Agent
Use this doc for agent-only work (Borealis agent runtime under `Data/Agent` → `/Agent`). For shared guidance, see `Docs/Codex/SHARED.md`.
## Scope & Runtime Paths
- Purpose: outbound-only connectivity, device telemetry, scripting, UI helpers.
- Bootstrap: `Borealis.ps1` preps dependencies, activates the agent venv, and co-launches the Engine.
- Edit in `Data/Agent`, not `/Agent`; runtime copies are ephemeral and wiped regularly.
## Logging
- Primary log: `Agent/Logs/agent.log` with daily rotation to `agent.log.YYYY-MM-DD` (never auto-delete rotated files).
- Subsystems: log to `Agent/Logs/<service>.log` with the same rotation policy.
- Install/diagnostics: `Agent/Logs/install.log`; keep ad-hoc traces (e.g., `system_last.ps1`, ansible) under `Agent/Logs/` to keep runtime state self-contained.
- Troubleshooting: prefix lines with `<timestamp>-<service-name>-<log-data>`; ask operators whether verbose logging should stay after resolution.
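As a rough illustration of the naming and prefix conventions above, the following sketch builds a rotated file name and a troubleshooting line. The helper names are hypothetical, and the use of UTC dates and ISO 8601 timestamps is an assumption; the source only specifies the `YYYY-MM-DD` suffix and the `<timestamp>-<service-name>-<log-data>` shape.

```javascript
// Hypothetical helpers; UTC and ISO 8601 are assumptions, not documented behavior.

// "agent.log" + a date -> "agent.log.YYYY-MM-DD"
function rotatedLogName(baseName, date) {
  const day = date.toISOString().slice(0, 10); // YYYY-MM-DD (UTC)
  return `${baseName}.${day}`;
}

// Builds a line shaped like <timestamp>-<service-name>-<log-data>
function formatLogLine(serviceName, data, date) {
  return `${date.toISOString()}-${serviceName}-${data}`;
}
```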
## Security
- Generates device-wide Ed25519 keys on first launch (`Certificates/Agent/Identity/`; DPAPI on Windows, `chmod 600` elsewhere).
- Refresh/access tokens are encrypted and pinned to the Engine certificate fingerprint; mismatches force re-enrollment.
- Uses dedicated `ssl.SSLContext` seeded with the Engine TLS bundle for REST + Socket.IO traffic.
- Validates script payloads with backend-issued Ed25519 signatures before execution.
- Outbound-only; API/WebSocket calls flow through `AgentHttpClient.ensure_authenticated` for proactive refresh. Logs bootstrap, enrollment, token refresh, and signature events in `Agent/Logs/`.
## Reverse VPN Tunnels
- WireGuard reverse VPN design and lifecycle live in `Docs/Codex/REVERSE_TUNNELS.md` and `Docs/Codex/Reverse_VPN_Tunnel_Deployment.md`.
- Agent roles: `Data/Agent/Roles/role_WireGuardTunnel.py` (tunnel lifecycle) and `Data/Agent/Roles/role_RemotePowershell.py` (VPN PowerShell TCP server).
## Execution Contexts & Roles
- Auto-discovers roles from `Data/Agent/Roles/`; no loader changes needed.
- Naming: `role_<Purpose>.py` with `ROLE_NAME`, `ROLE_CONTEXTS`, and optional hooks (`register_events`, `on_config`, `stop_all`).
- Standard roles: `role_DeviceInventory.py`, `role_Screenshot.py`, `role_ScriptExec_CURRENTUSER.py`, `role_ScriptExec_SYSTEM.py`, `role_Macro.py`.
- SYSTEM tasks depend on scheduled-task creation rights; failures should surface through Engine logging.
## Platform Parity
- Windows is the reference. Linux (`Borealis.sh`) lags in venv setup, supervision, and role loading; align Linux before macOS work continues.
## Ansible Support (Unfinished)
- Agent + Engine scaffolding exists but is unreliable: expect stalled/silent failures, inconsistent recap, missing collections.
- Windows blockers: `ansible.windows.*` usually needs PSRP/WinRM; SYSTEM context lacks loopback remoting guarantees; interpreter paths vary.
- Treat Ansible features as disabled until packaging/controller story is complete. Future direction: credential mgmt, selectable connections, reliable live output/cancel, packaged collections.

@@ -1,40 +0,0 @@
# Codex Guide: Borealis Engine
Use this doc for Engine work (successor to the legacy server). For shared guidance, see `Docs/Codex/SHARED.md`.
## Scope & Runtime Paths
- Bootstrap: `Borealis.ps1` launches the Engine and/or Agent. The equivalent bootstrap script for Linux is `Borealis.sh`.
- Edit in `Data/Engine`; runtime copies live under `/Engine` and are discarded every time the engine is launched.
## Architecture
- Runtime: `Data/Engine/server.py` with NodeJS + Vite for live dev and Flask for production serving/API endpoints.
## Development Guidelines
- Every Python module under `Data/Engine` or `Engine/Data/Engine` starts with the standard commentary header (purpose + API endpoints). Add the header to any existing module before further edits.
## Logging
- Primary log: `Engine/Logs/engine.log` with daily rotation (`engine.log.YYYY-MM-DD`); do not auto-delete rotated files.
- Subsystems: `Engine/Logs/<service>.log`; install output to `Engine/Logs/install.log`.
- Keep Engine-specific artifacts within `Engine/Logs/` to preserve the runtime boundary.
## Security & API Parity
- Mirrors legacy mutual trust: Ed25519 device identities, EdDSA-signed access tokens, pinned Borealis root CA, TLS 1.3-only serving, Authorization headers + service-context markers on every device API.
- Implements DPoP validation, short-lived access tokens (~15 min), SHA-256-hashed refresh tokens (30-day) with explicit reuse errors.
- Enrollment: operator approvals, conflict detection, auditor recording, pruning of expired codes/refresh tokens.
- Background jobs and service adapters maintain compatibility with legacy DB schemas while enabling gradual API takeover.
## Reverse VPN Tunnels
- WireGuard reverse VPN design and lifecycle live in `Docs/Codex/REVERSE_TUNNELS.md` and `Docs/Codex/Reverse_VPN_Tunnel_Deployment.md`.
- Engine orchestrator: `Data/Engine/services/VPN/vpn_tunnel_service.py` with WireGuard manager `Data/Engine/services/VPN/wireguard_server.py`.
- UI shell bridge: `Data/Engine/services/WebSocket/vpn_shell.py`.
## WebUI & WebSocket Migration
- Static/template handling: `Data/Engine/services/WebUI`; deployment copy paths are wired through `Borealis.ps1` with TLS-aware URL generation.
- Stage 6 tasks: migration switch in the legacy server for WebUI delegation and porting device/admin API endpoints into Engine services.
- Stage 7 (queued): `register_realtime` hooks, Engine-side Socket.IO handlers, integration checks, legacy delegation updates.
## Platform Parity
- Windows is primary target. Keep Engine tooling aligned with the agent experience; Linux packaging must catch up before macOS work resumes.
## Ansible Support (Shared State)
- Mirrors the agent's unfinished story: treat orchestration as experimental until packaging, connection management, and logging mature.

@@ -1,62 +0,0 @@
# Borealis Reverse VPN Tunnels (WireGuard): Operator & Developer Guide
This document is the reference for Borealis reverse VPN tunnels built on WireGuard. The legacy WebSocket framing and domain-lane tunnel stack has been retired; the system now uses a single outbound WireGuard tunnel per agent with host-only routing and per-device ACLs.
## 1) High-Level Model
- Outbound-only: agents establish WireGuard tunnels to the Engine; no inbound access on devices.
- Transport: WireGuard/UDP on port 30000.
- Sessions: one live VPN tunnel per agent; multiple operators share it.
- Routing: host-only /32 per agent; AllowedIPs restricted to the agent /32 and engine /32; no client-to-client.
- Idle timeout: 15 minutes of no operator activity; no grace period.
- Keys: WireGuard server keys under `Engine/Certificates/VPN_Server`; client keys under `Agent/Borealis/Certificates/VPN_Client`.
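The idle rule in this model (15 minutes without operator activity, no grace period) reduces to a single comparison. A minimal sketch, assuming millisecond timestamps and a hypothetical helper name:

```javascript
// Hypothetical idle check; the helper name and millisecond units are assumptions.
const IDLE_TIMEOUT_MS = 15 * 60 * 1000; // 15 minutes, no grace period

// True once the tunnel has seen no operator activity for the full timeout.
function shouldTearDown(lastActivityMs, nowMs) {
  return nowMs - lastActivityMs >= IDLE_TIMEOUT_MS;
}
```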
## 2) Engine Components
- Orchestrator: `Data/Engine/services/VPN/vpn_tunnel_service.py`
- Allocates per-agent /32, issues short-lived orchestration tokens, enforces single-session.
- Starts/stops WireGuard listener, applies firewall rules, idles out on inactivity.
- Emits Socket.IO events: `vpn_tunnel_start`, `vpn_tunnel_stop`, `vpn_tunnel_activity`.
- WireGuard manager: `Data/Engine/services/VPN/wireguard_server.py`
- Generates server keys, renders config, manages `wireguard.exe` tunnel service, applies ACL rules.
- PowerShell bridge: `Data/Engine/services/WebSocket/vpn_shell.py`
- Proxies UI shell input/output to the agent's TCP shell server over WireGuard.
- Logging: `Engine/Logs/VPN_Tunnel/tunnel.log` plus Device Activity entries; shell I/O is in `Engine/Logs/VPN_Tunnel/remote_shell.log`.
## 3) API Endpoints
- `POST /api/tunnel/connect` → issues session material (tunnel_id, token, virtual_ip, endpoint, allowed_ports, idle_seconds).
- `GET /api/tunnel/status` → returns up/down status for an agent.
- `GET /api/tunnel/connect/status` → alias for status (used by UI before shell open).
- `GET /api/tunnel/active` → lists active VPN tunnel sessions (tunnel_id, agent_id, virtual_ip, last_activity, etc.).
- `DELETE /api/tunnel/disconnect` → immediate teardown (agent + engine cleanup).
- `GET /api/device/vpn_config/<agent_id>` → read per-agent allowed ports.
- `PUT /api/device/vpn_config/<agent_id>` → update allowed ports.
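A client consuming `POST /api/tunnel/connect` might want to confirm that the session material is complete before using it. The sketch below checks only for the fields listed above; the helper name is hypothetical, and the real response may carry additional fields or a different envelope.

```javascript
// Field names come from the endpoint description; the helper itself is illustrative.
const SESSION_FIELDS = [
  "tunnel_id",
  "token",
  "virtual_ip",
  "endpoint",
  "allowed_ports",
  "idle_seconds",
];

// Returns the names of any expected fields that are absent or null.
function missingSessionFields(body) {
  return SESSION_FIELDS.filter(
    (key) => body == null || body[key] === undefined || body[key] === null
  );
}
```

An empty result means every expected field is present; a caller could otherwise refuse to open the shell and report which fields were missing.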
## 4) Agent Components
- Tunnel lifecycle: `Data/Agent/Roles/role_WireGuardTunnel.py`
- Validates orchestration tokens, starts/stops WireGuard client service, enforces idle.
- Shell server: `Data/Agent/Roles/role_RemotePowershell.py`
- TCP PowerShell server bound to `0.0.0.0:47002`, restricted to VPN subnet (10.255.x.x).
- Logging: `Agent/Logs/VPN_Tunnel/tunnel.log` (tunnel lifecycle) and `Agent/Logs/VPN_Tunnel/remote_shell.log` (shell I/O).
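Because the shell server binds to `0.0.0.0`, the subnet restriction is what keeps non-VPN peers out. The following sketch shows one way such a check could look, assuming "10.255.x.x" means the 10.255.0.0/16 range; the function name is hypothetical and the real role is implemented in Python.

```javascript
// Hypothetical peer filter; assumes the VPN subnet is 10.255.0.0/16.
function isVpnPeer(address) {
  const parts = String(address).split(".");
  if (parts.length !== 4) return false;
  // Accept only plain decimal octets in the range 0-255.
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  if (octets.some((o) => Number.isNaN(o) || o > 255)) return false;
  return octets[0] === 10 && octets[1] === 255;
}
```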
## 5) Security & Auth
- TLS pinned for Engine API/Socket.IO.
- Orchestration tokens signed via Engine Ed25519 key; agent verifies signatures and stores the signing key.
- WireGuard AllowedIPs /32; no LAN routes; client-to-client blocked.
- Engine firewall rules enforce per-device allowed ports.
## 6) UI
- Device details now include an “Advanced Config” tab for per-device allowed ports.
- PowerShell MVP reuses `Data/Engine/web-interface/src/Devices/ReverseTunnel/Powershell.jsx` with WireGuard APIs + VPN shell events.
## 7) Extending to New Protocols
- Add protocol ports to the device allowlist and UI toggles.
- Reuse the existing VPN tunnel; no new transport/domain lanes required.
## 8) Legacy Removal
- WebSocket tunnel domains, protocol handlers, and domain limits are removed.
- No `/tunnel` Socket.IO namespace or framed protocol messages remain.
## 9) Change Log (not exhaustive)
- 2025-11-30: Legacy WebSocket tunnel scaffold introduced (lease manager, framing, tokens).
- 2025-12-06: Legacy PowerShell handler simplified to pipes-only; UI status tweaks.
- 2025-12-18: Legacy domain lanes added (`remote-interactive-shell`, `remote-management`, `remote-video`) with limits.
- 2025-12-20: WireGuard reverse VPN migration complete; legacy WebSocket tunnels retired; VPN shell bridge + new APIs.

@@ -1,41 +0,0 @@
# Remote Shell UI Changes Handoff
You are a new ChatGPT Codex agent working in `d:\Github\Borealis`. Start by reading
`AGENTS.md`, then follow the doc chain it specifies. Also read
`Docs/Codex/TOAST_NOTIFICATIONS.md` to implement toast notifications correctly.
## Current Situation
- The WireGuard tunnel and Remote Shell work once the agent SYSTEM socket is online.
- If the operator clicks **Connect** too early, the UI shows `agent_socket_missing`
and no toast appears.
- Goal: prevent the Remote Shell connect attempt until the agent is actually ready,
and show a toast notification if the operator clicks too early.
## Required Behavior
- When the agent SYSTEM socket is not registered, the UI must block the connection
attempt, show a toast via `/api/notifications/notify`, and keep the UI idle
(no tunnel/session attempt).
- Toast title: `Agent Onboarding Underway`
- Toast message:
`Please wait for the agent to finish onboarding into Borealis. It takes about 1 minute to finish the process.`
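One possible shape for this gate is sketched below. It assumes that a truthy `agent_socket` field in the status response means the SYSTEM socket is registered, and it leaves the status URL and the shell-opening callback to the caller because their exact form is not given here. `decideConnect` and `guardedConnect` are hypothetical names, not existing functions in `Powershell.jsx`.

```javascript
// Toast text is taken from the required behavior; everything else is illustrative.
const ONBOARDING_TOAST = {
  title: "Agent Onboarding Underway",
  message:
    "Please wait for the agent to finish onboarding into Borealis. It takes about 1 minute to finish the process.",
};

// Pure decision: open the shell only when the agent socket is reported.
function decideConnect(status) {
  if (status && status.agent_socket) return { action: "open" };
  return { action: "toast", toast: ONBOARDING_TOAST };
}

// Async wrapper: statusUrl and openShell are supplied by the caller (assumptions).
async function guardedConnect({ fetchImpl, statusUrl, openShell }) {
  const statusRes = await fetchImpl(statusUrl, { credentials: "include" });
  const decision = decideConnect(await statusRes.json());
  if (decision.action === "open") {
    await openShell();
    return decision;
  }
  // Not ready: notify the operator and make no tunnel/session attempt.
  await fetchImpl("/api/notifications/notify", {
    method: "POST",
    credentials: "include",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(decision.toast),
  });
  return decision;
}
```

The same `decideConnect` result could also be reused when the shell open response reports `agent_socket_missing`, so both paths show the identical toast.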
## Important References
- `Docs/Codex/TOAST_NOTIFICATIONS.md` (toast API path, payload schema, auth, Socket.IO event)
- `AGENTS.md` (instructions and precedence)
- UI file: `Data/Engine/web-interface/src/Devices/ReverseTunnel/Powershell.jsx`
- API status endpoint: `/api/tunnel/status` returns `agent_socket` when available
- Socket error path: `agent_socket_missing`
## Troubleshooting Context
- Engine logs show `vpn_shell_open_failed ... reason=agent_socket_missing` when the
SYSTEM socket is not connected.
- Toasts do not appear; likely causes: WebUI build is reused (`Existing WebUI build found`)
or the UI error path doesn't trigger the toast.
- Ensure the toast is sent via `/api/notifications/notify` with `credentials: "include"`
and the payload schema from `TOAST_NOTIFICATIONS.md`.
## Deliverables
- Update UI logic to call the notification API and block the connection attempt until
readiness is confirmed.
- Cover both preflight status checks and the `agent_socket_missing` shell open response.
- Provide explicit rebuild/restart steps if the WebUI build must be refreshed.

@@ -1,6 +0,0 @@
# Codex Guide: Shared Conventions
Cross-cutting guidance that applies to both Agent and Engine work. Domain-specific rules live in `Docs/Codex/BOREALIS_AGENT.md` and `Docs/Codex/BOREALIS_ENGINE.md`.
- UI & AG Grid: see `Docs/Codex/USER_INTERFACE.md` for MagicUI styling language and AG Grid patterns (with references to live templates).
- Add further shared topics here (e.g., triage process, security posture deltas) instead of growing `AGENTS.md`.

@@ -1,97 +0,0 @@
# Codex Guide: Toast Notifications (Borealis WebUI)
Use this guide to add, configure, and test transient toast notifications across Borealis. It documents the backend endpoint, frontend listener, payload contract, and quick Firefox console commands you can hand to operators for validation.
## Components & Paths
- Backend endpoint: `Data/Engine/services/API/notifications/management.py` (registered as `/api/notifications/notify`).
- Frontend listener + renderer: `Data/Engine/web-interface/src/Notifications.jsx` (mounted in `App.jsx`).
- Transport: Socket.IO event `borealis_notification` broadcast to connected WebUI clients.
## Backend Behavior
- Auth: Uses `RequestAuthContext.require_user()`; session/Bearer must be present. Returns `401/403` otherwise.
- Route: `POST /api/notifications/notify`
- Emits `borealis_notification` over Socket.IO (no persistence).
- Logs via `service_log("notifications", ...)`.
- Validation: Requires `message` in payload. `title` defaults to `"Notification"` if omitted.
- Registration: API group `notifications` is enabled by default via `DEFAULT_API_GROUPS` and `_GROUP_REGISTRARS` in `Data/Engine/services/API/__init__.py`.
## Payload Schema
Send JSON body (session-authenticated):
- `title` (string, optional): Heading line. Default `"Notification"`.
- `message` (string, required): Body copy.
- `icon` (string, optional): Material icon name hint (e.g., `info`, `filter`, `schedule`, `warning`, `error`). Falls back to `NotificationsActive`.
- `variant` (string, optional): Visual theme. Accepted: `info` | `warning` | `error` (case-insensitive). Aliases: `type` or `severity`. Defaults to `info`.
- `ttl_ms` (number, optional): Client-side lifetime in milliseconds; defaults to ~5200ms before fade-out.
Notes:
- Payload is fanned out verbatim to the WebUI (plus server-added fields: `id`, `username`, `role`, `created_at`).
- The client caps the visible stack to the 5 most recent items (newest on top).
- Non-empty `message` is mandatory; otherwise HTTP 400.
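The payload rules above can be condensed into a small normalization sketch. It is not the Engine's validation code: `normalizeToast` and `pushToast` are hypothetical names, treating whitespace-only messages as empty is an assumption, and so is falling back to `info` for unrecognized variants.

```javascript
// Hypothetical normalization of the documented payload contract.
const VARIANTS = ["info", "warning", "error"];

function normalizeToast(payload) {
  const message = typeof payload.message === "string" ? payload.message : "";
  if (message.trim() === "") return { ok: false, status: 400 }; // message is required
  // `type` and `severity` are documented aliases for `variant`.
  const raw = String(payload.variant ?? payload.type ?? payload.severity ?? "info").toLowerCase();
  return {
    ok: true,
    toast: {
      title: payload.title || "Notification",
      message,
      icon: payload.icon,
      variant: VARIANTS.includes(raw) ? raw : "info", // fallback is an assumption
      ttl_ms: typeof payload.ttl_ms === "number" ? payload.ttl_ms : 5200,
    },
  };
}

// Client-side stack: newest first, capped at the 5 most recent items.
function pushToast(stack, toast) {
  return [toast, ...stack].slice(0, 5);
}
```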
## Frontend Rendering Rules
- Component: `Notifications.jsx` listens to `borealis_notification` on `window.BorealisSocket`.
- Stack position: fixed top-right, high z-index, pointer events enabled on toasts only.
- Auto-dismiss: ~5s default; each item fades out and is removed.
- Theme by `variant`:
- `info` (default): Borealis blue aurora gradient.
- `warning`: Muted amber gradient.
- `error`: Deep red gradient.
- Icon: No container; uses the provided Material icon hint. Small drop shadow for legibility.
## Implementation Steps (Recap)
1) Backend: Ensure `/api/notifications/notify` is registered (already in repo). New services should import `register_notifications` if API groups are customized.
2) Emit: From any authenticated server flow, POST to `/api/notifications/notify` with the payload above.
3) Frontend: `App.jsx` mounts `Notifications` globally; no per-page wiring needed.
4) Test: Use the Firefox console examples below while logged in to confirm toast rendering.
## Firefox Console Examples (run while signed in)
Info (default blue):
```js
fetch("/api/notifications/notify", {
  method: "POST",
  credentials: "include",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    title: "Test Notification",
    message: "Hello from the console!",
    icon: "info",
    variant: "info"
  })
}).then(r => r.json()).then(console.log).catch(console.error);
```
Warning (amber):
```js
fetch("/api/notifications/notify", {
  method: "POST",
  credentials: "include",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    title: "Heads up",
    message: "This is a warning example.",
    icon: "warning",
    variant: "warning"
  })
}).then(r => r.json()).then(console.log).catch(console.error);
```
Error (red):
```js
fetch("/api/notifications/notify", {
  method: "POST",
  credentials: "include",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    title: "Error encountered",
    message: "Something failed during processing.",
    icon: "error",
    variant: "error"
  })
}).then(r => r.json()).then(console.log).catch(console.error);
```
## Usage Notes & Tips
- Keep `message` concise; multiline is supported via `\n`.
- Use `icon` to match the source feature (e.g., `filter`, `schedule`, `device`, `error`).
- The server adds `username`/`role` to payloads; the client currently shows all variants regardless of role (filtering is per-username match when present).
- If sockets are unavailable, the endpoint still returns 200; toasts simply will not render until Socket.IO is connected.

@@ -1,122 +0,0 @@
# Codex Guide: Shared UI (MagicUI + AG Grid)
Applies to all Borealis frontends. Use `Data/Engine/web-interface/src/Admin/Page_Template.jsx` as the canonical visual reference (no API/business logic). Keep this doc as the single source of truth for styling rules and AG Grid behavior.
- Toast notifications: see `Docs/Codex/TOAST_NOTIFICATIONS.md` for endpoint, payload, severity variants, and quick test commands.
## Page Template Reference
- Purpose: visual-only baseline for new pages; copy structure but wire your data in real pages.
- Header: small Material icon left of the title, subtitle beneath, utility buttons on the top-right.
- Shell: avoid gutters on the Paper.
- Selection column (for bulk actions): pinned left, square checkboxes, header checkbox enabled, ~52px fixed width, no menu/sort/resize; rely on AG Grid built-ins.
- Typography/buttons: IBM Plex Sans, gradient primary buttons, rounded corners (~8px), themed Quartz grid wrapper.
## MagicUI Styling Language (Visual System)
- Full-bleed canvas: hero shells run edge-to-edge; inset padding lives inside cards so gradients feel immersive.
- Glass panels: glassmorphic layers (`rgba(15,23,42,0.7)`), rounded 16-24px corners, blurred backdrops, micro borders, optional radial flares for motion.
- Hero storytelling: start views with stat-forward heroes—gradient StatTiles (min 160px) and uppercase pills (HERO_BADGE_SX) summarizing live signals/filters.
- Summary data grids: use AG Grid inside a glass wrapper (two columns Field/Value), matte navy background, no row striping.
- Tile palettes: online cyan→green; stale orange→red; “needs update” violet→cyan; secondary metrics fade from cyan into desaturated steel for consistent hue families.
- Hardware islands: storage/memory/network blocks reuse Quartz theme in rounded glass shells with flat fills; present numeric columns (Capacity/Used/Free/%) to match Device Inventory.
- Action surfaces: control bars live in translucent glass bands; filled dark inputs with cyan hover borders; primary actions are pill-shaped gradients; secondary controls are soft-outline icon buttons.
- Anchored controls: align selectors/utility buttons with grid edges in a single row; reserve glass backdrops for hero sections so content stays flush.
- Buttons & chips: gradient pills for primary CTAs (`linear-gradient(135deg,#34d399,#22d3ee)` success; `#7dd3fc→#c084fc` creation); neutral actions use rounded outlines with `rgba(148,163,184,0.4)` borders and uppercase microcopy.
- Rainbow accents: for creation CTAs, use dark-fill pills with rainbow border gradients + teal halo (shared with Quick Job).
- AG Grid treatment: Quartz theme with matte navy headers, subtle alternating row opacity, cyan/magenta interaction glows, rounded wrappers, soft borders, inset selection glows.
- Default grid cell padding: keep roughly 18px on the left edge and 12px on the right for standard cells (12px/9px for `auto-col-tight`) so text never hugs a column edge. Target the center + pinned containers so both regions stay aligned.
- Overlays/menus: `rgba(8,12,24,0.96)` canvas, blurred backdrops, thin steel borders; bright typography; deep blue glass inputs; cyan confirm, mauve destructive accents.
## Aurora Tabs (MagicUI Tabbed Interfaces)
- Placement: sit directly below the hero title/subtitle band (8-16px gap). Tabs span the full width of the content column.
- Typography: IBM Plex Sans, `fontSize: 15`, mixed case labels (`textTransform: "none"`). Use `fontWeight: 600` for emphasis, but avoid uppercase that crowds the aurora glow.
- Indicator: 3px tall bar with rounded corners that uses the cyan→violet aurora gradient `linear-gradient(90deg,#7dd3fc,#c084fc)`. Keep it flush with the bottom border so it looks like a light strip under the active tab.
- Hover/active treatment: tabs float on a translucent aurora panel `linear-gradient(120deg, rgba(125,211,252,0.18), rgba(192,132,252,0.22))` with a 1px inset steel outline. This gradient applies on hover for both selected and non-selected tabs to keep parity.
- Colors: base text `MAGIC_UI.textMuted` (`#94a3b8`). Hovering switches to `MAGIC_UI.textBright` (`#e2e8f0`). Always force `opacity: 1` to avoid MUI's default faded text on unfocused tabs.
- Shape/spacing: tabs are pill-like with `borderRadius: 4` (MUI unit `1`). Maintain `minHeight: 44px` so targets are touchable. Provide `borderBottom: 1px solid MAGIC_UI.panelBorder` to anchor the rail.
- CSS/SX snippet to copy into new tab stacks:
```jsx
const TAB_HOVER_GRADIENT = "linear-gradient(120deg, rgba(125,211,252,0.18), rgba(192,132,252,0.22))";
<Tabs
  value={tab}
  onChange={(_, v) => setTab(v)}
  variant="scrollable"
  scrollButtons="auto"
  TabIndicatorProps={{
    style: {
      height: 3,
      borderRadius: 3,
      background: "linear-gradient(90deg,#7dd3fc,#c084fc)",
    },
  }}
  sx={{
    borderBottom: `1px solid ${MAGIC_UI.panelBorder}`,
    "& .MuiTab-root": {
      color: MAGIC_UI.textMuted,
      fontFamily: "\"IBM Plex Sans\", \"Helvetica Neue\", Arial, sans-serif",
      fontSize: 15,
      textTransform: "none",
      fontWeight: 600,
      minHeight: 44,
      opacity: 1,
      borderRadius: 1,
      transition: "background 0.2s ease, color 0.2s ease, box-shadow 0.2s ease",
      "&:hover": {
        color: MAGIC_UI.textBright,
        backgroundImage: TAB_HOVER_GRADIENT,
        boxShadow: "0 0 0 1px rgba(148,163,184,0.25) inset",
      },
    },
    "& .Mui-selected": {
      color: MAGIC_UI.textBright,
      "&:hover": {
        backgroundImage: TAB_HOVER_GRADIENT,
      },
    },
  }}
>
  {TABS.map((t) => (
    <Tab key={t} label={t} />
  ))}
</Tabs>
```
- Interaction rules: tabs should never scroll vertically; rely on horizontal scroll for overflow. Always align the tab rail with the first section header on the page so the aurora indicator lines up with hero metrics.
- Accessibility: keep `aria-label`/`aria-controls` pairs when the panes hold complex content, and ensure the gradient backgrounds preserve 4.5:1 contrast for the text (the current cyan on dark meets this).
## Page-Level Action Buttons
- Place page-level actions/buttons/hero-badges in a fixed overlay at the top-right, just below the global menu bar. If an example is needed, match the Filter Editor's placement (`Data/Engine/web-interface/src/Devices/Filters/Filter_Editor.jsx`): wrapper `position: "fixed"`, `top: { xs: 72, md: 88 }`, `right: { xs: 12, md: 20 }`, `zIndex: 1400`, with `pointerEvents: "none"` on the wrapper and `pointerEvents: "auto"` on the inner `Stack` so underlying content remains clickable.
- Use gradient primary pills and outlined secondary pills (rounded 999 radius, MagicUI colors). Keep horizontal spacing via a `Stack` (e.g., `spacing={1.25}`); do not nest these buttons inside the title grid or tab rail.
- Tabs stay in normal document flow beneath the title/subtitle; the floating action bar should not shift layout. When operators request moving page actions (or when building new pages), apply this fixed overlay pattern instead of absolute positioning tied to tab rails.
- Keep the responsive offsets (xs/md) unless a specific page has a different header height/padding; only adjust the numeric values when explicitly needed to align with a nonstandard shell.
## AG Grid Column Behavior (All Tables)
- Auto-size value columns and let the last column absorb remaining width so views span available space.
- Declare `AUTO_SIZE_COLUMNS` near the grid component (exclude the fill column).
- Helper: store the grid API in a ref and call `api.autoSizeColumns(AUTO_SIZE_COLUMNS, true)` inside `requestAnimationFrame` (or `setTimeout(...,0)` fallback); swallow errors because it can run before rows render.
- Hook the helper into both `onGridReady` and a `useEffect` watching the dataset (e.g., `[filteredRows, loading]`); skip while `loading` or when there are zero rows.
- Column defs: apply shared `cellClass: "auto-col-tight"` (or equivalent) to every auto-sized column for consistent padding. Last column keeps the class for styling consistency.
- CSS override: ensure the wrapper targets both center and pinned containers so every cell shares the same flex alignment. Then apply the tighter inset to `auto-col-tight`:
```jsx
"& .ag-center-cols-container .ag-cell, & .ag-pinned-left-cols-container .ag-cell, & .ag-pinned-right-cols-container .ag-cell": {
  display: "flex",
  alignItems: "center",
  justifyContent: "flex-start",
  textAlign: "left",
  padding: "8px 12px 8px 18px",
},
"& .ag-center-cols-container .ag-cell .ag-cell-wrapper, & .ag-pinned-left-cols-container .ag-cell .ag-cell-wrapper, & .ag-pinned-right-cols-container .ag-cell .ag-cell-wrapper": {
  width: "100%",
  display: "flex",
  alignItems: "center",
  justifyContent: "flex-start",
  padding: 0,
},
"& .ag-center-cols-container .ag-cell.auto-col-tight, & .ag-pinned-left-cols-container .ag-cell.auto-col-tight, & .ag-pinned-right-cols-container .ag-cell.auto-col-tight": {
  paddingLeft: "12px",
  paddingRight: "9px",
},
```
- Style helper: reuse a `GRID_STYLE_BASE` (or similar) to set fonts/icons and `--ag-cell-horizontal-padding: "18px"` on every grid, then merge it with per-grid dimensions.
- Fill column: last column `{ flex: 1, minWidth: X }` (no width/maxWidth) to stretch when horizontal space remains.
- Pagination baseline: every Quartz grid ships with `pagination`, `paginationPageSize={20}`, and `paginationPageSizeSelector={[20, 50, 100]}`. This matches Device List behavior and prevents infinitely tall tables (Targets, assembly pickers, job histories, etc.).
- Example: follow the scaffolding in `Engine/web-interface/src/Scheduling/Scheduled_Jobs_List.jsx` and the structure in `Data/Engine/web-interface/src/Admin/Page_Template.jsx`.


@@ -1,111 +0,0 @@
# Borealis WireGuard Troubleshooting Handoff
This file is a self-contained handoff prompt + context for a new Codex agent to resume WireGuard tunnel troubleshooting.
## Prompt to Use in a New Codex Session
Copy/paste the prompt below into a new Codex chat:
"""
You are a new Codex agent working in d:\Github\Borealis. Please do the following:
1) Read AGENTS.md, Docs/Codex/BOREALIS_AGENT.md, Docs/Codex/BOREALIS_ENGINE.md, Docs/Codex/REVERSE_TUNNELS.md, then Docs/Codex/WireGuard_Troubleshooting.md (this file).
2) Note environment mapping:
- D:\Github\Borealis = Engine (this device).
- Z:\ = Agent (remote device) read-only share for logs/configs.
- Use Z:\ to read agent logs/configs instead of asking the user to paste them.
3) Confirm the WireGuard listener on the Engine starts and stays running, then confirm the tunnel handshake from the remote agent.
4) Keep all config files inside the project root only:
- Agent config path: Agent\Borealis\Settings\WireGuard\Borealis.conf
- Engine config path: Engine\WireGuard\borealis-wg.conf
5) Make edits only in Data/Agent or Data/Engine. The user handles redeploying the Agent runtime on the remote device when needed.
6) If any doc in Docs\Codex is outdated, update it to reflect the current state and blockers.
"""
## Environment / Scope
- Workspace: D:\Github\Borealis (local project root for the Engine)
- Host OS: Windows 10/11 (build 26200). Engine runs on this machine.
- Remote Agent: mounted read-only at Z:\ (maps to C:\Borealis on the remote device; logs/configs under Z:\Agent\...).
- Agent/Engine launch: via Borealis.ps1, always elevated as admin.
- Network: Engine on 10.0.0.54; remote agent uses server_url.txt to derive endpoint host.
- WireGuard version: wireguard.exe 0.5.3, wg.exe 1.0.20210914.
- PIA (Private Internet Access) is installed and supplies a wintun driver (pia-wintun.sys). Do NOT treat the PIA adapter as the Borealis adapter.
## Desired Behavior
- Agent has a dedicated WireGuard adapter named "Borealis".
- Adapter provisioning is idempotent: if "Borealis" exists, do not recreate it.
- Configs must live inside the project root:
- Agent: Agent\Borealis\Settings\WireGuard\Borealis.conf
- Engine: Engine\WireGuard\borealis-wg.conf
- Agent brings up the WireGuard tunnel on vpn_tunnel_start, then remote shell/RDP/VNC/SSH flow through it.
- On stop/idle, the tunnel is torn down and firewall rules removed.
## Recent Changes (Current Repo State)
- Data/Agent/Roles/role_WireGuardTunnel.py
- Lazy client init (avoid side effects on import).
- Service name fix: WireGuard tunnel service is "WireGuardTunnel$Borealis".
- Endpoint override: if Engine sends localhost, use host from server_url.txt and port from the token.
- Config path preference: Agent\Borealis\Settings\WireGuard.
- Service display name set to "Borealis - WireGuard - Agent".
- Applies/removes the VPN shell firewall rule using the engine /32 from allowed_ips.
- Data/Engine/services/VPN/wireguard_server.py
- Engine config path: Engine\WireGuard\borealis-wg.conf (project root only).
- Removed invalid "SaveConfig = false" line (WireGuard rejected it).
- Service display name set to "Borealis - WireGuard - Engine".
- Ensures the listener service is running after install, and raises if it fails.
- Borealis.ps1
- Service name interpolation fixed to include the literal "$" in "WireGuardTunnel$Borealis".
Note: Data/Agent changes only apply after Borealis.ps1 re-stages the agent under Agent\.
## Current Symptoms (2026-01-14 00:05)
- Tunnel handshakes are healthy; TCP shell connectivity succeeds after adding a firewall rule for TCP/47002 from the engine /32.
- The firewall rule is now applied/removed by `role_WireGuardTunnel.py` using the engine /32 in the `allowed_ips` payload.
- `wireguard.exe /dumplog /tail` still fails with "Stdout must be set" when run from PowerShell (use file redirection).
## Key Paths
- Agent WireGuard role: Data/Agent/Roles/role_WireGuardTunnel.py
- Agent VPN shell role: Data/Agent/Roles/role_RemotePowershell.py
- Engine WireGuard manager: Data/Engine/services/VPN/wireguard_server.py
- Engine tunnel service: Data/Engine/services/VPN/vpn_tunnel_service.py
- Agent tunnel logs: Z:\Agent\Logs\VPN_Tunnel\tunnel.log
- Agent shell logs: Z:\Agent\Logs\VPN_Tunnel\remote_shell.log
- Engine tunnel logs: Engine\Logs\VPN_Tunnel\tunnel.log
- Engine shell logs: Engine\Logs\VPN_Tunnel\remote_shell.log
- Agent WireGuard config: Z:\Agent\Borealis\Settings\WireGuard\Borealis.conf
- Engine WireGuard config: Engine\WireGuard\borealis-wg.conf
## Known WireGuard Services / Names
- Engine listener service name: "WireGuardTunnel$borealis-wg"
- Agent tunnel service name: "WireGuardTunnel$Borealis"
- Adapter name in Control Panel: "Borealis"
- Service display names:
- "Borealis - WireGuard - Engine"
- "Borealis - WireGuard - Agent"
## Suggested Verification Commands
- Engine service status:
- Get-Service -Name "WireGuardTunnel$borealis-wg"
- sc.exe query "WireGuardTunnel$borealis-wg"
- netstat -ano -p udp | findstr :30000
- Engine WireGuard log tail:
- cmd /c ""C:\\Program Files\\WireGuard\\wireguard.exe" /dumplog /tail > %TEMP%\\wg-tail.log"
- powershell -NoProfile -Command "& 'C:\\Program Files\\WireGuard\\wireguard.exe' /dumplog /tail 2>&1 | Out-File $env:TEMP\\wg-tail.log"
- Agent tunnel state (remote, via Z:\ logs):
- Z:\Agent\Logs\VPN_Tunnel\tunnel.log
- Z:\Agent\Logs\VPN_Tunnel\remote_shell.log
- Z:\Agent\Borealis\Settings\WireGuard\Borealis.conf
## Current Blockers / Next Steps
1) Ensure the agent runtime is re-staged so `role_WireGuardTunnel.py` applies the shell firewall rule on tunnel start.
2) During an active session, run `Test-NetConnection -ComputerName 10.255.0.2 -Port 47002` on the Engine and confirm it reaches the agent.
3) While the session is active, confirm `Agent\Borealis\Settings\WireGuard\Borealis.conf` includes a [Peer] with endpoint/AllowedIPs (it reverts to idle config after stop).
4) Capture engine + agent tunnel/shell logs around a failed shell open attempt and re-check WireGuard service state if issues persist.

Docs/agent-runtime.md Normal file

@@ -0,0 +1,134 @@
# Agent Runtime
[Back to Docs Index](index.md) | [Index (HTML)](index.html)
## Purpose
Describe the Borealis agent runtime, its roles, service modes, and how it communicates with the Engine.
## Runtime Summary
- Main entry: `Data/Agent/agent.py` (Python agent service).
- Service modes: SYSTEM and CURRENTUSER (controlled by `--system-service` or environment).
- Role system: `Data/Agent/role_manager.py` auto-loads `Data/Agent/Roles/role_*.py`.
- Networking: REST to Engine APIs + Socket.IO for realtime job dispatch and VPN orchestration.
- Security: Ed25519 identity keys, pinned TLS, signed script payloads, encrypted token storage.
## Role Catalog (Current)
- `role_DeviceAudit.py` (ROLE_NAME: `device_audit`) - inventory and audit data capture.
- `role_Macro.py` (ROLE_NAME: `macro`) - macro automation.
- `role_PlaybookExec_SYSTEM.py` (ROLE_NAME: `playbook_exec_system`) - Ansible playbook runner (unfinished).
- `role_RDP.py` (ROLE_NAME: `RDP`) - RDP readiness hooks.
- `role_RemotePowershell.py` (ROLE_NAME: `RemotePowershell`) - TCP PowerShell server over WireGuard.
- `role_Screenshot.py` (ROLE_NAME: `screenshot`) - screenshot capture.
- `role_ScriptExec_CURRENTUSER.py` (ROLE_NAME: `script_exec_currentuser`) - interactive PowerShell execution.
- `role_ScriptExec_SYSTEM.py` (ROLE_NAME: `script_exec_system`) - SYSTEM PowerShell execution.
- `role_WireGuardTunnel.py` (ROLE_NAME: `WireGuardTunnel`) - WireGuard client lifecycle.
## Agent Settings and Storage
- Settings root: `Agent/Borealis/Settings/` (runtime).
- Server URL: `Agent/Borealis/Settings/server_url.txt`.
- GUID and token storage: `Agent/Borealis/Settings/Agent_GUID.txt`, `access.jwt`, `refresh.token`.
## API Endpoints (Engine-facing)
- `POST /api/agent/enroll/request` (No Authentication) - start enrollment.
- `POST /api/agent/enroll/poll` (No Authentication) - finalize enrollment after approval.
- `POST /api/agent/token/refresh` (Refresh Token) - mint a new access token.
- `POST /api/agent/heartbeat` (Device Authenticated) - heartbeat + metrics.
- `POST /api/agent/details` (Device Authenticated) - hardware/inventory payloads.
- `POST /api/agent/script/request` (Device Authenticated) - request work or receive idle signal.
## Related Documentation
- [Security and Trust](security-and-trust.md)
- [Device Management](device-management.md)
- [VPN and Remote Access](vpn-and-remote-access.md)
## Codex Agent (Detailed)
### Source vs runtime
- Edit only in `Data/Agent/`.
- Runtime copy lives in `Agent/` and is regenerated by `Borealis.ps1`.
### Service modes and context
- SYSTEM mode is used for elevated tasks (scheduled tasks, VPN, system scripts).
- CURRENTUSER mode handles interactive tasks and UI-scoped execution.
- The agent includes `X-Borealis-Agent-Context` in headers to label context.
### Role discovery and extension
- Roles are discovered dynamically from `Data/Agent/Roles/`.
- Each role must define:
- `ROLE_NAME` (string)
- `ROLE_CONTEXTS` (list: `['system']`, `['interactive']`, or both)
- `Role` class with optional `register_events`, `on_config`, and `stop_all`.
- To add a role:
1) Create `Data/Agent/Roles/role_<Name>.py`.
2) Export `ROLE_NAME`, `ROLE_CONTEXTS`, and `Role`.
3) Re-stage the agent runtime (`Borealis.ps1 -Agent`).
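A minimal sketch of what such a role module might look like. The three exported names come from the list above; the file name `role_Example.py`, the `ctx` constructor argument, and every hook signature are hypothetical, so check an existing role under `Data/Agent/Roles/` for the real contract before copying this.

```python
# Hypothetical Data/Agent/Roles/role_Example.py, for illustration only.
# ROLE_NAME, ROLE_CONTEXTS, and Role are the documented exports; the
# constructor argument and hook signatures below are assumptions.

ROLE_NAME = "example"        # unique string identifier for the role
ROLE_CONTEXTS = ["system"]   # ['system'], ['interactive'], or both


class Role:
    def __init__(self, ctx=None):
        # ctx stands in for whatever the role manager passes in (assumed).
        self.ctx = ctx
        self.active = False
        self.last_config = None

    def register_events(self):
        # Optional hook: a real role would attach Socket.IO handlers here.
        self.active = True

    def on_config(self, config):
        # Optional hook: remember the most recent configuration payload.
        self.last_config = config

    def stop_all(self):
        # Optional hook: release anything the role started.
        self.active = False
```

Because roles are discovered by file name, a module like this should need no loader changes once the runtime is re-staged.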
### Networking and authentication
- All REST calls flow through `AgentHttpClient` in `Data/Agent/agent.py`.
- `AgentHttpClient.ensure_authenticated()` handles enrollment and refresh.
- Socket.IO is used for:
- `quick_job_run` dispatch (script execution payloads).
- `vpn_tunnel_start` and `vpn_tunnel_stop` (WireGuard lifecycle).
- `connect_agent` registration (agent socket registry).
### Token storage
- Refresh tokens are stored encrypted (DPAPI on Windows) in `refresh.token`.
- Access tokens are stored in `access.jwt` with expiry metadata.
- GUID is stored in `Agent_GUID.txt`.
- Expired access tokens are renewed through `POST /api/agent/token/refresh`; when the refresh token is itself invalid or expired, the agent re-enrolls.
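One way to decide when a stored access token needs attention is to read its expiry directly. The sketch below assumes the access token is a standard three-part JWT carrying an `exp` claim; the source only says that expiry metadata is stored with the token, so the actual agent may track expiry differently. The helper names and the 30-second skew are illustrative.

```python
import base64
import json
import time


def jwt_expiry(token):
    """Return the `exp` claim (seconds since epoch) of a JWT, or None.

    The signature is NOT verified; this only inspects the payload.
    """
    parts = token.split(".")
    if len(parts) != 3:
        return None
    payload = parts[1]
    # base64url payloads omit padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    try:
        claims = json.loads(base64.urlsafe_b64decode(payload))
    except ValueError:
        # Covers bad base64, bad UTF-8, and bad JSON.
        return None
    exp = claims.get("exp") if isinstance(claims, dict) else None
    return exp if isinstance(exp, (int, float)) else None


def token_is_stale(token, now=None, skew_seconds=30):
    """True when the token is unreadable or expires within skew_seconds."""
    exp = jwt_expiry(token)
    if exp is None:
        return True
    current = time.time() if now is None else now
    return exp - current <= skew_seconds
```

Treating an unreadable token as stale keeps the check conservative: the caller falls through to whatever recovery path the agent uses rather than sending a token it cannot interpret.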
### Logging
- Primary log: `Agent/Logs/agent.log` with daily rotation.
- Error log: `Agent/Logs/agent.error.log`.
- VPN logs: `Agent/Logs/VPN_Tunnel/tunnel.log` and `remote_shell.log`.
- Role-specific logs may write to `Agent/Logs/<service>.log`.
### Troubleshooting flow
- If enrollment fails, check:
- `Agent/Logs/agent.log` for enrollment errors.
- `Engine/Logs/engine.log` for approval or auth failures.
- If scripts do not run:
- Confirm `quick_job_run` events and the correct role context.
- Verify signatures with `signature_utils` logs.
- If VPN fails:
- Check agent WireGuard role logs and ensure the Engine emitted `vpn_tunnel_start`.
### Borealis Agent Codex (Full)
Use this section for agent-only work (Borealis agent runtime under `Data/Agent` -> `/Agent`). Shared guidance is consolidated in `ui-and-notifications.md` and the Engine runtime notes.
#### Scope and runtime paths
- Purpose: outbound-only connectivity, device telemetry, scripting, UI helpers.
- Bootstrap: `Borealis.ps1` preps dependencies, activates the agent venv, and co-launches the Engine.
- Edit in `Data/Agent`, not `/Agent`; runtime copies are ephemeral and wiped regularly.
#### Logging
- Primary log: `Agent/Logs/agent.log` with daily rotation to `agent.log.YYYY-MM-DD` (never auto-delete rotated files).
- Subsystems: log to `Agent/Logs/<service>.log` with the same rotation policy.
- Install/diagnostics: `Agent/Logs/install.log`; keep ad-hoc traces (for example, `system_last.ps1`, ansible) under `Agent/Logs/` to keep runtime state self-contained.
- Troubleshooting: prefix lines with `<timestamp>-<service-name>-<log-data>`; ask operators whether verbose logging should stay after resolution.
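The rotation and prefix rules above can be sketched with the standard library. This is an illustration, not the agent's actual logging code: the function names and the timestamp format are assumptions, and only the `<timestamp>-<service-name>-<log-data>` shape and the `agent.log.YYYY-MM-DD` rotation target come from this page.

```python
import logging
from datetime import datetime
from logging.handlers import TimedRotatingFileHandler


def format_line(service, message, now=None):
    """Build '<timestamp>-<service-name>-<log-data>' (timestamp format assumed)."""
    stamp = (now or datetime.now()).strftime("%Y-%m-%dT%H:%M:%S")
    return f"{stamp}-{service}-{message}"


def build_daily_logger(name, path):
    """Return a logger that rotates at midnight to '<path>.YYYY-MM-DD'."""
    handler = TimedRotatingFileHandler(
        path,
        when="midnight",   # midnight rotation uses the %Y-%m-%d suffix
        backupCount=0,     # 0 keeps every rotated file (no auto-delete)
        encoding="utf-8",
    )
    # Lines are pre-formatted by format_line, so emit the message as-is.
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger
```

A subsystem would call something like `build_daily_logger("borealis.vpn", "Agent/Logs/VPN_Tunnel/tunnel.log")` once and then log `format_line("vpn", "tunnel started")`.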
#### Security
- Generates device-wide Ed25519 keys on first launch (`Certificates/Agent/Identity/`; DPAPI on Windows, `chmod 600` elsewhere).
- Refresh/access tokens are encrypted and pinned to the Engine certificate fingerprint; mismatches force re-enrollment.
- Uses dedicated `ssl.SSLContext` seeded with the Engine TLS bundle for REST and Socket.IO traffic.
- Validates script payloads with backend-issued Ed25519 signatures before execution.
- Outbound-only; API/WebSocket calls flow through `AgentHttpClient.ensure_authenticated` for proactive refresh. Logs bootstrap, enrollment, token refresh, and signature events in `Agent/Logs/`.
#### Reverse VPN tunnels
- WireGuard reverse VPN design and lifecycle are documented in `vpn-and-remote-access.md`.
- The original references were `REVERSE_TUNNELS.md` and `Reverse_VPN_Tunnel_Deployment.md` (now consolidated into this knowledgebase).
- Agent roles:
- `Data/Agent/Roles/role_WireGuardTunnel.py` (tunnel lifecycle)
- `Data/Agent/Roles/role_RemotePowershell.py` (VPN PowerShell TCP server)
#### Execution contexts and roles
- Auto-discovers roles from `Data/Agent/Roles/`; no loader changes needed.
- Naming: `role_<Purpose>.py` with `ROLE_NAME`, `ROLE_CONTEXTS`, and optional hooks (`register_events`, `on_config`, `stop_all`).
- Standard roles: `role_DeviceAudit.py`, `role_Screenshot.py`, `role_ScriptExec_CURRENTUSER.py`, `role_ScriptExec_SYSTEM.py`, `role_Macro.py` (see the Role Catalog above for the full current list).
- SYSTEM tasks depend on scheduled-task creation rights; failures should surface through Engine logging.
#### Platform parity
- Windows is the reference. Linux (`Borealis.sh`) lags in venv setup, supervision, and role loading; align Linux before macOS work continues.
#### Ansible support (unfinished)
- Agent and Engine scaffolding exists but is unreliable: expect stalled or silent failures, inconsistent recap, missing collections.
- Windows blockers: `ansible.windows.*` usually needs PSRP/WinRM; SYSTEM context lacks loopback remoting guarantees; interpreter paths vary.
- Treat Ansible features as disabled until packaging/controller story is complete. Future direction: credential management, selectable connections, reliable live output/cancel, packaged collections.

Docs/api-reference.md Normal file

@@ -0,0 +1,155 @@
# API Reference
[Back to Docs Index](index.md) | [Index (HTML)](index.html)
## Purpose
Provide a consolidated, human-readable list of Borealis Engine API endpoints grouped by domain.
## API Endpoints
### Core
- `GET /health` (No Authentication) - liveness probe.
### Authentication and Access Management
- `POST /api/auth/login` (No Authentication) - operator login.
- `POST /api/auth/logout` (Token Authenticated) - operator logout.
- `POST /api/auth/mfa/verify` (Token Authenticated, MFA pending) - verify MFA.
- `GET /api/auth/me` (Token Authenticated) - current operator profile.
- `GET /api/users` (Admin) - list operator accounts.
- `POST /api/users` (Admin) - create operator account.
- `DELETE /api/users/<username>` (Admin) - delete operator account.
- `POST /api/users/<username>/reset_password` (Admin) - reset operator password.
- `POST /api/users/<username>/role` (Admin) - update operator role.
- `POST /api/users/<username>/mfa` (Admin) - toggle MFA/reset secrets.
- `GET /api/github/token` (Admin) - GitHub API token status.
- `POST /api/github/token` (Admin) - update GitHub API token.
### Enrollment and Tokens
- `POST /api/agent/enroll/request` (No Authentication) - submit enrollment request.
- `POST /api/agent/enroll/poll` (No Authentication) - finalize approved enrollment.
- `POST /api/agent/token/refresh` (Refresh Token) - mint new access token.
### Devices and Inventory
- `POST /api/agent/heartbeat` (Device Authenticated) - heartbeat + metrics.
- `POST /api/agent/details` (Device Authenticated) - full hardware/inventory payload.
- `POST /api/agent/script/request` (Device Authenticated) - request work or idle signal.
- `GET /api/agents` (Token Authenticated) - list online collectors by hostname/context.
- `GET /api/devices` (Token Authenticated) - device summary list.
- `GET /api/devices/<guid>` (Token Authenticated) - device summary by GUID.
- `GET /api/device/details/<hostname>` (Token Authenticated) - full device details.
- `POST /api/device/description/<hostname>` (Token Authenticated) - update description.
- `GET /api/device/vpn_config/<agent_id>` (Token Authenticated) - VPN allowed ports.
- `PUT /api/device/vpn_config/<agent_id>` (Token Authenticated) - update VPN allowed ports.
- `GET /api/device_list_views` (Token Authenticated) - list saved device views.
- `GET /api/device_list_views/<int:view_id>` (Token Authenticated) - get saved view.
- `POST /api/device_list_views` (Token Authenticated) - create saved view.
- `PUT /api/device_list_views/<int:view_id>` (Token Authenticated) - update saved view.
- `DELETE /api/device_list_views/<int:view_id>` (Token Authenticated) - delete saved view.
- `GET /api/sites` (Token Authenticated) - list sites.
- `POST /api/sites` (Admin) - create site.
- `POST /api/sites/delete` (Admin) - delete sites.
- `GET /api/sites/device_map` (Token Authenticated) - hostname to site map.
- `POST /api/sites/assign` (Admin) - assign devices to site.
- `POST /api/sites/rename` (Admin) - rename site.
- `POST /api/sites/rotate_code` (Admin) - rotate site enrollment code.
- `GET /api/repo/current_hash` (Device or Token Authenticated) - current agent repo hash.
- `GET /api/agent/hash` (Device Authenticated) - get agent hash.
- `POST /api/agent/hash` (Device Authenticated) - update agent hash.
- `GET /api/agent/hash_list` (Admin + Loopback) - list agent hashes (local diagnostics).
### Admin Approvals and Install Codes
- `GET /api/admin/enrollment-codes` (Admin) - list install codes.
- `POST /api/admin/enrollment-codes` (Admin) - create install code.
- `DELETE /api/admin/enrollment-codes/<code_id>` (Admin) - delete install code.
- `GET /api/admin/device-approvals` (Admin) - approval queue.
- `POST /api/admin/device-approvals/<approval_id>/approve` (Admin) - approve device.
- `POST /api/admin/device-approvals/<approval_id>/deny` (Admin) - deny device.
### Device Filters
- `GET /api/device_filters` (Token Authenticated) - list filters.
- `GET /api/device_filters/<filter_id>` (Token Authenticated) - get filter.
- `POST /api/device_filters` (Token Authenticated) - create filter.
- `PUT /api/device_filters/<filter_id>` (Token Authenticated) - update filter.
### Assemblies and Execution
- `GET /api/assemblies` (Token Authenticated) - list assemblies.
- `GET /api/assemblies/<assembly_guid>` (Token Authenticated) - assembly details.
- `POST /api/assemblies` (Token Authenticated) - create assembly.
- `PUT /api/assemblies/<assembly_guid>` (Token Authenticated) - update assembly.
- `DELETE /api/assemblies/<assembly_guid>` (Token Authenticated) - delete assembly.
- `POST /api/assemblies/<assembly_guid>/clone` (Admin + Dev Mode for protected domains) - clone assembly.
- `POST /api/assemblies/dev-mode/switch` (Admin) - toggle dev mode.
- `POST /api/assemblies/dev-mode/write` (Admin + Dev Mode) - flush queued writes.
- `POST /api/assemblies/import` (Domain write permission) - import legacy JSON assembly.
- `GET /api/assemblies/<assembly_guid>/export` (Token Authenticated) - export legacy JSON.
- `POST /api/scripts/quick_run` (Token Authenticated) - quick job (PowerShell).
- `POST /api/ansible/quick_run` (Token Authenticated) - placeholder (not implemented).
- `GET /api/device/activity/<hostname>` (Token Authenticated) - device activity history.
- `DELETE /api/device/activity/<hostname>` (Token Authenticated) - clear activity history.
- `GET /api/device/activity/job/<int:job_id>` (Token Authenticated) - activity record details.
### Scheduled Jobs
- `GET /api/scheduled_jobs` (Token Authenticated) - list scheduled jobs.
- `POST /api/scheduled_jobs` (Token Authenticated) - create scheduled job.
- `GET /api/scheduled_jobs/<int:job_id>` (Token Authenticated) - get scheduled job.
- `PUT /api/scheduled_jobs/<int:job_id>` (Token Authenticated) - update scheduled job.
- `POST /api/scheduled_jobs/<int:job_id>/toggle` (Token Authenticated) - enable/disable.
- `DELETE /api/scheduled_jobs/<int:job_id>` (Token Authenticated) - delete scheduled job.
- `GET /api/scheduled_jobs/<int:job_id>/runs` (Token Authenticated) - run history.
- `GET /api/scheduled_jobs/<int:job_id>/devices` (Token Authenticated) - device results.
- `DELETE /api/scheduled_jobs/<int:job_id>/runs` (Token Authenticated) - clear run history.
### Notifications
- `POST /api/notifications/notify` (Token Authenticated) - broadcast toast notification.
### VPN and Remote Access
- `POST /api/tunnel/connect` (Token Authenticated) - start WireGuard tunnel.
- `GET /api/tunnel/status` (Token Authenticated) - tunnel status by agent.
- `GET /api/tunnel/connect/status` (Token Authenticated) - alias for status.
- `GET /api/tunnel/active` (Token Authenticated) - list active tunnels.
- `DELETE /api/tunnel/disconnect` (Token Authenticated) - stop tunnel.
### RDP
- `POST /api/rdp/session` (Token Authenticated) - issue Guacamole RDP session token.
### Server Info and Logs
- `GET /api/server/time` (Operator Session) - server clock.
- `GET /api/server/logs` (Admin) - list logs and retention.
- `GET /api/server/logs/<log_name>/entries` (Admin) - tail log lines.
- `PUT /api/server/logs/retention` (Admin) - update retention policies.
- `DELETE /api/server/logs/<log_name>` (Admin) - delete log file(s).
## Related Documentation
- [Engine Runtime](engine-runtime.md)
- [Device Management](device-management.md)
- [Assemblies and Quick Jobs](assemblies.md)
- [Scheduled Jobs](scheduled-jobs.md)
- [VPN and Remote Access](vpn-and-remote-access.md)
## Codex Agent (Detailed)
### Where endpoints are defined
- Each API module begins with a header listing endpoints.
- Search under `Data/Engine/services/API/` to find the authoritative source.
- The registry lives in `Data/Engine/services/API/__init__.py`.
### How to keep this doc accurate
- When you add or remove a route, update:
1) The module header comment in the source file.
2) This `api-reference.md` page.
3) The domain page (example: `device-management.md`).
### Quick discovery workflow
- Use `rg "# - (GET|POST|PUT|DELETE)" Data/Engine/services/API` to list endpoints.
- Cross-check auth requirements in each module (RequestAuthContext, session checks, or device auth decorators).
- If a route is Socket.IO only, document it in the relevant domain page instead of this REST list.
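Where `rg` is not available, the same discovery pass can be approximated in Python. This sketch loosely mirrors the `rg` pattern above; it assumes header comment lines look like `# - GET /api/devices ...`, and the function names are illustrative rather than part of the Engine.

```python
import re
from pathlib import Path

# Loosely mirrors the rg pattern: "# - (GET|POST|PUT|DELETE)".
# The layout after the HTTP method (path, then free text) is assumed.
ENDPOINT_RE = re.compile(r"^#\s*-\s*(GET|POST|PUT|DELETE)\s+(\S+)")


def endpoints_in_text(text):
    """Return (method, path) pairs found in header comment lines."""
    found = []
    for line in text.splitlines():
        match = ENDPOINT_RE.match(line.strip())
        if match:
            found.append((match.group(1), match.group(2)))
    return found


def endpoints_under(root):
    """Scan every .py file under root and map file path -> endpoint pairs."""
    results = {}
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        pairs = endpoints_in_text(text)
        if pairs:
            results[str(path)] = pairs
    return results
```

Running `endpoints_under("Data/Engine/services/API")` and comparing the result with this page is a quick way to spot routes that were added without a documentation update.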
### Auth labels used in this doc
- No Authentication: open endpoints (rare).
- Token Authenticated: operator session or bearer token.
- Device Authenticated: agent JWT access token.
- Admin: operator must have Admin role.
### Example update scenario
- You add `POST /api/devices/retire`:
- Update `Data/Engine/services/API/devices/management.py` header.
- Add the endpoint under the Devices and Inventory section here.
- Update `device-management.md` with behavior and UI impact.


@@ -0,0 +1,82 @@
# Architecture Overview
[Back to Docs Index](index.md) | [Index (HTML)](index.html)
## Purpose
Explain how Borealis is structured and how the core components interact end to end.
## Core Components
- Engine: Flask + Socket.IO runtime that hosts APIs, scheduled jobs, VPN orchestration, and WebUI assets.
- WebUI: React single page app served by the Engine (Vite in dev, static build in prod).
- Agent: Python runtime that enrolls, reports inventory, executes scripts, and opens VPN tunnels.
- SQLite database: stores devices, approvals, schedules, activity history, tokens, and configuration records.
- Assemblies: script definitions stored in SQLite domains with payload artifacts on disk.
- Remote access: WireGuard reverse VPN, remote PowerShell, and Guacamole-backed RDP proxy.
## How the Pieces Talk
- Enrollment: agent calls `/api/agent/enroll/request` and `/api/agent/enroll/poll`, operator approves, Engine issues tokens and cert bundle.
- Inventory: agent posts `/api/agent/heartbeat` and `/api/agent/details`, Engine updates device records.
- Quick jobs: operator calls `/api/scripts/quick_run`, Engine emits `quick_job_run` over Socket.IO, agent executes and returns `quick_job_result`.
- Scheduled jobs: scheduler reads jobs from DB, resolves targets (including filters), then emits quick jobs.
- VPN tunnels: operator calls `/api/tunnel/connect`, Engine emits `vpn_tunnel_start`, agent starts WireGuard client.
- Remote shell: UI uses Socket.IO `vpn_shell_*` events, Engine bridges to agent TCP shell over WireGuard.
- RDP: operator calls `/api/rdp/session`, Engine creates a one-time token and proxies Guacamole WebSocket to guacd.
- Notifications: operator or services call `/api/notifications/notify`, WebUI receives `borealis_notification` events.
## Directory Map (High Level)
- `Data/Engine/` - Engine source (authoritative).
- `Data/Agent/` - Agent source (authoritative).
- `Engine/` - Engine runtime copy (regenerated each launch).
- `Agent/` - Agent runtime copy (regenerated each launch).
- `Data/Engine/web-interface/src/` - WebUI source.
- `Engine/Logs/` and `Agent/Logs/` - runtime logs.
- `Data/Engine/Assemblies/` and `Engine/Assemblies/` - assemblies (staging and runtime).
## API Endpoints
None on this page. See [API Reference](api-reference.md).
## Related Documentation
- [Engine Runtime](engine-runtime.md)
- [Agent Runtime](agent-runtime.md)
- [Security and Trust](security-and-trust.md)
- [Device Management](device-management.md)
- [Assemblies and Quick Jobs](assemblies.md)
- [Scheduled Jobs](scheduled-jobs.md)
- [VPN and Remote Access](vpn-and-remote-access.md)
- [UI and Notifications](ui-and-notifications.md)
## Codex Agent (Detailed)
### Service map by folder
- Engine APIs: `Data/Engine/services/API/` (grouped by domain, registered in `Data/Engine/services/API/__init__.py`).
- Engine realtime: `Data/Engine/services/WebSocket/` (Socket.IO events: quick jobs, VPN shell, agent socket registry).
- WebUI hosting: `Data/Engine/services/WebUI/` (SPA static assets and 404 fallback).
- VPN orchestration: `Data/Engine/services/VPN/` (WireGuard server and tunnel lifecycle).
- Remote desktop proxy: `Data/Engine/services/RemoteDesktop/` (Guacamole WebSocket proxy).
- Filters and targeting: `Data/Engine/services/filters/matcher.py` (used by scheduled jobs and filter counts).
- Agent roles: `Data/Agent/Roles/` (script exec, screenshot, WireGuard tunnel, remote PowerShell, etc.).
### End-to-end flow examples (use these to debug)
- Quick job:
1) UI calls `/api/scripts/quick_run` with script path + hostnames.
2) Engine signs script and emits `quick_job_run`.
3) Agent role executes and posts `quick_job_result` over Socket.IO.
4) Engine updates `activity_history` and emits `device_activity_changed`.
- VPN shell:
1) UI calls `/api/tunnel/connect` to request tunnel material.
2) Engine emits `vpn_tunnel_start` to agent socket.
3) Agent WireGuard role starts tunnel; agent shell role listens on TCP 47002.
4) UI emits the `vpn_shell_open` Socket.IO event; Engine bridges to the agent's TCP shell.
5) UI sends/receives `vpn_shell_send` and `vpn_shell_output` events.
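When step 3 or 4 of the VPN shell flow is in doubt, a plain TCP probe from the Engine host can confirm whether the agent's shell port is reachable through the tunnel. The helper below is a generic sketch using only the standard library; the function name is illustrative, and the host to probe is whatever tunnel address the active session assigned to the agent.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `tcp_port_open(agent_tunnel_ip, 47002)` returning `False` while the tunnel reports a healthy handshake points toward a firewall rule or a shell role that is not listening, rather than a WireGuard problem.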
### Runtime boundaries
- Do not edit `Engine/` or `Agent/` directly. They are recreated on each launch.
- Always edit `Data/Engine/` and `Data/Agent/`, then re-run the bootstrap script.
### What to read first when debugging
- Start with logs: `Engine/Logs/engine.log` and `Agent/Logs/agent.log`.
- Check domain-specific logs (example: `Engine/Logs/VPN_Tunnel/tunnel.log`).
- Inspect active DB state in `Engine/database.db` for device/job metadata.
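For the database step, a read-only look at the tables is usually enough to start. This sketch opens the SQLite file in read-only mode and lists each table with its row count; it deliberately assumes nothing about the Borealis schema, and the function name is illustrative.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path):
    """Return {table_name: row_count} without modifying the database."""
    # mode=ro makes SQLite refuse writes on this connection.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
            )
        ]
        # Names come from sqlite_master itself, so quoting them is sufficient.
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
    finally:
        conn.close()
```

Calling `list_tables("Engine/database.db")` gives a quick inventory to compare against what the logs suggest should exist, without risking a write to the live runtime database.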
### Interaction points to remember
- REST for inventory, enrollment, and admin actions.
- Socket.IO for realtime job results, VPN shell, and notifications.
- WireGuard for remote protocol transport (shell, RDP, future protocols).


@@ -1,20 +1,97 @@
# Assemblies Runtime Reference
# Assemblies and Quick Jobs
[Back to Docs Index](index.md) | [Index (HTML)](index.html)
## Database Layout
- Three SQLite databases live under `Data/Engine/Assemblies` (`official.db`, `community.db`, `user_created.db`) and mirror to `Engine/Assemblies` at runtime.
- Automatic JSON → SQLite imports for the official domain have been retired; the staged `official.db` now serves as the authoritative store unless you invoke a manual sync.
- Payload binaries and JSON are stored under `Payloads/<payload-guid>` in both staging and runtime directories; the AssemblyCache references payload GUIDs instead of embedding large blobs.
- WAL mode with shared-cache is enabled on every connection; queue flushes copy the refreshed `.db`, `-wal`, and `-shm` files into the runtime mirror.
- `AssemblyCache.describe()` reveals dirty/clean state per assembly, helping operators spot pending writes before shutdown or sync operations.
## Purpose
Explain Borealis assemblies (script definitions), how they are stored, and how quick jobs execute them.
## Dev Mode Controls
- User-created domain mutations remain open to authenticated operators; community/official writes require an administrator with Dev Mode enabled.
- Toggle Dev Mode via `POST /api/assemblies/dev-mode/switch` or the Assemblies admin controls; state expires automatically based on the server-side TTL.
- Privileged actions (create/update/delete, cross-domain clone, queue flush, official sync, import into protected domains) emit audit entries under `Engine/Logs/assemblies.log`.
- When Dev Mode is disabled, API responses return `dev_mode_required` to prompt admins to enable overrides before retrying protected mutations.
## Assemblies at a Glance
- Assemblies are script definitions stored in SQLite domains.
- Domains include: `official`, `community`, and `user_created`.
- Payload artifacts live under `Payloads/<payload-guid>`.
- Assemblies are cached at runtime by the Engine and served via API.
## Backup Guidance
- Regularly snapshot `Data/Engine/Assemblies` and `Data/Engine/Assemblies/Payloads` alongside the mirrored runtime copies to preserve both metadata and payload artifacts.
- Include the queue inspection endpoint (`GET /api/assemblies`) in maintenance scripts to verify no dirty entries remain before capturing backups.
- Maintain the staged databases directly; to publish new official assemblies copy the curated `official.db` into `Data/Engine/Assemblies` before restarting the Engine.
- Future automation will extend to scheduled backups and staged restore helpers; until then, ensure filesystem backups capture both SQLite databases and payload directories atomically.
## Quick Jobs
- Quick jobs are immediate executions of a script assembly.
- The Engine resolves the script, signs it, and emits a Socket.IO `quick_job_run` event.
- Agents execute the payload and return `quick_job_result` for status and output.
## Activity History
- Quick job executions are tracked in `activity_history`.
- Operators can view or delete device activity history via API.
## Ansible Status (Current)
- Ansible quick-run exists as an endpoint but is not implemented.
- Agent and Engine scaffolding exist but are unstable; treat as disabled.
## API Endpoints
- `GET /api/assemblies` (Token Authenticated) - list assemblies.
- `GET /api/assemblies/<assembly_guid>` (Token Authenticated) - assembly details.
- `POST /api/assemblies` (Token Authenticated) - create assembly.
- `PUT /api/assemblies/<assembly_guid>` (Token Authenticated) - update assembly.
- `DELETE /api/assemblies/<assembly_guid>` (Token Authenticated) - delete assembly.
- `POST /api/assemblies/<assembly_guid>/clone` (Admin + Dev Mode for protected domains) - clone assembly.
- `POST /api/assemblies/dev-mode/switch` (Admin) - toggle dev mode.
- `POST /api/assemblies/dev-mode/write` (Admin + Dev Mode) - flush queued writes.
- `POST /api/assemblies/import` (Domain write permissions) - import legacy JSON.
- `GET /api/assemblies/<assembly_guid>/export` (Token Authenticated) - export legacy JSON.
- `POST /api/scripts/quick_run` (Token Authenticated) - quick PowerShell job.
- `POST /api/ansible/quick_run` (Token Authenticated) - placeholder (not implemented).
- `GET /api/device/activity/<hostname>` (Token Authenticated) - device activity history.
- `DELETE /api/device/activity/<hostname>` (Token Authenticated) - clear history.
- `GET /api/device/activity/job/<int:job_id>` (Token Authenticated) - activity record.
## Related Documentation
- [Flow Editor and Nodes](flow-editor-and-nodes.md)
- [Scheduled Jobs](scheduled-jobs.md)
- [Security and Trust](security-and-trust.md)
- [API Reference](api-reference.md)
## Codex Agent (Detailed)
### Storage layout and caching
- Staging assemblies live under `Data/Engine/Assemblies/`:
- `official.db`
- `community.db`
- `user_created.db`
- `Payloads/` for large script assets
- Runtime mirror lives under `Engine/Assemblies/` and is refreshed at launch.
- The Engine loads and caches assemblies via `Data/Engine/assembly_management` and `AssemblyRuntimeService`.
### Dev Mode behavior
- User-created domain writes are allowed for authenticated operators.
- Official/community domains require Admin + Dev Mode enabled.
- Dev Mode state is tracked per session and expires after a TTL.
- Use `POST /api/assemblies/dev-mode/switch` to toggle Dev Mode and `POST /api/assemblies/dev-mode/write` to flush queued writes.
### Quick job execution path
1) Operator calls `/api/scripts/quick_run` with `script_path` and `hostnames`.
2) Engine resolves the assembly document (DB-backed or filesystem).
3) Engine rewrites variable placeholders and signs the script with Ed25519.
4) Engine creates `activity_history` rows and emits `quick_job_run` over Socket.IO.
5) Agent role executes the script (SYSTEM or CURRENTUSER) and returns `quick_job_result`.
6) Engine updates `activity_history` and emits `device_activity_changed`.
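Step 1 of the path above can be sketched as a request builder. This is a minimal illustration, not the Engine's client code: only the route and the `script_path` / `hostnames` fields come from this page, while the `build_quick_run_request` helper and the bearer-style `Authorization` header are assumptions.

```python
import json
import urllib.request


def build_quick_run_request(base_url, token, script_path, hostnames):
    """Build (but do not send) a POST request for /api/scripts/quick_run."""
    body = json.dumps(
        {"script_path": script_path, "hostnames": list(hostnames)}
    ).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/scripts/quick_run",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-style token header; adjust to the
            # Engine's actual operator auth scheme.
            "Authorization": f"Bearer {token}",
        },
    )
```

Sending the request with `urllib.request.urlopen` is left out on purpose, since TLS pinning and session handling depend on the deployment.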
### Script variables and environment injection
- Assembly variables are stored with name, type, default, and description.
- Engine builds an environment map and also rewrites `$env:VAR` occurrences.
- Variables are included in the payload so agents can log context.
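A rough sketch of the environment map and `$env:VAR` rewrite described above. The exact rewrite rules are not documented here, so this shows one plausible reading: known variables are replaced with single-quoted PowerShell literals and unknown ones are left alone. The `build_env_map` and `rewrite_env_refs` helpers are hypothetical names.

```python
import re

# Matches PowerShell-style environment references such as $env:TARGET.
ENV_REF = re.compile(r"\$env:([A-Za-z_][A-Za-z0-9_]*)")


def build_env_map(variables, overrides=None):
    """Map variable names to string values, preferring operator overrides."""
    overrides = overrides or {}
    return {
        v["name"]: str(overrides.get(v["name"], v.get("default", "")))
        for v in variables
    }


def rewrite_env_refs(script, env_map):
    """Replace known $env:NAME references with quoted literals (assumed behavior)."""

    def _substitute(match):
        name = match.group(1)
        if name not in env_map:
            return match.group(0)  # leave unknown references untouched
        # Single-quoted PowerShell literal; embedded quotes are doubled.
        return "'" + env_map[name].replace("'", "''") + "'"

    return ENV_REF.sub(_substitute, script)
```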
### Code signing
- Script bytes are signed in `Data/Engine/services/API/assemblies/execution.py`.
- Agents verify signatures using `signature_utils` before execution.
### Activity history
- `activity_history` stores script metadata, timestamps, status, stdout, stderr.
- Use `GET /api/device/activity/<hostname>` to query entries and `DELETE /api/device/activity/<hostname>` to clear them.
### Backup guidance
- Back up `Data/Engine/Assemblies/` and `Data/Engine/Assemblies/Payloads/` together.
- Also back up `Engine/Assemblies/` if you need runtime snapshots.
### Known limitations
- Ansible quick-run is not implemented in the Engine runtime.
- Linux agent support is incomplete; PowerShell scripts are Windows-first.
### Touch points to remember
- API routes: `Data/Engine/services/API/assemblies/`.
- Assembly runtime: `Data/Engine/services/assemblies/service.py` and `Data/Engine/assembly_management/`.
- UI editors: `Data/Engine/web-interface/src/Assemblies/`.

116
Docs/device-management.md Normal file

@@ -0,0 +1,116 @@
# Device Management
[Back to Docs Index](index.md) | [Index (HTML)](index.html)
## Purpose
Explain how Borealis tracks devices, ingests inventory, manages sites and filters, and handles enrollment approvals.
## Inventory and Status
- Agents send heartbeats and inventory payloads to the Engine.
- The Engine stores device summaries and detailed hardware/software data in SQLite.
- Online status is derived from `last_seen` (a device is online if its last heartbeat arrived within ~5 minutes).
## Sites and Enrollment Codes
- Sites group devices for organizational and targeting purposes.
- Each site can have an enrollment code that agents can use during install.
- Site mapping is stored separately from device records and exposed via API.
## Device Filters
- Filters are stored in the `device_filters` table as JSON criteria groups.
- The Engine computes match counts using `DeviceFilterMatcher` against the inventory snapshot.
- Filters can be global or scoped to a site.
## Device List Views
- Operators can save custom table views for the device list UI.
- Views are stored per operator and exposed via `/api/device_list_views`.
## Per-Device VPN Configuration
- Each device can have a per-agent allowlist of VPN ports.
- These settings are used by the WireGuard service to build firewall rules.
## Enrollment Approvals
- Enrollment requests are queued for admin approval.
- Approvals enforce hostname conflict checks and device identity tracking.
## API Endpoints
- `POST /api/agent/heartbeat` (Device Authenticated) - heartbeat + metrics.
- `POST /api/agent/details` (Device Authenticated) - inventory payloads.
- `GET /api/agents` (Token Authenticated) - online collectors grouped by context.
- `GET /api/devices` (Token Authenticated) - device summary list.
- `GET /api/devices/<guid>` (Token Authenticated) - device summary by GUID.
- `GET /api/device/details/<hostname>` (Token Authenticated) - full device details.
- `POST /api/device/description/<hostname>` (Token Authenticated) - update description.
- `GET /api/device/vpn_config/<agent_id>` (Token Authenticated) - VPN allowed ports.
- `PUT /api/device/vpn_config/<agent_id>` (Token Authenticated) - update VPN allowed ports.
- `GET /api/device_list_views` (Token Authenticated) - list saved views.
- `GET /api/device_list_views/<int:view_id>` (Token Authenticated) - get saved view.
- `POST /api/device_list_views` (Token Authenticated) - create saved view.
- `PUT /api/device_list_views/<int:view_id>` (Token Authenticated) - update saved view.
- `DELETE /api/device_list_views/<int:view_id>` (Token Authenticated) - delete saved view.
- `GET /api/sites` (Token Authenticated) - list sites.
- `POST /api/sites` (Admin) - create site.
- `POST /api/sites/delete` (Admin) - delete sites.
- `GET /api/sites/device_map` (Token Authenticated) - hostname to site map.
- `POST /api/sites/assign` (Admin) - assign devices to site.
- `POST /api/sites/rename` (Admin) - rename site.
- `POST /api/sites/rotate_code` (Admin) - rotate site enrollment code.
- `GET /api/device_filters` (Token Authenticated) - list filters.
- `GET /api/device_filters/<filter_id>` (Token Authenticated) - get filter.
- `POST /api/device_filters` (Token Authenticated) - create filter.
- `PUT /api/device_filters/<filter_id>` (Token Authenticated) - update filter.
- `GET /api/admin/enrollment-codes` (Admin) - list enrollment codes.
- `POST /api/admin/enrollment-codes` (Admin) - create enrollment codes.
- `DELETE /api/admin/enrollment-codes/<code_id>` (Admin) - delete enrollment codes.
- `GET /api/admin/device-approvals` (Admin) - approval queue.
- `POST /api/admin/device-approvals/<approval_id>/approve` (Admin) - approve device.
- `POST /api/admin/device-approvals/<approval_id>/deny` (Admin) - deny device.
## Related Documentation
- [Agent Runtime](agent-runtime.md)
- [Security and Trust](security-and-trust.md)
- [Scheduled Jobs](scheduled-jobs.md)
- [VPN and Remote Access](vpn-and-remote-access.md)
- [API Reference](api-reference.md)
## Codex Agent (Detailed)
### Key files and services
- Device APIs: `Data/Engine/services/API/devices/` (management, approval, tunnel, rdp, routes).
- Filters: `Data/Engine/services/filters/matcher.py` and `Data/Engine/services/API/filters/management.py`.
- Enrollment approvals: `Data/Engine/services/API/devices/approval.py`.
### Inventory ingestion behavior
- `/api/agent/heartbeat` updates `last_seen` and key metrics (last_user, OS, uptime).
- `/api/agent/details` stores full inventory payloads for memory, network, storage, software, cpu.
- JSON blobs are serialized into SQLite text columns and rehydrated for UI.
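The serialize/rehydrate pattern above can be sketched with the standard library. The `device_inventory` table and its columns are hypothetical stand-ins for illustration; the real schema lives in the Engine database code.

```python
import json
import sqlite3


def open_demo_db():
    """Create an in-memory database with a hypothetical inventory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE device_inventory ("
        "hostname TEXT, section TEXT, payload_json TEXT, "
        "PRIMARY KEY (hostname, section))"
    )
    return conn


def store_inventory(conn, hostname, section, payload):
    """Serialize a payload section (memory, network, ...) into a TEXT column."""
    conn.execute(
        "INSERT OR REPLACE INTO device_inventory "
        "(hostname, section, payload_json) VALUES (?, ?, ?)",
        (hostname, section, json.dumps(payload)),
    )


def load_inventory(conn, hostname, section):
    """Rehydrate the stored JSON text back into Python objects for the UI."""
    row = conn.execute(
        "SELECT payload_json FROM device_inventory "
        "WHERE hostname = ? AND section = ?",
        (hostname, section),
    ).fetchone()
    return json.loads(row[0]) if row else None
```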
### Status computation
- Online/offline is computed from `last_seen` (online if within ~300 seconds).
- UI tables use the derived `status` field from the API payload.
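As a small sketch of the rule above, assuming `last_seen` is stored as epoch seconds and that the derived field uses the literal strings `online` and `offline` (both are assumptions, as is the `derive_status` name):

```python
import time

ONLINE_WINDOW_SECONDS = 300  # roughly five minutes, per the rule above


def derive_status(last_seen_epoch, now_epoch=None, window_seconds=ONLINE_WINDOW_SECONDS):
    """Return 'online' when the last heartbeat falls inside the window."""
    if last_seen_epoch is None:
        return "offline"  # a device that has never reported is offline
    now_epoch = time.time() if now_epoch is None else now_epoch
    age = now_epoch - last_seen_epoch
    return "online" if age <= window_seconds else "offline"
```

Passing `now_epoch` explicitly keeps the check deterministic when comparing against timestamps read from the `devices` table.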
### Device identity and keys
- Device identity is tied to GUID + SSL fingerprint + token version.
- `DeviceAuthManager` enforces fingerprint matches and token version checks.
### Sites and enrollment codes
- Sites live in `sites` and `device_sites` tables (see `Data/Engine/database.py`).
- Enrollment codes are stored in `enrollment_install_codes` and can be site-scoped.
- Rotating a site code updates the code record and timestamps.
### Device filters (matching)
- Filters are stored as JSON criteria groups in `device_filters.criteria_json`.
- `DeviceFilterMatcher.fetch_devices()` loads a snapshot from `devices` and joins `sites`.
- `count_filter_devices` computes match counts for UI summaries.
### Per-device VPN configuration
- Allowed ports are stored in `device_vpn_config.allowed_ports` (JSON list).
- WireGuard uses this list to build firewall rules and allowlist transport ports.
### Approval flow detail
- Enrollment requests create approval records (pending).
- Admin approval handles hostname conflicts (merge or rename).
- Denials are logged and remove pending requests.
### Debug checklist
- Device missing from list: check `Engine/database.db` tables `devices` and `device_keys`.
- Online status wrong: check `last_seen` timestamps in `devices` table.
- Filter counts zero: validate `device_filters.criteria_json` and matcher logic.
- VPN config not applying: confirm `device_vpn_config` row and tunnel logs.

128
Docs/engine-runtime.md Normal file

@@ -0,0 +1,128 @@
# Engine Runtime
[Back to Docs Index](index.md) | [Index (HTML)](index.html)
## Purpose
Describe the Borealis Engine runtime, its services, configuration, and operational responsibilities.
## Runtime Summary
- Application factory: `Data/Engine/server.py` (Flask + Socket.IO, Eventlet).
- Configuration loader: `Data/Engine/config.py` (environment-first, defaults, TLS discovery).
- API registration: `Data/Engine/services/API/__init__.py` (groups + adapters).
- WebUI serving: `Data/Engine/services/WebUI/` (SPA static assets and 404 fallback).
- Realtime events: `Data/Engine/services/WebSocket/` (quick job results, VPN shell bridge).
- VPN orchestration: `Data/Engine/services/VPN/` (WireGuard server manager + tunnel service).
- Remote desktop proxy: `Data/Engine/services/RemoteDesktop/` (Guacamole WebSocket bridge).
- Assemblies: `Data/Engine/assembly_management/` and `Data/Engine/services/assemblies/`.
## Runtime Paths
- Source code: `Data/Engine/` (edit here).
- Runtime copy: `Engine/` (regenerated each launch).
- Database: `Engine/database.db` (default; configurable).
- Logs: `Engine/Logs/` (engine.log, error.log, api.log, service logs).
- Certificates: `Engine/Certificates/` (TLS bundle + code signing keys).
- WebUI build output: `Engine/web-interface/` (served as static assets).
## API Endpoints
- `GET /health` (No Authentication) - Engine liveness probe.
- The Engine hosts all `/api/*` endpoints listed in [API Reference](api-reference.md).
## Related Documentation
- [Architecture Overview](architecture-overview.md)
- [Security and Trust](security-and-trust.md)
- [API Reference](api-reference.md)
- [Logging and Operations](logging-and-operations.md)
- [VPN and Remote Access](vpn-and-remote-access.md)
## Codex Agent (Detailed)
### Source vs runtime
- Edit only in `Data/Engine/`.
- `Engine/` is a runtime mirror and will be wiped/rebuilt by `Borealis.ps1` or `Borealis.sh`.
### EngineContext and lifecycle
- `Data/Engine/server.py` builds an `EngineContext` that includes:
- TLS paths, WireGuard settings, scheduler, Socket.IO instance.
- RDP proxy settings (guacd host/port, ws host/port, session TTL).
- The app factory wires in:
- API registration: `API.register_api(app, context)`
- WebUI static hosting: `WebUI.register_web_ui(app, context)`
- Realtime events: `WebSocket.register_realtime(socketio, context)`
### API groups and adapters
- Default groups live in `Data/Engine/services/API/__init__.py` (`DEFAULT_API_GROUPS`).
- Each group has a registrar in `_GROUP_REGISTRARS`.
- `EngineServiceAdapters` exposes:
- `db_conn_factory` (SQLite with WAL and busy_timeout).
- `service_log` (per-service log files with rotation).
- `jwt_service`, `dpop_validator`, rate limiters, signing keys, GitHub integration.
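A minimal sketch of what a `db_conn_factory` with WAL and `busy_timeout` could look like. The real adapter may set more options; the `make_conn_factory` name and the 5000 ms default are assumptions.

```python
import sqlite3


def make_conn_factory(db_path, busy_timeout_ms=5000):
    """Return a zero-argument callable that opens a configured connection."""

    def connect():
        conn = sqlite3.connect(db_path)
        # WAL lets readers proceed while a writer is active.
        conn.execute("PRAGMA journal_mode=WAL")
        # Wait up to busy_timeout_ms for locks instead of failing at once.
        conn.execute(f"PRAGMA busy_timeout={int(busy_timeout_ms)}")
        return conn

    return connect
```

Callers would hold the factory rather than a shared connection, opening and closing a connection per unit of work.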
### Logging expectations
- Main logs: `Engine/Logs/engine.log` and `Engine/Logs/error.log`.
- API access log: `Engine/Logs/api.log` (per-request stats).
- Service logs: `Engine/Logs/<service>.log` (created via `service_log`).
- VPN logs: `Engine/Logs/VPN_Tunnel/tunnel.log` and `Engine/Logs/VPN_Tunnel/remote_shell.log`.
### Adding or updating an API
- Add new routes under `Data/Engine/services/API/<domain>/`.
- Ensure each module starts with the standard header block (purpose + API endpoints).
- Update `Data/Engine/services/API/__init__.py` if you add a new API group.
- Update `Docs/api-reference.md` and the relevant domain doc.
### WebUI hosting and dev mode
- Production UI is served from `Engine/web-interface/`.
- Dev UI uses Vite and still relies on Engine APIs for data.
- The SPA fallback in `Data/Engine/services/WebUI/__init__.py` prevents 404s on client routes.
### WireGuard and RDP wiring
- WireGuard server manager: `Data/Engine/services/VPN/wireguard_server.py`.
- Tunnel orchestration: `Data/Engine/services/VPN/vpn_tunnel_service.py`.
- RDP proxy: `Data/Engine/services/RemoteDesktop/guacamole_proxy.py`.
- API entrypoints: `/api/tunnel/*` and `/api/rdp/session`.
### Assembly runtime
- Assembly cache is initialized in `Data/Engine/assembly_management` and attached to `context.assembly_cache`.
- Quick jobs and scheduled jobs share this runtime to resolve scripts and variables.
### Platform parity
- Windows is the reference platform.
- Linux Engine works via `Borealis.sh`; Linux agent remains incomplete.
### Borealis Engine Codex (Full)
Use this section for Engine work (successor to the legacy server). Shared guidance is consolidated in `ui-and-notifications.md` and other knowledgebase pages.
#### Scope and runtime paths
- Bootstrap: `Borealis.ps1` launches the Engine and/or Agent on Windows; `Borealis.sh` is the equivalent bootstrap script on Linux.
- Edit in `Data/Engine`; runtime copies live under `Engine/` and are discarded every time the Engine is launched.
#### Architecture
- Runtime: `Data/Engine/server.py` with NodeJS + Vite for live dev and Flask for production serving/API endpoints.
#### Development guidelines
- Every Python module under `Data/Engine` or `Engine/Data/Engine` starts with the standard commentary header (purpose + API endpoints). Add the header to any existing module before further edits.
#### Logging
- Primary log: `Engine/Logs/engine.log` with daily rotation (`engine.log.YYYY-MM-DD`); do not auto-delete rotated files.
- Subsystems: `Engine/Logs/<service>.log`; install output to `Engine/Logs/install.log`.
- Keep Engine-specific artifacts within `Engine/Logs/` to preserve the runtime boundary.
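The rotation rule above maps onto Python's standard `TimedRotatingFileHandler`. This is a sketch of one way to get `engine.log.YYYY-MM-DD` files that are never pruned; the format string and the `build_engine_log_handler` name are assumptions, not the Engine's actual logging setup.

```python
import logging
from logging.handlers import TimedRotatingFileHandler


def build_engine_log_handler(log_path):
    """Rotate at midnight; backupCount=0 means rotated files are never deleted."""
    handler = TimedRotatingFileHandler(
        log_path,
        when="midnight",  # yields the date suffix engine.log.YYYY-MM-DD
        backupCount=0,    # keep every rotated file
        encoding="utf-8",
    )
    # Assumed layout for illustration only.
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    return handler
```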
#### Security and API parity
- Mirrors legacy mutual trust: Ed25519 device identities, EdDSA-signed access tokens, pinned Borealis root CA, TLS 1.3-only serving, Authorization headers and service-context markers on every device API.
- Implements DPoP validation, short-lived access tokens (about 15 min), SHA-256 hashed refresh tokens (30-day) with explicit reuse errors.
- Enrollment: operator approvals, conflict detection, auditor recording, pruning of expired codes/refresh tokens.
- Background jobs and service adapters maintain compatibility with legacy DB schemas while enabling gradual API takeover.
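The hashed refresh token idea above can be illustrated as follows. Only the use of SHA-256 comes from this page; the hex encoding, token length, and helper names are assumptions, and reuse detection is out of scope for this sketch.

```python
import hashlib
import hmac
import secrets


def issue_refresh_token():
    """Return (plaintext_token, stored_hash); only the hash should be persisted."""
    token = secrets.token_urlsafe(32)
    return token, hashlib.sha256(token.encode("utf-8")).hexdigest()


def matches_stored_hash(presented_token, stored_hash):
    """Hash the presented token and compare in constant time."""
    candidate = hashlib.sha256(presented_token.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, stored_hash)
```

Storing only the digest means a leaked database row cannot be replayed as a refresh token.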
#### Reverse VPN tunnels
- WireGuard reverse VPN design and lifecycle are documented in `vpn-and-remote-access.md`.
- The original references were `REVERSE_TUNNELS.md` and `Reverse_VPN_Tunnel_Deployment.md` (now consolidated into this knowledgebase).
- Engine orchestrator: `Data/Engine/services/VPN/vpn_tunnel_service.py` with WireGuard manager `Data/Engine/services/VPN/wireguard_server.py`.
- UI shell bridge: `Data/Engine/services/WebSocket/vpn_shell.py`.
#### WebUI and WebSocket migration
- Static/template handling: `Data/Engine/services/WebUI`; deployment copy paths are wired through `Borealis.ps1` with TLS-aware URL generation.
- Stage 6 tasks: migration switch in the legacy server for WebUI delegation and porting device/admin API endpoints into Engine services.
- Stage 7 (queued): `register_realtime` hooks, Engine-side Socket.IO handlers, integration checks, legacy delegation updates.
#### Platform parity
- Windows is the primary target. Keep Engine tooling aligned with the agent experience; Linux packaging must catch up before macOS work resumes.
#### Ansible support (shared state)
- Mirrors the agent's unfinished story: treat orchestration as experimental until packaging, connection management, and logging mature.


@@ -0,0 +1,82 @@
# Flow Editor and Nodes
[Back to Docs Index](index.md) | [Index (HTML)](index.html)
## Purpose
Document the Borealis visual flow editor (React Flow) and how nodes are defined, grouped, and rendered.
## Core UI Components
- `Data/Engine/web-interface/src/Flow_Editor/Flow_Editor.jsx` - canvas, drag/drop, edge editing, context menus.
- `Data/Engine/web-interface/src/Flow_Editor/Node_Sidebar.jsx` - node catalog and drag source.
- `Data/Engine/web-interface/src/Flow_Editor/Node_Configuration_Sidebar.jsx` - per-node configuration UI.
- `Data/Engine/web-interface/src/Flow_Editor/Context_Menu_Sidebar.jsx` - right-click actions.
## Node Registration Pipeline
- Node modules are auto-loaded in `Data/Engine/web-interface/src/App.jsx` via:
`import.meta.glob('./Nodes/**/*.jsx', { eager: true })`.
- Each module default-exports a descriptor object that includes:
- `type` (unique node type string)
- `component` (React component)
- metadata like `name`, `category`, `description`, `config`, `usage_documentation`
- `App.jsx` builds:
- `nodeTypes` (type -> component)
- `categorizedNodes` (category -> list of descriptors)
## Node Categories (Current Folder Layout)
- `Agent`
- `Alerting`
- `Data Analysis & Manipulation`
- `Data Collection`
- `Flow Control`
- `General Purpose`
- `Image Processing`
- `Organization`
- `Reporting`
- `Templates`
## Scheduling Flow Usage
- `Data/Engine/web-interface/src/Scheduling/Create_Job.jsx` uses React Flow for job status and dependency visualization.
## API Endpoints
None. This is a UI-only domain.
## Related Documentation
- [Assemblies and Quick Jobs](assemblies.md)
- [Scheduled Jobs](scheduled-jobs.md)
- [UI and Notifications](ui-and-notifications.md)
## Codex Agent (Detailed)
### How node modules are structured
- A node file exports a descriptor object, for example:
- `type: "agent"`
- `component: BorealisAgentNode`
- `config: [{ key, label, type, defaultValue, optionsKey, ... }]`
- The `component` is the React Flow node UI.
- The descriptor is used by the sidebar for display and configuration forms.
### Adding a new node (step-by-step)
1) Create a new file under `Data/Engine/web-interface/src/nodes/<Category>/Node_<Name>.jsx`.
2) Export a descriptor object as the default export with `type` and `component` fields.
3) Include `config` entries if you want the configuration sidebar to render fields.
4) Rebuild the WebUI (or run Vite dev mode) so `import.meta.glob` picks it up.
5) Validate drag/drop in the Node Sidebar and ensure the node renders correctly.
### Sidebar behavior
- `Node_Sidebar.jsx` renders `categorizedNodes` and sets `dataTransfer` payloads with `application/reactflow`.
- `Flow_Editor.jsx` listens for drop events and creates nodes from the descriptor catalog.
### Node configuration sidebar
- `Node_Configuration_Sidebar.jsx` uses `useReactFlow().setNodes` to update node data.
- `config` metadata drives form rendering; data is stored in `node.data`.
### Canvas interactions
- Right-click context menus allow node delete, edge unlink, and property edit.
- Snap guides are computed in `Flow_Editor.jsx` for alignment.
### Job flow editor
- `Scheduling/Create_Job.jsx` uses a custom React Flow setup for status and dependency visualization.
- Keep job flow nodes separate from the general node catalog to avoid accidental crossover.
### Common gotchas
- Folder path casing is `src/nodes/` in the repo, but `App.jsx` imports `./Nodes/` (Windows is case-insensitive).
- Ensure each node descriptor has a unique `type` or React Flow will mis-render.
- If the sidebar does not show the new node, verify the export default object has `type` and `component`.

75
Docs/getting-started.md Normal file

@@ -0,0 +1,75 @@
# Getting Started with Borealis
[Back to Docs Index](index.md) | [Index (HTML)](index.html)
## Purpose
Help operators install, launch, and verify the Borealis Engine and (optionally) the Agent.
## Quick Start (Engine)
- Windows production: `./Borealis.ps1 -EngineProduction` (Engine UI at `https://localhost:5000`).
- Windows dev: `./Borealis.ps1 -EngineDev` (Vite + Flask at `https://localhost:5173`).
- Linux Engine: `./Borealis.sh --EngineProduction` (use `--EngineDev` for Vite).
- TLS is auto-provisioned under `Engine/Certificates` on first launch.
## Optional: Install the Agent (Windows)
- Run in elevated PowerShell: `./Borealis.ps1 -Agent`.
- Automated enrollment example:
`./Borealis.ps1 -Agent -EnrollmentCode "E925-448B-626D-D595-5A0F-FB24-B4D6-6983"`
- Linux agent binaries are not available; `Borealis.sh --Agent` only stages settings.
## First Run Checklist
- Open the Engine URL and confirm the login page loads.
- Check `Engine/Logs/engine.log` for startup messages.
- Verify liveness: `GET /health` returns `{"status":"ok"}`.
## Reverse Proxy Notes
- Borealis expects HTTPS for production use.
- If you reverse proxy, preserve `Upgrade` headers for WebSockets and allow `/socket.io` paths.
- A Traefik example is included in `README.md`.
## API Endpoints
- `GET /health` (No Authentication) - Engine liveness probe.
- `GET /api/server/time` (Operator Session) - Quick sanity check after login.
## Related Documentation
- [Architecture Overview](architecture-overview.md)
- [Engine Runtime](engine-runtime.md)
- [Agent Runtime](agent-runtime.md)
- [Security and Trust](security-and-trust.md)
- [Logging and Operations](logging-and-operations.md)
## Codex Agent (Detailed)
### Bootstrap and runtime separation
- The authoritative source code lives in `Data/Engine/` and `Data/Agent/`.
- Runtime copies are staged to `Engine/` and `Agent/` every launch; these are disposable.
- Always edit source under `Data/` and re-run the bootstrap scripts to apply changes.
### Launch mechanics
- `Borealis.ps1` handles dependency setup, venv activation, and staging for Windows.
- `Borealis.sh` provides the same for Linux Engine; the Linux agent path only stages config today.
- Dev mode (`-EngineDev` / `--EngineDev`) uses Vite for the WebUI and Flask for APIs.
- Production (`-EngineProduction` / `--EngineProduction`) serves the built SPA through Flask.
### Configuration precedence
- Engine config is assembled by `Data/Engine/config.py` in this order:
1) Explicit overrides passed to the app factory.
2) Environment variables prefixed with `BOREALIS_`.
3) Defaults baked into `config.py`.
- Key defaults to remember:
- Database: `Engine/database.db`
- Logs: `Engine/Logs/engine.log`, `Engine/Logs/error.log`, `Engine/Logs/api.log`
- WireGuard: UDP 30000, engine virtual IP `10.255.0.1/32`, shell port 47002
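The precedence order above can be sketched as a single lookup function. The key names and the derived environment variable names (for example `BOREALIS_DATABASE_PATH`) are hypothetical; check `Data/Engine/config.py` for the real ones.

```python
import os

# Illustrative defaults only; the real set is defined in config.py.
DEFAULTS = {"database_path": "Engine/database.db", "wireguard_port": "30000"}


def resolve_setting(key, overrides=None, environ=None, defaults=DEFAULTS):
    """Resolve a setting: explicit override, then BOREALIS_* env var, then default."""
    overrides = overrides or {}
    environ = os.environ if environ is None else environ
    if key in overrides:
        return overrides[key]
    env_name = "BOREALIS_" + key.upper()  # assumed naming convention
    if env_name in environ:
        return environ[env_name]
    return defaults[key]
```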
### TLS and certificates
- Engine generates an ECDSA root + leaf chain on first boot.
- TLS bundle lives under `Engine/Certificates` and is pinned by agents.
- If you override TLS paths, update `BOREALIS_TLS_*` env vars and restart.
### Agent install and enrollment notes
- The Windows agent must run elevated to create services and scheduled tasks.
- Enrollment requires an install code and operator approval (see `device-management.md`).
- If enrollment fails, inspect `Agent/Logs/agent.log` and `Engine/Logs/engine.log`.
### Health verification
- Use `GET /health` to confirm the API is alive.
- Use `GET /api/server/time` after login to verify session auth and API reachability.
- Confirm WebSockets by opening the UI and checking that toasts and live updates work.
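For scripted checks, the `/health` response can be validated with a small helper. This sketch assumes the caller has already fetched the status code and body (for example with `urllib.request`); `is_healthy` is a hypothetical name, and only the `{"status":"ok"}` payload comes from this page.

```python
import json


def is_healthy(status_code, body):
    """Return True when /health answered 200 with a {"status": "ok"} payload."""
    if status_code != 200:
        return False
    try:
        payload = json.loads(body)
    except (ValueError, TypeError):
        return False  # not JSON, or not a str/bytes body
    return isinstance(payload, dict) and payload.get("status") == "ok"
```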

88
Docs/index.md Normal file

@@ -0,0 +1,88 @@
# Borealis Knowledgebase Index
[Index (HTML)](index.html)
## Purpose
This page is the navigation hub for the Borealis documentation set. The knowledgebase now includes the full content that previously lived under `Docs/Codex` and `Docs/Agent`, compiled into the relevant pages below.
## Table of Contents
### Start Here
- [Getting Started](getting-started.md)
- [Architecture Overview](architecture-overview.md)
### Core Runtimes
- [Engine Runtime](engine-runtime.md)
- [Agent Runtime](agent-runtime.md)
### Security and Trust
- [Security and Trust](security-and-trust.md)
### Automation and Execution
- [Assemblies and Quick Jobs](assemblies.md)
- [Flow Editor and Nodes](flow-editor-and-nodes.md)
- [Scheduled Jobs](scheduled-jobs.md)
### Operations and Remote Access
- [Device Management](device-management.md)
- [VPN and Remote Access](vpn-and-remote-access.md)
- [Logging and Operations](logging-and-operations.md)
### UI and API
- [UI and Notifications](ui-and-notifications.md)
- [API Reference](api-reference.md)
### Integrations
- [Integrations](integrations.md)
### Key Repo References
- [README](../README.md)
- [AGENTS.md](../AGENTS.md)
## API Endpoints
None. This index only links to other pages.
## Related Documentation
- See the Table of Contents above for the primary knowledgebase pages.
## Codex Agent (Detailed)
### How to use this knowledgebase
- Start with `AGENTS.md` at the repo root.
- Read `getting-started.md` and `architecture-overview.md` to build the global model.
- Use `engine-runtime.md` and `agent-runtime.md` for implementation-level details.
- Use `ui-and-notifications.md` for MagicUI, AG Grid, and toast notification rules.
- Use `vpn-and-remote-access.md` for WireGuard and remote shell/RDP details.
- Use `security-and-trust.md` for enrollment, tokens, and code-signing behavior.
### Where the truth lives in code
- Engine source code: `Data/Engine/` (edit here).
- Agent source code: `Data/Agent/` (edit here).
- Web UI source: `Data/Engine/web-interface/src/`.
- Runtime copies: `Engine/` and `Agent/` (do not edit directly; they are regenerated).
- Logs: `Engine/Logs/` and `Agent/Logs/` (runtime artifacts).
- Assemblies data: `Data/Engine/Assemblies/` (staging) and `Engine/Assemblies/` (runtime mirror).
### Documentation authoring rules
- Keep filenames lowercase with hyphens (example: `device-management.md`).
- Add a top-of-page link back to the index: `[Back to Docs Index](index.md) | [Index (HTML)](index.html)`.
- For docs in subfolders, use relative paths (example: `../index.md`).
- Use ASCII characters only unless the file already uses Unicode.
- Avoid duplicating long source code; paraphrase and point to files instead.
- When a feature has UI and backend components, document both and link the relevant files.
- Codex Agent sections must remain verbose and example-driven; they now hold the full former Codex content.
### Cross-linking and maintenance
- Link outward to adjacent domains (example: device management should link to filters, scheduled jobs, VPN).
- When adding a new doc, add it to the Table of Contents and add at least two Related Documentation links from other pages.
- Keep Codex Agent sections detailed so a new agent can act without extra discovery.
### Update workflow example
- Change: add a new endpoint in `Data/Engine/services/API/devices/management.py`.
- Update steps:
1) Add the endpoint to the file header in that module.
2) Update `api-reference.md` under the Devices and Inventory section.
3) Update `device-management.md` with the new endpoint and behavior.
4) If UI changes are involved, update `ui-and-notifications.md`.
### Editing safety reminders
- Do not edit runtime directories `Engine/` or `Agent/`.
- Prefer `rg` (ripgrep) for quick discovery when reading code, and update docs after code changes.
- If you notice unexpected changes in git, pause and clarify before proceeding.

50
Docs/integrations.md Normal file

@@ -0,0 +1,50 @@
# Integrations
[Back to Docs Index](index.md) | [Index (HTML)](index.html)
## Purpose
Document external integrations used by Borealis, primarily the GitHub repository hash service.
## GitHub Integration (Repository Hash)
- The Engine can query GitHub for the latest commit hash of a repository/branch.
- Results are cached locally to reduce API usage.
- Admins can store a GitHub API token via the WebUI.
## API Endpoints
- `GET /api/github/token` (Admin) - GitHub token status.
- `POST /api/github/token` (Admin) - update GitHub token.
- `GET /api/repo/current_hash` (Device or Token Authenticated) - current repo hash.
## Related Documentation
- [Engine Runtime](engine-runtime.md)
- [API Reference](api-reference.md)
- [Logging and Operations](logging-and-operations.md)
## Codex Agent (Detailed)
### Integration implementation
- `Data/Engine/integrations/github.py` implements `GitHubIntegration`.
- The integration uses:
- Cached results stored in `repo_hash_cache.json` (under the Engine cache directory).
- Token storage in the `github_token` SQLite table.
### Defaults and overrides
- Default repo: `bunny-lab-io/Borealis`.
- Default branch: `main`.
- Environment overrides:
- `BOREALIS_REPO`
- `BOREALIS_REPO_BRANCH`
- Cache TTL can be overridden via Engine config (`repo_hash_refresh`).
### Token management
- Admins manage tokens via `/api/github/token`.
- The token is stored in the Engine database (`github_token` table).
- `GitHubIntegration.verify_token()` reports validity and rate-limit status.
### `GET /api/repo/current_hash`
- This endpoint uses the cached GitHub integration to return a hash.
- It supports device-auth and operator-auth contexts.
- Useful for agent update checks and diagnostics.
### Debug checklist
- Token missing: call `GET /api/github/token` as Admin and confirm `has_token`.
- API rate limit errors: inspect the response payload for `rate_limit` fields.
- Cache stale: use the `force_refresh` behavior in `GitHubIntegration` (via config or code).


@@ -0,0 +1,66 @@
# Logging and Operations
[Back to Docs Index](index.md) | [Index (HTML)](index.html)
## Purpose
Describe Borealis operational logging, retention, and core runtime checks.
## Log Locations
- Engine primary log: `Engine/Logs/engine.log` (daily rotation).
- Engine error log: `Engine/Logs/error.log`.
- Engine API access log: `Engine/Logs/api.log`.
- Service logs: `Engine/Logs/<service>.log` (per-domain).
- VPN logs: `Engine/Logs/VPN_Tunnel/tunnel.log` and `Engine/Logs/VPN_Tunnel/remote_shell.log`.
- Agent logs: `Agent/Logs/agent.log` and `Agent/Logs/agent.error.log` (daily rotation).
## Log Retention
- Retention is managed via `/api/server/logs` endpoints.
- Retention overrides are stored in `Engine/Logs/retention_policy.json`.
## Operational Health
- `GET /health` returns liveness status.
- `GET /api/server/time` returns server clock information after login.
## API Endpoints
- `GET /health` (No Authentication) - liveness probe.
- `GET /api/server/time` (Operator Session) - server time.
- `GET /api/server/logs` (Admin) - list logs and retention metadata.
- `GET /api/server/logs/<log_name>/entries` (Admin) - tail log entries.
- `PUT /api/server/logs/retention` (Admin) - update retention policies.
- `DELETE /api/server/logs/<log_name>` (Admin) - delete log file(s).
## Related Documentation
- [Engine Runtime](engine-runtime.md)
- [Security and Trust](security-and-trust.md)
- [API Reference](api-reference.md)
## Codex Agent (Detailed)
### Engine log formatting
- Service logs are written via `service_log` in `Data/Engine/services/API/__init__.py`.
- Format: `[YYYY-MM-DD HH:MM:SS] [LEVEL][CONTEXT-<SCOPE>] message`.
- Context values are derived from agent context headers or message patterns.
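To make the line format concrete, here is a minimal sketch of a formatter that produces it. The function name `format_service_line` and its parameters are hypothetical; the real `service_log` helper may take different arguments and derive the scope differently.

```python
from datetime import datetime


def format_service_line(level, scope, message, now=None):
    """Build one log line in the documented layout.

    Layout: [YYYY-MM-DD HH:MM:SS] [LEVEL][CONTEXT-<SCOPE>] message
    Upper-casing level and scope is an assumption made for this sketch.
    """
    now = now or datetime.now()
    timestamp = now.strftime("%Y-%m-%d %H:%M:%S")
    return f"[{timestamp}] [{level.upper()}][CONTEXT-{scope.upper()}] {message}"
```

For example, `format_service_line("info", "system", "scheduler started")` yields a line whose prefix carries the timestamp, level, and context scope in that order.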
### Log retention implementation
- `Data/Engine/services/API/server/log_management.py` manages retention.
- Retention overrides are stored in `Engine/Logs/retention_policy.json`.
- The API never deletes the active log file automatically.
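The "never delete the active log" rule can be illustrated with a small selection function. This is a sketch under stated assumptions: `select_expired_logs`, its arguments, and the rotated-file names in the usage note are hypothetical and do not describe `log_management.py` itself.

```python
from datetime import datetime, timedelta


def select_expired_logs(files, active_names, retention_days, now):
    """Return the names of log files that are safe to prune.

    files: mapping of file name -> last-modified datetime (hypothetical shape).
    active_names: names of files currently being written; these are always
    skipped, mirroring the rule that the active log is never auto-deleted.
    """
    cutoff = now - timedelta(days=retention_days)
    return sorted(
        name
        for name, modified in files.items()
        if name not in active_names and modified < cutoff
    )
```

With a 30-day policy, a rotated file such as `engine.log.2025-12-01` would be selected on 2026-01-27, while `engine.log` itself is skipped regardless of its age.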
### Operational checks
- Startup warnings appear in `Engine/Logs/engine.log`.
- API access metrics appear in `Engine/Logs/api.log` (method, path, duration, status).
- VPN-specific logs are under `Engine/Logs/VPN_Tunnel/`.
### Agent logging notes
- Log line prefixes identify the context scope (SYSTEM or CURRENTUSER).
- Role-specific logs live under `Agent/Logs/<service>.log`.
- VPN logs are kept in `Agent/Logs/VPN_Tunnel/`.
### Debug workflow
- Start with the log file closest to the symptom.
- Use API log lines to confirm the request reached the Engine.
- Use service logs to diagnose domain-specific behavior.
- If troubleshooting WireGuard, inspect both Engine and Agent VPN logs.
### Operational safety
- Do not delete logs by hand while debugging; use the log API or archive first.
- Keep runtime artifacts inside `Engine/` and `Agent/` to preserve boundaries.
- If you change log formats, update this document and `engine-runtime.md`.

101
Docs/scheduled-jobs.md Normal file

@@ -0,0 +1,101 @@
# Scheduled Jobs
[Back to Docs Index](index.md) | [Index (HTML)](index.html)
## Purpose
Explain how Borealis schedules recurring jobs, targets devices, and records run history.
## Scheduler Overview
- Scheduler implementation lives in `Data/Engine/services/API/scheduled_jobs/job_scheduler.py`.
- It reads job definitions from SQLite and emits quick job payloads over Socket.IO.
- Run history is stored in `scheduled_job_runs` and `scheduled_job_run_activity` tables.
## Schedule Types
Supported schedule types (from the scheduler core):
- `immediately`
- `once`
- `every_5_minutes`
- `every_10_minutes`
- `every_15_minutes`
- `every_30_minutes`
- `every_hour`
- `daily`
- `weekly`
- `monthly`
- `yearly`
## Target Resolution
- Targets can be explicit hostnames or device filter definitions.
- The scheduler uses `DeviceFilterMatcher` to resolve filters to live inventory snapshots.
- Online snapshot logic is used to avoid stale targets.
## Execution Flow
1) Scheduler tick loads enabled jobs.
2) Each due occurrence creates `scheduled_job_runs` rows.
3) Quick job payloads are emitted with `scheduled_job_id` context.
4) Agents execute and return `quick_job_result`.
5) The Engine updates run status and activity links.
## Run History and Retention
- Run history is retained for `BOREALIS_JOB_HISTORY_DAYS` (default 30).
- Old runs are purged during scheduler ticks.
## API Endpoints
- `GET /api/scheduled_jobs` (Token Authenticated) - list scheduled jobs.
- `POST /api/scheduled_jobs` (Token Authenticated) - create scheduled job.
- `GET /api/scheduled_jobs/<int:job_id>` (Token Authenticated) - get scheduled job.
- `PUT /api/scheduled_jobs/<int:job_id>` (Token Authenticated) - update scheduled job.
- `POST /api/scheduled_jobs/<int:job_id>/toggle` (Token Authenticated) - enable/disable.
- `DELETE /api/scheduled_jobs/<int:job_id>` (Token Authenticated) - delete scheduled job.
- `GET /api/scheduled_jobs/<int:job_id>/runs` (Token Authenticated) - run history.
- `GET /api/scheduled_jobs/<int:job_id>/devices` (Token Authenticated) - device results.
- `DELETE /api/scheduled_jobs/<int:job_id>/runs` (Token Authenticated) - clear run history.
## Related Documentation
- [Assemblies and Quick Jobs](assemblies.md)
- [Device Management](device-management.md)
- [API Reference](api-reference.md)
## Codex Agent (Detailed)
### Scheduler entry points
- API registration: `Data/Engine/services/API/scheduled_jobs/management.py`.
- Scheduler core: `Data/Engine/services/API/scheduled_jobs/job_scheduler.py`.
- Scheduler runner: `Data/Engine/services/API/scheduled_jobs/runner.py`.
### Core tables (Engine DB)
- `scheduled_jobs` - job definition, schedule, targets, execution context.
- `scheduled_job_runs` - per-run status, timestamps, error fields.
- `scheduled_job_run_activity` - links activity_history to scheduled runs.
### Schedule computation
- `_compute_next_run` normalizes timestamps to minutes and applies schedule type logic.
- `immediately` schedules a single run if the job has never run.
- `once` schedules a single run at `start_ts`.
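The schedule rules above can be sketched roughly as follows. This is an illustrative approximation, not the body of `_compute_next_run`: the function signature, the `INTERVAL_SECONDS` table, and the start-anchored interval grid are assumptions, and calendar-based types (`monthly`, `yearly`) are deliberately left out.

```python
# Fixed-length schedule types only; monthly/yearly need calendar math.
INTERVAL_SECONDS = {
    "every_5_minutes": 300,
    "every_10_minutes": 600,
    "every_15_minutes": 900,
    "every_30_minutes": 1800,
    "every_hour": 3600,
    "daily": 86400,
    "weekly": 604800,
}


def floor_to_minute(ts):
    """Normalize an epoch timestamp down to the start of its minute."""
    ts = int(ts)
    return ts - (ts % 60)


def compute_next_run(schedule_type, start_ts, last_run_ts, now_ts):
    """Return the next due epoch timestamp, or None if nothing is due again."""
    start = floor_to_minute(start_ts)
    if schedule_type == "immediately":
        # Runs a single time, and only if the job has never run.
        return floor_to_minute(now_ts) if last_run_ts is None else None
    if schedule_type == "once":
        return start if last_run_ts is None else None
    step = INTERVAL_SECONDS.get(schedule_type)
    if step is None:
        raise ValueError(f"unsupported schedule type in this sketch: {schedule_type}")
    if last_run_ts is None:
        return start
    # Next slot on the start-anchored grid strictly after the last run.
    elapsed = max(0, floor_to_minute(last_run_ts) - start)
    return start + (elapsed // step + 1) * step
```

Anchoring intervals to `start_ts` keeps runs on a predictable grid even when a tick fires late; whether the real scheduler anchors the same way is not confirmed here.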
### Targeting logic
- Targets can be hostnames or device filters (criteria JSON).
- `DeviceFilterMatcher` loads device snapshots and resolves filter matches.
- The scheduler can also request an online-only hostname snapshot.
### Execution context
- Payloads are emitted as quick jobs with extra context:
- `scheduled_job_id`
- `scheduled_job_run_id`
- `scheduled_ts`
- `quick_job_result` updates `scheduled_job_runs` and `activity_history`.
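As a rough illustration of the extra context, the sketch below merges the three scheduling fields into a quick job payload. The helper name `build_scheduled_payload` and the example payload keys (`hostname`, `script`) are hypothetical; only the three context field names come from this page.

```python
def build_scheduled_payload(quick_job, job_id, run_id, scheduled_ts):
    """Return a copy of a quick job payload with scheduling context attached.

    The original payload is left untouched so it can be reused per target.
    """
    payload = dict(quick_job)
    payload.update(
        {
            "scheduled_job_id": job_id,
            "scheduled_job_run_id": run_id,
            "scheduled_ts": scheduled_ts,
        }
    )
    return payload
```

Carrying `scheduled_job_run_id` in the payload is what would let a returned `quick_job_result` be matched back to a specific `scheduled_job_runs` row.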
### Retention and cleanup
- Retention defaults to 30 days and is configured by `BOREALIS_JOB_HISTORY_DAYS`.
- Purging is done inside the scheduler tick loop.
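The retention window translates into a cutoff timestamp that a purge step could compare against run timestamps. A minimal sketch, assuming epoch-second timestamps; the helper name `history_cutoff_ts` and the fallback to 30 days for invalid or non-positive values are assumptions, not confirmed scheduler behavior.

```python
import os
import time

DEFAULT_HISTORY_DAYS = 30


def history_cutoff_ts(now_ts=None, env=None):
    """Return the epoch timestamp before which run history may be purged."""
    env = os.environ if env is None else env
    now_ts = int(time.time()) if now_ts is None else int(now_ts)
    try:
        days = int(env.get("BOREALIS_JOB_HISTORY_DAYS", DEFAULT_HISTORY_DAYS))
    except (TypeError, ValueError):
        days = DEFAULT_HISTORY_DAYS
    if days < 1:
        # Assumption: reject zero/negative values instead of purging everything.
        days = DEFAULT_HISTORY_DAYS
    return now_ts - days * 86400
```

A purge inside the tick loop would then delete rows older than this cutoff, for example with a parameterized `DELETE ... WHERE <timestamp column> < ?` against `scheduled_job_runs`.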
### Failure and retry notes
- The scheduler is designed to be resilient; it logs and continues on errors.
- Expired runs are marked `Timed Out` when they exceed the expiration window.
### UI touch points
- Scheduled job UI lives under `Data/Engine/web-interface/src/Scheduling/`.
- The list page expects pagination and run history endpoints to respond quickly.
### Debug checklist
- Jobs not running: check `Engine/Logs/engine.log` and `Engine/Logs/scheduled_jobs.log`.
- Run history empty: verify `scheduled_job_runs` table and quick job events.
- Filter target mismatch: inspect `device_filters.criteria_json` and matcher logic.

303
Docs/security-and-trust.md Normal file

@@ -0,0 +1,303 @@
# Security and Trust
[Back to Docs Index](index.md) | [Index (HTML)](index.html)
## Purpose
Explain the Borealis trust model, enrollment security, token handling, and code signing behavior.
## Security Model Summary
- Mutual trust: each agent has a unique Ed25519 identity key; the Engine issues Ed25519-signed access tokens bound to that fingerprint.
- Pinned TLS: the Engine generates a root + leaf chain and agents pin the bundle for REST and Socket.IO traffic.
- Short-lived access tokens: JWTs signed with Ed25519, default lifetime about 15 minutes.
- Long-lived refresh tokens: 90-day sliding window, hashed in the Engine database.
- Code signing: scripts are signed by the Engine; agents reject payloads with invalid signatures.
## Security Breakdown (Full)
### Overall
- Borealis enforces mutual trust: each agent presents a unique Ed25519 identity to the server, the server issues EdDSA-signed (Ed25519) access tokens bound to that fingerprint, and both sides pin the generated Borealis root CA.
- End-to-end TLS everywhere: the Engine auto-provisions an ECDSA P-384 root + leaf chain under `Engine/Certificates` and serves TLS using Python defaults (TLS 1.2+); agents pin the delivered bundle for both REST and WebSocket traffic to eliminate man-in-the-middle avenues.
- Device enrollment is gated by enrollment and installer codes (configurable expiration and usage limits) and an operator approval queue; replay-resistant nonces plus rate limits (40 req/min/IP, 12 req/min/fingerprint) prevent brute force or code reuse.
- All device APIs require `Authorization: Bearer` headers and a service-context marker (SYSTEM or CURRENTUSER); missing, expired, mismatched, or revoked credentials are rejected before any business logic runs. Operator-driven revocation and device quarantining are not yet implemented.
- Replay and credential-theft defenses combine server-side DPoP proof validation (thumbprint binding) with short-lived access tokens (about 15 minutes) and 90-day refresh tokens stored as SHA-256 hashes.
- Centralized logging under `Engine/Logs` and `Agent/Logs` captures enrollment approvals, rate-limit hits, signature failures, and auth anomalies for post-incident review.
- Operator-facing API endpoints (device inventory, assemblies, job history, etc.) require an authenticated operator session or bearer token; unauthenticated requests are rejected with 401/403 responses before any inventory or script metadata is returned. The requesting user is logged with each quick-run dispatch.
### Server Security
- Auto-manages PKI: a persistent Borealis root CA (ECDSA SECP384R1) signs leaf certificates that include localhost SANs, tightened filesystem permissions, and a combined bundle for agent identity and cert pinning.
- Script delivery is code-signed with an Ed25519 key stored under `Engine/Certificates/Code-Signing`; agents refuse any payload whose signature does not match the pinned public key.
- Device authentication checks GUID normalization, SSL fingerprint matches, token version counters, and quarantine flags before admitting requests; missing rows with valid tokens auto-recover into placeholder records to avoid accidental lockouts.
- Refresh tokens are never stored in cleartext; only SHA-256 hashes plus DPoP bindings are stored in SQLite, and reuse after revocation/expiry returns explicit error codes.
- Enrollment workflow queues approvals, detects hostname and fingerprint conflicts, offers merge/overwrite options, and records auditor identities so trust decisions are traceable.
- Background pruning of expired enrollment codes and refresh tokens is not wired yet; a maintenance task is still needed.
### Agent
- Generates device-wide Ed25519 key pairs on first launch, storing them under `Certificates/Agent/Identity/` with DPAPI protection on Windows (chmod 600 elsewhere) and persisting the server-issued GUID alongside.
- Stores refresh/access tokens encrypted (DPAPI) and re-enrolls on authentication failures; TLS pinning relies on the stored server certificate bundle rather than a separate fingerprint binding for the tokens.
- Imports the server TLS bundle into a dedicated `ssl.SSLContext`, reuses it for the REST session, and injects it into the Socket.IO engine so WebSockets enjoy the same pinning and hostname checks.
- Treats every script payload as hostile until verified: only Ed25519 signatures from the server are accepted, missing or invalid signatures are logged and dropped, and the trusted signing key is updated only after successful verification between the agent and the server.
- Operates outbound-only; there are no listener ports, and every API/WebSocket call flows through `AgentHttpClient.ensure_authenticated`, forcing token refresh logic before retrying.
- Logs bootstrap, enrollment, token refresh, and signature events to daily-rotated files under `Agent/Logs`, giving operators visibility without leaking secrets outside the project root.
### WireGuard Agent to Engine Tunnels
- Borealis started with a bespoke reverse tunnel stack (WebSocket framing + domain lanes); its handshake and security model did not scale, so the project moved to WireGuard as the Engine <-> Agent data pipeline for secure remote protocols and future remote desktop control.
- On-demand, outbound-only: operators trigger a tunnel start, the agent dials the Engine (no inbound listeners), and the tunnel tears down on stop or idle.
- Shared sessions: one live VPN tunnel per agent, reused across operators to avoid redundant connections.
- Fast and robust transport: WireGuard provides encrypted UDP transport with lightweight handshakes that keep latency low and reconnects resilient.
- Orchestration security: the Engine issues short-lived, Ed25519-signed tunnel tokens that the agent verifies before bringing the tunnel up.
- Pinned trust: tunnel orchestration uses the same pinned TLS channel as REST and Socket.IO to prevent MITM during setup and control.
- Isolation by default: each agent gets a host-only /32; AllowedIPs are restricted to the agent /32 and the Engine /32; no LAN routes and no client-to-client traffic.
- Port-level controls: per-device allowlists plus Engine-applied firewall rules limit which protocols can traverse the tunnel.
- Live PowerShell today: a VPN-only shell endpoint enables remote command execution with SYSTEM-level (`NT AUTHORITY\SYSTEM`) access for deep diagnostics and remediation.
- Session lifecycle: 15-minute idle timeout with no grace period; session material includes a virtual IP plus allowed ports; teardown removes the tunnel and firewall rules.
- Future protocols: extend the same tunnel for SSH, WinRM, RDP, VNC, WebRTC streaming, and other remote management workflows by enabling ports per device.
## Enrollment and Identity
- Enrollment uses install codes and operator approval.
- The agent generates its Ed25519 key pair locally and proves possession via signed nonces.
- Engine returns GUID, access token, refresh token, TLS bundle, and script signing key.
## Token and DPoP Handling
- Access tokens are required on device APIs (Bearer token).
- Refresh tokens are stored encrypted on the agent and hashed on the Engine.
- DPoP proof headers bind refresh tokens to a key thumbprint and prevent replay.
## Code Signing
- Engine signs script payloads using `Engine/Certificates/Code-Signing` keys.
- Agent verifies signatures before execution; failures are logged and rejected.
## Automated Agent Enrollment
If you deploy the agent via Group Policy or another automation platform, you can pre-inject an enrollment code during install. The enrollment code below is an example only.
**Windows**:
```powershell
.\Borealis.ps1 -Agent -EnrollmentCode "E925-448B-626D-D595-5A0F-FB24-B4D6-6983"
```
**Linux**: Agent enrollment is not yet available on Linux; `Borealis.sh --Agent` only writes settings placeholders.
## Agent/Server Enrollment (Sequence Diagram)
```mermaid
sequenceDiagram
participant Operator
participant Server
participant SYS as "SYSTEM Agent"
participant CUR as "CURRENTUSER Agent"
Operator->>Server: Request installer code
Server-->>Operator: Deliver hashed installer code
Note over Operator,Server: Human-controlled code binds enrollment to known device
par TLS Handshake (SYSTEM)
SYS->>Server: Initiate TLS session
Server-->>SYS: Present TLS certificate
and TLS Handshake (CURRENTUSER)
CUR->>Server: Initiate TLS session
Server-->>CUR: Present TLS certificate
end
Note over SYS,Server: Certificate pinning plus CA checks stop MITM
Note over CUR,Server: Pinning also blocks spoofed control planes
SYS->>SYS: Generate Ed25519 identity key pair
Note right of SYS: Private key stored under Certificates/... protected by DPAPI or chmod 600
CUR->>CUR: Generate Ed25519 identity key pair
Note right of CUR: Private key stored in user context and DPAPI-protected
SYS->>Server: Enrollment request (installer code, public key, fingerprint)
CUR->>Server: Enrollment request (installer code, public key, fingerprint)
Server->>Operator: Prompt for enrollment approval
Operator-->>Server: Approve device enrollment
Note over Operator,Server: Manual approval blocks rogue agents
Server-->>SYS: Send enrollment nonce
Server-->>CUR: Send enrollment nonce
SYS->>Server: Return signed nonce to prove key possession
CUR->>Server: Return signed nonce
Note over Server,Operator: Server verifies signatures and records GUID plus key fingerprint
Server->>SYS: Issue GUID, short-lived token, refresh token, server cert, script-signing key
Server->>CUR: Issue GUID, short-lived token, refresh token, server cert, script-signing key
Note over SYS,Server: Agent pins cert, stores GUID, DPAPI-encrypts refresh token
Note over CUR,Server: Agent stores GUID, pins cert, encrypts refresh token
Note over Server,Operator: Database keeps refresh token hash, key fingerprint, audit trail
loop Secure Sessions
SYS->>Server: REST heartbeat and job polling with Bearer token
CUR->>Server: REST heartbeat and WebSocket connect with Bearer token
Server-->>SYS: Provide new access token before expiry
Server-->>CUR: Provide new access token before expiry
SYS->>Server: Refresh request over pinned TLS
CUR->>Server: Refresh request over pinned TLS
end
Server-->>SYS: Deliver script payload plus Ed25519 signature
SYS->>SYS: Verify signature before execution
Server-->>CUR: Deliver script payload plus Ed25519 signature
CUR->>CUR: Verify signature and reject tampered content
Note over SYS,CUR: Signature failure triggers re-enrollment and detailed logging
Note over Server,Operator: Persistent records and approvals sustain long term trust
```
## Code-Signed Remote Script Execution (Sequence Diagram)
```mermaid
sequenceDiagram
participant Operator
participant Server
participant SYS as "SYSTEM Agent"
participant CUR as "CURRENTUSER Agent"
Operator->>Server: Upload or author script
Server->>Server: Store script and metadata on-disk
Operator->>Server: Request script execution on a specific device + execution context (NT Authority\\SYSTEM or Current-User)
Server->>Server: Load Ed25519 code signing key from secure store
Server->>Server: Sign script hash and execution manifest (The Assembly)
Server->>Server: Enqueue job with signed payload for target agent (SYSTEM or CurrentUser)
Note over Server: Dispatch limited to enrolled agents with valid GUID + tokens
loop Agent job polling (pinned TLS + Bearer token)
SYS->>Server: REST heartbeat and job poll
CUR->>Server: REST heartbeat and job poll
Server-->>SYS: Pending job payloads
Server-->>CUR: Pending job payloads
end
alt SYSTEM context
Server-->>SYS: Script, signature, hash, execution parameters
SYS->>SYS: Verify TLS pinning and token freshness
SYS->>SYS: Verify Ed25519 signature using pinned server key
SYS->>SYS: Recalculate script hash and compare
Note right of SYS: Verification failure stops execution and logs incident
SYS->>SYS: Execute via SYSTEM scheduled-task runner
SYS-->>Server: Return execution status, output, telemetry
else CURRENTUSER context
Server-->>CUR: Script, signature, hash, execution parameters
CUR->>CUR: Verify TLS pinning and token freshness
CUR->>CUR: Verify Ed25519 signature using pinned server key
CUR->>CUR: Recalculate script hash and compare
Note right of CUR: Validation failure stops execution and logs incident
CUR->>CUR: Execute within interactive PowerShell host
CUR-->>Server: Return execution status, output, telemetry
end
Server->>Server: Record results and logs alongside job metadata
Note over SYS,CUR: Pinned TLS, signed payloads, and DPAPI-protected secrets defend against tampering and replay
```
## API Endpoints
- `POST /api/agent/enroll/request` (No Authentication) - start enrollment.
- `POST /api/agent/enroll/poll` (No Authentication) - finalize enrollment after approval.
- `POST /api/agent/token/refresh` (Refresh Token) - mint a new access token.
- `POST /api/auth/login` (No Authentication) - operator login.
- `POST /api/auth/logout` (Token Authenticated) - operator logout.
- `POST /api/auth/mfa/verify` (Token Authenticated, MFA pending) - verify MFA.
- `GET /api/auth/me` (Token Authenticated) - current operator profile.
- `GET /api/admin/enrollment-codes` (Admin) - list install codes.
- `POST /api/admin/enrollment-codes` (Admin) - create install codes.
- `DELETE /api/admin/enrollment-codes/<code_id>` (Admin) - delete install codes.
## Related Documentation
- [Agent Runtime](agent-runtime.md)
- [Engine Runtime](engine-runtime.md)
- [Device Management](device-management.md)
- [API Reference](api-reference.md)
## Codex Agent (Detailed)
### Key material locations (Engine)
- TLS certificate: `Engine/Certificates/borealis-server-cert.pem`.
- TLS private key: `Engine/Certificates/borealis-server-key.pem`.
- TLS bundle (CA + server): `Engine/Certificates/borealis-server-bundle.pem`.
- Root CA key: `Engine/Certificates/borealis-root-ca-key.pem`.
- Script signing keys: `Engine/Certificates/Code-Signing/borealis-script-ed25519.key` and `.pub`.
### Key material locations (Agent)
- Identity keys: `Certificates/Agent/Identity/agent_identity_private.ed25519` and `agent_identity_public.ed25519`.
- Trusted server bundle: `Certificates/Agent/Trusted_Server_Cert/` (scope-specific).
- Tokens and GUID: `Agent/Borealis/Settings/` (refresh.token, access.jwt, Agent_GUID.txt).
### Enrollment sequence (step-by-step)
1) Agent generates Ed25519 key pair and a fingerprint.
2) Agent submits `/api/agent/enroll/request` with install code and public key.
3) Engine rate-limits and queues for operator approval.
4) Operator approves via `/api/admin/device-approvals/<id>/approve`.
5) Agent polls `/api/agent/enroll/poll`, returns signed nonce.
6) Engine issues GUID, access token, refresh token, TLS bundle, and signing key.
7) Agent pins cert bundle and stores tokens securely.
### Access vs refresh tokens
- Access token (JWT, EdDSA): used on every device API call; default expiry about 900 seconds.
- Refresh token: used only on `/api/agent/token/refresh` to mint new access tokens.
- Refresh token is SHA-256 hashed in DB and never stored in plaintext by the Engine.
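The hash-only storage rule can be sketched with the standard library. This is an illustration under assumptions: the helper names, the 48-byte token size, and the hex encoding of the digest are hypothetical; the page only states that refresh tokens are random urlsafe strings and that the Engine stores a SHA-256 hash.

```python
import hashlib
import hmac
import secrets


def new_refresh_token():
    """Generate a random urlsafe token (the size is an assumption)."""
    return secrets.token_urlsafe(48)


def hash_refresh_token(token):
    """Return the SHA-256 hex digest that would be stored instead of the token."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def token_matches(presented, stored_hash):
    """Compare a presented token against the stored hash in constant time."""
    return hmac.compare_digest(hash_refresh_token(presented), stored_hash)
```

Because only the digest is persisted, a leaked database row cannot be replayed as a token; the agent is the only party holding the cleartext value.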
### DPoP binding
- Refresh token requests can include a `DPoP` header.
- Engine validates DPoP proof and stores `dpop_jkt` in `refresh_tokens` table.
- Replay attempts return `dpop_replayed` and force re-enrollment behavior.
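The column name `dpop_jkt` suggests a JWK thumbprint of the proof key. As a hedged sketch, the function below computes an RFC 7638 thumbprint for an Ed25519 public key using the required OKP members (`crv`, `kty`, `x`). Whether the Engine derives `dpop_jkt` exactly this way is an assumption; the function names are hypothetical.

```python
import base64
import hashlib
import json


def b64url(data):
    """Base64url-encode bytes without padding, as used in JOSE."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def ed25519_jwk_thumbprint(public_key_bytes):
    """Return the SHA-256 JWK thumbprint for a raw 32-byte Ed25519 public key."""
    jwk = {"crv": "Ed25519", "kty": "OKP", "x": b64url(public_key_bytes)}
    # RFC 7638: required members only, lexicographic key order, no whitespace.
    canonical = json.dumps(jwk, separators=(",", ":"), sort_keys=True)
    return b64url(hashlib.sha256(canonical.encode("utf-8")).digest())
```

Binding a refresh token to this value means a stolen token is not usable without the matching private key, since each DPoP proof must be signed by it.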
### Rate limiting and abuse controls
- Enrollment uses IP and fingerprint rate limiters (see `Data/Engine/services/API/enrollment/routes.py`).
- README documents IP and fingerprint rate limits (40 req/min/IP, 12 req/min/fingerprint).
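One common way to enforce per-minute limits like these is a sliding-window counter keyed by IP or fingerprint. The class below is a generic sketch of that technique, not the limiter in `enrollment/routes.py`; the class name and the in-memory storage are assumptions.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` hits per key within a rolling time window."""

    def __init__(self, limit, window_seconds=60.0):
        self.limit = limit
        self.window = window_seconds
        self._hits = defaultdict(deque)

    def allow(self, key, now):
        """Record a hit at time `now` (seconds) and report whether it is allowed."""
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

Under the documented limits, an enrollment route would hold two instances, for example `SlidingWindowLimiter(40)` keyed by client IP and `SlidingWindowLimiter(12)` keyed by fingerprint, and reject the request if either returns `False`.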
### Code signing behavior
- Engine signs script payload bytes (Ed25519) before dispatch.
- Agent verifies signatures with `signature_utils` and stores the signing key on first success.
- If verification fails, the script is rejected and the agent logs an incident.
### Common failure modes
- `fingerprint_mismatch`: agent identity changed or cert data was wiped.
- `token_version_mismatch`: device token version bumped or revoked.
- `refresh_token_expired`: agent offline too long (greater than 90 days without refresh).
- `dpop_invalid`: DPoP proof missing or malformed.
### Agent Refresh Tokens (Full)
#### What a refresh token is
- A long-lived credential the agent gets during enrollment; it represents device trust and is bound to the agent's key/certificate fingerprint.
- Stored locally under the agent settings directory as an encrypted blob (`refresh.token`) alongside token metadata (`access.meta.json`) and the agent GUID.
- Not presented to normal APIs; it is only sent to the Engine to mint new short-lived access tokens.
#### How the agent obtains it
1) Enrollment (`/api/agent/enroll/request` -> `/api/agent/enroll/poll`):
- The agent proves possession of its Ed25519 identity and an operator-approved enrollment code.
- The Engine issues:
- `guid` (device identity)
- `access_token` (EdDSA JWT, about 15 minutes)
- `refresh_token` (random urlsafe string)
- Engine TLS bundle and signing key
- The agent persists the GUID, access token, refresh token, and expiry metadata via `AgentKeyStore` (`Data/Agent/security.py`).
#### How long it lasts (sliding expiry)
- Base TTL: 90 days (Engine stores `expires_at = now + 90 days`).
- Sliding refresh: every successful call to `/api/agent/token/refresh` resets `expires_at` to `now + 90 days`.
- Expiry is enforced by the Engine clock, not the agent.
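The sliding window can be sketched as a single update on the stored token record. This is a minimal illustration: the function name and the dictionary-shaped record are assumptions, while the `expires_at` and `last_used_at` field names and the 90-day TTL come from this page.

```python
from datetime import datetime, timedelta, timezone

REFRESH_TTL = timedelta(days=90)


def slide_refresh_expiry(record, now):
    """Validate a refresh token record and extend it on success.

    Returns False when the token has already expired by the Engine clock;
    otherwise stamps last_used_at and resets expires_at to now + 90 days.
    """
    if now >= record["expires_at"]:
        return False
    record["last_used_at"] = now
    record["expires_at"] = now + REFRESH_TTL
    return True
```

Because `now` is always the Engine's clock, an agent with a skewed local clock cannot extend or shorten its own window.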
#### Access tokens vs refresh tokens
- Access tokens: EdDSA JWTs with a lifetime of about 15 minutes (default `expires_in = 900`). Used for all device API calls and Socket.IO auth.
- Refresh tokens: used only to obtain new access tokens. If missing or invalid, the agent re-enrolls.
#### How the agent uses it
- All authenticated calls pass through `AgentHttpClient.ensure_authenticated()` (`Data/Agent/agent.py`).
- If no GUID/refresh token, the agent triggers enrollment.
- If the access token is missing or near expiry, the agent posts `{guid, refresh_token}` to `/api/agent/token/refresh`.
- On success, it stores the new access token and updated expiry metadata.
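The decision flow above can be summarized as a small pure function. This is a sketch, not the body of `AgentHttpClient.ensure_authenticated()`: the function name, the return labels, and the 60-second "near expiry" margin are assumptions, since the page does not specify the margin.

```python
def next_auth_step(guid, refresh_token, access_expires_at, now, skew_seconds=60):
    """Decide what the agent should do before making an authenticated call.

    Returns "enroll", "refresh", or "ok". Timestamps are epoch seconds.
    """
    if not guid or not refresh_token:
        # Without a device identity or refresh token, only enrollment can help.
        return "enroll"
    if access_expires_at is None or access_expires_at - now <= skew_seconds:
        # Missing or nearly expired access token: mint a new one via refresh.
        return "refresh"
    return "ok"
```

Refreshing slightly before expiry avoids sending a token that lapses while the request is in flight.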
#### When it stops working
- Engine-side expiry: `refresh_token_expired` (401) forces re-enrollment.
- Revocation: device status `revoked` or `decommissioned` blocks refresh.
- Fingerprint mismatch: identity key changes cause the Engine to reject refresh.
- Token version mismatch: token version bump in DB forces re-enrollment.
#### Operational notes
- Short outages are tolerated: the 90-day sliding window resets on the first successful refresh after the Engine is back.
- Long inactivity (more than 90 days without refresh) requires re-enrollment; the agent will reuse the last installer code if available, otherwise operator action is needed.
- Logs for token activity live under `Agent/Logs/` (`agent.log`, `agent.error.log`). Engine-side changes are recorded in the Engine DB `refresh_tokens` table with `last_used_at` and `expires_at`.
#### Relevant files
- Agent token lifecycle: `Data/Agent/agent.py` (`AgentHttpClient`).
- Token storage: `Data/Agent/security.py` (`AgentKeyStore`).
- Refresh API: `Data/Engine/services/API/tokens/routes.py`.
- Enrollment API: `Data/Engine/services/API/enrollment/routes.py`.
- JWT issuance: `Data/Engine/auth/jwt_service.py`.
- Database schema: `Data/Engine/database_migrations.py` (`refresh_tokens` table).
### Where to update docs when security changes
- Update this page and any impacted runtime docs (engine or agent).
- Update `api-reference.md` if you add or change security-related endpoints.


@@ -0,0 +1,284 @@
# UI and Notifications
[Back to Docs Index](index.md) | [Index (HTML)](index.html)
## Purpose
Describe the Borealis WebUI architecture, styling conventions, and the toast notification system.
## WebUI Architecture (High Level)
- Entry point: `Data/Engine/web-interface/src/App.jsx`.
- Global socket: `window.BorealisSocket` (Socket.IO client).
- Navigation: `Data/Engine/web-interface/src/Navigation_Sidebar.jsx`.
- Page template reference: `Data/Engine/web-interface/src/Admin/Page_Template.jsx` (layout only).
## Styling and Layout
- Borealis uses a MagicUI styling language with glass panels, gradients, and Quartz-themed AG Grid tables.
- The full MagicUI and AG Grid specification is embedded in the Codex Agent section below.
## Toast Notifications
- Backend: `POST /api/notifications/notify`.
- Transport: Socket.IO event `borealis_notification`.
- Frontend: `Data/Engine/web-interface/src/Notifications.jsx`.
## API Endpoints
- `POST /api/notifications/notify` (Token Authenticated) - broadcast a toast to all connected operators.
## Related Documentation
- [Engine Runtime](engine-runtime.md)
- [API Reference](api-reference.md)
- [Logging and Operations](logging-and-operations.md)
- [VPN and Remote Access](vpn-and-remote-access.md)
## Codex Agent (Detailed)
### Shared Conventions (Full)
- Cross-cutting guidance that applies to both Agent and Engine work.
- Domain-specific rules live in the Agent and Engine runtime docs.
- UI and AG Grid rules are defined in this document under the MagicUI and AG Grid sections.
- Add further shared topics here (for example, triage process, security posture deltas) instead of growing `AGENTS.md`.
### Shared UI (MagicUI + AG Grid) (Full)
Applies to all Borealis frontends. Use `Data/Engine/web-interface/src/Admin/Page_Template.jsx` as the canonical visual reference (no API/business logic). Keep this doc as the single source of truth for styling rules and AG Grid behavior.
- Toast notifications: see the Toast Notifications section below for endpoint, payload, severity variants, and quick test commands.
#### Page Template Reference
- Purpose: visual-only baseline for new pages; copy structure but wire your data in real pages.
- Header: small Material icon left of the title, subtitle beneath, utility buttons on the top-right.
- Shell: avoid gutters on the Paper.
- Selection column (for bulk actions): pinned left, square checkboxes, header checkbox enabled, about 52px fixed width, no menu/sort/resize; rely on AG Grid built-ins.
- Typography/buttons: IBM Plex Sans, gradient primary buttons, rounded corners (about 8px), themed Quartz grid wrapper.
#### MagicUI Styling Language (Visual System)
- Full-bleed canvas: hero shells run edge-to-edge; inset padding lives inside cards so gradients feel immersive.
- Glass panels: glassmorphic layers (`rgba(15,23,42,0.7)`), rounded 16-24px corners, blurred backdrops, micro borders, optional radial flares for motion.
- Hero storytelling: start views with stat-forward heroes, gradient StatTiles (min 160px) and uppercase pills (HERO_BADGE_SX) summarizing live signals/filters.
- Summary data grids: use AG Grid inside a glass wrapper (two columns Field/Value), matte navy background, no row striping.
- Tile palettes: online cyan to green; stale orange to red; needs update violet to cyan; secondary metrics fade from cyan into desaturated steel for consistent hue families.
- Hardware islands: storage/memory/network blocks reuse Quartz theme in rounded glass shells with flat fills; present numeric columns (Capacity/Used/Free/%) to match Device Inventory.
- Action surfaces: control bars live in translucent glass bands; filled dark inputs with cyan hover borders; primary actions are pill-shaped gradients; secondary controls are soft-outline icon buttons.
- Anchored controls: align selectors/utility buttons with grid edges in a single row; reserve glass backdrops for hero sections so content stays flush.
- Buttons and chips: gradient pills for primary CTAs (`linear-gradient(135deg,#34d399,#22d3ee)` success; `#7dd3fc->#c084fc` creation); neutral actions use rounded outlines with `rgba(148,163,184,0.4)` borders and uppercase microcopy.
- Rainbow accents: for creation CTAs, use dark-fill pills with rainbow border gradients and teal halo (shared with Quick Job).
- AG Grid treatment: Quartz theme with matte navy headers, subtle alternating row opacity, cyan/magenta interaction glows, rounded wrappers, soft borders, inset selection glows.
- Default grid cell padding: keep roughly 18px on the left edge and 12px on the right for standard cells (12px/9px for `auto-col-tight`) so text never hugs a column edge. Target the center + pinned containers so both regions stay aligned.
- Overlays/menus: `rgba(8,12,24,0.96)` canvas, blurred backdrops, thin steel borders; bright typography; deep blue glass inputs; cyan confirm, mauve destructive accents.
#### Aurora Tabs (MagicUI Tabbed Interfaces)
- Placement: sit directly below the hero title/subtitle band (8-16px gap). Tabs span the full width of the content column.
- Typography: IBM Plex Sans, `fontSize: 15`, mixed case labels (`textTransform: "none"`). Use `fontWeight: 600` for emphasis, but avoid uppercase that crowds the aurora glow.
- Indicator: 3px tall bar with rounded corners that uses the cyan to violet aurora gradient `linear-gradient(90deg,#7dd3fc,#c084fc)`. Keep it flush with the bottom border so it looks like a light strip under the active tab.
- Hover/active treatment: tabs float on a translucent aurora panel `linear-gradient(120deg, rgba(125,211,252,0.18), rgba(192,132,252,0.22))` with a 1px inset steel outline. This gradient applies on hover for both selected and non-selected tabs to keep parity.
- Colors: base text `MAGIC_UI.textMuted` (`#94a3b8`). Hovering switches to `MAGIC_UI.textBright` (`#e2e8f0`). Always force `opacity: 1` to avoid MUI's default faded text on unfocused tabs.
- Shape/spacing: tabs are pill-like with a 4px corner radius (`borderRadius: 1` in MUI `sx` units, assuming the default theme). Maintain `minHeight: 44px` so targets are touchable. Provide `borderBottom: 1px solid MAGIC_UI.panelBorder` to anchor the rail.
- CSS/SX snippet to copy into new tab stacks:
```jsx
const TAB_HOVER_GRADIENT = "linear-gradient(120deg, rgba(125,211,252,0.18), rgba(192,132,252,0.22))";
<Tabs
  value={tab}
  onChange={(_, v) => setTab(v)}
  variant="scrollable"
  scrollButtons="auto"
  TabIndicatorProps={{
    style: {
      height: 3,
      borderRadius: 3,
      background: "linear-gradient(90deg,#7dd3fc,#c084fc)",
    },
  }}
  sx={{
    borderBottom: `1px solid ${MAGIC_UI.panelBorder}`,
    "& .MuiTab-root": {
      color: MAGIC_UI.textMuted,
      fontFamily: "\"IBM Plex Sans\", \"Helvetica Neue\", Arial, sans-serif",
      fontSize: 15,
      textTransform: "none",
      fontWeight: 600,
      minHeight: 44,
      opacity: 1,
      borderRadius: 1,
      transition: "background 0.2s ease, color 0.2s ease, box-shadow 0.2s ease",
      "&:hover": {
        color: MAGIC_UI.textBright,
        backgroundImage: TAB_HOVER_GRADIENT,
        boxShadow: "0 0 0 1px rgba(148,163,184,0.25) inset",
      },
    },
    "& .Mui-selected": {
      color: MAGIC_UI.textBright,
      "&:hover": {
        backgroundImage: TAB_HOVER_GRADIENT,
      },
    },
  }}
>
  {TABS.map((t) => (
    <Tab key={t} label={t} />
  ))}
</Tabs>
```
- Interaction rules: tabs should never scroll vertically; rely on horizontal scroll for overflow. Always align the tab rail with the first section header on the page so the aurora indicator lines up with hero metrics.
- Accessibility: keep `aria-label` and `aria-controls` pairs when the panes hold complex content, and ensure the gradient backgrounds preserve 4.5:1 contrast for the text (the current cyan on dark meets this).
#### Page-Level Action Buttons
- Place page-level actions/buttons/hero-badges in a fixed overlay at the top-right, just below the global menu bar. For a reference implementation, match the Filter Editor placement in `Data/Engine/web-interface/src/Devices/Filters/Filter_Editor.jsx`: wrapper `position: "fixed"`, `top: { xs: 72, md: 88 }`, `right: { xs: 12, md: 20 }`, `zIndex: 1400`, with `pointerEvents: "none"` on the wrapper and `pointerEvents: "auto"` on the inner `Stack` so underlying content remains clickable.
- Use gradient primary pills and outlined secondary pills (rounded 999 radius, MagicUI colors). Keep horizontal spacing via a `Stack` (for example, `spacing={1.25}`); do not nest these buttons inside the title grid or tab rail.
- Tabs stay in normal document flow beneath the title/subtitle; the floating action bar should not shift layout. When operators request moving page actions (or when building new pages), apply this fixed overlay pattern instead of absolute positioning tied to tab rails.
- Keep the responsive offsets (xs/md) unless a specific page has a different header height/padding; only adjust the numeric values when explicitly needed to align with a nonstandard shell.
#### AG Grid Column Behavior (All Tables)
- Auto-size value columns and let the last column absorb remaining width so views span available space.
- Declare `AUTO_SIZE_COLUMNS` near the grid component (exclude the fill column).
- Helper: store the grid API in a ref and call `api.autoSizeColumns(AUTO_SIZE_COLUMNS, true)` inside `requestAnimationFrame` (or `setTimeout(...,0)` fallback); swallow errors because it can run before rows render.
- Hook the helper into both `onGridReady` and a `useEffect` watching the dataset (for example, `[filteredRows, loading]`); skip while `loading` or when there are zero rows.
- Column defs: apply shared `cellClass: "auto-col-tight"` (or equivalent) to every auto-sized column for consistent padding. Last column keeps the class for styling consistency.
- CSS override: ensure the wrapper targets both center and pinned containers so every cell shares the same flex alignment. Then apply the tighter inset to `auto-col-tight`:
```jsx
"& .ag-center-cols-container .ag-cell, & .ag-pinned-left-cols-container .ag-cell, & .ag-pinned-right-cols-container .ag-cell": {
  display: "flex",
  alignItems: "center",
  justifyContent: "flex-start",
  textAlign: "left",
  padding: "8px 12px 8px 18px",
},
"& .ag-center-cols-container .ag-cell .ag-cell-wrapper, & .ag-pinned-left-cols-container .ag-cell .ag-cell-wrapper, & .ag-pinned-right-cols-container .ag-cell .ag-cell-wrapper": {
  width: "100%",
  display: "flex",
  alignItems: "center",
  justifyContent: "flex-start",
  padding: 0,
},
"& .ag-center-cols-container .ag-cell.auto-col-tight, & .ag-pinned-left-cols-container .ag-cell.auto-col-tight, & .ag-pinned-right-cols-container .ag-cell.auto-col-tight": {
  paddingLeft: "12px",
  paddingRight: "9px",
},
```
- Style helper: reuse a `GRID_STYLE_BASE` (or similar) to set fonts/icons and `--ag-cell-horizontal-padding: "18px"` on every grid, then merge it with per-grid dimensions.
- Fill column: last column `{ flex: 1, minWidth: X }` (no width/maxWidth) to stretch when horizontal space remains.
- Pagination baseline: every Quartz grid ships with `pagination`, `paginationPageSize={20}`, and `paginationPageSizeSelector={[20, 50, 100]}`. This matches Device List behavior and prevents infinitely tall tables (Targets, assembly pickers, job histories, etc.).
- Example: follow the scaffolding in `Engine/web-interface/src/Scheduling/Scheduled_Jobs_List.jsx` and the structure in `Data/Engine/web-interface/src/Admin/Page_Template.jsx`.
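The auto-size helper described in the bullets above could look roughly like the sketch below. It is not taken from the Borealis source: the column ids in `AUTO_SIZE_COLUMNS` and the names `defer` and `scheduleAutoSize` are hypothetical, and the only AG Grid call used is `api.autoSizeColumns(keys, skipHeader)`.

```javascript
// Hypothetical column ids; list every auto-sized column except the fill column.
const AUTO_SIZE_COLUMNS = ["hostname", "status", "last_seen"];

// Prefer requestAnimationFrame; fall back to a zero-delay timeout.
function defer(fn) {
  if (typeof requestAnimationFrame === "function") {
    requestAnimationFrame(fn);
  } else {
    setTimeout(fn, 0);
  }
}

// Returns true when an auto-size pass was scheduled, false when it was skipped.
function scheduleAutoSize(api, { loading = false, rowCount = 0 } = {}) {
  if (!api || loading || rowCount === 0) return false;
  defer(() => {
    try {
      api.autoSizeColumns(AUTO_SIZE_COLUMNS, true); // true = skip header width
    } catch (err) {
      // Swallowed on purpose: the call can run before rows render.
    }
  });
  return true;
}
```

In a component, one option is to store `params.api` in a ref inside `onGridReady`, call `scheduleAutoSize(apiRef.current, { loading, rowCount: filteredRows.length })` there, and call it again from a `useEffect` that watches `[filteredRows, loading]`.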
### Toast Notifications (Full)
Use this guide to add, configure, and test transient toast notifications across Borealis. It documents the backend endpoint, frontend listener, payload contract, and quick Firefox console commands you can hand to operators for validation.
#### Components and paths
- Backend endpoint: `Data/Engine/services/API/notifications/management.py` (registered as `/api/notifications/notify`).
- Frontend listener and renderer: `Data/Engine/web-interface/src/Notifications.jsx` (mounted in `App.jsx`).
- Transport: Socket.IO event `borealis_notification` broadcast to connected WebUI clients.
#### Backend behavior
- Auth: Uses `RequestAuthContext.require_user()`; session or bearer must be present. Returns `401/403` otherwise.
- Route: `POST /api/notifications/notify`
- Emits `borealis_notification` over Socket.IO (no persistence).
- Logs via `service_log("notifications", ...)`.
- Validation: Requires `message` in payload. `title` defaults to `"Notification"` if omitted.
- Registration: API group `notifications` is enabled by default via `DEFAULT_API_GROUPS` and `_GROUP_REGISTRARS` in `Data/Engine/services/API/__init__.py`.
#### Payload schema
Send JSON body (session-authenticated):
- `title` (string, optional): heading line. Default `"Notification"`.
- `message` (string, required): body copy.
- `icon` (string, optional): Material icon name hint (for example, `info`, `filter`, `schedule`, `warning`, `error`). Falls back to `NotificationsActive`.
- `variant` (string, optional): visual theme. Accepted: `info` | `warning` | `error` (case-insensitive). Aliases: `type` or `severity`. Defaults to `info`.
- `ttl_ms` (number, optional): client-side lifetime in milliseconds; defaults to about 5200ms before fade-out.
Notes:
- Payload is fanned out verbatim to the WebUI (plus server-added fields: `id`, `username`, `role`, `created_at`).
- The client caps the visible stack to the 5 most recent items (newest on top).
- Non-empty `message` is mandatory; otherwise HTTP 400.
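As a rough illustration of the payload contract above, the sketch below applies the documented defaults and aliases in JavaScript. The real validation lives in the Python backend, so treat this as a model of the rules rather than the implementation; `normalizeNotification`, the error text, and the fallback of unrecognized variants to `info` are assumptions.

```javascript
const VARIANTS = ["info", "warning", "error"];
const DEFAULT_TTL_MS = 5200; // documented client default ("about 5200ms")

// Hypothetical helper: mirrors the documented schema, not the actual backend code.
function normalizeNotification(payload = {}) {
  const message = payload.message;
  if (typeof message !== "string" || message.trim() === "") {
    // The docs state that a missing or empty message yields HTTP 400.
    return { ok: false, status: 400, error: "message is required" };
  }
  // `type` and `severity` are documented aliases for `variant`.
  const rawVariant = payload.variant ?? payload.type ?? payload.severity ?? "info";
  const variant = String(rawVariant).toLowerCase();
  const ttl = Number(payload.ttl_ms);
  return {
    ok: true,
    notification: {
      title: payload.title || "Notification",
      message, // passed through verbatim
      icon: payload.icon || null,
      // Assumption: unrecognized variants fall back to "info".
      variant: VARIANTS.includes(variant) ? variant : "info",
      ttl_ms: Number.isFinite(ttl) && ttl > 0 ? ttl : DEFAULT_TTL_MS,
    },
  };
}
```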
#### Frontend rendering rules
- Component: `Notifications.jsx` listens to `borealis_notification` on `window.BorealisSocket`.
- Stack position: fixed top-right, high z-index, pointer events enabled on toasts only.
- Auto-dismiss: about 5s default; each item fades out and is removed.
- Theme by `variant`:
  - `info` (default): Borealis blue aurora gradient.
  - `warning`: muted amber gradient.
  - `error`: deep red gradient.
- Icon: no container; uses the provided Material icon hint. Small drop shadow for legibility.
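The "5 most recent, newest on top" stacking rule can be expressed as two small pure functions. This is a sketch only; `pushToast` and `removeToast` are hypothetical names and are not claimed to match what `Notifications.jsx` does internally. The `id` field is the server-added identifier mentioned in the payload notes.

```javascript
const MAX_VISIBLE_TOASTS = 5;

// Returns a new array with the incoming toast first, trimmed to the cap.
function pushToast(stack, toast, max = MAX_VISIBLE_TOASTS) {
  return [toast, ...stack].slice(0, max);
}

// Returns a new array without the toast that expired or was dismissed.
function removeToast(stack, id) {
  return stack.filter((t) => t.id !== id);
}
```

Keeping both functions pure makes them easy to use as React state updaters, for example `setToasts((prev) => pushToast(prev, incoming))`.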
#### Implementation steps (recap)
1) Backend: ensure `/api/notifications/notify` is registered (already in repo). New services should import `register_notifications` if API groups are customized.
2) Emit: from any authenticated server flow, POST to `/api/notifications/notify` with the payload above.
3) Frontend: `App.jsx` mounts `Notifications` globally; no per-page wiring needed.
4) Test: use the Firefox console examples below while logged in to confirm toast rendering.
#### Firefox console examples (run while signed in)
Info (default blue):
```js
fetch("/api/notifications/notify", {
  method: "POST",
  credentials: "include",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    title: "Test Notification",
    message: "Hello from the console!",
    icon: "info",
    variant: "info"
  })
}).then(r => r.json()).then(console.log).catch(console.error);
```
Warning (amber):
```js
fetch("/api/notifications/notify", {
  method: "POST",
  credentials: "include",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    title: "Heads up",
    message: "This is a warning example.",
    icon: "warning",
    variant: "warning"
  })
}).then(r => r.json()).then(console.log).catch(console.error);
```
Error (red):
```js
fetch("/api/notifications/notify", {
  method: "POST",
  credentials: "include",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    title: "Error encountered",
    message: "Something failed during processing.",
    icon: "error",
    variant: "error"
  })
}).then(r => r.json()).then(console.log).catch(console.error);
```
#### Usage notes and tips
- Keep `message` concise; multiline is supported via `\n`.
- Use `icon` to match the source feature (for example, `filter`, `schedule`, `device`, `error`).
- The server adds `username` and `role` to payloads; the client currently shows all variants regardless of role (filtering is per-username match when present).
- If sockets are unavailable, the endpoint still returns 200; toasts simply will not render until Socket.IO is connected.
### Remote Shell UI Changes Handoff (Full)
This section captures the UI behavior requirements and troubleshooting context for the Remote Shell onboarding path.
#### Current situation
- The WireGuard tunnel and Remote Shell work once the agent SYSTEM socket is online.
- If the operator clicks Connect too early, the UI shows `agent_socket_missing` and no toast appears.
- Goal: prevent the Remote Shell connect attempt until the agent is actually ready, and show a toast notification if the operator clicks too early.
#### Required behavior
- When the agent SYSTEM socket is not registered, the UI must block the connection attempt, show a toast via `/api/notifications/notify`, and keep the UI idle (no tunnel/session attempt).
- Toast title: `Agent Onboarding Underway`.
- Toast message: `Please wait for the agent to finish onboarding into Borealis. It takes about 1 minute to finish the process.`
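One possible shape for the required gate is sketched below. The toast title and message come from the requirements above; everything else is an assumption made for illustration: the function names, the `agent_id` query parameter on `/api/tunnel/status`, and the idea that readiness is signalled by `agent_socket === true` in the response body.

```javascript
const ONBOARDING_TOAST = {
  title: "Agent Onboarding Underway",
  message:
    "Please wait for the agent to finish onboarding into Borealis. " +
    "It takes about 1 minute to finish the process.",
  icon: "info",
  variant: "info",
};

// Pure decision: may the UI attempt a shell connection for this status payload?
function gateShellConnect(status) {
  const ready = Boolean(status && status.agent_socket === true);
  return ready
    ? { allowed: true, toast: null }
    : { allowed: false, toast: ONBOARDING_TOAST };
}

// Preflight wrapper (hypothetical). Resolves to true only when the agent is ready;
// otherwise it posts the toast and leaves the UI idle.
async function ensureAgentReady(agentId, fetchImpl = fetch) {
  // Assumption: the status endpoint accepts an agent_id query parameter.
  const resp = await fetchImpl(
    `/api/tunnel/status?agent_id=${encodeURIComponent(agentId)}`,
    { credentials: "include" }
  );
  const status = resp.ok ? await resp.json() : null;
  const gate = gateShellConnect(status);
  if (!gate.allowed) {
    await fetchImpl("/api/notifications/notify", {
      method: "POST",
      credentials: "include",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(gate.toast),
    });
  }
  return gate.allowed;
}
```

The Connect handler would await `ensureAgentReady(agentId)` and return early on `false`, so no tunnel or session attempt is made. The same `gateShellConnect` result can be reused when the shell open response comes back with `agent_socket_missing`.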
#### Important references
- Toast API and payload rules are documented above.
- UI file: `Data/Engine/web-interface/src/Devices/ReverseTunnel/Powershell.jsx`.
- API status endpoint: `/api/tunnel/status` returns `agent_socket` when available.
- Socket error path: `agent_socket_missing`.
#### Troubleshooting context
- Engine logs show `vpn_shell_open_failed ... reason=agent_socket_missing` when the SYSTEM socket is not connected.
- Toasts do not appear; likely causes: WebUI build is reused (`Existing WebUI build found`) or the UI error path does not trigger the toast.
- Ensure the toast is sent via `/api/notifications/notify` with `credentials: "include"` and the payload schema above.
#### Deliverables
- Update UI logic to call the notification API and block the connection attempt until readiness is confirmed.
- Cover both preflight status checks and the `agent_socket_missing` shell open response.
- Provide explicit rebuild/restart steps if the WebUI build must be refreshed.


@@ -0,0 +1,287 @@
# VPN and Remote Access
[Back to Docs Index](index.md) | [Index (HTML)](index.html)
## Purpose
Document Borealis remote access features: WireGuard reverse VPN tunnels, remote PowerShell, and RDP via Guacamole.
## WireGuard Reverse VPN (High Level)
- Outbound-only: agents initiate tunnels to the Engine; no inbound listeners on devices.
- Transport: WireGuard UDP 30000.
- One active tunnel per agent, shared across operators.
- Host-only routing: each agent gets a /32; no client-to-client routes.
- Idle timeout: 15 minutes without activity.
- Per-device allowlist: ports are restricted per device and enforced by Engine firewall rules.
## Remote PowerShell
- Uses the WireGuard tunnel and a TCP shell server on the agent.
- Engine bridges UI Socket.IO events to the agent TCP shell.
- Shell port default: 47002 (configurable).
## RDP via Guacamole
- Engine issues one-time RDP session tokens via `/api/rdp/session`.
- WebUI connects to `ws(s)://<engine_host>:4823/guacamole`.
- RDP allowed only if the device allowlist includes 3389.
## Reverse Proxy Configuration
Traefik dynamic config (replace service URL with the actual Borealis Engine URL):
```yml
http:
  routers:
    borealis:
      entryPoints:
        - websecure
      tls:
        certResolver: letsencrypt
      service: borealis
      rule: "Host(`borealis.example.com`) && PathPrefix(`/`)"
      middlewares:
        - cors-headers
  middlewares:
    cors-headers:
      headers:
        accessControlAllowOriginList:
          - "*"
        accessControlAllowMethods:
          - GET
          - POST
          - OPTIONS
        accessControlAllowHeaders:
          - Content-Type
          - Upgrade
          - Connection
        accessControlMaxAge: 100
        addVaryHeader: true
  services:
    borealis:
      loadBalancer:
        servers:
          - url: "http://127.0.0.1:5000"
        passHostHeader: true
```
## API Endpoints
- `POST /api/tunnel/connect` (Token Authenticated) - start WireGuard tunnel.
- `GET /api/tunnel/status` (Token Authenticated) - tunnel status by agent.
- `GET /api/tunnel/connect/status` (Token Authenticated) - alias for status.
- `GET /api/tunnel/active` (Token Authenticated) - list active tunnels.
- `DELETE /api/tunnel/disconnect` (Token Authenticated) - stop tunnel.
- `GET /api/device/vpn_config/<agent_id>` (Token Authenticated) - read allowed ports.
- `PUT /api/device/vpn_config/<agent_id>` (Token Authenticated) - update allowed ports.
- `POST /api/rdp/session` (Token Authenticated) - issue RDP session token.
## Related Documentation
- [Device Management](device-management.md)
- [Agent Runtime](agent-runtime.md)
- [Security and Trust](security-and-trust.md)
- [API Reference](api-reference.md)
## Codex Agent (Detailed)
### Core Engine files
- Tunnel service: `Data/Engine/services/VPN/vpn_tunnel_service.py`.
- WireGuard server manager: `Data/Engine/services/VPN/wireguard_server.py`.
- Tunnel API: `Data/Engine/services/API/devices/tunnel.py`.
- Shell bridge: `Data/Engine/services/WebSocket/vpn_shell.py`.
- RDP session API: `Data/Engine/services/API/devices/rdp.py`.
- Guacamole proxy: `Data/Engine/services/RemoteDesktop/guacamole_proxy.py`.
### Core Agent files
- WireGuard client role: `Data/Agent/Roles/role_WireGuardTunnel.py`.
- Remote PowerShell role: `Data/Agent/Roles/role_RemotePowershell.py`.
### Config paths
- Engine WireGuard config: `Engine/WireGuard/borealis-wg.conf`.
- Agent WireGuard config: `Agent/Borealis/Settings/WireGuard/Borealis.conf`.
- Engine WireGuard keys: `Engine/Certificates/VPN_Server/`.
- Agent WireGuard keys: `Agent/Borealis/Certificates/VPN_Client/`.
### Service names (Windows)
- Engine listener service: `WireGuardTunnel$borealis-wg`.
- Agent tunnel service: `WireGuardTunnel$Borealis`.
- Adapter name in Control Panel: `Borealis`.
- Display names:
  - `Borealis - WireGuard - Engine`
  - `Borealis - WireGuard - Agent`
### Event flow (WireGuard tunnel)
1) UI calls `/api/tunnel/connect`.
2) Engine creates a tunnel session and emits `vpn_tunnel_start`.
3) Agent verifies token signature and starts WireGuard client.
4) Engine applies firewall allowlist rules for the agent /32.
5) Activity is recorded in `activity_history` as a VPN event.
### Event flow (Remote PowerShell)
1) UI opens a shell and emits `vpn_shell_open`.
2) Engine checks tunnel status and agent socket readiness.
3) Engine opens a TCP connection to agent shell on port 47002.
4) UI sends `vpn_shell_send`; Engine forwards to agent over TCP.
5) Agent returns stdout frames; Engine emits `vpn_shell_output`.
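From the UI side, the shell event flow above might be wired up along these lines. Only the event names (`vpn_shell_open`, `vpn_shell_send`, `vpn_shell_output`) come from this document; the payload field names (`agent_id`, `data`) and the `openVpnShell` helper are hypothetical, and `socket` is assumed to be any Socket.IO-style client exposing `on`, `off`, and `emit`.

```javascript
// Hypothetical helper: opens a shell session and returns send/close handles.
function openVpnShell(socket, agentId, onOutput) {
  const handleOutput = (frame) => onOutput(frame);

  // Subscribe before opening so early output is not missed.
  socket.on("vpn_shell_output", handleOutput);
  socket.emit("vpn_shell_open", { agent_id: agentId }); // payload shape is an assumption

  return {
    // Forwards one line of input to the Engine, which relays it to the agent over TCP.
    send(line) {
      socket.emit("vpn_shell_send", { agent_id: agentId, data: line });
    },
    // Stops listening for output; tunnel teardown is handled separately.
    close() {
      socket.off("vpn_shell_output", handleOutput);
    },
  };
}
```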
### Allowed ports and ACL rules
- Default allowlist (Windows): 3389, 5985, 5986, 5900, 3478, 47002.
- Per-device overrides are stored in `device_vpn_config`.
- Engine creates outbound firewall rules for each allowed port and protocol.
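The allowlist lookup can be sketched as below. The default port list is the one given above; the rest is assumed for illustration, in particular that a per-device override replaces the defaults instead of merging with them, and that an empty or missing override falls back to the defaults. The helper names are hypothetical.

```javascript
const DEFAULT_ALLOWED_PORTS = [3389, 5985, 5986, 5900, 3478, 47002];

// Assumption: a non-empty per-device override replaces the defaults entirely.
function resolveAllowedPorts(override) {
  const source =
    Array.isArray(override) && override.length > 0 ? override : DEFAULT_ALLOWED_PORTS;
  const valid = source
    .map(Number)
    .filter((p) => Number.isInteger(p) && p >= 1 && p <= 65535);
  // De-duplicate and sort so the firewall rule set is deterministic.
  return [...new Set(valid)].sort((a, b) => a - b);
}

function isPortAllowed(port, override) {
  return resolveAllowedPorts(override).includes(Number(port));
}
```

With this model, the RDP precondition reduces to `isPortAllowed(3389, deviceOverride)`.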
### Idle timeout behavior
- The tunnel idle timer resets when activity is detected or when `bump_activity` is called.
- Idle sessions are torn down and firewall rules are removed.
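A minimal model of the idle timer, assuming the 15-minute timeout stated earlier on this page, is shown below. The class and method names are hypothetical (the Engine-side `bump_activity` is Python), and the injected clock exists only to make the behavior easy to test.

```javascript
const IDLE_TIMEOUT_MS = 15 * 60 * 1000; // 15 minutes

class IdleTracker {
  // `now` is injectable so tests can control time.
  constructor(now = Date.now, timeoutMs = IDLE_TIMEOUT_MS) {
    this.now = now;
    this.timeoutMs = timeoutMs;
    this.lastActivity = now();
  }

  // Resets the idle timer; call on detected activity or an explicit bump.
  bumpActivity() {
    this.lastActivity = this.now();
  }

  // True once the timeout has elapsed with no activity.
  isIdle() {
    return this.now() - this.lastActivity >= this.timeoutMs;
  }
}
```

A periodic sweep would check `isIdle()` for each session and, when it returns true, tear the tunnel down and remove the firewall rules.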
### Logs to inspect
- Engine tunnel log: `Engine/Logs/VPN_Tunnel/tunnel.log`.
- Engine shell log: `Engine/Logs/VPN_Tunnel/remote_shell.log`.
- Agent tunnel log: `Agent/Logs/VPN_Tunnel/tunnel.log`.
- Agent shell log: `Agent/Logs/VPN_Tunnel/remote_shell.log`.
### Troubleshooting checklist
- Confirm WireGuard service is running (Engine and Agent).
- Confirm `/api/tunnel/status` returns `status=up` and `agent_socket=true`.
- Verify `Agent/Borealis/Settings/WireGuard/Borealis.conf` during an active session.
- Test TCP shell reachability: `Test-NetConnection <agent_vpn_ip> -Port 47002`.
### Known limitations
- Legacy WebSocket tunnels are retired; only WireGuard is supported.
- RDP requires guacd running and port 3389 allowed in VPN config.
### Reverse VPN Tunnels (WireGuard) - Full Reference
#### 1) High-level model
- Outbound-only: agents establish WireGuard tunnels to the Engine; no inbound access on devices.
- Transport: WireGuard/UDP on port 30000.
- Sessions: one live VPN tunnel per agent; multiple operators share it.
- Routing: host-only /32 per agent; AllowedIPs restricted to the agent /32 and engine /32; no client-to-client.
- Idle timeout: 15 minutes of no operator activity; no grace period.
- Keys: WireGuard server keys under `Engine/Certificates/VPN_Server`; client keys under `Agent/Borealis/Certificates/VPN_Client`.
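To make the host-only routing concrete, the sketch below renders an agent-side WireGuard config whose peer `AllowedIPs` is limited to the Engine /32. The actual rendering is done by the Python roles; this helper and its parameter names are hypothetical. The `[Interface]`/`[Peer]` keys are standard WireGuard config keys, and port 30000 is the transport port stated above.

```javascript
// Hypothetical renderer; key material and addresses are supplied by the caller.
function renderAgentConfig({
  privateKey,
  agentIp,
  enginePublicKey,
  engineIp,
  endpointHost,
  endpointPort = 30000,
}) {
  return [
    "[Interface]",
    `PrivateKey = ${privateKey}`,
    `Address = ${agentIp}/32`, // one /32 per agent
    "",
    "[Peer]",
    `PublicKey = ${enginePublicKey}`,
    `Endpoint = ${endpointHost}:${endpointPort}`,
    `AllowedIPs = ${engineIp}/32`, // engine only: no LAN or client-to-client routes
    "",
  ].join("\n");
}
```

For example, `renderAgentConfig({ privateKey: "<agent-private-key>", agentIp: "10.255.0.2", enginePublicKey: "<engine-public-key>", engineIp: "10.255.0.1", endpointHost: "10.0.0.54" })` yields a config with a single peer. The engine address `10.255.0.1` is a placeholder chosen for the example.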
#### 2) Engine components
- Orchestrator: `Data/Engine/services/VPN/vpn_tunnel_service.py`
  - Allocates per-agent /32, issues short-lived orchestration tokens, enforces single-session.
  - Starts/stops WireGuard listener, applies firewall rules, idles out on inactivity.
  - Emits Socket.IO events: `vpn_tunnel_start`, `vpn_tunnel_stop`, `vpn_tunnel_activity`.
- WireGuard manager: `Data/Engine/services/VPN/wireguard_server.py`
  - Generates server keys, renders config, manages `wireguard.exe` tunnel service, applies ACL rules.
- PowerShell bridge: `Data/Engine/services/WebSocket/vpn_shell.py`
  - Proxies UI shell input/output to the agent's TCP shell server over WireGuard.
- Logging: `Engine/Logs/VPN_Tunnel/tunnel.log` plus Device Activity entries; shell I/O is in `Engine/Logs/VPN_Tunnel/remote_shell.log`.
#### 3) API endpoints
- `POST /api/tunnel/connect` -> issues session material (tunnel_id, token, virtual_ip, endpoint, allowed_ports, idle_seconds).
- `GET /api/tunnel/status` -> returns up/down status for an agent.
- `GET /api/tunnel/connect/status` -> alias for status (used by UI before shell open).
- `GET /api/tunnel/active` -> lists active VPN tunnel sessions (tunnel_id, agent_id, virtual_ip, last_activity, etc.).
- `DELETE /api/tunnel/disconnect` -> immediate teardown (agent and engine cleanup).
- `GET /api/device/vpn_config/<agent_id>` -> read per-agent allowed ports.
- `PUT /api/device/vpn_config/<agent_id>` -> update allowed ports.
#### 4) Agent components
- Tunnel lifecycle: `Data/Agent/Roles/role_WireGuardTunnel.py`
  - Validates orchestration tokens, starts/stops WireGuard client service, enforces idle.
- Shell server: `Data/Agent/Roles/role_RemotePowershell.py`
  - TCP PowerShell server bound to `0.0.0.0:47002`, restricted to VPN subnet (10.255.x.x).
- Logging: `Agent/Logs/VPN_Tunnel/tunnel.log` (tunnel lifecycle) and `Agent/Logs/VPN_Tunnel/remote_shell.log` (shell I/O).
#### 5) Security and auth
- TLS pinned for Engine API and Socket.IO.
- Orchestration tokens signed via Engine Ed25519 key; agent verifies signatures and stores the signing key.
- WireGuard AllowedIPs /32; no LAN routes; client-to-client blocked.
- Engine firewall rules enforce per-device allowed ports.
#### 6) UI
- Device details now include an "Advanced Config" tab for per-device allowed ports.
- PowerShell MVP reuses `Data/Engine/web-interface/src/Devices/ReverseTunnel/Powershell.jsx` with WireGuard APIs and VPN shell events.
#### 7) Extending to new protocols
- Add protocol ports to the device allowlist and UI toggles.
- Reuse the existing VPN tunnel; no new transport/domain lanes required.
#### 8) Legacy removal
- WebSocket tunnel domains, protocol handlers, and domain limits are removed.
- No `/tunnel` Socket.IO namespace or framed protocol messages remain.
#### 9) Change log (not exhaustive)
- 2025-11-30: Legacy WebSocket tunnel scaffold introduced (lease manager, framing, tokens).
- 2025-12-06: Legacy PowerShell handler simplified to pipes-only; UI status tweaks.
- 2025-12-18: Legacy domain lanes added (`remote-interactive-shell`, `remote-management`, `remote-video`) with limits.
- 2025-12-20: WireGuard reverse VPN migration complete; legacy WebSocket tunnels retired; VPN shell bridge and new APIs.
### WireGuard Troubleshooting Handoff (Full)
This section consolidates the troubleshooting context and environment notes for WireGuard tunnel investigations. It is written as reference material only (no standalone prompts).
#### Environment and scope
- Workspace: D:\Github\Borealis (local project root for the Engine)
- Host OS: Windows 10/11 (build 26200). Engine runs on this machine.
- Remote Agent: mounted read-only at Z:\ (maps to C:\Borealis on the remote device; logs/configs under Z:\Agent\...).
- Agent and Engine launch: via Borealis.ps1, always elevated as admin.
- Network: Engine on 10.0.0.54; remote agent uses server_url.txt to derive endpoint host.
- WireGuard version: wireguard.exe 0.5.3, wg.exe 1.0.20210914.
- PIA (Private Internet Access) is installed and supplies a wintun driver (pia-wintun.sys). Do NOT treat the PIA adapter as the Borealis adapter.
#### Desired behavior
- Agent has a dedicated WireGuard adapter named "Borealis".
- Adapter provisioning is idempotent: if "Borealis" exists, do not recreate it.
- Configs must live inside the project root:
  - Agent: Agent\Borealis\Settings\WireGuard\Borealis.conf
  - Engine: Engine\WireGuard\borealis-wg.conf
- Agent brings up the WireGuard tunnel on vpn_tunnel_start, then remote shell/RDP/VNC/SSH flow through it.
- On stop/idle, the tunnel is torn down and firewall rules removed.
#### Recent changes (current repo state)
- Data/Agent/Roles/role_WireGuardTunnel.py
  - Lazy client init (avoid side effects on import).
  - Service name fix: WireGuard tunnel service is "WireGuardTunnel$Borealis".
  - Endpoint override: if Engine sends localhost, use host from server_url.txt and port from the token.
  - Config path preference: Agent\Borealis\Settings\WireGuard.
  - Service display name set to "Borealis - WireGuard - Agent".
  - Applies/removes the VPN shell firewall rule using the engine /32 from allowed_ips.
- Data/Engine/services/VPN/wireguard_server.py
  - Engine config path: Engine\WireGuard\borealis-wg.conf (project root only).
  - Removed invalid "SaveConfig = false" line (WireGuard rejected it).
  - Service display name set to "Borealis - WireGuard - Engine".
  - Ensures the listener service is running after install, and raises if it fails.
- Borealis.ps1
  - Service name interpolation fixed to include the literal "$" in "WireGuardTunnel$Borealis".
Note: Data/Agent changes only apply after Borealis.ps1 re-stages the agent under Agent\.
#### Current symptoms (2026-01-14 00:05)
- Tunnel handshakes are healthy; TCP shell connectivity succeeds after adding a firewall rule for TCP/47002 from the engine /32.
- The firewall rule is now applied/removed by `role_WireGuardTunnel.py` using the engine /32 in the `allowed_ips` payload.
- `wireguard.exe /dumplog /tail` still fails with "Stdout must be set" when run from PowerShell (use file redirection).
#### Key paths
- Agent WireGuard role: Data/Agent/Roles/role_WireGuardTunnel.py
- Agent VPN shell role: Data/Agent/Roles/role_RemotePowershell.py
- Engine WireGuard manager: Data/Engine/services/VPN/wireguard_server.py
- Engine tunnel service: Data/Engine/services/VPN/vpn_tunnel_service.py
- Agent tunnel logs: Z:\Agent\Logs\VPN_Tunnel\tunnel.log
- Agent shell logs: Z:\Agent\Logs\VPN_Tunnel\remote_shell.log
- Engine tunnel logs: Engine\Logs\VPN_Tunnel\tunnel.log
- Engine shell logs: Engine\Logs\VPN_Tunnel\remote_shell.log
- Agent WireGuard config: Z:\Agent\Borealis\Settings\WireGuard\Borealis.conf
- Engine WireGuard config: Engine\WireGuard\borealis-wg.conf
#### Known WireGuard services and names
- Engine listener service name: "WireGuardTunnel$borealis-wg"
- Agent tunnel service name: "WireGuardTunnel$Borealis"
- Adapter name in Control Panel: "Borealis"
- Service display names:
  - "Borealis - WireGuard - Engine"
  - "Borealis - WireGuard - Agent"
#### Suggested verification commands
- Engine service status:
  - `Get-Service -Name 'WireGuardTunnel$borealis-wg'` (single quotes stop PowerShell from expanding `$borealis` as a variable)
  - `sc.exe query "WireGuardTunnel$borealis-wg"` (from cmd.exe; in PowerShell, single-quote the service name)
  - `netstat -ano -p udp | findstr :30000`
- Engine WireGuard log tail:
  - `cmd /c ""C:\Program Files\WireGuard\wireguard.exe" /dumplog /tail > %TEMP%\wg-tail.log"`
  - `powershell -NoProfile -Command "& 'C:\Program Files\WireGuard\wireguard.exe' /dumplog /tail 2>&1 | Out-File $env:TEMP\wg-tail.log"`
- Agent tunnel state (remote, via Z:\ logs):
  - `Z:\Agent\Logs\VPN_Tunnel\tunnel.log`
  - `Z:\Agent\Logs\VPN_Tunnel\remote_shell.log`
  - `Z:\Agent\Borealis\Settings\WireGuard\Borealis.conf`
#### Current blockers and next steps
1) Ensure the agent runtime is re-staged so `role_WireGuardTunnel.py` applies the shell firewall rule on tunnel start.
2) During an active session, run `Test-NetConnection -ComputerName 10.255.0.2 -Port 47002` on the Engine and confirm it reaches the agent.
3) While the session is active, confirm `Agent\Borealis\Settings\WireGuard\Borealis.conf` includes a [Peer] with endpoint/AllowedIPs (it reverts to idle config after stop).
4) Capture engine and agent tunnel/shell logs around a failed shell open attempt and re-check WireGuard service state if issues persist.

readme.md

@@ -7,13 +7,18 @@ I'm the sole maintainer and still learning as I go, while working a full-time IT
---
## Documentation
- Human-friendly docs live in `Docs/` with a top-level index at `Docs/index.md`.
- The same files also contain **Codex Agent** sections with deep, agent-focused implementation details.
- Start with `Docs/getting-started.md` and `Docs/architecture-overview.md`, then jump to the domain pages.
## Features
- **Device Inventory**: OS, hardware, and status posted on connect and periodically.
- **Remote Script Execution**: Run PowerShell in `CURRENT USER` context or as `NT AUTHORITY\SYSTEM`.
- **Jobs and Scheduling**: Launch "*Quick Jobs*" instantly or create more advanced schedules.
- **Visual Workflows**: Drag-and-drop node canvas for combining steps, analysis, and logic.
- **Ansible Playbooks**: Ansible playbook support is unfinished/broken in both the Engine and agent runtimes. The goal is to ship server-driven Ansible (SSH/WinRM) alongside agent-driven playbooks.
- **Windows-first**. Linux Engine support ships via `Borealis.sh` (Engine is currently the focus); the Linux agent is not yet available; only settings can be staged - and the current Linux agent build would not execute scripts, audits, or likely even enroll reliably.
## Current Status & Limitations
- Ansible is disabled/unstable: Engine quick-run returns not implemented, scheduled-job and agent paths are incomplete, and server-side SSH/WinRM playbook dispatch is still on the roadmap. Expect failures until the Ansible pipeline is rebuilt.
@@ -75,200 +80,3 @@ Site List:
2) (*Optional*) Install the Agent (*Windows, elevated PowerShell*):
- Windows: `./Borealis.ps1 -Agent`
- Linux agent binaries are not available yet; `Borealis.sh --Agent` only stages config settings.
## Automated Agent Enrollment
If you plan to deploy the agent via Group Policy or another existing automation platform, you can use the following command-line arguments to install an agent automatically with an enrollment code pre-injected. *The enrollment code below is simply an example*.
**Windows**:
```powershell
.\Borealis.ps1 -Agent -EnrollmentCode "E925-448B-626D-D595-5A0F-FB24-B4D6-6983"
```
**Linux**: Agent enrollment is not yet available on Linux; `Borealis.sh --Agent` only writes settings placeholders.
### Reverse Proxy Configuration
Traefik Dynamic Config: `Replace Service URL with actual IP of Borealis server`
```yml
http:
  routers:
    borealis:
      entryPoints:
        - websecure
      tls:
        certResolver: letsencrypt
      service: borealis
      rule: "Host(`borealis.example.com`) && PathPrefix(`/`)"
      middlewares:
        - cors-headers
  middlewares:
    cors-headers:
      headers:
        accessControlAllowOriginList:
          - "*"
        accessControlAllowMethods:
          - GET
          - POST
          - OPTIONS
        accessControlAllowHeaders:
          - Content-Type
          - Upgrade
          - Connection
        accessControlMaxAge: 100
        addVaryHeader: true
  services:
    borealis:
      loadBalancer:
        servers:
          - url: "http://127.0.0.1:5000"
        passHostHeader: true
```
## Security Breakdowns
The process that agents go through when authenticating securely with a Borealis server can be a little complex, so I have included a few sequence diagrams below along with a summary of the (current) security posture of Borealis to go over the core systems so you can visually understand what is going on behind-the-scenes.
### Security Overview
#### Overall
- Borealis enforces mutual trust: each agent presents a unique Ed25519 identity to the server, the server issues EdDSA-signed (Ed25519) access tokens bound to that fingerprint, and both sides pin the generated Borealis root CA.
- End-to-end TLS everywhere: the Engine auto-provisions an ECDSA P-384 root + leaf chain under `Engine/Certificates` and serves TLS using Python defaults (TLS 1.2+); agents pin the delivered bundle for both REST and WebSocket traffic to eliminate man-in-the-middle avenues.
- Device enrollment is gated by enrollment/installer codes (*with configurable expiration and usage limits*) and an operator approval queue; replay-resistant nonces plus rate limits (40 req/min/IP, 12 req/min/fingerprint) prevent brute force or code reuse.
- All device APIs now require Authorization: Bearer headers and a service-context (e.g. SYSTEM or CURRENTUSER) marker; missing, expired, mismatched, or revoked credentials are rejected before any business logic runs. Operator-driven revoking / device quarantining logic is not yet implemented.
- Replay and credential theft defenses layer in DPoP proof validation (thumbprint binding) on the server side and short-lived access tokens (15 min) with 90-day refresh tokens hashed via SHA-256.
- Centralized logging under Engine/Logs and Agent/Logs captures enrollment approvals, rate-limit hits, signature failures, and auth anomalies for post-incident review.
- The Engine's operator-facing API endpoints (device inventory, assemblies, job history, etc.) require an authenticated operator session or bearer token; unauthenticated requests are rejected with 401/403 responses before any inventory or script metadata is returned, and the requesting user is logged with each quick-run dispatch.
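The enrollment rate limits listed above (40 req/min per IP, 12 req/min per fingerprint) could be enforced with a sliding-window counter like the sketch below. This document does not say which algorithm the Engine actually uses, so the approach, the class name, and the `allowEnrollment` helper are all assumptions.

```javascript
class SlidingWindowLimiter {
  constructor(limit, windowMs = 60_000) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.hits = new Map(); // key -> timestamps of allowed requests
  }

  // Returns true and records the hit when the key is under its limit.
  allow(key, now = Date.now()) {
    const cutoff = now - this.windowMs;
    const recent = (this.hits.get(key) || []).filter((t) => t > cutoff);
    const allowed = recent.length < this.limit;
    if (allowed) recent.push(now);
    this.hits.set(key, recent);
    return allowed;
  }
}

const perIp = new SlidingWindowLimiter(40);
const perFingerprint = new SlidingWindowLimiter(12);

// A request passes only when both the IP and the fingerprint are under their limits.
function allowEnrollment(ip, fingerprint, now = Date.now()) {
  const ipOk = perIp.allow(ip, now);
  const fpOk = perFingerprint.allow(fingerprint, now);
  return ipOk && fpOk;
}
```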
#### Server Security
- Auto-manages PKI: a persistent Borealis root CA (ECDSA SECP384R1) signs leaf certificates that include localhost SANs, tightened filesystem permissions, and a combined bundle for agent identity / cert pinning.
- Script delivery is code-signed with an Ed25519 key stored under Engine/Certificates/Code-Signing; agents refuse any payload whose signature does not match the pinned public key.
- Device authentication checks GUID normalization, SSL fingerprint matches, token version counters, and quarantine flags before admitting requests; missing rows with valid tokens auto-recover into placeholder records to avoid accidental lockouts.
- Refresh tokens are never stored in cleartext; only SHA-256 hashes plus DPoP bindings land in SQLite, and reuse after revocation/expiry returns explicit error codes.
- Enrollment workflow queues approvals, detects hostname/fingerprint conflicts, offers merge/overwrite options, and records auditor identities so trust decisions are traceable.
- Background pruning of expired enrollment codes and refresh tokens is not wired yet; a maintenance task is still needed.
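
To make the hashed refresh-token storage and the still-missing pruning task concrete, here is a minimal sketch using the standard-library `sqlite3` module. The table name `refresh_tokens`, its columns, the function names, and the error strings (`unknown_token`, `token_revoked`, `token_expired`) are all hypothetical; the real schema and error codes may differ, and DPoP binding is left out for brevity.

```python
import hashlib
import secrets
import sqlite3
import time

REFRESH_TTL = 90 * 24 * 3600  # 90-day refresh token lifetime, in seconds


def hash_token(token):
    """Return the SHA-256 hex digest that is stored in place of the token."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def init_db(conn):
    # Hypothetical schema: only the hash is persisted, never the token itself.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS refresh_tokens ("
        " token_hash TEXT PRIMARY KEY,"
        " guid TEXT NOT NULL,"
        " expires_at REAL NOT NULL,"
        " revoked INTEGER NOT NULL DEFAULT 0)"
    )


def issue_refresh_token(conn, guid, now=None):
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    conn.execute(
        "INSERT INTO refresh_tokens (token_hash, guid, expires_at) VALUES (?, ?, ?)",
        (hash_token(token), guid, now + REFRESH_TTL),
    )
    return token  # the cleartext value is returned to the agent only


def check_refresh_token(conn, token, now=None):
    """Return 'ok' or an explicit (hypothetical) error code."""
    now = time.time() if now is None else now
    row = conn.execute(
        "SELECT expires_at, revoked FROM refresh_tokens WHERE token_hash = ?",
        (hash_token(token),),
    ).fetchone()
    if row is None:
        return "unknown_token"
    expires_at, revoked = row
    if revoked:
        return "token_revoked"
    if now >= expires_at:
        return "token_expired"
    return "ok"


def prune_expired(conn, now=None):
    """Delete expired rows; one possible shape for the missing maintenance task."""
    now = time.time() if now is None else now
    cur = conn.execute("DELETE FROM refresh_tokens WHERE expires_at <= ?", (now,))
    return cur.rowcount
```

A scheduler could call `prune_expired` periodically; how and where that would be wired into the Engine is not specified here.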
#### Agent
- Generates device-wide Ed25519 key pairs on first launch, storing them under Certificates/Agent/Identity/ with DPAPI protection on Windows (`chmod 600` elsewhere) and persisting the server-issued GUID alongside.
- Stores refresh/access tokens encrypted (DPAPI) and re-enrolls on authentication failures; TLS pinning relies on the stored server certificate bundle rather than a separate fingerprint binding for the tokens.
- Imports the server's TLS bundle into a dedicated ssl.SSLContext, reuses it for the REST session, and injects it into the Socket.IO engine so WebSockets enjoy the same pinning and hostname checks.
- Treats every script payload as hostile until verified: only Ed25519 signatures from the server are accepted, missing/invalid signatures are logged and dropped, and the trusted signing key is updated only after successful verification between the agent and the server.
- Operates outbound-only; there are no listener ports, and every API/WebSocket call flows through AgentHttpClient.ensure_authenticated, forcing token refresh logic before retrying.
- Logs bootstrap, enrollment, token refresh, and signature events to daily-rotated files under Agent/Logs, giving operators visibility without leaking secrets outside the project root.
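
The "refresh before retrying, re-enroll on authentication failure" behavior described above might look roughly like the following. This is a sketch under stated assumptions, not the real `AgentHttpClient`: the class `AuthState`, the injected `refresh` and `enroll` callables, the use of `PermissionError` to signal a rejected refresh, and the 60-second `REFRESH_SKEW` are all invented for illustration.

```python
import time

REFRESH_SKEW = 60.0  # refresh this many seconds before expiry (assumed value)


class AuthState:
    """Hypothetical stand-in for the token handling behind ensure_authenticated."""

    def __init__(self, refresh, enroll, clock=time.time):
        # refresh() -> (access_token, expires_at); raises PermissionError if rejected.
        self._refresh = refresh
        # enroll() -> (access_token, expires_at); full re-enrollment path.
        self._enroll = enroll
        self._clock = clock
        self.access_token = None
        self.expires_at = 0.0

    def ensure_authenticated(self):
        """Return a usable access token, refreshing or re-enrolling if needed."""
        if self.access_token and self._clock() < self.expires_at - REFRESH_SKEW:
            return self.access_token
        try:
            self.access_token, self.expires_at = self._refresh()
        except PermissionError:
            # The server rejected the refresh token: fall back to re-enrollment.
            self.access_token, self.expires_at = self._enroll()
        return self.access_token

    def headers(self):
        """Build the Authorization header for an outbound REST call."""
        return {"Authorization": f"Bearer {self.ensure_authenticated()}"}
```

Routing every outbound call through one method keeps the refresh and re-enrollment decision in a single place, which matches the outbound-only design described in the list.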
#### WireGuard Agent to Engine Tunnels
- Borealis started with a bespoke reverse tunnel stack (WebSocket framing + domain lanes); its handshake/security model did not scale, so the project made a major move to WireGuard as the Engine <-> Agent data pipeline for secure remote protocols and future remote desktop control.
- On-demand, outbound-only: operators trigger a tunnel start, the agent dials the Engine (no inbound listeners), and the tunnel tears down on stop or idle.
- Shared sessions: one live VPN tunnel per agent, reused across operators to avoid redundant connections.
- Fast and robust transport: WireGuard provides encrypted UDP transport with lightweight handshakes that keep latency low and make reconnects resilient.
- Orchestration security: the Engine issues short-lived, Ed25519-signed tunnel tokens that the agent verifies before bringing the tunnel up.
- Pinned trust: tunnel orchestration uses the same pinned TLS channel as REST/Socket.IO to prevent MITM during setup and control.
- Isolation by default: each agent gets a host-only /32; AllowedIPs are restricted to the agent /32 and the Engine /32; no LAN routes and no client-to-client traffic.
- Port-level controls: per-device allowlists plus Engine-applied firewall rules limit which protocols can traverse the tunnel.
- Live PowerShell today: a VPN-only shell endpoint enables remote command execution with SYSTEM-level (`NT AUTHORITY\SYSTEM`) access for deep diagnostics and remediation.
- Session lifecycle: 15-minute idle timeout with no grace period; session material includes a virtual IP plus allowed ports; teardown removes the tunnel and firewall rules.
- Future protocols: extend the same tunnel for SSH, WinRM, RDP, VNC, WebRTC streaming, and other remote management workflows by enabling ports per device.
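
The host-only /32 isolation and the idle timeout can be illustrated with a short sketch that renders WireGuard configuration fragments. The function names, the example addresses, and the split of `AllowedIPs` (Engine /32 on the agent side, agent /32 on the Engine side) are one reading of the list above, not the project's actual orchestration code. The `[Interface]`/`[Peer]` keys are standard WireGuard config keys.

```python
import ipaddress

IDLE_TIMEOUT = 15 * 60  # seconds; the 15-minute idle timeout from the list above


def host_only(ip):
    """Return the /32 network for a single IPv4 address."""
    return ipaddress.ip_network(f"{ipaddress.IPv4Address(ip)}/32")


def agent_config(agent_ip, engine_ip, engine_public_key, engine_endpoint):
    """Render an agent-side config whose only route is the Engine /32."""
    return "\n".join([
        "[Interface]",
        "PrivateKey = <agent-private-key>",  # placeholder, never a real key
        f"Address = {host_only(agent_ip)}",
        "",
        "[Peer]",
        f"PublicKey = {engine_public_key}",
        f"Endpoint = {engine_endpoint}",
        f"AllowedIPs = {host_only(engine_ip)}",  # no LAN routes, no other peers
    ])


def engine_peer_section(agent_ip, agent_public_key):
    """Render the Engine-side peer entry, restricted to the agent /32."""
    return "\n".join([
        "[Peer]",
        f"PublicKey = {agent_public_key}",
        f"AllowedIPs = {host_only(agent_ip)}",
    ])


def should_tear_down(last_activity, now):
    """True once the session has been idle for the full timeout (no grace period)."""
    return now - last_activity >= IDLE_TIMEOUT
```

For example, `agent_config("10.255.0.7", "10.255.0.1", "<engine-public-key>", "engine.example:51820")` yields a config in which the agent can address only the Engine's virtual IP; the `10.255.0.x` range and port are illustrative values.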
### Agent/Server Enrollment
```mermaid
sequenceDiagram
participant Operator
participant Server
participant SYS as "SYSTEM Agent"
participant CUR as "CURRENTUSER Agent"
Operator->>Server: Request installer code
Server-->>Operator: Deliver hashed installer code
Note over Operator,Server: Human-controlled code binds enrollment to known device
par TLS Handshake (SYSTEM)
SYS->>Server: Initiate TLS session
Server-->>SYS: Present TLS certificate
and TLS Handshake (CURRENTUSER)
CUR->>Server: Initiate TLS session
Server-->>CUR: Present TLS certificate
end
Note over SYS,Server: Certificate pinning plus CA checks stop MITM
Note over CUR,Server: Pinning also blocks spoofed control planes
SYS->>SYS: Generate Ed25519 identity key pair
Note right of SYS: Private key stored under Certificates/... protected by DPAPI or chmod 600
CUR->>CUR: Generate Ed25519 identity key pair
Note right of CUR: Private key stored in user context and DPAPI-protected
SYS->>Server: Enrollment request (installer code, public key, fingerprint)
CUR->>Server: Enrollment request (installer code, public key, fingerprint)
Server->>Operator: Prompt for enrollment approval
Operator-->>Server: Approve device enrollment
Note over Operator,Server: Manual approval blocks rogue agents
Server-->>SYS: Send enrollment nonce
Server-->>CUR: Send enrollment nonce
SYS->>Server: Return signed nonce to prove key possession
CUR->>Server: Return signed nonce
Note over Server,Operator: Server verifies signatures and records GUID plus key fingerprint
Server->>SYS: Issue GUID, short-lived token, refresh token, server cert, script-signing key
Server->>CUR: Issue GUID, short-lived token, refresh token, server cert, script-signing key
Note over SYS,Server: Agent pins cert, stores GUID, DPAPI-encrypts refresh token
Note over CUR,Server: Agent stores GUID, pins cert, encrypts refresh token
Note over Server,Operator: Database keeps refresh token hash, key fingerprint, audit trail
loop Secure Sessions
SYS->>Server: REST heartbeat and job polling with Bearer token
CUR->>Server: REST heartbeat and WebSocket connect with Bearer token
Server-->>SYS: Provide new access token before expiry
Server-->>CUR: Provide new access token before expiry
SYS->>Server: Refresh request over pinned TLS
CUR->>Server: Refresh request over pinned TLS
end
Server-->>SYS: Deliver script payload plus Ed25519 signature
SYS->>SYS: Verify signature before execution
Server-->>CUR: Deliver script payload plus Ed25519 signature
CUR->>CUR: Verify signature and reject tampered content
Note over SYS,CUR: Signature failure triggers re-enrollment and detailed logging
Note over Server,Operator: Persistent records and approvals sustain long term trust
```
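
The nonce exchange in the diagram above relies on each nonce being usable exactly once. A minimal server-side sketch of that property follows. The `NonceStore` class, the 120-second `NONCE_TTL`, and the injected `verify` callable are assumptions; in a real deployment `verify` would be the Ed25519 check against the public key submitted at enrollment, which is kept out of this sketch so it stays standard-library only.

```python
import secrets
import time

NONCE_TTL = 120.0  # seconds a nonce stays valid (assumed value)


class NonceStore:
    """Issue single-use enrollment nonces bound to a device fingerprint."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._pending = {}  # nonce -> (fingerprint, expiry timestamp)

    def issue(self, fingerprint):
        nonce = secrets.token_hex(32)
        self._pending[nonce] = (fingerprint, self._clock() + NONCE_TTL)
        return nonce

    def consume(self, nonce, fingerprint, signature, verify):
        """Accept a signed nonce at most once.

        `verify(fingerprint, message_bytes, signature) -> bool` is injected by
        the caller and stands in for the real signature check.
        """
        # Pop first so that any attempt, valid or not, burns the nonce.
        entry = self._pending.pop(nonce, None)
        if entry is None:
            return False  # unknown or already used: replay is rejected
        bound_fingerprint, expires = entry
        if bound_fingerprint != fingerprint or self._clock() >= expires:
            return False
        return bool(verify(fingerprint, nonce.encode("ascii"), signature))
```

Removing the nonce before verification means a failed or replayed attempt cannot be retried with the same value; the agent would have to request a fresh nonce.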
### Code-Signed Remote Script Execution
```mermaid
sequenceDiagram
participant Operator
participant Server
participant SYS as "SYSTEM Agent"
participant CUR as "CURRENTUSER Agent"
Operator->>Server: Upload or author script
Server->>Server: Store script and metadata on-disk
Operator->>Server: Request script execution on a specific device + execution context (NT AUTHORITY\SYSTEM or Current-User)
Server->>Server: Load Ed25519 code signing key from secure store
Server->>Server: Sign script hash and execution manifest (The Assembly)
Server->>Server: Enqueue job with signed payload for target agent (SYSTEM or CurrentUser)
Note over Server: Dispatch limited to enrolled agents with valid GUID + tokens
loop Agent job polling (pinned TLS + Bearer token)
SYS->>Server: REST heartbeat and job poll
CUR->>Server: REST heartbeat and job poll
Server-->>SYS: Pending job payloads
Server-->>CUR: Pending job payloads
end
alt SYSTEM context
Server-->>SYS: Script, signature, hash, execution parameters
SYS->>SYS: Verify TLS pinning and token freshness
SYS->>SYS: Verify Ed25519 signature using pinned server key
SYS->>SYS: Recalculate script hash and compare
Note right of SYS: Verification failure stops execution and logs incident
SYS->>SYS: Execute via SYSTEM scheduled-task runner
SYS-->>Server: Return execution status, output, telemetry
else CURRENTUSER context
Server-->>CUR: Script, signature, hash, execution parameters
CUR->>CUR: Verify TLS pinning and token freshness
CUR->>CUR: Verify Ed25519 signature using pinned server key
CUR->>CUR: Recalculate script hash and compare
Note right of CUR: Validation failure stops execution and logs incident
CUR->>CUR: Execute within interactive PowerShell host
CUR-->>Server: Return execution status, output, telemetry
end
Server->>Server: Record results and logs alongside job metadata
Note over SYS,CUR: Pinned TLS, signed payloads, and DPAPI-protected secrets defend against tampering and replay
```
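
The agent-side checks in the diagram above (verify the signature, recompute the script hash, execute only if both pass) can be sketched as follows. Several details are assumptions: SHA-256 as the hash, a signature that covers the hex digest, and the function names `verify_job` and `run_if_trusted`. The Ed25519 check itself is passed in as `verify_signature`, since the standard library has no Ed25519 implementation.

```python
import hashlib
import hmac


def verify_job(script, claimed_hash, signature, verify_signature):
    """Return True only if the signature and the script hash both check out.

    `verify_signature(message_bytes, signature) -> bool` is injected and stands
    in for verification against the pinned server signing key.
    """
    try:
        claimed = claimed_hash.encode("ascii")
    except UnicodeEncodeError:
        return False  # a hex digest is always ASCII
    if not verify_signature(claimed, signature):
        return False
    # Recompute the hash locally and compare in constant time.
    actual = hashlib.sha256(script).hexdigest().encode("ascii")
    return hmac.compare_digest(actual, claimed)


def run_if_trusted(script, claimed_hash, signature, verify_signature, execute):
    """Execute the payload only after verification; otherwise drop it."""
    if not verify_job(script, claimed_hash, signature, verify_signature):
        return None  # caller would log the incident here
    return execute(script)
```

Checking the signature before hashing means an unsigned or forged manifest is rejected without the agent doing any further work on the payload.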