Overhaul of VPN Codebase

2025-12-18 01:35:03 -07:00
parent 2f81061a1b
commit 6ceb59f717
56 changed files with 1786 additions and 4778 deletions


@@ -42,8 +42,8 @@ At each milestone: pause, run the listed checks, talk to the operator, and commi
- [x] Do not start any tunnel yet.
- Linux: do nothing yet (see later section).
- Checkpoint tests:
- [x] WireGuard binaries available in agent runtime.
- [x] WireGuard driver installed and visible.
- [ ] WireGuard binaries available in agent runtime.
- [ ] WireGuard driver installed and visible.
### 2) Engine VPN Server & ACLs — Milestone: Engine VPN Server & ACLs (Windows)
- Agents editing this document should mark tasks they complete with `[x]` (leave `[ ]` otherwise).
@@ -54,15 +54,15 @@ At each milestone: pause, run the listed checks, talk to the operator, and commi
- [x] Do not push DNS or LAN routes; host-only reachability engine IP ↔ agent virtual /32.
- ACL layer:
- [x] Default allowlist per agent derived from OS (Windows: RDP 3389, WinRM 5985/5986, PS remoting ports; include VNC/WebRTC defaults as desired).
- [x] Allow operator overrides per agent; enforce at engine firewall layer. (rule plans produced; application wiring pending)
- [x] Allow operator overrides per agent; enforce at engine firewall layer.
- Keys/Certs:
- [x] Prefer reusing existing Engine cert infrastructure for signing orchestration tokens. Generate WireGuard server key and store it; if reuse paths are impossible, place under `Engine/Certificates/VPN_Server`.
- [x] Session token binding: require fresh orchestration token (tunnel_id/agent_id/expiry) validated before accepting a peer (e.g., via pre-shared keys or control-plane validation before adding peer).
- Logging: server logs to `Engine/Logs/reverse_tunnel.log` (or renamed consistently). [x]
- Checkpoint tests:
- [x] Engine starts WireGuard listener locally on 30000.
- [x] Only engine IP reachable; client-to-client blocked.
- [x] Peers without valid token/key are rejected.
- [ ] Engine starts WireGuard listener locally on 30000.
- [ ] Only engine IP reachable; client-to-client blocked.
- [ ] Peers without valid token/key are rejected.
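
The session token binding described above (tunnel_id/agent_id/expiry, validated before a peer is accepted) could be sketched roughly as follows. This is a minimal illustration, not the Engine's implementation: it uses an HMAC-SHA256 shared secret in place of the Engine cert infrastructure the plan prefers, and the names `issue_token`, `verify_token`, and the token layout are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def issue_token(secret: bytes, agent_id: str, tunnel_id: str,
                ttl_seconds: int = 60, now: Optional[float] = None) -> str:
    """Return a signed token binding agent_id, tunnel_id, and an expiry time."""
    issued = time.time() if now is None else now
    payload = json.dumps(
        {"agent_id": agent_id, "tunnel_id": tunnel_id,
         "exp": int(issued) + ttl_seconds},
        sort_keys=True, separators=(",", ":"),
    ).encode()
    # Assumption: HMAC stands in for whatever signing the Engine actually uses.
    sig = hmac.new(secret, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode() + "."
            + base64.urlsafe_b64encode(sig).decode())


def verify_token(secret: bytes, token: str, agent_id: str, tunnel_id: str,
                 now: Optional[float] = None) -> bool:
    """Accept only an unexpired token whose signature and bindings match."""
    try:
        payload_b64, sig_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        sig = base64.urlsafe_b64decode(sig_b64)
    except ValueError:
        # Malformed token: wrong number of parts or invalid base64.
        return False
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(sig, expected):
        return False
    claims = json.loads(payload)
    current = time.time() if now is None else now
    return (claims.get("agent_id") == agent_id
            and claims.get("tunnel_id") == tunnel_id
            and current < claims.get("exp", 0))
```

In this sketch the control plane would call `verify_token` before adding the WireGuard peer, so a stale or mismatched token never results in a peer entry.
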
### 3) Agent VPN Client & Lifecycle — Milestone: Agent VPN Client & Lifecycle (Windows)
- Agents editing this document should mark tasks they complete with `[x]` (leave `[ ]` otherwise).
@@ -83,60 +83,64 @@ At each milestone: pause, run the listed checks, talk to the operator, and commi
- [ ] Idle timeout fires at ~15 minutes of inactivity.
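
The idle-timeout check above could be exercised against a small tracker like the one below. It is a sketch under assumptions: the class name `IdleTracker` and its injectable clock are hypothetical, and only the 15-minute threshold comes from the plan.

```python
import time
from typing import Callable

IDLE_SECONDS = 900  # 15 minutes, per the checkpoint above


class IdleTracker:
    """Tracks the last activity time and reports when the idle limit is hit."""

    def __init__(self, idle_seconds: float = IDLE_SECONDS,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._idle_seconds = idle_seconds
        self._clock = clock  # injectable so tests do not need to sleep
        self._last_activity = clock()

    def touch(self) -> None:
        """Record activity (for example, operator traffic over the tunnel)."""
        self._last_activity = self._clock()

    def is_idle(self) -> bool:
        """True once idle_seconds have elapsed since the last activity."""
        return self._clock() - self._last_activity >= self._idle_seconds
```

A monotonic clock is used by default so that wall-clock adjustments on the host do not trigger or delay the teardown.
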
### 4) API & Service Orchestration — Milestone: API & Service Orchestration (Windows)
- Replace legacy tunnel APIs with:
- `POST /api/tunnel/connect` → tunnel_id, token, WG client config (keys, endpoint, allowed IPs), virtual IP, idle_seconds (900).
- `GET /api/tunnel/status` → up/down, virtual IP, connected operators.
- `DELETE /api/tunnel/disconnect` → immediate teardown and lease release.
- Engine orchestrator:
- Manages single tunnel per agent; tracks tunnel_id, virtual IP, token expiry.
- Emits start/stop signals to agent (rename events as needed).
- Cleans peer/routing state on stop.
- Token issuance: short-lived, binds agent_id/tunnel_id/port/expiry; validated before adding peer.
- Remove domain limits; remove channel/protocol handler registry for tunnels.
- Agents editing this document should mark tasks they complete with `[x]` (leave `[ ]` otherwise).
- [x] Replace legacy tunnel APIs with:
- [x] `POST /api/tunnel/connect` → tunnel_id, token, WG client config (keys, endpoint, allowed IPs), virtual IP, idle_seconds (900).
- [x] `GET /api/tunnel/status` → up/down, virtual IP, connected operators.
- [x] `DELETE /api/tunnel/disconnect` → immediate teardown and lease release.
- [x] Engine orchestrator:
- [x] Manages single tunnel per agent; tracks tunnel_id, virtual IP, token expiry.
- [x] Emits start/stop signals to agent (rename events as needed).
- [x] Cleans peer/routing state on stop.
- [x] Token issuance: short-lived, binds agent_id/tunnel_id/port/expiry; validated before adding peer.
- [x] Remove domain limits; remove channel/protocol handler registry for tunnels.
- Checkpoint tests:
- API happy path: connect → status → disconnect.
- Reject stale/second connect for same agent while active.
- [ ] API happy path: connect → status → disconnect.
- [ ] Reject stale/second connect for same agent while active.
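
The orchestrator rules above (one tunnel per agent, connect → status → disconnect, reject a second connect while active) can be illustrated with an in-memory registry. This is a hedged sketch only: `TunnelRegistry`, `Tunnel`, and `TunnelAlreadyActive` are hypothetical names, virtual IP allocation is left to the caller, and the real Engine would also manage WireGuard peers and tokens.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, Optional


class TunnelAlreadyActive(Exception):
    """Raised when an agent already has an active tunnel."""


@dataclass
class Tunnel:
    tunnel_id: str
    agent_id: str
    virtual_ip: str


class TunnelRegistry:
    """Tracks at most one active tunnel per agent."""

    def __init__(self) -> None:
        self._by_agent: Dict[str, Tunnel] = {}

    def connect(self, agent_id: str, virtual_ip: str) -> Tunnel:
        # Reject a second connect for the same agent while one is active.
        if agent_id in self._by_agent:
            raise TunnelAlreadyActive(agent_id)
        tunnel = Tunnel(uuid.uuid4().hex, agent_id, virtual_ip)
        self._by_agent[agent_id] = tunnel
        return tunnel

    def status(self, agent_id: str) -> Dict[str, Optional[str]]:
        tunnel = self._by_agent.get(agent_id)
        if tunnel is None:
            return {"status": "down", "virtual_ip": None}
        return {"status": "up", "virtual_ip": tunnel.virtual_ip}

    def disconnect(self, agent_id: str) -> bool:
        # Releases the lease; returns False if nothing was active.
        return self._by_agent.pop(agent_id, None) is not None
```

Each method maps loosely onto one of the three endpoints listed above, which keeps the happy-path checkpoint easy to test without a running WireGuard interface.
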
### 5) UI Advanced Config & Operator Flow (PowerShell MVP) — Milestone: UI Advanced Config & Operator Flow (Windows, PowerShell MVP)
- In `Data/Engine/web-interface/src/Devices/Device_Details.jsx`, add “Advanced Config” tab:
- “Reverse VPN Tunnel - Allowed Ports” with toggles per protocol.
- Defaults by OS (Windows: RDP/WinRM/PS; All: VNC/WebRTC; allow operator overrides).
- PowerShell MVP:
- Reuse `Data/Engine/web-interface/src/Devices/ReverseTunnel/Powershell.jsx` as the base UI.
- Rewire to new APIs and virtual IP flow.
- Keep live web terminal behavior (WebSocket or equivalent) so operator input streams to remote PowerShell and outputs stream back in real time over the VPN tunnel.
- Ensure tunnel is up via `/api/tunnel/connect/status` before opening the terminal; call `/api/tunnel/disconnect` on exit/tab close.
- Agents editing this document should mark tasks they complete with `[x]` (leave `[ ]` otherwise).
- [x] In `Data/Engine/web-interface/src/Devices/Device_Details.jsx`, add “Advanced Config” tab:
- [x] “Reverse VPN Tunnel - Allowed Ports” with toggles per protocol.
- [x] Defaults by OS (Windows: RDP/WinRM/PS; All: VNC/WebRTC; allow operator overrides).
- [x] PowerShell MVP:
- [x] Reuse `Data/Engine/web-interface/src/Devices/ReverseTunnel/Powershell.jsx` as the base UI.
- [x] Rewire to new APIs and virtual IP flow.
- [x] Keep live web terminal behavior (WebSocket or equivalent) so operator input streams to remote PowerShell and outputs stream back in real time over the VPN tunnel.
- [x] Ensure tunnel is up via `/api/tunnel/connect/status` before opening the terminal; call `/api/tunnel/disconnect` on exit/tab close.
- Later protocols (RDP/SSH/etc.) can follow once MVP is proven, but do not block on them for this milestone.
- Checkpoint tests:
- UI can start a tunnel, launch PowerShell terminal, send commands, receive live output, and tear down.
- Toggles change ACL behavior (engine→agent reachability) as expected.
- [ ] UI can start a tunnel, launch PowerShell terminal, send commands, receive live output, and tear down.
- [ ] Toggles change ACL behavior (engine→agent reachability) as expected.
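
One way the per-protocol toggles above might resolve into an allowed-port list for the engine firewall is sketched below. The function and table names are hypothetical; the RDP and WinRM port numbers are the ones given earlier in this plan, and VNC/WebRTC are omitted because the plan does not specify their ports.

```python
from typing import Dict, List, Optional

# Ports taken from the plan's Windows defaults (RDP 3389, WinRM 5985/5986).
PROTOCOL_PORTS = {
    "rdp": (3389,),
    "winrm": (5985, 5986),
}

# Protocols enabled by default per OS; unknown OSes start with nothing allowed.
OS_DEFAULT_PROTOCOLS = {
    "windows": ("rdp", "winrm"),
}


def allowed_ports(os_name: str,
                  toggles: Optional[Dict[str, bool]] = None) -> List[int]:
    """Combine OS defaults with operator toggles into a sorted port list."""
    enabled = {p: True for p in OS_DEFAULT_PROTOCOLS.get(os_name.lower(), ())}
    enabled.update(toggles or {})  # operator overrides win over defaults
    ports = set()
    for protocol, is_on in enabled.items():
        if protocol not in PROTOCOL_PORTS:
            raise ValueError(f"unknown protocol: {protocol}")
        if is_on:
            ports.update(PROTOCOL_PORTS[protocol])
    return sorted(ports)
```

For example, `allowed_ports("windows", {"rdp": False})` leaves only the WinRM ports, which is the kind of change the second checkpoint test is meant to observe.
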
### 6) Legacy Tunnel Removal & Cleanup — Milestone: Legacy Tunnel Removal & Cleanup (Windows)
- Remove/retire:
- Engine `reverse_tunnel_orchestrator` and domain handlers under `Data/Engine/services/WebSocket/Agent/Reverse_Tunnels/`.
- Agent `role_ReverseTunnel.py` and protocol handlers.
- WebUI components tied to the old Socket.IO tunnel namespace.
- Update docs and references to point to the new WireGuard VPN flow; keep change log entries.
- Ensure no lingering domain limits/config knobs remain.
- Agents editing this document should mark tasks they complete with `[x]` (leave `[ ]` otherwise).
- [x] Remove/retire:
- [x] Engine `reverse_tunnel_orchestrator` and domain handlers under `Data/Engine/services/WebSocket/Agent/Reverse_Tunnels/`.
- [x] Agent `role_ReverseTunnel.py` and protocol handlers.
- [x] WebUI components tied to the old Socket.IO tunnel namespace.
- [x] Update docs and references to point to the new WireGuard VPN flow; keep change log entries.
- [x] Ensure no lingering domain limits/config knobs remain.
- Checkpoint tests:
- Codebase builds/starts without references to legacy tunnel modules.
- UI no longer calls old APIs or Socket.IO tunnel namespace.
- [ ] Codebase builds/starts without references to legacy tunnel modules.
- [ ] UI no longer calls old APIs or Socket.IO tunnel namespace.
### 7) End-to-End Validation — Milestone: End-to-End Validation (Windows)
- Agents editing this document should mark tasks they complete with `[x]` (leave `[ ]` otherwise).
- Functional:
- Windows agent: WireGuard connect on port 30000; PowerShell MVP fully live in the web terminal; RDP/WinRM reachable over tunnel as configured.
- Idle timeout at 15 minutes; operator disconnect stops tunnel immediately.
- [ ] Windows agent: WireGuard connect on port 30000; PowerShell MVP fully live in the web terminal; RDP/WinRM reachable over tunnel as configured.
- [ ] Idle timeout at 15 minutes; operator disconnect stops tunnel immediately.
- Security:
- Client-to-client blocked.
- Only engine IP reachable; per-agent ACL enforces allowed ports.
- Token enforcement blocks stale/unauthorized sessions.
- [ ] Client-to-client blocked.
- [ ] Only engine IP reachable; per-agent ACL enforces allowed ports.
- [ ] Token enforcement blocks stale/unauthorized sessions.
- Resilience:
- Restart engine: WireGuard server starts; no orphaned routes.
- Restart agent: adapter persists; tunnel stays down until requested.
- [ ] Restart engine: WireGuard server starts; no orphaned routes.
- [ ] Restart agent: adapter persists; tunnel stays down until requested.
- Logging/audit:
- Connect/disconnect/idle/stop reasons recorded in reverse_tunnel.log (Engine/Agent) and Device Activity.
- [ ] Connect/disconnect/idle/stop reasons recorded in reverse_tunnel.log (Engine/Agent) and Device Activity.
- Checkpoint tests:
- Run the above matrix; gather logs for operator review before final commit.
- [ ] Run the above matrix; gather logs for operator review before final commit.
## Linux (Deferred) — Do Not Implement Yet
- When greenlit, mirror the structure above for Linux: