Executive Summary
The prevailing architecture for securing Cardholder Data Environments (CDE) has long relied on the “defense-in-depth” model, necessitating multiple layers of rigid network segmentation, demilitarized zones (DMZs), and static firewall policies. While effective in theory, the operational reality of these architectures—specifically those utilizing complex “per-person Virtual Private Cloud (VPC)” isolation strategies accessed via nested VPNs—often results in a fragile, opaque, and difficult-to-audit infrastructure. The user’s current environment, characterized by an External Firewall gateway, an Internal Firewall protecting the CDE, and a cumbersome double-hop VPN mechanism, represents a classic “castle-and-moat” topology that is increasingly misaligned with modern threat landscapes and the dynamic requirements of PCI DSS v4.0.
This report presents a detailed architectural transformation plan to refactor this production environment into a “Dark CDE” using Zero Trust Network Access (ZTNA) principles. The primary objective is to replace the static reliance on network firewalls and the resource-intensive per-user VPC model with identity-centric, ephemeral, and cryptographically verified connections.
The proposed solution leverages OpenZiti as the core ZTNA overlay, chosen for its unique “outbound-only” architecture that allows the CDE to operate without any open inbound firewall ports, effectively rendering the environment invisible to the internet and the internal network. To replace the per-user VPC isolation, Apache Guacamole is introduced as a clientless, identity-aware session gateway, providing granular access to CDE resources (RDP/SSH) with mandated session recording. Keycloak serves as the centralized Identity Provider (IdP), ensuring strong authentication and Single Sign-On (SSO), while Wazuh acts as the Security Information and Event Management (SIEM) system, ingesting correlated logs from the network overlay, the session gateway, and the identity provider.
This analysis provides an exhaustive evaluation of open-source alternatives (including Headscale, NetBird, and Firezone), a deep-dive technical architecture, a comprehensive compliance mapping to PCI DSS v4.0, and a step-by-step implementation roadmap designed to eliminate vendor lock-in while maximizing security posture.
1.0 Current State Analysis: The Cost of Legacy Isolation
The security architecture currently in place relies on physical and virtual network segmentation to achieve isolation. While this approach technically satisfies historical compliance requirements, it introduces significant friction and hidden risks. To prescribe a ZTNA solution effectively, one must first deconstruct the limitations of the existing “double-hop” VPN and firewall model.
1.1 The “Castle-and-Moat” Topology
The current environment is bifurcated into two primary zones: the CDE (High Risk) and the “Rest of PCI” (Medium Risk), guarded by Internal and External firewalls.
- The External Firewall: Acts as the primary gateway, handling internet traffic and filtering access to the intermediate zone. It relies on IP-based Allow Lists (ACLs) to permit VPN connections.
- The Internal Firewall: Acts as the final sentry for the CDE. It must allow inbound traffic from the intermediate zone (specifically, the per-user VPCs) on specific management ports (SSH port 22, RDP port 3389).
Architectural Weakness 1: Inbound Port Dependency
The fundamental flaw in this traditional setup is the requirement for open inbound ports on the Internal Firewall. Regardless of how strictly the Source IP addresses are filtered, the Internal Firewall must listen for connection attempts. This creates a visible attack surface. If an attacker compromises a host in the intermediate zone (the “Rest of PCI” zone), they have network-line-of-sight to the CDE’s open ports. In a Zero Trust model, the goal is to eliminate this line of sight entirely.1
Architectural Weakness 2: Static Trust and Lateral Movement
Firewalls operate primarily at Layer 3 (Network) and Layer 4 (Transport). Once a packet clears the firewall based on IP and Port, the network implicitly trusts it. If a legitimate user’s laptop is compromised, or if an attacker gains control of a “per-person VPC,” the firewall cannot distinguish between the authorized user and the adversary using the same valid channel.
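The gap can be made concrete with a minimal sketch. Every address, rule, and user name below is hypothetical. The point is only that a Layer 3/4 decision has no input that separates the authorized administrator from an attacker operating from the same host, while an identity-aware decision does.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    dst_port: int


@dataclass(frozen=True)
class Session:
    packet: Packet
    user: Optional[str]   # identity asserted by the IdP, if any
    mfa_verified: bool


# Hypothetical Internal Firewall rule: per-user VPC subnet -> CDE host on SSH.
ALLOWED_SRC = ip_network("10.20.0.0/24")
ALLOWED_DST = ("10.30.0.10", 22)
AUTHORIZED_USERS = {"alice"}


def l4_firewall_allows(p: Packet) -> bool:
    """Layer 3/4 decision: only addresses and ports are visible."""
    return ip_address(p.src_ip) in ALLOWED_SRC and (p.dst_ip, p.dst_port) == ALLOWED_DST


def identity_aware_allows(s: Session) -> bool:
    """Zero Trust decision: the verified identity is part of the input."""
    return s.user in AUTHORIZED_USERS and s.mfa_verified


pkt = Packet("10.20.0.7", "10.30.0.10", 22)
admin = Session(pkt, user="alice", mfa_verified=True)
intruder = Session(pkt, user=None, mfa_verified=False)  # same host, no valid identity

print(l4_firewall_allows(admin.packet), l4_firewall_allows(intruder.packet))
print(identity_aware_allows(admin), identity_aware_allows(intruder))
```

Both sessions produce an identical packet, so the firewall function returns the same answer for each. Only the second function can tell them apart.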
1.2 The “Per-Person VPC” Anomaly
The user’s environment utilizes a unique and resource-intensive strategy: assigning separate, isolated VPC instances to individual users.
- Intent: The goal is clear—prevent lateral movement between administrators. If Admin A is compromised, the attacker is trapped in Admin A’s VPC and cannot jump to Admin B’s session.
- Operational Reality: This creates massive infrastructure bloat. For 50 administrators, the organization must manage, patch, monitor, and audit 50 separate VPCs/instances. This multiplies the surface area for configuration drift, which makes it hard to demonstrate compliance with PCI DSS Requirement 2.2 (secure configuration and management of system components).3
- Ephemeral Drift: Because these instances are likely spun up and down, ensuring that every instance sends logs to Wazuh and has the latest security patches becomes a logistical nightmare.
1.3 The Compliance Gap (PCI DSS v4.0)
The transition to PCI DSS v4.0 introduces stricter requirements that legacy VPNs struggle to meet without commercial add-ons:
- Requirement 8.4.2 (MFA for CDE Access): While the VPN likely has MFA, the internal hop to the CDE often relies on SSH keys or passwords alone. A ZTNA overlay can require MFA on every session request.
- Requirement 10.2.1 (Audit Logs): Correlating a user’s VPN session ID with their internal SSH activity across a jump host and a VPC is historically difficult. Logs are often fragmented.
2.0 Comprehensive Market Analysis of Open Source ZTNA Solutions
The requirement for “real open source” solutions devoid of commercial lock-in significantly narrows the field. Many “open source” ZTNA products operate on an “Open Core” model, where the agent is free, but the necessary enterprise features—Single Sign-On (SSO), Role-Based Access Control (RBAC), and Audit Logging—are locked behind SaaS subscriptions.
The following analysis compares five primary candidates against the specific needs of a High-Risk CDE: OpenZiti, Headscale (Tailscale), NetBird, Teleport Community, and Firezone.
2.1 Comparative Analysis Matrix
| Feature | OpenZiti | Headscale (Tailscale) | NetBird | Teleport (Community) | Firezone |
| Architecture | Overlay / App-Embedded | WireGuard Mesh | WireGuard Mesh | Identity-Aware Proxy | WireGuard VPN |
| License | Apache 2.0 (Full FOSS) | BSD-3 (FOSS) | BSD-3 (FOSS) | Apache 2.0 (Limited) | Apache 2.0 (Legacy Only) |
| Outbound-Only CDE | Yes (Native) | Partial (Via DERP) | Yes (Relays) | Yes (Reverse Tunnel) | No (Inbound required) |
| SSO Support | Full (OIDC/Ext-JWT) | Full (OIDC) | Full (OIDC) | GitHub Only | OIDC (Legacy) |
| RBAC Granularity | Service/Identity Level | IP/Port ACLs | Peer Groups | None (Enterprise Only) | Group-based |
| Wazuh Compatibility | JSON Logs | JSON Logs | Events/JSON | Audit Log (JSON) | Syslog |
| Self-Hosted Maturity | High | Medium (Reverse Eng.) | High | Low (Community limits) | End of Life (Legacy) |
2.2 Candidate Evaluation
2.2.1 OpenZiti: The Selected Platform
OpenZiti is the premier choice for this architecture due to its fundamental design as an overlay network rather than just a VPN.
- Why it wins for CDE: OpenZiti supports a strict “dark” architecture. The Edge Router inside the CDE initiates an outbound connection to the Controller/Fabric. This allows the organization to block 100% of inbound connections at the Internal Firewall, satisfying even the strictest interpretation of network segmentation.2
- Granularity: Unlike WireGuard-based solutions that route IP packets, OpenZiti routes “Services.” A user is granted access to tcp:cde-database:5432, not 192.168.1.50. This prevents Nmap scanning of the subnet; from the user’s device, the CDE network is not routable at all.5
- No Vendor Lock-in: The open-source version is feature-complete, supporting MFA, complex RBAC (Service Policies), and high-availability clustering without a license key.
2.2.2 Headscale: The Strong Alternative
Headscale is an open-source implementation of the Tailscale coordination server.
- Strengths: It allows the use of standard Tailscale clients (which are polished and stable) without paying Tailscale Inc. It supports OIDC for SSO.
- Weaknesses for CDE: Tailscale relies on Access Control Lists (ACLs) that manage traffic between IPs. While effective, managing ACLs for hundreds of micro-services can become cumbersome (“ACL Hell”) compared to Ziti’s object-oriented policy model.5 Furthermore, Headscale is a reverse-engineered project; it may lag behind official client features or break with client updates.
- Verdict: A viable backup if OpenZiti’s complexity proves too high, but less “secure-by-design” for CDEs due to its reliance on network-layer routing.
2.2.3 NetBird: The User-Friendly Mesh
NetBird offers a slick UI and kernel-level WireGuard performance.
- Strengths: Easier to set up than Headscale. Good performance.
- Weaknesses: While the agent is open source, the management platform’s advanced features (granular events, complex posture checks) are often prioritized for their cloud offering. The self-hosted version is capable but the “per-person VPC” replacement requires more than just connectivity; it requires application-layer isolation which NetBird (Layer 3/4) handles less natively than Ziti (Layer 4/7).8
2.2.4 Teleport Community: The “Trap”
Teleport is often cited as the gold standard for ZTNA, but its Community Edition is unsuitable for this specific request.
- Critical Failure: The open-source version restricts SSO to GitHub only. It does not support generic OIDC (Keycloak) or SAML, which is a requirement for avoiding vendor lock-in.10
- RBAC Limitation: The Community Edition lacks true Role-Based Access Control. Users effectively have full access or no access, which violates the PCI DSS “Least Privilege” principle.12
2.2.5 Firezone: The Deprecated
Firezone recently moved to a SaaS-centric 1.0 architecture. The legacy self-hosted version is no longer actively supported for enterprise use cases. Using it would introduce significant technical debt and security risk.14
3.0 Strategic Architecture: The “Dark CDE”
The proposed architecture dismantles the legacy “Jump Host -> VPC -> CDE” chain and replaces it with a Zero Trust Overlay combined with an Identity-Aware Session Proxy.
3.1 Architectural Principles
- Outbound-Only Connectivity: The CDE must not accept any connection initiation from the outside.
- Identity Before Connectivity: No packet flows to the CDE until the user is authenticated and authorized.
- Ephemeral Access: Access is granted for the duration of the session only.
- Consolidated Audit: All access logs are centralized.
3.2 Component Topology
The architecture is divided into three logical zones:
Zone A: The External Trust Zone (DMZ)
- Role: Replaces the function of the “External Firewall” inbound rules.
- Components:
- OpenZiti Controller: The brain of the network. Holds the Certificate Authority (CA), Policies, and Identity database.
- OpenZiti Public Edge Router: The entry point. Listens on TCP/8442 (multiplexed) for encrypted tunnel connections from Users and from the CDE.
- Keycloak (IdP): The source of truth for user identity. Handles MFA (TOTP/WebAuthn).
- External Firewall Configuration: Allows inbound HTTPS (443) and Ziti Control (8440-8442) only to these specific hosts.
Zone B: The “Dark” CDE (Internal Zone)
- Role: Hosts the sensitive PCI data.
- Components:
- OpenZiti Private Edge Router: A software router installed on a VM inside the CDE. It has no inbound ports. It establishes a persistent outbound TLS connection to the Public Edge Router in Zone A.
- Apache Guacamole: The session gateway. It sits on the CDE network, accessible only via the Ziti overlay.
- Target Systems: Databases, App Servers (unchanged).
- Internal Firewall Configuration: Block All Inbound. Allow Outbound TCP to Zone A IPs (Ziti Router/Controller) only. This achieves the “Air Gap” simulation.1
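The Internal Firewall policy for Zone B reduces to an invariant that can be checked mechanically: no inbound allow rules, and outbound allow rules only toward Zone A. The sketch below assumes a simplified, hypothetical rule model and placeholder Zone A addresses. It is not tied to any firewall vendor's export format.

```python
from dataclasses import dataclass
from ipaddress import ip_network
from typing import List


@dataclass(frozen=True)
class Rule:
    direction: str   # "inbound" or "outbound"
    action: str      # "allow" or "deny"
    remote: str      # CIDR of the peer outside the CDE
    port: int


# Hypothetical Zone A addresses for the Ziti Controller and Public Edge Router.
ZONE_A = [ip_network("203.0.113.10/32"), ip_network("203.0.113.11/32")]


def violations(rules: List[Rule]) -> List[str]:
    """Return every allow rule that breaks the 'dark CDE' invariant."""
    problems = []
    for r in rules:
        if r.action != "allow":
            continue
        if r.direction == "inbound":
            problems.append(f"inbound allow on port {r.port} from {r.remote}")
        elif not any(ip_network(r.remote).subnet_of(z) for z in ZONE_A):
            problems.append(f"outbound allow to non-Zone-A peer {r.remote}")
    return problems


# Target state: outbound only, toward Zone A only (ports are illustrative).
dark = [
    Rule("outbound", "allow", "203.0.113.10/32", 8440),
    Rule("outbound", "allow", "203.0.113.11/32", 8442),
]
# Legacy state: the same rules plus an inbound SSH allowance from the VPC subnet.
legacy = dark + [Rule("inbound", "allow", "10.20.0.0/24", 22)]

print(violations(dark))
print(violations(legacy))
```

A check of this kind could run against each proposed firewall change so that a reintroduced inbound rule is caught before deployment.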
Zone C: The User Plane (Internet/Remote)
- Role: The location of the remote workers.
- Components:
- Ziti Desktop Edge (Client): Installed on user laptops.
- Ziti BrowZer (Clientless): An alternative for users who cannot install software. Loads the Ziti SDK into the browser memory to dial the CDE securely.16
3.3 The Replacement of “Per-Person VPCs”: Apache Guacamole
The user’s original setup used individual VPCs to isolate user sessions. This is expensive and complex. Apache Guacamole replaces this by providing logical isolation at the session layer.
- Mechanism: Guacamole is a protocol proxy. It renders the remote desktop (RDP/VNC) or terminal (SSH) into HTML5 canvas data sent to the user’s browser.
- Isolation: The user never has a direct TCP connection to the target server. They only talk to Guacamole. If the user’s laptop is compromised, the attacker cannot scan the CDE network because there is no network bridge—only a visual stream.
- Forensics: Guacamole records the session (video/text). This is superior to VPC logs because it captures intent and visual output, satisfying PCI DSS’s strict auditing requirements.17
4.0 Detailed Technical Implementation Plan
Phase 1: Identity & Trust Foundation
Objective: Establish the control plane without disrupting current operations.
- Deploy Keycloak (Identity Provider):
- Install Keycloak on a hardened Linux instance in the External Zone.
- Create a Realm PCI_Prod.
- Configure MFA (TOTP) as mandatory for all users (PCI DSS Req 8.4.2).
- Create OIDC Client openziti-controller with confidential access type.
- Deploy OpenZiti Controller:
- Install the Controller in the External Zone.
- Initialize the PKI infrastructure.
- Configure the OIDC authentication provider (an external JWT signer, in OpenZiti terms) to trust the Keycloak endpoint.19
- Deploy Public Edge Router:
- Install on a separate host in the External Zone.
- Enroll with the Controller.
- Configure the firewall to allow TCP 8442 (Edge connections) from 0.0.0.0/0.
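When pointing the Controller at Keycloak, the values that usually matter are the realm's issuer URL and its JWKS endpoint. The sketch below derives them for the PCI_Prod realm. The hostname is a placeholder, and the path layout is the one used by recent Keycloak releases (older WildFly-based releases add an /auth prefix), so the result should be confirmed against the realm's own discovery document.

```python
from typing import Dict
from urllib.parse import quote


def keycloak_oidc_endpoints(base_url: str, realm: str) -> Dict[str, str]:
    """Build the OIDC URLs for a Keycloak realm (path layout is an assumption)."""
    issuer = f"{base_url.rstrip('/')}/realms/{quote(realm, safe='')}"
    return {
        "issuer": issuer,
        "discovery": f"{issuer}/.well-known/openid-configuration",
        "jwks_uri": f"{issuer}/protocol/openid-connect/certs",
    }


# keycloak.example.com is a hypothetical hostname for the External Zone instance.
endpoints = keycloak_oidc_endpoints("https://keycloak.example.com/", "PCI_Prod")
for name, url in endpoints.items():
    print(f"{name}: {url}")
```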
Phase 2: The “Darkening” Agent
Objective: Connect the CDE without opening holes.
- Deploy Private Edge Router (CDE):
- Provision a VM inside the CDE.
- Install the OpenZiti Router.
- Critical Configuration: Set link.listeners to [] (empty) so the router accepts no inbound links, and keep a link.dialers entry so it dials out to the Public Edge Router in Zone A.
- Enroll the router using a one-time token (ziti router enroll).
- Verification: Check the Controller logs. You should see the CDE router come online over a link it dialed to the Public Edge Router.
- Deploy Apache Guacamole:
- Install guacd and Tomcat on a CDE server.
- Configure Guacamole to use OpenID Connect (Keycloak) for authentication.20 This ensures users log in to Guacamole with the same credentials as the network overlay.
- Storage: Mount a secure, encrypted volume at /var/lib/guacamole/recordings for session recordings.
Phase 3: Service Definition & Policy
Objective: Define who can access what.
In OpenZiti, network access is defined by logical Policies, not IP addresses.
- Create Identities:
- Map Keycloak users to Ziti Identities.
- Assign Attribute: #cde-admins.
- Create Service:
- Name: cde-guacamole.
- Host Config: Forward traffic to guacamole-server-ip:8080.
- Create Service Policies:
- Bind Policy: Allow @private-cde-router to Host cde-guacamole.
- Dial Policy: Allow #cde-admins to Dial cde-guacamole.
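The Bind and Dial policies can be read as a default-deny lookup: an identity may dial or host a service only if some policy of that kind names the service and selects the identity, either by attribute (#) or by name (@). The following is a toy model of that reading, not OpenZiti's implementation. The class names and the identities alice and bob are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Identity:
    name: str
    attributes: FrozenSet[str] = frozenset()


@dataclass(frozen=True)
class ServicePolicy:
    kind: str                         # "Dial" or "Bind"
    identity_roles: Tuple[str, ...]   # "#attribute" or "@name" selectors
    service_roles: Tuple[str, ...]    # "@service-name" selectors


def matches(identity: Identity, selector: str) -> bool:
    """'#x' selects by role attribute, '@x' selects by exact name."""
    if selector.startswith("#"):
        return selector[1:] in identity.attributes
    if selector.startswith("@"):
        return selector[1:] == identity.name
    return False


def permitted(identity: Identity, kind: str, service: str,
              policies: List[ServicePolicy]) -> bool:
    """Default deny: access exists only if a policy grants it."""
    return any(
        p.kind == kind
        and f"@{service}" in p.service_roles
        and any(matches(identity, s) for s in p.identity_roles)
        for p in policies
    )


POLICIES = [
    ServicePolicy("Bind", ("@private-cde-router",), ("@cde-guacamole",)),
    ServicePolicy("Dial", ("#cde-admins",), ("@cde-guacamole",)),
]

alice = Identity("alice", frozenset({"cde-admins"}))
bob = Identity("bob")
router = Identity("private-cde-router")

print(permitted(alice, "Dial", "cde-guacamole", POLICIES))
print(permitted(bob, "Dial", "cde-guacamole", POLICIES))
print(permitted(router, "Bind", "cde-guacamole", POLICIES))
```

Nothing in the model refers to an IP address or subnet. Removing the cde-admins attribute from an identity revokes access without any firewall change.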
Phase 4: Integration with Wazuh
Objective: Full observability.
The constraint requires full logging. We must capture three distinct layers.
Layer 1: Identity Logs (Keycloak)
- Mechanism: Syslog forwarding.
- Wazuh Config:
XML
<remote>
<connection>syslog</connection>
<port>514</port>
<allowed-ips>KEYCLOAK_IP</allowed-ips>
</remote>
- Decoder: Use Wazuh’s built-in json decoder for Keycloak’s structured logs. Track LOGIN, LOGIN_ERROR, LOGOUT.
Layer 2: Network Overlay Logs (OpenZiti)
- Mechanism: Filebeat or Wazuh Agent reading JSON logs.
- Source: The Ziti Controller emits structured logs for every Session Create/Delete.
- Decoder (Custom):
XML
<decoder name="openziti">
<prematch>^{"file":</prematch>
<plugin_decoder>JSON_Decoder</plugin_decoder>
</decoder>
- Rules: Alert on event_type: auth.failed and event_type: session.create.
Layer 3: Session Logs (Guacamole)
- Mechanism: Guacamole logs connection events to syslog/catalina.out.
- Decoder (Custom):
XML
<decoder name="guacamole">
<program_name>guacd</program_name>
</decoder>
<decoder name="guacamole-connect">
<parent>guacamole</parent>
<regex>User "(\w+)" joined connection</regex>
<order>user</order>
</decoder>
- Non-Repudiation: Configure Wazuh File Integrity Monitoring (FIM) to watch the recording directory.
XML
<syscheck>
<directories check_all="yes" realtime="yes">/var/lib/guacamole/recordings</directories>
</syscheck>
This generates an alert whenever a recording file is created, modified, or deleted, creating a tamper-evident timeline of evidence.
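Once all three layers reach the SIEM, the value comes from joining them per user. The sketch below shows one way that correlation could work on already-decoded events. The field names (layer, ts, user, event) and the sample records are hypothetical, and real field names depend on each product's log format and the decoders in use.

```python
from datetime import datetime, timedelta

# Hypothetical decoded events, one dict per log line.
EVENTS = [
    {"layer": "keycloak", "ts": "2024-05-01T09:00:02", "user": "alice", "event": "LOGIN"},
    {"layer": "openziti", "ts": "2024-05-01T09:00:05", "user": "alice", "event": "session.create"},
    {"layer": "guacamole", "ts": "2024-05-01T09:00:09", "user": "alice", "event": "joined connection"},
    {"layer": "guacamole", "ts": "2024-05-01T09:30:00", "user": "mallory", "event": "joined connection"},
]


def timeline(events, user):
    """All events for one user across the three layers, oldest first."""
    own = [e for e in events if e["user"] == user]
    return sorted(own, key=lambda e: datetime.fromisoformat(e["ts"]))


def uncorrelated_sessions(events, window=timedelta(minutes=5)):
    """Gateway sessions with no IdP login by the same user shortly before."""
    logins = [
        (e["user"], datetime.fromisoformat(e["ts"]))
        for e in events
        if e["layer"] == "keycloak" and e["event"] == "LOGIN"
    ]
    flagged = []
    for e in events:
        if e["layer"] != "guacamole":
            continue
        t = datetime.fromisoformat(e["ts"])
        has_login = any(
            u == e["user"] and timedelta(0) <= t - login_time <= window
            for u, login_time in logins
        )
        if not has_login:
            flagged.append(e)
    return flagged


print([e["layer"] for e in timeline(EVENTS, "alice")])
print([e["user"] for e in uncorrelated_sessions(EVENTS)])
```

A gateway session that cannot be tied back to an identity-provider login is the kind of anomaly worth alerting on. The five-minute window is an arbitrary illustration and would need tuning.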
Phase 5: The Cutover (Removing the Moat)
- Pilot: Migrate 10% of users to Ziti+Guacamole.
- Verify: Confirm access and Wazuh logs.
- Full Migration: Move all users.
- Lockdown:
- Update Internal Firewall: Block ALL Inbound traffic from the legacy VPN subnet.
- Update External Firewall: Remove legacy VPN port allowances.
- Decommission the per-person VPCs.
5.0 Compliance Analysis: PCI DSS v4.0 Mapping
The transition from a VPC-based model to a ZTNA model strengthens compliance significantly.
| PCI DSS v4.0 Requirement | Legacy (VPC/VPN) Status | ZTNA (OpenZiti/Guacamole) Status |
| 1.3.1 Inbound Traffic | Reliance on Firewall ACLs (IP/Port). High risk of misconfiguration. | Superior. No inbound ports required on CDE. Traffic is outbound-only. |
| 2.2.1 Configuration Standards | Difficult. Configuring 50+ ephemeral VPCs leads to drift. | Superior. Centralized config of 1 Gateway (Guacamole) and 1 Router. |
| 7.2.1 Least Privilege | Network-centric. Users have access to entire subnets within the VPC. | Superior. Service-centric. Users see only the Guacamole login screen. |
| 8.4.2 MFA for CDE Access | Often weak at the “internal hop” (SSH keys/passwords). | Superior. MFA enforced at Ziti connection establishment and Guacamole login. |
| 10.2.1 Audit Logs | Fragmented. Logs split between VPN concentrator and multiple VPCs. | Superior. Centralized in Wazuh. Session recordings provide visual forensic audit trails. |
| 11.5.1 Network Intrusion | IDS required on every VPC subnet. | Simplified. Traffic is encrypted until the Private Edge Router; IDS focuses on the single ingress point. |
6.0 Alternatives & Contingencies
6.1 Why OpenZiti over Headscale/NetBird?
While Headscale and NetBird are excellent tools, they function primarily as Mesh VPNs. They connect Device A to Device B. In a PCI CDE context, we do not want to connect a user’s device to a server; we want to connect a user’s identity to a service.
- Headscale Limitation: To achieve the “Dark CDE” (no inbound ports), Headscale requires DERP servers (relays). While possible, managing custom DERP infrastructure is complex. OpenZiti’s edge routers handle this natively as a core design principle.21
- NetBird Limitation: NetBird’s ACLs are improving, but primarily focus on “Peer A can talk to Peer B”. Ziti allows application-embedded zero trust (SDKs) which offers a future-proof path to removing the Guacamole gateway entirely and embedding Ziti directly into custom CDE applications.8
6.2 The “Break-Glass” Scenario
Any ZTNA solution introduces a centralized dependency (The Controller).
- Risk: If the Ziti Controller goes offline, no new sessions can be established.
- Mitigation:
- High Availability: Deploy the Ziti Controller in a 3-node HA cluster (Raft consensus).
- Emergency Access: Maintain one dormant VPN connection to the CDE with a “break-glass” account, monitored heavily by Wazuh. The firewall rule for this should be disabled by default and only enabled during a P1 outage.
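The choice of three nodes follows from majority-quorum arithmetic: a Raft cluster stays available only while more than half of its members are reachable. The helper names below are illustrative.

```python
def quorum(nodes: int) -> int:
    """Smallest majority of a cluster with the given number of voting nodes."""
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """How many nodes can be lost while a majority remains."""
    return nodes - quorum(nodes)


for n in (1, 2, 3, 4, 5):
    print(f"{n} node(s): quorum={quorum(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

Three nodes tolerate one failure, while four nodes still tolerate only one. For that reason odd cluster sizes are the usual choice.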
7.0 Conclusion
The proposed architecture successfully refactors the user’s environment by replacing the operational burden of “per-person VPCs” with a streamlined, identity-centric OpenZiti overlay. By utilizing Apache Guacamole as the session gateway, the organization retains the necessary isolation and gains visual session recording without the infrastructure overhead. This “Dark CDE” approach allows for the complete closure of inbound firewall ports, satisfying the most stringent PCI DSS v4.0 requirements while relying entirely on open-source, replaceable software components. The integration with Keycloak and Wazuh creates a unified, auditable security ecosystem that is superior to the fragmented legacy state.
8.0 Appendix: Wazuh Decoder Reference
Decoder for OpenZiti Controller Logs (JSON)
XML
<decoder name="openziti-controller">
<prematch>^{"file":</prematch>
<plugin_decoder>JSON_Decoder</plugin_decoder>
</decoder>
Decoder for Guacamole (Syslog)
XML
<decoder name="guacd-syslog">
<program_name>guacd</program_name>
</decoder>
<decoder name="guacamole-connection-event">
<parent>guacd-syslog</parent>
<regex>User "(\w+)" joined connection "(\S+)"</regex>
<order>user, connection_id</order>
</decoder>
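Before loading the decoder, the extraction logic can be given a rough sanity check outside Wazuh. The sketch below is a Python approximation only: Python's re module and Wazuh's regex engine are different dialects, so [^"]+ stands in for the \w+ and \S+ classes, and the sample log line is hypothetical. A match here does not guarantee a match in Wazuh. The authoritative check is to run real log lines through the wazuh-logtest tool.

```python
import re
from typing import Dict, Optional

# Approximate Python translation of the guacamole-connection-event pattern.
PATTERN = re.compile(r'User "([^"]+)" joined connection "([^"]+)"')


def decode(line: str) -> Optional[Dict[str, str]]:
    """Extract the fields named in the decoder's <order> element, or None."""
    m = PATTERN.search(line)
    if m is None:
        return None
    return {"user": m.group(1), "connection_id": m.group(2)}


# Hypothetical sample; capture real lines from the deployment before relying on this.
sample = 'guacd[1234]: INFO: User "alice" joined connection "$conn-01" (1 users now present)'

print(decode(sample))
print(decode("guacd[1234]: INFO: Connection closed"))
```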
Wazuh Rule for Session Start
XML
<rule id="110001" level="10">
<decoded_as>guacd-syslog</decoded_as>
<match>joined connection</match>
<description>PCI CDE: Remote Session Established by $(user)</description>
<group>authentication_success,pci_dss_10.2.1,pci_dss_8.1.1,</group>
</rule>