  • Secure Swarm Secrets with OpenBao

    Executive Summary

    Automatic secret rotation in Docker Swarm has historically been difficult because native Swarm Secrets are immutable: changing a value means creating a new secret and updating the service, which restarts its tasks. Furthermore, security standards such as PCI-DSS (Requirement 3 on protecting stored account data, Requirement 8 on protecting authentication credentials) restrict storing unencrypted credentials in static configuration files or on persistent disk.

    This guide details the “Bundled Process Sidecar” architecture. This pattern uses OpenBao (the open-source fork of HashiCorp Vault) to inject rotating credentials directly into a secure RAM-disk (tmpfs) at runtime.

    Key Benefits

    1. Automatic Rotation: Database passwords rotate without restarting the application container.
    2. PCI-DSS Compliance: Secrets exist only in volatile memory (tmpfs). They are never written to the host's disk (assuming swap is disabled on the node) and never included in the Docker image layers.
    3. Swarm Compatibility: Overcomes Swarm’s lack of “Pods” by bundling the Agent and App into a single atomic scheduling unit.

    1. Architecture: The Bundled Process Pattern

    In Kubernetes, you would run a “Sidecar” container in the same Pod. Docker Swarm does not have Pods; if you deploy two separate containers, they may land on different servers.

    To guarantee co-location and secure memory sharing, we bundle the OpenBao Agent binary inside the Application Container.

    graph TD
        subgraph "Docker Swarm Node"
            subgraph "Container (Bundled)"
                A[Entrypoint Script] -->|Starts & Monitors| B(OpenBao Agent)
                A -->|Starts & Monitors| C(Application)
                B -->|Writes Secret| D[/"tmpfs (RAM Disk)"/]
                C -->|Reads Secret| D
            end
        end
        B -.->|Auth & Fetch| E((OpenBao Server))
    

    2. Compliance: Why tmpfs Satisfies PCI-DSS

    Your auditor may flag “storing credentials in a file” as a violation. You must clarify the difference between Data at Rest and Data in Use.

    • The Config: We use a Docker tmpfs mount, which maps the directory /app/secrets to a region of system RAM.
    • The Compliance Argument:
      • Volatile: If power is cut or the container stops, the data is gone. Nothing is written to the hard drive, so there is nothing to recover from it, provided swap is disabled on the node (tmpfs pages can otherwise be swapped out to disk).
      • Requirement 3.4 (as numbered in PCI-DSS v3.2.1; v4.0 moves it to 3.5): This requirement covers PAN and sensitive data wherever it is stored persistently. Because tmpfs is backed by memory, its contents can be argued to fall under “Data in Use,” similar to how the variable sits in your application’s memory heap.
      • No Artifacts: The secret is not in the Docker Image, not in the overlay2 file system, and not in backups.
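
    An auditor may ask for evidence that /app/secrets really is memory-backed. The following is a minimal sketch, assuming a Linux container where /proc/mounts and /proc/swaps are readable. The function names (filesystem_type, swap_is_enabled, report) are illustrative and not part of OpenBao or Docker.

```python
from typing import Optional


def filesystem_type(path: str, mounts_text: str) -> Optional[str]:
    """Return the filesystem type of the most specific mount containing `path`.

    `mounts_text` is the content of /proc/mounts: one mount per line with
    whitespace-separated fields (device, mount point, fs type, options, ...).
    """
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        covers = path == mount_point or path.startswith(prefix)
        # Prefer the longest mount point; on ties, the later (topmost) mount wins.
        if covers and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


def swap_is_enabled(swaps_text: str) -> bool:
    """/proc/swaps holds a header line plus one line per active swap area."""
    return len(swaps_text.strip().splitlines()) > 1


def report(secret_dir: str = "/app/secrets") -> str:
    """Summarise the two facts the compliance argument depends on."""
    with open("/proc/mounts") as f:
        fs_type = filesystem_type(secret_dir, f.read())
    with open("/proc/swaps") as f:
        swap = swap_is_enabled(f.read())
    return f"{secret_dir}: filesystem={fs_type}, swap_enabled={swap}"
```

    Running print(report()) inside the container should show filesystem=tmpfs. If swap_enabled is True, the "never touches disk" claim is weaker and worth raising with the auditor.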

    3. Implementation Guide

    Step 1: The OpenBao Agent Configuration (agent-config.hcl)

    This file tells the agent how to authenticate and where to write the secret.

    pid_file = "/var/run/bao-agent.pid"
    
    auto_auth {
      # Swarm nodes should ideally use AppRole or Kubernetes auth (if using Mirantis)
      # For simple Swarm, AppRole is most common.
      method "approle" {
        config = {
          role_id_file_path = "/etc/bao/role_id"
          secret_id_file_path = "/etc/bao/secret_id"
          remove_secret_id_file_after_reading = false
        }
      }
    
      sink "file" {
        config = {
          path = "/app/secrets/.bao-token" # Keep the agent token on tmpfs, not on the container's writable layer
        }
      }
    }
    
    template {
      # The critical PCI-DSS path - this MUST be the tmpfs volume
      destination = "/app/secrets/db_password"
      
      # Fetch data from OpenBao and format it as a simple string
      contents = "{{ with secret \"database/creds/my-app-role\" }}{{ .Data.password }}{{ end }}"
      
      # Optional: Command to run when secret rotates (e.g., reload app)
      # command = "pkill -HUP -f 'python app.py'"
    }
    
    

    Step 2: The Fail-Fast Entrypoint (entrypoint.sh)

    This script replaces the default command. It acts as a process supervisor. If the OpenBao Agent crashes, this script kills the container immediately so Swarm can restart it.

    #!/bin/bash
    set -m # Enable job control to handle background processes
    
    # 1. Start OpenBao Agent in the background
    # We assume the 'bao' binary is in the PATH
    bao agent -config=/etc/bao/agent-config.hcl > /var/log/bao-agent.log 2>&1 &
    BAO_PID=$!
    
    # 2. Wait for the secret to be rendered (Critical for startup race conditions)
    echo "Waiting for secrets to be rendered into /app/secrets/..."
    TIMEOUT=30
    while [ ! -f /app/secrets/db_password ]; do
      if ! kill -0 $BAO_PID 2>/dev/null; then
        echo "CRITICAL: OpenBao Agent died while starting up! Check logs."
        cat /var/log/bao-agent.log
        exit 1
      fi
      sleep 1
      ((TIMEOUT--))
      if [ $TIMEOUT -le 0 ]; then
        echo "Timed out waiting for OpenBao to render secrets."
        exit 1
      fi
    done
    echo "Secrets authenticated and rendered successfully."
    
    # 3. Start the Main Application
    # Replace this with your actual start command
    python app.py &
    APP_PID=$!
    
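    # 4. Monitoring Loop
    # Forward SIGTERM/SIGINT from Docker to both children so the task stops cleanly
    trap 'kill -TERM "$APP_PID" "$BAO_PID" 2>/dev/null; wait; exit 143' TERM INT
    
    # If either process dies, exit the container to trigger a Swarm Restart
    while true; do
      if ! kill -0 "$BAO_PID" 2>/dev/null; then
        echo "CRITICAL: OpenBao Agent crashed. Shutting down container."
        kill -TERM "$APP_PID" 2>/dev/null
        exit 1
      fi
      
      if ! kill -0 "$APP_PID" 2>/dev/null; then
        echo "Application exited. Shutting down OpenBao Agent."
        # Propagate the application's exit status instead of always reporting success
        wait "$APP_PID"; APP_STATUS=$?
        kill -TERM "$BAO_PID" 2>/dev/null
        exit "$APP_STATUS"
      fi
      
      sleep 2
    done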
    
    

    Step 3: The Unified Health Check (healthcheck.sh)

    Docker honors only one HEALTHCHECK instruction per image (if several are listed, the last one wins), so this single script checks both components.

    #!/bin/bash
    
    # 1. Check if OpenBao Agent is running
    # -x matches the process name exactly, so unrelated processes containing "bao" do not count
    pgrep -x bao > /dev/null || exit 1
    
    # 2. Check if the secret file exists and is not empty
    if [ ! -s /app/secrets/db_password ]; then
      exit 1
    fi
    
    # 3. Check if the App is responsive (replace port/path as needed)
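    # -sS keeps output quiet but still shows errors; --max-time stays under the HEALTHCHECK timeout
    curl -fsS --max-time 5 -o /dev/null http://localhost:8080/health || exit 1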
    
    exit 0
    
    

    Step 4: The Dockerfile

    We use a multi-stage build to copy the bao binary from the official OpenBao image into your application image.

    # Stage 1: Get OpenBao binary
    FROM openbao/openbao:latest AS bao-source
    
    # Stage 2: Your Application
    FROM python:3.9-slim
    
    # Install dependencies for healthcheck and process management
    RUN apt-get update && apt-get install -y curl procps && rm -rf /var/lib/apt/lists/*
    
    WORKDIR /app
    
    # COPY the 'bao' binary. Note: The binary is usually at /bin/bao or /usr/local/bin/bao
    COPY --from=bao-source /bin/bao /usr/local/bin/bao
    
    # Copy Configs
    COPY agent-config.hcl /etc/bao/agent-config.hcl
    COPY entrypoint.sh /usr/local/bin/entrypoint.sh
    COPY healthcheck.sh /usr/local/bin/healthcheck.sh
    
    # Set permissions (/etc/bao already exists: the COPY above created it)
    RUN chmod +x /usr/local/bin/bao \
        && chmod +x /usr/local/bin/entrypoint.sh \
        && chmod +x /usr/local/bin/healthcheck.sh
    
    # Copy Application Code
    COPY . .
    
    # Define the Healthcheck
    HEALTHCHECK --interval=30s --timeout=10s --retries=3 \
      CMD /usr/local/bin/healthcheck.sh
    
    ENTRYPOINT ["/usr/local/bin/entrypoint.sh"]
    
    

    Step 5: The Docker Compose (Swarm Stack)

    This is where you define the tmpfs volume to ensure compliance.

    version: '3.8'
    
    services:
      webapp:
        image: my-registry/my-bundled-app:latest
        deploy:
          replicas: 3
          restart_policy:
            condition: any
        environment:
          # Address of your OpenBao server
          VAULT_ADDR: "http://openbao.internal:8200"
        volumes:
          # PCI Compliance: Map /app/secrets to RAM (tmpfs)
          - type: tmpfs
            target: /app/secrets
            tmpfs:
              size: 20971520  # 20 MiB, given in bytes; limits size to prevent memory exhaustion DoS
              mode: 0700 # Strict permissions (Owner only)
        configs:
          # role_id is an identifier, not a credential, so a Swarm Config is acceptable
          - source: bao_role_id
            target: /etc/bao/role_id
        secrets:
          # secret_id is a credential: use a Swarm Secret, which is encrypted at rest
          # and mounted from memory (Swarm Configs are mounted without a RAM disk)
          - source: bao_secret_id
            target: /etc/bao/secret_id
    
    configs:
      bao_role_id:
        external: true
    
    secrets:
      bao_secret_id:
        external: true
    
    

    4. Operational Best Practices

    Handling Rotation

    When OpenBao rotates the password (e.g., every hour):

    1. OpenBao Server generates new credentials.
    2. OpenBao Agent (inside the container) detects the change.
    3. Agent rewrites the file /app/secrets/db_password in the tmpfs volume.
    4. Application Response:
      • Option A (Hot Reload): Your app watches the file and re-reads it.
      • Option B (Signal): Configure the template block in agent-config.hcl to send a signal (SIGHUP) to your app to force a reload.
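
    Both options can be sketched on the application side as follows. This is an illustration, not part of OpenBao: the names SecretFile and install_sighup_reload are hypothetical, and the sketch assumes the app reads the password through get() each time it opens a database connection.

```python
import os
import signal
import threading


class SecretFile:
    """Caches a secret read from a file and re-reads it when the file changes."""

    def __init__(self, path: str):
        self._path = path
        self._lock = threading.Lock()
        self._stamp = None    # (mtime_ns, inode) of the version last read
        self._value = None
        self._force = False   # set by invalidate(); checked by get()

    def get(self) -> str:
        """Option A: return the secret, re-reading it if the file was rewritten."""
        st = os.stat(self._path)
        stamp = (st.st_mtime_ns, st.st_ino)
        with self._lock:
            if self._force or stamp != self._stamp:
                self._force = False
                with open(self._path, encoding="utf-8") as f:
                    self._value = f.read().strip()
                self._stamp = stamp
            return self._value

    def invalidate(self) -> None:
        """Force the next get() to re-read the file.

        Deliberately lock-free: it may run inside a signal handler while the
        main thread holds the lock in get().
        """
        self._force = True


def install_sighup_reload(secret: SecretFile) -> None:
    """Option B: drop the cached value when SIGHUP arrives.

    Must be called from the main thread, as Python requires for signal handlers.
    """
    signal.signal(signal.SIGHUP, lambda signum, frame: secret.invalidate())
```

    A typical call would be db_password = SecretFile("/app/secrets/db_password"), then db_password.get() wherever a connection is created. If you choose Option B, install the handler before the first rotation: without one, the default action for SIGHUP terminates the process.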

    Troubleshooting

    If a container is stuck in a restart loop:

    1. Check docker service logs <service_name>. The entrypoint.sh is configured to print “CRITICAL” errors to stdout.
    2. Ensure the tmpfs size is adequate (though 20MB is plenty for text secrets).
    3. Verify network connectivity from the container to the OpenBao server address.
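
    For step 3, a quick TCP probe run from inside the container separates network problems from authentication problems. The sketch below uses only the Python standard library; host_port and tcp_reachable are illustrative names, and the address is assumed to be a URL like the VAULT_ADDR value in the stack file.

```python
import socket
from urllib.parse import urlparse


def host_port(addr: str) -> tuple:
    """Split an address such as http://openbao.internal:8200 into (host, port)."""
    parsed = urlparse(addr)
    default_port = 443 if parsed.scheme == "https" else 80
    return parsed.hostname, parsed.port or default_port


def tcp_reachable(addr: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the address succeeds within the timeout."""
    host, port = host_port(addr)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts alike.
        return False
```

    For example, tcp_reachable(os.environ["VAULT_ADDR"]) returning False points to DNS, overlay network, or firewall issues. If it returns True, the agent log is the next place to look.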

    Security Hardening

    • AppRole: Ensure the secret_id used for initial authentication is short-lived or wrapped.
    • Memory Limit: Always set a size limit on the tmpfs volume to prevent a compromised container from filling the host RAM and causing a node crash.