Category: Servers

  • Explain how the allowance limit for CPU works in cgroups on Linux

    Short answer

    • Linux cgroups control CPU either by relative weighting (shares/weight) or by absolute quota (CFS quota / cpu.max).
    • Shares/weight only affect scheduling when CPU is contended. Quota enforces a hard time-slice limit.

    Key details

    1. Two modes
    • Relative (fair-share):
      • cgroup v1: cpu.shares (default 1024). Higher = more CPU when contended.
      • cgroup v2: cpu.weight (1–10000, default 100). Same idea; no hard cap.
    • Absolute (time quota):
      • cgroup v1 (CFS): cpu.cfs_period_us and cpu.cfs_quota_us.
      • cgroup v2: cpu.max with two fields: “<quota> <period>” or “max <period>” (no limit).
    2. How quota works (formula)
    • Allowed CPU cores = cpu.cfs_quota_us / cpu.cfs_period_us (v1) or quota/period (v2).
    • Percent of a single core = (quota / period) * 100%.
    • Percent of total system CPUs = (quota / period) / N_CPUS * 100%. For example, quota = 200000 with period = 100000 on an 8-CPU host is 200% of one core, i.e. 25% of total capacity.
    • Examples (assume period = 100000 µs = 100 ms):
      • quota = 50000 → 50% of one CPU (0.5 core).
      • quota = 100000 → 100% of one CPU (1 core).
      • quota = 200000 → 200% (2 cores worth of CPU time).
    3. Defaults & behavior
    • Typical default CFS period: 100000 µs (100 ms). Default quota = -1 (no limit) in v1; v2 default is “max” (unlimited).
    • Shares/weight do not limit CPU if the system is idle; they only proportionally divide CPU when multiple cgroups compete.
    • Quota always limits usage: even when the rest of the system is idle, the group cannot burst beyond its configured budget within each period.
    • cpuset.cpus restricts which logical CPUs the group can run on — quota cannot exceed that available CPU-set.
    4. Real-time tasks
    • Separate controls exist for real-time tasks in cgroup v1: cpu.rt_period_us and cpu.rt_runtime_us cap RT runtime so RT threads cannot starve normal tasks. cgroup v2 has no equivalent per-group RT knobs; global limits come from the kernel.sched_rt_runtime_us / kernel.sched_rt_period_us sysctls.
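
    Quick check: which cgroup version is in use

    Before reading or writing any of the files below, it helps to confirm whether the host runs cgroup v1 or v2. A minimal sketch, assuming the standard /sys/fs/cgroup mount point:

    # cgroup v2 (unified hierarchy) exposes cgroup.controllers at the mount root
    if [ -f /sys/fs/cgroup/cgroup.controllers ]; then
        echo "cgroup v2"
    else
        echo "cgroup v1 (or hybrid)"
    fi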

    Commands (examples)

    View v1:

    cat /sys/fs/cgroup/cpu/<group>/cpu.cfs_period_us
    cat /sys/fs/cgroup/cpu/<group>/cpu.cfs_quota_us
    cat /sys/fs/cgroup/cpu/<group>/cpu.shares
    

    Set v1:

    # set period to 100ms and quota to 200ms (2 CPUs)
    echo 100000 > /sys/fs/cgroup/cpu/<group>/cpu.cfs_period_us
    echo 200000 > /sys/fs/cgroup/cpu/<group>/cpu.cfs_quota_us
    

    View v2:

    cat /sys/fs/cgroup/<group>/cpu.max
    cat /sys/fs/cgroup/<group>/cpu.weight
    

    Set v2:

    # allow 2 CPUs (200ms per 100ms period)
    echo "200000 100000" > /sys/fs/cgroup/<group>/cpu.max
    
    # remove quota (unlimited)
    echo "max 100000" > /sys/fs/cgroup/<group>/cpu.max
    

    Practical tips

    • Use quota for strict limits (e.g., containers). Use shares/weight to balance under contention.
    • Choose period carefully (shorter periods = finer granularity but slightly higher overhead).
    • Always consider cpuset limits and number of host CPUs when computing percentages.
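
    To turn a target like "limit to 30% of an 8-CPU host" into numbers: quota = target_fraction x N_CPUS x period. A small shell sketch (variable names are illustrative):

    # 30% of total capacity on an 8-CPU host, with a 100 ms period
    TARGET_PCT=30; NCPUS=8; PERIOD=100000
    QUOTA=$(( TARGET_PCT * NCPUS * PERIOD / 100 ))
    echo "$QUOTA $PERIOD"    # -> 240000 100000 (i.e. 2.4 cores)
    # cgroup v2: echo "$QUOTA $PERIOD" > /sys/fs/cgroup/<group>/cpu.max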

    If you want, give your target (e.g., “limit to 30% of 8‑CPU host”) and I’ll compute exact quota/period values and example commands.

    madalinignisca: I want Relative (fair-share) for max 6 physical cores, 12 logical cores, on a cpu that has 32 physical cores. The containers will run allowed on all cpus for spreading their load better. What quota should I put to this container?

    GitHub Copilot: – Assumption: host = 32 physical → 64 logical CPUs; you want max = 6 physical = 12 logical.

    Short answer (relative / fair‑share — no hard cap)

    • cgroup v1 (cpu.shares): 12 * 1024 = 12288
    • cgroup v2 (cpu.weight): 12 * 100 = 1200

    This gives the container roughly 12/64 = 18.75% of total CPU when the system is fully contended (assuming the competing cgroups' shares add up to the equivalent of the remaining CPUs).

    Commands

    # set cgroup v1 shares
    echo 12288 > /sys/fs/cgroup/cpu/<group>/cpu.shares
    
    # set cgroup v2 weight
    echo 1200 > /sys/fs/cgroup/<group>/cpu.weight
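
    If the container is managed by a runtime rather than by writing cgroup files directly, the same relative value can usually be passed at create time. A sketch assuming Docker (the image name is a placeholder; on cgroup v2 hosts the runtime translates --cpu-shares into a cpu.weight value):

    docker run --cpu-shares 12288 myimage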
    

    Notes

    • Shares/weight only affect scheduling under contention; they do not enforce a hard limit.
    • If you need a hard cap, use cpu.cfs_quota_us/cpu.cfs_period_us (v1) or cpu.max (v2) instead.
  • Compare incus and libvirt for virtualization

    Here’s a detailed comparison of Incus and libvirt, focusing on their architecture, use cases, feature sets, and management models:


    1. Overview

    Incus

    • A system container and virtual machine manager, forked from LXD in 2023 by the community after Canonical changed LXD’s governance.
    • Developed under the Linux Containers project (same group behind LXC).
    • Provides unified management for system containers (via LXC) and virtual machines (via QEMU) using a consistent API and CLI.
    • Designed for ease of use, with integrated image management, networking, storage pools, clustering, and role-based remote management.

    libvirt

    • A virtualization API library and daemon that provides a common management layer over multiple hypervisors (QEMU/KVM, Xen, VMware, Hyper-V, etc.).
    • It’s a middleware used by tools like virsh, virt-manager, and OpenStack’s Nova.
    • Does not itself provide containers — it’s focused on virtual machines.
    • Provides a consistent interface (XML-based) for VM lifecycle management, networks, and storage.

    2. Architecture and Layers

    Aspect                 | Incus                                                                                 | libvirt
    Core type              | System management daemon (incusd) with a REST API                                     | Library (libvirt.so) + daemon (libvirtd)
    Hypervisors / backends | LXC for containers, QEMU for VMs                                                      | Multiple: QEMU/KVM, Xen, VirtualBox, VMware ESXi, etc.
    Control interface      | incus CLI / REST API / UI tools                                                       | virsh CLI / libvirt API / virt-manager / higher-level tools
    Management granularity | Full container and VM lifecycle, plus networking and storage as first-class concepts | Manages VMs and their devices; networks and storage pools exist but are less integrated

    3. Functionality Scope

    Area              | Incus                                                    | libvirt
    Abstraction level | High-level integrated system manager                     | Low-level hypervisor abstraction layer
    Containers        | Yes (system containers via LXC)                          | No
    Virtual machines  | Yes (QEMU/KVM)                                           | Yes (QEMU/KVM, Xen, etc.)
    Networking        | Built-in bridge management, OVN integration, profiles    | Virtual networks and bridges (XML definitions)
    Storage           | Built-in storage pools (ZFS, Btrfs, LVM, Ceph, etc.)     | Storage pools and volumes defined and managed separately
    Clustering        | Native clustering across nodes                           | Not inherent; needs external orchestration (e.g., oVirt, OpenStack)
    Image management  | Integrated image servers, simplified OS image deployment | Disk images managed manually or via higher-level tools
    REST API          | Yes (fully featured, versioned, well-structured)         | No native REST (though libvirt-daemon-rest exists as an add-on)

    4. Typical Use Cases

    Use case                                                            | Recommended tool
    Lightweight system containers (like virtual machines but faster)   | Incus
    Unified management of both containers and VMs on the same host     | Incus
    Simple developer/testing environments                              | Incus
    Integration into orchestration platforms (OpenStack, custom tools) | libvirt
    Managing VMs across different hypervisors (KVM, Xen, VMware, etc.) | libvirt
    Building your own VM management tool via API bindings              | libvirt
    High-level, ready-to-use virtualization environment                | Incus

    5. Performance and Complexity

    • Incus aims for ease of use and integration. It hides much of the complexity of setting up storage, networks, and images.
    • libvirt gives you fine-grained control at a lower level: you often edit domain XML directly or rely on higher-level tools like virt-manager.
    • For single-host or small-cluster setups, Incus is faster to get running.
    • For large-scale environments, libvirt is often integrated into higher-level systems (e.g., OpenStack, oVirt).
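
    To make the contrast concrete, here is roughly what creating a guest looks like with each tool (image alias, guest name, and XML file are illustrative):

    # Incus: one command fetches an image and boots a VM
    incus launch images:debian/12 web1 --vm

    # libvirt: you supply a domain definition, then manage it with virsh
    virsh define web1.xml
    virsh start web1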

    6. Comparison Summary

    Feature           | Incus                                        | libvirt
    Container support | ✔️                                           | ✖️
    VM support        | ✔️ (QEMU/KVM)                                | ✔️ (many backends)
    API               | REST-based                                   | C API / language bindings
    CLI tools         | incus                                        | virsh, virt-manager, etc.
    Clustering        | Built-in                                     | Requires orchestration layer
    Ease of setup     | Very easy                                    | Moderate to advanced
    Target user       | Sysadmins/devs wanting simple virtualization | Developers/integrators building virtualization platforms

    7. Integration Perspective

    • libvirt is the foundation layer for many virtualization stacks. A tool like Incus could have built its VM support on libvirt, but Incus drives QEMU directly instead.
    • Incus is more of a turnkey, batteries-included system that provides the full stack.

    In Short

    Use Incus if you want…                               | Use libvirt if you want…
    A unified, easy-to-use system for containers and VMs | A flexible API to manage VMs across many hypervisors
    A single command line and REST API for everything    | To integrate with existing tools (virt-manager, OpenStack)
    Built-in image, network, and storage management      | Low-level control or custom orchestration

    In one sentence:

    • Incus = a full-featured, high-level system manager for containers and VMs.
    • libvirt = a foundational virtualization API used by other management tools.
  • Sudoers file explained

    Sudoers file explained

    Overview

    • The sudoers file controls which users may run which commands as which target users on which hosts, and it also configures sudo behaviour.
    • Location: /etc/sudoers (do not edit directly). Use visudo to edit (visudo locks the file and checks syntax).
    • You can drop additional configuration files into /etc/sudoers.d and include them with the #includedir or #include directives (newer sudo releases also accept the @includedir / @include spellings).

    Basic file structure and syntax tokens

    • Lines beginning with # are comments.
    • Blank lines are ignored.
    • Fields are generally space- or tab-separated.
    • Keywords and identifiers are case-sensitive.
    • Common keywords: User_Alias, Runas_Alias, Host_Alias, Cmnd_Alias, Defaults, and user specifications.

    Aliases

    • Define groups for reuse.
    • Types:
      • User_Alias NAME = user1, user2, %group, +netgroup
      • Runas_Alias NAME = user1, user2, :group
      • Host_Alias NAME = host1, host2, 192.0.2.0/24
      • Cmnd_Alias NAME = /path/to/command, /path/to/cmd arg1, /bin/sh -c
    • Examples:
      • User_Alias ADMINS = alice, bob, %wheel
      • Cmnd_Alias SYSTEMCTL = /bin/systemctl, /usr/bin/systemctl

    User specifications (the most important lines)

    • General form: who where = (runas) [tag_list:] command_list
      • who: username, User_Alias, %group (groups prefixed with %), +netgroup
      • where: Host_Alias, hostname, IP, or ALL
      • runas: user or Runas_Alias in parentheses, e.g. (root), (www-data)
      • tag_list: optional command tags (NOPASSWD:, PASSWD:, NOEXEC:, SETENV:, etc.)
      • command_list: comma-separated commands or Cmnd_Alias entries
    • Common examples:
      • root ALL=(ALL) ALL
        • root may run any command as any user on any host (typical default)
      • %wheel ALL=(ALL) ALL
        • members of group wheel may run any command as any user
      • alice webservers = (root) NOPASSWD: /usr/sbin/service httpd restart
        • on host(s) in alias webservers, alice may run the listed service command as root without a password
      • bob ALL = (www-data) /usr/bin/tee /var/www/html/index.html
        • bob may run tee as user www-data for that file

    Command tags

    • NOPASSWD: run the command without prompting for the user’s password
    • PASSWD: force password prompt even if NOPASSWD would apply
    • NOEXEC: prevent the command from using exec(3) to spawn new processes (not foolproof)
    • SETENV: allow the user to set environment variables for the command (e.g. sudo VAR=value cmd) and to keep the invoking environment with sudo -E, overriding env_reset
    • Examples:
      • alice ALL = NOPASSWD: /bin/systemctl reload nginx
      • %ops ALL = (root) NOEXEC: /usr/bin/less /var/log/*
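
    Putting aliases, a user specification, and a tag together, a small drop-in might look like this (names are illustrative); writing it via tee keeps the main sudoers file untouched:

    sudo tee /etc/sudoers.d/webops >/dev/null <<'EOF'
    # members of webadmin may manage nginx without a password
    User_Alias WEBOPS = alice, %webadmin
    Cmnd_Alias WEBCTL = /bin/systemctl restart nginx, /bin/systemctl reload nginx
    WEBOPS ALL = (root) NOPASSWD: WEBCTL
    EOF
    sudo chmod 0440 /etc/sudoers.d/webops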

    Negation

    • You can prefix a command with ! to explicitly disallow it:
      • %staff ALL=(ALL) ALL, !/usr/bin/passwd
    • An explicit negation in a command list prevents that command even if a more general entry would allow it. Note this is not a reliable security boundary: a user granted ALL can copy a denied binary to another path and run it from there, so prefer allowlisting specific commands.

    Wildcards and command arguments

    • Commands must be specified with absolute paths.
    • Wildcards/glob characters are allowed in Cmnd_Alias entries (e.g. /usr/bin/apt-get *) but be cautious—globs can expand permissions widely.
    • If you allow a program that permits shell escapes or arbitrary arguments (e.g., editors, shells, tee, find), it is effectively root access.

    Defaults (configuration options)

    • Defaults lines configure sudo’s behavior (environment, logging, timeouts, secure_path, env_keep, etc.).
    • Basic form:
      • Defaults option
      • Per-user: Defaults:username option
      • Per-host: Defaults@hostname option
      • Per-runas: Defaults>runas option
      • Per-command: Defaults!command option
    • Examples:
      • Defaults timestamp_timeout=15
      • Defaults env_reset
      • Defaults:alice !authenticate
      • Defaults@laptop secure_path="/usr/local/bin:/usr/bin:/bin"
    • Many options exist: env_reset, env_keep, secure_path, timestamp_timeout, lecture, log_output, requiretty (deprecated on some systems), etc. See man sudoers.

    Include directives

    • #include /path/to/file — include a single file
    • #includedir /path/to/dir — include all files in a directory (lexicographic order)
    • Use these for packaged or per-package drop-ins such as /etc/sudoers.d/*

    Precedence and security notes

    • More specific Defaults entries override general ones.
    • When multiple user specifications match, they are applied in order and the last match wins (not necessarily the most specific one), so place negations after the entries they are meant to restrict.
    • Be careful with NOPASSWD and commands that allow shell escapes—these can give a full root shell.
    • Always edit with visudo. You can run visudo -c to check syntax without editing.
    • Keep sudoers and files in /etc/sudoers.d with strict permissions (typically 0440).
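
    Before a bad edit can lock you out, syntax-check both the main file and any drop-in (the drop-in name is illustrative):

    visudo -c                              # checks /etc/sudoers and included files
    visudo -c -f /etc/sudoers.d/webops     # checks a single drop-in file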

    Practical examples

    • Allow members of wheel to run any command:
      • %wheel ALL=(ALL) ALL
    • Allow a service admin to restart nginx without password:
      • alice ALL=(root) NOPASSWD: /bin/systemctl restart nginx, /bin/systemctl reload nginx
    • Allow a user to run a command as another unprivileged user:
      • bob ALL=(www-data) /usr/bin/tee /var/www/html/index.html
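
    To confirm what a rule actually grants, sudo can print a user's allowed commands (run as root; alice is illustrative):

    sudo -l -U alice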

    Where to learn more

    • man sudoers — definitive reference with full syntax, tags and examples
    • man visudo — how to safely edit the file
  • Real client ip for Magento 2

    Real client ip for Magento 2

    How I made Magento 2 see the real client IP behind HAProxy + Varnish + Nginx (with Cloudflare)

    If you run Magento 2 behind multiple layers, it’s common to lose the original client IP and end up with your proxy or load balancer address everywhere (logs, admin actions, rate-limiters, captchas). Here’s the setup that finally gave me the real IP all the way to PHP and Magento:

    • HAProxy: add X-Forwarded-For
    • Varnish: normalize X-Forwarded-For (favor Cloudflare CF-Connecting-IP if present; otherwise keep chain or fall back to client.ip)
    • Nginx: trust the Varnish IP and use X-Forwarded-For to set the real client IP, then pass that header to PHP-FPM

    Architecture

    Client (optionally Cloudflare) -> HAProxy -> Varnish -> Nginx -> PHP-FPM (Magento 2)

    Step 1: HAProxy — add the client IP once

    In your HAProxy frontend that handles HTTP/HTTPS:

    frontend fe_http
        bind :80
        option forwardfor if-none
        default_backend be_varnish
    
    frontend fe_https
        bind :443 ssl crt /etc/haproxy/certs
        option forwardfor if-none
        http-request set-header X-Forwarded-Proto https if { ssl_fc }
        default_backend be_varnish
    
    backend be_varnish
        server varnish 127.0.0.1:6081
    

    Notes:

    • option forwardfor adds X-Forwarded-For with the client source IP and, with if-none, won’t overwrite an existing header (e.g., from Cloudflare).
    • Set X-Forwarded-Proto so downstream knows the original scheme.

    Step 2: Varnish — normalize X-Forwarded-For

    I used this in vcl_recv to prefer Cloudflare’s CF-Connecting-IP when present, otherwise keep what we already have, and finally fall back to client.ip:

    if (req.http.CF-Connecting-IP) {
        set req.http.X-Forwarded-For = req.http.CF-Connecting-IP;
    } else if (req.http.X-Forwarded-For) {
        # keep existing header (CF/HAProxy might have set it)
        set req.http.X-Forwarded-For = req.http.X-Forwarded-For;
    } else {
        set req.http.X-Forwarded-For = client.ip;
    }
    

    Why this order:

    • When using Cloudflare, CF-Connecting-IP is the canonical client address and is safer than trusting a possibly client-spoofed X-Forwarded-For arriving at the edge.
    • If you’re not on Cloudflare, HAProxy will have added X-Forwarded-For, so we preserve it.
    • If neither is set, we fall back to the direct client.ip seen by Varnish.

    Optional in Varnish: ensure X-Forwarded-Proto is forwarded correctly as well (if HAProxy set it, just pass it through).

    Step 3: Nginx — trust Varnish and apply the real IP

    Tell Nginx which upstream IPs are trusted to provide X-Forwarded-For and then use that header as the source of truth. Also, pass the header through to PHP-FPM.

    In http { }:

    # Trust only your proxy chain. At minimum, trust Varnish:
    set_real_ip_from 127.0.0.1;    # or your Varnish host/IP
    real_ip_header X-Forwarded-For;
    real_ip_recursive on;
    
    # Optional: useful for debugging
    log_format main '$remote_addr - $remote_user [$time_local] '
                    '"$request" $status $body_bytes_sent '
                    '"$http_referer" "$http_user_agent" xff="$http_x_forwarded_for"';
    access_log /var/log/nginx/access.log main;
    

    Inside your PHP location block:

    location ~ \.php$ {
        include fastcgi_params;
    
        # Make sure PHP sees the normalized header
        fastcgi_param HTTP_X_FORWARDED_FOR $http_x_forwarded_for;
    
        # Usual PHP-FPM pass
        fastcgi_pass unix:/run/php/php-fpm.sock;
    }
    

    Critical pieces:

    • set_real_ip_from must include your Varnish IP(s). Only trust known proxies. If Nginx also sits behind HAProxy directly in some paths, add those IPs too.
    • real_ip_header X-Forwarded-For makes Nginx rewrite $remote_addr to the real client IP, using the trusted chain.
    • real_ip_recursive on walks the header from right to left until it finds the first untrusted address, which becomes the real IP.
    • Passing HTTP_X_FORWARDED_FOR to PHP ensures Magento/PHP code that inspects that header can still see the chain (while REMOTE_ADDR becomes the real client IP thanks to real_ip_header).

    Magento 2 impact

    • PHP’s $_SERVER['REMOTE_ADDR'] now shows the actual client address (thanks to Nginx’s real_ip module).
    • $_SERVER['HTTP_X_FORWARDED_FOR'] is available as the normalized chain (thanks to the fastcgi_param line).
    • Admin actions, logs, rate-limiters, and security modules that rely on REMOTE_ADDR behave correctly.
    • If you use Cloudflare, you’ll get the exact visitor IP via CF-Connecting-IP promoted into X-Forwarded-For at the Varnish layer.

    Verification checklist

    • From the shell on your web host: tail -f /var/log/nginx/access.log and confirm $remote_addr shows the real IP and xff shows what you expect.
    • In Varnish: varnishlog -g request -I ReqHeader:X-Forwarded-For and confirm the header holds the correct client IP.
    • In PHP: temporarily create a small phpinfo() or var_dump($_SERVER) page and confirm REMOTE_ADDR equals the client IP and HTTP_X_FORWARDED_FOR is set.
    • In Magento admin: check recent logs, orders, or security modules for real client IPs.
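
    One extra end-to-end test: send a forged header from a machine outside your trusted proxy list and check how it is logged (hostname and IP are placeholders):

    # from an untrusted client
    curl -s -o /dev/null -H 'X-Forwarded-For: 203.0.113.99' https://shop.example.com/
    # on the web host: with a strict setup the forged value must not appear as $remote_addr
    tail -n 1 /var/log/nginx/access.log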

    Common pitfalls

    • Trust only your proxies. Never set set_real_ip_from 0.0.0.0/0. That allows clients to spoof X-Forwarded-For.
    • If you’re on Cloudflare and want to trust CF directly in Nginx, you can set real_ip_header CF-Connecting-IP and add all Cloudflare IP ranges to set_real_ip_from (see the sketch after this list). The Varnish normalization shown above is often simpler to maintain.
    • Ensure HAProxy uses option forwardfor if-none so it doesn’t stomp a legitimate X-Forwarded-For coming from an upstream like Cloudflare. Be aware that if-none also preserves a spoofed header from a client that reaches HAProxy directly, so only use it when the frontend is reachable exclusively through a trusted edge; otherwise let HAProxy overwrite the header.
    • Don’t forget to pass HTTP_X_FORWARDED_FOR to PHP-FPM:
      fastcgi_param HTTP_X_FORWARDED_FOR $http_x_forwarded_for;
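
    For the Cloudflare-direct approach above, the published ranges can be turned into nginx directives automatically (output path is illustrative; real_ip_header CF-Connecting-IP still goes in your http block):

    # fetch Cloudflare's published ranges and emit set_real_ip_from lines
    { curl -s https://www.cloudflare.com/ips-v4; echo; curl -s https://www.cloudflare.com/ips-v6; } \
      | sed '/^$/d; s/^/set_real_ip_from /; s/$/;/' > /etc/nginx/conf.d/cloudflare-realip.conf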

    That’s it. With HAProxy setting X-Forwarded-For, Varnish normalizing it, and Nginx trusting only the Varnish IP and passing the header to PHP, Magento 2 finally sees the real visitor IP end-to-end.