Setup script for homelab VMs and LXC containers. Performs initial setup, basic UFW rules, etc.

homelab-verify.sh

A single interactive bash script that profiles an Ubuntu 24.04 Proxmox VM or LXC container, checks it against expected configuration, and offers to fix any drift — one module at a time, with your explicit confirmation before anything changes.


Quick Start

HL_LAN_SUBNET=XXX.XXX.XXX.XXX/XX \
HL_CADDY_IP=XXX.XXX.XXX.XXX \
HL_MONITOR_IP=XXX.XXX.XXX.XXX \
bash homelab-verify.sh

Run as root (or with sudo) — most checks and fixes require elevated privileges.


Flags

| Flag | Description |
| --- | --- |
| --dry-run | Run all checks and show what would change, but apply nothing |
| --help | Print usage and exit |
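
A minimal sketch of how the two flags could be parsed. The names `parse_flags`, `DRY_RUN`, and `SHOW_HELP` are hypothetical and not taken from homelab-verify.sh; the return value 2 for an unknown flag follows the exit code table in this README.

```shell
# Hypothetical names throughout; only --dry-run and --help come from the README.
DRY_RUN=0
SHOW_HELP=0

usage() {
  echo "Usage: bash homelab-verify.sh [--dry-run] [--help]"
}

parse_flags() {
  local arg
  for arg in "$@"; do
    case $arg in
      --dry-run) DRY_RUN=1 ;;
      --help)    SHOW_HELP=1 ;;
      *)         echo "Unknown flag: $arg" >&2; return 2 ;;
    esac
  done
  return 0
}

# Typical call site (commented out so the sketch has no side effects):
# parse_flags "$@" || { usage >&2; exit 2; }
# [ "$SHOW_HELP" = 1 ] && { usage; exit 0; }
```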

Environment Variables

| Variable | Required | Description |
| --- | --- | --- |
| HL_LAN_SUBNET | Yes | LAN CIDR used in firewall rules, e.g. 192.168.86.0/24 |
| HL_CADDY_IP | Yes | IP of the Caddy reverse proxy host |
| HL_MONITOR_IP | Yes | IP of the monitoring host (Prometheus/Loki) |
| HL_ROLE | No | Override automatic role detection (see Roles below) |

The script exits immediately if any required variable is unset.
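
The required-variable check can be pictured roughly as follows. This is a sketch, not the script's actual code: `require_env` is a hypothetical helper, and it treats an empty value the same as an unset one, which is an assumption.

```shell
# Hypothetical helper: returns 2 (the documented "script error" code)
# if any named variable is unset or empty.
require_env() {
  local name value missing=0
  for name in "$@"; do
    eval "value=\${$name-}"   # portable indirect lookup of the variable
    if [ -z "$value" ]; then
      echo "ERROR: $name is required but not set" >&2
      missing=1
    fi
  done
  [ "$missing" -eq 0 ] || return 2
}

# Typical call site:
# require_env HL_LAN_SUBNET HL_CADDY_IP HL_MONITOR_IP || exit 2
```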


How It Works

  1. Role detection — reads hostname and matches it against known pve-* names. Set HL_ROLE to override.
  2. Questionnaire — a short set of Y/N questions that builds a profile for the host (Docker, Caddy, monitoring stack, etc.). Some questions are skipped when the role already implies the answer.
  3. Profile confirmation — prints a summary of what was detected and asks you to confirm before any checking begins.
  4. Modules — each module runs its checks independently, prints PASS / FAIL / WARN per item, then offers a fix if anything failed. You confirm each fix individually. After applying, checks re-run automatically to confirm resolution.
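
The check, confirm, fix, re-check cycle from step 4 could be sketched like this. `run_module`, `confirm`, and `DRY_RUN` are hypothetical names; the real script reports each check item individually, which this sketch collapses into one pass/fail result per module.

```shell
DRY_RUN=${DRY_RUN:-0}

# Ask a yes/no question; anything other than y/yes counts as No.
confirm() {
  local reply
  printf '%s [y/N] ' "$1"
  read -r reply || return 1
  case $reply in
    [yY]|[yY][eE][sS]) return 0 ;;
    *)                 return 1 ;;
  esac
}

# run_module NAME CHECK_FN FIX_FN
run_module() {
  local name="$1" check="$2" fix="$3"
  if "$check"; then
    echo "[DONE]   $name: all checks passing"
    return 0
  fi
  if [ "$DRY_RUN" = 1 ]; then
    echo "[FAIL]   $name: fix skipped (dry run)"
    return 1
  fi
  # Apply the fix only after confirmation, then re-run the check.
  if confirm "Apply fix for $name?" && "$fix" && "$check"; then
    echo "[DONE]   $name: all checks passing"
    return 0
  fi
  echo "[FAIL]   $name: one or more checks did not pass"
  return 1
}
```

Passing the check and fix as function names keeps each module independent: a module only has to provide two functions, and the prompt and re-check logic lives in one place.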

Modules

Modules run in the order listed. Unconditional modules run on every host; conditional ones run only when the questionnaire answers call for them.

| # | Module | Runs when |
| --- | --- | --- |
| 1 | System Update & qemu-guest-agent | Always (qemu-guest-agent skipped on LXC) |
| 2 | journald Retention | Always |
| 3 | Unattended Security Upgrades | Always |
| 4 | Docker Installation | Always |
| 5 | UFW | Docker = No |
| 6 | iptables | Docker = Yes |
| 7 | node_exporter | node_exporter = Yes |
| 8 | Promtail | Promtail = Yes |
| 9 | cAdvisor | cAdvisor = Yes |
| 10 | Cron: .vscode-server-insiders cleanup | VS Code cron = Yes |

What each module checks

Module 1 — System Update & qemu-guest-agent

  • Runs apt-get update automatically (no prompt)
  • Reports upgradeable packages as WARN; prompts to run apt upgrade if any exist
  • Checks qemu-guest-agent is installed, enabled, and active (VM only — skipped on LXC)

Module 2 — journald Retention

  • Checks SystemMaxUse=500M is set in /etc/systemd/journald.conf
  • Fix backs up the original file before editing, then restarts systemd-journald
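
As a rough illustration of this module, the check and fix might look like the sketch below. The function names are hypothetical, and the fix assumes `[Journal]` is the last (or only) section in the file, as in the stock journald.conf, since it appends the setting at the end.

```shell
# Both helpers take the config path as $1 so they can be tried on a copy.
journald_max_use_ok() {
  grep -Eq '^[[:space:]]*SystemMaxUse=500M[[:space:]]*$' "$1"
}

journald_set_max_use() {
  local conf="$1" tmp
  # Timestamped backup, e.g. journald.conf.bak.20250608120000
  cp -p "$conf" "$conf.bak.$(date +%Y%m%d%H%M%S)" || return 1
  tmp=$(mktemp) || return 1
  # Drop any existing SystemMaxUse line (commented or not), then append ours.
  sed -E '/^[[:space:]]*#?[[:space:]]*SystemMaxUse=/d' "$conf" > "$tmp" || return 1
  printf 'SystemMaxUse=500M\n' >> "$tmp"
  cat "$tmp" > "$conf" && rm -f "$tmp"
}

# On a real host the fix would be followed by:
# systemctl restart systemd-journald
```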

Module 3 — Unattended Security Upgrades

  • Checks unattended-upgrades package is installed
  • Checks /etc/apt/apt.conf.d/20auto-upgrades has both periodic settings enabled
  • Checks security origin is uncommented in /etc/apt/apt.conf.d/50unattended-upgrades
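
The "both periodic settings" check presumably refers to the two standard APT::Periodic keys. A sketch under that assumption, with `auto_upgrades_ok` as a hypothetical name:

```shell
# Succeeds only if both periodic settings are present and set to "1".
auto_upgrades_ok() {
  local conf="${1:-/etc/apt/apt.conf.d/20auto-upgrades}" key
  for key in Update-Package-Lists Unattended-Upgrade; do
    grep -Eq "^[[:space:]]*APT::Periodic::$key[[:space:]]+\"1\";" "$conf" || return 1
  done
}
```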

Module 4 — Docker Installation

  • If Docker is already installed: checks docker.service is enabled and active, docker compose (v2 plugin) is available, docker buildx is available
  • If Docker is not installed: reports WARN and offers to install it
  • Installation adds Docker's official apt repo (download.docker.com) and installs docker-ce, docker-ce-cli, containerd.io, docker-buildx-plugin, docker-compose-plugin, then enables docker.service
  • On successful install: sets DOCKER=true so Module 6 (iptables) runs instead of Module 5 (UFW)
  • LXC note: warns that features: nesting=1 must be set in Proxmox container options first

Module 5 — UFW (non-Docker hosts)

  • UFW installed and active, IPv6 disabled
  • SSH allow rule from HL_LAN_SUBNET
  • App port allow rules from HL_CADDY_IP (one per port from questionnaire)
  • node_exporter allow from HL_MONITOR_IP + catch-all deny on 9100
  • Default incoming policy is deny
  • Fix warns explicitly before enabling UFW and requires a second confirmation
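
To make the expected rule set concrete, the sketch below prints the ufw commands that would produce it. `ufw_rules` is a hypothetical helper, and the exact commands the script runs may differ. The allow rule for the monitoring host comes before the deny on 9100 because ufw evaluates rules in order.

```shell
# ufw_rules LAN_SUBNET CADDY_IP MONITOR_IP [APP_PORT...]
# Prints the commands only; nothing is applied.
ufw_rules() {
  local lan="$1" caddy="$2" monitor="$3" port
  shift 3
  echo "ufw default deny incoming"
  echo "ufw allow from $lan to any port 22 proto tcp"
  for port in "$@"; do
    echo "ufw allow from $caddy to any port $port proto tcp"
  done
  echo "ufw allow from $monitor to any port 9100 proto tcp"
  echo "ufw deny 9100/tcp"
}

# Example: ufw_rules "$HL_LAN_SUBNET" "$HL_CADDY_IP" "$HL_MONITOR_IP" 8080
```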

Module 6 — iptables (Docker hosts)

  • /usr/local/bin/apply-iptables-custom.sh exists and is executable
  • iptables-custom.service exists, is enabled, active, and has After=docker.service
  • Live INPUT rules: SSH from LAN, app ports from Caddy, node_exporter from monitor, node_exporter DROP
  • Live DOCKER-USER rules: Caddy and Monitor allowed through before RFC1918/loopback/link-local DROP, RETURN at end
  • Fix saves current rules with iptables-save before applying and restores on failure
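
The ordering requirement on DOCKER-USER (allowed hosts before the first DROP) can be checked against `iptables -S DOCKER-USER` output. The helper below is a sketch with a hypothetical name; it only verifies ordering for one source IP and does not validate the rest of the chain.

```shell
# allowed_before_drop IP
# Reads `iptables -S DOCKER-USER` output on stdin and succeeds if a
# non-DROP rule for IP appears before the first DROP rule.
allowed_before_drop() {
  awk -v src="$1/32" '
    $1 == "-A" {
      n++; has_src = 0; target = ""
      for (i = 2; i < NF; i++) {
        if ($i == "-s" && $(i + 1) == src) has_src = 1
        if ($i == "-j") target = $(i + 1)
      }
      if (has_src && target != "DROP" && !allow) allow = n
      if (target == "DROP" && !drop) drop = n
    }
    END { exit !(allow && (!drop || allow < drop)) }
  '
}

# Example: iptables -S DOCKER-USER | allowed_before_drop "$HL_CADDY_IP"
```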

Module 7 — node_exporter

  • prometheus-node-exporter installed, enabled, active, port 9100 listening
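
One way to test "port 9100 listening" is to scan the local-address column of `ss -Hltn`. This is a sketch; `port_listening` is a hypothetical name and the script may use a different tool.

```shell
# port_listening PORT
# Reads `ss -Hltn` output on stdin; field 4 is the local address:port.
port_listening() {
  awk -v port="$1" '$4 ~ (":" port "$") { found = 1 } END { exit !found }'
}

# Example: ss -Hltn | port_listening 9100
```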

Module 8 — Promtail

  • Package installed, config exists at /etc/promtail/config.yml
  • clients.url points to http://HL_MONITOR_IP:3100/loki/api/v1/push
  • system and journal scrape jobs present
  • If Docker logs enabled: docker_sd_configs present, promtail user in docker group
  • Service enabled and active
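
A simple text-based version of the config checks is sketched below. `promtail_config_ok` is a hypothetical name, and because it matches text rather than parsing YAML, a quoted URL or unusual indentation would be reported as a failure.

```shell
# promtail_config_ok CONFIG_PATH MONITOR_IP
promtail_config_ok() {
  local conf="$1" monitor_ip="$2" job
  grep -Fq "url: http://$monitor_ip:3100/loki/api/v1/push" "$conf" || return 1
  for job in system journal; do
    grep -Eq "^[[:space:]]*-?[[:space:]]*job_name:[[:space:]]*$job[[:space:]]*\$" "$conf" || return 1
  done
}

# Example: promtail_config_ok /etc/promtail/config.yml "$HL_MONITOR_IP"
```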

Module 9 — cAdvisor

  • Docker is running, cadvisor container exists and is running
  • Restart policy is unless-stopped, port 8080 listening
  • Fix removes and recreates the container if it exists but is misconfigured
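
The container state and restart policy can both be read with `docker inspect` format templates. The function below is a sketch with a hypothetical name; it assumes the container is literally named cadvisor, as the checks above suggest.

```shell
# Succeeds if the cadvisor container is running with restart policy unless-stopped.
cadvisor_ok() {
  local running policy
  running=$(docker inspect -f '{{.State.Running}}' cadvisor 2>/dev/null) || return 1
  policy=$(docker inspect -f '{{.HostConfig.RestartPolicy.Name}}' cadvisor 2>/dev/null) || return 1
  [ "$running" = "true" ] && [ "$policy" = "unless-stopped" ]
}
```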

Module 10 — Cron: .vscode-server-insiders cleanup

  • Checks all users for rm -rf ~/.vscode-server-insiders with schedule 0 3 * * *
  • Prompts which user to add the job to if none exists
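
Matching the cron entry comes down to one pattern over a user's crontab. A sketch, with `has_cleanup_job` as a hypothetical name:

```shell
# Reads a crontab on stdin; succeeds if the 03:00 daily cleanup job is present.
has_cleanup_job() {
  grep -Eq '^[[:space:]]*0[[:space:]]+3[[:space:]]+\*[[:space:]]+\*[[:space:]]+\*[[:space:]]+rm[[:space:]]+-rf[[:space:]]+~/\.vscode-server-insiders[[:space:]]*$'
}

# Example: crontab -l -u "$user" 2>/dev/null | has_cleanup_job
```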

Known Roles

The script auto-detects the role from the hostname. Override with HL_ROLE.

pve-caddy · pve-coder · pve-monitor · pve-forgejo · pve-auth · pve-docker · pve-romm · pve-ollama · pve-foundryvtt

If no role is detected, the script falls back to generic mode (questionnaire-driven modules only, no role-specific skips).
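
Role detection with the generic fallback might look like the sketch below. `detect_role` and `KNOWN_ROLES` are hypothetical names. Two assumptions are baked in: the hostname must match a role name exactly, and an HL_ROLE value outside the known list also falls back to generic.

```shell
KNOWN_ROLES="pve-caddy pve-coder pve-monitor pve-forgejo pve-auth pve-docker pve-romm pve-ollama pve-foundryvtt"

# detect_role HOSTNAME
# HL_ROLE, when set, takes precedence over the hostname.
detect_role() {
  local candidate="${HL_ROLE:-$1}" role
  for role in $KNOWN_ROLES; do
    if [ "$candidate" = "$role" ]; then
      echo "$role"
      return 0
    fi
  done
  echo "generic"
}

# Example: role=$(detect_role "$(hostname -s)")
```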


LXC vs VM

The script works on both, with two differences:

  • qemu-guest-agent is skipped automatically on LXC containers (detected via systemd-detect-virt). It's only meaningful on KVM/QEMU VMs.
  • Firewall modules (UFW / iptables) check for NET_ADMIN capability before running. Unprivileged LXC containers don't have this capability, so the modules skip with a message directing you to the Proxmox host-level firewall instead. Privileged LXC containers proceed normally with a warning.
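
The first difference can be illustrated with `systemd-detect-virt`, which prints values such as kvm, qemu, or lxc. The helper name is hypothetical, and the sketch covers only the guest-agent decision, not the NET_ADMIN check.

```shell
# guest_agent_applies [VIRT]
# Succeeds only on KVM/QEMU guests; VIRT defaults to the detected value.
guest_agent_applies() {
  case ${1:-$(systemd-detect-virt 2>/dev/null)} in
    kvm|qemu) return 0 ;;
    *)        return 1 ;;
  esac
}
```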

Output Format

[CHECK]  What is being checked
  PASS   What was verified
  FAIL   What was expected vs. what was found
  WARN   Present but not exactly right

[FIX]    Summary of what will change
         (exact commands or file contents printed before any prompt)

[DONE]   Module complete — all checks passing
[SKIP]   Module not applicable for this host
[FAIL]   Module complete — one or more checks did not pass

Color is disabled automatically when NO_COLOR is set or output is not a terminal.
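
A sketch of how the color rule could be implemented, using hypothetical `use_color` and `report` helpers and the column layout shown above:

```shell
# Color only when NO_COLOR is unset and stdout is a terminal.
use_color() {
  [ -z "${NO_COLOR+set}" ] && [ -t 1 ]
}

# report STATUS MESSAGE   (STATUS is PASS, FAIL, or WARN)
report() {
  local status="$1" msg="$2" code
  case $status in
    PASS) code=32 ;;  # green
    FAIL) code=31 ;;  # red
    *)    code=33 ;;  # yellow
  esac
  if use_color; then
    printf '  \033[%sm%s\033[0m   %s\n' "$code" "$status" "$msg"
  else
    printf '  %s   %s\n' "$status" "$msg"
  fi
}
```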


Safety Guarantees

  • Nothing changes without your confirmation. Every fix prints exactly what it will do and prompts [y/N] (default No) before applying.
  • Config files are backed up before being overwritten (e.g. journald.conf.bak.20250608120000).
  • UFW enable triggers a second confirmation with an explicit warning about the deny policy taking effect immediately.
  • iptables fixes save current rules with iptables-save before applying. If the service fails to start, the previous rules are restored automatically.
  • Unexpected crashes are caught by an exit trap that attempts to restore a saved iptables snapshot if one exists.
  • --dry-run skips all fix prompts and applies nothing.

Exit Codes

| Code | Meaning |
| --- | --- |
| 0 | All checks passed (or all failures were fixed) |
| 1 | One or more checks failed and were not fixed |
| 2 | Script error (missing env vars, bad flags, etc.) |