Objective

Write a suite of Bash scripts that automate common server administration tasks: service health monitoring with automatic restart on failure, batch user provisioning from a CSV file, log rotation and archival, and a nightly system report emailed to the administrator. The recurring jobs are designed to run as systemd timer units rather than raw cron jobs, which gives better logging and dependency management.

Tools & Technologies

  • Bash 5.x — scripting language
  • systemd — service and timer unit management
  • systemctl — querying and controlling services
  • journalctl — reading systemd journal logs
  • mailutils / sendmail — sending email reports
  • awk / sed / cut — text processing in scripts
  • useradd / passwd / chage — user provisioning commands
  • logger — writing custom entries to syslog
  • df / free / top / ss — system resource reporting

Architecture Overview

flowchart TD
    Timer1[systemd Timer\nservice-monitor.timer] --> Script1[service_monitor.sh\nCheck & restart services]
    Timer2[systemd Timer\nnightly-report.timer] --> Script2[nightly_report.sh\nSystem metrics email]
    CSV[users.csv] --> Script3[provision_users.sh\nBatch user creation]
    Script1 --> Journal[systemd Journal\nStructured logs]
    Script2 --> Email[Admin Email\nHTML report]
    Script3 --> Syslog["/var/log/syslog\nProvisioning audit"]
    style Timer1 fill:#1a1a2e,stroke:#00d4ff,color:#e0e0e0
    style Timer2 fill:#1a1a2e,stroke:#00d4ff,color:#e0e0e0
    style CSV fill:#1a1a2e,stroke:#00d4ff,color:#e0e0e0
    style Script1 fill:#181818,stroke:#1e1e1e,color:#888
    style Script2 fill:#181818,stroke:#1e1e1e,color:#888
    style Script3 fill:#181818,stroke:#1e1e1e,color:#888
    style Journal fill:#1a1a2e,stroke:#00ff88,color:#e0e0e0
    style Email fill:#1a1a2e,stroke:#00ff88,color:#e0e0e0
    style Syslog fill:#1a1a2e,stroke:#00ff88,color:#e0e0e0

Step-by-Step Process

01
Service Health Monitor Script

The core monitoring script checks a predefined list of services, logs their state, and restarts any that are inactive. It uses systemctl is-active and writes structured entries via logger.

#!/usr/bin/env bash
# /usr/local/bin/service_monitor.sh
# Checks each listed service and restarts any that is not active.
set -euo pipefail

SERVICES=("ssh" "apache2" "bind9" "fail2ban")
LOGFILE="/var/log/service-monitor.log"

# Timestamped log line to stdout (captured by the journal) and to $LOGFILE.
log() { echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*" | tee -a "$LOGFILE"; }

for svc in "${SERVICES[@]}"; do
    # is-active prints the state and exits non-zero unless the unit is active,
    # so ignore the exit code and fall back to "unknown" only on empty output.
    STATUS=$(systemctl is-active "$svc" 2>/dev/null || true)
    STATUS=${STATUS:-unknown}

    if [[ "$STATUS" == "active" ]]; then
        log "OK: $svc is active" >/dev/null   # file only, keeps the journal quiet
        continue
    fi

    log "WARN: $svc is $STATUS, restarting"
    if systemctl restart "$svc"; then
        log "INFO: $svc restarted successfully"
        logger -t service-monitor "Restarted $svc (was $STATUS)"
    else
        log "ERROR: failed to restart $svc"
        logger -t service-monitor -p user.err "Failed to restart $svc (was $STATUS)"
    fi
done
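
One subtlety when capturing a status inside a command substitution: systemctl is-active prints a state such as "inactive" and also exits non-zero, so a fallback appended with || inside the substitution is captured in addition to the printed state. The sketch below demonstrates this with fake_is_active, a hypothetical stand-in for systemctl so that it runs without systemd.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for `systemctl is-active`: prints a state and
# exits non-zero when the unit is not active.
fake_is_active() {
    echo "inactive"
    return 3
}

# Fallback inside the substitution: both outputs are captured,
# so the variable holds two lines ("inactive" and "unknown").
naive=$(fake_is_active || echo "unknown")

# Ignore the exit code, then fall back only when nothing was printed.
safe=$(fake_is_active || true)
safe=${safe:-unknown}

printf 'naive=%q\n' "$naive"
printf 'safe=%q\n' "$safe"
```

With the second form the variable holds exactly one word, which keeps comparisons and log messages predictable.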
02
systemd Timer Unit Deployment

Created service and timer unit files so the monitor runs every 5 minutes under systemd supervision with automatic logging to the journal.

# /etc/systemd/system/service-monitor.service
[Unit]
Description=Service Health Monitor
After=network.target

[Service]
Type=oneshot
ExecStart=/usr/local/bin/service_monitor.sh
StandardOutput=journal
StandardError=journal

# /etc/systemd/system/service-monitor.timer
[Unit]
Description=Run service monitor every 5 minutes

[Timer]
OnBootSec=2min
OnUnitActiveSec=5min
AccuracySec=30s

[Install]
WantedBy=timers.target

sudo chmod +x /usr/local/bin/service_monitor.sh
sudo systemctl daemon-reload
sudo systemctl enable --now service-monitor.timer
sudo systemctl list-timers --all | grep service-monitor
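
The architecture diagram also names a nightly-report.timer, whose unit files are not shown. The following is a minimal sketch of how they might look, written out by a small helper script. The 02:00 schedule, the Persistent=true setting, and the UNIT_DIR variable are assumptions for illustration, not details of the original setup. By default the files go to a scratch directory so the script can be tried without root.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical variable: point it at /etc/systemd/system to deploy for real.
UNIT_DIR="${UNIT_DIR:-$(mktemp -d)}"

cat > "$UNIT_DIR/nightly-report.service" <<'EOF'
[Unit]
Description=Nightly System Report

[Service]
Type=oneshot
ExecStart=/usr/local/bin/nightly_report.sh
EOF

cat > "$UNIT_DIR/nightly-report.timer" <<'EOF'
[Unit]
Description=Run nightly system report

[Timer]
# Assumed schedule: every day at 02:00
OnCalendar=*-*-* 02:00:00
# Catch up at next boot if the machine was off at the scheduled time
Persistent=true

[Install]
WantedBy=timers.target
EOF

echo "Wrote unit files to $UNIT_DIR"
```

Once the files are in /etc/systemd/system, the same daemon-reload and enable --now steps used for the monitor timer apply.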
03
Batch User Provisioning from CSV

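The provisioning script reads a CSV file (username,fullname,group,expiry_days), creates any missing group, creates each user with a home directory and Bash shell, sets a temporary password, forces a password change on first login, and sets the maximum password age from the expiry_days column.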

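#!/usr/bin/env bash
# /usr/local/bin/provision_users.sh
# CSV format: username,fullname,group,expiry_days
set -euo pipefail

INPUT="${1:-}"
[[ -r "$INPUT" ]] || { echo "Usage: $0 users.csv" >&2; exit 1; }

# "|| [[ -n $username ]]" keeps a final line that lacks a trailing newline.
while IFS=',' read -r username fullname group expiry || [[ -n "$username" ]]; do
    [[ -z "$username" || "$username" == "username" ]] && continue  # skip blanks and header
    expiry=${expiry%$'\r'}                                         # tolerate CRLF line endings

    # Skip existing accounts so a second run changes nothing
    if id "$username" &>/dev/null; then
        echo "WARN: user $username already exists"
        continue
    fi

    # Create group if it doesn't exist
    getent group "$group" &>/dev/null || groupadd "$group"

    # Create user; report real failures instead of hiding them
    useradd -m -c "$fullname" -g "$group" -s /bin/bash "$username" || {
        echo "ERROR: useradd failed for $username" >&2
        continue
    }

    # Set temporary password (username123) and force change.
    # Predictable by design, so suitable for a lab environment only.
    echo "$username:${username}123" | chpasswd
    chage -d 0 "$username"          # force password change on login
    chage -M "$expiry" "$username"  # set max password age

    logger -t user-provision "Created user $username in group $group"
    echo "Created: $username ($fullname) → group $group, expiry ${expiry}d"
done < "$INPUT"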
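
Because every CSV field ends up as an argument to a privileged command, it can help to validate the whole file before creating anything. A minimal sketch follows. The function names validate_row and validate_csv are hypothetical, and the naming policy (lowercase, at most 32 characters) is an assumption rather than a rule stated in this write-up.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Returns 0 when a row looks safe to provision, 1 otherwise.
validate_row() {
    local username=$1 group=$2 expiry=$3
    local name_re='^[a-z_][a-z0-9_-]{0,31}$'   # assumed naming policy
    [[ "$username" =~ $name_re ]] || { echo "bad username: '$username'" >&2; return 1; }
    [[ "$group" =~ $name_re ]]    || { echo "bad group: '$group'" >&2; return 1; }
    [[ "$expiry" =~ ^[0-9]+$ ]]   || { echo "bad expiry: '$expiry'" >&2; return 1; }
}

# Checks every data row, prints the number of invalid rows,
# and returns non-zero if any row failed.
validate_csv() {
    local file=$1 bad=0 username fullname group expiry
    while IFS=',' read -r username fullname group expiry || [[ -n "$username" ]]; do
        [[ "$username" == "username" ]] && continue   # skip header
        validate_row "$username" "$group" "$expiry" || bad=$((bad + 1))
    done < "$file"
    echo "$bad"
    [[ "$bad" -eq 0 ]]
}

# Validate the file given on the command line, if any.
if [[ $# -ge 1 ]]; then
    validate_csv "$1"
fi
```

Running this check first means a malformed row stops the run before any account has been touched, rather than halfway through the file.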
04
Nightly System Report Script

Collects disk usage, memory stats, failed services, recent auth failures, and top CPU processes, then formats a plain-text report and sends it via mail.

#!/usr/bin/env bash
# /usr/local/bin/nightly_report.sh
REPORT=$(mktemp)
HOST=$(hostname)
DATE=$(date '+%A, %B %d %Y %H:%M')

cat >> "$REPORT" <
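
The listing above breaks off at the heredoc, so the original report body is not available here. As a rough sketch of how the sections described (disk usage, memory, failed services, auth failures, top CPU processes) could be collected, under stated assumptions: the section helper, the ADMIN_EMAIL variable, the ssh unit name, and the "Failed password" pattern are all illustrative choices, not the author's script.

```shell
#!/usr/bin/env bash
set -uo pipefail

ADMIN_EMAIL="${ADMIN_EMAIL:-}"   # hypothetical; leave empty to print instead of mailing
HOST=$(uname -n)
REPORT=$(mktemp)
trap 'rm -f "$REPORT"' EXIT      # remove the temp file however the script ends

# Prints a titled section; tolerates missing or failing commands.
section() {
    local title=$1; shift
    printf '\n== %s ==\n' "$title"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1 || echo "(command exited with status $?)"
    else
        echo "($1 not available)"
    fi
}

{
    printf 'System report for %s, %s\n' "$HOST" "$(date '+%A, %B %d %Y %H:%M')"
    section "Disk usage" df -h
    section "Memory" free -h
    section "Failed services" systemctl --failed --no-pager
    # Assumed unit name and log pattern for SSH authentication failures
    section "Auth failures since yesterday" sh -c \
        'journalctl -u ssh --since yesterday --no-pager 2>/dev/null | grep -c "Failed password" || true'
    section "Top CPU processes" sh -c \
        'ps -eo pid,user,%cpu,comm --sort=-%cpu | head -n 6'
} > "$REPORT"

if [[ -n "$ADMIN_EMAIL" ]] && command -v mail >/dev/null 2>&1; then
    mail -s "Nightly report: $HOST" "$ADMIN_EMAIL" < "$REPORT"
else
    cat "$REPORT"
fi
```

Printing the report when no address is configured makes the script easy to check by hand before wiring it to a timer.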
05
Script Hardening & Testing

Applied best practices: set -euo pipefail, input validation, trap handlers for cleanup, and dry-run mode flags. Tested all scripts with bash -n (syntax check) and shellcheck.
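
Two of the practices named above, a trap handler for cleanup and a dry-run flag, could be combined along these lines. This is a sketch only: the DRY_RUN variable, the run wrapper, and the example user alice are hypothetical names, and dry-run is made the default here as a cautious assumption.

```shell
#!/usr/bin/env bash
set -euo pipefail

DRY_RUN="${DRY_RUN:-1}"          # hypothetical flag; 1 = only print commands
WORKDIR=$(mktemp -d)

# Runs on every exit path, including failures triggered by set -e.
cleanup() { rm -rf "$WORKDIR"; }
trap cleanup EXIT

# Executes a command, or prints it shell-quoted when in dry-run mode.
run() {
    if [[ "$DRY_RUN" == "1" ]]; then
        printf 'DRY-RUN:'
        printf ' %q' "$@"
        printf '\n'
    else
        "$@"
    fi
}

run useradd -m -s /bin/bash alice
run systemctl restart apache2
```

Setting DRY_RUN=0 in the environment switches the same script to real execution, so the dry-run output doubles as a preview of exactly what will be run.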

# Syntax check all scripts
bash -n /usr/local/bin/service_monitor.sh
bash -n /usr/local/bin/provision_users.sh
bash -n /usr/local/bin/nightly_report.sh

# Static analysis
shellcheck /usr/local/bin/*.sh

# Test provisioning with sample CSV
cat > /tmp/test_users.csv <
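
The sample file in the listing above is cut off. A minimal sketch of such a test input, with made-up users and groups, is shown below. It parses the file with the same IFS=',' read loop and only prints what would be done, so no accounts are created.

```shell
#!/usr/bin/env bash
set -euo pipefail

CSV=$(mktemp)
# Hypothetical sample rows; note the full names contain spaces.
cat > "$CSV" <<'EOF'
username,fullname,group,expiry_days
jdoe,Jane Doe,developers,90
asmith,Alan Smith,ops,60
EOF

parsed=0
while IFS=',' read -r username fullname group expiry; do
    [[ "$username" == "username" ]] && continue   # skip header
    printf 'would create %s (%s) in group %s, max password age %s days\n' \
        "$username" "$fullname" "$group" "$expiry"
    parsed=$((parsed + 1))
done < "$CSV"

echo "rows parsed: $parsed"
rm -f "$CSV"
```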

Complete Workflow

flowchart LR
    A[Write Scripts\nwith set -euo pipefail] --> B[Validate Syntax\nbash -n + shellcheck]
    B --> C[Deploy to\n/usr/local/bin/]
    C --> D[Create systemd\nService + Timer Units]
    D --> E[systemctl daemon-reload\nenable --now timer]
    E --> F[Manual Test Run\nsystemctl start .service]
    F --> G[Check Journal\njournalctl -u ...]
    G --> H{Output Correct?}
    H -->|yes| I[Monitor Runs\nAutomatically]
    H -->|no| J[Fix Script\nRedeploy]
    J --> F
    style A fill:#1a1a2e,stroke:#00d4ff,color:#e0e0e0
    style I fill:#1a1a2e,stroke:#00ff88,color:#e0e0e0
    style H fill:#181818,stroke:#1e1e1e,color:#888
    style J fill:#181818,stroke:#1e1e1e,color:#888

Challenges & Solutions

  • Script failing silently on unset variables: Added set -u so that any reference to an unset variable aborts the script with an error instead of silently expanding to an empty string.
  • Timer not triggering after reboot: The timer unit was missing WantedBy=timers.target in its [Install] section. Without it, systemctl enable had nothing to link, so the timer was not started at boot.
  • CSV parsing breaking on names with spaces: Fixed by setting IFS=',' for the read in the while loop, so fields are split only on commas rather than on default whitespace.
  • Mail not sending from server: The lab environment lacked an MTA. Installed postfix in local-only mode and used mailutils to route reports to the local admin mailbox.
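
As a small illustration of the first point, the sketch below shows how a script under set -u can still accept an optional argument. The require_arg helper is a hypothetical name used only for this example.

```shell
#!/usr/bin/env bash
set -u

# Hypothetical helper: echoes its first argument, or fails with a usage message.
require_arg() {
    local value="${1:-}"   # ${1:-} expands to empty instead of tripping set -u
    if [[ -z "$value" ]]; then
        echo "usage: script.sh users.csv" >&2
        return 1
    fi
    echo "$value"
}

require_arg "users.csv"
require_arg 2>/dev/null || echo "missing argument detected"
```

A bare "$1" in place of "${1:-}" would abort with an "unbound variable" error, which is the loud failure set -u is meant to produce when the omission is a genuine bug.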

Key Takeaways

  • set -euo pipefail belongs at the top of every production Bash script: it turns many subtle bugs into loud failures that are easy to diagnose. It is not a complete safety net, because set -e is ignored for commands tested in if conditions and in && or || lists, so commands that are expected to fail still need explicit handling.
  • systemd timers suit managed services better than cron: they log to the journal, support dependency ordering, and can be inspected with standard systemctl commands.
  • Batch provisioning scripts must be idempotent: running the same script twice should not create duplicate users or destroy existing accounts.
  • Always validate external inputs (CSV files, command-line arguments) before processing to prevent script failures or unintended system changes.
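
The idempotency point can be sketched as a check-before-act helper. The names user_exists and plan_user are hypothetical, and the example only prints the decision instead of calling useradd, so it is safe to run anywhere.

```shell
#!/usr/bin/env bash
set -euo pipefail

# True when the account is already known to the system.
user_exists() { id -u "$1" >/dev/null 2>&1; }

# Prints the action a provisioning run would take for one account.
plan_user() {
    local username=$1
    if user_exists "$username"; then
        echo "skip $username (already exists)"
    else
        echo "create $username"
    fi
}

plan_user root
plan_user no-such-user-for-demo
```

Because the decision depends only on the current system state, running it a second time after the account has been created yields "skip" rather than a second creation attempt.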