Managing multiple servers can be a daunting task, especially when striving for consistency and efficiency. To tackle this challenge, I developed a robust automation system using Ansible, GitHub Actions, and Vagrant. This setup not only streamlines server configuration but also ensures that deployments are repeatable and maintainable.
A Bit of History: How It All Started
This project began out of necessity. I was maintaining a handful of Ubuntu servers — one for email, another for a website, and a few for experiments — and I quickly realized that logging into each one to make manual changes was both tedious and error-prone. My first step toward automation was a collection of shell scripts. They worked, but as the infrastructure grew, they became hard to manage and lacked the modularity I needed.
That is when I discovered Ansible. I created the ansible-servers repository in early 2024 as a way to centralize and standardize my infrastructure automation. Initially, it only contained a basic playbook for setting up users and updating packages. But over time, it evolved to include multiple roles, structured inventories, and eventually CI/CD integration through GitHub Actions.
Every addition was born out of a real-world need. When I got tired of testing changes manually, I added Vagrant to simulate my environments locally. When I wanted to be sure my configurations stayed consistent after every push, I integrated GitHub Actions to automate deployments. When I noticed the repo growing, I introduced linting and security checks to maintain quality.
The repository has grown steadily and organically, each commit reflecting a small lesson learned or a new challenge overcome.
The Foundation: Ansible Playbooks
At the core of my automation strategy are Ansible playbooks, which define the desired state of my servers. These playbooks handle tasks such as installing necessary packages, configuring services, and setting up user accounts. By codifying these configurations, I can apply them consistently across different environments.
To manage these playbooks, I maintain a structured repository that includes:
Inventory Files: Located in the inventory directory, these YAML files specify the hosts and groups for deployment targets.
Roles: Under the roles directory, I define reusable components that encapsulate specific functionalities, such as setting up a web server or configuring a database.
Configuration File: The ansible.cfg file sets important defaults, like enabling fact caching and specifying the inventory path, to optimize Ansible’s behavior.
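As an illustration, the layout described above could be scaffolded with a few lines of shell. This is a minimal sketch and not the actual contents of the repository: the function name scaffold_repo, the inventory file name production.yml, the host web1.example.com and the cache directory .ansible_facts are all assumptions made for the example. The ansible.cfg keys themselves (inventory, gathering, fact_caching, fact_caching_connection, fact_caching_timeout) are standard Ansible settings.

```bash
#!/usr/bin/env bash
# Sketch only: scaffold_repo and every file name and host below are hypothetical.
scaffold_repo() {
  local root="$1"
  mkdir -p "$root/inventory" "$root/roles"

  # YAML inventory: one group with a single placeholder host.
  cat > "$root/inventory/production.yml" <<'EOF'
all:
  children:
    web:
      hosts:
        web1.example.com:
EOF

  # ansible.cfg: default inventory path plus JSON-file fact caching.
  cat > "$root/ansible.cfg" <<'EOF'
[defaults]
inventory = inventory/production.yml
gathering = smart
fact_caching = jsonfile
fact_caching_connection = .ansible_facts
fact_caching_timeout = 86400
EOF
}
```

Calling scaffold_repo ./demo creates the skeleton in a scratch directory, which is a convenient way to experiment with the settings before touching a real repository.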
Seamless Deployments with GitHub Actions
To automate the deployment process, I leverage GitHub Actions. This integration allows me to trigger Ansible playbooks automatically upon code changes, ensuring that my servers are always up-to-date with the latest configurations.
One of the key workflows is Deploy to Production, which executes the main playbook against the production inventory. This workflow is defined in the ansible-deploy.yml file and is triggered on specific events, such as pushes to the main branch.
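The command such a workflow ends up running is usually a single ansible-playbook call. The sketch below shows one possible shape of that step as a shell function. The function name deploy, the playbook name site.yml and the inventory path are assumptions for illustration, not taken from ansible-deploy.yml. DRY_RUN defaults to 1 here, so the command is only printed unless DRY_RUN=0 is set explicitly.

```bash
#!/usr/bin/env bash
# Hypothetical deploy wrapper; inventory and playbook names are assumptions.
# With DRY_RUN=1 (the default) the command is printed instead of executed.
deploy() {
  local env="${1:-production}"
  local cmd=(ansible-playbook -i "inventory/${env}.yml" site.yml --diff)
  if [[ "${DRY_RUN:-1}" == "1" ]]; then
    echo "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}

deploy production
```

Keeping the command in an array avoids quoting problems if an inventory path ever contains spaces, and the dry-run default makes it safe to try the function outside of CI.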
Additionally, I have set up other workflows to maintain code quality and security:
Super-Linter: Automatically checks the codebase for syntax errors and adherence to best practices.
Dependabot Updates: Keeps dependencies up-to-date by automatically creating pull requests for new versions.
Local Testing with Vagrant
Before deploying changes to production, it is crucial to test them in a controlled environment. For this purpose, I use Vagrant to spin up virtual machines that mirror my production servers.
This approach allows me to validate changes in a safe environment before applying them to live servers.
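A local test cycle with Vagrant can be as short as two commands. The following is a sketch under the assumption that the Vagrantfile applies the playbooks through a provisioner; the helper names run and local_test_cycle are made up for this example. As written it only prints the commands, unless DRY_RUN=0 is set.

```bash
#!/usr/bin/env bash
# Hypothetical local test cycle. Assumes the Vagrantfile provisions the VM
# with the same playbooks that are used in production.
run() {
  if [[ "${DRY_RUN:-1}" == "1" ]]; then
    echo "$*"          # dry run: show the command only
  else
    "$@"               # real run: execute it
  fi
}

local_test_cycle() {
  # Boot the VM and apply the provisioning; stop here if that fails,
  # so the VM stays around for debugging.
  run vagrant up --provision || return 1
  # Throw the VM away again once provisioning succeeded.
  run vagrant destroy -f
}

local_test_cycle
```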
Embracing Open Source and Continuous Improvement
Transparency and collaboration are vital in the open-source community. By hosting my automation setup on GitHub, I invite others to review, suggest improvements, and adapt the configurations for their own use cases.
The repository is licensed under the MIT License, encouraging reuse and modification. Moreover, I actively monitor issues and welcome contributions to enhance the system further.
In summary, by combining Ansible, GitHub Actions, and Vagrant, I have created a powerful and flexible automation framework for managing my servers. This setup not only reduces manual effort but also increases reliability and scalability. I encourage others to explore this approach and adapt it to their own infrastructure needs. What began as a few basic scripts has now evolved into an automation pipeline I rely on every day.
If you are managing servers and find yourself repeating the same configuration steps, I invite you to check out the ansible-servers repository on GitHub. Clone it, explore the structure, try it in your own environment — and if you have ideas or improvements, feel free to open a pull request or start a discussion. Automation has made a huge difference for me, and I hope it can do the same for you.
In my previous post, I shared the story of why I needed a new USB stick and how I used ChatGPT to write a benchmark script that could measure read performance across various methods. In this follow-up, I will dive into the technical details of how the script evolved—from a basic prototype into a robust and feature-rich tool—thanks to incremental refinements and some AI-assisted development.
Starting Simple: The First Version
The initial idea was simple: read a file using dd and measure the speed.
That first version had its problems, among them: no cache flushing, leading to inflated results when repeating the measurement.
With ChatGPT’s help, I started addressing each of these issues one by one.
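For reference, the starting point described above (read a file with dd and look at the reported speed) can be sketched in a few lines. This is a reconstruction of the idea, not the original first version; the function name basic_read_speed and the block size are assumptions.

```bash
#!/usr/bin/env bash
# Minimal sketch of the first idea: read a file with dd and report the speed.
# basic_read_speed is a hypothetical name, not taken from the original script.
basic_read_speed() {
  local file="$1"
  # dd prints its statistics on stderr; the last line holds the throughput.
  LC_ALL=C dd if="$file" of=/dev/null bs=1M 2>&1 | tail -n1
}
```

Running it twice on the same file shows the caching problem: the second call is typically served from the page cache and reports a much higher number than the drive can deliver.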
Tools check
On a default Ubuntu installation, some tools are available by default, while others (especially benchmarking tools) usually need to be installed separately.
Tools used in the script:
| Tool | Installed by default? | Needs require() check? |
|---|---|---|
| hdparm | ❌ Not installed | ✅ Yes |
| dd | ✅ Yes | ❌ No |
| pv | ❌ Not installed | ✅ Yes |
| cat | ✅ Yes | ❌ No |
| ioping | ❌ Not installed | ✅ Yes |
| fio | ❌ Not installed | ✅ Yes |
| lsblk | ✅ Yes (in util-linux) | ❌ No |
| awk | ✅ Yes (in gawk) | ❌ No |
| grep | ✅ Yes | ❌ No |
| basename | ✅ Yes (in coreutils) | ❌ No |
| find | ✅ Yes | ❌ No |
| sort | ✅ Yes | ❌ No |
| stat | ✅ Yes | ❌ No |
This function ensures the system has all tools needed for benchmarking. It exits early if any tool is missing.
This was the initial version:
check_required_tools() {
local required_tools=(dd pv hdparm fio ioping awk grep sed tr bc stat lsblk find sort)
for tool in "${required_tools[@]}"; do
if ! command -v "$tool" &>/dev/null; then
echo "❌ Required tool '$tool' is not installed."
exit 1
fi
done
}
That’s already nice, but maybe I just want to run the script anyway if some of the tools are missing.
This is a more advanced version:
ALL_TOOLS=(hdparm dd pv ioping fio lsblk stat grep awk find sort basename column gnuplot)
MISSING_TOOLS=()
require() {
if ! command -v "$1" >/dev/null; then
return 1
fi
return 0
}
check_required_tools() {
echo "🔍 Checking required tools..."
for tool in "${ALL_TOOLS[@]}"; do
if ! require "$tool"; then
MISSING_TOOLS+=("$tool")
fi
done
if [[ ${#MISSING_TOOLS[@]} -gt 0 ]]; then
echo "⚠️ The following tools are missing: ${MISSING_TOOLS[*]}"
echo "You can install them using: sudo apt install ${MISSING_TOOLS[*]}"
if [[ -z "$FORCE_YES" ]]; then
read -rp "Do you want to continue and skip tests that require them? (y/N): " yn
case $yn in
[Yy]*)
echo "Continuing with limited tests..."
;;
*)
echo "Aborting. Please install the required tools."
exit 1
;;
esac
else
echo "Continuing with limited tests (auto-confirmed)..."
fi
else
echo "✅ All required tools are available."
fi
}
Device Auto-Detection
One early challenge was identifying which device was the USB stick. I wanted the script to automatically detect a mounted USB device. My first version was clunky and error-prone.
detect_usb() {
USB_DEVICE=$(lsblk -o NAME,TRAN,MOUNTPOINT -J | jq -r '.blockdevices[] | select(.tran=="usb") | .name' | head -n1)
if [[ -z "$USB_DEVICE" ]]; then
echo "❌ No USB device detected."
exit 1
fi
USB_PATH="/dev/$USB_DEVICE"
MOUNT_PATH=$(lsblk -no MOUNTPOINT "$USB_PATH" | head -n1)
if [[ -z "$MOUNT_PATH" ]]; then
echo "❌ USB device is not mounted."
exit 1
fi
echo "✅ Using USB device: $USB_PATH"
echo "✅ Mounted at: $MOUNT_PATH"
}
After a few iterations, we (ChatGPT and I) settled on parsing lsblk with filters on tran=usb and hotplug=1, and selecting the first mounted partition.
We also added a fallback prompt in case auto-detection failed.
detect_usb() {
if [[ -n "$USB_DEVICE" ]]; then
echo "📎 Using provided USB device: $USB_DEVICE"
MOUNT_PATH=$(lsblk -no MOUNTPOINT "$USB_DEVICE")
return
fi
echo "🔍 Detecting USB device..."
USB_DEVICE=""
while read -r dev tran hotplug type _; do
if [[ "$tran" == "usb" && "$hotplug" == "1" && "$type" == "disk" ]]; then
base="/dev/$dev"
part=$(lsblk -nr -o NAME,MOUNTPOINT "$base" | awk '$2 != "" {print "/dev/"$1; exit}')
if [[ -n "$part" ]]; then
USB_DEVICE="$part"
break
fi
fi
done < <(lsblk -o NAME,TRAN,HOTPLUG,TYPE,MOUNTPOINT -nr)
if [ -z "$USB_DEVICE" ]; then
echo "❌ No mounted USB partition found on any USB disk."
lsblk -o NAME,TRAN,HOTPLUG,TYPE,SIZE,MOUNTPOINT -nr | grep part
read -rp "Enter the USB device path manually (e.g., /dev/sdc1): " USB_DEVICE
fi
MOUNT_PATH=$(lsblk -no MOUNTPOINT "$USB_DEVICE")
if [ -z "$MOUNT_PATH" ]; then
echo "❌ USB device is not mounted."
exit 1
fi
echo "✅ Using USB device: $USB_DEVICE"
echo "✅ Mounted at: $MOUNT_PATH"
}
Finding the Test File
To avoid hardcoding filenames, we implemented logic to search for the latest Ubuntu ISO on the USB stick.
find_ubuntu_iso() {
# Function to find an Ubuntu ISO on the USB device
find "$MOUNT_PATH" -type f -regextype posix-extended \
-regex ".*/ubuntu-[0-9]{2}\.[0-9]{2}-desktop-amd64\\.iso" | sort -V | tail -n1
}
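The sort -V | tail -n1 part is what picks the newest release: -V is the version sort from GNU coreutils, which compares the numeric parts of the names as numbers. A small sketch with made-up file names (pick_latest is a hypothetical helper, not part of the script):

```bash
#!/usr/bin/env bash
# pick_latest: hypothetical helper that mirrors the "sort -V | tail -n1" step.
pick_latest() {
  sort -V | tail -n1
}

# Made-up file names, deliberately listed out of order.
printf '%s\n' \
  ubuntu-24.04-desktop-amd64.iso \
  ubuntu-22.04-desktop-amd64.iso \
  ubuntu-23.10-desktop-amd64.iso | pick_latest
# prints: ubuntu-24.04-desktop-amd64.iso
```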
Later, we enhanced it to accept a user-provided file, and even verify that the file was located on the USB stick. If it was not, the script would gracefully fall back to the Ubuntu ISO search.
find_test_file() {
if [[ -n "$TEST_FILE" ]]; then
echo "📎 Using provided test file: $(basename "$TEST_FILE")"
# Check if the provided test file is on the USB device
if [[ "$(realpath "$TEST_FILE")" != "$MOUNT_PATH"/* ]]; then
echo "❌ The provided test file is not located on the USB device."
# Look for an Ubuntu ISO if it's not on the USB
TEST_FILE=$(find_ubuntu_iso)
fi
else
TEST_FILE=$(find_ubuntu_iso)
fi
if [ -z "$TEST_FILE" ]; then
echo "❌ No valid test file found."
exit 1
fi
if [[ "$TEST_FILE" =~ ubuntu-[0-9]{2}\.[0-9]{2}-desktop-amd64\.iso ]]; then
UBUNTU_VERSION=$(basename "$TEST_FILE" | grep -oP 'ubuntu-\d{2}\.\d{2}')
echo "🧪 Selected Ubuntu version: $UBUNTU_VERSION"
else
echo "📎 Selected test file: $(basename "$TEST_FILE")"
fi
}
Read Methods and Speed Extraction
To get a comprehensive view, we added multiple methods: hdparm, dd, dd + pv, cat + pv, ioping, and fio.
Parsing their outputs proved tricky. For example, pv outputs speed with or without spaces, and with different units. We created a robust extract_speed function with regex, and a speed_to_mb function that could handle both MB/s and MiB/s, with or without a space between value and unit.
extract_speed() {
grep -oP '(?i)[\d.,]+\s*[KMG]i?B/s' | tail -1 | sed 's/,/./'
}
speed_to_mb() {
if [[ "$1" =~ ([0-9.,]+)[[:space:]]*([a-zA-Z/]+) ]]; then
value="${BASH_REMATCH[1]}"
unit=$(echo "${BASH_REMATCH[2]}" | tr '[:upper:]' '[:lower:]')
else
echo "0"
return
fi
case "$unit" in
kb/s) awk -v v="$value" 'BEGIN { printf "%.2f", v / 1000 }' ;;
mb/s) awk -v v="$value" 'BEGIN { printf "%.2f", v }' ;;
gb/s) awk -v v="$value" 'BEGIN { printf "%.2f", v * 1000 }' ;;
kib/s) awk -v v="$value" 'BEGIN { printf "%.2f", v / 1024 }' ;;
mib/s) awk -v v="$value" 'BEGIN { printf "%.2f", v }' ;;
gib/s) awk -v v="$value" 'BEGIN { printf "%.2f", v * 1024 }' ;;
*) echo "0" ;;
esac
}
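One detail worth knowing: speed_to_mb above maps MiB/s to the same number as MB/s, although 1 MiB is 1,048,576 bytes and 1 MB is 1,000,000 bytes. For comparing two sticks with the same tool that does not matter much, but when comparing across tools it can. The following is a hypothetical variant, not the function used in the script, that always converts to decimal MB/s; the name speed_to_mb_exact is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical variant of speed_to_mb: always returns decimal MB/s
# (10^6 bytes per second) and treats KiB/MiB/GiB as binary units.
speed_to_mb_exact() {
  local re='^[[:space:]]*([0-9]+([.,][0-9]+)?)[[:space:]]*([KkMmGg]i?[Bb]/s)'
  local value unit factor
  if [[ "$1" =~ $re ]]; then
    value="${BASH_REMATCH[1]/,/.}"   # accept a decimal comma as well
    unit=$(printf '%s' "${BASH_REMATCH[3]}" | tr '[:upper:]' '[:lower:]')
  else
    echo "0"
    return 1
  fi
  case "$unit" in
    kb/s)  factor=0.001 ;;
    mb/s)  factor=1 ;;
    gb/s)  factor=1000 ;;
    kib/s) factor=0.001024 ;;       # 1024 / 10^6
    mib/s) factor=1.048576 ;;       # 1024^2 / 10^6
    gib/s) factor=1073.741824 ;;    # 1024^3 / 10^6
    *)     echo "0"; return 1 ;;
  esac
  LC_ALL=C awk -v v="$value" -v f="$factor" 'BEGIN { printf "%.2f\n", v * f }'
}

speed_to_mb_exact "18.6MB/s"    # prints 18.60
speed_to_mb_exact "1 MiB/s"     # prints 1.05
```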
Dropping Caches for Accurate Results
To prevent cached reads from skewing the results, each test run begins by dropping system caches using:
sync && echo 3 | sudo tee /proc/sys/vm/drop_caches
What it does:
| Command | Purpose |
|---|---|
| sync | Flushes all dirty (pending write) pages to disk |
| echo 3 > /proc/sys/vm/drop_caches | Clears page cache, dentries, and inodes from RAM |
We wrapped this in a helper function and used it consistently.
Multiple Runs and Averaging
We made the script repeat each test N times (default: 3), collect results, compute averages, and display a summary at the end.
echo "📊 Read-only USB benchmark started ($RUNS run(s))"
echo "==================================="
declare -A TEST_NAMES=(
[1]="hdparm"
[2]="dd"
[3]="dd + pv"
[4]="cat + pv"
[5]="ioping"
[6]="fio"
)
declare -A TOTAL_MB
for i in {1..6}; do TOTAL_MB[$i]=0; done
CSVFILE="usb-benchmark-$(date +%Y%m%d-%H%M%S).csv"
echo "Test,Run,Speed (MB/s)" > "$CSVFILE"
for ((run=1; run<=RUNS; run++)); do
echo "▶ Run $run"
idx=1
# ... the individual tests run here (see the full script below) ...
done
echo "📄 Summary of average results for $UBUNTU_VERSION:"
echo "==================================="
SUMMARY_TABLE=""
for i in {1..6}; do
if [[ ${TOTAL_MB[$i]} != 0 ]]; then
avg=$(echo "scale=2; ${TOTAL_MB[$i]} / $RUNS" | bc)
echo "${TEST_NAMES[$i]} average: $avg MB/s"
RESULTS+=("${TEST_NAMES[$i]} average: $avg MB/s")
SUMMARY_TABLE+="${TEST_NAMES[$i]},$avg\n"
fi
done
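Stripped of the bookkeeping, the averaging step boils down to summing the per-run values and dividing by the number of runs. A standalone sketch, using awk instead of bc (a swap made for this example, since bc prints values below 1 without a leading zero); the helper name average is hypothetical:

```bash
#!/usr/bin/env bash
# average: hypothetical helper; prints the mean of its arguments
# with two decimals. Uses awk rather than the bc calls in the script.
average() {
  if [[ $# -eq 0 ]]; then
    echo "0.00"
    return 1
  fi
  printf '%s\n' "$@" |
    LC_ALL=C awk '{ sum += $1 } END { printf "%.2f\n", sum / NR }'
}

average 10 20 30    # prints 20.00
```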
Output Formats
To make the results user-friendly, we added:
A clean table view
CSV export for spreadsheets
Log file for later reference
if [[ "$VISUAL" == "table" || "$VISUAL" == "both" ]]; then
echo -e "📋 Table view:"
echo -e "Test Method,Average MB/s\n$SUMMARY_TABLE" | column -t -s ','
fi
if [[ "$VISUAL" == "bar" || "$VISUAL" == "both" ]]; then
if require gnuplot; then
echo -e "$SUMMARY_TABLE" | awk -F',' '{print $1" "$2}' | \
gnuplot -p -e "
set terminal dumb;
set title 'USB Read Benchmark Results ($UBUNTU_VERSION)';
set xlabel 'Test Method';
set ylabel 'MB/s';
plot '-' using 2:xtic(1) with boxes notitle
"
fi
fi
LOGFILE="usb-benchmark-$(date +%Y%m%d-%H%M%S).log"
{
echo "Benchmark for USB device: $USB_DEVICE"
echo "Mounted at: $MOUNT_PATH"
echo "Ubuntu version: $UBUNTU_VERSION"
echo "Test file: $TEST_FILE"
echo "Timestamp: $(date)"
echo "Number of runs: $RUNS"
echo ""
echo "Read speed averages:"
for line in "${RESULTS[@]}"; do
echo "$line"
done
} > "$LOGFILE"
echo "📝 Results saved to: $LOGFILE"
echo "📈 CSV exported to: $CSVFILE"
echo "==================================="
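Because the CSV file has one row per test and run, it is easy to post-process outside of a spreadsheet too. The sketch below recomputes the per-test averages from such a file with awk. It assumes the three-column layout written by the script (Test,Run,Speed (MB/s)); the function name summarize_csv is made up for the example.

```bash
#!/usr/bin/env bash
# summarize_csv: hypothetical helper that reads a CSV with the columns
# "Test,Run,Speed (MB/s)" and prints "test,average" lines sorted by name.
summarize_csv() {
  LC_ALL=C awk -F',' '
    NR > 1 { sum[$1] += $3; n[$1]++ }    # skip the header, sum per test
    END    { for (t in sum) printf "%s,%.2f\n", t, sum[t] / n[t] }
  ' "$1" | LC_ALL=C sort
}
```

Usage would be something like summarize_csv usb-benchmark-20250101-120000.csv, where the file name is only an example of the pattern the script generates.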
The Full Script
Here is the complete version of the script used to benchmark the read performance of a USB drive:
#!/bin/bash
# ==========================
# CONFIGURATION
# ==========================
RESULTS=()
USB_DEVICE=""
TEST_FILE=""
RUNS=1
VISUAL="none"
SUMMARY=0
# (Consider grouping related configuration into a config file or associative array if script expands)
# ==========================
# ARGUMENT PARSING
# ==========================
while [[ $# -gt 0 ]]; do
case $1 in
--device)
USB_DEVICE="$2"
shift 2
;;
--file)
TEST_FILE="$2"
shift 2
;;
--runs)
RUNS="$2"
shift 2
;;
--visual)
VISUAL="$2"
shift 2
;;
--summary)
SUMMARY=1
shift
;;
--yes|--force)
FORCE_YES=1
shift
;;
*)
echo "Unknown option: $1"
exit 1
;;
esac
done
# ==========================
# TOOL CHECK
# ==========================
ALL_TOOLS=(hdparm dd pv ioping fio lsblk stat grep awk find sort basename column gnuplot)
MISSING_TOOLS=()
require() {
if ! command -v "$1" >/dev/null; then
return 1
fi
return 0
}
check_required_tools() {
echo "🔍 Checking required tools..."
for tool in "${ALL_TOOLS[@]}"; do
if ! require "$tool"; then
MISSING_TOOLS+=("$tool")
fi
done
if [[ ${#MISSING_TOOLS[@]} -gt 0 ]]; then
echo "⚠️ The following tools are missing: ${MISSING_TOOLS[*]}"
echo "You can install them using: sudo apt install ${MISSING_TOOLS[*]}"
if [[ -z "$FORCE_YES" ]]; then
read -rp "Do you want to continue and skip tests that require them? (y/N): " yn
case $yn in
[Yy]*)
echo "Continuing with limited tests..."
;;
*)
echo "Aborting. Please install the required tools."
exit 1
;;
esac
else
echo "Continuing with limited tests (auto-confirmed)..."
fi
else
echo "✅ All required tools are available."
fi
}
# ==========================
# AUTO-DETECT USB DEVICE
# ==========================
detect_usb() {
if [[ -n "$USB_DEVICE" ]]; then
echo "📎 Using provided USB device: $USB_DEVICE"
MOUNT_PATH=$(lsblk -no MOUNTPOINT "$USB_DEVICE")
return
fi
echo "🔍 Detecting USB device..."
USB_DEVICE=""
while read -r dev tran hotplug type _; do
if [[ "$tran" == "usb" && "$hotplug" == "1" && "$type" == "disk" ]]; then
base="/dev/$dev"
part=$(lsblk -nr -o NAME,MOUNTPOINT "$base" | awk '$2 != "" {print "/dev/"$1; exit}')
if [[ -n "$part" ]]; then
USB_DEVICE="$part"
break
fi
fi
done < <(lsblk -o NAME,TRAN,HOTPLUG,TYPE,MOUNTPOINT -nr)
if [ -z "$USB_DEVICE" ]; then
echo "❌ No mounted USB partition found on any USB disk."
lsblk -o NAME,TRAN,HOTPLUG,TYPE,SIZE,MOUNTPOINT -nr | grep part
read -rp "Enter the USB device path manually (e.g., /dev/sdc1): " USB_DEVICE
fi
MOUNT_PATH=$(lsblk -no MOUNTPOINT "$USB_DEVICE")
if [ -z "$MOUNT_PATH" ]; then
echo "❌ USB device is not mounted."
exit 1
fi
echo "✅ Using USB device: $USB_DEVICE"
echo "✅ Mounted at: $MOUNT_PATH"
}
# ==========================
# FIND TEST FILE
# ==========================
find_ubuntu_iso() {
# Function to find an Ubuntu ISO on the USB device
find "$MOUNT_PATH" -type f -regextype posix-extended \
-regex ".*/ubuntu-[0-9]{2}\.[0-9]{2}-desktop-amd64\\.iso" | sort -V | tail -n1
}
find_test_file() {
if [[ -n "$TEST_FILE" ]]; then
echo "📎 Using provided test file: $(basename "$TEST_FILE")"
# Check if the provided test file is on the USB device
if [[ "$(realpath "$TEST_FILE")" != "$MOUNT_PATH"/* ]]; then
echo "❌ The provided test file is not located on the USB device."
# Look for an Ubuntu ISO if it's not on the USB
TEST_FILE=$(find_ubuntu_iso)
fi
else
TEST_FILE=$(find_ubuntu_iso)
fi
if [ -z "$TEST_FILE" ]; then
echo "❌ No valid test file found."
exit 1
fi
if [[ "$TEST_FILE" =~ ubuntu-[0-9]{2}\.[0-9]{2}-desktop-amd64\.iso ]]; then
UBUNTU_VERSION=$(basename "$TEST_FILE" | grep -oP 'ubuntu-\d{2}\.\d{2}')
echo "🧪 Selected Ubuntu version: $UBUNTU_VERSION"
else
echo "📎 Selected test file: $(basename "$TEST_FILE")"
fi
}
# ==========================
# SPEED EXTRACTION
# ==========================
extract_speed() {
grep -oP '(?i)[\d.,]+\s*[KMG]i?B/s' | tail -1 | sed 's/,/./'
}
speed_to_mb() {
if [[ "$1" =~ ([0-9.,]+)[[:space:]]*([a-zA-Z/]+) ]]; then
value="${BASH_REMATCH[1]}"
unit=$(echo "${BASH_REMATCH[2]}" | tr '[:upper:]' '[:lower:]')
else
echo "0"
return
fi
case "$unit" in
kb/s) awk -v v="$value" 'BEGIN { printf "%.2f", v / 1000 }' ;;
mb/s) awk -v v="$value" 'BEGIN { printf "%.2f", v }' ;;
gb/s) awk -v v="$value" 'BEGIN { printf "%.2f", v * 1000 }' ;;
kib/s) awk -v v="$value" 'BEGIN { printf "%.2f", v / 1024 }' ;;
mib/s) awk -v v="$value" 'BEGIN { printf "%.2f", v }' ;;
gib/s) awk -v v="$value" 'BEGIN { printf "%.2f", v * 1024 }' ;;
*) echo "0" ;;
esac
}
drop_caches() {
echo "🧹 Dropping system caches..."
if [[ $EUID -ne 0 ]]; then
echo " (requires sudo)"
fi
sudo sh -c "sync && echo 3 > /proc/sys/vm/drop_caches"
}
# ==========================
# RUN BENCHMARKS
# ==========================
run_benchmarks() {
echo "📊 Read-only USB benchmark started ($RUNS run(s))"
echo "==================================="
declare -A TEST_NAMES=(
[1]="hdparm"
[2]="dd"
[3]="dd + pv"
[4]="cat + pv"
[5]="ioping"
[6]="fio"
)
declare -A TOTAL_MB
for i in {1..6}; do TOTAL_MB[$i]=0; done
CSVFILE="usb-benchmark-$(date +%Y%m%d-%H%M%S).csv"
echo "Test,Run,Speed (MB/s)" > "$CSVFILE"
for ((run=1; run<=RUNS; run++)); do
echo "▶ Run $run"
idx=1
if require hdparm; then
drop_caches
speed=$(sudo hdparm -t --direct "$USB_DEVICE" 2>/dev/null | extract_speed)
mb=$(speed_to_mb "$speed")
echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
fi
((idx++))
drop_caches
speed=$(dd if="$TEST_FILE" of=/dev/null bs=8k 2>&1 | extract_speed)
mb=$(speed_to_mb "$speed")
echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
((idx++))
if require pv; then
drop_caches
FILESIZE=$(stat -c%s "$TEST_FILE")
speed=$(dd if="$TEST_FILE" bs=8k status=none | pv -s "$FILESIZE" -f -X 2>&1 | extract_speed)
mb=$(speed_to_mb "$speed")
echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
fi
((idx++))
if require pv; then
drop_caches
speed=$(cat "$TEST_FILE" | pv -f -X 2>&1 | extract_speed)
mb=$(speed_to_mb "$speed")
echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
fi
((idx++))
if require ioping; then
drop_caches
speed=$(ioping -c 10 -A "$USB_DEVICE" 2>/dev/null | grep 'read' | extract_speed)
mb=$(speed_to_mb "$speed")
echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
fi
((idx++))
if require fio; then
drop_caches
speed=$(fio --name=readtest --filename="$TEST_FILE" --direct=1 --rw=read --bs=8k \
--size=100M --ioengine=libaio --iodepth=16 --runtime=5s --time_based --readonly \
--minimal 2>/dev/null | awk -F';' '{print $7" KiB/s"}' | extract_speed) # terse field 7 = read bandwidth in KiB/s (field 6 is total I/O)
mb=$(speed_to_mb "$speed")
echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
fi
done
echo "📄 Summary of average results for $UBUNTU_VERSION:"
echo "==================================="
SUMMARY_TABLE=""
for i in {1..6}; do
if [[ ${TOTAL_MB[$i]} != 0 ]]; then
avg=$(echo "scale=2; ${TOTAL_MB[$i]} / $RUNS" | bc)
echo "${TEST_NAMES[$i]} average: $avg MB/s"
RESULTS+=("${TEST_NAMES[$i]} average: $avg MB/s")
SUMMARY_TABLE+="${TEST_NAMES[$i]},$avg\n"
fi
done
if [[ "$VISUAL" == "table" || "$VISUAL" == "both" ]]; then
echo -e "📋 Table view:"
echo -e "Test Method,Average MB/s\n$SUMMARY_TABLE" | column -t -s ','
fi
if [[ "$VISUAL" == "bar" || "$VISUAL" == "both" ]]; then
if require gnuplot; then
echo -e "$SUMMARY_TABLE" | awk -F',' '{print $1" "$2}' | \
gnuplot -p -e "
set terminal dumb;
set title 'USB Read Benchmark Results ($UBUNTU_VERSION)';
set xlabel 'Test Method';
set ylabel 'MB/s';
plot '-' using 2:xtic(1) with boxes notitle
"
fi
fi
LOGFILE="usb-benchmark-$(date +%Y%m%d-%H%M%S).log"
{
echo "Benchmark for USB device: $USB_DEVICE"
echo "Mounted at: $MOUNT_PATH"
echo "Ubuntu version: $UBUNTU_VERSION"
echo "Test file: $TEST_FILE"
echo "Timestamp: $(date)"
echo "Number of runs: $RUNS"
echo ""
echo "Read speed averages:"
for line in "${RESULTS[@]}"; do
echo "$line"
done
} > "$LOGFILE"
echo "📝 Results saved to: $LOGFILE"
echo "📈 CSV exported to: $CSVFILE"
echo "==================================="
}
# ==========================
# MAIN
# ==========================
check_required_tools
detect_usb
find_test_file
run_benchmarks
You can also find the latest revision of this script as a GitHub Gist.
Lessons Learned
This script has grown from a simple one-liner into a reliable tool to test USB read performance. Working with ChatGPT sped up development significantly, especially for bash edge cases and regex. But more importantly, it helped guide the evolution of the script in a structured way, with clean modular functions and consistent formatting.
Conclusion
This has been a fun and educational project. Whether you are benchmarking your own USB drives or just want to learn more about shell scripting, I hope this walkthrough is helpful.
Next up? Maybe a graphical version, or write benchmarking on a RAM disk to avoid damaging flash storage.
Stay tuned—and let me know if you use this script or improve it!
When I upgraded from an old 8GB USB stick to a shiny new 256GB one, I expected faster speeds and more convenience—especially for carrying around multiple bootable ISO files using Ventoy. With modern Linux distributions often exceeding 4GB per ISO, my old drive could barely hold a single image. But I quickly realized that storage space was only half the story—performance matters too.
Curious about how much of an upgrade I had actually made, I decided to benchmark the read speed of both USB sticks. Instead of hunting down benchmarking tools or manually comparing outputs, I turned to ChatGPT to help me craft a reliable, repeatable shell script that could automate the entire process. In this post, I’ll share how ChatGPT helped me go from an idea to a functional USB benchmark script, and what I learned along the way.
The Goal
I wanted to answer a few simple but important questions:
How much faster is my new USB stick compared to the old one?
Do different USB ports affect read speeds?
How can I automate these tests and compare the results?
But I also wanted a reusable script that would:
Detect the USB device automatically
Find or use a test file on the USB stick
Run several types of read benchmarks
Present the results clearly, with support for summary and CSV export
Getting Help from ChatGPT
I asked ChatGPT to help me write a shell script with these requirements. It guided me through:
Handling different cases for user-provided test files or Ubuntu ISOs
Parsing and converting human-readable speed outputs
Displaying results in human-friendly tables and optional CSV export
We iterated over the script, addressing edge cases like:
USB devices not mounted
Multiple USB partitions
pv not showing output unless stderr was correctly handled
Formatting output consistently across tools
ChatGPT even helped optimize the code for readability, reduce duplication, and handle both space-separated and non-space-separated speed values like “18.6 MB/s” and “18.6MB/s”.
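The stderr issue is easy to reproduce. Tools like pv and dd write their progress and statistics to stderr, so a plain $(...) capture stays empty unless stderr is redirected into it; pv additionally needs -f to print anything when stderr is not a terminal. The sketch below demonstrates the principle with dd, because it is available everywhere. The two function names are hypothetical.

```bash
#!/usr/bin/env bash
# Both functions read a file with dd; they differ only in what happens
# to stderr, which is where dd (like pv) reports its statistics.

# stderr is discarded here, so a $(...) capture around this call is empty.
capture_stdout_only() {
  dd if="$1" of=/dev/null bs=64k 2>/dev/null
}

# stderr is merged into stdout, so the statistics can be captured and parsed.
capture_with_stderr() {
  LC_ALL=C dd if="$1" of=/dev/null bs=64k 2>&1
}
```

With a test file at hand, $(capture_stdout_only file) yields an empty string, while $(capture_with_stderr file) contains the line with the throughput.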
Benchmark Results
With the script ready, I ran tests on three configurations:
The old USB stick is not only limited in capacity but also very slow. It barely breaks 20 MB/s in most tests.
The new USB stick, when plugged into a fast USB 3.0 port, is significantly faster—over 10x the speed in most benchmarks.
Plugging the same new stick into a slower port dramatically reduces its performance—a good reminder to check where you plug it in.
Tools like hdparm, dd, and cat + pv give relatively consistent results. However, ioping and fio behave differently due to the way they access data—random access or block size differences can impact results.
Also worth noting: the metal casing of the new USB stick gets warm after a few test runs, unlike the old plastic one.
Conclusion
Using ChatGPT to develop this benchmark script was like pair-programming with an always-available assistant. It accelerated development, helped troubleshoot weird edge cases, and made the script more polished than if I had done it alone.
If you want to test your own USB drives—or ensure you’re using the best port for speed—this benchmark script is a great tool to have in your kit. And if you’re looking to learn shell scripting, pairing with ChatGPT is an excellent way to level up.
Want the script? I’ll share the full version of the script and instructions on how to use it in a follow-up post. Stay tuned!
From the AI-generated Wikipedia summary for a 10 year old:
The Advent of Code is an exciting annual computer programming event that takes place during the holiday season. It’s a fun challenge for programmers of all levels!
Every day in December leading up to Christmas, a new coding puzzle is released on the Advent of Code website. These puzzles are designed to test your problem-solving skills and help you improve your coding abilities.
You can participate by solving each puzzle using any programming language you’re comfortable with. The puzzles start off easy and gradually become more challenging as the days go by. You’ll get to explore different concepts like algorithms, data structures, and logical thinking while having lots of fun!
Not only will you have the opportunity to learn and practice coding, but there’s also a friendly community of fellow participants who share their solutions and discuss strategies on forums or social media platforms.
So if you enjoy coding or want to give it a try, the Advent of Code is a fantastic event for you! It’s a great way to sharpen your programming skills while enjoying the festive spirit during the holiday season.
Back in 2018 I created a GitHub repository with the good intention to work on all the puzzles, starting from the first year, 2015. Well, guess what never happened? ¯\_(ツ)_/¯
This year I’m starting again. I do not promise that I will work on a puzzle every day. Maybe I’ll spend more time on procrastinating setting up GitHub Actions. We’ll see…
The blog archive displays the full article, not just an excerpt.
Infinite scroll! I don’t know yet if I like it, I might change it later.
The blog archive has 2 columns. Again, I’m not sure about this, might change it later. Feedback is welcome, leave a comment! EDIT: I changed it to a single column, that’s easier to read, especially on mobile.
The most recent post is displayed full width.
On individual posts the thumbnail image is now the background of the title.
I’m still not entirely happy that the author is shown at the bottom of each blog post. I’m the only author here, so that’s useless, but I have not yet found how to remove that. EDIT: fixed with some extra CSS. Thanks for the tip, Frank!
Do you have any suggestions or comments on the new layout?
When I first bought my Fitbit, I was genuinely excited. It looked sleek, had all the right features—heart rate tracking, sleep analysis, step counting—and promised to help me better understand and improve my health. For a while, it felt like a good investment.
But over time, my enthusiasm faded. The more I used it, the more I realized something frustrating: Fitbit is a closed ecosystem, and that comes with some serious drawbacks.
Walled Garden, Limited Freedom
What do I mean by “closed ecosystem”? Essentially, Fitbit controls every aspect of the experience—from the hardware to the software to how your data is accessed. You are locked into their app, their platform, and their way of doing things.
Want to export your health data in a usable, open format? Tough luck. Want to use your Fitbit with a different app or platform? You will likely run into walls, paywalls, or limited APIs. Even things as basic as syncing your steps with other services can become frustratingly complicated—or simply impossible without a third-party workaround or a paid subscription.
Your Data, Their Rules
This is perhaps what bothers me most. The data collected by Fitbit—your heart rate, activity, sleep patterns—is incredibly personal. Yet Fitbit treats it like their property. You can view it in their app, sure, but only in the ways they allow. If you want more detailed insights or longer historical views, you often need to pay for Fitbit Premium.
And even then, it is not truly your data in the way it should be. You cannot easily export it, analyze it, or integrate it with other tools without hitting a wall. Contrast this with platforms that support open data standards and allow users to take full control of their own information.
Vendor Lock-in Is Real
Another big issue: once you are in the Fitbit ecosystem, it is hard to leave. If you switch to another tracker, you lose your history. There is no easy way to transfer years of health data to a new device or platform. That means people often stick with Fitbit—not because it is the best option, but because they do not want to start over from scratch.
This is a classic case of vendor lock-in. And it feels especially wrong when we are talking about personal health data.
It Did Not Have to Be This Way
The thing is, Fitbit could have done this differently. They could have embraced open standards, supported broader integration, and given users real ownership of their data. They could have made it easier to work with third-party apps and services. Instead, they chose to build a walled garden—and I am no longer interested in living in it.
Looking Ahead
I have not decided which tracker I will switch to yet, but one thing is clear: I want something open. Something that respects my ownership of my data. Something that plays nicely with other tools and services I already use.
Fitbit might work well for some people, and that is fine. But for me, the closed ecosystem is a dealbreaker. I want freedom, transparency, and real control over my data—and until Fitbit changes course, I will be looking elsewhere.
Last month I moved from Merelbeke to Ghent. I registered my new address on the government website, and last week I was invited to update my eID with my new address.
I made an appointment with one of the administrative centers of the city. The entire process took less than 5 minutes, and at the end I got a welcome gift: a box with a lot of information about the city services.
It’s been a while since I last did an unboxing video. The audio is in Dutch. If I’m not too lazy (and only if people ask for it in the comments), I’ll provide subtitles.
Unboxing van het verwelkomingspakket van @StadGentVideos (in English: unboxing of the welcome package from the city of Ghent)
Me: uses Pandoc and LaTeX to convert the file to PDF and read it in Evince because I don’t have LibreOffice installed and I’m too lazy to upload the document to Google Docs.
I needed to review an addendum to a rental contract. (I moved! I’ll write about that later.) The addendum was sent to me in ODT format. At the time, my desktop PC was still packed in a box. On my laptop (a 2011 MacBook Air with Ubuntu 20.04) I only have the most essential software installed, which for me doesn’t include an office suite. I could install LibreOffice, but why make it easy if I can also do it the hard way? 😀
I do have Evince installed, which is a lightweight PDF viewer. To convert ODT to PDF I’m using Pandoc, which is a Swiss army knife for converting document formats. For PDF it needs the help of LaTeX, a document preparation system for typesetting.
First I installed the required software:
$ sudo apt install pandoc texlive texlive-latex-extra
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
libapache-pom-java libcommons-logging-java libcommons-parent-java libfontbox-java libpdfbox-java preview-latex-style texlive-base texlive-binaries
texlive-fonts-recommended texlive-latex-base texlive-latex-recommended texlive-pictures texlive-plain-generic tipa
Suggested packages:
libavalon-framework-java libcommons-logging-java-doc libexcalibur-logkit-java liblog4j1.2-java texlive-xetex texlive-luatex pandoc-citeproc
context wkhtmltopdf librsvg2-bin groff ghc php python r-base-core libjs-mathjax node-katex perl-tk xzdec texlive-fonts-recommended-doc
texlive-latex-base-doc python3-pygments icc-profiles libfile-which-perl libspreadsheet-parseexcel-perl texlive-latex-extra-doc
texlive-latex-recommended-doc texlive-pstricks dot2tex prerex ruby-tcltk | libtcltk-ruby texlive-pictures-doc vprerex
The following NEW packages will be installed:
libapache-pom-java libcommons-logging-java libcommons-parent-java libfontbox-java libpdfbox-java pandoc preview-latex-style texlive texlive-base
texlive-binaries texlive-fonts-recommended texlive-latex-base texlive-latex-extra texlive-latex-recommended texlive-pictures texlive-plain-generic
tipa
0 upgraded, 17 newly installed, 0 to remove and 1 not upgraded.
Need to get 116 MB of archives.
After this operation, 448 MB of additional disk space will be used.
Do you want to continue? [Y/n]
Just to compare: installing LibreOffice Writer would actually use less disk space (283 MB, versus 448 MB for Pandoc and TeX Live). Pandoc is a lot faster, though.
$ sudo apt install libreoffice-writer
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
libabw-0.1-1 libboost-date-time1.71.0 libboost-filesystem1.71.0 libboost-iostreams1.71.0 libboost-locale1.71.0 libclucene-contribs1v5
libclucene-core1v5 libcmis-0.5-5v5 libe-book-0.1-1 libeot0 libepubgen-0.1-1 libetonyek-0.1-1 libexttextcat-2.0-0 libexttextcat-data libgpgmepp6
libjuh-java libjurt-java liblangtag-common liblangtag1 libmhash2 libmwaw-0.3-3 libmythes-1.2-0 libneon27-gnutls libodfgen-0.1-1 liborcus-0.15-0
libraptor2-0 librasqal3 librdf0 libreoffice-base-core libreoffice-common libreoffice-core libreoffice-math libreoffice-style-colibre
libreoffice-style-tango librevenge-0.0-0 libridl-java libuno-cppu3 libuno-cppuhelpergcc3-3 libuno-purpenvhelpergcc3-3 libuno-sal3
libuno-salhelpergcc3-3 libunoloader-java libwpd-0.10-10 libwpg-0.3-3 libwps-0.4-4 libxmlsec1 libxmlsec1-nss libyajl2 python3-uno uno-libs-private
ure
Suggested packages:
raptor2-utils rasqal-utils librdf-storage-postgresql librdf-storage-mysql librdf-storage-sqlite librdf-storage-virtuoso redland-utils
libreoffice-base gstreamer1.0-plugins-bad tango-icon-theme fonts-crosextra-caladea fonts-crosextra-carlito libreoffice-java-common
The following NEW packages will be installed:
libabw-0.1-1 libboost-date-time1.71.0 libboost-filesystem1.71.0 libboost-iostreams1.71.0 libboost-locale1.71.0 libclucene-contribs1v5
libclucene-core1v5 libcmis-0.5-5v5 libe-book-0.1-1 libeot0 libepubgen-0.1-1 libetonyek-0.1-1 libexttextcat-2.0-0 libexttextcat-data libgpgmepp6
libjuh-java libjurt-java liblangtag-common liblangtag1 libmhash2 libmwaw-0.3-3 libmythes-1.2-0 libneon27-gnutls libodfgen-0.1-1 liborcus-0.15-0
libraptor2-0 librasqal3 librdf0 libreoffice-base-core libreoffice-common libreoffice-core libreoffice-math libreoffice-style-colibre
libreoffice-style-tango libreoffice-writer librevenge-0.0-0 libridl-java libuno-cppu3 libuno-cppuhelpergcc3-3 libuno-purpenvhelpergcc3-3
libuno-sal3 libuno-salhelpergcc3-3 libunoloader-java libwpd-0.10-10 libwpg-0.3-3 libwps-0.4-4 libxmlsec1 libxmlsec1-nss libyajl2 python3-uno
uno-libs-private ure
0 upgraded, 52 newly installed, 0 to remove and 1 not upgraded.
Need to get 78,5 MB of archives.
After this operation, 283 MB of additional disk space will be used.
Do you want to continue? [Y/n] n
Abort.
Next, converting the file. It’s possible to tell Pandoc which file formats to use with the -f (from) and -t (to) switches, but it can usually guess correctly based on the file extensions.
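If the guess ever goes wrong, the formats can be spelled out. This is a minimal sketch, assuming the same file name as in this post; the `src` and `dst` variable names are only for illustration. With a `.pdf` output file, `-t latex` tells Pandoc to go through LaTeX, which is also its default route to PDF.

```shell
# Input file from this post; the output name is derived by swapping the extension.
src="2022-06-house-contract-adendum.odt"
dst="${src%.odt}.pdf"

# -f odt   : read the input as OpenDocument Text
# -t latex : use LaTeX as the intermediate format; the .pdf extension
#            on the output file makes Pandoc produce a PDF
# Only run the conversion when pandoc and the input file are actually present.
if command -v pandoc >/dev/null 2>&1 && [ -f "$src" ]; then
  pandoc -f odt -t latex -o "$dst" "$src"
fi
echo "$dst"
```

Being explicit mostly helps when a file has a missing or misleading extension.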
$ time pandoc 2022-06-house-contract-adendum.odt -o 2022-06-house-contract-adendum.pdf
real 0m0,519s
user 0m0,475s
sys 0m0,059s
It took only half a second to convert the file. Opening LibreOffice takes a bit more time on this old laptop.
You can see the PDF document properties with pdfinfo:
$ pdfinfo 2022-06-house-contract-adendum.pdf
Title:
Subject:
Keywords:
Author:
Creator: LaTeX with hyperref
Producer: pdfTeX-1.40.20
CreationDate: Sat Jun 11 23:32:30 2022 CEST
ModDate: Sat Jun 11 23:32:30 2022 CEST
Tagged: no
UserProperties: no
Suspects: no
Form: none
JavaScript: no
Pages: 2
Encrypted: no
Page size: 612 x 792 pts (letter)
Page rot: 0
File size: 64904 bytes
Optimized: no
PDF version: 1.5
I don’t want it in US Letter format; I want A4:
$ time pandoc -V papersize:a4 -o 2022-06-house-contract-adendum.pdf 2022-06-house-contract-adendum.odt
real 0m0,520s
user 0m0,469s
sys 0m0,060s
$ pdfinfo 2022-06-house-contract-adendum.pdf
Title:
Subject:
Keywords:
Author:
Creator: LaTeX with hyperref
Producer: pdfTeX-1.40.20
CreationDate: Sat Jun 11 23:40:16 2022 CEST
ModDate: Sat Jun 11 23:40:16 2022 CEST
Tagged: no
UserProperties: no
Suspects: no
Form: none
JavaScript: no
Pages: 2
Encrypted: no
Page size: 595.276 x 841.89 pts (A4)
Page rot: 0
File size: 64935 bytes
Optimized: no
PDF version: 1.5
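pdfinfo reports page sizes in points (1/72 inch). As a quick sanity check, the two page sizes reported above can be converted to millimetres with awk. The `pt_to_mm` helper name is made up for this sketch.

```shell
# Convert a width and height in points to millimetres.
# 1 pt = 1/72 inch and 1 inch = 25.4 mm, so 1 pt = 25.4/72 mm.
# LC_ALL=C keeps the decimal point a dot, whatever the system locale is.
pt_to_mm() {
  LC_ALL=C awk -v w="$1" -v h="$2" \
    'BEGIN { printf "%.1f x %.1f mm\n", w * 25.4 / 72, h * 25.4 / 72 }'
}

pt_to_mm 612 792          # first conversion: 215.9 x 279.4 mm (US Letter)
pt_to_mm 595.276 841.89   # second conversion: 210.0 x 297.0 mm (A4)
```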
Then I could open the file with evince 2022-06-house-contract-adendum.pdf.
When Ubuntu 22.04 LTS (Jammy Jellyfish) was released, I wanted to upgrade my system from Ubuntu 21.10 (Impish Indri). But I had one critical requirement:
Do not replace my 5.13 kernel series!
This was primarily for compatibility reasons with specific drivers and tools I rely on. See also my other post about my ridiculous amount of kernels.
This post documents the steps I took to successfully upgrade the OS while keeping my old kernel intact.
🧹 Step 1: Clean Up Old Configuration Files Before the Upgrade
Before starting the upgrade, I removed some APT configuration files that could conflict with the upgrade process:
Now it was time for the main event. I initiated the upgrade with:
sudo do-release-upgrade
The release upgrader went through its usual routine — calculating changes, checking dependencies, and showing what would be removed or upgraded.
3 installed packages are no longer supported by Canonical.
22 packages will be removed, 385 new packages installed, and 3005 packages upgraded.
Download: ~5.2 GB
Estimated time: 17 mins @ 40 Mbit/s or over 2 hours @ 5 Mbit/s.
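The two time estimates can be cross-checked with a little arithmetic. This is a rough sketch, assuming decimal units and a download of roughly 5200 MB, which is the size consistent with both quoted estimates. `download_time` is a hypothetical helper.

```shell
# download_time SIZE_MB SPEED_MBIT: print the transfer time in minutes.
# 1 MB = 8 Mbit, so seconds = SIZE_MB * 8 / SPEED_MBIT.
download_time() {
  LC_ALL=C awk -v mb="$1" -v mbit="$2" \
    'BEGIN { printf "%.1f min\n", mb * 8 / mbit / 60 }'
}

download_time 5200 40   # about 17 minutes
download_time 5200 5    # about 139 minutes, so a bit over 2 hours
```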
😱 Step 3: Wait, It Wants to Remove What?!
Among the packages marked for removal:
hardlink
fuse
Many linux-5.13.* kernel packages
Tools like grub-customizer and older versions of Python
🔍 Investigating hardlink
I use hardlink regularly, so I double-checked its availability.
No need to worry — it is still available in Ubuntu 22.04! It moved from its own package to util-linux. 👉 manpages.ubuntu.com (hardlink)
So no problem there.
✅ Saving fuse
I aborted the upgrade and manually installed fuse to mark it as manually installed:
sudo apt install fuse
Then I restarted the upgrade.
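Installing an already-installed package is one way to flip it to "manually installed". `apt-mark manual` sets the same mark more directly, and `apt-mark showmanual` lists what is currently marked. A small sketch of checking the result follows. The `is_manual` helper is hypothetical, and the apt-mark call only runs where that command exists.

```shell
# is_manual PKG: read package names on stdin (one per line) and
# succeed only if PKG is on the list as an exact, whole-line match.
is_manual() {
  grep -qxF -- "$1"
}

if command -v apt-mark >/dev/null 2>&1; then
  # To set the mark explicitly (needs root): sudo apt-mark manual fuse
  if apt-mark showmanual | is_manual fuse; then
    echo "fuse is marked as manually installed"
  else
    echo "fuse is not marked as manually installed"
  fi
fi
```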
🛠 Step 4: Keep the 5.13 Kernel
To keep using my current kernel version, I re-added the Impish repo after the upgrade but before rebooting.
awk '($1$3$4=="debjammymain"){$3="impish" ;print}' /etc/apt/sources.list \
| sudo tee /etc/apt/sources.list.d/impish.list
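To see what that awk one-liner selects, it can be run against a throwaway file first. The sample lines below are invented for illustration and are not a real sources.list. Only lines whose first, third and fourth fields are exactly `deb`, `jammy` and `main` match; each match is printed with `jammy` replaced by `impish`.

```shell
# Hypothetical stand-in for /etc/apt/sources.list after the upgrade.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# a comment line
deb http://archive.ubuntu.com/ubuntu jammy main restricted
deb http://archive.ubuntu.com/ubuntu jammy-updates main restricted
deb-src http://archive.ubuntu.com/ubuntu jammy main restricted
deb http://archive.ubuntu.com/ubuntu jammy universe
EOF

# Same awk program as above, without the sudo tee step.
impish_lines=$(awk '($1$3$4=="debjammymain"){$3="impish"; print}' "$sample")
echo "$impish_lines"
rm -f "$sample"
```

With this sample only the second line qualifies, so the generated impish.list would hold a single `deb ... impish main restricted` entry.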
Then I updated the package lists and reinstalled the kernel packages I wanted to keep:
Because curiosity killed the cat, not because it’s useful! 😀
Start with a clean install in a virtual machine
I start with a simple Vagrantfile:
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/jammy64"
  config.vm.provision "ansible" do |ansible|
    ansible.playbook = "playbook.yml"
  end
end
This Ansible playbook updates all packages to the latest version and removes unused packages.
- name: Update all packages to the latest version
  hosts: all
  remote_user: ubuntu
  become: yes
  tasks:
    # Refresh the package index unless it is less than an hour old
    - name: Update apt cache
      apt:
        update_cache: yes
        cache_valid_time: 3600
        force_apt_get: yes
    - name: Upgrade all apt packages
      apt:
        force_apt_get: yes
        upgrade: dist
    # The flag file exists when an installed update needs a restart
    - name: Check if a reboot is needed for Ubuntu boxes
      stat:
        path: /var/run/reboot-required
      register: reboot_required_file
    - name: Reboot the Ubuntu box
      reboot:
        msg: "Reboot initiated by Ansible due to kernel updates"
        connect_timeout: 5
        reboot_timeout: 300
        pre_reboot_delay: 0
        post_reboot_delay: 30
        test_command: uptime
      when: reboot_required_file.stat.exists
    # Clean up dependencies that are no longer needed, including their config files
    - name: Remove unused packages
      apt:
        autoremove: yes
        purge: yes
        force_apt_get: yes
Then bring up the virtual machine with vagrant up --provision.
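The reboot check in the playbook relies on the flag file /var/run/reboot-required, which Ubuntu packages typically create when an installed update needs a restart. The same test can be sketched in plain shell, for instance to check by hand inside the box. `reboot_needed` is a hypothetical helper name, and the optional argument exists only so the function can be tried against another path.

```shell
# reboot_needed [FLAG_FILE]: succeed when the reboot flag file exists.
# Defaults to the path the playbook checks.
reboot_needed() {
  [ -e "${1:-/var/run/reboot-required}" ]
}

if reboot_needed; then
  echo "reboot required"
else
  echo "no reboot required"
fi
```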
Get the installation size
I ssh into the box (vagrant ssh) and run a couple of commands to get some numbers.