Skip to content

bash

connecting flash memory to laptop

Benchmarking USB Drives with Shell Scripts – Part 2: Evolving the Script with ChatGPT

Introduction

In my previous post, I shared the story of why I needed a new USB stick and how I used ChatGPT to write a benchmark script that could measure read performance across various methods. In this follow-up, I will dive into the technical details of how the script evolved—from a basic prototype into a robust and feature-rich tool—thanks to incremental refinements and some AI-assisted development.


Starting Simple: The First Version

The initial idea was simple: read a file using dd and measure the speed.

dd if=/media/amedee/Ventoy/ISO/ubuntu-24.10-desktop-amd64.iso \
   of=/dev/null bs=8k

That worked, but I quickly ran into limitations:

  • No progress indicator
  • Hardcoded file paths
  • No USB auto-detection
  • No cache flushing, leading to inflated results when repeating the measurement

With ChatGPT’s help, I started addressing each of these issues one by one.


Tools check

On a default Ubuntu installation, some tools are available by default, while others (especially benchmarking tools) usually need to be installed separately.

Tools used in the script:

ToolInstalled by default?Needs require?
hdparm❌ Not installed✅ Yes
dd✅ Yes❌ No
pv❌ Not installed✅ Yes
cat✅ Yes❌ No
ioping❌ Not installed✅ Yes
fio❌ Not installed✅ Yes
lsblk✅ Yes (in util-linux)❌ No
awk✅ Yes (in gawk)❌ No
grep✅ Yes❌ No
basename✅ Yes (in coreutils)❌ No
find✅ Yes❌ No
sort✅ Yes❌ No
stat✅ Yes❌ No

This function ensures the system has all tools needed for benchmarking. It exits early if any tool is missing.

This was the initial version:

check_required_tools() {
  local required_tools=(dd pv hdparm fio ioping awk grep sed tr bc stat lsblk find sort)
  for tool in "${required_tools[@]}"; do
    if ! command -v "$tool" &>/dev/null; then
      echo "❌ Required tool '$tool' is not installed."
      exit 1
    fi
  done
}

That’s already nice, but maybe I just want to run the script anyway if some of the tools are missing.

This is a more advanced version:

ALL_TOOLS=(hdparm dd pv ioping fio lsblk stat grep awk find sort basename column gnuplot)
MISSING_TOOLS=()

require() {
  if ! command -v "$1" >/dev/null; then
    return 1
  fi
  return 0
}

check_required_tools() {
  echo "🔍 Checking required tools..."
  for tool in "${ALL_TOOLS[@]}"; do
    if ! require "$tool"; then
      MISSING_TOOLS+=("$tool")
    fi
  done

  if [[ ${#MISSING_TOOLS[@]} -gt 0 ]]; then
    echo "⚠️  The following tools are missing: ${MISSING_TOOLS[*]}"
    echo "You can install them using: sudo apt install ${MISSING_TOOLS[*]}"
    if [[ -z "$FORCE_YES" ]]; then
      read -rp "Do you want to continue and skip tests that require them? (y/N): " yn
      case $yn in
        [Yy]*)
          echo "Continuing with limited tests..."
          ;;
        *)
          echo "Aborting. Please install the required tools."
          exit 1
          ;;
      esac
    else
      echo "Continuing with limited tests (auto-confirmed)..."
    fi
  else
    echo "✅ All required tools are available."
  fi
}

Device Auto-Detection

One early challenge was identifying which device was the USB stick. I wanted the script to automatically detect a mounted USB device. My first version was clunky and error-prone.

detect_usb() {
  USB_DEVICE=$(lsblk -o NAME,TRAN,MOUNTPOINT -J | jq -r '.blockdevices[] | select(.tran=="usb") | .name' | head -n1)
  if [[ -z "$USB_DEVICE" ]]; then
    echo "❌ No USB device detected."
    exit 1
  fi
  USB_PATH="/dev/$USB_DEVICE"
  MOUNT_PATH=$(lsblk -no MOUNTPOINT "$USB_PATH" | head -n1)
  if [[ -z "$MOUNT_PATH" ]]; then
    echo "❌ USB device is not mounted."
    exit 1
  fi
  echo "✅ Using USB device: $USB_PATH"
  echo "✅ Mounted at: $MOUNT_PATH"
}

After a few iterations, we (ChatGPT and I) settled on parsing lsblk with filters on tran=usb and hotplug=1, and selecting the first mounted partition.

We also added a fallback prompt in case auto-detection failed.

detect_usb() {
  if [[ -n "$USB_DEVICE" ]]; then
    echo "📎 Using provided USB device: $USB_DEVICE"
    MOUNT_PATH=$(lsblk -no MOUNTPOINT "$USB_DEVICE")
    return
  fi

  echo "🔍 Detecting USB device..."
  USB_DEVICE=""
  while read -r dev tran hotplug type _; do
    if [[ "$tran" == "usb" && "$hotplug" == "1" && "$type" == "disk" ]]; then
      base="/dev/$dev"
      part=$(lsblk -nr -o NAME,MOUNTPOINT "$base" | awk '$2 != "" {print "/dev/"$1; exit}')
      if [[ -n "$part" ]]; then
        USB_DEVICE="$part"
        break
      fi
    fi
  done < <(lsblk -o NAME,TRAN,HOTPLUG,TYPE,MOUNTPOINT -nr)

  if [ -z "$USB_DEVICE" ]; then
    echo "❌ No mounted USB partition found on any USB disk."
    lsblk -o NAME,TRAN,HOTPLUG,TYPE,SIZE,MOUNTPOINT -nr | grep part
    read -rp "Enter the USB device path manually (e.g., /dev/sdc1): " USB_DEVICE
  fi

  MOUNT_PATH=$(lsblk -no MOUNTPOINT "$USB_DEVICE")
  if [ -z "$MOUNT_PATH" ]; then
    echo "❌ USB device is not mounted."
    exit 1
  fi

  echo "✅ Using USB device: $USB_DEVICE"
  echo "✅ Mounted at: $MOUNT_PATH"
}

Finding the Test File

To avoid hardcoding filenames, we implemented logic to search for the latest Ubuntu ISO on the USB stick.

find_ubuntu_iso() {
  # Function to find an Ubuntu ISO on the USB device
  find "$MOUNT_PATH" -type f -regextype posix-extended \
    -regex ".*/ubuntu-[0-9]{2}\.[0-9]{2}-desktop-amd64\\.iso" | sort -V | tail -n1
}

Later, we enhanced it to accept a user-provided file, and even verify that the file was located on the USB stick. If it was not, the script would gracefully fall back to the Ubuntu ISO search.

find_test_file() {
  if [[ -n "$TEST_FILE" ]]; then
    echo "📎 Using provided test file: $(basename "$TEST_FILE")"
    
    # Check if the provided test file is on the USB device
    TEST_FILE_MOUNT_PATH=$(realpath "$TEST_FILE" | grep -oP "^$MOUNT_PATH")
    if [[ -z "$TEST_FILE_MOUNT_PATH" ]]; then
      echo "❌ The provided test file is not located on the USB device."
      # Look for an Ubuntu ISO if it's not on the USB
      TEST_FILE=$(find_ubuntu_iso)
    fi
  else
    TEST_FILE=$(find_ubuntu_iso)
  fi

  if [ -z "$TEST_FILE" ]; then
    echo "❌ No valid test file found."
    exit 1
  fi

  if [[ "$TEST_FILE" =~ ubuntu-[0-9]{2}\.[0-9]{2}-desktop-amd64\.iso ]]; then
    UBUNTU_VERSION=$(basename "$TEST_FILE" | grep -oP 'ubuntu-\d{2}\.\d{2}')
    echo "🧪 Selected Ubuntu version: $UBUNTU_VERSION"
  else
    echo "📎 Selected test file: $(basename "$TEST_FILE")"
  fi
}

Read Methods and Speed Extraction

To get a comprehensive view, we added multiple methods:

  • hdparm (direct disk access)
  • dd (simple block read)
  • dd + pv (with progress bar)
  • cat + pv (alternative stream reader)
  • ioping (random access)
  • fio (customizable benchmark tool)
    if require hdparm; then
      drop_caches
      speed=$(sudo hdparm -t --direct "$USB_DEVICE" 2>/dev/null | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
    ((idx++))

    drop_caches
    speed=$(dd if="$TEST_FILE" of=/dev/null bs=8k 2>&1 |& extract_speed)
    mb=$(speed_to_mb "$speed")
    echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
    TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
    echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    ((idx++))

    if require pv; then
      drop_caches
      FILESIZE=$(stat -c%s "$TEST_FILE")
      speed=$(dd if="$TEST_FILE" bs=8k status=none | pv -s "$FILESIZE" -f -X 2>&1 | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
    ((idx++))

    if require pv; then
      drop_caches
      speed=$(cat "$TEST_FILE" | pv -f -X 2>&1 | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
    ((idx++))

    if require ioping; then
      drop_caches
      speed=$(ioping -c 10 -A "$USB_DEVICE" 2>/dev/null | grep 'read' | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
    ((idx++))

    if require fio; then
      drop_caches
      speed=$(fio --name=readtest --filename="$TEST_FILE" --direct=1 --rw=read --bs=8k \
            --size=100M --ioengine=libaio --iodepth=16 --runtime=5s --time_based --readonly \
            --minimal 2>/dev/null | awk -F';' '{print $6" KB/s"}' | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi

Parsing their outputs proved tricky. For example, pv outputs speed with or without spaces, and with different units. We created a robust extract_speed function with regex, and a speed_to_mb function that could handle both MB/s and MiB/s, with or without a space between value and unit.

extract_speed() {
  grep -oP '(?i)[\d.,]+\s*[KMG]i?B/s' | tail -1 | sed 's/,/./'
}

speed_to_mb() {
  if [[ "$1" =~ ([0-9.,]+)[[:space:]]*([a-zA-Z/]+) ]]; then
    value="${BASH_REMATCH[1]}"
    unit=$(echo "${BASH_REMATCH[2]}" | tr '[:upper:]' '[:lower:]')
  else
    echo "0"
    return
  fi

  case "$unit" in
    kb/s)   awk -v v="$value" 'BEGIN { printf "%.2f", v / 1000 }' ;;
    mb/s)   awk -v v="$value" 'BEGIN { printf "%.2f", v }' ;;
    gb/s)   awk -v v="$value" 'BEGIN { printf "%.2f", v * 1000 }' ;;
    kib/s)  awk -v v="$value" 'BEGIN { printf "%.2f", v / 1024 }' ;;
    mib/s)  awk -v v="$value" 'BEGIN { printf "%.2f", v }' ;;
    gib/s)  awk -v v="$value" 'BEGIN { printf "%.2f", v * 1024 }' ;;
    *) echo "0" ;;
  esac
}

Dropping Caches for Accurate Results

To prevent cached reads from skewing the results, each test run begins by dropping system caches using:

sync && echo 3 | sudo tee /proc/sys/vm/drop_caches

What it does:

CommandPurpose
syncFlushes all dirty (pending write) pages to disk
echo 3 > /proc/sys/vm/drop_cachesClears page cache, dentries, and inodes from RAM

We wrapped this in a helper function and used it consistently.


Multiple Runs and Averaging

We made the script repeat each test N times (default: 3), collect results, compute averages, and display a summary at the end.

  echo "📊 Read-only USB benchmark started ($RUNS run(s))"
  echo "==================================="

  declare -A TEST_NAMES=(
    [1]="hdparm"
    [2]="dd"
    [3]="dd + pv"
    [4]="cat + pv"
    [5]="ioping"
    [6]="fio"
  )

  declare -A TOTAL_MB
  for i in {1..6}; do TOTAL_MB[$i]=0; done
  CSVFILE="usb-benchmark-$(date +%Y%m%d-%H%M%S).csv"
  echo "Test,Run,Speed (MB/s)" > "$CSVFILE"

  for ((run=1; run<=RUNS; run++)); do
    echo "▶ Run $run"
    idx=1

  ### tests run here

  echo "📄 Summary of average results for $UBUNTU_VERSION:"
  echo "==================================="
  SUMMARY_TABLE=""
  for i in {1..6}; do
    if [[ ${TOTAL_MB[$i]} != 0 ]]; then
      avg=$(echo "scale=2; ${TOTAL_MB[$i]} / $RUNS" | bc)
      echo "${TEST_NAMES[$i]} average: $avg MB/s"
      RESULTS+=("${TEST_NAMES[$i]} average: $avg MB/s")
      SUMMARY_TABLE+="${TEST_NAMES[$i]},$avg\n"
    fi
  done

Output Formats

To make the results user-friendly, we added:

  • A clean table view
  • CSV export for spreadsheets
  • Log file for later reference
  if [[ "$VISUAL" == "table" || "$VISUAL" == "both" ]]; then
    echo -e "📋 Table view:"
    echo -e "Test Method,Average MB/s\n$SUMMARY_TABLE" | column -t -s ','
  fi

  if [[ "$VISUAL" == "bar" || "$VISUAL" == "both" ]]; then
    if require gnuplot; then
      echo -e "$SUMMARY_TABLE" | awk -F',' '{print $1" "$2}' | \
      gnuplot -p -e "
        set terminal dumb;
        set title 'USB Read Benchmark Results ($UBUNTU_VERSION)';
        set xlabel 'Test Method';
        set ylabel 'MB/s';
        plot '-' using 2:xtic(1) with boxes notitle
      "
    fi
  fi

  LOGFILE="usb-benchmark-$(date +%Y%m%d-%H%M%S).log"
  {
    echo "Benchmark for USB device: $USB_DEVICE"
    echo "Mounted at: $MOUNT_PATH"
    echo "Ubuntu version: $UBUNTU_VERSION"
    echo "Test file: $TEST_FILE"
    echo "Timestamp: $(date)"
    echo "Number of runs: $RUNS"
    echo ""
    echo "Read speed averages:"
    for line in "${RESULTS[@]}"; do
      echo "$line"
    done
  } > "$LOGFILE"

  echo "📝 Results saved to: $LOGFILE"
  echo "📈 CSV exported to: $CSVFILE"
  echo "==================================="

The Full Script

Here is the complete version of the script used to benchmark the read performance of a USB drive:

#!/bin/bash

# ==========================
# CONFIGURATION
# ==========================
RESULTS=()
USB_DEVICE=""
TEST_FILE=""
RUNS=1
VISUAL="none"
SUMMARY=0

# (Consider grouping related configuration into a config file or associative array if script expands)

# ==========================
# ARGUMENT PARSING
# ==========================
while [[ $# -gt 0 ]]; do
  case $1 in
    --device)
      USB_DEVICE="$2"
      shift 2
      ;;
    --file)
      TEST_FILE="$2"
      shift 2
      ;;
    --runs)
      RUNS="$2"
      shift 2
      ;;
    --visual)
      VISUAL="$2"
      shift 2
      ;;
    --summary)
      SUMMARY=1
      shift
      ;;
    --yes|--force)
      FORCE_YES=1
      shift
      ;;
    *)
      echo "Unknown option: $1"
      exit 1
      ;;
  esac
done

# ==========================
# TOOL CHECK
# ==========================
ALL_TOOLS=(hdparm dd pv ioping fio lsblk stat grep awk find sort basename column gnuplot)
MISSING_TOOLS=()

require() {
  if ! command -v "$1" >/dev/null; then
    return 1
  fi
  return 0
}

check_required_tools() {
  echo "🔍 Checking required tools..."
  for tool in "${ALL_TOOLS[@]}"; do
    if ! require "$tool"; then
      MISSING_TOOLS+=("$tool")
    fi
  done

  if [[ ${#MISSING_TOOLS[@]} -gt 0 ]]; then
    echo "⚠️  The following tools are missing: ${MISSING_TOOLS[*]}"
    echo "You can install them using: sudo apt install ${MISSING_TOOLS[*]}"
    if [[ -z "$FORCE_YES" ]]; then
      read -rp "Do you want to continue and skip tests that require them? (y/N): " yn
      case $yn in
        [Yy]*)
          echo "Continuing with limited tests..."
          ;;
        *)
          echo "Aborting. Please install the required tools."
          exit 1
          ;;
      esac
    else
      echo "Continuing with limited tests (auto-confirmed)..."
    fi
  else
    echo "✅ All required tools are available."
  fi
}

# ==========================
# AUTO-DETECT USB DEVICE
# ==========================
detect_usb() {
  if [[ -n "$USB_DEVICE" ]]; then
    echo "📎 Using provided USB device: $USB_DEVICE"
    MOUNT_PATH=$(lsblk -no MOUNTPOINT "$USB_DEVICE")
    return
  fi

  echo "🔍 Detecting USB device..."
  USB_DEVICE=""
  while read -r dev tran hotplug type _; do
    if [[ "$tran" == "usb" && "$hotplug" == "1" && "$type" == "disk" ]]; then
      base="/dev/$dev"
      part=$(lsblk -nr -o NAME,MOUNTPOINT "$base" | awk '$2 != "" {print "/dev/"$1; exit}')
      if [[ -n "$part" ]]; then
        USB_DEVICE="$part"
        break
      fi
    fi
  done < <(lsblk -o NAME,TRAN,HOTPLUG,TYPE,MOUNTPOINT -nr)

  if [ -z "$USB_DEVICE" ]; then
    echo "❌ No mounted USB partition found on any USB disk."
    lsblk -o NAME,TRAN,HOTPLUG,TYPE,SIZE,MOUNTPOINT -nr | grep part
    read -rp "Enter the USB device path manually (e.g., /dev/sdc1): " USB_DEVICE
  fi

  MOUNT_PATH=$(lsblk -no MOUNTPOINT "$USB_DEVICE")
  if [ -z "$MOUNT_PATH" ]; then
    echo "❌ USB device is not mounted."
    exit 1
  fi

  echo "✅ Using USB device: $USB_DEVICE"
  echo "✅ Mounted at: $MOUNT_PATH"
}

# ==========================
# FIND TEST FILE
# ==========================
find_ubuntu_iso() {
  # Function to find an Ubuntu ISO on the USB device
  find "$MOUNT_PATH" -type f -regextype posix-extended \
    -regex ".*/ubuntu-[0-9]{2}\.[0-9]{2}-desktop-amd64\\.iso" | sort -V | tail -n1
}

find_test_file() {
  if [[ -n "$TEST_FILE" ]]; then
    echo "📎 Using provided test file: $(basename "$TEST_FILE")"
    
    # Check if the provided test file is on the USB device
    TEST_FILE_MOUNT_PATH=$(realpath "$TEST_FILE" | grep -oP "^$MOUNT_PATH")
    if [[ -z "$TEST_FILE_MOUNT_PATH" ]]; then
      echo "❌ The provided test file is not located on the USB device."
      # Look for an Ubuntu ISO if it's not on the USB
      TEST_FILE=$(find_ubuntu_iso)
    fi
  else
    TEST_FILE=$(find_ubuntu_iso)
  fi

  if [ -z "$TEST_FILE" ]; then
    echo "❌ No valid test file found."
    exit 1
  fi

  if [[ "$TEST_FILE" =~ ubuntu-[0-9]{2}\.[0-9]{2}-desktop-amd64\.iso ]]; then
    UBUNTU_VERSION=$(basename "$TEST_FILE" | grep -oP 'ubuntu-\d{2}\.\d{2}')
    echo "🧪 Selected Ubuntu version: $UBUNTU_VERSION"
  else
    echo "📎 Selected test file: $(basename "$TEST_FILE")"
  fi
}



# ==========================
# SPEED EXTRACTION
# ==========================
extract_speed() {
  grep -oP '(?i)[\d.,]+\s*[KMG]i?B/s' | tail -1 | sed 's/,/./'
}

speed_to_mb() {
  if [[ "$1" =~ ([0-9.,]+)[[:space:]]*([a-zA-Z/]+) ]]; then
    value="${BASH_REMATCH[1]}"
    unit=$(echo "${BASH_REMATCH[2]}" | tr '[:upper:]' '[:lower:]')
  else
    echo "0"
    return
  fi

  case "$unit" in
    kb/s)   awk -v v="$value" 'BEGIN { printf "%.2f", v / 1000 }' ;;
    mb/s)   awk -v v="$value" 'BEGIN { printf "%.2f", v }' ;;
    gb/s)   awk -v v="$value" 'BEGIN { printf "%.2f", v * 1000 }' ;;
    kib/s)  awk -v v="$value" 'BEGIN { printf "%.2f", v / 1024 }' ;;
    mib/s)  awk -v v="$value" 'BEGIN { printf "%.2f", v }' ;;
    gib/s)  awk -v v="$value" 'BEGIN { printf "%.2f", v * 1024 }' ;;
    *) echo "0" ;;
  esac
}

drop_caches() {
  echo "🧹 Dropping system caches..."
  if [[ $EUID -ne 0 ]]; then
    echo "  (requires sudo)"
  fi
  sudo sh -c "sync && echo 3 > /proc/sys/vm/drop_caches"
}

# ==========================
# RUN BENCHMARKS
# ==========================
run_benchmarks() {
  echo "📊 Read-only USB benchmark started ($RUNS run(s))"
  echo "==================================="

  declare -A TEST_NAMES=(
    [1]="hdparm"
    [2]="dd"
    [3]="dd + pv"
    [4]="cat + pv"
    [5]="ioping"
    [6]="fio"
  )

  declare -A TOTAL_MB
  for i in {1..6}; do TOTAL_MB[$i]=0; done
  CSVFILE="usb-benchmark-$(date +%Y%m%d-%H%M%S).csv"
  echo "Test,Run,Speed (MB/s)" > "$CSVFILE"

  for ((run=1; run<=RUNS; run++)); do
    echo "▶ Run $run"
    idx=1

    if require hdparm; then
      drop_caches
      speed=$(sudo hdparm -t --direct "$USB_DEVICE" 2>/dev/null | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
    ((idx++))

    drop_caches
    speed=$(dd if="$TEST_FILE" of=/dev/null bs=8k 2>&1 |& extract_speed)
    mb=$(speed_to_mb "$speed")
    echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
    TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
    echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    ((idx++))

    if require pv; then
      drop_caches
      FILESIZE=$(stat -c%s "$TEST_FILE")
      speed=$(dd if="$TEST_FILE" bs=8k status=none | pv -s "$FILESIZE" -f -X 2>&1 | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
    ((idx++))

    if require pv; then
      drop_caches
      speed=$(cat "$TEST_FILE" | pv -f -X 2>&1 | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
    ((idx++))

    if require ioping; then
      drop_caches
      speed=$(ioping -c 10 -A "$USB_DEVICE" 2>/dev/null | grep 'read' | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
    ((idx++))

    if require fio; then
      drop_caches
      speed=$(fio --name=readtest --filename="$TEST_FILE" --direct=1 --rw=read --bs=8k \
            --size=100M --ioengine=libaio --iodepth=16 --runtime=5s --time_based --readonly \
            --minimal 2>/dev/null | awk -F';' '{print $6" KB/s"}' | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
  done

  echo "📄 Summary of average results for $UBUNTU_VERSION:"
  echo "==================================="
  SUMMARY_TABLE=""
  for i in {1..6}; do
    if [[ ${TOTAL_MB[$i]} != 0 ]]; then
      avg=$(echo "scale=2; ${TOTAL_MB[$i]} / $RUNS" | bc)
      echo "${TEST_NAMES[$i]} average: $avg MB/s"
      RESULTS+=("${TEST_NAMES[$i]} average: $avg MB/s")
      SUMMARY_TABLE+="${TEST_NAMES[$i]},$avg\n"
    fi
  done

  if [[ "$VISUAL" == "table" || "$VISUAL" == "both" ]]; then
    echo -e "📋 Table view:"
    echo -e "Test Method,Average MB/s\n$SUMMARY_TABLE" | column -t -s ','
  fi

  if [[ "$VISUAL" == "bar" || "$VISUAL" == "both" ]]; then
    if require gnuplot; then
      echo -e "$SUMMARY_TABLE" | awk -F',' '{print $1" "$2}' | \
      gnuplot -p -e "
        set terminal dumb;
        set title 'USB Read Benchmark Results ($UBUNTU_VERSION)';
        set xlabel 'Test Method';
        set ylabel 'MB/s';
        plot '-' using 2:xtic(1) with boxes notitle
      "
    fi
  fi

  LOGFILE="usb-benchmark-$(date +%Y%m%d-%H%M%S).log"
  {
    echo "Benchmark for USB device: $USB_DEVICE"
    echo "Mounted at: $MOUNT_PATH"
    echo "Ubuntu version: $UBUNTU_VERSION"
    echo "Test file: $TEST_FILE"
    echo "Timestamp: $(date)"
    echo "Number of runs: $RUNS"
    echo ""
    echo "Read speed averages:"
    for line in "${RESULTS[@]}"; do
      echo "$line"
    done
  } > "$LOGFILE"

  echo "📝 Results saved to: $LOGFILE"
  echo "📈 CSV exported to: $CSVFILE"
  echo "==================================="
}

# ==========================
# MAIN
# ==========================
check_required_tools
detect_usb
find_test_file
run_benchmarks

You van also find the latest revision of this script as a GitHub Gist.


Lessons Learned

This script has grown from a simple one-liner into a reliable tool to test USB read performance. Working with ChatGPT sped up development significantly, especially for bash edge cases and regex. But more importantly, it helped guide the evolution of the script in a structured way, with clean modular functions and consistent formatting.


Conclusion

This has been a fun and educational project. Whether you are benchmarking your own USB drives or just want to learn more about shell scripting, I hope this walkthrough is helpful.

Next up? Maybe a graphical version, or write benchmarking on a RAM disk to avoid damaging flash storage.

Stay tuned—and let me know if you use this script or improve it!

connecting flash memory to laptop

Benchmarking USB Drives with Shell Scripts – Part 1: Why I Built a Benchmark Script

Introduction

When I upgraded from an old 8GB USB stick to a shiny new 256GB one, I expected faster speeds and more convenience—especially for carrying around multiple bootable ISO files using Ventoy. With modern Linux distributions often exceeding 4GB per ISO, my old drive could barely hold a single image. But I quickly realized that storage space was only half the story—performance matters too.

Curious about how much of an upgrade I had actually made, I decided to benchmark the read speed of both USB sticks. Instead of hunting down benchmarking tools or manually comparing outputs, I turned to ChatGPT to help me craft a reliable, repeatable shell script that could automate the entire process. In this post, I’ll share how ChatGPT helped me go from an idea to a functional USB benchmark script, and what I learned along the way.


The Goal

I wanted to answer a few simple but important questions:

  • How much faster is my new USB stick compared to the old one?
  • Do different USB ports affect read speeds?
  • How can I automate these tests and compare the results?

But I also wanted a reusable script that would:

  • Detect the USB device automatically
  • Find or use a test file on the USB stick
  • Run several types of read benchmarks
  • Present the results clearly, with support for summary and CSV export

Getting Help from ChatGPT

I asked ChatGPT to help me write a shell script with these requirements. It guided me through:

  • Choosing benchmarking tools: hdparm, dd, pv, ioping, fio
  • Auto-detecting the mounted USB device
  • Handling different cases for user-provided test files or Ubuntu ISOs
  • Parsing and converting human-readable speed outputs
  • Displaying results in human-friendly tables and optional CSV export

We iterated over the script, addressing edge cases like:

  • USB devices not mounted
  • Multiple USB partitions
  • pv not showing output unless stderr was correctly handled
  • Formatting output consistently across tools

ChatGPT even helped optimize the code for readability, reduce duplication, and handle both space-separated and non-space-separated speed values like “18.6 MB/s” and “18.6MB/s”.


Benchmark Results

With the script ready, I ran tests on three configurations:

1. Old 8GB USB Stick

hdparm       16.40 MB/s
dd 18.66 MB/s
dd + pv 17.80 MB/s
cat + pv 18.10 MB/s
ioping 4.44 MB/s
fio 93.99 MB/s

2. New 256GB USB Stick (Fast USB Port)

hdparm      372.01 MB/s
dd 327.33 MB/s
dd + pv 310.00 MB/s
cat + pv 347.00 MB/s
ioping 8.58 MB/s
fio 992.78 MB/s

3. New 256GB USB Stick (Slow USB Port)

hdparm       37.60 MB/s
dd 39.86 MB/s
dd + pv 38.13 MB/s
cat + pv 40.30 MB/s
ioping 6.88 MB/s
fio 73.52 MB/s

Observations

  • The old USB stick is not only limited in capacity but also very slow. It barely breaks 20 MB/s in most tests.
  • The new USB stick, when plugged into a fast USB 3.0 port, is significantly faster—over 10x the speed in most benchmarks.
  • Plugging the same new stick into a slower port dramatically reduces its performance—a good reminder to check where you plug it in.
  • Tools like hdparm, dd, and cat + pv give relatively consistent results. However, ioping and fio behave differently due to the way they access data—random access or block size differences can impact results.

Also worth noting: the metal casing of the new USB stick gets warm after a few test runs, unlike the old plastic one.


Conclusion

Using ChatGPT to develop this benchmark script was like pair-programming with an always-available assistant. It accelerated development, helped troubleshoot weird edge cases, and made the script more polished than if I had done it alone.

If you want to test your own USB drives—or ensure you’re using the best port for speed—this benchmark script is a great tool to have in your kit. And if you’re looking to learn shell scripting, pairing with ChatGPT is an excellent way to level up.


Want the script?
I’ll share the full version of the script and instructions on how to use it in a follow-up post. Stay tuned!

black internal hdd on black surface

How big is a clean install of Ubuntu Jammy Jellyfish (22.04)?

Because curiosity killed the cat, not because it’s useful! 😀

Start with a clean install in a virtual machine

I start with a simple Vagrantfile:

Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/jammy64"
  config.vm.provision "ansible" do |ansible|
    ansible.playbook = "playbook.yml"
  end
end

This Ansible playbook updates all packages to the latest version and removes unused packages.

- name: Update all packages to the latest version
  hosts: all
  remote_user: ubuntu
  become: yes

  tasks:

  - name: Update apt cache
    apt:
      update_cache: yes
      cache_valid_time: 3600
      force_apt_get: yes

  - name: Upgrade all apt packages
    apt:
      force_apt_get: yes
      upgrade: dist

  - name: Check if a reboot is needed for Ubuntu boxes
    register: reboot_required_file
    stat: path=/var/run/reboot-required get_md5=no

  - name: Reboot the Ubuntu box
    reboot:
      msg: "Reboot initiated by Ansible due to kernel updates"
      connect_timeout: 5
      reboot_timeout: 300
      pre_reboot_delay: 0
      post_reboot_delay: 30
      test_command: uptime
    when: reboot_required_file.stat.exists

  - name: Remove unused packages
    apt:
      autoremove: yes
      purge: yes
      force_apt_get: yes

Then bring up the virtual machine with vagrant up --provision.

Get the installation size

I ssh into the box (vagrant ssh) and run a couple of commands to get some numbers.

Number of installed packages:

$ dpkg-query --show | wc --lines
592

Size of the installed packages:

$ dpkg-query --show --showformat '${Installed-size}\n' | awk '{s+=$1*1024} END {print s}' | numfmt --to=iec-i --format='%.2fB'
1.14GiB

I need to multiply the package size with 1024 because dpkg-query outputs size in kilobytes.

Total size:

$ sudo du --summarize --human-readable --one-file-system /
1.9G	/

Get the installation size using Ansible

Of course, I can also add this to my Ansible playbook, and then I don’t have to ssh into the virtual machine.

  - name: Get the number of installed packages
    shell: dpkg-query --show | wc --lines
    register: package_count
    changed_when: false
    failed_when: false
  - debug: msg="{{ package_count.stdout }}"

  - name: Get the size of installed packages
    shell: >
      dpkg-query --show --showformat '${Installed-size}\n' 
      | awk '{s+=$1*1024} END {print s}' 
      | numfmt --to=iec-i --format='%.2fB'
    register: package_size
    changed_when: false
    failed_when: false
  - debug: msg="{{ package_size.stdout }}"

  - name: Get the disk size with du
    shell: >
      du --summarize --one-file-system /
      | numfmt --to=iec-i --format='%.2fB'
    register: du_used
    changed_when: false
    failed_when: false
  - debug: msg="{{ du_used.stdout }}"

The output is then:

TASK [Get the number of installed packages] ************************************
ok: [default]

TASK [debug] *******************************************************************
ok: [default] => {
    "msg": "592"
}

TASK [Get the size of installed packages] **************************************
ok: [default]

TASK [debug] *******************************************************************
ok: [default] => {
    "msg": "1.14GiB"
}

TASK [Get the disk size with du] ***********************************************
ok: [default]

TASK [debug] *******************************************************************
ok: [default] => {
    "msg": "1.82MiB /"
}
technology computer desktop programming

screenshots van de framebuffer

Soms moet ne mens al eens iets speciaals doen, zoals het nemen van een screenshot op een toestel dat wel Linux draait, maar geen X. Oink? Volgens StackExchange zou ik fbgrab of fbdump moeten gebruiken, maar dat is in dit concrete geval niet mogelijk because reasons.

In dit concrete geval is er een toepassing die rechtstreeks naar de framebuffer beelden stuurt. Bon, alles is een file onder Linux, dus ik ging eens piepen wat er dan eigenlijk in dat framebuffer device zat:


$ cp /dev/fb0 /tmp/framebuffer.data
$ head -c 64 /tmp/framebuffer.data
kkk�kkk�kkk�kkk�kkk�kkk�kkk�kkk�kkk�kkk�kkk�kkk�kkk�kkk�kkk�kkk�

IEKS!!!
Alhoewel…
Tiens, dat zag er verdacht regelmatig uit, telkens in groepjes van 4 bytes. “k” heeft ASCII waarde 107, of 6B hexadecimaal, en #6B6B6B is een grijstint. Ik had voorlopig nog geen enkel idee wat die “�” betekende, maar ik wist dat ik iets op het spoor was!

Ik heb framebuffer.data dan gekopieerd naar een pc met daarop Gimp. (referentie naar Contact invoegen)

:(){ :|:& };:

Ik daag iedereen uit om de titel van deze post in te typen in Bash.

DISCLAIMER: zorg dat je vooraf alle andere programma’s afgesloten hebt!