
🎻 Spring Tunes: Three Inspiring Music Courses I Attended This Season

This spring was filled with music, learning, and connection. I had the opportunity to participate in three wonderful music courses, each offering something unique—new styles, deeper technique, and a strong sense of community. Here is a look back at these inspiring experiences.


🎶 1. Fiddlers on the Move – Ghent (5–9 March)

Photo: Filip Verpoest

In early March, I joined Fiddlers on the Move in Ghent, a five-day course packed with workshops led by musicians from all over the world. Although I play the nyckelharpa, I deliberately chose workshops that were not nyckelharpa-specific. This gave me the challenge and joy of translating techniques from other string traditions to my instrument.

Here is a glimpse of the week:

  • Wednesday: Fiddle singing with Laura Cortese – singing while playing was new for me, and surprisingly fun.
  • Thursday: Klezmer violin / Fiddlers down the roof with Amit Weisberger – beautiful melodies and ornamentation with plenty of character.
  • Friday: Arabic music with Layth Sidiq – an introduction to maqams and rhythmic patterns that stretched my ears in the best way.
  • Saturday: Swedish violin jamsession classics with Mia Marine – a familiar style, but always a joy with Mia’s energy and musicality.
  • Sunday: Live looping strings with Joris Vanvinckenroye – playful creativity with loops, layering, and rhythm.

Each day brought something different, and I came home with a head full of ideas and melodies to explore further.


🪗 2. Workshopweekend Stichting Draailier & Doedelzak – Sint-Michielsgestel, NL (18–21 April)

Photo: Arne de Laat

In mid-April, I traveled to Sint-Michielsgestel in the Netherlands for the annual Workshopweekend organized by Stichting Draailier & Doedelzak. This year marked the foundation’s 40th anniversary, and the event was extended to four days, from Friday evening to Monday afternoon, at the beautiful location of De Zonnewende.

I joined the nyckelharpa workshop with Rasmus Brinck. One of the central themes we explored was the connection between playing and dancing polska—a topic close to my heart. I consider myself a dancer first and a musician second, so it was especially meaningful to deepen the musical understanding of how movement and melody shape one another.

The weekend offered a rich variety of other workshops as well, including hurdy-gurdy, bagpipes, diatonic accordion, singing, and ensemble playing. As always, the atmosphere was warm and welcoming. With structured workshops during the day and informal jam sessions, concerts, and bals in the evenings, it was a perfect blend of learning and celebration.


🇸🇪 3. Swedish Music for Strings – Ronse (2–4 May)

At the beginning of May, I took part in a three-day course in Ronse dedicated to Swedish string music. Although we could arrive on 1 May, teaching started the next day. The course was led by David Eriksson and organized by Amate Galli. About 20 musicians participated—two violinists, one cellist, and the rest of us on nyckelharpa.

The focus was on capturing the subtle groove and phrasing that make Swedish folk music so distinctive. It was a joy to be surrounded by such a rich soundscape and to play in harmony with others who share the same passion. The music stayed with me long after the course ended.


✨ Final Thoughts

Each of these courses gave me something different: new musical perspectives, renewed technical focus, and most importantly, the joy of making music with others. I am deeply grateful to all the teachers, organizers, and fellow participants who made these experiences so rewarding. I am already looking forward to the next musical adventure!


📚 Automating Ansible Role Documentation with GitHub Actions and AI

Maintaining documentation for Ansible roles can be a tedious and easily neglected task. As roles evolve, variable names change, and new tasks are added, it is easy for the README.md files to fall out of sync. To prevent this and keep documentation continuously up to date, I wrote a GitHub Actions workflow that automatically generates and formats documentation for all Ansible roles in my repository. Even better: it writes its own commit messages using AI.

Let me walk you through why I created this workflow, how it works, and what problems it solves.


🤔 Why Automate Role Documentation?

Ansible roles are modular, reusable components. Good roles include well-structured documentation—at the very least, variable descriptions, usage examples, and explanations of defaults.

However, writing documentation manually introduces several problems:

  • Inconsistency: Humans forget things. Updates to a role do not always get mirrored in its documentation.
  • Wasted time: Writing boilerplate documentation by hand is inefficient.
  • Error-prone: Manually copying variable names and descriptions invites typos and outdated content.

Enter ansible-doctor: a tool that analyzes roles and generates structured documentation automatically. Once I had that, it made perfect sense to automate its execution using GitHub Actions.


⚙️ How the Workflow Works

Here is the high-level overview of what the workflow does:

  1. Triggers:
    • It can be run manually via workflow_dispatch.
    • It is also designed to be reusable in other workflows via workflow_call.
  2. Concurrency and Permissions:
    • Uses concurrency to ensure that only one documentation run per branch is active at a time.
    • Grants minimal permissions needed to write to the repository and generate OIDC tokens.
  3. Steps:
    • Check out the code.
    • 🐍 Set up Python and install ansible-doctor.
    • 📄 Generate documentation with ansible-doctor --recursive roles.
    • 🧼 Format the resulting Markdown using Prettier to ensure consistency.
    • 🤖 Configure Git with a bot identity.
    • 🔍 Detect whether any .md files changed.
    • 🧠 Generate a commit message using AI, powered by OpenRouter.ai and a small open-source model (mistralai/devstral-small:free).
    • 💾 Commit and push the changes if there are any.
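In plain shell terms, the heart of those steps looks roughly like this. It is a condensed sketch, not the literal workflow file; the bot identity shown is the conventional GitHub Actions one, and the AI commit message step is covered in the next section.

# Install the generator and produce the docs (as in the workflow above)
pip install ansible-doctor
ansible-doctor --recursive roles

# Normalize the generated Markdown so diffs stay clean
npx prettier --write "roles/**/*.md"

# Commit as a bot identity, but only if any .md files actually changed
git config user.name "github-actions[bot]"
git config user.email "github-actions[bot]@users.noreply.github.com"
git add -- '*.md'
if ! git diff --cached --quiet; then
  git commit -m "$COMMIT_MESSAGE"   # message generation: see the next section
  git push
fi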

🧠 AI-Generated Commit Messages

Why use AI for commit messages?

  • I want my commits to be meaningful, concise, and nicely formatted.
  • The AI model is given a diff of the staged Markdown changes (up to 3000 characters) and asked to:
    • Keep it under 72 characters.
    • Start with an emoji.
    • Summarize the nature of the documentation update.

This is a small but elegant example of how LLMs can reduce repetitive work and make commits cleaner and more expressive.

Fallbacks are in place: if the AI fails to generate a message, the workflow defaults to a generic 📝 Update Ansible role documentation.
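To make that concrete, here is a hedged sketch of such a step using curl and jq. The endpoint and model come from the workflow described above; the prompt wording and variable names are illustrative.

# Build a prompt from the staged Markdown diff (truncated to 3000 characters)
DIFF=$(git diff --cached -- '*.md' | head -c 3000)

# Ask OpenRouter for a one-line commit message
MESSAGE=$(curl -sf https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d "$(jq -n --arg diff "$DIFF" '{
    model: "mistralai/devstral-small:free",
    messages: [{role: "user", content:
      ("Write a git commit message under 72 characters, starting with an emoji, summarizing this documentation diff:\n\n" + $diff)}]
  }')" | jq -r '.choices[0].message.content // empty')

# Fallback if the AI fails to produce anything useful
[[ -z "$MESSAGE" ]] && MESSAGE="📝 Update Ansible role documentation"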


🌍 A Universal Pattern for Automated Docs

Although this workflow is focused on Ansible, the underlying pattern is not specific to Ansible at all. You can apply the same approach to any programming language or ecosystem that supports documentation generation based on inline annotations, comments, or code structure.

The general steps are:

  1. Write documentation annotations in your code (e.g. JSDoc, Doxygen, Python docstrings, Rust doc comments, etc.).
  2. Run a documentation generator that understands them, such as ansible-doctor, typedoc, doxygen, pdoc, or cargo doc.
  3. Generate a commit message from the diff using an LLM.
  4. Commit and push the updated documentation.

This automation pattern works best in projects where:

  • Documentation is stored in version control.
  • Changes to documentation should be traceable.
  • Developers want to reduce the overhead of writing and committing docs manually.

🔐 A Note on OpenRouter API Keys

The AI step relies on OpenRouter.ai to provide access to language models. To keep your API key secure, it is passed via secrets.OPENROUTER_API_KEY, which is required when calling this workflow. I recommend generating a dedicated, rate-limited key for GitHub Actions use.


🧪 Try It Yourself

If you are working with Ansible roles—or any codebase with structured documentation—and want to keep your docs fresh and AI-assisted, this workflow might be useful for you too. Feel free to copy and adapt it for your own projects. You can find the full source in my GitHub repository.

Let the robots do the boring work, so you can focus on writing better code.


💬 Feedback?

If you have ideas to improve this workflow or want to share your own automation tricks, feel free to leave a comment or reach out on Mastodon: @amedee@lou.lt.

Happy automating!


🤔 “Wasn’t /dev/null Good Enough?” — Understanding the Difference Between /dev/null and /dev/zero

After my last blog post about the gloriously pointless /dev/scream, a few people asked:

“Wasn’t /dev/null good enough?”

Fair question—but it misses a key point.

Let me explain: /dev/null and /dev/zero are not interchangeable. In fact, they are opposites in many ways. And to fully appreciate the joke behind /dev/scream, you need to understand where that scream is coming from—not where it ends up.


🌌 Black Holes and White Holes

To understand the difference, let us borrow a metaphor from cosmology.

  • /dev/null is like a black hole: it swallows everything. You can write data to it, but nothing ever comes out. Not even light. Not even your logs.
  • /dev/zero is like a white hole: it constantly emits data. In this case, an infinite stream of zero bytes (0x00). It produces, but does not accept.

So when I run:

dd if=/dev/zero of=/dev/null

I am pulling data out of the white hole, and sending it straight into the black hole. A perfectly balanced operation of cosmic futility.
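By the way, GNU dd can show you the (non-)progress of this cosmic futility in real time:

dd if=/dev/zero of=/dev/null bs=1M status=progress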


📦 What Are All These /dev/* Devices?

Let us break down the core players:

| Device | Can you write to it? | Can you read from it? | What you read | Commonly used for | Nickname / metaphor |
|---|---|---|---|---|---|
| /dev/null | Yes | Yes | Instantly empty (EOF) | Discarding console output of a program | Black hole 🌑 |
| /dev/zero | Yes | Yes | Endless zeroes (0x00) | Wiping drives, filling files, or allocating memory with known contents | White hole 🌕 |
| /dev/random | No | Yes | Random bytes from the entropy pool | Secure wiping, generating random data | Quantum noise 🎲 |
| /dev/urandom | No | Yes | Pseudo-random bytes (faster, less secure) | Generating random data | Pseudo-random fountain 🔀 |
| /dev/one | Yes | Yes | Endless 0xFF bytes | Wiping drives, filling files, or allocating memory with known contents | The dark mirror of /dev/zero ☠️ |
| /dev/scream | Yes | Yes | aHAAhhaHHAAHaAaAAAA… | Catharsis | Emotional white hole 😱 |

Note: /dev/one is not a standard part of Linux—it comes from a community kernel module, much like /dev/scream.
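The “filling files” use case from the table is the one you are most likely to need in practice. For example, to create a 10 MiB file containing nothing but zeroes:

dd if=/dev/zero of=zeroes.img bs=1M count=10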


🗣️ Back to the Screaming

/dev/scream is a parody of /dev/zero—not /dev/null.

The point of /dev/scream was not to discard data. That is what /dev/null is for.

The point was to generate data, like /dev/zero or /dev/random, but instead of silent zeroes or cryptographic entropy, it gives you something more cathartic: an endless, chaotic scream.

aHAAhhaHHAAHaAaAAAAhhHhhAAaAAAhAaaAAAaHHAHhAaaaaAaHahAaAHaAAHaaHhAHhHaHaAaHAAHaAhhaHaAaAA

So when I wrote:

dd if=/dev/scream of=/dev/null

I was screaming into the void. The scream came from the custom device, and /dev/null politely absorbed it without complaint. Not a single bit screamed back. Like pulling screams out of a white hole and throwing them into a black hole. The ultimate cosmic catharsis.


🧪 Try Them Yourself

Want to experience the universe of /dev for yourself? Try these commands (press Ctrl+C to stop the never-ending ones; /dev/null returns EOF immediately):

# Silent, empty. Nothing comes out.
cat /dev/null

# Zero bytes forever. Very chill.
hexdump -C /dev/zero

# Random bytes from real entropy (may block).
hexdump -C /dev/random

# Random bytes, fast but less secure.
hexdump -C /dev/urandom

# If you have the /dev/one module:
hexdump -C /dev/one

# If you installed /dev/scream:
cat /dev/scream

💡 TL;DR

  • /dev/null = Black hole: absorbs, never emits.
  • /dev/zero = White hole: emits zeroes, absorbs nothing.
  • /dev/random / /dev/urandom = Entropy sources: useful for cryptography.
  • /dev/one = Evil twin of /dev/zero: gives endless 0xFF bytes.
  • /dev/scream = Chaotic white hole: emits pure emotional entropy.

So no, /dev/null was not “good enough”—it was not the right tool. The original post was not about where the data goes (of=/dev/null), but where it comes from (if=/dev/scream), just like /dev/zero. And when it comes from /dev/scream, you are tapping into something truly primal.

Because sometimes, in Linux as in life, you just need to scream into the void.


🐧 Falling Down the /dev Rabbit Hole: From Secure Deletion to /dev/scream

It started innocently enough. I was reading a thread about secure file deletion on Linux—a topic that has popped up in discussions for decades. You know the kind: “Is shred still reliable? Should I overwrite with random data or zeroes? What about SSDs and wear leveling?”

As I followed the thread, I came across a mention of /dev/zero, the classic Unix device that outputs an endless stream of null bytes (0x00). It is often used in scripts and system maintenance tasks like wiping partitions or creating empty files.

That led me to wonder: if there is /dev/zero, is there a /dev/one?

Turns out, not in the standard kernel—but someone did write a kernel module to simulate it. It outputs a continuous stream of 0xFF, which is essentially all bits set to one. It is a fun curiosity with some practical uses in testing or wiping data in a different pattern.
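If you build and load that module, verifying its output is a one-liner (assuming the device node appears as /dev/one):

head -c 16 /dev/one | hexdump -C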

But then came the real gem of the rabbit hole: /dev/scream.

Yes, it is exactly what it sounds like.

What is /dev/scream?

/dev/scream is a Linux kernel module that creates a character device which, when read, outputs a stream of text that mimics a chaotic, high-pitched scream. Think:

aHAAhhaHHAAHaAaAAAAhhHhhAAaAAAhAaaAAAaHHAHhAaaaaAaHahAaAHaAAHaaHhAHhHaHaAaHAAHaAhhaHaAaAA

It is completely useless… and completely delightful.

Originally written by @matlink, the module is a humorous take on the Unix philosophy: “Everything is a file”—even your existential dread. It turns your terminal into a primal outlet. Just run:

cat /dev/scream

And enjoy the textual equivalent of a scream into the void.

Why?

Why not?

Sometimes the joy of Linux is not about solving problems, but about exploring the weird and wonderful corners of its ecosystem. From /dev/null swallowing your output silently, to /dev/urandom serving up chaos, to /dev/scream venting it—all of these illustrate the creativity of the open source world.

Sure, shred and secure deletion are important. But so is remembering that your system is a playground.

Try it Yourself

If you want to give /dev/scream a go, here is how to install it:

⚠️ Warning

This is a custom kernel module. It is not dangerous, but do not run it on production systems unless you know what you are doing.

Build and Load the Module

git clone https://github.com/matlink/dev_scream.git
cd dev_scream
make build
sudo make install
sudo make load    # or load the module manually: sudo insmod dev_scream.ko

Now read from the device:

cat /dev/scream

Or, if you are feeling truly poetic, try screaming into the void:

dd if=/dev/scream of=/dev/null

In space, nobody can hear you scream… but on Linux, /dev/scream is loud and clear—even if you pipe it straight into oblivion.

When you are done screaming:

sudo rmmod dev_scream

Final Thoughts

I started with secure deletion, and I ended up installing a kernel module that screams. This is the beauty of curiosity-driven learning in Linux: you never quite know where you will end up. And sometimes, after a long day, maybe all you need is to cat /dev/scream.

Let me know if you tried it—and whether your terminal feels a little lighter afterward.


You can now follow my blog from the Fediverse!

If you are part of the Fediverse—on Mastodon, Pleroma, or any other ActivityPub-compatible platform—you can now follow this blog directly from your favorite platform.

Thanks to the excellent ActivityPub plugin for WordPress, each blog post I publish on amedee.be is now automatically shared in a way that federated social platforms can understand and display.

Follow me from Mastodon

If you are on Mastodon, you can follow this blog just like you would follow another person:

Search for: @amedee.be@amedee.be

Or click this link if your Mastodon instance supports it:
https://amedee.be/@amedee.be

New blog posts will appear in your timeline, and you can even reply to them from Mastodon. Your comments will appear as replies on the blog post page—Fediverse and WordPress users interacting seamlessly!

Why I enabled ActivityPub

I have been active on Mastodon for a while as @amedee@lou.lt, and I really enjoy the decentralized, open nature of the Fediverse. It is a refreshing change from the algorithm-driven social media platforms.

Adding ActivityPub support to my blog aligns perfectly with those values: open standards, decentralization, and full control over my own content.

This change was as simple as adding the activitypub plugin to my blog’s Ansible configuration on GitHub:

 blog_wp_plugins_install:
+  - activitypub
   - akismet
   - google-site-kit
   - health-check

Once deployed, GitHub Actions and Ansible took care of the rest.

What this means for you

If you already follow me on Mastodon (@amedee@lou.lt), nothing changes—you will still see the occasional personal post, boost, or comment.

But if you are more interested in my blog content—technical articles, tutorials, and occasional personal reflections—you might prefer following @amedee.be@amedee.be. It is an automated account that only shares blog posts.

This setup lets me keep content separate and organized, while still engaging with the broader Fediverse community.

Want to do the same for your blog?

Setting this up is easy:

  1. Make sure you are running WordPress version 6.4 or later.
  2. Install and activate the ActivityPub plugin.
  3. After activation, your author profile (and optionally, your blog itself) becomes followable via the Fediverse.
  4. Start publishing—and federate your writing with the world!
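If you prefer the command line, WP-CLI can handle steps 2 and 3 in one go (the plugin slug is activitypub, the same one used in my Ansible configuration above):

wp plugin install activitypub --activate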

Automating My Server Management with Ansible and GitHub Actions

Managing multiple servers can be a daunting task, especially when striving for consistency and efficiency. To tackle this challenge, I developed a robust automation system using Ansible, GitHub Actions, and Vagrant. This setup not only streamlines server configuration but also ensures that deployments are repeatable and maintainable.

A Bit of History: How It All Started

This project began out of necessity. I was maintaining a handful of Ubuntu servers — one for email, another for a website, and a few for experiments — and I quickly realized that logging into each one to make manual changes was both tedious and error-prone. My first step toward automation was a collection of shell scripts. They worked, but as the infrastructure grew, they became hard to manage and lacked the modularity I needed.

That is when I discovered Ansible. I created the ansible-servers repository in early 2024 as a way to centralize and standardize my infrastructure automation. Initially, it only contained a basic playbook for setting up users and updating packages. But over time, it evolved to include multiple roles, structured inventories, and eventually CI/CD integration through GitHub Actions.

Every addition was born out of a real-world need. When I got tired of testing changes manually, I added Vagrant to simulate my environments locally. When I wanted to be sure my configurations stayed consistent after every push, I integrated GitHub Actions to automate deployments. When I noticed the repo growing, I introduced linting and security checks to maintain quality.

The repository has grown steadily and organically, each commit reflecting a small lesson learned or a new challenge overcome.

The Foundation: Ansible Playbooks

At the core of my automation strategy are Ansible playbooks, which define the desired state of my servers. These playbooks handle tasks such as installing necessary packages, configuring services, and setting up user accounts. By codifying these configurations, I can apply them consistently across different environments.

To manage these playbooks, I maintain a structured repository that includes:

  • Inventory Files: Located in the inventory directory, these YAML files specify the hosts and groups for deployment targets.
  • Roles: Under the roles directory, I define reusable components that encapsulate specific functionalities, such as setting up a web server or configuring a database.
  • Configuration File: The ansible.cfg file sets important defaults, like enabling fact caching and specifying the inventory path, to optimize Ansible’s behavior.
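As an illustration, a minimal ansible.cfg along those lines could look like this. The exact paths and cache backend are examples, not necessarily my production settings:

[defaults]
inventory = inventory/
gathering = smart
fact_caching = jsonfile
fact_caching_connection = ~/.cache/ansible-facts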

Seamless Deployments with GitHub Actions

To automate the deployment process, I leverage GitHub Actions. This integration allows me to trigger Ansible playbooks automatically upon code changes, ensuring that my servers are always up-to-date with the latest configurations.

One of the key workflows is Deploy to Production, which executes the main playbook against the production inventory. This workflow is defined in the ansible-deploy.yml file and is triggered on specific events, such as pushes to the main branch.

Additionally, I have set up other workflows to maintain code quality and security:

  • Super-Linter: Automatically checks the codebase for syntax errors and adherence to best practices.
  • Codacy Security Scan: Analyzes the code for potential security vulnerabilities.
  • Dependabot Updates: Keeps dependencies up-to-date by automatically creating pull requests for new versions.

Local Testing with Vagrant

Before deploying changes to production, it is crucial to test them in a controlled environment. For this purpose, I use Vagrant to spin up virtual machines that mirror my production servers.

The deploy_to_staging.sh script automates this process by:

  1. Starting the Vagrant environment and provisioning it.
  2. Installing required Ansible roles specified in requirements.yml.
  3. Running the Ansible playbook against the staging inventory.
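In shell terms, the script boils down to something like this sketch (the inventory and playbook file names are illustrative):

vagrant up --provision
ansible-galaxy install -r requirements.yml
ansible-playbook -i inventory/staging.yml site.yml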

This approach allows me to validate changes in a safe environment before applying them to live servers.

Embracing Open Source and Continuous Improvement

Transparency and collaboration are vital in the open-source community. By hosting my automation setup on GitHub, I invite others to review, suggest improvements, and adapt the configurations for their own use cases.

The repository is licensed under the MIT License, encouraging reuse and modification. Moreover, I actively monitor issues and welcome contributions to enhance the system further.


In summary, by combining Ansible, GitHub Actions, and Vagrant, I have created a powerful and flexible automation framework for managing my servers. This setup not only reduces manual effort but also increases reliability and scalability. I encourage others to explore this approach and adapt it to their own infrastructure needs. What began as a few basic scripts has now evolved into a reliable automation pipeline I rely on every day.

If you are managing servers and find yourself repeating the same configuration steps, I invite you to check out the ansible-servers repository on GitHub. Clone it, explore the structure, try it in your own environment — and if you have ideas or improvements, feel free to open a pull request or start a discussion. Automation has made a huge difference for me, and I hope it can do the same for you.



Benchmarking USB Drives with Shell Scripts – Part 2: Evolving the Script with ChatGPT

Introduction

In my previous post, I shared the story of why I needed a new USB stick and how I used ChatGPT to write a benchmark script that could measure read performance across various methods. In this follow-up, I will dive into the technical details of how the script evolved—from a basic prototype into a robust and feature-rich tool—thanks to incremental refinements and some AI-assisted development.


Starting Simple: The First Version

The initial idea was simple: read a file using dd and measure the speed.

dd if=/media/amedee/Ventoy/ISO/ubuntu-24.10-desktop-amd64.iso \
   of=/dev/null bs=8k

That worked, but I quickly ran into limitations:

  • No progress indicator
  • Hardcoded file paths
  • No USB auto-detection
  • No cache flushing, leading to inflated results when repeating the measurement

With ChatGPT’s help, I started addressing each of these issues one by one.


Tools check

On a default Ubuntu installation, some tools are available by default, while others (especially benchmarking tools) usually need to be installed separately.

Tools used in the script:

| Tool | Installed by default? | Needs a require() guard? |
|---|---|---|
| hdparm | ❌ Not installed | ✅ Yes |
| dd | ✅ Yes | ❌ No |
| pv | ❌ Not installed | ✅ Yes |
| cat | ✅ Yes | ❌ No |
| ioping | ❌ Not installed | ✅ Yes |
| fio | ❌ Not installed | ✅ Yes |
| lsblk | ✅ Yes (in util-linux) | ❌ No |
| awk | ✅ Yes (in gawk) | ❌ No |
| grep | ✅ Yes | ❌ No |
| basename | ✅ Yes (in coreutils) | ❌ No |
| find | ✅ Yes | ❌ No |
| sort | ✅ Yes | ❌ No |
| stat | ✅ Yes | ❌ No |

A dedicated function checks that the system has all the tools needed for benchmarking, and exits early if any are missing.

This was the initial version:

check_required_tools() {
  local required_tools=(dd pv hdparm fio ioping awk grep sed tr bc stat lsblk find sort)
  for tool in "${required_tools[@]}"; do
    if ! command -v "$tool" &>/dev/null; then
      echo "❌ Required tool '$tool' is not installed."
      exit 1
    fi
  done
}

That’s already nice, but maybe I just want to run the script anyway if some of the tools are missing.

This is a more advanced version:

ALL_TOOLS=(hdparm dd pv ioping fio lsblk stat grep awk find sort basename column gnuplot)
MISSING_TOOLS=()

require() {
  if ! command -v "$1" >/dev/null; then
    return 1
  fi
  return 0
}

check_required_tools() {
  echo "🔍 Checking required tools..."
  for tool in "${ALL_TOOLS[@]}"; do
    if ! require "$tool"; then
      MISSING_TOOLS+=("$tool")
    fi
  done

  if [[ ${#MISSING_TOOLS[@]} -gt 0 ]]; then
    echo "⚠️  The following tools are missing: ${MISSING_TOOLS[*]}"
    echo "You can install them using: sudo apt install ${MISSING_TOOLS[*]}"
    if [[ -z "$FORCE_YES" ]]; then
      read -rp "Do you want to continue and skip tests that require them? (y/N): " yn
      case $yn in
        [Yy]*)
          echo "Continuing with limited tests..."
          ;;
        *)
          echo "Aborting. Please install the required tools."
          exit 1
          ;;
      esac
    else
      echo "Continuing with limited tests (auto-confirmed)..."
    fi
  else
    echo "✅ All required tools are available."
  fi
}

Device Auto-Detection

One early challenge was identifying which device was the USB stick. I wanted the script to automatically detect a mounted USB device. My first version was clunky and error-prone.

detect_usb() {
  USB_DEVICE=$(lsblk -o NAME,TRAN,MOUNTPOINT -J | jq -r '.blockdevices[] | select(.tran=="usb") | .name' | head -n1)
  if [[ -z "$USB_DEVICE" ]]; then
    echo "❌ No USB device detected."
    exit 1
  fi
  USB_PATH="/dev/$USB_DEVICE"
  MOUNT_PATH=$(lsblk -no MOUNTPOINT "$USB_PATH" | head -n1)
  if [[ -z "$MOUNT_PATH" ]]; then
    echo "❌ USB device is not mounted."
    exit 1
  fi
  echo "✅ Using USB device: $USB_PATH"
  echo "✅ Mounted at: $MOUNT_PATH"
}

After a few iterations, we (ChatGPT and I) settled on parsing lsblk with filters on tran=usb and hotplug=1, and selecting the first mounted partition.

We also added a fallback prompt in case auto-detection failed.

detect_usb() {
  if [[ -n "$USB_DEVICE" ]]; then
    echo "📎 Using provided USB device: $USB_DEVICE"
    MOUNT_PATH=$(lsblk -no MOUNTPOINT "$USB_DEVICE")
    return
  fi

  echo "🔍 Detecting USB device..."
  USB_DEVICE=""
  while read -r dev tran hotplug type _; do
    if [[ "$tran" == "usb" && "$hotplug" == "1" && "$type" == "disk" ]]; then
      base="/dev/$dev"
      part=$(lsblk -nr -o NAME,MOUNTPOINT "$base" | awk '$2 != "" {print "/dev/"$1; exit}')
      if [[ -n "$part" ]]; then
        USB_DEVICE="$part"
        break
      fi
    fi
  done < <(lsblk -o NAME,TRAN,HOTPLUG,TYPE,MOUNTPOINT -nr)

  if [ -z "$USB_DEVICE" ]; then
    echo "❌ No mounted USB partition found on any USB disk."
    lsblk -o NAME,TRAN,HOTPLUG,TYPE,SIZE,MOUNTPOINT -nr | grep part
    read -rp "Enter the USB device path manually (e.g., /dev/sdc1): " USB_DEVICE
  fi

  MOUNT_PATH=$(lsblk -no MOUNTPOINT "$USB_DEVICE")
  if [ -z "$MOUNT_PATH" ]; then
    echo "❌ USB device is not mounted."
    exit 1
  fi

  echo "✅ Using USB device: $USB_DEVICE"
  echo "✅ Mounted at: $MOUNT_PATH"
}

Finding the Test File

To avoid hardcoding filenames, we implemented logic to search for the latest Ubuntu ISO on the USB stick.

find_ubuntu_iso() {
  # Function to find an Ubuntu ISO on the USB device
  find "$MOUNT_PATH" -type f -regextype posix-extended \
    -regex ".*/ubuntu-[0-9]{2}\.[0-9]{2}-desktop-amd64\\.iso" | sort -V | tail -n1
}

Later, we enhanced it to accept a user-provided file, and even verify that the file was located on the USB stick. If it was not, the script would gracefully fall back to the Ubuntu ISO search.

find_test_file() {
  if [[ -n "$TEST_FILE" ]]; then
    echo "📎 Using provided test file: $(basename "$TEST_FILE")"
    
    # Check if the provided test file is on the USB device
    TEST_FILE_MOUNT_PATH=$(realpath "$TEST_FILE" | grep -oP "^$MOUNT_PATH")
    if [[ -z "$TEST_FILE_MOUNT_PATH" ]]; then
      echo "❌ The provided test file is not located on the USB device."
      # Look for an Ubuntu ISO if it's not on the USB
      TEST_FILE=$(find_ubuntu_iso)
    fi
  else
    TEST_FILE=$(find_ubuntu_iso)
  fi

  if [ -z "$TEST_FILE" ]; then
    echo "❌ No valid test file found."
    exit 1
  fi

  if [[ "$TEST_FILE" =~ ubuntu-[0-9]{2}\.[0-9]{2}-desktop-amd64\.iso ]]; then
    UBUNTU_VERSION=$(basename "$TEST_FILE" | grep -oP 'ubuntu-\d{2}\.\d{2}')
    echo "🧪 Selected Ubuntu version: $UBUNTU_VERSION"
  else
    echo "📎 Selected test file: $(basename "$TEST_FILE")"
  fi
}

Read Methods and Speed Extraction

To get a comprehensive view, we added multiple methods:

  • hdparm (direct disk access)
  • dd (simple block read)
  • dd + pv (with progress bar)
  • cat + pv (alternative stream reader)
  • ioping (random access)
  • fio (customizable benchmark tool)

Most of the methods are wrapped in a require guard, so a missing tool simply skips its test. Here is the corresponding section of the benchmark loop:
    if require hdparm; then
      drop_caches
      speed=$(sudo hdparm -t --direct "$USB_DEVICE" 2>/dev/null | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
    ((idx++))

    drop_caches
    speed=$(dd if="$TEST_FILE" of=/dev/null bs=8k 2>&1 | extract_speed)
    mb=$(speed_to_mb "$speed")
    echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
    TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
    echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    ((idx++))

    if require pv; then
      drop_caches
      FILESIZE=$(stat -c%s "$TEST_FILE")
      speed=$(dd if="$TEST_FILE" bs=8k status=none | pv -s "$FILESIZE" -f -X 2>&1 | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
    ((idx++))

    if require pv; then
      drop_caches
      speed=$(cat "$TEST_FILE" | pv -f -X 2>&1 | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
    ((idx++))

    if require ioping; then
      drop_caches
      speed=$(ioping -c 10 -A "$USB_DEVICE" 2>/dev/null | grep 'read' | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
    ((idx++))

    if require fio; then
      drop_caches
      speed=$(fio --name=readtest --filename="$TEST_FILE" --direct=1 --rw=read --bs=8k \
            --size=100M --ioengine=libaio --iodepth=16 --runtime=5s --time_based --readonly \
            --minimal 2>/dev/null | awk -F';' '{print $6" KB/s"}' | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi

Parsing their outputs proved tricky. For example, pv outputs speed with or without spaces, and with different units. We created a robust extract_speed function with regex, and a speed_to_mb function that could handle both MB/s and MiB/s, with or without a space between value and unit.

extract_speed() {
  grep -oP '(?i)[\d.,]+\s*[KMG]i?B/s' | tail -1 | sed 's/,/./'
}

speed_to_mb() {
  if [[ "$1" =~ ([0-9.,]+)[[:space:]]*([a-zA-Z/]+) ]]; then
    value="${BASH_REMATCH[1]}"
    unit=$(echo "${BASH_REMATCH[2]}" | tr '[:upper:]' '[:lower:]')
  else
    echo "0"
    return
  fi

  case "$unit" in
    kb/s)   awk -v v="$value" 'BEGIN { printf "%.2f", v / 1000 }' ;;
    mb/s)   awk -v v="$value" 'BEGIN { printf "%.2f", v }' ;;
    gb/s)   awk -v v="$value" 'BEGIN { printf "%.2f", v * 1000 }' ;;
    kib/s)  awk -v v="$value" 'BEGIN { printf "%.2f", v / 1024 }' ;;
    mib/s)  awk -v v="$value" 'BEGIN { printf "%.2f", v }' ;;
    gib/s)  awk -v v="$value" 'BEGIN { printf "%.2f", v * 1024 }' ;;
    *) echo "0" ;;
  esac
}

Dropping Caches for Accurate Results

To prevent cached reads from skewing the results, each test run begins by dropping system caches using:

sync && echo 3 | sudo tee /proc/sys/vm/drop_caches

What it does:

| Command | Purpose |
|---|---|
| sync | Flushes all dirty (pending write) pages to disk |
| echo 3 > /proc/sys/vm/drop_caches | Clears the page cache, dentries, and inodes from RAM |

We wrapped this in a helper function and used it consistently.
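For reference, this is the helper, exactly as it appears in the full script below:

drop_caches() {
  echo "🧹 Dropping system caches..."
  if [[ $EUID -ne 0 ]]; then
    echo "  (requires sudo)"
  fi
  sudo sh -c "sync && echo 3 > /proc/sys/vm/drop_caches"
}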


Multiple Runs and Averaging

We made the script repeat each test N times (default: 3), collect results, compute averages, and display a summary at the end.

  echo "📊 Read-only USB benchmark started ($RUNS run(s))"
  echo "==================================="

  declare -A TEST_NAMES=(
    [1]="hdparm"
    [2]="dd"
    [3]="dd + pv"
    [4]="cat + pv"
    [5]="ioping"
    [6]="fio"
  )

  declare -A TOTAL_MB
  for i in {1..6}; do TOTAL_MB[$i]=0; done
  CSVFILE="usb-benchmark-$(date +%Y%m%d-%H%M%S).csv"
  echo "Test,Run,Speed (MB/s)" > "$CSVFILE"

  for ((run=1; run<=RUNS; run++)); do
    echo "▶ Run $run"
    idx=1

  ### tests run here (shown above; see the full script for the complete loop)
  done

  echo "📄 Summary of average results for $UBUNTU_VERSION:"
  echo "==================================="
  SUMMARY_TABLE=""
  for i in {1..6}; do
    if [[ ${TOTAL_MB[$i]} != 0 ]]; then
      avg=$(echo "scale=2; ${TOTAL_MB[$i]} / $RUNS" | bc)
      echo "${TEST_NAMES[$i]} average: $avg MB/s"
      RESULTS+=("${TEST_NAMES[$i]} average: $avg MB/s")
      SUMMARY_TABLE+="${TEST_NAMES[$i]},$avg\n"
    fi
  done

Output Formats

To make the results user-friendly, we added:

  • A clean table view
  • CSV export for spreadsheets
  • Log file for later reference

Here is how the table view, the bar chart, and the log file are produced at the end of run_benchmarks:
  if [[ "$VISUAL" == "table" || "$VISUAL" == "both" ]]; then
    echo -e "📋 Table view:"
    echo -e "Test Method,Average MB/s\n$SUMMARY_TABLE" | column -t -s ','
  fi

  if [[ "$VISUAL" == "bar" || "$VISUAL" == "both" ]]; then
    if require gnuplot; then
      echo -e "$SUMMARY_TABLE" | awk -F',' '{print $1" "$2}' | \
      gnuplot -p -e "
        set terminal dumb;
        set title 'USB Read Benchmark Results ($UBUNTU_VERSION)';
        set xlabel 'Test Method';
        set ylabel 'MB/s';
        plot '-' using 2:xtic(1) with boxes notitle
      "
    fi
  fi

  LOGFILE="usb-benchmark-$(date +%Y%m%d-%H%M%S).log"
  {
    echo "Benchmark for USB device: $USB_DEVICE"
    echo "Mounted at: $MOUNT_PATH"
    echo "Ubuntu version: $UBUNTU_VERSION"
    echo "Test file: $TEST_FILE"
    echo "Timestamp: $(date)"
    echo "Number of runs: $RUNS"
    echo ""
    echo "Read speed averages:"
    for line in "${RESULTS[@]}"; do
      echo "$line"
    done
  } > "$LOGFILE"

  echo "📝 Results saved to: $LOGFILE"
  echo "📈 CSV exported to: $CSVFILE"
  echo "==================================="

The Full Script

Here is the complete version of the script used to benchmark the read performance of a USB drive:

#!/bin/bash

# ==========================
# CONFIGURATION
# ==========================
RESULTS=()
USB_DEVICE=""
TEST_FILE=""
RUNS=1
VISUAL="none"
SUMMARY=0

# (Consider grouping related configuration into a config file or associative array if script expands)

# ==========================
# ARGUMENT PARSING
# ==========================
while [[ $# -gt 0 ]]; do
  case $1 in
    --device)
      USB_DEVICE="$2"
      shift 2
      ;;
    --file)
      TEST_FILE="$2"
      shift 2
      ;;
    --runs)
      RUNS="$2"
      shift 2
      ;;
    --visual)
      VISUAL="$2"
      shift 2
      ;;
    --summary)
      SUMMARY=1
      shift
      ;;
    --yes|--force)
      FORCE_YES=1
      shift
      ;;
    *)
      echo "Unknown option: $1"
      exit 1
      ;;
  esac
done

# ==========================
# TOOL CHECK
# ==========================
ALL_TOOLS=(hdparm dd pv ioping fio lsblk stat grep awk find sort basename column gnuplot)
MISSING_TOOLS=()

require() {
  if ! command -v "$1" >/dev/null; then
    return 1
  fi
  return 0
}

check_required_tools() {
  echo "🔍 Checking required tools..."
  for tool in "${ALL_TOOLS[@]}"; do
    if ! require "$tool"; then
      MISSING_TOOLS+=("$tool")
    fi
  done

  if [[ ${#MISSING_TOOLS[@]} -gt 0 ]]; then
    echo "⚠️  The following tools are missing: ${MISSING_TOOLS[*]}"
    echo "You can install them using: sudo apt install ${MISSING_TOOLS[*]}"
    if [[ -z "$FORCE_YES" ]]; then
      read -rp "Do you want to continue and skip tests that require them? (y/N): " yn
      case $yn in
        [Yy]*)
          echo "Continuing with limited tests..."
          ;;
        *)
          echo "Aborting. Please install the required tools."
          exit 1
          ;;
      esac
    else
      echo "Continuing with limited tests (auto-confirmed)..."
    fi
  else
    echo "✅ All required tools are available."
  fi
}

# ==========================
# AUTO-DETECT USB DEVICE
# ==========================
detect_usb() {
  if [[ -n "$USB_DEVICE" ]]; then
    echo "📎 Using provided USB device: $USB_DEVICE"
    MOUNT_PATH=$(lsblk -no MOUNTPOINT "$USB_DEVICE")
    return
  fi

  echo "🔍 Detecting USB device..."
  USB_DEVICE=""
  while read -r dev tran hotplug type _; do
    if [[ "$tran" == "usb" && "$hotplug" == "1" && "$type" == "disk" ]]; then
      base="/dev/$dev"
      part=$(lsblk -nr -o NAME,MOUNTPOINT "$base" | awk '$2 != "" {print "/dev/"$1; exit}')
      if [[ -n "$part" ]]; then
        USB_DEVICE="$part"
        break
      fi
    fi
  done < <(lsblk -o NAME,TRAN,HOTPLUG,TYPE,MOUNTPOINT -nr)

  if [ -z "$USB_DEVICE" ]; then
    echo "❌ No mounted USB partition found on any USB disk."
    lsblk -o NAME,TRAN,HOTPLUG,TYPE,SIZE,MOUNTPOINT -nr | grep part
    read -rp "Enter the USB device path manually (e.g., /dev/sdc1): " USB_DEVICE
  fi

  MOUNT_PATH=$(lsblk -no MOUNTPOINT "$USB_DEVICE")
  if [ -z "$MOUNT_PATH" ]; then
    echo "❌ USB device is not mounted."
    exit 1
  fi

  echo "✅ Using USB device: $USB_DEVICE"
  echo "✅ Mounted at: $MOUNT_PATH"
}

# ==========================
# FIND TEST FILE
# ==========================
find_ubuntu_iso() {
  # Function to find an Ubuntu ISO on the USB device
  find "$MOUNT_PATH" -type f -regextype posix-extended \
    -regex ".*/ubuntu-[0-9]{2}\.[0-9]{2}-desktop-amd64\\.iso" | sort -V | tail -n1
}

find_test_file() {
  if [[ -n "$TEST_FILE" ]]; then
    echo "📎 Using provided test file: $(basename "$TEST_FILE")"
    
    # Check if the provided test file is on the USB device
    TEST_FILE_MOUNT_PATH=$(realpath "$TEST_FILE" | grep -oP "^$MOUNT_PATH")
    if [[ -z "$TEST_FILE_MOUNT_PATH" ]]; then
      echo "❌ The provided test file is not located on the USB device."
      # Look for an Ubuntu ISO if it's not on the USB
      TEST_FILE=$(find_ubuntu_iso)
    fi
  else
    TEST_FILE=$(find_ubuntu_iso)
  fi

  if [ -z "$TEST_FILE" ]; then
    echo "❌ No valid test file found."
    exit 1
  fi

  if [[ "$TEST_FILE" =~ ubuntu-[0-9]{2}\.[0-9]{2}-desktop-amd64\.iso ]]; then
    UBUNTU_VERSION=$(basename "$TEST_FILE" | grep -oP 'ubuntu-\d{2}\.\d{2}')
    echo "🧪 Selected Ubuntu version: $UBUNTU_VERSION"
  else
    echo "📎 Selected test file: $(basename "$TEST_FILE")"
  fi
}



# ==========================
# SPEED EXTRACTION
# ==========================
extract_speed() {
  grep -oP '(?i)[\d.,]+\s*[KMG]i?B/s' | tail -1 | sed 's/,/./'
}

speed_to_mb() {
  if [[ "$1" =~ ([0-9.,]+)[[:space:]]*([a-zA-Z/]+) ]]; then
    value="${BASH_REMATCH[1]}"
    unit=$(echo "${BASH_REMATCH[2]}" | tr '[:upper:]' '[:lower:]')
  else
    echo "0"
    return
  fi

  case "$unit" in
    kb/s)   awk -v v="$value" 'BEGIN { printf "%.2f", v / 1000 }' ;;
    mb/s)   awk -v v="$value" 'BEGIN { printf "%.2f", v }' ;;
    gb/s)   awk -v v="$value" 'BEGIN { printf "%.2f", v * 1000 }' ;;
    kib/s)  awk -v v="$value" 'BEGIN { printf "%.2f", v / 1024 }' ;;
    mib/s)  awk -v v="$value" 'BEGIN { printf "%.2f", v }' ;;
    gib/s)  awk -v v="$value" 'BEGIN { printf "%.2f", v * 1024 }' ;;
    *) echo "0" ;;
  esac
}

drop_caches() {
  echo "🧹 Dropping system caches..."
  if [[ $EUID -ne 0 ]]; then
    echo "  (requires sudo)"
  fi
  sudo sh -c "sync && echo 3 > /proc/sys/vm/drop_caches"
}

# ==========================
# RUN BENCHMARKS
# ==========================
run_benchmarks() {
  echo "📊 Read-only USB benchmark started ($RUNS run(s))"
  echo "==================================="

  declare -A TEST_NAMES=(
    [1]="hdparm"
    [2]="dd"
    [3]="dd + pv"
    [4]="cat + pv"
    [5]="ioping"
    [6]="fio"
  )

  declare -A TOTAL_MB
  for i in {1..6}; do TOTAL_MB[$i]=0; done
  CSVFILE="usb-benchmark-$(date +%Y%m%d-%H%M%S).csv"
  echo "Test,Run,Speed (MB/s)" > "$CSVFILE"

  for ((run=1; run<=RUNS; run++)); do
    echo "▶ Run $run"
    idx=1

    if require hdparm; then
      drop_caches
      speed=$(sudo hdparm -t --direct "$USB_DEVICE" 2>/dev/null | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
    ((idx++))

    drop_caches
    speed=$(dd if="$TEST_FILE" of=/dev/null bs=8k 2>&1 | extract_speed)
    mb=$(speed_to_mb "$speed")
    echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
    TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
    echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    ((idx++))

    if require pv; then
      drop_caches
      FILESIZE=$(stat -c%s "$TEST_FILE")
      speed=$(dd if="$TEST_FILE" bs=8k status=none | pv -s "$FILESIZE" -f -X 2>&1 | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
    ((idx++))

    if require pv; then
      drop_caches
      speed=$(cat "$TEST_FILE" | pv -f -X 2>&1 | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
    ((idx++))

    if require ioping; then
      drop_caches
      speed=$(ioping -c 10 -A "$USB_DEVICE" 2>/dev/null | grep 'read' | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
    ((idx++))

    if require fio; then
      drop_caches
      speed=$(fio --name=readtest --filename="$TEST_FILE" --direct=1 --rw=read --bs=8k \
            --size=100M --ioengine=libaio --iodepth=16 --runtime=5s --time_based --readonly \
            --minimal 2>/dev/null | awk -F';' '{print $6" KB/s"}' | extract_speed)
      mb=$(speed_to_mb "$speed")
      echo "${idx}. ${TEST_NAMES[$idx]}: $speed"
      TOTAL_MB[$idx]=$(echo "${TOTAL_MB[$idx]} + $mb" | bc)
      echo "${TEST_NAMES[$idx]},$run,$mb" >> "$CSVFILE"
    fi
  done

  echo "📄 Summary of average results for $UBUNTU_VERSION:"
  echo "==================================="
  SUMMARY_TABLE=""
  for i in {1..6}; do
    if [[ ${TOTAL_MB[$i]} != 0 ]]; then
      avg=$(echo "scale=2; ${TOTAL_MB[$i]} / $RUNS" | bc)
      echo "${TEST_NAMES[$i]} average: $avg MB/s"
      RESULTS+=("${TEST_NAMES[$i]} average: $avg MB/s")
      SUMMARY_TABLE+="${TEST_NAMES[$i]},$avg\n"
    fi
  done

  if [[ "$VISUAL" == "table" || "$VISUAL" == "both" ]]; then
    echo -e "📋 Table view:"
    echo -e "Test Method,Average MB/s\n$SUMMARY_TABLE" | column -t -s ','
  fi

  if [[ "$VISUAL" == "bar" || "$VISUAL" == "both" ]]; then
    if require gnuplot; then
      echo -e "$SUMMARY_TABLE" | awk -F',' '{print $1" "$2}' | \
      gnuplot -p -e "
        set terminal dumb;
        set title 'USB Read Benchmark Results ($UBUNTU_VERSION)';
        set xlabel 'Test Method';
        set ylabel 'MB/s';
        plot '-' using 2:xtic(1) with boxes notitle
      "
    fi
  fi

  LOGFILE="usb-benchmark-$(date +%Y%m%d-%H%M%S).log"
  {
    echo "Benchmark for USB device: $USB_DEVICE"
    echo "Mounted at: $MOUNT_PATH"
    echo "Ubuntu version: $UBUNTU_VERSION"
    echo "Test file: $TEST_FILE"
    echo "Timestamp: $(date)"
    echo "Number of runs: $RUNS"
    echo ""
    echo "Read speed averages:"
    for line in "${RESULTS[@]}"; do
      echo "$line"
    done
  } > "$LOGFILE"

  echo "📝 Results saved to: $LOGFILE"
  echo "📈 CSV exported to: $CSVFILE"
  echo "==================================="
}

# ==========================
# MAIN
# ==========================
check_required_tools
detect_usb
find_test_file
run_benchmarks

You can also find the latest revision of this script as a GitHub Gist.
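To try it, save the script (for example as usb-benchmark.sh), make it executable, and pass any of the options from the argument parser above:

chmod +x usb-benchmark.sh

# Auto-detect the USB stick, run every test 3 times, show table and bar chart
./usb-benchmark.sh --runs 3 --visual both --yes

# Or point it at a specific device and test file
./usb-benchmark.sh --device /dev/sdc1 --file /media/amedee/Ventoy/ISO/ubuntu-24.10-desktop-amd64.iso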


Lessons Learned

This script has grown from a simple one-liner into a reliable tool to test USB read performance. Working with ChatGPT sped up development significantly, especially for bash edge cases and regex. But more importantly, it helped guide the evolution of the script in a structured way, with clean modular functions and consistent formatting.


Conclusion

This has been a fun and educational project. Whether you are benchmarking your own USB drives or just want to learn more about shell scripting, I hope this walkthrough is helpful.

Next up? Maybe a graphical version, or write benchmarking on a RAM disk to avoid damaging flash storage.

Stay tuned—and let me know if you use this script or improve it!


Benchmarking USB Drives with Shell Scripts – Part 1: Why I Built a Benchmark Script

Introduction

When I upgraded from an old 8GB USB stick to a shiny new 256GB one, I expected faster speeds and more convenience—especially for carrying around multiple bootable ISO files using Ventoy. With modern Linux distributions often exceeding 4GB per ISO, my old drive could barely hold a single image. But I quickly realized that storage space was only half the story—performance matters too.

Curious about how much of an upgrade I had actually made, I decided to benchmark the read speed of both USB sticks. Instead of hunting down benchmarking tools or manually comparing outputs, I turned to ChatGPT to help me craft a reliable, repeatable shell script that could automate the entire process. In this post, I’ll share how ChatGPT helped me go from an idea to a functional USB benchmark script, and what I learned along the way.


The Goal

I wanted to answer a few simple but important questions:

  • How much faster is my new USB stick compared to the old one?
  • Do different USB ports affect read speeds?
  • How can I automate these tests and compare the results?

But I also wanted a reusable script that would:

  • Detect the USB device automatically
  • Find or use a test file on the USB stick
  • Run several types of read benchmarks
  • Present the results clearly, with support for summary and CSV export

Getting Help from ChatGPT

I asked ChatGPT to help me write a shell script with these requirements. It guided me through:

  • Choosing benchmarking tools: hdparm, dd, pv, ioping, fio
  • Auto-detecting the mounted USB device
  • Handling different cases for user-provided test files or Ubuntu ISOs
  • Parsing and converting human-readable speed outputs
  • Displaying results in human-friendly tables and optional CSV export

We iterated over the script, addressing edge cases like:

  • USB devices not mounted
  • Multiple USB partitions
  • pv not showing output unless stderr was correctly handled
  • Formatting output consistently across tools

ChatGPT even helped optimize the code for readability, reduce duplication, and handle both space-separated and non-space-separated speed values like “18.6 MB/s” and “18.6MB/s”.


Benchmark Results

With the script ready, I ran tests on three configurations:

1. Old 8GB USB Stick

hdparm     16.40 MB/s
dd         18.66 MB/s
dd + pv    17.80 MB/s
cat + pv   18.10 MB/s
ioping      4.44 MB/s
fio        93.99 MB/s

2. New 256GB USB Stick (Fast USB Port)

hdparm     372.01 MB/s
dd         327.33 MB/s
dd + pv    310.00 MB/s
cat + pv   347.00 MB/s
ioping       8.58 MB/s
fio        992.78 MB/s

3. New 256GB USB Stick (Slow USB Port)

hdparm     37.60 MB/s
dd         39.86 MB/s
dd + pv    38.13 MB/s
cat + pv   40.30 MB/s
ioping      6.88 MB/s
fio        73.52 MB/s

Observations

  • The old USB stick is not only limited in capacity but also very slow. It barely breaks 20 MB/s in most tests.
  • The new USB stick, when plugged into a fast USB 3.0 port, is significantly faster—over 10x the speed in most benchmarks.
  • Plugging the same new stick into a slower port dramatically reduces its performance—a good reminder to check where you plug it in.
  • Tools like hdparm, dd, and cat + pv give relatively consistent results. However, ioping and fio behave differently due to the way they access data—random access or block size differences can impact results.

Also worth noting: the metal casing of the new USB stick gets warm after a few test runs, unlike the old plastic one.


Conclusion

Using ChatGPT to develop this benchmark script was like pair-programming with an always-available assistant. It accelerated development, helped troubleshoot weird edge cases, and made the script more polished than if I had done it alone.

If you want to test your own USB drives—or ensure you’re using the best port for speed—this benchmark script is a great tool to have in your kit. And if you’re looking to learn shell scripting, pairing with ChatGPT is an excellent way to level up.


Want the script?
I’ll share the full version of the script and instructions on how to use it in a follow-up post. Stay tuned!