
2025

Creating 10 000 Random Files & Analyzing Their Size Distribution: Because Why Not? 🧐💾

Ever wondered what it’s like to unleash 10 000 tiny little data beasts on your hard drive? No? Well, buckle up anyway — because today, we’re diving into the curious world of random file generation, and then nerding out by calculating their size distribution. Spoiler alert: it’s less fun than it sounds. 😏

Step 1: Let’s Make Some Files… Lots of Them

Our goal? Generate 10 000 files filled with random data. But not just any random sizes — we want a mean file size of roughly 68 KB and a median of about 2 KB. Sounds like a math puzzle? That’s because it kind of is.

If you just pick file sizes uniformly at random, you’ll end up with a median close to the mean — which is boring. We want a skewed distribution, where most files are small, but some are big enough to bring that average up.

The Magic Trick: Log-normal Distribution 🎩✨

Enter the log-normal distribution, a nifty way to generate lots of small numbers and a few big ones — just like real life. Using Python’s NumPy library, we generate these sizes and feed them to good old /dev/urandom to fill our files with pure randomness.
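
Where do the two parameters in the script below come from? For a log-normal distribution the median is exp(mu) and the mean is exp(mu + sigma²/2), so you can work backwards from the targets. Here is a quick back-of-the-envelope helper (my own sketch, not part of the generator script):

python3 -c "
import math
target_median = 2 * 1024   # ~2 KB
target_mean   = 68 * 1024  # ~68 KB
mu    = math.log(target_median)                      # median = exp(mu)
sigma = math.sqrt(2 * (math.log(target_mean) - mu))  # mean = exp(mu + sigma^2/2)
print(f'mean_log={mu:.2f}  stddev_log={sigma:.2f}')  # prints mean_log=7.62  stddev_log=2.66
"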

Here’s the Bash script that does the heavy lifting:

#!/bin/bash

# Directory to store the random files
output_dir="random_files"
mkdir -p "$output_dir"

# Total number of files to create
file_count=10000

# Log-normal distribution parameters
mean_log=7.62    # ln(2048), which puts the median at ~2 KB
stddev_log=2.66  # heavy right tail, which pulls the mean up to ~68 KB

# Function to generate random numbers based on log-normal distribution
generate_random_size() {
    python3 -c "import numpy as np; print(int(np.random.lognormal($mean_log, $stddev_log)))"
}

# Create files with random data
for i in $(seq 1 $file_count); do
    file_size=$(generate_random_size)
    file_path="$output_dir/file_$i.bin"
    head -c "$file_size" /dev/urandom > "$file_path"
    echo "Generated file $i with size $file_size bytes."
done

echo "Done. Files saved in $output_dir."

Easy enough, right? This creates a directory random_files and fills it with 10 000 files of sizes mostly small but occasionally wildly bigger. Don’t blame me if your disk space takes a little hit! 💥
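
One caveat: spawning a Python interpreter for every single file is slow when you are creating 10 000 of them. A variation I like (same parameters, just my own restructuring, so treat it as a sketch) generates all the sizes in one go and then loops over them in plain Bash:

#!/bin/bash

output_dir="random_files"
mkdir -p "$output_dir"

# Generate all 10 000 sizes with a single Python call...
python3 -c "import numpy as np; print('\n'.join(str(int(x)) for x in np.random.lognormal(7.62, 2.66, 10000)))" > sizes.txt

# ...then create the files without restarting Python each time
i=0
while read -r file_size; do
    i=$((i + 1))
    head -c "$file_size" /dev/urandom > "$output_dir/file_$i.bin"
done < sizes.txt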

Step 2: Crunching Numbers — The File Size Distribution 📊

Okay, you’ve got the files. Now, what can we learn from their sizes? Let’s find out the:

  • Mean size: The average size across all files.
  • Median size: The middle value when sizes are sorted — because averages can lie.
  • Distribution breakdown: How many tiny files vs. giant files.

Here’s a handy Bash script that reads file sizes and spits out these stats with a bit of flair:

#!/bin/bash

# Input directory (default to "random_files" if not provided)
directory="${1:-random_files}"

# Check if directory exists
if [ ! -d "$directory" ]; then
    echo "Directory $directory does not exist."
    exit 1
fi

# Array to store file sizes
file_sizes=($(find "$directory" -type f -exec stat -c%s {} \;))

# Check if there are files in the directory
if [ ${#file_sizes[@]} -eq 0 ]; then
    echo "No files found in the directory $directory."
    exit 1
fi

# Calculate mean
total_size=0
for size in "${file_sizes[@]}"; do
    total_size=$((total_size + size))
done
mean=$((total_size / ${#file_sizes[@]}))

# Calculate median
sorted_sizes=($(printf '%s\n' "${file_sizes[@]}" | sort -n))
mid=$(( ${#sorted_sizes[@]} / 2 ))
if (( ${#sorted_sizes[@]} % 2 == 0 )); then
    median=$(( (sorted_sizes[mid-1] + sorted_sizes[mid]) / 2 ))
else
    median=${sorted_sizes[mid]}
fi

# Display file size distribution
echo "File size distribution in directory $directory:"
echo "---------------------------------------------"
echo "Number of files: ${#file_sizes[@]}"
echo "Mean size: $mean bytes"
echo "Median size: $median bytes"

# Display detailed size distribution (optional)
echo
echo "Detailed distribution (size ranges):"
awk '{
    if ($1 < 1024) bins["< 1 KB"]++;
    else if ($1 < 10240) bins["1 KB - 10 KB"]++;
    else if ($1 < 102400) bins["10 KB - 100 KB"]++;
    else bins[">= 100 KB"]++;
} END {
    for (range in bins) printf "%-15s: %d\n", range, bins[range];
}' <(printf '%s\n' "${file_sizes[@]}")

Run it, and voilà — instant nerd satisfaction.
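
Assuming you saved the script as analyze_sizes.sh (the filename is my choice, pick whatever you like), running it looks like this:

chmod +x analyze_sizes.sh
./analyze_sizes.sh random_files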

Example Output:

File size distribution in directory random_files:
---------------------------------------------
Number of files: 10000
Mean size: 68987 bytes
Median size: 2048 bytes

Detailed distribution (size ranges):
< 1 KB         : 1234
1 KB - 10 KB   : 5678
10 KB - 100 KB : 2890
>= 100 KB      : 198

Why Should You Care? 🤷‍♀️

Besides the obvious geek cred, generating files like this can help:

  • Test backup systems — can they handle weird file size distributions?
  • Stress-test storage or network performance with real-world-like data.
  • Understand your data patterns if you’re building apps that deal with files.

Wrapping Up: Big Files, Small Files, and the Chaos In Between

So there you have it. Ten thousand random files later, and we’ve peeked behind the curtain to understand their size story. It’s a bit like hosting a party and then figuring out who ate how many snacks. 🍿

Try this yourself! Tweak the distribution parameters, generate files, crunch the numbers — and impress your friends with your mad scripting skills. Or at least have a fun weekend project that makes you sound way smarter than you actually are.

Happy hacking! 🔥

How I Tamed Duplicity’s Buggy Versions — and Made Sure I Always Have a Backup 🛡️💾

If you’re running Mail-in-a-Box like me, you might rely on Duplicity to handle backups quietly in the background. It’s a great tool — until it isn’t. Recently, I ran into some frustrating issues caused by buggy Duplicity versions. Here’s the story, a useful discussion from the Mail-in-a-Box forums, and a neat trick I use to keep fallback versions handy. Spoiler: it involves an APT hook and some smart file copying! 🚀


The Problem with Duplicity Versions

Duplicity 3.0.1 and 3.0.5 have been reported to cause backup failures — a real headache when you depend on them to protect your data. The Mail-in-a-Box forum post “Something is wrong with the backup” dives into these issues in great detail. Users reported mysterious backup failures and eventually traced them back to specific Duplicity releases.

Here’s the catch: those problematic versions sometimes sneak in during automatic updates. By the time you realize something’s wrong, you might already have upgraded to a buggy release. 😩


Pinning Problematic Versions with APT Preferences

One way to stop apt from installing those broken versions is to use APT pinning. Here’s an example file I created in /etc/apt/preferences.d/pin_duplicity.pref:

Explanation: Duplicity version 3.0.1* has a bug and should not be installed
Package: duplicity
Pin: version 3.0.1*
Pin-Priority: -1

Explanation: Duplicity version 3.0.5* has a bug and should not be installed
Package: duplicity
Pin: version 3.0.5*
Pin-Priority: -1

This tells apt to refuse to install these specific buggy versions. Sounds great, right? Except — it often comes too late. You could already have updated to a broken version before adding the pin.

Also, since Duplicity is installed from a PPA, older versions vanish quickly as new releases push them out. This makes rolling back to a known good version a pain. 😤


My Solution: Backing Up Known Good Duplicity .deb Files Automatically

To fix this, I created an APT hook that runs after every package operation involving Duplicity. It automatically copies the .deb package files of Duplicity from apt’s archive cache — and even from my local folder if I’m installing manually — into a safe backup folder.

Here’s the script, saved as /usr/local/bin/apt-backup-duplicity.sh:

#!/bin/bash
set -x

mkdir -p /var/backups/debs/duplicity

# Keep any Duplicity packages that apt downloaded to its archive cache
cp -vn /var/cache/apt/archives/duplicity_*.deb /var/backups/debs/duplicity/ 2>/dev/null || true
# Also keep .deb files I downloaded manually into /root
cp -vn /root/duplicity_*.deb /var/backups/debs/duplicity/ 2>/dev/null || true

And here’s the APT hook configuration I put in /etc/apt/apt.conf.d/99backup-duplicity-debs to run this script automatically after DPKG operations:

DPkg::Post-Invoke { "/usr/local/bin/apt-backup-duplicity.sh"; };
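
Don’t forget to make the script executable. A quick way to check that the hook actually fires is to reinstall the package once and look in the backup folder (paths as configured above):

sudo chmod +x /usr/local/bin/apt-backup-duplicity.sh
sudo apt-get install --reinstall duplicity
ls -l /var/backups/debs/duplicity/   # the freshly downloaded .deb should show up here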

Use apt-mark hold to Lock a Working Duplicity Version 🔒

Even with pinning and local .deb backups, there’s one more layer of protection I recommend: freezing a known-good version with apt-mark hold.

Once you’ve confirmed that your current version of Duplicity works reliably, run:

sudo apt-mark hold duplicity

This tells apt not to upgrade Duplicity, even if a newer version becomes available. It’s a great way to avoid accidentally replacing your working setup with something buggy during routine updates.

🧠 Pro Tip: I only unhold and upgrade Duplicity manually after checking the Mail-in-a-Box forum for reports that a newer version is safe.

When you’re ready to upgrade, do this:

sudo apt-mark unhold duplicity
sudo apt update
sudo apt install duplicity

If everything still works fine, you can apt-mark hold it again to freeze the new version.
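
To double-check which packages are currently frozen, you can ask apt-mark for the list:

apt-mark showhold   # should print "duplicity" while the hold is in place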


How to Use Your Backup Versions to Roll Back

If a new Duplicity version breaks your backups, you can easily reinstall a known-good .deb file from your backup folder:

sudo apt install --reinstall /var/backups/debs/duplicity/duplicity_<version>.deb

Replace <version> with the actual filename you want to roll back to. Because you saved the .deb files right after each update, you always have access to older stable versions — even if the PPA has moved on.
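
To see which versions you have stashed away, just list the backup folder. And once you have rolled back, remember to put the hold back so apt does not immediately upgrade you again:

ls -l /var/backups/debs/duplicity/
sudo apt-mark hold duplicity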


Final Thoughts

While pinning bad versions helps, having a local stash of known-good packages is a game changer. Add apt-mark hold on top of that, and you have a rock-solid defense against regressions. 🪨✨

It’s a small extra step but pays off hugely when things go sideways. Plus, it’s totally automated with the APT hook, so you don’t have to remember to save anything manually. 🎉

If you run Mail-in-a-Box or rely on Duplicity in any critical backup workflow, I highly recommend setting up this safety net.

Stay safe and backed up! 🛡️✨

🧱 Let’s Get Hard (Links): Deduplicating My Linux Filesystem with Hadori

File deduplication isn’t just for massive storage arrays or backup systems—it can be a practical tool for personal or server setups too. In this post, I’ll explain how I use hardlinking to reduce disk usage on my Linux system, which directories are safe (and unsafe) to link, why I’m OK with the trade-offs, and how I automated it with a simple monthly cron job using a neat tool called hadori.


🔗 What Is Hardlinking?

In a traditional filesystem, every file has an inode, which is essentially its real identity—the data on disk. A hard link is a different filename that points to the same inode. That means:

  • The file appears to exist in multiple places.
  • But there’s only one actual copy of the data.
  • Deleting one link doesn’t delete the content, unless it’s the last one.

Compare this to a symlink, which is just a pointer to a path. A hardlink is a pointer to the data.

So if you have 10 identical files scattered across the system, you can replace them with hardlinks, and boom—nine of them stop taking up extra space.
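
You can see the mechanics for yourself in a scratch directory. This is just a toy demonstration of hardlink behaviour, unrelated to the dedup setup below:

echo "hello" > a.txt
ln a.txt b.txt       # second name, same inode
ls -li a.txt b.txt   # identical inode numbers, link count 2
rm a.txt             # removes one name, not the data
cat b.txt            # still prints "hello"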


🤔 Why Use Hardlinking?

My servers run a fairly standard Ubuntu install, and like most Linux machines, the root filesystem accumulates a lot of identical binaries and libraries—especially across /bin, /lib, /usr, and /opt.

That’s not a problem… until you’re tight on disk space, or you’re just a curious nerd who enjoys squeezing every last byte.

In my case, I wanted to reduce disk usage safely, without weird side effects.

Hardlinking is a one-time cost with ongoing benefits. It’s not compression. It’s not archival. But it’s efficient and non-invasive.


📁 Which Directories Are Safe to Hardlink?

Hardlinking only works within the same filesystem, and not all directories are good candidates.

✅ Safe directories:

  • /bin, /sbin – system binaries
  • /lib, /lib64 – shared libraries
  • /usr, /usr/bin, /usr/lib, /usr/share, /usr/local – user-space binaries, docs, etc.
  • /opt – optional manually installed software

These contain mostly static files: compiled binaries, libraries, man pages… not something that changes often.

⚠️ Unsafe or risky directories:

  • /etc – configuration files, might change frequently
  • /var, /tmp – logs, spools, caches, session data
  • /home – user files, temporary edits, live data
  • /dev, /proc, /sys – virtual filesystems, do not touch

If a file is modified in place after being hardlinked, there is no copy-on-write on ext4 or xfs: every link sees the change, so you could end up sharing edits you didn’t mean to share. And if the file is replaced with a new one (which is what package managers do during updates), the link is simply broken and you’re back where you started.

That’s why I avoid any folders with volatile, user-specific, or auto-generated files.


🧨 Risks and Limitations

Hardlinking is not magic. It comes with sharp edges:

  • One inode, multiple names: All links are equal. Editing one changes the data for all.
  • Backups: Some backup tools don’t preserve hardlinks or treat them inefficiently.
    Duplicity, which I use, does not preserve hardlinks. It backs up each linked file as a full copy, so hardlinking won’t reduce backup size.
  • Security: Linking files with different permissions or owners can have unexpected results.
  • Limited scope: Only works within the same filesystem (e.g., can’t link / and /mnt if they’re on separate partitions).

In my setup, I accept those risks because:

  • I’m only linking read-only system files.
  • I never link config or user data.
  • I don’t rely on hardlink preservation in backups.
  • I test changes before deploying.

In short: I know what I’m linking, and why.


🔍 What the Critics Say About Hardlinking

Not everyone loves hardlinks—and for good reasons.

The core arguments:

  • Hardlinks violate expectations about file ownership and identity.
  • They can break assumptions in software that tracks files by name or path.
  • They complicate file deletion logic—deleting one name doesn’t delete the content.
  • They confuse file monitoring and logging tools, since it’s hard to tell if a file is “new” or just another name.
  • They increase the risk of data corruption if accidentally modified in-place by a script that assumes it owns the file.

Why I’m still OK with it:

These concerns are valid—but mostly apply to:

  • Mutable files (e.g., logs, configs, user data)
  • Systems with untrusted users or dynamic scripts
  • Software that relies on inode isolation or path integrity

In contrast, my approach is intentionally narrow and safe:

  • I only deduplicate read-only system files in /bin, /sbin, /lib, /lib64, /usr, and /opt.
  • These are owned by root, and only changed during package updates.
  • I don’t hardlink anything under /home, /etc, /var, or /tmp.
  • I know exactly when the cron job runs and what it targets.

So yes, hardlinks can be dangerous—but only if you use them in the wrong places. In this case, I believe I’m using them correctly and conservatively.


⚡ Does Hardlinking Impact System Performance?

Good news: hardlinks have virtually no impact on system performance in everyday use.

Hardlinks are a native feature of Linux filesystems like ext4 or xfs. The OS treats a hardlinked file just like a normal file:

  • Reading and writing hardlinked files is just as fast as normal files.
  • Permissions, ownership, and access behave identically.
  • Common tools (ls, cat, cp) don’t care whether a file is hardlinked or not.
  • Filesystem caches and memory management work exactly the same.

The only difference is that multiple filenames point to the exact same data.

Things to keep in mind:

  • If you edit a hardlinked file, all links see that change because there’s really just one file.
  • Some tools (backup, disk usage) might treat hardlinked files differently.
  • Debugging or auditing files can be slightly trickier since multiple paths share one inode.

But from a performance standpoint? Your system won’t even notice the difference.


🛠️ Tools for Hardlinking

There are a few tools out there:

  • fdupes – finds duplicates and optionally replaces with hardlinks
  • rdfind – more sophisticated detection
  • hardlink – simple but limited
  • jdupes – high-performance fork of fdupes

📌 About Hadori

From the Debian package description:

This might look like yet another hardlinking tool, but it is the only one which only memorizes one filename per inode. That results in less memory consumption and faster execution compared to its alternatives. Therefore (and because all the other names are already taken) it’s called “Hardlinking DOne RIght”.

Advantages over other tools:

  • Predictability: arguments are scanned in order, and the first version encountered is kept
  • Much lower CPU and memory consumption compared to alternatives

This makes hadori especially suited for system-wide deduplication where efficiency and reliability matter.


⏱️ How I Use Hadori

I run hadori once per month with a cron job. Here’s the actual command:

/usr/bin/hadori --verbose /bin /sbin /lib /lib64 /usr /opt

This scans those directories, finds duplicate files, and replaces them with hardlinks when safe.

And here’s the crontab entry I installed in the file /etc/cron.d/hadori:

@monthly root /usr/bin/hadori --verbose /bin /sbin /lib /lib64 /usr /opt
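
If you want to see what a run actually saves before trusting the cron job, you can do a one-off run and compare disk usage. du counts each inode only once, so the difference between the two snapshots is the space reclaimed by hardlinking. This is just how I would measure it; adjust the paths to taste:

sudo du -sx /usr /opt                                             # note the numbers
sudo /usr/bin/hadori --verbose /bin /sbin /lib /lib64 /usr /opt
sudo du -sx /usr /opt                                             # compare with the numbers from before

# Spot-check which files now share an inode (link count > 1)
sudo find /usr -xdev -type f -links +1 | head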

📉 What Are the Results?

After the first run, I saw a noticeable reduction in used disk space, especially in /usr/lib and /usr/share. On my modest VPS, that translated to about 300–500 MB saved—not huge, but non-trivial for a small root partition.

While this doesn’t reduce my backup size (Duplicity doesn’t support hardlinks), it still helps with local disk usage and keeps things a little tidier.

And because the job only runs monthly, it’s not intrusive or performance-heavy.


🧼 Final Thoughts

Hardlinking isn’t something most people need to think about. And frankly, most people probably shouldn’t use it.

But if you:

  • Know what you’re linking
  • Limit it to static, read-only system files
  • Automate it safely and sparingly

…then it can be a smart little optimization.

With a tool like hadori, it’s safe, fast, and efficient. I’ve read the horror stories—and decided that in my case, they don’t apply.


✉️ This post was brought to you by a monthly cron job and the letters i-n-o-d-e.

🔍 How I Accidentally Discovered Power Query

A few weeks ago, I was knee-deep in CSV files. Not the fun kind. These were automatically generated reports from Cisco IronPort, and they weren’t exactly what I’d call analysis-friendly. Think: dozens of columns wide, thousands of rows, with summary data buried in awkward corners.

I was trying to make sense of incoming mail categories—Spam, Clean, Malware—and the numbers that went with them. Naturally, I opened the file in Excel, intending to wrangle the data manually like I usually do. You know: transpose the table, delete some columns, rename a few headers, calculate percentages… the usual grunt work.

But something was different this time. I noticed the “Get & Transform” section in Excel’s Data ribbon. I had clicked it before, but this time I gave it a real shot. I selected “From Text/CSV”, and suddenly I was in a whole new environment: Power Query Editor.


🤯 Wait, What Is Power Query?

For those who haven’t met it yet, Power Query is a powerful tool in Excel (and also in Power BI) that lets you import, clean, transform, and reshape data before it even hits your spreadsheet. It uses a language called M, but you don’t really have to write code—although I quickly did, of course, because I can’t help myself.

In the editor, every transformation step is recorded. You can rename columns, remove rows, change data types, calculate new columns—all through a clean interface. And once you’re done, you just load the result into Excel. Even better: you can refresh it with one click when the source file updates.


🧪 From Curiosity to Control

Back to my IronPort report. I used Power Query to:

  • Transpose the data (turn columns into rows),
  • Remove columns I didn’t need,
  • Rename columns to something meaningful,
  • Convert text values to numbers,
  • Calculate the percentage of each message category relative to the total.

All without touching a single cell in Excel manually. What would have taken 15+ minutes and been error-prone became a repeatable, refreshable process. I even added a “Percent” column that showed something like 53.4%—formatted just the way I wanted.


🤓 The Geeky Bit (Optional)

I quickly opened the Advanced Editor to look at the underlying M code. It was readable! With a bit of trial and error, I started customizing my steps, renaming variables for clarity, and turning a throwaway transformation into a well-documented process.

This was the moment it clicked: Power Query is not just a tool; it’s a pipeline.


💡 Lessons Learned

  • Sometimes it pays to explore what’s already in the software you use every day.
  • Excel is much more powerful than most people realize.
  • Power Query turns tedious cleanup work into something maintainable and even elegant.
  • If you do something in Excel more than once, Power Query is probably the better way.

🎯 What’s Next?

I’m already thinking about integrating this into more of my work. Whether it’s cleaning exported logs, combining reports, or prepping data for dashboards, Power Query is now part of my toolkit.

If you’ve never used it, give it a try. You might accidentally discover your next favorite tool—just like I did.


Have you used Power Query before? Let me know your tips or war stories in the comments!

In Defense of the Em Dash — A Beautiful Line of Thought ✍️

Lately, I’ve noticed something strange happening in online discussions: the humble em dash (—) is getting side-eyed as a telltale sign that a text was written with a so-called “AI.” I prefer the more accurate term: LLM (Large Language Model), because “artificial intelligence” is a bit of a stretch — we’re really just dealing with very complicated statistics 🤖📊.

Now, I get it — people are on high alert, trying to spot generated content. But I’d like to take a moment to defend this elegant punctuation mark, because I use it often — and deliberately. Not because a machine told me to, but because it helps me think 🧠.

A Typographic Tool, Not a Trend 🖋️

The em dash has been around for a long time — longer than most people realize. The oldest printed examples I’ve found are in early 17th-century editions of Shakespeare’s plays, published by the printer Okes in the 1620s. That’s not just a random dash on a page — that’s four hundred years of literary service 📜. If Shakespeare’s typesetters were using em dashes before indoor plumbing was common, I think it’s safe to say they’re not a 21st-century LLM quirk.

The Tragedy of Othello, the Moor of Venice, with long dashes (typeset here with 3 dashes)

A Dash for Thoughts 💭

In Dutch, the em dash is called a gedachtestreepje — literally, a thought dash. And honestly? I think that’s beautiful. It captures exactly what the em dash does: it opens a little mental window in your sentence. It lets you slip in a side note, a clarification, an emotion, or even a complete detour — just like a sudden thought that needs to be spoken before it disappears. For someone like me, who often thinks in tangents, it’s the perfect punctuation.

Why I Use the Em Dash (And Other Punctuation Marks)

I’m autistic, and that means a few things for how I write. I tend to overshare and infodump — not to dominate the conversation, but to make sure everything is clear. I don’t like ambiguity. I don’t want anyone to walk away confused. So I reach for whatever punctuation tools help me shape my thoughts as precisely as possible:

  • Colons help me present information in a tidy list — like this one.
  • Brackets let me add little clarifications (without disrupting the main sentence).
  • And em dashes — ah, the em dash — they let me open a window mid-sentence to give you extra context, a bit of tone, or a change in pace.

They’re not random. They’re intentional. They reflect how my brain works — and how I try to bridge the gap between thoughts and words 🌉.

It’s Not Just a Line — It’s a Rhythm 🎵

There’s also something typographically beautiful about the em dash. It’s not a hyphen (-), and it’s not a middling en dash (–). It’s long and confident. It creates space for your eyes and your thoughts. Used well, it gives writing a rhythm that mimics natural speech, especially the kind of speech where someone is passionate about a topic and wants to take you on a detour — just for a moment — before coming back to the main road 🛤️.

I’m that someone.

Don’t Let the Bots Scare You

Yes, LLMs tend to use em dashes. So do thoughtful human beings. Let’s not throw centuries of stylistic nuance out the window because a few bots learned how to mimic good writing. Instead of scanning for suspicious punctuation, maybe we should pay more attention to what’s being said — and how intentionally 💬.

So if you see an em dash in my writing, don’t assume it came from a machine. It came from me — my mind, my style, my history with language. And I’m not going to stop using it just because an algorithm picked up the habit 💛.

Help us search: three people looking for a warm home in Ghent

Sometimes luck is on your side, sometimes it just isn’t. The townhouse we had completely fallen in love with has unfortunately been rented out to someone else. A pity, but we are not giving up. We are continuing our search — and hopefully you can help us with it!

We are three people who want to share a house together in Ghent. We form a warm, conscious, and respectful living group, and we dream of a place where we can combine calm, connection, and creativity.

Who are we?

👤 Amedee (48): IT professional, balfolk dancer, amateur musician, loves board games and hiking, autistic and socially engaged
👩 Chloë (almost 52): Artist, former Waldorf teacher and permaculture designer, loves creativity, cooking, and nature
🎨 Kathleen (54): Doodle artist with a socio-cultural background, loves cosiness, being outdoors, and enjoys writing

Together we want to create a home where communication, care, and freedom are central. A place where you feel at home, and where there is room for small activities such as a games night, a workshop, a creative session, or simply being together quietly.

What are we looking for?

🏡 A house (not an apartment) in Ghent, at most a 15-minute bike ride from Gent-Sint-Pieters station
🌿 Energy-efficient: EPC label B or better
🛏 At least 3 spacious bedrooms of ±20 m²
💶 Rent:

  • up to €1650/month for 3 bedrooms
  • up to €2200/month for 4 bedrooms

Extra spaces such as an attic, guest room, studio, office, or hobby room are very welcome. We love airy, multifunctional rooms that can grow along with our needs.

📅 Available: from now, October at the latest

💬 Does the house have 4 bedrooms? Then we would be happy to welcome a fourth housemate who shares our values. But we deliberately want to avoid more than 4 residents — small-scale living works best for us.

Do you know of something? Let us know!

Do you know a house that fits this picture?
We are open to tips via estate agents, friends, neighbours, colleagues, or other networks — everything helps!

📩 Contact: amedee@vangasse.eu

Thank you for looking out with us — and sharing is always appreciated 💜

Living together in Ghent? We are starting a new cohousing project and are looking for one more person!

Have you always wanted to share a home with lovely people in a warm, open, and respectful atmosphere? Then this might be something for you.

Together with two friends, I am starting a new cohousing project in Ghent. We have our eye on a beautifully renovated townhouse, and we are looking for a fourth person to share the house with.

The house

It is a spacious townhouse full of character, with energy label B+. It features:

  • Four full-sized bedrooms of 18 to 20 m² each
  • One extra room that we can set up as a guest room, office, or hobby room
  • Two bathrooms
  • Two kitchens
  • An attic with sturdy beams — the creative ideas are already bubbling up!


The location is excellent: on Koning Albertlaan, barely a 5-minute bike ride from Gent-Sint-Pieters station and 7 minutes from the Korenmarkt. The rent is €2200 in total, which comes down to €550 per person with four residents.

The house is available from 1 July 2025.

Who are we looking for?

We are looking for someone who identifies with a number of shared values and wants to be part of a respectful, open, and conscious living environment. Concretely, that means for us:

  • You are open to diversity in all its forms
  • You are respectful, communicative, and considerate of others
  • You have an affinity with themes such as inclusion, mental health, and living together with care for one another
  • You have a calm character and enjoy contributing to a safe, harmonious atmosphere in the house
  • Age is not decisive, but since we are all 40+ ourselves, we are mainly looking for someone who recognises themselves in that stage of life


Something for you?

Do you feel a click with this story? Or do you have questions and want to get to know us better? Do not hesitate to get in touch via amedee@vangasse.eu.

Is this not for you, but do you know someone who would fit this picture perfectly? Then please share this call — thank you!

Together we can turn this house into a warm home.

📰 Featured by Sibelga and Passwerk: When Being Different Becomes a Strength

I am excited to share some wonderful news—Sibelga and Passwerk have recently published a testimonial about my work, and it has been shared across LinkedIn, Sibelga’s website, and even on YouTube!


What Is This All About?

Passwerk is an organisation that matches talented individuals on the autism spectrum with roles in IT and software testing, creating opportunities based on strengths and precision. I have been working with them as a consultant, currently placed at Sibelga, Brussels’ electricity and gas distribution network operator.

The article and video highlight how being “different” does not have to be a limitation—in fact, it can be a real asset in the right context. It means a lot to me to be seen and appreciated for who I am and the quality of my work.


Why This Matters

For many neurodivergent people, the professional world can be full of challenges that go beyond the work itself. Finding the right environment—one that values accuracy, focus, and dedication—can be transformative.

I am proud to be part of a story that shows what is possible when companies look beyond stereotypes and embrace neurodiversity as a strength.


Thank you to Sibelga, Passwerk, and everyone who contributed to this recognition. It is an honour to be featured, and I hope this story inspires more organisations to open up to diverse talents.

👉 Want to know more? Check out the article or watch the video!

🚗 French Road Trip to Balilas: From Ghent to Janzé with Strangers Turned Friends

A few weeks ago, I set off for Balilas, a balfolk festival in Janzé (near Rennes), Brittany (France). I had never been before, but as long as you have dance shoes, a tent, and good company, what more do you need?

Bananas for scale

From Ghent to Brittany… with Two Dutch Strangers

My journey began in Ghent, where I was picked up by Sterre and Michelle, two dancers from the Netherlands. I did not know them too well beforehand, but in the balfolk world, that is hardly unusual — de balfolkcommunity is één grote familie — one big family.

We took turns driving, chatting, laughing, and singing along. Google Maps logged our total drive time at 7 hours and 39 minutes.

Google knows everything
Péage – one of the many

Along the way, we had the perfect soundtrack:
🎶 French Road Trip 🇫🇷🥖 — 7 hours and 49 minutes of French and Francophone tubes (hits).

https://open.spotify.com/playlist/3jRMHCl6qVmVIqXrASAAmZ?si=746a7f78ca30488a

🍕 A Tasty Stop in Pré-en-Pail-Saint-Samson

Somewhere around dinner time, we stopped at La Sosta, a cozy Italian restaurant in Pré-en-Pail-Saint-Samson (2300 inhabitants). I had a pizza normande: base tomate, andouille, pomme, mozzarella, crème, persil (tomato base, andouille sausage, apple, mozzarella, cream, parsley). A delicious and unexpected regional twist — definitely worth remembering!

pizza normande

The pizzas were excellent, but also generously sized — too big to finish in one sitting. Heureusement, ils nous ont proposé d’emballer le reste à emporter — luckily, they offered to pack up the leftovers to take away. That was a nice touch — and much appreciated after a long day on the road.

Just too much to eat in one go

⛺ Arrival Just Before Dark

We arrived at the Balilas festival site five minutes after sunset, with just enough light left to set up our tents before nightfall. Trugarez d’an heol — thank you, sun, for holding out a little longer.

There were two other cars filled with people coming from the Netherlands, but they had booked a B&B. We chose to camp on-site to soak in the full festival atmosphere.

Enjoy the view!
Banana pancakes!

Balilas itself was magical: days and nights filled with live music, joyful dancing, friendly faces, and the kind of warm atmosphere that defines balfolk festivals.

Photo: Poppy Lens

More info and photos:
🌐 balilas.lesviesdansent.bzh
📸 @balilas.balfolk on Instagram


Balfolk is more than just dancing. It is about trust, openness, and sharing small adventures with people you barely know—who somehow feel like old friends by the end of the journey.

Tot de volgende — à la prochaine — betek ar blez a zeu!
🕺💃

Thank you Maï for proofreading the Breton expressions. ❤️

🎥 Automating Git Repository Visualizations with GitHub Actions and Gource

In the world of DevOps and continuous integration, automation is essential. One fascinating way to visualize the evolution of a codebase is with Gource, a tool that creates animated tree diagrams of project histories.

Recently, I implemented a GitHub Actions workflow in my ansible-servers repository to automatically generate and deploy Gource visualizations. In this post, I will walk you through how the workflow is set up and what it does.

But first, let us take a quick look back…


🕰️ Back in 2013: Visualizing Repos with Bash and XVFB

More than a decade ago, I published a blog post about Gource (in Dutch) where I described a manual workflow using Bash scripts. At that time, I ran Gource headlessly using xvfb-run, piped its output through pv, and passed it to ffmpeg to create a video.

It looked something like this:

#!/bin/bash -ex
 
xvfb-run -a -s "-screen 0 1280x720x24" \
  gource \
    --seconds-per-day 1 \
    --auto-skip-seconds 1 \
    --file-idle-time 0 \
    --max-file-lag 1 \
    --key -1280x720 \
    -r 30 \
    -o - \
  | pv -cW \
  | ffmpeg \
    -loglevel warning \
    -y \
    -b:v 3000K \
    -r 30 \
    -f image2pipe \
    -vcodec ppm \
    -i - \
    -vcodec libx264 \
    -preset ultrafast \
    -pix_fmt yuv420p \
    -crf 1 \
    -threads 0 \
    -bf 0 \
    ../gource.mp4

This setup worked well for its time and could even be automated via cron or a Git hook. However, it required a graphical environment workaround and quite a bit of shell-fu.


🧬 From Shell Scripts to GitHub Actions

Fast forward to today, and things are much more elegant. The modern Gource workflow lives in .github/workflows/gource.yml and is:

  • 🔁 Reusable through workflow_call
  • 🔘 Manually triggerable via workflow_dispatch
  • 📦 Integrated into a larger CI/CD pipeline (pipeline.yml)
  • ☁️ Cloud-native, with video output stored on S3

Instead of bash scripts and virtual framebuffers, I now use a well-structured GitHub Actions workflow with clear job separation, artifact management, and summary reporting.


🚀 What the New Workflow Does

The GitHub Actions workflow handles everything automatically:

  1. ⏱️ Decides if a new Gource video should be generated, based on time since the last successful run.
  2. 📽️ Generates a Gource animation and a looping thumbnail GIF.
  3. ☁️ Uploads the files to an AWS S3 bucket.
  4. 📝 Posts a clean summary with links, preview, and commit info.

It supports two triggers:

  • workflow_dispatch (manual run from the GitHub UI)
  • workflow_call (invoked from other workflows like pipeline.yml)

You can specify how frequently it should run with the skip_interval_hours input (default is every 24 hours).


🔍 Smart Checks Before Running

To avoid unnecessary work, the workflow first checks:

  • If the workflow file itself was changed.
  • When the last successful run occurred.
  • Whether the defined interval has passed.

Only if those conditions are met does it proceed to the generation step.
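
Conceptually, the interval check boils down to a few lines of shell. This is a minimal sketch, assuming the timestamp of the last successful run has already been fetched from the GitHub API into LAST_RUN_EPOCH; the real workflow does this with Actions steps rather than a standalone script:

# LAST_RUN_EPOCH: epoch seconds of the last successful run (assumed to be fetched earlier)
SKIP_INTERVAL_HOURS="${SKIP_INTERVAL_HOURS:-24}"
now_epoch=$(date +%s)
elapsed_hours=$(( (now_epoch - LAST_RUN_EPOCH) / 3600 ))

if (( elapsed_hours < SKIP_INTERVAL_HOURS )); then
  echo "Last successful run was ${elapsed_hours}h ago; skipping Gource generation."
  exit 0
fi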


🛠️ Building the Visualization

🧾 Step-by-step:

  1. Checkout the Repo
    Uses actions/checkout with fetch-depth: 0 to ensure full commit history.
  2. Generate Gource Video
    Uses nbprojekt/gource-action with configuration for avatars, title, and resolution.
  3. Install FFmpeg
    Uses AnimMouse/setup-ffmpeg to enable video and image processing.
  4. Create a Thumbnail
    Extracts preview frames and assembles a looping GIF for visual summaries (a rough sketch of this step follows right after this list).
  5. Upload Artifacts
    Uses actions/upload-artifact to store files for downstream use.
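
For the thumbnail step, the idea is to pull a handful of frames out of the rendered video and loop them as a small GIF. With plain ffmpeg that looks roughly like this; the filenames, frame count, and size are my assumptions, not the exact workflow code:

# Extract 20 low-rate, scaled-down frames from the rendered video (parameters are assumptions)
ffmpeg -i gource.mp4 -vf "fps=2,scale=320:-1" -frames:v 20 frame_%03d.png

# Assemble them into an endlessly looping GIF
ffmpeg -framerate 4 -i frame_%03d.png -loop 0 gource-thumbnail.gif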

☁️ Uploading to AWS S3

In a second job:

  • AWS credentials are securely configured via aws-actions/configure-aws-credentials.
  • Files are uploaded using a commit-specific path.
  • Symlinks (gource-latest.mp4, gource-latest.gif) are updated to always point to the latest version.
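
In AWS CLI terms, the upload job boils down to something like the following. The bucket name and key layout are placeholders of mine, not the real configuration; GITHUB_SHA is the commit that triggered the run:

# Upload the commit-specific artifacts (bucket and key layout are placeholders)
aws s3 cp gource.mp4 "s3://my-bucket/gource/${GITHUB_SHA}/gource.mp4"
aws s3 cp gource.gif "s3://my-bucket/gource/${GITHUB_SHA}/gource.gif"

# "Symlinks": fixed keys that always receive a copy of the newest build
aws s3 cp "s3://my-bucket/gource/${GITHUB_SHA}/gource.mp4" "s3://my-bucket/gource/gource-latest.mp4"
aws s3 cp "s3://my-bucket/gource/${GITHUB_SHA}/gource.gif" "s3://my-bucket/gource/gource-latest.gif"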

📄 A Clean Summary for Humans

At the end, a GitHub Actions summary is generated, which includes:

  • A thumbnail preview
  • A direct link to the full video
  • Video file size
  • Commit metadata

This gives collaborators a quick overview, right in the Actions tab.


🔁 Why This Matters

Compared to the 2013 setup:

2013 Bash Script              2025 GitHub Actions Workflow
----------------------------  ----------------------------------------
Manual setup via shell        Fully automated in CI/CD
Local only                    Cloud-native with AWS S3
Xvfb workaround required      Headless and clean execution
Script needs maintenance      Modular, reusable, and versioned
No summaries                  Markdown summary with links and preview

Automation has come a long way — and this workflow is a testament to that progress.


✅ Final Thoughts

This Gource workflow is now a seamless part of my GitHub pipeline. It generates beautiful animations, hosts them reliably, and presents the results with minimal fuss. Whether triggered manually or automatically from a central workflow, it helps tell the story of a repository in a way that is both informative and visually engaging. 📊✨

Would you like help setting this up in your own project? Let me know — I am happy to share.