Open Source Photography: Why Immich's Dataset Project Matters

January 31, 2026

I’ve been running Immich for a while now as my self-hosted photo management solution, and it’s been brilliant. For those who haven’t heard of it, Immich is essentially an open-source alternative to Google Photos that you can run on your own hardware. No subscription fees, no cloud storage limits, and most importantly, complete control over your data. It’s exactly the kind of project that makes the open-source ecosystem so valuable.

Now they’re asking for help, and honestly, it’s the kind of request that makes you appreciate how these projects actually work behind the scenes.

The team behind Immich is building a public EXIF dataset to improve their metadata parsing capabilities. For the non-technical folks out there, EXIF data is all the information your camera embeds into a photo – things like what camera model took it, the lens used, exposure settings, and yes, even GPS coordinates if you have location services enabled. The problem is that every camera manufacturer does this slightly differently. It’s a bit like everyone speaking English but with wildly different accents and dialects.

Someone in the discussion thread explained it perfectly: take 20 different cameras and they’ll store the same data in 20 different ways. Immich currently uses ExifTool, which is brilliant software that’s been around for years, but to really excel at organising and understanding photos from every possible device, they need more data. Lots more data. From as many different cameras, phones, and devices as possible.
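
To make that concrete, here’s a rough Python sketch of the kind of normalisation involved, using capture time as the example. It assumes the tags have already been read into a plain dictionary (for instance from ExifTool’s JSON output) and that they use ExifTool-style tag names. The priority order, the list of formats, and the function name capture_time are my own assumptions for illustration; this isn’t Immich’s actual code.

```python
from datetime import datetime

# Tags that may hold the capture time, in the order this sketch trusts them.
# The ordering is an assumption for illustration, not Immich's logic.
CANDIDATE_TAGS = [
    "SubSecDateTimeOriginal",
    "DateTimeOriginal",
    "CreateDate",
    "ModifyDate",
]

# Timestamp layouts this sketch understands: with or without fractional
# seconds, and with or without a UTC offset such as +11:00.
FORMATS = [
    "%Y:%m:%d %H:%M:%S.%f%z",
    "%Y:%m:%d %H:%M:%S%z",
    "%Y:%m:%d %H:%M:%S.%f",
    "%Y:%m:%d %H:%M:%S",
]


def capture_time(tags):
    """Return the first parseable capture time found in tags, or None."""
    for name in CANDIDATE_TAGS:
        raw = tags.get(name)
        if not isinstance(raw, str):
            continue  # tag missing, or not stored as text
        for fmt in FORMATS:
            try:
                return datetime.strptime(raw.strip(), fmt)
            except ValueError:
                continue  # wrong layout or an invalid date, try the next one
    return None
```

A device with an unset clock might write an all-zero timestamp, which this sketch skips before falling back to the next tag: capture_time({"DateTimeOriginal": "0000:00:00 00:00:00", "ModifyDate": "2019:07:04 12:00:00"}) returns the 2019 value. Every new quirk means another entry in one of those lists, which is why real samples from real devices are so useful.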

The whole thing reminds me of those early days of Wikipedia, when the ambitious idea of creating a comprehensive encyclopedia relied entirely on people volunteering their knowledge. This is similar – building a comprehensive dataset of how different devices store photo metadata requires the community to chip in with examples from their own gear.

What strikes me about this is how it highlights both the strengths and challenges of open-source software. On one hand, proprietary systems like Google Photos have vast datasets to work with – billions of photos uploaded by users, which gives them an enormous amount of training data to perfect their systems. On the other hand, those photos sit on Google’s servers, analysed by their algorithms, feeding their business model. With Immich, you maintain control, but the trade-off is that improvements rely on community participation rather than corporate resources.

The privacy considerations here are interesting too. The project is refreshingly upfront about the fact that uploaded photos will be publicly accessible, including all that metadata. They’re explicitly warning people not to upload anything that could be personally identifying – no photos from your home with GPS coordinates embedded, for instance. It’s a good reminder of just how much information our devices are quietly recording. Every photo I take with my iPhone records exactly where I was standing when I took it, unless I’ve specifically disabled that feature.

There’s something wonderfully practical about the responses in the discussion. People are genuinely enthusiastic about contributing, but they’re also asking smart questions. One person wanted to know if they could strip out the actual image data and just submit the metadata from thousands of photos spanning 20 years. Another was digging through their cupboards looking for old cameras that might still work, thinking they could contribute examples from vintage hardware.

This kind of collaborative effort represents what I love most about the open-source community. It’s not just about free software – it’s about building tools collectively that serve everyone’s needs, not just shareholders’ profits. When commercial software falls short or disappears (remember Picasa?), community-driven alternatives keep going because they’re not dependent on quarterly earnings or strategic pivots.

I’ll be contributing some photos myself. I’ve got an old Canon DSLR gathering dust, plus obviously my iPhone, and probably a few other devices floating around. It won’t take much effort – they’re only asking for a handful of photos from each device – but the cumulative effect of hundreds or thousands of people doing the same will make Immich better for everyone who uses it.

There’s a broader lesson here about the sustainability of open-source projects. They need more than just code contributions; they need data, documentation, bug reports, and sometimes just people willing to upload a few test photos. These projects work because people recognise that putting in a small amount of effort individually creates massive value collectively.

If you’re using Immich, or even if you’re just interested in supporting quality open-source software, consider contributing to their dataset. Take a few generic photos – a park, your coffee cup, the sky – with whatever cameras you have access to, and upload them. Just remember to check your location settings first, and make sure the photos you pick don’t have GPS coordinates embedded. The last thing anyone needs is to accidentally contribute their home address to a public dataset.
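
If you’d rather check than trust your memory, something like the following Python sketch would flag location data before you upload. It assumes you’ve already dumped a photo’s metadata into a dictionary keyed by tag name (ExifTool’s JSON output is one way to get that), and it simply looks for tag names starting with "GPS". The function name location_tags and the prefix rule are my own assumptions, and it won’t catch everything that could identify you, such as a place name typed into a caption.

```python
def location_tags(tags):
    """Return the names of tags that look like they carry GPS data."""
    found = []
    for key in tags:
        # Drop an optional group prefix such as "EXIF:" or "Composite:".
        name = key.split(":")[-1]
        if name.upper().startswith("GPS"):
            found.append(key)
    return sorted(found)


# Hypothetical metadata for one photo, made up for this example.
sample = {
    "Make": "Canon",
    "Model": "Example Camera",
    "EXIF:GPSLatitude": -37.8,
    "GPSLongitude": 144.9,
}

flagged = location_tags(sample)
if flagged:
    print("Location data found, do not upload as-is:", ", ".join(flagged))
else:
    print("No GPS tags found.")
```

An empty result only means no GPS-named tags were present, so it’s still worth a quick look at what’s actually in the frame.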

It’s small efforts like these that keep the open-source ecosystem thriving, and frankly, we need more of that kind of community participation. The alternative is continuing to hand over all our data, photos, and digital lives to massive corporations who monetise every byte. I know which future I’d rather support.