All notes

Read P-322 Notes on digital heritage infrastructure, dataspaces, and governance.

Is a heritage dataspace still an open data infrastructure?
Gertjan Filarski

Open data sounds simple, but in practice it rarely is. Anyone who shares heritage data inevitably runs into legal limits, technical constraints, ethical considerations, and geopolitical reality. Ahead of publishing my fourth dataspace experiment next week, this blog pauses on a more fundamental question: what does it really mean to set access conditions within an infrastructure that calls itself “open data”? When is “open” no longer unconditional? And how do you translate policy, trust, and responsibility into technology? This post explores where openness starts to chafe, where control becomes unavoidable, and why that is exactly where dataspaces show their value.

Heritage and data spaces: experiments 2 & 3
Gertjan Filarski

Data spaces sound abstract until you take them apart and look at what actually happens when multiple parties try to access data at the same time. In this blog I walk through experiments 2 and 3, where a single provider is approached by multiple consumers — including a malicious one. What happens to contracts, keys, and access when the architecture is put under pressure? And where does responsibility really lie: with the provider, or with the party holding the key?

Using working code, I show why roles, transactions, and infrastructure layers matter, and what a data space does — and explicitly does not — promise when it comes to security. No policy talk, but concrete technical observations from experiments that are allowed to break.

Heritage and data spaces: experiment 1
Gertjan Filarski

Data spaces are showing up more and more in heritage policy, but what do they actually mean technically? In this first experiment I build a working data space transaction using European open-source technology: no diagrams, but real software you can run locally. Step by step I show how providers and consumers negotiate, wait, and ultimately exchange data under explicit conditions. No abstract policy talk, but a concrete sign of life beneath the words ‘data space’.

Sustainable links: why not everything that lasts should do everything
Gertjan Filarski

Sustainable links sound like a technical choice until you realise the real problem is organisational: one infrastructure expected to support archiving, scholarly citation, and marketing at the same time. This blog walks through Handle, DOI, ARK, and even PURL—but above all through the underlying question: what are you allowed to promise permanently, and what do you only want to measure temporarily? The outcome is a surprisingly simple design rule: separate identity from attention.

Names That Endure: Persistent Identifiers Without the Illusion of Eternity
Gertjan Filarski

Links feel like concrete, until they suddenly turn to sand. In heritage data, that is not a detail but a risk: what works today may vanish tomorrow due to a new vendor, a reorganisation, or a change in leadership. In this blog, I show why a URL is not a promise, what a PID does deliver, and how minimal, sober infrastructure can build trust, without illusions of eternity. If your collections need to outlive your organisation, this is worth reading.
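
The difference between a URL and a PID can be illustrated with a minimal sketch. It assumes a purely in-memory resolver; the class `PidResolver`, the identifier `pid:example/0001`, and both example URLs are hypothetical and not taken from the blog. The point is only that the name stays fixed while the location behind it is allowed to change.

```python
from dataclasses import dataclass, field


@dataclass
class PidResolver:
    """Hypothetical in-memory resolver: stable identifier -> current location."""

    locations: dict[str, str] = field(default_factory=dict)

    def register(self, pid: str, url: str) -> None:
        # First publication: bind the identifier to where the object lives now.
        self.locations[pid] = url

    def move(self, pid: str, new_url: str) -> None:
        # A vendor change or reorganisation updates the location only;
        # the identifier that others have cited is left untouched.
        if pid not in self.locations:
            raise KeyError(f"unknown identifier: {pid}")
        self.locations[pid] = new_url

    def resolve(self, pid: str) -> str:
        return self.locations[pid]


resolver = PidResolver()
resolver.register("pid:example/0001", "https://old-vendor.example/objects/1")
resolver.move("pid:example/0001", "https://new-vendor.example/collection/1")
print(resolver.resolve("pid:example/0001"))
```

Anyone who cited `pid:example/0001` before the move still reaches the object afterwards. What the sketch cannot show is the organisational part: someone has to keep calling `move` when things change.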

The Art of Getting Lost with AI
Gertjan Filarski

In my attempt to install a simple analytics tool, I got tangled in a chain of seemingly plausible AI suggestions — each just convincing enough to continue. What began as a fifteen-minute task turned into hours of trying, correcting, and hoping the next step would work. It became a lesson in how language models simulate confidence, how easily you get swept along, and how important it remains to verify every step yourself.

What Are We Really Looking For? Dataset Discovery in the NDE Dataset Register on the Road to a Dataspace.
Gertjan Filarski

The third lesson learned from the Datahub Colonial Collections service platform.

The NDE Dataset Register is currently mainly a technical entry point to endpoints, but for a service platform like the Datahub Colonial Collections that is not enough. We need to understand what we are actually ingesting: the content, the technical quality, the legal conditions and the ethical sensitivities. By broadening discovery to these four lenses, the dataset register becomes a meaningful instrument that helps data providers publish responsibly and prepares service platforms for a dataspace logic.

Outside the Dataset, Within the Community: Enrichments as Nanopublications
Gertjan Filarski

The second lesson from the Colonial Collections Datahub: enrichments must become independent, citable knowledge items.

The Datahub cannot store enrichments in its cache without losing them every night—nor without becoming an aggregator itself. Communities of origin also want their knowledge to exist outside Dutch infrastructure. This is why enrichments are published as standalone nanopublications: small, verifiable knowledge claims with their own provenance and licensing, distributed across an international network. The cache stays temporary, but the knowledge becomes durable, citable, and independent—exactly what reproducible and responsible infrastructure requires.
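
As a rough illustration of what "small, verifiable knowledge claims with their own provenance and licensing" could look like as a data structure, here is a minimal sketch. It borrows the three-part layout of a nanopublication (assertion, provenance, publication info) but replaces RDF graphs with plain dictionaries and uses a SHA-256 hash as a stand-in for the real integrity mechanism. The class name, field contents, and `example:` identifiers are all assumptions made for illustration, not the Datahub's actual model.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Nanopublication:
    """Simplified stand-in: real nanopublications are RDF named graphs."""

    assertion: dict   # the knowledge claim itself
    provenance: dict  # where the claim comes from
    pubinfo: dict     # who published it, and under which licence

    def fingerprint(self) -> str:
        # Hash a canonical serialisation so any change to the claim, its
        # provenance, or its licence yields a different fingerprint.
        canonical = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


enrichment = Nanopublication(
    assertion={"about": "example:object-1", "comment": "placeholder enrichment text"},
    provenance={"attributed_to": "example:community-contributor"},
    pubinfo={"license": "example:license-a"},
)
relicensed = Nanopublication(
    assertion=enrichment.assertion,
    provenance=enrichment.provenance,
    pubinfo={"license": "example:license-b"},
)
print(enrichment.fingerprint() != relicensed.fingerprint())
```

Because the claim carries its own provenance and licence, it can be stored and cited outside the nightly cache without depending on it.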

Cache, Not Copies: A Remedy Against Aggregation
Gertjan Filarski

The first blog with lessons learned from treating colonial collections as a service platform.

The Datahub lives on the rhythm of nightly synchronizations: every day the entire knowledge graph is rebuilt from scratch. Not to preserve it, but to remain deliberately disposable. A dataset that cannot be fully reconstructed from its source is an aggregate. By remaining fully reproducible, the Datahub prevents itself from becoming an authority and stays a service provider rather than a gatekeeper.
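
The rule "a dataset that cannot be fully reconstructed from its source is an aggregate" can be phrased as a simple check. The sketch below assumes the knowledge graph is just a set of triples and that each source can be read in full; the function names `rebuild` and `is_reproducible` and the sample data are hypothetical, not the Datahub's code.

```python
Triple = tuple[str, str, str]


def rebuild(sources: dict[str, list[Triple]]) -> set[Triple]:
    """Build the whole graph from scratch, using nothing but the sources."""
    graph: set[Triple] = set()
    for triples in sources.values():
        graph.update(triples)
    return graph


def is_reproducible(cache: set[Triple], sources: dict[str, list[Triple]]) -> bool:
    """A cache is disposable only if a fresh rebuild yields exactly the same graph."""
    return cache == rebuild(sources)


sources = {
    "museum-a": [("ex:obj1", "ex:title", "Mask")],
    "museum-b": [("ex:obj2", "ex:title", "Drum")],
}
cache = rebuild(sources)

# A locally added statement exists nowhere at the source, so a nightly
# rebuild would silently drop it: the cache has started to aggregate.
drifted = cache | {("ex:obj1", "ex:note", "added locally")}

print(is_reproducible(cache, sources), is_reproducible(drifted, sources))
```

The design choice is that the check compares against the sources rather than against yesterday's cache, so nothing can accumulate in the intermediate layer unnoticed.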

Service Platform Datahub Colonial Collections Part 3: from NDE Infrastructure to Dataspace
Gertjan Filarski

The Datahub Colonial Collections automatically processes and publishes museum data according to NDE principles: technically correct, legally permitted, and transparently traceable. But when datasets include colonial-era objects or human remains, the question shifts from "can we?" to "should we?"

Responsible publication requires four frameworks: contextual, technical, legal, and ethical. In a European dataspace, these frameworks become enforceable protocols, enabling responsible and trustworthy data sharing with trust by design built into the infrastructure.

Service Platform Datahub Colonial Collections Part 2: Infrastructure and NDE Compliance
Gertjan Filarski

In part 2 of the datahub as a service platform, we descend into the infrastructure beneath the Colonial Collections. Not the interface, but the agreements that determine who publishes which data, how it keeps flowing, and how close we stay to the source. We follow how museums publish their own data according to the NDE architecture, how the Datahub uses it without aggregating, and why enrichments are stored as independent knowledge instead of being hidden in an intermediate layer.

Service Platform Datahub Colonial Collections Part 1: The User Environment
Gertjan Filarski

The Datahub for Colonial Collections is one of the first concrete examples of a dataspace in the heritage domain. In this infrastructure, collections from Dutch museums with a colonial context are brought together and enriched with knowledge from communities of origin — as with the Bird of Prophecy from Nigeria. The Datahub presents academic and local knowledge equally within a single shared system.

This Part 1 focuses on the user environment of the Datahub.

A Data Architecture Studio
Gertjan Filarski

Data are the raw material of the digital world. But just as steel, wood, or pigment gain value only through design and craftsmanship, data becomes meaningful only when given structure, context, and coherence. That is precisely the domain of a data architecture studio.
