It took me a very long time to find a resource that effectively explained what linked data¹ was. I think that’s one of the major reasons it isn’t as widely known as it perhaps should be.
It was Seth van Hooland and Ruben Verborgh’s book Linked Data for Libraries, Archives and Museums that finally(!) opened my eyes to what’s possible through linked data. The book compared it to other well-known formats and data models such as JSON, CSV, relational databases and markup standards like XML. By contrast, when academics needed to briefly explain what ‘linked data’ was in the articles I was reading, their definitions were often as similar to each other as they were unhelpful. And I get it: not only do you have to explain this unusual data structure, you also have to explain it to a bunch of musicologists – at least in the papers I’m reading.
Still, questions remain unanswered – at least to me. When Berners-Lee talks about putting our data on the web and linking it to other data, how does that play out across the various open data formats out there? Are we supposed to put URIs/URLs into the cells of our spreadsheets? This is probably a really dumb question; I don’t know. I think I’ve vaguely figured it out.
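One way to picture the spreadsheet question, as a minimal sketch rather than a recommended workflow: the cells can keep their plain values, and a small script can mint URIs from them when exporting to a linked data format. Everything named here is an assumption for illustration: the `example.org` base URI, the `parentOf` vocabulary term, the column names and the `rows_to_ntriples` helper are all hypothetical.

```python
import csv
import io

# Hypothetical base URI; example.org is reserved for illustrations.
BASE = "http://example.org/"

# A tiny stand-in for a spreadsheet exported as CSV (made-up data).
CSV_TEXT = """person,parent_of
Bob,Alice
Alice,Carol
"""


def rows_to_ntriples(csv_text, base=BASE):
    """Turn each CSV row into one N-Triples statement with minted URIs."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{base}person/{row['person']}>"
        predicate = f"<{base}vocab/parentOf>"
        obj = f"<{base}person/{row['parent_of']}>"
        # An N-Triples statement is subject, predicate, object, then a dot.
        lines.append(f"{subject} {predicate} {obj} .")
    return lines


ntriples = rows_to_ntriples(CSV_TEXT)
print("\n".join(ntriples))
```

The point of the sketch is only that the URIs do not have to live in the spreadsheet itself; they can be generated from ordinary cell values at export time, provided the naming scheme is consistent.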
Reversibility(?) remains a mystery to me. If I define
<Bob> <parentOf> <Alice>, do I also need to define
<Alice> <childOf> <Bob>? I found this archived blog post from Tim BL himself on the topic, but not much else. I also haven’t found much on SPARQL endpoints, which I think is another major reason linked data isn’t better known. How do the endpoints work? How do you build one?
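On the reversibility question, OWL does define a property, owl:inverseOf, for declaring that two predicates mirror each other, so that software which understands it can derive the second statement from the first. The following is a minimal sketch of that idea in plain Python, not any particular library’s API: the `INVERSES` table and the `with_inverses` helper are hypothetical names, and real reasoners do considerably more than this.

```python
# Hypothetical declaration that parentOf and childOf mirror each other,
# loosely modelled on what an owl:inverseOf statement expresses.
INVERSES = {"parentOf": "childOf"}


def with_inverses(triples, inverses):
    """Return the stated triples plus any triples implied by inverse pairs."""
    # Make the lookup work in both directions (parentOf <-> childOf).
    pairs = dict(inverses)
    pairs.update({v: k for k, v in inverses.items()})

    result = set(triples)
    for subject, predicate, obj in triples:
        if predicate in pairs:
            # Swap subject and object and use the mirrored predicate.
            result.add((obj, pairs[predicate], subject))
    return result


# Only one direction is stated explicitly.
stated = {("Bob", "parentOf", "Alice")}
inferred = with_inverses(stated, INVERSES)

for triple in sorted(inferred):
    print(triple)
```

Under these assumptions, only one direction needs to be written down by hand; whether the other direction actually shows up in query results depends on whether the store or endpoint in question applies this kind of inference.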
Not to mention the ontologies. When I think about how the idea is to be able to traverse this online knowledge graph through shared ontologies, my mind always returns to this XKCD comic.