Rupert's world

Bury the hierarchy: On folders, broken links, and Bill Gates' biggest mistake

Or, "Down with the hierarchy-archy".

Let's talk about the filesystem.

Somewhere deep in MIT, around 1965, a group of researchers were putting the finishing touches on their new computer operating system, Multics.

One of the breakthrough features? A "hierarchical, location-addressable" filesystem.

Instead of dealing with an unstructured list of data on your computer, you could collect data in "files" and put them in "folders" — in other words, your information was structured in a hierarchy. And, to interact with some data, you'd just need to know where it was in the hierarchy – a location address.

Fast-forward four years to 1969, and a team of breakaway Multics engineers invented Unix, carrying the files-and-folders metaphor with them. Today, with the huge influence Unix had on the genesis of personal computing, the hierarchical location-addressable filesystem is everywhere.

And it's fine...?

Well, depends on who you ask. In my view, the files-and-folders metaphor has got us all in a conceptual rut.

Real files and folders (the ones our filesystem metaphor is based on) are physical objects. And physical objects have certain constraints, which we have unwittingly imposed on every piece of data in every computer in the modern world.

For example, the research document I put together while writing this article applies to a bunch of other work I have on at the moment. But it can only live in this folder with my newsletter articles – one folder, one file, one physical place (copy... paste...).

I also have a bunch of information on my computer that I'd love to connect. A PDF article I downloaded the other day on David Lynch's maps of Twin Peaks would pair excellently with a book I saw recently on cartography, and with a specific paragraph in a journal entry on travel I made in 2015, and with an email from a friend who's obsessed with pop culture.

But again, I confront this clunky metaphor – if I want to link files, I need to refer to where they are, and I can only refer to the files in their entirety. If I move the files, change computer, or email a file to a friend, all these relationships are gone. Poof.
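To make the "poof" concrete, here's a minimal Python sketch of a link that is nothing more than a remembered location. The folder and file names are made up for illustration, and everything happens in a throwaway temporary directory. The link works until the file is tidied away somewhere else, at which point it points at nothing, even though the content itself is untouched.

```python
import shutil
import tempfile
from pathlib import Path


def demo_broken_link() -> tuple[bool, bool]:
    """Return (link_works_before_move, link_works_after_move)."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        articles = root / "newsletter"  # hypothetical folder names
        archive = root / "archive"
        articles.mkdir()
        archive.mkdir()

        # The file, and a "link" to it: just a remembered location.
        pdf = articles / "twin-peaks-maps.pdf"
        pdf.write_text("maps of Twin Peaks")
        link = pdf  # a location address, nothing more

        works_before = link.exists()

        # Reorganise: same content, new place in the hierarchy.
        shutil.move(str(pdf), str(archive / pdf.name))

        works_after = link.exists()
        return works_before, works_after


if __name__ == "__main__":
    before, after = demo_broken_link()
    print("link resolves before move:", before)
    print("link resolves after move:", after)
```

Nothing about the file changed except where it lives, and that was enough to break the relationship.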

That's not to mention needing to open specific apps to see what's inside my files, the lack of easy tagging and metadata, the hell that is broken links on the web...

Because of the limitations of filesystems for actually organising information in a way people find useful, proprietary layers of software have been built to abstract away from all this – think Notion, Airtable, Google Drive, Evernote. Even tools like iTunes are just databases superimposed on a folder full of files to make them palatable. More recently, apps like Hook have emerged that explicitly link files with unique IDs instead of location addresses. While it's a fascinating take, it's still an extra layer of brittle complexity – and you'd better hope the company never goes bust.

But there have been glimpses of a better future.

For example, WinFS was an ambitious (but doomed) filesystem pioneered by Microsoft in the early 2000s. It treated your entire filesystem like a relational database – instead of having discrete closed-off files, information was set free like a spreadsheet. You could link anything to anything else, sort and filter by rich metadata, and form relationships between different bits of content.
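I can't show you WinFS itself, but the general idea of "items plus relationships in a relational store" is easy to sketch with Python's built-in sqlite3 module. To be clear, this is not WinFS's schema or API: the table names, columns and rows below are my own assumptions, invented purely for illustration (only the 2015 journal entry comes from my real example above).

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database of items and links between them."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE item (
            id    INTEGER PRIMARY KEY,
            kind  TEXT NOT NULL,
            title TEXT NOT NULL,
            year  INTEGER
        );
        CREATE TABLE link (
            from_id INTEGER NOT NULL REFERENCES item(id),
            to_id   INTEGER NOT NULL REFERENCES item(id)
        );
        """
    )
    # Illustrative rows only; the years (apart from 2015) are made up.
    db.executemany(
        "INSERT INTO item (id, kind, title, year) VALUES (?, ?, ?, ?)",
        [
            (1, "pdf", "Maps of Twin Peaks", 2021),
            (2, "book", "A book on cartography", 2019),
            (3, "journal", "Travel journal entry", 2015),
            (4, "email", "Email from a pop-culture friend", 2021),
        ],
    )
    # Link the PDF to everything it relates to.
    db.executemany(
        "INSERT INTO link (from_id, to_id) VALUES (?, ?)",
        [(1, 2), (1, 3), (1, 4)],
    )
    db.commit()
    return db


def linked_titles(db: sqlite3.Connection, item_id: int) -> list[str]:
    """Titles of everything linked from item_id, oldest first."""
    rows = db.execute(
        "SELECT item.title FROM link "
        "JOIN item ON item.id = link.to_id "
        "WHERE link.from_id = ? "
        "ORDER BY item.year, item.title",
        (item_id,),
    )
    return [title for (title,) in rows]


if __name__ == "__main__":
    db = build_demo_db()
    for title in linked_titles(db, 1):
        print(title)
```

The point is that the links refer to items by identity rather than by folder path, so sorting, filtering and following relationships become ordinary queries.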

(When it fell over, Bill Gates referred to it as his "greatest mistake", saying the world was not ready for it yet and promising its return.)

At the moment, I'm most excited by content-addressable filesystems, which also originated in the 60s but then somehow slipped away unnoticed. In a content-addressable filesystem, instead of pointing to where a file is on a computer, you point to what it is. Think of it like someone asking you what you want to eat for dinner. Instead of saying "fridge, third shelf, 20cm to the left" (a location address) we'd probably rather say "the salmon fillets, please" (a content address).
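Here's a toy version of the "salmon fillets, please" idea in Python. It's a sketch under my own assumptions: I use a plain SHA-256 hex digest as the address and an in-memory dictionary as the store, and the names `content_address` and `ContentStore` are hypothetical. Real content-addressed systems define their own identifier formats and storage layers.

```python
import hashlib


def content_address(data: bytes) -> str:
    """Derive an address from the bytes themselves (SHA-256, hex-encoded)."""
    return hashlib.sha256(data).hexdigest()


class ContentStore:
    """A toy in-memory store keyed by what the data is, not where it is."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        address = content_address(data)
        self._blobs[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blobs[address]
        # The address doubles as an integrity check on what comes back.
        if content_address(data) != address:
            raise ValueError("stored data does not match its address")
        return data


if __name__ == "__main__":
    store = ContentStore()
    address = store.put(b"the salmon fillets, please")
    print(address)
    print(store.get(address))
```

Two things fall out of this design: identical content always gets the same address no matter who stores it or where, and anyone holding the address can check that what they received is what they asked for.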

The latest take on mainstream content-addressed storage is the InterPlanetary File System (IPFS). It's not just concerned with having an address for the files on your computer, but also files across the entire internet. Each file is uniquely identified in a "global namespace".

Importantly, IPFS needs no centralised host, no web server, no external company like Dropbox or iCloud Drive – data is distributed over all the computers in the network. And if you link to something, that link won't break with a reshuffle of your files, or a website going offline – so long as the data can be found somewhere in the world (by matching its content, not its location) then everything will work.

(It reminds me a bit of the original vision for Project Xanadu, another piece of brilliant information technology archaeology, a predecessor of the web... well, in their words, the web "trivialises" their project, with "ever-breaking links" – I recommend you look it up).

I'm convinced that the fact we stick all our digital information in "files and folders"—metaphors for actual physical locations—restricts our imagination about what we can actually do with our data.

I'm also convinced that there's a plethora of once-helpful, decades-old digital metaphors, which are now ripe to be picked, composted and replaced with something new.

What these metaphors are is left as an exercise for the reader.

Some noteworthy links from my research: