Content Archiving

Slipped disks: Why preserving our digital history is hard

The Bayeux Tapestry is over 900 years old, yet people can still understand it today. You don't need any special tools or training. You don't even need to speak a particular language, though a smattering of Latin will help you comprehend some of the finer detail.

Evidence of writing dates back much further, with stone slabs and pottery showing written words from over 3,000 years ago. Graphical stories, in the shape of cave paintings, stretch into the depths of human history, going back more than 30,000 years. Ancient, but we can still understand what they depict.

On my desk I have three 3.5-inch floppy disks from 1985. They contain the project write-up for my O-level computer science course, written on a long-since-obsolete word processor made by Hermes. They are completely inaccessible and unreadable today. Even eBay returns zero listings for that old machine. One of the few photos I can find of it online is part-way down this page.

No real loss, of course, but in terms of data permanence that's pathetic. Barely 30 years after the information was created and stored, it has effectively been lost forever.

As human civilisation has advanced, so our information storage materials have become ever more rich, yet correspondingly ever more transient. We can still read Shakespeare's plays and look at da Vinci's beautiful drawings, yet audio tapes rot, CD substrates corrode, movie film decays and digital formats change so fast they become obsolete within a human generation, if not sooner.

This is a recognised problem with old media, but there's no easy solution. Broadcasting companies attempt to keep old footage in climate-controlled environments. Libraries do the same, as do art galleries, but that only slows the deterioration; it doesn't stop it. Audio-visual histories are even trickier to manage. I wrote earlier this year about preserving the sounds of New Zealand's past. An admirable project, but the new storage medium will have to be upgraded regularly if it too isn't to become obsolete.

It's not as though we can easily return to the storage media of the past. A printer company executive once said, “If you value your photos, print them out.” He would say that, of course. But the longest-lasting inkjet inks and paper are rated at 100 years (and nobody's actually tested that in real life, for obvious reasons). Yet the oldest true photograph in existence is almost twice that age. Newer technology is more capable, but less robust.

Digital archivists today might choose JPEG and PDF for storing images and documents respectively, since, although not perfect, they are the most widely viewable and transportable formats. But that's only true today and for perhaps the past 10 years. What about the year 2030 or later? Will we still have JPEG viewing software and Acrobat Reader in 2050?

Even if we do, will our antique SATA SSDs hold their data that long, assuming cosmic rays, sunspot activity and other sources of strong electromagnetic radiation haven't wiped them clean? Probably not, which means regular upgrades in the meantime: regular transitions from an old digital format to a new one. That means expense and time, especially given the sheer volume involved: IBM recently claimed that 90% of the world's digital data was generated in the past two years.

All of this means we will only store what we want to store. All around the world there are projects underway to store data for the long term. Just like the New Zealand one, these involve actively choosing which items will be retained. Think of the Internet Archive and similar initiatives; many countries have their own such storage projects.

These all take an active approach to archiving. It's a subtle but important point. In contrast to most of history so far, by deliberately storing information for retrieval we're deciding what future historians can and can't recover from our era, and that determines what they can learn about us. Anything we don't explicitly store in a long-term format will be destroyed or rendered obsolete and unrecoverable, and so unknown.

That wasn't the case before the digital age. Some storage projects, such as the Egyptian pyramids, were designed to last forever, but most information wasn't. Yet a 200-year-old book left in an old cupboard doesn't need any special technology for you to read it today, and nor do old drawings and paintings.

But if you want your photos, documents, audio-visual files and other records to outlast you – or even last as long as you – it will take an active, conscious effort to make that happen. As for more transient information such as social media posts, emails and instant messages, nobody's going to be digging those up in a time-capsule 50 years from now.

So, much of our modern life is genuinely transient and becoming more so. It won't stand the test of time. Our virtual memories will fade even faster than our biological ones. Does that matter? Will our civilisation become like someone with no long-term memory, focused firmly in the present and short-termist? Has that already happened? We won't know the answers for many years, by which time we may have forgotten the questions.

If you have valuable information in digital form, take some time to think about how you're going to maintain it. Otherwise, 30 years from now, you too may find yourself staring at obsolete technology containing information that you will never be able to retrieve.


Alex Cruickshank

Alex Cruickshank has been writing about technology and business since 1994. He has lived in various far-flung places around the world and is now based in Berlin.  
