This week I finished a small pilot project ‘archiving’ some of the data storage devices held at University of Sussex Special Collections. My interest in this area is predicated on the premise that the paper archive has been replaced by the hard disk. In just 25 years most people in Britain and worldwide have come to create information in a new way. In turn, both archivists and historians will need to figure out how to deal with data storage devices in archival contexts (for more on this see my recent talk ‘Hard disks as archives of everyday life’). Thankfully, open source digital forensics tools that enable archivists to preserve and curate these born-digital archives have made huge strides in recent years, largely through the efforts of the BitCurator project led by the University of North Carolina at Chapel Hill.
Working with my wonderful colleagues in the Library and Special Collections, I used BitCurator in this pilot to capture a small number of data storage devices with a view to:
- Creating evidence around processes and outputs that can feed into future data management plans
- Preserving and stabilising these materials in line with best practice
- Establishing a set of shared principles in anticipation of future born digital deposits
To achieve this I spent a couple of days in The Keep (where the Special Collections are held) making captures of media devices including floppy disks, USB sticks, and CDs, most of which held institutional records. Working through these helped me playtest and iterate a step-by-step Processing Workflow for Digital Media using BitCurator that I’d developed in advance. This workflow is available on GitHub under a CC BY-SA license, so do share, use, and build on it. And if you are an archivist using BitCurator to capture comparable archives I’d be grateful for your input!
Having worked with born-digital collections whilst at the British Library, it was great to get stuck into them again. Whilst I was at The Keep I also took the opportunity to survey responses to two Mass Observation Directives (Summer 2004 Letters and Emails, Summer 2015 You Online) for descriptions of how people remembered and experienced interactions with computers. As I plot research that will use both born-digital and physical archives as primary sources of historical change in the age of the personal computer, it is encouraging to see great work in the area bubbling up. The first workshop of the AHRC-funded ‘Born digital big data and approaches for history and the humanities’ research network took place this month, showcasing the growing maturity of humanities approaches to born-digital data. And last month Matthew G. Kirschenbaum’s (very excellent) Track Changes appeared, a book that traces literary encounters with word-processing. As one of the BitCurator project team, Kirschenbaum offers some astute observations on using born-digital archives in humanities research. For example, during a discussion of the reliability – or otherwise – of created and modified date/time file metadata (in short, how do we know the clock of a computer was set correctly and, in the case of a frequent flier, to which time zone?), he writes:
A scholar working with such materials must be conversant in the antiquarian cants of vanished operating systems, file formats, and emulators, just as we expect an early modernist doing book history to know something of signatures and collation formulas (233)
Unsurprisingly perhaps, I couldn’t agree more.
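
To make the clock problem concrete, here is a minimal Python sketch of my own (it is not taken from Track Changes or from BitCurator). It assumes a timestamp recorded as local time with no time zone attached, which is how FAT file systems, common on floppy disks, store dates. The date and the list of candidate UTC offsets are invented purely for illustration.

```python
from datetime import datetime, timedelta, timezone


def possible_utc_instants(naive_local, utc_offsets_hours):
    """Return, for each candidate UTC offset, the UTC instant that the
    zone-less local timestamp would denote if the clock used that offset."""
    results = {}
    for offset in utc_offsets_hours:
        tz = timezone(timedelta(hours=offset))
        # Attach the candidate zone to the naive time, then convert to UTC.
        results[offset] = naive_local.replace(tzinfo=tz).astimezone(timezone.utc)
    return results


# Hypothetical 'modified' time read from a disk image: local time, zone unknown.
recorded = datetime(2004, 7, 15, 9, 30)

# Hypothetical candidates: UTC+1 (British Summer Time), UTC-4, UTC+9.
instants = possible_utc_instants(recorded, [1, -4, 9])

for offset, instant in sorted(instants.items()):
    print(f"UTC{offset:+d}: {instant.isoformat()}")

# How far apart the candidate readings are.
spread = max(instants.values()) - min(instants.values())
print(f"Spread: {spread}")
```

With these three candidates the same recorded value could refer to moments up to 13 hours apart, and that is before asking whether the clock was set correctly in the first place.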