All posts by jwbaker

James Baker is Director of Digital Humanities at the University of Southampton. James is a Software Sustainability Institute Fellow, a Fellow of the Royal Historical Society, and holds degrees from the University of Southampton and latterly the University of Kent, where in 2010 he completed his doctoral research on the late-Georgian artist-engraver Isaac Cruikshank. James works at the intersection of history, cultural heritage, and digital technologies. He is currently working on a history of knowledge organisation in twentieth-century Britain. In 2021, he begins a major new Arts and Humanities Research Council funded project 'Beyond Notability: Re-evaluating Women’s Work in Archaeology, History and Heritage, 1870 – 1950'. Previous externally funded research projects have focused on legacy descriptions of art objects ('Legacies of Catalogue Descriptions and Curatorial Voice: Opportunities for Digital Scholarship', Arts and Humanities Research Council), the preservation of intangible cultural heritage ('Coptic Culture Conservation Collective', British Council, and 'Heritage Repertoires for inclusive and sustainable development', British Academy), the born digital archival record ('Digital Forensics in the Historical Humanities', European Commission), and decolonial futures for museum collections ('Making African Connections: Decolonial Futures for Colonial Collections', Arts and Humanities Research Council). Prior to joining Southampton, James held positions of Senior Lecturer in Digital History and Archives at the University of Sussex and Director of the Sussex Humanities Lab, Digital Curator at the British Library, and Postdoctoral Fellow with the Paul Mellon Centre for Studies in British Art.
He is a member of the Arts and Humanities Research Council Peer Review College, a convenor of the Institute of Historical Research Digital History seminar, a member of The Programming Historian Editorial Board and a Director of ProgHist Ltd (Company Number 12192946), and an International Advisory Board Member of British Art Studies.

Metadata about a hard disk: is this a research object?

So, contemporary historians, here is the scenario. You are interested in some aspect of life since the 1980s. You have all the usual sources: personal papers, newspapers, official/corporate archives, pictures, books, radio, music, television shows, et cetera. If you are looking at life after 1996, after the boom in the public web, you can also add web archives into the mix. But one of those media types – personal papers – is in decline. Not that people aren’t writing things of importance, but the use of paper as a medium for drafting those things is in decline. In many cases people aren’t using pen and ink or a typewriter, but are sitting at a personal computer – as I am right now – and drafting those things in a word processor. Alongside those digital personal papers are all sorts of things people save to their personal computers: pictures, books, radio, music, television; you get the picture.

So, to research life since the 1980s, collections of things held on personal computers (that is, from PCs to laptops to tablets to smartphones), let us call them personal digital archives, are in scope. And yet – as every contemporary historian knows – privacy is an issue here. Sure, you can study things that appear in public, but personal things, private things, are often off limits as a result of data protection and the like. And you can guarantee those hard drives are going to have some juicy personal, private details. So, with a heavy heart, you write history without them. Great history. But perhaps not the history you would have liked to write.

What if there was something you could use? What if you could understand the hours someone spent using a personal computer, how they arranged their files, their patterns for editing documents, their choice of software, their downloading habits? And what if you could do all of that without needing to see the personal documents themselves? (Of course, filenames are important, often very personal, and possibly subject to data protection, but they certainly aren’t as personal as the documents themselves.)

In short, what would you be able to glean from the sort of metadata available to download in .csv format here? (For the sake of clarity, this is metadata for my USB stick, captured using BitCurator for demonstration purposes.) Imagine that this data schema was used to represent metadata for a hard drive packed with personal papers, newspapers, official/corporate archives, pictures, books, radio, music, and television shows; just like your hard disk no doubt. What would you be able to do with it? Could you imagine it as a research object in its own right?
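To make those questions a little more concrete, here is a minimal Python sketch of the sort of thing a historian might try with file-level metadata. It counts files by the hour of day they were last modified (a rough proxy for when someone was at their computer) and by file extension (a rough proxy for their choice of software and media). The column names `filename` and `mtime`, and the four sample rows, are invented for illustration; they are assumptions of mine and not the schema or contents of the actual .csv export, so a real file would need the names adjusted.

```python
import csv
import io
from collections import Counter
from datetime import datetime
from pathlib import PurePosixPath

# Invented sample rows. The column names "filename" and "mtime" are
# assumptions for illustration, not the schema of the real export.
SAMPLE_CSV = """filename,mtime
docs/chapter1.docx,2013-03-04T22:15:00
docs/chapter2.docx,2013-03-05T23:40:00
music/track01.mp3,2012-11-20T09:05:00
pictures/holiday.jpg,2012-08-14T22:50:00
"""


def load_rows(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def summarise(rows):
    """Count files by hour of last modification and by file extension."""
    by_hour = Counter()
    by_extension = Counter()
    for row in rows:
        modified = datetime.fromisoformat(row["mtime"])
        by_hour[modified.hour] += 1
        # Only the extension is used here, never the contents of the file.
        suffix = PurePosixPath(row["filename"]).suffix.lower() or "(none)"
        by_extension[suffix] += 1
    return by_hour, by_extension


by_hour, by_extension = summarise(load_rows(SAMPLE_CSV))
print("Files by hour last modified:", sorted(by_hour.items()))
print("Files by extension:", by_extension.most_common())
```

Nothing in this sketch opens a document; it works only from names and timestamps, which is the point of the thought experiment. To run it against a real export you would read the file from disk instead of `SAMPLE_CSV` and swap in whatever column names that export actually uses.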