This post was originally published at the Software Sustainability Institute blog.
In Part One of this blog series on the House of Lords Science and Technology Committee inquiry into forensic science, I discussed oral evidence pertaining to digital forensics – a branch of forensic science concerned with the recovery and investigation of material found in digital devices – and its relevance to my home discipline, History. The three most common themes of the oral evidence sessions – Volume, Variety, and Systems – were described as relevant to historians for a number of reasons: for how they are changing archival labour, for the suitedness of historians to conceptualising new forms of primary evidence, and for the new knowledges required to test the authenticity of primary sources produced during the digital age. In this blog I discuss three less common, but no less important, themes of the oral evidence sessions: the temporal fluidity of systems that produce born-digital materials, gaps in the record, and ethical digital forensics.
In the previous blog, I described how the oral evidence sessions highlight that there is a need to know systems in order to trust the files that are produced by them. This is further complicated by technological change. An undercurrent of the oral evidence sessions was the contemporary fluidity of digital systems, applications, and architectures. This fluidity has created a moving target that is thwarting things like investment in digital forensic research and capacity building. What is striking to me, however, is that systems haven’t always been so fluid. During the mid-1990s, home and work computing in the Global North began to consolidate around a relatively stable base of Windows, macOS, and Linux operating systems. But the advent and boom in smartphone technology – precipitated by the launch of the iPhone in 2007 – created instability. According to Angus Marshall (Lecturer in Cybersecurity, York):
We are up against human ingenuity rather than evolution. All we have to do is look at something like the app stores and the number of new apps that are added every day. Each one will have its own functionality and its own particular data format, and there is the potential for each and every one of them to be involved in some criminal act, so that at some point we may have to devise a technique for dealing with them. (27 November 3pm, Session 2, 18)
And so, unlike the period of computing history that preceded it, documents produced between 2007 and the next systems consolidation (history suggests one will happen) are likely to be characterised by a fluid and rapidly evolving ecology of data formats. This plays to the professional strengths of the historian. For not only will the historian using born-digital primary sources require some knowledge of the systems on which they were produced and used, but they will also want to grasp the historical specificity of those systems – a socio-technical judgement we are ideally trained to make.
With fluidity of systems comes loss. Many of us will have encountered an old digital file that we can no longer access or open. In many respects this loss isn’t a problem for the historian. Historians are experts at dealing with gaps in the record, with ‘good enough’ archives of our objects of study. Postcolonial and queer approaches have, in particular, provided explanations of the relationship between power and the historical record that make the idea of a ‘complete archive’ a fiction. And those same approaches have enabled historians to develop toolkits for reconciling our interpretations of the past with the distortions and absences caused by oppression and marginalisation (e.g. Emma Perez, ‘Queering the Borderlands: The Challenges of Excavating the Invisible and Unheard’, Frontiers: A Journal of Women Studies 24, no. 2 (2003): 122–31, doi: 10.1353/fro.2004.0021). In short, we know that the archive as traditionally constituted – a physical site with physical holdings – is a technology of remembrance. Digital forensics is another such technology. And like all such technologies, its capacity to remember is shaped and constrained by social factors. In the case of digital forensics, the inquiry suggests that the principal social factor shaping and constraining the field is an underexamination of the challenges of capturing, stabilising, and preserving born-digital materials. Angus Marshall again:
Digital forensics is seen very much as a niche [..] At the moment, the whole concept of forensic readiness is massively underrepresented in the computing community, which means that new technologies are being produced every day that are inherently difficult to recover anything useful from to use as evidence. (27 November 3pm, Session 2, 6)
The post-1980s historical record – like any historical record – is then incomplete and will continue to be so. However, what is interesting about the case Marshall describes is that the technological impossibility of preserving some aspects of the historical record is produced by a resistance to archiving. Apps whose uses are ‘inherently difficult to recover’ recall things like the scurrilous poetry shared in eighteenth-century Parisian parks: as the historian Robert Darnton writes, we know they were there, we know why participants chose to use that system of communication to protect their privacy, and we can only ever get a sense of what was said through the ripples and waves recorded in media better attuned to remembrance (‘An Early Information Society: News and the Media in Eighteenth-Century Paris’, The American Historical Review 105, no. 1 (2000): 1–35, doi: 10.2307/2652433). And so whilst digital forensics may create an abundance of primary source materials (see Part One of this series), practitioners’ anxiety about gaps in the record may well find solace in the ‘gap readiness’ of the historian.
The drive within digital forensics to recover evidence, to fill gaps, to see privacy as ‘a problem’, brings us to ethics. Digital forensics does not only provide confidence that documents are what we think they are. Digital forensics is also about finding the materiality of the record that software design tries to hide from users. And so digital forensic processes also recover files that people had consciously deleted, auto-save fragments made by software without our knowing, and traces of interactions with files, software, and web services embedded in logs and system histories. Deleted and auto-generated files are central to the investigatory function of digital forensics, for they enable law enforcement to exploit system architectures to reconstruct patterns of behaviour. The oral evidence sessions reveal the levels of sophistication of these forensic processes. For example, as Adrian Foster states:
It is about an understanding of what can be achieved through the different types of analysis, whether it is obtaining a kiosk download, and I use the word “download” in the broadest sense, at a police station — it would probably take about two hours, and we can hand the phone straight back to the victim — or a level 2 or 3 download which looks either at deleted data which is not available on the device or actually breaks the device up to look at the memory chip to find out what can be retrieved from it. Those things take longer and we are currently seeing delays of up to five or six months. We are trying to make sure that everybody has an understanding of what can be achieved at each level, because that obviously affects case progression, hearings and the trial date, along with what can be achieved (30 October 2018, 4)
Archivists are already – and have been for some time – using ‘level 2 or 3’ digital forensics to preserve our shared heritage, particularly in the case of personal papers (Corinne Rogers, ‘From Time Theft to Time Stamps: Mapping the Development of Digital Forensics from Law Enforcement to Archival Authority’, International Journal of Digital Humanities 1, no. 1 (2019), doi: 10.1007/s42803-019-00002-y). They are aware of the ethical implications, and have taken steps to differentiate between ‘dark’ versions of a series – containing deleted and auto-generated fragments – and ‘access’ versions, in which the available views of documents more closely match what the creator could see and access (Laura Carroll et al., ‘A Comprehensive Approach to Born-Digital Archives’, Archivaria 72, no. 72 (2011)). But is it okay to use primary sources that a person thought they had deleted, or didn’t know had been created during the course of using a digital device? What do we do if we come across a sensitive document that an archivist hadn’t noticed? And given that the digital forensic software used by archivists was adapted from software intended for criminal investigation, what if that document is evidence of a crime? Different questions, therefore, are asked if we approach digital forensics from a perspective other than that of the criminal investigation. And these perspectives can only be beneficial to developing consensus around the ethics of digital forensics.
Most historians will consider an inquiry into forensic science outside their expertise. This is perfectly reasonable, for most historians do not work with born-digital primary sources. But as the beginning of the age of mass computing recedes into the past, historians are starting to incorporate the outputs of digital forensic processes into their practice. The oral evidence phase of the House of Lords Science and Technology Committee inquiry into forensic science indicates that historians have much to learn on this topic from lawyers, law enforcement professionals, scientists, members of the judiciary, Home Office officials, regulators, and research funders. But these professionals can also learn from our expertise in analysing primary sources, in reading between disparate materials, in historicising change, in interpreting absence. I intend to write again on these themes when the inquiry reports later this year.