A wonderful few days in Edinburgh at the 2016 Software Sustainability Institute Collaborations Workshop marked the end of my 2015 fellowship with the Software Sustainability Institute. This seems then a good time to reflect on the year, not least because the rationale for an historian being a Software Sustainability Institute fellow might not be immediately obvious.
Before I proceed, a brief 101 on the Software Sustainability Institute. From the front of their website:
Our mission is to cultivate better, more sustainable, research software to enable world-class research (better software, better research). Software is fundamental to research: seven out of ten UK researchers report that their work would be impossible without it.
By software, they don’t mean Word but research software: computer programmes that researchers move their sources/data in and out of to organise them, derive meaning from them, find a new way into them, or generate information about them. I suspect that many people in the Humanities fall into the three out of ten UK researchers who aren’t using research software, but with Digital Humanities and Digital History both things, this is changing.
So what do the Software Sustainability Institute do? Again from their website, they work toward:
Getting software on the research agenda
Supporting communities that want change
Increasing skills
Improving software
Now I’m probably not much use on the last of these (my foobar is weak), but I certainly want to get the possibilities for the use and adaptation of software on the research agenda of my peers, I am part of a community of digital historians who are trying to effect change in their profession (and it is no coincidence that a past and a present fellow are from the same community), and I thoroughly enjoy supporting individuals and groups who want to add software skills to their research toolkit. So not such an odd fellowship after all!
Looking back, I’ve learnt a huge amount from my year as a fellow. The lessons include, in no particular order:
It isn’t all rosy over in the sciences Knowing that the EPSRC funded the Software Sustainability Institute (recently joined by the ESRC and BBSRC, come on AHRC…) and that much of science is hard without data/code (genetics, LHC, space stuff, et cetera), I expected to find fundamental issues around embedding computational skills into professional training and crediting software development to be ‘solved’ in science. Far from it. Solo code/data people in large research teams seem to be a thing. Legacy code that nobody owns or knows how to adapt seems to abound. And even little things like citing software seem to be problematic: otherwise why would a hackday team at the 2016 Collaborations Workshop develop a ‘Should I cite?’ workflow?
Software can build a bridge I’m not interested in ‘Two Cultures’ postulating. And occasionally in SSI-land ‘scientist’ gets used when ‘researcher’ is meant (always, when pointed out, an innocent slip of the tongue). But, what I’ve observed is that having a shared set of knowledges and values – in this case on software (not that I know quite as much as fellow SSIers, see below…) – is a useful icebreaker, gets conversations going, and offers points of intersection that help both ‘sides’ better understand their respective research and solve the problems they share.
I know nothing At the 2015 Collaborations Workshop I was part of a hack day team. We made a thing intended to help non-coders understand that interrupting someone who is coding breaks their flow. I say ‘we’, but I’m not sure I contributed that much. That is, I helped scope the idea, produce the documentation, build the descriptive website, and test the product, all of which are important to a software project, but I didn’t actually build anything. The development platforms that came as second nature to, or could be very easily picked up by, a bunch of scientists (and not necessarily computer scientists) were beyond me.
I know something But. And this is a big but. The learning curve isn’t beyond the historian. My mantra for some time has been that historians spend time reading secondary literature that is useful but not essential to their work. Carve out some of that time to learn research-orientated software skills and they will learn something of use (and if not, they will at least be able to judge digital history; the lack of people who can do so meaningfully is an acute problem if we want – as we should – computational approaches to just become part of the historian’s toolkit…). This is what I’ve done for some time, and the Software Sustainability Institute fellowship gave me both further momentum on this and a milestone to look back on. I realised the other day that between the 2015 and 2016 workshops I had learnt:
to manipulate data and make graphs in R
to use Git and GitHub as part of data management and web development workflows
to use Makefiles to update outputs when the inputs have changed
to use wget to scrape the web (thanks Peter!)
to SSH into a remote device (usually a Raspberry Pi) to run data analysis without clogging up my local machine
to write shell scripts with multiple inputs and outputs
to write good regular expressions so I can find stuff in textual sources and create subsets of files
to capture data storage devices with open source digital forensics tools
Python still eludes me, but I’m proud of my hacky endeavours.
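To give a flavour of the regex-and-subsets skill mentioned above, here is a minimal sketch of the pattern. The transcript files and the 1850s-date search are invented for illustration, not taken from any particular project:

```shell
#!/bin/sh
# Hypothetical sketch: pull every line mentioning an 1850s date
# out of a folder of text files into a single working subset.
# 'transcripts/' and 'subset.txt' are placeholder names.
mkdir -p transcripts
printf 'A riot broke out in 1857.\nNo date on this line.\n' > transcripts/a.txt
printf 'The census of 1851 records two households.\n' > transcripts/b.txt
# 185[0-9] matches 1850 through 1859; -h drops the source filenames
grep -h '185[0-9]' transcripts/*.txt > subset.txt
cat subset.txt
```

The same shape — a regular expression plus a redirect — covers a surprising amount of everyday source-wrangling.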
Teachers are better learners It is a truism that there is no better way to learn something than to teach it, even if learners might not all like the idea that they are learning from someone who is still learning (though flipped around and pitched as ‘this proves this is not magic and you can do it too’ it can be pretty powerful). As part of my fellowship I ran two sets of software skills training events: Programming Historian Live and Library Carpentry. During the same time I ran a text analysis workshop for the CHASE Doctoral Training Partnership, continued to supervise a PhD student starting a digital history PhD from zero digital, and was still working on the British Library’s awesome Digital Scholarship Training Programme (it was a big part of my job there). I not only love teaching, training, and sharing computational skills with these audiences but am also so much better for it. Case in point: the ‘to write shell scripts with multiple inputs and outputs’ thing I learnt came as a result of a question from a colleague about extracting text from PDFs (not OCR, the bits you can cut and paste) to make new text files. So I DuckDuckGoed a bit (because they don’t track you; there are good intellectual reasons not to Google people), found it could be done, found it was super easy, helped them understand it, and embedded it into what I do and what I teach.
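A minimal sketch of that multiple-inputs, multiple-outputs shell pattern, assuming pdftotext from poppler-utils (the post doesn’t name the tool, so treat that as a guess at one common option); the file names are placeholders:

```shell
#!/bin/sh
# Hypothetical sketch: batch-extract the embedded text layer (the
# bits you can cut and paste, not OCR) from every PDF in the
# current directory, writing report.pdf -> report.txt and so on.
# Assumes pdftotext from poppler-utils is installed.
for f in *.pdf; do
    [ -e "$f" ] || continue        # skip cleanly if no PDFs are present
    pdftotext "$f" "${f%.pdf}.txt" # ${f%.pdf} strips the .pdf suffix
done
```

The `${f%.pdf}` parameter expansion is the whole trick: one input name mechanically produces one output name, so the loop scales from one file to hundreds without changes.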
Small and hacky is more useful than big and shiny Obviously people at my university counting the research £££ I bring in would disagree with this statement. But this comment isn’t meant in that way. Rather, it is meant with respect to helping others learn. People starting out on digital history projects can find themselves looking to the big projects for models of where to go, what to learn, how to make progress. But most digital history is small and hacky; it is me figuring out something about inputs and outputs in the shell that accidentally helps me do something else that – ultimately – deepens my understanding of the past phenomena I care about. A clear outcome of the Defining Effective Digital History Mentorship workshop that I ran with Carys Brown and Adam Crymble as part of my fellowship was that folks starting out in digital history need these small and hacky examples to be more visible. We need to do better here.
As a final thought, I’d implore any historian reading this to consider applying for a Software Sustainability Institute Fellowship when applications come around in the autumn. The application process is super light. And while the money isn’t going to set pulses racing (£3k), the perspectives it opens up are invaluable. Perhaps I’ll see you at Collaborations Workshop 2017!