
"What happened?: Perspectives going forward" by Gabriel Egan

"Social, Digital, Scholarly Editing" is, when you think about it, an odd title. Why are there no conjunctions of any kind in our title? Without them, what semantic work is being done by those commas? Are we to undertand them as Boolean operators, and if so which ones? [SLIDE] Taking them as logical ORs would open up a broad umbrella under which we might all comfortably shade ourselves, for everyone here is fulfilling the conditions of at least one of these adjectives. [SLIDE] But if those commas are Boolean AND operators then the conference title becomes something of a challenge to its delegates, taunting us with an implicit reproof of "bet you can't satisfy all three terms". Social and Digital is easy, Digital and Scholarly is easy, Social and Scholarly is harder, and Social and Digital and Scholarly is especially tough.

The crux of all this is how we understand the word social, and I've heard it being used in four distinct ways in papers at this conference [SLIDE]:

* public collaborative input during creation of an edition

* scholarly collaborative input during creation of an edition

* scholarly debate/reuse during consumption of an edition

* public debate/reuse during consumption of an edition

[SLIDE] The first kind of social activity we've called crowdsourcing: getting people you don't know and might never meet to do work on your project simply by inviting them and making it attractive to do so. Examples of this first kind of social engagement include Melissa Terras's Transcribing Bentham project and Paul Flemons's labelling of insect specimens. Within this first kind of sociability, there are distinctions to be made between having the crowd key in the texts of your primary documents, having them decide on the acceptability of particular readings, and having them annotate transcriptions that are already established. On Thursday Barbara Bordalejo objected to what she saw as the deceptive self-description in Ray Siemens's Devonshire Manuscript project on just that point: the project claims to use crowdsourcing to de-centre the authority of the individual scholarly editor, but in fact the transcription was already established when the project went social. And so, according to Barbara, the project is significantly less radical than it seems because the crowd has significantly less power.

Paul Flemons's crowdsourced project to label insect specimens is organized along the lines of a game based on 'expeditions' with 'leaders'. But he's found that gamification isn't really helpful: what contributors want is to be recognized and to know that they're making a difference. This led us to a debate on the ethics of crowdsourcing: is it right to have things done for free that someone would otherwise be paid to do? This debate came up again this afternoon in Laura Mandell's talk on the eMOP project's crowdsourcing of the correction of dirty OCR. As Laura said, and as Paul Flemons said earlier, stop worrying: this is not compulsory labour. Over the past three days we've heard a lot about how particular amateurs have made extraordinary inputs to particular projects, such as Josh Sosin's triple-bypass octogenarian. This phenomenon should take us back to the etymology of the word amateur, which comes from love: they do it for the love of it. Of course, certain groups like graduate students are in a vulnerable position and we ought to be particularly thoughtful about their motivations. Moreover, Brent Nelson told us that it's important not to let the crowd become dominated by too many graduate students overseen by too few regular faculty, not least because the professional status of the project may suffer if this balance is not maintained.

Just what is it that the amateur transcribers and correctors do differently from the professionals? Ben Brumfield sees the difference lying in the self-reflexivity of the professional: we think about our methods, conceiving of them in theoretical terms. If we want amateurs to get self-reflective we have to start with providing them with some guidelines and these ought to appear where the amateurs go looking, in Wikipedia. But do we expect or even want thousands of members of the public to learn these skills to help us out? Melissa Terras argued that crowdsourcing is most usefully a kind of open audition, sifting the thousands of volunteers to find the dozen or two who are willing to do the bulk of the work.

Responding to a high-profile article on our topic, Paul Eggert claimed that Siemens et al. 2012 fundamentally mistakes what a scholarly edition is, which is a transaction between the scholar and the reader. The edition is an argument made by a scholar about a set of originating documents and what happened to the words in them and it operates rhetorically in the present rather than recovering something from the past. Via an engagement with recent work on Hans Walter Gabler's theorizing of the editorial act, Paul argued that whole editions can't readily be crowdsourced because committees simply cannot muster the rhetoric needed for the edition's truth claims to be persuasive. Only once the edition is made can it start to be social in its ongoing life as an engagement with readers, which is the engagement I've put further down my list as ways 3 and 4 of being social. Paul didn't quite say so, but his argument against committees might be thought to also bear on the social interaction of scholars working collaboratively on an edition, which is the second way of being social in my list [SLIDE].

We heard from several delegates about this kind of sociability, including Jason Boyd this morning on the intensely collaborative Records of Early English Drama project that sends scholars into the parish and county records offices of places like Worcester and Shropshire to collect the highly dispersed local knowledge of drama outside of London. This scholarly collaborative work long predates the Internet and a key question here is whether we should build tools to help scholars be digitally collaborative without learning the gory details of TEI XML or should we make them knuckle down and do it. The debate has both a practical element--if the kind of people we want making editions won't do the learning then are we content to have editions made only by those who will--and a theoretical one, since for some delegates the very act of encoding one's documents in TEI is itself an essential first step in understanding those documents and beginning to articulate what you think they say. On Wednesday Elena Pierazzo described to us how the Early English Law project created tools that enabled the construction of a critical apparatus by scholars who didn't learn TEI and who came from a variety of different specialisms within law. These scholars had to agree at the start on a single set of buttons to be made available in the transcription editor, reflecting a single set of agreed features that might occur in a document being transcribed. Elena confided to us that she doesn't think this is the right way to work: the scholars need to understand what's going on underneath the tool.

Fotis Jannidis addressed this question when showing us his TextGrid system, making the valuable point that the time and money that is wasted in accommodating those who just won't knuckle down and learn XML sometimes far exceeds the time and money that would be needed for them to actually do the learning -- that is, we have here an irrational not a practical objection to learning XML. We heard again from Laura Mandell this afternoon that even having given away to people the TEI texts of their own transcriptions and exhorting them to make digital editions from them, they won't do it: they demand non-TEI editors. Peter Robinson on Wednesday demonstrated his Textual Communities project that can hide the XML from the collaborating editor, but also has the startling alleged feature of finally solving the perennial problem of overlapping hierarchies in the documents we want to encode. We want to record at once their physical structure (generally as pairs of pages forming leaves) and also their literary structure as stanzas or scenes or paragraphs, and Peter's solution uses the XML Path Language to enable this. Roger Osborne presented the AustEse Workbench, which is a very similar collaborative editing/annotation tool.
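For readers who have not met the overlapping-hierarchies problem, the following minimal Python sketch may help. It is not a description of how Textual Communities works; it only illustrates the problem using the common TEI convention of nesting the literary structure (`lg` for a stanza, `l` for a line) and marking the physical structure with empty `pb` page-break milestones. The sample text and the function names are invented for illustration, and the standard-library `xml.etree.ElementTree` module supports only a limited subset of XPath.

```python
import xml.etree.ElementTree as ET

# Invented sample: two stanzas, with a page break falling inside the first.
SAMPLE = """<text>
  <lg n="1">
    <pb n="1r"/>
    <l>First line of stanza one</l>
    <l>Second line of stanza one</l>
    <pb n="1v"/>
    <l>Third line of stanza one</l>
  </lg>
  <lg n="2">
    <l>First line of stanza two</l>
  </lg>
</text>"""

def lines_by_stanza(xml_source):
    # The literary hierarchy is the XML tree itself: each <lg> contains its <l> lines.
    root = ET.fromstring(xml_source)
    return {lg.get("n"): [l.text for l in lg.findall("l")]
            for lg in root.findall("lg")}

def lines_by_page(xml_source):
    # The physical hierarchy is recovered by walking the document in order
    # and assigning each line to the most recent <pb> milestone.
    root = ET.fromstring(xml_source)
    pages = {}
    current = None
    for el in root.iter():
        if el.tag == "pb":
            current = el.get("n")
            pages.setdefault(current, [])
        elif el.tag == "l" and current is not None:
            pages[current].append(el.text)
    return pages

print(lines_by_stanza(SAMPLE))
print(lines_by_page(SAMPLE))
```

In this sample, page 1v holds the end of stanza one and the whole of stanza two, so neither structure nests cleanly inside the other: a single XML tree can represent one of them directly and the other only by a workaround such as milestones.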

Should we worry that Peter and Roger are duplicating one another's effort? That's a question so important that I'd like to reflect on it for a moment. According to Joshua Sosin and Peter Robinson the duplication of effort in our projects is not a problem. After all, by natural selection the process of evolution developed the eye 25 times in unrelated species, so duplication is just part of the process. To that I'd like to respond that when I heard of the recent discovery that ants have evolved a messaging system that works much like the TCP/IP protocol on which the Internet is built--what journalists enjoyed calling the Anternet--I reflected not on the insignificance of our achievement as human beings--because Mother Nature got there first--but on the magnificence of it. It took evolution millions of years to come up with this messaging protocol by a wasteful process of blindly trying everything and throwing away what didn't work, whereas once we put our minds to it we homed in on the solution in a few years. If we're going to justify the duplication of effort by different teams, let's not use the analogy of evolution.

Earlier today we heard from Alex Gill a new way to engage with crowds: not as a telepresence but as a real-world presence. On the model of the hacker and maker communities, Alex says we should simply invite the crowd over for a beer and a hamburger. This will work very well for those whose crowds live locally, but surely the point of crowdsourcing is that the Internet brings together people who can't meet in person. I won't pursue this objection very far, since of course the very fact this particular crowd has come from four continents to discuss this undermines my case.

[SLIDE] My third way of being social won't detain us long because we've been doing it for centuries. One big difference that being digital makes here is in the potential for reuse, which in the print world means mainly quotation or excerpting or anthologizing the material but in the digital world can mean thoroughly transforming it. For some digital reuses that are already happening, such as computational linguistics, the kinds of contextualizing information that scholarly editors provide and think of as their added-value are, according to Geoff Rockwell, really subtracted-value: from a 'big data' point of view the text is just a 'bag of words'. If you want to add something to it, the most useful thing would be part-of-speech tagging on each word. This morning Joris van Zundert outlined for us the hermeneutic potential that may be opened up if we get right the combination of both approaches, making editions that contain the usual scholarly contextualizing material and yet allowing the extraction of what distant readers and linguistic statisticians need to work on. This Peter Robinson called filling the gap between close and distant reading, but this morning Tuomas Heikkila made me realize that the gap disappears of its own accord if we close the circle by using computers to generate documentary stemmata. That kind of data-mining closes the loop by returning to editors results that assist in the foundational stage of recension that helps an editor decide which early witness to use as a basis for the modern one.
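To show what treating a text as a 'bag of words' amounts to, here is a minimal sketch in Python using only the standard library. The tokenizing rule (lower-case runs of letters and apostrophes) and the function name are assumptions made for illustration; real computational-linguistics pipelines tokenize far more carefully.

```python
import re
from collections import Counter

def bag_of_words(text):
    # Discard order, punctuation, layout and all editorial apparatus:
    # all that remains is each word form and how often it occurs.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)

bag = bag_of_words("To be, or not to be, that is the question")
print(bag.most_common(3))
```

Everything an editor adds around the words (apparatus, notes, lineation) is invisible to this representation, which is the sense in which it can look like subtracted value from the 'big data' side.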

Meg Meiman and Ken Price described how the Walt Whitman Archive sits on the borderline between my third and fourth categories. By doing what Peter Robinson has long proposed--giving everything away under the most generous Creative Commons licence you can--this archive aims to further D. F. McKenzie's idea of a social text in which all the various inputs of labour that make a completed work are documented and given their due weight. I say that this project lies on the borderline because Meg and Ken were quite clear that the project is blurring the borders between social groups--of texts' creators and of texts' readers or users--that we normally rely on. Allison Muri and Catherine Nyegren also referenced McKenzie's expansive notion of what we do and what our primary objects are by showing how in the digital medium the spatial topography of London can be incorporated in editions of works like Alexander Pope's The Dunciad that rely so heavily on it.

I'll end this section with Peter Shillingsburg's persuasive, curmudgeonly argument that we must simply keep the crowd out of the transcription of primary materials, out of all the decisions that constitute editing. Peter echoed Paul Eggert in insisting on the argumentative force of a scholarly edition. His concern is accuracy, which is the scholarly editor's concern and can't be farmed out: if accuracy is everybody's job, it's nobody's job. Peter did offer a space for the scholarly and perhaps the non-scholarly crowd in offering responses to the finished edition and in offering feedback that might persuade the editor to make changes in a subsequent iteration of the edition. Hence I put his notion of the social here on the post-creation side.

[SLIDE] The fourth way of being social is a category somewhat larger than I expected before I heard Wendy J. Phillips-Rodriguez talking about a group of readers I'd never thought about before: religiously devout people for whom a particular document is a sacred object that they want actively to engage with as part of the system of beliefs and associated rituals. Accommodating such needs in a digital project means making not an edition but what Wendy called a resource, incorporating in itself the variability of the devotional object when its text is translated multiple times in different ways. This seems to me an extreme example of what Daniel O'Donnell pointed out is an ever-present condition in our field: the primary objects tend to evoke strong emotional feelings in their beholders. From this follows a curious hybridity in our digital editions, since often they have to serve both the kind of instructional roles that museums specialize in (as in "behold this object") and the critical/interpretative role that is more typical of work in university departments. The beauty Dan referred to does not have to be visual beauty. Gimena del Rio's discussion of metrics in medieval Castilian poetry shared with Wendy's paper a concern for the way that for users concerned with the sound of writing, neither the print nor the digital edition is ever going to be the thing itself.

Yesterday, in the process of busting several of our cherished myths, Edward Vanhouette argued that the critical/interpretative part of the hybridity that Dan O'Donnell referred to must not be separated from the "behold this object" part. In other words, and here again echoing Paul Eggert on the transactional and rhetorical nature of editing, we cannot publish a separate 'clean' reading text since to do so cuts the work off from the argument from which that reading text emerged. In sharp contrast, Murray McGillivray today told us that we DO have to 'do it in the road'--meaning expose our clear reading texts as soon as we establish them--because if we don't there'll be no-one watching us at all. In implicit contradiction of Edward's claim that electronic texts are not more accessible than printed editions, Murray told us of his highly diverse and highly dispersed worldwide readership. In the course of visualizing what kind of digital edition of Thomas More's works he'd like to build, Romuald Lakowski described a series of ambitions much like Murray's, being more interested in a wide audience of varying abilities than in a small audience of specialists.

*

I'm aware that my simple summary of this event has not encompassed some of the most interesting debates we have had. I've said nothing about the extraordinary work on genetic editing and genetic criticism that we have seen. I'm thinking here of Zallig Pollock's delightful announcement that Hans Walter Gabler's Ulysses edition was, effectively, already XML avant la lettre and Susan Brown's proposal that a digital genetic edition might itself appear in stages in order to acknowledge its own genesis. I think D. F. McKenzie would have been delighted by both ideas. Likewise Wout Dillon's demonstration of the genetic edition of Samuel Beckett manuscripts, which has the impressive aim of simultaneously publishing literary critical reflections upon the materials, which as Paul Eggert reminded us is the stage that too seldom follows publication of an edition, due to sheer editorial exhaustion.

This fourth category of the social necessarily engages wider debates about Open Access and what is called in England the Impact Agenda and in the US Outreach. Elena Pierazzo made a passionate appeal for us not to allow this Impact Agenda to deflect us from making scholarly editions that may get read only by other scholars. What's at stake here is the proper relationship between the academic profession and the wider public that pays its wages, and the debate holds a danger to academia. James Ginther expanded this economic concern by thinking about the inevitable decline of the commercial publication of our research by the traditional print means. For some, this is no bad thing, not because digital dissemination is better than print dissemination, but because we have nothing of value to disseminate to our societies. In the wisdom of the crowds some people see an alternative to the wisdom of the scholar. As a shorthand for where I stand in that debate, I'll quote one of the many delightful anecdotes from Terry Eagleton's memoir The Gatekeeper:

. . . an Oxford academic . . . was invited to deliver a lecture at Ruskin, the Oxford trade union college, and . . . began with the typically donnish, self-deprecating ploy of claiming to know very little of the subject in question. [SLIDE] A voice from the back boomed out in a rich Lancashire accent [SLIDE]: "Tha'art paid to knoow!" (The Gatekeeper 89-90)

Whenever I'm tempted to deprecate our abilities and importance as scholars, when anyone tells me that we're being elitist and that the crowd is wiser than us, I try to remember that working-class voice pointing out the simple economic reality that we have been paid to know more than anyone else about our topics.