Authors and Indexes (II)

Sir, -- The index omissions that prompted John Sutherland's complaint (Letters, November 10) were names of people, which even the simplest machine can find. Those who wrote to the TLS (Letters, November 24) to insist that an index is semantically more complex than a mere concordance are right and should be delighted at what machines can now do. The current state of the art in natural language processing enables software to replicate the systematic ordering of terms and ideas found in a manually produced index and to greatly outperform it for sophisticated enquiries of the kind 'where does this author mention A or B without also mentioning X or Y?'

To take a concrete example, the Nameless Shakespeare computer project (http://wordhoard.northwestern.edu) allows one to ask the machine to list just the adjectives used by the character Ophelia when speaking verse, and to identify which of them are not also used in, say, Desdemona's or Innogen's verse lines. Those persuaded, by James Lambs's quotation of laughably poor machine translation of idiomatic English, that machines cannot make sense of language should note that the morphosyntactic analysis necessary to tell Ophelia's adjectives from other parts of speech was done by machine, not by hand. Thus the computer sets the mind free to pursue details that were formerly too difficult to discern. In the way they mistake the impact of technology, those who insist that readers will always prefer well-made manual indexes are like the medieval scribes who, confronted by the crude but rapid early printing press, comforted themselves that serious readers would always prefer painstakingly illuminated manuscripts.

Dr Gabriel Egan, Department of English and Drama, Loughborough University