
"Sir Thomas More: Editing a Manuscript play within an XML Edition Created for Printed Plays" by Gabriel Egan

    British Library manuscript Harley 7368 receives special scholarly attention because it contains the play Sir Thomas More with at least one scene written by William Shakespeare in his own handwriting. The complexity of the play's authorship is captured by the title page of John Jowett's Arden3 edition: "Original Text by Anthony Munday and Henry Chettle | Censored by Edmund Tilney | Revisions co-ordinated by Hand C | Revised by Henry Chettle, Thomas Dekker, Thomas Heywood and William Shakespeare" (Munday et al. 2011, iii). Our methods of textual scholarship were developed largely for use with printed documents, although in principle the techniques of variant description and discrimination developed by the New Bibliographers are equally applicable to printed books and manuscripts, as W. W. Greg pointed out on the first page of his book The Calculus of Variants (Greg 1927). But in practice, most early modern drama comes down to us in printed form. Because publishers usually created a new edition by using as the printer's copy an exemplar of the most recently preceding printed edition, the lines of descent for printed plays are primarily monogenetic and hence simpler than the complex, fanning-out genealogies created when texts are widely disseminated in manuscript.

    The Calculus of Variants is easily Greg's least-cited book, presumably because even his devotees find it virtually unreadable. The impenetrable formulas at the heart of the book's method arise because Greg's model was the three-volume Principia Mathematica of Bertrand Russell and Alfred North Whitehead (Greg 1927, v; Russell & Whitehead 1910; Russell & Whitehead 1912; Russell & Whitehead 1913). Russell and Whitehead attempted to derive all the operations and principles of modern mathematics by logically combining the most primitive axioms of set theory to build complex proofs from simple ones. The process was tedious, and famously it was not until page 379 of the first volume that Russell and Whitehead had developed their system sufficiently to declare: "it will follow . . . that 1 + 1 = 2". Despite the difficulties arising from modelling his work on Russell and Whitehead's, one aspect of Greg's application of their formal logic repays our attention because it shows that formulas and tree structures are equivalent in describing hierarchical textual relationships, and this insight underpins how we now represent trees as formulas in systems such as the Extensible Markup Language (XML) used in the Text Encoding Initiative (TEI) markup of literary works.

    After the main text of The Calculus of Variants, Greg offered a series of pictures to illustrate that a formula of the kind {A}{B}{C[D(EF)]}, his second example, is an alternative way of expressing a tree structure. Greg was referring to six manuscripts (identified as A through F), and the bracketing shows the family-tree relationships between them arising from certain manuscripts being created by copying others. To help make sense of Greg's picture that is reproduced in Figure One, Figure Two shows the same thing with identifiers along the bottom for the six terminal manuscripts (that is, ones from which no further extant manuscripts were made) and the three inferential manuscripts (from which manuscripts C, D, E, and F derive) that were not labelled by Greg but are here identified as x, y, and z. Figure Two also shows the ancestral manuscript from which they all derive, in Greg's nomenclature (x)A' (a mnemonic for Exclusive Ancestor), identified here as 'root'.

Figure One. The second example of a formula for a family tree of manuscript relations from Greg's The Calculus of Variants (Greg 1927, 60).

Figure Two. Greg's picture from Figure One with added explanatory labels provided by the present author.

    Greg's tree shown in Figures One and Two would be encoded in XML thus:

<root>
    <A/>
    <B/>
    <x>
        <C/>
        <y>
            <D/>
            <z>
                <E/>
                <F/>
            </z>
        </y>
    </x>
</root>

In a tree, each element has exactly one 'parent' element (except the root, which has none) and zero or more 'child' elements descending from it. In XML encoding, these relationships are represented by each element's opening and closing tags (the symbols <x> and </x> for element x) being wholly enclosed within its parent's opening and closing tags. The convention of placing a forward slash after the name of an element is a shorthand indicating that the element is 'empty' in the sense of having no further descendants and no 'content', so that <E/> is equivalent to <E></E>.

    The above example puts the elements on separate lines with indentation to help the human eye perceive the structure, but such matters of layout carry no meaning in XML so we can rewrite it on a single line as:

<root><A/><B/><x><C/><y><D/><z><E/><F/></z></y></x></root>

In this one-line representation, we see more clearly the principle of enclosure: the whole of element z (that is, <z>...</z>) is 'within' the element y (that is, <y>...</y>). Greg's formula uses curly braces {...} where in XML we use the start and end tags for our element <x> (corresponding to the tree node x), square braces [...] where in XML we use the start and end tags for our element <y> (corresponding to the tree node y), and round braces (...) where in XML we use the start and end tags for our element <z> (corresponding to the tree node z). Thus Greg's formula notation for a tree is equivalent to XML's notation for the same tree:

Greg:        A   B   {  C   [  D   (  E   F    )   ]   }
 XML: <root><A/><B/><x><C/><y><D/><z><E/><F/></z></y></x></root>

Greg's system is, I think, the earliest application to textual matters of Arthur Cayley's insight that formulas and trees are interchangeable representations of the same structures (Cayley 1857). If readers know of an earlier case, I would be grateful to hear of it.

    The use of genealogical trees to represent how textual documents are related one to another by processes of copying in transmission was pioneered by the German philologist Karl Lachmann (1793-1851). But it was not until the 1960s that a research group at the computer company International Business Machines (IBM) realized that, for digital representation, it is useful to treat individual texts as themselves internally structured as trees. The internal hierarchical features common to a set of documents such as a collection of novels could be abstracted and expressed separately as a tree structure. The abstract tree for all novels begins with a root, called 'novel', comprising one or more 'chapters', each of which comprises one or more 'paragraphs', each of which comprises one or more 'sentences', and so on down to any desired level of granularity, ending with the single purposeful mark known as a 'glyph', such as a letter or piece of punctuation. For early modern plays, the abstract structure would be a root 'play' comprising five 'acts' (or none), each comprising one or more 'scenes', each comprising one or more 'speeches' (interspersed with zero or more 'stage directions'), each comprising one 'speech prefix' followed by one or more verse 'lines' or prose 'paragraphs' (interspersed with zero or more 'stage directions').
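
    That abstract structure for a play can be rendered as a concrete tree. The following encoding is a sketch for illustration only, using invented element names rather than the TEI's actual vocabulary, with ellipses standing in for the words of the text:

<play>
    <act n="1">
        <scene n="1">
            <stagedirection>. . .</stagedirection>
            <speech>
                <speechprefix>. . .</speechprefix>
                <line>. . .</line>
                <line>. . .</line>
            </speech>
            <!-- further speeches and stage directions -->
        </scene>
        <!-- further scenes -->
    </act>
    <!-- acts 2 to 5 -->
</play>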

    The ability to distinguish content from structure and express the latter as a tree in machine-readable form was first manifested in IBM's Generalized Markup Language, which became formalized as Standard Generalized Markup Language (SGML) in the late 1980s (Goldfarb 1990). SGML provides a common standard for expressing the names of the elements that make up a text of one kind (such as all novels being made of elements called 'chapter', 'paragraph', 'sentence' and so on) and for quantifying the relationships between them, as in a play having exactly five or zero 'acts' divided into 'scenes', with zero or one 'stage directions' appearing between and within 'speeches'. Although he included no pictures of tree structures, in his guide to his invention Charles F. Goldfarb was explicit that SGML assumed that all texts can be represented as trees (Goldfarb 1990, 18-19, 127, 133).
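
    Such quantifications are declared in a document grammar. The following is a simplified sketch in DTD notation, using the invented element names from the example above rather than any real project's declarations:

<!ELEMENT play           ((act, act, act, act, act) | scene+)>
<!ELEMENT act            (scene+)>
<!ELEMENT scene          ((speech | stagedirection)+)>
<!ELEMENT speech         (speechprefix, (line | paragraph | stagedirection)+)>
<!ELEMENT speechprefix   (#PCDATA)>
<!ELEMENT line           (#PCDATA)>
<!ELEMENT paragraph      (#PCDATA)>
<!ELEMENT stagedirection (#PCDATA)>

Here the first declaration says that a play contains either exactly five acts or, having no acts, a sequence of one or more scenes.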

    Before Goldfarb's SGML, Noam Chomsky had influentially used trees to account for the structure of individual sentences, but nothing larger (Chomsky 1956; Chomsky 1957; Chomsky 1959). (The entire discipline of structuralist poetics is the application of linguistic thinking about sentence structure to larger units of writing, and it failed principally because it was built on the weak foundation of Saussurean structuralist linguistics instead of Chomsky's more correct transformational-generative grammar.) Chomsky's work on natural languages was enthusiastically adopted by scientists developing high-level computer programming languages because it provided a precise taxonomy of the expressive powers of a series of increasingly complex grammars governing the production of valid sentences in any real or artificial language (Aho, Sethi & Ullman 1988, 81-82). Goldfarb made no mention of Chomsky, but the Document Type Definition (DTD) by which his SGML codifies the abstract tree structure of a document is itself a Chomskyan context-free grammar. A DTD is descriptive in that it specifies the structures found in any existing text that obeys its rules, while a Chomskyan context-free grammar is generative in that it specifies the rules by which any valid text in that language can be produced (Stührenberg & Wurm 2010). This difference is merely one of perspective.
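
    The equivalence can be seen by setting a DTD declaration beside the corresponding context-free production, here using the invented element names from the sketch above rather than Goldfarb's or Chomsky's own examples:

    DTD:     <!ELEMENT scene ((speech | stagedirection)+)>
Grammar:     scene → (speech | stagedirection)+

Read descriptively, the first line says what a valid scene must contain; read generatively, the second says how a valid scene may be produced; the rule itself is the same.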

    SGML provided the model for Tim Berners-Lee's choice of textual units and their inter-relationships in the HyperText Markup Language (HTML) he developed for his World Wide Web (Berners-Lee 1999). In the 1990s, SGML was used to encode many large-scale scholarly text projects including the digital versions of the Oxford English Dictionary and the CD-ROM databases published by Chadwyck-Healey that later became Literature Online. Popularizing the new approach outside the computational fields, Steven DeRose and his co-authors argued that SGML rightly responds to the general question of "what is text?" with the answer that it is "an 'ordered hierarchy of content objects,' or 'OHCO'" (DeRose et al. 1990, 4). That texts are OHCOs, are trees, is now almost universally accepted, although substantive objections are still occasionally raised (Lancashire 2010). SGML itself was superseded by the more powerful and more consistent -- but functionally equivalent -- Extensible Markup Language (XML), which is now used in almost all large-scale digital text projects.

    The idealized tree-structure model of SGML/XML can capture not only a text's tree-like semantic structure but also its tree-like physical structure, as a 'volume' comprising 'gatherings' comprising 'leaves' comprising 'pages' (or 'sides'). However, a single tree structure cannot simultaneously embody a text's semantic structure and its physical structure, since the beginnings and ends of the semantic units inevitably cut across the beginnings and ends of the physical units. An act, a scene, a speech, a prose paragraph or verse line, or a sentence may begin on one page or leaf or gathering and end on the next. Trees can capture only structures in which the smaller units fit wholly inside the larger units. The following encoding depicts the kind of 'overlapping' hierarchies we find when we try to express semantic and physical structure at once:

<x>
  <y>
    <z>
  </y>
    </z>
</x>

This encoding does not express a tree, since the element z begins as a child of the element y but ends as a child of the element x, yet by definition each element in a tree has only one parent.

    In any tree-like structure representing a text, either the semantic or the physical structure must take precedence. Like most scholarly digital text projects, the New Oxford Shakespeare privileges the semantic structure and subordinates the physical. We still record physical structure in the XML encoding, but not as part of the tree. Rather, we put a 'milestone' marker wherever a new page begins, and this implicitly asserts that the preceding page has just ended. Because these 'milestones' do not encompass a span of text, they do not conflict with the spans encoding the semantic structure. With such 'milestone' encoding, the only way to recover the physical structure is to traverse the entire document and make inferences from the relative locations of the 'milestones'; the tree-like structure of the physical elements is not explicitly captured. In the case of Sir Thomas More there is yet a third structure of interest, and it cuts across the semantic and the physical hierarchies: the division of the manuscript's contents into handwriting by Anthony Munday, Henry Chettle, Edmund Tilney, Hand C (whose owner is unknown), Thomas Dekker, Thomas Heywood and William Shakespeare. We will return to this third structure shortly.
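
    For instance, a page break that falls in the middle of a speech can be recorded with an empty milestone element rather than with tags that enclose the page's contents. The following sketch uses TEI-style element names (<sp>, <speaker>, <l>, and the page-beginning milestone <pb/>) with invented content and an invented folio number:

<sp>
    <speaker>MORE</speaker>
    <l>. . . the last verse line written on one page . . .</l>
    <pb n="4b"/>
    <l>. . . the speech continues at the top of the next page . . .</l>
</sp>

Because <pb/> encloses nothing, it cannot conflict with the <sp> element that spans it; the price is that the extent of each page is left implicit.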

    For most purposes, our authority on what the manuscript of Sir Thomas More contains is Greg's Malone Society Reprint of 1911, as slightly corrected by Harold Jenkins in a revised and reprinted edition in 1961, which was itself reprinted in 1990 (Greg 1911). (The British Library does not allow readers to consult the fragile physical manuscript, but matters of scholarly disagreement such as disputed readings can be pursued by consultation of the Library's high-resolution digital facsimile.) Greg presented the manuscript pages not in the order that they are found in the document. Rather, he printed first the pages containing what he called the Original Text (predominantly in Munday's hand) and then the pages containing a series of what he identified as the Additions (in others' hands), which he numbered One through Six. Table One shows the order of the pages in the manuscript and the reordering of the material as presented by Greg, with colour-coded blocks representing the three groups of leaves that he 'moved'. The result is something akin to a modern 'genetic' edition in aiming to record a chronology of the revision as a discrete set of events enacted upon the Original Text. Or, to put it another way, Greg imposed upon the text a new structure representing time, since by definition revision comes after initial writing.

Manuscript order | Greg's Malone Society Reprint order
3a (Scene 1) | 3a (Scene 1)
3b (Scene 1, Scene 2) | 3b (Scene 1, Scene 2)
4a (Scene 2) | 4a (Scene 2)
4b (Scene 2) | 4b (Scene 2)
5a (Scene 3) | 5a (Scene 3)
5b (Scene 3, Scene 4) | 5b (Scene 3, Scene 4)
6a (Addition 1 Scene 13) | 10a (Scene 6)
6b blank | 10b (Scene 6, Scene 7)
7a (Addition 2 Scene 4) | 11a (Scene 7)
7b (Addition 2 Scene 5, Scene 6) | 11b (Scene 7, Scene 8)
8a (Addition 2 Scene 6) | 14a (Scene 8)
8b (Addition 2 Scene 6) | 14b (Scene 9)
9a (Addition 2 Scene 6) | 15a (Scene 9)
9b blank | 15b (Scene 9)
10a (Scene 6) | 17a (Scene 9, Scene 10)
10b (Scene 6, Scene 7) | 17b (Scene 10)
11a (Scene 7) | 18a (Scene 10, Scene 11)
11b (Scene 7, Scene 8) | 18b (Scene 11, Scene 12, Scene 13)
11*a blank | 19a (Scene 13, Scene 13)
11*b (Addition 3 Scene 8) | 19b (Scene 13)
12a (Addition 4 Scene 8) | 20a (Scene 13, Scene 14)
12b (Addition 4 Scene 8) | 20b (Scene 15, Scene 16)
13a (Addition 4 Scene 8) | 21a (Scene 16)
13b (Addition 4 Scene 8) | 21b (Scene 16, Scene 17)
13*a (Addition 5 Scene 9) | 22a (Scene 17)
13*b blank | 22b blank
14a (Scene 8) | 6a (Addition 1 Scene 13)
14b (Scene 9) | 6b blank
15a (Scene 9) | 7a (Addition 2 Scene 4)
15b (Scene 9) | 7b (Addition 2 Scene 5, Scene 6)
16a (Addition 6 Scene 9) | 8a (Addition 2 Scene 6)
16b (Addition 6 Scene 9) | 8b (Addition 2 Scene 6)
17a (Scene 9, Scene 10) | 9a (Addition 2 Scene 6)
17b (Scene 10) | 9b blank
18a (Scene 10, Scene 11) | 11*a blank
18b (Scene 11, Scene 12, Scene 13) | 11*b (Addition 3 Scene 8)
19a (Scene 13, Scene 13) | 12a (Addition 4 Scene 8)
19b (Scene 13) | 12b (Addition 4 Scene 8)
20a (Scene 13, Scene 14) | 13a (Addition 4 Scene 8)
20b (Scene 15, Scene 16) | 13b (Addition 4 Scene 8)
21a (Scene 16) | 13*a (Addition 5 Scene 9)
21b (Scene 16, Scene 17) | 13*b blank
22a (Scene 17) | 16a (Addition 6 Scene 9)
22b blank | 16b (Addition 6 Scene 9)

Table One. The order of the leaves in British Library manuscript Harley 7368 (in the first column) and the order of presentation of the same material in W. W. Greg's Malone Society Reprint of the play (in the second column). The scene numbers are Greg's and strikethrough represents substantial blocks of lines deleted in the manuscript. Smaller deletions are not recorded.

    For an XML-encoded edition of Sir Thomas More, Greg's Malone Society Reprint provides an accurate transcription of the base text that can be conveniently screen-scraped in digital form from the ProQuest online database One Literature (formerly called Literature Online). But as can be seen from Table One, Greg's reordering of the leaves in putting all the Additions at the end takes the reader further from the physical order of the manuscript and also further from the dramatic order. That is, with the exception of the obviously misplaced folio 6, the order of the scenes in the manuscript is dramatically correct as a representation of the later state of the play, after the Additions were written and sutured onto the earlier state. Greg reported a "gap" in the manuscript between folios 5 and 10 and another between folios 11 and 14 (Greg 1911, v), but he meant only that after removing the leaves containing the Additions -- folios 6, 7, 8, 9, 11*, 12, 13, 13*, and 16 -- what remains does not read as a complete play. Hence, Greg decided, the manuscript must once have contained leaves that are no longer there. The obvious inference is that the process of revision that created the Additions involved the removal of whole leaves containing material superseded by the writing in the Additions. This impression is enhanced by the presence of manuscript pages in which whole speeches marked for deletion convey approximately the same dramatic events (with different phrasing) as speeches in the Additions. This is most clearly seen in folio 5b's deleted version of the start of Scene 4 (the insurrection gathering head) and folio 7a's version of Scene 4 that replaced it.

    An edition seeking to present to modern readers the play as it existed after revision will necessarily put the dramatic material into the order found in the manuscript rather than the order of Greg's edition. This is challenging if one uses Greg's edition to provide the base transcription, since the Additions must be moved from the end of the transcription to their correct locations within it. While editing an XML document, it is good practice to have one's XML-editing software frequently validate the document against the abstracted statement of the tree structure to which it is supposed to adhere, which structure is in our case embodied in a DTD expressing the elements and the relationships between them in the manner codified by the latest version (called P5) of the TEI Guidelines (Sperberg-McQueen et al. 2021). Experienced XML encoders know that if they allow a document to become invalid while making any substantial alterations, so that it no longer conforms to the abstracted tree, it can be difficult to restore the document to validity. It is often hard to discern which of the many possible causes of invalidity -- which of the myriad possible violations of the rules -- is truly the problem.

    When XML editing software reports that a document is invalid it usually attempts to help the editor by indicating where it first detected a violation of the rules. But software cannot know the editor's larger intentions and may report a problem in one part of the document when the real fault lies elsewhere. For example, for the faulty encoding

Line 1  <x>
Line 2    <y>
Line 3      <z>
Line 4    </y>
Line 5      </z>
Line 6  </x>

the software might report that Line 4 contains the problem, since it closes the y element while the z element opened in Line 3 remains open. But from the human point of view the error is perhaps in Line 5 containing the </z> tag, which the editor meant to place between Line 3 and Line 4. The elements x, y, and z may encompass dozens or hundreds of dramatic lines each, and may have dozens or hundreds of 'child' elements of their own, so the misdirection given by the software's attempt to point out the location of the error can be substantial. For this reason, experienced encoders try to remain no more than one 'undo' operation away from validity and thereby avoid having to unpick the multiple and complex possible sources of invalidity.

    An editor making a traditional (non-XML) printed edition is unconstrained by this necessity for the in-progress text to remain close to semantic correctness. She is free to make the document messy while putting its materials into a new order. The XML editor of Sir Thomas More, by contrast, must maintain the document's validity while rearranging the leaves in Greg's transcription to restore the order found in the manuscript, which is also the order of scenes in the revised version of the play. If the editor started by treating the play as a collection of scenes numbered 1 through 17 then it is not especially difficult to reorder Greg's transcription to bring the text back to manuscript and dramatic order, so long as she maintains an element-naming system that distinguishes versions by using labels such as "original Scene 4" and "new Scene 4".
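
    One way of doing this, sketched here with TEI-style <div> elements and attribute values invented for illustration (not the New Oxford Shakespeare's actual encoding), is to give each scene both its number and its state, so that the superseded and the replacement versions can coexist in the document and be moved independently without colliding:

<div type="scene" n="4" subtype="original">
    <!-- the deleted opening of Scene 4 as it stands on folio 5b of the Original Text -->
</div>
<div type="scene" n="4" subtype="addition">
    <!-- the replacement version of Scene 4 from folio 7a (Addition 2) -->
</div>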

    It would be even easier to revise Greg's transcription to restore the manuscript order if instead of privileging the dramatic/semantic structure the editor first privileged the physical structure of the manuscript as a collection of leaves and subordinated its structure as a collection of scenes. This is because Greg's reordering proceeded leaf-wise not scene-wise, as can be seen in Table One. Greg's edition is a sequence of leaves rearranged as discrete units to tell a story about the process of revision, even though that process was not organized by leaves. Rather, many leaves have multiple revising hands, especially where the censor Edmund Tilney and Hand C wrote annotations (including deletions) on pages that are otherwise in Munday's hand. Greg appears to have been torn between two competing hierarchies, wanting to retain the primacy of a manuscript's physical structure as a collection of leaves while also representing the alternative chronological structure of the text's states 'before' and 'after' revision.

    Privileging the dramatic/semantic structure of the play and subordinating the physical structure is perhaps not the most useful way to begin editing Sir Thomas More. But within a multi-editor, multi-work project such as the New Oxford Shakespeare such decisions must be made in the light of the needs of the whole project, and for most of the works the copy text is a relatively homogeneous printed edition, for which privileging the dramatic/poetic/semantic structure is sensible. Project-wide consistency in such matters is desirable because the publisher rightly demands submission of all the works in an encoding that conforms to a single shared standard and that validates against a single DTD. The XML-encoded form of a work reads like a computer program and needs to be transformed into something that looks like a printed edition by applying to it a piece of software called an Extensible Stylesheet Language Transformation (XSLT). Writing an XSLT for a complex edition is expensive in human time and the publisher has the reasonable hope of doing this only once for an entire project, and ideally of sharing an XSLT across multiple projects. An anomalous case such as Sir Thomas More must for these reasons be accommodated within a workflow designed for less complex copy texts.
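
    As an illustration of what such a transformation contains (a minimal sketch, not the New Oxford Shakespeare's actual stylesheet), an XSLT consists of templates that map each element of the encoding to a piece of the output, here turning TEI-style speeches into HTML:

<xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:tei="http://www.tei-c.org/ns/1.0">
    <!-- Render each speech as an HTML block: the speaker's name, then its verse lines -->
    <xsl:template match="tei:sp">
        <div class="speech">
            <p class="speaker"><xsl:value-of select="tei:speaker"/></p>
            <xsl:apply-templates select="tei:l"/>
        </div>
    </xsl:template>
    <xsl:template match="tei:l">
        <p class="verse-line"><xsl:apply-templates/></p>
    </xsl:template>
</xsl:stylesheet>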

    Is there perhaps a better way for editors of early modern works to manage the multiple competing hierarchies they encounter? Might we represent at once the multiple hierarchies of our copy texts' units of writing support (the gatherings, leaves, and pages), their agents of inscription or impression (the various hands in manuscripts and compositors in printing), and their semantic units (the acts, scenes, and speeches, or stanzas, quatrains, couplets, and so on)? Attempts have been made to devise alternatives to XML that permit the representation of competing hierarchies within a single document, such as the Text-as-Graph Markup Language (TAGML) developed in the Netherlands (Huygens Institute for the History of the Netherlands (Huygens ING) 2019), but none has achieved the (relatively) mainstream success of TEI-XML.

    An approach called 'stand-off markup' allows multiple competing hierarchies to be embodied as multiple distinct XML documents without having to create multiple copies of the base text. Instead of the XML markup residing in the same document as the base text with its tags enclosing the units of the base text, 'stand-off markup' is stored in a separate file (one for each hierarchy) that contains pointers to the various locations in the shared base text to which each element of the markup is understood to apply. Elsewhere I have shown how this approach facilitates the encoding of competing scholarly opinions about early modern editions of plays, such as differences of opinion about which man in a team of compositors typeset which part of an edition (Egan 2014). Of the current TEI Guidelines' 1971 pages, just five pages are concerned with 'stand-off markup' (Sperberg-McQueen et al. 2021, Section 16.9), and the approach is less popular than conventional markup. Experiments to explore whether 'stand-off markup' might provide assistance with the digital-encoding and text-editing problems explored in this paper are ongoing at the Centre for Textual Studies at De Montfort University.
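
    In schematic terms (a sketch only, with hypothetical identifiers, and not the TEI Guidelines' exact stand-off mechanism), the shared base transcription carries nothing but empty anchors, and each competing hierarchy lives in its own file of elements that point at those anchors:

<!-- In the shared base transcription: empty anchors marking positions in the text -->
<anchor xml:id="a1"/>. . . the transcribed words of the play . . .<anchor xml:id="a2"/>

<!-- In one stand-off file, recording the agents of inscription -->
<span from="#a1" to="#a2" ana="#hand-C"/>

<!-- In another stand-off file, recording the physical structure -->
<span from="#a1" to="#a2" ana="#folio-11b"/>

Because the file recording the hands and the file recording the leaves each form their own well-formed tree over the same anchors, the competing hierarchies can cut across one another without either being subordinated to the other.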

Works Cited

Aho, Alfred V., Ravi Sethi and Jeffrey D. Ullman. 1988. Compilers: Principles, Techniques, and Tools. Computer Science. Reading MA. Addison-Wesley.

Berners-Lee, Tim. 1999. Weaving the Web: The Past, Present and Future of the World Wide Web By its Inventor. London. Orion Business.

Cayley, Arthur. 1857. "On the Theory of Analytical Forms Called Trees." Philosophical Magazine 4th series 13. 172-76.

Chomsky, Noam. 1956. "Three Models for the Description of Language." IRE [Institute of Radio Engineers] Transactions on Information Theory 2. 113-24.

Chomsky, Noam. 1957. Syntactic Structures. Janua Linguarum Series Minor. 4. The Hague. Mouton.

Chomsky, Noam. 1959. "On Certain Formal Properties of Grammars." Information and Control 2. 137-67.

DeRose, Steven J., David G. Durand, Elli Mylonas and Allen H. Renear. 1990. "What is Text, Really?" Journal of Computing in Higher Education 1.2. 3-26.

Egan, Gabriel. 2014. "Using Stand-off XML Markup to Record Scholarly Differences of Opinion About Typesetting." Proceedings of the Digital Humanities Congress Held at the University of Sheffield on 6-8 September 2012. Edited by Clare Mills, Michael Pidd and Esther Ward. Studies in the Digital Humanities. Sheffield. Humanities Research Institute of the University of Sheffield. n. pag. Available online.

Goldfarb, Charles F. 1990. The SGML Handbook. Edited with a foreword by Yuri Rubinsky. Oxford. Clarendon Press.

Greg, W. W. 1927. The Calculus of Variants: An Essay on Textual Criticism. Oxford. Clarendon Press.

Greg, W. W., ed. 1911. The Book of Sir Thomas More. Malone Society Reprints. Oxford. Malone Society.

Huygens Institute for the History of the Netherlands (Huygens ING). 2019. 'Text-as-Graph Markup Language (TAGML)': A GitHub Page. Online at https://github.com/HuygensING/TAG/tree/master/TAGML.

Lancashire, Ian. 2010. "SGML, Interpretation, and the Two Muses." Electronic Publishing: Politics and Pragmatics. Edited by Gabriel Egan. New Technologies in Medieval and Renaissance Studies. Toronto. Medieval and Renaissance Texts and Studies (MRTS) and ITER. 105-19.

Munday, Anthony, Henry Chettle, Edmund Tilney, Hand C, Thomas Dekker, Thomas Heywood and William Shakespeare. 2011. Sir Thomas More. Ed. John Jowett. The Arden Shakespeare. London. Methuen.

Russell, Bertrand and Alfred North Whitehead. 1910. Principia Mathematica. Vol. 1: Introduction; Part 1 Mathematical Logic; Part 2 Prolegomena to Cardinal Arithmetic. 3 vols. Cambridge. Cambridge University Press.

Russell, Bertrand and Alfred North Whitehead. 1912. Principia Mathematica. Vol. 2: Prefatory Statement of Symbolic Conventions; Part 3 Cardinal Arithmetic; Part 4 Relation-Arithmetic; Part 5 Series. 3 vols. Cambridge. Cambridge University Press.

Russell, Bertrand and Alfred North Whitehead. 1913. Principia Mathematica. Vol. 3: Part 5 Series (Continued); Part 6 Quantity. 3 vols. Cambridge. Cambridge University Press.

Sperberg-McQueen, C. M., Lou Burnard and TEI Technical Council. 2021. TEI [Text Encoding Initiative] P5 Guidelines for Electronic Text Encoding and Interchange. No place. The TEI Consortium.

Stührenberg, Maik and Christian Wurm. 2010. 'Refining the Taxonomy of XML Schema Languages: A New Approach for Categorizing XML Schema Languages in Terms of Processing Complexity': A Paper Presented at the Conference 'Balisage: The Markup Conference' in Montreal, Canada, on 3-6 August. Available online in 'Proceedings of Balisage: The Markup Conference 2010', Balisage Series on Markup Technologies Volume 5. DOI https://doi.org/10.4242/BalisageVol5.Stuhrenberg01.