In a review of Hugh Craig and Arthur Kinney’s co-authored book, Shakespeare, Computers, and the Mystery of Authorship, Gary Taylor says:
With Craig and Kinney’s collection computational stylistics has come of age, achieving at last the kind of maturity, clarity, and authority found in the best work of the great “manual” attribution scholars like MacDonald P. Jackson. Jackson is often cited, because the digital databases and statistical packages used by Craig, Kinney, and their team provide new and independent evidence for conclusions Jackson had reached by other means. [….] But the new tools do not simply strengthen some of the old theories; they challenge others, and offer exciting solutions to more than one still-vexed Bermuda triangle.
The database consists of “165 Early Modern English plays, around 3.25 million words of dialogue” (xvii). Craig has been quietly building that carefully defined database, and testing ways to use it, for many years [….] Kinney, by contrast, brings “humanist” to sit at the table with Craig’s “digital.” [….] The co-authored Introduction elegantly situates authorship and attribution within a theoretical, transhistorical, cognitive framework that makes us interested in these plays […]. (Taylor 2011)
In the Preface and Acknowledgements to their book, Craig and Kinney reveal that their collaborative project was first “hatched” in 2001 when Arthur Kinney “spent a period as a research visitor” at the University of Newcastle, Australia, where Hugh Craig works. The project was
to combine Kinney’s knowledge of Shakespeare with Craig’s familiarity of numbers, and thus pioneer a Shakespearean computational stylistics. Kinney would supply the questions and Craig would furnish numerical results. In what they jointly wrote they would keep in mind an audience that had no interest in arcane statistics or in interminable tables of figures. Authorship would be at the core of the enterprise. (Craig and Kinney 2009, p. xv)
The authors then explain that “computational stylistics stems from the work of John Burrows, beginning in the 1980s.” Burrows, write Craig and Kinney, held that
the smallest elements of literary language (down to very common grammatical words such as and and but) had things of stylistic interest buried in them, and he sought to bring latent patterns in their use to light, through multivariate statistical procedures analysing the word-count information from large tracts of machine-readable text. The new method revealed a patterning that pervaded all levels of language and could be measured. (Craig and Kinney 2009, p. xv)
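The word-count information that Burrows worked from can be pictured with a minimal sketch. It is not Burrows’s or Craig and Kinney’s actual procedure: the five-word list, the tokenizer, and the sample sentence are assumptions made only for illustration, and the multivariate statistics that would follow are left out.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny list of common grammatical words.
# Real studies use far longer lists; this one is for illustration only.
FUNCTION_WORDS = ("and", "but", "the", "of", "to")


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def function_word_profile(text, words=FUNCTION_WORDS):
    """Return each listed word's share of all tokens in the text."""
    tokens = tokenize(text)
    total = len(tokens)
    if total == 0:
        return {word: 0.0 for word in words}
    counts = Counter(tokens)
    return {word: counts[word] / total for word in words}


# Invented sample sentence, not taken from any play in the corpus.
sample = "The king and the queen went to the tower, but the prince stayed."
profile = function_word_profile(sample)
```

A profile of this kind, computed for many texts, is the sort of table to which multivariate procedures could then be applied.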
In July 2011 Hugh Craig convened an international conference in honour of his colleague Emeritus Professor John Burrows at the University of Newcastle, Australia. In a blog post, David L. Hoover, who attended the symposium, reports:
John Burrows talked about how to investigate changes in authorship in collaborative Renaissance plays, showing that analyzing sections of plays by using a sliding window of overlapping sections allows us to pinpoint changes of authorship [….] Finally, there was the launch of the latest version of Hugh Craig’s Intelligent Archive, freely available archiving and analysis software. [….] Burrows was largely responsible for the resurgence of interest in computational approaches to style [….] (Hoover 2011)
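The “sliding window of overlapping sections” that Hoover mentions can be sketched in a few lines. This is only an illustration of the windowing idea, not a reconstruction of Burrows’s method: the function names, the window size, the step, and the toy token list are all assumptions.

```python
def sliding_windows(tokens, size, step):
    """Yield (start, window) pairs; windows overlap whenever step < size."""
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    last_start = max(len(tokens) - size, 0)
    for start in range(0, last_start + 1, step):
        yield start, tokens[start:start + size]


def rolling_rate(tokens, word, size, step):
    """Relative frequency of one word in each successive window."""
    return [
        (start, window.count(word) / len(window))
        for start, window in sliding_windows(tokens, size, step)
        if window
    ]


# Toy "play": the first half favours "and", the second half "but".
toy_tokens = ["and"] * 6 + ["but"] * 6
rates = rolling_rate(toy_tokens, "and", size=4, step=2)
```

In this toy case the rate of “and” falls from 1.0 to 0.0 across the windows that straddle the midpoint, which is the kind of shift a rolling analysis is meant to expose. In practice each window would be compared on many words at once rather than one.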
Computational analysis, say Craig and Kinney in the preface to their book, “can offer an objective arbitration” where “experts disagree on the authorship of a text, where there is no external evidence to help,” and can also “work in a more exploratory way, by looking for unexpected patterns and quirks in the dataset itself, rather than testing a hypothesis.” The authors offer “examples of both approaches” in their book. Craig and Kinney then explain that “one of the purposes of the book is to show that,” rather than the “stubborn” contradiction one might expect between language, and other artistic means, and computation, “there is a considerable sympathy between them.” Hugh Craig and Arthur Kinney add: “Beyond the immediate questions of Shakespeare authorship the work in the book is meant as a contribution to the larger question of stylistic individuality” (Craig and Kinney 2009, pp. xv-xvi).
Computational analysis offers abundant evidence that writers leave subtle and persistent traces of a distinctive style through all levels of their syntax and lexis. This brings to the fore a central paradox of language. Speakers and writers share the words they use in a given language [….] Yet from that common set [they] make individual selections that persist across all their uses of language. They create a personal and identifiable style from within the common language. Computational analysis reveals the richness of this variation within the dialogue of Shakespeare and his contemporaries. It persists even when dramatists strive to create their own fictional linguistic individualities in characters. Hal, Falstaff, and Hotspur do have their own languages, but underlying them all is a Shakespearean idiom, which means they are all distinct from Jonson, Marlowe, or Middleton characters. [….] it wasn’t until the computer came along that we could properly appreciate some aspects of this miraculous secret working of language in these very familiar plays and characters. (Craig and Kinney 2009, pp. xvi-xviii)
Also in their preface, Hugh Craig and Arthur Kinney acknowledge that “readers will want to know how many texts we have included, to judge the basis on which we generalize about authors and trends, and will be curious about the nature of the texts that underlie the whole enterprise.”
We use early printed versions of copy-texts, to minimize the effects of modern editing and to open up the corpus to two or more editions where they differ significantly. Each text is tied therefore to a single early witness. Consequently spellings are early Modern and highly variable. We standardize selected function words to modern usage. For the rest, a process has been developed within the software we use for word-counting to group variant spellings as teams, collecting the different forms of the same word under a single head word, which can then form the basis for counting. Thus instances of ‘folly’, ‘follie’, and ‘folie’, are all counted under the head word folly. The corpus we have assembled for the book consists of 165 Early Modern plays, around 3.25 million words of dialogue in all (they are listed in Appendix A of this book). Of these, 138 are from a more narrowly defined ‘Shakespearean’ period, which one might define as 1580-1619 [….] Some of these plays are of mixed or disputed authorship, and are the subject of investigation, so cannot form a set of standards for the core purpose of the study, the defining of Shakespearean authorship. For this we need single author, well-attributed plays to serve as exemplars of Shakespeare’s style and those of his contemporaries. There are 112 such plays within the 138 in our corpus. The Annals of English Drama (1964 edition) lists 174 surviving well-attributed single-author plays from this period. The corpus thus contains 112 out of 174—just under two thirds—of all the available usable plays for attribution purposes. It includes complete sets of the surviving Shakespeare, Marlowe, Jonson, Middleton, and Webster plays: complete […] according to a conservative standard of what is ‘well-attributed’ for each writer. We have four or more plays by seven other playwrights: Lyly, Greene, Peele, Dekker, Heywood, Ford, and Fletcher, and three each by Robert Wilson, Chapman and Shirley. (Craig and Kinney 2009, pp. xvii-xviii)
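The grouping of variant spellings under a head word, as described in the passage above, might look roughly like the following sketch. It does not reproduce the authors’ software: the table holds only the three ‘folly’ spellings named in the quotation, and the function names and sample tokens are hypothetical.

```python
from collections import Counter

# Variant table limited to the example given in the quotation.
# A real table would cover thousands of head words (an assumption here).
HEAD_WORDS = {
    "folly": ("folly", "follie", "folie"),
}


def build_lookup(head_words):
    """Map every variant spelling to its head word."""
    lookup = {}
    for head, variants in head_words.items():
        for variant in variants:
            lookup[variant.lower()] = head
    return lookup


def count_by_head_word(tokens, lookup):
    """Count tokens after folding known variants into their head word."""
    return Counter(lookup.get(token.lower(), token.lower()) for token in tokens)


# Invented token list for illustration.
sample_tokens = ["Folly", "follie", "folie", "wisdom"]
counts = count_by_head_word(sample_tokens, build_lookup(HEAD_WORDS))
```

Here the three spellings are tallied together under folly, while a word with no listed variants, such as wisdom, is simply counted under its own lower-cased form.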
Following on from the Preface, in the Introduction to the work, Hugh Craig and Arthur Kinney note Thomas Wilson as “one of the earliest champions of language in Shakespeare’s time.” Wilson’s manual The Arte of Rhetorique (1553), which establishes “the art of language in ways all the skilled dramatists of Shakespeare’s day would observe, is grounded in a philosophy of mind that is sophisticated even by today’s standards”:
Such ideas are restated nearly three decades later in the more famous Defence of Poetry of Sir Philip Sidney. [….] We have not substantially bettered these concepts in the twenty-first century, but we have deepened our knowledge of just how such poetic language comes to be; psychology, linguistics, physics and neuroscience have all come into confluence, showing us how the human brain works—not just ours but those of Shakespeare and his contemporaries. They have shown us the processes by which we acquire, process, and interpret knowledge; how the human brain processes language; and […] how each person’s processing of language is individually distinct. Word deployment is individual to a high degree; and understanding this permits us, for the first time, to address and answer, at least provisionally, some basic questions about the lives and works of Shakespeare and his contemporaries. (Craig and Kinney 2009, pp. 1-2)
Hugh Craig and Arthur Kinney further point out that
Although our culture has changed considerably since Shakespeare wrote, our brains have not [….] Our process of cognition, then, is transhistorical, and in applying its operations to the sort of mental processing that is revealed in Shakespeare’s language, neuroscience has taught us how to determine, through his individual handling of language, how we […] can determine new influences on, and accomplishments of, Shakespeare’s plays. For language is not just ‘a cultural artefact’, as Steven Pinker wrote in The Language Instinct (1994) but a ‘distinct piece of the biological makeup of our brains’, although the complicated process of cognition and subsequent representation of thought is not a conscious process. (Craig and Kinney 2009, p. 3)
Hugh Craig and Arthur Kinney then examine the intricacies of language from a neuroscientific perspective to explain how we “can understand more precisely how language works and poetry is formed,” and so give the reader an understanding of Shakespeare’s characterisations and use of language. As Aaron Ben-Ze’ev notes in his work on the subtlety of human emotions: “We don’t see things as they are, we see things as we are” (Ben-Ze’ev 2000, p. 9). In their book, Craig and Kinney point out that “since Shakespeare processes the idea or the object” (of a character or situation he sees in the external world or in his imagination) “in his own way, the expression will bear some stamp of individuality, too; and a scene or an act will become uniquely identifiable” (Craig and Kinney 2009, p. 4). Hugh Craig and Arthur Kinney also say that “the computer now allows us to establish the identifiable, distinguishing the use of language of individual Renaissance English playwrights, Shakespeare foremost among them”: although “common words account for most of the bulk of a text,” “rare words or relatively uncommon words provide a second string for authorship attribution” (Craig and Kinney 2009, pp. 5-13).
Authorship will probably remain the focus of most work in computational stylistics. It is the area of literary studies where we seek a single definitive answer, and makes a natural fit with quantitative methods. It takes up the majority of the studies in our book. Yet there is no reason why the methods should not provide useful perspectives on other aspects. […] As the amount of electronic text available to us for this type of work has grown, more of the method’s potential is being realized. (Craig and Kinney 2009, pp. 13-14)
In his review, Gary Taylor writes:
Craig and Kinney explain their “Methods” in the second chapter. They begin by finding data that “works” in undisputed cases. But it takes rare, patient determination to spend years developing a technique that tells people what they already know: that, for instance, Shakespeare wrote Coriolanus and Middleton wrote Hengist, King of Kent. Professionals and amateurs alike prefer to leap, immediately, to grab the glory of solving the most famous mysteries. In fact, most published studies of attribution […] never establish the reliability of their data or their methods so painstakingly. Craig and Kinney do. They focus, throughout the book, on very common words that one playwright demonstrably uses more or less often than other playwrights of the period, and on very rare words that one playwright demonstrably uses more than anyone else. This evidence is then evaluated using well-established statistical techniques. But you don’t need to be a statistician to understand the argument: 54 figures, 17 tables, 2 appendices (including one that defines technical vocabulary) make the evidence easily comprehensible, even to the most numerically challenged Shakespearian.
The first fruits of this method confirm conclusions that should by now be familiar and uncontroversial: Titus Andronicus contains material written by Peele, Timon of Athens material by Middleton, Henry VIII and The Two Noble Kinsmen material by Fletcher. Later chapters confirm that Shakespeare wrote nothing in Edmond Ironside (Philip Palmer, 100-115), but that he is indeed Hand D in the manuscript of Sir Thomas More, and that he wrote that addition early in the seventeenth century (Timothy Irish Watt, 134-61); the same methods credit Shakespeare for the “Countess” scenes of Edward III, but assign to an unidentified collaborator the battle scenes from IH. […] (Taylor 2011)
Ben-Ze’ev, Aaron. The Subtlety of Emotions. “A Bradford Book.” Cambridge, MA: MIT Press, 2000. Print.
Craig, Hugh, and Arthur Kinney, eds. Shakespeare, Computers, and the Mystery of Authorship. Cambridge: Cambridge University Press, 2009. Print.
Taylor, Gary. “Shakespeare, Computers, and the Mystery of Authorship/ Shakespeare’s Additions to Thomas Kyd’s The Spanish Tragedy: A Fresh Look at the Evidence Regarding the 1602 Additions.” In Medieval & Renaissance Drama in England. Vol 24. 1 January, 2011. Web. (accessed 16 October, 2015)
Hoover, David L. A Symposium in Honour of John Burrows. www.textualscholarship.nl/?p=9129 Web. 2015. (accessed 29 October 2015)