
"The most perfect edition of plays ever published": the Digital Lacy project

Lou Burnard (Independent Scholar)

Thomas Hailes Lacy (1809-1873)

  • Lacy was the leading theatrical publisher of "Acting Editions" -- practical working documents printed at 6d a copy for individual titles, or 5s for a bound volume of 15 titles.
  • Between 1848 and 1873, Lacy's Acting Edition of Plays (LAE) grew to 100 volumes of 15 titles each; it was sold across the globe and made him a reasonable fortune.
  • The LAE is a unique sample, apparently covering the full range of Victorian theatrical presentations.
  • The population it samples approximates to the titles listed in vols 4 and 5 of Allardyce Nicoll's magisterial History of English Drama -- c. 24,000 distinct titles performed between 1800 and 1900.

Research question: how representative is the LAE?

A corpus is a sample, hopefully representative of a known population. Initial comparisons between the LAE and Allardyce Nicoll's Handlists suggest that the two have comparable distributions of size, age, and mode.
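One simple way to put a number on "comparable" is the total variation distance between the category proportions of the sample and of the population (0 means identical proportions, 1 means no overlap). The sketch below is illustrative only: the function names are hypothetical, the category labels and counts are placeholders rather than figures from the LAE or from Nicoll, and this is not necessarily the comparison the project used.

```python
def proportions(counts):
    """Convert raw counts per category into proportions summing to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("counts must not be empty or all zero")
    return {key: value / total for key, value in counts.items()}


def total_variation(sample_counts, population_counts):
    """Total variation distance between two categorical distributions."""
    p = proportions(sample_counts)
    q = proportions(population_counts)
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


# Placeholder counts by mode: NOT real LAE or Nicoll figures.
sample = {"comedy": 30, "drama": 50, "farce": 20}
population = {"comedy": 300, "drama": 500, "farce": 200}

distance = total_variation(sample, population)
```

The same function could be applied to counts binned by decade of first performance or by text length, one distance per dimension.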

First performance dates by volume

"It is hard to avoid the conclusion that Lacy astutely leavened the mix for each volume, using mainly contemporary titles to complement the old favourites." (cf. How old are these plays?)

Digital Lacy project

Proto-website at http://lb42.github.io/Lacy

Current workflow

  • Goal is consistent minimal encoding of a known source edition
  • VPP texts:
    • VPP-PDF to DOCX (OCR by ABBYY, thanks Huma-Num)
    • DOCX to TEI-All (XSLT by TEI)
    • TEI-All to Lacy XML (homegrown XSLT scripts)
  • Minimal markup, largely ignoring visual salience
  • TEI schema defined by an ODD, very close to dracor-schema

None of this is possible without manual intervention, which is the main bottleneck in the current workflow.

DraCor vs Lacy: how close?

DraCor and Lacy have a few ideological differences...

However, the DraCor team is very responsive and helpful!

Tagging headaches are another persistent challenge

These texts are full of phenomena which break or strain the simple OHCO model...

For example:

Implied speaker

Speeches assigned to multiple speakers

(<stage> not currently permitted within <speaker>)
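As a rough illustration of the multiple-speaker case, one common TEI approach is to list several pointers in the who attribute of a single <sp>. The fragment below is a hypothetical sketch: the ids and the speech text are placeholders, not taken from a Lacy title, and the project's actual encoding may differ.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical speech: ids and text are placeholders, not from a Lacy title.
SPEECH = f"""
<sp xmlns="{TEI_NS}" who="#first #second">
  <speaker>FIRST and SECOND</speaker>
  <p>Placeholder speech text.</p>
</sp>
""".strip()


def speaker_ids(sp):
    """Split the whitespace-separated pointers in @who into bare ids."""
    return [ref.lstrip("#") for ref in sp.get("who", "").split()]


sp = ET.fromstring(SPEECH)
ids = speaker_ids(sp)
print(ids)  # ['first', 'second']
```

Any tool that counts speeches per character then has to decide whether such a speech counts once for each listed speaker or once in total.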

Nesting of simultaneous speech or song

The whole dance (the Tyrolienne) is contained by a <spGrp> element which contains two nested <spGrp> elements, each containing two <sp> elements to be performed in parallel. (See also TEI Issue 2695)
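The nesting described above can be sketched as follows. This is a minimal, hypothetical fragment: the speaker ids and lines are placeholders rather than the text of the Tyrolienne, and the helper names are assumptions introduced for illustration.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical fragment: an outer <spGrp> for the whole dance, holding two
# nested <spGrp> elements, each with two <sp> elements performed in parallel.
FRAGMENT = f"""
<spGrp xmlns="{TEI_NS}">
  <spGrp>
    <sp who="#A"><l>Placeholder line for A.</l></sp>
    <sp who="#B"><l>Placeholder line for B.</l></sp>
  </spGrp>
  <spGrp>
    <sp who="#C"><l>Placeholder line for C.</l></sp>
    <sp who="#D"><l>Placeholder line for D.</l></sp>
  </spGrp>
</spGrp>
""".strip()


def tag(name):
    """Return the namespace-qualified tag name that ElementTree expects."""
    return f"{{{TEI_NS}}}{name}"


def parallel_groups(outer):
    """For each directly nested <spGrp>, list the @who of its <sp> children."""
    return [
        [sp.get("who") for sp in inner.findall(tag("sp"))]
        for inner in outer.findall(tag("spGrp"))
    ]


outer = ET.fromstring(FRAGMENT)
groups = parallel_groups(outer)
print(groups)  # [['#A', '#B'], ['#C', '#D']]
```

Each inner list holds the speakers whose speeches are meant to be delivered simultaneously; the order of the inner lists follows the order of the nested groups in the document.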

Tentative suggestions and conclusions