Introducing digital classics

By Giacomo Peru, Project Officer.

As a member of the Software Sustainability Institute and classicist, I could not write my first blog post on anything but the relationship between Classics and IT. [1]

For those who are not familiar with the subject, Classics is the study of the Greek and Roman world in the period that spans, roughly, from the start of the 1st millennium BC to around the 6th century AD. At the core of this field is the study of ancient Greek and Latin, the main languages of that world; of Classical Archaeology, which collects and studies its material artefacts; and of Ancient History, which reconstructs ancient Mediterranean and Near-Eastern history by means of texts and archaeological evidence. Beyond this tripartite division, the field has bred a number of ancillary disciplines, which can be regarded as disciplines in their own right, such as philology, linguistics, palaeography, philosophy, history of art, and others.

We are all aware that IT has revolutionised all fields of knowledge and the traditional practices within them, and most of us are aware that the Humanities have a well-established digital branch. Yet some might still be surprised at how eagerly classicists have endorsed the computerisation of their traditional scholarly tools. Digital Classics is, then, the way in which classicists strive to integrate current digital tools with their traditional practices and needs.

In future posts I will investigate the other directions taken by digital classicists. Yet for the sake of simplicity, I will keep the focus of this post on how the study of ancient Greek and Latin texts, a discipline named Classical Philology, has developed in the digital era.

As it happens, the study of classical texts has always been highly data-intensive. Since the time of the great libraries at Pergamon and Alexandria, from the 3rd century BC onwards, scholars [2] have collected, catalogued, parsed, analysed, and commented on texts, and, based on the primary sources available to them, developed scholarly resources such as lexica, encyclopaedias, commentaries, and critical editions. [3] This work has been handed down from one generation to the next through the centuries, having survived the many wars, plagues, calamities, cultural and political upheavals and simple bad luck that have happened in the meantime.

In the digital age, classicists face a daunting amount of scholarly material, formed from primary sources and a very large body of scholarship. [4] This has been laboriously produced and preserved by a class of extremely disciplined and skilled scholars. Indeed, the work of classical philologists has always been inherently better suited to a digital environment than to a paper one, as the former grants them an ease of access and a versatility that printed texts could never provide.

Therefore, it is no surprise that classicists were among the first humanists to exploit the potential of computers in their work. Roberto Busa’s Index Thomisticus, which began on IBM computers in the 1950s, and D. W. Packard’s Concordance to Livy, which dates back to the late 1960s, are notable cases in point. Since then, digital classics have evolved through several stages, as can be seen in the work of one of the field’s leading lights, Gregory Crane, co-founder and editor-in-chief of the Perseus Digital Library. [5] Yet for the sake of brevity, let us look at the three main stages of this history.

The first stage involved the Digital Incunabula (DI). Incunabula was a term first applied to the earliest printed books, but DI refers to digital collections of non-machine-actionable texts. Google Books and the Open Content Alliance (OCA) can be considered examples of these. More relevant to the field of Classics are the Thesaurus Linguae Graecae (TLG), JSTOR and the Bryn Mawr Classical Review (BMCR).

The TLG, started in 1972, is a digital library of texts that runs to over 100 million words. It is one of the most important resources of the field and is under copyright. This means you have to pay substantial subscription costs to access it. The texts contained in it mirror printed editions and can be searched in various ways to generate excerpts of those print sources.

Other features of the TLG are the accuracy of the transcriptions and the encoded citation scheme through which scholars cite these sources. JSTOR is an archive of digitised publications, also protected by copyright and subscription costs, and BMCR, founded in 1990, is an online open access journal that publishes reviews of current works in the field. All these resources make access to the material much quicker and easier.

The second stage brought machine-actionable knowledge bases, such as the aforementioned Perseus Digital Library, which was founded in 1987. Perseus is a resource to which I’m personally indebted because it helped me significantly with my dissertation thesis. [6] It was launched with the aim of advancing far beyond the horizon of semi-static digital resources such as the TLG.

Instead, it is an environment that embraces both the textual and the material data of the classical world, and exposes this database to new forms of dynamic inquiry. Semantic text mark-up is the characteristic feature of projects like Perseus, and this introduces a whole new stream of revolutionary possibilities, whereby what for centuries has been dependent on the rigorous intellectual exertions of scholars can now be carried out quite trivially by digital computation.
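To give a rough idea of what semantic mark-up makes possible, here is a minimal Python sketch. The XML fragment, its element and attribute names (`w`, `lemma`, `pos`) and the two helper functions are invented for illustration only; they are not the actual Perseus schema or API. The point is simply that once every word carries its dictionary form and part of speech, a question that once required reading the whole text becomes a few lines of code.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, simplified mark-up of the opening words of Caesar's
# De Bello Gallico. Element and attribute names are assumptions made
# for this example, not the real Perseus encoding.
SAMPLE = """<text author="Caesar" work="De Bello Gallico">
  <s n="1.1.1">
    <w lemma="Gallia" pos="noun">Gallia</w>
    <w lemma="sum" pos="verb">est</w>
    <w lemma="omnis" pos="adjective">omnis</w>
    <w lemma="divido" pos="verb">divisa</w>
    <w lemma="in" pos="preposition">in</w>
    <w lemma="pars" pos="noun">partes</w>
    <w lemma="tres" pos="numeral">tres</w>
  </s>
</text>"""


def forms_by_lemma(xml_source, lemma):
    """Return (citation, surface form) pairs for every word with this lemma."""
    root = ET.fromstring(xml_source)
    hits = []
    for sentence in root.iter("s"):
        for word in sentence.iter("w"):
            if word.get("lemma") == lemma:
                hits.append((sentence.get("n"), word.text))
    return hits


def pos_counts(xml_source):
    """Count how often each part of speech occurs in the marked-up text."""
    root = ET.fromstring(xml_source)
    return Counter(word.get("pos") for word in root.iter("w"))


if __name__ == "__main__":
    # Find every inflected form of "pars", with its citation.
    print(forms_by_lemma(SAMPLE, "pars"))
    # Tally the parts of speech in the passage.
    print(pos_counts(SAMPLE))
```

A plain-text search for "pars" would miss the inflected form "partes"; the lemma attribute is what lets the query find it.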

Finally, Suda On Line (SOL) is a successful collaborative project started in 1997 as an online digital community that aimed to create the first comprehensive translation of a Byzantine encyclopedia called The Suda. This massive work runs to 625,000 words spread across more than 30,000 entries, a scale that posed all manner of challenges to traditional scholars. Yet as of this month, nearly all of the entries have been translated by the SOL community. In addition, the resources provided by SOL share all the features already present in Perseus, such as all the text being fully XML encoded.

In conclusion, these examples all have lessons for both digital classicists and the digital humanities overall. The challenge in each case has been how to digitise what are, after all, very large corpora of texts and to make them machine-actionable, so that they can be subject to the widest possible range of inquiries. How this has been achieved shows the way for future work, and also demonstrates both the need for well-developed methodologies and the possibilities that open up when digital technology is exploited to the full.


  1. This blog post draws significantly on the work of Prof. Gregory Crane (see n. 5 below).
  2. Throughout the Middle Ages, being a scholar meant being a Classicist by default.
  3. Critical editions are crucial tools in Classical Studies. Their methodology also draws interesting parallels to the software development process.
  4. OCLC, for example, refers to over 20,500 works, all by or about Homer.
  5. Professor Crane is now Open Access Officer for the University of Leipzig, responsible for developing its Open Philology Project which will eventually cover every academic program, from Biology and Chemistry to Greek and Latin.
  6. Which was a sample of a commentary on Lysias’ fourth oration, of which I disgracefully lost the master digital copy in the pre-cloud-computing era...
Posted by a.hay on 18 February 2014 - 11:00am