Undergraduates Reveal Key Feature of Homeric Scholia Using Advanced Computational Tools

Advanced computational tools such as natural language processing and word embeddings can provide excellent opportunities for undergraduate research in the humanities. This summer, a College of the Holy Cross research team completed a project titled “A Composite Model for Homeric Scholia Transmission.” Natalie DiMattia, Augusta Holyfield, Rose Kaczmarek, and I explored the little-studied scholia (historical scholarly annotations) of Iliad manuscripts and used natural language processing methods to analyze the topics within them. As part of the project, we have made available for the first time a diplomatic edition of the scholia within Books Eight through Ten of two Iliad manuscripts. Our work demonstrates that the sources of Homeric scholia vary across manuscripts, with no single stemmatic source. In other words, scribes used material creatively instead of simply copying from earlier works.

Holy Cross research team: Natalie DiMattia ’22, Rose Kaczmarek ’23, Anne-Catherine Schaaf ’22, and Augusta Holyfield ’22

Beyond the Stemmatic Model

Previous scholarship assumed a stemmatic model of transmission, with later annotations deriving from earlier ones like branches on a tree, all leading back to a single source. Because we recognized that scribes creatively mixed material from multiple sources, we applied computational methods to identify common units of scholia content. These units have been compressed, expanded, and combined in different manuscripts, making an unrooted network a more accurate model for scholia than a stemmatic tree.

No single method accounts for all the diagnostic features of the scholia: thematic content, technical language, non-linguistic markings on the manuscript, and chronological indications. Therefore, we drew on a variety of natural language processing methods: TF-IDF, which measures how important a word is to one document relative to the rest of the corpus; topic modelling, which identifies recurring clusters of co-occurring terms; and word embeddings, which represent terms as vectors according to the contexts in which they appear. Using this new methodological framework, we created a composite model of the relationships between scholia. The resulting network has no stemmatic family tree and no single source. Rather, it illustrates an interwoven, two-thousand-year scholarly debate about the Iliad.
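
For readers unfamiliar with TF-IDF, here is a minimal sketch of the idea in Python. It is not the code our team used: the function name tf_idf, the whitespace tokenizer, and the three English sentences standing in for scholia are all invented for illustration, and the weighting shown (term frequency times the logarithm of inverse document frequency) is only one common variant.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document.

    tf  = term count / number of tokens in the document
    idf = log(number of documents / number of documents containing the term)
    """
    n_docs = len(docs)
    # Naive tokenizer (an assumption): lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Invented toy "scholia" in English, for illustration only.
toy_scholia = [
    "the poet uses the dual here",
    "aristarchus marks the line with an obelus",
    "the poet names the river",
]
scores = tf_idf(toy_scholia)
```

In this toy corpus a word that occurs in every note, such as "the", scores zero, while a word confined to a single note, such as "aristarchus", scores highest there. That is the sense in which TF-IDF surfaces the vocabulary that distinguishes one scholion from the rest.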

Image produced by natural language processing

The MID’s Groundbreaking Work

Our research builds on work my teammates and I have been doing for four years as members of the Holy Cross Manuscripts, Inscriptions, and Documents Club (MID), focusing on the Homer Multitext Project. Part of Harvard’s Center for Hellenic Studies, the Homer Multitext Project has consistently engaged with and produced groundbreaking scholarship in Homeric studies. The Holy Cross MID was founded ten years ago and has worked with the Homer Multitext Project ever since, giving students many summer opportunities to work on these incredible manuscripts. Professor Neel Smith, our faculty mentor, also serves as one of the Information Architects for the project. Students have turned their work into senior theses and have presented at conferences in Kalamazoo, Michigan; Mexico City; and Krakow.

Beginning in my first year, when I took Introduction to Ancient Greek and joined the club, I learned the foundational skills required to work with these texts. I have continued my work in MID throughout my three years at Holy Cross while continuing through Advanced Greek, and I have taken an active role in seeing projects through at Hackathon and in testing the new software and introducing it to other members. In the spring of my sophomore year, I audited a course on archaeological data analysis, which gave me an initial overview of working with digital notebooks and forms of analysis such as topic modelling. My previous summer research gave me experience with the forms of textual analysis we continued to develop and taught me to be a more efficient reader and editor of the manuscripts.

Creating New Research Opportunities

This summer, my team and I developed our modeling methods even further by tackling the scholia, a much more complicated corpus. My senior thesis will focus on the theme of weaving in the Iliad and its scholia, and my research experience has been invaluable, not just for the technical skills it gave me but also because I directly built and processed the corpus that I will use for my thesis research.

My team received the high honor of having a paper based on our research accepted at the SCS-AIA, the preeminent conference for classics and archaeology in America. The session at which we will present, Ancient Makerspaces, is unique among classics conferences for its combination of the classics and digital humanities, and the scholars and presentations there will offer a fascinating introduction to the latest developments in the fields I am most passionate about. The work my team has done will continue throughout the year as we expand our corpora of books of the Iliad, and even though I’m graduating, I’m very excited to see what the next generation of MID scholars at Holy Cross produces.

Anne-Catherine Schaaf is a senior Classics major at the College of the Holy Cross.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License