AI breakthrough enables scientists to read Roman scrolls once buried by Mount Vesuvius

ZDNET In Depth: Go inside the 20-year journey to decode the Herculaneum scrolls, carbonized by a historic volcanic eruption two thousand years ago - and unreadable until now.
Written by Erin Carson, Senior Contributing Editor

Herculaneum scroll with red laser lines being scanned at Institut de France by Brent Seales and his team.

EduceLab.

After a historic volcanic eruption, two millennia, and an international effort to use artificial intelligence to read a set of mysterious ancient scrolls, researchers know what at least one Roman Epicurean philosopher had on his mind: food.

Humans, while full of surprises, can be endearingly predictable. 

The revelation comes as the culmination of the Vesuvius Challenge -- a contest launched in March 2023 by University of Kentucky researcher Brent Seales, former GitHub CEO Nat Friedman, and entrepreneur and investor Daniel Gross. The goal was to take computed tomography (CT) scans of what are known as the Herculaneum scrolls as well as machine-learning-based software and put these in the hands of tech-savvy sleuths from around the world in hopes someone could read the scrolls without even touching them. 

With support from Silicon Valley, organizers dangled prize money for progress in the pursuit of reading the writing once buried and carbonized in the eruption of Mount Vesuvius. That included a $700,000 grand prize, which will be split among the winning team of three: Youssef Nader, Luke Farritor, and Julian Schilliger -- all students. They submitted 15 columns of text, which preliminary analysis suggests contain writing about whether the scarcity or abundance of goods like food affects how pleasurable humans find them.

The Vesuvius Challenge marks a pivotal moment in the quest to get inside the scrolls. It's also a big moment for Seales, the researcher from the University of Kentucky -- he's been trying to accomplish this for the last two decades. 

Seales and various incarnations of his team have never been closer to reading the trove of texts. In a way, the contest's December 31 deadline didn't really matter. The challenge, both in terms of the grand prize and the puzzle itself, goosed interest and recruited new collaborators whose contributions Seales compared to about 10 years of human work in just the first three months. 

"It's astounding to feel this kind of redemptive power that we may hold now because of AI and tomography and computation," Seales said in an interview before the grand prize announcement.

It might seem like a lot to go through, but Michael McOsker, a researcher who has studied the scrolls, estimates all these efforts could yield what amounts to about 200 new books. The collection is also the only surviving library from antiquity.

"We have probably less than 1% of … all the literature that was written," he said. "Any gain in our knowledge is important."

History unwrapped

Brent Seales and Seth Parker (Digital Restoration Initiative project lead) scanning a replica of the Herculaneum scroll on the University of Kentucky campus.

UK Photo

Seales didn't set out to spend two decades unwrapping ancient texts. Originally from Western New York, he was an imaging specialist with an interest in AI. The problem: there wasn't much happening with AI back then. Computer vision, however, seemed like one area where progress was being made. 

He met a professor at the University of Kentucky in the mid-1990s who was working on a manuscript of the Anglo-Saxon epic poem Beowulf at a time when there was a push to digitize libraries. Seales, who had read Beowulf in high school like so many kids, was intrigued. He thought about the power of digitizing this one text -- the one extant manuscript to witness the story. 

Digitization transitioned into restoration. Once a text was digital, the image quality could be improved. Just making a copy didn't have to be the end. And if they could digitally flatten a wrinkled document, why couldn't they also unfurl it?

"We invented the idea of completely unwrapping something before we knew about the things that we were going to actually unwrap," Seales said. 

In 2004, Seales finally found something to unwrap when a University of Michigan classics scholar named Richard Janko told him he'd identified the perfect fit. 

Enter: The Herculaneum scrolls. 

The Greek characters, πορφύραc, revealed as the word "PURPLE," are among the multiple characters and lines of text that have been extracted by Vesuvius Challenge contestant Luke Farritor.

Vesuvius Challenge

Digging up the past

In modern times, the eruption of Vesuvius might call to mind images of those ash-entombed bodies holding each other as their world ended. It's a historical event that's at once fascinating, tragic, and even a little creepy. 

The only account from the time comes from the letters of Roman author and lawyer Pliny the Younger, who described panicked crowds and a "thick black cloud" that consumed the land like a flood. 

"Some people were so frightened of dying that they actually prayed for death," he wrote. 

When the cloud thinned enough to let daylight in, Pliny the Younger saw everything buried deep in an ash that reminded him of snow. 

In Herculaneum -- a city roughly 10 miles to the west of Pompeii, and even closer to the erupting volcano -- all the falling ash and debris buried a villa once owned by Julius Caesar's father-in-law. Renderings of the estate show a giant courtyard, gardens, and arches. Crucially, the villa was also home to a library of papyrus scrolls. 

While about 65 feet of hot ash might seem like the worst possible outcome for papyrus, the heat carbonized the scrolls, preserving them from the natural deteriorating effects of air.

It wasn't until the 1700s that a farmer, while digging a well, struck marble and kicked off excavation efforts that turned up more than 600 unopened scrolls. (The exact number of Herculaneum papyri is hard to pinpoint, Seales said, because it depends on whether researchers count fragments and partial pieces. Some put the number as high as 1,800.) 

The scrolls passed into the care of Antonio Piaggio, a scholar from the Vatican Library, who invented a machine to unwrap some of the better-preserved scrolls. Piaggio wasn't always successful.

What was unwrapped contained primarily Epicurean philosophy, leading McOsker to believe the remaining scrolls could be of the same nature. They may not rewrite the way scholars view the ancient world, but considering the dearth of writings from the time, another 200 books could be a decent haul. 

It's not an 'experiment'

Today, the scrolls are housed in several locations around Europe, with the bulk found at the National Library of Naples in Italy. 

Not surprisingly, most people can't walk in and futz around with fragile 2,000-year-old scrolls. It took Seales years of building a case through funding, success on other projects, and academic diplomacy to gain access.

Seales, who had a background in surgical innovation like laparoscopy, wanted to use computed tomography to scan the scrolls and then create software to virtually unwrap those scans. 

In 2005, Seales had the opportunity to share his idea in a lecture at Oxford. By that time, he and his team had put together an example of papyrus embedded in a polyurethane sphere, which they had scanned and virtually unwrapped. 

"That was sort of the debutante come down stairway with the dress on saying, 'come and dance with me,'" Seales said.

The response was positive, but to the ears of protective conservators, it still sounded a whole lot like an experiment, and "experiment" is a dirty word when applied to something so rare and old. 

After four years of hard work, relationship-building, and some finesse, in 2009, Seales and his team traveled to the Institut de France to make their first micro-CT scans of the papyri. 

"I was simultaneously terrified and also incredibly excited," Seales said. The scrolls were small and looked like charcoal. "They tell you that this is a whole book from antiquity… and it's just this little tiny thing because it shrunk when it carbonized."

As much of an achievement as it was to finally get scans of the scrolls, Seales struggled to get the software to work the way the team wanted it to.

If they weren't going to crack the Herculaneum scrolls immediately, they needed another goal to shoot for. 

Troubleshooting

To say Seales has been working on the Herculaneum scrolls for 20 years might make it sound like he clocked in and out of the office every day with that singular focus. 

In reality, there were chunks of time when the team couldn't work on the scrolls, or were working on digitally scanning and unwrapping other texts that in some way still helped them move closer to their final goal. 

In 2006, Seales' team unwrapped a medieval copy of the Book of Ecclesiastes written in Hebrew. A year later, in 2007, Seales was on a team that went to Venice to digitize the oldest complete copy of Homer's Iliad. 

Herculaneum scroll being scanned at Diamond Light Source inside its scanning case.

EduceLab

"Every one of those projects that I did along the way built up a little bit of credibility in me as a researcher, and some knowledge in me in being able to approach decision makers at these museums and libraries to have a conversation with them," he said. He even learned enough French to speak with the researchers in Paris.

Still, by 2012, the Herculaneum push was in a bit of a slump. The next year, Seales took a sabbatical and spent a year in Paris as a visiting scientist at Google's Cultural Institute. It gave him the chance to rebuild confidence and get an infusion of new people and new ideas, right as Google was about to acquire AI research lab DeepMind.

Around that time, Seales started pursuing the idea of making scans inside a particle accelerator, which would significantly boost the resolution of the images. 

The reset was handy, as the technical challenge of reading scrolls remained thorny.

One chief problem has been something called segmentation. Though the scrolls are quite small, the scans are detailed. Technical Lead Stephen Parsons, who first worked with Seales as an undergrad at the University of Kentucky, described trying to digitally separate layers of partially crushed papyrus and the network of fibers visible in the scans. He compared it to what a cross-section of a log might look like, but somewhat smashed. 
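To see why segmentation is the crux, consider a toy sketch in Python. It assumes a perfectly rolled sheet that follows an ideal spiral on a pixel grid. The function names (`spiral_path`, `roll`, `unwrap`) and the spacing values are invented for illustration and are not part of the team's software. Once the path of the sheet is known, flattening it is just a matter of reading values along that path. In a real scroll the layers are crushed and tangled, so the path itself has to be found first.

```python
import math


def spiral_path(n_samples, spacing=4.0, start_radius=3.0, step=1.0):
    """Return n_samples distinct grid cells along an idealized rolled sheet.

    The sheet is modeled as an Archimedean spiral whose windings sit
    `spacing` pixels apart (a hypothetical, perfectly uncrushed scroll).
    """
    a = spacing / (2 * math.pi)   # radial growth per radian
    theta = start_radius / a      # start at the innermost winding
    path, seen = [], set()
    while len(path) < n_samples:
        r = a * theta
        cell = (round(r * math.cos(theta)), round(r * math.sin(theta)))
        if cell not in seen:      # skip samples that round to a used pixel
            seen.add(cell)
            path.append(cell)
        theta += step / r         # advance roughly `step` pixels of arc
    return path


def roll(sheet, path):
    """Place each sample of the flat sheet at its cell in the cross-section."""
    return dict(zip(path, sheet))


def unwrap(cross_section, path):
    """Read the cross-section back along the known sheet geometry."""
    return [cross_section[cell] for cell in path]


message = "PURPLE IS ONE WORD AMONG MANY"
path = spiral_path(len(message))
cross_section = roll(message, path)

# Reading the slice row by row, the way a raw scan is stored, mixes windings.
raster = "".join(
    cross_section[c] for c in sorted(cross_section, key=lambda c: (c[1], c[0]))
)
# Reading along the sheet recovers the text in order.
flattened = "".join(unwrap(cross_section, path))

print(raster)
print(flattened)
```

The sketch hands `unwrap` the exact path used to roll the sheet. Recovering that path from a scan of crushed, carbonized papyrus is the hard part Parsons describes.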

Another challenge has been actually reading the ink on the papyrus. Parsons said the best imaging technology they have to see inside the scrolls is the X-ray micro CT. The issue? There's not enough contrast to read the ink.

The Herculaneum scrolls were written with what was essentially soot from oil lamps, which chemically is almost pure carbon. As the papyrus is also chemically carbon, the team found themselves looking at gray on gray.

Other projects, like the En-Gedi scroll in 2016 -- the oldest Pentateuchal (relating to the first five books of the Bible) scroll since the Dead Sea scrolls, whose successful digital unwrapping was a major milestone for Seales' team -- used ink with iron in it, which shows up as bright spots in X-rays.

Parsons said they hypothesized there could still be some detectable difference. He likened it to black-painted lines on asphalt. Perhaps, a machine-learning model could be trained to see the ink.
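The black-paint-on-asphalt idea can be illustrated with a toy sketch in Python. Everything in it is an assumption made for illustration: the synthetic "patches," the brightness and spread values, and the simple threshold rule stand in for the far more capable models the team and contestants trained on real scan data. Ink and bare papyrus are given the same average brightness, and ink differs only in texture.

```python
import random
import statistics


def make_patch(rng, is_ink, size=64):
    """Synthetic patch: same average gray either way, ink is only 'rougher'."""
    spread = 0.08 if is_ink else 0.05   # hypothetical texture difference
    return [rng.gauss(0.5, spread) for _ in range(size)]


def make_dataset(rng, n_per_class):
    """Labeled patches: label True means ink, False means bare papyrus."""
    return [
        (make_patch(rng, label), label)
        for label in (False, True)
        for _ in range(n_per_class)
    ]


def fit_threshold(data, feature):
    """'Train' by taking the midpoint between the two class averages."""
    ink = [feature(patch) for patch, label in data if label]
    bare = [feature(patch) for patch, label in data if not label]
    return (statistics.fmean(ink) + statistics.fmean(bare)) / 2


def accuracy(data, feature, threshold):
    """Share of patches labeled correctly when 'above threshold' means ink."""
    hits = sum((feature(patch) > threshold) == label for patch, label in data)
    return hits / len(data)


rng = random.Random(0)            # fixed seed so the run is repeatable
train = make_dataset(rng, 200)
test = make_dataset(rng, 200)

# Brightness alone: gray on gray, so there is little to separate the classes.
brightness_acc = accuracy(
    test, statistics.fmean, fit_threshold(train, statistics.fmean)
)
# Texture (spread of values within a patch): the subtle signal to learn.
texture_acc = accuracy(
    test, statistics.pstdev, fit_threshold(train, statistics.pstdev)
)

print(f"brightness only: {brightness_acc:.2f}")
print(f"texture feature: {texture_acc:.2f}")
```

In this made-up setting, a rule based on average brightness has almost nothing to work with, while a rule based on texture separates the two classes well. Real ink signals are subtler and vary across a scroll, which is why the problem called for machine learning rather than a hand-picked threshold.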

It took years of work, testing the idea on scrolls they made and fragments of Herculaneum scrolls that had broken off and revealed their writing, to get to the point where they were able to read two characters from layers deep inside a scroll. 

"It was clear with that moment. Even if it takes many years to develop to refine… this approach is going to bear fruit eventually," Parsons said. A year later, it has.

A software problem

Segmentation and ink detection point to a larger truth: getting the scans has been only part of the overall challenge of the scrolls. Taking the data and sorting it out algorithmically has been a whole other journey. 

After several iterations, Seales' team created the Volume Cartographer, written primarily by project lead Seth Parker, who joined the team in 2012. It's open-source software used to map the inside of the scrolls and make sense of the "floating word soup," as Parker put it.

The 12 pezzi, or "pieces," of the opened Herculaneum papyrus scroll known as P.Herc.118. The compilation of images is owned by the Bodleian Library at the University of Oxford.

Vesuvius Challenge

Parker started his career as a video editor working on research documentaries. He'd worked with Seales, and when a team member left and Seales needed someone who knew their way around cameras and image capture, he recruited Parker. Once a media and communication major, Parker turned toward a Ph.D. in computer science. 

He's also getting extra help with the Volume Cartographer from people hired by the Vesuvius Challenge who are working their way through a wishlist of bug fixes. 

Creating the Vesuvius Challenge meant opening up years of work to an unknown global workforce. Parsons estimated more than 1,000 people have been working on the project for the better part of a year -- something that's both scary and exciting, he said. 

After all, it was computer science students who made the initial discovery of the word "purple" in October.

Keep scrolling

While 15 columns of text is more than Seales expected, it is hardly the end of the story.

Looking farther out, both Parker and Parsons imagine their work could also inspire other fields that use dimensional imaging. 

CT scans and MRIs are already powerful, but what if there's still information hiding from the naked eyes of doctors that could improve tumor detection and the like?

"There are ways of transforming that data to make it more interpretable for a human," Parker said. 

"There's no reason to slow down. Let's read the entire library," said student and team member Luke Farritor.

Vesuvius Challenge

And there are still ancient texts to read. Concurrently, they're working on a medieval manuscript from the Morgan Library -- a Coptic Gospel whose pages are fused. They've taken multiple CT scans and are once again trying to virtually untangle what's written inside. 

The short-term goal for 2024 is to read 90% of the scroll Nader, Farritor and Schilliger started. And yes, there will be more prize money on the line.

"We are celebrating right now, but there's no reason to slow down. Let's read the entire library!" Farritor said in a statement.

For Parsons, there is something profound about working on these texts and imagining the humans 2,000 years ago who wrote them -- people who never would have guessed anyone in 2024 would be so interested in what they had to say, and certainly couldn't have conceived of the tech behind those efforts. Even today, most people would likely struggle to define "machine learning." 

"All this time has gone by and this one part of this journey has come to me and my computer screen," Parsons said. "That's quite humbling."

After all these years, Seales knows the importance of that throughline of humanity. Ancient texts talk about love, war, music, rhetoric, poetry -- topics still being agonized over today. And food, of course. 

"The mature intellectual dialogue that occurs in these ancient manuscripts is distinctly human. Being able to tell stories is distinctly human," Seales said. 

Seales imagines that by reaching 2,000 years back, past all current religious, political, or other boundaries, there may be a way to rally around what it means to be human. 

"We have to read it," Seales said. "We have to study it. We can't forget it."
