Finding 20th-century manuscripts for the Location Register at the BBC Written Archives Centre

By Jackie Bishop, Location Register & WATCH Researcher

Introduction

As the researcher for the Location Register, one of the University of Reading’s (UoR) research and digital humanities projects, I identify literary manuscripts held in British and Irish archives and libraries and record them in an online database. This provides literature researchers with a one-stop shop for finding where and how to access original manuscripts. With over 500 repositories included, there is plenty of work to do! I am constantly on the watch for announcements of new literary acquisitions and the launch of catalogued collections. I was asked to investigate manuscripts at the BBC Written Archives Centre in Emmer Green, Reading, as they had last been researched in the early 2000s.

The history of The Location Register

The original Location Register of 20th-century English literary manuscripts and letters was published in two large volumes by the British Library in 1988, edited by Dr David Sutton. This was followed in 1995 by the publication, also by the British Library, of Location Register of English literary manuscripts and letters: 18th and 19th centuries. The printed volumes are available in the University of Reading Library and Special Collections reading room.

Since 1998, records have been added to an online database, now hosted by UoR Special Collections. Around 3,000 literary authors of all genres, from major poets to minor science fiction writers and romantic novelists, are documented. Enid Blyton, Marie Corelli and Ruby M. Ayres are included in the same way as Doris Lessing, Dorothy M. Richardson and Virginia Woolf. Find the full list of included authors, with brief biographical and archival details, here: Location Register NAL.

The obvious benefit of publishing the research online is that users can remotely search for authors and manuscripts and find out what repositories hold, access arrangements and archive call numbers before visiting in person. The database can be used to see connections between authors, their archive collections and regional literary movements.

BBC Written Archives Centre

The BBC Written Archives Centre (WAC) holds the working papers of the BBC from 1922 onwards. The background to the collection is described on a panel in an exhibition about the service: “The BBC’s mission to inform, educate and entertain has led to the accumulation of documents that showcase British culture and political history throughout the 20th century and beyond”.

The BBC WAC holds 4.5 miles of material, 250,000 files of correspondence and 21,000 reels of microfilm. The collection’s main file series are scripts, contributor files, production and policy files, and BBC publications such as the Radio Times. More information is available on the BBC WAC website.

Caversham Park

The BBC’s documents were originally held in the Historical Records Office, which was set up in London in 1957. In 1970, the Written Archives Centre was established and the Corporation’s records were transferred to Caversham, Reading.

The BBC WAC is next door to the former BBC Monitoring mansion building and the grounds of Caversham Park. Based in Emmer Green, a quiet village with a duck pond, BBC Monitoring was a buzzing, thriving global news operation in the “big park” from 1943 to 2018. I was lucky enough to work there in the 1990s and 2000s.

Set up in 1939, just before World War II, BBC Monitoring was asked by the government to monitor the Axis powers’ use of the media, especially radio. Through the decades, the BBC monitored and translated global media coverage during historic events such as the Cold War, the fall of the Berlin Wall, the wars in the Gulf, Iraq and Afghanistan, and acts of terrorism including 9/11. A history of Caversham Park by Brian Rotheray (published by BBC Monitoring) covers the use of the building from the 13th century on.

 

Research guidance

As the BBC WAC only has a partial online catalogue, visiting in person is essential. My starting point for the project was to find out what collections and resources were available at the archive.

There is plenty of advice on the WAC website, with information for visiting researchers which guides you through the procedures for making an appointment, using the reading room, on-site facilities, copying and licensing. There is also a research guide on how to access the material held at WAC, whether you are searching for a person, programme or policy. There is a comfortable break room with a friendly atmosphere where you can talk to other researchers. The usual reading room rules of no pens or bags apply.

The reading room is open for visitors on Wednesdays and Thursdays by prior arrangement only.

Catalogue

The BBC has a partial catalogue on The Archives Hub. This gave me an idea of the structure, extent and dates of the collection.

Material less than seven years old, material that has not yet been deposited with BBC Archives, and material held for legal or business reasons or in electronic formats are not in scope for general research.

I emailed the team at WAC, explaining about the Location Register and the kind of manuscripts I was hoping to find. They suggested series of radio and TV contributor files that would be worth pursuing and offered to pick some sample files. At this stage it was hard to gauge how many files would be relevant and how many visits I would need to make. I booked a desk for two days in September 2024. Little did I know I would be visiting twice a month until June 2025!

 

Jackie Bishop in the BBC reading room with a large number of BBC files.

 

The hunt for manuscripts

I was greeted on my first visit by a pile of around 60 files waiting for me in a trolley. The archivist assigned to help me with my research had produced a good range of different files from radio and TV contributors and talked me through the selection. There were well-known BBC names, such as Spike Milligan, Alan Bennett and Raymond Briggs.

The paper catalogues for the collection are available in the reading room, so after I had looked at the initial set of files I was able to check one of the catalogues against the list of 3,000 literary authors I was looking for. I was delighted to find so many of the relevant authors listed in the BBC catalogue.

I created a spreadsheet recording existing entries in the Location Register, the files I had requested, which ones were open and which ones contained manuscripts. As some files were closed, I started to request more than I could get through in one sitting, to compensate. It became clear quite quickly that there was a huge number of manuscripts from the 1960s onwards that had not previously been included in the Location Register.

I started looking through the files, turning every page in search of the handwriting and signatures of authors, documenting my finds and notes in a Word file. What emerged was a picture of the relationship BBC staff had with literary authors: letters encouraging them to create for broadcast and to participate in programmes, along with some rejection letters. There were many legal forms covering copyright, meticulously laid out and kept. Some of the most interesting files covered the whole production process for a programme, including scripts, outlines, characters, casting and finances.

In between visits, I formatted my research notes in the house style of the Location Register and input the data online. Of the 500 files I ordered over the ten months of the project, I was able to look at the 355 which were open for access.

I was pleased to find some Samuel Beckett manuscripts, as UoR Special Collections includes the Beckett archive. Other highlights were Dennis Potter’s staff file; Spike Milligan’s correspondence in his unique and amusing style; extracts of John le Carré material for scripts; well over 200 items of correspondence between the BBC and Alun Owen, the prolific Welsh playwright and screenwriter known for writing The Beatles’ film A Hard Day’s Night; and over 300 items of correspondence between the BBC and Gwyn Thomas, a Welsh writer and broadcaster. A personal highlight was reading Sue Townsend’s correspondence about televising The Diary of Adrian Mole.

There are now around 1,300 entries for the BBC WAC in the Location Register, which you can see here. As much of the BBC material is not available in an online catalogue, the archive team can now point literary researchers to the Location Register to find call numbers online prior to visiting.

 

Example of a Location Register record for the BBC Written Archives Centre.

 

Conclusion

I thoroughly enjoyed researching this fantastic, well-organised resource. It is a rich collection of 20th-century broadcasting history, demonstrating the engagement BBC staff had with many authors.

The archive team were incredibly helpful and dedicated in helping me with my research. Thanks go to the BBC’s Sam Blake and Tom Hercock, who produced hundreds of files for me, and the reading room team for answering my many questions.

The Location Register has been regularly supported in its research by the British Academy, and since 2003 has been recognised with the status of Academy Research Project (ARP). Funding for the project has been received from the British Academy, the Strachey Trust, the Arts Council of England, the Esmée Fairbairn Foundation and the Pilgrim Trust.

New article: ‘Immersive sacred heritage: enchantment through authenticity at Glastonbury Abbey’

Photograph of Glastonbury Abbey
Glastonbury Abbey (©Cheryl Green, reproduced with permission)

What are the distinctive challenges and opportunities of immersive heritage interpretation in the context of a sacred heritage site? We congratulate Roberta Gilchrist (University of Reading), Janet Bell, Alex Book, Claire Fear, Olivia Hinkin, Simon Hobbs, Jennifer Matravers, Jon Meggitt, Nic Phillips and Vanessa Ruhlig on the publication of their article ‘Immersive sacred heritage: enchantment through authenticity at Glastonbury Abbey’, which addresses this question.

The paper reflects on how immersive approaches can be designed to respect the spiritual ethos of sacred heritage, in relation to the open-air heritage site of Glastonbury Abbey (Somerset, UK). Here, a cross-disciplinary team collaborated in the design of an immersive experience to engage visitors. Authenticity and biography emerged as powerful conceptual tools to reconcile potential tensions between immersive storytelling and the spiritual values of place.

Read the article, open access

Congratulations to our video essayists

A cinema screen reads 'Urban Futures at the New York World's Fair of 1939 by Mara Oliva University of Reading'
Screening of Mara Oliva’s “The City of Tomorrow” video essay at MOMI NYC in Jan 2025

Congratulations to our Reading researchers who exhibited video essays at the Museum of the Moving Image in New York City earlier this year.

The series Expanded Screens: The Video Essay was programmed by Professor John Gibbs (Film, Theatre & Television) and Dr Mara Oliva (History) in partnership with other video essay experts and practitioners. It presents a selection of work that explores the possibilities of the medium, featuring pieces from both co-curators as well as Dr Adam O’Brien (Film, Theatre & Television).

Professor Gibbs and Dr Oliva also attended the American Historical Association Annual Meeting, where they participated in a panel titled Visualizing the Past: Exploring the Video Essay as a Dynamic Historical Methodology.

Synthesising the Obraz | Stanislavsky, Active Analysis and AI-Assisted Image Creation

by Jon Weinbren, University of Surrey

[Obraz is the anglicised spelling of the Russian polyseme образ, which can be roughly translated as character, image, representation, likeness, or similitude.]

This is a whirlwind tour of a piece of research I had the pleasure of presenting at the Digital Humanities and Artificial Intelligence Conference in June 2024 at the University of Reading. In this work, I draw comparisons between the now widely familiar world of generative AI image creation and a somewhat underappreciated aspect of Konstantin Stanislavsky’s actor training system, known as ‘Active Analysis,’ which, despite its significance, has perhaps not received the broader recognition it deserves.

One of the challenges in delivering this paper was gauging the audience’s familiarity with both fields: the historical and methodological context of Stanislavsky’s system of actor training, and the intricate mechanics of diffusion-based generative AI. The room was full of very smart people with extensive experience in and knowledge of ‘Digital Humanities’ (a term whose precise meaning still slightly eludes me), but I was uncertain about how much explanation and contextualisation would be needed for the two areas I was drawing on.

My research had developed through my discourses with the ‘Stanislavsky Studies’ community, where I was accustomed to assuming the audience’s in-depth knowledge of this seminal Russian actor, director, trainer, and theatre impresario, who has shaped Western theatre and cinema acting practices for over a century. The ‘Stanislavskians’ I had engaged with were experts in the field, but I couldn’t make the same assumption about the audience in Reading. Therefore, I felt the need to adjust my presentation to include key information about Stanislavsky, his ‘system,’ and his significant legacy.

Conversely, when presenting to Machine Learning specialists at the University of Surrey and elsewhere, I could freely delve into the technicalities of Denoising Diffusion Probabilistic Models and the complexities of latent spaces and transformer architectures. However, for that particular audience to gain value from my presentation, I was compelled to provide something of a beginner’s guide to acting, emotion, and Stanislavsky’s approach to stagecraft and performative ‘realism’. And that was before I even broached the more advanced topic of ‘Active Analysis’, which was revolutionary even in Stanislavsky’s own iconoclastic contexts.

So, with this meta-analysis concluded, here is a brief and necessarily condensed outline of the paper, presented in a format that I hope will be appropriate and digestible for this medium.

Exploring Digital Acting: Bridging Stanislavsky’s Legacy with AI-Assisted Image Creation

As part of my ongoing research into digital acting, I’ve been fascinated by the unexpected connections between the time-honoured techniques of theatre and the cutting-edge capabilities of generative artificial intelligence (AI). In particular, I’ve been exploring how Konstantin Stanislavsky’s Active Analysis — a lesser-known yet ingenious aspect of his ‘system’ — shares surprising similarities with the way modern AI systems, like text-to-image generators, create visual content.

So, who was Stanislavsky, and what does his work have to do with AI?

Figure 1 – Konstantin Stanislavsky. 3D digital likeness created by the author

Rediscovering Stanislavsky

Konstantin Stanislavsky (1863–1938) was a Russian theatre director and actor, best known for co-founding the Moscow Art Theatre. He developed what’s now known as the ‘Stanislavsky System,’ a comprehensive approach to acting that emphasised psychological realism and emotional truth in performance. A particular component of this system is Active Analysis, a rehearsal technique that’s all about improvisation and iterative refinement. Despite its profound impact, this technique isn’t as widely discussed as other elements of his work.

In simple terms, Active Analysis encourages actors to explore their characters through spontaneous enactments rather than just memorising lines. This method helps actors internalise their roles, allowing for performances that feel authentic and dynamic. It’s about asking, ‘What would I do if I were this character?’ — a question that leads to a performance that feels lived-in and genuine.

Stanislavsky’s approach to acting was revolutionary because it moved away from the rigid, declamatory style of acting that was common in his time. Instead of focusing solely on the external presentation of a character, Stanislavsky urged actors to delve into their own emotions and experiences, using these as tools to bring their characters to life. This technique became the foundation for many acting schools worldwide, although it is important to note that Stanislavsky’s ‘system’ is often misinterpreted and oversimplified in its American adaptation, known as Method Acting, developed by Lee Strasberg.

While Strasberg’s Method Acting emphasises emotional memory and personal experience as the primary tools for an actor, Stanislavsky’s system was broader and more flexible, incorporating a variety of techniques to help actors create believable characters. His work is more focused on the actor’s active engagement with the text, the given circumstances, and the dynamics between characters, rather than solely on the actor’s personal emotions.

The Ingenuity of Active Analysis

Active Analysis represents one of Stanislavsky’s most innovative contributions to rehearsal techniques, yet it remains relatively uncelebrated in discussions about his work. This method emerged as a practical solution to help actors immerse themselves in their roles more deeply. By encouraging actors to improvise around the script’s ‘given circumstances’ and iteratively refine their performances, Stanislavsky was able to foster a deeper connection between the actor and the character, leading to more authentic portrayals on stage.

What makes Active Analysis particularly ingenious is its emphasis on the iterative process. Rather than rehearsing scenes in a fixed, linear way, actors were invited to explore different possibilities, to ‘play’ within the role, and to allow their performances to evolve organically over time. This technique not only made performances more dynamic but also helped actors discover new facets of their characters that they might not have found through more traditional methods.

But how does this century-old method relate to artificial intelligence, you ask?

The AI Connection

As someone deeply interested in ‘digital acting’ (in other words, empowering autonomously animated characters with their own acting capabilities), all things AI are very much in the frame for me. This is one of the many reasons why I have been looking closely at generative AI systems and how they could be used in this context. My goal is to explore how AI can enhance or even replicate the nuanced performances typically associated with human actors.

It was during these investigations that some uncanny resonances emerged between Stanislavsky’s Active Analysis and the processes involved in generative AI text-to-image systems. Tools like DALL-E, Midjourney, Flux and Leonardo.ai, all of which generate images based on text descriptions, rely on a fascinating process called diffusion.

A sequence of images, from 'Data' (in this case, a picture of a cat) to 'Noise'. An arrow from left to right reads 'Fixed forward diffusion process'. An arrow from right to left reads 'Generative reverse denoising process'.
Figure 2 – Denoising Process

Here’s how it works in a nutshell:

  1. Text Encoding and Latent Space: When the AI reads a text prompt, it first converts the prompt into a numerical representation known as an embedding. This embedding is then placed into what’s called Latent Space, a high-dimensional mathematical space where each dimension represents different features or attributes of the data. In Latent Space, concepts that are similar or related are positioned closer together, while those that are different are further apart. For example, if the prompt is ‘a cat riding a bicycle,’ the AI maps this concept into Latent Space where it can capture the essence of both ‘cat’ and ‘bicycle’ and their relationship.
  2. Starting with Noise: The image creation process begins with a random, noisy image — think of it like static on an old TV screen. This noise is also represented in Latent Space, but it initially occupies a position far from the structured concept encoded by the text.
  3. Denoising Process: The AI then begins the process of denoising, where it gradually refines the noisy image by adjusting its position in Latent Space. Each adjustment is guided by the text prompt’s embedding, moving the image closer to the desired outcome. As the image is refined step by step, the AI aligns it with the features and attributes described in the text prompt, until a coherent and recognisable image emerges.

    From an image of a dog to noise, and from noise to an image of a dog.
    Figure 3 – Denoising and Reverse Denoising
  4. LoRAs and Direction: Just as a director might guide an actor’s interpretation of a role, LoRAs (Low-Rank Adaptations, a lightweight fine-tuning technique) can be used to modify and fine-tune the AI’s output to match specific styles or artistic directions. This is akin to how a director shapes the performance to align with a particular vision or style, ensuring that the final output, whether a performance or an image, reflects a cohesive and deliberate creative approach.
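
The denoising loop in steps 2 and 3 above can be sketched as a toy program. This is a minimal illustrative sketch, not a real diffusion model: a genuine system uses a trained neural network to predict and remove noise at each step, whereas here a simple linear update stands in for the guided refinement, and the hand-made `target` vector stands in for the prompt’s embedding in latent space.

```python
import numpy as np

rng = np.random.default_rng(0)

def denoise(target, steps=50, step_size=0.2):
    """Iteratively refine a random latent toward the prompt-conditioned target.

    'target' is a stand-in for the text prompt's embedding; real models
    replace the linear update below with a learned denoising network.
    """
    latent = rng.standard_normal(target.shape)  # step 2: start from pure noise
    for _ in range(steps):
        # step 3: each pass nudges the latent a little closer to the
        # position the prompt describes, mimicking one denoising step
        latent = latent + step_size * (target - latent)
    return latent

target = np.array([1.0, -2.0, 0.5])  # hypothetical prompt embedding
result = denoise(target)
print(np.allclose(result, target, atol=1e-3))  # the noise has converged to the target
```

The key point the sketch preserves is the iterative character of the process: no single step produces the image; each step makes a small, guided improvement on the last.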

Latent Space is a critical concept here because it allows the AI to navigate through a vast landscape of possible images, making informed decisions about how to transform noise into a meaningful visual representation based on the text.
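
The idea that related concepts sit close together in this space can be shown with a toy example. The three-dimensional vectors below are invented purely for illustration; real embeddings are learned from data and have hundreds or thousands of dimensions.

```python
import numpy as np

# Hand-made stand-ins for learned embeddings (illustrative values only)
embeddings = {
    "cat":     np.array([0.90, 0.80, 0.10]),
    "kitten":  np.array([0.85, 0.75, 0.15]),
    "bicycle": np.array([0.10, 0.20, 0.95]),
}

def dist(a, b):
    """Euclidean distance between two concepts in the toy latent space."""
    return float(np.linalg.norm(embeddings[a] - embeddings[b]))

# Related concepts lie closer together than unrelated ones
print(dist("cat", "kitten") < dist("cat", "bicycle"))  # True
```

It is this geometry that lets the denoising process be steered: moving a latent point toward the prompt’s embedding moves the emerging image toward the concepts the prompt names.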

Figure 4 – Representation of Latent Space clustering: a 3D visualisation of different ‘clusters’ in the ‘Embedded Feature Space’ of a ‘long short-term memory based variational autoencoder’ (LSTM-VAE). (Taken from Chung et al. 2019)

Equally important are LoRAs, which enable the AI to fine-tune its outputs to match specific styles or artistic directions, much like a director guiding an actor’s performance to achieve a desired creative vision. This iterative process of refinement, where an image slowly emerges from a chaotic mess of pixels, is strikingly similar to how actors, through Active Analysis, refine their performances from initial improvisations into something coherent and powerful.

Why This Matters to Digital Acting

You might be wondering, why is someone who studies digital acting interested in these parallels between AI and theatre? Well, it’s all about understanding the deeper principles of creativity and performance — whether they’re being carried out by humans or machines.

Both Stanislavsky’s Active Analysis and AI diffusion models rely on iterative refinement. Actors don’t just get it right on the first try. They rehearse, they explore different possibilities, and they adjust their performances until everything clicks. Similarly, AI doesn’t generate a perfect image right away — it gradually shapes the noise into something recognisable, guided by the input text.

In both cases, the process is guided by experience and data. For actors, this means drawing on their personal experiences and emotional responses. For AI, it means relying on vast amounts of training data — millions of image-caption pairs that teach it how to associate words with visuals.

But beyond these technical similarities, what really excites me is the idea that both actors and AI systems are, in a sense, engaging in a form of ‘creative improvisation.’ For actors, this improvisation is rooted in the moment-by-moment decisions they make during rehearsal — decisions that are influenced by their understanding of the character, the context of the scene, and their interactions with other actors. For AI, the improvisation comes from the way it navigates the vast, complex space of possible images, guided by the constraints of the text prompt.

This connection between human and machine creativity opens up exciting possibilities for digital acting. Imagine an AI system that could generate not just images, but entire performances — performances that could then be interpreted and refined by human actors. Or consider how AI-generated imagery could be used to inspire new approaches to character development, set design, or even the writing of scripts.

Figure 5 – Robot Hamlet. 3D model

Expanding the Horizon of Creativity

The intersection of Stanislavsky’s Active Analysis and AI diffusion models also prompts us to rethink what we mean by ‘creativity.’ Traditionally, creativity has been seen as a uniquely human trait — a product of consciousness, emotion, and personal experience. However, the success of AI systems in generating art, music, and now even performances, challenges this notion.

Does this mean that machines can be truly creative? That’s still a matter of debate. But what’s clear is that AI can engage in processes that are remarkably similar to human creativity. By iteratively refining their outputs, guided by data and algorithms, AI systems can produce results that are both novel and meaningful — qualities that are central to our understanding of creativity.

In the context of digital acting, this suggests a future where human and machine creativity are not in competition, but in collaboration. By blending the strengths of both, we could develop new forms of artistic expression that go beyond what either could achieve on their own.

For example, AI could be used to generate a range of possible interpretations of a character, which an actor could then explore and adapt in rehearsal. Or AI-generated environments could provide dynamic backdrops that respond in real-time to the actions of performers, creating a more immersive and interactive theatre experience.

Looking Ahead

As AI continues to advance, the boundaries between machine-generated and human-created art will blur. By exploring the resonances between traditional artistic practices and AI, we can develop new methods of collaboration that benefit both fields. Whether through digital avatars that ‘perform’ with the nuance of a trained actor or AI systems that generate concept art, the potential for synergy is vast.

In conclusion, while Stanislavsky could not have imagined the technological tools we have today, his emphasis on the iterative, experiential nature of creativity resonates deeply with the processes driving AI innovation. As we continue to explore these intersections, we may find that the future of creativity lies not in choosing between human or machine, but in the harmonious integration of both.

For those involved in ‘Digital Humanities’, this is an exciting frontier. We’re blending the traditional with the technological, exploring new ways to create and perform. Whether you’re a theatre enthusiast or a tech aficionado, there’s something truly fascinating about watching these worlds come together.

Welcome (back) to the DH Hub portal

With a new academic year beginning, we’ve prepared a quick run-down of our resources and new portal content.

 

What is DH and the DH Hub?

For those of you who are new to DH and the Hub, Digital Humanities (DH) is the critical study of the intersection between digital technologies, disciplines in the Arts and Humanities, and scholarly communication.

The DH Hub, which is part of the University Library, is your base for DH research support. The Hub team brings together academic and professional services colleagues from across the University. We use our combined expertise to enable DH research in a variety of aspects, including:

  • DH skills and training
  • Project proposals, collaboration and funding
  • Research data management, preservation and sharing
  • Collections-based research and digital artefacts
  • Programming and computing resources

 

What can I find on the DH Hub portal?

You can find answers to key questions which might arise when starting out with DH. These include ‘What is Digital Humanities?’ (for which, see our ‘What is Digital Humanities?’ page); ‘Is my project DH?’ (for which, see our DH FAQs); and ‘How, and why, might I engage with DH?’ (for which, see our ‘Starting out with DH’ guide).

You can see what support is offered by the DH Hub and the DH Community of Practice (CoP). The Hub and CoP work closely together: the CoP promotes awareness of, and engagement with, DH among academics, while the Hub provides practical support for DH research.

You can find out about events that the DH Hub and DH CoP are running. These include networking events, drop-in sessions and methodological workshops.

You can use the resources on our website to develop DH skills, for inspiration, and for help with DH projects. Our resources include links (to courses, projects, software and more); guides (on, for example, Humanities Data Management and Working with Collections); and case studies (to which we will soon be adding).

You can also get in touch with us to see how we can help.

 

What’s new?

Our calendar has been updated with new events including our Back to Campus Networking (23 October) and Humanities Data event (20 November). More of these to come.

We have published a new Digital Humanities Planning Tool, to guide you through technical, legal and ethical questions which might arise when planning a DH project.

Over the summer, the DH CoP held its inaugural conference, on the theme of Digital Humanities and Artificial Intelligence. You can hear all about the day here and catch up on posts by conference speakers in our DH & AI guest blog series.

We have also answered some Frequently Asked Questions relating to both Digital Humanities and the Hub’s resources and support.

 

Something missing?

If there is something you would like to see on the DH Hub portal, please let us know, by emailing digitalhumanities@reading.ac.uk.

“Bringing History to Life”: Animating Historical Portraits with Artificial Intelligence

by Cate Cleo Alexander, University of Toronto

It’s the hot new trend in the historic photo colourization community: AI-generated movement.

Recently, content creators on platforms like YouTube have started to “bring history to life” by using AI programs to colourize, “modernize,” and animate portraits of historical figures. However, this technology often alters historical portraits to reinforce modern gender normativity and fetishization of whiteness — devaluing diversity and accuracy in digitally mediated public history in favour of a simulated intimacy with “attractive” bodies. I have documented this trend through my survey of four content creation teams: Royalty Now, Randomly Digest, Historicly, and Mystery Scoop.

“Colourization”

In the first stage of a “brought to life” style video, the image is colourized and/or enhanced. Content creators often automate this work with AI programs like Gradient. The labelling of these images as simply “colourized” or “enhanced” obscures the true amount of modification.


Screenshots of an AI “enhanced portrait” from Mystery Scoop video “Historical Figures Brought To Life (King Henry VIII, Catherine Howard, Queen Mary I, Verdi, Tesla)”

The screenshots above show the before and after of an AI-enhanced portrait of a “Southern Belle” from a Mystery Scoop video. The subtle differences are made more noticeable through the animated GIF below.

GIF created by Cate Cleo Alexander

Note how the eyes shift upwards, the hairline changes, the shadow under the chin deepens, the cheekbones sharpen, and the highlights and shadows of the face increase in contrast. This is not so much digital restoration as a digital facelift.

Sometimes the changes are less subtle. Historicly, Mystery Scoop, and Randomly Digest all use AI to add makeup to female-presenting subjects. These makeovers range from subtle eyelash extensions and lipstick to a dramatic smoky eye. Often, the content creators who post these images seem unaware or uncaring of the physical limitations of makeup: noses and faces are slimmed, cheekbones change location, eyes are enlarged, and — most problematically — skin is whitened.


Screenshots of an AI “colourized” Alexandra Feodorovna from Historicly’s video “Romanov Family – 10 Striking Photos From Romanov’s Last Ball – Brought to Life”

“Modernization”

One of the most insidious implications of this content is that, in order to be “modern” and “relatable,” historical figures must be proximate to Western ideals of beauty. One prominent example comes from videos featuring the portrait of a once-known woman who is referred to as “Mestiza de Sangley,” an antiquated term for someone of Chinese and Filipino descent. In their video, Randomly Digest posits: “What if she lived today? Loose interpretations with modern make up and hairstyle.” The AI-generated image is barely recognizable. The woman’s face shape, cheekbones, lips, hairline, and body positioning are all shifted, bringing her closer to gender-normative and white-supremacist ideals of beauty.

Screenshot of an AI-generated “modernized” woman from Randomly Digest’s video “Mestiza de Sangley, 1875, Brought To Life”

Animation 

After “colourization” and “modernization,” the final stage of a “brought to life” video is animated movement. Through AI tools like Deep Nostalgia, these uncanny personages tilt their heads, look around, blink, and smile.

Unlike the other channels, RoyaltyNow manually creates portraits and conducts research to make their recreations as accurate as possible. In the example below, RoyaltyNow recreated Mary, Queen of Scots by studying her death mask, other portraits, and contemporary descriptions. However, after all that meticulous work, the image was fed into AI animation software, which distorted it.

Screenshot from RoyaltyNow video showing their original recreation of Mary, Queen of Scots


Screenshot from RoyaltyNow video showing how the recreation has been distorted after being fed into the AI animation program

These AIs homogenise and flatten features, and heighten their contrast, based on the racialized and gendered norms established by their training data. Yet, because the YouTube algorithm rewards posting frequently,[1] content creators are pressured to speed up their workflows with AI automation in order to remain algorithmically visible. Staying on trend is also key to maintaining an audience, which likely influenced RoyaltyNow’s decision to use AI animation. After all, it’s clear that audiences react to this content.

YouTube and Public History

At the time of this writing, the four channels that I reviewed have a combined total of 186,939,185 views.[2] The comment sections are filled with emotional reactions — viewers report tearing up, feeling “moved” or “emotional,” and having chills or goosebumps. As one comment stated, “Who else got chills and a tear in their eye when the Grand Duchess Romanova ‘came to life’? Honestly, when she began to smile and move her eyes around I just got so emotional.”

In her book on queer archives, Ann Cvetkovich describes “the quest for history as a psychic need.” More and more, people are turning to the Internet to satisfy their thirst for learning about the past. There is a desire to create affective intimacy with these historical figures. However, these AI algorithms create that intimacy by contorting and distorting the images, giving the false impression that gender expression is immutable and positioning “modernity” and “relatability” alongside whiteness. Given the substantial impact of this content, this uncanny new practice in digital public history must be investigated further.

 

Endnotes

[1] Probably; it’s tricky to figure out how YouTube’s algorithms work. However, because there’s a lot of algorithmic gossip that frequency is rewarded (see examples 1, 2, and 3), many creators will strive for frequent uploads.

[2] Historicly – 12,260,968 total views; RoyaltyNow – 42,618,816 total views; Mystery Scoop – 130,354,670 total views; Randomly Digest – 1,704,731 total views. Numbers obtained from About section on YouTube Channels, August 27, 2024.

The Digital Humanities Hub has a new look!

Dawn Kanter, Rachel Lewis and Chrissie Willis-Phillips [Digital Humanities Officer, Research Development Manager and Associate Director (Scholarship and Planning), DH Hub team] and Jo Stace [Designer, Creative & Print Services] discuss how we arrived at this new visual identity.

Dawn Kanter (DK): The Hub team first embarked on this branding project with three aims in mind. We wanted to help colleagues to recognise the Digital Humanities Hub; to tie the various things that the Hub is involved in (such as events, presentations and resources) together; and to tie ourselves more closely to the Library, where we are based.

After meeting with the Creative & Print Services team, we put together a list of keywords that could inform the imagery in our branding. These included ‘Digital Humanities’, ‘Arts and Humanities’, ‘Digital Technology’, ‘Interdisciplinarity’, ‘Research Innovation’, ‘Collaborative Research’, ‘Open Research’… I didn’t think that it would be easy to come up with designs based on these quite abstract terms! We were all very excited when we saw the initial concepts.

Jo Stace (JS): From the initial brief I knew that I wanted to create a graphical identifier – something that’s not exactly a logo, but that can be used to identify the Hub. It needed to be something that gave a sense of what the Hub does, but that also looked visually pleasing. This bespoke graphic could then be used across various materials to create a visual identity. I also thought it was important to find imagery that could be used by the Hub to give the feeling of data and digital technology. This process started with a search of our stock image library.

Additionally, some elements of the design needed to follow particular guidelines. The typography needed to fit within University branding, and the colours needed to come from the Library ‘rainbow’ branding to create a link between the Hub and the Library.

 

Left: UoR Library Lockup (reads 'University of Reading Library'). Right: UoR Library rainbow block

UoR Library branding

 

JS: With all this in mind I developed three initial concepts:

Concept 1 – Analogue to Digital: This took the idea of going from an analogue form to a digital form. I wanted to represent this with lines that start off ‘hand drawn’, then move into a precise digital shape which forms a circuitry graphic.

Concept 2 – Network and Nodes: For this concept I wanted to play on the idea of connection, both in terms of the Hub connecting people and their work, and the idea of data connection and digitalisation.

Concept 3 – Pixel Patterns: This third concept looked at using basic shapes to represent pixels (again focusing on the digital element of the Hub’s work), but creating a flow with the shapes to reflect the human nature of data.

 

Three concepts for graphical identifiers for the DH Hub

 

DK: The Network and Nodes concept was a favourite with the Hub team.

Rachel Lewis (RL): From the beginning, the Hub’s role has been as a focal point that connects lots of different people and teams. This design represents who we are and what we do.

DK: We also enjoyed the relationship between the Pixel Patterns and the Library’s rainbow block. So we asked Jo if we could proceed with both concepts (2. Network and Nodes and 3. Pixel Patterns).

JS: Once the two options had been chosen, I wanted to develop imagery to fit with the graphical element. It became clear that the images from the stock library were too dark for this project and sometimes too generic.

DK: We wanted to strike the right balance between specific (e.g. images of 3D-modelled objects or text markup) and general (e.g. abstract representations of technology).

JS: I felt that a more bespoke graphic was needed. Starting with stock images, I was able to manipulate these and introduce the Library colours. I then had a graphic that I could use across multiple assets, and which the Hub team could use to create a visual link between their work.

Custom imagery for the DH Hub

 

DK: This imagery that Jo designed matches the concept of ‘Hub’ or ‘Network and Nodes’ in the graphical identifier.

RL: It also has the look and feel of the output of social network analysis, so it ties in with DH research as a theme too.

DK: Looking at the graphics alongside the imagery, the Hub team chose to move forward with this Network and Nodes concept.

JS: From this point I was able to produce mock-ups showing how the visual identity would work on pull-up banners, leaflets, web banners and presentations.

 

Banner reads: 'Digital Humanities Hub' 'Enabling research innovation by providing sustainable support for Digital Humanities'     
Mock-ups of a pull-up banner and web banner

 

DK: It was very exciting for the Hub team to see these mock-ups. Some of the designs also included a new idea: a handwritten text element (alongside the Network and Nodes motif), which reminded us of some of the primary sources that Digital Humanists might be using.

 

Leaflet reads: 'Digital Humanities Hub' 'Enabling research innovation by providing sustainable support for Digital Humanities'   
Left: Mock-up of a leaflet. Right: Zoom-in on handwritten text element in background

 

RL: I really liked the idea of including primary sources, and I suggested (as did Sue Egleton, University Librarian) using a manuscript from our own collections for this purpose – highlighting both the richness of our Special Collections and the Hub’s relationship with UMASCS (University Museums and Special Collections).

DK: Caroline Gould, UMASCS’ Principal Archivist, suggested a few different manuscripts that we might use. Taking into account its appearance, content, and rights restrictions, we chose this one, from the Epistolae Morales ad Lucilium (Moral Letters to Lucilius) of the Ancient Roman Stoic philosopher Seneca. The manuscript was produced in Italy (or South Germany?), c.1475.

Including this in the branding was not straightforward! It initially took me some time to find the part of the text which is about joy (letter 27) as opposed to the part which is about death (letter 26). There is also a scribal error in the text itself, which we wanted to avoid. Many thanks to Rachel Mairs (Professor of Classics and Middle Eastern Studies) for reviewing the text below.

 


Website banner incorporating the manuscript leaf


Manuscript leaf from Seneca (Epistolae Morales ad Lucilium), in Latin, produced in Italy (or South Germany?), c.1475 reference MS 5650/23, European Manuscripts Collections, Special Collections, University of Reading (detail)

‘Cast about rather for some good which will abide. But there can be no such good except as the soul discovers it for itself within itself. Virtue alone affords everlasting and peace-giving joy; even if some obstacle arise, it is but like an intervening cloud, which floats beneath the sun but never prevails against it. When will it be your lot to attain this joy? Thus far, you have indeed not been sluggish, but you must quicken your pace. Much toil remains; to confront it, you must yourself lavish all your waking hours, and all your efforts, if you wish the result to be accomplished. This matter cannot be delegated to someone else.’ Translation by Richard M. Gummere

 

DK: The branding exercise also prompted the Hub team to re-examine our tagline. We settled on ‘Enabling research innovation by providing sustainable support for Digital Humanities’. ‘Enabling research innovation’ is the motivation that underpins all that the Hub does, while ‘sustainable support’ means (to me) support that is integrated and embedded. For example, members of the Hub team are involved in the grant development process and in decisions about digital infrastructure.

RL: One of the strands of the original DH Hub setup project was about sustainability, so we’ve built a sustainable service from the start. Enabling sustainable research is key to everything that Robert Darby and Guy Baxter, in the Hub team, do around data preservation and storage, and that we all do around ensuring that research projects are planned and conducted in a way that enables the research and outputs to be accessible long term.

Chrissie Willis-Phillips (CWP): It was also important that this tagline aligned well with the aims and ethos of the Library; both enabling research innovation and sustainability are strategic goals that fit with the Library’s mission to support research and teaching by providing services that reach all communities at the University. The tagline and the overall work of the DH Hub also contribute to the University’s strategic aim to achieve excellence and sustainability. The production and promotion of innovative, open, and reproducible research is a key element of achieving this aim.

DK: The branding exercise prompted other kinds of reflections on the Hub and its work too.

RL: I was blown away when I first saw the Hub branding on the postcards and I felt a little bubble of pride for the whole team! We have worked together on the DH Hub project over the last three years (and a lot of work prior to that!) and it is amazing to see this representation of all the work we’ve done to get from having no formal DH support to a cohesive and productive team. Huge thanks to everyone in the team, especially Roberta Gilchrist for having the initial vision, and continuing to drive the project forward.

DK: We are thrilled to have updated our website with the new branding, and look forward to using it at the events we run, in the presentations we give, and in the resources that we produce.

It has been a great experience working with Jo and Creative & Print Services on this. Thank you, Jo, Sue, Caroline, Rachel, and all our colleagues who contributed!

 

Postcard (front) reads: 'Digital Humanities Hub' 'Enabling research innovation by providing sustainable support for Digital Humanities'    Postcard (back)
Printed and ready-to-use DH Hub postcard

Banner reads: 'Digital Humanities Hub' 'Enabling research innovation by providing sustainable support for Digital Humanities' 'research.reading.ac.uk/digitalhumanities'
Printed and ready-to-use DH Hub banner

A Chaucer AI Co-Pilot? Reflections on the Implications of LLMs for Learning about Medieval Texts and Beyond

by Sophie Whittle, Digital Humanities Institute, University of Sheffield

 

Introduction

In June, I had the pleasure of attending the inaugural DH and AI conference at Reading. Since then, I have been reflecting on AI’s implications for higher education, as well as wider communities.

I began investigating Large Language Models (LLMs) as part of the C21 Editions team, exploring AI’s ability to produce a prototype digital teaching edition of Chaucer.[1] The initial aim of the project was to explore the capabilities of machine-assisted technology more broadly, merging the curatorial and statistical aspects of historical editing.[2]

From 2023 onwards, we wanted to gauge whether LLMs could reliably interpret historical textual traditions and produce engaging digital teaching resources. The aim was to maintain traditional aspects of editing, including diplomatic transcriptions, detailed annotation of the text and essays on scholarly interpretations, while improving the efficiency of the editorial process and meeting students’ expectations of interactivity.

DALL-E 3 creation via Microsoft Bing Copilot. Prompt: Generate an image of Chaucer as an AI robot in the twenty first century, including some medieval elements into his design (28 August 2024).

 

Why Chaucer and The Pardoner’s Prologue & Tale?

Geoffrey Chaucer, the author of the 14th-century text The Canterbury Tales, was named ‘the father of English poetry’ by John Dryden in 1700. Yet recent criticism suggests that Chaucer’s texts promote hegemony by upholding white, patriarchal values (e.g. see recent evidence about Chaucer’s lawsuit, which initially prompted discussion about possible sexual assault).

Sarah Baechle and Carissa Harris make the interesting point that the ethical questions raised by Chaucer’s inclusion on the university curriculum lead to a deeper critique of historical attitudes to consent, race, and societal binaries and categories.[3] Ruth Evans also draws attention to expanding and shifting conversation within Chaucer studies, through modern criticism, adaptation, and reparation.[4] For instance, Chaucer’s Wife of Bath has been adapted for the stage through Zadie Smith’s Jamaican-born and British Wife of Willesden.

The Pardoner from the Ellesmere manuscript of The Canterbury Tales. Public domain.

The fluidity of the Pardoner’s desires has sparked interest in his representation of medieval queerness, despite strict Catholic opinion against ‘sodomy’.[5] Students and teachers in our focus groups frequently stated that learning about Chaucer and his texts led to further research on global medieval literature, racism in the Middle Ages, and gender and sexuality as a spectrum. How, then, does AI fare when prompting its interpretation of medieval texts, in ways that address the history of Chaucer’s reception?

 

Benefits and Limitations of AI for Interpreting Chaucer’s Texts

LLMs such as OpenAI’s ChatGPT could only provide surface-level summarisation and description, as might be expected given the age of medieval literature. Without significant editorial intervention, and additional guidance from open-access data such as Wikipedia, outputs were sparse and the process of prompting adequate responses was slow. Below is a recent example output from the prompt, ‘Summarise the theme of gender and sexuality, as well as gender fluidity, queerness, and masculinity, in Chaucer’s Pardoner’s Prologue and Tale, to a university level standard with references.’

 

2. Gender Fluidity and Queerness

The Pardoner’s self-presentation and ambiguous identity play with ideas of gender fluidity and queerness. Throughout his prologue, the Pardoner appears to relish his liminal status, and his description, behavior, and performance can be seen as a critique of the rigid gender roles of his time. As an individual who occupies a space between established categories of male and female, the Pardoner embodies a form of gender fluidity that resists traditional binaries.

Moreover, the Pardoner’s association with relics and his role as a religious figure allow him to blur the lines between the sacred and the profane. His very profession involves deception and manipulation, which parallels his fluidity of identity. This sense of indeterminacy is further emphasized by his admission of deceit in selling indulgences: he is a performer who knows how to manipulate appearances to his advantage, much like how he manipulates his own identity.

The Pardoner’s queerness can also be seen in his relationships with other pilgrims. His close association with the Summoner, another morally ambiguous figure in the Tales, has led some critics to suggest a homoerotic subtext. The Pardoner’s own words at the end of his prologue – inviting the pilgrims to kiss his relics or the “bones” – carry a sense of homoerotic double entendre (Rigby, The Pardoner’s Tale and Gendered Power).

 

3. Masculinity

The Pardoner’s character also interrogates medieval constructions of masculinity. In his ambiguous gender presentation and his rejection of traditional masculine roles, he becomes a figure who challenges the conventional ideal of the male Christian pilgrim. The Pardoner’s narrative voice in his prologue is marked by a conscious subversion of these ideals: he mocks the idea of piety and openly confesses to exploiting the gullibility of his audience for personal gain.

Furthermore, his tale, which is a moral narrative about three rioters who meet their end due to their greed, can be read as a commentary on masculine excess. The rioters represent unchecked masculine aggression, gluttony, and a pursuit of power – vices that lead to their destruction. In this sense, the Pardoner’s tale becomes a critique of the very ideals of masculinity that he himself eschews.

 

References:

    • Chaucer, Geoffrey. The Canterbury Tales. Edited by Jill Mann, Penguin Books, 2005.
    • Dinshaw, Carolyn. Getting Medieval: Sexualities and Communities, Pre- and Postmodern. Duke University Press, 1999.
    • Rigby, Stephen H. “The Pardoner’s Tale and Gendered Power.” Chaucer and Gender, ed. Elaine Tuttle Hansen, Oxford University Press, 1992.

Extracts of the output from the publicly available ChatGPT-4o, on the gender and sexuality of the Pardoner (28 August 2024).

 

Interestingly, the response draws on recent notions within queer theory, explaining that the Pardoner’s desires challenge the ‘rigid gender norms’ of medieval society, with his perceived ‘sinful’ actions at odds with church beliefs. However, the output has several problems: 1) it refers to an irrelevant relationship with the Summoner pilgrim, who does not feature in The Pardoner’s Tale; 2) it does not acknowledge the Pardoner’s tension with the Host, which crucially reflects hypermasculinity; and 3) the few references it cites were all published in 2005 or earlier. The referencing itself is also inaccurate: Hansen’s book is actually titled Chaucer and the Fictions of Gender (and Rigby’s work does not feature in it), and an entire edition of The Canterbury Tales is cited as secondary research, without clearly stated page numbers.

The output sentence, ‘His very profession involves deception and manipulation, which parallels his fluidity of identity’, also lacks the nuance about medieval sin and society found in recent academic scholarship. Deception and manipulation need to be analysed as what medieval society perceived as ‘sin’ and ‘deviance’, whereas a modern reader would more concretely associate them with his selling of false relics. From the output, a user might assume that the LLM associates queerness and fluidity with ‘sin’ from a contemporary perspective, rather than recognising this as a historical perception. Nor does the output reflect the substantial recent scholarship celebrating the Pardoner’s queer identity, his ability to outsmart the other pilgrims and his audiences, and the reclamation of power when he challenges the Host. These contemporary themes attract students’ attention today, and such linguistic nuances have real implications for their learning about the medieval period.

 

What are the Implications for Teaching and Beyond?

As institutions are beginning to communicate to their students, GenAI outputs must be unravelled rather than accepted at face value. In particular, AI use must be paired with knowledge from well-informed teachers, alongside authoritative secondary research. These outputs may be suitable for overviews of medieval tales, but they must be investigated for their suitability and trustworthiness. Baechle and Harris also suggest that, to avoid ‘deifying’ popular medieval writers such as Chaucer, students must be equipped to interrogate historical attitudes.[6] The same goes for examining Chaucer’s literature through AI tools.

 

Images of the prototype interface for The Pardoner’s Prologue and Tale, by the C21 Editions project. The first is the main landing page, and the second is one of the discussion boards for students to comment on aspects of the edition.

The screenshots above present our current prototype. Alongside interactive aspects, we have included activities and discussion boards for further interrogation of, and collaboration on, Chaucer’s works and the AI technology used in the edition. These activities are crucial for embedding critical perspectives on AI into learning, which must occur alongside guidance on its implications for society more widely.[7]

I end this post with some of these societal implications. As Bender et al. note, these models often encode hegemonic views, resulting in outputs that have the potential for misuse.[8] The anthropomorphism of AI – the images of humanoid AI robots completing everyday tasks, like the image of ‘AI Chaucer’ above – leads to the belief that LLMs have the capacity for human ‘intelligence’. The creation of departmental resources to improve critical AI literacy, such as Rutgers’ AI Council living document, equips teachers with an understanding of the practical and ethical implications of AI, so that they can make informed decisions about its risks. Such approaches must be embedded into university learning across all disciplines, particularly for engaging and sparking interest in Chaucer studies.

 

[1] The C21 Editions project was generously funded by UKRI-AHRC and the Irish Research Council under the UK-Ireland Collaboration in the Digital Humanities Research Grants (grant numbers AH/W001489/1 and IRC/W001489/1). The collaboration is between the Digital Humanities Institute (University of Sheffield), the University of Glasgow, and the Department of Digital Humanities at University College Cork.

[2] Katherine Bode’s discussion on bridging the curatorial and statistical aspects of DH studies can be found at: https://critinq.wordpress.com/2019/04/02/computational-literary-studies-participant-forum-responses-day-2-3/

[3] See: Baechle, Sarah, and Carissa M. Harris. ‘The Ethical Challenge of Chaucerian Scholarship in the Twenty-First Century’. The Chaucer Review 56, no. 4 (2021): 311–21.

[4] See: Evans, Ruth. ‘On Not Being Chaucer’. Studies in the Age of Chaucer 44, no. 1 (2022): 2–26.

[5] See: Dhouib, Mohamed Karim. ‘De/Stabilizing Heterosexuality in the Pardoner’s Tale’. International Journal of Language and Literary Studies 3, no. 4 (5 December 2021): 154–66.

[6] Baechle, Sarah, and Carissa M. Harris. ‘The Ethical Challenge of Chaucerian Scholarship in the Twenty-First Century’. The Chaucer Review 56, no. 4 (2021), p. 318.

[7] A more in-depth analysis of AI in pedagogical contexts can be found on ‘The AI Hype Wall of Shame’, by Katie Conrad and Lauren M. E. Goodlad.

[8] Bender, Emily M. et al. ‘On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜’. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, p. 616.

Developing AI-based Agents for Automating Linguistic Research Workflows: Fine-tuning Corpus Annotators as an Example

by Abdulrahman A. A. Alsayed
University of St Andrews, United Kingdom; King Faisal University, Saudi Arabia

Recent strides in generative AI development and the emerging abilities of pre-trained models (Wei et al., 2022) hold significant potential to automate tasks and alleviate hurdles in research workflows. Research teams can train AI-based agents to which they delegate various components of the research pipeline. Developing these agents involves fine-tuning a general-purpose foundation model so that it can handle specialized research tasks and work autonomously. Such agents have proven effective at reducing the demands of labor-intensive tasks, both within research contexts and beyond (Turow, 2024; see Cheng et al., 2024 for an overview).

Concept art generated with Stable Diffusion. Prompt: ‘a central AI bot depicted with a marker in hand busily working with a volume of spreadsheets’

Linguistic work based on empirical studies of corpora often requires highly annotated datasets tagged with specialized linguistic information, which, depending on a project’s scale, can prove labor-intensive to produce manually. In my current project, code-named CEERR (the Communicative Events Extended Reality Repository), I seek to establish a platform for documenting and analyzing the natural syntax of spoken language in modern varieties produced within immersive spatial environments. One of the aims of developing this platform is the seamless incorporation of linguistic corpus tools, AI APIs, and virtual reality (VR) immersive experiences (used as visual stimuli for speakers) into one ecosystem. The project’s motivation is to enhance language documentation methods to fully capture communication environments and facilitate collaboration between speech communities and language experts in simulated virtual fieldwork settings. One of the project’s important outputs is the production of high-quality annotated datasets with audiovisual samples for low-resource language varieties that have seen little documentation work. Alsayed (2023) provides an overview of the project’s proof-of-concept and its data collection methodology.

A participant responding to visual stimuli during virtual reality-mediated linguistic fieldwork

Audiovisual corpora of spoken language and corpora of low-resource languages are especially challenging to compile into abstractable datasets. Audiovisual corpora are inherently data-rich, encapsulating textual, auditory, and visual modalities, while low-resource languages lack the reference corpora and dictionaries that can facilitate their annotation. Combined, these factors can make creating and annotating such datasets a significant bottleneck, especially when large-scale annotation work is required. One technique I explored to alleviate the hurdle in textual annotation was fine-tuning large language models (LLMs) to tag text with linguistic annotations (see Naveed et al., 2023 for an overview of fine-tuning techniques). The fine-tuning process involves prompt engineering system instructions, utilizing in-context learning, and preparing demonstration datasets tailored to the desired tasks throughout the project’s stages. The emerging classification abilities of LLMs make them excellent candidates for training as linguistic annotators of corpora through these fine-tuning techniques.
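As a rough illustration of the in-context-learning side of this setup, the sketch below assembles a few-shot annotation prompt from system instructions and a small demonstration dataset. The tag set, tokens, and glosses are hypothetical placeholders, not the CEERR project’s own materials.

```python
# Hypothetical sketch: assembling a few-shot annotation prompt for an LLM.
# The tag set and example tokens are illustrative, not the CEERR tag set.

SYSTEM_INSTRUCTIONS = (
    "You are a corpus annotator. Tag each token with one label from this "
    "tag set: N (noun), V (verb), ADJ (adjective). "
    "Return one 'token<TAB>tag' pair per line."
)

# Demonstration dataset for in-context (few-shot) learning.
DEMONSTRATIONS = [
    ("alwan", "N"),    # 'colors'
    ("kabir", "ADJ"),  # 'big'
]

def build_prompt(tokens):
    """Combine system instructions, demonstrations, and new input
    into a single few-shot prompt string."""
    shots = "\n".join(f"{tok}\t{tag}" for tok, tag in DEMONSTRATIONS)
    new_input = "\n".join(tokens)
    return (
        f"{SYSTEM_INSTRUCTIONS}\n\n"
        f"Examples:\n{shots}\n\n"
        f"Annotate:\n{new_input}"
    )

print(build_prompt(["bayt", "yaktub"]))
```

In a real pipeline, the string returned by `build_prompt` would be sent to the chosen LLM API, with the demonstrations drawn from the manually annotated prototype corpus.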

In August 2023, manual annotation work on a sample documented in the CEERR project resulted in the creation of a prototype system to assist with the automated linguistic tagging of further documented samples. The prototype features LLM functionality powered by the LangChain framework, and a graphical user interface built with the PyQt5 library. The system is built to be compatible with various LLM APIs and incorporates fine-tuning instructions and demonstration datasets customizable for different annotation needs. The system can be further configured to allow for model selection, hyperparameter adjustment, and tag set additions as required. Users can feed corpora into the system manually or through file upload, and the generated completion output is displayed through a text editor with options to export the data in multiple formats compatible with a variety of spreadsheet and corpus editing software.
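As a minimal sketch of the export step, the snippet below serializes annotation rows to tab-separated text that common spreadsheet and corpus software can open. The column names mirror the project’s annotation tiers; the example row is illustrative.

```python
# Hypothetical sketch of the export step: writing annotated tokens as TSV.
import csv
import io

FIELDS = ["Token", "Transliteration", "Translation", "Form", "Visual", "Referent"]

rows = [
    {"Token": "الوان", "Transliteration": "alwan", "Translation": "colors",
     "Form": "N", "Visual": "SC3SQ05", "Referent": "R3: Notepad and Crayon (7)"},
]

def export_tsv(rows, fields=FIELDS):
    """Serialize annotation rows as tab-separated text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, delimiter="\t")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

print(export_tsv(rows))
```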

Rows include: Token (e.g. الوان), Transliteration (e.g. alwan), Translation (e.g. colors), Form (e.g. N), Visual (e.g. SC3SQ05), Referent (e.g. R3: Notepad and Crayon (7))
Dataset annotations displayed in the EXMARaLDA Partitur Editor

The prompts for the system instructions were engineered iteratively based on the project’s prototype corpus and initial tests of the annotation instructions. The instructions assign the task required from the agent, provide it with the desired annotation tags, and specify the output format. The instructions underwent testing with multiple LLM models and hyperparameter configurations to engineer prompts that generated satisfactory annotations. Following prompt engineering, a demonstration dataset of the desired completions was manually created to further fine-tune the LLM models through few-shot learning. Few-shot learning helps mitigate hallucination-related errors in the tagged completion. Each model’s performance after the fine-tuning process was evaluated, and the best-performing model was selected as the system’s default. A human-in-the-loop approach was also employed to manually correct the generated annotations and ensure accuracy.
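The model-selection step described above can be sketched as a simple accuracy comparison against a manually corrected gold standard. The model names, tags, and scores here are purely illustrative.

```python
# Illustrative sketch: scoring each fine-tuned model's generated tags
# against a gold standard and selecting the best performer as default.

gold = ["N", "V", "N", "ADJ"]  # manually corrected (human-in-the-loop) tags

model_outputs = {
    "model-a": ["N", "V", "N", "N"],    # one hallucinated tag
    "model-b": ["N", "V", "N", "ADJ"],  # matches the gold standard
}

def accuracy(predicted, gold):
    """Fraction of tokens whose generated tag matches the gold tag."""
    matches = sum(p == g for p, g in zip(predicted, gold))
    return matches / len(gold)

scores = {name: accuracy(tags, gold) for name, tags in model_outputs.items()}
best_model = max(scores, key=scores.get)
print(best_model, scores[best_model])
```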

On the visual side, an ongoing component of the project is exploring the use of computer vision (CV) for detecting the objects that speakers see in virtual environments. This technique is useful for determining what visual elements were part of the speaker’s immersive experiences at any given time, and time-aligning visual stimuli with speech production. This mapping of visual input and linguistic output offers a record of the spatiotemporal setting that the speaker was experiencing at each moment of speech to assist in mitigating the ephemerality of the communicative setting.
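A minimal sketch of this time alignment, assuming detections and utterances each carry start/end timestamps in seconds (the labels and times below are invented), is an interval-overlap check:

```python
# Hypothetical sketch of time alignment: finding which detected objects
# were visible during each utterance by intersecting time intervals.

detections = [
    ("notepad", 0.0, 5.0),  # (label, start_s, end_s) from the CV feed
    ("crayon", 3.0, 8.0),
]

utterances = [
    ("alwan", 4.0, 4.5),    # (token, start_s, end_s) from the transcript
]

def visible_during(utt_start, utt_end, detections):
    """Return labels whose detection interval overlaps the utterance."""
    return [label for label, d_start, d_end in detections
            if d_start < utt_end and utt_start < d_end]

for token, start, end in utterances:
    print(token, visible_during(start, end, detections))
```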

Adding annotations to the dataset informed by visual input starts with identifying all the objects contained in a stimulus virtual environment (a scene) and the actions they are involved in (a sequence). Since the environments are replicated across multiple speakers, using CV to acquire information about the visual elements in a scene helps resolve the bottleneck of manually detecting these objects in the visual captures of each individual participant’s VR feed. Currently, the identified objects are used in LabelBox as terms in a custom ontology and labeled in a sample feed through a bounding-box interface, for use with CV models that support custom ontologies. A future step in the project is to explore how multi-agent systems can enable linguistic and visual annotator agents to work in tandem, providing time-aligned visual/linguistic annotations while exchanging information that aids each agent’s tasks.

A website interface with multiple panels including a selectable list of annotation terms and their codenames (SC1(3) Accordion, for example), a video player with video control buttons, and a video timeline with a movable frame marker.
LabelBox computer vision custom ontology and annotation interface

We have only started to explore the potential of integrating generative AI-based agents into linguistic research. As highlighted in this post, this integration shows promise for documenting and processing audiovisual corpora and low-resource spoken languages. This line of enquiry still holds significant potential for improving the efficiency and quality of research outputs in the field and remains largely unexplored.


AI as a Cataloguing Aid? Towards a Workflow for Cultural Heritage Accessibility and Engagement

by José Pedro Sousa
Centre for Theatre Studies, University of Lisbon / Library Services, Imperial College London

Introduction

Archival research is a gateway to the past, offering invaluable insights into history, culture, and society. However, the path to accessing these treasures is often obstructed by various challenges, from ambiguous catalogue descriptions to difficulties in reading historical handwriting. While these obstacles are well documented, the focus of this discussion shifts towards the innovative solutions that are beginning to emerge, particularly in the realm of AI-driven cataloguing.

Inaccessible handwriting. Contract ref. code PT-ADLSB-NOT-CNLSB3-001-001-00113 ff. 93-94v, National Archive Torre do Tombo (Portugal)

1. Critically Embracing Technology: Transkribus and the Archives

The advancement of Handwritten Text Recognition (HTR) technology has opened new possibilities for improving access to archival records. A prime example is Transkribus, a tool that has gained significant attention for its ability to transcribe historical manuscripts (see Milioni 2020; Nockels et al 2022). Melissa Terras (2022), while acknowledging the benefits of the tool for libraries and archives, warns that it must be used critically. Terras notes the need for 1) “ensuring a plurality of voices” (2022: 142) to be digitised and transcribed, 2) planning “how HTR can become embedded into existing digitisation and information workflows” (2022: 142), and 3) establishing standards for data handling and “openly publishing datasets” (2022: 143).

2. Methodology: Engaging Stakeholders for Effective Implementation

Integrating HTR technology into archival workflows also requires a deep understanding of the needs and concerns of those involved. To this end, I conducted fieldwork and interviews with professionals across several institutions, including university libraries, archives, and national libraries. The goal was to explore the feasibility of using HTR as a cataloguing aid and to understand how it could be effectively implemented.

The research focused on five key areas: cataloguing, collections management, digitisation, conservation, and digital archives management. Through this process, it became clear that a collaborative approach is essential. By bringing together heads of collections, archivists, cataloguers, conservators, and digitisation specialists, we can develop a workflow that not only improves efficiency but also ensures that the integrity and accessibility of archival records are maintained.

3. Workflow: A Collaborative Approach to AI-Driven Cataloguing Aid

Based on the insights gained from fieldwork and interviews, I proposed a comprehensive workflow for incorporating HTR technology into the cataloguing process. It begins with the careful selection of a collection or corpus of documents to be catalogued, considering issues such as linguistic and cultural representativeness and subject-matter diversity and inclusion. This selection must be followed by an assessment of the physical stability of the documents. Once the collection or corpus is chosen, the process can begin; at this point, even physical proximity between the storage area, cataloguing office, and digitisation studio was recommended. While conducting this research, I contributed to the UCL-V&A Chinese Export Watercolour project, directed by Jin Gao, where these conditions were met, making the whole process much more agile (see Gao et al 2023).

If an item is deemed stable, it proceeds to the digitisation phase, where it is transcribed using HTR technology. If it is unstable, it is kept in the conservation studio to be stabilised for digitisation.

The resulting digital transcription is then used to enrich the metadata, providing additional context and facilitating easier access to the content. The use of HTR may not only accelerate the cataloguing workflow but also enhance the quality of the information available to the public.
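The routing logic of the workflow described above can be sketched in a few lines of Python. The stage names and item fields are illustrative labels for the steps in this post, not the interface of any real collections-management system.

```python
# Sketch of the proposed workflow routing: stable items go straight to
# digitisation and HTR transcription; unstable items pass through the
# conservation studio first. Stage names and fields are illustrative.

def route_item(item):
    """Return the ordered stages an item passes through in the workflow."""
    stages = []
    if not item["stable"]:
        stages.append("conservation")  # stabilise before digitisation
    stages += ["digitisation", "htr_transcription", "metadata_enrichment"]
    return stages

stable_path = route_item({"ref": "f. 93", "stable": True})
fragile_path = route_item({"ref": "f. 94v", "stable": False})
```

Making the conservation step an explicit branch keeps the decision auditable: every item ends with metadata enrichment, but fragile material is never digitised before it has been stabilised.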

4. Outcomes: Enhancing Accessibility and Engagement

The integration of HTR technology into archival cataloguing offers several significant benefits. One of the most impactful is the improvement in accessibility. Digitised and transcribed records can be accessed by a wider audience, including people with disabilities, the elderly, and those with limited financial resources. This democratisation of knowledge is a crucial step towards making archival materials more inclusive.

Moreover, the use of crowdsourcing in the transcription process can further enhance accuracy while fostering community engagement. By involving the public in the digitisation and transcription efforts, archives can attract new audiences and strengthen ties with their communities.

5. Limitations and Future Directions

While the potential of HTR technology is promising, it is not without its challenges. The current workflow relies on existing language models, which may introduce bias if not carefully managed. Future projects should consider the development of specific models tailored to diverse and inclusive collections. Additionally, the process requires a strategic allocation of human and financial resources to ensure its sustainability.

Another critical next step is the empirical evaluation of the proposed workflow. Implementing a trial period with a sample corpus will provide valuable insights into the effectiveness and efficiency of the process, allowing for adjustments and improvements before wider adoption.

Conclusion

The integration of AI-driven tools like Transkribus into archival cataloguing is a significant step forward in addressing the accessibility challenges that have long plagued researchers. By fostering collaboration among archival professionals and embracing innovative technologies, we can begin to unlock the full potential of our shared history. While there are still challenges to overcome, the path ahead is one of exciting possibilities for enhancing access to archival records and making them available to a broader, more diverse audience.

AI-generated image in Bing 19-08-2024
Prompt: archivist lady holding a tablet, digitisation officer, books conservator, special collections manager and Artificial Intelligence looking at an old manuscript, in a medieval archive with modern technology such as holograms, a cute and friendly robot, high tech digitisation machine

Bibliography

Gao, J., Zhang, Y., Wang, L., Li, Y., Li, F., Chang, S., Liu, J. (2023) “Unveiling the V&A Chinese Export Watercolours (CEW) Collection: A Journey of Digitisation and Discovery”. Available at: https://blogs.ucl.ac.uk/dis-blog/2023/09/08/cew/ (accessed: 9-8-2024).

Milioni, N. (2020) Automatic Transcription of Historical Documents. Transkribus as a Tool for Libraries, Archives and Scholars. MA dissertation, Uppsala Universitet. Available at: https://www.diva-portal.org/smash/get/diva2:1437985/FULLTEXT01.pdf (accessed: 9-8-2024).

Nockels, J., Gooding, P., Ames, S. and Terras, M. (2022) “Understanding the application of handwritten text recognition technology in heritage contexts: a systematic review of Transkribus in published research”, Archival Science, 22, pp. 367-392. doi: https://doi.org/10.1007/s10502-022-09397-0 (accessed: 9-8-2024).

Terras, M. (2022) “The role of the library when computers can read: Critically adopting Handwritten Text Recognition (HTR) technologies to support research” in Wheatley, A. and Hervieux, S. (eds), The Rise of AI: Implications and Applications of Artificial Intelligence in Academic Libraries. Atlanta: ACRL – Association of College & Research Libraries, pp. 137-148. Available at: https://www.research.ed.ac.uk/en/publications/the-role-of-the-library-when-computers-can-read-critically-adopti (accessed: 9-8-2024)