On October 7, 2025, the IT Section and the Source Publication Section of the Association of Hungarian Archivists held their autumn training session at the Budapest City Archives. The event focused on the preparation and publication of digital scholarly editions, as well as on the use of semantic web technologies and AI-based tools in archival and research contexts. The professional partners of the programme were the Institute for Literary Studies of the ELTE Research Centre for the Humanities and the National Laboratory for Digital Heritage.
The training offered a comprehensive overview ranging from the theoretical foundations of digital philology to the most recent applications of artificial intelligence. Zsófia Fellegi opened the programme with her course “Creating and Publishing Digital Scholarly Editions”, which presented the history of digital philology and the application of international standards such as the Text Encoding Initiative (TEI). She outlined major Hungarian projects – including DigiPhil and the ITIdata semantic database – and emphasized the importance of the FAIR principles in data management. Fellegi underlined that digital editions are not merely digitized versions of print publications but autonomous scholarly products designed to support research and data enrichment.
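
To make the role of TEI more concrete, the following minimal sketch builds a tiny TEI-style document with Python's standard library. It is an illustration only, not material from the course: the helper names `tei` and `build_minimal_tei`, the sample title, and the sample person name are all invented here. Only the TEI namespace and the element names (`teiHeader`, `fileDesc`, `titleStmt`, `publicationStmt`, `sourceDesc`, `text`, `body`, `p`, `persName`) come from the TEI standard itself.

```python
import xml.etree.ElementTree as ET

# The official TEI namespace; everything else below is illustrative.
TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)


def tei(tag):
    """Return the namespace-qualified name of a TEI element (hypothetical helper)."""
    return f"{{{TEI_NS}}}{tag}"


def build_minimal_tei(title, paragraph_text, person_name):
    """Build a small TEI document: a header plus one paragraph with a tagged name."""
    root = ET.Element(tei("TEI"))

    # The header carries the metadata that keeps an edition findable and citable.
    header = ET.SubElement(root, tei("teiHeader"))
    file_desc = ET.SubElement(header, tei("fileDesc"))
    title_stmt = ET.SubElement(file_desc, tei("titleStmt"))
    ET.SubElement(title_stmt, tei("title")).text = title
    publication = ET.SubElement(file_desc, tei("publicationStmt"))
    ET.SubElement(publication, tei("p")).text = "Unpublished illustrative sample."
    source = ET.SubElement(file_desc, tei("sourceDesc"))
    ET.SubElement(source, tei("p")).text = "Invented for this example."

    # The body holds the text; persName marks a personal name explicitly,
    # which is the kind of markup that later supports data enrichment.
    text = ET.SubElement(root, tei("text"))
    body = ET.SubElement(text, tei("body"))
    paragraph = ET.SubElement(body, tei("p"))
    paragraph.text = paragraph_text + " "
    name = ET.SubElement(paragraph, tei("persName"))
    name.text = person_name
    name.tail = "."
    return root


root = build_minimal_tei("Sample letter", "Letter signed by", "Example Person")
xml_string = ET.tostring(root, encoding="unicode")
print(xml_string)
```

Because the personal name is tagged rather than left as plain text, a program can retrieve it directly, which is one small sense in which a digital edition differs from a digitized page image.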

In the second session, Kata Dobás discussed “Developing Literary Databases on a Semantic Web Basis”, explaining the methods of structuring bibliographic data within semantic networks through the ITIdata system. She demonstrated how semantic data models enable machine-readable relationships among archival and literary sources, thereby enhancing the interoperability, reusability, and research potential of cultural heritage materials.
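
The idea of machine-readable relationships can be sketched with subject-predicate-object statements, the basic unit of semantic web data. The example below is a simplified illustration in plain Python and does not reproduce the ITIdata data model: every identifier (`work:1`, `person:1`, `letter:1`, `archive:1`), every predicate name, and every function name is hypothetical. A production system would normally use RDF with globally unique URIs instead of short strings.

```python
from collections import namedtuple

# One statement: subject, predicate, object (all values here are invented).
Triple = namedtuple("Triple", ["subject", "predicate", "obj"])

triples = [
    Triple("work:1", "hasTitle", "Example Novel"),
    Triple("work:1", "hasAuthor", "person:1"),
    Triple("person:1", "hasName", "Example Author"),
    Triple("letter:1", "mentions", "work:1"),
    Triple("letter:1", "heldBy", "archive:1"),
]


def subjects(predicate, obj):
    """Return all subjects linked to the given object by the given predicate."""
    return [t.subject for t in triples if t.predicate == predicate and t.obj == obj]


def sources_mentioning_author(name):
    """Follow name -> person -> works -> sources that mention those works."""
    persons = subjects("hasName", name)
    works = [w for person in persons for w in subjects("hasAuthor", person)]
    return sorted({s for work in works for s in subjects("mentions", work)})


print(sources_mentioning_author("Example Author"))
```

The query never searches free text: it follows explicit links from a name to a person, from the person to a work, and from the work to an archival source. This is what allows literary and archival records to be connected across separate collections.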

The afternoon session began with Barbara Kovács-Bobák’s presentation, “Handwriting Recognition on Hungarian and Latin Texts Using Custom-Developed Tools.” She introduced the handwritten text recognition (HTR) technologies developed within the National Laboratory for Digital Heritage, which provide open-access alternatives to commercial systems such as Transkribus. The project builds on the TrOCR framework, fine-tuning open-source models for Hungarian and Latin manuscripts and making advanced handwriting recognition tools freely available to the scholarly community.

The final lecture, delivered by Gábor Palkó, explored “Artificial Intelligence and Digital Philology.” His presentation addressed the opportunities and limitations of large language models (LLMs) in philological research, highlighting their use in text analysis, data enrichment, and corpus building. At the same time, he drew attention to issues of scientific transparency, the representativeness of training corpora, and the need for critical evaluation of AI-generated results.

During the hands-on sessions, participants experimented with the tools introduced and gained practical experience in text encoding, data enrichment, and handwriting recognition. The event connected archival digitization practice with recent developments in the digital humanities.
Participation in the training was free of charge for members of the Association of Hungarian Archivists but required prior registration. The event clearly demonstrated that preserving and reinterpreting digital heritage requires the close collaboration of technological and humanistic expertise.