• contact@dh-lab.hu
  • 1088 Budapest Múzeum krt. 6-8
Natural Language Processing (NLP)
Developing NLP tools is a priority. Developing machine intelligence for Hungarian language analysis is a prerequisite both for exploring and preserving the national digital heritage and for the market exploitation of the specialized tools developed...
Web Harvesting
In Hungary, there are currently few concentrated cultural and scientific archiving activities that yield material of sufficient accuracy and cleanliness to be suitable for widespread use. Data loss is evident and ongoing. To address this, DH-LAB is continuously selecting and harvesting web resources relevant to research and...
Literary Corpora
The National Laboratory for Digital Heritage is producing three literary corpora with machine-generated annotations. The ELTE Poetry Corpus aims to present Hungarian canonical poetry, the ELTE Novel Corpus aims to present Hungarian canonical and less canonical fiction, and the ELTE Drama Corpus aims to present Hungarian canonical and...
"Born Digital" laboratory
Hungarian cultural heritage suffers substantial data loss due to limited infrastructure and a shortage of professionals to archive and manage born-digital material. The management of born-digital material is not only a matter for collectors or academics but is also relevant to market players, as the management...
Wikibase project
The database of the Department of Digital Humanities of ELTE is built on the Wikibase software and was developed in-house for organising and publishing research materials on prosopography, bibliography and other historical topics. The ELTEdata database currently contains the collections of three research groups...
Danube AI
Danube-AI is a sub-project of DH-LAB that deals with the artificial intelligence-based management of the heritage of the Danube region. The project aims to develop AI-based good practices for processing and publishing collections, digital and historical resources that are at risk, unprocessed and/or of limited accessibility...