CLARIN

CLARIN supports and provides services for researchers and students who engage in cutting-edge, data-driven research with language-based materials, whether in translation, history, political science, literature or classical studies. CLARIN offers long-term solutions and technology services for deploying, connecting, analysing and sustaining digital language data and tools.

Besides providing access to language-based materials and tools, CLARIN offers deposit services for uploading, archiving and sharing your own data.

Learning Resources

CLARIN-DK

Find and deposit data in CLARIN-DK, or check out Voyant Tools, NLP tools, Korp and TEI texts in the collection.

CLARIN ERIC

Visit the official website for CLARIN ERIC (European Research Infrastructure for Language Resources and Technology).

Inspiration

Be inspired by this case on Gesta Danorum and language technology – a shortcut to scientific evidence.

User impression

Read this user case on how to prepare texts for your corpus in one go using CLARIN-DK tools.

CLARIN-DK is part of a European infrastructure

CLARIN is short for Common Language Resources and Technology Infrastructure. CLARIN in Denmark (CLARIN-DK) is part of the European infrastructure CLARIN ERIC (European Research Infrastructure Consortium).

In the Danish repository, you will find various collections of language data: a general language corpus, several collections of language for special purposes, parallel corpora, old Danish texts and more. CLARIN offers tools to discover, explore, annotate, analyse and combine data sets. The tools are supported by tutorials and use cases.

As CLARIN is part of a European infrastructure, it also offers access to language data and tools from the other members. In Denmark, CLARIN offers workshops, seminars, PhD courses and training courses for researchers.

Selected tools

Voyant Tools is a web-based reading and analysis environment for digital texts.

Korp is a web-based tool for searching for keywords in text corpora and generating concordances.

The Text Encoding Initiative (TEI) is a standard for the representation of texts in digital form.
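
To illustrate what a concordance is, here is a minimal Python sketch of a keyword-in-context (KWIC) listing. It is not Korp's implementation or API. The function name `concordance`, the `width` parameter and the sample sentence are made up for illustration only.

```python
import re


def concordance(text, keyword, width=3):
    """Return keyword-in-context lines with up to `width` words on each side of every hit."""
    # Naive tokenisation: runs of word characters; punctuation is dropped.
    tokens = re.findall(r"\w+", text)
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{token}] {right}".strip())
    return lines


# Made-up sample text, not taken from any CLARIN-DK corpus.
sample = "The king sailed north. The old king was buried at Lejre."
for line in concordance(sample, "king"):
    print(line)
```

Each output line shows one occurrence of the keyword in brackets with its neighbouring words. A corpus search tool does this at scale and typically adds linguistic annotation and filtering on top.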

The CLARIN-DK archive contains language-based data and tools for annotation and research in a variety of disciplines in the humanities and social sciences. Some of the data and tools are publicly accessible. Other resources can only be accessed and downloaded by researchers who log in, e.g. via WAYF.

The archive evolves over time and currently contains Danish text corpora, divided into general language and professional language. In addition, there are Danish audio, video and photo collections as well as lexica, a wordnet and annotations for some of the files.

The archive can host the language-based data you use for research purposes or want to share with colleagues. It is possible to do linguistic annotation, convert text to TEI P5 XML and deposit your own data in CLARIN-DK.
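
As a rough idea of what a conversion to TEI P5 produces, the following Python sketch wraps plain text in a minimal TEI skeleton using only the standard library. It is a hand-rolled illustration, not the CLARIN-DK conversion tool. The function name `text_to_tei` and the placeholder header texts are assumptions, and a real deposit would need much richer metadata.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"  # the TEI P5 namespace


def text_to_tei(title, plain_text):
    """Wrap plain text in a minimal TEI P5 document, one <p> per blank-line-separated paragraph."""
    ET.register_namespace("", TEI_NS)  # serialise TEI as the default namespace

    def el(parent, tag, text=None):
        node = ET.SubElement(parent, f"{{{TEI_NS}}}{tag}")
        node.text = text
        return node

    root = ET.Element(f"{{{TEI_NS}}}TEI")
    # A minimal header: fileDesc with titleStmt, publicationStmt and sourceDesc.
    file_desc = el(el(root, "teiHeader"), "fileDesc")
    el(el(file_desc, "titleStmt"), "title", title)
    el(el(file_desc, "publicationStmt"), "p", "Unpublished draft")  # placeholder text
    el(el(file_desc, "sourceDesc"), "p", "Converted from plain text")  # placeholder text
    body = el(el(root, "text"), "body")
    for para in plain_text.split("\n\n"):
        if para.strip():
            el(body, "p", " ".join(para.split()))  # collapse line breaks inside a paragraph
    return ET.tostring(root, encoding="unicode")


print(text_to_tei("Sample", "First paragraph.\n\nSecond\nparagraph."))
```

The result is well-formed XML in the TEI namespace. Whether it validates against a particular TEI schema or meets the archive's deposit requirements would still need to be checked.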

Do you have any questions or comments? Do you wish to join the community? Or do you want to get started with a workshop or a meeting on a specific topic?

Please contact community lead and engineer Lene Offersgaard or senior researcher Costanza Navarretta at the Department of Nordic Studies and Linguistics, University of Copenhagen.
