Collections



Below is a list of digital archives used by the DIGHUMLAB community, each with a short description and information about accessibility.

CLARIN-DK

The CLARIN-DK archive contains language-based data and tools for annotation and research in a variety of disciplines in the humanities and social sciences. Some of the data and tools are publicly accessible; other resources can only be accessed and downloaded by researchers, e.g. via WAYF. The archive is evolving over time and currently contains Danish text corpora, divided into general language and professional language. In addition, there are Danish audio, video and photo collections, as well as lexica, a wordnet, and annotations for some of the files. The archive can host the language-based data you use for research purposes or want to share with colleagues. It is possible to do linguistic annotation, convert text to the XML (TEI P5) format and deposit your own data in CLARIN. The archive may be accessed here.

Danish Sound

Danish Sound contains parts of the State and University Library’s large collections of older sound recordings, namely the parts that have been digitised and found to be free of copyright restrictions. The collection includes recordings on early gramophone records, the oldest Danish wax cylinders, audio cassettes and other media. The collection may be accessed here.

The Danish Digital Newspaper Collection

Digitised Danish newspapers from the mid-1600s onwards. Newspapers more than 100 years old are in the public domain, while newer newspapers are only accessible at the State and University Library, the Royal Library and the Danish Film Institute via Mediestream.dk. The collection may be accessed here.

The State and University Library’s Commercial Collection

Commercials shown in Danish cinemas from around 1907 to 1995, and all commercials broadcast on the Danish television channel TV2 in the period 1988-2005. Accessible at Mediestream.dk for employees at universities with a license. The collection may be accessed here.

The State and University Library’s digital Radio/TV Collection

The radio and TV collection is part of Larm.fm and includes Danish radio and television recorded digitally from 2006 onwards. Accessible at Mediestream.dk for employees at universities with a license. The collection may be accessed here.

Web Archives in Denmark

The Danish Web Archive

Netarkivet.dk is the Danish web archive. It contains the Danish part of the internet from July 2005 onwards.
Due to Danish laws on personal data and data protection, access to netarkivet.dk is restricted to researchers with permission for relevant research projects. Special search functions have been developed for the Danish Web Archive, offering free text search or search by URL. Free text search makes it possible to find results by keywords or phrases, while URL search makes it possible to trace or retrieve older versions of websites with known addresses. The archive may be accessed here.
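
The difference between the two search modes can be illustrated with a minimal Python sketch. It is not Netarkivet's actual interface: the `Snapshot` record, the sample data and both function names are hypothetical and exist only to show the idea on a small in-memory list.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """One archived version of a page (hypothetical record layout)."""
    url: str
    date: str  # ISO date, so sorting the strings sorts chronologically
    text: str


# Invented sample data for illustration only.
SNAPSHOTS = [
    Snapshot("example.dk", "2010-03-01", "Welcome to our new site"),
    Snapshot("example.dk", "2006-05-12", "Under construction"),
    Snapshot("other.dk", "2008-01-20", "News about the new library"),
]


def url_search(snapshots, url):
    """Return every archived version of a known address, oldest first."""
    return sorted((s for s in snapshots if s.url == url), key=lambda s: s.date)


def free_text_search(snapshots, phrase):
    """Return snapshots whose text contains the phrase, ignoring case."""
    needle = phrase.lower()
    return [s for s in snapshots if needle in s.text.lower()]


versions = url_search(SNAPSHOTS, "example.dk")
hits = free_text_search(SNAPSHOTS, "new")
```

Here `versions` holds the two archived versions of the known address in chronological order, while `hits` holds every snapshot that mentions the phrase, whatever its address.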


Library of Congress Web Archives

The US Library of Congress Web Archives (LCWA) collects and makes available digital material, including several collections of websites selected by specialists to cover specific themes and topics that may be relevant to researchers. The LCWA undertakes selective crawls, event crawls and thematic crawls, and since 2000 it has collected and preserved relevant websites in connection with events such as elections in the United States, the war in Iraq and 11 September 2001. The archives may be accessed online, and there are multiple search options: search by URL, faceted search, browsing (alphabetically or by subject) or search in current collections. The archives may be accessed here.

Pandora

The Australian Web Archive PANDORA was established in 1996. PANDORA does not attempt to archive the entire Internet (and other digital materials); instead, its strategy is to undertake selective crawls and event crawls, so that the archive covers a variety of topics relating to Australia and Australians. The archive is freely accessible online, and may be searched in several ways: by URL or keyword, by browsing (alphabetically or by subject) and by free text search. The archive may be accessed here.

The Internet Archive

Since 1996 the Internet Archive has attempted to archive as much of the public part of the Internet as possible. Access to the archive is free of charge for all, and covers both the Internet Archive’s Wayback Machine and its other collections of digital materials. Data can be searched by URL or by free text search. The archive harvests data in many different ways, including very broad crawls as well as selective and thematic crawls. Harvesting is directed by accumulated lists: links encountered along the way are continuously added to the lists that direct the archiving. The archive also receives donations of data from various sources, so it is a highly heterogeneous and composite collection. The archive may be accessed here.
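
The list-driven harvesting described above can be sketched in a few lines of Python. This is a simplified illustration, not the Internet Archive's crawler: the `TOY_WEB` link graph and the `crawl` function are hypothetical, and a real crawler would fetch pages over the network rather than read them from a dictionary.

```python
from collections import deque

# Hypothetical in-memory "web": each page maps to the links found on it.
TOY_WEB = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example", "d.example"],
    "c.example": ["a.example"],
    "d.example": [],
}


def crawl(seeds, web):
    """Harvest pages in list order, queueing newly found links as we go."""
    frontier = deque(seeds)  # the list that directs the archiving
    seen = set(seeds)        # pages already queued, so none is queued twice
    archived = []            # order in which pages were harvested
    while frontier:
        page = frontier.popleft()
        archived.append(page)
        for link in web.get(page, []):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return archived


order = crawl(["a.example"], TOY_WEB)
```

Starting from a single seed, the list grows as each harvested page contributes links that have not been seen before, and the crawl ends when the list is empty.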

The Portuguese Web Archive

The Portuguese Web Archive is the Portuguese national web archive. It is available online in its entirety, with full text search. This contrasts with other national web archives in the public sector, which do not provide full online access for all. The archive has been harvesting the Portuguese web since 1996, and since 2007 it has been run by the Foundation for National Scientific Computing (FCCN) in Portugal. The archive may be accessed free of charge here.

The UK Web Archive

The UK Web Archive contains websites that publish research, reflect the diversity of lives, interests and activities throughout the UK, and demonstrate web innovation. Special Collections are groups of websites brought together on a particular theme. They can be event-based (e.g. The Olympic & Paralympic Games 2012), topical (e.g. The Credit Crunch Collection) or subject-oriented (e.g. The British Countryside Collections). The archive, which has existed since 2005, offers several ways of searching the materials. You can search by website title or URL, and there is also the possibility of full text searching and browsing (alphabetically, by subject or in special collections). This collection of websites may be accessed free of charge here.