CST’s PoS Tagger is an extended version of the Brill tagger, with add-ons for handling XML and for better handling of capitalised words, e.g. in headlines. In CLARIN-DK, Brill’s PoS tagger supports Danish and English.

What is a PoS tagger? A Part-Of-Speech tagger (PoS tagger) is a piece of software that reads text in some language and assigns a part of speech, such as noun, verb or adjective, to each word (and other token). Computational applications generally use more fine-grained PoS tags, such as ‘noun-plural’. (Def. From: ).

CST = Center for SprogTeknologi / Center for Language and Technology. PoS = Part Of Speech. XML = Extensible Markup Language.
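
To make the definition above concrete, here is a minimal Python sketch of the two-step idea behind Brill-style tagging: first give every token its most likely tag, then correct tags with contextual rules. It is a toy illustration only. The lexicon, the tag names, the single rule and all function names are invented for this example and are not taken from CST’s tagger, its trained models or its actual tag sets. Lowercasing before lookup is likewise only a crude stand-in for handling capitalised words.

```python
# Toy illustration only: the lexicon, tag names, and rule below are invented
# for this example and are not taken from CST's tagger or its trained models.
import re

# Hypothetical lexicon mapping a lowercased word to its most likely tag.
LEXICON = {
    "the": "determiner",
    "dogs": "noun-plural",
    "dog": "noun",
    "can": "modal",
    "bark": "verb",
    "loudly": "adverb",
}


def tokenize(text):
    """Split text into word tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)


def initial_tags(tokens):
    """Step 1: give each token its most likely tag, ignoring context."""
    tags = []
    for token in tokens:
        if not token[0].isalnum():
            tags.append("punctuation")
        else:
            # Lowercasing is a crude stand-in for handling capitalised words;
            # unknown words default to "noun".
            tags.append(LEXICON.get(token.lower(), "noun"))
    return tags


def apply_rules(tags):
    """Step 2: apply one contextual rule: modal -> noun after a determiner."""
    fixed = list(tags)
    for i in range(1, len(fixed)):
        if fixed[i] == "modal" and fixed[i - 1] == "determiner":
            fixed[i] = "noun"
    return fixed


def tag(text):
    """Return a list of (token, tag) pairs for the given text."""
    tokens = tokenize(text)
    return list(zip(tokens, apply_rules(initial_tags(tokens))))


print(tag("The dogs can bark loudly."))
print(tag("The can"))
```

In the first sentence "can" keeps its initial tag because it follows a noun, whereas in "The can" the contextual rule retags it as a noun. A real tagger learns its lexicon and a much longer rule list from annotated text rather than having them written by hand.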


