CST’s POS tagger is an extended version of the Brill tagger, with add-ons for handling XML and for better handling of capitalized words, e.g. in headlines. In CLARIN-DK, Brill’s POS tagger supports Danish and English.
What is a POS tagger?
A Part-Of-Speech Tagger (POS tagger) is a piece of software that reads text in some language and assigns a part of speech, such as noun, verb, or adjective, to each word (and other tokens). Computational applications generally use more fine-grained POS tags, such as ‘noun-plural’. (Definition from: nlp.stanford.edu/software/tagger.shtml )
Abbreviations: CST = Center for SprogTeknologi / Center for Language and Technology. POS = Part Of Speech. XML = Extensible Markup Language.
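To make the definition of a POS tagger concrete, here is a minimal sketch in Python of the input and output of tagging. It is not CST's tagger and does not implement Brill's method: it simply looks each token up in a tiny word list. The lexicon, the tag names, and the functions `tokenize` and `tag` are all hypothetical and exist only for illustration.

```python
# Toy illustration only: a hypothetical lexicon, NOT the data or the
# algorithm used by the CST/Brill tagger.
TOY_LEXICON = {
    "the": "determiner",
    "dog": "noun-singular",
    "dogs": "noun-plural",
    "bark": "verb",
    "loudly": "adverb",
    ".": "punctuation",
}


def tokenize(text):
    """Split on whitespace and separate a trailing full stop into its own token."""
    tokens = []
    for chunk in text.split():
        if len(chunk) > 1 and chunk.endswith("."):
            tokens.extend([chunk[:-1], "."])
        else:
            tokens.append(chunk)
    return tokens


def tag(text, default_tag="unknown"):
    """Return (token, tag) pairs; tokens missing from the lexicon get default_tag."""
    return [
        (token, TOY_LEXICON.get(token.lower(), default_tag))
        for token in tokenize(text)
    ]


if __name__ == "__main__":
    # Print one token and its tag per line, separated by a tab.
    for token, pos in tag("The dogs bark loudly."):
        print(f"{token}\t{pos}")
```

A real tagger differs mainly in how it chooses a tag: it has to deal with unknown words and with words that can have more than one part of speech depending on context, which a plain lookup like this cannot do.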