Projects per year
Given the vast amounts of data available in digitised textual form, it is important to provide mechanisms that allow users to extract nuggets of relevant information from the ever growing volumes of potentially important documents. Text mining techniques can help, through their ability to automatically extract relevant event descriptions, which link entities with situations described in the text. However, correct and complete interpretation of these event descriptions is not possible without considering additional contextual information often present within the surrounding text. This information, which we refer to as meta-knowledge, can include (but is not restricted to) the modality, subjectivity, source, polarity and specificity of the event. We have developed a meta-knowledge annotation scheme specifically tailored for news events, which includes six aspects of event interpretation. We have applied this annotation scheme to the ACE 2005 corpus, which contains 599 documents from various written and spoken news sources. We have also identified and annotated the words and phrases evoking the different types of meta-knowledge. Evaluation of the annotated corpus shows high levels of inter-annotator agreement for five meta-knowledge attributes, and moderate level of agreement for the sixth attribute. Detailed analysis of the annotated corpus has revealed further insights into the expression mechanisms of different types of meta-knowledge, their relative frequencies and mutual correlations.