Question Answering systems extract answers to a question rather than retrieve relevant documents. For Question Answering evaluation, a human assessor must decide the correctness of the answers, given that the same answer can be expressed in different ways. Therefore, a suitable test collection can help to identify where systems perform well and where they fail. Our data collection analysis suggests that there is a proportion of texts in which the reasoning or explanation that constitutes an answer to a "why" question is present in, or capable of being extracted from, the source text. We report on an implemented component for the extraction of candidate answers from source text. This component combines lexical overlap and lexical semantic relatedness (a lexico-syntactic approach) for ranking possible answers to causal questions. On undifferentiated texts, we obtain an overall recall of 34.13%, indicating that simple matching is adequate for answering over one third of "why" questions. We have analyzed question-answer pairs where the answer is explicit, ambiguous, or implicit, and shown that if we can separate the last category, recall increases considerably.
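The ranking strategy described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the toy `RELATED` table and the 0.5 weighting factor are assumptions standing in for a real lexical-semantic resource and a tuned combination weight.

```python
# Illustrative sketch: rank candidate answers by combining lexical overlap
# with a lexical-relatedness score. The RELATED table and the 0.5 weight
# are toy assumptions, not values from the paper.

RELATED = {
    ("crash", "accident"): 1.0,   # assumed relatedness scores
    ("rain", "weather"): 0.6,
}

def relatedness(a, b):
    """Symmetric lookup in the toy relatedness table."""
    return RELATED.get((a, b)) or RELATED.get((b, a)) or 0.0

def score(question, candidate, weight=0.5):
    """Combine word overlap with pairwise semantic relatedness."""
    q = set(question.lower().split())
    c = set(candidate.lower().split())
    overlap = len(q & c) / len(q)  # fraction of question terms matched
    sem = sum(relatedness(qt, ct) for qt in q for ct in c)
    return overlap + weight * sem

def rank(question, candidates):
    """Return candidates sorted from most to least plausible answer."""
    return sorted(candidates, key=lambda c: score(question, c), reverse=True)

question = "why did the car crash"
candidates = [
    "the accident happened because of heavy rain",
    "the car is red",
]
best = rank(question, candidates)[0]
```

Here the causal candidate wins despite sharing fewer surface words with the question, because the relatedness term rewards the `crash`/`accident` pair, which is the motivation for combining the two signals.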
|Title of host publication||Proceedings of the Mexican International Conference on Computer Science|
|Publisher||IEEE Computer Society|
|Number of pages||10|
|Publication status||Published - 2008|
|Event||9th Mexican International Conference on Computer Science, ENC 2008 - Mexicali, Baja California|
Duration: 1 Jul 2008 → …
|Conference||9th Mexican International Conference on Computer Science, ENC 2008|
|City||Mexicali, Baja California|
|Period||1/07/08 → …|
- Causal Questions
- Natural Language Processing
- Question Answering
- Test Collection