Abstract
The extraction of information from texts requires resources that contain both syntactic and semantic properties of lexical units. As the use of language in specialized domains, such as biology, can be very different from that in the general domain, there is a need for domain-specific resources to ensure that the information extracted is as accurate as possible. We are building a large-scale lexical resource for the biology domain, providing information about predicate-argument structure that has been bootstrapped from a biomedical corpus on the subject of E. coli. The lexicon is currently focussed on verbs, and includes both automatically-extracted syntactic subcategorization frames and semantic event frames that are based on annotation by domain experts. In addition, the lexicon contains manually-added explicit links between semantic and syntactic slots in corresponding frames. To our knowledge, this lexicon currently represents a unique resource within the biomedical domain. © Springer-Verlag Berlin Heidelberg 2009.
Original language | English |
---|---|
Title of host publication | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Lect. Notes Comput. Sci. |
Place of Publication | Heidelberg |
Publisher | Springer Nature |
Pages | 137-148 |
Number of pages | 11 |
Volume | 5449 |
Publication status | Published - 2009 |
Event | 10th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2009 - Mexico City. Duration: 1 Jul 2009 → … |
Other
Other | 10th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2009 |
---|---|
City | Mexico City |
Period | 1/07/09 → … |
Keywords
- Biological language processing
- Domain-specific lexical resources
- Information extraction
- Lexical acquisition
- Syntax-semantics linking