Several recent efforts (e.g. BioCatalogue, BioMOBY) have aimed to systematically catalogue bioinformatics tools, services and datasets. These efforts rely mostly on manual curation and cannot keep pace with the rapid influx of electronic resources, so many resources remain effectively unavailable to the community. We present a text mining approach that uses the literature to extract and semantically profile bioinformatics resources. Our method identifies mentions of resources in the literature and assigns each resource a set of co-occurring terminological and ontological entities (descriptors) to represent it. Since such representations can be extremely sparse, we use kernel metrics based on lexical term/descriptor similarities to identify semantically related resources. Resources are then either clustered or linked into a network, allowing users (bioinformaticians and service/tool crawlers) to explore tools, services and datasets by their relatedness, and thus potentially improving the resource discovery process.
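As a rough illustration of the idea of relating sparsely described resources through lexical descriptor similarity, the following minimal sketch may help. It is not the method from the paper: the token-overlap (Jaccard) term similarity, the best-match averaging, the threshold value, and the resource names and descriptors (`ToolA`, `ToolB`, `ToolC`) are all assumptions made for this example.

```python
from itertools import combinations


def term_similarity(a: str, b: str) -> float:
    """Lexical similarity of two descriptors: Jaccard overlap of their word tokens.

    Assumption for this sketch; the paper's actual lexical measure is not given here.
    """
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def resource_similarity(desc_1: list[str], desc_2: list[str]) -> float:
    """Kernel-like soft match between two descriptor profiles.

    Each descriptor is matched to its most similar descriptor in the other
    profile, and the two directions are averaged so the score is symmetric.
    Exact string matching would give zero for sparse profiles that share
    no identical descriptor; partial lexical overlap still contributes here.
    """
    if not desc_1 or not desc_2:
        return 0.0
    forward = sum(max(term_similarity(a, b) for b in desc_2) for a in desc_1) / len(desc_1)
    backward = sum(max(term_similarity(a, b) for a in desc_1) for b in desc_2) / len(desc_2)
    return (forward + backward) / 2


def build_network(profiles: dict[str, list[str]], threshold: float = 0.3):
    """Link every pair of resources whose similarity reaches the threshold."""
    edges = []
    for r1, r2 in combinations(sorted(profiles), 2):
        score = resource_similarity(profiles[r1], profiles[r2])
        if score >= threshold:
            edges.append((r1, r2, score))
    return edges


# Hypothetical resources and descriptors, invented purely for illustration.
profiles = {
    "ToolA": ["sequence alignment", "protein sequence"],
    "ToolB": ["multiple sequence alignment", "protein structure"],
    "ToolC": ["gene expression", "microarray data"],
}

edges = build_network(profiles)
for r1, r2, score in edges:
    print(f"{r1} -- {r2}: {score:.2f}")
```

In this toy input, `ToolA` and `ToolB` share no identical descriptor, yet they are linked because their descriptors overlap lexically, while `ToolC` stays unconnected. The resulting edge list could equally be fed to a clustering step instead of being read as a network.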
|Name|CEUR Workshop Proceedings|
|Conference|Workshop on Semantic Web Applications and Tools for Life Sciences, SWAT4LS 2009|
|Period|1/07/09 → …|
- Bioinformatics services
- Kernel similarity
- Service description
- Text mining