Abstract
Knowledge discovery through pattern finding in data is central to modern molecular biology, which now has thousands of databases and a similar number of tools for processing those data. Any data analysis in molecular biology involves gathering and processing data from many sources before the analysis of the central biological question can take place. Taverna is a workflow workbench that allows bioinformaticians to create data pipelines involving distributed Web services and other kinds of tools; these workflows gather and manage data in order to perform analyses that answer biological questions. RapidMiner brings a large suite of data processing, visualisation and data mining operators to bear upon tables of data, but these operators are disconnected from the services available to users of Taverna. Through a RapidMiner extension to Taverna, we have combined the ability to gather and process data from many molecular biological sources with RapidMiner's data mining capabilities to provide a powerful tool for scientific analysis. In this article we describe this RapidMiner extension to Taverna and some preliminary analyses we have performed using RapidMiner on biological data.
Original language | English |
---|---|
Title of host publication | Proceedings of the 2nd RapidMiner Community Meeting and Conference |
Editors | Simon Fischer, Ingo Mierswa |
Pages | 75-86 |
Number of pages | 12 |
Publication status | Published - 2011 |
Event | RCOMM2011 - Dublin, Duration: 1 Jan 1824 → … |
Conference
Conference | RCOMM2011 |
---|---|
City | Dublin |
Period | 1/01/24 → … |
Research Beacons, Institutes and Platforms
- Manchester Institute of Biotechnology