Measuring the distance between multiple sequence alignments

Benjamin P. Blackburne, Simon Whelan

    Research output: Contribution to journal › Article › peer-review

    Abstract

    Motivation: Multiple sequence alignment (MSA) is a core method in bioinformatics. The accuracy of such alignments may influence the success of downstream analyses such as phylogenetic inference, protein structure prediction, and functional prediction. The importance of MSA has led to the proliferation of MSA methods, with different objective functions and heuristics to search for the optimal MSA. Different methods of inferring MSAs produce different results in all but the most trivial cases. By measuring the differences between inferred alignments, we may be able to develop an understanding of how these differences (i) relate to the objective functions and heuristics used in MSA methods, and (ii) affect downstream analyses.

    Results: We introduce four metrics to compare MSAs, which include the position in a sequence where a gap occurs or the location on a phylogenetic tree where an insertion or deletion (indel) event occurs. We use both real and synthetic data to explore the information given by these metrics and demonstrate how the different metrics in combination can yield more information about MSA methods and the differences between them.

    © The Author 2011. Published by Oxford University Press. All rights reserved.
    Original language: English
    Article number: btr701
    Pages (from-to): 495-502
    Number of pages: 7
    Journal: Bioinformatics
    Volume: 28
    Issue number: 4
    Publication status: Published - Feb 2012
