VISUALISING PROTEIN SEQUENCE ALIGNMENT

  • Shaimaa Aljuhani

Student thesis: PhD

Abstract

In bioinformatics, protein sequence analysis aims to convert sequence information into useful biochemical and biophysical knowledge that provides a deeper understanding of the structure and function of known sequences, so that this knowledge can be transferred to uncharacterised ones. A fundamental task of protein sequence analysis is sequence alignment. Sequence alignment uses strings of contiguous letters representing amino acids, arranged in vertical register in order to highlight regions of similarity and difference. The conserved regions highlight evolutionarily constrained parts of the molecular structure from which biological roles are likely to be inferred. This artificial view has become the norm and sequence alignments are seldom viewed in other ways. When sequences have different lengths, gap characters are inserted to denote insertions or deletions; however, gaps have no meaning in 3D structures.

This project aims to revisit the protein sequence alignment problem by exploring possible ways to visualise the relationships between sequences in three-dimensional space while eliminating gap characters. In our work, an N-body Hamiltonian dynamic system in contact with a heat bath is built to model each protein sequence as a set of particles connected by springs. The sequence alignment is then represented by vertical springs connecting aligned pairs of amino acids, where gaps are represented by stretched springs and unaligned amino acids corresponding to these gaps are repelled out of the plane. The configuration that the dynamic system adopts in three-dimensional space when the potential energy of the system reaches a steady state is the basis of the visualisation.

The novel 3D visualisations generated by our model for various alignments were able to highlight structural features in a concise way without gaps, providing an overview of the full alignment, which can be explored interactively. The method opens the possibility of analysing much larger alignments than would be possible with conventional visualisations produced by current alignment tools and editors.
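
The spring model described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the function names (`build_system`, `langevin_step`, `relax`), the unit masses, the spring constants, the rest lengths and the simple Euler-type integrator are all assumptions made for illustration, and the out-of-plane repulsion of unaligned residues is omitted. Each residue becomes a particle placed at its ungapped position, so gap characters disappear and any aligned pair that straddles a gap starts with a stretched spring.

```python
import math
import random


def build_system(row_a, row_b, k_bond=1.0, k_align=1.0):
    """Turn two gapped alignment rows into particles and springs (assumed layout)."""
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have equal length")
    positions, springs = [], []
    prev = [None, None]   # previous particle index in each sequence
    count = [0, 0]        # residues placed so far in each sequence
    for a, b in zip(row_a, row_b):
        col_ids = []
        for row, ch in enumerate((a, b)):
            if ch == "-":
                continue  # gaps produce no particle
            idx = len(positions)
            # x = index within the ungapped sequence, y = row, z = 0
            positions.append([float(count[row]), float(row), 0.0])
            count[row] += 1
            if prev[row] is not None:
                springs.append((prev[row], idx, k_bond, 1.0))  # backbone spring
            prev[row] = idx
            col_ids.append(idx)
        if len(col_ids) == 2:
            springs.append((col_ids[0], col_ids[1], k_align, 1.0))  # alignment spring
    return positions, springs


def potential_energy(positions, springs):
    """Sum of harmonic spring energies 0.5 * k * (d - rest)^2."""
    return sum(0.5 * k * (math.dist(positions[i], positions[j]) - rest) ** 2
               for i, j, k, rest in springs)


def forces(positions, springs):
    """Harmonic spring force on every particle."""
    f = [[0.0, 0.0, 0.0] for _ in positions]
    for i, j, k, rest in springs:
        d = math.dist(positions[i], positions[j])
        if d == 0.0:
            continue  # direction undefined for coincident particles
        scale = -k * (d - rest) / d
        for axis in range(3):
            delta = positions[i][axis] - positions[j][axis]
            f[i][axis] += scale * delta
            f[j][axis] -= scale * delta
    return f


def langevin_step(positions, velocities, springs, dt, gamma, kT, rng):
    """One Euler-type step of the Langevin equation with unit masses."""
    f = forces(positions, springs)
    noise = math.sqrt(2.0 * gamma * kT * dt)  # heat-bath kick amplitude
    for p, v, fi in zip(positions, velocities, f):
        for axis in range(3):
            v[axis] += (fi[axis] - gamma * v[axis]) * dt + noise * rng.gauss(0.0, 1.0)
            p[axis] += v[axis] * dt


def relax(row_a, row_b, steps=2000, dt=0.01, gamma=1.0, kT=0.0, seed=0):
    """Integrate the system and record the potential energy after every step."""
    rng = random.Random(seed)
    positions, springs = build_system(row_a, row_b)
    velocities = [[0.0, 0.0, 0.0] for _ in positions]
    energies = [potential_energy(positions, springs)]
    for _ in range(steps):
        langevin_step(positions, velocities, springs, dt, gamma, kT, rng)
        energies.append(potential_energy(positions, springs))
    return positions, springs, energies


if __name__ == "__main__":
    final_positions, _, energies = relax("AC-DE", "ACGDE")
    print("initial energy:", energies[0])
    print("final energy:", energies[-1])
```

With `kT=0.0` the heat bath contributes only friction, so the sketch simply relaxes the stretched springs. A positive `kT` adds random kicks, which also move particles out of the initial plane. The recorded energy trace is one way to judge when the potential energy has settled.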
Date of Award: 31 Dec 2014
Original language: English
Awarding Institution
  • The University of Manchester
Supervisors: David Silvester & Terri Attwood

Keywords

  • Bioinformatics
  • Heat bath
  • Langevin equation
  • N-body system
  • Hamiltonian dynamic system
  • 3D visualisation
  • protein sequence analysis
  • protein sequence alignment
