Published October 1, 1988 | Version v1
Journal article Open

The Multiple Sequence Alignment Problem in Biology

Description

The study and comparison of sequences of characters from a finite alphabet is relevant to various areas of science, notably molecular biology. Measuring sequence similarity involves considering the different possible alignments of the sequences in order to find an optimal one, for which the "distance" between sequences is minimal. By associating a path in a lattice with each alignment, geometric insight can be brought to the problem of finding an optimal alignment. This problem can then be solved by applying a dynamic programming algorithm. However, the computational effort grows rapidly with the number $N$ of sequences to be compared ($O(l^N)$, where $l$ is the mean length of the sequences to be compared). It is proved here that knowledge of the measure of an arbitrarily chosen alignment can be used in combination with information from the pairwise alignments to considerably restrict the size of the region of the lattice under consideration. This reduction implies that fewer computations and less memory are needed to carry out the dynamic programming optimization. The observations also suggest new variants of the multiple alignment problem.
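As an illustration of the lattice view described above, the following is a minimal Python sketch for the simplest case of two sequences. It is not taken from the article: the function names `pairwise_distance` and `lattice_size` and the unit gap and mismatch costs are assumptions chosen for the example. Each cell of the table is a lattice point, and an alignment corresponds to a path from the corner (0, 0) to the opposite corner.

```python
from math import prod


def pairwise_distance(a, b, gap=1, mismatch=1):
    """Minimum alignment cost of a and b by dynamic programming.

    Assumed cost model (not from the article): each gap costs `gap`,
    each mismatched pair costs `mismatch`, matches cost 0.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] = cost of the cheapest path from (0, 0) to lattice point (i, j)
    d = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        d[i][0] = i * gap          # a[:i] aligned against gaps only
    for j in range(1, cols):
        d[0][j] = j * gap          # b[:j] aligned against gaps only
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = d[i - 1][j - 1] + (0 if a[i - 1] == b[j - 1] else mismatch)
            d[i][j] = min(diagonal, d[i - 1][j] + gap, d[i][j - 1] + gap)
    return d[-1][-1]


def lattice_size(lengths):
    """Number of lattice points a full N-dimensional table would hold."""
    return prod(n + 1 for n in lengths)


print(pairwise_distance("kitten", "sitting"))  # 3 with unit costs
print(lattice_size([100, 100]))                # two sequences of length 100
print(lattice_size([100, 100, 100, 100]))      # four sequences of length 100
```

The same recurrence extends to $N$ sequences by filling an $N$-dimensional table, which is where the $O(l^N)$ growth comes from; `lattice_size` only counts the points such a table would contain and does not implement the restriction of the lattice region that the article proves.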

Files

article.pdf (1.2 MB)
md5:c799d6327917413ad1941ec99fe73e77