TY - JOUR
T1 - No so HoT - heads or tails is not able to reliably compare multiple sequence alignments
AU - Wise, Michael
PY - 2009
Y1 - 2009
N2 - Most phylogenetic-tree building applications use multiple sequence alignments as a starting point. A recent meta-level methodology, called Heads or Tails, aims to reveal the quality of multiple sequence alignments by comparing alignments taken in the forward direction with the alignments of the same sequences when the sequences are reversed. Through an examination of a special case for multiple sequence alignment – pair-wise alignments, where an optimal algorithm exists – and the use of a modified global-alignment application, it is shown that the forward and reverse alignments, even when they are the same, do not capture all the possible variations in the alignments and when the forward and reverse alignments differ there may be other alignments that remain unaccounted for. The implication is that comparing just the forward and (biologically irrelevant) reverse alignments is not sufficient to capture the variability in multiple sequence alignments, and the Heads or Tails methodology is therefore not suitable as a method for investigating multiple sequence alignment accuracy. Part of the reason is the inability of individual multiple sequence alignment applications to adequately sample the space of possible alignments. A further implication is that the Hall [Hall, B.G., 2008. Mol. Biol. Evol. 25, 1576–1580] methodology may create optimal synthetic multiple sequence alignments that extant aligners will be unable to completely recover ab initio due to alternative alignments being possible at particular sites. In general, it is shown that more divergent sequences will give rise to an increased number of alternative alignments, so sequence sets with a higher degree of similarity are preferable to sets with lower similarity as the starting point for phylogenetic tree building. The Willi Hennig Society 2009.
AB - Most phylogenetic-tree building applications use multiple sequence alignments as a starting point. A recent meta-level methodology, called Heads or Tails, aims to reveal the quality of multiple sequence alignments by comparing alignments taken in the forward direction with the alignments of the same sequences when the sequences are reversed. Through an examination of a special case for multiple sequence alignment – pair-wise alignments, where an optimal algorithm exists – and the use of a modified global-alignment application, it is shown that the forward and reverse alignments, even when they are the same, do not capture all the possible variations in the alignments and when the forward and reverse alignments differ there may be other alignments that remain unaccounted for. The implication is that comparing just the forward and (biologically irrelevant) reverse alignments is not sufficient to capture the variability in multiple sequence alignments, and the Heads or Tails methodology is therefore not suitable as a method for investigating multiple sequence alignment accuracy. Part of the reason is the inability of individual multiple sequence alignment applications to adequately sample the space of possible alignments. A further implication is that the Hall [Hall, B.G., 2008. Mol. Biol. Evol. 25, 1576–1580] methodology may create optimal synthetic multiple sequence alignments that extant aligners will be unable to completely recover ab initio due to alternative alignments being possible at particular sites. In general, it is shown that more divergent sequences will give rise to an increased number of alternative alignments, so sequence sets with a higher degree of similarity are preferable to sets with lower similarity as the starting point for phylogenetic tree building. The Willi Hennig Society 2009.
U2 - 10.1111/j.1096-0031.2009.00292.x
DO - 10.1111/j.1096-0031.2009.00292.x
M3 - Article
SN - 0748-3007
VL - 26
SP - 1
EP - 6
JO - Cladistics
JF - Cladistics
IS - 4
ER -