
Data for Multiple protein structure alignment at scale with FoldMason

  • Cameron Gilchrist (Creator)
  • Martin Steinegger (Creator)
  • Milot Mirdita (Creator)

Dataset

Description

Data used for analysis in "Multiple protein structure alignment at scale with FoldMason".

Protein structure is conserved beyond sequence, making multiple structural alignment (MSTA) essential for analyzing distantly related proteins. Computational prediction methods have vastly extended our repository of available protein structures, requiring fast and accurate MSTA methods. Here, we introduce FoldMason, a progressive MSTA method that leverages the structural alphabet from Foldseek, a pairwise structural aligner, for multiple alignment of hundreds of thousands of protein structures, exceeding the alignment quality of state-of-the-art methods while being two orders of magnitude faster than other MSTA methods. FoldMason computes confidence scores, offers interactive visualizations, and provides essential speed and accuracy for large-scale protein structure analysis in the era of accurate structure prediction. Using Flaviviridae glycoproteins, we demonstrate how FoldMason’s MSTAs support phylogenetic analysis below the twilight zone. FoldMason is free open-source software (foldmason.foldseek.com) with a webserver (search.foldseek.com/foldmason).



Is supplement to
https://doi.org/10.1101/2024.08.01.606130 (DOI)

Software
https://github.com/steineggerlab/foldmason-analysis
Date made available
3 Apr 2025

Publisher
Zenodo
