A program for maximum likelihood superpositioning and analysis of macromolecular structures

Author

Douglas Theobald <dtheobald@brandeis.edu>

Citations

"Optimal simultaneous superpositioning of multiple structures with missing data."

Theobald, Douglas L. & Steindel, Philip A. (2012) Bioinformatics 28(15):1972-1979 [Open Access]

"Accurate structural correlations from maximum likelihood superpositions."

Theobald, Douglas L. & Wuttke, Deborah S. (2008) PLOS Computational Biology 4(2):e43 [Open Access]

Theseus is a program that simultaneously superposes multiple macromolecular structures. Instead of using the conventional least-squares criterion, Theseus finds the optimal solution to the superposition problem by the method of maximum likelihood (ML). The ML method downweights variable regions of the structures and corrects for correlations among atoms, producing much more accurate superpositions than least squares.
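To make the idea of downweighting concrete, the following Python sketch computes inverse-variance weights for a toy set of structures. It is an illustration only, not the algorithm Theseus implements: it assumes the structures are already superposed, treats atoms as uncorrelated (a diagonal covariance), and skips the iterative estimation a full ML treatment needs. The function names, the variance floor, and the toy coordinates are all hypothetical.

```python
# Illustrative sketch only; NOT the Theseus algorithm.
# Assumptions: structures are already superposed, every structure has the
# same atoms in the same order, and atoms are treated as uncorrelated.

def atom_variances(structures):
    """Return the positional variance of each atom about the mean structure."""
    n_struct = len(structures)
    n_atoms = len(structures[0])
    variances = []
    for a in range(n_atoms):
        # Mean position of atom `a` over all structures.
        mean = [sum(s[a][k] for s in structures) / n_struct for k in range(3)]
        # Sum of squared deviations over structures and x, y, z.
        sq = sum((s[a][k] - mean[k]) ** 2 for s in structures for k in range(3))
        variances.append(sq / (3 * n_struct))
    return variances


def normalized_weights(variances, floor=1e-6):
    """Inverse-variance weights, scaled to sum to 1.

    `floor` (an arbitrary choice) guards against division by zero for
    atoms that happen to coincide exactly in every structure.
    """
    inverse = [1.0 / max(v, floor) for v in variances]
    total = sum(inverse)
    return [w / total for w in inverse]


# Toy data: three structures of three atoms; the third atom is highly variable.
structures = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    [(0.1, 0.0, 0.0), (1.0, 0.1, 0.0), (2.0, 3.0, 0.0)],
    [(0.0, 0.1, 0.0), (1.1, 0.0, 0.0), (2.0, -3.0, 0.0)],
]

variances = atom_variances(structures)
weights = normalized_weights(variances)
for index, (v, w) in enumerate(zip(variances, weights)):
    print(f"atom {index}: variance={v:.4f} weight={w:.4f}")
```

In this toy set the third atom receives a far smaller weight than the other two, so it would contribute little to a weighted fit. Ordinary least squares corresponds to giving every atom the same weight.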

When superposing macromolecules with different residue sequences, other programs and algorithms discard residues that are aligned with gaps. Theseus, however, uses a novel ML superposition algorithm that includes all of the data. To superpose homologous proteins whose sequences differ in length (e.g., when the protein sequences align with gaps and insertions), you must provide a sequence alignment. We supply a wrapper script, theseus_align (linked below), that calls Theseus, extracts the proper sequences from the PDB files, aligns them, and performs the superposition using that alignment. Future versions of Theseus will address the much harder structural alignment problem by simultaneously finding the best alignment and superposition using the method of maximum likelihood.
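The difference between discarding gapped residues and keeping all of the data can be sketched in a few lines of Python. This is a hypothetical illustration of the bookkeeping only: it is not taken from Theseus or theseus_align, and the function names, the "-" gap character, and the toy sequences and coordinates are assumptions. Each alignment column holds one coordinate per structure, with None marking a gap (missing data).

```python
# Illustrative sketch only; not code from Theseus or theseus_align.
# Assumption: "-" marks a gap, and each structure supplies one coordinate
# per residue, in sequence order.

def columns_from_alignment(aligned_seqs, coords_per_seq):
    """Map each alignment column to per-structure coordinates (None at gaps)."""
    n_cols = len(aligned_seqs[0])
    rows = []
    for seq, coords in zip(aligned_seqs, coords_per_seq):
        if len(seq) != n_cols:
            raise ValueError("aligned sequences must all have the same length")
        if len(coords) != sum(1 for ch in seq if ch != "-"):
            raise ValueError("coordinate count must match the residue count")
        remaining = iter(coords)
        rows.append([None if ch == "-" else next(remaining) for ch in seq])
    # Transpose so that each entry is one alignment column.
    return [list(column) for column in zip(*rows)]


def ungapped_columns(columns):
    """Conventional approach: keep only columns present in every structure."""
    return [c for c in columns if all(x is not None for x in c)]


# Toy alignment of two four-residue sequences over five columns.
aligned = ["ACD-F", "A-DEF"]
coords = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (4.0, 0.0, 0.0)],
    [(0.0, 0.1, 0.0), (2.0, 0.1, 0.0), (3.0, 0.1, 0.0), (4.0, 0.1, 0.0)],
]

columns = columns_from_alignment(aligned, coords)
kept = ungapped_columns(columns)
observed = sum(1 for c in columns for x in c if x is not None)

print(f"columns in alignment: {len(columns)}")
print(f"columns kept when gaps are discarded: {len(kept)}")
print(f"observed coordinates available to an all-data method: {observed}")
```

Here, discarding gapped columns leaves three columns (six coordinates), whereas a method that handles missing data can draw on all eight observed coordinates.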