SANA: simulated annealing far outperforms many other search algorithms for biological network alignment
Every alignment algorithm consists of two orthogonal components: an objective function M measuring the quality of an alignment, and a search algorithm that explores the space of alignments looking for ones scoring well according to M. We introduce a new search algorithm called SANA (Simulated Annealing Network Aligner) and apply it to protein-protein interaction networks using S3 as the topological measure. Compared against 12 recent algorithms, SANA produces 5-10 times as many correct node pairings as the others when the correct answer is known. We expose an anti-correlation in many existing aligners between their ability to produce good topological vs. functional similarity scores, whereas SANA usually outscores other methods in both measures. If given the perfect objective function encoding the identity mapping, SANA quickly converges to the perfect solution while many other algorithms falter. We observe that when aligning networks with a known mapping and optimizing only S3, SANA creates alignments that are not perfect and yet whose S3 scores match that of the perfect alignment. We call this phenomenon saturation of the topological score. Saturation implies that a measure's correlation with alignment correctness falters before the perfect alignment is reached. This, combined with SANA's ability to produce the perfect alignment if given the perfect objective function, suggests that better objective functions may lead to dramatically better alignments. We conclude that future work should focus on finding better objective functions, and offer SANA as the search algorithm of choice.Availability and Implementation:
Software available at http://sana.ics.uci.edu.Contact:
Supplementary data are available at Bioinformatics online.