Metabolic network reconstructions are often incomplete. Constraint-based and pattern-based methodologies have been used for automated gap filling of these networks, each with its own strengths and weaknesses. Moreover, since validation of hypotheses made by gap filling tools require experimentation, it is challenging to benchmark performance and make improvements other than that related to speed and scalability.Results:
We present BoostGAPFILL, an open source tool that leverages both constraint-based and machine learning methodologies for hypotheses generation in gap filling and metabolic model refinement. BoostGAPFILL uses metabolite patterns in the incomplete network captured using a matrix factorization formulation to constrain the set of reactions used to fill gaps in a metabolic network. We formulate a testing framework based on the available metabolic reconstructions and demonstrate the superiority of BoostGAPFILL to state-of-the-art gap filling tools. We randomly delete a number of reactions from a metabolic network and rate the different algorithms on their ability to both predict the deleted reactions from a universal set and to fill gaps. For most metabolic network reconstructions tested, BoostGAPFILL shows above 60% precision and recall, which is more than twice that of other existing tools.Availability and Implementation:
MATLAB open source implementation (https://github.com/Tolutola/BoostGAPFILL)Contacts:
email@example.com or firstname.lastname@example.org.Supplementary information:
Supplementary data are available at Bioinformatics online.