The value of prior knowledge in machine learning of complex network systems
Our overall goal is to develop machine-learning approaches that use genomic and other accessible, relevant information to predict how a patient will respond to a proposed drug or treatment. Given the complexity of this problem, we begin by developing, testing and analyzing learning methods on data from simulated systems, which gives us access to a known ground truth. We examine the benefits of using prior system knowledge and investigate how learning accuracy depends on various system parameters and on the amount of training data available.
Results
The simulations are based on Boolean networks (directed graphs with 0/1 node states and logical node update rules), which are the simplest computational systems that can mimic the dynamic behavior of cellular systems. Boolean networks can be generated and simulated at scale and have complex yet cyclical dynamics. As such, they provide a useful framework for developing machine-learning algorithms for modular and hierarchical networks such as biological systems in general and cancer in particular. We demonstrate that using prior knowledge in the form of network connectivity information, without detailed state equations, greatly increases the power of machine-learning algorithms to predict network steady-state node values (‘phenotypes’) and perturbation responses (‘drug effects’).
Availability and implementation
Links to code and datasets are available at https://gray.mgh.harvard.edu/people-directory/71-david-craft-phd.
Supplementary data are available at Bioinformatics online.
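As an illustration of the kind of simulated system described under Results, the following is a minimal sketch of a random Boolean network, not the authors' implementation. It assumes synchronous updates, a fixed number of inputs per node, and randomly drawn truth tables; the names `make_network`, `step` and `find_attractor` and all parameter values are hypothetical. The attractor reached from a start state stands in for a 'phenotype', and clamping a node to a fixed value stands in for a 'drug effect'.

```python
import random
from itertools import product


def make_network(n_nodes, k_inputs, seed=0):
    """Build a random Boolean network (illustrative assumption, not the paper's generator).

    Each node reads k_inputs distinct nodes and has a random truth table.
    """
    rng = random.Random(seed)
    inputs = [rng.sample(range(n_nodes), k_inputs) for _ in range(n_nodes)]
    tables = [
        {bits: rng.randint(0, 1) for bits in product((0, 1), repeat=k_inputs)}
        for _ in range(n_nodes)
    ]
    return inputs, tables


def step(state, inputs, tables, clamp=None):
    """Apply one synchronous update; clamped nodes keep their forced value."""
    clamp = clamp or {}
    nxt = tuple(
        tables[i][tuple(state[j] for j in inputs[i])] for i in range(len(state))
    )
    return tuple(clamp.get(i, v) for i, v in enumerate(nxt))


def find_attractor(state, inputs, tables, clamp=None):
    """Iterate until a state repeats and return the cycle of states.

    A cycle of length 1 is a fixed point (a steady state).
    """
    clamp = clamp or {}
    state = tuple(clamp.get(i, v) for i, v in enumerate(state))
    seen = {}
    trajectory = []
    while state not in seen:  # terminates: the state space is finite
        seen[state] = len(trajectory)
        trajectory.append(state)
        state = step(state, inputs, tables, clamp)
    return trajectory[seen[state]:]


if __name__ == "__main__":
    inputs, tables = make_network(n_nodes=8, k_inputs=2, seed=1)
    start = (0, 1) * 4
    baseline = find_attractor(start, inputs, tables)
    knockout = find_attractor(start, inputs, tables, clamp={0: 0})
    print("baseline attractor length:", len(baseline))
    print("attractor length with node 0 clamped to 0:", len(knockout))
```

In this sketch the `inputs` lists play the role of the network connectivity information, while the `tables` correspond to the detailed state equations that a learning algorithm would not be given.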