Motivation: Causality between two diseases is valuable information as subsidiary information for medicine which is intended for prevention, diagnostics and treatment. Conventional cohort-centric researches are able to obtain very objective results, however, they demands costly experimental expense and long period of time. Recently, data source to clarify causality has been diversified: available information includes gene, protein, metabolic pathway and clinical information. By taking full advantage of those pieces of diverse information, we may extract causalities between diseases, alternatively to cohort-centric researches.
Method: In this article, we propose a new approach to define causality between diseases. In order to find causality, three different networks were constructed step by step. Each step has different data sources and different analytical methods, and the prior step sifts causality information to the next step. In the first step, a network defines association between diseases by utilizing disease–gene relations. And then, potential causalities of disease pairs are defined as a network by using prevalence and comorbidity information from clinical results. Finally, disease causalities are confirmed by a network defined from metabolic pathways.
Results: The proposed method is applied to data which is collected from database such as MeSH, OMIM, HuDiNe, KEGG and PubMed. The experimental results indicated that disease causality that we found is 19 times higher than that of random guessing. The resulting pairs of causal-effected diseases are validated on medical literatures.
Availability and Implementation:http://www.alphaminers.net
Supplementary information: Supplementary data are available at Bioinformatics online.