Motivation: Cell differentiation is steered by extracellular signals that activate a cell type specific transcriptional program. Molecular mechanisms that drive the differentiation can be analyzed by combining mathematical modeling with population average data. For standard mathematical models, the population average data is informative only if the measurements come from a homogeneous cell culture. In practice, however, the differentiation efficiencies are always imperfect. Consequently, cell cultures are inherently mixtures of several cell types, which have different molecular mechanisms and exhibit quantitatively different dynamics. There is an urgent need for data-driven mathematical modeling approaches that can detect possible heterogeneity and, further, recover the molecular mechanisms from heterogeneous data.
Results: We develop a novel method that models a heterogeneous population using homogeneous subpopulations that evolve in parallel. Different subpopulations can represent different cell types and each subpopulation can have cell type specific molecular mechanisms. We present statistical methodology that can be used to quantify the effect of heterogeneity and to infer the subpopulation specific molecular interactions. After a proof of principle study with simulated data, we apply our methodology to analyze the differentiation of human Th17 cells using time-course RNA sequencing data. We construct putative molecular networks driving the T cell activation and Th17 differentiation and allow the cell populations to be split into two subpopulations in the case of heterogeneous samples. Our analysis shows that the heterogeneity indeed has a statistically significant effect on observed dynamics and, furthermore, our statistical methodology can infer both the subpopulation specific molecular mechanisms and the effect of heterogeneity.
Availability and Implementation: An implementation of the method is available at http://research.ics.aalto.fi/csb/software/subpop/.
Contact:firstname.lastname@example.org or email@example.com
Supplementary information: Supplementary data are available at Bioinformatics online.