Motivation: DNA methylation aberrations are now known to, almost universally, accompany the initiation and progression of cancers. In particular, the colon cancer epigenome contains specific genomic regions that, along with differences in methylation levels with respect to normal colon tissue, also show increased epigenetic and gene expression heterogeneity at the population level, i.e. across tumor samples, in comparison with other regions in the genome. Tumors are highly heterogeneous at the clonal level as well, and the relationship between clonal and population heterogeneity is poorly understood.
Results: We present an approach that uses sequencing reads from high-throughput sequencing of bisulfite-converted DNA to reconstruct heterogeneous cell populations by assembling cell-specific methylation patterns. Our methodology is based on the solution of a specific class of minimum cost network flow problems. We use our methods to analyze the relationship between clonal heterogeneity and population heterogeneity in high-coverage data from multiple samples of colon tumor and matched normal tissues.
Availability and implementation: http://github.com/hcorrada/methylFlow.
Supplementary information: Supplementary information is available at Bioinformatics online.