Whole genome expression profiling of large cohorts of different types of cancer has led to the identification of distinct molecular subcategories (subtypes) that may partially explain the observed inter-tumoral heterogeneity. This is also the case for colorectal cancer (CRC), where several such categorizations have been proposed. Despite recent developments, the problem of subtype definition and recognition remains open, one of the causes being the intrinsic heterogeneity of each tumor, which is difficult to estimate from gene expression profiles. However, one observation from these studies is that there may be links between the dominant tumor morphology characteristics and the molecular subtypes. Benefiting from a large collection of CRC samples, comprising both gene expression and histopathology images, we investigated the possibility of building image-based classifiers able to predict the molecular subtypes. We employed deep convolutional neural networks to extract local descriptors, which were then used to construct a dictionary-based representation of each tumor sample. A set of support vector machine classifiers was trained to solve different binary decision problems, and their combined outputs were used to predict one of the five molecular subtypes.
Results: A hierarchical decomposition of the multi-class problem was obtained with an overall accuracy of 0.84 (95% CI = 0.79-0.88). The predictions from the image-based classifier showed significant prognostic value similar to their molecular counterparts.
Contact: email@example.com
Availability and Implementation: Source code used for the image analysis is freely available from https://github.com/higex/qpath.
Supplementary information: Supplementary data are available at Bioinformatics online.