Background: Single-cell RNA-seq analysis has emerged as a powerful tool for studying the complex transcriptional dynamics of heart development. However, an understanding of epigenetic dynamics at the single-cell level is required to discover the cis-regulatory elements and trans-acting factors that drive expression changes and lineage differentiation. Although single-cell ATAC-seq and ChIP-seq have been developed to analyze individual cells, signals from these experiments are intrinsically discrete and cannot accurately describe the continuum of chromatin accessibility and histone modifications.
Results: Using publicly available sequencing data, we built a statistical model to predict histone marks such as H3K4me3, H3K27me3 and H3K27ac, as well as chromatin accessibility signals, from the corresponding RNA-seq data. The observed and predicted epigenomic data were highly correlated (correlation coefficient of 0.86 on average). Applying this method to a combined dataset of Etv2-EYFP, Nkx2-5-EYFP and Kdr-EYFP single cells from developmental stages E6.5 to E8.25, we identified chromosomal regions with distinct histone codes and chromatin accessibility for the endothelial, hematopoietic and cardiac lineages. We found that regulatory factors that are highly expressed in multipotent progenitors are marked by high levels of both H3K4me3 and H3K27me3, whereas structural genes that are highly expressed in differentiated cell populations are marked by high levels of H3K4me3 only. Using the lineage-specific chromatin accessibility regions predicted by the model, we further identified 21 lineage-specific trans-regulators by transcription factor motif accessibility analysis, including motifs associated with well-described master regulators such as Etv2, Gata1 and Nkx2-5.
Conclusion: In this study, we presented a set of novel methods to infer histone codes and chromatin accessibility from single-cell RNA-seq data, and we predicted the causative cis- and trans-regulators that drive lineage specification. We believe these methods will greatly increase the value of existing single-cell RNA-seq data and provide an enhanced understanding of cardiogenesis.