Motivation: Horizontal gene transfer (HGT) is a fundamental mechanism that enables organisms such as bacteria to directly transfer genetic material between distant species. This way, bacteria can acquire new traits such as antibiotic resistance or pathogenic toxins. Current bioinformatics approaches focus on the detection of past HGT events by exploring phylogenetic trees or genome composition inconsistencies. However, these techniques normally require the availability of finished and fully annotated genomes and of sufficiently large deviations that allow detection and are thus not widely applicable. Especially in outbreak scenarios with HGT-mediated emergence of new pathogens, like the enterohemorrhagic Escherichia coli outbreak in Germany 2011, there is need for fast and precise HGT detection. Next-generation sequencing (NGS) technologies facilitate rapid analysis of unknown pathogens but, to the best of our knowledge, so far no approach detects HGTs directly from NGS reads.
Results: We present Daisy, a novel mapping-based tool for HGT detection. Daisy determines HGT boundaries with split-read mapping and evaluates candidate regions relying on read pair and coverage information. Daisy successfully detects HGT regions with base pair resolution in both simulated and real data, and outperforms alternative approaches using a genome assembly of the reads. We see our approach as a powerful complement for a comprehensive analysis of HGT in the context of NGS data.
Availability and Implementation: Daisy is freely available from http://github.com/ktrappe/daisy.
Supplementary information: Supplementary data are available at Bioinformatics online.