Indel and Carryforward Correction (ICC): a new analysis approach for processing 454 pyrosequencing data

Abstract

Motivation:

Pyrosequencing technology provides an important new approach to more extensively characterize diverse sequence populations and detect low frequency variants. However, the promise of this technology has been difficult to realize, as careful correction of sequencing errors is crucial to distinguish rare variants (∼1%) in an infected host with high sensitivity and specificity.

Results:

We developed a new approach, referred to as Indel and Carryforward Correction (ICC), to cluster sequences without substitutions and locally correct only indel and carryforward sequencing errors within clusters to ensure that no rare variants are lost. ICC performs sequence clustering in the order of (i) homopolymer indel patterns only, (ii) indel patterns only and (iii) carryforward errors only, without the requirement of a distance cutoff value. Overall, ICC removed 93–95% of sequencing errors found in control datasets. On pyrosequencing data from a PCR fragment derived from 15 HIV-1 plasmid clones mixed at various frequencies as low as 0.1%, ICC achieved the highest sensitivity and similar specificity compared with other commonly used error correction and variant calling algorithms.
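The idea of clustering sequences without substitutions can be illustrated with a small sketch. The following is not the ICC implementation (which is written in Perl) but a simplified illustration of step (i) only, under the assumption that reads differing solely in homopolymer run lengths share a cluster and that the most abundant read in each cluster is taken as the corrected sequence. The function names `collapse_homopolymers` and `correct_homopolymer_indels` are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import groupby


def collapse_homopolymers(seq):
    """Reduce each run of identical bases to one base, e.g. 'AAACCG' -> 'ACG'."""
    return "".join(base for base, _ in groupby(seq))


def correct_homopolymer_indels(reads):
    """Illustrative only: cluster reads that differ solely in homopolymer
    run lengths, then rewrite each read as the most abundant read in its
    cluster. Reads that differ by a substitution get a different key and
    are therefore never merged."""
    clusters = defaultdict(list)
    for read in reads:
        clusters[collapse_homopolymers(read)].append(read)

    # Assumption: the most frequent member of a cluster is the true sequence.
    consensus = {
        key: Counter(members).most_common(1)[0][0]
        for key, members in clusters.items()
    }
    return [consensus[collapse_homopolymers(read)] for read in reads]


reads = ["ACCGT", "ACCGT", "ACCGT", "ACGT", "AGCGT"]
corrected = correct_homopolymer_indels(reads)
```

In this toy input, the single `ACGT` read is treated as a homopolymer deletion of `ACCGT` and corrected, while `AGCGT` differs by a substitution and is left untouched. No distance cutoff is needed because cluster membership is determined by the collapsed key alone. A real correction step would also have to handle non-homopolymer indels and carryforward errors, which this sketch does not attempt.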

Availability and implementation:

Source code is freely available for download at http://indra.mullins.microbiol.washington.edu/ICC. It is implemented in Perl and supported on Linux, Mac OS X and MS Windows.

Contact:

jmullins@uw.edu

Supplementary information:

Supplementary data are available at Bioinformatics online.