Quality Assessment of DNA Sequence Data: Autopsy of a Mis-Sequenced mtDNA Population Sample

Abstract

Summary

Published DNA data sets constitute a body of sequencing results resting in silico that are supposed to reflect the variation of (once) living cells. In cases where the DNA variation reported is suspected to be fraught with artefacts, an autopsy of the full body of data is needed to clarify the amount and causes of mis-sequencing. In this paper we elaborate on strategies that allow a clear-cut identification of the problems in severely flawed mtDNA data. This approach is applied, by way of example, to a data set of HVS-I sequences from the Caucasus, published by Nasidze & Stoneking in 2001. These data bear numerous ambiguous nucleotide positions and suffer from an even higher number of phantom mutations, indicating that severe biochemical problems adversely influenced those sequencing results at the time. Furthermore, systematic omission of sequences with a long C-stretch (incurred by a transition at position 16189) must have severely biased the data set. Since no complete correction of these data has appeared to date, this example of mis-sequencing necessitates circumstantial evidence that is bullet-proof.
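
Two of the checks mentioned above, counting ambiguous nucleotide calls and measuring how often the 16189 transition appears, can be illustrated with a minimal sketch. This is not the authors' procedure: the motif notation (`"16189C"`, with `N` for an ambiguous call), the function names, and the toy haplotypes are all assumptions made for illustration, and detecting phantom mutations needs a phylogenetic comparison that is not attempted here.

```python
import re

# Assumed notation: each haplotype is a list of HVS-I variants relative to the
# reference sequence, written "<position><base>", e.g. "16189C".
# "N" marks an ambiguous call. This format is hypothetical, not from the paper.
VARIANT = re.compile(r"^(\d+)([ACGTN])$")


def parse_variant(variant):
    """Split a variant string such as '16189C' into (16189, 'C')."""
    match = VARIANT.match(variant)
    if match is None:
        raise ValueError(f"unrecognised variant: {variant!r}")
    return int(match.group(1)), match.group(2)


def summarize(haplotypes):
    """Count ambiguous calls and the share of haplotypes carrying 16189C."""
    ambiguous_calls = 0
    carriers_16189c = 0
    for motif in haplotypes:
        calls = [parse_variant(v) for v in motif]
        ambiguous_calls += sum(1 for _, base in calls if base == "N")
        if (16189, "C") in calls:
            carriers_16189c += 1
    total = len(haplotypes)
    return {
        "n": total,
        "ambiguous_calls": ambiguous_calls,
        "freq_16189C": carriers_16189c / total if total else 0.0,
    }


# Toy input, invented for illustration only (not data from the study).
toy_sample = [
    ["16189C", "16223T"],
    ["16126C", "16260N"],
    ["16311C"],
    [],  # haplotype identical to the reference in HVS-I
]
report = summarize(toy_sample)
```

A 16189C frequency far below that of comparable populations would be consistent with the kind of systematic omission described above, though it would not prove it on its own.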