Biobanks are frequently required to verify specimen relationships. We present two algorithms to compare SNP genotype patterns that provide an objective, high-throughput tool for verification.
Methods:
The first algorithm compares all holdings within a biobank and is well suited to constructing sample relationships de novo for comparison with assumed relationships. The second algorithm is tailored to oncology and confirms that paired DNAs from malignant and normal tissues are from the same individual, even in the presence of copy number variations. To evaluate both algorithms, we used an internal training data set (n = 1504) and an external validation data set (n = 1457).
Results:
In comparison with the results from manual review and a priori knowledge of patient relationships, we identified no errors in interpreting sample relationships within our validation data set.
Conclusion:
We provide an efficient and objective method of automated data analysis for establishing and verifying specimen relationships in biobanks, a capability that is currently lacking.
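
To make the kind of comparison described under Methods concrete, the following is a minimal sketch. It is not the authors' algorithms, whose details the abstract does not give. It assumes genotypes are coded 0, 1, or 2 (copies of the minor allele), with None for a no-call. The function names, the 0.95 threshold, and the rule that a heterozygous normal call is compatible with any tumor call (to tolerate loss of heterozygosity) are all illustrative assumptions.

```python
from itertools import combinations


def concordance(a, b):
    """Fraction of identical calls among SNPs called in both samples."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return 0.0
    return sum(x == y for x, y in shared) / len(shared)


def matching_pairs(genotypes, threshold=0.95):
    """All-against-all comparison; the threshold is an assumed value."""
    return [
        (s, t)
        for s, t in combinations(sorted(genotypes), 2)
        if concordance(genotypes[s], genotypes[t]) >= threshold
    ]


def loh_tolerant_concordance(normal, tumor):
    """Like concordance, but a heterozygous normal call (1) is treated as
    compatible with any tumor call, an assumed way of tolerating loss of
    heterozygosity caused by copy number changes."""
    shared = [(n, t) for n, t in zip(normal, tumor)
              if n is not None and t is not None]
    if not shared:
        return 0.0
    return sum(n == t or n == 1 for n, t in shared) / len(shared)


# Hypothetical toy data: two specimens from donor A, one from donor B.
genotypes = {
    "A_blood": [0, 1, 2, 1, 0, 2],
    "A_saliva": [0, 1, 2, 1, 0, 2],
    "B_blood": [2, 1, 0, 0, 1, 2],
}
pairs = matching_pairs(genotypes)

# Hypothetical tumor paired with A_blood, with two LOH sites and one no-call.
tumor = [0, 0, 2, 2, 0, None]
plain = concordance(genotypes["A_blood"], tumor)
tolerant = loh_tolerant_concordance(genotypes["A_blood"], tumor)
```

On this toy data, only the two donor A specimens are paired. The tumor sample scores 0.6 under plain concordance but 1.0 under the tolerant rule, which illustrates why a tumor-normal comparison may need its own criterion.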