Motivation: We introduce a coalescent-based method (RECOAL) for the simulation of new haplotype data from a reference population of haplotypes. A coalescent genealogy for the reference haplotype data is sampled from the appropriate posterior probability distribution, then a coalescent genealogy is simulated which extends the sampled genealogy to include new haplotype data. The new haplotype data will, therefore, contain both some of the existing polymorphic sites and new polymorphisms added based on the structure of the simulated coalescent genealogy. This allows exact coalescent simulation of new haplotype data, compared with other methods which are more approximate in nature.
Results: We demonstrate the performance of our method using a variety of data simulated under a coalescent model, before applying it to data from the 1000 Genomes project.
Availability: The source code is freely available for download at ftp://popgen.usc.edu
Supplementary information: Supplementary data are available at Bioinformatics online.