1 Michael Smith Genome Sciences Centre, BC Cancer Agency
2 Department of Pathology and Laboratory Medicine, Faculty of Medicine, University of British Columbia, Vancouver, BC, Canada
3 Terry Fox Laboratory, BC Cancer Agency
4 Bioinformatics Training Program, Faculty of Medicine, University of British Columbia, Vancouver, BC, Canada
5 Independent Researcher
6 Hematology Division, Faculty of Medicine, University of British Columbia, Vancouver, BC, Canada
7 European Research Institute for the Biology of Ageing, University Medical Centre Groningen, Groningen, The Netherlands
Summary: Massively parallel sequencing is now widely used, but interpretation of the data is only as good as the reference assembly to which the reads are aligned. While the number of reference assemblies has rapidly expanded, most of these remain at intermediate stages of completion, either as scaffold builds or as chromosome builds (consisting of correctly ordered, but not necessarily correctly oriented, scaffolds separated by gaps). Completion of de novo assemblies remains difficult, as regions that are repetitive or hard to sequence prevent the accumulation of larger scaffolds and create errors such as misorientations and mislocalizations. Thus, complementary methods for determining the orientation and positioning of fragments are important for finishing assemblies. Strand-seq is a method for determining template strand inheritance in single cells; this information can be used to determine relative genomic distance and orientation between scaffolds, and to find errors within them. We present contiBAIT, an R/Bioconductor package which uses Strand-seq data to repair and improve existing assemblies.

Availability and Implementation: contiBAIT is available on Bioconductor. Source files are available from GitHub.

Contact: firstname.lastname@example.org or email@example.com

Supplementary information: Supplementary data are available at Bioinformatics online.
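
The idea that template strand inheritance reveals relative orientation can be illustrated with a small sketch. It is written in Python for brevity and is not contiBAIT's code or API (contiBAIT itself is an R package). The function names, the toy strand calls and the 0.8 threshold are all hypothetical. The sketch assumes each scaffold has already been assigned a strand state per cell: WW (Watson only), CC (Crick only) or WC (mixed). Scaffolds from the same chromosome should share their WW/CC state in every cell when they have the same orientation, and show the opposite state when one of them is inverted; scaffolds from different chromosomes inherit strands independently, so their agreement should sit near one half.

```python
from typing import List, Optional

HOMOZYGOUS = ("WW", "CC")  # only these states carry orientation information


def concordance(a: List[str], b: List[str]) -> Optional[float]:
    """Fraction of informative cells in which two scaffolds share a strand state.

    A cell is informative only if both scaffolds are WW or CC in that cell;
    WC calls are skipped. Returns None if no cell is informative.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x in HOMOZYGOUS and y in HOMOZYGOUS]
    if not pairs:
        return None
    return sum(1 for x, y in pairs if x == y) / len(pairs)


def relative_orientation(a: List[str], b: List[str], threshold: float = 0.8) -> str:
    """Classify a scaffold pair from its concordance (threshold is illustrative)."""
    c = concordance(a, b)
    if c is None:
        return "unknown"
    if c >= threshold:
        return "same"        # consistent states: same chromosome, same orientation
    if c <= 1 - threshold:
        return "opposite"    # mirrored states: same chromosome, one is inverted
    return "unlinked"        # near-random agreement: probably different chromosomes


# Toy strand calls for four hypothetical scaffolds across six single cells.
scaffold_1 = ["WW", "CC", "WC", "WW", "CC", "CC"]
scaffold_2 = ["CC", "WW", "WC", "CC", "WW", "WW"]
scaffold_3 = ["WW", "CC", "WC", "WW", "CC", "CC"]
scaffold_4 = ["WW", "WW", "CC", "CC", "WC", "CC"]

for name, other in [("2", scaffold_2), ("3", scaffold_3), ("4", scaffold_4)]:
    print(f"scaffold_1 vs scaffold_{name}: {relative_orientation(scaffold_1, other)}")
```

In practice many more cells would be needed for a reliable call, and the strand states themselves would have to be derived from read directions in the aligned data; both steps are outside this sketch.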