1 Department of Informatics, King's College London, London, UK
2 Genomics England, Charterhouse Square, London, UK
Motivation: Conserved non-coding elements (CNEs) are an enigmatic class of genomic elements which, despite being extremely conserved across evolution, do not encode proteins. Their functions are still largely unknown, so there is a need to investigate their roles in genomes systematically. To this end, identifying sets of CNEs in a wide range of organisms is an important first step. Currently, there are no tools published in the literature for systematically identifying CNEs in genomes.

Results: We fill this gap by presenting CNEFinder, a tool for identifying CNEs between two given DNA sequences with user-defined criteria. The results presented here show the tool's ability to identify CNEs accurately and efficiently. CNEFinder is based on a k-mer technique for computing maximal exact matches. The tool thus neither requires nor computes whole-genome alignments or indexes, such as the suffix array or the Burrows-Wheeler transform (BWT), which makes it flexible to use on a wide scale.

Availability and implementation: Free software under the terms of the GNU GPL (https://github.com/lorrainea/CNEFinder).
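
To illustrate the general idea of computing maximal exact matches from k-mers without a suffix array or BWT, the following is a minimal Python sketch. It is not the tool's implementation: the function name `maximal_exact_matches`, its parameters, and the simple hash-table index are assumptions made for illustration, and the sketch ignores reverse complements, memory limits, and the user-defined CNE criteria.

```python
from collections import defaultdict


def maximal_exact_matches(a, b, k):
    """Return sorted (start_a, start_b, length) triples for every
    maximal exact match of length >= k between strings a and b.

    Illustrative only: a plain hash table of k-mers stands in for
    whatever k-mer structure a real tool would use.
    """
    # Index every k-mer of the first sequence by its start positions.
    index = defaultdict(list)
    for i in range(len(a) - k + 1):
        index[a[i:i + k]].append(i)

    matches = []
    # Slide a k-mer window over the second sequence and look up seeds.
    for j in range(len(b) - k + 1):
        for i in index.get(b[j:j + k], ()):
            # If the seed can be extended to the left, the same match is
            # reported from its leftmost seed, so skip it here.
            if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
                continue
            # Extend the left-maximal seed to the right as far as it goes.
            length = k
            while (i + length < len(a) and j + length < len(b)
                   and a[i + length] == b[j + length]):
                length += 1
            matches.append((i, j, length))
    return sorted(matches)


if __name__ == "__main__":
    for start_a, start_b, length in maximal_exact_matches(
            "ACGTACGTTT", "TTACGTACGA", 4):
        print(start_a, start_b, length)
```

Each reported match cannot be extended in either direction, and only the leftmost seed of a match triggers an extension, so every match is listed once. The sketch stores all k-mers of one sequence in memory, which is reasonable for short inputs but would need a more compact representation at genome scale.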