Automated selection of signals in protein NMR spectra, known as peak picking, has been studied for over 20 years; nevertheless, existing peak picking methods are still largely deficient. Accurate and precise automated peak picking would accelerate structure calculation and the analysis of the dynamics and interactions of macromolecules. Recent advances in handling big data, together with a surge in machine learning techniques, offer an opportunity to tackle the peak picking problem substantially faster than manual picking and at human-level accuracy. In particular, deep learning has been shown to systematically achieve human-level performance in various recognition tasks, and thus emerges as an ideal tool for the automated identification of NMR signals.
Results:
We have applied a convolutional neural network to the visual analysis of multidimensional NMR spectra. A comprehensive test on 31 manually annotated spectra demonstrated top-tier average precision (AP) of 0.9596, 0.9058 and 0.8271 for backbone, side-chain and NOESY spectra, respectively. Furthermore, combining the extracted peak lists with the automated assignment routine FLYA outperformed other methods, including the manual one, and led to correct resonance assignments at levels of 90.40%, 89.90% and 90.20% for three benchmark proteins.
Availability and implementation:
The proposed model is part of the Dumpling software (a platform for protein NMR data analysis) and is available at https://dumpling.bio/.
Supplementary information:
Supplementary data are available at Bioinformatics online.