1 Department of Mathematics
2 College of Life Sciences, Zhejiang Sci-Tech University, Hangzhou 310018, China
3 College of Information Science and Engineering, Hunan University, Changsha 410082, China
4 College of Science, East China University of Technology, Nanchang 330013, China
5 School of Mathematics and Information Science, Henan Polytechnic University, Jiaozuo 454000, China
6 Department of Civil and Environmental Engineering, National University of Singapore, Singapore 117576, Singapore
7 Department of Mathematics, City University of Hong Kong, Hong Kong SAR
8 College of Veterinary Medicine, South China Agricultural University, Guangzhou 510642, China
9 School of Mathematics and Statistics, Hainan Normal University, Haikou 570100, China
10 Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
Motivation: Low-rank matrix completion has been shown to be powerful in predicting antigenic distances among influenza viruses and vaccines from partially revealed hemagglutination inhibition tables. Meanwhile, influenza hemagglutinin (HA) protein sequences are also effective in inferring antigenic distances. Thus, it is natural to integrate HA protein sequence information into a low-rank matrix completion model to help infer influenza antigenicity, which is critical to influenza vaccine development.
Results: We have proposed a novel algorithm called biological matrix completion with side information (BMCSI), which first measures HA protein sequence similarities among influenza viruses (especially on epitopes) and then integrates the similarity information into a low-rank matrix completion model to predict influenza antigenicity. This algorithm exploits both the correlations among viruses and vaccines in serological tests and the power of HA sequences in predicting influenza antigenicity. We applied this model to H3N2 seasonal influenza virus data. Compared with previous methods, we significantly reduced the prediction root-mean-square error in a 10-fold cross-validation analysis. Based on the cartographies constructed from imputed data, we showed that the antigenic evolution of H3N2 seasonal influenza is generally S-shaped while the genetic evolution is half-circle shaped. We also showed that the Spearman correlation between genetic and antigenic distances (among antigenic clusters) is 0.83, demonstrating a globally high correspondence and some local discrepancies between influenza genetic and antigenic evolution.
Finally, we showed that, historically, a genetic variance of 4.4% ± 1.2% (corresponding to 3.11 ± 1.08 antigenic distances) caused an antigenic drift event for H3N2 influenza viruses.
Availability and implementation: The software and data for this study are available at http://bi.sky.zstu.edu.cn/BMCSI/.
Contact: jialiang.email@example.com or firstname.lastname@example.org
Supplementary information: Supplementary data are available at Bioinformatics online.
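The general idea summarized under Results, completing a partially observed virus-by-antiserum table with a low-rank model while using sequence similarity as side information, can be illustrated with a minimal sketch. This is not the BMCSI formulation, which the abstract does not specify; it assumes a plain factorization fitted by gradient descent, with a graph-Laplacian penalty that pulls the latent factors of similar viruses together. The function name `complete_with_similarity`, its parameters (`rank`, `lam`, `gamma`, `lr`, `n_iter`) and the synthetic data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def complete_with_similarity(M, mask, S, rank=2, lam=0.1, gamma=0.1,
                             lr=0.01, n_iter=2000, seed=0):
    """Fill in missing entries of M (rows: viruses, columns: antisera).

    mask[i, j] is True where M[i, j] was measured.
    S is a symmetric, non-negative virus-by-virus similarity matrix
    (assumed here to stand in for HA sequence similarity).
    Minimizes, by gradient descent over U and V:
        0.5 * ||mask * (U V^T - M)||^2
        + 0.5 * lam * (||U||^2 + ||V||^2)
        + 0.5 * gamma * trace(U^T L U),   L = graph Laplacian of S
    """
    rng = np.random.default_rng(seed)
    n, m = M.shape
    U = 0.1 * rng.standard_normal((n, rank))
    V = 0.1 * rng.standard_normal((m, rank))
    L = np.diag(S.sum(axis=1)) - S          # graph Laplacian of S
    M0 = np.where(mask, M, 0.0)             # ignore unmeasured values
    for _ in range(n_iter):
        R = mask * (U @ V.T - M0)           # residual on observed entries
        grad_U = R @ V + lam * U + gamma * (L @ U)
        grad_V = R.T @ U + lam * V
        U -= lr * grad_U
        V -= lr * grad_V
    return U @ V.T


# Synthetic demonstration data (not influenza measurements).
rng = np.random.default_rng(1)
A = rng.standard_normal((20, 2))            # latent virus factors
B = rng.standard_normal((8, 2))             # latent antiserum factors
truth = A @ B.T                             # fully known toy table
mask = rng.random(truth.shape) < 0.7        # about 70% of entries observed

# Toy similarity: viruses with close latent factors count as similar.
d2 = ((A[:, None, :] - A[None, :, :]) ** 2).sum(axis=2)
S = np.exp(-d2)
np.fill_diagonal(S, 0.0)

filled = complete_with_similarity(truth, mask, S)
held_out_rmse = np.sqrt(np.mean((filled - truth)[~mask] ** 2))
print(f"held-out RMSE on synthetic data: {held_out_rmse:.3f}")
```

In this sketch `gamma` controls how strongly the similarity matrix influences the fit; setting it to zero reduces the model to ordinary regularized matrix factorization, which gives a simple baseline for judging whether the side information helps on a given data set.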