This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (CC BY).
ORIGINAL RESEARCH
Classification models for assessment of influenza virus A/H1N1 inhibitors
1 Institute of Biomedical Chemistry (IBMC), Moscow, Russia
2 The Ufa Institute of Chemistry of the Ufa Federal Research Centre of the Russian Academy of Sciences (UFRC RAS), Ufa, Russia
3 Saint Petersburg Pasteur Research Institute of Epidemiology and Microbiology, Saint Petersburg, Russia
Correspondence should be addressed: Leonid A. Stolbov
10 Pogodinskaya St., b. 8, Moscow, 119121, Russia; stolbovla@yandex.ru
Financing: the development of the structure-activity relationship models was supported by the long-term State Program for Fundamental Scientific Research in the Russian Federation (2021–2030) (No. 124050800018-9). The database was developed as part of the state research project “Kinetic, spectral, luminescent, and theoretical analysis of core intermediates in chemical and biochemical oxidation processes” of the Ufa Institute of Chemistry of the Ufa Federal Research Centre of the Russian Academy of Sciences (No. 125020601626-9).
Author contribution: Stolbov LA: data analysis, model building, manuscript preparation; Borisevich SS: study concept, database preparation; Gorokhov YaV: scientific literature review for database compilation; Zarubaev VV: provision of up-to-date biological testing results; Tarasova OA: study concept and research methodology; Poroikov VV: research methodology. All authors contributed to writing and editing the paper.
We developed classification structure-activity relationship (SAR) models that relate the chemical structure of influenza A/H1N1 virus inhibitors to their activity. The models are built with an original machine learning approach, the self-consistent extreme classifier. The training set comprised expert-curated data on the structures and biological test results of 2,255 drug-like compounds. The modeled endpoints (dependent variables) were the IC50 (half-maximal inhibitory concentration), the CC50 (half-maximal cytotoxic concentration), and the selectivity index (SI, the ratio of CC50 to IC50). Chemical structures were represented by Quantitative Neighbourhoods of Atoms (QNA) descriptors. The area under the ROC curve (AUC-ROC) for the IC50, CC50, and SI models was 0.822, 0.875, and 0.875, respectively. On an independent test set of 16 structurally diverse compounds, the models achieved a classification accuracy of 63 % for antiviral activity and cytotoxicity. These results indicate that the SAR models can be used to prioritize promising drug candidates against the influenza A/H1N1 virus before they are synthesized or tested in the laboratory.
Keywords: machine learning, virtual screening, antiviral drugs, influenza A/H1N1 virus, structure-activity relationship models, selectivity index, ligand-based approach
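As an illustration of the quantities named in the abstract, the following minimal Python sketch computes a selectivity index as CC50/IC50 and a rank-based AUC-ROC for a binary classifier. It is not the authors' implementation: the function names, the SI cutoff of 10, and the example values are hypothetical and chosen only for demonstration.

```python
def selectivity_index(cc50, ic50):
    """Selectivity index: ratio of cytotoxic to inhibitory concentration."""
    if ic50 <= 0:
        raise ValueError("IC50 must be positive")
    return cc50 / ic50


def is_selective(cc50, ic50, threshold=10.0):
    """Binary class label from SI. The threshold is an assumed,
    illustrative cutoff, not a value taken from the article."""
    return selectivity_index(cc50, ic50) >= threshold


def roc_auc(labels, scores):
    """AUC-ROC as the probability that a randomly chosen positive
    is scored above a randomly chosen negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("Both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical compound: CC50 = 100 uM, IC50 = 5 uM
si = selectivity_index(100.0, 5.0)

# Hypothetical labels (1 = active) and classifier scores
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

The pairwise formulation of AUC-ROC is quadratic in the number of compounds, which is acceptable for a sketch; a sorting-based variant would be preferable for a training set of thousands of structures.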