Quantification and Visualization of Natural Product-Likeness
NP-Scout is a free web service for the:
Identification of natural products in large molecular libraries
Quantification of natural product-likeness of small molecules
Visualization of atoms and areas in small molecules that are characteristic of natural products or synthetic molecules (based on similarity maps)
NP-Scout utilizes random forest classifiers trained on data sets consisting of more than 265k natural products and synthetic molecules (doi: 10.3390/biom9020043).
Molecular structures can be loaded by drawing a molecule directly with the JSME Molecular Editor, by pasting a SMILES into the field “Enter SMILES”, or by uploading a text file containing a list of SMILES. NP-Scout runs a thorough data preparation protocol to standardize the input. Therefore, users do not need to preprocess chemical structures with respect to hydrogen annotation, aromatization, protonation, tautomerism and stereochemistry. Salts are also recognized, and their minor components are removed prior to calculations.
Example upload file
Lists of SMILES should be formatted as shown in the following examples:
Example 1: One SMILES per row with no additional data
Example 2: One SMILES per row with additional data
The following separators may be used: " " (space character) or "\t" (tab).
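The row format described above can be checked locally before uploading. The following is a minimal sketch, not part of NP-Scout itself: the function names are hypothetical, and it assumes that everything after the first space or tab on a row is the additional data.

```python
import re


def parse_smiles_line(line):
    """Split one row into (smiles, additional_data) on the first space or tab.

    additional_data is None when the row contains only a SMILES.
    """
    parts = re.split(r"[ \t]+", line.strip(), maxsplit=1)
    smiles = parts[0]
    extra = parts[1] if len(parts) > 1 else None
    return smiles, extra


def parse_smiles_text(text):
    """Parse the full content of an upload file, skipping blank rows."""
    return [parse_smiles_line(row) for row in text.splitlines() if row.strip()]


# One row per separator style: space, tab, and no additional data.
example = (
    "CN1C=NC2=C1C(=O)N(C(=O)N2C)C caffeine\n"
    "CC(=O)OC1=CC=CC=C1C(=O)O\taspirin\n"
    "CC(C)CC1=CC=C(C=C1)C(C)C(=O)O\n"
)
rows = parse_smiles_text(example)
```

A row without additional data yields None in the second position, which matches the case where the results table reports an index instead of a molecule name.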
Running the calculations
Calculations are started by clicking the “Submit” button. A new web page will load that reports on the progress of calculations and displays a web link that allows users to return and inspect the results once all calculations have been completed.
Analyzing the results
The results page displays a table that presents the predictions for the query molecules.
Column "SMILES" reports the input SMILES.
Column "Molecule name" reports the name of a molecule. If not specified, an index is reported.
Column "Error/Warning" reports any errors or warnings.
!1: Invalid or empty input. No output was produced. When this code appears together with another message, the other message gives the reason why the input is invalid.
S1: The salt filter identified a multi-compound SMILES for which the core component could not be determined. A result was generated from the original input but is probably unreliable.
S0: The salt filter has removed at least one component of the input SMILES.
W1: The molecular weight is not between 150 and 1500 Da. No prediction is generated.
E1: Element types other than those present in the training data were detected. A result was generated but is probably unreliable.
C1: The molecule was broken during the canonicalization procedure. Always accompanied by "!1".
N1: The molecule was broken during the neutralization procedure. Always accompanied by "!1".
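When post-processing many results, the codes above can be grouped by their consequence for the prediction. The following is a minimal sketch under stated assumptions: the grouping follows the descriptions above, but the function name is hypothetical and the way multiple codes are separated within the "Error/Warning" field (here: commas, semicolons or whitespace) is an assumption, not documented behavior.

```python
import re

# Codes for which no prediction is produced, per the descriptions above.
NO_RESULT = {"!1", "W1", "C1", "N1"}
# Codes for which a result is generated but is probably unreliable.
UNRELIABLE = {"S1", "E1"}
# S0 is informational only: a component was removed by the salt filter.


def classify_flags(field):
    """Return 'no_result', 'unreliable' or 'ok' for one Error/Warning field.

    Assumption: multiple codes are separated by commas, semicolons or spaces.
    """
    codes = [code for code in re.split(r"[,;\s]+", field.strip()) if code]
    if any(code in NO_RESULT for code in codes):
        return "no_result"
    if any(code in UNRELIABLE for code in codes):
        return "unreliable"
    return "ok"
```

Checking for missing results first means that a combination such as "!1" with "C1" is reported as having no result rather than as merely unreliable.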
Column "NP class probability" reports the predicted probability of the molecule being a natural product.
Column "Similarity maps" shows a visualization of the similarity maps. Green highlights mark atoms contributing to the classification of a molecule as a natural product, whereas orange highlights mark atoms contributing to the classification of a molecule as a synthetic molecule.
Note that similarity maps are not calculated for molecules with a molecular weight below 150 Da or above 1500 Da.
The results in .csv format can be downloaded for further use. The .csv file contains all the information from the table of results except for the similarity maps.
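The downloaded file can be processed with standard tools. The following is a minimal sketch, assuming the .csv header uses the same column names as the results table ("NP class probability" in particular) and is comma-separated; both are assumptions to check against an actual download. The function names and the 0.5 threshold are hypothetical.

```python
import csv


def load_predictions(handle, prob_column="NP class probability"):
    """Read result rows from an open .csv file handle.

    Adds a 'probability' key holding a float, or None when the field is
    empty or not numeric (for example when no prediction was produced).
    """
    rows = []
    for record in csv.DictReader(handle):
        raw = (record.get(prob_column) or "").strip()
        try:
            record["probability"] = float(raw)
        except ValueError:
            record["probability"] = None
        rows.append(record)
    return rows


def likely_natural_products(rows, threshold=0.5):
    """Keep rows whose predicted NP class probability reaches the threshold."""
    return [
        row for row in rows
        if row["probability"] is not None and row["probability"] >= threshold
    ]
```

A typical call would open the downloaded file with open(path, newline="") and pass the handle to load_predictions. Rows without a numeric probability are kept in the loaded list but never selected by the filter.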
Below is an example of what the results look like:
Chen, Y.; Stork, C.; Hirte, S.; Kirchmair, J. NP-Scout: Machine Learning Approach for the Quantification and Visualization of the Natural Product-Likeness of Small Molecules. Biomolecules 2019, 9, 43.
Stork, C.; Embruch, G.; Šícho, M.; de Bruyn Kops, C.; Chen, Y.; Svozil, D.; Kirchmair, J. NERDD: a web portal providing access to in silico tools for drug discovery. Bioinformatics 2020.
To report a problem or give feedback, please go to the Feedback & Support page or contact us directly: