GLORY
Predicting Cytochrome P450 Metabolites

About GLORY

GLORY was designed to predict the metabolites that enzymes of the cytochrome P450 (CYP) family can form in humans. It combines two components: site of metabolism (SoM) prediction with FAME 2, and transformation of the molecule into its potential metabolites using a new set of reaction rules developed specifically for GLORY.

Incorporating Site of Metabolism Prediction

SoM prediction is the prediction of metabolically labile atom positions in a molecule. FAME 2, developed previously in our research group, is a machine learning-based tool for predicting SoMs for CYP metabolism in humans. Its models were built using the extremely randomized trees algorithm and 2D circular descriptors of atoms and their environments. On an independent test set, FAME 2 achieved a Matthews correlation coefficient (MCC) of 0.57 and an AUC of 0.91. For more details on FAME 2, see the FAME 2 publication.

GLORY uses SoM prediction with FAME 2 as an initial step in the prediction of the metabolite structures. Depending on the mode (MaxCoverage or MaxEfficiency), the predicted SoMs are used in one of two ways:

  • In MaxEfficiency mode, the SoM probabilities predicted by FAME 2 are used as a preliminary filter for the positions in the molecule at which the reaction rules are applied. Hence a metabolite is only predicted if the reaction to create it took place at a predicted SoM.
  • In MaxCoverage mode, the reaction rules are applied at all positions in the molecule, regardless of the SoM probabilities that FAME 2 predicted for the atoms involved in the reaction. The SoM probabilities are still used, however, to score the predicted metabolites as part of a new scoring approach that was found to be effective.
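
The difference between the two modes can be sketched in a few lines of Python. This is only an illustration, not GLORY's implementation: the `Candidate` class, the `select_candidates` function, the 0.5 cutoff, and the choice to accept a reaction when any involved atom is a predicted SoM are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical cutoff: the text does not say how a "predicted SoM" is decided.
SOM_THRESHOLD = 0.5


@dataclass(frozen=True)
class Candidate:
    """A metabolite proposed by applying one reaction rule (illustrative only)."""
    product: str          # placeholder label, not a real structure
    atom_indices: tuple   # parent atoms involved in the reaction


def select_candidates(candidates, som_probability, mode="MaxCoverage"):
    """Return the candidates that survive the mode-specific SoM filter."""
    if mode == "MaxCoverage":
        # Rules are applied everywhere; nothing is filtered out here.
        return list(candidates)
    if mode == "MaxEfficiency":
        # Assumption: one involved atom at or above the cutoff is enough.
        return [
            c for c in candidates
            if any(som_probability[i] >= SOM_THRESHOLD for i in c.atom_indices)
        ]
    raise ValueError(f"unknown mode: {mode}")
```

With SoM probabilities `{0: 0.9, 1: 0.1}`, a candidate formed at atom 1 is kept in MaxCoverage mode and dropped in MaxEfficiency mode.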

Reaction Rules

The reaction rules were developed based on known CYP-mediated reactions documented in the scientific literature. Hence the reaction rule base is not biased by any particular dataset.

The reactions found in the literature were represented as SMIRKS based on our chemical knowledge. The full list of reaction types and their SMIRKS can be found in the publication.

Scoring and Ranking of Predictions

The predictions made by GLORY are scored and ranked (per input molecule) based on the predicted SoM probabilities of the atoms involved in the reaction and whether the reaction type is common or not.
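
The exact scoring formula is given in the publication rather than on this page. The sketch below only illustrates the idea under stated assumptions: the score is taken as the highest SoM probability among the atoms involved in the reaction, multiplied by a weight that favors common reaction types. The combination rule, the weight values, and all names are hypothetical.

```python
# Hypothetical weights: the text only says that common reaction types matter.
COMMON_WEIGHT = 1.0
UNCOMMON_WEIGHT = 0.5


def score(som_probs, is_common):
    """Combine SoM probabilities and reaction commonness (assumed rule)."""
    weight = COMMON_WEIGHT if is_common else UNCOMMON_WEIGHT
    return max(som_probs) * weight


def rank(predictions):
    """Rank the predictions for one input molecule, best score first.

    Each prediction is a dict with the keys 'reaction', 'som_probs'
    and 'common'. Returns (rank, reaction, score) tuples.
    """
    ordered = sorted(
        predictions,
        key=lambda p: score(p["som_probs"], p["common"]),
        reverse=True,
    )
    return [
        (position, p["reaction"], score(p["som_probs"], p["common"]))
        for position, p in enumerate(ordered, start=1)
    ]
```

Under these assumed weights, a common reaction at an atom with SoM probability 0.8 outranks an uncommon reaction at an atom with probability 0.9.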

Further Information

For more details on the method development and evaluation of GLORY, including the reaction rules and the datasets, please refer to the publication.

Usage

Choose which mode you would like to use for metabolite prediction: MaxCoverage or MaxEfficiency. MaxCoverage is the default mode.

Choosing a mode

MaxCoverage
  • Default recommended mode.
  • Applies the reaction rules to all positions in the molecule.
  • Was found to result in high recall of known metabolites and was able to meaningfully rank the large number of predicted metabolites.

MaxEfficiency
  • Recommended when limiting the number of predicted metabolites is of critical interest.
  • Applies the reaction rules only at positions which were predicted to be SoMs.
  • Was found to result in a lower recall and slightly higher precision than MaxCoverage mode.

Enter SMILES, draw a molecule, or upload a file (.smi or .sdf). A SMILES file may contain up to 1,000 molecules; an SDF file may be up to 40 MB in size (approximately 15,000 molecules). Please note that files larger than a few MB may take some time to upload. Click submit to start the calculation. You will then be forwarded to the result page.

Note that GLORY only makes predictions for input molecules containing at least 3 heavy atoms and does not predict any metabolites containing fewer than 3 heavy atoms. Note also that GLORY cannot make predictions for molecules containing any atoms other than the following: C, N, S, O, H, F, Cl, Br, I, and P. This is because FAME 2 cannot make predictions for molecules containing atoms that are not included in this list.
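
To screen molecules before submitting them, the two restrictions above can be checked locally. The following is a minimal sketch, assuming the element symbols of all atoms have already been extracted by some other means; the function name and the message wording are hypothetical and do not reflect the server's own error messages.

```python
# Element list and heavy-atom minimum as stated in the text above.
ALLOWED_ELEMENTS = frozenset({"C", "N", "S", "O", "H", "F", "Cl", "Br", "I", "P"})
MIN_HEAVY_ATOMS = 3


def check_atoms(symbols):
    """Return a list of problems for a molecule given its element symbols.

    An empty list means the molecule passes both checks.
    """
    problems = []
    unsupported = sorted(set(symbols) - ALLOWED_ELEMENTS)
    if unsupported:
        problems.append("unsupported elements: " + ", ".join(unsupported))
    heavy_atoms = sum(1 for s in symbols if s != "H")
    if heavy_atoms < MIN_HEAVY_ATOMS:
        problems.append(f"only {heavy_atoms} heavy atoms (minimum {MIN_HEAVY_ATOMS})")
    return problems
```

For example, `check_atoms(["C", "C", "O", "H"])` returns an empty list, while a molecule containing Si is reported as unsupported.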

Preferred Format of Input Molecules for Best Results

Each SMILES and/or SDF entry should represent a single-component molecule. No predictions are made for multi-component molecules.

All molecules should be neutral and already have explicit hydrogens added. If hydrogens are missing, the software will attempt to add the correct hydrogens automatically before making predictions.
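
For SMILES input, both recommendations can be approximated with simple string checks. This is a heuristic sketch, not the validation GLORY performs: it relies on the SMILES conventions that "." separates disconnected components and that "+" or "-" inside a bracket atom denotes a formal charge. The function names are hypothetical, and a zwitterion is flagged even though its net charge is zero.

```python
import re

# Matches bracket atoms such as [O-], [NH3+] or [C@@H].
BRACKET_ATOM = re.compile(r"\[[^\]]*\]")


def is_single_component(smiles):
    """True if the SMILES contains no '.' component separator."""
    return "." not in smiles


def has_formal_charge(smiles):
    """True if any bracket atom carries a '+' or '-' charge symbol.

    A '-' outside brackets is a single bond and is ignored.
    """
    return any(
        "+" in atom or "-" in atom for atom in BRACKET_ATOM.findall(smiles)
    )
```

A salt such as `CC(=O)[O-].[Na+]` fails both checks, whereas `CCO` passes both.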

Output

On the result page, you will be able to download the predictions as an .sdf file. In the .sdf file, the structures of the predicted metabolites are provided along with the following information for each predicted metabolite:

  • Rank (among predicted metabolites for the particular parent molecule)
  • Score
  • Reaction name
  • Identifying information for the parent molecule (i.e. the input molecule for which the metabolite was predicted):
    • InChI
    • SMILES
    • ID
    If there were multiple input molecules, the ID of the parent molecule corresponds to the molecule’s position in the ordered list of input molecules (i.e. its position in the input file).
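
The per-metabolite information is stored in the data fields of the .sdf file, which can be read without a cheminformatics toolkit. The sketch below is a minimal reader based on the general SDF layout (a `> <FieldName>` header, value lines, a blank line, and `$$$$` between records). The exact field names used in GLORY's output are not listed here, so the names in the usage note are assumptions.

```python
def read_sdf_fields(text):
    """Return one dict of data fields per record in an SDF string."""
    records = []
    fields = {}
    current = None  # name of the data field being read, if any
    for line in text.splitlines():
        if line.strip() == "$$$$":
            # End of record: store the collected fields and start over.
            records.append({k: "\n".join(v) for k, v in fields.items()})
            fields = {}
            current = None
        elif line.startswith(">") and "<" in line and ">" in line[1:]:
            # Data header such as "> <Score>".
            current = line[line.index("<") + 1 : line.rindex(">")]
            fields[current] = []
        elif current is not None:
            if line.strip() == "":
                current = None  # a blank line ends the field value
            else:
                fields[current].append(line)
    return records
```

If the file contained fields named, say, `Rank` and `Score`, then `read_sdf_fields(text)[0]["Score"]` would give the score of the first predicted metabolite as a string.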

If the same metabolite was predicted via multiple reaction rules, the information corresponding to the version with the highest score is reported.
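
The deduplication described above amounts to keeping, for each distinct metabolite, the entry with the highest score. This is a minimal sketch, assuming each prediction is a dict with a canonical structure identifier under the hypothetical key `structure` (for example an InChI) and a numeric `score`.

```python
def keep_best(predictions):
    """Keep only the highest-scoring prediction for each structure."""
    best = {}
    for prediction in predictions:
        key = prediction["structure"]
        if key not in best or prediction["score"] > best[key]["score"]:
            best[key] = prediction
    return list(best.values())
```

If the same structure is produced by two rules with scores 0.4 and 0.7, only the entry with score 0.7, including its reaction name, is retained.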

Viewing the Predicted Metabolites

If the input contains fewer than 25 molecules, the individual predictions for each input molecule can be viewed. A visualization of each input molecule and its predicted metabolites is provided, as well as a visual representation of the FAME 2 site of metabolism predictions.

If no predictions could be made for a particular input molecule, a corresponding error message will be displayed.

Citing GLORY

de Bruyn Kops, C.; Stork, C.; Šícho, M.; Kochev, N.; Svozil, D.; Jeliazkova, N.; Kirchmair, J. GLORY: Generator of the Structures of Likely Cytochrome P450 Metabolites Based on Predicted Sites of Metabolism. Front. Chem. 2019, 7:402.
doi: 10.3389/fchem.2019.00402

Stork, C.; Embruch, G.; Šícho, M.; de Bruyn Kops, C.; Chen, Y.; Svozil, D.; Kirchmair, J. NERDD: a web portal providing access to in silico tools for drug discovery. Bioinformatics 2020.
doi: 10.1093/bioinformatics/btz695

Problems?

To report a problem or give feedback, please go to the Feedback & Support page or contact us directly: