Skin Doctor CP
Predicting the Skin Sensitization Potential of Small Molecules

About Skin Doctor CP

Skin Doctor CP is a machine learning model for the classification of small organic compounds into skin sensitizers and non-sensitizers. More specifically, the core of Skin Doctor CP is a random forest binary classifier that is enveloped in an aggregated Mondrian conformal prediction framework. This allows users to define an error significance level (i.e. the accepted error rate) for classification. Predictions (i.e. sensitizer or non-sensitizer) are thus only reported for compounds for which the expected reliability reaches or exceeds the confidence level (one minus the error rate) defined by the user.
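To illustrate how a significance level turns class-wise p-values into a reported prediction, the following is a minimal sketch of the textbook Mondrian (class-conditional) conformal prediction rule. It is not the Skin Doctor CP implementation: the function names, the class labels used as dictionary keys, and the rule that only single-class prediction sets are reported are assumptions made for illustration, and the aggregation over several calibrated models is omitted.

```python
def mondrian_p_value(calibration_scores, test_score):
    """P-value of a test compound for ONE class (hypothetical helper).

    calibration_scores: nonconformity scores of the calibration compounds
    that belong to this class only (this is the "Mondrian" part).
    test_score: nonconformity score of the test compound for this class.
    """
    n_at_least = sum(1 for s in calibration_scores if s >= test_score)
    return (n_at_least + 1) / (len(calibration_scores) + 1)


def prediction_set(p_values, significance):
    """Classes that cannot be rejected at the chosen significance level."""
    return {label for label, p in p_values.items() if p > significance}


def reported_prediction(p_values, significance):
    """Return a class label only if exactly one class remains (assumed rule).

    Returns None when the prediction set is empty or contains both classes.
    """
    labels = prediction_set(p_values, significance)
    return next(iter(labels)) if len(labels) == 1 else None
```

For example, with p-values of 0.42 for the sensitizing class and 0.03 for the non-sensitizing class and a significance level of 0.2, only the sensitizing class remains in the prediction set. Lowering the significance level to 0.01 leaves both classes in the set, so under the assumed rule no prediction would be reported.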

Skin Doctor CP is trained on a curated data set of 1285 compounds measured in the local lymph node assay (LLNA). As of October 2020, this curated data set is the largest of its kind.

Further Information

Details on the methods and performance of the model are provided in the accompanying publication (to be published).

How to use the Skin Doctor Suite web service

Enter SMILES, draw a molecule, or upload a file (.smi or .sdf) containing up to 100k molecules. The .smi file must contain exactly one SMILES per row. If additional information is provided for a molecule, it should be separated from the SMILES notation by a single space character.

Optionally, select a significance level different from the default value by moving the slide bars to the desired value.

Click the submit button to start the calculations. You will then be forwarded to the results page.
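The .smi layout described above (one SMILES per row, optional additional information after a single space) can be checked locally before upload. The sketch below is only an illustration: the function names are hypothetical, skipping blank rows is an assumption, and the SMILES strings themselves are not validated chemically.

```python
MAX_MOLECULES = 100_000  # upload limit stated above ("up to 100k molecules")


def parse_smi_line(line):
    """Split one row into (smiles, additional_info); None for a blank row."""
    line = line.strip()
    if not line:
        return None
    # Everything before the first space is the SMILES; the rest is extra info.
    smiles, _, info = line.partition(" ")
    return smiles, info


def read_smi(text, max_molecules=MAX_MOLECULES):
    """Parse the contents of a .smi file into a list of (smiles, info) pairs."""
    records = [r for r in map(parse_smi_line, text.splitlines()) if r is not None]
    if len(records) > max_molecules:
        raise ValueError(
            f"{len(records)} molecules found, but at most {max_molecules} are allowed"
        )
    return records
```

A file with the rows `CCO ethanol` and `c1ccccc1` would yield `[("CCO", "ethanol"), ("c1ccccc1", "")]`.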


The results of the calculations will be displayed as a color-coded table. Additionally, users can download the results as a .csv file or check the results online at a later point in time using the web link provided. Results will be deleted permanently after 60 days or as soon as the user clicks on the “Delete results” button.

In the “Show/hide columns” section, users can select the columns to be displayed in the results table. The results table contains, among others, the columns explained in Table 1. Table 2 explains the error and warning codes that may be reported.

Table 1. Explanation of the most important output columns.

Column name


General information:


Unique integer assigned to each submitted molecule


SMILES as submitted by the user

Filtered SMILES

SMILES after preprocessing

2D structure

2D structure of the preprocessed molecule

Error significance

Prediction of Skin Doctor CP with a reliability fulfilling the selected error significance


P-values for the sensitizing and the non-sensitizing class


Code for any errors or warnings thrown during the preparation of molecular structures. See Table 2 for explanation.

Table 2. Errors and Warnings.


Error message or warning


Invalid or empty input. No output was produced. Further information may be provided by additional messages.


The salt filter identified a multi-component SMILES for which the core component could not be unambiguously determined. The reported prediction may be unreliable.


The salt filter has removed at least one component of the input SMILES.


Element types other than those represented in the training data were detected. A result was generated but is probably unreliable.


The molecule was found to be broken during the canonicalization procedure. Always thrown together with ‘!1’.


The molecule was found to be broken during the neutralization procedure. Always thrown together with ‘!1’.

Citing Skin Doctor CP

Wilm, A.; Stork, C.; Bauer, C.; Schepky, A.; Kühnl, J.; Kirchmair, J. Skin Doctor: Machine Learning Models for Skin Sensitization Prediction that Provide Estimates and Indicators of Prediction Reliability. Int. J. Mol. Sci. 2019.
doi: 10.3390/ijms20194833

Wilm, A.; Norinder, U.; Agea, M. I.; de Bruyn Kops, C.; Stork, C.; Kühnl, J.; Kirchmair, J. Skin Doctor CP: Conformal prediction of the skin sensitization potential of small organic molecules. Chem. Res. Toxicol. 2020.

Stork, C.; Embruch, G.; Šícho, M.; de Bruyn Kops, C.; Chen, Y.; Svozil, D.; Kirchmair, J. NERDD: a web portal providing access to in silico tools for drug discovery. Bioinformatics 2020.
doi: 10.1093/bioinformatics/btz695


To report a problem or give feedback, please go to the Feedback & Support page or contact us directly: