Lhasa Limited shared knowledge shared progress

Skin Sensitisation Assessment Using Derek Nexus

Derek Nexus contains expert-derived functionality for assessment of the skin sensitisation endpoint. Derek provides quantitative EC3 predictions for skin sensitisation as well as negative predictions for those query compounds which do not fire a skin sensitisation alert.


The Skin Sensitisation Brochure can be found here.

Lhasa's Defined Approach

EC3 Predictions

Performance Metrics

Features and Benefits

Negative Predictions

Performance Metrics



EC3 Predictions

The prediction is built on a Nearest Neighbour Model, where the nearest neighbours are taken from a reference set of compounds that fire the same alert as the query compound. A similarity score is calculated for the nearest neighbours and an EC3 prediction is made.

The nearest neighbour compounds are selected from over 650 compounds in the Lhasa EC3 dataset; the EC3 values for these compounds have been taken from the literature and curated by Lhasa scientists. For compounds with multiple literature EC3 values, the median was taken to reduce the influence of outliers (Figures 1-3).

Figures 1-3: An example of the curation of EC3 values for the Lhasa EC3 dataset.
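The effect of taking the median can be illustrated with a minimal Python sketch. The function name and the EC3 values are hypothetical and chosen only for illustration; this is not Lhasa's curation code.

```python
from statistics import mean, median


def curated_ec3(literature_values):
    """Return a single EC3 value (%) for a compound with several literature values.

    Assumption: the curated value is the plain median of the reported values,
    as described in the text above.
    """
    if not literature_values:
        raise ValueError("at least one literature EC3 value is required")
    return median(literature_values)


# Hypothetical literature values for one compound; 14.0 is an outlier.
reported = [1.2, 1.5, 14.0]

print(curated_ec3(reported))  # the median is not pulled towards the outlier
print(mean(reported))         # the mean is, shown here for comparison
```

With an even number of values, `statistics.median` returns the average of the two middle values.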

A diagrammatic representation of the Nearest Neighbour Model can be seen below (Figure 4). This model involves a three-step process:

  1. Firstly, the query compound is processed in Derek to determine whether a skin sensitisation alert is fired.
  2. Secondly, those compounds from the Lhasa EC3 dataset which fire the same skin sensitisation alert as the query compound are identified. Because they fire the same alert, these compounds are believed to cause skin sensitisation through the same mechanism as the query compound.
  3. Thirdly, the compounds in the Lhasa EC3 dataset are assessed using an in-house structural fingerprinting technique. They are then evaluated for their similarity to the query compound using the Tanimoto score. Up to 10 nearest neighbours are highlighted and are used to make the EC3 prediction, based on a weighted average. If fewer than 3 nearest neighbours are found, no prediction is made.
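The third step can be sketched in Python as follows. This is an illustration under stated assumptions rather than the Derek Nexus implementation: fingerprints are modelled as sets of feature identifiers (the in-house fingerprint is not described here), the weights are assumed to be the Tanimoto scores themselves, and the `min_similarity` cut-off is a hypothetical parameter. Only the limits of 10 and 3 neighbours come from the text.

```python
from typing import FrozenSet, List, Optional, Tuple

MAX_NEIGHBOURS = 10  # "up to 10 nearest neighbours" (from the text)
MIN_NEIGHBOURS = 3   # fewer than 3 neighbours: no prediction (from the text)


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto score: shared features divided by the features present in either set."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def predict_ec3(
    query_fp: FrozenSet[int],
    same_alert_refs: List[Tuple[FrozenSet[int], float]],
    min_similarity: float = 0.0,  # hypothetical cut-off, not from the source
) -> Optional[float]:
    """Predict EC3 (%) from reference compounds that fire the same alert.

    same_alert_refs holds (fingerprint, EC3) pairs. Assumption: the prediction
    is the similarity-weighted arithmetic mean of the neighbours' EC3 values.
    """
    scored = [(tanimoto(query_fp, fp), ec3) for fp, ec3 in same_alert_refs]
    scored = [(s, e) for s, e in scored if s > min_similarity]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    neighbours = scored[:MAX_NEIGHBOURS]
    if len(neighbours) < MIN_NEIGHBOURS:
        return None  # too few neighbours, so no prediction is made
    total_weight = sum(s for s, _ in neighbours)
    return sum(s * e for s, e in neighbours) / total_weight


# Hypothetical fingerprints and EC3 values.
query = frozenset({1, 2, 3, 4})
references = [
    (frozenset({1, 2, 3, 4}), 2.0),
    (frozenset({1, 2}), 10.0),
    (frozenset({1, 2, 3, 4, 5, 6, 7, 8}), 4.0),
]
print(predict_ec3(query, references))
```

Because the weights are the similarity scores, the most similar neighbours dominate the prediction; a different weighting scheme or averaging scale would change the result.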

Figure 4: The steps taken in order to predict an EC3 value:

EC3 Prediction Results

Visual (Figure 5)

  • A clear, graphical representation of the EC3 prediction is provided, which shows the nearest neighbours, their Tanimoto similarity to the query compound, and their EC3 values.
  • Each nearest neighbour is clearly identified as either a sensitiser or a non-sensitiser.
  • There is an option to display the colour-coded Globally Harmonized System of Classification and Labelling of Chemicals (GHS) or ECETOC classification. These classifications can assist in categorising compounds, and the colour coding helps with visualisation.
  • The structures of the nearest neighbours are shown, and selecting a compound brings up an information box that includes data sources and references.

Expert Fine-Tuning

  • The information that has gone into making the prediction is transparently shown and nearest neighbours can be added to or removed from the prediction based on expert assessment.
  • Users can supplement the Lhasa EC3 dataset with their own data to increase the chemical space covered. However, the compounds added must fire a skin sensitisation alert in Derek; otherwise, they cannot be used as nearest neighbours.

Figure 5: The Derek EC3 prediction for 3,5-diaminophenol:




Performance Metrics for EC3 Prediction Model

Lhasa scientists have assessed the performance of the model in predicting EC3 values for an external test set of compounds (Figure 6). The model has been designed not to under-predict, as under-prediction may bring about exposure to a chemical that is a sensitiser. The predicted ECETOC category is correct or over-predicted for 79% of compounds, and the predicted GHS category is correct or over-predicted for 89%.
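A "correct or over-predicted" rate of this kind could be computed as in the sketch below. The category ordering follows the ECETOC potency categories; the function name and the example pairs are hypothetical and are not the validation data.

```python
# ECETOC potency categories ordered from least to most potent.
POTENCY_ORDER = ["Weak", "Moderate", "Strong", "Extreme"]


def correct_or_over_predicted(pairs):
    """Fraction of (predicted, observed) category pairs that are not under-predictions.

    A prediction counts when the predicted category is the same as, or more
    potent than, the observed category.
    """
    if not pairs:
        raise ValueError("at least one (predicted, observed) pair is required")
    rank = {name: i for i, name in enumerate(POTENCY_ORDER)}
    hits = sum(1 for predicted, observed in pairs if rank[predicted] >= rank[observed])
    return hits / len(pairs)


# Hypothetical example: the third pair is an under-prediction.
example = [
    ("Strong", "Strong"),
    ("Extreme", "Strong"),
    ("Weak", "Moderate"),
    ("Moderate", "Weak"),
]
print(correct_or_over_predicted(example))
```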

Figure 6: Performance Data for Derek EC3 Predictions against an external dataset:

The ECETOC (European Centre for Ecotoxicology and Toxicology of Chemicals) classifications are split into four different categories depending on the numerical value (Figure 7).

Figure 7: ECETOC Classifications

EC3 Value (%) | <0.1 | ≥0.1 to <1 | ≥1 to <10 | ≥10 to <100
Potency Category | Extreme | Strong | Moderate | Weak
GHS | 1A - Strong (EC3 ≤2%) | 1B - Other (EC3 >2%)

The GHS (Globally Harmonized System of Classification and Labelling of Chemicals) classification has two subcategories: 1A and 1B. If an EC3 value is less than or equal to 2%, it is classified as 1A; if an EC3 value is greater than 2%, it is classified as 1B.
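The two classification schemes can be expressed as a short Python sketch. The thresholds are those given above; the function names are hypothetical, and returning `None` for values of 100 or more is an assumption, since the ECETOC table does not cover that range.

```python
from typing import Optional


def ecetoc_category(ec3: float) -> Optional[str]:
    """Map an EC3 value (%) to its ECETOC potency category."""
    if ec3 < 0:
        raise ValueError("EC3 must be non-negative")
    if ec3 < 0.1:
        return "Extreme"
    if ec3 < 1:
        return "Strong"
    if ec3 < 10:
        return "Moderate"
    if ec3 < 100:
        return "Weak"
    return None  # assumption: outside the tabulated range, no category assigned


def ghs_subcategory(ec3: float) -> str:
    """Map the EC3 value (%) of a sensitiser to GHS subcategory 1A or 1B."""
    return "1A" if ec3 <= 2 else "1B"


for value in (0.05, 1.5, 2.0, 25.0):
    print(value, ecetoc_category(value), ghs_subcategory(value))
```

Note that the two schemes do not share boundaries: a compound with an EC3 of 1.5% is Moderate under ECETOC but 1A under GHS.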

Features and Benefits

  • Increases Understanding of Toxicity: The prediction of skin sensitisation potency allows experts to more fully understand the risk a particular compound poses.
  • Clear, Visual Results: The graphical representation ensures that the expert assessment of potential sensitisers is quick and easy.
  • Transparency: Derek Nexus provides the EC3 values and structures of nearest neighbours, facilitating a thorough review by experts.
  • Facilitates Expert Review: Experts can fine-tune the EC3 predictions by adding or removing compounds from the calculation based on their expert knowledge.


Negative Predictions

Derek Nexus also evaluates compounds which do not fire an alert for skin sensitisation. The query compound is compared to the Lhasa skin sensitisation negative predictions dataset, producing the following outcomes:

  • Where all features in the molecule are found in accurately classified compounds from the dataset, a prediction of Non-Sensitiser is displayed.
  • For those compounds where features in the query are found in non-alerting sensitisers in the Lhasa dataset, the prediction remains Non-Sensitiser, but Misclassified¹ features are highlighted to enable the negative prediction to be verified by expert assessment.
  • In cases where features in the query are not found in the Lhasa dataset, the prediction remains Non-Sensitiser, but the Unclassified² features are highlighted to enable the negative prediction to be verified by expert assessment.
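The three outcomes can be sketched with simple set operations. This is an illustration, not the Derek Nexus fragmentation approach: features are modelled as plain strings, the function and parameter names are hypothetical, and a feature seen in a non-alerting sensitiser is assumed to be flagged even if it also occurs in correctly classified compounds.

```python
def negative_prediction(query_features, correctly_classified, from_non_alerting_sensitisers):
    """Classify the features of a query compound that fires no alert.

    query_features: features found in the query compound.
    correctly_classified: features seen in accurately classified reference compounds.
    from_non_alerting_sensitisers: features seen in sensitisers that fire no alert.
    """
    misclassified = query_features & from_non_alerting_sensitisers
    unclassified = query_features - correctly_classified - from_non_alerting_sensitisers
    return {
        "prediction": "Non-Sensitiser",  # the call is the same in all three outcomes
        "misclassified": misclassified,  # highlighted for expert review
        "unclassified": unclassified,    # highlighted for expert review
    }


# Hypothetical feature names.
result = negative_prediction(
    query_features={"amide", "ether", "nitrile"},
    correctly_classified={"amide", "ether"},
    from_non_alerting_sensitisers={"ether"},
)
print(result)
```

When both highlighted sets are empty, the result corresponds to the first outcome: a Non-Sensitiser prediction with nothing flagged.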

Please view the following infographics for more information on understanding negative predictions:

Understanding Negative Predictions in Derek Nexus

Understanding Negative Predictions in Derek Nexus – a detailed workflow

  1. Misclassified features are those that have been derived from non-alerting mutagens/skin sensitisers in the Lhasa reference sets.
  2. Unclassified features are those that have not been found in the Lhasa reference sets.

Lhasa Skin Sensitisation Negative Predictions Dataset

The Lhasa Skin Sensitisation Negative Predictions Dataset comprises a mixture of human and animal data. In order to assign an experimental call for a particular reference compound, an assay hierarchy is used to rank the data. For example, human data are ranked above standard animal assays, which in turn are ranked above non-standard and other animal assays.
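A hierarchy of this kind can be sketched as an ordered lookup. The source names and the function are hypothetical, and placing non-standard animal assays above other animal assays is an assumption; the text states only that both rank below standard animal assays.

```python
# Data sources ordered from highest to lowest rank.
# Assumption: "non_standard_animal" outranks "other_animal".
ASSAY_HIERARCHY = ["human", "standard_animal", "non_standard_animal", "other_animal"]


def overall_call(calls_by_source):
    """Return (source, call) from the highest-ranked data source available.

    calls_by_source maps a data source name to its experimental call,
    for example {"human": "non-sensitiser"}.
    """
    for source in ASSAY_HIERARCHY:
        if source in calls_by_source:
            return source, calls_by_source[source]
    raise ValueError("no recognised data source for this compound")


# Hypothetical compound with conflicting human and animal results.
print(overall_call({"standard_animal": "sensitiser", "human": "non-sensitiser"}))
```

In this sketch the human result decides the overall call, which mirrors the ranking described above.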

A summary of the data can be seen in the table below.

Data source used to assign overall call | Number of chemicals | Proportion of overall dataset (%) | Number of sensitisers | Number of non-sensitisers | Prevalence of non-sensitisers (%)
Human data | | | | |
Standard Animal Assays | | | | |
Non-Standard Animal Assays | | | | |
Other Animal Assays | | | | |

Performance Metrics for Skin Sensitisation Negative Predictions[1] 

An external dataset of 986 compounds was used to measure the performance of the negative prediction functionality for skin sensitisation in Derek Nexus. The results from the external validation can be seen below in Figure 8.

Figure 8: The negative predictivity of each type of prediction during the external validation:



Misclassified and unclassified results are highlighted for expert review but are not necessarily an indication of activity.
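Negative predictivity is the proportion of negative predictions that are experimentally non-sensitisers. A minimal sketch of computing it separately for each type of prediction is given below; the function name, the labels and the records are hypothetical and are not the validation results.

```python
from collections import defaultdict


def negative_predictivity_by_type(records):
    """Compute negative predictivity for each type of negative prediction.

    records: iterable of (prediction_type, is_experimental_sensitiser) pairs,
    one per compound that received a negative prediction.
    """
    counts = defaultdict(lambda: [0, 0])  # type -> [true negatives, total]
    for prediction_type, is_sensitiser in records:
        counts[prediction_type][1] += 1
        if not is_sensitiser:
            counts[prediction_type][0] += 1
    return {t: negatives / total for t, (negatives, total) in counts.items()}


# Hypothetical records.
records = [
    ("non-sensitiser", False),
    ("non-sensitiser", False),
    ("non-sensitiser", False),
    ("non-sensitiser", True),
    ("misclassified features", False),
    ("misclassified features", True),
]
print(negative_predictivity_by_type(records))
```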

The prevalence of each type of prediction can be seen below in Figure 9.

Figure 9: The frequency of occurrence of each type of prediction during the external validation:


The following papers and resources relating to Lhasa's skin sensitisation work provide further detail.

  1. Chilton et al. (2018) 'Making reliable negative predictions of human skin sensitisation using an in silico fragmentation approach', Regulatory Toxicology and Pharmacology, Vol. 95, June, Pages 227-235
  2. Macmillan D. and Chilton M. L. (2019) 'A defined approach for predicting skin sensitisation hazard and potency based on the guided integration of in silico, in chemico and in vitro data using exclusion criteria', Regulatory Toxicology and Pharmacology, Vol. 101, February, Pages 35-47
  3. Canipa et al. (2017) 'A quantitative in silico model for predicting skin sensitization using a nearest neighbours approach within expert-derived structure–activity alert spaces', Journal of Applied Toxicology, Vol. 37, Issue 8, August, Pages 985-995


© 2023 Lhasa Limited | Registered office: Granary Wharf House, 2 Canal Wharf, Leeds, LS11 5PS, UK Tel: +44 (0)113 394 6020
VAT number 396 8737 77 | Lhasa Limited is registered as a charity (290866) | Company Registration Number 01765239 (England and Wales).