Lhasa Limited shared knowledge shared progress

EC3 Predictions for Skin Sensitisation

Derek Nexus contains expert-derived functionality to provide a quantitative EC3 prediction for skin sensitisation.

For further information regarding EC3 predictions and how this has been implemented within Derek Nexus, please take a look at a recorded webinar by Dr. Jeff Plante, highlighting the development of the EC3 prediction engine.

The Skin Sensitisation Brochure can be found here.  

Derek Nexus also contains expert-derived functionality to provide negative predictions for the skin sensitisation endpoint; find out more here.


General Approach

The prediction is built on a Nearest Neighbour Model, where the nearest neighbours are taken from a reference set of compounds that exclusively fire the same alert as the query compound. A similarity score is calculated for the nearest neighbours and an EC3 prediction is made.

The nearest neighbour compounds are selected from over 650 compounds in the Lhasa EC3 dataset; the EC3 values for these compounds have been taken from the literature and curated by Lhasa scientists. For compounds with multiple literature EC3 values, the median was taken to reduce the influence of outliers (Figures 1-3).

Figures 1-3: An example of the curation of EC3 values for the Lhasa EC3 dataset.
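
The median-based curation described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `curate_ec3`, the compound names, and the EC3 values are made up for the example and are not taken from the Lhasa EC3 dataset.

```python
from statistics import median
from typing import Dict, List


def curate_ec3(literature_values: Dict[str, List[float]]) -> Dict[str, float]:
    """Collapse multiple literature EC3 values per compound into one value.

    The median is used so that a single outlying measurement has little
    effect on the curated value. Compounds with no values are skipped.
    """
    return {
        name: median(values)
        for name, values in literature_values.items()
        if values
    }


# Hypothetical literature values (EC3, %), for illustration only.
example = {
    "compound A": [1.2, 1.5, 9.0],  # 9.0 is an outlier
    "compound B": [4.0, 6.0],
}
curated = curate_ec3(example)
```

With these made-up numbers, the outlying value of 9.0 for compound A does not shift the curated value, which stays at the middle measurement.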

A diagrammatic representation of the Nearest Neighbour Model can be seen below (Figure 4). This model involves a three-step process:

  1. Firstly, the query compound is processed in Derek to determine whether a skin sensitisation alert is fired.
  2. Secondly, those compounds from the Lhasa EC3 dataset which fire the same skin sensitisation alert as the query compound are identified. Because they fire the same alert, these compounds are believed to cause skin sensitisation through the same mechanism as the query compound.
  3. Thirdly, the compounds identified in step 2 are assessed using an in-house structural fingerprinting technique. They are then evaluated for their similarity to the query compound using the Tanimoto score. Up to 10 nearest neighbours are highlighted and are used to make the EC3 prediction, based on a weighted average. If fewer than 3 nearest neighbours are found, no prediction is made.


Figure 4: The steps taken in order to predict an EC3 value
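
The three-step process can be sketched as follows. This is a minimal sketch under stated assumptions, not the Derek Nexus implementation: the fingerprint is represented as a plain set of integer bits (the in-house fingerprint is not described here), the weights in the weighted average are assumed to be the Tanimoto scores themselves, and reference compounds with zero similarity are assumed not to count as neighbours. The names `ReferenceCompound`, `tanimoto`, and `predict_ec3` are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class ReferenceCompound:
    name: str
    alert: str                   # alert fired by this compound (hypothetical label)
    fingerprint: FrozenSet[int]  # stand-in for the in-house structural fingerprint
    ec3: float                   # curated EC3 value (%)


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto score: shared bits divided by all bits set in either fingerprint."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def predict_ec3(
    query_alert: Optional[str],
    query_fp: FrozenSet[int],
    dataset: List[ReferenceCompound],
    max_neighbours: int = 10,
    min_neighbours: int = 3,
) -> Optional[float]:
    # Step 1: no alert fired for the query, so no EC3 prediction is attempted.
    if query_alert is None:
        return None
    # Step 2: keep only reference compounds that fire the same alert.
    same_alert = [c for c in dataset if c.alert == query_alert]
    # Step 3: score by Tanimoto similarity and keep the closest neighbours.
    scored = [(tanimoto(query_fp, c.fingerprint), c) for c in same_alert]
    scored = [(s, c) for s, c in scored if s > 0.0]  # assumption: zero similarity is not a neighbour
    scored.sort(key=lambda pair: pair[0], reverse=True)
    neighbours = scored[:max_neighbours]
    if len(neighbours) < min_neighbours:
        return None  # fewer than 3 neighbours: no prediction is made
    # Assumption: similarity-weighted arithmetic mean of the neighbours' EC3 values.
    total_weight = sum(s for s, _ in neighbours)
    return sum(s * c.ec3 for s, c in neighbours) / total_weight
```

Returning `None` both when no alert fires and when too few neighbours are found keeps the "no prediction" outcome explicit, so a caller cannot mistake it for a numerical EC3 value.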

EC3 Prediction Results

Visual (Figure 5)

  • A clear, graphical representation of the EC3 prediction is provided, which shows the nearest neighbours, their Tanimoto similarity to the query compound, and their EC3 values.
    • Each nearest neighbour is clearly identified as either a sensitiser or a non-sensitiser.
  • There is an option to display the colour-coded ECETOC classification in addition to the numerical EC3 prediction. These classifications can assist in categorising compounds, and the colour coding helps with visualisation.
  • The structures of the nearest neighbours are shown, and selecting a compound brings up an information box that includes data sources and references.

Expert Fine-Tuning

  • The information that has gone into making the prediction is transparently shown and nearest neighbours can be added to or removed from the prediction based on expert assessment.
  • Users can supplement the Lhasa EC3 dataset with their own data to increase the chemical space covered. However, the compounds added must fire a skin sensitisation alert in Derek, otherwise they cannot be used as a nearest neighbour.

Figure 5: The Derek EC3 prediction for 3,5-diaminophenol


Performance Metrics

Lhasa scientists have assessed the performance of the model in predicting EC3 values for an external test set of compounds (Figure 6). The model has been designed not to under-predict potency, as an under-prediction may bring about exposure to a chemical that is a sensitiser. The model places compounds in the correct ECETOC category, or over-predicts their potency, 79% of the time; for the GHS category, the figure is 89%.

Figure 6: Performance Data for Derek EC3 Predictions against an external dataset
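
One way to compute a "correct or over-predicted" rate of this kind is sketched below, using the ECETOC potency categories. This is an assumption about how such a metric could be calculated, not a description of how Lhasa produced the figures above: over-prediction is interpreted here as assigning a more potent category than the observed one, and the function name `fraction_not_under_predicted` and the example labels are hypothetical.

```python
from typing import Sequence

# Categories ordered from least to most potent.
POTENCY_ORDER = ["Weak", "Moderate", "Strong", "Extreme"]


def fraction_not_under_predicted(
    predicted: Sequence[str], observed: Sequence[str]
) -> float:
    """Fraction of compounds whose predicted category is at least as potent
    as the observed category (correct or over-predicted)."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    rank = {category: i for i, category in enumerate(POTENCY_ORDER)}
    acceptable = sum(rank[p] >= rank[o] for p, o in zip(predicted, observed))
    return acceptable / len(predicted)


# Hypothetical labels, for illustration only.
predicted = ["Strong", "Moderate", "Weak", "Extreme"]
observed = ["Strong", "Weak", "Moderate", "Strong"]
rate = fraction_not_under_predicted(predicted, observed)
```

In this made-up example only the third compound is under-predicted (predicted Weak, observed Moderate), so three of the four predictions count as acceptable.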

The ECETOC (European Centre for Ecotoxicology and Toxicology of Chemicals) classifications are split into four different categories depending on the numerical value (Figure 7).

EC3 Value (%)      <0.1      ≥0.1 to <1   ≥1 to <10   ≥10 to <100
Potency Category   Extreme   Strong       Moderate    Weak

Figure 7: ECETOC Classifications
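
The thresholds in Figure 7 translate directly into a small lookup function. This is a sketch for illustration; the function name `ecetoc_category` is hypothetical, and returning `None` for values of 100 or more is an assumption, since Figure 7 does not define a category for that range.

```python
from typing import Optional


def ecetoc_category(ec3: float) -> Optional[str]:
    """Map an EC3 value (%) to the ECETOC potency category from Figure 7."""
    if ec3 < 0:
        raise ValueError("EC3 must be non-negative")
    if ec3 < 0.1:
        return "Extreme"
    if ec3 < 1:
        return "Strong"
    if ec3 < 10:
        return "Moderate"
    if ec3 < 100:
        return "Weak"
    return None  # assumption: no category is defined for EC3 >= 100
```

Because each boundary is inclusive on its lower side, a value that sits exactly on a threshold (for example 1) falls into the less potent of the two adjacent categories.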

The GHS (Globally Harmonized System of Classification and Labelling of Chemicals) classification has two subcategories: 1A and 1B. If an EC3 value is less than or equal to 2%, it is classified as 1A; if it is greater than 2%, it is classified as 1B.
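
The GHS rule above reduces to a single comparison, sketched here with a hypothetical function name `ghs_subcategory`. The sketch assumes the compound has already been identified as a sensitiser and only decides between the two subcategories.

```python
def ghs_subcategory(ec3_percent: float) -> str:
    """Return the GHS skin sensitisation subcategory for an EC3 value (%)."""
    if ec3_percent < 0:
        raise ValueError("EC3 must be non-negative")
    # EC3 of 2% or less is 1A; anything above 2% is 1B.
    return "1A" if ec3_percent <= 2.0 else "1B"
```

Note that the boundary value of exactly 2% falls into subcategory 1A.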

Features and Benefits

  • Increases Understanding of Toxicity: The prediction of skin sensitisation potency allows experts to more fully understand the risk a particular compound poses.
  • Clear, Visual Results: The graphical representation ensures that the expert assessment of potential sensitisers is quick and easy.
  • Transparency: Derek Nexus provides the EC3 values and structures of nearest neighbours, facilitating a thorough review by experts.
  • Facilitates Expert Review: Experts can fine-tune the EC3 predictions by adding or removing compounds from the calculation based on their expert knowledge.



© 2017 Lhasa Limited | Registered office: Granary Wharf House, 2 Canal Wharf, Leeds, LS11 5PS, UK Tel: +44 (0)113 394 6020
VAT number 396 8737 77 | Lhasa Limited is registered as a charity (290866)| Company Registration Number 01765239 (England and Wales).
