Lhasa Limited shared knowledge shared progress

Human review of in silico predictions of toxicity

Read time: 10 minutes


Human Review of In Silico Predictions of Toxicity: Some Key Questions for Consideration When Making An Assessment

The speed with which models for toxicity prediction can be built, and the accuracy of their predictions, have both improved greatly in recent years as a result of advances in technology, mechanistic understanding and access to data. However, there is no such thing as a perfect model. Lessons remain to be learned in model building, and some toxicities remain harder to predict than others, whether because of a lack of data or the complexity of the mechanism. Predictive models remain only as good as the data and understanding used to build them.

Consequently, in silico predictions are an extremely useful tool in a risk assessor’s inventory, provided they are used appropriately. The user should understand any shortcomings when reaching conclusions and consider the transparency of predictions, especially if a model is to play a pivotal role, such as in regulatory decision making. The rationale and data used to build a prediction and support a result, together with the known limitations of the predictive system, are all important pieces of information that should be presented to the user to enable an informed decision on whether to accept or refute any prediction offered.

Common limitations of model building approaches can often explain incorrect predictions made by in silico systems, provided the user is aware of them and can probe the prediction adequately. For example, statistical approaches to modelling structure-activity relationships generate relationships between structure and activity automatically and, as a result, can sometimes mistakenly attribute associations where there is correlation but no causation. In contrast, structure-activity relationships that are built manually, using a human expert approach, are unlikely to suffer from the same limitation. However, these systems may be more prone to overlooking new or limited data, as encoding knowledge in this way can be time consuming and it may prove difficult to include outliers within predictions.

The strengths and limitations of different modelling approaches for Ames mutagenicity have been recognised in the inclusion of in silico predictions in the current implementation of the ICH M7 guidance for genotoxic impurity assessment [1]. Two systems built by different methods are required to satisfy the guideline, as it is reasoned that the strengths of one will make up for the limitations of the other. However, questions can arise from using this approach:

  • What should I do when the two predictions disagree?
  • Which should be believed?

This is where knowledge of the general limitations of the predictive approaches and transparency in predictions become very useful. The ICH M7 guidance [1] acknowledges this predicament, stating that ‘the outcome of any computer system-based analysis can be reviewed with the use of expert knowledge in order to provide additional supportive evidence on relevance of any positive, negative, conflicting or inconclusive prediction and provide a rationale to support the final conclusion’. In addition, a number of publications outline common limitations of predictive systems in this area, define how expert review should be carried out and show how it improves predictivity [1,2,3,4,5,6,7,8,9,10].
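As a minimal sketch of how a pair of complementary predictions might be sorted before expert review, the Python below labels the pair as concordant, conflicting or inconclusive. The `Call` enum, the `triage` function and the category labels are hypothetical names chosen for illustration; they are not taken from ICH M7 or from any particular software, and the labelling does not replace the review itself.

```python
from enum import Enum


class Call(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    INCONCLUSIVE = "inconclusive"  # assumed to cover out-of-domain or equivocal results


def triage(expert_rule_based: Call, statistical: Call) -> str:
    """Label a pair of predictions so the reviewer knows what kind of case it is.

    Illustrative only: every category can still be examined by an expert,
    since the guidance allows review of any positive, negative, conflicting
    or inconclusive prediction.
    """
    calls = {expert_rule_based, statistical}
    if Call.INCONCLUSIVE in calls:
        # At least one system could not reach a clear call.
        return "inconclusive"
    if len(calls) == 2:
        # One positive and one negative call.
        return "conflicting"
    # Both systems made the same clear call.
    return f"concordant {expert_rule_based.value}"
```

For example, `triage(Call.POSITIVE, Call.NEGATIVE)` returns `"conflicting"`, which is the case the two questions above are concerned with.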

The table below highlights some key questions that should be asked during expert review, with notes drawn from relevant literature sources.


Review Question / Notes


Is there enough information supplied with the prediction to carry out a review?

It is hard to define a minimum requirement, but knowledge of the training set, the method of prediction and the toxicophores identified and assessed are all useful. [6,8,10]

Is the assay used to build the model appropriate to measure the hazard caused by this compound?

Assays have their own applicability domains. The human hazard caused by certain chemicals may not be well predicted by an assay; examples include measuring the mutagenicity of acid halides in the Ames test or assessing prohaptens in the direct protein reactivity assay. If the assay cannot adequately assess the hazard, the model produced from these data will not be accurate. [6,7,8]

Does the model have enough knowledge of similar compounds to the query in order to make a prediction?

This should be covered by the applicability domain of the model but may also be reflected in a confidence metric supplied with a prediction or human assessment of training set compounds. Evidence from outside of the model may help draw a conclusion. [8,9,11]

Does the evidence used to build the prediction contain additional toxicophores not present in the query compound? (For positive predictions)

This inadequacy is more common in statistical, machine-learned approaches to modelling. Answering the question requires knowledge of other functional groups that may cause toxicity. [6,8,9]
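The last question above can be illustrated with a small sketch. It assumes that each training compound supporting a positive prediction has already been annotated with the toxicophores it contains; the annotation scheme, the compound identifiers and the function name `confounding_toxicophores` are all hypothetical. The Python below lists the supporting compounds that carry a toxicophore absent from the query, since that extra feature may be the real cause of their activity.

```python
def confounding_toxicophores(query_alerts, supporting_compounds):
    """Return {compound_id: extra_alerts} for supporting compounds that
    contain toxicophores not present in the query compound.

    query_alerts: set of toxicophore names found in the query.
    supporting_compounds: dict mapping a compound id to the set of
        toxicophore names found in that (positive) training compound.
    """
    flagged = {}
    for compound_id, alerts in supporting_compounds.items():
        extra = set(alerts) - set(query_alerts)
        if extra:
            flagged[compound_id] = extra
    return flagged


# Hypothetical example data, for illustration only.
query = {"aromatic amine"}
support = {
    "training-001": {"aromatic amine"},
    "training-002": {"aromatic amine", "aromatic nitro"},
    "training-003": {"epoxide"},
}
flagged = confounding_toxicophores(query, support)
```

Here `flagged` contains `training-002` and `training-003` but not `training-001`, so a reviewer would check whether the positive call still stands once those two compounds are discounted.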


Want to find out more? Download the slides or listen to the recording of the webinar: ‘Aiding And Standardising Expert Review Under ICH M7’.


[1] ICH M7 Guideline for Genotoxic Impurity Assessment.

[2] Dobo et al. (2012) ‘In silico methods combined with expert knowledge rule out mutagenic potential of pharmaceutical impurities: An industry survey’, Regulatory Toxicology and Pharmacology, 62(3), 449-455. http://dx.doi.org/10.1016/j.yrtph.2012.01.007

[3] Kruhlak et al. (2012) ‘(Q)SAR modeling and safety assessment in regulatory review’, Clinical Pharmacology & Therapeutics, Regulatory Science, 91(3), 529-534. https://doi.org/10.1038/clpt.2011.300  

[4] Naven R. T., Greene N., Williams R. V. (2012) ‘Latest advances in computational genotoxicity prediction’, Expert Opinion on Drug Metabolism & Toxicology, 8(12), 1579-1587. https://doi.org/10.1517/17425255.2012.724059

[5] Sutter et al. (2013) ‘Use of in silico systems and expert knowledge for structure-based assessment of potentially mutagenic impurities’, Regulatory Toxicology and Pharmacology, 67(1), 39-52.  https://doi.org/10.1016/j.yrtph.2013.05.001

[6] Powley M. W. (2015) ‘(Q)SAR assessments of potentially mutagenic impurities: A regulatory perspective on the utility of expert knowledge and data submission’, Regulatory Toxicology and Pharmacology, 71(2), 295-300. https://doi.org/10.1016/j.yrtph.2014.12.012

[7] Greene et al. (2015) ‘A practical application of two in silico systems for identification of potentially mutagenic impurities’, Regulatory Toxicology and Pharmacology, 72(2), 335-349. https://doi.org/10.1016/j.yrtph.2015.05.008

[8] Barber et al. (2015) ‘Establishing best practise in the application of expert review of mutagenicity under ICH M7’, Regulatory Toxicology and Pharmacology, 73(1), 367-377. https://doi.org/10.1016/j.yrtph.2015.07.018

[9] Amberg et al. (2016) ‘Principles and procedures for implementation of ICH M7 recommended (Q)SAR analyses’, Regulatory Toxicology and Pharmacology, 77, 13-24. https://doi.org/10.1016/j.yrtph.2016.02.004

[10] Myatt et al. (2018) ‘In silico toxicology protocols’, Regulatory Toxicology and Pharmacology, 96, 1-17. https://doi.org/10.1016/j.yrtph.2018.04.014

[11] Amberg et al. (2019) ‘Principles and procedures for handling out-of-domain and indeterminate results as part of ICH M7 recommended (Q)SAR analyses’, Regulatory Toxicology and Pharmacology, 102, 53-64. https://doi.org/10.1016/j.yrtph.2018.12.007


About this article

Edited by:

  • Pearl Saville



© 2020 Lhasa Limited | Registered office: Granary Wharf House, 2 Canal Wharf, Leeds, LS11 5PS, UK Tel: +44 (0)113 394 6020
VAT number 396 8737 77 | Lhasa Limited is registered as a charity (290866)| Company Registration Number 01765239 (England and Wales).
