Lhasa Limited shared knowledge shared progress

(Q)SAR Methodology, Confidence and Interpretation of Predictions

Sarah Nexus uses a unique, highly transparent machine-learning methodology to build a statistical model for Ames mutagenicity. The learning algorithm has been developed using a novel fragmentation methodology that has been specifically optimised for reactive, fragment-driven endpoints. In addition to a prediction for Ames mutagenicity, it provides further information to support subsequent expert analysis in accordance with the proposed guidelines.


The unique hierarchical model employed by Sarah Nexus not only retrieves matching fragments but also refines these results by considering how similar the supporting training structures are to your query structure. This methodology retains those fragments that are perceived to be of greater value. Fragments may be of various sizes and can even overlap, which helps the model capture the most relevant structural context for a prediction.

The structural explanation for the prediction provided by Sarah Nexus is conveyed by highlighting those fragment(s) that the model considers meaningful.

Figures 1-3 give a step-by-step guide to the fragmentation process.


Scientific innovation

Advantages of our Methodology

The advantages of this methodology include:

  • The ability to generate fragments that are contained within the training set molecules, thereby avoiding the bias of models built using pre-determined fragments which may not reflect the training data.
  • The ability to build a hierarchy of models, some more global and some more local, giving users the best of both worlds.
    • A single global model, while having broad coverage, will not be adequately sensitive to local variations (activity cliffs).
    • Local models, whilst more accurate for fragments that fall inside their chemical space, will be narrower in their scope (applicability domain).
    • Sarah Nexus contains both and will select the most appropriate model for each fragment.
  • Sarah Nexus looks at the information available for each fragment and uses scientifically valid rules to combine it. The relative importance of each local model's contribution is provided, along with the data that underlies it, making the prediction very transparent. Furthermore, Sarah Nexus gives a measure of confidence for each prediction it makes. We believe that this combination gives your expert the information needed to understand and judge the prediction.


Sarah Nexus provides a confidence score and a structural explanation for each prediction along with direct access to supporting data to aid expert analysis.

The confidence score is based on each fragment’s contribution to the prediction and the weight placed on that fragment in the overall prediction. For each fragment, the user can readily drill down to see the individual compounds that inform it and how they contribute to the prediction for that fragment (figure 5 on the 'Prediction' tab). Sarah Nexus displays its confidence alongside each prediction. Our analysis shows that confidence correlates strongly with accuracy.
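As a rough illustration of how per-fragment contributions and weights could be combined into an overall call and confidence, the sketch below uses a simple weighted average. It is not the Sarah Nexus algorithm, whose scoring rules are not described here; the class `FragmentHypothesis`, the function `combine`, the signed `signal` scale and the example fragment labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FragmentHypothesis:
    """Hypothetical record for one matched fragment (not a Sarah Nexus type)."""
    name: str      # illustrative fragment label
    signal: float  # assumed scale: -1.0 (inactive) to +1.0 (active)
    weight: float  # assumed relative importance in the overall prediction, >= 0


def combine(hypotheses):
    """Return (call, confidence) from a weighted average of fragment signals."""
    total_weight = sum(h.weight for h in hypotheses)
    if total_weight == 0:
        # No weighted evidence at all, so no call can be made.
        return "no prediction", 0.0
    score = sum(h.signal * h.weight for h in hypotheses) / total_weight
    if score > 0:
        call = "positive"
    elif score < 0:
        call = "negative"
    else:
        call = "equivocal"
    # The magnitude of the averaged signal stands in for confidence.
    return call, abs(score)


example = [
    FragmentHypothesis("fragment A", signal=0.8, weight=3.0),
    FragmentHypothesis("fragment B", signal=-0.2, weight=1.0),
]
call, confidence = combine(example)
```

In this toy example the strongly weighted fragment A dominates, so the overall call is positive with a confidence of about 0.55.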

For further supporting information, Sarah Nexus can be used in conjunction with Vitic Nexus, where experimental details such as strains and references provide additional evidence for a prediction. This additional information can be used to support a regulatory submission.

As with any in silico prediction, it is advised that an expert should review the prediction and make a judgement in the event that a positive prediction is made. This judgement can only be made if sufficient data and supporting information are provided. Sarah Nexus provides complete transparency and significant amounts of detail behind each prediction.

Example of how confidence can be used in Sarah Nexus: An equivocal prediction is one where there is no strong evidence either for or against activity. The user can set the threshold for this and so prevent a positive or negative call from being made when there is no strong overall signal for activity or inactivity.
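The user-set equivocal threshold described above can be pictured with a minimal sketch. It assumes the overall evidence is summarised as a signed score between -1 (inactive) and +1 (active); that scale and the function name `call_with_threshold` are illustrative assumptions, not part of Sarah Nexus.

```python
def call_with_threshold(score: float, threshold: float) -> str:
    """Classify a signed score, returning 'equivocal' when the signal is weak.

    score:     assumed overall signal, -1.0 (inactive) to +1.0 (active)
    threshold: user-chosen minimum signal strength, 0.0 to 1.0
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    if abs(score) <= threshold:
        # Not enough evidence either way, so decline to make a call.
        return "equivocal"
    return "positive" if score > 0 else "negative"
```

Raising the threshold trades coverage for reliability: fewer compounds receive a positive or negative call, but those calls rest on a stronger signal.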

Interpretation of a Prediction

Expert review is very important when interpreting a prediction, which is why we have worked hard to ensure that Sarah Nexus provides sufficient information to support an expert analysis.

This includes a measure of confidence (expected accuracy) for every prediction, together with transparent predictions that give an expert enough information to review. This is a different approach from ‘black box’ statistical models, where the user can understand the model’s accuracy against a test set but cannot assess an individual prediction.

The self-organising hierarchical network of the model allows sub-fragments and super-fragments to be stored. Only the largest model fragment is used in the prediction. This allows the most relevant context of the fragment to be considered by effectively looking at the most ‘local’ model (hypothesis), which is composed of the most similar training compounds.
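The rule that only the largest matching fragment is used can be sketched as follows. Here each matched fragment is represented as a set of atom indices in the query structure, and a fragment is dropped when it sits entirely inside a larger matched fragment. This representation and the function name `keep_largest` are assumptions for illustration; the actual network structure in Sarah Nexus is not described here.

```python
def keep_largest(matched):
    """Drop any fragment that is a proper subset of another matched fragment.

    matched: list of frozensets of query atom indices (hypothetical encoding).
    """
    return [
        fragment
        for fragment in matched
        if not any(fragment < other for other in matched)  # '<' is proper subset
    ]


matched_fragments = [
    frozenset({1, 2}),        # sub-fragment of the next entry
    frozenset({1, 2, 3}),     # super-fragment, the most 'local' context
    frozenset({3, 4}),        # overlaps but is not nested, so it is kept
]
selected = keep_largest(matched_fragments)
```

Fragments that merely overlap are both retained; only fully nested sub-fragments give way to their super-fragment.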

The structural explanation for the prediction provided by Sarah Nexus is conveyed by highlighting those fragment(s) that Sarah considers meaningful. Derek Nexus also highlights fragments of the query compound, in its case to illustrate the matches to the patterns used to hold knowledge within Derek.

Both models highlight parts of the query compound for the same purpose: to draw the user’s attention to the parts of the structure that influenced the prediction. How these parts are identified, however, is very different. Derek relies on patterns drawn by experts, whereas Sarah automatically identifies those with statistical significance.


Sarah Nexus, like Derek Nexus, presents all the information an expert requires in order to come to an informed decision.


© 2017 Lhasa Limited | Registered office: Granary Wharf House, 2 Canal Wharf, Leeds, LS11 5PS, UK Tel: +44 (0)113 394 6020
VAT number 396 8737 77 | Lhasa Limited is registered as a charity (290866)| Company Registration Number 01765239 (England and Wales).
