Sarah Nexus is a statistical software tool that gives you accurate mutagenicity predictions quickly.
Statistical-based software for the prediction of mutagenicity
The ICH M7 guideline [1] proposes that a computational toxicology assessment should be performed using two complementary (Q)SAR methodologies that predict the outcome of a bacterial mutagenicity assay. Specifically, one methodology should be expert rule-based and the second methodology should be statistical-based.
(Q)SAR models utilising these prediction methodologies should also follow the validation principles set forth by the Organisation for Economic Co-operation and Development (OECD) [2].
Sarah Nexus and Derek Nexus (the Lhasa Limited expert toxicity prediction tool), in combination, can provide you with the means to meet the computational toxicological assessment requirements of the ICH M7 guidelines from one intuitive interface.
You can assess your potential genotoxic impurities quickly and easily and submit those results to regulators, reducing the need for time-consuming and expensive in vitro tests.
Both Derek Nexus and Sarah Nexus have been designed independently to meet the OECD validation principles, and both systems can be run from within the same Nexus interface to help simplify your workflow.
The models provide completely independent predictions, with the option to consolidate into a single report.
Sarah Nexus uses a unique, hierarchical, machine-learning methodology to build a model for Ames mutagenicity.
Query structures which are imported into Sarah are standardised and then fragmented. These fragments are reviewed for activity versus inactivity. Sarah further refines the fragments by considering the similarity of the query structure to a training set of compounds.
The structure standardisation in Nexus 2.2 uses a set of transform rules including, but not limited to, aromaticity perception, transforming tautomers and resonance forms, and removing salts. The aim of the standardisation is to interpret structures more accurately, in order to optimise predictions and minimise false signal strengths.
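To make one of these transforms concrete, here is a minimal sketch of salt removal on a SMILES string. It is not the Nexus rule set: the function name `strip_salts` and the "keep the largest component" heuristic are assumptions made for illustration, and a real standardiser would also handle aromaticity, tautomers, resonance forms and charges.

```python
def strip_salts(smiles: str) -> str:
    """Keep the largest dot-separated component of a SMILES string.

    Illustrative assumption: component size is approximated by counting
    letters, which is only a crude stand-in for a proper atom count.
    """
    components = [part for part in smiles.split(".") if part]
    if not components:
        return smiles
    # The counter-ion is usually the smaller component, so keep the largest one.
    return max(components, key=lambda part: sum(ch.isalpha() for ch in part))


# Sodium acetate: the sodium counter-ion is dropped, the acetate is kept.
print(strip_salts("CC(=O)[O-].[Na+]"))
```

Note that this toy version leaves the acetate charged. Neutralising the parent structure would be a separate transform.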
The fragments are arranged into a network of hypotheses (or nodes) and the fragments which are perceived to be of a greater value contribute to an overall prediction of toxicity. Fragments may be of various sizes and can even overlap, ensuring greater accuracy in predictions. Figures 1-3 highlight a step-by-step guide to the fragmentation process.
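The idea of reviewing fragments for activity versus inactivity can be sketched as a simple tally over a training set. This is an illustration only, not Sarah's model-building algorithm: the training compounds, the fragment labels and the function names are all hypothetical, and the sketch ignores the hierarchy of hypotheses and the similarity refinement described above.

```python
from collections import defaultdict

# Hypothetical training set: (fragment labels found in a compound, Ames result).
TRAINING_SET = [
    ({"aromatic nitro", "benzene ring"}, True),
    ({"aromatic nitro", "alkyl chain"}, True),
    ({"benzene ring", "alkyl chain"}, False),
    ({"benzene ring", "carboxylic acid"}, False),
]


def fragment_activity(training_set):
    """Return {fragment: (active_count, inactive_count)} over the training set."""
    counts = defaultdict(lambda: [0, 0])
    for fragments, is_active in training_set:
        for fragment in fragments:
            counts[fragment][0 if is_active else 1] += 1
    return {fragment: tuple(pair) for fragment, pair in counts.items()}


def active_fraction(counts, fragment):
    """Share of training compounds containing the fragment that were active."""
    active, inactive = counts[fragment]
    return active / (active + inactive)


counts = fragment_activity(TRAINING_SET)
for fragment in sorted(counts):
    print(fragment, counts[fragment], active_fraction(counts, fragment))
```

In this toy data, a fragment seen only in active compounds stands out from one that appears in both classes, which is the kind of signal a statistical model can build hypotheses on.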
The overall prediction comprises the following items:
This high level of transparent information facilitates the expert review process.
The advantages of this methodology include:
Sarah Nexus provides a confidence score for each prediction along with direct access to supporting data to aid expert analysis. The confidence score is based on each fragment’s contribution to the overall prediction and the weight placed upon each fragment. Lhasa’s analysis shows that confidence correlates strongly with accuracy (figure 4).
For more information on confidence in Sarah Nexus, please view Lhasa’s video on Model Building and Interpreting Confidence, presented by Account Manager Dr Dave Yeo.
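As a rough illustration of how per-fragment contributions and weights could combine into a single score, the sketch below uses a weighted average. The formula, the value ranges and the name `weighted_confidence` are assumptions for illustration; the actual Sarah Nexus calculation is not described here.

```python
def weighted_confidence(fragments):
    """Combine (contribution, weight) pairs into a call and a confidence.

    Assumed conventions: contribution lies in [-1, 1], where positive values
    support a mutagenic call and negative values a non-mutagenic call;
    weight is a non-negative number.
    """
    total_weight = sum(weight for _, weight in fragments)
    if total_weight == 0:
        raise ValueError("at least one fragment needs a non-zero weight")
    score = sum(contribution * weight for contribution, weight in fragments) / total_weight
    if score > 0:
        call = "positive"
    elif score < 0:
        call = "negative"
    else:
        call = "equivocal"
    # The magnitude of the weighted score is used as the confidence.
    return call, abs(score)


# One heavily weighted fragment supporting activity, one lighter fragment opposing it.
print(weighted_confidence([(1.0, 3.0), (-1.0, 1.0)]))
```

Under these assumptions, a heavily weighted fragment dominates the call, while opposing evidence lowers the confidence rather than being hidden.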
When considering the interpretation of a prediction, expert review is very important. This is why Lhasa has worked hard to ensure Sarah Nexus facilitates this by providing sufficient information to support an expert analysis.
Sarah Nexus takes a different approach from ‘black box’ statistical models, which let the user understand the model’s accuracy against a test set but do not allow an individual prediction to be assessed.
The structural explanation for the prediction provided by Sarah Nexus is conveyed by highlighting those fragment(s) that Sarah considers meaningful. Derek Nexus also highlights fragments of the query compound in order to illustrate the matches to patterns used to hold knowledge within Derek.
The reason that both models highlight structural fragments is the same: to draw the user’s attention to parts of the query compound which influenced the prediction. However, Derek relies on patterns drawn by experts to find these fragments, whereas Sarah identifies those with statistical significance.
Sarah Nexus, like Derek Nexus, presents all the information an expert requires in order to come to an informed decision. Sarah incorporates additional data such as strain information, similar compounds which were not included in the model, and references. This additional information is available for single predictions, batch predictions, and batch validations.
There is detailed strain information available for the compounds in the Sarah Nexus training set. Including strain data in Sarah Nexus facilitates decision making and helps to reduce uncertainty.
The strain data is displayed alongside Sarah predictions, without changing the prediction, in order to provide context for the expert (figure 5). The strain information available for each example compound is displayed, showing whether the data is positive, negative or equivocal/conflicted. This data is then used to create a strain profile for the hypothesis, using a heat map to show which strains have the highest contribution. Strain data is also available for compounds in the additional information tab.
This strain data can be particularly important in the following cases:
Figure 5: An example of the strain information which is shown in Sarah Nexus. A strain profile for the hypothesis is shown above, based on the strain profiles of the example compounds.
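The aggregation from per-compound strain results to a strain profile for a hypothesis could look something like the sketch below. The example data, the outcome labels and the "fraction positive" measure are assumptions for illustration. The source does not state how the heat map values are calculated.

```python
from collections import Counter, defaultdict

# Hypothetical strain results for the example compounds behind one hypothesis.
EXAMPLES = [
    {"TA98": "positive", "TA100": "positive"},
    {"TA98": "positive", "TA100": "negative"},
    {"TA98": "equivocal", "TA1535": "negative"},
]


def strain_profile(examples):
    """Return {strain: fraction of tested example compounds that were positive}."""
    tallies = defaultdict(Counter)
    for results in examples:
        for strain, outcome in results.items():
            tallies[strain][outcome] += 1
    return {
        strain: counts["positive"] / sum(counts.values())
        for strain, counts in tallies.items()
    }


profile = strain_profile(EXAMPLES)
# Strains with the highest fraction would be the "hottest" cells of a heat map.
for strain, fraction in sorted(profile.items(), key=lambda item: -item[1]):
    print(strain, round(fraction, 2))
```

Equivocal results count as tested but not positive here. Treating them differently would be an equally valid design choice.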
The additional information panel contains two types of compounds:
A compound may be rejected from a model for a variety of reasons, including having data which is not reliable enough to be used in model building, or having conflicting results for the same test protocol from different sources. Lhasa has included this additional information to provide the expert with as much transparent information as possible to facilitate the decision-making process.
The Example compound panel enables you to see contribution information for training set example compounds and additional information compounds.
The information that you can see for each compound includes: