Making reliable negative predictions of human skin sensitisation using an in silico fragmentation approach
A previously published fragmentation method for making reliable negative in silico predictions has been applied to the problem of predicting skin sensitisation in humans, using a dataset of over 2750 chemicals with publicly available skin sensitisation data from 18 in vivo assays. An assay hierarchy was designed so that chemicals in this dataset could be classified as either sensitisers or non-sensitisers when data from more than one in vivo test were available. The negative prediction approach was validated internally, using 5-fold cross-validation, and externally, against a proprietary dataset of approximately 1000 chemicals with in vivo reference data shared by members of the pharmaceutical, nutritional, and personal care industries. The negative predictivity for this proprietary dataset was high in all cases (>75%), and the model was also able to identify structural features associated with lower accuracy or higher uncertainty in the negative prediction, termed misclassified and unclassified features, respectively. These features could serve as an aid for further expert assessment of the negative in silico prediction.
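As a rough illustration of two ideas mentioned above, the following minimal Python sketch resolves conflicting in vivo results through an assay hierarchy and computes negative predictivity as TN / (TN + FN), the usual definition of that metric. The assay names, their ranking, and the function names are hypothetical placeholders; the actual 18-assay hierarchy and the fragmentation model itself are not reproduced here.

```python
from typing import Dict, List, Optional

# Hypothetical ranking (assumption): a lower number means a higher-priority assay.
# The real assay names and their order are not stated in this text.
ASSAY_RANK: Dict[str, int] = {"assay_A": 0, "assay_B": 1, "assay_C": 2}


def resolve_call(results: Dict[str, bool]) -> Optional[bool]:
    """Return the call (True = sensitiser) from the highest-ranked assay.

    Returns None when no result comes from a ranked assay.
    """
    ranked = [assay for assay in results if assay in ASSAY_RANK]
    if not ranked:
        return None
    best = min(ranked, key=lambda assay: ASSAY_RANK[assay])
    return results[best]


def negative_predictivity(observed: List[bool], predicted: List[bool]) -> Optional[float]:
    """Fraction of negative predictions that are true non-sensitisers.

    Computed as TN / (TN + FN); returns None if nothing was predicted negative.
    """
    pairs = list(zip(observed, predicted))
    tn = sum(1 for obs, pred in pairs if not pred and not obs)
    fn = sum(1 for obs, pred in pairs if not pred and obs)
    if tn + fn == 0:
        return None
    return tn / (tn + fn)
```

Under these assumptions, a chemical tested positive in `assay_B` but negative in the higher-ranked `assay_A` would be labelled a non-sensitiser, and negative predictivity would be evaluated only over the chemicals the model predicts as negative.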