Machine Learning Predicts Degree of Aromaticity from Structural Fingerprints
Predicting whether a compound is “aromatic” is, at first glance, a relatively simple task: does it obey Hückel’s rule (a planar, cyclic π-system with 4n + 2 π-electrons) or not? However, aromaticity is far from a binary property, and systems that obey Hückel’s rule, and are thus classified as aromatic, differ distinctly in their chemical and biological behavior. To capture these differences, the aromaticity of each molecule in a large public dataset was quantified by an extension of the work of Raczynska et al. Building on these data, a machine learning method is proposed for predicting the degree of aromaticity of each aromatic ring in a molecule. Categories are derived from the numeric results, allowing structural patterns to be differentiated between them and thus giving a better representation of the underlying chemical and biological behavior in expert and (Q)SAR systems.
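To make the contrast between the binary rule and a graded measure concrete, the following is a minimal Python sketch, not the method used in this work. The first function checks only the 4n + 2 electron count of Hückel's rule and assumes planarity and cyclic conjugation have been established elsewhere. The second bins a numeric aromaticity score into categories. The function names, the score scale, the cut-off values, and the category labels are all hypothetical and chosen purely for illustration; the actual category boundaries are not stated here.

```python
from bisect import bisect_right

# Hypothetical cut-offs and labels, for illustration only.
# They are NOT the category boundaries derived in this work.
THRESHOLDS = (0.5, 0.8)
LABELS = ("weakly aromatic", "moderately aromatic", "strongly aromatic")


def satisfies_huckel_count(pi_electrons: int) -> bool:
    """Return True if the pi-electron count has the form 4n + 2 with n >= 0.

    Only the electron count is tested; planarity and cyclic conjugation
    must be checked separately.
    """
    return pi_electrons >= 2 and (pi_electrons - 2) % 4 == 0


def aromaticity_category(score: float) -> str:
    """Map a numeric aromaticity score to a category label.

    A score equal to a cut-off falls into the higher category.
    """
    return LABELS[bisect_right(THRESHOLDS, score)]


if __name__ == "__main__":
    # A ring with 6 pi electrons passes the count; one with 4 does not.
    print(satisfies_huckel_count(6), satisfies_huckel_count(4))
    # Rings that both pass the count can still land in different categories.
    print(aromaticity_category(0.9), "/", aromaticity_category(0.3))
```

The point of the sketch is that two rings can both pass the binary count check yet fall into different categories once a numeric score is available, which is the distinction the categories are meant to capture.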