Which Datasets Lead to Predictive Models?

I came across a recent paper from the Tropsha group that discusses the issue of modelability – that is, can a dataset (represented as a set of computed descriptors and an experimental endpoint) be reliably modeled. Obviously the definition of reliable is key here and the authors focus on a cross-validated classification accuracy as the […]

SALI in Bulk

Sometime back John Van Drie and I had developed the Structure Activity Landscape Index (SALI), which is a way to quantify activity cliffs – pairs of compounds which are structurally very similar but have significantly different activities. In preparation for a talk on SALI at the Boston ACS, I was looking for SAR datasets that […]

From Theory to Practice

Some time back, John Van Drie and myself had done some work on characterizing structure-activity cliffs, which are molecules that have very similar structures but very different activities. The term originated from Maggiora, who suggested that this was a reason for the failure of many QSAR models. At the same time, such cliffs can represent […]

SALI Viewer Now on GitHub

Last year, John Van Drie and I published two papers (here and here) on the Structure Activity Landscape Index, (SALI) which is a way to view SAR data as a network of compounds. Along with the paper ,I put up a simple Java application (licensed under the LGPL) to generate and explore these networks. – […]

So much to do, so little time

Trying to squeeze sense out of chemical data

Which Datasets Lead to Predictive Models?

SALI in Bulk

From Theory to Practice

SALI Viewer Now on GitHub