A few days ago I had made available a SMARTS version of the PAINS substructural filters, that were converted using CACTVS from the original SLN patterns. I had mentioned that the SMARTSViewer application was a handy way to visualize the complex SMARTS patterns. Matthias Rarey let me know that his student had converted all the […]
Similarity Matrices in Parallel
Today I got an email asking whether it’d be possible to speed up a fingerprint similarity matrix calculation in R. Now, pairwise similarity matrix calculations (whether they’re for molecules or sequences or anything else) are by definition quadratic in nature. So performing these calculations for large collections aren’t always feasible – in many cases, it’s […]
Inserting 2D Depictions into R Plots
Recent versions of rcdk allow you to insert images of chemical structures into R plots, via the view.image.2d and rasterImage functions. One problem with the latter function is that the 2D structure image must be located in plot units, rather than pixel units. Paul Murrell suggested an easy way to insert the raster image into […]
PAINS Substructure Filters as SMARTS
Sometime back Baell et al published an interesting paper describing a set of substructure filters to identify compounds that are promiscuous in high throughput biochemical screens. They termed these compounds Pan Assay Interference Compounds or PAINS. There are a variety of functional groups that are known to be problematic in HTS assays. The reasons for […]
Updates to R Packages
I’ve uploaded a new version of fingerprint (v 3.4) which now supports feature fingerprints – fingerprints that are represented as variable length vectors of numbers or strings. An example would be circular fingerprints. Now, when reading fingerprints you have to indicate whether you’re loading binary fingerprints or not (via the binary argument in fp.read). A […]