So much to do, so little time

Trying to squeeze sense out of chemical data

Archive for the ‘pains’ tag

Visualizing PAINS SMARTS

without comments

A few days ago I had made available a SMARTS version of the PAINS substructural filters, that were converted using CACTVS from the original SLN patterns. I had mentioned that the SMARTSViewer application was a handy way to visualize the complex SMARTS patterns. Matthias Rarey let me know that his student had converted all the SMARTS to SMARTSViewer depictions and made them available as a PDF. Given the complexity of many of the PAINS patterns, these depictions are a very nice way to get a quick idea of what is supposed to match.

(FWIW, the SMARTS don’t reproduce the matches obtained using the original SLN’s – but hopefully the depictions will help anybody who’d like to try and fix the SMARTS).

Written by Rajarshi Guha

December 2nd, 2010 at 1:18 am

Posted in cheminformatics,software

Tagged with , ,

PAINS Substructure Filters as SMARTS

with one comment

Sometime back Baell et al published an interesting paper describing a set of substructure filters to identify compounds that are promiscuous in high throughput biochemical screens. They termed these compounds Pan Assay Interference Compounds or PAINS. There are a variety of functional groups that are known to be problematic in HTS assays. The reasons for exclusion of molecules with these and other groups range from reactivity towards proteins to poor developmental potential or known toxicity. Derek Lowe has a nice summary of the paper.

The paper published the substructure filters as a collection of Sybyl Line Notation (SLN) patterns. Unfortunately, without access to Sybyl, it’s difficult to reuse the published patterns. Having them in ¬†SMARTS form would allow one to use them with many more (open source or commercial) tools. Luckily, Wolf Ihlenfeldt came to the rescue and provide me access to a version of the CACTVS toolkit that was able to convert the SLN patterns to SMARTS.

There are three files, p_l15, p_l150 and p_m150 corresponding to tables S8, S7 and S6 from the supplementary information. The first column is the pattern and the second column is the name for that pattern taken from the original SLN files. While all patterns were converted to SMARTS, the conversion process is not perfect as I have not been able to reproduce (using the OEChem toolkit with the Tripos aromaticity model) all the hits that were obtained using the original SLN patterns.

(As a side note, the SMARTSViewer is a really handy tool to visualize a SMARTS pattern – which is great since many of the PAINS patterns are very complex)

Written by Rajarshi Guha

November 14th, 2010 at 8:41 pm