# So much to do, so little time

Trying to squeeze sense out of chemical data

## Another Oracle Structure Search Cartridge

without comments

I came across an ASAP paper today describing substructure searching in Oracle databases. The paper comes from the folks at J & J and is part of their series of papers on the ABCD platform. Performing substructure searches in databases is certainly not a new topic, and various products support this in Oracle (as well as other RDBMSs). The paper describes how the ABCD system does this using a combination of structure-derived hash keys and an inverted bitset-based index, and discusses their implementation as an Oracle cartridge. They provide an interesting discussion of how their implementation supports Cost Based Optimization of SQL queries involving substructure search. The authors run a number of benchmarks. In terms of comparative benchmarks, they compare the performance (i.e., screening efficiency) of their hashed keys versus MACCS keys, CACTVS keys and OpenBabel FP2 fingerprints. Their results indicate that the screening step is a key bottleneck in the query process and that their hash key is generally more selective than the others.
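The paper's hash keys themselves aren't available, but the screening idea they build on is standard: a candidate structure can only contain the query substructure if every bit set in the query's key is also set in the candidate's key. A minimal sketch of that superset test using `java.util.BitSet` (all names here are hypothetical illustrations, not the ABCD implementation):

```java
import java.util.BitSet;

// Sketch of the screening step used by substructure search systems:
// a candidate passes the screen only if its key is a superset of the
// query's key. Candidates that fail can be discarded without running
// the (expensive) graph isomorphism check.
public class ScreenSketch {
    // true if every bit set in query is also set in candidate
    static boolean passesScreen(BitSet query, BitSet candidate) {
        BitSet q = (BitSet) query.clone();
        q.andNot(candidate);   // bits present in query but absent in candidate
        return q.isEmpty();    // none left => candidate may contain the query
    }

    public static void main(String[] args) {
        BitSet query = new BitSet(16);
        query.set(1); query.set(5);

        BitSet hit = new BitSet(16);
        hit.set(1); hit.set(5); hit.set(9);   // superset of the query bits

        BitSet miss = new BitSet(16);
        miss.set(1); miss.set(9);             // missing bit 5

        System.out.println(passesScreen(query, hit));   // true
        System.out.println(passesScreen(query, miss));  // false
    }
}
```

A more selective key sets bits that fewer candidates share, so fewer false positives survive the screen and reach the isomorphism step, which is exactly the property the authors benchmark.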

Unfortunately, what would have been interesting but was not provided was a comparison of the performance at the Oracle query level with other products such as JChem Cartridge and OrChem. Furthermore, the test case is just under a million molecules from Golovin & Henrick – the entire dataset (not just the keys) could probably reside in-memory on today's servers. How does the system perform when faced with, say, PubChem (34 million molecules)? The paper mentions a command line implementation of their search procedure, but as far as I can tell, the Oracle cartridge is not available.

The ABCD system has many useful and interesting features. But as with the other publications on this system, this paper is one more in the line of “Papers About Systems You Can’t Use or Buy”. Unfortunate.

Written by Rajarshi Guha

November 10th, 2011 at 11:00 pm

Posted in Literature, cheminformatics


## Caching SMARTS Queries

with 3 comments

Andrew Dalke recently published a detailed write-up on his implementation of the PubChem fingerprints and provided a pretty thorough comparison with the CDK implementation. He pointed out a number of bugs in the CDK version, but he also noted that performance could be improved by caching parsed SMARTS queries, which are used extensively in this fingerprinter. So I wanted to see whether caching really helps.

The CDK SMARTS parser is a JJTree based implementation and my anecdotal evidence suggested that SMARTS parsing was not a real bottleneck. But of course, nothing beats measurements. So I modified the SMARTSQueryTool class to cache parsed SMARTS queries using a simple LRU cache (patch):

```java
final int MAX_ENTRIES = 20;
Map cache = new LinkedHashMap(MAX_ENTRIES + 1, .75F, true) {
    public boolean removeEldestEntry(Map.Entry eldest) {
        return size() > MAX_ENTRIES;
    }
};
```

The map is keyed on the SMARTS string. Then, when we attempt to parse a new SMARTS query, we check if it’s in the cache and if not, parse it and place it in the cache.

Now, the thing about this caching mechanism is that once 20 SMARTS queries have been cached, the least recently used entry is evicted to make room for each new one. As a result, if we match 40 unique SMARTS patterns in sequence against a given molecule, only the last 20 remain cached. When we move on to the next molecule, the first 20 patterns are no longer in the cache, so we get no benefit from the caching scheme. In general, if the fingerprinter (or any caller of the parser) performs N unique SMARTS queries per molecule, the cache must hold at least N entries for subsequent molecules to benefit.
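To make the cache-size argument concrete, here is a small stand-alone sketch (hypothetical names, plain `LinkedHashMap`, no CDK) that replays N unique pattern strings per molecule and counts cache hits:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Demonstrates why an LRU cache smaller than the number of unique
// patterns queried per molecule never produces a hit: by the time a
// pattern comes around again, it has already been evicted.
public class LruSketch {
    static int countHits(int cacheSize, int nPatterns, int nMolecules) {
        Map<String, Object> cache = new LinkedHashMap<String, Object>(cacheSize + 1, 0.75f, true) {
            public boolean removeEldestEntry(Map.Entry<String, Object> eldest) {
                return size() > cacheSize;
            }
        };
        int hits = 0;
        for (int mol = 0; mol < nMolecules; mol++) {
            for (int p = 0; p < nPatterns; p++) {
                String smarts = "pattern-" + p;        // stand-in for a SMARTS string
                if (cache.containsKey(smarts)) hits++;
                else cache.put(smarts, new Object());  // stand-in for the parsed query
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // cache smaller than the number of unique patterns: every lookup misses
        System.out.println(countHits(20, 40, 5));   // 0
        // cache at least as large: every lookup after the first molecule hits
        System.out.println(countHits(40, 40, 5));   // 160
    }
}
```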

I implemented a quick test harness, reading in 100 SMILES and then generating PubChem fingerprints for each molecule. The fingerprint generation was repeated 5 times and the time reported for each round. The box plot shows the distribution of the timings. Now, the CDK implementation uses 621 SMARTS patterns – as you can see, we only get a benefit from the caching when the cache size is 700. In fact, cache sizes below that lead to a performance hit – I assume due to the time required to query the map.

While the performance improvement is not dramatic, it is close to 10% compared to no caching at all. In actuality, the major bottleneck in SMARTS matching is the actual graph isomorphism step (which we hope to drastically improve by using the SMSD code). Maybe then SMARTS parsing will take a bigger fraction of the time. Also keep in mind that, due to the heavyweight nature of CDK molecule objects, very large caches could be a strain on memory. But to check that out, I should use a profiler.

Written by Rajarshi Guha

January 23rd, 2011 at 12:56 am

## Visualizing PAINS SMARTS

without comments

A few days ago I had made available a SMARTS version of the PAINS substructural filters, that were converted using CACTVS from the original SLN patterns. I had mentioned that the SMARTSViewer application was a handy way to visualize the complex SMARTS patterns. Matthias Rarey let me know that his student had converted all the SMARTS to SMARTSViewer depictions and made them available as a PDF. Given the complexity of many of the PAINS patterns, these depictions are a very nice way to get a quick idea of what is supposed to match.

(FWIW, the SMARTS don’t reproduce the matches obtained using the original SLNs – but hopefully the depictions will help anybody who’d like to try and fix the SMARTS.)

Written by Rajarshi Guha

December 2nd, 2010 at 1:18 am

Posted in software, cheminformatics


## Substructure Searching with Hadoop

with 7 comments

My last two posts have described recent attempts at working with Hadoop, a map/reduce framework. As I noted, Hadoop for cheminformatics is quite trivial when working with SMILES files, which are line oriented, but requires a bit more work when dealing with multi-line records such as those in SD files. But now that we have an SDFInputFormat class that handles SD files, we’re ready to throw a variety of tasks at it. In this post I describe a class to perform SMARTS substructure searching using the CDK and Hadoop.

Doing substructure searching in a map/reduce framework is conceptually quite simple. In the map step, the mapper gets a record (in this case a single entry from the SD file) and performs the match using the supplied SMARTS pattern. It emits a key/value pair of the form “molid 1” if the molecule matched the pattern, or “molid 0” if it did not. In either case, molid is some identifier for the given molecule.

The reduce step simply examines each key/value pair it receives and discards those with values equal to 0. The resultant output will contain the IDs (in this case molecule titles, since we’re reading from SD files) of the molecules that matched the supplied pattern.
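Before looking at the Hadoop version, the two steps can be sketched in-memory with a placeholder standing in for the actual SMARTS match (this is an illustration with made-up names, not the CDK or Hadoop code):

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// In-memory sketch of the map/reduce flow described above: map emits
// (molid, 1) or (molid, 0) pairs, reduce keeps the IDs that matched.
public class MapReduceSketch {
    // "map": the boolean stands in for the SMARTS match result
    static Map.Entry<String, Integer> map(String molid, boolean matched) {
        return new AbstractMap.SimpleEntry<>(molid, matched ? 1 : 0);
    }

    // "reduce": discard pairs whose value is 0, keep matching IDs
    static List<String> reduce(List<Map.Entry<String, Integer>> pairs) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, Integer> e : pairs)
            if (e.getValue() == 1) hits.add(e.getKey());
        return hits;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> emitted = Arrays.asList(
                map("mol1", true), map("mol2", false), map("mol3", true));
        System.out.println(reduce(emitted));  // [mol1, mol3]
    }
}
```

Hadoop adds the machinery around this: records are distributed to mappers on different nodes, and pairs with the same key are grouped before reaching the reducer.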

The source for this class is given below:

```java
package net.rguha.dc;

import net.rguha.dc.io.SDFInputFormat;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;
import org.openscience.cdk.CDKConstants;
import org.openscience.cdk.ChemFile;
import org.openscience.cdk.ChemObject;
import org.openscience.cdk.exception.CDKException;
import org.openscience.cdk.interfaces.IAtomContainer;
import org.openscience.cdk.io.MDLV2000Reader;
import org.openscience.cdk.smiles.smarts.SMARTSQueryTool;
import org.openscience.cdk.tools.manipulator.ChemFileManipulator;

import java.io.IOException;
import java.io.StringReader;
import java.util.List;

public class SubSearch {
    static SMARTSQueryTool sqt;
    static {
        try {
            sqt = new SMARTSQueryTool("C");
        } catch (CDKException e) {
        }
    }

    private final static IntWritable one = new IntWritable(1);
    private final static IntWritable zero = new IntWritable(0);

    public static class MoleculeMapper extends Mapper {
        private Text matches = new Text();
        private String pattern;

        public void setup(Context context) {
            pattern = context.getConfiguration().get("net.rguha.dc.data.pattern");
        }

        public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
            try {
                StringReader sreader = new StringReader(value.toString());
                MDLV2000Reader reader = new MDLV2000Reader(sreader);
                ChemFile chemFile = (ChemFile) reader.read((ChemObject) new ChemFile());
                List containersList = ChemFileManipulator.getAllAtomContainers(chemFile);
                IAtomContainer molecule = containersList.get(0);
                sqt.setSmarts(pattern);
                boolean matched = sqt.matches(molecule);
                matches.set((String) molecule.getProperty(CDKConstants.TITLE));
                if (matched) context.write(matches, one);
                else context.write(matches, zero);
            } catch (CDKException e) {
                e.printStackTrace();
            }
        }
    }

    public static class SMARTSMatchReducer extends Reducer {
        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable values,
                           Context context) throws IOException, InterruptedException {
            for (IntWritable val : values) {
                if (val.compareTo(one) == 0) {
                    result.set(1);
                    context.write(key, result);
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 3) {
            System.err.println("Usage: subsearch ");
            System.exit(2);
        }
        // need to set it before we create the Job object
        conf.set("net.rguha.dc.data.pattern", otherArgs[2]);

        Job job = new Job(conf, "id 1");
        job.setJarByClass(SubSearch.class);
        job.setMapperClass(MoleculeMapper.class);
        job.setCombinerClass(SMARTSMatchReducer.class);
        job.setReducerClass(SMARTSMatchReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setInputFormatClass(SDFInputFormat.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

The map method in MoleculeMapper does the job of performing the SMARTS matching. If the molecule matches, it writes out the molecule title and the value 1. The reduce method in SMARTSMatchReducer simply examines each key/value pair and writes out those keys whose value equals 1.

Another important thing to note is that when we pass in the SMARTS pattern as a command line parameter, it doesn’t automatically become available to the mappers, since they will, in general, run on different nodes than the one on which you started the program. So naively storing a command line argument in a variable in main will result in an NPE when you run the program. Instead, we read in the argument and set it as the value for an (arbitrary) key in the Configuration object. The value can then be accessed via the Context object in the mapper class, wherever the mapper is being run.

We compile this to a jar file, and then run it on a 100-molecule SD file from Pub3D:

```
$ hadoop dfs -copyFromLocal ~/Downloads/small.sdf input.sdf
$ hadoop jar rghadoop.jar input.sdf output.sdf "[R2]"
```

The output is of the form

```
$ hadoop dfs -cat output.sdf/part-r-00000
120059  1
20060138    1
20060139    1
20060140    1
20060141    1
20060146    1
3803680 1
3803685 1
3803687 1
3803694 1
...
```

where each line lists the PubChem CID of the molecules that matched (27 in this case).

### Postscript

While I’ve been working on these examples with relatively small inputs on my own laptop, it’d be useful to test them out with larger datasets on a real multi-node Hadoop cluster. If anybody has such a setup (using Hadoop 0.20.0), I’d love to try these examples out. I’ll provide a single jar file and the large datasets.

Written by Rajarshi Guha

May 4th, 2009 at 9:24 pm

## Manipulating SMARTS Queries

with 3 comments

Yesterday, Andreas Maunz asked a question on the openbabel-discuss list:

… a possibility to combine two distinct smarts patterns. Of course, there is the comma OR operator (s1,s2), but I was thinking of a more sophisticated combination, where equal parts of s1 and s2 are “mapped onto each other” in a way such that merely the exclusive parts of s1 and s2 are connected via OR to the common core structure, where possible?

Fundamentally, a solution requires us to be able to manipulate and transform the SMARTS query in various ways. Operations of these types have been discussed by Roger Sayle in a MUG presentation, and they are available in the OpenEye tools. As far as I know, there are no open source implementations of such SMARTS optimizations (or manipulations). Later on in Andreas’ thread, Andrew Dalke suggested that one could attack this by performing an MCSS calculation on two graphs representing the two queries, along with some appropriate normalizations.

It turns out that this type of manipulation isn’t too difficult in the CDK, thanks to Dazhi Jiao’s work on the JJTree based SMARTS parser. Since the parser generates an abstract syntax tree, the SmartsQueryVisitor is able to walk the tree and generate an IQueryAtomContainer. Basically this is analogous to a molecule (i.e., IAtomContainer), and hence we can apply pre-existing code for substructure searches. Thus, to find the largest common query fragment between two SMARTS queries we can do:

```java
import org.openscience.cdk.smiles.smarts.parser.SMARTSParser;
import org.openscience.cdk.smiles.smarts.parser.ASTStart;
import org.openscience.cdk.smiles.smarts.parser.ParseException;
import org.openscience.cdk.smiles.smarts.parser.visitor.SmartsQueryVisitor;
import org.openscience.cdk.isomorphism.matchers.IQueryAtomContainer;
import org.openscience.cdk.isomorphism.UniversalIsomorphismTester;
import org.openscience.cdk.interfaces.IAtom;
import org.openscience.cdk.interfaces.IAtomContainer;
import org.openscience.cdk.interfaces.IBond;
import org.openscience.cdk.exception.CDKException;

import java.util.List;

public class SMARTSManipulator {

    public SMARTSManipulator() {
    }

    private IQueryAtomContainer getQueryMolecule(String smarts) throws ParseException {
        SMARTSParser parser = new SMARTSParser(new java.io.StringReader(smarts));
        ASTStart ast = parser.Start();
        SmartsQueryVisitor visitor = new SmartsQueryVisitor();
        return (IQueryAtomContainer) visitor.visit(ast, null);
    }

    // helper to convert the IQueryAtomContainer to an IAtomContainer, to make
    // the UIT happy. This is OK, since the atoms and bonds in the
    // IAtomContainer are still the query atoms and query bonds
    private IAtomContainer convertToAtomContainer(IQueryAtomContainer queryContainer) {
        IAtomContainer x = queryContainer.getBuilder().newAtomContainer();
        for (IAtom atom : queryContainer.atoms()) x.addAtom(atom);
        for (IBond bond : queryContainer.bonds()) x.addBond(bond);
        return x;
    }

    public void smartsMCSS() throws ParseException, CDKException {
        IQueryAtomContainer queryTarget = getQueryMolecule("c1ccccc1C(=O)C*");
        IQueryAtomContainer queryQuery = getQueryMolecule("*C([N,S])C=O");

        boolean isSubgraph = UniversalIsomorphismTester.isSubgraph(convertToAtomContainer(queryTarget), queryQuery);
        System.out.println("isSubgraph = " + isSubgraph);
        System.out.println("");

        List mcss = UniversalIsomorphismTester.getOverlaps(
                convertToAtomContainer(queryTarget),
                convertToAtomContainer(queryQuery));
        System.out.println("mcss.size() = " + mcss.size() + "\n");
        for (IAtomContainer aMcs : mcss) {
            System.out.println("aMcs.getAtomCount() = " + aMcs.getAtomCount());
            for (IAtom matchingAtom : aMcs.atoms())
                System.out.println("matchingAtom = " + matchingAtom);
            System.out.println("");
        }
    }

    public static void main(String[] args) throws ParseException, CDKException {
        SMARTSManipulator m = new SMARTSManipulator();
        m.smartsMCSS();
    }
}
```

Note that we convert the IQueryAtomContainer to an IAtomContainer, since the UniversalIsomorphismTester expects that we are performing queries against an actual (i.e., real) molecule.

As expected the smaller query is not a subgraph of the larger query. But when we evaluate the MCSS, we do end up with a query fragment consisting of *CC=O.

Now this solution is just a quick hack to see if it could be done – it will fail on more general queries (such as C[#6]N and CCN) since we don’t perform any normalization. Furthermore, there are more sophisticated optimizations that could be done on the AST directly, but that’s for another night.

Written by Rajarshi Guha

March 7th, 2009 at 7:10 am