Encryption of chemical information has not been a very common topic in cheminformatics. There was an ACS symposium in 2005 (summary) that had a number of presentations on the topic of “safe exchange” of chemical information – i.e., exchanging information on chemical structures without sharing the structures themselves. The common thread running through many presentations was to […]
Metabolite Similarity & Dirty Compounds
Edit 10/9/14 – Updated statistics for the 1024-bit fingerprints. There’s been some discussion about a paper by O’Hagan et al. that proposes a Rule of 0.5, which states that 90% of approved drugs exhibit a Tanimoto similarity > 0.5 to one or more human metabolites. Their analysis is based on metabolites listed in […]
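For readers unfamiliar with the metric, here is a minimal sketch of what the Rule of 0.5 test amounts to for a single compound. It assumes fingerprints are represented as sets of "on" bit positions; the function names and the toy data are made up for illustration and are not taken from the paper or from the fingerprint package.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto similarity: shared on-bits divided by total distinct on-bits."""
    union = len(a | b)
    if union == 0:
        # Two empty fingerprints: returning 0.0 is a convention assumed here.
        return 0.0
    return len(a & b) / union


def passes_rule_of_half(drug, metabolites, threshold=0.5):
    """True if the drug is more than `threshold` similar to any metabolite."""
    return any(tanimoto(drug, m) > threshold for m in metabolites)


# Toy fingerprints (hypothetical bit positions, not real molecules).
drug = frozenset({1, 4, 7, 9})
metabolites = [frozenset({1, 4, 7}), frozenset({2, 3})]

print(passes_rule_of_half(drug, metabolites))
```

The 90% figure in the rule would then be the fraction of approved drugs for which such a check returns true, which depends on the fingerprint type and length used.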
Fingerprint Similarity Searches in MongoDB
A few of my recent projects have involved the use of MongoDB, primarily for the ease afforded by a schemaless environment. Some time back I investigated the use of MongoDB to store chemical structure data, though those efforts did not actually query structures per se; instead they queried for precomputed numeric or text properties. So […]
fingerprint 3.5.2 released
Version 3.5.2 of the fingerprint package has been pushed to CRAN. This update includes a contribution from Abhik Seal that significantly speeds up similarity matrix calculations using the Tanimoto metric. His patch led to a 10-fold improvement in running time. However, his code involved the use of nested for loops in R. This is a well […]
Support for feature,count fingerprints in fingerprint 3.5.0
I’ve just updated the fingerprint package to v3.5.0 (it should show up on CRAN shortly, or else you can get it directly from my GitHub repository). The main update in this version is better support for feature,count type fingerprints. An example would be ECFP or signature fingerprints. In these types of fingerprints, the output is usually […]