Update to the fingerprint Package

I’ve just uploaded a new version of the fingerprint package (v3.3) to CRAN that implements some ideas described in Nisius and Bajorath. First, the balance method generates “balanced code” fingerprints: given an input fingerprint of N bits, it returns a new fingerprint of 2N bits such that the bit density is exactly 50%. Second, bit.importance […]
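
One simple way to obtain a 2N-bit fingerprint with exactly 50% bit density is to append the bitwise complement of the input to itself. The sketch below illustrates that idea in Python; the function name `balance` and the list-of-bits representation are assumptions made for illustration, and this is not the package's actual code.

```python
def balance(bits):
    """Return the input bits followed by their complement.

    Hypothetical helper: assumes a fingerprint is a list of 0/1 values.
    Each input bit contributes exactly one set bit to the output, so the
    result has 2N bits of which N are set.
    """
    return bits + [1 - b for b in bits]


fp = [1, 0, 0, 1, 0, 0, 0, 0]           # sparse 8-bit example fingerprint
balanced = balance(fp)                   # 16 bits
density = sum(balanced) / len(balanced)  # fraction of set bits
print(len(balanced), density)
```

Because every bit is paired with its inverse, the density is 50% regardless of how sparse or dense the input fingerprint is.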

Slides from a Guest Lecture at Drexel University

On Thursday I joined Antony Williams as a guest lecturer in Jean-Claude Bradley’s class on chemical information retrieval at Drexel University. Using a combination of WebEx and Skype, we were able to give our presentations – seamlessly joining three different locations. Technology is great! Tony gave an excellent talk on citizen science and ChemSpider and […]

Brute Force – Inelegant, But Sometimes Useful

A few days back I posted on improving query times in Pub3D by going from a monolithic database (17M rows) to a partitioned version (~3M rows in 6 separate databases) and then performing queries in parallel. I also noted that we were improving query times by making use of an R-tree spatial index. Andrew […]

Conformational Envelopes

Joe Leonard posted a question on the CCL mailing list today regarding “conformation envelopes”. More specifically, he asked: Has there been work on creating visualizations of “conformer envelopes”, graphical representations of the conformational space occupied (or available) to molecules. Particularly when such visualizations are used to (quickly/visually) compare whether 2 molecules can adopt the same […]

Do the CDK Fingerprints Work?

In a previous post, I discussed virtual screening benchmarks and some new public datasets for this purpose. I recently improved the performance of the CDK hashed fingerprints and the next question that arose is whether the CDK fingerprints are any good. With these new datasets, I decided to quantitatively measure how the CDK fingerprints compare […]