Do the CDK Fingerprints Work?

In a previous post, I discussed virtual screening benchmarks and some new public datasets for this purpose. I recently improved the performance of the CDK hashed fingerprints, and the next question that arose was whether the CDK fingerprints are any good. With these new datasets, I decided to quantitatively measure how the CDK fingerprints compare […]

Working With Fingerprints in R (can’t beat C!)

Since I do a lot of cheminformatics work in R, I’ve created various functions and packages that make life easier for me as I do my modeling and analysis. Most of them are for private consumption. However, I’ve released a few of them to CRAN since they seem to be generally useful. One of them is […]

Datasets for Virtual Screening Benchmarks

Virtual screening (VS) is a common task in the drug discovery process: a computational method to identify promising compounds from a collection of hundreds to millions of possible compounds. What exactly “promising” means depends on the context – it might be compounds that will likely exhibit certain pharmacological effects. Or compounds that are […]

So much to do, so little time

Trying to squeeze sense out of chemical data
