Since I do a lot of cheminformatics work in R, I’ve created various functions and packages that make life easier for me as I do my modeling and analysis. Most of them are for private consumption. However, I’ve released a few of them to CRAN since they seem to be generally useful. One of them is […]
Datasets for Virtual Screening Benchmarks
Virtual screening (VS) is a common task in the drug discovery process: a computational method for identifying promising compounds in a collection of hundreds to millions of candidates. What “promising” means depends on the context – it might be compounds that will likely exhibit certain pharmacological effects. Or compounds that are […]
Which Bits are Important for Similarity Searches?
The recent paper by Wang and Bajorath describes an interesting approach to identifying the important bits in a fingerprint with respect to a dataset. Their discussion focuses on structural key fingerprints (such as MACCS and the BCI fingerprints), and the problem they are trying to address is the fact that certain structural features […]
AJAX’ified Pub3D
Pub3D is a 3D version of PubChem, in which we have generated a single conformer for 99% of PubChem using the smi23d suite of programs. The structures are then stored in a PostgreSQL database along with their distance moment shape descriptors described by Ballester and Richards. This allows us to perform shape similarity queries against […]
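For readers who haven't seen distance moment shape descriptors before, here is a minimal sketch of one common formulation: pick four reference points from the conformer, summarize the atom distances to each with three moments, and compare the resulting 12-number vectors. This is not the Pub3D code or its database schema. The function names (`shape_descriptors`, `shape_similarity`) are hypothetical, and the choice of mean, variance and third central moment as the three moments is an assumption, since implementations differ on the exact form.

```python
import math


def _dist(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def _moments(values):
    """Mean, variance and third central moment of a list of distances."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    third = sum((v - mean) ** 3 for v in values) / n
    return [mean, variance, third]


def shape_descriptors(coords):
    """Return 12 shape descriptors for a list of (x, y, z) atom coordinates.

    Reference points (an assumed, commonly used choice):
    the centroid, the atom closest to it, the atom farthest from it,
    and the atom farthest from that farthest atom.
    """
    n = len(coords)
    centroid = tuple(sum(c[i] for c in coords) / n for i in range(3))
    d_centroid = [_dist(c, centroid) for c in coords]
    closest = coords[d_centroid.index(min(d_centroid))]
    farthest = coords[d_centroid.index(max(d_centroid))]
    d_farthest = [_dist(c, farthest) for c in coords]
    far_from_far = coords[d_farthest.index(max(d_farthest))]

    descriptors = []
    for ref in (centroid, closest, farthest, far_from_far):
        descriptors.extend(_moments([_dist(c, ref) for c in coords]))
    return descriptors


def shape_similarity(d1, d2):
    """Similarity in (0, 1]; 1.0 means the descriptor vectors are identical."""
    mean_abs_diff = sum(abs(a - b) for a, b in zip(d1, d2)) / len(d1)
    return 1.0 / (1.0 + mean_abs_diff)
```

Because the descriptors are built only from interatomic distances, they do not change when the conformer is translated or rotated, so no alignment step is needed before comparing two structures. Since each structure reduces to a short fixed-length numeric vector, the descriptors can be stored as ordinary columns next to the structure and compared without touching the 3D coordinates again.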
CDL – A Cheminformatics Toolkit
The Chemical Descriptors Library (CDL) has been around for a while, but doesn’t seem to have received much publicity. A paper describing the design and performance of the library just came out today. While the name suggests a library of descriptors, it’s actually a general C++ library for cheminformatics. The library appears to use the molecular […]