Joerg has made a nice blog post on the use of Open Source software and data to analyse the occurence of antithrombotics. More specifically he was trying to answer the question Which XRay ligands are closest to the Fontaine et al. structure-activity relationship data for allowing structure-based drug design? Using Blue Obelisk tools and ChemSpider […]
Getting a CAS Number from a PubChem CID
A few days back, Hari on FriendFeed had asked how one could get a a CAS number from a PubChem compound ID (CID). The reverse, that is finding a CID for a given CAS number is generally quite easy as shown by Rich here and here. Since I was trying to get some writing done, […]
Chemistry in Google Docs
I met with Jean-Claude Bradley yesterday and we had a pretty useful hack session, allowing him to easily incorporate chemical and cheminformatics functionality into a GoogleDocs spreadsheet. A common task that Jean-Claude wanted to automate was the calculation of milligrams (or milliliters) of a chemical required for a certain molarity. So what we need for […]
Do the CDK Fingerprints Work?
In a previous post, I dicussed virtual screening benchmarks and some new public datasets for this purpose. I recently improved the performance of the CDK hashed fingerprints and the next question that arose is whether the CDK fingerprints are any good. With these new datasets, I decided to quantitatively measure how the CDK fingerprints compare […]
Datasets for Virtual Screening Benchmarks
Virtual screening (VS) is a common task in the drug discovery process and is a computational method to identify promising compounds from a collection of hundreds to millions of possible compounds. What “promising” exactly means, depends on the context – it might be compounds that will likely exhibit certain pharmacological effects. Or compounds that are […]