I’ve just uploaded a new version of the fingerprint package (v3.3) to CRAN that implements some ideas described in Nisius and Bajorath. First, the balance method generates “balanced code” fingerprints, which given an input fingerprint of N bits, returns a new fingerprint of 2N bits, such that the bit density is exactly 50%. Second, bit.importance […]
Some More Comparisons with the GSK Dataset
My previous post did a quick comparison of the GSK anti-malarial screening dataset with a virtual library of Ugi products. That comparison was based on the PubChem fingerprints and indicated a broad degree of overlap. I was also interested in looking at the overlap in other feature spaces. The simplest way to do this is […]
A Quick Look at the GSK Malaria Dataset
A few days ago, GSK released an approximately 13,000 member compound library (using the CC0 license) that had been tested for activity against P. falciparum. The structures and data have been deposited into ChEMBL and a paper is available, that describes the screening project and results. Following this announcement there was a thread on FriendFeed, where […]
Spreading the Word About R & Cheminformatics
These last few days I’ve been in the UK for an EBI workshop on cheminformatics in R. It was a two day workshop, the first day focusing on general cheminormatics in R using the rcdk and rpubchem packages, and the second day focusing on doing mass spectrometry in R using XCMS and Rdisop, run by […]
2D Depictions in R Plots
In preparation for the upcoming R workshop at the EBI, I’ve been cleaning up the rcdk package and updating some features. One of the new features is the ability to get a 2D depiction as a raster image. Uptil now, 2D depictions were drawn in a Swing window – this allowed you to resize the […]