Cheminformatics and Hotness (or lack thereof)

A few days back I discussed some thoughts on cheminformatics vis a vis bioinformatics, inspired by a review by Aaron Sterling. In that thread, Steven Salzberg made a comment, stating … In my opinion, it is not a “hot” field, though, in part for some of the reasons mentioned in the post – particularly the […]

Cheminformatics – the New World for TCS?

A few weeks back Aaron Sterling posted a review of the Handbook of Cheminformatics Algorithms (in which I have a chapter). Aaron notes … my goal for the project changed from just a review of a book, to an attempt to build a bridge between theoretical computer science and computational chemistry … The review/bridging was […]

Working with Sequences in R

I’ve been working on some RNAi projects and part of that involved generating descriptors for sequences. It turns out that the Biostrings package is very handy and high performance. So, our database contains a catalog for an siRNA library with ~ 27,000 target DNA sequences. To get at the siRNA sequence, we need to convert […]

RNAi in PubChem

While considering ways to disseminate RNAi screening data, I found out that PubChem now contains two RNAi screening datasets – AIDs 1622 and 1904. These screens reuse the PubChem bioaassay formats – which is both good and bad. For example, while there are a few standardized columns (such as PUBCHEM_ACTIVITY_SCORE), the bulk of the user […]

ChEMBL in RDF and Other Musings

Earlier today, Egon announced the release of an RDF version of ChEMBL, hosted at Uppsala. A nice feature of this setup is that one can play around with the data via SPARQL queries as well as explore the classes and properties that the Uppsala folks have implemented. Having fiddled with SPARQL on and off, it […]