Cheminformatics with Hadoop and EC2

In the last few posts I’ve described how I’ve gotten up to speed on developing map/reduce applications using the Hadoop framework. The nice thing is that I can set it all up and test it out on my laptop and then easily migrate the application to a large production cluster. Over the past few days […]

Substructure Searching with Hadoop

My last two posts have described recent attempts at working with Hadoop, a map/reduce framework. As I noted, using Hadoop for cheminformatics is quite trivial when working with SMILES files, which are line-oriented, but it requires a bit more work when dealing with multi-line records such as those in SD files. But now that we have a […]

Hadoop and SD Files

In my previous post I described my initial attempts at working with Hadoop, an implementation of the map/reduce framework. Most Hadoop examples are based on line-oriented input files. In the cheminformatics domain, SMILES files are line-oriented, so applying Hadoop to a variety of tasks that work with SMILES input is easy. […]

Hadoop and Atom Counting

Over the past few months I’ve been hacking together scripts to distribute data-parallel jobs. However, it’s always nice when somebody else has done the work. In this case, Hadoop is an implementation of the map/reduce framework from Google. As Yahoo and others have shown, it’s an extremely scalable framework, and when coupled with Amazon’s […]

PubChem Bioassay Annotation Poster

Some time back I described some work on the automated annotation of PubChem bioassays. The lack of annotations on the assays can make it difficult to integrate them with other biological resources. Ideally, the bioassays would be manually annotated; however, it’s not a very exciting job. So, collaborating with Patrick Ruch and Julien Gobeill, we […]

So much to do, so little time

Trying to squeeze sense out of chemical data
