So much to do, so little time

Trying to squeeze sense out of chemical data

Archive for the ‘distributed computing’ tag

Hadoop and SD Files


In my previous post I described my initial attempts at working with Hadoop, an implementation of the map/reduce framework. Most Hadoop examples are based on line-oriented input files. In the cheminformatics domain, SMILES files are line-oriented, so applying Hadoop to a variety of tasks that take SMILES input is easy. However, a number of problems require 3D coordinates or metadata, which SMILES cannot provide. Instead, we consider SD files, where each molecule is a multi-line record. So to make use of Hadoop, we’re going to need to support multi-line records.

Handling multi-line records

The basic idea is to create a class that will accumulate text from an SD file until it reaches the $$$$ marker, indicating the end of a molecule. The resultant text is then sent to the mapper as a string. The mapper can then use the CDK to parse the string representation of the SD record into an IAtomContainer object and then carry on from there.

So how do we generate multi-line records for the mapper? First, we extend TextInputFormat so that our class returns an appropriate RecordReader.

package net.rguha.dc.io;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

public class SDFInputFormat extends TextInputFormat {
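    // hand Hadoop our SDFRecordReader so that each call to map() receives one whole SD record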

    @Override
    public RecordReader<LongWritable, Text> createRecordReader(InputSplit inputSplit, TaskAttemptContext taskAttemptContext) {
        return new SDFRecordReader();
    }
}

The SDFRecordReader class is where all the work is done. We start off by setting some instance variables:

public class SDFRecordReader extends RecordReader<LongWritable, Text> {
    private long end;
    private boolean stillInChunk = true;

    private LongWritable key = new LongWritable();
    private Text value = new Text();

    private FSDataInputStream fsin;
    private DataOutputBuffer buffer = new DataOutputBuffer();

    private byte[] endTag = "$$$$\n".getBytes();

The main method in this class is nextKeyValue(). The code in this method simply reads bytes from the input until it hits the end-of-molecule marker ($$$$), then sets the value to the resultant string and the key to the current position in the file. At this point, it doesn’t really matter what we use for the key, since the mapper will usually work with a molecule identifier and some calculated property. As a result, the reducer will likely never see the key generated in this method.

     public boolean nextKeyValue() throws IOException {
        if (!stillInChunk) return false;

        boolean status = readUntilMatch(endTag, true);
       
        value = new Text();
        value.set(buffer.getData(), 0, buffer.getLength());
        key = new LongWritable(fsin.getPos());
        buffer.reset();

        // status is true as long as we're still within the
        // chunk we got (i.e., fsin.getPos() < end). If we've
        // read beyond the chunk it will be false
        if (!status) {
            stillInChunk = false;
        }

        return true;
    }

The remaining methods are pretty self-explanatory and I’ve included the entire class below.

package net.rguha.dc.io;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.DataOutputBuffer;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;
import java.io.IOException;

public class SDFRecordReader extends RecordReader<LongWritable, Text> {
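    // delivers each SD record (everything up to and including the $$$$ delimiter) as a
    // single Text value, keyed by the position in the file at the end of the record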
    private long end;
    private boolean stillInChunk = true;

    private LongWritable key = new LongWritable();
    private Text value = new Text();

    private FSDataInputStream fsin;
    private DataOutputBuffer buffer = new DataOutputBuffer();

    private byte[] endTag = "$$$$\n".getBytes();

    public void initialize(InputSplit inputSplit, TaskAttemptContext taskAttemptContext) throws IOException, InterruptedException {
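        // open the file backing this split and seek to the split's start; if the split
        // doesn't begin at the start of the file, skip past the first $$$$ so that we
        // start reading on a record boundary (the previous split handles the skipped
        // partial record)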
        FileSplit split = (FileSplit) inputSplit;
        Configuration conf = taskAttemptContext.getConfiguration();
        Path path = split.getPath();
        FileSystem fs = path.getFileSystem(conf);

        fsin = fs.open(path);
        long start = split.getStart();
        end = split.getStart() + split.getLength();
        fsin.seek(start);

        if (start != 0) {
            readUntilMatch(endTag, false);
        }
    }

    public boolean nextKeyValue() throws IOException {
        if (!stillInChunk) return false;

        boolean status = readUntilMatch(endTag, true);
       
        value = new Text();
        value.set(buffer.getData(), 0, buffer.getLength());
        key = new LongWritable(fsin.getPos());
        buffer.reset();

        // status is true as long as we're still within the current split
        // (i.e., fsin.getPos() < end); once we've read past it, status is false
        if (!status) {
            stillInChunk = false;
        }

        return true;
    }

    public LongWritable getCurrentKey() throws IOException, InterruptedException {
        return key;
    }

    public Text getCurrentValue() throws IOException, InterruptedException {
        return value;
    }

    public float getProgress() throws IOException, InterruptedException {
        return 0;
    }

    public void close() throws IOException {
        fsin.close();
    }

    private boolean readUntilMatch(byte[] match, boolean withinBlock) throws IOException {
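        // read bytes one at a time until the match sequence ($$$$ plus a newline) has been
        // seen, copying them into the buffer when withinBlock is true; returns false at end
        // of file, and true only if the match was found before the end of this split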
        int i = 0;
        while (true) {
            int b = fsin.read();
            if (b == -1) return false;
            if (withinBlock) buffer.write(b);
            if (b == match[i]) {
                i++;
                if (i >= match.length) {
                    return fsin.getPos() < end;
                }
            } else i = 0;
        }
    }

}

Using multi-line records

Now that we have classes to handle multi-line records, using them is pretty easy. For example, let’s rework the atom counting example to work with SD file input. In this case, the value argument of the map() method will be a Text object containing the SD record. We simply parse this using the MDLV2000Reader and then proceed as before.

    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text haCount = new Text();

        public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
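            // value holds the full text of a single SD record, as delivered by SDFRecordReader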
            try {
                StringReader sreader = new StringReader(value.toString());
                MDLV2000Reader reader = new MDLV2000Reader(sreader);
                ChemFile chemFile = (ChemFile)reader.read((ChemObject)new ChemFile());
                List<IAtomContainer> containersList = ChemFileManipulator.getAllAtomContainers(chemFile);
                IAtomContainer molecule = containersList.get(0);
                for (IAtom atom : molecule.atoms()) {
                    haCount.set(atom.getSymbol());
                    context.write(haCount, one);
                }
            } catch (CDKException e) {
                e.printStackTrace();
            }
        }
    }

Before running it, we need to tell Hadoop to use our SDFInputFormat class; this is done in the main program driver:

Configuration conf = new Configuration();
Job job = new Job(conf, "haCount count");
job.setInputFormatClass(SDFInputFormat.class);
...
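
For reference, a complete driver looks much like the main() method of the HeavyAtomCount example from the previous post (reproduced further down this page), with the one extra setInputFormatClass() call. A minimal sketch, with a placeholder driver class name:

// A sketch only: SDAtomCount is a hypothetical driver class; the mapper and
// reducer are assumed to be the SD-aware TokenizerMapper above and the same
// IntSumReducer used in the atom counting example.
public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "haCount count");
    job.setJarByClass(SDAtomCount.class);
    job.setInputFormatClass(SDFInputFormat.class);   // feed whole SD records to the mapper
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
}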

After regenerating the jar file, we’re ready to run. Our SD file contains 100 molecules taken from Pub3D, which we copy into our local HDFS

hadoop dfs -copyFromLocal ~/Downloads/small.sdf input.sdf

and then run our program

hadoop jar rghadoop.jar input.sdf output.sdf

Looking at the output,

$ hadoop dfs -cat output.sdf/part-r-00000
Br  40
C   2683
Cl  23
F   15
H   2379
N   300
O   370
P   1
S   82

we get a count of the elements across the 100 molecules.

With the SDFInputFormat and SDFRecordReader classes, we are now in a position to throw pretty much any cheminformatics task onto a Hadoop cluster. The next post will look at doing SMARTS-based substructure searches using this framework. Future posts will consider the performance gains (or not) of this approach.

Written by Rajarshi Guha

May 4th, 2009 at 8:32 pm

Hadoop and Atom Counting


Over the past few months I’ve been hacking together scripts to distribute data-parallel jobs. However, it’s always nice when somebody else has done the work. In this case, Hadoop is an implementation of the map/reduce framework from Google. As Yahoo and others have shown, it’s an extremely scalable framework, and when coupled with Amazon’s EC2 it makes for a powerful system for processing large datasets.

I’ve been hearing a lot about Hadoop from my brother, who is working on linking R with Hadoop, and I thought this would be a good time to try it out for myself. So the first task was to convert the canonical word counting example to something closer to my interests – counting the occurrence of elements in a collection of SMILES. This is a relatively easy example: since SMILES files are line-oriented, it’s simply a matter of reworking the WordCount example that comes with the Hadoop distribution.

For now, I run Hadoop 0.20.0 on my MacBook Pro, following these instructions on setting up a single-node Hadoop system. I also put the bin/ directory of the Hadoop distribution in my PATH. The code employs the CDK to parse a SMILES string and identify each element.

package net.rguha.dc;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;
import org.openscience.cdk.DefaultChemObjectBuilder;
import org.openscience.cdk.exception.InvalidSmilesException;
import org.openscience.cdk.interfaces.IAtom;
import org.openscience.cdk.interfaces.IAtomContainer;
import org.openscience.cdk.smiles.SmilesParser;

import java.io.IOException;

public class HeavyAtomCount {
    static SmilesParser sp = new SmilesParser(DefaultChemObjectBuilder.getInstance());
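    // a single shared SmilesParser used to convert each SMILES line into an IAtomContainer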

    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
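            // each input value is one line of the SMILES file; parse it and emit
            // (element symbol, 1) for every atom in the molecule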
            try {
                IAtomContainer molecule = sp.parseSmiles(value.toString());
                for (IAtom atom : molecule.atoms()) {
                    word.set(atom.getSymbol());
                    context.write(word, one);
                }
            } catch (InvalidSmilesException e) {
                // do nothing for now
            }
        }
    }

    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values,
                           Context context) throws IOException, InterruptedException {
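            // sum the per-element counts produced by the mappers (and the combiner)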
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: wordcount <in> <out>");
            System.exit(2);
        }
        Job job = new Job(conf, "word count");
        job.setJarByClass(HeavyAtomCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

This is compiled in the usual manner and packaged into a jar file. For some reason, Hadoop wouldn’t run the class unless the CDK classes were also included in the jar (i.e., the -libjars argument didn’t seem to let me specify the CDK libraries separately from my own code). So the end result was to include the whole CDK in my Hadoop program jar.

OK, the next thing was to create an input file. I extracted 10,000 SMILES from PubChem and copied them into my local HDFS by

hadoop dfs -copyFromLocal ~/Downloads/pubchem.smi input.smi

Then running my program is simply

hadoop jar rghadoop.jar input.smi output.smi

There’s quite a bit of output, though the interesting portion is

09/05/04 14:45:58 INFO mapred.JobClient: Counters: 17
09/05/04 14:45:58 INFO mapred.JobClient:   Job Counters
09/05/04 14:45:58 INFO mapred.JobClient:     Launched reduce tasks=1
09/05/04 14:45:58 INFO mapred.JobClient:     Launched map tasks=1
09/05/04 14:45:58 INFO mapred.JobClient:     Data-local map tasks=1
09/05/04 14:45:58 INFO mapred.JobClient:   FileSystemCounters
09/05/04 14:45:58 INFO mapred.JobClient:     FILE_BYTES_READ=533
09/05/04 14:45:58 INFO mapred.JobClient:     HDFS_BYTES_READ=482408
09/05/04 14:45:58 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=1098
09/05/04 14:45:58 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=336
09/05/04 14:45:58 INFO mapred.JobClient:   Map-Reduce Framework
09/05/04 14:45:58 INFO mapred.JobClient:     Reduce input groups=0
09/05/04 14:45:58 INFO mapred.JobClient:     Combine output records=60
09/05/04 14:45:58 INFO mapred.JobClient:     Map input records=10000
09/05/04 14:45:58 INFO mapred.JobClient:     Reduce shuffle bytes=0
09/05/04 14:45:58 INFO mapred.JobClient:     Reduce output records=0
09/05/04 14:45:58 INFO mapred.JobClient:     Spilled Records=120
09/05/04 14:45:58 INFO mapred.JobClient:     Map output bytes=1469996
09/05/04 14:45:58 INFO mapred.JobClient:     Combine input records=244383
09/05/04 14:45:58 INFO mapred.JobClient:     Map output records=244383
09/05/04 14:45:58 INFO mapred.JobClient:     Reduce input records=60

So we’ve processed 10,000 records, which is good. To see what was generated, we do

hadoop dfs -cat output.smi/part-r-00000

and we get

Ag    13
Al    11
Ar    1
As    6
Au    4
B    49
Ba    7
Bi    1
Br    463
C    181452
Ca    5
Cd    2
Cl    2427
....

Thus, across the entire 10,000 molecules there were 13 occurrences of Ag, 181,452 occurrences of carbon, and so on.

Something useful?

OK, so this is a rather trivial example. But it was quite simple to create, and more importantly, I should be able to take this jar file, run it on a proper multi-node Hadoop cluster, and work with the entire PubChem collection.

A more realistic use case is to do SMARTS searching. In this case, the mapper would simply emit the molecule title along with an indication of whether it matched the supplied pattern (say, 1 for a match, 0 otherwise), and the reducer would simply collect the key/value pairs for which the value was 1. Since this can be done on SMILES input, it is quite simple.
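
As a rough sketch (not code from the post), such a mapper might look like the following. It assumes the CDK’s SMARTSQueryTool (whose constructor signature varies between CDK versions) and that each input line is a SMILES string optionally followed by a molecule title.

    public static class SmartsMatchMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private final static IntWritable zero = new IntWritable(0);
        private SmilesParser parser = new SmilesParser(DefaultChemObjectBuilder.getInstance());
        private Text molId = new Text();

        public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
            // assume each input line is "SMILES<whitespace>title"
            String[] tokens = value.toString().trim().split("\\s+", 2);
            try {
                // in a real job the query tool would be created once and the pattern
                // read from the job configuration; it is hard-coded here for brevity
                SMARTSQueryTool queryTool = new SMARTSQueryTool("c1ccccc1");
                IAtomContainer molecule = parser.parseSmiles(tokens[0]);
                molId.set(tokens.length > 1 ? tokens[1] : tokens[0]);
                context.write(molId, queryTool.matches(molecule) ? one : zero);
            } catch (CDKException e) {
                // skip molecules that fail to parse or match
            }
        }
    }

The corresponding reducer would then simply write out the keys whose value is 1.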

A slightly non-trivial task is to apply this framework to SD files. My motivation is that I’d like to run pharmacophore searches across large collections, without having to split up large SD files by hand. Hadoop is an excellent framework for this. The problem is that most Hadoop examples work with line-oriented data files. SD files are composed of multi-line records and so this type of input requires some extra work, which I’ll describe in my next post.

Debugging note

When debugging, it’s useful to edit the Hadoop config file to include

<property>
  <name>mapred.job.tracker</name>
  <value>local</value>
  <description>foo</description>
</property>

so that it runs as a single process.

Written by Rajarshi Guha

May 4th, 2009 at 7:24 pm

Chemistry, Clouds, Collaboration (Part 1)


There’s been an interesting discussion sparked by Deepak’s post asking why there is a much smaller showing of chemists and chemistry applications in the cloud compared to other life science areas. This post led to a FriendFeed thread that raised a number of issues.

At a high level one can easily point out factors such as licensing costs for the tools to do chemistry in the cloud, the lack of standards in datasets and formats, and so on. As Joerg pointed out in the FF thread, IP issues and security are major factors. Even though I’m not a cloud expert, I have read and heard of various cases where financial companies are using clouds. Whether their applications involve sensitive data I don’t know, but it seems that this is one area that is addressable (if not already addressed). As a side note, I was interested to see that Lilly seems to be making a move towards an Amazon-based cloud infrastructure.

But when I read Deepak’s post, the question that occurred to me was: what is the compelling chemistry application that would really make use of the cloud?

While things like molecular dynamics are not going to run too well on a cloud setup, problems that are data parallel can make excellent use of such a setup. Given that, some immediate applications include docking, virtual screening and so on. There have been a number of papers on the use of Grids for docking, so one could easily consider docking in the cloud. Virtual screening (using docking, machine learning, etc.) would be another application.

But the problem I see facing these efforts is that they tend to be project-specific. In contrast, doing something like BLAST in the cloud is more standardized – you send in a sequence and compare it to the usual standard databases of sequences. On the other hand, each docking project is different in terms of the receptor (though there’s less variation there) and the ligand libraries. So on the chemistry side, the input is much larger and more variable.

Similarity searching is another example – one usually searches against a public database or a corporate collection. If these are not in the cloud, making use of the cloud is not very practical. Furthermore, how many different collections should be stored and accessed in the cloud?

Following on from this, one could ask: are chemistry datasets really that large? I’d say no. But I qualify this statement by noting that many projects are quite specific – a single receptor of interest and some focused library. Even if that library is 2 or 3 million compounds, it’s still not very large. For example, while working on the Ugi project with Jean-Claude Bradley I had to dock 500,000 compounds. It took a few days to set up the conformers and then 1.5 days to do the docking, on 8 machines. With the conformers in hand, we can rapidly redock against other targets. But 8 machines is really small. Would I want to do this in the cloud? Sure, if it were set up for me. But I’d still have to transfer 80 GB of data (though Amazon has this now). So the data is not big enough that I can’t handle it.

So this leads to the question: what is big enough to make use of the cloud?

What about really large structure databases, say PubChem and ChemSpider? While Amazon has made progress in this direction by hosting PubChem, chemistry still faces the problem that PubChem is not the whole chemical universe. There will invariably be portions of chemical space that are not represented in a database. On the other hand, a community-oriented database like ChemSpider could take on this role – it already contains PubChem, so one could consider groups putting in their collections of interest (yes, IP is an issue, but I can be hopeful!) and expanding the coverage of chemical space.

So to summarize, why isn’t there more chemistry in the cloud? Some possibilities include

  • Chemistry projects tend to be specific, in the sense that there aren’t a whole lot of “standard” collections
  • Large structure databases are not in the cloud and if they are, still do not cover the whole of chemical space
  • Many chemistry problems are not large in terms of data size, compared to other life science applications
  • Cheminformatics is a much smaller community than bioinformatics, though this applies mainly to non-corporate settings (where the reverse is likely true)

Though I haven’t explicitly talked about the tools, they certainly play a role. While there are a number of Open Source solutions to various cheminformatics problems, many people use commercial tools and will want to use them in the cloud. So one factor that will need to be addressed is the vendors coming on board and supporting cloud-style setups.

Written by Rajarshi Guha

February 22nd, 2009 at 5:00 pm