In my previous post I talked mainly about why there isn’t a large showing of chemistry in the cloud. It was based on Deepak’s post and a FriendFeed thread, but really only addressed the first two words of the title. The issue of collaboration came up in the FriendFeed thread via some comments from Matthew Todd. He asked:
I am also interested in why there are so few distributed chemistry collaborations – i.e. those involving the actual synthesis of chemical compounds and their evaluation. Does it come down to data sharing tools?
The term “distributed chemistry collaborations” arises, partly, from a recent paper. But one might say that the idea of distributed collaborations is already here. Chemists have been collaborating in a variety of ways, though many of these collaborations are small and focused (say between two or three people).
I get the feeling that Matthew is talking about larger collaborations, something along the lines of the CombiUgi project or the ONS Challenge. I think there are a number of factors that might explain why we don’t see more such large, distributed chemistry collaborations.
First, there is the issue of IP and credit. How will it get apportioned? If each collaborator is providing a specific set of skills, I can see it being relatively simple. But then it also sounds like pretty much any current collaboration. What happens when multiple people are synthesizing different compounds? And you have multiple people doing assays? How is work divided? How is credit received? And are large, loosely managed groups even efficient? Of course, one could compare the scenario to many large Open Source projects and their management issues.
Second, I think data sharing tools are a factor. How do collaborations (especially those without an informatics component) efficiently share information? Probably Excel – but there are a number of efforts such as CDD and ChemSpider which are making it much easier for chemists to share chemical information.
A third factor, somewhat related to the previous point, is that academic chemistry has largely ignored the informatics aspects of chemistry (both as an infrastructure topic and as a research area). I think this is partly related to the scale of academic chemistry. Certainly, many topics in chemical research do not require informatics capabilities (compared to, say, ab initio computational capabilities). But there are a number of areas, such as the type that Matthew notes, that can greatly benefit from an efficient informatics infrastructure. I certainly won’t say that it’s all there and ready to use, but I think it’s important that cheminformatics plays a role. In this sense, one could say that there would be many more distributed collaborations if chemists knew that there was an infrastructure that could help their efforts. I will also note that it’s not just about infrastructure: while important, that part is pretty straightforward IT (given some domain knowledge). I do think that there is a lot more to cheminformatics than just setting up databases, and that it can support bench chemistry efforts. Industry realizes this. Academia hasn’t so much (at least not yet).
Which leads me to the fourth factor, which is social. Maybe the reason for the lack of such collaborations is that chemists just don’t have a good way of getting the word out that they are available and/or interested. Certainly, services like FriendFeed are a venue for this to happen, but given that most academic chemists are conservative, it may take time for this to pick up speed.