Archive for the ‘workshop’ tag
The last few days I’ve been at the EBI, attending the Molecular Informatics Open Source Software (MIOSS) workshop. As part of this trip to the UK, I’ve also had the opportunity to present some of the work my colleagues and I have done at the NCTT – thanks to Mark Forster for the invitation to speak at Syngenta and to John Chambers for having me speak to the ChEMBL group. At the workshop I presented my work on cheminformatics in R.
The focus of the workshop was to bring OSS developers and users from industry and academia/government together to hear about a variety of projects and discuss issues underlying the development and use of these projects. There were some very nice presentations – I won’t go into too much detail, but some highlights for me included:
- Kevin Lawson (Syngenta) presented his work on LICSS – integrating the CDK with Excel. While I’m not a fan of Excel, it’s a necessary evil. I was quite surprised at the performance he achieved for substructure searches within Excel and the ability to access various functionalities of the CDK as Excel functions. While it probably won’t replace Accord or ChemOffice right now, it’s something to take a look at.
- Mike Bodkin (Lilly) spoke about the use of KNIME at Lilly. They have built up an extensive collection of commercial and OSS nodes and it’s clear that KNIME is capable of giving Pipeline Pilot a run for its money. Thorsten Meinl then spoke about the OSS development of KNIME, and mentioned that they now support a collection of HCS and image analysis nodes (courtesy of MPI Dresden). This is quite interesting, given that we’re ramping up our HCS capabilities at the NCTT.
- Hans de Winter of Silicos spoke about the tools and services that their company has produced on top of Open Babel (and contributed back to the community). Quite encouraging to see a cheminformatics company making money off the OSS stack.
- Greg Landrum spoke about RDKit, presenting the RDKit-based cartridge for PostgreSQL. He showed some nice performance numbers and it was nice to see that they had gotten the coders who implemented the GiST indexing mechanisms to implement a GiST index for binary fingerprints.
In addition to these, there were other talks on Open Babel, Cinfony, Taverna, fpocket and others. While I’ve known about many of these projects, it was useful to learn some of the details from the developers themselves.
A number of issues surrounding OSS development and use were discussed. For example, community development was regarded as a key factor in the success of OSS projects. Erik Lindahl, of GROMACS fame, spoke about the development model of GROMACS and how much of its success has been due to community involvement. Other issues included the importance (and lack) of good documentation, what motivates people to contribute to OSS, and so on.
It was nice that industry participants made up about 50% of the group, and a number of industry-related issues also arose. For example, there were several discussions of business models based around OSS and how they can feed back into OSS projects. A common thread seemed to be that service and customization of OSS are good approaches to building businesses around the OSS stack, Silicos and Eagle Genomics being two prime examples.
The fact that there are industry users of OSS as well as industry members contributing back to OSS projects was very encouraging. An idea supported by a number of participants was some form of web site / wiki where such contributors and users could list themselves (IMO, the Blue Obelisk wiki could be a candidate for this type of thing). Sure, there’d usually be corporate and legal barriers to this type of thing, but if done it would have a number of benefits – encouragement for project developers and an easily viewable precedent that would encourage other companies to use or participate in OSS projects, resulting in a positive feedback loop. With various pre-competitive collaboration efforts (e.g., Pistoia Alliance) popping up in the pharma industry, this is certainly possible.
Finally, it’s always good to meet up with old friends and also meet people whom I’ve only known over email. The social aspects of the workshop were very nice – helped greatly by excellent food and drink! Thanks to Mark for putting together a great meeting.
Finally, all the paperwork is in place for me to visit the EBI in May to run a hands-on workshop on doing cheminformatics in R. This is part of a two-day program. On May 17th I will be discussing the use of the rcdk and rpubchem packages, considering a variety of use cases such as QSAR modeling and fingerprint similarity. I’ll also be touching on the CDK, as that’s the underlying toolkit. The second day, May 18th, run by Stefan Neumann, Paul Benton and David Broadhurst, will focus on handling and analyzing mass spectrometry data in R.
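For readers who haven't met the fingerprint similarity use case before, here is a rough sketch of the idea in plain Python rather than R. It does not use rcdk or the CDK; the function name, the molecule names and the on-bit positions are all made up for illustration. It computes the Tanimoto coefficient between binary fingerprints (stored as sets of on-bit indices) and ranks a tiny library against a query.

```python
def tanimoto(fp_a: set[int], fp_b: set[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen for this sketch when both are empty
    shared = len(fp_a & fp_b)  # bits set in both fingerprints
    return shared / (len(fp_a) + len(fp_b) - shared)  # shared / union


# Hypothetical fingerprints: the bit positions are invented, not computed
# from real structures.
query = {1, 4, 9, 16, 25}
library = {
    "mol_a": {1, 4, 9, 16, 36},
    "mol_b": {2, 3, 5},
    "mol_c": {1, 4, 9, 16, 25},
}

# Rank library entries from most to least similar to the query.
ranked = sorted(library, key=lambda name: tanimoto(query, library[name]), reverse=True)
for name in ranked:
    print(name, round(tanimoto(query, library[name]), 3))
```

In practice the fingerprints would come from a toolkit, and the ranking would run over thousands of structures, but the arithmetic is the same.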
I’m really looking forward to this trip and the chance to discuss these packages as well as get feedback on what’s useful and what’s not. If any readers are attending and would like me to cover specific topics, feel free to drop me a line.
I’ll be reaching London on May 14th, visiting some friends and then heading to Hinxton on May 16th. Another reason I’m looking forward to the trip (in addition to some good fish and chips) is that I get to visit Harrogate, where I grew up.