BAZOO

So much to do, so little time

Trying to squeeze sense out of chemical data

CDL – A Cheminformatics Toolkit

The Chemical Descriptors Library (CDL) has been around for a while, but doesn’t seem to have received much publicity. A paper describing the design and performance of the library just came out today. While the name suggests a library of descriptors, it’s actually a general C++ library for cheminformatics. The library appears to use the molecular graph as its core concept and uses the Boost Graph Library (BGL) to represent and manipulate molecular graphs. Some features include substructure searching using SMARTS, fingerprints, descriptors (CATS, a bunch of topological ones, etc.) and file format reading (SMILES and SDF as far as I can see).
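
To make the "molecule as a graph" idea concrete, here is a minimal sketch in plain Python. It is not CDL's or BGL's API; the names `make_graph`, `bfs_distances` and `wiener_index` are made up for illustration. Atoms are nodes, bonds are edges, and hydrogens are left out. The Wiener index (the sum of shortest-path bond distances over all pairs of heavy atoms) stands in as an example of a topological descriptor; the post does not say whether CDL computes this particular one.

```python
from collections import deque


def make_graph(n_atoms, bonds):
    """Build an undirected adjacency list from (atom, atom) bond pairs."""
    adjacency = {i: [] for i in range(n_atoms)}
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)
    return adjacency


def bfs_distances(adjacency, start):
    """Bond-count distance from `start` to every reachable atom (BFS)."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for neighbour in adjacency[atom]:
            if neighbour not in distances:
                distances[neighbour] = distances[atom] + 1
                queue.append(neighbour)
    return distances


def wiener_index(adjacency):
    """Sum of shortest-path distances over all unordered atom pairs."""
    total = sum(
        sum(bfs_distances(adjacency, atom).values()) for atom in adjacency
    )
    return total // 2  # every pair was counted from both ends


# Heavy-atom skeletons only (hypothetical hand-built inputs, no file parsing).
butane = make_graph(4, [(0, 1), (1, 2), (2, 3)])      # C-C-C-C
isobutane = make_graph(4, [(0, 1), (1, 2), (1, 3)])   # branched isomer

print(wiener_index(butane), wiener_index(isobutane))
```

The two isomers have the same atoms but different connectivity, so the descriptor differs between them. A real toolkit would also store element types and bond orders on the nodes and edges, and would build the graph from SMILES or SDF input rather than by hand.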

It seems nice and is available under the Boost Software License. While it does a lot of the basic operations, it doesn’t appear to be as comprehensive as, say, OpenBabel or RDKit. However, it’s good to see the cheminformatics toolkit ecosystem growing.

An aside – I haven’t really done much C++ coding and what little I do is basically ‘C in C++’. But how do people get their heads around C++ templates? I tend to get a headache when trying to examine one. And I thought that writing Java was tedious – C++ with templates takes the cake!

Update - Their SourceForge project page is here, but I can’t seem to find a download link. A software paper with no software!

Written by Rajarshi Guha

September 20th, 2008 at 2:20 pm

Posted in cheminformatics,software

5 Responses to 'CDL – A Cheminformatics Toolkit'

  1. Exactly, I can’t find the download page either.

    Vladimir Chupakhin

    23 Sep 08 at 8:25 am

  2. Hi Rajarshi,
    “A software paper with no software!”

    … that’s what I thought too. The software seems to have been hanging around since March 2006 (in 2005 Morphochem was taken over by Biovertis (Intercell)).

    The link in the PDF is also broken, but the software can be found as an archive file named morpho_1_2_0.tar

    Search for it on Cuil or Google, or use the direct link here:
    http://ftp.heanet.ie/disk1/sourceforge/c/cd/cdelib/

    Isn’t it nice to have sourceforge on a disk array?
    That’s like the internet-in-a-box offered by archive.org and Alexa some time ago (when it was less than 2 petabytes).
    http://www.archive.org/about/faqs.php

    I haven’t tried to compile it yet, but I chuckled at
    “what little I do is basically ‘C in C++’”. Same here, but that’s fair enough. I use a commercial C++ to Java converter from Tangible to convert most of the important code snippets I have. I only know three to five hardcore C++ programmers, and it just doesn’t make sense to stick to C++ if nobody around you uses it. I am well aware of the Windows and Linux kernels etc., but the (open source) cheminformatics field is leaning towards Java.

    But, as Joerg already wrote in 2007, I must applaud and congratulate the authors for getting the library released as open source.
    http://miningdrugs.blogspot.com/2007/06/making-case-chemical-descriptor-library.html

    On the other hand, it’s not so much about open source. If I cannot program in C++ and the community of developers does not grow beyond three, the CDL library is doomed (I do not wish that). It would be good if Vladimir could comment on the origin and possible future of the whole project.

    Cheers
    Tobias

    Tobias Kind

    24 Sep 08 at 9:43 am

  3. Tobias, thanks a lot for the link. I’ll play around with the sources.

    Regarding your comments on C++ – from what I have read in various places, using C++ as C++ takes a significant amount of effort! But even though a lot of cheminformatics is done in Java, I definitely see the role of C/C++ libs – stuff like database backends and so on. Also, Python / Ruby frontends such as those OB provides are extremely useful.

    However, I will throw out a comment (wearing my asbestos underwear!) that much of cheminformatics is not in need of blazing high performance. If it takes 30 seconds versus 20 seconds to evaluate descriptors, does it really matter? However, I do think it’s important that the core algorithms (DFS, BFS, ring perception, SMILES parsing etc) be as fast as possible.

    Regarding the issue of using what people around you use, I can see the value in that. I’m falling in love with Lisp – but I doubt I’ll ever go beyond the hobby level, since I can’t really do much cheminformatics with it (though Clojure seems to be a way out).

    Rajarshi Guha

    24 Sep 08 at 9:30 pm

  4. Hmmm…SWIG does Lisp too…:-)

    baoilleach

    3 Oct 08 at 11:00 am

  5. Noel, I didn’t know that SWIG does that. However, a nicer Lisp dialect seems to have popped up – Clojure. It’s basically a Lisp on the JVM and you get seamless access to Java libs.

    Rajarshi Guha

    3 Oct 08 at 12:12 pm