The CDK uses the UniversalIsomorphismTester to perform graph and subgraph isomorphism. However, it’s not very efficient, and this shows when performing substructure searches over large collections. A quick test comparing the CDK code to OpenBabel’s obgrep showed that the CDK is nearly forty times slower than OpenBabel. Improvements in this code will enhance […]
In my last post I reported some timing measurements for various operations. One of them was fingerprinting using the path-based hashing Fingerprinter class in the CDK. As reported, it took nearly 4 minutes to process a 1000-molecule subset of ZINC. Not good. So I spent a little time last night hacking on the code, […]
As part of a larger project, I’ve been doing some profiling on various aspects of the CDK, focusing on core cheminformatics operations. I’m using the excellent YourKit profiler to do the tests. The tests are run on a MacBook Pro (2.16GHz) with 1GB RAM, using the latest trunk version of the CDK and JDK 1.5. […]
Since we’re coming up to a 1.2 release (see Egon’s post), I’ve put up a nightly build site for the 1.2.x branch here so that we can track improvements in the JUnit tests and various other code and documentation quality issues.
I just updated the CDK Nightly build script so that it summarizes the state of unit test coverage. Currently, trunk has a total of 3215 methods (in 378 classes) that are missing unit tests. See the JUnit test summary for a per-module breakdown.