Recently I came across a paper from Marth et al. that described a method based on network analysis to support retrosynthetic planning, particularly for complex natural products. I’m no synthetic chemist, so I can’t comment on the relevance or importance of the targets or the significance of the proposed approach to planning a synthetic route. What caught my eye was the claim that
“This work validates the utility of network analysis as a starting point for identifying strategies for the syntheses of architecturally complex secondary metabolites.”
I was a little disappointed (hey, a Nature publication sets certain expectations!) that the network analysis fundamentally amounted to walking the molecular graph to identify a certain type of ring, termed the maximally bridging ring. The algorithm is described in the SI and the authors make it available as an online tool. Unfortunately they didn’t provide any source code for their algorithm, which was a bit irritating, given that the algorithm is a key component of the paper.
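To give a flavour of what "walking the molecular graph" might involve, here is a minimal sketch in plain Java (no CDK). It is not the authors' algorithm, nor the one in my repository: it assumes the rings have already been perceived and are supplied as ordered lists of atom indices, it treats two rings as bridged when they share three or more atoms, and it calls the ring with the most bridgehead atoms the maximally bridging one. The class name, method names and the toy ring set are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class MaxBridgingSketch {

    // Bridgehead atoms for one ring pair: the ends of the path the rings share.
    // ASSUMPTION: a pair counts as bridged only if it shares 3+ atoms;
    // 2 shared atoms is a plain fusion, 1 is a spiro junction.
    static Set<Integer> bridgeheads(int[] a, int[] b) {
        Set<Integer> inB = new HashSet<>();
        for (int atom : b) inB.add(atom);
        Set<Integer> shared = new HashSet<>();
        for (int atom : a) if (inB.contains(atom)) shared.add(atom);
        Set<Integer> heads = new HashSet<>();
        if (shared.size() < 3) return heads;
        for (int i = 0; i < a.length; i++) {
            if (!shared.contains(a[i])) continue;
            int prev = a[(i + a.length - 1) % a.length];
            int next = a[(i + 1) % a.length];
            // A shared atom with a non-shared ring neighbour ends the shared path.
            if (!shared.contains(prev) || !shared.contains(next)) heads.add(a[i]);
        }
        return heads;
    }

    // Number of bridgehead atoms contained in each ring.
    static int[] bridgeheadCounts(List<int[]> rings) {
        Set<Integer> allHeads = new HashSet<>();
        for (int i = 0; i < rings.size(); i++)
            for (int j = i + 1; j < rings.size(); j++)
                allHeads.addAll(bridgeheads(rings.get(i), rings.get(j)));
        int[] counts = new int[rings.size()];
        for (int r = 0; r < rings.size(); r++)
            for (int atom : rings.get(r))
                if (allHeads.contains(atom)) counts[r]++;
        return counts;
    }

    // Indices of the rings with the highest (non-zero) bridgehead count.
    static List<Integer> maximallyBridging(List<int[]> rings) {
        int[] counts = bridgeheadCounts(rings);
        int max = 0;
        for (int c : counts) max = Math.max(max, c);
        List<Integer> best = new ArrayList<>();
        for (int r = 0; r < counts.length; r++)
            if (max > 0 && counts[r] == max) best.add(r);
        return best;
    }

    public static void main(String[] args) {
        // Toy ring set (hypothetical, not from the paper): a six-membered ring
        // with two one-atom bridges, across atoms 0/2 and across atoms 3/5.
        List<int[]> rings = Arrays.asList(
            new int[]{0, 1, 2, 3, 4, 5},
            new int[]{0, 1, 2, 6},
            new int[]{3, 4, 5, 7});
        System.out.println(maximallyBridging(rings)); // prints [0]
    }
}
```

In the toy example the six-membered ring contains all four bridgehead atoms (0, 2, 3 and 5), while each small ring contains only two, so ring 0 is reported. A real implementation also has to decide which rings to consider in the first place (ring set, size limits), which is where the parameters from the paper come in.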
I put together an implementation using the CDK (1.5.12), available in a GitHub repo. It’s a quick hack, using the parameters specified in the paper, and hasn’t been extensively tested. However, it seems to give the correct result for the first few test cases in the SI.
The tool prints the hash code of each ring recognized as maximally bridging and also generates an SVG depiction with the first such ring highlighted in red, as shown alongside. You can build a self-contained version of the tool as follows:
    git clone git@github.com:rajarshi/maxbridgerings.git
    cd maxbridgerings
    mvn clean package
The tool can then be run (with the depiction output to Copaene.svg):
    java -jar target/MaximallyBridgingRings-1.0-jar-with-dependencies.jar \
        "CC(C)C1CCC2(C3C1C2CC=C3C)C" Copaene