Brute Force – Inelegant, But Sometimes Useful

A few days back I posted on improving query times in Pub3D by going from a monolithic database (17M rows), to a partitioned version (~ 3M rows in 6 separate databases) and then performing queries in parallel. I also noted that we were improving query times by making use of an R-tree spatial index. Andrew […]

BibDesk and Word

Since writing papers is pretty much a way of life for an academic, I like to have tools that let me concentrate on the content, yet make beautiful documents with minimal effort on my part. The solution to this is LaTeX. While it gives me beautifully typeset documents, it doesn’t handle bibliographic data management. That […]

Java Port of VFLib Works and it’s Blazing

Sometime back I described how I was porting the VFLib algorithms to Java, so that we could use it for substructure search, since the current UniversalIsomorphismTester is pretty slow for this task, in general. While I had translated the Ullman algorithm implementation of VFLib and shown that it outperformed the CDK method, it turned out […]

Multi-threaded Database Access with Python

Pub3D contains about 17.3 million 3D structures for PubChem compounds, stored in a Postgres database. One of the things we wanted to do was 3D similarity searching and to achieve that we’ve been employing the Ballester and Graham-Richards method. In this post I’m going to talk about performance – how we went from a single […]

Conformational Envelopes

Joe Leonard posted a question on the CCL mailing list today regarding “conformation envelopes”. More specifically, he asked Has there been work on creating visualizations of “conformer envelopes”, graphical representations of the conformational space occupied (or available) to molecules. Particularly when such visualizations are used to (quickly/visually) compare whether 2 molecules can adopt the same […]