So much to do, so little time

Trying to squeeze sense out of chemical data

Archive for the ‘spatial index’ tag

Brute Force – Inelegant, But Sometimes Useful

with one comment

A few days back I posted on improving query times in Pub3D by going from a monolithic database (17M rows), to a partitioned version (~ 3M rows in 6 separate databases) and then performing queries in parallel. I also noted that we were improving query times by making use of an R-tree spatial index.

Andrew Dalke posted a comment:

I’ve wondered about this quote from the ANN page at http://www.cs.umd.edu/~mount/ANN/ .

Computing exact nearest neighbors in dimensions much higher than 8 seems to be a very difficult task. Few methods seem to be significantly better than a brute-force computation of all distances.”

Since you’re in 12-D space, this suggests that a linear search would be faster. The times I’ve done searches for near neighbors in higher dimensional property space have been with a few thousand molecules at most, so I’ve never worried about more complicated data structures.

Read the rest of this entry »

Written by Rajarshi Guha

November 20th, 2008 at 5:42 pm