Accessing High Content Data from R

Over the last few months I’ve been getting involved in the informatics & data mining aspects of high content screening. While I haven’t gotten into image analysis itself (there’s a ton of good code and tools already out there), I’ve been focusing on managing image data and meta-data and asking interesting questions of the voluminuous, […]

Similarity Matrices in Parallel

Today I got an email asking whether it’d be possible to speed up a fingerprint similarity matrix calculation in R. Now, pairwise similarity matrix calculations (whether they’re for molecules or sequences or anything else) are by definition quadratic in nature. So performing these calculations for large collections aren’t always feasible – in many cases, it’s […]

Updates to R Packages

I’ve uploaded a new version of fingerprint (v 3.4) which now supports feature fingerprints – fingerprints that are represented as variable length vectors of numbers or strings. An example would be circular fingerprints. Now, when reading fingerprints you have to indicate whether you’re loading binary fingerprints or not (via the binary argument in fp.read). A […]