A few days ago I was pointed to a new cheminformatics toolkit called Indigo, written in C++ with source code available under the GPLv3 license. Rich has previously commented on this. While it’s an initial release, it has a number of interesting components such as an Oracle cartridge, 2D depiction and scaffold detection and R-group […]
Drug Discovery Trends In and From the Literature
I came across a recent paper by Agarwal and Searls which describes a detailed bibliometric analysis of the scientific literature to identify and characterize specific research topics that appear to be drivers for drug discovery – i.e., research areas/topics in life sciences that exhibit significant activity and thus might be fruitful for drug discovery efforts. […]
A Custom Palette for Heatmaps
Heatmaps are a common way to visualize matrices and R provides a variety of methods to generate these diagrams. One of the key features of a heatmap is the color scheme employed. By default the image method uses heat.colors which ranges from red (lowest values) to white (highest values). Other palettes include rainbow and topographical. […]
Correlating Continuous and Categorical Variables
At work, a colleague gave an interesting presentation on characterizing associations between continuous and categorical variables. I expect to face this issue in some upcoming work, so I did a little reading and made some notes for myself. Given a continuous variable Y and a categorical variable G, is the distribution of […]
Oracle Notes
Some handy settings when running a query from the command line via sqlplus:
set echo off
set heading on
set linesize 1024
set pagesize 0
set tab on
set trims on
set wrap off
-- might want to set column formats here
-- e.g.:
column foo format A10
spool stats -- dump results to stats.lst
[…]