BAZOO

So much to do, so little time

Trying to squeeze sense out of chemical data

R and Oracle

with 2 comments

It’s been a while since my last post, but I’m getting up to speed at work. It’s been less than a month, but there’s already a ton of cool stuff going on. One of the first things I’ve been getting to grips with is the data infrastructure at the NCGC, which is based around Oracle. One of my main projects is handling informatics for RNAi screening. As the data comes out of the pilots, they get loaded into the Oracle infrastructure.

Being an R aficionado, I’m doing the initial, exploratory analyses (normalization, hit selection, annotation etc.) using R. Thus I needed to have a way to access an Oracle DB from R. This is supported by the ROracle package. But it turns out that the installation is a little non-obvious and I figured I’d describe the procedure (on OS X 10.5) for posterity.

The first thing to do is to get Oracle from here. Note that this is the full Oracle installation and while it comes with 32 bit and 64 bit libraries, some of the binaries that are required during the R install are 64 bit only. After getting the zip file, extract the installation files and run the installation script. Since I just needed the libraries (as opposed to running an actual Oracle DB), I just went with the defaults and opted out of the the actual DB creation step. After installation is done, it’s useful to set the following environment variables:

1
2
3
export ORACLE_HOME=/Users/foo/oracle
export LD_LIBRARY_PATH=$ORACLE_HOME/lib:$LD_LIBRARY_PATH
export DYLD_LIBRARY_PATH=$ORACLE_HOME/lib:$DYD_LIBRARY_PATH

With Oracle installed, execute the following

1
$ORACLE_HOME/bin/genclntst

This will link a variety of object files into a library, which is required by the R package, but doesn’t come in the default Oracle installation.

The next thing is to get a 64 bit version of R from here and simply install as usual. Note that this will require you to reinstall all your packages, if you had a previous version of R around. Specifically, before installing ROracle, make sure to install the DBI package.

After installing R, get the ROracle 0.5-9 source package. Since there’s no binary build for OS X, we have to compile it ourselves. Before building, I like to CHECK the package to make sure that all is OK. Thus, the sequence of commands is

1
2
3
4
tar -zxvf ROracle_0.5-9.tar.gz
R --arch x86_64 CMD CHECK ROracle
R --arch x86_64 CMD BUILD ROracle
R --arch x86_64 CMD INSTALL -l  /Users/guhar/Library/R/2.9/library ROracle_0.5-9.tar.gz

When I ran the CHECK, I did get some warnings, but it seems to be safe to ignore them.

At this stage, the ROracle package should be installed and you can start R and load the package. Remember to start R with the –arch x86_64 argument, since the ROracle package will have been built for the 64 bit version of R.

Written by Rajarshi Guha

June 17th, 2009 at 3:19 am

Posted in software

Tagged with , ,

2 Responses to 'R and Oracle'

Subscribe to comments with RSS or TrackBack to 'R and Oracle'.

  1. I didn’t realise that you can download Oracle for free (within the license terms). That’s useful to know

    baoilleach

    17 Jun 09 at 10:17 am

  2. I assume there are limits on the DB functionality – but I’m just using the libs

    Rajarshi Guha

    18 Jun 09 at 1:29 am

Leave a Reply