So much to do, so little time

Trying to squeeze sense out of chemical data

Archive for July, 2009

Viewing Pathways


There are a variety of pathway databases, such as KEGG, Reactome and WikiPathways. One of my projects involves pathway analysis, and it’d be nice to be able to display pathways easily. In many cases one can simply display a pre-generated image of the pathway (as in KEGG), but in general interactivity is nice. However, interactivity would require me to somehow lay out the pathway dynamically, which is non-trivial.

In this vein, the WikiPathways collection is very nice, as they provide their pathways in the GPML format. This database (or rather wiki) allows users to contribute pathways, which are drawn manually using an editor. More importantly, the GPML output contains the coordinates of all the pathway elements and thus one can display the final pathway in the manner that the contributor intended.
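To make the point about stored coordinates concrete, here is a minimal sketch that pulls node labels and positions out of a GPML-like snippet using only the JDK's XML parser. The element and attribute names (DataNode, TextLabel, Graphics, CenterX, CenterY) are my assumptions about the GPML schema, and the class name GpmlCoordinates and the inline sample are made up for illustration, so check them against a real file before relying on this.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class GpmlCoordinates {

    /** Label and centre position of one pathway node (hypothetical helper type). */
    public static final class NodePosition {
        public final String label;
        public final double x;
        public final double y;

        NodePosition(String label, double x, double y) {
            this.label = label;
            this.x = x;
            this.y = y;
        }
    }

    /** Collects the position of every DataNode that has a Graphics child. */
    public static List<NodePosition> parse(String gpml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
                new ByteArrayInputStream(gpml.getBytes(StandardCharsets.UTF_8)));

        List<NodePosition> positions = new ArrayList<>();
        NodeList dataNodes = doc.getElementsByTagName("DataNode"); // assumed element name
        for (int i = 0; i < dataNodes.getLength(); i++) {
            Element dataNode = (Element) dataNodes.item(i);
            NodeList graphics = dataNode.getElementsByTagName("Graphics");
            if (graphics.getLength() == 0) {
                continue; // no layout information stored for this node
            }
            Element g = (Element) graphics.item(0);
            positions.add(new NodePosition(
                    dataNode.getAttribute("TextLabel"),
                    Double.parseDouble(g.getAttribute("CenterX")),
                    Double.parseDouble(g.getAttribute("CenterY"))));
        }
        return positions;
    }

    public static void main(String[] args) throws Exception {
        // Made-up two-node pathway, not taken from WikiPathways
        String gpml = "<Pathway Name=\"demo\">"
                + "<DataNode TextLabel=\"GeneA\"><Graphics CenterX=\"120.0\" CenterY=\"80.0\"/></DataNode>"
                + "<DataNode TextLabel=\"GeneB\"><Graphics CenterX=\"300.0\" CenterY=\"80.0\"/></DataNode>"
                + "</Pathway>";
        for (NodePosition p : parse(gpml)) {
            System.out.println(p.label + " at (" + p.x + ", " + p.y + ")");
        }
    }
}
```

For actual display, the PathVisio classes do considerably more than this (shapes, lines, labels), so the sketch only illustrates that the layout is already sitting in the file.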

Life becomes a little easier since the project also provides a standalone editor and viewer called PathVisio, written in Java. This tool contains code to parse GPML files and render the pathway using the stored layout. Since I wanted to integrate the pathway visualization into my own codebase, I took a look at how I could reuse the relevant PathVisio classes in my own code. As a first attempt it worked quite nicely, though there are a few rough edges, mainly due to my lack of knowledge of the PathVisio API.

Requirements

I obtained the PathVisio sources from their Subversion repository and built the program. This resulted in two main jar files: pathvisio.jar and pathvisio_core.jar. In addition, for the code to run, we require a few other jar files located in the lib directory:

  • bridgedb.jar
  • bridgedb-bio.jar
  • jdom.jar
  • resources.jar

With these jar files in my CLASSPATH, the following code will bring up a window allowing you to load a GPML file and then display the pathway. (Pathways in GPML format can be downloaded from WikiPathways.)

import org.pathvisio.model.*;
import org.pathvisio.preferences.PreferenceManager;
import org.pathvisio.view.VPathway;
import org.pathvisio.view.swing.VPathwaySwing;

import javax.swing.*;
import java.awt.*;
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import java.io.File;
import java.util.List;

public class GPMLTest extends JFrame {

    private Pathway pathway;
    private VPathway vPathway;
    private VPathwaySwing wrapper;
    private JPanel panel;
    private JScrollPane scrollPane = new JScrollPane();


    public GPMLTest() throws HeadlessException {
        PreferenceManager.init();
       
        panel = new JPanel(new BorderLayout());
        setDefaultCloseOperation(WindowConstants.EXIT_ON_CLOSE);
        JButton load = new JButton("Load");
        load.addActionListener(new ActionListener() {
            public void actionPerformed(ActionEvent actionEvent) {
                JFileChooser fc = new JFileChooser(".");
                int choice = fc.showOpenDialog(GPMLTest.this);
                if (choice == JFileChooser.APPROVE_OPTION) {
                    File f = fc.getSelectedFile();
                    try {
                        loadAndShow(f);
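                    } catch (ConverterException e) {
                        // Tell the user the import failed instead of silently ignoring it
                        JOptionPane.showMessageDialog(GPMLTest.this,
                                "Could not load " + f.getName() + ": " + e.getMessage());
                    }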
                }

            }
        });
        panel.add(load, BorderLayout.NORTH);

        setContentPane(panel);
    }

    public void loadAndShow(File f) throws ConverterException {

        pathway = new Pathway();
        PathwayImporter importer = new GpmlFormat();
        importer.doImport(f, pathway);
        List<PathwayElement> elems = pathway.getDataObjects();

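        // highlight a specific gene ("3456" is just an example ID);
        // the constant-first equals() is safe for elements that have no gene ID
        for (PathwayElement e : elems) {
            if ("3456".equals(e.getGeneID())) {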
                System.out.println("found it");
                e.setBold(true);
                e.setColor(Color.RED);
                e.setItalic(true);
                e.setFontName("Times");
            }
        }
        wrapper = new VPathwaySwing(scrollPane);
        vPathway = wrapper.createVPathway();
        vPathway.fromModel(pathway);
        vPathway.setEditMode(false);
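        panel.add(scrollPane, BorderLayout.CENTER);
        panel.setPreferredSize(scrollPane.getPreferredSize());
        // the frame is already visible at this point, so lay out the newly added component
        panel.revalidate();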
    }

    public static void main(String[] args) throws ConverterException {
        GPMLTest g = new GPMLTest();
        g.pack();
        g.setVisible(true);
    }
}

It’s a pretty simplistic example and doesn’t really allow much interactivity, but for my purposes it is pretty useful. One downside of this approach is that it requires manually drawn pathways, since only those contain layout coordinates. As far as I can tell, only WikiPathways provides these, so displaying pathways from other sources is still problematic. A screenshot of a displayed pathway is below:


Viewing a pathway via the PathVisio classes


Written by Rajarshi Guha

July 21st, 2009 at 11:36 pm

Posted in bioinformatics


Plate Well Series Plots in R


Plate well series plots are a common way to summarize well level data across multiple plates in a high throughput screen. An example can be seen in Zhang et al. As I’ve been working with RNAi screens, this visualization has been a useful way to summarize screening data and the various transformations on that data. It’s fundamentally a simple scatter plot, with some extra annotations. Though the x-axis is labeled with plate number, the values on the x-axis are actually well locations. The y-axis is usually the signal from that well.

Since I use it often, here’s some code that will generate such a plot. The input is a list of matrices or data.frames, where each matrix or data.frame represents a plate. In addition you need to specify a “plate map”: a character matrix indicating whether a well is a sample (“c”), positive control (“p”), negative control (“n”) or ignored (“x”). The code looks like this:

plate.well.series <- function(plate.list, plate.map, draw.sep = TRUE, color=TRUE, ...) {
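  # as.matrix() first, so that data.frame plates are flattened the same way as matrices
  signals <- unlist(lapply(plate.list, function(p) as.numeric(as.matrix(p))))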
  nwell <- prod(dim(plate.list[[1]]))
  nplate <- length(signals) / nwell

  cols <- 'black'
  if (color) {
    pcolor <- 'red'
    ncolor <- 'green'
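    # wells not marked n, p or c (e.g. ignored 'x' wells) keep the placeholder 0,
    # which R draws in the background color
    colormat <- matrix(0, nrow=nrow(plate.list[[1]]), ncol=ncol(plate.list[[1]]))
    colormat[which(plate.map == 'n')] <- ncolor
    colormat[which(plate.map == 'p')] <- pcolor
    colormat[which(plate.map == 'c')] <- 'black'
    # repeat the per-well colors once per plate so they line up with signals
    cols <- rep(as.character(colormat), nplate)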
  }
  plot(signals, xaxt='n', ylab='Signal', xlab='Plate Number', col = cols, ...)
  if (color) legend('topleft', bty='n', fill=c(ncolor, pcolor, 'black'),
                    legend=c('Negative', 'Positive', 'Sample'),
                    y.intersp=1.2)
  if (draw.sep) {
    for (i in seq(1, length(signals)+nwell, by=nwell)) abline(v=i, col='grey')
  }
  axis(side=1, at = seq(1, length(signals), by=nwell) + (nwell/2), labels=1:nplate)
}
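To spell out the x-axis arithmetic the function relies on, here is a small standalone sketch, written in Java to match the earlier example on this page. The class and method names are made up for illustration. It assumes wells are numbered column by column, which is how R flattens a matrix, and the tick position mirrors the seq(1, length(signals), by=nwell) + (nwell/2) expression in the axis() call.

```java
public class PlateWellIndex {

    /** 1-based position of a well within a plate, counting down each column in turn. */
    public static int wellIndex(int row, int col, int nrow) {
        return (col - 1) * nrow + row;
    }

    /** 1-based x-axis position of a well on a given plate. */
    public static int xPosition(int plate, int well, int nwell) {
        return (plate - 1) * nwell + well;
    }

    /** Plate number that an x-axis position belongs to. */
    public static int plateOf(int x, int nwell) {
        return (x - 1) / nwell + 1;
    }

    /** Where the label for a plate is drawn: first well of the plate plus half a plate. */
    public static double tickPosition(int plate, int nwell) {
        return (plate - 1) * nwell + 1 + nwell / 2.0;
    }

    public static void main(String[] args) {
        int nrow = 8;
        int ncol = 12;
        int nwell = nrow * ncol; // a 96-well plate

        int well = wellIndex(2, 3, nrow);   // row 2, column 3
        int x = xPosition(2, well, nwell);  // the same well on the second plate
        System.out.println("well index: " + well);
        System.out.println("x position: " + x);
        System.out.println("plate of x: " + plateOf(x, nwell));
        System.out.println("tick for plate 2: " + tickPosition(2, nwell));
    }
}
```

So although the axis is labeled with plate numbers, each plate occupies a block of nwell consecutive x values, and the label is placed in the middle of that block.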

An example of such a plot is below:



Plate well series plot


Another example comparing normalized data from three runs of an RNAi screen investigating drug sensitization (also highlighting the fact that plate 7 in the 5nm run was messed up):


Comparing runs with plate well series plots



Written by Rajarshi Guha

July 14th, 2009 at 2:01 am