Papers About Systems You Can’t Use or Buy

Browsing the latest articles in JCIM, I came across one by Sander et al. that discussed the design of a drug discovery informatics system employed at Actelion. The main claim to fame of the work appears to be the fact that it was built from scratch and so is vendor independent.

While somewhat interesting, one question jumped out at me: what is the value of this paper? The system is specific to this one company so it’s not like I can access the code or the workflows. I can’t even buy this software. In this case, they do provide public access to some of their tools, so it’s not a totally “opaque” paper. But there are other examples where one cannot really even try out the tools described in the paper.

While I see the value of the paper from the authors’ point of view (spreading the word, publications being “currency”, etc.), such papers have always felt a little pointless to me, as a reader. What can I do after reading this paper? Is there anything I can follow up on?

12 thoughts on “Papers About Systems You Can’t Use or Buy”

Abhishek Tiwari says:

February 1, 2009 at 4:15 am

I guess what you are saying is correct, but the tools offered as open source also don’t always work properly. I still remember one paper published in JCIM, “Harvesting Chemical Information from the Internet Using a Distributed Approach”: when I visited the live web server, it did not work at all the way the paper described (at the time the paper went online, there were only snapshots on the web server, even though the paper stated that the server was live). I was surprised that it got accepted (maybe because it carried big names such as Andrew Bender). And in most cases the published software is not fully functional, or it will be down after 6 months.

Rich Apodaca says:

February 1, 2009 at 7:11 am

Rajarshi, I’ve asked the same questions myself at various times:

http://depth-first.com/articles/2006/08/23/readily-available-without-infringements-or-restrictions

The word “advertisement” comes to mind to describe the role (and value) of such a paper.

Deepak says:

February 1, 2009 at 8:07 am

How something like this can get published is the part I don’t get. Isn’t this what whitepapers and conferences are for?

Egon Willighagen says:

February 1, 2009 at 8:42 am

Write one of the authors.

baoilleach says:

February 1, 2009 at 10:12 am

I think the initial precedent was set with a paper about J&J’s in-house system last year or so. I wonder whether they even reference that paper, which would indicate that such papers add to the collective body of knowledge.

Rajarshi Guha says:

February 1, 2009 at 3:08 pm

@Abhishek – yes, problems do exist even when papers talk about open source software. But at least an effort is made to provide access to the software that is talked about. Long-term maintenance of such software is of course a problem.

@Noel, I agree, it probably did set a precedent (and the J&J paper was cited)

@Rich, advertisement is probably a good term – but advertisements imply that something is being ‘sold’. In this case, even that is not happening!

Rich Apodaca says:

February 1, 2009 at 4:14 pm

@Rajarshi, IMO every scientific publication is selling something. Even at the lowest level, this could be the idea that the authors are competent, innovative, productive, etc., or that the organization they work for is a good environment with smart people. In most cases, the approach taken in the paper is being sold as a worthwhile way of solving a particular problem. The ‘buyer’ could be a prospective employer, a prospective employee, a funding agency, or even a possible collaborator.

There’s nothing wrong with any of this. But when nobody can independently verify your claims, you move from doing science to doing advertising.

Scientists work in a reputation economy and publications are one form of currency:

http://depth-first.com/articles/2007/05/14/scientific-publication-and-the-seven-deadly-sins

In my view, scientific publications are becoming less effective in this role for some of the same reasons that traditional advertising is becoming less effective, multi-million dollar Super Bowl ads notwithstanding ;-).

Deepak says:

February 1, 2009 at 8:30 pm

If someone has a reference architecture, which can be implemented by anyone based on a paper, then it’s fine to publish, especially if there is a compelling advantage to the architecture.

I understand the reputation aspects, and all it does is make it more evident that we need alternatives for measuring reputation and assessing quality when it comes to software.

Michael R. Bernstein says:

February 1, 2009 at 10:48 pm

Perhaps this should simply be called ‘self-promotion’, rather than ‘advertising’.

Rajarshi Guha says:

February 2, 2009 at 4:38 am

@Rich, good points!

Rajarshi Guha says:

February 4, 2009 at 4:50 pm

[This comment is actually from Thomas Sander and I manually copied it from the old blog]

The paper is a follow-up to a poster displayed at ICCS2008, which caused great interest among cheminformaticians at larger corporations who seek to convince upper management to support a higher degree of internal development. The main motivation for the paper was to prove by experience that this approach can work and to deliver an argument for those who are willing to take the risk and go in a similar direction. Only if many drug companies develop their own software, ideally on agreed open standards, will the commercial software industry be forced to improve its products and to support common standards. It is a shame that products from different vendors hardly interoperate.

I agree that the use of the paper is limited if your job is not related to developing chemistry-related software. I also support the idea of open source and open tools. We are in a dialog with our management with the intent to release substantially more stuff (source code and/or tools).

The J&J paper was actually cited.

Tobias Kind says:

February 8, 2009 at 11:51 pm

Hi Rajarshi,
I think the cool thing is that Thomas Sander actually replied here.
After reading his post the whole message also got another taste.

I would usually agree with your post from a comp chem point of view.
If somebody reports on a system in a blurred manner, showing only results
instead of the methods, it’s pretty useless. It’s impossible to test
whether the system really works that way or whether it’s just wishful thinking.
But that also holds true for all kinds of conference talks or posters.

The first problem I had was that from home I could not read about
Advanced Biological and Chemical Discovery (ABCD) or OSIRIS
an Entirely in-House Developed Drug Discovery Informatics System,
because it’s ACS closed source.

So if the authors don’t want to pony up 1000 Dollars (or $3000) for the
ACS Open Access option, it’s probably not worth it. Or let’s say
it’s only important for a very small part of the scientific community
or comp chem researchers and of no general interest.

On the other hand I have to agree with Thomas that there is certainly
pressure from outside and inside to report such a development. It’s good
for the reputation and also gives you scientific kudos. The OSIRIS system
itself was built over 10 years according to the Actelion company report.
http://www1.actelion.com/documents/AR_Actelion_1999.pdf

If you look at actelion.com there are only three OSIRIS hits; so it’s of
minor importance to the whole company or … wait maybe it is, maybe it’s
so good it must be kept secret and the whole publication suddenly
is an enlightenment?! (I could not read it because of closed access,
so I just assume it is a satori) (google for OSIRIS site:actelion.com)

I would say this is not about company bashing, although two of your links
point to publications from companies, it is also not about open source
or open data (the highest level of scientific reporting), but it is about
proper scientific reporting with very detailed methods sections
(many journals do not appreciate that). So here I agree again with you and
would come to the conclusion: A non-opaque system or program should be

1) open source OR
2) commercially available OR
3) contain a description detailed enough to fully reproduce the system.

And we certainly should try to understand scientists in companies who
are operating under the Sword of Damocles, namely their legal department,
and are actually not allowed to disclose any valuable information.

Kind regards
Tobias Kind (FiehnLab)
