Finally, all the paperwork is in place for me to visit the EBI in May to run a hands-on workshop on doing cheminformatics in R. This is part of a two-day program. On May 17th I will be discussing the use of the rcdk and rpubchem packages, considering a variety of use cases […]
ChEMBL in RDF and Other Musings
Earlier today, Egon announced the release of an RDF version of ChEMBL, hosted at Uppsala. A nice feature of this setup is that one can play around with the data via SPARQL queries as well as explore the classes and properties that the Uppsala folks have implemented. Having fiddled with SPARQL on and off, it […]
Molecules & MongoDB – Numbers and Thoughts
In my previous post I mentioned that key/value or non-relational data stores could be useful in certain cheminformatics applications. I had started playing around with MongoDB and, following Rich’s example, I thought I’d put it through its paces using data from PubChem. Installing MongoDB was pretty trivial. I downloaded the 64-bit version for […]
Cheminformatics and Non-Relational Datastores
Over the past year or so I’ve been seeing a variety of non-relational data stores coming up. They also go by terms such as document databases or key/value stores (or even NoSQL databases). These systems are alternatives to traditional RDBMSs in that they do not require an explicit schema defined a priori. While they do not […]
When is a Bad Plate Bad?
When running a high-throughput screen, one usually deals with hundreds or even thousands of plates. Due to the vagaries of experiments, some plates will not be very good. That is, the data will be of poor quality due to a variety of reasons. Usually we can evaluate various statistical quality metrics to assess which plates […]