Earlier today, Emily Wixson posted a question on the CHMINF-L list asking … if there is any way to count the number of authors of papers with specific keywords in the title by year over a decade … Since I had some code compiling and databases loading, I took a quick stab, using Python and […]
Pig and Cheminformatics
Pig is a platform for analyzing large datasets. At its core is a high-level language (called Pig Latin) that is focused on specifying a series of data transformations. Scripts written in Pig Latin are executed by the Pig infrastructure in either local or map/reduce mode (the latter making use of Hadoop). Previously I had […]
Back from Boston
Another ACS National Meeting, this time in Boston, is over and I’m finally home. I gave two talks, one on issues surrounding the data deluge in modern drug discovery and another on structure-activity landscapes. There were a number of great sessions in CINF, COMP and MEDI, with some thought-provoking talks. I especially liked a talk given […]
SALI in Bulk
Some time back, John Van Drie and I developed the Structure Activity Landscape Index (SALI), which is a way to quantify activity cliffs – pairs of compounds that are structurally very similar but have significantly different activities. In preparation for a talk on SALI at the Boston ACS, I was looking for SAR datasets that […]
Job Openings at the NCGC
I’ve been at the NCGC for a little more than a year and I can say that it’s a great place to work – smart people, cutting-edge projects in chemical genomics and chemical biology, opportunities to be involved in all aspects of HTS projects and fresh data (lots of it). Now there are opportunities for […]