Byproducts of Byproducts & Biomedical Data

Recently I came across a fantastic article that explored how far ahead Google Maps is compared to Apple Maps, focusing in particular on Areas of Interest (AOI), and how this is achieved with Googles competencies in massive data and massive computation, resulting in a moat. The conclusion is that Google has gathered so much data, […]

Database Licensing & Sustainability

Update (07/28/16): DrugBank/OMx have updated the licensing conditions for DrugBank data in response to concerns raised earlier by various people and groups. See here for a detailed response from Craig Knox A few days back I came across, via my Twitter network, the news that DrugBank had changed their licensing policy to CC BY-SA-NC 4.0. As […]

Cryptography & Chemical Structure Search

Encryption of chemical information has not been a very common topic in cheminformatics. There was an ACS symposium in 2005 (summary) that had a number of presentations on the topic of “safe exchange” of chemical information – i.e., exchanging information on chemical structures without sharing the structures themselves. The common thread running through many presentations was to […]

Exploring ChEMBL Targets with Neo4j

As part of an internal project I’ve recently started working with Neo4j for representing and querying relationships between entities (targets, compounds, etc.). What has really caught my attention is the Cypher graph query language – by allowing you to construct queries using graph notation, many tasks that would be complex or tedious in a traditonal […]

Substructure Searches – High Speed, Large Scale

My NCTT colleague, Trung Nguyen, recently announced a prototype chemical substructure search system based on fingerprint pre-screening and an efficient in-memory indexing scheme. I won’t go into the detail of the underlying pre-screen and indexing methodology (though the sources are available here). He’s provided a web interface allowing one to draw in substructure queries or […]

So much to do, so little time

Trying to squeeze sense out of chemical data

Byproducts of Byproducts & Biomedical Data

Database Licensing & Sustainability

Cryptography & Chemical Structure Search

Exploring ChEMBL Targets with Neo4j

Substructure Searches – High Speed, Large Scale