Recently some of our software projects have begun using SBT, with builds defined in Build.scala (rather than build.sbt). One of the challenges (apart from learning Scala to build a Java project!) was removing some unneeded JAR files from a WAR build. I use xsbt-web-plugin to provide support for WAR packaging. It turns out that the […]
A Model Building IDE?
Recently I came across a NIPS 2015 paper from Vartak et al. that describes a system (APIs + visual frontend) to support the iterative model building process. The problem they are addressing is a common one in most machine learning settings – building multiple models (of different types) using various features and identifying one or more optimal models to […]
Exploring ChEMBL Targets with Neo4j
As part of an internal project I’ve recently started working with Neo4j for representing and querying relationships between entities (targets, compounds, etc.). What has really caught my attention is the Cypher graph query language – by allowing you to construct queries using graph notation, many tasks that would be complex or tedious in a traditional […]
rinchi – An R package to generate InChI’s and InChI Keys
While trying to update rcdk on CRAN it was pointed out to me that usage of the library resulted in modifications to the user’s home directory. Specifically, this occurred when generating InChIs. The CDK makes use of jni-inchi, which in turn depends on JNATI, which enables Java code to work with native libraries in a platform […]
Fingerprint Similarity Searches in MongoDB
A few of my recent projects have involved the use of MongoDB, primarily for the ease afforded by a schemaless environment. Some time back I had investigated the use of MongoDB to store chemical structure data, though those efforts did not actually query structures per se; instead they queried for precomputed numeric or text properties. So […]