Call for Papers (ACS Fall Meeting 2018)

The More the Merrier: Combine Drugs Together

256th ACS National Meeting
Boston, August 19-23, 2018
CINF Division

Dear Colleagues, we are organizing a symposium at the Fall ACS meeting in Boston focusing on computational, experimental and hybrid approaches to investigate the effect of drug combinations on biological systems and to characterize phenomena such as synergy and antagonism. The aim of this symposium is to explore fundamental principles that underlie effective combination treatments and synergistic drug behavior across different diseases; to reveal the possible side effects of combination treatment; to investigate possible mechanisms of drug synergy, additivity and antagonism; and to develop new computational approaches for the prediction of these effects.

We welcome contributions related to the identification of promising or novel drug combinations used to treat rare/neglected diseases as well as socially significant diseases (cancer, HIV, etc.), approaches to characterizing drug synergy scores, and novel statistical and machine learning methods to predict and prioritize drug combinations as well as to identify side effects caused by drug mixtures.

The deadline for abstract submissions is March 20, 2018. All abstracts should be submitted via MAPS. If you have any questions, feel free to contact Alexey or me.

Alexey Zakharov

Rajarshi Guha

Call for Papers (ACS Fall Meeting 2018)

Move Away from the Lamp Post & Find Druggable Targets

256th ACS National Meeting
Boston, August 19-23, 2018
CINF Division

Dear colleagues, we are organizing a symposium focusing on methodologies and case studies that have explored under- or unstudied targets, with the goal of elucidating their function or role in the context of human disease. Several reports have highlighted the focus of current biomedical research on a relatively small set of protein targets. For example, Edwards et al (Nature, 2011) reported that 75% of research is focused on studying only 10% of the known mammalian proteins.

Clearly, there may be many unexplored opportunities among the set of under- and unstudied targets, the so-called “dark targets”. Our own report (Nature Rev Drug Discov 2018) suggests that as many as 38% of human proteins are significantly understudied, and that over 9000 human proteins are currently not associated with NIH funding.

At this critical juncture in human health research, we believe it is timely to discuss approaches to study the “dark genome”. We invite submissions that address all aspects of illuminating dark targets, with a particular focus on computational approaches that involve novel experimental methods. Topics of interest include, but are not limited to:

  • Target prioritization methods
  • Integrative approaches that combine data types, using well-studied targets to shed light on unstudied targets
  • Case studies that have elucidated function or disease relevance for unstudied targets
  • Novel characterizations of targets that go beyond traditional structure-based approaches
  • Target-centric databases that highlight dark targets

The deadline for abstract submissions is March 20, 2018. All abstracts should be submitted via MAPS. If you have any questions, feel free to contact Tudor or me.

Rajarshi Guha             Tudor Oprea
NCATS, NIH                University of New Mexico  

CSA Trust Grants 2018

The Chemical Structure Association (CSA) Trust Grant Program is now accepting applications from young researchers with an interest in systems and methods used to store, process and retrieve information about chemical structures, reactions and compounds. You can view the details here and can get in touch with Bonnie Lawlor if you have any questions.

Visualizing Scaffold Trends

Recently Barbara Zdrazil and I published an article that explored the idea of tracking the attention being paid to a scaffold in the medicinal chemistry literature (as represented by ChEMBL). The gist of the idea is that scaffolds that are more frequently enumerated or tested in more assays (or even published in increasingly high IF journals) are receiving more attention than ones that are less frequently enumerated and so on. By fitting robust regression models to per-year scaffold-aggregated properties we identified significant vs non-significant trends.
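To give a flavor of what a robust trend fit on per-year data looks like, here is a minimal sketch using the Theil-Sen estimator (the median of all pairwise slopes). This is an illustrative choice on my part and not necessarily the estimator used in the paper; the function name theil_sen and the counts are made up, and the sketch does not assess whether a trend is significant.

```python
from itertools import combinations
from statistics import median


def theil_sen(years, values):
    """Return (slope, intercept) of the Theil-Sen line through the points."""
    # Slope between every pair of points with distinct x values.
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(zip(years, values), 2)
        if x2 != x1
    ]
    if not slopes:
        raise ValueError("need at least two distinct years")
    slope = median(slopes)
    # Intercept as the median residual offset given that slope.
    intercept = median(v - slope * x for x, v in zip(years, values))
    return slope, intercept


# Hypothetical per-year compound counts for one scaffold; 2005 is an outlier.
years = [2001, 2002, 2003, 2004, 2005, 2006, 2007]
counts = [2, 4, 6, 8, 100, 12, 14]
slope, intercept = theil_sen(years, counts)
```

Because the estimate is built from medians, the single outlying year barely moves the fitted slope, which is the behavior you want when a scaffold has one unusually prolific paper.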

The idea originated from a blog post (archived version) by Jonathan Baell, where he traced the publication history of the bis-chalcone scaffold, starting from Stoll et al, Biochemistry, 2001 and ending up at Anchoori et al, Cancer Cell, 2013. The point was that a PAINS-containing scaffold (and thus of possibly dubious biological activity) received increasing attention, resulting in a (relatively) high profile journal publication. This led to the question of whether we could systematically capture such attention trends for other scaffolds, and thus this paper.

While the article presents a comprehensive analysis, it is limited to a fixed set of scaffolds (defined using the Bemis-Murcko scheme) and a few properties we selected because we thought they would be proxies of attention. What if we could consider any scaffold? And visualize the evolution of an arbitrary scaffold-aggregated property over time? This would be something like Google Trends, except that instead of text search terms, you’d be comparing scaffolds.

So I put together the Scaffold Trend Explorer, which allows you to view trends for a number of properties, for arbitrary substructures. Obviously, queries for very common substructures will probably make the tool unresponsive (so I disallow queries such as benzene and straight chain alkanes with fewer than 8 carbons). I’ve provided a number of properties ranging from the count of enumerated compounds to drug-likeness. You can draw a structure or provide the SMILES directly. In addition there is a set of bookmarks for well known scaffolds (taken from Welsch et al, 2010). You can compare multiple (up to 9) scaffolds at a time, and compute moving window average curves, which smooth out the year-to-year variation.
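For readers unfamiliar with moving window averages, here is a minimal sketch of one way to compute them. It uses a trailing window and made-up per-year counts; the function name moving_average and the window handling are assumptions for illustration, not the tool’s actual code.

```python
def moving_average(values, window=3):
    """Trailing moving average.

    Early entries, where fewer than `window` values exist, are averaged
    over whatever values are available so far.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    for i in range(len(values)):
        # Take up to `window` values ending at position i.
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Hypothetical per-year compound counts for a single scaffold.
counts = [3, 9, 6, 12, 3]
smoothed = moving_average(counts, window=3)
```

A larger window gives a smoother curve at the cost of hiding short-lived bursts of activity, so it is worth comparing the smoothed curve against the raw per-year values.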

This tool should let users play around with the idea of scaffold trends. Currently, it’s a very simple visualization tool: you can download the per-year data, but that’s it. Unlike the paper, I don’t fit regression lines, though I hope to implement this in the future. There are a number of enhancements planned, including access to the underlying publications for a scaffold in a given year, simple analytics (such as differential analysis) on trends and so on. A natural next step is to go beyond the medchem literature and consider patents as well (say, via SureChEMBL). And of course, feature requests are also welcome.

Byproducts of Byproducts & Biomedical Data

Recently I came across a fantastic article that explored how far ahead Google Maps is compared to Apple Maps, focusing in particular on Areas of Interest (AOI), and how this is achieved with Google’s competencies in massive data and massive computation, resulting in a moat. The conclusion is that

Google has gathered so much data, in so many areas, that it’s now crunching it together and creating features that Apple can’t make—surrounding Google Maps with a moat of time

But the key point that caught my eye was the idea that Google Maps’ sophistication is a byproduct of byproducts. As pointed out, AOIs are a byproduct of buildings (a byproduct of satellite imagery) and places (a byproduct of Street View), and thus AOIs are byproducts of byproducts.

This observation led me to think about how it could apply in a biomedical setting. In other words, given disparate biomedical data types, what new data types can be generated from them, and using those derived data types, what further data types could be derived in turn? (“data type” may not be the right term to use here, and “entity” may be a more suitable one).

One interpretation of this idea is integrative resources, where disparate (but related) data types are connected to each other in a single store, allowing one to (hopefully) make non-obvious links between entities that explain a higher level system or phenomenon. Recent examples include Pharos and MARRVEL. However, these don’t really fit the concept of byproducts of byproducts, as neither of these resources actually generates new data from pre-existing data, at least by itself.

So are there better examples? One that comes to mind is the protein folding problem. While one could fold proteins de novo, it’s a little easier if constraints are provided. Thus we have constraints derived from NMR and amino acid (AA) coevolution. As a result we can view predicted protein structures as a byproduct of NMR constraints (a byproduct of structure determination) and a byproduct of AA coevolution data (a byproduct of gene sequencing). An example of this is Tang et al, 2015.

Another one that comes to mind is inferred gene (or signalling, metabolic etc) networks, which go from, say, gene expression data to a network of genes. But going by the Google Maps analogy above, the gene network is the first level byproduct. One could imagine a computation that processes a set of (inferred) gene networks to generate higher level structures (say, spatial localization or differentiation). But this is a bit fuzzier than the protein structure problem.

Of course, this starts to break down when we take into account errors in the individual components. Thus sequencing errors can introduce errors in the coevolution data, which can get carried over into the protein structure. This isn’t inevitable, but it does require validation and possibly curation. And in many cases, large, correlated datasets can allow one to account for errors (or work around them).

This is mainly speculation on my part, but it seems interesting to try and think of how one can combine disparate data types to generate new ones, and repeat this process to come up with something new that was not available (or not obvious) from the initial data types.