Archive for the ‘polypharmacology’ tag
I came across Takigawa et al where they address polypharmacology by investigating drug-target pairs. Their approach is to simultaneously identify substructures from the ligand and subsequences from the target and combine this information to suggest drug-target pairs that represent some form of polypharmacology. More specifically their hypothesis is that “polypharmacological principles” are embedded in a special set of paired fragments (substructures on the ligand side, subsequence on the target side). When you think about it, this is a more generalized (abstract?) version of a pharmacophore that makes the role of the target explicit.
Their approach originates from two assumptions
These results suggest that targets of promiscuous drugs can be dissimilar, implying that only a small part of each target is related with the principle of polypharmacology.
Similarly, recent research shows that smaller drugs in molecular weight are likely to be more promiscuous, suggesting that small fragments in each ligand would be a key to drug promiscuity
These lead to their hypothesis
… that paired fragments significantly shared in drug-target pairs could be crucial factors behind polypharmacology.
Based on this idea they first apply a frequent itemset algorithm to identify pairs of subgraph (SG) and subsequences (SS), that occur frequently (more than 5%) in the drug-target pairs. After identifying about 10,000 such SS-SG pairs, they define a sparse fingerprint, where each bit corresponds to one such pair. Using these fingerprints they then cluster the drug-target pairs, ending up with a selection of clusters. They then propose that individual clusters represent distinct polypharmacologies.
Our significant substructure pairs partitioned drug-target pairs covering most of approved drugs into clusters, which were clearly separated from each other, implying that each cluster corresponds to a unique polypharmacology type
While the underlying algorithms to obtain their results are nice, a lot of things weren’t clear.
Foremost, given the above quote, it’s not exactly clear from the paper what is meant by “unique polypharmacology type“? Given that a cluster will consist of multiple drugs and multiple targets, it is not apparent from the text that a cluster highlights either promiscuity of compounds or ligand preferences for a small number of targets. While I think this is a major issue there are some other lesser problems
- I get the impression that they consider promiscuity and polypharmacology as equivalent concepts. While there is a degree of similarity, I’d regard polypharmacology more as a rationally, controlled type of promiscuity
- Most fragments they highlight in Figure 2 are relatively trivial paths. Certainly, reactive groups can lead to promiscuity; none of the subgraphs list exhibit reactive functionality and their application of the frequent itemset method, using a support of 5% could easily filter these out
- Given they consider arbitrary subsequences of the target, the resulting associations could be meaningless. Again, it’d be interesting to note, in cases where crystal structure is available, how many of the subsequences, in the list of significant SS-SG pairs, lie in or around the binding site. A related question would be, of the SG-SS pairs associated with a cluster, how are individual subsequences distributed? Few unique subsequences could point towards a common binding site or active domain.
- Related to the previous point, it’d be interesting to see in how many of the SG-SS paired fragments, the members correspond to actual interacting motifs (again based on crystal structure data).
- One could argue that just using string subsequences to characterize the target misses information on important ligand-target interactions.
And while they may be the first to consider an analysis of drug-target pairs specifically, the idea of considering ligand and target simultaneously is not new. For example, the SiFT approach is quite similar and was described in 2004.
So, even though the paper seems pretty fuzzy on the supposed polypharmacology that they identify, it is overall an interesting paper (and one of the more interesting cheminformatics applications of frequent itemset methods).
Recently there have been two papers asking whether cheminformatics or virtual screening in general, have really helped drug discovery, in terms of lead discovery.
The first paper from Muchmore et al focuses on the utility of various cheminformatics tools in drug discovery. Their report is retrospective in nature where they note that while much research has been done in developing descriptors and predictors of various molecular properties (solubility, bioavilability etc), it does not seem that this has contributed to increased productivity. They suggest three possible reasons for this
- not enough time to judge the contributions of cheminformatics methods
- methods not being used properly
- methods themselves not being sufficiently accurate.
They then go on consider how these reasons may apply to various cheminformatics methods and tools that are accessible to medicinal chemists. Examples range from molecular weight and ligand efficiency to solubility, similarity and bioisosteres. They use a 3-class scheme – known knowns, unknown knowns and unknown unknowns corresponding to methods whose underlying principles are whose results can be robustly interpreted, methods for properties that we don’t know how to realistically evaluate (but which we may still do so – such as solubility) and methods for which we can get a numerical answer but whose meaning or validity is doubtful. Thus for example, ligand binding energy calculations are placed in the “unknown unknown” category and similarity searches are placed in the “known unknown” category.
It’s definitely an interesting read, summarizing the utility of various cheminformatics techniques. It raises a number of interesting questions and issues. For example, a recurring issue is that many cheminformatics methods are ultimately subjective, even though the underlying implementation may be quantitative – “what is a good Tanimoto cutoff?” in similarity calculations would be a classic example. The downside of the article is that it does appear at times to be specific to practices at Abbott.
The second paper is by Schneider and is more prospective and general in nature and discusses some reasons as to why virtual screening has not played a more direct role in drug discovery projects. One of the key points that Schneider makes is that
appropriate “description of objects to suit the problem” might be the key to future success
In other words, it may be that molecular descriptors, while useful surrogates of physical reality, are probably not sufficient to get us to the next level. Schneider even states that “… the development of advanced virtual screening methods … is currently stagnated“. This statement is true in many ways, especially if one considers the statistical modeling side of virtual screening (i.e., QSAR). Many recent papers discuss slight modifications to well known algorithms that invariably lead to an incremental improvement in accuracy. Schneider suggests that improvements in our understanding of the physics of the drug discovery problem – protein folding, allosteric effects, dynamics of complex formation, etc – rather than continuing to focus on static properties (logP etc) will lead to advances. Another very valid point is that future developments will need to move away from the prediction or modeling of “… one to one interactions between a ligand and a single target …” and instead will need to consider “… many to many relationships …“. In other words, advances in virtual screen will address (or need to address) the ligand non-specificity or promiscuity. Thus activity profiles, network models and polyparmacology will all be vital aspects of successful virtual screening.
I really like Schneiders views on the future of virtual screening, even though they are rather general. I agree with his views on the stagnation of machine learning (QSAR) methods but at the same time I’m reminded of a paper by Halevy et al, which highlights the fact that
simple models and a lot of data trump more elaborate models based on less data
Now, they are talking about natural language processing using trillion-word corpora. Not exactly the situation we face in drug discovery! But, it does look like we’re slowly going in the direction of generating biological datasets of large size and of multiple types. A recent NIH RFP proposes this type of development. Coupled with well established machine learning methods, this could be lead to some very interesting developments. (Of course even ‘simple’ properties such as solubility could benefit from a ‘large data’ scenario as noted by Muchmore et al).
Overall, two interesting papers looking at the state of the field from different views.
The Curious Wavefunction has a nice post on the issue of selective and non-selective kinase inhibitors. An interesting commentary, especially in the light of the recent paper on network polypharmcology. While there have been a number of papers on polypharmcology and the idea itself is very attractive, it has seemed to me that for this approach to succeed we need very detailed information on the targets and systems involved in these networks. Indeed, a current project of mine is currently hitting this problem. As Ashutosh notes,
… in the first place we don’t even know what specific subset of kinases to hit for treating a particular disease. First comes target validation, then modulation.