Monday, March 28, 2016

Silly Molecules

Yes, you read that right! I recently came across an interesting page that compiles a list of molecules (small and macro) with "unusual, ridiculous or downright silly" names (OK, sometimes flamboyant).

Well, believe it or not, these are real names of real molecules. Better still, the awesome people at the School of Chemistry (Bristol University) cite their sources too. Go through the page to find out in detail how these "silly" molecules (more than 200 of them) actually got their names.

I wanted to see what the core structures of these molecules look like. So I did some scraping to extract the molecules as MOL files and generated Bemis-Murcko scaffolds. I picked a few molecules at random, and here is how their cores look.
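For readers who have not met Bemis-Murcko scaffolds before, the idea can be sketched on a bare molecular graph: keep stripping terminal atoms until only the ring systems and the linkers between them remain. The snippet below is a simplified, toolkit-free illustration and not the code used for this post. The function name and the adjacency-dictionary input are assumptions made for the example, and it ignores bond orders and stereochemistry, which a real cheminformatics toolkit would handle.

```python
def murcko_framework_atoms(adjacency):
    """Return the atoms left after repeatedly stripping terminal atoms.

    `adjacency` (hypothetical input format) maps each heavy-atom index to
    the set of its bonded neighbours. What survives is the ring systems
    plus the linkers joining them; an acyclic molecule leaves nothing.
    """
    graph = {atom: set(nbrs) for atom, nbrs in adjacency.items()}
    # Atoms with at most one neighbour are side-chain ends.
    terminal = [atom for atom, nbrs in graph.items() if len(nbrs) <= 1]
    while terminal:
        atom = terminal.pop()
        if atom not in graph:  # already removed via a duplicate entry
            continue
        for nbr in graph.pop(atom):
            graph[nbr].discard(atom)
            if len(graph[nbr]) <= 1:
                terminal.append(nbr)  # neighbour has become terminal
    return set(graph)


# Toy example: ethylbenzene, ring atoms 0-5, ethyl chain 6-7.
ethylbenzene = {
    0: {1, 5, 6}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4, 0},
    6: {0, 7}, 7: {6},
}
print(sorted(murcko_framework_atoms(ethylbenzene)))  # [0, 1, 2, 3, 4, 5]
```

The ethyl side chain is pruned and only the benzene ring survives, which is what one would expect the scaffold of ethylbenzene to be.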



                                     

Some more funny molecules. Eureka! I like this one!



After all, chemists can be funny too! With due respect to these chemists, here is the final one to end the Easter break.


Click on the image to go to the original page. (Image Courtesy: Original Page)

             

Sunday, March 27, 2016

What do approved drugs tell us?


In the latest version of ChEMBL (version 21), the approved-drug data is frozen as of December 2015. There are 1,967 approved (or marketed) drugs (development phase 4) in a larger pool of 11,222 ChEMBL drugs. I thought I would peek into this data and create some quick views.


ChEMBL drugs by development phase

Going by the first approval dates provided, here is what the timeline of drug approvals looks like. The earliest drug approval dates back to 1939. The largest number of marketed drugs were approved in 1982, followed by 1996 and 2015. Although there is no clear trend, the overall number of approvals rose until 1996, then declined, and has picked up again in the last couple of years.
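A timeline like this boils down to counting drugs per first-approval year. Here is a minimal sketch of that tally, assuming the drug records have already been exported to a list of dictionaries. The function name, the `first_approval` key and the toy records are all made up for illustration and are not ChEMBL data.

```python
from collections import Counter


def approvals_per_year(drugs):
    """Count drugs by first-approval year, skipping records with no year."""
    years = [d["first_approval"] for d in drugs if d.get("first_approval")]
    return sorted(Counter(years).items())


# Toy records (hypothetical, not taken from ChEMBL).
toy_drugs = [
    {"name": "drug_a", "first_approval": 1982},
    {"name": "drug_b", "first_approval": 1996},
    {"name": "drug_c", "first_approval": 1982},
    {"name": "drug_d", "first_approval": None},  # approval year unknown
]
print(approvals_per_year(toy_drugs))  # [(1982, 2), (1996, 1)]
```

The sorted list of (year, count) pairs can be handed straight to any plotting library to draw the bar chart.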

Timeline of drug approval

Of all the approved drugs, 1,646 have an ATC code assigned. The most prominent anatomical classes (first-level ATC) are N (284), A (275), C (229) and L (223). It is interesting to see more systemic hormonal preparations (class H) than antiparasitic and insecticide drugs (class P). Note that a drug can belong to more than one anatomical class, depending on the indications it is used for.
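Since the first-level class is simply the first letter of the ATC code, the distribution can be tallied as sketched below. This is only an illustration: the function name and the mapping from drug names to code lists are assumptions, and the toy drugs are placeholders. Each drug is counted once per class, so a drug with codes in two classes contributes to both.

```python
from collections import Counter


def first_level_atc_counts(drug_atc_codes):
    """Count drugs per first-level ATC class (first letter of the code).

    A drug with several codes in the same class is counted once for it;
    a drug with codes in different classes is counted once in each.
    """
    counts = Counter()
    for codes in drug_atc_codes.values():
        classes = {code[0].upper() for code in codes if code}
        counts.update(classes)
    return counts


# Toy mapping (placeholder drug names, codes for illustration only).
toy_codes = {
    "drug_a": ["N02BA01", "B01AC06"],  # two different anatomical classes
    "drug_b": ["N05BA01"],
    "drug_c": ["C07AB02", "C07AB03"],  # same class twice, counted once
}
print(sorted(first_level_atc_counts(toy_codes).items()))
# [('B', 1), ('C', 1), ('N', 2)]
```

Because of the multi-class drugs, the per-class counts can add up to more than the number of drugs.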

Distribution of drugs in different first-level ATC classes 

A growing body of evidence suggests that drugs interact with more than one biological target to elicit their biological effects, supporting the polypharmacology paradigm. It is interesting to see how this looks for the approved drugs. For this post, I only chose drug-target interactions in ChEMBL for which the mechanism of action is known. There are two aspects of promiscuity that I would like to present here: drug promiscuity and target promiscuity. The former is the number of targets a drug interacts with; the latter is the number of drugs that interact with a single target.
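Both measures fall out of the same list of drug-target pairs, just grouped in opposite directions. A minimal sketch follows, assuming the interactions are available as (drug, target) tuples. The function name and the toy pairs are hypothetical and not taken from ChEMBL.

```python
from collections import defaultdict


def promiscuity(pairs):
    """Return (drug promiscuity, target promiscuity) from (drug, target) pairs.

    Drug promiscuity: number of distinct targets per drug.
    Target promiscuity: number of distinct drugs per target.
    """
    targets_of = defaultdict(set)
    drugs_of = defaultdict(set)
    for drug, target in pairs:
        targets_of[drug].add(target)
        drugs_of[target].add(drug)
    drug_prom = {drug: len(targets) for drug, targets in targets_of.items()}
    target_prom = {target: len(drugs) for target, drugs in drugs_of.items()}
    return drug_prom, target_prom


# Toy interactions (placeholder names).
toy_pairs = [
    ("drug_a", "target_1"),
    ("drug_a", "target_2"),
    ("drug_b", "target_1"),
    ("drug_b", "target_1"),  # duplicate row, counted once
    ("drug_c", "target_3"),
]
drug_prom, target_prom = promiscuity(toy_pairs)
print(drug_prom)    # {'drug_a': 2, 'drug_b': 1, 'drug_c': 1}
print(target_prom)  # {'target_1': 2, 'target_2': 1, 'target_3': 1}
```

Using sets keeps duplicate rows from inflating either count.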

Drug promiscuity trend


Target promiscuity trend

Note that the drug promiscuity and target promiscuity picture would be completely different if we considered activity information from bioassays. But based on mechanism-of-action data alone, a majority of biological targets still have only one approved drug that interacts with them. This means that there is a lot of scope for experimental drugs to explore these targets. Here are the top 10 targets by number of interacting drugs, along with the first drug ever approved for each.
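A ranking of this kind can be sketched as follows, assuming each interaction record carries the drug, the target and the drug's first-approval year. The function name, the tuple layout and the toy records are assumptions for illustration, and ties are broken alphabetically, which is an arbitrary choice.

```python
def top_targets(records, n=10):
    """Rank targets by number of distinct drugs and report the earliest one.

    `records` is an iterable of (drug, target, first_approval_year) tuples.
    Returns a list of (target, drug_count, first_drug, first_year).
    """
    by_target = {}
    for drug, target, year in records:
        by_target.setdefault(target, {})[drug] = year
    # Most drugs first; alphabetical target name breaks ties.
    ranked = sorted(by_target.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    result = []
    for target, drugs in ranked[:n]:
        first = min(drugs, key=lambda d: (drugs[d], d))
        result.append((target, len(drugs), first, drugs[first]))
    return result


# Toy records (placeholder names and years).
toy_records = [
    ("drug_a", "target_1", 1990),
    ("drug_b", "target_1", 1975),
    ("drug_c", "target_2", 2001),
    ("drug_d", "target_1", 2010),
]
for row in top_targets(toy_records, n=2):
    print(row)
# ('target_1', 3, 'drug_b', 1975)
# ('target_2', 1, 'drug_c', 2001)
```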

Top 10 targets by number of interacting drugs

A lot more can be learned from the data with deeper analysis. For instance, one could see which molecular frameworks (scaffolds and cyclic skeletons) are prevalent among the approved drugs compared with drugs in the other development phases (3, 2 and 1).

Sunday, March 6, 2016

Chemistry in Wikipedia


"The joy of discovery is certainly the liveliest that the mind of man can ever feel".
-Claude Bernard


Exactly a year ago, the Journal of Cheminformatics published a paper presenting a Chemical Structure Explorer that lets one search through all the chemicals that have an entry page in Wikipedia. I feel this is a wonderful resource that puts the chemical space of Wikipedia in a nutshell. It provides both structure and substructure search via a simple web interface. Here's the paper for you and the web interface for all the chemistry enthusiasts.

I totally agree with the authors that this effort could improve the quality of the chemical entries in Wikipedia, and the resource is indeed handy for researchers and medicinal chemists who want to find molecules similar to their novel leads or other chemical compounds of interest. They also pointed out a few duplicate entries (same SMILES) in Wikipedia, noting that a few of these are due to missing stereochemistry in the SMILES. About 250 of the most frequently occurring scaffolds were also presented in a nice scaffold collage (see below).

250 frequent scaffolds in Wiki chemical space (1)

While there has been little follow-up by the scientific community (2 citations so far in PubMed), it would be interesting to see the data from different perspectives. In an earlier post, Egon pointed out that he could not parse about 42 of the SMILES. However, in the latest downloaded file, I failed to parse only 7 of them with CDK 1.5.13. Most of these failures were due to unclosed rings and invalid Kekulé representations in the SMILES string.
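To show what an "unclosed ring" means in practice, here is a rough check that scans a SMILES string for ring-closure labels that open but never close. It is only a sketch under simple assumptions and is not how CDK parses SMILES: the function name is made up, digits inside square brackets are skipped because there they denote isotopes, hydrogen counts or charges, and nothing else about the string is validated.

```python
DIGITS = "0123456789"


def unclosed_ring_labels(smiles):
    """Return the set of ring-closure numbers left open in a SMILES string."""
    open_labels = set()
    in_bracket = False
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if ch == "[":
            in_bracket = True
        elif ch == "]":
            in_bracket = False
        elif not in_bracket:
            label = None
            pair = smiles[i + 1:i + 3]
            if ch == "%" and len(pair) == 2 and all(c in DIGITS for c in pair):
                label = int(pair)  # two-digit ring label such as %12
                i += 2
            elif ch in DIGITS:
                label = int(ch)
            if label is not None:
                # First sighting opens the ring, second sighting closes it.
                open_labels ^= {label}
        i += 1
    return open_labels


print(unclosed_ring_labels("c1ccccc1"))  # set()
print(unclosed_ring_labels("C1CCCCC"))   # {1}
```

An empty set only means every ring label is paired up; the string can still be invalid for other reasons, such as a bad Kekulé structure.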

Next, I looked at the basic properties of the parsed molecules. The overall distribution of molecules for different properties can be found in the plots below. About 85% of the molecules have an atom count between 0 and 60, but more than 10 molecules are complex enough to contain more than 500 atoms. Also, about 90% of them weigh under 500 amu, and ~93% of them have hydrogen bond donors <= 5 and hydrogen bond acceptors <= 10, in line with Lipinski's rule of five.
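Percentages like these can be computed from a table of per-molecule properties, as in the sketch below. It assumes the properties have already been calculated with a toolkit and stored under the hypothetical keys `mw`, `hbd` and `hba`; the function name and the toy values are made up. It covers only the three criteria mentioned above; the full rule of five also puts a limit on logP, which is left out here.

```python
def lipinski_fractions(molecules):
    """Fraction of molecules meeting the weight and H-bond criteria.

    Each molecule is a dict with keys `mw` (molecular weight), `hbd`
    (hydrogen bond donors) and `hba` (hydrogen bond acceptors).
    """
    n = len(molecules)
    if n == 0:
        return {}
    under_500 = sum(m["mw"] < 500 for m in molecules)
    hbond_ok = sum(m["hbd"] <= 5 and m["hba"] <= 10 for m in molecules)
    return {"mw_under_500": under_500 / n, "hbd_hba_ok": hbond_ok / n}


# Toy property table (illustrative values, not from the Wikipedia set).
toy_molecules = [
    {"mw": 180.2, "hbd": 1, "hba": 4},
    {"mw": 310.4, "hbd": 2, "hba": 5},
    {"mw": 720.9, "hbd": 7, "hba": 12},
    {"mw": 455.0, "hbd": 6, "hba": 8},
]
print(lipinski_fractions(toy_molecules))
# {'mw_under_500': 0.75, 'hbd_hba_ok': 0.5}
```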






A lot more can be done with this wealth of data. For example, it would be interesting to see how the open-access chemistry data differs structurally from its patented counterpart in the pharma industry. Looking forward to more analyses of this interesting resource.

References


(1) Ertl et al. Wikipedia Chemical Structure Explorer: substructure and similarity searching of molecules from Wikipedia. J Cheminform. 2015; 7:10.