[Zinc-fans] Question about "all purchasable" subset.

marc marc at plasmodium.sfsu.edu
Thu Oct 30 16:48:07 PDT 2008


John J. Irwin wrote:
> Hi Marc
>
> marc wrote:
>   
>> Hello other Zinc fans,
>> I've recently started using the OpenEye utilities ROCS and EON to 
>> do shape/electrostatics comparison against ZINC on a couple of projects. 
>>
>> Quick question about the large "all purchasable" subset. I've found a 
>> number of hits from this subset that, when I follow up on them, list 
>> only a single "vendor" (PubChem). Are compounds listed in PubChem 
>> really purchasable? I've never thought of PubChem as a vendor, and 
>> can't see any way of procuring compounds when I visit their website.
>>
>> Assuming I'm right: in this small subset of my hits (30 compounds), 
>> 5 appear only in PubChem, so about 17% of the hits below don't appear 
>> to be purchasable, unless ZINC is mis-categorizing them. Any 
>> clarification on this issue? Thanks!
>> -marc
>>  
>>     
[john responded]:
> Hmmm. Before I go further, can you tell me more about how you put this
> list of ZINC IDs together please? Date downloaded? Exactly where did you
> look for the information?
Hi John,
Thanks for the response. I downloaded the current "all purchasable" 
ZINC collection this past week. I actually downloaded two collections 
of that subset, in SDF format, from this link:
http://zinc.docking.org/subset1/6/index.html
[a] the "all" collection (all protonation states)
[b] the "single" collection (a single protonation state, as I 
understand it).

My goal is to do shape/electrostatics comparison against a lead compound 
in a drug development project.
In this search I used the [a] database, although I probably could have 
used the [b] database just as well and in less time.

Then I used the OpenEye ROCS utility to compare the lead compound 
(minimized with OpenEye SZYBKI) against the entire ZINC database. I 
wrote a little Perl script to drive the search: it runs a ROCS search 
on each of the SDF files downloaded from ZINC. In the end I get 272 SDF 
files of ROCS hits (one for each of the 272 ZINC SDF files composing 
this library). I concatenate the hit files (Unix cat command), load the 
combined file into OpenEye VIDA, and sort by Tanimoto score to find the 
best hits.

The only problem with my approach might be that I'm not sampling 
multiple conformations of my starting ligand (or of the ZINC database), 
but for a first-round attempt that's probably OK. (That's a topic for 
another post, and I'm sure the issue has been addressed elsewhere.) 
Since my starting ligand has been minimized, and the ZINC compounds 
have also been minimized, I'm expecting the search to pick up at least 
a good subset of shape hits, and at least some interesting compounds to 
purchase and screen.
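
For illustration only, the concatenate-and-sort step described above 
could be sketched in plain Python as follows. This is not the Perl 
script mentioned in this message. The function names are hypothetical, 
and the SD tag name used for the score ("ShapeTanimoto" in the usage 
note) is an assumption: check which tags your ROCS version actually 
writes into its hit files.

```python
import io

def read_sdf_records(handle):
    """Yield each SDF record as a list of lines.

    Records are delimited by a line containing only '$$$$'. A trailing
    record that lacks the delimiter is ignored.
    """
    record = []
    for line in handle:
        line = line.rstrip("\r\n")
        if line.strip() == "$$$$":
            yield record
            record = []
        else:
            record.append(line)

def get_tag(record, tag):
    """Return the first value line of SD data item `tag`, or None."""
    for i, line in enumerate(record):
        if line.startswith(">") and f"<{tag}>" in line and i + 1 < len(record):
            return record[i + 1].strip()
    return None

def top_hits(handles, score_tag, n=10):
    """Pool records from several SDF handles; return the n best (score, name)."""
    scored = []
    for handle in handles:
        for record in read_sdf_records(handle):
            value = get_tag(record, score_tag)
            if value is None:
                continue  # record carries no score tag
            try:
                score = float(value)
            except ValueError:
                continue  # score field is not numeric
            # The first line of a molfile block is the molecule title.
            scored.append((score, record[0].strip()))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:n]
```

With the hit files opened as text handles, a call such as 
top_hits(handles, "ShapeTanimoto", n=30) would return the 30 
best-scoring titles without loading the combined file into a viewer 
first.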

After the hits were obtained, I copied and pasted a list of them into 
the ZINC Search/browse website: 
http://zinc.docking.org/choose.shtml
This is where I found that some of the hits are "available" only from 
PubChem, and so not really purchasable as I see it.
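
As a small sketch of the check described above: if the vendor listing 
for each hit were copied into a Python dictionary by hand, flagging the 
hits whose only listed source is PubChem might look like this. The 
dictionary layout and function names are hypothetical; nothing here 
queries the ZINC website.

```python
def pubchem_only(vendors_by_id, non_vendors=("pubchem",)):
    """Return the IDs whose every listed source is in `non_vendors`.

    `vendors_by_id` maps a compound ID to a list of source names, as
    transcribed by hand from the search results (hypothetical layout).
    IDs with an empty source list are not flagged.
    """
    excluded = {name.lower() for name in non_vendors}
    flagged = []
    for compound_id, vendors in vendors_by_id.items():
        names = {v.strip().lower() for v in vendors}
        if names and names <= excluded:
            flagged.append(compound_id)
    return flagged

def percent(part, total):
    """Return `part` as a percentage of `total`."""
    return 100.0 * part / total
```

For the numbers quoted earlier in this thread, percent(5, 30) is about 
16.7, i.e. roughly one hit in six.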

(The particular hits I mentioned in my previous email all reside in 
6_p0.33.sdf from the "all purchasable" library obtained from ZINC).

Thanks for your help. Sorry for the mouthful. 
-marc






