[Zinc-fans] Question about "all purchasable" subset.
marc
marc at plasmodium.sfsu.edu
Thu Oct 30 16:48:07 PDT 2008
John J. Irwin wrote:
> Hi Marc
>
> marc wrote:
>
>> Hello other Zinc fans,
>> I've just recently started using the OpenEye utilities ROCS and EON to
>> do shape/electrostatics comparison against ZINC on a couple of projects.
>>
>> Quick question about the large "all purchasable" subset. I've found a
>> number of hits from this subset that, when I follow up
>> on them, list only a single "vendor" (PubChem). Are compounds listed in
>> PubChem really purchasable? I've never thought of PubChem as a vendor,
>> and can't see any way of procuring compounds when I visit their website.
>>
>> Assuming I'm right, in this small subset of my hits (30 compounds),
>> 5 of them appear only in PubChem (~17% of the hits
>> below don't appear to be purchasable, unless ZINC is mis-categorizing
>> them). Any clarification on this issue? Thanks!
>> -marc
>>
>>
[john responded]:
> Hmmm. Before I go further, can you tell me more about how you put this
> list of ZINC IDs together please? Date downloaded? Exactly where did you
> look for the information?
Hi John,
Thanks for the response. I downloaded the current "all
purchasable" ZINC collection this past week. I actually downloaded two
collections of that subset, in SDF format, from this link:
http://zinc.docking.org/subset1/6/index.html
[a] the "all" collection (all protonation states)
[b] the "single" collection (a single protonation state, as
I understand it).
My goal is to do shape/electrostatics comparison against a lead compound
in a drug development project.
In this search I used the [a] database, although I probably could have
used the [b] database just as well and in less time.
Then I used the OpenEye ROCS utility to compare the lead compound
(minimized with OpenEye SZYBKI) against the entire ZINC database. I
wrote a little Perl script to assist in the overall search. It
basically runs a ROCS search on each of the SDF files downloaded from
ZINC. In the end I get 272 SDF files of ROCS hits (one for each of the
272 ZINC SDF files composing this library). I concatenate the hit
files (Unix cat command), visualize the combined file using OpenEye
VIDA, and sort by Tanimoto score to find the best hits. The only
problem with my approach might be that I'm not sampling multiple
conformations of my starting ligand (or of the ZINC database), but for
a first-round attempt that's probably OK. (That is a topic for another
post later; I'm sure this issue has been addressed elsewhere. By
minimizing my starting ligand, and knowing that the ZINC compounds
have also been minimized, I'm expecting the search to pick up at least
a good subset of shape hits, and at least some interesting compounds
to purchase and screen.)
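For anyone who wants to do the merge-and-sort step without a viewer,
here is a minimal Python sketch of it (not the Perl script mentioned
above). It assumes each hit record carries its score in an SD data
field; the tag name "Tanimoto" and the function names are placeholders
of mine, so check the hit files for the tag your ROCS version actually
writes.

```python
from pathlib import Path


def read_sdf_records(text):
    """Split SDF text into records; each record ends with a '$$$$' line."""
    records, current = [], []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            records.append("\n".join(current))
            current = []
        else:
            current.append(line)
    return records


def get_tag(record, tag):
    """Return the value on the line after '> <tag>', or None if absent."""
    lines = record.splitlines()
    for i, line in enumerate(lines):
        if line.startswith(">") and f"<{tag}>" in line:
            return lines[i + 1].strip() if i + 1 < len(lines) else None
    return None


def rank_hits(sdf_texts, score_tag="Tanimoto", top=None):
    """Merge hits from several SDF texts and sort by score, best first.

    score_tag is an assumed tag name, not a documented ROCS field.
    Records without a numeric value for that tag are skipped.
    """
    hits = []
    for text in sdf_texts:
        for record in read_sdf_records(text):
            value = get_tag(record, score_tag)
            if value is None:
                continue
            try:
                score = float(value)
            except ValueError:
                continue
            name = record.splitlines()[0].strip()  # SDF title line
            hits.append((score, name))
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return hits[:top] if top else hits


def rank_hit_files(directory, pattern="*.sdf", **kwargs):
    """Same as rank_hits, reading every matching file in a directory."""
    paths = sorted(Path(directory).glob(pattern))
    return rank_hits((p.read_text() for p in paths), **kwargs)
```

Calling rank_hit_files on the directory holding the hit files, with
top=30, would give the 30 best-scoring (score, name) pairs, where the
name is whatever is on the title line of each record.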
After the hits were obtained, I copied/pasted a list of them into the
ZINC Search/browse website:
http://zinc.docking.org/choose.shtml
This is where I found that some of the hits are "available" only from
PubChem, and so not really purchasable as I see it.
(The particular hits I mentioned in my previous email all reside in
6_p0.33.sdf from the "all purchasable" library obtained from ZINC.)
Thanks for your help. Sorry for the mouthful.
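The check I did by hand can be sketched in a few lines of Python. This
assumes the vendor names for each hit have already been collected from
the ZINC pages into a dictionary; the function names are hypothetical,
and treating "pubchem" as a non-vendor is my assumption rather than
anything ZINC defines.

```python
def pubchem_only_hits(vendors_by_id, non_vendors=frozenset({"pubchem"})):
    """Return the IDs whose vendor list contains no real supplier.

    vendors_by_id maps a compound ID to a list of vendor names.
    Names in non_vendors (assumed here to be just PubChem) do not count
    as suppliers. A hit with an empty vendor list is flagged as well.
    """
    flagged = []
    for compound_id, vendors in vendors_by_id.items():
        suppliers = {v.strip().lower() for v in vendors} - non_vendors
        if not suppliers:
            flagged.append(compound_id)
    return flagged


def percent_flagged(n_flagged, n_total):
    """Percentage of hits that were flagged, rounded to one decimal."""
    return round(100.0 * n_flagged / n_total, 1)
```

With the counts from my earlier email, percent_flagged(5, 30) gives
16.7, i.e. roughly one hit in six.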
-marc