[Zinc-fans] subset 29
John J. Irwin
jji at cgl.ucsf.edu
Mon Jun 22 17:34:09 PDT 2009
Hi Ben
Thanks for your email and your interest in ZINC.
Ben Keshet wrote:
> Hi,
>
> According to the ZINC subsets page, subset #29 is CNS permeable
> molecules, however I cannot find this subset in the download-able list
> of subsets. Is it still available anywhere? If yes, what was the
> criteria that this subset was filtered by?
We quietly removed the "CNS permeable" subset (#29) because we thought
no one was interested. Your feedback is helpful, and we will reactivate
it. We used the criteria we found in the literature: tPSA < 60 AND 150
< MWT < 400 AND 1.5 < calculated LogP < 2.7.
We have renumbered "CNS permeable" as #39. We have also added the
"clean" requirement, since we think that will be more interesting and
useful for most people (tell us if that is not true). It will be ready
by Friday or earlier. It will be announced on our blog, http://docking.org/.
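
For anyone who wants to apply the same cutoffs to their own molecules, the
three criteria quoted above can be sketched as a small Python check. This is
only an illustration, not the code ZINC runs: the Descriptors class and the
is_cns_permeable name are made up for this example, the descriptor values are
assumed to have been computed already by whatever toolkit you use, and the
additional "clean" requirement is not modelled.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed properties for one molecule (hypothetical container)."""
    tpsa: float  # topological polar surface area
    mwt: float   # molecular weight
    logp: float  # calculated LogP


def is_cns_permeable(d: Descriptors) -> bool:
    """Return True if all three cutoffs from the message are met.

    All bounds are strict, as written in the message:
    tPSA < 60 AND 150 < MWT < 400 AND 1.5 < LogP < 2.7.
    """
    return d.tpsa < 60 and 150 < d.mwt < 400 and 1.5 < d.logp < 2.7


# Example with invented descriptor values (not taken from ZINC):
example = Descriptors(tpsa=40.0, mwt=300.0, logp=2.0)
print(is_cns_permeable(example))
```

Because the bounds are strict, a molecule sitting exactly on a cutoff (for
example tPSA = 60) is rejected in this sketch.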
>
> Another unrelated question - when I search ZINC, I can always view
> less hits per page than I indicate. For example, I have only 94 hits
> when I ask for 500/page. Below the table I have the following text:
The numbering is funky and can be ignored. We could get the numbering
correct, but doing so makes the database run unacceptably slowly. We are
aware of this problem, and of how confusing it must seem. We regret being
unclear, and hope to have a better - faster - solution soon.
>
> start 0 size 500
>
> select s.sub_id,s.smiles from substance s, protomer p, catalog_item
> ci, catalog c , zinctmp.qury q where s.sub_id = ci.sub_id_fk and
> ci.cat_id_fk = c.cat_id and c.free=1 and p.sub_id_fk = s.sub_id and
> q.qury_id = 146604 and q.sub_id = s.sub_id limit 0,500
>
>
> When I downloaded the table, I got a shorter table than what I
> expected. For example, searching by smile: Cc1ccc(C)cc1, yielded only
> 165 molecules. What am I missing?
Our current implementation is very inefficient at handling ad hoc
subsets. It was really fine when ZINC had 750,000 molecules, but with
32M internal entries, we have had to cut corners to make ZINC operate
fast enough to support the community. ZINC is really designed to get
dockable databases into your hands via downloading of prepared subsets.
We support ad hoc queries, but if you want full answers to queries that
return many results, you are better off downloading the SMILES to your
own computer and doing the search locally. (It turns out this is pretty
easy to do, e.g. with OEChem or other tools.)
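
As a rough illustration of such a local search, here is a minimal Python
sketch. It assumes the downloaded file has one molecule per line, with the
SMILES string first and an identifier second, separated by whitespace; check
the actual file before relying on that. The function names are invented for
this example. The substructure test uses RDKit as a stand-in for "other
tools" (it is not OEChem, and not what ZINC uses internally), and RDKit is
imported only when that matcher is requested, so the file-handling part runs
without it.

```python
from typing import Callable, Iterable, Iterator, List, Tuple


def iter_smiles(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (smiles, identifier) pairs from whitespace-separated lines."""
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        smiles = fields[0]
        ident = fields[1] if len(fields) > 1 else ""
        yield smiles, ident


def search(lines: Iterable[str],
           matches: Callable[[str], bool]) -> List[Tuple[str, str]]:
    """Return every (smiles, identifier) pair whose SMILES satisfies matches."""
    return [(smi, ident) for smi, ident in iter_smiles(lines) if matches(smi)]


def rdkit_substructure_matcher(query_smiles: str) -> Callable[[str], bool]:
    """Build a predicate that tests for query_smiles as a substructure.

    Requires RDKit, which is an assumption of this sketch.
    """
    from rdkit import Chem  # imported lazily so the rest works without RDKit

    query = Chem.MolFromSmiles(query_smiles)
    if query is None:
        raise ValueError(f"could not parse query SMILES: {query_smiles!r}")

    def matches(smiles: str) -> bool:
        mol = Chem.MolFromSmiles(smiles)
        # Unparseable SMILES are treated as non-matches rather than errors.
        return mol is not None and mol.HasSubstructMatch(query)

    return matches


# Usage (file name is hypothetical):
#
#     with open("subset.smi") as handle:
#         hits = search(handle, rdkit_substructure_matcher("Cc1ccc(C)cc1"))
#     print(len(hits))
```

Since the file is read one line at a time, this approach returns every match
in the downloaded set rather than a truncated page of results.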
I hope this is clear and not too disappointing.
John
UCSF ZINC Team
>
> Thanks,
> Ben Keshet
> UMBC
> _______________________________________________
> Zinc-fans mailing list
> Zinc-fans at docking.org
> http://blur.compbio.ucsf.edu/mailman/listinfo/zinc-fans
>