[Zinc-fans] Some Question related to OLD and New ZINC version
John J. Irwin
jji at cgl.ucsf.edu
Fri Dec 1 20:17:33 PST 2006
Hi Varsha
Thanks for your questions. First, the approximate counts are:
Version 6 (New ZINC): about 4.6 M molecules
Version 5 (aka Old ZINC): about 3.3 M molecules
A1. The structures were all completely recalculated going from version 5
to 6. We - and numerous colleagues - think the geometries in version 6
are better.
A2. Version 6 is supposed to be a superset of, and indeed supersede, version
5. That didn't quite work out, so we've kept the old one online. We hope
people are only using the new version, except to look up ZINC IDs that
somehow did not make it into version 6. Why did some molecules not make
it into version 6, you may ask?
a. We did not load many compounds over 400 Daltons that were in version
5. Generally docking algorithms perform worse on large molecules,
especially floppy ones. We may load them again some day, but I
personally think docking big molecules is a big waste of time and disk
space.
b. There was an embarrassing number of duplicates in version 5, due to
problems with our old canonicalization procedure.
We intend to release ZINC version 7 on January 1, 2007, so I would prefer to
refer to the versions by number rather than as OLD and NEW, please. ZINC version 7 will
be an incremental improvement over ZINC version 6.
Hope this helps
John
varsha wrote:
> Hello,
> I wanted to ask a couple of questions about the New ZINC database:
>
> 1. For the same molecule, do the OLD and NEW databases have the same
> ZINC ID for it? Do they also have the same 3D structure?
>
> 2. For the 3 million new structures in the NEW Zinc database
> (category: all-purchasable) , how many of them are in the old
> database's 6.7 million structures? How many of them are REALLY new?
>
> Thanks in advance,
> Varsha
>
> _______________________________________________
> Zinc-fans mailing list
> Zinc-fans at docking.org
> http://blur.compbio.ucsf.edu/mailman/listinfo/zinc-fans