[Zinc-fans] question about searching exact structures in ZINC
John J. Irwin
jji at cgl.ucsf.edu
Tue Sep 30 04:50:29 PDT 2008
Hi Rafael
Rafael Gozalbes wrote:
> Dear Zinc-fans,
>
> I have tried to search exact structures in the ZINC database without
> success, and I would like to know if it is possible to perform such a
> query (let's say, is it possible for example to retrieve solely pyridine
> if in the query I indicate the SMILES of pyridine?).
>
If you use n1ccccc1 100 as the line in the SMILES window, 100 means
"Tanimoto 100%", i.e. identity. As you will see, it finds more than just the
molecule you are looking for (pyridine is actually on page 2 in this
example). So, our fingerprint method is somewhat imprecise. This is an
area we will return to, but it is not our primary focus, which is
instead on providing ready-to-dock databases for 3D virtual screening.
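
To illustrate why a "Tanimoto 100%" query can return more than one molecule, here is a minimal sketch in Python. It assumes fingerprints are plain sets of "on" bit positions; the bit values and the names mol_A, mol_B, mol_C are hypothetical toy data, not ZINC's actual fingerprints or code. The point is only that the comparison is between fingerprints, so any two molecules that happen to set the same bits score 100%.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 1.0  # two empty fingerprints are treated as identical
    return len(a & b) / len(a | b)


def search(query: frozenset, library: dict, threshold_percent: float) -> list:
    """Return names of library entries whose Tanimoto score meets the threshold."""
    return sorted(
        name
        for name, fp in library.items()
        if 100.0 * tanimoto(query, fp) >= threshold_percent
    )


# Hypothetical toy fingerprints (assumption: not real ZINC data).
query = frozenset({3, 17, 42})
library = {
    "mol_A": frozenset({3, 17, 42}),      # the molecule we are looking for
    "mol_B": frozenset({3, 17, 42}),      # a different molecule, same bits
    "mol_C": frozenset({3, 17, 42, 58}),  # one extra bit, so below 100%
}

hits = search(query, library, 100)
print(hits)
```

In this toy case both mol_A and mol_B are returned at a threshold of 100, because their fingerprints are indistinguishable even though the molecules are meant to differ.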
> I have tried to perform my query by using the Tanimoto and Tversky
> thresholds as indicated in the HELP pages, but without success.
> Furthermore, I have the impression that Tversky thresholds do not work
> and the result is the same as putting only the Tanimoto number.
>
If you send me specific examples that you feel do not work, I will look
at them. It "works" for me, given our not-quite-right fingerprints.
For instance:
n1ccccc1 100 finds pages of hits, including pyridine (ZINC ID 895354,
</srchdb.pl?zinc=895354>) on page 2.
n1ccccc1 100 0 100 finds molecules containing pyridine, generally. It also
finds indoles (!). Hmmm. Anyway, they are generally larger molecules
that _include_ pyridine.
n1ccccc1 100 100 0 in this case finds more or less the same as n1ccccc1
100 above ("contained in...").
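
The asymmetry between "containing" and "contained in" searches can be sketched with the standard Tversky index, which weights the features unique to the query and the features unique to the candidate separately. This is a minimal illustration on hypothetical toy bit sets. The parameter names alpha and beta are the usual textbook ones, and how the numbers on the query line map onto them in ZINC is not stated here, so the sketch makes no claim about that.

```python
def tversky(a: frozenset, b: frozenset, alpha: float, beta: float) -> float:
    """Tversky index: |A & B| / (|A & B| + alpha*|A - B| + beta*|B - A|)."""
    common = len(a & b)
    denom = common + alpha * len(a - b) + beta * len(b - a)
    return common / denom if denom else 0.0


# Hypothetical toy fingerprints (assumption: not real ZINC data).
small = frozenset({1, 2, 3})          # e.g. a small query ring
large = frozenset({1, 2, 3, 4, 5})    # a larger molecule with the same bits plus more

# Penalise only bits the query has and the candidate lacks:
# the larger molecule scores 1.0 because it "contains" the query.
contains_score = tversky(small, large, alpha=1.0, beta=0.0)

# Penalise only bits the candidate has and the query lacks:
# the larger molecule now scores below 1.0.
contained_in_score = tversky(small, large, alpha=0.0, beta=1.0)

# With alpha = beta = 1 the Tversky index reduces to Tanimoto.
symmetric_score = tversky(small, large, alpha=1.0, beta=1.0)

print(contains_score, contained_in_score, symmetric_score)
```

With one weight set to zero, a 100% threshold behaves like a substructure-style or superstructure-style filter on fingerprint bits, which is why the two query lines can return different kinds of hits.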
However, other patterns will find something different in this way. Thus
C2CCC1CCCC1C2 100 100 0
finds molecules having the 5- and 6-membered rings separately as well as
5- and 6-membered rings fused as drawn. It also seems to find a few other
things. OK, OK, our fingerprints are not quite right! But I hope you
will agree that this interface provides a pragmatic way to filter fairly
rapidly through 10^7 molecules to a number that can be inspected by eye.
Happy docking
John
UCSF ZINC Team