[Dock-fans] DOCKing with different architectures?
John J. Irwin
jji at cgl.ucsf.edu
Wed Jul 16 12:18:27 PDT 2008
Hi Marshall
Marshall Levesque wrote:
> John-
>
> Thank you for the quick and helpful reply.
>
> I had assumed this was the cause of discrepancies. So in your
> opinion, results from a screening experiment that was run on a single
> machine have equal "validity" when compared to the same screening
> results obtained using two different architectures?
Yes, equally "valid", by which I mean, they have no "validity" at all.
What does it mean to say a docking prediction is valid? It means it
actually works against the enzyme. Thus, they are predictions that need
to be tested.
John
UCSF DOCK Team
>
> -Marshall
>
> On Wed, Jul 16, 2008 at 11:05 AM, John J. Irwin <jji at cgl.ucsf.edu> wrote:
>
> Hi Marshall
>
> Marshall Levesque wrote:
> > This may be a general question that applies to all of the DOCK
> > software suite, or only certain parts:
> >
> > If one were to perform a screening of 100 small compounds (eg from
> > ZINC) using DOCK6 (grid energy and/or AMBER score) and the workload
> > was split between two different architectures (32-bit/64-bit,
> > different compiler versions), are there any issues with using the
> > results ranked by energy score? For this described situation, 50
> > compounds screened on each machine, same target, same input
> > files/parameters.
> > I'm asking this because if I run the same set of compounds on two
> > different architectures, I get similar results with similar rankings
> > and scores, but sometimes there is the occasional swing in score for
> > some of the compounds (eg -20 --> -8 for grid energy score). These
> > large changes in score are obviously discomforting, but even the small
> > changes (-20 --> -19) could cause a significant shift in rankings when
> > screening large datasets on the order of 10^5 or 10^6.
> >
> > Those most familiar with the DOCK algorithms might know best. Is the
> > difference in score coming from different architectures something to
> > do with the calculation of the score? Or the orientation/conformation
> > of the compounds by anchor-and-grow?
> >
> > I felt that the limited sampling of the search space results in the
> > fact that one can never produce a TRUE score, but more sampling does
> > narrow the window of discrepancy in energy score for the same compound
> > DOCKed on two different architectures, leading me to believe the
> > conformation search is at fault.
> >
> > Any insight into this would be greatly appreciated, thanks!
> We use a different version of DOCK, but I think the conclusions are
> general for the method. It is normal for DOCK to produce very slightly
> different results for identical input on different hardware due to
> accumulation of small rounding errors on floating point numbers.
> Occasionally, the "slight difference" will be at a saddle point during a
> step of minimization, resulting in a different local minimum being
> found, and thus, potentially dramatically different results. But you
> don't have to go to new hardware to see this phenomenon. Just reverse
> the order of molecules in the database, so that minimization of a
> compound starts with a different random seed.
>
> The bottom line? Docking can be useful, but has important weaknesses.
> Make predictions, test them, and go back afterwards and check the
> calculation against the experimental results.
>
> I hope this helps.
>
> John
> UCSF DOCK Team
> _______________________________________________
> Dock-fans mailing list
> Dock-fans at docking.org
> http://blur.compbio.ucsf.edu/mailman/listinfo/dock-fans
>
>
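The rounding-error and saddle-point effect described in the quoted reply can be sketched in a few lines of Python. This is a toy illustration only, not DOCK code: the energy surface, step size, and starting pose are invented for the example, and plain gradient descent stands in for whatever minimizer DOCK actually uses. The first part shows that summing the same three numbers in a different order changes the last bit of the result. The second part starts two minimizations that differ only by that rounding-sized amount, one on each side of a saddle point, and they settle into different local minima with clearly different energies.

```python
# Toy model (hypothetical, not DOCK's scoring function): an asymmetric
# double well in x with a saddle point at (0, 0) and a simple bowl in y.
def energy(x, y):
    return (x * x - 1.0) ** 2 + 0.5 * x ** 3 + y * y


def gradient(x, y):
    # Analytic partial derivatives of energy() with respect to x and y.
    return 4.0 * x * (x * x - 1.0) + 1.5 * x * x, 2.0 * y


def minimize(x, y, step=0.05, iterations=2000):
    # Plain fixed-step gradient descent, a stand-in for a real minimizer.
    for _ in range(iterations):
        gx, gy = gradient(x, y)
        x -= step * gx
        y -= step * gy
    return x, y


# Part 1: floating-point addition is not associative, so the order in
# which terms are accumulated can change the last bit of the sum.
terms = [0.1, 0.2, 0.3]
forward = (terms[0] + terms[1]) + terms[2]
backward = terms[0] + (terms[1] + terms[2])
rounding_gap = forward - backward

# Part 2: two starting poses that differ only by a rounding-sized
# displacement, placed on either side of the saddle at x = 0.
pose_a = minimize(+rounding_gap, 0.5)
pose_b = minimize(-rounding_gap, 0.5)

print("rounding gap:", rounding_gap)
print("run A: pose", pose_a, "energy", energy(*pose_a))
print("run B: pose", pose_b, "energy", energy(*pose_b))
```

In this sketch the two runs are identical in every respect except the sign of a difference far below any physically meaningful precision, yet they report different final poses and energies. That is the same qualitative behavior as the occasional large score swing between architectures, and it also suggests why more sampling narrows the discrepancy: with more starting points, both runs are more likely to find the deeper minimum.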