[Dock-fans] Single processor optimization

Sudipto Mukherjee sudmukh at yahoo.com
Sat May 10 14:07:48 PDT 2008


Hi Francesco,

I'm not clear why you are using continuous scoring while docking, especially with a large protein. An energy grid would make the dock run orders of magnitude faster. Since you have so much memory, you could easily use a very fine grid (0.2 Å or smaller) if you are concerned about grid artifacts. If you wish, you can turn off clustering and ask dock to return a large number of poses, then rescore those poses with the continuous scoring function. Even then, you might consider retaining only those domains of the protein that are close to the binding site.
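To illustrate the last suggestion, here is a minimal Python sketch of trimming a receptor to the residues near the binding site. It is not part of dock; the function name, the (residue_id, x, y, z) atom layout, and the cutoff value are all assumptions made for the example. In practice you would do this on the receptor file before generating the grid or rescoring.

```python
import math

def residues_near_site(protein_atoms, site_points, cutoff):
    """Keep every atom of any residue that has at least one atom within
    `cutoff` angstroms of a binding-site point.

    protein_atoms: list of (residue_id, x, y, z)  (hypothetical layout)
    site_points:   list of (x, y, z), e.g. coordinates of a reference ligand
    """
    close_residues = set()
    for res_id, x, y, z in protein_atoms:
        if any(math.dist((x, y, z), p) <= cutoff for p in site_points):
            close_residues.add(res_id)
    # Whole residues are retained so that no residue is cut in half.
    return [atom for atom in protein_atoms if atom[0] in close_residues]

# Toy data: residue A1 touches the site, residue B2 is 30 angstroms away.
protein = [("A1", 0.0, 0.0, 0.0), ("A1", 1.5, 0.0, 0.0), ("B2", 30.0, 0.0, 0.0)]
site = [(2.0, 0.0, 0.0)]
trimmed = residues_near_site(protein, site, cutoff=1.0)
```

Fewer receptor atoms directly reduces the cost of continuous rescoring, since that cost grows with the number of protein atoms.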

The Redpaper describes using IBM's MASSV libraries for optimizing programs on Blue Gene. As far as I know, there are no similar libraries we can use with gcc. Also, note that for a single ligand, you are not getting the benefit of your dual Opteron setup: only one processor core is used to run dock.

The continuous scoring function is mostly used for rescoring rather than docking because it is quite slow: O(mn) for m protein atoms and n ligand atoms. Grid scoring, however, is only O(n), because the protein's contribution is precomputed on the grid, and is therefore much faster.
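The difference in cost can be seen in a toy Python sketch. This is not dock's scoring function: it uses a single made-up pairwise term (charge product over distance) and a nearest-grid-point lookup, whereas a real grid score would interpolate between grid points. All names and values are assumptions for illustration only.

```python
import math

def continuous_score(protein, ligand):
    """Toy pairwise term q_p * q_l / r over all pairs: O(m*n) per pose."""
    total = 0.0
    for px, py, pz, pq in protein:
        for lx, ly, lz, lq in ligand:
            total += pq * lq / math.dist((px, py, pz), (lx, ly, lz))
    return total

def build_grid(protein, origin, spacing, shape):
    """Precompute the protein's potential at each grid point (done once)."""
    grid = {}
    for i in range(shape[0]):
        for j in range(shape[1]):
            for k in range(shape[2]):
                point = (origin[0] + i * spacing,
                         origin[1] + j * spacing,
                         origin[2] + k * spacing)
                grid[(i, j, k)] = sum(
                    pq / math.dist(point, (px, py, pz))
                    for px, py, pz, pq in protein)
    return grid

def grid_score(grid, origin, spacing, ligand):
    """One lookup per ligand atom: O(n) per pose, independent of m."""
    total = 0.0
    for lx, ly, lz, lq in ligand:
        idx = (round((lx - origin[0]) / spacing),
               round((ly - origin[1]) / spacing),
               round((lz - origin[2]) / spacing))
        total += lq * grid[idx]  # nearest grid point, no interpolation
    return total

# Toy atoms as (x, y, z, charge); protein atoms lie off the grid points.
protein = [(-3.1, 0.0, 0.0, 0.5), (0.0, -4.3, 1.0, -0.4), (5.7, 5.7, 5.7, 0.3)]
ligand = [(1.0, 1.0, 1.0, 0.2), (1.5, 0.5, 2.0, -0.1)]
origin, spacing, shape = (0.0, 0.0, 0.0), 0.5, (5, 5, 5)

grid = build_grid(protein, origin, spacing, shape)
exact = continuous_score(protein, ligand)
approx = grid_score(grid, origin, spacing, ligand)
```

In this example the ligand atoms sit exactly on grid points, so the two scores agree; for atoms between grid points the lookup is an approximation whose error shrinks as the grid spacing gets finer. The grid is built once and reused for every pose, which is where the speedup during docking comes from.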
 
Regards
Sudipto Mukherjee
Graduate Student, Robert C. Rizzo Lab
Dept. of Applied Math & Statistics, Stony Brook University

----- Original Message ----
From: Francesco Pietra <chiendarret at yahoo.com>
To: dock-fans <dock-fans at docking.org>
Sent: Saturday, May 10, 2008 12:28:27 PM
Subject: [Dock-fans] Single processor optimization

This post is mostly to users of dual-opteron and Linux with a final brief question to the developers.

While performing a continuous flex docking for a large ligand on a big protein (NUMA-type machine, Opteron 875, 2.2 GHz; 8 GB RAM taken by the process, %MEM 29.3; 1150 min so far according to top, with no idea how much more time will be required, assuming no crash occurs), I was reading the IBM Redpaper "High throughput computing validation for drug discovery using the dock program on a massively parallel system", recently referred to in the Dock6 manual.

Section "Single processor optimization" describes a 43% improvement in speed by using specialized math libraries. I am using the standard GNU libraries (compilation was with gcc 4.2.3) provided by Debian Linux lenny. I wonder whether anyone has found improvements for this platform.

To the developers: since most of the time with this CPU-bound code is spent on scoring (Redpaper above), is there any plan to implement progress monitoring for the scoring, no matter how approximate it might be?

Thanks
francesco pietra





