[Dock-fans] failure on dock.mpi
Alessandro Nascimento
al.s.nascimento at gmail.com
Tue Jul 10 12:51:32 PDT 2007
Hi Scott and Dock-fans,
Using the same mpich installation, I compiled pmemd (the particle mesh
Ewald program in amber 9). It now runs fine on my cluster across the
5 nodes (6 processors).
However, neither the dock parallel tests nor my own dock jobs run there.
For the record, my machines are 64-bit Intel Xeons (3.4 GHz) on a Gigabit
network (I don't know whether hardware matters for this kind of
problem).
As was mentioned, it may be an MPI problem rather than a dock problem. I
would just like to know whether anyone else has seen the same problem.
Thanks in advance one more time!!!!!
-alessandro
Initializing MPI Routines...
Initializing MPI Routines...
Initializing MPI Routines...
Initializing MPI Routines...
tail -f rigid.out
cluster_rmsd_threshold 2.0
num_clusterheads_for_rescore 5
num_secondary_scored_conformers_written 4
rank_primary_ligands no
rank_secondary_ligands no
Initializing Library File Routines...
Initializing Orienting Routines...
Initializing Grid Score Routines...
Reading the energy grid from grid.nrg
[cli_1]: aborting job:
Fatal error in MPI_Recv: Other MPI error, error stack:
MPI_Recv(186).............................: MPI_Recv(buf=0x7fffffc48f1c, count=2, MPI_INT, src=0, tag=100, MPI_COMM_WORLD, status=0x7fffffc46620) failed
MPIDI_CH3_Progress_wait(212)..............: an error occurred while handling an event returned by MPIDU_Sock_Wait()
MPIDI_CH3I_Progress_handle_sock_event(413):
MPIDU_Socki_handle_read(633)..............: connection failure (set=0,sock=1,errno=104:Connection reset by peer)
rank 0 in job 6 ruska_34405 caused collective abort of all ranks
exit status of rank 0: killed by signal 11
-----------------------------------
Molecule: ZINC00000012
Anchors: 1
Orientations: 120000
Conformations: 2375
Primary Score
[2] Exit 11 /home/apps/mpich2/bin/mpiexec -n 5 /home/apps/dock/dock6/bin/dock6.mpi -i rigid.in -o rigid.out </dev/null
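
The two numeric codes in the error stack above (the errno on the broken socket and the signal that killed rank 0) can be pulled out and named with a short script. This is only a sketch: the LOG string is copied from the output above, the function name summarize is made up for illustration, and the signal name is whatever the local platform assigns to that number (on Linux, signal 11 is SIGSEGV).

```python
import re
import signal

# Two lines copied from the MPI output above.
LOG = """\
(set=0,sock=1,errno=104:Connection reset by peer)
exit status of rank 0: killed by signal 11
"""


def summarize(log):
    """Return (errno_number, errno_text, rank, signal_name) found in the log."""
    err = re.search(r"errno=(\d+):([^)]+)\)", log)
    sig = re.search(r"exit status of rank (\d+): killed by signal (\d+)", log)
    if err is None or sig is None:
        raise ValueError("log does not contain the expected errno/signal lines")
    errno_number = int(err.group(1))
    errno_text = err.group(2).strip()
    rank = int(sig.group(1))
    # signal.Signals maps the number to its symbolic name on this platform.
    signal_name = signal.Signals(int(sig.group(2))).name
    return errno_number, errno_text, rank, signal_name


if __name__ == "__main__":
    print(summarize(LOG))
```

If rank 0 really died from a segmentation fault, the MPI_Recv failure reported by the other rank is probably a consequence of the lost connection rather than the root cause, so the crash on rank 0 is likely the place to look first.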
On 7/9/07, Alessandro Nascimento <al.s.nascimento at gmail.com> wrote:
> On 7/9/07, Scott Brozell <sbrozell at scripps.edu> wrote:
> > The error above is a generic one from MPI.
> > Did you look for clues in the dock output ?
>
> Yes, I did, but there is no information. The output file is truncated.
> One molecule was docked and the file finished without any
> information....
>
> > What happens with 1, 2, or 3 numbers of processors ?
> >
>
> With 3 and 2 processors the same error occurs at the same point
> (while the first molecule is being docked). On a single processor
> (mpirun -np 1 dock.mpi) the job is running fine (at least so far)... 3
> molecules have already been docked.
>
> > Did the dock6/install/test/mpi quality control regression test pass ?
> > Retry it:
> > cd dock6/install/test/mpi
> > make clean
> > make
>
> I did it again and pasted the results below. It failed, but without any
> mpi messages (to the best of my knowledge, at least ;) !!!! )
>
> >
> > Can you run other MPI programs ?
> >
> > Scott
> >
>
> I run amber over lam-mpi. I'm not experienced with mpich2. Maybe I
> am doing something wrong!
>
> From this poor report, do you have any idea of what might be happening?
>
> Thanks a lot, one more time!
>
>
>
> make[1]: Entering directory `/home/apps/dock/dock6/install/test/mpi'
> cd ../grid_generation && make test
> make[2]: Entering directory `/home/apps/dock/dock6/install/test/grid_generation'
> # Construct box to enclose spheres
> ../../../bin/showbox < box.in > /dev/null
> ../dockdif box.pdb.save box.pdb
> diffing box.pdb.save with box.pdb
> PASSED
> ==============================================================
> # Compute scoring grids
> ../../../bin/grid -i grid.in -o grid.out
> ../dockdif grid.out.save grid.out
> diffing grid.out.save with grid.out
> PASSED
> ==============================================================
> make[2]: Leaving directory `/home/apps/dock/dock6/install/test/grid_generation'
> make[1]: Leaving directory `/home/apps/dock/dock6/install/test/mpi'
> make[1]: Entering directory `/home/apps/dock/dock6/install/test/mpi'
>
> Processing test mpi
> /home/apps/mpich2/bin/mpirun -np 2 ../../../bin/dock6.mpi -i mpi.dockin -o mpi.dockmpiout
> Initializing MPI Routines...
> Initializing MPI Routines...
> ../dockdif -t 8 mpi.dockmpiout.save mpi.dockmpiout
> diffing mpi.dockmpiout.save with mpi.dockmpiout
> possible FAILURE: check mpi.dockmpiout.dif
> ==============================================================
> diffing mpi_ranked.mol2.save with mpi_ranked.mol2
> possible FAILURE: check mpi_ranked.mol2.dif
> ==============================================================
> make[1]: Leaving directory `/home/apps/dock/dock6/install/test/mpi'
>
>
> more mpi.dockmpiout.dif :
>
>
> 106c106
> < Conformations: 3
> ---
> > Conformations: 4
> 110a111,118
> > Molecule: ZINC01555236
> > Anchors: 1
> > Orientations: 50
> > Conformations: 18
> > Grid Score: 501606144.
> > vdw: 501606144.
> > es: -10.
> > -----------------------------------
> 114,117c122,125
> < Conformations: 6
> < Grid Score: 21.
> < vdw: 19.
> < es: 1.
> ---
> > Conformations: 7
> > Grid Score: -12.
> > vdw: -12.
> > es: -0.
> 123,125c131,133
> < Grid Score: 19.
> < vdw: 20.
> < es: -1.
> ---
> > Grid Score: 30.
> > vdw: 32.
> > es: -2.
> 130,133c138,141
> < Conformations: 48
> < Grid Score: -15.
> < vdw: -13.
> < es: -1.
> ---
> > Conformations: 44
> > Grid Score: -12.
> > vdw: -12.
> > es: -0.
> 138c146
> < Conformations: 12
> ---
> > Conformations: 16
> 146,148c154,156
> < Conformations: 6
> < Grid Score: 145519.
> < vdw: 145519.
> ---
> > Conformations: 2
> > Grid Score: 168506.
> > vdw: 168506.
> 154c162
> < Conformations: 28
> ---
> > Conformations: 29
>
>
>
> --
> [ ]s
>
> --alessandro
>
--
[ ]s
--alessandro