Problem executing in Parallel

Questions regarding the compilation of VASP on various platforms: hardware, compilers and libraries, etc.


Moderators: Global Moderator, Moderator

Franky
Newbie
Posts: 33
Joined: Mon Apr 10, 2006 1:05 pm
License Nr.: Research Group E. Pehlke

#1 Post by Franky » Fri May 05, 2006 11:15 am

Hello everybody!

I am running vasp on a 64bit-Linux cluster (2xOpteron CPUs per Node). Compiler is pgf90 5.2-4, MPI Version is MPICH2-1.0.3, vasp version is 4.6.28.

The serial version of vasp runs fine.

The parallel version compiles and links against MPICH2, which I compiled myself using the configure options suggested in the vasp Makefile. The header mpif.h is copied to the vasp build directory. (Do I need to convert it to F90? Where is the tool "Convert"? It seems to work this way, though.) For testing purposes I turned the optimization off (vasp and mpich2 both built with -O0). Compiling with -O3 makes no difference, though.

Makefile setting:
FC=pgf90
FCL=mpif90
SCA=

After booting the MPI environment (mpdboot -f hosts) with just the local server (= 2 CPUs), I try to start vasp in parallel (mpiexec -np 2 ./vasp) with the following INCAR:
------------------------------------
System = Bulk-Au (fcc)
LPLANE = .TRUE.
NPAR = 2
LSCALU = .FALSE.
NSIM = 2

IALGO = 48
NBANDS = 8
ISMEAR = 1
SIGMA = 0.40
NELM = 5
-----------------------------------

That produces the following output on stdout:

[cli_0]: aborting job:
Fatal error in MPI_Cart_sub: Invalid communicator, error stack:
MPI_Cart_sub(194): MPI_Cart_sub(MPI_COMM_NULL, remain_dims=0xadda80, comm_new=0xd0f300) failed
MPI_Cart_sub(76).: Null communicator
[cli_1]: aborting job:
Fatal error in MPI_Cart_sub: Invalid communicator, error stack:
MPI_Cart_sub(194): MPI_Cart_sub(MPI_COMM_NULL, remain_dims=0xadda80, comm_new=0xd0f300) failed
MPI_Cart_sub(76).: Null communicator
rank 1 in job 7 rzcluster.rz.uni-kiel.de_43986 caused collective abort of all ranks
exit status of rank 1: return code 13
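
One quick way to rule out the INCAR itself is to check that NPAR is compatible with the number of MPI ranks. The sketch below is only an illustration, assuming the common rule of thumb that NPAR should divide the number of ranks passed to mpiexec; the helper names parse_incar and npar_divides_ranks are made up for this example and are not part of vasp.

```python
def parse_incar(text):
    """Parse simple 'KEY = value' lines of an INCAR into a dict of strings."""
    tags = {}
    for line in text.splitlines():
        # Drop trailing comments introduced by '#' or '!'.
        for marker in ("#", "!"):
            line = line.split(marker, 1)[0]
        if "=" not in line:
            continue  # separator lines, blank lines, etc.
        key, value = line.split("=", 1)
        tags[key.strip().upper()] = value.strip()
    return tags


def npar_divides_ranks(tags, n_ranks):
    """Return True if NPAR (when set) evenly divides the number of MPI ranks.

    Assumption: a missing NPAR is treated as 'nothing to check'.
    """
    if "NPAR" not in tags:
        return True
    npar = int(tags["NPAR"])
    return npar > 0 and n_ranks % npar == 0


# The INCAR from the post above, copied verbatim.
INCAR_TEXT = """\
System = Bulk-Au (fcc)
LPLANE = .TRUE.
NPAR = 2
LSCALU = .FALSE.
NSIM = 2

IALGO = 48
NBANDS = 8
ISMEAR = 1
SIGMA = 0.40
NELM = 5
"""

if __name__ == "__main__":
    tags = parse_incar(INCAR_TEXT)
    # Two ranks, as in 'mpiexec -np 2 ./vasp'.
    print("NPAR compatible with 2 ranks:", npar_divides_ranks(tags, 2))
```

With NPAR = 2 and two ranks the check passes, so under that assumption the INCAR is not the obvious culprit and the MPI build itself deserves a closer look.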

Does anybody know what went wrong?
I appreciate your help.

alex
Hero Member
Posts: 585
Joined: Tue Nov 16, 2004 2:21 pm
License Nr.: 5-67
Location: Germany

#2 Post by alex » Mon May 08, 2006 10:09 am

Hi Franky,

Have you got a 64-bit MPICH executable?

Hth
Alex

Franky
Newbie
Posts: 33
Joined: Mon Apr 10, 2006 1:05 pm
License Nr.: Research Group E. Pehlke

#3 Post by Franky » Mon May 08, 2006 12:39 pm

Hi alex,

compilation parameters are:
F77='pgf77 -Mx,119,0x200000'
F90='pgf90 -Mx,119,0x200000'
FFLAGS='-O0 -tp k8-64 -i8'
F90FLAGS='-O0 -tp k8-64 -i8'
-> export F77 F90 FFLAGS F90FLAGS
-> ./configure --prefix=... --without-romio --without-mpe
-> make && make install

I guess this should give me a 64bit MPICH executable.
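
Rather than guessing, the word size of a compiled binary can be read from its ELF header: byte 4 (EI_CLASS) is 1 for 32-bit and 2 for 64-bit objects. The following is a minimal sketch of such a check; the function names are made up for this example, and it only inspects the file header.

```python
import sys


def elf_class(header):
    """Return 32 or 64 for an ELF header, or None if the bytes are not ELF."""
    if len(header) < 5 or header[:4] != b"\x7fELF":
        return None  # e.g. a shell script or an 'ar' archive
    # EI_CLASS: 1 = ELFCLASS32, 2 = ELFCLASS64
    return {1: 32, 2: 64}.get(header[4])


def elf_class_of_file(path):
    """Read the first five bytes of a file and classify it."""
    with open(path, "rb") as fh:
        return elf_class(fh.read(5))


if __name__ == "__main__":
    # Usage: python elfclass.py ./vasp
    for path in sys.argv[1:]:
        print(path, elf_class_of_file(path))
```

Point it at a compiled binary such as ./vasp; wrapper scripts like mpif90 are not ELF files and give None. Note that a 64-bit ELF class says nothing about the default Fortran integer size selected by -i8, which would have to be checked separately on both the vasp and the MPICH side.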

Franky

lahaye
Jr. Member
Posts: 98
Joined: Fri Apr 14, 2006 5:08 am
Location: Suwon - Korea

#4 Post by lahaye » Mon May 15, 2006 2:57 am

Hello,

I ran into the same or similar MPI errors on a 32-bit Linux
cluster. I then tried compiling everything against
MPICH-1, which seemed to work fine, so I never looked
for a solution to the errors with MPICH2.

Rob.

Franky
Newbie
Posts: 33
Joined: Mon Apr 10, 2006 1:05 pm
License Nr.: Research Group E. Pehlke

#5 Post by Franky » Thu May 18, 2006 11:29 am

Hi,
Which version of mpich1 did you use? I also tried this and got an error concerning 'MAPSET'.
Could you give me your compiler options for mpich1 and the way you started the mpi environment?
Thank you.
