Re: MPI_Sendrecv error in Gadget2

From: Michele Trenti <trenti_at_stsci.edu>
Date: Mon, 9 Oct 2006 10:14:22 -0400 (EDT)

Hi Volker,

thanks a lot for your insight.

In the days after posting the message I continued to investigate the
problem, and with the help of the computer administrator I have now
tracked down its origin. It turns out to be the firewall blocking MPI
communication: the network security manager here at STScI had opened
only a small range of ports for MPI processes, which I was passing to
MPICH2 via the PORT_RANGE variable.

Everything worked nicely for small runs, but as the number of particles
grew, all the allowed ports were eventually in use; MPICH2 then started
using ports outside the permitted range, and those packets were silently
dropped by the firewall.
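A rough back-of-the-envelope check makes this failure mode concrete. The
variable name MPICH_PORT_RANGE and the 50000:50100 range below are
illustrative assumptions, not the actual STScI configuration; the point is
only that an all-to-all job with N ranks can need up to N*(N-1)/2 socket
pairs, which the firewall's range must accommodate:

```shell
# Hypothetical firewall-approved port range (assumption, not STScI's).
export MPICH_PORT_RANGE=50000:50100

# An 18-rank job doing pairwise MPI_Sendrecv exchanges can open up to
# N*(N-1)/2 point-to-point connections, each holding a port.
NRANKS=18
NEEDED=$(( NRANKS * (NRANKS - 1) / 2 ))                         # 153 pairs
WIDTH=$(( ${MPICH_PORT_RANGE#*:} - ${MPICH_PORT_RANGE%:*} + 1 )) # 101 ports

if [ "$WIDTH" -lt "$NEEDED" ]; then
    echo "port range too small: $WIDTH < $NEEDED"
else
    echo "port range ok: $WIDTH >= $NEEDED"
fi
```

With these numbers the check fails, matching the observed behaviour: small
runs fit inside the range, larger ones spill past it.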

Best regards,

Michele


Michele Trenti
Space Telescope Science Institute
3700 San Martin Drive, Baltimore MD 21218, U.S.
Phone: +1 410 338 4987
Fax: +1 410 338 4767


" We shall not cease from exploration
   And the end of all our exploring
   Will be to arrive where we started
   And know the place for the first time. "

                                      T. S. Eliot


On Mon, 9 Oct 2006, Volker Springel wrote:

>
> Michele Trenti wrote:
>> Hi,
>>
>> I am stuck with an MPI error in MPI_Sendrecv when I try to run "large" N
>> (400^3 particles on 18 CPUs) simulations with Gadget2. The error
>> (reported in the log below) appears after the tree force evaluation in
>> the first step. Smaller runs (e.g. 256^3) complete smoothly in a few
>> hours.
>>
>> My understanding (see the error log below) is that if the CPU time
>> needed to compute the tree force is too long (i.e. of the order of a few
>> minutes), the connection between the nodes is killed. However, the MPI
>> ring stays up, i.e. I can run mpitrace and mpiringtest with no trouble
>> after I get the error message from the Gadget run. Also, I have never
>> been kicked out of an open ssh session to a node, even after staying
>> idle for days.
>>
>> I am using a linux cluster of 9 Sun Opteron Dual CPU hosts with
>> MPICH2-1.0.4.
>>
>> Has anyone encountered a similar problem and/or can share some insight on
>> possible solutions?
>
>
> Hi Michele,
>
> This is a peculiar problem, I agree. I'm not sure that the 'time-out'
> explanation is the right one though. Normally, MPI should keep waiting.
>
> From your log-file, it looks as if Cpu 11 has been brought down by a
> KILL-signal (signal 9). This could arise if the machine was out of
> memory and received this signal from the operating system. You could try
> to monitor memory usage of the code with a tool like 'top' to see how
> close you are to exhausting the physical memory of the machine(s).
>
> Volker
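For monitoring memory headroom as Volker suggests, an interactive top
session is awkward to keep open on every node during a long run. A minimal
scriptable sketch, assuming a Linux node with /proc/meminfo (MemAvailable
needs a reasonably recent kernel, so MemFree is used as a fallback):

```shell
# Print available physical memory in MByte on this node; run it
# periodically (e.g. from cron or a loop over ssh) during the simulation
# to see how close the run gets to exhausting RAM before the OOM killer
# would send signal 9.
free_kb=$(awk '/^MemAvailable:|^MemFree:/ {print $2; exit}' /proc/meminfo)
free_mb=$(( free_kb / 1024 ))
echo "free memory: ${free_mb} MByte"
```

If the reported headroom drops toward zero just before the crash, that
would support the out-of-memory explanation for the KILL signal.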
>
>
>>
>> Thanks a lot for your help,
>>
>> Michele
>>
>> -------------------------------------------------------------------
>> This is Gadget, version `2.0'.
>>
>> Running on 18 processors.
>>
>> found 14 times in output-list.
>>
>> Allocated 40 MByte communication buffer per processor.
>>
>> Communication buffer has room for 953250 particles in gravity computation
>> Communication buffer has room for 327680 particles in density computation
>> Communication buffer has room for 262144 particles in hydro computation
>> Communication buffer has room for 243854 particles in domain decomposition
>>
>>
>> Hubble (internal units) = 0.1
>> G (internal units) = 43007.1
>> UnitMass_in_g = 1.989e+43
>> UnitTime_in_s = 3.08568e+16
>> UnitVelocity_in_cm_per_s = 100000
>> UnitDensity_in_cgs = 6.76991e-22
>> UnitEnergy_in_cgs = 1.989e+53
>>
>> Task=0 FFT-Slabs=15
>> Task=1 FFT-Slabs=15
>> Task=2 FFT-Slabs=15
>> Task=3 FFT-Slabs=15
>> Task=4 FFT-Slabs=15
>> Task=5 FFT-Slabs=15
>> Task=6 FFT-Slabs=15
>> Task=7 FFT-Slabs=15
>> Task=8 FFT-Slabs=15
>> Task=9 FFT-Slabs=15
>> Task=10 FFT-Slabs=15
>> Task=11 FFT-Slabs=15
>> Task=12 FFT-Slabs=15
>> Task=13 FFT-Slabs=15
>> Task=14 FFT-Slabs=15
>> Task=15 FFT-Slabs=15
>> Task=16 FFT-Slabs=15
>> Task=17 FFT-Slabs=1
>>
>> Allocated 434.028 MByte for particle storage. 80
>>
>>
>> reading file `./ic/GadgetSnapshot_000' on task=0 (contains 64000000 particles.)
>> distributing this file to tasks 0-17
>> Type 0 (gas): 0 (tot= 0000000000) masstab=0
>> Type 1 (halo): 64000000 (tot= 0064000000) masstab=0.057824
>> Type 2 (disk): 0 (tot= 0000000000) masstab=0
>> Type 3 (bulge): 0 (tot= 0000000000) masstab=0
>> Type 4 (stars): 0 (tot= 0000000000) masstab=0
>> Type 5 (bndry): 0 (tot= 0000000000) masstab=0
>>
>> reading done.
>> Total number of particles : 0064000000
>>
>> allocated 0.0762939 Mbyte for ngb search.
>>
>> Allocated 321.943 MByte for BH-tree. 64
>>
>> domain decomposition...
>> NTopleaves= 512
>> work-load balance=1.02083 memory-balance=1.02083
>> exchange of 0060408642 particles
>> exchange of 0026642403 particles
>> exchange of 0005557184 particles
>> exchange of 0000825654 particles
>> domain decomposition done.
>> begin Peano-Hilbert order...
>> Peano-Hilbert done.
>> Begin Ngb-tree construction.
>> Ngb-Tree contruction finished
>>
>> Setting next time for snapshot file to Time_next= 0.0243902
>>
>>
>> Begin Step 0, Time: 0.0123457, Redshift: 80, Systemstep: 0, Dloga: 0
>> domain decomposition...
>> NTopleaves= 512
>> work-load balance=1.02083 memory-balance=1.02083
>> domain decomposition done.
>> begin Peano-Hilbert order...
>> Peano-Hilbert done.
>> Start force computation...
>> Starting periodic PM calculation.
>>
>> Allocated 17.6798 MByte for FFT data.
>>
>> done PM.
>> Tree construction.
>> Tree construction done.
>> Begin tree force.
>> tree is done.
>> Begin tree force.
>> [cli_16]: aborting job:
>> Fatal error in MPI_Sendrecv: Other MPI error, error stack:
>> MPI_Sendrecv(217).........................: MPI_Sendrecv(sbuf=0x2a97d77f08, scount=488032, MPI_BYTE, dest=11, stag=18, rbuf=0x2a98d9b2e8, rcount=623888, MPI_BYTE, src=11, rtag=18, MPI_COMM_WORLD, status=0x7fbfffee00) failed
>> MPIDI_CH3_Progress_wait(217)..............: an error occurred while handling an event returned by MPIDU_Sock_Wait()
>> MPIDI_CH3I_Progress_handle_sock_event(608):
>> MPIDU_Socki_handle_pollhup(439)...........: connection closed by peer (set=0,sock=16)
>> [cli_11]: aborting job:
>> Fatal error in MPI_Sendrecv: Other MPI error, error stack:
>> MPI_Sendrecv(217).........................: MPI_Sendrecv(sbuf=0x2a98513668, scount=623888, MPI_BYTE, dest=16, stag=18, rbuf=0x2a990016b8, rcount=488032, MPI_BYTE, src=16, rtag=18, MPI_COMM_WORLD, status=0x7fbfffee00) failed
>> MPIDI_CH3_Progress_wait(217)..............: an error occurred while handling an event returned by MPIDU_Sock_Wait()
>> MPIDI_CH3I_Progress_handle_sock_event(608):
>> MPIDU_Socki_handle_pollhup(439)...........: connection closed by peer (set=0,sock=15)
>> rank 11 in job 1 udf2.stsci.edu_47530 caused collective abort of all ranks
>> exit status of rank 11: killed by signal 9
>> -------------------------------------------------------------------
>>
>>
>>
>>
>> -----------------------------------------------------------
>>
>> If you wish to unsubscribe from this mailing, send mail to
>> minimalist_at_MPA-Garching.MPG.de with a subject of: unsubscribe gadget-list
>> A web-archive of this mailing list is available here:
>> http://www.mpa-garching.mpg.de/gadget/gadget-list
Received on 2006-10-09 16:14:33

This archive was generated by hypermail 2.3.0 : 2022-09-01 14:03:41 CEST