Re: memory allocation issue

From: Daniel Pfenniger <Daniel.Pfenniger_at_unige.ch>
Date: Tue, 31 May 2011 12:59:55 +0200

Romain Charlassier wrote:
> Hi all,
>
> I'm trying to run the following simulation with Gadget-2 in Tree-SPH mode
> -2x512^3 particles
> -PMGRID=1024
> -box size L=60Mpc/h
> -starting at z=30, ending at z=2.2
> -smoothing length = 4 kpc/h
> -PartAllocFactor = 1.6
> -TreeAllocFactor= 0.8
> -Buffersize= 30
>
> I'm running it on 200 cores, each with 3 GB of memory. It works perfectly
> with 2x(256^3) particles on 64 cores, but in the 512^3 configuration it
> fails after several hundred iterations with the following message:
>
> [vbuf.c 192] Cannot register vbuf region (size 786432)
> Infiniband library: Cannot allocate memory
> srun: error: platine1026: task8: Segmentation fault (core dumped)
>

It seems to be an error occurring in the MPI library (probably MVAPICH).
In such a case, a fix could be to use another MPI library, such as Open MPI,
or to use another (newer) version of the same library.
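
For context on why the larger run may stress the library where the smaller
one did not, the per-task load of the two configurations can be compared
with a little arithmetic. This is only a rough sketch in Python: the particle
counts, task counts and PartAllocFactor are taken from the quoted message,
while the function name per_task_load and the assumption that particle
storage per task scales as PartAllocFactor times the mean particle count
per task are illustrative, not taken from the code.

```python
# Rough per-task particle load for the two runs described in the quoted message.
# Assumption (illustrative): storage per task ~ PartAllocFactor * (N_total / N_tasks).


def per_task_load(n_side, n_species, n_tasks, part_alloc_factor):
    """Return (mean particles per task, allocated particle slots per task)."""
    n_total = n_species * n_side ** 3      # e.g. 2 x 512^3
    mean = n_total / n_tasks               # average particles held by one task
    slots = int(part_alloc_factor * mean)  # head-room for load imbalance
    return mean, slots


if __name__ == "__main__":
    runs = [("2x256^3 on 64 cores", 256, 64), ("2x512^3 on 200 cores", 512, 200)]
    for label, n_side, n_tasks in runs:
        mean, slots = per_task_load(n_side, 2, n_tasks, 1.6)
        print(f"{label}: {mean:,.0f} particles/task, {slots:,} slots/task")
```

Under these assumptions each task in the 512^3 run holds about 2.56 times
as many particles as in the working 256^3 run, so memory per task is under
correspondingly more pressure.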

  Dan


> The outputs indicate that it fails around the tree construction or the
> domain decomposition step. I've tried adjusting the last three
> parameters in the list above (raising them to 3.0 / 1.2 / 100)
> without success.
>
> I've checked with valgrind in a much more modest configuration (64^3
> particles on 2 cores on my laptop) for any kind of memory leak, but
> everything seems OK from this point of view. Does anyone have a clue about
> what is going on?
>
> Thanks in advance,
>
> Romain Charlassier
> Post-doc in SPP - CEA Saclay (France)
Received on 2011-05-31 12:58:45
