Re: Segmentation fault in tree calculation for multiple node runs

From: Volker Springel <vspringel_at_MPA-Garching.MPG.DE>
Date: Fri, 29 Jan 2021 16:00:00 +0100

Dear Ken,

I agree with you, this is likely the same problem regardless of whether you use the Cray, GNU or Intel compiler. You may get different lines for the segfault simply because the optimizer stages of the compilers rearrange some of the statements in different ways. Using "-O0" (just for debugging purposes, to largely avoid optimizer mangling) might give a uniform place for the crash.

In any case, it looks like the crash happens when the code tries to access either the NodeIndex or the TopNodes array. Both are stored only once per node when you use multiple nodes, and ranks access the data directly from other ranks via shared memory. For some reason, this doesn't work for you.

It could be that the calls of
MPI_Win_allocate_shared()
MPI_Win_shared_query()
used in ~/src/data/mymalloc.cc are buggy in MPICH-3.2 (which is from November 2015). This wouldn't be a huge surprise, as these features were quite new at the time and back then rather rarely used, I think.

I've compiled MPICH 3.2 from
https://www.mpich.org/static/tarballs/3.2/
on our cluster and tried to use the code with this. Unfortunately, it doesn't start up properly at all on multi-node runs...

I have also tried compiling the current version, mpich-3.4.1, from
https://www.mpich.org/static/tarballs/3.4.1/
for our cluster instead. This worked fine with Gadget4, also for multi-node runs.

So in conclusion, I can only recommend using a more recent MPI library, and compiling either mpich-3.4.1 or OpenMPI yourself if they are not available on your cluster.

Best,
Volker

> On 27. Jan 2021, at 18:16, Ken Osato <ken.osato_at_iap.fr> wrote:
>
>
> Dear Volker,
>
> Thank you for your help.
>> At the moment I cannot yet reproduce it, but it smells like it is related to the shared memory allocation.
> As you may have already noticed, I suspect the problem is caused by the MPI library or the compiler. It seems that Gadget-4 is tested with OpenMPI but it might fail with specific versions of MPICH.
>
> There are three programming environments (cray, gnu, intel; for compilers, crayc++, mpicxx, mpiicpc) on my cluster, but all of them fail with a segmentation fault.
> The strange thing is that only with the Cray compiler does the segmentation fault occur at a different point. (GDB log is attached below.) But I think this error is caused by the same problem.
> All three environments utilize the MPI library of Cray Message Passing Toolkit (MPT) v7.7.0, which is based on ANL MPICH 3.2, with Cray Compiling Environment v8.6.5.
>
> Best,
> Ken
>
>
> Core was generated by `./Gadget4 param.txt'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0 0x00000000004e7bbc in tree<gravnode, simparticles, gravpoint_data, foreign_gravpoint_data>::treebuild_construct (this=0x7ffffffd8170)
> at src/tree/tree.cc:318
> 318 int index = NodeIndex[i];
> (gdb) bt
> #0 0x00000000004e7bbc in tree<gravnode, simparticles, gravpoint_data, foreign_gravpoint_data>::treebuild_construct (this=0x7ffffffd8170)
> at src/tree/tree.cc:318
> #1 0x00000000004e2be0 in tree<gravnode, simparticles, gravpoint_data, foreign_gravpoint_data>::treebuild (this=0x7ffffffd8170, ninsert=28129,
> indexlist=0x0) at src/tree/tree.cc:75
> #2 0x00000000004b59e5 in sim::gravity (this=0x7ffffffd7340, timebin=0) at src/gravity/gravity.cc:226
> #3 0x00000000004b6864 in sim::compute_grav_accelerations (this=0x7ffffffd7340, timebin=0) at src/gravity/gravity.cc:110
> #4 0x00000000004a5110 in sim::do_gravity_step_second_half (this=0x7ffffffd7340) at src/time_integration/kicks.cc:379
> #5 0x0000000000424a2a in sim::run (this=0x7ffffffd7340) at src/main/run.cc:149
> #6 0x000000000041bc6b in main (argc=2, argv=0x7fffffff6008) at src/main/main.cc:327
> (gdb) f 0
> #0 0x00000000004e7bbc in tree<gravnode, simparticles, gravpoint_data, foreign_gravpoint_data>::treebuild_construct (this=0x7ffffffd8170)
> at src/tree/tree.cc:318
> 318 int index = NodeIndex[i];
> (gdb) list
> 313 Father = (int *)Mem.mymalloc_movable(&Father, "Father", (MaxPart + NumPartImported) * sizeof(int));
> 314
> 315 /* now put in markers ("pseudo" particles) in top-leaf nodes to indicate on which task the branch lies */
> 316 for(int i = 0; i < D->NTopleaves; i++)
> 317 {
> 318 int index = NodeIndex[i];
> 319
> 320 if(TreeSharedMem_ThisTask == 0)
> 321 TopNodes[index].nextnode = MaxPart + MaxNodes + i;
> 322
>
>
> On 26/01/2021 17:50, Volker Springel wrote:
>> Dear Ken,
>>
>> Thanks a lot for reporting this problem. At the moment I cannot yet reproduce it, but it smells like it is related to the shared memory allocation.
>>
>> Could you let me know which MPI library (and which version) you're using? (Are there several MPI libraries on your system that you could try as well?) Which compiler are you using? (In case you don't know, the outputs of "which mpicc", "mpicc -v", and "ldd ./Gadget-4" should give some pointers)
>>
>> Best,
>> Volker
>>
>>
>>
>>> On 24. Jan 2021, at 16:25, Ken Osato <ken.osato_at_iap.fr> wrote:
>>>
>>> Dear Gadget community,
>>>
>>> I'm working on running dark-matter only cosmological simulations with Gadget-4.
>>> When I run the code with the same Config.sh and param.txt as the example "DM-L50-N128", it runs perfectly on a single node, but on multiple nodes it fails with a segmentation fault.
>>> I have been using L-Gadget-2 on the same cluster and never encountered such an error.
>>> I analyzed the core file, and it says the segmentation fault occurs in the tree calculation. I suspect something goes wrong in the memory allocation when there are multiple shared-memory domains.
>>>
>>> I've attached the log file from a run with the "DM-L50-N128" example settings on a Cray XC50 with 2 nodes (= 80 cores), and the GDB output follows below. Any help or suggestions are welcome. Thank you.
>>>
>>> Best regards,
>>> Ken Osato
>>>
>>>
>>> /* GDB outputs */
>>> Core was generated by `./Gadget4 param.txt'.
>>> Program terminated with signal SIGSEGV, Segmentation fault.
>>> #0 0x00000000004b5a5e in tree<gravnode, simparticles, gravpoint_data, foreign_gravpoint_data>::treebuild_construct (this=0x7fffffff3870) at src/tree/tree.cc:324
>>> 324 Nextnode[MaxPart + i] = TopNodes[index].sibling;
>>> (gdb) bt
>>> #0 0x00000000004b5a5e in tree<gravnode, simparticles, gravpoint_data, foreign_gravpoint_data>::treebuild_construct (this=0x7fffffff3870) at src/tree/tree.cc:324
>>> #1 0x00000000004b0878 in tree<gravnode, simparticles, gravpoint_data, foreign_gravpoint_data>::treebuild (this=0x7fffffff3870, ninsert=24242, indexlist=0x0) at src/tree/tree.cc:75
>>> #2 0x000000000048ac97 in sim::gravity (this=0x7fffffff2a40, timebin=0) at src/gravity/gravity.cc:226
>>> #3 0x000000000048b8e5 in sim::compute_grav_accelerations (this=0x7fffffff2a40, timebin=0) at src/gravity/gravity.cc:110
>>> #4 0x000000000047f4ea in sim::do_gravity_step_second_half (this=0x7fffffff2a40) at src/time_integration/kicks.cc:379
>>> #5 0x000000000041911a in sim::run (this=0x7fffffff2a40) at src/main/run.cc:149
>>> #6 0x000000000041631a in main (argc=2, argv=0x7fffffff58f8) at src/main/main.cc:327
>>> (gdb) f 0
>>> #0 0x00000000004b5a5e in tree<gravnode, simparticles, gravpoint_data, foreign_gravpoint_data>::treebuild_construct (this=0x7fffffff3870) at src/tree/tree.cc:324
>>> 324 Nextnode[MaxPart + i] = TopNodes[index].sibling;
>>> (gdb) list
>>> 319
>>> 320 if(TreeSharedMem_ThisTask == 0)
>>> 321 TopNodes[index].nextnode = MaxPart + MaxNodes + i;
>>> 322
>>> 323 /* set nextnode for pseudo-particle (Nextnode exists on all ranks) */
>>> 324 Nextnode[MaxPart + i] = TopNodes[index].sibling;
>>> 325 }
>>> 326
>>> 327 point_data *export_Points = (point_data *)Mem.mymalloc("export_Points", NumPartExported * sizeof(point_data));
>>> 328
>>>
>>> --
>>> Ken Osato
>>> Institut d'Astrophysique de Paris
>>> 98bis boulevard Arago, 75014 Paris, France
>>> Tel: +33 1 44 32 80 00
>>> E-mail: ken.osato_at_iap.fr
>>>
>>> <DM-L50-N128.log>
>>> -----------------------------------------------------------
>>>
>>> If you wish to unsubscribe from this mailing, send mail to
>>> minimalist_at_MPA-Garching.MPG.de with a subject of: unsubscribe gadget-list
>>> A web-archive of this mailing list is available here:
>>> http://www.mpa-garching.mpg.de/gadget/gadget-list
>>
>
> --
> Ken Osato
> Institut d'Astrophysique de Paris
> 98bis boulevard Arago, 75014 Paris, France
> Tel: +33 1 44 32 80 00
> E-mail: ken.osato_at_iap.fr
>
Received on 2021-01-29 16:00:00

This archive was generated by hypermail 2.3.0 : 2022-09-01 14:03:43 CEST