Re: [Gadget 4] possibly a bug in pm_nonperiodic.cc after snapshot was saved

From: Volker Springel <vspringel_at_MPA-Garching.MPG.DE>
Date: Wed, 12 May 2021 14:07:58 +0200

Hi Weiguang,

> On 11. May 2021, at 11:07, Weiguang Cui <cuiweiguang_at_gmail.com> wrote:
>
> The last thing, I set the simulation to run to the future. It ran fine but failed in saving the last snapshot:
> ```
> Final time=1.5 reached. Simulation ends.
>
> SNAPSHOT: writing snapshot file #128 @ time 1.5 ...
> SNAPSHOT: writing snapshot file: './snapshot_128' (file 1 of 1)
> SNAPSHOT: writing snapshot block 0 (Coordinates)...
> SNAPSHOT: writing snapshot block 1 (Velocities)...
> SNAPSHOT: writing snapshot block 2 (ParticleIDs)...
> SNAPSHOT: writing snapshot block 3 (Masses)...
> SNAPSHOT: writing snapshot block 7 (SubfindDensity)...
> [miclap:457496] *** Process received signal ***
> [miclap:457496] Signal: Segmentation fault (11)
> [miclap:457496] Signal code: Address not mapped (1)
> [miclap:457496] Failing at address: 0x38
> [miclap:457496] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x153c0)[0x7fc6e1d7f3c0]
> [miclap:457496] [ 1] Gadget4(+0x44540)[0x564037866540]
> [miclap:457496] [ 2] Gadget4(+0x4b1f0)[0x56403786d1f0]
> [miclap:457496] [ 3] Gadget4(+0x4ccf9)[0x56403786ecf9]
> [miclap:457496] [ 4] Gadget4(+0x3d231)[0x56403785f231]
> [miclap:457496] [ 5] Gadget4(+0x265c3)[0x5640378485c3]
> [miclap:457496] [ 6] Gadget4(+0x12f32)[0x564037834f32]
> [miclap:457496] [ 7] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7fc6e1b9f0b3]
> [miclap:457496] [ 8] Gadget4(+0x14ace)[0x564037836ace]
> [miclap:457496] *** End of error message ***
> ```
> Restarting the run from the restart files didn't help, but this could still be a problem with the server. Please let me know if you don't think so and would like me to open this issue in another thread.
>

Thanks for pointing out this crash. It happened because the code tries to write an extra snapshot at the final time even when no output was explicitly requested for that time (this is always done, so that a final dump exists even if one simply forgot to specify it). In this case, however, the group finder was not run before the snapshot was written to disk, so special options like SUBFIND_STORE_LOCAL_DENSITY caused a crash (the SubfindDensity block is the last one attempted in your log).

I have now changed src/main/run.cc slightly so that this crash should no longer occur.
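
For illustration only, the failure mode can be sketched as below. This is not the actual code in src/main/run.cc, and the real change may look quite different; `SimState`, `run_group_finder`, `write_snapshot` and `write_final_snapshot` are hypothetical names invented for this sketch. It shows a snapshot writer that needs a field produced by the group finder, and one possible guard that runs the group finder first when the final dump requests that field.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the simulation state (not Gadget-4's real layout).
struct SimState
{
  std::size_t num_particles = 0;
  bool group_finder_done    = false;   // set once the group finder has run
  std::vector<double> subfind_density; // only valid if group_finder_done is true
};

// Hypothetical stand-in for the group finder: fills the per-particle density.
void run_group_finder(SimState &sim)
{
  sim.subfind_density.assign(sim.num_particles, 1.0);
  sim.group_finder_done = true;
}

// Returns the names of the blocks that would be written.
// Throws instead of touching data that was never computed.
std::vector<std::string> write_snapshot(const SimState &sim, bool store_local_density)
{
  std::vector<std::string> blocks = {"Coordinates", "Velocities", "ParticleIDs", "Masses"};
  if(store_local_density)
    {
      if(!sim.group_finder_done)
        throw std::runtime_error("SubfindDensity requested, but the group finder has not run");
      blocks.push_back("SubfindDensity");
    }
  return blocks;
}

// One possible guard for the extra dump at the final time (an assumption,
// not the actual fix): run the group finder first if its output is needed.
std::vector<std::string> write_final_snapshot(SimState &sim, bool store_local_density)
{
  if(store_local_density && !sim.group_finder_done)
    run_group_finder(sim);
  return write_snapshot(sim, store_local_density);
}
```

In this sketch, calling `write_snapshot` directly with the density option enabled and no prior group-finder run raises an error, which mirrors the crash in a controlled way, while `write_final_snapshot` produces all five blocks.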

Best,
Volker
Received on 2021-05-12 14:07:59
