Re: Error saving lightcone file

From: Volker Springel <vspringel_at_MPA-Garching.MPG.DE>
Date: Tue, 10 Aug 2021 19:49:07 +0200

Hi Robin,

I'm not really sure what you have configured for lightcone_01, or what the log file of the run looks like before the crash happened... I would have to look at the full log file to better understand what happened here.

One possible cause of the crash is that one of your HDF5 files contains more than 2 billion particles. For HDF5 files this is in principle permissible (but not for formats 1 and 2). However, the code still contained a small 32-bit overflow issue in the I/O routine that prevented this in practice. This is fixed in the newest version.
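
As an illustration of the kind of 32-bit overflow mentioned above, here is a minimal Python sketch. It is not the actual GADGET-4 I/O code; the function names and the particle count of 2.5 billion are hypothetical and chosen only to show what happens when a per-file particle count is stored in a signed 32-bit integer.

```python
import ctypes

# Largest value a signed 32-bit integer can hold (about 2.147 billion).
INT32_MAX = 2**31 - 1


def as_int32(n):
    """Return the value n takes after being stored in a signed 32-bit integer.

    ctypes performs no overflow checking, so out-of-range values wrap around.
    """
    return ctypes.c_int32(n).value


def fits_in_int32(n):
    """True if a non-negative count n can be stored in a signed 32-bit integer."""
    return 0 <= n <= INT32_MAX


# Hypothetical number of particles destined for a single output file.
particles_in_file = 2_500_000_000

if not fits_in_int32(particles_in_file):
    wrapped = as_int32(particles_in_file)
    print(f"{particles_in_file} particles do not fit in 32 bits; "
          f"a 32-bit counter would hold {wrapped} instead")
```

A count that wraps to a negative number in this way would make any buffer offset or loop bound derived from it invalid, which is consistent with a crash inside a buffer-filling routine. Assuming that is the cause here, writing the output across more files, so that each file stays below roughly 2.1 billion particles, would sidestep the problem on an older code version.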

Cheers,
Volker



> On 6. Aug 2021, at 11:49, Robin Booth <robin.booth_at_sussex.ac.uk> wrote:
>
> Hi Volker
>
> I am currently running a simulation which specifies a quadrant lightcone output commencing at redshift z = 1.4.
> The simulation runs fine from z = 50 up to the point where lightcone output is required, then crashes with this error on each core:
>
> BFD: Dwarf Error: found dwarf version '5', this reader only handles version 2, 3 and 4 information.
>
> with this traceback:
>
> ==== backtrace (tid: 241654) ====
> 0 0x0000000000059065 ucs_debug_print_backtrace() /cosma/local/software/ucx/ucx-1.10.1/src/ucs/debug/debug.c:656
> 1 0x0000000000440518 IO_Def::fill_write_buffer() ???:0
> 2 0x00000000004478f9 IO_Def::write_file() ???:0
> 3 0x00000000004487be IO_Def::write_multiple_files() ???:0
> 4 0x00000000004f9023 lightcone_particle_io::lightcone_save() ???:0
> 5 0x000000000041a066 sim::create_snapshot_if_desired() ???:0
> 6 0x000000000041ca55 sim::run() ???:0
> 7 0x000000000040cd38 main() ???:0
> 8 0x0000000000022555 __libc_start_main() ???:0
> 9 0x000000000040dd6f _start() ???:0
> =================================
>
> The first lightcone folder and files (lightcone_00) are created successfully, but do not contain particle data as far as I can see. The second set of files (lightcone_01) is created and contains particle data (I assume, as they are > 8 x 32 Gbytes in size). However, they are corrupted in that h5dump is unable to parse them, so I presume that the crash occurs whilst writing to these files.
>
> One further piece of information that may or may not be relevant: the code was inadvertently compiled with the SUBFIND_STORE_LOCALDENSITY flag set, so that output snapshots (and presumably the lightcone snapshots) contain the additional datasets associated with this setting.
>
> Any suggestions?
>
> Regards
>
> Robin