Re: Lightcone file output

From: Volker Springel <vspringel_at_MPA-Garching.MPG.DE>
Date: Wed, 18 Aug 2021 12:13:12 +0200

Hi Robin,

> On 17. Aug 2021, at 16:00, Robin Booth <robin.booth_at_sussex.ac.uk> wrote:
>
> Nevertheless, I have recompiled Gadget4 with the latest version of all the source code files, using the same config options and parameter file settings, and attempted to rerun from the restart files generated up to the point of the crash. Unfortunately, this has not proved successful in that the newly compiled version crashes when loading the restart files created by the previous version.

> My assumption is that some code change since the previous build (from December 2020 code version) has resulted in a change in the restart file binary format such that the newer version is no longer compatible. From your knowledge of the code changes, is that a likely explanation?

Yes, restart files can easily become invalid if one changes the compile-time config options, or if one makes code changes that modify the size of some of the structures used by the code. They are also not necessarily portable between different compilers/architectures.

In this case a code change is the explanation: on Feb 13, in changeset e3e352d, I removed one line (line 62) from src/logs/timer.h, in order "to remove obsolete CPU_NGBTREEBUILD timer and 'ngbtreebuild' cpu-time output".

Removing this line changed the size of the CPU timing data that is collected. Since this data is stored in the restart files, their binary format changed as well. In principle you could resurrect your old restart files by adding this line again.

> Returning to the issue of 32 bit particle indexing, I noted an issue in the snap_io.cc file which causes the loading of IC files to fail. My specific use-case is somewhat unusual in that it involves the loading of IC files in Gadget2 format, containing 2048^3 particles. In line 758 of snap_io.cc:
> #ifdef GADGET2_HEADER
>   for(int i = 0; i < NTYPES_HEADER; i++)
>     if(header.npartTotalLowWord[i] > 0)
>       header.npartTotal[i] = header.npartTotalLowWord[i];  //+ (((long long)header.npartTotalHighWord[i]) << 32);
> #endif
>
> the expression that handles the high-order word has been, for some reason, commented out, and hence does not handle the case where npartTotal > 2^32.
> The fix for my case was obviously easy enough, by removing the commenting-out of npartTotalHighWord, and also commenting-out the if() line. I can't off-hand envisage a situation where this would not work for the general case, unless the issue relates to the multiple variants of the Gadget2 header that evolved over time.

The handling of "npartTotalHighWord" was indeed commented out to make it possible to load even older Gadget IC files, in which npartTotalHighWord wasn't defined yet. The existence of npartTotalHighWord is of course an ugly kludge. If you insist on using such old legacy files, I'm afraid you sometimes have to intervene manually, as you have done successfully. Using HDF5 protects you from many of these incompatibilities, or at least gives you clear error messages that hint at what the problem is.

Regards,
Volker
Received on 2021-08-18 12:13:14
