Date: Wed, 1 Sep 2021 13:48:44 +0200

Hi Robin,

From the balance.txt you provided, I can see immediately that the long
step contains a lot of '6' characters, which correspond to either FOF or
SUBFIND (see logs.h for details). I therefore recommend turning them off
to see whether this solves your problem. The problem might also be
related to the force accuracy requirements you have set, which could
cause SUBFIND to run for an unnecessarily long time, so lowering them
might help as well.
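
If you want to put a number on how much of a step a given symbol takes
up, a small Python sketch along the following lines could tally the
characters. It assumes you pass it only the rows of symbols belonging to
one step; the function name and the sample rows are made up for
illustration and are not taken from a real balance.txt. Which symbol
belongs to which routine should still be checked against logs.h.

```python
from collections import Counter


def symbol_shares(symbol_rows):
    """Return the fraction of each non-whitespace character in the rows."""
    counts = Counter(
        ch for row in symbol_rows for ch in row if not ch.isspace()
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {ch: n / total for ch, n in counts.items()}


# Made-up rows for illustration only (not real balance.txt content).
sample_rows = ["::66666666", "6666..6666"]

shares = symbol_shares(sample_rows)
# The symbol that occupies the largest share of the step.
dominant = max(shares, key=shares.get)

for ch, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{ch!r}: {share:.0%}")
```

A symbol that dominates the long step but not the ordinary steps is the
natural first candidate to switch off.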


On 01.09.21 13:36, Robin Booth wrote:
> Hi Volker
> I noticed that my Gadget4 run appeared to "stall" for several hours
> periodically during a run, with these instances roughly correlating to
> requested restart file outputs.
> On further investigation, from inspection of the *balance.txt* file for
> example (see extract attached), it would appear that this is caused by
> the code performing a *Nsync-grv* on the entire particle set, including
> an expensive fof tree walk.  This pushes up the CPU step time from
> typically 4 minutes to around 8 hours!  Does that seem reasonable to
> you? I assume that this process is necessary for the restart files to
> record the particle parameters for all particles at the same timestep.
> If this is indeed the correct restart file behaviour, then my
> conclusion would be that it is extremely counter-productive to request
> restart file output too frequently during a simulation run,
> particularly where I am running the simulation in time-limited
> "chunks".  My understanding from the documentation is that a restart
> file will in any case be generated automatically when the run time
> approaches the limit set by the *TimeLimitCPU* parameter.  Would you
> agree with this conclusion?
> Regards
> Robin
