Re: Error from function deal_with_sph_node_request

From: Leonard Romano <>
Date: Fri, 22 Oct 2021 22:38:13 +0200

Hello Julianne,

I am also using Grackle for cooling, and when I enable star formation I
encounter the same error. What bugs me most is that it seems to happen
at random: sometimes after only a few stars have spawned, and sometimes
only after hundreds or thousands have spawned.
Does your error occur at random too?
Unfortunately I have not had time to debug this problem yet, so if you
or anyone else has any ideas, they would be very welcome.
That said, these kinds of issues are very likely related to our custom
implementations of the subgrid physics (Grackle is not part of the
public Gadget code), so most likely we will have to find our own
solutions to the bugs in our own code...


On 22.10.21 22:14, Goddard, Julianne wrote:
> Hello Everyone,
> I am running a zoom-in cosmological simulation with periodic boundary
> conditions in Gadget4. I am using grackle for cooling and star
> formation is enabled. The zoom region in the simulation is about 1.5
> Mpc in radius, and the effective resolution here is 1024^3. I have
> found that the code runs to completion if I run on only one node,
> however if I increase to two or more nodes I start to get one of the
> following errors:
> "Code termination on task=91, function deal_with_sph_node_request(),
> file src/mpi_utils/, line 272: p=1564695652
> MaxPart=5869 MaxNodes=13117"
> or
> "Fatal error in PMPI_Recv: Unknown error class, error stack:
> PMPI_Recv(171)........................: MPI_Recv(buf=0x7f63546475c0,
> count=8, MPI_BYTE, src=31, tag=10, MPI_COMM_WORLD, status=0x1) failed
> MPIDU_Complete_posted_with_error(1137): Process failed"
> I have once had the code complete running in parallel without
> experiencing these errors, but since then I have not been able to
> replicate this. Has anyone else experienced this type of error, or does
> anyone have advice on how to fix the problem?
> Thank You,
> Julianne
