Dear Gadget community,
I'm running dark-matter-only cosmological simulations with Gadget-4.
When I run the code with the Config.sh and param.txt of the example
"DM-L50-N128", it runs perfectly on a single node, but on multiple
nodes it fails with a segmentation fault.
I have been using L-Gadget-2 on the same cluster and have never
encountered such an error with it.
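I analyzed the core file, and it shows that the segmentation fault
occurs during the tree construction (treebuild_construct, at
src/tree/tree.cc:324). I suspect that something goes wrong with the
memory allocation when there is more than one shared-memory region,
i.e. more than one node.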
I've attached the log file from a run with the "DM-L50-N128" example
settings on a Cray XC50 with 2 nodes (= 80 cores); the GDB output
follows below. Any help or suggestions are welcome. Thank you.
Best regards,
Ken Osato
/* GDB outputs */
Core was generated by `./Gadget4 param.txt'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00000000004b5a5e in tree<gravnode, simparticles, gravpoint_data, foreign_gravpoint_data>::treebuild_construct (this=0x7fffffff3870) at src/tree/tree.cc:324
324 Nextnode[MaxPart + i] = TopNodes[index].sibling;
(gdb) bt
#0 0x00000000004b5a5e in tree<gravnode, simparticles, gravpoint_data, foreign_gravpoint_data>::treebuild_construct (this=0x7fffffff3870) at src/tree/tree.cc:324
#1 0x00000000004b0878 in tree<gravnode, simparticles, gravpoint_data, foreign_gravpoint_data>::treebuild (this=0x7fffffff3870, ninsert=24242, indexlist=0x0) at src/tree/tree.cc:75
#2 0x000000000048ac97 in sim::gravity (this=0x7fffffff2a40, timebin=0) at src/gravity/gravity.cc:226
#3 0x000000000048b8e5 in sim::compute_grav_accelerations (this=0x7fffffff2a40, timebin=0) at src/gravity/gravity.cc:110
#4 0x000000000047f4ea in sim::do_gravity_step_second_half (this=0x7fffffff2a40) at src/time_integration/kicks.cc:379
#5 0x000000000041911a in sim::run (this=0x7fffffff2a40) at src/main/run.cc:149
#6 0x000000000041631a in main (argc=2, argv=0x7fffffff58f8) at src/main/main.cc:327
(gdb) f 0
#0 0x00000000004b5a5e in tree<gravnode, simparticles, gravpoint_data, foreign_gravpoint_data>::treebuild_construct (this=0x7fffffff3870) at src/tree/tree.cc:324
324 Nextnode[MaxPart + i] = TopNodes[index].sibling;
(gdb) list
319
320 if(TreeSharedMem_ThisTask == 0)
321 TopNodes[index].nextnode = MaxPart + MaxNodes + i;
322
323 /* set nextnode for pseudo-particle (Nextnode exists on all ranks) */
324 Nextnode[MaxPart + i] = TopNodes[index].sibling;
325 }
326
327 point_data *export_Points = (point_data *)Mem.mymalloc("export_Points", NumPartExported * sizeof(point_data));
328
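
To make my suspicion a bit more concrete: in the listing above, only the
rank with TreeSharedMem_ThisTask == 0 writes TopNodes[index].nextnode,
while every rank reads TopNodes[index].sibling on line 324. I read this
as TopNodes living in memory shared by the ranks of one node. The sketch
below is NOT Gadget-4 code. It is a small model of how I picture the
grouping, assuming that ranks are placed in contiguous blocks of 40 per
node (I have not verified the actual placement on the XC50). The names
RankPlacement, place_rank and node_leaders are made up for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model, not taken from Gadget-4.
// Assumption: MPI ranks are placed in contiguous blocks, ranks_per_node each.
struct RankPlacement {
  int node;        // index of the shared-memory node hosting this rank
  int local_rank;  // rank within that node (the role TreeSharedMem_ThisTask
                   // appears to play in the listing above)
};

// Map a global rank to its node and its rank within that node.
RankPlacement place_rank(int world_rank, int ranks_per_node) {
  assert(ranks_per_node > 0 && world_rank >= 0);
  return RankPlacement{world_rank / ranks_per_node, world_rank % ranks_per_node};
}

// Collect the global ranks that would act as per-node leaders, i.e. the
// ranks with local_rank == 0 that write into the node-shared data.
std::vector<int> node_leaders(int world_size, int ranks_per_node) {
  std::vector<int> leaders;
  for (int r = 0; r < world_size; ++r) {
    if (place_rank(r, ranks_per_node).local_rank == 0) leaders.push_back(r);
  }
  return leaders;
}
```

Under these assumptions, node_leaders(40, 40) contains only rank 0, while
node_leaders(80, 40) contains ranks 0 and 40. Having a second, independent
shared-memory group is the only structural difference I can see between
the working single-node run and the failing two-node run.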
--
Ken Osato
Institut d'Astrophysique de Paris
98bis boulevard Arago, 75014 Paris, France
Tel: +33 1 44 32 80 00
E-mail: ken.osato_at_iap.fr
Received on 2021-01-24 16:25:51