Dear list,
I am searching for the optimal configuration for running Gadget4. I am
running control DMO simulations with a 500 Mpc box and 512^3 particles. I am
puzzled by the following: running the code with and without
USE_SINGLEPRECISION_INTERNALLY (config files pasted below) seems to affect
neither the execution time nor the memory consumption (memory.txt pasted
below). However, I observe a rather small suppression (0.05%) of the matter
power spectrum at z=0.0 for modes with k larger than unity. Is this due to
the LEAN configuration? Should the LEAN configuration affect the code
accuracy as well? I would warmly appreciate any clarification you can provide.
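
A minimal sketch of how such a suppression could be quantified, assuming the two power spectra are tabulated on the same k bins. The function name fractional_difference and the sample numbers are illustrative placeholders, not output from these runs:

```python
def fractional_difference(k, p_test, p_ref, k_min=1.0):
    """Return (k, p_test/p_ref - 1) pairs for all modes with k > k_min.

    Assumes the three sequences share the same binning in k.
    """
    if not (len(k) == len(p_test) == len(p_ref)):
        raise ValueError("k, p_test and p_ref must have the same length")
    return [
        (ki, pt / pr - 1.0)
        for ki, pt, pr in zip(k, p_test, p_ref)
        if ki > k_min
    ]


# Made-up spectra: the "single precision" one is lowered by 0.05% for k > 1.
k = [0.1, 0.5, 2.0, 5.0]
p_double = [1.0e4, 2.0e3, 1.0e2, 1.0e1]
p_single = [p * (0.9995 if ki > 1.0 else 1.0) for ki, p in zip(k, p_double)]

for ki, frac in fractional_difference(k, p_single, p_double):
    print(f"k = {ki:4.1f}   P_single/P_double - 1 = {100.0 * frac:+.3f}%")
```

Taking the ratio bin by bin, rather than comparing the two curves by eye, makes a sub-percent offset visible and shows at which k it sets in.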
Cheers,
---------------------- SINGLE PRECISION --------------------------
Code was compiled with the following settings:
    ASMTH=1.25 CREATE_GRID DOUBLEPRECISION=0 FMM FOF FOF_GROUP_MIN_LEN=100
    FOF_LINKLENGTH=0.2 FOF_PRIMARY_LINK_TYPES=2 HIERARCHICAL_GRAVITY
    IMPOSE_PINNING LEAN MERGERTREE MULTIPOLE_ORDER=2 NGENIC=512
    NGENIC_2LPT NSOFTCLASSES=1 NTAB=128 NTYPES=6 OUTPUT_TIMESTEP
    PERIODIC PMGRID=512 POWERSPEC_ON_OUTPUT RANDOMIZE_DOMAINCENTER
    RCUT=6.0 SELFGRAVITY SUBFIND SUBFIND_HBT
    TREE_NUM_BEFORE_NODESPLIT=4 USE_SINGLEPRECISION_INTERNALLY
MEMORY: Largest Allocation = 1559.32 Mbyte | Largest Allocation Without Generic = 1201.79 Mbyte
-------------------------- Allocated Memory Blocks---- ( Step 0 )------------------
Task  Nr F  Variable                         MBytes  Cumulative  Function|File|Linenumber
------------------------------------------------------------------------------------------
  23   0 0  GetGhostRankForSimulCommRank     0.0006      0.0006  mymalloc_init()|src/data/mymalloc.cc|137
  23   1 0  GetShmRankForSimulCommRank       0.0006      0.0012  mymalloc_init()|src/data/mymalloc.cc|138
  23   2 0  GetNodeIDForSimulCommRank        0.0006      0.0018  mymalloc_init()|src/data/mymalloc.cc|139
  23   3 0  SharedMemBaseAddr                0.0003      0.0021  mymalloc_init()|src/data/mymalloc.cc|153
  23   4 1  slab_to_task                     0.0020      0.0041  my_slab_based_fft_init()|src/pm/pm_mpi_fft.cc|45
  23   5 1  slabs_x_per_task                 0.0006      0.0047  my_slab_based_fft_init()|src/pm/pm_mpi_fft.cc|60
  23   6 1  first_slab_x_of_task             0.0006      0.0053  my_slab_based_fft_init()|src/pm/pm_mpi_fft.cc|63
  23   7 1  slabs_y_per_task                 0.0006      0.0059  my_slab_based_fft_init()|src/pm/pm_mpi_fft.cc|66
  23   8 1  first_slab_y_of_task             0.0006      0.0065  my_slab_based_fft_init()|src/pm/pm_mpi_fft.cc|69
  23   9 1  P                              175.0443    175.0508  allocate_memory()|src/ngenic/../main/../data/simparticles|273
  23  10 1  SphP                             0.0001    175.0509  allocate_memory()|src/ngenic/../main/../data/simparticles|274
  23  11 1  FirstTopleafOfTask               0.0006    175.0515  domain_allocate()|src/domain/domain.cc|163
  23  12 1  NumTopleafOfTask                 0.0006    175.0521  domain_allocate()|src/domain/domain.cc|164
  23  13 1  TopNodes                         0.0358    175.0879  domain_allocate()|src/domain/domain.cc|165
  23  14 1  TaskOfLeaf                       0.0156    175.1035  domain_allocate()|src/domain/domain.cc|166
  23  15 1  ListOfTopleaves                  0.0156    175.1191  domain_decomposition()|src/domain/domain.cc|118
  23  16 1  PS                              87.5222    262.6413  create_snapshot_if_desired()|src/main/run.cc|534
  23  17 0  MinID                            3.5000    266.1413  fof_fof()|src/fof/fof.cc|71
  23  18 0  MinIDTask                        3.5000    269.6413  fof_fof()|src/fof/fof.cc|72
  23  19 0  Head                             3.5000    273.1413  fof_fof()|src/fof/fof.cc|73
  23  20 0  Next                             3.5000    276.6413  fof_fof()|src/fof/fof.cc|74
  23  21 0  Tail                             3.5000    280.1413  fof_fof()|src/fof/fof.cc|75
  23  22 0  Len                              3.5000    283.6413  fof_fof()|src/fof/fof.cc|76
  23  23 1  Send_count                       0.0006    283.6419  treeallocate()|src/tree/tree.cc|794
  23  24 1  Send_offset                      0.0006    283.6425  treeallocate()|src/tree/tree.cc|795
  23  25 1  Recv_count                       0.0006    283.6431  treeallocate()|src/tree/tree.cc|796
  23  26 1  Recv_offset                      0.0006    283.6437  treeallocate()|src/tree/tree.cc|797
  23  27 0  TreeNodes_offsets                0.0003    283.6440  treeallocate()|src/tree/tree.cc|824
  23  28 0  TreePoints_offsets               0.0003    283.6443  treeallocate()|src/tree/tree.cc|825
  23  29 0  TreeNextnode_offsets             0.0003    283.6447  treeallocate()|src/tree/tree.cc|826
  23  30 0  TreeForeign_Nodes_offsets        0.0003    283.6450  treeallocate()|src/tree/tree.cc|827
  23  31 0  TreeForeign_Points_offsets       0.0003    283.6453  treeallocate()|src/tree/tree.cc|828
  23  32 0  TreeP_offsets                    0.0003    283.6456  treeallocate()|src/tree/tree.cc|829
  23  33 0  TreeSphP_offsets                 0.0003    283.6459  treeallocate()|src/tree/tree.cc|830
  23  34 0  TreePS_offsets                   0.0003    283.6462  treeallocate()|src/tree/tree.cc|831
  23  35 0  TreeSharedMemBaseAddr            0.0003    283.6465  treeallocate()|src/tree/tree.cc|833
  23  36 1  Nodes                           15.3964    299.0428  treeallocate()|src/tree/tree.cc|882
  23  37 1  Points                           0.0001    299.0429  treebuild_construct()|src/tree/tree.cc|311
  23  38 1  Nextnode                         3.5167    302.5596  treebuild_construct()|src/tree/tree.cc|312
  23  39 1  Father                           3.5010    306.0606  treebuild_construct()|src/tree/tree.cc|313
  23  40 0  Flags                            0.8750    306.9356  fof_find_groups()|src/fof/fof_findgroups.cc|127
  23  41 0  FullyLinkedNodePIndex            0.5178    307.4534  fof_find_groups()|src/fof/fof_findgroups.cc|129
  23  42 0  targetlist                       3.5000    310.9534  fof_find_groups()|src/fof/fof_findgroups.cc|163
  23  43 0  Exportflag                       0.0006    310.9540  generic_allocate_comm_tables()|src/fof/../mpi_utils/generic_comm.h|593
  23  44 0  Exportindex                      0.0006    310.9546  generic_allocate_comm_tables()|src/fof/../mpi_utils/generic_comm.h|594
  23  45 0  Exportnodecount                  0.0006    310.9552  generic_allocate_comm_tables()|src/fof/../mpi_utils/generic_comm.h|595
  23  46 0  Send                             0.0012    310.9564  generic_allocate_comm_tables()|src/fof/../mpi_utils/generic_comm.h|597
  23  47 0  Recv                             0.0012    310.9576  generic_allocate_comm_tables()|src/fof/../mpi_utils/generic_comm.h|598
  23  48 0  Send_count                       0.0006    310.9583  generic_allocate_comm_tables()|src/fof/../mpi_utils/generic_comm.h|600
  23  49 0  Send_offset                      0.0006    310.9589  generic_allocate_comm_tables()|src/fof/../mpi_utils/generic_comm.h|601
  23  50 0  Recv_count                       0.0006    310.9595  generic_allocate_comm_tables()|src/fof/../mpi_utils/generic_comm.h|602
  23  51 0  Recv_offset                      0.0006    310.9601  generic_allocate_comm_tables()|src/fof/../mpi_utils/generic_comm.h|603
  23  52 0  Send_count_nodes                 0.0006    310.9607  generic_allocate_comm_tables()|src/fof/../mpi_utils/generic_comm.h|605
  23  53 0  Send_offset_nodes                0.0006    310.9613  generic_allocate_comm_tables()|src/fof/../mpi_utils/generic_comm.h|606
  23  54 0  Recv_count_nodes                 0.0006    310.9619  generic_allocate_comm_tables()|src/fof/../mpi_utils/generic_comm.h|607
  23  55 0  Recv_offset_nodes                0.0006    310.9625  generic_allocate_comm_tables()|src/fof/../mpi_utils/generic_comm.h|608
  23  56 1  PartList                      1241.0233   1551.9858  src/fof/../mpi_utils/generic_comm.h|198generic_alloc_partlist_nodelist_ngblist()|src/fof/../mpi_utils/generic_comm.h|244
  23  57 1  Ngblist                          3.5000   1555.4858  src/fof/../mpi_utils/generic_comm.h|198generic_alloc_partlist_nodelist_ngblist()|src/fof/../mpi_utils/generic_comm.h|247
  23  58 1  Shmranklist                      3.5000   1558.9858  src/fof/../mpi_utils/generic_comm.h|198generic_alloc_partlist_nodelist_ngblist()|src/fof/../mpi_utils/generic_comm.h|248
  23  59 1  DataIn                           0.0001   1558.9859  src/fof/../mpi_utils/generic_comm.h|198generic_exchange()|src/fof/../mpi_utils/generic_comm.h|556
  23  61 1  DataOut                          0.0001   1558.9860  src/fof/../mpi_utils/generic_comm.h|198generic_exchange()|src/fof/../mpi_utils/generic_comm.h|558
  23  62 0  rel_node_index                   0.0006   1558.9866  src/fof/../mpi_utils/generic_comm.h|198generic_prepare_particle_data_for_expor()|src/fof/../mpi_utils/generic_comm.h|317
------------------------------------------------------------------------------------------
---------------------- DOUBLE PRECISION --------------------------
Code was compiled with the following settings:
    ASMTH=1.25 CREATE_GRID DOUBLEPRECISION=1 FMM FOF FOF_GROUP_MIN_LEN=100
    FOF_LINKLENGTH=0.2 FOF_PRIMARY_LINK_TYPES=2 GADGET2_HEADER
    HIERARCHICAL_GRAVITY IMPOSE_PINNING LEAN MERGERTREE
    MULTIPOLE_ORDER=2 NGENIC=512 NGENIC_2LPT NSOFTCLASSES=1
    NTAB=128 NTYPES=6 OUTPUT_TIMESTEP PERIODIC PMGRID=512
    POWERSPEC_ON_OUTPUT RANDOMIZE_DOMAINCENTER RCUT=6.0 SELFGRAVITY
    SUBFIND SUBFIND_HBT TREE_NUM_BEFORE_NODESPLIT=4
MEMORY: Largest Allocation = 1559.32 Mbyte | Largest Allocation Without Generic = 1202.39 Mbyte
-------------------------- Allocated Memory Blocks---- ( Step 0 )------------------
Task  Nr F  Variable                         MBytes  Cumulative  Function|File|Linenumber
------------------------------------------------------------------------------------------
   8   0 0  GetGhostRankForSimulCommRank     0.0006      0.0006  mymalloc_init()|src/data/mymalloc.cc|137
   8   1 0  GetShmRankForSimulCommRank       0.0006      0.0012  mymalloc_init()|src/data/mymalloc.cc|138
   8   2 0  GetNodeIDForSimulCommRank        0.0006      0.0018  mymalloc_init()|src/data/mymalloc.cc|139
   8   3 0  SharedMemBaseAddr                0.0003      0.0021  mymalloc_init()|src/data/mymalloc.cc|153
   8   4 1  slab_to_task                     0.0020      0.0041  my_slab_based_fft_init()|src/pm/pm_mpi_fft.cc|45
   8   5 1  slabs_x_per_task                 0.0006      0.0047  my_slab_based_fft_init()|src/pm/pm_mpi_fft.cc|60
   8   6 1  first_slab_x_of_task             0.0006      0.0053  my_slab_based_fft_init()|src/pm/pm_mpi_fft.cc|63
   8   7 1  slabs_y_per_task                 0.0006      0.0059  my_slab_based_fft_init()|src/pm/pm_mpi_fft.cc|66
   8   8 1  first_slab_y_of_task             0.0006      0.0065  my_slab_based_fft_init()|src/pm/pm_mpi_fft.cc|69
   8   9 1  P                              175.0443    175.0508  allocate_memory()|src/ngenic/../main/../data/simparticles|273
   8  10 1  SphP                             0.0001    175.0509  allocate_memory()|src/ngenic/../main/../data/simparticles|274
   8  11 1  FirstTopleafOfTask               0.0006    175.0515  domain_allocate()|src/domain/domain.cc|163
   8  12 1  NumTopleafOfTask                 0.0006    175.0521  domain_allocate()|src/domain/domain.cc|164
   8  13 1  TopNodes                         0.0358    175.0879  domain_allocate()|src/domain/domain.cc|165
   8  14 1  TaskOfLeaf                       0.0156    175.1035  domain_allocate()|src/domain/domain.cc|166
   8  15 1  ListOfTopleaves                  0.0156    175.1191  domain_decomposition()|src/domain/domain.cc|118
   8  16 1  PS                              87.5222    262.6413  create_snapshot_if_desired()|src/main/run.cc|534
   8  17 0  MinID                            3.5000    266.1413  fof_fof()|src/fof/fof.cc|71
   8  18 0  MinIDTask                        3.5000    269.6413  fof_fof()|src/fof/fof.cc|72
   8  19 0  Head                             3.5000    273.1413  fof_fof()|src/fof/fof.cc|73
   8  20 0  Next                             3.5000    276.6413  fof_fof()|src/fof/fof.cc|74
   8  21 0  Tail                             3.5000    280.1413  fof_fof()|src/fof/fof.cc|75
   8  22 0  Len                              3.5000    283.6413  fof_fof()|src/fof/fof.cc|76
   8  23 1  Send_count                       0.0006    283.6419  treeallocate()|src/tree/tree.cc|794
   8  24 1  Send_offset                      0.0006    283.6425  treeallocate()|src/tree/tree.cc|795
   8  25 1  Recv_count                       0.0006    283.6431  treeallocate()|src/tree/tree.cc|796
   8  26 1  Recv_offset                      0.0006    283.6437  treeallocate()|src/tree/tree.cc|797
   8  27 0  TreeNodes_offsets                0.0003    283.6440  treeallocate()|src/tree/tree.cc|824
   8  28 0  TreePoints_offsets               0.0003    283.6443  treeallocate()|src/tree/tree.cc|825
   8  29 0  TreeNextnode_offsets             0.0003    283.6447  treeallocate()|src/tree/tree.cc|826
   8  30 0  TreeForeign_Nodes_offsets        0.0003    283.6450  treeallocate()|src/tree/tree.cc|827
   8  31 0  TreeForeign_Points_offsets       0.0003    283.6453  treeallocate()|src/tree/tree.cc|828
   8  32 0  TreeP_offsets                    0.0003    283.6456  treeallocate()|src/tree/tree.cc|829
   8  33 0  TreeSphP_offsets                 0.0003    283.6459  treeallocate()|src/tree/tree.cc|830
   8  34 0  TreePS_offsets                   0.0003    283.6462  treeallocate()|src/tree/tree.cc|831
   8  35 0  TreeSharedMemBaseAddr            0.0003    283.6465  treeallocate()|src/tree/tree.cc|833
   8  36 1  Nodes                           15.3964    299.0428  treeallocate()|src/tree/tree.cc|882
   8  37 1  Points                           0.0001    299.0429  treebuild_construct()|src/tree/tree.cc|311
   8  38 1  Nextnode                         3.5167    302.5596  treebuild_construct()|src/tree/tree.cc|312
   8  39 1  Father                           3.5010    306.0606  treebuild_construct()|src/tree/tree.cc|313
   8  40 0  Flags                            0.8750    306.9356  fof_find_groups()|src/fof/fof_findgroups.cc|127
   8  41 0  FullyLinkedNodePIndex            0.5178    307.4534  fof_find_groups()|src/fof/fof_findgroups.cc|129
   8  42 0  targetlist                       3.5000    310.9534  fof_find_groups()|src/fof/fof_findgroups.cc|163
   8  43 0  Exportflag                       0.0006    310.9540  generic_allocate_comm_tables()|src/fof/../mpi_utils/generic_comm.h|593
   8  44 0  Exportindex                      0.0006    310.9546  generic_allocate_comm_tables()|src/fof/../mpi_utils/generic_comm.h|594
   8  45 0  Exportnodecount                  0.0006    310.9552  generic_allocate_comm_tables()|src/fof/../mpi_utils/generic_comm.h|595
   8  46 0  Send                             0.0012    310.9564  generic_allocate_comm_tables()|src/fof/../mpi_utils/generic_comm.h|597
   8  47 0  Recv                             0.0012    310.9576  generic_allocate_comm_tables()|src/fof/../mpi_utils/generic_comm.h|598
   8  48 0  Send_count                       0.0006    310.9583  generic_allocate_comm_tables()|src/fof/../mpi_utils/generic_comm.h|600
   8  49 0  Send_offset                      0.0006    310.9589  generic_allocate_comm_tables()|src/fof/../mpi_utils/generic_comm.h|601
   8  50 0  Recv_count                       0.0006    310.9595  generic_allocate_comm_tables()|src/fof/../mpi_utils/generic_comm.h|602
   8  51 0  Recv_offset                      0.0006    310.9601  generic_allocate_comm_tables()|src/fof/../mpi_utils/generic_comm.h|603
   8  52 0  Send_count_nodes                 0.0006    310.9607  generic_allocate_comm_tables()|src/fof/../mpi_utils/generic_comm.h|605
   8  53 0  Send_offset_nodes                0.0006    310.9613  generic_allocate_comm_tables()|src/fof/../mpi_utils/generic_comm.h|606
   8  54 0  Recv_count_nodes                 0.0006    310.9619  generic_allocate_comm_tables()|src/fof/../mpi_utils/generic_comm.h|607
   8  55 0  Recv_offset_nodes                0.0006    310.9625  generic_allocate_comm_tables()|src/fof/../mpi_utils/generic_comm.h|608
   8  56 1  PartList                      1241.0233   1551.9858  src/fof/../mpi_utils/generic_comm.h|198generic_alloc_partlist_nodelist_ngblist()|src/fof/../mpi_utils/generic_comm.h|244
   8  57 1  Ngblist                          3.5000   1555.4858  src/fof/../mpi_utils/generic_comm.h|198generic_alloc_partlist_nodelist_ngblist()|src/fof/../mpi_utils/generic_comm.h|247
   8  58 1  Shmranklist                      3.5000   1558.9858  src/fof/../mpi_utils/generic_comm.h|198generic_alloc_partlist_nodelist_ngblist()|src/fof/../mpi_utils/generic_comm.h|248
   8  59 1  DataIn                           0.0001   1558.9859  src/fof/../mpi_utils/generic_comm.h|198generic_exchange()|src/fof/../mpi_utils/generic_comm.h|556
   8  60 1  NodeInfoIn                       0.0001   1558.9860  src/fof/../mpi_utils/generic_comm.h|198generic_exchange()|src/fof/../mpi_utils/generic_comm.h|557
   8  61 1  DataOut                          0.0001   1558.9860  src/fof/../mpi_utils/generic_comm.h|198generic_exchange()|src/fof/../mpi_utils/generic_comm.h|558
   8  62 0  rel_node_index                   0.0006   1558.9866  src/fof/../mpi_utils/generic_comm.h|198generic_prepare_particle_data_for_expor()|src/fof/../mpi_utils/generic_comm.h|317
------------------------------------------------------------------------------------------
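
Comparing the two "Code was compiled with the following settings" lists token by token shows which options actually differ between the two builds. A minimal sketch of that comparison follows; the helper name option_diff is illustrative, and the option strings are copied from the two listings above:

```python
SINGLE = """
ASMTH=1.25 CREATE_GRID DOUBLEPRECISION=0 FMM FOF FOF_GROUP_MIN_LEN=100
FOF_LINKLENGTH=0.2 FOF_PRIMARY_LINK_TYPES=2 HIERARCHICAL_GRAVITY
IMPOSE_PINNING LEAN MERGERTREE MULTIPOLE_ORDER=2 NGENIC=512
NGENIC_2LPT NSOFTCLASSES=1 NTAB=128 NTYPES=6 OUTPUT_TIMESTEP
PERIODIC PMGRID=512 POWERSPEC_ON_OUTPUT RANDOMIZE_DOMAINCENTER
RCUT=6.0 SELFGRAVITY SUBFIND SUBFIND_HBT
TREE_NUM_BEFORE_NODESPLIT=4 USE_SINGLEPRECISION_INTERNALLY
"""

DOUBLE = """
ASMTH=1.25 CREATE_GRID DOUBLEPRECISION=1 FMM FOF FOF_GROUP_MIN_LEN=100
FOF_LINKLENGTH=0.2 FOF_PRIMARY_LINK_TYPES=2 GADGET2_HEADER
HIERARCHICAL_GRAVITY IMPOSE_PINNING LEAN MERGERTREE
MULTIPOLE_ORDER=2 NGENIC=512 NGENIC_2LPT NSOFTCLASSES=1
NTAB=128 NTYPES=6 OUTPUT_TIMESTEP PERIODIC PMGRID=512
POWERSPEC_ON_OUTPUT RANDOMIZE_DOMAINCENTER RCUT=6.0 SELFGRAVITY
SUBFIND SUBFIND_HBT TREE_NUM_BEFORE_NODESPLIT=4
"""


def option_diff(a, b):
    """Return (options only in a, options only in b), each sorted."""
    set_a, set_b = set(a.split()), set(b.split())
    return sorted(set_a - set_b), sorted(set_b - set_a)


only_single, only_double = option_diff(SINGLE, DOUBLE)
print("only in single-precision build:", only_single)
print("only in double-precision build:", only_double)
```

Besides USE_SINGLEPRECISION_INTERNALLY, the two builds also differ in DOUBLEPRECISION (0 versus 1) and in GADGET2_HEADER, which is set only in the double-precision build.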
Tiago Castro
Post Doc, Department of Physics / UNITS / OATS
Phone: (+39 040 3199 120)
Mobile: (+39 388 794 1562)
Email: tiagobscastro_at_gmail.com
Website: http://tiagobscastro.com
Skype: tiagobscastro
Address: Osservatorio Astronomico di Trieste / Villa Bazzoni, Via Bazzoni, 2, 34143 Trieste TS
Received on 2020-12-01 08:16:24