GADGET-4
#include <gwalk.h>
Inherits gravtree< simparticles >.
Public Member Functions | |
void | gravity_tree (int timebin) |
This function computes the gravitational forces for all active particles. | |
Public Member Functions inherited from gravtree< simparticles > | |
void | set_softenings (void) |
This function sets the (comoving) softening length of all particle types in the table All.SofteningTable[...]. | |
void | gravity_exchange_forces (void) |
void | get_gfactors_multipole (gfactors &res, const T r, const T h_max, const T rinv) |
void | get_gfactors_monopole (gfactors &res, const T r, const T h_max, const T rinv) |
void | get_gfactors_potential (gfactors &res, const T r, const T hmax, const T rinv) |
Public Member Functions inherited from tree< gravnode, simparticles, gravpoint_data, foreign_gravpoint_data > | |
void | tree_add_to_fetch_stack (gravnode *nop, int nodetoopen, unsigned char shmrank) |
void | tree_add_to_work_stack (int target, int no, unsigned char shmrank, int mintopleafnode) |
void | prepare_shared_memory_access (void) |
void | cleanup_shared_memory_access (void) |
void | tree_fetch_foreign_nodes (enum ftype fetch_type) |
void | tree_initialize_leaf_node_access_info (void) |
foreign_gravpoint_data * | get_foreignpointsp (int n, unsigned char shmrank) |
subfind_data * | get_PSp (int n, unsigned char shmrank) |
pdata | get_Pp (int n, unsigned char shmrank) |
sph_particle_data * | get_SphPp (int n, unsigned char shmrank) |
tree () | |
int | treebuild (int ninsert, int *indexlist) |
void | treefree (void) |
void | treeallocate (int max_partindex, simparticles *Pptr, domain< simparticles > *Dptr) |
void | treeallocate_share_topnode_addresses (void) |
void | tree_export_node_threads (int no, int i, thread_data *thread, offset_tuple off=0) |
void | tree_export_node_threads_by_task_and_node (int task, int nodeindex, int i, thread_data *thread, offset_tuple off=0) |
gravnode * | get_nodep (int no) |
gravnode * | get_nodep (int no, unsigned char shmrank) |
int * | get_nextnodep (unsigned char shmrank) |
gravpoint_data * | get_pointsp (int no, unsigned char shmrank) |
void | tree_get_node_and_task (int i, int &no, int &task) |
void gravity_tree (int timebin)
This function computes the gravitational forces for all active particles.
The tree walk is done in two phases. First, the local part of the force tree is processed (gravity_primary_loop()). Whenever an external node is encountered during the walk, this node is saved on a list. This node list, along with data about the particles, is then exchanged among tasks. In the second phase (gravity_secondary_loop()), each task continues the tree walk for the imported particles. Finally, the resulting partial forces are sent back to the original task and summed there to complete the tree force calculation.
Particles are only exported to other processors when really needed, thereby allowing a good use of the communication buffer. Every particle is sent at most once to a given processor together with the complete list of relevant tree nodes to be checked on the other task.
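The two-phase scheme can be illustrated with a deliberately simplified sketch. This is not GADGET-4 code: the names `Particle`, `partial_force` and `two_phase_forces` are hypothetical, the "tasks" are plain vectors in one process instead of MPI ranks, a direct 1D summation stands in for the tree walk, and every particle is exported to every other task (the real code exports only when an external node is actually encountered).

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical toy particle: 1D position and mass.
struct Particle
{
  double x;
  double mass;
};

// Partial softened 1D force at position x from one task's particles.
// Stands in for walking that task's part of the tree.
double partial_force(double x, const std::vector<Particle> &sources, double eps)
{
  double f = 0.0;
  for(const Particle &s : sources)
    {
      double dx = s.x - x;
      double r2 = dx * dx + eps * eps;
      if(r2 == 0.0)
        continue; /* skip unsoftened self-interaction */
      f += s.mass * dx / (r2 * std::sqrt(r2));
    }
  return f;
}

// tasks[t] holds the particles owned by toy task t.
// Returns force[t][i] for particle i of task t.
std::vector<std::vector<double>> two_phase_forces(const std::vector<std::vector<Particle>> &tasks, double eps)
{
  const std::size_t ntask = tasks.size();
  std::vector<std::vector<double>> force(ntask);

  /* Phase 1: every task processes its local particles against local data. */
  for(std::size_t t = 0; t < ntask; t++)
    for(const Particle &p : tasks[t])
      force[t].push_back(partial_force(p.x, tasks[t], eps));

  /* Phase 2: positions are "exported"; the importing task computes a
     partial force, which is returned and summed at the originating task. */
  for(std::size_t t = 0; t < ntask; t++)
    for(std::size_t other = 0; other < ntask; other++)
      {
        if(other == t)
          continue;
        for(std::size_t i = 0; i < tasks[t].size(); i++)
          force[t][i] += partial_force(tasks[t][i].x, tasks[other], eps);
      }

  return force;
}
```

The point of the sketch is only the bookkeeping: each particle's total force is the local partial force plus the partial forces computed on its behalf by other tasks.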
Particles which drifted into the domain of another task are sent to this task for the force computation. Afterwards the resulting force is sent back to the originating task.
To improve work-load balancing in the domain decomposition, the work done by each node/particle is measured. The work is measured for the interaction partners (i.e. the nodes or particles) and not for the particles themselves that require a force computation. This way, work done for imported particles is accounted for at the task where the work was actually incurred. The cost measurement is only done for the "GRAVCOSTLEVELS" highest occupied time bins. The variable MeasureCostFlag indicates whether a measurement is done at the present time step.
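As a rough illustration of charging cost to interaction partners, the following sketch counts one unit of work per interaction and books it on the partner rather than on the target particle. The names `Walk` and `measure_partner_cost` are hypothetical and the unit cost per interaction is an assumption; the actual GADGET-4 accounting is not shown on this page.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical record of one walk: the target particle may be local or
// imported, and `partners` lists the local nodes/particles it interacted with.
struct Walk
{
  bool imported;
  std::vector<int> partners;
};

// Returns the accumulated cost per local interaction partner.
std::vector<long> measure_partner_cost(std::size_t n_partners, const std::vector<Walk> &walks)
{
  std::vector<long> cost(n_partners, 0);
  for(const Walk &w : walks)
    for(int p : w.partners)
      cost[p] += 1; /* charged to the partner; w.imported is deliberately ignored */
  return cost;
}
```

Because the counter lives with the partner, the work caused by an imported particle is booked on the task that owns the partner, which is the task that actually performed it.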
The tree imbalance can be further reduced using chunking. The particles requiring a force computation are split into chunks of size Nchunksize. A set consisting of every Nchunk-th chunk is processed first. Then the process is repeated with the next set of chunks. This way the number of exported particles is more balanced, as communication-heavy regions are mixed with less communication-intensive regions.
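One possible reading of the chunking order is sketched below. The helper `chunked_passes` and its parameter names are hypothetical, and the sketch assumes that pass k handles chunks k, k + Nchunk, k + 2 Nchunk, and so on; the page does not spell out the exact indexing used in GADGET-4.

```cpp
#include <vector>

// Splits the active particle indices 0..n_active-1 into chunks of size
// n_chunksize and assigns every n_chunk-th chunk to the same pass.
// Returns one index list per pass, in processing order.
// Assumes n_chunksize > 0 and n_chunk > 0.
std::vector<std::vector<int>> chunked_passes(int n_active, int n_chunksize, int n_chunk)
{
  std::vector<std::vector<int>> passes(n_chunk);
  for(int i = 0; i < n_active; i++)
    {
      int chunk = i / n_chunksize;          /* chunk this particle falls into */
      passes[chunk % n_chunk].push_back(i); /* pass that handles this chunk */
    }
  return passes;
}
```

With 10 active particles, a chunk size of 2 and 2 passes, the first pass would handle indices 0, 1, 4, 5, 8, 9 and the second pass 2, 3, 6, 7, so each pass samples the whole particle list instead of one contiguous region.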