From: Pedro Colin
Date: Wed, 1 Nov 2006 11:46:47 -0600 (CST)


Does anyone have a recipe for optimizing the work-load balance, and thus
improving performance on a multi-CPU machine? I just got a quad with
dual-core 64-bit Opteron processors at 2.0 GHz, so 8 procs in total.
When I compare the wall-clock time of this machine with that of a dual
Opteron at 2.4 GHz (single core) in a Gadget2 simulation (N-body only), I
only get a factor of 2 gain. Looking at timings.txt, I see that this is due
to work-load imbalance. The procs in the quad run on average twice as slow
as those in the dual. I have made some changes to *BufferSize* and
*PartAllocFactor* and have seen some improvement, but I still think I am
not getting the most out of it.
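
As a rough sanity check on that factor of 2, here is a minimal sketch of the
gain I would expect if performance scaled linearly with the number of procs
and with clock speed. That linear scaling is an idealization, and the function
names are only illustrative, not anything from Gadget2.

```python
def ideal_speedup(procs_new, ghz_new, procs_old, ghz_old):
    """Speedup expected under the ASSUMPTION that throughput scales
    linearly with (number of procs) x (clock speed)."""
    return (procs_new * ghz_new) / (procs_old * ghz_old)


def per_proc_ratio(observed_speedup, procs_new, procs_old):
    """Throughput of one proc in the new machine relative to one proc
    in the old machine, given the observed whole-machine speedup."""
    return observed_speedup * procs_old / procs_new


# Numbers from the post: 8 procs at 2.0 GHz vs 2 procs at 2.4 GHz,
# with an observed wall-clock gain of a factor of 2.
observed = 2.0
ideal = ideal_speedup(8, 2.0, 2, 2.4)
efficiency = observed / ideal
proc_ratio = per_proc_ratio(observed, 8, 2)

print(f"ideal speedup:       {ideal:.2f}")
print(f"parallel efficiency: {efficiency:.2f}")
print(f"per-proc throughput: {proc_ratio:.2f} of a proc in the dual")
```

Under that assumption the ideal gain would be about 3.3, so the measured
factor of 2 is roughly 60% of it, and each proc in the quad delivers half the
throughput of a proc in the dual, which is consistent with the "twice as
slow" figure above.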


