Re: work-load imbalance

From: Pedro Colin <p.colin_at_astrosmo.unam.mx>
Date: Fri, 3 Nov 2006 15:53:13 -0600 (CST)

Hi Volker,

Thanks for your reply. The problem I am studying is precisely one in which
the scalability is not good: an isolated system (disk + halo). So this
essentially explains why my 8-processor computer does not necessarily give a
much larger improvement over a dual with faster CPUs. I actually see that my
computer run in dual mode (2.0 GHz Opteron CPUs) is 5% faster than a dual
computer (single-core Opterons) at 1.8 GHz, a slightly lower gain than
expected.
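
As a rough check on these numbers, here is a minimal sketch in Python. It
assumes wall time scales inversely with clock frequency, which is a
simplification. The function names and the per-processor work values are
made up for illustration and are not taken from Gadget2 or timings.txt.

```python
def expected_gain(clock_new_ghz, clock_old_ghz):
    """Speedup expected if wall time scaled inversely with clock frequency.

    Assumption: same number of processors, same code, no memory effects.
    """
    return clock_new_ghz / clock_old_ghz


def balance_efficiency(work_per_proc):
    """Mean work divided by maximum work across processors.

    1.0 means perfect balance. Lower values mean most processors sit idle
    waiting for the busiest one to finish the step.
    """
    n = len(work_per_proc)
    return sum(work_per_proc) / (n * max(work_per_proc))


# 2.0 GHz versus 1.8 GHz: a gain of about 1.11, i.e. roughly 11%,
# compared with the 5% actually measured.
gain = expected_gain(2.0, 1.8)

# Hypothetical work shares: one processor holds the dense region.
eff = balance_efficiency([4.0, 1.0, 1.0, 2.0])  # 0.5
```

With an efficiency of 0.5, four processors would do the work of two, which
is the kind of loss an isolated disk + halo system can produce.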

Pedro

On Fri, 3 Nov 2006, Volker Springel wrote:

>
>
> Hi Pedro,
>
> the amount of work-load imbalance you experience will very much depend
> on the type of problem you simulate, as well as on the particle
> number/resolution and on the number of processors you're using. In
> general, systems with only one or a few high-density region(s) (say an
> individual halo, or a galaxy merger) are much harder in this respect
> than a cosmological volume with lots of halos.
>
> The scalability of Gadget2 is generally limited by work-load imbalances,
> not communication times. Too small a value of BufferSize or
> PartAllocFactor can make things much worse, but once the settings for
> these parameters are large enough, a further increase won't make a
> difference, and the scalability for certain isolated systems can remain
> quite poor. For a larger particle number at fixed processor number
> things normally get a bit better, but a substantial improvement on this
> requires algorithmic changes in the code.
>
> Volker
>
>
> Pedro Colin wrote:
> > Hi,
> >
> > Does anyone have a sort of recipe to optimize the work-load balance and
> > thus improve multi-CPU performance? I just got a quad with
> > dual-core 64-bit Opteron processors at 2.0 GHz, so 8 procs in total.
> > When I compare the wall time of this computer with a dual Opteron at 2.4
> > GHz (single core) in a Gadget2 simulation (N-body only) I only get a
> > factor of 2 gain. When I look at timings.txt I see that this is due to
> > work-load imbalance. The procs in the quad run on average half as fast
> > as those in the dual. I have made some changes to *BufferSize* and
> > *PartAllocFactor* and have got some improvement, but I still think I am
> > not getting the most out of it.
> >
> > Cheers,
> >
> > Pedro
> >
> >
> >
> >
> > -----------------------------------------------------------
> >
> > If you wish to unsubscribe from this mailing, send mail to
> > minimalist_at_MPA-Garching.MPG.de with a subject of: unsubscribe gadget-list
> > A web-archive of this mailing list is available here:
> > http://www.mpa-garching.mpg.de/gadget/gadget-list

-- 
Dr. Pedro Colin
Investigador Titular A
CRyA, UNAM, Morelia, Mich.
Received on 2006-11-03 22:53:31