Cameron McBride wrote:
> Cameron McBride wrote (19 Oct 2007 15:32 EDT):
>> Sorry, the debugging cycle is a little slower since I have to wade
>> through the queue for the large number of PE. I've got two things
>> pending right now:
>> 1. a run on 1536 PE.
>> 2. a run that will output the input values of failing malloc()
>
> Updates on both of these:
>
> 1. Still failed on 1536 PE.
>
> % grep 'llocated' n1250_b640_pm2048_pe1536_debug.pbs.o197554
> Allocated 100 MByte communication buffer per processor.
> Allocated 145.519 MByte for particle storage. 80
> allocated 0.0762939 Mbyte for ngb search.
> Allocated 143.372 MByte for BH-tree. 64
>
> which puts reported per node memory at less than 400 MB.
>
> The arguments to the first successful malloc:
> toplist = malloc( 12392355 * 16 );
> (189 MB)
>
> The arguments of the failing malloc:
> toplist = malloc( 16818280 * 16);
> (256 MB)
>
> It's interesting that only some nodes failed to malloc
> in the domain decomp for Step 0.
>
>
> 2. More info on the 1024 PE attempts:
>
> % grep 'llocated' n1250_b640_pm2048_pe1024_debug.pbs.o197588
> Allocated 100 MByte communication buffer per processor.
> Allocated 218.279 MByte for particle storage. 80
> allocated 0.0762939 Mbyte for ngb search.
> Allocated 214.676 MByte for BH-tree. 64
>
> which puts reported per node memory at about 530 MB.
>
> First successful attempt:
> toplist = malloc( 9685531 * 16 );
> (147.8 MB)
>
> Second failed malloc:
> toplist = malloc( 9731052 * 16);
> (148.5 MB)
>
> The differences seem really small, especially in the 1024 PE case, for
> this to be caused by the memory we've accounted for here - but I don't
> see why else a simple malloc() would fail, other than lack of memory.
>
> My next best guess is to try a full memory profile, since it appears
> there is a significant chunk of memory that is allocated but not
> reported between the initial domain decomposition and the one in
> Step 0.
>
> Also, it seems that increasing the number of PE (and thus the total
> available memory) doesn't fix this, since the toplist memory
> requirements go up with more PE. (TOPNODEFACTOR=2.0 in the above cases)
>
> Could a corrupted input particle file cause any of this? (Seems
> unlikely since the first domain decomp was successful)
>
> Does someone with more understanding of Gadget2 have some suggestions or
> see something I'm missing?
>
> Thanks.
>
> Cameron
>
Hi Cameron,
It's always hard to operate close to the memory limit. In my
experience, the OS is often a bit erratic about where exactly the
maximum lies that you can allocate... For example, on BlueGene nodes
with nominally 512 MB of RAM, an application code is lucky if it can
use 440 MB of that - and on some nodes it may only be 430 MB, on
others 405 MB, for some reason. Worse, these limits may drift during
run-time due to various buffer allocations made by the OS for MPI or
I/O. It's hard to have complete control over this, and the only thing
that helps is to stay as far away from the limit as possible.
The memory consumption of the domain decomposition in the public
gadget2 code scales badly with processor number, and I suspect this has
a lot to do with your trouble at large CPU numbers. You can try the
following change: in the routine "domain_topsplit_local()" you find the
line

  if(TopNodes[sub].Count > All.TotNumPart / (TOPNODEFACTOR * NTask * NTask))

Change the denominator to something like

  (TOPNODEFACTOR * 8 * NTask)

This might help, although you may have to experiment with the factor 8
a bit.
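To get a feel for what the change does, here is a quick numerical sketch (assuming TotNumPart = 1250^3 for the n1250 runs, TOPNODEFACTOR = 2.0 and NTask = 1536 as above; split_threshold is an illustrative helper, not a Gadget2 routine):

```c
/* Threshold above which a topnode is split further.  With the original
   NTask * NTask denominator the threshold shrinks quadratically as
   more processors are added, so the number of topnodes (and hence the
   toplist memory) grows; with 8 * NTask it shrinks only linearly. */
double split_threshold(double tot_num_part, double topnodefactor,
                       int ntask, int original)
{
  double denom = original ? topnodefactor * ntask * (double) ntask
                          : topnodefactor * 8.0 * ntask;
  return tot_num_part / denom;
}
```

With these numbers, the original code splits any topnode holding more than about 414 particles, while the modified denominator raises that threshold to roughly 8e4, drastically reducing the number of topnodes at 1536 PE.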
About memory reporting: if you want to know precisely what the OS has
given gadget2, you can determine the virtual memory sizes of your
process by reading the /proc filesystem (assuming your machine runs
Linux), e.g. with the function below.

Alternatively, if you encapsulate all calls to malloc/free in a wrapper
routine, you can count how many bytes you have allocated and freed.
That's what I'm doing in my own updated version of gadget.
Volker
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>

/* print the VmRSS and VmSize lines of /proc/<pid>/status for this task
   (ThisTask is the usual gadget2 global holding the MPI rank) */
void report_VmRSS(void)
{
  pid_t my_pid;
  FILE *fd;
  char buf[1024];

  my_pid = getpid();
  sprintf(buf, "/proc/%d/status", (int) my_pid);

  if((fd = fopen(buf, "r")))
    {
      while(1)
        {
          if(fgets(buf, sizeof(buf), fd) != buf)
            break;

          if(strncmp(buf, "VmRSS", 5) == 0)
            printf("ThisTask=%d: %s", ThisTask, buf);

          if(strncmp(buf, "VmSize", 6) == 0)
            printf("ThisTask=%d: %s", ThisTask, buf);
        }
      fclose(fd);
    }
}
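The byte-counting alternative mentioned above could be sketched as follows (count_malloc/count_free are hypothetical names, not the actual code from the updated gadget version):

```c
#include <stdio.h>
#include <stdlib.h>

/* Each allocation is over-allocated by a small header that remembers
   its size, so count_free() can subtract the right amount.  A real
   implementation would pad the header to preserve alignment for
   16-byte types. */

static size_t bytes_in_use = 0;

void *count_malloc(size_t n)
{
  size_t *p = malloc(n + sizeof(size_t));

  if(!p)
    {
      fprintf(stderr, "count_malloc: request for %zu bytes failed "
              "(%zu bytes currently in use)\n", n, bytes_in_use);
      return NULL;
    }
  *p = n;
  bytes_in_use += n;
  return p + 1;
}

void count_free(void *ptr)
{
  if(ptr)
    {
      size_t *p = (size_t *) ptr - 1;
      bytes_in_use -= *p;
      free(p);
    }
}

size_t count_bytes_in_use(void)
{
  return bytes_in_use;
}
```

Routing every allocation through these wrappers gives an exact running total per task, which can be printed alongside the VmRSS/VmSize numbers to spot unreported allocations.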
Received on 2007-10-22 15:08:22