Thanks, I think I figured out what I was doing wrong. When I copied
the non-accreted particles back from a temporary holding list, I freed
and re-allocated P and SphP to the wrong sizes (they need All.MaxPart
instead of NumPart, and All.MaxPartSph instead of N_gas, respectively).
Gadget seems to be happy as long as I have
P = malloc(All.MaxPart * sizeof(struct particle_data));
and
SphP = malloc(All.MaxPartSph * sizeof(struct sph_particle_data));
...even though, after the deletions, those arrays hold fewer particles
than All.MaxPart or All.MaxPartSph. I'm not really sure why this
works, but now it runs and dumps snapshots as expected.
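
My best guess at why: allocate_memory() sizes P at All.MaxPart (which
begrun.c sets from PartAllocFactor) right from the start, NumPart only
counts the slots actually in use, and domain_exchangeParticles() needs
the spare capacity to import particles from other tasks. For the
archive, the copy-back that works for me now looks roughly like this
(simplified; the temporary-array handling is my own, not stock Gadget):

/* compact P after accretion; relies on P, NumPart, All from
   allvars.h, plus <stdlib.h> and <string.h> */
struct particle_data *tmp;
int i, n = 0;

tmp = malloc(NumPart * sizeof(struct particle_data));
for(i = 0; i < NumPart; i++)
  if(P[i].Type != -1)   /* keep everything not marked as accreted */
    tmp[n++] = P[i];

free(P);
P = malloc(All.MaxPart * sizeof(struct particle_data));  /* capacity, not count */
memcpy(P, tmp, n * sizeof(struct particle_data));
free(tmp);
NumPart = n;

...and the same pattern for SphP, with All.MaxPartSph and N_gas.
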
Thanks for your help!
- Dave
On Sep 17, 2012, at 3:57 PM, Amit Kashi wrote:
> Hi Dave,
>
> Sorry, that's the only idea I have.
>
> Amit
>
> On 09/17/2012 09:16 AM, David Riethmiller wrote:
>> Hi Amit,
>>
>> Yes, I've messed with PartAllocFactor - it doesn't seem to make
>> any difference, at least for the problem I'm seeing.
>>
>> Dave
>>
>> On Sep 17, 2012, at 12:08 PM, Amit Kashi wrote:
>>
>>> Hi Dave,
>>>
>>> Did you try increasing PartAllocFactor in the parameter file?
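>>> That's a single line in the parameter file, e.g. something like
>>>
>>> PartAllocFactor    2.0
>>>
>>> (how large it needs to be depends on your memory and on how
>>> unevenly the particles end up distributed across tasks).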
>>>
>>> Amit
>>>
>>>
>>> On 09/17/2012 08:59 AM, David Riethmiller wrote:
>>>> Hi Gadget users -
>>>>
>>>> I'm trying to implement gas accretion onto a central black hole
>>>> in Gadget2, in a manner very similar to what others have done
>>>> here, but I'm still having problems with the domain decomposition
>>>> even after applying the fixes suggested in several of the
>>>> archived conversations. My accretion routine identifies the
>>>> particles to be accreted, switches their P.Type to -1, reorders
>>>> all particles by type 0-5, and then updates NumPart and N_gas
>>>> locally and All.TotNumPart and All.TotN_gas on all processors.
>>>> This runs happily on one processor, but crashes in parallel with
>>>> the error stack:
>>>>
>>>> [ 1] [0xbfffe3c8, 0x16d12000] (-P-)
>>>> [ 2] (ompi_convertor_unpack + 0x190) [0xbfffe428, 0x004379f0]
>>>> [ 3] (mca_pml_ob1_recv_request_progress + 0x8e1) [0xbfffe518, 0x005f3681]
>>>> [ 4] (mca_pml_ob1_recv_frag_match + 0x905) [0xbfffe5f8, 0x005ef545]
>>>> [ 5] (mca_btl_sm_component_progress + 0x296) [0xbfffe728, 0x0070e286]
>>>> [ 6] (mca_bml_r2_progress + 0x49) [0xbfffe748, 0x00700a39]
>>>> [ 7] (opal_progress + 0xf9) [0xbfffe798, 0x004c8ac9]
>>>> [ 8] (mca_pml_ob1_recv + 0x355) [0xbfffe7e8, 0x005ed105]
>>>> [ 9] (MPI_Recv + 0x1bd) [0xbfffe858, 0x0045e3ed]
>>>> [10] (domain_exchangeParticles + 0x890) [0xbfffe8f8, 0x0001ef82]
>>>> [11] (domain_decompose + 0x787) [0xbfffe998, 0x0001d0b2]
>>>> [12] (domain_Decomposition + 0x41b) [0xbfffea38, 0x0001c79a]
>>>> [13] (run + 0xa7) [0xbfffec38, 0x00002537]
>>>> [14] (main + 0x476) [0xbfffecd8, 0x000021e8]
>>>> [15] (start + 0x36) [0xbfffecfc, 0x00001d46]
>>>> [16] [0x00000000, 0x00000002] (FP-)
>>>>
>>>> It looks like when I run in parallel, there's a problem with
>>>> passing particles among the different processors during the
>>>> domain decomposition; perhaps I've forgotten to update some
>>>> global variable?
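>>>>
>>>> In case the bug is in my bookkeeping, the count update I do after
>>>> the reorder is roughly this (simplified from my routine):
>>>>
>>>> /* update local and global particle counts after removing the
>>>>    accreted particles; uses MPI and the globals from allvars.h */
>>>> long long local = NumPart, local_gas = N_gas, tot, tot_gas;
>>>>
>>>> MPI_Allreduce(&local, &tot, 1, MPI_LONG_LONG, MPI_SUM, MPI_COMM_WORLD);
>>>> MPI_Allreduce(&local_gas, &tot_gas, 1, MPI_LONG_LONG, MPI_SUM, MPI_COMM_WORLD);
>>>> All.TotNumPart = tot;
>>>> All.TotN_gas = tot_gas;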
>>>>
>>>> The archived post that most closely matches the problem I'm
>>>> having is here: http://www.mpa-garching.mpg.de/gadget/gadget-list/0461.html
>>>> - the code there is even very similar to what I'm using. However,
>>>> that post is over a year old and the issue still appears to be
>>>> unresolved. Has anyone solved it?
>>>>
>>>> Thanks,
>>>> Dave
>>>>
>>>>
>>>> -------------------------------------------------
>>>> David A. Riethmiller
>>>> Ph.D. Candidate, Astrophysical Institute
>>>> Ohio University
>>>>
>>>> Clippinger Labs 338
>>>> http://www.phy.ohiou.edu/~rieth/
>>>>
>>>
>>
>
Received on 2012-09-17 23:32:02