Re: Problems with treebuild -- setting the TREE_NUM_BEFORE_NODESPLIT

From: Weiguang Cui <cuiweiguang_at_gmail.com>
Date: Fri, 17 Sep 2021 16:02:43 +0100

Dear Volker,

Sorry for this late confirmation. Many thanks for your help; these
modifications indeed fixed the problem. However, the run now hits another
problem, which I guess is an MPI communication memory limit:

FOF/SUBFIND: Group catalogues saved. took = 7.6831 sec, total size 3604.99
MB, corresponds to effective I/O rate of 469.21 MB/sec
SUBFIND: Subgroup catalogues saved. took = 7.73874 sec
SUBFIND: Finished with SUBFIND. (total time=12824.6 sec)
SUBFIND: Total number of subhalos with at least 32 particles: 8814767
SUBFIND: Largest subhalo has 11419230 particles/cells.
SUBFIND: Total number of particles/cells in subhalos: 5113931178
Code termination on task=0, function mycxxsort_parallel(), file
src/fof/../sort/parallel_sort.h, line 460: currently, local data must be
smaller than 2 GB.

I think this is one of the three mycxxsort_parallel() calls within
fof_prepare_output_order(). Is there any parameter I can simply tweak to
overcome this?
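
A minimal sketch of how a limit of this kind can be read, assuming it applies to the byte size of the array held by a single task and that "2 GB" means 2^31 bytes. What parallel_sort.h actually checks around line 460 has not been verified here, and the helper names below are hypothetical, not GADGET-4 code.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; not part of GADGET-4.

// Assumed reading of "2 GB": 2^31 bytes.
constexpr std::uint64_t kTwoGB = std::uint64_t{1} << 31;

// True if n_elements items of elem_size bytes each stay below the limit.
// Precondition (assumed): the product fits into 64 bits.
bool local_data_below_limit(std::uint64_t n_elements, std::uint64_t elem_size)
{
  return n_elements * elem_size < kTwoGB;
}

// Smallest number of perfectly balanced shares such that each share of
// total_elements stays below the limit. Returns 0 if a single element is
// already too large (or if there is nothing to distribute).
std::uint64_t min_shares(std::uint64_t total_elements, std::uint64_t elem_size)
{
  if (elem_size == 0)
    return 1;  // zero-sized elements never reach the limit
  const std::uint64_t max_per_share = (kTwoGB - 1) / elem_size;  // largest count with bytes < limit
  if (max_per_share == 0)
    return 0;
  return (total_elements + max_per_share - 1) / max_per_share;  // ceiling division
}
```

With a made-up element size of 16 bytes, local_data_below_limit(100000000, 16) holds, while 134217728 elements reach exactly 2^31 bytes and fail the check. Real runs are not perfectly balanced, so min_shares() is only a lower bound on the number of tasks.
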
Thanks again and have a nice weekend.

Best,
Weiguang

-------------------------------------------
https://weiguangcui.github.io/


On Mon, Sep 13, 2021 at 6:07 PM Volker Springel <
vspringel_at_mpa-garching.mpg.de> wrote:

>
> Dear Weiguang,
>
> I have now found the cause of the problem you experienced with SUBFIND.
> This originated in a small bug in src/subfind/subfind_distribute.cc, which
> is used only when a halo must be processed by more than one core
> ("collective subfind"). The routine was correct for normal particle data,
> but did not work correctly for particles stored on a lightcone because of a
> missing templating of a sizeof() statement in an MPI-call. It turns out
> that your run triggered the bug, yielding corrupted particle positions (all
> zeros), which then led to a failed tree construction. The suggestion by the
> code's error message to increase TREE_NUM_BEFORE_NODESPLIT to fix this was
> a red herring...
>
> When you increased TREE_NUM_BEFORE_NODESPLIT, you ran into another issue
> as a distraction, this time in the FMM routine, caused by the sizing of
> some communication buffers. While you could circumvent this with the
> changes you tried below, these are not good fixes for the problem. So I
> have now fixed this more universally in the code.
>
> I note that setting TREE_NUM_BEFORE_NODESPLIT to a large number like 96
> is not normally recommended. This will be quite slow (as you found, I
> think), because it will reduce the number of node-level interactions that
> can be computed, while driving up the number of particle-particle
> interactions. In the limit of an extremely large TREE_NUM_BEFORE_NODESPLIT,
> one then gets ever closer to direct summation.
>
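
The trend described above can be illustrated with a toy count. This is an assumed model, not GADGET-4's actual interaction count: N particles are packed into leaves of at most leaf_size particles, and only the pairs inside each leaf are counted as particle-particle interactions. The function name is hypothetical.

```cpp
#include <cstdint>

// Toy model for illustration only (an assumption, not GADGET-4 code):
// counts the particle pairs that share a leaf when n_particles are packed
// into leaves holding at most leaf_size particles each.
// Precondition: leaf_size >= 1.
std::uint64_t pairs_within_leaves(std::uint64_t n_particles, std::uint64_t leaf_size)
{
  // number of unordered pairs among n particles
  auto pairs = [](std::uint64_t n) { return n < 2 ? std::uint64_t{0} : n * (n - 1) / 2; };

  const std::uint64_t full_leaves = n_particles / leaf_size;
  const std::uint64_t rest = n_particles % leaf_size;  // particles in the last, partly filled leaf
  return full_leaves * pairs(leaf_size) + pairs(rest);
}
```

For 960 particles the count grows from 7200 pairs at a leaf size of 16 to 45600 at 96, and reaches the full direct-summation value of 460320 when a single leaf holds everything.
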
> Cheers,
> Volker
>
>
> > On 9. Sep 2021, at 17:26, Weiguang Cui <cuiweiguang_at_gmail.com> wrote:
> >
> > Dear Volker,
> >
> > For the problem with the MPI_Sendrecv call in SUBFIND, I think it happened
> > in this function, `SubDomain->particle_exchange_based_on_PS(SubComm);`
> > (line 101 in subfind_processing.cc). After replacing "MPI_Sendrecv" with
> > "myMPI_Sendrecv" within this function in the file `domain_exchange.cc`,
> > the code no longer reports the MPI_Sendrecv error.
> >
> > However, the original SUBFIND problem shows up again:
> > ```
> > Code termination on task=2, function treebuild_insert_group_of_points(),
> file src/tree/tree.cc, line 489: It appears we have reached the bottom of
> the tree because there are more than TREE_NUM_BEFORE_NODESPLIT=96 particles
> in the smallest tree node representable for BITS_FOR_POSITIONS=64.
> > Either eliminate the particles at (nearly) indentical coordinates,
> increase the setting for TREE_NUM_BEFORE_NODESPLIT, or possibly enlarge
> BITS_FOR_POSITIONS if you have really not enough dynamic range
> > ```
> > As you can see, I have increased TREE_NUM_BEFORE_NODESPLIT to 96.
> > Increasing this value to 128 requires an increase of MaxOnFetchStack in
> > fmm.cc, which caused memory problems. Here is my current setting:
> > `MaxOnFetchStack = std::max<int>(50 * (Tp->NumPart + NumPartImported), 9 * TREE_MIN_WORKSTACK_SIZE);`
> > If you suggest an even larger value, I can only restart from a snapshot
> > and change the number of nodes.
> > By the way, the code runs a little slowly with a large value of
> > TREE_MIN_WORKSTACK_SIZE.
> >
> > I have rsynced the recent run slurm.3774931.out to the m200n2048-dm/ for
> your reference.
> >
> > Thank you for the comment on the lightcone thickness; it was my mistake
> > not to notice the unit. I hope that is not connected to the SUBFIND
> > problem. I have increased the value to 2 Mpc/h.
> >
> > Thank you for your help!
> >
> > Best,
> > Weiguang
> >
> > -------------------------------------------
> > https://weiguangcui.github.io/
> >
> >
> > On Wed, Sep 8, 2021 at 8:33 PM Volker Springel <
> vspringel_at_mpa-garching.mpg.de> wrote:
> >
> > Dear Weiguang,
> >
> > Sorry for my sluggish answer. Too many other things on my plate.
> >
> > I think the crash you experienced in an MPI_Sendrecv call in SUBFIND
> happens most likely in line 270 of the file
> src/subfind/subfind_distribute.cc, because this call there is not protected
> yet against transfer sizes that exceed 2 GB in total... For the particle
> number and setup you're using, you are actually having a particle storage
> of ~1.52 GB or so on average. With a memory imbalance of ~30% (which you
> actually just reach according to your log file), it is possible that you
> reach the 2GB at this place, causing the native call of MPI_Sendrecv to
> fail.
> >
> > If this is indeed the problem, then replacing "MPI_Sendrecv" with
> > "myMPI_Sendrecv" in lines 270 to 274 should fix it. I have also made this
> > change in the code repository.
> >
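
The estimate above works out to 1.52 GB × 1.30 ≈ 1.98 GB, just short of the 2 GB mark. The classic C binding of MPI_Sendrecv takes its element counts as `int`, so one generic way to protect a transfer is to split it into pieces that each stay below a chosen maximum. The sketch below only plans the piece sizes. How myMPI_Sendrecv is actually implemented has not been checked here, and plan_chunks is a hypothetical name.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper for illustration only; not part of GADGET-4 or MPI.
// Splits a transfer of total_bytes into consecutive pieces of at most
// max_chunk bytes each, so that every piece could be passed to a call
// whose count argument is a 32-bit int.
std::vector<std::uint64_t> plan_chunks(std::uint64_t total_bytes, std::uint64_t max_chunk)
{
  std::vector<std::uint64_t> chunks;
  if (max_chunk == 0)
    return chunks;  // nothing sensible to plan; avoids an endless loop

  while (total_bytes > 0)
    {
      const std::uint64_t piece = std::min(total_bytes, max_chunk);
      chunks.push_back(piece);
      total_bytes -= piece;
    }
  return chunks;
}
```

For example, a 5-byte transfer with a 2-byte maximum is planned as pieces of 2, 2 and 1 bytes; the sender and the receiver would have to loop over the same plan.
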
> >
> > Thanks for letting me know that you had to change the default size of
> TREE_MIN_WORKSTACK_SIZE to get around the bookkeeping buffer problem you
> experienced in fmm.cc. I guess I need to think about how this setting can
> be adjusted automatically so that it works in conditions like the one you
> created in your run.
> >
> > Best,
> > Volker
> >
> >
> >
> >
> > > On 2. Sep 2021, at 11:23, Weiguang Cui <cuiweiguang_at_gmail.com> wrote:
> > >
> > > Hi Volker,
> > >
> > > Did you find some time to look at the problem? I would like to have
> this run finished a.s.a.p. So I further modified the code (see the Gadget4
> folder for changes with git diff):
> > > FMM factor is increased to 50
> > > - MaxOnFetchStack = std::max<int>(0.1 * (Tp->NumPart + NumPartImported), TREE_MIN_WORKSTACK_SIZE);
> > > + MaxOnFetchStack = std::max<int>(50 * (Tp->NumPart + NumPartImported), 10 * TREE_MIN_WORKSTACK_SIZE);
> > > and the tree_min_workstack_size in gravtree.h is also increased:
> > > -#define TREE_MIN_WORKSTACK_SIZE 100000
> > > +#define TREE_MIN_WORKSTACK_SIZE 400000
> > >
> > > With these modifications, the code did not show the
> > > `Can't even process a single particle` problem in fmm, but crashed with
> > > an MPI_Sendrecv problem in subfind. See the job slurm.3757634 for
> > > details. Maybe this is connected with the previous SUBFIND tree
> > > construction problem, with just too many particles in the halo?
> > > If there is no easy fix, I will probably exclude the SUBFIND part to
> > > finish the run, which is a pity as the full merger tree needs to be redone.
> > >
> > > Thank you.
> > >
> > > Best,
> > > Weiguang
> > >
> > > -------------------------------------------
> > > https://weiguangcui.github.io/
> > >
> > >
> > > On Sun, Aug 29, 2021 at 5:57 PM Volker Springel <
> vspringel_at_mpa-garching.mpg.de> wrote:
> > >
> > > Hi Weiguang,
> > >
> > > The tree construction problem in subfind is odd and still bothers me.
> Could you perhaps make the run available to me on cosma7 so that I can
> investigate this myself?
> > >
> > > I agree that there should be enough total memory for FMM, but the
> > > termination of the code looks to be caused by an insufficient size
> > > allocation of internal bookkeeping buffers related to the communication
> > > parts of the algorithm. While you're at it, you could also make this setup
> > > available to me; then I can take a look at why this happens.
> > >
> > > Regards,
> > > Volker
> > >
> > > > On 24. Aug 2021, at 12:29, Weiguang Cui <cuiweiguang_at_gmail.com>
> wrote:
> > > >
> > > > Hi Volker,
> > > >
> > > > This is a pure dark-matter particle run. The problem happens when the
> > > > simulation has run to z~0.3.
> > > > As you can see from the attached config options, this simulation
> > > > used an old IC file, and double-precision output is not switched on
> > > > either.
> > > >
> > > > I increased the factor from 0.1 to 0.5, which still resulted in the
> same error in the fmm.cc. I don't think memory is an issue here. As shown
> in memory.txt, the maximum occupied memory (in the whole file) is
> > > > ```MEMORY: Largest Allocation = 11263.9 Mbyte | Largest
> Allocation Without Generic = 11263.9 Mbyte``` and the parameter
> ```MaxMemSize 18000 % in MByte``` is in agreement with
> the machine's memory (cosma7). I will increase the factor to an even higher
> value to see if that works.
> > > >
> > > > If the single-precision position is not an issue, could it be caused
> by the `FoFGravTree.treebuild(num, d);` or
> `FoFGravTree.treebuild(num_removed, dremoved);` in subfind_unbind in which
> an FoF group has too many particles in a very small volume to build the
> tree?
> > > >
> > > > Any suggestions are welcome. Many thanks!
> > > >
> > > > ==================================
> > > > ALLOW_HDF5_COMPRESSION
> > > > ASMTH=1.2
> > > > DOUBLEPRECISION=1
> > > > DOUBLEPRECISION_FFTW
> > > > FMM
> > > > FOF
> > > > FOF_GROUP_MIN_LEN=32
> > > > FOF_LINKLENGTH=0.2
> > > > FOF_PRIMARY_LINK_TYPES=2
> > > > FOF_SECONDARY_LINK_TYPES=1+16+32
> > > > GADGET2_HEADER
> > > > IDS_64BIT
> > > > LIGHTCONE
> > > > LIGHTCONE_IMAGE_COMP_HSML_VELDISP
> > > > LIGHTCONE_MASSMAPS
> > > > LIGHTCONE_PARTICLES
> > > > LIGHTCONE_PARTICLES_GROUPS
> > > > MERGERTREE
> > > > MULTIPOLE_ORDER=3
> > > > NTAB=128
> > > > NTYPES=6
> > > > PERIODIC
> > > > PMGRID=4096
> > > > RANDOMIZE_DOMAINCENTER
> > > > RCUT=4.5
> > > > SELFGRAVITY
> > > > SUBFIND
> > > > SUBFIND_HBT
> > > > TREE_NUM_BEFORE_NODESPLIT=64
> > > > ===========================================================
> > > >
> > > >
> > > > Best,
> > > > Weiguang
> > > >
> > > > -------------------------------------------
> > > > https://weiguangcui.github.io/
> > > >
> > > >
> > > > On Mon, Aug 23, 2021 at 1:49 PM Volker Springel <
> vspringel_at_mpa-garching.mpg.de> wrote:
> > > >
> > > > Hi Weiguang,
> > > >
> > > > The code termination you experienced in the tree construction during
> subfind is quite puzzling to me, especially since you used
> BITS_FOR_POSITIONS=64... In principle, this situation should only arise if
> you have a small group of particles (~16) in a region about 10^18 smaller
> than the boxsize. Has this situation occurred during a simulation run, or
> in postprocessing? If you have used single precision for storing positions
> in a snapshot file, or if you have dense blobs of gas with intense star
> formation, then you can get occasional coordinate collisions of two or
> several particles, but ~16 seems increasingly unlikely. So I'm not sure
> what's really going on here. Have things actually worked when setting
> TREE_NUM_BEFORE_NODESPLIT=64?
> > > >
> > > > The issue in FMM is a memory issue. It should be possible to resolve
> it with a higher setting of MaxMemSize, or by enlarging the factor 0.1 in
> line 1745 of fmm.cc,
> > > > MaxOnFetchStack = std::max<int>(0.1 * (Tp->NumPart +
> NumPartImported), TREE_MIN_WORKSTACK_SIZE);
> > > >
> > > > Best,
> > > > Volker
> > > >
> > > >
> > > > > On 21. Aug 2021, at 10:10, Weiguang Cui <cuiweiguang_at_gmail.com>
> wrote:
> > > > >
> > > > > Dear all,
> > > > >
> > > > > I recently ran into another problem with the 2048^3, 200 Mpc/h run.
> > > > >
> > > > > treebuild in SUBFIND requires a higher value for
> TREE_NUM_BEFORE_NODESPLIT:
> > > > > ==========================================================
> > > > > SUBFIND: We now execute a parallel version of SUBFIND.
> > > > > SUBFIND: Previous subhalo catalogue had approximately a size
> 2.42768e+09, and the summed squared subhalo size was 8.42698e+16
> > > > > SUBFIND: Number of FOF halos treated with collective SubFind
> algorithm = 1
> > > > > SUBFIND: Number of processors used in different partitions for the
> collective SubFind code = 2
> > > > > SUBFIND: (The adopted size-limit for the collective algorithm was
> 9631634 particles, for threshold size factor 0.6)
> > > > > SUBFIND: The other 10021349 FOF halos are treated in parallel with
> serial code
> > > > > SUBFIND: subfind_distribute_groups() took 0.044379 sec
> > > > > SUBFIND: particle balance=1.10537
> > > > > SUBFIND: subfind_exchange() took 30.2562 sec
> > > > > SUBFIND: particle balance for processing=1
> > > > > SUBFIND: root-task=0: Collectively doing halo 0 of length
> 10426033 on 2 processors.
> > > > > SUBFIND: subdomain decomposition took 8.54527 sec
> > > > > SUBFIND: serial subfind subdomain decomposition took 6.0162 sec
> > > > > SUBFIND: root-task=0: total number of subhalo coll_candidates=1454
> > > > > SUBFIND: root-task=0: number of subhalo candidates small enough to
> be done with one cpu: 1453. (Largest size 81455)
> > > > > Code termination on task=0, function
> treebuild_insert_group_of_points(), file src/tree/tree.cc, line 489: It
> appears we have reached the bottom of the tree because there are more than
> TREE_NUM_BEFORE_NODESPLIT=16 particles in the smallest tree node
> representable for BITS_FOR_POSITIONS=64.
> > > > > Either eliminate the particles at (nearly) indentical coordinates,
> increase the setting for TREE_NUM_BEFORE_NODESPLIT, or possibly enlarge
> BITS_FOR_POSITIONS if you have really not enough dynamic range
> > > > > ==============================================
> > > > >
> > > > > But if I increase TREE_NUM_BEFORE_NODESPLIT to 64, FMM no longer seems
> > > > > to work:
> > > > > =============================================================
> > > > > Sync-Point 19835, Time: 0.750591, Redshift: 0.332284, Systemstep:
> 5.27389e-05, Dloga: 7.02657e-05, Nsync-grv: 31415, Nsync-hyd:
> 0
> > > > > ACCEL: Start tree gravity force computation... (31415 particles)
> > > > > TREE: Full tree construction for all particles. (presently
> allocated=7626.51 MB)
> > > > > GRAVTREE: Tree construction done. took 13.4471 sec
> <numnodes>=206492 NTopnodes=115433 NTopleaves=101004
> tree-build-scalability=0.441627
> > > > > FMM: Begin tree force. timebin=13 (presently allocated=0.5 MB)
> > > > > Code termination on task=0, function gravity_fmm(), file
> src/fmm/fmm.cc, line 1879: Can't even process a single particle
> > > > > Code termination on task=887, function gravity_fmm(), file
> src/fmm/fmm.cc, line 1879: Can't even process a single particle
> > > > > Code termination on task=40, function gravity_fmm(), file
> src/fmm/fmm.cc, line 1879: Can't even process a single particle
> > > > > Code termination on task=888, function gravity_fmm(), file
> src/fmm/fmm.cc, line 1879: Can't even process a single particle
> > > > > Code termination on task=889, function gravity_fmm(), file
> src/fmm/fmm.cc, line 1879: Can't even process a single particle
> > > > > Code termination on task=3, function gravity_fmm(), file
> src/fmm/fmm.cc, line 1879: Can't even process a single particle
> > > > > Code termination on task=890, function gravity_fmm(), file
> src/fmm/fmm.cc, line 1879: Can't even process a single particle
> > > > > Code termination on task=6, function gravity_fmm(), file
> src/fmm/fmm.cc, line 1879: Can't even process a single particle
> > > > > Code termination on task=891, function gravity_fmm(), file
> src/fmm/fmm.cc, line 1879: Can't even process a single particle
> > > > > Code termination on task=9, function gravity_fmm(), file
> src/fmm/fmm.cc, line 1879: Can't even process a single particle
> > > > > Code termination on task=892, function gravity_fmm(), file
> src/fmm/fmm.cc, line 1879: Can't even process a single particle
> > > > > Code termination on task=893, function gravity_fmm(), file
> src/fmm/fmm.cc, line 1879: Can't even process a single particle
> > > > > Code termination on task=894, function gravity_fmm(), file
> src/fmm/fmm.cc, line 1879: Can't even process a single particle
> > > > > Code termination on task=20, function gravity_fmm(), file
> src/fmm/fmm.cc, line 1879: Can't even process a single particle
> > > > > ======================================
> > > > >
> > > > > I don't think fine-tuning the value for TREE_NUM_BEFORE_NODESPLIT
> is a solution.
> > > > > I can try to use BITS_FOR_POSITIONS=128 by setting
> POSITIONS_IN_128BIT, but I am afraid that the code may not be able to run
> from restart files.
> > > > > Any suggestions?
> > > > > Many thanks.
> > > > >
> > > > > Best,
> > > > > Weiguang
> > > > >
> > > > > -------------------------------------------
> > > > > https://weiguangcui.github.io/
> > > > >
> > > > > -----------------------------------------------------------
> > > > >
> > > > > If you wish to unsubscribe from this mailing, send mail to
> > > > > minimalist_at_MPA-Garching.MPG.de with a subject of: unsubscribe
> gadget-list
> > > > > A web-archive of this mailing list is available here:
> > > > > http://www.mpa-garching.mpg.de/gadget/gadget-list
Received on 2021-09-17 17:03:27

This archive was generated by hypermail 2.3.0 : 2022-09-01 14:03:43 CEST