Re: Subfind took too much time to run

From: Volker Springel <vspringel_at_MPA-Garching.MPG.DE>
Date: Wed, 11 Aug 2021 12:09:36 +0200

Hi Weiguang,

> On 11. Aug 2021, at 10:46, Weiguang Cui <cuiweiguang_at_gmail.com> wrote:
>
> Hi Volker,
>
> Yes, `ErrTolTheta` is set to 0.5 which is just slightly small than the proposed value in the document. But I really doubt that is the cause.

Note that there are *NO* recommended values in the documentation... There are only example values - what you use is your own choice and responsibility. ErrTolTheta=0.5 for the purposes of gravitational unbinding in SUBFIND is extremely conservative (especially so if you use a higher order scheme for the tree), and using this instead of, e.g., 0.9 makes a huge difference in speed.

> I can understand that the two massive halos, even treated with separated tasks, can be costly. I suppose this is seen from
> "SUBFIND: Processing overall took (total time=1482.26 sec)"
>
> But there is about 2.5ks time difference to this line:
> "SUBFIND: Finished with SUBFIND. (total time=4092.76 sec)"
> Which should be reposeful for this cost? I can not see any significant cost listed between the two lines. Maybe I am wrong in interpreting this and they are not in sequence.

I think this is most likely because one of the processors working on full groups has taken longer than the two sets of 'collective tasks'. Task 4 is the likely culprit - it will have to work alone on a halo probably not much smaller than 5965232 particles (the reported limit for full assignment), plus a couple of smaller halos. The time this took is not explicitly reported in the log messages.

The message "Processing overall took" is actually not waiting for all ranks before it is printed, i.e. it can already be issued by root-task=0 when its collective work is completed. So if task=4 takes longer, this wouldn't show up there. I have now added a MPI_Barrier statement in the code to change this behaviour in the future, i.e. "Processing overall took" now measures the full completion time.

Regards,
Volker



> Thanks.
>
> Best,
> Weiguang
>
> -------------------------------------------
> https://weiguangcui.github.io/
>
>
> On Tue, Aug 10, 2021 at 6:50 PM Volker Springel <vspringel_at_mpa-garching.mpg.de> wrote:
>
> Hi Weiguang,
>
> No, the code has decided to treat the two biggest halos with 2 tasks each (because they don't fit on a single processor with their ~6 million particles), all the other tasks are used to process the 8913146 halos that are smaller. Only for halos that are treated with multiple tasks you see the "root-task" messages.
>
> The gravitational unbinding of highly resolved halos can take quite a bit of time... it might be that you have set a quite small value of ErrTolTheta, which would slow it down unnecessarily.
>
> Regards,
> Volker
>
>
> > On 7. Aug 2021, at 21:42, Weiguang Cui <cuiweiguang_at_gmail.com> wrote:
> >
> > Dear all,
> >
> > I understand that this simulation (2048^3 particles) is large to run, is it normal that the SUBFIND takes about 4ks to complete? It seems that there are only two tasks to run subfind. Are there some options in SUBFIND that I did set not correctly?
> >
> > ```
> > SUBFIND: We now execute a parallel version of SUBFIND.
> > SUBFIND: Previous subhalo catalogue had approximately a size 2.76992e+09, and the summed squared subhalo size was 1.16758e+14
> > SUBFIND: Number of FOF halos treated with collective SubFind algorithm = 2
> > SUBFIND: Number of processors used in different partitions for the collective SubFind code = 4
> > SUBFIND: (The adopted size-limit for the collective algorithm was 5965232 particles, for threshold size factor 0.6)
> > SUBFIND: The other 8913146 FOF halos are treated in parallel with serial code
> > SUBFIND: subfind_distribute_groups() took 0.044987 sec
> > SUBFIND: particle balance=1.23335
> > SUBFIND: subfind_exchange() took 20.5998 sec
> > SUBFIND: particle balance for processing=1
> > SUBFIND: root-task=2: Collectively doing halo 1 of length 6001658 on 2 processors.
> > SUBFIND: root-task=0: Collectively doing halo 0 of length 6318694 on 2 processors.
> > SUBFIND: subdomain decomposition took 5.28067 sec
> > SUBFIND: serial subfind subdomain decomposition took 4.50761 sec
> > SUBFIND: root-task=2: total number of subhalo coll_candidates=2858
> > SUBFIND: root-task=2: number of subhalo candidates small enough to be done with one cpu: 2858. (Largest size 234093)
> > SUBFIND: root-task=0: total number of subhalo coll_candidates=3148
> > SUBFIND: root-task=0: number of subhalo candidates small enough to be done with one cpu: 3148. (Largest size 1116065)
> > SUBFIND: root-task=2: bringing the independent ones back took 1.43275 sec
> > SUBFIND: root-task=2: the collective unbinding of remaining halos took 728.107 sec
> > SUBFIND: root-task=2: found 2312 bound substructures in FoF group of length 6001658
> > SUBFIND: root-task=2: Collective processing of halo 1 took 1212.43
> > SUBFIND: root-task=0: bringing the independent ones back took 1.69057 sec
> > SUBFIND: root-task=0: the collective unbinding of remaining halos took 893.829 sec
> > SUBFIND: root-task=0: found 2642 bound substructures in FoF group of length 6318694
> > SUBFIND: subfind_hbt_single_group() processing for Ngroups=1 took 1459.14 sec
> > SUBFIND: root-task=0: Collective processing of halo 0 took 1473.2
> > SUBFIND: Processing overall took (total time=1482.26 sec)
> > ALLOCATE: Changing to MaxPart = 17425843
> > ALLOCATE: Changing to MaxPartSph = 0
> > SUBFIND: subfind_exchange() (for return to original CPU) took 26.0112 sec
> > TREE: Full tree construction for all particles. (presently allocated=3948.2 MB)
> > GRAVTREE: Tree construction done. took 71.5354 sec <numnodes>=1.90281e+06 NTopnodes=70545 NTopleaves=61727 tree-build-scalability=0.962969
> > SUBFIND: SO iteration 1: need to repeat for 7848033 halo centers. (took 0.309375 sec)
> > SUBFIND: SO iteration 2: need to repeat for 7848033 halo centers. (took 0.2519 sec)
> > SUBFIND: SO iteration 3: need to repeat for 7848033 halo centers. (took 0.268218 sec)
> > SUBFIND: SO iteration 4: need to repeat for 7848033 halo centers. (took 0.266999 sec)
> > SUBFIND: SO iteration 5: need to repeat for 7848033 halo centers. (took 0.288053 sec)
> > SUBFIND: SO iteration 6: need to repeat for 7848033 halo centers. (took 0.266823 sec)
> > SUBFIND: SO iteration 7: need to repeat for 7848033 halo centers. (took 0.264547 sec)
> > SUBFIND: SO iteration 8: need to repeat for 7848033 halo centers. (took 0.263144 sec)
> > SUBFIND: SO iteration 9: need to repeat for 7848033 halo centers. (took 0.256196 sec)
> > SUBFIND: SO iteration 10: need to repeat for 7848033 halo centers. (took 0.264536 sec)
> > SUBFIND: SO iteration 11: need to repeat for 7848033 halo centers. (took 0.257483 sec)
> > SUBFIND: SO iteration 12: need to repeat for 7848033 halo centers. (took 0.25124 sec)
> > SUBFIND: SO iteration 13: need to repeat for 7848033 halo centers. (took 0.261861 sec)
> > SUBFIND: SO iteration 14: need to repeat for 7848033 halo centers. (took 0.257919 sec)
> > SUBFIND: SO iteration 15: need to repeat for 7848030 halo centers. (took 0.268111 sec)
> > SUBFIND: SO iteration 16: need to repeat for 38913 halo centers. (took 0.259465 sec)
> > SUBFIND: SO iteration 17: need to repeat for 37 halo centers. (took 0.0242081 sec)
> > SUBFIND: SO iteration 1: need to repeat for 7848033 halo centers. (took 0.264944 sec)
> > SUBFIND: SO iteration 2: need to repeat for 7848033 halo centers. (took 0.237437 sec)
> > SUBFIND: SO iteration 3: need to repeat for 7848033 halo centers. (took 0.257997 sec)
> > SUBFIND: SO iteration 4: need to repeat for 7848033 halo centers. (took 0.263542 sec)
> > SUBFIND: SO iteration 5: need to repeat for 7848033 halo centers. (took 0.268112 sec)
> > SUBFIND: SO iteration 6: need to repeat for 7848033 halo centers. (took 0.254724 sec)
> > SUBFIND: SO iteration 7: need to repeat for 7848033 halo centers. (took 0.268252 sec)
> > SUBFIND: SO iteration 8: need to repeat for 7848033 halo centers. (took 0.25666 sec)
> > SUBFIND: SO iteration 9: need to repeat for 7848033 halo centers. (took 0.272972 sec)
> > SUBFIND: SO iteration 10: need to repeat for 7848033 halo centers. (took 0.251183 sec)
> > SUBFIND: SO iteration 11: need to repeat for 7848033 halo centers. (took 0.265267 sec)
> > SUBFIND: SO iteration 12: need to repeat for 7848033 halo centers. (took 0.257801 sec)
> > SUBFIND: SO iteration 13: need to repeat for 7848033 halo centers. (took 0.269823 sec)
> > SUBFIND: SO iteration 14: need to repeat for 7848033 halo centers. (took 0.265494 sec)
> > SUBFIND: SO iteration 15: need to repeat for 7848030 halo centers. (took 0.247782 sec)
> > SUBFIND: SO iteration 16: need to repeat for 36428 halo centers. (took 0.259644 sec)
> > SUBFIND: SO iteration 17: need to repeat for 35 halo centers. (took 0.019228 sec)
> > SUBFIND: SO iteration 1: need to repeat for 7848033 halo centers. (took 0.265704 sec)
> > SUBFIND: SO iteration 2: need to repeat for 7848033 halo centers. (took 0.229607 sec)
> > SUBFIND: SO iteration 3: need to repeat for 7848033 halo centers. (took 0.254473 sec)
> > SUBFIND: SO iteration 4: need to repeat for 7848033 halo centers. (took 0.25696 sec)
> > SUBFIND: SO iteration 5: need to repeat for 7848033 halo centers. (took 0.26164 sec)
> > SUBFIND: SO iteration 6: need to repeat for 7848033 halo centers. (took 0.25506 sec)
> > SUBFIND: SO iteration 7: need to repeat for 7848033 halo centers. (took 0.240955 sec)
> > SUBFIND: SO iteration 8: need to repeat for 7848033 halo centers. (took 0.260188 sec)
> > SUBFIND: SO iteration 9: need to repeat for 7848033 halo centers. (took 0.248877 sec)
> > SUBFIND: SO iteration 10: need to repeat for 7848033 halo centers. (took 0.256545 sec)
> > SUBFIND: SO iteration 11: need to repeat for 7848033 halo centers. (took 0.25118 sec)
> > SUBFIND: SO iteration 12: need to repeat for 7848033 halo centers. (took 0.25854 sec)
> > SUBFIND: SO iteration 13: need to repeat for 7848033 halo centers. (took 0.246834 sec)
> > SUBFIND: SO iteration 14: need to repeat for 7848033 halo centers. (took 0.27007 sec)
> > SUBFIND: SO iteration 15: need to repeat for 7848033 halo centers. (took 0.258597 sec)
> > SUBFIND: SO iteration 16: need to repeat for 122350 halo centers. (took 0.248988 sec)
> > SUBFIND: SO iteration 17: need to repeat for 181 halo centers. (took 0.0358 sec)
> > SUBFIND: SO iteration 1: need to repeat for 7848033 halo centers. (took 0.269485 sec)
> > SUBFIND: SO iteration 2: need to repeat for 7848033 halo centers. (took 0.228188 sec)
> > SUBFIND: SO iteration 3: need to repeat for 7848033 halo centers. (took 0.234185 sec)
> > SUBFIND: SO iteration 4: need to repeat for 7848033 halo centers. (took 0.229443 sec)
> > SUBFIND: SO iteration 5: need to repeat for 7848033 halo centers. (took 0.224041 sec)
> > SUBFIND: SO iteration 6: need to repeat for 7848033 halo centers. (took 0.232641 sec)
> > SUBFIND: SO iteration 7: need to repeat for 7848033 halo centers. (took 0.230443 sec)
> > SUBFIND: SO iteration 8: need to repeat for 7848033 halo centers. (took 0.219413 sec)
> > SUBFIND: SO iteration 9: need to repeat for 7848033 halo centers. (took 0.220803 sec)
> > SUBFIND: SO iteration 10: need to repeat for 7848033 halo centers. (took 0.237108 sec)
> > SUBFIND: SO iteration 11: need to repeat for 7848033 halo centers. (took 0.232947 sec)
> > SUBFIND: SO iteration 12: need to repeat for 7848033 halo centers. (took 0.240728 sec)
> > SUBFIND: SO iteration 13: need to repeat for 7848033 halo centers. (took 0.228163 sec)
> > SUBFIND: SO iteration 14: need to repeat for 7848033 halo centers. (took 0.221182 sec)
> > SUBFIND: SO iteration 15: need to repeat for 7848033 halo centers. (took 0.224345 sec)
> > SUBFIND: SO iteration 16: need to repeat for 6272558 halo centers. (took 0.226659 sec)
> > SUBFIND: SO iteration 17: need to repeat for 46051 halo centers. (took 0.188434 sec)
> > SUBFIND: SO iteration 18: need to repeat for 52 halo centers. (took 0.01652 sec)
> > SUBFIND: determining spherical overdensity masses took 16.475 sec
> > SUBFIND: assembled and ordered groups and subhalos (took 5.52871 sec)
> > FOF/SUBFIND: writing group catalogue file: './snaps//groups_026/fof_subhalo_tab_026.0' (file 1 of 32)
> > FOF/SUBFIND: writing group catalogue block 0 (GroupLen)...
> > FOF/SUBFIND: writing group catalogue block 1 (GroupMass)..
> > FOF/SUBFIND: writing group catalogue block 2 (GroupPos)...
> > FOF/SUBFIND: writing group catalogue block 3 (GroupVel)...
> > FOF/SUBFIND: writing group catalogue block 4 (GroupLenType)...
> > FOF/SUBFIND: writing group catalogue block 5 (GroupOffsetType)...
> > FOF/SUBFIND: writing group catalogue block 6 (GroupMassType)...
> > FOF/SUBFIND: writing group catalogue block 7 (GroupAscale)...
> > FOF/SUBFIND: writing group catalogue block 9 (Group_M_Mean200)...
> > FOF/SUBFIND: writing group catalogue block 10 (Group_R_Mean200)...
> > FOF/SUBFIND: writing group catalogue block 11 (Group_M_Crit200)..
> > FOF/SUBFIND: writing group catalogue block 12 (Group_R_Crit200)...
> > FOF/SUBFIND: writing group catalogue block 13 (Group_M_Crit500)...
> > FOF/SUBFIND: writing group catalogue block 14 (Group_R_Crit500)...
> > FOF/SUBFIND: writing group catalogue block 15 (Group_M_TopHat200)...
> > FOF/SUBFIND: writing group catalogue block 16 (Group_R_TopHat200)...
> > FOF/SUBFIND: writing group catalogue block 17 (GroupNsubs)...
> > FOF/SUBFIND: writing group catalogue block 18 (GroupFirstSub)...
> > FOF/SUBFIND: writing group catalogue block 19 (SubhaloGroupNr)...
> > FOF/SUBFIND: writing group catalogue block 20 (SubhaloRankInGr)...
> > FOF/SUBFIND: writing group catalogue block 21 (SubhaloLen)...
> > FOF/SUBFIND: writing group catalogue block 22 (SubhaloMass)...
> > FOF/SUBFIND: writing group catalogue block 23 (SubhaloPos)...
> > FOF/SUBFIND: writing group catalogue block 24 (SubhaloVel)...
> > FOF/SUBFIND: writing group catalogue block 25 (SubhaloLenType)...
> > FOF/SUBFIND: writing group catalogue block 26 (SubhaloOffsetType)...
> > FOF/SUBFIND: writing group catalogue block 27 (SubhaloMassType)...
> > FOF/SUBFIND: writing group catalogue block 28 (SubhaloCM)...
> > FOF/SUBFIND: writing group catalogue block 29 (SubhaloSpin)...
> > FOF/SUBFIND: writing group catalogue block 30 (SubhaloVelDisp)...
> > FOF/SUBFIND: writing group catalogue block 31 (SubhaloVmax)...
> > FOF/SUBFIND: writing group catalogue block 32 (SubhaloVmaxRad)...
> > FOF/SUBFIND: writing group catalogue block 33 (SubhaloHalfmassRad)...
> > FOF/SUBFIND: writing group catalogue block 34 (SubhaloHalfmassRadType)..
> > FOF/SUBFIND: writing group catalogue block 35 (SubhaloIDMostbound)...
> > FOF/SUBFIND: writing group catalogue block 36 (SubhaloParentRank)...
> > FOF/SUBFIND: Group catalogues saved. took = 8.89541 sec, total size 3525.03 MB, corresponds to effective I/O rate of 396.275 MB/sec
> > SUBFIND: Subgroup catalogues saved. took = 9.7367 sec
> > SUBFIND: Finished with SUBFIND. (total time=4092.76 sec)
> > SUBFIND: Total number of subhalos with at least 32 particles: 9849779
> > SUBFIND: Largest subhalo has 4419279 particles/cells.
> > SUBFIND: Total number of particles/cells in subhalos: 2897277494
> > ```
> >
> > Many thanks for any suggestions. Please let me know if you need any more information.
> >
> > Best,
> > Weiguang
> >
> > -------------------------------------------
> > https://weiguangcui.github.io/
> >
> > -----------------------------------------------------------
> >
> > If you wish to unsubscribe from this mailing, send mail to
> > minimalist_at_MPA-Garching.MPG.de with a subject of: unsubscribe gadget-list
> > A web-archive of this mailing list is available here:
> > http://www.mpa-garching.mpg.de/gadget/gadget-list
>
>
>
>
> -----------------------------------------------------------
>
> If you wish to unsubscribe from this mailing, send mail to
> minimalist_at_MPA-Garching.MPG.de with a subject of: unsubscribe gadget-list
> A web-archive of this mailing list is available here:
> http://www.mpa-garching.mpg.de/gadget/gadget-list
>
> -----------------------------------------------------------
>
> If you wish to unsubscribe from this mailing, send mail to
> minimalist_at_MPA-Garching.MPG.de with a subject of: unsubscribe gadget-list
> A web-archive of this mailing list is available here:
> http://www.mpa-garching.mpg.de/gadget/gadget-list
Received on 2021-08-11 12:09:37

This archive was generated by hypermail 2.3.0 : 2023-01-10 10:01:33 CET