Re: Some setups about on-the-fly lightcones

From: Volker Springel <vspringel_at_MPA-Garching.MPG.DE>
Date: Tue, 10 Jan 2023 09:56:58 +0100

Hi Zhao,

I have now analysed the problem further. It arises because my FOF algorithm assumes that particle IDs are unique; if they are not, crashes can occur depending on circumstances. For ordinary snapshots, the code explicitly checks the uniqueness of the IDs. For the lightcone shells, however, replicating the box to cover the lightcone duplicates IDs. In addition, the number of particles in a lightcone shell can grow beyond the number of particles in the box itself. I think this has also happened in your run, and because you selected 32-bit IDs, it created additional ID collisions once the particle number in the lightcone shell exceeded 2^32.
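
To illustrate the replication effect, here is a minimal toy sketch in Python (not GADGET-4 code). The helper `replicate_ids` and the tiny ID list are hypothetical, and the sketch ignores the shell geometry, so every periodic image contributes all of its particles:

```python
from collections import Counter
from itertools import product


def replicate_ids(ids, nrep):
    """Return (id, image) pairs for a box tiled nrep times along each axis.

    Hypothetical helper: every periodic image carries the same particle IDs,
    which is why the bare IDs stop being unique after replication.
    """
    images = list(product(range(nrep), repeat=3))
    return [(pid, img) for img in images for pid in ids]


box_ids = [1, 2, 3, 4]                    # IDs are unique inside the box
shell = replicate_ids(box_ids, nrep=2)    # 2 x 2 x 2 = 8 periodic images
counts = Counter(pid for pid, _ in shell)

# total entries, distinct IDs, copies of each ID
print(len(shell), len(counts), max(counts.values()))  # prints: 32 4 8
```

Any algorithm that uses the bare ID as a unique key sees only 4 distinct particles here, although 32 entries are present.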

I have made a code change that eliminates the problem of the FOF algorithm with duplicate IDs in lightcone shells. It will also detect if IDS_32BIT is insufficient, so now the error should (hopefully) not occur any more.
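
As a rough sanity check of the 32-bit limit, the arithmetic can be sketched in Python with the numbers from the report quoted below (768**3 particles in the box, 5299598248 particles in the lightcone-shell domain decomposition). The function name `ids_32bit_sufficient` is hypothetical and is not the check implemented in the code; the sketch assumes IDs are stored as unsigned 32-bit integers:

```python
# Assumption: an unsigned 32-bit ID can take 2**32 distinct values.
MAX_IDS_32BIT = 2**32


def ids_32bit_sufficient(num_particles):
    """Hypothetical check: can num_particles all receive distinct 32-bit IDs?"""
    return num_particles <= MAX_IDS_32BIT


box_particles = 768**3          # particle number in the periodic box
shell_particles = 5299598248    # from the "DOMAIN: exchange of ... particles" log line

print(box_particles)                          # prints: 452984832
print(ids_32bit_sufficient(box_particles))    # prints: True
print(ids_32bit_sufficient(shell_particles))  # prints: False
```

So the box alone fits comfortably into 32-bit IDs, while the lightcone shell in this run does not.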

Regards,
Volker

> On 6. Jan 2023, at 13:00, 陈钊 <chyiru_at_sjtu.edu.cn> wrote:
>
> Hi Volker,
>
> Thanks for your help! I have updated to the recent modification for the LIGHTCONE_PARTICLES_GROUPS option, but the same error still occurs. My simulation setup and the detailed error are shown below.
> ```
> # Basic code operation
> LEAN
> PERIODIC
> SELFGRAVITY
> RANDOMIZE_DOMAINCENTER
> # Gravity options
> PMGRID=1536
> TREEPM_NOTIMESPLIT
> FFT_COLUMN_BASED
> # Softening types and particle types
> NSOFTCLASSES=1
> NTYPES=2
> GADGET2_HEADER
> # Floating point accuracy
> IDS_32BIT
> # Group finding / Subfind / Merger tree
> FOF
> FOF_GROUP_MIN_LEN=20
> SUBFIND
> # LightCones
> LIGHTCONE
> LIGHTCONE_PARTICLES
> LIGHTCONE_PARTICLES_GROUPS
> LIGHTCONE_MAX_BOXREPLICAS=100000
> LIGHTCONE_MASSMAPS
> # Miscellaneous code options
> HOST_MEMORY_REPORTING
> POWERSPEC_ON_OUTPUT
> ALLOW_HDF5_COMPRESSION
> ```
>
> BoxSize = 250 Mpc/h, Particle number: 768**3.
> For light cone output: lightcones.txt [2 0 0.245 1.0 1 0 0 10]
> For snapshot output: [0.2500000, 0.3333333, 0.5000000, 0.6666666, 1.0000000]
>
> The error occurred at time 'a = 0.255474'.
> ```
> LIGHTCONE_PARTICLES_GROUPS: We shall first compute a group catalogue for the lightcone particles
> DOMAIN: Begin domain decomposition (sync-point 3142).
> DOMAIN: Sum=1 TotalCost=1 NumTimeBinsToBeBalanced=0 MultipleDomains=1
> DOMAIN: NTopleaves=41861, determination of top-level tree involved 9 iterations and took 1.75064 sec
> DOMAIN: we are going to try at most 443 different settings for combining the domains on tasks=3840, nnodes=128
> DOMAIN: total_cost=1 total_load=1
> DOMAIN: combining multiple-domains succeeded, target=3840 NTask=3840
> DOMAIN: best solution found after 1 iterations by task=6 for nextra=48, reaching maximum imbalance of 1.26671|1.02266
> DOMAIN: combining multiple-domains took 0.179027 sec
> DOMAIN: exchange of 5299598248 particles
> DOMAIN: particle exchange done. (took 2.23811 sec)
> DOMAIN: domain decomposition done. (took in total 4.20348 sec)
> PEANO: Begin Peano-Hilbert order...
> PEANO: done, took 0.630732 sec.
> FOF: Begin to compute FoF group catalogue... (presently allocated=355.221 MB)
> FOF: Comoving linking length: 0.0689223
> TREE: Full tree construction for all particles. (presently allocated=431.1 MB)
> FOFTREE: Ngb-tree construction done. took 1.52822 sec <numnodes>=286556 NTopnodes=47841 NTopleaves=41861
> FOF: Start linking particles (presently allocated=444.598 MB)
> FOF: linking of small cells took 0.00396637 sec
> FOF: local links done (took 2.48363 sec, avg-work=1.89282, imbalance=1.30774).
> FOF: Marked=71300643 out of the 5299598248 primaries which are linked
> FOF: begin linking across processors (presently allocated=454.268 MB)
> FOF: have done 3081770 cross links (processed 71300643, took 1.10096 sec)
> FOF: have done 306612 cross links (processed 7874010, took 0.0192003 sec)
> FOF: have done 40569 cross links (processed 1013927, took 0.0102004 sec)
> FOF: have done 6088 cross links (processed 160141, took 0.00862979 sec)
> FOF: have done 938 cross links (processed 25274, took 0.00800514 sec)
> FOF: have done 135 cross links (processed 4655, took 0.0124408 sec)
> FOF: have done 26 cross links (processed 1364, took 0.0114822 sec)
> FOF: have done 5 cross links (processed 96, took 0.00564015 sec)
> FOF: have done 0 cross links (processed 33, took 0.00865922 sec)
> FOF: Local groups found.
> FOF: primary group finding took = 3.70749 sec
> FOF: attaching gas and star particles to nearest dm particles took = 0 sec
> Code termination on task=3916, function fof_compile_catalogue(), file src/fof/fof.cc, line 595: start=133051 i=191371 Tp->NumPart=1013871 FOF_GList[start].DistanceOrigin=4334.64 != FOF_GList[i].DistanceOrigin=4341.53 MinID=212460889 MinIDTask=3790
> ```
>
> Thank you very much!
>
> Best,
> Zhao
>
>
>
>> On 5 Jan 2023, at 21:08, Volker Springel <vspringel_at_MPA-Garching.MPG.DE> wrote:
>>
>>
>> Hi Zhao,
>>
>>> On 27. Dec 2022, at 14:10, 陈钊 <chyiru_at_sjtu.edu.cn> wrote:
>>>
>>> Dear all,
>>> I am trying to use Gadget4 to generate on-the-fly lightcones for weak lensing studies. The main products I want are the halo lightcones and mass maps. However, I found that the saved mass maps are empty if 'LIGHTCONE_MASSMAPS' is activated alone, without 'LIGHTCONE_PARTICLES'. Is this by design, or is some extra setup needed?
>>
>> I can confirm that LIGHTCONE_MASSMAPS produced empty mass maps if LIGHTCONE_PARTICLES was not activated as well. This was due to an initialization problem when only the first option was set. I have now fixed this in the code, so it should now work.
>>
>>
>>> At the same time, the snapshots must be at exact redshifts for my study, so I need to activate "OUTPUT_NON_SYNCHRONIZED_ALLOWED", which means that the halos can only be computed in post-processing. Is there a setup that lets Gadget4 post-process the particle lightcones and obtain the halo lightcones? I could not find anything about this in the documentation.
>>
>> No, at the moment there is no built-in option to load the lightcone particle data and run the group finder on it in postprocessing (unlike for snapshots, where this is readily possible). In practice, this would typically be very difficult for memory reasons: normally the particle lightcone data is much bigger than a single snapshot, so running the halo finder on it in one go would not be possible. This could only be circumvented by piecing things together somehow. If your lightcone is small enough, you can in principle accumulate the lightcone data and store it as a mock snapshot file, and then run the group finder on it.
>>
>>>
>>> Finally, when I tested some runs using a small box (250 Mpc/h) to generate particle lightcones at high redshift (z~3.0), I configured 'LIGHTCONE_MAX_BOXREPLICAS=100000', 'LIGHTCONE_PARTICLES', and 'LIGHTCONE_PARTICLES_GROUPS', without 'OUTPUT_NON_SYNCHRONIZED_ALLOWED'. But I got a strange exit message, shown below:
>>> [[[Code termination on task=3645, function fof_compile_catalogue(), file src/fof/fof.cc, line 595: start=331118 i=447801 Tp->NumPart=807400 FOF_GList[start].DistanceOrigin=4449.11 != FOF_GList[i].DistanceOrigin=4452.7 MinID=695628668 MinIDTask=3528]]]
>>> Does anyone know what this error means, and what I need to modify in the config file or elsewhere?
>>
>> This error should not occur... I think it was due to a small glitch in the LIGHTCONE_PARTICLES_GROUPS option, which I have now fixed. If the error still occurs, please send me your full setup so that I can try to reproduce it.
>>
>> Best,
>> Volker
>>
>>
>>
>>
>>> Thanks in advance for your time!
>>>
>>> Best,
>>> Zhao
>>>
>>>
>>> -----------------------------------------------------------
>>>
>>> If you wish to unsubscribe from this mailing, send mail to
>>> minimalist_at_MPA-Garching.MPG.de with a subject of: unsubscribe gadget-list
>>> A web-archive of this mailing list is available here:
>>> http://www.mpa-garching.mpg.de/gadget/gadget-list
Received on 2023-01-10 09:56:58

This archive was generated by hypermail 2.3.0 : 2023-01-10 10:01:33 CET