Re: Segmentation Fault on DMO runs on power9

From: Tiago Castro <tiagobscastro_at_gmail.com>
Date: Thu, 4 Mar 2021 11:30:50 +0100

Thanks, Volker. I have added the macros to define_extra, and running make
now fails with the error below. Do I need to link anything else for it to
compile correctly?

/usr/bin/ld: build/system/backward.o: undefined reference to symbol 'dladdr@@GLIBC_2.17'
//usr/lib64/libdl.so.2: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
make: *** [Gadget4] Error 1

Tiago Castro
Post Doc, Department of Physics / UNITS / OATS
Phone: (+39 040 3199 120)
Mobile: (+39 388 794 1562)
Email: tiagobscastro_at_gmail.com
Website: http://tiagobscastro.com
Skype: tiagobscastro
Address: Osservatorio Astronomico di Trieste / Villa Bazzoni
Via Bazzoni, 2, 34143 Trieste TS


On Wed, 3 Mar 2021 at 14:10, Volker Springel <vspringel_at_mpa-garching.mpg.de> wrote:

>
> Hi Tiago,
>
> > On 3. Mar 2021, at 12:32, Tiago Castro <tiagobscastro_at_gmail.com> wrote:
> >
> > Many thanks, Volker.
> >
> > Hm, this is possibly a shared-memory access problem, given the place
> > where it happens. Does the code run on a single node? Which MPI library
> > is this? A buggy MPI-3 implementation is certainly a prime suspect here.
> > It's also peculiar that the machine allows only 40% of the physical
> > memory to be allocated as shared memory... (this is not good).
> >
> > The code did not run on a single node either (it crashed at the same
> > place). The MPI library is the one from IBM (I am running on the M100
> > cluster).
>
> Ok, in principle this should be IBM's Spectrum MPI library, which is
> closely related to OpenMPI. However, on Marconi100, you should be able to
> use GNU/OpenMPI as an alternative by changing to the corresponding modules.
> At least on Intel processors, OpenMPI works well for Gadget4.
>
> >
> > You can try to activate DEBUG to see whether this gives a core file for
> > the crash. This would allow you to locate the line where this happens by
> > loading the core file with gdb.
> >
> > I asked support to run this, as I have not used gdb with MPI and batch
> > jobs before. I will get back to you once I manage to run it.
> >
> > Another possibility would be to add the attached stack-tracing class to
> > the compiled files for Gadget4. This will activate a signal handler and,
> > if you are moderately lucky, print an informative stack trace when the
> > crash happens.
> >
> > I apologize for my ignorance, but I did not understand how to implement
> > this.
> >
>
> You only need to move backward.cc/backward.h to a source directory (e.g.
> src/system) and include them in the makefile of Gadget4, like
>
>   OBJS += system/pinning.o system/system.o system/backward.o
>   INCL += system/system.h system/pinning.h system/backward.h
>
> That's all; the constructor of the class will be called automatically on
> start-up without needing to modify any of the original code.
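
As a rough illustration of why no changes to the original code are needed: a C++ object with static storage duration is constructed before main() runs, so its constructor can register a signal handler simply by being linked into the executable. The sketch below is not the attached backward.cc. The names CrashReporter, report_fatal_signal and handler_installed are hypothetical, and the handler only prints a fixed message rather than a stack trace.

```cpp
#include <cassert>
#include <csignal>
#include <unistd.h>

// Hypothetical flag, only here so the registration can be checked.
static bool handler_installed = false;

// Signal handler: restricted to async-signal-safe calls (write, _exit).
// A real stack-tracing class would walk and print the call stack here.
static void report_fatal_signal(int sig)
{
  static const char msg[] = "fatal signal caught, aborting\n";
  ssize_t written = write(STDERR_FILENO, msg, sizeof(msg) - 1);
  (void)written;  // nothing sensible to do if the write fails
  _exit(128 + sig);
}

// Hypothetical stand-in for the stack-tracing class.
class CrashReporter
{
 public:
  CrashReporter()
  {
    // std::signal returns SIG_ERR if the handler could not be installed.
    handler_installed = (std::signal(SIGSEGV, report_fatal_signal) != SIG_ERR);
    assert(handler_installed);
  }
};

// One global instance: its constructor runs during static initialization,
// i.e. before main(), without any call from the rest of the program.
static CrashReporter crash_reporter_instance;
```

Compiling a file like this and adding its object file to the link is enough to activate the handler; nothing in main() has to refer to it.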
>
> Regards,
> Volker
>
>
>
>
>
> > Many thanks!
> > Tiago Castro Post Doc, Department of Physics / UNITS / OATS
> > Phone: (+39 040 3199 120)
> > Mobile: (+39 388 794 1562)
> > Email: tiagobscastro_at_gmail.com
> > Website: tiagobscastro.com
> > Skype: tiagobscastro
> > Address: Osservatorio Astronomico di Trieste / Villa Bazzoni
> > Via Bazzoni, 2, 34143 Trieste TS
> >
> >
> >
> >
> > On Thu, 25 Feb 2021 at 15:59, Volker Springel <vspringel_at_mpa-garching.mpg.de> wrote:
> > Hi Tiago,
> >
> > Hm, this is possibly a shared-memory access problem, given the place
> > where it happens. Does the code run on a single node? Which MPI library
> > is this? A buggy MPI-3 implementation is certainly a prime suspect here.
> > It's also peculiar that the machine allows only 40% of the physical
> > memory to be allocated as shared memory... (this is not good).
> >
> > You can try to activate DEBUG to see whether this gives a core file for
> > the crash. This would allow you to locate the line where this happens by
> > loading the core file with gdb.
> >
> > Another possibility would be to add the attached stack-tracing class to
> > the compiled files for Gadget4. This will activate a signal handler and,
> > if you are moderately lucky, print an informative stack trace when the
> > crash happens.
> >
> > Regards,
> > Volker
> >
> >
> >
> >
> > > On 25. Feb 2021, at 15:18, Tiago Castro <tiagobscastro_at_gmail.com> wrote:
> > >
> > > Dear list,
> > >
> > > I have tried to run g4 on a power9 cluster, and right after the IC
> > > creation, during the first step, the code crashes with a segmentation
> > > fault. Any suggestions as to what I am doing wrong?
> > >
> > > Many thanks for any help you can provide.
> > > Regards,
> > > T.
> > > <param.std.txt><Config.sh><slurm-2608670.out>
> > > -----------------------------------------------------------
> > >
> > > If you wish to unsubscribe from this mailing, send mail to
> > > minimalist_at_MPA-Garching.MPG.de with a subject of: unsubscribe gadget-list
> > > A web-archive of this mailing list is available here:
> > > http://www.mpa-garching.mpg.de/gadget/gadget-list
Received on 2021-03-04 11:31:11
