Debian now uses mpich as the default MPI on 32-bit architectures such as armel, armhf, i386.
This reveals a bug in netgen building against mpich. The error (from armel) is
[100%] Building CXX object ng/CMakeFiles/netgen.dir/ngappinit.cpp.o
cd /<<PKGBUILDDIR>>/obj-arm-linux-gnueabi/ng && /usr/bin/c++ -DFFMPEG -DHAVE_DLFCN_H -DHAVE_FREEIMAGE -DHAVE_FREETYPE -DHAVE_OPENGL_EXT -DHAVE_RAPIDJSON -DHAVE_TBB -DHAVE_TK -DHAVE_XLIB -DIGNORE_NO_ATOMICS -DINTERNAL_TCL_DEFAULT=1 -DJPEGLIB -DNETGEN_PYTHON -DNG_PYTHON -DOCCGEOMETRY -DOCC_CONVERT_SIGNALS -DOPENGL -DPARALLEL -DPYBIND11_SIMPLE_GIL_MANAGEMENT -DTCL -DTOGL_X11 -DUSE_TCL_STUBS -DUSE_TK_STUBS -DUSE_TOGL_2 -D_GLIBCXX_USE_CXX11_ABI=1 -D__STDC_CONSTANT_MACROS -I/<<PKGBUILDDIR>>/obj-arm-linux-gnueabi/ng -I/<<PKGBUILDDIR>>/ng -I/<<PKGBUILDDIR>>/obj-arm-linux-gnueabi -I/<<PKGBUILDDIR>>/include -I/<<PKGBUILDDIR>>/libsrc -I/<<PKGBUILDDIR>>/libsrc/include -I/usr/include/opencascade -I/usr/lib/arm-linux-gnueabi/mpich/include -I/usr/include/python3.12 -I/usr/include/tcl -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_TIME_BITS=64 -Wdate-time -D_FORTIFY_SOURCE=2 -g -O2 -ffile-prefix-map=/<<PKGBUILDDIR>>=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_TIME_BITS=64 -Wdate-time -D_FORTIFY_SOURCE=2 -O2 -g -DNDEBUG -std=gnu++17 -fvisibility=hidden -fabi-version=19 -MD -MT ng/CMakeFiles/netgen.dir/ngappinit.cpp.o -MF CMakeFiles/netgen.dir/ngappinit.cpp.o.d -o CMakeFiles/netgen.dir/ngappinit.cpp.o -c /<<PKGBUILDDIR>>/ng/ngappinit.cpp
In file included from /usr/include/tcl/tk.h:99,
from /<<PKGBUILDDIR>>/libsrc/include/inctcl.hpp:7,
from /<<PKGBUILDDIR>>/ng/ngappinit.cpp:10:
/usr/lib/arm-linux-gnueabi/mpich/include/mpicxx.h:160:18: error: expected identifier before ‘int’
160 | friend class Status;
| ^~~~~~
In file included from /usr/lib/arm-linux-gnueabi/mpich/include/mpi.h:977,
from /<<PKGBUILDDIR>>/libsrc/core/ng_mpi.hpp:14,
from /<<PKGBUILDDIR>>/libsrc/core/mpi_wrapper.hpp:13,
from /<<PKGBUILDDIR>>/libsrc/include/../meshing/meshtype.hpp:13,
from /<<PKGBUILDDIR>>/libsrc/include/../meshing/meshing.hpp:23,
from /<<PKGBUILDDIR>>/libsrc/include/meshing.hpp:1,
from /<<PKGBUILDDIR>>/ng/ngappinit.cpp:11:
/usr/lib/arm-linux-gnueabi/mpich/include/mpicxx.h:160:12: error: multiple types in one declaration
160 | friend class Status;
| ^~~~~
/usr/lib/arm-linux-gnueabi/mpich/include/mpicxx.h:160:5: error: friend declaration does not name a class or function
160 | friend class Status;
| ^~~~~~
/usr/lib/arm-linux-gnueabi/mpich/include/mpicxx.h:493:7: error: expected identifier before ‘int’
493 | class Status {
| ^~~~~~
/usr/lib/arm-linux-gnueabi/mpich/include/mpicxx.h:493:15: error: expected unqualified-id before ‘{’ token
493 | class Status {
| ^
Ultimately the error occurs when the build is configured with -DPARALLEL. If I understood correctly, this is set by libsrc/core/CMakeLists.txt when USE_MPI is set.
ng/ngappinit.cpp includes both inctcl.hpp (l.10) and mpi_wrapper.hpp (l.12). mpi_wrapper.hpp is also included indirectly via meshing.hpp (l.11), which is the include chain shown in the error output above.
inctcl.hpp includes tcl/tk.h, which in turn includes (l.99) X11/Xlib.h. Xlib.h defines (l.83)
#define Status int
At the same time core/mpi_wrapper.hpp includes ng_mpi.hpp, which includes mpi.h (if PARALLEL is set).
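mpi.h from mpich includes mpicxx.h, which uses Status as a class name (friend declaration at l.160, class definition at l.493). Because inctcl.hpp is included first, the Xlib.h macro is already in effect when mpicxx.h is parsed, so the preprocessor turns "class Status" into "class int". That is why the compiler reports "expected identifier before ‘int’". The compilation error arises from these conflicting definitions of Status.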
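The collision can be reproduced without X11 or MPI. The following is a minimal sketch, not netgen code: the macro stands in for the one in Xlib.h, and the namespace mpi_like, the member error_code and the helper default_error_code() are hypothetical names chosen for illustration. The #undef shown here is only one possible way to avoid the clash and is not the fix proposed in this issue.

```cpp
#include <cassert>
#include <type_traits>

// Stand-in for the macro from X11/Xlib.h (the real header is not included).
#define Status int

// While the macro is active, the token Status is replaced by int everywhere,
// so `class Status { ... };` would reach the compiler as `class int { ... };`.
constexpr bool status_is_int_under_macro = std::is_same<Status, int>::value;

// Illustrative workaround (an assumption, not netgen's fix): remove the macro
// before declaring or including anything that uses Status as an identifier.
#undef Status

namespace mpi_like {  // hypothetical stand-in for the MPI C++ bindings
class Status {
public:
    int error_code = 0;
};
}  // namespace mpi_like

// After the #undef, Status is an ordinary class name again.
constexpr bool status_is_int_after_undef =
    std::is_same<mpi_like::Status, int>::value;

// Small helper so the class is actually instantiated.
inline int default_error_code() {
    mpi_like::Status s;
    return s.error_code;
}
```

Commenting out the #undef line makes the class declaration fail with the same kind of "expected identifier before 'int'" error as in the build log.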
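Commit c2af423 already removed use of MPI from the netgen GUI executable and therefore from ngappinit.cpp. It didn't address the definition of PARALLEL, so mpi.h is still pulled into ngappinit.cpp through mpi_wrapper.hpp.
A successful mpich build can be obtained simply by unsetting PARALLEL for ngappinit.cpp alone. This is sufficient for a successful build.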
Arguably it might be more consistent to instead unset PARALLEL for the entire netgen target.
target_compile_definitions(netgen PUBLIC "-UPARALLEL")
I didn't do this in the patch above since I was not sure whether the GUI might be used to launch MPI processes (given the ParallelRun() function in ng/parallelfunc.cpp).
Setting -UPARALLEL means the compile line might contain both -DPARALLEL (the original flag for general parallel support in the netgen build) and -UPARALLEL. This looks a little strange, but is safe since the later option takes priority. In principle it would be possible to make CMake parse the compile options for the netgen target and remove -DPARALLEL, but that would be a little more complex than the one-line patch suggested here.
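The effect of a -DPARALLEL followed by a -UPARALLEL can be sketched in plain C++. GCC processes -D and -U in the order given, so the pair behaves like a #define followed by an #undef at the top of the translation unit. The names parallel_enabled and build_mode() below are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <string>

// Equivalent of -DPARALLEL on the command line (defines PARALLEL as 1).
#define PARALLEL 1
// Equivalent of a later -UPARALLEL: the definition is removed again.
#undef PARALLEL

// Code guarded by PARALLEL now takes the non-MPI branch.
#ifdef PARALLEL
constexpr bool parallel_enabled = true;
#else
constexpr bool parallel_enabled = false;
#endif

// Hypothetical helper that reports which branch was compiled in.
inline std::string build_mode() {
    return parallel_enabled ? "parallel" : "serial";
}
```

Swapping the two directives (the analogue of -UPARALLEL -DPARALLEL) would leave PARALLEL defined, which is why the order on the compile line matters.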