From Fedora Project Wiki

Revision as of 16:35, 15 October 2015 by Rathann (talk | contribs) (Drop _cc_name_suffix stuff and extra slashes at the end of paths)

Introduction

Message Passing Interface (MPI) is an API for parallelizing programs across multiple nodes and has been around since 1994 [1]. MPI can also be used for parallelization on SMP machines, where it is considered very efficient as well (close to 100% scaling on parallelizable code, compared to the ~80% commonly obtained with threads because of suboptimal memory allocation on NUMA machines). Before MPI, nearly every supercomputer manufacturer had its own programming language for writing programs; MPI made porting software easy.

There are many MPI implementations available, such as LAM-MPI (in Fedora, obsoleted by Open MPI), Open MPI (the default MPI compiler in Fedora and the MPI compiler used in RHEL), MPICH (not yet in Fedora), MPICH2 (in Fedora), and MVAPICH1 and MVAPICH2 (in RHEL but not yet in Fedora).

Because some MPI libraries work better on some hardware than others, and some software works best with a particular MPI library, the library must be selected at the user level, on a per-session basis. Also, people doing high performance computing may want to use more efficient compilers than Fedora's default (gcc), so it must be possible to have several builds of an MPI compiler installed at the same time, each compiled with a different compiler. This must be taken into account when writing spec files.

Packaging of MPI compilers

The files of MPI compilers MUST be installed in the following directories:

| File type | Placement |
|---|---|
| Binaries | %{_libdir}/%{name}/bin |
| Libraries | %{_libdir}/%{name}/lib |
| Fortran modules | %{_fmoddir}/%{name} |
| Architecture specific Python modules | %{python2_sitearch}/%{name} and %{python3_sitearch}/%{name} |
| Config files | %{_sysconfdir}/%{name}-%{_arch} |


Because include files and manual pages are bound to overlap between different MPI implementations, they MUST also be placed outside the normal directories. Some man pages or include files (either those of the MPI compiler itself or of MPI software installed in the compiler's directory) may be architecture specific (e.g. a definition on a 32-bit arch differs from that on a 64-bit arch), so the directories that MUST be used are as follows:

| File type | Placement |
|---|---|
| Man pages | %{_mandir}/%{name}-%{_arch} |
| Include files | %{_includedir}/%{name}-%{_arch} |


Architecture independent parts (except headers which go into -devel) MUST be placed in a -common subpackage that is BuildArch: noarch.

The runtime of MPI compilers (mpirun, the libraries, the manuals, etc.) MUST be packaged into %{name}, and the development headers and libraries into %{name}-devel.

As the compiler is installed outside PATH, one needs to load the relevant variables before being able to use the compiler or run MPI programs. This is done using environment modules.

The module file MUST be installed under %{_sysconfdir}/modulefiles/mpi. This allows a user with only one MPI implementation installed to load the module with:

module load mpi

The module file MUST have the line:

conflict mpi

to prevent concurrent loading of multiple MPI modules.


The module file MUST prepend $MPI_BIN to the user's PATH and set LD_LIBRARY_PATH to $MPI_LIB. The module file MUST also set some helper variables (primarily for use in spec files):

| Variable | Value | Explanation |
|---|---|---|
| MPI_BIN | %{_libdir}/%{name}/bin | Binaries compiled against the MPI stack |
| MPI_SYSCONFIG | %{_sysconfdir}/%{name}-%{_arch} | MPI stack specific configuration files |
| MPI_FORTRAN_MOD_DIR | %{_fmoddir}/%{name} | MPI stack specific Fortran module directory |
| MPI_INCLUDE | %{_includedir}/%{name}-%{_arch} | MPI stack specific headers |
| MPI_LIB | %{_libdir}/%{name}/lib | Libraries compiled against the MPI stack |
| MPI_MAN | %{_mandir}/%{name}-%{_arch} | MPI stack specific man pages |
| MPI_PYTHON2_SITEARCH | %{python2_sitearch}/%{name} | MPI stack specific Python 2 modules |
| MPI_PYTHON3_SITEARCH | %{python3_sitearch}/%{name} | MPI stack specific Python 3 modules |
| MPI_COMPILER | %{name}-%{_arch} | Name of compiler package, for use in e.g. spec files |
| MPI_SUFFIX | _%{name} | The suffix used for programs compiled against the MPI stack |

As these directories may be used by software using the MPI stack, the MPI runtime package MUST own all of them.
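The module file requirements above can be illustrated with a concrete file. The following shell snippet is a minimal sketch, not the file shipped by any package: it assumes a package named openmpi on x86_64, and every directory value is an assumed expansion of the corresponding RPM macro (the Python paths in particular are placeholders). It writes the module file into a scratch directory standing in for %{_sysconfdir}/modulefiles/mpi. Using prepend-path for LD_LIBRARY_PATH, so that an existing value is preserved, is a choice of this sketch.

```shell
#!/bin/sh
# Sketch: write an environment module file for a hypothetical MPI package.
# Every value below is an assumed stand-in for the RPM macro named beside it.
name=openmpi                          # %{name}
arch=x86_64                           # %{_arch}
libdir=/usr/lib64                     # %{_libdir}
sysconfdir=/etc                       # %{_sysconfdir}
includedir=/usr/include               # %{_includedir}
mandir=/usr/share/man                 # %{_mandir}
fmoddir=/usr/lib64/gfortran/modules   # %{_fmoddir}
py2_sitearch=/usr/lib64/python2.7/site-packages   # %{python2_sitearch} (placeholder)
py3_sitearch=/usr/lib64/python3.4/site-packages   # %{python3_sitearch} (placeholder)

# Scratch directory standing in for %{_sysconfdir}/modulefiles/mpi.
moddir=$(mktemp -d)
modfile=$moddir/$name-$arch

cat > "$modfile" <<EOF
#%Module 1.0
# Module file for the $name MPI stack (illustrative sketch)
conflict      mpi
prepend-path  PATH                  $libdir/$name/bin
prepend-path  LD_LIBRARY_PATH       $libdir/$name/lib
setenv        MPI_BIN               $libdir/$name/bin
setenv        MPI_SYSCONFIG         $sysconfdir/$name-$arch
setenv        MPI_FORTRAN_MOD_DIR   $fmoddir/$name
setenv        MPI_INCLUDE           $includedir/$name-$arch
setenv        MPI_LIB               $libdir/$name/lib
setenv        MPI_MAN               $mandir/$name-$arch
setenv        MPI_PYTHON2_SITEARCH  $py2_sitearch/$name
setenv        MPI_PYTHON3_SITEARCH  $py3_sitearch/$name
setenv        MPI_COMPILER          $name-$arch
setenv        MPI_SUFFIX            _$name
EOF

# Show the generated module file.
cat "$modfile"
```

Installed as %{_sysconfdir}/modulefiles/mpi/openmpi-x86_64, a file like this would be loaded with "module load mpi/openmpi-x86_64", and the conflict line would stop a second MPI module from being loaded alongside it.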

MUST: By default, NO files are placed in /etc/ld.so.conf.d. If the packager wishes to provide alternatives support, it MUST be placed in a subpackage along with the ld.so.conf.d file so that alternatives support does not need to be installed if not wished for.

MUST: If the maintainer wishes for the environment module to load automatically by use of a scriptlet in /etc/profile.d or by some other mechanism, this MUST be done in a subpackage.

MUST: The MPI compiler package MUST provide an RPM macro that makes loading and unloading the support easy in spec files, e.g. by placing the following in /etc/rpm/macros.openmpi:

%_openmpi_load \
 . /etc/profile.d/modules.sh; \
 module load mpi/openmpi-%{_arch}; \
 export CFLAGS="$CFLAGS %{optflags}";
%_openmpi_unload \
 . /etc/profile.d/modules.sh; \
 module unload mpi/openmpi-%{_arch};

With these macros, loading and unloading the compiler in spec files is as easy as %{_openmpi_load} and %{_openmpi_unload}.

If the environment module sets compiler flags such as CFLAGS (thus overriding the ones exported in %configure), the RPM macro MUST make them use the Fedora optimization flags %{optflags} once again (as in the example above, in which the openmpi-%{_arch} module sets CFLAGS).

Automatic setting of the module loading path in Python interpreters is done using a .pth file placed in one of the directories normally searched for modules (%{python2_sitearch}, %{python3_sitearch}). Those .pth files should append the directory specified by the $MPI_PYTHON2_SITEARCH or $MPI_PYTHON3_SITEARCH environment variable, depending on the interpreter version, to sys.path, and do nothing if those variables are unset. Module files MUST NOT set PYTHONPATH directly, since it cannot be set for both Python versions at the same time.
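One possible shape for such a .pth file is sketched below for Python 3. This is an illustration under stated assumptions rather than the file any package ships: the file name openmpi.pth and the example directory /opt/example/mpi-site are hypothetical, and a scratch directory stands in for %{python3_sitearch}. The snippet writes the file and then uses Python's site.addsitedir to process it with and without the variable set. A Python 2 counterpart would read MPI_PYTHON2_SITEARCH instead.

```shell
#!/bin/sh
# Sketch: a .pth file that extends sys.path from MPI_PYTHON3_SITEARCH.
# The file name openmpi.pth and the example path are hypothetical.
pthdir=$(mktemp -d)    # stand-in for %{python3_sitearch}

# Python's site module executes .pth lines that start with "import".
# The line appends the directory only when the variable is set and non-empty.
cat > "$pthdir/openmpi.pth" <<'EOF'
import os, sys; p = os.environ.get('MPI_PYTHON3_SITEARCH'); p and sys.path.append(p)
EOF

# Stop here if no Python 3 interpreter is available to try the file out.
command -v python3 >/dev/null 2>&1 || { echo "python3 not found"; exit 0; }

# Process the scratch directory as a site directory, then report whether
# the directory named by the variable ended up on sys.path.
check='import os, site, sys
site.addsitedir(sys.argv[1])
print(os.environ.get("MPI_PYTHON3_SITEARCH") in sys.path)'

with_var=$(MPI_PYTHON3_SITEARCH=/opt/example/mpi-site python3 -c "$check" "$pthdir")
without_var=$(unset MPI_PYTHON3_SITEARCH; python3 -c "$check" "$pthdir")

echo "variable set:   $with_var"
echo "variable unset: $without_var"
```

Keeping the logic in the .pth file rather than in PYTHONPATH means each interpreter version consults only its own variable, which is the reason the guideline forbids setting PYTHONPATH in the module file.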