From Fedora Project Wiki

Revision as of 12:15, 24 July 2009

This is a draft document

Introduction

Message Passing Interface (MPI) is an API for parallelizing programs across multiple nodes and has been around since 1994 [1]. MPI can also be used for parallelization on SMP machines and is considered very efficient at that too (close to 100% scaling on parallelizable code, compared to the ~80% commonly obtained with threads due to suboptimal memory allocation on NUMA machines). Before MPI, nearly every supercomputer manufacturer had its own programming language for writing programs; MPI made porting software easy.

There are many MPI implementations available, such as LAM-MPI (in Fedora, obsoleted by Open MPI), Open MPI (the default MPI compiler in Fedora and the MPI compiler used in RHEL), MPICH (not yet in Fedora), MPICH2 (in Fedora), and MVAPICH1 and MVAPICH2 (not yet in Fedora).

As some MPI libraries work better on some hardware than others, and some software works best with a particular MPI library, the choice of library must be made at user level, on a per-session basis. Also, people doing high performance computing may want to use more efficient compilers than the default one in Fedora (gcc), so it must be possible to have several versions of the MPI compiler installed at the same time, each compiled with a different compiler. This must be taken into account when writing spec files.

Packaging of MPI compilers

MPI compilers MUST be installed (including binaries, man pages, etc.) in %{_libdir}/%{name}/%{version}-<compiler>, where <compiler> is normally gcc in Fedora. It MUST be possible to build the MPI compiler RPMs with other compilers as well, and versions compiled with different compilers MUST be installable side by side (e.g. a version compiled with {gcc34,g++34,g77} must be installable and usable alongside a version compiled with {gcc,g++,gfortran}). To do this, the spec file MUST support the use of the following variables:

# We only compile with gcc, but other people may want other compilers.
# Set the compiler here.
%global opt_cc gcc
# Optional CFLAGS to use with the specific compiler...gcc doesn't need any,
# so uncomment and define to use
#global opt_cflags
%global opt_cxx g++
#global opt_cxxflags
%global opt_f77 gfortran
#global opt_fflags
%global opt_fc gfortran
#global opt_fcflags

# Optional name suffix to use...we leave it off when compiling with gcc, but
# for other compiled versions to install side by side, it will need a
# suffix in order to keep the names from conflicting.
#global cc_name_suffix -gcc
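
The install prefix rule stated above can be sketched in plain shell to show why two builds of the same version do not collide. This is only an illustration: the helper name mpi_prefix, its arguments and the version number are hypothetical, and a real spec file would build the path from the RPM macros (%{_libdir}, %{name}, %{version}) rather than from shell arguments.

```shell
#!/bin/sh
# Hypothetical helper: print the install prefix for an MPI compiler build,
# following the %{_libdir}/%{name}/%{version}-<compiler> layout.
# Arguments: libdir, package name, version, compiler label (e.g. gcc, gcc34).
mpi_prefix() {
    printf '%s/%s/%s-%s\n' "$1" "$2" "$3" "$4"
}

# Two builds of the same (made-up) version get distinct prefixes,
# so they can be installed side by side.
mpi_prefix /usr/lib64 openmpi 1.3.3 gcc
mpi_prefix /usr/lib64 openmpi 1.3.3 gcc34
```

Because the compiler label is part of the directory name, nothing inside the prefix needs to be renamed for a second build.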

The runtime of MPI compilers (mpirun, the libraries, the manuals etc) MUST be packaged into %{name}, and the development headers and libraries into %{name}-devel.

As the compiler is installed outside PATH, one needs to load the relevant variables before being able to use the compiler or run MPI programs. This is done using environment modules.

The module file MUST prepend the MPI bindir %{_libdir}/%{name}/%{version}-<compiler>/bin to the user's PATH and set LD_LIBRARY_PATH to %{_libdir}/%{name}/%{version}-<compiler>/lib. The module MUST provide the environment variables $MPI_HOME, $MPI_BIN and $MPI_LIB, which should point to %{_libdir}/%{name}/%{version}-<compiler>, %{_libdir}/%{name}/%{version}-<compiler>/bin and %{_libdir}/%{name}/%{version}-<compiler>/lib, respectively.
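
The effect the module file is required to have on a session can be sketched as a shell function. This is not a module file (those are written for the environment modules tool); it is a hypothetical shell equivalent meant only to make the required variables concrete. The function name mpi_env_load is an assumption, as is the choice to keep existing LD_LIBRARY_PATH entries after the MPI libdir.

```shell
#!/bin/sh
# Hypothetical shell equivalent of what the module file is required to do.
# Argument: the install prefix, i.e. %{_libdir}/%{name}/%{version}-<compiler>.
mpi_env_load() {
    MPI_HOME=$1
    MPI_BIN=$MPI_HOME/bin
    MPI_LIB=$MPI_HOME/lib
    # Prepend the MPI bindir so that its compiler wrappers and mpirun
    # are found before anything else on PATH.
    PATH=$MPI_BIN:$PATH
    # Put the MPI libdir first; preserving earlier entries is an
    # assumption of this sketch, not something the draft specifies.
    LD_LIBRARY_PATH=$MPI_LIB${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
    export MPI_HOME MPI_BIN MPI_LIB PATH LD_LIBRARY_PATH
}
```

A package built on top of the MPI compiler can then refer to $MPI_BIN and $MPI_LIB instead of hard-coding the versioned directory.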

MUST: By default, no files are placed in /etc/ld.so.conf.d. If the packager wishes to provide alternatives support, it MUST be placed in a subpackage along with the ld.so.conf.d file, so that alternatives support does not need to be installed if it is not wanted.

The MPI compiler package MUST provide an RPM macro that makes loading and unloading the support easy in spec files, for example by placing the following in /etc/rpm/macros.openmpi:

%_mpi_compiler openmpi
%_openmpi_load \
 . /etc/profile.d/modules.sh; \
 module load openmpi-%{_arch}; \
 export CFLAGS="$CFLAGS %{optflags}";
%_openmpi_unload \
 . /etc/profile.d/modules.sh; \
 module unload openmpi-%{_arch};

With these macros, loading and unloading the compiler in spec files is as easy as %{_openmpi_load} and %{_openmpi_unload}.

If the environment module sets compiler flags such as CFLAGS (thus overriding the ones exported in %configure), the RPM macro MUST make them use the Fedora optimization flags %{optflags} once again.

Versions of the MPI compiler built with a compiler combination other than {gcc,g++,gfortran} MUST suffix the %_mpi_compiler definition with %{?cc_name_suffix} (e.g. -gcc34).


Packaging of MPI software

Software that supports MPI MUST also be packaged in serial mode [i.e. no MPI] if upstream supports it (for instance: foo).

The packager MUST package at least a version compiled against Open MPI. Packages made against other MPI compilers in Fedora SHOULD be made, but that is left up to the maintainer. The MPI enabled bits MUST be placed in a subpackage with the suffix denoting the MPI compiler used (for instance: foo-mpi for Open MPI [the traditional MPI compiler in Fedora] or foo-mpich2 for MPICH2).

Each MPI build of shared libraries SHOULD have a separate -libs subpackage for the libraries (e.g. foo-mpich2-libs). Each MPI build MUST have a separate -devel subpackage (e.g. foo-mpich2-devel) that includes the development libraries and Requires: %{name}-devel that includes the headers.

To prevent name clashes, there are two possibilities in the installation location:

  1. Placing in system directories
    • The binaries of the software placed in %{_bindir} MUST be suffixed with the name of the MPI compiler (e.g. bar_mpi [for Open MPI] or bar_mpich2 [for MPICH2]).
    • The libraries of the software placed in %{_libdir} MUST be suffixed with the name of the MPI compiler (e.g. libbar_mpi.so [for Open MPI] or libbar_mpich2.so [for MPICH2]).
    • Files installed in %{_datadir} SHOULD be placed in a -common subpackage that is required by all of the packages containing binaries.
  2. Placing in a separate directory
    • The software MUST be installed in %{_libdir}/%{name}/%{version}-%{_mpi_compiler}/ (e.g. %{_libdir}/foo/1.0-openmpi-gcc/), including libraries and man files.
    • Architecture and compiler independent headers MUST be placed as normal into %{_includedir}. If the headers contain e.g. some declaration about the MPI compiler used, the headers MUST be placed with the rest of the files in %{_libdir}/%{name}/%{version}-%{_mpi_compiler}/.
    • Files normally installed in %{_datadir} MUST be placed in a -common subpackage that is required by all of the packages containing binaries.
    • An environment module enabling the use of the software MUST be written and be made available as /etc/modulefiles/%{name}-%{compiler}-%{_arch}. The module MUST require the module of the used compiler. More info on environment modules.
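
The two naming schemes in the list above can be sketched in shell. All helper names here (mpi_suffix, suffixed_binary, suffixed_library, separate_dir) are hypothetical and exist only for illustration; the mapping of Open MPI to the suffix "mpi" follows the examples in the text, and passing any other compiler name through unchanged is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: map an MPI compiler name to the suffix used in the
# examples above (Open MPI -> "mpi", MPICH2 -> "mpich2").
mpi_suffix() {
    case "$1" in
        openmpi) echo mpi ;;
        *)       echo "$1" ;;   # assumption: other names are used as-is
    esac
}

# Option 1: suffixed names in the system directories.
# Arguments: base name, MPI compiler name.
suffixed_binary()  { printf '%s_%s\n' "$1" "$(mpi_suffix "$2")"; }
suffixed_library() { printf 'lib%s_%s.so\n' "$1" "$(mpi_suffix "$2")"; }

# Option 2: a separate directory per MPI compiler.
# Arguments: libdir, package name, version, value of %_mpi_compiler.
separate_dir() { printf '%s/%s/%s-%s/\n' "$1" "$2" "$3" "$4"; }

suffixed_binary bar openmpi
suffixed_library bar mpich2
separate_dir /usr/lib64 foo 1.0 openmpi-gcc
```

With option 1 the file names themselves carry the MPI compiler, so no environment module is needed to find them; with option 2 the names stay unchanged and the environment module selects the directory.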

The packages MUST have an explicit Requires on the MPI runtime used, as rpm might not pick up the correct version. (This needs to be checked; at least libmpi is provided by all of them?)