"MapReduce-MPI WWW Site"_mws - "MapReduce-MPI Documentation"_md :c

:link(mws,http://mapreduce.sandia.gov)
:link(md,Manual.html)

:line

Getting Started :h3

Once you have
"downloaded"_http://www.sandia.gov/~sjplimp/download.html the
MapReduce MPI (MR-MPI) library, you should have the tarball
mapreduce.tar.gz on your machine.  Unpack it with the following
commands:

gunzip mapreduce.tar.gz
tar xvf mapreduce.tar :pre

which should create a mapreduce directory containing the following:

README
LICENSE
doc
examples
mpistubs
oink
oinkdoc
python
src
user :ul

The doc directory contains this documentation.  The oink and oinkdoc
directories contain the "OINK scripting
interface"_../oinkdoc/Manual.html to the MR-MPI library and its
separate documentation.  The examples directory contains a few simple
MapReduce programs which call the MR-MPI library.  These are
documented by a README file in that directory and are discussed below.
The mpistubs directory contains a dummy MPI library which can be used
to build a MapReduce program on a serial machine.  The python
directory contains the Python wrapper files needed to call the MR-MPI
library from Python.  The src directory contains the files that
comprise the MR-MPI library.  The user directory contains
user-contributed MapReduce programs.  See the README in that directory
for further details.

[Static library:] :h5

To build a static library for use by a C++ or C program (*.a file on
Linux), go to the src directory and type

make :pre

You will see a list of machine names, each of which has their own
Makefile.machine file in the src/MAKE directory.  You can choose
one of these and attempt to build the MR-MPI library by typing

make machine :pre

If you are successful, this will produce the file "libmrmpi_machine.a"
which can be linked by other programs.  If not, you will need to
create a src/MAKE/Makefile.machine file compatible with your platform,
using one of the existing files as a template.

The only settings in a Makefile.machine file that need to be specified
are those for the compiler and the MPI library on your machine.  If
MPI is not already installed, you can install one of several free
versions that work on essentially all platforms.  MPICH and OpenMPI
are the most common.

Within Makefile.machine you can either specify via -I and -L switches
where the MPI include and library files are found, or you can use a
compiler wrapper provided with MPI, like mpiCC or mpic++, which will
know where those files are.

You can also build the MR-MPI library without MPI, using the dummy MPI
library provided in the mpistubs directory.  In this case you can only
run the library on a single processor.  To do this, first build the
dummy MPI library, by typing "make" from within the mpistubs
directory.  Again, you may need to edit mpistubs/Makefile for your
machine.  Then from the src directory, type "make serial" which uses
the src/MAKE/Makefile.serial file.

Both a C++ and "C interface"_Interface_c.html are part of the MR-MPI
library, so it should be usable from any hi-level language.

[Shared library:] :h5

You can also build the MR-MPI library as a dynamic shared library
(*.so file instead of *.a on Linux).  This is required if
you want to use the library from Python.  To do this, type

make -f Makefile.shlib machine :pre

This will create the file libmrmpi_machine.so, as well as a soft link
libmrmpi.so, which is what the Python wrapper will load by default.
Note that if you are building multiple machine versions of the shared
library, the soft link is always set to the most recently built
version.

[Additional requirement for using a shared library:] :h5

The operating system finds shared libraries to load at run-time using
the environment variable LD_LIBRARY_PATH.  So you may wish to copy the
file src/libmrmpi.so or src/libmrmpi_g++.so (for example) to a place
the system can find it by default, such as /usr/local/lib, or you may
wish to add the MR-MPI src directory to LD_LIBRARY_PATH, so that the
current version of the shared library is always available to programs
that use it.

For the csh or tcsh shells, you would add something like this to your
~/.cshrc file:

setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:/home/sjplimp/mrmpi/src :pre

:line 

The MapReduce programs in the examples directory can be built by
typing

make -f Makefile.machine :pre

from within the examples directory, where Makefile.machine is one of
the Makefiles in the examples directory.  Again, you may need to
modify one of the existing ones to create a new one for your machine.
Some of the example programs are provided as a C++ program, a C
program, as a Python script, or as an OINK input script.
Once you have built OINK, the latter can be run as, for example,

oink_linux < in.rmat :pre

When you run one of the example MapReduce programs or your own, if you
get an immediate error about the MRMPI_BIGINT data type, you will need
to edit the file src/mrtype.h and re-compile the library.  Mrtype.h
and the error check insures that your MPI will perform operations on
8-byte unsigned integers as required by the MR-MPI library.  For the
MPI on most machines, this is satisfied by the MPI data type
MPI_UNSIGNED_LONG_LONG.  But some machines do not support the "long
long" data type, and you may need a different setting for your machine
and installed MPI, such as MPI_UNSIGNED_LONG.
