MPICH2 setup and parallel execution

Step 1

Install the MPICH2 libraries.  A precompiled version of the MPICH2 library is included with the installers. 

For Windows:

The library installer is located here:

/ads/mpi/mpich2.msi 

and can be run by double-clicking the mpich2.msi installer package.

 

For Linux:

The library is located under /ads/mpi/mpich2-1.0.8.tar

and can be extracted by executing:

tar -xvf mpich2-1.0.8.tar

Step 2

Start the MPICH2 ring:

For Windows:

Open a command prompt (on Windows Vista or Windows 7, you will need to choose "Run as Administrator") and type the following commands:

smpd -remove

smpd -install -phrase

Supply a passphrase of your own choosing after the -phrase option.

 

For Linux:

In your home directory, create a file named .mpd.conf containing the following line:


MPD_SECRETWORD=


and add a secret phrase of your own after the equals sign.

The .mpd.conf file must be readable and writable only by you, so change its permissions:

chmod 600 .mpd.conf
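
The two Linux steps above can also be scripted. The following is a minimal sketch, not an ADS-supplied tool: the SECRET value is a hypothetical placeholder that you should replace with your own phrase, and the script overwrites any existing .mpd.conf in your home directory.

```shell
#!/bin/sh
# Hypothetical placeholder: substitute your own secret phrase here.
SECRET="replace-with-your-own-phrase"
CONF="$HOME/.mpd.conf"

# Write the file under a restrictive umask so a newly created file
# is never readable by other users. This overwrites an existing file.
( umask 077; printf 'MPD_SECRETWORD=%s\n' "$SECRET" > "$CONF" )

# Enforce the permissions explicitly in case the file already existed.
chmod 600 "$CONF"

# List the file so the permissions can be confirmed by eye.
ls -l "$CONF"
```

The listing should show that only the owner has read and write access to the file.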

 

Finally, start your ring by typing:

mpd &
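
Before moving on, it can help to confirm that the ring is actually up. The sketch below assumes the MPICH2 bin directory is on your PATH and uses mpdtrace, the MPD utility that lists the hosts in the ring. The ring_status function name is only illustrative.

```shell
#!/bin/sh
# Report whether an mpd ring is reachable from this shell.
# ring_status is an illustrative helper name, not part of MPICH2.
ring_status() {
    if command -v mpdtrace >/dev/null 2>&1; then
        # mpdtrace prints one line per host currently in the ring.
        mpdtrace || echo "mpdtrace could not contact a running mpd"
    else
        echo "mpdtrace not found; check that the MPICH2 bin directory is on PATH"
    fi
}

ring_status
```

When you are finished with the ring, it can be shut down with mpdallexit.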


Step 3

Test the ring by running a parallel case:

 

ADS provides a sample test case to demonstrate parallel execution.  It is located under

     \ads\3D Multi Runs\3RowSteadyParallel  (Windows)

     /examples/3D Multi Runs/3RowSteadyParallel (Linux)

 

First, run wand on the case.  This step generates three restart files: Row1X, Row2X, and Row3X.

Then run three copies of the LEO solver in parallel on the MPICH2 ring as follows:

 

For Windows:

mpiexec -np 1 leo.exe 3RowSteadyParallelRow01.LEO : -np 1 leo.exe 3RowSteadyParallelRow02.LEO : -np 1 leo.exe 3RowSteadyParallelRow03.LEO

 

NOTE: The first time you run on the MPICH2 ring, you will need to enter your login domain/username and password.  A prompt will open up asking for this information.   

 

For Linux:

mpiexec -np 1 leo 3RowSteadyParallelRow01.LEO : -np 1 leo 3RowSteadyParallelRow02.LEO : -np 1 leo 3RowSteadyParallelRow03.LEO

 

Let's go through the command.  mpiexec is the launcher that contacts the ring; the rest of the command line specifies what the MPICH2 ring will execute.

The -np 1 flag instructs the ring to start one process running the command leo 3RowSteadyParallelRow01.LEO.  The colon is a command delimiter, splitting the line into 3 separate commands in this sample case.  All of the commands execute simultaneously on the ring.  Commands that land on the same CPU simply time-share it until they complete.
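
For cases with more rows, the colon-delimited command line can be assembled by a small script rather than typed by hand. The sketch below only prints the command. It assumes that the input files of other cases follow the same naming pattern as the sample (case name, then Row01, Row02, and so on, with a .LEO extension), which you should verify for your own case. The build_mpiexec_cmd function name is illustrative.

```shell
#!/bin/sh
# Print an mpiexec command line with one "-np 1 leo <file>" segment per row.
# Assumes input files are named <case>RowNN.LEO, as in the sample case.
build_mpiexec_cmd() {
    case_name=$1
    nrows=$2
    cmd="mpiexec"
    i=1
    while [ "$i" -le "$nrows" ]; do
        # Row numbers are zero-padded to two digits: 01, 02, ...
        row=$(printf '%02d' "$i")
        # Segments after the first are separated by a colon.
        if [ "$i" -gt 1 ]; then
            cmd="$cmd :"
        fi
        cmd="$cmd -np 1 leo ${case_name}Row${row}.LEO"
        i=$((i + 1))
    done
    printf '%s\n' "$cmd"
}

build_mpiexec_cmd 3RowSteadyParallel 3
```

For the sample case this prints the same Linux command shown above, which can then be pasted into the shell once the ring is running.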

 

For More Information

If you have further questions, or are deploying on a more complex system, ADS suggests that you visit the MPICH2 webpage:

http://www.mcs.anl.gov/research/projects/mpich2/

 

 

Code
Author
George Fan
Date Created
2013-06-21 00:47:11
Date Updated
2013-06-21 00:55:42