
   ==================================================================
   ===                                                            ===
   ===           GENESIS Distributed Memory Benchmarks            ===
   ===                                                            ===
   ===                           COMMS2                           ===
   ===                                                            ===
   ===                      Message Exchange                      ===
   ===                                                            ===
   ===               Author:   Roger Hockney                      ===
   ===     Department of Electronics and Computer Science         ===
   ===               University of Southampton                    ===
   ===               Southampton SO9 5NH, U.K.                    ===
   ===     fax.:+44-703-593045   e-mail:rwh@uk.ac.soton.ecs       ===
   ===                                                            ===
   ===     Copyright: SNARC, University of Southampton            ===
   ===                                                            ===
   ===          Last update: March 1992; Release: 2.0             ===
   ===                                                            ===
   ==================================================================


1. Description
--------------
This benchmark measures the message exchange properties of a
computer network, and makes use of bidirectional links if they
are present. A pair of nodes send a message of varying length,
n, to each other and then wait to receive the message from the other 
of the pair.  One quarter of the time for this exchange is recorded 
as the time, t, to send a message, because four messages are sent 
during the exchange. This time is fitted by least-squares to the 
straight-line relation:

                     t = (n + nhalf) / rinf                   (1)

where  rinf  = the asymptotic stream rate (Byte/s), and
       nhalf = the message length (Byte) giving half the 
               asymptotic performance

This corresponds to an average performance, r, as a function of 
message length, n:
                            rinf
                    r = -------------                          (2)
                        (1 + nhalf/n)

In equation (2), rinf is the asymptotic stream rate, which must be used
together with the value of nhalf to calculate the average bandwidth.
For short messages the fitted values of rinf may be high, but these
rates will not be achieved because of the effect of nhalf in
equation (2).
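
As an illustration of equations (1) and (2), the following minimal
sketch fits a straight line to (n, t) pairs and converts the slope and
intercept into rinf and nhalf. It assumes ordinary unweighted least
squares; the benchmark's own fitting routine may differ in detail. The
function names and the timing data are hypothetical: the timings are
generated from equation (1), not measured.

```python
def fit_rinf_nhalf(lengths, times):
    """Least-squares fit of t = (n + nhalf) / rinf.

    Writing the line as t = a + b*n gives rinf = 1/b and nhalf = a/b.
    """
    m = len(lengths)
    mean_n = sum(lengths) / m
    mean_t = sum(times) / m
    sxx = sum((n - mean_n) ** 2 for n in lengths)
    sxy = sum((n - mean_n) * (t - mean_t) for n, t in zip(lengths, times))
    b = sxy / sxx              # slope, seconds per byte
    a = mean_t - b * mean_n    # intercept, seconds (fitted time at n = 0)
    return 1.0 / b, a / b      # (rinf in Byte/s, nhalf in Byte)


def average_rate(n, rinf, nhalf):
    """Equation (2): average rate r for a message of n bytes."""
    return rinf / (1.0 + nhalf / n)


# Synthetic timings built from equation (1) with assumed parameters.
RINF_TRUE, NHALF_TRUE = 2.0e6, 500.0
lengths = [100, 200, 400, 800, 1600, 3200]
times = [(n + NHALF_TRUE) / RINF_TRUE for n in lengths]

rinf, nhalf = fit_rinf_nhalf(lengths, times)
for n in (50, 500, 50000):
    print(n, average_rate(n, rinf, nhalf))
```

At n = nhalf the average rate is rinf/2, which is the sense in which
nhalf is the message length giving half the asymptotic performance.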

The benchmark is restricted to asynchronous communication. Asynchronous,
here, means that a send returns to the calling program when the user 
data array being sent may be safely reused.  This, however, may be 
before the message has been received by the receiving node.  The 
receiving node program stops (i.e. blocks) until the data is available 
for use by the user's program.
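
The send and receive semantics described above can be mimicked with
threads and queues. This is only an analogy, not PARMACS code: the
names send, receive and node are hypothetical, and copying the payload
into an unbounded queue is just one way in which a system could make
the user's array safe to reuse before the message has been received.

```python
import queue
import threading


def send(channel, data):
    """Return as soon as the caller may safely reuse `data`."""
    # The copy makes reuse safe; the unbounded queue means put()
    # does not wait for the receiving side.
    channel.put(bytes(data))


def receive(channel):
    """Block until a message is available, then return it."""
    return channel.get()


def node(name, outbox, inbox, results):
    buf = bytearray(b"hello from " + name.encode())
    send(outbox, buf)
    buf[:] = b"x" * len(buf)        # reuse the array straight after the send
    results[name] = receive(inbox)  # blocks until the partner's message arrives


a_to_b, b_to_a = queue.Queue(), queue.Queue()
results = {}
threads = [
    threading.Thread(target=node, args=("A", a_to_b, b_to_a, results)),
    threading.Thread(target=node, args=("B", b_to_a, a_to_b, results)),
]
for th in threads:
    th.start()
for th in threads:
    th.join()
print(results)
```

Each node overwrites its array immediately after the send, yet the
partner still receives the original contents.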


2. Operating Instructions
-------------------------
The program asks for the destination node number, so that the message
time can be studied as a function of the distance between the nodes.
The default case, obtained by answering <CR>, gives an exchange between
nearest-neighbour nodes.

Many message-passing computers have different timing for short and
long messages. This benchmark therefore also asks for the length of
the longest short message, uses it to select suitable message lengths,
and fits separate straight lines for short and long messages.
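
The two-regime fit can be sketched as follows. This is a minimal
illustration, assuming that a message whose length equals the longest
short-message length counts as short and that each regime is fitted
independently by unweighted least squares. The function names and the
synthetic timings are hypothetical.

```python
def fit_line(points):
    """Fit t = (n + nhalf) / rinf to (n, t) pairs; return (rinf, nhalf)."""
    m = len(points)
    mean_n = sum(n for n, _ in points) / m
    mean_t = sum(t for _, t in points) / m
    sxx = sum((n - mean_n) ** 2 for n, _ in points)
    sxy = sum((n - mean_n) * (t - mean_t) for n, t in points)
    b = sxy / sxx              # slope, seconds per byte
    a = mean_t - b * mean_n    # intercept, seconds
    return 1.0 / b, a / b


def fit_two_regimes(samples, longest_short):
    """Split samples at longest_short and fit each regime separately.

    Each regime needs at least two distinct message lengths.
    """
    short = [(n, t) for n, t in samples if n <= longest_short]
    long_ = [(n, t) for n, t in samples if n > longest_short]
    return fit_line(short), fit_line(long_)


# Synthetic timings with different assumed parameters in each regime.
LONGEST_SHORT = 128
samples = [(n, (n + 50.0) / 1.0e6) for n in (8, 16, 32, 64, 128)]
samples += [(n, (n + 1000.0) / 4.0e6) for n in (256, 512, 1024, 2048, 4096)]

(short_rinf, short_nhalf), (long_rinf, long_nhalf) = fit_two_regimes(
    samples, LONGEST_SHORT
)
print("short:", short_rinf, short_nhalf)
print("long: ", long_rinf, long_nhalf)
```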

To expand the PARMACS macros, and to compile and link the code with the
appropriate libraries, enter the directory d77 and type:     make

On some systems it may be necessary to allocate the appropriate
resources before running the benchmark, e.g. on the iPSC/860, to
reserve a cube of 2 processors, type:    getcube -t2

To run the benchmark executable, type:    host

This will automatically load both host and node programs. The progress
of the benchmark execution can be monitored via the standard output,
whilst a permanent copy of the output is written to a file called
'result'. If the run is successful and a permanent record is required,
the file 'result' should be copied to another file before the next run
overwrites it.

The whole test takes about 2 1/2 minutes on the Intel iPSC/860.
