Using parallel computers with adevs
Adevs supports parallel computing with a simple
parallel algorithm. The parallel simulation algorithm is
implemented by the pdevssim class. The parallel algorithm tries
to utilize multiple processors by
- executing simultaneous state transitions in parallel,
- executing simultaneous outputs and
performing event routing in parallel, and
- executing the garbage collection methods
for active models in parallel.
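The three phases above can be pictured with a short sketch. This is not the actual pdevssim implementation; `Model`, `parallel_phase`, and `simulate_one_cycle` are hypothetical names used only to illustrate the phase / barrier / phase structure, with each parallel phase applying one method to every active model:

```cpp
#include <thread>
#include <vector>
#include <functional>

// Hypothetical stand-in for an atomic model; not an adevs class.
// Each model touches only its own members, so the phases are race-free.
struct Model {
    int state = 0;
    int output = 0;
    void output_func() { output = state + 1; } // compute output
    void delta(int input) { state += input; }  // state transition
    void gc() { output = 0; }                  // garbage collection
};

// Apply f to every model, splitting the work across nthreads threads.
static void parallel_phase(std::vector<Model>& models,
                           const std::function<void(Model&)>& f,
                           unsigned nthreads)
{
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < nthreads; t++) {
        pool.emplace_back([&, t]() {
            for (std::size_t i = t; i < models.size(); i += nthreads)
                f(models[i]);
        });
    }
    for (auto& thr : pool) thr.join(); // barrier between phases
}

// One simulation cycle: outputs, then transitions, then cleanup.
void simulate_one_cycle(std::vector<Model>& models, unsigned nthreads)
{
    // Phase 1: outputs (and, conceptually, event routing) in parallel.
    parallel_phase(models, [](Model& m) { m.output_func(); }, nthreads);
    // Phase 2: state transitions consume the computed outputs.
    parallel_phase(models, [](Model& m) { m.delta(m.output); }, nthreads);
    // Phase 3: garbage collection of the now-consumed outputs.
    parallel_phase(models, [](Model& m) { m.gc(); }, nthreads);
}
```

The important point is the barrier between phases: every output is computed before any state transition runs, just as the list above requires.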
In order for the parallel simulator to operate
correctly, the model implementation must adhere to a set of
constraints. These are:
- It must be safe to execute model state transition and output
functions in parallel. This is probably already true for
your model (because you were very careful about not sharing state
variables, right?). If not, then you will need to modify your
model to eliminate shared variables. The only exception to this
rule is output variables that are written in the output function and
read during a state transition function. Shared variables that
are written during the state transition function, or that are both
read and written in the output function, must be eliminated.
- It must be safe to execute the gc_model() methods in parallel
with each other.
- The route() method for your network models must be
reentrant. This is necessary because event routing is done in
parallel, and so different threads may be executing route() methods in
parallel. The network models that come with adevs are safe to use
with the parallel simulator.
- The model cannot be a variable structure model. The
pdevssim simulator does not support variable structure models.
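The shared variable rule can be illustrated with a small standalone sketch. `Counter` and `run_parallel_deltas` are hypothetical names, not adevs classes: each model keeps its state in its own member variables, so the state transitions of two different models can safely run in different threads, and the only "sharing" is the permitted pattern of an output variable written in the output function and read in the transition function.

```cpp
#include <thread>

// Hypothetical sketch (not an adevs class): per-model state lives in
// member variables, so no two models share writable data.
struct Counter {
    int count = 0; // model state: written only by this model's transition
    int out = 0;   // written in output_func, read in delta_int: the one
                   // permitted kind of sharing
    void output_func() { out = count + 1; }
    void delta_int() { count = out; } // reads its own out, writes its own state
};

// Run the state transitions of two distinct models in parallel.
// Safe because the two transitions touch disjoint member variables.
void run_parallel_deltas(Counter& a, Counter& b) {
    std::thread t1([&] { a.delta_int(); });
    std::thread t2([&] { b.delta_int(); });
    t1.join();
    t2.join();
}
```

If `count` were instead a global variable shared by both models, the two threads would race on it, which is exactly the situation the constraints above forbid.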
The performance of your parallel simulation can, in many instances, be
improved by assigning specific models to specific threads. Adevs
provides support for model partitioning with the method static void
devs::prefer_thread(int thread_id). This method is a class
wide (static) method belonging to the devs class. Calling this
method has two effects. The first is to assign all models created
after the method call to the thread identified by thread_id. The
second effect only occurs when NUMA support has been enabled, and it
causes all memory allocations following the method call to occur in
the memory block belonging to thread_id. Thread identifiers range
from 0 to N-1, where N is the thread count passed to the pdevssim
constructor. A random assignment will be used if no
specific thread assignment is given.
For example, suppose that we had a model with three atomic components,
and we wanted to assign each model to its own thread. The
constructor for the coupled model that contains the components would
look something like
...
/*
Create the first model in the memory block assigned to thread 0, and
assign thread 0 the task of executing the model.
*/
devs::prefer_thread(0);
myModel* model1 = new myModel();
...
/*
Assign the second model to thread 1.
*/
devs::prefer_thread(1);
myModel* model2 = new myModel();
...
/*
Assign the third model to thread 2.
*/
devs::prefer_thread(2);
myModel* model3 = new myModel();
...
Simulating a model with the pdevssim class is nearly identical
to doing so with the devssim class. The only exception is that
the number of threads that will be used for the simulation must be
provided. Shown below is a code snippet that demonstrates the
creation of a model and its execution using the pdevssim class.
int thrds_to_use = 3;
/*
The model constructor may need the number of threads in order to assign
submodels to specific threads.
*/
myModel* model = new myModel(thrds_to_use);
/*
Create a parallel simulator for the model.
*/
pdevssim sim(model,thrds_to_use);
sim.run();
/*
The simulation is complete when the run() method returns.
*/
The parallel algorithm implemented by the pdevssim class
was chosen for its simplicity, not its performance. The algorithm
is probably well suited for large models that, in general, have enough
simultaneous events to keep the available processors busy. The
algorithm will likely give good performance on dual core processors, 2
to 4 processor parallel computers, and possibly larger machines if your
model is very big and/or has expensive event routing, state transition,
and/or output functions.