Designing ParallelZone’s Parallel Runtime
Designing ParallelZone described the overall design points which went into the architecture of ParallelZone. Here we focus on the architecture of the parallel runtime component.
Why Do We Need a Parallel Runtime?
The primary goal of ParallelZone is to be a runtime system for parallel software. To accomplish this ParallelZone must have abstractions the software can use to interact with the runtime environment in a parallel manner. Ideally these interactions should make the task of writing parallel code as simple as possible.
Design Considerations
The parallel runtime will be responsible for addressing many of the considerations raised in the Designing ParallelZone section. Including:
Object-oriented (both the parallel runtime itself and the data it operates on)
Support for process, CPU-threading, and GPU-threading
Extendable if/when new hardware parallelism arises (e.g. tensor cores)
Support for MPI
Scheduling of tasks
Additionally, we recognize that the parallel runtime will:
manage the lifetime of parallel resources (think
MPI_initialize
)track resource usage
logical partitioning
needs to understand locality/affinity of hardware
compatible with not being the top (i.e., should work even when it’s not wrapping
MPI_COMM_WORLD
).
Parallel Runtime Design
Fig. 6 provides a high-level overview of the parallel
runtime in ParallelZone. The RuntimeView
class is an abstraction which
models the runtime environment. The ResourceSet
class is used to group
resources into affinity sets. The various hardware classes (e.g., CPU
,
RAM
, etc.) and the Scheduler
instances are still considered opaque
entities for the purposes of this page.
RuntimeView
Full discussion Designing RuntimeView Class.
As the name suggests RuntimeView
is a view of the actual runtime
environment. This means it does not own the resources it accesses. Similarly,
copying RuntimeView
instances do not create more resources, instead it just
makes another view to the same set of resources. Conceptually, RuntimeView
is a container filled with ResourceSet
objects.
The RuntimeView
class directly addresses (numbers refer to above):
RuntimeView
is an object, methods in the interface assume they interact with objects.Processes map to
ResourceSet
objects, so managingResourceSet
objects is a large part of supporting process parallelism.
Ultimately powered by MPI
Responsible for managing tasks which use more than one process.
Reference counting of
RuntimeView
used to manage MPI lifetimeTracks resource usage above the process level.
Supports logical partitioning at or above the process level.
Responsible for the overall mapping of hardware to
ResourceSet
State is tied to underlying MPI communicator. Splitting the communicator is allowed.
ResourceSet
Full discussion is at Designing ParallelZone’s ResourceSet Class.
Each process is associated with a ResourceSet
instance. The ResourceSet
affiliated with the current process contains the resources which are local
and directly accessible to that process.
It’s an object, uses objects to model resources, and assumes it will interact with objects.
Managing parallelism of CPU/GPU threads handled by sub-objects.
Additional members can be added for new hardware.
Data movement from one
ResourceSet
to another uses MPI.Responsible for managing tasks below the process level.
Tracks resource usage below process level
Allows logical partitioning of local resources
Understands that its resources are local and resources in other
ResourceSet
objects are in general local.
Hardware Classes
See discussions at Designing RAM Class.
These classes have not been designed in detail yet. For now we note they should
allow the user to query whatever information they want in order to make
informed runtime decisions. They also are responsible for encapsulating the
mechanisms for interacting with that device (e.g., sending result from the
memory of one ResourceSet
to that of another).
Scheduler Classes
The scheduler has not been designed yet.
Additional Options
Ultimately the runtime environment is a singleton. It may make more sense to
actually create a class Runtime
which holds the state of the runtime
environment and have RuntimeView
really hold a reference to the Runtime
object. The primary benefit of this is that it better maps to MPI (which really
is a singleton).