Comparison of Shared and Distributed Memory Architectures
Architecture     CC-UMA                 CC-NUMA                  Distributed
Examples         SMPs, DEC/Compaq,      SGI Origin, Sequent,     Cray T3E, MasPar,
                 SGI Challenge,         IBM POWER4 (MCM),        IBM SP2
                 IBM POWER3             DEC/Compaq
Communications   MPI, Threads,          MPI, Threads,            MPI
                 OpenMP, shmem          OpenMP, shmem
Scalability      to 10s of processors   to 100s of processors    to 1000s of processors
Drawbacks        Memory-CPU             Memory-CPU bandwidth,    System administration;
                 bandwidth              non-uniform              programming is hard to
                                        access times             develop and maintain
Software         many 1000s of ISVs     many 1000s of ISVs       100s of ISVs
availability
Hybrid Distributed-Shared Memory
The largest and fastest computers in the world today employ
both shared and distributed memory architectures.
The shared memory component is usually a cache coherent
SMP machine. Processors on a given SMP can address that
machine's memory as global.
The distributed memory component is the networking of
multiple SMPs. SMPs know only about their own memory -
not the memory on another SMP. Therefore, network
communications are required to move data from one SMP to
another.
Current trends seem to indicate that this type of memory
architecture will continue to prevail and increase at the high
end of computing for the foreseeable future.
Advantages and Disadvantages: the hybrid approach inherits
whatever advantages and disadvantages are common to both the
shared and distributed memory architectures it combines.
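In practice, hybrid machines are usually programmed with a matching
hybrid model: MPI for communication between SMP nodes, and threads
(e.g. OpenMP) within each node. The following is a minimal sketch in
C, assuming MPI and OpenMP are available; the problem size and the
compile command are illustrative only.

    /* Hybrid MPI + OpenMP sketch: MPI ranks map to SMP nodes,
     * OpenMP threads share memory within each node.
     * Compile (one possibility): mpicc -fopenmp hybrid.c -o hybrid */
    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>

    #define N 1000000   /* illustrative problem size */

    int main(int argc, char **argv)
    {
        int rank, nprocs;
        double local_sum = 0.0, global_sum = 0.0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        /* Each rank works on its slice of the index space; within
         * the rank, OpenMP threads share that slice's memory. */
        #pragma omp parallel for reduction(+:local_sum)
        for (int i = rank; i < N; i += nprocs)
            local_sum += (double)i;

        /* Network communication: combine the per-node results. */
        MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE,
                   MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("sum = %f\n", global_sum);

        MPI_Finalize();
        return 0;
    }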
Parallel Programming Models: Overview
There are several parallel programming models in common use:
Shared Memory
Threads
Message Passing
Data Parallel
Hybrid
Parallel programming models exist as an abstraction above
hardware and memory architectures.
These models are NOT specific to a particular type of machine or
memory architecture. In fact, these models can (theoretically) be
implemented on any underlying hardware. Two examples:
1. Shared memory model on a distributed memory machine:
   the Kendall Square Research (KSR) ALLCACHE approach.
   Machine memory was physically distributed, but appeared to
   the user as a single shared memory (global address space).
   Generically, this approach is referred to as "virtual shared
   memory".
2. Message passing model on a shared memory machine:
   MPI on the SGI Origin. The SGI Origin employed the CC-NUMA
   type of shared memory architecture, where every task has
   direct access to global memory. However, the ability to send
   and receive messages with MPI, as is commonly done over a
   network of distributed memory machines, is very commonly
   used on this machine as well.
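As a concrete illustration of this independence from the hardware,
here is a minimal message passing sketch in C: the identical code
runs whether the two ranks sit on separate nodes or share one SMP's
memory, since the MPI implementation chooses the transport. The run
command in the comment is one common possibility.

    /* Message passing sketch: rank 0 sends a value to rank 1.
     * Run with at least two ranks, e.g.: mpirun -np 2 ./a.out */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, value;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            value = 42;
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("rank 1 received %d\n", value);
        }

        MPI_Finalize();
        return 0;
    }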
Which model to use is often a combination of what is available and
personal choice. There is no "best" model, although there certainly
are better implementations of some models over others.
Shared Memory Model
In the shared-memory programming model, tasks share a
common address space, which they read and write
asynchronously.
Various mechanisms such as locks / semaphores may be used
to control access to the shared memory.
An advantage of this model from the programmer's point of
view is that the notion of data "ownership" is lacking, so there
is no need to specify explicitly the communication of data
between tasks. Program development can often be simplified.
An important disadvantage in terms of performance is that it
becomes more difficult to understand and manage data locality.
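A minimal sketch of the model in C using OpenMP, one of several
possible shared memory implementations; the critical section plays
the role of the lock mentioned above, and the loop count is
arbitrary.

    /* Shared memory model sketch: threads read and write `counter`
     * with no explicit data communication; a lock guards updates.
     * Compile (one possibility): cc -fopenmp shared.c */
    #include <omp.h>
    #include <stdio.h>

    int main(void)
    {
        long counter = 0;   /* shared by all threads */

        #pragma omp parallel
        {
            for (int i = 0; i < 1000; i++) {
                /* Without the lock, concurrent updates could be lost. */
                #pragma omp critical
                counter++;
            }
        }

        printf("counter = %ld\n", counter);  /* num_threads * 1000 */
        return 0;
    }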
Implementations
On shared memory platforms, the native compilers translate
user program variables into actual memory addresses, which
are global.
No common distributed memory platform implementations
currently exist. However, as mentioned previously in the
Overview section, the KSR ALLCACHE approach provided a
shared memory view of data even though the physical memory
of the machine was distributed.
Threads Model
In the threads model of parallel programming, a single process can
have multiple, concurrent execution paths.
An analogy to describe threads is the concept of a single program
that includes a number of subroutines.
The main program a.out is scheduled to run by the native OS.
It performs some serial work, and then creates tasks (threads)
that can be scheduled and run by the OS concurrently.
Each thread has local data, but also shares the entire
resources of a.out. Each thread also benefits from a global
memory view because it shares the memory space of a.out.
A thread's work may best be described as a subroutine within
the main program. Any thread can execute any subroutine at
the same time as other threads.
Threads communicate with each other through global memory.
This requires synchronization constructs to ensure that no two
threads update the same global address at the same time.
Threads can come and go, but a.out remains present to
provide the necessary shared resources until the application has
completed.
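This picture can be sketched with POSIX threads in C; the names
worker and global_total are illustrative only.

    /* Threads model sketch: main() plays the role of a.out.
     * Compile (one possibility): cc threads.c -lpthread */
    #include <pthread.h>
    #include <stdio.h>

    #define NTHREADS 4

    long global_total = 0;                  /* global memory, shared */
    pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    /* The "subroutine" each thread executes. */
    void *worker(void *arg)
    {
        long id = (long)arg;                /* local (private) data */

        pthread_mutex_lock(&lock);          /* synchronize updates */
        global_total += id;
        pthread_mutex_unlock(&lock);
        return NULL;
    }

    int main(void)
    {
        pthread_t threads[NTHREADS];

        /* ... some serial work here ... */

        for (long i = 0; i < NTHREADS; i++) /* threads come ... */
            pthread_create(&threads[i], NULL, worker, (void *)i);

        for (int i = 0; i < NTHREADS; i++)  /* ... and go; a.out remains */
            pthread_join(threads[i], NULL);

        printf("global_total = %ld\n", global_total);  /* 0+1+2+3 = 6 */
        return 0;
    }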
Threads are commonly associated with shared memory
architectures and operating systems.