===Memory and communication===

Main memory in a parallel computer is either [[Shared memory (interprocess communication)|shared memory]] (shared between all processing elements in a single [[address space]]), or [[distributed memory]] (in which each processing element has its own local address space).<ref name=PH713>Patterson and Hennessy, p. 713.</ref> Distributed memory refers to the fact that the memory is logically distributed, but often implies that it is physically distributed as well. [[Distributed shared memory]] and [[memory virtualization]] combine the two approaches, where the processing element has its own local memory and access to the memory on non-local processors. Accesses to local memory are typically faster than accesses to non-local memory.

On [[supercomputers]], a distributed shared memory space can be implemented using a programming model such as [[Partitioned global address space|PGAS]]. This model allows processes on one compute node to transparently access the remote memory of another compute node. All compute nodes are also connected to an external shared memory system via a high-speed interconnect such as [[InfiniBand]]; this external shared memory system is known as a [[burst buffer]], which is typically built from arrays of [[non-volatile memory]] physically distributed across multiple I/O nodes.

[[File:Numa.svg|right|thumbnail|400px|A logical view of a [[non-uniform memory access]] (NUMA) architecture. Processors in one directory can access that directory's memory with less latency than they can access memory in another directory.]]

Computer architectures in which each element of main memory can be accessed with equal [[Memory latency|latency]] and [[Bandwidth (computing)|bandwidth]] are known as [[uniform memory access]] (UMA) systems. Typically, that can be achieved only by a [[Shared memory (interprocess communication)|shared memory]] system, in which the memory is not physically distributed. A system that does not have this property is known as a [[non-uniform memory access]] (NUMA) architecture. Distributed memory systems have non-uniform memory access.

Computer systems make use of [[CPU cache|cache]]s, which are small, fast memories located close to the processor that store temporary copies of memory values (nearby in both the physical and logical sense). In parallel computer systems, a cache may store the same value in more than one location, creating the possibility of incorrect program execution. These computers require a [[cache coherency]] system, which keeps track of cached values and strategically purges them, thus ensuring correct program execution. [[Bus sniffing|Bus snooping]] is one of the most common methods for keeping track of which values are being accessed (and thus should be purged). Designing large, high-performance cache coherence systems is a very difficult problem in computer architecture.
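As an illustration of the PGAS-style remote access described above, the following minimal C sketch uses [[Message Passing Interface|MPI]] one-sided communication (<code>MPI_Get</code> over an exposed memory window) to let one process read another node's local memory. This is an analogy rather than a PGAS language itself; dedicated PGAS languages such as [[Unified Parallel C]] hide such calls behind ordinary array syntax, and the variable names here are illustrative.

<syntaxhighlight lang="c">
/* Minimal sketch of PGAS-style remote access using MPI one-sided
 * communication. Each process exposes one integer in a globally
 * addressable window; rank 0 reads rank 1's copy directly. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int local = rank * 100;                    /* this node's local memory */
    MPI_Win win;
    MPI_Win_create(&local, sizeof(int), sizeof(int),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    int remote = -1;
    MPI_Win_fence(0, win);
    if (rank == 0 && size > 1)                 /* read rank 1's memory */
        MPI_Get(&remote, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
    MPI_Win_fence(0, win);

    if (rank == 0 && size > 1)
        printf("rank 0 read %d from rank 1\n", remote);

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}
</syntaxhighlight>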
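To illustrate the local/remote distinction on a NUMA machine, the sketch below uses Linux's <code>libnuma</code> to place buffers on explicit NUMA nodes; a thread running on node 0 would see lower latency on the first buffer than on the second. The node numbers and buffer size are assumptions for the example.

<syntaxhighlight lang="c">
/* Sketch of NUMA-aware allocation on Linux using libnuma
 * (link with -lnuma). Node numbers and sizes are illustrative. */
#include <numa.h>
#include <stdio.h>

int main(void) {
    if (numa_available() < 0) {
        fprintf(stderr, "NUMA not supported on this system\n");
        return 1;
    }
    size_t bytes = 64 * 1024 * 1024;

    /* Place one buffer on node 0 and, if present, one on node 1. */
    void *local  = numa_alloc_onnode(bytes, 0);
    void *remote = numa_max_node() >= 1
                 ? numa_alloc_onnode(bytes, 1) : NULL;

    printf("highest NUMA node: %d\n", numa_max_node());

    numa_free(local, bytes);
    if (remote) numa_free(remote, bytes);
    return 0;
}
</syntaxhighlight>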
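Cache coherence protocols themselves live in hardware, but their cost is visible to software. The sketch below shows ''false sharing'': two threads write to distinct variables that happen to occupy the same cache line, forcing the coherence system to shuttle that line between cores on every update. The 64-byte line size is an assumption about the hardware.

<syntaxhighlight lang="c">
/* False-sharing sketch (compile with -pthread). Two threads increment
 * distinct counters that share one cache line, so the coherence
 * protocol bounces the line between cores; padded_line shows the
 * usual fix of padding the counters onto separate lines. */
#include <pthread.h>
#include <stdio.h>

struct { long a; long b; } shared_line;               /* same cache line  */
struct { long a; char pad[64]; long b; } padded_line; /* separate lines   */

static void *bump(void *arg) {
    volatile long *p = arg;
    for (long i = 0; i < 100000000L; i++)
        (*p)++;                       /* each write invalidates the line
                                         in the other core's cache      */
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, bump, (void *)&shared_line.a);
    pthread_create(&t2, NULL, bump, (void *)&shared_line.b);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("a=%ld b=%ld\n", shared_line.a, shared_line.b);
    return 0;
}
</syntaxhighlight>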
As a result of this coherence overhead, shared memory computer architectures do not scale as well as distributed memory systems do.<ref name=PH713/>

Processor–processor and processor–memory communication can be implemented in hardware in several ways, including via shared (either multiported or [[Multiplexing|multiplexed]]) memory, a [[crossbar switch]], a shared [[Bus (computing)|bus]], or an interconnect network in any of a myriad of [[Network topology|topologies]] including [[Star network|star]], [[Ring network|ring]], [[Tree (graph theory)|tree]], [[Hypercube graph|hypercube]], fat hypercube (a hypercube with more than one processor at a node), or [[Mesh networking|n-dimensional mesh]]. Parallel computers based on interconnect networks need some kind of [[routing]] to enable the passing of messages between nodes that are not directly connected. The medium used for communication between the processors is likely to be hierarchical in large multiprocessor machines.
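As an example of routing on one such topology, the following sketch implements dimension-order (''e-cube'') routing in a hypercube: node addresses are bit strings, and each hop flips the lowest bit in which the current node and the destination differ. The function name is illustrative.

<syntaxhighlight lang="c">
/* Dimension-order ("e-cube") routing in a hypercube. Each hop moves
 * along the lowest dimension in which the current node's address
 * still differs from the destination's. */
#include <stdio.h>

void route(unsigned src, unsigned dst) {
    unsigned cur = src;
    printf("%u", cur);
    while (cur != dst) {
        unsigned diff = cur ^ dst;    /* dimensions still to cross */
        unsigned bit  = diff & -diff; /* lowest differing bit      */
        cur ^= bit;                   /* hop along that dimension  */
        printf(" -> %u", cur);
    }
    printf("\n");
}

int main(void) {
    /* In a 3-cube: 0 -> 1 -> 5 (binary 000 -> 001 -> 101). */
    route(0, 5);
    return 0;
}
</syntaxhighlight>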