===Architecture===
The Blue Gene/L architecture was an evolution of the QCDSP and [[QCDOC]] architectures. Each Blue Gene/L Compute or I/O node was a single [[Application-specific integrated circuit|ASIC]] with associated [[Dynamic random access memory|DRAM]] memory chips. The ASIC integrated two 700 MHz [[PowerPC 440]] embedded processors, each with a double-pipeline double-precision [[floating-point unit|Floating-Point Unit]] (FPU), a [[CPU cache|cache]] sub-system with a built-in DRAM controller, and the logic to support multiple communication sub-systems. The dual FPUs gave each Blue Gene/L node a theoretical peak performance of 5.6 [[FLOPS|GFLOPS (gigaFLOPS)]]. The two CPUs were not [[Cache coherency|cache coherent]] with one another. Compute nodes were packaged two per compute card, with 16 compute cards (thus 32 nodes) plus up to 2 I/O nodes per node board. A cabinet/rack contained 32 node boards.<ref>{{cite web|url=https://asc.llnl.gov/computing_resources/bluegenel/configuration.html|title=BlueGene/L Configuration|first=Lynn|last=Kissel|website=asc.llnl.gov|access-date=13 October 2017|archive-date=17 February 2013|archive-url=https://web.archive.org/web/20130217032440/https://asc.llnl.gov/computing_resources/bluegenel/configuration.html|url-status=dead}}</ref>

By integrating all essential sub-systems on a single chip and using low-power logic, each Compute or I/O node dissipated only about 17 watts (including DRAMs). The low power per node allowed aggressive packaging of up to 1024 compute nodes, plus additional I/O nodes, in a standard [[19-inch rack]], within reasonable limits on electrical power supply and air cooling. The resulting system metrics, in terms of [[FLOPS per watt]], FLOPS per m<sup>2</sup> of floorspace and FLOPS per unit cost, allowed scaling up to very high performance. With so many nodes, component failures were inevitable, so the system was able to electrically isolate faulty components, down to a granularity of half a rack (512 compute nodes), to allow the machine to continue to run.

Each Blue Gene/L node was attached to three parallel communications networks: a [[dimension|3D]] [[torus interconnect|toroidal network]] for peer-to-peer communication between compute nodes, a collective network for collective communication (broadcasts and reduce operations), and a global interrupt network for [[Barrier (computer science)|fast barriers]]. The I/O nodes, which ran the [[Linux]] [[operating system]], provided communication to storage and external hosts via an [[Ethernet]] network and handled filesystem operations on behalf of the compute nodes. A separate, private Ethernet management network provided access to any node for configuration, [[booting]] and diagnostics.

To allow multiple programs to run concurrently, a Blue Gene/L system could be partitioned into electronically isolated sets of nodes. The number of nodes in a partition had to be a positive [[integer]] power of 2, with at least 2<sup>5</sup> = 32 nodes. To run a program on Blue Gene/L, a partition of the computer first had to be reserved. The program was then loaded and run on all the nodes within the partition, and no other program could access nodes within the partition while it was in use. Upon completion, the partition's nodes were released for future programs to use.
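This size constraint amounts to a simple arithmetic test. The following sketch expresses it in plain C; the helper function is hypothetical and illustrative only, not part of any Blue Gene/L tool.

<syntaxhighlight lang="c">
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: a node count is a valid partition size if it is a
   power of two and at least 2^5 = 32, matching the constraint above. */
static bool valid_partition_size(unsigned long nodes)
{
    return nodes >= 32 && (nodes & (nodes - 1)) == 0;
}

int main(void)
{
    unsigned long sizes[] = {16, 32, 48, 512, 1024, 65536};
    for (size_t i = 0; i < sizeof sizes / sizeof sizes[0]; i++)
        printf("%lu nodes: %s\n", sizes[i],
               valid_partition_size(sizes[i]) ? "valid partition" : "not allowed");
    return 0;
}
</syntaxhighlight>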
Blue Gene/L compute nodes used a minimal [[operating system]] supporting a single user program. Only a subset of [[POSIX]] calls was supported, and only one process could run at a time on a node in co-processor mode, or one process per CPU in virtual mode. Programmers needed to implement [[green threads]] in order to simulate local concurrency. Application development was usually performed in [[C (programming language)|C]], [[C++]], or [[Fortran]] using [[Message Passing Interface|MPI]] for communication. However, some scripting languages such as [[Ruby (programming language)|Ruby]]<ref>{{Cite web|title=Compute Node Ruby for Bluegene/L|website=www.ece.iastate.edu|url=http://www.ece.iastate.edu/~crb002/cnr.html|archive-url=https://web.archive.org/web/20090211071506/http://www.ece.iastate.edu:80/~crb002/cnr.html|url-status=dead|archive-date=February 11, 2009}}</ref> and [[Python (programming language)|Python]]<ref>{{cite conference |url=http://us.pycon.org/2011/home/ |title=Python for High Performance Computing |author=William Scullin |date=March 12, 2011 |location=Atlanta, GA}}</ref> were also ported to the compute nodes. IBM published BlueMatter, the application developed to exercise Blue Gene/L, as open source.<ref>[https://github.com/IBM/BlueMatter Blue Matter source code, retrieved February 28, 2020]</ref> This serves to document how the torus and collective interfaces were used by applications, and may serve as a base for others to exercise the current generation of supercomputers.
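Applications typically reached the torus and collective networks through MPI rather than hardware-specific interfaces. The following minimal C sketch uses only generic MPI routines; the process arrangement, dimensions and values are illustrative and none of it is Blue Gene/L system software. It shows the general pattern: ranks arranged as a periodic 3D grid mirroring the torus, followed by a broadcast and a reduction of the kind the collective network accelerated.

<syntaxhighlight lang="c">
/* Illustrative only: standard MPI, compiled with an MPI wrapper such as mpicc. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int world_rank, world_size;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    /* Arrange the processes as a 3D torus: MPI factorizes the process count
       into three dimensions, all with wrap-around links (periods = 1),
       mirroring the machine's toroidal point-to-point network. */
    int dims[3] = {0, 0, 0};
    int periods[3] = {1, 1, 1};
    MPI_Dims_create(world_size, 3, dims);

    MPI_Comm torus;
    MPI_Cart_create(MPI_COMM_WORLD, 3, dims, periods, 1 /* allow reorder */, &torus);

    int rank, coords[3], xminus, xplus;
    MPI_Comm_rank(torus, &rank);
    MPI_Cart_coords(torus, rank, 3, coords);
    /* Nearest neighbours along the first torus dimension. */
    MPI_Cart_shift(torus, 0, 1, &xminus, &xplus);

    /* Collective operations: broadcast a parameter from rank 0,
       then reduce partial results back to rank 0. */
    double step = (rank == 0) ? 0.001 : 0.0;   /* illustrative value */
    MPI_Bcast(&step, 1, MPI_DOUBLE, 0, torus);

    double local = step * coords[0];           /* stand-in for real work */
    double total = 0.0;
    MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, torus);

    if (rank == 0)
        printf("%d ranks as a %dx%dx%d torus, reduced total = %g\n",
               world_size, dims[0], dims[1], dims[2], total);

    MPI_Comm_free(&torus);
    MPI_Finalize();
    return 0;
}
</syntaxhighlight>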