A Shining Star Lights Up the Road to Exascale

written by Mike Bernhardt
February 2013

While we have more than our share of stories talking about frustration and politics negatively impacting the race to exascale, there are several bright spots that deserve a round of applause.

One such shining star, the DEEP project, comes from the Jülich Research Centre, nestled in the heart of the Stetternich Forest in Jülich.

DEEP is one of the European responses to the Exascale challenge.

(From the DEEP project flyer) The DEEP consortium, led by Forschungszentrum Jülich, proposes to develop a novel, Exascale-enabling supercomputing architecture with a matching SW stack and a set of optimized grand-challenge simulation applications.

DEEP proposes to take a novel approach to compute acceleration. Rather than adding accelerator cards to Cluster nodes, an accelerator Cluster, called Booster, will complement conventional HPC systems to increase the compute performance.

We're pleased to present this interview with three of the key spokespeople for Forschungszentrum Jülich:

The Exascale Report: How / where / and when did the idea for the DEEP project first originate?

Thomas Lippert: DEEP as a scientific experiment is rooted in early experiences with parallel systems like the Connection Machines, the Italian APE machines and in particular the PQE2000 ideas with two heterogeneous components at the end of the 1990s. Since then we have built up a 10-year history of experience in cluster computing and cluster software, and the first concept for a hybrid Cluster-Booster architecture was made at the end of 2009.

Norbert Eicker: While reviewing the capabilities of cluster computers equipped with accelerators and aligning them with the experience gained with the QPACE system, it became clear that some fundamental change concerning concurrency levels has to come; otherwise only a limited number of applications might profit from Exascale. The idea of the Cluster-Booster architecture was then refined in a paper presented at the PARS workshop in 2011 (N. Eicker and Thomas Lippert, PARS '11, PARS-Mitteilungen, Mitteilungen - Gesellschaft für Informatik e.V., Parallel-Algorithmen und Rechnerstrukturen, ISSN 0177-0454, Nr. 28, Oktober 2011 (Workshop 2011), pp. 110 - 119). In parallel, the DEEP consortium was formed and the project proposal was submitted to the EU.

The Exascale Report: Are we correct in stating the DEEP project was launched officially in December 2011, and is planned to run through December 2014?

Estela Suarez: Yes, this is correct. DEEP is a three-year project and we started on 1st December 2011. The end of the project will accordingly be on 30th November 2014.

The Exascale Report: Who are the DEEP project partners?

Estela Suarez: Intel GmbH, ParTec GmbH, Leibniz-Rechenzentrum der Bayerischen Akademie der Wissenschaften, Universitaet Heidelberg, German Research School for Simulation Sciences GmbH, EUROTECH S.p.A, Barcelona Supercomputing Center - Centro Nacional de Supercomputacion, Mellanox Technologies Ltd, Ecole polytechnique federale de Lausanne, Katholieke Universiteit Leuven, Centre Européen de Recherche et de Formation Avancée en Calcul Scientifique, The Cyprus Institute, University of Regensburg, CINECA, CGGVeritas, and of course Forschungszentrum Juelich GmbH, which is the coordinator of the project.

The Exascale Report: Is the project flyer on the DEEP site still accurate, for any of our readers who would like to see this overview of DEEP, or does some of this information need to be updated? http://www.deep-project.eu/deep-project/EN/Service/Flyer/_node.html

Estela Suarez: The flyer is accurate, but the picture on the software architecture has been refined in the last months, following the discussions between the software experts and application developers. Here is the new software architecture diagram.

The Exascale Report: How much funding has the DEEP project received to date, and how much funding do you anticipate receiving during calendar year 2013?

Estela Suarez: DEEP is a three-year-duration project that receives a total funding of 8.030.000 Euro from the European Commission.

The Exascale Report: Could you list the six primary applications targeted for the DEEP project?

Estela Suarez:

  • Brain simulation (Federal Polytechnic School of Lausanne)
  • Space weather simulation (Catholic University of Leuven)
  • High temperature superconductivity (CINECA)
  • Seismic imaging (CGGVeritas)
  • Computational fluid engineering (European Centre for Research and Advanced Training in Scientific Computation)
  • Climate simulation (Cyprus Institute)

These applications have been selected as guidelines for their high scientific and industrial relevance and the urgent need for Exascale computing in their research fields. With these six applications we want to cover a wide spectrum of science and industrial applicability, since DEEP aims to build a general purpose machine. Porting the applications will serve us to evaluate the DEEP concept, its programmability, and to compare its performance with standard architectures. From the lessons learned in this way we will create best practice guidelines and propose improvements to the DEEP System.

The Exascale Report: Are the hardware development efforts and the software stack design moving forward at a complementary pace? And which presents the more difficult challenge?

Estela Suarez: Yes, indeed. In DEEP we have created a working group called "Design and Development Group" (DDG), whose task is precisely to keep hardware development and software design moving forward at a complementary pace. In the regular meetings of the DDG the technical experts discuss the design of hardware and software concepts, the construction and testing of hardware prototypes, the software implementation on those prototypes, etc. Having both hardware and software experts around the table guarantees a coherent development of both aspects and allows for a good overview of the evolution of the project.

Norbert Eicker: In my opinion both the hardware and the software aspects of the project present very interesting challenges. On the hardware side we are going to build a cluster of Intel's Xeon Phi accelerators, integrating them in the novel EXTOLL network and providing the highest scalability for the highly scalable and very regular workloads to be offloaded to this part of the system, called the Booster. It will be attached to a standard HPC cluster with Intel Xeon processors and a Mellanox InfiniBand network. In order to bridge Cluster and Booster and their corresponding network fabrics, the Booster-Interface-Cards (BICs) have to be designed. All components have to work together in a smooth and balanced fashion in order to provide a scalable system with good performance to users porting their applications to this system.

The special hardware characteristics of DEEP also influence the software, making the design and adaptation of the software stack an interesting challenge. In particular, the level of heterogeneity provided by the Cluster-Booster architecture creates a significant burden for application developers. The idea of the DEEP software stack is to relieve them of handling the specifics of the architecture as far as possible. This includes replacing explicit modifications of the code with annotations, leaving most of the actual adaptation to the Cluster-Booster architecture to the run-time system that will manage the DEEP System.
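To make the annotation idea concrete, here is a minimal sketch in Python. It is purely illustrative: the interview does not name the annotation syntax or the run-time interface, so the `offload` decorator, the `POOLS` table, and the two example functions are hypothetical stand-ins, and two thread pools merely play the roles of Cluster and Booster. The point is only that the application code carries a marker, while a run-time layer decides where each part executes.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the two sides of the system. Real DEEP
# hardware and its run-time system are not modelled here.
POOLS = {
    "cluster": ThreadPoolExecutor(max_workers=2),  # few, more powerful cores
    "booster": ThreadPoolExecutor(max_workers=8),  # many simpler cores
}


def offload(target):
    """Hypothetical annotation: tag a function with the side it should run on."""
    def wrap(fn):
        def submit(*args, **kwargs):
            # The "run-time" picks the pool; the caller only receives a
            # future and never addresses the hardware directly.
            return POOLS[target].submit(fn, *args, **kwargs)
        submit.target = target
        return submit
    return wrap


@offload("booster")
def stencil_chunk(chunk):
    # Regular, highly scalable work: the same operation on every element.
    return [2 * x for x in chunk]


@offload("cluster")
def combine(parts):
    # Less scalable part: a sequential merge of the partial results.
    return sum(sum(p) for p in parts)


def run(data, chunk_size=4):
    # Split the input, send every chunk to the "booster", then merge
    # the partial results on the "cluster".
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    futures = [stencil_chunk(c) for c in chunks]
    parts = [f.result() for f in futures]
    return combine(parts).result()
```

Calling `run(list(range(10)))` doubles each value on the "booster" pool and sums the results on the "cluster" pool, returning 90. Changing where a function runs means editing its annotation, not its body, which is the property the software stack described above is aiming for.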

The Exascale Report: Can you describe the program's goal in terms of what you hope to achieve with application progress and system capability in the 2018 to 2020 timeframe?

Norbert Eicker: As the project acronym (Dynamical Exascale Entry Platform) suggests, DEEP aims for a heterogeneous cluster-architecture leading to Exascale by the end of the decade. We see the Cluster-Booster architecture as a natural evolution of today's HPC-clusters taking the strong developments in the field of accelerators into account.

The Exascale Report: Would you consider the DEEP project an Evolutionary or Revolutionary approach?

Thomas Lippert: In the DEEP scientific experiment we want to find out if the new Cluster-Booster paradigm can widen the application portfolio of Exascale systems. We will be able to assess the optimal balance between cluster and booster code parts for hardware, programming environments and application software. If it turns out to lead to a paradigm change, call it revolutionary; certainly it is an evolution of ideas that have been around for a long time.

Norbert Eicker: DEEP's revolutionary component is the fact that it is a heterogeneous system with a Cluster element and a Booster element. The latter allows for employing accelerating many-core processors like Intel Xeon Phi in a most flexible and scalable fashion. At the same time the Cluster-Booster architecture can be seen as an evolution of the Cluster idea, i.e. using commodity off-the-shelf components in order to build HPC systems. The same duality can be found on the software side of the project: while the heterogeneity of the hardware architecture makes changes to the application codes and the programming paradigm in general indispensable, the DEEP system software stack tries to hide this complexity from the application developer as far as possible. At the same time we try to give developers the tools necessary to identify the different levels of scalability concealed in the parts of their code in order to exploit the capabilities of the Cluster-Booster architecture as efficiently as possible.

The Exascale Report: Is the DEEP project on schedule and will the first system be available to users in October? What are the specifications for the first system?

Estela Suarez: The DEEP project is, in particular regarding the hardware installation, even ahead of schedule. We have already installed the Cluster part of the DEEP System: an AURORA rack from Eurotech. All 128 nodes (each containing 2 Sandy Bridge sockets) have been integrated into the Cluster rack and connected to the power and liquid cooling systems. A small Linpack benchmark was run on all the nodes individually and they all achieved about 90% performance, which is a very good result for the kind of non-optimized Linpack that was used. The cooling system has also demonstrated its ability to dissipate all the waste heat from the system without difficulties. The DEEP Cluster will be available to users in October, as soon as the installation of login nodes, storage, and operating system is completed. The InfiniBand network is expected to be ready by the end of the year.

The DEEP System will be finally completed with the installation of the DEEP Booster (a cluster of Intel's Xeon Phi accelerators) next to the Cluster. This part of the system is at the moment under development and its installation is planned in the project for early 2014.

The Exascale Report: When discussing the DEEP project, what do you find to be the biggest misunderstanding or biggest point of confusion as to what DEEP is all about?

Norbert Eicker: I have observed that scientists are often irritated by their experience with accelerators like GPGPUs. There they have to identify local kernels to be off-loaded to the GPGPU. DEEP's approach is much simpler: highly scalable code parts of the applications are off-loaded to the Booster, exploiting the enormous scalability provided by this part of the system. At the same time less-scalable code parts can make use of the more powerful processors provided by the DEEP Cluster.

Final note:

DEEP will be represented at a Birds-of-a-Feather (BOF) session at SC12, November 13, 2012 at 17:30

http://www.deep-project.eu/SharedDocs/Termine/DEEP-PROJECT/EN/sc12-bof.h...

Photo from Jülich Research Centre