ExaNeSt, a Horizon 2020 FET-HPC project, develops and prototypes solutions to some of the crucial problems on the way towards the production of exascale-level supercomputers:
WHAT we do:
ExaNeSt develops and prototypes solutions for Interconnection Networks, Storage, and Cooling, as these have to evolve in order for the production of exascale-level supercomputers to become feasible. We tune real HPC Applications, and we use them to evaluate our solutions.
We develop and prototype innovative hardware and software for interconnection networks, so that they become tightly integrated with the system components, become faster, offer better quality of service (QoS), especially congestion mitigation, are resilient to failures, and consume less energy.
We develop and prototype a distributed storage system in which NVMs are local to the compute cores, hence fast to access at low energy cost, while the NVMs of the entire system together form a unified storage.
We develop and prototype innovative Packaging and Cooling technology, based on total immersion in a sophisticated, non-conductive, engineered coolant fluid that allows the highest possible packing density while maintaining reliability.
Furthermore, we tune our firmware, systems software, libraries, and the applications themselves so that they take the best possible advantage of our novel communication and storage architecture: we support task-to-data software locality models, to ensure minimum data-communication energy overheads and property maintenance in databases; and we provide a platform-management scheme for big-data I/O to our resilient, unified, distributed storage architecture.
WHY we do it:
HPC is a precious tool for all of modern technology, science, and society. For the next generation of HPC systems, we need millions of low-power-consumption computing cores, tightly interconnected, packaged together, appropriately cooled, and supported by a new storage architecture.
We are a project in a group of projects (FETHPC-2014(a)) that develop the technology needed for building the next generation of High Performance Computing (HPC) systems, also known as “Supercomputers”: the exascale-level HPC systems, i.e. those achieving performance in the range of 1 ExaFLOPS = 10^18 FLOPS, one quintillion FLoating-point Operations Per Second.
HPC is, today, a tool of capital importance in the hands of humankind. According to the ETP4HPC Strategic Research Agenda (SRA): “HPC is a pervasive technology that strongly contributes to the excellence of science and the competitiveness of industry. Questions as diverse as how to develop a safer and more efficient generation of aircraft, how to improve the green energy systems such as solar panels or wind turbines, what is the impact of some phenomena on the climate evolution, how to deliver customized treatments to patients... cannot be answered without using HPC [...] HPC is also recognized as crucial in addressing grand societal challenges. Today, to out-compute is to out-compete best describes the role of HPC”. The HiPEAC Roadmap (Vision 2015) adds: “Science is entering its 4th paradigm: data intensive discovery. After being essentially based on Theory, Empirical science, and Simulation, science is now using Data Analytics (cognitive computing) on instrument data, sensor data, but also on simulation data. Simulations are therefore becoming more and more important for science and for industry, avoiding spending millions in experiments. High performance computing is an enabling technology for complex simulations.”
Improving the performance of supercomputers as much as possible is an everlasting need, because that progress makes it possible to solve bigger and bigger problems that were impossible to solve with the previous generation of supercomputers, or to solve problems at greater accuracy than was previously possible. In previous decades, the performance of supercomputers grew owing to advances in technology that allowed individual processors to become faster, using faster transistors (higher clock frequency) and more transistors (a more advanced pipelined architecture): it was possible to keep the number of processors fixed and still come up with faster supercomputers. Not anymore, though: technology has recently reached a plateau where individual general-purpose processing “cores” cannot be made any faster, because they would consume excessive electric power relative to the computational performance that they would offer. Thus, the only two ways left for achieving higher performance are to integrate a larger number of cores and to integrate special-purpose compute engines in an HPC system. In this project, we work on enabling more cores, millions of compute cores, by improving the network that interconnects them: ordinary networks become slow and congested when they grow in size, and novel techniques are needed in order to prevent that. Meanwhile, this increase in the number of computing cores has to be achieved...
... while keeping electric power consumption constant! Large supercomputers already consume a few tens of Megawatts, like a town of a few tens of thousands of people, each. Neither electric power companies nor society can afford to spend more than that on supercomputing. Thus, in order for progress to be sustained, we need each computing core to spend fewer watts of electricity if we are to integrate more cores in a supercomputer. For that reason, in this project, we use computing cores designed by ARM, as opposed to the more electricity-hungry processors used by other HPC systems. ARM, a European company, is the world leader in low-power-consumption processors, which is the reason why it already dominates the mobile-phone market; owing precisely to this low-consumption advantage, ARM processors are now moving into datacenters and into HPC. The computing “chiplets” that we use are provided to us by the previous EuroServer and the concurrent ExaNoDe projects.
New storage technologies have appeared in the meanwhile: flash memories are already on the market, and various other kinds of non-volatile memory are gradually emerging from multiple research labs. These are all (much) faster than traditional hard-disk drives (HDDs), and as their capacities grow to become comparable to HDDs, they become the medium of choice for the upper level of the persistent storage hierarchy. As a result, storage now becomes much faster than it used to be, which brings along radical changes in its organization; we take part in this storage-architecture revolution.
HOW we do it:
We use the UNIMEM Global Address Space and zero-copy send/receive operations, and we tune our applications for such architectures. We develop efficient networking with all its frequent-path functions implemented in hardware, including advanced congestion management for full utilization at low latency; we distribute the file storage on top of that.
When millions of computing cores cooperate in solving a problem, they need to communicate with each other. Communicating a number from one chip to another, today, takes orders of magnitude more energy than performing an arithmetic operation on that number on-chip. A large portion of this energy is wasted in copying the communicated data from one memory buffer to another; additional energy is spent running complex networking protocols in software.
We use the UNIMEM architecture from the EuroServer project: its Global Address Space allows communication using Remote Direct Memory Access (RDMA) operations, which deliver data in-place and avoid receiver-side copying. On the sender side, the Input/Output Memory Management Unit (IOMMU) and DMA Engine Virtualization allow user-level initiation of RDMA operations, thus avoiding expensive system calls and data copying. UNIMEM also allows remote DRAM borrowing and remote load/store instructions, which enable remote-mailbox and remote-interrupt notifications for low-latency protocols.
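As a rough illustration of this zero-copy path, the C sketch below shows how a user-level RDMA transfer might be initiated on such an architecture. All names here (rdma_desc, unimem_map_dma_queue, unimem_post) are hypothetical, invented for illustration; they are not the actual UNIMEM interface.

/* Hypothetical sketch of user-level, zero-copy RDMA on a UNIMEM-like
   global address space; all unimem_* names are invented illustrations. */
#include <stdint.h>
#include <stddef.h>

typedef uint64_t gaddr_t;                /* system-wide (global) address */

/* Descriptor posted directly to a virtualized DMA engine; because the
   IOMMU translates user virtual addresses, no system call is needed. */
struct rdma_desc {
    gaddr_t     dst;     /* remote destination: data delivered in place */
    const void *src;     /* local user buffer: no sender-side copy      */
    size_t      len;     /* bytes to transfer                           */
    gaddr_t     notify;  /* remote mailbox written on completion        */
};

/* Assumed primitives: map the DMA engine's user-accessible queue, and
   post a descriptor to it (memory-mapped I/O, no kernel involvement). */
extern volatile struct rdma_desc *unimem_map_dma_queue(void);
extern void unimem_post(volatile struct rdma_desc *q, const struct rdma_desc *d);

/* Send n doubles to a remote buffer, zero-copy, from user level. */
void send_in_place(const double *buf, size_t n, gaddr_t remote, gaddr_t mbox)
{
    volatile struct rdma_desc *q = unimem_map_dma_queue();
    struct rdma_desc d = {
        .dst    = remote,
        .src    = buf,
        .len    = n * sizeof *buf,
        .notify = mbox,  /* receiver polls, or is interrupted by, this mailbox */
    };
    unimem_post(q, &d);  /* user-level initiation: no system call, no copying  */
}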
Network congestion is an important and hard problem to be dealt with in all large networks that aspire not to be permanently underutilized. TCP/IP solves it for the Internet in an inefficient, expensive, and long-time-scale manner; other solutions, in hardware, have been tried, with better but non-ideal results. We develop novel solutions for Quality of Service (QoS), including congestion management, in the interconnection network, always keeping the "data plane" (frequent-case) operations in hardware and only performing infrequent control-plane operations in software. We target our solutions at large networks with many flows, while keeping hardware state down to a reasonable cost.
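As a toy illustration of this hardware/software split, the sketch below uses a simple threshold-based marking scheme as a stand-in; the project's actual congestion-management mechanisms are more elaborate, and all names here are invented for illustration.

/* Data plane in hardware, control plane in software: a minimal sketch. */
#include <stdbool.h>
#include <stdint.h>

#define MARK_THRESHOLD 48u  /* queue occupancy (packets) that triggers marking */

struct packet {
    uint32_t flow_id;
    bool     cong_mark;     /* congestion-marking bit carried by the packet */
    /* ... headers and payload ... */
};

/* Data plane (frequent case): in a real switch this per-packet check is
   plain combinational logic, so every packet pays only a fixed, tiny
   cost and no software ever runs on the critical path. */
static inline void data_plane_enqueue(struct packet *p, uint32_t queue_depth)
{
    if (queue_depth > MARK_THRESHOLD)
        p->cong_mark = true;            /* just set a bit, nothing more */
}

/* Control plane (infrequent case): software reacts to aggregated marks,
   e.g. by throttling the offending flow at its source. Since it runs
   rarely its cost is amortized, and keeping per-flow state here rather
   than in the switches bounds the hardware cost. */
void control_plane_on_mark(uint32_t flow_id)
{
    /* e.g. reduce the injection rate of flow_id at the sending node */
    (void)flow_id;
}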
In the domain of data persistence, we place storage devices (flash or other non-volatile memories) with the compute nodes rather than in a centralized location, e.g. behind a network (SAN/NAS). We introduce extensions to a parallel file system in order to take advantage of such devices as a cache layer. Moreover, we design cache-maintenance protocols based on the concepts of the UNIMEM memory-consistency model. On top of that, we plan to provide replication-based resilience, protecting the file system by focusing on metadata integrity.
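A hedged sketch of how a write might flow through such a node-local NVM cache layer, assuming three primitives that we invent here for illustration (nvm_cache_write, replicate_metadata, pfs_flush_async); this is not the actual file-system interface.

/* Write path through a node-local NVM cache in front of a parallel file
   system, with replicated metadata for resilience; all names invented. */
#include <stddef.h>
#include <stdint.h>

typedef uint64_t inode_t;

/* Assumed primitives (hypothetical, for illustration only). */
extern int  nvm_cache_write(inode_t ino, uint64_t off, const void *buf, size_t len);
extern int  replicate_metadata(inode_t ino, uint64_t off, size_t len); /* to a peer node */
extern void pfs_flush_async(inode_t ino);  /* lazy write-back to the parallel file system */

int cached_write(inode_t ino, uint64_t off, const void *buf, size_t len)
{
    /* 1. Absorb the write in node-local NVM: fast and energy-cheap,
          because the device sits next to the compute cores. */
    int rc = nvm_cache_write(ino, off, buf, len);
    if (rc != 0)
        return rc;

    /* 2. Replicate only the metadata describing the cached extent to a
          peer node, so the cached data can be located and recovered if
          this node fails (replication-based resilience via metadata). */
    rc = replicate_metadata(ino, off, len);
    if (rc != 0)
        return rc;

    /* 3. Write the data back to the global parallel file system off the
          critical path. */
    pfs_flush_async(ino);
    return 0;
}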
WHO we are:
The ExaNeSt Consortium combines industrial and academic research expertise, especially in the areas of system cooling and packaging, storage, interconnects, and the HPC applications that drive all of the above.
Four academic and research partners collaborate on the development of Interconnection Network architectures, each in addition to its other topics. FORTH is one of them, and the other three are:
INFN is the Italian National Institute of Nuclear Physics; besides owning a rich HPC infrastructure, it has also developed such systems itself. INFN participates in the design of the low-latency interconnect, and also takes part in the activities on storage and on applications.
And Fraunhofer, Europe's largest organisation for application-oriented research, contributes research towards highly scalable storage solutions, based on the distributed file system that it has previously developed.