With the growing demand for computing power, particularly due to the development of artificial intelligence, the energy consumption of supercomputers is also increasing. For ten years, scientists from IT4Innovations National Supercomputing Centre at VSB – Technical University of Ostrava have been working on optimising the energy efficiency of data centres. Thanks to international cooperation and its own innovations, the Czech supercomputing centre is one of the European leaders in energy efficiency optimisation in computing. Saving energy is not just a matter of costs but also an active contribution to protecting the environment.

Higher energy consumption is an issue not only for supercomputers but also for data centres designed for artificial intelligence. Companies such as Google, Amazon, and Meta are building their facilities adjacent to power plants to ensure they have sufficient power. Estimates suggest that the energy consumption of AI data centres could quadruple by the end of the decade.

"Processors and accelerators consume the most electricity. In addition, they heat up during operation, so cooling is also necessary, which adds 8 to 30% extra energy," explains Lubomír Říha, Head of the IT4Innovations Infrastructure Research Lab. "Each new generation of chips can perform more calculations per unit of energy, but the demand for performance is growing even faster – this is known as the Jevons paradox."

 

MERIC: Czech software that saves megawatts

Energy savings in supercomputing centres are sought not only in hardware but also in software. One of the key tools developed at IT4Innovations is the open-source MERIC software – a set of tools for monitoring, managing, and optimising energy consumption in data centres. MERIC enables supercomputers to operate in a fast and energy-efficient manner: it measures energy consumption, monitors the utilisation of processors (CPUs) and computing accelerators (e.g., GPUs), and dynamically adjusts their settings based on the current load. Users can therefore measure energy consumption at the application or component level and tune system settings themselves for maximum performance without excessive electricity consumption. The MERIC Software Suite also includes infrastructure for monitoring consumption and temperature in data rooms, featuring 3D visualisation that displays the status of systems in real time. This helps administrators identify problem areas (e.g., in the cooling system) more easily and better understand the links between individual parts of the supercomputer. At the administrator level, MERIC provides rule-based power capping to keep consumption below specified limits.
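To make the idea of power capping more concrete, the following is a minimal sketch of one possible rule: if the total measured power of a group of nodes exceeds a limit, each node's cap is scaled down proportionally. It is an illustration only and does not reproduce MERIC's actual interface or rules; the function name `node_caps`, the node names, and all wattages are hypothetical.

```python
from typing import Dict, Optional


def node_caps(measured_w: Dict[str, float], limit_w: float) -> Dict[str, Optional[float]]:
    """Return a per-node power cap in watts, or None where no cap is needed.

    Hypothetical rule (not MERIC's API): when the summed measured power
    exceeds `limit_w`, scale every node's power by the same factor so that
    the caps add up to exactly `limit_w`.
    """
    total_w = sum(measured_w.values())
    if total_w <= limit_w:
        # The group is within its budget, so no node has to be capped.
        return {node: None for node in measured_w}
    scale = limit_w / total_w
    return {node: power_w * scale for node, power_w in measured_w.items()}


if __name__ == "__main__":
    # Two hypothetical nodes drawing 800 W together under a 600 W limit.
    print(node_caps({"n1": 300.0, "n2": 500.0}, limit_w=600.0))
```

Proportional scaling is only one of many conceivable rules; a production system could just as well prioritise certain jobs or react to temperature readings.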

"Thanks to MERIC, we have reduced the energy consumption of some calculations by up to 34%, with an average reduction of around 16%, without significantly slowing down the calculations," says Ondřej Vysocký, Head of a research group behind the development of MERIC. With its modular architecture, MERIC can be easily integrated into existing HPC centre management and monitoring systems. The software is deployed not only on the Czech Karolina supercomputer but also on Portugal's Deucalion, which belongs to the EuroHPC Joint Undertaking (EuroHPC JU) network. "A new version of MERIC will be released in the coming weeks, and the MERIC runtime system will be deployed on all EuroHPC JU supercomputers, enabling users of these machines to evaluate the energy efficiency of their applications in a uniform manner," adds Ondřej Vysocký.

"MERIC offers interfaces for commonly used programming languages. It can work effectively with parallel applications and save energy without reducing their performance, or the impact on performance can be fully controlled. MERIC can be used on various types of computers – from standard processors to powerful computing (e.g., graphics) accelerators – and works with most energy monitoring systems," summarises Lubomír Říha.

Karolina: smart performance, lower emissions

In 2023, IT4Innovations also introduced operating frequency restrictions for the Karolina supercomputer processors, thereby reducing energy consumption and the carbon footprint. "The impact on users was minimal – for tasks demanding processor or accelerator performance, the calculation time increased by a maximum of 16%, but for tasks dependent on system memory speed, which account for 80% of the tasks on our systems, there was no slowdown," adds Lubomír Říha.
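The trade-off behind a frequency restriction can be checked with simple arithmetic, because energy equals average power multiplied by runtime. The sketch below computes how far average power must fall for a given slowdown to still save energy. The 16% slowdown comes from the text above; the 20% power reduction is a purely hypothetical value chosen for illustration, and the function names are likewise assumptions.

```python
def relative_energy(power_ratio: float, runtime_ratio: float) -> float:
    """Energy after the change divided by energy before, using E = P * t."""
    return power_ratio * runtime_ratio


def break_even_power_ratio(runtime_ratio: float) -> float:
    """Power ratio at which energy use is unchanged for a given slowdown."""
    return 1.0 / runtime_ratio


if __name__ == "__main__":
    slowdown = 1.16  # worst case quoted above: runtime up by 16%
    # Average power must fall below this fraction of the original to save energy.
    print(f"break-even power ratio: {break_even_power_ratio(slowdown):.3f}")
    # Hypothetical case: average power drops to 80% of its original value.
    print(f"relative energy: {relative_energy(0.80, slowdown):.3f}")
```

With a 16% longer runtime, average power has to drop by roughly 14% or more before any energy is saved. For memory-bound tasks with no slowdown (runtime ratio 1.0), every reduction in power is a direct energy saving.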

The long-term emphasis on energy efficiency is also reflected in international rankings. In the Green500 list of the world's most energy-efficient supercomputers, Karolina's accelerated partition ranked 15th after its launch in 2021. Six months later, thanks to parameter optimisation carried out in collaboration with HPE (the computer supplier), it moved up to 8th place, and even after more than four years of operation, it remains in the top 100, currently in 77th place.

"The Karolina optimisation has yielded significant results – compared to the original state, we have managed to reduce power consumption, which represents annual financial savings in the order of several million crowns. And in environmental terms, this corresponds to CO2 emission reductions equivalent to those achieved by over 12,600 mature trees. The energy efficiency of supercomputers is crucial for us. Our goal is not only to operate powerful machines but also to operate them in a smart and economical way," concludes Lubomír Říha.

 

MERIC Energy Efficiency HPC SW Suite: https://code.it4i.cz/energy-efficiency/meric-suite

* MERIC: The development of MERIC software began in 2015 thanks to the READEX project, and its further development is supported by e-INFRA CZ, the e-infrastructure for research and development in the Czech Republic, and several European projects, including EUPEX, POP3 Centre of Excellence, SEANERGYS, and DARE.