Where is the Synergy between HPC and Data Science?
High-Performance Computing (HPC) in Data Science nowadays (Part 4)
Series index:
1. What is Data Science nowadays?
2. Why should we think about hardware in Data Science?
3. What are the elements of High Performance Computing (HPC) nowadays?
4. Where is the synergy between HPC and Data Science?
5. Project using big clusters or supercomputers (publication coming soon)
Introduction
In this part, you will learn where the synergy between HPC and Data Science lies and meet the top 3 fastest supercomputers in the world. This is an essential topic, because without the right hardware we cannot make progress in Data Science at scale. We will therefore focus on computer systems built for large-scale computing, such as supercomputers and computing clusters.
The synergy between HPC and Data Science lies in the ability to harness advanced computing systems, including supercomputers, to analyze and extract insights from large and complex data sets. With HPC, data scientists can run faster and more accurate analyses, leading to more robust and reliable results, which helps organizations make better-informed decisions and drive innovation across many fields.
What is a supercomputer?
A supercomputer is a specialized computer designed to perform extremely fast and complex calculations. Supercomputers are typically used to solve major scientific and technical problems that require enormous computing power, such as weather forecasting, large-scale simulations and data analysis.
Supercomputers are far more powerful than ordinary computers and handle tasks that would be impossible or impractical for an ordinary machine. They are most common in fields such as scientific research, engineering and finance, where large amounts of data must be analyzed and processed quickly.
Parallel supercomputers work by dividing a problem into smaller parts and distributing those parts to multiple processors that work together to solve the problem. This allows the supercomputer to perform many calculations at the same time, greatly speeding up the process.
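As a toy illustration of this divide-and-combine idea, here is a minimal sketch in Python that splits a large sum across several worker processes on a single machine using the standard multiprocessing module. The problem size and number of workers are arbitrary; real supercomputers apply the same pattern across thousands of nodes, typically with frameworks such as MPI.

```python
# A minimal, single-machine sketch of the "divide and distribute" idea.
from multiprocessing import Pool

def partial_sum(bounds):
    """Solve one chunk of the problem: sum of squares over a sub-range."""
    start, end = bounds
    return sum(i * i for i in range(start, end))

if __name__ == "__main__":
    n = 10_000_000          # illustrative problem size
    workers = 4             # illustrative number of parallel workers
    chunk = n // workers
    # Divide the problem into smaller, independent parts.
    parts = [(w * chunk, (w + 1) * chunk if w < workers - 1 else n)
             for w in range(workers)]
    with Pool(workers) as pool:
        # Each worker process solves its part at the same time...
        results = pool.map(partial_sum, parts)
    # ...and the partial results are combined into the final answer.
    print(sum(results))
```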
Why do we need supercomputers in Data Science?
In the age of big data, we need to analyze data that later informs business decisions, and doing this effectively at scale requires the right computing hardware. Today, many of the top machine learning models are trained on supercomputers; without them, many current advances in data science would not be possible. It is often said that machine learning algorithms were waiting for computing hardware to catch up. Many models need both huge amounts of data and the computing power to process it and extract knowledge from it, and even then training often takes weeks or months.
Top 3 fastest supercomputers in the world
1. Frontier supercomputer (USA)
Frontier is a supercomputer built by HPE and AMD for the US Department of Energy's Oak Ridge National Laboratory (ORNL). Installed in 2021 and brought into full operation in 2022, it is currently ranked as the most powerful supercomputer in the world, according to the Top500 list, which ranks supercomputers based on measured performance.
Frontier is designed to perform a wide range of scientific and technical tasks, including simulation, data analysis and machine learning. It is particularly well suited for computationally intensive tasks such as weather forecasting, astrophysical simulations and genomic research.
One of Frontier's key features is its speed. It is capable of performing more than 1.6 quintillion calculations per second, or 1.6 exaflops, making it the first exascale system on the Top500 list. Frontier is based on the HPE Cray EX architecture (formerly known as Shasta), which combines AMD EPYC processors with AMD Instinct graphics processing units (GPUs) to deliver its computing performance. It contains more than 9,400 nodes, each with one 64-core CPU and four GPUs, for a total of more than 8 million cores. Frontier is also designed to be energy efficient: a state-of-the-art liquid cooling system keeps the hardware at an optimal temperature, helping it sustain peak performance while reducing power consumption.
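To put those numbers in perspective, a rough back-of-the-envelope comparison is sketched below; the laptop figure of roughly 100 gigaflops is an assumption for a modern consumer CPU, not a measured value.

```python
# Rough, illustrative comparison: how long a typical laptop would need
# to match one second of Frontier's peak performance.
frontier_flops = 1.6e18   # ~1.6 exaflops = 1.6 * 10**18 operations per second
laptop_flops = 1e11       # assumed ~100 gigaflops for a consumer CPU
seconds = frontier_flops / laptop_flops
print(f"{seconds:,.0f} seconds, i.e. roughly {seconds / 86400:.0f} days")
# -> 16,000,000 seconds, roughly 185 days
```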
Frontier is expected to have a wide range of applications in scientific and technical fields. Researchers at ORNL are already using it to perform materials simulations, study the impact of climate change, and analyze large data sets to better understand complex systems. It is also expected to be used by industrial partners to perform tasks such as product design and manufacturing simulations.
Overall, Frontier represents a major achievement in supercomputing technology and is expected to have a significant impact on many scientific and technical fields.
2. Fugaku supercomputer (Japan)
Fugaku is a supercomputer developed by RIKEN and Fujitsu. Unveiled in 2020, it held the top spot on the Top500 list until Frontier overtook it in 2022 and remains one of the fastest machines in the world, with a theoretical peak performance of about 537 petaflops (more than half an exaflop). It is used for a wide range of scientific and engineering applications, including simulations in fields such as astrophysics, materials science and genomics.
Fugaku is based on Fujitsu's A64FX processor, a custom Arm-based chip optimized for high-performance computing. It comprises 158,976 nodes, each with a single 48-core A64FX processor, for a total of 7,630,848 compute cores. The system is cooled by a state-of-the-art liquid cooling system that helps keep it running at peak performance.
One of Fugaku’s key features is its ability to run artificial intelligence (AI) and machine learning (ML) algorithms at scale. It was designed specifically to support the development and implementation of AI and ML models, and is expected to play an important role in advancing research in these fields.
In addition to scientific and engineering applications, Fugaku is also being used to support a number of important national initiatives. For example, it helps scientists understand and mitigate the impact of climate change, as well as develop new energy technologies. It is also being used to support the development of advanced manufacturing techniques and improve the efficiency of supply chain management.
Overall, Fugaku is a great achievement in the field of supercomputing and is expected to have a major impact on many areas of scientific and engineering research. It is a testament to the capabilities of modern technology and the ingenuity of the researchers and engineers who developed it.
3. Lumi supercomputer (Finland)
Lumi is a supercomputer hosted at the CSC data center in Kajaani, Finland, and operated by the EuroHPC Joint Undertaking together with a consortium of ten European countries. It was officially launched in 2022 and is one of the most powerful supercomputers in the world, debuting in third place on the Top500 list.
Lumi is built on AMD's third-generation 64-core EPYC processors combined with AMD Instinct GPUs. It has a total of 2,220,288 cores and a theoretical peak performance of more than 400 petaflops, meaning it can perform more than 400 quadrillion calculations per second.
One of Lumi's main goals is to support research in various fields, including bioinformatics, meteorology and astrophysics. Finnish researchers, for example, use it to analyze large amounts of data from the Finnish Meteorological Institute, helping to improve weather forecasting in the region.
Lumi is also being used to study the structure of the universe and the formation of galaxies, as well as to simulate the behavior of materials under extreme conditions such as high pressure and temperature.
In addition to supporting scientific research, Lumi is also being used to teach students about high-performance computing and to train scientists in the use of supercomputers.
Overall, Lumi is a powerful tool that helps researchers and students push the boundaries of possibilities in their fields and advance our understanding of the world around us.
What is a computing grid?
A computing grid is a distributed computing system that allows multiple computers to work together as a single system, enabling them to share resources such as computing power, memory and storage. This type of system is often used to solve large-scale computing problems that require a lot of computing power or to perform tasks that need access to large amounts of data.
In grid computing, computers are networked, and tasks are divided into smaller parts that can be distributed among the computers in the grid. Each computer works on its assigned part and then sends its result to a central location, where the partial results are combined to complete the task. In this way, a computing grid can finish work much faster than a single computer could on its own.
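The scatter-and-gather pattern described above can be sketched with the mpi4py library (assumed to be installed, with the script started via something like `mpirun -n 4 python grid_sum.py`; the file name and problem size are only illustrative). A coordinator process splits the work, each process computes its share, and the partial results are collected and combined.

```python
# Sketch of the scatter/gather pattern used in grid and cluster computing.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()   # this process's id within the group
size = comm.Get_size()   # total number of cooperating processes

if rank == 0:
    # The coordinator divides the task into one chunk per process.
    n = 1_000_000
    step = n // size
    chunks = [range(i * step, (i + 1) * step if i < size - 1 else n)
              for i in range(size)]
else:
    chunks = None

# Distribute the chunks; every process computes its partial result...
my_chunk = comm.scatter(chunks, root=0)
partial = sum(my_chunk)

# ...then the partial results are gathered at the coordinator and combined.
partials = comm.gather(partial, root=0)
if rank == 0:
    print("total:", sum(partials))
```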
Computational grids are used in many applications, including scientific research, data analysis and business operations. They are often used in fields such as finance, medicine and engineering, where there is a need for high-performance computing resources. Grid computing can be implemented using a variety of technologies, such as cloud computing, virtualization and distributed computing.
Well-known computational grids
- European Grid Infrastructure (EGI)
The European Grid Infrastructure (EGI) is a pan-European distributed computing infrastructure that provides access to a wide range of computing resources, data and services. It aims to support researchers and scientists in Europe and around the world who need access to high-performance computing (HPC) resources in their work.
EGI is based on a network of computing centers located in more than 20 countries in Europe and provides users with access to a variety of computing resources, including supercomputers, clusters and grids. It also provides access to a range of data and services, including data storage and management, software tools and collaboration platforms.
EGI is funded by the European Union and managed by a consortium of European research organizations and universities. It is open to researchers from any discipline and is used for a wide range of scientific and research applications, including data analysis, modeling, simulation and big data analysis.
- PL-Grid
PL-Grid is a consortium of research and academic institutions in Poland that operates a national-scale computing infrastructure for scientific research. It provides access to a range of computing resources, including HPC systems, storage and networking, as well as support and training for researchers in Poland. The infrastructure is funded by the Ministry of Science and Higher Education and is intended to support research and development in a wide range of fields, including the natural sciences, engineering and social sciences.
The PL-Grid consortium was formed by five Polish supercomputing and networking centers:
1. Academic Computer Centre Cyfronet AGH, Krakow (consortium coordinator)
2. Interdisciplinary Centre for Mathematical and Computational Modelling (ICM), University of Warsaw
3. Poznan Supercomputing and Networking Center (PSNC), Poznan
4. Centre of Informatics - Tricity Academic Supercomputer & Network (CI TASK), Gdansk
5. Wroclaw Centre for Networking and Supercomputing (WCSS), Wroclaw
- Golem grid
Golem is a decentralized network that allows users to buy and sell computing power from other users. It is designed as a global, open and decentralized marketplace for computing power. Golem’s goal is to create a platform for distributed computing that is more flexible, efficient and secure than traditional centralized systems.
Users of the Golem network can buy computing power from other users to perform compute-intensive tasks such as rendering graphics, training machine learning models or processing scientific data. Users who have excess computing power can sell their resources on the Golem network and earn money in return.
Golem uses the Ethereum blockchain to facilitate transactions and ensure that payments are made securely and transparently. The network is powered by a native token, GLM (which replaced the original Golem Network Token, GNT, in 2020), used to pay for computing resources and transactions on the platform.
Overall, the Golem Network aims to create a more efficient and cost-effective way for individuals and organizations to access computing resources and perform complex tasks that require a lot of computing power.
Summary
What did we learn in this part of the series?
1. why we need supercomputers,
2. what a supercomputer is,
3. which supercomputers are currently the fastest,
4. what computational grids are,
5. and what the open-source Golem grid project is.
Words by Patryk Binkowski, Data Scientist at Altimetrik Poland
https://www.linkedin.com/in/patrykbinkowski/
Copywriting by Kinga Kuśnierz, Content Writer at Altimetrik Poland