Summit supercluster

Summit, the newest leadership-class system at the US Department of Energy’s Oak Ridge Leadership Computing Facility, is one of the fastest and most capable supercomputers in the world. Yet the SKA will routinely need even more computing power than this.

Image credit: Oak Ridge National Laboratory (ORNL)

The Data Giant: Where is the relevant information?

SKA collects radio signals from the depths of the universe for scientists around the world to process and analyze. The resulting data volumes are truly gigantic: SKA will generate as much data in one day as the entire Internet produces in a year today. The signals are picked up by some 250,000 radio antennas, combined on site, pre-processed, and forwarded to a central supercomputer at each of the two sites, in South Africa and Australia. These computers perform the initial analysis of the data and extract the content that is of scientific interest. Even today’s most advanced systems reach their limits at this step, so SKA is driving the development of suitable computers.

Afterwards, these information-dense data sets are sent to regional data centers around the world, whose key role is to process the data so that it is ready for actual astrophysical research. Out of this vast amount of data, only a fraction yields fundamentally new scientific insights. In filtering out the relevant information, the development of artificial intelligence plays a central role.

What we learn in SKA data management will lay the foundations for a new era of computing: technologies that learn logical reasoning. These methods will find applications in completely different fields, such as health care or the financial markets. It is precisely this aspect of data management that makes SKA so important to the discipline, to all of science, and indeed to humanity itself.