Working with datasets that are larger than the entire university
The LOFAR radio telescope maps the sky. It produces incredibly detailed images of the universe - and vast amounts of data. Huub Röttgering, director of the Leiden Observatory, talks about the challenges of working with those enormous datasets.
What exactly is your research about?
‘I study very detailed images of the universe, to learn more about the evolution of black holes, among other things. When and how did the first black holes originate? And how many are there? I also study how galaxies grow. Our own galaxy forms about one star per year, but there are galaxies that produce more than one hundred new stars each year. I am trying to figure out when and how that happens.’
Where do these images of the universe come from?
‘To make these, we use LOFAR, an enormous radio telescope that has thousands of antennas in the Netherlands and the rest of Europe. The telescope captures radio waves, which it uses to make images of small pieces of the universe. These images have a very high resolution, so we can zoom in very far.’
There must be a lot of data involved in that.
‘Yes, a whole lot. With LOFAR we have collected about 30 petabytes of data so far. By comparison: all the data of Leiden University amounts to about 5 petabytes. It’s a huge amount, which causes many problems. Just moving the data from the LOFAR stations to the university is already a challenge - the internet is simply not fast enough.’
How do you solve that problem?
‘Part of the data processing is done as close as possible to the LOFAR stations. There we filter out the noise - signals from TVs, mobile phones and aircraft, for instance. We also average the data from different observations and transmit only the average. By doing so, we reduce the amount of data that has to be sent to the university.’
And still, very powerful computers are needed to process all this data, I imagine.
‘Yes, an ordinary computer will not do the job. In fact, we need so much computing power that the data has to be processed in several major computer centres: here in the Netherlands, but also in Germany and England, for example. We collaborate with computer scientists to guide this process. Together with Data Science professor Aske Plaat I have a joint PhD student - Alex Mechev - who is building a system that coordinates data processing in all these different places. As a result, the software that astronomers have developed now runs smoothly on different computers; a big step forward.’
What is the next step?
‘It would be nice if we could eventually get a high-performance computing centre at this university, with a lot more computing power than the machines that we currently have. This would not only be beneficial for processing LOFAR data, but also for other research inside and outside astronomy. For instance, some of my colleagues make simulations of pieces of the universe, for which they sometimes use the largest computers in the world. But you cannot go from a desktop to a supercomputer in one go; there must be a step in between. If you want to join the big boys on this planet, you must have local machines on which you can practice first.’
Do astronomers always run into the limits of computing power?
‘Yes, it always goes hand in hand. Computers are getting bigger and better, so the experiments we do also get bigger and better. We want more and more, but there are always limits to what is possible. LOFAR is a good example: we have the antennas scan small pieces of the sky, one at a time. Over a ten-year period, we add all those pieces together in order to get a complete map of the northern sky. If we had enough computing power, we could connect all the antennas, to map the whole sky at once. But then we would suddenly have a billion times more data, which would be impossible to process.’
Recently, a consortium involving Leiden researchers was awarded a large grant for developing quantum computing. Will this be helpful to astronomers too?
‘Eventually, yes. If quantum computing works out, we can take many new steps in how we think about handling these amounts of data. And who knows, we may be able to process data much faster, better and deeper in the future. But there is still a long way to go.’
Have you always been fascinated by astronomy?
‘I always found it interesting, but during my studies there were many other things that I was also interested in. I got really fascinated when I started doing my own research. I find it amazing to explore the universe, using all kinds of telescopes. And now, there’s so much going on here at the Observatory: in the fields of exoplanets, dark matter and dark energy. We have 43 nationalities here; it’s very inspiring to work with all these people, studying the universe together.’
(JvdB)
Huub Röttgering obtained his PhD in Astronomy at Leiden University. After a period as a Postdoctoral Fellow at Cambridge University, he returned to Leiden, where he is currently professor of Observational Cosmology and director of the Leiden Observatory. As Principal Investigator of the project Development and Commissioning of LOFAR for astronomy, and as PI of the LOFAR surveys, Röttgering plays a leading role in the development of LOFAR. In addition, he is involved in the development of optical and infrared interferometers.
This article is part of a series of interviews with researchers from the Leiden Centre of Data Science (LCDS). LCDS is a network of researchers from different scientific disciplines, who use innovative methods to deal with large amounts of data. Collaboration between these researchers leads to new solutions to problems in science and society.