
5 Questions for ... Andreas Zeller

Protecting any software in the world against cyber attacks

Andreas Zeller is a professor of software engineering at Saarland University and a faculty member at CISPA - Helmholtz Center for Information Security. Credit: CISPA

Andreas Zeller's vision is to build a robot that can test any software system, fix its bugs and make it more secure. For his outstanding research, he has received an ERC Advanced Grant worth 2.5 million euros for the second time, something that hardly any scientist manages. In this interview, he tells us how he did it.

Mr. Zeller, you are one of the very few researchers to have received Europe's highest research funding for the second time: the ERC Advanced Grant worth 2.5 million euros. How did you manage that?

With a lot of preparation, patience and strong nerves. To begin with, there was a big catch: I had already received an ERC Advanced Grant in 2011. When you apply for the first time, the committee sees you as an ambitious researcher who deserves the funding. But if you want to win this award a second time, you come across as cheeky. You have lost the goodwill of the panel members who decide on the ERC grants, so to speak.

For the second application, as I saw it, you have to keep two things in mind: You need a really good idea that convinces immediately, and you can't afford to show any weakness. At least that's how I imagined it. Whether that's really the case, I don't know. According to the guidelines, the panel has to decide independently and may not take into account the researcher's background. For me, it was clear that I had to leave a completely flawless impression on every level. That was exhausting.

I must have had 30 people proofread my application. When writing it, I had to remember to address both the experts in my field and the panel members from outside my field. Together they have to decide on funding of two and a half million euros, so they also want to understand what it's all about. That was a challenge. I received further feedback from the Federal Ministry of Education and Research, which has set up the "National Contact Point European Research Council" group. This official body looks at applications in advance and evaluates them.

After I sent the application, I received an invitation from the European Research Council (ERC) for a Zoom interview. I had ten minutes to present the application to the panel. This was followed by a question-and-answer session lasting 20 minutes. I rehearsed the presentation and the question-and-answer session many times, almost to excess. Whenever a colleague came to my office, it was, "Do you have 10 minutes?" Then the Helmholtz Center arranged a rehearsal session with two former ERC panel members, colleagues of mine, who listened to my talk and took me apart using every trick in the book. Sure, they are instructed to ask questions that are as critical as possible. But I was able to counter them well. I also took part in an event organized by the Federal Ministry of Education and Research. There were four colleagues in the group who had also been invited for interviews. We gave our presentations to each other. A professional coach was there and gave important tips, such as how long my sentences should be.

Obviously, you did everything right. Can you explain what your submitted project is about?

Let's say you want to know whether a piece of software works. To find out, it needs to be tested, and today that is largely done by people, manually. The catch is that by hand you can check whether it works in the standard case, but not whether it works in special cases. On top of that, every time you change the software, you have to test it again. Checking all this is a lot of work, costly and often not done. As soon as you connect such software to the Internet and make it publicly available, it is not secure. It is possible, for example, that someone will manage to extract information from the database.

My proposal, which I set out in my application, is this: I want to completely automate the testing of software by building a robot that is capable of generating all possible variations of input. It should also check the software while it is running. And if the software does something wrong, I want the robot to be able to automatically diagnose why. These three kinds of tools, that is, one for testing, one for checking on the fly, and one for debugging, are programs that run alongside the software on the computer.

The basis for such tools is a specification language called S3 that my team and I are developing. From a computer science perspective, a language is a description of what correct program input looks like. Developers can use it to specify grammars and spelling rules. We want to be able to learn these very rules, which express what valid inputs and outputs are for each program. In turn, we want to use them to automatically check the program against all possible inputs and outputs. In this way, we can test a huge range of software for functionality much more comprehensively than has ever been the case, over and over again, and if possible before the bad guys out there start "testing" the software in their own way.
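To make the idea of a grammar as an input specification concrete, here is a minimal sketch in Python. It is not S3, whose syntax is not described in this interview; the grammar `GRAMMAR`, the function `generate` and the arithmetic-expression example are all assumptions chosen purely for illustration. The sketch shows how a grammar can drive the automatic production of many syntactically valid test inputs.

```python
import random

# Hypothetical toy grammar for arithmetic expressions (NOT S3 syntax).
# Each nonterminal maps to a list of alternatives; each alternative is a
# sequence of nonterminals and literal strings. The last alternative of
# every rule is the one that leads toward termination.
GRAMMAR = {
    "<expr>": [["<term>", "+", "<expr>"], ["<term>", "-", "<expr>"], ["<term>"]],
    "<term>": [["<factor>", "*", "<term>"], ["<factor>"]],
    "<factor>": [["(", "<expr>", ")"], ["<number>"]],
    "<number>": [["<digit>", "<number>"], ["<digit>"]],
    "<digit>": [[d] for d in "0123456789"],
}


def generate(symbol, rng, depth=0, max_depth=8):
    """Expand `symbol` into a random string that the grammar accepts."""
    if symbol not in GRAMMAR:
        return symbol  # literal text, emitted unchanged
    alternatives = GRAMMAR[symbol]
    if depth >= max_depth:
        # Past the depth limit, force the terminating alternative
        # so that the expansion is guaranteed to finish.
        alternatives = alternatives[-1:]
    expansion = rng.choice(alternatives)
    return "".join(generate(s, rng, depth + 1, max_depth) for s in expansion)


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(5):
        print(generate("<expr>", rng))
```

Each generated string could then be fed to the program under test. A real tool would also need the output rules the interview mentions in order to decide whether the program's response is correct.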

Besides security, are there any other issues you can cover with your idea?

The very big issue is the reliability of software and how to systematically check whether software is doing the right thing or not. With our method, we can for the first time show a way to build software in the future that is checked fully automatically and very comprehensively - not 100% perfectly, but 99%. That alone is a great gain. Comprehensive testing provides greater reliability and, indirectly, greater security because it protects the software against attacks. A side effect of our work is that we also find out under which conditions a piece of software does the wrong thing. We can analyze which parts of a program input are responsible for errors and provide programmers and users with hints on how to avoid them. We can even catch inputs early on that we predict will lead to problems. In the longer term, this approach will lead to corrections and improvements in software.
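One way to picture how a tool might find the parts of an input that are responsible for an error is a simple greedy reduction: keep deleting pieces of a failing input as long as the failure persists. The sketch below is an illustration only and not necessarily the technique used in the project; the function `reduce_input`, the predicate `fails` and the pretend bug (a crash on two consecutive opening parentheses) are all hypothetical.

```python
def reduce_input(failing_input, fails):
    """Greedily delete chunks of `failing_input` while `fails` stays true.

    `fails` is a caller-supplied predicate (an assumption of this sketch)
    that returns True when the program under test misbehaves on a string.
    """
    assert fails(failing_input), "the starting input must trigger the failure"
    current = failing_input
    chunk = max(len(current) // 2, 1)
    while chunk >= 1:
        i = 0
        while i < len(current):
            candidate = current[:i] + current[i + chunk:]
            if fails(candidate):
                current = candidate  # the chunk was irrelevant: drop it
            else:
                i += chunk  # the chunk matters: keep it and move on
        chunk //= 2  # retry with finer-grained deletions
    return current


if __name__ == "__main__":
    # Hypothetical bug: the program crashes on two consecutive "(".
    crashes = lambda s: "((" in s
    print(reduce_input("1+((2*3)-4)", crashes))  # prints "(("
```

The reduced input points directly at the characters that matter, which is the kind of hint a programmer can act on.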

Do you have any idea where S3 might be used first?

We will use S3 for a variety of open source software because it is particularly vulnerable to attacks from the Internet. But my team and I also want to get into critical infrastructure and put it through its paces. One big area is public administration. A lot of sensitive data is exchanged there, and to our knowledge, these systems have not been tested to the extent that they should be. Most of this administrative software operates in protected networks that are not directly accessible from the outside. But the fact is that as digitization continues, more and more of these systems and their services are exposed to the Internet. That is a major security risk. We would test this existing software and improve it where possible.

If you could look into the future: What would be the best-case scenario for your research?

For my team and me, I would like as many companies as possible to take up our work in order to improve their own software, and also to build even smarter tools for testing, debugging and checking software. Imagine what would happen if a company like Microsoft or SAP adopted our techniques and incorporated them into their products. Then one day I could tell my mother: Look, my algorithm runs on hundreds of millions of computers every day. Then she'll be proud.
