Information & Data Science Pilot Projects

In personalized medicine, climate modeling, the development of self-learning robots, and many other scientific fields, the need to deal with huge amounts of data is rapidly increasing. Helmholtz has therefore specifically strengthened its expertise in the field of Information & Data Science. In 2017, five highly innovative research projects received a total of 17 million euros in funding for a period of three years.


The Helmholtz Analytics Framework pilot project will strengthen the development of data sciences in the Helmholtz Association. The project pursues the systematic development of domain-specific data analysis tools using a co-design approach involving scientists from the respective domain and information experts. Above all, the exchange of methods between individual scientific fields is intended to lead to generalization and standardization. In challenging areas of application such as Earth system modeling, structural biology, the aerospace sector, medical imaging, and neuroscience, the potential thus emerges for scientific breakthroughs and new knowledge. The Helmholtz Analytics Framework collaborates closely with the already-established Helmholtz Data Federation (HDF).

Further information and contacts: Prof. Dr. Dr. Thomas Lippert (FZ-Jülich), th.lippert(at), Prof. Dr. Achim Streit (KIT), achim.streit(at), Björn Hagemeier (FZ-Jülich), b.hagemeier(at), Daniel Mallmann (FZ-Jülich), d.mallmann(at)


The Sparse2Big pilot project provides the methodological and technical foundations for handling big data. The aim is to create usable big data from sparsely observed, large data sets by means of imputation (completion) and robust modeling of the observation processes. The project initially focuses on data sets from single-cell genomics, which uses modern genome sequencing techniques to analyze individual cells. This provides researchers with a “molecular microscope” with various fields of application, such as developmental biology, cancer diagnosis, and stem cell therapy. The innovative techniques of the Sparse2Big project will make a substantial contribution towards improving observations in single-cell genomics and thus biomedical research. Building on this, the transfer of these methods to other research fields is already being prepared.


Further information and contacts: Prof. Dr. Dr. Fabian Theis (HMGU), fabian.theis(at), Prof. Dr. med. Joachim L. Schultze (DZNE), joachim.schultze(at)


Complex mathematical computer models are used in the fields of climate and environmental research, health research, energy research, and in the development of robots. The avalanche of data from these models presents researchers and computer centers with enormous challenges that require new, intelligent methods. This is where the Reduced Complexity Models pilot project can help, by developing reduced complexity models based on new methods from the field of computer science. The project focuses on the quantification of the uncertainties of simulations, the development of simplified models, and the identification of key dependencies. The aim is to increase the stability of models and the reliability of simulations for numerous applications. These new concepts and methods are developed and tested using challenging concrete examples. The project will advance the development of interoperable and reusable technologies, thereby contributing to solving future problems more quickly.

Further information and contact: Prof. Dr. Corinna Schrum (HZG), corinna.schrum(at)


The Automated Scientific Discovery pilot project will create entirely new technologies in order to automatically establish relationships between large quantities of complex scientific data. To this end, the Helmholtz researchers use highly innovative and reliable methods of artificial intelligence. Initially, the project partners will use these methods to further develop knowledge of nuclear fusion processes and Earth observation. Later on, the scientists will combine basic research, applied research, and the development of generic methods. The development of generic methods will be strengthened through close collaboration with the Helmholtz Analytics Framework pilot project.


Further information and contact: Dr. Jakob Svensson (IPP), jakob.svensson(at)


Imaging processes provide an essential source of information in almost all research fields. The Imaging at the Limit pilot project, which will initially receive start-up funding, focuses on those aspects of image reconstruction that translate measurements into actual images. Skillfully exploiting the information content of the measurement data makes it possible, among other things, to improve the efficiency of the reconstruction and therefore the utilization of visualized data. The project aims to improve existing imaging processes, and thereby advance the related research fields, in an innovative way.


Further information and contacts: PD Dr. Wolfgang zu Castell (HMGU), castell(at), Prof. Dr. Christian Schroer (DESY), christian.schroer(at)

Large, complex and high-dimensional data are now ubiquitous in essentially all areas of science and society. Machine learning and AI methods are already highly effective at exploiting such data for the purpose of prediction.

The SIMCARD project will develop novel machine learning tools that are robust and reliable and that can go beyond prediction to provide a deeper scientific understanding. In particular, the focus is on new methods for large-scale network modeling and reliable prediction. Designing scalable, principled, and interpretable data science approaches is key to providing answers to pressing problems in various application areas. Specifically, the project addresses the fields of data-intensive biomedicine and weather prediction.

Further information and contacts: Melanie Schienle, Melanie.Schienle(at), Sach Mukherjee, sach.mukherjee(at)
