Foundation Models
Using AI to Measure the Biological Carbon Pump

Microcosm in the ocean: images of plankton taken with an underwater camera in the North Atlantic. Picture: Hereon/Klas Ove Möller
Tiny plankton organisms constantly transport carbon into the deep sea, thereby stabilizing the Earth's climate. Through the AqQua project, Helmholtz researchers aim to develop an AI system to monitor plankton biodiversity and its impact on the global climate.
They may be tiny, but there are many of them. Single-celled microalgae in the oceans have a significant impact on Earth's living conditions due to their abundance. Like land plants, they perform photosynthesis, building up biomass using sunlight and carbon dioxide. Oxygen is released as a byproduct. At least one out of every two oxygen molecules we breathe comes from the production of this plant plankton, or phytoplankton. These tiny organisms also form the basis of the marine food chain. Small animals (zooplankton) eat them, and then fish eat the zooplankton. Fish become prey for sharks, seals, whales, and humans. Without microalgae, our plates at the fish restaurant would be empty.
But perhaps their most important function concerns the climate. The carbon dioxide (CO2) that these tiny plants bind in the light-flooded surface layer of the ocean comes largely from the atmosphere. When phytoplankton dies, most of it is broken down by microbes and the CO2 is released again. However, some of it clumps together with animal excrement and other particles to form centimeter-sized aggregates that then sink to the depths of the ocean. Each flake of this continuously falling “marine snow” takes a small portion of atmospheric carbon dioxide with it to the deep sea. “Some of these flakes are permanently stored in marine sediments,” says Dagmar Kainmüller. “Every year, more than six gigatons of carbon are removed from the atmosphere for many hundreds of years. This is roughly equivalent to the amount released worldwide each year through the combustion of fossil fuels. If this biological carbon pump did not exist, the CO2 content of the atmosphere would be significantly higher – and the climate would be even warmer.”
Dagmar Kainmüller holds a doctorate in computer science, is a professor at the Hasso Plattner Institute in Potsdam, and heads the Integrative Imaging Data Sciences working group at the Max Delbrück Center for Molecular Medicine in the Helmholtz Association. Together with marine researchers and plankton specialists Rainer Kiko (GEOMAR Helmholtz Centre for Ocean Research Kiel) and Klas Ove Möller (Helmholtz Centre Hereon Geesthacht) as well as big data expert Timo Dickscheid (Jülich Research Centre), she leads the research project AqQua (“The Aquatic Life Foundation Project: Quantifying Life at Scale in a Changing World”). “The order of magnitude of the amount of carbon pumped biologically is known, but the exact figure is not,” explains Dagmar Kainmüller. “There is also no comprehensive global monitoring. How much plant and animal plankton is there? What species does it consist of in which ocean regions? And how much is the pump weakening as a result of global warming? With AqQua, we want to find the answers.”
Dagmar Kainmüller. Picture: Pablo Castagnola/MDC
AqQua's database consists of billions of plankton images generated by researchers worldwide using various methods. These methods include pouring seawater samples into traditional scanners, using permanently moored underwater cameras, and employing mobile underwater vehicles to take pictures of plankton in surrounding waters. Millions of photos are added daily. In principle, this allows us to monitor plankton globally, even in the deep sea. In principle. However, only a small proportion of the photos are "labeled," meaning they have been checked by a human and provided with information about the plankton species that can be identified. "The big challenge is to make this vast amount of images comparable and categorize them automatically," says Dagmar Kainmüller. "This task is virtually predestined for a foundation model."
Foundation models are currently revolutionizing science. Using artificial intelligence, these models can process enormous amounts of data, link them together, and recognize connections or patterns. As part of the Helmholtz Foundation Model Initiative (HFMI), the Helmholtz Association is funding several projects in this field, including AqQua. “Helmholtz is the perfect place for our project,” the computer scientist explains. “The centers have exactly the expertise we need: marine research, imaging techniques, big data, artificial intelligence, and of course, the necessary computing power.” AqQua is supported by six Helmholtz centers (the Max Delbrück Center, GEOMAR, the Helmholtz Centre Hereon, the Jülich Research Centre, the AWI, and the UFZ) and the two cross-center data science platforms, Helmholtz Imaging and Helmholtz AI. Around 40 renowned institutions worldwide support the project. “The broad international support shows how urgently science needs a solution,” says the researcher.
In the first step of the project, researchers are compiling approximately three billion images from various ocean regions. These images are provided by four Helmholtz Centers and international partners. So far, just over two billion images have been collected. The foundation model will then be trained to correctly identify individual plankton species, determine their condition (e.g., size and stage of development), and calculate their carbon content. Through this training, the self-learning system continuously improves. "Once the model is functioning reliably, we will make it publicly available as an easy-to-use online tool. We will share it with the global community through our open-source approach and work together to continuously improve it," explains Dagmar Kainmüller. "To map the flow of carbon and the pumping capacity into the deep sea as accurately as possible, we also plan to link image data with satellite data."
Earth observation systems, such as NASA's PACE (Plankton, Aerosol, Cloud, ocean Ecosystem) satellite, provide high-resolution images of the oceans over time. From these images, phytoplankton concentrations near the water surface can be derived. However, the data does not extend to greater depths, nor does it contain information on the exact species composition. "With the help of our foundation model and image analyses from a variety of depths and ocean regions, we can extrapolate surface data into three dimensions," says the computer scientist. "We can then make regional and global statements about the biological carbon pump and changes in pump performance, as well as the species composition, condition, and biomass of phytoplankton. This paves the way for more precise climate models, global phytoplankton monitoring, and policy decisions based on this information."
Link to the project website: www.aqqua.life
Helmholtz Foundation Model Initiative
Foundation models are a new generation of AI models that have a broad knowledge base and are therefore able to solve a range of complex problems. They are significantly more powerful and flexible than conventional AI models and therefore hold enormous potential for modern, data-driven science. They can become powerful tools that answer a wide range of research questions. The Helmholtz Association is ideally placed to develop such pioneering applications: it has a wealth of data, powerful supercomputers on which the models can be trained, and in-depth expertise in the field of artificial intelligence. Our goal is to develop foundation models across a broad spectrum of research fields that contribute to solving the big questions of our time.