Existing machine learning (ML) models typically rely on correlation, not causation. This can lead to errors, bias, and ultimately suboptimal performance. To address this, we aim to develop novel ways to integrate causality into ML models. In the CausalNet project, we advance causal ML toward flexibility, efficiency, and robustness: (1) Flexibility: We develop a general-purpose causal ML model for high-dimensional, time-series, and multi-modal data. (2) Efficiency: We develop efficient learning algorithms (e.g., synthetic pre-training, transfer learning, and few-shot learning) carefully tailored to causal ML. (3) Robustness: We create new environments/datasets for benchmarking. We also develop new techniques for verifying and improving the robustness of causal ML. (4) Open source: We fill white spots in the causal ML toolchain to improve industry uptake. (5) Real-world applications: We demonstrate performance gains through causal ML in business, public policy, and bioinformatics for scientific discovery.
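The difference between correlational and causal estimates can be made concrete with a toy structural causal model. In the following minimal sketch (all variables and coefficients are invented for illustration), a confounder biases a purely correlation-based regression, while simulating an intervention, do(X = x), recovers the true effect:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Toy structural causal model (all coefficients invented):
# a confounder Z drives both X and Y; the true causal effect of X on Y is 1.0.
Z = rng.normal(size=n)
X = Z + rng.normal(size=n)
Y = 1.0 * X + 2.0 * Z + rng.normal(size=n)

# Correlational estimate: least-squares slope of Y on X alone (confounded).
beta_corr = np.cov(X, Y)[0, 1] / np.var(X)

# Interventional estimate: simulate do(X = x), which cuts the Z -> X edge.
X_do = rng.normal(size=n)
Y_do = 1.0 * X_do + 2.0 * Z + rng.normal(size=n)
beta_do = np.cov(X_do, Y_do)[0, 1] / np.var(X_do)

print(round(beta_corr, 1), round(beta_do, 1))  # ~2.0 (biased) vs ~1.0 (causal)
```

A model trained only on observational data learns the confounded slope; the goal of causal ML is, loosely speaking, to recover the interventional one without having to run the intervention.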
The ongoing warming of the Earth's climate due to man-made climate change is fundamentally changing our weather. Traditionally, weather forecasts have been based on numerical models, so-called Numerical Weather Prediction (NWP) models. Data-driven machine learning models, in particular deep neural networks, offer potential as surrogate models for fast and energy-efficient emulation of NWP models. As part of the SmartWeather21 project, we want to investigate which deep learning (DL) architecture for NWP is best suited for weather forecasting in a warmer climate, based on the high-resolution climate projections generated with ICON as part of WarmWorld. To handle the high resolutions of the WarmWorld climate projections, we will develop data- and model-parallel approaches and architectures for weather forecasting in a warmer climate. Furthermore, we will investigate which (learnable combinations of) variables from the ICON climate projections provide the best, physically plausible forecast accuracy for weather prediction in a warmer climate. With this in mind, we develop dimension-reduction techniques for the various input variables that learn a latent, lower-dimensional representation optimized for the accuracy of the downstream weather forecast. The increased spatial resolution of the ICON simulations also allows conclusions to be drawn about the uncertainties of individual input and output variables at lower resolutions. As part of SmartWeather21, we will develop methods that parameterise these uncertainties using probability distributions and use them as input variables with lower spatial resolution in DL-based weather models. These distributions can then be propagated through the model as part of probabilistic forecasts.
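The idea of learning a dimension reduction from the downstream task can be illustrated with a minimal NumPy sketch (synthetic data, hypothetical sizes): a linear encoder is trained jointly with a forecast head on the forecast error alone, rather than on reconstruction error.

```python
import numpy as np

rng = np.random.default_rng(1)
d, k, n = 16, 2, 2000   # input variables, latent size, samples (hypothetical)

# Synthetic stand-in for high-dimensional input fields: the forecast target
# depends on only two hidden directions of the 16-dimensional input.
A = rng.normal(size=(d, 2))
H = rng.normal(size=(n, 2))
X = H @ A.T + 0.1 * rng.normal(size=(n, d))
y = H[:, 0] - 0.5 * H[:, 1]

# Jointly train a linear encoder (the dimension reduction) and a forecast
# head by gradient descent on the downstream forecast error: the latent
# representation is shaped by forecast accuracy, not by reconstruction.
W_enc = 0.01 * rng.normal(size=(d, k))
w_fc = 0.01 * rng.normal(size=k)
lr = 0.005
for _ in range(4000):
    Z = X @ W_enc                # latent, lower-dimensional representation
    err = Z @ w_fc - y           # downstream forecast error
    w_fc -= lr * Z.T @ err / n
    W_enc -= lr * X.T @ np.outer(err, w_fc) / n

mse = np.mean((X @ W_enc @ w_fc - y) ** 2)
```

In the project, the encoder and forecast model are of course nonlinear and far larger, but the training signal follows the same pattern: the upstream reduction is judged by the downstream forecast.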
Solar tower power plants play a key role in facilitating the ongoing energy transition, as they deliver dispatchable, climate-neutral electricity and direct heat for chemical processes. In this work, we develop a heliostat-specific differentiable ray tracer capable of modeling the energy transport at the solar tower in a data-driven manner. This enables heliostat surface reconstruction and thus drastically improves the irradiance prediction. Additionally, such a ray tracer drastically reduces the amount of data required for alignment calibration. In principle, this makes learning for a fully AI-operated solar tower feasible. The desired goal is to develop a holistic AI-enhanced digital twin of the solar power plant for design, control, prediction, and diagnosis, based on the physical differentiable ray tracer. Any operational parameter in the solar field influencing the energy transport may be optimized with it. For the first time, gradient-based field design, aim point control, and current-state diagnosis are possible. By extending it with AI-based optimization techniques and reinforcement learning algorithms, it should be possible to map real, dynamic environmental conditions to the twin with low latency. Finally, due to the full differentiability, visual explanations for the operational action predictions are possible. The proposed AI-enhanced digital twin environment will be verified at a real power plant in Jülich. Its inception marks a significant step towards a fully automatic solar tower power plant.
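Gradient-based aim point optimization through a differentiable model can be sketched with a toy flux model. Here Gaussian spots stand in for the ray-traced flux of individual heliostats, and the aim points are adjusted by gradient descent so that the combined flux matches a target distribution on the receiver (all parameters are invented; a real differentiable ray tracer backpropagates through the full ray transport instead):

```python
import numpy as np

rng = np.random.default_rng(2)
grid = np.linspace(-1.0, 1.0, 41)
XX, YY = np.meshgrid(grid, grid)
target = np.where(np.hypot(XX, YY) < 0.6, 1.0, 0.0)  # desired flux map

sigma = 0.3                                   # spot width (toy value)
aims = rng.uniform(-0.8, 0.8, size=(8, 2))    # initial aim points, 8 heliostats

def spot(a):
    # Gaussian flux spot of one heliostat, centred at its aim point.
    return np.exp(-((XX - a[0])**2 + (YY - a[1])**2) / (2 * sigma**2))

def loss(aims):
    return np.mean((sum(spot(a) for a in aims) - target) ** 2)

loss_init = loss(aims)
lr = 0.05
for _ in range(200):
    r = sum(spot(a) for a in aims) - target   # residual flux image
    for i in range(len(aims)):
        s = spot(aims[i])
        # Analytic gradient (up to a constant factor) of the squared error
        # with respect to this heliostat's aim point.
        gx = np.mean(r * s * (XX - aims[i, 0])) / sigma**2
        gy = np.mean(r * s * (YY - aims[i, 1])) / sigma**2
        aims[i] -= lr * np.array([gx, gy])
loss_final = loss(aims)
```

The same descent loop applies to any differentiable operational parameter, which is what makes field design, aim point control, and state diagnosis tractable with gradients.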
The overall goal of this project is to improve the radiological diagnosis of human prostate cancer in clinical magnetic resonance imaging (MRI) by AI-based exploitation of information from higher-resolution modalities. We will use the brilliance of HiP-CT imaging at beamline 18 and an extended histopathology of the entire prostate to optimise the interpretation of MRI images in the context of a research prototype. In parallel, the correlation of the image data with the molecular properties of the tumours is planned for a better understanding of invasive tumour structures. An interactive multi-scale visualisation across all modalities forms the basis for vividly conveying the immense amounts of data. As a result, at the end of the three-year project phase, conventional radiological MRI is to be transferred into a diagnostic standard that, drawing on innovative AI algorithms, also reliably recognises patients with invasive prostate tumours who have often been incorrectly diagnosed to date. In the medium term, a substantial improvement in the care of patients with advanced prostate carcinoma can therefore be expected. In addition, we will make the unique multimodal data set created in the project, including visualisation tools, available as open data to enable further studies to better understand prostate cancer, which could potentially lead to novel diagnostic and therapeutic approaches.
“ICON-SmART” addresses the role of aerosols and atmospheric chemistry in the simulation of seasonal to decadal climate variability and change. To this end, the project will enhance the capabilities of the coupled composition, weather and climate modelling system ICON-ART (the icosahedral nonhydrostatic model ICON, developed by DWD, MPI-M and DKRZ, with the atmospheric composition module ART for aerosols and reactive trace gases, developed by KIT) for seasonal to decadal predictions and climate projections in seamless global-to-regional model configurations with ICON-Seamless-ART (ICON-SmART). Based on previous work, chemistry is a promising candidate for speed-up by machine learning. In addition, the project will explore machine learning approaches for other processes. The ICON-SmART model system will provide scientists, forecasters and policy-makers with a novel tool to investigate atmospheric composition in a changing climate and allow them to answer questions that were previously out of reach.
This project uses artificial neural networks in an inverse design problem of finding nano-structured materials with optical properties on demand. Achieving this goal requires generating large amounts of data from 3D simulations of Maxwell's equations, which makes this a data-intensive computing problem. Tailored algorithms are being developed that address both the learning process and the efficient inversion. The project complements research in the SDL Materials Science on AI methods, large data sets generated by simulations, and workflows.
The ASSAS project aims at developing a proof-of-concept severe accident (SA) simulator based on ASTEC (Accident Source Term Evaluation Code). The prototype basic-principle simulator will model a simplified generic Western-type pressurized light water reactor (PWR). It will have a graphical user interface to control the simulation and visualize the results. It will run in real time, and even much faster for some phases of the accident. The prototype will be able to show the main phenomena occurring during an SA, including in-vessel and ex-vessel phases. It is meant to train students, nuclear energy professionals and non-specialists. In addition to its direct use, the prototype will demonstrate the feasibility of developing different types of fast-running SA simulators while keeping the accuracy of the underlying physical models. Thus, different computational solutions will be explored in parallel. Code optimisation and parallelisation will be implemented. Besides these reliable techniques, different machine-learning methods will be tested to develop fast surrogate models. This alternative path is riskier, but it could drastically enhance the performance of the code. A comprehensive review of ASTEC's structure and available algorithms will be performed to define the most relevant modelling strategies, which may include the replacement of specific calculation steps, entire modules of ASTEC, or more global surrogate models. Solutions will be explored to extend the models developed for the PWR simulator to other reactor types and SA codes. The training database of SA sequences used for machine learning will be made openly available. Developing an enhanced version of ASTEC and interfacing it with a commercial simulation environment will make it possible for the industry to develop engineering and full-scale simulators in the future. These can be used to design SA management guidelines, to develop new safety systems, and to train operators to use them.
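The surrogate-model path can be illustrated with a minimal sketch: a cheap regression model is trained on a precomputed database of input/output samples and then replaces an expensive calculation step. The physics routine below is an invented stand-in, not actual ASTEC code:

```python
import numpy as np

rng = np.random.default_rng(4)

def expensive_step(p, q):
    # Stand-in for one expensive physics calculation step (not ASTEC code).
    return np.exp(-p) * np.cos(q) + 0.5 * p * q

# Precomputed training database of simulated input/output samples.
P, Q = rng.uniform(0, 1, 500), rng.uniform(0, 1, 500)
Y = expensive_step(P, Q)

# Cubic polynomial ridge regression as a fast surrogate model.
def features(p, q):
    return np.stack([np.ones_like(p), p, q, p*q, p**2, q**2,
                     p**2*q, p*q**2, p**3, q**3], axis=1)

F = features(P, Q)
coef = np.linalg.solve(F.T @ F + 1e-8 * np.eye(F.shape[1]), F.T @ Y)

# The surrogate reduces each call to a single dot product; check its
# accuracy on held-out inputs against the original routine.
p_t, q_t = rng.uniform(0, 1, 200), rng.uniform(0, 1, 200)
max_err = np.max(np.abs(features(p_t, q_t) @ coef - expensive_step(p_t, q_t)))
```

The project-level question is the same at every scale, whether the surrogate replaces a single calculation step, a whole module, or the global model: is the surrogate's error acceptable over the range of accident sequences in the training database?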
The EQUIPE project deals with the quantification of uncertainties in large transformer models for time series prediction. Although the transformer architecture is able to achieve astonishingly high prediction accuracy, it requires immense amounts of computational resources. Common approaches to error estimation in neural networks are equally computationally intensive, which currently makes their use in transformers considerably more difficult. The research work within EQUIPE aims to solve these problems and to develop scalable algorithms for quantifying uncertainties in large neural networks, which will enable the methods to be used in real-time systems in the future.
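A common, comparatively scalable baseline for uncertainty quantification, and a useful reference point for the methods targeted by EQUIPE, is an ensemble whose prediction spread estimates the model uncertainty. A minimal sketch on a toy regression task (plain NumPy instead of a transformer, all settings invented):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
x = np.sort(rng.uniform(-3, 3, n))
y = np.sin(x) + 0.2 * rng.normal(size=n)   # noisy toy signal

# Train an ensemble of small regressors on bootstrap resamples; the spread
# of their predictions is a cheap estimate of predictive uncertainty.
members, degree = 20, 5
X = np.vander(x, degree + 1)               # polynomial feature matrix
preds = []
for _ in range(members):
    idx = rng.integers(0, n, n)            # bootstrap resample
    coef, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
    preds.append(X @ coef)
preds = np.array(preds)

mean = preds.mean(axis=0)   # ensemble prediction
std = preds.std(axis=0)     # per-point uncertainty estimate
```

For large transformers, training even a handful of ensemble members multiplies the already immense compute cost, which is precisely the scalability problem EQUIPE addresses.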
iMagine is an EU-funded project that provides a portfolio of ‘free at the point of use’ image datasets, high-performance image analysis tools empowered with Artificial Intelligence (AI), and Best Practice documents for scientific image analysis. These services and materials enable better and more efficient processing and analysis of imaging data in marine and freshwater research, relevant to the overarching theme of ‘Healthy oceans, seas, coastal and inland waters’.
AI4EOSC (Artificial Intelligence for the European Open Science Cloud) is an EU-funded project that delivers an enhanced set of advanced services for the development of AI/ML/DL models and applications in the European Open Science Cloud (EOSC). These services are bundled together into a comprehensive platform providing advanced features such as distributed, federated and split learning; novel provenance metadata for AI/ML/DL models; and event-driven data processing services. The project builds on top of the DEEP-Hybrid-DataCloud outcomes and the EOSC compute platform.
The Helmholtz AI COmpute REsources (HAICORE) infrastructure project was launched in early 2020 as part of the Helmholtz Incubator "Information & Data Science" to provide high-performance computing resources for artificial intelligence (AI) researchers in the Helmholtz Association. Technically, the AI hardware is operated as part of the high-performance computing systems JUWELS (Jülich Supercomputing Centre) and HoreKa (KIT) at the two centers. The SCC primarily covers prototypical development operations in which new approaches, models and methods can be developed and tested. HAICORE is open to all members of the Helmholtz Association in the field of AI research.
The Helmholtz AI Platform is a research project of the Helmholtz Incubator "Information & Data Science". The overall mission of the platform is the "democratization of AI for a data-driven future": it aims at making AI algorithms and approaches available to a broad user group in an easy-to-use and resource-efficient way.
With the rise of artificial intelligence and the accompanying demand for compute resources, the energy efficiency of large-scale deep learning (DL) becomes increasingly important. The goal of EPAIS is to evaluate and correlate the computational performance and energy consumption of state-of-the-art DL models at scale, and to improve the latter by optimising the former. In this project, we measure and analyze the energy consumption and computational performance of scientific DL workloads at scale, with the aim of uncovering the correlation between the two. Along these lines, we develop easy-to-use, low-overhead tools for measuring energy consumption and performance. These tools can be incorporated by AI developers into their code for a basic assessment of these metrics, fostering awareness of GreenAI and GreenHPC. Based on these insights, we develop new approaches to increase the energy efficiency of DL workloads by means of performance optimization.
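A low-overhead energy measurement can be sketched as a context manager that samples an instantaneous power reading in a background thread and integrates it over the measured region. The `read_power` hook below is a hypothetical stand-in for a real counter (e.g. NVML GPU power or RAPL CPU energy), so the sketch stays self-contained:

```python
import threading
import time

class EnergyMeter:
    """Samples a power reading (watts) in a background thread and
    integrates it to energy (joules) over a `with` block. `read_power`
    is a hypothetical hook standing in for a real hardware counter."""

    def __init__(self, read_power, interval=0.05):
        self.read_power = read_power   # callable returning watts
        self.interval = interval       # sampling period in seconds
        self.energy_j = 0.0

    def _loop(self):
        last = time.monotonic()
        while not self._stop.is_set():
            time.sleep(self.interval)
            now = time.monotonic()
            # Rectangle rule: power sample times elapsed time.
            self.energy_j += self.read_power() * (now - last)
            last = now

    def __enter__(self):
        self._stop = threading.Event()
        self._t = threading.Thread(target=self._loop)
        self._t.start()
        return self

    def __exit__(self, *exc):
        self._stop.set()
        self._t.join()

# Usage with a fake constant 100 W reading standing in for real hardware:
with EnergyMeter(lambda: 100.0) as m:
    time.sleep(0.3)   # stand-in for a DL training step
print(f"{m.energy_j:.1f} J")
```

Because the sampling thread sleeps between readings, the instrumentation itself adds almost no load to the measured workload, which is the "low overhead" property the project's tools aim for.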
Despite steady improvements in numerical weather prediction models, they still exhibit systematic errors caused by simplified representations of physical processes, assumptions about linear behavior, and the challenges of integrating all available observational data. Weather services around the world now recognize that addressing these shortcomings through the use of artificial intelligence (AI) could revolutionize the discipline in the coming decades. This will require a fundamental shift in thinking that integrates meteorology much more closely with mathematics and computer science. TEEMLEAP will foster this cultural change through a collaboration of scientists from the KIT Climate and Environment and MathSEE centers by establishing an idealized testbed to explore machine learning in weather forecasting. In contrast to weather services, which naturally focus on improvements of numerical forecast models in their full complexity, TEEMLEAP intends to evaluate the application possibilities and benefits of AI in this testbed along the entire process chain of weather forecasting.
Research data management forms the basis for applying, for example, modern artificial intelligence methods to research questions. It is therefore an important component of the KIT Climate and Environment Center. In the SmaRD-AI project (short for Smart Research Data Management to facilitate Artificial Intelligence in Climate and Environmental Sciences), the IWG, IMK, GIK, and SCC at KIT are working closely together not only to make the treasure trove of climate and environmental data available at KIT accessible, but also to enable its structured analysis with dedicated tools.
The GÉANT Project has grown during its iterations (GN1, GN2, GN3, GN3plus, GN4-1 and GN4-2) to incorporate not just the award-winning 500 Gbps pan-European network, but also a catalogue of advanced, user-focused services, and a successful programme of innovation that is pushing the boundaries of networking technology to deliver real impact to over 50 million users.