This project develops novel models for multivariate spatio-temporal data using distributional copula regression. Of particular interest are tests for the significance of predictors and automatic variable selection using Bayesian selection priors. In the long run, the project will also consider computationally efficient modeling of non-stationary dependencies using stochastic partial differential equations.
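In the bivariate case, the underlying idea can be sketched with Sklar's theorem, where the marginal distributions and the copula dependence parameter all depend on covariates through their own regression predictors; the notation below is generic and not taken from the project:

```latex
% Generic bivariate distributional copula regression (illustrative notation):
% marginals F_1, F_2 and the copula C with dependence parameter \rho(x)
% are all linked to covariates x.
F(y_1, y_2 \mid x) = C\bigl(F_1(y_1 \mid \vartheta_1(x)),\, F_2(y_2 \mid \vartheta_2(x)) \,\big|\, \rho(x)\bigr)
```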
Based on the GitLab software, a state-wide service for the administration, versioning and publication of software repositories for universities in Baden-Württemberg is being created as part of the IT alliance. GitLab also offers numerous possibilities for collaborative work and has extensive functionalities for software development and automation. The service enables or simplifies cross-location development projects between universities and with external partners, opens up new possibilities in research data management, and can be put to good use in teaching. It also creates an alternative to cloud services such as GitHub and makes it easier to keep data from research and teaching within Baden-Württemberg.
The ongoing warming of the earth's climate due to man-made climate change is fundamentally changing our weather. Traditionally, weather forecasts have been based on numerical models, so-called Numerical Weather Predictions (NWP). Data-driven machine learning models, and in particular deep neural networks, offer potential as surrogate models for the fast and (energy-)efficient emulation of NWP models. As part of the SmartWeather21 project, we want to investigate which deep learning (DL) architecture is best suited for weather forecasting in a warmer climate, based on the high-resolution climate projections generated with ICON as part of WarmWorld. To handle the high resolution of the WarmWorld climate projections, we will develop data- and model-parallel approaches and architectures for weather forecasting in a warmer climate. Furthermore, we will investigate which (learnable combinations of) variables from the ICON climate projections provide the best, physically plausible forecast accuracy for weather prediction in a warmer climate. With this in mind, we will develop dimension reduction techniques for the various input variables that learn a latent, lower-dimensional representation as an upstream task, driven by the accuracy of the downstream weather forecast. The increased spatial resolution of the ICON simulations also allows conclusions to be drawn about the uncertainties of individual input and output variables at lower resolutions. As part of SmartWeather21, we will develop methods that parameterise these uncertainties using probability distributions and use them as input variables with lower spatial resolution in DL-based weather models, so that they can be propagated through the model as part of probabilistic forecasts.
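As an illustration of this upstream/downstream coupling, a minimal sketch (assuming a PyTorch setup; tensor shapes, layer sizes and names are hypothetical) of an encoder whose latent representation is trained purely through the loss of a downstream forecast head:

```python
# Minimal sketch (assumed PyTorch setup, hypothetical shapes): a latent encoder
# for stacked high-resolution input fields trained end-to-end with a simple
# forecast head, so the low-dimensional representation is shaped by the
# accuracy of the downstream forecast rather than by reconstruction error.
import torch
import torch.nn as nn

class LatentForecaster(nn.Module):
    def __init__(self, n_vars: int, latent_dim: int = 64):
        super().__init__()
        # Encoder: reduce the stack of input variables to a latent representation.
        self.encoder = nn.Sequential(
            nn.Conv2d(n_vars, 32, kernel_size=3, padding=1), nn.GELU(),
            nn.AdaptiveAvgPool2d(8), nn.Flatten(),
            nn.Linear(32 * 8 * 8, latent_dim),
        )
        # Forecast head: map the latent state to a coarse target field.
        self.head = nn.Linear(latent_dim, 8 * 8)

    def forward(self, x):
        z = self.encoder(x)                  # latent, lower-dimensional representation
        return self.head(z).view(-1, 8, 8)   # coarse forecast of the target variable

# Training-step sketch: the encoder receives gradients only from the forecast loss.
model = LatentForecaster(n_vars=5)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(4, 5, 64, 64)   # dummy batch of stacked input variables
y = torch.randn(4, 8, 8)        # dummy coarse forecast target
loss = nn.functional.mse_loss(model(x), y)
loss.backward()
opt.step()
```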
The aim of this project is to strengthen research-oriented teaching, especially in the areas of AI, machine learning, simulation and modeling, by providing bwJupyter, a state-wide service.
Solar tower power plants play a key role in facilitating the ongoing energy transition, as they deliver dispatchable, climate-neutral electricity and direct heat for chemical processes. In this work we develop a heliostat-specific differentiable ray tracer capable of modeling the energy transport at the solar tower in a data-driven manner. This enables heliostat surface reconstruction and thus drastically improves irradiance prediction. Additionally, such a ray tracer drastically reduces the amount of data required for alignment calibration. In principle, this makes learning for a fully AI-operated solar tower feasible. The goal is to develop a holistic AI-enhanced digital twin of the solar power plant for design, control, prediction and diagnosis, based on the physical differentiable ray tracer. Any operational parameter in the solar field that influences the energy transport can be optimized with it, for the first time gradient-based; examples include field design, aim point control and current state diagnosis. By extending it with AI-based optimization techniques and reinforcement learning algorithms, it should be possible to map real, dynamic environmental conditions to the twin with low latency. Finally, due to the full differentiability, visual explanations for the operational action predictions are possible. The proposed AI-enhanced digital twin environment will be verified at a real power plant in Jülich. Its inception marks a significant step towards a fully automatic solar tower power plant.
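A toy sketch of the gradient-based optimization such differentiability enables, using a hypothetical stand-in for the ray tracer and generic aim-point parameters (none of the names or shapes are from the project):

```python
# Toy sketch (hypothetical flux model, not the project's ray tracer): with a
# differentiable model of the energy transport, operational parameters such as
# heliostat aim points can be optimized directly by gradient descent.
import torch

def flux_objective(aim_points, target):
    """Stand-in for a differentiable ray tracer: rewards aim points that match
    a desired distribution on the receiver (here simply negative squared error)."""
    return -((aim_points - target) ** 2).sum()

aim_points = torch.zeros(10, 2, requires_grad=True)   # 10 heliostats, 2D aim points
target = torch.tensor([0.5, -0.2]).expand(10, 2)      # desired aim distribution

opt = torch.optim.SGD([aim_points], lr=0.1)
for _ in range(100):
    opt.zero_grad()
    loss = -flux_objective(aim_points, target)  # maximize the flux objective
    loss.backward()                             # gradients flow through the "ray tracer"
    opt.step()
```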
Together with the insights, assessments and recommendations gained, the current challenges and defined fields of action of the framework concept of the universities of the state of Baden-Württemberg for HPC and DIC in the period 2025 to 2032 are to be made concrete in the project through the following measures:
• Further development of scientific support with regard to competencies for supporting novel system and method concepts (AI, ML or quantum computing), networking with methods research, holistic needs analyses and support strategies (e.g. onboarding)
• Increasing energy efficiency through awareness-raising as well as the investigation and use of new operating models and workflows, including optimized software
• Testing and flexible integration of new system components and architectures, resources (e.g. cloud), and virtualization and containerization solutions
• Implementation of new software strategies (e.g. sustainability and development processes)
• Expansion of the functionalities of the Baden-Württemberg data federation (e.g. data transfer service)
• Implementation of concepts for handling sensitive data and for establishing digital sovereignty
• Networking and cooperation with other research infrastructures
“ICON-SmART” addresses the role of aerosols and atmospheric chemistry in the simulation of seasonal to decadal climate variability and change. To this end, the project will enhance the capabilities of the coupled composition, weather and climate modelling system ICON-ART (ICON, the icosahedral nonhydrostatic model developed by DWD, MPI-M and DKRZ, with the atmospheric composition module ART for aerosols and reactive trace gases developed by KIT) for seasonal to decadal predictions and climate projections in seamless global-to-regional model configurations with ICON-Seamless-ART (ICON-SmART). Based on previous work, chemistry is a promising candidate for speed-up by machine learning, and the project will also explore machine learning approaches for other processes. The ICON-SmART model system will provide scientists, forecasters and policy-makers with a novel tool to investigate atmospheric composition in a changing climate and will allow us to answer questions that were previously out of reach.
This project uses artificial neural networks in an inverse design problem of finding nano-structured materials with optical properties on demand. Achieving this goal requires generating large amounts of data from 3D simulations of Maxwell's equations, which makes this a data-intensive computing problem. Tailored algorithms are being developed that address both the learning process and the efficient inversion. The project complements research in the SDL Materials Science on AI methods, large data sets generated by simulations, and workflows.
The EQUIPE project deals with the quantification of uncertainties in large transformer models for time series prediction. Although the transformer architecture is able to achieve astonishingly high prediction accuracy, it requires immense amounts of computational resources. Common approaches to error estimation in neural networks are equally computationally intensive, which currently makes their use in transformers considerably more difficult. The research work within EQUIPE aims to solve these problems and to develop scalable algorithms for quantifying uncertainties in large neural networks, which will enable the methods to be used in real-time systems in the future.
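For illustration only, a minimal sketch of Monte Carlo dropout, a common (and itself computationally costly) baseline for uncertainty estimation in transformers; the scalable algorithms targeted by EQUIPE are not shown here:

```python
# Illustration only (Monte Carlo dropout baseline, not the EQUIPE methods):
# keeping dropout active at inference time yields an ensemble of forecasts
# whose spread serves as a rough epistemic uncertainty estimate.
import torch
import torch.nn as nn

model = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=32, nhead=4, dropout=0.1, batch_first=True),
    num_layers=2,
)
head = nn.Linear(32, 1)

x = torch.randn(8, 24, 32)  # batch of 8 series, 24 time steps, 32 features

model.train()  # keep dropout active to draw stochastic forward passes
with torch.no_grad():
    samples = torch.stack([head(model(x))[:, -1, 0] for _ in range(50)])

mean, std = samples.mean(dim=0), samples.std(dim=0)  # point forecast and its spread
```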
The main goal of the present project is the further development and validation of a new computational fluid dynamics (CFD) method using a combination of grid-free (particle) and grid-based techniques. A fundamental assumption of this novel approach is the decomposition of any physical quantity into a grid-based (large-scale) and a fine-scale part, where the large scales are resolved on the grid and the fine scales are represented by particles. The dynamics of the large and fine scales are calculated from two coupled transport equations, one of which is solved on the grid, while the second utilizes the Lagrangian grid-free Vortex Particle Method (VPM).
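In generic notation (assumed here, not taken from the project), the decomposition of a transported quantity reads:

```latex
% Scale decomposition assumed for illustration: \bar{u} is the grid-resolved
% (large-scale) part, u' the fine-scale part carried by Lagrangian particles;
% each part evolves according to its own transport equation, and the two
% equations are coupled.
u(\mathbf{x}, t) = \bar{u}(\mathbf{x}, t) + u'(\mathbf{x}, t)
```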
InterTwin co-designs and implements the prototype of an interdisciplinary Digital Twin Engine (DTE), an open source platform that provides generic and tailored software components for modelling and simulation to integrate application-specific Digital Twins (DTs). Its specifications and implementation are based on a co-designed conceptual model - the DTE blueprint architecture - guided by the principles of open standards and interoperability. The ambition is to develop a common approach to the implementation of DTs that is applicable across the whole spectrum of scientific disciplines and beyond, to facilitate developments and collaboration.
Weeds are one of the major contributors to crop yield loss. As a result, farmers deploy various approaches to manage and control weed growth in their agricultural fields, the most common being chemical herbicides. However, herbicides are often applied uniformly to the entire field, which has negative environmental and financial impacts. Site-specific weed management (SSWM) takes the variability within the field into account and localizes the treatment. Accurate localization of weeds is the first step towards SSWM. Moreover, information on prediction confidence is crucial for deploying methods in real-world applications. This project aims to develop methods for weed identification in croplands from low-altitude UAV remote sensing imagery and for uncertainty quantification using Bayesian machine learning, in order to develop a holistic approach to SSWM. The project is supported by the Helmholtz Einstein International Berlin Research School in Data Science (HEIBRiDS) and co-supervised by Prof. Dr. Martin Herold from the GFZ German Research Centre for Geosciences.
The Simulated Worlds project aims to provide students in Baden-Württemberg with a deeper critical understanding of the possibilities and limitations of computer simulations. The project is jointly supported by the Scientific Computing Center (SCC), the High Performance Computing Center Stuttgart (HLRS) and the University of Ulm and is already working with several schools in Baden-Württemberg.
Despite significant overlap and synergy, machine learning and statistical science have developed largely in parallel. Deep Gaussian mixture models, a recently introduced model class in machine learning, address the unsupervised tasks of density estimation and high-dimensional clustering used for pattern recognition in many applied areas. To avoid over-parameterized solutions, dimension reduction by factor models can be applied at each layer of the architecture. The choice of architecture can be interpreted as a Bayesian model choice problem, which would mean that every possible model satisfying the constraints has to be fitted. The authors propose a much simpler approach: only one large model needs to be trained, and unnecessary components empty out. The idea of assigning prior distributions to the parameters is highly unorthodox yet extremely simple, bringing together two sciences, namely machine learning and Bayesian statistics.
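In generic notation, a single layer of this model class reduces to the familiar mixture of factor analyzers; shrinkage priors on the mixture weights are what let unnecessary components empty out:

```latex
% Mixture of factor analyzers (one layer of a deep Gaussian mixture model,
% illustrative notation): K components with low-rank factor loadings \Lambda_k
% and diagonal noise covariance \Psi_k; priors on the weights \pi_k allow
% superfluous components to be emptied out during estimation.
f(y) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}\!\bigl(y;\, \mu_k,\; \Lambda_k \Lambda_k^{\top} + \Psi_k\bigr)
```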
Within the Joint Lab VMD, the SDL Materials Science develops methods, tools and architectural concepts for supercomputing and big data infrastructures, which are tailored to tackle the specific application challenges and to facilitate the digitalization in materials research and the creation of digital twins. In particular, the Joint Lab develops a virtual research environment (VRE) that integrates computing and data storage resources in existing workflow management systems and interactive environments for simulation and data analysis.
Traditional regression models often provide an overly simplistic view of the complex associations and relationships in contemporary data problems in biomedicine. In particular, correctly capturing relevant associations between multiple clinical endpoints is highly relevant to avoid model misspecifications, which can lead to biased results and even wrong or misleading conclusions and treatments. As such, the methodological development of statistical methods tailored to such problems in biomedicine is of considerable interest. The aim of this project is to develop novel conditional copula regression models for high-dimensional biomedical data structures by bringing together efficient statistical learning tools for high-dimensional data and established methods from economics for multivariate data structures that allow complex dependence structures between variables to be captured. These methods will allow us to model the entire joint distribution of multiple endpoints simultaneously and to automatically determine the relevant influential covariates and risk factors via algorithms originally proposed in the area of statistical and machine learning. The resulting models can then be used both for the interpretation and analysis of complex association structures and for predictive inference (simultaneous prediction intervals for multiple endpoints). Implementation in open software and applications in various studies will further highlight the potential of this project's methodological developments in the area of digital medicine.
The Helmholtz AI COmpute REsources (HAICORE) infrastructure project was launched in early 2020 as part of the Helmholtz Incubator "Information & Data Science" to provide high-performance computing resources for artificial intelligence (AI) researchers in the Helmholtz Association. Technically, the AI hardware is operated as part of the high-performance computing systems JUWELS (Jülich Supercomputing Centre) and HoreKa (KIT) at the two centers. The SCC primarily covers prototypical development operations in which new approaches, models and methods can be developed and tested. HAICORE is open to all members of the Helmholtz Association in the field of AI research.
Recent progress in computer science has led to data structures of increasing size, detail and complexity in many scientific studies. Particularly nowadays, where such big data applications not only allow but also require more flexibility to overcome modelling restrictions that may result in model misspecification and biased inference, further insight into more accurate models and appropriate inferential methods is of enormous importance. This research group will therefore develop statistical tools for both univariate and multivariate regression models that are interpretable and that can be estimated extremely fast and accurately. Specifically, we aim to develop probabilistic approaches to recent innovations in machine learning in order to estimate models for huge data sets. To obtain more accurate regression models for the entire distribution, we construct new distributional models that can be used for both univariate and multivariate responses. In all models we will address the issues of shrinkage and automatic variable selection to cope with a huge number of predictors, as well as the possibility to capture any type of covariate effect. The proposal also includes software development as well as applications in the natural and social sciences (such as income distributions, marketing, weather forecasting, chronic diseases and others), highlighting its potential to contribute successfully to important facets of modern statistics and data science.
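The generic building block for such models is a distributional regression specification in which every parameter of the response distribution gets its own predictor; the notation below is a standard generic form, not the group's specific model:

```latex
% Generic distributional regression (illustrative notation): each parameter
% \vartheta_{ik} of the response distribution \mathcal{D} is linked via g_k to
% its own predictor \eta_{ik}, built from covariate effects f_{jk}.
y_i \sim \mathcal{D}\bigl(\vartheta_{i1}, \ldots, \vartheta_{iK}\bigr),
\qquad
g_k(\vartheta_{ik}) = \eta_{ik} = \beta_{0k} + \sum_{j=1}^{J_k} f_{jk}(x_{ij}),
\quad k = 1, \ldots, K
```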
The Helmholtz AI Platform is a research project of the Helmholtz Incubator "Information & Data Science". The overall mission of the platform is the "democratization of AI for a data-driven future": it aims to make AI algorithms and approaches available to a broad user group in an easy-to-use and resource-efficient way.
In the Research Training Group (RTG) "Tailored Scale-Bridging Approaches to Computational Nanoscience" we investigate problems that are not tractable with standard tools of computational chemistry. The research is organized in seven projects. Five projects address scientific challenges such as friction, materials aging, material design and biological function. In two further projects, new methods and tools in mathematics and computer science are developed and provided for the specific requirements of these applications. The SCC is involved in projects P4, P5 and P6.
CAMMP stands for Computational and Mathematical Modeling Program. It is an extracurricular offering of KIT for students of different ages. We want to make the public aware of the societal importance of mathematics and the simulation sciences. To this end, in various event formats, students work together with teachers and actively engage in problem solving using mathematical modeling and computers. In doing so, they explore real problems from everyday life, industry and research.
With the rise of artificial intelligence and the accompanying demand for compute resources, the energy efficiency of large-scale deep learning (DL) becomes increasingly important. The goal of EPAIS is to evaluate and correlate the computational performance and energy consumption of state-of-the-art DL models at scale, and to improve the latter by optimising the former. In this project, we measure and analyze the energy consumption and computational performance of scientific DL workloads at scale, with the aim of uncovering the correlation between the two. Along these lines, we develop easy-to-use, low-overhead tools for measuring energy consumption and performance. These tools can be incorporated by AI developers into their code for a basic assessment of these metrics, fostering awareness for GreenAI and GreenHPC. Based on these insights, we develop new approaches to increase the energy efficiency of DL workloads by means of performance optimization.
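A minimal sketch of the kind of low-overhead measurement such tools build on, using the pynvml NVML bindings (this is an illustrative snippet, not the EPAIS tools themselves):

```python
# Minimal sketch (not the EPAIS tools): sampling GPU power draw via NVML around
# a code region gives a rough, low-overhead energy estimate that developers can
# embed in training scripts.
import time
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)   # first GPU

samples, interval = [], 0.1                     # sample power every 100 ms
start = time.time()
while time.time() - start < 5.0:                # region of interest (here: 5 s)
    samples.append(pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0)  # mW -> W
    time.sleep(interval)

energy_joules = sum(samples) * interval         # crude rectangle-rule integral
print(f"approx. energy over region: {energy_joules:.1f} J")
pynvml.nvmlShutdown()
```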
Despite steady improvements in numerical weather prediction models, they still exhibit systematic errors caused by simplified representations of physical processes, assumptions about linear behavior, and the challenges of integrating all available observational data. Weather services around the world now recognize that addressing these shortcomings through the use of artificial intelligence (AI) could revolutionize the discipline in the coming decades. This will require a fundamental shift in thinking that integrates meteorology much more closely with mathematics and computer science. TEEMLEAP will foster this cultural change through a collaboration of scientists from the KIT Climate and Environment and MathSEE centers by establishing an idealized testbed to explore machine learning in weather forecasting. In contrast to weather services, which naturally focus on improvements of numerical forecast models in their full complexity, TEEMLEAP intends to evaluate the application possibilities and benefits of AI in this testbed along the entire process chain of weather forecasting.
Cardiovascular diseases are among the most common causes of death worldwide: Every year, more than 300,000 people die in Germany as a result. Around half of these deaths are caused by cardiac arrhythmias. In the European MICROCARD project, in which the Karlsruhe Institute of Technology (KIT) is involved, researchers are now developing a simulation platform that can digitally map the electrophysical signal transmissions in the heart. The computer simulations are to contribute in particular to improved diagnosis and therapy. KIT will receive about 1.3 million euros for its contributions within the framework of the "European High-Performance Computing Joint Undertaking".
The primary objective of the project is to establish an integrated nationwide computing and data infrastructure and to increase efficiency and effectiveness by providing first-class support to scientists and users.
Together with partners at Forschungszentrum Jülich and Fritz Haber Institute Berlin, our goal is to develop a novel intelligent management system for electric batteries that can make better decisions about battery charging cycles based on a detailed surrogate model ("digital twin") of the battery and artificial intelligence.
The Exascale Earth System Modelling (PL-ExaESM) pilot lab explores specific concepts for applying Earth System models and their workflows to future exascale supercomputers.
The aim of the project is to develop new tools for simulating heavy-ion beams in targets. We want to characterize the spatial and energy distribution of all primary and secondary particles. This is of interest in many fields: atomic physics (atomic interactions, ion capture), nuclear physics (investigation of the structure of atomic nuclei), electronics (deposition of elements), materials science (analysis of damage, e.g. in a tokamak), and biology (investigation of tissue toxicology through ion analysis). Simulating heavy ions is difficult for two reasons: First, grid-based simulation of particle transport is very challenging. Second, the simulations are based on measurements of the ions' stopping powers and must therefore be regarded as uncertain. We are therefore developing a new, entropy-based discretization scheme that enables sub-resolution below the numerical grid and is therefore suitable for simulating beams. In addition, we use a similar method to treat uncertainties in the particle distribution caused by the uncertain cross sections. Our method is computationally expensive but highly parallelizable, which makes it ideal for modern computer architectures.
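For orientation, a generic linear transport equation of the type such grid-based schemes discretize (illustrative notation only, not the project's exact formulation):

```latex
% Generic linear transport equation assumed for illustration: \psi is the
% angular particle flux, \Sigma_t and \Sigma_s the total and scattering cross
% sections, q a source term.
\Omega \cdot \nabla_x \psi(x, \Omega, E)
  + \Sigma_t(x, E)\, \psi(x, \Omega, E)
  = \int_{S^2} \Sigma_s(x, \Omega' \!\cdot\! \Omega, E)\, \psi(x, \Omega', E)\, d\Omega'
  + q(x, \Omega, E)
```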
Development of an innovative measurement system based on P&DGNAA technology for environmental analysis, including new evaluation algorithms.
Future exascale HPC systems require efficient data management methods. Data locality and efficient access during a simulation are of great importance.
The Scientific Computing Center (SCC) operates the research platform Smart Data Innovation Lab (SDIL) at KIT. SDIL creates the conditions for cutting-edge research in the field of Big Data ...