Projects

Ongoing Projects (13)
Finished Projects (4)

Ongoing Projects

HIRO: Hierarchical Reconstruction Methods in Optical Imaging (Hierarchische Rekonstruktionsmethoden in der Optischen Bildgebung). Subproject 4: Efficient Forward Simulation and Uncertainties

Contact: Prof. Dr. Martin Frank, Sidney Hansen, Dr. Emil Løvbak
Funding: BMFTR

2026-02-01 to 2029-01-31

Optical imaging plays a central role in biomedical research and is used, for example, to visualize and track the distribution of dye-labeled drugs or cells in living organisms. There are two main approaches: fluorescence tomography (FLT), in which dyes are excited by light, and bioluminescence tomography (BLT), in which light is produced by chemical reactions without external excitation. Both methods enable three-dimensional tomographic imaging and share similar mathematical foundations. In preclinical studies, FLT is used to image fluorescently labeled cells, proteins, antibodies or drug carriers, whereas BLT is used to track transfected cells. Improving the reconstruction in these techniques can make optical imaging more competitive with nuclear medicine methods. However, as an ill-posed inverse problem with high-dimensional data, the reconstruction is a complex mathematical problem. In subproject TP4, a new method for the efficient forward simulation of light propagation is developed and implemented. We focus on Monte Carlo methods, which are highly accurate, in order to provide both gradient and uncertainty information to the solution approaches for the optimization problem in the preceding subprojects. In quantifying uncertainties, we focus on the effect of uncertain material parameters (absorption, scattering) on light propagation and thus on the target functionals.
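As a rough illustration of a Monte Carlo forward simulation with uncertain material parameters, the following minimal sketch traces photons through a homogeneous slab. It is not the method developed in the project. The 1D slab geometry, isotropic scattering, the uniform distribution for the absorption coefficient, and all names (`transmittance`, `mu_a`, `mu_s`) are assumptions made for this example.

```python
import math
import random


def transmittance(mu_a, mu_s, thickness, n_photons, rng):
    """Estimate the fraction of photons that cross a homogeneous slab."""
    mu_t = mu_a + mu_s  # total interaction coefficient (must be positive)
    transmitted = 0
    for _ in range(n_photons):
        x, cos_theta = 0.0, 1.0  # enter at the left face, heading right
        while True:
            # The free path length to the next interaction is exponential.
            x += cos_theta * rng.expovariate(mu_t)
            if x >= thickness:
                transmitted += 1  # left through the far face
                break
            if x < 0.0:
                break  # left through the entry face
            if rng.random() < mu_a / mu_t:
                break  # absorbed at the interaction site
            cos_theta = rng.uniform(-1.0, 1.0)  # isotropic scattering
    return transmitted / n_photons


rng = random.Random(0)

# Sanity check without scattering: compare with Beer-Lambert attenuation.
t_abs = transmittance(mu_a=1.0, mu_s=0.0, thickness=1.0, n_photons=20000, rng=rng)
beer_lambert = math.exp(-1.0 * 1.0)
print("estimate:", t_abs, "Beer-Lambert:", beer_lambert)

# Uncertain absorption (hypothetical range): push samples of mu_a through the solver.
samples = [
    transmittance(rng.uniform(0.8, 1.2), 5.0, 1.0, 2000, rng) for _ in range(20)
]
mean_t = sum(samples) / len(samples)
spread = (sum((t - mean_t) ** 2 for t in samples) / (len(samples) - 1)) ** 0.5
print("mean transmittance:", mean_t, "sample standard deviation:", spread)
```

Note that the spread printed at the end mixes two effects: the variation caused by the uncertain absorption coefficient and the sampling error of the photon tracing itself.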

Bayesian inversion for hybrid deterministic-stochastic kinetic solvers

Contact: Dr. Emil Løvbak
Funding: DFG

2025-06-01 to 2027-05-31

Kinetic equations are a core modelling tool across many domains of science and engineering, including fusion reactor design, radiation therapy planning and nuclear waste analysis. These equations model particle dynamics in a position-velocity phase space, whose high dimensionality makes grid-based discretization expensive in practice. One therefore often uses particle-based Monte Carlo methods for their simulation. These methods have the drawback of producing simulation results with a stochastic sampling error, because they trace only a finite number of particles. The stochastic nature of this error presents challenges in tasks such as parameter estimation, where one wishes to find the solver inputs that produce a simulation result matching a measurement under given assumptions on measurement noise. Applying a Bayesian framework to such estimation problems, one assumes a prior distribution on the parameters to be identified. One then aims to compute the corresponding posterior distribution, which takes into account how likely it is that the solver output for a given parameter value matches the provided measurement. In this project we consider sampling methods for evaluating this posterior, such as Markov chain Monte Carlo methods and ensemble Kalman inversion. The theory for these methods does not, in general, apply unchanged when particle-based Monte Carlo solvers are used to evaluate the likelihood. We study how these methods perform in combination with such solvers and develop new robust and efficient variants of them for stochastic solvers. We develop these methods on mathematical toy problems and then extend their application to practical problems within nuclear fusion research and other relevant domains.
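The difficulty described above can be made concrete with a toy example. The sketch below is a naive random-walk Metropolis sampler whose likelihood is evaluated with a noisy particle solver. It is not one of the variants developed in the project. The toy solver, the uniform prior on (0, 5), the noise level `sigma`, and all function names are assumptions made for this example.

```python
import math
import random


def noisy_solver(theta, n_particles, rng):
    """Particle estimate of the survival probability exp(-theta).

    Each particle travels an exponentially distributed distance with rate
    theta and survives if it gets past unit depth.
    """
    survived = sum(rng.expovariate(theta) > 1.0 for _ in range(n_particles))
    return survived / n_particles


def log_posterior(theta, y_obs, sigma, n_particles, rng):
    """Noisy log-posterior: uniform prior on (0, 5), Gaussian measurement noise."""
    if not 0.0 < theta < 5.0:
        return -math.inf
    residual = y_obs - noisy_solver(theta, n_particles, rng)
    return -0.5 * (residual / sigma) ** 2


def metropolis(y_obs, sigma, n_steps, n_particles, step, theta0, rng):
    """Random-walk Metropolis that reuses the stored estimate of the current state."""
    theta = theta0
    logp = log_posterior(theta0, y_obs, sigma, n_particles, rng)
    chain = []
    for _ in range(n_steps):
        proposal = theta + rng.gauss(0.0, step)
        logp_new = log_posterior(proposal, y_obs, sigma, n_particles, rng)
        # Accept with probability min(1, exp(logp_new - logp)).
        if math.log(1.0 - rng.random()) < logp_new - logp:
            theta, logp = proposal, logp_new
        chain.append(theta)
    return chain


rng = random.Random(1)
theta_true = 1.0
y_obs = math.exp(-theta_true)  # noise-free synthetic measurement
chain = metropolis(y_obs, sigma=0.02, n_steps=2000, n_particles=1000,
                   step=0.1, theta0=1.2, rng=rng)
posterior_mean = sum(chain[500:]) / len(chain[500:])  # discard burn-in
print("posterior mean estimate:", posterior_mean)
```

Because every likelihood evaluation carries sampling error, the chain in general targets a perturbed version of the posterior. Lowering `n_particles` makes this effect more visible, which is the kind of behaviour the project sets out to analyse.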

Simulierte Welten (Phase V)

Contact: Dr. Jasmin Hörter, Dr. Katharina Bata
Funding: MWK-BW

2025-04-01 to 2028-08-31
Project page: https://simulierte-welten.de/index/

We encounter simulations unconsciously in many everyday situations: the daily weather forecast, non-destructive crash tests for car approval, lightweight and material-saving plastic parts in household appliances, or investment strategies for funds and pension investors... An interdisciplinary team has set itself the task of bringing the topics of simulation, mathematical modelling, and artificial intelligence to schools. How do we recognise simulations? How are the results to be understood? And what do employees in data centres work on? How does AI work? Simulated Worlds answers these questions with many practical examples.

Spatial climate variability patterns reconstructed with Bayesian Hierarchical Learning

Contact: Prof. Dr. Nadja Klein
Funding: Helmholtz-Gemeinschaft

2025-02-01 to 2029-01-31

The project aims to reconstruct spatial patterns of timescale-dependent climate variability. To this end, a Bayesian hierarchical model will be developed that incorporates a variety of proxy data while accounting for proxy processes and noise. Using Bayesian posterior distributions, it aims to quantify the limitations and uncertainties of the derived climate variability reconstructions that arise from the covariance structure used and from the sparseness, spatial heterogeneity and noisiness of the observational data. We will use the climate variability map to investigate regional patterns of low-frequency variability and the corresponding implications, e.g. for the range of possible future climate trends in natural variability and for the frequency of extreme events. The project is supported by the Helmholtz Einstein International Berlin Research School in Data Science (HEIBRiDS) and co-supervised by Prof. Dr. Thomas Laepple from the Alfred Wegener Institute (AWI) and Prof. Dr. Tobias Krüger from the Humboldt University.

Structured explainability for interactions in deep learning models applied to pathogen phenotype prediction

Contact: Prof. Dr. Nadja Klein
Funding: DFG

2025-01-01 to 2029-06-30
Project page: gepris.dfg.de/gepris/projekt/498589566?language

Explaining and understanding the underlying interactions of genomic regions is crucial for proper pathogen phenotype characterization, such as predicting the virulence of an organism or its resistance to drugs. Existing methods for classifying the underlying large-scale genome sequence data face challenges with regard to explainability due to the high dimensionality of the data, which makes it difficult to visualize, access and justify classification decisions. This is particularly the case in the presence of interactions, such as those between genomic regions. To address these challenges, we will develop methods for variable selection and structured explainability that capture the interactions of important input variables. More specifically, we address these challenges (i) within a deep mixed models framework for binary outcomes that fuses generalized linear mixed models with a deep variant of structured predictors. We thereby combine statistical logistic regression models with deep learning to disentangle complex interactions in genomic data. In particular, we enable estimation when no explicitly formulated inputs are available for the models, as is the case, for instance, with genomic data. Further, (ii) we will extend methods for the explainability of classification decisions, such as layerwise relevance propagation, to explain these interactions. By investigating these two complementary approaches on both the model and the explainability level, our main objective is to formulate structured explanations that not only give first-order, single-variable explanations of classification decisions but also take their interactions into account. While our methods are motivated by our genomic data, they can be useful in, and extended to, other application areas in which interactions are of interest.

Probabilistic learning approaches for complex disease progression based on high-dimensional MRI data

Contact: Prof. Dr. Nadja Klein
Funding: DFG

2025-01-01 to 2029-06-30
Project page: https://gepris.dfg.de/gepris/projekt/498590773?language=en

This project proposes informed, data-driven methods to reveal pathological trajectories based on high-dimensional medical data obtained from magnetic resonance imaging (MRI). Such data are relevant as both inputs and outputs in regression equations to adequately perform early diagnosis and to model, understand, and predict current and future disease progression. For this, we will fuse deep learning (DL) methods with Bayesian statistics to (1) accurately predict the complete outcome distributions of individual patients based on MRI data and further confounders and covariates (such as clinical or demographic variables), so that uncertainty in predictions is adequately quantified, in contrast to point predictions that deliver no measure of confidence, and (2) model temporal dynamics in biomedical patient data. Regarding (1), we will develop deep distributional regression models for image inputs to accurately predict the entire distributions of the different disease scores (e.g. symptom severity), which can be multivariate and are typically highly non-normally distributed. Regarding (2), we will model the complex temporal evolution of neurological diseases by developing DL-based state-space models. Neither model is tailored to a specific disease, but both will be developed and tested, by way of example, for two neurological diseases, namely Alzheimer’s disease (AD) and multiple sclerosis (MS), chosen for their different disease progression profiles.

Distributional Copula Regression for Space-Time Data

Contact: Prof. Dr. Nadja Klein
Funding: DFG

2024-10-01 to 2028-06-30
Project page: gepris.dfg.de/gepris/projekt/544966988

This project develops novel models for multivariate spatio-temporal data using distributional copula regression. Of particular interest are tests for the significance of predictors and automatic variable selection using Bayesian selection priors. In the long run, the project will consider computationally efficient modeling of non-stationary dependencies using stochastic partial differential equations.

DFG-priority program 2298 Theoretical Foundations of Deep Learning

Contact: Prof. Dr. Martin Frank, Prof. Dr. Sebastian Krumscheid, Dr. Yijia Tang, Heinrich Daßer
Funding: DFG

2024-09-01 to 2027-08-31
Project page: https://www.foundationsofdl.de/

The goal of this project is to use deep neural networks as building blocks in a numerical method to solve the Boltzmann equation. This is a particularly challenging problem since the equation is a high-dimensional integro-differential equation, which at the same time possesses an intricate structure that a numerical method needs to preserve. Thus, artificial neural networks might be beneficial, but cannot be used out of the box. We follow two main strategies to develop structure-preserving neural network-enhanced numerical methods for the Boltzmann equation. First, we target the moment approach, where a structure-preserving neural network will be employed to model the minimal entropy closure of the moment system. By enforcing convexity of the neural network, one can show that the intrinsic structure of the moment system, such as hyperbolicity, entropy dissipation and positivity, is preserved. Second, we develop a neural network approach to solve the Boltzmann equation directly at the discrete particle velocity level. Here, a neural network is employed to model the difference between the full non-linear collision operator of the Boltzmann equation and the BGK model, which preserves the entropy dissipation principle. Furthermore, we will develop strategies to generate training data that fully sample the input space of the respective neural networks, to ensure properly functioning models.
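For orientation, the two strategies above can be written schematically as follows. This is a sketch in standard textbook notation, not the project's exact formulation. The symbols are assumptions introduced here: $m(v)$ is a vector of velocity basis functions, $\eta$ a convex entropy density, $\nu$ a collision frequency, $M[f]$ the local Maxwellian with the same mass, momentum and energy as $f$, and $h_\theta$, $\mathcal{N}_\theta$ the neural networks.

```latex
% Boltzmann equation for the particle density f(t, x, v)
\partial_t f + v \cdot \nabla_x f = Q(f)

% Strategy 1: moments u of f, closed by entropy minimization;
% a convex network h_\theta approximates the (convex) minimal entropy h
u = \int m(v)\, f \,\mathrm{d}v, \qquad
h(u) = \min_{g} \left\{ \int \eta(g)\,\mathrm{d}v \;:\; \int m(v)\, g \,\mathrm{d}v = u \right\}, \qquad
h_\theta(u) \approx h(u)

% Strategy 2: learn only the deviation from the BGK relaxation operator
Q_{\mathrm{BGK}}(f) = \nu \left( M[f] - f \right), \qquad
\mathcal{N}_\theta(f) \approx Q(f) - Q_{\mathrm{BGK}}(f)
```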

Bayesian Machine Learning with Uncertainty Quantification for Detecting Weeds in Crop Lands from Low Altitude Remote Sensing

Contact: Prof. Dr. Nadja Klein
Funding: Helmholtz-Gemeinschaft

since 2022-01-01
Project page: www.heibrids.berlin/people/doctoral-students/

Weeds are one of the major contributors to crop yield loss. As a result, farmers deploy various approaches to manage and control weed growth in their agricultural fields, the most common being chemical herbicides. However, herbicides are often applied uniformly to the entire field, which has negative environmental and financial impacts. Site-specific weed management (SSWM) considers the variability in the field and localizes the treatment. Accurate localization of weeds is the first step for SSWM. Moreover, information on the prediction confidence is crucial for deploying methods in real-world applications. This project aims to develop methods for weed identification in croplands from low-altitude UAV remote sensing imagery and for uncertainty quantification using Bayesian machine learning, in order to develop a holistic approach to SSWM. The project is supported by the Helmholtz Einstein International Berlin Research School in Data Science (HEIBRiDS) and co-supervised by Prof. Dr. Martin Herold from the GFZ German Research Centre for Geosciences.

Regression Models Beyond the Mean – A Bayesian Approach to Machine Learning

Contact: Prof. Dr. Nadja Klein
Funding: DFG

since 2019-11-01
Project page: https://gepris.dfg.de/gepris/projekt/425212771?language=en

Recent progress in computer science has led to data structures of increasing size, detail and complexity in many scientific studies. Such big data applications not only allow but also require more flexibility to overcome modelling restrictions that may result in model misspecification and biased inference, so further insight into more accurate models and appropriate inferential methods is of enormous importance. This research group will therefore develop statistical tools for both univariate and multivariate regression models that are interpretable and can be estimated quickly and accurately. Specifically, we aim to develop probabilistic approaches to recent innovations in machine learning in order to estimate models for huge data sets. To obtain more accurate regression models for the entire distribution, we construct new distributional models that can be used for both univariate and multivariate responses. In all models we will address shrinkage and automatic variable selection, to cope with a huge number of predictors, as well as the possibility of capturing any type of covariate effect. This proposal also includes software development as well as applications in the natural and social sciences (such as income distributions, marketing, weather forecasting, chronic diseases and others), highlighting its potential to contribute to important facets of modern statistics and data science.

RTG 2450 - GRK 2450 (DFG)

Contact: Prof. Dr. Martin Frank (P1, P3), Prof. Dr. Alexander Schug (P4, P5)
Funding: DFG

2019-04-01 to 2028-03-31
Project page: www.compnano.kit.edu

In the Research Training Group (RTG) "Tailored Scale-Bridging Approaches to Computational Nanoscience" we investigate problems that are not tractable with the standard tools of computational chemistry. The research is organized in seven projects. Five projects address scientific challenges such as friction, materials aging, material design and biological function. In two further projects, new methods and tools in mathematics and computer science are developed and provided for the special requirements of these applications. The SCC is involved in projects P4, P5 and P6.

CRC 1173 Wave phenomena

Contact: Prof. Dr. Martin Frank
Funding: DFG

since 2015-07-01
Project page: https://www.waves.kit.edu/index.php

Waves are everywhere, and understanding their behavior leads us to understand nature. The goal of CRC 1173 »Wave Phenomena« is therefore to analytically understand, numerically simulate, and eventually manipulate wave propagation under realistic scenarios by intertwining analysis and numerics.

Computational and Mathematical Modeling Program - CAMMP

Contact: Prof. Dr. Martin Frank, Dr. Jasmin Hörter, Dr. Katharina Bata
Funding: MWK-BW

since 2015-01-01
Project page: forschung/CAMMP

CAMMP stands for Computational and Mathematical Modeling Program. It is an extracurricular offering of KIT for students of different ages. We want to make the public aware of the social importance of mathematics and simulation sciences. For this purpose, students work together with teachers in various event formats and actively solve problems with the help of mathematical modeling and computers. In doing so, they explore real problems from everyday life, industry or research.

Finished Projects

Simulated worlds

Contact: Dr. Jasmin Hörter, Dr. Katharina Bata
Funding: MWK-BW

2021-09-01 to 2025-03-31
Project page: simulierte-welten.de

The Simulated Worlds project aims to provide students in Baden-Württemberg with a deeper critical understanding of the possibilities and limitations of computer simulations. The project is jointly supported by the Scientific Computing Center (SCC), the High Performance Computing Center Stuttgart (HLRS) and the University of Ulm and is already working with several schools in Baden-Württemberg.

Shallow priors and deep learning: The potential of Bayesian statistics as an agent for deep Gaussian mixture models

Contact: Prof. Dr. Nadja Klein
Funding: Volkswagenstiftung

2021-08-01 to 2025-02-28
Project page: portal.volkswagenstiftung.de/search/projectDetails.do?siteLanguage=en&ref=96932

Despite significant overlap and synergy, machine learning and statistical science have developed largely in parallel. Deep Gaussian mixture models, a recently introduced model class in machine learning, are concerned with the unsupervised tasks of density estimation and high-dimensional clustering used for pattern recognition in many applied areas. In order to avoid over-parameterized solutions, dimension reduction by factor models can be applied at each layer of the architecture. However, the choice of architecture can be interpreted as a Bayesian model choice problem, meaning that every possible model satisfying the constraints is then fitted. The authors propose a much simpler approach: only one large model needs to be trained, and unnecessary components will empty out. The idea that parameters can be assigned prior distributions is highly unorthodox but extremely simple, bringing together two sciences, namely machine learning and Bayesian statistics.

Boosting copulas - multivariate distributional regression for digital medicine

Contact: Prof. Dr. Nadja Klein
Funding: DFG

2020-09-01 to 2025-03-31
Project page: gepris.dfg.de/gepris/projekt/428239776

Traditional regression models often provide an overly simplistic view of the complex associations and relationships found in contemporary data problems in biomedicine. In particular, correctly capturing relevant associations between multiple clinical endpoints is highly relevant for avoiding model misspecification, which can lead to biased results and even to wrong or misleading conclusions and treatments. As such, the development of statistical methods tailored to such problems in biomedicine is of considerable interest. The aim of this project is to develop novel conditional copula regression models for high-dimensional biomedical data structures by bringing together efficient statistical learning tools for high-dimensional data and established methods from economics for multivariate data structures that can capture complex dependence structures between variables. These methods will allow us to model the entire joint distribution of multiple endpoints simultaneously and to automatically determine the relevant influential covariates and risk factors via algorithms originally proposed in statistical and machine learning. The resulting models can then be used both for the interpretation and analysis of complex association structures and for prediction inference (simultaneous prediction intervals for multiple endpoints). An additional implementation in open software and its application in various studies highlight the potential of this project’s methodological developments in the area of digital medicine.

i2Batman

Contact: Prof. Dr. Martin Frank
Funding: Helmholtz-Gemeinschaft

2020-08-01 to 2023-07-31

Together with partners at Forschungszentrum Jülich and Fritz Haber Institute Berlin, our goal is to develop a novel intelligent management system for electric batteries that can make better decisions about battery charging cycles based on a detailed surrogate model ("digital twin") of the battery and artificial intelligence.