Fixed-Point Methods for Numerics at Exascale

2026

Investigating Matrix Repartitioning to Address the Over and Undersubscription Challenge for a GPU-Based CFD Solver
Olenik, G.; Koch, M.; Anzt, H.
2026. High Performance Computing – ISC High Performance 2025 International Workshops, Hamburg, Germany, June 10–13, 2025, Revised Selected Papers. Ed.: S. Neuwirth, 468–479, Springer Nature Switzerland. doi:10.1007/978-3-032-07612-0_36

2025

pyGinkgo: A Sparse Linear Algebra Operator Framework for Python
Tuteja, K.; Olenik, G.; Mishchuk, R.; Tsai, Y.-H.; Götz, M.; Streit, A.; Anzt, H.; Debus, C.
2025. ICPP ’25: Proceedings of the 54th International Conference on Parallel Processing, San Diego, CA, September 8–11, 2025, 753–763, Association for Computing Machinery (ACM). doi:10.1145/3754598.3754648

Ginkgo: A high performance numerical linear algebra library
Anzt, H.; Cojean, T.; Chen, Y.-C.; Flegar, G.; Göbel, F.; Grützmacher, T.; Koch, M.; Nayak, P.; Olenik, G.; Ribizel, T.; Tsai, Y.-H.
2025, December 15. doi:10.5281/zenodo.17936456

Efficient solution of batched band linear systems on GPUs
Nayak, P.; Aggarwal, I.; Anzt, H.
2025. The International Journal of High Performance Computing Applications, 39 (5), 615–630. doi:10.1177/10943420251347460

Towards a platform-portable linear algebra backend for OpenFOAM
Olenik, G.; Koch, M.; Boutanios, Z.; Anzt, H.
2025. Meccanica, 60 (6), 1659–1672. doi:10.1007/s11012-024-01806-1

Multifacets of lossy compression for scientific data in the Joint-Laboratory of Extreme Scale Computing
Cappello, F.; Acosta, M.; Agullo, E.; Anzt, H.; Calhoun, J.; Di, S.; Giraud, L.; Grützmacher, T.; Jin, S.; Sano, K.; Sato, K.; Singh, A.; Tao, D.; Tian, J.; Ueno, T.; Underwood, R.; Vivien, F.; Yepes, X.; Kazutomo, Y.; Zhang, B.
2025. Future Generation Computer Systems, 163, Art.-Nr.: 107323. doi:10.1016/j.future.2024.05.022

Accelerating Fusion Plasma Collision Operator Solves with Portable Batched Iterative Solvers on GPUs
Lin, P. T.; Nayak, P.; Kashi, A.; Kulkarni, D.; Scheinberg, A.; Anzt, H.
2025. High Performance Computing. ISC High Performance 2024 International Workshops. Ed.: M. Weiland, 127–140, Springer Nature Switzerland. doi:10.1007/978-3-031-73716-9_9

2024

FRSZ2 for In-Register Block Compression Inside GMRES on GPUs
Grützmacher, T.; Underwood, R.; Di, S.; Cappello, F.; Anzt, H.
2024. Proceedings of SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis, 240–249, Institute of Electrical and Electronics Engineers (IEEE). doi:10.1109/SCW63240.2024.00038

A Probabilistic Model for Asynchronous Iterative Methods
Nayak, P.; Anzt, H.
2024. 2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), San Francisco, CA, USA, 27-31 May 2024, 260–269, Institute of Electrical and Electronics Engineers (IEEE). doi:10.1109/IPDPSW63119.2024.00064

Portable Mixed Precision Algebraic Multigrid on High Performance GPUs. PhD dissertation
Tsai, Y.-H.
2024, February 29. Karlsruher Institut für Technologie (KIT). doi:10.5445/IR/1000168914

Ginkgo - A math library designed to accelerate Exascale Computing Project science applications
Cojean, T.; Nayak, P.; Ribizel, T.; Beams, N.; Mike Tsai, Y.-H.; Koch, M.; Göbel, F.; Grützmacher, T.; Anzt, H.
2024. The International Journal of High Performance Computing Applications. doi:10.1177/10943420241268323

Earth Virtualization Engines (EVE)
Stevens, B.; Adami, S.; Ali, T.; Anzt, H.; Aslan, Z.; Attinger, S.; Bäck, J.; Baehr, J.; Bauer, P.; Bernier, N.; Bishop, B.; Bockelmann, H.; Bony, S.; Brasseur, G.; Bresch, D. N.; Breyer, S.; Brunet, G.; Buttigieg, P. L.; Cao, J.; Castet, C.; Cheng, Y.; Dey Choudhury, A.; Coen, D.; Crewell, S.; Dabholkar, A.; Dai, Q.; Doblas-Reyes, F.; Durran, D.; El Gaidi, A.; Ewen, C.; Exarchou, E.; Eyring, V.; Falkinhoff, F.; Farrell, D.; Forster, P. M.; Frassoni, A.; Frauen, C.; Fuhrer, O.; Gani, S.; Gerber, E.; Goldfarb, D.; Grieger, J.; Gruber, N.; Hazeleger, W.; Herken, R.; Hewitt, C.; Hoefler, T.; Hsu, H.-H.; Jacob, D.; Jahn, A.; Jakob, C.; Jung, T.; Kadow, C.; Kang, I.-S.; Kang, S.; Kashinath, K.; Kleinen-von Königslöw, K.; Klocke, D.; Kloenne, U.; Klöwer, M.; Kodama, C.; Kollet, S.; Kölling, T.; Kontkanen, J.; Kopp, S.; Koran, M.; Koran, M.; Kulmala, M.; Lappalainen, H.; Latifi, F.; Lawrence, B.; Lee, J. Y.; Lejeun, Q.; Lessig, C.; Li, C.; Lippert, T.; Luterbacher, J.; Manninen, P.; Marotzke, J.; Matsouoka, S.; Merchant, C.; Messmer, P.; Michel, G.; Michielsen, K.; Miyakawa, T.; Müller, J.; Munir, R.; Narayanasetti, S.; Ndiaye, O.; Nobre, C.; Oberg, A.; Oki, R.; Özkan-Haller, T.; Palmer, T.; Posey, S.; Prein, A.; Primus, O.; Pritchard, M.; Pullen, J.; Putrasahan, D.; Quaas, J.; Raghavan, K.; Ramaswamy, V.; Rapp, M.; Rauser, F.; Reichstein, M.; Revi, A.; Saluja, S.; Satoh, M.; Schemann, V.; Schemm, S.; Schnadt Poberaj, C.; Schulthess, T.; Senior, C.; Shukla, J.; Singh, M.; Slingo, J.; Sobel, A.; Solman, S.; Spitzer, J.; Stier, P.; Stocker, T.; Strock, S.; Su, H.; Taalas, P.; Taylor, J.; Tegtmeier, S.; Teutsch, G.; Tompkins, A.; Ulbrich, U.; Vidale, P.-L.; Wu, C.-M.; Xu, H.; Zaki, N.; Zanna, L.; Zhou, T.
2024. Earth System Science Data, 16 (4), 2113–2122. doi:10.5194/essd-16-2113-2024

GPU-resident sparse direct linear solvers for alternating current optimal power flow analysis
Świrydowicz, K.; Koukpaizan, N.; Ribizel, T.; Göbel, F.; Abhyankar, S.; Anzt, H.; Peleš, S.
2024. International Journal of Electrical Power & Energy Systems, 155 (Part A), Art.-Nr.: 109517. doi:10.1016/j.ijepes.2023.109517

2023

Synchronization-free algorithms for exascale and beyond : A study of Asynchronous and Batched Iterative Methods. PhD dissertation
Nayak, P. V.
2023, December 13. Karlsruher Institut für Technologie (KIT). doi:10.5445/IR/1000165437

Three-precision algebraic multigrid on GPUs
Tsai, Y.-H. M.; Beams, N.; Anzt, H.
2023. Future Generation Computer Systems, 149, 280–293. doi:10.1016/j.future.2023.07.024

Porting Batched Iterative Solvers onto Intel GPUs with SYCL
Nguyen, P.; Nayak, P.; Anzt, H.
2023. Proceedings of the SC ’23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, Denver, USA, 12-17 November 2023, 1048–1058, Association for Computing Machinery (ACM). doi:10.1145/3624062.3624181

Parallel Symbolic Cholesky Factorization
Ribizel, T.; Anzt, H.
2023. Proceedings of the SC ’23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, Denver, USA, 12-17 November 2023, 1721–1727, Association for Computing Machinery (ACM). doi:10.1145/3624062.3624253

Integrating batched sparse iterative solvers for the collision operator in fusion plasma simulations on GPUs
Kashi, A.; Nayak, P.; Kulkarni, D.; Scheinberg, A.; Lin, P.; Anzt, H.
2023. Journal of Parallel and Distributed Computing, 178, 69–81. doi:10.1016/j.jpdc.2023.03.012

Utilizing batched solver ideas for efficient solution of non-batched linear systems
Nayak, P.; Anzt, H.
2023. 2023 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 662–665, Institute of Electrical and Electronics Engineers (IEEE). doi:10.1109/IPDPSW59300.2023.00113

GPU-resident sparse direct linear solvers for alternating current optimal power flow analysis
Świrydowicz, K.; Koukpaizan, N.; Ribizel, T.; Göbel, F.; Abhyankar, S.; Anzt, H.; Peleš, S.
2023. doi:10.48550/arXiv.2306.14337

Sparse matrix‐vector and matrix‐multivector products for the truncated SVD on graphics processors
Aliaga, J. I.; Anzt, H.; Quintana-Ortí, E. S.; Tomás, A. E.
2023. Concurrency and Computation: Practice and Experience, 35 (28), Art.-Nr.: e7871. doi:10.1002/cpe.7871

Mixed Precision Algebraic Multigrid on GPUs
Tsai, Y.-H. M.; Beams, N.; Anzt, H.
2023. Parallel Processing and Applied Mathematics – 14th International Conference, PPAM 2022, Gdansk, Poland, September 11–14, 2022, Revised Selected Papers, Part I. Ed.: R. Wyrzykowski, 113–125, Springer International Publishing. doi:10.1007/978-3-031-30442-2_9

A Mixed Precision Randomized Preconditioner for the LSQR Solver on GPUs
Georgiou, V.; Boutsikas, C.; Drineas, P.; Anzt, H.
2023. High Performance Computing – 38th International Conference, ISC High Performance 2023, Hamburg, Germany, May 21–25, 2023, Proceedings. Ed.: A. Bhatele, 164–181, Springer Nature Switzerland. doi:10.1007/978-3-031-32041-5_9

Using Ginkgo’s memory accessor for improving the accuracy of memory-bound low precision BLAS
Grützmacher, T.; Anzt, H.; Quintana-Ortí, E. S.
2023. Software - Practice and Experience, 53 (1), 81–98. doi:10.1002/spe.3041

2022

Implementing Asynchronous Jacobi Iteration on GPUs
Tsai, Y.-H. M.; Nayak, P.; Chow, E.; Anzt, H.
2022. 2022 IEEE/ACM Workshop on Latest Advances in Scalable Algorithms for Large-Scale Heterogeneous Systems (ScalAH), Dallas, TX, USA, November 13–18, 2022, 1–9, Institute of Electrical and Electronics Engineers (IEEE). doi:10.1109/ScalAH56622.2022.00006

Compression and load balancing for efficient sparse matrix-vector product on multicore processors and graphics processing units
Aliaga, J. I.; Anzt, H.; Grützmacher, T.; Quintana-Ortí, E. S.; Tomás, A. E.
2022. Concurrency and Computation: Practice and Experience, 34 (14), Art.-Nr.: e6515. doi:10.1002/cpe.6515

Ginkgo—A math library designed for platform portability
Cojean, T.; Tsai, Y.-H. M.; Anzt, H.
2022. Parallel Computing, 111, Art.-Nr.: 102902. doi:10.1016/j.parco.2022.102902

Resiliency in numerical algorithm design for extreme scale simulations
Agullo, E.; Altenbernd, M.; Anzt, H.; Bautista-Gomez, L.; Benacchio, T.; Bonaventura, L.; Bungartz, H.-J.; Chatterjee, S.; Ciorba, F. M.; DeBardeleben, N.; Drzisga, D.; Eibl, S.; Engelmann, C.; Gansterer, W. N.; Giraud, L.; Göddeke, D.; Heisig, M.; Jézéquel, F.; Kohl, N.; Li, X. S.; Lion, R.; Mehl, M.; Mycek, P.; Obersteiner, M.; Quintana-Ortí, E. S.; Rizzi, F.; Rüde, U.; Schulz, M.; Fung, F.; Speck, R.; Stals, L.; Teranishi, K.; Thibault, S.; Thönnes, D.; Wagner, A.; Wohlmuth, B.
2022. International Journal of High Performance Computing Applications, 36 (2), 251–285. doi:10.1177/10943420211055188

Preconditioners for Batched Iterative Linear Solvers on GPUs
Aggarwal, I.; Nayak, P.; Kashi, A.; Anzt, H.
2022. Accelerating Science and Engineering Discoveries Through Integrated Research Infrastructure for Experiment, Big Data, Modeling and Simulation – 22nd Smoky Mountains Computational Sciences and Engineering Conference, SMC 2022, Virtual Event, August 23–25, 2022, Revised Selected Papers. Ed.: K. Doug, 38–53, Springer Nature Switzerland. doi:10.1007/978-3-031-23606-8_3

Prediction of Optimal Solvers for Sparse Linear Systems Using Deep Learning
Funk, Y.; Götz, M.; Anzt, H.
2022. Proceedings of the 2022 SIAM Conference on Parallel Processing for Scientific Computing (PP). Ed.: X. Li, 14–24, Society for Industrial and Applied Mathematics (SIAM). doi:10.1137/1.9781611977141.2

Providing performance portable numerics for Intel GPUs
Tsai, Y.-H. M.; Cojean, T.; Anzt, H.
2022. Concurrency and Computation: Practice and Experience, 35 (20), Art.-Nr.: e7400. doi:10.1002/cpe.7400

Compressed basis GMRES on high-performance graphics processing units
Aliaga, J. I.; Anzt, H.; Grützmacher, T.; Quintana-Ortí, E. S.; Tomás, A. E.
2022. The International Journal of High Performance Computing Applications, 37 (2), 82–100. doi:10.1177/10943420221115140

Porting Sparse Linear Algebra to Intel GPUs
Tsai, Y. M.; Cojean, T.; Anzt, H.
2022. Euro-Par 2021: Parallel Processing Workshops – Euro-Par 2021 International Workshops, Lisbon, Portugal, August 30-31, 2021, Revised Selected Papers. Ed.: R. Chaves, 57–68, Springer International Publishing. doi:10.1007/978-3-031-06156-1_5

Ginkgo: A Modern Linear Operator Algebra Framework for High Performance Computing
Anzt, H.; Cojean, T.; Flegar, G.; Göbel, F.; Grützmacher, T.; Nayak, P.; Ribizel, T.; Tsai, Y. M.; Quintana-Ortí, E. S.
2022. ACM Transactions on Mathematical Software, 48 (1), Art.-Nr.: 2. doi:10.1145/3480935

2021

A survey of numerical linear algebra methods utilizing mixed-precision arithmetic
Abdelfattah, A.; Anzt, H.; Boman, E. G.; Carson, E.; Cojean, T.; Dongarra, J.; Fox, A.; Gates, M.; Higham, N. J.; Li, X. S.; Loe, J.; Luszczek, P.; Pranesh, S.; Rajamanickam, S.; Ribizel, T.; Smith, B. F.; Swirydowicz, K.; Thomas, S.; Tomov, S.; Tsai, Y. M.; Yang, U. M.
2021. International Journal of High Performance Computing Applications, 35 (4), 344–369. doi:10.1177/10943420211003313

Evaluating asynchronous Schwarz solvers on GPUs
Nayak, P.; Cojean, T.; Anzt, H.
2021. The international journal of high performance computing applications, 35 (3), 226–236. doi:10.1177/1094342020946814

Adaptive Precision Block-Jacobi for High Performance Preconditioning in the Ginkgo Linear Algebra Software
Flegar, G.; Anzt, H.; Cojean, T.; Quintana-Ortí, E. S.
2021. ACM transactions on mathematical software, 47 (2), 1–28. doi:10.1145/3441850

Batched Sparse Iterative Solvers for Computational Chemistry Simulations on GPUs
Aggarwal, I.; Kashi, A.; Nayak, P.; Balos, C. J.; Woodward, C. S.; Anzt, H.
2021. Proceedings of ScalA 2021: 12th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems: Held in conjunction with SC21: The International Conference for High Performance Computing, Networking, Storage and Analysis ; St. Louis, Missouri, USA, November 14-19, 2021, 35–43, Institute of Electrical and Electronics Engineers (IEEE). doi:10.1109/ScalA54577.2021.00010

Mixed Precision Incomplete and Factorized Sparse Approximate Inverse Preconditioning on GPUs
Göbel, F.; Grützmacher, T.; Ribizel, T.; Anzt, H.
2021. Euro-Par 2021: Parallel Processing: 27th International Conference on Parallel and Distributed Computing, Lisbon, Portugal, September 1–3, 2021, Proceedings. Ed.: L. Sousa, 550–564, Springer-Verlag. doi:10.1007/978-3-030-85665-6_34

A Collaborative Peer Review Process for Grading Coding Assignments
Nayak, P.; Göbel, F.; Anzt, H.
2021. Computational Science – ICCS 2021: 21st International Conference, Krakow, Poland, June 16–18, 2021, Proceedings, Part VI. Ed.: M. Paszynski, 654–660, Springer-Verlag. doi:10.1007/978-3-030-77980-1_49

An environment for sustainable research software in Germany and beyond: current state, open challenges, and call for action
Anzt, H.; Bach, F.; Druskat, S.; Löffler, F.; Loewe, A.; Renard, B. Y.; Seemann, G.; Struck, A.; Achhammer, E.; Aggarwal, P.; Appel, F.; Bader, M.; Brusch, L.; Busse, C.; Chourdakis, G.; Dabrowski, P. W.; Ebert, P.; Flemisch, B.; Friedl, S.; Fritzsch, B.; Funk, M. D.; Gast, V.; Goth, F.; Grad, J.-N.; Hegewald, J.; Hermann, S.; Hohmann, F.; Janosch, S.; Kutra, D.; Linxweiler, J.; Muth, T.; Peters-Kottig, W.; Rack, F.; Raters, F. H. C.; Rave, S.; Reina, G.; Reißig, M.; Ropinski, T.; Schaarschmidt, J.; Seibold, H.; Thiele, J. P.; Uekermann, B.; Unger, S.; Weeber, R.
2021. F1000Research, 9, 295. doi:10.12688/f1000research.23224.2

Balanced and Compressed Coordinate Layout for the Sparse Matrix-Vector Product on GPUs
Aliaga, J. I.; Anzt, H.; Quintana-Ortí, E. S.; Tomás, A. E.; Tsai, Y. M.
2021. Euro-Par 2020: Parallel Processing Workshops: Euro-Par 2020 International Workshops, Warsaw, Poland, August 24–25, 2020, Revised Selected Papers. Ed.: B. Balis, 83–95, Springer-Verlag. doi:10.1007/978-3-030-71593-9_7

Preparing Ginkgo for AMD GPUs – A Testimonial on Porting CUDA Code to HIP
Tsai, Y. M.; Cojean, T.; Ribizel, T.; Anzt, H.
2021. Euro-Par 2020: Parallel Processing Workshops: Euro-Par 2020 International Workshops, Warsaw, Poland, August 24–25, 2020, Revised Selected Papers. Ed.: B. Balis, 109–121, Springer-Verlag. doi:10.1007/978-3-030-71593-9_9

Crediting pull requests to open source research software as an academic contribution
Anzt, H.; Kuehn, E.; Flegar, G.
2021. Journal of computational science, 49, Art.-Nr.: 101278. doi:10.1016/j.jocs.2020.101278

2020

A Guide for Publishing, Using, and Licensing Research Software in Germany
Struck, A.; Loewe, A.; Achhammer, E.; Rack, F.; Bach, F.; Löffler, F.; Seemann, G.; Anzt, H.; Funk, M.; Unger, S.; Druskat, S.; Friedl, S.
2020. Zenodo. doi:10.5281/zenodo.4327148

A customized precision format based on mantissa segmentation for accelerating sparse linear algebra
Grützmacher, T.; Cojean, T.; Flegar, G.; Göbel, F.; Anzt, H.
2020. Concurrency and computation, 32 (15), Article: e5418. doi:10.1002/cpe.5418

Preparing Ginkgo for AMD GPUs – A Testimonial on Porting CUDA Code to HIP
Tsai, Y. M.; Cojean, T.; Ribizel, T.; Anzt, H.
2020. doi:10.5445/IR/1000131542

Two-stage Asynchronous Iterative Solvers for multi-GPU Clusters
Nayak, P.; Cojean, T.; Anzt, H.
2020. Proceedings of ScalA 2020: 11th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, Held in conjunction with SC 2020: The International Conference for High Performance Computing, Networking, Storage and Analysis, 13 November 2020, online, 9–18, Institute of Electrical and Electronics Engineers (IEEE). doi:10.1109/ScalA51936.2020.00007

Evaluating the Performance of NVIDIA’s A100 Ampere GPU for Sparse and Batched Computations
Anzt, H.; Tsai, Y. M.; Abdelfattah, A.; Cojean, T.; Dongarra, J.
2020. Proceedings of PMBS 2020: Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems : Held in conjunction with SC20: The International Conference for High Performance Computing, Networking, Storage and Analysis ; Virtual Conference, November 9-19, 2020, 26–38, Institute of Electrical and Electronics Engineers (IEEE). doi:10.1109/PMBS51919.2020.00009

Scalable Data Generation for Evaluating Mixed-Precision Solvers
Luszczek, P.; Tsai, Y.; Lindquist, N.; Anzt, H.; Dongarra, J.
2020. 2020 IEEE High Performance Extreme Computing Conference (HPEC): September 21 – 25, 2020, Virtual, Institute of Electrical and Electronics Engineers (IEEE). doi:10.1109/HPEC43674.2020.9286145

Sparse Linear Algebra on AMD and NVIDIA GPUs – The Race Is On
Tsai, Y. M.; Cojean, T.; Anzt, H.
2020. High Performance Computing – 35th International Conference, ISC High Performance 2020, Frankfurt/Main, Germany, June 22–25, 2020, Proceedings. Ed.: P. Sadayappan, 309–327, Springer International Publishing. doi:10.1007/978-3-030-50743-5_16

Evaluating asynchronous Schwarz solvers on GPUs
Nayak, P.; Cojean, T.; Anzt, H.
2020

Multiprecision Block-Jacobi for iterative triangular solves
Goebel, F.; Anzt, H.; Cojean, T.; Flegar, G.; Quintana-Ortí, E. S.
2020. Lecture notes in computer science, 546–560, Springer. doi:10.1007/978-3-030-57675-2_34

An environment for sustainable research software in Germany and beyond: current state, open challenges, and call for action
Anzt, H.; Bach, F.; Druskat, S.; Löffler, F.; Loewe, A.; Renard, B. Y.; Seemann, G.; Struck, A.; Achhammer, E.; Aggarwal, P.; Appel, F.; Bader, M.; Brusch, L.; Busse, C.; Chourdakis, G.; Dabrowski, P. W.; Ebert, P.; Flemisch, B.; Friedl, S.; Fritzsch, B.; Funk, M. D.; Gast, V.; Goth, F.; Grad, J.-N.; Hermann, S.; Hohmann, F.; Janosch, S.; Kutra, D.; Linxweiler, J.; Muth, T.; Peters-Kottig, W.; Rack, F.; Raters, F. H. C.; Rave, S.; Reina, G.; Reißig, M.; Ropinski, T.; Schaarschmidt, J.; Seibold, H.; Thiele, J. P.; Uekermann, B.; Unger, S.; Weeber, R.
2020. F1000Research, 9, Article no: 295. doi:10.12688/f1000research.23224.1

Load-balancing Sparse Matrix Vector Product Kernels on GPUs
Anzt, H.; Cojean, T.; Yen-Chen, C.; Dongarra, J.; Flegar, G.; Nayak, P.; Tomov, S.; Tsai, Y. M.; Wang, W.
2020. ACM Transactions on Parallel Computing, 7 (1), Article: 2. doi:10.1145/3380930

Acceleration of PageRank with Customized Precision Based on Mantissa Segmentation
Grützmacher, T.; Cojean, T.; Flegar, G.; Anzt, H.; Quintana-Ortí, E. S.
2020. ACM Transactions on Parallel Computing, 7 (1), Article: 4. doi:10.1145/3380934

2019

Towards Continuous Benchmarking – An Automated Performance Evaluation Framework for High Performance Software
Anzt, H.; Chen, Y.-C.; Cojean, T.; Dongarra, J.; Flegar, G.; Nayak, P.; Quintana-Ortí, E. S.; Tsai, Y. M.; Wang, W.
2019. Proceedings of the Platform for Advanced Scientific Computing Conference, Article no: 9, Association for Computing Machinery (ACM). doi:10.1145/3324989.3325719

Adaptive precision in block-Jacobi preconditioning for iterative sparse linear system solvers
Anzt, H.; Dongarra, J.; Flegar, G.; Higham, N. J.; Quintana-Ortí, E. S.
2019. Concurrency and computation, 31 (6), e4460. doi:10.1002/cpe.4460

Parallel selection on GPUs
Ribizel, T.; Anzt, H.
2019. Parallel computing, 91, Article: 102588. doi:10.1016/j.parco.2019.102588

ParILUT - A parallel threshold ILU for GPUs
Anzt, H.; Ribizel, T.; Flegar, G.; Chow, E.; Dongarra, J.
2019. Proceedings 2019 IEEE 33rd International Parallel and Distributed Processing Symposium, IPDPS 2019: 20-24 May 2019, Rio de Janeiro, Brazil, 231–241, Institute of Electrical and Electronics Engineers (IEEE). doi:10.1109/IPDPS.2019.00033

PAPI software-defined events for in-depth performance analysis
Jagode, H.; Danalis, A.; Anzt, H.; Dongarra, J.
2019. The international journal of high performance computing applications, 33 (6), 1113–1127. doi:10.1177/1094342019846287

Are we doing the right thing? - A critical analysis of the academic HPC community
Anzt, H.; Flegar, G.
2019. 2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 739–745, Institute of Electrical and Electronics Engineers (IEEE). doi:10.1109/IPDPSW.2019.00122

Approximate and exact selection on GPUs
Ribizel, T.; Anzt, H.
2019. 2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 471–478, Institute of Electrical and Electronics Engineers (IEEE). doi:10.1109/IPDPSW.2019.00088

Residual Replacement in Mixed-Precision Iterative Refinement for Sparse Linear Systems
Anzt, H.; Flegar, G.; Novaković, V.; Quintana-Ortí, E. S.; Tomás, A. E.
2019. International Conference on High Performance Computing, ISC High Performance 2018; Frankfurt; Germany; 28 June 2018. Ed.: M. Weiland, 554–561, Springer. doi:10.1007/978-3-030-02465-9_39

Toward a modular precision ecosystem for high-performance computing
Anzt, H.; Flegar, G.; Grützmacher, T.; Quintana-Ortí, E. S.
2019. The international journal of high performance computing applications, 33 (6), 1069–1078. doi:10.1177/1094342019846547

Machine learning-aided numerical linear Algebra: Convolutional neural networks for the efficient preconditioner generation
Götz, M.; Anzt, H.
2019. Proceedings of ScalA 2018: 9th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, 49–56, Institute of Electrical and Electronics Engineers (IEEE). doi:10.1109/ScalA.2018.00010

High-Performance GPU Implementation of PageRank with Reduced Precision Based on Mantissa Segmentation
Grützmacher, T.; Anzt, H.; Scheidegger, F.; Quintana-Ortí, E. S.
2019. Proceedings of IA³ 2018: 8th Workshop on Irregular Applications: Architectures and Algorithms, 61–68, Institute of Electrical and Electronics Engineers (IEEE). doi:10.1109/IA3.2018.00015

Variable-Size Batched Condition Number Calculation on GPUs
Anzt, H.; Dongarra, J.; Flegar, G.; Grützmacher, T.
2019. 2018 30th International Symposium on Computer Architecture and High Performance Computing: SBAC-PAD 2018 ; Lyon, France, 24-27 September 2018 ; Proceedings, 132–139, Institute of Electrical and Electronics Engineers (IEEE). doi:10.1109/CAHPC.2018.8645907

A Jaccard Weights Kernel Leveraging Independent Thread Scheduling on GPUs
Anzt, H.; Dongarra, J.
2019. 2018 30th International Symposium on Computer Architecture and High Performance Computing: SBAC-PAD 2018 ; Lyon, France, 24-27 September 2018 ; Proceedings, 229–232, Institute of Electrical and Electronics Engineers (IEEE). doi:10.1109/CAHPC.2018.8645946

A modular precision format for decoupling arithmetic format and storage format
Grützmacher, T.; Anzt, H.
2019. 24th International Conference on Parallel and Distributed Computing, Euro-Par 2018; Turin; Italy; 27 August 2018 through 28 August 2018. Ed.: G. Mencagli, 434–443, Springer. doi:10.1007/978-3-030-10549-5_34

2018

Using Jacobi iterations and blocking for solving sparse triangular systems in incomplete factorization preconditioning
Chow, E.; Anzt, H.; Scott, J.; Dongarra, J.
2018. Journal of parallel and distributed computing, 119, 219–230. doi:10.1016/j.jpdc.2018.04.017

Optimization and performance evaluation of the IDR iterative Krylov solver on GPUs
Anzt, H.; Kreutzer, M.; Ponce, E.; Peterson, G. D.; Wellein, G.; Dongarra, J.
2018. The international journal of high performance computing applications, 32 (2), 220–230. doi:10.1177/1094342016646844

Machine learning-aided numerical linear algebra: Convolutional neural network for the efficient preconditioner generation
Götz, M.; Anzt, H.
2018. ScalA18: 9th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, Dallas, TX, November 12, 2018

ParILUT - A New Parallel Threshold ILU Factorization
Anzt, H.; Chow, E.; Dongarra, J.
2018. SIAM journal on scientific computing, 40 (4), C503–C519. doi:10.1137/16M1079506

Variable-size batched Gauss–Jordan elimination for block-Jacobi preconditioning on graphics processors
Anzt, H.; Dongarra, J.; Flegar, G.; Quintana-Ortí, E. S.
2018. Parallel computing, 81, 131–146. doi:10.1016/j.parco.2017.12.006

Incomplete Sparse Approximate Inverses for Parallel Preconditioning
Anzt, H.; Huckle, T. K.; Bräckle, J.; Dongarra, J.
2018. Parallel computing, 71, 1–22. doi:10.1016/j.parco.2017.10.003
