Research

The various HPC systems operated by GWDG are also the subject of GWDG's own research on HPC methods, the results of which are used to improve operation and/or the user experience. In this context, GWDG also participates in several third-party funded projects.

Alongside service-oriented research, academic teaching in computer science is a focus of our work. For this reason, we are involved in the education of students in a variety of ways. GWDG currently has three research groups whose teaching is anchored at the Institute of Computer Science at the University of Göttingen and whose course content is part of several degree programs.

Recent HPC Publications

    2022

  • Improve the Deep Learning Models in Forestry Based on Explanations and Expertise (Ximeng Cheng, Ali Doosthosseini, Julian Kunkel), In Frontiers in Plant Science, Schloss Dagstuhl -- Leibniz-Zentrum für Informatik, ISSN: 1664-462X, 2022-05-01
    @article{ITDLMIFBOE22,
    	abstract = {In forestry studies, deep learning models have achieved excellent performance in many application scenarios (e.g., detecting forest damage). However, the unclear model decisions (i.e., black-box) undermine the credibility of the results and hinder their practicality. This study intends to obtain explanations of such models through the use of explainable artificial intelligence methods, and then use feature unlearning methods to improve their performance, which is the first such attempt in the field of forestry. Results of three experiments show that the model training can be guided by expertise to gain specific knowledge, which is reflected by explanations. For all three experiments based on synthetic and real leaf images, the improvement of models is quantified in the classification accuracy (up to 4.6\%) and three indicators of explanation assessment (i.e., root-mean-square error, cosine similarity, and the proportion of important pixels). Besides, the introduced expertise in annotation matrix form was automatically created in all experiments. This study emphasizes that studies of deep learning in forestry should not only pursue model performance (e.g., higher classification accuracy) but also focus on the explanations and try to improve models according to the expertise.},
    	author = {Ximeng Cheng and Ali Doosthosseini and Julian Kunkel},
    	doi = {10.3389/fpls.2022.902105},
    	issn = {1664-462X},
    	journal = {Frontiers in Plant Science},
    	publisher = {Schloss Dagstuhl -- Leibniz-Zentrum für Informatik},
    	title = {Improve the Deep Learning Models in Forestry Based on Explanations and Expertise},
    	year = {2022},
    	month = {05},
    }
    2021

  • User-Centric System Fault Identification Using IO500 Benchmark (Radita Liem, Dmytro Povaliaiev, Jay Lofstead, Julian Kunkel, Christian Terboven), pp. 35-40, IEEE, 2021-12-01
    @inproceedings{USFIUIBLPL21,
    	abstract = {I/O performance in a multi-user environment is difficult to predict. Users do not know what I/O performance to expect when running and tuning applications. We propose to use the IO500 benchmark as a way to guide user expectations on their application’s performance and to aid identifying root causes of their I/O problems that might come from the system. Our experiments describe how we manage user expectation with IO500 and provide a mechanism for system fault identification. This work also provides us with information of the tail latency problem that needs to be addressed and granular information about the impact of I/O technique choices (POSIX and MPI-IO).},
    	author = {Radita Liem and Dmytro Povaliaiev and Jay Lofstead and Julian Kunkel and Christian Terboven},
    	booktitle = {2021 IEEE/ACM Sixth International Parallel Data Systems Workshop (PDSW)},
    	conference = {International Parallel Data Systems Workshop (PDSW)},
    	doi = {10.1109/PDSW54622.2021.00011},
    	location = {St. Louis},
    	pages = {35--40},
    	publisher = {IEEE},
    	title = {User-Centric System Fault Identification Using IO500 Benchmark},
    	year = {2021},
    	month = {12},
    }
  • Understanding I/O Behavior in Scientific and Data-Intensive Computing (Dagstuhl Seminar 21332) (Philip Carns, Julian Kunkel, Kathryn Mohror, Martin Schulz), In Dagstuhl Reports, pp. 16-75, Schloss Dagstuhl -- Leibniz-Zentrum für Informatik, ISSN: 2192-5283, 2021-09-14
    @article{UIBISADCSC21,
    	abstract = {Two key changes are driving an immediate need for deeper understanding of I/O workloads in high-performance computing (HPC): applications are evolving beyond the traditional bulk-synchronous models to include integrated multistep workflows, in situ analysis, artificial intelligence, and data analytics methods; and storage systems designs are evolving beyond a two-tiered file system and archive model to complex hierarchies containing temporary, fast tiers of storage close to compute resources with markedly different performance properties. Both of these changes represent a significant departure from the decades-long status quo and require investigation from storage researchers and practitioners to understand their impacts on overall I/O performance. Without an in-depth understanding of I/O workload behavior, storage system designers, I/O middleware developers, facility operators, and application developers will not know how best to design or utilize the additional tiers for optimal performance of a given I/O workload. The goal of this Dagstuhl Seminar was to bring together experts in I/O performance analysis and storage system architecture to collectively evaluate how our community is capturing and analyzing I/O workloads on HPC systems, identify any gaps in our methodologies, and determine how to develop a better in-depth understanding of their impact on HPC systems. Our discussions were lively and resulted in identifying critical needs for research in the area of understanding I/O behavior. We document those discussions in this report.},
    	author = {Philip Carns and Julian Kunkel and Kathryn Mohror and Martin Schulz},
    	doi = {10.4230/DagRep.11.7.16},
    	issn = {2192-5283},
    	journal = {Dagstuhl Reports},
    	pages = {16--75},
    	publisher = {Schloss Dagstuhl -- Leibniz-Zentrum für Informatik},
    	title = {Understanding I/O Behavior in Scientific and Data-Intensive Computing (Dagstuhl Seminar 21332)},
    	url = {https://drops.dagstuhl.de/opus/volltexte/2021/15589},
    	year = {2021},
    	month = {09},
    }
An overview of all GWDG publications can be found here.

Open Projects and Bachelor's, Master's, and Doctoral Theses

Topic
Professor
Type
Token Management for an API to utilise HPC resources in generic workflows
Prof. Ramin Yahyapour
BSc, MSc

Supervisor: Sven Bingert
Cluster on Demand with Kubernetes
Prof. Julian Kunkel
BSc, MSc, PhD

Supervisor: Christian Boehme
Parallel Applications with Containers
Prof. Julian Kunkel
BSc, MSc, PhD
Parallel applications on HPC systems often rely on system-specific MPI (Message Passing Interface) and interconnect libraries, for example for InfiniBand or Omni-Path networks. This partially offsets one main advantage of containerizing such applications, namely portability between different platforms. The goal of this project is to evaluate different ways of integrating system-specific communication libraries into containers so that these containers can be ported to a different platform with minimal effort. A proof of concept (PoC) should be implemented and benchmarked against running natively on a system.
Supervisor: Christian Boehme
Digital Twin of the Data Center: Creating a 3D Model of the GWDG Data Center for Virtual Reality Walkthroughs
Prof. Julian Kunkel
BSc, MSc

Supervisor:
Digital Teaching: Developing Exam Scenarios for HPC Skills
Prof. Julian Kunkel
BSc, MSc

Supervisor:
Developing a Provenance-Aware Ad Hoc Interface for a Data Lake
Prof. Julian Kunkel
BSc, MSc

Supervisor: Hendrik Nolte
Semantic Classification of Metadata Attributes in a Data Lake Using Machine Learning
Prof. Julian Kunkel
BSc, MSc

Supervisor: Hendrik Nolte
Governance for a Data Lake
Prof. Julian Kunkel
BSc, MSc

Supervisor: Hendrik Nolte
Authentication in HPC via Web API
Prof. Julian Kunkel
BSc, MSc

Supervisor: Christian Köhler
Performance Comparison of Remote Visualization Techniques
Prof. Julian Kunkel
BSc, MSc

Supervisor: Azat Khuziyakhmetov
Recommendation System for Performance Monitoring and Analysis in HPC
Prof. Julian Kunkel
BSc, MSc

Supervisor: Azat Khuziyakhmetov
Monitoring and Evaluating Application Usage in the Data Center
Prof. Julian Kunkel
BSc, MSc

Supervisor: Marcus Vincent Boden
Parallelizing Iterative Optimization Algorithms for Image Processing with MPI
Prof. Julian Kunkel
BSc, MSc

Supervisor: Jack Ogaja
Contributing Unused HPC Resources to Grid Computing Projects with BOINC via Backfilling
Prof. Julian Kunkel
BSc, MSc

Supervisor:
Scaling Up Single-Cell Analysis with HPC
Prof. Julian Kunkel
BSc, MSc

Supervisor: Stefanie Mühlhausen
Benchmarking AlphaFold and Alternative Models for Protein Structure Prediction on HPC
Prof. Julian Kunkel
BSc, MSc

Supervisor: Stefanie Mühlhausen
Prototyping and Benchmarking Workflows for Reconstructing Phylogenetic Trees on HPC
Prof. Julian Kunkel
BSc, MSc

Supervisor: Stefanie Mühlhausen