Research

Our projects

Some of our recent projects are described below.

Network-based genome analysis

The Herwig Lab maintains ConsensusPathDB, a meta-database of human (as well as mouse and yeast) molecular interactions integrated from 33 public resources [Kamburov and Herwig, Nucleic Acids Research, 2022]. The integrated protein-protein interaction (PPI) network serves as a scaffold for genome analysis. To analyze experimental data at the network level, we developed NetCore, a mathematical framework based on random walk with restart (Fig. 1) [Barel and Herwig, Nucleic Acids Research, 2020]. Unlike traditional approaches that use node degree, NetCore uses node coreness for re-ranking, which makes it particularly robust against biases in PPI experiments.
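
The idea of a random walk with restart that weights neighbors by coreness rather than degree can be illustrated with a minimal sketch. This is not the NetCore implementation: the toy graph, the function names, the restart value, and the exact way coreness enters the transition weights are assumptions made for illustration only.

```python
from collections import defaultdict


def core_numbers(adj):
    """k-core decomposition by repeatedly peeling the minimum-degree node."""
    degree = {v: len(nb) for v, nb in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=lambda n: (degree[n], n))
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


def random_walk_with_restart(adj, seeds, weight, restart=0.5,
                             tol=1e-12, max_iter=1000):
    """Iterate p <- restart * p0 + (1 - restart) * W p until convergence.

    A walker at node u moves to neighbor v with probability proportional
    to weight[v] (assumed normalization, chosen for this sketch).
    """
    nodes = sorted(adj)
    total = sum(seeds.values())
    p0 = {v: seeds.get(v, 0.0) / total for v in nodes}
    norm = {u: sum(weight[v] for v in adj[u]) for u in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {v: restart * p0[v] for v in nodes}
        for u in nodes:
            if norm[u] == 0:  # isolated node keeps its probability mass
                nxt[u] += (1.0 - restart) * p[u]
                continue
            for v in adj[u]:
                nxt[v] += (1.0 - restart) * p[u] * weight[v] / norm[u]
        delta = sum(abs(nxt[v] - p[v]) for v in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy network: a triangle (A, B, C) attached to a hub D with leaves.
edges = [("A", "B"), ("A", "C"), ("B", "C"),
         ("C", "D"), ("D", "E"), ("D", "F"), ("D", "G")]
adj = defaultdict(set)
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)
adj = dict(adj)

core = core_numbers(adj)
degree = {v: len(adj[v]) for v in adj}
seeds = {"C": 1.0}  # hypothetical experimental hit

scores_core = random_walk_with_restart(adj, seeds, weight=core)
scores_degree = random_walk_with_restart(adj, seeds, weight=degree)

for v in sorted(adj):
    print(v, core[v], degree[v],
          round(scores_core[v], 4), round(scores_degree[v], 4))
```

In this toy graph the hub D has the highest degree but a coreness of only 1, so the two weightings treat it differently. Comparing the two score columns shows how the choice of node weight changes the ranking after propagation.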

In collaboration with the German Diabetes Center Düsseldorf, we applied NetCore to time-resolved phosphorylation data to study the response to insulin stimulation in human muscle and identified kinase network modules as insulin targets [Turewicz et al., Nature Communications, 2025]. With the Institut Pasteur de Tunis, we examined dynamic gene expression data to track network changes in susceptible versus resistant mice during Leishmania major infection [Bouabid et al., Frontiers in Immunology, 2023]. Internally, we collaborate with the Metzger Lab, using network propagation to integrate transcriptomic and proteomic data from diverse pig breeds, with the aim of identifying pathways and network modules that influence body size.

Atanas Kamburov and Ralf Herwig, "ConsensusPathDB 2022: molecular interactions update as a resource for network biology," Nucleic Acids Research 50 (D1), D587-D595 (2022).
Gal Barel and Ralf Herwig, "NetCore: a network propagation approach using node coreness," Nucleic Acids Research 48 (17), e98 (2020).
Michael Turewicz, Christine Skagen, Sonja Hartwig, Stephan Majda, Kristina Thedinga, Ralf Herwig, Christian Binsch, Delsi Altenhofen, D. Margriet Ouwens, Pia Marlene Förster, Thorsten Wachtmeister, Karl Köhrer, Torben Stermann, Alexandra Chadt, Stefan Lehr, Tobias Marschall, G. Hege Thoresen, and Hadi Al-Hasani, "Temporal phosphoproteomics reveals circuitry of phased propagation in insulin signaling," Nature Communications 16 (1), Article 1570 (2025).

Software development for long-read transcriptome sequencing (LRTS)

Our lab developed IsoTools (Fig. 2A), a comprehensive pipeline for mapping, annotation, and statistical analysis of third-generation PacBio Iso-Seq and Oxford Nanopore LRTS data. The package includes data quality control steps, isoform identification and quantification, detection of (coordinated) splicing events, and statistical tests for differential splicing [Lienhard et al., Bioinformatics, 2023]. With IsoTools we participated in LRGASP, an international challenge on long-read transcript identification and quantification organized by the GENCODE consortium, where the software ranked among the top-performing tools in isoform quantification [Pardo-Palacios et al., Nature Methods, 2024]. We recently enhanced IsoTools with new features, including transcription start site detection from long reads, gene model statistics such as entropy, improved visualization components, and functional annotation using protein domains (Fig. 2B, C) [Bi et al., J Mol Biol, 2025].
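
As an illustration of what an entropy statistic for a gene model can capture, the sketch below computes the Shannon entropy of isoform usage from per-isoform read counts. It does not use the IsoTools API, and how IsoTools defines its entropy statistic is not specified here; the function name and the example counts are hypothetical.

```python
import math


def isoform_entropy(read_counts):
    """Shannon entropy (in bits) of isoform usage for one gene.

    read_counts: number of long reads supporting each isoform.
    Returns 0.0 when a single isoform carries all reads.
    """
    total = sum(read_counts)
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in read_counts:
        if count > 0:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy


# Hypothetical read counts per isoform for three genes
single_dominant = isoform_entropy([40, 0, 0])        # one isoform used
evenly_split = isoform_entropy([10, 10, 10, 10])     # four isoforms, equal usage
skewed = isoform_entropy([90, 5, 5])                 # one major, two minor isoforms
```

Under this definition, a gene expressed from a single isoform has entropy 0, and a gene with four equally used isoforms has entropy 2 bits; skewed usage falls in between.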


F. J. Pardo-Palacios, D. Wang, F. Reese, M. Diekhans, S. Carbonell-Sala, B. Williams, J. E. Loveland, M. De Maria, M. S. Adams, G. Balderrama-Gutierrez, A. K. Behera, J. M. Gonzalez, T. Hunt, J. Lagarde, C. E. Liang, H. Li, M. Jerryd Meade, D. A. Moraga Amador, A. D. Prjibelski, I. Birol, H. Bostan, A. M. Brooks, M. Hasan Celik, Y. Chen, M. R. M. Du, C. Felton, J. Goke, S. Hafezqorani, R. Herwig, H. Kawaji, J. Lee, J. Liang Li, M. Lienhard, A. Mikheenko, D. Mulligan, K. Ming Nip, M. Pertea, M. E. Ritchie, A. D. Sim, A. D. Tang, Y. Kei Wan, C. Wang, B. Y. Wong, C. Yang, I. Barnes, A. Berry, S. Capella, N. Dhillon, J. M. Fernandez-Gonzalez, L. Ferrandez-Peral, N. Garcia-Reyero, S. Goetz, C. Hernandez-Ferrer, L. Kondratova, T. Liu, A. Martinez-Martin, C. Menor, J. Mestre-Tomas, J. M. Mudge, N. G. Panayotova, A. Paniagua, D. Repchevsky, E. Rouchka, B. Saint-John, E. Sapena, L. Sheynkman, M. Laird Smith, M. M. Suner, H. Takahashi, I. A. Youngworth, P. Carninci, N. D. Denslow, R. Guigo, M. E. Hunter, H. U. Tilgner, B. J. Wold, C. Vollmers, A. Frankish, K. Fai Au, G. M. Sheynkman, A. Mortazavi, A. Conesa, and A. N. Brooks, "Systematic assessment of long-read RNA-seq methods for transcript identification and quantification," Nature Methods 21 (7), 1349-1363 (2024).
Yalan Bi, Tom Lukas Lankenau, Matthias Lienhard, and Ralf Herwig, "IsoTools 2.0: Software for Comprehensive Analysis of Long-read Transcriptome Sequencing Data," Journal of Molecular Biology, Article 169049 (2025).

Effects of splicing factor mutations in cancer patients

In a project funded by the German Research Foundation (DFG), we collaborated with groups at the universities of Cologne and Frankfurt to apply long-read transcriptome sequencing (LRTS) to investigate aberrant splicing induced by mutations in the splicing factor SF3B1 in chronic lymphocytic leukemia (CLL) and myelodysplastic syndrome (MDS) patients (Fig. 3A). IsoTools analyses revealed that SF3B1 hot-spot mutations specifically impact 3’ alternative splicing (3’AS; Fig. 3B) and identified a preferential selection of alternative splice sites located 12 to 21 bp upstream of the canonical splice site (Fig. 3C).
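
The positional preference described above can be expressed as a signed, strand-aware offset between an alternative and a canonical 3’ splice site. The following is a minimal sketch with hypothetical coordinates and function names; it is not taken from IsoTools and only illustrates the arithmetic.

```python
def acceptor_offset(canonical_pos, alternative_pos, strand):
    """Signed distance in bp from the canonical to the alternative 3' splice site.

    Negative values mean the alternative site lies upstream in transcript
    direction: lower coordinates on '+', higher coordinates on '-'.
    """
    if strand == "+":
        return alternative_pos - canonical_pos
    if strand == "-":
        return canonical_pos - alternative_pos
    raise ValueError("strand must be '+' or '-'")


def fraction_in_window(offsets, lower=-21, upper=-12):
    """Fraction of offsets falling inside [lower, upper] (inclusive)."""
    if not offsets:
        return 0.0
    hits = sum(1 for o in offsets if lower <= o <= upper)
    return hits / len(offsets)


# Hypothetical 3'AS events: (canonical position, alternative position, strand)
events = [
    (1000, 985, "+"),   # 15 bp upstream
    (5000, 5018, "-"),  # 18 bp upstream on the minus strand
    (2000, 2006, "+"),  # 6 bp downstream
    (7500, 7470, "+"),  # 30 bp upstream, outside the window
]
offsets = [acceptor_offset(c, a, s) for c, a, s in events]
in_window = fraction_in_window(offsets)
```

In this made-up example two of the four events fall within the window of 12 to 21 bp upstream.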


Enhancing plausibility of machine learning predictions in biomedical applications

Our lab participated in the AI initiative of the Federal Ministry of Education and Research (BMBF), coordinating a project to improve machine learning for biomedical applications by integrating biological background knowledge and leveraging methods that enable post hoc interpretability of the trained models.
For patient risk prediction using molecular data, we developed “Predict and Propagate,” an approach combining tree learning with XGBoost and subsequent network propagation of the learned features using NetCore. This method generates plausible network modules from otherwise non-interpretable machine learning methods (Fig. 4) [Thedinga and Herwig, 2021, 2022]. Using TCGA gene expression data from over 10,000 patients across 25 cancer types, we found that XGBoost ensemble tree learning outperforms classical decision trees and support vector machines. Additionally, pan-cancer training yielded better results than training on single cancer cohorts, identifying predictive biomarkers for survival across multiple cancer types. By integrating network propagation with machine learning, we further demonstrated that the tumor microenvironment is highly predictive for pan-cancer survival.
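
The two-step structure of such an approach (learn feature importances, then propagate them over a network to obtain a module) can be sketched as follows. This is a simplified illustration, not the published method: the feature importances are hypothetical placeholder values standing in for the output of a trained XGBoost model, a plain degree-normalized random walk with restart stands in for NetCore, and the gene names, network, and parameters are invented.

```python
def propagate(adj, seed_weights, restart=0.3, n_iter=200):
    """Degree-normalized random walk with restart from weighted seeds.

    Assumes every node has at least one neighbor.
    """
    nodes = list(adj)
    total = sum(seed_weights.values())
    p0 = {v: seed_weights.get(v, 0.0) / total for v in nodes}
    p = dict(p0)
    for _ in range(n_iter):
        nxt = {v: restart * p0[v] for v in nodes}
        for u in nodes:
            share = (1.0 - restart) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        p = nxt
    return p


def top_module(adj, scores, size):
    """Return the `size` best-scoring nodes and the edges among them."""
    chosen = set(sorted(scores, key=scores.get, reverse=True)[:size])
    edges = {tuple(sorted((u, v)))
             for u in chosen for v in adj[u] if v in chosen}
    return chosen, edges


# Hypothetical toy interaction network (placeholder gene names)
adj = {
    "G1": ["G2", "G3"],
    "G2": ["G1", "G3"],
    "G3": ["G1", "G2", "G4"],
    "G4": ["G3", "G5"],
    "G5": ["G4"],
}

# Step 1 (stand-in): importances as a trained tree ensemble might report them
importances = {"G1": 0.6, "G2": 0.4}

# Step 2: propagate the importances and extract a small network module
scores = propagate(adj, importances)
module_nodes, module_edges = top_module(adj, scores, size=3)
```

In this toy example, G3 has no importance of its own but interacts with both learned features, so propagation pulls it into the module together with G1 and G2.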

Kristina Thedinga and Ralf Herwig, "A gradient tree boosting and network propagation derived pan-cancer survival network of the tumor microenvironment," iScience 25 (1), 103617 (2021).
Kristina Thedinga and Ralf Herwig, "Gradient tree boosting and network propagation for the identification of pan-cancer survival networks," STAR Protocols 3 (2), 101353 (2022).

Deep learning models for drug response predictions

In cooperation with machine learning experts at the University of Potsdam, we explored deep neural network (DNN) architectures and transfer learning for drug response predictions. This approach leverages the vast amount of in vitro drug sensitivity data for pre-training, enabling more accurate predictions of drug sensitivity in preclinical models such as PDXs, PDOs, and ex vivo patient tissues, where available data is typically limited [Prasse et al., 2022a, 2022b]. Recently, we developed a deep learning method based on transformer technology to predict drug combinations of varying sizes, with a particular focus on triplet drug combinations [Campana et al., submitted to Briefings in Bioinformatics, 2025].

Paul Prasse, Pascal Iversen, Matthias Lienhard, Kristina Thedinga, Chris Bauer, Ralf Herwig, and Tobias Scheffer, "Matching anticancer compounds and tumor cell lines by neural networks with ranking loss," NAR Genomics and Bioinformatics 4 (1), lqab128 (2022).
Paul Prasse, Pascal Iversen, Matthias Lienhard, Kristina Thedinga, Ralf Herwig, and Tobias Scheffer, "Pre-Training on In Vitro and Fine-Tuning on Patient-Derived Data Improves Deep Neural Networks for Anti-Cancer Drug-Sensitivity Prediction," Cancers 14 (16), 3950 (2022).