Unlock the potential of microbiome research with expert-led training and insights into microbiome data analysis! Accurate microbiome data analysis is crucial for understanding the complex interactions between microorganisms and their environments, impacting fields like health, agriculture, and environmental science. However, microbiome data comes with its own challenges, making it different from other -omics data types.
On day 1 of the workshop you will learn about the basics: going from reads to OTU/ASV tables, exploring and visualizing microbiome data, and differential abundance analysis. These take the form of lectures, including hands-on sessions. The second day continues with more advanced and emerging topics, brought to you by top experts who will guide you through cutting-edge techniques to manage and interpret microbiome datasets. Whether you’re a beginner or looking to enhance your skills, this is your opportunity to learn from the best!
This workshop will be of interest to a broad range of profiles, from PhD students and postdocs to experienced professionals who need to work with microbiome data. We target people with backgrounds in biology, biomedical sciences, engineering, environmental sciences, …, but also statisticians and data scientists with an interest in the challenges specific to the microbiome.
Secure your place and register here! Only 50 places are available.
We will kick off the workshop with the basics, going from raw sequence reads to the generation of an ASV (Amplicon Sequence Variant) or OTU (Operational Taxonomic Unit) table, followed by data visualization, and finally differential abundance (DA) analysis. The workshop is designed to guide participants through each stage of microbiome data analysis, providing both theoretical background and practical hands-on experience. Each session on the first day will therefore consist of two parts: a theoretical introduction to key concepts and methodologies, followed by an interactive hands-on session.
09:00 - 09:30 Registration and welcome coffee
09:30 - 12:30 Visualising the microbiome by Stijn Wittouck, Department of Bioscience Engineering, University of Antwerp, Belgium
Abstract
Microbiome composition is a high dimensional measurement modality.
Naturally, this renders visualization, a crucial tool for understanding the structure of the microbiome manifold, challenging.
Additionally, microbiome measurements are compositional (meaning that the proportions sum to 1), which further impedes standard methods.
In this session, I will address strategies to visualize the microbiome through a lecture and hands-on practical exercises.
This will lay the foundation for, and aid the interpretation of, downstream analysis.
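As an illustration of the compositional constraint mentioned in the abstract, here is a minimal Python sketch (not part of the session material) of the centered log-ratio (CLR) transform, one common way to map proportions to unconstrained values before applying standard methods. The example counts and the pseudocount of 0.5 are assumptions chosen purely for illustration.

```python
import math

def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's taxon counts.

    A pseudocount (an assumption; 0.5 is a common but arbitrary choice)
    is added so that zero counts do not produce log(0).
    """
    shifted = [c + pseudocount for c in counts]
    logs = [math.log(v) for v in shifted]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [v - mean_log for v in logs]

# Hypothetical counts for four taxa in a single sample.
sample = [120, 30, 0, 850]
proportions = [c / sum(sample) for c in sample]
transformed = clr(sample)

print(proportions)   # proportions are constrained to sum to 1
print(transformed)   # CLR values are centered: they sum to (numerically) 0
```

The transformed values keep the ordering of the taxa within the sample but are no longer tied to a fixed total, which is what many distance-based and multivariate visualisation methods implicitly assume.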
12:30 - 13:30 Lunch
13:30 - 15:30 Data Visualisation by Thies Gehrmann, Department of Bioscience Engineering, University of Antwerp, Belgium
15:30 - 16:00 Coffee Break
16:00 - 18:00 Differential Abundance Analysis by Olivier Thas, Data Science Institute, Hasselt University, Belgium
Abstract
In this lecture we will start from the OTU/ASV table and look into the problem of testing for differential abundance (DA). There is a broad spectrum of methods available, and it is hard to choose the most appropriate one. We will first discuss some general principles (e.g. normalisation, FDR control, sensitivity, …) and subsequently walk through some of the more popular and better-performing methods (e.g. ANCOM-BC, LinDA, CoDa methods, Wilcoxon-Mann-Whitney, …). In a hands-on session we will apply several of the methods to a few datasets. In addition, we will pay attention to choosing the taxonomic rank at which the DA method is applied, combining the results of different DA methods, and reporting.
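To make the general principles named in the abstract concrete, the following is a rough Python sketch (not taken from the lecture) that chains three of them: total-sum normalisation, a per-taxon Wilcoxon-Mann-Whitney test, and Benjamini-Hochberg FDR control. The count table, group labels and helper names are hypothetical, and the test uses a plain normal approximation, so it should be read as a teaching aid rather than a recommended DA method.

```python
import math
from statistics import NormalDist

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def wmw_pvalue(x, y):
    """Two-sided Wilcoxon-Mann-Whitney p-value via the normal approximation.

    No tie or continuity correction is applied (a simplification).
    """
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    return 2 * (1 - NormalDist().cdf(abs(z)))

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # rank 1 = smallest p-value
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def to_proportions(table):
    """Total-sum scaling: divide each sample (row) by its library size."""
    return [[c / sum(row) for c in row] for row in table]

# Hypothetical counts: rows are samples, columns are taxa A, B and C.
control = [[500, 300, 200], [1100, 520, 380], [240, 160, 100],
           [780, 420, 300], [510, 290, 200], [940, 620, 440]]
treated = [[200, 590, 210], [460, 1140, 400], [90, 310, 100],
           [375, 840, 285], [220, 580, 200], [420, 1160, 420]]

taxa = ["A", "B", "C"]
control_p, treated_p = to_proportions(control), to_proportions(treated)
pvalues = [wmw_pvalue([row[j] for row in control_p],
                      [row[j] for row in treated_p])
           for j in range(len(taxa))]
adjusted = benjamini_hochberg(pvalues)

for name, p, q in zip(taxa, pvalues, adjusted):
    print(f"{name}: p = {p:.4f}, BH-adjusted = {q:.4f}")
```

Note that after total-sum scaling a drop in one taxon necessarily raises the proportions of the others, which is one reason why the choice of normalisation matters for DA testing.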
The second day will focus on additional skills in microbiome research and multi-omics integration. Participants will explore new technologies and bioinformatics tools, such as metagenome assembly, multi-table methods, and covariate-aware testing to gain deeper insights from complex datasets. We will also cover microbial networks and dynamics, integration of data sets, and development of biomarkers, showcasing modern ways of working with microbiome data. Each session will build upon the foundational knowledge from Day 1.
09:00 - 10:00 Computational investigation of gut microbial secondary metabolism by Georg Zeller, Leiden University Medical Center (LUMC), Netherlands
10:00 - 11:00 Multi-table methods in microbiome data science by Leo Lahti, Department of Computing, University of Turku, Finland
Abstract
The diversity of various ‘omics data sets and their combinations poses challenges for explainable and verifiable analyses in microbial ecology. This talk will demonstrate how recent advances in statistical programming can help to manage relationships among diverse omics data while implementing reproducible data science workflows. The talk will conclude by highlighting current trends in computational data integration in the context of microbiome research.
11:00 - 11:30 Coffee Break
11:30 - 12:30 Studying host-microbiota interactions through multi-omics integration by Laura Symul, Institute for Statistics, Biostatistics, and Actuarial Sciences at UCLouvain, Belgium
Abstract
In humans, several studies have highlighted the associations between the hosts' health and their microbiotas. To understand how microbiotas and their hosts interact, one needs to jointly analyze (omic) data characterizing both the host and the microbiota. This presentation will focus on the joint statistical analysis of such data. It will first introduce principles and tools for data collection and organization (e.g., the MultiAssayExperiment Bioconductor R package), then detail a few statistical methods for unsupervised and supervised multi-omics integration. These methods will be introduced by presenting how well-known multivariate methods, such as PCA, MDS, or PLS, can be extended to a multi-table set-up. Examples drawn from recent vaginal microbiota research will be used for illustration.
12:30 - 13:30 Lunch
13:30 - 14:30 Old Problems and New Approaches in Microbial Co-occurrence Network Analysis by Ksenia Guseva, Department of Microbiology and Ecosystem Science, University of Vienna, Austria
Abstract
Over the past decade, co-occurrence network analysis has become a popular tool for studying microbial communities across various ecosystems, including soil, gut, and marine environments. However, questions remain about the type of signals these analyses can reliably uncover. In particular, networks constructed from correlations often provide unreliable information about microbial interactions, and it is still highly debated whether these networks represent meaningful ecological relationships among microorganisms. In this talk, I will explore several key challenges that affect our ability to accurately interpret microbial associations, even when we use absolute abundances, uncover environmental confounders, and use samples of small volume. I will demonstrate these issues using both real and simulated datasets. Finally, I will introduce a promising new framework for principled network reconstruction that offers potential solutions to some of these challenges.
14:30 - 15:30 TBA
15:30 - 16:00 Coffee break
16:00 - 17:00 Detection of Joint Microbiome Biomarkers in Intervention Experiments by Ziv Shkedy, Data Science Institute, Hasselt University, Belgium
Abstract
In microbiome studies, identifying biomarkers that reflect the complex interactions between the microbiome and health outcomes is essential for advancing our understanding of disease and improving its management. In the past, research focused on the detection of single-feature biomarkers, which allows us to better understand the effect of the microbiome on the health outcome. This paper extends prior work on microbiome biomarkers by combining information from multiple taxa to identify a joint biomarker relevant to continuous clinical outcomes.
We applied the least absolute shrinkage and selection operator (LASSO) and elastic net methods (Hastie et al., 2015; Zou and Hastie, 2005), aiming to find a combination of taxa that can serve as biomarker(s) for the response of interest. The information theory approach (Alonso and Molenberghs, 2007) was used for biomarker construction. Monte Carlo cross-validation (MCCV) with 1000 repetitions was used to reduce bias and improve the reliability of the prediction. After MCCV, LASSO/elastic net and relaxed LASSO models were fitted to a fixed list of taxa. The methods were evaluated using the mean squared error and the correlation between the observed and expected values.
Two datasets were used to illustrate the proposed method. The high salt diet study was conducted to investigate the effect of diet on both the microbiome and cancer tumor size in rats. After filtering and identifying relevant genera, the analysis of the top five most relevant genera resulted in a 73.27% reduction in the uncertainty of tumor size when the microbiome predictor was known. This highlights the substantial impact of microbiome data on tumor growth. The CERTIFI study was conducted to investigate the associations between the fecal microbiota and the therapeutic response of Crohn’s disease patients treated with ustekinumab. After filtering, the analysis of the top five most frequently selected taxa showed that knowing the microbiome predictor reduced the uncertainty in the change of CDAI at week 6 from baseline by 10.51%.
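The combination of LASSO and Monte Carlo cross-validation described in this abstract can be sketched roughly as follows. This is not the authors' implementation: the data are simulated, the penalty `lam` is fixed rather than tuned, only 100 repetitions are run instead of 1000, a small pure-Python coordinate descent stands in for a statistical library, and the elastic net, relaxed LASSO and information-theoretic steps are left out. All function and variable names are hypothetical.

```python
import random

def soft_threshold(z, t):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def fit_lasso(X, y, lam, n_sweeps=50):
    """Coordinate descent for (1/2n)*||y - X b||^2 + lam*||b||_1.

    Assumes the columns of X and the response y are already centered,
    so no intercept is fitted.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)
    col_ms = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_ms[j] == 0.0:
                continue
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_ms[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_ms[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta

def mccv_lasso(X, y, lam, n_reps=100, test_frac=0.3, seed=1):
    """Repeated random train/test splits; returns selection frequencies and mean test MSE."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    n_test = int(round(n * test_frac))
    selected = [0] * p
    mse_values = []
    for _ in range(n_reps):
        idx = list(range(n))
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        # Center using training-set means only, to avoid leaking test data.
        means = [sum(X[i][j] for i in train) / len(train) for j in range(p)]
        y_mean = sum(y[i] for i in train) / len(train)
        X_train = [[X[i][j] - means[j] for j in range(p)] for i in train]
        y_train = [y[i] - y_mean for i in train]
        beta = fit_lasso(X_train, y_train, lam)
        for j in range(p):
            if beta[j] != 0.0:
                selected[j] += 1
        errors = []
        for i in test:
            pred = y_mean + sum(beta[j] * (X[i][j] - means[j]) for j in range(p))
            errors.append((y[i] - pred) ** 2)
        mse_values.append(sum(errors) / n_test)
    return [s / n_reps for s in selected], sum(mse_values) / n_reps

# Simulated data: 60 samples, 6 features; only features 0 and 1 drive the outcome.
data_rng = random.Random(0)
X = [[data_rng.gauss(0, 1) for _ in range(6)] for _ in range(60)]
y = [2.0 * row[0] - 1.5 * row[1] + data_rng.gauss(0, 0.5) for row in X]

freq, mean_mse = mccv_lasso(X, y, lam=0.3)
print("selection frequency per feature:", freq)
print("mean test MSE:", round(mean_mse, 3))
```

Features that are selected in most repetitions are candidates for a fixed list on which a final model could then be fitted, which is the role the most frequently selected taxa play in the abstract.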
17:00 Closing
Olivier Thas is a professor of biostatistics with a strong background in nonparametric statistics, genomics, and high-throughput technologies. He is currently affiliated with Hasselt University at the Data Science Institute, where his work focuses on statistical methods applied to life sciences, including microbiome studies and proteomics. He has authored more than 120 scientific papers and two monographs. In addition to his role at Hasselt University, Thas is also a guest professor at Ghent University and an honorary professor at the National Institute for Applied Statistics Research Australia. His research spans various projects in biostatistics, including statistical decision-making and microbiome research, with notable involvement in international collaborations. |
Ziv Shkedy is a leading expert and a professor in biostatistics and data science at Hasselt University's CenStat and the Data Science Institute. His research interests span statistical modelling, survival analysis, and clinical trials, with a strong focus on biomarker development. Ziv is also the founder of the eR-Biostat initiative, which provides open-access educational materials for learning R, specifically designed for students and professionals globally, including those in developing countries. He is recognized for his contributions to advancing statistical methods in life sciences and his commitment to accessible education in data science. |
Stijn Wittouck is a postdoctoral researcher in the Department of Bioscience Engineering at the University of Antwerp, where he focuses on the ecology and evolution of lactic acid bacteria by comparative studies of their genome sequences. He develops software to gain new types of insights from such datasets, or to process them on a larger scale. He applies these and existing bioinformatic tools to study how different species of lactic acid bacteria are related to each other, how they adapt to different lifestyles (e.g. free-living vs host-adapted) and how they interact with their viruses and other mobile genetic elements. |
Thies Gehrmann is a senior researcher at the University of Antwerp in the Department of Bioscience Engineering. His work focuses on bioinformatics and statistical analysis within microbiome studies, including the Isala project, which aims to understand the human vaginal microbiome. Gehrmann's background in computer science and statistics, with prior applications to fungal and human genetics and transcriptomics, makes him a valuable contributor to analyzing the diverse data collected in such microbiome research. |
Leo Lahti is a professor in Data Science at the University of Turku, Finland. His research team focuses on computational analysis and modeling of complex natural and social systems. Lahti obtained his doctoral degree (DSc) from Aalto University in Finland (2010), developing probabilistic machine learning methods for high-throughput life science data integration. This was followed by postdoctoral research at EBI/Hinxton (UK), Wageningen University (NL), and VIB/KU Leuven (BE). Lahti has coordinated international networks in data science methods and applications and organizes international data science training events on a regular basis. He is vice chair of the national coordination on open science in Finland, executive committee member for the International Science Council Committee on Data (2023-2025), member of the global Bioconductor Community Advisory Board, and founder of the open science work group of Open Knowledge Finland ry. |
Laura Symul is an Assistant Professor of non-clinical biostatistics at UCLouvain's Institute of Statistics, Biostatistics, and Actuarial Sciences (ISBA). Her research focuses on developing statistical methods for women's health, with a particular emphasis on menstrual health, fertility, cycle-related symptoms, and the vaginal microbiome. She utilizes both parametric and non-parametric models to analyze a variety of data, including clinical, self-tracked (from apps), and publicly available datasets. Before joining UCLouvain, Laura was a postdoctoral researcher at Stanford University, where she began her work in this field. She earned her Ph.D. in computational biology from the École Polytechnique Fédérale de Lausanne (EPFL). |
Ksenia Guseva is a postdoctoral researcher at the University of Vienna, working in the Division of Terrestrial Ecosystem Research at the Centre for Microbiology and Environmental Systems Science. Her research focuses on microbial ecology, particularly in soil environments, where she uses modelling and nonlinear dynamics to understand microbial interactions and ecosystem processes. She is involved in projects related to the construction of co-occurrence networks and enzyme production trade-offs during depolymerization. |
Georg Zeller is an Associate Professor at the Leiden University Center for Infectious Diseases (LUCID) at LUMC. His research aims to better understand human microbiomes and their impact on health and disease. Exploring whether non-invasive detection of colorectal cancer (CRC) is feasible from gut microbiome profiles, his group established an accurate classification model through large-scale machine learning meta-analyses. Ongoing work assesses the prognostic potential of this CRC classifier. To elucidate bacterial carcinogenesis mechanisms, the group investigates intratumoral microbes using multi-omics approaches, bacterial imaging and cellular infection models, supported by an ERC Synergy grant. Beyond cancer, Georg is interested in a more general delineation of healthy versus diseased gut microbiomes (dysbiosis) across many disorders. He received an LUMC Fellowship for developing a quantitative model of dysbiosis. Ultimately, his work will underpin the development of novel microbiome-targeting diagnostic, prognostic and therapeutic approaches to maintain health, prevent disease and improve treatment. |
Type of participant | Price |
PhD-student | €40 |
Other (postdoc, industry,...) | €210 |
Please register only if you can attend both days as places are limited.
PhD students, postdocs, researchers, and professionals working in the microbiome field. For PhD students from UHasselt: acknowledged as a DS requirement.
The workshop will take place on December 16 and 17, 2024.
Holiday Inn Hasselt, Kattegatstraat 1, 3500 Hasselt, Belgium
Holiday Inn Hasselt, an IHG Hotel
Bus or Train: There is a bus stop opposite the hotel. The train station is 1 km (a 12-minute walk) from the hotel.
Car: The hotel offers an excellent base: you are immediately on the E314 in the direction of Brussels/Antwerp/Liège. Paid parking is available in the underground Q-park parking Molenpoort or the underground parking Blauwe Boulevard.