StatOmique is organizing a satellite day of the SMPGD (Statistical Methods for Post Genomic Data) conference in Grenoble on Wednesday, January 28, 2026.
SMPGD is an annual conference dedicated to statistical methods for the analysis of post-genomic data. It aims to present work ranging from mathematics to applied statistics, as well as new application areas in high-throughput biology that call for new statistical developments. You can visit the conference's main website to learn more about past editions. If you are not yet familiar with it, we strongly encourage you to attend the StatOmique day and to stay in Grenoble for the 2026 session of SMPGD.
StatOmique Day - January 28
Invited speakers:
Nelle Varoquaux, CNRS Grenoble.
Clovis Galiez, Université Grenoble Alpes, CNRS, INRIA, LJK, Grenoble.
Alexandre Wendling, Université Grenoble Alpes, CNRS, INRIA, LJK, Grenoble.
Program
13h00-13h40 Welcome coffee
13h40-13h50 Opening presentation
13h50-14h15 Alexandre Wendling, Université Grenoble Alpes, CNRS, INRIA, LJK, Grenoble, Detection of exact sequence variants in metabarcoding by PCR abundance signal clustering.
Abstract: Advances over the past decade in high-throughput sequencing (HTS) technologies have significantly enhanced the use of molecular methods for species identification via environmental DNA (eDNA). Metabarcoding, a technique that enables the simultaneous tagging, sequencing, and identification of multiple species from a single environmental sample (Taberlet, Coissac, Hajibabaei, & Rieseberg, 2012), has become a cornerstone of biodiversity research (Compson et al., 2020). This approach relies on PCR amplification of taxonomically informative genetic fragments ('DNA barcodes'), which are sequenced and matched to reference datasets for species identification. However, the analysis of eDNA faces several challenges, the most important of which is the deterministic amplification bias during PCR (Gold et al., 2023). To improve the accuracy and efficiency of taxonomic assignments in eDNA metabarcoding, there is a pressing need for advanced tools that can better differentiate genuine environmental signals from the noise inherent in molecular detection processes (Mathon et al., 2021). PCR amplification biases often produce amplicon sequence variants (ASVs) (Callahan, 2017) that do not correspond to real sequences, as has been observed in mock community studies. Currently, after a classical denoising pipeline, the usual way to filter out these spurious sequences is to compare them with reference databases. However, databases are often highly incomplete, at least for certain taxonomic groups, and setting the right similarity threshold can be cumbersome in practice. To improve biodiversity descriptions from eDNA, we propose an alternative approach: leveraging the variability in the abundance of sequences across samples and PCR replicates to distinguish true from spurious ASVs without relying on reference databases.
Assuming that there is more biological signal than experimental noise, a sequence is considered true only if its variation in abundance between samples is greater than its variation in abundance between PCR replicates. We tested our method on three mock communities for plant, fungal and bacterial primers, where we demonstrated a clear improvement in precision by removing a large number of false positives produced by existing pipelines. We also tested it on real data from the ORCHAMP observatory, where we can construct diversity curves that appear closer to reality than those obtained with existing pipelines.
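The criterion stated in this abstract (between-sample variation larger than between-replicate variation) can be illustrated with a minimal sketch. This is not the speakers' implementation: the input layout, the function names `variance_ratio` and `looks_genuine`, the use of plain variances, and the threshold of 1.0 are all assumptions made for illustration.

```python
from statistics import mean, pvariance


def variance_ratio(abundances):
    """Ratio of between-sample variance to mean within-sample variance.

    `abundances` maps a sample name to the read counts of one ASV in each
    PCR replicate of that sample (hypothetical input layout).
    """
    sample_means = [mean(reps) for reps in abundances.values()]
    # Variation of the ASV across samples (biological signal, by assumption).
    between = pvariance(sample_means)
    # Average variation across PCR replicates of the same sample (noise).
    within = mean([pvariance(reps) for reps in abundances.values()])
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return between / within


def looks_genuine(abundances, threshold=1.0):
    """Keep an ASV when it varies more between samples than between replicates."""
    return variance_ratio(abundances) > threshold


# Toy counts (invented): replicates agree within each sample, samples differ.
stable_asv = {"s1": [100, 110, 90], "s2": [5, 8, 2], "s3": [300, 280, 320]}
# Toy counts (invented): replicates disagree strongly, samples look alike.
erratic_asv = {"s1": [0, 40, 2], "s2": [35, 1, 0], "s3": [3, 0, 45]}

print(looks_genuine(stable_asv), looks_genuine(erratic_asv))
```

A real analysis would likely work on transformed or normalized counts and use a statistical test rather than a fixed ratio; the sketch only shows the shape of the comparison.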
14h15-14h40 Clovis Galiez, Université Grenoble Alpes, CNRS, INRIA, LJK, Grenoble, Family of Wasserstein metrics to study biodiversity through the lens of DNA.
14h40-15h05 Emeline Perthame, Institut Pasteur de Paris, Application of Large Language Models to enhance animal ethics evaluation in research.
15h05-15h30 Laurent Guyon, BioSanté, Institut de Recherche Interdisciplinaire de Grenoble, CEA, INSERM, Université Grenoble Alpes, Peritumoral tissue is a promising source of prognostic biomarkers.
15h30-16h00 Break
16h00-16h25 Louise Velut, BioSanté (UMR BioSanté) Institut National de la Santé et de la Recherche Médicale, Institut de Recherche Interdisciplinaire de Grenoble, Université Grenoble Alpes, Unlocking miRNA Regulation: Potential and Pitfalls of Single-Cell miRNA-mRNA Co-Sequencing.
16h25-16h50 Hugo Varet, Hub bioinformatique et biostatistique, Institut Pasteur de Paris, Impact of method and parameter choices on the results of enrichment analyses.
16h50-17h15 Blanche Francheterre, Université Paris-Saclay, AgroParisTech, INRAE, UMR MIA Paris-Saclay, Interpretable variable selection and differential structure estimation in graphical models.
Abstract: Network inference is widely used to study direct associations between biomarkers, while differential network analysis aims to identify how these associations change across conditions. In the two-graph setting (e.g., disease vs control), numerous methods have been proposed to jointly estimate networks or directly infer differential edges. However, existing evaluations are often limited to a narrow range of simulation settings and do not systematically compare the major methodological families under realistic high-dimensional conditions.
To address this gap, we propose a comprehensive simulation framework to evaluate differential support estimation between two graphs. We compare neighborhood selection, graphical lasso, and partial-correlation-based methods, fitted either independently or with joint regularization schemes, including fused, group, node-based, and data-shared lasso penalties. We also consider a direct differential estimator, D-trace. Networks are generated under biologically motivated topologies (scale-free graphs and random networks with hubs) across a wide range of dimensions (p = 30, 100, 200, 500) and sample sizes (n = 50 or 100 per condition). Differential structure is introduced via hub-based disruptions or random edge perturbations over varying proportions of differential edges.
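As a rough illustration of this kind of simulation design, the sketch below builds one scale-free graph by preferential attachment and derives a second graph by randomly rewiring a chosen proportion of its edges. It is not the authors' framework: the function names, the single-edge attachment rule (`m=1`), the rewiring scheme, and the seeds are assumptions, and the step of turning graphs into precision matrices and sampling data is left out.

```python
import random


def scale_free_edges(p, m=1, seed=0):
    """Preferential-attachment graph on p nodes, as a set of (i, j) with i < j."""
    rng = random.Random(seed)
    edges = {(0, 1)}
    # Each node appears once per incident edge, so sampling from this list
    # picks nodes with probability proportional to their degree.
    targets = [0, 1]
    for new in range(2, p):
        chosen = set()
        while len(chosen) < min(m, new):
            chosen.add(rng.choice(targets))
        for t in sorted(chosen):
            edges.add((t, new))
            targets.extend([t, new])
    return edges


def perturb_edges(edges, p, proportion, seed=1):
    """Remove a proportion of edges and add the same number of new ones."""
    rng = random.Random(seed)
    k = round(proportion * len(edges))
    removed = set(rng.sample(sorted(edges), k))
    absent = [(i, j) for i in range(p) for j in range(i + 1, p)
              if (i, j) not in edges]
    added = set(rng.sample(absent, k))
    return (edges - removed) | added


p = 30  # smallest dimension mentioned in the abstract
graph_control = scale_free_edges(p)
graph_disease = perturb_edges(graph_control, p, proportion=0.2)
# Differential support: edges present in exactly one of the two graphs.
differential = graph_control ^ graph_disease
print(len(graph_control), len(graph_disease), len(differential))
```

The symmetric difference is the ground truth against which an estimated differential support would be scored, for example with precision and recall.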
Beyond methodological comparison, the goal is to assess whether differential network estimation can improve downstream classification and interpretation. In particular, we investigate how incorporating differential graph estimates can enhance quadratic discriminant analysis (QDA) by capturing condition-specific covariance structures. While classical approaches primarily focus on identifying differential mean effects, many biological processes, especially in omics data, are also characterized by changes in co-expression or regulatory structure. By leveraging differential network information, we aim to improve QDA performance and interpretability, enabling the identification of condition-specific interaction patterns that are both predictive and biologically meaningful.
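For reference, the link between QDA and graph estimates can be made explicit with the textbook QDA discriminant score, written in terms of the class precision matrix. The notation below is introduced here and is not taken from the abstract: x is an observation, μ_k and Σ_k are the mean and covariance of condition k, π_k its prior probability, and Ω_k the precision matrix whose nonzero off-diagonal entries are the edges of the condition's Gaussian graphical model. The estimator used in the talk may differ.

```latex
\delta_k(x) = \tfrac{1}{2}\log\det\Omega_k
            - \tfrac{1}{2}(x-\mu_k)^\top \Omega_k (x-\mu_k)
            + \log\pi_k,
\qquad \Omega_k = \Sigma_k^{-1}
```

An observation is assigned to the condition with the largest score. In the two-condition case, the quadratic part of δ_1(x) − δ_2(x) depends on x only through Ω_1 − Ω_2, so a sparse estimate of this difference indicates which variable pairs drive the quadratic part of the decision rule.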
17h15-18h00 Nelle Varoquaux, CNRS Grenoble, Machine learning for -omics data: functions, structures, and evolution.
18h00-18h05 Closing
Practical information
Support
This event is organized with the help of the SMPGD 2026 local organizing committee and supported by the GDR BiMMM.
