Talks – Principled Statistics in the Age of AI

Keynote Address: Prof. Xiao-Li Meng

Title: From a Cauchy Surprise to the Half-Cauchy Miracle

Abstract:

This talk follows the path from Pillai and Meng (2016, Annals of Statistics), “An unexpected encounter with Cauchy and Lévy,” to Liu, Meng, and Pillai (2025), “A Heavily Right Strategy for Integrating Dependent Studies in Any Dimension,” inviting the audience on a journey to explore an emerging and mystical force in principled statistical inference in arbitrarily high dimensions: heavy-tail approximations.

Bio:

Xiao-Li Meng, the Whipple V. N. Jones Professor of Statistics, and the Founding Editor-in-Chief of Harvard Data Science Review, is well known for his depth and breadth in research, his innovation and passion in pedagogy, his vision and effectiveness in administration, as well as for his engaging and entertaining style as a speaker and writer. Meng was named the best statistician under the age of 40 by COPSS (Committee of Presidents of Statistical Societies) in 2001, and he is the recipient of numerous awards and honors for his more than 150 publications in at least a dozen theoretical and methodological areas, as well as in areas of pedagogy and professional development. He has delivered more than 400 research presentations and public speeches on these topics, and he is the author of “The XL-Files,” a thought-provoking and entertaining column in the IMS (Institute of Mathematical Statistics) Bulletin. His interests range from the theoretical foundations of statistical inferences (e.g., the interplay among Bayesian, Fiducial, and frequentist perspectives; frameworks for multi-source, multi-phase and multi-resolution inferences) to statistical methods and computation (e.g., posterior predictive p-value; EM algorithm; Markov chain Monte Carlo; bridge and path sampling) to applications in natural, social, and medical sciences and engineering (e.g., complex statistical modeling in astronomy and astrophysics, assessing disparity in mental health services, and quantifying statistical information in genetic studies). Meng received his BS in mathematics from Fudan University in 1982 and his PhD in statistics from Harvard in 1990. He was on the faculty of the University of Chicago from 1991 to 2001 before returning to Harvard, where he served as the Chair of the Department of Statistics (2004-2012) and the Dean of the Graduate School of Arts and Sciences (2012-2017).

Invited Talk: Dr. Joshua Bon

Title: Persuasive Privacy

Abstract:

We propose a novel framework for measuring privacy from a Bayesian game-theoretic perspective. This framework enables the creation of new, purpose-driven privacy definitions that are rigorously justified, while also allowing for the assessment of existing privacy guarantees through game theory. We show that pure and probabilistic differential privacy are special cases of our framework, and provide new interpretations of the post-processing inequality in this setting. Further, we demonstrate that privacy guarantees can be established for deterministic algorithms, which are overlooked by current privacy standards. (Joint work with James Bailie, Judith Rousseau, and Christian Robert.)

Bio:

Dr Bon is a Lecturer in the School of Mathematical Sciences at Adelaide University. His research is in computational statistics, focussing on the development and analysis of Bayesian inference algorithms, including sequential Monte Carlo and simulation-based inference. He collaborates with applied scientists across sports science, psychology, social science, and ecology to develop principled and robust data analysis procedures. Alongside this, he develops open-source statistical software. Dr Bon is currently working on methods for Bayesian inference with data privacy guarantees involving Persuasive Privacy.

Homepage: bonstats.github.io

Invited Talk: Prof. Cory McCartan

Title: Computer Redistricting Simulation: Past, Present, and Future

Abstract:

Statistical sampling of redistricting plans has emerged as a powerful tool to study legislative redistricting and identify partisan and racial gerrymanders. I will introduce redistricting simulation tools and explain the basics of how they operate and the statistical guarantees they provide. Then I will discuss how these algorithms have been applied in the past as well as new and emerging methods and applications of these tools. I will conclude with some thoughts on the future use of simulation tools in light of the shifting judicial and political landscape in the United States.

Bio:

Cory McCartan is the Hoben and Patricia Thomas and Thomas and Ann Hettmansperger Early Career Professor of Statistics at Penn State and a faculty affiliate in political science. His research focuses on methodological and applied problems in the social sciences, including elections, legislative redistricting, racial disparities, and missing data. He is a Co-PI of the Algorithm-Assisted Redistricting Methodology (ALARM) Project at Harvard University. As part of his research, he also develops and maintains a number of R packages for visualization, redistricting, and statistical analysis.

Homepage: corymccartan.com

Invited Talk: Prof. Fredrik Johansson

Title: Learning Causally Sound and Interpretable Composite Endpoints for Clinical Trials

Abstract:

Randomized clinical trials are considered the gold standard evidence for learning about the causal effects of medical interventions, but have natural limitations on scope and length. This often rules out targeting long-term outcomes of interest, such as mortality or cardiovascular disease, as these endpoints won’t be observed for most participants during the length of the trial. Instead, researchers turn to surrogate endpoints that are associated with the primary outcome of interest and can be observed during the trial. This presents a problem: What constitutes a good surrogate? In theory, a good surrogate is one for which the effect of the treatment is predictive of its effect on the primary outcome, but the definition alone does not reveal how to find such a variable. More than that, to be useful in a clinical trial, the surrogate must be approved by a regulatory body when registering the trial, necessitating its interpretability. In this talk, I will discuss the implications of this, algorithms that can provably learn composite surrogates from observational data, and situations where there is no hope to find a good surrogate.

Bio:

Fredrik Johansson is an associate professor of Computer Science & Engineering at Chalmers University of Technology, where he runs the Healthy AI Lab, dedicated to developing machine learning methods and theory to advance decision making in healthcare.

Homepage: healthyai.se

Invited Talk: Prof. Ashkan Panahi

Title: Learning Trajectories of Large Neural Networks: A Statistical Analysis

Abstract:

The statistical behavior of neural networks strongly depends on the way they are trained. This observation has made the analysis of learning trajectories a central problem in theoretical machine learning. While such trajectories can be extremely complex to analyze exactly, they often exhibit a simpler structure in asymptotically large settings, where they depend on fewer effective parameters that capture the overall behavior of the model. In this talk, we review recent developments in this area. In particular, we introduce a Gaussian comparison technique that relates the dynamics of learning trajectories to alternative dynamical systems that may be more tractable to analyze. Our results hold in finite dimensions and provide a rigorous justification of earlier findings obtained through alternative approaches such as dynamical mean field theory.

Bio:

Ashkan Panahi is a researcher and associate professor in the Department of Computer Science and Engineering at Chalmers University of Technology. Originally trained in electrical engineering with a specialization in communication systems, he received his PhD in signal processing in 2015. Since then, his research has focused on the theoretical foundations of statistical machine learning, optimization, and information theory. His work spans areas including distributed and federated learning, graph-based machine learning, signal processing, and statistical inference, with applications in modern AI systems and networked data analysis. He currently leads a research group developing efficient methods for multi-scale transformer-based image processing and general approaches to knowledge transfer, with a particular emphasis on cell microscopy imaging.