Fall 2025 Departmental Seminars & Lectures
During the Fall and Spring semesters, the Department of Biostatistics holds regular Thursday seminars, called the Levin Lecture Series, on a wide variety of topics of interest to both students and faculty. Each semester there are also often guest lectures outside the regular Thursday Levin Lecture Series, to provide a robust schedule that covers the wide range of topics in Biostatistics. The speakers are invited guests who spend the day of their seminar discussing their research with Biostatistics faculty and students.
Many seminars this semester are held on Zoom and can be joined via the link here or by using Meeting ID: 963 2560 9671 and Passcode: 698339. Links are also available on the individual talk entries.
In-person seminars will have no Zoom option.
Fall 2025 Schedule
Thursday, September 4th, Zoom, 11:45am
Levin Lecture
James Zou, PhD
Associate Professor of Biomedical Data Science and, by courtesy, of Computer Science and of Electrical Engineering
Stanford University
Computational Biology in the Age of AI Agents
Abstract:
AI agents—large language models equipped with tools and reasoning capabilities—are emerging as powerful research enablers. This talk will explore how computational biology is particularly well-positioned to benefit from rapid advances in agentic AI. I’ll first introduce the Virtual Lab—a collaborative team of AI scientist agents conducting in silico research meetings to tackle open-ended research projects. As an example application, the Virtual Lab designed new nanobody binders to recent Covid variants that we experimentally validated. Then I will present CellVoyager, a data science agent that analyzes complex genomics data to derive new insights. I will conclude by discussing limits of agents and a roadmap for human researcher-AI collaboration.
Thursday, September 11th, ARB 8th Floor Auditorium, 11:45am
Levin Lecture
Bingxin Zhao, PhD
Assistant Professor of Statistics and Data Science
Assistant Professor of Medicine, Division of Translational Medicine and Human Research (secondary appointment)
University of Pennsylvania
Resampling-based Pseudo-training in Genomic Predictions
Abstract:
In this talk, I will present a resampling-based pseudo-training framework for genomic prediction that enables model development using only summary-level data. We show that generating pseudo-training and validation statistics from summary results achieves asymptotic equivalence to conventional training while avoiding the need for individual-level datasets. Simulations and real data applications suggest that pseudo-training performs comparably to standard approaches with large datasets and substantially better when tuning data are limited. We highlight two platforms built on this framework: PennPRS (https://pennprs.org/), a cloud-based computing infrastructure supporting large-scale, no-code polygenic risk score training with purely summary data resources, and GCB-Hub (https://www.gcbhub.org/), which applies pseudo-training to proteome-wide association studies for protein-disease mapping and drug discovery. Together, these advances demonstrate how resampling-based pseudo-training methods can broaden accessibility, scalability, and impact of genomic prediction across diverse biomedical research settings.
Thursday, September 18th, Zoom, 11:45am
Levin Lecture
Jingyi Jessica Li, PhD
Professor & Program Head, Biostatistics Program; Donald and Janet K. Guthrie Endowed Chair in Statistics, Public Health Sciences Division; Fred Hutch Cancer Center Affiliate Professor Biostatistics
University of Washington
Nullstrap: A Simple, High-Power, and Fast Framework for FDR Control in Variable Selection for Diverse High-Dimensional Models
Abstract:
Balancing false discovery rate (FDR) control with high statistical power is a central challenge in high-dimensional variable selection. Existing methods often degrade data through knockoffs or splitting, leading to power loss. We propose Nullstrap, a framework that controls FDR without altering the original data. Nullstrap generates synthetic null data by fitting a null model under the null hypothesis and applies the same estimation to both original and synthetic datasets. This parallel structure resembles the likelihood ratio test, serving as its numerical analog. A data-driven correction procedure adjusts null estimates, enabling variable selection with theoretical guarantees: asymptotic FDR control at any desired level and power converging to one. Nullstrap is fast, stable, and broadly applicable across linear, generalized linear, Cox, and graphical models. Simulations indicate that Nullstrap maintains robust FDR control and outperforms the knockoff filter and data splitting in power (0.95 vs. 0.50 and 0.70) and efficiency (≈ 30×). While all three methods are randomized, Nullstrap is more stable (Jaccard 0.98 vs. 0 and 0.42). In a triple-omics time-to-labor dataset, the knockoff filter and data splitting fail to identify variables in most of 70 runs with different random seeds, whereas Nullstrap consistently selects predictors, achieves > 90% predictive accuracy, and is three orders of magnitude faster.
Thursday, September 25th, Zoom, 11:45am
Levin Lecture
Brian Caffo, PhD, MS
Professor, Department of Biostatistics
Johns Hopkins University Bloomberg School of Public Health
Does AI need to be artificial? Does it need to be intelligent?
Abstract:
In this talk we consider the fascinating possibility of organoid intelligence (OI). Organoid intelligence uses human-derived pluripotent stem cells to create neural clusters with measurable neuronal activity. We discuss a recent JHU effort in OI from a team of neuroscientists, engineers, signal processors, and statisticians. The goal is to use organoids to perform complex computing tasks through stimulation and response. Measurement is obtained from an electrode shell custom designed for three-dimensional measurements. Apart from OI, organoid electrophysiology experiments are useful for studying human genetic disorders and toxicity through another phenotype in vitro. Here, we will focus on the statistical challenges underlying this novel form of measurement. Time permitting, we will discuss other efforts in biocomputing.
Thursday, October 2nd, Zoom, 11:45am
Levin Lecture
Natalie Dean, PhD
Associate Professor, Department of Biostatistics and Bioinformatics, Department of Epidemiology
Emory University Rollins School of Public Health
Challenges in Estimating Vaccine Effectiveness Against Progression to Severe Disease
Abstract:
Vaccines can reduce an individual’s risk of infection and their risk of progression to disease given infection. The latter effect is less commonly estimated but is relevant for risk communication and vaccine impact modeling. Using a motivating example from the COVID-19 literature, we note how vaccine effectiveness against progression can appear to increase over time in settings where true biological strengthening is unlikely. We use mathematical modeling to demonstrate how this phenomenon can occur when there is an underlying vulnerable subpopulation with poor vaccine response against infection and progression. As a result, the earliest infections are among those with the weakest protection against disease. We describe a modeling framework to link underlying immunology and post-vaccination outcomes that we use to further examine this problem. This work highlights methodological challenges in isolating a vaccine’s effect on progression to severe disease after infection.
Thursday, October 9th, Zoom, 11:45am
Levin Lecture
KC Gary Chan, PhD
Professor, Health Services; Professor, Biostatistics
University of Washington School of Public Health
Robust and efficient semiparametric inference for the stepped wedge design
Abstract:
Stepped wedge designs (SWDs) are increasingly used to evaluate longitudinal cluster-level interventions but pose substantial challenges for valid inference. Because crossover times are randomized, intervention effects are intrinsically confounded with secular time trends, while heterogeneous cluster effects, complex correlation structures, baseline covariate imbalances, and unreliable standard errors from few clusters further complicate statistical inference. We propose a unified semiparametric framework for estimating possibly time-varying intervention effects in SWDs that directly addresses these issues. A nonstandard development of semiparametric efficiency theory is required to accommodate correlated observations within clusters, non-identically distributed outcomes across clusters due to varying cluster-period sizes, and weakly dependent treatment assignments that are hallmarks of SWDs. The resulting estimator of treatment contrast is consistent and asymptotically normal even under misspecification of the covariance structure and control cluster-period means, and achieves the semiparametric efficiency bound when both are correctly specified. To facilitate inference for trials with few clusters, we introduce a permutation-based procedure to better capture finite-sample variability and a leave-one-out correction to mitigate plug-in bias. We further discuss how effect modification can be naturally incorporated, and imbalanced precision variables can be accommodated via a simple adjustment closely related to post-stratification, a novel connection of independent interest. Simulations and application to a public health trial demonstrate the robustness and efficiency of the proposed method relative to standard approaches.
Thursday, October 16th, Hess Commons, 11:45am
Levin Lecture
Andrew An Chen, PhD
Assistant Professor, Department of Public Health Sciences
Medical University of South Carolina
Methodological Considerations in Applying Brain Charts to New Samples
Abstract:
Multi-site national and international imaging consortia have formed with the goal of precisely characterizing the human brain across the lifespan. These consortia have succeeded in collecting large samples of brain magnetic resonance imaging (MRI) scans to estimate sex-specific trajectories of brain phenotypes across age, often called brain charts. The promise of brain charts is that future researchers and clinicians will be able to assess a new scan for deviations from this healthy trajectory. However, the implementation of these charts in practice is severely limited by differences across study sites, also known as site effects. Here, we first discuss several projects in harmonization of MRI data specifically tailored to this normative modeling setting. Then, we leverage advancements in model uncertainty quantification to propose new ways to calibrate brain charts, as an alternative to harmonizing data. Finally, we apply our approaches to the Lifespan Brain Chart Consortium (LBCC) to assess generalizability to new scans from both healthy individuals and Alzheimer's disease (AD) patients. Based on our findings, we provide methodological recommendations for applying fitted brain charts to new sites.
Thursday, October 23rd, Zoom, 11:45am
Levin Lecture
GuanNan Wang, PhD
Assistant Professor, Mathematics
College of William & Mary
Boosting Biomedical Imaging Analysis via Distributed Functional Regression and Synthetic Surrogates
Abstract:
Understanding how scalar covariates influence spatial patterns in medical imaging data, such as neuroimaging or organ-level functional images, is a central challenge in modern biomedical research. The rapid expansion of large-scale imaging studies has heightened the need for statistical frameworks that are both interpretable and computationally scalable. In this talk, I will introduce a new class of domain-aware functional regression models, where spatially varying coefficients link scalar predictors to imaging responses defined over complex 3D domains. Our Distributed Image-on-Scalar Regression framework employs a triangulation-based domain decomposition strategy, enabling efficient parallel estimation with trivariate penalized splines. This design preserves global spatial structure while flexibly accommodating subregion-specific heterogeneity. To address additional challenges posed by incomplete or noisy imaging data, I will also discuss the use of synthetic surrogates generated with modern AI tools. Rather than imputing missing values directly, these synthetic surrogates can serve as auxiliary data that can be jointly analyzed with observed images, improving efficiency while maintaining robustness to imputation error. Together, these advances pave the way for scalable, uncertainty-aware statistical analysis of high-dimensional biomedical imaging.
Thursday, October 30th, Zoom, Time TBA
Levin Lecture
Bibhas Chakraborty, PhD
Associate Professor, Centre for Quantitative Medicine; Interim Director, Centre for Quantitative Medicine
Duke-NUS Medical School
Innovative Trial Designs in Mobile Health Using Reinforcement Learning
Thursday, November 6th, Hess Commons, 11:45am
Levin Lecture
Weng Kee Wong, PhD
Professor of Biostatistics
University of California Los Angeles Fielding School of Public Health
Title & Abstract TBA
Thursday, November 13th, Zoom, 11:45am
Levin Lecture
Lina Montoya, PhD
Assistant Professor, Department of Biostatistics; Assistant Professor, School of Data Science and Society
University of North Carolina Gillings School of Global Public Health
Title & Abstract TBA
Thursday, November 20th, Zoom, 11:45am
Levin Lecture
Didong Li, PhD
Assistant Professor, Department of Biostatistics
University of North Carolina Gillings School of Global Public Health
Title & Abstract TBA
Thursday, December 4th, Zoom, 11:45am
Levin Lecture
Mingyao Li, PhD
Professor of Biostatistics in Biostatistics and Epidemiology
University of Pennsylvania Perelman School of Medicine
Title & Abstract TBA
Thursday, December 11th, Hess Commons, 11:45am
Levin Lecture
Falco J. Bargagli Stoffi, PhD
Assistant Professor, Department of Biostatistics
University of California Los Angeles Fielding School of Public Health
Title & Abstract TBA