Fall 2025 Departmental Seminars & Lectures

During the Fall and Spring semesters, the Department of Biostatistics holds regular Thursday seminars, called the Levin Lecture Series, on a wide variety of topics of interest to both students and faculty. Each semester there are also often guest lectures outside the regular Thursday Levin Lecture Series, providing a robust schedule that covers the wide range of topics in Biostatistics. The speakers are invited guests who spend the day of their seminar discussing their research with Biostatistics faculty and students.

Many seminars this semester are held on Zoom and can be joined via the link here or with Meeting ID: 963 2560 9671 and Passcode: 698339. Links are also available on the individual talk entries.

In-person seminars will have no Zoom option.

Fall 2025 Schedule

Thursday, September 4th, Zoom, 11:45am
Levin Lecture 

James Zou, PhD
Associate Professor of Biomedical Data Science and, by courtesy, of Computer Science and of Electrical Engineering
Stanford University

Computational Biology in the Age of AI Agents

Abstract: 

AI agents—large language models equipped with tools and reasoning capabilities—are emerging as powerful research enablers. This talk will explore how computational biology is particularly well-positioned to benefit from rapid advances in agentic AI. I’ll first introduce the Virtual Lab—a collaborative team of AI scientist agents conducting in silico research meetings to tackle open-ended research projects. As an example application, the Virtual Lab designed new nanobody binders to recent Covid variants that we experimentally validated. Then I will present CellVoyager, a data science agent that analyzes complex genomics data to derive new insights. I will conclude by discussing limits of agents and a roadmap for human researcher-AI collaboration.

Thursday, September 11th, ARB 8th Floor Auditorium, 11:45am
Levin Lecture 

Bingxin Zhao, PhD
Assistant Professor of Statistics and Data Science
Assistant Professor of Medicine, Division of Translational Medicine and Human Research (secondary appointment)
University of Pennsylvania  

Resampling-based Pseudo-training in Genomic Predictions

Abstract: 
In this talk, I will present a resampling-based pseudo-training framework for genomic prediction that enables model development using only summary-level data. We show that generating pseudo-training and validation statistics from summary results achieves asymptotic equivalence to conventional training while avoiding the need for individual-level datasets. Simulations and real data applications suggest that pseudo-training performs comparably to standard approaches with large datasets and substantially better when tuning data are limited. We highlight two platforms built on this framework: PennPRS (https://pennprs.org/), a cloud-based computing infrastructure supporting large-scale, no-code polygenic risk score training with purely summary data resources, and GCB-Hub (https://www.gcbhub.org/), which applies pseudo-training to proteome-wide association studies for protein-disease mapping and drug discovery. Together, these advances demonstrate how resampling-based pseudo-training methods can broaden accessibility, scalability, and impact of genomic prediction across diverse biomedical research settings.

Thursday, September 18th, Zoom, 11:45am
Levin Lecture 

Jingyi Jessica Li, PhD
Professor & Program Head, Biostatistics Program; Donald and Janet K. Guthrie Endowed Chair in Statistics, Public Health Sciences Division, Fred Hutch Cancer Center; Affiliate Professor, Biostatistics
University of Washington

Nullstrap: A Simple, High-Power, and Fast Framework for FDR Control in Variable Selection for Diverse High-Dimensional Models

Abstract:
Balancing false discovery rate (FDR) control with high statistical power is a central challenge in high-dimensional variable selection. Existing methods often degrade data through knockoffs or splitting, leading to power loss. We propose Nullstrap, a framework that controls FDR without altering the original data. Nullstrap generates synthetic null data by fitting a null model under the null hypothesis and applies the same estimation to both original and synthetic datasets. This parallel structure resembles the likelihood ratio test, serving as its numerical analog. A data-driven correction procedure adjusts null estimates, enabling variable selection with theoretical guarantees: asymptotic FDR control at any desired level and power converging to one. Nullstrap is fast, stable, and broadly applicable across linear, generalized linear, Cox, and graphical models. Simulations indicate that Nullstrap maintains robust FDR control and outperforms the knockoff filter and data splitting in power (0.95 vs. 0.50 and 0.70) and efficiency (≈ 30×). While all three methods are randomized, Nullstrap is more stable (Jaccard 0.98 vs. 0 and 0.42). In a triple-omics time-to-labor dataset, the knockoff filter and data splitting fail to identify variables in most of 70 runs with different random seeds, whereas Nullstrap consistently selects predictors, achieves > 90% predictive accuracy, and is three orders of magnitude faster.

Thursday, September 25th, Zoom, 11:45am
Levin Lecture 

Brian Caffo, PhD, MS
Professor, Department of Biostatistics
Johns Hopkins University Bloomberg School of Public Health

Does AI need to be artificial? Does it need to be intelligent?

In this talk we consider the fascinating possibility of organoid intelligence (OI). Organoid intelligence uses human-derived pluripotent stem cells to create neural clusters with measurable neuronal activity. We discuss a recent JHU effort in OI from a team of neuroscientists, engineers, signal processors and statisticians. The goal is to use organoids to perform complex computing tasks through stimulation and response. Measurement is obtained from an electrode shell custom designed for three-dimensional measurements. Apart from OI, organoid electrophysiology experiments are useful for studying human genetic disorders and toxicity through another phenotype in vitro. Here, we will focus on the statistical challenges underlying this novel form of measurement. Time permitting, we will discuss other efforts in biocomputing.

Thursday, October 2nd, Zoom, 11:45am
Levin Lecture 

Natalie Dean, PhD
Associate Professor, Department of Biostatistics and Bioinformatics, Department of Epidemiology
Emory University Rollins School of Public Health

Challenges in Estimating Vaccine Effectiveness Against Progression to Severe Disease

Abstract:
Vaccines can reduce an individual’s risk of infection and their risk of progression to disease given infection. The latter effect is less commonly estimated but is relevant for risk communication and vaccine impact modeling. Using a motivating example from the COVID-19 literature, we note how vaccine effectiveness against progression can appear to increase over time in settings where true biological strengthening is unlikely. We use mathematical modeling to demonstrate how this phenomenon can occur when there is an underlying vulnerable subpopulation with poor vaccine response against infection and progression. As a result, the earliest infections are among those with the weakest protection against disease. We describe a modeling framework to link underlying immunology and post-vaccination outcomes that we use to further examine this problem. This work highlights methodological challenges in isolating a vaccine’s effect on progression to severe disease after infection.

Thursday, October 9th, Zoom, 11:45am
Levin Lecture 

KC Gary Chan, PhD
Professor, Health Services; Professor, Biostatistics
University of Washington School of Public Health

Robust and efficient semiparametric inference for the stepped wedge design

Abstract:
Stepped wedge designs (SWDs) are increasingly used to evaluate longitudinal cluster-level interventions but pose substantial challenges for valid inference. Because crossover times are randomized, intervention effects are intrinsically confounded with secular time trends, while heterogeneous cluster effects, complex correlation structures, baseline covariate imbalances, and unreliable standard errors from few clusters further complicate statistical inference. We propose a unified semiparametric framework for estimating possibly time-varying intervention effects in SWDs that directly addresses these issues. A nonstandard development of semiparametric efficiency theory is required to accommodate correlated observations within clusters, non-identically distributed outcomes across clusters due to varying cluster-period sizes, and weakly dependent treatment assignments that are hallmarks of SWDs. The resulting estimator of treatment contrast is consistent and asymptotically normal even under misspecification of the covariance structure and control cluster-period means, and achieves the semiparametric efficiency bound when both are correctly specified. To facilitate inference for trials with few clusters, we introduce a permutation-based procedure to better capture finite-sample variability and a leave-one-out correction to mitigate plug-in bias. We further discuss how effect modification can be naturally incorporated, and imbalanced precision variables can be accommodated via a simple adjustment closely related to post-stratification, a novel connection of independent interest. Simulations and application to a public health trial demonstrate the robustness and efficiency of the proposed method relative to standard approaches.

Thursday, October 16th, Hess Commons, 11:45am
Levin Lecture 

Andrew An Chen, PhD
Assistant Professor, Department of Public Health Sciences
Medical University of South Carolina

Methodological Considerations in Applying Brain Charts to New Samples

Abstract:

Multi-site national and international imaging consortia have formed with the goal of precisely characterizing the human brain across the lifespan. These consortia have succeeded in collecting large samples of brain magnetic resonance imaging (MRI) scans to estimate sex-specific trajectories of brain phenotypes across age, often called brain charts. The promise of brain charts is that future researchers and clinicians will be able to assess a new scan for deviations from this healthy trajectory. However, the implementation of these charts in practice is severely limited by differences across study sites, also known as site effects. Here, we first discuss several projects in harmonization of MRI data specifically tailored to this normative modeling setting. Then, we leverage advancements in model uncertainty quantification to propose new ways to calibrate brain charts, as an alternative to harmonizing data. Finally, we apply our approaches to the Lifespan Brain Chart Consortium (LBCC) to assess generalizability to new scans from both healthy individuals and Alzheimer's disease (AD) patients. Based on our findings, we provide methodological recommendations for applying fitted brain charts to new sites.

Thursday, October 23rd, Zoom, 11:45am
Levin Lecture 

GuanNan Wang, PhD
Assistant Professor, Mathematics
College of William & Mary

Boosting Biomedical Imaging Analysis via Distributed Functional Regression and Synthetic Surrogates

Abstract:
Understanding how scalar covariates influence spatial patterns in medical imaging data, such as neuroimaging or organ-level functional images, is a central challenge in modern biomedical research. The rapid expansion of large-scale imaging studies has heightened the need for statistical frameworks that are both interpretable and computationally scalable. In this talk, I will introduce a new class of domain-aware functional regression models, where spatially varying coefficients link scalar predictors to imaging responses defined over complex 3D domains. Our Distributed Image-on-Scalar Regression framework employs a triangulation-based domain decomposition strategy, enabling efficient parallel estimation with trivariate penalized splines. This design preserves global spatial structure while flexibly accommodating subregion-specific heterogeneity. To address additional challenges posed by incomplete or noisy imaging data, I will also discuss the use of synthetic surrogates generated with modern AI tools. Rather than imputing missing values directly, these synthetic surrogates can serve as auxiliary data that can be jointly analyzed with observed images, improving efficiency while maintaining robustness to imputation error. Together, these advances pave the way for scalable, uncertainty-aware statistical analysis of high-dimensional biomedical imaging.

Thursday, October 30th, Zoom, Time TBA
Levin Lecture 

Bibhas Chakraborty, PhD
Associate Professor, Centre for Quantitative Medicine; Interim Director, Centre for Quantitative Medicine
Duke-NUS Medical School

Innovative Trial Designs in Mobile Health Using Reinforcement Learning

Abstract: TBA

Thursday, November 6th, Hess Commons, 11:45am
Levin Lecture 

Weng Kee Wong, PhD
Professor of Biostatistics
University of California Los Angeles Fielding School of Public Health

Title & Abstract TBA

Thursday, November 13th, Zoom, 11:45am
Levin Lecture 

Lina Montoya, PhD
Assistant Professor, Department of Biostatistics; Assistant Professor, School of Data Science and Society
University of North Carolina Gillings School of Global Public Health

Title & Abstract TBA

Thursday, November 20th, Zoom, 11:45am
Levin Lecture 

Didong Li, PhD
Assistant Professor, Department of Biostatistics
University of North Carolina Gillings School of Global Public Health

Title & Abstract TBA

Thursday, December 4th, Zoom, 11:45am
Levin Lecture 

Mingyao Li, PhD
Professor of Biostatistics in Biostatistics and Epidemiology
University of Pennsylvania Perelman School of Medicine

Title & Abstract TBA

Thursday, December 11th, Hess Commons, 11:45am
Levin Lecture 

Falco J. Bargagli Stoffi, PhD
Assistant Professor, Department of Biostatistics
University of California Los Angeles Fielding School of Public Health

Title & Abstract TBA