Yuanjia Wang, PhD

  • Professor of Biostatistics


Dr. Yuanjia Wang is a Professor in the Department of Biostatistics and Department of Psychiatry at Columbia University. She is also a member of the Data Science Institute at Columbia University and a core member of the Mental Health Data Science division at New York State Psychiatric Institute. Dr. Wang’s research focuses on developing innovative machine learning methods, generative models and artificial intelligence for precision medicine in mental health and neurodegenerative diseases. Her work addresses complex challenges in risk prediction, early intervention, and prevention strategies. By integrating large-scale data from clinical trials, electronic health records, high-dimensional biomarkers, digital phenotypes, and behavioral tests, Dr. Wang's research aims to improve patient outcomes and healthcare efficiency through personalized and safe treatment strategies and more precise disease modeling.

Dr. Wang has served as the principal investigator (PI) on multiple NIH-funded R01 research grants and collaborates widely with psychiatrists and neurologists. During the COVID-19 pandemic, she led a team that contributed weekly forecasts to the CDC COVID Forecast Hub, directly informing CDC policy decisions. Additionally, Dr. Wang is passionate about training the next generation of leaders in biostatistics. She is the PhD Program Director in the Department of Biostatistics and the contact PI of an NIH T32 training grant on Mental Health Biostatistics and Data Science. She has supervised more than 16 doctoral students, multiple postdoctoral researchers, and junior faculty. As of 2024, her students have won over 20 dissertation awards and student paper awards from the ASA, ENAR and ICHPS. 

Dr. Wang has held leadership roles within the American Statistical Association (ASA), including serving as Section Chair and Program Chair for the Mental Health Statistics and Health Policy Statistics Sections. She has been a standing member of NIH study sections, an Associate Editor for major statistics journals, a member of numerous program committees for statistics and informatics conferences, and a member of the ASA Lester R. Curtin Award Committee. In 2016, she was elected a Fellow of the American Statistical Association.

Academic Appointments

  • Professor of Biostatistics

Administrative Titles

  • Biostatistics PhD Program Director
  • T-32 Training Grant (Mental Health Biostatistics and Data Science) Co-Director


  • Female

Credentials & Experience

Education & Training

  • BA, 2001 University of Science and Technology of China, Hefei, China
  • PhD, 2005 Department of Statistics, Columbia University

Committees, Societies, Councils

  • 2021-2022: Section Chair, American Statistical Association Health Policy Statistics Section (HPSS)
  • 2018-2019: Section Chair, American Statistical Association Section on Mental Health Statistics (MHS)
  • 2019-2020, 2023-2024: Member, American Statistical Association Lester R. Curtin Award Committee
  • Scientific Program Committee: ENAR, JSM, International Conference on Brain Informatics
  • NIH Study Sections; NINDS Huntington Disease Biospecimen Resource Access Committee

Honors & Awards

  • 2016: Elected Fellow, American Statistical Association (ASA)
  • 2017: First Place Winning Team for the Parkinson’s Disease Digital Biomarker DREAM Challenge
  • 2015-2018: Tow Faculty Leadership Scholars Award
  • Invited speaker at NIH and FDA workshops
  • Associate Editor: Journal of the American Statistical Association, Biometrics


Dr. Wang’s research centers on developing innovative machine learning methods, generative models and artificial intelligence to advance precision medicine in psychiatry and neurodegenerative diseases. She has extensive experience analyzing large-scale multimodal data, including clinical trials, electronic health records, high-dimensional biomarkers, digital phenotypes, and behavioral tests, to tackle complex methodological and practical issues. Ultimately, Dr. Wang's research seeks to improve patient outcomes and healthcare efficiency by constructing personalized and safe treatment strategies and more precise disease models.

Research Interests

  • Biostatistical Methods
  • Electronic Health Records (EHR/EMR)
  • Machine learning and artificial intelligence
  • Mental Health
  • Precision Medicine


Present Grants

Contact Principal Investigator, “Research Training Program in Mental Health Biostatistics and Data Science”. T32MH135856.

Principal Investigator, “Machine Learning Methods for Optimizing Individualized Treatment Strategies for Precision Psychiatry”. R01MH123487.

Principal Investigator, “Statistical Methods for Integrating Mixed-type Biomarkers and Phenotypes in Neurodegenerative Disease Modeling”. R01NS073671.

Multiple Principal Investigator (MPI), “Statistical and Machine Learning Methods to Improve Dynamic Treatment Regimens Estimation Using Real World Data”. R01GM124104.

Selected Publications

NCBI Bibliography

Yang B, Guo X, Loh JM, Wang Q, Wang Y. Learning optimal biomarker-guided treatment policy for chronic disorders. Stat Med. 2024 Jun 30;43(14):2765-2782. doi: 10.1002/sim.10099. Epub 2024 May 3. PubMed PMID: 38700103; PubMed Central PMCID: PMC11178467.

Ma Y, Wang Y. Estimating Disease Distribution Functions from Censored Mixture Data. Applied Statistics: Journal of the Royal Statistical Society, Series C. Forthcoming.

Guo X, Zeng D, Wang Y. A Semiparametric Inverse Reinforcement Learning Approach to Characterize Decision Making for Mental Disorders. J Am Stat Assoc. 2024;119(545):27-38. doi: 10.1080/01621459.2023.2261184. Epub 2023 Nov 22. PubMed PMID: 38706706; PubMed Central PMCID: PMC11068237.

Wang Q, Wang Y. Multilayer Exponential Family Factor models for integrative analysis and learning disease progression. Biostatistics. 2023 Dec 15;25(1):203-219. doi: 10.1093/biostatistics/kxac042. PubMed PMID: 36124992; PubMed Central PMCID: PMC10939400.

Lin DY, Xu Y, Gu Y, Zeng D, Wheeler B, Young H, Moore Z, Sunny SK. Effects of COVID-19 vaccination and previous SARS-CoV-2 infection on omicron infection and severe outcomes in children under 12 years of age in the USA: an observational cohort study. Lancet Infect Dis. 2023 Nov;23(11):1257-1265. doi: 10.1016/S1473-3099(23)00272-4. Epub 2023 Jun 16. PubMed PMID: 37336222; PubMed Central PMCID: PMC10275621.

Zou H, Zeng D, Xiao L, Luo S. Bayesian inference and dynamic prediction for multivariate longitudinal and survival data. Ann Appl Stat. 2023 Sep;17(3):2574-2595. doi: 10.1214/23-aoas1733. Epub 2023 Sep 7. PubMed PMID: 37719893; PubMed Central PMCID: PMC10500582.

Wang Q, Loh JM, He X, Wang Y. A latent state space model for estimating brain dynamics from electroencephalogram (EEG) data. Biometrics. 2023 Sep;79(3):2444-2457. doi: 10.1111/biom.13742. Epub 2022 Sep 19. PubMed PMID: 36004670; PubMed Central PMCID: PMC10894450.

Xu T, Chen Y, Zeng D, Wang Y. Mixed-Response State-Space Model for Analyzing Multi-Dimensional Digital Phenotypes. J Am Stat Assoc. 2023;118(544):2288-2300. doi: 10.1080/01621459.2023.2225742. Epub 2023 Jul 20. PubMed PMID: 38404670; PubMed Central PMCID: PMC10888145.

Yang S, Gao C, Zeng D, Wang X. Elastic integrative analysis of randomised trial and real-world data for treatment heterogeneity estimation. J R Stat Soc Series B Stat Methodol. 2023 Jul;85(3):575-596. doi: 10.1093/jrsssb/qkad017. eCollection 2023 Jul. PubMed PMID: 37521165; PubMed Central PMCID: PMC10376438.

Yu H, Wang Y, Zeng D. A general framework of nonparametric feature selection in high-dimensional data. Biometrics. 2023 Jun;79(2):951-963. doi: 10.1111/biom.13664. Epub 2022 Apr 7. PubMed PMID: 35318639; PubMed Central PMCID: PMC10540052.