Research

All BEST students complete a research project under the supervision of an experienced faculty mentor. For both students and faculty, the research project and final capstone presentation serve as the highlight of the program.

With the opportunity to learn about topics ranging from computer simulation of genetic disorders, to classification trees, to models for clinical decision making, students go far beyond traditional classroom education in order to synthesize and integrate all they learn. In doing so, students are able to demonstrate and better appreciate the importance of statistical methods in biomedical research.

Schedule

The program allocates 6-8 hours per week over about 7 weeks to trainee-mentor research time. Students are assigned a faculty member prior to arriving on campus.

During weeks two through seven, students work intensively with their mentors on their research projects. During the eighth and final week, students, faculty, coordinators, and mentors gather for a capstone symposium, where students present a summary of their research project.

2024 Research Projects:

Causal discovery of signaling networks: learning causal graphical models from genetic data to understand pathways of disease

Mentor: Daniel Malinsky, PhD, Assistant Professor of Biostatistics
Mentees: Avani Ghosh; Radiya Imran; James Hiller

“Causal inference” aims to disentangle cause-and-effect relationships from mere statistical correlations, which is important for understanding mechanisms and the consequences of interventions on complex systems. “Causal discovery” is about estimating causal structures (networks or graphs) from purely observational data. We will apply graph-learning algorithms to data on gene expression/protein concentration from human cells and explore how different statistical choices embedded in those algorithms affect the sparsity of the learned graph.

Characterizing Determinants of Cardiovascular Health

Mentor: Yihong Zhao, PhD, Professor of Data Science, School of Nursing
Mentees: Tatyana Bowers; Joniel Lewis; Harris Shaikh

Cardiovascular diseases remain the leading cause of mortality globally, creating an urgent need for improved preventive measures and treatments. Understanding the complex interplay of various determinants that influence cardiovascular health is crucial for developing targeted intervention strategies that can be personalized based on individual risk profiles. In this project, we have two aims. First, we will develop a metric that can be used to evaluate cardiovascular health. Second, we will use state-of-the-art machine learning methods to evaluate the relative impact of various determinants such as lifestyle choices, environmental factors, and socio-economic status on cardiovascular health.

Sleep after Stroke: Examining the relationship of sleep apnea with post-stroke outcomes

Mentor: Ari Shechter, PhD, Associate Professor of Medical Sciences (in Medicine) at CUMC
Mentees: Zina Mojekwu; Kayla Williamson

Sleep is closely related to nearly all aspects of physical and mental health and disease. The main purpose of the study was to examine how sleep in individuals who experienced a stroke relates to the risk of recurrent cardiovascular events and to psychological health over the year following stroke. This study was a longitudinal cohort study with measures taken at baseline (i.e., at stroke hospitalization) and at 1-, 6-, and 12-month follow-ups. The current analysis will focus on sleep apnea (a disorder where people stop breathing repeatedly during sleep), examining its prevalence and its relationship to post-stroke outcomes like fatigue, depression, anxiety, level of physical disability, rehospitalizations, and recurrent events.

Assessing AI Tools in Academic Research Writing

Mentor: Tian Gu, PhD, Assistant Professor of Biostatistics
Mentees: Ariana Yahira; Adheesh Perera

Large Language Models have facilitated the widespread adoption of AI tools such as ChatGPT that serve various roles in academic research, from basic coding to language polishing. Increasing numbers of AI-powered tools on the market promise to make essay writing easier and faster. These tools are often advertised on social media platforms, with names like JenniAI, Aithor, and Scite. However, is it really true that one can effortlessly use an AI assistant to write an essay? Can these tools really provide accurate and comprehensive research citations as they claim? In what areas and to what extent can they be helpful in writing an academic research paper? This project aims to systematically assess the effectiveness of AI tools in academic research writing, specifically in the context of literature review. We will evaluate their accuracy, efficiency, and impact on the quality of writing for statistical papers with either an applied or a methodological focus.

Cortical myelin profile variations across the lifespan in NKI data

Mentor: Seonjoo Lee, PhD, Associate Professor of Clinical Biostatistics (in Psychiatry)
Mentees: Divinia Ashley; Hanna Duque

Demyelination is observed in development, healthy aging, and age-related neurodegenerative disorders. It is important to identify the lifetime patterns of intracortical myelin. We will analyze data from n=2000 participants aged 6-85 in the NKI-RS dataset. We will investigate intracortical myelin using the T1w/T2w ratio and linear and nonlinear regression analyses.

Chitwan Valley Family Study – data archiving and hair cortisol

Mentor: Sabrina Hermosilla, PhD, Assistant Professor of Population and Family Health
Mentees: Anagha Chundury; Naomi Saenger

This project builds on the 27+ year CVFS (https://cvfs.isr.umich.edu/). The students will engage in analytic data management, explore individual identifiers across multiple datasets, and work with the data management team to create a unique identifier solution and implement it across datasets. Additionally, they will take part in a project that will conduct analyses in Stata comparing respondents who completed the hair cortisol survey with those who did not, with the goal of understanding what characteristics predict study protocol compliance and how best to design data collection tools to maximize compliance.

 

2023 Research Projects:

Assessing Racial and Socioeconomic Inequities in the Public Health Burden of Hyper-Policing in NYC

Mentor: John Pamplin III, PhD, MPH, Assistant Professor of Epidemiology 
Mentees: Ruva Kiara and Kennedi Scales

Hyper-policing, defined as extraordinary and aggressive levels of police attention directed towards a specific neighborhood or community, is a prominently used policing tool that may also negatively impact health. Elements of criminal legal system exposure are associated with worse mental health, increased risk of communicable illness, increased risk of overdose for those who use drugs, and increased risk of injury and death by way of police brutality. Many of these negative relationships are especially consequential for Black people, who have much greater exposure to all aspects of the criminal legal system. Despite recognition of the vast potential public health consequences of broad criminal legal system exposure, most epidemiologic studies focus only on incarceration. The objective of this project was to assess patterns and predictors of neighborhood hyper-policing and other public health-related outcomes in NYC using publicly available NYPD policing

Evaluating the Effect of Psychiatric Comorbidity on Development Using Healthy Brain Network Data

Mentor: Seonjoo Lee, PhD, Associate Professor of Clinical Biostatistics (in Psychiatry)
Mentees: Jude Ighovoyivwi and Torre Lloyd

The primary aim of this study was to identify psychiatric comorbidity patterns in children and adolescents and to find associations with subjective and objective developmental outcomes. We used the Child Behavior Checklist (CBCL) and the NIH Toolbox. First, we computed descriptive statistics for the psychiatric comorbidity assessed by KSADS. Then, we identified the psychiatric comorbidity groups using latent class analysis. Finally, we compared the subjective and objective developmental outcomes across the identified clusters.

Comparisons of Medical Cost Trajectories between Non-Hispanic Black and Non-Hispanic White Patients with Newly Diagnosed Localized Lung Cancer

Mentor: Shikun Wang, PhD, Assistant Professor of Biostatistics, Herbert Irving Cancer Center
Mentee: Lidio Jaimes, Jr.

The objective of this study was to compare medical cost trajectories between non-Hispanic White (NHW) and non-Hispanic Black (NHB) patients with localized lung cancer from TCR-Medicare, using a recently developed statistical model that estimates cost trajectories conditional on survival time. First, we computed descriptive statistics for the medical cost data. Then, we conducted the cost-effectiveness analysis. Finally, we compared the cost trajectories of different ethnic (and/or other sociodemographic or clinical) subgroups.

Influence of Cardiorespiratory Signal Fluctuations on the BOLD Signal

Mentor: Yihong Zhao, PhD, Professor of Data Science, School of Nursing
Mentees: Rachel Jackson and Jonetta Lah

Blood oxygen-level-dependent functional magnetic resonance imaging (BOLD fMRI) has become a popular technique for investigating brain function in healthy individuals and patients, as well as in animal studies. The primary aims of this project were to 1) understand noise and non-neuronal contributions to the BOLD signal, 2) identify methods used to clean the BOLD signal, and 3) quantify the influence of cardiorespiratory signal fluctuations on the BOLD signal in resting-state fMRI data.

Genetic Association Between Alzheimer's Disease and Cardio-Cerebrovascular Risk Factor

Mentor: Annie Lee, PhD, Assistant Professor of Neurology Science
Mentees: Danielle Savellano and Kayla Scott-McDowell

The primary aim of this study was to identify genes that interact with cardiovascular risk factors (CVRFs) such as hypertension and diabetes to confer Alzheimer’s disease (AD) risk in multi-ethnic cohorts, and to investigate how they perturb molecular pathways leading to AD by analyzing multi-omics (transcriptomics and proteomics) profiles in humans. First, we identified the genes using a gene-based gene-environment interaction test. Then, we used multi-omics profiles in human brains to characterize the functional effects of the candidate genes using regression analysis, and their respective disease pathways related to vascular interactions in AD using pathway enrichment analysis.

Biological Measures of Stress Response in Depressed Patients, and Associations with Risk of Suicide

Mentor: Hanga Galfalvy, PhD, Associate Professor of Biostatistics (in Psychiatry)
Mentees: Mark Almazo Rosendo and Kaylinn Escobar

The aim of the study was to describe how inflammatory markers implicated in major depressive disorder (MDD) and suicide risk change with stress. We administered the Trier Social Stress Test (TSST) to participants with MDD and healthy volunteers (HV) to assess responses to acute psychosocial stress. Blood samples were collected before and at 60 and 90 minutes after the TSST procedure. Up to 48 analytes were quantified. Outcome measures were baseline levels and change over time, measured using area under the curve with respect to baseline (AUCi) and peak change. We compared the outcomes between the MDD and HV groups using univariate and multivariate analyses and tested associations with suicidal ideation and history of suicidal behavior.

Chitwan Valley Family Study – Data Archiving and Hair Cortisol

Mentor: Sabrina Hermosilla, PhD, Assistant Professor of Population and Family Health
Mentees: Marco Maluf and Ainhoa Petri-Hidalgo

This project builds on the 27+ year CVFS (https://cvfs.isr.umich.edu/). The students engaged in analytic data management, explored individual identifiers across multiple datasets, and worked with a data management team to create a unique identifier solution and implement it across datasets. Additionally, they took part in a project that conducted analyses in Stata comparing respondents who completed the hair cortisol survey with those who did not, with the goal of understanding what characteristics predict study protocol compliance and how best to design data collection tools to maximize compliance.