Finding Public Health in Data Science

May 3, 2021

Whenever I am back in my hometown of Chennai, India, my studies and research are frequent conversation points with friends and family. Due to the pandemic, most of them have been exposed to the term “public health” through the news. However, they are often confused about the kind of work that public health professionals engage in.

As an epidemiology student, I study public health research methods, data science, and biostatistics. Several friends back home work in accounting firms, software companies, and other industries that use statistics and data science on a regular basis. They frequently ask me: “What makes public health data science different from other fields that use data science?”

On the surface, the techniques used to analyze health data can be applied to a wide array of variables across disciplines. With the explosion in data availability and computational power, we can now extract a great deal of useful information for various purposes, from marketing to software development. But what sets public health statisticians and data scientists apart is our interest in using data to equitably promote community health and wellbeing.

To accomplish these health goals, we are careful about where our data comes from and whether it is representative of the population we hope to impact. More data does not always equal better data, since there is often a lack of data representing minority and underserved populations. For example, research on vaccine efficacy could be strongly biased if the data is collected primarily from white men rather than from individuals with different body types and diverse racial and ethnic backgrounds. The result: not only inaccurate conclusions but also research and interventions that inadequately serve the people who need them most.

Avoiding these biases in study design, understanding how data is collected, and using the information at hand to produce accurate and representative outcomes are thus crucial in furthering public health’s mission of healthcare equity.

Fighting these biases requires public health data scientists to focus on the fundamental causes of health problems while analyzing data, rather than on observable statistical associations alone. Identifying these root causes necessitates an interdisciplinary approach that combines the perspectives of healthcare professionals, biomedical researchers, and other experts to gain a more holistic understanding of the issue at hand. For example, healthcare professionals can help data scientists determine whether an observable association is clinically relevant or has tangible effects on health outcomes.

Working together with these experts, we can use data to identify risk factors for illnesses or provide evidence for policy changes by combining problem-solving skills, empathy, and analytical expertise.

Since public health aims to address structural issues that affect community health, ethical decision-making is crucial to closing the gap in health outcomes and protecting the people involved in research and intervention planning. When formulating research questions and conducting analyses, we as public health scientists work to ensure that our study findings substantially benefit the population being studied. In fact, the study or intervention design process usually begins by working with community members to clearly understand their needs so we can address them. Finally, we present our results in a clear and concise manner to community members and other stakeholders using easy-to-understand maps, reports, flowcharts, and graphs.

With health disparities widening rapidly and public health issues such as police brutality and COVID-19 on the rise, the need for nuanced, people-centric data science cannot be overstated. Through my experiences living in India and the U.S., I have witnessed a palpable divide in health outcomes based on race, socioeconomic status, religious beliefs, nationality, and caste assigned at birth. The ongoing politicization of public health interventions such as vaccine distribution and mask mandates in the U.S., as well as discriminatory policies that affect the physical and mental health of minority communities, such as anti-Muslim laws in India, are deepening this already great divide.

As future public health professionals, my classmates and I have a daunting challenge ahead of us due to the breadth of social, political, biological, and environmental factors that need to be addressed to equitably improve population health and wellbeing. However, we can harness the power of public health data science to understand the burden of health-related problems that affect our communities and subsequently identify solutions that can address these problems. As the field evolves, I am excited by the prospect of applying a uniquely public health version of data science to promote health and improve the lives of communities around the world.

Adarsh Ramakrishnan, MPH '21 in Epidemiology, is from Chennai, India. He received his Bachelor of Science and Arts degree in Biology from the University of Texas at Austin.