Collage: headshot of Daniel Malinsky, circuit boards, and text "Faculty Q&A"

A Biostatistician Pursues Causality and Algorithmic Fairness

November 18, 2024

Daniel Malinsky was in his first semester of a bachelor’s degree at Columbia in October 2007 when the U.S. stock market hit an all-time high. As Malinsky explored music composition and theory, history, particle physics, politics, and philosophy, the U.S. economy tanked. Bear Stearns and Lehman Brothers collapsed; so, too, did major mortgage lenders and guarantors. Just as markets in the U.S. recovered in June 2009, international markets destabilized.

Malinsky—now an assistant professor of biostatistics at Columbia Mailman—was an undergraduate research assistant in particle physics when international economic policy caught his eye. “Especially austerity policies in Europe,” he says. What intrigued him was the underlying body of statistical work used to justify those moves, especially economists’ assumptions about how debt and investment affect GDP. “That got me thinking,” says Malinsky. “How do we decide what are the important causal factors in, say, economic stability? How do we know if these proposed policy changes are the right ones?”

Malinsky would go on to earn a PhD in logic, computation, and methodology, with a dissertation on data-driven causal modeling for policy. As anyone who’s sat through the first week of statistics 101 learns, correlation does not equal causation. Yet the burgeoning world of big data generates a veritable tsunami of patterns and associations—and when the data informs policies that affect population health, life and death can hang in the balance. Causal modeling applies machine learning algorithms to find meaning in the morass. Says Malinsky: “It’s all about disentangling cause and effect.”
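For readers new to the distinction, here is a minimal simulation (illustrative only, and not Malinsky's own code or any specific method from his research) of the classic pitfall: a shared cause can make two variables look strongly related even though neither affects the other, and adjusting for that shared cause dissolves the association.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical setup: a shared cause ("confounder") drives both x and y,
# even though x has no causal effect on y and vice versa.
confounder = rng.normal(size=n)
x = confounder + rng.normal(size=n)   # x depends only on the confounder
y = confounder + rng.normal(size=n)   # y depends only on the confounder

# Marginally, x and y appear strongly associated.
print(np.corrcoef(x, y)[0, 1])        # roughly 0.5

# Adjusting for the confounder removes the association:
# regress it out of both variables and correlate the residuals.
beta_x = np.cov(x, confounder)[0, 1] / np.var(confounder)
beta_y = np.cov(y, confounder)[0, 1] / np.var(confounder)
rx = x - beta_x * confounder
ry = y - beta_y * confounder
print(np.corrcoef(rx, ry)[0, 1])      # close to 0
```

Causal modeling methods formalize this kind of reasoning at scale, sorting out which of the many associations in a large dataset plausibly reflect genuine cause and effect.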

How did you come to the field of public health?

Malinsky: I got interested in these methodological and philosophical questions of causal inference because I wanted to make a difference to practice and policy. The history of thinking about causal influence in epidemiological and biomedical fields is rich and open to new ideas. And in public health, new ideas are expected to have real, substantive effects on practice. That expectation reduces the amount of time spent on mathematical ideas that are really cool but won’t go anywhere practically.

You’ve written about the risk that AI trained on real-world data will perpetuate harm, and you’ve proposed strategies to optimize fairness, including in the field of precision medicine. What is algorithmic fairness?

Malinsky: The sub-discipline of algorithmic fairness emerged in machine learning, and to some extent in statistics, where people were trying to propose broadly technocratic solutions to what is ultimately a value-laden, substantively ethical and political problem: what kind of world do we want to make with our algorithms? You could think of this branch of my work, broadly, as examining the benefits and harms that algorithms can perpetuate in socially impactful settings, including healthcare and other domains of life.

What drew you into the conversation?

Malinsky: There are some problems where the seemingly technical, objective solutions that people propose are ultimately built on a base of unstated—but very important—values and ethical commitments. I thought that bringing to bear both my technical insights from causal inference, statistics, and machine learning and the philosophical arguments that are being left out would be a way of making an impact in this area.

Where did your passion for justice and fairness originate?

Malinsky: I come from a family where we argue about politics a lot, and thinking about history and the future, the way the world should be, was a ubiquitous part of my family life and upbringing. Even though I was someone who did science and nerdy stuff, it never felt separate or different from thinking about making the world a better place.

How does algorithmic fairness affect public health practice?

Malinsky: Using algorithms in different ways can have substantially different impacts on racial disparities in medicine. There might be some situations where you can modify the ways that algorithms work to respect fairness norms or to mitigate or counteract systemic injustice, for example in precision medicine, screening for child welfare, or explaining disparities in liver transplant outcomes. I do work that combines these ideas from causal inference and from social and political thinking.

Many of the projects you’ve tackled involve disparities in health outcomes, in areas including air quality, cardiac recovery, and emphysema. How do you pick your projects?

Malinsky: A lot of project selection is influenced by who you get to meet and who you get along with, work-wise. The one area I perhaps chose more intentionally was environmental determinants of health, especially air pollution. The policy relevance of air pollution epidemiology and causal inference methods is not only clear but pretty direct: under the Clean Air Act, Environmental Protection Agency officials have to synthesize the latest evidence every few years, and that evidence gets translated into policy. Seeing how scientific work has the power to influence policy was inspiring to me.