Hi! I am a postdoc at Cornell, working with Emma Pierson and Jenna Wiens. I completed my Ph.D. in the Clinical and Applied Machine Learning group at MIT, where I was lucky to be advised by John Guttag. Before that, I was at MIT for undergrad, where I majored in computer science with a concentration in South Asian studies.

I work on machine learning for healthcare. My current research (often) falls into one or more of these categories:

Methods to measure human behavior in health datasets
I believe that we can improve healthcare not just by training better predictive models, but also by building better *descriptive* models of how care is currently delivered.
    > How can we measure the extent to which diseases are underdiagnosed in different patient subgroups? (NWH 2024)
    > How can we measure different patterns of health access? (under review)
    > How can we measure overtreatment? (in progress)
Methods to update and evaluate machine learning models
There is substantial room to improve the ways we update, evaluate, and select machine learning models.
    > How can we efficiently update models to be more accurate, robust, and calibrated? (ICCV 2021, under review)
    > How can we facilitate semantically grounded, context-specific evaluation? (CHI 2023)
    > How can we best evaluate classifiers in the absence of abundant labeled data? (under review)
Health equity
Can we use AI to characterize and mitigate persistent health inequities? I (and many of my co-authors) would say yes! I am especially committed to translating advances in machine learning to women's health.
    > What disparities might we miss without access to granular race data? (MLHC 2023)
    > What features are important when predicting ovulation? (SR 2023)
    > How can we better measure the prevalence of intimate partner violence? (NWH 2024)
    > What new opportunities in health equity do large language models enable? (NEJM AI 2025)

Sometimes I describe my interests as “everything but model training”. This is because (1) I am impatient and (2) I believe the road from messy to clean data and the road from trained model to deployment both raise important unanswered questions.