
Mutual information based matching for causal inference with observational data

  • SUNY Buffalo

Research output: Contribution to journal › Article › peer-review

12 Scopus citations

Abstract

This paper presents an information-theoretic matching methodology for causal inference from observational data. It adopts the potential outcomes framework for evaluating the strength of cause-effect relationships: the population-wide average effects of binary treatments are estimated by comparing two groups of units, the treated and the untreated (control). To reduce bias in such treatment effect estimation, the control group must be composed so that, across the compared groups, treatment is independent of the units' covariates. This requirement gives rise to a subset selection (matching) problem. The paper presents models and algorithms that solve the matching problem by minimizing the mutual information (MI) between the covariates and the treatment variable. The formulation becomes tractable thanks to derived optimality conditions that address the non-linearity of the sample-based MI function. Computational experiments with mixed-integer programming formulations and four matching algorithms demonstrate the utility of MI-based matching for causal inference studies. The algorithmic developments culminate in a matching heuristic that balances the compared groups in polynomial (close to linear) time, enabling treatment effect estimation with large data sets.
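
To make the balance criterion concrete, the following is a minimal sketch of a plug-in (sample-based) estimate of the mutual information between a single discrete covariate and a binary treatment indicator. It is not the paper's model or matching algorithm: the function name `mutual_information`, the single-covariate setting, and the toy data are assumptions made purely for illustration.

```python
import math
from collections import Counter


def mutual_information(covariate, treatment):
    """Plug-in MI estimate (in nats) between two discrete sequences.

    Illustrative helper, not taken from the paper: it uses empirical
    frequencies in place of the true joint and marginal distributions.
    """
    n = len(covariate)
    if n == 0 or n != len(treatment):
        raise ValueError("inputs must be non-empty and of equal length")

    joint = Counter(zip(covariate, treatment))  # counts of (x, t) pairs
    count_x = Counter(covariate)                # marginal counts of x
    count_t = Counter(treatment)                # marginal counts of t

    mi = 0.0
    for (x, t), c in joint.items():
        # p(x,t) * log( p(x,t) / (p(x) p(t)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (count_x[x] * count_t[t]))
    return mi


# Toy data (hypothetical): three treated units and six candidate controls.
treated_cov = [1, 1, 0]
control_cov = [0, 0, 0, 0, 1, 1]

# All controls: the covariate distribution differs between the groups.
cov_all = treated_cov + control_cov
trt_all = [1] * len(treated_cov) + [0] * len(control_cov)
mi_all = mutual_information(cov_all, trt_all)

# A selected control subset that mirrors the treated covariate distribution.
matched_controls = [1, 1, 0]
cov_matched = treated_cov + matched_controls
trt_matched = [1] * len(treated_cov) + [0] * len(matched_controls)
mi_matched = mutual_information(cov_matched, trt_matched)
```

In this toy case the full control pool gives a strictly positive MI, while the selected subset drives it to zero, which is the sense in which a lower MI indicates better covariate balance between the compared groups. Searching over control subsets efficiently is the hard part that the paper's optimality conditions and heuristics address; this sketch only evaluates the objective for a given subset.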

Original language: English
Pages (from-to): 1-31
Number of pages: 31
Journal: Journal of Machine Learning Research
Volume: 17
State: Published - Feb 1 2016

Keywords

  • Matching
  • Mutual Information
  • Observational Causal Inference
  • Optimization
  • Subset Selection
