TY - GEN
T1 - Finite Sample Guarantees of Differentially Private Expectation Maximization Algorithm
AU - Wang, Di
AU - Ding, Jiahao
AU - Hu, Lijie
AU - Xie, Zejun
AU - Pan, Miao
AU - Xu, Jinhui
N1 - Publisher Copyright:
© 2023 The Authors.
PY - 2023/9/28
Y1 - 2023/9/28
N2 - (Gradient) Expectation Maximization (EM) is a widely used algorithm for estimating the maximum likelihood of mixture models or incomplete data problems. A major challenge facing this popular technique is how to effectively preserve the privacy of sensitive data. Previous research on this problem has already lead to the discovery of some Differentially Private (DP) algorithms for (Gradient) EM. However, unlike in the non-private case, existing techniques are not yet able to provide finite sample statistical guarantees. To address this issue, we propose in this paper the first DP version of Gradient EM algorithm with statistical guarantees. Specifically, we first propose a new mechanism for privately estimating the mean of a heavy-tailed distribution, which significantly improves a previous result in [25], and it could be extended to the local DP model, which has not been studied before. Next, we apply our general framework to three canonical models: Gaussian Mixture Model (GMM), Mixture of Regressions Model (MRM) and Linear Regression with Missing Covariates (RMC). Specifically, for GMM in the DP model, our estimation error is near optimal in some cases. For the other two models, we provide the first result on finite sample statistical guarantees. Our theory is supported by thorough numerical experiments on both real-world data and synthetic data.
AB - (Gradient) Expectation Maximization (EM) is a widely used algorithm for estimating the maximum likelihood of mixture models or incomplete data problems. A major challenge facing this popular technique is how to effectively preserve the privacy of sensitive data. Previous research on this problem has already lead to the discovery of some Differentially Private (DP) algorithms for (Gradient) EM. However, unlike in the non-private case, existing techniques are not yet able to provide finite sample statistical guarantees. To address this issue, we propose in this paper the first DP version of Gradient EM algorithm with statistical guarantees. Specifically, we first propose a new mechanism for privately estimating the mean of a heavy-tailed distribution, which significantly improves a previous result in [25], and it could be extended to the local DP model, which has not been studied before. Next, we apply our general framework to three canonical models: Gaussian Mixture Model (GMM), Mixture of Regressions Model (MRM) and Linear Regression with Missing Covariates (RMC). Specifically, for GMM in the DP model, our estimation error is near optimal in some cases. For the other two models, we provide the first result on finite sample statistical guarantees. Our theory is supported by thorough numerical experiments on both real-world data and synthetic data.
UR - https://www.scopus.com/pages/publications/85175834278
U2 - 10.3233/FAIA230546
DO - 10.3233/FAIA230546
M3 - Conference contribution
AN - SCOPUS:85175834278
T3 - Frontiers in Artificial Intelligence and Applications
SP - 2435
EP - 2442
BT - ECAI 2023 - 26th European Conference on Artificial Intelligence, including 12th Conference on Prestigious Applications of Intelligent Systems, PAIS 2023 - Proceedings
A2 - Gal, Kobi
A2 - Gal, Kobi
A2 - Nowe, Ann
A2 - Nalepa, Grzegorz J.
A2 - Fairstein, Roy
A2 - Radulescu, Roxana
PB - IOS Press BV
T2 - 26th European Conference on Artificial Intelligence, ECAI 2023
Y2 - 30 September 2023 through 4 October 2023
ER -