Meta-Learning without Data via Wasserstein Distributionally-Robust Model Fusion

  • Zhenyi Wang
  • Xiaoyang Wang
  • Li Shen
  • Qiuling Suo
  • Kaiqiang Song
  • Dong Yu
  • Yan Shen
  • Mingchen Gao

  • SUNY Buffalo
  • Tencent
  • JD Explore Academy

Research output: Contribution to journal › Conference article › peer-review

3 Scopus citations

Abstract

Existing meta-learning works assume that each task has available training and testing data. However, we can only use many available pre-trained models without accessing their training data in practice. We often need a single model to solve different tasks simultaneously as this is much more convenient to deploy the models. Our work aims to meta-learn a model initialization from these pre-trained models without using corresponding training data. We name this challenging problem setting Data-Free Learning To Learn (DFL2L). We propose a distributionally robust optimization (DRO) framework to learn a black-box model to fuse and compress all the pre-trained models into a single network to address this problem. The proposed DRO framework diversifies the learned task embedding associated with each pre-trained model to cover the diversity in the underlying training task distributions, encouraging good generalization to unseen new tasks. We sample a meta-initialization from the black-box network during meta-testing for fast adaptation to unseen new tasks. Extensive experiments on offline and online DFL2L settings and several real image datasets demonstrate the effectiveness of the proposed methods.

Original language: English
Pages (from-to): 2045-2055
Number of pages: 11
Journal: Proceedings of Machine Learning Research
Volume: 180
State: Published - 2022
Event: 38th Conference on Uncertainty in Artificial Intelligence, UAI 2022 - Eindhoven, Netherlands
Duration: Aug 1 2022 - Aug 5 2022

