
LightToken: A Task and Model-agnostic Lightweight Token Embedding Framework for Pre-trained Language Models

  • Haoyu Wang
  • Ruirui Li
  • Haoming Jiang
  • Zhengyang Wang
  • Xianfeng Tang
  • Bin Bi
  • Monica Cheng
  • Bing Yin
  • Yaqing Wang
  • Tuo Zhao
  • Jing Gao
  • Purdue University
  • Amazon.com, Inc.
  • Georgia Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

6 Scopus citations

Abstract

Pre-trained language models (PLMs) such as BERT, RoBERTa, and DeBERTa have achieved state-of-the-art performance on various downstream tasks. The enormous sizes of PLMs hinder their deployment in resource-constrained scenarios, e.g., on edge and mobile devices. To address this issue, many model compression approaches have been proposed to reduce the number of model parameters. This paper focuses on compressing the token embedding matrices of PLMs, which typically make up a large proportion (around 20-30%) of the entire model parameters. Existing efforts to compress token embeddings usually require the introduction of customized compression architectures or the optimization of model compression processes for individual downstream tasks, limiting their applicability in both model and task dimensions. To overcome these limitations and adhere to the principle of "one-for-all", we propose a lightweight token embedding framework named LightToken, which is able to produce compressed token embeddings in a task- and model-agnostic fashion. LightToken is generally compatible with different architectures and applicable to any downstream task. Specifically, through an integration of low-rank approximation, a novel residual binary autoencoder, and a new compression loss function, LightToken can significantly improve the model compression ratio. To demonstrate the effectiveness of LightToken, we conduct comprehensive experiments on natural language understanding and question answering tasks. In particular, LightToken improves the state-of-the-art token embedding compression ratio from 5 to 25 and outperforms the existing token embedding compression approaches by 11% and 5% on the GLUE and SQuAD v1.1 benchmarks, respectively.
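
To make the compression ratios in the abstract concrete, the sketch below works through the parameter arithmetic for only the low-rank approximation component. It is not the LightToken method: the residual binary autoencoder and the compression loss are not modeled, and the ratios reported in the abstract come from the full framework. The vocabulary size (30522) and hidden dimension (768) are assumed BERT-base values and do not appear in the abstract; all function names are hypothetical.

```python
import math


def embedding_params(vocab_size: int, hidden_dim: int) -> int:
    """Parameter count of a dense token embedding matrix (vocab_size x hidden_dim)."""
    return vocab_size * hidden_dim


def low_rank_params(vocab_size: int, hidden_dim: int, rank: int) -> int:
    """Parameter count after factorizing the matrix into
    (vocab_size x rank) and (rank x hidden_dim) factors."""
    return rank * (vocab_size + hidden_dim)


def compression_ratio(vocab_size: int, hidden_dim: int, rank: int) -> float:
    """Original parameter count divided by the factorized parameter count."""
    return embedding_params(vocab_size, hidden_dim) / low_rank_params(
        vocab_size, hidden_dim, rank
    )


def max_rank_for_ratio(vocab_size: int, hidden_dim: int, target_ratio: float) -> int:
    """Largest rank whose factorization still reaches the target compression ratio."""
    return math.floor(
        embedding_params(vocab_size, hidden_dim)
        / (target_ratio * (vocab_size + hidden_dim))
    )


if __name__ == "__main__":
    # Assumed BERT-base sizes; not taken from the abstract.
    vocab_size, hidden_dim = 30522, 768
    print("dense embedding parameters:", embedding_params(vocab_size, hidden_dim))
    for target in (5, 25):
        rank = max_rank_for_ratio(vocab_size, hidden_dim, target)
        ratio = compression_ratio(vocab_size, hidden_dim, rank)
        print(f"target {target}x: rank {rank}, achieved ratio {ratio:.2f}")
```

Under these assumptions, reaching a ratio of 25 with factorization alone would require a very small rank, which gives some intuition for why the abstract combines low-rank approximation with additional components rather than relying on it by itself.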

Original language: English
Title of host publication: KDD 2023 - Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Publisher: Association for Computing Machinery
Pages: 2302-2313
Number of pages: 12
ISBN (Electronic): 9798400701030
State: Published - Aug 4 2023
Event: 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2023 - Long Beach, United States
Duration: Aug 6 2023 to Aug 10 2023

Publication series

Name: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
ISSN (Print): 2154-817X

Conference

Conference: 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2023
Country/Territory: United States
City: Long Beach
Period: 08/6/23 to 08/10/23

Keywords

  • compression
  • pre-trained language model
