
Efficient algorithms for graph regularized PLSA for probabilistic topic modeling

  • Xin Wang
  • Ming Ching Chang
  • Lan Wang
  • Siwei Lyu
  • CuraCloud Corporation
  • State University of New York System
  • Tianjin Normal University

Research output: Contribution to journal › Article › peer-review

17 Scopus citations

Abstract

Probabilistic latent semantic analysis (PLSA) is a popular data analysis method that aims to discover the underlying semantic structure of input data. In this work, we describe a method for probabilistic topic analysis of image and text data based on a new representation of graph-regularized PLSA (GPLSA). In GPLSA, data entities are mapped to the nodes of an undirected graph, and the similarity between the topic compositions of connected entities is measured by a divergence between discrete probability distributions. This divergence is incorporated as a graph regularizer that augments the original PLSA algorithm. Furthermore, we extend the GPLSA algorithms to multiple data modalities based on the connections between the data entities of each modality. We propose efficient multiplicative iterative algorithms for GPLSA with three popular regularizers, namely the ℓ1, ℓ2, and symmetric KL divergences. In each case, we derive simple, efficient numerical solutions that require only matrix arithmetic operations during the optimization. Experimental results demonstrate the efficacy of GPLSA over state-of-the-art methods.
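
To make the ingredients named in the abstract concrete, the following is a minimal NumPy sketch, not the authors' algorithm. It shows (a) the standard EM updates for plain PLSA written in multiplicative matrix form and (b) a graph penalty that sums a divergence between the topic compositions of connected entities, for the ℓ1, ℓ2, and symmetric KL cases. The function names `plsa_em` and `graph_penalty`, the adjacency matrix `A`, and all parameter names are illustrative assumptions; the paper's actual regularized update rules are not reproduced here.

```python
import numpy as np

def plsa_em(X, k, n_iter=50, seed=0):
    """Plain (unregularized) PLSA fitted by EM, in multiplicative matrix form.

    X : (n_docs, n_words) array of nonnegative counts.
    Returns W (n_docs, k), rows are P(topic | doc), and
            H (k, n_words), rows are P(word | topic).
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    eps = 1e-12  # guards against division by zero
    W = rng.random((n, k))
    W /= W.sum(axis=1, keepdims=True)
    H = rng.random((k, m))
    H /= H.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        R = X / (W @ H + eps)      # counts divided by current model P(w | d)
        W_new = W * (R @ H.T)      # unnormalized P(topic | doc)
        H_new = H * (W.T @ R)      # unnormalized P(word | topic)
        W = W_new / (W_new.sum(axis=1, keepdims=True) + eps)
        H = H_new / (H_new.sum(axis=1, keepdims=True) + eps)
    return W, H

def graph_penalty(W, A, kind="l2"):
    """Sum over graph edges of a divergence between topic compositions.

    W : (n, k) row-stochastic topic compositions.
    A : (n, n) symmetric nonnegative adjacency matrix (assumed input).
    kind : "l1", "l2" (squared Euclidean), or "skl" (symmetric KL).
    """
    eps = 1e-12
    diff = W[:, None, :] - W[None, :, :]          # pairwise differences, (n, n, k)
    if kind == "l1":
        D = np.abs(diff).sum(axis=2)
    elif kind == "l2":
        D = (diff ** 2).sum(axis=2)
    elif kind == "skl":
        # KL(p || q) + KL(q || p) = sum_z (p_z - q_z) * log(p_z / q_z)
        log_ratio = np.log(W[:, None, :] + eps) - np.log(W[None, :, :] + eps)
        D = (diff * log_ratio).sum(axis=2)
    else:
        raise ValueError(f"unknown divergence: {kind}")
    return 0.5 * (A * D).sum()                    # each undirected edge counted once
```

A graph-regularized objective of the kind the abstract describes would combine the PLSA log-likelihood with a weighted penalty of this form; the weighting and the resulting multiplicative updates are specific to the paper and are left out of this sketch.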

Original language: English
Pages (from-to): 236-247
Number of pages: 12
Journal: Pattern Recognition
Volume: 86
State: Published - Feb 2019

Keywords

  • Clustering
  • Graph regularization
  • Probabilistic latent semantic analysis
  • Topic analysis
