Distance Metrics and Clustering Methods for Mixed-type Data

  • SUNY Buffalo
  • Arenadotio

Research output: Contribution to journal › Article › peer-review

58 Scopus citations

Abstract

In spite of the abundance of clustering techniques and algorithms, clustering mixed interval (continuous) and categorical (nominal and/or ordinal) scale data remains a challenging problem. In order to identify the most effective approaches for clustering mixed-type data, we use both theoretical and empirical analyses to present a critical review of the strengths and weaknesses of the methods identified in the literature. Guidelines on approaches to use under different scenarios are provided, along with potential directions for future research.

Original language: English
Pages (from-to): 80-109
Number of pages: 30
Journal: International Statistical Review
Volume: 87
Issue number: 1
State: Published - Apr 2019

Keywords

  • Discretisation
  • Gower's distance
  • Mahalanobis distance
  • dummy coding
  • k-means clustering
  • machine learning
  • mixture model
  • multivariate data analysis
  • unsupervised learning
