Abstract
Despite the abundance of clustering techniques and algorithms, clustering mixed interval (continuous) and categorical (nominal and/or ordinal) scale data remains a challenging problem. To identify the most effective approaches for clustering mixed-type data, we use both theoretical and empirical analyses to present a critical review of the strengths and weaknesses of the methods identified in the literature. Guidelines on approaches to use under different scenarios are provided, along with potential directions for future research.
| Original language | English |
|---|---|
| Pages (from-to) | 80-109 |
| Number of pages | 30 |
| Journal | International Statistical Review |
| Volume | 87 |
| Issue number | 1 |
| State | Published - Apr 2019 |
Keywords
- discretisation
- Gower's distance
- Mahalanobis distance
- dummy coding
- k-means clustering
- machine learning
- mixture model
- multivariate data analysis
- unsupervised learning
Title
Distance Metrics and Clustering Methods for Mixed-type Data