
GRM: Generalized regression model for clustering linear sequences

  • SUNY Buffalo

Research output: Contribution to conference > Paper > peer-review

4 Scopus citations

Abstract

Linear relations are valuable for discovering rules among stocks, such as "if stock X goes up by 1, stock Y will go down by 3". Traditional linear regression models the linear relation between two sequences perfectly. However, if a user asks "please cluster the stocks in the NASDAQ market into groups where sequences have a strong linear relationship with each other", it is prohibitively expensive to compare the sequences pair by pair. In this paper, we propose a new model named GRM (Generalized Regression Model) to gracefully handle the problem of clustering linear sequences. GRM gives a measure, GR², that tells the degree of linearity of multiple sequences without having to compare each pair of them. In our experiments on stocks in the NASDAQ market, the GRM clustering algorithm mined many interesting clusters of linear stocks accurately and efficiently.
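
To make the cost argument in the abstract concrete, here is a minimal Python sketch of the pairwise baseline that the abstract calls prohibitively expensive. It is not the GRM algorithm and does not compute GR², whose definition is not given on this page. The function names `r_squared` and `pairwise_r_squared` are hypothetical, and the sketch assumes equal-length, non-constant sequences.

```python
import numpy as np


def r_squared(x, y):
    """R^2 of the simple linear regression of y on x (assumes y is not constant)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Least-squares line y ~ slope * x + intercept.
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def pairwise_r_squared(sequences):
    """Compare every pair of sequences; returns the score matrix and the pair count."""
    n = len(sequences)
    scores = np.eye(n)
    comparisons = 0
    for i in range(n):
        for j in range(i + 1, n):
            # For simple regression R^2 is symmetric, so one fit covers both entries.
            scores[i, j] = scores[j, i] = r_squared(sequences[i], sequences[j])
            comparisons += 1
    return scores, comparisons
```

With n sequences this baseline performs n(n-1)/2 regressions, so the work grows quadratically with the number of stocks. That quadratic growth is the cost the abstract says GR² avoids by measuring the linearity of multiple sequences at once.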

Original language: English
Pages: 23-32
Number of pages: 10
State: Published - 2004
Event: Proceedings of the Fourth SIAM International Conference on Data Mining - Lake Buena Vista, FL, United States
Duration: Apr 22 2004 to Apr 24 2004

Conference

Conference: Proceedings of the Fourth SIAM International Conference on Data Mining
Country/Territory: United States
City: Lake Buena Vista, FL
Period: 04/22/04 to 04/24/04
