TY - GEN
T1 - Output-sensitive evaluation of prioritized skyline queries
AU - Meneghetti, Niccolò
AU - Mindolin, Denis
AU - Ciaccia, Paolo
AU - Chomicki, Jan
N1 - Publisher Copyright:
Copyright © 2015 ACM.
PY - 2015/5/27
Y1 - 2015/5/27
N2 - Skylines assume that all attributes are equally important, as each dimension can always be traded off for another. Prioritized skylines (p-skylines) take into account non-compensatory preferences, where some dimensions are deemed more important than others, and trade-offs are constrained by the relative importance of the attributes involved. In this paper we show that querying using non-compensatory preferences is computationally efficient. We focus on preferences that are representable with p-expressions, and develop an efficient in-memory divide-and-conquer algorithm for answering p-skyline queries. Our algorithm is outputsensitive; this is very desirable in the context of preference queries, since the output is expected to be, on average, only a small fraction of the input. We prove that our method is well behaved in both the worst- and the average-case scenarios. Additionally, we develop a general framework for benchmarking p-skyline algorithms, showing how to sample prioritized preference relations uniformly, and how to highlight the effect of data correlation on performance. We conclude our study with extensive experimental results.
AB - Skylines assume that all attributes are equally important, as each dimension can always be traded off for another. Prioritized skylines (p-skylines) take into account non-compensatory preferences, where some dimensions are deemed more important than others, and trade-offs are constrained by the relative importance of the attributes involved. In this paper we show that querying using non-compensatory preferences is computationally efficient. We focus on preferences that are representable with p-expressions, and develop an efficient in-memory divide-and-conquer algorithm for answering p-skyline queries. Our algorithm is outputsensitive; this is very desirable in the context of preference queries, since the output is expected to be, on average, only a small fraction of the input. We prove that our method is well behaved in both the worst- and the average-case scenarios. Additionally, we develop a general framework for benchmarking p-skyline algorithms, showing how to sample prioritized preference relations uniformly, and how to highlight the effect of data correlation on performance. We conclude our study with extensive experimental results.
KW - P-skyline
KW - Pareto accumulation
KW - Preference
KW - Preference query
KW - Prioritized accumulation
KW - Skyline
UR - https://www.scopus.com/pages/publications/84957554259
U2 - 10.1145/2723372.2723736
DO - 10.1145/2723372.2723736
M3 - Conference contribution
AN - SCOPUS:84957554259
T3 - Proceedings of the ACM SIGMOD International Conference on Management of Data
SP - 1955
EP - 1967
BT - SIGMOD 2015 - Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data
PB - Association for Computing Machinery
T2 - ACM SIGMOD International Conference on Management of Data, SIGMOD 2015
Y2 - 31 May 2015 through 4 June 2015
ER -