Concepedia

Publication | Closed Access

Maximal vector computation in large data sets

372

Citations

10

References

2005

Year

Abstract

Finding the maximals in a collection of vectors is relevant to many applications. The maximal set is related to the convex hull— and hence, linear optimization—and nearest neighbors. The maximal vector problem has resurfaced with the advent of skyline queries for relational databases and skyline algorithms that are external and relationally well behaved. The initial algorithms proposed for maximals are based on divide-and-conquer. These established good average and worst case asymptotic running times, showing it to be O(n) average-case, where n is the number of vectors. However, they are not amenable to externalizing. We prove their performance is quite bad with respect to the dimensionality, k, of the problem. We demonstrate that the more recent external skyline algorithms are actually better behaved, although they do not have as good an apparent asymptotic complexity. We introduce a new external algorithm, LESS, that combines the best features of these, experimentally evaluate its effectiveness and improvement over the field, and prove its average-case running time is O(kn). 1

References

YearCitations

Page 1