Publication | Open Access

Processing Interval Joins On Map-Reduce

Citations: 27

References: 16

Year: 2014

Abstract

In this paper we investigate the problem of processing multi-way interval joins on a map-reduce platform. We look at join queries formed by interval predicates as defined by Allen's interval algebra. These predicates can be classified into two groups: colocation-based predicates and sequence-based predicates. A colocation predicate requires two intervals to share at least one common point, while a sequence predicate requires two intervals to be disjoint. An interval join query can therefore be thought of as belonging to one of three classes: (a) queries containing only colocation-based predicates, (b) queries containing only sequence-based predicates, and (c) queries containing both classes of predicates. We address these three classes of join queries, discuss the challenges, and present novel approaches for processing these queries on a map-reduce platform. We also discuss why current approaches developed for handling join queries on real-valued data cannot be directly used to handle interval joins. Finally, we extend the approaches to handle join queries containing multiple interval attributes as well as join queries containing both interval and non-interval attributes. Through experimental evaluations on both synthetic and real-life datasets, we demonstrate that the proposed approaches comfortably outperform naive approaches.
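The two predicate groups and the three query classes described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm and involves no map-reduce; it only encodes the definitions stated above. The names `Interval`, `is_colocated`, `is_sequenced`, and `classify_query` are hypothetical, and the sketch assumes closed intervals with `start <= end`, so two intervals that touch at a single endpoint count as sharing a point.

```python
from typing import Iterable, NamedTuple


class Interval(NamedTuple):
    """A closed interval [start, end]; assumes start <= end (illustrative type)."""
    start: float
    end: float


def is_colocated(a: Interval, b: Interval) -> bool:
    """True when the two intervals share at least one common point."""
    return a.start <= b.end and b.start <= a.end


def is_sequenced(a: Interval, b: Interval) -> bool:
    """True when the two intervals are disjoint (one lies entirely before the other)."""
    return not is_colocated(a, b)


def classify_query(predicate_kinds: Iterable[str]) -> str:
    """Map the predicate kinds used in a join query to class 'a', 'b', or 'c'.

    Each kind is the label 'colocation' or 'sequence' (labels are an
    assumption of this sketch, not notation from the paper).
    """
    kinds = set(predicate_kinds)
    if kinds == {"colocation"}:
        return "a"  # only colocation-based predicates
    if kinds == {"sequence"}:
        return "b"  # only sequence-based predicates
    if kinds == {"colocation", "sequence"}:
        return "c"  # both classes of predicates
    raise ValueError(f"unexpected predicate kinds: {sorted(kinds)}")
```

For instance, under the closed-interval assumption `is_colocated(Interval(1, 5), Interval(5, 9))` holds because both intervals contain the point 5, whereas `Interval(1, 3)` and `Interval(4, 6)` satisfy `is_sequenced`.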
