Publication
Title
On scale independence for querying big data
Author
Abstract
To make query answering feasible in big datasets, practitioners have been looking into the notion of scale independence of queries. Intuitively, such queries require only a relatively small subset of the data, whose size is determined by the query and access methods rather than the size of the dataset itself. This paper aims to formalize this notion and study its properties. We start by defining what it means to be scale-independent, and provide matching upper and lower bounds for checking scale independence, for queries in various languages, and for combined and data complexity. Since the complexity turns out to be rather high, and since scale-independent queries cannot be captured syntactically, we develop sufficient conditions for scale independence. We formulate them based on access schemas, which combine indexing and constraints together with bounds on the sizes of retrieved data sets. We then study two variations of scale-independent query answering, inspired by existing practical systems. One concerns incremental query answering: we check when query answers can be maintained in response to updates scale-independently. The other explores scale-independent query rewriting using views.
Language
English
Source (book)
PODS '14 : proceedings of the 33rd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems / Hull, Richard [edit.]; et al.
Publication
New York, N.Y. : ACM, 2014
ISBN
978-1-4503-2375-8
Volume/pages
p. 51-62
ISI
000450809200005
UAntwerpen
Affiliation
Publications with a UAntwerp address
Record
Identifier
Creation 02.09.2014
Last edited 09.10.2023