Relative information completeness
Faculty of Sciences. Mathematics and Computer Science
S.l. : ACM, 2009
Proceedings of the 28th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS 2009), Providence, R.I., USA, June 29 - July 1, 2009 / Paredaens, Jan [edit.]; et al.
The paper investigates whether a partially closed database has complete information to answer a query. In practice, an enterprise often maintains master data Dm, a closed-world database. We say that a database D is partially closed if it satisfies a set V of containment constraints of the form "q(D) is a subset of p(Dm)", where q is a query in a language Lc and p is a projection query. The part of D not constrained by (Dm,V) is open, and some tuples may be missing from it. The database D is said to be complete for a query Q relative to (Dm,V) if for all partially closed extensions D' of D, Q(D')=Q(D), i.e., adding tuples to D either violates some constraint in V or does not change the answer to Q. We first show that the proposed model can capture the consistency of data in addition to its relative completeness: integrity constraints studied for consistency can be expressed as containment constraints. We then study two problems. One is to decide, given Dm, V, a query Q in a language Lq and a partially closed database D, whether D is complete for Q relative to (Dm,V). The other is to determine, given Dm, V and Q, whether there exists a partially closed database that is complete for Q relative to (Dm,V). We establish matching lower and upper bounds for these problems for a variety of languages Lq and Lc. We also provide characterizations for a database to be relatively complete, and for a query to allow a relatively complete database, when Lq and Lc are conjunctive queries.
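
The definition of relative completeness above can be illustrated with a small brute-force sketch. It is not the paper's decision procedure: it assumes a single relation, and it only tries extensions drawn from a finite, hand-picked pool of candidate tuples, whereas the definition quantifies over all partially closed extensions. All names below (satisfies, is_relatively_complete, the example relations, queries and pool) are hypothetical and chosen for illustration.

```python
from itertools import combinations


def satisfies(db, master, constraints):
    """True if q(db) is a subset of p(master) for every constraint (q, p)."""
    return all(q(db) <= p(master) for q, p in constraints)


def is_relatively_complete(db, master, constraints, query, candidate_tuples):
    """Brute-force check of Q(D') == Q(D) over extensions built from a pool.

    Assumption: only extensions that add tuples from `candidate_tuples`
    are considered, so this is an illustration, not a general decision
    procedure.
    """
    if not satisfies(db, master, constraints):
        raise ValueError("D is not partially closed with respect to (Dm, V)")
    baseline = query(db)
    extra = [t for t in candidate_tuples if t not in db]
    for size in range(1, len(extra) + 1):
        for added in combinations(extra, size):
            extended = db | frozenset(added)
            # Extensions that violate V are not partially closed; skip them.
            if not satisfies(extended, master, constraints):
                continue
            if query(extended) != baseline:
                return False
    return True


# Hypothetical master data Dm: (customer id, name).
DM = frozenset({(1, "ann"), (2, "bob")})

# Hypothetical containment constraint: ids of EU customers in D must
# appear in the projection of Dm on the id column.
V = [(
    lambda db: {cid for cid, area in db if area == "EU"},
    lambda dm: {cid for cid, _name in dm},
)]


def q_eu(db):
    """Ids of customers located in the EU."""
    return {cid for cid, area in db if area == "EU"}


def q_us(db):
    """Ids of customers located in the US."""
    return {cid for cid, area in db if area == "US"}


# Finite pool of tuples that an extension may add (an assumption).
POOL = [(cid, area) for cid in (1, 2, 3) for area in ("EU", "US")]

D_FULL = frozenset({(1, "EU"), (2, "EU")})
D_PARTIAL = frozenset({(1, "EU")})
```

In this sketch D_FULL is complete for q_eu, because any further EU tuple in the pool violates the constraint and US tuples do not affect the answer. It is not complete for q_us, since US tuples are unconstrained, and D_PARTIAL is not complete for q_eu because the tuple (2, "EU") can still be added without violating V.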