Publication
Title
Theoretical bounds on the size of condensed representations
Author
Abstract
Recent studies demonstrate the usefulness of condensed representations as a semantic compression technique for frequent itemsets. In inductive databases especially, condensed representations are a useful intermediate format to support exploration of the itemset space. In this paper we establish theoretical upper bounds on the maximal size of an itemset in different condensed representations. Central to the development of the bounds is the notion of l-free sets, which form the basis of many well-known representations. We bound the maximal cardinality of an l-free set based on the size of the database. More concretely, we compute a lower bound on the size of the database in terms of the size of the l-free set; when the database is smaller than this lower bound, the set cannot be l-free. We present an efficient method for calculating the exact value of the bound, based on combinatorial identities of partial row sums. We also present preliminary results on a statistical approximation of the bound and illustrate the results with some simulations.
Language
English
Source (journal)
Lecture notes in computer science. - Berlin, 1973, currens
Publication
Berlin : 2005
ISSN
0302-9743 [print]
1611-3349 [online]
Volume/pages
3377 (2005), p. 46-65
ISI
000228724600004
UAntwerpen
Faculty/Department
Research group
Publication type
Subject
Affiliation
Publications with a UAntwerp address
Record
Identifier
Creation 08.10.2008
Last edited 04.03.2024