Using background knowledge to rank itemsets

Tatti, Nikolaj; Mampaey, Michael

doi:10.1007/S10618-010-0188-4

Title

Using background knowledge to rank itemsets

Author

Tatti, Nikolaj

Mampaey, Michael

Abstract

Assessing the quality of discovered results is an important open problem in data mining. Such assessment is particularly vital when mining itemsets, since commonly many of the discovered patterns can be easily explained by background knowledge. The simplest approach to screen uninteresting patterns is to compare the observed frequency against the independence model. Since the parameters for the independence model are the column margins, we can view such screening as a way of using the column margins as background knowledge. In this paper we study techniques for more flexible approaches for infusing background knowledge. Namely, we show that we can efficiently use additional knowledge such as row margins, lazarus counts, and bounds of ones. We demonstrate that these statistics describe forms of data that occur in practice and have been studied in data mining. To infuse the information efficiently we use a maximum entropy approach. In its general setting, solving a maximum entropy model is infeasible, but we demonstrate that for our setting it can be solved in polynomial time. Experiments show that more sophisticated models fit the data better and that using more information improves the frequency prediction of itemsets.
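As an illustration of the baseline the abstract mentions, the following is a minimal sketch of screening an itemset against the independence model, in which the expected frequency is the product of the column margins. It is not the maximum entropy method developed in the paper. The function names and the toy data are hypothetical and chosen only for this example.

```python
from typing import List, Sequence


def column_margins(rows: List[Sequence[int]]) -> List[float]:
    """Fraction of rows in which each column (item) equals 1."""
    n = len(rows)
    k = len(rows[0])
    return [sum(r[j] for r in rows) / n for j in range(k)]


def observed_frequency(rows: List[Sequence[int]], itemset: Sequence[int]) -> float:
    """Fraction of rows that contain every item of the itemset."""
    return sum(all(r[j] == 1 for j in itemset) for r in rows) / len(rows)


def independence_frequency(margins: Sequence[float], itemset: Sequence[int]) -> float:
    """Frequency predicted when items are assumed independent."""
    p = 1.0
    for j in itemset:
        p *= margins[j]
    return p


# Hypothetical toy data: items 0 and 1 always co-occur, item 2 excludes them.
data = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [0, 0, 1],
]

margins = column_margins(data)
obs = observed_frequency(data, (0, 1))
exp = independence_frequency(margins, (0, 1))

# A large gap between observed and predicted frequency suggests that the
# column margins alone do not explain the itemset.
print(f"observed={obs}, predicted under independence={exp}")
```

In this toy data every column margin is 0.5, so the independence model predicts the same frequency for any pair of items, while the observed frequencies differ. Richer background knowledge, such as the row margins named in the abstract, is meant to narrow such gaps.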

Language

English

Source (journal)

Data mining and knowledge discovery. - Boston, Mass., 1997, currens

Publication

Boston, Mass. : 2010

ISSN

1384-5810 [print]

1573-756X [online]

DOI

10.1007/S10618-010-0188-4

Volume/pages

21:2 (2010), p. 293-309

ISI

000280564900006

Full text (Publisher's DOI)

https://doi.org/10.1007/S10618-010-0188-4

Full text (publisher's version - intranet only)

https://repository.uantwerpen.be/docman/iruaauth/914a9a/41687cf0fe1.pdf

Faculty/Department

Faculty of Sciences. Mathematics and Computer Science

Research group

ADReM Data Lab (ADReM)

Publication type

A1 Journal article

Subject

Computer. Automation

Affiliation

Publications with a UAntwerp address

Identifier

Creation

08.09.2010

Last edited

25.05.2022

To cite this reference

https://hdl.handle.net/10067/838160151162165141