Frequent itemset mining for big data

Moens, Sandy; Akşehirli, Emin; Goethals, Bart

doi:10.1109/BIGDATA.2013.6691742

Abstract

Frequent Itemset Mining (FIM) is one of the best-known techniques for extracting knowledge from data. The combinatorial explosion of FIM methods becomes even more problematic when they are applied to Big Data. Fortunately, recent improvements in the field of parallel programming already provide good tools to tackle this problem. However, these tools come with their own technical challenges, e.g. balanced data distribution and inter-communication costs. In this paper, we investigate the applicability of FIM techniques on the MapReduce platform. We introduce two new methods for mining large datasets: Dist-Eclat focuses on speed, while BigFIM is optimized to run on very large datasets. In our experiments we show the scalability of our methods.
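
To make the abstract's terms concrete, the following is a minimal single-machine sketch of Eclat-style frequent itemset mining, the classic technique that the name Dist-Eclat refers to. It is not the paper's Dist-Eclat or BigFIM algorithm and it involves no MapReduce. The function name `eclat`, the parameter `min_support` (an absolute transaction count), and the toy data are assumptions chosen for illustration only.

```python
from typing import Dict, Iterable, List, Set, Tuple


def eclat(transactions: Iterable[Set[str]], min_support: int) -> Dict[Tuple[str, ...], int]:
    """Return every itemset contained in at least `min_support` transactions.

    Illustrative sketch only: sequential, in-memory, no distribution.
    """
    # Vertical layout: map each item to the ids of the transactions containing it.
    tidlists: Dict[str, Set[int]] = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidlists.setdefault(item, set()).add(tid)

    frequent: Dict[Tuple[str, ...], int] = {}

    def extend(prefix: Tuple[str, ...], candidates: List[Tuple[str, Set[int]]]) -> None:
        # Depth-first search: each candidate extends the current prefix by one item.
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix + (item,)
            frequent[itemset] = len(tids)
            # Intersect tid-lists with the remaining candidates to grow the itemset.
            suffix = []
            for other, other_tids in candidates[i + 1:]:
                shared = tids & other_tids
                if len(shared) >= min_support:
                    suffix.append((other, shared))
            if suffix:
                extend(itemset, suffix)

    # Start from the frequent single items, in a fixed order to avoid duplicates.
    start = sorted(
        ((item, tids) for item, tids in tidlists.items() if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    extend((), start)
    return frequent


# Toy data (hypothetical): five transactions over three items.
toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
result = eclat(toy, min_support=3)
```

Because the number of candidate itemsets can grow exponentially with the number of items, even this small search shows why the support threshold and the partitioning of the search space matter once the data no longer fits on one machine.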

Language

English

Source (book)

IEEE Big Data 2013 : International Conference on Big Data, October 6-9, 2013, Santa Clara, Calif., USA

Publication

New York, N.Y. : IEEE , 2013

ISBN

978-1-4799-1292-6

DOI

10.1109/BIGDATA.2013.6691742

Volume/pages

p. 111-118

ISI

000330831300199

Full text (Publisher's DOI)

https://doi.org/10.1109/BIGDATA.2013.6691742

Faculty/Department

Faculty of Sciences. Mathematics and Computer Science

Research group

ADReM Data Lab (ADReM)

Project info

Principles of Pattern Set Mining for structured data.

Publication type

H1 Book chapter

Subject

Computer. Automation

Affiliation

Publications with a UAntwerp address

Identifier

Creation

30.01.2014

Last edited

09.10.2023

To cite this reference

https://hdl.handle.net/10067/1134290151162165141