Title: Mining frequent itemsets in a stream
Author:
Abstract: Mining frequent itemsets in a data stream is a difficult problem, as itemsets arrive in rapid succession and storing parts of the stream is typically impossible. Nonetheless, it has many useful applications, e.g., opinion and sentiment analysis from social networks. Current stream mining algorithms are based on approximations. In earlier work, mining frequent items in a stream under the max-frequency measure proved to be effective. In this paper, we extend that work from items to itemsets. Firstly, an optimized incremental algorithm for mining frequent itemsets in a stream is presented. The algorithm maintains a very compact summary of the stream for selected itemsets. Secondly, we show that further compacting the summary is non-trivial. Thirdly, we establish a connection between the size of a summary and results from number theory. Fourthly, we report results of extensive experimentation on both synthetic and real-world datasets, showing the efficiency of the algorithm in terms of both time and space. (C) 2012 Elsevier Ltd. All rights reserved.
Language: English
Source (journal): Information systems. - London
Publication: London, 2014
ISSN: 0306-4379
DOI: 10.1016/J.IS.2012.01.005
Volume/pages: 39 (2014), p. 233-255
ISI: 000329531300012