Publication
Title
Mining top-k quantile-based cohesive sequential patterns
Author
Abstract
Finding patterns in long event sequences is an important data mining task. Two decades ago, research focused on finding all frequent patterns, where the anti-monotonic property of support was used to design efficient algorithms. Recent research focuses on producing a smaller output containing only the most interesting patterns. To achieve this goal, we introduce a new interestingness measure by computing the proportion of the occurrences of a pattern that are cohesive. This measure is robust to outliers, and is applicable to sequential patterns. We implement an efficient algorithm based on constrained prefix-projected pattern growth and pruning based on an upper bound to uncover the set of top-k quantile-based cohesive sequential patterns. We run experiments to compare our method with existing state-of-the-art methods for sequential pattern mining and show that our algorithm is efficient and produces qualitatively interesting patterns on large event sequences.
Language
English
Source (book)
Proceedings of the 2018 SIAM International Conference on Data Mining, May 3-5, 2018, San Diego, CA, USA
Publication
San Diego, Calif. : SIAM, 2018
ISBN
978-1-61197-532-1
Volume/pages
p. 90-98
UAntwerpen
Faculty/Department
Research group
Project info
Reliable on-the-fly prediction of future events in data streams.
Publication type
Subject
Affiliation
Publications with a UAntwerp address
Record
Identification
Creation 12.12.2018
Last edited 15.07.2021