Publication
Title
Revisiting conditional functional dependency discovery : splitting the “C” from the “FD”
Author
Abstract
Many techniques for cleaning dirty data are based on enforcing some set of integrity constraints. Conditional functional dependencies (CFDs) are a combination of traditional functional dependencies (FDs) and association rules, and are widely used as a constraint formalism for data cleaning. However, the discovery of such CFDs has received limited attention. In this paper, we regard CFDs as an extension of association rules, and present three general methodologies for (approximate) CFD discovery, each using a different way of combining pattern mining for discovering the conditions (the “C” in CFD) with FD discovery. We discuss how existing algorithms fit into these three methodologies, and introduce new techniques to improve the discovery process. We show that the right choice of methodology improves performance over the traditional CFD discovery method CTane.
Language
English
Source (journal)
Lecture notes in computer science. - Berlin, 1973, currens
Source (book)
Machine learning and knowledge discovery in databases : European Conference on Machine Learning and Knowledge Discovery in Databases (ECML/PKDD), September 10-14, 2018, Dublin, Ireland
Source (series)
Lecture notes in artificial intelligence (LNAI); 11052
Lecture notes in computer science (LNCS); 11052
Publication
Cham : Springer, 2019
ISBN
978-3-030-10927-1
DOI
10.1007/978-3-030-10928-8_33
Volume/pages
(2019), p. 552-568
ISI
000611394900033
UAntwerpen
Affiliation
Publications with a UAntwerp address
Record
Identifier
Creation 30.09.2020
Last edited 02.10.2024