Protecting Sensitive Frequent Itemsets in Database Transaction Using Unknown Symbol
Abstract
Data mining algorithms benefit data analytics: they reveal information hidden in a database so that the data owner can use it effectively. Besides these benefits, they also bring challenges, since some algorithms can reveal information that is considered sensitive. Sensitive information is information about people or organizations that should be protected under certain rules before the data is published. Therefore, in this research we propose an efficient approach to privacy preserving data mining (PPDM) that avoids privacy breaches in frequent itemset mining. The size of the database is also considered: we perform data segregation to separate transactions that contain sensitive itemsets from transactions that do not. This step is followed by determining which item in those transactions is to be replaced with an unknown symbol in order to sanitize the data. A set of experiments is conducted to show the benefit of our approach. Based on the experimental results, the proposed approach performs well at hiding sensitive itemsets and results in fewer changes to the original database.
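The two steps named in the abstract, segregation and sanitization with an unknown symbol, can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the symbol `"?"`, the function names, the absolute-count support threshold, and especially the rule for choosing which item to replace (the item shared by the most sensitive itemsets) are all assumptions made for illustration, since the abstract does not specify them.

```python
UNKNOWN = "?"  # assumed placeholder for the unknown symbol


def segregate(transactions, sensitive_itemsets):
    """Return (indices of transactions containing a sensitive itemset, all other indices)."""
    sensitive_idx, other_idx = [], []
    for i, t in enumerate(transactions):
        if any(s <= set(t) for s in sensitive_itemsets):
            sensitive_idx.append(i)
        else:
            other_idx.append(i)
    return sensitive_idx, other_idx


def support(transactions, itemset):
    """Count the transactions that definitely contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= set(t))


def sanitize(transactions, sensitive_itemsets, min_support):
    """Replace items with UNKNOWN until each sensitive itemset falls below min_support."""
    sanitized = [list(t) for t in transactions]
    # Only the segregated (sensitive) transactions are ever scanned or modified.
    sensitive_idx, _ = segregate(sanitized, sensitive_itemsets)
    for itemset in sensitive_itemsets:
        # Hypothetical victim choice: the item shared by the most sensitive itemsets.
        victim = max(sorted(itemset),
                     key=lambda item: sum(item in s for s in sensitive_itemsets))
        for i in sensitive_idx:
            if support(sanitized, itemset) < min_support:
                break  # itemset is already hidden, stop changing the data
            if itemset <= set(sanitized[i]):
                sanitized[i] = [UNKNOWN if x == victim else x for x in sanitized[i]]
    return sanitized


transactions = [["a", "b", "c"], ["a", "b"], ["b", "c"], ["a", "c"], ["a", "b", "d"]]
sensitive = [frozenset({"a", "b"})]
result = sanitize(transactions, sensitive, min_support=2)
```

In this toy run only transactions that contain the sensitive itemset are touched, and sanitization stops as soon as the itemset's support drops below the threshold, which is one simple way to keep the number of changes to the original database small.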