Improving Security and Efficiency in Association Rule Mining using PFP-Growth Algorithm via Transaction Splitting

International Journal of Computer Science and Engineering
© 2016 by SSRG - IJCSE Journal
Volume 3 Issue 4
Year of Publication : 2016
Authors : R. Syed Ali Fathima, M. John Basha, P. Saravanan

How to Cite?

R. Syed Ali Fathima, M. John Basha, P. Saravanan, "Improving Security and Efficiency in Association Rule Mining using PFP-Growth Algorithm via Transaction Splitting," SSRG International Journal of Computer Science and Engineering, vol. 3, no. 4, pp. 46-51, 2016. Crossref, https://doi.org/10.14445/23488387/IJCSE-V3I4P116

Abstract:

Data mining is a technique used to discover hidden information in large databases. Frequent itemset mining is a fundamental problem in data mining, and many researchers now use association rule mining to find correlations between items and itemsets efficiently. Security is also an important problem in data mining. To this end, we propose a transaction splitting method based on the PFP-growth algorithm, in which the frequent items are kept secure with the help of cryptographic algorithms. The PFP-growth algorithm extends the FP-growth algorithm and consists of a preprocessing phase and a mining phase. In the preprocessing phase, we use a smart transaction splitting method to improve the trade-off between utility and privacy. In the mining phase, the transformed database and a user-specified threshold value help to estimate the number of support computations, so that we can gradually reduce both the amount of noise required and the information loss caused by transaction splitting. From the frequent items, we derive the global association rules using association rule mining. In this paper, a cryptographic technique (AES, the Advanced Encryption Standard algorithm) is used to secure the frequent itemsets. A trusted party preserves the privacy of individual data when the data is distributed among different sites.
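
To make the effect of transaction splitting on support counts concrete, the following is a minimal sketch rather than the method of the paper. It assumes a naive splitting rule (sort the items and cut them into fixed-size chunks) in place of the smart splitting the abstract mentions, and it counts support by brute-force enumeration instead of building an FP-tree. The function names, the `max_len` and `max_size` parameters, and the toy database are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def split_transaction(transaction, max_len):
    """Naively split one transaction into chunks of at most max_len items.

    Assumption: items are sorted and cut into consecutive chunks. This is a
    placeholder for the smart splitting described in the abstract.
    """
    items = sorted(set(transaction))
    return [items[i:i + max_len] for i in range(0, len(items), max_len)]


def split_database(transactions, max_len):
    """Apply the naive splitting rule to every transaction."""
    result = []
    for transaction in transactions:
        result.extend(split_transaction(transaction, max_len))
    return result


def frequent_itemsets(transactions, min_support, max_size=2):
    """Count support by brute-force enumeration (not FP-growth).

    Returns a dict mapping each itemset (a sorted tuple) with support of at
    least min_support to its support count.
    """
    counts = Counter()
    for transaction in transactions:
        unique_items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            for itemset in combinations(unique_items, size):
                counts[itemset] += 1
    return {s: c for s, c in counts.items() if c >= min_support}


# Hypothetical toy database with one long transaction.
db = [
    ["bread", "milk"],
    ["bread", "butter", "eggs", "milk"],
    ["butter", "milk"],
]

original = frequent_itemsets(db, min_support=2)
split_db = split_database(db, max_len=2)
after_split = frequent_itemsets(split_db, min_support=2)

# Itemsets that were frequent before splitting but not after.
lost = sorted(set(original) - set(after_split))
print(lost)
```

In this toy run the pairs ("bread", "milk") and ("butter", "milk") are frequent in the original database but fall below the threshold after splitting, because the long transaction is cut between those items. This is the kind of information loss the abstract refers to, and it is why the choice of splitting rule matters.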

Keywords:

Data mining, frequent itemset mining, transaction splitting, cryptography technique.
