Credit Card Number Fraud Detection Using K-Means with Hidden Markov Method

International Journal of Mobile Computing and Application
© 2015 by SSRG - IJMCA Journal
Volume 2 Issue 2
Year of Publication : 2015
Authors : Pooja Bhati and Manoj Sharma
How to Cite?

Pooja Bhati and Manoj Sharma, "Credit Card Number Fraud Detection Using K-Means with Hidden Markov Method," SSRG International Journal of Mobile Computing and Application, vol. 2, no. 2, pp. 15-18, 2015.


Clustering segments data into meaningful groups. When done well, the resulting clusters capture the essential structure of the original data. Clustering and outlier detection are two major fields of data mining. Today, when security is a major concern in every aspect of life, outlier detection has become indispensable in data mining. Any pattern that is inconsistent with the rest of the data can be defined as an outlier for that particular sample set. Monetary fraud is widespread, and credit card fraud in particular is very common because credit cards are now used so extensively. The method proposed in this paper applies k-means clustering and then finds outliers in the resulting clusters using the Hidden Markov Method. The proposed algorithm effectively subdivides the inliers into clusters and then detects the outliers. The Luhn algorithm is then used to validate the resulting credit card numbers. The proposed approach is expected to work more efficiently and effectively on big data, since standard k-means handles big data poorly and is also poor at detecting outliers.
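The validation step mentioned in the abstract can be illustrated with a minimal sketch of the standard Luhn checksum. This is the textbook form of the check, not necessarily the exact variant the authors apply, and the function name `luhn_is_valid` is a hypothetical name chosen for this example. The sketch assumes the card number arrives as a string that may contain spaces or hyphens as separators.

```python
def luhn_is_valid(card_number: str) -> bool:
    """Return True if card_number passes the standard Luhn checksum.

    Hypothetical helper for illustration; spaces and hyphens are
    treated as separators, any other non-digit makes the input invalid.
    """
    cleaned = card_number.replace(" ", "").replace("-", "")
    if len(cleaned) < 2 or not (cleaned.isascii() and cleaned.isdigit()):
        return False

    total = 0
    # Walk from the rightmost digit (the check digit) to the left,
    # doubling every second digit along the way.
    for position, char in enumerate(reversed(cleaned)):
        digit = int(char)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                # Same as summing the two digits of the doubled value.
                digit -= 9
        total += digit

    # A valid number has a checksum divisible by 10.
    return total % 10 == 0


if __name__ == "__main__":
    for number in ("4111 1111 1111 1111", "79927398713", "79927398710"):
        print(number, luhn_is_valid(number))
```

A Luhn check only confirms that a number is well formed. It says nothing about whether a transaction is fraudulent, which is why it is paired here with the clustering and outlier detection stages rather than replacing them.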


Outlier, Monetary fraud, k-Means, Hidden Markov Method, Luhn Algorithm

