Improving Classification of Fraudulent Sales
|International Journal of Computer Science and Engineering|
|© 2018 by SSRG - IJCSE Journal|
|Volume 5 Issue 12|
|Year of Publication : 2018|
|Authors : Barry E. King|
How to Cite?
Barry E. King, "Improving Classification of Fraudulent Sales," SSRG International Journal of Computer Science and Engineering, vol. 5, no. 12, pp. 16-17, 2018. Crossref, https://doi.org/10.14445/23488387/IJCSE-V5I12P104
This article presents an improved solution for classifying fraudulent sales. An original k-nearest neighbor solution, applied to a dataset of more than fifteen thousand cases in which eight percent of the observations were fraudulent, yielded a misclassification rate of 0.058. An improved solution using a boosted C5.0 algorithm yielded a misclassification rate of 0.038. The solution was then expanded to recognize that false positives (classifying a fraudulent sale as clean) were five times as costly as false negatives (classifying a clean sale as fraudulent). This expanded solution had a misclassification rate of 0.058 but lowered the misclassification cost by twenty-one percent.
binary classification, machine learning, k-nearest neighbor, C5.0 algorithm
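The trade-off described in the abstract, a higher misclassification rate alongside a lower misclassification cost, can be illustrated with a minimal Python sketch. It assumes the 5:1 cost ratio stated in the abstract; the `Confusion` class, the function names, and all of the counts below are hypothetical and invented for illustration. They are not the article's data or results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion counts for a fraud classifier (hypothetical helper)."""
    fraud_as_fraud: int   # fraudulent sale correctly flagged
    fraud_as_clean: int   # fraudulent sale classified as clean (the costly error)
    clean_as_fraud: int   # clean sale classified as fraudulent
    clean_as_clean: int   # clean sale correctly passed

    @property
    def total(self) -> int:
        return (self.fraud_as_fraud + self.fraud_as_clean
                + self.clean_as_fraud + self.clean_as_clean)


def misclassification_rate(c: Confusion) -> float:
    """Share of all cases that were classified incorrectly."""
    return (c.fraud_as_clean + c.clean_as_fraud) / c.total


def misclassification_cost(c: Confusion,
                           cost_fraud_as_clean: float = 5.0,
                           cost_clean_as_fraud: float = 1.0) -> float:
    """Total cost when the two error types are weighted unequally.

    The default 5:1 ratio follows the abstract; the unit of cost is arbitrary.
    """
    return (c.fraud_as_clean * cost_fraud_as_clean
            + c.clean_as_fraud * cost_clean_as_fraud)


# Invented counts: 1,000 cases, 80 of them (8%) fraudulent.
baseline = Confusion(fraud_as_fraud=40, fraud_as_clean=40,
                     clean_as_fraud=10, clean_as_clean=910)
cost_sensitive = Confusion(fraud_as_fraud=65, fraud_as_clean=15,
                           clean_as_fraud=55, clean_as_clean=865)

for name, c in [("baseline", baseline), ("cost-sensitive", cost_sensitive)]:
    print(name, misclassification_rate(c), misclassification_cost(c))
```

With these invented counts, the cost-sensitive classifier makes more errors overall but fewer of the expensive kind, so its total cost is lower even though its misclassification rate is higher.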
 Murillo, J. P. (2016). Predicting fraudulent sales. [Online] https://rpubs.com/jpmurillo/fraudulentsales.
 Lantz, B. (2015). Machine Learning with R, 2nd edition. Birmingham, UK: Packt.