Implementation and Classification of Anomalous Detection with Varying Parameters

Manvi Chahar, Ms. Savita

Citation:

Manvi Chahar, Ms. Savita, "Implementation and Classification of Anomalous Detection with Varying Parameters," International Journal of Computer Science and Engineering, vol. 6, no. 4, pp. 16-18, 2019. Crossref, https://doi.org/10.14445/23488387/IJCSE-V6I4P104

Abstract

Classification is a classic data mining technique based on machine learning. It assigns each item in a data set to one of a predefined set of classes or groups. We consider the problem of discovering the attributes, or properties, that account for the a priori stated abnormality of a group of anomalous individuals (testing data) with respect to an overall given population (training data). To this end, we use the notion of gain ratio, an attribute selection method, to rank the attributes of the datasets. We found that attributes with a high gain ratio yield high classification accuracy, while attributes with a lower gain ratio can be neglected, which reduces the number of attributes. This work shows that ranking attributes by gain ratio allows the data to be classified with a small number of attributes while still achieving a high accuracy rate. The results on this dataset also show the efficiency and accuracy of the Naive Bayes classifier.
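
The attribute ranking step described above can be illustrated with a minimal sketch. It assumes categorical attributes and uses the standard definition of gain ratio (information gain divided by split information). The toy records, the attribute names `flag` and `proto`, and the helper names are hypothetical and are not taken from the paper's dataset or implementation.

```python
import math
from collections import Counter, defaultdict


def entropy(items):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total) for n in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of one categorical attribute with respect to the class labels."""
    total = len(labels)
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[value].append(label)
    # Expected class entropy after splitting the data on this attribute.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information: entropy of the attribute's own value distribution.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


def rank_attributes(rows, labels):
    """Return (attribute, gain ratio) pairs sorted from highest to lowest score."""
    scores = {a: gain_ratio([r[a] for r in rows], labels) for a in rows[0]}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: "flag" separates the classes, "proto" does not.
rows = [
    {"flag": "S0", "proto": "tcp"},
    {"flag": "S0", "proto": "udp"},
    {"flag": "SF", "proto": "tcp"},
    {"flag": "SF", "proto": "udp"},
]
labels = ["anomalous", "anomalous", "normal", "normal"]

ranked = rank_attributes(rows, labels)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

In this toy case `flag` is ranked first and `proto` last, so under the approach in the abstract `proto` would be a candidate for removal before training a classifier such as Naive Bayes on the remaining attributes.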

Keywords

Data Mining, Gain Ratio, Naive Bayes Classifier
