An approach of Clustering and analysis of Unstructured Data

International Journal of Computer Science and Engineering
© 2019 by SSRG - IJCSE Journal
Volume 6 Issue 11
Year of Publication : 2019
Authors : Gunisetti Tirupathi Rao, Dr. Rajendra Gupta

How to Cite?

Gunisetti Tirupathi Rao, Dr. Rajendra Gupta, "An approach of Clustering and analysis of Unstructured Data," SSRG International Journal of Computer Science and Engineering, vol. 6, no. 11, pp. 64-69, 2019.


An unstructured dataset is a collection of information that has no pre-defined model and is not organized in a regular manner; it can contain different kinds of data such as email, chat, images, video, XML, and links. Searching for words in an unstructured dataset is a very complex task. To retrieve a particular piece of information, or to search for a pair of words in the dataset, four steps are applied. First, pre-process the dataset using TPMRFC and assign weights using the DTM. Second, re-calculate the error and update the weights of the matrix (DTM). Third, cluster each term according to its weight using a Self-Organizing Map. Lastly, use the Least Frequently Used with Dynamic Aging (LFUDA) method to place the most frequently used pairs of words in the cache. The proposed scheme is tested on the PROLOG unstructured dataset, and the results are reported in terms of accuracy.
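
The final caching step can be illustrated with a minimal sketch. It assumes the common LFUDA formulation, in which each cached item has a priority key equal to its access frequency plus a running cache age, and the cache age is raised to the key of every evicted item. The class name, method names, and example word pairs below are hypothetical and are not taken from the paper.

```python
class LFUDACache:
    """Sketch of LFU with Dynamic Aging (assumed form: key = frequency + age)."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.age = 0      # cache age, raised on every eviction
        self.freq = {}    # item -> number of accesses while cached
        self.key = {}     # item -> priority key (frequency + age at last access)

    def access(self, item):
        """Record one access to item; return True on a cache hit."""
        hit = item in self.freq
        if not hit and len(self.freq) >= self.capacity:
            # Evict the entry with the smallest priority key.
            victim = min(self.key, key=self.key.get)
            # Dynamic aging: later entries start from the evicted key.
            self.age = self.key[victim]
            del self.freq[victim]
            del self.key[victim]
        self.freq[item] = self.freq.get(item, 0) + 1
        self.key[item] = self.freq[item] + self.age
        return hit


# Hypothetical word pairs, used only to exercise the cache.
cache = LFUDACache(capacity=2)
for pair in [("data", "mining")] * 3 + [("text", "pattern"), ("self", "map")]:
    cache.access(pair)
```

In this run the pair accessed three times stays in the cache, while the pair accessed once is evicted to make room for the newest pair. The aging term lets a newly inserted pair start from the evicted key rather than from zero, so entries that were popular only in the past do not hold the cache indefinitely.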


Unstructured data, Text Pattern Mining, Cluster, Self-Organized Map

