Evolution of Incremental Map Reduce Technique in Web Mining

R.Shanthini, D.Vinotha

Citation:

R.Shanthini, D.Vinotha, "Evolution of Incremental Map Reduce Technique in Web Mining," International Journal of Computer Science and Engineering, vol. 3, no. 5, pp. 43-45, 2016. Crossref, https://doi.org/10.14445/23488387/IJCSE-V3I5P108

Abstract

Big Data concerns large-volume, complex, growing data sets drawn from multiple autonomous sources, such as websites, and has emerged with the rapid development of networking and data storage. Data mining is the computational process of discovering patterns in large data sets. Here it extracts information from websites based on generated URLs and transforms it into an understandable structure for further use. The MapReduce programming model is widely used for large-scale, one-time, data-intensive distributed computing, but it lacks the flexibility and efficiency needed to process small incremental changes to the data. As a result, conventional data mining does not perform efficiently on large volumes of data, and processing time increases. In the approach described here, the MRB graph input and the delta graph input are combined to produce an updated MRB graph, and caching in main memory reduces processing time. In the final phase, I2 MapReduce is used to search for a website along the shortest path from the servers, based on the user's search information and on extraction techniques. The I2 reduce function operates on the updated MRB graph, and the cache reduces the time spent in the map, shuffle, sort, and merge steps.
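The idea of combining a preserved MRB graph with a delta input can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the MRB graph can be modeled as an in-memory dictionary of map-to-reduce edges, uses word count as a stand-in job, and the names `map_fn`, `reduce_fn`, `initial_run`, and `incremental_run` are hypothetical.

```python
from collections import defaultdict


def map_fn(doc_id, text):
    """Stand-in map function: emit (word, 1) for every word in a document."""
    return [(word, 1) for word in text.split()]


def reduce_fn(key, values):
    """Stand-in reduce function: sum the values that reached one key."""
    return sum(values)


def initial_run(docs):
    """Full computation that also preserves the map-to-reduce edges.

    edges[key][doc_id] holds the values that doc_id's map call sent to key.
    This dictionary plays the role of the preserved graph (an assumption).
    """
    edges = defaultdict(dict)
    for doc_id, text in docs.items():
        for key, value in map_fn(doc_id, text):
            edges[key].setdefault(doc_id, []).append(value)
    results = {
        key: reduce_fn(key, [v for vs in by_doc.values() for v in vs])
        for key, by_doc in edges.items()
    }
    return edges, results


def incremental_run(edges, results, delta):
    """Apply a delta input; delta maps doc_id to new text, or None to delete.

    Only changed documents are re-mapped and only touched keys are re-reduced.
    Returns the set of keys whose results were refreshed.
    """
    touched = set()
    for doc_id, text in delta.items():
        # Remove the edges this document contributed in earlier runs.
        for key, by_doc in edges.items():
            if doc_id in by_doc:
                del by_doc[doc_id]
                touched.add(key)
        # Re-run map only on the changed or newly added document.
        if text is not None:
            for key, value in map_fn(doc_id, text):
                edges[key].setdefault(doc_id, []).append(value)
                touched.add(key)
    # Re-run reduce only for keys whose incoming edges changed.
    for key in touched:
        values = [v for vs in edges[key].values() for v in vs]
        if values:
            results[key] = reduce_fn(key, values)
        else:
            results.pop(key, None)
            del edges[key]
    return touched
```

In this sketch the cost of an update depends on the size of the delta and the number of touched keys rather than on the full input, which is the motivation the abstract gives for incremental processing. A real system would keep the preserved edges on disk with a main-memory cache instead of a plain dictionary.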

Keywords

Incremental processing, MapReduce, iterative computation, big data.
