Storage and Retrieval of Data for Smart City using Hadoop

Ravi Gehlot

Citation:

Ravi Gehlot, "Storage and Retrieval of Data for Smart City using Hadoop," International Journal of Computer Science and Engineering, vol. 3, no. 5, pp. 85-89, 2016. Crossref, https://doi.org/10.14445/23488387/IJCSE-V3I5P117

Abstract

Smart cities are equipped with a wide range of technologies that generate enormous amounts of data. Ubiquitous devices embedded with Radio Frequency Identification (RFID) and Wireless Sensor Network (WSN) technology generate data frequently, and these devices are deployed throughout a smart city. The data generated by the sensors will grow so large that it cannot be handled by conventional file storage systems. Processing and analyzing this volume of data requires a specialized system that can store and process it at a feasible cost. Hadoop uses the Hadoop Distributed File System (HDFS) to store data across distributed servers and the MapReduce programming model to analyze and process it. HDFS stores the stream of application data in DataNodes, while NameNodes maintain the mapping of that data to the DataNodes. This results in a highly available and fault-tolerant system. MapReduce operates on key-value pairs, which makes the analysis task easier.
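
The key-value model described in the abstract can be illustrated with a minimal, in-memory sketch. This is not Hadoop code and not the paper's implementation: it only imitates the map, shuffle, and reduce steps in plain Python. The sensor IDs, the `sensor_id,reading` record format, and all function names are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical sensor records in an assumed "sensor_id,reading" format.
RECORDS = [
    "temp-01,21.5",
    "temp-02,19.0",
    "temp-01,22.5",
    "temp-02,20.0",
    "temp-01,22.0",
]


def map_phase(record):
    """Emit one (key, value) pair per input record."""
    sensor_id, reading = record.split(",")
    yield sensor_id, float(reading)


def shuffle(pairs):
    """Group all values by key, as the framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse the values for one key into a single result (here, the mean)."""
    return key, sum(values) / len(values)


def run_job(records):
    """Run map, shuffle, and reduce in sequence on a single machine."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


averages = run_job(RECORDS)
```

In a real Hadoop job the map and reduce functions would run in parallel on the nodes that hold the data, and the shuffle would move pairs across the network. The sketch only shows why keying each reading by sensor makes per-sensor analysis straightforward.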

Keywords

Hadoop; Hadoop Distributed File System (HDFS); MapReduce; NameNode; DataNode; Big Data
