

Various Tools and Techniques to Assess Information from Big Data
Big data refers to data sets of high computational complexity that exceed the capacity of traditional software tools to capture, store, and analyze. It covers both structured and unstructured data. Big data revolves around the three V's: velocity (speed), volume (quantity), and variety (types of data). Typical sources include social media (Facebook, Twitter, etc.) and data generated by networks such as the Internet of Things (IoT). This research paper sheds light on the tools available, the languages used to explore big data, and the mining techniques needed to fetch and analyze it. The methodology used is Beautiful Soup, a Python library that parses HTML pages and supports web scraping. Web scraping helps transform unstructured data into a structured form. From the study, it has been observed that Python is currently the most powerful language for fetching and analyzing big data because it can handle zettabytes (ZB) of data, whereas Java and other languages were found unable to handle more than gigabytes (GB). Hadoop is the most useful and powerful tool for distributed storage and processing of large data sets; with the use of various plug-ins, it makes big data easy to analyze.
Keywords
Big Data, Web Mining, Web Scraping, Beautiful Soup Python Library.
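
As an illustration of the approach named in the abstract, the following is a minimal sketch of how Beautiful Soup can turn unstructured HTML into structured records. The sample markup, the class names (post, user, text), and the helper scrape_posts are hypothetical and chosen only for this example; they are not taken from the paper. The sketch assumes the bs4 package is installed and parses an in-memory string rather than fetching a live page.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a fetched web page (not from the paper).
SAMPLE_HTML = """
<html><body>
  <div class="post">
    <span class="user">alice</span>
    <p class="text">Big data is everywhere</p>
  </div>
  <div class="post">
    <span class="user">bob</span>
    <p class="text">IoT sensors stream data</p>
  </div>
</body></html>
"""


def scrape_posts(html):
    """Convert unstructured HTML into a list of dict records."""
    # "html.parser" is the parser bundled with the Python standard library.
    soup = BeautifulSoup(html, "html.parser")
    records = []
    for post in soup.find_all("div", class_="post"):
        user = post.find("span", class_="user")
        text = post.find("p", class_="text")
        if user is None or text is None:
            continue  # skip posts that lack either field
        records.append({
            "user": user.get_text(strip=True),
            "text": text.get_text(strip=True),
        })
    return records


posts = scrape_posts(SAMPLE_HTML)
for row in posts:
    print(row)
```

Each resulting dictionary is one structured row, so the list could be written to CSV or loaded into an analysis tool as a next step.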