Big Data and Companies

Prithviraj Singh
Sep 17, 2020

Big data is a term that describes large amounts of data that cannot be stored or processed using traditional approaches. This data is being created daily and, for now, can be considered a problem for the industry. So how do we store this huge amount of data?

By using the wondrous world of technology. Big data generally poses two major storage problems:

  1. Volume.

The sheer amount of data that has to be stored poses the problem of ‘Volume’: no single storage device is being produced that can hold this much data, and even if we were to build such a product, it would be priced at more than it is worth.

Now, even if someone were willing to pay that much money, we would run into our second problem of…

  2. Velocity.

By velocity we refer to the speed of I/O, i.e. the time taken to store data from the source onto the storage device, and then the time taken to retrieve that data for its use.

The common solution to both of these problems lies in the technology known as ‘Distributed Storage’. Here, instead of using a single hard disk in a single system to store the data, we use multiple such systems to do the same. Both problems are thereby addressed: storage capacity can keep growing as more systems are added, which solves the volume problem, and since the data is transferred in parallel to multiple storage devices at the same time, the transfer time is divided roughly by the number of systems, which solves the velocity problem.
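To make the idea concrete, here is a minimal sketch of distributed storage in Python. It is only an illustration under stated assumptions: the `StorageNode` class, the `distribute` and `reassemble` helpers, the round-robin placement and the chunk size are all hypothetical, and the "nodes" are in-memory dictionaries rather than real machines. Real systems such as Hadoop's HDFS add replication, metadata services and failure handling on top of this.

```python
from concurrent.futures import ThreadPoolExecutor


class StorageNode:
    """Hypothetical stand-in for one machine with its own disk."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.chunks: dict[int, bytes] = {}

    def write_share(self, share: list[tuple[int, bytes]]) -> None:
        # Store every (index, chunk) pair assigned to this node.
        for index, chunk in share:
            self.chunks[index] = chunk


def split_into_chunks(data: bytes, chunk_size: int) -> list[bytes]:
    """Cut the payload into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def distribute(data: bytes, nodes: list[StorageNode], chunk_size: int) -> int:
    """Spread chunks round-robin over the nodes, writing to all nodes in parallel."""
    indexed = list(enumerate(split_into_chunks(data, chunk_size)))
    # Node k receives chunks k, k + n, k + 2n, ... where n is the node count.
    shares = [indexed[k::len(nodes)] for k in range(len(nodes))]
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        # One task per node; list() forces completion and surfaces any errors.
        list(pool.map(lambda pair: pair[0].write_share(pair[1]), zip(nodes, shares)))
    return len(indexed)


def reassemble(nodes: list[StorageNode], chunk_count: int) -> bytes:
    """Read the chunks back in order from whichever node holds each one."""
    return b"".join(nodes[i % len(nodes)].chunks[i] for i in range(chunk_count))


payload = bytes(range(256)) * 4  # 1024 bytes of sample data
nodes = [StorageNode(f"node-{k}") for k in range(4)]
count = distribute(payload, nodes, chunk_size=64)
restored = reassemble(nodes, count)
print(count, [len(n.chunks) for n in nodes], restored == payload)
# prints: 16 [4, 4, 4, 4] True
```

Each node ends up holding a quarter of the chunks, which is where both benefits come from: no single node needs room for the whole payload, and each node only has to write its own share.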

However, the data and its volume are not the real concern; the concern is what has to happen with such a volume of data.

In general, this data has to be stored and retrieved on demand. But companies and corporations also analyse this data for insights that lead to better decisions and strategic moves.

Let’s look at how much data is being generated and how the top players use this data to further their business and research…

A general idea

  • Around the world, people are generating 2.5 quintillion bytes of data each day. That is 2.5 exabytes per day.
  • Nearly 90 percent of all data has been created in the last two years.
  • By 2023, the big data industry will be worth an estimated $77 billion.
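The unit conversion behind the first figure can be checked with a few lines of Python. This is a small sketch that assumes decimal (SI) prefixes, where 1 exabyte is 10^18 bytes, and the short-scale meaning of quintillion (10^18); the `SI_BYTES` table and `to_unit` helper are names made up for this example.

```python
# Decimal (SI) byte units; this sketch assumes 1 EB = 10**18 bytes.
SI_BYTES = {
    "KB": 10**3,
    "MB": 10**6,
    "GB": 10**9,
    "TB": 10**12,
    "PB": 10**15,
    "EB": 10**18,
}


def to_unit(n_bytes: int, unit: str) -> float:
    """Convert a raw byte count into the given SI unit."""
    return n_bytes / SI_BYTES[unit]


DAILY_BYTES = 2_500_000_000_000_000_000  # 2.5 quintillion bytes per day
SECONDS_PER_DAY = 24 * 60 * 60

daily_eb = to_unit(DAILY_BYTES, "EB")
per_second_tb = to_unit(DAILY_BYTES, "TB") / SECONDS_PER_DAY

print(f"{daily_eb} EB per day")              # prints: 2.5 EB per day
print(f"{per_second_tb:.1f} TB per second")  # prints: 28.9 TB per second
```

Spread evenly over a day, the same figure works out to roughly 29 terabytes every second.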

Google

  • Google has played a major part in making big data a part of our daily lives with its technologies and services like ‘BigQuery’ and ‘MapReduce’.
  • Google generally gets 3.5 billion requests each day, and these requests have to be sorted and queried across 20 billion web pages.
  • Though Google has not published the figure itself, some industry professionals estimate the size of Google’s database to be approximately 10 exabytes, or 10 million terabytes.
  • Google generally uses its big data to improve its search engine.
  • It also uses data such as your Google+ data, your Gmail account and your subscriptions to give you more personalized results for your query.
  • In 2008 Google also published a paper in the science journal Nature claiming that its technology could predict outbreaks of flu more accurately than the existing methods.
  • The results were very controversial, but this unveiled the possibility of ‘crowd prediction’.

Facebook

  • Every day Facebook gets its hands on 2.5 billion pieces of content and 500 terabytes of data, and runs 600,000 queries and 1 million map-reduce jobs.
  • It also receives 2.7 billion like actions and 300 million pictures.
  • Yet amazingly, Facebook can scan through 105 terabytes of data in just half an hour.
  • Facebook generally uses its big data to sell us adverts.
  • However, it has been at the center of controversy: in the past it has used its site to present certain stimuli to its users and then collected their reactions (big data) to perform psychological experiments.
  • The real reason for the controversy was that such experiments were conducted without the permission of the subjects.
  • The difference between Google’s collection system and Facebook’s is that Google generally collects information such as the sites we visit, while Facebook explicitly asks us who we are, where we live, etc.
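To put the Facebook figures quoted above in perspective, they can be turned into average data rates. This is a rough back-of-the-envelope sketch, assuming decimal units (1 terabyte is 10^12 bytes) and a load spread evenly over time, which real traffic is not; the `throughput_gb_per_s` helper is a made-up name for this example.

```python
TB = 10**12  # bytes in a terabyte (decimal units assumed)
GB = 10**9   # bytes in a gigabyte


def throughput_gb_per_s(n_bytes: int, seconds: int) -> float:
    """Average data rate in gigabytes per second."""
    return n_bytes / seconds / GB


# 105 TB scanned in half an hour.
scan_rate = throughput_gb_per_s(105 * TB, 30 * 60)

# 500 TB of new data taken in over a whole day.
ingest_rate = throughput_gb_per_s(500 * TB, 24 * 60 * 60)

print(f"scan:   {scan_rate:.1f} GB/s")    # prints: scan:   58.3 GB/s
print(f"ingest: {ingest_rate:.2f} GB/s")  # prints: ingest: 5.79 GB/s
```

A scan rate of this size is far beyond what a single disk can deliver, which is why the work has to be spread across many machines reading in parallel.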

Microsoft

  • Today there are more than 1 billion devices in the world that run Windows 10 as their operating system.
  • Microsoft’s smart assistant ‘Cortana’ has been asked 18 billion questions since launch.
  • Every day, Microsoft analyzes over 6.5 trillion signals in order to identify emerging threats and protect customers.
  • Microsoft provides its users with data hosting and analytics services based on Hadoop.
  • It has used its big data to make its search engine ‘Bing’ the second most used search engine, overtaking ‘Yahoo’.
  • As a future plan, Microsoft wishes to rebrand its product ‘Xbox’ as an intelligent living-room activity hub that monitors users’ activities in order to adapt and provide a good recreational environment.

