What is Big Data?
When data reaches a quantity so big that it can no longer be processed by traditional methods, it is said to be ‘big data.’ In a world of rapidly accelerating technology, big data is becoming a more fundamental part of our lives. There is no strict definition of big data; however, it can be associated with the five ‘v’s:
- Volume – How much of the data is there?
- Velocity – How fast is the data being produced?
- Variety – What type of data is it, and how can it be categorised?
- Value – What is the data’s value to a person/organisation? How can the vast quantities be used?
- Veracity – How reliable is the data?
Where does data come from?
‘Over the last two years 90 percent of the world’s data has been generated.’
This statement has modelled the growth of worldwide data since the birth of computing. The more you think about it, the more remarkable it seems, but what has caused this constant acceleration in data production? The answer is you, me and everyone living in this digital age, and how we live our lives. Every link you click, every website you visit and every picture you post etches more information into the big data pool. In fact, in every minute of the day:
- 527,760 photos are shared on Snapchat
- 4,146,600 videos are watched on YouTube
- 500 hours of video are uploaded to YouTube
- 456,000 tweets are sent via Twitter
- 46,740 photos are posted on Instagram
Source: Domo’s Data Never Sleeps Report
These statistics come from social media alone. Large amounts of data also come from GPS tracking, financial transactions, emails, satellite information and many, many more sources.
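To put the per-minute figures above into perspective, they can be scaled up to a full day. The short Python sketch below only multiplies the quoted numbers by the 1,440 minutes in a day. The names `PER_MINUTE` and `per_day` are illustrative, and the sketch assumes the rates stay constant around the clock, which real traffic will not do.

```python
# Per-minute figures quoted above (Domo's Data Never Sleeps report).
PER_MINUTE = {
    "Snapchat photos shared": 527_760,
    "YouTube videos watched": 4_146_600,
    "YouTube hours uploaded": 500,
    "Tweets sent": 456_000,
    "Instagram photos posted": 46_740,
}

MINUTES_PER_DAY = 24 * 60  # 1,440 minutes in a day


def per_day(per_minute):
    """Scale each per-minute count to a daily total, assuming a constant rate."""
    return {name: count * MINUTES_PER_DAY for name, count in per_minute.items()}


if __name__ == "__main__":
    for name, total in per_day(PER_MINUTE).items():
        print(f"{name}: {total:,} per day")
```

Under that assumption, 500 hours of video uploaded every minute works out to 720,000 hours of new video every day.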
How are these vast quantities of data managed?
Big data is incredibly difficult to process. There are many different types of data that mean nothing to a computer on their own, for example TV categories, political views and advertisement preferences, and each must be managed and processed differently. The gigantic and growing amounts of these unique data types are also very difficult to store. In the 1990s, a large problem in many organisations was that it was just too difficult to store and process all this data. This led to the development of ‘MapReduce’, a programming model designed by Google and described publicly in 2004. MapReduce is a computational framework designed to ‘map’ values to keys and then ‘reduce’ them by combining the values that share the same key. The framework is also scalable, meaning that the same job can be run on 100 computers or 100,000 computers. This helped to solve the issue many companies were facing, as they could process large amounts of data by splitting it up over many devices. The ideas behind MapReduce were then used to create an open-source project called ‘Hadoop’, which enables anyone to handle large volumes of data across their own networks and devices.
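The ‘map’ and ‘reduce’ steps described above can be sketched with the classic word-count example. This is a minimal, single-machine illustration in Python, not Google's or Hadoop's actual code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for this sketch, and a real system would run the map and reduce steps on many computers in parallel.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """'Map' step: emit a (key, value) pair of (word, 1) for every word."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group values by key so that all values for one key end up together."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """'Reduce' step: combine all values that share the same key."""
    return key, sum(values)


def word_count(documents):
    # On a cluster, each document (or chunk of one) would be mapped on a
    # different machine; here the documents are simply processed in turn.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["big data is big", "data is everywhere"]
    print(word_count(docs))  # {'big': 2, 'data': 2, 'is': 2, 'everywhere': 1}
```

Because each document is mapped independently and each key is reduced independently, the work can be split across as many machines as are available, which is where the scalability comes from.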
Where is big data used?
Big data’s use in business has led to very smart but scary algorithms which can use your data to manipulate how you live your life. Whether this is through the advertisements you are shown or the videos recommended to you, big data is more in control of your life than you may believe. Here are some interesting examples of how big-name companies use your data:
- Netflix: Netflix can use many facts about what you have watched to tailor what it recommends to you. For example, actors, genres and directors can all play a role. Furthermore, how these films/TV shows are recommended also varies, as they can be portrayed as different genres by using different parts of the film/TV show as a trailer. Netflix also uses such information on a global scale to determine which films/TV shows it will buy the rights for in the future.
- Starbucks: Starbucks uses lots of relevant information, such as road traffic, area culture and of course drink popularity, to target not only advertisements but also where it opens its next branches. This is how the business can remain successful even when there are three shops on the same street. Interactions through its rewards app also help to gather large amounts of customer data. Big data is also used to design special limited-time offers. For example, when Tennessee was struck by a heatwave, Starbucks created a Frappuccino promotion to raise sales.
- Amazon: Amazon uses your data every time you shop to manipulate how you will shop with them the next time. Most of the time, this makes shopping quicker and more optimised, as names and details are remembered for you. However, deeper algorithms also target products and sales at you to entice you to buy even more from the store. Amazon can use your past patterns to predict what you will purchase in the future, which means that where Amazon stores its products depends on its big data algorithms.
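As a rough illustration of how attributes such as actors, genres and directors could feed a recommendation, the sketch below ranks unwatched titles by how much their attributes overlap with what a viewer has already watched. It is a toy example in Python and not how Netflix or Amazon actually work: the titles, attribute labels and the `recommend` function are all hypothetical, and the overlap measure used here (Jaccard similarity) is just one simple choice.

```python
def jaccard(a, b):
    """Overlap between two attribute sets: shared items / all items."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(watched, catalogue, top_n=2):
    """Rank unwatched titles by attribute overlap with the viewing history."""
    # Build one profile from the attributes of everything already watched.
    profile = set().union(*(catalogue[title] for title in watched))
    candidates = [title for title in catalogue if title not in watched]
    ranked = sorted(
        candidates,
        key=lambda title: jaccard(profile, catalogue[title]),
        reverse=True,
    )
    return ranked[:top_n]


if __name__ == "__main__":
    # Made-up catalogue: every title and attribute here is a placeholder.
    catalogue = {
        "Title A": {"genre:sci-fi", "actor:X", "director:P"},
        "Title B": {"genre:sci-fi", "actor:X", "director:Q"},
        "Title C": {"genre:comedy", "actor:Z", "director:R"},
        "Title D": {"genre:sci-fi", "actor:Y", "director:Q"},
    }
    print(recommend(["Title A"], catalogue))  # ['Title B', 'Title D']
```

Real recommendation systems combine far more signals, but the basic idea of comparing what you have watched or bought with what is on offer is the same.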
Where will big data be used in the future?
The future of big data is both exciting and worrisome. Computers will get better and better at doing things for you, which is both a good thing and a bad thing. Algorithms will become smarter and smarter as more data is produced. The ethics behind big data will also become much more prominent in the future, which is why some people wish to add another ‘v’, ‘virtue’, to the five ‘v’s. Big data will also create jobs, and many predict that ‘data officer’ and ‘data scientist’ will become increasingly popular professions. Big data has the potential to become incredibly valuable in sectors such as the health industry, helping doctors to prescribe better medication for patients. Overall, big data is definitely a topic to keep your eye on.
Bibliography
- Intro to Big Data: Crash Course Statistics #38 – CrashCourse
- What is Hadoop? – Intricity101
- What is Big Data? – Computerphile
- What is MapReduce? – Internet-class
- How Much Data Do We Create Every Day? The Mind-Blowing Stats Everyone Should Read – Forbes
- 10 companies that are using big data – ICAS
- Starbucks: Using Big Data, Analytics And Artificial Intelligence To Boost Performance – Forbes