Big Data: Powering the Age of Information
Not so long ago, many of us used flagship Nokia phones that could store only a few kB (kilobytes; 1 kB = 1000 bytes) of data in the form of contacts. Then came the era of slightly smarter phones with storage capacities measured in MB (megabytes; 1 MB = 1000 kB). Now, with the proliferation of smartphones with 4K-resolution cameras, phone storage has gone up to hundreds of GB (gigabytes; 1 GB = 1000 MB). Sometimes even that is not enough to capture and save our picture-perfect lives, so we rely on cloud storage services such as OneDrive, Google Drive, etc. (read more about Cloud here)
So, if we were to measure an individual’s Data Footprint (i.e. the amount of data generated by a person over a lifetime), it would certainly run into trillions of bytes, that is, terabytes (1 TB = 1000 GB). With over 4 billion active internet users (approximately 59% of the world population), we can only imagine the amount of data generated! Crazy, right?!
Joining this crazy data party are the Internet of Things (IoT) devices. These devices use sensors, store data and employ data analytics to make decisions concerning their task (you can read more about IoT here). As IoT devices are constantly connected to the internet, they continuously generate information. The number of IoT units already exceeds the human population and is still rising, so the amount of data they generate is humongous. Hence, storing and analysing this data is a complex challenge.
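The unit ladder used above (each step is a factor of 1000) can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this article, and the "1 GB per day for 80 years" figure is an arbitrary assumption chosen to show the arithmetic, not a measured data footprint.

```python
# Decimal storage units as used in this article: each step is a factor of 1000.
UNITS = ["B", "kB", "MB", "GB", "TB"]


def to_bytes(value, unit):
    """Convert a value expressed in `unit` into bytes."""
    return value * 1000 ** UNITS.index(unit)


def humanize(num_bytes):
    """Express a byte count in the largest unit from UNITS that fits."""
    value = float(num_bytes)
    for unit in UNITS:
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1000


# Purely hypothetical assumption: 1 GB generated per day, for 80 years.
lifetime_bytes = to_bytes(1, "GB") * 365 * 80

print(humanize(1500))            # 1.5 kB
print(humanize(lifetime_bytes))  # 29.2 TB
```

Even with such a modest assumed daily figure, the total lands in the terabyte range, which is the scale the paragraph above refers to.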
What is Big Data?
As elaborated above, the Digital Age that we all live in is characterised by ‘Data’. How do you think search engines and social media companies decide which ads to show you? They analyse all the information they have about you: your age, profession, past searches, shopping patterns and preferences, browsing history, etc. They even look at your precise or approximate location and use information about what is sold around you to target relevant ads. Ignoring the privacy concerns for a moment, we can agree that all of this data is not only huge in quantity but also varied. Hence it is called Big Data. Quite self-explanatory, right?
Characteristics of Big Data –
How is Big Data (BD) different from ordinary data? Does the ‘Big’ in its name refer only to the quantity of data generated? These are the two most frequently asked questions about BD. To answer them, a set of characteristics has been defined that distinguishes Big Data from other data. These 5Vs of Big Data are as follows –
1) Volume – We have seen earlier that Big Data is characterised by a tremendous volume of data. This is because it is collected from a large number of sources.
2) Variety – This large amount of data can be in the form of spreadsheets, search queries, images, text, video, etc.
3) Velocity – At every moment, hundreds of TBs of information are generated all over the world. Big Data processing therefore demands real-time collection and storage. This makes handling BD a herculean task, warranting the use of specialised servers and processors with faster turnaround times.
4) Veracity – While the first three characteristics focus on the quantitative aspect of BD, the next two relate to its quality. Veracity concerns the reliability of incoming data. It mainly involves filtering, translating, handling and managing Big Data efficiently.
5) Value – This is the most important feature of BD. Generating value for businesses, governments, consumers, etc. from this huge mass of data is the very reason the data is analysed. Only information that is reliable, useful and valuable is extracted from Big Data and stored after careful analysis.
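To make the ‘Veracity’ step a little more concrete, here is a minimal Python sketch of filtering incoming records for reliability. Everything in it is hypothetical: the sensor readings, the field name temperature_c and the plausible range of -50 to 60 degrees Celsius are assumptions made for illustration, not rules from any real system.

```python
def is_reliable(reading):
    """Hypothetical veracity rule: the value must be numeric and plausible."""
    value = reading.get("temperature_c")
    return isinstance(value, (int, float)) and -50 <= value <= 60


# Made-up readings from imaginary IoT temperature sensors.
readings = [
    {"sensor": "s1", "temperature_c": 24.5},
    {"sensor": "s2", "temperature_c": None},  # missing value
    {"sensor": "s3", "temperature_c": 999},   # implausible value
    {"sensor": "s4", "temperature_c": 19.0},
]

# Keep only the readings that pass the reliability check.
reliable = [r for r in readings if is_reliable(r)]
print([r["sensor"] for r in reliable])  # ['s1', 's4']
```

Real pipelines apply many more checks (duplicates, timestamps, source trust), but the idea is the same: unreliable records are filtered out before any value is drawn from the data.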
Where does it come from?
Any device, gadget or machine that generates, captures, stores or analyses data is a source for Big Data. Particularly, the following streams generate the largest quantity of ‘Big Data’ –
1. Social Media Sites – The images, videos, GIFs, captions, tweets, links, etc. that we post on social media, are all a part of Big Data.
2. IoT Devices – As mentioned earlier, due to their 24×7 connectedness, IoT devices are a large and important Big Data source.
3. Internet – Every time you use the internet, be it for browsing, payments, entertainment, communication, etc., Big Data is generated. Even your cloud backups contribute to its ever-increasing volume.
Types of Big Data –
Having read about the 5Vs of Big Data and its 3 most significant sources, you might have realised that Big Data is not homogeneous. In fact, it is an amalgamation of diverse categories of information originating from varied sources. Broadly, there are 3 different types of Big Data, as follows –
I) Structured Data – This is data that is neatly organised into tables or spreadsheets, where each column generally holds only one kind of entry, for example numeric values or text. Structured data is a rarity in the world of Big Data, as our complex systems generate data that combines different formats.
II) Unstructured Data – Information in the form of images, videos, GIFs, raw text, audio files, etc. falls into this category, as none of it has a predefined set of fields or a fixed layout. This is the most abundant type of Big Data in the world.
III) Semi-structured Data – This lies somewhere between the structured and unstructured types. Semi-structured data is somewhat organised and can often be brought into tabular form, but it does not comply with the formal structure that defines structured data. Because it is not as irregular as unstructured data, it requires less pre-processing.
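The difference between the three types is easier to see with tiny samples. The Python sketch below uses only the standard library; the sample records (customers, tags, a review sentence) are invented for illustration. CSV stands in for structured data, JSON for semi-structured data and a free-text sentence for unstructured data.

```python
import csv
import io
import json

# Structured: fixed columns, every row has exactly the same fields.
structured = "customer_id,age,city\n1,34,Pune\n2,27,Mumbai\n"
rows = list(csv.DictReader(io.StringIO(structured)))

# Semi-structured: records describe themselves with keys,
# but two records need not share the same shape.
semi_structured = (
    '[{"id": 1, "tags": ["sale", "summer"]},'
    ' {"id": 2, "location": {"city": "Pune"}}]'
)
records = json.loads(semi_structured)

# Unstructured: free text with no predefined fields at all.
unstructured = "Loved the new summer collection, but delivery took a week!"

print(rows[0]["city"])                     # Pune
print(sorted({k for r in records for k in r}))  # ['id', 'location', 'tags']
print(len(unstructured.split()))           # 10
```

Notice that the structured rows can be queried by column name straight away, the semi-structured records first need their differing keys reconciled, and the unstructured text offers nothing beyond the raw words until it is processed further.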
Why is Big Data Important?
Having learnt what Big Data is, along with its characteristics, types and the difficulties in processing it, the obvious question is: why do we need it? So let’s delve further into the topic.
Organisations such as restaurants, businesses, governments, factories, retail outlets, etc. need data to know their customers so as to boost sales. For example, a textile retail chain would need information about its customers’ demographics, preferences, sizes and fittings, etc. This information would help it make decisions about discount sale dates, inventory and storage, and answer questions such as which label sells the most and when most sales occur. All this information helps businesses adjust their products and marketing strategies so as to better appeal to customers. Before the advent of such technology, sales information was collated and analysed by astute businessmen and their marketers. But with tools to handle, process and analyse Big Data, the process of scouring for relevant information is now much simpler and more efficient.
Some of the other fields that rely heavily on Big Data Analytics are Banking, Entertainment and Medicine. It may come as a surprise, but Big Data Analytics is also used extensively in political campaigns, elections and targeted voting drives.
To summarise the application areas of Big Data, the following are the main purposes for which it is analysed and processed –
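As a toy version of the retail questions above (which label sells the most, and when do most sales occur), here is a short Python sketch. The sales records, label names and months are all invented for illustration; a real retail chain would run this kind of aggregation over millions of rows using dedicated Big Data tools rather than a simple loop.

```python
from collections import Counter

# Hypothetical sales records for an imaginary textile retail chain.
sales = [
    {"label": "Denim Co", "month": "Oct", "units": 12},
    {"label": "Linen House", "month": "Oct", "units": 7},
    {"label": "Denim Co", "month": "Nov", "units": 20},
    {"label": "Cotton Club", "month": "Nov", "units": 9},
    {"label": "Linen House", "month": "Dec", "units": 5},
]

# Add up the units sold per label and per month.
units_by_label = Counter()
units_by_month = Counter()
for sale in sales:
    units_by_label[sale["label"]] += sale["units"]
    units_by_month[sale["month"]] += sale["units"]

best_label, label_units = units_by_label.most_common(1)[0]
best_month, month_units = units_by_month.most_common(1)[0]

print(best_label, label_units)  # Denim Co 32
print(best_month, month_units)  # Nov 29
```

With answers like these, the chain in the example could decide which label to stock more of and which month to schedule a discount sale in.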
A) To understand customer behaviour and feedback.
B) To improve operational efficiency and maximize return on investment.
C) To enhance cybersecurity and avoid related losses.
D) To identify trends and patterns, analyse them and predict the future.
By now, you must have gauged the significance of Big Data in our current world. It wouldn’t be an exaggeration to call Big Data the fuel that ushered in, and keeps running, the Age of Information that we are living in. As the nature of businesses has changed and they have built a greater online presence in the post-pandemic world, the importance of Big Data Analytics has also grown rapidly.
That has led to increased employment opportunities for Data Scientists and Business Analysts. Further, due to the trend of targeted online marketing, other jobs such as those of Influencers, Digital Marketers, Search Engine Optimization (SEO) Analysts, etc. are also on the rise. In fact, multiple surveys since 2020 have termed Data Scientist the ‘Hottest Job’ of the coming years. These statistics reiterate the trend observed so far, that advanced technology leads to more jobs in the creative field.
For example, if you have a LinkedIn Creator profile or a Business Account on Instagram, you are given a set of free analytics tools that tell you the preferences of your audience. This allows you to curate your content so that it has a wider appeal amongst your target audience, thus increasing your reach. Hence, we can confidently say that the ‘Content Economy’ has further propelled the creation and utility of Big Data. Given the ubiquitous presence of Big Data and its impact on almost all aspects of human life, we can truly say that ‘Data is the new Oil!’.
Stay tuned for next week’s follow-up article, in which we will discuss the fields of Data Analytics and Data Science. We shall also look into the privacy concerns raised by the omnipresence of data-gathering devices. Happy Reading!
Vishvali Deo is an E&TC (Electronics and Telecommunication) Engineer by education and Software Engineer by Profession. She believes that 'Technology is a Great Democratising and Equalising Force' and hence is on a mission to make the general public understand seemingly complex technologies in a simple manner.
She is convinced that the roots of today's world problems lie in the past, hence she has also pursued post-graduation in History. She has a keen interest in and a good grip on Economics, Political Science and Environmental Engineering. She has a penchant for working with Women and spreading Digital Literacy amongst them, with the aim of their empowerment. She also strives to provide Free Quality Education to children and counsels young adults. Besides, she is also skilled at Public Speaking, having won many awards in Elocution & Debate Competitions and Technical Paper Presentations.