Big Data

Big data is a field that treats ways to analyze, systematically extract information from, or otherwise deal with data sets that are too large or complex to be dealt with, by traditional data-processing application software. Data with many cases (rows) offer greater statistical power, while data with higher complexity (more attributes or columns) may lead to a higher false discovery rate. Big data challenges include capturing data, data storage, data analysis, search, sharing, transfer, visualization, querying, updating, information privacy and data source. Big data was originally associated with three key concepts: volume, variety, and velocity.When we handle big data, we may not sample but simply observe and track what happens. Therefore, big data often includes data with sizes that exceed the capacity of traditional software to process within an acceptable time and value.

Big data system

Big data is often characterized by the 3Vs: the large volume of data in many environments, the wide variety of data types stored in big data systems and the velocity at which the data is generated, collected and processed.

 These characteristics were first identified by Doug Laney, then an analyst at Meta Group Inc., in a report published in 2001; Gartner further popularized them after it acquired Meta Group in 2005. More recently, several other Vs have been added to different descriptions of big data, including veracity, value and variability.

Although big data doesn’t equate to any specific volume of data, big data deployments often involve terabytes (TB), petabytes (PB) and even exabytes (EB) of data captured over time.

Examples of big data

Big data comes from myriad different sources, such as business transaction systems, customer databases, medical records, internet clickstream logs, mobile applications, social networks, scientific research repositories, machine-generated data and real-time data sensors used in internet of things (IoT) environments. The data may be left in its raw form in big data systems or preprocessed using data mining tools or data preparation software so it’s ready for particular analytics uses.

Using customer data as an example, the different branches of analytics that can be done with the information found in sets of big data include the following:

Comparative analysis. This includes the examination of user behavior metrics and the observation of real-time customer engagement in order to compare one company's products, services and brand authority with those of its competition.

Social media listening. This is information about what people are saying on social media about a specific business or product that goes beyond what can be delivered in a poll or survey. This data can be used to help identify target audiences for marketing campaigns by observing the activity surrounding specific topics across various sources.

Marketing analysis. This includes information that can be used to make the promotion of new products, services and initiatives more informed and innovative.

Customer satisfaction and sentiment analysis. All of the information gathered can reveal how customers are feeling about a company or brand, if any potential issues may arise, how brand loyalty might be preserved and how customer service efforts might be improved.

Breaking down the Vs of big data

Volume is the most commonly cited characteristic of big data. A big data environment doesn’t have to contain a large amount of data, but most do because of the nature of the data being collected and stored in them. Clickstreams, system logs and stream processing systems are among the sources that typically produce massive volumes of big data on an ongoing basis.

CAREER FORM