Emerging Big Data Ecosystem
Data Generators (Devices): this group consists of data devices that generate new data about data. For each megabyte of new data created, an additional gigabyte is generated.
- For instance, loyalty cards, which have recently become popular, were built to collect information about spending habits, most visited stores, best hours of the day, and best-selling products. Once these data are gleaned and analyzed, business executives gain actionable knowledge of sales patterns and can make better decisions [17].
- Blizzard Entertainment, a leading video game developer and publisher based in Irvine, uses massive big data platforms for all of its games, tools, and operations. The company uses robust pipelines to collect global information that powers analytics, operations, machine learning, and discovery [18].
Data Collectors: as the name suggests, this group collects data about users and devices, with attention to their attributes and attitudes.
- Nielsen, long the dominant player in data collection for the television industry, tracks activity on mobile devices, the internet, and cable television to gain insight into consumer sentiment, brand reputation, and consumer reaction to public relations events [19].
- Furthermore, retail stores use RFID chips to track the path a customer takes through the store and learn which products get the most foot traffic [20].
Data Aggregators: this group collects data and draws patterns from it. These organizations glean data from various sources such as retail stores, sensors, websites, and smartphones, analyze it, generate insights, and then sell the result as a product to other organizations that need it [15].
- Acxiom is a company that concentrates on business and residential listings in the U.S. and Canada. The company gleans data about business names, phone numbers, addresses, classification codes, and coordinates. In addition, Acxiom provides developers with more than 100 million indexed keywords and phrases [21]. This information is used to generate patterns of behavior.
Data Consumers (users/buyers): this group benefits from the data gleaned and crunched by others within the data value chain.
- There are numerous scenarios in which data is used to benefit the business. For instance, Donald Trump’s 2016 presidential campaign used data analysis based on a hyper-targeted psychological approach that was deemed successful [22].
Data devices [shown in the (1) section of Figure 1-11] and the “Sensornet” gather data from multiple locations and continuously generate new data about this data. For each gigabyte of new data created, an additional petabyte of data is created about that data. [2]
For example, consider someone playing an online video game through a PC, game console, or smartphone. In this case, the video game provider captures data about the skill and levels attained by the player. Intelligent systems monitor and log how and when the user plays the game. As a consequence, the game provider can fine-tune the difficulty of the game, suggest other related games that would most likely interest the user, and offer additional equipment and enhancements for the character based on the user’s age, gender, and interests. This information may get stored locally or uploaded to the game provider’s cloud to analyze the gaming habits and opportunities for upsell and cross-sell, and identify archetypical profiles of specific kinds of users.
● Smartphones provide another rich source of data. In addition to messaging and basic phone usage, they store and transmit data about Internet usage, SMS usage, and real-time location. This metadata can be used for analyzing traffic patterns by scanning the density of smartphones in locations to track the speed of cars or the relative traffic congestion on busy roads. In this way, GPS devices in cars can give drivers real-time updates and offer alternative routes to avoid traffic delays.
● Retail shopping loyalty cards record not just the amount an individual spends, but the locations of stores that person visits, the kinds of products purchased, the stores where goods are purchased most often, and the combinations of products purchased together. Collecting this data provides insights into shopping and travel habits and the likelihood of successful advertisement targeting for certain types of retail promotions.
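The "combinations of products purchased together" mentioned for loyalty cards can be illustrated with a simple pair count. This is only a minimal sketch under stated assumptions: the baskets are made-up illustrative data, and the function name co_purchase_counts is hypothetical rather than anything taken from the source.

```python
from collections import Counter
from itertools import combinations

# Hypothetical loyalty-card transactions: each basket is the set of
# products bought in one store visit (illustrative data only).
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "cereal"},
    {"bread", "milk", "cereal"},
]


def co_purchase_counts(baskets):
    """Count how often each unordered pair of products shares a basket."""
    counts = Counter()
    for basket in baskets:
        # sorted() gives each pair a canonical order, so (a, b) and (b, a)
        # are counted as the same pair.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


pair_counts = co_purchase_counts(baskets)
# The most frequent pairs are candidates for joint promotions.
top_pairs = pair_counts.most_common(3)
```

A real retailer would likely run this kind of count over far more transactions and normalize by how often each product sells on its own; the sketch only shows the basic idea.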
Data Science and Big Data Analytics
Data collectors [the blue ovals, identified as (2) within Figure 1-11] include sample entities that collect data from the device and users.
● Data results from a cable TV provider tracking the shows a person watches, which TV channels someone will and will not pay for to watch on demand, and the prices someone is willing to pay for premium TV content.
● Retail stores tracking the path a customer takes through their store while pushing a shopping cart with an RFID chip so they can gauge which products get the most foot traffic using geospatial data collected from the RFID chips.
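As a rough illustration of the RFID example, foot traffic can be estimated by counting the distinct carts read in each store zone. This is a minimal sketch under stated assumptions: the read format (cart ID plus zone name), the zone names, and the function foot_traffic are all hypothetical and not described in the source.

```python
from collections import defaultdict

# Hypothetical RFID reads as (cart_id, zone) pairs; a cart may be read
# several times in the same zone while the customer lingers there.
reads = [
    ("cart-1", "produce"),
    ("cart-1", "dairy"),
    ("cart-1", "dairy"),
    ("cart-2", "produce"),
    ("cart-2", "bakery"),
    ("cart-3", "produce"),
]


def foot_traffic(reads):
    """Return the number of distinct carts seen in each zone."""
    carts_by_zone = defaultdict(set)
    for cart_id, zone in reads:
        # A set ignores repeated reads of the same cart in one zone.
        carts_by_zone[zone].add(cart_id)
    return {zone: len(carts) for zone, carts in carts_by_zone.items()}


traffic = foot_traffic(reads)
busiest_zone = max(traffic, key=traffic.get)
```

Counting distinct carts rather than raw reads keeps a single slow shopper from inflating a zone's traffic figure.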
Data aggregators (the dark gray ovals in Figure 1-11, marked as (3)) make sense of the data collected from the various entities from the “SensorNet” or the “Internet of Things.” These organizations compile data from the devices and usage patterns collected by government agencies, retail stores, and websites. In turn, they can choose to transform and package the data as products to sell to list brokers, who may want to generate marketing lists of people who may be good targets for specific ad campaigns.
Data users and buyers are denoted by (4) in Figure 1-11. These groups directly benefit from the data collected and aggregated by others within the data value chain.
● Retail banks, acting as a data buyer, may want to know which customers have the highest likelihood to apply for a second mortgage or a home equity line of credit. To provide input for this analysis, retail banks may purchase data from a data aggregator. This kind of data may include demographic information about people living in specific locations; people who appear to have a specific level of debt, yet still have solid credit scores (or other characteristics such as paying bills on time and having savings accounts) that can be used to infer credit worthiness; and those who are searching the web for information about paying off debts or doing home remodeling projects. Obtaining data from these various sources and aggregators will enable a more targeted marketing campaign, which would have been more challenging before Big Data due to the lack of information or high-performing technologies.
● Using technologies such as Hadoop to perform natural language processing on unstructured, textual data from social media websites, users can gauge the reaction to events such as presidential campaigns. People may, for example, want to determine public sentiments toward a candidate by analyzing related blogs and online comments. Similarly, data users may want to track and prepare for natural disasters by identifying which areas a hurricane affects first and how it moves, based on which geographic areas are tweeting about it or discussing it via social media.
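To make the idea of gauging public sentiment from online comments concrete, here is a toy lexicon-based scorer. It is only a sketch under stated assumptions: it runs in plain Python on a few made-up posts rather than on Hadoop, the word lists are hypothetical, and real sentiment analysis would use far richer language models than word counting.

```python
import re

# Hypothetical sentiment lexicon (illustrative words only).
POSITIVE = {"great", "strong", "support"}
NEGATIVE = {"weak", "bad", "oppose"}


def score(text):
    """Positive-word count minus negative-word count for one post."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def summarize(posts):
    """Label each post by the sign of its score and count the labels."""
    summary = {"positive": 0, "negative": 0, "neutral": 0}
    for post in posts:
        s = score(post)
        label = "positive" if s > 0 else "negative" if s < 0 else "neutral"
        summary[label] += 1
    return summary


# Made-up posts about a candidate.
posts = [
    "Great speech, strong debate performance",
    "Weak answers and a bad plan",
    "The rally starts at noon",
]
result = summarize(posts)
```

At the scale the text describes, the same per-post scoring step would be distributed across many machines, with the label counts combined at the end.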
The Big Data ecosystem demands three categories of roles, as shown in Figure 1-12. These roles were described in the McKinsey Global study on Big Data, from May 2011 [1].
The first group, Deep Analytical Talent, is technically savvy, with strong analytical skills. Members possess a combination of skills to handle raw, unstructured data and to apply complex analytical techniques at massive scales. This group has advanced training in quantitative disciplines, such as mathematics, statistics, and machine learning.
The second group, Data Savvy Professionals, has less technical depth but has a basic knowledge of statistics or machine learning and can define key questions that can be answered using advanced analytics. These people tend to have a base knowledge of working with data, or an appreciation for some of the work being performed by data scientists and others with deep analytical talent. Examples of data savvy professionals include financial analysts, market research analysts, life scientists, operations managers, and business and functional managers.
The third category of people mentioned in the study is Technology and Data Enablers. This group represents people providing technical expertise to support analytical projects, such as provisioning and administrating analytical sandboxes, and managing large-scale data architectures that enable widespread analytics within companies and other organizations. This role requires skills related to computer engineering, programming, and database administration.