Big Data analytic tools are promising techniques for predicting the future in many aspects of our lives, and the need for such predictive techniques has been increasing rapidly. Even though many challenges and risks still concern researchers and decision makers, the outcome of using these techniques is expected to move our world considerably towards a new era of technology.
Big Data Analytics and Its Impact on Internet Users
Salaheddin Khiri M. Beskri 1, Sharafaldeen Mohamed Ashoury 2
Universiti Utara Malaysia
Email addresses: 1 Eng.salaheddin@Gmail.com, 2 shraf82@Gmail.com
I. INTRODUCTION
In the new era of globalization, the amount of digital data
has grown tremendously, and analysing it has consequently
become a serious challenge for productivity growth,
innovation and consumer surplus, all of which rely on data
analysis to reach the right decisions. Big data has become a
21st-century challenge; it consists of varieties of imperfect,
complex and unstructured data. Finding a solution to this
matter is a concern of development communities and
policymakers. Big Data is not only about how much data is
held in storage. It is about predicting current and future
issues in all aspects of life and answering questions that
used to be considered out of reach.
Big Data is a combination of three components (the Three
V’s), as shown in Fig. 1, namely:
1) Volume: the amount of incoming data grows rapidly in
size, to the order of Petabytes1. Smart meters and heavy
industrial equipment such as machine sensors generate data
volumes similar to those of social media and millions of
other traditional databases.
2) Variety: data is generated in different formats from
different sources (e.g. images, audio, video and text), which
is one of the defining characteristics of Big Data.
3) Velocity: the speed at which data arrives. Different
sources deliver data at different speeds, which makes the
data grow faster and become harder to analyse.
Fig. 1 Big Data Characteristics
1 1 Petabyte = 1,000,000,000 Megabytes
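As a worked check of the scale involved, the short sketch below converts between byte units. It assumes decimal (SI) prefixes, where each step is a factor of 1,000; the names UNITS and convert are illustrative and do not come from the paper.

```python
# Decimal (SI) byte units: each prefix is a factor of 1,000 (an assumption;
# binary prefixes would use factors of 1,024 instead).
UNITS = {"B": 1, "KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12, "PB": 10**15}


def convert(value, from_unit, to_unit):
    """Convert a size between two decimal byte units."""
    return value * UNITS[from_unit] / UNITS[to_unit]


if __name__ == "__main__":
    print(convert(1, "PB", "MB"))  # megabytes in one petabyte
    print(convert(1, "PB", "GB"))  # gigabytes in one petabyte
```

Under these assumptions one petabyte corresponds to one thousand million megabytes, or one million gigabytes.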
II. TYPE OF BIG DATA
Data flows from and is generated by various sources through
different types of channels every second. Weblogs, social
media, email, sensors, photographs and transactional data are
some examples of types of big data, and they are classified
as follows [5]:
Traditional enterprise data: including ERP databases and
traditional information systems.
Machine-generated/sensor data: surveillance cameras,
industrial systems, weblogs.
Social data: consumers’ comments, Twitter, social media
platforms like Facebook, Space, Instagram.
Multimedia content, however, has played a main role in
the extreme growth of data. Each second of high-definition
video, for instance, adds 2000 times as many bytes as a page
of text [4]. In addition, since the rate of new users joining
social media (e.g. Facebook) reached 100,000 users per month
in 2012 [3], large amounts of varied data have been uploaded
to the Internet and shared between users globally. Medical
images, maps, video files and data stored in data warehouses
are other types of Big Data.
In other words, big data can take any form, structured or
unstructured, such as pictures, audio and video files, text,
log files and many more.
III. BENEFITS OF BIG DATA
Every day our world encounters new issues in the digital
domain, and numerous techniques are invented to overcome
them. Big Data, as one such solution, will bring numerous
opportunities to different aspects of life.
Basically, advanced analytics enables you to create and
develop models that can be used to predict answers to many
critical questions. For instance, a statistical model that
combines consumer buying behaviour with consumer profiles
can then be used to predict the future behaviour of
consumers.
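A minimal sketch of such a model is given here, assuming a very simple frequency estimate: the probability of a purchase is estimated for each combination of profile (age group) and past behaviour. The data, field names and functions are hypothetical and only illustrate the idea; real analytic models are far richer.

```python
from collections import defaultdict

# Hypothetical history: (age_group, bought_last_month, bought_this_month).
HISTORY = [
    ("18-25", True, True),
    ("18-25", True, False),
    ("18-25", False, False),
    ("26-40", True, True),
    ("26-40", True, True),
    ("26-40", False, True),
    ("26-40", False, False),
]


def fit(history):
    """Estimate P(buy | profile, past behaviour) as an observed frequency."""
    counts = defaultdict(lambda: [0, 0])  # key -> [purchases, observations]
    for age_group, bought_before, bought_now in history:
        key = (age_group, bought_before)
        counts[key][0] += int(bought_now)
        counts[key][1] += 1
    return {key: bought / total for key, (bought, total) in counts.items()}


def predict(model, age_group, bought_before, default=0.5):
    """Return the estimated purchase probability (default for unseen segments)."""
    return model.get((age_group, bought_before), default)


if __name__ == "__main__":
    model = fit(HISTORY)
    print(predict(model, "26-40", True))
    print(predict(model, "18-25", False))
```

The default value for unseen segments is an arbitrary choice made for this sketch.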
Companies that struggle every day to maintain their
competitive advantage in the market are the ones that will
benefit from Big Data in many ways [6], such as:
A. Detect, prevent and remediate financial fraud
Many organizations use Big Data to reveal criminal fraud
and any attempts to harm their network systems and servers,
and also to predict the most likely future attacks against
their systems. For this reason, organizations have embraced
the most powerful analytics techniques.
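One simple way to illustrate fraud detection is to flag transactions that lie far from the typical amount. The sketch below uses a modified z-score based on the median absolute deviation, with the conventional scaling constant 0.6745 and a cut-off of 3.5; this technique is chosen here only for illustration and is not the method described in [6]. The amounts are hypothetical.

```python
from statistics import median


def flag_outliers(amounts, threshold=3.5):
    """Return the amounts whose modified z-score exceeds the threshold."""
    centre = median(amounts)
    mad = median([abs(a - centre) for a in amounts])  # median absolute deviation
    if mad == 0:
        return []  # no spread in the data, so nothing can be scored
    return [a for a in amounts if 0.6745 * abs(a - centre) / mad > threshold]


if __name__ == "__main__":
    # Hypothetical card transactions; one amount is far from the rest.
    transactions = [20, 25, 22, 19, 24, 21, 23, 500]
    print(flag_outliers(transactions))
```

A median-based score is used rather than the mean because a single extreme transaction would otherwise distort the baseline it is compared against.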
B. Maintaining customer lifetime value
Companies mainly use marketing campaigns to gain customer
value and to avoid losses. Financial services companies run
campaigns aimed at billions of potential customers. The
amount of information is therefore increasing tremendously,
and attempts to process the data with traditional tools have
failed. That failure, in turn, limits the growth of
companies’ businesses, since customer lifetime value is
short and limited.
Business managers have turned to high-performance
analytics. Institutions have achieved tremendous gains by
developing and compressing their analytic models and by
further validating the variables in those models to obtain
greater reliability [6].
C. Improve delinquent collections
Prepaid phone services are widespread among consumers
around the world. In the US mobile telecom market, however,
post-pay services are dominant, which obliges telecom
companies to conduct exhaustive searches to trace consumers’
debts. High-performance analytics has, again, changed the
way in which US mobile telecom companies calculate and
determine how much credit consumers owe them and how much
credit is left [6].
IV. CHALLENGES IN BIG DATA ANALYTICS
In the past, the amount of data was not that large, and
variety, velocity and volume were never a serious issue in
data processing and analysis. Nowadays, new data types are
being created and used in our systems. In addition, the
amount of incoming data has been increasing tremendously,
resulting in huge repositories that need to be processed.
Decision makers have therefore been looking for techniques
to process and analyse their data, which has led to the use
of Big Data analytics.
To utilize Big Data analytics, organizations should take
the major challenges into account before deploying Big Data
analytics techniques. Organizations could encounter many
challenges in dealing with big data; the main ones are
highlighted and explored in the discussion that follows:
A. Internet users’ privacy
Privacy is the most sensitive issue for organizations as well
as for people around the world. “Because privacy is a pillar of
democracy, we must remain alert to the possibility that it
might be compromised by the rise of new technologies, and
put in place all necessary safeguards.” [3]. Social networks
and mobile phones are widely used to exchange information,
which is likely to result in that information being misused
by others, inadvertently or intentionally.
B. Data acquisition and sharing
Backing up data on magnetic tapes and keeping it in a secure
store makes accessing the data exhausting. “An
Indonesian mobile carrier estimated that it would take up to
half a day of work to extract one day’s worth of backup data
currently stored on magnetic tapes.” [3]. In effect, the
stored data cannot be readily accessed or transferred.
In addition, in seeking business success and competitive
advantage, companies always look for the right partners with
whom to compete against their rivals. Being a partner
requires private data to be shared and exchanged between
partners, and two things have to be secured during the
engagement period to fulfil that promise: reliable access to
data streams, and access to backup data for retrospective
analysis and data training purposes. Inter-comparability of
data and interoperability of systems are other technical
issues encountered in accessing and sharing data, but they
are less problematic than obtaining access, or a licence to
access data, from partners [3].
C. Data analysis
The type of analysis implemented and the type of decision
that it is going to inform are the main variables that
determine the relevance and severity of the analytical
challenges. In scientific research, scientists base their
decisions on data collected from different sources, and
policymakers may ask a question such as “what is the data
telling us?” The answer will tell them whether or not to
change the company’s policies.
The human analyst’s input is critical, whether the data is
fabricated or real. In many cases, decisions have gone wrong
because of a mismatch in data analysis. A good example
is Google Flu Trends, a tool whose ability to “detect
influenza epidemics in areas with a large population of web
search users” was put to the test. A group of medical experts
compared Google Flu Trends data from 2003 to 2008 with data
from two different networks2 and found that Google Flu
Trends did not predict actual flu very well, even
though “they did a very good job at predicting nonspecific
respiratory illnesses (bad colds and other infections like
SARS) that seem like the flu. The mismatch was due to the
presence of infections causing symptoms that resemble those
of influenza, and the fact that influenza is not always
associated with influenza-like symptoms” [3].
2 The CDC's influenza-like-illness surveillance network and the CDC's virologic surveillance system.
V. BIG DATA ANALYTICS RISKS
A. Data Privacy
The data acquired from different sources for analysis
belongs to Internet users. Knowingly or unknowingly,
institutions use Internet users’ public and private data to
make good or bad decisions.
B. Making False Decisions
The results of Big Data analytics are predictions. Any
failure in analysing the data, or any mistake in the process
of Big Data analysis, will result in false decisions, and
institutions then take wrong actions based on them.
Even when results have been achieved and critical questions
answered, verifying the results is still a critical step in
the whole process. Decision makers do not want to risk their
businesses by relying on unverified results. Data scientists
therefore review the whole knowledge discovery phase by
retracing the methods used, understanding the results and
critically subjecting the analysis to quality tests.
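One common form of such a quality test is hold-out verification: part of the data is set aside, the model is built on the rest, and its predictions are then checked against the held-out part. The sketch below assumes a deliberately trivial model (a single threshold) and hypothetical labelled data; the function names are illustrative only.

```python
import random


def split(records, n_test, seed=0):
    """Shuffle reproducibly and hold out n_test records for verification."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(records, threshold):
    """Share of records where the rule 'x >= threshold' matches the label."""
    return sum((x >= threshold) == label for x, label in records) / len(records)


def fit(train):
    """Choose the candidate threshold that scores best on the training records."""
    return max(sorted({x for x, _ in train}), key=lambda t: accuracy(train, t))


# Hypothetical labelled data: the label is True when x is at least 50.
RECORDS = [(x, x >= 50) for x in range(0, 100, 5)]

if __name__ == "__main__":
    train, held_out = split(RECORDS, n_test=6)
    threshold = fit(train)
    print("training accuracy:", accuracy(train, threshold))
    print("held-out accuracy:", accuracy(held_out, threshold))
```

A gap between the training accuracy and the held-out accuracy is a signal that the result should not yet be relied upon for a decision.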
C. Over-dependence on data
Big Data analytics results are just predictions. Treating
the results as predictions rather than certainties may
therefore help institutions avoid many losses. Big Data
analytic processes require talent, a lot of work and
validation. In other words, the process is iterative, and
other analytic techniques are still needed alongside it.
VI. BIG DATA ANALYTIC PROCESS
Since Big Data consists of a tremendous amount of data that
traditional processors cannot handle, the big data analytic
process must be carried out in two phases [2], as shown in
Fig. 2.
A. Knowledge Discovery
In this phase, data has to undergo certain pre-processes to
make it meaningfully coherent. There are 5 steps an
organization should go through before proceeding to the
second phase, namely:
1) Acquisition
Acquiring the data to be analysed from different
repositories is the first step. Acquiring the data requires
access to the information as well as methods to gather it,
such as tracking websites, reading machine sensors and
system or application log files, and writing queries to a
search engine. In some cases, data from sources external to
the institution might be required as well.
2) Pre-processing
To achieve trustworthy and useful results, data has to be
organized and classified based on its format.
3) Integration
In this step, the data is completely retrieved and organized.
Redundant and clustered data are eliminated, and the data
becomes a smaller representative sample.
4) Analysis
In the analysis step, data is analysed by describing and
predicting broad trends. In addition, researchers start
searching for relationships among the data, looking for
answers to their questions. Answers could be factors such as
customers’ tendencies when buying a mobile phone.
5) Interpretation
Fig. 2 Big Data Analytic Processes
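The first four steps of the knowledge discovery phase can be sketched as a small pipeline. The example assumes a toy record format ("customer,product") and hypothetical sources; the function names mirror the step names but are not taken from [2], and the interpretation step is left to the human analyst.

```python
from collections import Counter

# Hypothetical raw inputs standing in for weblogs and machine sensors.
RAW_SOURCES = {
    "weblog": ["alice,phone", "bob,laptop", "alice,phone"],
    "sensor": ["carol,phone"],
}


def acquire(sources):
    """Step 1: gather the raw records from every source."""
    return [(name, line) for name, lines in sources.items() for line in lines]


def preprocess(raw):
    """Step 2: organize each record into a common (customer, product) format."""
    return [tuple(line.split(",")) for _, line in raw]


def integrate(records):
    """Step 3: eliminate redundant records, keeping first-seen order."""
    return list(dict.fromkeys(records))


def analyse(records):
    """Step 4: describe a broad trend, here the most requested product."""
    return Counter(product for _, product in records).most_common(1)[0]


if __name__ == "__main__":
    records = integrate(preprocess(acquire(RAW_SOURCES)))
    print(records)
    print(analyse(records))
```

Each function takes the output of the previous one, which reflects the sequential nature of the steps; in practice every stage would be considerably more involved.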
B. Application implementation
In this phase, after the data has gone through a series of
processes with the help of certain algorithms, it is fed into
an application owned by the institution to determine what
the institution should do and how. For instance, it may
predict customer behaviour when buying products, which shops
or markets customers prefer and what they may buy. Locations
can also be predicted while customers are travelling or
driving on a road. Based on these predictions, companies
decide on the proper next actions to increase customer
lifetime value and draw customers’ attention to their
products. In a word, the institution reaps the benefits in
this phase.
VII. TECHNIQUE/APPROACH TO OVERCOME THE CHALLENGES
Consumers are the ones mainly affected by all the issues
discussed above, since business architectures contain
private information about their stakeholders, and the
privacy concern is growing as the value of big data becomes
more apparent. On the other hand, organizations that would
like to deliver the value of big data have to adopt a
flexible, multidisciplinary approach. Thus, a variety of
techniques and approaches have been developed by academics
and companies to analyse, manage, gather and visually
represent data. Some of the techniques and solutions
described in [1] for the challenges and issues discussed
are:
A. NoSQL Databases
NoSQL databases are distinct from the Structured Query
Language (SQL) that relational databases (RDBMS) use. SQL
complements relational databases and is considered the
domain-specific language for ad hoc queries, whereas
non-relational databases can use whatever query mechanism
they want: SQL is not included but can be added if needed
[1]. Relational databases cannot maintain their performance
when faced with a tremendous amount of data and many
transactions in a very small unit of time.
Big data solutions have therefore been divided into [1]:
1) Not Only SQL (NoSQL) solutions
2) SQL solutions
NoSQL and SQL solutions have been combined to achieve
the highest performance in processing transactions, while
allowing the institution to maintain privacy. “Oracle corporation’
solutions, for instance, implemented these techniques in its
solutions for enterprises and successfully met all the
challenges” [1], as shown in Fig. 3.
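The idea of combining the two kinds of store can be sketched with Python's standard library. In this sketch an in-memory SQLite table plays the role of the SQL solution and a dictionary of JSON documents stands in for a NoSQL document store; both are stand-ins chosen for illustration and do not represent the Oracle architecture referred to in [1]. All table, field and function names are hypothetical.

```python
import json
import sqlite3

# SQL side: structured, transactional data in a relational table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)", [(1, 20.0), (1, 35.5), (2, 12.0)]
)

# NoSQL side: schemaless documents keyed by customer id. The fields may
# differ from one document to the next.
documents = {
    1: json.dumps({"name": "Alice", "tags": ["mobile", "video"]}),
    2: json.dumps({"name": "Bob", "last_comment": "great service"}),
}


def customer_summary(customer_id):
    """Combine the relational total with the customer's free-form document."""
    (total,) = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM orders WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    profile = json.loads(documents.get(customer_id, "{}"))
    return {"total_spent": total, **profile}


if __name__ == "__main__":
    print(customer_summary(1))
    print(customer_summary(2))
```

The relational side answers the aggregate question with a fixed schema, while the document side tolerates records whose fields vary, which is the division of labour the combined approach relies on.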
Fig. 3 Oracle’s Big Data Solutions
B. Cloud Computing
Cloud computing has established a new era of less
expensive and easy-to-use technology. It is a pool of
computing resources that can be utilized on users’ demand.
Consequently, it has become a target for analysing and
predicting users’ insights, and a significant place for Big
Data analytics tools [4]. As customers have subscribed to
cloud applications, cloud application providers have
benefitted from these external data sources by analysing
them together with their operational systems [4].
C. Crowdsourcing
Data is gained from numerous sources. Based on certain
criteria that institutions have predetermined, labour is
needed to collect data and carry out a series of processes to
prepare the data for analysis. Using formal focus groups or
trend research to collect data is another added cost for
institutions, whereas using the Internet to elicit customer
feedback from various active communities reduces the amount
of time needed and consequently reduces staffing costs and
research expenses.
However, the techniques that we have listed are only some of
the numerous techniques that can be applied to big data.
VIII. BIG DATA ANALYTIC AND ITS SERIOUS IMPACTS ON THE INTERNET USERS
Data is man-made. Every day billions of new pieces of data
are entered onto the Internet from different sources,
beginning with social media, which is the most used source,
and ending with machine sensors (e.g. surveillance cameras).
Big Data Analytics therefore definitely impacts Internet
users, who are, for instance, the bottom line for companies.
Internet users can be influenced by this technique in many
ways, namely:
• In the marketing domain: Companies have been looking for
the best ways to sell their products using advertisements,
and the Internet has become the most targeted place for
advertising products and services. Big Data Analytics has
given companies another reason to use the Internet as their
main medium for reaching the bottom line. A pregnant woman
walking down a street past a baby goods shop might, through
GPS, receive a message on her phone from the shop
advertising its products, since she is considered a
potential customer [3]. Moreover, consumer tendencies can be
predicted using Big Data Analytics.
• In the politics domain: Internet users’ tendencies towards
candidates in a presidential election can be predicted using
Big Data Analytics. “Before the votes were cast, New York Times
blogger Nate Silver predicted, with 90%+ confidence, that
Obama would win the election” [7]. The blogger used a
Big Data analytic tool to predict the result.
• In the smart healthcare domain: The lack of follow-up with
patients after they leave a hospital, and the failure to
provide them with the necessary information on discharge,
have increased patients’ readmission rates. From the moment
a patient is admitted to a hospital and fills in an
admission form, through the stay and up to discharge, plenty
of information about the patient is collected that can help
predict the likelihood of readmission. Using Big Data
analytics, hospitals have managed to reduce readmissions
that might occur in the 30 days after a patient’s discharge
[2].
• In the education domain: Big Data Analytics has had an
intelligent impact on how much students learn. It has shown
teachers which students need more attention, exercises and
learning materials, as well as any changes needed in
classes. In addition, Big Data analytics has brought
enhancements to the education system by predicting students’
admission and dropout rates [2].
• In network security: The Internet is a network of networks
on which billions of users conduct different transactions
every day, whether harmful or useful. Hence, the need
for network security has been rising along with the risks.
Big Data Analytics has improved network security,
protecting networks from malfunctions, attacks and suspicious
activities. Moreover, Big Data Analytics predicts future
attacks or threats likely to harm a network system [2]. Big
Data analytics gathers the log files of a network’s systems
and processes them in a few steps, as shown in Fig. 2. Since
log files include all attempts to access a server, to
download or upload a file and to access specific files, as
well as system logins and email transmissions, potential
threats can be estimated and network administrators can
take precautionary measures against them.
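As a small illustration of the log analysis described in the network security item, the sketch below counts failed login attempts per address and reports the addresses that exceed a limit. The log format ("<ip> <event> <status>"), the sample lines and the limit of two failures are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical log lines in a made-up "<ip> <event> <status>" format.
LOG_LINES = [
    "10.0.0.5 login failed",
    "10.0.0.5 login failed",
    "10.0.0.5 login failed",
    "10.0.0.9 login ok",
    "10.0.0.9 download ok",
    "10.0.0.7 login failed",
]


def suspicious_ips(lines, max_failures=2):
    """Return the IPs with more than max_failures failed logins, sorted."""
    failures = Counter()
    for line in lines:
        ip, event, status = line.split()
        if event == "login" and status == "failed":
            failures[ip] += 1
    return sorted(ip for ip, count in failures.items() if count > max_failures)


if __name__ == "__main__":
    print(suspicious_ips(LOG_LINES))
```

An administrator could use such a list as a starting point for precautionary measures, for example by reviewing or temporarily blocking the reported addresses.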
IX. BIG DATA ANALYTICS LIMITATIONS
Big Data analytics has brought a new era of prediction
techniques. Big Data saves decision makers a lot of time and
money in predicting the potential future of any aspect of
our lives. Big Data still has limitations, namely:
1) Specific context is critical
Big Data might save time and money, but without the
specific context it is useless [8]. If an institution is
trying to answer a question using Big Data analytics and the
answer is not in the acquired data, Big Data becomes
useless in that situation.
2) The three V’s of data should be considered
The three V’s of Big Data should be considered before
using its technologies. If an institution’s data is merely
large in volume, Big Data analytic results are unreliable,
since velocity and variety do not exist in the data [8].
3) Traditional analytics cannot be replaced
Traditional analytic tools such as Oracle, MySQL, MS
SQL and others cannot be replaced by Big Data analytics [8].
Traditional applications and systems are still widely used to
answer the questions decision makers may ask. Most
institutions run their own applications and systems and have
assigned robust databases to hold tremendous amounts of
data. Answers can therefore be found easily in their systems
rather than by spending time on meeting the requirements of
Big Data analytics and implementing its complex processes.
However, large-scale institutions may still need both
traditional analytics and Big Data analytics. Thus, the two
are considered complementary.
CONCLUSION
Big data analytics is a promising predictive technology for
many aspects of life such as marketing, politics, healthcare
systems, network security, education and many more. Big
Data analytics will benefit many institutions that have
questions not yet fully answered. In spite of its advantages,
companies should take its risks and challenges into account
before the adoption phase. Loss of privacy, false decisions
and over-dependence on Big Data analytics results are
repercussions that unaware institutions might encounter.
REFERENCES
[1]. Dijcks, J.-P. (2012). Big Data for the Enterprise. Oracle and its Affiliates, 1–14.
[2]. Hunton and Williams LLP. (2013). Big Data and Analytics: Seeking Foundations for Effective Privacy Guidance. Center Information Policy Leadership, (February), 1–16.
[3]. Letouze, E. (2012). Big Data for Development: Challenges & Opportunities.
[4]. Manyika, J., Chui, M., Brown, B., Bughin, J., Dobbs, R., Roxburgh, C., & Hung Byers, A. (2011). Big data: The next frontier for innovation, competition, and productivity. McKinsey & Company, (June), 1–143.
[5]. Oracle, A., Paper, W., & August, E. A. (2012). Oracle Information Architecture: An Architect’s Guide to Big Data, (August).
[6]. Spakes, G. (2012, 4 16). Four ways big data can benefit your business. Retrieved from SAS: http://www.sas.com/news/feature/big-data-benefits.html
[7]. Jim, R. (2013). Obama Wins and a big data lesson for the customer experience. Customer Relationship Metrics. Retrieved October 12, 2013, from http://metrics.net/blog/2012/11/obamawins-big-data-lesson-customer-experience/
[8]. Jean, Y. (2013). Big Data, Bigger Opportunities: collaborate in the era of big data. Retrieved from http://www.meritalk.com/pdfs/bdx/bdx-whitepaper-090413.pdf.