Big Data Social Network
Analysis
by
Chamin Nalinda
(Registration No : 2011/CS/005, Index No : 11000058)
chmk90@gmail.com
+94 772416604
SCS 3017
Literature Survey
Supervised by
Dr. H. A. Caldera
BSc(Colombo), PGDip(Colombo), MSc(Colombo), PhD(Western Sydney)
University of Colombo School of Computing
Colombo 7
SRI LANKA
Texmaker | Mendeley Desktop | Harvard Style Referencing | Word Count = 5466
Declaration
I hereby declare that this literature survey report was written by Chamin Nalinda.
A great deal of analysis was carried out in preparing this report, and the bibliography
reflects the key reference materials. Self-learned knowledge was also included. References
have been cited without reproducing the owners' exact content (paragraphs, sentences
etc.).
Name of Candidate: L.G.H.C. Nalinda
Signature: ............................... Date: December 12, 2014
Abstract
Big Data Social Network Analysis (BDSNA) is the computational and graphical
study of techniques that can be used to identify clusters, patterns and hidden
structures, and to generate business intelligence, from social relationships within social
networks in terms of network theory. Social Network Analysis (SNA) has a diverse set of
applications and research areas such as Health care, Travel and Tourism, Defence and
Security, and the Internet of Things (IoT). With the boom of the internet, Web 2.0
and handheld devices, there is explosive growth in the size, complexity and variety of
unstructured data. Analysing this data and extracting information from it is therefore of
great value, and adapting the Big Data concept to SNA is vital.
This literature survey aims to investigate the usefulness of SNA in the “Big Data
(BD)” arena. It reviews major research studies that have proposed
business strategies and BD approaches for generating predictive models that address the
contemporary challenges that have arisen from SNA.
Acknowledgements
I would like to offer my heartfelt thanks to Dr. H. A. Caldera, my supervisor for the
Literature Survey, for his immense support and continuous feedback during the course
of the Survey and for guiding me with valuable ideas.
Further, my sincere gratitude goes to all the lecturers, assistant lecturers and the
entire UCSC family.
Special thanks to my parents, brother and sister who have always given me the
strength through the journey of my life.
Chamin Nalinda, December 12, 2014
Contents
Acknowledgements
List of Figures
Acronyms
1 Introduction
1.1 Approach
1.2 Motivation
1.3 History
1.4 Current status
1.5 Chapter summary
2 Big Data Social Network Analysis Domains
2.1 Health care
2.1.1 Challenges and Future
2.2 Defence and Security
2.2.1 Identifying key players in network
2.2.2 Use cases from recent history
2.2.3 Challenges and Future
2.3 Travel and Tourism
2.3.1 Web 2.0 forms Tourism 2.0
2.3.2 Tourism 2.0 Destination Management
2.3.3 Challenges and Future
2.4 Web 2.0 and IoT
2.4.1 Challenges and Future
2.5 Chapter summary
3 BDSNA Tools and Technologies
3.1 Major Concerns in BDSNA
3.2 Real Time Analysis
3.3 Lambda Architecture
3.3.1 Batch layer
3.3.2 Serving layer
3.3.3 Speed Layer
3.4 Recommendation systems
3.5 Web 2.0 IoT Architecture
3.6 Chapter summary
4 Conclusion, Challenges and Future Directions
4.1 Conclusion
4.2 Challenges and Future Directions
Bibliography
List of Figures
2.1 Sources Used to Find or Access Health and Wellness Related Information in 2008, in United States of America (USA)
2.2 9/11 attackers having weak ties with others
2.3 Decentralized terrorist network
2.4 PISTA ontology
2.5 Most consulted Social Networks (SNs) in cybertravelling
2.6 Traveller recommendation system
2.7 TAM to gain loyalty
2.8 Tweeting trees
3.1 Expected growth in real time analytics by 2015
3.2 Capabilities of Operational Intelligence
3.3 Overview of Lambda Architecture
3.4 Architecture for Social Internet of Things (SIoT) Client Side and Server Side
4.1 Hosting data on cloud and challenges
Acronyms
AMA American Medical Association
ANN Artificial Neural Network
API Application Programming Interface
BD Big Data
BDA Big Data Analytics
BDSNA Big Data Social Network Analysis
BI Business Intelligence
BP Batch Processing
CC Cloud Computing
CO Cognitive Objects
DARPA Defense Advanced Research Projects Agency
DB Database
DD Deep Data
DM Data Mining
DMO Destination Management Organizations
DT Decision Tree
DW Data Warehousing
eWOM e-word-of-mouth
FB Facebook
FC Fog Computing
IoT Internet of Things
LA Lambda Architecture
NLP Natural Language Processing
NSA National Security Agency
OI Operational Intelligence
OM Opinion Mining
RFID Radio Frequency Identification
ROS Robotic Operating System
RTA Real Time Analysis
RTBDA Real Time Big Data Analytics
SIoT Social Internet of Things
SM Social Media
SNA Social Network Analysis
SNs Social Networks
SP Stream Processing
SW Software
TAM Technology Acceptance Model
TM Text Mining
TPA Technosocial Predictive Analytics
UGC User Generated Content
US United States
USA United States of America
WSNs Wireless Sensor Networks
WWW World Wide Web
Chapter 1
Introduction
This literature survey covers the key domains in which "Social Network Analysis"
plays a vital role through the use of "Big Data" technologies. The knowledge discovered can
be used to advance the current state of the respective domains. This chapter outlines the
importance, history and growth potential of the survey topic in a nutshell.
1.1 Approach
Social Networks (SNs) connect people with different ideas, education, status, backgrounds,
geographies etc. The focal idea of Social Network Analysis (SNA) is identifying
the relationships within a network. Information diffusion is the key
driver of relationship formation. Within SNs, a variety of interests addressing
different domains are shared, which forms complex relationships. With the World Wide Web
(WWW) and Web 2.0, SNs have gained a new shift and focus. Online SNs are massive
data repositories. Visitors to SNs leave a digital footprint once they are logged in, and
hence all activities of logged-in users can be examined in online SNs. Data scientists
recognized the importance of translating these technological opportunities into revenue,
competitive advantage and useful discoveries that redefine human interaction[6] and
day-to-day life. Otherwise the data would have remained in data tombs and the
opportunities would have been ignored.
BD analytical approaches are a trusted means of analysing SNs. Data Mining
(DM) techniques are heavily used to dig deeper into the data in SNs. Big Data Analytics
(BDA) is a proven method of defining new storage, access, query and scaling mechanisms
for data and of developing new approaches to sentiment analysis, predictive modeling,
Natural Language Processing (NLP), click stream pattern recognition etc.
BDSNA is a fast-growing research area. There are quite a number of algorithms,
software tools and analytic engines that are optimized[39] for BDSNA. These tools
are capable of gathering and processing data, analysing it and presenting results visually
for a particular domain. This literature survey gives an overview of the BDSNA topic
as published in research papers, journals, web articles, books etc.
1.2 Motivation
“Connectivity” is the concept behind the formation of SNs. The capabilities provided by SNs,
sensors and online networks make them rich data sources. People spend a substantial amount
of time in online networks. Therefore SNs generate a high volume of User Generated Content
(UGC) of different varieties at a rapid velocity[40]. This UGC is a true reflection
of human behaviour in SNs; hence UGC in SNs is of high commercial value. But its
enormity and unstructured nature present multiple challenges, so storage,
access, analytics and high computational performance need to be
considered. As a result, “BD” technology has been combined with SNA to discover new
dimensions of knowledge.
Facebook (FB)¹, Twitter², LinkedIn³, Google+⁴, Tripadvisor⁵, Blogger⁶ and Instagram⁷
are the leading SNs with vast user engagement in today's context. UGC in
SNs takes the form of text, emoticons, images, ratings, likes etc. and addresses
many domains such as travel and tourism[33], defence and security[4], and healthcare
and medicine. The nature and characteristics of these SNs differ; however, there
are similarities that can be aggregated when addressing domains. UGC poses many business
opportunities. Discovering the knowledge that resides in UGC, while analysing
attributes that are unique to each domain, will create more opportunities for both the
private and government sectors. Another big wave in the coming decade is IoT. This
will create further UGC of a semi-structured and unstructured nature, and new varieties
of SNs. As a result “BD” will move towards the “Deep Data (DD)” concept while “Cloud
Computing (CC)” will move towards “Fog Computing (FC)”.
¹ https://www.facebook.com/
² https://twitter.com/
³ https://www.linkedin.com/
⁴ https://plus.google.com/
⁵ tripadvisor.com/
⁶ https://www.blogger.com/
⁷ http://instagram.com/
Deducing business intelligence by connecting the dots using Operational Intelligence
(OI), and comparing and applying the discovered knowledge in the modern and future
societal context using classification, sentiment analysis and other techniques from the BD
and DM paradigms, are blooming research topics. In these research areas, it is
essential to determine which BDSNA algorithms and techniques can accommodate
growth in size, scalability, quantification issues, pattern recognition issues and
real time analytics in SNA application areas. Data scientists and other
researchers are also seeking novel ways of redesigning the infrastructure to facilitate
BDSNA given the rapid growth in IoT.
1.3 History
There are various competing claims about the origin of SNA, but an experiment
carried out by Stanley Milgram in 1967 provides proper grounding for it. He came up with
the “six degrees of separation”[37] concept, stating that most people are connected
through six acquaintances. SixDegrees.com was the first widely recognized online social
network. The BDSNA research arena boomed with Web 2.0, which came to light in 1999 [46].
Low availability of internet facilities and a lack of Software (SW) tools to meet BD
requirements were major reasons for BDSNA staying out of sight in the early days.
1.4 Current status
Today millions of people are connected with social networks in many different
ways[28]. Social networks are in a neck and neck fight to keep their current users while
attracting new users. This leads to semi-structured and unstructured data being
generated at a rapid pace.
BDSNA is an aggressive and lucrative research area in modern computer science.
Public and private sector organizations have opened up their data repositories
for research purposes[13] and have encouraged data scientists to actively engage in
more research areas in BDSNA. Tech giants like Google⁸, Microsoft⁹, FB, Amazon¹⁰
and IBM¹¹ are investing in start-up companies that operate in BDSNA because of its
lucrative nature and growth potential. The demand for business intelligence tools
is surging[10]. High performance, low latency, parallel distributed processing, real
time processing, scalability and migration are factors that are continuously optimized in
such tools. Further, with IoT a new era has been born in which trees tweet about their
conditions[12][15].
1.5 Chapter summary
In its early days BDSNA was not popular, for various reasons, and it emerged
with Web 2.0 technology. SNs connect people with different views and opinions.
The UGC data repositories of SNs are huge and come in different varieties and
variations. BDSNA helps in analysing UGC in SNs and thereby discovering knowledge.
This knowledge also has high commercial value. Today, there are different
forms of online SNs that address different user groups (FB, LinkedIn, TripAdvisor
etc.). Classification, sentiment analysis, clustering, Real Time Analysis (RTA) and
various other BD and DM techniques are widely used in SNA. Today, advances in
technology have spread to SNA, and tech giants and data scientists are now looking
for novel approaches to accommodate the needs of SNA, such as storing, querying,
accessing and analysing UGC, with much improved technologies.
The next chapter gives a detailed illustration of the four major SNA domains that this
literature survey is mainly concerned with. Examples and use cases from survey reports,
articles and journals have been included in order to explore the importance of BDSNA to
the respective domains.
⁸ https://www.google.com/
⁹ http://www.microsoft.com/
¹⁰ http://www.amazon.com/
¹¹ http://www.research.ibm.com/
Chapter 2
Big Data Social Network Analysis
Domains
In this chapter, SNA domains in health care, defence and security, travel and tourism,
and Web 2.0 and IoT are discussed. Examples illustrate how BDSNA has been used to address
stakeholder intentions and expectations. Further, this chapter exposes the specializations
in each domain that have emerged as a result of BDSNA.
2.1 Health care
As shown in "Figure 2.1", there is a strong tendency to use the
internet as a source of health and wellness related information, and people are
likely to spend a lot of time in SNs in their day-to-day lives. Web 2.0 attracts users
of all age groups. Discussion, information diffusion and collaboration over SNs are growing
rapidly in the healthcare space. Recent research has found that healthcare professionals
are willing to use SNs as a means of addressing their patients and monitoring
their health conditions. Further, patients who have recovered are also interested
in sharing their success stories in SNs in the form of blogs, photos,
videos and articles. This information is publicly available to a wide variety of
people. As of now, we are in Health 2.0, “the use of social software and its ability
to promote collaboration between patients, their caregivers, medical professionals, and
other stakeholders in health”[17].
Figure 2.1: Sources Used to Find or Access Health and Wellness Related Information
in 2008, in USA
It is “Collective Wisdom” that acts as the driving force for people to increasingly use
SNs to find information relevant to their health matters. There are specifically developed
SNs like PatientsLikeMe¹, OrganizedWisdom², ICYou³, Google Health Groups,
Sermo⁴ and DailyStrength⁵ that bridge the knowledge and experience of patients and the
expertise of health care professionals[17]. The American Medical Association (AMA)
emphasizes the importance of professionalism for physicians, neurosurgeons and other
professionals when publishing content over SNs, in order to safeguard their career status
in the healthcare field[9]. Even though there are challenges in collecting data, data from
the healthcare sector in SNs is reported to be over 99.7% accurate[20].
¹ http://www.patientslikeme.com/
² http://www.organizedwisdom.com/Home
³ http://icyouhealth.tumblr.com/
⁴ https://www.sermo.com/
⁵ http://www.dailystrength.org/
Research on the social behaviour of cancer patients on FB, conducted by the
University of Texas M.D. Anderson Cancer Center, has enabled the center to provide better
service to its patients. The UGC consisted of posters and text. This technique
is called “Telemedicine”. Like Health 2.0, Medicine 2.0 is another concept
that has evolved with high user participation over SNs to communicate and collaborate on
health care. The Twitter network is also widely popular among patients and healthcare
professionals as a medium of communication[20]. However, patients must be willing
to communicate openly over SNs; otherwise regulations will treat it as
a violation of patients’ rights.
Videos, articles, comments, chats, images and other forms of healthcare-related UGC
available on SNs represent a gold mine of opportunities[20][17]. Sophisticated
applications have been developed integrating both DM and BD techniques. TrialX⁶ is
one such application that patients can use. Once a patient tweets, TrialX sends a
response tailored to the patient's past health history[20][32].
Genetic engineering, drug research, disease research and public health domains utilize
UGC on SNs to discover knowledge and thereby develop models to improve people's health
conditions. Twitter hashtags are quite useful when determining disease- or drug-related
effects[8]. An automated filtering system developed by the US Food and
Drug Administration found that 98% of tweets are bogus; however, the genuine
information is of great value[23].
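As a rough illustration of how hashtags can narrow UGC down to drug-related effects, the following Python sketch counts which effect terms co-occur with a given drug hashtag. The hashtag `#drugx`, the sample tweets and the term list are hypothetical placeholders, not data from the cited studies, and a real system would need far more robust NLP than exact word matching.

```python
import re
from collections import Counter

HASHTAG = re.compile(r"#(\w+)")
WORD = re.compile(r"\w+")


def cooccurring_terms(tweets, drug_tag, effect_terms):
    """Count how many tweets tagged with drug_tag mention each effect term."""
    counts = Counter()
    wanted_tag = drug_tag.lower()
    for text in tweets:
        lowered = text.lower()
        tags = set(HASHTAG.findall(lowered))
        if wanted_tag not in tags:
            continue  # tweet is not about the drug of interest
        words = set(WORD.findall(lowered))
        for term in effect_terms:
            if term in words:
                counts[term] += 1
    return counts


# Hypothetical sample data, for illustration only.
sample_tweets = [
    "Day 3 on #DrugX and the headache is back",
    "#drugx gave me nausea and a headache",
    "Win a free phone! #giveaway",
    "Feeling fine today #running",
]
sample_counts = cooccurring_terms(
    sample_tweets, "drugx", ["headache", "nausea", "rash"]
)
print(sample_counts.most_common())
```

Only tweets carrying the drug hashtag contribute to the counts, so unrelated or promotional posts are ignored before any effect term is considered.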
Information extraction is critical. Documentation migrated from paper to digital form
and SN data are huge repositories. An automated surveillance system would
be very effective in extracting information, analysing data and recognizing patterns.
One such system, implemented at the University of Alabama, has proven results. It has
been successful in identifying high-risk patients, short-term health issues and adverse
effects from drugs. The use of big data has made it possible to deliver tailored
prescriptions for patients[30].
A significant number of BD applications exist in the healthcare domain today. A SW tool
similar to Asthmapolis⁷ would be meaningful to implement on top of the SN
data repository. Mobility is expected of big data tools; hence tools enabled for mobile
platforms will have a lot of growth potential. Ginger.io⁸ and mHealthCoach⁹ are
the leading tools[18] at present, but these two have been unable to incorporate the SN
domain into their applications, and the need for such tools remains.
⁶ http://trialx.com/enablers/
⁷ http://propellerhealth.com/
2.1.1 Challenges and Future
UGC that appears on anonymous blogs and in spam comments is an unreliable source.
Efficient NLP and Text Mining (TM) techniques need to be utilized when
developing BD tools and applications. Strong rules and regulations exist in the healthcare
domain, and these are a barrier to obtaining useful information from SNs. Mere sentiments
are not enough to develop solid algorithms and models; patient information and other
related information would add much value to research. "Privacy" concerns are another
barrier. People might not want others to use what they share on SNs.
Web 2.0 will evolve into Web 3.0, and eventually Health 2.0 and Medicine 2.0
will evolve into Health 3.0 and Medicine 3.0[17]. With the rise of IoT, BD wearables
will take priority in healthcare[7]. The concept of BD wearables connected to SNs will
redefine human interaction with healthcare matters.
2.2 Defence and Security
After the 9/11 massacre in the United States (US), the National Security Agency
(NSA) invested a huge amount of resources in countering terror networks. “Networks
and Netwars” by John Arquilla and David Ronfeldt, prior to the 9/11 massacre,
highlighted the behavioural patterns of criminal networks. Modern war network
structures are leaderless and extremely quick; hence novel approaches are needed to
counter terror threats. Valdis Krebs mapped the Al-Qaeda network responsible for 9/11
[37]. More and more importance was given to SNA in tracing terror networks, and today
SNA plays a key role in dismantling terror networks[11]. Technosocial Predictive
Analytics (TPA) methods for web DM and social web tools are needed to capture and query
UGC in SNs[22].
⁸ https://ginger.io/
⁹ http://www.mhealthcoach.com/
National security is the main concern. The defence domain differs from other SN domains
in many ways, since key players are not openly active. Weakly tied parties
are somewhat open in SNs, but even they hardly communicate. SNA in defence
requires two major parties: data collectors and data modellers. Data collectors have a
cumbersome time gathering data for the above reason. The University of Arizona
Artificial Intelligence Center¹⁰ offers a large data repository of terror-related newspaper
articles, web pages and social network data. Clustering techniques have been used
to segregate possible terror networks, and researchers have managed to pictorially represent
diffused networks linked by weak ties (Figure 2.2) in the network of 9/11 attackers[37].
Figure 2.2: 9/11 attackers having weak ties with others
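The idea of separating tightly knit groups from the weak ties that link them can be sketched as follows. This is a minimal Python illustration, not the method used in the cited work: it assumes each tie carries a numeric strength (for example, a count of observed interactions), groups nodes through strong ties using a union-find structure, and then reports the weak ties that bridge different groups. The node names, strengths and threshold are hypothetical.

```python
from collections import defaultdict


def clusters_and_weak_ties(edges, strong_threshold):
    """Cluster nodes via strong ties, then list weak ties between clusters.

    edges: iterable of (node_a, node_b, strength) tuples (assumed format).
    Returns (clusters, bridges), both sorted for stable output.
    """
    edges = list(edges)
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for a, b, strength in edges:
        find(a)
        find(b)  # register both endpoints even if the tie is weak
        if strength >= strong_threshold:
            union(a, b)

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    clusters = sorted(sorted(members) for members in groups.values())

    bridges = sorted(
        (a, b)
        for a, b, strength in edges
        if strength < strong_threshold and find(a) != find(b)
    )
    return clusters, bridges


# Hypothetical ties: (person, person, number of observed contacts).
ties = [
    ("A", "B", 5), ("B", "C", 4), ("D", "E", 6),
    ("C", "D", 1),  # weak tie linking the two groups
    ("A", "C", 1),  # weak tie inside an existing group
]
clusters, bridges = clusters_and_weak_ties(ties, strong_threshold=3)
print(clusters, bridges)
```

A weak tie inside an existing group is not reported as a bridge; only weak ties whose endpoints fall in different clusters are, which matches the picture of separate cells held together by sparse contacts.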
2.2.1 Identifying key players in network
The two main focuses of analysing SNs in the defence domain are to identify the structure
of possible networks and to recognize key players. With the 9/11 attack, the structure
became decentralized (although both centralized and decentralized networks still exist).
Identifying the key players helps in taking control of the entire network. Though
it sounds easy, factors such as incompleteness, fuzzy boundaries and network dynamics make
it a tough task. In a decentralized network, players exist to handle financial assistance
and other supplies, while the leader plays a silent role in managing[4] (Figure
2.3). BDSNA is used to identify the financial manager and thereby recognize key roles.
Twitter BD analytic techniques are the most likely to be used in recognizing key players[31].
¹⁰ http://ai.arizona.edu/research/terror
Figure 2.3: Decentralized terrorist network
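One common way to rank candidates for key players is centrality. The sketch below, in Python, computes degree and betweenness centrality (using Brandes' algorithm for unweighted graphs) on a small hypothetical network shaped like the decentralized structure described above. The node labels are invented for illustration, and the choice of these two measures is an assumption, not something prescribed by the cited sources.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree_centrality(graph):
    """Number of direct ties per node."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


def betweenness_centrality(graph):
    """Brandes' algorithm for an unweighted, undirected graph."""
    score = {v: 0.0 for v in graph}
    for source in graph:
        stack = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}   # number of shortest paths from source
        dist = {v: -1 for v in graph}
        sigma[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:                    # breadth-first search
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        while stack:                    # accumulate dependencies backwards
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: value / 2 for v, value in score.items()}


# Hypothetical decentralized network: two cells linked through a financier.
network = build_graph([
    ("leader", "financier"),
    ("financier", "cell1_a"), ("cell1_a", "cell1_b"),
    ("financier", "cell2_a"), ("cell2_a", "cell2_b"),
])
degree = degree_centrality(network)
betweenness = betweenness_centrality(network)
key_player = max(betweenness, key=betweenness.get)
print(key_player, betweenness[key_player])
```

In this toy network the node that links the cells scores highest on betweenness, while the node with a single connection scores zero, which is consistent with the observation that a silent leader may not stand out on simple connectivity measures.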
As shown in "Figure 2.4", the PISTA architecture is quite useful in filling major loopholes
in the national security domain. But at the moment, this architecture has few
applications that integrate SN UGC. It is highly recommended to invest in extending
the functionality of the PISTA architecture to support BDSNA in security domains, since
most SNs have video sharing and geo-location features[42].
Figure 2.4: PISTA ontology
2.2.2 Use cases from recent history
In recent history there have been several major incidents around
the globe involving Web 2.0 initiatives. This section highlights some of these incidents
and the BDSNA technologies used in those situations.
• The 2008 Egyptian Revolution started with the creation of an FB group. It revealed the
importance of paying attention to SNA[14].
• The 2009 reinstatement of the Chief Justice of Pakistan was brought about purely by the
influence of SNs. The Government banned private media, yet people raised social awareness
through SNs, so the Government had to reinstate the Chief Justice in his position [14].
• ISIS is a technologically sophisticated terror group that actively engages in SNs.
ISIS uses strong encryption techniques when communicating over SNs. Due to
this barrier, BDSNA approaches like NLP, graph databases (to determine hierarchies
and identities) and cognitive computing platforms cannot be used
on their own as they are. Project Minerva by the US Department of Defense utilizes
high-end algorithms to detect terror activities in data pulled from Twitter[47].
• FIFA World Cup 2014 can be considered an event that used BDSNA to
keep the peace around grounds and nearby cities. Brazilian security forces used real
time Twitter feeds, FB feeds and other SN UGC and analysed its semantics to
determine where to send troops to control riots. Security agencies used a powerful
BD solution, Oracle Complex Event Processor¹¹, to do real time querying on SN
feeds.
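Real time querying of SN feeds, as in the last example, typically relies on windowed aggregation. The Python sketch below is a generic sliding-window counter, not the API of Oracle Complex Event Processor: it assumes that posts have already been flagged as riot-related and geotagged upstream, that they arrive in timestamp order, and that the window length and alert threshold are chosen by the operator. All names are hypothetical.

```python
from collections import Counter, deque


class HotspotMonitor:
    """Track flagged posts per location over a sliding time window."""

    def __init__(self, window_seconds, alert_threshold):
        self.window = window_seconds
        self.threshold = alert_threshold
        self.events = deque()     # (timestamp, location) in arrival order
        self.counts = Counter()   # live count per location inside the window

    def add(self, timestamp, location):
        """Record one flagged post; return locations at or above the threshold."""
        self.events.append((timestamp, location))
        self.counts[location] += 1
        # Evict posts that have fallen out of the window.
        while self.events and self.events[0][0] <= timestamp - self.window:
            _, old_location = self.events.popleft()
            self.counts[old_location] -= 1
            if self.counts[old_location] == 0:
                del self.counts[old_location]
        return sorted(
            loc for loc, n in self.counts.items() if n >= self.threshold
        )


# Hypothetical stream: (seconds since start, geotag of a flagged post).
monitor = HotspotMonitor(window_seconds=60, alert_threshold=3)
stream = [(0, "stadium"), (10, "stadium"), (20, "stadium"),
          (30, "centre"), (75, "centre")]
alerts = [monitor.add(ts, loc) for ts, loc in stream]
print(alerts)
```

Keeping only the events inside the window bounds memory use by the post rate rather than by the length of the feed, which is the main reason this pattern suits continuous streams.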
2.2.3 Challenges and Future
BD analytics in the defence sector provides meaningful insight to Governments. The
director of the Defense Advanced Research Projects Agency (DARPA)¹² in the US emphasizes
the importance of algorithm optimization in discovering useful intelligence. “e-harassment”,
“cyberbullying” and “hacking” are major areas of investigation. The adoption
of SN data is still at an early stage, but considering recent history it is apparent that
SN data must be taken into consideration when discussing the security
domain. The big argument against BDSNA in defence is violation of privacy. People
share their thoughts on FB, Twitter and other SNs because they have a right to do so, not
so that the content can be used for other purposes. The recent whistleblowing incidents
involving Julian Assange and Wikileaks, and PRISM and Edward Snowden, are examples. It is
apparent that Governments try to hide this information from public view[11]. To obtain
successful results there should be a balance between Government policy towards SNs and
users' attitudes.
¹¹ http://www.oracle.com/technetwork/middleware/complex-event-processing/documentation/index.html
¹² http://www.darpa.mil/default.aspx
2.3 Travel and Tourism
Tourism has always been a networked industry. Web 2.0 redefined tourism and all
related industries. This phenomenon is Tourism 2.0[26]. In tourist networks, two
major types of stakeholder can be identified: tourists and service providers (travel
agents, accommodation providers, restaurants etc.)[41]. Different views have been
taken of BDSNA in the domain of tourism. Two broad views are, first, using SNs as a
tool in determining tourist destinations[33][26] and, second, processing SNs to discover
interesting patterns and applying the derived knowledge to tourism[34][16].
2.3.1 Web 2.0 forms Tourism 2.0
SNs are powerful tools that draw on the Technology Acceptance Model (TAM) and
e-word-of-mouth (eWOM). TAM describes users’ willingness to adopt technologies,
while eWOM is content sharing on SNs in the form of text, images, videos etc. TAM
and eWOM provide the primary source of information for cybertravellers. Cybertravellers'
behaviour depends on what other people say about destinations (Figure 2.5). This
approach highlights the need for a new framework to address destination governance.
Service providers need to adapt their networks by embedding
SN features that support searching, visualization and interactivity, which would trigger
a positive attitude towards travelling. Travel 2.0 SNs (TripAdvisor, WAYN¹³, Tripwolf¹⁴,
Travelblog¹⁵, Trivago¹⁶) offer SN features that address cybertraveller expectations. Here
the focus is more on leisure travellers than on business travellers[33][26][35][36].
¹³ http://www.wayn.com/
¹⁴ http://www.tripwolf.com/
¹⁵ https://www.travelblog.org/
¹⁶ http://www.trivago.com/
Figure 2.5: Most consulted SNs in cybertravelling
2.3.2 Tourism 2.0 Destination Management
Tourist attitudes, behaviour and psychology have a huge impact when determining
destinations to explore. Different market segments have different demands. eMarketers
use tailored strategies to attract potential tourists. Destination Management
Organizations (DMO) utilize DM and BDSNA techniques (clustering, Artificial
Neural Network (ANN), Decision Tree (DT)) to determine customer intentions from the
mixture of facts and opinions in UGC on SNs[43].
Travel 2.0 benefits from BDSNA in demand/sales forecasting, inventory management,
multichannel marketing campaign organization etc. SNA methods are
important for removing noise and discovering meaningful knowledge from SNs to bring
meaningful insight[1]. RapidMiner¹⁷ analyses traveller patterns and renders dynamic,
personalized suggestions based on past results as well as results from other linked
networks (predicting air ticket prices, hotel charges etc.)[44].
At the time of decision making, a traveller is in a state of switching from one option to
another depending on reviews. FB pages provide great insights about destinations and hotels.
However, research has shown that travellers are very likely to tweet or post on FB if
they have had a bad experience with a service provider. Twitter users are
more likely to retweet negative reviews than positive reviews. This highlights the
importance of monitoring UGC on the SN pages of service providers. Once travellers
have selected their preferred hotel or travel service, they are very likely to visit the
brand website of the hotel or travel agency. It is vital to integrate TAM features that let
customers explore more about the services offered, in order to win customer loyalty[36].
These strategies ultimately help DMOs to boost their revenue and gain a competitive
advantage over peers that ignore BDSNA.
Figure 2.6: Traveller recommendation system
¹⁷ https://rapidminer.com/
2.3.3 Challenges and Future
This section describes the prevailing restrictions on BDSNA in the tourism and travel
domain and how the future might look.
• Currently, most DBs are relational. The tourism and travel sector needs new
infrastructure tools to get the maximum out of BDSNA.
• User opinions are subjective. Algorithms should support viewing the generalized
opinion of travellers and should not be affected by outliers.
• Content shared on FB, Twitter and other SNs has a direct influence on
DMOs, travel agents and hotels, so a strong monitoring mechanism
needs to be incorporated into Travel 2.0 websites.
• Airline service providers can benefit from real time data analytics on flight
delays; UGC from SNs and sensor data (weather patterns etc.) serve greatly
when optimizing operations.
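One simple way to keep a generalized opinion from being skewed by outliers is to use a robust statistic instead of the plain average. The Python sketch below compares the mean with a trimmed mean and the median; the ratings and the trim fraction are hypothetical, and the choice of statistic is an assumption rather than a recommendation from the cited work.

```python
from statistics import mean, median


def trimmed_mean(ratings, trim_fraction=0.1):
    """Average the ratings after dropping the lowest and highest share.

    trim_fraction is the share removed from EACH end (assumed default 10%).
    """
    if not 0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in [0, 0.5)")
    ordered = sorted(ratings)
    k = int(len(ordered) * trim_fraction)
    kept = ordered[k:len(ordered) - k] if k else ordered
    return mean(kept)


# Hypothetical hotel ratings on a 1-5 scale with a single very low outlier.
ratings = [4, 4, 5, 4, 5, 4, 4, 5, 4, 1]
plain = mean(ratings)
robust = trimmed_mean(ratings, trim_fraction=0.1)
middle = median(ratings)
print(plain, robust, middle)
```

Trimming discards the same share of values from both ends, so a single unusually harsh or unusually generous review no longer shifts the summary as much as it shifts the plain mean.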
Figure 2.7: TAM to gain loyalty
2.4 Web 2.0 and IoT
In an era where Web 2.0 evolves into Web 3.0 (Ubiquitous Computing), hardware-embedded
software takes the lead in the daily routines of mankind and will have a huge
influence on current SN practices as well. Today it is mostly humans who are connected to
SNs. With advances in IoT, Cognitive Objects (CO), or smart objects, are capable of
sharing UGC over SNs. Tweeting trees (Figure 2.8) and tweeting washing machines send
real time content to humans[24][27]. Two broad types of SNs exist with IoT: human-to-CO
SNs and CO-to-CO SNs (SIoT)[19].
Figure 2.8: Tweeting trees
Developers integrate SN capability into every smart device because SNs play an
important role in personal life. Google Glass¹⁸, Samsung Galaxy Gear watch¹⁹, Apple
iWatch²⁰ and many other wearable technologies have integrated SN capability. Lewis
Robinson, in his article for SocialMediaToday²¹, stated that “iWatch will check in for
you via Facebook when you arrive at an event. Your oven will take a photo of the
cake you just baked and post it directly to Instagram”[38]. It is evident that automated,
interconnected smart devices can act without human intervention.
There will be more data than ever before. BDSNA will be able to provide more
personalized information to all stakeholder groups, and advanced Business Intelligence
(BI) can be derived using sophisticated analytical approaches. The concept of “Smart
Cities” is an example of advanced analytics applied to UGC from IoT devices
that are connected to SNs and other CO. Waze²² is one such real time traffic application
that connects mobile devices with other CO (traffic lights, street signs etc.).
2.4.1 Challenges and Future
“Privacy” is again a major concern in this arena. Since devices are capable of
generating and sharing content on SNs automatically, this could violate the
privacy of individuals. However, Lawrence Ampofo, in his recent article for
Business2Community²³, emphasizes that the “conception of privacy become more
sophisticated”, where people are more likely to openly communicate their
personal life through social networks, and “data to be more liberated from wall
gardens making available to all platforms”[2].
It is predicted that by the end of 2020 the number of IoT devices will rise
above 50 billion[38]. The potential for new kinds of SNs is massive. The amount
of unstructured data generated by IoT devices will be so huge that even current
BD technologies cannot accommodate its size, growth and scalability demands. The
concepts of “Deep Data” and “Fog Computing” need to be utilized effectively to
meet the infrastructure requirements.
¹⁸ https://www.google.com/glass/start/
¹⁹ http://www.samsung.com/uk/consumer/mobile-devices/wearables/gear/
²⁰ https://www.apple.com/watch/
²¹ http://www.socialmediatoday.com/
²² https://www.waze.com/
²³ http://www.business2community.com/
2.5 Chapter summary
Health 2.0 and Medicine 2.0 approaches evolved as a result of Web 2.0, once the
massive potential of SNs for the health care industry was recognized. SNs are
the fastest method of communication between patients and health care
professionals such as nurses, doctors and specialists. PatientsLikeMe, Google
health groups and various other SNs that focus specifically on health care share
the knowledge and experiences of all parties related to health care.
Sophisticated SW tools such as TrialX utilize BDSNA methods to send tailored
responses to patients and doctors by analysing related party data reflected on
SNs. Specialized research areas such as drug research and disease research make
heavy use of BDSNA approaches like TM, sentiment analysis, clustering and RTA.
The Defence and Security domain is very different from other domains in BDSNA.
Finding a reliable data repository is a major challenge because terror groups
hardly reveal any data, although the recent ISIS scenario is totally different.
Today, government agencies and authorities use BDSNA to establish security in
their territory. RTA plays an important role in analysing the UGC of SNs. Highly
sophisticated models and predictive algorithms have been developed using BDSNA
mechanisms.
With Web 2.0, Travel 2.0 evolved. UGC in the form of text, video and images is
a useful resource for discovering traveller psychology and behaviour. Business
models like TAM were developed as a result of BDSNA. Hotel owners and travel
agencies are using BDSNA approaches to address their customers' requirements.
RTA and recommendations are heavily used in Travel 2.0.
IoT has paved the way for living things like trees and non-living objects such
as washing machines to share their status over SNs. As a result of smart devices
being part of SNs, the amount of unstructured and semi-structured data that is
generated is enormous. This pushes data scientists to explore new technologies
like FC and DD and to integrate them into BDA.
The third chapter focuses on core BDA technologies in SNA. Network
visualization, data storage, processing and access, recommendation systems and
RTA, as discussed in the SN domains above, are illustrated technically and
theoretically.
Chapter 3
BDSNA Tools and Technologies
In this Web 2.0 era, data generation is exploding exponentially, and data
scientists and IT professionals are highly ambitious about turning BI into an
asset in their business domain. This chapter describes the key concerns, core
technologies and tools in BDSNA.
3.1 Major Concerns in BDSNA
This section summarizes the issues identified in the previous chapter.
• Security and Privacy: Most UGC on SNs reflects people’s personal life moments.
All scenarios considered in the last chapter highlight security and privacy
as a major concern[11][2].
• Explosive growth rate: The growth of the Internet and IoT will generate more
UGC. Infrastructure should be able to capture, store, process and analyse
new sources of semi-structured and unstructured data from all SNs. FB
uses Apache Hadoop¹ and Apache Hive² for storage because their hardware
scalability is high, and Scribe³ for log collection[45].
• Extract valid UGC by removing noise: TM, NLP and other DM techniques need
to be optimized to establish the validity of data[2].
• Real time analytics: The need for Stream Processing (SP) is growing rapidly.
User data gathered over a period goes through a Batch Processing (BP) mechanism
to develop models, which are then used to check and analyse incoming events in
real time.
• Sophisticated analytics tools and SW: Low latency and richer visualization are
expected from BDSNA tools and SW. The Lincoln Laboratory is currently
engaged in research projects to develop sophisticated algorithms and software
tools to generate networks from unstructured/semi-structured data[10].
¹ http://hadoop.apache.org/
² https://hive.apache.org/
³ https://github.com/facebookarchive/scribe
A wide range of tools and SW is available in the marketplace to represent
different user groups in SNs. When selecting the right tool, factors such as
intended goal, ease of use, operating platform and cost effectiveness need to be
taken into consideration. Of all these, “visualization” capability is vital.
Strengths of network ties, user group structures and dynamics can be viewed
using these tools[29].
Tool / SW     Description
Gephi⁴        Platform independent SW distributed under an open source
              licence. A good tool for visualizing networks and their
              relationships.
NetLogo⁵      Free, platform independent SW. Helps in visualizing the
              dynamics of network formation; network behaviour can be
              studied using this tool.
iGraph⁶       Free SW that can be used to perform heavy calculations.
Pajek⁷        Free SW that runs only on the Windows platform. Covers network
              formation, dynamics, information diffusion and many other
              inbuilt features.
UCINet⁸       Commercial SW that supports only the Windows platform.
NodeXL⁹       Fairly new to the market. Integrates SNA with Excel. Free SW,
              for the moment available only for the Windows platform.
NetworkX¹⁰    Python library; a good tool from a programming perspective.
              Can interface with code written in C and Fortran. Optimized to
              scale to large matrices.
⁴ http://gephi.github.io
⁵ https://ccl.northwestern.edu/netlogo/
⁶ http://igraph.org/
⁷ http://pajek.imfm.si/doku.php
⁸ https://sites.google.com/site/ucinetsoftware/home
⁹ http://nodexl.codeplex.com/
Neuro Productions' 5K Twitter Browser and Neoformix's Twitter StreamGraph are
advanced visualization tools that can be used to analyse UGC from Twitter[24].
3.2 Real Time Analysis
FB, Twitter, LinkedIn, Google+, TripAdvisor and all leading SNs provide real
time visibility into what their users prefer. The Intel BD Research Center
forecast that the use cases for Real Time Big Data Analytics (RTBDA) will spread
more widely in SNA than those for BP, yet BP will still act as the core of
RTBDA. Real time analytics is based on SP. Operational Intelligence (OI)[39] and
Lambda Architecture (LA) are the core BDA technologies that SNs mainly use for
RTBDA.
RTA explanation
RTBDA is an advanced technique for making better decisions and taking meaningful
actions at precisely the right time. Two aspects are central to RTA: real time
actions are treated as “streams of events”, and each event is matched to an
action by a predefined model. To determine the action to be performed when an
event arrives, the system needs to capture, process and analyse the parameters
and attributes of the incoming event stream and determine the corresponding
stream category or group for the application domain. The categorized stream is
then matched with an action determined by the predefined model. It is important
to develop this “model” in the first phase of RTA. Furthermore, RTA engines are
stateless: they do not need to retain previous incoming streams in order to
determine the action for the current stream[25].
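The flow described above (event, category, action) can be sketched as follows. This is a minimal illustration under stated assumptions, not an actual RTA engine: the `Event` type, the keyword-based model in `CATEGORY_KEYWORDS` and the action names in `ACTIONS` are all hypothetical. The handler is stateless in the sense used above, because each decision depends only on the current event and the predefined model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    user: str
    text: str


# Hypothetical model, built offline in the first phase: keyword -> category.
CATEGORY_KEYWORDS = {
    "delay": "complaint",
    "cancelled": "complaint",
    "thanks": "praise",
}

# Hypothetical mapping from stream category to the action to trigger.
ACTIONS = {
    "complaint": "notify_support",
    "praise": "log_feedback",
    "other": "ignore",
}


def categorize(event):
    """Assign the event to a category using only its own attributes."""
    words = event.text.lower().split()
    for keyword, category in CATEGORY_KEYWORDS.items():
        if keyword in words:
            return category
    return "other"


def handle(event):
    """Stateless: the action depends on the current event and the model only."""
    return ACTIONS[categorize(event)]


stream = [
    Event("a", "Flight delay again"),
    Event("b", "Thanks for the upgrade"),
    Event("c", "Boarding now"),
]
actions = [handle(e) for e in stream]
```

Because no state is carried between events, handlers like this can in principle be replicated across machines without coordination.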
¹⁰ http://networkx.lanl.gov/
Figure 3.1: Expected growth in real time analytics by 2015
Figure 3.2: Capabilities of Operational Intelligence
FB, Twitter and other SNs use data records collected over a long period of
time. The model is developed by considering the nature of the application domain
(e.g. tourism, healthcare), not the individual records that reside in data
repositories. OI and LA are the core technical approaches to designing and
developing RTA engines.
3.3 Lambda Architecture
LA, developed by Nathan Marz, achieves real time processing by decomposing the
processing into three layers: batch layer, serving layer and speed layer.
Everything starts from the equation query = function(all data)[5]. Evaluating
this function over all data for every event on the fly is computationally too
expensive. Instead, a batch view holds a precomputed result of the query
function, so a query is answered by looking up the view rather than by
recomputing it. The precomputed view is indexed so that it can be accessed
quickly with a few random reads.
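The difference between evaluating query = function(all data) on the fly and reading a precomputed batch view can be sketched as follows. This is an illustrative toy under stated assumptions: the `ALL_DATA` records and the hashtag-count query are hypothetical, and an in-memory `Counter` stands in for an indexed view that would live in a distributed store in practice.

```python
from collections import Counter

# Hypothetical master data set: one (user, hashtag) record per post.
ALL_DATA = [
    ("ann", "#colombo"),
    ("bob", "#kandy"),
    ("ann", "#kandy"),
    ("cat", "#kandy"),
]


def query_on_the_fly(hashtag, all_data):
    """query = function(all data): scans every record for each query."""
    return sum(1 for _, tag in all_data if tag == hashtag)


def build_batch_view(all_data):
    """Precompute the answer for every hashtag once, indexed by hashtag."""
    return Counter(tag for _, tag in all_data)


def query_batch_view(hashtag, batch_view):
    """Answer by a single indexed lookup instead of a full scan."""
    return batch_view.get(hashtag, 0)


batch_view = build_batch_view(ALL_DATA)
```

Both query functions return the same count; the batch view simply moves the cost of the full scan out of the query path and into a periodic precomputation.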
Figure 3.3: Overview of Lambda Architecture
3.3.1 Batch layer
The batch layer acts as the master: it holds the master data set (HDFS) and
computes arbitrary batch views over it (MapReduce)[25]. The master data set can
hold either historical data alone or historical data together with current data
(depending on the business domain and key stakeholder interest). Apache Hadoop
is used to process the master data set and develop the required model.
Simplest pseudocode for the batch layer[25]:
function runBatchLayer():
    while (true):  // repeatedly recompute batch views from the beginning
        recomputeBatchViews()
3.3.2 Serving layer
Real time querying is supported by the serving layer. The real time stream is
ingested into the analytic engine, processed inside the engine, and the
corresponding action is then triggered. Apache Drill¹¹ and Cloudera Impala¹² are
interactive query engines that are used to implement serving layer
functions[25].
3.3.3 Speed Layer
BP has substantial latency, and the speed layer compensates for its impact via
distributed SP. Apache Storm¹³ and Apache S4¹⁴ are used to implement this
layer[25].
3.4 Recommendation systems
FB, Twitter, LinkedIn, Google+ and all leading SNs use recommendation systems.
These systems apply knowledge discovery techniques to the problem of making
personalized recommendations during a live interaction[21].
Example: consider a scenario where you add a friend on FB and FB automatically
gives similar recommendations (a generalized recommendation system).
The recommendation engine analyses the people who added the same person that you
added, people(1), and from them determines the other people, people(2), who were
added by people(1). The system then offers people(2) as recommended people to
add, expanding your network.
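The people(1) and people(2) steps above can be sketched as follows. This is a minimal illustration, assuming a hypothetical `ADDED` mapping of who added whom; it is not FB's actual algorithm. Ranking candidates by how many members of people(1) added them is an added assumption for illustration.

```python
from collections import Counter

# Hypothetical "who added whom" data: user -> set of people they added.
ADDED = {
    "you": {"dana"},
    "eli": {"dana", "fay", "gus"},
    "hal": {"dana", "fay"},
    "ivy": {"jon"},
}


def recommend(user, added):
    """Recommend people added by users who share at least one add with `user`."""
    mine = added.get(user, set())
    scores = Counter()
    for other, theirs in added.items():
        if other == user or not (mine & theirs):
            continue  # `other` is not in people(1)
        for candidate in theirs - mine - {user}:
            scores[candidate] += 1  # candidate is in people(2)
    # Most frequently co-added candidates first; ties broken alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked]


recommendations = recommend("you", ADDED)
```

Here `eli` and `hal` form people(1) because they also added `dana`, and the people they added that you have not form people(2).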
¹¹ https://github.com/apache/drill
¹² http://www.cloudera.com/content/cloudera/en/products-and-services/cdh/impala.html
¹³ https://storm.apache.org/
¹⁴ http://incubator.apache.org/s4/
SN recommendations are determined by the number of Likes, clicks, user ratings
and emoticons. The algorithms fall mainly into two categories: content-based
algorithms and collaborative filtering algorithms. Content-based algorithms
check the similarity of the target (recommended) item to items the user already
prefers. Collaborative filtering uses previous similar recommendations based on
clicks, ratings etc. Additionally, a time window technique is adopted to give
recommendations according to time durations (Google+ and Twitter trends etc.).
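The two categories can be contrasted with a small sketch. It is illustrative only: the `RATINGS` and `ITEM_TAGS` data are hypothetical, and cosine similarity between users and Jaccard overlap between item tags are common choices assumed here rather than methods named by the surveyed work.

```python
from math import sqrt

# Hypothetical ratings: user -> {item: rating}.
RATINGS = {
    "ann": {"hotel_a": 5, "hotel_b": 4},
    "bob": {"hotel_a": 5, "hotel_b": 4, "hotel_c": 5},
    "cat": {"hotel_a": 1, "hotel_c": 2, "hotel_d": 5},
}

# Hypothetical item attributes used by the content-based variant.
ITEM_TAGS = {
    "hotel_a": {"beach", "spa"},
    "hotel_b": {"beach", "pool"},
    "hotel_c": {"beach", "spa", "pool"},
    "hotel_d": {"city"},
}


def cosine(u, v):
    """Cosine similarity between two sparse rating vectors."""
    dot = sum(u[i] * v[i] for i in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def collaborative_scores(user, ratings):
    """Score unseen items by similarity-weighted ratings of other users."""
    seen = ratings[user]
    scores = {}
    for other, theirs in ratings.items():
        if other == user:
            continue
        sim = cosine(seen, theirs)
        for item, rating in theirs.items():
            if item not in seen:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return scores


def content_scores(user, ratings, item_tags):
    """Score unseen items by best tag overlap (Jaccard) with rated items."""
    seen = ratings[user]
    scores = {}
    for item, tags in item_tags.items():
        if item in seen:
            continue
        scores[item] = max(
            len(tags & item_tags[s]) / len(tags | item_tags[s]) for s in seen
        )
    return scores


cf = collaborative_scores("ann", RATINGS)
cb = content_scores("ann", RATINGS, ITEM_TAGS)
```

Collaborative filtering needs no item attributes but depends on other users' behaviour, whereas the content-based variant can score a brand-new item as long as its attributes are known.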
3.5 Web 2.0 IoT Architecture
Distributed Wireless Sensor Networks (WSNs) to share data, the Robot Operating
System (ROS) as a middleware platform and Radio Frequency Identification (RFID)
as an identification technology provide the core architectural infrastructure
for CO to recognize activities and, at the same time, incorporate knowledge into
smart objects. The Pachube platform¹⁵ provides fundamental API groundings for
developers to develop SIoT.
Figure 3.4: Architecture for SIoT Client Side and Server Side
It is important to understand SIoT network characteristics and relationships
when designing and developing smart environments. Four main types of
relationships exist in SIoT networks[3]:
• parental object relationship: A family-like structure based on the assumption
that CO share similar characteristics with devices developed during the same
time period (the argument being that technology changes rapidly).
• co-location object relationship: Object relationships that need to be
established during the design and development of a smart environment, based on
location information.
• ownership object relationship: One person can be the owner of several CO. This
ownership information is vital when interacting with SNs of CO.
• social object relationship: Devices with similar characteristics can share
best practices to solve issues. The “Cloud-of-cloud” concept is a broader view
that shares the same idea. This idea can be related to edge computing IoT
devices.
¹⁵ http://datahub.io/dataset/pachube
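The four relationship types above can be represented as a small data structure. This is a sketch under loud assumptions: the `CognitiveObject` fields and the matching rules in `relate` (same build year, same location, same owner, same kind) are hypothetical simplifications chosen for illustration and are not the definitions used in [3].

```python
from dataclasses import dataclass, field
from enum import Enum


class Relationship(Enum):
    PARENTAL = "parental"        # developed during the same time period
    CO_LOCATION = "co-location"  # placed in the same location
    OWNERSHIP = "ownership"      # owned by the same person
    SOCIAL = "social"            # similar characteristics, share best practices


@dataclass
class CognitiveObject:
    name: str
    owner: str
    location: str
    build_year: int
    kind: str
    links: dict = field(default_factory=dict)  # other name -> set of Relationship


def relate(a, b):
    """Derive the relationships between two objects from their attributes."""
    found = set()
    if a.build_year == b.build_year:
        found.add(Relationship.PARENTAL)
    if a.location == b.location:
        found.add(Relationship.CO_LOCATION)
    if a.owner == b.owner:
        found.add(Relationship.OWNERSHIP)
    if a.kind == b.kind:
        found.add(Relationship.SOCIAL)
    if found:
        a.links[b.name] = set(found)
        b.links[a.name] = set(found)
    return found


washer = CognitiveObject("washer", "ann", "home", 2014, "appliance")
oven = CognitiveObject("oven", "ann", "home", 2012, "appliance")
tree_sensor = CognitiveObject("tree_sensor", "city", "park", 2014, "sensor")

washer_oven = relate(washer, oven)
washer_tree = relate(washer, tree_sensor)
```

Storing the links on each object gives every CO its own view of its social graph, which is one possible way to make ownership and location information available when objects interact.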
3.6 Chapter summary
Security and privacy need to be given a great deal of attention when designing
and developing BDA SW and tools, as well as when developing algorithms for the
SN domain. In SNA, “visualization” is an important aspect to look at when
designing SW that can analyse different user groups, and the Gephi tool
outperforms other network visualization tools.
Lambda Architecture, which uses BP and SP, is utilized for RTA in BDSNA. OI is
used to develop models that can be used in RTA. Recommendation systems differ
from domain to domain and use quite a number of user actions, such as clicks,
likes and ratings, in designing algorithms.
The next and final chapter summarizes the literature survey and gives future
directions for BDSNA domains, taking the current status to a new level.
Chapter 4
Conclusion, Challenges and Future
Directions
This chapter summarises the overall survey and provides insight into future
directions for SNs in applying the gathered knowledge in practice.
4.1 Conclusion
Today even tech-averse and less tech-savvy people have an understanding of SNs
(like FB), yet they are hardly aware of what search engines can do. Children,
youngsters, adults and even old people are making their presence felt in SNs.
People are eager to share their personal life stories and, on the other hand,
like to peep into other people’s affairs. Interestingly, not only humans but
also other living and non-living objects are becoming users of SNs. The highly
dynamic UGC on SNs reflects user perspectives and feedback. UGC is not
restricted to a particular domain; it spreads across a vast variety of fields,
and BDSNA helps in addressing a wider range of stakeholder groups with more
accurate BI.
This survey focused on four major domains (Healthcare, Defence and Security,
Travel and Tourism, Web 2.0 with IoT) in SNs. To derive useful knowledge and
recognize hidden patterns from user activities on SNs, it is important to
differentiate what is exciting and interesting among all activities. BDSNA is
the solution. BDSNA has taken these sectors to a new dimension, making it
worthwhile for all interested parties. Qualitative and quantitative results have
been obtained through BDSNA to give a better service to users of SNs. Business
strategies and models are being created to satisfy the demands of users.
Predictive models, recommendation systems and real time analytics play a major
role in today’s BDSNA.
Modern day BDSNA has been identified as one of the best approaches for many
business domains. BDSNA has become an essential part of developing highly
sophisticated intelligence tools and SW.
4.2 Challenges and Future Directions
SNs like Facebook are considering cloud storage as a solution to accommodate
growing data storage needs. As shown in Figure 4.1, the biggest challenge in
adopting cloud storage, identified by all organizations, is security and privacy
violations. Even though a private cloud can provide mechanisms to strengthen
security, cyber attackers are smart enough to identify loopholes and thereby
corrupt data on a cloud. It is evident that, as of yet, there is no 100%
guarantee that cloud technology can be used as a trusted service.
Figure 4.1: Hosting data on cloud and challenges
Most UGC on SNs is irrelevant to the domain under consideration. Incomplete
text, multilingual content and bogus user feedback are difficult to cater for
when doing genuine analytics. Deriving algorithms and strategies based on the
user group of a particular geography is not sufficient. Data scientists need to
give more attention to these factors when doing SNA. Also, TM and NLP are
currently best supported for microblogging content (Tweets are limited to a
maximum of 140 characters). These techniques need to improve to a level where
they can analyse much longer text content. A mechanism similar to YouTube real
time translation would be quite beneficial in the SN domain for spreading
awareness to a wide range of users.
SNs have a huge impact on human behaviour and intentions, and they have
challenged the conventional behavioural patterns of humans over recent years.
FB can be used to find a friend or relation, and LinkedIn is a place to find
professionals. It is apparent that SNs play the role of a “search engine”.
Integrating proper indexing methodologies would enhance the search function of
SNs and give their users more accurate results. Further, companies advertise
their products and services on SNs. In the near future, users will find it more
compelling and attractive to use SNs for their online shopping experiences. This
highlights a big business opportunity for SNs like FB but, on the other hand,
users may choose to stay away from SNs. The need for shopping pattern analytics
in SNs will also arise in the future. Just as we now have different types of SNs
for different purposes (FB and LinkedIn), there will be more categories of SNs
in future. IoT will be a driving factor in diversifying SNs.
Bibliography
[1] Rajendra Akerkar. Big Data & Tourism Big Data & Tourism To promote inno-
vation and increase. 2012.
[2] Lawrence Ampofo. 5 ways the internet of things
will change social media, October 2014. URL
http://www.business2community.com/social-media/5-ways-internet-things-will-cha
Accessed November,2014.
[3] Luigi Atzori, Antonio Iera, and Giacomo Morabito. SIoT: Giving a Social
Structure to the Internet of Things. 15(11):1193–1195, 2011.
[4] Ala Berzinji. Detecting Key Players in Terrorist Networks. 2011.
[5] Nathan Bijnens. A real-time Lambda Architecture using Hadoop & Storm
NoSQL Matters Cologne 2014 by Nathan Bijnens Speaker. 2014.
[6] Jaap Bloem, Sander Duivestein, and Thomas Van Manen. Big Social Predicting
behavior with Big Data.
[7] BloombergTV. Can wearables and big data cure disease?, August 2014. URL
http://www.bloomberg.com/video/parkinson-s-disease-new-ways-to-study-illness-V
Accessed November,2014.
[8] David Bollier and Charles M Firestone. The Promise and Peril of Big Data.
2010. ISBN 0898435161.
[9] Jeff Cain. Social media in health care: the case for organizational policy and
employee education. American journal of health-system pharmacy : AJHP
: official journal of the American Society of Health-System Pharmacists,
68(11):1036–40, June 2011. ISSN 1535-2900. doi: 10.2146/ajhp100589. URL
http://www.ncbi.nlm.nih.gov/pubmed/21593233.
[10] William M Campbell, Charlie K Dagli, and Clifford J Weinstein. with Content
and Graphs. 20(1), 2013.
[11] Neil Couch and Bill Robins. BIG DATA FOR DEFENCE AND SECURITY.
[12] Paul M. Davis. A tree that tweets, September 2010. URL
http://www.shareable.net/blog/a-tree-that-tweets. Accessed Octo-
ber,2014.
[13] Yoree Koh and Don Clark. IBM and Twitter forge partnership on data
analytics, 2014. URL
http://online.wsj.com/articles/ibm-and-twitter-forge-partnership-on-data-analy
Accessed October,2014.
[14] Mark Drapeau and Linton Wells Ii. Social Software and National Security : An
Initial Net Assessment. (April), 2009.
[15] Rob Faludi. New york times on botanicalls, again!, April 2013. URL
http://www.botanicalls.com/. Accessed October,2014.
[16] Roberta Floris and Michele Campagna. Social Media Data in Tourism Planning:
Analysing Tourists’ Satisfaction in Space and Time. 8(May):997–1003, 2014.
[17] California Healthcare Foundation. The Wisdom of Patients : Health Care Meets
Online Social Media. (April), 2008.
[18] Peter Groves and David Knott. The ‘ big data ’ revolution in healthcare.
(January), 2013.
[19] Dominique Guinard, Vlad Trifa, Friedemann Mattern, and Erik Wilde. From
the internet of things to the web of things: Resource-oriented architecture and
best practices. In Architecting the Internet of Things, pages 97–129. Springer,
2011.
[20] Carissa Hilliard. Social media for healthcare: A content analysis of md an-
derson’s facebook presence and its contribution to cancer support systems. of
Undergraduate Research in Communications, page 23.
[21] Jianming and Wesley W Chu. A Social Network-Based Recommender System
(SNRS).
[22] Maged N Kamel Boulos, Antonio P Sanfilippo, Courtney D Corley,
and Steve Wheeler. Social Web mining and exploitation for seri-
ous applications: Technosocial Predictive Analytics and related tech-
nologies for public health, environmental and national security surveil-
lance. Computer methods and programs in biomedicine, 100(1):16–23, Oc-
tober 2010. ISSN 1872-7565. doi: 10.1016/j.cmpb.2010.02.007. URL
http://www.ncbi.nlm.nih.gov/pubmed/20236725.
[23] Deborah Kotz. Using twitter as tool to track
side effects from drugs, April 2014. URL
http://www.bostonglobe.com/lifestyle/health-wellness/2014/04/30/using-twitter-
Accessed November,2014.
[24] Matthias Kranz, Luis Roalter, and Florian Michahelles. Things That Twitter :
Social Networks and the Internet of Things.
[25] Nathan Marz and James Warren. Big Data: Principles and best practices of
scalable realtime data systems.
[26] Roberta Milano. The effects of online social media on tourism websites. 2011.
[27] Mark Million. Washing machine twitters
when clothes are done, January 2009. URL
http://latimesblogs.latimes.com/technology/2009/01/twitter-washing.html.
Accessed November,2014.
[28] Alan Mislove, Hema Swetha Koppula, Krishna P Gummadi, Peter Druschel, and
Bobby Bhattacharjee. Growth of the flickr social network. In Proceedings of the
first workshop on Online social networks, pages 25–30. ACM, 2008.
[29] Chamin Nalinda. Social network analysis tools and softwares, October 2014. URL
http://techspiro.blogspot.com/2014/10/social-network-analysis-tools-softwares.
Accessed October,2014.
[30] Mary K Obenshain. Application of Data Mining Techniques to Healthcare Data.
(August):690–695, 2004.
[31] Onook Oh, Manish Agrawal, and H Raghav Rao. Information control and terror-
ism: Tracking the mumbai terrorist attack through twitter. Information Systems
Frontiers, 13(1):33–43, 2011.
[32] Chintan Patel. Now you can talk to twitter and
find clinical trials on trialx, December 2012. URL
http://trialx.com/enablers/2009/03/now-you-can-talk-to-twitter-and-find-clinic
Accessed November,2014.
[33] Loredana Di Pietro, Francesca Di Virgilio, and Eleonora Pantano. So-
cial network for the choice of tourist destination: attitude and be-
havioural intention. Journal of Hospitality and Tourism Technology, 3(1):
60–76, 2012. ISSN 1757-9880. doi: 10.1108/17579881211206543. URL
http://www.emeraldinsight.com/10.1108/17579881211206543.
[34] Angelo Presenza and Maria Cipollina. Analysis of links and features of tourism
destination’s stakeholders. an empirical investigation of a south italian region.
2009.
[35] Pslulfdo and Ehwzhhq. An Empirical Study on the Relationship between Twitter
Sentiment and influence in Tourism Domain. 2012.
[36] Cornell Hospitality Report, Laura Mccarthy, Debra Stock, Rohit Verma, D Ph,
Rod Clough, Gregg Gilman, Employment Practices, and Gilbert Llp. How Trav-
elers Use Online and Social Media Channels to Make Hotel-choice Decisions. 10
(18), 2010.
[37] Steve Ressler. Social network analysis as an approach to combat terrorism:
past, present, and future research. Homeland Security Affairs, 2006. URL
http://www.hsaj.org/?download&mode=dl&h&w&drm=resources%2Fvolume2%2Fissue2%2Fp
[38] Lewis Robinson. A tweet from your toaster: How the in-
ternet of things will affect social media, May 2014. URL
http://www.socialmediatoday.com/content/tweet-your-toaster-how-internet-things
Accessed November,2014.
[39] Philip Russom. TDWI Checklist Report: Operational Intelligence: Real-Time
Business Analytics from Big Data.
[40] Philip Russom. TDWI RESEARCH BIG DATA. 2011.
[41] Series, Chris Cooper, C Michael Hall, New Zealand, Noel Scott, and Rodolfo
Baggio. Network Analysis and Tourism From Theory to Practice.
[42] Amit Sheth, Boanerges Aleman-meza, I Budak Arpinar, Chris Halaschek, and
Cartic Ramakrishnan. Semantic Association Identification and Knowledge Dis-
covery for National Security Applications. 16(March):1–16, 2005.
[43] Sung-bum and Dae-young Kim. TRAVEL INFORMATION SEARCH BEHAV-
IOR AND SOCIAL NETWORKING.
[44] Sarawut Supattranuwong and Sukree Sinthupinyo. Applying Data Mining to
Analyze Travel Pattern in Searching Travel Destination Choices. pages 38–44,
2013.
[45] Ashish Thusoo, Zheng Shao, Suresh Anthony, Dhruba Borthakur, Na-
mit Jain, Joydeep Sen Sarma, Raghotham Murthy, and Hao Liu.
Data warehousing and analytics infrastructure at facebook. Proceed-
ings of the 2010 international conference on Management of data - SIG-
MOD ’10, page 1013, 2010. doi: 10.1145/1807167.1807278. URL
http://portal.acm.org/citation.cfm?doid=1807167.1807278.
[46] Tim O’Reilly. What Is Web 2.0. URL
http://oreilly.com/web2/archive/what-is-web-20.html.
[47] Alex Woodie. How big data analytics can help fight isis, October 2014. URL
http://www.datanami.com/2014/10/14/big-data-analytics-can-help-fight-isis/.
Accessed November,2014.
Social Media Mining - Chapter 2 (Graph Essentials)
 
IRJET- Predicting Social Network Communities Structure Changes and Detection ...
IRJET- Predicting Social Network Communities Structure Changes and Detection ...IRJET- Predicting Social Network Communities Structure Changes and Detection ...
IRJET- Predicting Social Network Communities Structure Changes and Detection ...
 

Destaque

データ分析を生かした運営
データ分析を生かした運営データ分析を生かした運営
データ分析を生かした運営Hiroki Funahashi
 
국내기업들의 Sns활용 실태 분석
국내기업들의 Sns활용 실태 분석국내기업들의 Sns활용 실태 분석
국내기업들의 Sns활용 실태 분석Seongjo Ahn
 
Ethical and Legal Issues in Computational Social Science - Lecture 7 in Intro...
Ethical and Legal Issues in Computational Social Science - Lecture 7 in Intro...Ethical and Legal Issues in Computational Social Science - Lecture 7 in Intro...
Ethical and Legal Issues in Computational Social Science - Lecture 7 in Intro...Lauri Eloranta
 
Sns+현황+및+전망
Sns+현황+및+전망Sns+현황+및+전망
Sns+현황+및+전망철순 장
 
Big Data and Data Mining - Lecture 3 in Introduction to Computational Social ...
Big Data and Data Mining - Lecture 3 in Introduction to Computational Social ...Big Data and Data Mining - Lecture 3 in Introduction to Computational Social ...
Big Data and Data Mining - Lecture 3 in Introduction to Computational Social ...Lauri Eloranta
 
Domain Specific Language for Specify Operations of a Central Counterparty
Domain Specific Language for Specify Operations of a Central CounterpartyDomain Specific Language for Specify Operations of a Central Counterparty
Domain Specific Language for Specify Operations of a Central CounterpartyChamin Nalinda Loku Gam Hewage
 
Presentacion Final.pptx
Presentacion Final.pptxPresentacion Final.pptx
Presentacion Final.pptxUNISON
 
Tugas 4 persoalan khusus
Tugas 4 persoalan khususTugas 4 persoalan khusus
Tugas 4 persoalan khususfadhly arsani
 
Kuliah sli 2015 rev 127
Kuliah sli 2015 rev 127Kuliah sli 2015 rev 127
Kuliah sli 2015 rev 127Rizky Maulana
 
Acoustic barcodes&FlatFitFab
Acoustic barcodes&FlatFitFabAcoustic barcodes&FlatFitFab
Acoustic barcodes&FlatFitFab又瑋 賴
 
Dengue fever
Dengue feverDengue fever
Dengue feverDr Slayer
 
Sdal overview sallie keller
Sdal overview  sallie kellerSdal overview  sallie keller
Sdal overview sallie kellerkimlyman
 
Healey sdal social dynamics in living systems from microbe to metropolis
Healey sdal social dynamics in living systems from microbe to metropolis Healey sdal social dynamics in living systems from microbe to metropolis
Healey sdal social dynamics in living systems from microbe to metropolis kimlyman
 
Portico&video bubbles
Portico&video bubblesPortico&video bubbles
Portico&video bubbles又瑋 賴
 
Psychological disorder
Psychological disorder Psychological disorder
Psychological disorder UNISON
 
Grafik analisis kebangkrutan
Grafik analisis kebangkrutanGrafik analisis kebangkrutan
Grafik analisis kebangkrutanfadhly arsani
 
Exploring percussive gesture on i pads with ensemble
Exploring percussive gesture on i pads with ensembleExploring percussive gesture on i pads with ensemble
Exploring percussive gesture on i pads with ensemble又瑋 賴
 
Depend&dingdong
Depend&dingdongDepend&dingdong
Depend&dingdong又瑋 賴
 

Destaque (20)

データ分析を生かした運営
データ分析を生かした運営データ分析を生かした運営
データ分析を生かした運営
 
最近のデータ分析の潮流(仮)
最近のデータ分析の潮流(仮)最近のデータ分析の潮流(仮)
最近のデータ分析の潮流(仮)
 
국내기업들의 Sns활용 실태 분석
국내기업들의 Sns활용 실태 분석국내기업들의 Sns활용 실태 분석
국내기업들의 Sns활용 실태 분석
 
Ethical and Legal Issues in Computational Social Science - Lecture 7 in Intro...
Ethical and Legal Issues in Computational Social Science - Lecture 7 in Intro...Ethical and Legal Issues in Computational Social Science - Lecture 7 in Intro...
Ethical and Legal Issues in Computational Social Science - Lecture 7 in Intro...
 
Sns+현황+및+전망
Sns+현황+및+전망Sns+현황+및+전망
Sns+현황+및+전망
 
Big Data and Data Mining - Lecture 3 in Introduction to Computational Social ...
Big Data and Data Mining - Lecture 3 in Introduction to Computational Social ...Big Data and Data Mining - Lecture 3 in Introduction to Computational Social ...
Big Data and Data Mining - Lecture 3 in Introduction to Computational Social ...
 
Domain Specific Language for Specify Operations of a Central Counterparty
Domain Specific Language for Specify Operations of a Central CounterpartyDomain Specific Language for Specify Operations of a Central Counterparty
Domain Specific Language for Specify Operations of a Central Counterparty
 
Presentacion Final.pptx
Presentacion Final.pptxPresentacion Final.pptx
Presentacion Final.pptx
 
Tugas 4 persoalan khusus
Tugas 4 persoalan khususTugas 4 persoalan khusus
Tugas 4 persoalan khusus
 
Kuliah sli 2015 rev 127
Kuliah sli 2015 rev 127Kuliah sli 2015 rev 127
Kuliah sli 2015 rev 127
 
Acoustic barcodes&FlatFitFab
Acoustic barcodes&FlatFitFabAcoustic barcodes&FlatFitFab
Acoustic barcodes&FlatFitFab
 
Dengue fever
Dengue feverDengue fever
Dengue fever
 
Sdal overview sallie keller
Sdal overview  sallie kellerSdal overview  sallie keller
Sdal overview sallie keller
 
Healey sdal social dynamics in living systems from microbe to metropolis
Healey sdal social dynamics in living systems from microbe to metropolis Healey sdal social dynamics in living systems from microbe to metropolis
Healey sdal social dynamics in living systems from microbe to metropolis
 
Portico&video bubbles
Portico&video bubblesPortico&video bubbles
Portico&video bubbles
 
Psychological disorder
Psychological disorder Psychological disorder
Psychological disorder
 
Masalah khusus
Masalah khususMasalah khusus
Masalah khusus
 
Grafik analisis kebangkrutan
Grafik analisis kebangkrutanGrafik analisis kebangkrutan
Grafik analisis kebangkrutan
 
Exploring percussive gesture on i pads with ensemble
Exploring percussive gesture on i pads with ensembleExploring percussive gesture on i pads with ensemble
Exploring percussive gesture on i pads with ensemble
 
Depend&dingdong
Depend&dingdongDepend&dingdong
Depend&dingdong
 

Semelhante a Big Data Social Network Analysis

Semelhante a Big Data Social Network Analysis (20)

Big data
Big dataBig data
Big data
 
FULLTEXT01.pdf
FULLTEXT01.pdfFULLTEXT01.pdf
FULLTEXT01.pdf
 
Master's Thesis
Master's ThesisMaster's Thesis
Master's Thesis
 
QBD_1464843125535 - Copy
QBD_1464843125535 - CopyQBD_1464843125535 - Copy
QBD_1464843125535 - Copy
 
Content and concept filter
Content and concept filterContent and concept filter
Content and concept filter
 
Investigation in deep web
Investigation in deep webInvestigation in deep web
Investigation in deep web
 
10.0000@citeseerx.ist.psu.edu@generic 8 a6c4211cf65
10.0000@citeseerx.ist.psu.edu@generic 8 a6c4211cf6510.0000@citeseerx.ist.psu.edu@generic 8 a6c4211cf65
10.0000@citeseerx.ist.psu.edu@generic 8 a6c4211cf65
 
Botnet Detection and Prevention in Software Defined Networks (SDN) using DNS ...
Botnet Detection and Prevention in Software Defined Networks (SDN) using DNS ...Botnet Detection and Prevention in Software Defined Networks (SDN) using DNS ...
Botnet Detection and Prevention in Software Defined Networks (SDN) using DNS ...
 
Data mining of massive datasets
Data mining of massive datasetsData mining of massive datasets
Data mining of massive datasets
 
Stock_Market_Prediction_using_Social_Media_Analysis
Stock_Market_Prediction_using_Social_Media_AnalysisStock_Market_Prediction_using_Social_Media_Analysis
Stock_Market_Prediction_using_Social_Media_Analysis
 
Master's Thesis
Master's ThesisMaster's Thesis
Master's Thesis
 
12.06.2014
12.06.201412.06.2014
12.06.2014
 
Mining of massive datasets
Mining of massive datasetsMining of massive datasets
Mining of massive datasets
 
FULLTEXT01.pdf
FULLTEXT01.pdfFULLTEXT01.pdf
FULLTEXT01.pdf
 
Thesis
ThesisThesis
Thesis
 
KurtPortelliMastersDissertation
KurtPortelliMastersDissertationKurtPortelliMastersDissertation
KurtPortelliMastersDissertation
 
merged_document
merged_documentmerged_document
merged_document
 
Aregay_Msc_EEMCS
Aregay_Msc_EEMCSAregay_Msc_EEMCS
Aregay_Msc_EEMCS
 
Secure and Smart IoT using Blockchain and AI
Secure and Smart  IoT using Blockchain and AISecure and Smart  IoT using Blockchain and AI
Secure and Smart IoT using Blockchain and AI
 
Srs
SrsSrs
Srs
 

Mais de Chamin Nalinda Loku Gam Hewage

Mais de Chamin Nalinda Loku Gam Hewage (7)

Domain Specific Language for Specify Operations of a Central Counterparty(CCP)
Domain Specific Language for Specify Operations of a Central Counterparty(CCP)Domain Specific Language for Specify Operations of a Central Counterparty(CCP)
Domain Specific Language for Specify Operations of a Central Counterparty(CCP)
 
Domain Specific Language for Specify Operations of a Central Counterparty(CCP)
Domain Specific Language for Specify Operations of a Central Counterparty(CCP)Domain Specific Language for Specify Operations of a Central Counterparty(CCP)
Domain Specific Language for Specify Operations of a Central Counterparty(CCP)
 
Branch And Bound and Beam Search Feature Selection Algorithms
Branch And Bound and Beam Search Feature Selection AlgorithmsBranch And Bound and Beam Search Feature Selection Algorithms
Branch And Bound and Beam Search Feature Selection Algorithms
 
World’s Fastest Supercomputer | Tianhe - 2
World’s Fastest Supercomputer |  Tianhe - 2World’s Fastest Supercomputer |  Tianhe - 2
World’s Fastest Supercomputer | Tianhe - 2
 
Structured Cabling Technologies
Structured Cabling TechnologiesStructured Cabling Technologies
Structured Cabling Technologies
 
IP Multicasting
IP MulticastingIP Multicasting
IP Multicasting
 
Last Mile Access Technologies
Last Mile Access TechnologiesLast Mile Access Technologies
Last Mile Access Technologies
 

Último

THE FRAUD NETFLIX ORIGINAL MEDIA PITCH PROJECT
THE FRAUD NETFLIX ORIGINAL MEDIA PITCH PROJECTTHE FRAUD NETFLIX ORIGINAL MEDIA PITCH PROJECT
THE FRAUD NETFLIX ORIGINAL MEDIA PITCH PROJECT17mos052
 
Top 5 Ways To Use Reddit for SEO SEO Expert in USA - Macaw Digital
Top 5 Ways To Use Reddit for SEO  SEO Expert in USA - Macaw DigitalTop 5 Ways To Use Reddit for SEO  SEO Expert in USA - Macaw Digital
Top 5 Ways To Use Reddit for SEO SEO Expert in USA - Macaw Digitalmacawdigitalseo2023
 
Dubai Calls Girls Busty Babes O525547819 Call Girls In Dubai
Dubai Calls Girls Busty Babes O525547819 Call Girls In DubaiDubai Calls Girls Busty Babes O525547819 Call Girls In Dubai
Dubai Calls Girls Busty Babes O525547819 Call Girls In Dubaikojalkojal131
 
Amplify Your Brand with Our Tailored Social Media Marketing Services
Amplify Your Brand with Our Tailored Social Media Marketing ServicesAmplify Your Brand with Our Tailored Social Media Marketing Services
Amplify Your Brand with Our Tailored Social Media Marketing ServicesNetqom Solutions
 
Values Newsletter teamwork section 2023.pdf
Values Newsletter teamwork section 2023.pdfValues Newsletter teamwork section 2023.pdf
Values Newsletter teamwork section 2023.pdfSoftServe HRM
 
The--Fraud: Netflix Original Media Pitch
The--Fraud: Netflix Original Media PitchThe--Fraud: Netflix Original Media Pitch
The--Fraud: Netflix Original Media Pitch17mos052
 
Top 10 Ways to Know If a Song on social media
Top 10 Ways to Know If a Song on social mediaTop 10 Ways to Know If a Song on social media
Top 10 Ways to Know If a Song on social mediae-Definers Technology
 
INDIGENOUS GODS AND INDIGENOUS GODDESSES.pdf
INDIGENOUS GODS AND INDIGENOUS GODDESSES.pdfINDIGENOUS GODS AND INDIGENOUS GODDESSES.pdf
INDIGENOUS GODS AND INDIGENOUS GODDESSES.pdfcarlos784vt
 
Unveiling SOCIO COSMOS: Where Socializing Meets the Stars
Unveiling SOCIO COSMOS: Where Socializing Meets the StarsUnveiling SOCIO COSMOS: Where Socializing Meets the Stars
Unveiling SOCIO COSMOS: Where Socializing Meets the StarsSocioCosmos
 

Último (9)

THE FRAUD NETFLIX ORIGINAL MEDIA PITCH PROJECT
THE FRAUD NETFLIX ORIGINAL MEDIA PITCH PROJECTTHE FRAUD NETFLIX ORIGINAL MEDIA PITCH PROJECT
THE FRAUD NETFLIX ORIGINAL MEDIA PITCH PROJECT
 
Top 5 Ways To Use Reddit for SEO SEO Expert in USA - Macaw Digital
Top 5 Ways To Use Reddit for SEO  SEO Expert in USA - Macaw DigitalTop 5 Ways To Use Reddit for SEO  SEO Expert in USA - Macaw Digital
Top 5 Ways To Use Reddit for SEO SEO Expert in USA - Macaw Digital
 
Dubai Calls Girls Busty Babes O525547819 Call Girls In Dubai
Dubai Calls Girls Busty Babes O525547819 Call Girls In DubaiDubai Calls Girls Busty Babes O525547819 Call Girls In Dubai
Dubai Calls Girls Busty Babes O525547819 Call Girls In Dubai
 
Amplify Your Brand with Our Tailored Social Media Marketing Services
Amplify Your Brand with Our Tailored Social Media Marketing ServicesAmplify Your Brand with Our Tailored Social Media Marketing Services
Amplify Your Brand with Our Tailored Social Media Marketing Services
 
Values Newsletter teamwork section 2023.pdf
Values Newsletter teamwork section 2023.pdfValues Newsletter teamwork section 2023.pdf
Values Newsletter teamwork section 2023.pdf
 
The--Fraud: Netflix Original Media Pitch
The--Fraud: Netflix Original Media PitchThe--Fraud: Netflix Original Media Pitch
The--Fraud: Netflix Original Media Pitch
 
Top 10 Ways to Know If a Song on social media
Top 10 Ways to Know If a Song on social mediaTop 10 Ways to Know If a Song on social media
Top 10 Ways to Know If a Song on social media
 
INDIGENOUS GODS AND INDIGENOUS GODDESSES.pdf
INDIGENOUS GODS AND INDIGENOUS GODDESSES.pdfINDIGENOUS GODS AND INDIGENOUS GODDESSES.pdf
INDIGENOUS GODS AND INDIGENOUS GODDESSES.pdf
 
Unveiling SOCIO COSMOS: Where Socializing Meets the Stars
Unveiling SOCIO COSMOS: Where Socializing Meets the StarsUnveiling SOCIO COSMOS: Where Socializing Meets the Stars
Unveiling SOCIO COSMOS: Where Socializing Meets the Stars
 

Big Data Social Network Analysis

  • 1. Big Data Social Network Analysis
    by Chamin Nalinda (Registration No: 2011/CS/005, Index No: 11000058)
    chmk90@gmail.com, +94 772416604
    SCS 3017 Literature Survey
    Supervised by Dr. H. A. Caldera, BSc (Colombo), PGDip (Colombo), MSc (Colombo), PhD (Western Sydney)
    University of Colombo School of Computing, Colombo 7, Sri Lanka
    TexMaker | Mendeley Desktop | Harvard Style Referencing | Word Count = 5466
  • 2. Declaration
    I hereby declare that this literature survey report was written by Chamin Nalinda. A great deal of analysis was carried out in preparing this report, and the bibliography reflects the key reference materials. Self-learned knowledge is also included. References are cited without reproducing the owners' exact content (paragraphs, sentences, etc.).
    Name of Candidate: L.G.H.C. Nalinda
    Signature: ............................... Date: December 12, 2014
  • 3. Abstract
    Big Data Social Network Analysis (BDSNA) is the computational and graphical study of techniques that can be used to identify clusters, patterns and hidden structures, and to generate business intelligence, from social relationships within social networks in terms of network theory. Social Network Analysis (SNA) has a diverse set of applications and research areas such as health care, travel and tourism, defence and security, and the Internet of Things (IoT). With the boom of the internet, Web 2.0 and handheld devices, unstructured data has grown explosively in size, complexity and variety; analysing it and extracting information from it is therefore of great value, and adapting the Big Data concept to SNA is vital.
    This literature survey aims to investigate the usefulness of SNA in the “Big Data (BD)” arena. It reviews major research studies that have proposed business strategies and BD approaches to generate predictive models while addressing contemporary challenges that have arisen from SNA.
  • 4. Acknowledgements
    I would like to offer my heartfelt thanks to Dr. H. A. Caldera, my supervisor for the Literature Survey, for his immense support and continuous feedback during the course of the survey and for guiding me with valuable ideas. Further, my sincere gratitude goes to all the lecturers, assistant lecturers and the entire UCSC family. Special thanks to my parents, brother and sister, who have always given me strength through the journey of my life.
    Chamin Nalinda, December 12, 2014
  • 5. Contents
    Acknowledgements
    List of Figures
    Acronyms
    1 Introduction
      1.1 Approach
      1.2 Motivation
      1.3 History
      1.4 Current status
      1.5 Chapter summary
    2 Big Data Social Network Analysis Domains
      2.1 Health care
        2.1.1 Challenges and Future
      2.2 Defence and Security
        2.2.1 Identifying key players in network
        2.2.2 Use cases from recent history
        2.2.3 Challenges and Future
      2.3 Travel and Tourism
        2.3.1 Web 2.0 forms Tourism 2.0
        2.3.2 Tourism 2.0 Destination Management
        2.3.3 Challenges and Future
      2.4 Web 2.0 and IoT
        2.4.1 Challenges and Future
      2.5 Chapter summary
    3 BDSNA Tools and Technologies
      3.1 Major Concerns in BDSNA
      3.2 Real Time Analysis
  • 6. Contents (continued)
      3.3 Lambda Architecture
        3.3.1 Batch layer
        3.3.2 Serving layer
        3.3.3 Speed Layer
      3.4 Recommendation systems
      3.5 Web 2.0 IoT Architecture
      3.6 Chapter summary
    4 Conclusion, Challenges and Future Directions
      4.1 Conclusion
      4.2 Challenges and Future Directions
    Bibliography
  • 7. List of Figures
    2.1 Sources Used to Find or Access Health and Wellness Related Information in 2008, in United States of America (USA)
    2.2 9/11 attackers having weak ties with others
    2.3 Decentralized terrorist network
    2.4 PISTA ontology
    2.5 Most consulted Social Networks (SNs) in cybertravelling
    2.6 Traveller recommendation system
    2.7 TAM to gain loyalty
    2.8 Tweeting trees
    3.1 Expected growth in real time analytics by 2015
    3.2 Capabilities of Operational Intelligence
    3.3 Overview of Lambda Architecture
    3.4 Architecture for Social Internet of Things (SIoT) Client Side and Server Side
    4.1 Hosting data on cloud and challenges
  • 8. Acronyms
    AMA: American Medical Association
    ANN: Artificial Neural Network
    API: Application Programming Interface
    BD: Big Data
    BDA: Big Data Analytics
    BDSNA: Big Data Social Network Analysis
    BI: Business Intelligence
    BP: Batch Processing
    CC: Cloud Computing
    CO: Cognitive Objects
    DARPA: Defense Advanced Research Projects Agency
    DB: Database
    DD: Deep Data
    DM: Data Mining
    DMO: Destination Management Organizations
    DT: Decision Tree
    DW: Data Warehousing
    eWOM: e-word-of-mouth
    FB: Facebook
    FC: Fog Computing
    IoT: Internet of Things
  • 9. Acronyms (continued)
    LA: Lambda Architecture
    NLP: Natural Language Processing
    NSA: National Security Agency
    OI: Operational Intelligence
    OM: Opinion Mining
    RFID: Radio Frequency Identification
    ROS: Robot Operating System
    RTA: Real Time Analysis
    RTBDA: Real Time Big Data Analytics
    SIoT: Social Internet of Things
    SM: Social Media
    SNA: Social Network Analysis
    SNs: Social Networks
    SP: Stream Processing
    SW: Software
    TAM: Technology Acceptance Model
    TM: Text Mining
    TPA: Technosocial Predictive Analytics
    UGC: User Generated Content
    US: United States
    USA: United States of America
    WSNs: Wireless Sensor Networks
    WWW: World Wide Web
  • 10. Chapter 1 Introduction
    This literature survey covers key domains in which "Social Network Analysis" plays a vital role through the use of "Big Data" technologies. The knowledge discovered can be used to advance the current state of the respective domains. This chapter summarises the importance, history and growth potential of the survey topic.

    1.1 Approach
    Social Networks (SNs) connect people with different ideas, education, status, backgrounds, geographies and so on. The central idea of Social Network Analysis (SNA) is to identify the relationships within a network. Information diffusion is the key to relationship formation. Within SNs, a variety of interests addressing different domains are shared, and this forms complex relationships. With the World Wide Web (WWW) and Web 2.0, SNs have gained a new focus. Online SNs are massive data repositories. Visitors to SNs leave a digital footprint once they are logged in, so all activities of logged-in users can be examined. Data scientists recognised the importance of translating these technological opportunities into revenue, competitive advantage and useful discoveries that redefine human interaction[6] and day-to-day life. Otherwise the data would have remained in data tombs and the opportunities would have been ignored.
    A trusted way to analyse SNs is through BD analytical approaches. Data Mining
  • 11. (DM) techniques are heavily used to dig deeper into data in SNs. Big Data Analytics (BDA) is a proven method of defining new storage, access, query and scaling mechanisms for data and of developing new approaches to sentiment analysis, predictive modeling, Natural Language Processing (NLP), clickstream pattern recognition and so on.
    BDSNA is a fast-growing research area. There are quite a number of algorithms, software tools and analytic engines that are optimized[39] for BDSNA. These tools can gather, process and analyse data and present the results visually for a particular domain. This literature survey gives an overview of BDSNA as published in research papers, journals, web articles, books and so on.

    1.2 Motivation
    “Connectivity” is the concept behind the formation of SNs. The capabilities offered by SNs, sensors and online networks make them rich data sources. People spend a substantial amount of time in online networks. SNs therefore generate a high volume of User Generated Content (UGC) of different varieties at a rapid velocity[40]. This UGC is a true reflection of human behaviour in SNs and hence has high commercial value. However, its enormity and unstructured nature present multiple challenges, so storage, access, analytics and high computational performance need to be considered. As a result, “BD” technology has been combined with SNA to discover new dimensions of knowledge.
    Facebook (FB)¹, Twitter², LinkedIn³, Google+⁴, TripAdvisor⁵, Blogger⁶ and Instagram⁷ are the leading SNs with vast user engagement today. UGC in SNs takes the form of text, emoticons, images, ratings, likes and so on, and addresses many domains such as travel and tourism[33], defence and security[4], and healthcare and medicine. The nature and characteristics of these SNs differ; however, they share similarities that can be aggregated when addressing a domain. UGC poses many business opportunities.
    ¹ https://www.facebook.com/
    ² https://twitter.com/
    ³ https://www.linkedin.com/
    ⁴ https://plus.google.com/
    ⁵ tripadvisor.com/
    ⁶ https://www.blogger.com/
    ⁷ http://instagram.com/
  • 12. Discovering the knowledge that resides in UGC, while analysing the attributes that are unique to each domain, will create more opportunities for both the private and government sectors. Another big wave in the coming decade is the IoT. It will create more semi-structured and unstructured UGC and more varieties of SNs. As a result, “BD” will move towards the “Deep Data (DD)” concept, while “Cloud Computing (CC)” will move towards “Fog Computing (FC)”. Deducing business intelligence by connecting the dots using Operational Intelligence (OI), and comparing and applying the discovered knowledge in the modern and future societal context using classification, sentiment analysis and other techniques from the BD and DM paradigms, are blooming research topics. In these research areas, it is essential to determine which BDSNA algorithms and techniques can accommodate growth in size, scalability, quantification issues, pattern recognition issues and real-time analytics in SNA application areas. Data scientists and other researchers are also seeking novel ways of redesigning infrastructure to facilitate BDSNA given the rapid growth of the IoT.

    1.3 History
    There are competing claims about who initiated SNA, but an experiment conducted by Stanley Milgram in 1967 provides a proper grounding for it. He came up with the “six degrees of separation”[37] concept, stating that most people are connected through a chain of six acquaintances. SixDegrees.com was the first recognised online social network. BDSNA as a research arena boomed with Web 2.0, which came to light in 1999[46]. Low availability of internet facilities and a lack of Software (SW) tools to meet BD requirements were major reasons why BDSNA stayed out of sight in its early days.

    1.4 Current status
    Today millions of people are connected through social networks in many different ways[28]. Social networks are in a neck-and-neck fight to keep their current users while attracting new ones. This leads to semi-structured and unstructured data being generated at a rapid pace.
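    As an illustration of the “six degrees of separation” idea from Section 1.3, the number of acquaintance links between two people can be computed as a shortest path in a friendship graph. The following is a minimal sketch, not taken from the surveyed literature: the function name degrees_of_separation, the adjacency-set representation and the toy network are all assumptions made for this example, and breadth-first search is simply one standard way to find a shortest path in an unweighted graph.

```python
from collections import deque


def degrees_of_separation(graph, source, target):
    """Return the number of links on a shortest path from source to target.

    graph is assumed to be a dict mapping each person to a set of direct
    acquaintances. Returns None when the two people are not connected.
    """
    if source == target:
        return 0
    visited = {source}
    queue = deque([(source, 0)])  # (person, links travelled so far)
    while queue:
        person, dist = queue.popleft()
        for friend in graph.get(person, ()):
            if friend == target:
                return dist + 1
            if friend not in visited:
                visited.add(friend)
                queue.append((friend, dist + 1))
    return None


# Hypothetical toy friendship network, for illustration only.
toy_network = {
    "Ann": {"Bob"},
    "Bob": {"Ann", "Cara"},
    "Cara": {"Bob", "Dev"},
    "Dev": {"Cara"},
    "Eve": set(),
}

print(degrees_of_separation(toy_network, "Ann", "Dev"))  # Ann-Bob-Cara-Dev: 3 links
print(degrees_of_separation(toy_network, "Ann", "Eve"))  # not connected: None
```

    On networks with millions of users this single-machine traversal would presumably give way to the distributed BD tooling discussed later in the survey; the sketch is only meant to make the concept concrete.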
  • 13. BDSNA is an active and lucrative research area in modern computer science. Public and private sector organizations have opened up their data repositories for research purposes[13] and have encouraged data scientists to engage in more BDSNA research. Tech giants such as Google⁸, Microsoft⁹, FB, Amazon¹⁰ and IBM¹¹ are investing in start-up companies that operate in BDSNA because of its lucrative nature and growth potential. The demand for business intelligence tools is surging[10]. High performance, low latency, parallel distributed processing, real-time processing, scalability and migration are factors that are continuously optimized in such tools. Further, with the IoT a new era has begun in which trees tweet about their condition[12][15].

    1.5 Chapter summary
    In its early days BDSNA was not popular, for various reasons; it emerged with Web 2.0 technology. SNs connect people with different views and opinions. The UGC repositories of SNs are huge and come in many varieties. BDSNA helps in analysing UGC in SNs and thereby discovering knowledge, which has high commercial value as well. Today, different forms of online SNs address different user groups (FB, LinkedIn, TripAdvisor, etc.). Classification, sentiment analysis, clustering, Real Time Analysis (RTA) and various other BD and DM techniques are widely used in SNA. Advances in technology have now spread to SNA, and tech giants and data scientists are looking for novel approaches to meet its needs, such as storing, querying, accessing and analysing UGC with much improved technologies.
    The next chapter gives a detailed illustration of the four major SNA domains with which this literature survey is mainly concerned. Examples and use cases from survey reports, articles and journals are included in order to explore the importance of BDSNA to the respective domains.
    ⁸ https://www.google.com/
    ⁹ http://www.microsoft.com/
    ¹⁰ http://www.amazon.com/
    ¹¹ http://www.research.ibm.com/
  • 14. Chapter 2 Big Data Social Network Analysis Domains
    In this chapter, SNA domains in health care, defence and security, travel and tourism, and Web 2.0 and IoT are discussed. Examples illustrate how BDSNA has been used to address stakeholder intentions and expectations. Further, this chapter presents the specializations in each domain that have emerged as a result of BDSNA.

    2.1 Health care
    As shown in Figure 2.1, people are highly likely to use the internet as a source of health and wellness information, and they tend to spend much of their time on SNs in their day-to-day lives. Web 2.0 attracts users of all age groups. Discussion, information diffusion and collaboration over SNs are growing rapidly in the healthcare space. Recent research has found that healthcare professionals are willing to use SNs as a means of addressing their patients and monitoring patients' health conditions. Further, patients who have recovered are also interested in sharing their success stories on SNs in the form of blog posts, photos, videos and articles. This information is publicly available to a wide variety of people. As of now, we are in Health 2.0, “the use of social software and its ability to promote collaboration between patients, their caregivers, medical professionals, and
  • 15. other stakeholders in health”[17].
    Figure 2.1: Sources Used to Find or Access Health and Wellness Related Information in 2008, in USA
    It is “Collective Wisdom” that acts as the driving force for people to increasingly use SNs to find information relevant to their health matters. There are specifically developed SNs such as PatientsLikeMe¹, OrganizedWisdom², ICYou³, Google Health Groups, Sermo⁴ and DailyStrength⁵ that bridge the knowledge and experience of patients with the expertise of healthcare professionals[17]. The American Medical Association (AMA) emphasizes that physicians, neurosurgeons and other professionals should adhere to professional standards when publishing content on SNs, in order to safeguard their career standing in health care[9]. Even though there are challenges in collecting data, healthcare data on SNs is reported to be over 99.7% accurate[20].
    ¹ http://www.patientslikeme.com/
    ² http://www.organizedwisdom.com/Home
    ³ http://icyouhealth.tumblr.com/
    ⁴ https://www.sermo.com/
    ⁵ http://www.dailystrength.org/
  • 16. Research on the social behaviour of cancer patients on FB, conducted by the University of Texas M.D. Anderson Cancer Center, has enabled the center to provide a better service to its patients. The UGC studied was of poster and text types. This technique is called “Telemedicine”. Like Health 2.0, Medicine 2.0 is a concept that evolves through high user participation over SNs to communicate and collaborate on health care. Twitter is also widely popular among patients and healthcare professionals as a medium of communication[20]. However, patients must be willing to communicate openly over SNs; otherwise regulations will treat the use of their content as a violation of patients’ rights.

Videos, articles, comments, chats, images and other forms of healthcare-related UGC available on SNs represent a gold mine of opportunities[20][17]. Sophisticated applications have been developed that integrate both DM and BD techniques. TrialX⁶ is one such application that patients can use. Once a patient tweets, TrialX sends the patient a response tailored to his/her past health history[20][32].

Genetic engineering, drug research, disease research and public health domains use UGC on SNs to discover knowledge and thereby develop models to improve people's health. Twitter hashtags are quite useful for determining disease- and drug-related effects[8]. An automated filtering system developed by the US Food and Drug Administration found that 98% of tweets are bogus; the genuine information, however, is of great value[23].

Information extraction is critical. Documentation migrated from paper to digital form, together with SN data, makes up huge repositories. An automated surveillance system would be very effective at extracting information, analysing data and recognizing patterns. One such system, implemented at the University of Alabama, has proven results: it has been successful in identifying high-risk patients, short-term health issues and adverse effects of drugs. The use of big data has made it possible to deliver tailored prescriptions to patients[30].

A significant number of BD applications exist in the healthcare domain today. A SW tool similar to Asthmapolis⁷ that draws on the SN data repository would be worth implementing. Big data tools are expected to support mobility, hence mobile platform

⁶ http://trialx.com/enablers/
⁷ http://propellerhealth.com/
  • 17. enabled tools will have considerable growth potential. Ginger.io⁸ and mHealthCoach⁹ are the leading tools at present[18], but neither has been able to incorporate the SN domain into its applications, so the need for such tools remains.

2.1.1 Challenges and Future

UGC that appears on anonymous blogs and in spam comments is an unreliable source. Efficient NLP and Text Mining (TM) techniques need to be used when developing BD tools and applications. Strong rules and regulations exist in the healthcare domain, and these are a barrier to obtaining useful information from SNs. Sentiments alone are not enough to develop solid algorithms and models; patient information and other related information would add much value to research. “Privacy” concerns are another barrier: people might not want others to use what they share on SNs.

Web 2.0 will evolve into Web 3.0, and eventually Health 2.0 and Medicine 2.0 will evolve into Health 3.0 and Medicine 3.0[17]. With the rise of IoT, BD wearables will take priority in healthcare[7]. The concept of SN-connected BD wearables will redefine how people interact with healthcare matters.

2.2 Defence and Security

After the 9/11 attacks in the United States (US), the National Security Agency (NSA) invested a huge amount of resources in countering terror networks. “Networks and Netwars” by John Arquilla and David Ronfeldt, written prior to the 9/11 attacks, highlighted the behavioural patterns of criminal networks. Modern war network structures are leaderless and extremely quick, hence novel approaches are needed to counter terror threats. Valdis Krebs mapped the Al-Qaeda network responsible for 9/11[37]. More and more importance was given to SNA for tracing terror networks, and today SNA plays a key role in dismantling them[11]. Technosocial Predictive Analytics (TPA) methods for web DM and social web tools are needed to capture and query UGC in SNs[22].

⁸ https://ginger.io/
⁹ http://www.mhealthcoach.com/
  • 18. National security is the main concern. The defence domain differs from other SN domains in many ways, since the key players are not openly active. Weakly tied parties are somewhat open on SNs, but even they hardly communicate. SNA in defence requires two major parties: data collectors and data modellers. For the above reason, data collectors have a hard time gathering data. The University of Arizona Artificial Intelligence Center¹⁰ offers a large repository of terror-related newspaper articles, web pages and social network data. Clustering techniques have been used to segregate possible terror networks, and researchers have managed to represent pictorially the diffused, weakly tied network of the 9/11 attackers (Figure 2.2)[37].

Figure 2.2: 9/11 attackers having weak ties with others

2.2.1 Identifying key players in network

The two main goals of analysing SNs in the defence domain are to identify the structure of possible networks and to recognize key players. With the 9/11 attack, network structures became decentralized (although both centralized and decentralized networks still exist). Understanding the key players helps in taking control of the entire network. Though

¹⁰ http://ai.arizona.edu/research/terror
  • 19. it sounds easy, factors such as incompleteness, fuzzy boundaries and network dynamics make it a tough task. In a decentralized network, there are players who handle financial assistance and other supplies, while the leader plays a silent managing role[4] (Figure 2.3). BDSNA is used to identify the financial manager and thereby recognize key roles. Twitter BD analytic techniques are the most likely to be used in recognizing key players[31].

Figure 2.3: Decentralized terrorist network

As shown in Figure 2.4, the PISTA architecture is quite useful for closing major loopholes in the national security domain. At the moment, however, this architecture has few applications that integrate UGC from SNs. Investing in extending the PISTA architecture to support BDSNA in security domains is highly recommended, since most SNs have video sharing and geo-location features[42].

Figure 2.4: PISTA ontology
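The key-player idea of Section 2.2.1 can be illustrated with a minimal sketch. The survey does not name a specific measure, so the two used here are assumptions chosen for illustration: degree centrality (the share of other members a node is directly tied to) and a removal test (how many disconnected fragments remain when a node is deleted). The network, the node names and the function names are all hypothetical.

```python
def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly tied to."""
    n = len(graph)
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}


def components_without(graph, removed):
    """Count connected components left after deleting one node."""
    seen = {removed}
    count = 0
    for start in graph:
        if start in seen:
            continue
        count += 1
        seen.add(start)
        stack = [start]
        while stack:  # depth-first walk of one component
            node = stack.pop()
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
    return count


# Hypothetical decentralized network: a silent leader tied only to a financier.
graph = {
    "leader": {"financier"},
    "financier": {"leader", "supplier", "cell_a", "cell_b"},
    "supplier": {"financier"},
    "cell_a": {"financier", "a1"},
    "cell_b": {"financier"},
    "a1": {"cell_a"},
}

centrality = degree_centrality(graph)
key_player = max(centrality, key=centrality.get)
fragments = components_without(graph, key_player)
```

In this toy network the financier, not the leader, has the most ties, and removing that node splits the network into several fragments, which mirrors the point that identifying the financial manager exposes the key roles.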
  • 20. 2.2.2 Use cases from recent history

In recent history there have been several major incidents around the globe involving Web 2.0 initiatives. This section highlights some of them and the BDSNA technologies used in those situations.

• The 2008 Egyptian Revolution started from a FB group. It revealed the importance of paying attention to SNA[14].
• The 2009 efforts to reinstate the Chief Justice of Pakistan were driven purely by SN influence. The government banned private media, yet people raised social awareness through SNs, so the government had to reinstate the Chief Justice[14].
• ISIS is a technologically sophisticated terror group that actively engages in SNs. ISIS uses strong encryption when communicating over SNs. Because of this barrier, BDSNA approaches such as NLP, graph databases (to determine hierarchies and identities) and cognitive computing platforms cannot be used on their own as they are. Project Minerva, by the US Department of Defense, uses high-end algorithms to detect terror activities in data pulled from Twitter[47].
• The FIFA World Cup 2014 can be considered an event that used BDSNA to keep the peace around the grounds and nearby cities. Brazilian security forces analysed the semantics of real time Twitter feeds, FB feeds and other SN UGC to determine where to send troops to control riots. Security agencies used a powerful BD solution, Oracle Complex Event Processor¹¹, for real time querying of SN feeds.

2.2.3 Challenges and Future

BD analytics in the defence sector provides meaningful insight to governments. The director of the Defense Advanced Research Projects Agency (DARPA)¹² in the US emphasizes the importance of algorithm optimization in discovering useful intelligence. “E-harassment”, “cyberbullying” and “hacking” are major areas of investigation. The adoption of SN data is still at an early stage, but considering recent history it is apparent that

¹¹ http://www.oracle.com/technetwork/middleware/complex-event-processing/documentation/index.html
¹² http://www.darpa.mil/default.aspx
  • 21. it is essential to take SN data into consideration when discussing the security domain. The big argument against BDSNA in defence is the violation of privacy. People share their thoughts on FB, Twitter and other SNs because they have a right to do so, not so that the content can be used for other purposes. The recent whistle-blowing incidents involving Julian Assange and Wikileaks, and Edward Snowden and PRISM, are examples. It is apparent that governments try to hide such information from public view[11]. To obtain successful results there should be a balance between government policy towards SNs and user attitudes.

2.3 Travel and Tourism

Tourism has always been a networked industry. Web 2.0 redefined tourism and all related industries; this phenomenon is Tourism 2.0[26]. Among the stakeholders in tourist networks (tourists, travel agents, accommodation providers, restaurants etc.)[41], two major types can be identified: tourists and service providers. Different views of BDSNA have been taken in the tourism domain. Two broad views are, first, using SNs as a tool for determining tourist destinations[33][26] and, second, processing SNs to discover interesting patterns and applying the derived knowledge to tourism[34][16].

2.3.1 Web 2.0 forms Tourism 2.0

SNs are powerful tools that draw on the Technology Acceptance Model (TAM) and e-word-of-mouth (eWOM). TAM describes users’ willingness to adopt technologies, while eWOM is content shared on SNs in the form of text, images, videos etc. TAM and eWOM provide the primary source of information for cybertravellers, whose behaviour depends on what other people say about destinations (Figure 2.5). This approach highlights the need for a new framework to address destination governance. Service providers need to adapt their networks by embedding SN features that support searching, visualization and interactivity, which would trigger a positive attitude towards travelling. Travel 2.0 SNs (TripAdvisor, WAYN¹³, Tripwolf¹⁴, Travelblog¹⁵, Trivago¹⁶) offer SN features to address cybertraveller expectations. Here the focus is

¹³ http://www.wayn.com/
¹⁴ http://www.tripwolf.com/
¹⁵ https://www.travelblog.org/
¹⁶ http://www.trivago.com/
  • 22. more on leisure travellers than on business travellers[33][26][35][36].

Figure 2.5: Most consulted SNs in cybertravelling

2.3.2 Tourism 2.0 Destination Management

Tourist attitudes, behaviour and psychology have a huge impact on the choice of destinations to explore. Different market segments have different demands, and eMarketeers use tailored strategies to attract potential tourists. Destination Management Organizations (DMO) use DM and BDSNA techniques (clustering, Artificial Neural Network (ANN), Decision Tree (DT)) to determine customer intentions from the mixture of facts and opinions in UGC on SNs[43]. Travel 2.0 benefits from BDSNA in demand/sales forecasting, inventory management, multichannel marketing campaign organization etc. SN analysis methods are important for removing noise and discovering meaningful knowledge in SNs[1]. RapidMiner¹⁷ analyses traveller patterns and renders dynamic, personalized suggestions based on past results as well as results from other linked networks (predicting air ticket prices, hotel charges etc.)[44].

At the time of decision making, a traveller switches from one option to another depending on reviews. FB pages provide great insight into destinations and hotels. However, research has shown that travellers are very likely to tweet or post on FB if they have had a bad experience with a service provider, and Twitter users are more likely to retweet negative reviews than positive ones. This highlights the importance of monitoring UGC on the SN pages of service providers. Once travellers have selected their preferred hotel/travel service, they are very likely to visit the brand

¹⁷ https://rapidminer.com/
  • 23. website of the hotel/travel agency. It is vital to integrate TAM features there so that customers can explore more of the services offered, which helps win customer loyalty[36]. Such strategies ultimately help DMOs to boost their revenue and gain a competitive advantage over peers that ignore BDSNA.

Figure 2.6: Traveller recommendation system

2.3.3 Challenges and Future

This section describes the prevailing restrictions on BDSNA in the tourism and travel domain and the outlook for the future.

• Currently, most data stores are relational databases. The tourism and travel sector needs new infrastructure tools to get the most out of BDSNA.
• User opinions are subjective. Algorithms should present the generalized opinion of travellers and should not be affected by outliers.
• Content shared on FB, Twitter and other SNs has a direct influence on DMOs, travel agents and hotels, so a strong monitoring mechanism needs to be incorporated into Travel 2.0 websites.
• Airline service providers can benefit from real time data analytics: flight delays, UGC from SNs and sensor data (weather patterns etc.) serve greatly when optimizing operations.
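The requirement that a generalized traveller opinion should not be distorted by outliers can be made concrete with a small sketch. The survey does not prescribe a method; the median and a trimmed mean are assumed here as two common outlier-resistant summaries, and the ratings and the `trimmed_mean` helper are hypothetical.

```python
from statistics import mean, median


def trimmed_mean(values, trim=0.1):
    """Mean after dropping the lowest and highest `trim` fraction of values."""
    ordered = sorted(values)
    k = int(len(ordered) * trim)  # number of values to drop at each end
    core = ordered[k:len(ordered) - k] if k else ordered
    return mean(core)


# Hypothetical hotel ratings (1 to 5) with a single very negative outlier.
ratings = [4, 4, 5, 4, 5, 4, 4, 4, 4, 1]

plain = mean(ratings)            # pulled down by the outlier
robust_median = median(ratings)  # middle value, ignores the extreme rating
robust_trimmed = trimmed_mean(ratings, trim=0.1)
```

With these ratings the plain mean falls below 4 because of one review, while the median and the trimmed mean stay at or above 4, which is closer to what most travellers reported.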
  • 24. Figure 2.7: TAM to gain loyalty

2.4 Web 2.0 and IoT

The evolution of Web 2.0 into Web 3.0 (Ubiquitous Computing), an era in which hardware-embedded software takes the lead in people's daily routines, will have a huge influence on current SN practices as well. Today it is mostly humans that are connected to SNs. With advances in IoT, Cognitive Objects (CO), or smart objects, are capable of sharing UGC over SNs. Tweeting trees (Figure 2.8) and tweeting washing machines send real time content to humans[24][27]. Two broad types of SN exist: human-to-CO SNs and CO-to-CO SNs (SIoT)[19].

Figure 2.8: Tweeting trees
  • 25. Developers integrate SN capability into every smart device because SNs play an important role in personal life. Google Glass¹⁸, the Samsung Galaxy Gear watch¹⁹, the Apple iWatch²⁰ and many other wearable technologies have integrated SN capability. Lewis Robinson, in his article for SocialMediaToday²¹, stated that “iWatch will check in for you via Facebook when you arrive at an event. Your oven will take a photo of the cake you just baked and post it directly to Instagram”[38]. It is evident that automated, interconnected smart devices can act without human intervention. There will be more data than ever before. BDSNA will be able to provide more personalized information to all stakeholder groups, and advanced Business Intelligence (BI) can be derived using sophisticated analytical approaches. The concept of “Smart Cities” is an example of advanced data analytics applied to UGC from IoT devices that are connected to SNs and other CO. Waze²² is one such real time traffic application, connecting mobile devices with other CO (traffic lights, street signs etc.).

2.4.1 Challenges and Future

“Privacy” is again a major concern in this arena. Since devices are capable of sharing automatically generated content on SNs, this could violate the privacy of individuals. However, Lawrence Ampofo, in his recent article for Business2Community²³, emphasizes that the “conception of privacy become more sophisticated”, with people more likely to communicate openly about their personal lives through social networks, and “data to be more liberated from wall gardens making available to all platforms”[2].

It is predicted that by the end of 2020 the number of IoT devices will rise above 50 billion[38]. The potential for new kinds of SNs is massive. The amount of unstructured data generated by IoT devices will be so huge that even current BD technologies will not be able to accommodate its size, growth and scalability demands. The concepts of “Deep Data” and “Fog Computing” need to be used effectively to meet these infrastructure requirements.

¹⁸ https://www.google.com/glass/start/
¹⁹ http://www.samsung.com/uk/consumer/mobile-devices/wearables/gear/
²⁰ https://www.apple.com/watch/
²¹ http://www.socialmediatoday.com/
²² https://www.waze.com/
²³ http://www.business2community.com/
  • 26. 2.5 Chapter summary

Health 2.0 and Medicine 2.0 approaches have evolved as a result of Web 2.0, because the potential of SNs for the health care industry was recognized as massive. SNs are the fastest method of communication between patients and health care professionals such as nurses, doctors and specialists. PatientsLikeMe, Google Health Groups and various other SNs that focus specifically on health care share the knowledge and experiences of all parties related to health care. Sophisticated SW tools such as TrialX use BDSNA methods to send tailored responses to patients and doctors by analysing the data of related parties reflected on SNs. Specialized research areas such as drug research and disease research make heavy use of BDSNA approaches like TM, sentiment analysis, clustering and RTA.

The Defence and Security domain is very different from other domains in BDSNA. Finding a reliable data repository is a major challenge because terror groups hardly reveal any data, although the recent ISIS scenario is totally different. Today, government agencies and authorities use BDSNA to establish security in their territory. RTA plays an important role in analysing the UGC of SNs. Highly sophisticated models and predictive algorithms have been developed using BDSNA mechanisms.

Travel 2.0 evolved with Web 2.0. UGC in the form of text, video and images is a useful resource for discovering traveller psychology and behaviour. Business models such as TAM have been put to use through BDSNA. Hotel owners and travel agencies use BDSNA approaches to address their customers' requirements. RTA and recommendations are heavily used in Travel 2.0.

IoT has paved the way for living things like trees and non-living objects such as washing machines to share their status over SNs. As smart devices become part of SNs, the amount of unstructured and semi-structured data generated is enormous. This pushes data scientists to explore new technologies like FC and DD to integrate into BDA.

The third chapter focuses on core BDA technologies in SNA. Network visualization, data storage, processing and access, recommendation systems and RTA, as discussed for the SN domains above, are illustrated technically and theoretically.
  • 27. Chapter 3 BDSNA Tools and Technologies

In this Web 2.0 era, data generation is exploding exponentially, and data scientists and IT professionals are highly ambitious about turning BI into an asset in their business domains. This chapter presents the key concerns, core technologies and tools in BDSNA.

3.1 Major Concerns in BDSNA

This section summarizes the issues identified in the previous chapter.

• Security and Privacy: Most UGC on SNs reflects moments from people’s personal lives. All the scenarios considered in the last chapter highlight security and privacy as a major concern[11][2].
• Explosive growth rate: The growth of the Internet and IoT will generate more UGC. Infrastructure must be able to capture, store, process and analyse new sources of semi-structured and unstructured data from all SNs. FB uses Apache Hadoop¹ and Apache Hive² for storage because their hardware scalability is high, and Scribe³ for log collection[45].

¹ http://hadoop.apache.org/
² https://hive.apache.org/
³ https://github.com/facebookarchive/scribe
  • 28. • Extract valid UGC removing noise: TM, NLP and other DM techniques need to be optimized to establish the validity of data[2].
• Real time analytics: The need for Stream Processing (SP) is growing rapidly. User data gathered over a period goes through a Batch Processing (BP) mechanism to develop models, which are then used to check and analyse incoming events in real time.
• Sophisticated analytics tools and SW: Low latency and better visualization are expected from BDSNA tools and SW. The Lincoln Laboratory is currently engaged in research projects to develop sophisticated algorithms and software tools that generate networks from unstructured/semi-structured data[10].

A wide range of tools and SW for representing different user groups in SNs is available in the marketplace. When selecting the right tool, factors such as the intended goal, ease of use, operating platform and cost effectiveness need to be taken into consideration. Of all these, “visualization” capability is vital. The strengths of network ties, the structures of user groups and network dynamics can be viewed using these tools[29].

Tool / SW    Description
Gephi⁴       Platform-independent SW distributed under an open source licence. A good tool for visualizing networks and their relationships.
NetLogo⁵     Free, platform-independent software. Helps visualize the dynamics of network formation and can be used to study network behaviour.
iGraph⁶      Free SW that can be used to perform heavy calculations.
Pajek⁷       Another free SW, which runs only on the Windows platform. Offers network formation, dynamics, information diffusion and many other built-in features.
UCINet⁸      Commercial SW that supports only the Windows platform.
NodeXL⁹      Fairly new to the market. Integrates SNA with Excel. Free SW, for the moment available only for the Windows platform.

⁴ http://gephi.github.io
⁵ https://ccl.northwestern.edu/netlogo/
⁶ http://igraph.org/
⁷ http://pajek.imfm.si/doku.php
⁸ https://sites.google.com/site/ucinetsoftware/home
⁹ http://nodexl.codeplex.com/
  • 29. NetworkX¹⁰   A Python library, well suited to programmatic use. Interfaces with numerical libraries written in C and Fortran. Optimized for scaling to large matrices.

Neuro Productions 5K Twitter Browser and Neoformix Twitter Stream Graph are advanced visualization tools that can be used to analyse UGC from Twitter[24].

3.2 Real Time Analysis

FB, Twitter, LinkedIn, Google+, TripAdvisor and all leading SNs provide real time visibility into what their users prefer. The Intel BD Research Center forecast that the use cases for Real Time Big Data Analytics (RTBDA) will spread further in SNA than those for BP, yet BP will still act as the core of RTBDA. Real time analytics is based on SP. OI[39] and Lambda Architecture (LA) are the core BDA technologies that SNs mainly use for RTBDA.

RTA explanation

RTBDA is an advanced technique for making better decisions and taking meaningful actions at precisely the right time. There are two important aspects in RTA. Real time actions are treated as “streams of events”. To determine the action to perform when an event arrives, the system needs to capture, process and analyse the parameters and attributes of the incoming event stream and determine the corresponding stream category or group with regard to the application domain. The stream category is then matched with an action determined by a predefined model. It is important to develop this “model” in the first phase of RTA. Furthermore, RTA engines are stateless: they do not need to retain previous incoming streams to determine the action for the current stream[25].

¹⁰ http://networkx.lanl.gov/
  • 30. Figure 3.1: Expected growth in real time analytics by 2015

Figure 3.2: Capabilities of Operational Intelligence

FB, Twitter and other SNs use data records collected over a long period of time. The model is developed from the nature of the application domain (e.g. tourism, healthcare), not from the individual records that reside in data repositories. OI and LA are the core technical approaches to designing and developing RTA engines.
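The RTA flow described above (build a model first, then map each incoming event to a category and trigger the matching action without keeping state) can be sketched as follows. This is only an illustration under stated assumptions: the keyword-based categorizer, the rule "choose the most frequent past action per category", the action names and all function names are hypothetical and are not taken from the cited sources.

```python
from collections import Counter, defaultdict


def build_model(history):
    """Batch step: for each category, keep the most frequent past action."""
    by_category = defaultdict(Counter)
    for category, action in history:
        by_category[category][action] += 1
    return {cat: counts.most_common(1)[0][0] for cat, counts in by_category.items()}


def categorize(event):
    """Assign a category using only the attributes of the incoming event."""
    text = event["text"].lower()
    if "delay" in text or "cancelled" in text:
        return "complaint"
    if "recommend" in text or "great" in text:
        return "praise"
    return "other"


def handle(event, model, default="ignore"):
    """Stateless step: the action depends on this event and the model only."""
    return model.get(categorize(event), default)


# Hypothetical records gathered over a long period: (category, action taken).
history = [
    ("complaint", "alert_support"),
    ("complaint", "alert_support"),
    ("complaint", "ignore"),
    ("praise", "share"),
]
model = build_model(history)

actions = [
    handle({"text": "Flight delayed again"}, model),
    handle({"text": "Great hotel, would recommend"}, model),
    handle({"text": "Hello world"}, model),
]
```

Because `handle` reads nothing but the current event and the prebuilt model, events can be processed in any order and on any worker, which is the practical benefit of the stateless design.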
  • 31. 3.3 Lambda Architecture

LA, developed by Nathan Marz, achieves real time processing by decomposing the work into three layers: the batch layer, the serving layer and the speed layer. Everything starts from the equation query = function(all data)[5]. Evaluating this function over all data for every event on the fly is computationally very expensive. Instead, a precomputed query function, the batch view, is used to look up the result of the query. The precomputed view is indexed so that it can be accessed quickly with a few random reads.

Figure 3.3: Overview of Lambda Architecture

3.3.1 Batch layer

The batch layer acts as the master: it holds the master data set (HDFS) and computes arbitrary batch views over it (MapReduce)[25]. The master data set can be either historical data alone or historical data together with current data, depending on the business domain and key stakeholder interests. Apache Hadoop is used to process the master data set and develop the required model.

Simplest pseudo code for the batch layer[25]:

function runBatchLayer():
    while(true):
        // repeatedly recompute batch views from the beginning
        recomputeBatchViews()

  • 32. 3.3.2 Serving layer

Real time querying is supported by the serving layer. The real time stream is ingested into the analytic engine, where it is processed and the corresponding action is triggered. Apache Drill¹¹ and Cloudera Impala¹² are query engines used to implement serving layer functions[25].

3.3.3 Speed Layer

BP has substantial latency, and its impact is compensated for by distributed SP. Apache Storm¹³ and Apache S4¹⁴ are used to implement this layer[25].

3.4 Recommendation systems

FB, Twitter, LinkedIn, Google+ and all leading SNs use recommendation systems. These systems apply knowledge discovery techniques to the problem of making personalized recommendations during a live interaction[21].

Example: when you add a friend on FB, FB automatically suggests similar people to add (a generalized recommendation system). The recommendation engine first finds the other people who have added the same person as you (group 1). It then determines the people who have been added by members of group 1 (group 2). The system presents group 2 as recommended people to add, expanding your network.

¹¹ https://github.com/apache/drill
¹² http://www.cloudera.com/content/cloudera/en/products-and-services/cdh/impala.html
¹³ https://storm.apache.org/
¹⁴ http://incubator.apache.org/s4/
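The two groups in the friend recommendation example can be written down as a short sketch. It is a simplified illustration, not FB's actual algorithm: the `adds` mapping (who has added whom), the user names, the `recommend` function and the ranking by how many members of the first group added each candidate are all assumptions made for this example.

```python
from collections import Counter


def recommend(adds, user, new_friend, top_n=3):
    """Suggest people added by others who also added `new_friend`."""
    # First group: everyone else who added the same person.
    group1 = [u for u, added in adds.items() if u != user and new_friend in added]

    # Second group: people added by the first group, excluding people the
    # user already has, the user, and the newly added friend.
    already = adds.get(user, set()) | {user, new_friend}
    counts = Counter()
    for u in group1:
        for candidate in adds[u]:
            if candidate not in already:
                counts[candidate] += 1

    # Most frequently added candidates first; ties broken by name.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:top_n]]


# Hypothetical "who added whom" data.
adds = {
    "you": {"alice"},
    "bob": {"alice", "carol", "dave"},
    "erin": {"alice", "carol"},
    "frank": {"gina"},
}
suggestions = recommend(adds, "you", "alice")
```

Here "carol" ranks first because two people who added "alice" also added her, while "gina" is never suggested because "frank" shares no connection with the user.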
  • 33. SN recommendations are determined by the number of Likes, clicks, user ratings and emoticons. The algorithms fall mainly into two categories: content-based algorithms and collaborative filtering algorithms. Content-based algorithms check the similarity of the target (recommended) item. Collaborative filtering uses previous similar recommendations based on clicks, ratings etc. Additionally, a time window technique is used to give recommendations according to time durations (Google+ and Twitter trends etc.).

3.5 Web 2.0 IoT Architecture

Distributed Wireless Sensor Networks (WSNs) for sharing data, the Robot Operating System (ROS) as a middleware platform and Radio Frequency Identification (RFID) as an identification technology provide the core architectural infrastructure for CO to recognize activities and, at the same time, incorporate knowledge into smart objects. The Pachube platform¹⁵ provides the fundamental API groundwork for developers building SIoT.

Figure 3.4: Architecture for SIoT Client Side and Server Side

It is important to understand SIoT network characteristics and relationships when designing and developing smart environments. Four main types of relationship exist in SIoT networks[3]:

• parental object relationship: This is a family-like structure based on the assumption that CO share similar characteristics with devices developed during the same time period (the argument being that technology changes so rapidly).

¹⁵ http://datahub.io/dataset/pachube
  • 34. • co-location object relationship: Object relationships need to be established during the design and development of the smart environment, based on location information.
• ownership object relationship: One person can own several CO. This ownership information is vital when interacting with SNs of CO.
• social object relationship: Devices with similar characteristics can share best practices to solve issues. The “Cloud-of-cloud” concept is a broader view of the same idea, which can be related to edge computing IoT devices.

3.6 Chapter summary

Security and privacy need a great deal of attention when designing and developing BDA SW and tools, as well as when developing algorithms for SN domains. In SNA, “visualization” is an important aspect of SW that analyses different user groups, and Gephi outperforms other network visualization tools. Lambda Architecture, which combines BP and SP, is used for RTA in BDSNA. OI is used to develop models for RTA. Recommendation systems differ from domain to domain and use a number of user actions, such as clicks, likes and ratings, in their algorithms.

The next and final chapter summarizes the literature survey and gives future directions for BDSNA domains.
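The four SIoT relationship types listed in Section 3.5 can be sketched as simple pairwise checks over object metadata. This is a hypothetical illustration: the `SmartObject` fields, the use of build year as a proxy for "developed during the same time period" and the use of a shared device kind as a proxy for "similar characteristics" are assumptions made for the example, not definitions from the cited work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SmartObject:
    name: str
    owner: str
    location: str
    kind: str        # assumed proxy for "similar characteristics"
    build_year: int  # assumed proxy for "same development period"


def relationships(a, b):
    """Return the set of SIoT relationship types that hold between a and b."""
    found = set()
    if a.build_year == b.build_year:
        found.add("parental")
    if a.location == b.location:
        found.add("co-location")
    if a.owner == b.owner:
        found.add("ownership")
    if a.kind == b.kind:
        found.add("social")
    return found


# Hypothetical cognitive objects.
washer = SmartObject("washer", "ana", "home", "appliance", 2014)
oven = SmartObject("oven", "ana", "home", "appliance", 2013)
watch = SmartObject("watch", "ben", "office", "wearable", 2014)

washer_oven = relationships(washer, oven)
washer_watch = relationships(washer, watch)
oven_watch = relationships(oven, watch)
```

Keeping each relationship as an independent check means a pair of objects can hold several relationships at once, which matches the way the list above treats them as separate properties.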
  • 35. Chapter 4 Conclusion, Challenges and Future Directions

This chapter summarises the overall survey and provides insight into future directions for SNs in applying the gathered knowledge in practice.

4.1 Conclusion

Today even tech-averse people have an understanding of SNs (like FB), though they are hardly aware of what search engines can do. Children, youngsters, adults and even the elderly are making their presence felt on SNs. People are eager to share their personal life stories and, on the other hand, like to peep into other people’s affairs. Interestingly, not only humans but also other living and non-living objects are becoming users of SNs. The highly dynamic UGC on SNs reflects user perspectives and feedback. UGC is not restricted to a particular domain; it spreads across a vast variety of fields, and BDSNA helps address a wider range of stakeholder groups with more accurate BI.

This survey focused on four major domains of SNs (Healthcare, Defence and Security, Travel and Tourism, and Web 2.0 with IoT). To derive useful knowledge and recognize hidden patterns in the user activities of SNs, it is important to differentiate
  • 36. what is exciting and interesting among all activities. BDSNA is the solution. BDSNA has taken these sectors to a new dimension, making it worthwhile for all interested parties. Qualitative and quantitative results have been obtained through BDSNA to give a better service to users of SNs. Business strategies and models are being created to satisfy the demands of users. Predictive models, recommendation systems and real time analytics play a major role in today’s BDSNA. Modern BDSNA has been identified as one of the best approaches for many business domains, and it has become an essential part of developing highly sophisticated intelligence tools and SW.

4.2 Challenges and Future Directions

SNs like Facebook are considering cloud storage as a solution to accommodate their growing data storage needs. As shown in Figure 4.1, the biggest challenge in adopting cloud storage, identified by all organizations, is security and privacy violations. Even though a private cloud can provide mechanisms to strengthen security, cyber attackers are smart enough to identify loopholes and thereby corrupt data in a cloud. It is evident that as of yet there is no 100% guarantee that cloud technology can be used as a trusted service.

Figure 4.1: Hosting data on cloud and challenges

Most UGC on SNs is irrelevant to the domain under consideration. Incomplete text, multilingual content and bogus user feedback are difficult to cater
  • 37. to when doing genuine analytics. Deriving algorithms and strategies from the user group of a particular geography is not sufficient. Data scientists need to pay more attention to these factors when doing SNA. Also, TM and NLP are currently best supported for micro-blogging content (Tweets are limited to a maximum of 140 characters). These techniques need to improve to a level where they can analyse much longer text. A mechanism similar to YouTube real time translation would be quite beneficial in the SN domain context for spreading awareness to a wide range of users.

SNs have a huge impact on human behaviour and intentions, and have challenged conventional patterns of human behaviour over recent years. FB can be used to find a friend or relation, and LinkedIn is a place to find professionals. It is apparent that SNs play the role of a “search engine”. Integrating proper indexing methodologies would enhance the search function of SNs and give users more accurate results.

Further, companies advertise their products and services on SNs. In the near future, users will find it more compelling and attractive to use SNs for online shopping. This is a big business opportunity for SNs like FB, but on the other hand it may drive some users away from SNs. The need for shopping pattern analytics in SNs will also arise in the future. Just as we have different types of SNs for different purposes now (FB and LinkedIn), there will be more categories of SNs in the future. IoT will be a driving factor in diversifying SNs.
  • 38. Bibliography
[1] Rajendra Akerkar. Big Data & Tourism: To promote innovation and increase. 2012.
[2] Lawrence Ampofo. 5 ways the internet of things will change social media, October 2014. URL http://www.business2community.com/social-media/5-ways-internet-things-will-cha Accessed November 2014.
[3] Luigi Atzori, Antonio Iera, and Giacomo Morabito. SIoT: Giving a Social Structure to the Internet of Things. 15(11):1193–1195, 2011.
[4] Ala Berzinji. Detecting Key Players in Terrorist Networks. 2011.
[5] Nathan Bijnens. A real-time Lambda Architecture using Hadoop & Storm. NoSQL Matters Cologne, 2014.
[6] Jaap Bloem, Sander Duivestein, and Thomas Van Manen. Big Social: Predicting behavior with Big Data.
[7] BloombergTV. Can wearables and big data cure disease?, August 2014. URL http://www.bloomberg.com/video/parkinson-s-disease-new-ways-to-study-illness-V Accessed November 2014.
[8] David Bollier and Charles M Firestone. The Promise and Peril of Big Data. 2010. ISBN 0898435161.
[9] Jeff Cain. Social media in health care: the case for organizational policy and employee education. American journal of health-system pharmacy : AJHP : official journal of the American Society of Health-System Pharmacists, 68
  • 39. (11):1036–40, June 2011. ISSN 1535-2900. doi: 10.2146/ajhp100589. URL http://www.ncbi.nlm.nih.gov/pubmed/21593233.
[10] William M Campbell, Charlie K Dagli, and Clifford J Weinstein. with Content and Graphs. 20(1), 2013.
[11] Neil Couch and Bill Robins. Big Data for Defence and Security.
[12] Paul M. Davis. A tree that tweets, September 2010. URL http://www.shareable.net/blog/a-tree-that-tweets. Accessed October 2014.
[13] Yoree Koh and Don Clark. IBM and Twitter forge partnership on data analytics, 2014. URL http://online.wsj.com/articles/ibm-and-twitter-forge-partnership-on-data-analy Accessed October 2014.
[14] Mark Drapeau and Linton Wells II. Social Software and National Security: An Initial Net Assessment. (April), 2009.
[15] Rob Faludi. New york times on botanicalls, again!, April 2013. URL http://www.botanicalls.com/. Accessed October 2014.
[16] Roberta Floris and Michele Campagna. Social Media Data in Tourism Planning: Analysing Tourists’ Satisfaction in Space and Time. 8(May):997–1003, 2014.
[17] California Healthcare Foundation. The Wisdom of Patients: Health Care Meets Online Social Media. (April), 2008.
[18] Peter Groves and David Knott. The ‘big data’ revolution in healthcare. (January), 2013.
[19] Dominique Guinard, Vlad Trifa, Friedemann Mattern, and Erik Wilde. From the internet of things to the web of things: Resource-oriented architecture and best practices. In Architecting the Internet of Things, pages 97–129. Springer, 2011.
[20] Carissa Hilliard. Social media for healthcare: A content analysis of md anderson’s facebook presence and its contribution to cancer support systems. of Undergraduate Research in Communications, page 23.
  • 40. [21] Jianming and Wesley W Chu. A Social Network-Based Recommender System (SNRS).
[22] Maged N Kamel Boulos, Antonio P Sanfilippo, Courtney D Corley, and Steve Wheeler. Social Web mining and exploitation for serious applications: Technosocial Predictive Analytics and related technologies for public health, environmental and national security surveillance. Computer methods and programs in biomedicine, 100(1):16–23, October 2010. ISSN 1872-7565. doi: 10.1016/j.cmpb.2010.02.007. URL http://www.ncbi.nlm.nih.gov/pubmed/20236725.
[23] Deborah Kotz. Using twitter as tool to track side effects from drugs, April 2014. URL http://www.bostonglobe.com/lifestyle/health-wellness/2014/04/30/using-twitter- Accessed November 2014.
[24] Matthias Kranz, Luis Roalter, and Florian Michahelles. Things That Twitter: Social Networks and the Internet of Things.
[25] Nathan Marz and James Warren. Big Data: principles and practices of scalable real time systems.
[26] Roberta Milano. The effects of online social media on tourism websites. 2011.
[27] Mark Million. Washing machine twitters when clothes are done, January 2009. URL http://latimesblogs.latimes.com/technology/2009/01/twitter-washing.html. Accessed November 2014.
[28] Alan Mislove, Hema Swetha Koppula, Krishna P Gummadi, Peter Druschel, and Bobby Bhattacharjee. Growth of the flickr social network. In Proceedings of the first workshop on Online social networks, pages 25–30. ACM, 2008.
[29] Chamin Nalinda. Social network analysis tools and softwares, October 2014. URL http://techspiro.blogspot.com/2014/10/social-network-analysis-tools-softwares. Accessed October 2014.
[30] Mary K Obenshain. Application of Data Mining Techniques to Healthcare Data. (August):690–695, 2004.
  • 41. [31] Onook Oh, Manish Agrawal, and H Raghav Rao. Information control and terrorism: Tracking the mumbai terrorist attack through twitter. Information Systems Frontiers, 13(1):33–43, 2011.
[32] Chintan Patel. Now you can talk to twitter and find clinical trials on trialx, December 2012. URL http://trialx.com/enablers/2009/03/now-you-can-talk-to-twitter-and-find-clinic Accessed November 2014.
[33] Loredana Di Pietro, Francesca Di Virgilio, and Eleonora Pantano. Social network for the choice of tourist destination: attitude and behavioural intention. Journal of Hospitality and Tourism Technology, 3(1):60–76, 2012. ISSN 1757-9880. doi: 10.1108/17579881211206543. URL http://www.emeraldinsight.com/10.1108/17579881211206543.
[34] Angelo Presenza and Maria Cipollina. Analysis of links and features of tourism destination’s stakeholders. an empirical investigation of a south italian region. 2009.
[35] Pslulfdo and Ehwzhhq. An Empirical Study on the Relationship between Twitter Sentiment and influence in Tourism Domain. 2012.
[36] Cornell Hospitality Report, Laura Mccarthy, Debra Stock, Rohit Verma, D Ph, Rod Clough, Gregg Gilman, Employment Practices, and Gilbert Llp. How Travelers Use Online and Social Media Channels to Make Hotel-choice Decisions. 10(18), 2010.
[37] Steve Ressler. Social network analysis as an approach to combat terrorism: past, present, and future research. Homeland Security Affairs, 2006. URL http://www.hsaj.org/?download&mode=dl&h&w&drm=resources%2Fvolume2%2Fissue2%2Fp
[38] Lewis Robinson. A tweet from your toaster: How the internet of things will affect social media, May 2014. URL http://www.socialmediatoday.com/content/tweet-your-toaster-how-internet-things Accessed November 2014.
[39] Philip Russom. TDWI Checklist Report: Operational Intelligence: Real-Time Business Analytics from Big Data.
[40] Philip Russom. TDWI RESEARCH BIG DATA. 2011.
  • 42. [41] Series, Chris Cooper, C Michael Hall, New Zealand, Noel Scott, and Rodolfo Baggio. Network Analysis and Tourism From Theory to Practice.
[42] Amit Sheth, Boanerges Aleman-meza, I Budak Arpinar, Chris Halaschek, and Cartic Ramakrishnan. Semantic Association Identification and Knowledge Discovery for National Security Applications. 16(March):1–16, 2005.
[43] Sung-bum and Dae-young Kim. TRAVEL INFORMATION SEARCH BEHAVIOR AND SOCIAL NETWORKING.
[44] Sarawut Supattranuwong and Sukree Sinthupinyo. Applying Data Mining to Analyze Travel Pattern in Searching Travel Destination Choices. pages 38–44, 2013.
[45] Ashish Thusoo, Zheng Shao, Suresh Anthony, Dhruba Borthakur, Namit Jain, Joydeep Sen Sarma, Raghotham Murthy, and Hao Liu. Data warehousing and analytics infrastructure at facebook. Proceedings of the 2010 international conference on Management of data - SIGMOD ’10, page 1013, 2010. doi: 10.1145/1807167.1807278. URL http://portal.acm.org/citation.cfm?doid=1807167.1807278.
[46] Tim O’Reilly. What Is Web 2.0. URL http://oreilly.com/web2/archive/what-is-web-20.html.
[47] Alex Woodie. How big data analytics can help fight isis, October 2014. URL http://www.datanami.com/2014/10/14/big-data-analytics-can-help-fight-isis/. Accessed November 2014.