Big Data Tech Conclave, 26—27 April 2013
Bangalore, India
Big Data and the Semantic Web:
Challenges and Opportunities
Srinath Srinivasa
Open Systems Laboratory
IIIT Bangalore
http://osl.iiitb.ac.in/
sri@iiitb.ac.in
http://www.bda2013.net/
OSL Releases
Topical Anchors: Given a list of noun phrases, identify a semantic topic for these terms.
Powered by Wikipedia co-occurrence graph hosted by Agama
Web APIs enable use of Topical Anchors in third party applications
OSL Releases
Topic Expansion: Given a
term, expands it into
semantically relevant topical
clusters with different
senses.
Uses co-occurrence
datasets from Wikipedia
2006 or 2011.
Web APIs enable use by
third party applications
OSL Releases
Agama: A graph database for storing large undirected graphs for efficient traversal (not structure-based retrieval)
Currently Agama powers a co-occurrence graph of all noun phrases from Wikipedia articles hosted in OSL, managing tens of millions of nodes and hundreds of millions of edges
More data beats better algorithms..
meets
No data is an island..
Outline
● Big Data Characteristics
● Big Data Analytics
● Pattern-driven and Model-driven Analytics
● Big Data and the Semantic Web
● Semantic Challenges
● The myth of a global ontology
● Convergent and divergent semantics
● Semantic interoperability 
● Technology Challenges
● Storage, traversal and retrieval of large-scale semantic networks
● Inference on Big Data
● On the road ahead
Big Data
Data that is 
● Too large to be processed by conventional 
databases and data management techniques 
(Volume)
● So diverse in structure that no single data model captures all elements of the data (Variety)
● Transient and/or impermanent, especially when 
pertaining to dynamic phenomena (Velocity)
Big Data
● Transaction records
● Network streams
● Experimental output
● Social media data 
● Demographic records
● Citation data 
● Clickstreams
● Log data
● Weather data 
● …
Some Big Data Stats
● YouTube users upload 48 hours of video every minute
http://gigaom.com/2011/05/25/youtube-48-hours-of-video-per-minute/
● Facebook data grows by 500 TB daily
http://www.slashgear.com/facebook-data-grows-by-over-500-tb-daily-23243691/
● WalMart handles more than 1 million customer transactions every hour
http://www.economist.com/node/15557443
● Akamai analyzes 75 million events per day for targeted advertising
http://wikibon.org/blog/taming-big-data/
● 90% of data in the world today was created in the last 2 years
http://wikibon.org/blog/big-data-infographics/
Big Data Analytics
Examine Big Data for useful (often actionable) 
knowledge
The long spectrum of Big Data Analytics, from pattern driven to model driven:
● Pattern identification
● Association rule mining
● Classification/Clustering
● Record Linkage
● Security analytics
● Complex Event Processing
● Opinion mining
● Predictive modeling
Pattern Driven Analytics
● Discovery and visualization 
of recurring patterns in 
datasets
● Mostly quantitative
●  Paradigms in pattern 
discovery:
● Sampling and 
aggregation
● Thresholding and 
filtering
Image Source: Wikipedia
Pattern Driven Analytics
Sampling and Aggregation
● Query based pattern aggregation
● Based on an initial idea of what we are looking 
for
[Diagram: a Hypothesis drives a Query over the Data, yielding Patterns, followed by Aggregation and Presentation]
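The query-then-aggregate flow can be sketched in a few lines of Python. This is a minimal illustration only: the clickstream records, the `query` and `aggregate` helpers, and the 30-second hypothesis are all assumptions made up for the example, not material from the talk.

```python
from collections import Counter

# Hypothetical clickstream records: (user, page, seconds_on_page).
RECORDS = [
    ("u1", "/home", 12),
    ("u1", "/pricing", 45),
    ("u2", "/home", 8),
    ("u2", "/pricing", 60),
    ("u3", "/home", 5),
    ("u3", "/docs", 120),
]

def query(records, min_seconds):
    """Query step: keep only the records that match the hypothesis."""
    return [r for r in records if r[2] >= min_seconds]

def aggregate(matches):
    """Aggregation step: count matching visits per page."""
    return Counter(page for _, page, _ in matches)

# Hypothesis (assumed): some pages hold attention for at least 30 seconds.
engaged = aggregate(query(RECORDS, min_seconds=30))
for page, count in engaged.most_common():
    print(page, count)
```

The point of the sketch is that the hypothesis fixes the query before any data is read, so only matching records ever reach the aggregation step.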
Pattern Driven Analytics
Thresholding and Filtering
● Based on sifting through the entire dataset (or a 
view) to look for “interesting” patterns without 
the context of a query
[Diagram: Data → Patterns → Filtering and Segregation → Presentation, guided by Interestingness criteria]
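One common instance of thresholding is frequent-pair mining with a minimum-support cutoff. The sketch below assumes that interpretation; the transactions, the `frequent_pairs` helper, and the 0.6 threshold are hypothetical and stand in for whatever interestingness criteria an application would use.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market-basket transactions.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "jam"},
]

def frequent_pairs(transactions, min_support):
    """Scan every transaction and keep item pairs whose support meets the threshold."""
    counts = Counter()
    for basket in transactions:
        # Sorting makes each pair canonical, e.g. ("bread", "milk").
        counts.update(combinations(sorted(basket), 2))
    total = len(transactions)
    return {pair: n / total for pair, n in counts.items() if n / total >= min_support}

patterns = frequent_pairs(TRANSACTIONS, min_support=0.6)
for pair, support in sorted(patterns.items()):
    print(pair, support)
```

Unlike the query-driven case, no hypothesis is supplied up front: every pair in the dataset is counted and only the threshold decides what counts as interesting.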
Model Driven Analytics
Analytics as a model­discovery problem
[Figure: observable data (images) mapped to the latent concept “Wedding”. Images source: Wikipedia]
Model Driven Analytics
● Pattern discovery coupled with semantic 
modeling
● Non-trivial qualitative modeling challenges
● Model discovery:
● Descriptive model discovery
Fit a model to explain the observed data
● Predictive model discovery
Discover a model that can predict values of data elements 
into the future
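The difference between descriptive and predictive model discovery can be illustrated with a least-squares line, used here purely as an assumed stand-in for a model. The data points and the `fit_line` helper are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical observations: day number and daily upload volume.
days = [1, 2, 3, 4, 5]
volume = [10.0, 12.0, 14.0, 16.0, 18.0]

# Descriptive use: fit a model that explains the observed data.
slope, intercept = fit_line(days, volume)

# Predictive use: apply the same model to a day that has not been observed yet.
forecast = slope * 7 + intercept
print(slope, intercept, forecast)
```

The same fitted parameters serve both purposes; what changes is whether they are read as an explanation of past observations or used to extrapolate beyond them.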
Linked Data
[Figure: The Linked Data Cloud as of September 2011. Image source: Wikipedia]
Linked Data
● Using Semantic Web technologies to connect data 
elements from disparate data sources
● From Web of Documents to Web of Data
● Elements of Linked Data
● URIs 
● HTTP
● Resource Description Framework (RDF)
● Serialization formats (RDFa, RDF/XML, N3, Turtle, 
and others)
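To make the elements listed above concrete, the following sketch builds two RDF-style triples as Python tuples and writes them out in a simplified N-Triples form. The `to_ntriples` helper is hypothetical, the example URIs are illustrative, and escaping of special characters is deliberately left out.

```python
def to_ntriples(triples):
    """Serialize (subject, predicate, object) triples as N-Triples lines.

    URIs are wrapped in angle brackets; anything else is emitted as a plain
    string literal. Escaping of special characters is omitted for brevity.
    """
    def term(value):
        if value.startswith(("http://", "https://")):
            return f"<{value}>"
        return f'"{value}"'

    return "\n".join(f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples)

# Illustrative triples: the subject and predicate are URIs, the object is
# either a literal or another URI that links to a different resource.
TRIPLES = [
    ("http://dbpedia.org/resource/Bangalore",
     "http://www.w3.org/2000/01/rdf-schema#label",
     "Bangalore"),
    ("http://dbpedia.org/resource/Bangalore",
     "http://dbpedia.org/ontology/country",
     "http://dbpedia.org/resource/India"),
]

output = to_ntriples(TRIPLES)
print(output)
```

The second triple is what makes the data "linked": its object is itself a URI that can be dereferenced over HTTP to obtain more triples.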
Big Data and the Semantic Web
[Diagram: Big Data → Semantic Web via Model Discovery; Semantic Web → Big Data via Catalyzation and Predictive Modeling]
Big Data → Semantic Web
● One of the main elements of the Linked Data Cloud: DBpedia is 
built from a Big Data resource: Wikipedia
● Open Biomedical Ontology (OBO) (http://www.oboedit.org/) created from 
mining PubMed publications
● Enterprise-scale Big Data Analytics helping build organizational models, operational intelligence solutions, etc. Examples: Anzo software suite by Cambridge Semantics (www.cambridgesemantics.com), Loom data management suite by Revelytix (www.revelytix.com)
Semantic Web → Big Data
Schema.org
● Collection of schemata on various topics that are recognized by major 
search providers and used to semantically interpret web content
SourceMap
● Linked data augmented with web content and crowdsourced data used 
to provide details about companies like their carbon footprint, energy 
use, water use, etc. www.sourcemap.com 
OpenStreetMap
● Linked data augmenting crowdsourced data on www.openstreetmap.org helped in detailed mapping of the disaster scenario during the Jan 2010 Haiti earthquake (http://www.scientificamerican.com/article.cfm?id=berners-lee-linked-data)
Big Data and the Semantic Web: 
Challenges
Semantic challenges
● The myth of a global ontology
● Convergent and divergent semantics
Technology and system challenges
● Characteristics of a semantic graph
● Managing graph structured data
The Myth of a Global Ontology
Several “core” semantic ontologies exist:
● WordNet
● YAGO
● OpenCyc
● SUMO
However, none of them (even automated ones) can 
capture all possible semantic associations and all 
possible perspectives on a given topic
The Myth of a Global Ontology
The open world problem
● We don't know what we don't know.. 
● Representation bias in big data sources
The neutral-but-useless perspective
● Localized, utilitarian descriptions are often more useful than neutral, global descriptions. Example: use of “zones” as a geographical element in Indian Railways
● It is difficult for disparate perspectives to co-exist in a single ontology without violating design principles like Occam's razor
Convergent and Divergent 
Semantics
[Diagram: the Palestine POV, Israeli POV, historians' POV and UN's POV converge on the Wikipedia article on the West Bank conflict]
Encyclopedic Semantics
Convergent and Divergent 
Semantics
[Diagram: the IPL event schedule feeds into traffic planning, advertisement planning around IPL, legal structuring around IPL, TV programme scheduling and security planning]
Semantic Interoperability
● Binary predicates, as used in RDF, may not capture the complete semantics of an association
But it is too difficult to work with higher-order predicates
● Semantic queries are characterized by contextual relevance and default assumptions
● Linked Data can be useful primarily within the context of a model
Model-building from predicates is as complex a problem as identifying predicates from data
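The limitation of binary predicates shows up as soon as a fact has more than two participants. A common workaround, sketched here as an assumption rather than as the approach advocated in the talk, is to introduce an intermediate node and hang one binary triple per role off it. The `reify` helper, the role names, and the example fact are all hypothetical.

```python
import itertools

_ids = itertools.count(1)

def reify(relation, **roles):
    """Break one n-ary fact into binary triples attached to a fresh event node."""
    node = f"_:event{next(_ids)}"
    triples = [(node, "type", relation)]
    triples += [(node, role, value) for role, value in roles.items()]
    return triples

# Hypothetical ternary fact: a team plays a match at a venue on a date.
facts = reify("Match", team="TeamA", venue="Bangalore", date="2013-04-26")
for triple in facts:
    print(triple)
```

One n-ary statement becomes four binary ones, and a consumer has to know the modeling convention to put them back together, which is one reason such data is mainly useful within the context of a model.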
Semantic Challenges: Summary
● Hard to distinguish data from noise without a model
Especially hard when we are using data to help build a model!
● There may not be a single global model explaining the data
● Model construction is as challenging as predicate mining, if not more so
● No clarity on the underlying processes that aid in knowledge aggregation
Knowledge aggregation happens differently depending on the kind of 
knowledge being aggregated (encyclopedic versus operational knowledge) 
Tech Challenges
Storing Big Semantic Data
● Semantic data lacks the physical access coherence needed for efficient storage in relational tables
● Logical proximity of triples is more important than physical proximity
● Read/Write storage models change logical proximity
● RDF graphs tend to be extremely dense and/or clustered
● Need efficient methods of graph storage and retrieval 
Semantic store for Big Data
● Databases optimized to store and retrieve interrelated 
sets of triples of the form (subject, predicate, object) 
● Query models based on answering graph queries 
(usually in SPARQL) rather than SQL queries
● Main design criterion: storage and read-ahead policies for triples based on their logical proximity rather than physical proximity, in order to enable Bulk Synchronous Parallel (BSP) processing
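A toy in-memory store shows what grouping triples by logical proximity can look like. This is only a sketch under simplifying assumptions: the `TripleStore` class and its indexes are hypothetical, there is no persistence, no read-ahead and no SPARQL, and the pattern matcher supports only wildcards.

```python
from collections import defaultdict

class TripleStore:
    """Toy in-memory store that groups triples by subject and by object."""

    def __init__(self):
        self.triples = set()
        self.by_subject = defaultdict(set)
        self.by_object = defaultdict(set)

    def add(self, s, p, o):
        triple = (s, p, o)
        self.triples.add(triple)
        # Triples sharing a subject or an object are kept together, so the
        # neighbourhood of a node can be fetched in one lookup.
        self.by_subject[s].add(triple)
        self.by_object[o].add(triple)

    def match(self, s=None, p=None, o=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        if s is not None:
            candidates = self.by_subject.get(s, set())
        elif o is not None:
            candidates = self.by_object.get(o, set())
        else:
            candidates = self.triples
        return sorted(
            t for t in candidates
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        )

store = TripleStore()
store.add("Agama", "type", "GraphDatabase")
store.add("Agama", "hostedBy", "OSL")
store.add("Neo4j", "type", "GraphDatabase")

graph_dbs = [s for s, _, _ in store.match(p="type", o="GraphDatabase")]
print(graph_dbs)
```

A production triple store would apply the same idea to its on-disk layout and caching, which is where the read-ahead policies mentioned above come in.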
Semantic store for Big Data
AllegroGraph  (http://www.franz.com/agraph/allegrograph/)
● NoSQL Graph based native storage for RDF triples
● ACID compliant
● Interfaces with Solr for free text indexing 
● Triple and text level indexing
● MongoDB integration
● RDFS++ Reasoning with dynamic materialization 
● SPARQL queries on named graphs and Prolog based 
inferencing engine
Semantic store for Big Data
Sesame http://www.openrdf.org/
●  Open source Java framework for parsing, storing, 
querying and inferencing over RDF data 
● Collections of RDF triples can be manipulated in memory 
using a graph data model
● Compliant with SPARQL 1.1 protocol recommendation 
● Provides two levels of APIs: SAIL (Storage and Inference 
Layer) for low level RDF processing and Repository layer 
for programmatic interfacing with Sesame
Semantic store for Big Data
Mulgara http://www.mulgara.org/ 
● Native storage model for RDF
● Supports multiple models (databases) per server
● ACID transactions and concurrency support 
● Copy-on-write cache semantics
● Full-text search and support for data types
● Primarily useful as a repository – no evidence of 
support for logical inferences over RDF 
Semantic store for Big Data
Other examples:
● InfiniteGraph from Objectivity http://www.objectivity.com/
● Big-Data http://www.bigdata.com/bigdata/blog/
– A high scale-out storage and computing engine
● Agama https://github.com/arrac/agama/wiki/Agama
– Storage, search and traversal support (Ruby library) for very large graphs
● Neo4j http://www.neo4j.org/
– Embedded, disk-based transactional graph database written in Java
Logical inference over Big Data
● Problem: Find factual answers to specific questions by reasoning over large-scale data.
● Performing extremely large-scale deductions over large semantic datasets in interactive response time
● Need to contend with potentially inconsistent predicates, 
incomplete or missing values and default assumptions
● Varieties of inference over datasets
● Deduction
● Induction
● Abduction
● Statistical inference
Logical inference over Big Data
Common approaches for scalable inferencing:
● Horn clause inferencing
● Variants of random walks on knowledge graphs
● Distributed MCMC (Markov Chain Monte Carlo) 
methods
Horn Clauses
Horn clauses are predicates of the form:
$p_1 \wedge p_2 \wedge \dots \wedge p_n \rightarrow u$
where each $p_i$ and $u$ is an atomic sentence with no negation, and there is a single consequent
Horn clause knowledge bases can be resolved using “backward chaining”, starting from the consequent and building a tree of antecedents until they are grounded in facts
Horn clause resolution can be scaled over large datasets by parallelizing resolutions using MapReduce
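Backward chaining over Horn clauses can be sketched for the propositional case as follows. This is a single-machine illustration only and says nothing about the MapReduce parallelization; the rule format, the `backward_chain` function, and the toy knowledge base are hypothetical.

```python
def backward_chain(goal, rules, facts, visiting=None):
    """Prove `goal` by recursively proving the antecedents of a rule that concludes it."""
    if goal in facts:
        return True                      # grounded in a known fact
    visiting = visiting or set()
    if goal in visiting:
        return False                     # avoid looping on cyclic rules
    visiting = visiting | {goal}
    for antecedents, consequent in rules:
        if consequent == goal and all(
            backward_chain(a, rules, facts, visiting) for a in antecedents
        ):
            return True
    return False

# Hypothetical knowledge base: each rule is (antecedents, consequent).
RULES = [
    (("has_nodes", "has_edges"), "is_graph"),
    (("is_graph", "is_large"), "needs_graph_store"),
]
FACTS = {"has_nodes", "has_edges", "is_large"}

proved = backward_chain("needs_graph_store", RULES, FACTS)
print(proved)
```

The recursion builds exactly the tree of antecedents described above: the goal is the root, each matching rule contributes its antecedents as children, and the leaves must be facts.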
Random Walks on Big Data
Random walks on RDF graphs as a means of:
● Belief materialization
● Soft inference
[Diagram: a graph over nodes a, b, c, d, e and f with edges labeled R]
Assuming transitivity of R
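A random walk turns the yes/no question "is f reachable from a through R?" into a score between 0 and 1. The sketch below assumes a small directed graph; the edge list is hypothetical and is not the exact graph drawn on the slide, and `reach_score` is an illustrative helper rather than a published algorithm.

```python
import random

# Hypothetical edges for a relation R; not the exact graph from the slide.
EDGES = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d", "e"],
    "d": ["f"],
    "e": ["f"],
    "f": [],
}

def reach_score(source, target, edges, walks=2000, max_steps=5, seed=0):
    """Fraction of random walks from `source` that hit `target` within max_steps."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(walks):
        node = source
        for _ in range(max_steps):
            neighbours = edges[node]
            if not neighbours:
                break                     # dead end: this walk stops
            node = rng.choice(neighbours)
            if node == target:
                hits += 1
                break
    return hits / walks

score_af = reach_score("a", "f", EDGES)   # every walk from a ends in f
score_ae = reach_score("a", "e", EDGES)   # only walks through c can reach e
print(score_af, score_ae)
```

A high score can be read as strong support for materializing R(a, f) under transitivity, while a lower score such as the one for R(a, e) gives a softer, graded belief.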
Random Walks on Big Data
Large scale graph processing solutions for 
scaling random walks over Big Data: 
● Apache Giraph http://giraph.apache.org/ 
● Pregel [Malewicz et al., 2010]
● Grappa http://www.cs.washington.edu/node/4217/ 
MCMC
A “generic” problem solving method based on local 
sampling, useful for soft inferences on semantic data
Time homogeneous Markov Chain: $P(X_{t+1} = j \mid X_t = i) = p_{ij}$, independent of $t$
MCMC
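A homogeneous Markov chain can be represented as a set of “states” and “transition probabilities” across states
Given an initial “prior” probability distribution $\pi_0$ across states and the matrix $P$ of transition probabilities, the “stationary distribution” or “equilibrium condition” $\pi$ is defined as:
$\pi = \pi P$
For an ergodic chain, $\pi_0 P^n$ converges to $\pi$ as $n \to \infty$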
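The stationary distribution of a small chain can be found numerically by applying the transition probabilities repeatedly to a prior until nothing changes. This is a minimal sketch assuming the transition probabilities are given as a row-stochastic matrix (called `P` here) and that the chain is ergodic; the two-state chain and the helper names are hypothetical.

```python
def step(dist, P):
    """One step of the chain: new_j = sum_i dist_i * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def stationary(P, prior, tol=1e-12, max_iter=10_000):
    """Iterate the chain from `prior` until the distribution stops changing."""
    dist = list(prior)
    for _ in range(max_iter):
        nxt = step(dist, P)
        if max(abs(a - b) for a, b in zip(dist, nxt)) < tol:
            return nxt
        dist = nxt
    return dist

# Hypothetical two-state chain; each row of P sums to 1.
P = [[0.9, 0.1],
     [0.5, 0.5]]
pi = stationary(P, prior=[1.0, 0.0])
print(pi)
```

For this chain the result is approximately (5/6, 1/6), and starting from a different prior such as [0.0, 1.0] leads to the same distribution, which is what makes it an equilibrium.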
MCMC
Markov Chain Monte Carlo
Given a state space S and an “equilibrium” distribution $\pi$, choose a sample s of the state space S so that a Markov chain on s results in $\pi$ as the stationary distribution
MCMC for logical inference
For a logical inference problem, the equilibrium condition would be of the form $[0,1]^m$ defined over a set of $m$ predicates
Example Sampling algorithms for MCMC
Gibbs Sampling http://en.wikipedia.org/wiki/Gibbs_sampling 
Metropolis-Hastings algorithm
http://en.wikipedia.org/wiki/Metropolis%E2%80%93Hastings_algorithm 
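The sampling idea can be sketched with the Metropolis special case of Metropolis-Hastings, in which the proposal is symmetric. The example assumes a tiny discrete state space with strictly positive, unnormalized weights; the target weights and the `metropolis_hastings` function are hypothetical and are not tied to any inference problem from the talk.

```python
import random

def metropolis_hastings(weights, steps=50_000, seed=0):
    """Sample states 0..n-1 with frequency proportional to `weights`.

    Proposal: pick a state uniformly at random (symmetric), then accept it
    with probability min(1, weights[proposed] / weights[current]).
    """
    rng = random.Random(seed)
    n = len(weights)
    current = 0
    counts = [0] * n
    for _ in range(steps):
        proposed = rng.randrange(n)
        if rng.random() < min(1.0, weights[proposed] / weights[current]):
            current = proposed
        counts[current] += 1
    return [c / steps for c in counts]

# Hypothetical unnormalized target over three states.
freqs = metropolis_hastings([1.0, 2.0, 7.0])
print(freqs)
```

The visit frequencies approach 0.1, 0.2 and 0.7 without the sampler ever computing the normalizing constant, which is the property that makes this family of methods attractive when the state space is too large to enumerate.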
Scaling MCMC for Big Data
Distributed MCMC
Several models have been explored for distributing MCMC computations over large datasets, making them amenable to diffusing computations. Examples include [Murray 2010; Singh et al. 2011]
Models for distributing MCMC are beyond the scope of this talk.
On the road ahead..
Some promising directions for Big Data and 
Semantics
● Diffusion models for large scale inference
● Cognitive models for semantics over large scale data
● Model-based reasoning and reasoning across models
● Soft (probabilistic) inferences, confidence measures, 
relevance feedback
● Continuous learning over Big Data 
Thank You!
References
● Neal Madras. Introduction to Markov Chain Monte Carlo. http://www.cs.cornell.edu/selman/cs475/lectures/intro-mcmc-lukas.pdf
● Grzegorz Malewicz, Matthew H. Austern, Aart J. C. Bik, James C. Dehnert, Ilan Horn, Naty Leiser, and Grzegorz Czajkowski. 2010. Pregel: a system for large-scale graph processing. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of data (SIGMOD '10). ACM, New York, NY, USA, 135-146. DOI=10.1145/1807167.1807184 http://doi.acm.org/10.1145/1807167.1807184
● Ni Lao, Tom Mitchell, and William W. Cohen. 2011. Random walk inference and learning in a large scale knowledge base. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '11). Association for Computational Linguistics, Stroudsburg, PA, USA, 529-539.
● Lawrence Murray. 2010. Distributed Markov Chain Monte Carlo. In Proceedings of the NIPS 2010 Workshop on Learning on Cores, Clusters and Clouds. http://lccc.eecs.berkeley.edu/
● Stefan Schoenmackers, Oren Etzioni, and Daniel S. Weld. 2008. Scaling textual inference to the web. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '08). Association for Computational Linguistics, Stroudsburg, PA, USA, 79-88.
● Stefan Schoenmackers, Oren Etzioni, Daniel S. Weld, and Jesse Davis. 2010. Learning first-order Horn clauses from web text. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing (EMNLP '10). Association for Computational Linguistics, Stroudsburg, PA, USA, 1088-1098.
● Sameer Singh, Amarnag Subramanya, Fernando Pereira, and Andrew McCallum. 2011. Large-scale cross-document coreference using distributed inference and hierarchical models. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1 (HLT '11). Association for Computational Linguistics, Stroudsburg, PA, USA, 793-803.

9. the semantic grid and autonomic grid
 
Radhakrishnan Moni
Radhakrishnan MoniRadhakrishnan Moni
Radhakrishnan Moni
 
Linked Data to Improve the OER Experience
Linked Data to Improve the OER ExperienceLinked Data to Improve the OER Experience
Linked Data to Improve the OER Experience
 
Semantic Web: Technolgies and Applications for Real-World
Semantic Web: Technolgies and Applications for Real-WorldSemantic Web: Technolgies and Applications for Real-World
Semantic Web: Technolgies and Applications for Real-World
 
Resume
ResumeResume
Resume
 
X api chinese cop monthly meeting feb.2016
X api chinese cop monthly meeting   feb.2016X api chinese cop monthly meeting   feb.2016
X api chinese cop monthly meeting feb.2016
 
Bridging the gap between the semantic web and big data: answering SPARQL que...
Bridging the gap between the semantic web and big data:  answering SPARQL que...Bridging the gap between the semantic web and big data:  answering SPARQL que...
Bridging the gap between the semantic web and big data: answering SPARQL que...
 
Big Data As a service - Sethuonline.com | Sathyabama University Chennai
Big Data As a service - Sethuonline.com | Sathyabama University ChennaiBig Data As a service - Sethuonline.com | Sathyabama University Chennai
Big Data As a service - Sethuonline.com | Sathyabama University Chennai
 
Resume_ChiungLun_Hung
Resume_ChiungLun_HungResume_ChiungLun_Hung
Resume_ChiungLun_Hung
 
NoSQL.pptx
NoSQL.pptxNoSQL.pptx
NoSQL.pptx
 
Resume_latest_22_01
Resume_latest_22_01Resume_latest_22_01
Resume_latest_22_01
 
No sql
No sqlNo sql
No sql
 
IRJET- Data Mining - Secure Keyword Manager
IRJET- Data Mining - Secure Keyword ManagerIRJET- Data Mining - Secure Keyword Manager
IRJET- Data Mining - Secure Keyword Manager
 
Database Integrated Analytics using R InitialExperiences wi
Database Integrated Analytics using R InitialExperiences wiDatabase Integrated Analytics using R InitialExperiences wi
Database Integrated Analytics using R InitialExperiences wi
 
Linked Data Generation for the University Data From Legacy Database
Linked Data Generation for the University Data From Legacy Database  Linked Data Generation for the University Data From Legacy Database
Linked Data Generation for the University Data From Legacy Database
 
FIWARE Training: Introduction to Smart Data Models
FIWARE Training: Introduction to Smart Data ModelsFIWARE Training: Introduction to Smart Data Models
FIWARE Training: Introduction to Smart Data Models
 
معرفی کاربردهای یادگیری عمیق و چالش های آن در کلان داده
معرفی کاربردهای یادگیری عمیق و چالش های آن در کلان دادهمعرفی کاربردهای یادگیری عمیق و چالش های آن در کلان داده
معرفی کاربردهای یادگیری عمیق و چالش های آن در کلان داده
 
Application Profiles
Application ProfilesApplication Profiles
Application Profiles
 

Mais de Srinath Srinivasa

Modeling sustainability in social networks
Modeling sustainability in social networksModeling sustainability in social networks
Modeling sustainability in social networksSrinath Srinivasa
 
Characterizing online social cognition
Characterizing online social cognitionCharacterizing online social cognition
Characterizing online social cognitionSrinath Srinivasa
 
Big Social Machines: Architecture and Challenges
Big Social Machines: Architecture and ChallengesBig Social Machines: Architecture and Challenges
Big Social Machines: Architecture and ChallengesSrinath Srinivasa
 
Abstraction and Expression on the Web
Abstraction and Expression on the WebAbstraction and Expression on the Web
Abstraction and Expression on the WebSrinath Srinivasa
 
The Power Law of Social Media: What CIOs Should Know
The Power Law of Social Media: What CIOs Should KnowThe Power Law of Social Media: What CIOs Should Know
The Power Law of Social Media: What CIOs Should KnowSrinath Srinivasa
 
Aggregating Operational Knowledge in Community Settings
Aggregating Operational Knowledge in Community SettingsAggregating Operational Knowledge in Community Settings
Aggregating Operational Knowledge in Community SettingsSrinath Srinivasa
 
Information Networks and Semantics
Information Networks and SemanticsInformation Networks and Semantics
Information Networks and SemanticsSrinath Srinivasa
 
Semantics hidden within co-occurrence patterns
Semantics hidden within co-occurrence patternsSemantics hidden within co-occurrence patterns
Semantics hidden within co-occurrence patternsSrinath Srinivasa
 
The open problem of open-world computing
The open problem of open-world computingThe open problem of open-world computing
The open problem of open-world computingSrinath Srinivasa
 
Trends In Graph Data Management And Mining
Trends In Graph Data Management And MiningTrends In Graph Data Management And Mining
Trends In Graph Data Management And MiningSrinath Srinivasa
 
Information Networks And Their Dynamics
Information Networks And Their DynamicsInformation Networks And Their Dynamics
Information Networks And Their DynamicsSrinath Srinivasa
 

Mais de Srinath Srinivasa (15)

AI and the sense of self
AI and the sense of selfAI and the sense of self
AI and the sense of self
 
Modeling sustainability in social networks
Modeling sustainability in social networksModeling sustainability in social networks
Modeling sustainability in social networks
 
Characterizing online social cognition
Characterizing online social cognitionCharacterizing online social cognition
Characterizing online social cognition
 
Open ended data
Open ended dataOpen ended data
Open ended data
 
The Web and the Mind
The Web and the MindThe Web and the Mind
The Web and the Mind
 
Big Social Machines: Architecture and Challenges
Big Social Machines: Architecture and ChallengesBig Social Machines: Architecture and Challenges
Big Social Machines: Architecture and Challenges
 
Abstraction and Expression on the Web
Abstraction and Expression on the WebAbstraction and Expression on the Web
Abstraction and Expression on the Web
 
Towards a "Mindful" Web
Towards a "Mindful" WebTowards a "Mindful" Web
Towards a "Mindful" Web
 
The Power Law of Social Media: What CIOs Should Know
The Power Law of Social Media: What CIOs Should KnowThe Power Law of Social Media: What CIOs Should Know
The Power Law of Social Media: What CIOs Should Know
 
Aggregating Operational Knowledge in Community Settings
Aggregating Operational Knowledge in Community SettingsAggregating Operational Knowledge in Community Settings
Aggregating Operational Knowledge in Community Settings
 
Information Networks and Semantics
Information Networks and SemanticsInformation Networks and Semantics
Information Networks and Semantics
 
Semantics hidden within co-occurrence patterns
Semantics hidden within co-occurrence patternsSemantics hidden within co-occurrence patterns
Semantics hidden within co-occurrence patterns
 
The open problem of open-world computing
The open problem of open-world computingThe open problem of open-world computing
The open problem of open-world computing
 
Trends In Graph Data Management And Mining
Trends In Graph Data Management And MiningTrends In Graph Data Management And Mining
Trends In Graph Data Management And Mining
 
Information Networks And Their Dynamics
Information Networks And Their DynamicsInformation Networks And Their Dynamics
Information Networks And Their Dynamics
 

Big Data and the Semantic Web: Challenges and Opportunities

  • 1. Big Data Tech Conclave, 26-27 April 2013, Bangalore, India. Big Data and the Semantic Web: Challenges and Opportunities. Srinath Srinivasa, Open Systems Laboratory, IIIT Bangalore. http://osl.iiitb.ac.in/ sri@iiitb.ac.in
  • 2. http://www.bda2013.net/
  • 3. OSL Releases. Topical Anchors: Given a list of noun phrases, identify a semantic topic for these terms. Powered by a Wikipedia co-occurrence graph hosted by Agama. Web APIs enable use of Topical Anchors in third-party applications.
  • 4. OSL Releases. Topic Expansion: Given a term, expands it into semantically relevant topical clusters with different senses. Uses co-occurrence datasets from Wikipedia 2006 or 2011. Web APIs enable use by third-party applications.
  • 5. OSL Releases. Agama: A graph database for storing large undirected graphs for efficient traversal (not structure-based retrieval). Currently Agama powers a co-occurrence graph of all noun phrases from Wikipedia articles, hosted in OSL, managing tens of millions of nodes and hundreds of millions of edges.
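The idea of a co-occurrence graph optimized for traversal can be sketched in a few lines of Python. This is only an illustration: the function names `build_cooccurrence_graph` and `within_hops` and the sample noun phrases are hypothetical, and the code does not reflect Agama's actual API or storage layout.

```python
from collections import defaultdict, deque
from itertools import combinations

def build_cooccurrence_graph(documents):
    """Build an undirected, weighted graph: an edge links two noun phrases
    that appear in the same document; the weight counts such documents."""
    graph = defaultdict(lambda: defaultdict(int))
    for phrases in documents:
        for a, b in combinations(sorted(set(phrases)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph

def within_hops(graph, start, max_hops):
    """Breadth-first traversal returning {node: hop distance} for every
    node reachable from `start` in at most `max_hops` edges."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, {}):
            if neighbour not in seen:
                seen[neighbour] = seen[node] + 1
                queue.append(neighbour)
    return seen

# Hypothetical noun phrases extracted from three articles.
DOCS = [
    ["graph database", "Wikipedia"],
    ["Wikipedia", "semantic web"],
    ["semantic web", "RDF"],
]
GRAPH = build_cooccurrence_graph(DOCS)
print(within_hops(GRAPH, "graph database", 2))
```

Traversal queries of this kind touch only the neighbourhood of the start node, which is why a store built for them can differ from one built for structure-based retrieval.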
  • 6. "More data beats better algorithms" meets "No data is an island"
  • 7. Outline ● Big Data Characteristics ● Big Data Analytics ● Pattern-driven and Model-driven Analytics ● Big Data and the Semantic Web ● Semantic Challenges ● The myth of a global ontology ● Convergent and divergent semantics ● Semantic interoperability ● Technology Challenges ● Storage, traversal and retrieval of large-scale semantic networks ● Inference on Big Data ● On the road ahead
  • 8. Big Data. Data that is ● Too large to be processed by conventional databases and data management techniques (Volume) ● So diverse in structure that no single data model captures all elements of the data (Variety) ● Transient and/or impermanent, especially when pertaining to dynamic phenomena (Velocity)
  • 9. Big Data ● Transaction records ● Network streams ● Experimental output ● Social media data ● Demographic records ● Citation data ● Clickstreams ● Log data ● Weather data ● …
  • 10. Some Big Data Stats ● YouTube users upload 48 hours of video every minute http://gigaom.com/2011/05/25/youtube-48-hours-of-video-per-minute/ ● Facebook data grows by 500 TB daily http://www.slashgear.com/facebook-data-grows-by-over-500-tb-daily-23243691/ ● WalMart handles more than 1 million customer transactions every hour http://www.economist.com/node/15557443 ● Akamai analyzes 75 million events per day for targeted advertising http://wikibon.org/blog/taming-big-data/ ● 90% of data in the world today was created in the last 2 years http://wikibon.org/blog/big-data-infographics/
  • 11. Big Data Analytics. Examine Big Data for useful (often actionable) knowledge. The long spectrum of Big Data Analytics, with "Pattern driven" and "Model driven" as its two ends: Pattern identification; Association rule mining; Classification/Clustering; Record Linkage; Security analytics; Complex Event Processing; Opinion mining; Predictive modeling
  • 12. Pattern Driven Analytics ● Discovery and visualization of recurring patterns in datasets ● Mostly quantitative ● Paradigms in pattern discovery: ● Sampling and aggregation ● Thresholding and filtering. Image source: Wikipedia
  • 13. Pattern Driven Analytics: Sampling and Aggregation ● Query-based pattern aggregation ● Based on an initial idea of what we are looking for. Diagram labels: Hypothesis, Data, Query, Patterns, Aggregation, Presentation
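A minimal Python sketch of the sampling-and-aggregation paradigm follows. The records, field names, and the helper functions `query` and `aggregate` are hypothetical, chosen only to show a hypothesis ("book sales differ by city") being turned into a query whose results are then aggregated.

```python
from collections import defaultdict

# Hypothetical transaction records; field names are illustrative only.
RECORDS = [
    {"city": "Bangalore", "category": "books", "amount": 250},
    {"city": "Bangalore", "category": "music", "amount": 100},
    {"city": "Chennai", "category": "books", "amount": 400},
    {"city": "Bangalore", "category": "books", "amount": 150},
]

def query(records, **conditions):
    """Select the records that match the hypothesis, given as field=value pairs."""
    return [r for r in records
            if all(r.get(field) == value for field, value in conditions.items())]

def aggregate(records, group_by, value):
    """Sum the `value` field for each distinct `group_by` key."""
    totals = defaultdict(int)
    for r in records:
        totals[r[group_by]] += r[value]
    return dict(totals)

# Hypothesis -> query -> patterns -> aggregation.
BOOK_SALES_BY_CITY = aggregate(query(RECORDS, category="books"), "city", "amount")
print(BOOK_SALES_BY_CITY)
```

The query encodes what the analyst already expects to find; records outside it are never examined.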
  • 14. Pattern Driven Analytics: Thresholding and Filtering ● Based on sifting through the entire dataset (or a view) to look for “interesting” patterns without the context of a query. Diagram labels: Data, Interestingness criteria, Patterns, Filtering and Segregation, Presentation
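Thresholding and filtering can be sketched as a scan over the whole dataset with an explicit interestingness criterion. The example below assumes minimum support over item pairs as that criterion; the basket data and the function name `frequent_pairs` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market baskets.
BASKETS = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "jam"},
    {"bread", "milk", "jam"},
]

def frequent_pairs(baskets, min_support):
    """Scan every basket, count each item pair, and keep the pairs whose
    support (fraction of baskets containing the pair) meets the threshold."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    total = len(baskets)
    return {pair: n / total for pair, n in counts.items() if n / total >= min_support}

INTERESTING = frequent_pairs(BASKETS, min_support=0.5)
print(INTERESTING)
```

No query drives the scan here; the threshold alone decides which patterns survive.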
  • 15. Model Driven Analytics. Analytics as a model-discovery problem. Diagram labels: Wedding, Observable Data, Latent Concept. Images source: Wikipedia
  • 16. Model Driven Analytics ● Pattern discovery coupled with semantic modeling ● Non-trivial qualitative modeling challenges ● Model discovery: ● Descriptive model discovery: fit a model to explain the observed data ● Predictive model discovery: discover a model that can predict values of data elements into the future
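The distinction between descriptive and predictive model discovery can be illustrated with an ordinary least-squares line, a deliberately simple stand-in for the much richer models the slide has in mind. The data and the names `fit_line` and `predict` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical observations: day number and items uploaded that day.
DAYS = [1, 2, 3, 4, 5]
UPLOADS = [12, 14, 16, 18, 20]

# Descriptive use: the fitted parameters explain the observed data.
SLOPE, INTERCEPT = fit_line(DAYS, UPLOADS)

def predict(day):
    """Predictive use: extrapolate the fitted model to an unseen day."""
    return SLOPE * day + INTERCEPT

print(SLOPE, INTERCEPT, predict(6))
```

The same fitted parameters serve both purposes; what changes is whether they are read as an explanation of past data or applied to future values.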
  • 17. Linked Data. The Linked Data Cloud as of September 2011. Image source: Wikipedia
  • 18. Linked Data ● Using Semantic Web technologies to connect data elements from disparate data sources ● From Web of Documents to Web of Data ● Elements of Linked Data ● URIs ● HTTP ● Resource Description Framework (RDF) ● Serialization formats (RDFa, RDF/XML, N3, Turtle, and others)
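To make the elements of Linked Data concrete, the sketch below represents two RDF statements as Python tuples and writes them out in N-Triples, a line-based format that is a subset of Turtle. The DBpedia-style URIs are illustrative assumptions, the helper names `is_uri` and `to_ntriples` are hypothetical, and only plain string literals are handled (no datatypes or language tags).

```python
def is_uri(term):
    """Treat any term starting with an HTTP(S) scheme as a URI; else a literal."""
    return term.startswith("http://") or term.startswith("https://")

def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples as N-Triples lines."""
    lines = []
    for s, p, o in triples:
        if is_uri(o):
            obj = f"<{o}>"
        else:
            # Escape backslashes and double quotes inside plain literals.
            escaped = o.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)

# Illustrative statements; the resource and property URIs are assumptions.
TRIPLES = [
    ("http://dbpedia.org/resource/Bangalore",
     "http://dbpedia.org/ontology/country",
     "http://dbpedia.org/resource/India"),
    ("http://dbpedia.org/resource/Bangalore",
     "http://www.w3.org/2000/01/rdf-schema#label",
     "Bangalore"),
]
print(to_ntriples(TRIPLES))
```

Because subjects, predicates, and most objects are URIs that can be dereferenced over HTTP, statements published by different sources can refer to the same resource and be merged.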
  • 19. Big Data and the Semantic Web. Diagram labels: Big Data, Semantic Web, Model Discovery, Catalyzation and Predictive Modeling
  • 20. Big Data to Semantic Web ● One of the main elements of the Linked Data Cloud, DBpedia, is built from a Big Data resource: Wikipedia ● Open Biomedical Ontology (OBO) (http://www.oboedit.org/) created from mining PubMed publications ● Enterprise-scale Big Data Analytics helping build organizational models, operational intelligence solutions, etc. Examples: Anzo software suite by Cambridge Semantics (www.cambridgesemantics.com), Loom data management suite by Revelytix (www.revelytix.com)
  • 21. Semantic Web to Big Data. Schema.org ● Collection of schemata on various topics that are recognized by major search providers and used to semantically interpret web content. SourceMap ● Linked data augmented with web content and crowdsourced data, used to provide details about companies such as their carbon footprint, energy use, water use, etc. www.sourcemap.com. OpenStreetMap ● Linked data augmenting crowdsourced data on www.openstreetmap.org helped in detailed mapping of the disaster scenario during the January 2010 Haiti earthquake (http://www.scientificamerican.com/article.cfm?id=berners-lee-linked-data)
  • 22. Big Data and the Semantic Web: Challenges. Semantic challenges ● The myth of a global ontology ● Convergent and divergent semantics. Technology and system challenges ● Characteristics of a semantic graph ● Managing graph-structured data
  • 23. The Myth of a Global Ontology. Several “core” semantic ontologies exist: ● WordNet ● YAGO ● OpenCyc ● SUMO. However, none of them (even automated ones) can capture all possible semantic associations and all possible perspectives on a given topic
  • 24. The Myth of a Global Ontology. The open world problem ● We don't know what we don't know ● Representation bias in big data sources. The neutral-but-useless perspective ● Localized, utilitarian descriptions are often more useful than neutral, global descriptions. Example: use of “zones” as a geographical element in Indian Railways ● Difficult for disparate perspectives to co-exist in a single ontology without violating design principles like Occam's razor
  • 25. Convergent and Divergent Semantics. Encyclopedic semantics: the Wikipedia article on the West Bank conflict, with several points of view converging on it: Palestine POV, Israeli POV, Historians' POV, UN's POV
  • 26. Convergent and Divergent Semantics. The IPL event schedule, with several uses diverging from it: Traffic planning, Advertisement planning around IPL, Legal structuring around IPL, TV programme scheduling, Security planning
  • 27. Semantic Interoperability ● Binary predicates, as used in RDF, may not capture the complete semantics of an association. But it is too difficult to work with higher-order predicates ● Semantic queries are characterized by contextual relevance and default assumptions ● Linked Data can be useful primarily within the context of a model. Model-building from predicates is as complex a problem as identifying predicates from data
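The limitation of binary predicates can be seen with a fact involving three participants, such as "Alice sold a car to Bob for 5000". A common workaround, sketched below, is to introduce an intermediate node for the event and attach each participant to it with its own binary predicate. The function name `reify`, the role names, and the identifiers are hypothetical.

```python
def reify(event_id, relation, **roles):
    """Decompose an n-ary fact into binary triples that share one event node."""
    node = f"_:{event_id}"
    triples = [(node, "rdf:type", relation)]
    triples += [(node, role, value) for role, value in roles.items()]
    return triples

# One four-place fact becomes five binary statements about the same node.
SALE = reify("sale1", "Sale", seller="Alice", buyer="Bob", item="car", price=5000)
for triple in SALE:
    print(triple)
```

The decomposition keeps everything binary, but the meaning of the fact is now spread over several triples that are only useful when read together, which is one reason a surrounding model is needed to interpret Linked Data.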
  • 28. Semantic Challenges: Summary ● Hard to distinguish data from noise without a model. Especially hard when we are using data to help build a model! ● There may not be a single global model explaining the data ● Model construction is as challenging as predicate mining, if not more so ● No clarity on the underlying processes that aid in knowledge aggregation. Knowledge aggregation happens differently depending on the kind of knowledge being aggregated (encyclopedic versus operational knowledge)
  • 29. Tech Challenges: Storing Big Semantic Data ● Semantic data lacks the physical access coherence needed for efficient storage in relational tables ● Logical proximity of triples is more important than physical proximity ● Read/write storage models change logical proximity ● RDF graphs tend to be extremely dense and/or clustered ● Need efficient methods of graph storage and retrieval
  • 30. Semantic store for Big Data ● Databases optimized to store and retrieve interrelated sets of triples of the form (subject, predicate, object) ● Query models based on answering graph queries (usually in SPARQL) rather than SQL queries ● Main design criteria: storage and read-ahead policies for triples based on their logical proximity rather than physical proximity, in order to enable Bulk Synchronous Parallel (BSP) processing
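A toy in-memory triple store shows what "logical proximity" can mean in practice: triples are grouped by subject, by predicate, and by object, so that a pattern with any one position bound is answered from a single index. This is a simplified sketch under that assumption; the class name `TripleStore` and its methods are hypothetical, and it does not address disk layout, read-ahead, or BSP processing.

```python
from collections import defaultdict

class TripleStore:
    """In-memory store keeping three indexes: SPO, POS and OSP."""

    def __init__(self):
        self._spo = defaultdict(lambda: defaultdict(set))
        self._pos = defaultdict(lambda: defaultdict(set))
        self._osp = defaultdict(lambda: defaultdict(set))

    def add(self, s, p, o):
        """Insert one triple into all three indexes."""
        self._spo[s][p].add(o)
        self._pos[p][o].add(s)
        self._osp[o][s].add(p)

    def match(self, s=None, p=None, o=None):
        """Yield triples matching the pattern; None acts as a wildcard."""
        if s is not None:
            for p2, objects in self._spo.get(s, {}).items():
                if p is None or p2 == p:
                    for o2 in objects:
                        if o is None or o2 == o:
                            yield (s, p2, o2)
        elif p is not None:
            for o2, subjects in self._pos.get(p, {}).items():
                if o is None or o2 == o:
                    for s2 in subjects:
                        yield (s2, p, o2)
        elif o is not None:
            for s2, predicates in self._osp.get(o, {}).items():
                for p2 in predicates:
                    yield (s2, p2, o)
        else:
            for s2, by_predicate in self._spo.items():
                for p2, objects in by_predicate.items():
                    for o2 in objects:
                        yield (s2, p2, o2)

STORE = TripleStore()
STORE.add("Agama", "isA", "GraphDatabase")
STORE.add("Neo4j", "isA", "GraphDatabase")
STORE.add("Neo4j", "writtenIn", "Java")
print(sorted(STORE.match(p="isA", o="GraphDatabase")))
```

A pattern such as `match(p="isA", o="GraphDatabase")` corresponds roughly to a single SPARQL triple pattern with a variable subject; real stores combine many such patterns into graph queries.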
  • 31. Semantic store for Big Data: AllegroGraph (http://www.franz.com/agraph/allegrograph/)
    ● NoSQL graph-based native storage for RDF triples
    ● ACID compliant
    ● Interfaces with Solr for free-text indexing
    ● Triple-level and text-level indexing
    ● MongoDB integration
    ● RDFS++ reasoning with dynamic materialization
    ● SPARQL queries on named graphs and a Prolog-based inferencing engine
  • 32. Semantic store for Big Data: Sesame (http://www.openrdf.org/)
    ● Open source Java framework for parsing, storing, querying and inferencing over RDF data
    ● Collections of RDF triples can be manipulated in memory using a graph data model
    ● Compliant with the SPARQL 1.1 protocol recommendation
    ● Provides two levels of APIs: SAIL (Storage and Inference Layer) for low-level RDF processing, and the Repository layer for programmatic interfacing with Sesame
  • 33. Semantic store for Big Data: Mulgara (http://www.mulgara.org/)
    ● Native storage model for RDF
    ● Supports multiple models (databases) per server
    ● ACID transactions and concurrency support
    ● Copy-on-write cache semantics
    ● Full-text search and support for data types
    ● Primarily useful as a repository; no evidence of support for logical inferences over RDF
  • 34. Semantic store for Big Data: Other examples
    ● InfiniteGraph from Objectivity http://www.objectivity.com/
    ● Big-Data http://www.bigdata.com/bigdata/blog/
      – A high scale-out storage and computing engine
    ● Agama https://github.com/arrac/agama/wiki/Agama
      – Storage, search and traversal support (Ruby library) for very large graphs
    ● Neo4j http://www.neo4j.org/
      – Embedded, disk-based transactional graph database written in Java
  • 35. Logical inference over Big Data
    ● Problem: find factual answers to specific questions by reasoning over large-scale data
    ● Performing extremely large-scale deductions over large semantic datasets in interactive response time
    ● Need to contend with potentially inconsistent predicates, incomplete or missing values, and default assumptions
    ● Varieties of inference over datasets
      – Deduction
      – Induction
      – Abduction
      – Statistical inference
  • 36. Logical inference over Big Data
    Common approaches for scalable inferencing:
    ● Horn clause inferencing
    ● Variants of random walks on knowledge graphs
    ● Distributed MCMC (Markov Chain Monte Carlo) methods
  • 37. Horn Clauses
    Horn clauses are predicates of the form $p_1 \wedge p_2 \wedge \dots \wedge p_n \rightarrow u$, where each term is an atomic sentence with no negation and there is a single consequent $u$.
    Horn clause knowledge bases can be resolved using “backward chaining”: start from the consequent and build a tree of antecedents until they are grounded in facts.
    Horn clause resolution can be scaled over large datasets by parallelizing resolutions using MapReduce.
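Backward chaining as described on this slide can be sketched as follows. This is a minimal, single-process illustration restricted to propositional clauses (no variables or unification, and no MapReduce parallelism); the function name `backward_chain` and the sample facts and rules are hypothetical.

```python
def backward_chain(goal, facts, rules, seen=frozenset()):
    """Return True if `goal` follows from `facts` under Horn `rules`.

    `rules` maps a consequent to a list of antecedent lists, so
    {"u": [["p1", "p2"]]} encodes the clause p1 AND p2 -> u.
    """
    if goal in facts:          # grounded in a known fact
        return True
    if goal in seen:           # already on this proof path: avoid cycles
        return False
    seen = seen | {goal}
    # Try every clause whose consequent is the goal; the goal holds if
    # all antecedents of at least one such clause can be proved.
    for antecedents in rules.get(goal, []):
        if all(backward_chain(a, facts, rules, seen) for a in antecedents):
            return True
    return False


facts = {"p1", "p2", "q"}
rules = {
    "u": [["p1", "p2", "r"]],   # p1 AND p2 AND r -> u
    "r": [["s"], ["q"]],        # s -> r, q -> r
}

print(backward_chain("u", facts, rules))
```

Each recursive call expands one node of the antecedent tree; the independent sub-goals of a clause are what a parallel implementation could resolve separately.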
  • 38. Random Walks on Big Data
    Random walks on RDF graphs as a means of:
    ● Belief materialization
    ● Soft inference
    [Figure: graph over nodes a, b, c, d, e, f with edges labeled R, assuming transitivity of R]
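One way to read "soft inference" here: if R is assumed transitive, the fraction of bounded random walks from x that reach y can serve as a confidence score for R(x, y). The sketch below makes that assumption explicit. The edge list is invented for illustration (the layout of the slide's figure is not recoverable), and `walk_hit_rate` is a hypothetical helper, not an algorithm taken from the talk.

```python
import random


def walk_hit_rate(edges, start, target, max_steps=4, walks=2000, seed=0):
    """Estimate how often a random walk from `start` reaches `target`.

    `edges` maps a node to the nodes it is R-related to. The hit rate is
    used as a soft confidence that R(start, target) holds by transitivity.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(walks):
        node = start
        for _step in range(max_steps):
            successors = edges.get(node, [])
            if not successors:      # dead end: this walk fails
                break
            node = rng.choice(successors)
            if node == target:
                hits += 1
                break
    return hits / walks


# Hypothetical R edges over the nodes a..f.
edges = {"a": ["b", "c"], "b": ["d"], "c": ["d", "e"], "e": ["f"]}

score_ad = walk_hit_rate(edges, "a", "d")
score_af = walk_hit_rate(edges, "a", "f")
print(score_ad, score_af)
```

In this toy graph more paths lead from a to d than from a to f, so R(a, d) receives the higher score.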
  • 39. Random Walks on Big Data
    Large-scale graph processing solutions for scaling random walks over Big Data:
    ● Apache Giraph http://giraph.apache.org/
    ● Pregel [Malewicz et al., 2010]
    ● Grappa http://www.cs.washington.edu/node/4217/
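Pregel and Giraph follow a vertex-centric, bulk synchronous model: computation proceeds in supersteps, each vertex reads the messages sent to it in the previous superstep, updates its value, and sends messages along its edges. The sketch below simulates that model in a single process using the maximum-value propagation example; it does not use the Giraph or Pregel APIs, and the function `bsp_max_value` and the sample graph are hypothetical.

```python
def bsp_max_value(graph, values):
    """Propagate the maximum vertex value through the graph in supersteps.

    graph: vertex -> list of out-neighbours (every neighbour must be a key).
    values: vertex -> initial number.
    """
    values = dict(values)
    # Superstep 0: every vertex sends its value to its neighbours.
    inbox = {v: [] for v in graph}
    for v, neighbours in graph.items():
        for n in neighbours:
            inbox[n].append(values[v])
    supersteps = 1
    # Continue until no messages are in flight (all vertices have halted).
    while any(inbox.values()):
        outbox = {v: [] for v in graph}
        for v, messages in inbox.items():
            # A vertex only reacts if it learned of a larger value.
            if messages and max(messages) > values[v]:
                values[v] = max(messages)
                for n in graph[v]:
                    outbox[n].append(values[v])
        inbox = outbox  # synchronization barrier between supersteps
        supersteps += 1
    return values, supersteps


graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
final_values, steps = bsp_max_value(graph, {"a": 3, "b": 6, "c": 2, "d": 1})
print(final_values, steps)
```

A random walk fits the same pattern: the "message" is a walker, and each vertex forwards incoming walkers to a randomly chosen neighbour in the next superstep.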
  • 40. MCMC
    A “generic” problem-solving method based on local sampling, useful for soft inferences on semantic data
    Time-homogeneous Markov chain:
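The formula from this slide is not present in the transcript. The standard textbook condition for time homogeneity, which may differ in notation from what was shown in the talk, is that the transition probabilities do not depend on the time step $n$:

```latex
\Pr(X_{n+1} = j \mid X_n = i) \;=\; \Pr(X_1 = j \mid X_0 = i) \;=\; p_{ij}
\qquad \text{for all } n \ge 0 .
```

Here $X_n$ denotes the state at step $n$ and $p_{ij}$ the fixed probability of moving from state $i$ to state $j$; both symbols are assumed notation.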
  • 41. MCMC
    A homogeneous Markov chain can be represented as a set of “states” and “transition probabilities” across states
    Given an initial “prior” probability distribution across states, the “stationary distribution” or “equilibrium condition” is defined as:
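The symbols on this slide are not present in the transcript, so the following uses assumed notation: a row-stochastic transition matrix $P$, a prior $\pi_0$, and a stationary distribution $\pi$ satisfying $\pi P = \pi$. For a well-behaved (irreducible, aperiodic) chain, repeatedly applying $P$ to any prior converges to $\pi$. The sketch below demonstrates this by power iteration; the function names and the two-state matrix are hypothetical.

```python
def step(dist, P):
    """Apply one transition: return the row vector dist * P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def stationary(prior, P, tol=1e-12, max_iter=10_000):
    """Iterate dist <- dist * P from `prior` until it stops changing."""
    dist = list(prior)
    for _ in range(max_iter):
        nxt = step(dist, P)
        if max(abs(a - b) for a, b in zip(dist, nxt)) < tol:
            return nxt
        dist = nxt
    return dist


# Hypothetical two-state chain; each row sums to 1.
P = [[0.9, 0.1],
     [0.5, 0.5]]

pi_from_first = stationary([1.0, 0.0], P)
pi_from_second = stationary([0.0, 1.0], P)
print(pi_from_first, pi_from_second)
```

Both priors lead to the same limit, which is what makes the stationary distribution a property of the chain rather than of the starting point.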
  • 42. MCMC
    Markov Chain Monte Carlo
    Given a state space S and a target “equilibrium” distribution, choose a sample s of the state space S so that a Markov chain on s has that distribution as its stationary distribution
    MCMC for logical inference
    For a logical inference problem, the equilibrium condition would be of the form $[0,1]^m$, defined over a set of m predicates
    Example sampling algorithms for MCMC
    ● Gibbs sampling http://en.wikipedia.org/wiki/Gibbs_sampling
    ● Metropolis-Hastings algorithm http://en.wikipedia.org/wiki/Metropolis%E2%80%93Hastings_algorithm
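A minimal sketch of the Metropolis-Hastings idea on a small discrete state space, assuming a symmetric proposal (step to a neighbouring state on a ring), in which case the acceptance ratio reduces to the ratio of target weights. The target weights, the function name, and the ring topology are hypothetical choices for illustration; the weights must be positive and need not be normalized.

```python
import random
from collections import Counter


def metropolis_hastings(weights, n_samples, seed=0):
    """Sample states 0..k-1 with probability proportional to `weights`."""
    rng = random.Random(seed)
    k = len(weights)
    state = 0
    samples = []
    for _ in range(n_samples):
        # Symmetric proposal: move one step left or right on a ring.
        proposal = (state + rng.choice((-1, 1))) % k
        # Accept with probability min(1, target(proposal) / target(state)).
        accept = min(1.0, weights[proposal] / weights[state])
        if rng.random() < accept:
            state = proposal
        samples.append(state)  # a rejected move repeats the current state
    return samples


weights = [1, 2, 4, 2, 1]          # unnormalized target distribution
samples = metropolis_hastings(weights, 50_000)
freq = Counter(samples)
estimate = [freq[i] / len(samples) for i in range(len(weights))]
print(estimate)
```

The empirical frequencies approach the normalized weights as the number of samples grows, even though the sampler only ever compares two neighbouring states at a time.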
  • 43. Scaling MCMC for Big Data
    Distributed MCMC
    Several models have been explored for distributing MCMC computations over large datasets, making them amenable to diffusing computations. Examples include [Murray 2010; Singh et al. 2011].
    Models for distributing MCMC are beyond the scope of this talk.
  • 44. On the road ahead
    Some promising directions for Big Data and Semantics:
    ● Diffusion models for large-scale inference
    ● Cognitive models for semantics over large-scale data
    ● Model-based reasoning and reasoning across models
    ● Soft (probabilistic) inferences, confidence measures, relevance feedback
    ● Continuous learning over Big Data
  • 45. Thank You!
  • 46. References
    ● Neal Madras. Introduction to Markov Chain Monte Carlo. http://www.cs.cornell.edu/selman/cs475/lectures/intro-mcmc-lukas.pdf
    ● Grzegorz Malewicz, Matthew H. Austern, Aart J.C. Bik, James C. Dehnert, Ilan Horn, Naty Leiser, and Grzegorz Czajkowski. 2010. Pregel: a system for large-scale graph processing. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data (SIGMOD '10). ACM, New York, NY, USA, 135-146. DOI=10.1145/1807167.1807184 http://doi.acm.org/10.1145/1807167.1807184
    ● Ni Lao, Tom Mitchell, and William W. Cohen. 2011. Random walk inference and learning in a large scale knowledge base. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '11). Association for Computational Linguistics, Stroudsburg, PA, USA, 529-539.
    ● Lawrence Murray. 2010. Distributed Markov Chain Monte Carlo. In Proceedings of the NIPS 2010 Workshop on Learning on Cores, Clusters and Clouds. http://lccc.eecs.berkeley.edu/
    ● Stefan Schoenmackers, Oren Etzioni, and Daniel S. Weld. 2008. Scaling textual inference to the web. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '08). Association for Computational Linguistics, Stroudsburg, PA, USA, 79-88.
    ● Stefan Schoenmackers, Oren Etzioni, Daniel S. Weld, and Jesse Davis. 2010. Learning first-order Horn clauses from web text. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing (EMNLP '10). Association for Computational Linguistics, Stroudsburg, PA, USA, 1088-1098.
    ● Sameer Singh, Amarnag Subramanya, Fernando Pereira, and Andrew McCallum. 2011. Large-scale cross-document coreference using distributed inference and hierarchical models. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1 (HLT '11). Association for Computational Linguistics, Stroudsburg, PA, USA, 793-803.