May 2013
From Big Data to Smart Data
Marin Dimitrov - CTO
About Ontotext
• Provides products and services for creating, managing and exploiting semantic data
– Founded in 2000
– Offices in Bulgaria, USA and UK
• Major clients and industries
– Media & Publishing (BBC, Press Association, EuroMoney, NDP Nieuwsmedia)
– Healthcare & Life Sciences (AstraZeneca, UCB, NIBIO)
– Cultural Heritage (The British Museum, The National Archives, Polish National Museum, Dutch Public Library)
– Government (UK Parliament, United Nations FAO, LMI)
From Big Data to Smart Data (Semantic Days 2013), May 2013
Contents
• The Problem with Big Data for BI
• From Big Data to Smart Data
• Success Stories by Ontotext
BIG DATA FOR BUSINESS INTELLIGENCE
The Problem with Big Data for BI
• It’s not only about Volume, Velocity & Variety
• Too much focus on processing speed & storage volume
• “Brute force” approaches increase the amount of data processed…
– But not necessarily the Value & insight derived from data
– May lead to even more data quality & inconsistency problems
– Problems with data visualisation & exploration
– Often do not lead to better decision making
The Problem with Big Data for BI
• BI success is not measured by Volume, Velocity & Variety, but by more derived Value
• Organisations should learn how to better utilise their “small data” before targeting Big Data
– Quality over quantity
– Better understanding of the data leads to better decision making
– Avoid “needle in a haystack” situations
Smart Data for Better BI
• Efficiently analyse unstructured data
– Most enterprise data is still unstructured
– Even within structured & transactional data sources there is a lot of embedded unstructured data
– … and this unstructured data is poorly analysed (if at all), so a lot of potential value remains locked
– (sometimes even within semantic / Linked Data with insufficient granularity)
Smart Data for Better BI
• Focus on metadata first, Big Data later
– (As opposed to: Big Data first, metadata later)
• Enrich data
• Interlink data
• Provide a common metadata layer
– Break legacy silos
– Align heterogeneous metadata if necessary
• Better analysis of the data, better insight
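A minimal sketch of what a common metadata layer might look like in practice, assuming two hypothetical silos ("crm" and "billing") whose field names are invented for illustration. Each record is first aligned to a shared vocabulary, then records describing the same entity are interlinked by a normalised name. Real reconciliation is far more involved; this only illustrates the idea.

```python
from collections import defaultdict

# Hypothetical field mappings from two legacy silos to one shared vocabulary.
FIELD_MAP = {
    "crm": {"cust_name": "name", "cust_city": "city"},
    "billing": {"customer": "name", "town": "city", "total": "amount_due"},
}


def to_common(source, record):
    """Rename a record's source-specific fields to the shared vocabulary."""
    mapping = FIELD_MAP[source]
    return {mapping[k]: v for k, v in record.items() if k in mapping}


def interlink(records):
    """Merge records that share the same (case-insensitive) name."""
    merged = defaultdict(dict)
    for rec in records:
        merged[rec["name"].lower()].update(rec)
    return dict(merged)


crm = to_common("crm", {"cust_name": "Acme Ltd", "cust_city": "London"})
billing = to_common(
    "billing", {"customer": "ACME LTD", "town": "London", "total": 1200}
)
linked = interlink([crm, billing])
print(linked["acme ltd"])
```

Once both silos share field names and a linking key, a single query can combine the city from one source with the amount due from the other.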
SUCCESS STORIES
UK Job Market Intelligence
• Comprehensive recruitment database for the UK
– 4 million job ads / vacancies (dynamic)
– 220,000 company websites & 700 job boards monitored
• Questions we can answer
– What skills are in demand at present?
– Which are the top job boards in a region?
– Which is the right job board for your industry sector?
– Which are the most active job advertisers / employers?
– Which are the agencies and employers that do not advertise on your job board?
UK Job Market Intelligence
• Technology stack
– Web mining & focussed crawling
– KB construction from open & proprietary data sources
– Skills taxonomy (based on DISCO)
– Text mining & semantic enrichment
– Reconciliation & interlinking
– BI reporting & dashboards
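The enrichment and reporting steps above can be sketched roughly as follows. This is an illustrative toy, not the system described in the slides: the mini-taxonomy, the sample ads and the function names are all assumptions. It tags each job ad with canonical skills from a small taxonomy, then aggregates the tags to answer "what skills are in demand?".

```python
import re
from collections import Counter

# Hypothetical mini-taxonomy: surface form -> canonical skill label.
SKILLS = {
    "java": "Java",
    "sql": "SQL",
    "project management": "Project management",
    "pm": "Project management",
}


def extract_skills(text):
    """Return the set of canonical skills mentioned in a job ad."""
    found = set()
    lowered = text.lower()
    for surface, canonical in SKILLS.items():
        # Word boundaries stop "java" from matching inside "javascript".
        if re.search(r"\b" + re.escape(surface) + r"\b", lowered):
            found.add(canonical)
    return found


def skill_demand(ads):
    """Count how many ads mention each canonical skill."""
    counts = Counter()
    for ad in ads:
        counts.update(extract_skills(ad))
    return counts


ads = [
    "Senior Java developer with SQL experience",
    "PM needed; project management certification preferred",
    "JavaScript engineer",
    "Data analyst (SQL)",
]
demand = skill_demand(ads)
print(demand.most_common())
```

Mapping several surface forms ("pm", "project management") to one canonical label is what lets the counts reflect skills rather than spellings.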
Asset Recovery Intelligence System (ARIS)
• Support Financial Intelligence Units in tracking stolen assets and fighting corruption & money laundering
• Questions we can answer
– What are the reported activities related to a person?
– What is the person’s personal/professional network?
– What corruption cases are reported in regional news?
• Data sources
– News feeds from major news agencies
– Dow Jones data & news feeds
– Suspicious Activity Reports (SARs) submitted to the FIU
– Open data (people & companies, Wikipedia)
Asset Recovery Intelligence System (ARIS)
• Technology stack
– Web Mining
– Text mining & semantic enrichment (KIM)
– ARIS ontology
• People, companies, assets, relations, financial transactions, …
– Reconciliation & Interlinking
– Triplestore (OWLIM)
– Semantic search & exploration UX
– BI reporting / factsheets / alerts
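To make the ontology and triplestore items concrete, here is a minimal in-memory sketch of subject–predicate–object triples and a pattern match over them, used to answer a question like "what is the person's network?". The identifiers and predicate names are hypothetical and are not taken from the ARIS ontology; a real deployment would hold the data in a triplestore and query it with SPARQL rather than Python lists.

```python
# Hypothetical triples; predicate names are illustrative, not the ARIS ontology.
TRIPLES = [
    ("person:A", "directorOf", "company:X"),
    ("person:B", "directorOf", "company:X"),
    ("person:B", "ownerOf", "asset:Yacht1"),
    ("person:C", "relativeOf", "person:A"),
    ("company:X", "transferredFundsTo", "company:Y"),
]


def match(s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def network(person):
    """People linked to `person` directly or through one shared entity."""
    linked = set()
    # Every node that appears in a triple together with the person.
    neighbours = {t[2] for t in match(s=person)} | {t[0] for t in match(o=person)}
    for node in neighbours:
        if node.startswith("person:"):
            linked.add(node)
        else:
            # Other people pointing at the same company, asset, etc.
            linked |= {t[0] for t in match(o=node) if t[0].startswith("person:")}
    linked.discard(person)
    return linked


print(sorted(network("person:A")))
```

Because relations are stored as uniform triples, adding a new relation type (for example a financial transaction) needs no schema change, only new triples.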
Semantic Information Integration & Enrichment
Q & A
Thank you!
@ontotext
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 

Último (20)

GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 

From Big Data to Smart Data

  • 1. From Big Data to Smart Data
      Marin Dimitrov - CTO
      May 2013
  • 2. About Ontotext
      • Provides products and services for creating, managing and exploiting semantic data
          – Founded in 2000
          – Offices in Bulgaria, USA and UK
      • Major clients and industries
          – Media & Publishing (BBC, Press Association, EuroMoney, NDP Nieuwsmedia)
          – HCLS (AstraZeneca, UCB, NIBIO)
          – Cultural Heritage (The British Museum, The National Archives, Polish National Museum, Dutch Public Library)
          – Government (UK Parliament, United Nations FAO, LMI)
      From Big Data to Smart Data (Semantic Days 2013), May 2013
  • 3. Contents
      • The Problem with Big Data for BI
      • From Big Data to Smart Data
      • Success Stories by Ontotext
  • 4. BIG DATA FOR BUSINESS INTELLIGENCE
  • 5. The Problem with Big Data for BI
  • 6. The Problem with Big Data for BI
      • It’s not only about Volume, Velocity & Variety
      • Too much focus on processing speed & storage volume
      • “Brute force” approaches increase the amount of data processed…
          – But not necessarily the Value & insight derived from data
          – May lead to even more data quality & inconsistency problems
          – Problems with data visualisation & exploration
          – Often do not lead to better decision making
  • 7. The Problem with Big Data for BI
      • BI success is not measured by Volume, Velocity & Variety, but by more derived Value
      • Organisations should learn how to better utilise their “small data” before targeting Big Data
          – Quality over quantity
          – Better understanding of the data leads to better decision making
          – Avoid “needle in a haystack” situations
  • 8. The Problem with Big Data for BI
  • 9. Smart Data for Better BI
      • Efficiently analyse unstructured data
          – Most of the enterprise data is still unstructured
          – Even within structured & transactional data sources there is a lot of embedded unstructured data
          – … and this unstructured data is poorly analysed (if at all) => lots of potential value still remains locked
          – (sometimes even within semantic / Linked Data with insufficient granularity)
  • 10. Smart Data for Better BI
      • Focus on metadata first, Big Data later
          – (As opposed to: Big Data first, metadata later)
      • Enrich data
      • Interlink data
      • Provide a common metadata layer
          – Break legacy silos
          – Align heterogeneous metadata if necessary
      • Better analysis of the data, better insight
  • 11. SUCCESS STORIES
  • 12. UK Job Market Intelligence
      • Comprehensive recruitment database for the UK
          – 4 million job ads / vacancies (dynamic)
          – 220,000 company websites & 700 job boards monitored
      • Questions we can answer
          – What skills are in demand at present?
          – Which are the top job boards in a region?
          – Which is the right job board for your industry sector?
          – Which are the most active job advertisers / employers?
          – Which are the agencies and employers that do not advertise on your job board?
  • 13. UK Job Market Intelligence
  • 14. UK Job Market Intelligence
      • Technology stack
          – Web mining & focussed crawling
          – KB construction from open & proprietary data sources
          – Skills taxonomy (based on DISCO)
          – Text mining & semantic enrichment
          – Reconciliation & interlinking
          – BI reporting & dashboards
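As a rough illustration of how a skills taxonomy plus text mining can answer "What skills are in demand at present?", here is a minimal Python sketch. It is not Ontotext's implementation: the three-entry `SKILL_TAXONOMY`, the sample ads and the function names `tag_skills` and `skill_demand` are all hypothetical, and the toy taxonomy is not DISCO. Real enrichment would use a full text-mining pipeline rather than keyword matching.

```python
import re
from collections import Counter

# Hypothetical mini-taxonomy: canonical skill -> surface forms.
# This is an illustrative stand-in, NOT the DISCO taxonomy.
SKILL_TAXONOMY = {
    "Java": ["java", "j2ee"],
    "SQL": ["sql"],
    "Project management": ["project management", "prince2"],
}


def tag_skills(ad_text):
    """Return the set of canonical skills mentioned in one job ad."""
    text = ad_text.lower()
    found = set()
    for skill, forms in SKILL_TAXONOMY.items():
        for form in forms:
            # Require non-alphanumeric boundaries so "java" does not
            # match inside "javascript".
            pattern = r"(?<![a-z0-9])" + re.escape(form) + r"(?![a-z0-9])"
            if re.search(pattern, text):
                found.add(skill)
                break
    return found


def skill_demand(ads):
    """Count how many ads mention each skill, most frequent first."""
    counts = Counter()
    for ad in ads:
        counts.update(tag_skills(ad))
    return counts.most_common()


# Made-up sample ads, for illustration only.
SAMPLE_ADS = [
    "Senior Java developer with SQL experience",
    "JavaScript engineer",
    "PRINCE2 project manager, SQL reporting",
]

if __name__ == "__main__":
    for skill, n in skill_demand(SAMPLE_ADS):
        print(skill, n)
```

Mapping surface forms to one canonical skill before counting is what lets a report aggregate over differently worded ads.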
  • 15. UK Job Market Intelligence
  • 16. UK Job Market Intelligence
  • 17. UK Job Market Intelligence
  • 18. Asset Recovery Intelligence System (ARIS)
      • Support Financial Intelligence Units in tracking stolen assets and fighting corruption & money laundering
      • Questions we can answer
          – What are the reported activities related to a person?
          – What is the person’s personal/professional network?
          – What corruption cases are reported in regional news?
      • Data sources
          – News feeds from major news agencies
          – Dow Jones data & news feeds
          – SARs to the FIU
          – Open data (people & companies, Wikipedia)
  • 19. Asset Recovery Intelligence System (ARIS)
  • 20. Asset Recovery Intelligence System (ARIS)
      • Technology stack
          – Web Mining
          – Text mining & semantic enrichment (KIM)
          – ARIS ontology
              • People, companies, assets, relations, financial transactions, …
          – Reconciliation & Interlinking
          – Triplestore (OWLIM)
          – Semantic search & exploration UX
          – BI reporting / factsheets / alerts
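To make the "person’s personal/professional network" question concrete, the sketch below stores a few subject-predicate-object triples in memory and walks them breadth-first. It is only an illustration under stated assumptions: the entity names, the predicates and the functions `build_index` and `network` are invented for this example, and a plain Python list stands in for the triplestore; it does not use OWLIM or the actual ARIS ontology.

```python
from collections import defaultdict

# Hypothetical triples (subject, predicate, object); all names are made up.
TRIPLES = [
    ("person:A", "directorOf", "company:X"),
    ("person:B", "directorOf", "company:X"),
    ("person:B", "relativeOf", "person:C"),
    ("company:X", "owns", "asset:Yacht1"),
]


def build_index(triples):
    """Index triples in both directions so links can be followed either way."""
    neighbours = defaultdict(set)
    for subject, predicate, obj in triples:
        neighbours[subject].add((predicate, obj))
        neighbours[obj].add(("inverse:" + predicate, subject))
    return neighbours


def network(start, triples, max_hops=2):
    """Breadth-first search: map each reachable entity to its hop distance."""
    index = build_index(triples)
    distance = {start: 0}
    frontier = [start]
    for hop in range(1, max_hops + 1):
        next_frontier = []
        for node in frontier:
            for _predicate, other in index[node]:
                if other not in distance:
                    distance[other] = hop
                    next_frontier.append(other)
        frontier = next_frontier
    del distance[start]  # the start entity is not part of its own network
    return distance


if __name__ == "__main__":
    for entity, hops in sorted(network("person:A", TRIPLES).items()):
        print(entity, hops)
```

With `max_hops=2`, the search from `person:A` reaches the shared company, its other director and the asset it owns, but not `person:C`, who is three hops away.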
  • 21. Semantic Information Integration & Enrichment
  • 22. Q & A
      Thank you!
      @ontotext