Text Analytics & Linked Data Management As-a-Service
Marin Dimitrov, Alex Simov, Yavor Petkov
May 31st, 2015
Text Analytics & Linked Data Management -aaS / Wasabi’2015, May 2015
About Ontotext
• Provides products & solutions for content enrichment and metadata management
– 70 employees, headquarters in Sofia (Bulgaria)
– Sales presence in London, NYC & Boston
• Major clients and industries
– Media & Publishing
– Health Care & Life Sciences
– Cultural Heritage & Digital Libraries
– Government
– Education
Contents
• Semantic Technology adoption challenges
• The Self-Service Semantic Suite (S4)
• Lessons learned
Semantic Technology Adoption Challenges
Time-to-value gap (Gartner)
Semantic Technology adoption
From Wasabi @ ESWC’2014: Performance, Integration, Penetration, Payback & ROI
• Limiting factors
– Complexity & cost of existing solutions
– Limited resources to evaluate novel technologies (startups)
– Slow procurement processes, risk aversion (enterprises)
• How can we…
– Reduce time-to-market
– Reduce adoption risks
– Optimise costs
The Self-Service Semantic Suite (S4)
What is S4?
• Capabilities for text analytics, content enrichment and smart data management
– Text analytics for news, life sciences and social media
– RDF graph database as-a-service
– Access to large open knowledge graphs
• Available on-demand, anytime, anywhere
– Simple RESTful services
• Simple pay-per-use pricing
– No upfront commitments
Benefits
• Enables quick prototyping
– Instantly available, no provisioning & operations required
– Focus on building applications, don’t worry about infrastructure
• Free tier!
• Easy to start, shorter learning curve
– Various add-ons, SDKs and demo code
• Based on enterprise semantic technology
Text analytics with S4
• Text analytics services
– News annotation
– News categorisation
– Biomedical
– Twitter
• Entity linking & disambiguation
– Mappings to DBpedia & GeoNames instances
– Mappings to biomedical data sources (LinkedLifeData)
• HTML, MS Word, XML, plain text input
• Simple JSON output
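A minimal sketch of how a client might call such a text analytics service over REST and read entity links from the JSON output. The endpoint URL, the payload field names (`document`, `documentType`), the bearer-token header and the response shape (`entities`, `text`, `uri`) are all assumptions made for illustration; the slides do not specify them. The request is built but never sent.

```python
import json
import urllib.request

# Hypothetical endpoint: the slides give neither a URL nor a request schema.
ANNOTATE_URL = "https://s4.example.org/v1/news"


def build_annotation_request(text, api_key, document_type="text/plain"):
    """Build (but do not send) a JSON POST request for a text analytics service."""
    # Assumed field names, chosen only for this sketch.
    payload = {"document": text, "documentType": document_type}
    return urllib.request.Request(
        ANNOTATE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": "Bearer " + api_key,  # assumed auth scheme
        },
        method="POST",
    )


def entity_links(response_body):
    """Collect (surface text, linked URI) pairs from an assumed JSON response shape."""
    result = json.loads(response_body)
    return [(entity["text"], entity["uri"]) for entity in result.get("entities", [])]
```

Passing the returned object to `urllib.request.urlopen` would send it; the linked URIs would then be DBpedia, GeoNames or LinkedLifeData identifiers, as listed on the slide.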
News analytics example
S4 result
Fully managed RDF DB in the Cloud
• Low-cost graph DBaaS available 24/7
• Ideal for small & moderate data volumes
– Database options: 1M, 10M, 50M, 250M and 1B triples
• Instantly deploy new databases when needed
• Zero administration: automated operations, maintenance & upgrades
• Users pay only for the actual database utilisation
– Number of triples stored + number of queries per month
• OpenRDF REST API
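A small sketch of two points from this slide: choosing the smallest listed database option for a dataset, and addressing a repository through the OpenRDF (Sesame) REST protocol. The tier sizes come from the slide. The repository paths follow the standard Sesame HTTP protocol layout; the base URL and repository ID in the usage note are hypothetical.

```python
from urllib.parse import quote

# Database sizes listed on the slide, in triples.
TIERS = [1_000_000, 10_000_000, 50_000_000, 250_000_000, 1_000_000_000]


def smallest_tier(triple_count):
    """Return the smallest listed database size that can hold triple_count triples."""
    if triple_count < 0:
        raise ValueError("triple_count must be non-negative")
    for capacity in TIERS:
        if triple_count <= capacity:
            return capacity
    raise ValueError("dataset exceeds the largest listed option (1B triples)")


def repository_urls(base_url, repository_id):
    """Build the Sesame protocol URLs for querying, statements and size of one repository."""
    repo = base_url.rstrip("/") + "/repositories/" + quote(repository_id, safe="")
    return {
        "query": repo,                       # SPARQL queries go to the repository URL
        "statements": repo + "/statements",  # add, fetch or delete triples
        "size": repo + "/size",              # number of stored statements
    }
```

For example, `smallest_tier(12_000_000)` selects the 50M option, and `repository_urls("https://db.example.org", "demo")` yields the three URLs for a repository named `demo` on a placeholder host.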
Knowledge graphs with S4
• SPARQL query endpoint to the FactForge semantic data warehouse
– 500 million entities / 5 billion triples
• Key LOD datasets integrated
– DBpedia, Freebase/WikiData, GeoNames, WordNet
– Dublin Core, SKOS, PROTON ontologies and vocabularies
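A sketch of what querying such a SPARQL endpoint could look like from client code: the query is passed in the `query` parameter as defined by the SPARQL 1.1 Protocol, and results are read from the standard SPARQL JSON results format. The endpoint URL is a placeholder (the slide does not give one), and the sample query against a DBpedia resource is only illustrative.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint; the slide does not state the FactForge URL.
ENDPOINT = "https://factforge.example.org/sparql"

# Illustrative query: the English label of one DBpedia resource.
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?label WHERE {
  <http://dbpedia.org/resource/Sofia> rdfs:label ?label .
  FILTER (lang(?label) = "en")
} LIMIT 1
"""


def query_url(endpoint, query):
    """Build a SPARQL 1.1 Protocol GET URL with the query URL-encoded."""
    return endpoint + "?" + urlencode({"query": query})


def column(results_json, variable):
    """Extract the values bound to one variable from SPARQL JSON results."""
    document = json.loads(results_json)
    return [
        binding[variable]["value"]
        for binding in document["results"]["bindings"]
        if variable in binding  # variables may be unbound in some rows
    ]
```

A client would fetch `query_url(ENDPOINT, QUERY)` with the header `Accept: application/sparql-results+json` and pass the response body to `column(body, "label")`.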
Cloud native architecture of S4
Lessons Learned
Elasticity vs High Availability vs Cost Efficiency
Lessons learned
• You must build a “cost aware” cloud platform
• Cloud-native architectures are more efficient, but more difficult to build
• A microservices architecture improves system resilience & agility, but is difficult to design right
• Extensive and continuous benchmarking & monitoring
– Some problems emerge only at large scale
• Assume failures will happen & design for resilience
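One common way to act on "assume failures will happen" is to retry transient errors with exponential backoff and jitter. The sketch below is a generic illustration of that pattern, not a description of how S4 itself is implemented; the function name, the default delays and the set of retried exceptions are assumptions.

```python
import random
import time


def call_with_retries(operation, max_attempts=5, base_delay=0.5, max_delay=8.0,
                      retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call operation(), retrying transient failures with exponential backoff.

    The wait before retry k is drawn uniformly from [0, min(max_delay,
    base_delay * 2**(k-1))] ("full jitter"), so that many clients recovering
    at once do not retry in lockstep.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, ceiling))
```

The `sleep` parameter is injectable so the behaviour can be checked without real waiting; errors outside `retry_on` propagate immediately.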
Thank you!
#19Text Analytics & Linked Data Management -aaS / Wasabi’2015 May 2015

Mais conteúdo relacionado

Mais procurados

Дмитрий Лавриненко "Big & Fast Data for Identity & Telemetry services"
Дмитрий Лавриненко "Big & Fast Data for Identity & Telemetry services"Дмитрий Лавриненко "Big & Fast Data for Identity & Telemetry services"
Дмитрий Лавриненко "Big & Fast Data for Identity & Telemetry services"
Fwdays
 
What Data-Driven Websites Are and How They Work
What Data-Driven Websites Are and How They WorkWhat Data-Driven Websites Are and How They Work
What Data-Driven Websites Are and How They Work
Tessa Mero
 
Scylla Summit 2022: Scalable and Sustainable Supply Chains with DLT and ScyllaDB
Scylla Summit 2022: Scalable and Sustainable Supply Chains with DLT and ScyllaDBScylla Summit 2022: Scalable and Sustainable Supply Chains with DLT and ScyllaDB
Scylla Summit 2022: Scalable and Sustainable Supply Chains with DLT and ScyllaDB
ScyllaDB
 
AnzoGraph DB - SPARQL 101
AnzoGraph DB - SPARQL 101AnzoGraph DB - SPARQL 101
AnzoGraph DB - SPARQL 101
Cambridge Semantics
 
Strata+Hadoop World NY 2016 - Avinash Ramineni
Strata+Hadoop World NY 2016 - Avinash RamineniStrata+Hadoop World NY 2016 - Avinash Ramineni
Strata+Hadoop World NY 2016 - Avinash Ramineni
Avinash Ramineni
 
Simplified minimalistic workflows for the publication of Linked Open Data
Simplified minimalistic workflows for the publication of Linked Open DataSimplified minimalistic workflows for the publication of Linked Open Data
Simplified minimalistic workflows for the publication of Linked Open Data
Salvatore Virtuoso
 
PGDay.Amsterdam 2018 - Jeroen de Graaff - Step-by-step implementation of Post...
PGDay.Amsterdam 2018 - Jeroen de Graaff - Step-by-step implementation of Post...PGDay.Amsterdam 2018 - Jeroen de Graaff - Step-by-step implementation of Post...
PGDay.Amsterdam 2018 - Jeroen de Graaff - Step-by-step implementation of Post...
PGDay.Amsterdam
 
Choosing the Right Open Source Database
Choosing the Right Open Source DatabaseChoosing the Right Open Source Database
Choosing the Right Open Source Database
All Things Open
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
Md. Afif Al Mamun
 
sitMAI, Helping a Friend
sitMAI, Helping a FriendsitMAI, Helping a Friend
sitMAI, Helping a Friend
Phillip Parkinson
 
Automate your data flows with Apache NIFI
Automate your data flows with Apache NIFIAutomate your data flows with Apache NIFI
Automate your data flows with Apache NIFI
Adam Doyle
 
Sasaki practical-linked-data
Sasaki practical-linked-dataSasaki practical-linked-data
Sasaki practical-linked-data
Felix Sasaki
 
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
StampedeCon
 
Memory Database Technology is Driving a New Cycle of Business Innovation
Memory Database Technology is Driving a New Cycle of Business InnovationMemory Database Technology is Driving a New Cycle of Business Innovation
Memory Database Technology is Driving a New Cycle of Business Innovation
VoltDB
 
Drupal and the Semantic Web - ESIP Webinar
Drupal and the Semantic Web - ESIP WebinarDrupal and the Semantic Web - ESIP Webinar
Drupal and the Semantic Web - ESIP Webinar
scorlosquet
 
Data Ingestion Engine
Data Ingestion EngineData Ingestion Engine
Data Ingestion Engine
Adam Doyle
 
7 Container Design Patterns
7 Container Design Patterns7 Container Design Patterns
7 Container Design Patterns
Christian Melendez
 
Mike Stonebraker on Designing An Architecture For Real-time Event Processing
Mike Stonebraker on Designing An Architecture For Real-time Event ProcessingMike Stonebraker on Designing An Architecture For Real-time Event Processing
Mike Stonebraker on Designing An Architecture For Real-time Event Processing
VoltDB
 
Evaluation of TPC-H on Spark and Spark SQL in ALOJA
Evaluation of TPC-H on Spark and Spark SQL in ALOJAEvaluation of TPC-H on Spark and Spark SQL in ALOJA
Evaluation of TPC-H on Spark and Spark SQL in ALOJA
DataWorks Summit
 
ML Production Pipelines: A Classification Model
ML Production Pipelines: A Classification ModelML Production Pipelines: A Classification Model
ML Production Pipelines: A Classification Model
Databricks
 

Mais procurados (20)

Дмитрий Лавриненко "Big & Fast Data for Identity & Telemetry services"
Дмитрий Лавриненко "Big & Fast Data for Identity & Telemetry services"Дмитрий Лавриненко "Big & Fast Data for Identity & Telemetry services"
Дмитрий Лавриненко "Big & Fast Data for Identity & Telemetry services"
 
What Data-Driven Websites Are and How They Work
What Data-Driven Websites Are and How They WorkWhat Data-Driven Websites Are and How They Work
What Data-Driven Websites Are and How They Work
 
Scylla Summit 2022: Scalable and Sustainable Supply Chains with DLT and ScyllaDB
Scylla Summit 2022: Scalable and Sustainable Supply Chains with DLT and ScyllaDBScylla Summit 2022: Scalable and Sustainable Supply Chains with DLT and ScyllaDB
Scylla Summit 2022: Scalable and Sustainable Supply Chains with DLT and ScyllaDB
 
AnzoGraph DB - SPARQL 101
AnzoGraph DB - SPARQL 101AnzoGraph DB - SPARQL 101
AnzoGraph DB - SPARQL 101
 
Strata+Hadoop World NY 2016 - Avinash Ramineni
Strata+Hadoop World NY 2016 - Avinash RamineniStrata+Hadoop World NY 2016 - Avinash Ramineni
Strata+Hadoop World NY 2016 - Avinash Ramineni
 
Simplified minimalistic workflows for the publication of Linked Open Data
Simplified minimalistic workflows for the publication of Linked Open DataSimplified minimalistic workflows for the publication of Linked Open Data
Simplified minimalistic workflows for the publication of Linked Open Data
 
PGDay.Amsterdam 2018 - Jeroen de Graaff - Step-by-step implementation of Post...
PGDay.Amsterdam 2018 - Jeroen de Graaff - Step-by-step implementation of Post...PGDay.Amsterdam 2018 - Jeroen de Graaff - Step-by-step implementation of Post...
PGDay.Amsterdam 2018 - Jeroen de Graaff - Step-by-step implementation of Post...
 
Choosing the Right Open Source Database
Choosing the Right Open Source DatabaseChoosing the Right Open Source Database
Choosing the Right Open Source Database
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
 
sitMAI, Helping a Friend
sitMAI, Helping a FriendsitMAI, Helping a Friend
sitMAI, Helping a Friend
 
Automate your data flows with Apache NIFI
Automate your data flows with Apache NIFIAutomate your data flows with Apache NIFI
Automate your data flows with Apache NIFI
 
Sasaki practical-linked-data
Sasaki practical-linked-dataSasaki practical-linked-data
Sasaki practical-linked-data
 
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
 
Memory Database Technology is Driving a New Cycle of Business Innovation
Memory Database Technology is Driving a New Cycle of Business InnovationMemory Database Technology is Driving a New Cycle of Business Innovation
Memory Database Technology is Driving a New Cycle of Business Innovation
 
Drupal and the Semantic Web - ESIP Webinar
Drupal and the Semantic Web - ESIP WebinarDrupal and the Semantic Web - ESIP Webinar
Drupal and the Semantic Web - ESIP Webinar
 
Data Ingestion Engine
Data Ingestion EngineData Ingestion Engine
Data Ingestion Engine
 
7 Container Design Patterns
7 Container Design Patterns7 Container Design Patterns
7 Container Design Patterns
 
Mike Stonebraker on Designing An Architecture For Real-time Event Processing
Mike Stonebraker on Designing An Architecture For Real-time Event ProcessingMike Stonebraker on Designing An Architecture For Real-time Event Processing
Mike Stonebraker on Designing An Architecture For Real-time Event Processing
 
Evaluation of TPC-H on Spark and Spark SQL in ALOJA
Evaluation of TPC-H on Spark and Spark SQL in ALOJAEvaluation of TPC-H on Spark and Spark SQL in ALOJA
Evaluation of TPC-H on Spark and Spark SQL in ALOJA
 
ML Production Pipelines: A Classification Model
ML Production Pipelines: A Classification ModelML Production Pipelines: A Classification Model
ML Production Pipelines: A Classification Model
 

Destaque

Enabling Low-cost Open Data Publishing and Reuse
Enabling Low-cost Open Data Publishing and ReuseEnabling Low-cost Open Data Publishing and Reuse
Enabling Low-cost Open Data Publishing and Reuse
Marin Dimitrov
 
Ontotext in EC Funded Projects 2002-2012
Ontotext in EC Funded Projects 2002-2012Ontotext in EC Funded Projects 2002-2012
Ontotext in EC Funded Projects 2002-2012
Marin Dimitrov
 
S4: The Self-Service Semantic Suite
S4: The Self-Service Semantic SuiteS4: The Self-Service Semantic Suite
S4: The Self-Service Semantic Suite
Marin Dimitrov
 
Scaling to Millions of Concurrent SPARQL Queries on the Cloud
Scaling to Millions of Concurrent SPARQL Queries on the CloudScaling to Millions of Concurrent SPARQL Queries on the Cloud
Scaling to Millions of Concurrent SPARQL Queries on the Cloud
Marin Dimitrov
 
From Python to Java
From Python to JavaFrom Python to Java
From Python to Java
Nikolay Stoitsev
 
Delivering Linked Data Training to Data Science Practitioners
Delivering Linked Data Training to Data Science PractitionersDelivering Linked Data Training to Data Science Practitioners
Delivering Linked Data Training to Data Science Practitioners
Marin Dimitrov
 
Hackconf 2016 - Да пишем код за хиляди сървъри
Hackconf 2016 - Да пишем код за хиляди сървъриHackconf 2016 - Да пишем код за хиляди сървъри
Hackconf 2016 - Да пишем код за хиляди сървъри
Nikolay Stoitsev
 
Scaling up Linked Data
Scaling up Linked DataScaling up Linked Data
Scaling up Linked Data
Marin Dimitrov
 
From Big Data to Smart Data
From Big Data to Smart DataFrom Big Data to Smart Data
From Big Data to Smart Data
Marin Dimitrov
 
Data Applications and Infrastructure at LinkedIn__HadoopSummit2010
Data Applications and Infrastructure at LinkedIn__HadoopSummit2010Data Applications and Infrastructure at LinkedIn__HadoopSummit2010
Data Applications and Infrastructure at LinkedIn__HadoopSummit2010
Yahoo Developer Network
 
Graph db
Graph dbGraph db
Graph db
Gagan Agrawal
 
Crossing the Chasm with Semantic Technology
Crossing the Chasm with Semantic TechnologyCrossing the Chasm with Semantic Technology
Crossing the Chasm with Semantic Technology
Marin Dimitrov
 
Semantic Technologies for Big Data
Semantic Technologies for Big DataSemantic Technologies for Big Data
Semantic Technologies for Big Data
Marin Dimitrov
 
Data Infrastructure at LinkedIn
Data Infrastructure at LinkedInData Infrastructure at LinkedIn
Data Infrastructure at LinkedIn
Amy W. Tang
 

Destaque (14)

Enabling Low-cost Open Data Publishing and Reuse
Enabling Low-cost Open Data Publishing and ReuseEnabling Low-cost Open Data Publishing and Reuse
Enabling Low-cost Open Data Publishing and Reuse
 
Ontotext in EC Funded Projects 2002-2012
Ontotext in EC Funded Projects 2002-2012Ontotext in EC Funded Projects 2002-2012
Ontotext in EC Funded Projects 2002-2012
 
S4: The Self-Service Semantic Suite
S4: The Self-Service Semantic SuiteS4: The Self-Service Semantic Suite
S4: The Self-Service Semantic Suite
 
Scaling to Millions of Concurrent SPARQL Queries on the Cloud
Scaling to Millions of Concurrent SPARQL Queries on the CloudScaling to Millions of Concurrent SPARQL Queries on the Cloud
Scaling to Millions of Concurrent SPARQL Queries on the Cloud
 
From Python to Java
From Python to JavaFrom Python to Java
From Python to Java
 
Delivering Linked Data Training to Data Science Practitioners
Delivering Linked Data Training to Data Science PractitionersDelivering Linked Data Training to Data Science Practitioners
Delivering Linked Data Training to Data Science Practitioners
 
Hackconf 2016 - Да пишем код за хиляди сървъри
Hackconf 2016 - Да пишем код за хиляди сървъриHackconf 2016 - Да пишем код за хиляди сървъри
Hackconf 2016 - Да пишем код за хиляди сървъри
 
Scaling up Linked Data
Scaling up Linked DataScaling up Linked Data
Scaling up Linked Data
 
From Big Data to Smart Data
From Big Data to Smart DataFrom Big Data to Smart Data
From Big Data to Smart Data
 
Data Applications and Infrastructure at LinkedIn__HadoopSummit2010
Data Applications and Infrastructure at LinkedIn__HadoopSummit2010Data Applications and Infrastructure at LinkedIn__HadoopSummit2010
Data Applications and Infrastructure at LinkedIn__HadoopSummit2010
 
Graph db
Graph dbGraph db
Graph db
 
Crossing the Chasm with Semantic Technology
Crossing the Chasm with Semantic TechnologyCrossing the Chasm with Semantic Technology
Crossing the Chasm with Semantic Technology
 
Semantic Technologies for Big Data
Semantic Technologies for Big DataSemantic Technologies for Big Data
Semantic Technologies for Big Data
 
Data Infrastructure at LinkedIn
Data Infrastructure at LinkedInData Infrastructure at LinkedIn
Data Infrastructure at LinkedIn
 

Semelhante a Text Analytics & Linked Data Management As-a-Service

Webinar: Metadata Enrichment in Publishing
Webinar: Metadata Enrichment in PublishingWebinar: Metadata Enrichment in Publishing
Webinar: Metadata Enrichment in Publishing
Ontotext
 
Semantic Technology in Publishing & Finance
Semantic Technology in Publishing & FinanceSemantic Technology in Publishing & Finance
Semantic Technology in Publishing & Finance
Vladimir Alexiev, PhD, PMP
 
Workshop_CITA2015
Workshop_CITA2015Workshop_CITA2015
Workshop_CITA2015
Bebo White
 
Data Engineering at Udemy
Data Engineering at UdemyData Engineering at Udemy
Data Engineering at Udemy
Ankara Big Data Meetup
 
Choosing the Right Graph Database to Succeed in Your Project
Choosing the Right Graph Database to Succeed in Your ProjectChoosing the Right Graph Database to Succeed in Your Project
Choosing the Right Graph Database to Succeed in Your Project
Ontotext
 
Open Source SQL for Hadoop: Where are we and Where are we Going?
Open Source SQL for Hadoop: Where are we and Where are we Going?Open Source SQL for Hadoop: Where are we and Where are we Going?
Open Source SQL for Hadoop: Where are we and Where are we Going?
DataWorks Summit
 
Swimming Across the Data Lake, Lessons learned and keys to success
Swimming Across the Data Lake, Lessons learned and keys to success Swimming Across the Data Lake, Lessons learned and keys to success
Swimming Across the Data Lake, Lessons learned and keys to success
DataWorks Summit/Hadoop Summit
 
Gaining Advantage in e-Learning with Semantic Adaptive Technology
Gaining Advantage in e-Learning with Semantic Adaptive TechnologyGaining Advantage in e-Learning with Semantic Adaptive Technology
Gaining Advantage in e-Learning with Semantic Adaptive Technology
Ontotext
 
A Survey of Exploratory Search Systems Based on LOD Resources
A Survey of Exploratory Search Systems Based on LOD ResourcesA Survey of Exploratory Search Systems Based on LOD Resources
A Survey of Exploratory Search Systems Based on LOD Resources
Karwan Jacksi
 
Grand Challenges Learning Analytics
Grand Challenges Learning AnalyticsGrand Challenges Learning Analytics
Grand Challenges Learning Analytics
amberg
 
CV
CVCV
Boston Hadoop Meetup: Presto for the Enterprise
Boston Hadoop Meetup: Presto for the EnterpriseBoston Hadoop Meetup: Presto for the Enterprise
Boston Hadoop Meetup: Presto for the Enterprise
Matt Fuller
 
Open Information in need of liberation: Aspire and the conundrum of linked data
Open Information in need of liberation: Aspire and the conundrum of linked dataOpen Information in need of liberation: Aspire and the conundrum of linked data
Open Information in need of liberation: Aspire and the conundrum of linked data
Talis
 
Accelerate Self-Service Analytics with Data Virtualization and Visualization
Accelerate Self-Service Analytics with Data Virtualization and VisualizationAccelerate Self-Service Analytics with Data Virtualization and Visualization
Accelerate Self-Service Analytics with Data Virtualization and Visualization
Denodo
 
Power BI as a storyteller
Power BI as a storytellerPower BI as a storyteller
Power BI as a storyteller
Berkovich Consulting
 
Emerging technologies in academic libraries
Emerging technologies in academic librariesEmerging technologies in academic libraries
Emerging technologies in academic libraries
Michael Cummings
 
Semantics and Machine Learning
Semantics and Machine LearningSemantics and Machine Learning
Semantics and Machine Learning
Vladimir Alexiev, PhD, PMP
 
Knowledge Graph for Machine Learning and Data Science
Knowledge Graph for Machine Learning and Data ScienceKnowledge Graph for Machine Learning and Data Science
Knowledge Graph for Machine Learning and Data Science
Cambridge Semantics
 
RWDG Webinar: Big Data & BI Analytics Require Data Governance
RWDG Webinar: Big Data & BI Analytics Require Data GovernanceRWDG Webinar: Big Data & BI Analytics Require Data Governance
RWDG Webinar: Big Data & BI Analytics Require Data Governance
DATAVERSITY
 
Saim Kaya CV
Saim Kaya CVSaim Kaya CV
Saim Kaya CV
Saim Kaya
 

Semelhante a Text Analytics & Linked Data Management As-a-Service (20)

Webinar: Metadata Enrichment in Publishing
Webinar: Metadata Enrichment in PublishingWebinar: Metadata Enrichment in Publishing
Webinar: Metadata Enrichment in Publishing
 
Semantic Technology in Publishing & Finance
Semantic Technology in Publishing & FinanceSemantic Technology in Publishing & Finance
Semantic Technology in Publishing & Finance
 
Workshop_CITA2015
Workshop_CITA2015Workshop_CITA2015
Workshop_CITA2015
 
Data Engineering at Udemy
Data Engineering at UdemyData Engineering at Udemy
Data Engineering at Udemy
 
Choosing the Right Graph Database to Succeed in Your Project
Choosing the Right Graph Database to Succeed in Your ProjectChoosing the Right Graph Database to Succeed in Your Project
Choosing the Right Graph Database to Succeed in Your Project
 
Open Source SQL for Hadoop: Where are we and Where are we Going?
Open Source SQL for Hadoop: Where are we and Where are we Going?Open Source SQL for Hadoop: Where are we and Where are we Going?
Open Source SQL for Hadoop: Where are we and Where are we Going?
 
Swimming Across the Data Lake, Lessons learned and keys to success
Swimming Across the Data Lake, Lessons learned and keys to success Swimming Across the Data Lake, Lessons learned and keys to success
Swimming Across the Data Lake, Lessons learned and keys to success
 
Gaining Advantage in e-Learning with Semantic Adaptive Technology
Gaining Advantage in e-Learning with Semantic Adaptive TechnologyGaining Advantage in e-Learning with Semantic Adaptive Technology
Gaining Advantage in e-Learning with Semantic Adaptive Technology
 
A Survey of Exploratory Search Systems Based on LOD Resources
A Survey of Exploratory Search Systems Based on LOD ResourcesA Survey of Exploratory Search Systems Based on LOD Resources
A Survey of Exploratory Search Systems Based on LOD Resources
 
Grand Challenges Learning Analytics
Grand Challenges Learning AnalyticsGrand Challenges Learning Analytics
Grand Challenges Learning Analytics
 
CV
CVCV
CV
 
Boston Hadoop Meetup: Presto for the Enterprise
Boston Hadoop Meetup: Presto for the EnterpriseBoston Hadoop Meetup: Presto for the Enterprise
Boston Hadoop Meetup: Presto for the Enterprise
 
Open Information in need of liberation: Aspire and the conundrum of linked data
Open Information in need of liberation: Aspire and the conundrum of linked dataOpen Information in need of liberation: Aspire and the conundrum of linked data
Open Information in need of liberation: Aspire and the conundrum of linked data
 
Accelerate Self-Service Analytics with Data Virtualization and Visualization
Accelerate Self-Service Analytics with Data Virtualization and VisualizationAccelerate Self-Service Analytics with Data Virtualization and Visualization
Accelerate Self-Service Analytics with Data Virtualization and Visualization
 
Power BI as a storyteller
Power BI as a storytellerPower BI as a storyteller
Power BI as a storyteller
 
Emerging technologies in academic libraries
Emerging technologies in academic librariesEmerging technologies in academic libraries
Emerging technologies in academic libraries
 
Semantics and Machine Learning
Semantics and Machine LearningSemantics and Machine Learning
Semantics and Machine Learning
 
Knowledge Graph for Machine Learning and Data Science
Knowledge Graph for Machine Learning and Data ScienceKnowledge Graph for Machine Learning and Data Science
Knowledge Graph for Machine Learning and Data Science
 
RWDG Webinar: Big Data & BI Analytics Require Data Governance
RWDG Webinar: Big Data & BI Analytics Require Data GovernanceRWDG Webinar: Big Data & BI Analytics Require Data Governance
RWDG Webinar: Big Data & BI Analytics Require Data Governance
 
Saim Kaya CV
Saim Kaya CVSaim Kaya CV
Saim Kaya CV
 

Mais de Marin Dimitrov

Measuring the Productivity of Your Engineering Organisation - the Good, the B...
Measuring the Productivity of Your Engineering Organisation - the Good, the B...Measuring the Productivity of Your Engineering Organisation - the Good, the B...
Measuring the Productivity of Your Engineering Organisation - the Good, the B...
Marin Dimitrov
 
Mapping Your Career Journey
Mapping Your Career JourneyMapping Your Career Journey
Mapping Your Career Journey
Marin Dimitrov
 
Open Source @ Uber
Open Source @ Uber Open Source @ Uber
Open Source @ Uber
Marin Dimitrov
 
Trust - the Key Success Factor for Teams & Organisations
Trust - the Key Success Factor for Teams & OrganisationsTrust - the Key Success Factor for Teams & Organisations
Trust - the Key Success Factor for Teams & Organisations
Marin Dimitrov
 
Uber @ Telerik Academy 2018
Uber @ Telerik Academy 2018Uber @ Telerik Academy 2018
Uber @ Telerik Academy 2018
Marin Dimitrov
 
Machine Learning @ Uber
Machine Learning @ UberMachine Learning @ Uber
Machine Learning @ Uber
Marin Dimitrov
 
Career Advice for My Younger Self
Career Advice for My Younger SelfCareer Advice for My Younger Self
Career Advice for My Younger Self
Marin Dimitrov
 
Scaling Your Engineering Organization with Distributed Sites
Scaling Your Engineering Organization with Distributed SitesScaling Your Engineering Organization with Distributed Sites
Scaling Your Engineering Organization with Distributed Sites
Marin Dimitrov
 
Building, Scaling and Leading High-Performance Teams
Building, Scaling and Leading High-Performance TeamsBuilding, Scaling and Leading High-Performance Teams
Building, Scaling and Leading High-Performance Teams
Marin Dimitrov
 
Uber @ Career Days 2017 (Sofia University)
Uber @ Career Days 2017 (Sofia University)Uber @ Career Days 2017 (Sofia University)
Uber @ Career Days 2017 (Sofia University)
Marin Dimitrov
 
Career Days 2012 @ Sofia University
Career Days 2012 @ Sofia UniversityCareer Days 2012 @ Sofia University
Career Days 2012 @ Sofia University
Marin Dimitrov
 
Linked Data for the Enterprise: Opportunities and Challenges
Linked Data for the Enterprise: Opportunities and ChallengesLinked Data for the Enterprise: Opportunities and Challenges
Linked Data for the Enterprise: Opportunities and Challenges
Marin Dimitrov
 
Semantic Technologies and Triplestores for Business Intelligence
Semantic Technologies and Triplestores for Business IntelligenceSemantic Technologies and Triplestores for Business Intelligence
Semantic Technologies and Triplestores for Business Intelligence
Marin Dimitrov
 
Linked Data Marketplaces
Linked Data MarketplacesLinked Data Marketplaces
Linked Data Marketplaces
Marin Dimitrov
 
Linked Data Management
Linked Data ManagementLinked Data Management
Linked Data Management
Marin Dimitrov
 

Mais de Marin Dimitrov (15)

Measuring the Productivity of Your Engineering Organisation - the Good, the B...
Measuring the Productivity of Your Engineering Organisation - the Good, the B...Measuring the Productivity of Your Engineering Organisation - the Good, the B...
Measuring the Productivity of Your Engineering Organisation - the Good, the B...
 
Mapping Your Career Journey
Mapping Your Career JourneyMapping Your Career Journey
Mapping Your Career Journey
 
Open Source @ Uber
Open Source @ Uber Open Source @ Uber
Open Source @ Uber
 
Trust - the Key Success Factor for Teams & Organisations
Trust - the Key Success Factor for Teams & OrganisationsTrust - the Key Success Factor for Teams & Organisations
Trust - the Key Success Factor for Teams & Organisations
 
Uber @ Telerik Academy 2018
Uber @ Telerik Academy 2018Uber @ Telerik Academy 2018
Uber @ Telerik Academy 2018
 
Machine Learning @ Uber
Machine Learning @ UberMachine Learning @ Uber
Machine Learning @ Uber
 
Career Advice for My Younger Self
Career Advice for My Younger SelfCareer Advice for My Younger Self
Career Advice for My Younger Self
 
Scaling Your Engineering Organization with Distributed Sites
Scaling Your Engineering Organization with Distributed SitesScaling Your Engineering Organization with Distributed Sites
Scaling Your Engineering Organization with Distributed Sites
 
Building, Scaling and Leading High-Performance Teams
Building, Scaling and Leading High-Performance TeamsBuilding, Scaling and Leading High-Performance Teams
Building, Scaling and Leading High-Performance Teams
 
Uber @ Career Days 2017 (Sofia University)
Uber @ Career Days 2017 (Sofia University)Uber @ Career Days 2017 (Sofia University)
Uber @ Career Days 2017 (Sofia University)
 
Career Days 2012 @ Sofia University
Career Days 2012 @ Sofia UniversityCareer Days 2012 @ Sofia University
Career Days 2012 @ Sofia University
 
Linked Data for the Enterprise: Opportunities and Challenges
Linked Data for the Enterprise: Opportunities and ChallengesLinked Data for the Enterprise: Opportunities and Challenges
Linked Data for the Enterprise: Opportunities and Challenges
 
Semantic Technologies and Triplestores for Business Intelligence
Semantic Technologies and Triplestores for Business IntelligenceSemantic Technologies and Triplestores for Business Intelligence
Semantic Technologies and Triplestores for Business Intelligence
 
Linked Data Marketplaces
Linked Data MarketplacesLinked Data Marketplaces
Linked Data Marketplaces
 
Linked Data Management
Linked Data ManagementLinked Data Management
Linked Data Management
 

Text Analytics & Linked Data Management As-a-Service

  • 1. Text Analytics & Linked Data Management As-a-Service
      Marin Dimitrov, Alex Simov, Yavor Petkov
      May 31st, 2015
      Text Analytics & Linked Data Management -aaS / Wasabi’2015, May 2015
  • 2. About Ontotext
      • Provides products & solutions for content enrichment and metadata management
          – 70 employees, headquarters in Sofia (Bulgaria)
          – Sales presence in London, NYC & Boston
      • Major clients and industries
          – Media & Publishing
          – Health Care & Life Sciences
          – Cultural Heritage & Digital Libraries
          – Government
          – Education
  • 3. Contents
      • Semantic Technology adoption challenges
      • The Self-Service Semantic Suite (S4)
      • Lessons learned
  • 4. Semantic Technology Adoption Challenges
  • 5. Time-to-value gap (Gartner)
      • From Wasabi @ ESWC’2014
      • Performance, Integration, Penetration, Payback & ROI
  • 6. Semantic Technology adoption
      • Limiting factors
          – Complexity & cost of existing solutions
          – Limited resources to evaluate novel technologies (startups)
          – Slow procurement processes, risk aversion (enterprises)
      • How can we…
          – Reduce time-to-market
          – Reduce adoption risks
          – Optimise costs
  • 7. The Self-Service Semantic Suite (S4)
  • 8. What is S4?
      • Capabilities for text analytics, content enrichment and smart data management
          – Text analytics for news, life sciences and social media
          – RDF graph database as-a-service
          – Access to large open knowledge graphs
      • Available on-demand, anytime, anywhere
          – Simple RESTful services
      • Simple pay-per-use pricing
          – No upfront commitments
  • 9. What is S4?
  • 10. Benefits
      • Enables quick prototyping
          – Instantly available, no provisioning & operations required
          – Focus on building applications, don’t worry about infrastructure
      • Free tier!
      • Easy to start, shorter learning curve
          – Various add-ons, SDKs and demo code
      • Based on enterprise semantic technology
  • 11. Text analytics with S4
      • Text analytics services
          – News annotation
          – News categorisation
          – Biomedical
          – Twitter
      • Entity linking & disambiguation
          – Mappings to DBpedia & GeoNames instances
          – Mappings to biomedical data sources (LinkedLifeData)
      • HTML, MS Word, XML, plain text input
      • Simple JSON output
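The input formats and JSON output listed on this slide suggest a simple request/response exchange. The following is a minimal Python sketch of that shape, not the actual S4 API: the request envelope, the response layout and every field name (`document`, `documentType`, `entities`, `startOffset`, `endOffset`, `inst`) are assumptions made for illustration, and the sample response is hand-written rather than real service output.

```python
import json

# Input formats named on the slide, mapped to common MIME types.
# How a client labels its input is an assumption of this sketch.
SUPPORTED_TYPES = {
    "html": "text/html",
    "word": "application/msword",
    "xml": "text/xml",
    "text": "text/plain",
}


def build_request(document: str, kind: str) -> str:
    """Serialise a document into a hypothetical JSON request body."""
    if kind not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported input format: {kind}")
    return json.dumps({"document": document, "documentType": SUPPORTED_TYPES[kind]})


def extract_mentions(response_json: str, document: str):
    """Return (surface text, entity type, linked URI) tuples in document order.

    Assumes a hypothetical response of the form
    {"entities": {<type>: [{"startOffset", "endOffset", "inst"}, ...]}}.
    """
    response = json.loads(response_json)
    found = []
    for entity_type, annotations in response.get("entities", {}).items():
        for ann in annotations:
            surface = document[ann["startOffset"]:ann["endOffset"]]
            found.append((ann["startOffset"], surface, entity_type, ann.get("inst")))
    found.sort(key=lambda item: item[0])
    return [item[1:] for item in found]


# Hand-written sample data for illustration only.
text = "Ontotext is based in Sofia."
sample_response = json.dumps({
    "entities": {
        "Organization": [{"startOffset": 0, "endOffset": 8, "inst": None}],
        "Location": [{"startOffset": 21, "endOffset": 26,
                      "inst": "http://dbpedia.org/resource/Sofia"}],
    }
})

for mention in extract_mentions(sample_response, text):
    print(mention)
```

Keeping character offsets in the annotations, rather than only the matched strings, lets a client highlight mentions in the original text and attach the linked URI to each one.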
  • 12. News analytics example (S4 result)
  • 13. Fully managed RDF DB in the Cloud
      • Low-cost graph DBaaS available 24/7
      • Ideal for small & moderate data volumes
          – Database options: 1M, 10M, 50M, 250M and 1B triples
      • Instantly deploy new databases when needed
      • Zero administration: automated operations, maintenance & upgrades
      • Users pay only for the actual database utilisation
          – Number of triples stored + number of queries per month
      • OpenRDF REST API
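The database options and the pay-per-use model on this slide can be illustrated with a short Python sketch. The tier sizes come from the slide; everything else is an assumption: the slide gives no prices or pricing formula, so the linear cost model below takes caller-supplied, hypothetical rates, and the tier limits are treated as inclusive.

```python
import bisect

# Database options listed on the slide, in triples.
TIERS = [1_000_000, 10_000_000, 50_000_000, 250_000_000, 1_000_000_000]


def smallest_tier(triple_count: int) -> int:
    """Return the smallest listed database option that can hold triple_count.

    Assumes a tier can hold exactly its stated number of triples.
    """
    if triple_count < 0:
        raise ValueError("triple_count must be non-negative")
    idx = bisect.bisect_left(TIERS, triple_count)
    if idx == len(TIERS):
        raise ValueError("dataset exceeds the largest listed option (1B triples)")
    return TIERS[idx]


def monthly_cost(triples: int, queries: int,
                 rate_per_million_triples: float,
                 rate_per_thousand_queries: float) -> float:
    """Estimate a monthly bill from stored triples and query count.

    The linear model and both rates are hypothetical; the slide only says
    that billing depends on triples stored and queries per month.
    """
    return (triples / 1_000_000 * rate_per_million_triples
            + queries / 1_000 * rate_per_thousand_queries)


print(smallest_tier(3_500_000))
print(monthly_cost(2_000_000, 5_000, 1.0, 0.5))
```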
  • 14. Fully managed RDF DB in the Cloud
  • 15. Knowledge graphs with S4
      • SPARQL query endpoint to the FactForge semantic data warehouse
          – 500 million entities / 5 billion triples
      • Key LOD datasets integrated
          – DBpedia, Freebase/WikiData, GeoNames, WordNet
          – Dublin Core, SKOS, PROTON ontologies and vocabularies
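A SPARQL endpoint such as the one described on this slide is normally queried over HTTP using the standard SPARQL 1.1 Protocol, where a GET request carries the query text in a `query` parameter. The Python sketch below only builds such a request URL; the endpoint address is a placeholder, not the real FactForge or S4 URL, and the example query is an illustration that assumes DBpedia data is reachable through the endpoint.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder address: substitute the endpoint URL from the service documentation.
ENDPOINT = "https://example.org/sparql"

# Illustrative query: the English label of a DBpedia resource.
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?label WHERE {
  <http://dbpedia.org/resource/Sofia> rdfs:label ?label .
  FILTER (lang(?label) = "en")
} LIMIT 1
""".strip()


def sparql_get_url(endpoint: str, query: str) -> str:
    """Build a SPARQL 1.1 Protocol GET URL with the query URL-encoded."""
    return endpoint + "?" + urlencode({"query": query})


url = sparql_get_url(ENDPOINT, QUERY)
print(url)

# Round-trip check: decoding the URL recovers the original query text.
assert parse_qs(urlsplit(url).query)["query"] == [QUERY]
```

When sending the request, an `Accept: application/sparql-results+json` header asks for results in the standard SPARQL JSON results format; whether a given endpoint also requires authentication is outside the scope of this sketch.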
  • 16. Cloud native architecture of S4
      • Elasticity vs High Availability vs Cost Efficiency
  • 17. Lessons Learned
  • 18. Lessons learned
      • You must build a “cost aware” cloud platform
      • Cloud-native architectures are more efficient, but more difficult to build
      • A microservices architecture improves system resilience & agility, but is difficult to design right
      • Extensive and continuous benchmarking & monitoring
          – Some problems emerge only at large scale
      • Assume failures will happen & design for resilience
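One common way to act on "assume failures will happen" is to retry transient errors with exponential backoff. The Python sketch below shows that general technique only; the slides do not say which resilience mechanisms S4 uses, and the function name, parameters and defaults here are illustrative assumptions.

```python
import time


def call_with_retries(operation, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call operation(), retrying on ConnectionError with exponential backoff.

    Waits base_delay, 2*base_delay, 4*base_delay, ... between attempts and
    re-raises the last error once max_attempts calls have failed.
    The sleep function is injectable so the logic can be tested without waiting.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            sleep(base_delay * 2 ** attempt)


# Usage: a stand-in operation that fails twice before succeeding.
state = {"calls": 0}


def flaky():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


recorded_delays = []
print(call_with_retries(flaky, sleep=recorded_delays.append), recorded_delays)
```

Only `ConnectionError` is retried here so that programming errors fail fast; a production client would usually also cap the total wait and add jitter to avoid synchronised retries.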
  • 19. Thank you!