SlideShare uma empresa Scribd logo
1 de 17
Baixar para ler offline
Outside Insight
The Meltwater Data Science Platform
ATI London – June 19th 2017
Meltwater: Who are we?
Sources: Editorial, Social, Broadcasts
Meltwater: Media Intelligence
media
exposure
trends influencers
sentiment
analysis
More than 300k different types of user queries
Meltwater: Some numbers
Our ingestion fetches about 3.3M documents / day from 190k editorial sources, re-crawled every 30 minutes.
With the social fire hoses we go up to 30M documents / day.
Since its foundation, Meltwater has indexed almost 200B documents.
Meltwater: How it is done
Ingestion:
⭕ Social media hoses (partnerships)
⭕ Editorial (partnerships + web crawling)
⭕ Broadcasts (views on the above)
Media Intelligence applications (Custom)
⭕ Boolean queries (keywords / entities)
⭕ Counters
⭕ Aggregates
⭕ Drill downs / pivoting
Enrichments (15 languages):
⭕ Text categorization (topic, language)
⭕ NERD (person, location, organization, ...)
⭕ NED ( https://en.wikipedia.org/wiki/Tim_Cook )
⭕ Sentiment Analysis
Storage and search:
⭕ Elastic search
⭕ Rabbit MQ (distributed queues)
⭕ AWS
Building Insights
Raw data, statistics, aggregates, summaries,
and visualizations are all good but...
⭕ Insights are still discovered by analysts by
munching and interpreting the data
⭕ ...and we all know how data can be deceitful
What we need is a systematic way to mine,
propose, and explain possible insights
⭕ we need factual knowledge
⭕ combine (machine) learning and reasoning
Source: www.tylervigen.com (Spurious Correlations)
Vision: Insight Building as a Platform
Build a world class AI platform for a new software category
Outside Insight
A I
Meltwater’s AI-First Approach
Use Data Science (data, algorithms, infrastructure, tools) to power everything
Data and Content Lakes
Credit
Ratings
Editorial
Patents
&
Trademarks
Gov Data
Job Postings
Social
Media
Company
Websites
SEO
Filings
Factual information: wherefrom?
Need to restrict the domain: focus on the corporate domain, i.e., companies, people, products, ...
just for last year
Data Lake and Outside Data
Factual information... How? Get as close as possible to the original source of information
Wrapidity (acq. 2016) automates
outside data acquisition via web scraping
Orchestration
Enrichments
Content
Lake
NED
Relation
Extraction
NER
Coreference
DOM / TABLE / ANNO
(Wrapidity)
Triples
Training
Pipeline
Enriched
Docs
Sentiment
Analysis
Event
Extraction
Factual information: how?
Linguistic enrichments provide a way to support semantic retrieval and fact extraction.
Summary:
⭕ Scalable and distributed enrichment workflows (Cosmify acquisition),
⭕ CRFs for NER, PageRank (variant) for NED, CNNs and LSTMs for Relation/Event extraction, Bayesian
classification for sentiment analysis,
⭕ TensorFlow and GPUs for training infrastructure
Enrichments
We can’t foresee all uses of our data: Developer APIs to Integrate and orchestrate third party tools.
Personalization is key in Data Science: A flexible data wrangling infrastructure is required.
Knowledge Graph
Wait… did you say PageRank, triples? So do you have a (Knowledge) Graph?
Challenges:
⭕ Data Cleaning
⭕ Data deduplication / integration
⭕ Truth Finding
Goals:
⭕ Relate facts
⭕ Data mining
⭕ Cognitive applications (higher-order reasoning)
Content:
⭕ Companies
⭕ Brands
⭕ Products
⭕ Key people
⭕ Influencers
Infer high-level insights from a set of extracted events/facts.
⭕ Competitor
⭕ Customer
⭕ Investment
⭕ Lawsuit/Litigation
⭕ Partnership
Cognitive Applications
Insight discovery:
⭕ Rule/Graph mining (data cleaning)
○ GPAR (VLDB ‘15)
○ RUDIK (internal, paper submitted)
⭕ Link prediction (data enrichment, fact checking)
○ Path Ranking Algorithms (PRA)
○ Probabilistic Soft Logic (PSL)
⭕ Supplier
⭕ Acquisition
⭕ Out/under performance
⭕ Expanding Operations
⭕ Compliance
⭕ Funding Developments
⭕ Leadership Changes
⭕ New Offerings
⭕ Bankruptcy,
⭕ Restructuring, Cost Cutting
Cognitive Applications complement traditional media intelligence tools
Cognitive Applications
Involve Users, Entrepreneurs, and Researchers
6 Data Science Hubs (co-working spaces)
✔ Sydney
✔ Berlin
✔ New York
✔ London
✔ San Francisco
✔ Singapore
Meltwater Entrepreneurial School of Technology
⭕ HQ in Accra, Ghana
⭕ Training program for African entrepreneurs
⭕ Incubator (25+ startups)
⭕ Networking hub
University collaborations
Questions?

Mais conteúdo relacionado

Mais procurados

Boost your data analytics with open data and public news content
Boost your data analytics with open data and public news contentBoost your data analytics with open data and public news content
Boost your data analytics with open data and public news contentOntotext
 
Diving in Panama Papers and Open Data to Discover Emerging News
Diving in Panama Papers and Open Data to Discover Emerging NewsDiving in Panama Papers and Open Data to Discover Emerging News
Diving in Panama Papers and Open Data to Discover Emerging NewsOntotext
 
Data Curation @ SpazioDati - NEXA Lunch Seminar
Data Curation @ SpazioDati - NEXA Lunch SeminarData Curation @ SpazioDati - NEXA Lunch Seminar
Data Curation @ SpazioDati - NEXA Lunch SeminarSpazioDati
 
Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage
Build Narratives, Connect Artifacts: Linked Open Data for Cultural HeritageBuild Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage
Build Narratives, Connect Artifacts: Linked Open Data for Cultural HeritageOntotext
 
Using the Semantic Web Stack to Make Big Data Smarter
Using the Semantic Web Stack to Make  Big Data SmarterUsing the Semantic Web Stack to Make  Big Data Smarter
Using the Semantic Web Stack to Make Big Data SmarterMatheus Mota
 
Semantic Security : Authorization on the Web with Ontologies
Semantic Security : Authorization on the Web with OntologiesSemantic Security : Authorization on the Web with Ontologies
Semantic Security : Authorization on the Web with OntologiesAmit Jain
 
Text analytics for Google Spreadsheets using Text Mining add-on
Text analytics for Google Spreadsheets using Text Mining add-on Text analytics for Google Spreadsheets using Text Mining add-on
Text analytics for Google Spreadsheets using Text Mining add-on SpazioDati
 
Powerful Information Discovery with Big Knowledge Graphs –The Offshore Leaks ...
Powerful Information Discovery with Big Knowledge Graphs –The Offshore Leaks ...Powerful Information Discovery with Big Knowledge Graphs –The Offshore Leaks ...
Powerful Information Discovery with Big Knowledge Graphs –The Offshore Leaks ...Connected Data World
 
Recommender Systems based on Linked Open Data
Recommender Systems based on Linked Open DataRecommender Systems based on Linked Open Data
Recommender Systems based on Linked Open DataCataldo Musto
 
Transforming Your Data with GraphDB: GraphDB Fundamentals, Jan 2018
Transforming Your Data with GraphDB: GraphDB Fundamentals, Jan 2018Transforming Your Data with GraphDB: GraphDB Fundamentals, Jan 2018
Transforming Your Data with GraphDB: GraphDB Fundamentals, Jan 2018Ontotext
 
Fraudes Financières: Méthodes de Prévention et Détection
Fraudes Financières: Méthodes de Prévention et DétectionFraudes Financières: Méthodes de Prévention et Détection
Fraudes Financières: Méthodes de Prévention et DétectionLinkurious
 
Enterprise knowledge graphs
Enterprise knowledge graphsEnterprise knowledge graphs
Enterprise knowledge graphsSören Auer
 
Reasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven Recipes
Reasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven RecipesReasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven Recipes
Reasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven RecipesOntotext
 
Thinking Outside the Table
Thinking Outside the TableThinking Outside the Table
Thinking Outside the TableOntotext
 
2011 05-02 linked data intro
2011 05-02 linked data intro2011 05-02 linked data intro
2011 05-02 linked data introvafopoulos
 
2011 05-01 linked data
2011 05-01 linked data2011 05-01 linked data
2011 05-01 linked datavafopoulos
 
Linked Data (1st Linked Data Meetup Malmö)
Linked Data (1st Linked Data Meetup Malmö)Linked Data (1st Linked Data Meetup Malmö)
Linked Data (1st Linked Data Meetup Malmö)Anja Jentzsch
 
Introduction to the Data Web, DBpedia and the Life-cycle of Linked Data
Introduction to the Data Web, DBpedia and the Life-cycle of Linked DataIntroduction to the Data Web, DBpedia and the Life-cycle of Linked Data
Introduction to the Data Web, DBpedia and the Life-cycle of Linked DataSören Auer
 

Mais procurados (20)

Boost your data analytics with open data and public news content
Boost your data analytics with open data and public news contentBoost your data analytics with open data and public news content
Boost your data analytics with open data and public news content
 
Diving in Panama Papers and Open Data to Discover Emerging News
Diving in Panama Papers and Open Data to Discover Emerging NewsDiving in Panama Papers and Open Data to Discover Emerging News
Diving in Panama Papers and Open Data to Discover Emerging News
 
Data Curation @ SpazioDati - NEXA Lunch Seminar
Data Curation @ SpazioDati - NEXA Lunch SeminarData Curation @ SpazioDati - NEXA Lunch Seminar
Data Curation @ SpazioDati - NEXA Lunch Seminar
 
Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage
Build Narratives, Connect Artifacts: Linked Open Data for Cultural HeritageBuild Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage
Build Narratives, Connect Artifacts: Linked Open Data for Cultural Heritage
 
Using the Semantic Web Stack to Make Big Data Smarter
Using the Semantic Web Stack to Make  Big Data SmarterUsing the Semantic Web Stack to Make  Big Data Smarter
Using the Semantic Web Stack to Make Big Data Smarter
 
Semantic Security : Authorization on the Web with Ontologies
Semantic Security : Authorization on the Web with OntologiesSemantic Security : Authorization on the Web with Ontologies
Semantic Security : Authorization on the Web with Ontologies
 
Text analytics for Google Spreadsheets using Text Mining add-on
Text analytics for Google Spreadsheets using Text Mining add-on Text analytics for Google Spreadsheets using Text Mining add-on
Text analytics for Google Spreadsheets using Text Mining add-on
 
Powerful Information Discovery with Big Knowledge Graphs –The Offshore Leaks ...
Powerful Information Discovery with Big Knowledge Graphs –The Offshore Leaks ...Powerful Information Discovery with Big Knowledge Graphs –The Offshore Leaks ...
Powerful Information Discovery with Big Knowledge Graphs –The Offshore Leaks ...
 
Recommender Systems based on Linked Open Data
Recommender Systems based on Linked Open DataRecommender Systems based on Linked Open Data
Recommender Systems based on Linked Open Data
 
Transforming Your Data with GraphDB: GraphDB Fundamentals, Jan 2018
Transforming Your Data with GraphDB: GraphDB Fundamentals, Jan 2018Transforming Your Data with GraphDB: GraphDB Fundamentals, Jan 2018
Transforming Your Data with GraphDB: GraphDB Fundamentals, Jan 2018
 
Fraudes Financières: Méthodes de Prévention et Détection
Fraudes Financières: Méthodes de Prévention et DétectionFraudes Financières: Méthodes de Prévention et Détection
Fraudes Financières: Méthodes de Prévention et Détection
 
Enterprise knowledge graphs
Enterprise knowledge graphsEnterprise knowledge graphs
Enterprise knowledge graphs
 
Reasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven Recipes
Reasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven RecipesReasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven Recipes
Reasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven Recipes
 
Thinking Outside the Table
Thinking Outside the TableThinking Outside the Table
Thinking Outside the Table
 
2011 05-02 linked data intro
2011 05-02 linked data intro2011 05-02 linked data intro
2011 05-02 linked data intro
 
2011 05-01 linked data
2011 05-01 linked data2011 05-01 linked data
2011 05-01 linked data
 
agINFRA – a multilingual infrastructure for information on agricultural innov...
agINFRA – a multilingual infrastructure for information on agricultural innov...agINFRA – a multilingual infrastructure for information on agricultural innov...
agINFRA – a multilingual infrastructure for information on agricultural innov...
 
Cognitive data
Cognitive dataCognitive data
Cognitive data
 
Linked Data (1st Linked Data Meetup Malmö)
Linked Data (1st Linked Data Meetup Malmö)Linked Data (1st Linked Data Meetup Malmö)
Linked Data (1st Linked Data Meetup Malmö)
 
Introduction to the Data Web, DBpedia and the Life-cycle of Linked Data
Introduction to the Data Web, DBpedia and the Life-cycle of Linked DataIntroduction to the Data Web, DBpedia and the Life-cycle of Linked Data
Introduction to the Data Web, DBpedia and the Life-cycle of Linked Data
 

Semelhante a Fairhair.ai – alan turing institute june '17 (public)

Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...
Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...
Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...Amit Sheth
 
JIMS Rohini IT Flash Monthly Newsletter - October Issue
JIMS Rohini IT Flash Monthly Newsletter  - October IssueJIMS Rohini IT Flash Monthly Newsletter  - October Issue
JIMS Rohini IT Flash Monthly Newsletter - October IssueJIMS Rohini Sector 5
 
Information Management Trends 2009
Information Management Trends 2009Information Management Trends 2009
Information Management Trends 2009Christopher Eagle
 
Web Intelligence 2013 - Characterizing concepts of interest leveraging Linked...
Web Intelligence 2013 - Characterizing concepts of interest leveraging Linked...Web Intelligence 2013 - Characterizing concepts of interest leveraging Linked...
Web Intelligence 2013 - Characterizing concepts of interest leveraging Linked...Fabrizio Orlandi
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discoveryYoung Alista
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discoveryHarry Potter
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discoveryJames Wong
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discoveryFraboni Ec
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discoveryLuis Goldster
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discoveryTony Nguyen
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discoveryHoang Nguyen
 
Integrating and publishing public safety data using semantic technologies
Integrating and publishing public safety data using semantic technologiesIntegrating and publishing public safety data using semantic technologies
Integrating and publishing public safety data using semantic technologiesAlvaro Graves
 
What do we want computers to do for us?
What do we want computers to do for us? What do we want computers to do for us?
What do we want computers to do for us? Andrea Volpini
 
Eavesdropping on the Twitter Microblogging Site
Eavesdropping on the Twitter Microblogging SiteEavesdropping on the Twitter Microblogging Site
Eavesdropping on the Twitter Microblogging SiteShalin Hai-Jew
 
Bertelsmann: BeTrend – Building a Trend Aggregation Machine.pdf
Bertelsmann: BeTrend – Building a Trend Aggregation Machine.pdfBertelsmann: BeTrend – Building a Trend Aggregation Machine.pdf
Bertelsmann: BeTrend – Building a Trend Aggregation Machine.pdfNeo4j
 
Narrative Mind Week 5 H4D Stanford 2016
Narrative Mind Week 5 H4D Stanford 2016Narrative Mind Week 5 H4D Stanford 2016
Narrative Mind Week 5 H4D Stanford 2016Stanford University
 
Data Journalism Workshop - Prague
Data Journalism Workshop - PragueData Journalism Workshop - Prague
Data Journalism Workshop - PragueChristian Kreutz
 
SuanIct-Bigdata desktop-final
SuanIct-Bigdata desktop-finalSuanIct-Bigdata desktop-final
SuanIct-Bigdata desktop-finalstelligence
 
Spivack Blogtalk 2008
Spivack Blogtalk 2008Spivack Blogtalk 2008
Spivack Blogtalk 2008Blogtalk 2008
 

Semelhante a Fairhair.ai – alan turing institute june '17 (public) (20)

Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...
Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...
Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...
 
JIMS Rohini IT Flash Monthly Newsletter - October Issue
JIMS Rohini IT Flash Monthly Newsletter  - October IssueJIMS Rohini IT Flash Monthly Newsletter  - October Issue
JIMS Rohini IT Flash Monthly Newsletter - October Issue
 
Information Management Trends 2009
Information Management Trends 2009Information Management Trends 2009
Information Management Trends 2009
 
Web Intelligence 2013 - Characterizing concepts of interest leveraging Linked...
Web Intelligence 2013 - Characterizing concepts of interest leveraging Linked...Web Intelligence 2013 - Characterizing concepts of interest leveraging Linked...
Web Intelligence 2013 - Characterizing concepts of interest leveraging Linked...
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discovery
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discovery
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discovery
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discovery
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discovery
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discovery
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discovery
 
Integrating and publishing public safety data using semantic technologies
Integrating and publishing public safety data using semantic technologiesIntegrating and publishing public safety data using semantic technologies
Integrating and publishing public safety data using semantic technologies
 
What do we want computers to do for us?
What do we want computers to do for us? What do we want computers to do for us?
What do we want computers to do for us?
 
Eavesdropping on the Twitter Microblogging Site
Eavesdropping on the Twitter Microblogging SiteEavesdropping on the Twitter Microblogging Site
Eavesdropping on the Twitter Microblogging Site
 
Bertelsmann: BeTrend – Building a Trend Aggregation Machine.pdf
Bertelsmann: BeTrend – Building a Trend Aggregation Machine.pdfBertelsmann: BeTrend – Building a Trend Aggregation Machine.pdf
Bertelsmann: BeTrend – Building a Trend Aggregation Machine.pdf
 
Data Mining
Data MiningData Mining
Data Mining
 
Narrative Mind Week 5 H4D Stanford 2016
Narrative Mind Week 5 H4D Stanford 2016Narrative Mind Week 5 H4D Stanford 2016
Narrative Mind Week 5 H4D Stanford 2016
 
Data Journalism Workshop - Prague
Data Journalism Workshop - PragueData Journalism Workshop - Prague
Data Journalism Workshop - Prague
 
SuanIct-Bigdata desktop-final
SuanIct-Bigdata desktop-finalSuanIct-Bigdata desktop-final
SuanIct-Bigdata desktop-final
 
Spivack Blogtalk 2008
Spivack Blogtalk 2008Spivack Blogtalk 2008
Spivack Blogtalk 2008
 

Mais de Giorgio Orsi

Joint Repairs for Web Wrappers
Joint Repairs for Web WrappersJoint Repairs for Web Wrappers
Joint Repairs for Web WrappersGiorgio Orsi
 
SAE: Structured Aspect Extraction
SAE: Structured Aspect ExtractionSAE: Structured Aspect Extraction
SAE: Structured Aspect ExtractionGiorgio Orsi
 
wadar_poster_final
wadar_poster_finalwadar_poster_final
wadar_poster_finalGiorgio Orsi
 
Query Rewriting and Optimization for Ontological Databases
Query Rewriting and Optimization for Ontological DatabasesQuery Rewriting and Optimization for Ontological Databases
Query Rewriting and Optimization for Ontological DatabasesGiorgio Orsi
 
ROSeAnn: Reconciling Opinions of Semantic Annotators VLDB 2014
ROSeAnn: Reconciling Opinions of Semantic Annotators VLDB 2014ROSeAnn: Reconciling Opinions of Semantic Annotators VLDB 2014
ROSeAnn: Reconciling Opinions of Semantic Annotators VLDB 2014Giorgio Orsi
 
Deos 2014 - Welcome
Deos 2014 - WelcomeDeos 2014 - Welcome
Deos 2014 - WelcomeGiorgio Orsi
 
Heuristic Ranking in Tightly Coupled Probabilistic Description Logics
Heuristic Ranking in Tightly Coupled Probabilistic Description LogicsHeuristic Ranking in Tightly Coupled Probabilistic Description Logics
Heuristic Ranking in Tightly Coupled Probabilistic Description LogicsGiorgio Orsi
 
Datalog and its Extensions for Semantic Web Databases
Datalog and its Extensions for Semantic Web DatabasesDatalog and its Extensions for Semantic Web Databases
Datalog and its Extensions for Semantic Web DatabasesGiorgio Orsi
 
AMBER WWW 2012 Poster
AMBER WWW 2012 PosterAMBER WWW 2012 Poster
AMBER WWW 2012 PosterGiorgio Orsi
 
AMBER WWW 2012 (Demonstration)
AMBER WWW 2012 (Demonstration)AMBER WWW 2012 (Demonstration)
AMBER WWW 2012 (Demonstration)Giorgio Orsi
 
OPAL: a passe-partout for web forms - WWW 2012 (Demonstration)
OPAL: a passe-partout for web forms - WWW 2012 (Demonstration)OPAL: a passe-partout for web forms - WWW 2012 (Demonstration)
OPAL: a passe-partout for web forms - WWW 2012 (Demonstration)Giorgio Orsi
 
Querying UML Class Diagrams - FoSSaCS 2012
Querying UML Class Diagrams - FoSSaCS 2012Querying UML Class Diagrams - FoSSaCS 2012
Querying UML Class Diagrams - FoSSaCS 2012Giorgio Orsi
 
OPAL: automated form understanding for the deep web - WWW 2012
OPAL: automated form understanding for the deep web - WWW 2012OPAL: automated form understanding for the deep web - WWW 2012
OPAL: automated form understanding for the deep web - WWW 2012Giorgio Orsi
 
Nyaya: Semantic data markets: a flexible environment for knowledge management...
Nyaya: Semantic data markets: a flexible environment for knowledge management...Nyaya: Semantic data markets: a flexible environment for knowledge management...
Nyaya: Semantic data markets: a flexible environment for knowledge management...Giorgio Orsi
 
The Diadem Ontology
The Diadem OntologyThe Diadem Ontology
The Diadem OntologyGiorgio Orsi
 

Mais de Giorgio Orsi (20)

Joint Repairs for Web Wrappers
Joint Repairs for Web WrappersJoint Repairs for Web Wrappers
Joint Repairs for Web Wrappers
 
SAE: Structured Aspect Extraction
SAE: Structured Aspect ExtractionSAE: Structured Aspect Extraction
SAE: Structured Aspect Extraction
 
diadem-vldb-2015
diadem-vldb-2015diadem-vldb-2015
diadem-vldb-2015
 
wadar_poster_final
wadar_poster_finalwadar_poster_final
wadar_poster_final
 
Query Rewriting and Optimization for Ontological Databases
Query Rewriting and Optimization for Ontological DatabasesQuery Rewriting and Optimization for Ontological Databases
Query Rewriting and Optimization for Ontological Databases
 
ROSeAnn: Reconciling Opinions of Semantic Annotators VLDB 2014
ROSeAnn: Reconciling Opinions of Semantic Annotators VLDB 2014ROSeAnn: Reconciling Opinions of Semantic Annotators VLDB 2014
ROSeAnn: Reconciling Opinions of Semantic Annotators VLDB 2014
 
Deos 2014 - Welcome
Deos 2014 - WelcomeDeos 2014 - Welcome
Deos 2014 - Welcome
 
Perv a ds-rr13
Perv a ds-rr13Perv a ds-rr13
Perv a ds-rr13
 
Heuristic Ranking in Tightly Coupled Probabilistic Description Logics
Heuristic Ranking in Tightly Coupled Probabilistic Description LogicsHeuristic Ranking in Tightly Coupled Probabilistic Description Logics
Heuristic Ranking in Tightly Coupled Probabilistic Description Logics
 
Datalog and its Extensions for Semantic Web Databases
Datalog and its Extensions for Semantic Web DatabasesDatalog and its Extensions for Semantic Web Databases
Datalog and its Extensions for Semantic Web Databases
 
AMBER WWW 2012 Poster
AMBER WWW 2012 PosterAMBER WWW 2012 Poster
AMBER WWW 2012 Poster
 
AMBER WWW 2012 (Demonstration)
AMBER WWW 2012 (Demonstration)AMBER WWW 2012 (Demonstration)
AMBER WWW 2012 (Demonstration)
 
DIADEM WWW 2012
DIADEM WWW 2012DIADEM WWW 2012
DIADEM WWW 2012
 
OPAL: a passe-partout for web forms - WWW 2012 (Demonstration)
OPAL: a passe-partout for web forms - WWW 2012 (Demonstration)OPAL: a passe-partout for web forms - WWW 2012 (Demonstration)
OPAL: a passe-partout for web forms - WWW 2012 (Demonstration)
 
Querying UML Class Diagrams - FoSSaCS 2012
Querying UML Class Diagrams - FoSSaCS 2012Querying UML Class Diagrams - FoSSaCS 2012
Querying UML Class Diagrams - FoSSaCS 2012
 
OPAL: automated form understanding for the deep web - WWW 2012
OPAL: automated form understanding for the deep web - WWW 2012OPAL: automated form understanding for the deep web - WWW 2012
OPAL: automated form understanding for the deep web - WWW 2012
 
Nyaya: Semantic data markets: a flexible environment for knowledge management...
Nyaya: Semantic data markets: a flexible environment for knowledge management...Nyaya: Semantic data markets: a flexible environment for knowledge management...
Nyaya: Semantic data markets: a flexible environment for knowledge management...
 
Table Recognition
Table RecognitionTable Recognition
Table Recognition
 
The Diadem Ontology
The Diadem OntologyThe Diadem Ontology
The Diadem Ontology
 
Diadem 1.0
Diadem 1.0Diadem 1.0
Diadem 1.0
 

Último

Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension AidPhilip Schwarz
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...Health
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Steffen Staab
 
10 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 202410 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 2024Mind IT Systems
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfAzure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfryanfarris8
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdfPearlKirahMaeRagusta1
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech studentsHimanshiGarg82
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 

Último (20)

Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
10 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 202410 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 2024
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfAzure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdf
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 

Fairhair.ai – alan turing institute june '17 (public)

  • 1. Outside Insight The Meltwater Data Science Platform ATI London – June 19th 2017
  • 3. Sources: Editorial, Social, Broadcasts Meltwater: Media Intelligence media exposure trends influencers sentiment analysis More than 300k different types of user queries
  • 4. Meltwater: Some numbers Our ingestion fetches about 3.3M documents / day from 190k editorial sources, re-crawled every 30 minutes. With the social fire hoses we go up to 30M documents / day. Since its foundation, Meltwater has indexed almost 200B documents.
  • 5. Meltwater: How it is done Ingestion: ⭕ Social media hoses (partnerships) ⭕ Editorial (partnerships + web crawling) ⭕ Broadcasts (views on the above) Media Intelligence applications (Custom) ⭕ Boolean queries (keywords / entities) ⭕ Counters ⭕ Aggregates ⭕ Drill downs / pivoting Enrichments (15 languages): ⭕ Text categorization (topic, language) ⭕ NERD (person, location, organization, ...) ⭕ NED ( https://en.wikipedia.org/wiki/Tim_Cook ) ⭕ Sentiment Analysis Storage and search: ⭕ Elastic search ⭕ Rabbit MQ (distributed queues) ⭕ AWS
  • 6. Building Insights Raw data, statistics, aggregates, summaries, and visualizations are all good but... ⭕ Insights are still discovered by analysts by munching and interpreting the data ⭕ ...and we all know how data can be deceitful What we need is a systematic way to mine, propose, and explain possible insights ⭕ we need factual knowledge ⭕ combine (machine) learning and reasoning Source: www.tylervigen.com (Spurious Correlations)
  • 7. Vision: Insight Building as a Platform Build a world class AI platform for a new software category Outside Insight A I
  • 8. Meltwater’s AI-First Approach Use Data Science (data, algorithms, infrastructure, tools) to power everything
  • 9. Data and Content Lakes Credit Ratings Editorial Patents & Trademarks Gov Data Job Postings Social Media Company Websites SEO Filings Factual information: wherefrom? Need to restrict the domain: focus on the corporate domain, i.e., companies, people, products, ... just for last year
  • 10. Data Lake and Outside Data Factual information... How? Get as close as possible to the original source of information Wrapidity (acq. 2016) automates outside data acquisition via web scraping
  • 11. Orchestration Enrichments Content Lake NED Relation Extraction NER Coreference DOM / TABLE / ANNO (Wrapidity) Triples Training Pipeline Enriched Docs Sentiment Analysis Event Extraction Factual information: how? Linguistic enrichments provide a way to support semantic retrieval and fact extraction. Summary: ⭕ Scalable and distributed enrichment workflows (Cosmify acquisition), ⭕ CRFs for NER, PageRank (variant) for NED, CNNs and LSTMs for Relation/Event extraction, Bayesian classification for sentiment analysis, ⭕ TensorFlow and GPUs for training infrastructure
  • 12. Enrichments We can’t foresee all uses of our data: Developer APIs to Integrate and orchestrate third party tools. Personalization is key in Data Science: A flexible data wrangling infrastructure is required.
  • 13. Knowledge Graph Wait… did you say PageRank, triples? So do you have a (Knowledge) Graph? Challenges: ⭕ Data Cleaning ⭕ Data deduplication / integration ⭕ Truth Finding Goals: ⭕ Relate facts ⭕ Data mining ⭕ Cognitive applications (higher-order reasoning) Content: ⭕ Companies ⭕ Brands ⭕ Products ⭕ Key people ⭕ Influencers
  • 14. Infer high-level insights from a set of extracted events/facts. ⭕ Competitor ⭕ Customer ⭕ Investment ⭕ Lawsuit/Litigation ⭕ Partnership Cognitive Applications Insight discovery: ⭕ Rule/Graph mining (data cleaning) ○ GPAR (VLDB ‘15) ○ RUDIK (internal, paper submitted) ⭕ Link prediction (data enrichment, fact checking) ○ Path Ranking Algorithms (PRA) ○ Probabilistic Soft Logic (PSL) ⭕ Supplier ⭕ Acquisition ⭕ Out/under performance ⭕ Expanding Operations ⭕ Compliance ⭕ Funding Developments ⭕ Leadership Changes ⭕ New Offerings ⭕ Bankruptcy, ⭕ Restructuring, Cost Cutting
  • 15. Cognitive Applications complement traditional media intelligence tools Cognitive Applications
  • 16. Involve Users, Entrepreneurs, and Researchers 6 Data Science Hubs (co-working spaces) ✔ Sydney ✔ Berlin ✔ New York ✔ London ✔ San Francisco ✔ Singapore Meltwater Entrepreneurial School of Technology ⭕ HQ in Accra, Ghana ⭕ Training program for African entrepreneurs ⭕ Incubator (25+ startups) ⭕ Networking hub University collaborations