DATA CURATION INSIGHTS
HELEN LIPPELL
METADATA MANAGER
BIG (Big Data Public Private Forum)
“We understand that it is not about a single tool: a whole suite of tools
is needed to make sure that data curation fits into the whole lifecycle
effectively.”
Introduction
Helen Lippell is an experienced information professional with
extensive experience of managing data and metadata for
organisations across the media, financial, government and
entertainment sectors. Her passion is building knowledge
organisation systems such as ontologies to support a wide range
of business applications such as search, content publishing and
business intelligence.
Edward Curry: Please tell us a little about yourself and your
history in the industry, role at the organisation, past
experience?
Helen Lippell: I am working at the Press Association (PA), which
is a partner in BIG. It is a very well established news agency for
the UK and Ireland. My job title is metadata manager, which
means introducing metadata descriptors into our core content. It is
a fairly new role, but it basically means applying more semantic
information to content in order to package it more flexibly. The
traditional product for the PA is the Newswire. There is a need to
use that data more, to describe it better and to curate it better.
I have been working for about 15 years in this industry. I started
out in media and financial publishing organizations, including
the Financial Times and the BBC. I would describe myself as an
information professional. I have had different job titles, but it always
boils down to making sense of data and content, structuring things
and describing things.
Edward Curry: What is the biggest change going on in your
industry at this time? Is Big Data playing a role in the
change?
Helen Lippell: I think it is related to the sheer increase in volumes
of data that can be captured. It is also related to the fact that now
everyone expects to watch and consume whatever they want,
when they want, on whatever device they want, all of which can
generate enormous amounts of data. I guess the big media
organisations, and PA is one of them, are working out how they fit
into that landscape, because the traditional business model might
not always be effective.
Edward Curry: How has your job changed in the past 12
months? How do you expect it to change in the next 3-5
years?
Helen Lippell: I have just moved to a new role that is still taking shape. I
would say it involves more thinking about the future of the information
professional role. I come from a traditional librarianship world,
which has been describing and structuring things for thousands of
years. But now we've got online data. I think it is about understanding how
these roles are changing, in particular for data curators. One
change is the need for analytical skills: trying to make sense of
massive amounts of information, trying to extract value from it. Extracting
meaning from large volumes of data.
How can this be applied to business needs (how businesses can make
money) or to how governments can govern effectively? So it is about bridging the
gap between the technological and the statistical. The role is now becoming
more technical.
These roles have been in the industry for a while, but sometimes they
get obscure job titles for some reason.
Edward Curry: What does data curation mean to you in your
organisation?
Helen Lippell: Having some processes in place to actually make
sense of this enormous fire-hose of words that media companies
produce. And being able to respond to customers' requirements. If
someone brings up a bunch of content, well Google search cannot
deal with them due to noise. So data curation means in this context
tagging the content appropriately, building semantic graphs to find
what is related to what and then applying those concepts to the
content, in order to satisfy those customer needs.
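
The tagging and semantic-graph idea described in this answer can be illustrated with a small sketch. It is not PA's system: the vocabulary, concept identifiers and stories below are hypothetical, and simple label matching plus co-occurrence counting stands in for whatever tagging and graph-building methods are actually used.

```python
import re
from collections import defaultdict
from itertools import combinations

# Hypothetical controlled vocabulary: concept ID -> labels to look for in text.
VOCABULARY = {
    "person/andy-murray": ["Andy Murray", "Murray"],
    "event/wimbledon": ["Wimbledon"],
    "place/london": ["London"],
    "org/bbc": ["BBC"],
}


def tag(text):
    """Return the set of concept IDs whose labels appear in the text."""
    found = set()
    for concept, labels in VOCABULARY.items():
        for label in labels:
            if re.search(r"\b" + re.escape(label) + r"\b", text):
                found.add(concept)
                break
    return found


def build_graph(tagged_stories):
    """Link concepts tagged on the same story; the weight is the co-occurrence count."""
    graph = defaultdict(lambda: defaultdict(int))
    for concepts in tagged_stories.values():
        for a, b in combinations(sorted(concepts), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def related(graph, concept):
    """Concepts linked to `concept`, strongest link first, ties broken by name."""
    neighbours = graph.get(concept, {})
    return sorted(neighbours, key=lambda c: (-neighbours[c], c))


# Hypothetical stories, invented for illustration only.
STORIES = {
    "s1": "Andy Murray wins at Wimbledon in front of a London crowd.",
    "s2": "The BBC will broadcast Wimbledon again next year.",
    "s3": "Murray returns to Wimbledon after injury.",
}

tagged = {story_id: tag(text) for story_id, text in STORIES.items()}
graph = build_graph(tagged)
print(related(graph, "event/wimbledon"))
```

Once stories carry concept tags, a customer request such as "everything related to Wimbledon" becomes a lookup in the graph rather than a noisy free-text search.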
Edward Curry: What data do you curate? What is the size of
dataset? How many users are involved in curating the data?
Helen Lippell: I am curating data that is newsworthy like people,
organizations, events, places, and using a mixture of public domain
information like Freebase, with BI kind of knowledge and insights, and
exporting to an RDF triple store to be used by journalists when
creating content. The data we are producing are stories, images,
multimedia and the identification of key concepts within that.
This is one part of the story. PA is an extremely diverse organization.
For example we have a subsidiary that does weather forecasting: they
could be dealing with enormous volumes of data as well.
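
As a rough illustration of what exporting curated concepts to an RDF triple store might involve, the sketch below serialises one concept as N-Triples using only the Python standard library. The `http://example.org/` namespace, the identifier and the person are hypothetical; the RDF, RDFS, OWL and FOAF terms are the standard public vocabularies, and the interview does not say which vocabularies PA uses.

```python
# Standard vocabulary terms (public, well-known IRIs).
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
FOAF_PERSON = "http://xmlns.com/foaf/0.1/Person"

# Hypothetical namespace for locally minted concept identifiers.
BASE = "http://example.org/concept/"


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return "<" + value + ">"


def literal(text):
    """Quote a plain string literal, escaping backslashes, quotes and newlines."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return '"' + escaped + '"'


def person_triples(local_id, name, external_iri):
    """Describe a person and link the local concept to a public-domain record."""
    subject = iri(BASE + local_id)
    return [
        (subject, iri(RDF_TYPE), iri(FOAF_PERSON)),
        (subject, iri(RDFS_LABEL), literal(name)),
        (subject, iri(OWL_SAME_AS), iri(external_iri)),
    ]


def to_ntriples(triples):
    """Serialise (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples) + "\n"


# Hypothetical person and external record, for illustration only.
triples = person_triples(
    "person/0001", 'Jane "JD" Doe', "http://example.org/external/jane-doe"
)
document = to_ntriples(triples)
print(document)
```

The `owl:sameAs` statement is one common way to connect an in-house concept to its counterpart in a public dataset, which is the kind of mixing of public and internal knowledge the answer describes.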
Edward Curry: What are the uses of the curated data? What value
does it give?
Helen Lippell: Broadly speaking, PA provides media and content to
a large range of customers: newspapers, the BBC, sports sites, niche
websites, local newspapers, etc.
CONTACT
Edward Curry: What processes and technologies do you use
for data curation? Any comments on the performance or
design of the technologies?
Helen Lippell: PA uses a metadata management tool with built-in
functionality and APIs that crawl data from the public
domain and inject it into our system. Later, unique identifiers are
assigned to the public content, and data curators check and validate
the data before pushing it into the production system.
Journalists will progressively get involved in this process.
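
The workflow described in this answer (ingest public data, assign unique identifiers, have curators validate, then push to production) could be sketched roughly as follows. Everything here is an assumption made for illustration: the `Record` fields, the function names and the validation rules are hypothetical, and the automated `review` step only stands in for the checks that human curators perform.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Record:
    """A hypothetical record harvested from a public-domain source."""
    label: str
    source: str
    identifier: str = ""
    approved: bool = False
    issues: list = field(default_factory=list)


def ingest(raw_items):
    """Turn raw harvested items into records (stand-in for the crawl step)."""
    return [
        Record(label=item.get("label", "").strip(), source=item.get("source", ""))
        for item in raw_items
    ]


def assign_identifiers(records):
    """Give every record without an identifier a new unique one."""
    for record in records:
        if not record.identifier:
            record.identifier = uuid.uuid4().hex
    return records


def review(records):
    """Flag simple problems; a stand-in for the curators' manual validation."""
    seen = set()
    for record in records:
        key = record.label.casefold()
        if not record.label:
            record.issues.append("missing label")
        if key and key in seen:
            record.issues.append("duplicate label")
        seen.add(key)
        record.approved = not record.issues
    return records


def publish(records, production):
    """Copy only approved records into the production store."""
    for record in records:
        if record.approved:
            production[record.identifier] = record
    return production


# Hypothetical harvested items, invented for illustration.
raw = [
    {"label": "Press Association", "source": "public"},
    {"label": "press association", "source": "public"},
    {"label": "  ", "source": "public"},
]
production = publish(review(assign_identifiers(ingest(raw))), {})
print(len(production), "record(s) published")
```

Keeping validation as a separate step before publication means rejected records stay out of production while remaining available, with their recorded issues, for a curator to fix.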
Edward Curry: What is your understanding of the term "Big
Data"?
Helen Lippell: In my opinion, Big Data is an exponential increase in
data which is unprecedented in human history. It is not just about the
exponential growth in data but also about the technology that goes
along with that. It is related to the technologies that can cope with
the exponential growth of data. It is closely related to cloud
computing which offers the flexibility, scalability and cost
advantages that make large-scale data processing viable.
Edward Curry: What influences do you think "Big Data" will
have on future of data curation? What are the technological
demands of curation in the "Big Data" context?
Helen Lippell: The amount of digital information and the volume of
data we have today require data curation to be looked at in very
different ways. We have got millions of information resources and
extracting the value from all of these requires a different kind of
information literacy. I think the whole Big Data movement will
provide some sort of applications or algorithms which data curators
will be able to use in order to find the real value hidden deep inside
millions of information resources.
From my perspective, things like text analytics are core
technologies. Text analytics has been around for a while, but only
now has it gone mainstream, enabling data curators to work
with massive corpora to find what they need.
Edward Curry: What data curation technologies will cope with
"Big Data"?
Helen Lippell: We understand that it is not about a single tool: a
whole suite of tools is needed to make sure that data curation
fits into the whole lifecycle effectively. All the stuff around the
community, review cycles and how to integrate it with the rest of the
architecture is central. Tools need to be flexible enough to export the data in
whatever format is needed.
About the BIG Project
The BIG project aims to create a collaborative platform to address the
challenges and discuss the opportunities offered by technologies that provide
treatment of large volumes of data (Big Data) and its impact in the new
economy. BIG's contributions will be crucial for industry, the
scientific community, policy makers and the general public, since the
management of large amounts of data plays an increasingly important role in
society and in the current economy.
“If someone brings up a bunch of content, well Google search cannot deal with them due to
noise, so data curation means in this context tagging the content appropriately, building
semantic graphs to find what is related to what and then applying those concepts to content, in
order to satisfy those customer needs.”
http://big-project.eu
TECHNICAL WORKING GROUPS AND INDUSTRY
Project co-funded by the European Commission within the 7th Framework Programme (Grant Agreement No. 318062)
Collaborative Project
in Information and Communication Technologies
2012 — 2014
General contact
info@big-project.eu
Project Coordinator
Jose Maria Cavanillas de San Segundo
Research & Innovation Director
Atos Spain Corporation
Albarracín 25
28037 Madrid, Spain
Phone: +34 912148609
Fax: +34 917543252
Email: jose-maria.cavanillas@atosresearch.eu
Strategic Director
Prof. Wolfgang Wahlster
CEO and Scientific Director
Deutsches Forschungszentrum
für Künstliche Intelligenz
66123 Saarbrücken, Germany
Phone: +49 681 85775 5252 or 5251
Fax: +49 681 85775 5383
Email: wahlster@dfki.de
Data Curation Working Group
Dr. Edward Curry
Digital Enterprise Research Institute
National University of Ireland, Galway
Email: ed.curry@deri.org

Helen Lippell BIG Project Data Curation Interview
