SlideShare a Scribd company logo
1 of 43
Download to read offline
Removing Barriers to Transparency:
a Case Study on the Use of Semantic
Technologies to Tackle Procurement Data
Inconsistency
Giuseppe Futia*, Alessio Melandri**, Antonio Vetrò*,
Federico Morando**, Juan Carlos De Martin*
* Nexa Center for Internet and Society, Politecnico di Torino (DAUIN)
** Synapta Srl
Outline
• Public procurement context
• Our contribution
• Results
• Discussion
• Future works
2
Context
What is Transparency?
Data related to functioning of government can
be accessed and interpreted, without being
pre-processed and manipulated
ESWC 2017 - Removing Barriers to Transparency 5
Open Government Data (OGD) is (or should be)
a mechanism integrated in the heart of the
government functions for creating transparency
ESWC 2017 - Removing Barriers to Transparency 6
Public Procurement (PP) is a specific area of the
OGD that increases openness of government
information and supports business activities
ESWC 2017 - Removing Barriers to Transparency 7
Is OGD Enough for Enabling
Transparency through
Procurement Data?
• An obstacle to transparency is related to the
fragmentation of OGD in the domain of
procurement data
• Fragmentation can generate discrepancies and
inconsistencies among data that can not be a
priori detected and fixed
ESWC 2017 - Removing Barriers to Transparency 9
The Italian Context
Our case study is represented by Italian PP data
published on different websites of Italian
administrations - in compliance with the Italian
anti-corruption Act (law n. 190/2012)
ESWC 2017 - Removing Barriers to Transparency 11
1212
13ESWC 2017 - Removing Barriers to Transparency
300.000 XML files from
more than 16.000 public bodies
What is Data Consistency?
Data consistency refers to the absence of
apparent contradictions within data
(ISO/IEC 25012)
ESWC 2017 - Removing Barriers to Transparency 16
• Certain types of consistency problems directly emerge analyzing
contracts data collected in a single XML file
– contracts in which the beneficiary is more than one
– contracts in which the amount of money is paid, but no
recipient is present in the data
– contracts in which the sum reported as paid is greater than
the sum initially awarded to the beneficiary
ESWC 2017 - Removing Barriers to Transparency 17
• Other types of inconsistencies manifest themselves only after
interlinking and merging data contained in different sources
– business entities with more than one business name
– CIGs that identify more than one contract
– incoherent payments among different versions of an ongoing
contract
ESWC 2017 - Removing Barriers to Transparency 18
These issues represent a significant barrier to
achieve transparency, because the results obtained
by querying the dataset are misleading
ESWC 2017 - Removing Barriers to Transparency 19
Our Contribution
21
pc: http://purl.org/procurement/public-contracts#
dcterms: http://purl.org/dc/terms#
gr: http://purl.org/goodrelations/v1#
dbpedia-owl: http://dbpedia.org/ontology/
pay: http://reference.data.gov.uk/def/payment#
time: http://www.w3.org/2006/time#
org: http://www.w3.org/ns/org#
foaf: http://xmlns.com/foaf/0.1#
22
23
Results
25
26
27
Discussion on Inconsistency Issues
• Business entities with more than one business
name
• CIGs that identify more than one contract
• Incoherent payments among different versions
of an ongoing contract
ESWC 2017 - Removing Barriers to Transparency 29
Business entities with more than
one business name
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX gr: <http://purl.org/goodrelations/v1#>
SELECT (COUNT(DISTINCT ?be)) WHERE {
{
SELECT DISTINCT(?be) WHERE {
?be rdfs:label ?label .
?be a gr:BusinessEntity .
}
GROUP BY ?be HAVING (count(*)>1)
}
}
ESWC 2017 - Removing Barriers to Transparency 31
Exploiting VAT ID value, we obtain a unique
business name for contracting authorities, building
links to the Italian public administration index of
SPCData (http://spcdata.digitpa.gov.it:8899/sparql)
ESWC 2017 - Removing Barriers to Transparency 32
CIGs that identify more than one
contract
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX pc: <http://purl.org/procurement/public-contracts#>
SELECT (COUNT(DISTINCT ?contract)) WHERE {
{
SELECT DISTINCT(?contract) WHERE {
?contract dcterms:identifier ?CIG .
?contract a pc:Contract .
}
GROUP BY ?contract HAVING (count(*)>1)
}
}
ESWC 2017 - Removing Barriers to Transparency 34
The solution to this issue is generating a hash
value, avoiding ambiguity due to duplicate CIGs, to
build contracts URIs. In this way, we separate
different contracts, misidentified by the same CIG,
in different entities
ESWC 2017 - Removing Barriers to Transparency 35
Incoherent payments among
different versions of an ongoing
contract
select ?result (COUNT(*) as ?n) where {
{select ((ROUND(?pay_sum/?ap*100)/100) as ?result) where {
?c <http://purl.org/procurement/public-contracts#agreedPrice> ?ap .
FILTER (?ap > 0).
FILTER (?ap < 1000000000000).
{ select ?c (SUM(?pay) as ?pay_sum)
where {
?c <http://reference.data.gov.uk/def/payment#payment> ?p .
?p <http://reference.data.gov.uk/def/payment#netAmount> ?pay .
FILTER (?pay > 0).
FILTER (?pay < 1000000000000). }group by ?c
}}}} group by ?result order by ?result
ESWC 2017 - Removing Barriers to Transparency 37
38
Conclusions
• We presented an approach to tackle fragmentation
of Italian procurement data and to improve
consistency of such data based on semantic
technologies
• Both these issues represent a significant barrier to
achieve a full transparency, because user and
robots that query the datasets risk to obtain partial,
inconsistent, and misleading results
ESWC 2017 - Removing Barriers to Transparency 40
Future Works
• Developing automatic tools to detect and fix
consistency problems among contracts published in
different files
• Exploring ways to evaluate the provenance of data, in
order to improve the data processing stage and
improve data consistency
• Exploiting advanced methods and dashboards in
order to monitor the consistency degree of the data
ESWC 2017 - Removing Barriers to Transparency 42
Thank you!
Mail: giuseppe.futia@polito.it
Twitter: giuseppe_futia
GitHub: https://github.com/synapta/public-contracts

More Related Content

What's hot

World Economic Forum Tipping Points Report
World Economic Forum Tipping Points ReportWorld Economic Forum Tipping Points Report
World Economic Forum Tipping Points ReportSergey Nazarov
 
World Economic Forum Tipping Point Blockchain
World Economic Forum Tipping Point BlockchainWorld Economic Forum Tipping Point Blockchain
World Economic Forum Tipping Point BlockchainSergey Nazarov
 
Growing impact and future potential of blockchain for telcos: A Game Changer?
Growing impact and future potential of blockchain for telcos: A Game Changer?Growing impact and future potential of blockchain for telcos: A Game Changer?
Growing impact and future potential of blockchain for telcos: A Game Changer?José Luis Núñez Díaz
 
World Economic Forum Tipping Points Report
World Economic Forum Tipping Points ReportWorld Economic Forum Tipping Points Report
World Economic Forum Tipping Points ReportSergey Nazarov
 
GLSEC 2017 Build an Open Data .Net MVC site in 30 mins
GLSEC 2017 Build an Open Data .Net MVC site in 30 minsGLSEC 2017 Build an Open Data .Net MVC site in 30 mins
GLSEC 2017 Build an Open Data .Net MVC site in 30 minsBrian McKeiver
 
South By South Best 2018: Part 2
South By South Best 2018: Part 2South By South Best 2018: Part 2
South By South Best 2018: Part 2James Quinlan
 
IoT Guildford Meetup#26: GDPR, IoT and Transparency
IoT Guildford Meetup#26: GDPR, IoT and TransparencyIoT Guildford Meetup#26: GDPR, IoT and Transparency
IoT Guildford Meetup#26: GDPR, IoT and TransparencyMicheleNati
 
Data privacy & compliance considerations on using cloud services
Data privacy & compliance considerations on using cloud servicesData privacy & compliance considerations on using cloud services
Data privacy & compliance considerations on using cloud servicesCharles Mok
 
Partnering for Innovation (Metadata)
Partnering for Innovation (Metadata)Partnering for Innovation (Metadata)
Partnering for Innovation (Metadata)Chinwag
 
GDPR and IoT: What do you need to know?
GDPR and IoT: What do you need to know?GDPR and IoT: What do you need to know?
GDPR and IoT: What do you need to know?MicheleNati
 
Blockchain’s impact on taxes and global trade
Blockchain’s impact on taxes and global tradeBlockchain’s impact on taxes and global trade
Blockchain’s impact on taxes and global tradeEY
 
How blockchain is changing finance
How blockchain is changing financeHow blockchain is changing finance
How blockchain is changing financeEY
 
Blockchain and io t in carbon credit management
Blockchain and io t in carbon credit managementBlockchain and io t in carbon credit management
Blockchain and io t in carbon credit managementYuri Anisimov
 
Industrializing the blockchain
Industrializing the blockchainIndustrializing the blockchain
Industrializing the blockchainEY
 
L'economia europea dei dati. Politiche europee e opportunità di finanziamento...
L'economia europea dei dati. Politiche europee e opportunità di finanziamento...L'economia europea dei dati. Politiche europee e opportunità di finanziamento...
L'economia europea dei dati. Politiche europee e opportunità di finanziamento...Data Driven Innovation
 
201802 idento.one 1_v1.1
201802 idento.one 1_v1.1201802 idento.one 1_v1.1
201802 idento.one 1_v1.1h-bauer2014
 
The somewhat awkward marriage between digital marketing and data protection (...
The somewhat awkward marriage between digital marketing and data protection (...The somewhat awkward marriage between digital marketing and data protection (...
The somewhat awkward marriage between digital marketing and data protection (...Bart Van Den Brande
 
Opportunities and Challenges in Data Innovation
Opportunities and Challenges in Data InnovationOpportunities and Challenges in Data Innovation
Opportunities and Challenges in Data InnovationCharles Mok
 

What's hot (20)

World Economic Forum Tipping Points Report
World Economic Forum Tipping Points ReportWorld Economic Forum Tipping Points Report
World Economic Forum Tipping Points Report
 
World Economic Forum Tipping Point Blockchain
World Economic Forum Tipping Point BlockchainWorld Economic Forum Tipping Point Blockchain
World Economic Forum Tipping Point Blockchain
 
Growing impact and future potential of blockchain for telcos: A Game Changer?
Growing impact and future potential of blockchain for telcos: A Game Changer?Growing impact and future potential of blockchain for telcos: A Game Changer?
Growing impact and future potential of blockchain for telcos: A Game Changer?
 
World Economic Forum Tipping Points Report
World Economic Forum Tipping Points ReportWorld Economic Forum Tipping Points Report
World Economic Forum Tipping Points Report
 
GLSEC 2017 Build an Open Data .Net MVC site in 30 mins
GLSEC 2017 Build an Open Data .Net MVC site in 30 minsGLSEC 2017 Build an Open Data .Net MVC site in 30 mins
GLSEC 2017 Build an Open Data .Net MVC site in 30 mins
 
South By South Best 2018: Part 2
South By South Best 2018: Part 2South By South Best 2018: Part 2
South By South Best 2018: Part 2
 
IoT Guildford Meetup#26: GDPR, IoT and Transparency
IoT Guildford Meetup#26: GDPR, IoT and TransparencyIoT Guildford Meetup#26: GDPR, IoT and Transparency
IoT Guildford Meetup#26: GDPR, IoT and Transparency
 
Data privacy & compliance considerations on using cloud services
Data privacy & compliance considerations on using cloud servicesData privacy & compliance considerations on using cloud services
Data privacy & compliance considerations on using cloud services
 
Partnering for Innovation (Metadata)
Partnering for Innovation (Metadata)Partnering for Innovation (Metadata)
Partnering for Innovation (Metadata)
 
GDPR and IoT: What do you need to know?
GDPR and IoT: What do you need to know?GDPR and IoT: What do you need to know?
GDPR and IoT: What do you need to know?
 
Blockchain’s impact on taxes and global trade
Blockchain’s impact on taxes and global tradeBlockchain’s impact on taxes and global trade
Blockchain’s impact on taxes and global trade
 
How blockchain is changing finance
How blockchain is changing financeHow blockchain is changing finance
How blockchain is changing finance
 
Blockchain and io t in carbon credit management
Blockchain and io t in carbon credit managementBlockchain and io t in carbon credit management
Blockchain and io t in carbon credit management
 
Industrializing the blockchain
Industrializing the blockchainIndustrializing the blockchain
Industrializing the blockchain
 
L'economia europea dei dati. Politiche europee e opportunità di finanziamento...
L'economia europea dei dati. Politiche europee e opportunità di finanziamento...L'economia europea dei dati. Politiche europee e opportunità di finanziamento...
L'economia europea dei dati. Politiche europee e opportunità di finanziamento...
 
E govt.
E govt.E govt.
E govt.
 
BIS BUCHHOLZ
BIS BUCHHOLZBIS BUCHHOLZ
BIS BUCHHOLZ
 
201802 idento.one 1_v1.1
201802 idento.one 1_v1.1201802 idento.one 1_v1.1
201802 idento.one 1_v1.1
 
The somewhat awkward marriage between digital marketing and data protection (...
The somewhat awkward marriage between digital marketing and data protection (...The somewhat awkward marriage between digital marketing and data protection (...
The somewhat awkward marriage between digital marketing and data protection (...
 
Opportunities and Challenges in Data Innovation
Opportunities and Challenges in Data InnovationOpportunities and Challenges in Data Innovation
Opportunities and Challenges in Data Innovation
 

Similar to Removing barriers to transparency: a case study on the use of semantic technologies to tackle procurement data inconsistency

Antonio Senatore - Intro om blockchain & Nøkkelbruksområder for blockchain me...
Antonio Senatore - Intro om blockchain & Nøkkelbruksområder for blockchain me...Antonio Senatore - Intro om blockchain & Nøkkelbruksområder for blockchain me...
Antonio Senatore - Intro om blockchain & Nøkkelbruksområder for blockchain me...First Tuesday Bergen
 
Data center outsourcing a new paradigm for the IT
Data center outsourcing a new paradigm for the ITData center outsourcing a new paradigm for the IT
Data center outsourcing a new paradigm for the ITAlessandro Guli
 
Revue de presse IoT / Data du 22/01/2017
Revue de presse IoT / Data du 22/01/2017Revue de presse IoT / Data du 22/01/2017
Revue de presse IoT / Data du 22/01/2017Romain Bochet
 
Gartner Interconnectedness Report
Gartner Interconnectedness ReportGartner Interconnectedness Report
Gartner Interconnectedness ReportCMassociates
 
The Global Interconnection Index - Measuring Growth of the Digital Economy
The Global Interconnection Index - Measuring Growth of the Digital EconomyThe Global Interconnection Index - Measuring Growth of the Digital Economy
The Global Interconnection Index - Measuring Growth of the Digital EconomyEquinix
 
How Retirement Services Providers Can Tap Blockchain Thinking and Technology
How Retirement Services Providers Can Tap Blockchain Thinking and TechnologyHow Retirement Services Providers Can Tap Blockchain Thinking and Technology
How Retirement Services Providers Can Tap Blockchain Thinking and TechnologyCognizant
 
The rise of the digital supply network - IIOT Industry40 distribution
The rise of the digital supply network - IIOT Industry40 distribution The rise of the digital supply network - IIOT Industry40 distribution
The rise of the digital supply network - IIOT Industry40 distribution Ian Beckett
 
Data warehouse,data mining & Big Data
Data warehouse,data mining & Big DataData warehouse,data mining & Big Data
Data warehouse,data mining & Big DataRavinder Kamboj
 
Why Telcos need to embrace Digital Transformation now
Why Telcos need to embrace Digital Transformation nowWhy Telcos need to embrace Digital Transformation now
Why Telcos need to embrace Digital Transformation nowEricsson
 
Sgd itm-soft-computing-11-march -11
Sgd itm-soft-computing-11-march -11Sgd itm-soft-computing-11-march -11
Sgd itm-soft-computing-11-march -11Sanjeev Deshmukh
 
Globalization and E-commerce
Globalization and   E-commerceGlobalization and   E-commerce
Globalization and E-commerceArnaldy Fortin
 
Hierarchies without firms? Vertical disintegration, personal outsourcing and ...
Hierarchies without firms? Vertical disintegration, personal outsourcing and ...Hierarchies without firms? Vertical disintegration, personal outsourcing and ...
Hierarchies without firms? Vertical disintegration, personal outsourcing and ...IE Law School, IE University
 
eDiscovery Market - Integrate latest technologies 2025
eDiscovery Market - Integrate latest technologies 2025eDiscovery Market - Integrate latest technologies 2025
eDiscovery Market - Integrate latest technologies 2025Arushi00
 
Evolving regulations are changing the way we think about tools and technology
Evolving regulations are changing the way we think about tools and technologyEvolving regulations are changing the way we think about tools and technology
Evolving regulations are changing the way we think about tools and technologyUlf Mattsson
 
Managing corruption risks in infrastructure... - Gavin Hayman, CoST
Managing corruption risks in infrastructure... - Gavin Hayman, CoSTManaging corruption risks in infrastructure... - Gavin Hayman, CoST
Managing corruption risks in infrastructure... - Gavin Hayman, CoSTOECD Governance
 
Trends in legal tech 2018
Trends in legal tech 2018Trends in legal tech 2018
Trends in legal tech 2018Secure Privacy
 
Vertical chain and transactional cost economy (tce)
Vertical chain and transactional cost economy (tce)Vertical chain and transactional cost economy (tce)
Vertical chain and transactional cost economy (tce)fadi_alnajjar
 
Modernizing the Supply Chain into the 21st Century
Modernizing the Supply Chain into the 21st CenturyModernizing the Supply Chain into the 21st Century
Modernizing the Supply Chain into the 21st CenturyAnil John
 
TNT TECHNOLOGIES PRIVATE L
TNT TECHNOLOGIES PRIVATE LTNT TECHNOLOGIES PRIVATE L
TNT TECHNOLOGIES PRIVATE LTakishaPeck109
 
ICARUS @EBDVF 2018 - BDVA Policy Session (November 2018, Vienna)
ICARUS @EBDVF 2018 - BDVA Policy Session (November 2018, Vienna)ICARUS @EBDVF 2018 - BDVA Policy Session (November 2018, Vienna)
ICARUS @EBDVF 2018 - BDVA Policy Session (November 2018, Vienna)ICARUS2020.aero
 

Similar to Removing barriers to transparency: a case study on the use of semantic technologies to tackle procurement data inconsistency (20)

Antonio Senatore - Intro om blockchain & Nøkkelbruksområder for blockchain me...
Antonio Senatore - Intro om blockchain & Nøkkelbruksområder for blockchain me...Antonio Senatore - Intro om blockchain & Nøkkelbruksområder for blockchain me...
Antonio Senatore - Intro om blockchain & Nøkkelbruksområder for blockchain me...
 
Data center outsourcing a new paradigm for the IT
Data center outsourcing a new paradigm for the ITData center outsourcing a new paradigm for the IT
Data center outsourcing a new paradigm for the IT
 
Revue de presse IoT / Data du 22/01/2017
Revue de presse IoT / Data du 22/01/2017Revue de presse IoT / Data du 22/01/2017
Revue de presse IoT / Data du 22/01/2017
 
Gartner Interconnectedness Report
Gartner Interconnectedness ReportGartner Interconnectedness Report
Gartner Interconnectedness Report
 
The Global Interconnection Index - Measuring Growth of the Digital Economy
The Global Interconnection Index - Measuring Growth of the Digital EconomyThe Global Interconnection Index - Measuring Growth of the Digital Economy
The Global Interconnection Index - Measuring Growth of the Digital Economy
 
How Retirement Services Providers Can Tap Blockchain Thinking and Technology
How Retirement Services Providers Can Tap Blockchain Thinking and TechnologyHow Retirement Services Providers Can Tap Blockchain Thinking and Technology
How Retirement Services Providers Can Tap Blockchain Thinking and Technology
 
The rise of the digital supply network - IIOT Industry40 distribution
The rise of the digital supply network - IIOT Industry40 distribution The rise of the digital supply network - IIOT Industry40 distribution
The rise of the digital supply network - IIOT Industry40 distribution
 
Data warehouse,data mining & Big Data
Data warehouse,data mining & Big DataData warehouse,data mining & Big Data
Data warehouse,data mining & Big Data
 
Why Telcos need to embrace Digital Transformation now
Why Telcos need to embrace Digital Transformation nowWhy Telcos need to embrace Digital Transformation now
Why Telcos need to embrace Digital Transformation now
 
Sgd itm-soft-computing-11-march -11
Sgd itm-soft-computing-11-march -11Sgd itm-soft-computing-11-march -11
Sgd itm-soft-computing-11-march -11
 
Globalization and E-commerce
Globalization and   E-commerceGlobalization and   E-commerce
Globalization and E-commerce
 
Hierarchies without firms? Vertical disintegration, personal outsourcing and ...
Hierarchies without firms? Vertical disintegration, personal outsourcing and ...Hierarchies without firms? Vertical disintegration, personal outsourcing and ...
Hierarchies without firms? Vertical disintegration, personal outsourcing and ...
 
eDiscovery Market - Integrate latest technologies 2025
eDiscovery Market - Integrate latest technologies 2025eDiscovery Market - Integrate latest technologies 2025
eDiscovery Market - Integrate latest technologies 2025
 
Evolving regulations are changing the way we think about tools and technology
Evolving regulations are changing the way we think about tools and technologyEvolving regulations are changing the way we think about tools and technology
Evolving regulations are changing the way we think about tools and technology
 
Managing corruption risks in infrastructure... - Gavin Hayman, CoST
Managing corruption risks in infrastructure... - Gavin Hayman, CoSTManaging corruption risks in infrastructure... - Gavin Hayman, CoST
Managing corruption risks in infrastructure... - Gavin Hayman, CoST
 
Trends in legal tech 2018
Trends in legal tech 2018Trends in legal tech 2018
Trends in legal tech 2018
 
Vertical chain and transactional cost economy (tce)
Vertical chain and transactional cost economy (tce)Vertical chain and transactional cost economy (tce)
Vertical chain and transactional cost economy (tce)
 
Modernizing the Supply Chain into the 21st Century
Modernizing the Supply Chain into the 21st CenturyModernizing the Supply Chain into the 21st Century
Modernizing the Supply Chain into the 21st Century
 
TNT TECHNOLOGIES PRIVATE L
TNT TECHNOLOGIES PRIVATE LTNT TECHNOLOGIES PRIVATE L
TNT TECHNOLOGIES PRIVATE L
 
ICARUS @EBDVF 2018 - BDVA Policy Session (November 2018, Vienna)
ICARUS @EBDVF 2018 - BDVA Policy Session (November 2018, Vienna)ICARUS @EBDVF 2018 - BDVA Policy Session (November 2018, Vienna)
ICARUS @EBDVF 2018 - BDVA Policy Session (November 2018, Vienna)
 

More from giuseppe_futia

From Big Linked Data to Linked Big Data - DBpedia as a framework for data int...
From Big Linked Data to Linked Big Data - DBpedia as a framework for data int...From Big Linked Data to Linked Big Data - DBpedia as a framework for data int...
From Big Linked Data to Linked Big Data - DBpedia as a framework for data int...giuseppe_futia
 
Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...
Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...
Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...giuseppe_futia
 
TellMeFirst - A knowledge domain discovery framework
TellMeFirst - A knowledge domain discovery frameworkTellMeFirst - A knowledge domain discovery framework
TellMeFirst - A knowledge domain discovery frameworkgiuseppe_futia
 
From unstructured data to structured journalism
From unstructured data to structured journalismFrom unstructured data to structured journalism
From unstructured data to structured journalismgiuseppe_futia
 
Visualization of Linked Data
Visualization of Linked DataVisualization of Linked Data
Visualization of Linked Datagiuseppe_futia
 
Exploiting Linked Open Data and Natural Language Processing for Classificati...
Exploiting Linked Open Data  and Natural Language Processing for Classificati...Exploiting Linked Open Data  and Natural Language Processing for Classificati...
Exploiting Linked Open Data and Natural Language Processing for Classificati...giuseppe_futia
 
Visualizing Internet-Measurements Data for Research Purposes: the NeuViz Data...
Visualizing Internet-Measurements Data for Research Purposes: the NeuViz Data...Visualizing Internet-Measurements Data for Research Purposes: the NeuViz Data...
Visualizing Internet-Measurements Data for Research Purposes: the NeuViz Data...giuseppe_futia
 

More from giuseppe_futia (7)

From Big Linked Data to Linked Big Data - DBpedia as a framework for data int...
From Big Linked Data to Linked Big Data - DBpedia as a framework for data int...From Big Linked Data to Linked Big Data - DBpedia as a framework for data int...
From Big Linked Data to Linked Big Data - DBpedia as a framework for data int...
 
Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...
Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...
Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...
 
TellMeFirst - A knowledge domain discovery framework
TellMeFirst - A knowledge domain discovery frameworkTellMeFirst - A knowledge domain discovery framework
TellMeFirst - A knowledge domain discovery framework
 
From unstructured data to structured journalism
From unstructured data to structured journalismFrom unstructured data to structured journalism
From unstructured data to structured journalism
 
Visualization of Linked Data
Visualization of Linked DataVisualization of Linked Data
Visualization of Linked Data
 
Exploiting Linked Open Data and Natural Language Processing for Classificati...
Exploiting Linked Open Data  and Natural Language Processing for Classificati...Exploiting Linked Open Data  and Natural Language Processing for Classificati...
Exploiting Linked Open Data and Natural Language Processing for Classificati...
 
Visualizing Internet-Measurements Data for Research Purposes: the NeuViz Data...
Visualizing Internet-Measurements Data for Research Purposes: the NeuViz Data...Visualizing Internet-Measurements Data for Research Purposes: the NeuViz Data...
Visualizing Internet-Measurements Data for Research Purposes: the NeuViz Data...
 

Recently uploaded

Presentation2.pptx - JoyPress Wordpress
Presentation2.pptx -  JoyPress WordpressPresentation2.pptx -  JoyPress Wordpress
Presentation2.pptx - JoyPress Wordpressssuser166378
 
Check out the Free Landing Page Hosting in 2024
Check out the Free Landing Page Hosting in 2024Check out the Free Landing Page Hosting in 2024
Check out the Free Landing Page Hosting in 2024Shubham Pant
 
WordPress by the numbers - Jan Loeffler, CTO WebPros, CloudFest 2024
WordPress by the numbers - Jan Loeffler, CTO WebPros, CloudFest 2024WordPress by the numbers - Jan Loeffler, CTO WebPros, CloudFest 2024
WordPress by the numbers - Jan Loeffler, CTO WebPros, CloudFest 2024Jan Löffler
 
TYPES AND DEFINITION OF ONLINE CRIMES AND HAZARDS
TYPES AND DEFINITION OF ONLINE CRIMES AND HAZARDSTYPES AND DEFINITION OF ONLINE CRIMES AND HAZARDS
TYPES AND DEFINITION OF ONLINE CRIMES AND HAZARDSedrianrheine
 
Introduction to ICANN and Fellowship program by Shreedeep Rayamajhi.pdf
Introduction to ICANN and Fellowship program  by Shreedeep Rayamajhi.pdfIntroduction to ICANN and Fellowship program  by Shreedeep Rayamajhi.pdf
Introduction to ICANN and Fellowship program by Shreedeep Rayamajhi.pdfShreedeep Rayamajhi
 
Computer 10 Lesson 8: Building a Website
Computer 10 Lesson 8: Building a WebsiteComputer 10 Lesson 8: Building a Website
Computer 10 Lesson 8: Building a WebsiteMavein
 
Niche Domination Prodigy Review Plus Bonus
Niche Domination Prodigy Review Plus BonusNiche Domination Prodigy Review Plus Bonus
Niche Domination Prodigy Review Plus BonusSkylark Nobin
 
A_Z-1_0_4T_00A-EN_U-Po_w_erPoint_06.pptx
A_Z-1_0_4T_00A-EN_U-Po_w_erPoint_06.pptxA_Z-1_0_4T_00A-EN_U-Po_w_erPoint_06.pptx
A_Z-1_0_4T_00A-EN_U-Po_w_erPoint_06.pptxjayshuklatrainer
 
Bio Medical Waste Management Guideliness 2023 ppt.pptx
Bio Medical Waste Management Guideliness 2023 ppt.pptxBio Medical Waste Management Guideliness 2023 ppt.pptx
Bio Medical Waste Management Guideliness 2023 ppt.pptxnaveenithkrishnan
 
Benefits of doing Internet peering and running an Internet Exchange (IX) pres...
Benefits of doing Internet peering and running an Internet Exchange (IX) pres...Benefits of doing Internet peering and running an Internet Exchange (IX) pres...
Benefits of doing Internet peering and running an Internet Exchange (IX) pres...APNIC
 
LESSON 10/ GROUP 10/ ST. THOMAS AQUINASS
LESSON 10/ GROUP 10/ ST. THOMAS AQUINASSLESSON 10/ GROUP 10/ ST. THOMAS AQUINASS
LESSON 10/ GROUP 10/ ST. THOMAS AQUINASSlesteraporado16
 
Zero-day Vulnerabilities
Zero-day VulnerabilitiesZero-day Vulnerabilities
Zero-day Vulnerabilitiesalihassaah1994
 
LESSON 5 GROUP 10 ST. THOMAS AQUINAS.pdf
LESSON 5 GROUP 10 ST. THOMAS AQUINAS.pdfLESSON 5 GROUP 10 ST. THOMAS AQUINAS.pdf
LESSON 5 GROUP 10 ST. THOMAS AQUINAS.pdfmchristianalwyn
 
Vision Forward: Tracing Image Search SEO From Its Roots To AI-Enhanced Horizons
Vision Forward: Tracing Image Search SEO From Its Roots To AI-Enhanced HorizonsVision Forward: Tracing Image Search SEO From Its Roots To AI-Enhanced Horizons
Vision Forward: Tracing Image Search SEO From Its Roots To AI-Enhanced HorizonsRoxana Stingu
 
world Tuberculosis day ppt 25-3-2024.pptx
world Tuberculosis day ppt 25-3-2024.pptxworld Tuberculosis day ppt 25-3-2024.pptx
world Tuberculosis day ppt 25-3-2024.pptxnaveenithkrishnan
 

Recently uploaded (15)

Presentation2.pptx - JoyPress Wordpress
Presentation2.pptx -  JoyPress WordpressPresentation2.pptx -  JoyPress Wordpress
Presentation2.pptx - JoyPress Wordpress
 
Check out the Free Landing Page Hosting in 2024
Check out the Free Landing Page Hosting in 2024Check out the Free Landing Page Hosting in 2024
Check out the Free Landing Page Hosting in 2024
 
WordPress by the numbers - Jan Loeffler, CTO WebPros, CloudFest 2024
WordPress by the numbers - Jan Loeffler, CTO WebPros, CloudFest 2024WordPress by the numbers - Jan Loeffler, CTO WebPros, CloudFest 2024
WordPress by the numbers - Jan Loeffler, CTO WebPros, CloudFest 2024
 
TYPES AND DEFINITION OF ONLINE CRIMES AND HAZARDS
TYPES AND DEFINITION OF ONLINE CRIMES AND HAZARDSTYPES AND DEFINITION OF ONLINE CRIMES AND HAZARDS
TYPES AND DEFINITION OF ONLINE CRIMES AND HAZARDS
 
Introduction to ICANN and Fellowship program by Shreedeep Rayamajhi.pdf
Introduction to ICANN and Fellowship program  by Shreedeep Rayamajhi.pdfIntroduction to ICANN and Fellowship program  by Shreedeep Rayamajhi.pdf
Introduction to ICANN and Fellowship program by Shreedeep Rayamajhi.pdf
 
Computer 10 Lesson 8: Building a Website
Computer 10 Lesson 8: Building a WebsiteComputer 10 Lesson 8: Building a Website
Computer 10 Lesson 8: Building a Website
 
Niche Domination Prodigy Review Plus Bonus
Niche Domination Prodigy Review Plus BonusNiche Domination Prodigy Review Plus Bonus
Niche Domination Prodigy Review Plus Bonus
 
A_Z-1_0_4T_00A-EN_U-Po_w_erPoint_06.pptx
A_Z-1_0_4T_00A-EN_U-Po_w_erPoint_06.pptxA_Z-1_0_4T_00A-EN_U-Po_w_erPoint_06.pptx
A_Z-1_0_4T_00A-EN_U-Po_w_erPoint_06.pptx
 
Bio Medical Waste Management Guideliness 2023 ppt.pptx
Bio Medical Waste Management Guideliness 2023 ppt.pptxBio Medical Waste Management Guideliness 2023 ppt.pptx
Bio Medical Waste Management Guideliness 2023 ppt.pptx
 
Benefits of doing Internet peering and running an Internet Exchange (IX) pres...
Benefits of doing Internet peering and running an Internet Exchange (IX) pres...Benefits of doing Internet peering and running an Internet Exchange (IX) pres...
Benefits of doing Internet peering and running an Internet Exchange (IX) pres...
 
LESSON 10/ GROUP 10/ ST. THOMAS AQUINASS
LESSON 10/ GROUP 10/ ST. THOMAS AQUINASSLESSON 10/ GROUP 10/ ST. THOMAS AQUINASS
LESSON 10/ GROUP 10/ ST. THOMAS AQUINASS
 
Zero-day Vulnerabilities
Zero-day VulnerabilitiesZero-day Vulnerabilities
Zero-day Vulnerabilities
 
LESSON 5 GROUP 10 ST. THOMAS AQUINAS.pdf
LESSON 5 GROUP 10 ST. THOMAS AQUINAS.pdfLESSON 5 GROUP 10 ST. THOMAS AQUINAS.pdf
LESSON 5 GROUP 10 ST. THOMAS AQUINAS.pdf
 
Vision Forward: Tracing Image Search SEO From Its Roots To AI-Enhanced Horizons
Vision Forward: Tracing Image Search SEO From Its Roots To AI-Enhanced HorizonsVision Forward: Tracing Image Search SEO From Its Roots To AI-Enhanced Horizons
Vision Forward: Tracing Image Search SEO From Its Roots To AI-Enhanced Horizons
 
world Tuberculosis day ppt 25-3-2024.pptx
world Tuberculosis day ppt 25-3-2024.pptxworld Tuberculosis day ppt 25-3-2024.pptx
world Tuberculosis day ppt 25-3-2024.pptx
 

Removing barriers to transparency: a case study on the use of semantic technologies to tackle procurement data inconsistency

  • 1. Removing Barriers to Transparency: a Case Study on the Use of Semantic Technologies to Tackle Procurement Data Inconsistency Giuseppe Futia*, Alessio Melandri**, Antonio Vetrò*, Federico Morando**, Juan Carlos De Martin* * Nexa Center for Internet and Society, Politecnico di Torino (DAUIN) ** Synapta Srl
  • 2. Outline • Public procurement context • Our contribution • Results • Discussion • Future works 2
  • 5. Data related to functioning of government can be accessed and interpreted, without being pre-processed and manipulated ESWC 2017 - Removing Barriers to Transparency 5
  • 6. Open Government Data (OGD) is (or should be) a mechanism integrated in the heart of the government functions for creating transparency ESWC 2017 - Removing Barriers to Transparency 6
  • 7. Public Procurement (PP) is a specific area of the OGD that increases openness of government information and supports business activities ESWC 2017 - Removing Barriers to Transparency 7
  • 8. Is OGD Enough for Enabling Transparency through Procurement Data?
  • 9. • An obstacle to transparency is related to the fragmentation of OGD in the domain of procurement data • Fragmentation can generate discrepancies and inconsistencies among data that can not be a priori detected and fixed ESWC 2017 - Removing Barriers to Transparency 9
  • 11. Our case study is represented by Italian PP data published on different websites of Italian administrations - in compliance with the Italian anti-corruption Act (law n. 190/2012) ESWC 2017 - Removing Barriers to Transparency 11
  • 12. 1212
  • 13. 13ESWC 2017 - Removing Barriers to Transparency
  • 14. 300.000 XML files from more than 16.000 public bodies
  • 15. What is Data Consistency?
  • 16. Data consistency refers to the absence of apparent contradictions within data (ISO/IEC 25012) ESWC 2017 - Removing Barriers to Transparency 16
  • 17. • Certain types of consistency problems directly emerge analyzing contracts data collected in a single XML file – contracts in which the beneficiary is more than one – contracts in which the amount of money is paid, but no recipient is present in the data – contracts in which the sum reported as paid is greater than the sum initially awarded to the beneficiary ESWC 2017 - Removing Barriers to Transparency 17
  • 18. • Other types of inconsistencies manifest themselves only after interlinking and merging data contained in different sources – business entities with more than one business name – CIGs that identify more than one contract – incoherent payments among different versions of an ongoing contract ESWC 2017 - Removing Barriers to Transparency 18
  • 19. These issues represent a significant barrier to achieve transparency, because the results obtained by querying the dataset are misleading ESWC 2017 - Removing Barriers to Transparency 19
  • 21. 21
  • 22. pc: http://purl.org/procurement/public-contracts# dcterms: http://purl.org/dc/terms# gr: http://purl.org/goodrelations/v1# dbpedia-owl: http://dbpedia.org/ontology/ pay: http://reference.data.gov.uk/def/payment# time: http://www.w3.org/2006/time# org: http://www.w3.org/ns/org# foaf: http://xmlns.com/foaf/0.1# 22
  • 23. 23
  • 25. 25
  • 26. 26
  • 27. 27
  • 29. • Business entities with more than one business name • CIGs that identify more than one contract • Incoherent payments among different versions of an ongoing contract ESWC 2017 - Removing Barriers to Transparency 29
  • 30. Business entities with more than one business name
  • 31. PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> PREFIX gr: <http://purl.org/goodrelations/v1#> SELECT (COUNT(DISTINCT ?be)) WHERE { { SELECT DISTINCT(?be) WHERE { ?be rdfs:label ?label . ?be a gr:BusinessEntity . } GROUP BY ?be HAVING (count(*)>1) } } ESWC 2017 - Removing Barriers to Transparency 31
  • 32. Exploiting VAT ID value, we obtain a unique business name for contracting authorities, building links to the Italian public administration index of SPCData (http://spcdata.digitpa.gov.it:8899/sparql) ESWC 2017 - Removing Barriers to Transparency 32
  • 33. CIGs that identify more than one contract
  • 34. PREFIX dcterms: <http://purl.org/dc/terms/> PREFIX pc: <http://purl.org/procurement/public-contracts#> SELECT (COUNT(DISTINCT ?contract)) WHERE { { SELECT DISTINCT(?contract) WHERE { ?contract dcterms:identifier ?CIG . ?contract a pc:Contract . } GROUP BY ?contract HAVING (count(*)>1) } } ESWC 2017 - Removing Barriers to Transparency 34
  • 35. The solution to this issue is generating a hash value, avoiding ambiguity due to duplicate CIGs, to build contracts URIs. In this way, we separate different contracts, misidentified by the same CIG, in different entities ESWC 2017 - Removing Barriers to Transparency 35
  • 36. Incoherent payments among different versions of an ongoing contract
  • 37. select ?result (COUNT(*) as ?n) where { {select ((ROUND(?pay_sum/?ap*100)/100) as ?result) where { ?c <http://purl.org/procurement/public-contracts#agreedPrice> ?ap . FILTER (?ap > 0). FILTER (?ap < 1000000000000). { select ?c (SUM(?pay) as ?pay_sum) where { ?c <http://reference.data.gov.uk/def/payment#payment> ?p . ?p <http://reference.data.gov.uk/def/payment#netAmount> ?pay . FILTER (?pay > 0). FILTER (?pay < 1000000000000). }group by ?c }}}} group by ?result order by ?result ESWC 2017 - Removing Barriers to Transparency 37
  • 38. 38
  • 40. • We presented an approach to tackle fragmentation of Italian procurement data and to improve consistency of such data based on semantic technologies • Both these issues represent a significant barrier to achieve a full transparency, because user and robots that query the datasets risk to obtain partial, inconsistent, and misleading results ESWC 2017 - Removing Barriers to Transparency 40
  • 42. • Developing automatic tools to detect and fix consistency problems among contracts published in different files • Exploring ways to evaluate the provenance of data, in order to improve the data processing stage and improve data consistency • Exploiting advanced methods and dashboards in order to monitor the consistency degree of the data ESWC 2017 - Removing Barriers to Transparency 42
  • 43. Thank you! Mail: giuseppe.futia@polito.it Twitter: giuseppe_futia GitHub: https://github.com/synapta/public-contracts