Linked Open Government Data (LOGD): Ontology Usage
Experimental Results
Second Presentation
Nooshin Allahyari
Outline
• Categorizing data providers
• Dataset collection
• Dataset characteristics
▫ Namespace
▫ Ontology Usage
▫ Annotation property
• Concept Coverage
• Case-Based Analysis
• Conclusion
Categorizing data providers
• Data providers are US Government agencies
• Agencies are divided according to the US Federal Government
Reference Model
• Each agency is in charge of publishing its related datasets
• The Data.gov catalog also provides topic-related categorization
Dataset Collection
• All 25 datasets collected from Data.gov
• Datasets are in RDF format
• Difficulties running huge datasets
• Using different tools as the endpoint
▫ Virtuoso (commercial version) as the SPARQL endpoint
▪ Easy to install
▪ GUI
▪ Many visual tools
▪ SQL, SQL tools, and connection tools
• Increasing the number of datasets for reliability
Namespace
• Same namespace usage for all datasets
Ontology Vocabulary Usage
• FEA Reference Model Ontology (RMO)
• Vocabulary related to the government context
▫ General vocabulary
▪ Country
▪ State
▪ City
▫ Government programs and services
▪ Health Program
▪ Cultural Program
Annotation Property
• Useful for providing additional information about
datasets. All datasets have:
▫ rdfs:label
▫ rdfs:comment
▫ No language tag or metadata
▪ Some datasets from the Italy dataset catalog in TWC LOGD
contain a language tag.
Concept Coverage
• Same concepts in all datasets
• Metadata for Data.gov wiki and TWC LOGD
Prefix    Concept
foaf      Homepage
rdfs      isDefinedBy
dcterms   Source
dgtwc     uses-property
dgtwc     number-of-triples
dgtwc     number-of-properties
dgtwc     number-of-entries
Concept Coverage
• General concepts related to government
• Low coverage of concepts
• Multi-name concepts
Concept               Coverage (percentage)
State                 48%
City                  32%
State-Abbreviation    16%
Region                12%
Zip                   12%
Country               8%
Country origin code   8%
Area code             8%
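Every percentage in the table is a multiple of 4%, which is consistent with coverage being measured as the share of the 25 collected datasets that contain each concept. That reading is an assumption; the slides do not state the formula. Under it, a minimal Python sketch recovers the implied number of datasets per concept (the function name `implied_dataset_count` is illustrative, not from the deck):

```python
# Assumption (not stated in the slides):
#   coverage = datasets containing the concept / total datasets collected.
TOTAL_DATASETS = 25  # from the Dataset Collection slide

# Percentages copied from the Concept Coverage table.
coverage_percent = {
    "State": 48,
    "City": 32,
    "State-Abbreviation": 16,
    "Region": 12,
    "Zip": 12,
    "Country": 8,
    "Country origin code": 8,
    "Area code": 8,
}


def implied_dataset_count(percent, total=TOTAL_DATASETS):
    """Return the whole number of datasets a coverage percentage implies."""
    count, remainder = divmod(percent * total, 100)
    if remainder:
        # The percentage does not map to a whole number of datasets,
        # so the assumed formula would not fit this value.
        raise ValueError(f"{percent}% of {total} is not a whole number")
    return count


implied_counts = {
    concept: implied_dataset_count(percent)
    for concept, percent in coverage_percent.items()
}

for concept, count in implied_counts.items():
    print(f"{concept}: {count} of {TOTAL_DATASETS} datasets")
```

Under this assumption even the most common concept, State, appears in fewer than half of the datasets, which is the low coverage the slide points to.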
Case-Based Analysis
• Three datasets from the same agency in the same
category
▫ Department of Veterans Affairs
▪ dataset1213
▪ dataset1288
▪ dataset1290
• The query results for each dataset show that all three
have similar concepts
▪ State
▪ City
▪ VISN
▪ Station
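The per-dataset queries in this case study share one shape: each property IRI is built from the dataset number and a local property name. As a rough sketch, assuming that IRI pattern holds for every dataset, a small Python helper can generate such queries instead of copying and editing them by hand. The names `property_iri` and `build_query` are hypothetical and are not part of the original work.

```python
# IRI pattern taken from the queries in the slides; treating it as valid
# for every dataset is an assumption.
IRI_TEMPLATE = "http://www.data.gov/semantic/data/alpha/{id}/dataset-{id}.rdf#{name}"


def property_iri(dataset_id, name):
    """Build the IRI of one dataset-specific property."""
    return IRI_TEMPLATE.format(id=dataset_id, name=name)


def build_query(dataset_id, required, optional=()):
    """Return a SPARQL SELECT query over one dataset.

    required: property names that every result row must have.
    optional: property names wrapped in OPTIONAL blocks.
    """
    variables = " ".join(f"?{name}" for name in [*required, *optional])
    lines = [f"SELECT DISTINCT {variables}", "WHERE", "{"]
    for name in required:
        lines.append(f"  ?s <{property_iri(dataset_id, name)}> ?{name} .")
    for name in optional:
        lines.append(
            f"  OPTIONAL {{ ?s <{property_iri(dataset_id, name)}> ?{name} }}"
        )
    lines.append("}")
    return "\n".join(lines)


# Same shape as the dataset 1288 query: city is required, the rest optional.
query_1288 = build_query(1288, ["city"], ["station", "visn", "st"])
print(query_1288)
```

Generating the text from a single dataset number avoids a query for one dataset accidentally keeping another dataset's IRIs.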
Case-Based Analysis-1288
• The query lists every station, with its VISN code, in each city
and the state in which the city is located:
SELECT DISTINCT ?city ?station ?visn ?st
WHERE
{
  ?s <http://www.data.gov/semantic/data/alpha/1288/dataset-1288.rdf#city> ?city
  OPTIONAL { ?s <http://www.data.gov/semantic/data/alpha/1288/dataset-1288.rdf#station> ?station }
  OPTIONAL { ?s <http://www.data.gov/semantic/data/alpha/1288/dataset-1288.rdf#visn> ?visn }
  OPTIONAL { ?s <http://www.data.gov/semantic/data/alpha/1288/dataset-1288.rdf#st> ?st }
}
State   VISN   Station   City
"NJ"    "3"    "561"     "East Orange"
"NY"    "3"    "620"     "Montrose"
"NY"    "3"    "630"     "New York Harbor"
"NY"    "3"    "632"     "Northport"
"DE"    "4"    "460"     "Wilmington"
"PA"    "4"    "503"     "Altoona"
"PA"    "4"    "529"     "Butler"
"WV"    "4"    "540"     "Clarksburg"
Case-Based Analysis-1290
• The query lists every station, with its VISN code, in each city
and the state in which the city is located:
SELECT DISTINCT ?city ?station ?visn ?st
WHERE
{
  ?s <http://www.data.gov/semantic/data/alpha/1290/dataset-1290.rdf#city> ?city
  OPTIONAL { ?s <http://www.data.gov/semantic/data/alpha/1290/dataset-1290.rdf#station> ?station }
  OPTIONAL { ?s <http://www.data.gov/semantic/data/alpha/1290/dataset-1290.rdf#visn> ?visn }
  OPTIONAL { ?s <http://www.data.gov/semantic/data/alpha/1290/dataset-1290.rdf#st> ?st }
}
State   VISN   Station   City
"ME"    "1"    "402"     "Togus"
"VT"    "1"    "405"     "White River Junction"
"MA"    "1"    "518"     "Bedford"
"MA"    "1"    "523"     "West Roxbury"
"NH"    "1"    "608"     "Manchester"
"MA"    "1"    "631"     "Northampton"
"RI"    "1"    "650"     "Providence"
"CT"    "1"    "689"     "West Haven"
Case-Based Analysis-1213
• The query lists the VISN code of each city and the state in
which the city is located:
SELECT DISTINCT ?visn ?city ?state
WHERE
{
  ?s <http://www.data.gov/semantic/data/alpha/1213/dataset-1213.rdf#visn> ?visn .
  ?s <http://www.data.gov/semantic/data/alpha/1213/dataset-1213.rdf#city> ?city .
  ?s <http://www.data.gov/semantic/data/alpha/1213/dataset-1213.rdf#state> ?state
}
State   VISN   City
"CT"    "1"    "West Haven"
"MA"    "1"    "Bedford"
"MA"    "1"    "West Roxbury"
"MA"    "1"    "Northampton"
"ME"    "1"    "Togus"
"NH"    "1"    "Manchester"
"RI"    "1"    "Providence"
"VT"    "1"    "White River Junction"
Case-Based Analysis-1206
• Dataset 1206 contains similar concepts:
SELECT DISTINCT ?state ?facilityname ?city ?visn
WHERE
{
  ?s <http://www.data.gov/semantic/data/alpha/1206/dataset-1206.rdf#visn> ?visn .
  ?s <http://www.data.gov/semantic/data/alpha/1206/dataset-1206.rdf#state> ?state .
  ?s <http://www.data.gov/semantic/data/alpha/1206/dataset-1206.rdf#city> ?city .
  ?s <http://www.data.gov/semantic/data/alpha/1206/dataset-1206.rdf#facility_name> ?facilityname
}
VISN   State   Facility-name                                     City
"1"    "CT"    "VA Connecticut HCS"                              "West Haven"
"1"    "MA"    "Edith Nourse Rogers Memorial Veterans Hospital"  "Bedford"
"1"    "MA"    "VA Boston HCSW Roxbury Brockton Jamaica Plns"    "West Roxbury"
"1"    "MA"    "VAMC"                                            "Northampton"
"1"    "ME"    "VAMC/RO"                                         "Togus"
"1"    "NH"    "VAMC"                                            "Manchester"
"1"    "RI"    "VAMC"                                            "Providence"
"1"    "VT"    "VAM/ROC"                                         "White River Junction"
Case-Based Analysis-Comparison
• We need to explicitly define the “owl:sameAs” property between
similar properties in order to get query results across datasets:
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT DISTINCT ?state ?city
WHERE
{
  GRAPH <http://localhost8890/vad/dataset1288>
  {
    ?s1 <http://www.data.gov/semantic/data/alpha/1288/dataset-1288.rdf#st> ?state .
    ?s1 <http://www.data.gov/semantic/data/alpha/1288/dataset-1288.rdf#city> ?city .
    <http://www.data.gov/semantic/data/alpha/1288/dataset-1288.rdf#st>
      owl:sameAs
      <http://www.data.gov/semantic/data/alpha/1290/dataset-1290.rdf#st> .
    <http://www.data.gov/semantic/data/alpha/1288/dataset-1288.rdf#city>
      owl:sameAs
      <http://www.data.gov/semantic/data/alpha/1290/dataset-1290.rdf#city> .
  }
  GRAPH <http://localhost8890/vad/dataset1290>
  {
    ?s2 <http://www.data.gov/semantic/data/alpha/1290/dataset-1290.rdf#st> ?st .
    ?s2 <http://www.data.gov/semantic/data/alpha/1290/dataset-1290.rdf#city> ?city .
    <http://www.data.gov/semantic/data/alpha/1290/dataset-1290.rdf#st>
      owl:sameAs
      <http://www.data.gov/semantic/data/alpha/1288/dataset-1288.rdf#st> .
    <http://www.data.gov/semantic/data/alpha/1290/dataset-1290.rdf#city>
      owl:sameAs
      <http://www.data.gov/semantic/data/alpha/1288/dataset-1288.rdf#city> .
  }
}
ORDER BY ?state
State   City
"CT"    "West Haven"
"MA"    "Bedford"
"MA"    "West Roxbury"
"MA"    "Northampton"
"ME"    "Togus"
"NH"    "Manchester"
"RI"    "Providence"
"VT"    "White River Junction"
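The comparison slide's point is that property names differing between datasets (for example `st` in one and `state` in another) must be declared equivalent before results can be combined. The following is a plain-Python analogue of that idea, not OWL reasoning: a hypothetical alias table, `SHARED_CONCEPT`, stands in for the owl:sameAs links, and the sample rows are copied from the dataset 1290 and dataset 1213 result tables.

```python
# Hypothetical alias table: maps each dataset-specific property name to one
# shared concept name. It plays the role of the owl:sameAs links.
SHARED_CONCEPT = {"st": "state", "state": "state", "city": "city"}


def normalise(row):
    """Rename a row's dataset-specific keys to the shared concept names."""
    return {SHARED_CONCEPT[key]: value for key, value in row.items()}


def state_city_pairs(rows):
    """Return the set of (state, city) pairs after normalising the rows."""
    return {(row["state"], row["city"]) for row in map(normalise, rows)}


# A few rows from the dataset 1290 results (state is stored as "st").
rows_1290 = [
    {"st": "ME", "city": "Togus"},
    {"st": "VT", "city": "White River Junction"},
    {"st": "MA", "city": "Bedford"},
]

# A few rows from the dataset 1213 results (state is stored as "state").
rows_1213 = [
    {"state": "CT", "city": "West Haven"},
    {"state": "MA", "city": "Bedford"},
    {"state": "ME", "city": "Togus"},
]

# Pairs present in both datasets once the property names are aligned.
shared_pairs = sorted(state_city_pairs(rows_1290) & state_city_pairs(rows_1213))
print(shared_pairs)
```

Without the alias table the two row sets share no `state` key at all, which mirrors why the cross-dataset query needs the equivalence stated explicitly.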
Conclusion
• No government ontology has been used in the
experimental datasets
• Weak vocabulary usage in US Government datasets
• Multiple vocabularies are used for the same concept
• Multiple vocabularies are used within the same government agency
• Lack of a well-defined, coherent, and consistent
government ontology
Thank you
