1. Notes by Marco Brandizi
Key Notes
Agriculture
Data Publishing &
Interoperability
Metadata
Artificial Intelligence
Data Integration &
Exploration
Data Annotation &
Enrichment
3. Medha Devare, Spinning a Semantic Web
for Agriculture
The CGIAR challenges
Their domain is very varied, ranging from fighting poverty to helping to access markets
Technologies to integrate data exist, but they need to be put together
AI and SW are different, but each needs the other, and each can provide results for the other
See also: https://bigdata.cgiar.org/
The GARDIAN Platform
Where should I plant my rice? How should I manage my crop? How to mitigate risks and define
insurance plans?
Diverse CGIAR data are collected and harmonised using Linked Data (LD) and ontologies
Data made available via SPARQL
AgroFIMS
Platform for field trial data collection
Supports both electronic and paper-based operations (flexibility is needed)
UI and functionality built over ontology modelling
Export metadata
R Scripting functionality for analytics
4. Christian Lovis, AI and Big data: the
dilemma of Truth
Limits of bioinformatics (eg, in genetics)
There are unpredictable things
Limits of AI, eg,
overfitting (Wheels are faces)
Unreliable data (chocolate consumption vs Nobel laureates)
Biases (Google prefers white skin women)
Our conceptions
Should unreproducible papers be retracted? Shamed?
Anonymisation is impossible; privacy protection should also
act after the fact, not just preventively
5. Philip Bourne, How does Data Science
impact the Semantic Web
Science isn’t made with formal definitions
Data Science is unexpected reuse of
information
SW has opportunity to contribute, but
schema.org is becoming the norm, not
the exception
FAIR is broader than SW
[Figure: Systems Pharmacology over open, complex, diverse digital data. Horizontal integration (model transportability) across human, mouse and zebrafish; multi-scale integration from DNA, gene/protein and network through cell, tissue and organ to body and population, with example data types at each scale (eg, CNV, SNP, methylation; gene expression, proteomics, metabolomics; physiologically based pharmacokinetics; GWAS, population dynamics, microbiota). Source: Xie et al. Annu Rev Pharmacol Toxicol. 2017 57:245-262]
6. Dean Allemang, Semantic Web and the
New Industrial Revolution
Comparing academia and business
eg, publishing/sharing as a goal vs
absolutely forbidden
The evolution of FIBO, from OWL to auto-
generated multiple views/formats
The increasing role of vocabularies and
shared data models
9. More Agriculture-related talks
Design of a Framework to Support Reuse of Open Data about Agriculture
Data files are harvested and enriched (at metadata level) with text mining, stored
as linked data
A recommender ranks data annotations according to vectors of user preferences
Web services to access annotated data are auto-created with SADI
Lightly specified ontologies for access to agricultural information across languages and
domains (https://goo.gl/z5qwvh)
The case for taxonomies, SKOS vocabularies, thesauri, etc
The Global Agricultural Concept Scheme (GACS)
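The preference-based ranking mentioned in the first talk on this slide can be sketched as follows. The notes do not say how the recommender scores annotations, so the weighted dot product used here is an assumption, and the annotation names, feature dimensions and weights are all hypothetical.

```python
# Minimal sketch: rank dataset annotations against a user preference vector.
# ASSUMPTION: scoring is a plain dot product; the talk's actual method is not given.

def score(features, prefs):
    """Dot product between an annotation's feature vector and the user's weights."""
    return sum(f * w for f, w in zip(features, prefs))

def rank_annotations(annotations, prefs):
    """Return annotation names ordered from most to least preferred."""
    return sorted(annotations, key=lambda name: score(annotations[name], prefs), reverse=True)

# Hypothetical feature dimensions: (crop science, soil science, economics)
annotations = {
    "rice yield": [1.0, 0.0, 0.2],
    "soil pH": [0.1, 1.0, 0.0],
    "market price": [0.0, 0.1, 1.0],
}
user_prefs = [0.9, 0.1, 0.0]  # hypothetical user, mostly interested in crop science

ranked = rank_annotations(annotations, user_prefs)
print(ranked)
```

A real recommender would likely normalise the vectors and learn the weights from user behaviour; this only illustrates the ranking step.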
12. Integration and Data Access Platforms
Data2Services: enabling automated conversion of data to services
Several data formats are converted to a generic RDF model,
which can then be translated into something more meaningful via SPARQL
grlc ("garlic") service to convert from SPARQL to API, with calls like class/all, class/$class/instances,
resource/$uri
Architecture for the harmonization of clinical cohort data in the IMI EMIF project
Harmonised model, configurable mappings to data sources
Sparklis over PEGASE Knowledge Graph: A New Tool for Pharmacovigilance
SPARKLIS helps formulate SPARQL queries in natural language, and also returns NL results
They developed a vigilance ontology (OntoADR, which leverages SNOMED) and extended SPARKLIS
They use MedDRA as vigilance source
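To make the Data2Services API calls listed above (class/all, class/$class/instances, resource/$uri) more concrete, here is a minimal sketch of how such paths could map onto SPARQL queries. It is not the actual implementation of the service: the function name, the query shapes and the example URIs are assumptions for illustration only.

```python
import re

# Minimal sketch: translate REST-like paths into SPARQL query strings.
# ASSUMPTION: the query shapes below are illustrative, not taken from the talk.

def route_to_sparql(path: str) -> str:
    path = path.strip("/")
    if path == "class/all":
        # List every class that has at least one instance
        return "SELECT DISTINCT ?class WHERE { ?s a ?class }"
    match = re.fullmatch(r"class/(.+)/instances", path)
    if match:
        # List the instances of the given class URI
        return f"SELECT ?instance WHERE {{ ?instance a <{match.group(1)}> }}"
    match = re.fullmatch(r"resource/(.+)", path)
    if match:
        # Describe a single resource by its outgoing properties
        return f"SELECT ?p ?o WHERE {{ <{match.group(1)}> ?p ?o }}"
    raise ValueError(f"Unsupported path: {path}")

# Hypothetical URIs
print(route_to_sparql("class/all"))
print(route_to_sparql("class/http://example.org/Gene/instances"))
print(route_to_sparql("resource/http://example.org/gene/1"))
```

A production service would URL-encode the URI in the path and validate it before placing it in a query.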
14. Notes by Marco Brandizi
Data Publishing &
Interoperability
15. The bioschemas and WikiBase Tutorials
Bioschemas
Common lightweight ontology to publish data on the web,
mainly to support search engines (derived from schema.org)
Tutorial gave an overview and examples of annotation using their tool: https://goo.gl/GFDhPF
During the hackathon, I got info about proposing new types
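As an illustration of the kind of markup Bioschemas promotes, the sketch below builds schema.org JSON-LD for a dataset and wraps it in the script tag that search engines read. It uses the generic schema.org Dataset type only; the specific Bioschemas profiles and their required properties are not covered, and the dataset name, description and URL are hypothetical.

```python
import json

# Minimal sketch: schema.org JSON-LD markup for a dataset landing page.
# ASSUMPTION: all values passed in below are hypothetical examples.

def dataset_jsonld(name, description, url, keywords):
    """Build a schema.org Dataset description as a plain dictionary."""
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "keywords": list(keywords),
    }

def as_script_tag(markup):
    """Wrap JSON-LD in the script element that is embedded in an HTML page."""
    return '<script type="application/ld+json">\n' + json.dumps(markup, indent=2) + "\n</script>"

markup = dataset_jsonld(
    name="Example wheat field trial",
    description="Hypothetical field trial measurements used for illustration.",
    url="https://example.org/datasets/wheat-trial",
    keywords=["wheat", "field trial"],
)
print(as_script_tag(markup))
```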
Wikidata
The Wikipedia of data
Multiple formats supported (JSON, RDF, SPARQL)
Increasingly being used to share open data and to mint resolvable URIs
Batch imports or Wikibase editor (http://wikiba.se)
Can be used with local installations, Docker support
See also: https://stuff.coffeecode.net/2018/wikibase-workshop-swib18.html
Common properties to describe data (promotes interoperability)
Using Wikidata for semantic data modeling in education and research
Data integration workshops, using Wikidata and Wikibase
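The multiple access formats noted above for Wikidata can be sketched as URL builders, with no network calls. The entity-data and query-service addresses are the public Wikidata ones; a local Wikibase installation would use its own base URLs. The function names are hypothetical.

```python
from urllib.parse import urlencode

# Public Wikidata endpoints; a local Wikibase would expose different base URLs.
ENTITY_DATA = "https://www.wikidata.org/wiki/Special:EntityData"
SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"

def entity_url(qid, fmt="json"):
    """URL of one entity in a given serialisation (JSON or an RDF format)."""
    if fmt not in {"json", "ttl", "rdf", "nt"}:
        raise ValueError(f"Unsupported format: {fmt}")
    return f"{ENTITY_DATA}/{qid}.{fmt}"

def sparql_url(query):
    """GET URL that asks the query service for JSON results."""
    return SPARQL_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})

print(entity_url("Q42"))          # JSON view of an entity
print(entity_url("Q42", "ttl"))   # Turtle (RDF) view of the same entity
print(sparql_url("SELECT ?item WHERE { ?item wdt:P31 wd:Q5 } LIMIT 5"))
```

The resulting URLs can be fetched with any HTTP client, which is what makes the entity URIs resolvable in practice.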
18. CEDAR
Tool to annotate datasets with metadata
Similar to COPO
Dataset-level metadata description, ontology autocompletion,
ontology recommender
http://tinyurl.com/cedar-swat4ls2018
See also: https://www.go-fair.org
20. Notes by Marco Brandizi
Artificial Intelligence
Data Annotation &
Enrichment
21. Enrichment, Text Mining, AI and alike
Evaluation of Knowledge Graph Embedding Approaches for Drug-Drug Interaction Prediction using Linked Open Data
The same methods are used in:
Vec2SPARQL: integrating SPARQL queries and knowledge graph embeddings
APIs to extract feature vectors are integrated with Linked Data in two ways:
LD used to compute features via ML (random walks)
SPARQL extended with similarity metric functions (eg, mostSimilar(?uri, top-n))
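What a function like mostSimilar(?uri, top-n) might compute can be sketched in plain Python. This is not Vec2SPARQL's implementation: cosine similarity is assumed as the metric, and the URIs and embedding vectors are hypothetical.

```python
from math import sqrt

# Minimal sketch of a top-n similarity lookup over entity embeddings.
# ASSUMPTION: cosine similarity; the notes do not name the metric used.

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def most_similar(uri, embeddings, top_n):
    """Return the top_n (uri, similarity) pairs closest to the given entity."""
    target = embeddings[uri]
    scored = [(other, cosine(target, vec)) for other, vec in embeddings.items() if other != uri]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]

# Hypothetical two-dimensional embeddings
embeddings = {
    "ex:drugA": [1.0, 0.0],
    "ex:drugB": [0.9, 0.1],
    "ex:drugC": [0.0, 1.0],
    "ex:drugD": [0.5, 0.5],
}
top = most_similar("ex:drugA", embeddings, 2)
print([uri for uri, _ in top])
```

In the SPARQL-extension setting, the returned URIs would be bound to a query variable so that they can be joined with the rest of the graph pattern.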
Ontology-Driven Metadata Enrichment for Genomic Datasets
Semi-structured sequencing data annotated with ZOOMA/BioPortal,
then scored according to similarity between original text and matched term label
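The scoring step described above, comparing the original text with the label of the matched ontology term, can be sketched with the standard library. The talk's actual similarity measure is not given in these notes; difflib's ratio is a stand-in, and the example text and candidate labels are hypothetical.

```python
from difflib import SequenceMatcher

# Minimal sketch: score an ontology match by string similarity between
# the original free text and the matched term's label.
# ASSUMPTION: difflib's ratio stands in for the unspecified measure.

def label_score(original_text, term_label):
    """Similarity in [0, 1], ignoring case and surrounding whitespace."""
    a = original_text.strip().lower()
    b = term_label.strip().lower()
    return SequenceMatcher(None, a, b).ratio()

def best_match(original_text, candidate_labels):
    """Pick the candidate label with the highest score."""
    return max(candidate_labels, key=lambda label: label_score(original_text, label))

# Hypothetical annotation candidates for one metadata value
print(label_score("Homo sapiens", "homo sapiens"))
print(best_match("liver tissue", ["liver", "kidney", "heart"]))
```

Purely lexical scores like this can favour labels that share a long common word over the semantically right term, so a threshold or manual review is usually needed.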
Cooperation of bio-ontologies for the classification of genetic intellectual disabilities : a diseasome approach
Data are classified into multiple disease classes
Results assessed with a comparison metric that rewards pairs in the same disease class
23. The KNIME Tutorial
It’s a platform similar to Galaxy, but dedicated mostly to Machine
Learning
Workflows can be executed in batch mode, results can be exported
as structured data