Notes about SWAT4LS 2018

Notes by Marco Brandizi
Key Notes
Agriculture
Data Publishing & Interoperability
Metadata
Artificial Intelligence
Data Integration & Exploration
Data Annotation & Enrichment

Key Notes

Medha Devare, Spinning a Semantic Web for Agriculture
The CGIAR challenges
Their domain is very varied, ranging from fighting poverty to helping with access to markets
Technologies to integrate data exist, but they need to be put together
AI and SW are different, but each needs the other, and each can provide results for the other
See also: https://bigdata.cgiar.org/
The GARDIAN Platform
Where should I plant my rice? How should I manage my crop? How to mitigate risks and define insurance plans?
Diverse CGIAR data are collected and harmonised using LD/ontologies
Data made available via SPARQL
AgroFIMS
Platform for field trial data collection
Supports both electronic and paper-based operations (need for flexibility)
UI and functionality built over ontology modelling
Export metadata
R Scripting functionality for analytics

Christian Lovis, AI and Big Data: the dilemma of Truth
Limits of bioinformatics (eg, in genetics)
There are unpredictable things
Limits of AI, eg,
Overfitting (wheels are faces)
Unreliable data (chocolate consumption vs Nobel laureates)
Biases (Google prefers white-skinned women)
Our conceptions
Should unreproducible papers be retracted? Shamed?
Anonymisation is impossible; privacy should be post-action too, not just preventive

Philip Bourne, How does Data Science impact the Semantic Web
Science isn’t made with formal definitions
Data Science is unexpected reuse of information
SW has an opportunity to contribute, but schema.org is becoming the norm, not the exception
FAIR is broader than SW
Figure (Systems Pharmacology): open, complex, diverse digital data; horizontal integration across species (human, mouse, zebrafish); multi-scale integration from DNA through gene/protein, network, cell, tissue, organ and body to population; model transportability. Source: Xie et al. Annu Rev Pharmacol Toxicol. 2017 57:245-262

Dean Allemang, Semantic Web and the New Industrial Revolution
Comparing academia and business
eg, publishing/sharing as a goal vs absolutely forbidden
The evolution of FIBO, from OWL to auto-generated multiple views/formats
The increasing role of vocabularies and shared data models

Agriculture

More Agriculture-related talks
Design of a Framework to Support Reuse of Open Data about Agriculture
Data files are harvested and enriched (at metadata level) with text mining, stored as linked data
A recommender ranks data annotations according to vectors of user preferences
Web services to access annotated data are auto-created with SADI
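The ranking step can be pictured with a small sketch. It assumes cosine similarity between each annotation's feature vector and the user's preference vector; the notes do not say which measure the framework uses, and the feature axes and values below are hypothetical.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_annotations(annotations, user_prefs):
    """Return annotation names sorted by decreasing similarity to user_prefs."""
    return sorted(
        annotations,
        key=lambda name: cosine(annotations[name], user_prefs),
        reverse=True,
    )


# Hypothetical feature axes: (crop science, soil, economics)
annotations = {
    "wheat yield": [0.9, 0.3, 0.1],
    "soil pH": [0.2, 0.9, 0.0],
    "market price": [0.1, 0.0, 0.9],
}
user_prefs = [0.1, 1.0, 0.0]  # a user who mostly cares about soil

ranking = rank_annotations(annotations, user_prefs)
print(ranking)
```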
Lightly specified ontologies for access to agricultural information across languages and domains (https://goo.gl/z5qwvh)
The case for taxonomies, SKOS vocabularies, thesauri, etc
The Global Agricultural Concept Scheme (GACS)

Data Integration & Exploration

Integration and Data Access Platforms
Data2Services: enabling automated conversion of data to services
Several data formats are converted to a generic RDF model,
which can then be translated to something more meaningful via SPARQL
Garlic service to convert from SPARQL to API, with calls like class/all, class/$class/instances, resource/$uri
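To make the SPARQL-to-API idea concrete, here is a minimal sketch of how the three call patterns above could map onto SPARQL queries. It is not the service's actual implementation: the function names, the query shapes and the assumption that `$class` and `$uri` arrive percent-encoded are all illustrative.

```python
from urllib.parse import quote, unquote


def _safe_iri(value):
    """Decode a percent-encoded IRI and reject characters that could break the query."""
    iri = unquote(value)
    if not iri or any(ch in iri for ch in '<>"{}|\\^`') or any(ch.isspace() for ch in iri):
        raise ValueError(f"Not a safe IRI: {iri!r}")
    return iri


def route_to_sparql(path):
    """Translate an API path into a SPARQL query string (illustrative only)."""
    segments = path.strip("/").split("/")
    if segments == ["class", "all"]:
        # List every class that has at least one instance
        return "SELECT DISTINCT ?class WHERE { ?s a ?class }"
    if len(segments) == 3 and segments[0] == "class" and segments[2] == "instances":
        # List the instances of one class
        return f"SELECT ?instance WHERE {{ ?instance a <{_safe_iri(segments[1])}> }}"
    if len(segments) == 2 and segments[0] == "resource":
        # Describe one resource by its outgoing properties
        return f"SELECT ?p ?o WHERE {{ <{_safe_iri(segments[1])}> ?p ?o }}"
    raise ValueError(f"Unsupported route: {path}")


# Hypothetical class IRI, percent-encoded so it fits in a single path segment
encoded = quote("http://example.org/Gene", safe="")
print(route_to_sparql("class/all"))
print(route_to_sparql(f"class/{encoded}/instances"))
```

Keeping the mapping in one function means a new call pattern only needs one more branch and one more query template.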
Architecture for the harmonization of clinical cohort data in the IMI EMIF project
Harmonised model, configurable mappings to data sources
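A rough sketch of what "configurable mappings" can mean in practice: each source declares how its own columns map onto the fields of the harmonised model. The cohort names, field names and unit conversions below are hypothetical and are not taken from the EMIF project.

```python
# For each source: harmonised field -> (source column, converter to the harmonised unit)
MAPPINGS = {
    "cohort_a": {
        "age_years": ("age", float),
        "weight_kg": ("weight_kg", float),
    },
    "cohort_b": {
        "age_years": ("age_months", lambda v: float(v) / 12),
        "weight_kg": ("weight_lb", lambda v: float(v) * 0.45359237),
    },
}


def harmonise(source, record):
    """Convert one source record into the harmonised model using that source's mapping."""
    mapping = MAPPINGS[source]
    return {
        target: convert(record[column])
        for target, (column, convert) in mapping.items()
    }


print(harmonise("cohort_a", {"age": "30", "weight_kg": "70"}))
print(harmonise("cohort_b", {"age_months": "24", "weight_lb": "100"}))
```

Adding a new cohort then only requires a new entry in the mapping table, not new code.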
Sparklis over PEGASE Knowledge Graph: A New Tool for Pharmacovigilance
SPARKLIS helps formulate SPARQL queries in natural language, and also returns NL results
They developed a vigilance ontology and extended SPARKLIS: OntoADR, which leverages SNOMED
They use MedDRA as the vigilance source

Data Publishing & Interoperability

The Bioschemas and Wikibase Tutorials
Bioschemas
Common lightweight ontology to publish data on the web, mainly to support search engines (derived from schema.org)
Tutorial gave an overview and examples of annotation using their tool: https://goo.gl/GFDhPF
During the hackathon, I got info about proposing new types
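As an illustration of this kind of annotation, the sketch below builds schema.org JSON-LD for a dataset page. It uses only the generic schema.org `Dataset` type and a few of its standard properties; the constraints that a specific Bioschemas profile adds are not modelled, and the dataset details are made up.

```python
import json


def dataset_markup(name, description, url, keywords):
    """Build a schema.org Dataset description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "keywords": keywords,
    }


# Hypothetical dataset, for illustration only
markup = dataset_markup(
    name="Example wheat expression dataset",
    description="Gene expression measurements from a hypothetical wheat field trial.",
    url="https://example.org/datasets/wheat-expression",
    keywords=["wheat", "gene expression"],
)

# Search engines read this when it is embedded in the page's HTML
script_tag = (
    '<script type="application/ld+json">\n'
    + json.dumps(markup, indent=2)
    + "\n</script>"
)
print(script_tag)
```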
Wikidata
The Wikipedia of data
Multiple formats supported (JSON, RDF, SPARQL)
Increasingly being used to share open data, make resolvable URIs
Batch imports or Wikibase editor (http://wikiba.se)
Can be used with local installations, Docker support
See also: https://stuff.coffeecode.net/2018/wikibase-workshop-swib18.html
Common properties to describe data (promotes interoperability)
Using Wikidata for semantic data modeling in education and research
Data integration workshops, using Wikidata and Wikibase
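A small sketch of the SPARQL access route: it builds a request URL for the public Wikidata query service without sending it. The helper names are hypothetical; `P31` is Wikidata's "instance of" property and `Q7187` is its item for "gene".

```python
from urllib.parse import urlencode

WIKIDATA_SPARQL = "https://query.wikidata.org/sparql"


def instances_query(class_qid, limit=10):
    """SPARQL query listing items that are instances of the given Wikidata class."""
    if not (class_qid.startswith("Q") and class_qid[1:].isdigit()):
        raise ValueError(f"Not a Wikidata item id: {class_qid!r}")
    return (
        "SELECT ?item ?itemLabel WHERE { "
        f"?item wdt:P31 wd:{class_qid} . "
        'SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . } '
        f"}} LIMIT {int(limit)}"
    )


def request_url(query):
    """URL that asks the query service for JSON results."""
    return WIKIDATA_SPARQL + "?" + urlencode({"query": query, "format": "json"})


query = instances_query("Q7187", limit=5)
url = request_url(query)
print(query)
print(url)
```

The URL can be fetched with any HTTP client. A local Wikibase installation would need a different endpoint and its own property and item identifiers.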

Metadata

CEDAR
Tool to annotate datasets with metadata
Similar to COPO
Dataset-level metadata description, ontology autocompletion, ontology recommender
http://tinyurl.com/cedar-swat4ls2018
See also: https://www.go-fair.org

Artificial Intelligence
Data Annotation & Enrichment

Enrichment, Text Mining, AI and alike
Evaluation of Knowledge Graph Embedding Approaches for Drug-Drug Interaction Prediction using Linked Open Data
Uses the same methods as:
Vec2SPARQL: integrating SPARQL queries and knowledge graph embeddings
APIs to extract feature vectors are integrated with Linked Data in two ways:
LD used to compute features via ML (random walks)
SPARQL extended with similarity metric functions (eg, mostSimilar(?uri, top-n))
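The random-walk route can be sketched as follows: walks over the triples produce sequences of nodes and predicates, which an embedding model would then turn into feature vectors. Only the walk generation is shown; the toy triples and function name are hypothetical, and whether the authors generate walks exactly this way is an assumption.

```python
import random


def random_walks(triples, start, walk_length, n_walks, seed=0):
    """Generate walks from `start`; each walk alternates node, predicate, node, ..."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    outgoing = {}
    for s, p, o in triples:
        outgoing.setdefault(s, []).append((p, o))

    walks = []
    for _ in range(n_walks):
        walk = [start]
        node = start
        for _ in range(walk_length):
            edges = outgoing.get(node)
            if not edges:
                break  # dead end: the node has no outgoing edges
            predicate, node = rng.choice(edges)
            walk.extend([predicate, node])
        walks.append(walk)
    return walks


# Toy graph with made-up identifiers
triples = [
    ("ex:drugA", "ex:interactsWith", "ex:drugB"),
    ("ex:drugA", "ex:targets", "ex:protein1"),
    ("ex:drugB", "ex:targets", "ex:protein2"),
    ("ex:protein1", "ex:partOf", "ex:pathwayX"),
]

walks = random_walks(triples, "ex:drugA", walk_length=2, n_walks=5, seed=42)
for walk in walks:
    print(" -> ".join(walk))
```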
Ontology-Driven Metadata Enrichment for Genomic Datasets
Semi-structured sequencing data annotated with ZOOMA/BioPortal,
then scored according to similarity between original text and matched term label
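The scoring step might look like the sketch below, which uses Python's `difflib.SequenceMatcher` as a stand-in for whatever similarity measure the authors used. The candidate terms and their identifiers are invented; in the real pipeline they would come from ZOOMA/BioPortal.

```python
from difflib import SequenceMatcher


def label_similarity(original_text, term_label):
    """Similarity in [0, 1] between the free text and an ontology term label."""
    a = original_text.lower().strip()
    b = term_label.lower().strip()
    return SequenceMatcher(None, a, b).ratio()


def best_match(original_text, candidates):
    """Return (score, term_id, label) for the highest-scoring candidate term."""
    scored = [
        (label_similarity(original_text, label), term_id, label)
        for term_id, label in candidates
    ]
    return max(scored)


# Hypothetical candidates, as an annotation service might return them
candidates = [("EX:0001", "liver"), ("EX:0002", "kidney")]
score, term_id, label = best_match("liver tissue", candidates)
print(term_id, label, round(score, 2))
```

A threshold on the score could then separate annotations to accept automatically from those needing manual review.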
Cooperation of bio-ontologies for the classification of genetic intellectual disabilities: a diseasome approach
Data are classified into multiple disease classes
Results assessed with a comparison metric that rewards pairs in the same disease class
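One way to read such a metric, as a sketch only: count the pairs of items grouped together, and report the fraction of those pairs that share at least one disease class. The exact metric used in the talk is not given in these notes, so the definition, names and toy data below are assumptions.

```python
from itertools import combinations


def same_class_pair_rate(groups, disease_classes):
    """Fraction of co-grouped pairs that share at least one disease class."""
    total = 0
    agreeing = 0
    for members in groups:
        for a, b in combinations(members, 2):
            total += 1
            if disease_classes[a] & disease_classes[b]:
                agreeing += 1
    return agreeing / total if total else 0.0


# Toy data: an item may belong to several disease classes
disease_classes = {
    "g1": {"metabolic"},
    "g2": {"metabolic", "neurological"},
    "g3": {"neurological"},
    "g4": {"skeletal"},
    "g5": {"metabolic"},
}
groups = [["g1", "g2", "g3"], ["g4", "g5"]]

rate = same_class_pair_rate(groups, disease_classes)
print(rate)
```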

The KNIME Tutorial
It’s a platform similar to Galaxy, but dedicated mostly to Machine Learning
Workflows can be executed in batch mode, results can be exported as structured data
