Social Mining & Big Data Ecosystem
Educating Data Scientists:
the SoBigData master
experience
www.sobigdata.eu
Fosca Giannotti, Valerio Grossi
ISTI-CNR Pisa
H2020-INFRAIA-2014-2015
Grant Agreement N. 654024
Modern science is data-intensive,
multidisciplinary, collaborative and global
– efficiency of data management (NoSQL paradigms and
cloud computing play an important role here) and of
curation, search, sharing and transfer.
– managing the complexity of the analytical process is a
key issue (scalable distributed analytical methods and
Visual Analytics are crucial here).
Firenze, 14 Nov 2016
[Figure: Big Data Analytics process. Labels: Data (demographic data, geographic data, movement data, transport data); Models (T-Clustering, T-Patterns); Forecasts; Validation.]
Interdisciplinary and collaborative
• for sharing data/models/processes and results of
experiments (different levels of interoperability and semantic
enrichment)
• to realize experiments by combining resources (data, methods
and results) belonging to different communities.
– This calls for tools that facilitate the governance of complex
analytical processes in a workflow style or through mega-modeling.
– This also calls for sophisticated search that supports resource
discovery.
Data scientist
A new kind of professional
has emerged, the data
scientist, who combines the
skills of software
programmer, statistician and
storyteller/artist to extract
the nuggets of gold hidden
under mountains of data.
Four core points of a data scientist
• Data Procurement and Curation
• Making sense of Data
• Story-telling
• Being accountable, step by step, for technical correctness and
for legal and ethical issues
SoBigData is…
A Multidisciplinary European Infrastructure for Big Data and Social
Data Mining providing an integrated ecosystem for ethically
sensitive scientific discoveries and advanced applications of social
data mining on the various dimensions of social life, as recorded by
“big data”.
Social Mining - Answer to:
• Who will win the US elections? What is the electorate's current
voting intention? How reliable is it?
• What are the indicators of social well-being (beyond GDP)
and how can they be computed and monitored?
• How effectively is the aging population helped by social
participation in digital community services?
• What is the link between media ownership and media
content? Is there bias in news reporting? And in content
reviews?
• Is an infectious disease emerging? What does its diffusion model look like?
Estimating traffic flows on a road network with mobile phone
data
[Figure: diagram with points labelled A, B, C, H and W]
Predicting Success
“Football is a simple game: 22 men chase a ball for 90
minutes and at the end, the Germans always win”
-- Gary Lineker (after Italy 1990 Final)
Managing Data: what does this mean?
Managing data does not only mean
Supporting discovery
Providing access, verifying the quality of data, cleaning errors, outliers and anomalies
Transforming data into a format suitable for specific data analysis tools
It must include support for
• legal interoperability
– copyright management,
– licensing of single and derivative products
– terms of use
• fine-grained policies
– attribution,
– citation policy,
– provenance management
• Ethics issues
Metadata in the SoBigData RI
experience
• Huge datasets often describe human activities, which implies
privacy and ethical issues
• As a Research Infrastructure FAIRness is one of our main targets
– The success of the RI is directly connected to the fact that
datasets are Findable, Accessible, Interoperable and
Reusable
– The intellectual property has to be considered
– The design of a highly structured metadata schema allows
the RI to automatically grant or deny access to a dataset, to
force the acceptance of terms of use or signing NDAs…
SoBigData metadata structure
• A highly structured and detailed metadata structure
has been designed in order to provide information
about:
– Description of the dataset (to make it Findable)
– How the dataset has been produced
– Intellectual Property
– Privacy issues
– Who can access the data and how (terms of use,
NDA…)
• Mainly based on the DataCite standard
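As a rough illustration of how a structured metadata record could drive an automatic access decision, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the class names, the field names and the `decide_access` function are hypothetical. They are not the actual SoBigData schema and not DataCite property names.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DatasetMetadata:
    """Hypothetical metadata record; field names are illustrative only."""
    title: str                        # description, to make the dataset findable
    provenance: str                   # how the dataset has been produced
    license: str                      # intellectual property
    contains_personal_data: bool      # privacy issues (informational here)
    requires_terms_of_use: bool = True
    requires_nda: bool = False


@dataclass
class AccessRequest:
    """Hypothetical summary of what the requesting user has agreed to."""
    user: str
    accepted_terms_of_use: bool = False
    signed_nda: bool = False


def decide_access(meta: DatasetMetadata, req: AccessRequest) -> Tuple[bool, List[str]]:
    """Return (granted, missing_steps) using only the metadata record."""
    missing: List[str] = []
    if meta.requires_terms_of_use and not req.accepted_terms_of_use:
        missing.append("accept terms of use")
    if meta.requires_nda and not req.signed_nda:
        missing.append("sign NDA")
    # Access is granted only when no step is outstanding.
    return (not missing, missing)


meta = DatasetMetadata(
    title="Example mobility traces",
    provenance="synthetic example",
    license="CC BY 4.0",
    contains_personal_data=True,
    requires_nda=True,
)
req = AccessRequest(user="alice", accepted_terms_of_use=True)
granted, missing = decide_access(meta, req)
print(granted, missing)  # False ['sign NDA']
```

Returning the list of outstanding steps, rather than a bare yes or no, lets an interface tell the researcher what is still needed (for example, signing an NDA) before access can be granted.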
The ethics of SoBigData
• Gathering large quantities of data has serious consequences
that SoBigData is trying to address. These consequences range
from personal harm, to issues of autonomy, injustice and
inequality.
• In order to deal with these problems, SoBigData adheres to a
value-sensitive design approach. This approach consists in using
design solutions to overcome ethical dilemmas, in this case
those between the utility of the data gathered and the
protection of the individuals subject to the research.
• In order to make the ideals of SoBigData successful, scientific
methods also need to be developed to embed moral
principles in practice.
Ethics: the challenge for SoBigData
• How do we create an infrastructure in which such data
and methods can be disseminated and improved
upon?
1. A Massive Open Online Course (MOOC) which instructs all
prospective researchers about the legal and ethical
dangers of big data research and the steps they can take to
minimise these;
2. A set of workflows that outline the steps researchers can
take when designing their approach;
3. Information pop-ups which redirect researchers to state-of-
the-art ethical methods.
Meta data definition: Ethics
Meta data definition: Intellectual Properties
Master in Big Data Analytics & Social Mining
http://www.sobigdata.eu/master/bigdata
Education
• Big Data Sensing
• Big Data Mining
• Big Data Story Telling
• Big Data Technology
• Big Data for Social Good
• Big Data Ethics
Students: their studies
[Chart: students' fields of study, comparing 2015 and 2016; values not recoverable]
Gender distribution
[Chart: number of male (M) and female (F) students in 2014-2015 and 2015-2016; values not recoverable]
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...Dr Arash Najmaei ( Phd., MBA, BSc)
 
Decoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectDecoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectBoston Institute of Analytics
 
World Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdf
World Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdfWorld Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdf
World Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdfsimulationsindia
 

Último (20)

Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our World
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
 
Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptx
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max Princeton
 
Cyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataCyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded data
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 
Rithik Kumar Singh codealpha pythohn.pdf
Rithik Kumar Singh codealpha pythohn.pdfRithik Kumar Singh codealpha pythohn.pdf
Rithik Kumar Singh codealpha pythohn.pdf
 
Insurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis ProjectInsurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis Project
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data Visualization
 
IBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaIBEF report on the Insurance market in India
IBEF report on the Insurance market in India
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
 
Decoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectDecoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis Project
 
World Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdf
World Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdfWorld Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdf
World Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdf
 

Educating Data Scientists: the SoBigData master experience

  • 1. Social Mining & Big Data Ecosystem Educating Data Scientists: the SoBigData master experience www.sobigdata.eu Fosca Giannotti, Valerio Grossi ISTI-CNR Pisa H2020-INFRAIA-2014-2015 Grant Agreement N. 654024
  • 2. Modern science is data-intensive, multidisciplinary, collaborative and global – efficiency of data management (NoSQL paradigms and cloud computing play an important role here) and curation, search, sharing, transfer. – managing the complexity of the analytical process is a key issue (scalable distributed analytical methods and Visual Analytics are crucial here). Firenze, 14 Nov 2016
  • 4. Interdisciplinary and collaborative • for sharing data/models/processes and results of experiments (different levels of interoperability and semantic enrichment) • to realize experiments by combining resources (data, methods and results) belonging to different communities. – This calls for tools that facilitate the governance of complex analytical processes in a workflow style or through mega-modeling. – This also calls for sophisticated search that supports resource discovery.
  • 5. Data scientist A new kind of professional has emerged, the data scientist, who combines the skills of software programmer, statistician and storyteller/artist to extract the nuggets of gold hidden under mountains of data.
  • 6. Four core points of a data scientist • Data Procurement and Curation • Making sense of Data • Story-telling • Accounting step by step for technical correctness and for legal and ethical issues
  • 7. SoBigData is… A Multidisciplinary European Infrastructure for Big Data and Social Data Mining providing an integrated ecosystem for ethically sensitive scientific discoveries and advanced applications of social data mining on the various dimensions of social life, as recorded by “big data”.
  • 8. Social Mining - Answers to: • Who will win the US elections? What are the electors' current voting intentions? How reliable are they? • What are the indicators of social well-being (beyond GDP) and how can they be computed and monitored? • How effectively does social participation in digital community services help the aging population? • What is the link between media ownership and media content? Is there bias in news reporting? And in content reviews? • Is an infectious disease emerging? What is its diffusion model?
  • 10. Estimating traffic fluxes on the road network with mobile phone data [figure with labels A, B, C, H, W]
  • 11. Predicting Success “Football is a simple game: 22 men chase a ball for 90 minutes and at the end, the Germans always win” -- Gary Lineker (after the Italy 1990 semi-final)
  • 12. Managing Data: what does this mean? Managing data does not only mean: • supporting discovery • providing access • verifying the quality of data • cleaning errors, outliers and anomalies • transforming data into a format suitable for specific data analytical tools. It must also include support for • legal interoperability – copyright management, – licensing of single and derivative products, – terms of use • fine-grained policies – attribution, – citation policy, – provenance management • ethical issues
  • 13. Metadata in the SoBigData RI experience • Huge datasets often describe human activities, which raises privacy and ethical issues • As a Research Infrastructure, FAIRness is one of our main targets – The success of the RI is directly connected to the fact that datasets are Findable, Accessible, Interoperable and Reusable – Intellectual property has to be considered – The design of a highly structured metadata schema allows the RI to automatically grant or deny access to a dataset, or to require the acceptance of terms of use or the signing of NDAs…
  • 14. SoBigData metadata structure • A highly structured and detailed metadata structure has been designed in order to provide information about: – Description of the dataset (to make it Findable) – How the dataset has been produced – Intellectual Property – Privacy issues – Who can access the data and how (terms of use, NDA…) • Mainly based on the DataCite standard
  • 15. The ethics of SoBigData • Gathering large quantities of data has serious consequences that SoBigData is trying to address. These consequences range from personal harm to issues of autonomy, injustice and inequality. • In order to deal with these problems, SoBigData adheres to a value-sensitive design approach. This approach consists in using design solutions to overcome ethical dilemmas, in this case those between the utility of the data gathered and the protection of the individuals subject to the research. • In order to make the ideals of SoBigData successful, scientific methods also need to be developed in order to embed moral principles in practice.
  • 16. Ethics: the challenge for SoBigData • How do we create an infrastructure in which such data and methods can be disseminated and improved upon? 1. A Massive Open Online Course (MOOC) which instructs all prospective researchers about the legal and ethical dangers of big data research and the steps they can take to minimise these; 2. A set of workflows that outline the steps researchers can take when designing their approach; 3. Information pop-ups which redirect researchers to state-of-the-art ethical methods.
  • 17. Metadata definition: Ethics
  • 18. Metadata definition: Intellectual Property
  • 19. Master in Big Data Analytics & Social Mining http://www.sobigdata.eu/master/bigdata
  • 21. Education • Big Data Sensing • Big Data Mining • Big Data Story Telling • Big Data Technology • Big Data for Social Good • Big Data Ethics
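
Slides 13 and 14 state that a highly structured metadata schema lets the research infrastructure automatically grant or deny access to a dataset, or require acceptance of terms of use or the signing of an NDA. The sketch below illustrates that idea in Python. It is only an assumption-laden illustration: the field names, the access levels ("open", "terms_of_use", "nda", "restricted") and the decide_access function are hypothetical and do not reproduce the actual SoBigData schema or the DataCite standard it is based on.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    GRANT = "grant"
    DENY = "deny"
    REQUIRE_TERMS = "require_terms_of_use"
    REQUIRE_NDA = "require_nda"


@dataclass
class DatasetMetadata:
    # Hypothetical fields, loosely mirroring the categories on slide 14.
    title: str                    # description (Findable)
    description: str
    provenance: str               # how the dataset was produced
    license: str                  # intellectual property
    contains_personal_data: bool  # privacy issues (informational only here)
    access: str                   # "open", "terms_of_use", "nda" or "restricted"
    allowed_groups: set = field(default_factory=set)


@dataclass
class User:
    name: str
    groups: set = field(default_factory=set)
    accepted_terms: set = field(default_factory=set)  # dataset titles
    signed_ndas: set = field(default_factory=set)     # dataset titles


def decide_access(user: User, ds: DatasetMetadata) -> Decision:
    """Derive an access decision from the dataset metadata alone."""
    if ds.access == "open":
        return Decision.GRANT
    if ds.access == "restricted":
        # Grant only if the user shares at least one group with the dataset.
        return Decision.GRANT if user.groups & ds.allowed_groups else Decision.DENY
    if ds.access == "terms_of_use":
        return Decision.GRANT if ds.title in user.accepted_terms else Decision.REQUIRE_TERMS
    if ds.access == "nda":
        return Decision.GRANT if ds.title in user.signed_ndas else Decision.REQUIRE_NDA
    return Decision.DENY  # unknown policy: fail closed


if __name__ == "__main__":
    dataset = DatasetMetadata(
        title="mobility-sample",
        description="Hypothetical mobility traces",
        provenance="Hypothetical collection process",
        license="CC-BY-4.0",
        contains_personal_data=True,
        access="nda",
    )
    alice = User(name="alice")
    print(decide_access(alice, dataset))
    alice.signed_ndas.add(dataset.title)
    print(decide_access(alice, dataset))
```

Unknown access levels are denied by default, so a malformed metadata record never exposes a dataset by accident.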