A curse of interdisciplinarity
‘A challenge in the other discipline always
 seems ‘easy’ because we are not hindered by
 knowledge.’

Barend Mons
(DTL-DISC/ELIXIR)
NBIC, LUMC.




PPP




10/09/12
ELIXIR
Safeguarding the results of life science research in Europe
European Life Sciences Infrastructure for Biological Information
www.elixir-europe.org
DISC: the connected data departments of DTL research Hotels



[Figure: DISC* alongside DTL. Labels: technology facilities; technology research; education & training]

 *) DISC = DTL Data Integration & Stewardship Centre
What is bioinformatics?
• The science of storing,
  retrieving and analysing
  large amounts of biological
  information
• An interdisciplinary science
  involving biologists,
  biochemists, computer
  scientists and
  mathematicians
• At the heart of modern
  biology


Bioinformatics underpins life-science research


1. Genomes contain genes
2. Genes are transcribed
3. Transcripts translate to protein sequences
4. Proteins form three-dimensional structures
5. Proteins interact with each other and with small molecules to form pathways
6. Pathways combine to build systems
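The transcription and translation steps in the chain above can be sketched in a few lines. This is a toy illustration only: the gene sequence is made up, and the codon table is a deliberately tiny subset of the standard genetic code containing just the codons this example needs.

```python
# Partial codon table: only the codons used below (an illustrative subset
# of the standard genetic code; '*' marks a stop codon).
CODON_TABLE = {"AUG": "M", "GCC": "A", "UGG": "W", "UGA": "*"}


def transcribe(coding_dna: str) -> str:
    """Transcribe the coding strand into mRNA (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read codons in frame from position 0 until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


gene = "ATGGCCTGGTGA"  # hypothetical toy gene, not a real sequence
mrna = transcribe(gene)
protein = translate(mrna)
print(mrna, protein)  # prints: AUGGCCUGGUGA MAW
```

Real pipelines would use a full codon table and handle reading frames, introns and strand orientation, all of which this sketch ignores.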
Life Science data: multi-omics, multi-technology, multi-organism, multi-dimensional
From molecules to medicine
[Figure: three stages, Molecular components, Integration and Translation. Labels: Genomes, Nucleotides, Transcripts, Proteins, Domains, Structures, Small molecules, Complexes, Pathways, Cells, Tissues and organs, Human individuals, Human populations, Biobanks, Therapies, Disease prevention, Early diagnosis]
What is ELIXIR?
• An ESFRI research infrastructure of global significance
• Unites Europe’s leading life science organisations in
  managing and safeguarding the vast amounts of
  data being generated every day by publicly funded
  research.
• A large-scale initiative that will provide the facilities
  necessary for Europe’s life-science researchers to
  make the most of our rapidly growing store of
  information about living systems, which is the
  foundation on which our understanding of life is built.




Why ELIXIR?
• Creating a robust infrastructure for biological
  information is a bigger task than EMBL-EBI – or
  any individual organisation or nation – can take
  on alone.
• Biology has by far the largest research
  community:
    • ~3 million life science researchers in Europe
    • >6 million web hits a day at EMBL-EBI alone
• We need to involve other European partners




The challenge
• Computer speed and storage capacity
  are doubling every 18 months, and this
  rate is steady
• DNA sequence data has doubled every 6-8 months
  over the last 3 years and looks set to
  continue doing so for this decade
Guy Cochrane, ENA, EMBL-EBI
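The gap between these two doubling rates compounds quickly. A minimal sketch, assuming idealised exponential growth, taking 7 months as the midpoint of the 6-8 month range, and using a 10-year horizon (the function name, the midpoint and the horizon are illustrative assumptions, not figures from the slide):

```python
def growth_factor(years: float, doubling_months: float) -> float:
    """Multiplicative growth after `years`, given a fixed doubling time in months."""
    return 2 ** (years * 12 / doubling_months)


YEARS = 10  # illustrative horizon (assumption)

compute = growth_factor(YEARS, 18)   # compute/storage: doubling every 18 months
sequence = growth_factor(YEARS, 7)   # sequence data: assumed midpoint of 6-8 months

print(f"compute/storage grows ~{compute:,.0f}x")
print(f"sequence data grows   ~{sequence:,.0f}x")
print(f"data outpaces capacity by ~{sequence / compute:,.0f}x")
```

Under these assumptions capacity grows about a hundredfold in ten years while sequence data grows more than a hundred-thousandfold, which is why storage alone cannot absorb the difference.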


Europe has already paid for the science

Annual cost of generating new protein structure data in labs around the world

Annual cost of maintaining the data in a central database




ELIXIR’s mission
To build a sustainable
European infrastructure for
biological information,
supporting life science
research and its
translation to:
• medicine
• environment
• bioindustries
• society


A distributed pan-European infrastructure




Benefits
ELIXIR will contribute to European innovation by:
• Optimising access and exploitation of life-science data
• Ensuring longevity of the data, thereby protecting
  investments already made in research
• Enhancing the quality of European research by supporting
  national efforts to increase the competence and number
  of bioinformatics users through training
• Strengthening the global position and influence of Europe
  in life-science research, in both academia and industry




The scientific reason for ELIXIR
• Data is an essential commodity
  for life-science research.
• Ten years ago, finding the
  connection between a gene
  and a characteristic (e.g.
  drought tolerance, risk of heart
  disease) could take years; now
  it takes minutes.
Image courtesy of Genome Research Ltd.




• Data analysis is now the bottleneck in life-science research
• ELIXIR is our only realistic hope of easing that bottleneck



One societal reason for ELIXIR
• The era of personal genome
  sequencing is upon us.
• Sequence data will not cross
  national boundaries.
• Every national health
  system will need expertise
  to interpret it and treat
  patients accordingly.
• Individuals need to be sure
  that their personal
  biological data are in safe
  hands.



The financial reason for ELIXIR
• Europe has already spent
  the money to generate the
  data.
• It will waste all this
  investment in research if the
  future of the data is not
  secured.
• Industry, from SMEs to big
  multinationals, needs
  access to public data to
  analyse its proprietary data.



Maintaining open access
• Open access to life science is essential for
  advances in many areas of research
• Open access to bioinformatics resources provides
  a valuable path to discovery, one that in many
  other areas of research is limited by commercial
  confidentiality
  (Mark Forster, Syngenta, member of the EMBL-EBI Industry Programme)
• Charging for that data, or seeking to restrict
  access through exercising Intellectual Property
  (IP) rights, would impede progress
• ELIXIR will guarantee that open access to
  biological data is maintained. Speaking with a
  single voice will strengthen Europe’s influence in
  such global discussions.


13 ELIXIR Countries




Part two >>>> eScience in LS
• The way we discover knowledge has changed
  fundamentally over just a decade.



                             BIGNORANCE



The general challenge: data has far outgrown institutional handling capacity everywhere
The Data Deluge
The issue: ‘….The amount of digital data is exploding, with a staggering 1.8 zettabytes in 2011’
But Life Sciences is particularly challenged and complex.
More and more, we write ‘about datasets’ that are too large to publish in narrative.
Nanopublications & Cardinal Assertions

Nanopublication
A Nanopublication is the smallest unit of publishable information containing:
1. Assertion
   A statement of concepts in terms of one or more ‘subject -> predicate -> object’ (triple) relationships.
2. Provenance
   a) Attribution – Who made this assertion, when and where?
   b) Supporting information – Any other information which is relevant to the assertion (e.g. this assertion is only valid in humans under 18).

Cardinal Assertion
A Cardinal Assertion aggregates all ‘n’ Nanopublications making the same assertion. It therefore has 1 assertion and ‘n’ provenances, eliminating redundancy.
[Figure: 1 identical assertion, ‘n’ different provenances]
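The relationship between Nanopublications and Cardinal Assertions can be sketched as a small data structure. This is a minimal illustration under stated assumptions, not the project's actual data model: the class and field names are hypothetical, each assertion is simplified to a single triple (the slide allows one or more), and the example records are made-up placeholders.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    """One 'subject -> predicate -> object' triple (simplified to a single triple)."""
    subject: str
    predicate: str
    obj: str


@dataclass(frozen=True)
class Provenance:
    """Attribution (who, when, where) plus optional supporting information."""
    who: str
    when: str
    where: str
    supporting: str = ""


@dataclass(frozen=True)
class Nanopublication:
    assertion: Assertion
    provenance: Provenance


def cardinal_assertions(nanopubs):
    """Group nanopublications by assertion: 1 assertion -> its 'n' provenances."""
    grouped = defaultdict(list)
    for nanopub in nanopubs:
        grouped[nanopub.assertion].append(nanopub.provenance)
    return dict(grouped)


# Hypothetical placeholder data, not taken from any real dataset.
claim = Assertion("gene_X", "is_associated_with", "disease_Y")
other = Assertion("gene_X", "encodes", "protein_Z")
nanopubs = [
    Nanopublication(claim, Provenance("lab_A", "2011", "journal_1")),
    Nanopublication(claim, Provenance("lab_B", "2012", "database_2", "humans only")),
    Nanopublication(other, Provenance("lab_A", "2011", "journal_1")),
]

cardinal = cardinal_assertions(nanopubs)
for assertion, provenances in cardinal.items():
    print(assertion.subject, assertion.predicate, assertion.obj, "n =", len(provenances))
```

Because the assertion is stored once as the dictionary key, repeated statements of the same claim add only a provenance record, which is the redundancy elimination the slide describes.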
Under the hood……
Managing volume & complexity
Combining Cardinal Assertions with Concept Profiles reduces the amount of data by ≈99.999996%
[Figure: funnel from Individual Nanopublications through Individual Cardinal Assertions to Individual Concept Profiles]
Individual Nanopublications: > 10^14
Individual Cardinal Assertions: > 10^11
Individual Concept Profiles: ≈ 4x10^6
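The quoted reduction can be checked from the slide's own counts, reading them as more than 10^14 Nanopublications, more than 10^11 Cardinal Assertions and about 4x10^6 Concept Profiles. A quick sketch, assuming the lower bounds are used as the actual values (variable names are illustrative):

```python
nanopublications = 1e14     # "> 10^14" on the slide, taken at its lower bound
cardinal_assertions = 1e11  # "> 10^11" on the slide, taken at its lower bound
concept_profiles = 4e6      # "≈ 4x10^6" on the slide

# Fraction of items eliminated at each stage, relative to the raw nanopublications.
to_cardinal = 1 - cardinal_assertions / nanopublications
to_profiles = 1 - concept_profiles / nanopublications

print(f"nanopublications -> cardinal assertions: {to_cardinal:.1%}")
print(f"nanopublications -> concept profiles:    {to_profiles:.6%}")
```

The second figure comes out at 99.999996%, matching the percentage on the slide.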
The LS concept web: 2x2x10^6 concepts (profiles)
A dynamic Concept Web versus a static Ontology




[Figure legend: known reference pairs; non-co-occurrence pairs]
[Figure annotations: More mutual information; No increase in concept overlap; Including manual curation; More concepts in common; Removal of low info paths]
eScience…. in silico reasoning and in cerebro validation


                      Expert Skype calls




                       Reading up
Organisation of the ecosystem
Global Authority: Endorse; Best Practices; Assist & Certify
Nanopublishers: CA Space (OCS & ICS) Providers; ONS/INSs; Original Data Owners
App & Service Providers: Application development; Reasoning services; technical and process consultancy; project delivery capacity
Users: Knowledge Management; Academic & Commercial Users; Knowledge Discovery
IN ANY CASE: regardless of how ‘sensitive’ your data is, it is malpractice to:
        - Generate data without a solid stewardship plan
        - Build impenetrable SILOS
        - Fail to record provenance
        - Store data in a non-interoperable format
        - Think that data = information

        - EVEN if your only goal is the Nobel Prize
          (or, for the Dutch, a Spinoza Prize)




Acceptance of Semantic Web Approach

Over the last decade, academic research organisations developed new methodologies and tools to address the Big Data problem.
Global agreement by leading scientists on a unique Nanopublication solution.
Hundreds of millions already invested in the basic technology.
Applicable as a technology across (STM) domains and industries.
Pharmaceutical companies are early adopters (Innovative Medicines Initiative).
The ‘Dutch Team’
 •   Herman van Haagen, MSc (LUMC)
 •   Dr. Peter Bram ‘t Hoen (LUMC)
 •   Dr. Marco Roos (LUMC)
 •   Dr. Erik Schultes (LUMC)
 •   Prof. Johan den Dunnen (LUMC)
 •   Prof. Gertjan van Ommen (LUMC)
 •   Dr. Erik van Mulligen (EMC)
 •   Dr. Jan Kors (EMC)
 •   Dr. Martijn Schuemie (EMC)
 •   Prof. Johan van der Lei (EMC)
 •   Dr. Rob Hooft (NBIC)
 •   Dr. Christine Chichester (NBIC)
 •   Dr. Leon Mei (NBIC)
 •   Kees Burger (NBIC)
 •   Bharat Singh (NBIC/EMC)
 •   Dr. Marc van Driel (NBIC)
 •   Dr. Ruben Kok (NBIC)
 •   Prof. Marcel Reinders (NBIC)
 •   Prof. Jaap Heringa (NBIC)
 •   Prof. Gert Vriend (NBIC)
 •   Dr. Morris Swertz (BBMRI, CWA)
 •   Dr. Andra Waagmeester (NBIC)
 •   Dr. Kristina Hettne (LUMC)
 •   Dr. Rene van Schaik (eScience Centre)
 •   Drs. Albert Mons (PHORTOS consultants)
 •   Mr. Drs. Arie Baak (PHORTOS consultants)

Acknowledging…
CWA - Open PHACTS
 •   Prof. Amos Bairoch (SIB, Switzerland, CWA)
 •   Prof. Carole Goble (Manchester, CWA, OPS)
 •   Prof. Katy Börner (Indiana University, CWA)
 •   Prof. Mark Musen (NCBO, Stanford, CWA, OPS)
 •   Dr. Pascale Gaudet (UniProt, ISB, CWA)
 •   Dr. Mike Conlon (VIVO, UF, CWA)
 •   Prof. Maryann Martone (Force 11, USC, CWA)
 •   Dr. Nigam Shah (NCBO, Stanford, CWA, OPS)
 •   Dr. Mark Wilkinson (Canada, CWA)
 •   Abel Packer (Brazil, Scielo, CWA, OPS)
 •   Jan Velterop (ACKnowledge, CWA, OPS)
 •   Albert Mons (CWA, NBIC)
 •   Prof. Frank van Harmelen (FUA/LARKC, CWA, OPS)
 •   Dr. Chris Evelo (Maastricht, CWA, OPS)
 •   Dr. Antony Williams (RSC/ChemSpider, CWA, OPS)
 •   Dr. Richard Kidd (RSC, OPS)
 •   Dr. Paul Groth (FUA, CWA, OPS)
 •   Dr. Michel Dumontier (Canada, CWA, OPS)
 •   Dr. Andrew Gibson (UA, CWA, OPS)
 •   Dr. Bryn Williams-Jones (Pfizer, OPS)
 •   Dr. Ian Dix (Astra Zeneca, OPS)
 •   Dr. Niklas Blomberg (Astra Zeneca, OPS)
 •   Dr. Mike Barnes (GSK, OPS)
 •   Prof. Jan-Erik Litton (CWA, BBMRI)

Winnende voorstellen location-based services - deel 2
 

Último

DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 

Último (20)

DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 

Big Data

  • 1. A curse of interdisciplinarity: ‘A challenge in the other discipline always seems ‘easy’ because we are not hindered by knowledge’. Barend Mons (DTL-DISC/ELIXIR), NBIC, LUMC.
  • 3. ELIXIR Safeguarding the results of life science research in Europe European Life Sciences Infrastructure for Biological Information www.elixir-europe.org
  • 4. DISC: the connected data departments of DTL research Hotels DISC* technology facilities technology research education DTL & training *) DISC = DTL Data Integration & Stewardship Centre
  • 5. What is bioinformatics? • The science of storing, retrieving and analysing large amounts of biological information • An interdisciplinary science involving biologists, biochemists, computer scientists and mathematicians • At the heart of modern biology
  • 6. Bioinformatics underpins life-science research: (1) Genomes contain genes. (2) Genes are transcribed. (3) Transcripts translate to protein sequences. (4) Proteins form three-dimensional structures. (5) Proteins interact with each other and with small molecules to form pathways. (6) Pathways combine to build systems.
  • 7. Life Science data: Multi-omics, multi-technology, multi organism, multi dimensional
  • 8. From molecules to medicine. Molecular components: genomes, nucleotides, transcripts, proteins, domains, structures, small molecules. Integration: complexes, pathways, cells, tissues and organs, human individuals, human populations. Translation: biobanks, therapies, disease prevention, early diagnosis.
  • 9. What is ELIXIR? • An ESFRI research infrastructure of global significance • Unites Europe’s leading life science organisations in managing and safeguarding the vast amounts of data being generated every day by publicly funded research. • A large-scale initiative that will provide the facilities necessary for Europe’s life-science researchers to make the most of our rapidly growing store of information about living systems, which is the foundation on which our understanding of life is built.
  • 10. Why ELIXIR? • Creating a robust infrastructure for biological information is a bigger task than EMBL-EBI – or any individual organisation or nation – can take on alone. • Biology has by far the largest research community: • ~3 million life science researchers in Europe • >6 million web hits a day at EMBL-EBI alone • We need to involve other European partners
  • 11. The challenge • Computer speed and storage capacity are doubling every 18 months, and this rate is steady • DNA sequence data has been doubling every 6-8 months over the last 3 years and looks set to continue doing so for this decade (Guy Cochrane, ENA, EMBL-EBI)
  • 12. Europe has already paid for the science. [Figure: annual cost of generating new protein structure data in labs around the world, compared with the annual cost of maintaining the data in a central database]
  • 13. ELIXIR’s mission: To build a sustainable European infrastructure for biological information, supporting life science research and its translation to: medicine, environment, bioindustries, society
  • 14. A distributed pan-European infrastructure
  • 15. Benefits: ELIXIR will contribute to European innovation by: • Optimising access and exploitation of life-science data • Ensuring longevity of the data, thereby protecting investments already made in research • Enhancing the quality of European research by supporting national efforts to increase the competence and number of bioinformatics users through training • Strengthening the global position and influence of Europe in life-science research in both academia and industry
  • 16. The scientific reason for ELIXIR • Data is an essential commodity for life-science research. • Ten years ago, finding the connection between a gene and a characteristic (e.g. drought tolerance, risk of heart disease) could take years; now it takes minutes. • Data analysis is now the bottleneck in life-science research • ELIXIR is our only realistic hope of easing that bottleneck (Image courtesy of Genome Research Ltd.)
  • 17.
  • 18. One societal reason for ELIXIR • The era of personal genome sequencing is upon us. • Sequence data will not cross national boundaries. • Every national health system will need expertise to interpret it and treat patients accordingly. • Individuals need to be sure that their personal biological data are in safe hands.
  • 19. The financial reason for ELIXIR • Europe has already spent the money to generate the data. • It will waste all this investment in research if the future of the data is not secured. • Industry, from SMEs to big multinationals, needs access to public data to analyse its proprietary data.
  • 20. Maintaining open access • Open access to life science is essential for advances in many areas of research • Open access to bioinformatics resources provides a valuable path to discovery, one that in many other areas of research is limited by commercial confidentiality • Charging for that data, or seeking to restrict access through exercising Intellectual Property (IP) rights, would impede progress • ELIXIR will guarantee that open access to biological data is maintained. Speaking with a single voice will strengthen Europe’s influence in such global discussions. (Mark Forster, Syngenta, member of the EMBL-EBI Industry Programme)
  • 22. Part two >>>> eScience in LS • The way we discover knowledge has changed fundamentally over just a decade. BIGNORANCE
  • 23. The general challenge: Data has far outgrown institutional handling capacity. The Data Deluge is everywhere, but Life Sciences is particularly challenged and complex. The issue: more and more, we write ‘about datasets’ that are too large to publish in narrative. ‘…The amount of digital data is exploding, with a staggering 1.8 zettabytes in 2011’
  • 24. Nanopublications & Cardinal Assertions. A Nanopublication is the smallest unit of publishable information, containing: 1. Assertion: a statement of concepts in terms of one or more ‘subject -> predicate -> object’ (triple) relationships. 2. Provenance: a) Attribution – who made this assertion, when and where? b) Supporting information – any other information which is relevant to the assertion (e.g. this assertion is only valid in humans under 18). A Cardinal Assertion aggregates all ‘n’ Nanopublications making the same assertion. It therefore has 1 identical assertion and ‘n’ different provenances, eliminating redundancy.
  • 26. Managing volume & complexity: Combining Cardinal Assertions with Concept Profiles reduces the amount of data by ≈99.999996%. Individual Nanopublications: > 10^14. Individual Cardinal Assertions: > 10^11. Individual Concept Profiles: ≈4×10^6.
  • 27. The LS concept web: 2×2×10^6 concepts (profiles)
  • 28. A dynamic Concept Web versus a static Ontology
  • 29. [Figure labels: known reference pairs; non-co-occurrence pairs; more mutual information; no increase in concept overlap; including manual curation; more concepts in common; removal of low info paths]
  • 30.
  • 31. eScience: in silico reasoning and in cerebro validation (expert Skype calls, reading up)
  • 32. Organisation of the ecosystem Global Authority Nanopublishers App & Service Users Providers Endorse CA Space Application Knowledge (OCS & ICS) development Management Providers Reasoning services Practices Academic & Best ONS/INSs technical and Commercial process Users consultancy project Knowledge Original delivery Discovery Assist & Data Owners capacity Certify
  • 33.
  • 34. IN ANY CASE: regardless of how ‘sensitive’ your data is, it is malpractice to: - Generate data without a solid stewardship plan - Build impenetrable SILOS - Fail to record provenance - Store data in non-interoperable formats - Think that data = information. EVEN if your only goal is the Nobel Prize (or, for the Dutch, a Spinoza Prize)
  • 35. Acceptance of Semantic Web Approach: Over the last decade, academic research organisations developed new methodologies and tools to address the Big Data problem. Global agreement by leading scientists on a unique Nanopublication solution. Hundreds of millions already invested in the base technology. Applicable as a technology across (STM) domains and industries. Pharmaceutical companies are early adopters (Innovative Medicines Initiative).
  • 36. The ‘Dutch Team’: • Herman van Haagen, MSc (LUMC) • Dr. Peter Bram ‘t Hoen (LUMC) • Dr. Marco Roos (LUMC) • Dr. Erik Schultes (LUMC) • Prof. Johan den Dunnen (LUMC) • Prof. Gertjan van Ommen (LUMC) • Dr. Erik van Mulligen (EMC) • Dr. Jan Kors (EMC) • Dr. Martijn Schuemie (EMC) • Prof. Johan van der Lei (EMC) • Dr. Rob Hooft (NBIC) • Dr. Christine Chichester (NBIC) • Dr. Leon Mei (NBIC) • Kees Burger (NBIC) • Bharat Singh (NBIC/EMC) • Dr. Marc van Driel (NBIC) • Dr. Ruben Kok (NBIC) • Prof. Marcel Reinders (NBIC) • Prof. Jaap Heringa (NBIC) • Prof. Gert Vriend (NBIC) • Dr. Morris Swertz (BBMRI, CWA) • Dr. Andra Waagmeester (NBIC) • Dr. Kristina Hettne (LUMC) • Dr. Rene van Schaik (eScience Centre) • Drs. Albert Mons (PHORTOS consultants) • Mr. Drs. Arie Baak (PHORTOS consultants). Acknowledging… (CWA, Open PHACTS): • Prof. Amos Bairoch (SIB, Switzerland, CWA) • Prof. Carole Goble (Manchester, CWA, OPS) • Prof. Katy Borner (Indiana University, CWA) • Prof. Mark Musen (NCBO, Stanford, CWA, OPS) • Dr. Pascale Gaudet (UniProt, ISB, CWA) • Dr. Mike Conlon (VIVO, UF, CWA) • Prof. Maryann Martone (Force 11, USC, CWA) • Dr. Nigam Shah (NCBO, Stanford, CWA, OPS) • Dr. Mark Wilkinson (Canada, CWA) • Abel Packer (Brazil, Scielo, CWA, OPS) • Jan Velterop (ACKnowledge, CWA, OPS) • Albert Mons (CWA, NBIC) • Prof. Frank van Harmelen (FUA/LARKC, CWA, OPS) • Dr. Chris Evelo (Maastricht, CWA, OPS) • Dr. Antony Williams (RSC/ChemSpider, CWA, OPS) • Dr. Richard Kidd (RSC, OPS) • Dr. Paul Groth (FUA, CWA, OPS) • Dr. Michel Dumontier (Canada, CWA, OPS) • Dr. Andrew Gibson (UA, CWA, OPS) • Dr. Bryn Williams-Jones (Pfizer, OPS) • Dr. Ian Dix (Astra Zeneca, OPS) • Dr. Niklas Blomberg (Astra Zeneca, OPS) • Dr. Mike Barnes (GSK, OPS) • Prof. Jan-Erik Litton (CWA, BBMRI)
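
Slide 24 describes a Nanopublication as one assertion (a subject, predicate, object triple) plus its provenance, and a Cardinal Assertion as the aggregate of all Nanopublications that make the same assertion. The sketch below is one possible way to model that idea in Python. It is an illustration only: the class names, field names and example identifiers (`gene:X`, `lab A`, and so on) are assumptions made here, not a format defined in the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple

# A triple is (subject, predicate, object); identifiers are hypothetical.
Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class Provenance:
    """Attribution (who, when, where) plus optional supporting information."""
    attributed_to: str
    when: str
    where: str
    supporting_info: str = ""


@dataclass(frozen=True)
class Nanopublication:
    """One assertion together with one provenance record."""
    assertion: Triple
    provenance: Provenance


@dataclass
class CardinalAssertion:
    """One assertion with the 'n' provenances of everyone who made it."""
    assertion: Triple
    provenances: List[Provenance] = field(default_factory=list)


def aggregate(nanopubs: Iterable[Nanopublication]) -> List[CardinalAssertion]:
    """Group nanopublications that make the same assertion."""
    grouped: Dict[Triple, CardinalAssertion] = {}
    for nanopub in nanopubs:
        if nanopub.assertion not in grouped:
            grouped[nanopub.assertion] = CardinalAssertion(nanopub.assertion)
        grouped[nanopub.assertion].provenances.append(nanopub.provenance)
    return list(grouped.values())


# Hypothetical example: two labs state the same triple, a third states another.
same = ("gene:X", "is_associated_with", "disease:Y")
other = ("gene:X", "is_expressed_in", "tissue:Z")
nanopubs = [
    Nanopublication(same, Provenance("lab A", "2011", "NL")),
    Nanopublication(same, Provenance("lab B", "2012", "UK")),
    Nanopublication(other, Provenance("lab C", "2012", "CH")),
]
cardinal = aggregate(nanopubs)
print(len(nanopubs), "nanopublications ->", len(cardinal), "cardinal assertions")
```

In this sketch the repeated triple is stored once and only the provenance records accumulate, which is the redundancy-removal step the slide refers to.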

Editor's Notes

  1. Messages: The data in the life sciences are not only immense, but also highly complex. First, data are captured from the different levels of organisation in living organisms: DNA, RNA, proteins, metabolites, cells, tissues, organs and whole organisms. Next, even ecological, social-behavioural and epidemiological data play a key role. These data are captured with a variety of instruments and techniques and are in many different formats (not necessarily compatible). Such data are generated in studies on many different (model) organisms, from viruses and bacteria to humans. Many data need interpretation across species. Many data have to be captured in time or space series and are therefore also multidimensional. DISC will not only provide the necessary tools and compute infrastructure but, critically, also the experts to integrate and connect the data towards biological interpretation. In some cases this will only be two pieces of the puzzle, but in many cases more. The final goal is biological understanding and societal application, not just major publications, in the Green, the Red and the White sectors of biology.
  2. Messages: The Big Data problem is now pervading mainstream non-science literature as well and the deluge is everywhere; however, the complexity and multidisciplinary nature of LS data make them a particular challenge. No single institution, or even Big Pharma or DSM/UNILEVER, can have all the technology and expertise in-house (see IMI, ESFRI). Even if economically and technically feasible, repeating the deep analysis and preprocessing of massive (frequently publicly available) datasets behind the firewalls of institutions or companies is now considered a waste of precious resources, as much of it is precompetitive. The real added value is in the biological interpretation of the data and its application in red, green and white innovations. Modern science is really about ‘projecting’ one’s own limited data on a massive body of ‘known’ and prior biological knowledge, way beyond ‘reading’. DISC will support all super-institutional needs for data integration, stewardship and interpretation at the request of the users. DISC will be closely associated with the top research institutions participating in it, and distributed over multiple concentrations of expertise and infrastructure to ensure a continued ‘cutting edge’ offering in all four infrastructural aspects (computing, tooling, expertise and training). Several key technologies can be applied beyond the Life Sciences. If the Netherlands misses out on massive data expertise, other centres will develop and crucial expertise will ‘leave’ our country. Now, NL has a leading role and can benefit (example: BGI China).
  3. The ecosystem approach with interoperable data makes knowledge management and knowledge discovery possible across ALL data.
  4. A private party by definition cannot take this on, because it is not a trusted party (community building, certification, ONS management, etc.). Hence the 4 columns and all the work that has already been done, including 'adaptation' by very many relevant associations and academic institutions (CWA, W3C, ..................). This calls for a PPP approach in which Elsevier plays its own role, strategically positioned in the value chain of the ecosystem. The trusted-party activities, the infrastructure and the community are handled by others.
  5. Chess game metaphor??
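
The figures quoted on slides 11 and 26 can be checked with a short calculation. The sketch below assumes simple exponential doubling for slide 11 and takes the slide 26 bounds (more than 10^14 nanopublications, about 4×10^6 concept profiles) as exact values; the function names are illustrative and not taken from the slides.

```python
def growth_factor(years: float, doubling_months: float) -> float:
    """Growth multiple after `years` if the quantity doubles every `doubling_months`."""
    return 2.0 ** (years * 12.0 / doubling_months)


def reduction_percent(before: float, after: float) -> float:
    """Percentage by which `after` is smaller than `before`."""
    return (1.0 - after / before) * 100.0


# Slide 11: compute capacity doubles every 18 months, sequence data every 6-8 months.
compute_3y = growth_factor(3, 18)        # 2**2 = 4
sequence_3y_slow = growth_factor(3, 8)   # 2**4.5, roughly 22.6
sequence_3y_fast = growth_factor(3, 6)   # 2**6 = 64

# Slide 26: from ~10^14 nanopublications down to ~4x10^6 concept profiles.
reduction = reduction_percent(1e14, 4e6)

print(f"Compute growth over 3 years:  x{compute_3y:.0f}")
print(f"Sequence growth over 3 years: x{sequence_3y_slow:.1f} to x{sequence_3y_fast:.0f}")
print(f"Data reduction: {reduction:.6f}%")
```

Under these assumptions, sequence data grows roughly 6 to 16 times faster than compute capacity over three years, and the reduction works out to the ≈99.999996% stated on slide 26.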