SlideShare uma empresa Scribd logo
1 de 23
Baixar para ler offline
Big Medical Data – Challenge or Potential?
Personalized Medicine Conference, Berlin, March 7th, 2014
Dr. Matthieu-P. Schapranow, Hasso Plattner Institute
Hasso Plattner Institute
Key Facts
■  Founded as a public-private partnership
in 1998 in Potsdam near Berlin, Germany
■  Institute belongs to the
University of Potsdam
■  Ranked 1st in CHE since 2009
■  500 B.Sc. and M.Sc. students
■  10 professors, 150 PhD students
■  Course of study: IT Systems Engineering
Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 20142
Hasso Plattner Institute
Programs
■ Full university curriculum
■ Bachelor (6 semesters)
■ Master (4 semesters)
■ Orthogonal Activities:
□ E-Health Consortium
□ School of Design Thinking
□ Research School
Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 20143
Hasso Plattner Institute
Enterprise Platform and Integration Concepts Group
  Prof. Dr. h.c. Hasso Plattner
■ Research focuses on the technical aspects of enterprise
software and design of complex applications
□  In-Memory Data Management for Enterprise Applications
□  Enterprise Application Programming Model
□  Scientific Data Management
□  Human-Centered Software Design and Engineering
■ Industry cooperations, e.g. SAP, Siemens, Audi, and EADS
■ Research cooperations, e.g. Stanford, MIT, and Berkeley
Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 20144
Partner of
Stanford Center
for Design
Research
Partner of MIT in
Supply Chain
Innovation and
CSAIL
Partner at
UC Berkeley

RAD / AMP Lab
Partner of
SAP AG
Our Motivation
Personalized Medicine
■  Motivation: Can we analyze the entire data of a patient, incl. Electronic
Health Records (EHR) and genome data, during a doctor’s visit?
■  Genome data analysis may add up to weeks,
i.e. biopsy, biological preparation, sequencing,
alignment, variant calling, full analysis, and evaluation
■  Issue: Complex and time-consuming data processing tasks
■  In-memory technology accelerates genome data processing
□  Highly parallel alignment / variant calling
□  Real-time analysis of individual patient or cohort data
□  Combined search in structured / unstructured data
Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 20145
Our Challenge
Distributed Big Data Sources
Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 20146
Human genome/biological data
600GB per full genome
15PB+ in databases of leading institutes
Prescription data
1.5B records from 10,000 doctors
and 10M Patients (100 GB)
Clinical trials
Currently more than 30k
recruiting on ClinicalTrials.gov
Human proteome
160M data points (2.4GB) per sample
>3TB raw proteome data in ProteomicsDB
PubMed database
>23M articles
Hospital information systems
Often more than 50GB
Medical sensor data
Scan of a single organ in 1s
creates 10GB of raw data
Cancer patient records
>160k records at NCT
Combined
column
and row store
Map/Reduce Single and
multi-tenancy
Lightweight
Compression
Insert only
for time travel
Real-time
Replication
Working on
integers
SQL interface
on columns
and rows
Active/passive
data store
Minimal
projections
Group key Reduction of
Software
layers
Dynamic multi-
threading
Bulk load
of data
Object-
relational
mapping
Text retrieval
and extraction
engine
No aggregate
Tables
Data
partitioning
Any attribute
as index
No disk
On-the-fly
extensibility
Analytics on
historical data
Multi-core/
parallelization
Our Approach
In-Memory Technology
Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 20147
+
++
+
+
P
v
+++
t
SQL
x
x
T
disk
Our Vision
Personalized Medicine
Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 20148
Our Vision
Personalized Medicine
Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 2014
Desirability
■  Leveraging directed customer services
■  Portfolio of integrated services for clinicians, researchers, and
patients
■  Include latest research results, e.g. most effective therapies
Viability
■  Enable personalized medicine also in far-off
regions and developing countries
■  Share data via the Internet to get feedback
from word-wide experts (cost-saving)
■  Combine research data (publications,
annotations, genome data) from international
databases in a single knowledge base
Feasibility
■  HiSeq 2500 enables high-coverage whole
genome sequencing in ≈1d
■  IMDB enables allele frequency
determination of 12B records within <1s
■  1 relevant out of 80M annotations <1s
■  Data preparation as a service reduces TCO
9
High-Performance In-Memory Genome Project
Integration of Genomic Data
■  Preprocessing of DNA (alignment,
variant calling) can be modeled and is
executed as integrated process
■  Results are directly stored in in-memory
databases, e.g. for
□  Statistical analyses, and
□  Links to latest research knowledge
■  Real-time analysis of genome data
enables completely new way of research
and therapies, e.g. instant comparison
with patient cohorts
Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 201410
Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 201411
High-Performance In-Memory Genome Project
Architectural Overview
Real-Time Capturing and Data Analysis
In-Memory Database
All Relevant Medical Information
*omics
Electronic
Medical Records
Service
Line Items
Patients Doctors Insurers Researchers
Information and Feedback within the Window of Opportunity
...
High-Performance In-Memory Genome Project
Selected Research Topics
Improving Analyses:
■  Information combination, e.g. Medical Knowledge Cockpit, Oncolyzer
■  Genome Browser enables deep dive into the genome
■  Cohort Analysis, e.g. clustering of patient cohorts
■  Combined Search, e.g. in clinical trials and side-effect databases
■  Pathways Topology Analysis, e.g. to identify cause/effect
  Improving Data Preparations:
■  Graphical modeling of Genome Data Processing (GDP) pipelines
■  Scheduling and execution of multiple GPD pipelines in parallel
■  App store for medical knowledge (bring algorithms to data)
■  Exchange of sensitive data, e.g. history-based access control
■  Billing processes for intellectual property and services
Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 201412
High-Performance In-Memory Genome Project
HANA Oncolyzer
■  Research initiative for exchanging relevant
tumor data to improve personalized treatment
■  Honored 2012 by the Innovation Award of
the German Capitol Region
■  In-memory technology as key-enabler for
real-time analysis of tumor data in
seconds instead of hours
■  Information available at your fingertips:
In-memory technology on mobile devices,
e.g. iPad
■  Interdisciplinary cooperation between
□  Medical doctors,
□  Researchers, and
□  Software engineers.
13 Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 2014
High-Performance In-Memory Genome Project
HANA Oncolyzer
Patient Details Screen
■  Combines patient’s time
series data of specific
patient and analysis results
of patient cohort
■  Real-time analysis across
hospital-wide data
whenever details screen is
accessed
■  http://epic.hpi.uni-
potsdam.de/Home/
HanaOncolyzer
Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 201414
High-Performance In-Memory Genome Project
HANA Oncolyzer
Patient Analysis Screen
■  Allows to real-time analysis
on complete patient cohort
■  Flexible filters and various
chart types allow graphical
exploration of data on
mobile devices
Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 201415
High-Performance In-Memory Genome Project
Medical Knowledge Cockpit
■  Search for affected genes in distributed and
heterogeneous data sources
■  Immediate exploration of relevant information,
such as
□  Gene descriptions,
□  Molecular impact and related pathways,
□  Scientific publications, and
□  Suitable clinical trials.
■  No manual searching for hours or days:
In-memory technology translates searching into
interactive finding!
Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 2014
Automatic clinical trial
matching build on text
analysis features
Unified access to
structured and un-
structured data sources
16
High-Performance In-Memory Genome Project
Search in Structured and Unstructured Medical Data
■  Extended text analysis feature by medical
terminology
□  Genes (122,975 + 186,771 synonyms)
□  Medical terms and categories (98,886 diseases
+ 48,561 synonyms, 47 categories)
□  Pharmaceutical ingredients (7,099 + 5,561
synonyms)
■  Indexed clinicaltrials.gov database (145k trials/
30,138 recruiting)
■  Extracted, e.g., 320k genes, 161k ingredients,
30k periods
■  Select all studies based on multiple filters in less
than 500ms
Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 2014
Clinical trial matching using
text analysis features
Unified access to structured
and unstructured data sources
17
T
High-Performance In-Memory Genome Project
Analysis of Patient Cohorts
■  In a patient cohort, a subset does not respond to
therapy – why?
■  Clustering using various statistical algorithms,
such as k-means or hierarchical clustering
■  Calculation of all locus combinations in which at
least 5% of all TCGA participants have
mutations: 200ms for top 20 combinations
■  Individual clusters are calculated in parallel
directly within the database
■  K-means algorithm: 50ms (PAL) vs. 500ms (R)
Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 2014
Fast clustering directly performed
within the in-memory database
18
High-Performance In-Memory Genome Project
Genome Browser
■  Genome Browser: Comparison of multiple
genomes with reference
■  Combined knowledge base: latest relevant
annotations and literature, e.g. NCBI, dbSNP, and
UCSC
■  Detailed exploration of genome locations and
existing associations
■  Ranked variants, e.g. accordingly to known
diseases
■  Links to more detailed sources enable fast
identification of relevant information while
eliminating long-lasting searches.
Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 2014
Unified access to multiple
formerly disjoint data sources
Matching of genetic variants
and relevant annotations
19
High-Performance In-Memory Genome Project
Pathway Analysis
■  Search in pathways is limited to “is a certain
element contained” today
■  Integrated >1,5k pathways from international
sources, e.g. KEGG, HumanCyc, and
WikiPathways, into HANA
■  Implemented graph-based topology exploration
and ranking based on patient specifics
■  Enables interactive identification of possible
dysfunctions affecting the course of a therapy
before its start
Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 2014
Unified access to multiple
formerly disjoint data sources
Pathway analysis of genetic
variants with graph engine
20
High-Performance In-Memory Genome Project
Drug Response Testing
■  Drug response depends on individual genetic
variants of tumors
■  Challenge: Identification of relevant genetic
variants and their impact on drug response is a
ongoing research activity, e.g. Xenograft models
■  Exploration of experiment results is time-
consuming and Excel-driven
■  In-memory technology enables interactive
exploration of experiment data to leverage new
scientific insights
Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 2014
Interactive analysis of
correlations between drugs
and genetic variants
21
What to take home?
Test it: http://we.AnalyzeGenomes.com
  For researchers
■  Enable real-time analysis of medical data
■  Automatic assessment of data, e.g. scan of pathways to identify
cellular impact of mutations
■  Combined free-text search in publications, diagnosis, and EMR
data, i.e. structured and unstructured data
  For clinicians
■  Preventive diagnostics to identify risk patients early
■  Indicate pharmacokinetic correlations
■  Scan for similar patient cases, e.g. to evaluate therapy success
  For patients
■  Identify relevant clinical trials and medical experts
■  Start most appropriate therapy as early as possible
Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 201422
Keep in contact with us!
Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 2014
Hasso Plattner Institute
Enterprise Platform & Integration Concepts
Dr. Matthieu-P. Schapranow
August-Bebel-Str. 88
14482 Potsdam, Germany
Dr. Matthieu-P. Schapranow
schapranow@hpi.uni-potsdam.de
http://we.analyzegenomes.com/
23

Mais conteúdo relacionado

Mais procurados

Analyze Genomes: A Federated In-memory Database Computing Platform enabling r...
Analyze Genomes: A Federated In-memory Database Computing Platform enabling r...Analyze Genomes: A Federated In-memory Database Computing Platform enabling r...
Analyze Genomes: A Federated In-memory Database Computing Platform enabling r...Matthieu Schapranow
 
Patient Journey in Oncology 2025: Molecular Tumour Boards in Practice
Patient Journey in Oncology 2025: Molecular Tumour Boards in PracticePatient Journey in Oncology 2025: Molecular Tumour Boards in Practice
Patient Journey in Oncology 2025: Molecular Tumour Boards in PracticeMatthieu Schapranow
 
In-Memory Apps for Precision Medicine
In-Memory Apps for Precision MedicineIn-Memory Apps for Precision Medicine
In-Memory Apps for Precision MedicineMatthieu Schapranow
 
KConnect - making Medical Information Easier to Find: Semantic Annotation and...
KConnect - making Medical Information Easier to Find: Semantic Annotation and...KConnect - making Medical Information Easier to Find: Semantic Annotation and...
KConnect - making Medical Information Easier to Find: Semantic Annotation and...Peter Voisey
 
How will AI affect the patient journey of the future?
How will AI affect the patient journey of the future?How will AI affect the patient journey of the future?
How will AI affect the patient journey of the future?Matthieu Schapranow
 
Algorithmen statt Ärzte: Algorithmen statt Ärzte: Ersetzt Big Data künftig ...
Algorithmen statt Ärzte: Algorithmen statt Ärzte: Ersetzt Big Data künftig ...Algorithmen statt Ärzte: Algorithmen statt Ärzte: Ersetzt Big Data künftig ...
Algorithmen statt Ärzte: Algorithmen statt Ärzte: Ersetzt Big Data künftig ...Matthieu Schapranow
 
Pharma data analytics
Pharma data analyticsPharma data analytics
Pharma data analyticsAxon Lawyers
 
In-memory Applications for Oncology
In-memory Applications for OncologyIn-memory Applications for Oncology
In-memory Applications for OncologyMatthieu Schapranow
 
AI for Precision Medicine (Pragmatic preclinical data science)
AI for Precision Medicine (Pragmatic preclinical data science)AI for Precision Medicine (Pragmatic preclinical data science)
AI for Precision Medicine (Pragmatic preclinical data science)Paul Agapow
 
Heart Diseases Diagnosis Using Data Mining Techniques
Heart Diseases Diagnosis Using Data Mining TechniquesHeart Diseases Diagnosis Using Data Mining Techniques
Heart Diseases Diagnosis Using Data Mining Techniquespaperpublications3
 
Big Data and its Impact on Industry (Example of the Pharmaceutical Industry)
Big Data and its Impact on Industry (Example of the Pharmaceutical Industry)Big Data and its Impact on Industry (Example of the Pharmaceutical Industry)
Big Data and its Impact on Industry (Example of the Pharmaceutical Industry)Hellmuth Broda
 
Analyze Genomes: Drug Response Analysis
Analyze Genomes: Drug Response AnalysisAnalyze Genomes: Drug Response Analysis
Analyze Genomes: Drug Response AnalysisMatthieu Schapranow
 
AI in translational medicine webinar
AI in translational medicine webinarAI in translational medicine webinar
AI in translational medicine webinarPistoia Alliance
 
The End of the Drug Development Casino?
The End of the Drug Development Casino?The End of the Drug Development Casino?
The End of the Drug Development Casino?Paul Agapow
 
UCSF Informatics Day 2014 - Ida Sim, "Informatics Technologies: From a Data-C...
UCSF Informatics Day 2014 - Ida Sim, "Informatics Technologies: From a Data-C...UCSF Informatics Day 2014 - Ida Sim, "Informatics Technologies: From a Data-C...
UCSF Informatics Day 2014 - Ida Sim, "Informatics Technologies: From a Data-C...CTSI at UCSF
 
Festival of Genomics 2016 London: Real-time Exploration of the Cancer Genome,...
Festival of Genomics 2016 London: Real-time Exploration of the Cancer Genome,...Festival of Genomics 2016 London: Real-time Exploration of the Cancer Genome,...
Festival of Genomics 2016 London: Real-time Exploration of the Cancer Genome,...Matthieu Schapranow
 
Analytics in healthcare
Analytics in healthcareAnalytics in healthcare
Analytics in healthcareAnushkaAlok
 
Sharing and standards christopher hart - clinical innovation and partnering...
Sharing and standards   christopher hart - clinical innovation and partnering...Sharing and standards   christopher hart - clinical innovation and partnering...
Sharing and standards christopher hart - clinical innovation and partnering...Christopher Hart
 

Mais procurados (20)

Analyze Genomes: A Federated In-memory Database Computing Platform enabling r...
Analyze Genomes: A Federated In-memory Database Computing Platform enabling r...Analyze Genomes: A Federated In-memory Database Computing Platform enabling r...
Analyze Genomes: A Federated In-memory Database Computing Platform enabling r...
 
Patient Journey in Oncology 2025: Molecular Tumour Boards in Practice
Patient Journey in Oncology 2025: Molecular Tumour Boards in PracticePatient Journey in Oncology 2025: Molecular Tumour Boards in Practice
Patient Journey in Oncology 2025: Molecular Tumour Boards in Practice
 
In-Memory Apps for Precision Medicine
In-Memory Apps for Precision MedicineIn-Memory Apps for Precision Medicine
In-Memory Apps for Precision Medicine
 
KConnect - making Medical Information Easier to Find: Semantic Annotation and...
KConnect - making Medical Information Easier to Find: Semantic Annotation and...KConnect - making Medical Information Easier to Find: Semantic Annotation and...
KConnect - making Medical Information Easier to Find: Semantic Annotation and...
 
How will AI affect the patient journey of the future?
How will AI affect the patient journey of the future?How will AI affect the patient journey of the future?
How will AI affect the patient journey of the future?
 
Algorithmen statt Ärzte: Algorithmen statt Ärzte: Ersetzt Big Data künftig ...
Algorithmen statt Ärzte: Algorithmen statt Ärzte: Ersetzt Big Data künftig ...Algorithmen statt Ärzte: Algorithmen statt Ärzte: Ersetzt Big Data künftig ...
Algorithmen statt Ärzte: Algorithmen statt Ärzte: Ersetzt Big Data künftig ...
 
Pharma data analytics
Pharma data analyticsPharma data analytics
Pharma data analytics
 
In-memory Applications for Oncology
In-memory Applications for OncologyIn-memory Applications for Oncology
In-memory Applications for Oncology
 
AI for Precision Medicine (Pragmatic preclinical data science)
AI for Precision Medicine (Pragmatic preclinical data science)AI for Precision Medicine (Pragmatic preclinical data science)
AI for Precision Medicine (Pragmatic preclinical data science)
 
Heart Diseases Diagnosis Using Data Mining Techniques
Heart Diseases Diagnosis Using Data Mining TechniquesHeart Diseases Diagnosis Using Data Mining Techniques
Heart Diseases Diagnosis Using Data Mining Techniques
 
Big Data and its Impact on Industry (Example of the Pharmaceutical Industry)
Big Data and its Impact on Industry (Example of the Pharmaceutical Industry)Big Data and its Impact on Industry (Example of the Pharmaceutical Industry)
Big Data and its Impact on Industry (Example of the Pharmaceutical Industry)
 
Analyze Genomes: Drug Response Analysis
Analyze Genomes: Drug Response AnalysisAnalyze Genomes: Drug Response Analysis
Analyze Genomes: Drug Response Analysis
 
AI in translational medicine webinar
AI in translational medicine webinarAI in translational medicine webinar
AI in translational medicine webinar
 
The End of the Drug Development Casino?
The End of the Drug Development Casino?The End of the Drug Development Casino?
The End of the Drug Development Casino?
 
UCSF Informatics Day 2014 - Ida Sim, "Informatics Technologies: From a Data-C...
UCSF Informatics Day 2014 - Ida Sim, "Informatics Technologies: From a Data-C...UCSF Informatics Day 2014 - Ida Sim, "Informatics Technologies: From a Data-C...
UCSF Informatics Day 2014 - Ida Sim, "Informatics Technologies: From a Data-C...
 
AI in Oncology
AI in OncologyAI in Oncology
AI in Oncology
 
Festival of Genomics 2016 London: Real-time Exploration of the Cancer Genome,...
Festival of Genomics 2016 London: Real-time Exploration of the Cancer Genome,...Festival of Genomics 2016 London: Real-time Exploration of the Cancer Genome,...
Festival of Genomics 2016 London: Real-time Exploration of the Cancer Genome,...
 
Analytics in healthcare
Analytics in healthcareAnalytics in healthcare
Analytics in healthcare
 
Sharing and standards christopher hart - clinical innovation and partnering...
Sharing and standards   christopher hart - clinical innovation and partnering...Sharing and standards   christopher hart - clinical innovation and partnering...
Sharing and standards christopher hart - clinical innovation and partnering...
 
Jane Howard
Jane HowardJane Howard
Jane Howard
 

Destaque

How Big Data is Transforming Medical Information Insights - DIA 2014
How Big Data is Transforming Medical Information Insights - DIA 2014How Big Data is Transforming Medical Information Insights - DIA 2014
How Big Data is Transforming Medical Information Insights - DIA 2014CREATION
 
Big Data
Big DataBig Data
Big DataNGDATA
 
Using Big Data for Improved Healthcare Operations and Analytics
Using Big Data for Improved Healthcare Operations and AnalyticsUsing Big Data for Improved Healthcare Operations and Analytics
Using Big Data for Improved Healthcare Operations and AnalyticsPerficient, Inc.
 

Destaque (7)

How Big Data is Transforming Medical Information Insights - DIA 2014
How Big Data is Transforming Medical Information Insights - DIA 2014How Big Data is Transforming Medical Information Insights - DIA 2014
How Big Data is Transforming Medical Information Insights - DIA 2014
 
Big data ppt
Big data pptBig data ppt
Big data ppt
 
Big Data
Big DataBig Data
Big Data
 
Using Big Data for Improved Healthcare Operations and Analytics
Using Big Data for Improved Healthcare Operations and AnalyticsUsing Big Data for Improved Healthcare Operations and Analytics
Using Big Data for Improved Healthcare Operations and Analytics
 
Big data ppt
Big  data pptBig  data ppt
Big data ppt
 
What is big data?
What is big data?What is big data?
What is big data?
 
What is Big Data?
What is Big Data?What is Big Data?
What is Big Data?
 

Semelhante a Big Medical Data – Challenge or Potential?

In-memory Applications for Informed Patients
In-memory Applications for Informed PatientsIn-memory Applications for Informed Patients
In-memory Applications for Informed PatientsMatthieu Schapranow
 
Introduction to High-performance In-memory Genome Project at HPI
Introduction to High-performance In-memory Genome Project at HPI Introduction to High-performance In-memory Genome Project at HPI
Introduction to High-performance In-memory Genome Project at HPI Matthieu Schapranow
 
Big Data in Genomics: Opportunities and Challenges
Big Data in Genomics: Opportunities and ChallengesBig Data in Genomics: Opportunities and Challenges
Big Data in Genomics: Opportunities and ChallengesMatthieu Schapranow
 
Gesundheit geht uns alle an: Smart Data ermöglicht passendere Entscheidungen...
Gesundheit geht uns alle an: Smart Data ermöglicht passendere Entscheidungen...Gesundheit geht uns alle an: Smart Data ermöglicht passendere Entscheidungen...
Gesundheit geht uns alle an: Smart Data ermöglicht passendere Entscheidungen...Matthieu Schapranow
 
Analyze Genomes: A Federated In-Memory Database System For Life Sciences
Analyze Genomes: A Federated In-Memory Database System For Life SciencesAnalyze Genomes: A Federated In-Memory Database System For Life Sciences
Analyze Genomes: A Federated In-Memory Database System For Life SciencesMatthieu Schapranow
 
The Driver of the Healthcare System in the 21st Century: Real-world Applicati...
The Driver of the Healthcare System in the 21st Century: Real-world Applicati...The Driver of the Healthcare System in the 21st Century: Real-world Applicati...
The Driver of the Healthcare System in the 21st Century: Real-world Applicati...Matthieu Schapranow
 
Analyze Genomes: In-memory Apps for Next-generation Life Sciences Research
Analyze Genomes: In-memory Apps for Next-generation Life Sciences ResearchAnalyze Genomes: In-memory Apps for Next-generation Life Sciences Research
Analyze Genomes: In-memory Apps for Next-generation Life Sciences ResearchMatthieu Schapranow
 
Turning Big Data into Precision Medicine
Turning Big Data into Precision MedicineTurning Big Data into Precision Medicine
Turning Big Data into Precision MedicineMatthieu Schapranow
 
Gaining Time -- Real-time Analysis of Big Medical Data
Gaining Time -- Real-time Analysis of Big Medical DataGaining Time -- Real-time Analysis of Big Medical Data
Gaining Time -- Real-time Analysis of Big Medical DataMatthieu Schapranow
 
Gaining Time – Real-time Analysis of Big Medical Data
Gaining Time – Real-time Analysis of Big Medical Data Gaining Time – Real-time Analysis of Big Medical Data
Gaining Time – Real-time Analysis of Big Medical Data SAP Technology
 
Analyze Genomes: In-memory Apps supporting Precision Medicine
Analyze Genomes: In-memory Apps supporting Precision MedicineAnalyze Genomes: In-memory Apps supporting Precision Medicine
Analyze Genomes: In-memory Apps supporting Precision MedicineMatthieu Schapranow
 
Festival of Genomics 2016 London: Challenges of Big Medical Data?
Festival of Genomics 2016 London: Challenges of Big Medical Data?Festival of Genomics 2016 London: Challenges of Big Medical Data?
Festival of Genomics 2016 London: Challenges of Big Medical Data?Matthieu Schapranow
 
Analyze Genomes Services for Precision Medicine
Analyze Genomes Services for Precision MedicineAnalyze Genomes Services for Precision Medicine
Analyze Genomes Services for Precision MedicineMatthieu Schapranow
 
A Federated In-Memory Database Computing Platform Enabling Real-Time Analysis...
A Federated In-Memory Database Computing Platform Enabling Real-Time Analysis...A Federated In-Memory Database Computing Platform Enabling Real-Time Analysis...
A Federated In-Memory Database Computing Platform Enabling Real-Time Analysis...Matthieu Schapranow
 
Enabling Real-Time Genome Data Research with In-Memory Database Technology (I...
Enabling Real-Time Genome Data Research with In-Memory Database Technology (I...Enabling Real-Time Genome Data Research with In-Memory Database Technology (I...
Enabling Real-Time Genome Data Research with In-Memory Database Technology (I...Matthieu Schapranow
 
In-Memory Data Management for Systems Medicine
In-Memory Data Management for Systems MedicineIn-Memory Data Management for Systems Medicine
In-Memory Data Management for Systems MedicineMatthieu Schapranow
 
A Tool for Optimizing De-Identified Health Data for Use in Statistical Classi...
A Tool for Optimizing De-Identified Health Data for Use in Statistical Classi...A Tool for Optimizing De-Identified Health Data for Use in Statistical Classi...
A Tool for Optimizing De-Identified Health Data for Use in Statistical Classi...arx-deidentifier
 
Advanced Analytics for Clinical Data Full Event Guide
Advanced Analytics for Clinical Data Full Event GuideAdvanced Analytics for Clinical Data Full Event Guide
Advanced Analytics for Clinical Data Full Event GuidePfizer
 
Umc floortje scheepers
Umc floortje scheepersUmc floortje scheepers
Umc floortje scheepersBigDataExpo
 
Festival of Genomics 2016 London: Analyze Genomes: Real-world Examples
Festival of Genomics 2016 London: Analyze Genomes: Real-world ExamplesFestival of Genomics 2016 London: Analyze Genomes: Real-world Examples
Festival of Genomics 2016 London: Analyze Genomes: Real-world ExamplesMatthieu Schapranow
 

Semelhante a Big Medical Data – Challenge or Potential? (20)

In-memory Applications for Informed Patients
In-memory Applications for Informed PatientsIn-memory Applications for Informed Patients
In-memory Applications for Informed Patients
 
Introduction to High-performance In-memory Genome Project at HPI
Introduction to High-performance In-memory Genome Project at HPI Introduction to High-performance In-memory Genome Project at HPI
Introduction to High-performance In-memory Genome Project at HPI
 
Big Data in Genomics: Opportunities and Challenges
Big Data in Genomics: Opportunities and ChallengesBig Data in Genomics: Opportunities and Challenges
Big Data in Genomics: Opportunities and Challenges
 
Gesundheit geht uns alle an: Smart Data ermöglicht passendere Entscheidungen...
Gesundheit geht uns alle an: Smart Data ermöglicht passendere Entscheidungen...Gesundheit geht uns alle an: Smart Data ermöglicht passendere Entscheidungen...
Gesundheit geht uns alle an: Smart Data ermöglicht passendere Entscheidungen...
 
Analyze Genomes: A Federated In-Memory Database System For Life Sciences
Analyze Genomes: A Federated In-Memory Database System For Life SciencesAnalyze Genomes: A Federated In-Memory Database System For Life Sciences
Analyze Genomes: A Federated In-Memory Database System For Life Sciences
 
The Driver of the Healthcare System in the 21st Century: Real-world Applicati...
The Driver of the Healthcare System in the 21st Century: Real-world Applicati...The Driver of the Healthcare System in the 21st Century: Real-world Applicati...
The Driver of the Healthcare System in the 21st Century: Real-world Applicati...
 
Analyze Genomes: In-memory Apps for Next-generation Life Sciences Research
Analyze Genomes: In-memory Apps for Next-generation Life Sciences ResearchAnalyze Genomes: In-memory Apps for Next-generation Life Sciences Research
Analyze Genomes: In-memory Apps for Next-generation Life Sciences Research
 
Turning Big Data into Precision Medicine
Turning Big Data into Precision MedicineTurning Big Data into Precision Medicine
Turning Big Data into Precision Medicine
 
Gaining Time -- Real-time Analysis of Big Medical Data
Gaining Time -- Real-time Analysis of Big Medical DataGaining Time -- Real-time Analysis of Big Medical Data
Gaining Time -- Real-time Analysis of Big Medical Data
 
Gaining Time – Real-time Analysis of Big Medical Data
Gaining Time – Real-time Analysis of Big Medical Data Gaining Time – Real-time Analysis of Big Medical Data
Gaining Time – Real-time Analysis of Big Medical Data
 
Analyze Genomes: In-memory Apps supporting Precision Medicine
Analyze Genomes: In-memory Apps supporting Precision MedicineAnalyze Genomes: In-memory Apps supporting Precision Medicine
Analyze Genomes: In-memory Apps supporting Precision Medicine
 
Festival of Genomics 2016 London: Challenges of Big Medical Data?
Festival of Genomics 2016 London: Challenges of Big Medical Data?Festival of Genomics 2016 London: Challenges of Big Medical Data?
Festival of Genomics 2016 London: Challenges of Big Medical Data?
 
Analyze Genomes Services for Precision Medicine
Analyze Genomes Services for Precision MedicineAnalyze Genomes Services for Precision Medicine
Analyze Genomes Services for Precision Medicine
 
A Federated In-Memory Database Computing Platform Enabling Real-Time Analysis...
A Federated In-Memory Database Computing Platform Enabling Real-Time Analysis...A Federated In-Memory Database Computing Platform Enabling Real-Time Analysis...
A Federated In-Memory Database Computing Platform Enabling Real-Time Analysis...
 
Enabling Real-Time Genome Data Research with In-Memory Database Technology (I...
Enabling Real-Time Genome Data Research with In-Memory Database Technology (I...Enabling Real-Time Genome Data Research with In-Memory Database Technology (I...
Enabling Real-Time Genome Data Research with In-Memory Database Technology (I...
 
In-Memory Data Management for Systems Medicine
In-Memory Data Management for Systems MedicineIn-Memory Data Management for Systems Medicine
In-Memory Data Management for Systems Medicine
 
A Tool for Optimizing De-Identified Health Data for Use in Statistical Classi...
A Tool for Optimizing De-Identified Health Data for Use in Statistical Classi...A Tool for Optimizing De-Identified Health Data for Use in Statistical Classi...
A Tool for Optimizing De-Identified Health Data for Use in Statistical Classi...
 
Advanced Analytics for Clinical Data Full Event Guide
Advanced Analytics for Clinical Data Full Event GuideAdvanced Analytics for Clinical Data Full Event Guide
Advanced Analytics for Clinical Data Full Event Guide
 
Umc floortje scheepers
Umc floortje scheepersUmc floortje scheepers
Umc floortje scheepers
 
Festival of Genomics 2016 London: Analyze Genomes: Real-world Examples
Festival of Genomics 2016 London: Analyze Genomes: Real-world ExamplesFestival of Genomics 2016 London: Analyze Genomes: Real-world Examples
Festival of Genomics 2016 London: Analyze Genomes: Real-world Examples
 

Mais de Matthieu Schapranow

ICT Platform to Enable Consortium Work for Systems Medicine of Heart Failure
ICT Platform to Enable Consortium Work for Systems Medicine of Heart FailureICT Platform to Enable Consortium Work for Systems Medicine of Heart Failure
ICT Platform to Enable Consortium Work for Systems Medicine of Heart FailureMatthieu Schapranow
 
Festival of Genomics 2016 London: Mining and Processing of Unstructured Medic...
Festival of Genomics 2016 London: Mining and Processing of Unstructured Medic...Festival of Genomics 2016 London: Mining and Processing of Unstructured Medic...
Festival of Genomics 2016 London: Mining and Processing of Unstructured Medic...Matthieu Schapranow
 
Festival of Genomics 2016 London: Analyze Genomes: Modeling and Executing Gen...
Festival of Genomics 2016 London: Analyze Genomes: Modeling and Executing Gen...Festival of Genomics 2016 London: Analyze Genomes: Modeling and Executing Gen...
Festival of Genomics 2016 London: Analyze Genomes: Modeling and Executing Gen...Matthieu Schapranow
 
Festival of Genomics 2016 London: Analyze Genomes: A Federated In-Memory Comp...
Festival of Genomics 2016 London: Analyze Genomes: A Federated In-Memory Comp...Festival of Genomics 2016 London: Analyze Genomes: A Federated In-Memory Comp...
Festival of Genomics 2016 London: Analyze Genomes: A Federated In-Memory Comp...Matthieu Schapranow
 
Festival of Genomics 2016 London: What to take home?
Festival of Genomics 2016 London: What to take home?Festival of Genomics 2016 London: What to take home?
Festival of Genomics 2016 London: What to take home?Matthieu Schapranow
 
Festival of Genomics 2016 London: Agenda
Festival of Genomics 2016 London: AgendaFestival of Genomics 2016 London: Agenda
Festival of Genomics 2016 London: AgendaMatthieu Schapranow
 
A Federated In-Memory Database System for Life Sciences
A Federated In-Memory Database System for Life SciencesA Federated In-Memory Database System for Life Sciences
A Federated In-Memory Database System for Life SciencesMatthieu Schapranow
 

Mais de Matthieu Schapranow (8)

ICT Platform to Enable Consortium Work for Systems Medicine of Heart Failure
ICT Platform to Enable Consortium Work for Systems Medicine of Heart FailureICT Platform to Enable Consortium Work for Systems Medicine of Heart Failure
ICT Platform to Enable Consortium Work for Systems Medicine of Heart Failure
 
Festival of Genomics 2016 London: Mining and Processing of Unstructured Medic...
Festival of Genomics 2016 London: Mining and Processing of Unstructured Medic...Festival of Genomics 2016 London: Mining and Processing of Unstructured Medic...
Festival of Genomics 2016 London: Mining and Processing of Unstructured Medic...
 
Festival of Genomics 2016 London: Analyze Genomes: Modeling and Executing Gen...
Festival of Genomics 2016 London: Analyze Genomes: Modeling and Executing Gen...Festival of Genomics 2016 London: Analyze Genomes: Modeling and Executing Gen...
Festival of Genomics 2016 London: Analyze Genomes: Modeling and Executing Gen...
 
Festival of Genomics 2016 London: Analyze Genomes: A Federated In-Memory Comp...
Festival of Genomics 2016 London: Analyze Genomes: A Federated In-Memory Comp...Festival of Genomics 2016 London: Analyze Genomes: A Federated In-Memory Comp...
Festival of Genomics 2016 London: Analyze Genomes: A Federated In-Memory Comp...
 
Festival of Genomics 2016 London: What to take home?
Festival of Genomics 2016 London: What to take home?Festival of Genomics 2016 London: What to take home?
Festival of Genomics 2016 London: What to take home?
 
Festival of Genomics 2016 London: Agenda
Festival of Genomics 2016 London: AgendaFestival of Genomics 2016 London: Agenda
Festival of Genomics 2016 London: Agenda
 
Big Data in Life Sciences
Big Data in Life SciencesBig Data in Life Sciences
Big Data in Life Sciences
 
A Federated In-Memory Database System for Life Sciences
A Federated In-Memory Database System for Life SciencesA Federated In-Memory Database System for Life Sciences
A Federated In-Memory Database System for Life Sciences
 

Último

SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 

Último (20)

SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 

Big Medical Data – Challenge or Potential?

  • 1. Big Medical Data – Challenge or Potential? Personalized Medicine Conference, Berlin, March 7th, 2014 Dr. Matthieu-P. Schapranow, Hasso Plattner Institute
  • 2. Hasso Plattner Institute Key Facts ■  Founded as a public-private partnership in 1998 in Potsdam near Berlin, Germany ■  Institute belongs to the University of Potsdam ■  Ranked 1st in CHE since 2009 ■  500 B.Sc. and M.Sc. students ■  10 professors, 150 PhD students ■  Course of study: IT Systems Engineering Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 20142
  • 3. Hasso Plattner Institute Programs ■ Full university curriculum ■ Bachelor (6 semesters) ■ Master (4 semesters) ■ Orthogonal Activities: □ E-Health Consortium □ School of Design Thinking □ Research School Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 20143
  • 4. Hasso Plattner Institute Enterprise Platform and Integration Concepts Group   Prof. Dr. h.c. Hasso Plattner ■ Research focuses on the technical aspects of enterprise software and design of complex applications □  In-Memory Data Management for Enterprise Applications □  Enterprise Application Programming Model □  Scientific Data Management □  Human-Centered Software Design and Engineering ■ Industry cooperations, e.g. SAP, Siemens, Audi, and EADS ■ Research cooperations, e.g. Stanford, MIT, and Berkeley Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 20144 Partner of Stanford Center for Design Research Partner of MIT in Supply Chain Innovation and CSAIL Partner at UC Berkeley
 RAD / AMP Lab Partner of SAP AG
  • 5. Our Motivation Personalized Medicine ■  Motivation: Can we analyze the entire data of a patient, incl. Electronic Health Records (EHR) and genome data, during a doctor’s visit? ■  Genome data analysis may add up to weeks, i.e. biopsy, biological preparation, sequencing, alignment, variant calling, full analysis, and evaluation ■  Issue: Complex and time-consuming data processing tasks ■  In-memory technology accelerates genome data processing □  Highly parallel alignment / variant calling □  Real-time analysis of individual patient or cohort data □  Combined search in structured / unstructured data Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 20145
  • 6. Our Challenge Distributed Big Data Sources Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 20146 Human genome/biological data 600GB per full genome 15PB+ in databases of leading institutes Prescription data 1.5B records from 10,000 doctors and 10M Patients (100 GB) Clinical trials Currently more than 30k recruiting on ClinicalTrials.gov Human proteome 160M data points (2.4GB) per sample >3TB raw proteome data in ProteomicsDB PubMed database >23M articles Hospital information systems Often more than 50GB Medical sensor data Scan of a single organ in 1s creates 10GB of raw data Cancer patient records >160k records at NCT
  • 7. Combined column and row store Map/Reduce Single and multi-tenancy Lightweight Compression Insert only for time travel Real-time Replication Working on integers SQL interface on columns and rows Active/passive data store Minimal projections Group key Reduction of Software layers Dynamic multi- threading Bulk load of data Object- relational mapping Text retrieval and extraction engine No aggregate Tables Data partitioning Any attribute as index No disk On-the-fly extensibility Analytics on historical data Multi-core/ parallelization Our Approach In-Memory Technology Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 20147 + ++ + + P v +++ t SQL x x T disk
  • 8. Our Vision Personalized Medicine Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 20148
  • 9. Our Vision Personalized Medicine Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 2014 Desirability ■  Leveraging directed customer services ■  Portfolio of integrated services for clinicians, researchers, and patients ■  Include latest research results, e.g. most effective therapies Viability ■  Enable personalized medicine also in far-off regions and developing countries ■  Share data via the Internet to get feedback from word-wide experts (cost-saving) ■  Combine research data (publications, annotations, genome data) from international databases in a single knowledge base Feasibility ■  HiSeq 2500 enables high-coverage whole genome sequencing in ≈1d ■  IMDB enables allele frequency determination of 12B records within <1s ■  1 relevant out of 80M annotations <1s ■  Data preparation as a service reduces TCO 9
  • 10. High-Performance In-Memory Genome Project Integration of Genomic Data ■  Preprocessing of DNA (alignment, variant calling) can be modeled and is executed as integrated process ■  Results are directly stored in in-memory databases, e.g. for □  Statistical analyses, and □  Links to latest research knowledge ■  Real-time analysis of genome data enables completely new way of research and therapies, e.g. instant comparison with patient cohorts Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 201410
  • 11. Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 201411 High-Performance In-Memory Genome Project Architectural Overview Real-Time Capturing and Data Analysis In-Memory Database All Relevant Medical Information *omics Electronic Medical Records Service Line Items Patients Doctors Insurers Researchers Information and Feedback within the Window of Opportunity ...
  • 12. High-Performance In-Memory Genome Project Selected Research Topics Improving Analyses: ■  Information combination, e.g. Medical Knowledge Cockpit, Oncolyzer ■  Genome Browser enables deep dive into the genome ■  Cohort Analysis, e.g. clustering of patient cohorts ■  Combined Search, e.g. in clinical trials and side-effect databases ■  Pathways Topology Analysis, e.g. to identify cause/effect   Improving Data Preparations: ■  Graphical modeling of Genome Data Processing (GDP) pipelines ■  Scheduling and execution of multiple GPD pipelines in parallel ■  App store for medical knowledge (bring algorithms to data) ■  Exchange of sensitive data, e.g. history-based access control ■  Billing processes for intellectual property and services Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 201412
  • 13. High-Performance In-Memory Genome Project HANA Oncolyzer ■  Research initiative for exchanging relevant tumor data to improve personalized treatment ■  Honored 2012 by the Innovation Award of the German Capitol Region ■  In-memory technology as key-enabler for real-time analysis of tumor data in seconds instead of hours ■  Information available at your fingertips: In-memory technology on mobile devices, e.g. iPad ■  Interdisciplinary cooperation between □  Medical doctors, □  Researchers, and □  Software engineers. 13 Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 2014
  • 14. High-Performance In-Memory Genome Project HANA Oncolyzer Patient Details Screen ■  Combines patient’s time series data of specific patient and analysis results of patient cohort ■  Real-time analysis across hospital-wide data whenever details screen is accessed ■  http://epic.hpi.uni- potsdam.de/Home/ HanaOncolyzer Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 201414
  • 15. High-Performance In-Memory Genome Project HANA Oncolyzer Patient Analysis Screen ■  Allows to real-time analysis on complete patient cohort ■  Flexible filters and various chart types allow graphical exploration of data on mobile devices Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 201415
  • 16. High-Performance In-Memory Genome Project Medical Knowledge Cockpit ■  Search for affected genes in distributed and heterogeneous data sources ■  Immediate exploration of relevant information, such as □  Gene descriptions, □  Molecular impact and related pathways, □  Scientific publications, and □  Suitable clinical trials. ■  No manual searching for hours or days: In-memory technology translates searching into interactive finding! Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 2014 Automatic clinical trial matching build on text analysis features Unified access to structured and un- structured data sources 16
  • 17. High-Performance In-Memory Genome Project Search in Structured and Unstructured Medical Data ■  Extended text analysis feature by medical terminology □  Genes (122,975 + 186,771 synonyms) □  Medical terms and categories (98,886 diseases + 48,561 synonyms, 47 categories) □  Pharmaceutical ingredients (7,099 + 5,561 synonyms) ■  Indexed clinicaltrials.gov database (145k trials/ 30,138 recruiting) ■  Extracted, e.g., 320k genes, 161k ingredients, 30k periods ■  Select all studies based on multiple filters in less than 500ms Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 2014 Clinical trial matching using text analysis features Unified access to structured and unstructured data sources 17 T
  • 18. High-Performance In-Memory Genome Project Analysis of Patient Cohorts ■  In a patient cohort, a subset does not respond to therapy – why? ■  Clustering using various statistical algorithms, such as k-means or hierarchical clustering ■  Calculation of all locus combinations in which at least 5% of all TCGA participants have mutations: 200ms for top 20 combinations ■  Individual clusters are calculated in parallel directly within the database ■  K-means algorithm: 50ms (PAL) vs. 500ms (R) Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 2014 Fast clustering directly performed within the in-memory database 18
  • 19. High-Performance In-Memory Genome Project Genome Browser ■  Genome Browser: Comparison of multiple genomes with reference ■  Combined knowledge base: latest relevant annotations and literature, e.g. NCBI, dbSNP, and UCSC ■  Detailed exploration of genome locations and existing associations ■  Ranked variants, e.g. accordingly to known diseases ■  Links to more detailed sources enable fast identification of relevant information while eliminating long-lasting searches. Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 2014 Unified access to multiple formerly disjoint data sources Matching of genetic variants and relevant annotations 19
  • 20. High-Performance In-Memory Genome Project Pathway Analysis ■  Search in pathways is limited to “is a certain element contained” today ■  Integrated >1,5k pathways from international sources, e.g. KEGG, HumanCyc, and WikiPathways, into HANA ■  Implemented graph-based topology exploration and ranking based on patient specifics ■  Enables interactive identification of possible dysfunctions affecting the course of a therapy before its start Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 2014 Unified access to multiple formerly disjoint data sources Pathway analysis of genetic variants with graph engine 20
  • 21. High-Performance In-Memory Genome Project Drug Response Testing ■  Drug response depends on individual genetic variants of tumors ■  Challenge: Identification of relevant genetic variants and their impact on drug response is a ongoing research activity, e.g. Xenograft models ■  Exploration of experiment results is time- consuming and Excel-driven ■  In-memory technology enables interactive exploration of experiment data to leverage new scientific insights Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 2014 Interactive analysis of correlations between drugs and genetic variants 21
  • 22. What to take home? Test it: http://we.AnalyzeGenomes.com   For researchers ■  Enable real-time analysis of medical data ■  Automatic assessment of data, e.g. scan of pathways to identify cellular impact of mutations ■  Combined free-text search in publications, diagnosis, and EMR data, i.e. structured and unstructured data   For clinicians ■  Preventive diagnostics to identify risk patients early ■  Indicate pharmacokinetic correlations ■  Scan for similar patient cases, e.g. to evaluate therapy success   For patients ■  Identify relevant clinical trials and medical experts ■  Start most appropriate therapy as early as possible Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 201422
  • 23. Keep in contact with us! Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 2014 Hasso Plattner Institute Enterprise Platform & Integration Concepts Dr. Matthieu-P. Schapranow August-Bebel-Str. 88 14482 Potsdam, Germany Dr. Matthieu-P. Schapranow schapranow@hpi.uni-potsdam.de http://we.analyzegenomes.com/ 23