This document discusses using graphs to connect diverse diabetes-related data and advance research. It describes how different research areas like hospitals, basic research and data analysis view the same "customer" (patient) differently. A graph database called DZDConnect was created to connect heterogeneous data like clinical studies, biobanks, omics data and literature at various levels. Examples show how the graph enables querying across data silos. The graph is being extended with various data sources and tools like natural language processing. The goal is personalized prevention and therapy by identifying diabetes subtypes using machine learning. This will help validate individualized treatment approaches.
3. What is diabetes mellitus?
• metabolic disease
• insulin production in the pancreas is reduced, or
the body responds poorly to insulin
(insulin = hormone the body needs to move glucose out of the
bloodstream into the cells)
• consequences:
• less absorption of sugar
• sugar will not be stored in liver and muscle cells
• persistently high levels of sugar in blood (hyperglycemia)
• tremendous complications
• currently, not curable (only treatable)
Types of diabetes: T1D, T2D, gestational diabetes, special types
4. Diabetes TYPE 1 (T1D)
• approx. 5-10 % of diabetes patients have T1D
• often starts in childhood
• autoimmune reaction
• independent of lifestyle
• patients need an external insulin source
throughout their life
• approx. 20 genes involved
• currently, T1D is not curable
5. Diabetes TYPE 2 (T2D)
• approx. 90-95 % of diabetes patients have T2D
(mostly after age 40)
• insulin resistance; the pancreas is not able to
produce enough insulin
• symptoms develop slowly
• >150 genes have been identified that increase risk
• “the cocktail of evil”: predisposition +
overweight + physical inactivity
6. Some numbers (worldwide)
• 1 in 11 adults has diabetes (425 million)
• quadrupled since 1980
• 12% of global health expenditure is spent on diabetes ($727 billion)
• over 1 million children and adolescents have type 1 diabetes
• two-thirds of people with diabetes are of working age (327 million)
• three quarters of people with diabetes live in low and middle income countries
• 1 in 2 adults with diabetes is undiagnosed (212 million)
Source: International Diabetes Federation (IDF), 2017
7. Some numbers (USA and Germany)
USA
• 30 million have diabetes (9.4 % of adults) [1]
• +1,500,000 p.a.
• 84 million with prediabetes [2]
• $327 billion costs p.a. [1] ($237 bn. medical costs, $90 bn. reduced productivity) [2]
Germany
• 7 million have diabetes (7.4 % of adults) [1]
• +500,000 p.a.
• ~7 million with prediabetes or undiagnosed
• 16 billion € costs p.a. [1]
[1] www.statistica.com [2] American Diabetes Association
8. Overweight/obesity in the US (1985-2009)
obese adults in the US (BMI* >= 30)
*Example: 5'11" and 220.46 lbs (180 cm and 100 kg) gives a BMI of about 31
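The BMI footnote above can be checked with a few lines of Python. This is a minimal sketch using the standard definition (weight in kilograms divided by height in metres squared); the function names `bmi` and `is_obese` are illustrative and not taken from the slides.

```python
LB_PER_KG = 2.20462   # pounds per kilogram
M_PER_INCH = 0.0254   # metres per inch


def bmi(weight_kg, height_m):
    """Body mass index: weight in kg divided by height in metres squared."""
    return weight_kg / height_m ** 2


def is_obese(weight_kg, height_m, threshold=30.0):
    """True if the BMI reaches the obesity threshold used on the slide (BMI >= 30)."""
    return bmi(weight_kg, height_m) >= threshold


# Metric example from the slide: 180 cm and 100 kg
metric_bmi = bmi(100.0, 1.80)

# Imperial example from the slide: 5'11" (71 inches) and 220.46 lbs
imperial_bmi = bmi(220.46 / LB_PER_KG, 71 * M_PER_INCH)

print(round(metric_bmi, 1), round(imperial_bmi, 1))
```

Both variants land slightly above the threshold of 30, so the person in the example counts as obese under this definition.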
9. Complications develop after many years
• kidney: diabetic nephropathy, 40 % of kidney failure/dialysis
• feet: 70 % of all foot amputations
• eyes: diabetic retinopathy, 30 % of loss of sight
• brain: 2-4 fold increased risk of stroke
• acute cardiac death: main cause of death among diabetic patients (33 % of all heart attacks)
• nerves: diabetic neuropathy, amputations of extremities
12. Epigenetics – beyond generation
[Figure: weight [g] over age [weeks], comparing daughters of obese, diabetic mice with daughters of healthy mice]
Huypens and Beckers, Nat Genet. 2016
13. The German Center for Diabetes Research
funded by the Federal Ministry for
Education and Research and the states
5 Partners, 5 associated partners – 400 researchers (basic research and university hospitals)
DZD bundles competencies so that those affected benefit more quickly from research results.
academic, non-profit
14. The German Center for Diabetes Research
Research areas: hospitals, prevention, nutrition / diet, beta cells, genetics, therapy, clinical studies, cohorts, basic research, healthcare
Aims: diabetes treatment, diabetes prevention, prevention of complications
15. Goal:
better diabetes prevention and therapy
personalized prevention and therapy
identify and cluster diabetes subtypes
individualized treatment of subtypes
19. We all “serve“ the same “customer“
Hospitals, Basic Research, Data Analysis
20. But we all see the “customer” a little differently
Views: “Patient”, “Gene”, “Study”, “Metabolite”, “drug”, “statistics”
Examples: 64kg, 178cm, male; C6H12O6; Metformin; T2D; AAGCTTCACATGG; cell; insulin resistance; inactive; mice; prediabetic pig; microscope image; complications
21. Look at our “customer“ in a new way
22. Look at our “customer” from many perspectives simultaneously – connect data
Hospitals, Basic Research and Data Analysis, connected via data
23. Connect data – one option
Hospitals, Basic Research, Data Analysis
“Patient”: 64kg, 178cm, male
“drug”: Metformin
“Study”: T2D, insulin resistance
“Gene”: AAGCTTCACATGG
“Metabolite”: C6H12O6
Other elements: cell, inactive, mice, prediabetic pig, “statistics”, microscope image, complications
25. DZDConnect – a Neo4j graph database
A graph that can help answer
biomedical questions
across locations
across disciplines
across species
extendable
scalable
visualizable
29. Classify types of data
Data types shown: clinical studies, statistics, RNA, DNA, images, chemistry, patients, biosamples, wet lab, drugs
30. Connect types of data
The same data types, now linked to one another: statistics, RNA, DNA, images, chemistry, patients, wet lab, drugs, biosamples, clinical studies
32. Why graph?
• in “biology” everything is connected anyway
• data is connected
• human readable: easy to understand for non-computer
scientists
• easy to query: queries resemble the questions a human would ask
• scalable
• easy to adopt and extend
• visualization
38. Extending our graph
Dr. Jan Krumsiek, Assistant Professor, Weill Cornell Medicine, NYC
metabolic pathway data
from 15-20 very rich data sources
~900,000 nodes
~1.7 million relationships
phenotype association studies
41. How many biosamples were acquired in visit 17 of ‘PLIS’ and which parameters were measured?
Goals:
1. Connect data from our clinical studies and biobanks
2. Researchers can easily browse through measured parameters and available biosamples
3. Metadata of parameters helps to assess which samples are comparable
42. name: HMGU
name: AJ
position: data mgmt
name: PLIS
multi-center: true
recruiting: closed
analysis: on-going
no. of patients: 1105
visit: 17
name: blood
type: OGTT
number of samples: 3436
organism: Human
name: laboratory
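The question from slide 41 can be sketched as a traversal over a tiny in-memory property graph. This is only an illustration in plain Python, not the DZDConnect schema: the node labels (`Study`, `Visit`, `Biosample`, `Parameter`), the relationship types (`HAS_VISIT`, `COLLECTED`, `MEASURED_BY`) and the way the property values above are grouped into nodes are all assumptions.

```python
# Toy property graph. Property values come from the slide; labels,
# relationship types and the grouping into nodes are hypothetical.
nodes = {
    "plis": {"label": "Study", "name": "PLIS", "multi_center": True,
             "recruiting": "closed", "analysis": "on-going",
             "no_of_patients": 1105},
    "visit17": {"label": "Visit", "visit": 17},
    "blood": {"label": "Biosample", "name": "blood", "organism": "Human",
              "number_of_samples": 3436},
    "ogtt": {"label": "Parameter", "type": "OGTT"},
}

# Directed relationships as (source, type, target) triples.
edges = [
    ("plis", "HAS_VISIT", "visit17"),
    ("visit17", "COLLECTED", "blood"),
    ("blood", "MEASURED_BY", "ogtt"),
]


def neighbors(source, rel_type):
    """Return the ids of all nodes reachable from `source` via `rel_type`."""
    return [dst for src, rel, dst in edges if src == source and rel == rel_type]


def biosamples_for_visit(study_name, visit_number):
    """List (sample name, sample count, measured parameters) for one study visit."""
    results = []
    for study_id, props in nodes.items():
        if props.get("label") != "Study" or props.get("name") != study_name:
            continue
        for visit_id in neighbors(study_id, "HAS_VISIT"):
            if nodes[visit_id].get("visit") != visit_number:
                continue
            for sample_id in neighbors(visit_id, "COLLECTED"):
                sample = nodes[sample_id]
                parameters = [nodes[p]["type"]
                              for p in neighbors(sample_id, "MEASURED_BY")]
                results.append((sample["name"],
                                sample["number_of_samples"],
                                parameters))
    return results


print(biosamples_for_visit("PLIS", 17))
```

In a graph database the same question would be expressed as a single pattern-matching query. The nested loops here only make the path from study to visit to biosample to parameter explicit.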
46. Can human T2D genes be studied in the pre-diabetic pig model?
Goals:
1. Connect data from different species (e.g. mice, pig, human)
2. Connect multiomics data
3. Researchers can easily find links between human data and comparable data from animal models
51. Natural language processing (NLP) example
Identification of genetic elements in metabolism by high-throughput mouse
phenotyping.
Metabolic diseases are a worldwide problem but the underlying
genetic factors and their relevance to metabolic disease remain
incompletely understood. Genome-wide research is needed to
characterize so-far unannotated mammalian metabolic genes.
Here, we generate and analyze metabolic phenotypic data
of 2016 knockout mouse strains under the aegis of the
International Mouse Phenotyping Consortium (IMPC) and find 974
gene knockouts with strong metabolic phenotypes. 429 of those
had no previous link to metabolism and 51 genes remain functionally completely
unannotated. We compared human orthologues of these uncharacterized genes in
five GWAS consortia and indeed 23 candidate genes, like ABC1, XYZ2, are associated
with metabolic disease. We further identify common regulatory elements in promoters
of candidate genes. As each regulatory element is composed of several transcription
factor binding sites, our data reveal an extensive metabolic phenotype-associated
network of co-regulated genes.
Our systematic mouse phenotype analysis thus paves the way for full functional
annotation of the genome. Metabolic disorders, including obesity and type 2 diabetes
mellitus, are major challenges for public health.
Rozman and Hrabe de Angelis, Nat Commun. 2018
NLP method by GraphAware
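To give a rough idea of how text like the abstract above can be turned into graph data, here is a deliberately naive sketch in Python. It is not the GraphAware method used in the talk. The regular expression for gene-like symbols, the list of disease terms and the relationship name `MENTIONED_WITH` are assumptions made for illustration only.

```python
import re

# One sentence from the abstract above (ABC1 and XYZ2 are the slide's placeholders).
SENTENCE = ("We compared human orthologues of these uncharacterized genes in "
            "five GWAS consortia and indeed 23 candidate genes, like ABC1, XYZ2, "
            "are associated with metabolic disease.")

# Hypothetical heuristic: 2-5 uppercase letters followed by 1-2 digits.
GENE_PATTERN = re.compile(r"\b[A-Z]{2,5}\d{1,2}\b")

# Hypothetical dictionary of disease terms to look for.
DISEASE_TERMS = ("metabolic disease", "type 2 diabetes", "obesity")


def extract_entities(text):
    """Find gene-like symbols and known disease terms in a piece of text."""
    genes = sorted(set(GENE_PATTERN.findall(text)))
    lowered = text.lower()
    diseases = [term for term in DISEASE_TERMS if term in lowered]
    return {"genes": genes, "diseases": diseases}


def to_triples(entities):
    """Link every gene to every disease mentioned in the same text."""
    return [(gene, "MENTIONED_WITH", disease)
            for gene in entities["genes"]
            for disease in entities["diseases"]]


entities = extract_entities(SENTENCE)
for triple in to_triples(entities):
    print(triple)
```

Each resulting triple could become a relationship between a gene node and a disease node in the graph. A real NLP pipeline would replace the regular expression and the term list with trained entity recognition.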
53. Machine learning for personalized prevention and therapy
identify and cluster diabetes subtypes
individualized treatment of subtypes
validation of personalized treatment
Building blocks: Expert Knowledge, Graph Technology
54. DDPC – Digital Diabetes Prevention Center
• pattern recognition in huge amounts of data
• (un)supervised ML methods to identify subtypes of diabetes
• developing/validating individualized prevention/therapy
transparency to people, benefit for people, benefit for society
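As an illustration of the unsupervised side of this idea, the following Python sketch groups patients with k-means clustering. The slides do not name a specific algorithm, so k-means is a stand-in chosen for brevity. The six data points are synthetic toy values with two made-up numeric features per patient, not study data.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means: returns (cluster index per point, final centroids)."""
    # Deterministic start: use the first k points as initial centroids.
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            distances = [sum((a - b) ** 2 for a, b in zip(p, c))
                         for c in centroids]
            assignments[i] = distances.index(min(distances))
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                centroids[j] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignments, centroids


# Synthetic toy data: (feature_1, feature_2) per hypothetical patient.
patients = [
    (22.0, 5.0), (35.0, 9.0), (23.0, 5.2),
    (34.0, 8.8), (21.5, 4.9), (36.0, 9.3),
]

labels, centers = kmeans(patients, k=2)
print(labels)
```

With real data the features would need scaling, the number of clusters would have to be justified, and any subtype found this way would still require validation against expert knowledge, as the slide indicates.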
55. Next level in diabetes prevention and treatment
Hospitals, Basic Research, Data Analysis
The parents' lifestyle before conception already influences their children's risk of obesity through epigenetic effects,
that is, BEFORE pregnancy
cover all aspects of diabetes research, from molecular studies in cell models and animal models
to clinical investigations in patients and health care research