BIMCV: The Perfect "Big Data" Storm. 
Collision of Petabytes of Population Image Data, Millions of Hardware 
Devices and Thousands of Software Tools. 
National e-Infrastructures 
Maria de la Iglesia, PhD. http://ceib.san.gva.es
OVERVIEW 
• Big Data 
• Strategic Vision of Big Data in EU 
• Strategic Vision of Big Data in US 
• Big Data in Neuroimaging 
• Population Imaging 
• BIMCV – EuroBioimaging - Valencia Node 
• Neuroimaging: the landscapes of the mind 
• Relevant facts
Big Data
Big data techniques and 
technologies 
• Techniques for analyzing big data 
– A/B testing 
– Association rule learning 
– Classification 
– Cluster analysis 
– Crowdsourcing 
– Data fusion and data integration 
  • Signal processing 
  • Natural language processing 
– Data mining
Big data techniques and 
technologies 
• Techniques for analyzing big data 
– Ensemble learning 
– Genetic algorithms 
– Machine learning 
– Natural language processing (NLP) 
– Neural Networks 
• Pattern recognition 
– Network analysis
Big data techniques and 
technologies 
• Techniques for analyzing big data 
– Optimization 
– Pattern recognition 
– Predictive modeling 
– Regression 
– Signal processing 
  • Time series analysis 
  • Data fusion 
– Spatial analysis 
– Statistics
Big data techniques and 
technologies 
• Big Data Technologies 
– Bigtable. Proprietary distributed database system built on the Google File System. Inspiration for HBase. 
– Business intelligence (BI). BI tools are often used to read data that have been previously stored in a data warehouse or data mart. 
– Cassandra. An open source (free) database management system designed to handle huge amounts of data on a distributed system. This system was originally developed at Facebook and is now managed as a project of the Apache Software Foundation.
Big data techniques and 
technologies 
• Big Data Technologies 
– Cloud computing. 
– Data mart. 
– Data warehouse, loaded using ETL (extract, transform, and load). 
– Distributed system. 
– Dynamo 
– ETL 
– Google File System. 
– Hadoop 
– HBase. 
– MapReduce. 
– Mashup
Big data techniques and 
technologies 
• Big Data Technologies 
– Non-relational database. 
– R. 
– Relational database. 
– Semi-structured data. 
– SQL. 
– Stream processing. 
– Structured data. 
– Unstructured data. 
– Visualization.
Big data techniques and 
technologies 
• Big Data Technologies 
– VISUALIZATION 
• Tag cloud 
• Clustergram 
• History flow 
• Spatial information flow
VISUALIZATION: Tag cloud
VISUALIZATION: Clustergram
VISUALIZATION: History flow
VISUALIZATION: Spatial 
information flow
Strategic Vision of Big Data in EU
How Is the European Union Responding to Big Data?
Panel: Personalized Medicine in the 
Era of Big Data 
EHTEL Symposium 
Tapani Piha 
• Head of Unit for eHealth and Technology 
Assessment 
European Commission 
DG Health and Consumers 
Health Systems and Products
How does Big Data link to 
Personalized Medicine? 
• Big Data refers to a collection of data sets so large and complex that it is impossible to process them with the usual databases and tools 
• The data is gathered (most of the time) by people just living their lives (e.g. using mobile phones, the internet, driving cars, paying with banking cards) 
• Big data is used in the private sector (e.g. Google), and in the public sector (e.g. NSA)
Big Data use in public health & 
health care? 
• Research: “In the last five years, more scientific data has been generated than in the entire history of mankind” [1] 
• Health care: more evidence about personalized treatment, better selection of the right provider, better equipped health care providers (e.g. IBM's Watson) 
• Public health: better personalized life-style info for citizens, earlier detection of epidemics, more and quicker access to epidemiological information 
[1] Winston Hide, The Promise of Big Data, Harvard Public Health, 2012
Staff Working Document on -omics 
• the potential for, and issues with, the use of '-omics' technologies in personalised medicine, and the related EU research funding, 
• recent developments in EU legislation for placing medicinal products and devices on the market, 
• factors affecting the uptake of personalised medicine in health care systems.
Commission action on Big Data 
• BIG project: multi-sectorial initiative started in 2011 to promote adoption of earlier waves of big data technology and contribute to EU competitiveness; 
• Green paper on mHealth: to assess the market and further clarify what is needed in the legal framework concerning mHealth; 
• Study in health program: to assess the usages and adoption of big data programs for (public) health systems within the EU.
Strategic Vision of Big Data in US
How Is U.S. Responding? 
National Institute of Standards and 
Technology (NIST) 
NIST is an agency of the U.S. Department of Commerce. 
To search federal science and technology web sites, including online databases see: 
science.org 
NIST program questions: 
Public Inquiries Unit: (301) 975-NIST (6478), Federal Relay Service (800) 877-8339 (TTY). 
NIST, 100 Bureau Drive, Stop 1070, Gaithersburg, MD 20899-1070 
Technical website questions: DO-webmaster@nist.gov
NIST Big Data Public Working Group 
Big Data PWG Overview Presentation 
September 30, 2013 
Wo Chang, NIST 
Robert Marcus, ET-Strategies 
Chaitanya Baru, UC San Diego
Agenda 
• Why Big Data? Why NIST? 
• NBD-PWG Charter 
• Overall Workplan 
• Subgroups Charter and Deliverables 
– Use Case and Requirements SG 
– Definitions and Taxonomies SG 
– Reference Architecture SG 
– Security and Privacy SG 
– Technology Roadmap SG 
• Next Steps 
9/30/13 NBD-PWG Overview 
Why Big Data? Why NIST? 
• Why Big Data? There is broad agreement among commercial, academic, and government leaders about the remarkable potential of “Big Data” to spark innovation, fuel commerce, and drive progress. 
• Why NIST? (a) Recommendation from the January 15-17, 2013 Cloud/Big Data Forum and (b) a lack of consensus on some important, fundamental questions is confusing potential users and holding back progress. Questions such as: 
– What are the attributes that define Big Data solutions? 
– How is Big Data different from the traditional data environments and related 
applications that we have encountered thus far? 
– What are the essential characteristics of Big Data environments? 
– How do these environments integrate with currently deployed architectures? 
– What are the central scientific, technological, and standardization challenges that 
need to be addressed to accelerate the deployment of robust Big Data solutions? 
NBD-PWG is being launched to address these questions and is charged to develop 
consensus definitions, taxonomies, secure reference architecture, and technology roadmap 
for Big Data that can be embraced by all sectors. 
NBD-PWG Deliverables 
Working Drafts version 1.0 for 
1. Big Data Definitions 
2. Big Data Taxonomies 
3. Big Data Requirements 
4. Big Data Security and Privacy Requirements 
5. Big Data Architectures White Paper Survey 
6. Big Data Reference Architectures 
7. Big Data Security and Privacy Reference Architectures 
8. Big Data Technology Roadmap 
NBD-PWG Workplan 
Big Data Ecosystem in One Sentence 
• Use Clouds running Data Analytics Collaboratively processing Big Data to solve problems in X-Informatics (or e-X) 
• X = Astronomy, Biology, Biomedicine, Business, Chemistry, Climate, Crisis, Earth Science, Energy, Environment, Finance, Health, Intelligence, Lifestyle, Marketing, Medicine, Pathology, Policy, Radar, Security, Sensor, Social, Sustainability, Wealth and Wellness with more fields (physics) defined implicitly 
• Spans Industry and Science (research) 
• Education: Data Science, see recent New York Times articles 
• http://datascience101.wordpress.com/2013/04/13/new-york-times-data-science-articles/
Social Informatics 
Visual & Decision Informatics
Big Data Definition 
• There is more consensus on the definition of Data Science than on that of Big Data 
• Big Data refers to digital data volume, velocity and/or variety that: 
– Enable novel approaches to frontier questions previously inaccessible or impractical using current or conventional methods; and/or 
– Exceed the storage capacity or analysis capability of current or conventional methods and systems; and 
– Differentiate by storing and analyzing population data rather than samples. 
– Need management requiring scalability across coupled horizontal resources
Vendor-neutral and Technology-agnostic Proposals 
Data Processing Flow 
M0039 
Data Transformation Flow 
M0017 
IT Stack 
M0047 
Electronic Medical Record (EMR) Data I 
• Application: Large national initiatives around health data are emerging, and 
include developing a digital learning health care system to support 
increasingly evidence-based clinical decisions with timely, accurate and up-to-date 
patient-centered clinical information; using electronic observational 
clinical data to efficiently and rapidly translate scientific discoveries into 
effective clinical treatments; and electronically sharing integrated health 
data to improve healthcare process efficiency and outcomes. These key 
initiatives all rely on high-quality, large-scale, standardized and aggregate 
health data. One needs advanced methods for normalizing patient, 
provider, facility and clinical concept identification within and among 
separate health care organizations to enhance models for defining and 
extracting clinical phenotypes from non-standard discrete and free-text 
clinical data using feature selection, information retrieval and machine 
learning decision-models. One must leverage clinical phenotype data to 
support cohort selection, clinical outcomes research, and clinical decision 
support. 
PP, Fusion, S/Q, Index Parallelism Streaming over EMR (a set per person), viewers
Electronic Medical Record (EMR) Data II 
• Current Approach: Clinical data from more than 1,100 discrete logical, operational healthcare sources in the Indiana Network for Patient Care (INPC), the nation's largest and longest-running health information exchange. This comprises more than 12 million patients and more than 4 billion discrete clinical observations (> 20 TB raw data). Between 500,000 and 1.5 million new real-time clinical transactions are added per day. 
• Futures: Teradata, PostgreSQL and MongoDB supporting information 
retrieval methods to identify relevant clinical features (tf-idf, latent 
semantic analysis, mutual information). Natural Language Processing 
techniques to extract relevant clinical features. Validated features will 
be used to parameterize clinical phenotype decision models based on 
maximum likelihood estimators and Bayesian networks. Decision 
models will be used to identify a variety of clinical phenotypes such as 
diabetes, congestive heart failure, and pancreatic cancer. 
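The tf-idf weighting named in the Futures bullet can be sketched in a few lines. This is a minimal illustration, not the INPC implementation: the function name `tf_idf`, the whitespace tokenizer and the three sample notes are all assumptions made for the example.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumptions for this sketch: tokens are split on whitespace,
    tf is the relative term frequency, idf is log(N / document frequency).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical free-text notes, invented for illustration only.
notes = [
    "patient reports chest pain and shortness of breath",
    "patient reports elevated glucose and frequent thirst",
    "patient reports abdominal pain and weight loss",
]
scores = tf_idf(notes)

if __name__ == "__main__":
    for note_scores in scores:
        ranked = sorted(note_scores.items(), key=lambda kv: -kv[1])
        print([term for term, _ in ranked[:3]])
```

Terms that occur in every note (here "patient", "reports", "and") receive weight zero, so the terms that distinguish one note from the others rise to the top of each ranking.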
Pathology Imaging/ Digital Pathology I 
• Application: Digital pathology imaging is an emerging field where examination of high 
resolution images of tissue specimens enables novel and more effective ways for 
disease diagnosis. Pathology image analysis segments massive (millions per image) 
spatial objects such as nuclei and blood vessels, represented with their boundaries, 
along with many extracted image features from these objects. The derived information 
is used for many complex queries and analytics to support biomedical research and 
clinical diagnosis. 
MR, MRIter, PP, Classification Streaming Parallelism over Images
Pathology Imaging/ Digital Pathology II 
• Current Approach: 1 GB raw image data + 1.5 GB analytical results per 2D image. MPI for image analysis; MapReduce + Hive with spatial extension on supercomputers and clouds. GPUs used effectively. Figure 3 of section 2.12 shows the architecture of Hadoop-GIS, a spatial data warehousing system over MapReduce to support spatial analytics for analytical pathology imaging. 
• Futures: Recently, 3D pathology imaging has been made possible through 3D laser technologies or by serially sectioning hundreds of tissue sections onto slides and scanning them into digital images. Segmenting 3D microanatomic objects from registered serial images could produce tens of millions of 3D objects from a single image. This provides a deep “map” of human tissues for next generation diagnosis. 1 TB raw image data + 1 TB analytical results per 3D image and 1 PB data per moderate-sized hospital per year. 
[Figure: Architecture of Hadoop-GIS, a spatial data warehousing system over MapReduce to support spatial analytics for analytical pathology imaging]
Computational Bioimaging 
• Application: Data delivered from bioimaging is increasingly automated, higher 
resolution, and multi-modal. This has created a data analysis bottleneck that, if 
resolved, can advance the biosciences discovery through Big Data techniques. 
• Current Approach: The current piecemeal analysis approach does not scale to situations where a single scan on emerging machines is 32 TB and medical diagnostic imaging is annually around 70 PB, even excluding cardiology. One needs a web-based one-stop shop for high-performance, high-throughput image processing for producers and consumers of models built on bio-imaging data. 
• Futures: The goal is to solve that bottleneck with extreme-scale computing and community-focused science gateways that support the application of massive data analysis to massive imaging data sets. Workflow components include data acquisition, storage, enhancement, noise minimization, segmentation of regions of interest, crowd-based selection and extraction of features, object classification, organization, and search. Use ImageJ, OMERO, VolRover, advanced segmentation and feature detection software. 
MR, MRIter?, PP, Classification Streaming Parallelism over Images
22: Statistical Relational Artificial Intelligence for Health Care 
• Application: The goal of the project is to analyze large, multi-modal medical data 
including different data types such as imaging, EHR, genetic and natural language. This 
approach employs the relational probabilistic models that have the capability of 
handling rich relational data and modeling uncertainty using probability theory. The 
software learns models from multiple data types and can possibly integrate the 
information and reason about complex queries. Users can provide a set of descriptions 
– say for instance, MRI images and demographic data about a particular subject. They 
can then query for the onset of a particular disease (say Alzheimer’s) and the system 
will then provide a probability distribution over the possible occurrence of this disease. 
• Current Approach: A single server can handle a test cohort of a few hundred patients 
with associated data of 100’s of GB. 
• Futures: A cohort of millions of patients can involve petabyte datasets. Issues include the availability of too much data (images, genetic sequences, etc.), which complicates analysis. A major challenge lies in aligning the data and merging it from multiple sources into a form that is useful for a combined analysis. Another issue is that sometimes a large amount of data is available about a single subject but the number of subjects is not very high (i.e., data imbalance). This can result in learning algorithms picking up random correlations between the multiple data types as important features in analysis. 
MRIter, EGO Streaming Parallelism over People and their EMR
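The data imbalance issue (many measurements per subject, few subjects) can be illustrated with purely random data. The sketch below is an assumption-laden toy, not part of the use case: it draws features that are unrelated to the outcome by construction and reports the strongest correlation found. The function names, sample sizes and seed are hypothetical.

```python
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def max_spurious_correlation(n_subjects, n_features, seed=0):
    """Largest |r| between a random outcome and independent random features."""
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_subjects)]
    best = 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_subjects)]
        best = max(best, abs(pearson(feature, outcome)))
    return best

# Same number of candidate features, very different cohort sizes.
few_subjects = max_spurious_correlation(n_subjects=10, n_features=500)
many_subjects = max_spurious_correlation(n_subjects=500, n_features=500)

if __name__ == "__main__":
    print(f"10 subjects:  max |r| = {few_subjects:.2f}")
    print(f"500 subjects: max |r| = {many_subjects:.2f}")
```

With only a handful of subjects, at least one of the unrelated features tends to look strongly correlated with the outcome; with a larger cohort the strongest chance correlation shrinks considerably.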
The P4 paradigm of Medicine 
PREDICTIVE PREVENTIVE PERSONALIZED PARTICIPATORY
The V4 paradigm in Big Data 
Medicine 
VOLUME VARIETY VELOCITY VALUE
Big Data in Neuroimaging
Human neuroimaging is now, officially, a 
“big data” science 
• Among the examples of “big data” featured at 
the meeting was – no surprise - human 
neuroimaging 
• The Brain Research through Advancing 
Innovative Neurotechnologies (BRAIN) Initiative 
• Initiatives surrounding large-scale brain mapping 
are also underway in Europe 
http://www.humanbrainproject.eu 
• Organization for Human Brain Mapping (OHBM; 
http://www.humanbrainmapping.org)
How Big is “Big”? 
• While size is a relative term when it comes to data, medical imaging applied to the brain comes in a variety of forms, each generating differing types and amounts of information about neural structure and/or function. 
• Studies published in NeuroImage indicate that since 1995 the amount of data collected has doubled approximately every 26 months. At this rate, by 2015 the amount of acquired neuroimaging data alone, discounting header information and before more files are generated during data processing and statistical analysis, may exceed an average of 20 GB per published research study
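The doubling claim can be turned into a small growth model. The sketch below only restates the slide's figures (doubling every 26 months, roughly 20 GB per study by 2015); the function names are hypothetical and 20 GB is taken as 20,000 MB for simplicity. It back-computes the 1995 study size that those two figures would imply.

```python
def projected_size(base_size, base_year, year, doubling_months=26.0):
    """Size after exponential growth that doubles every `doubling_months`."""
    elapsed_months = (year - base_year) * 12
    return base_size * 2 ** (elapsed_months / doubling_months)

def implied_baseline(target_size, target_year, base_year, doubling_months=26.0):
    """Size at `base_year` that grows into `target_size` by `target_year`."""
    elapsed_months = (target_year - base_year) * 12
    return target_size / 2 ** (elapsed_months / doubling_months)

# Assumption: 20 GB per study in 2015, expressed as 20,000 MB.
baseline_1995_mb = implied_baseline(20_000, 2015, 1995)

if __name__ == "__main__":
    print(f"Implied 1995 study size: {baseline_1995_mb:.1f} MB")
    for year in (2000, 2005, 2010, 2015):
        size = projected_size(baseline_1995_mb, 1995, year)
        print(f"{year}: {size:,.0f} MB")
```

Under these assumptions the implied 1995 average is a few tens of megabytes per study, which shows how quickly a 26-month doubling time compounds over two decades.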
Growth of Neuroimaging 
Study Size 
[Figure: study size in megabytes (0 to 20,000) versus year (1990 to 2020); series: Expected, Observed, Predicted] 
Van Horn and Toga (in press) Brain Imaging and Behavior
Kryder’s law: Exponential Growth of 
Data 
MB = megabyte = 10^6, GB = gigabyte = 10^9, TB = terabyte = 10^12, PB = petabyte = 10^15 

Single cryo brain volume (1600 cm^3), data size by voxel resolution: 
Voxel size   Voxel count               Gray scale, 8 bits   Gray scale, 16 bits   Color, 24 bits 
1 cm         12 x 15 x 9               1620                 3000                  4860 
1 mm         120 x 150 x 90            1.62 MB              3.24 MB               4.86 MB 
100 μm       1200 x 1500 x 900         1.62 GB              3.24 GB               4.86 GB 
10 μm        12000 x 15000 x 9000      1.62 TB              3.24 TB               4.86 TB 
1 μm         120000 x 150000 x 90000   1.62 PB              3.24 PB               4.86 PB 

Volume of data and compute power by period: 
Years                   Neuroimaging (annually)   Genomics (BP/yr)   CPU transistor counts (Moore’s law) 
1985-1989               200 GB                    10 MB              1x10^5 
1990-1994               1 TB                      100 MB             1x10^6 
1995-1999               50 TB                     10 GB              5x10^6 
2000-2004               250 TB                    1 TB               1x10^7 
2005-2009               1 PB                      30 TB              8x10^6 
2010-2014               5 PB                      1 PB               1x10^9 
2015-2019 (estimated)   10+ PB                    20+ PB             1x10^11 

Kryder's law, Chip Walter - Scientific American, 2005 - nature.com
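The voxel columns of the table above follow from simple arithmetic on a 12 x 15 x 9 cm bounding box. The sketch below recomputes them; the constant and function names are hypothetical, and sizes are uncompressed payloads with no header or file-format overhead.

```python
BRAIN_DIMS_CM = (12, 15, 9)  # bounding box of the brain volume, in centimetres

# Voxels per centimetre along each axis for the resolutions in the table.
RESOLUTIONS = {"1 cm": 1, "1 mm": 10, "100 um": 100, "10 um": 1_000, "1 um": 10_000}

def voxel_count(voxels_per_cm):
    """Total number of voxels in the bounding box at the given sampling."""
    count = 1
    for side_cm in BRAIN_DIMS_CM:
        count *= side_cm * voxels_per_cm
    return count

def volume_bytes(voxels_per_cm, bits_per_voxel):
    """Uncompressed size in bytes; bit depths are assumed multiples of 8."""
    return voxel_count(voxels_per_cm) * bits_per_voxel // 8

if __name__ == "__main__":
    for label, per_cm in RESOLUTIONS.items():
        sizes = [volume_bytes(per_cm, bits) for bits in (8, 16, 24)]
        print(label, voxel_count(per_cm), sizes)
```

Each tenfold gain in linear resolution multiplies the voxel count, and therefore the storage, by 1,000. By this arithmetic the 16-bit entry at 1 cm resolution works out to 3,240 bytes.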
Big Neuroimaging + Big Genetics = 
REALLY Big Data 
• Obtaining genome-wide sets of single nucleotide polymorphism (SNP) information is becoming routine, and the costs of full genomic sequencing are rapidly becoming affordable. 
• Next Generation Sequencing (NGS) methods are being applied to major brain imaging studies such as the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (Weiner, Veitch et al. 2012), with its initially available sample of 832 subjects. 
• As the bond between neuroimaging and genomics grows tighter, with both areas growing at incredible rates, disk storage and unique data compression techniques become increasingly important.
Multisite Consortia and 
Data Sharing 
• Examples of multisite neuroimaging efforts can be found in the ubiquitous application of neuroimaging in health but also in devastating illnesses such as: 
• Parkinson’s (Evangelou, Maraganore et al. 2009) 
• psychiatric disorders (Schumann, Loth et al. 2010) 
• the mapping of human brain connectivity (Toga, Clark et al. 2012) 
• databases of aging and aging-related diseases, large-scale Autism Research (NDAR; Hall, Huerta et al. 2012) and the Federal Interagency Traumatic Brain Injury Research (FITBIR; Bushnik and Gordon 2012)
Multisite Consortia and Data Sharing 
• The various “grass roots” collections of resting-state 
fMRI data maintained as part of the 
“1000 Functional Connectomes” project 
http://fcon_1000.projects.nitrc.org/ 
(see Biswal, Mennes et al. 2010) 
• Task-based OpenfMRI http://www.openfmri.org 
(Poldrack, Barch et al. 2013) are other notable 
examples.
The Role of Cyberinfrastructure 
• Individual desktop computers are no longer suitable for analyzing potentially petabytes' worth of brain and genomics data at a time. 
• The National Science Foundation (NSF) has made major investments in the computer architecture needed for physics, weather, and geological data. 
• E.g. XSEDE, https://www.xsede.org/, and Open Science Grid, https://www.opensciencegrid.org
The Role of Cyberinfrastructure 
• The Neuroimaging Informatics Tools and 
Resources Clearinghouse 
(NITRC; http://www.nitrc.org ) 
• The International Neuroinformatics Coordinating 
Facility (INCF; http://incf.org ) 
These have begun to deploy local clusters with Amazon EC2 server technology toward this goal, but a larger effort will be required involving dedicated processing centers or distributed grids of linked compute centers.
Conclusions 
• Neuroimaging research, by its very nature, is data intensive, multimodal, and collaborative: factors which have been instrumental in its success and growth. 
• Yet the infrastructure needed to support this advancing form of brain research, where data is king, is still maturing. 
• The next steps for the development of resources supporting “big data” brain imaging at the exabyte scale will require the further creation of new tools and services for data discovery, integration, analysis, and visualization. 
• For the “big data” science of human brain imaging, now is the 
time to begin.
Many 1,000’s of Software Tools 
• Acquisition, processing, storage/DB, service, migration, mining, analysis, 
visualization, annotation, … “(data-driven) process understanding” 
• Biomedical Imaging 
– There are 100’s of different types of image processing algorithms and filters 
– For each type of process there may be dozens of concrete software products (instance implementations) 
• (Example) Neuroimaging 
– NITRC lists > 500 openly shared software tools 
– For each openly shared tool there may be dozens of proprietary or less commonly used analogues
Millions of Dispersed Hardware Devices 
• Cisco: “By the end of 2012, the number of mobile-connected devices will exceed the number of people on Earth” 
• There will be over 10 billion mobile-connected devices in 2016; i.e., there 
will be 1.3 mobile devices per capita 
– These include phones, tablets, laptops, handheld gaming consoles, e-readers, 
in-car entertainment systems, digital cameras, and “machine-to-machine 
modules” 
• DBs, Clients, Servers, Compute-Nodes, Web-Services, Interfaces, … 
• Solution … 
Dinov et al., BMC 2011
[Figure: fMRI processing workflow. Inputs shown: raw fMRI time series; high-resolution anatomical image; standardized brain atlas template; experimental design matrix; study metadata (scanner protocols, subject demographics, stimulus timing, etc.). Processing steps shown: slice timing adjustment; image spatial alignment; functional-structural co-registration; spatial normalization to atlas space; image smoothing (Gaussian spatial filtering); statistical modeling (e.g. GLM). Outputs shown: statistical results maps; graphical overlays; table of statistically significant voxels in atlas space coordinates.] 
Van Horn et al., Nature Neuro, 2004
Pipeline Version 5.9.1 Features 
Graphical Programming Environment 
8/29/2014
Perfect Neuroimaging-Computation Storm? 
• Single Subject Studies (N=1) 
– Genetics: 
• Depending on Coverage(X) 
• Whole Genome Seq Data > 320GB (>80X) 
• Require 2+ TB RAM, and 100+ hrs CPU 
– Imaging: 
• Depending on protocols 
• 40-512 gradient directions Diffusion imaging data 
• Raw (multimodal) Neuroimaging Data > 10 GB 
• Derived Data > 100 GB 
• Require 100GB RAM and 70+ hrs CPU 
• Large Subject Studies 
– Cohort studies (N>10, Typically N~100’s) 
– Multi-Institutional Population-wide Studies (N>1,000) 
– Longitudinal (neuroimaging) studies …
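The step from single-subject budgets to cohort studies is plain multiplication, but the totals are worth seeing. The sketch below assumes every subject needs the per-subject CPU hours quoted above (70 for imaging, 100 for genetics) and that the work parallelizes perfectly; the function names are hypothetical, and the 1,200-core figure is borrowed from the Grid & Cloud Computing slide purely as an example.

```python
def cohort_cpu_hours(n_subjects, cpu_hours_per_subject):
    """Total CPU time if every subject needs the same single-subject budget."""
    return n_subjects * cpu_hours_per_subject

def ideal_wall_clock_days(total_cpu_hours, cores):
    """Lower bound on elapsed days, assuming perfect parallel scaling."""
    return total_cpu_hours / cores / 24

# Population-wide study with N = 1,000 subjects.
imaging_hours = cohort_cpu_hours(1_000, 70)
genetics_hours = cohort_cpu_hours(1_000, 100)
days = ideal_wall_clock_days(imaging_hours + genetics_hours, cores=1_200)

if __name__ == "__main__":
    print(f"Imaging:  {imaging_hours:,} CPU hours")
    print(f"Genetics: {genetics_hours:,} CPU hours")
    print(f"Best case on 1,200 cores: {days:.1f} days")
```

Real runs would take longer than this bound, since memory limits (100 GB to 2+ TB RAM per subject), I/O and scheduling all reduce the usable parallelism.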
From Biomedical Challenges to Modeling, 
Computation, Tools and Curricular 
Training 
• Quantitative Volumetric and Surface based Stats Analyses 
– Interactome: Challenge↔Models↔Data Analysis↔Computation↔Education 
– Statistics Online Computational Resource, Che, et al., JSS (2009) 
[Figure legend: No effect, Marginal, Significant]
Grid & Cloud Computing 
• UCLA Grids 
Cerebro, Medulla 
– 1,200 cores 
– 1.4 TB RAM 
– 12,000 jobs/day 
– 700 users 
• Amazon Cloud 
– 4,300 cores 
– 9.6 TB RAM (new) 
– EC2 (Elastic Compute Cloud) 
– S3 (Simple Storage Service) 
• UC Grid 
• Globus GridFTP 
• INI Cluster @ USC 
– 3,328 cores, 128 GB RAM per 16 cores, 26 TB aggregate memory space. Connectivity is 5 Gbit per 16 cores, roughly 4 Tbit aggregate on compute and another 4.3 Tbit on the storage. 2.43 PB of online storage, currently accelerated by over 50 TB of SSD.
Neuroimaging Applications: 56-ROI Global 
Shape Analysis (NC vs. IBS/Pain) Group 
Effects 
Data: Structural T1 data (NC: 221, IBS: 107) 
Workflow Protocol 
Results: Mean-curvature between-group differences in L_cuneus and R_angular_gyrus (left and right views)
Neuroimaging Applications: Stat Mapping 
of Cortical GM Thickness (Group Effects) 
Results: Left Anterior Insula 
Data: Structural T1 data 
Workflow Protocol 
Cortical Models 
[Color scale: P-value, from 1.0 to 0.0]
Pipeline User Community
Population Imaging
Big Data and the Health Sector in 
Population Imaging 
• According to Bonnie Feldman, “the potential of Big Data in medicine lies in the possibility of combining traditional data with other new forms of data, at both the individual and the population level” 
• The potential of Big Data suggests that savings can be achieved in the health sector in several ways: 
– Transforming data into information. 
– Supporting people's self-care. 
– Increasing knowledge. 
– Raising awareness of health status. 
• Big Data is an open-access methodology for integrating different data types in population imaging, image quantification and feature extraction.
Types of Studies 
• Individual 
• Longitudinal (time points 0, 1, 2, …, M) 
• Cross-sectional
Population Studies 
• Population Studies 
– If no groups are formed within the population, the mean of the parameter or parameters is computed. 
– If groups are formed (control and pathological), a hypothesis test must be performed. 
• Population Modeling 
– Model the volumetric degeneration of gray matter and white matter 
– Establish degeneration parameters 
– Compare the state of an individual against that model.
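The two steps on this slide, a test between groups and a comparison of one individual against the population, can be sketched as follows. The slide does not say which test is used; Welch's t statistic is chosen here only as a common option for two groups with unequal variances, and the z-score against controls stands in for the population model. All volumes are hypothetical values invented for illustration.

```python
import math
import statistics

def welch_t(group_a, group_b):
    """Welch's t statistic for two independent groups (unequal variances)."""
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    standard_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean_a - mean_b) / standard_error

def z_score(value, reference):
    """How many reference standard deviations `value` lies from the mean."""
    return (value - statistics.mean(reference)) / statistics.stdev(reference)

# Hypothetical gray matter volumes in cm^3, not real measurements.
controls = [640.0, 655.0, 648.0, 662.0, 651.0, 645.0]
patients = [601.0, 615.0, 598.0, 620.0, 607.0, 611.0]

t_stat = welch_t(controls, patients)
individual_z = z_score(605.0, controls)

if __name__ == "__main__":
    print(f"Welch t statistic (controls vs patients): {t_stat:.2f}")
    print(f"Individual z-score against controls: {individual_z:.2f}")
```

Turning the t statistic into a p-value needs the t distribution with Welch-Satterthwaite degrees of freedom, which is omitted here to keep the sketch within the standard library.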
Application in 
Alzheimer's
Application to Real Cases 
Results for global parameters
Application to Real Cases 
Results for thickness and volume by structure, 
together with reference values
Application to Real Cases 
• Representation of the volume difference compared 
with the population
Why can't we combine 
BEAUTY AND SCIENCE?
BIMCV Objectives 
• Develop and implement strategies to prevent or effectively treat diseases through an imaging research infrastructure associated with large population imaging studies. 
– The concept of “Population Imaging”. 
• Provide data, tools and processing resources for carrying out advanced imaging studies.
BIMCV
Valencian Node 
Euro-BioImaging 
European Research Infrastructure for Biomedical and Biological Imaging Technologies. 
A project on the ESFRI roadmap for research infrastructures 
www.eurobioimaging.eu
EIBIR key facts and daily work 
In the service of research, 
EIBIR offers to its Network Members: 
- Multidisciplinary networking 
- Project Management 
- Research communication 
- Research Training 
- Meeting organisation 
EIBIR Office 
• Established in 2006 
• Staff: 4.5, incl. 3 Project Managers, 1 assistant 
• Provision of services to Network Members + EIBIR bodies 
• Monitoring European Affairs + research funding opportunities 
• Project management and coordination 
• Information activities and media work 
• Promotion of Network Membership 
• Website and data base updates 
• Congress activities 
• Scientific Advisory Board
Timeline & Funding 
2010 - 2013 
Preparatory Phase 
• Framework 
• Definition of the eligibility criteria for nodes 
• Open call for nodes 
Funded by the EC 
2013 - 2017 
Construction Phase 
• Evaluation & selection of nodes 
• Construction of the nodes 
Funded by the Member States (MINECO?) 
2017 - … 
Operational Phase 
• Access and training 
• Technology and evaluation to improve the service 
Funded by the Member States & EC
[Figure: imaging infrastructure with open user access, with European life scientists as users. Elements shown: Nodes Hub; Flagship Nodes; Multimodal Technology Nodes; user training; staff training; web-access portal; data storage and analysis infrastructure; user returns with results for publication.]
1st Open Call 
Euro-BioImaging Nodes – Expression of Interest 
The 1st Open Call: 1 February – 30 April 2013 
• Multi-Modal Molecular Imaging 
• Phase contrast Imaging 
• High-field MRI 
• MR-PET 
• Population Imaging 
• Data Infrastructure: Challenges Framework 
• The biological imaging community will call for EoIs in 6 technologies
Results of the 1st Call 
Biological Imaging 
Biomedical Imaging 
9 SPANISH NODES 
– 18 Institutions –
Evaluation summary and Final ranking 
• The node develops and provides access to a large database of 
imaging data and the associated clinical data records. 
• Big Data repository from hospitals in the Valencia region (5 million inhabitants living over an area of 23,255 km2; an average of 5.3 million clinical cases per year, from 210 different imaging modalities). 
• The access to such data and tools will be an efficient way of 
advancing population imaging studies and research. 
• The node has ability to incorporate data from other facilities
Services offered by the node 
• BIMCV facility provides a multi-level and multi-ology storage 
service (Vendor Neutral Archive). 
• CEIB-CS node integrates access to high-performance 
computational services from local and European 
infrastructures (Principe Felipe Research Centre & UPV-I3M 
Infrastructure). 
• Open access methodology to integrate different data types for 
population imaging, quantitative resources and feature 
extraction. 
• Comprehensive user training
Single Technology Flagship Node – Population Imaging: Valencia 
Evaluation summary and Final ranking: 
• Requires minor improvements (training plan, now corrected). 
• The node develops and provides access to a large database of imaging data and the associated clinical data records. 
• Big Data repository from hospitals in the Valencia region (5 million inhabitants living over an area of 23,255 km2; an average of 5.3 million clinical cases per year, from 210 different imaging modalities). 
• The access to such data and tools will be an efficient way of advancing population imaging studies and research. 
• The node has the ability to incorporate data from other facilities. 
Other facilities 
MEDICAL IMAGING DATA BANK (BIMCV) 
BIG DATA DISEASE SIGNATURES 
Services offered by the node: 
• BIMCV facility provides a multi-level and multi-ology 
storage service (Vendor Neutral Archive). 
• CEIB-AVS node integrates access to high-performance 
computational services from local and European 
infrastructures (Principe Felipe Research Centre & UPV-I3M 
Infrastructure). 
• Open access methodology to integrate different data 
types for population imaging, quantitative resources and 
feature extraction. 
• Comprehensive user training.
Valencian Node, BIMCV 
Centre of Excellence in Biomedical Imaging of the Conselleria de 
Sanitat 
CEIB clinical site CEIB computing site
Services
Neuroimaging: the landscapes of the mind
Human Neuroimaging as a 
“Big Data” Science 
The mind landscapes 
http://prezi.com/sseievn7ujcf/?utm_campaign=share&utm_medium=copy
Study of Structure 
Morphometry
Study of Structure 
Tractography
Study of Function
Study of Function
RESTING STATE
Connectomics
Connectomics
Relevant facts
10K Structural Modeling in 
Neuroimaging of the Valencia Region 
• Two grants from the Subdirección General de Sistemas para la 
Salud of the CS, for computer engineers or 
telecommunications engineers (DOGV 9-07-2014). 
• The main structures of the brain will be measured. 
• In collaboration with LABMAN (http://www.labman.org) 
• In collaboration with Brain Dynamics 
• The University of Southern California (Jack Van Horn) 
• In collaboration with IBIME
Demo: 
Augmented virtual reality prototype by: 
Gonzalo Rojas Costas

BIMCV, Banco de Imagen Medica de la Comunidad Valenciana. María de la Iglesia

  • 16.
  • 17. Strategic Vision of Big Data in EU
  • 18. How Is the Europe Union Responding? In Big Data
  • 19. Panel: Personalized Medicine in the Era of Big Data EHTEL Symposium Tapani Piha • Head of Unit for eHealth and Technology Assessment European Commission DG Health and Consumers Health Systems and Products
  • 20. How does Big Data link to the Personalized Medicine? •Big Data refers to a collection of data sets so large and complex, it’s impossible to process them with the usual databases and tools •The data is gathered (most of the time) by people just living their lives (e.g. using mobile phones, the internet, driving cars, paying with banking cards) •Big data is used in the private sector (e.g. Google), and in the public sector (e.g. NSA)
  • 21. Big Data use in public health & health care? •Research: "In the last five years, more scientific data has been generated than in the entire history of mankind”1 •Health care: more evidence about personalized treatment, better selection of right provider, better equipped health care providers (e.g. IBM's Watson) •Public health: better personalized life-style info for citizens, earlier detection of epidemics, more and quicker access to epidemiological information 12012 Winston Hide, The Promise of Big Data, Harvard Public Health
  • 22. Staff Working Document on –omics •the potential for, and issues with, the use of '- omics' technologies in personalised medicine, and the related EU research funding, •recent developments in EU legislation for placing medicinal products and devices on the market. •factors affecting the uptake of personalised medicine in health care systems.
  • 23. Commission action on Big Data •BIG-project: multi-sectorial initiative started in 2011 to promote adoption of earlier waves of big data technology and contribute to EU competitiveness; •Green paper on mHealth: to assess market and further clarify what is needed in the legal framework concerning mHealth •Study in health program: to assess the usages and adoption of big data programs for (public) health systems within the EU.
  • 24. Strategic Vision of Big Data in US
  • 25.
  • 26. How Is U.S. Responding? National Institute of Standards an Technology (NIST) NIST is an agency of the U.S. Department of Commerce. To search federal science and technology web sites, including online databases see: science.org NIST program questions: Public Inquiries Unit: (301) 975-NIST (6478), Federal Relay Service (800) 877-8339 (TTY). NIST, 100 Bureau Drive, Stop 1070, Gaithersburg, MD 20899-1070 Technical website questions: DO-webmaster@nist.gov
  • 27. NIST Big Data Public Working Group Big Data PWG Overview Presentation September 30, 2013 Wo Chang, NIST Robert Marcus, ET-Strategies Chaitanya Baru, UC San Diego
  • 28. Agenda • Why Big Data? Why NIST? • NBD-PWG Charter • Overall Workplan • Subgroups Charter and Deliverables – Use Case and Requirements SG – Definitions and Taxonomies SG – Reference Architecture SG – Security and Privacy SG – Technology Roadmap SG • Next Steps 9/30/13 NBD-PWG Overview 28
  • 29. Why Big Data? Why NIST? • Why Big Data? There is a broad agreement among commercial, academic, and government leaders about the remarkable potential of “Big Data” to spark innovation, fuel commerce, and drive progress. • Why NIST? (a) Recommendation from January 15 -- 17, 2013 Cloud/Big Data Forum and (b) A lack of consensus on some important, fundamental questions is confusing potential users and holding back progress. Questions such as: – What are the attributes that define Big Data solutions? – How is Big Data different from the traditional data environments and related applications that we have encountered thus far? – What are the essential characteristics of Big Data environments? – How do these environments integrate with currently deployed architectures? – What are the central scientific, technological, and standardization challenges that need to be addressed to accelerate the deployment of robust Big Data solutions? NBD-PWG is being launched to address these questions and is charged to develop consensus definitions, taxonomies, secure reference architecture, and technology roadmap for Big Data that can be embraced by all sectors. 9/30/13 NBD-PWG Overview 29
  • 30. NBD-PWG Deliverables Working Drafts version 1.0 for 1. Big Data Definitions 2. Big Data Taxonomies 3. Big Data Requirements 4. Big Data Security and Privacy Requirements 5. Big Data Architectures White Paper Survey 6. Big Data Reference Architectures 7. Big Data Security and Privacy Reference Architectures 8. Big Data Technology Roadmap
  • 31. NBD-PWG Workplan
  • 32. Big Data Ecosystem in One Sentence • Use Clouds running Data Analytics Collaboratively processing Big Data to solve problems in X-Informatics (or e-X) • X = Astronomy, Biology, Biomedicine, Business, Chemistry, Climate, Crisis, Earth Science, Energy, Environment, Finance, Health, Intelligence, Lifestyle, Marketing, Medicine, Pathology, Policy, Radar, Security, Sensor, Social, Sustainability, Wealth and Wellness, with more fields (physics) defined implicitly • Spans Industry and Science (research) • Education: Data Science; see recent New York Times articles • http://datascience101.wordpress.com/2013/04/13/new-york-times-data-science-articles/
  • 34. Big Data Definition • There is more consensus on the definition of Data Science than on that of Big Data • Big Data refers to digital data volume, velocity and/or variety that: – enables novel approaches to frontier questions previously inaccessible or impractical using current or conventional methods; and/or – exceeds the storage capacity or analysis capability of current or conventional methods and systems; and – is differentiated by storing and analyzing population data rather than samples; – needs management requiring scalability across coupled horizontal resources
  • 35. Vendor-neutral and Technology-agnostic Proposals: Data Processing Flow (M0039), Data Transformation Flow (M0017), IT Stack (M0047)
  • 39. Electronic Medical Record (EMR) Data I • Application: Large national initiatives around health data are emerging. They include developing a digital learning health care system to support increasingly evidence-based clinical decisions with timely, accurate, and up-to-date patient-centered clinical information; using electronic observational clinical data to efficiently and rapidly translate scientific discoveries into effective clinical treatments; and electronically sharing integrated health data to improve healthcare process efficiency and outcomes. These key initiatives all rely on high-quality, large-scale, standardized, and aggregate health data. One needs advanced methods for normalizing patient, provider, facility, and clinical concept identification within and among separate health care organizations, to enhance models for defining and extracting clinical phenotypes from non-standard discrete and free-text clinical data using feature selection, information retrieval, and machine learning decision models. One must leverage clinical phenotype data to support cohort selection, clinical outcomes research, and clinical decision support. [PP, Fusion, S/Q, Index; Streaming; Parallelism over EMR (a set per person), viewers]
  • 40. Electronic Medical Record (EMR) Data II • Current Approach: Clinical data from more than 1,100 discrete logical, operational healthcare sources in the Indiana Network for Patient Care (INPC), the nation's largest and longest-running health information exchange. This describes more than 12 million patients and more than 4 billion discrete clinical observations: > 20 TB of raw data. Between 500,000 and 1.5 million new real-time clinical transactions are added per day. • Futures: Teradata, PostgreSQL, and MongoDB supporting information retrieval methods to identify relevant clinical features (tf-idf, latent semantic analysis, mutual information). Natural Language Processing techniques to extract relevant clinical features. Validated features will be used to parameterize clinical phenotype decision models based on maximum likelihood estimators and Bayesian networks. Decision models will be used to identify a variety of clinical phenotypes such as diabetes, congestive heart failure, and pancreatic cancer.
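The slide above names tf-idf as one of the information retrieval methods for identifying relevant clinical features. A minimal sketch of that weighting, assuming whitespace-tokenized notes and the plain `tf * log(N / df)` variant; the function name `tf_idf` and the toy notes are hypothetical and are not INPC data or the INPC implementation:

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: tf-idf weight} dict per tokenized document."""
    n_docs = len(documents)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            # Term frequency (share of the note) times inverse document frequency.
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical toy notes, invented for illustration only.
notes = [
    "patient reports polyuria and elevated glucose".split(),
    "patient reports chest pain".split(),
    "patient reports elevated glucose and fatigue".split(),
]
scores = tf_idf(notes)
```

Terms that occur in every note (here "patient") receive a weight of zero, while a term confined to one note (here "polyuria") is weighted highest, which is the property that makes tf-idf a plausible first filter for candidate phenotype features.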
  • 41. Pathology Imaging/ Digital Pathology I • Application: Digital pathology imaging is an emerging field where examination of high resolution images of tissue specimens enables novel and more effective ways for disease diagnosis. Pathology image analysis segments massive (millions per image) spatial objects such as nuclei and blood vessels, represented with their boundaries, along with many extracted image features from these objects. The derived information is used for many complex queries and analytics to support biomedical research and clinical diagnosis. [MR, MRIter, PP, Classification; Streaming; Parallelism over Images]
  • 42. Pathology Imaging/ Digital Pathology II • Current Approach: 1 GB raw image data + 1.5 GB analytical results per 2D image. MPI for image analysis; MapReduce + Hive with spatial extension on supercomputers and clouds. GPUs used effectively. Figure 3 of section 2.12 shows the architecture of Hadoop-GIS, a spatial data warehousing system over MapReduce to support spatial analytics for analytical pathology imaging. • Futures: 3D pathology imaging has recently been made possible through 3D laser technologies or by serially sectioning hundreds of tissue sections onto slides and scanning them into digital images. Segmenting 3D microanatomic objects from registered serial images could produce tens of millions of 3D objects from a single image. This provides a deep “map” of human tissues for next generation diagnosis. 1 TB raw image data + 1 TB analytical results per 3D image, and 1 PB of data per moderate-sized hospital per year. [Figure: architecture of Hadoop-GIS, a spatial data warehousing system over MapReduce to support spatial analytics for analytical pathology imaging]
  • 43. Computational Bioimaging • Application: Data delivered from bioimaging is increasingly automated, higher resolution, and multi-modal. This has created a data analysis bottleneck that, if resolved, can advance discovery in the biosciences through Big Data techniques. • Current Approach: The current piecemeal analysis approach does not scale to situations where a single scan on emerging machines is 32 TB and medical diagnostic imaging is annually around 70 PB, even excluding cardiology. One needs a web-based one-stop shop for high performance, high throughput image processing for producers and consumers of models built on bio-imaging data. • Futures: The goal is to solve that bottleneck with extreme scale computing and community-focused science gateways that support the application of massive data analysis to massive imaging data sets. Workflow components include data acquisition, storage, enhancement, noise minimization, segmentation of regions of interest, crowd-based selection and extraction of features, object classification, organization, and search. Use ImageJ, OMERO, VolRover, and advanced segmentation and feature detection software. [MR, MRIter?, PP, Classification; Streaming; Parallelism over Images]
  • 44. 22: Statistical Relational Artificial Intelligence for Health Care • Application: The goal of the project is to analyze large, multi-modal medical data including different data types such as imaging, EHR, genetic, and natural language. This approach employs relational probabilistic models, which can handle rich relational data and model uncertainty using probability theory. The software learns models from multiple data types and can possibly integrate the information and reason about complex queries. Users can provide a set of descriptions, for instance MRI images and demographic data about a particular subject. They can then query for the onset of a particular disease (say Alzheimer’s) and the system will provide a probability distribution over the possible occurrence of this disease. • Current Approach: A single server can handle a test cohort of a few hundred patients with associated data of 100s of GB. • Futures: A cohort of millions of patients can involve petabyte datasets. Issues include the availability of too much data (such as images, genetic sequences, etc.), which complicates analysis. A major challenge lies in aligning the data and merging it from multiple sources into a form that is useful for a combined analysis. Another issue is that sometimes a large amount of data is available about a single subject but the number of subjects is not very high (i.e., data imbalance). This can result in learning algorithms picking up random correlations between the multiple data types as important features in the analysis. [MRIter, EGO; Streaming; Parallelism over People and their EMR]
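The query described above returns a probability distribution over the occurrence of a disease given evidence about a subject. The sketch below shows only that final step with plain Bayes' rule over two hypotheses; it is far simpler than a relational probabilistic model and is not the project's software. The function name `posterior` and all probabilities are hypothetical values chosen for illustration:

```python
def posterior(prior, likelihood):
    """Bayes' rule over discrete hypotheses.

    prior:      {hypothesis: P(hypothesis)}
    likelihood: {hypothesis: P(evidence | hypothesis)}
    Returns     {hypothesis: P(hypothesis | evidence)}
    """
    joint = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(joint.values())  # P(evidence)
    return {h: p / total for h, p in joint.items()}

# Hypothetical numbers: 10% prevalence, and an imaging finding that is
# present in 80% of affected and 20% of unaffected subjects.
prior = {"disease": 0.10, "no_disease": 0.90}
likelihood = {"disease": 0.80, "no_disease": 0.20}
result = posterior(prior, likelihood)
```

With these assumed inputs the finding raises the probability of disease from 0.10 to about 0.31; a relational model would compute the same kind of distribution but from learned dependencies across imaging, EHR, and genetic data.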
  • 45. The P4 paradigm of medicine: PREDICTIVE, PREVENTIVE, PERSONALIZED, PARTICIPATORY
  • 46. The V4 paradigm in Big Data medicine: V-OLUME, V-ARIETY, V-ELOCITY, V-ALUE
  • 47. Big Data in Neuroimaging
  • 49. Human neuroimaging is now, officially, a “big data” science • Among the examples of “big data” featured at the meeting was, to no surprise, human neuroimaging • The Brain Research through Advancing Innovative Neurotechnologies (BRAIN) Initiative • Initiatives surrounding large-scale brain mapping are also underway in Europe: http://www.humanbrainproject.eu • Organization for Human Brain Mapping (OHBM; http://www.humanbrainmapping.org)
  • 50. How Big is “Big”? • While size is a relative term when it comes to data, medical imaging applied to the brain comes in a variety of forms, each generating differing types and amounts of information about neural structure and/or function. • Articles published in NeuroImage indicate that since 1995 the amount of data collected has doubled approximately every 26 months. At this rate, by 2015 the amount of acquired neuroimaging data alone, discounting header information and before more files are generated during data processing and statistical analysis, may exceed an average of 20 GB per published research study.
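The doubling claim above can be turned into a growth factor with one line of arithmetic. The sketch below assumes clean exponential growth with a fixed 26-month doubling time; the function name `growth_factor` is hypothetical, and the result is an extrapolation, not a measured value:

```python
def growth_factor(months, doubling_months=26.0):
    """Multiplicative growth after `months`, given a fixed doubling time."""
    return 2.0 ** (months / doubling_months)

# 1995 to 2015 spans 20 years, i.e. 240 months.
factor_1995_to_2015 = growth_factor(20 * 12)
```

Under that assumption, data volume per study grows roughly 600-fold between 1995 and 2015, since 240 / 26 is a little over nine doublings.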
  • 51. Growth of Neuroimaging Study Size [Figure: study size in megabytes (0 to 20,000) against year (1990 to 2020); series: Expected, Observed, Predicted. Source: Van Horn and Toga (in press), Brain Imaging and Behavior]
  • 52. Kryder’s law: Exponential Growth of Data
    Units: MB = megabyte = 10^6, GB = gigabyte = 10^9, TB = terabyte = 10^12, PB = petabyte = 10^15
    Volume of data and compute power by period:
      Years | Neuroimaging (annually) | Genomics (BP/yr) | CPU transistor counts (Moore’s law)
      1985-1989 | 200 GB | 10 MB | 1x10^5
      1990-1994 | 1 TB | 100 MB | 1x10^6
      1995-1999 | 50 TB | 10 GB | 5x10^6
      2000-2004 | 250 TB | 1 TB | 1x10^7
      2005-2009 | 1 PB | 30 TB | 8x10^6
      2010-2014 | 5 PB | 1 PB | 1x10^9
      2015-2019 (estimated) | 10+ PB | 20+ PB | 1x10^11
    Single cryo brain volume (1600 cm^3) by voxel resolution:
      Voxel size | Voxel count | Gray scale, 8 bits | Gray scale, 16 bits | Color, 24 bits
      1 cm | 12x15x9 | 1620 | 3240 | 4860
      1 mm | 120x150x90 | 1.62 MB | 3.24 MB | 4.86 MB
      100 μm | 1200x1500x900 | 1.62 GB | 3.24 GB | 4.86 GB
      10 μm | 12000x15000x9000 | 1.62 TB | 3.24 TB | 4.86 TB
      1 μm | 120000x150000x90000 | 1.62 PB | 3.24 PB | 4.86 PB
    Source: Kryder's law, Chip Walter, Scientific American, 2005 (nature.com)
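The brain-volume figures on this slide follow from multiplying the voxel count of a 12 x 15 x 9 cm bounding box by the bytes stored per voxel. A minimal sketch of that arithmetic, assuming uncompressed storage and voxel sizes that divide one centimetre evenly; the names `BRAIN_CM`, `voxel_count`, and `volume_bytes` are hypothetical:

```python
BRAIN_CM = (12, 15, 9)  # bounding box from the slide, in centimetres

def voxel_count(voxel_size_um):
    """Number of cubic voxels of the given edge length (micrometres)."""
    voxels_per_cm = 10_000 // voxel_size_um  # 1 cm = 10,000 micrometres
    n = 1
    for side_cm in BRAIN_CM:
        n *= side_cm * voxels_per_cm
    return n

def volume_bytes(voxel_size_um, bits_per_voxel):
    """Uncompressed size in bytes of one whole-brain volume."""
    return voxel_count(voxel_size_um) * bits_per_voxel // 8

size_1mm_8bit = volume_bytes(1_000, 8)  # 1 mm voxels, 8-bit gray scale
size_1um_24bit = volume_bytes(1, 24)    # 1 micrometre voxels, 24-bit color
```

Each tenfold improvement in linear resolution multiplies the voxel count, and therefore the storage, by 1,000, which is why the column units step from MB to GB to TB to PB.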
  • 53. Big Neuroimaging + Big Genetics = REALLY Big Data • Obtaining genome-wide sets of single nucleotide polymorphism (SNP) information is becoming routine, and the costs of full genomic sequencing are rapidly becoming affordable. • Next Generation Sequencing (NGS) methods are reaching major brain imaging studies such as the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (Weiner, Veitch et al. 2012), with its initially available sample of 832 subjects. • As the bond between neuroimaging and genomics grows tighter, with both areas growing at incredible rates, disk storage and unique data compression techniques become increasingly important.
  • 54. Multisite Consortia and Data Sharing • Examples of multisite neuroimaging efforts can be found in the ubiquitous application of neuroimaging in health but also in devastating illnesses such as: • Parkinson’s (Evangelou, Maraganore et al. 2009) • psychiatric disorders (Schumann, Loth et al. 2010) • the mapping of human brain connectivity (Toga, Clark et al. 2012) • databases of aging and aging-related diseases, large-scale autism research (NDAR; Hall, Huerta et al. 2012) and the Federal Interagency Traumatic Brain Injury Research (FITBIR; Bushnik and Gordon 2012)
  • 55. Multisite Consortia and Data Sharing • The various “grass roots” collections of resting-state fMRI data maintained as part of the “1000 Functional Connectomes” project http://fcon_1000.projects.nitrc.org/ (see Biswal, Mennes et al. 2010) • Task-based OpenfMRI http://www.openfmri.org (Poldrack, Barch et al. 2013) are other notable examples.
  • 57. The Role of Cyberinfrastructure • Individual desktop computers are no longer suitable for analyzing potentially petabytes' worth of brain and genomics data at a time. • The National Science Foundation (NSF) has made major investments in the computer architecture needed for physics, weather, and geological data, e.g. XSEDE, https://www.xsede.org/ , and Open Science Grid, https://www.opensciencegrid.org
  • 58. The Role of Cyberinfrastructure • The Neuroimaging Informatics Tools and Resources Clearinghouse (NITRC; http://www.nitrc.org ) and the International Neuroinformatics Coordinating Facility (INCF; http://incf.org ) have begun to deploy local clusters with Amazon EC2 server technology toward this goal, but a larger effort will be required, involving dedicated processing centers or distributed grids of linked compute centers.
  • 59. Conclusions • Neuroimaging research, by its very nature, is data intensive, multimodal, and collaborative, factors which have been instrumental in its success and growth. • Yet the infrastructure needed to support this advancing form of brain research, where data is king, is still maturing. • The next steps in developing resources that support “big data” brain imaging at the exabyte scale will require new tools and services for data discovery, integration, analysis, and visualization. • For the “big data” science of human brain imaging, now is the time to begin.
  • 60. Many 1,000s of Software Tools • Acquisition, processing, storage/DB, service, migration, mining, analysis, visualization, annotation, … “(data-driven) process understanding” • Biomedical Imaging – There are 100s of different types of image processing algorithms and filters – For each type of process there may be dozens of concrete software products (instance implementations) • (Example) Neuroimaging – NITRC lists > 500 openly shared software tools – For each openly shared tool there may be dozens of proprietary or less commonly used analogues
  • 61. Millions of Dispersed Hardware Devices • Cisco: “By the end of 2012, the number of mobile-connected devices will exceed the number of people on Earth” • There will be over 10 billion mobile-connected devices in 2016; i.e., there will be 1.3 mobile devices per capita – These include phones, tablets, laptops, handheld gaming consoles, e-readers, in-car entertainment systems, digital cameras, and “machine-to-machine modules” • DBs, Clients, Servers, Compute-Nodes, Web-Services, Interfaces, … • Solution … Dinov et al., BMC 2011
  • 62. [Figure: fMRI analysis workflow. Source: Van Horn et al., Nature Neuro, 2004] • Inputs: raw fMRI time series; high-resolution anatomical image; standardized brain atlas template; experimental design matrix; study metadata (scanner protocols, subject demographics, stimulus timing, etc.) • Processing steps: slice timing adjustment; image spatial alignment; functional-structural co-registration; spatial normalization to atlas space; image smoothing (Gaussian spatial filtering); statistical modeling (e.g. GLM) • Outputs: statistical results maps; graphical overlays; table of statistically significant voxels in atlas space coordinates
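The workflow on this slide ends with statistical modeling, with the GLM given as an example. A minimal sketch of that one step for a single voxel, assuming one stimulus regressor plus an intercept and ordinary least squares; the function name `fit_glm` and the synthetic, noiseless time series are hypothetical, and real packages fit many regressors and model noise and the haemodynamic response:

```python
def fit_glm(signal, regressor):
    """Least-squares fit of signal = intercept + beta * regressor."""
    n = len(signal)
    mean_x = sum(regressor) / n
    mean_y = sum(signal) / n
    sxx = sum((x - mean_x) ** 2 for x in regressor)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(regressor, signal))
    beta = sxy / sxx                      # effect size of the stimulus
    intercept = mean_y - beta * mean_x    # baseline signal level
    return intercept, beta

# Hypothetical block design: four scans off, four scans on, repeated three times.
design = [0, 0, 0, 0, 1, 1, 1, 1] * 3
# Synthetic voxel: baseline 100 with a 2.5-unit response during stimulation.
voxel = [100.0 + 2.5 * x for x in design]
intercept, beta = fit_glm(voxel, design)
```

Repeating this fit for every voxel and testing each `beta` against zero is what produces the statistical results maps listed as outputs of the workflow.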
  • 63. Pipeline Version 5.9.1 Features Graphical Programming Environment 8/29/2014 63
  • 64. Perfect Neuroimaging-Computation Storm? • Single Subject Studies (N=1) – Genetics: • Depending on Coverage(X) • Whole Genome Seq Data > 320GB (>80X) • Require 2+ TB RAM, and 100+ hrs CPU – Imaging: • Depending on protocols • 40-512 gradient directions Diffusion imaging data • Raw (multimodal) Neuroimaging Data > 10 GB • Derived Data > 100 GB • Require 100GB RAM and 70+ hrs CPU • Large Subject Studies – Cohort studies (N>10, Typically N~100’s) – Multi-Institutional Population-wide Studies (N>1,000) – Longitudinal (neuroimaging) studies …
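The per-subject figures above scale linearly with cohort size. A back-of-the-envelope sketch, assuming the slide's lower bounds (10 GB raw imaging, 100 GB derived data, 320 GB whole-genome sequence) apply uniformly to every subject; the function name `cohort_storage_gb` and its defaults are hypothetical:

```python
def cohort_storage_gb(n_subjects, raw_gb=10, derived_gb=100, genome_gb=0):
    """Lower-bound storage in GB for a cohort, from per-subject figures."""
    return n_subjects * (raw_gb + derived_gb + genome_gb)

imaging_only = cohort_storage_gb(1_000)                  # population-wide study
with_genomes = cohort_storage_gb(1_000, genome_gb=320)   # adding whole genomes
```

For a 1,000-subject study these assumptions give at least 110 TB for imaging alone and 430 TB once whole-genome data is included, before any longitudinal follow-up multiplies the total.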
  • 65. From Biomedical Challenges to Modeling, Computation, Tools and Curricular Training • Quantitative Volumetric and Surface based Stats Analyses – Interactome: Challenge↔Models↔Data Analysis↔Computation↔Education – Statistics Online Computational Resource Che, et al., JSS (2009) No effect Marginal Significant
  • 66. Grid & Cloud Computing • UCLA Grids (Cerebro, Medulla) – 1,200 cores – 1.4 TB RAM – 12,000 jobs/day – 700 users • Amazon Cloud – 4,300 cores – 9.6 TB RAM – (new) – EC2 (Elastic Compute Cloud) – S3 (Simple Storage Service) • UC Grid • Globus GridFTP • INI Cluster @ USC – 3,328 cores, 128 GB RAM per 16 cores, 26 TB aggregate memory space. Connectivity is 5 Gbit per 16 cores, roughly 4 Tbit aggregate on compute and another 4.3 Tbit on the storage. 2.43 PB of online storage, currently accelerated by over 50 TB of SSD.
  • 67. Neuroimaging Applications: 56-ROI Global Shape Analysis (NC vs. IBS/Pain) Group Effects Data Workflow Protocol Results Structural T1 data NC IBS 221 107 Mean-Curvature between-group differences in: L_cuneus R_angular_gyrus Left View Right View
  • 68. Neuroimaging Applications: Stat Mapping of Cortical GM Thickness (Group Effects) Results Left Anterior Insula Data Workflow Protocol Structural T1 data Cortical Models 1.0 P-value 0.0
  • 71. Big Data and the Health Sector in Population Imaging • According to Bonnie Feldman, “the potential of Big Data in medicine lies in the possibility of combining traditional data with other new forms of data, at both the individual and the population level” • The potential of Big Data suggests that savings can be achieved in the healthcare sector in several ways: – Transforming data into information. – Supporting people's self-care. – Increasing knowledge. – Raising awareness of health status. • Big Data is an open-access methodology for integrating different types of data in population imaging, image quantification, and feature extraction.
  • 72. Types of Studies • Individual • Longitudinal (time points 0, 1, 2, …, M) • Cross-sectional
  • 73. Population Studies • Population studies – If no groups are formed within the population, the mean of the parameter or parameters is computed. – If groups are formed (control and pathological), a hypothesis test must be performed. • Population modeling – Model the volumetric degeneration of gray matter and white matter – Establish degeneration parameters – Compare the status of an individual against that model.
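The two operations on this slide, testing a control group against a pathological group and comparing one individual with the population, can be sketched with the standard library alone. The example assumes a Welch t-statistic for the group contrast and a z-score for the individual; the slide does not say which test BIMCV uses, and the function names and volume values below are hypothetical:

```python
import math
from statistics import mean, stdev

def z_score(value, reference):
    """How many reference standard deviations `value` lies from the reference mean."""
    return (value - mean(reference)) / stdev(reference)

def welch_t(group_a, group_b):
    """Welch's t-statistic for two groups with possibly unequal variances."""
    var_a = stdev(group_a) ** 2 / len(group_a)
    var_b = stdev(group_b) ** 2 / len(group_b)
    return (mean(group_a) - mean(group_b)) / math.sqrt(var_a + var_b)

# Hypothetical structure volumes in arbitrary units, not BIMCV data.
controls = [10.0, 12.0, 14.0]
patients = [7.0, 9.0, 11.0]

t_stat = welch_t(controls, patients)   # group contrast
z_individual = z_score(16.0, controls) # one subject against the reference group
```

A full hypothesis test would convert `t_stat` to a p-value using the Welch-Satterthwaite degrees of freedom, and a population model of degeneration would replace the fixed reference mean with an age-dependent one.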
  • 75. Application to Real Cases: results for global parameters
  • 76. Application to Real Cases: thickness and volume results per structure, together with the reference values
  • 77. Application to Real Cases • Representation of the difference in volume compared with the population
  • 78. Why can't we combine BEAUTY AND SCIENCE?
  • 79. BIMCV Objectives • Develop and implement strategies to prevent or effectively treat diseases through an imaging research infrastructure linked to large population imaging studies. – The “Population Imaging” concept. • Provide data, tools, and processing resources for carrying out advanced imaging studies.
  • 80. BIMCV
  • 81. Valencian Node, Euro-BioImaging: European infrastructure for research in biomedical imaging and biological imaging technologies. A project on the ESFRI roadmap for research infrastructures. www.eurobioimaging.eu
  • 82. EIBIR key facts and daily work In the service of research, EIBIR offers to its Network Members: - Multidisciplinary networking - Project Management - Research communication - Research Training - Meeting organisation EIBIR Office • Established in 2006 • Staff: 4.5, incl. 3 Project Managers, 1 assistant • Provision of services to Network Members + EIBIR bodies • Monitoring European Affairs + research funding opportunities • Project management and coordination • Information activities and media work • Promotion of Network Membership • Website and data base updates • Congress activities • Scientific Advisory Board
  • 83. Timeline & Funding • 2010 - 2013, Preparatory Phase: framework; definition of the eligibility criteria for nodes; open call for nodes. Funded by the EC. • 2013 - 2017, Construction Phase: evaluation and selection of nodes; construction of the nodes. Funded by the Member States (MINECO?). • 2017 - …, Operational Phase: access and training; technology and evaluation to improve the service. Funded by the Member States & EC.
  • 84. MULTIMODAL TECHNOLOGY NODE Imaging Infrastructure with open user access European life scientists as users FLAGSHIP NODE FLAGSHIP NODE FLAGSHIP NODE FLAGSHIP NODE USER TRAINING STAFF TRAINING Web-access portal Data storage and analysis infrastructure User returns with results for publication NODES HUB MULTIMODAL TECHNOLOGY NODE
  • 86. 1st Open Call Euro-BioImaging Nodes – Expression of Interest The 1st Open Call: 1 February – 30 April 2013 • Multi-Modal Molecular Imaging • Phase contrast Imaging • High-field MRI • MR-PET • Population Imaging • Data Infrastructure: Challenges Framework • The biological imaging community will call for EoIs in 6 technologies
  • 87. Results of the 1st Call: Biological Imaging, Biomedical Imaging. 9 SPANISH NODES (18 institutions)
  • 88. Evaluation summary and Final ranking • The node develops and provides access to a large database of imaging data and the associated clinical data records. • Big Data repository from hospitals in the Valencia region (5 million inhabitants living over an area of 23,255 km², an average of 5.3 million clinical cases per year, from 210 different imaging modalities). • Access to such data and tools will be an efficient way of advancing population imaging studies and research. • The node has the ability to incorporate data from other facilities.
  • 89. Services offered by the node • BIMCV facility provides a multi-level and multi-ology storage service (Vendor Neutral Archive). • CEIB-CS node integrates access to high-performance computational services from local and European infrastructures (Principe Felipe Research Centre & UPV-I3M Infrastructure). • Open access methodology to integrate different data types for population imaging, quantitative resources and feature extraction. • Comprehensive user training
  • 90. Single Technology Flagship Node – Population Imaging: Valencia Evaluation summary and Final ranking: • Requires minor improvements (training plan, now corrected). • The node develops and provides access to a large database of imaging data and the associated clinical data records. • Big Data repository from hospitals in the Valencia region (5 million inhabitants living over an area of 23,255 km², an average of 5.3 million clinical cases per year, from 210 different imaging modalities). • Access to such data and tools will be an efficient way of advancing population imaging studies and research. • The node has the ability to incorporate data from other facilities. [Diagram: Other facilities; MEDICAL IMAGING DATA BANK (BIMCV); BIG DATA; DISEASE SIGNATURES] Services offered by the node: • BIMCV facility provides a multi-level and multi-ology storage service (Vendor Neutral Archive). • CEIB-AVS node integrates access to high-performance computational services from local and European infrastructures (Principe Felipe Research Centre & UPV-I3M Infrastructure). • Open access methodology to integrate different data types for population imaging, quantitative resources and feature extraction. • Comprehensive user training.
  • 91. Valencian Node, BIMCV: Centre of Excellence in Biomedical Imaging (CEIB) of the Conselleria de Sanitat. CEIB clinical site; CEIB computing site
  • 94. Human Neuroimaging as a “Big Data” Science The mind landscapes http://prezi.com/sseievn7ujcf/?utm_campaign=share&utm_medium=copy
  • 95. Study of Structure: Morphometry
  • 96. Study of Structure: Tractography
  • 98. Study of Function
  • 99. Study of Function
  • 106. 10 K Structural Modeling in Neuroimage of Valencia Region • Two grants from the Subdirección General de Sistemas para la Salud of the CS, for computer engineers or telecommunications engineers (DOGV 9-07-2014). • The main structures of the brain will be measured. • In collaboration with LABMAN (http://www.labman.org) • In collaboration with Brain Dynamics • The University of Southern California (Jack Van Horn) • In collaboration with IBIME
  • 108. Demo: augmented virtual reality prototype by Gonzalo Rojas Costas