Towards an Open Analytics Environment Ian Foster Computation Institute Argonne National Lab & University of Chicago
The Computation Institute www.ci.uchicago.edu Faculty, fellows, staff, students, computers, projects.
The Good Old Days: Astronomy ~1600 30 years ? years 10 years 6 years 2 years
Astronomy, from 1600 to 2000. Automation: 10^-1 → 10^8 Hz data capture. Community: 10^0 → 10^4 astronomers (10^6 amateur). Computation: 10^-1 → 10^15 Hz peak. Data: 10^6 → 10^15 B aggregate. Literature: 10^1 → 10^5 pages/year.
Biomedical Research ~1600
Biomedical Research ~2000 ... atcgaattccaggcgtcacattctcaattcca... MPMILGYWDIRGLAHAIRLLLEYTDSSYEEKKYT... Protein-Protein Interactions metabolism pathways receptor-ligand 4º structure Polymorphism and Variants genetic variants individual patients epidemiology Physiology Cellular biology Biochemistry Neurobiology Endocrinology etc. >10^6 ESTs Expression patterns Large-scale screens Genetics and Maps Linkage Cytogenetic Clone-based From John Wooley >10^6 >10^9 >10^6 >10^5 >10^9 DNA sequences alignments Proteins sequence 2º structure 3º structure
Growth of Sequences and Annotations since 1982 Folker Meyer, Genome Sequencing vs. Moore's Law: Cyber Challenges for the Next Decade, CTWatch, August 2006.
The Analyst in Denial "I just need a bigger disk (and workstation)"
An Open Analytics Environment Data in Results out Programs & rules in
o·pen [oh-puhn] adjective
What Goes In (1)
What Goes In (2) Rules Workflows Dryad MapReduce Parallel programs SQL BPEL Swift SCFL R MatLab Octave
How it Cooks
What Comes Out Data Data Virtual Data Schema
Analysis as (Collaborative) Process Transform Annotate  Search Add to Tag Visualize Discover Extend Group Share
Centralized or Distributed? Both
Towards an Open Analysis Environment: (1) Applications
Towards an Open Analysis Environment: (2) Hardware SiCortex 6K cores, 6 Top/s IBM BG/P 160K cores, 500 Top/s PADS 10-40 Gbit/s
PADS: Petascale Active Data Store 500 TB  reliable  storage  (data & metadata) 180 TB,  180 GB/s  17 Top/s analysis Data ingest Dynamic  provisioning Parallel analysis Remote access Offload to remote  data centers P A D S Diverse users Diverse data sources 1000 TB tape  backup
Towards an Open Analysis Environment: (3) Methods
Tagging & Social Networking GLOSS: Generalized Labels Over Scientific data Sources
XDTM: XML Data Typing & Mapping
./group23:
drwxr-xr-x  4 yongzh users     2048 Nov 12 14:15 AA
drwxr-xr-x  4 yongzh users     2048 Nov 11 21:13 CH
drwxr-xr-x  4 yongzh users     2048 Nov 11 16:32 EC
./group23/AA:
drwxr-xr-x  5 yongzh users     2048 Nov  5 12:41 04nov06aa
drwxr-xr-x  4 yongzh users     2048 Dec  6 12:24 11nov06aa
./group23/AA/04nov06aa:
drwxr-xr-x  2 yongzh users     2048 Nov  5 12:52 ANATOMY
drwxr-xr-x  2 yongzh users    49152 Dec  5 11:40 FUNCTIONAL
./group23/AA/04nov06aa/ANATOMY:
-rw-r--r--  1 yongzh users      348 Nov  5 12:29 coplanar.hdr
-rw-r--r--  1 yongzh users 16777216 Nov  5 12:29 coplanar.img
./group23/AA/04nov06aa/FUNCTIONAL:
-rw-r--r--  1 yongzh users      348 Nov  5 12:32 bold1_0001.hdr
-rw-r--r--  1 yongzh users   409600 Nov  5 12:32 bold1_0001.img
-rw-r--r--  1 yongzh users      348 Nov  5 12:32 bold1_0002.hdr
-rw-r--r--  1 yongzh users   409600 Nov  5 12:32 bold1_0002.img
-rw-r--r--  1 yongzh users      496 Nov 15 20:44 bold1_0002.mat
-rw-r--r--  1 yongzh users      348 Nov  5 12:32 bold1_0003.hdr
-rw-r--r--  1 yongzh users   409600 Nov  5 12:32 bold1_0003.img
Logical Physical
fMRI Type Definitions
type Study { Group g[ ]; }
type Group { Subject s[ ]; }
type Subject { Volume anat; Run run[ ]; }
type Run { Volume v[ ]; }
type Volume { Image img; Header hdr; }
type Image {};
type Header {};
type Warp {};
type Air {};
type AirVec { Air a[ ]; }
type NormAnat { Volume anat; Warp aWarp; Volume nHires; }
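To illustrate how logical types like these relate to files on disk, here is a minimal Python sketch, not the XDTM or Swift implementation. The dataclasses mirror a subset of the type definitions above; the `path` fields and the `map_run` helper are assumptions introduced for illustration. `map_run` pairs each `.img` file with the `.hdr` file of the same stem, as in the FUNCTIONAL directory of the listing, and ignores anything else (such as `.mat` files).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Image:
    path: str  # assumption: an opaque file is represented by its path


@dataclass
class Header:
    path: str  # assumption: same representation as Image


@dataclass
class Volume:
    img: Image
    hdr: Header


@dataclass
class Run:
    v: List[Volume] = field(default_factory=list)


def map_run(filenames: List[str]) -> Run:
    """Build a Run from a flat list of file names (hypothetical mapper).

    A Volume is created for every stem that has both an .img and an .hdr
    file; volumes are ordered by stem name.
    """
    names = set(filenames)
    stems = sorted(
        n[:-4] for n in names
        if n.endswith(".img") and n[:-4] + ".hdr" in names
    )
    return Run(v=[Volume(Image(s + ".img"), Header(s + ".hdr")) for s in stems])


if __name__ == "__main__":
    listing = [
        "bold1_0001.hdr", "bold1_0001.img",
        "bold1_0002.hdr", "bold1_0002.img", "bold1_0002.mat",
        "bold1_0003.hdr", "bold1_0003.img",
    ]
    run = map_run(listing)
    for vol in run.v:
        print(vol.img.path, vol.hdr.path)
```

The point of the sketch is the separation of concerns: analysis code sees only `Run` and `Volume`, while the naming convention of the directory lives in one mapping function.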
High-Performance Data Analytics Functional MRI Ben Clifford, Mihael Hategan, Mike Wilde, Yong Zhao
SwiftScript for fMRI Data Analysis
(Run or) reorientRun (Run ir, string direction) {
    foreach Volume iv, i in ir.v {
        or.v[i] = reorient(iv, direction);
    }
}
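The `foreach` in `reorientRun` applies `reorient` to each volume independently, so the iterations can run in parallel. A rough Python analogue of that pattern, assuming a thread pool and a stand-in `fake_reorient` function (both are illustrative choices, not part of Swift), might look like this:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, TypeVar

V = TypeVar("V")


def reorient_run(
    volumes: List[V],
    direction: str,
    reorient: Callable[[V, str], V],
    max_workers: int = 4,
) -> List[V]:
    """Apply `reorient` to every volume of a run, possibly concurrently.

    Executor.map returns results in input order, so output[i] corresponds
    to volumes[i], matching `or.v[i] = reorient(iv, direction)`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda iv: reorient(iv, direction), volumes))


def fake_reorient(volume: str, direction: str) -> str:
    # Hypothetical stand-in: a real version would invoke an external
    # image-processing program on the volume's files.
    return f"{volume}.{direction}"


if __name__ == "__main__":
    print(reorient_run(["bold1_0001", "bold1_0002"], "y", fake_reorient))
```

Unlike this sketch, a workflow system would typically dispatch each call to remote worker nodes rather than local threads; the dependency structure is the same.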
Provenance Data Model
Multi-level Scheduling SwiftScript Abstract computation Virtual Data Catalog SwiftScript Compiler Specification Execution Virtual Node(s) Worker Nodes Provenance data Provenance data Provenance collector launcher launcher file1 file2 file3 App F1 App F2 Scheduling Execution Engine (Karajan w/ Swift Runtime) Swift runtime callouts C C C C Status reporting Provisioning Falkon Resource Provisioner Amazon EC2
DOCK on SiCortex (does not include ~800 sec to stage input data) Ioan Raicu, Zhao Zhang
LIGO Gravitational Wave Observatory Birmingham • >1 Terabyte/day to 8 sites 770 TB replicated to date: >120 million replicas MTBF = 1 month Ann Chervenak et al., ISI; Scott Koranda et al., LIGO AEI/Golm
Lag Plot for Data Transfers to Caltech Credit: Kevin Flasch, LIGO
SIDGrid: B. Bertenthal et al., U.Chicago, IU, UIC
Social Informatics Data Grid (SIDgrid) TeraGrid PADS … SIDgrid Collaborative, multi-modal analysis of cognitive science data Diverse experimental data & metadata Browse data Search Content preview Transcode Download Analyze
ELAN SIDGrid Portal
A Community Integrated Model for Economic and Resource Trajectories for Humankind (CIM-EARTH) Dynamics, foresight, uncertainty, resolution, … Agriculture, transport, taxation, … Data (global, local, …) (Super) computers CIM-EARTH Framework Community process Open code, data
Alleviating  Poverty in Thailand: Modeling  Entrepreneurship Consider  only wealth, access to  capital Consider also distance to 6 major cities Rob Townsend, Victor Zhorin, et al. Match High Low
Text Mining
GeneWays Online  Journals Pathways GeneWays Andrey Rzhetsky  et al. Screening 250,000 journal articles 2.5M reasoning chains  4M statements
Evidence Integration: Genetics & Disease Susceptibility Identify Genes Phenotype 1  Phenotype 2  Phenotype 3  Phenotype 4 Predictive Disease Susceptibility Physiology Metabolism Endocrine Proteome Immune Transcriptome Biomarker Signatures Morphometrics Pharmacokinetics Ethnicity Environment Age Gender Source: Terry Magnuson
James Evans, U.Chicago Arabidopsis  articles
An Open Analytics Environment Data in Results out Programs & rules in
More Related Content

What's hot

LiveLinkedData - TransWebData - Nantes 2013
LiveLinkedData - TransWebData - Nantes 2013LiveLinkedData - TransWebData - Nantes 2013
LiveLinkedData - TransWebData - Nantes 2013Luis Daniel IbƔƱez
Ā 
Tulsa techfest Spark Core Aug 5th 2016
Tulsa techfest Spark Core Aug 5th 2016Tulsa techfest Spark Core Aug 5th 2016
Tulsa techfest Spark Core Aug 5th 2016Mark Smith
Ā 
Interactive Knowledge Discovery over Web of Data.
Interactive Knowledge Discovery over Web of Data.Interactive Knowledge Discovery over Web of Data.
Interactive Knowledge Discovery over Web of Data.Mehwish Alam
Ā 
Extreme Scripting July 2009
Extreme Scripting July 2009Extreme Scripting July 2009
Extreme Scripting July 2009Ian Foster
Ā 
Hyperloglog Project
Hyperloglog ProjectHyperloglog Project
Hyperloglog ProjectKendrick Lo
Ā 
Benchmark MinHash+LSH algorithm on Spark
Benchmark MinHash+LSH algorithm on SparkBenchmark MinHash+LSH algorithm on Spark
Benchmark MinHash+LSH algorithm on SparkXiaoqian Liu
Ā 
Opening and Integration of CASDD and Germplasm Data to AGRIS by Prof. Xuefu Z...
Opening and Integration of CASDD and Germplasm Data to AGRIS by Prof. Xuefu Z...Opening and Integration of CASDD and Germplasm Data to AGRIS by Prof. Xuefu Z...
Opening and Integration of CASDD and Germplasm Data to AGRIS by Prof. Xuefu Z...CIARD Movement
Ā 
Velocity cubes of galaxies
Velocity cubes of galaxiesVelocity cubes of galaxies
Velocity cubes of galaxiesJose Enrique Ruiz
Ā 
SchemEX - Creating the Yellow Pages for the Linked Open Data Cloud
SchemEX - Creating the Yellow Pages for the Linked Open Data CloudSchemEX - Creating the Yellow Pages for the Linked Open Data Cloud
SchemEX - Creating the Yellow Pages for the Linked Open Data CloudAnsgar Scherp
Ā 
GraphQL & DGraph with Go
GraphQL & DGraph with GoGraphQL & DGraph with Go
GraphQL & DGraph with GoJames Tan
Ā 
Data-driven Innovation - Wood
Data-driven Innovation - WoodData-driven Innovation - Wood
Data-driven Innovation - WoodAmazon Web Services
Ā 
Implementing a VO archive for datacubes of galaxies
Implementing a VO archive for datacubes of galaxiesImplementing a VO archive for datacubes of galaxies
Implementing a VO archive for datacubes of galaxiesJose Enrique Ruiz
Ā 
Probabilistic algorithms for fun and pseudorandom profit
Probabilistic algorithms for fun and pseudorandom profitProbabilistic algorithms for fun and pseudorandom profit
Probabilistic algorithms for fun and pseudorandom profitTyler Treat
Ā 
Scaling AncestryDNA with the Hadoop Ecosystem. Presented at the San Jose Hado...
Scaling AncestryDNA with the Hadoop Ecosystem. Presented at the San Jose Hado...Scaling AncestryDNA with the Hadoop Ecosystem. Presented at the San Jose Hado...
Scaling AncestryDNA with the Hadoop Ecosystem. Presented at the San Jose Hado...William Yetman
Ā 
Big Data Analytics with Storm, Spark and GraphLab
Big Data Analytics with Storm, Spark and GraphLabBig Data Analytics with Storm, Spark and GraphLab
Big Data Analytics with Storm, Spark and GraphLabImpetus Technologies
Ā 
Paris data-geeks-2013-03-28
Paris data-geeks-2013-03-28Paris data-geeks-2013-03-28
Paris data-geeks-2013-03-28Ted Dunning
Ā 
Text Mining Applied to SQL Queries: a Case Study for SDSS SkyServer
Text Mining Applied to SQL Queries: a Case Study for SDSS SkyServerText Mining Applied to SQL Queries: a Case Study for SDSS SkyServer
Text Mining Applied to SQL Queries: a Case Study for SDSS SkyServerVitor Hirota Makiyama
Ā 
Astronomical Data Processing on the LSST Scale with Apache Spark
Astronomical Data Processing on the LSST Scale with Apache SparkAstronomical Data Processing on the LSST Scale with Apache Spark
Astronomical Data Processing on the LSST Scale with Apache SparkDatabricks
Ā 
R statistics with mongo db
R statistics with mongo dbR statistics with mongo db
R statistics with mongo dbMongoDB
Ā 

What's hot (19)

LiveLinkedData - TransWebData - Nantes 2013
LiveLinkedData - TransWebData - Nantes 2013LiveLinkedData - TransWebData - Nantes 2013
LiveLinkedData - TransWebData - Nantes 2013
Ā 
Tulsa techfest Spark Core Aug 5th 2016
Tulsa techfest Spark Core Aug 5th 2016Tulsa techfest Spark Core Aug 5th 2016
Tulsa techfest Spark Core Aug 5th 2016
Ā 
Interactive Knowledge Discovery over Web of Data.
Interactive Knowledge Discovery over Web of Data.Interactive Knowledge Discovery over Web of Data.
Interactive Knowledge Discovery over Web of Data.
Ā 
Extreme Scripting July 2009
Extreme Scripting July 2009Extreme Scripting July 2009
Extreme Scripting July 2009
Ā 
Hyperloglog Project
Hyperloglog ProjectHyperloglog Project
Hyperloglog Project
Ā 
Benchmark MinHash+LSH algorithm on Spark
Benchmark MinHash+LSH algorithm on SparkBenchmark MinHash+LSH algorithm on Spark
Benchmark MinHash+LSH algorithm on Spark
Ā 
Opening and Integration of CASDD and Germplasm Data to AGRIS by Prof. Xuefu Z...
Opening and Integration of CASDD and Germplasm Data to AGRIS by Prof. Xuefu Z...Opening and Integration of CASDD and Germplasm Data to AGRIS by Prof. Xuefu Z...
Opening and Integration of CASDD and Germplasm Data to AGRIS by Prof. Xuefu Z...
Ā 
Velocity cubes of galaxies
Velocity cubes of galaxiesVelocity cubes of galaxies
Velocity cubes of galaxies
Ā 
SchemEX - Creating the Yellow Pages for the Linked Open Data Cloud
SchemEX - Creating the Yellow Pages for the Linked Open Data CloudSchemEX - Creating the Yellow Pages for the Linked Open Data Cloud
SchemEX - Creating the Yellow Pages for the Linked Open Data Cloud
Ā 
GraphQL & DGraph with Go
GraphQL & DGraph with GoGraphQL & DGraph with Go
GraphQL & DGraph with Go
Ā 
Data-driven Innovation - Wood
Data-driven Innovation - WoodData-driven Innovation - Wood
Data-driven Innovation - Wood
Ā 
Implementing a VO archive for datacubes of galaxies
Implementing a VO archive for datacubes of galaxiesImplementing a VO archive for datacubes of galaxies
Implementing a VO archive for datacubes of galaxies
Ā 
Probabilistic algorithms for fun and pseudorandom profit
Probabilistic algorithms for fun and pseudorandom profitProbabilistic algorithms for fun and pseudorandom profit
Probabilistic algorithms for fun and pseudorandom profit
Ā 
Scaling AncestryDNA with the Hadoop Ecosystem. Presented at the San Jose Hado...
Scaling AncestryDNA with the Hadoop Ecosystem. Presented at the San Jose Hado...Scaling AncestryDNA with the Hadoop Ecosystem. Presented at the San Jose Hado...
Scaling AncestryDNA with the Hadoop Ecosystem. Presented at the San Jose Hado...
Ā 
Big Data Analytics with Storm, Spark and GraphLab
Big Data Analytics with Storm, Spark and GraphLabBig Data Analytics with Storm, Spark and GraphLab
Big Data Analytics with Storm, Spark and GraphLab
Ā 
Paris data-geeks-2013-03-28
Paris data-geeks-2013-03-28Paris data-geeks-2013-03-28
Paris data-geeks-2013-03-28
Ā 
Text Mining Applied to SQL Queries: a Case Study for SDSS SkyServer
Text Mining Applied to SQL Queries: a Case Study for SDSS SkyServerText Mining Applied to SQL Queries: a Case Study for SDSS SkyServer
Text Mining Applied to SQL Queries: a Case Study for SDSS SkyServer
Ā 
Astronomical Data Processing on the LSST Scale with Apache Spark
Astronomical Data Processing on the LSST Scale with Apache SparkAstronomical Data Processing on the LSST Scale with Apache Spark
Astronomical Data Processing on the LSST Scale with Apache Spark
Ā 
R statistics with mongo db
R statistics with mongo dbR statistics with mongo db
R statistics with mongo db
Ā 

Viewers also liked

Grid And Healthcare For IOM July 2009
Grid And Healthcare For IOM July 2009Grid And Healthcare For IOM July 2009
Grid And Healthcare For IOM July 2009Ian Foster
Ā 
Rethinking how we provide science IT in an era of massive data but modest bud...
Rethinking how we provide science IT in an era of massive data but modest bud...Rethinking how we provide science IT in an era of massive data but modest bud...
Rethinking how we provide science IT in an era of massive data but modest bud...Ian Foster
Ā 
Computing Outside The Box September 2009
Computing Outside The Box September 2009Computing Outside The Box September 2009
Computing Outside The Box September 2009Ian Foster
Ā 
Running Hot October 2008
Running Hot October 2008Running Hot October 2008
Running Hot October 2008Ian Foster
Ā 
Recruitment and Selection
Recruitment and SelectionRecruitment and Selection
Recruitment and Selectionr m
Ā 
Science for the Future: Strategies for Moving and Sharing Data
Science for the Future: Strategies for Moving and Sharing DataScience for the Future: Strategies for Moving and Sharing Data
Science for the Future: Strategies for Moving and Sharing DataIan Foster
Ā 
Recruiting in a Networked World - Workshop Series
Recruiting in a Networked World - Workshop SeriesRecruiting in a Networked World - Workshop Series
Recruiting in a Networked World - Workshop Serieshholmes75
Ā 
Networking Materials Data
Networking Materials DataNetworking Materials Data
Networking Materials DataIan Foster
Ā 
Recruitment and Selection
Recruitment and SelectionRecruitment and Selection
Recruitment and Selectionr m
Ā 
Agents In An Exponential World Foster
Agents In An Exponential World FosterAgents In An Exponential World Foster
Agents In An Exponential World FosterIan Foster
Ā 
Computation and Knowledge
Computation and KnowledgeComputation and Knowledge
Computation and KnowledgeIan Foster
Ā 
Campus Bridging with Globus Services
Campus Bridging with Globus ServicesCampus Bridging with Globus Services
Campus Bridging with Globus ServicesIan Foster
Ā 
Globus publication demo screenshots
Globus publication demo screenshotsGlobus publication demo screenshots
Globus publication demo screenshotsIan Foster
Ā 
Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009Ian Foster
Ā 
Globus status and publication plans
Globus status and publication plansGlobus status and publication plans
Globus status and publication plansIan Foster
Ā 
Flitterin For Talent Presentation Slides
Flitterin For Talent Presentation SlidesFlitterin For Talent Presentation Slides
Flitterin For Talent Presentation Slideshholmes75
Ā 
Mexico talk foster march 2012
Mexico talk foster march 2012Mexico talk foster march 2012
Mexico talk foster march 2012Ian Foster
Ā 

Viewers also liked (17)

Grid And Healthcare For IOM July 2009
Grid And Healthcare For IOM July 2009Grid And Healthcare For IOM July 2009
Grid And Healthcare For IOM July 2009
Ā 
Rethinking how we provide science IT in an era of massive data but modest bud...
Rethinking how we provide science IT in an era of massive data but modest bud...Rethinking how we provide science IT in an era of massive data but modest bud...
Rethinking how we provide science IT in an era of massive data but modest bud...
Ā 
Computing Outside The Box September 2009
Computing Outside The Box September 2009Computing Outside The Box September 2009
Computing Outside The Box September 2009
Ā 
Running Hot October 2008
Running Hot October 2008Running Hot October 2008
Running Hot October 2008
Ā 
Recruitment and Selection
Recruitment and SelectionRecruitment and Selection
Recruitment and Selection
Ā 
Science for the Future: Strategies for Moving and Sharing Data
Science for the Future: Strategies for Moving and Sharing DataScience for the Future: Strategies for Moving and Sharing Data
Science for the Future: Strategies for Moving and Sharing Data
Ā 
Recruiting in a Networked World - Workshop Series
Recruiting in a Networked World - Workshop SeriesRecruiting in a Networked World - Workshop Series
Recruiting in a Networked World - Workshop Series
Ā 
Networking Materials Data
Networking Materials DataNetworking Materials Data
Networking Materials Data
Ā 
Recruitment and Selection
Recruitment and SelectionRecruitment and Selection
Recruitment and Selection
Ā 
Agents In An Exponential World Foster
Agents In An Exponential World FosterAgents In An Exponential World Foster
Agents In An Exponential World Foster
Ā 
Computation and Knowledge
Computation and KnowledgeComputation and Knowledge
Computation and Knowledge
Ā 
Campus Bridging with Globus Services
Campus Bridging with Globus ServicesCampus Bridging with Globus Services
Campus Bridging with Globus Services
Ā 
Globus publication demo screenshots
Globus publication demo screenshotsGlobus publication demo screenshots
Globus publication demo screenshots
Ā 
Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009
Ā 
Globus status and publication plans
Globus status and publication plansGlobus status and publication plans
Globus status and publication plans
Ā 
Flitterin For Talent Presentation Slides
Flitterin For Talent Presentation SlidesFlitterin For Talent Presentation Slides
Flitterin For Talent Presentation Slides
Ā 
Mexico talk foster march 2012
Mexico talk foster march 2012Mexico talk foster march 2012
Mexico talk foster march 2012
Ā 

Similar to Open Analytics Environment

Scientific Applications and Heterogeneous Architectures
Scientific Applications and Heterogeneous ArchitecturesScientific Applications and Heterogeneous Architectures
Scientific Applications and Heterogeneous Architecturesinside-BigData.com
Ā 
SQL is Dead; Long Live SQL: Lightweight Query Services for Long Tail Science
SQL is Dead; Long Live SQL: Lightweight Query Services for Long Tail ScienceSQL is Dead; Long Live SQL: Lightweight Query Services for Long Tail Science
SQL is Dead; Long Live SQL: Lightweight Query Services for Long Tail ScienceUniversity of Washington
Ā 
Opportunities for X-Ray science in future computing architectures
Opportunities for X-Ray science in future computing architecturesOpportunities for X-Ray science in future computing architectures
Opportunities for X-Ray science in future computing architecturesIan Foster
Ā 
HEPData workshop talk
HEPData workshop talkHEPData workshop talk
HEPData workshop talkEamonn Maguire
Ā 
Virtual Science in the Cloud
Virtual Science in the CloudVirtual Science in the Cloud
Virtual Science in the Cloudthetfoot
Ā 
Ph. D. Final Dissertation SLides
Ph. D. Final Dissertation SLidesPh. D. Final Dissertation SLides
Ph. D. Final Dissertation SLidesEmanuele Panigati
Ā 
ICWE2017 BigDataEurope
ICWE2017 BigDataEuropeICWE2017 BigDataEurope
ICWE2017 BigDataEuropeBigData_Europe
Ā 
Extracting City Traffic Events from Social Streams
 Extracting City Traffic Events from Social Streams Extracting City Traffic Events from Social Streams
Extracting City Traffic Events from Social StreamsPramod Anantharam
Ā 
Provenance for Data Munging Environments
Provenance for Data Munging EnvironmentsProvenance for Data Munging Environments
Provenance for Data Munging EnvironmentsPaul Groth
Ā 
Propagation of Policies in Rich Data Flows
Propagation of Policies in Rich Data FlowsPropagation of Policies in Rich Data Flows
Propagation of Policies in Rich Data FlowsEnrico Daga
Ā 
Grid Projects In The US July 2008
Grid Projects In The US July 2008Grid Projects In The US July 2008
Grid Projects In The US July 2008Ian Foster
Ā 
Cyberinfrastructure and Applications Overview: Howard University June22
Cyberinfrastructure and Applications Overview: Howard University June22Cyberinfrastructure and Applications Overview: Howard University June22
Cyberinfrastructure and Applications Overview: Howard University June22marpierc
Ā 
Recommender Systems in the Linked Data era
Recommender Systems in the Linked Data eraRecommender Systems in the Linked Data era
Recommender Systems in the Linked Data eraRoku
Ā 
The data streaming processing paradigm and its use in modern fog architectures
The data streaming processing paradigm and its use in modern fog architecturesThe data streaming processing paradigm and its use in modern fog architectures
The data streaming processing paradigm and its use in modern fog architecturesVincenzo Gulisano
Ā 
Enabling semantic integration
Enabling semantic integration Enabling semantic integration
Enabling semantic integration Jean-Paul Calbimonte
Ā 
Leveraging the Web of Data: Managing, Analysing and Making Use of Linked Open...
Leveraging the Web of Data: Managing, Analysing and Making Use of Linked Open...Leveraging the Web of Data: Managing, Analysing and Making Use of Linked Open...
Leveraging the Web of Data: Managing, Analysing and Making Use of Linked Open...Thomas Gottron
Ā 
Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...
Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...
Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...Data Con LA
Ā 
Stream Reasoning: Where we got so far. Oxford 2010.1.18
Stream Reasoning: Where we got so far. Oxford 2010.1.18Stream Reasoning: Where we got so far. Oxford 2010.1.18
Stream Reasoning: Where we got so far. Oxford 2010.1.18Emanuele Della Valle
Ā 
HEPData Open Repositories 2016 Talk
HEPData Open Repositories 2016 TalkHEPData Open Repositories 2016 Talk
HEPData Open Repositories 2016 TalkEamonn Maguire
Ā 

Similar to Open Analytics Environment (20)

Scientific Applications and Heterogeneous Architectures
Scientific Applications and Heterogeneous ArchitecturesScientific Applications and Heterogeneous Architectures
Scientific Applications and Heterogeneous Architectures
Ā 
SQL is Dead; Long Live SQL: Lightweight Query Services for Long Tail Science
SQL is Dead; Long Live SQL: Lightweight Query Services for Long Tail ScienceSQL is Dead; Long Live SQL: Lightweight Query Services for Long Tail Science
SQL is Dead; Long Live SQL: Lightweight Query Services for Long Tail Science
Ā 
Opportunities for X-Ray science in future computing architectures
Opportunities for X-Ray science in future computing architecturesOpportunities for X-Ray science in future computing architectures
Opportunities for X-Ray science in future computing architectures
Ā 
HEPData workshop talk
HEPData workshop talkHEPData workshop talk
HEPData workshop talk
Ā 
Virtual Science in the Cloud
Virtual Science in the CloudVirtual Science in the Cloud
Virtual Science in the Cloud
Ā 
Ph. D. Final Dissertation SLides
Ph. D. Final Dissertation SLidesPh. D. Final Dissertation SLides
Ph. D. Final Dissertation SLides
Ā 
ICWE2017 BigDataEurope
ICWE2017 BigDataEuropeICWE2017 BigDataEurope
ICWE2017 BigDataEurope
Ā 
Extracting City Traffic Events from Social Streams
 Extracting City Traffic Events from Social Streams Extracting City Traffic Events from Social Streams
Extracting City Traffic Events from Social Streams
Ā 
Provenance for Data Munging Environments
Provenance for Data Munging EnvironmentsProvenance for Data Munging Environments
Provenance for Data Munging Environments
Ā 
Propagation of Policies in Rich Data Flows
Propagation of Policies in Rich Data FlowsPropagation of Policies in Rich Data Flows
Propagation of Policies in Rich Data Flows
Ā 
Grid Projects In The US July 2008
Grid Projects In The US July 2008Grid Projects In The US July 2008
Grid Projects In The US July 2008
Ā 
Potterā€™S Wheel
Potterā€™S WheelPotterā€™S Wheel
Potterā€™S Wheel
Ā 
Cyberinfrastructure and Applications Overview: Howard University June22
Cyberinfrastructure and Applications Overview: Howard University June22Cyberinfrastructure and Applications Overview: Howard University June22
Cyberinfrastructure and Applications Overview: Howard University June22
Ā 
Recommender Systems in the Linked Data era
Recommender Systems in the Linked Data eraRecommender Systems in the Linked Data era
Recommender Systems in the Linked Data era
Ā 
The data streaming processing paradigm and its use in modern fog architectures
The data streaming processing paradigm and its use in modern fog architecturesThe data streaming processing paradigm and its use in modern fog architectures
The data streaming processing paradigm and its use in modern fog architectures
Ā 
Enabling semantic integration
Enabling semantic integration Enabling semantic integration
Enabling semantic integration
Ā 
Leveraging the Web of Data: Managing, Analysing and Making Use of Linked Open...
Leveraging the Web of Data: Managing, Analysing and Making Use of Linked Open...Leveraging the Web of Data: Managing, Analysing and Making Use of Linked Open...
Leveraging the Web of Data: Managing, Analysing and Making Use of Linked Open...
Ā 
Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...
Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...
Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...
Ā 
Stream Reasoning: Where we got so far. Oxford 2010.1.18
Stream Reasoning: Where we got so far. Oxford 2010.1.18Stream Reasoning: Where we got so far. Oxford 2010.1.18
Stream Reasoning: Where we got so far. Oxford 2010.1.18
Ā 
HEPData Open Repositories 2016 Talk
HEPData Open Repositories 2016 TalkHEPData Open Repositories 2016 Talk
HEPData Open Repositories 2016 Talk
Ā 

More from Ian Foster

Global Services for Global Science March 2023.pptx
Global Services for Global Science March 2023.pptxGlobal Services for Global Science March 2023.pptx
Global Services for Global Science March 2023.pptxIan Foster
Ā 
The Earth System Grid Federation: Origins, Current State, Evolution
The Earth System Grid Federation: Origins, Current State, EvolutionThe Earth System Grid Federation: Origins, Current State, Evolution
The Earth System Grid Federation: Origins, Current State, EvolutionIan Foster
Ā 
Better Information Faster: Programming the Continuum
Better Information Faster: Programming the ContinuumBetter Information Faster: Programming the Continuum
Better Information Faster: Programming the ContinuumIan Foster
Ā 
ESnet6 and Smart Instruments
ESnet6 and Smart InstrumentsESnet6 and Smart Instruments
ESnet6 and Smart InstrumentsIan Foster
Ā 
Linking Scientific Instruments and Computation
Linking Scientific Instruments and ComputationLinking Scientific Instruments and Computation
Linking Scientific Instruments and ComputationIan Foster
Ā 
A Global Research Data Platform: How Globus Services Enable Scientific Discovery
A Global Research Data Platform: How Globus Services Enable Scientific DiscoveryA Global Research Data Platform: How Globus Services Enable Scientific Discovery
A Global Research Data Platform: How Globus Services Enable Scientific DiscoveryIan Foster
Ā 
Foster CRA March 2022.pptx
Open Analytics Environment

  • 1. Towards an Open Analytics Environment Ian Foster Computation Institute Argonne National Lab & University of Chicago
  • 3. The Good Old Days: Astronomy ~1600 30 years ? years 10 years 6 years 2 years
  • 4. Astronomy, from 1600 to 2000. Automation: 10^-1 → 10^8 Hz data capture. Community: 10^0 → 10^4 astronomers (10^6 amateur). Computation: 10^-1 → 10^15 Hz peak. Data: 10^6 → 10^15 B aggregate. Literature: 10^1 → 10^5 pages/year.
  • 6. Biomedical Research ~2000 ... atcgaattccaggcgtcacattctcaattcca... MPMILGYWDIRGLAHAIRLLLEYTDSSYEEKKYT... Protein-Protein Interactions metabolism pathways receptor-ligand 4º structure Polymorphism and Variants genetic variants individual patients epidemiology Physiology Cellular biology Biochemistry Neurobiology Endocrinology etc. >10^6 ESTs Expression patterns Large-scale screens Genetics and Maps Linkage Cytogenetic Clone-based From John Wooley >10^6 >10^9 >10^6 >10^5 >10^9 DNA sequences alignments Proteins sequence 2º structure 3º structure
  • 7. Growth of Sequences and Annotations since 1982. Folker Meyer, Genome Sequencing vs. Moore's Law: Cyber Challenges for the Next Decade, CTWatch, August 2006.
  • 8. The Analyst in Denial: "I just need a bigger disk (and workstation)"
  • 12. What Goes In (2) Rules Workflows Dryad MapReduce Parallel programs SQL BPEL Swift SCFL R MatLab Octave
  • 14. What Comes Out Data Data Virtual Data Schema
  • 15. Analysis as (Collaborative) Process Transform Annotate Search Add to Tag Visualize Discover Extend Group Share
  • 18. Towards an Open Analysis Environment: (2) Hardware SiCortex 6K cores, 6 Top/s IBM BG/P 160K cores, 500 Top/s PADS 10-40 Gbit/s
  • 19. PADS: Petascale Active Data Store 500 TB reliable storage (data & metadata) 180 TB, 180 GB/s 17 Top/s analysis Data ingest Dynamic provisioning Parallel analysis Remote access Offload to remote data centers P A D S Diverse users Diverse data sources 1000 TB tape backup
  • 21. Tagging & Social Networking. GLOSS: Generalized Labels Over Scientific data Sources
  • 22. XDTM: XML Data Typing & Mapping
      ./group23:
      drwxr-xr-x 4 yongzh users 2048 Nov 12 14:15 AA
      drwxr-xr-x 4 yongzh users 2048 Nov 11 21:13 CH
      drwxr-xr-x 4 yongzh users 2048 Nov 11 16:32 EC
      ./group23/AA:
      drwxr-xr-x 5 yongzh users 2048 Nov 5 12:41 04nov06aa
      drwxr-xr-x 4 yongzh users 2048 Dec 6 12:24 11nov06aa
      ./group23/AA/04nov06aa:
      drwxr-xr-x 2 yongzh users 2048 Nov 5 12:52 ANATOMY
      drwxr-xr-x 2 yongzh users 49152 Dec 5 11:40 FUNCTIONAL
      ./group23/AA/04nov06aa/ANATOMY:
      -rw-r--r-- 1 yongzh users 348 Nov 5 12:29 coplanar.hdr
      -rw-r--r-- 1 yongzh users 16777216 Nov 5 12:29 coplanar.img
      ./group23/AA/04nov06aa/FUNCTIONAL:
      -rw-r--r-- 1 yongzh users 348 Nov 5 12:32 bold1_0001.hdr
      -rw-r--r-- 1 yongzh users 409600 Nov 5 12:32 bold1_0001.img
      -rw-r--r-- 1 yongzh users 348 Nov 5 12:32 bold1_0002.hdr
      -rw-r--r-- 1 yongzh users 409600 Nov 5 12:32 bold1_0002.img
      -rw-r--r-- 1 yongzh users 496 Nov 15 20:44 bold1_0002.mat
      -rw-r--r-- 1 yongzh users 348 Nov 5 12:32 bold1_0003.hdr
      -rw-r--r-- 1 yongzh users 409600 Nov 5 12:32 bold1_0003.img
      Panel labels: Logical, Physical
  • 23. fMRI Type Definitions
      type Study { Group g[]; }
      type Group { Subject s[]; }
      type Subject { Volume anat; Run run[]; }
      type Run { Volume v[]; }
      type Volume { Image img; Header hdr; }
      type Image {};
      type Header {};
      type Warp {};
      type Air {};
      type AirVec { Air a[]; }
      type NormAnat { Volume anat; Warp aWarp; Volume nHires; }
  • 24. High-Performance Data Analytics: Functional MRI. Ben Clifford, Mihael Hategan, Mike Wilde, Yong Zhao
  • 27. Multi-level Scheduling SwiftScript Abstract computation Virtual Data Catalog SwiftScript Compiler Specification Execution Virtual Node(s) Worker Nodes Provenance data Provenance data Provenance collector launcher launcher file1 file2 file3 App F1 App F2 Scheduling Execution Engine (Karajan w/ Swift Runtime) Swift runtime callouts C C C C Status reporting Provisioning Falkon Resource Provisioner Amazon EC2
  • 30. Lag Plot for Data Transfers to Caltech Credit: Kevin Flasch, LIGO
  • 31. SIDGrid: B. Bertenthal et al., U.Chicago, IU, UIC
  • 32. Social Informatics Data Grid (SIDgrid) TeraGrid PADS … SIDgrid Collaborative, multi-modal analysis of cognitive science data Diverse experimental data & metadata Browse data Search Content preview Transcode Download Analyze
  • 35. A Community Integrated Model for Economic and Resource Trajectories for Humankind (CIM-EARTH) Dynamics, foresight, uncertainty, resolution, … Agriculture, transport, taxation, … Data (global, local, …) (Super) computers CIM-EARTH Framework Community process Open code, data
  • 36. Alleviating Poverty in Thailand: Modeling Entrepreneurship Consider only wealth, access to capital Consider also distance to 6 major cities Rob Townsend, Victor Zhorin, et al. Match High Low
  • 38. GeneWays Online Journals Pathways GeneWays Andrey Rzhetsky et al. Screening 250,000 journal articles 2.5M reasoning chains 4M statements
  • 39. Evidence Integration: Genetics & Disease Susceptibility Identify Genes Phenotype 1 Phenotype 2 Phenotype 3 Phenotype 4 Predictive Disease Susceptibility Physiology Metabolism Endocrine Proteome Immune Transcriptome Biomarker Signatures Morphometrics Pharmacokinetics Ethnicity Environment Age Gender Source: Terry Magnuson
  • 40. James Evans, U.Chicago Arabidopsis articles
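
Slides 22 and 23 contrast a physical directory layout with a logical, typed view of the same fMRI data. The following is a minimal Python sketch of that idea, not the XDTM or SwiftScript implementation. It assumes that the root directory is the study, its first-level subdirectories are groups, and second-level subdirectories are subjects; that interpretation, the function names (`map_volumes`, `map_subject`, `map_study`), and the rule that a run is identified by the file-name prefix before the last underscore are all illustrative assumptions.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, List, Optional


# Logical types, mirroring the slide's type definitions.
@dataclass
class Volume:
    img: Path  # image file (.img)
    hdr: Path  # header file (.hdr)


@dataclass
class Run:
    v: List[Volume] = field(default_factory=list)


@dataclass
class Subject:
    anat: Optional[Volume] = None
    run: List[Run] = field(default_factory=list)


@dataclass
class Group:
    s: List[Subject] = field(default_factory=list)


@dataclass
class Study:
    g: List[Group] = field(default_factory=list)


def map_volumes(directory: Path) -> Dict[str, Volume]:
    """Pair each <stem>.img with its <stem>.hdr; other files are ignored."""
    volumes: Dict[str, Volume] = {}
    if not directory.is_dir():
        return volumes
    for img in sorted(directory.glob("*.img")):
        hdr = img.with_suffix(".hdr")
        if hdr.exists():
            volumes[img.stem] = Volume(img=img, hdr=hdr)
    return volumes


def map_subject(subject_dir: Path) -> Subject:
    """Build a Subject from ANATOMY and FUNCTIONAL subdirectories."""
    # Assumption: the first volume found under ANATOMY is the anatomical one.
    anat = next(iter(map_volumes(subject_dir / "ANATOMY").values()), None)
    # Assumption: "bold1_0001" belongs to run "bold1".
    runs: Dict[str, List[Volume]] = {}
    for stem, volume in map_volumes(subject_dir / "FUNCTIONAL").items():
        runs.setdefault(stem.rsplit("_", 1)[0], []).append(volume)
    return Subject(anat=anat, run=[Run(v=runs[name]) for name in sorted(runs)])


def map_study(root: Path) -> Study:
    """Map root/<group>/<subject>/... onto Study -> Group -> Subject."""
    groups = []
    for group_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        subjects = [
            map_subject(d) for d in sorted(group_dir.iterdir()) if d.is_dir()
        ]
        groups.append(Group(s=subjects))
    return Study(g=groups)
```

Calling `map_study(Path("./group23"))` on a tree like the one listed on slide 22 would return a `Study` whose subjects expose `anat` and `run[i].v[j]` without the caller needing to know file names. Keeping the mapping functions separate from the type declarations means a different physical layout only requires new mappers.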