Efficient Support for MPI-I/O Atomicity
              Based on Versioning

Viet-Trung Tran (1), Bogdan Nicolae (2), Gabriel Antoniu (2), Luc Bougé (1)
KerData Research Team

(1) ENS Cachan, IRISA, France
(2) INRIA, IRISA, Rennes, France




Context: Data-Intensive Large-Scale HPC Simulations
• Large-scale simulations of natural phenomena
• Highly parallel platforms
• I/O challenges
   – High I/O performance
   – Huge data sizes (~PB)
   – High concurrency




Data Access Pattern

• Spatial splitting in parallelization
   – Ghost cells
• Application data model vs. storage model
   – Storage model: sequence of bytes
• Concurrent, overlapping, non-contiguous I/O
   – Requires atomicity guarantees
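
The gap between the application data model and a byte-sequence storage model can be sketched in a few lines of Python. This is only an illustration: it assumes a 2D array stored in row-major order with fixed-size elements (the slides do not state the layout), and the helper names `tile_byte_ranges` and `overlap_bytes` are hypothetical. It shows that one tile maps to many non-contiguous byte ranges, and that two tiles sharing ghost columns overlap in the file.

```python
def tile_byte_ranges(cols, row0, col0, tile_rows, tile_cols, elem_size=8):
    """Return one (offset, size) byte range per row of a tile.

    Assumes a row-major 2D array with `cols` columns stored as a flat file.
    """
    ranges = []
    for r in range(row0, row0 + tile_rows):
        offset = (r * cols + col0) * elem_size
        ranges.append((offset, tile_cols * elem_size))
    return ranges


def overlap_bytes(a, b):
    """Count the bytes covered by both lists of (offset, size) ranges."""
    total = 0
    for o1, s1 in a:
        for o2, s2 in b:
            lo, hi = max(o1, o2), min(o1 + s1, o2 + s2)
            if hi > lo:
                total += hi - lo
    return total


# An 8-column array of 8-byte elements, split between two processes.
# Each tile is 4 rows by 5 columns; columns 3 and 4 are shared ghost cells.
left = tile_byte_ranges(cols=8, row0=0, col0=0, tile_rows=4, tile_cols=5)
right = tile_byte_ranges(cols=8, row0=0, col0=3, tile_rows=4, tile_cols=5)

print(left)                         # four separate 40-byte ranges
print(overlap_bytes(left, right))   # 64 bytes are written by both processes
```

If both processes write their tiles at the same time, the 64 shared bytes are exactly where an atomicity guarantee is needed.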


Goal:

High throughput non-contiguous I/O
    under atomicity guarantees




State of the Art

• Locking-based approaches to ensure atomicity
• 3 levels of implementation
   – Application
   – MPI-I/O
   – Storage

Parallel I/O stack (figure), top to bottom:
   Application (Visit, Tornado simulation)
   Data model (HDF5, NetCDF)
   MPI-IO middleware
   Parallel file systems (PVFS, GPFS, Lustre)



Our Approach

• Dedicated interface for atomic non-contiguous I/O
   – Provide atomicity guarantees at the storage level
   – No need to translate the MPI consistency model into the storage consistency model

• Shadowing as a key to enhancing data access under concurrency
   – No locking
   – Concurrent overlapping writes are allowed
   – Atomicity guarantees

• Data striping




Building Block: BlobSeer

• A KerData project (blobseer.gforge.inria.fr)
   – Data striping
   – Versioning-based concurrency control
   – Distributed metadata management
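
As an illustration of data striping, the sketch below splits a byte range into fixed-size chunks and assigns each chunk to a storage provider. The round-robin placement and the names `chunks_for_range` and `provider_for_chunk` are assumptions made for this example; the slides do not describe how BlobSeer places chunks.

```python
def chunks_for_range(offset, size, chunk_size):
    """Split [offset, offset + size) into (chunk_index, offset_in_chunk, length) pieces."""
    pieces = []
    pos, end = offset, offset + size
    while pos < end:
        idx = pos // chunk_size
        within = pos % chunk_size
        length = min(chunk_size - within, end - pos)
        pieces.append((idx, within, length))
        pos += length
    return pieces


def provider_for_chunk(chunk_index, n_providers):
    """Hypothetical round-robin placement of chunks over providers."""
    return chunk_index % n_providers


# A 200-byte access starting at byte 100, with 128-byte chunks, touches three chunks.
pieces = chunks_for_range(offset=100, size=200, chunk_size=128)
print(pieces)  # [(0, 100, 28), (1, 0, 128), (2, 0, 44)]
print([provider_for_chunk(idx, n_providers=2) for idx, _, _ in pieces])  # [0, 1, 0]
```

Because consecutive chunks land on different providers, a single large access can be served by several providers in parallel.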




Building Block: BlobSeer (continued)

• Distributed metadata management
   – Organized as a segment tree
   – Distributed over a DHT
• Two-phase I/O
   – Data access
   – Metadata access

[Figure: metadata trees over a blob. Nodes are labeled [offset, size]: root [0, 8]; below it [0, 4] and [4, 4]; then [0, 2], [2, 2] and [4, 2]; leaves [0, 1], [1, 1], [2, 1], [3, 1] and [4, 1]. Several labels appear more than once because the figure shows more than one tree.]
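
The [offset, size] labels in the figure suggest how a lookup descends a segment tree. The following sketch is a simplified, single-process model: it assumes each node covers a power-of-two range and splits it in half, and the function name `covering_leaves` is hypothetical. It returns the leaves that intersect a requested range.

```python
def covering_leaves(offset, size, node=(0, 8)):
    """Return the leaves (node_offset, 1) that intersect [offset, offset + size).

    `node` is an (offset, size) pair; its children cover the two halves of its range.
    """
    n_off, n_size = node
    # Prune subtrees that do not intersect the requested range.
    if offset + size <= n_off or n_off + n_size <= offset:
        return []
    if n_size == 1:
        return [node]
    half = n_size // 2
    return (covering_leaves(offset, size, (n_off, half))
            + covering_leaves(offset, size, (n_off + half, half)))


print(covering_leaves(1, 3))  # [(1, 1), (2, 1), (3, 1)]
```

Only the subtrees that intersect the request are visited, so the right half of the tree, [4, 4], is never explored for this query.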



Proposal for a Non-contiguous,
Versioning-Oriented Access Interface

• Non-contiguous write
   – vw = NONCONT_WRITE(id, buffers[], offsets[], sizes[])

• Non-contiguous read
   – NONCONT_READ(id, v, buffers[], offsets[], sizes[])

• Challenges
   – Non-contiguous I/O must be atomic
   – Must be efficient under concurrency
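
To show the intended semantics of the two calls, here is a toy in-memory model in Python. It is not BlobSeer's implementation: the class `ToyBlob` is hypothetical, the blob `id` is implied by the object, `sizes[]` is implied by the buffer lengths on writes, and each version is stored as a full copy rather than through shadowing. What it does illustrate is that a write publishes all of its regions as one new version, and a read addresses one specific version.

```python
class ToyBlob:
    """Toy model: every write produces a new immutable snapshot."""

    def __init__(self, size):
        self.versions = [bytes(size)]  # version 0 is all zero bytes

    def noncont_write(self, buffers, offsets):
        """Apply all (buffer, offset) pairs to a copy of the latest snapshot.

        Returns the version number of the new snapshot.
        """
        snapshot = bytearray(self.versions[-1])
        for buf, off in zip(buffers, offsets):
            if off < 0 or off + len(buf) > len(snapshot):
                raise ValueError("region falls outside the blob")
            snapshot[off:off + len(buf)] = buf
        self.versions.append(bytes(snapshot))
        return len(self.versions) - 1

    def noncont_read(self, v, offsets, sizes):
        """Read several regions from snapshot `v`."""
        snap = self.versions[v]
        return [snap[o:o + s] for o, s in zip(offsets, sizes)]


blob = ToyBlob(8)
v1 = blob.noncont_write([b"AA", b"BB"], [0, 4])   # two regions, one version
v2 = blob.noncont_write([b"CC"], [1])             # overlaps the first region
print(blob.noncont_read(v1, [0, 4], [2, 2]))      # [b'AA', b'BB']
print(blob.noncont_read(v2, [0], [3]))            # [b'ACC']
```

A reader of version `v1` never sees a partial state of the second write, which is the atomicity property the slides ask for.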




1st challenge: Non-contiguous I/O Must Be Atomic

• Shadowing techniques
• Isolate each non-contiguous update into a single consistent snapshot
   – Done at the metadata level
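
Shadowing at the metadata level can be sketched as path copying on an immutable segment tree. The code below is an illustrative model, not the authors' implementation: the `Node` class and the functions `build`, `shadow_write` and `read_leaf` are hypothetical names, leaves hold a chunk identifier instead of data, and the blob size is assumed to be a power of two. A write creates new nodes only on the paths to the updated leaves and shares every untouched subtree with the previous version.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    offset: int
    size: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    chunk: Optional[str] = None  # leaf payload: identifier of the stored chunk


def build(offset, size):
    """Build an empty tree covering [offset, offset + size); size is a power of two."""
    if size == 1:
        return Node(offset, 1)
    half = size // 2
    return Node(offset, size, build(offset, half), build(offset + half, half))


def shadow_write(node, updates):
    """Return the root of a new version in which the leaves named in `updates`
    ({leaf_offset: chunk_id}) are replaced. Untouched subtrees are reused as-is."""
    if not any(node.offset <= o < node.offset + node.size for o in updates):
        return node  # shared with the previous version
    if node.size == 1:
        return Node(node.offset, 1, chunk=updates[node.offset])
    return Node(node.offset, node.size,
                shadow_write(node.left, updates),
                shadow_write(node.right, updates))


def read_leaf(node, offset):
    """Follow the path from `node` down to the leaf at `offset`."""
    while node.size > 1:
        node = node.left if offset < node.offset + node.size // 2 else node.right
    return node.chunk


v0 = build(0, 8)
v1 = shadow_write(v0, {1: "c1", 5: "c5"})  # one non-contiguous update, one new root

print(read_leaf(v1, 1), read_leaf(v1, 5))   # c1 c5
print(read_leaf(v0, 1))                     # None: the old version is unchanged
print(v1.left.right is v0.left.right)       # True: subtree [2, 2] is shared
```

Because the whole update becomes visible through a single new root, readers see either all of its regions or none of them.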




2nd challenge: Efficiency Under Concurrent Accesses

• Advantage of shadowing: parallel data I/O phases

                          Our approach   Locking-based approach
   Overlapping data I/O   Parallel       No

• Parallel metadata I/O phases?




Minimize Ordering Overhead

• Ordering is done at the metadata level
• Arbitrary order
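
One way to picture ordering at the metadata level is a small service that hands out version numbers after the data has already been written. The sketch below is an assumption made for illustration: the class `VersionManager` and the single shared counter are hypothetical, and the slides do not say how versions are assigned. Writers run their data phase fully in parallel; the only serialized step is obtaining a number, and whichever writer asks first gets the lower version.

```python
import itertools
import threading


class VersionManager:
    """Hands out consecutive version numbers; the order of arrival decides."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._lock = threading.Lock()  # protects only the counter, not the data

    def assign(self):
        with self._lock:
            return next(self._counter)


def writer(vm, results, i):
    # The data phase would happen here, in parallel with the other writers.
    results[i] = vm.assign()  # ordering is decided only at this point


vm = VersionManager()
results = [None] * 8
threads = [threading.Thread(target=writer, args=(vm, results, i)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))  # [1, 2, 3, 4, 5, 6, 7, 8]
```

Which writer receives which number varies from run to run, but every writer always gets a distinct slot in one total order.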




Avoid Synchronization for Concurrent Segment Tree Generation

• Delegate the generation of shadowing trees to the client side
• Shadowing trees are generated in parallel thanks to predictable
  metadata node IDs
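
The idea of predictable node IDs can be sketched as follows. The key format, the SHA-1 based placement and the names `node_key`, `dht_server` and `path_keys` are all assumptions for this example; the slides only state that node IDs are predictable. If a node is identified by its blob, version, offset and size, a client that knows its version number can compute every key it needs, and the DHT server that stores it, without contacting other writers.

```python
import hashlib


def node_key(blob_id, version, offset, size):
    """Hypothetical key format for a metadata tree node."""
    return f"{blob_id}:{version}:{offset}:{size}"


def dht_server(key, n_servers):
    """Map a key to one of `n_servers` metadata servers by hashing."""
    digest = hashlib.sha1(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % n_servers


def path_keys(blob_id, version, leaf_offset, root_size):
    """Keys of the nodes on the root-to-leaf path that a writer would create."""
    keys = []
    off, size = 0, root_size
    while True:
        keys.append(node_key(blob_id, version, off, size))
        if size == 1:
            break
        half = size // 2
        if leaf_offset >= off + half:
            off += half  # descend into the right child
        size = half
    return keys


keys = path_keys("blob42", version=3, leaf_offset=5, root_size=8)
print(keys)
# ['blob42:3:0:8', 'blob42:3:4:4', 'blob42:3:4:2', 'blob42:3:5:1']
print([dht_server(k, n_servers=4) for k in keys])  # one server index per node
```

Two clients writing different versions compute disjoint sets of keys, so they can build their trees concurrently.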




Lazy Evaluation During Border Node Calculation

• Build the metadata tree bottom-up
• Optimized for non-contiguous access patterns




Summary: Overlapping Non-contiguous I/O

                      Our approach                           Locking-based approaches
Data I/O phases       Parallel                               Serialized
Metadata I/O phases   Close to parallel, thanks to:          Serialized
                      1. Arbitrary ordering
                      2. Ordering at the metadata level
                      3. Client-side shadowing in parallel
                      4. Lazy evaluation




Leveraging Our Versioning-Oriented Interface in the
Parallel I/O Stack

Stack (figure), top to bottom:
   Application (Visit, Tornado simulation)
   Data model (HDF5, NetCDF)
   MPI-IO middleware
   Storage optimized for atomic MPI-I/O

• Integrating BlobSeer into the MPI-I/O middleware is straightforward




Experimental Evaluation

• Our machines: reservation on the Grid'5000 platform
   – 80 nodes
   – Pentium 4 CPU @ 2.6 GHz, 4 GB RAM, Gigabit Ethernet
   – Measured bandwidth: 117.5 MB/s for MTU = 1500 B
• 3 sets of experiments:
   – Scalability of non-contiguous I/O
   – Scalability under concurrency
   – MPI-tile-I/O




Scalability of Non-contiguous I/O




Scalability Under Concurrency




MPI-tile-I/O: 128 KB Chunk Size




MPI-tile-I/O: 1 MB Chunk Size




Conclusion

• Experiments show promising results
   • We outperform locking-based approaches
   • Key features: shadowing, dedicated API for atomic non-contiguous I/O
   • Comparison to Lustre file system

• High throughput non-contiguous I/O under atomicity guarantees
• Future work
   • Exposing the versioning interface to MPI-I/O applications
   • Potential improvement for producer-consumer workflows
   • Pyramid: A large-scale array-oriented active storage system




Context




                    Application (Visit, Tornado simulation)

                         Data model (HDF5, NetCDF)

                              MPI-IO middleware

                             Parallel file systems



• Parallel file systems do not provide an atomic non-contiguous I/O interface




2nd challenge: Efficiency Under Concurrent Accesses
• Minimize ordering overhead
   – Ordering is done at the metadata level
   – Arbitrary order

• Avoid synchronization for concurrent segment tree generation
   – Delegate the generation of shadowing trees to the client side
   – Shadowing trees are generated in parallel

• Lazy evaluation during border node calculation





 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 

Último (20)

Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 

Efficient Support for MPI-I/O Atomicity

  • 1. Efficient Support for MPI-I/O Atomicity Based on Versioning
      Viet-Trung Tran (1), Bogdan Nicolae (2), Gabriel Antoniu (2), Luc Bougé (1)
      KerData Research Team
      (1) ENS Cachan, IRISA, France
      (2) INRIA, IRISA, Rennes, France
  • 2. Context: Data Intensive Large-scale HPC Simulations
      - Large-scale simulations of natural phenomena
      - Highly parallel platform
      - I/O challenges
          - High I/O performance
          - Huge data sizes (~PB)
          - High concurrency
  • 3. Data Access Pattern
      - Spatial splitting in parallelization
          - Ghost cells
      - Application data model vs storage model
          - Sequence of bytes
      - Concurrent overlapping non-contiguous I/O
          - Require atomicity guarantees
  • 4. Goal: High throughput non-contiguous I/O under atomicity guarantees
  • 5. State of The Art
      - Locking-based approaches to ensure atomicity
      - 3 levels of implementation
          - Application
          - MPI-I/O
          - Storage
      [Figure: parallel I/O stack, top to bottom: Application (Visit, Tornado simulation); Data model (HDF5, NetCDF); MPI-IO middleware; Parallel file systems (PVFS, GPFS, Lustre)]
  • 6. Our Approach
      - Dedicated interface for atomic non-contiguous I/O
          - Provide atomicity guarantees at storage level
          - No need to translate MPI consistency to storage consistency model
      - Shadowing as a key to enhance data access under concurrency
          - No locking
          - Concurrent overlapped writes are allowed
          - Atomicity guarantees
      - Data striping
  • 7. Building Block: BlobSeer
      - A KerData project (blobseer.gforge.inria.fr)
      - Data striping
      - Versioning-based concurrency control
      - Distributed metadata management
  • 8. Building Block: BlobSeer (continued)
      - Distributed metadata management
          - Organized as a segment tree
          - Distributed over a DHT
      - Two-phase I/O
          - Data access
          - Metadata access
      [Figure: metadata trees built over a blob; nodes carry labels such as [0, 8] at the root, then [0, 4] and [4, 4], then [0, 2], [2, 2] and [4, 2], down to leaves such as [0, 1], [1, 1], [2, 1], [3, 1] and [4, 1]]
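
The segment tree on this slide can be sketched in a few lines. This is a minimal illustration, not BlobSeer code: it assumes the node labels in the figure read as [offset, size] counted in chunks, that the blob size is a power of two, and that each inner node splits its range in half. The names `children`, `intersects` and `covering_leaves` are hypothetical.

```python
from typing import List, Tuple

# Assumption: a node label (offset, size) covers chunks offset .. offset+size-1.
Node = Tuple[int, int]


def children(node: Node) -> List[Node]:
    """Split a node's range in half; leaves (size 1) have no children."""
    offset, size = node
    if size == 1:
        return []
    half = size // 2
    return [(offset, half), (offset + half, half)]


def intersects(node: Node, offset: int, size: int) -> bool:
    """True if the node's range overlaps the range [offset, offset + size)."""
    n_off, n_size = node
    return n_off < offset + size and offset < n_off + n_size


def covering_leaves(root: Node, offset: int, size: int) -> List[Node]:
    """Descend from the root, following only children that overlap the range."""
    if not intersects(root, offset, size):
        return []
    kids = children(root)
    if not kids:
        return [root]
    leaves: List[Node] = []
    for kid in kids:
        leaves.extend(covering_leaves(kid, offset, size))
    return leaves


if __name__ == "__main__":
    # Leaves that hold chunks 3 and 4 of an 8-chunk blob.
    print(covering_leaves((0, 8), 3, 2))
```

A read of chunks 3 and 4 visits only the nodes whose ranges overlap the request, which is what makes the metadata access phase cheap for small non-contiguous regions.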
  • 9. Proposal for A Non-contiguous, Versioning Oriented Access Interface
      - Non-contiguous Write
          vw = NONCONT_WRITE(id, buffers[], offsets[], sizes[])
      - Non-contiguous Read
          NONCONT_READ(id, v, buffers[], offsets[], sizes[])
      - Challenges
          - Non-contiguous I/O must be atomic
          - Efficient under concurrency
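
To make the semantics of the two calls concrete, here is a toy, single-process stand-in. It is a sketch under stated assumptions, not the BlobSeer API: the class `ToyVersionedBlob` is hypothetical, the `id` argument is replaced by the object itself, and `sizes[]` is dropped on writes because Python buffers carry their own length. Each write publishes all of its regions as one new version, and untouched chunks are shared with the previous version rather than copied.

```python
from typing import List, Tuple


class ToyVersionedBlob:
    """In-memory illustration of versioned non-contiguous I/O (hypothetical)."""

    def __init__(self, num_chunks: int, chunk_size: int) -> None:
        self.chunk_size = chunk_size
        empty = bytes(chunk_size)
        # Version 0 is the initial zero-filled snapshot.
        self._versions: List[Tuple[bytes, ...]] = [
            tuple(empty for _ in range(num_chunks))
        ]

    def noncont_write(self, buffers: List[bytes], offsets: List[int]) -> int:
        """Apply every region to a shadow snapshot, then publish it as one version."""
        chunks = list(self._versions[-1])  # copies chunk references, not data
        total = len(chunks) * self.chunk_size
        for buf, off in zip(buffers, offsets):
            if off < 0 or off + len(buf) > total:
                raise ValueError("region falls outside the blob")
            for i, byte in enumerate(buf):
                idx, pos = divmod(off + i, self.chunk_size)
                chunk = bytearray(chunks[idx])  # only touched chunks are rewritten
                chunk[pos] = byte
                chunks[idx] = bytes(chunk)
        # Publication is a single append: readers see all regions or none.
        self._versions.append(tuple(chunks))
        return len(self._versions) - 1

    def noncont_read(
        self, version: int, offsets: List[int], sizes: List[int]
    ) -> List[bytes]:
        """Read several regions from one fixed snapshot."""
        data = b"".join(self._versions[version])
        return [data[off:off + size] for off, size in zip(offsets, sizes)]


if __name__ == "__main__":
    blob = ToyVersionedBlob(num_chunks=4, chunk_size=4)
    v1 = blob.noncont_write([b"AA", b"BB"], [0, 6])
    v2 = blob.noncont_write([b"CCC"], [5])  # overlaps the second region of v1
    print(blob.noncont_read(v1, [4], [4]))  # snapshot v1 is unchanged by v2
    print(blob.noncont_read(v2, [4], [4]))
```

The toy ignores concurrency and distribution entirely; it only shows why reading a fixed version `v` can never observe a partially applied non-contiguous write.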
  • 10. 1st challenge: Non-contiguous I/O Must Be Atomic
      - Shadowing techniques
          - Isolate non-contiguous update into one single consistent snapshot
          - Done at metadata level
  • 11. 2nd challenge: Efficiency Under Concurrent Accesses
      - Advantages of shadowing
          - Parallel data I/O phases
          - Parallel metadata I/O phases?

                               Our approach    Locking-based approach
      Overlapping data I/O     Parallel        No
  • 12. Minimize Ordering Overhead
      - Ordering is done at metadata level
      - Arbitrary order
  • 13. Avoid Synchronization for Concurrent Segment Tree Generation
      - Delegate the generation of the shadowing tree to the client side
      - Shadowing trees are generated in parallel thanks to predictable metadata node IDs
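
One way to read "predictable metadata node ID" is that a node's key can be computed from information the writer already holds. The sketch below assumes a key of the form (blob_id, version, offset, size); the slides do not specify this layout, and `shadow_node_keys` is a hypothetical helper. It lists the nodes a writer would create for its own version: every node of the full tree whose range overlaps a written region. Links from these nodes to unchanged subtrees of older versions are left out.

```python
from typing import List, Tuple

# Assumed key layout (not taken from the slides): blob, version, offset, size.
NodeKey = Tuple[str, int, int, int]


def shadow_node_keys(
    blob_id: str,
    version: int,
    blob_size: int,
    regions: List[Tuple[int, int]],
) -> List[NodeKey]:
    """Keys of the tree nodes that overlap any (offset, size) region, in preorder.

    blob_size is assumed to be a power of two, counted in chunks.
    """
    keys: List[NodeKey] = []
    stack = [(0, blob_size)]
    while stack:
        offset, size = stack.pop()
        touched = any(
            offset < r_off + r_size and r_off < offset + size
            for r_off, r_size in regions
        )
        if not touched:
            continue  # this subtree is unchanged and is not rebuilt
        keys.append((blob_id, version, offset, size))
        if size > 1:
            half = size // 2
            stack.append((offset + half, half))  # right child, visited second
            stack.append((offset, half))         # left child, visited first
    return keys


if __name__ == "__main__":
    # Two writers holding different versions compute their keys independently.
    a = shadow_node_keys("blob", 3, 8, [(1, 1), (4, 2)])
    b = shadow_node_keys("blob", 4, 8, [(0, 2)])
    print(a)
    print(set(a) & set(b))  # no key is shared, so no coordination is needed
```

Because the version is part of each key, two clients writing overlapping regions never produce the same key, so each can store its shadow tree without waiting for the other.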
  • 14. Lazy Evaluation During Border Node Calculation
      - Building metadata tree in bottom-up fashion
      - Optimized for non-contiguous pattern
  • 15. Summary: Overlapping Non-contiguous I/O

                              Our approach                    Locking-based approaches
      Data I/O phases         Parallel                        Serialization
      Metadata I/O phases     Close to parallel, thanks to:   Serialization
                              1. Arbitrary ordering
                              2. Metadata-level ordering
                              3. Client-side shadowing in parallel
                              4. Lazy evaluation
  • 16. Leveraging Our Versioning-Oriented Interface in Parallel I/O Stack
      [Figure: parallel I/O stack, top to bottom: Application (Visit, Tornado simulation); Data model (HDF5, NetCDF); MPI-IO middleware; Storage optimized for atomic MPI-I/O]
      - Integrating BlobSeer into the MPI-I/O middleware is straightforward
  • 17. Experimental Evaluation
      - Our machines: reservation on the Grid'5000 platform
          - 80 nodes
          - Pentium-4 CPU @ 2.6 GHz, 4 GB RAM, Gigabit Ethernet
          - Measured bandwidth: 117.5 MB/s for MTU = 1500 B
      - 3 sets of experiments:
          - Scalability of non-contiguous I/O
          - Scalability under concurrency
          - MPI-tile-I/O
  • 20. MPI-tile-I/O: 128 KB Chunk Size
  • 22. Conclusion
      - Experiments show promising results
      - We outperform locking-based approaches
      - Key features: shadowing, dedicated API for atomic non-contiguous I/O
      - Comparison to Lustre file system
      - High throughput non-contiguous I/O under atomicity guarantees
      - Future work
          - Exposing versioning-interface to MPI-I/O applications
          - Potential improvement for producer-consumer workflow
          - Pyramid: A large-scale array-oriented active storage system
  • 23. Context
      [Figure: parallel I/O stack, top to bottom: Application (Visit, Tornado simulation); Data model (HDF5, NetCDF); MPI-IO middleware; Parallel file systems]
      - Parallel file systems do not provide an atomic non-contiguous I/O interface
  • 24. 2nd challenge: Efficiency under concurrent accesses
      - Minimize ordering overhead
          - Ordering is done at metadata level
          - Arbitrary order
      - Avoid synchronization for concurrent segment tree generation
          - Delegate the generation of the shadowing tree to the client side
          - Shadowing trees are generated in parallel
      - Lazy evaluation during border node calculation