SlideShare a Scribd company logo
1 of 1
Download to read offline
MemzNet:	
  Memory-­‐Mapped	
  Zero-­‐copy	
  Network	
  Channel	
  for	
  Moving	
  Large	
  Datasets	
  over	
  100Gbps	
  Networks	
  
                                                                                                                                                  Mehmet	
  Balman	
  	
  (	
  mbalman@lbl.gov)	
  	
  
                                                                                                                 Computa8onal	
  Research	
  Division,	
  Lawrence	
  Berkeley	
  Na8onal	
  Laboratory	
                                                                                                                                                                          MemzNet’s	
  Architecture	
  for	
  data	
  streaming	
  
                                                                                                     Collaborators:	
  Eric	
  Pouyoul,	
  Yushu	
  Yao,	
  E.	
  Wes	
  Bethel,	
  Burlen	
  Loring,	
  Prabhat,	
  John	
  Shalf,	
  Alex	
  Sim,	
  	
  
                                                                                                                                    Arie	
  Shoshani,	
  Dean	
  N.	
  Williams,	
  Brian	
  L.	
  Tierney	
  
Memory-­‐mapped	
  Network	
  Channel	
  Framework	
                                                                                                                      	
  
                                                                                                         Increasing	
   the	
   bandwidth	
   is	
   not	
   sufficient	
   by	
   itself;	
   we	
   need	
   careful	
   evalua8on	
   of	
   future	
   high-­‐
                                                                                                         bandwidth	
  networks	
  from	
  the	
  applica8ons'	
  perspec8ve.	
  We	
  require	
  enhancements	
  in	
  current	
  
                                                                                                         middleware	
   tools	
   to	
   take	
   advantage	
   of	
   future	
   networking	
   frameworks.	
   To	
   improve	
  
                                                                                                         performance	
   and	
   efficiency,	
   we	
   develop	
   an	
   experimental	
   prototype,	
   called	
   MemzNet:	
  
                                                                                                         Memory-­‐mapped	
   Zero-­‐copy	
   Network	
   Channel,	
   which	
   uses	
   a	
   block-­‐based	
   data	
   movement	
  
                                                                                                         method	
   in	
   moving	
   large	
   scien8fic	
   datasets.	
   We	
   have	
   implemented	
   MemzNet	
   that	
   takes	
   the	
  
                                                                                                         approach	
  of	
  aggrega8ng	
  files	
  into	
  blocks	
  and	
  providing	
  dynamic	
  data	
  channel	
  management.	
  
                                                                                                         We	
  present	
  our	
  ini8al	
  results	
  in	
  100Gbps	
  networks.	
  	
  

                                                                                                                                                                          SC11 100Gbps Demo Configuration	

                                                                                                   Climate	
  Data-­‐file	
  characterisGcs	
  
                                                                                                   ²  Many	
  small	
  files	
  
                                                                                                   ²  One	
  of	
  the	
  fastest	
  growing	
  scien8fic	
  datasets	
  
                                                                                                   ²  Distributed	
  among	
  many	
  research	
  ins8tu8ons	
  
                                                                                                       around	
  the	
  world	
                                                                                                                                                                                                                                                                                                           Features	
  
                                                                                                   ²  Requires	
  high-­‐performance	
  data	
  replica8on.	
  
                      Measurement	
  in	
  ANI	
  100Gbps	
  Testbed	
                                                                                                                                                                                                                                                                                                    •  Data	
   files	
   are	
   aggregated	
   and	
   divided	
   into	
   simple	
   blocks.	
   	
   Blocks	
   are	
  
                                                                                                      File	
   size	
   distribu8on	
   in	
   IPCC	
   Fourth	
  
                                                                                                      Assessment	
   Report	
   (AR4)	
   phase	
   3,	
   the	
                                                                                                                                                                                                             tagged	
   and	
   streamed	
   over	
   the	
   network.	
   Each	
   data	
   block’s	
   tag	
   includes	
  
                      3	
  hosts,	
  each	
  connected	
  with	
  4	
  10Gbps	
  NICs	
  to	
  
                      100Gbps	
  router	
  
                                                                                                      Coupled	
   Model	
   Intercomparison	
   Project	
                                                                                                                                                                                                                    informa8on	
  about	
  the	
  content	
  inside.	
  	
  
                                                                                                      (CMIP3)	
  
                                                                                                      	
  
                                                                                                                                                                                                                                                                                                                                                                          •  Decouples	
  disk	
  and	
  network	
  IO	
  opera8ons;	
  so,	
  read/write	
  threads	
  can	
  
                                                                                                      	
                                                                                                                                                                                                                                                                     work	
  independently.	
  	
  	
  
                                                                                                      	
                                                           ² Many	
   TCP	
   sockets	
   oversubscribe	
   the	
   network	
   and	
   cause	
   performance	
                                                                                                  •  Implements	
   a	
   memory	
   cache	
   managements	
   system	
   that	
   is	
   accessed	
   in	
  
                                                                                                      	
                                                              degrada8on.	
  	
  
                                                                                                      	
                                                           ² Host	
  system	
  performance	
  could	
  easily	
  be	
  the	
  bocleneck.	
                                                                                                                          blocks.	
   These	
   memory	
   blocks	
   are	
   logically	
   mapped	
   to	
   the	
   memory	
  
                                                                                                                                                                                                                                                                                                                                                                             cache	
  that	
  resides	
  in	
  the	
  remote	
  site.	
  	
  
                                                                                                                                                                              Moving	
  Climate	
  Files	
  Efficiently	
                                                                                                                                                   •  The	
   synchroniza8on	
   of	
   the	
   memory	
   cache	
   is	
   accomplished	
   based	
   on	
  
                                                                                                                                                                                                                                                                                                                                                                             the	
   tag	
   header.	
   Applica8on	
   processes	
   interact	
   with	
   the	
   memory	
  
                                                                                                                                                                                                                                                                                                                                                                             blocks.	
  Enables	
  out-­‐of-­‐order	
  and	
  asynchronous	
  send	
  receive	
  	
  
                                                                                                                                                                                                                                                                                                                                                                          •  MemzNet	
  is	
  	
  is	
  not	
  file-­‐centric.	
  Bookkeeping	
  informa8on	
  is	
  embedded	
  
                                                                                                                                                                                                                                                                                                                                                                             inside	
   each	
   block.	
   Can	
   increase/decrease	
   the	
   number	
   of	
   parallel	
  
                                                                                                                                                                                                                                                                                                                                                                             streams	
  without	
  closing	
  and	
  reopening	
  the	
  	
  data	
  channel.	
  
  ANI testbed 100Gbps (10x10NICs, three hosts): CPU/Interrupts vs the number of

                                                                                                                                                                                                                                                                                                                                                                                                                                    Performance	
  
  concurrent transfers [1, 2, 4, 8, 16, 32 64 concurrent jobs - 5min intervals], TCP buffer size
  is 50M	





                                                                                                                                                                                                                                                                                                                                                    GridFTP                                   MemzNet




                                                                                                                                      Special	
   Thanks	
   Peter	
   Nugent,	
   Zarija	
   Lukic	
   ,	
   Patrick	
   Dorn,	
   Evangelos	
   Chaniotakis,	
   John	
   Christman,	
   Chin	
                                                    SC11	
  demo:	
  GridFTP	
  vs	
  memzNet
                                                                                                                                      Guok,	
   Chris	
   Tracy,	
   Lauren	
   Rotman,	
   Jason	
   Lee,	
   Shane	
   Canon,	
   Tina	
   Declerck,	
   Cary	
   Whitney,	
   Ed	
   Holohan,	
  	
                                                                                                                                         ANI Tetbed: Throughput comparison
                                                                                                                                      Adam	
   Scovel,	
   Linda	
   Winkler,	
   Jason	
   Hill,	
   Doug	
   Fuller,	
   	
   Susan	
   Hicks,	
   Hank	
   Childs,	
   Mark	
   Howison,	
   Aaron	
                          References	
  
                                                                                                                                      Thomas,	
  John	
  Dugan,	
  Gopal	
  Vaswani	
                                                                                                                                            ²  Mehmet	
  Balman	
  et	
  al.,	
  Experiences	
  with	
  100Gbps	
  Network	
  Applica8ons.	
  In	
  Proceedings	
  of	
  the	
  fi0h	
  interna2onal	
  workshop	
  on	
  Data-­‐
                                                                                                                                  	
  
                                                                                                                            Acknowledgements:	
   This	
   work	
   was	
   supported	
   by	
   the	
   Director,	
   Office	
   of	
   Science,	
   Office	
   of	
   Basic	
   Energy	
   Sciences,	
   of	
   the	
   U.S.	
         Intensive	
   Distributed	
   Compu2ng,	
   in	
   conjunc8on	
   with	
   the	
   ACM	
   Symposium	
   on	
   High-­‐Performance	
   Parallel	
   and	
   Distributed	
   Compu8ng	
  
(a) total throughput vs. the number of concurrent memory-to-memory transfers, (b) interface traffic, packages per second     Department	
   of	
   Energy	
   under	
   Contract	
   No.	
   DE-­‐AC02-­‐05CH11231.	
   This	
   research	
   used	
   resources	
   of	
   the	
   ESnet	
   Advanced	
   Network	
                  (HPDC’12),	
  June	
  2012.	
  
(blue) and bytes per second, over a single NIC with different number of concurrent transfers. Each peak represents a        Ini8a8ve	
  (ANI)	
  Testbed,	
  which	
  is	
  supported	
  by	
  the	
  Office	
  of	
  Science	
  of	
  the	
  U.S.	
  Department	
  of	
  Energy	
  under	
  the	
  contract	
  above,	
  
                                                                                                                            funded	
  through	
  the	
  The	
  American	
  Recovery	
  and	
  Reinvestment	
  Act	
  of	
  2009	
                                                                                                ²  Mehmet	
  	
  Balman,	
  Streaming	
  Exa-­‐scale	
  data	
  over	
  100Gbps	
  Networks,	
  IEEE	
  Compu8ng	
  Now,	
  Oct	
  2012.	
  
different test; 1, 2, 4, 8, 16, 32, 64 concurrent streams per job were initiated for 5min intervals	
  

More Related Content

More from balmanme

Experiences with High-bandwidth Networks
Experiences with High-bandwidth NetworksExperiences with High-bandwidth Networks
Experiences with High-bandwidth Networksbalmanme
 
A 100 gigabit highway for science: researchers take a 'test drive' on ani tes...
A 100 gigabit highway for science: researchers take a 'test drive' on ani tes...A 100 gigabit highway for science: researchers take a 'test drive' on ani tes...
A 100 gigabit highway for science: researchers take a 'test drive' on ani tes...balmanme
 
Balman stork cw09
Balman stork cw09Balman stork cw09
Balman stork cw09balmanme
 
Available technologies: algorithm for flexible bandwidth reservations for dat...
Available technologies: algorithm for flexible bandwidth reservations for dat...Available technologies: algorithm for flexible bandwidth reservations for dat...
Available technologies: algorithm for flexible bandwidth reservations for dat...balmanme
 
Berkeley lab team develops flexible reservation algorithm for advance network...
Berkeley lab team develops flexible reservation algorithm for advance network...Berkeley lab team develops flexible reservation algorithm for advance network...
Berkeley lab team develops flexible reservation algorithm for advance network...balmanme
 
Dynamic adaptation balman
Dynamic adaptation balmanDynamic adaptation balman
Dynamic adaptation balmanbalmanme
 
Nersc dtn-perf-100121.test_results-nercmeeting-jan21-2010
Nersc dtn-perf-100121.test_results-nercmeeting-jan21-2010Nersc dtn-perf-100121.test_results-nercmeeting-jan21-2010
Nersc dtn-perf-100121.test_results-nercmeeting-jan21-2010balmanme
 
Cybertools stork-2009-cybertools allhandmeeting-poster
Cybertools stork-2009-cybertools allhandmeeting-posterCybertools stork-2009-cybertools allhandmeeting-poster
Cybertools stork-2009-cybertools allhandmeeting-posterbalmanme
 
Presentation summerstudent 2009-aug09-lbl-summer
Presentation summerstudent 2009-aug09-lbl-summerPresentation summerstudent 2009-aug09-lbl-summer
Presentation summerstudent 2009-aug09-lbl-summerbalmanme
 
Lblc sseminar jun09-2009-jun09-lblcsseminar
Lblc sseminar jun09-2009-jun09-lblcsseminarLblc sseminar jun09-2009-jun09-lblcsseminar
Lblc sseminar jun09-2009-jun09-lblcsseminarbalmanme
 
Presentation southernstork 2009-nov-southernworkshop
Presentation southernstork 2009-nov-southernworkshopPresentation southernstork 2009-nov-southernworkshop
Presentation southernstork 2009-nov-southernworkshopbalmanme
 
Balman dissertation Copyright @ 2010 Mehmet Balman
Balman dissertation Copyright @ 2010 Mehmet BalmanBalman dissertation Copyright @ 2010 Mehmet Balman
Balman dissertation Copyright @ 2010 Mehmet Balmanbalmanme
 
Aug17presentation.v2 2009-aug09-lblc sseminar
Aug17presentation.v2 2009-aug09-lblc sseminarAug17presentation.v2 2009-aug09-lblc sseminar
Aug17presentation.v2 2009-aug09-lblc sseminarbalmanme
 
Pdcs2010 balman-presentation
Pdcs2010 balman-presentationPdcs2010 balman-presentation
Pdcs2010 balman-presentationbalmanme
 
Analyzing Data Movements and Identifying Techniques for Next-generation Networks
Analyzing Data Movements and Identifying Techniques for Next-generation NetworksAnalyzing Data Movements and Identifying Techniques for Next-generation Networks
Analyzing Data Movements and Identifying Techniques for Next-generation Networksbalmanme
 
MemzNet: Memory-Mapped Zero-copy Network Channel -- Streaming exascala data o...
MemzNet: Memory-Mapped Zero-copy Network Channel -- Streaming exascala data o...MemzNet: Memory-Mapped Zero-copy Network Channel -- Streaming exascala data o...
MemzNet: Memory-Mapped Zero-copy Network Channel -- Streaming exascala data o...balmanme
 
Opening ndm2012 sc12
Opening ndm2012 sc12Opening ndm2012 sc12
Opening ndm2012 sc12balmanme
 
Sc10 nov16th-flex res-presentation
Sc10 nov16th-flex res-presentation Sc10 nov16th-flex res-presentation
Sc10 nov16th-flex res-presentation balmanme
 
Balman climate-c sc-ads-2011
Balman climate-c sc-ads-2011Balman climate-c sc-ads-2011
Balman climate-c sc-ads-2011balmanme
 
Welcome ndm11
Welcome ndm11Welcome ndm11
Welcome ndm11balmanme
 

More from balmanme (20)

Experiences with High-bandwidth Networks
Experiences with High-bandwidth NetworksExperiences with High-bandwidth Networks
Experiences with High-bandwidth Networks
 
A 100 gigabit highway for science: researchers take a 'test drive' on ani tes...
A 100 gigabit highway for science: researchers take a 'test drive' on ani tes...A 100 gigabit highway for science: researchers take a 'test drive' on ani tes...
A 100 gigabit highway for science: researchers take a 'test drive' on ani tes...
 
Balman stork cw09
Balman stork cw09Balman stork cw09
Balman stork cw09
 
Available technologies: algorithm for flexible bandwidth reservations for dat...
Available technologies: algorithm for flexible bandwidth reservations for dat...Available technologies: algorithm for flexible bandwidth reservations for dat...
Available technologies: algorithm for flexible bandwidth reservations for dat...
 
Berkeley lab team develops flexible reservation algorithm for advance network...
Berkeley lab team develops flexible reservation algorithm for advance network...Berkeley lab team develops flexible reservation algorithm for advance network...
Berkeley lab team develops flexible reservation algorithm for advance network...
 
Dynamic adaptation balman
Dynamic adaptation balmanDynamic adaptation balman
Dynamic adaptation balman
 
Nersc dtn-perf-100121.test_results-nercmeeting-jan21-2010
Nersc dtn-perf-100121.test_results-nercmeeting-jan21-2010Nersc dtn-perf-100121.test_results-nercmeeting-jan21-2010
Nersc dtn-perf-100121.test_results-nercmeeting-jan21-2010
 
Cybertools stork-2009-cybertools allhandmeeting-poster
Cybertools stork-2009-cybertools allhandmeeting-posterCybertools stork-2009-cybertools allhandmeeting-poster
Cybertools stork-2009-cybertools allhandmeeting-poster
 
Presentation summerstudent 2009-aug09-lbl-summer
Presentation summerstudent 2009-aug09-lbl-summerPresentation summerstudent 2009-aug09-lbl-summer
Presentation summerstudent 2009-aug09-lbl-summer
 
Lblc sseminar jun09-2009-jun09-lblcsseminar
Lblc sseminar jun09-2009-jun09-lblcsseminarLblc sseminar jun09-2009-jun09-lblcsseminar
Lblc sseminar jun09-2009-jun09-lblcsseminar
 
Presentation southernstork 2009-nov-southernworkshop
Presentation southernstork 2009-nov-southernworkshopPresentation southernstork 2009-nov-southernworkshop
Presentation southernstork 2009-nov-southernworkshop
 
Balman dissertation Copyright @ 2010 Mehmet Balman
Balman dissertation Copyright @ 2010 Mehmet BalmanBalman dissertation Copyright @ 2010 Mehmet Balman
Balman dissertation Copyright @ 2010 Mehmet Balman
 
Aug17presentation.v2 2009-aug09-lblc sseminar
Aug17presentation.v2 2009-aug09-lblc sseminarAug17presentation.v2 2009-aug09-lblc sseminar
Aug17presentation.v2 2009-aug09-lblc sseminar
 
Pdcs2010 balman-presentation
Pdcs2010 balman-presentationPdcs2010 balman-presentation
Pdcs2010 balman-presentation
 
Analyzing Data Movements and Identifying Techniques for Next-generation Networks
Analyzing Data Movements and Identifying Techniques for Next-generation NetworksAnalyzing Data Movements and Identifying Techniques for Next-generation Networks
Analyzing Data Movements and Identifying Techniques for Next-generation Networks
 
MemzNet: Memory-Mapped Zero-copy Network Channel -- Streaming exascala data o...
MemzNet: Memory-Mapped Zero-copy Network Channel -- Streaming exascala data o...MemzNet: Memory-Mapped Zero-copy Network Channel -- Streaming exascala data o...
MemzNet: Memory-Mapped Zero-copy Network Channel -- Streaming exascala data o...
 
Opening ndm2012 sc12
Opening ndm2012 sc12Opening ndm2012 sc12
Opening ndm2012 sc12
 
Sc10 nov16th-flex res-presentation
Sc10 nov16th-flex res-presentation Sc10 nov16th-flex res-presentation
Sc10 nov16th-flex res-presentation
 
Balman climate-c sc-ads-2011
Balman climate-c sc-ads-2011Balman climate-c sc-ads-2011
Balman climate-c sc-ads-2011
 
Welcome ndm11
Welcome ndm11Welcome ndm11
Welcome ndm11
 

Recently uploaded

Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024The Digital Insurer
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 

Recently uploaded (20)

Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 

MemzNet: Memory-Mapped Zero-copy Network Channel for Moving Large Datasets over 100Gbps Networks

  • 1. MemzNet:  Memory-­‐Mapped  Zero-­‐copy  Network  Channel  for  Moving  Large  Datasets  over  100Gbps  Networks   Mehmet  Balman    (  mbalman@lbl.gov)     Computa8onal  Research  Division,  Lawrence  Berkeley  Na8onal  Laboratory   MemzNet’s  Architecture  for  data  streaming   Collaborators:  Eric  Pouyoul,  Yushu  Yao,  E.  Wes  Bethel,  Burlen  Loring,  Prabhat,  John  Shalf,  Alex  Sim,     Arie  Shoshani,  Dean  N.  Williams,  Brian  L.  Tierney   Memory-­‐mapped  Network  Channel  Framework     Increasing   the   bandwidth   is   not   sufficient   by   itself;   we   need   careful   evalua8on   of   future   high-­‐ bandwidth  networks  from  the  applica8ons'  perspec8ve.  We  require  enhancements  in  current   middleware   tools   to   take   advantage   of   future   networking   frameworks.   To   improve   performance   and   efficiency,   we   develop   an   experimental   prototype,   called   MemzNet:   Memory-­‐mapped   Zero-­‐copy   Network   Channel,   which   uses   a   block-­‐based   data   movement   method   in   moving   large   scien8fic   datasets.   We   have   implemented   MemzNet   that   takes   the   approach  of  aggrega8ng  files  into  blocks  and  providing  dynamic  data  channel  management.   We  present  our  ini8al  results  in  100Gbps  networks.     SC11 100Gbps Demo Configuration Climate  Data-­‐file  characterisGcs   ²  Many  small  files   ²  One  of  the  fastest  growing  scien8fic  datasets   ²  Distributed  among  many  research  ins8tu8ons   around  the  world   Features   ²  Requires  high-­‐performance  data  replica8on.   Measurement  in  ANI  100Gbps  Testbed   •  Data   files   are   aggregated   and   divided   into   simple   blocks.     Blocks   are   File   size   distribu8on   in   IPCC   Fourth   Assessment   Report   (AR4)   phase   3,   the   tagged   and   streamed   over   the   network.   Each   data   block’s   tag   includes   3  hosts,  each  connected  with  4  10Gbps  NICs  to   100Gbps  router   Coupled   Model   Intercomparison   Project   informa8on  about  the  content  inside.     (CMIP3)     •  Decouples  disk  and  network  IO  opera8ons;  so,  read/write  threads  can     work  independently.         ² Many   TCP   sockets   oversubscribe   the   network   and   cause   performance   •  Implements   a   memory   cache   managements   system   that   is   accessed   in     degrada8on.       ² Host  system  performance  could  easily  be  the  bocleneck.   blocks.   These   memory   blocks   are   logically   mapped   to   the   memory   cache  that  resides  in  the  remote  site.     Moving  Climate  Files  Efficiently   •  The   synchroniza8on   of   the   memory   cache   is   accomplished   based   on   the   tag   header.   Applica8on   processes   interact   with   the   memory   blocks.  Enables  out-­‐of-­‐order  and  asynchronous  send  receive     •  MemzNet  is    is  not  file-­‐centric.  Bookkeeping  informa8on  is  embedded   inside   each   block.   Can   increase/decrease   the   number   of   parallel   streams  without  closing  and  reopening  the    data  channel.   ANI testbed 100Gbps (10x10NICs, three hosts): CPU/Interrupts vs the number of Performance   concurrent transfers [1, 2, 4, 8, 16, 32 64 concurrent jobs - 5min intervals], TCP buffer size is 50M GridFTP MemzNet Special   Thanks   Peter   Nugent,   Zarija   Lukic   ,   Patrick   Dorn,   Evangelos   Chaniotakis,   John   Christman,   Chin   SC11  demo:  GridFTP  vs  memzNet Guok,   Chris   Tracy,   Lauren   Rotman,   Jason   Lee,   Shane   Canon,   Tina   Declerck,   Cary   Whitney,   Ed   Holohan,     ANI Tetbed: Throughput comparison Adam   Scovel,   Linda   Winkler,   Jason   Hill,   Doug   Fuller,     Susan   Hicks,   Hank   Childs,   Mark   Howison,   Aaron   References   Thomas,  John  Dugan,  Gopal  Vaswani   ²  Mehmet  Balman  et  al.,  Experiences  with  100Gbps  Network  Applica8ons.  In  Proceedings  of  the  fi0h  interna2onal  workshop  on  Data-­‐   Acknowledgements:   This   work   was   supported   by   the   Director,   Office   of   Science,   Office   of   Basic   Energy   Sciences,   of   the   U.S.   Intensive   Distributed   Compu2ng,   in   conjunc8on   with   the   ACM   Symposium   on   High-­‐Performance   Parallel   and   Distributed   Compu8ng   (a) total throughput vs. the number of concurrent memory-to-memory transfers, (b) interface traffic, packages per second Department   of   Energy   under   Contract   No.   DE-­‐AC02-­‐05CH11231.   This   research   used   resources   of   the   ESnet   Advanced   Network   (HPDC’12),  June  2012.   (blue) and bytes per second, over a single NIC with different number of concurrent transfers. Each peak represents a Ini8a8ve  (ANI)  Testbed,  which  is  supported  by  the  Office  of  Science  of  the  U.S.  Department  of  Energy  under  the  contract  above,   funded  through  the  The  American  Recovery  and  Reinvestment  Act  of  2009   ²  Mehmet    Balman,  Streaming  Exa-­‐scale  data  over  100Gbps  Networks,  IEEE  Compu8ng  Now,  Oct  2012.   different test; 1, 2, 4, 8, 16, 32, 64 concurrent streams per job were initiated for 5min intervals