Stork 1.0 and Beyond
Data Scheduling for Large-scale Collaborative Science
Mehmet Balman
Louisiana State University, Baton Rouge, LA, USA
Presented at Condor Week 2009 April 20-April 23, 2009
Scheduling Data Placement Jobs
• Data Placement Activities
• Modular Architecture
– Data Transfer Modules for specific protocols/services
• Throttle the maximum number of transfer operations running
• Keep a log of data placement activities
• Add fault tolerance to data transfers
Job Submission
[
dest_url = "gsiftp://eric1.loni.org/scratch/user/";
arguments = "-p 4 -dbg -vb";
src_url = "file:///home/user/test/";
dap_type = "transfer";
verify_checksum = true;
verify_filesize = true;
set_permission = "755";
recursive_copy = true;
network_check = true;
checkpoint_transfer = true;
output = "user.out";
err = "user.err";
log = "userjob.log";
]
Agenda
• Error Detection and Error Classification
• Data Transfer Operations
– Dynamic Tuning
– Prediction Service
– Job Aggregation
• Data Migration using Stork
• Practical example in PetaShare Project
• Future Directions
Failure-Awareness
• Dynamic environment:
• Data transfers are prone to frequent failures
• What went wrong during the data transfer?
• No access to the remote resources
• Messages get lost due to system malfunction
• Instead of waiting for a failure to happen:
• Detect possible failures and malfunctioning services
• Search for another data server
• Use an alternate data transfer service
• Classify erroneous cases to make better decisions
Error Detection
• Use Network Exploration Techniques
– Check availability of the remote service
– Resolve host and determine connectivity failures
– Detect available data transfer services
– Should be fast and efficient so as not to burden system/network resources
• Error while transfer is in progress?
– Error_TRANSFER
• Retry or not?
• When to re-initiate the transfer?
• Use alternate options?
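The network exploration steps above can be sketched in a few lines of Python. This is only an illustration, not Stork's implementation: the function name `probe_service` and the three result labels are hypothetical, and a plain TCP connect stands in for a protocol-specific service check.

```python
import socket


def probe_service(host: str, port: int, timeout: float = 2.0) -> str:
    """Cheap pre-transfer check: resolve the host, then try to connect."""
    try:
        # Step 1: resolve the host name (detects DNS-level failures).
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "HOST_UNRESOLVED"
    # Step 2: attempt a TCP connection to each resolved address.
    for family, socktype, proto, _, sockaddr in infos:
        try:
            with socket.socket(family, socktype, proto) as s:
                s.settimeout(timeout)  # keep the probe fast
                s.connect(sockaddr)
                return "SERVICE_AVAILABLE"
        except OSError:
            continue  # refused or timed out; try the next address
    return "SERVICE_UNREACHABLE"
```

A scheduler could run such a probe before dispatching a job and, on anything other than SERVICE_AVAILABLE, postpone the job or look for another server.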
Error Classification
• Recover from Failure
• Retry failed operation
• Postpone scheduling of failed operations
• Early Error Detection
• Initiate transfer when the erroneous condition has recovered
• Or use alternate options
• Data transfer protocols do not always return appropriate error codes
• Use the error messages generated by the data transfer protocol
• A better logging facility and classification
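One way to picture classification from error messages is a small rule table that maps message text to one of the actions listed above (retry, postpone, use an alternate). The patterns below are hypothetical examples chosen for illustration; real transfer protocols emit different messages, and the slides do not specify Stork's actual rules.

```python
import re

# Hypothetical message patterns mapped to the actions named on the slide.
RULES = [
    (re.compile(r"timed out|connection reset", re.I), "RETRY"),
    (re.compile(r"connection refused|server busy", re.I), "POSTPONE"),
    (re.compile(r"protocol not supported|unknown host", re.I), "ALTERNATE"),
]


def classify_error(message: str) -> str:
    """Return the first matching action; default to a plain retry."""
    for pattern, action in RULES:
        if pattern.search(message):
            return action
    return "RETRY"
```

Because the first matching rule wins, more specific patterns should be listed before more general ones.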
Error Reporting
Failure-Aware Scheduling
SCOOP data - Hurricane Gustav simulations
Hundreds of files (250 data transfer operations)
Small (100MB) and large files (1G, 2G)
New Transfer Modules
• Verify the successful completion of the operation by
checking checksum and file size.
• For GridFTP, the Stork transfer module can recover from a
failed operation by restarting from the last transmitted
file. In case of a retry after a failure, the scheduler informs
the transfer module to recover and restart the transfer
using the information from a rescue file created by the
checkpoint-enabled transfer module.
• Replacing Globus RFT (Reliable File Transfer)
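The checkpoint-and-rescue idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the Stork GridFTP module: a local file copy stands in for the network transfer, the rescue file is a JSON list of completed file names, and the names `transfer_with_rescue` and `file_digest` are hypothetical.

```python
import hashlib
import json
import os
import shutil


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def transfer_with_rescue(names, src_dir, dst_dir, rescue_path,
                         copy=shutil.copyfile):
    """Copy files in order, skipping those recorded in the rescue file."""
    done = []
    if os.path.exists(rescue_path):
        with open(rescue_path) as f:
            done = json.load(f)  # files verified before the failure
    moved = []
    for name in names:
        if name in done:
            continue
        src = os.path.join(src_dir, name)
        dst = os.path.join(dst_dir, name)
        copy(src, dst)
        # Verify completion by file size, then by checksum.
        if os.path.getsize(src) != os.path.getsize(dst):
            raise OSError(f"size mismatch for {name}")
        if file_digest(src) != file_digest(dst):
            raise OSError(f"checksum mismatch for {name}")
        done.append(name)
        moved.append(name)
        with open(rescue_path, "w") as f:
            json.dump(done, f)  # checkpoint after every verified file
    if os.path.exists(rescue_path):
        os.remove(rescue_path)  # all files arrived; rescue file not needed
    return moved
```

On a retry, calling the function again with the same rescue path resumes after the last verified file instead of starting over.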
Tuning Data Transfers
• Latency Wall
– Buffer Size Optimization
– Parallel TCP Streams
– Concurrent Transfers
• User-level end-to-end tuning
Parallelism
• (1) the number of parallel data streams connected to a data transfer
service, for increasing the utilization of network bandwidth
• (2) the number of concurrent data transfer operations that are
initiated at the same time, for better utilization of system resources
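As background for buffer size optimization, the usual rule of thumb is the bandwidth-delay product (BDP): the number of bytes that must be in flight to keep a link full. The sketch below computes it and, as a rough approximation only, splits it evenly across parallel streams. The 10 Gbit/s bandwidth is an assumed value; 5.129 ms is the average RTT reported for the LONI experiments in this deck.

```python
def bdp_bytes(bandwidth_bps: float, rtt_seconds: float) -> float:
    """Bandwidth-delay product in bytes (bandwidth given in bits/s)."""
    return bandwidth_bps * rtt_seconds / 8


def per_stream_buffer(bandwidth_bps: float, rtt_seconds: float,
                      streams: int) -> float:
    """Rough per-stream buffer if the window is split evenly over streams."""
    return bdp_bytes(bandwidth_bps, rtt_seconds) / streams


# Assumed link speed of 10 Gbit/s; RTT of 5.129 ms taken from the slides.
print(round(bdp_bytes(10e9, 0.005129)))
print(round(per_stream_buffer(10e9, 0.005129, 4)))
```

This is why parallel streams help when the per-connection buffer is capped below the BDP: several smaller windows together can cover what one window cannot.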
Parameter Estimation
• Come up with a good estimate for the
parallelism level
– Network statistics
– Extra measurement
– Historical data
• Might not reflect the best possible current
settings (Dynamic Environment)
Optimization Service
Dynamic Tuning
Average Throughput using Parallel Streams
Experiments in LONI (www.loni.org) environment - transfer file to QB from Linux machine
Dynamic Setting of Parallel Streams
Experiments in LONI (www.loni.org) environment - transfer file to QB from IBM machine
Dynamic Setting of Parallel Streams
Experiments in LONI (www.loni.org) environment - transfer file to QB from Linux machine
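The slides do not spell out how the number of parallel streams is set dynamically. One plausible approach, shown here purely as an assumption and not as the authors' algorithm, is to keep doubling the stream count while the measured throughput still improves noticeably. The `measure` callback, `max_streams`, and `min_gain` are hypothetical names.

```python
def tune_streams(measure, max_streams=32, min_gain=0.05):
    """Double the stream count while throughput keeps improving.

    measure(n) is supplied by the caller and returns the throughput
    observed with n parallel streams (e.g. from a timed sample transfer).
    """
    best_n, best_tp = 1, measure(1)
    n = 2
    while n <= max_streams:
        tp = measure(n)
        if tp < best_tp * (1 + min_gain):
            break  # gain below threshold: keep the previous setting
        best_n, best_tp = n, tp
        n *= 2
    return best_n
```

Because each step needs a real measurement, a scheme like this trades a few sample transfers for a setting that reflects current network conditions rather than historical data.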
Job Aggregation
• Data placement jobs are combined and processed as a
single transfer job.
• Information about the aggregated job is stored in the job queue and
is tied to a main job that actually performs the transfer
operation, so that it can be queried and reported separately.
• Hence, aggregation is transparent to the user.
• We have seen vast performance improvement, especially
with small data files, simply by combining data placement jobs
based on their source or destination addresses.
– decreasing the amount of protocol usage
– reducing the number of independent network connections
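Aggregation by address can be sketched as a grouping step. The example below groups jobs by their (source host, destination host) pair and caps each batch at a maximum aggregation count; the slides mention grouping by source or destination address, so keying on the pair is a simplifying assumption, and `aggregate_jobs` is a hypothetical name.

```python
from collections import defaultdict
from urllib.parse import urlparse


def aggregate_jobs(jobs, max_count):
    """Group (src_url, dest_url) jobs by host pair, max_count per batch."""
    by_hosts = defaultdict(list)
    for src, dest in jobs:
        key = (urlparse(src).netloc, urlparse(dest).netloc)
        by_hosts[key].append((src, dest))
    batches = []
    for key, group in by_hosts.items():
        # Split each host-pair group into batches of at most max_count jobs.
        for i in range(0, len(group), max_count):
            batches.append((key, group[i:i + max_count]))
    return batches
```

Each batch can then be handed to one transfer operation over a single connection, while the individual jobs remain separately recorded for querying and reporting.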
Job Aggregation
[Figure: total time (sec) versus max aggregation count, with one curve each for a single job at a time and for 2, 4, 8, 16, and 32 parallel jobs]
Experiments on LONI (Louisiana Optical Network Initiative) :
1024 transfer jobs from Ducky to Queenbee (rtt avg 5.129 ms) - 5MB
data file per job
PetaShare
• Distributed Storage for Data Archive
• Global Namespace among distributed resources
• Client tools and interfaces
– Pcommands
– Petashell
– Petafs
– Windows Browser
– Web Portal
• Spans seven Louisiana research institutions
• Manages 300TB of disk storage, 400TB of tape
Broader Impact
Fast and Efficient Data Migration in PetaShare
Future Directions
Stork: Central Scheduling Framework
• Performance bottleneck
– Hundreds of jobs submitted to a single batch scheduler, Stork
• Single point of failure
Future Directions
Distributed Data Scheduling
• Interaction between data schedulers
• Manage data activities with lightweight agents in each site
• Better parameter tuning and reordering of data placement jobs
– Job delegation
– Peer-to-peer data movement
– Data and server striping
– Make use of replicas for multi-source downloads
Questions?
Team:
Tevfik Kosar kosar@cct.lsu.edu
Mehmet Balman balman@cct.lsu.edu
Dengpan Yin dyin@cct.lsu.edu
Jia "Jacob" Cheng jacobch@cct.lsu.edu
www.petashare.org www.cybertools.loni.org www.storkproject.org www.cct.lsu.edu

Mais conteúdo relacionado

Mais procurados

Globus status and publication plans
Globus status and publication plansGlobus status and publication plans
Globus status and publication plansIan Foster
 
GlobusWorld 2020 Keynote
GlobusWorld 2020 KeynoteGlobusWorld 2020 Keynote
GlobusWorld 2020 KeynoteGlobus
 
20090701 Climate Data Staging
20090701 Climate Data Staging20090701 Climate Data Staging
20090701 Climate Data StagingHenning Bergmeyer
 
Coding the Continuum
Coding the ContinuumCoding the Continuum
Coding the ContinuumIan Foster
 
Foundations for the Future of Science
Foundations for the Future of ScienceFoundations for the Future of Science
Foundations for the Future of ScienceGlobus
 
Cenitpede: Analyzing Webcrawl
Cenitpede: Analyzing WebcrawlCenitpede: Analyzing Webcrawl
Cenitpede: Analyzing WebcrawlPrimal Pappachan
 
Grid Computing July 2009
Grid Computing July 2009Grid Computing July 2009
Grid Computing July 2009Ian Foster
 
Enabling Secure Data Discoverability (SC21 Tutorial)
Enabling Secure Data Discoverability (SC21 Tutorial)Enabling Secure Data Discoverability (SC21 Tutorial)
Enabling Secure Data Discoverability (SC21 Tutorial)Globus
 
empirical analysis modeling of power dissipation control in internet data ce...
 empirical analysis modeling of power dissipation control in internet data ce... empirical analysis modeling of power dissipation control in internet data ce...
empirical analysis modeling of power dissipation control in internet data ce...saadjamil31
 
Automating Research Data Management at Scale with Globus
Automating Research Data Management at Scale with GlobusAutomating Research Data Management at Scale with Globus
Automating Research Data Management at Scale with GlobusGlobus
 
Globus publication demo screenshots
Globus publication demo screenshotsGlobus publication demo screenshots
Globus publication demo screenshotsIan Foster
 
A Data Ecosystem to Support Machine Learning in Materials Science
A Data Ecosystem to Support Machine Learning in Materials ScienceA Data Ecosystem to Support Machine Learning in Materials Science
A Data Ecosystem to Support Machine Learning in Materials ScienceGlobus
 
What's New in Globus - Internet2 TechEXtra
What's New in Globus - Internet2 TechEXtraWhat's New in Globus - Internet2 TechEXtra
What's New in Globus - Internet2 TechEXtraGlobus
 
Log everything! @DC13
Log everything! @DC13Log everything! @DC13
Log everything! @DC13DECK36
 

Mais procurados (20)

Globus status and publication plans
Globus status and publication plansGlobus status and publication plans
Globus status and publication plans
 
GlobusWorld 2020 Keynote
GlobusWorld 2020 KeynoteGlobusWorld 2020 Keynote
GlobusWorld 2020 Keynote
 
20090701 Climate Data Staging
20090701 Climate Data Staging20090701 Climate Data Staging
20090701 Climate Data Staging
 
SomeSlides
SomeSlidesSomeSlides
SomeSlides
 
Coding the Continuum
Coding the ContinuumCoding the Continuum
Coding the Continuum
 
Foundations for the Future of Science
Foundations for the Future of ScienceFoundations for the Future of Science
Foundations for the Future of Science
 
Cenitpede: Analyzing Webcrawl
Cenitpede: Analyzing WebcrawlCenitpede: Analyzing Webcrawl
Cenitpede: Analyzing Webcrawl
 
Grid Computing July 2009
Grid Computing July 2009Grid Computing July 2009
Grid Computing July 2009
 
Enabling Secure Data Discoverability (SC21 Tutorial)
Enabling Secure Data Discoverability (SC21 Tutorial)Enabling Secure Data Discoverability (SC21 Tutorial)
Enabling Secure Data Discoverability (SC21 Tutorial)
 
empirical analysis modeling of power dissipation control in internet data ce...
 empirical analysis modeling of power dissipation control in internet data ce... empirical analysis modeling of power dissipation control in internet data ce...
empirical analysis modeling of power dissipation control in internet data ce...
 
The DBpedia databus
The DBpedia databusThe DBpedia databus
The DBpedia databus
 
hadoop
hadoophadoop
hadoop
 
Automating Research Data Management at Scale with Globus
Automating Research Data Management at Scale with GlobusAutomating Research Data Management at Scale with Globus
Automating Research Data Management at Scale with Globus
 
"Volunteer Computing With Boinc" por Diamantino Cruz e Ricardo Madeira
"Volunteer Computing With Boinc" por Diamantino Cruz e Ricardo Madeira"Volunteer Computing With Boinc" por Diamantino Cruz e Ricardo Madeira
"Volunteer Computing With Boinc" por Diamantino Cruz e Ricardo Madeira
 
contentDM
contentDMcontentDM
contentDM
 
Globus publication demo screenshots
Globus publication demo screenshotsGlobus publication demo screenshots
Globus publication demo screenshots
 
A Data Ecosystem to Support Machine Learning in Materials Science
A Data Ecosystem to Support Machine Learning in Materials ScienceA Data Ecosystem to Support Machine Learning in Materials Science
A Data Ecosystem to Support Machine Learning in Materials Science
 
Understanding Big Data Platform from Patents
Understanding Big Data Platform from PatentsUnderstanding Big Data Platform from Patents
Understanding Big Data Platform from Patents
 
What's New in Globus - Internet2 TechEXtra
What's New in Globus - Internet2 TechEXtraWhat's New in Globus - Internet2 TechEXtra
What's New in Globus - Internet2 TechEXtra
 
Log everything! @DC13
Log everything! @DC13Log everything! @DC13
Log everything! @DC13
 

Destaque

Aug17presentation.v2 2009-aug09-lblc sseminar
Aug17presentation.v2 2009-aug09-lblc sseminarAug17presentation.v2 2009-aug09-lblc sseminar
Aug17presentation.v2 2009-aug09-lblc sseminarbalmanme
 
Lblc sseminar jun09-2009-jun09-lblcsseminar
Lblc sseminar jun09-2009-jun09-lblcsseminarLblc sseminar jun09-2009-jun09-lblcsseminar
Lblc sseminar jun09-2009-jun09-lblcsseminarbalmanme
 
Presentation southernstork 2009-nov-southernworkshop
Presentation southernstork 2009-nov-southernworkshopPresentation southernstork 2009-nov-southernworkshop
Presentation southernstork 2009-nov-southernworkshopbalmanme
 
Pdcs2010 balman-presentation
Pdcs2010 balman-presentationPdcs2010 balman-presentation
Pdcs2010 balman-presentationbalmanme
 
Nersc dtn-perf-100121.test_results-nercmeeting-jan21-2010
Nersc dtn-perf-100121.test_results-nercmeeting-jan21-2010Nersc dtn-perf-100121.test_results-nercmeeting-jan21-2010
Nersc dtn-perf-100121.test_results-nercmeeting-jan21-2010balmanme
 
Presentation summerstudent 2009-aug09-lbl-summer
Presentation summerstudent 2009-aug09-lbl-summerPresentation summerstudent 2009-aug09-lbl-summer
Presentation summerstudent 2009-aug09-lbl-summerbalmanme
 
Sc10 nov16th-flex res-presentation
Sc10 nov16th-flex res-presentation Sc10 nov16th-flex res-presentation
Sc10 nov16th-flex res-presentation balmanme
 

Destaque (7)

Aug17presentation.v2 2009-aug09-lblc sseminar
Aug17presentation.v2 2009-aug09-lblc sseminarAug17presentation.v2 2009-aug09-lblc sseminar
Aug17presentation.v2 2009-aug09-lblc sseminar
 
Lblc sseminar jun09-2009-jun09-lblcsseminar
Lblc sseminar jun09-2009-jun09-lblcsseminarLblc sseminar jun09-2009-jun09-lblcsseminar
Lblc sseminar jun09-2009-jun09-lblcsseminar
 
Presentation southernstork 2009-nov-southernworkshop
Presentation southernstork 2009-nov-southernworkshopPresentation southernstork 2009-nov-southernworkshop
Presentation southernstork 2009-nov-southernworkshop
 
Pdcs2010 balman-presentation
Pdcs2010 balman-presentationPdcs2010 balman-presentation
Pdcs2010 balman-presentation
 
Nersc dtn-perf-100121.test_results-nercmeeting-jan21-2010
Nersc dtn-perf-100121.test_results-nercmeeting-jan21-2010Nersc dtn-perf-100121.test_results-nercmeeting-jan21-2010
Nersc dtn-perf-100121.test_results-nercmeeting-jan21-2010
 
Presentation summerstudent 2009-aug09-lbl-summer
Presentation summerstudent 2009-aug09-lbl-summerPresentation summerstudent 2009-aug09-lbl-summer
Presentation summerstudent 2009-aug09-lbl-summer
 
Sc10 nov16th-flex res-presentation
Sc10 nov16th-flex res-presentation Sc10 nov16th-flex res-presentation
Sc10 nov16th-flex res-presentation
 

Semelhante a Balman stork cw09

An Overview of VIEW
An Overview of VIEWAn Overview of VIEW
An Overview of VIEWShiyong Lu
 
M.Phil Computer Science Cloud Computing Projects
M.Phil Computer Science Cloud Computing ProjectsM.Phil Computer Science Cloud Computing Projects
M.Phil Computer Science Cloud Computing ProjectsVijay Karan
 
M.Phil Computer Science Cloud Computing Projects
M.Phil Computer Science Cloud Computing ProjectsM.Phil Computer Science Cloud Computing Projects
M.Phil Computer Science Cloud Computing ProjectsVijay Karan
 
M.E Computer Science Cloud Computing Projects
M.E Computer Science Cloud Computing ProjectsM.E Computer Science Cloud Computing Projects
M.E Computer Science Cloud Computing ProjectsVijay Karan
 
IRJET- A Survey on Remote Data Possession Verification Protocol in Cloud Storage
IRJET- A Survey on Remote Data Possession Verification Protocol in Cloud StorageIRJET- A Survey on Remote Data Possession Verification Protocol in Cloud Storage
IRJET- A Survey on Remote Data Possession Verification Protocol in Cloud StorageIRJET Journal
 
Science cloud foster june 2013
Science cloud foster june 2013Science cloud foster june 2013
Science cloud foster june 2013Kirill Osipov
 
Opal: Simple Web Services Wrappers for Scientific Applications
Opal: Simple Web Services Wrappers for Scientific ApplicationsOpal: Simple Web Services Wrappers for Scientific Applications
Opal: Simple Web Services Wrappers for Scientific ApplicationsSriram Krishnan
 
Science as a Service: How On-Demand Computing can Accelerate Discovery
Science as a Service: How On-Demand Computing can Accelerate DiscoveryScience as a Service: How On-Demand Computing can Accelerate Discovery
Science as a Service: How On-Demand Computing can Accelerate DiscoveryIan Foster
 
Data Mobility Exhibition
Data Mobility ExhibitionData Mobility Exhibition
Data Mobility ExhibitionGlobus
 
So Long Computer Overlords
So Long Computer OverlordsSo Long Computer Overlords
So Long Computer OverlordsIan Foster
 
Estimating the Total Costs of Your Cloud Analytics Platform
Estimating the Total Costs of Your Cloud Analytics PlatformEstimating the Total Costs of Your Cloud Analytics Platform
Estimating the Total Costs of Your Cloud Analytics PlatformDATAVERSITY
 
Scalable scheduling of updates in streaming data warehouses
Scalable scheduling of updates in streaming data warehousesScalable scheduling of updates in streaming data warehouses
Scalable scheduling of updates in streaming data warehousesFinalyear Projects
 
REAL TIME PROJECTS IEEE BASED PROJECTS EMBEDDED SYSTEMS PAPER PUBLICATIONS M...
REAL TIME PROJECTS  IEEE BASED PROJECTS EMBEDDED SYSTEMS PAPER PUBLICATIONS M...REAL TIME PROJECTS  IEEE BASED PROJECTS EMBEDDED SYSTEMS PAPER PUBLICATIONS M...
REAL TIME PROJECTS IEEE BASED PROJECTS EMBEDDED SYSTEMS PAPER PUBLICATIONS M...Finalyear Projects
 
Venkatachandu rajana
Venkatachandu rajanaVenkatachandu rajana
Venkatachandu rajanarajanachandu
 
Cloud-Native Patterns for Data-Intensive Applications
Cloud-Native Patterns for Data-Intensive ApplicationsCloud-Native Patterns for Data-Intensive Applications
Cloud-Native Patterns for Data-Intensive ApplicationsVMware Tanzu
 
IRJET- Big Data Processes and Analysis using Hadoop Framework
IRJET- Big Data Processes and Analysis using Hadoop FrameworkIRJET- Big Data Processes and Analysis using Hadoop Framework
IRJET- Big Data Processes and Analysis using Hadoop FrameworkIRJET Journal
 
Clearing Airflow Obstructions
Clearing Airflow ObstructionsClearing Airflow Obstructions
Clearing Airflow ObstructionsTatiana Al-Chueyr
 
Real time data viz with Spark Streaming, Kafka and D3.js
Real time data viz with Spark Streaming, Kafka and D3.jsReal time data viz with Spark Streaming, Kafka and D3.js
Real time data viz with Spark Streaming, Kafka and D3.jsBen Laird
 

Semelhante a Balman stork cw09 (20)

An Overview of VIEW
An Overview of VIEWAn Overview of VIEW
An Overview of VIEW
 
M.Phil Computer Science Cloud Computing Projects
M.Phil Computer Science Cloud Computing ProjectsM.Phil Computer Science Cloud Computing Projects
M.Phil Computer Science Cloud Computing Projects
 
M.Phil Computer Science Cloud Computing Projects
M.Phil Computer Science Cloud Computing ProjectsM.Phil Computer Science Cloud Computing Projects
M.Phil Computer Science Cloud Computing Projects
 
Subhabrata Deb Resume
Subhabrata Deb ResumeSubhabrata Deb Resume
Subhabrata Deb Resume
 
M.E Computer Science Cloud Computing Projects
M.E Computer Science Cloud Computing ProjectsM.E Computer Science Cloud Computing Projects
M.E Computer Science Cloud Computing Projects
 
IRJET- A Survey on Remote Data Possession Verification Protocol in Cloud Storage
IRJET- A Survey on Remote Data Possession Verification Protocol in Cloud StorageIRJET- A Survey on Remote Data Possession Verification Protocol in Cloud Storage
IRJET- A Survey on Remote Data Possession Verification Protocol in Cloud Storage
 
Science cloud foster june 2013
Science cloud foster june 2013Science cloud foster june 2013
Science cloud foster june 2013
 
My C.V
My C.VMy C.V
My C.V
 
Opal: Simple Web Services Wrappers for Scientific Applications
Opal: Simple Web Services Wrappers for Scientific ApplicationsOpal: Simple Web Services Wrappers for Scientific Applications
Opal: Simple Web Services Wrappers for Scientific Applications
 
Science as a Service: How On-Demand Computing can Accelerate Discovery
Science as a Service: How On-Demand Computing can Accelerate DiscoveryScience as a Service: How On-Demand Computing can Accelerate Discovery
Science as a Service: How On-Demand Computing can Accelerate Discovery
 
Data Mobility Exhibition
Data Mobility ExhibitionData Mobility Exhibition
Data Mobility Exhibition
 
So Long Computer Overlords
So Long Computer OverlordsSo Long Computer Overlords
So Long Computer Overlords
 
Estimating the Total Costs of Your Cloud Analytics Platform
Estimating the Total Costs of Your Cloud Analytics PlatformEstimating the Total Costs of Your Cloud Analytics Platform
Estimating the Total Costs of Your Cloud Analytics Platform
 
Scalable scheduling of updates in streaming data warehouses
Scalable scheduling of updates in streaming data warehousesScalable scheduling of updates in streaming data warehouses
Scalable scheduling of updates in streaming data warehouses
 
REAL TIME PROJECTS IEEE BASED PROJECTS EMBEDDED SYSTEMS PAPER PUBLICATIONS M...
REAL TIME PROJECTS  IEEE BASED PROJECTS EMBEDDED SYSTEMS PAPER PUBLICATIONS M...REAL TIME PROJECTS  IEEE BASED PROJECTS EMBEDDED SYSTEMS PAPER PUBLICATIONS M...
REAL TIME PROJECTS IEEE BASED PROJECTS EMBEDDED SYSTEMS PAPER PUBLICATIONS M...
 
Venkatachandu rajana
Venkatachandu rajanaVenkatachandu rajana
Venkatachandu rajana
 
Cloud-Native Patterns for Data-Intensive Applications
Cloud-Native Patterns for Data-Intensive ApplicationsCloud-Native Patterns for Data-Intensive Applications
Cloud-Native Patterns for Data-Intensive Applications
 
IRJET- Big Data Processes and Analysis using Hadoop Framework
IRJET- Big Data Processes and Analysis using Hadoop FrameworkIRJET- Big Data Processes and Analysis using Hadoop Framework
IRJET- Big Data Processes and Analysis using Hadoop Framework
 
Clearing Airflow Obstructions
Clearing Airflow ObstructionsClearing Airflow Obstructions
Clearing Airflow Obstructions
 
Real time data viz with Spark Streaming, Kafka and D3.js
Real time data viz with Spark Streaming, Kafka and D3.jsReal time data viz with Spark Streaming, Kafka and D3.js
Real time data viz with Spark Streaming, Kafka and D3.js
 

Mais de balmanme

Network-aware Data Management for Large Scale Distributed Applications, IBM R...
Network-aware Data Management for Large Scale Distributed Applications, IBM R...Network-aware Data Management for Large Scale Distributed Applications, IBM R...
Network-aware Data Management for Large Scale Distributed Applications, IBM R...balmanme
 
Network-aware Data Management for High Throughput Flows Akamai, Cambridge, ...
Network-aware Data Management for High Throughput Flows   Akamai, Cambridge, ...Network-aware Data Management for High Throughput Flows   Akamai, Cambridge, ...
Network-aware Data Management for High Throughput Flows Akamai, Cambridge, ...balmanme
 
Hpcwire100gnetworktosupportbigscience 130725203822-phpapp01-1
Hpcwire100gnetworktosupportbigscience 130725203822-phpapp01-1Hpcwire100gnetworktosupportbigscience 130725203822-phpapp01-1
Hpcwire100gnetworktosupportbigscience 130725203822-phpapp01-1balmanme
 
Experiences with High-bandwidth Networks
Experiences with High-bandwidth NetworksExperiences with High-bandwidth Networks
Experiences with High-bandwidth Networksbalmanme
 
A 100 gigabit highway for science: researchers take a 'test drive' on ani tes...
A 100 gigabit highway for science: researchers take a 'test drive' on ani tes...A 100 gigabit highway for science: researchers take a 'test drive' on ani tes...
A 100 gigabit highway for science: researchers take a 'test drive' on ani tes...balmanme
 
Available technologies: algorithm for flexible bandwidth reservations for dat...
Available technologies: algorithm for flexible bandwidth reservations for dat...Available technologies: algorithm for flexible bandwidth reservations for dat...
Available technologies: algorithm for flexible bandwidth reservations for dat...balmanme
 
Berkeley lab team develops flexible reservation algorithm for advance network...
Berkeley lab team develops flexible reservation algorithm for advance network...Berkeley lab team develops flexible reservation algorithm for advance network...
Berkeley lab team develops flexible reservation algorithm for advance network...balmanme
 
Dynamic adaptation balman
Dynamic adaptation balmanDynamic adaptation balman
Dynamic adaptation balmanbalmanme
 
Cybertools stork-2009-cybertools allhandmeeting-poster
Cybertools stork-2009-cybertools allhandmeeting-posterCybertools stork-2009-cybertools allhandmeeting-poster
Cybertools stork-2009-cybertools allhandmeeting-posterbalmanme
 
Balman dissertation Copyright @ 2010 Mehmet Balman
Balman dissertation Copyright @ 2010 Mehmet BalmanBalman dissertation Copyright @ 2010 Mehmet Balman
Balman dissertation Copyright @ 2010 Mehmet Balmanbalmanme
 
Analyzing Data Movements and Identifying Techniques for Next-generation Networks
Analyzing Data Movements and Identifying Techniques for Next-generation NetworksAnalyzing Data Movements and Identifying Techniques for Next-generation Networks
Analyzing Data Movements and Identifying Techniques for Next-generation Networksbalmanme
 
MemzNet: Memory-Mapped Zero-copy Network Channel -- Streaming exascala data o...
MemzNet: Memory-Mapped Zero-copy Network Channel -- Streaming exascala data o...MemzNet: Memory-Mapped Zero-copy Network Channel -- Streaming exascala data o...
MemzNet: Memory-Mapped Zero-copy Network Channel -- Streaming exascala data o...balmanme
 
Opening ndm2012 sc12
Opening ndm2012 sc12Opening ndm2012 sc12
Opening ndm2012 sc12balmanme
 
Balman climate-c sc-ads-2011
Balman climate-c sc-ads-2011Balman climate-c sc-ads-2011
Balman climate-c sc-ads-2011balmanme
 
Welcome ndm11
Welcome ndm11Welcome ndm11
Welcome ndm11balmanme
 
2011 agu-town hall-100g
2011 agu-town hall-100g2011 agu-town hall-100g
2011 agu-town hall-100gbalmanme
 
Rdma presentation-kisti-v2
Rdma presentation-kisti-v2Rdma presentation-kisti-v2
Rdma presentation-kisti-v2balmanme
 
Streaming exa-scale data over 100Gbps networks
Streaming exa-scale data over 100Gbps networksStreaming exa-scale data over 100Gbps networks
Streaming exa-scale data over 100Gbps networksbalmanme
 
APM project meeting - June 13, 2012 - LBNL, Berkeley, CA
APM project meeting - June 13, 2012 - LBNL, Berkeley, CAAPM project meeting - June 13, 2012 - LBNL, Berkeley, CA
APM project meeting - June 13, 2012 - LBNL, Berkeley, CAbalmanme
 
HPDC 2012 presentation - June 19, 2012 - Delft, The Netherlands
HPDC 2012 presentation - June 19, 2012 -  Delft, The NetherlandsHPDC 2012 presentation - June 19, 2012 -  Delft, The Netherlands
HPDC 2012 presentation - June 19, 2012 - Delft, The Netherlandsbalmanme
 

Mais de balmanme (20)

Network-aware Data Management for Large Scale Distributed Applications, IBM R...
Network-aware Data Management for Large Scale Distributed Applications, IBM R...Network-aware Data Management for Large Scale Distributed Applications, IBM R...
Network-aware Data Management for Large Scale Distributed Applications, IBM R...
 
Network-aware Data Management for High Throughput Flows Akamai, Cambridge, ...
Network-aware Data Management for High Throughput Flows   Akamai, Cambridge, ...Network-aware Data Management for High Throughput Flows   Akamai, Cambridge, ...
Network-aware Data Management for High Throughput Flows Akamai, Cambridge, ...
 
Hpcwire100gnetworktosupportbigscience 130725203822-phpapp01-1
Hpcwire100gnetworktosupportbigscience 130725203822-phpapp01-1Hpcwire100gnetworktosupportbigscience 130725203822-phpapp01-1
Hpcwire100gnetworktosupportbigscience 130725203822-phpapp01-1
 
Experiences with High-bandwidth Networks
Experiences with High-bandwidth NetworksExperiences with High-bandwidth Networks
Experiences with High-bandwidth Networks
 
A 100 gigabit highway for science: researchers take a 'test drive' on ani tes...
A 100 gigabit highway for science: researchers take a 'test drive' on ani tes...A 100 gigabit highway for science: researchers take a 'test drive' on ani tes...
A 100 gigabit highway for science: researchers take a 'test drive' on ani tes...
 
Available technologies: algorithm for flexible bandwidth reservations for dat...
Available technologies: algorithm for flexible bandwidth reservations for dat...Available technologies: algorithm for flexible bandwidth reservations for dat...
Available technologies: algorithm for flexible bandwidth reservations for dat...
 
Berkeley lab team develops flexible reservation algorithm for advance network...
Berkeley lab team develops flexible reservation algorithm for advance network...Berkeley lab team develops flexible reservation algorithm for advance network...
Berkeley lab team develops flexible reservation algorithm for advance network...
 
Balman stork cw09

  • 2. Scheduling Data Placement Jobs
      • Data Placement Activities
      • Modular Architecture
          – Data Transfer Modules for specific protocols/services
      • Throttle maximum transfer operations running
      • Keep a log of data placement activities
      • Add fault tolerance to data transfers
  • 3. Job Submission
      [
        dest_url = "gsiftp://eric1.loni.org/scratch/user/";
        arguments = "-p 4 dbg -vb";
        src_url = "file:///home/user/test/";
        dap_type = "transfer";
        verify_checksum = true;
        verify_filesize = true;
        set_permission = "755";
        recursive_copy = true;
        network_check = true;
        checkpoint_transfer = true;
        output = "user.out";
        err = "user.err";
        log = "userjob.log";
      ]
  • 4. Agenda
      • Error Detection and Error Classification
      • Data Transfer Operations
          – Dynamic Tuning
          – Prediction Service
          – Job Aggregation
      • Data Migration using Stork
      • Practical example in PetaShare Project
      • Future Directions
  • 5. Failure-Awareness
      • Dynamic Environment:
          • data transfers are prone to frequent failures
          • what went wrong during data transfer?
          • No access to the remote resources
          • Messages get lost due to system malfunction
      • Instead of waiting for a failure to happen
          • Detect possible failures and malfunctioning services
          • Search for another data server
          • Alternate data transfer service
      • Classify erroneous cases to make better decisions
  • 6. Error Detection
      • Use Network Exploration Techniques
          – Check availability of the remote service
          – Resolve host and determine connectivity failures
          – Detect available data transfer services
          – Should be fast and efficient so as not to burden system/network resources
      • Error while transfer is in progress?
          – Error_TRANSFER
              • Retry or not?
              • When to re-initiate the transfer
              • Use alternate options?
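The pre-transfer checks listed on this slide (resolve the host, then probe the service) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Stork's actual network_check implementation: the function name check_endpoint and its three return labels are hypothetical.

```python
import socket


def check_endpoint(host: str, port: int, timeout: float = 3.0) -> str:
    """Cheap pre-transfer probe (hypothetical helper, not part of Stork).

    Returns 'resolve_failed' if the name cannot be resolved,
    'connect_failed' if no resolved address accepts a TCP connection,
    and 'ok' otherwise.
    """
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "resolve_failed"

    for family, socktype, proto, _canonname, addr in infos:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)  # keep the probe fast and bounded
                sock.connect(addr)
                return "ok"
        except OSError:
            continue  # try the next resolved address, if any
    return "connect_failed"
```

For a GridFTP destination such as the one in the job description, a call might look like check_endpoint("eric1.loni.org", 2811), where 2811 is the conventional GridFTP control port. A single short TCP connect keeps the probe light on system and network resources.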
  • 7. Error Classification
      • Recover from Failure
          • Retry failed operation
          • Postpone scheduling of a failed operation
      • Early Error Detection
          • Initiate transfer when the erroneous condition has recovered
          • Or use alternate options
      • Data transfer protocols do not always return appropriate error codes
      • Using error messages generated by the data transfer protocol
      • A better logging facility and classification
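Classifying failures from protocol error messages, as this slide suggests, could look roughly like the sketch below. The message patterns, the Action names, and the default of retrying unknown errors are all assumptions made for illustration; they are not taken from Stork or from any specific transfer protocol.

```python
import re
from enum import Enum


class Action(Enum):
    RETRY = "retry"            # transient problem, try again soon
    POSTPONE = "postpone"      # service is down, reschedule later
    ALTERNATE = "alternate"    # try another server or protocol
    FAIL = "fail"              # retrying cannot help


# Hypothetical message patterns; a real deployment would derive these
# from the logs of the transfer tools actually in use.
RULES = [
    (re.compile(r"timed? ?out|connection reset", re.I), Action.RETRY),
    (re.compile(r"connection refused|service unavailable", re.I), Action.POSTPONE),
    (re.compile(r"unknown host|protocol not supported", re.I), Action.ALTERNATE),
    (re.compile(r"permission denied|no such file", re.I), Action.FAIL),
]


def classify(message: str) -> Action:
    """Map a raw error message to a scheduling decision."""
    for pattern, action in RULES:
        if pattern.search(message):
            return action
    return Action.RETRY  # assumed default for unrecognized errors
```

Keeping the rules in a table rather than in code makes it easy to extend the classification as new error messages show up in the logs.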
  • 9. Failure-Aware Scheduling
      Scoop data - Hurricane Gustav simulations
      Hundreds of files (250 data transfer operations)
      Small (100MB) and large files (1G, 2G
  • 10. New Transfer Modules
      • Verify the successful completion of the operation by checking checksum and file size.
      • For GridFTP, the Stork transfer module can recover from a failed operation by restarting from the last transmitted file. In case of a retry after a failure, the scheduler informs the transfer module to recover and restart the transfer using the information from a rescue file created by the checkpoint-enabled transfer module.
      • Replacing Globus RFT (Reliable File Transfer)
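A minimal sketch of the two ideas on this slide, verification by size and checksum and restart from a rescue file, is given below. It assumes both copies are readable as local paths and that the rescue file simply lists one completed file per line; the real rescue-file format and the function names here are hypothetical.

```python
import hashlib
import os


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so large files are never fully loaded."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(src, dst):
    """Compare size first (cheap), then checksum (expensive)."""
    if os.path.getsize(src) != os.path.getsize(dst):
        return False
    return file_digest(src) == file_digest(dst)


def pending_files(all_files, rescue_path):
    """Return the files not yet recorded as completed in the rescue file.

    Assumed format: one completed file name per line.
    """
    done = set()
    if os.path.exists(rescue_path):
        with open(rescue_path) as handle:
            done = {line.strip() for line in handle if line.strip()}
    return [name for name in all_files if name not in done]
```

On a retry, a transfer module following this scheme would call pending_files first and move only what is left, instead of starting the whole directory again.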
  • 11. Agenda
      • Error Detection and Error Classification
      • Data Transfer Operations
          – Dynamic Tuning
          – Prediction Service
          – Job Aggregation
      • Data Migration using Stork
      • Practical example in PetaShare Project
      • Future Directions
  • 12. Tuning Data Transfers
      • Latency Wall
          – Buffer Size Optimization
          – Parallel TCP Streams
          – Concurrent Transfers
      • User level end-to-end Tuning
      Parallelism
      • (1) the number of parallel data streams connected to a data transfer service for increasing the utilization of network bandwidth
      • (2) the number of concurrent data transfer operations that are initiated at the same time for better utilization of system resources.
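Buffer size optimization is usually reasoned about through the bandwidth-delay product (BDP): the amount of data that must be in flight to keep a link full. The sketch below computes it and splits it across parallel streams. The 10 Gbps link speed and 5 ms round-trip time in the usage note are illustrative assumptions, and the helper names are hypothetical.

```python
def bdp_bytes(bandwidth_bits_per_s: float, rtt_s: float) -> int:
    """Bandwidth-delay product in bytes: bandwidth (bit/s) * RTT (s) / 8."""
    return round(bandwidth_bits_per_s * rtt_s / 8)


def per_stream_buffer(total_bytes: int, streams: int) -> int:
    """Split the in-flight window evenly across parallel streams (rounded up)."""
    return -(-total_bytes // streams)
```

For example, bdp_bytes(10e9, 0.005) gives 6,250,000 bytes, so a single stream would need a TCP buffer of roughly 6 MB to fill such a link, while four parallel streams would each need about a quarter of that. This is one reason parallel streams help when per-connection buffers are capped.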
  • 13. Parameter Estimation
      • Come up with a good estimation for the parallelism level
          – Network statistics
          – Extra measurement
          – Historical data
      • Might not reflect the best possible current settings (Dynamic Environment)
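Because estimates from past data may not match current conditions, one alternative is to adapt the parallelism level while transferring. The sketch below shows a simple greedy search, assuming a caller-supplied measure(n) that reports the throughput observed with n streams. It is one possible strategy written for illustration, not necessarily the algorithm used in Stork; the function name and the 5% gain threshold are assumptions.

```python
from typing import Callable


def tune_parallel_streams(measure: Callable[[int], float],
                          max_streams: int = 32,
                          min_gain: float = 0.05) -> int:
    """Add streams one at a time while throughput keeps improving.

    Stops as soon as an extra stream yields less than `min_gain`
    relative improvement over the best rate seen so far.
    """
    best_n = 1
    best_rate = measure(best_n)
    for n in range(2, max_streams + 1):
        rate = measure(n)
        if rate < best_rate * (1.0 + min_gain):
            break  # no meaningful gain: keep the previous level
        best_n, best_rate = n, rate
    return best_n
```

In practice measure would time a chunk of the ongoing transfer at each level, so the search costs no extra probe traffic.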
  • 16. Average Throughput using Parallel Streams
      Experiments in LONI (www.loni.org) environment - transfer file to QB from Linux m/c
  • 17. Dynamic Setting of Parallel Streams
      Experiments in LONI (www.loni.org) environment - transfer file to QB from IBM m/c
  • 18. Dynamic Setting of Parallel Streams
      Experiments in LONI (www.loni.org) environment - transfer file to QB from Linux m/c
  • 19. Job Aggregation
      • Data placement jobs are combined and processed as a single transfer job.
      • Information about the aggregated job is stored in the job queue and is tied to a main job, which actually performs the transfer operation, so that each job can still be queried and reported separately.
      • Hence, aggregation is transparent to the user.
      • We have seen vast performance improvement, especially with small data files, simply by combining data placement jobs based on their source or destination addresses.
          – decreasing the amount of protocol usage
          – reducing the number of independent network connections
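The grouping step described on this slide can be sketched as follows: jobs that share source and destination hosts are bundled into batches of at most max_count, so each batch can reuse one connection. The function name, the (src_url, dest_url) tuple representation, and grouping on both hosts together are assumptions for illustration; the scheduler's real bookkeeping (the main job and per-job status) is omitted.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def aggregate_jobs(jobs, max_count):
    """Group (src_url, dest_url) jobs by host pair, then cut into batches.

    Each returned batch would be handed to one transfer operation.
    """
    groups = defaultdict(list)
    for src_url, dest_url in jobs:
        key = (urlsplit(src_url).netloc, urlsplit(dest_url).netloc)
        groups[key].append((src_url, dest_url))

    batches = []
    for members in groups.values():
        for start in range(0, len(members), max_count):
            batches.append(members[start:start + max_count])
    return batches
```

With max_count set to 1 this degenerates to one job per transfer, which corresponds to running without aggregation.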
  • 20. Job Aggregation
      [Figure: total time (sec) vs. max aggregation count, for a single job at a time and for 2, 4, 8, 16, and 32 parallel jobs]
      Experiments on LONI (Louisiana Optical Network Initiative): 1024 transfer jobs from Ducky to Queenbee (rtt avg 5.129 ms) - 5MB data file per job
  • 21. Agenda
      • Error Detection and Error Classification
      • Data Transfer Operations
          – Dynamic Tuning
          – Prediction Service
          – Job Aggregation
      • Data Migration using Stork
      • Practical example in PetaShare Project
      • Future Directions
  • 22. PetaShare
      • Distributed Storage for Data Archive
      • Global Namespace among distributed resources
      • Client tools and interfaces
          • Pcommands
          • Petashell
          • Petafs
          • Windows Browser
          • Web Portal
      • Spans seven Louisiana research institutions
      • Manages 300TB of disk storage, 400TB of tape
  • 24. Future Directions
      Stork: Central Scheduling Framework
      • Performance bottleneck
          – Hundreds of jobs submitted to a single batch scheduler, Stork
      • Single point of failure
  • 25. Future Directions
      Distributed Data Scheduling
      • Interaction between data schedulers
      • Manage data activities with lightweight agents in each site
      • Better parameter tuning and reordering of data placement jobs
          – Job Delegation
          – peer-to-peer data movement
          – data and server striping
          – make use of replicas for multi-source downloads
  • 26. Questions?
      Team:
      Tevfik Kosar kosar@cct.lsu.edu
      Mehmet Balman balman@cct.lsu.edu
      Dengpan Yin dyin@cct.lsu.edu
      Jia "Jacob" Cheng jacobch@cct.lsu.edu
      www.petashare.org
      www.cybertools.loni.org
      www.storkproject.org
      www.cct.lsu.edu