Task scheduling algorithm for multicore
processor system for minimizing recovery
time in case of single node fault
Shohei Gotoda†, Naoki Shibata‡, Minoru Ito†
†Nara Institute of Science and Technology
‡Shiga University
Background
• Multicore processors
  • Almost all recently designed processors are multicore
• A computing cluster of 1800 nodes experiences about 1000 failures [1]
in the first year after deployment
[1] "Google spotlights data center inner workings," cnet.com, May 30, 2008.
Objective of Research
• Fault tolerance
  • We assume a single fail-stop failure of a multicore processor
• Network contention
  • Taken into account so that generated schedules are reproducible on real systems
Devise a new scheduling method that minimizes recovery time,
taking account of the above points
Task Graph
• A group of tasks that can be
executed in parallel
• Vertex (task node)
Task to be executed on a single
CPU core
• Edge (task link)
Data dependence between tasks
[Figure: task graph, with task nodes connected by task links]
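A minimal sketch of how such a task graph could be held in memory. The class and method names (`TaskGraph`, `add_task`, `add_link`, `ready`) and all costs are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class TaskGraph:
    """Directed acyclic graph: vertices are tasks, edges are data dependences."""
    comp_time: dict = field(default_factory=dict)  # task node -> execution time
    comm_size: dict = field(default_factory=dict)  # (src, dst) task link -> data size
    preds: dict = field(default_factory=dict)      # task node -> predecessor tasks

    def add_task(self, name, time):
        self.comp_time[name] = time
        self.preds.setdefault(name, [])

    def add_link(self, src, dst, size):
        self.comm_size[(src, dst)] = size
        self.preds[dst].append(src)

    def ready(self, finished):
        """Unfinished tasks whose predecessors have all finished."""
        return sorted(t for t in self.comp_time
                      if t not in finished
                      and all(p in finished for p in self.preds[t]))


# Illustrative graph: tasks 1 and 2 can run in parallel, task 3 needs both.
g = TaskGraph()
for name in ("1", "2", "3"):
    g.add_task(name, 1.0)
g.add_link("1", "3", 10)
g.add_link("2", "3", 10)
```

Tasks returned together by `ready` have no dependence on each other, so they are the ones that can be executed in parallel on different cores.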
Processor Graph
• Topology of the computer network
• Vertex (Processor node)
CPU core (circle)
• has only one link
Switch (rectangle)
• has more than 2 links
• Edge (Processor link)
Communication path between
processors
[Figure: processor graph with processor nodes, processor links, and a switch]
Task Scheduling
• Task scheduling problem
  • Assign a processor node to each task node
  • Minimize total execution time
  • This is an NP-hard problem
[Figure: one processor node of the processor graph is assigned to each task node of the task graph]
Inputs and Outputs for Task Scheduling
• Inputs
Task graph and processor graph
• Output
A schedule
• which is an assignment of a processor
node to each task node
• Objective function
Minimize task execution time
[Figure: task graph and processor graph, showing a schedule]
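As an illustration of the inputs, the output, and the objective function, the sketch below evaluates the execution time of a given schedule. It is a simplification under stated assumptions: link contention is ignored, communication inside a die is free, and every transfer between dies costs a fixed delay `comm`. The function name `makespan` and all values are hypothetical.

```python
def makespan(order, comp, preds, core_of, die_of, comm):
    """Finish time of a schedule (no link contention in this sketch).

    order   -- task nodes in a topological order
    comp    -- task -> execution time
    preds   -- task -> list of predecessor tasks
    core_of -- task -> assigned core (this mapping is the schedule)
    die_of  -- core -> die
    comm    -- delay for sending a result from one die to another
    """
    finish = {}
    core_free = {}  # core -> time at which it becomes idle
    for t in order:
        core = core_of[t]
        start = core_free.get(core, 0.0)
        for p in preds[t]:
            delay = 0.0 if die_of[core_of[p]] == die_of[core] else comm
            start = max(start, finish[p] + delay)
        finish[t] = start + comp[t]
        core_free[core] = finish[t]
    return max(finish.values())


comp = {"1": 1.0, "2": 1.0, "3": 1.0}
preds = {"1": [], "2": [], "3": ["1", "2"]}
die_of = {"A": 0, "B": 0, "C": 1, "D": 1}
same_die = {"1": "A", "2": "B", "3": "A"}   # all tasks on one die
spread = {"1": "A", "2": "B", "3": "C"}     # task 3 on the other die

t_same = makespan(["1", "2", "3"], comp, preds, same_die, die_of, comm=0.5)    # 2.0
t_spread = makespan(["1", "2", "3"], comp, preds, spread, die_of, comm=0.5)    # 2.5
```

A scheduler that only minimizes this value prefers `same_die`, because it avoids the transfer between dies.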
Network Contention Model
• Communication delay
A communication is delayed if the processor link is occupied by
another communication
• We use existing network contention
model[2]
[Figure: task graph and processor graph, with contention marked on a processor link]
[2] O. Sinnen and L.A. Sousa, "Communication Contention in
Task Scheduling," IEEE Trans. Parallel and Distributed Systems,
vol. 16, no. 6, pp. 503-515, 2005.
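The delay caused by an occupied link can be sketched as follows. This is a deliberately reduced illustration with a single link that carries one communication at a time. It is not the full model of [2], and the `Link` class, its `transfer` method, and the numbers are assumptions.

```python
class Link:
    """A processor link that carries one communication at a time (simplification)."""

    def __init__(self, bandwidth):
        self.bandwidth = bandwidth  # data units per second
        self.free_at = 0.0          # time at which the link becomes idle

    def transfer(self, ready_time, size):
        """Reserve the link and return (start, finish) of the communication."""
        start = max(ready_time, self.free_at)  # wait while the link is occupied
        finish = start + size / self.bandwidth
        self.free_at = finish
        return start, finish


link = Link(bandwidth=10.0)
first = link.transfer(ready_time=0.0, size=20.0)   # occupies the link from 0 to 2
second = link.transfer(ready_time=1.0, size=10.0)  # ready at 1, but waits until 2
```

Without contention the second communication would finish at time 2. With the link occupied it finishes at time 3, which is the kind of delay a schedule has to account for to be reproducible on a real network.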
Multicore Processor Model
• Each core executes a task
independently from other cores
• Communication between cores
finishes instantaneously
• One network interface is shared
among all cores on a die
• If there is a failure, all cores on a
die stop execution simultaneously
[Figure: a CPU with Core1 and Core2, shown as processor nodes in the processor graph]
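The assumptions of this model can be written down in a few lines. The sketch below is illustrative only. `Core`, `comm_time`, and `cores_stopped` are hypothetical names, and the bandwidth value is made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Core:
    die: int    # die (and shared network interface) the core belongs to
    index: int  # position of the core on its die


def comm_time(src, dst, size, bandwidth):
    """Communication between cores on one die is assumed to be instantaneous."""
    if src.die == dst.die:
        return 0.0
    return size / bandwidth  # goes through the shared network interface


def cores_stopped(cores, failed_die):
    """Fail-stop model: every core on the failed die stops at the same time."""
    return [c for c in cores if c.die == failed_die]


cores = [Core(0, 0), Core(0, 1), Core(1, 0), Core(1, 1)]
```

For example, `comm_time(cores[0], cores[1], 100, 50.0)` is zero, while the same data sent from `cores[0]` to `cores[2]` takes 2 time units.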
Influence of Multicore Processors
• Need for considering multicore
processors in scheduling
High speed communication link
among processors on a single die
• Existing schedulers try to utilize this
high speed link
• As a result, many dependent tasks are
assigned to cores on a single die
[Figure: dependent tasks in the task graph are assigned to cores on the same die]
• In case of a fault
  • Dependent tasks tend to be lost all at once
Related Work (1/2)
• Checkpointing [3]
Node state is saved in each node
Backup node is allocated
Recover processing results from saved state
Multicore is not considered
Network contention is not considered
[3] Y. Gu, Z. Zhang, F. Ye, H. Yang, M. Kim, H. Lei, and Z. Liu. An empirical study of high
availability in stream processing systems. In Middleware ’09: the 10th
ACM/IFIP/USENIX International Conference on Middleware (Industrial Track), 2009.
[Figure: checkpointing with primary, secondary, and backup nodes between an input queue and an output queue]
Related Work (2/2)
• Task scheduling method [5] in which
  • multiple task graph templates are prepared beforehand
  • processors are assigned according to the templates
• This method is suitable for highly loaded systems
[5] Wolf, J., et al.: SODA: An Optimizing Scheduler for Large-Scale
Stream-Based Distributed Computer Systems. In: ACM Middleware (2008)
Our Contribution
• No existing scheduling method takes account of both
  • multicore processor failure
  • network contention
• We propose a scheduling method that takes account of both
network contention and multicore processor failure
Assumptions
• Only a single fail-stop failure of a multicore processor can occur
  • The failed computing node automatically restarts after 30 sec.
• A failure can be detected in one second
by interruption of heartbeat signals
• The checkpointing technique is used to recover from saved state
• Network contention
  • The contention model is the same as Sinnen's model [2]
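Detection by interruption of heartbeat signals can be sketched as a timeout check. This is an assumption about how such a detector might look, not the authors' implementation. `HeartbeatMonitor`, the node names, and the timestamps are hypothetical, and time is passed in explicitly so that the example is deterministic.

```python
class HeartbeatMonitor:
    """Reports nodes whose last heartbeat is older than the timeout."""

    def __init__(self, timeout=1.0):
        self.timeout = timeout  # seconds without a heartbeat before a node counts as failed
        self.last_seen = {}     # node -> time of its most recent heartbeat

    def beat(self, node, now):
        self.last_seen[node] = now

    def failed(self, now):
        return sorted(n for n, t in self.last_seen.items()
                      if now - t > self.timeout)


monitor = HeartbeatMonitor(timeout=1.0)
monitor.beat("pc1", now=0.0)
monitor.beat("pc2", now=0.0)
monitor.beat("pc1", now=0.9)          # pc2 stops sending heartbeats
suspects = monitor.failed(now=1.5)    # only pc2 has been silent for more than 1 s
```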
Checkpointing and Recovery
• Each processor node saves its state to main memory when each task finishes
  • The saved state is the data transferred to the succeeding processor nodes
  • Only the output data of each task node is saved as state
    • This is much smaller than the complete memory image
  • We assume that saving state finishes instantaneously
    • This is just copying a small amount of data within memory
• Recovery
  • Saved state that is not affected by the failure is found in the
ancestor task nodes
  • Some tasks are executed again using the saved state
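The recovery step can be illustrated by a backward walk over the task graph. The sketch assumes that the saved output of a task is kept on the die that executed it, so it is lost together with that die. It also conservatively ignores input data that a running task has already received. The function name `tasks_to_rerun` and the example data are hypothetical.

```python
def tasks_to_rerun(preds, die_of_task, finished, failed_die):
    """Tasks that must run (again) after `failed_die` fails.

    A finished task on a surviving die still holds its saved output, so the
    backward walk stops there. A finished task on the failed die lost it.
    """
    rerun = set()
    stack = [t for t in preds if t not in finished]
    while stack:
        t = stack.pop()
        if t in rerun:
            continue
        rerun.add(t)
        for p in preds[t]:
            if p not in finished or die_of_task[p] == failed_die:
                stack.append(p)
    return rerun


preds = {"1": [], "2": [], "3": ["1", "2"]}
finished = {"1", "2"}  # the failure happens while task 3 is being executed

all_on_die0 = {"1": 0, "2": 0, "3": 0}
lost_everything = tasks_to_rerun(preds, all_on_die0, finished, failed_die=0)

task3_elsewhere = {"1": 0, "2": 0, "3": 1}
lost_task3_only = tasks_to_rerun(preds, task3_elsewhere, finished, failed_die=1)
```

In the first case all three tasks have to be executed again. In the second case only task 3 is repeated, because the saved outputs of tasks 1 and 2 are on the surviving die.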
What Proposed Method Tries to Do
• Reduce recovery time in case of failure
  • Minimize the worst-case total execution time
    • Worst case over all possible failure patterns
    • Any one of the dies can fail
  • Total execution time = execution time before failure + recovery time
Worst Case Scenario
• Critical path
  • The path in the task graph from the first to the last task with the
longest execution time
• The worst-case scenario
  • All tasks on the critical path are assigned to processors on one die
  • A failure happens while the last task is being executed
  • About twice the total execution time is then needed
[Figure: example task graph with the first and last tasks marked]
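A critical path can be found with a single pass over the tasks in topological order. The sketch below counts only execution times and ignores communication. The function name `critical_path` and the example graph are assumptions for illustration.

```python
def critical_path(comp, preds, order):
    """Longest path (by total execution time) through the task graph.

    comp  -- task -> execution time
    preds -- task -> list of predecessor tasks
    order -- task nodes in a topological order
    """
    best = {}  # task -> (length of the longest path ending here, previous task)
    for t in order:
        prev = max(preds[t], key=lambda p: best[p][0], default=None)
        base = best[prev][0] if prev is not None else 0.0
        best[t] = (base + comp[t], prev)
    end = max(best, key=lambda t: best[t][0])
    length = best[end][0]
    path = []
    while end is not None:
        path.append(end)
        end = best[end][1]
    return path[::-1], length


comp = {"a": 1.0, "b": 3.0, "c": 2.0, "d": 1.0}
preds = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
path, length = critical_path(comp, preds, ["a", "b", "c", "d"])
# path is a, b, d with length 5.0
```

If a, b, and d all run on one die and that die fails while d is being executed, the whole path of length 5 has to be repeated, which is where the factor of about two comes from.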
Idea of Proposed Method
• We distribute tasks on the critical path over dies
  • But this adds communication overhead
  • If we distribute too many tasks, the overhead becomes too large
• Usually, the last tasks on the critical path have the larger
influence
  • We check tasks starting from the last task on the critical path
  • We move the last k tasks on the critical path to other dies
  • We find the best k
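The search for the best k can be sketched with a toy cost model. The model below is an assumption made only for illustration and is not the evaluation used by the proposed method: the critical path is a chain of n tasks of equal length c, the first n-k tasks share die 0, and each of the last k tasks runs on a different die at an extra communication cost d. The names `worst_case_time` and `best_k` are hypothetical.

```python
def worst_case_time(k, n, c, d):
    """Worst-case total execution time in the toy model, for a given k."""
    normal = n * c + k * d             # execution time without a failure
    die0_fails = normal + (n - k) * c  # redo every task that ran on die 0
    other_fails = normal + c if k > 0 else 0.0  # redo one moved task
    return max(die0_fails, other_fails)


def best_k(n, c, d):
    """Try every k from 0 to n and keep the one with the smallest worst case."""
    return min(range(n + 1), key=lambda k: worst_case_time(k, n, c, d))


cheap_comm = best_k(n=4, c=1.0, d=0.5)   # 3: moving most tasks pays off
costly_comm = best_k(n=4, c=1.0, d=3.0)  # 0: the overhead outweighs the gain
```

The two calls show the trade-off described above: with cheap communication it is worth moving several of the last tasks, and with expensive communication it is better to move none.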
Problem with Existing Method
• Task 1 is assigned to core A
• Task 2 is assigned to core B
• Task 3 is assigned to the same die
because of the high communication speed
[Figure: existing schedule of tasks 1 to 3 on cores A to D, and the resulting execution over time]
• Suppose that failure happens
when Task 3 is being executed
• All results are lost
• We need to execute all tasks
again from the beginning
on another die
Improvement in Proposed Method
• Distribute influential tasks to
other dies
In this case, task 3 is the most
influential
[Figure: proposed schedule of tasks 1 to 3 on cores A to D; the resulting execution over time includes communication overhead]
Recovery in Proposed Method
• Suppose that failure happens
when Task 3 is being executed
• Results of Task 1 and 2 are saved
• Execution can be continued from
the saved state
[Figure: proposed schedule and resulting execution over time; task 3 is executed again as 3']
Communication Overhead
• The proposed method incurs communication overhead
[Figure: existing schedule and proposed schedule of tasks 1 to 3 on cores A to D over time, with the overhead of the proposed schedule marked]
Speed-up in Recovery
• The proposed method has a larger effect if computation time is
longer than communication time
[Figure: recovery with the existing schedule (tasks 1', 2', 3') and recovery with the proposed schedule (task 3') on cores A to D over time, with the speed-up marked]
Comparison of Schedules
[Figure: existing schedule and proposed schedule over time, for an example task graph and processor graph]
Comparison of Recovery
[Figure: recovery with the existing schedule and with the proposed schedule over time, for the same task graph and processor graph; part of the chart is labeled "Not available"]
Evaluation
• Items to compare
Recovery time in case of a failure
Overhead in case of no failure
• Compared methods
PROPOSED
CONTENTION
• Sinnen’s method considering network contention
INTERLEAVED
• Scheduling algorithm that tries to spread tasks to all
dies as much as possible
Test Environment
• Devices
4 PCs with
• Intel Core i7 920 (2.67GHz) (Quad core)
• Intel Network Interface Card
  • Intel Gigabit CT Desktop Adapter (PCI Express x1)
• 6.0GB Memory
• Program to measure execution time
• Windows 7(64bit)
• Java(TM) SE Runtime Environment (64bit)
• Standard TCP socket
Task Graph with Low Parallelism
Configuration
• Number of task nodes:90
• Number of cores on a die:2
• Number of dies:2~4
• Robot control [4]
[Figure: task graph and processor graph; the dies are connected through a switch, and the number of dies is varied]
[4] Standard Task Graph Set
http://www.kasahara.elec.waseda.ac.jp/schedule/index.html
Results with Robot Control Task
• We varied the number of dies
• In case of failure, the proposed method reduced
total execution time by 40%
• In case of no failure, the overhead was up to 6%
[Figure: execution time (sec) versus number of dies for INTERLEAVED, CONTENTION, and PROPOSED, in case of a failure and with no failure]
Task Graph with High Parallelism
Configuration
• Number of task nodes: 98
• Number of cores on a die: 4
• Number of dies: 2~4
• Sparse matrix solver [4]
[Figure: task graph and processor graph; the dies are connected through a switch, and the number of dies is varied]
Results with Sparse Matrix Solver
• We varied the number of dies
• In case of failure, execution time including
recovery was reduced by up to 25%
• In case of no failure, the overhead was up to 7%
[Figure: execution time (sec) versus number of dies for INTERLEAVED, CONTENTION, and PROPOSED, in case of a failure and with no failure]
Simulation with Varied CCR
• CCR (communication-to-computation ratio)
  • Ratio between communication time and computation time
  • High CCR means long communication time
• Number of tasks:50
• Number of cores on a die:4
• Number of dies:4
• Task graph
18 random graphs
[Figure: processor graph; the dies are connected through a switch]
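As a small worked example of the ratio, the sketch below computes CCR from the times in a task graph. The slides do not say how the ratio is aggregated, so dividing the mean communication time per task link by the mean computation time per task node is an assumption here, as are the function name `ccr` and the numbers.

```python
def ccr(comm_times, comp_times):
    """Mean communication time per task link over mean computation time per task node."""
    mean_comm = sum(comm_times) / len(comm_times)
    mean_comp = sum(comp_times) / len(comp_times)
    return mean_comm / mean_comp


heavy = ccr(comm_times=[10.0, 10.0], comp_times=[1.0, 1.0, 1.0])  # 10.0, communication heavy
light = ccr(comm_times=[1.0, 1.0], comp_times=[10.0, 10.0, 10.0])  # 0.1, computation heavy
```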
Results with Varied CCR
• We varied CCR
• INTERLEAVED has large overhead when
CCR=10 (communication heavy)
• PROPOSED has 30% overhead in case of no failure, but reduced
execution time in case of a failure
[Figure: execution time (sec) for INTERLEAVED, CONTENTION, and PROPOSED, in case of a failure and with no failure, annotated with 5% and 30%]
Effect of Parallelization of Proposed Scheduler
• The proposed algorithm is parallelized
• We compared the time to generate schedules
  • 20 task graphs
  • Multi-thread vs. single-thread
  • Speed-up: up to 4x
Environment
• Intel Core i7 920 (2.67GHz)
• Windows 7 (64bit)
• Java(TM) SE 6 (64bit)
[Figure: time to generate a schedule, single thread vs. multi thread]
Conclusion
• We proposed a task scheduling method considering
  • Network contention
  • Single fail-stop failure
  • Multicore processors
• Future work
  • Evaluation on a larger computer system
Shohei Gotoda, Naoki Shibata and Minoru Ito:
"Task scheduling algorithm for multicore
processor system for minimizing recovery time
in case of single node fault," Proceedings of
IEEE International Symposium on Cluster
Computing and the Grid (CCGrid 2012), pp.260-26, 2012.
DOI:10.1109/CCGrid.2012.23
 
(Paper) An Endorsement Based Mobile Payment System for a Disaster Area
(Paper) An Endorsement Based Mobile Payment System for a Disaster Area(Paper) An Endorsement Based Mobile Payment System for a Disaster Area
(Paper) An Endorsement Based Mobile Payment System for a Disaster Area
 
BalloonNet: A Deploying Method for a Three-Dimensional Wireless Network Surro...
BalloonNet: A Deploying Method for a Three-Dimensional Wireless Network Surro...BalloonNet: A Deploying Method for a Three-Dimensional Wireless Network Surro...
BalloonNet: A Deploying Method for a Three-Dimensional Wireless Network Surro...
 
Congestion Alleviation Scheduling Technique for Car Drivers Based on Predicti...
Congestion Alleviation Scheduling Technique for Car Drivers Based on Predicti...Congestion Alleviation Scheduling Technique for Car Drivers Based on Predicti...
Congestion Alleviation Scheduling Technique for Car Drivers Based on Predicti...
 
(Paper) MTcast: Robust and Efficient P2P-based Video Delivery for Heterogeneo...
(Paper) MTcast: Robust and Efficient P2P-based Video Delivery for Heterogeneo...(Paper) MTcast: Robust and Efficient P2P-based Video Delivery for Heterogeneo...
(Paper) MTcast: Robust and Efficient P2P-based Video Delivery for Heterogeneo...
 
An Endorsement Based Mobile Payment System for A Disaster Area
An Endorsement Based Mobile Payment System for A Disaster AreaAn Endorsement Based Mobile Payment System for A Disaster Area
An Endorsement Based Mobile Payment System for A Disaster Area
 
GreenSwirl: Combining Traffic Signal Control and Route Guidance for Reducing ...
GreenSwirl: Combining Traffic Signal Control and Route Guidance for Reducing ...GreenSwirl: Combining Traffic Signal Control and Route Guidance for Reducing ...
GreenSwirl: Combining Traffic Signal Control and Route Guidance for Reducing ...
 
GPGPU-Assisted Subpixel Tracking Method for Fiducial Markers
GPGPU-Assisted Subpixel Tracking Method for Fiducial MarkersGPGPU-Assisted Subpixel Tracking Method for Fiducial Markers
GPGPU-Assisted Subpixel Tracking Method for Fiducial Markers
 
(Paper) BalloonNet: A Deploying Method for a Three-Dimensional Wireless Netwo...
(Paper) BalloonNet: A Deploying Method for a Three-Dimensional Wireless Netwo...(Paper) BalloonNet: A Deploying Method for a Three-Dimensional Wireless Netwo...
(Paper) BalloonNet: A Deploying Method for a Three-Dimensional Wireless Netwo...
 
(Paper) Emergency Medical Support System for Visualizing Locations and Vital ...
(Paper) Emergency Medical Support System for Visualizing Locations and Vital ...(Paper) Emergency Medical Support System for Visualizing Locations and Vital ...
(Paper) Emergency Medical Support System for Visualizing Locations and Vital ...
 
(Paper) A Method for Overlay Network Latency Estimation from Previous Observa...
(Paper) A Method for Overlay Network Latency Estimation from Previous Observa...(Paper) A Method for Overlay Network Latency Estimation from Previous Observa...
(Paper) A Method for Overlay Network Latency Estimation from Previous Observa...
 
(Paper) Parking Navigation for Alleviating Congestion in Multilevel Parking F...
(Paper) Parking Navigation for Alleviating Congestion in Multilevel Parking F...(Paper) Parking Navigation for Alleviating Congestion in Multilevel Parking F...
(Paper) Parking Navigation for Alleviating Congestion in Multilevel Parking F...
 
(Paper) Self adaptive island GA
(Paper) Self adaptive island GA(Paper) Self adaptive island GA
(Paper) Self adaptive island GA
 
(Paper) Efficient Evaluation Methods of Elementary Functions Suitable for SIM...
(Paper) Efficient Evaluation Methods of Elementary Functions Suitable for SIM...(Paper) Efficient Evaluation Methods of Elementary Functions Suitable for SIM...
(Paper) Efficient Evaluation Methods of Elementary Functions Suitable for SIM...
 
(Slides) A Decentralized Method for Maximizing k-coverage Lifetime in WSNs
(Slides) A Decentralized Method for Maximizing k-coverage Lifetime in WSNs(Slides) A Decentralized Method for Maximizing k-coverage Lifetime in WSNs
(Slides) A Decentralized Method for Maximizing k-coverage Lifetime in WSNs
 
(Slides) A Technique for Information Sharing using Inter-Vehicle Communicatio...
(Slides) A Technique for Information Sharing using Inter-Vehicle Communicatio...(Slides) A Technique for Information Sharing using Inter-Vehicle Communicatio...
(Slides) A Technique for Information Sharing using Inter-Vehicle Communicatio...
 
(Slides) A Personal Navigation System with a Schedule Planning Facility Based...
(Slides) A Personal Navigation System with a Schedule Planning Facility Based...(Slides) A Personal Navigation System with a Schedule Planning Facility Based...
(Slides) A Personal Navigation System with a Schedule Planning Facility Based...
 
(Slides) A Method for Distributed Computaion of Semi-Optimal Multicast Tree i...
(Slides) A Method for Distributed Computaion of Semi-Optimal Multicast Tree i...(Slides) A Method for Distributed Computaion of Semi-Optimal Multicast Tree i...
(Slides) A Method for Distributed Computaion of Semi-Optimal Multicast Tree i...
 
(Slides) A demand-oriented information retrieval method on MANET
(Slides) A demand-oriented information retrieval method on MANET(Slides) A demand-oriented information retrieval method on MANET
(Slides) A demand-oriented information retrieval method on MANET
 
(Slides) Inter-Vehicle Communication Protocol for Cooperatively Capturing and...
(Slides) Inter-Vehicle Communication Protocol for Cooperatively Capturing and...(Slides) Inter-Vehicle Communication Protocol for Cooperatively Capturing and...
(Slides) Inter-Vehicle Communication Protocol for Cooperatively Capturing and...
 

Último

Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 

Último (20)

Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 

(Slides) Task scheduling algorithm for multicore processor system for minimizing recovery time in case of single node fault

  • 1. Task scheduling algorithm for multicore processor system for minimizing recovery time in case of single node fault. Shohei Gotoda†, Naoki Shibata‡, Minoru Ito†. †Nara Institute of Science and Technology, ‡Shiga University
  • 2. Background • Multicore processors: almost all recently designed processors are multicore processors • A computing cluster consisting of 1800 nodes experiences about 1000 failures in the first year after deployment [1] [1] "Google spotlights data center inner workings," cnet.com article, May 30, 2008.
  • 3. Objective of Research • Fault tolerance: we assume a single fail-stop failure of a multicore processor • Network contention: taken into account to generate schedules reproducible on real systems • Devise a new scheduling method that minimizes recovery time, taking account of the above points
  • 4. Task Graph • A group of tasks that can be executed in parallel • Vertex (task node): a task to be executed on a single CPU core • Edge (task link): a data dependence between tasks [Figure: task graph with task nodes and task links]
  • 5. Processor Graph • Topology of the computer network • Vertex (processor node): a CPU core (circle), which has only one link, or a switch (rectangle), which has more than 2 links • Edge (processor link): a communication path between processors [Figure: processor graph with processor nodes, a switch and processor links]
  • 6. Task Scheduling • The task scheduling problem assigns a processor node to each task node and minimizes total execution time • It is an NP-hard problem [Figure: task graph and processor graph; one processor node is assigned to each task node]
  • 7. Inputs and Outputs for Task Scheduling • Inputs: task graph and processor graph • Output: a schedule, which is an assignment of a processor node to each task node • Objective function: minimize task execution time [Figure: task graph annotated with assigned processor nodes, and processor graph]
  • 8. Network Contention Model • Communication is delayed if the processor link is occupied by another communication • We use an existing network contention model [2] [Figure: task graph and processor graph with contention on a processor link] [2] O. Sinnen and L.A. Sousa, "Communication Contention in Task Scheduling," IEEE Trans. Parallel and Distributed Systems, vol. 16, no. 6, pp. 503-515, 2005.
  • 9. Multicore Processor Model • Each core executes a task independently of the other cores • Communication between cores finishes instantaneously • One network interface is shared among all cores on a die • If there is a failure, all cores on a die stop execution simultaneously [Figure: a CPU with Core1 and Core2 and the corresponding processor graph]
  • 10. Influence of Multicore Processors • Multicore processors need to be considered in scheduling: there is a high-speed communication link among processors on a single die • Existing schedulers try to utilize this high-speed link • As a result, many dependent tasks are assigned to cores on a single die [Figure: task graph and processor graph, with dependent tasks assigned to cores on the same die]
  • 11. Influence of Multicore Processors • Multicore processors need to be considered in scheduling: there is a high-speed communication link among processors on a single die • Existing schedulers try to utilize this high-speed link • As a result, many dependent tasks are assigned to cores on a single die • In case of a fault, dependent tasks tend to be destroyed at the same time [Figure: task graph and processor graph, with dependent tasks assigned to cores on the same die]
  • 12. Related Work (1/2) • Checkpointing [3]: node state is saved in each node; a backup node is allocated; processing results are recovered from the saved state • Multicore is not considered • Network contention is not considered [Figure: primary, secondary and backup nodes with input and output queues] [3] Y. Gu, Z. Zhang, F. Ye, H. Yang, M. Kim, H. Lei, and Z. Liu, "An empirical study of high availability in stream processing systems," in Middleware '09: the 10th ACM/IFIP/USENIX International Conference on Middleware (Industrial Track), 2009.
  • 13. Related Work (2/2) • Task scheduling method [5] in which multiple task graph templates are prepared beforehand and processors are assigned according to the templates • This method is suitable for highly loaded systems [5] J. Wolf et al., "SODA: An Optimizing Scheduler for Large-Scale Stream-Based Distributed Computer Systems," in ACM Middleware, 2008.
  • 14. Our Contribution • There is no existing scheduling method that takes account of both multicore processor failure and network contention • We propose a scheduling method taking account of network contention and multicore processor failure
  • 15. Assumptions • Only a single fail-stop failure of a multicore processor can occur; a failed computing node automatically restarts after 30 sec. • A failure can be detected in one second by interruption of heartbeat signals • A checkpointing technique is used to recover from saved state • Network contention: the contention model is the same as Sinnen's model
  • 16. Checkpointing and Recovery • Each processor node saves its state to the main memory when each task is finished: the saved state is the data transferred to the succeeding processor nodes, so only the output data from each task node is saved, which is much smaller than the complete memory image • We assume that saving state finishes instantaneously, since this is just copying small data within memory • Recovery: saved state that is not affected by the failure is found in the ancestor task nodes, and some tasks are executed again using the saved state [3] Y. Gu, Z. Zhang, F. Ye, H. Yang, M. Kim, H. Lei, and Z. Liu, "An empirical study of high availability in stream processing systems," in Middleware '09: the 10th ACM/IFIP/USENIX International Conference on Middleware (Industrial Track), 2009.
  • 17. What Proposed Method Tries to Do • Reduce recovery time in case of failure • Minimize the worst-case total execution time: the worst case over all possible patterns of failure, where any of the dies can fail • Total execution time = execution time before failure + recovery time
  • 18. Worst Case Scenario • Critical path: the path in the task graph from the first task to the last task with the longest execution time • The worst-case scenario: all tasks on the critical path are assigned to processors on one die, and the failure happens while the last task is being executed • In that case, twice the total execution time is needed [Figure: example task graph with the first and last tasks marked]
  • 19. Idea of Proposed Method • We distribute tasks on the critical path over dies, but this adds communication overhead; if we distribute too many tasks, there is too much overhead • Usually, the last tasks on the critical path have a larger influence • We check tasks starting from the last task on the critical path • We assign the last k tasks on the critical path to other dies • We find the best k
  • 20. Problem with Existing Method • Task 1 is assigned to core A • Task 2 is assigned to core B • Task 3 is assigned to the same die because of the high communication speed [Figure: existing schedule and resulting execution of Tasks 1 to 3 on cores A to D over time]
  • 21. Problem with Existing Method • Suppose that a failure happens while Task 3 is being executed • All results are lost [Figure: existing schedule and resulting execution of Tasks 1 to 3 on cores A to D over time]
  • 22. Problem with Existing Method • Suppose that a failure happens while Task 3 is being executed • All results are lost • We need to execute all tasks again from the beginning on another die [Figure: existing schedule and resulting execution over time, with Tasks 1', 2' and 3' re-executed]
  • 23. Improvement in Proposed Method • Distribute influential tasks to other dies • In this case, Task 3 is the most influential [Figure: proposed schedule and resulting execution of Tasks 1 to 3 on cores A to D over time, with the communication overhead marked]
  • 24. Recovery in Proposed Method • Suppose that a failure happens while Task 3 is being executed • The results of Tasks 1 and 2 are saved [Figure: proposed schedule and resulting execution of Tasks 1 to 3 on cores A to D over time]
  • 25. Recovery in Proposed Method • Suppose that a failure happens while Task 3 is being executed • The results of Tasks 1 and 2 are saved • Execution can be continued from the saved state [Figure: proposed schedule and resulting execution over time, with Task 3' re-executed]
  • 26. Communication Overhead • The proposed method incurs communication overhead [Figure: existing schedule and proposed schedule of Tasks 1 to 3 on cores A to D over time, with the overhead marked]
  • 27. Speed-up in Recovery • The proposed method has a larger effect if computation time is longer than communication time [Figure: recovery with the existing schedule (Tasks 1', 2' and 3' re-executed) and recovery with the proposed schedule (Task 3' re-executed) over time, with the speed-up marked]
  • 28. Comparison of Schedules [Figure: existing schedule and proposed schedule over time, with the task graph and processor graph]
  • 29. Comparison of Recovery [Figure: recovery with the existing schedule and with the proposed schedule over time, with the task graph and a processor graph in which part is marked "Not available"]
  • 30. Evaluation • Items to compare: recovery time in case of a failure; overhead in case of no failure • Compared methods: PROPOSED; CONTENTION (Sinnen's method considering network contention); INTERLEAVED (a scheduling algorithm that tries to spread tasks to all dies as much as possible)
  • 31. Test Environment • Devices: 4 PCs, each with an Intel Core i7 920 (2.67 GHz, quad core), an Intel network interface card (Intel Gigabit CT Desktop Adapter, PCI Express x1) and 6.0 GB of memory • Program to measure execution time: Windows 7 (64-bit), Java(TM) SE Runtime Environment (64-bit), standard TCP sockets
  • 32. Task Graph with Low Parallelism • Configuration: number of task nodes: 90; number of cores on a die: 2; number of dies: 2 to 4 • Task graph: robot control [4] [Figure: task graph, and processor graph with dual-core dies connected through a switch] [4] Standard Task Graph Set, http://www.kasahara.elec.waseda.ac.jp/schedule/index.html
  • 33. Results with Robot Control Task • We varied the number of dies • In case of failure, the proposed method reduced total execution time by 40% • In case of no failure, the overhead was up to 6% [Figure: execution time (sec) versus number of dies for INTERLEAVED, CONTENTION and PROPOSED, in case of a failure and with no failure]
  • 34. Task Graph with High Parallelism • Configuration: number of task nodes: 98; number of cores on a die: 4; number of dies: 2 to 4 • Task graph: sparse matrix solver [4] [Figure: task graph, and processor graph with quad-core dies connected through a switch] [4] Standard Task Graph Set, http://www.kasahara.elec.waseda.ac.jp/schedule/index.html
  • 35. Results with Sparse Matrix Solver • We varied the number of dies • In case of failure, execution time including recovery was reduced by up to 25% • In case of no failure, the overhead was up to 7% [Figure: execution time (sec) versus number of dies for INTERLEAVED, CONTENTION and PROPOSED, in case of a failure and with no failure]
  • 36. Simulation with Varied CCR • CCR: the ratio between communication time and computation time; a high CCR means long communication time • Number of tasks: 50 • Number of cores on a die: 4 • Number of dies: 4 • Task graphs: 18 random graphs [Figure: processor graph with quad-core dies connected through a switch]
  • 37. Results with Varied CCR • We varied CCR • INTERLEAVED has large overhead when CCR=10 (communication heavy) • PROPOSED has 30% overhead, but reduced execution time in case of no failure [Figure: execution time (sec) for INTERLEAVED, CONTENTION and PROPOSED in case of a failure and with no failure, with 5% and 30% marked]
  • 38. Effect of Parallelization of Proposed Scheduler • The proposed algorithm is parallelized • We compared the times to generate schedules for 20 task graphs, multi-thread versus single thread • Speed-up: up to 4x • Environment: Intel Core i7 920 (2.67 GHz), Windows 7 (64-bit), Java(TM) SE 6 (64-bit) [Figure: time to generate schedule, single thread versus multi thread]
  • 39. Conclusion • We proposed a task scheduling method that considers network contention, a single fail-stop failure, and multicore processors • Future work: evaluation on a larger computer system
  • 40. Shohei Gotoda, Naoki Shibata and Minoru Ito : "Task scheduling algorithm for multicore processor system for minimizing recovery time in case of single node fault," Proceedings of IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2012), pp.260- 26, 2012. DOI:10.1109/CCGrid.2012.23
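
The critical-path idea on slides 18 and 19 can be illustrated with a small sketch. This is not the authors' scheduler: it ignores network contention and the 30-second restart, the function names `critical_path`, `worst_case_time` and `best_k` are hypothetical, the task costs are made-up numbers, and the recovery model is a deliberately simplified assumption (a chain of tasks split over two dies, with one transfer of cost `comm` between them and a one-second failure detection time as on slide 15).

```python
from collections import deque
from typing import Dict, List, Tuple


def critical_path(cost: Dict[str, float],
                  edges: List[Tuple[str, str]]) -> Tuple[float, List[str]]:
    """Return (length, path) of the longest path by summed task cost in a DAG."""
    succs = {t: [] for t in cost}
    indeg = {t: 0 for t in cost}
    for u, v in edges:
        succs[u].append(v)
        indeg[v] += 1

    # Kahn's algorithm: topological order of the task graph.
    ready = deque(t for t in cost if indeg[t] == 0)
    order = []
    while ready:
        t = ready.popleft()
        order.append(t)
        for v in succs[t]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)

    best = dict(cost)                 # longest path length ending at each task
    prev = {t: None for t in cost}    # predecessor on that longest path
    for t in order:
        for v in succs[t]:
            if best[t] + cost[v] > best[v]:
                best[v] = best[t] + cost[v]
                prev[v] = t

    node = max(best, key=best.get)
    length = best[node]
    path = []
    while node is not None:
        path.append(node)
        node = prev[node]
    return length, path[::-1]


def worst_case_time(chain: List[float], k: int, comm: float,
                    detect: float = 1.0) -> float:
    """Worst-case total time when the last k tasks of a chain run on another die.

    Assumed toy model: one die fails at the worst possible moment and its
    unsaved work is re-executed on a spare die.
    """
    total = sum(chain)
    if k == 0:
        # Everything on one die: a failure during the last task loses all results.
        return 2 * total + detect
    first = sum(chain[:len(chain) - k])
    last = sum(chain[len(chain) - k:])
    # The die running the first part fails just before handing over its result.
    fail_first = first + detect + first + comm + last
    # The die running the last part fails just before finishing.
    fail_last = first + comm + last + detect + comm + last
    return max(fail_first, fail_last)


def best_k(chain: List[float], comm: float, detect: float = 1.0) -> int:
    """Pick the k in [0, len(chain)) that minimizes the worst-case time."""
    return min(range(len(chain)),
               key=lambda k: worst_case_time(chain, k, comm, detect))


if __name__ == "__main__":
    # Tasks 1 and 2 both feed Task 3, loosely following slides 20 to 27.
    cost = {"1": 4.0, "2": 3.0, "3": 6.0}
    length, path = critical_path(cost, [("1", "3"), ("2", "3")])
    chain = [cost[t] for t in path]
    print(path, length, best_k(chain, comm=1.0), best_k(chain, comm=10.0))
```

Under these assumptions, with cheap communication (`comm=1.0`) moving the last task to another die lowers the worst case from 21 to 19 time units, while with expensive communication (`comm=10.0`) keeping everything on one die is better, which mirrors the overhead trade-off described on slide 19.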

Editor's Notes

  1. Recently, almost all processors are designed as multicore processors, and these are commonly used in datacenters. On the other hand, a computing cluster consisting of 1800 nodes experiences about 1000 failures in the first year, according to this cnet.com article.
  2. So, the objective of this research is to devise a task scheduling method that minimizes recovery time taking account of fault tolerance of multicore processors and network contention.
  3. Now I define some terms used in our research. A task graph is a group of tasks that can be executed in parallel. A vertex of a task graph is a task to be executed on a single CPU core. Each edge represents data dependence between these tasks.
  4. A processor graph is the topology of the computing system. Each round vertex represents a CPU core. Each rectangular vertex represents a switch that does not have computing capability.
  5. The task scheduling problem is an NP-hard problem of assigning a processor node to each task node. In this figure, processor 1 is assigned to this task graph node.
  6. The inputs of the task scheduling problem are these two graphs, and the output is a schedule, which is an assignment of a processor node to each task node. The objective function is usually to minimize the total task execution time.
  7. Our proposed method takes account of network contention based on the model proposed by Oliver Sinnen. In this model, if a processor link is occupied by another communication, there is communication delay.
  8. A multicore processor is modeled like this. We assign a task to each of cores. Since the cores share the main memory, we assume that communication between cores finishes instantaneously. A network interface is shared among the cores, so one die of a multicore processor is model like this graph. We assume that all cores on a die stop simultaneously in case of a fault.
  9. There is a need to consider multicore processors in scheduling. Since the communication link among cores on a die has high bandwidth, existing schedulers try to utilize this link to minimize the total execution time. As a result, many dependent tasks are assigned to cores on the same die. But if a failure happens, many dependent tasks and their results are destroyed.
  11. I now explain related work. A checkpointing technique is proposed in this paper: node state is saved in each node, and recovery is performed from these saved states.
  12. As far as we have surveyed, there is no existing scheduling method that takes account of both multicore processor failure and network contention. So, we proposed a scheduling method that takes account of both.
  13. I now explain the assumptions made in our research. We assume that only a single fail-stop failure of a multicore processor can occur. The failed node automatically restarts after 30 seconds by rebooting. A failure can be detected in one second by interruption of heartbeat signals. We use a checkpointing technique for recovery. We use the network contention model proposed by Oliver Sinnen.
  14. As for checkpointing and recovery, we assume that each processor node saves state to the main memory when each task is finished.
  15. So, our method reduces the recovery time in case of a failure: it minimizes the worst-case total execution time. That is, we consider the worst case among all possible failure patterns, and we minimize the sum of the execution times before and after the failure. Our method is based on Sinnen's method, so it takes account of network contention.
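The objective in this note is a min-max criterion, which can be sketched as below. This only illustrates how candidate schedules would be compared, not how the proposed scheduler builds them; the function names and all of the numbers are made up for this example.

```python
def worst_case_total_time(failure_patterns):
    """Largest (time before failure + time after failure) over all patterns."""
    return max(before + after for before, after in failure_patterns)


def pick_schedule(candidates):
    """Choose the candidate whose worst-case total execution time is smallest."""
    return min(candidates, key=lambda name: worst_case_total_time(candidates[name]))


# Hypothetical candidates: each entry lists (before, after) for one failure pattern.
candidates = {
    "clustered": [(10, 30), (20, 25)],  # worst case 45
    "spread": [(12, 14), (22, 8)],      # worst case 30
}
best = pick_schedule(candidates)
```

With these illustrative numbers, `best` is `"spread"`: its worst case (30) is smaller than that of the clustered schedule (45), even if the clustered schedule could be faster when nothing fails.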
  16. I now explain the worst case scenario of failure. The critical path is the … The worst case scenario is that
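  17. Basically, scheduling is performed in the same way as in the existing method. With the existing method, given a task graph like this, tasks 1 and 2 can be executed in parallel, so they are assigned to A and B respectively, and the schedule looks like this. Next, task 3 is assigned to either A or B (in this case A), because placing it inside the same multicore processor gives a faster link and a shorter processing time.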
  18. However, if a failure occurs while task node 3 is executing, the data of task nodes 1, 2, and 3 are all lost because A and B belong to the same multicore processor, so tasks 1, 2, and 3 have to be redone from the beginning.
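  20. In the proposed method, since we found, as just shown, that a failure during task 3 causes the largest increase in task processing time, task node 3 is assigned to a processor different from that of its parents. In this case it is assigned to C, which is not part of the multicore processor containing A and B, so communication time is incurred for task 3.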
  21. With the proposed method, if a failure occurs during the execution of task node 3 as before, the tasks are not concentrated on the same processor as in the existing method, so the results of task nodes 1 and 2 survive. Therefore, only task 3 needs to be re-executed, and the processing time is shortened.
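The three-task example in notes 17 to 22 can be reproduced with a small sketch. It assumes a simplified recovery rule: when the die running a task fails, that task must be redone, together with every ancestor whose result was stored on the same die. The function name and the data layout are hypothetical, and timing (detection, reboot, communication) is not modeled.

```python
def tasks_to_redo(running_task, parents, core_of, die_of):
    """Tasks that must be re-executed when the die running running_task fails."""
    failed_die = die_of[core_of[running_task]]
    redo = set()
    stack = [running_task]
    while stack:
        task = stack.pop()
        if task in redo:
            continue
        redo.add(task)
        for parent in parents.get(task, ()):
            # A parent's result is lost only if it was kept on the failed die.
            if die_of[core_of[parent]] == failed_die:
                stack.append(parent)
    return redo


parents = {3: (1, 2)}                 # task 3 depends on tasks 1 and 2
die_of = {"A": 0, "B": 0, "C": 1}     # A and B share one multicore processor

existing = {1: "A", 2: "B", 3: "A"}   # task 3 kept on the same die as its parents
proposed = {1: "A", 2: "B", 3: "C"}   # task 3 moved to a different die

redo_existing = tasks_to_redo(3, parents, existing, die_of)
redo_proposed = tasks_to_redo(3, parents, proposed, die_of)
```

Under this rule, `redo_existing` is `{1, 2, 3}` and `redo_proposed` is `{3}`, which matches the notes: the existing schedule restarts everything, while the proposed one only repeats task 3.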
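  23. When no fail-stop failure occurs, the task processing time of the proposed method increases slightly because of communication time, since assigning tasks to a different processor has a cost.
  24. In this way, the proposed method accepts communication time as an overhead when no fail-stop failure occurs, and in exchange shortens the computation time when a fail-stop failure does occur. Therefore, the larger the computation time is relative to the communication time, the larger the fraction of time that can be saved, which makes the proposed method advantageous when a fail-stop failure occurs.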
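  25. The colors of the task nodes in the input task graph correspond to the colors in the schedules. Task nodes on the critical path of the task graph are shown in reddish colors. In the existing schedule, the task nodes on the critical path are concentrated on one processor in order to reduce communication time and shorten the task processing time. When a failure occurs, the results of all these task nodes are lost, so restarting from scratch takes nearly twice the task processing time. In the proposed method, the reddish task nodes on the critical path are assigned so that other processors are also used to some extent, so not everything has to be redone when a failure occurs, which shortens the task processing time.
  26. This is the case where the multicore processor containing processor 0 and processor 1 fails. The parts shown in black are unavailable because of the failure. With the existing method, the failure occurs just before the last task node on the critical path finishes. To re-execute it, the scheduler tries to resume from the results of task nodes 7 and 9, but these were on the same processor, so their results are lost and cannot be used. Tracing back in this way, all tasks on the critical path end up being redone. With the proposed method, the task nodes on the critical path are distributed to some extent, so not everything has to be redone, which shortens the processing time.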
  27. (Timing note: reaching this point at around 12 minutes is about right.)
  28. All task graphs used in the experiments were picked from the Standard Task Graph Set, which is published for benchmarking task scheduling. # Parallelism : 4.363796
  29. Parallelism = (sum of all task processing times)/(critical path length). # Parallelism : 15.868853
  30. CCR = communication-to-computation ratio. Parallelism = (sum of all task processing times)/(critical path length). # Parallelism : 15.868853
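The two task-graph metrics mentioned here can be computed as in the sketch below. The parallelism formula follows the note directly. The CCR formula (total communication time divided by total computation time) is an assumption, since the notes do not state which variant is used. The three-task graph is made up for illustration.

```python
from functools import lru_cache


def parallelism(times, parents):
    """(sum of all task processing times) / (critical path length)."""

    @lru_cache(maxsize=None)
    def finish(task):
        # Longest path ending at this task, counting processing times only.
        longest_parent = max((finish(p) for p in parents.get(task, ())), default=0)
        return times[task] + longest_parent

    critical_path_length = max(finish(task) for task in times)
    return sum(times.values()) / critical_path_length


def ccr(comm_times, times):
    """Assumed definition: total communication time / total computation time."""
    return sum(comm_times.values()) / sum(times.values())


times = {1: 2, 2: 3, 3: 5}            # processing time of each task
parents = {3: (1, 2)}                 # task 3 depends on tasks 1 and 2
comm_times = {(1, 3): 4, (2, 3): 6}   # communication time on each task link
```

For this graph the critical path is task 2 followed by task 3 (length 8), so `parallelism(times, parents)` is 10 / 8 = 1.25, and `ccr(comm_times, times)` is 10 / 10 = 1.0.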
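  31. CCR is the ratio between the communication time and the computation time of a task graph; the higher the CCR, the larger the share of communication time in the task graph. With the alternating method, the communication overhead is large, and at CCR 10 it is considerably worse than the existing method. The proposed method tends to be most effective on task graphs with low CCR, but even at CCR 10, although the overhead when no failure occurs is rather large, the task processing time in case of a failure is still reduced.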