CONCURRENT DATA
STRUCTURES
The Role of Locking
Dr. C.V. Suresh Babu
Overview
• Introduction
  - Synchronization
  - Non-blocking Synchronization
• Is Non-blocking Synchronization performance-beneficial for Parallel Applications?
• NOBLE: A Non-blocking Synchronization Interface. How can we make non-blocking synchronization accessible to the parallel programmer?
• Lock-free Skip lists
• Conclusions, Future Work
Systems: SMP
• Cache-coherent distributed shared memory multiprocessor systems:
  - UMA
  - NUMA
Synchronization
• Barriers
• Locks, semaphores, … (mutual exclusion)
• “A significant part of the work performed by today’s parallel applications is spent on synchronization.”
...
Lock-Based Synchronization:
Sequential
Non-blocking Synchronization
• Lock-Free Synchronization
  - Optimistic approach
    • Assumes it is alone and prepares an operation that later takes effect (unless interfered with) in one atomic step, using hardware atomic primitives
    • Interference is detected via shared memory
    • Retries until no other operation interferes
    • Can cause starvation
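The optimistic pattern above can be sketched with C11 atomics. This is a minimal illustration, not code from the talk: the names `shared_counter` and `lockfree_add` are assumptions, and `atomic_compare_exchange_weak` stands in for the hardware atomic primitive.

```c
#include <stdatomic.h>

/* Shared counter updated without locks (illustrative name). */
static _Atomic long shared_counter = 0;

/* Optimistic update: read the current value, compute the new value
 * privately, then try to publish it in one atomic step.
 * Returns the value the counter held just before our update. */
long lockfree_add(long delta)
{
    long old = atomic_load(&shared_counter);
    /* A failed CAS means another thread interfered. The CAS then stores
     * the current value into 'old', so the retry uses fresh data. */
    while (!atomic_compare_exchange_weak(&shared_counter, &old, old + delta)) {
        /* retry */
    }
    return old;
}
```

Some thread always succeeds, so the system as a whole makes progress, but a given thread may in principle keep losing the race. That is the starvation risk named in the last bullet.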
Example: Shared Queue
The usual approach is to implement operations using retry loops. Here is an example:

type Qtype = record v: valtype; next: pointer to Qtype end
shared var Tail: pointer to Qtype;
local var old, new: pointer to Qtype

procedure Enqueue (input: valtype)
    new := (input, NIL);
    repeat old := Tail
    until CAS2(&Tail, &(old->next), old, NIL, new, new)

[Figure: the Tail, old and new pointers before and after the enqueue]
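CAS2 updates two memory words in one atomic step, and C11 offers no such primitive. The sketch below therefore swaps in a different, well-known technique: an enqueue in the style of Michael and Scott's queue, which performs the two updates as two single-word CAS steps and lets threads help a lagging tail pointer. All names (`lfqueue`, `lfqueue_enqueue`, and so on) are assumptions for illustration; dequeue and memory reclamation are deliberately left out.

```c
#include <stdatomic.h>
#include <stdlib.h>

typedef struct qnode {
    int value;
    struct qnode *_Atomic next;
} qnode;

typedef struct {
    qnode *_Atomic head;   /* always points at a dummy node */
    qnode *_Atomic tail;   /* last node, or the one just before it */
} lfqueue;

/* Start with a single dummy node so that tail is never NULL. */
int lfqueue_init(lfqueue *q)
{
    qnode *dummy = malloc(sizeof *dummy);
    if (dummy == NULL) return -1;
    dummy->value = 0;
    atomic_init(&dummy->next, NULL);
    atomic_init(&q->head, dummy);
    atomic_init(&q->tail, dummy);
    return 0;
}

int lfqueue_enqueue(lfqueue *q, int value)
{
    qnode *node = malloc(sizeof *node);
    if (node == NULL) return -1;
    node->value = value;
    atomic_init(&node->next, NULL);

    for (;;) {                                   /* retry loop */
        qnode *old = atomic_load(&q->tail);
        qnode *next = atomic_load(&old->next);
        if (next != NULL) {
            /* Tail is lagging behind: help move it forward, then retry. */
            atomic_compare_exchange_weak(&q->tail, &old, next);
            continue;
        }
        qnode *expected = NULL;
        if (atomic_compare_exchange_weak(&old->next, &expected, node)) {
            /* Node is linked. Advancing tail may fail if another
             * thread already helped; that is harmless. */
            atomic_compare_exchange_strong(&q->tail, &old, node);
            return 0;
        }
    }
}
```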
Non-blocking Synchronization
• Lock-Free Synchronization
  - Avoids problems that locks have
  - Fast
  - Starvation? (not in the context of HPC)
• Wait-Free Synchronization
  - Always finishes in a finite number of its own steps.
    • Complex algorithms
    • Memory consuming
    • Less efficient on average than lock-free
Non-blocking Synchronisation
Synchronisation:
• An alternative approach for synchronisation introduced 25 years ago
• Many theoretical results
Evaluation:
• Micro-benchmarks show better performance than mutual exclusion in real or simulated multiprocessor systems.
Practice
• Non-blocking synchronization is still not used in practical applications
• Non-blocking solutions are often
  - complex
  - equipped with non-standard or unclear interfaces
  - impractical
Practice
Question: “How is the performance of parallel scientific applications affected by the use of non-blocking synchronisation rather than lock-based synchronisation?”
Answers
How is the performance of parallel scientific applications affected by the use of non-blocking synchronisation rather than lock-based synchronisation?
• The identification of the basic locking operations that parallel programmers use in their applications.
• The efficient non-blocking implementation of these synchronisation operations.
• The architectural implications on the design of non-blocking synchronisation.
• Comparison of the lock-based and lock-free versions of the respective applications.
Applications
• Ocean: simulates eddy currents in an ocean basin.
• Radiosity: computes the equilibrium distribution of light in a scene using the radiosity method.
• Volrend: renders 3D volume data into an image using a ray-casting method.
• Water: evaluates forces and potentials that occur over time between water molecules.
• Spark98: a collection of sparse matrix kernels. Each kernel performs a sequence of sparse matrix-vector product operations using matrices that are derived from a family of three-dimensional finite element earthquake applications.
Removing Locks in Applications
• Many locks are “Simple Locks”. → CAS, FAA and LL/SC can be used to implement a non-blocking version.
• Many critical sections contain shared floating-point variables. → Floating-point synchronization primitives are needed. A Double-Fetch-and-Add primitive was designed.
• Large critical sections. → Efficient non-blocking implementations of big ADTs are used.
Experimental Results: Speedup
[Figure: speedup plots; the surviving labels mark processor counts 24P, 32P and 58P]
SPARK98
Before:
spark_setlock(lockid);
w[col][0] += A[Anext][0][0]*v[i][0] + A[Anext][1][0]*v[i][1] + A[Anext][2][0]*v[i][2];
w[col][1] += A[Anext][0][1]*v[i][0] + A[Anext][1][1]*v[i][1] + A[Anext][2][1]*v[i][2];
w[col][2] += A[Anext][0][2]*v[i][0] + A[Anext][1][2]*v[i][1] + A[Anext][2][2]*v[i][2];
spark_unsetlock(lockid);
After:
dfad(&w[col][0], A[Anext][0][0]*v[i][0] + A[Anext][1][0]*v[i][1] + A[Anext][2][0]*v[i][2]);
dfad(&w[col][1], A[Anext][0][1]*v[i][0] + A[Anext][1][1]*v[i][1] + A[Anext][2][1]*v[i][2]);
dfad(&w[col][2], A[Anext][0][2]*v[i][0] + A[Anext][1][2]*v[i][1] + A[Anext][2][2]*v[i][2]);
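The slides do not show how `dfad` is implemented. One way such a floating-point fetch-and-add could be built from a 64-bit CAS is sketched below. It is an assumption-laden illustration rather than the authors' primitive: `dfad_sketch`, `atomic_double_bits`, `to_bits` and `from_bits` are hypothetical names, and the double is stored as its 64-bit pattern so that an integer CAS applies.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Assumption: double is 64 bits wide, so its bit pattern fits a uint64_t. */
_Static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits");

/* A shared double, stored as its bit pattern (hypothetical type name). */
typedef _Atomic uint64_t atomic_double_bits;

static uint64_t to_bits(double d)
{
    uint64_t b;
    memcpy(&b, &d, sizeof b);
    return b;
}

static double from_bits(uint64_t b)
{
    double d;
    memcpy(&d, &b, sizeof d);
    return d;
}

/* Atomically performs *target += delta and returns the previous value. */
double dfad_sketch(atomic_double_bits *target, double delta)
{
    uint64_t old_bits = atomic_load(target);
    for (;;) {
        double old = from_bits(old_bits);
        uint64_t new_bits = to_bits(old + delta);
        /* On failure old_bits is refreshed with the current contents. */
        if (atomic_compare_exchange_weak(target, &old_bits, new_bits))
            return old;
    }
}
```

With a primitive of this kind, each of the three `w[col][k] +=` updates becomes its own atomic step, so the lock around them can be dropped. The three updates are then no longer atomic as a group, which is acceptable only because each one is an independent accumulation.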
NOBLE: Brings Non-blocking closer to Practice
• Create a non-blocking inter-process communication interface with the properties:
  - Attractive functionality
  - Programmer friendly
  - Easy to adapt existing solutions
  - Efficient
  - Portable
  - Adaptable for different programming languages
NOBLE Design: Portable
• Exported definitions, identical for all platforms: Noble.h
    #define NBL...
    #define NBL...
    #define NBL...
• Platform independent: QueueLF.c, StackLF.c, ...
    #include "Platform/Primitives.h"
    …
• Platform dependent: SunHardware.asm, IntelHardware.asm, ...
    CAS, TAS, Spin-Locks
    …
Using NOBLE
• First create a global variable handling the shared data object, for example a stack:

Globals
    #include <noble.h>
    ...
    NBLStack* stack;

• Create the stack with the appropriate implementation:

Main
    stack=NBLStackCreateLF(10000);
    ...

• When some thread wants to do some operation:

Threads
    NBLStackPush(stack, item);
    or
    item=NBLStackPop(stack);
Using NOBLE
• When the data structure is no longer in use:

Globals
    #include <noble.h>
    ...
    NBLStack* stack;

Main
    stack=NBLStackCreateLF(10000);
    ...
    NBLStackFree(stack);
Using NOBLE
• To change the synchronization mechanism, only one line of code has to be changed!

Globals
    #include <noble.h>
    ...
    NBLStack* stack;

Main
    stack=NBLStackCreateLB();
    ...
    NBLStackFree(stack);

Threads
    NBLStackPush(stack, item);
    or
    item=NBLStackPop(stack);
Design: Attractive functionality
• Data structures for multi-threaded usage
  - FIFO Queues
  - Priority Queues
  - Dictionaries
  - Stacks
  - Singly linked lists
  - Snapshots
  - MWCAS
  - ...
• Clear specifications
Status
• Multiprocessor support
  - Sun Solaris (Sparc)
  - Win32 (Intel x86)
  - SGI (Mips)
  - Linux (Intel x86)
Available for academic use: http://www.noble-library.org/
Did our Work have any Impact?
1) Industry has initiated contacts and uses a test version of NOBLE.
2) Freeware developers have shown interest.
3) Interest from research organisations.
NOBLE is freely available for research and educational purposes.
A Lock-Free Skip list
• Presented as part of: H. Sundell, Ph. Tsigas, “Fast and Lock-Free Concurrent Priority Queues for Multi-Thread Systems”, 17th IEEE/ACM International Parallel and Distributed Processing Symposium (IPDPS ’03), May 2003 (TR 2002). Best Paper Award.
• A very similar lock-free skip list algorithm will be presented this August at the ACM Symposium on Principles of Distributed Computing (PODC 2004): “Lock-Free Linked Lists and Skip Lists”, Mikhail Fomitchev, Eric Ruppert
Randomized Algorithm: Skip Lists
• William Pugh: “Skip Lists: A Probabilistic Alternative to Balanced Trees”, 1990
  - Layers of ordered lists with different densities achieve a tree-like behavior
  - Time complexity: O(log2 N), probabilistic!
[Figure: skip list from Head to Tail over keys 1 to 7; the labelled level densities are 50%, 25%, …]
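The 50% and 25% densities correspond to promoting each node to the next level with probability 1/2. A minimal sketch of that coin-flipping rule follows; `SKIP_MAX_LEVEL`, `level_from_bits` and `random_level` are illustrative names, and `rand()` is used only for brevity.

```c
#include <stdint.h>
#include <stdlib.h>

#define SKIP_MAX_LEVEL 16   /* assumed cap on node height */

/* Height = 1 + number of consecutive 1 bits at the low end of 'bits',
 * capped at SKIP_MAX_LEVEL. Each bit plays the role of one coin flip. */
int level_from_bits(uint32_t bits)
{
    int level = 1;
    while (level < SKIP_MAX_LEVEL && (bits & 1u)) {
        level++;
        bits >>= 1;
    }
    return level;
}

/* Draw a random height: every node is on level 1, about half reach
 * level 2, about a quarter reach level 3, and so on. */
int random_level(void)
{
    return level_from_bits((uint32_t)rand());
}
```

With about half as many nodes on each higher level, a search can skip over long runs of keys at the top and descend only near the target, which is where the expected logarithmic cost comes from.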
Our Lock-Free Concurrent Skip List
• Define node state to depend on the insertion status at lowest level as well as a deletion flag
• Insert from lowest level going upwards
• Set deletion flag. Delete from highest level going downwards
[Figure: skip list with keys 1 to 7, each node carrying a deletion flag D, and a multi-level node p with levels 1 to 3]
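Concurrent Insert vs. Delete operations
• Problem: a Delete (node 2) runs concurrently with an Insert (node 3)
  - both nodes are deleted!
[Figure: list 1, 2, 3, 4 with the Delete of 2 as step a) and the Insert of 3 as step b)]
• Solution (Harris et al): Use bit 0 of pointer to mark deletion status
[Figure: list 1, 2*, 3, 4 with steps a), b), c); * marks the deleted node]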
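The marking idea can be sketched as follows, assuming nodes are at least 2-byte aligned so that bit 0 of a node address is always free. The type and function names are hypothetical. The `next` field is kept as an integer so the mark bit can be manipulated directly.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct mnode {
    int key;
    _Atomic uintptr_t next;   /* successor address; bit 0 = deletion mark */
} mnode;

void mnode_init(mnode *n, int key, mnode *next)
{
    n->key = key;
    atomic_init(&n->next, (uintptr_t)next);
}

static inline mnode *ptr_of(uintptr_t v)    { return (mnode *)(v & ~(uintptr_t)1); }
static inline bool   is_marked(uintptr_t v) { return (v & 1u) != 0; }

/* Logical deletion: set the mark on node->next.
 * Returns false if the node was already marked by someone else. */
bool mark_deleted(mnode *node)
{
    uintptr_t old = atomic_load(&node->next);
    while (!is_marked(old)) {
        if (atomic_compare_exchange_weak(&node->next, &old, old | 1u))
            return true;
    }
    return false;
}

/* Link 'node' between prev and succ. The CAS expects an unmarked
 * pointer to succ, so it fails once prev has been marked as deleted. */
bool insert_after(mnode *prev, mnode *succ, mnode *node)
{
    atomic_store(&node->next, (uintptr_t)succ);
    uintptr_t expected = (uintptr_t)succ;
    return atomic_compare_exchange_strong(&prev->next, &expected, (uintptr_t)node);
}
```

Because the mark lives in the same word as the successor pointer, a single CAS checks both at once. An insert after a node that is being deleted therefore fails and must retry from a live predecessor, instead of being lost together with the deleted node.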
Dynamic Memory Management
• Problem: System memory allocation functionality is blocking!
• Solution (lock-free), IBM freelists:
  - Pre-allocate a number of nodes, link them into a dynamic stack structure, and allocate/reclaim using CAS
[Figure: Head points to the free nodes Mem 1, Mem 2, …, Mem n; Allocate takes a node from the front, Reclaim returns a used node (Used 1) to it]
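A freelist of this kind can be sketched as a CAS-based stack over a pre-allocated pool. The names below are illustrative assumptions. Note that this naive version is exposed to the ABA problem that the talk turns to next, so it is a starting point and not a safe implementation for pre-emptive multi-threaded use.

```c
#include <stdatomic.h>
#include <stddef.h>

typedef struct fnode {
    struct fnode *_Atomic next_free;   /* link used while the node is free */
    int payload;
} fnode;

typedef struct {
    fnode *_Atomic head;               /* top of the stack of free nodes */
} freelist;

/* Link the n pre-allocated nodes of 'pool' into a stack, pool[0] on top. */
void freelist_init(freelist *fl, fnode *pool, size_t n)
{
    fnode *head = NULL;
    for (size_t i = n; i-- > 0; ) {
        atomic_init(&pool[i].next_free, head);
        head = &pool[i];
    }
    atomic_init(&fl->head, head);
}

/* Allocate: pop the top node with CAS. Returns NULL when the pool is empty. */
fnode *freelist_alloc(freelist *fl)
{
    fnode *old = atomic_load(&fl->head);
    while (old != NULL) {
        fnode *next = atomic_load(&old->next_free);
        if (atomic_compare_exchange_weak(&fl->head, &old, next))
            break;              /* success: 'old' is now ours */
        /* failure: 'old' holds the current head; loop and retry */
    }
    return old;
}

/* Reclaim: push the node back with CAS. */
void freelist_reclaim(freelist *fl, fnode *node)
{
    fnode *old = atomic_load(&fl->head);
    do {
        atomic_store(&node->next_free, old);
    } while (!atomic_compare_exchange_weak(&fl->head, &old, node));
}
```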
The ABA problem
• Problem: Because of concurrency (pre-emption in particular), the same pointer value does not always mean the same node (i.e. CAS succeeds)!!!
[Figure: Step 1 and Step 2 of an ABA scenario on a linked list; the node labels did not survive extraction]
The ABA problem
• Solution: (Valois et al) Add reference counting to each node, in order to prevent nodes that are of interest to some thread from being reclaimed until all threads have left the node
[Figure: New Step 2, with reference counts on the nodes: CAS fails!]
Helping Scheme
• Threads need to traverse safely
• Need to remove marked-to-be-deleted nodes while traversing: Help!
• Finds the previous node, finishes the deletion and continues traversing from the previous node
[Figure: list 1, 2*, 4 where 2* is marked for deletion, before and after helping]
Overlapping operations on shared data
• Example: Insert operation
  - which of 2 or 3 gets inserted?
[Figure: Insert 2 and Insert 3 both target the gap between nodes 1 and 4]
• Solution: Compare-And-Swap atomic primitive:

CAS(p: pointer to word, old: word, new: word): boolean
    atomic do
        if *p = old then
            *p := new;
            return true;
        else return false;
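In C11 the same primitive is available as `atomic_compare_exchange_strong`. The sketch below, with hypothetical names, shows how it resolves the race between the two inserts: both threads have seen node 4 as the successor of node 1, but only the first CAS on node 1's `next` field can succeed.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct lnode {
    int key;
    struct lnode *_Atomic next;
} lnode;

void lnode_init(lnode *n, int key, lnode *next)
{
    n->key = key;
    atomic_init(&n->next, next);
}

/* Try to link 'node' between 'prev' and 'succ'. Succeeds only if
 * prev->next still equals 'succ' at the moment of the CAS. */
bool try_insert_after(lnode *prev, lnode *succ, lnode *node)
{
    atomic_store(&node->next, succ);
    lnode *expected = succ;
    return atomic_compare_exchange_strong(&prev->next, &expected, node);
}
```

The losing thread gets `false` back and must search for the insert position again and retry. Neither insert is silently overwritten.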
Experiments
• 1-30 threads on platforms with different levels of real concurrency
• 10000 Insert vs. DeleteMin operations by each thread. 100 vs. 1000 initial inserts
• Compare with other implementations:
  - Lotan and Shavit, 2000
  - Hunt et al “An Efficient Algorithm for Concurrent Priority Queue Heaps”, 1996
[Figures: experimental results under Full Concurrency, Medium Pre-emption and High Pre-emption]
Lessons Learned
• The Non-Blocking Synchronization Paradigm can be suitable and beneficial to large scale parallel applications.
• Experimental Reproducible Work. Many results claimed by simulation are not consistent with what we observed.
• Applications gave us nice problems to look at and do theoretical work on. (IPDPS 2003 Algorithmic Best Paper Award)
• NOBLE helped programmers to trust our implementations.
Future Work
• Extend NOBLE for loosely coupled systems.
• Extend the set of data structures supported by NOBLE based on the needs of the applications.
• Reactive-Synchronisation


Mais conteúdo relacionado

Mais procurados

Java 5 6 Generics, Concurrency, Garbage Collection, Tuning
Java 5 6 Generics, Concurrency, Garbage Collection, TuningJava 5 6 Generics, Concurrency, Garbage Collection, Tuning
Java 5 6 Generics, Concurrency, Garbage Collection, Tuning
Carol McDonald
 
Tech Talks_04.07.15_Session 1_Jeni Markishka & Martin Hristov_Concurrent Prog...
Tech Talks_04.07.15_Session 1_Jeni Markishka & Martin Hristov_Concurrent Prog...Tech Talks_04.07.15_Session 1_Jeni Markishka & Martin Hristov_Concurrent Prog...
Tech Talks_04.07.15_Session 1_Jeni Markishka & Martin Hristov_Concurrent Prog...
EPAM_Systems_Bulgaria
 
Concurrent programming without synchronization
Concurrent programming without synchronizationConcurrent programming without synchronization
Concurrent programming without synchronization
Martin Hristov
 

Mais procurados (20)

Java concurrency
Java concurrencyJava concurrency
Java concurrency
 
Effective java - concurrency
Effective java - concurrencyEffective java - concurrency
Effective java - concurrency
 
Parallel Programming With Dot Net
Parallel Programming With Dot NetParallel Programming With Dot Net
Parallel Programming With Dot Net
 
Basics of Java Concurrency
Basics of Java ConcurrencyBasics of Java Concurrency
Basics of Java Concurrency
 
Lecture 21
Lecture 21Lecture 21
Lecture 21
 
Java 5 6 Generics, Concurrency, Garbage Collection, Tuning
Java 5 6 Generics, Concurrency, Garbage Collection, TuningJava 5 6 Generics, Concurrency, Garbage Collection, Tuning
Java 5 6 Generics, Concurrency, Garbage Collection, Tuning
 
Java 8 - Stamped Lock
Java 8 - Stamped LockJava 8 - Stamped Lock
Java 8 - Stamped Lock
 
Java Multithreading and Concurrency
Java Multithreading and ConcurrencyJava Multithreading and Concurrency
Java Multithreading and Concurrency
 
Non-blocking synchronization — what is it and why we (don't?) need it
Non-blocking synchronization — what is it and why we (don't?) need itNon-blocking synchronization — what is it and why we (don't?) need it
Non-blocking synchronization — what is it and why we (don't?) need it
 
Spiking Neural Networks As Continuous-Time Dynamical Systems: Fundamentals, E...
Spiking Neural Networks As Continuous-Time Dynamical Systems: Fundamentals, E...Spiking Neural Networks As Continuous-Time Dynamical Systems: Fundamentals, E...
Spiking Neural Networks As Continuous-Time Dynamical Systems: Fundamentals, E...
 
Distributed computing presentation
Distributed computing presentationDistributed computing presentation
Distributed computing presentation
 
Advanced Introduction to Java Multi-Threading - Full (chok)
Advanced Introduction to Java Multi-Threading - Full (chok)Advanced Introduction to Java Multi-Threading - Full (chok)
Advanced Introduction to Java Multi-Threading - Full (chok)
 
Nn devs
Nn devsNn devs
Nn devs
 
Task and Data Parallelism: Real-World Examples
Task and Data Parallelism: Real-World ExamplesTask and Data Parallelism: Real-World Examples
Task and Data Parallelism: Real-World Examples
 
Concurrency Programming in Java - 05 - Processes and Threads, Thread Objects,...
Concurrency Programming in Java - 05 - Processes and Threads, Thread Objects,...Concurrency Programming in Java - 05 - Processes and Threads, Thread Objects,...
Concurrency Programming in Java - 05 - Processes and Threads, Thread Objects,...
 
NXTTour: An Open Source Robotic System Operated over the Internet
NXTTour: An Open Source Robotic System Operated over the InternetNXTTour: An Open Source Robotic System Operated over the Internet
NXTTour: An Open Source Robotic System Operated over the Internet
 
Spiking Neural P System
Spiking Neural P SystemSpiking Neural P System
Spiking Neural P System
 
071bct537 lab4
071bct537 lab4071bct537 lab4
071bct537 lab4
 
Tech Talks_04.07.15_Session 1_Jeni Markishka & Martin Hristov_Concurrent Prog...
Tech Talks_04.07.15_Session 1_Jeni Markishka & Martin Hristov_Concurrent Prog...Tech Talks_04.07.15_Session 1_Jeni Markishka & Martin Hristov_Concurrent Prog...
Tech Talks_04.07.15_Session 1_Jeni Markishka & Martin Hristov_Concurrent Prog...
 
Concurrent programming without synchronization
Concurrent programming without synchronizationConcurrent programming without synchronization
Concurrent programming without synchronization
 

Destaque

Cambridge university certificate
Cambridge university certificateCambridge university certificate
Cambridge university certificate
John Pradeep
 
Benign sinonasal masses presentation & management-1
Benign sinonasal masses presentation & management-1Benign sinonasal masses presentation & management-1
Benign sinonasal masses presentation & management-1
kamalaiims
 

Destaque (9)

Learning and Development Programs
Learning and Development ProgramsLearning and Development Programs
Learning and Development Programs
 
Cambridge university certificate
Cambridge university certificateCambridge university certificate
Cambridge university certificate
 
Test planoutline
Test planoutlineTest planoutline
Test planoutline
 
Apresentação git
Apresentação gitApresentação git
Apresentação git
 
"Едно начало" - Йозо Михайлов Йосифов
"Едно начало" - Йозо Михайлов Йосифов "Едно начало" - Йозо Михайлов Йосифов
"Едно начало" - Йозо Михайлов Йосифов
 
Ict u2
Ict u2Ict u2
Ict u2
 
HR BAROMETER - An attractive place to work
HR BAROMETER - An attractive place to work HR BAROMETER - An attractive place to work
HR BAROMETER - An attractive place to work
 
Introducing solved Question Papers of last 5 years for Anna University, 1st s...
Introducing solved Question Papers of last 5 years for Anna University, 1st s...Introducing solved Question Papers of last 5 years for Anna University, 1st s...
Introducing solved Question Papers of last 5 years for Anna University, 1st s...
 
Benign sinonasal masses presentation & management-1
Benign sinonasal masses presentation & management-1Benign sinonasal masses presentation & management-1
Benign sinonasal masses presentation & management-1
 

Semelhante a Role of locking

Towards Edge Computing as a Service: Dynamic Formation of the Micro Data-Centers
Towards Edge Computing as a Service: Dynamic Formation of the Micro Data-CentersTowards Edge Computing as a Service: Dynamic Formation of the Micro Data-Centers
Towards Edge Computing as a Service: Dynamic Formation of the Micro Data-Centers
Faculty of Technical Sciences, University of Novi Sad
 
The Pillars Of Concurrency
The Pillars Of ConcurrencyThe Pillars Of Concurrency
The Pillars Of Concurrency
aviade
 
Brian Klumpe Unification of Producer Consumer Key Pairs
Brian Klumpe Unification of Producer Consumer Key PairsBrian Klumpe Unification of Producer Consumer Key Pairs
Brian Klumpe Unification of Producer Consumer Key Pairs
Brian_Klumpe
 
Comparing Write-Ahead Logging and the Memory Bus Using
Comparing Write-Ahead Logging and the Memory Bus UsingComparing Write-Ahead Logging and the Memory Bus Using
Comparing Write-Ahead Logging and the Memory Bus Using
jorgerodriguessimao
 
Parallelization of Graceful Labeling Using Open MP
Parallelization of Graceful Labeling Using Open MPParallelization of Graceful Labeling Using Open MP
Parallelization of Graceful Labeling Using Open MP
IJSRED
 
Producer consumer-problems
Producer consumer-problemsProducer consumer-problems
Producer consumer-problems
Richard Ashworth
 

Semelhante a Role of locking (20)

Architecture of the oasis mobile shared virtual memory system
Architecture of the oasis mobile shared virtual memory systemArchitecture of the oasis mobile shared virtual memory system
Architecture of the oasis mobile shared virtual memory system
 
Distributed systems in practice, in theory (JAX London)
Distributed systems in practice, in theory (JAX London)Distributed systems in practice, in theory (JAX London)
Distributed systems in practice, in theory (JAX London)
 
QCon NYC: Distributed systems in practice, in theory
QCon NYC: Distributed systems in practice, in theoryQCon NYC: Distributed systems in practice, in theory
QCon NYC: Distributed systems in practice, in theory
 
Wireless Ad Hoc Networks
Wireless Ad Hoc NetworksWireless Ad Hoc Networks
Wireless Ad Hoc Networks
 
Harmful interupts
Harmful interuptsHarmful interupts
Harmful interupts
 
Towards Edge Computing as a Service: Dynamic Formation of the Micro Data-Centers
Towards Edge Computing as a Service: Dynamic Formation of the Micro Data-CentersTowards Edge Computing as a Service: Dynamic Formation of the Micro Data-Centers
Towards Edge Computing as a Service: Dynamic Formation of the Micro Data-Centers
 
The Pillars Of Concurrency
The Pillars Of ConcurrencyThe Pillars Of Concurrency
The Pillars Of Concurrency
 
[iOS] Multiple Background Threads
[iOS] Multiple Background Threads[iOS] Multiple Background Threads
[iOS] Multiple Background Threads
 
Brian Klumpe Unification of Producer Consumer Key Pairs
Brian Klumpe Unification of Producer Consumer Key PairsBrian Klumpe Unification of Producer Consumer Key Pairs
Brian Klumpe Unification of Producer Consumer Key Pairs
 
Ijcatr04041019
Ijcatr04041019Ijcatr04041019
Ijcatr04041019
 
Mike Bartley - Innovations for Testing Parallel Software - EuroSTAR 2012
Mike Bartley - Innovations for Testing Parallel Software - EuroSTAR 2012Mike Bartley - Innovations for Testing Parallel Software - EuroSTAR 2012
Mike Bartley - Innovations for Testing Parallel Software - EuroSTAR 2012
 
Comparing Write-Ahead Logging and the Memory Bus Using
Comparing Write-Ahead Logging and the Memory Bus UsingComparing Write-Ahead Logging and the Memory Bus Using
Comparing Write-Ahead Logging and the Memory Bus Using
 
But is it Art(ificial Intelligence)?
But is it Art(ificial Intelligence)? But is it Art(ificial Intelligence)?
But is it Art(ificial Intelligence)?
 
Parallelization of Graceful Labeling Using Open MP
Parallelization of Graceful Labeling Using Open MPParallelization of Graceful Labeling Using Open MP
Parallelization of Graceful Labeling Using Open MP
 
Reactive programming with rx java
Reactive programming with rx javaReactive programming with rx java
Reactive programming with rx java
 
Concurrency in Eclipse: Best Practices and Gotchas
Concurrency in Eclipse: Best Practices and GotchasConcurrency in Eclipse: Best Practices and Gotchas
Concurrency in Eclipse: Best Practices and Gotchas
 
Linux Assignment 3
Linux Assignment 3Linux Assignment 3
Linux Assignment 3
 
Producer consumer-problems
Producer consumer-problemsProducer consumer-problems
Producer consumer-problems
 
Concurrency and Parallelism, Asynchronous Programming, Network Programming
Concurrency and Parallelism, Asynchronous Programming, Network ProgrammingConcurrency and Parallelism, Asynchronous Programming, Network Programming
Concurrency and Parallelism, Asynchronous Programming, Network Programming
 
Fogify: A Fog Computing Emulation Framework
Fogify: A Fog Computing Emulation FrameworkFogify: A Fog Computing Emulation Framework
Fogify: A Fog Computing Emulation Framework
 

Mais de Dr. C.V. Suresh Babu

Mais de Dr. C.V. Suresh Babu (20)

Data analytics with R
Data analytics with RData analytics with R
Data analytics with R
 
Association rules
Association rulesAssociation rules
Association rules
 
Clustering
ClusteringClustering
Clustering
 
Classification
ClassificationClassification
Classification
 
Blue property assumptions.
Blue property assumptions.Blue property assumptions.
Blue property assumptions.
 
Introduction to regression
Introduction to regressionIntroduction to regression
Introduction to regression
 
DART
DARTDART
DART
 
Mycin
MycinMycin
Mycin
 
Expert systems
Expert systemsExpert systems
Expert systems
 
Dempster shafer theory
Dempster shafer theoryDempster shafer theory
Dempster shafer theory
 
Bayes network
Bayes networkBayes network
Bayes network
 
Bayes' theorem
Bayes' theoremBayes' theorem
Bayes' theorem
 
Knowledge based agents
Knowledge based agentsKnowledge based agents
Knowledge based agents
 
Rule based system
Rule based systemRule based system
Rule based system
 
Formal Logic in AI
Formal Logic in AIFormal Logic in AI
Formal Logic in AI
 
Production based system
Production based systemProduction based system
Production based system
 
Game playing in AI
Game playing in AIGame playing in AI
Game playing in AI
 
Diagnosis test of diabetics and hypertension by AI
Diagnosis test of diabetics and hypertension by AIDiagnosis test of diabetics and hypertension by AI
Diagnosis test of diabetics and hypertension by AI
 
A study on “impact of artificial intelligence in covid19 diagnosis”
A study on “impact of artificial intelligence in covid19 diagnosis”A study on “impact of artificial intelligence in covid19 diagnosis”
A study on “impact of artificial intelligence in covid19 diagnosis”
 
A study on “impact of artificial intelligence in covid19 diagnosis”
A study on “impact of artificial intelligence in covid19 diagnosis”A study on “impact of artificial intelligence in covid19 diagnosis”
A study on “impact of artificial intelligence in covid19 diagnosis”
 

Último

Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functions
KarakKing
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
heathfieldcps1
 

Último (20)

80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structure
 
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxHMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POS
 
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functions
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxInterdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptx
 
Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the Classroom
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 

Role of locking

  • 1. CONCURRENT DATA STRUCTURES The Role of Locking Dr. C.V. Suresh Babu
  • 2. Overview  Introduction  Synchronization  Non-blocking Synchronization  Is Non-blocking Synchronization performancebeneficial for Parallel Applications?  NOBLE: A Non-blocking Synchronization Interface. How can we make non-blocking synchronization accessible to the parallel programmer?  Lock-free Skip lists  Conclusions, Future Work
  • 3. Systems: SMP  Cache-coherent distributed shared memory multiprocessor systems:  UMA  NUMA
  • 4. Synchronization Barriers  Locks, semaphores,… (mutual exclusion)  “A significant part of the work performed by today’s parallel applications is spent on synchronization.” ...
  • 6. Non-blocking Synchronization  Lock-Free Synchronization  Optimistic approach • Assumes it’s alone and prepares operation which later takes place (unless interfered) in one atomic step, using hardware atomic primitives • Interference is detected via shared memory • Retries until not interfered by other operations • Can cause starvation
  • 7. Example: Shared Queue The usual approach is to implement operations using retry loops. Here’s an example: type Qtype = record v: valtype; next: pointer to Qtype end type Qtype = record v: valtype; next: pointer to Qtype end shared var Tail: pointer to Qtype; shared var Tail: pointer to Qtype; local var old, new: pointer to Qtype local var old, new: pointer to Qtype procedure Enqueue (input: valtype) procedure Enqueue (input: valtype) new := (input, NIL); new := (input, NIL); repeat old := Tail repeat old := Tail until CAS2(&Tail, &(old->next), old, NIL, new, new) until CAS2(&Tail, &(old->next), old, NIL, new, new) old Tail new old Tail new
  • 8. Non-blocking Synchronization  Lock-Free Synchronization  Avoids problems that locks have  Fast  Starvation?  (not in the Context of HPC) Wait-Free Synchronization  Always finishes in a finite number of its own steps. • Complex algorithms • Memory consuming • Less efficient on average than lock-free
  • 9. Overview  Introduction  Synchronization  Non-blocking Synchronization  Is Non-blocking Synchronization performancebeneficial for Parallel Scientific Applications?  NOBLE: A Non-blocking Synchronization Interface. How can we make non-blocking synchronization accessible to the parallel programmer?  Conclusions, Future Work
  • 10. Non-blocking Synchronisation Synchronisation:  An alternative approach for synchronisation introduced 25 years ago  Many theoretical results Evaluation:  Micro-benchmarks shows better performance than mutual exclusion in real or simulated multiprocessor systems.
  • 11. Practice   Non-blocking synchronization is still not used in practical applications Non-blocking solutions are often  complex  having non-standard or un-clear interfaces  non-practical ? ?
  • 12. Practice Question? ”How the performance of parallel scientific applications is affected by the use of non-blocking synchronisation rather than lock-based one?” ? ? ?
  • 13. Answers. How is the performance of parallel scientific applications affected by the use of non-blocking synchronisation rather than lock-based synchronisation?  The identification of the basic locking operations that parallel programmers use in their applications.  The efficient non-blocking implementation of these synchronisation operations.  The architectural implications for the design of non-blocking synchronisation.  A comparison of the lock-based and lock-free versions of the respective applications.
  • 14. Applications. Ocean: simulates eddy currents in an ocean basin. Radiosity: computes the equilibrium distribution of light in a scene using the radiosity method. Volrend: renders 3D volume data into an image using a ray-casting method. Water: evaluates forces and potentials that occur over time between water molecules. Spark98: a collection of sparse matrix kernels. Each kernel performs a sequence of sparse matrix-vector product operations using matrices that are derived from a family of three-dimensional finite element earthquake applications.
  • 15. Removing Locks in Applications. Observations:  Many locks are "simple locks".  Many critical sections contain shared floating-point variables.  Some critical sections are large. Remedies, in the same order:  CAS, FAA and LL/SC can be used to implement non-blocking versions.  Floating-point synchronization primitives are needed; a Double-Fetch-and-Add primitive was designed.  Efficient non-blocking implementations of big ADTs are used.
  • 17. SPARK98
    Before:
      spark_setlock(lockid);
      w[col][0] += A[Anext][0][0]*v[i][0] + A[Anext][1][0]*v[i][1] + A[Anext][2][0]*v[i][2];
      w[col][1] += A[Anext][0][1]*v[i][0] + A[Anext][1][1]*v[i][1] + A[Anext][2][1]*v[i][2];
      w[col][2] += A[Anext][0][2]*v[i][0] + A[Anext][1][2]*v[i][1] + A[Anext][2][2]*v[i][2];
      spark_unsetlock(lockid);
    After:
      dfad(&w[col][0], A[Anext][0][0]*v[i][0] + A[Anext][1][0]*v[i][1] + A[Anext][2][0]*v[i][2]);
      dfad(&w[col][1], A[Anext][0][1]*v[i][0] + A[Anext][1][1]*v[i][1] + A[Anext][2][1]*v[i][2]);
      dfad(&w[col][2], A[Anext][0][2]*v[i][0] + A[Anext][1][2]*v[i][1] + A[Anext][2][2]*v[i][2]);
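The slides do not show how `dfad` itself is built. One common way to get a fetch-and-add on a double is a CAS retry loop over the value's 64-bit pattern, sketched below in C11. Everything here is an assumption made for illustration: the names `atomic_double_bits`, `dfad_init`, `dfad_read` and `dfad_sketch`, the choice to return the previous value, and the cell type (the original `dfad` takes a plain pointer into `w`).

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cell type: a double stored as its 64-bit pattern so that
 * the standard integer CAS can be applied to it. */
typedef _Atomic uint64_t atomic_double_bits;

static uint64_t to_bits(double d)
{
    uint64_t b;
    memcpy(&b, &d, sizeof b);
    return b;
}

static double from_bits(uint64_t b)
{
    double d;
    memcpy(&d, &b, sizeof d);
    return d;
}

void dfad_init(atomic_double_bits *cell, double value)
{
    atomic_store(cell, to_bits(value));
}

double dfad_read(atomic_double_bits *cell)
{
    return from_bits(atomic_load(cell));
}

/* Atomically adds delta to *cell; returns the value seen before the add
 * (an assumed convention, not taken from the slides). */
double dfad_sketch(atomic_double_bits *cell, double delta)
{
    uint64_t old_bits = atomic_load(cell);
    uint64_t new_bits;
    do {
        new_bits = to_bits(from_bits(old_bits) + delta);
        /* On failure old_bits is refreshed and the sum is recomputed. */
    } while (!atomic_compare_exchange_weak(cell, &old_bits, new_bits));
    return from_bits(old_bits);
}
```

Note that in the "After" version each component of `w[col]` is updated atomically on its own; the three updates are no longer one indivisible unit as they were under the lock.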
  • 18. Overview  Introduction  Synchronization  Non-blocking Synchronization  Is Non-blocking Synchronization beneficial for Parallel Scientific Applications?  NOBLE: A Non-blocking Synchronization Interface. How can we make non-blocking synchronization accessible to the parallel programmer?  Conclusions, Future Work
  • 19. Practice  Non-blocking synchronization is still not used in practical applications.  Non-blocking solutions are often:  complex  equipped with non-standard or unclear interfaces  impractical
  • 20. NOBLE: Brings Non-blocking closer to Practice  Create a non-blocking inter-process communication interface with the following properties:  Attractive functionality  Programmer friendly  Easy to adapt existing solutions  Efficient  Portable  Adaptable to different programming languages
  • 21. NOBLE Design: Portable. Exported definitions, identical for all platforms: Noble.h (#define NBL... #define NBL... #define NBL...). Platform-independent: QueueLF.c, StackLF.c, ... (each with #include "Platform/Primitives.h" …). Platform-dependent: SunHardware.asm, IntelHardware.asm, ... (each providing CAS, TAS, Spin-Locks, …).
  • 22. Using NOBLE • First create a global variable handling the shared data object, for example a stack. Globals: #include <noble.h> ... NBLStack* stack; • Create the stack with the appropriate implementation. Main: stack=NBLStackCreateLF(10000); ... • When some thread wants to perform an operation. Threads: NBLStackPush(stack, item); or item=NBLStackPop(stack);
  • 23. Using NOBLE • When the data structure is no longer in use, free it. Globals: #include <noble.h> ... NBLStack* stack; Main: stack=NBLStackCreateLF(10000); ... NBLStackFree(stack);
  • 24. Using NOBLE • To change the synchronization mechanism, only one line of code has to be changed! Globals: #include <noble.h> ... NBLStack* stack; Main: stack=NBLStackCreateLB(); ... NBLStackFree(stack); Threads: NBLStackPush(stack, item); or item=NBLStackPop(stack);
  • 25. Design: Attractive functionality  Data structures for multi-threaded usage: FIFO Queues, Priority Queues, Dictionaries, Stacks, Singly linked lists, Snapshots, MWCAS, ...  Clear specifications
  • 26. Status  Multiprocessor support: Sun Solaris (Sparc), Win32 (Intel x86), SGI (Mips), Linux (Intel x86). Available for academic use: http://www.noble-library.org/
  • 27. Did our Work have any Impact? 1) Industry has initiated contact and uses a test version of NOBLE. 2) Freeware developers have shown interest. 3) Research organisations have shown interest. NOBLE is freely available for research and educational purposes.
  • 28. A Lock-Free Skip List  Presented as part of: H. Sundell, Ph. Tsigas, "Fast and Lock-Free Concurrent Priority Queues for Multi-Thread Systems", 17th IEEE/ACM International Parallel and Distributed Processing Symposium (IPDPS '03), May 2003 (TR 2002). Best Paper Award.  A very similar lock-free skip list algorithm will be presented this August at the ACM Symposium on Principles of Distributed Computing (PODC 2004): Mikhail Fomitchev, Eric Ruppert, "Lock-Free Linked Lists and Skip Lists".
  • 29. Randomized Algorithm: Skip Lists  William Pugh: "Skip Lists: A Probabilistic Alternative to Balanced Trees", 1990  Layers of ordered lists with different densities achieve a tree-like behavior  Time complexity: O(log₂ N), probabilistic! (Figure: a skip list from Head to Tail holding keys 1 to 7, with layer densities labelled 50%, 25%, …)
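The layer densities in the figure come from choosing each node's height at random. A minimal sketch in C, assuming a promotion probability of 1/2 per level and a hypothetical cap `MAX_LEVEL` (neither value is stated on the slide beyond the 50% and 25% labels):

```c
#include <stdlib.h>

#define MAX_LEVEL 16   /* hypothetical cap on node height */

/* Picks a height for a new node: every node is on level 1, about half of
 * them also reach level 2, about a quarter reach level 3, and so on. */
int random_level(void)
{
    int level = 1;
    while (level < MAX_LEVEL && rand() > RAND_MAX / 2)
        level++;
    return level;
}
```

Because each higher layer holds roughly half the nodes of the layer below it, a search that starts at the top layer skips about half of the remaining candidates per layer, which is where the expected logarithmic cost comes from.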
  • 30. Our Lock-Free Concurrent Skip List  Define node state to depend on the insertion status at the lowest level as well as a deletion flag  Insert from the lowest level going upwards  Set the deletion flag, then delete from the highest level going downwards (Figure: nodes 1 to 7, each carrying a deletion flag D; one node is shown with levels 1 to 3 and a pointer p.)
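  • 31. Concurrent Insert vs. Delete operations  Problem: a Delete and an Insert that overlap on neighbouring nodes; both nodes are deleted!  Solution (Harris et al.): use bit 0 of the pointer to mark deletion status (Figure: a list of nodes 1 to 4 with Delete applied at node 2 and Insert at node 3, steps a and b; in the solution, node 2 is marked with * and the steps are a, b and c.)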
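The pointer-marking idea can be sketched in C11 as below. This is an illustration under assumptions, not the authors' code: the node layout and the names `mark_deleted` and `insert_after` are hypothetical, and the sketch relies on nodes being at least 2-byte aligned so that bit 0 of a valid pointer is always zero.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical node layout: `next` holds a pointer plus a mark in bit 0. */
struct node {
    int key;
    _Atomic uintptr_t next;
};

bool is_marked(uintptr_t link)
{
    return (link & 1u) != 0;
}

struct node *link_target(uintptr_t link)
{
    return (struct node *)(link & ~(uintptr_t)1);
}

/* Logically deletes `victim` by setting bit 0 of its own next link. */
bool mark_deleted(struct node *victim)
{
    uintptr_t link = atomic_load(&victim->next);
    while (!is_marked(link)) {
        if (atomic_compare_exchange_weak(&victim->next, &link, link | 1u))
            return true;   /* this call set the mark */
    }
    return false;          /* another thread had already marked it */
}

/* Links `fresh` between `pred` and `succ`. The CAS expects an unmarked
 * link to `succ`, so it fails if `pred` was marked or changed meanwhile. */
bool insert_after(struct node *pred, struct node *succ, struct node *fresh)
{
    uintptr_t expected = (uintptr_t)succ;
    atomic_store(&fresh->next, (uintptr_t)succ);
    return atomic_compare_exchange_strong(&pred->next, &expected,
                                          (uintptr_t)fresh);
}
```

Since the mark lives in the same word as the link, an insert after a node that is being deleted sees a changed word and fails, instead of silently attaching the new node to a node that is about to disappear.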
  • 32. Dynamic Memory Management  Problem: system memory allocation functionality is blocking!  Solution (lock-free), IBM freelists: pre-allocate a number of nodes, link them into a dynamic stack structure, and allocate/reclaim using CAS (Figure: a Head pointer to a stack of free nodes Mem 1, Mem 2, …, Mem n; Allocate takes a node from the head, Reclaim pushes a used node back.)
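A free-list of pre-allocated nodes managed with CAS can be sketched as follows in C11. The pool size and the names `mem_node`, `pool_init`, `pool_allocate` and `pool_reclaim` are hypothetical. The sketch deliberately omits any protection against the ABA problem, so it should be read as the naive starting point rather than a complete allocator.

```c
#include <stdatomic.h>
#include <stddef.h>

#define POOL_SIZE 4   /* hypothetical number of pre-allocated nodes */

struct mem_node {
    _Atomic(struct mem_node *) next;
    int payload;
};

static struct mem_node pool[POOL_SIZE];
static _Atomic(struct mem_node *) free_head = NULL;

/* Reclaim: push a node back onto the free stack. */
void pool_reclaim(struct mem_node *n)
{
    struct mem_node *head = atomic_load(&free_head);
    do {
        atomic_store(&n->next, head);
        /* On failure head is refreshed and the link is rewritten. */
    } while (!atomic_compare_exchange_weak(&free_head, &head, n));
}

/* Allocate: pop the first free node, or return NULL if none is left. */
struct mem_node *pool_allocate(void)
{
    struct mem_node *head = atomic_load(&free_head);
    while (head != NULL) {
        struct mem_node *next = atomic_load(&head->next);
        if (atomic_compare_exchange_weak(&free_head, &head, next))
            break;
    }
    return head;
}

/* Pre-allocation step: link every pool node into the free stack. */
void pool_init(void)
{
    for (int i = 0; i < POOL_SIZE; i++)
        pool_reclaim(&pool[i]);
}
```

Because the nodes come from a static pool, reading `head->next` is always a valid memory access even if another thread has just taken that node; whether the value read is still meaningful is a separate question.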
  • 33. The ABA problem  Problem: because of concurrency (pre-emption in particular), the same pointer value does not always mean the same node, yet the CAS still succeeds! (Figure: the list at step 1 and at step 2, after nodes have been removed and reused.)
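The effect can be reproduced deterministically by replaying one unlucky interleaving in a single thread. The sketch below is an assumption-laden illustration in C11: the stack of three cells and the function `aba_replay` are hypothetical and do not correspond to the node numbers in the slide's figure.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct cell {
    struct cell *next;
};

/* Replays the interleaving step by step. Returns true if the stale CAS
 * succeeded and left the stack pointing at an already removed cell. */
bool aba_replay(void)
{
    struct cell a, b, c;
    _Atomic(struct cell *) top;

    /* Initial stack: a -> b -> c */
    c.next = NULL;
    b.next = &c;
    a.next = &b;
    atomic_init(&top, &a);

    /* "Thread 1" starts a pop: it reads top and its successor, then is
     * pre-empted before it can perform its CAS. */
    struct cell *seen_top = atomic_load(&top);   /* &a */
    struct cell *seen_next = seen_top->next;     /* &b */

    /* "Thread 2" runs meanwhile: pops a, pops b, then pushes a back. */
    atomic_store(&top, &c);
    a.next = &c;
    atomic_store(&top, &a);   /* b is no longer part of the stack */

    /* "Thread 1" resumes: top equals &a again, so the CAS succeeds even
     * though the stack changed underneath it. */
    bool swapped = atomic_compare_exchange_strong(&top, &seen_top, seen_next);

    /* top now refers to b, which had been removed. */
    return swapped && atomic_load(&top) == &b;
}
```

The CAS compares only the pointer value, so it cannot tell that the node behind that value was removed and reinserted in between.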
  • 34. The ABA problem  Solution (Valois et al.): add reference counting to each node, in order to prevent nodes that are of interest to some thread from being reclaimed until all threads have left the node. (Figure, new step 2: with the reference counts in place, the CAS fails!)
  • 35. Helping Scheme  Threads need to traverse safely  Marked-to-be-deleted nodes must be removed while traversing: help!  The traversing thread finds the previous node, finishes the deletion and continues traversing from the previous node (Figure: a list 1, 2*, 4 in which node 2 is marked for deletion, shown before and after helping.)
  • 36. Overlapping operations on shared data  Example: Insert operation: Insert 2 and Insert 3 both target the gap between nodes 1 and 4; which of 2 or 3 gets inserted?  Solution: Compare-And-Swap atomic primitive:
      CAS(p: pointer to word, old: word, new: word): boolean
        atomic do
          if *p = old then *p := new; return true;
          else return false;
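The CAS pseudocode maps directly onto C11's `atomic_compare_exchange_strong`. The sketch below wraps it in a function with the slide's signature and uses it for the two competing inserts; the names `cas_word`, `lnode` and `try_insert` are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

typedef _Atomic uintptr_t atomic_word;

/* CAS as on the slide: if *p equals old, store new_value and return true;
 * otherwise leave *p unchanged and return false. */
bool cas_word(atomic_word *p, uintptr_t old, uintptr_t new_value)
{
    return atomic_compare_exchange_strong(p, &old, new_value);
}

/* Hypothetical list node for the insert example. */
struct lnode {
    int key;
    atomic_word next;
};

/* Tries to link `fresh` between `pred` and `succ`; fails if another
 * insert changed pred->next first. */
bool try_insert(struct lnode *pred, struct lnode *succ, struct lnode *fresh)
{
    atomic_store(&fresh->next, (uintptr_t)succ);
    return cas_word(&pred->next, (uintptr_t)succ, (uintptr_t)fresh);
}
```

If two threads both try to insert between nodes 1 and 4, exactly one CAS succeeds; the loser re-reads the list and retries, for example inserting 3 between 2 and 4.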
  • 37. Experiments  1-30 threads on platforms with different levels of real concurrency  10000 Insert vs. DeleteMin operations by each thread; 100 vs. 1000 initial inserts  Compared with other implementations:  Lotan and Shavit, 2000  Hunt et al., "An Efficient Algorithm for Concurrent Priority Queue Heaps", 1996
  • 41. Lessons Learned  The non-blocking synchronization paradigm can be suitable and beneficial for large-scale parallel applications.  Experimental, reproducible work: many results claimed by simulation are not consistent with what we observed.  Applications gave us nice problems to look at and do theoretical work on (IPDPS 2003 Algorithmic Best Paper Award).  NOBLE helped programmers to trust our implementations.
  • 42. Future Work Extend NOBLE for loosely coupled systems.  Extend the set of data structures supported by NOBLE based on the needs of the applications.  Reactive-Synchronisation 