Distributed System
Sanjivani Rural Education Society’s
Sanjivani College of Engineering, Kopargaon-423603
(An Autonomous Institute Affiliated to Savitribai Phule Pune University, Pune)
NAAC ‘A’ Grade Accredited, ISO 9001:2015 Certified
Department of Information Technology
(NBA Accredited)
Dr. R. D. Chintamani
Asst. Prof.
Unit –V
DFS
DISTRIBUTED FILE SYSTEMS
DEFINITIONS:
• A Distributed File System (DFS) is simply a classical model of a file system
distributed across multiple machines. The purpose is to promote sharing of dispersed
files.
• This is an area of active research interest today.
• The resources on a particular machine are local to itself. Resources on other
machines are remote.
• A file system provides a service for clients. The server interface is the normal set of file
operations: create, read, etc. on files.
Introduction
• Distributed file systems support the sharing of
information in the form of files throughout the
intranet.
• A distributed file system enables programs to
store and access remote files exactly as they
do local ones, allowing users to access
files from any computer on the intranet.
• Recent advances in higher-bandwidth
connectivity of switched local networks and in
disk organization have led to high-performance,
highly scalable file systems.
Clients, servers, and storage are dispersed across machines.
Configuration and implementation may vary -
a) Servers may run on dedicated machines, OR
b) Servers and clients can be on the same machines.
c) The OS itself can be distributed with the file system a part of
that distribution.
d) A distribution layer can be interposed between a conventional
OS and the file system.
Clients should view a DFS the same way they would a centralized FS; the
distribution is hidden at a lower level.
Performance is concerned with throughput and response time.
Definitions
Naming is the mapping between logical and physical objects.
• Example: A user filename maps to <cylinder, sector>.
• In a conventional file system, it's understood where the file actually resides;
the system and disk are known.
• In a transparent DFS, the location of a file, somewhere in the network, is
hidden.
• File replication means multiple copies of a file; mapping returns a SET of
locations for the replicas.
Location transparency -
a) The name of a file does not reveal any hint of the file's physical storage
location.
b) The file name still denotes a specific, although hidden, set of physical disk
blocks.
Naming and Transparency
The ANDREW DFS AS AN EXAMPLE:
• Is location independent.
• Supports file mobility.
• Separation of FS and OS allows for disk-less systems. These have lower
cost and convenient system upgrades. The performance is not as good.
NAMING SCHEMES:
There are three main approaches to naming files:
1. Files are named with a combination of host and local name.
• This guarantees a unique name. NEITHER location transparent NOR
location independent.
• Same naming works on local and remote files. The DFS is a loose
collection of independent file systems.
NAMING SCHEMES:
2. Remote directories are mounted to local directories.
• So a local system seems to have a coherent directory structure.
• The remote directories must be explicitly mounted. The files are
location independent.
• SUN NFS is a good example of this technique.
3. A single global name structure spans all the files in the system.
• The DFS is built the same way as a local filesystem. Location
independent.
IMPLEMENTATION TECHNIQUES:
• A non-transparent mapping technique:
name ----> < system, disk, cylinder, sector >
• A transparent mapping technique:
name ----> file_identifier ----> < system, disk, cylinder, sector >
• So when the physical location of a file changes, only the mapping stored for
its file identifier needs to be modified; the name stays the same. This identifier must be "unique".
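The transparent mapping can be sketched in a few lines of Python. This is only an illustration under assumed names: the tables `directory` and `locations`, the functions `resolve` and `migrate`, and the example hosts are hypothetical and not taken from any particular DFS.

```python
# Hypothetical two-level naming tables; all names and hosts are illustrative.
directory = {"/home/alice/notes.txt": 1001}        # name -> file identifier
locations = {1001: ("serverA", "disk0", 120, 7)}   # identifier -> <system, disk, cylinder, sector>


def resolve(name):
    """Translate a user-visible name to a physical location in two steps."""
    file_id = directory[name]
    return locations[file_id]


def migrate(file_id, new_location):
    """Move a file: only the entry stored under its identifier changes."""
    locations[file_id] = new_location


before = resolve("/home/alice/notes.txt")
migrate(1001, ("serverB", "disk2", 15, 3))
after = resolve("/home/alice/notes.txt")
print(before, "->", after)
```

The name-to-identifier table is never touched by `migrate`, which is what lets the name stay valid after the file moves.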
CACHING
• Reduce network traffic by retaining recently accessed disk blocks in a cache, so
that repeated accesses to the same information can be handled locally.
• If required data is not already cached, a copy of data is brought from the server
to the user.
• Perform accesses on the cached copy.
• Each file has one master copy residing at the server machine.
• Copies of (parts of) the file are scattered in different caches.
• Cache Consistency Problem -- Keeping the cached copies consistent with the
master file.
• A remote service (RPC) has these characteristic steps:
a) The client makes a request for file access.
b) The request is passed to the server in message format.
c) The server makes the file access.
d) Return messages bring the result back to the client.
• This is equivalent to performing a disk access for each request.
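A minimal sketch of the difference between the two access styles, assuming a hypothetical in-memory `Server` and a `CachingClient` with a simple dictionary cache (neither name comes from a real system). Without the cache, every read would be a remote request; with it, only misses reach the server.

```python
class Server:
    """Holds the master copy of each file as a list of blocks (hypothetical)."""

    def __init__(self, files):
        self.files = files
        self.requests = 0  # counts remote accesses

    def read_block(self, name, index):
        self.requests += 1
        return self.files[name][index]


class CachingClient:
    """Serves repeated reads from a local cache; misses go to the server."""

    def __init__(self, server):
        self.server = server
        self.cache = {}  # (file name, block index) -> block data

    def read_block(self, name, index):
        key = (name, index)
        if key not in self.cache:   # miss: bring a copy from the server
            self.cache[key] = self.server.read_block(name, index)
        return self.cache[key]      # perform the access on the cached copy


server = Server({"report": [b"block0", b"block1"]})
client = CachingClient(server)
for _ in range(3):
    client.read_block("report", 0)
print(server.requests)  # only the first read reached the server
```

The sketch deliberately ignores the cache consistency problem: if the master copy changed on the server, this client would keep returning its stale block.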
Remote File Access
CACHE LOCATION:
• Caching is a mechanism for maintaining disk data on the local machine. This data can be
kept in the local memory or in the local disk. Caching can be advantageous both for read
ahead and read again.
• The cost of getting data from a cache is a few HUNDRED instructions; disk accesses cost
THOUSANDS of instructions.
• The master copy of a file doesn't move, but caches contain replicas of portions of the file.
• Caching behaves just like "networked virtual memory".
• What should be cached: blocks or whole files? Larger units give a better hit rate;
smaller units give shorter transfer times.
• Caching on disk gives:
- Better reliability.
• Caching in memory gives:
- The possibility of diskless workstations.
- Greater speed.
COMPARISON OF CACHING AND REMOTE SERVICE:
• Many remote accesses can be handled by a local cache. There's a
great deal of locality of reference in file accesses. Servers can be
accessed only occasionally rather than for each access.
• Caching causes data to be moved in a few big chunks rather than in
many smaller pieces; this leads to considerable efficiency for the
network.
• Disk accesses can be better optimized on the server if it's understood
that requests are always for large contiguous chunks.
• Caching works best on machines with considerable local store - either
local disks or large memories.
STATEFUL VS. STATELESS SERVICE:
Stateful: A server keeps track of information about client requests.
• It records which files each client has open, along with connection
identifiers and server caches.
• Memory must be reclaimed when client closes file or when
client dies.
Stateless: Each client request provides the complete information needed by
the server (e.g., filename, file offset).
• The server can maintain information on behalf of the client,
but it's not required.
Performance is better for stateful.
• Don't need to parse the filename each time, or "open/close" file on
every request.
Fault Tolerance: A stateful server loses everything when it crashes.
• Server must poll clients in order to renew its state.
• Client crashes force the server to clean up its cached
information.
• A stateless server remembers nothing, so it can restart easily after a crash.
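The contrast can be made concrete with a small sketch. The classes `StatefulServer` and `StatelessServer`, the `crash_and_restart` method, and the sample file are hypothetical; a crash is modelled simply as the loss of in-memory tables.

```python
DATA = {"log.txt": b"abcdefghij"}  # hypothetical file contents


class StatefulServer:
    def __init__(self):
        self.open_files = {}   # handle -> [file name, current offset]
        self.next_handle = 1

    def open(self, name):
        handle = self.next_handle
        self.next_handle += 1
        self.open_files[handle] = [name, 0]
        return handle

    def read(self, handle, count):
        name, offset = self.open_files[handle]   # fails if the state was lost
        self.open_files[handle][1] = offset + count
        return DATA[name][offset:offset + count]

    def crash_and_restart(self):
        self.open_files = {}   # in-memory state is gone


class StatelessServer:
    def read(self, name, offset, count):
        # Every request carries the file name and offset itself.
        return DATA[name][offset:offset + count]

    def crash_and_restart(self):
        pass                   # nothing to lose


stateful = StatefulServer()
h = stateful.open("log.txt")
first = stateful.read(h, 4)
stateful.crash_and_restart()
try:
    stateful.read(h, 4)
    stateful_survived = True
except KeyError:
    stateful_survived = False

stateless = StatelessServer()
stateless.crash_and_restart()
second = stateless.read("log.txt", 4, 4)
print(first, stateful_survived, second)
```

The stateful read is shorter (a handle and a count), which reflects the performance argument; the stateless read is longer but still works after the restart.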
FILE REPLICATION:
• Duplicating files on multiple machines improves availability
and performance.
• Replicas are placed on failure-independent machines (they won't fail
together).
• The main problem is consistency - when one copy
changes, how do other copies reflect that change? Often
there is a tradeoff: consistency versus availability and
performance.
General File Service
Architecture
• The responsibilities of a DFS are typically
distributed among three modules:
• Client module which emulates the conventional
file system interface
• Two server modules, which perform operations for
clients on directories and on files.
• Most importantly this architecture enables
stateless implementation of the server
modules.
[Figure: File service architecture. Client computer: application programs and the client module. Server computer: flat file service and directory service.]
File Service Architecture
• Flat File Service:
• Concerned with implementing operations on
the contents of files.
• Unique File Identifiers (UFIDs) are used to
refer to files in all requests for flat file service
operations.
• The division of responsibilities between the file service and the directory service is
based on UFIDs (long sequences of bits chosen so
that each file has a UFID that is unique in the distributed system).
• Directory Service:
• It provides a mapping between text names for
files and their UFIDs
• Clients obtain the UFID of a file by quoting its text name to the
directory service.
• Client Module:
• Runs on each client computer
• Integrates and extends the operations of the flat
file service and the directory service under a single application
programming interface.
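The three modules can be sketched as three small classes. This is an assumption-laden illustration: the class and method names (`FlatFileService`, `DirectoryService`, `ClientModule`, `add_name`, `lookup`) are hypothetical, and a random hexadecimal string stands in for a UFID.

```python
import uuid


class FlatFileService:
    """Operates on file contents; files are named only by UFID."""

    def __init__(self):
        self.contents = {}

    def create(self):
        ufid = uuid.uuid4().hex   # stand-in for a unique bit sequence
        self.contents[ufid] = b""
        return ufid

    def write(self, ufid, offset, data):
        padded = self.contents[ufid].ljust(offset, b"\0")
        self.contents[ufid] = padded[:offset] + data + padded[offset + len(data):]

    def read(self, ufid, offset, count):
        return self.contents[ufid][offset:offset + count]


class DirectoryService:
    """Maps text names to UFIDs."""

    def __init__(self):
        self.names = {}

    def add_name(self, name, ufid):
        self.names[name] = ufid

    def lookup(self, name):
        return self.names[name]


class ClientModule:
    """Combines both services behind one file-like interface."""

    def __init__(self, files, directory):
        self.files = files
        self.directory = directory

    def create(self, name):
        self.directory.add_name(name, self.files.create())

    def write(self, name, offset, data):
        self.files.write(self.directory.lookup(name), offset, data)

    def read(self, name, offset, count):
        return self.files.read(self.directory.lookup(name), offset, count)


client = ClientModule(FlatFileService(), DirectoryService())
client.create("notes.txt")
client.write("notes.txt", 0, b"hello")
print(client.read("notes.txt", 1, 3))
```

Because every flat file operation names its UFID and offset explicitly, neither server class needs to remember anything about a client between calls.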
What is NFS?
• First commercially successful network
file system:
• Developed by Sun Microsystems for their
diskless workstations
• Designed for robustness and “adequate
performance”
• Sun published all protocol specifications
• Many many implementations
OVERVIEW:
• Runs on SunOS. NFS is both a specification of how to access remote files
and a specific implementation of that specification.
• The goal: to share a file system in a transparent way.
• Uses the client-server model (in NFS, a node can be both client and server
simultaneously). Can act between any two nodes (no dedicated server).
• Mount makes a server file-system visible from a client.
SUN Network File System
Highlights
• NFS is stateless
• All client requests must be self-contained
• The virtual file system interface
• VFS operations
• VNODE operations
• Performance issues
• Impact of tuning on NFS performance
Objectives (I)
• Machine and Operating System
Independence
• Could be implemented on low-end machines
of the mid-80’s
• Fast Crash Recovery
• Major reason behind stateless design
• Transparent Access
• Remote files should be accessed in exactly
the same way as local files
Objectives (II)
• UNIX semantics should be
maintained on client
• Best way to achieve transparent access
• “Reasonable” performance
• Robustness and preservation of UNIX
semantics were much more important
Basic design
• Three important parts
• The protocol
• The server side
• The client side
The protocol (I)
• Uses the Sun RPC mechanism and Sun
eXternal Data Representation (XDR)
standard
• Defined as a set of remote procedures
• Protocol is stateless
• Each procedure call contains all the
information necessary to complete the call
Advantages of statelessness
• Crash recovery is very easy:
• When a server crashes, client just resends
request until it gets an answer from the
rebooted server
• Client cannot tell difference between a
server that has crashed and recovered and
a slow server
• Client can always repeat any request
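A sketch of this recovery behaviour, assuming a hypothetical `FlakyServer` that fails to answer its first few requests and a helper `call_with_retry`; neither is part of NFS itself, and a real client would wait between attempts.

```python
class FlakyServer:
    """Hypothetical server that is 'down' for its first few requests."""

    def __init__(self, failures_before_recovery):
        self.remaining_failures = failures_before_recovery

    def handle(self, request):
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise TimeoutError("no reply")   # crashed or just slow: looks the same
        name, offset, count = request
        return ("ok", name, offset, count)


def call_with_retry(server, request, max_attempts=10):
    """Resend the identical, self-contained request until a reply arrives."""
    for attempt in range(1, max_attempts + 1):
        try:
            return attempt, server.handle(request)
        except TimeoutError:
            continue   # a real client would wait before resending
    raise TimeoutError("server did not answer")


attempts, reply = call_with_retry(FlakyServer(3), ("data.bin", 0, 512))
print(attempts, reply)
```

Repeating is safe here because a read at an explicit offset returns the same result however many times it is sent.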
Consequences of
statelessness
• Reads and writes must specify their start offset
• Server does not keep track of current position in
the file
• Users still use conventional UNIX reads and writes
• Open system call translates into several
lookup calls to server
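Why an open turns into several lookups can be sketched as follows. The directory table `TREE`, the handle strings, and the functions `lookup` and `open_path` are hypothetical stand-ins; the point is only that the client resolves one path component per call.

```python
# Hypothetical server-side directory tree: directory handle -> {name: handle}.
TREE = {
    "root": {"usr": "h1"},
    "h1": {"local": "h2"},
    "h2": {"readme": "h3"},
}
lookup_calls = []


def lookup(dir_handle, name):
    """Stand-in for one lookup call to the server."""
    lookup_calls.append((dir_handle, name))
    return TREE[dir_handle][name]


def open_path(root_handle, path):
    """Resolve a path one component at a time, as a client would."""
    handle = root_handle
    for component in path.strip("/").split("/"):
        handle = lookup(handle, component)
    return handle


handle = open_path("root", "/usr/local/readme")
print(handle, len(lookup_calls))
```

Each call is self-contained: it names the directory handle and the component, so the server keeps no record of a partially resolved path.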
Server side (II)
• File handle consists of
• Filesystem id identifying disk partition
• I-node number identifying file within
partition
• Generation number changed every time
i-node is reused to store a new file
• Server will store
• Filesystem id in filesystem superblock
• I-node generation number in i-node
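The role of the generation number can be illustrated with a short sketch. The `FileHandle` tuple mirrors the three fields listed above; the `Partition` class and its methods are hypothetical and only model the bookkeeping.

```python
from collections import namedtuple

FileHandle = namedtuple("FileHandle", ["fsid", "inode", "generation"])


class Partition:
    """Hypothetical partition that tracks a generation number per i-node."""

    def __init__(self, fsid):
        self.fsid = fsid
        self.generations = {}   # i-node number -> current generation

    def create_file(self, inode):
        # Reusing an i-node for a new file bumps its generation number.
        self.generations[inode] = self.generations.get(inode, 0) + 1
        return FileHandle(self.fsid, inode, self.generations[inode])

    def is_valid(self, handle):
        return (handle.fsid == self.fsid
                and self.generations.get(handle.inode) == handle.generation)


part = Partition(fsid=7)
old = part.create_file(inode=42)
new = part.create_file(inode=42)   # old file deleted, i-node reused
print(part.is_valid(old), part.is_valid(new))
```

A client still holding `old` is told its handle is stale instead of silently reading the new file that now occupies the same i-node.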
Client side (I)
• Provides transparent interface to NFS
• Mapping between remote file names
and remote file addresses is done at
server boot time through remote
mount
• Extension of UNIX mounts
• Specified in a mount table
• Makes a remote subtree appear part of a
local subtree
Remote mount
[Figure: the client tree (/ with bin and usr) and a server subtree, joined by rmount.]
After rmount, the root of the server subtree can be accessed as /usr.
Client side (II)
• Provides transparent access to
• NFS
• New virtual filesystem interface supports
• VFS calls, which operate on whole file
system
• VNODE calls, which operate on individual
files
• Treats all files in the same fashion
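The idea of a common per-file interface can be sketched with an abstract base class. The names `Vnode`, `LocalVnode`, `NfsVnode`, and `sys_read` are hypothetical, and a lambda stands in for the RPC stub; real vnode interfaces define many more operations.

```python
from abc import ABC, abstractmethod


class Vnode(ABC):
    """Common per-file interface; each file system supplies its own version."""

    @abstractmethod
    def read(self, offset, count):
        ...


class LocalVnode(Vnode):
    def __init__(self, data):
        self.data = data

    def read(self, offset, count):
        return self.data[offset:offset + count]   # served from local storage


class NfsVnode(Vnode):
    def __init__(self, remote_read, handle):
        self.remote_read = remote_read            # stand-in for an RPC stub
        self.handle = handle

    def read(self, offset, count):
        return self.remote_read(self.handle, offset, count)


def sys_read(vnode, offset, count):
    """The system-call layer treats every vnode the same way."""
    return vnode.read(offset, count)


remote_files = {"fh1": b"remote bytes"}
local = LocalVnode(b"local bytes")
remote = NfsVnode(lambda h, off, n: remote_files[h][off:off + n], "fh1")
print(sys_read(local, 0, 5), sys_read(remote, 0, 6))
```

The caller of `sys_read` cannot tell which kind of vnode it was given, which is the sense in which all files are treated in the same fashion.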
Client side (III)
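[Figure: UNIX system calls (user interface is unchanged) sit above the VNODE/VFS layer (common interface), which dispatches to other file systems, to NFS (via RPC/XDR over the LAN), and to the UNIX FS (local disk).]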
The Mount Protocol
• The mount protocol provides four basic services
that clients need before they can use NFS:
• It allows the client to obtain a list of the directory
hierarchies (i.e. the file systems) that the client can
access through NFS.
• It accepts full path names that allow the client to identify
a particular directory hierarchy.
• It authenticates each client’s request and validates the
client’s permission to access the requested hierarchy.
• It returns a file handle for the root directory of the
hierarchy a client specifies.
• The client uses the root handle obtained from the
mount protocol when making NFS calls.
THE MOUNT PROTOCOL:
The following operations occur:
1. The client's request is sent via RPC to the mount server (on the server machine).
2. Mount server checks export list containing
a) file systems that can be exported,
b) legal requesting clients.
c) It's legitimate to mount any directory within the legal filesystem.
3. Server returns "file handle" to client.
4. Server maintains list of clients and mounted directories -- this is state information!
But this data is only a "hint" and isn't treated as essential.
5. Mounting often occurs automatically when client or server boots.
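Steps 2 to 4 can be sketched as a single function. The export table `EXPORTS`, the client names, the `mounted` list, and the string used as a file handle are all hypothetical; this does not reproduce the actual mount daemon or its wire format.

```python
# Hypothetical export list: exported file system -> clients allowed to mount it.
EXPORTS = {"/export/home": {"clientA", "clientB"}}
mounted = []   # (client, directory) pairs; only a hint, not essential state


def mount(client, directory):
    """Return a root file handle if the export list permits the request."""
    for filesystem, allowed in EXPORTS.items():
        inside = directory == filesystem or directory.startswith(filesystem + "/")
        if inside and client in allowed:
            mounted.append((client, directory))
            return "handle:" + directory   # stand-in for a real file handle
    raise PermissionError(f"{client} may not mount {directory}")


ok = mount("clientA", "/export/home/alice")
try:
    mount("clientC", "/export/home")
    denied = False
except PermissionError:
    denied = True
print(ok, denied)
```

Losing the `mounted` list in a crash would not invalidate any handle already returned, which is why the server can treat it as a hint.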
THE NFS PROTOCOL:
RPCs support these remote file operations:
a) Search for file within directory.
b) Read a set of directory entries.
c) Manipulate links and directories.
d) Read/write file attributes.
e) Read/write file data.
Note:
• NFS servers are stateless. Each request must provide all information. With a server
crash, no information is lost.
• Modified data must actually get to server disk before client is informed the action is
complete. Using a cache would imply state information.
• A single NFS write is atomic. A client write request may be broken into several atomic
RPC calls, so the whole thing is NOT atomic.
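The atomicity point can be illustrated by splitting one large write into several requests. The per-call limit `MAX_RPC_BYTES` and the helpers `split_write` and `apply_write` are hypothetical; the limit is kept tiny so the example stays readable.

```python
MAX_RPC_BYTES = 8   # hypothetical per-call limit, kept tiny for the example


def split_write(offset, data, limit=MAX_RPC_BYTES):
    """Break one client write into self-contained (offset, chunk) requests."""
    return [(offset + i, data[i:i + limit]) for i in range(0, len(data), limit)]


def apply_write(storage, offset, chunk):
    """One RPC, applied as a unit (stand-in for the server's write to disk)."""
    storage[offset:offset + len(chunk)] = chunk


storage = bytearray(b"." * 24)
requests = split_write(4, b"ABCDEFGHIJKLMNOPQRST")
for off, chunk in requests:
    apply_write(storage, off, chunk)   # another client's write could land between calls
print(len(requests), bytes(storage))
```

Each chunk carries its own offset, so the requests are independent; nothing ties the three calls together into one indivisible operation.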
NFS ARCHITECTURE:
[Figure: NFS architecture, tracing the paths taken by local and remote accesses.]
1. UNIX filesystem layer - does normal open / read / etc. commands.
2. Virtual file system (VFS) layer -
a) Gives clean layer between user and filesystem.
b) Acts as deflection point by using global vnodes.
c) Understands the difference between local and remote names.
d) Keeps in memory information about what should be deflected (mounted
directories) and how to get to these remote directories.
3. System call interface layer -
a) Presents sanitized validated requests in a uniform way to the VFS.
CACHES OF REMOTE DATA:
• The client keeps:
File block cache (the contents of a file).
File attribute cache (file header information; the inode in UNIX).
• The local kernel hangs on to the data after getting it the first time.
• On an open, the local kernel checks with the server that the cached data is still
OK.
• Cached attributes are thrown away after a few seconds.
NFS solution (I)
• Stateless server does not know how
many users are accessing a given file
• Clients do not know either
• Clients must
• Frequently send their modified blocks to
the server
• Frequently ask the server to revalidate the
blocks they have in their cache
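Periodic revalidation can be sketched as a cache whose entries expire. The class `AttributeCache`, the three-second lifetime, and the `fetch_from_server` function are assumptions made for illustration; time is passed in explicitly so the example is deterministic.

```python
class AttributeCache:
    """Client-side attribute cache whose entries expire after `ttl` seconds."""

    def __init__(self, fetch, ttl=3.0):
        self.fetch = fetch     # function that asks the server for attributes
        self.ttl = ttl         # hypothetical lifetime of a cached entry
        self.entries = {}      # file name -> (attributes, time fetched)

    def get(self, name, now):
        cached = self.entries.get(name)
        if cached is not None and now - cached[1] < self.ttl:
            return cached[0]              # still fresh: no server traffic
        attrs = self.fetch(name)          # expired or missing: ask the server
        self.entries[name] = (attrs, now)
        return attrs


server_calls = []


def fetch_from_server(name):
    server_calls.append(name)
    return {"size": 100, "mtime": 1}


cache = AttributeCache(fetch_from_server, ttl=3.0)
cache.get("a.txt", now=0.0)
cache.get("a.txt", now=1.0)   # served from the cache
cache.get("a.txt", now=5.0)   # entry expired, server asked again
print(len(server_calls))
```

A shorter lifetime narrows the window in which a client can see stale attributes, at the cost of more requests to the server.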
Hard issues (I)
• NFS root file systems cannot be shared:
• Too many problems
• Clients can mount any remote subtree
any way they want:
• Could have different names for same
subtree by mounting it in different places
• NFS uses a set of basic mounted
filesystems on each machine and lets users
do the rest
Hard issues (II)
• NFS passes user id, group id and
groups on each call
• Requires same mapping from user id and
group id to user on all machines
• NFS has no file locking
Conclusion
• To allow many clients to access a server and to keep
the servers isolated from client crashes, NFS uses
stateless servers.
• NFS adopted the open-read-write-close paradigm
used in UNIX, along with basic file types and file
protection modes.
45

Mais conteĂşdo relacionado

Semelhante a Chapter-5-DFS.ppt

Ch16 OS
Ch16 OSCh16 OS
Ch16 OSC.U
 
Presentation on nfs,afs,vfs
Presentation on nfs,afs,vfsPresentation on nfs,afs,vfs
Presentation on nfs,afs,vfsPrakriti Dubey
 
File service architecture and network file system
File service architecture and network file systemFile service architecture and network file system
File service architecture and network file systemSukhman Kaur
 
운영체제론 Ch17
운영체제론 Ch17운영체제론 Ch17
운영체제론 Ch17Jongmyoung Kim
 
Chapter 17 - Distributed File Systems
Chapter 17 - Distributed File SystemsChapter 17 - Distributed File Systems
Chapter 17 - Distributed File SystemsWayne Jones Jnr
 
Andrew File System
Andrew File SystemAndrew File System
Andrew File SystemAshish KC
 
CS9222 ADVANCED OPERATING SYSTEMS
CS9222 ADVANCED OPERATING SYSTEMSCS9222 ADVANCED OPERATING SYSTEMS
CS9222 ADVANCED OPERATING SYSTEMSKathirvel Ayyaswamy
 
Data Analytics: HDFS with Big Data : Issues and Application
Data Analytics:  HDFS  with  Big Data :  Issues and ApplicationData Analytics:  HDFS  with  Big Data :  Issues and Application
Data Analytics: HDFS with Big Data : Issues and ApplicationDr. Chitra Dhawale
 
11 distributed file_systems
11 distributed file_systems11 distributed file_systems
11 distributed file_systemslongly
 
Advanced Storage Area Network
Advanced Storage Area NetworkAdvanced Storage Area Network
Advanced Storage Area NetworkSoumee Maschatak
 
HDFS_architecture.ppt
HDFS_architecture.pptHDFS_architecture.ppt
HDFS_architecture.pptvijayapraba1
 
Ds
DsDs
DsHDRS
 
Ds
DsDs
DsHDRS
 

Semelhante a Chapter-5-DFS.ppt (20)

Ch16 OS
Ch16 OSCh16 OS
Ch16 OS
 
OS_Ch16
OS_Ch16OS_Ch16
OS_Ch16
 
OSCh16
OSCh16OSCh16
OSCh16
 
Presentation on nfs,afs,vfs
Presentation on nfs,afs,vfsPresentation on nfs,afs,vfs
Presentation on nfs,afs,vfs
 
File service architecture and network file system
File service architecture and network file systemFile service architecture and network file system
File service architecture and network file system
 
운영체제론 Ch17
운영체제론 Ch17운영체제론 Ch17
운영체제론 Ch17
 
Chapter 17 - Distributed File Systems
Chapter 17 - Distributed File SystemsChapter 17 - Distributed File Systems
Chapter 17 - Distributed File Systems
 
Andrew File System
Andrew File SystemAndrew File System
Andrew File System
 
Distributed file systems chapter 9
Distributed file systems chapter 9Distributed file systems chapter 9
Distributed file systems chapter 9
 
Dfs
DfsDfs
Dfs
 
Distributed File Systems
Distributed File SystemsDistributed File Systems
Distributed File Systems
 
CS9222 ADVANCED OPERATING SYSTEMS
CS9222 ADVANCED OPERATING SYSTEMSCS9222 ADVANCED OPERATING SYSTEMS
CS9222 ADVANCED OPERATING SYSTEMS
 
Dos unit 4
Dos unit 4Dos unit 4
Dos unit 4
 
Data Analytics: HDFS with Big Data : Issues and Application
Data Analytics:  HDFS  with  Big Data :  Issues and ApplicationData Analytics:  HDFS  with  Big Data :  Issues and Application
Data Analytics: HDFS with Big Data : Issues and Application
 
11 distributed file_systems
11 distributed file_systems11 distributed file_systems
11 distributed file_systems
 
Distributed file systems dfs
Distributed file systems   dfsDistributed file systems   dfs
Distributed file systems dfs
 
Advanced Storage Area Network
Advanced Storage Area NetworkAdvanced Storage Area Network
Advanced Storage Area Network
 
HDFS_architecture.ppt
HDFS_architecture.pptHDFS_architecture.ppt
HDFS_architecture.ppt
 
Ds
DsDs
Ds
 
Ds
DsDs
Ds
 

Último

What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxwendy cai
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxk795866
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx959SahilShah
 
An introduction to Semiconductor and its types.pptx
An introduction to Semiconductor and its types.pptxAn introduction to Semiconductor and its types.pptx
An introduction to Semiconductor and its types.pptxPurva Nikam
 
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionDr.Costas Sachpazis
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEroselinkalist12
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AIabhishek36461
 
Electronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfElectronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfme23b1001
 
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxPoojaBan
 
Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...121011101441
 
Arduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptArduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptSAURABHKUMAR892774
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile servicerehmti665
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...srsj9000
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfAsst.prof M.Gokilavani
 
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsyncWhy does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsyncssuser2ae721
 
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxConcrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxKartikeyaDwivedi3
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 

Último (20)

What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptx
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptx
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx
 
An introduction to Semiconductor and its types.pptx
An introduction to Semiconductor and its types.pptxAn introduction to Semiconductor and its types.pptx
An introduction to Semiconductor and its types.pptx
 
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
 
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Serviceyoung call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AI
 
Electronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfElectronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdf
 
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptx
 
Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...
 
Arduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptArduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.ppt
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
 
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsyncWhy does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
 
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxConcrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptx
 
Design and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdfDesign and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdf
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 

Chapter-5-DFS.ppt

  • 1. Distributed System Sanjivani Rural Education Society’s Sanjivani College of Engineering, Kopargaon-423603 (An Autonomous Institute Affiliated to Savitribai Phule Pune University, Pune) NAAC ‘A’ Grade Accredited, ISO 9001:2015 Certified Department of Information Technology (NBA Accredited) Dr. R. D. Chintamani Asst. Prof.
  • 3. 3 DISTRIBUTED FILE SYSTEMS DEFINITIONS: • A Distributed File System ( DFS ) is simply a classical model of a file system distributed across multiple machines. The purpose is to promote sharing of dispersed files. • This is an area of active research interest today. • The resources on a particular machine are local to itself. Resources on other machines are remote. • A file system provides a service for clients. The server interface is the normal set of file operations: create, read, etc. on files.
  • 4. 4 Introduction • Distributed file systems support the sharing of information in the form of files throughout the intranet. • A distributed file system enables programs to store and access remote files exactly as they do on local ones, allowing users to access files from any computer on the intranet. • Recent advances in higher bandwidth connectivity of switched local networks and disk organization have lead high performance and highly scalable file systems.
  • 5. 5 DISTRIBUTED FILE SYSTEMS Clients, servers, and storage are dispersed across machines. Configuration and implementation may vary - a) Servers may run on dedicated machines, OR b) Servers and clients can be on the same machines. c) The OS itself can be distributed with the file system a part of that distribution. d) A distribution layer can be interposed between a conventional OS and the file system. Clients should view a DFS the same way they would a centralized FS; the distribution is hidden at a lower level. Performance is concerned with throughput and response time. Definitions
  • 6. 6 DISTRIBUTED FILE SYSTEMS Naming is the mapping between logical and physical objects. • Example: A user filename maps to <cylinder, sector>. • In a conventional file system, it's understood where the file actually resides; the system and disk are known. • In a transparent DFS, the location of a file, somewhere in the network, is hidden. • File replication means multiple copies of a file; mapping returns a SET of locations for the replicas. Location transparency - a)The name of a file does not reveal any hint of the file's physical storage location. b)File name still denotes a specific, although hidden, set of physical disk blocks. Naming and Transparency
  • 7. 7 DISTRIBUTED FILE SYSTEMS The ANDREW DFS AS AN EXAMPLE: • Is location independent. • Supports file mobility. • Separation of FS and OS allows for disk-less systems. These have lower cost and convenient system upgrades. The performance is not as good. NAMING SCHEMES: There are three main approaches to naming files: 1. Files are named with a combination of host and local name. • This guarantees a unique name. NEITHER location transparent NOR location independent. • Same naming works on local and remote files. The DFS is a loose collection of independent file systems. Naming and Transparency
  • 8. 8 DISTRIBUTED FILE SYSTEMS NAMING SCHEMES: 2. Remote directories are mounted to local directories. • So a local system seems to have a coherent directory structure. • The remote directories must be explicitly mounted. The files are location independent. • SUN NFS is a good example of this technique. 3. A single global name structure spans all the files in the system. • The DFS is built the same way as a local filesystem. Location independent. Naming and Transparency
  • 9. 9 DISTRIBUTED FILE SYSTEMS IMPLEMENTATION TECHNIQUES: • A non-transparent mapping technique: name ----> < system, disk, cylinder, sector > • A transparent mapping technique: name ----> file_identifier ----> < system, disk, cylinder, sector > • So when changing the physical location of a file, only the file identifier need be modified. This identifier must be "unique". Naming and Transparency
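A minimal sketch of the transparent mapping technique, assuming two in-memory tables. The class and method names (`TransparentNamer`, `migrate`) and the example locations are hypothetical; the point is only that moving a file updates the identifier's location entry and leaves the user-visible name alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    """Physical address: < system, disk, cylinder, sector >."""
    system: str
    disk: int
    cylinder: int
    sector: int


class TransparentNamer:
    """Two-level mapping: name -> file_identifier -> Location."""

    def __init__(self):
        self._name_to_id = {}
        self._id_to_location = {}
        self._next_id = 1  # identifiers are never reused, so they stay unique

    def create(self, name, location):
        file_id = self._next_id
        self._next_id += 1
        self._name_to_id[name] = file_id
        self._id_to_location[file_id] = location
        return file_id

    def resolve(self, name):
        return self._id_to_location[self._name_to_id[name]]

    def migrate(self, name, new_location):
        # Only the identifier's location entry changes; the name is untouched.
        self._id_to_location[self._name_to_id[name]] = new_location


namer = TransparentNamer()
namer.create("/docs/report.txt", Location("hostA", 0, 12, 7))
namer.migrate("/docs/report.txt", Location("hostB", 1, 3, 44))
print(namer.resolve("/docs/report.txt").system)
```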
  • 10. 10 DISTRIBUTED FILE SYSTEMS CACHING • Reduce network traffic by retaining recently accessed disk blocks in a cache, so that repeated accesses to the same information can be handled locally. • If required data is not already cached, a copy of the data is brought from the server to the user. • Perform accesses on the cached copy. • Files are identified with one master copy residing at the server machine. • Copies of (parts of) the file are scattered in different caches. • Cache Consistency Problem -- Keeping the cached copies consistent with the master file. • A remote service (RPC) has these characteristic steps: a) The client makes a request for file access. b) The request is passed to the server in message format. c) The server makes the file access. d) Return messages bring the result back to the client. • This is equivalent to performing a disk access for each request. Remote File Access
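The difference between caching and plain remote service can be sketched with a small client-side block cache. This is an illustration under assumptions: `fetch_from_server` stands in for the RPC round trip, and the least-recently-used eviction policy is a choice made here, not something the slide specifies.

```python
from collections import OrderedDict


class BlockCache:
    """Client-side cache of (file_id, block_no) -> block data."""

    def __init__(self, fetch_from_server, capacity=4):
        self._fetch = fetch_from_server  # stands in for the remote request
        self._capacity = capacity
        self._blocks = OrderedDict()
        self.server_requests = 0

    def read(self, file_id, block_no):
        key = (file_id, block_no)
        if key in self._blocks:
            # Hit: served locally, no network traffic.
            self._blocks.move_to_end(key)
            return self._blocks[key]
        # Miss: bring a copy of the block from the server.
        self.server_requests += 1
        data = self._fetch(file_id, block_no)
        self._blocks[key] = data
        if len(self._blocks) > self._capacity:
            self._blocks.popitem(last=False)  # evict least recently used
        return data


cache = BlockCache(lambda f, b: f"{f}:{b}".encode(), capacity=2)
cache.read("a", 0)
cache.read("a", 0)
print(cache.server_requests)
```

Without the cache, every `read` would cost one server request, which is the remote-service behaviour described in the last bullet.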
  • 11. 11 DISTRIBUTED FILE SYSTEMS CACHE LOCATION: • Caching is a mechanism for maintaining disk data on the local machine. This data can be kept in the local memory or in the local disk. Caching can be advantageous both for read ahead and read again. • The cost of getting data from a cache is a few HUNDRED instructions; disk accesses cost THOUSANDS of instructions. • The master copy of a file doesn't move, but caches contain replicas of portions of the file. • Caching behaves just like "networked virtual memory". • What should be cached? << blocks <---> files >>. Bigger sizes give a better hit rate; smaller give better transfer times. • Caching on disk gives: — Better reliability. • Caching in memory gives: — The possibility of diskless work stations, — Greater speed, Remote File Access
  • 12. 12 DISTRIBUTED FILE SYSTEMS COMPARISON OF CACHING AND REMOTE SERVICE: • Many remote accesses can be handled by a local cache. There's a great deal of locality of reference in file accesses. Servers can be accessed only occasionally rather than for each access. • Caching causes data to be moved in a few big chunks rather than in many smaller pieces; this leads to considerable efficiency for the network. • Disk accesses can be better optimized on the server if it's understood that requests are always for large contiguous chunks. • Caching works best on machines with considerable local store - either local disks or large memories. Remote File Access
  • 13. 13 DISTRIBUTED FILE SYSTEMS STATEFUL VS. STATELESS SERVICE: Stateful: A server keeps track of information about client requests. • It maintains what files are opened by a client; connection identifiers; server caches. • Memory must be reclaimed when client closes file or when client dies. Stateless: Each client request provides complete information needed by the server (i.e., filename, file offset ). • The server can maintain information on behalf of the client, but it's not required. Remote File Access
  • 14. 14 DISTRIBUTED FILE SYSTEMS STATEFUL VS. STATELESS SERVICE: Performance is better for stateful. • Don't need to parse the filename each time, or "open/close" file on every request. Fault Tolerance: A stateful server loses everything when it crashes. • Server must poll clients in order to renew its state. • Client crashes force the server to clean up its encached information. • Stateless remembers nothing so it can start easily after a crash. Remote File Access
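The contrast between the two service styles can be sketched as follows. The file table, class, and function names are hypothetical; the stateful server remembers each open file's offset, while the stateless read receives the filename and offset on every call.

```python
FILES = {"notes.txt": b"distributed file systems"}  # illustrative server storage


class StatefulServer:
    """Remembers open files and current offsets per connection."""

    def __init__(self):
        self._open = {}  # connection id -> [filename, offset]
        self._next_conn = 1

    def open(self, filename):
        conn = self._next_conn
        self._next_conn += 1
        self._open[conn] = [filename, 0]
        return conn

    def read(self, conn, count):
        filename, offset = self._open[conn]
        data = FILES[filename][offset:offset + count]
        self._open[conn][1] = offset + len(data)
        return data

    def crash(self):
        # All per-client state is lost when the server goes down.
        self._open.clear()


def stateless_read(filename, offset, count):
    """Each request carries everything the server needs."""
    return FILES[filename][offset:offset + count]


server = StatefulServer()
conn = server.open("notes.txt")
print(server.read(conn, 11))
print(stateless_read("notes.txt", 12, 4))
```

After `server.crash()`, the old connection identifier is meaningless and the read fails, whereas the stateless call works unchanged.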
  • 15. 15 DISTRIBUTED FILE SYSTEMS FILE REPLICATION: • Duplicating files on multiple machines improves availability and performance. • Placed on failure-independent machines ( they won't fail together ). • The main problem is consistency - when one copy changes, how do other copies reflect that change? Often there is a tradeoff: consistency versus availability and performance. Remote File Access
  • 16. 16 General File Service Architecture • The responsibilities of a DFS are typically distributed among three modules: • Client module which emulates the conventional file system interface • Server modules(2) which perform operations for clients on directories and on files. • Most importantly this architecture enables stateless implementation of the server modules.
  • 17. 17 File service architecture [Figure: on the client computer, application programs call the client module; on the server computer, the flat file service and the directory service handle its requests.]
  • 18. File Service Architecture 18 • Flat File Service: • Concerned with implementing operations on the contents of files. • Unique File Identifiers (UFIDs) are used to refer to files in all requests for flat file service operations. • The division of responsibilities between the file service and the directory service is based on the use of UFIDs (long sequences of bits chosen so that each file has a UFID that is unique across the distributed system).
  • 19. File Service Architecture 19 • Directory Service: • Provides a mapping between text names for files and their UFIDs. • Clients obtain the UFID of a file by quoting its text name to the directory service. • Client Module: • Runs on each client computer. • Integrates and extends the operations of the flat file service and the directory service under a single application programming interface.
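The three modules can be sketched as small in-memory classes. All class and method names here are hypothetical, and using a random 128-bit hex string as the UFID is an assumption made for illustration.

```python
import secrets


class FlatFileService:
    """Operates on file contents; files are named only by UFID."""

    def __init__(self):
        self._files = {}

    def create(self):
        ufid = secrets.token_hex(16)  # assumed UFID format: random 128 bits
        self._files[ufid] = bytearray()
        return ufid

    def write(self, ufid, offset, data):
        buf = self._files[ufid]
        if len(buf) < offset:
            buf.extend(b"\0" * (offset - len(buf)))  # pad any gap with zeros
        buf[offset:offset + len(data)] = data

    def read(self, ufid, offset, count):
        return bytes(self._files[ufid][offset:offset + count])


class DirectoryService:
    """Maps text names to UFIDs."""

    def __init__(self):
        self._names = {}

    def add_name(self, name, ufid):
        self._names[name] = ufid

    def lookup(self, name):
        return self._names[name]


class ClientModule:
    """Combines both services behind one file-like interface."""

    def __init__(self, files, directory):
        self._files = files
        self._dir = directory

    def write_file(self, name, data):
        try:
            ufid = self._dir.lookup(name)
        except KeyError:
            ufid = self._files.create()
            self._dir.add_name(name, ufid)
        self._files.write(ufid, 0, data)

    def read_file(self, name, count):
        return self._files.read(self._dir.lookup(name), 0, count)


client = ClientModule(FlatFileService(), DirectoryService())
client.write_file("a.txt", b"hello")
print(client.read_file("a.txt", 5))
```

Every flat file request names the file by UFID and carries its own offset, which is what allows the server modules to be stateless.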
  • 20. What is NFS? • First commercially successful network file system: • Developed by Sun Microsystems for their diskless workstations • Designed for robustness and “adequate performance” • Sun published all protocol specifications • Many many implementations 20
  • 21. 21 OVERVIEW: • Runs on SUNOS - NFS is both an implementation and a specification of how to access remote files. It's both a definition and a specific instance. • The goal: to share a file system in a transparent way. • Uses client-server model ( for NFS, a node can be both simultaneously.) Can act between any two nodes ( no dedicated server. ) • Mount makes a server file-system visible from a client. DISTRIBUTED FILE SYSTEMS SUN Network File System
  • 22. highlights • NFS is stateless • All client requests must be self-contained • The virtual file system interface • VFS operations • VNODE operations • Performance issues • Impact of tuning on NFS performance 22
  • 23. Objectives (I) • Machine and Operating System Independence • Could be implemented on low-end machines of the mid-80’s • Fast Crash Recovery • Major reason behind stateless design • Transparent Access • Remote files should be accessed in exactly the same way as local files 23
  • 24. Objectives (II) • UNIX semantics should be maintained on client • Best way to achieve transparent access • “Reasonable” performance • Robustness and preservation of UNIX semantics were much more important 24
  • 25. Basic design • Three important parts • The protocol • The server side • The client side 25
  • 26. The protocol (I) • Uses the Sun RPC mechanism and Sun eXternal Data Representation (XDR) standard • Defined as a set of remote procedures • Protocol is stateless • Each procedure call contains all the information necessary to complete the call 26
  • 27. Advantages of statelessness • Crash recovery is very easy: • When a server crashes, client just resends request until it gets an answer from the rebooted server • Client cannot tell difference between a server that has crashed and recovered and a slow server • Client can always repeat any request 27
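A sketch of this recovery behaviour, assuming a hypothetical `FlakyServer` that fails to answer a fixed number of times before it "reboots". The client simply repeats the same self-contained request; it needs no recovery protocol of its own.

```python
class FlakyServer:
    """Stands in for a server that is down for the first few requests."""

    def __init__(self, failures_before_reboot):
        self._remaining = failures_before_reboot
        self._data = b"stateless servers recover easily"

    def read(self, offset, count):
        if self._remaining > 0:
            self._remaining -= 1
            raise TimeoutError("no reply")
        return self._data[offset:offset + count]


def read_with_retry(server, offset, count, max_attempts=10):
    """Resend the identical request until the server answers."""
    for attempt in range(1, max_attempts + 1):
        try:
            return server.read(offset, count), attempt
        except TimeoutError:
            continue  # crashed or merely slow: the client cannot tell
    raise TimeoutError("server did not answer")


data, attempts = read_with_retry(FlakyServer(3), 0, 9)
print(data, attempts)
```

Repeating is safe here because a read at an explicit offset gives the same result however many times it is sent.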
  • 28. Consequences of statelessness • Reads and writes must specify their start offset • Server does not keep track of current position in the file • Users still use conventional UNIX reads and writes • Open system call translates into several lookup calls to server 28
  • 29. Server side (II) • File handle consists of • Filesystem id identifying disk partition • I-node number identifying file within partition • Generation number changed every time i-node is reused to store a new file • Server will store • Filesystem id in filesystem superblock • I-node generation number in i-node 29
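The role of the generation number can be sketched as below. The `InodeTable` class is a hypothetical stand-in for the server's on-disk structures; it only shows how a handle that refers to a reused i-node is detected as stale.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileHandle:
    fsid: int        # identifies the disk partition
    inode: int       # identifies the file within the partition
    generation: int  # changes whenever the i-node is reused


class InodeTable:
    """Tracks the current generation number of each i-node."""

    def __init__(self, fsid):
        self.fsid = fsid
        self._generation = {}

    def allocate(self, inode):
        # Storing a new file in this i-node bumps its generation number.
        gen = self._generation.get(inode, 0) + 1
        self._generation[inode] = gen
        return FileHandle(self.fsid, inode, gen)

    def is_valid(self, handle):
        return (handle.fsid == self.fsid
                and self._generation.get(handle.inode) == handle.generation)


table = InodeTable(fsid=7)
old = table.allocate(42)
new = table.allocate(42)  # i-node 42 reused for a different file
print(table.is_valid(old), table.is_valid(new))
```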
  • 30. Client side (I) • Provides transparent interface to NFS • Mapping between remote file names and remote file addresses is done at server boot time through remote mount • Extension of UNIX mounts • Specified in a mount table • Makes a remote subtree appear part of a local subtree 30
  • 31. Remote mount [Figure: client tree ( /, bin, usr ) with a server subtree attached by rmount.] After rmount, the root of the server subtree can be accessed as /usr. 31
  • 32. Client side (II) • Provides transparent access to • NFS • New virtual filesystem interface supports • VFS calls, which operate on whole file system • VNODE calls, which operate on individual files • Treats all files in the same fashion 32
  • 33. Client side (III) [Figure: UNIX system calls enter the common VNODE/VFS interface, which dispatches to the UNIX FS (disk), to NFS (RPC/XDR over the LAN), or to other file systems.] User interface is unchanged. 33
  • 34. The Mount Protocol • The mount protocol provides four basic services that clients need before they can use NFS: • It allows the client to obtain a list of the directory hierarchies (i.e. the file systems) that the client can access through NFS. • It accepts full path names that allow the client to identify a particular directory hierarchy. • It authenticates each client’s request and validates the client’s permission to access the requested hierarchy. • It returns a file handle for the root directory of the hierarchy a client specifies. • The client uses the root handle obtained from the mount protocol when making NFS calls. 34
  • 35. 35 THE MOUNT PROTOCOL: The following operations occur: 1. The client's request is sent via RPC to the mount server ( on server machine.) 2. Mount server checks export list containing a) file systems that can be exported, b) legal requesting clients. c) It's legitimate to mount any directory within the legal filesystem. 3. Server returns "file handle" to client. 4. Server maintains list of clients and mounted directories -- this is state information! But this data is only a "hint" and isn't treated as essential. 5. Mounting often occurs automatically when client or server boots. DISTRIBUTED FILE SYSTEMS SUN Network File System
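Step 2, the export-list check, might look like the sketch below. The `EXPORTS` table, the client names, and the string used as a "file handle" are all hypothetical; a real handle is an opaque token built by the server.

```python
import posixpath

# Hypothetical export list: exported file system -> clients allowed to mount it.
EXPORTS = {"/export/home": {"client1", "client2"}}


def mount(client, path):
    """Return a handle for path if the client may mount it."""
    path = posixpath.normpath(path)  # collapse "..", "//" and trailing "/"
    for root, allowed in EXPORTS.items():
        # Any directory inside an exported file system may be mounted.
        inside = path == root or path.startswith(root + "/")
        if inside and client in allowed:
            return f"handle:{path}"
    raise PermissionError(f"{client} may not mount {path}")


print(mount("client1", "/export/home/alice"))
```

Normalizing the path before the check keeps a request such as `/export/home/../secret` from slipping past the prefix test.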
  • 36. 36 THE NFS PROTOCOL: RPCs support these remote file operations: a) Search for file within directory. b) Read a set of directory entries. c) Manipulate links and directories. d) Read/write file attributes. e) Read/write file data. Note: • NFS servers are stateless. Each request must provide all information. With a server crash, no information is lost. • Modified data must actually get to server disk before client is informed the action is complete. Using a cache would imply state information. • A single NFS write is atomic. A client write request may be broken into several atomic RPC calls, so the whole thing is NOT atomic. DISTRIBUTED FILE SYSTEMS SUN Network File System
  • 37. 37 NFS ARCHITECTURE: Follow local and remote access through this figure: DISTRIBUTED FILE SYSTEMS SUN Network File System
  • 38. 38 NFS ARCHITECTURE: 1. UNIX filesystem layer - does normal open / read / etc. commands. 2. Virtual file system ( VFS ) layer - a) Gives clean layer between user and filesystem. b) Acts as deflection point by using global vnodes. c) Understands the difference between local and remote names. d) Keeps in memory information about what should be deflected (mounted directories) and how to get to these remote directories. 3. System call interface layer - a) Presents sanitized validated requests in a uniform way to the VFS. DISTRIBUTED FILE SYSTEMS SUN Network File System
  • 39. 39 CACHES OF REMOTE DATA: • The client keeps: File block cache - ( the contents of a file ) File attribute cache - ( file header info (inode in UNIX) ). • The local kernel hangs on to the data after getting it the first time. • On an open, the local kernel checks with the server that the cached data is still OK. • Cached attributes are thrown away after a few seconds. DISTRIBUTED FILE SYSTEMS SUN Network File System
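The attribute cache with its short lifetime can be sketched like this. The 3-second lifetime is an assumed value standing in for "a few seconds", and `fetch_attrs` is a hypothetical stand-in for the attribute request sent to the server.

```python
import time


class AttributeCache:
    """Caches file attributes and discards them after ttl_seconds."""

    def __init__(self, fetch_attrs, ttl_seconds=3.0, clock=time.monotonic):
        self._fetch = fetch_attrs
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the timeout can be tested
        self._entries = {}   # path -> (time fetched, attributes)
        self.server_requests = 0

    def get(self, path):
        now = self._clock()
        entry = self._entries.get(path)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh: answered locally
        # Missing or expired: ask the server again.
        self.server_requests += 1
        attrs = self._fetch(path)
        self._entries[path] = (now, attrs)
        return attrs


cache = AttributeCache(lambda path: {"size": 10})
cache.get("/f")
cache.get("/f")
print(cache.server_requests)
```

A shorter lifetime narrows the window in which a client can see stale attributes, at the price of more server requests.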
  • 40. NFS solution (I) • Stateless server does not know how many users are accessing a given file • Clients do not know either • Clients must • Frequently send their modified blocks to the server • Frequently ask the server to revalidate the blocks they have in their cache 40
  • 41. Hard issues (I) • NFS root file systems cannot be shared: • Too many problems • Clients can mount any remote subtree any way they want: • Could have different names for same subtree by mounting it in different places • NFS uses a set of basic mounted filesystems on each machine and lets users do the rest 41
  • 42. Hard issues (II) • NFS passes user id, group id and groups on each call • Requires same mapping from user id and group id to user on all machines • NFS has no file locking 42
  • 43. Conclusion • To allow many clients to access a server and to keep the servers isolated from client crashes, NFS uses stateless servers. • NFS adopted the open-read-write-close paradigm used in UNIX, along with basic file types and file protection modes.