SlideShare a Scribd company logo
1 of 22
Download to read offline
Introduction

Classification

Lustre

Conclusion

Distributed File Systems
An Overview
Anant Narayanan
Cluster and Grid Computing
Vrije Universiteit

11 March 2009

Anant Narayanan
Distributed File Systems

Cluster and Grid Computing Vrije Universiteit
Introduction

Classification

Lustre

Conclusion

Outline
Introduction
Classification
Storage
Fault Tolerance
Applications
Lustre
Overview
Architecture
Implementation

Anant Narayanan
Distributed File Systems

Cluster and Grid Computing Vrije Universiteit
Introduction

Classification

Lustre

Conclusion

Definition

Allows access to files located on a remote host
In a transparent manner, as though the client is actually
working on the host

Typically, clients do not have access to the underlying
block storage
They interact over the network using a protocol

Anant Narayanan
Distributed File Systems

Cluster and Grid Computing Vrije Universiteit
Introduction

Classification

Lustre

Conclusion

Why do we need them?
Distributed applications usually require a common data
store
Eases ability to keep data consistent
Access control is possible both on the server and client
Depending on how the protocol is designed

Allows for implementation of
Replication
Fault tolerance

Anant Narayanan
Distributed File Systems

Cluster and Grid Computing Vrije Universiteit
Introduction

Classification

Lustre

Conclusion

Storage

Block Oriented

“Usual” meaning of a file system
Deal with storing data on a block basis
Most distributed file systems are based on this at the
lowest level
Examples: ext3, NTFS, HFS+

Anant Narayanan
Distributed File Systems

Cluster and Grid Computing Vrije Universiteit
Introduction

Classification

Lustre

Conclusion

Storage

Record Oriented

Were used on Mainframes and Minicomputers
Fetch and put whole records, seek to boundaries
Have a lot in common with today’s databases
Examples: Files-11, Virtual Storage Access Method
(VSAM)

Anant Narayanan
Distributed File Systems

Cluster and Grid Computing Vrije Universiteit
Introduction

Classification

Lustre

Conclusion

Storage

Object Oriented

Splits file metadata from file data
File data is further split into objects
Objects stored on object storage servers
May or may not have a block oriented FS at the lowest
layer
Examples: Lustre, XtreemFS

Anant Narayanan
Distributed File Systems

Cluster and Grid Computing Vrije Universiteit
Introduction

Classification

Lustre

Conclusion

Fault Tolerance

High Availability

Replication
Parallel Striping
Examples: Brtfs, Coda, GlusterFS

Anant Narayanan
Distributed File Systems

Cluster and Grid Computing Vrije Universiteit
Introduction

Classification

Lustre

Conclusion

Applications

Clusters

Shared disk systems (GFS)
Distributed disk systems (Lustre)
Typically used on local networks
Fast network access

Anant Narayanan
Distributed File Systems

Cluster and Grid Computing Vrije Universiteit
Introduction

Classification

Lustre

Conclusion

Applications

Grids or Clouds

Dynamic nature
Deal with heterogeneity
Deal with VOs (Grids) and SLAs (Clouds)
Examples: XtreemFS, Dynamo

Anant Narayanan
Distributed File Systems

Cluster and Grid Computing Vrije Universiteit
Introduction

Classification

Lustre

Conclusion

Overview

Linux + Cluster
Distributed, parallel, fault tolerant, object based file system
Tens of thousands of nodes
Petabytes of storage capacity
Hundreds of Gigabytes / second of throughput
Without compromising on speed or security
Small workgroup clusters, to large-scale, multi-site
clusters, to super-computers

Anant Narayanan
Distributed File Systems

Cluster and Grid Computing Vrije Universiteit
Introduction

Lustre Clusters
Classification

Lustre

Conclusion

Lustre clusters contain three kinds of systems:
Architecture

Layout

• File system clients, which can be used to access the file system
• Object storage servers (OSS), which provide file I/O service
• Metadata servers (MDS), which manage the names and directories in the file system
MDS disk storage containing
Metadata Targets (MDT)

Pool of clustered MDS servers
1-100

OSS servers
1-1000s

MDS 1
(active)

MDS 2
(standby)

Elan
Myrinet
InfiniBand

OSS 1

Commodity Storage

OSS 2

Simultaneous
support of multiple
network types

Lustre clients
1-100,000

OSS storage with object
storage targets (OST)

OSS 3

Shared storage
enables failover OSS

OSS 4

Router

OSS 5

GigE
OSS 6

= Failover

OSS 7

Enterprise-Class
Storage Arrays and
SAN Fabric

Figure 1. Systems in a Lustre cluster

Anant Narayanan
Distributed File Systems

Cluster and Grid Computing Vrije Universiteit
Introduction

Classification

Lustre

Conclusion

Architecture

Components

File system clients: used to access the file system
Object storage servers (OSS): provide file I/O service,
deals with block storage
Metadata servers (MDS): manage the names and
directories in the file system, deals with authentication

Anant Narayanan
Distributed File Systems

Cluster and Grid Computing Vrije Universiteit
4

Lustre Clusters

Introduction

Sun Microsystems, Inc.

Classification

Lustre

Conclusion

Architecture

The following
Characteristicstable shows the characteristics associated with each of the three types
of systems.
Typical number
of systems

Performance

Required
attached
storage

Desirable
hardware
characteristics

Clients

1–100,000

1 GB/sec I/O,
1000 metadata
ops

None

None

OSS

1–1000

500 MB/sec —
2.5 GB/sec

File system
capacity/OSS
count

Good bus
bandwidth

MDS

2 (in the future
2–100)

3000–15,000
metadata
ops/sec
(operations)

1–2% of file
system capacity

Adequate CPU
power, plenty of
memory

The storage attached to the servers is partitioned, optionally organized with logical
Anant Narayanan

Cluster OSS and MDS
volume management (LVM) and formatted as file systems. The Lustreand Grid Computing Vrije Universiteit

servers read, write, and modify data in the format imposed by these file systems. Each
Distributed File Systems
Introduction

Classification

Lustre

Conclusion

Architecture

Heterogeneous?

MDS and OSS may store actual data on ext3 or ZFS block
file systems
Infiniband, TCP/IP over Ethernet and Myrinet are
supported network types
Multiple CPU architectures: x86, x86_64, PPC
Requires patched Linux kernel

Anant Narayanan
Distributed File Systems

Cluster and Grid Computing Vrije Universiteit
Introduction

Classification

Lustre

Conclusion

Implementation

Setup
MDS
mkfs.lustre -mdt -mgs -fsname=large-fs
/dev/sdamount -t lustre /dev/sda /mnt/mdt
OSS1
mkfs.lustre -ost -fsname=large-fs
-mgsnode=mds@tcp0 /dev/sdb mount -t lustre
/dev/sdb /mnt/ost1
Client
mount -t lustre mds.your.org:/large-fs
/mnt/lustre-client

Anant Narayanan
Distributed File Systems

Cluster and Grid Computing Vrije Universiteit
Introduction

Classification

Lustre

Conclusion

Implementation

Networking
LNET abstracts over multiple supported networks
Provides the communication infrastructure required by
Lustre
Takes care of abstracting over fail-over servers, load
balancing
Provides support for Remote Direct Memory Access
(RDMA)
Provides an end-to-end throughput of 100MB per sec on
Gigabit Ethernet networks
Upto 1.5GB per sec on Infiniband
Anant Narayanan
Distributed File Systems

Cluster and Grid Computing Vrije Universiteit
Introduction

Implementation

Where Are the Files?
Classification

Lustre

Conclusion

Traditional UNIX disk file systems use inodes, which contain lists of block numbers
where the file data for the inode is stored. Similarly, one inode exists on the MDT for

each file the Lustre
Where aredoesinnot point to file system. However, in points to one or more objects associated
the files?blocks, but instead the Lustre file system, the inode on the
MDT
data
with the files. This is illustrated in Figure 5.

File on MDT
Extended
Attributes

Ordinary ext3 File
obj1

oss2

obj3

oss3

Direct
Data Blocks

oss1

obj2

Data
Block
ptrs
Indirect
Double
Indirect

inode

inode

Indirect
Data Blocks

Figure 5. MDS inodes point to objects; ext3 inodes point to data

Anant Narayanan

These objects are implemented as files on the OST file systems and contain file data.
Cluster and Grid Computing Vrije Universiteit
Figure 6 shows how a file open operation transfers the object pointers from the MDS

Distributed File Systems
Introduction

Classification

Lustre

Conclusion

Implementation

Striping and Replication

One object per MDS inode implies “unstriped” data
Multiple objects per MDS inode implies that the file has
been split, similar to RAID 0
These stripes may be duplicated across several OSS
Provides fault tolerance and high availability

Anant Narayanan
Distributed File Systems

Cluster and Grid Computing Vrije Universiteit
Introduction

Classification

Lustre

Conclusion

Conclusion

Reliable, Scalable and Performant filesystem
Open Architecture and Protocols
BUT
Does not handle dynamic addition / removal of servers
Does not provide the kind of access control and security
that a VO might need

Anant Narayanan
Distributed File Systems

Cluster and Grid Computing Vrije Universiteit
Introduction

Classification

Lustre

Conclusion

What next?

Other distributed file systems and implementation details
Grids (XtreemFS), Clouds (Dynamo)
Questions?

Anant Narayanan
Distributed File Systems

Cluster and Grid Computing Vrije Universiteit
Introduction

Classification

Lustre

Conclusion

References
You may be required to register on the Sun Website to access these documents!

Datasheet:
http://www.sun.com/software/products/lustre/
datasheet.pdf
Scalable Cluster Filesystem Whitepaper:
http://www.sun.com/offers/docs/LustreFileSystem.pdf
LNET:
http://www.sun.com/offers/docs/
lustre_networking.pdf
Lustre Documentation Index:
http://manual.lustre.org/index.php?title=Main_Page
Lustre Publications Index:
http://wiki.lustre.org/
index.php?title=Lustre_Publications
Anant Narayanan
Distributed File Systems

Cluster and Grid Computing Vrije Universiteit

More Related Content

What's hot

Introduction to distributed file systems
Introduction to distributed file systemsIntroduction to distributed file systems
Introduction to distributed file systemsViet-Trung TRAN
 
Distributed file systems
Distributed file systemsDistributed file systems
Distributed file systemsSri Prasanna
 
Chapter 8 distributed file systems
Chapter 8 distributed file systemsChapter 8 distributed file systems
Chapter 8 distributed file systemsAbDul ThaYyal
 
Operating System : Ch17 distributed file systems
Operating System : Ch17 distributed file systemsOperating System : Ch17 distributed file systems
Operating System : Ch17 distributed file systemsSyaiful Ahdan
 
Distributed file system
Distributed file systemDistributed file system
Distributed file systemJanani S
 
Distributed file system
Distributed file systemDistributed file system
Distributed file systemAnamika Singh
 
Dfs (Distributed computing)
Dfs (Distributed computing)Dfs (Distributed computing)
Dfs (Distributed computing)Sri Prasanna
 
Distribution File System DFS Technologies
Distribution File System DFS TechnologiesDistribution File System DFS Technologies
Distribution File System DFS TechnologiesRaphael Ejike
 
Distributed file system
Distributed file systemDistributed file system
Distributed file systemNaza hamed Jan
 
Chapter 17 - Distributed File Systems
Chapter 17 - Distributed File SystemsChapter 17 - Distributed File Systems
Chapter 17 - Distributed File SystemsWayne Jones Jnr
 
Distributed Filesystems Review
Distributed Filesystems ReviewDistributed Filesystems Review
Distributed Filesystems ReviewSchubert Zhang
 
Self-Adapting, Energy-Conserving Distributed File Systems
Self-Adapting, Energy-Conserving Distributed File SystemsSelf-Adapting, Energy-Conserving Distributed File Systems
Self-Adapting, Energy-Conserving Distributed File SystemsMário Almeida
 
3. distributed file system requirements
3. distributed file system requirements3. distributed file system requirements
3. distributed file system requirementsAbDul ThaYyal
 
11 distributed file_systems
11 distributed file_systems11 distributed file_systems
11 distributed file_systemslongly
 
4.file service architecture (1)
4.file service architecture (1)4.file service architecture (1)
4.file service architecture (1)AbDul ThaYyal
 
Presentation on nfs,afs,vfs
Presentation on nfs,afs,vfsPresentation on nfs,afs,vfs
Presentation on nfs,afs,vfsPrakriti Dubey
 
Chapter 10 - File System Interface
Chapter 10 - File System InterfaceChapter 10 - File System Interface
Chapter 10 - File System InterfaceWayne Jones Jnr
 

What's hot (20)

Introduction to distributed file systems
Introduction to distributed file systemsIntroduction to distributed file systems
Introduction to distributed file systems
 
Distributed file systems
Distributed file systemsDistributed file systems
Distributed file systems
 
Chapter 8 distributed file systems
Chapter 8 distributed file systemsChapter 8 distributed file systems
Chapter 8 distributed file systems
 
12. dfs
12. dfs12. dfs
12. dfs
 
Operating System : Ch17 distributed file systems
Operating System : Ch17 distributed file systemsOperating System : Ch17 distributed file systems
Operating System : Ch17 distributed file systems
 
Distributed file system
Distributed file systemDistributed file system
Distributed file system
 
Distributed file system
Distributed file systemDistributed file system
Distributed file system
 
5.distributed file systems
5.distributed file systems5.distributed file systems
5.distributed file systems
 
Dfs (Distributed computing)
Dfs (Distributed computing)Dfs (Distributed computing)
Dfs (Distributed computing)
 
Distribution File System DFS Technologies
Distribution File System DFS TechnologiesDistribution File System DFS Technologies
Distribution File System DFS Technologies
 
Distributed file system
Distributed file systemDistributed file system
Distributed file system
 
11. dfs
11. dfs11. dfs
11. dfs
 
Chapter 17 - Distributed File Systems
Chapter 17 - Distributed File SystemsChapter 17 - Distributed File Systems
Chapter 17 - Distributed File Systems
 
Distributed Filesystems Review
Distributed Filesystems ReviewDistributed Filesystems Review
Distributed Filesystems Review
 
Self-Adapting, Energy-Conserving Distributed File Systems
Self-Adapting, Energy-Conserving Distributed File SystemsSelf-Adapting, Energy-Conserving Distributed File Systems
Self-Adapting, Energy-Conserving Distributed File Systems
 
3. distributed file system requirements
3. distributed file system requirements3. distributed file system requirements
3. distributed file system requirements
 
11 distributed file_systems
11 distributed file_systems11 distributed file_systems
11 distributed file_systems
 
4.file service architecture (1)
4.file service architecture (1)4.file service architecture (1)
4.file service architecture (1)
 
Presentation on nfs,afs,vfs
Presentation on nfs,afs,vfsPresentation on nfs,afs,vfs
Presentation on nfs,afs,vfs
 
Chapter 10 - File System Interface
Chapter 10 - File System InterfaceChapter 10 - File System Interface
Chapter 10 - File System Interface
 

Similar to Distributed File Systems: An Overview

CS9222 ADVANCED OPERATING SYSTEMS
CS9222 ADVANCED OPERATING SYSTEMSCS9222 ADVANCED OPERATING SYSTEMS
CS9222 ADVANCED OPERATING SYSTEMSKathirvel Ayyaswamy
 
PARALLEL FILE SYSTEM FOR LINUX CLUSTERS
PARALLEL FILE SYSTEM FOR LINUX CLUSTERSPARALLEL FILE SYSTEM FOR LINUX CLUSTERS
PARALLEL FILE SYSTEM FOR LINUX CLUSTERSRaheemUnnisa1
 
linux file sysytem& input and output
linux file sysytem& input and outputlinux file sysytem& input and output
linux file sysytem& input and outputMythiliA5
 
final-unit-ii-cc-cloud computing-2022.pdf
final-unit-ii-cc-cloud computing-2022.pdffinal-unit-ii-cc-cloud computing-2022.pdf
final-unit-ii-cc-cloud computing-2022.pdfSamiksha880257
 
Chapter 1-Introduction.ppt
Chapter 1-Introduction.pptChapter 1-Introduction.ppt
Chapter 1-Introduction.pptsirajmohammed35
 
seed block algorithm
seed block algorithmseed block algorithm
seed block algorithmDipak Badhe
 
Distributed computing seminar lecture 3 - distributed file systems
Distributed computing seminar   lecture 3 - distributed file systemsDistributed computing seminar   lecture 3 - distributed file systems
Distributed computing seminar lecture 3 - distributed file systemstugrulh
 
Chapter 1-Introduction.ppt
Chapter 1-Introduction.pptChapter 1-Introduction.ppt
Chapter 1-Introduction.pptbalewayalew
 
International Journal of Computer Science and Security Volume (4) Issue (1)
International Journal of Computer Science and Security Volume (4) Issue (1)International Journal of Computer Science and Security Volume (4) Issue (1)
International Journal of Computer Science and Security Volume (4) Issue (1)CSCJournals
 
lecture_1_introduction.ppt
lecture_1_introduction.pptlecture_1_introduction.ppt
lecture_1_introduction.pptRandyGaray
 
LogicalDOC Clustering
LogicalDOC ClusteringLogicalDOC Clustering
LogicalDOC ClusteringLogicalDOC
 
Survey of distributed storage system
Survey of distributed storage systemSurvey of distributed storage system
Survey of distributed storage systemZhichao Liang
 

Similar to Distributed File Systems: An Overview (20)

CS9222 ADVANCED OPERATING SYSTEMS
CS9222 ADVANCED OPERATING SYSTEMSCS9222 ADVANCED OPERATING SYSTEMS
CS9222 ADVANCED OPERATING SYSTEMS
 
Gt3112931298
Gt3112931298Gt3112931298
Gt3112931298
 
PARALLEL FILE SYSTEM FOR LINUX CLUSTERS
PARALLEL FILE SYSTEM FOR LINUX CLUSTERSPARALLEL FILE SYSTEM FOR LINUX CLUSTERS
PARALLEL FILE SYSTEM FOR LINUX CLUSTERS
 
linux file sysytem& input and output
linux file sysytem& input and outputlinux file sysytem& input and output
linux file sysytem& input and output
 
Chapter04 new
Chapter04 newChapter04 new
Chapter04 new
 
final-unit-ii-cc-cloud computing-2022.pdf
final-unit-ii-cc-cloud computing-2022.pdffinal-unit-ii-cc-cloud computing-2022.pdf
final-unit-ii-cc-cloud computing-2022.pdf
 
Chapter 1-Introduction.ppt
Chapter 1-Introduction.pptChapter 1-Introduction.ppt
Chapter 1-Introduction.ppt
 
Gradution Project
Gradution ProjectGradution Project
Gradution Project
 
seed block algorithm
seed block algorithmseed block algorithm
seed block algorithm
 
DFSNov1.pptx
DFSNov1.pptxDFSNov1.pptx
DFSNov1.pptx
 
Lec3 Dfs
Lec3 DfsLec3 Dfs
Lec3 Dfs
 
Distributed computing seminar lecture 3 - distributed file systems
Distributed computing seminar   lecture 3 - distributed file systemsDistributed computing seminar   lecture 3 - distributed file systems
Distributed computing seminar lecture 3 - distributed file systems
 
Cluster computing
Cluster computingCluster computing
Cluster computing
 
Chapter 1-Introduction.ppt
Chapter 1-Introduction.pptChapter 1-Introduction.ppt
Chapter 1-Introduction.ppt
 
International Journal of Computer Science and Security Volume (4) Issue (1)
International Journal of Computer Science and Security Volume (4) Issue (1)International Journal of Computer Science and Security Volume (4) Issue (1)
International Journal of Computer Science and Security Volume (4) Issue (1)
 
Distributed Operating System_4
Distributed Operating System_4Distributed Operating System_4
Distributed Operating System_4
 
lecture_1_introduction.ppt
lecture_1_introduction.pptlecture_1_introduction.ppt
lecture_1_introduction.ppt
 
Operating system
Operating systemOperating system
Operating system
 
LogicalDOC Clustering
LogicalDOC ClusteringLogicalDOC Clustering
LogicalDOC Clustering
 
Survey of distributed storage system
Survey of distributed storage systemSurvey of distributed storage system
Survey of distributed storage system
 

More from Anant Narayanan

Enterprise Scale Knowledge Graphs
Enterprise Scale Knowledge GraphsEnterprise Scale Knowledge Graphs
Enterprise Scale Knowledge GraphsAnant Narayanan
 
Building an Intelligent Assistant
Building an Intelligent AssistantBuilding an Intelligent Assistant
Building an Intelligent AssistantAnant Narayanan
 
WebRTC: A Practical Introduction
WebRTC: A Practical IntroductionWebRTC: A Practical Introduction
WebRTC: A Practical IntroductionAnant Narayanan
 
Message Passing vs. Data Synchronization
Message Passing vs. Data SynchronizationMessage Passing vs. Data Synchronization
Message Passing vs. Data SynchronizationAnant Narayanan
 
Firebase: Tales from the Trenches
Firebase: Tales from the TrenchesFirebase: Tales from the Trenches
Firebase: Tales from the TrenchesAnant Narayanan
 
Error Handling in WebRTC
Error Handling in WebRTCError Handling in WebRTC
Error Handling in WebRTCAnant Narayanan
 
WebRTC: User Security & Privacy
WebRTC: User Security & PrivacyWebRTC: User Security & Privacy
WebRTC: User Security & PrivacyAnant Narayanan
 
Firefox Architecture Overview
Firefox Architecture OverviewFirefox Architecture Overview
Firefox Architecture OverviewAnant Narayanan
 
Next Generation Browser Add-Ons
Next Generation Browser Add-OnsNext Generation Browser Add-Ons
Next Generation Browser Add-OnsAnant Narayanan
 
An Overview of Distributed Debugging
An Overview of Distributed DebuggingAn Overview of Distributed Debugging
An Overview of Distributed DebuggingAnant Narayanan
 
A Brief Incursion into Botnet Detection
A Brief Incursion into Botnet DetectionA Brief Incursion into Botnet Detection
A Brief Incursion into Botnet DetectionAnant Narayanan
 
Mozilla Weave: Integrating Services into the Browser
Mozilla Weave: Integrating Services into the BrowserMozilla Weave: Integrating Services into the Browser
Mozilla Weave: Integrating Services into the BrowserAnant Narayanan
 
Innovating with Mozilla Labs
Innovating with Mozilla LabsInnovating with Mozilla Labs
Innovating with Mozilla LabsAnant Narayanan
 
Glendix: The Why and the How
Glendix: The Why and the HowGlendix: The Why and the How
Glendix: The Why and the HowAnant Narayanan
 

More from Anant Narayanan (20)

Enterprise Scale Knowledge Graphs
Enterprise Scale Knowledge GraphsEnterprise Scale Knowledge Graphs
Enterprise Scale Knowledge Graphs
 
Building an Intelligent Assistant
Building an Intelligent AssistantBuilding an Intelligent Assistant
Building an Intelligent Assistant
 
WebRTC: A Practical Introduction
WebRTC: A Practical IntroductionWebRTC: A Practical Introduction
WebRTC: A Practical Introduction
 
Message Passing vs. Data Synchronization
Message Passing vs. Data SynchronizationMessage Passing vs. Data Synchronization
Message Passing vs. Data Synchronization
 
Firebase: Tales from the Trenches
Firebase: Tales from the TrenchesFirebase: Tales from the Trenches
Firebase: Tales from the Trenches
 
WebRTC: An Overview
WebRTC: An OverviewWebRTC: An Overview
WebRTC: An Overview
 
Error Handling in WebRTC
Error Handling in WebRTCError Handling in WebRTC
Error Handling in WebRTC
 
WebRTC Demystified
WebRTC DemystifiedWebRTC Demystified
WebRTC Demystified
 
WebRTC: User Security & Privacy
WebRTC: User Security & PrivacyWebRTC: User Security & Privacy
WebRTC: User Security & Privacy
 
Firefox Architecture Overview
Firefox Architecture OverviewFirefox Architecture Overview
Firefox Architecture Overview
 
πP
πPπP
πP
 
Next Generation Browser Add-Ons
Next Generation Browser Add-OnsNext Generation Browser Add-Ons
Next Generation Browser Add-Ons
 
An Overview of Distributed Debugging
An Overview of Distributed DebuggingAn Overview of Distributed Debugging
An Overview of Distributed Debugging
 
A Brief Incursion into Botnet Detection
A Brief Incursion into Botnet DetectionA Brief Incursion into Botnet Detection
A Brief Incursion into Botnet Detection
 
Mozilla Weave: Integrating Services into the Browser
Mozilla Weave: Integrating Services into the BrowserMozilla Weave: Integrating Services into the Browser
Mozilla Weave: Integrating Services into the Browser
 
about:labs
about:labsabout:labs
about:labs
 
Innovating with Mozilla Labs
Innovating with Mozilla LabsInnovating with Mozilla Labs
Innovating with Mozilla Labs
 
Glendix: The Why and the How
Glendix: The Why and the HowGlendix: The Why and the How
Glendix: The Why and the How
 
Mozilla Prism
Mozilla PrismMozilla Prism
Mozilla Prism
 
Making Gentoo Tick
Making Gentoo TickMaking Gentoo Tick
Making Gentoo Tick
 

Recently uploaded

08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 

Recently uploaded (20)

08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 

Distributed File Systems: An Overview

  • 1. Introduction Classification Lustre Conclusion Distributed File Systems An Overview Anant Narayanan Cluster and Grid Computing Vrije Universiteit 11 March 2009 Anant Narayanan Distributed File Systems Cluster and Grid Computing Vrije Universiteit
  • 3. Introduction Classification Lustre Conclusion Definition Allows access to files located on a remote host In a transparent manner, as though the client is actually working on the host Typically, clients do not have access to the underlying block storage They interact over the network using a protocol Anant Narayanan Distributed File Systems Cluster and Grid Computing Vrije Universiteit
  • 4. Introduction Classification Lustre Conclusion Why do we need them? Distributed applications usually require a common data store Eases ability to keep data consistent Access control is possible both on the server and client Depending on how the protocol is designed Allows for implementation of Replication Fault tolerance Anant Narayanan Distributed File Systems Cluster and Grid Computing Vrije Universiteit
  • 5. Introduction Classification Lustre Conclusion Storage Block Oriented “Usual” meaning of a file system Deal with storing data on a block basis Most distributed file systems are based on this at the lowest level Examples: ext3, NTFS, HFS+ Anant Narayanan Distributed File Systems Cluster and Grid Computing Vrije Universiteit
  • 6. Introduction Classification Lustre Conclusion Storage Record Oriented Were used on Mainframes and Minicomputers Fetch and put whole records, seek to boundaries Have a lot in common with today’s databases Examples: Files-11, Virtual Storage Access Method (VSAM) Anant Narayanan Distributed File Systems Cluster and Grid Computing Vrije Universiteit
  • 7. Introduction Classification Lustre Conclusion Storage Object Oriented Splits file metadata from file data File data is further split into objects Objects stored on object storage servers May or may not have a block oriented FS at the lowest layer Examples: Lustre, XtreemFS Anant Narayanan Distributed File Systems Cluster and Grid Computing Vrije Universiteit
  • 8. Introduction Classification Lustre Conclusion Fault Tolerance High Availability Replication Parallel Striping Examples: Brtfs, Coda, GlusterFS Anant Narayanan Distributed File Systems Cluster and Grid Computing Vrije Universiteit
  • 9. Introduction Classification Lustre Conclusion Applications Clusters Shared disk systems (GFS) Distributed disk systems (Lustre) Typically used on local networks Fast network access Anant Narayanan Distributed File Systems Cluster and Grid Computing Vrije Universiteit
  • 10. Introduction Classification Lustre Conclusion Applications Grids or Clouds Dynamic nature Deal with heterogeneity Deal with VOs (Grids) and SLAs (Clouds) Examples: XtreemFS, Dynamo Anant Narayanan Distributed File Systems Cluster and Grid Computing Vrije Universiteit
  • 11. Introduction Classification Lustre Conclusion Overview Linux + Cluster Distributed, parallel, fault tolerant, object based file system Tens of thousands of nodes Petabytes of storage capacity Hundreds of Gigabytes / second of throughput Without compromising on speed or security Small workgroup clusters, to large-scale, multi-site clusters, to super-computers Anant Narayanan Distributed File Systems Cluster and Grid Computing Vrije Universiteit
  • 12. Introduction Lustre Clusters Classification Lustre Conclusion Lustre clusters contain three kinds of systems: Architecture Layout • File system clients, which can be used to access the file system • Object storage servers (OSS), which provide file I/O service • Metadata servers (MDS), which manage the names and directories in the file system MDS disk storage containing Metadata Targets (MDT) Pool of clustered MDS servers 1-100 OSS servers 1-1000s MDS 1 (active) MDS 2 (standby) Elan Myrinet InfiniBand OSS 1 Commodity Storage OSS 2 Simultaneous support of multiple network types Lustre clients 1-100,000 OSS storage with object storage targets (OST) OSS 3 Shared storage enables failover OSS OSS 4 Router OSS 5 GigE OSS 6 = Failover OSS 7 Enterprise-Class Storage Arrays and SAN Fabric Figure 1. Systems in a Lustre cluster Anant Narayanan Distributed File Systems Cluster and Grid Computing Vrije Universiteit
  • 13. Introduction Classification Lustre Conclusion Architecture Components File system clients: used to access the file system Object storage servers (OSS): provide file I/O service, deals with block storage Metadata servers (MDS): manage the names and directories in the file system, deals with authentication Anant Narayanan Distributed File Systems Cluster and Grid Computing Vrije Universiteit
  • 14. 4 Lustre Clusters Introduction Sun Microsystems, Inc. Classification Lustre Conclusion Architecture The following Characteristicstable shows the characteristics associated with each of the three types of systems. Typical number of systems Performance Required attached storage Desirable hardware characteristics Clients 1–100,000 1 GB/sec I/O, 1000 metadata ops None None OSS 1–1000 500 MB/sec — 2.5 GB/sec File system capacity/OSS count Good bus bandwidth MDS 2 (in the future 2–100) 3000–15,000 metadata ops/sec (operations) 1–2% of file system capacity Adequate CPU power, plenty of memory The storage attached to the servers is partitioned, optionally organized with logical Anant Narayanan Cluster OSS and MDS volume management (LVM) and formatted as file systems. The Lustreand Grid Computing Vrije Universiteit servers read, write, and modify data in the format imposed by these file systems. Each Distributed File Systems
  • 15. Introduction Classification Lustre Conclusion Architecture Heterogeneous? MDS and OSS may store actual data on ext3 or ZFS block file systems Infiniband, TCP/IP over Ethernet and Myrinet are supported network types Multiple CPU architectures: x86, x86_64, PPC Requires patched Linux kernel Anant Narayanan Distributed File Systems Cluster and Grid Computing Vrije Universiteit
  • 16. Introduction Classification Lustre Conclusion Implementation Setup MDS mkfs.lustre -mdt -mgs -fsname=large-fs /dev/sdamount -t lustre /dev/sda /mnt/mdt OSS1 mkfs.lustre -ost -fsname=large-fs -mgsnode=mds@tcp0 /dev/sdb mount -t lustre /dev/sdb /mnt/ost1 Client mount -t lustre mds.your.org:/large-fs /mnt/lustre-client Anant Narayanan Distributed File Systems Cluster and Grid Computing Vrije Universiteit
  • 17. Introduction Classification Lustre Conclusion Implementation Networking LNET abstracts over multiple supported networks Provides the communication infrastructure required by Lustre Takes care of abstracting over fail-over servers, load balancing Provides support for Remote Direct Memory Access (RDMA) Provides an end-to-end throughput of 100MB per sec on Gigabit Ethernet networks Upto 1.5GB per sec on Infiniband Anant Narayanan Distributed File Systems Cluster and Grid Computing Vrije Universiteit
  • 18. Introduction Implementation Where Are the Files? Classification Lustre Conclusion Traditional UNIX disk file systems use inodes, which contain lists of block numbers where the file data for the inode is stored. Similarly, one inode exists on the MDT for each file the Lustre Where aredoesinnot point to file system. However, in points to one or more objects associated the files?blocks, but instead the Lustre file system, the inode on the MDT data with the files. This is illustrated in Figure 5. File on MDT Extended Attributes Ordinary ext3 File obj1 oss2 obj3 oss3 Direct Data Blocks oss1 obj2 Data Block ptrs Indirect Double Indirect inode inode Indirect Data Blocks Figure 5. MDS inodes point to objects; ext3 inodes point to data Anant Narayanan These objects are implemented as files on the OST file systems and contain file data. Cluster and Grid Computing Vrije Universiteit Figure 6 shows how a file open operation transfers the object pointers from the MDS Distributed File Systems
  • 19. Introduction Classification Lustre Conclusion Implementation Striping and Replication One object per MDS inode implies “unstriped” data Multiple objects per MDS inode implies that the file has been split, similar to RAID 0 These stripes may be duplicated across several OSS Provides fault tolerance and high availability Anant Narayanan Distributed File Systems Cluster and Grid Computing Vrije Universiteit
  • 20. Introduction Classification Lustre Conclusion Conclusion Reliable, Scalable and Performant filesystem Open Architecture and Protocols BUT Does not handle dynamic addition / removal of servers Does not provide the kind of access control and security that a VO might need Anant Narayanan Distributed File Systems Cluster and Grid Computing Vrije Universiteit
  • 21. Introduction Classification Lustre Conclusion What next? Other distributed file systems and implementation details Grids (XtreemFS), Clouds (Dynamo) Questions? Anant Narayanan Distributed File Systems Cluster and Grid Computing Vrije Universiteit
  • 22. Introduction Classification Lustre Conclusion References You may be required to register on the Sun Website to access these documents! Datasheet: http://www.sun.com/software/products/lustre/ datasheet.pdf Scalable Cluster Filesystem Whitepaper: http://www.sun.com/offers/docs/LustreFileSystem.pdf LNET: http://www.sun.com/offers/docs/ lustre_networking.pdf Lustre Documentation Index: http://manual.lustre.org/index.php?title=Main_Page Lustre Publications Index: http://wiki.lustre.org/ index.php?title=Lustre_Publications Anant Narayanan Distributed File Systems Cluster and Grid Computing Vrije Universiteit