2. Clustering
Clustering is the use of multiple
computers, typically
PCs or UNIX workstations,
multiple storage devices, and
redundant interconnections,
to form what appears to users as
a single highly available system.
4. Clustering
Cluster computing can be used for:
• load balancing
• high availability
• a relatively low-cost form of
parallel processing
for scientific and other
applications that lend
themselves to parallel operations
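As a rough illustration of the parallel-processing use case above, the sketch below splits a numerical job into chunks, computes partial results concurrently, and combines them. It is a minimal sketch under stated assumptions, not a cluster implementation: the function names are hypothetical, and a local thread pool stands in for the compute nodes, which on a real cluster would be separate machines.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data, parts):
    """Split data into at most `parts` contiguous chunks of near-equal size."""
    size = max(1, -(-len(data) // parts))  # ceiling division, at least 1
    return [data[i:i + size] for i in range(0, len(data), size)]


def partial_sum_of_squares(chunk):
    """Work done by one (simulated) node on its share of the data."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, workers=4):
    """Scatter chunks to workers, then combine the partial results."""
    chunks = split_into_chunks(data, workers)
    # Assumption: threads stand in for compute nodes in this sketch.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum_of_squares, chunks))
```

The scatter, compute, and combine steps are the part that carries over to a real cluster; only the transport between master and nodes would change.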
6. Clustering
Cluster computing technology
groups systems together into clusters
to provide better system
reliability and performance.
Cluster server systems connect a
group of servers together in order
to jointly provide processing
service for the clients in the
network.
7. Clustering
Cluster operating systems divide
the tasks amongst the available
servers.
Clusters of systems or
workstations, on the other hand,
connect a group of systems
together to jointly share a critically
demanding computational task.
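The way a cluster operating system divides tasks amongst servers can be sketched as follows. The slides do not say which policy is used, so this example assumes a simple "least-loaded server first" rule; the function name, the task costs, and the node names are all hypothetical.

```python
import heapq


def divide_tasks(tasks, servers):
    """Assign each (name, cost) task to the currently least-loaded server."""
    # Heap of (current_load, server_name); ties break on server name.
    heap = [(0, server) for server in servers]
    heapq.heapify(heap)
    assignment = {server: [] for server in servers}
    for name, cost in tasks:
        load, server = heapq.heappop(heap)
        assignment[server].append(name)
        # Assumption: the cost estimate is known when the task is assigned.
        heapq.heappush(heap, (load + cost, server))
    return assignment
```

For example, `divide_tasks([("a", 5), ("b", 3), ("c", 2), ("d", 4)], ["node1", "node2"])` gives tasks a and d to node1 and tasks b and c to node2.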
8. Clustering
At the present time, cluster
server and workstation systems
are mostly used in High
Availability applications and in
scientific applications such as
numerical computations.
11. (cluster algorithm - i)
Main Requirements
A clustering algorithm should
satisfy:
1) scalability
2) dealing with different types of
attributes
3) discovering clusters with
arbitrary shape
12. (cluster algorithm - ii)
4) minimal requirements for domain
knowledge to determine input
parameters
5) ability to deal with noise and
outliers
6) insensitivity to order of input
records
7) high dimensionality
8) interpretability and usability
13. Linux Cluster
Clustering can be performed
on various operating
systems like Windows,
Macintosh, Solaris, etc.
Linux has its own advantages,
which are as follows:
14. Linux Cluster
Advantages of Linux Clustering…
• Linux runs on a wide range of
hardware.
• Linux source code is freely
distributed.
• Linux is relatively virus free.
• A wide variety of tools and
applications is available for free.
• It is a good environment for developing
cluster infrastructure.
15. Cluster Components
The cluster consists of four major
parts:
a) Network
b) Compute nodes
c) Master server
d) Gateway
Each part has a specific role
that the cluster hardware needs in order to
function.
16. Cluster Components
1. Network:
• Provides communication
between nodes, server, and
gateway.
• Consists of a Fast
Ethernet switch, cables, and
other networking hardware.
17. Cluster Components
2. Nodes:
• Serve as processors for the
cluster.
• Each node is interchangeable;
there are no functional
differences between nodes.
• Consist of all computers in the
cluster other than the gateway
and server.
18. Cluster Components
3. Master Server:
• Provides network services to the
cluster: DHCP and
NFS (node image and shared file
system).
• Actually runs parallel programs and
spawns processes on the nodes.
• Should meet the minimum requirements.
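The master server's job of spawning processes on the nodes and collecting their results can be sketched as below. This is only an illustration under stated assumptions: the node names and the worker command are hypothetical, and the processes are started locally, whereas a real master would start them on remote nodes through a remote shell or a parallel job launcher.

```python
import subprocess
import sys

# Hypothetical node names; in this sketch they are only labels passed to
# locally started worker processes.
NODES = ["node01", "node02", "node03"]

# Stand-in for the real parallel program: each worker reports its node label.
WORKER_CODE = "import sys; print(f'{sys.argv[1]} done')"


def spawn_on_nodes(nodes):
    """Start one worker process per node, then gather their output in order."""
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", WORKER_CODE, node],
            stdout=subprocess.PIPE,
            text=True,
        )
        for node in nodes
    ]
    # All workers are already running; communicate() waits for each in turn.
    return [proc.communicate()[0].strip() for proc in procs]
```

Calling `spawn_on_nodes(NODES)` returns one line of output per node, in the same order as the node list.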
19. Cluster Components
4. Gateway:
• Acts as a bridge/firewall between
the outside world and the cluster.
• Should have two Ethernet cards (one facing the outside network, one facing the cluster network).
20. Types of Clustering
1) High Performance Clusters:
low-price supercomputing (Beowulf
project)
2) High Availability Clusters:
highly available and fault-tolerant
systems (HA project)
21. Types of Clustering
3) Bulk Storage Clusters:
sharing and serving of stored data
4) Web/Internet Clusters
22. Types of Clustering
1. High Performance Clusters
(Beowulf Clusters)
• The Beowulf Project was first developed in
1994 at NASA.
• A new trend in developing
supercomputers: replacing high-priced
vector supercomputers with clusters.
• Low-price supercomputing is possible thanks to high-performance, low-price
processors and high-speed network devices.
23. Types of Clustering
What is a Beowulf?
• A technology of clustering Linux
computers to form a parallel,
virtual supercomputer
• It is a system which usually
consists of one server node, and
one or more client nodes
connected together via Ethernet or
some other network.
24. Types of Clustering
What is a Beowulf?
• It is a system built using
commodity hardware components,
like any PC capable of running
Linux, standard Ethernet
adapters, and switches.
25. Types of Clustering
2. High Availability Clusters
• Enterprise Server Requirements
• Reliability + Availability +
Serviceability = Non-stop Fault-Tolerant Cluster HA Server
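One way to put a number on the availability term is sketched below. It rests on two assumptions that the slides do not state: node failures are independent, and the service stays up as long as at least one node is up. The function name is hypothetical.

```python
def cluster_availability(node_availability, nodes):
    """Probability that at least one of `nodes` identical nodes is up."""
    if not 0.0 <= node_availability <= 1.0:
        raise ValueError("node_availability must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    # Assumption: failures are independent, so the chance that every node
    # is down at once is the product of the individual downtimes.
    return 1.0 - (1.0 - node_availability) ** nodes
```

Under these assumptions, two nodes that are each up 99% of the time give roughly 99.99% availability for the cluster as a whole.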
27. Types of Clustering
3. Bulk Storage Clusters
• The network is configured with one of
the virtual cluster server
techniques.
• The disk storage is connected via
Fibre Channel, including SAN
(Storage Area Network) file
systems.
28. Types of Clustering
RAID (Redundant Array of Independent Disks)
GFS (Global File System)
FC (Fibre Channel)
DSU (Data Service Unit)
29. Types of Clustering
4. Internet Clusters
• The Linux Virtual Server is a highly
scalable and highly available server
built on a cluster of real servers, with
the load balancer running on the Linux
operating system.
• The architecture of the cluster is
transparent to end users. End users
only see a single virtual server (Single
System Image).
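The idea of a single virtual server in front of several real servers can be sketched as follows. Round-robin is one of the scheduling policies the Linux Virtual Server offers; everything else here (the class name, the server names, and the simple health tracking) is a hypothetical simplification for illustration, not how LVS itself is implemented.

```python
class VirtualServer:
    """Single entry point that hands requests to real servers in turn."""

    def __init__(self, real_servers):
        self.real_servers = list(real_servers)
        self.healthy = set(self.real_servers)
        self._next = 0  # index of the next real server to try

    def mark_down(self, server):
        """Stop sending requests to a real server that has failed."""
        self.healthy.discard(server)

    def dispatch(self):
        """Return the real server chosen for the next request."""
        # Try each real server at most once, skipping unhealthy ones.
        for _ in range(len(self.real_servers)):
            server = self.real_servers[self._next]
            self._next = (self._next + 1) % len(self.real_servers)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy real servers")
```

Clients only ever talk to the `VirtualServer` object, so a real server can be marked down without the end user noticing, which is the transparency the slide describes.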