A computer cluster consists of a set of loosely or tightly connected
computers that work together so that, in many respects, they can be viewed as a
single system. Unlike grid computers, computer clusters have each node set to
perform the same task, controlled and scheduled by software.
They are usually deployed to improve performance and availability over
that of a single computer, while typically being much more cost-effective than
single computers of comparable speed or availability.
Computer clusters emerged as a result of convergence of a number of
computing trends including the availability of low-cost microprocessors,
high-speed networks, and software for high-performance distributed computing.
They have a wide range of applicability and deployment, ranging from small
business clusters with a handful of nodes to some of the fastest supercomputers
in the world such as IBM's Sequoia.
Supercomputer VS Computer Cluster
SuperComputer -> isn't a name for one particular type of computer, it's a term that refers to computers used to solve problems which require processing capabilities nearly as big as anyone can build at a given time. The types of systems that people call supercomputers change over time, supercomputers of yesteryear aren't that super any more.
Cluster Computers -> are loosely coupled parallel computers where the computing nodes have individual memory and instances of the operating system, but typically share a file system, and use an explicitly programmed high-speed network for communication. The term loosely refers to the technicalities of how such machines are constructed.
BASIC CONCEPTS
The desire to get more computing power and better reliability by
orchestrating a number of low-cost commercial off-the-shelf computers has given
rise to a variety of architectures and configurations.
The computer clustering approach usually (but not always) connects a
number of readily available computing nodes (e.g. personal computers used as
servers) via a fast local area network.
A computer cluster may be a simple two-node system which just connects
two personal computers, or may be a very fast supercomputer. A basic approach
to building a cluster is that of a Beowulf cluster which may be built with a
few personal computers to produce a cost-effective alternative to traditional
high performance computing. An early project that showed the viability of the
concept was the 133-node Stone Soupercomputer.
Although a cluster may consist of just a few personal computers
connected by a simple network, the cluster architecture may also be used to
achieve very high levels of performance.
CLUSTERS ATTRIBUTES
Computer clusters may be configured for different purposes ranging from
general purpose business needs such as web-service support, to
computation-intensive scientific calculations.
"Load-balancing" clusters are configurations in which
cluster-nodes share computational workload to provide better overall
performance. For example, a web server cluster may assign different queries to
different nodes, so the overall response time will be optimized. Example: Amazon,
Google, Facebook, Hotmail, etc. (Of course we are talking about SERVERS to “serve”
web services!)
Computer clusters are used for computation-intensive purposes, rather
than handling IO-oriented operations such as web service or databases. For
instance, a computer cluster might support computational simulations of vehicle
crashes or weather.
"High-availability clusters" (HA clusters) improve the
availability of the cluster approach. They operate by having redundant nodes,
which are then used to provide service when system components fail. HA cluster
implementations attempt to use redundancy of cluster components to eliminate
single points of failure.
BENEFITS
Clusters are primarily designed with performance in mind, but
installations are based on many other factors such as:
- Fault tolerance (the ability for a system to continue working with a malfunctioning node) also allows for simpler scalability
- High performance situations
- Low frequency of maintenance routines
- Resource consolidation
- And, centralized management.
Other advantages include enabling data recovery in the event of a
disaster and providing parallel data processing and high processing capacity.
DESIGN AND CONFIGURATION
One of the issues in designing a cluster is how tightly coupled the
individual nodes may be. For instance, a single computer job may require
frequent communication among nodes: this implies that the cluster shares a
dedicated network, is densely located, and probably has homogeneous nodes. The
other extreme is where a computer job uses one or few nodes, and needs little
or no inter-node communication, approaching grid computing.
In a Beowulf system, the application programs never see the
computational nodes (called slave computers) but only interact with the
"Master" which is a specific computer handling the scheduling and
management of the slaves. In a typical implementation the Master has two
network interfaces, one that communicates with the private Beowulf network for
the slaves, the other for the general purpose network of the organization. The slave computers typically have their own
version of the same operating system, and local memory and disk space. However,
the private slave network may also have a large and shared file server that
stores global persistent data, accessed by the slaves as needed.
Due to the increasing computing power of each generation of game
consoles, a novel use has emerged where they are repurposed into
High-performance computing (HPC) clusters. Some examples of game console
clusters are Sony PlayStation clusters and Microsoft Xbox clusters. Another
example of consumer game product is the Nvidia Tesla Personal Supercomputer
workstation, which uses multiple graphics accelerator processor chips. Besides
game consoles, high-end graphics cards too can be used instead. The use of
graphics cards (or rather their GPU's) to do calculations for grid computing is
vastly more economical than using CPU's, despite being less precise. However,
when using double-precision values, they become as precise to work with as
CPU's and are still much less costly (purchase cost).
Computer clusters have historically run on separate physical computers
with the same operating system. With the advent of virtualization, the cluster
nodes may run on separate physical computers with different operating systems
which are painted above with a virtual layer to look similar. The cluster may
also be virtualized on various configurations as maintenance takes place. An
example implementation is Xen as the virtualization manager with Linux-HA.
DATA SHARING AND COMMUNICATION
As the computer clusters were appearing during the 1980s, so were
supercomputers. One of the elements that distinguished the three classes at
that time was that the early supercomputers relied on shared memory. To date
clusters do not typically use physically shared memory, while many
supercomputer architectures have also abandoned it.
However, the use of a clustered file system is essential in modern
computer clusters. Examples include the IBM General Parallel File System,
Microsoft's Cluster Shared Volumes or the Oracle Cluster File System.
MESSAGE PASSING AND COMMUNICATION
Two widely used approaches for communication between cluster nodes are
MPI, the Message Passing Interface and PVM, the Parallel Virtual Machine.
PVM was developed at the Oak Ridge National Laboratory around 1989 before
MPI was available. PVM must be directly installed on every cluster node and
provides a set of software libraries that paint the node as a "parallel
virtual machine". PVM provides a run-time environment for message-passing,
task and resource management, and fault notification. PVM can be used by user
programs written in C, C++, or Fortran, etc.
MPI emerged in the early 1990s out of discussions among 40
organizations. The initial effort was supported by ARPA and National Science
Foundation. Rather than starting anew, the design of MPI drew on various
features available in commercial systems of the time. The MPI specifications
then gave rise to specific implementations. MPI implementations typically use
TCP/IP and socket connections. MPI is now a widely available communications
model that enables parallel programs to be written in languages such as C,
Fortran, Python, etc. Thus, unlike PVM which provides a concrete
implementation, MPI is a specification which has been implemented in systems
such as MPICH and Open MPI.
CONCLUSION
After reading all this info (and complicate in somehow to understand!) let’s
get some interactive info:
- Microsoft Data Center (Short Version) https://www.youtube.com/watch?v=zXsoygN_v7A
- Microsoft Data Center (Long Version) https://www.youtube.com/watch?v=0uRR72b_qvc
- Google Data Center https://www.youtube.com/watch?v=XZmGGAbHqa0
- Amazon Web Services https://www.youtube.com/watch?v=AVy9yqCNIo4
- Facebook Data Center https://www.youtube.com/watch?v=4A_A-CmrqpQ
PLEASE, TELL ME HOW DID YOU FOUND THIS INFORMATION. IT IS HELPFUL?
LET ME KNOW YOUR COMMENTS
No comments:
Post a Comment