SlideShare uma empresa Scribd logo
1 de 15
Baixar para ler offline
THE ADVANCED COMPUTING SYSTEMS ASSOCIATION




                                      The following paper was originally published in the

                          Proceedings of the Embedded Systems Workshop
                                     Cambridge, Massachusetts, USA, March 29–31, 1999




                                Massively Distributed Systems:
                                Design Issues and Challenges


                                                      Dan Nessett
                                                     3Com Corporation




                                             © 1999 by The USENIX Association
                                                    All Rights Reserved

Rights to individual papers remain with the author or the author's employer. Permission is granted for noncommercial
reproduction of the work for educational or research purposes. This copyright notice must be included in the reproduced paper.
USENIX acknowledges all trademarks herein.

                                 For more information about the USENIX Association:
                                 Phone: 1 510 528 8649        FAX: 1 510 548 5738
                                 Email: office@usenix.org     WWW: http://www.usenix.org
Massively Distributed Systems: Design Issues and Challenges
                                                Dan Nessett
                                       Technology Development Center
                                             3Com Corporation
                                          Dan_Nessett@3com.com


Abstract                                                    1. Introduction
                                                            This paper explores trends in distributed system archi-
The forty year trend in the computing industry is away      tecture and implementation that will have far-reaching
from centralized, high unit cost, low unit volume prod-     consequences in the next 5-10 years. The paper’s central
ucts toward distributed, low unit cost, high unit volume    thesis is that applications will become massively dis-
products. The next step in this process is the emergence    tributed during this interval and in so doing generate
of massively distributed systems. These systems will        significant engineering problems, the solution of which
penetrate even more deeply into the fabric of society and   will change the nature of distributed computing. Fol-
become the information power grids of the 21st century.     lowing this trend, networking technology will change
They will be ubiquitous. Most will operate outside the      in order better to serve the communications require-
normal cognizance of the people they serve and most         ments of this new class of applications.
will be based on embedded systems that present non-
                                                            One major driver of massively distributed applications
traditional computing interfaces to their users. They
                                                            is the advent of ubiquitous embedded systems. While
will be engineered to operate as distributed utilities,
                                                            embedded systems will not always communicate either
much like the energy, water, transportation and media
                                                            with their peers or with traditional computing equip-
broadcast businesses do today. The first deployment of
                                                            ment, important applications will require the intercon-
massively distributed systems is likely to occur as sup-
                                                            nection of them in large numbers. Several examples are
port structures for these industries.
                                                            given in section 2. Furthermore, these applications will
Massively distributed systems will differ from existing     require computing architectures that off-load heavy
distributed systems in important ways. Such systems         computational tasks from embedded systems onto more
eventually will interconnect billions of nodes. This will   powerful and capable equipment. Consequently, mas-
necessitate changes in the way nodes interact with one      sively distributed applications will present problems to
another. One-to-many communications will be the             architects and implementers that transcend those of
norm, rather than one-to-one. The size of on-line appli-    stand-alone embedded system applications.
cation communities will necessitate the use of statisti-
                                                            The paper examines the motivation for and engineering
cally correct rather than deterministic algorithms for
                                                            problems of massively distributed systems. In the next
resource accounting, fault detection and correction, and
                                                            section, a case is made why massively distributed sys-
system management. These communities will coalesce
                                                            tems will arise. Section 3 presents an analysis of their
and dissolve rapidly in order to host events that are of
                                                            engineering design issues. Finally, section 4 presents a
interest to groups formed specifically for this purpose.
                                                            summary of the paper.
This will require new approaches to naming, routing,
security and privacy, resource management and synchro-
nization. Heterogeneity will be even more of a factor in    2. Motivation
the design, implementation and operation of massively       Massively distributed systems are the next step in the
distributed systems                                         development of computing. From a commercial per-
This paper explores the nature and characteristics of       spective, this process started with centralized, high unit
massively distributed systems by proposing some ex-         cost, low unit volume systems in the fifties and sixties.
amples and then using them to characterize nine major       It then moved to less centralized, but still more or less
design issues. Seven of these are drawn from seminal        stand-alone systems in the seventies. In the eighties,
work in the area of distributed systems. Two others are     Network Service Providers (NSPs), such as Compuserv,
based on experience in distributed system design and        Genie and America On-line, became popular providing
implementation subsequent to that work.                     email and bulletin board services to businesses and
                                                            some advanced home users. In the first half of this dec-
ade, internet access became popular, supporting truly         to the day-to-day operations of companies close to the
distributed applications, such as the World Wide Web.         consumer. This will require the development and de-
In the first decade of the 21st century, massively dis-       ployment of massively distributed systems that reach
tributed systems will appear and become ubiquitous. In        into the home, office, public pathways, and private in-
direct contrast to the systems of the fifties through the     stitutions.
early nineties, these systems will be decentralized, low
                                                              While the problems presented by these massively dis-
unit cost and high unit volume. They will be pervasive.
                                                              tributed applications will display certain differences,
Most will operate outside the normal cognizance of the
                                                              there will be a surprising amount of commonality.
people they serve and most will be based on embedded
                                                              Managing flows, whether they consist of information,
systems that present non-traditional computing inter-
                                                              paper, dishwashing detergent, gasoline, or other com-
faces to their users. They will be engineered to operate
                                                              modities, requires sensors at the points where the prod-
as distributed utilities, much like the energy, water,
                                                              uct is consumed, quickly configurable transportation
transportation and media broadcast businesses do today.
                                                              systems to move those commodities (e.g., trucks,
Indeed, the first deployment of massively distributed
                                                              power switching systems, airplanes, a communications
systems is likely to occur as support structures for these
                                                              fabric), and control algorithms and equipment to match
industries.
                                                              the ingress of raw materials to the egress of the con-
Massively distributed systems will not be limited in          sumed products.
scope to the interconnection of and interaction between
                                                              In many distributed systems, such as those concerned
embedded systems alone. While large numbers of em-
                                                              with power, water, natural gas, vehicle fuel, and mass
bedded systems will participate in massively distributed
                                                              market goods delivery, embedded processors will meas-
systems, at least some will interact with traditional
                                                              ure resource use and relay consumption information up
computing equipment. The latter will provide control,
                                                              the supply-chain. Control systems will utilize this in-
information storage, intensive computation and other
                                                              formation to schedule raw material inventory replen-
services to massively distributed applications. Embedded
                                                              ishment and organize the management of resource
systems will provide specialized point-of-presence serv-
                                                              flows. Embedded systems close to the consumer (e.g.,
ices.
                                                              in appliances, yard sprinkler systems, home heating
Several hypothetical examples illustrate the future im-       systems) will schedule resource consumption to mini-
pact of massively distributed systems. These will arise       mize resource demand spikes, thereby reducing the peak
as natural extensions to current practice. They will be       capacity requirements of delivery systems [1].
used subsequently to illustrate some of the distinguish-
                                                              An important supply chain that will dramatically influ-
ing characteristics of massively distributed systems,
                                                              ence how massively distributed systems are imple-
which are significantly different from those of smaller
                                                              mented is information flow. The resource management
scale distributed systems operating today.
                                                              systems mentioned in the previous paragraph are geo-
                                                              graphically hierarchical, since they control physical
2.1. Ubiquitous supply-chain                                  processes characterized by the movement of mass or
management                                                    energy. Information flows as easily over large as small
                                                              distances, a characteristic that distinguishes it from
Managing large distribution processes in the face of          these other supply chains. Consequently, information
increasing governmental regulation, resource scarcity         supply chains will present significantly different prob-
and system complexity will become increasingly impor-         lems to designers and implementers of massively dis-
tant and open up markets for massively distributed sys-       tributed systems. Embedded systems will play a large
tems. The classic utility businesses, e.g., the electrical,   role in information supply chains, as discussed in the
water, natural gas, and telephone companies, will be          next section.
joined by other flow based service industries, e.g.,          It will be necessary to use massively distributed sys-
transportation conglomerates, oil companies (especially       tems to solve this new class of flow control problems,
those which own and operate service stations), large          since the ratio of control points to production facilities
retail companies, including food, clothing, soft and hard     will grow to be several orders of magnitude larger than
goods businesses, and the information industries, such        it is today. This adjustment will encourage new control
as newspaper, television, entertainment and advertise-        algorithms based not on deterministic concepts, but
ment concerns to concentrate on improving their mar-          rather on statistical notions of correctness. Profits will
gins by the efficient control of their product flow. Sup-     be maximized by ensuring a less than 100% accounting
ply-chain management will spread from manufacturing           of resource usage and payment, since reaching the last
few percent will impose significant marginal costs that      ware. Embedded processors will run transportable code,
will exceed the marginal return. Banks already employ        such as Java, in order to support active networking [2].
this philosophy in response to ATM fraud. The money
                                                             To protect consumers from unwanted intrusions into
that could be used to make their equipment more secure
                                                             their lives, products will be developed to filter the in-
earns more interest than the losses they would prevent.
                                                             formation flowing to an end-system. These filters will
Of course, to ensure an equitable and legally defensible
                                                             range from those that allow parents to control the in-
operation (not to mention keeping stockholders happy),
                                                             formation available to their children (eliminating offen-
control algorithms must eliminate systematic fraud
                                                             sive material such as pornography) to those that elimi-
whereby one or a small group of individuals consis-
                                                             nate the probable increase in junk information broadcast
tently benefit from a system's non-deterministic behav-
                                                             to consumers. Specialized processors implementing
ior.
                                                             pattern recognition algorithms and embedded in rout-
                                                             ing, switching and end systems will provide powerful
2.2. Agile distributed information                           and customizable filtering capabilities to achieve these
services                                                     objectives.
Perhaps the most dramatic development of the past 5          To solve the problem of locating communities that
years is the growth of information distribution channels     might interest an individual, services will form offering
based on the Internet and their rapid penetration into       electronic real-time catalogues. These will be the suc-
areas not normally associated with computing. The            cessors of the WWW index and search engines that exist
World Wide Web, email, and Internet news groups are          currently. Electronic advisory services will inform con-
prominent examples. This trend will not only continue        sumers which communities offer the best value, using a
it will accelerate. The maturation of multi-media broad-     value metric tailored for the customer's specific objec-
cast technology running over the Internet, assisted by       tives. Specialized processors embedded in database and
extremely high capacity networking infrastructure will       information repository systems will continuously index
encourage the formation of virtual information commu-        and catalogue information as it arrives.
nities. Unlike more traditional groupings, these com-
munities will be ephemeral, lasting anywhere from            Needless to say, participation in these communities will
weeks or months to periods as short as several hours or      require payment, ranging from that necessary simply to
less. Primitive examples of these communities now            reimburse common carriers, when the content is free
exist, e.g., everyone reading the same newspaper and all     (e.g., broadcasts of school plays, candidate sponsored
those watching the same TV program. However, the             political events, religious events), to payments for ex-
information communities of tomorrow will not be              clusive information commodities (e.g., pay-per-view
based on rigidly structured broadcast hierarchies; rather    sporting events, specialized financial broadcasts). Such
they will encourage interactions between people who are      payments will require the development and deployment
members of a community at a particular moment. Mul-          of an electronic monetary system that utilizes a combi-
ticast interest groups centered around a wide variety of     nation of digital cash, electronic credit systems (i.e.,
subjects will form, much as internet news groups form        electronic analogues of the credit card, bank line of
today. Small production companies will present a play,       credit, and checks), and point of presence payment ter-
host a debate, cover a sporting event or sponsor an elec-    minals. Processors embedded in cash cards and other
tronic interactive event, such as a networked game that      payment tokens will play an important role in such
millions play simultaneously. The large media con-           systems.
glomerates will gradually evolve into distributed infor-
mation and entertainment production companies offering       2.3. The new economics
a wider variety of programming material than they do         Electronic commerce is now a common feature of many
now.                                                         commercial enterprises. However, most designers and
Managing such rapidly changing information communi-          implementers of electronic commerce systems have
ties will require more capable networks than currently       concentrated on the provision of electronic payment.
exist. Switching and routing equipment will include          Other aspects of electronic commerce have received less
embedded systems to monitor resource use and adjust          attention, in particular, many components that are well
allocations to meet fluctuating demand. These systems        matched to implementation on massively distributed
will control resources that are traditionally static, such   systems. Commerce involves not only payment for
as device backplane bandwidth, encryption and compres-       goods and services, but also their creation, advertise-
sion hardware, and agile error detection/correction hard-    ment, delivery, maintenance, and disposal.
The downward spiraling cost of communication is               uted systems. This work identified seven major design
changing profit models in many businesses. Hard prod-         issues. Two additional issues are added to the seven pre-
uct companies will change their businesses to adapt to        viously identified based on experience subsequent to that
the new micro-economic environment. The creation of           work. The classical design issues are : 1) naming, 2)
hard goods will become increasingly automated. Their          error control, 3) resource management, 4) protection
advertisement will become more interactive, consumers         (security and privacy), 5) synchronization, 6) objects
will use third party search and evaluation companies          (representation, encoding and translation), and 7) meas-
that recommend products according to a detailed specifi-      urement, testing and debugging. The additional design
cation. Since consumers are rarely expert enough to           issues are : 8) heterogeneity, and 9) scale.
create a useful and detailed description of their require-
                                                              Each of these issues is used to show how massively
ments, such companies will develop interactive systems
                                                              distributed systems differ significantly from existing
with friendly customer interfaces that will lead a cus-
                                                              distributed systems. For pedagogical reasons, the issues
tomer in the development of these requirements.
                                                              are addressed in the following order : scale, heterogene-
Customers will not travel to a remote location to use         ity, objects (representation, encoding and translation),
these systems. They will be accessed through commu-           resource management, protection (security and privacy),
nications facilities terminated in the customer's home or     naming, error control, synchronization, and measure-
place of business. Eventually, requirement specifica-         ment, testing and debugging.
tions will drive the creation of the product at an auto-
mated facility, create a just-in-time delivery plan to        3.1. Scale
move the product from the manufacturing facility to the
shipping end-point, schedule the resources to conduct         The main characteristic of massively distributed systems
warranted and purchased maintenance on the product, and       is their scale. While current distributed system are com-
arrange for the disposal of old equipment the customer        posed of hundreds of nodes1, during the first decade of
no longer needs (or that was traded in for new equip-         the 21st century massively distributed systems will be
ment). Companies will optimize costs by coordinating          composed of thousands to billions of nodes, supporting
this with the warehousing and shipment of the pur-            hundreds to millions of users [4].
chased equipment as well as with the disposal of the          The scaling issue for massively distributed systems has
equipment of other customers.                                 a number of dimensions. In the area of communications
Embedded systems will play an important role in the           capacity, recent events in the telecommunications indus-
new economics. Once a manufacturing facility receives         try promise to dramatically change the available world-
a product specification, embedded processors in robotic       wide bandwidth for networking. For example, Project
equipment will tailor its mechanical assets to create that    Oxygen of the CTR Group, Ltd. intends to lay fiber
product. Embedded processors will sense resource usage        across the ocean floors resulting in bandwidth of 1.28
and work with control systems to ensure parts are deliv-      terabits/sec on any segment [5]. Current schedules call
ered when necessary. Embedded systems will monitor            for the completion of an Atlantic ring by Q4 2000 and a
and schedule warehouse and delivery capacity, cooperat-       Pacific ring by Q2 2001. The pricing schedules pro-
ing with control systems to move the product to the           posed by CTR Group will result in transoceanic band-
customer. Embedded systems in robots will manage the          width offered at 1.5% of current satellite circuit prices
repair of products returned for maintenance.                  and 1% of marine cable prices.
                                                              Power, water, natural gas, vehicle fuel and mass market
3. Design issues                                              supply chains are naturally large scale. Traditional in-
                                                              formation supply chains (e.g., radio, television, motion
Massively distributed systems are not only quantita-
                                                              picture, the printed news media) are broadcast based and
tively different, but also qualitatively different than the
                                                              characterized by a small producer-to-consumer ratio.
distributed systems in place today. Their size and scope
                                                              However, this is undergoing radical change. The growth
introduce new problems in design and implementation.
                                                              of the world wide web has greatly increased the pro-
However, to a certain extent we can use the experience
we have gained building moderately scaled distributed
systems to guide us in solving these problems.                1
                                                                Some may object to this assertion, observing that the current Inter-
                                                              net is composed of millions of nodes. However, the term distributed
This section presents a taxonomy of design issues for         system here means a cohesive set of computing resources acting
                                                              together to achieve a common objective. While the routing infra-
massively distributed systems. It is based on seminal         structure of the Internet can be considered to be a massively distrib-
work [3] that analyzed the current generation of distrib-     uted application, no other system of this scale exists currently.
ducer-to-consumer ratio in data communications and this         per-use approach, which is suited to their ephemeral
trend will continue. There may even come a time when            nature. Longer-lived communities may be more in-
the ratio approaches 1. This one factor has the potential       clined to charge a subscription fee for their services.
to completely change how information supply chains              Some communities with external financial backing
are implemented and could be the major driver behind            may choose to charge no fee, transferring value to
massively distributed systems.                                  their sponsors through advertising, indirect persua-
                                                                sion, or charitable benefit. Each of these costing
Geographically, massively distributed applications sup-
                                                                models must be supported by the massively distrib-
porting information supply chains will be global in
                                                                uted system infrastructure in a way that allows such
extent. A single application running over a massively
                                                                diverse activities to exist concurrently.
distributed system will interconnect users of signifi-
cantly different languages, cultures and views. Commu-
                                                             • To support cost recovery for distributed applications,
nication costs will be structured so that they scale loga-
                                                                several efforts are underway to build and deploy elec-
rithmically with the number of destinations, thereby
                                                                tronic monetary systems. Existing applications are
utilizing efficiencies inherent in broadcast and multicast
                                                                able to use a single funds transfer system, since
communications. Individual subscribers will be able
                                                                many serve a limited customer base. Massively dis-
affordably to communicate with one to a hundred other
                                                                tributed systems, on the other hand, will be global
subscribers. One-to-many communications will be the
                                                                in scope and normally serve customers in many
norm, as opposed to one-to-one transfers, which are
                                                                countries who use a large variety of payment sys-
most common today.
                                                                tems. For some, only traditional methods, such as a
                                                                credit card or cash payment card will be available;
3.2. Heterogeneity                                              while others may use different electronic systems,
While heterogeneity is an important design issue for            such as digital cash, digital checks, or digital credit
existing distributed systems, its extent in massively           cards. Embedded systems will play an important
distributed systems will be significantly greater. In par-      role in the latter class of payment system. Mas-
ticular:                                                        sively distributed systems must accommodate the
                                                                use of all such systems by a single application.
• The communications infrastructure will be composed            This will necessitate the development of funds trans-
   of channels of very different capacities. Very low           fer transaction clearinghouses that will convert funds
   bandwidth channels (e.g., 1200 BPS) will still be in         from one system to the other.
   use in developing nations during the next decade,
   while the developed world will support very high          • Distributed systems currently run on equipment man-
   bandwidth channels in the core services (e.g., 1             aged by different administrations, which desire to
   Tbps). Communications between embedded systems               keep its use under their control. Current practice is
   and their peers or higher-level facilities will occur        to rely on the users of this equipment to properly
   over channels of significantly lower bandwidth than          employ it according to policies promulgated by
   that available in the core of the network. Building          these administrations. However, this approach is
   applications that run over bandwidth diverse com-            changing as organizational IT budgets become larger
   munication infrastructure will require new ways of           and as Board of Directors, stock holders and other in-
   organizing data flows.                                       terested parties hold management more strictly ac-
                                                                countable for information processing equipment use.
• Similarly, end-systems will possess a wide variety of         Organizationally imposed constraints on distributed
   presentation techniques, from interactive HDTV to            applications have a deleterious effect on their opera-
   personal digital assistants and personal phone               tion. A clear example of this is how router based
   equipment. Applications will combine these end-              firewalls not only provide protection against intruder
   systems into coherent interactive subsystems. In             attacks, but also prevent certain distributed applica-
   fact, it is possible today to create a communications        tions, such as those using UDP, Java applets or
   conferencing system that patches together video-             those using the X window system from running
   teleconferencing equipment, multicast capable work-          across firewall boundaries. Massively distributed
   stations and cellular phones.                                systems will experience even greater impediments to
                                                                their operation, since not only will locally imposed
• Participatory communities, formed on demand, will             constraints interfere with their operation, but other
   choose from an array of costing options for their            constraints will hinder them, such as national re-
   members. Short-lived communities may elect a pay-            strictions on transborder data flows and encrypted
data as well as the mandated use of certain standards            also on a wide variety of embedded systems sup-
      for communications protocols.                                    ported by their own proprietary operating systems
                                                                       and hardware (e.g., automobile control systems,
• Distributed applications are composed of software                    PDAs). For example, a massively distributed appli-
   components2 that run on various platforms and are                   cation for remote conferencing may have compo-
   organized into component classes (e.g., file servers,               nents that run on workstations, on equipment pro-
   public key distribution systems, file system clients,               vided by a manufacturer for the TV cable industry,
   various clients on embedded systems), each of which                 on cellular phones, on PCS based personal commu-
   performs a different function. Current distributed ap-              nication devices, and so forth. This will increase the
   plications either assume that all components run the                number of different implementations of the software
   same version of software, or are structured according               for a single component type and thereby increase the
   to a simple client/server model that must concur-                   effort necessary to ensure the application works cor-
   rently accommodate different versions on a small                    rectly.
   number of system (e.g., two or three). Massively
   distributed applications, because of other heterogene-
   ity requirements, such as the use of different presen-           3.3. Objects (representation,
   tation formats, costing and cost recovery schemes,               encoding and translation)
   naming techniques and administration constraints,                A variety of efforts are underway to determine the best
   will be constructed of a significantly larger number             programming model for distributed objects (e.g.,
   of component types. Components of these types in                 CORBA, Java). Lessons from these investigations will
   general will not run the same version of software.               directly affect how objects are handled in massively dis-
   Designing and implementing these applications will               tributed systems. However, there are certain issues aris-
   therefore require engineering techniques that allow              ing from the scale of these systems that will introduce
   them to operate in the face of software versioning               new constraints on how objects are represented, encoded
   heterogeneity.                                                   and translated.

• One of the fundamental mistakes made when building                • Massively distributed system by their nature will re-
   distributed systems is ignoring legacy applications                 quire the creation, movement, modification and de-
   and infrastructure. This invariably leads to poor cus-              struction of massive objects, which conceivably
   tomer acceptance of new technology. Customers                       could contain thousands to millions of other ob-
   cannot simply discard their existing applications                   jects. The size of these objects will affect how they
   when new infrastructure becomes available. In many                  are represented. For example, it will be infeasible to
   cases these applications and their supporting infra-                incorporate a copy of each contained object within
   structure represent billions of investment dollars and              the massive object. Instead, object references will be
   reimplementing them according to the interfaces                     used within massive objects to represent its con-
   presented by new technology is economically infea-                  stituents. These objects will themselves be con-
   sible. Massively distributed systems will be no dif-                structed using references to their contained objects,
   ferent in this regard. However, existing distributed                leading to a hierarchical object anatomy. However,
   systems generally must accommodate stand-alone                      several well-known problems associated with refer-
   legacy systems and applications. Massively distrib-                 encing distributed objects will require attention. For
   uted systems, on the other hand, must accommodate                   example, implementers of massively distributed sys-
   both stand-alone as well as current distributed sys-                tems will have to solve the lost object problem,
   tem legacy infrastructure and applications.                         which occurs when an object is destroyed without
                                                                       destroying all its references in other objects. Rely-
    • While most existing distributed applications run on a            ing on manual techniques to recover from this error
       number of different computing platforms, they are               condition will not be an option, because of its scale.
       generally limited to a small number of common                   Similarly, revoking access to an object will be
       families, e.g., Unix, Windows, and perhaps MVS.                 much more problematic for massively distributed
       Massively distributed applications, on the other                systems than for existing distributed systems. It
       hand, will run not only on the legacy platforms, but            will be infeasible to locate all references to an object
                                                                       and delete them, even if the underlying object repre-
2
 The term “component” is used here in its general sense. Distrib-      sentation allows that. While using access lists can
uted system components may or may not be constructed as object-
oriented software components.                                          mitigate the problem of revocation, their use in
                                                                       massively distributed systems will be problematic,
since their size may make this approach unattrac-            new problems in object store performance, error
   tive. This problem is discussed in more detail in            control and correctness. Object implementers will
   section 3.5.                                                 have to deal with the internal synchronization of ob-
                                                                ject constituents, in regards to their creation,
• Not only will the representation of massively distrib-        movement, modification and destruction. They also
   uted objects require new techniques, their presenta-         will have to deal with these issues when designing
   tion to users will also require innovation. Some re-         and implementing object caches.
   searchers have examined this problem. A new class
   of user interface represents objects as virtual spaces.   • The algorithms used to manipulate massively distrib-
   This technique is eminently suitable for presenting          uted objects will utilize techniques that differ from
   massively distributed objects to end users. For ex-          those commonly used today. They will be replaced
   ample, a first level object might be represented as a        by those that use multi-cast communications in or-
   virtual world, its constituent objects as countries,         der to avoid object components understanding the
   then cities, streets, houses, rooms, and so on, the          object's global architecture. Voting and distributed
   exact structure depending on the size of the first           agreement algorithms will become common. Pro-
   level object and its relationship to its constituents        grammers will create active objects, i.e., those
   as well as the relationship of the constituents to           which include active on-going computing, that blur
   each other. Such presentation paradigms will be re-          the distinction between data and processing. While
   quired to make massively distributed objects acces-          some existing distributed systems support active ob-
   sible to the technologically naive user.                     jects, massively distributed systems will organize
                                                                large parallel computations around this concept,
• Since massively distributed applications will be global       forming these computations by creating a first level
    in scope, their users will belong to multiple cul-          object containing a large number of active objects
    tures and countries. This will increase the necessity       residing on geographically disperse and administra-
    for the internationalization of data. While current         tively heterogeneous systems.
    applications can be configured to use different char-
    acter sets, depending on choices made when the sys-
    tem is installed, massively distributed systems will     3.4. Resource management
    have to internationalize data on-the-fly. This is es-    In order to design and build massively distributed appli-
    pecially true of embedded systems, which com-            cations, engineers will have to grapple with new prob-
    monly will exist in mobile equipment. There will         lems in resource management. Many existing distrib-
    be no point when a system administrator can select       uted systems operate according to a local resource con-
    a particular internationalized presentation set. Users   trol model. Local processing manages its own
    of massively distributed applications will come and      resources, interacting with other threads of control
    go during its lifetime, requiring run-time configura-    through message passing or remote procedure
    tion of this data. Perhaps more importantly, there       call/method invocation. In massively distributed sys-
    will not be a single system administrator who con-       tems, objects will be composed of resources located in a
    trols a massively distributed application. Its control   large number of different places. Controlling the re-
    will be distributed as well. Not only will character     sources associated with an object will only be possible
    set data require internationalization, the application   through a distributed global object resource management
    will have to internationalize other data such as fi-     mechanism. This will introduce several new issues in
    nancial, length/mass/time, and timezone. For exam-       distributed system resource control:
    ple, an application that presents financial data may
    have to convert between dollars, yen and marks, de-      • Routing will become an issue not only at the network
    pending on where the end-point system is located            layer of the distributed system, but also at the appli-
    and how it has been configured by the customer.             cation layer. Composite objects containing active
                                                                objects will require routing services so the compos-
• The storage of massively distributed objects will re-         ites can interact with each other, especially if ob-
   quire new techniques. Current object storage sys-            jects are allowed to move. For example, an embed-
   tems assume objects are located on resources con-            ded system such as a mobile PDA will move and
   trolled by a single administration. The storage of a         connect to different portals in a network. If upon
   single massively distributed object, however, may            connecting, objects are moved from the PDA to
   use the resources of many storage systems, con-              execution environments hosted near the portal, in-
   trolled by different administrations. This will lead to      vocation of those object's methods by other objects
will require a routing protocol to identify a path be-     • Massively distributed systems will contain extremely
   tween them. The interactions between active objects           large volumes of data and enormous processing
   and passive objects will require routing so active ob-        power. Effectively managing these resources will
   jects can locate and contact passive object methods.          challenge implementers. Distributed system archi-
   Congestion control within composite objects will              tects will merge the two prevalent ways to organize
   be an issue. Programmers and object implementers              computations in a distributed system, i.e., move
   will organize objects to avoid computational hot              data to the processing (used by NFS, World Wide
   spots, similar to those experienced by massively              Web, FTP, gopher) and move processing to the data
   parallel algorithms. Due to their scale, object               (used by active networking and Java applets) into a
   implementers will require adaptive routing and con-           single scheme. Such objects will be moved within
   gestion control within massively distributed objects,         the distributed system and carry with them both code
   manual configuration will not be an option.                   and data [6]. If the object consists primarily of data,
                                                                 it will closely approximate moving data to the proc-
• Resource management in a massively distributed sys-            essing. If it consists primarily of code, it will
   tem will interact strongly with its heterogeneous na-         closely approximate moving processing to the data.
   ture. Applications will compose resources under the           RPC and other procedure-oriented paradigms will be
   control of many administrations into a single dis-            emulated by including a single high level instruc-
   tributed resource. Management of the distributed re-          tion in the code section, specifically a procedure
   source must accommodate the usage policies of the             identifier, which will be interpreted at the RPC
   separate administrations. For example, cost recovery          server.
   for a massively distributed object must accrete and
   disseminate sufficient funds to cover the costs of the     • The deployment of massively distributed systems will
   composite objects. Since massively distributed ob-            encourage the trend towards a new valuation model
   jects may be highly recursive, cost recovery will re-         for information. Specifically, the independent vari-
   quire the composition of recursively discovered               able driving value will be how long the information
   composite costs. As objects evolve, their cost re-            has existed. Massively distributed systems will in-
   covery anatomy will change, forcing object                    crease the difficulty in keeping information private.
   implementers to dynamically configure the object's            Consequently, customers will pay for fresh informa-
   cost recovery algorithm. Since it will be difficult           tion, which will rapidly decrease in value once it is
   and expensive to continually track the exact cost             produced. For example, stock market ticker data is
   structure of an object at any point in time, cost re-         now available electronically, making automated trad-
   covery will use statistical algorithms, assessing             ing programs possible. However, this implies that
   charges and dispersing payments so that the owners            most of the value of the ticker data is consumed in a
   of component resources receive their payments, but            very short period of time. Its public release without
   avoiding too fine an accounting of usage. Periodic            cost routinely occurs today 15 minutes after its gen-
   audits will ensure systematic fraud is minimized.             eration. As massively distributed systems become
   Operators of massively distributed objects will adopt         common, more and more stock traders will be forced
   a fixed cost model, allowing users of the object to           to use automatic trading programs, which will re-
   flexibly utilize its resources without charging for its       duce the value of the ticker data even faster. It is
   components use.                                               likely that eventually ticker data will be available
                                                                 without cost within a few minutes of its production.
• Object implementers will devise algorithms enforcing           This paradigm will apply to other data, such as li-
   restrictions on object use that are required by restric-      brary searches, financial analyses and market re-
   tions on their components. Unlike existing distrib-           search.
   uted resource management, these algorithms will
   have to be adaptive, since implementers will not           • Upgrading software in a massively distributed system
   know all possible component resource usage poli-              will pose new problems for software companies. It
   cies, a priori. Object implementers will be chal-             will become more difficult to locate all users of a
   lenged to understand object behavior, since dynami-           particular version of software and users will become
   cally changing an object's components, which may              less cognizant of the version they are using. This
   introduce new resource usage restrictions, will make          situation will arise because objects will contain and
   this difficult.                                               use other objects implemented by software not di-
                                                                 rectly visible to the user. To meet this new chal-
                                                                 lenge, software companies will move from a pure
product business model to a combination of product             which a user is capable of manually examining for
   and service model. Indeed, this is already happening.          interest, usefulness, or suitability. Various filtering
   Many software companies offers a subscription pur-             techniques will be developed and used by customers
   chase agreement for their products that provides the           to ensure their systems only present to them infor-
   customer with one year of upgrades. Engineers will             mation in which they are interested. Additionally,
   design their software to periodically query mainte-            once these filtration systems are common, applica-
   nance centers that will upgrade it automatically, if           tions will use feedback controls to gather and em-
   the software's maintenance contract is still active.           ploy end-system supplied filtration data to decrease
   Again, this is current practice. For example, the vi-          the amount of information sent to an end-point that
   rus detection software that is part of the Norton Sys-         is automatically discarded. Massively distributed
   tem Tools product for Windows 95/98 periodically               systems will provide tools to applications that al-
   communicates with a maintenance site on the Inter-             low them to support this type of customization for
   net and downloads new virus checking features as               large numbers of end-point systems.
   they are released. Customers will be forced to use
   maintenance contracts to keep their applications
   running. In the future, customers will force software       3.5. Protection (security and
   manufacturers to provide open interfaces to their           privacy)
   software and negotiate maintenance agreements with          Protection of distributed system assets, including base
   secondary companies, much like the hardware busi-           resources such as processing, storage, communications
   ness does today.                                            and user-interface I/O as well as higher-level composites
                                                               of these resources, e.g., processes, files, messages, dis-
• Massively distributed systems will force a reexamina-        play windows and more complex objects is not a
   tion of the way communication interfaces work. In           strength of existing distributed systems. While engi-
   particular, they will have to scale to accommodate a        neers are currently engaged in developing solutions to
   large number of correspondents. For example, a              the many problems that exist in this area, they are not
   simple mass-market purchase application that al-            addressing protection issues that will arise as massively
   lows customers to choose between several products           distributed systems become prominent. Specifically:
   may receive millions of requests in a short period of
   time. Current programming interfaces to communi-            • Massively distributed systems for the most part will
   cation services are totally inadequate to handle such          support a large number of end-systems, many of
   a high rate of transactions. Furthermore, individu-            which will be embedded in other equipment and used
   ally acknowledging each of these transactions would            by technologically naive customers. These systems
   impose a significant processing burden on the sys-             will require management, which very probably will
   tems running the application. This will encourage              occur through the use of systems under the control
   engineers to design communication interfaces and               of service providers. Since many end-systems will
   protocols that are better suited to high rate and vol-         either generate or contain data that customers con-
   ume transactions.                                              sider private, the management technology for mas-
                                                                  sively distributed systems must guarantee, to a rea-
• Since massively distributed systems will dramatically           sonable extent, that unaggregated data is not com-
    increase the amount of traffic handled by a commu-            promised either by individuals operating the
    nications fabric, engineers will investigate how to           management sub-system or by the service providers
    efficiently move extremely large volumes of traffic           as part of their corporate strategy, e.g., during the
    over the internet. In addition much of this traffic           collection of marketing data. To meet this objective,
    will belong to one of several different quality of            end-systems must be customer anonymous with re-
    service classes (e.g., best effort delivery data, stream      spect to the management sub-system. That is, there
    data) and so understanding how to accommodate dif-            must be no tie between the identifiers used to man-
    ferent quality of service parameters in the internet is       age end-systems and those used for billing, warranty
    a problem with high priority. Fortunately, this               or other services that require the identification of the
    problem is currently receiving significant attention          individuals who own or use these end-systems.
    by the Internet Engineering Task Force and the
    IEEE.                                                      • Since one-to-many communications will play a major
                                                                   role in massively distributed systems, engineers will
• Massively distributed systems will support a volume              have to address the problems of deletion, modifica-
   of information flow to an end-system beyond that                tion, insertion, replay, release of secret state, and
masquerade for this type of communications. Exist-          forcing these legal constraints will be just as hard as
   ing techniques are designed to protect one-to-one           implementing the technology that satisfies them,
   communications against these threats. New proto-            over time the constraints may be relaxed or even
   cols and processing algorithms will be required to          eliminated. However, while they are in force, the
   provide these services for one-to-many communica-           technical problems they raise will be formidable.
   tions.
                                                            • Commercial enterprises will play a large role in mas-
• The scale of massively distributed systems will intro-       sively distributed applications. Consequently, new
   duce new problems in resource access control for            threats will arise as the value of these applications
   both end-systems and infrastructure support sys-            increases. One concern is the security of internet-
   tems. Currently, most systems use access control            work routing. Massively distributed systems will
   lists or their approximations (e.g., Unix file access       introduce new factors into this problem. The infra-
   permission bits). However, massively distributed            structure required for such systems will necessarily
   systems will support millions of principals, which          federate other large networks into a global commu-
   would lead to access control lists of unmanageable          nications fabric. However, routing service providers
   size, if implemented as they are currently.                 may wish to limit the traffic that transits their net-
   Implementers will require new techniques, such as           works and subscribers may demand guarantees that
   access list caching hierarchies, capability systems         the quality of service advertised by routing service
   that solve the revocation and lost object problems,         providers is actually delivered. The former problem
   and access control techniques that combine the ad-          has received a great deal of attention, leading to the
   vantages and characteristics of access control lists        formulation of policy routing techniques. The latter
   with capability access control and eliminate the dis-       problem on the other hand, has received less atten-
   advantages of each. Since massively distributed sys-        tion and will require further study.
   tems will be highly heterogeneous and require the
   support of legacy distributed systems, engineers will
   be forced to grapple with the hazards of composing       3.6. Naming
   systems that use access control lists with those us-     Engineers have devoted considerable attention to the
   ing capability based access control [7].                 issue of naming in distributed systems over the last 20
                                                            years. Pioneering efforts, such as Grapevine [8], the
• When confronted with the problem of information           Domain Name System (DNS) [9] and NIS [10], devel-
   protection, existing distributed systems already must    oped the concepts for the next generation of distributed
   cope with the controls governments have placed on        naming systems, such as X.500 and LDAP accessed
   cryptographic technology. These constraints have         directories. To some extent these second-generation sys-
   hampered engineers in providing appropriate levels       tems, especially X.500, are designed for large scale dis-
   of security to the users of distributed systems. Mas-    tributed systems. In particular, X.500 supports the stor-
   sively distributed systems will generate new prob-       age and retrieval of customer defined data structures, a
   lems in this area. Since communications will occur       hierarchically structured and distributed naming context
   between large numbers of end-systems, security           mechanism and decentralized authentication services.
   service implementations will be hard pressed to as-      The Open Group's Distributed Computing Environment
   certain which systems are constrained by particular      (DCE) uses either X.500 or DNS to construct a large
   national laws concerning encryption and what those       scale naming system from its local Cell Directory Sys-
   limitations might be. As mobile systems become           tem [11].
   more prevalent and as they are integrated into mas-
   sively distributed systems, enforcement of these         Even though the designers of second generation distrib-
   constraints, even if they are known, would require       uted naming systems have attempted to accommodate
   tracking the position of a mobile system, identify-      massively distributed systems in their design, there are a
   ing the constraints imposed by its locality, and         number of naming issues that transcend these designs
   communicating these constraints to all systems in-       and require attention:
   teracting with it. For the number of end-systems         • The volume of data within a massively distributed
   that will engage in a massively distributed applica-        naming system will be so large, tens to hundreds of
   tion and for the frequency with which a mobile sys-         millions of entries, that customers will not know
   tem's position might change (e.g., those used from          where to start when looking for an item. Hierarchi-
   or embedded in an automobile, train, boat or air-           cal structures have the advantage of logarithmically
   plane), this will be technically infeasible. Since en-
scaling the name space, given a specific choice of             around, which will in turn require meta-naming
   how the data should be organized. They have the                services (i.e., to name the naming context). Trans-
   disadvantage of constraining efficient queries to              lating naming contexts will necessitate the discov-
   those whose unknowns are at the bottom of the hi-              ery of the relationships between the different con-
   erarchy. Since a single massively distributed system           texts. Furthermore, object sharing requires the nam-
   will be used for a wide variety of purposes, multiple          ing of objects by different entities. Communicating
   indexes into its naming data will be required. Inde-           names for the purpose of sharing requires moving
   pendent services will manage these indexes and each            the associated naming contexts.
   will need to update their meta-data as the foundation
   data changes. This already exists for indexes of web-       • Different cultures may require different naming
   page data. However, as anyone who has used inter-              schemes, some hierarchical, some local. These must
   net search engines realizes, the volume of data re-            be coalesced into a usable massively distributed sys-
   turned for many queries is too large for efficient use.        tem naming scheme. The same object may be
   Higher-level indexing services will use the services           named differently depending on the use of the name
   of lower-level services, leading to a hierarchy of             and its characteristics (e.g., native language, avail-
   naming data accessible by customers. This web will             able character set).
   contain data of varying degrees of freshness, which
   will require new methods and algorithms to present
   a consistent and coherent view to customers.                3.7 Error Control
                                                               To a certain extent the problem of error control in
• As stated previously, a large number of the organiza-        global networks has not been solved for existing dis-
   tional structures within a massively distributed sys-       tributed systems. For example, multicast links fail dur-
   tem will be ephemeral, e.g., information communi-           ing video teleconferences, ftp sites become unreachable
   ties which will unite around a particular event and         during a transfer due to firewall, router or link failure,
   then dissolve. However, many who would be candi-            and end-system failure results in broadcast storms that
   dates to participate in such an event may not be            cripple local communications. All of these events are
   aware, a priori, that it exists. The scale of a mas-        difficult to analyze and correct automatically. They in-
   sively distributed system invalidates traditional ad-       variably require the use of out-of-band channels to no-
   vertising techniques, since the volume and rate of          tify a responsible technician or system administrator. In
   the advertising information arriving at an end-             a massively distributed system, however, the identifica-
   system using these approaches would overload it.            tion and use of an appropriate out-of-band channel will
   Consequently, engineers will devise new pro-                be problematic. Furthermore, service providers must
   ducer/consumer models of advertising. For example,          provide error control in ways that conform to other re-
   one possibility is a hierarchically structured reverse      quirements, such as privacy protection, high perform-
   advertising technique, whereby the customer adver-          ance and scalability. Such constraints lead to the follow-
   tises desirable product attributes to first tier brokers,   ing problems:
   which accrete these into reverse advertisements to
   higher level brokers. From other leaves in the hier-        • Fault isolation and correction in massively distributed
   archy, product producers advertise their products to            systems will require new infrastructure services to
   other first-tier brokers, which likewise accrete them           monitor communications quality and deliver excep-
   into advertisements to higher level brokers. Some-              tion alarms to service providers when quality falls
   where within the hierarchy a rendezvous occurs and              below a given threshold. Since the number of dis-
   direct advertisements are sent or otherwise made                tributed applications running in a massively distrib-
   available to customers. A simple example of this                uted system will be too large for manual monitoring
   strategy for advertising and using computing re-                to be cost effective, service providers and distributed
   sources has already been prototyped [12]                        system implementers will design, implement and
                                                                   deploy automatic fault detection and isolation
• Moving objects requires moving names embedded in                 mechanisms to ensure service remains available dur-
   the object. This necessitates either retaining the con-         ing various resource outages. This will be especially
   text in which those names are interpreted or translat-          important for distributed embedded systems. The
   ing them into contexts available at the new object              functions of existing network operation centers will
   site. Massively distributed systems makes either                become automated. Network operation centers will
   strategy difficult. Retaining contexts will compel an           focus on higher level fault isolation and control,
   object to maintain contact with them as it moves
such as rerouting communication services when a              within communication protocols, but also within
   catastrophic event takes down large subsystems.              objects so their state driven algorithms are fault tol-
                                                                erant. To achieve this objective, there will be an in-
• Engineers of massively distributed systems will intro-        creased use of voting protocols and algorithms.
   duce new ways to process a large volume of infor-
   mation and number of transactions. As previously          • It will be impossible to keep track of all the identifiers
   described for costing strategies, statistical algo-            or other processing state associated with an object.
   rithms will be used to cope with the decreasing                Massively distributed systems must be designed to
   probability that all necessary fault identification in-        work in the face of lost objects, which will be the
   formation is available in a reasonable amount of               rule rather than the exception. Loss of identifiers
   time. For example, consider the problem of auto-               will also increase the incidence of orphaned objects,
   mating the collection of payments for a video serv-            which will increasingly require the use of global ob-
   ice. Errors in transmission, bugs in software, unre-           ject garbage collection.
   sponsiveness of end-system equipment due to mal-
   function and operator error will lead to a certain        • One-to-many communications on a large scale will
   amount of imprecision in the accounting and collec-          encourage the development of forward error correc-
   tion processes. An automated billing and collection          tion techniques, which will replace feedback error
   application will record all funds received, match            control in many cases. Feedback error control re-
   them with outstanding balances on accounts and at-           quires the retention of a large amount of state and a
   tempt to resolve the amount received with the over-          high level of processing when applied to one-to-
   all outstanding balance. A discrepancy below a cer-          many communications. Idempotent operations will
   tain threshold will be charged to the cost of doing          become the paradigm of choice when one-to-many
   business. Auditing processes will attempt to ensure          communications support remote procedure or object
   that these discrepancies are not systematically occur-       method calls.
   ring as the result of fraud. Such techniques will also
   be used for automated opinion polling, physical re-       • Error control will move more and more into the appli-
   source flow monitoring (e.g., power, water), and              cation layer as the error characteristics of lower lay-
   distributed system infrastructure maintenance.                ers become heterogeneous (especially for one-to-
                                                                 many communications) and as applications are
• Massively distributed systems will require the use of          widely distributed. [13]. To manage application error
   highly available subsystems, including a redundant            control, management systems will expand their con-
   communications fabric provided by multiple service            cerns from the network layer and below to all layers
   providers, fault-tolerant distributed processing for          of the protocol hierarchy. This will lead to the stan-
   control and accounting, and high availability end-            dardization of management protocols and interfaces.
   systems for content providers, which will be the
   next logical step beyond the high availability file
   systems currently deployed. Fault-isolation will          3.8 Synchronization
   have to operate in the face of privacy services, such     One of the most difficult problems that engineers of
   as encryption, that are virtually non-existent today,     massively distributed systems will encounter is syn-
   but that will become prominent in the next decade.        chronizing computations consisting of thousands to
   Service providers will have to respond to subscriber      millions of components. Current methods of synchroni-
   problem reports, even when the ultimate source and        zation, such as semaphores, monitors, barriers, remote
   destination of traffic causing the problem is un-         procedure call, object method invocation, and message
   known. They will gravitate toward a scaled cost           passing, do not scale well. Generally, they are suitable
   structure for various grades of service, from high        either for synchronizing closely coupled computations
   grade indemnified service ("you lose service, we lose     (e.g., semaphores, monitors and barriers), or for unicast
   revenue"), to lower grades of best effort service.        distributed applications (e.g., remote procedure calls,
                                                             object method invocations and message passing). Engi-
• Detection of inconsistent state will be difficult in       neers have not yet devised efficient synchronization
   massively distributed systems, since the distributed      techniques for group computations, when groups con-
   state will not be available for centralized processing.   tain thousands to millions of active elements. Specific
   Consequently, algorithms for communications and           problems in the area of synchronization include:
   processing will be developed that are tolerant of in-
   consistencies. They will use redundancy not only
• Caching will become pervasive to accommodate per-           • To accommodate the improbability of a component
   formance, error control and resource management               knowing all other components with which it should
   objectives. Keeping all cached copies synchronized            synchronize, multicast communications will be-
   will require new cache coherency algorithms, since            come a major synchronization technique. Thus, a
   simple write-back or write-through schemes do not             component might multicast a message that an-
   scale properly. This will lead to synchronization             nounces the occurrence of an event to a multicast
   marker algorithms where an authoritative copy of              group without knowing exactly which other com-
   information is broadcast periodically to resync               ponents belong to that group. Use of this technique
   cached state. Resolution of differences will occur as         will rely on the designer making a trade-off between
   a result of this synchronization activity. However,           synchronization and resource consumption.
   there will be many locations where portions of a
   large database are authoritative and redundancy in the     • Synchronization and fault tolerance will interact
   data will accommodate failures.                                strongly. Synchronization mechanisms will be re-
                                                                  quired to operate in the face of a certain level of
• The scale of massively distributed applications will            component and communications failure. However,
   encourage engineers to use asynchronous comput-                synchronization algorithms will be designed to
   ing, mimicking the way live organisms function.                move the distributed application towards a well-
   This will be a revolution in the way computations              defined goal even when failures occur. Thus, mas-
   are designed, implemented and operated. It will no             sively distributed applications will make significant
   longer be possible to assume or reason that the state          use of probabilistic algorithms that complete with a
   of some distributed application equals a given value           probability less than unity in a given length of
   at any point in time. Instead, engineers will design           time, but that ensure this probability always mono-
   distributed application algorithms so that an invari-          tonically increases in time.
   ant is always true with high probability.

• The use of pre-agreement algorithms based on roughly        3.9 Measurement, testing and
   synchronized clocks will emerge as a major syn-            debugging
   chronization technique for massively distributed ap-       The implementation of massively distributed applica-
   plications [14]. The accuracy and drift of local           tions requires measurement, testing and debugging to
   clocks will become a major engineering problem.            ensure applications behave correctly and perform well.
   Separate synchronization systems (e.g., the NIST           These services are also required to correct implementa-
   time standard broadcast over WWV) will be used to          tions when they fail to meet either of these criteria. The
   decouple clock synchronization from resource deple-        scale of massively distributed systems makes the execu-
   tion problems and other errors of massively distrib-       tion of these tasks difficult. In particular:
   uted applications. The use of geographically distant
   physical clocks on high-precision real-time applica-       • Gathering the necessary measurement data to determine
   tions will require the transmission of both clock and         whether massively distributed applications are per-
   location in a synchronization schedule to accommo-            forming properly will be problematic. Directing this
   date relativistic effects.                                    data to a single destination will almost certainly in-
                                                                 terfere with the phenomena being measured. Conse-
• It will be virtually impossible for thousands of com-          quently, measurement will occur through an auxil-
     ponents to synchronize using communications when            iary distributed application, running in tandem with
     it is not possible to use an external synchronizing         the target application. The measurement application
     signal. If synchronized clocks are not achievable, for      will itself be massively distributed, requiring its
     example, because some components do not have                own control infrastructure as well as new monitor-
     real-time clocks (e.g., simple embedded systems),           ing techniques that allow it to maintain contact with
     barrier synchronization implemented as a hierarchi-         the measured application even when parts of it are
     cal tree will be necessary. Because of the attendant        created, destroyed and moved.
     performance degradation associated with barrier syn-
     chronization, it will only be employed at major          • Testing massively distributed applications will require
     synchronization points separated by long intervals          new techniques to ensure the application scales.
     of time.                                                    Current approaches used by testing engineers, which
                                                                 concentrate on determining whether the application
                                                                 works properly on a single machine will be inade-
quate for massively distributed applications While         [3] Watson, R. W., "Distributed system architecture
   there is current work to develop harnesses to test         model," in Distributed Systems, Architecture and Im-
   client/server based applications, the introduction of a    plementation, Springer-Verlag, Berlin, NY, 1981.
   third support process into the test harnessthat sup-
                                                              [4] Takada, H. and Sakamura, K., "Compact, low-cost,
   ports client/server interactions (e.g., a database con-
                                                              but real-time distributed computing for computer aug-
   taining public key certificates) is rare. Due to the
                                                              mented environments," Proc. 5th IEEE Comp. Soc.
   expense of developing a dedicated test harness for
                                                              Workshop on Future Trends of Dist. Comp. Sys.,
   massively distributed applications, engineers will be
                                                              Cheju Island, Korea, Aug., 1995, pp. 56-63.
   forced to test applications during large scale beta de-
   ployment. This is in fact becoming the norm for            [5] Lange, L., "The Internet," IEEE Spectrum, January,
   much mass market software.                                 1999, pp. 35-40. (see also www.oxygen.org).
                                                              [6] Sadok, D. H., Kelner, J., and Silva, R.A., "A dis-
• It may not be possible to debug a massively distrib-
                                                              tributed programming platform using mobile agents,"
     uted application under full load, so engineers will be
                                                              3rd Inter. Symp. On Autonomous Decentralized Sys.,
     forced to make trade-offs between debugging and
                                                              April, 1997, pp. 103-110.
     real-time fault isolation with correction. Traditional
     debugging techniques, such as breakpointing, do not      [7] Nessett, D.M., "Factors affecting distributed system
     scale. Therefore, trace-driven analytical techniques     security," IEEE Trans. Soft. Eng., vol. SE- 13, pp.
     will predominate when debugging massively dis-           223-248, 1987.
     tributed applications. Since test harnesses of the
                                                              [8] Birrell, A.D., Levin, R., Needham, R.M.., and
     requisite massive scale will be economically infea-
                                                              Schroeder, M.D., "Grapevine: an exercise in distributed
     sible, engineers will be forced to use execution
                                                              computing," Communications of the ACM, April,
     traces of live application runs. Such execution traces
                                                              1982, 260-274.
     may produce a very large volume of data, so test en-
     gineers will have to develop agile filtering, coalesc-   [9] Mockapetris, P.V., Dunlap, K.J., "Development of
     ing and viewing tools.                                   the domain name system," Proc. of SIGCOMM '88,
                                                              ACM, August 16-19, 1988, pp.123-133.
4. Summary                                                    [10] Weiss, P., “Yellow pages protocol specification,”
                                                              Technical Report, Sun Microsystems, Inc., Mountain
The nature of this paper is exploratory and most asser-       View, CA., 1985.
tions are presented without detailed justification. Tech-
nology forecasting is always risky and, in hindsight,         [11] Dilley, J., “Practical experiences with the OSF cell
rarely completely accurate. However, I hope the analysis      directory service,” Networked Systems Architecture,
at least stimulated the reader to think about massively       Hewlett-Packard, International Workshop OSF DCE,
distributed systems, how their characteristics will influ-    Karlsruhe, Germany, Oct., 1993
ence embedded system design and implementation, and           [12] Cappello, P., Christiansen, B., Neary, M., and
how they may change the way we do computing in the            Schauser, K., "Market-based massively parallel internet
future. Even if there is disagreement about its content,      compuing," Proc. 3rd Working Conference on Mas-
it has succeeded if it motivates readers to develop an        sively Parallel Programming Models, London, England,
alternative taxonomy of the design issues and challenges      Nov., 1997, IEEE Computer Society, pp. 118-129.
associated with massively distributed systems.
                                                              [13] Saltzer, J., Reed, D., and Clark, D., “End-to-end
5. References                                                 arguments in system design,” ACM Transactions on
                                                              Computer Systems 2(4), Nov., 1984, pp. 277-288.
[1] Gustavsson, R., "Agents with power," Communica-
tions of the ACM, Vol 42, No. 3, March, 1999, pp.             [14] Kopetz, H., "The time-triggered architecture," Proc.
41-47.                                                        1st IEEE International Symposium on Object-oriented
                                                              Real-time Distributed Computing, Kyoto, Japan, April,
[2] Tennenhouse, D.L., Smith, J.M., Sincoskie, W.D.,          1998, pp. 22-29.
Wetherall, D.J., and Minden. G.L, "A survey of active
network research," IEEE Communications Magazine,
pages 80-86, January 1997.

Mais conteúdo relacionado

Mais procurados

Chapter 1 -_characterization_of_distributed_systems
Chapter 1 -_characterization_of_distributed_systemsChapter 1 -_characterization_of_distributed_systems
Chapter 1 -_characterization_of_distributed_systemsFrancelyno Murela
 
Distributed systems1
Distributed systems1Distributed systems1
Distributed systems1Sumita Das
 
Lecture 1 (distributed systems)
Lecture 1 (distributed systems)Lecture 1 (distributed systems)
Lecture 1 (distributed systems)Fazli Amin
 
1 distributed-systems-template-modified
1 distributed-systems-template-modified1 distributed-systems-template-modified
1 distributed-systems-template-modifiedzafargilani
 
Distributed Systems
Distributed SystemsDistributed Systems
Distributed Systemscfenoy
 
Distributed System
Distributed SystemDistributed System
Distributed SystemIqra khalil
 
Distributed & parallel system
Distributed & parallel systemDistributed & parallel system
Distributed & parallel systemManish Singh
 
Distributed system notes unit I
Distributed system notes unit IDistributed system notes unit I
Distributed system notes unit INANDINI SHARMA
 
Distributed computing
Distributed computingDistributed computing
Distributed computingKeshab Nath
 
Intro (Distributed computing)
Intro (Distributed computing)Intro (Distributed computing)
Intro (Distributed computing)Sri Prasanna
 
distributed Computing system model
distributed Computing system modeldistributed Computing system model
distributed Computing system modelHarshad Umredkar
 
Introduction to Distributed System
Introduction to Distributed SystemIntroduction to Distributed System
Introduction to Distributed SystemRKGhosh3
 
Distributed operating system
Distributed operating systemDistributed operating system
Distributed operating systemudaya khanal
 
7 distributed and real systems
7 distributed and real systems7 distributed and real systems
7 distributed and real systemsmyrajendra
 
1. Overview of Distributed Systems
1. Overview of Distributed Systems1. Overview of Distributed Systems
1. Overview of Distributed SystemsDaminda Herath
 
Distributed system architecture
Distributed system architectureDistributed system architecture
Distributed system architectureYisal Khan
 
Distributed operating system(os)
Distributed operating system(os)Distributed operating system(os)
Distributed operating system(os)Dinesh Modak
 

Mais procurados (20)

Chapter 1 -_characterization_of_distributed_systems
Chapter 1 -_characterization_of_distributed_systemsChapter 1 -_characterization_of_distributed_systems
Chapter 1 -_characterization_of_distributed_systems
 
Distributed systems1
Distributed systems1Distributed systems1
Distributed systems1
 
Lecture 1 (distributed systems)
Lecture 1 (distributed systems)Lecture 1 (distributed systems)
Lecture 1 (distributed systems)
 
1 distributed-systems-template-modified
1 distributed-systems-template-modified1 distributed-systems-template-modified
1 distributed-systems-template-modified
 
Distributed Systems
Distributed SystemsDistributed Systems
Distributed Systems
 
Distributed System
Distributed SystemDistributed System
Distributed System
 
Distributed & parallel system
Distributed & parallel systemDistributed & parallel system
Distributed & parallel system
 
Distributed system notes unit I
Distributed system notes unit IDistributed system notes unit I
Distributed system notes unit I
 
Distributed computing
Distributed computingDistributed computing
Distributed computing
 
Intro (Distributed computing)
Intro (Distributed computing)Intro (Distributed computing)
Intro (Distributed computing)
 
distributed Computing system model
distributed Computing system modeldistributed Computing system model
distributed Computing system model
 
Introduction to Distributed System
Introduction to Distributed SystemIntroduction to Distributed System
Introduction to Distributed System
 
Distributed operating system
Distributed operating systemDistributed operating system
Distributed operating system
 
Unit 1
Unit 1Unit 1
Unit 1
 
7 distributed and real systems
7 distributed and real systems7 distributed and real systems
7 distributed and real systems
 
1. Overview of Distributed Systems
1. Overview of Distributed Systems1. Overview of Distributed Systems
1. Overview of Distributed Systems
 
Aos distibutted system
Aos distibutted systemAos distibutted system
Aos distibutted system
 
Distributed system architecture
Distributed system architectureDistributed system architecture
Distributed system architecture
 
Fundamentals
FundamentalsFundamentals
Fundamentals
 
Distributed operating system(os)
Distributed operating system(os)Distributed operating system(os)
Distributed operating system(os)
 

Semelhante a Massively Distributed Systems: Design Issues and Challenge

Semelhante a Massively Distributed Systems: Design Issues and Challenge (20)

I3CON Newsletter #2
I3CON Newsletter #2I3CON Newsletter #2
I3CON Newsletter #2
 
A survey of peer-to-peer content distribution technologies
A survey of peer-to-peer content distribution technologiesA survey of peer-to-peer content distribution technologies
A survey of peer-to-peer content distribution technologies
 
Pervasive Computing
Pervasive ComputingPervasive Computing
Pervasive Computing
 
DISTRIBUTED SYSTEM.docx
DISTRIBUTED SYSTEM.docxDISTRIBUTED SYSTEM.docx
DISTRIBUTED SYSTEM.docx
 
Tr 85.4
Tr 85.4Tr 85.4
Tr 85.4
 
Grid computing
Grid computingGrid computing
Grid computing
 
Chap 1 libre
Chap 1 libreChap 1 libre
Chap 1 libre
 
Distributed Computing system
Distributed Computing system Distributed Computing system
Distributed Computing system
 
Katasonov icinco08
Katasonov icinco08Katasonov icinco08
Katasonov icinco08
 
7- Grid Computing.Pdf
7- Grid Computing.Pdf7- Grid Computing.Pdf
7- Grid Computing.Pdf
 
1.intro. to distributed system
1.intro. to distributed system1.intro. to distributed system
1.intro. to distributed system
 
Chapter 02
Chapter 02Chapter 02
Chapter 02
 
dagrep_v006_i004_p057_s16152
dagrep_v006_i004_p057_s16152dagrep_v006_i004_p057_s16152
dagrep_v006_i004_p057_s16152
 
Report_Internships
Report_InternshipsReport_Internships
Report_Internships
 
Gcc notes unit 1
Gcc notes unit 1Gcc notes unit 1
Gcc notes unit 1
 
CHAPTER THREE.pptx
CHAPTER THREE.pptxCHAPTER THREE.pptx
CHAPTER THREE.pptx
 
Lectura 2.2 the roleofontologiesinemergnetmiddleware
Lectura 2.2   the roleofontologiesinemergnetmiddlewareLectura 2.2   the roleofontologiesinemergnetmiddleware
Lectura 2.2 the roleofontologiesinemergnetmiddleware
 
40 41
40 4140 41
40 41
 
How Cyber-Physical Systems Are Reshaping the Robotics Landscape
How Cyber-Physical Systems Are Reshaping the Robotics LandscapeHow Cyber-Physical Systems Are Reshaping the Robotics Landscape
How Cyber-Physical Systems Are Reshaping the Robotics Landscape
 
Adidrds
AdidrdsAdidrds
Adidrds
 

Último

Expanded definition: technical and operational
Expanded definition: technical and operationalExpanded definition: technical and operational
Expanded definition: technical and operationalssuser3e220a
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPCeline George
 
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxQ4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxlancelewisportillo
 
Presentation Activity 2. Unit 3 transv.pptx
Presentation Activity 2. Unit 3 transv.pptxPresentation Activity 2. Unit 3 transv.pptx
Presentation Activity 2. Unit 3 transv.pptxRosabel UA
 
Millenials and Fillennials (Ethical Challenge and Responses).pptx
Millenials and Fillennials (Ethical Challenge and Responses).pptxMillenials and Fillennials (Ethical Challenge and Responses).pptx
Millenials and Fillennials (Ethical Challenge and Responses).pptxJanEmmanBrigoli
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management systemChristalin Nelson
 
The Contemporary World: The Globalization of World Politics
The Contemporary World: The Globalization of World PoliticsThe Contemporary World: The Globalization of World Politics
The Contemporary World: The Globalization of World PoliticsRommel Regala
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designMIPLM
 
Measures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped dataMeasures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped dataBabyAnnMotar
 
Oppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and FilmOppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and FilmStan Meyer
 
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...Postal Advocate Inc.
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONHumphrey A Beña
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Mark Reed
 
4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptxmary850239
 
EMBODO Lesson Plan Grade 9 Law of Sines.docx
EMBODO Lesson Plan Grade 9 Law of Sines.docxEMBODO Lesson Plan Grade 9 Law of Sines.docx
EMBODO Lesson Plan Grade 9 Law of Sines.docxElton John Embodo
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfTechSoup
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Celine George
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxAnupkumar Sharma
 
Textual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHSTextual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHSMae Pangan
 

Último (20)

Expanded definition: technical and operational
Expanded definition: technical and operationalExpanded definition: technical and operational
Expanded definition: technical and operational
 
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptxYOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERP
 
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxQ4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
 
Presentation Activity 2. Unit 3 transv.pptx
Presentation Activity 2. Unit 3 transv.pptxPresentation Activity 2. Unit 3 transv.pptx
Presentation Activity 2. Unit 3 transv.pptx
 
Millenials and Fillennials (Ethical Challenge and Responses).pptx
Millenials and Fillennials (Ethical Challenge and Responses).pptxMillenials and Fillennials (Ethical Challenge and Responses).pptx
Millenials and Fillennials (Ethical Challenge and Responses).pptx
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management system
 
The Contemporary World: The Globalization of World Politics
The Contemporary World: The Globalization of World PoliticsThe Contemporary World: The Globalization of World Politics
The Contemporary World: The Globalization of World Politics
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-design
 
Measures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped dataMeasures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped data
 
Oppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and FilmOppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and Film
 
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)
 
4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx
 
EMBODO Lesson Plan Grade 9 Law of Sines.docx
EMBODO Lesson Plan Grade 9 Law of Sines.docxEMBODO Lesson Plan Grade 9 Law of Sines.docx
EMBODO Lesson Plan Grade 9 Law of Sines.docx
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
 
Textual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHSTextual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHS
 

Massively Distributed Systems: Design Issues and Challenge

  • 1. THE ADVANCED COMPUTING SYSTEMS ASSOCIATION The following paper was originally published in the Proceedings of the Embedded Systems Workshop Cambridge, Massachusetts, USA, March 29–31, 1999 Massively Distributed Systems: Design Issues and Challenges Dan Nessett 3Com Corporation © 1999 by The USENIX Association All Rights Reserved Rights to individual papers remain with the author or the author's employer. Permission is granted for noncommercial reproduction of the work for educational or research purposes. This copyright notice must be included in the reproduced paper. USENIX acknowledges all trademarks herein. For more information about the USENIX Association: Phone: 1 510 528 8649 FAX: 1 510 548 5738 Email: office@usenix.org WWW: http://www.usenix.org
  • 2. Massively Distributed Systems: Design Issues and Challenges Dan Nessett Technology Development Center 3Com Corporation Dan_Nessett@3com.com Abstract 1. Introduction This paper explores trends in distributed system archi- The forty year trend in the computing industry is away tecture and implementation that will have far-reaching from centralized, high unit cost, low unit volume prod- consequences in the next 5-10 years. The paper’s central ucts toward distributed, low unit cost, high unit volume thesis is that applications will become massively dis- products. The next step in this process is the emergence tributed during this interval and in so doing generate of massively distributed systems. These systems will significant engineering problems, the solution of which penetrate even more deeply into the fabric of society and will change the nature of distributed computing. Fol- become the information power grids of the 21st century. lowing this trend, networking technology will change They will be ubiquitous. Most will operate outside the in order better to serve the communications require- normal cognizance of the people they serve and most ments of this new class of applications. will be based on embedded systems that present non- One major driver of massively distributed applications traditional computing interfaces to their users. They is the advent of ubiquitous embedded systems. While will be engineered to operate as distributed utilities, embedded systems will not always communicate either much like the energy, water, transportation and media with their peers or with traditional computing equip- broadcast businesses do today. The first deployment of ment, important applications will require the intercon- massively distributed systems is likely to occur as sup- nection of them in large numbers. Several examples are port structures for these industries. given in section 2. Furthermore, these applications will Massively distributed systems will differ from existing require computing architectures that off-load heavy distributed systems in important ways. Such systems computational tasks from embedded systems onto more eventually will interconnect billions of nodes. This will powerful and capable equipment. Consequently, mas- necessitate changes in the way nodes interact with one sively distributed applications will present problems to another. One-to-many communications will be the architects and implementers that transcend those of norm, rather than one-to-one. The size of on-line appli- stand-alone embedded system applications. cation communities will necessitate the use of statisti- The paper examines the motivation for and engineering cally correct rather than deterministic algorithms for problems of massively distributed systems. In the next resource accounting, fault detection and correction, and section, a case is made why massively distributed sys- system management. These communities will coalesce tems will arise. Section 3 presents an analysis of their and dissolve rapidly in order to host events that are of engineering design issues. Finally, section 4 presents a interest to groups formed specifically for this purpose. summary of the paper. This will require new approaches to naming, routing, security and privacy, resource management and synchro- nization. Heterogeneity will be even more of a factor in 2. Motivation the design, implementation and operation of massively Massively distributed systems are the next step in the distributed systems development of computing. From a commercial per- This paper explores the nature and characteristics of spective, this process started with centralized, high unit massively distributed systems by proposing some ex- cost, low unit volume systems in the fifties and sixties. amples and then using them to characterize nine major It then moved to less centralized, but still more or less design issues. Seven of these are drawn from seminal stand-alone systems in the seventies. In the eighties, work in the area of distributed systems. Two others are Network Service Providers (NSPs), such as Compuserv, based on experience in distributed system design and Genie and America On-line, became popular providing implementation subsequent to that work. email and bulletin board services to businesses and some advanced home users. In the first half of this dec-
  • 3. ade, internet access became popular, supporting truly to the day-to-day operations of companies close to the distributed applications, such as the World Wide Web. consumer. This will require the development and de- In the first decade of the 21st century, massively dis- ployment of massively distributed systems that reach tributed systems will appear and become ubiquitous. In into the home, office, public pathways, and private in- direct contrast to the systems of the fifties through the stitutions. early nineties, these systems will be decentralized, low While the problems presented by these massively dis- unit cost and high unit volume. They will be pervasive. tributed applications will display certain differences, Most will operate outside the normal cognizance of the there will be a surprising amount of commonality. people they serve and most will be based on embedded Managing flows, whether they consist of information, systems that present non-traditional computing inter- paper, dishwashing detergent, gasoline, or other com- faces to their users. They will be engineered to operate modities, requires sensors at the points where the prod- as distributed utilities, much like the energy, water, uct is consumed, quickly configurable transportation transportation and media broadcast businesses do today. systems to move those commodities (e.g., trucks, Indeed, the first deployment of massively distributed power switching systems, airplanes, a communications systems is likely to occur as support structures for these fabric), and control algorithms and equipment to match industries. the ingress of raw materials to the egress of the con- Massively distributed systems will not be limited in sumed products. scope to the interconnection of and interaction between In many distributed systems, such as those concerned embedded systems alone. While large numbers of em- with power, water, natural gas, vehicle fuel, and mass bedded systems will participate in massively distributed market goods delivery, embedded processors will meas- systems, at least some will interact with traditional ure resource use and relay consumption information up computing equipment. The latter will provide control, the supply-chain. Control systems will utilize this in- information storage, intensive computation and other formation to schedule raw material inventory replen- services to massively distributed applications. Embedded ishment and organize the management of resource systems will provide specialized point-of-presence serv- flows. Embedded systems close to the consumer (e.g., ices. in appliances, yard sprinkler systems, home heating Several hypothetical examples illustrate the future im- systems) will schedule resource consumption to mini- pact of massively distributed systems. These will arise mize resource demand spikes, thereby reducing the peak as natural extensions to current practice. They will be capacity requirements of delivery systems [1]. used subsequently to illustrate some of the distinguish- An important supply chain that will dramatically influ- ing characteristics of massively distributed systems, ence how massively distributed systems are imple- which are significantly different from those of smaller mented is information flow. The resource management scale distributed systems operating today. systems mentioned in the previous paragraph are geo- graphically hierarchical, since they control physical 2.1. Ubiquitous supply-chain processes characterized by the movement of mass or management energy. Information flows as easily over large as small distances, a characteristic that distinguishes it from Managing large distribution processes in the face of these other supply chains. Consequently, information increasing governmental regulation, resource scarcity supply chains will present significantly different prob- and system complexity will become increasingly impor- lems to designers and implementers of massively dis- tant and open up markets for massively distributed sys- tributed systems. Embedded systems will play a large tems. The classic utility businesses, e.g., the electrical, role in information supply chains, as discussed in the water, natural gas, and telephone companies, will be next section. joined by other flow based service industries, e.g., It will be necessary to use massively distributed sys- transportation conglomerates, oil companies (especially tems to solve this new class of flow control problems, those which own and operate service stations), large since the ratio of control points to production facilities retail companies, including food, clothing, soft and hard will grow to be several orders of magnitude larger than goods businesses, and the information industries, such it is today. This adjustment will encourage new control as newspaper, television, entertainment and advertise- algorithms based not on deterministic concepts, but ment concerns to concentrate on improving their mar- rather on statistical notions of correctness. Profits will gins by the efficient control of their product flow. Sup- be maximized by ensuring a less than 100% accounting ply-chain management will spread from manufacturing of resource usage and payment, since reaching the last
  • 4. few percent will impose significant marginal costs that ware. Embedded processors will run transportable code, will exceed the marginal return. Banks already employ such as Java, in order to support active networking [2]. this philosophy in response to ATM fraud. The money To protect consumers from unwanted intrusions into that could be used to make their equipment more secure their lives, products will be developed to filter the in- earns more interest than the losses they would prevent. formation flowing to an end-system. These filters will Of course, to ensure an equitable and legally defensible range from those that allow parents to control the in- operation (not to mention keeping stockholders happy), formation available to their children (eliminating offen- control algorithms must eliminate systematic fraud sive material such as pornography) to those that elimi- whereby one or a small group of individuals consis- nate the probable increase in junk information broadcast tently benefit from a system's non-deterministic behav- to consumers. Specialized processors implementing ior. pattern recognition algorithms and embedded in rout- ing, switching and end systems will provide powerful 2.2. Agile distributed information and customizable filtering capabilities to achieve these services objectives. Perhaps the most dramatic development of the past 5 To solve the problem of locating communities that years is the growth of information distribution channels might interest an individual, services will form offering based on the Internet and their rapid penetration into electronic real-time catalogues. These will be the suc- areas not normally associated with computing. The cessors of the WWW index and search engines that exist World Wide Web, email, and Internet news groups are currently. Electronic advisory services will inform con- prominent examples. This trend will not only continue sumers which communities offer the best value, using a it will accelerate. The maturation of multi-media broad- value metric tailored for the customer's specific objec- cast technology running over the Internet, assisted by tives. Specialized processors embedded in database and extremely high capacity networking infrastructure will information repository systems will continuously index encourage the formation of virtual information commu- and catalogue information as it arrives. nities. Unlike more traditional groupings, these com- munities will be ephemeral, lasting anywhere from Needless to say, participation in these communities will weeks or months to periods as short as several hours or require payment, ranging from that necessary simply to less. Primitive examples of these communities now reimburse common carriers, when the content is free exist, e.g., everyone reading the same newspaper and all (e.g., broadcasts of school plays, candidate sponsored those watching the same TV program. However, the political events, religious events), to payments for ex- information communities of tomorrow will not be clusive information commodities (e.g., pay-per-view based on rigidly structured broadcast hierarchies; rather sporting events, specialized financial broadcasts). Such they will encourage interactions between people who are payments will require the development and deployment members of a community at a particular moment. Mul- of an electronic monetary system that utilizes a combi- ticast interest groups centered around a wide variety of nation of digital cash, electronic credit systems (i.e., subjects will form, much as internet news groups form electronic analogues of the credit card, bank line of today. Small production companies will present a play, credit, and checks), and point of presence payment ter- host a debate, cover a sporting event or sponsor an elec- minals. Processors embedded in cash cards and other tronic interactive event, such as a networked game that payment tokens will play an important role in such millions play simultaneously. The large media con- systems. glomerates will gradually evolve into distributed infor- mation and entertainment production companies offering 2.3. The new economics a wider variety of programming material than they do Electronic commerce is now a common feature of many now. commercial enterprises. However, most designers and Managing such rapidly changing information communi- implementers of electronic commerce systems have ties will require more capable networks than currently concentrated on the provision of electronic payment. exist. Switching and routing equipment will include Other aspects of electronic commerce have received less embedded systems to monitor resource use and adjust attention, in particular, many components that are well allocations to meet fluctuating demand. These systems matched to implementation on massively distributed will control resources that are traditionally static, such systems. Commerce involves not only payment for as device backplane bandwidth, encryption and compres- goods and services, but also their creation, advertise- sion hardware, and agile error detection/correction hard- ment, delivery, maintenance, and disposal.
  • 5. The downward spiraling cost of communication is uted systems. This work identified seven major design changing profit models in many businesses. Hard prod- issues. Two additional issues are added to the seven pre- uct companies will change their businesses to adapt to viously identified based on experience subsequent to that the new micro-economic environment. The creation of work. The classical design issues are : 1) naming, 2) hard goods will become increasingly automated. Their error control, 3) resource management, 4) protection advertisement will become more interactive, consumers (security and privacy), 5) synchronization, 6) objects will use third party search and evaluation companies (representation, encoding and translation), and 7) meas- that recommend products according to a detailed specifi- urement, testing and debugging. The additional design cation. Since consumers are rarely expert enough to issues are : 8) heterogeneity, and 9) scale. create a useful and detailed description of their require- Each of these issues is used to show how massively ments, such companies will develop interactive systems distributed systems differ significantly from existing with friendly customer interfaces that will lead a cus- distributed systems. For pedagogical reasons, the issues tomer in the development of these requirements. are addressed in the following order : scale, heterogene- Customers will not travel to a remote location to use ity, objects (representation, encoding and translation), these systems. They will be accessed through commu- resource management, protection (security and privacy), nications facilities terminated in the customer's home or naming, error control, synchronization, and measure- place of business. Eventually, requirement specifica- ment, testing and debugging. tions will drive the creation of the product at an auto- mated facility, create a just-in-time delivery plan to 3.1. Scale move the product from the manufacturing facility to the shipping end-point, schedule the resources to conduct The main characteristic of massively distributed systems warranted and purchased maintenance on the product, and is their scale. While current distributed system are com- arrange for the disposal of old equipment the customer posed of hundreds of nodes1, during the first decade of no longer needs (or that was traded in for new equip- the 21st century massively distributed systems will be ment). Companies will optimize costs by coordinating composed of thousands to billions of nodes, supporting this with the warehousing and shipment of the pur- hundreds to millions of users [4]. chased equipment as well as with the disposal of the The scaling issue for massively distributed systems has equipment of other customers. a number of dimensions. In the area of communications Embedded systems will play an important role in the capacity, recent events in the telecommunications indus- new economics. Once a manufacturing facility receives try promise to dramatically change the available world- a product specification, embedded processors in robotic wide bandwidth for networking. For example, Project equipment will tailor its mechanical assets to create that Oxygen of the CTR Group, Ltd. intends to lay fiber product. Embedded processors will sense resource usage across the ocean floors resulting in bandwidth of 1.28 and work with control systems to ensure parts are deliv- terabits/sec on any segment [5]. Current schedules call ered when necessary. Embedded systems will monitor for the completion of an Atlantic ring by Q4 2000 and a and schedule warehouse and delivery capacity, cooperat- Pacific ring by Q2 2001. The pricing schedules pro- ing with control systems to move the product to the posed by CTR Group will result in transoceanic band- customer. Embedded systems in robots will manage the width offered at 1.5% of current satellite circuit prices repair of products returned for maintenance. and 1% of marine cable prices. Power, water, natural gas, vehicle fuel and mass market 3. Design issues supply chains are naturally large scale. Traditional in- formation supply chains (e.g., radio, television, motion Massively distributed systems are not only quantita- picture, the printed news media) are broadcast based and tively different, but also qualitatively different than the characterized by a small producer-to-consumer ratio. distributed systems in place today. Their size and scope However, this is undergoing radical change. The growth introduce new problems in design and implementation. of the world wide web has greatly increased the pro- However, to a certain extent we can use the experience we have gained building moderately scaled distributed systems to guide us in solving these problems. 1 Some may object to this assertion, observing that the current Inter- net is composed of millions of nodes. However, the term distributed This section presents a taxonomy of design issues for system here means a cohesive set of computing resources acting together to achieve a common objective. While the routing infra- massively distributed systems. It is based on seminal structure of the Internet can be considered to be a massively distrib- work [3] that analyzed the current generation of distrib- uted application, no other system of this scale exists currently.
  • 6. ducer-to-consumer ratio in data communications and this per-use approach, which is suited to their ephemeral trend will continue. There may even come a time when nature. Longer-lived communities may be more in- the ratio approaches 1. This one factor has the potential clined to charge a subscription fee for their services. to completely change how information supply chains Some communities with external financial backing are implemented and could be the major driver behind may choose to charge no fee, transferring value to massively distributed systems. their sponsors through advertising, indirect persua- sion, or charitable benefit. Each of these costing Geographically, massively distributed applications sup- models must be supported by the massively distrib- porting information supply chains will be global in uted system infrastructure in a way that allows such extent. A single application running over a massively diverse activities to exist concurrently. distributed system will interconnect users of signifi- cantly different languages, cultures and views. Commu- • To support cost recovery for distributed applications, nication costs will be structured so that they scale loga- several efforts are underway to build and deploy elec- rithmically with the number of destinations, thereby tronic monetary systems. Existing applications are utilizing efficiencies inherent in broadcast and multicast able to use a single funds transfer system, since communications. Individual subscribers will be able many serve a limited customer base. Massively dis- affordably to communicate with one to a hundred other tributed systems, on the other hand, will be global subscribers. One-to-many communications will be the in scope and normally serve customers in many norm, as opposed to one-to-one transfers, which are countries who use a large variety of payment sys- most common today. tems. For some, only traditional methods, such as a credit card or cash payment card will be available; 3.2. Heterogeneity while others may use different electronic systems, While heterogeneity is an important design issue for such as digital cash, digital checks, or digital credit existing distributed systems, its extent in massively cards. Embedded systems will play an important distributed systems will be significantly greater. In par- role in the latter class of payment system. Mas- ticular: sively distributed systems must accommodate the use of all such systems by a single application. • The communications infrastructure will be composed This will necessitate the development of funds trans- of channels of very different capacities. Very low fer transaction clearinghouses that will convert funds bandwidth channels (e.g., 1200 BPS) will still be in from one system to the other. use in developing nations during the next decade, while the developed world will support very high • Distributed systems currently run on equipment man- bandwidth channels in the core services (e.g., 1 aged by different administrations, which desire to Tbps). Communications between embedded systems keep its use under their control. Current practice is and their peers or higher-level facilities will occur to rely on the users of this equipment to properly over channels of significantly lower bandwidth than employ it according to policies promulgated by that available in the core of the network. Building these administrations. However, this approach is applications that run over bandwidth diverse com- changing as organizational IT budgets become larger munication infrastructure will require new ways of and as Board of Directors, stock holders and other in- organizing data flows. terested parties hold management more strictly ac- countable for information processing equipment use. • Similarly, end-systems will possess a wide variety of Organizationally imposed constraints on distributed presentation techniques, from interactive HDTV to applications have a deleterious effect on their opera- personal digital assistants and personal phone tion. A clear example of this is how router based equipment. Applications will combine these end- firewalls not only provide protection against intruder systems into coherent interactive subsystems. In attacks, but also prevent certain distributed applica- fact, it is possible today to create a communications tions, such as those using UDP, Java applets or conferencing system that patches together video- those using the X window system from running teleconferencing equipment, multicast capable work- across firewall boundaries. Massively distributed stations and cellular phones. systems will experience even greater impediments to their operation, since not only will locally imposed • Participatory communities, formed on demand, will constraints interfere with their operation, but other choose from an array of costing options for their constraints will hinder them, such as national re- members. Short-lived communities may elect a pay- strictions on transborder data flows and encrypted
  • 7. data as well as the mandated use of certain standards also on a wide variety of embedded systems sup- for communications protocols. ported by their own proprietary operating systems and hardware (e.g., automobile control systems, • Distributed applications are composed of software PDAs). For example, a massively distributed appli- components2 that run on various platforms and are cation for remote conferencing may have compo- organized into component classes (e.g., file servers, nents that run on workstations, on equipment pro- public key distribution systems, file system clients, vided by a manufacturer for the TV cable industry, various clients on embedded systems), each of which on cellular phones, on PCS based personal commu- performs a different function. Current distributed ap- nication devices, and so forth. This will increase the plications either assume that all components run the number of different implementations of the software same version of software, or are structured according for a single component type and thereby increase the to a simple client/server model that must concur- effort necessary to ensure the application works cor- rently accommodate different versions on a small rectly. number of system (e.g., two or three). Massively distributed applications, because of other heterogene- ity requirements, such as the use of different presen- 3.3. Objects (representation, tation formats, costing and cost recovery schemes, encoding and translation) naming techniques and administration constraints, A variety of efforts are underway to determine the best will be constructed of a significantly larger number programming model for distributed objects (e.g., of component types. Components of these types in CORBA, Java). Lessons from these investigations will general will not run the same version of software. directly affect how objects are handled in massively dis- Designing and implementing these applications will tributed systems. However, there are certain issues aris- therefore require engineering techniques that allow ing from the scale of these systems that will introduce them to operate in the face of software versioning new constraints on how objects are represented, encoded heterogeneity. and translated. • One of the fundamental mistakes made when building • Massively distributed system by their nature will re- distributed systems is ignoring legacy applications quire the creation, movement, modification and de- and infrastructure. This invariably leads to poor cus- struction of massive objects, which conceivably tomer acceptance of new technology. Customers could contain thousands to millions of other ob- cannot simply discard their existing applications jects. The size of these objects will affect how they when new infrastructure becomes available. In many are represented. For example, it will be infeasible to cases these applications and their supporting infra- incorporate a copy of each contained object within structure represent billions of investment dollars and the massive object. Instead, object references will be reimplementing them according to the interfaces used within massive objects to represent its con- presented by new technology is economically infea- stituents. These objects will themselves be con- sible. Massively distributed systems will be no dif- structed using references to their contained objects, ferent in this regard. However, existing distributed leading to a hierarchical object anatomy. However, systems generally must accommodate stand-alone several well-known problems associated with refer- legacy systems and applications. Massively distrib- encing distributed objects will require attention. For uted systems, on the other hand, must accommodate example, implementers of massively distributed sys- both stand-alone as well as current distributed sys- tems will have to solve the lost object problem, tem legacy infrastructure and applications. which occurs when an object is destroyed without destroying all its references in other objects. Rely- • While most existing distributed applications run on a ing on manual techniques to recover from this error number of different computing platforms, they are condition will not be an option, because of its scale. generally limited to a small number of common Similarly, revoking access to an object will be families, e.g., Unix, Windows, and perhaps MVS. much more problematic for massively distributed Massively distributed applications, on the other systems than for existing distributed systems. It hand, will run not only on the legacy platforms, but will be infeasible to locate all references to an object and delete them, even if the underlying object repre- 2 The term “component” is used here in its general sense. Distrib- sentation allows that. While using access lists can uted system components may or may not be constructed as object- oriented software components. mitigate the problem of revocation, their use in massively distributed systems will be problematic,
  • 8. since their size may make this approach unattrac- new problems in object store performance, error tive. This problem is discussed in more detail in control and correctness. Object implementers will section 3.5. have to deal with the internal synchronization of ob- ject constituents, in regards to their creation, • Not only will the representation of massively distrib- movement, modification and destruction. They also uted objects require new techniques, their presenta- will have to deal with these issues when designing tion to users will also require innovation. Some re- and implementing object caches. searchers have examined this problem. A new class of user interface represents objects as virtual spaces. • The algorithms used to manipulate massively distrib- This technique is eminently suitable for presenting uted objects will utilize techniques that differ from massively distributed objects to end users. For ex- those commonly used today. They will be replaced ample, a first level object might be represented as a by those that use multi-cast communications in or- virtual world, its constituent objects as countries, der to avoid object components understanding the then cities, streets, houses, rooms, and so on, the object's global architecture. Voting and distributed exact structure depending on the size of the first agreement algorithms will become common. Pro- level object and its relationship to its constituents grammers will create active objects, i.e., those as well as the relationship of the constituents to which include active on-going computing, that blur each other. Such presentation paradigms will be re- the distinction between data and processing. While quired to make massively distributed objects acces- some existing distributed systems support active ob- sible to the technologically naive user. jects, massively distributed systems will organize large parallel computations around this concept, • Since massively distributed applications will be global forming these computations by creating a first level in scope, their users will belong to multiple cul- object containing a large number of active objects tures and countries. This will increase the necessity residing on geographically disperse and administra- for the internationalization of data. While current tively heterogeneous systems. applications can be configured to use different char- acter sets, depending on choices made when the sys- tem is installed, massively distributed systems will 3.4. Resource management have to internationalize data on-the-fly. This is es- In order to design and build massively distributed appli- pecially true of embedded systems, which com- cations, engineers will have to grapple with new prob- monly will exist in mobile equipment. There will lems in resource management. Many existing distrib- be no point when a system administrator can select uted systems operate according to a local resource con- a particular internationalized presentation set. Users trol model. Local processing manages its own of massively distributed applications will come and resources, interacting with other threads of control go during its lifetime, requiring run-time configura- through message passing or remote procedure tion of this data. Perhaps more importantly, there call/method invocation. In massively distributed sys- will not be a single system administrator who con- tems, objects will be composed of resources located in a trols a massively distributed application. Its control large number of different places. Controlling the re- will be distributed as well. Not only will character sources associated with an object will only be possible set data require internationalization, the application through a distributed global object resource management will have to internationalize other data such as fi- mechanism. This will introduce several new issues in nancial, length/mass/time, and timezone. For exam- distributed system resource control: ple, an application that presents financial data may have to convert between dollars, yen and marks, de- • Routing will become an issue not only at the network pending on where the end-point system is located layer of the distributed system, but also at the appli- and how it has been configured by the customer. cation layer. Composite objects containing active objects will require routing services so the compos- • The storage of massively distributed objects will re- ites can interact with each other, especially if ob- quire new techniques. Current object storage sys- jects are allowed to move. For example, an embed- tems assume objects are located on resources con- ded system such as a mobile PDA will move and trolled by a single administration. The storage of a connect to different portals in a network. If upon single massively distributed object, however, may connecting, objects are moved from the PDA to use the resources of many storage systems, con- execution environments hosted near the portal, in- trolled by different administrations. This will lead to vocation of those object's methods by other objects
  • 9. will require a routing protocol to identify a path be- • Massively distributed systems will contain extremely tween them. The interactions between active objects large volumes of data and enormous processing and passive objects will require routing so active ob- power. Effectively managing these resources will jects can locate and contact passive object methods. challenge implementers. Distributed system archi- Congestion control within composite objects will tects will merge the two prevalent ways to organize be an issue. Programmers and object implementers computations in a distributed system, i.e., move will organize objects to avoid computational hot data to the processing (used by NFS, World Wide spots, similar to those experienced by massively Web, FTP, gopher) and move processing to the data parallel algorithms. Due to their scale, object (used by active networking and Java applets) into a implementers will require adaptive routing and con- single scheme. Such objects will be moved within gestion control within massively distributed objects, the distributed system and carry with them both code manual configuration will not be an option. and data [6]. If the object consists primarily of data, it will closely approximate moving data to the proc- • Resource management in a massively distributed sys- essing. If it consists primarily of code, it will tem will interact strongly with its heterogeneous na- closely approximate moving processing to the data. ture. Applications will compose resources under the RPC and other procedure-oriented paradigms will be control of many administrations into a single dis- emulated by including a single high level instruc- tributed resource. Management of the distributed re- tion in the code section, specifically a procedure source must accommodate the usage policies of the identifier, which will be interpreted at the RPC separate administrations. For example, cost recovery server. for a massively distributed object must accrete and disseminate sufficient funds to cover the costs of the • The deployment of massively distributed systems will composite objects. Since massively distributed ob- encourage the trend towards a new valuation model jects may be highly recursive, cost recovery will re- for information. Specifically, the independent vari- quire the composition of recursively discovered able driving value will be how long the information composite costs. As objects evolve, their cost re- has existed. Massively distributed systems will in- covery anatomy will change, forcing object crease the difficulty in keeping information private. implementers to dynamically configure the object's Consequently, customers will pay for fresh informa- cost recovery algorithm. Since it will be difficult tion, which will rapidly decrease in value once it is and expensive to continually track the exact cost produced. For example, stock market ticker data is structure of an object at any point in time, cost re- now available electronically, making automated trad- covery will use statistical algorithms, assessing ing programs possible. However, this implies that charges and dispersing payments so that the owners most of the value of the ticker data is consumed in a of component resources receive their payments, but very short period of time. Its public release without avoiding too fine an accounting of usage. Periodic cost routinely occurs today 15 minutes after its gen- audits will ensure systematic fraud is minimized. eration. As massively distributed systems become Operators of massively distributed objects will adopt common, more and more stock traders will be forced a fixed cost model, allowing users of the object to to use automatic trading programs, which will re- flexibly utilize its resources without charging for its duce the value of the ticker data even faster. It is components use. likely that eventually ticker data will be available without cost within a few minutes of its production. • Object implementers will devise algorithms enforcing This paradigm will apply to other data, such as li- restrictions on object use that are required by restric- brary searches, financial analyses and market re- tions on their components. Unlike existing distrib- search. uted resource management, these algorithms will have to be adaptive, since implementers will not • Upgrading software in a massively distributed system know all possible component resource usage poli- will pose new problems for software companies. It cies, a priori. Object implementers will be chal- will become more difficult to locate all users of a lenged to understand object behavior, since dynami- particular version of software and users will become cally changing an object's components, which may less cognizant of the version they are using. This introduce new resource usage restrictions, will make situation will arise because objects will contain and this difficult. use other objects implemented by software not di- rectly visible to the user. To meet this new chal- lenge, software companies will move from a pure
  • 10. product business model to a combination of product which a user is capable of manually examining for and service model. Indeed, this is already happening. interest, usefulness, or suitability. Various filtering Many software companies offers a subscription pur- techniques will be developed and used by customers chase agreement for their products that provides the to ensure their systems only present to them infor- customer with one year of upgrades. Engineers will mation in which they are interested. Additionally, design their software to periodically query mainte- once these filtration systems are common, applica- nance centers that will upgrade it automatically, if tions will use feedback controls to gather and em- the software's maintenance contract is still active. ploy end-system supplied filtration data to decrease Again, this is current practice. For example, the vi- the amount of information sent to an end-point that rus detection software that is part of the Norton Sys- is automatically discarded. Massively distributed tem Tools product for Windows 95/98 periodically systems will provide tools to applications that al- communicates with a maintenance site on the Inter- low them to support this type of customization for net and downloads new virus checking features as large numbers of end-point systems. they are released. Customers will be forced to use maintenance contracts to keep their applications running. In the future, customers will force software 3.5. Protection (security and manufacturers to provide open interfaces to their privacy) software and negotiate maintenance agreements with Protection of distributed system assets, including base secondary companies, much like the hardware busi- resources such as processing, storage, communications ness does today. and user-interface I/O as well as higher-level composites of these resources, e.g., processes, files, messages, dis- • Massively distributed systems will force a reexamina- play windows and more complex objects is not a tion of the way communication interfaces work. In strength of existing distributed systems. While engi- particular, they will have to scale to accommodate a neers are currently engaged in developing solutions to large number of correspondents. For example, a the many problems that exist in this area, they are not simple mass-market purchase application that al- addressing protection issues that will arise as massively lows customers to choose between several products distributed systems become prominent. Specifically: may receive millions of requests in a short period of time. Current programming interfaces to communi- • Massively distributed systems for the most part will cation services are totally inadequate to handle such support a large number of end-systems, many of a high rate of transactions. Furthermore, individu- which will be embedded in other equipment and used ally acknowledging each of these transactions would by technologically naive customers. These systems impose a significant processing burden on the sys- will require management, which very probably will tems running the application. This will encourage occur through the use of systems under the control engineers to design communication interfaces and of service providers. Since many end-systems will protocols that are better suited to high rate and vol- either generate or contain data that customers con- ume transactions. sider private, the management technology for mas- sively distributed systems must guarantee, to a rea- • Since massively distributed systems will dramatically sonable extent, that unaggregated data is not com- increase the amount of traffic handled by a commu- promised either by individuals operating the nications fabric, engineers will investigate how to management sub-system or by the service providers efficiently move extremely large volumes of traffic as part of their corporate strategy, e.g., during the over the internet. In addition much of this traffic collection of marketing data. To meet this objective, will belong to one of several different quality of end-systems must be customer anonymous with re- service classes (e.g., best effort delivery data, stream spect to the management sub-system. That is, there data) and so understanding how to accommodate dif- must be no tie between the identifiers used to man- ferent quality of service parameters in the internet is age end-systems and those used for billing, warranty a problem with high priority. Fortunately, this or other services that require the identification of the problem is currently receiving significant attention individuals who own or use these end-systems. by the Internet Engineering Task Force and the IEEE. • Since one-to-many communications will play a major role in massively distributed systems, engineers will • Massively distributed systems will support a volume have to address the problems of deletion, modifica- of information flow to an end-system beyond that tion, insertion, replay, release of secret state, and
  • 11. masquerade for this type of communications. Exist- forcing these legal constraints will be just as hard as ing techniques are designed to protect one-to-one implementing the technology that satisfies them, communications against these threats. New proto- over time the constraints may be relaxed or even cols and processing algorithms will be required to eliminated. However, while they are in force, the provide these services for one-to-many communica- technical problems they raise will be formidable. tions. • Commercial enterprises will play a large role in mas- • The scale of massively distributed systems will intro- sively distributed applications. Consequently, new duce new problems in resource access control for threats will arise as the value of these applications both end-systems and infrastructure support sys- increases. One concern is the security of internet- tems. Currently, most systems use access control work routing. Massively distributed systems will lists or their approximations (e.g., Unix file access introduce new factors into this problem. The infra- permission bits). However, massively distributed structure required for such systems will necessarily systems will support millions of principals, which federate other large networks into a global commu- would lead to access control lists of unmanageable nications fabric. However, routing service providers size, if implemented as they are currently. may wish to limit the traffic that transits their net- Implementers will require new techniques, such as works and subscribers may demand guarantees that access list caching hierarchies, capability systems the quality of service advertised by routing service that solve the revocation and lost object problems, providers is actually delivered. The former problem and access control techniques that combine the ad- has received a great deal of attention, leading to the vantages and characteristics of access control lists formulation of policy routing techniques. The latter with capability access control and eliminate the dis- problem on the other hand, has received less atten- advantages of each. Since massively distributed sys- tion and will require further study. tems will be highly heterogeneous and require the support of legacy distributed systems, engineers will be forced to grapple with the hazards of composing 3.6. Naming systems that use access control lists with those us- Engineers have devoted considerable attention to the ing capability based access control [7]. issue of naming in distributed systems over the last 20 years. Pioneering efforts, such as Grapevine [8], the • When confronted with the problem of information Domain Name System (DNS) [9] and NIS [10], devel- protection, existing distributed systems already must oped the concepts for the next generation of distributed cope with the controls governments have placed on naming systems, such as X.500 and LDAP accessed cryptographic technology. These constraints have directories. To some extent these second-generation sys- hampered engineers in providing appropriate levels tems, especially X.500, are designed for large scale dis- of security to the users of distributed systems. Mas- tributed systems. In particular, X.500 supports the stor- sively distributed systems will generate new prob- age and retrieval of customer defined data structures, a lems in this area. Since communications will occur hierarchically structured and distributed naming context between large numbers of end-systems, security mechanism and decentralized authentication services. service implementations will be hard pressed to as- The Open Group's Distributed Computing Environment certain which systems are constrained by particular (DCE) uses either X.500 or DNS to construct a large national laws concerning encryption and what those scale naming system from its local Cell Directory Sys- limitations might be. As mobile systems become tem [11]. more prevalent and as they are integrated into mas- sively distributed systems, enforcement of these Even though the designers of second generation distrib- constraints, even if they are known, would require uted naming systems have attempted to accommodate tracking the position of a mobile system, identify- massively distributed systems in their design, there are a ing the constraints imposed by its locality, and number of naming issues that transcend these designs communicating these constraints to all systems in- and require attention: teracting with it. For the number of end-systems • The volume of data within a massively distributed that will engage in a massively distributed applica- naming system will be so large, tens to hundreds of tion and for the frequency with which a mobile sys- millions of entries, that customers will not know tem's position might change (e.g., those used from where to start when looking for an item. Hierarchi- or embedded in an automobile, train, boat or air- cal structures have the advantage of logarithmically plane), this will be technically infeasible. Since en-
  • 12. scaling the name space, given a specific choice of around, which will in turn require meta-naming how the data should be organized. They have the services (i.e., to name the naming context). Trans- disadvantage of constraining efficient queries to lating naming contexts will necessitate the discov- those whose unknowns are at the bottom of the hi- ery of the relationships between the different con- erarchy. Since a single massively distributed system texts. Furthermore, object sharing requires the nam- will be used for a wide variety of purposes, multiple ing of objects by different entities. Communicating indexes into its naming data will be required. Inde- names for the purpose of sharing requires moving pendent services will manage these indexes and each the associated naming contexts. will need to update their meta-data as the foundation data changes. This already exists for indexes of web- • Different cultures may require different naming page data. However, as anyone who has used inter- schemes, some hierarchical, some local. These must net search engines realizes, the volume of data re- be coalesced into a usable massively distributed sys- turned for many queries is too large for efficient use. tem naming scheme. The same object may be Higher-level indexing services will use the services named differently depending on the use of the name of lower-level services, leading to a hierarchy of and its characteristics (e.g., native language, avail- naming data accessible by customers. This web will able character set). contain data of varying degrees of freshness, which will require new methods and algorithms to present a consistent and coherent view to customers. 3.7 Error Control To a certain extent the problem of error control in • As stated previously, a large number of the organiza- global networks has not been solved for existing dis- tional structures within a massively distributed sys- tributed systems. For example, multicast links fail dur- tem will be ephemeral, e.g., information communi- ing video teleconferences, ftp sites become unreachable ties which will unite around a particular event and during a transfer due to firewall, router or link failure, then dissolve. However, many who would be candi- and end-system failure results in broadcast storms that dates to participate in such an event may not be cripple local communications. All of these events are aware, a priori, that it exists. The scale of a mas- difficult to analyze and correct automatically. They in- sively distributed system invalidates traditional ad- variably require the use of out-of-band channels to no- vertising techniques, since the volume and rate of tify a responsible technician or system administrator. In the advertising information arriving at an end- a massively distributed system, however, the identifica- system using these approaches would overload it. tion and use of an appropriate out-of-band channel will Consequently, engineers will devise new pro- be problematic. Furthermore, service providers must ducer/consumer models of advertising. For example, provide error control in ways that conform to other re- one possibility is a hierarchically structured reverse quirements, such as privacy protection, high perform- advertising technique, whereby the customer adver- ance and scalability. Such constraints lead to the follow- tises desirable product attributes to first tier brokers, ing problems: which accrete these into reverse advertisements to higher level brokers. From other leaves in the hier- • Fault isolation and correction in massively distributed archy, product producers advertise their products to systems will require new infrastructure services to other first-tier brokers, which likewise accrete them monitor communications quality and deliver excep- into advertisements to higher level brokers. Some- tion alarms to service providers when quality falls where within the hierarchy a rendezvous occurs and below a given threshold. Since the number of dis- direct advertisements are sent or otherwise made tributed applications running in a massively distrib- available to customers. A simple example of this uted system will be too large for manual monitoring strategy for advertising and using computing re- to be cost effective, service providers and distributed sources has already been prototyped [12] system implementers will design, implement and deploy automatic fault detection and isolation • Moving objects requires moving names embedded in mechanisms to ensure service remains available dur- the object. This necessitates either retaining the con- ing various resource outages. This will be especially text in which those names are interpreted or translat- important for distributed embedded systems. The ing them into contexts available at the new object functions of existing network operation centers will site. Massively distributed systems makes either become automated. Network operation centers will strategy difficult. Retaining contexts will compel an focus on higher level fault isolation and control, object to maintain contact with them as it moves
  • 13. such as rerouting communication services when a within communication protocols, but also within catastrophic event takes down large subsystems. objects so their state driven algorithms are fault tol- erant. To achieve this objective, there will be an in- • Engineers of massively distributed systems will intro- creased use of voting protocols and algorithms. duce new ways to process a large volume of infor- mation and number of transactions. As previously • It will be impossible to keep track of all the identifiers described for costing strategies, statistical algo- or other processing state associated with an object. rithms will be used to cope with the decreasing Massively distributed systems must be designed to probability that all necessary fault identification in- work in the face of lost objects, which will be the formation is available in a reasonable amount of rule rather than the exception. Loss of identifiers time. For example, consider the problem of auto- will also increase the incidence of orphaned objects, mating the collection of payments for a video serv- which will increasingly require the use of global ob- ice. Errors in transmission, bugs in software, unre- ject garbage collection. sponsiveness of end-system equipment due to mal- function and operator error will lead to a certain • One-to-many communications on a large scale will amount of imprecision in the accounting and collec- encourage the development of forward error correc- tion processes. An automated billing and collection tion techniques, which will replace feedback error application will record all funds received, match control in many cases. Feedback error control re- them with outstanding balances on accounts and at- quires the retention of a large amount of state and a tempt to resolve the amount received with the over- high level of processing when applied to one-to- all outstanding balance. A discrepancy below a cer- many communications. Idempotent operations will tain threshold will be charged to the cost of doing become the paradigm of choice when one-to-many business. Auditing processes will attempt to ensure communications support remote procedure or object that these discrepancies are not systematically occur- method calls. ring as the result of fraud. Such techniques will also be used for automated opinion polling, physical re- • Error control will move more and more into the appli- source flow monitoring (e.g., power, water), and cation layer as the error characteristics of lower lay- distributed system infrastructure maintenance. ers become heterogeneous (especially for one-to- many communications) and as applications are • Massively distributed systems will require the use of widely distributed. [13]. To manage application error highly available subsystems, including a redundant control, management systems will expand their con- communications fabric provided by multiple service cerns from the network layer and below to all layers providers, fault-tolerant distributed processing for of the protocol hierarchy. This will lead to the stan- control and accounting, and high availability end- dardization of management protocols and interfaces. systems for content providers, which will be the next logical step beyond the high availability file systems currently deployed. Fault-isolation will 3.8 Synchronization have to operate in the face of privacy services, such One of the most difficult problems that engineers of as encryption, that are virtually non-existent today, massively distributed systems will encounter is syn- but that will become prominent in the next decade. chronizing computations consisting of thousands to Service providers will have to respond to subscriber millions of components. Current methods of synchroni- problem reports, even when the ultimate source and zation, such as semaphores, monitors, barriers, remote destination of traffic causing the problem is un- procedure call, object method invocation, and message known. They will gravitate toward a scaled cost passing, do not scale well. Generally, they are suitable structure for various grades of service, from high either for synchronizing closely coupled computations grade indemnified service ("you lose service, we lose (e.g., semaphores, monitors and barriers), or for unicast revenue"), to lower grades of best effort service. distributed applications (e.g., remote procedure calls, object method invocations and message passing). Engi- • Detection of inconsistent state will be difficult in neers have not yet devised efficient synchronization massively distributed systems, since the distributed techniques for group computations, when groups con- state will not be available for centralized processing. tain thousands to millions of active elements. Specific Consequently, algorithms for communications and problems in the area of synchronization include: processing will be developed that are tolerant of in- consistencies. They will use redundancy not only
  • 14. • Caching will become pervasive to accommodate per- • To accommodate the improbability of a component formance, error control and resource management knowing all other components with which it should objectives. Keeping all cached copies synchronized synchronize, multicast communications will be- will require new cache coherency algorithms, since come a major synchronization technique. Thus, a simple write-back or write-through schemes do not component might multicast a message that an- scale properly. This will lead to synchronization nounces the occurrence of an event to a multicast marker algorithms where an authoritative copy of group without knowing exactly which other com- information is broadcast periodically to resync ponents belong to that group. Use of this technique cached state. Resolution of differences will occur as will rely on the designer making a trade-off between a result of this synchronization activity. However, synchronization and resource consumption. there will be many locations where portions of a large database are authoritative and redundancy in the • Synchronization and fault tolerance will interact data will accommodate failures. strongly. Synchronization mechanisms will be re- quired to operate in the face of a certain level of • The scale of massively distributed applications will component and communications failure. However, encourage engineers to use asynchronous comput- synchronization algorithms will be designed to ing, mimicking the way live organisms function. move the distributed application towards a well- This will be a revolution in the way computations defined goal even when failures occur. Thus, mas- are designed, implemented and operated. It will no sively distributed applications will make significant longer be possible to assume or reason that the state use of probabilistic algorithms that complete with a of some distributed application equals a given value probability less than unity in a given length of at any point in time. Instead, engineers will design time, but that ensure this probability always mono- distributed application algorithms so that an invari- tonically increases in time. ant is always true with high probability. • The use of pre-agreement algorithms based on roughly 3.9 Measurement, testing and synchronized clocks will emerge as a major syn- debugging chronization technique for massively distributed ap- The implementation of massively distributed applica- plications [14]. The accuracy and drift of local tions requires measurement, testing and debugging to clocks will become a major engineering problem. ensure applications behave correctly and perform well. Separate synchronization systems (e.g., the NIST These services are also required to correct implementa- time standard broadcast over WWV) will be used to tions when they fail to meet either of these criteria. The decouple clock synchronization from resource deple- scale of massively distributed systems makes the execu- tion problems and other errors of massively distrib- tion of these tasks difficult. In particular: uted applications. The use of geographically distant physical clocks on high-precision real-time applica- • Gathering the necessary measurement data to determine tions will require the transmission of both clock and whether massively distributed applications are per- location in a synchronization schedule to accommo- forming properly will be problematic. Directing this date relativistic effects. data to a single destination will almost certainly in- terfere with the phenomena being measured. Conse- • It will be virtually impossible for thousands of com- quently, measurement will occur through an auxil- ponents to synchronize using communications when iary distributed application, running in tandem with it is not possible to use an external synchronizing the target application. The measurement application signal. If synchronized clocks are not achievable, for will itself be massively distributed, requiring its example, because some components do not have own control infrastructure as well as new monitor- real-time clocks (e.g., simple embedded systems), ing techniques that allow it to maintain contact with barrier synchronization implemented as a hierarchi- the measured application even when parts of it are cal tree will be necessary. Because of the attendant created, destroyed and moved. performance degradation associated with barrier syn- chronization, it will only be employed at major • Testing massively distributed applications will require synchronization points separated by long intervals new techniques to ensure the application scales. of time. Current approaches used by testing engineers, which concentrate on determining whether the application works properly on a single machine will be inade-
  • 15. quate for massively distributed applications While [3] Watson, R. W., "Distributed system architecture there is current work to develop harnesses to test model," in Distributed Systems, Architecture and Im- client/server based applications, the introduction of a plementation, Springer-Verlag, Berlin, NY, 1981. third support process into the test harnessthat sup- [4] Takada, H. and Sakamura, K., "Compact, low-cost, ports client/server interactions (e.g., a database con- but real-time distributed computing for computer aug- taining public key certificates) is rare. Due to the mented environments," Proc. 5th IEEE Comp. Soc. expense of developing a dedicated test harness for Workshop on Future Trends of Dist. Comp. Sys., massively distributed applications, engineers will be Cheju Island, Korea, Aug., 1995, pp. 56-63. forced to test applications during large scale beta de- ployment. This is in fact becoming the norm for [5] Lange, L., "The Internet," IEEE Spectrum, January, much mass market software. 1999, pp. 35-40. (see also www.oxygen.org). [6] Sadok, D. H., Kelner, J., and Silva, R.A., "A dis- • It may not be possible to debug a massively distrib- tributed programming platform using mobile agents," uted application under full load, so engineers will be 3rd Inter. Symp. On Autonomous Decentralized Sys., forced to make trade-offs between debugging and April, 1997, pp. 103-110. real-time fault isolation with correction. Traditional debugging techniques, such as breakpointing, do not [7] Nessett, D.M., "Factors affecting distributed system scale. Therefore, trace-driven analytical techniques security," IEEE Trans. Soft. Eng., vol. SE- 13, pp. will predominate when debugging massively dis- 223-248, 1987. tributed applications. Since test harnesses of the [8] Birrell, A.D., Levin, R., Needham, R.M.., and requisite massive scale will be economically infea- Schroeder, M.D., "Grapevine: an exercise in distributed sible, engineers will be forced to use execution computing," Communications of the ACM, April, traces of live application runs. Such execution traces 1982, 260-274. may produce a very large volume of data, so test en- gineers will have to develop agile filtering, coalesc- [9] Mockapetris, P.V., Dunlap, K.J., "Development of ing and viewing tools. the domain name system," Proc. of SIGCOMM '88, ACM, August 16-19, 1988, pp.123-133. 4. Summary [10] Weiss, P., “Yellow pages protocol specification,” Technical Report, Sun Microsystems, Inc., Mountain The nature of this paper is exploratory and most asser- View, CA., 1985. tions are presented without detailed justification. Tech- nology forecasting is always risky and, in hindsight, [11] Dilley, J., “Practical experiences with the OSF cell rarely completely accurate. However, I hope the analysis directory service,” Networked Systems Architecture, at least stimulated the reader to think about massively Hewlett-Packard, International Workshop OSF DCE, distributed systems, how their characteristics will influ- Karlsruhe, Germany, Oct., 1993 ence embedded system design and implementation, and [12] Cappello, P., Christiansen, B., Neary, M., and how they may change the way we do computing in the Schauser, K., "Market-based massively parallel internet future. Even if there is disagreement about its content, compuing," Proc. 3rd Working Conference on Mas- it has succeeded if it motivates readers to develop an sively Parallel Programming Models, London, England, alternative taxonomy of the design issues and challenges Nov., 1997, IEEE Computer Society, pp. 118-129. associated with massively distributed systems. [13] Saltzer, J., Reed, D., and Clark, D., “End-to-end 5. References arguments in system design,” ACM Transactions on Computer Systems 2(4), Nov., 1984, pp. 277-288. [1] Gustavsson, R., "Agents with power," Communica- tions of the ACM, Vol 42, No. 3, March, 1999, pp. [14] Kopetz, H., "The time-triggered architecture," Proc. 41-47. 1st IEEE International Symposium on Object-oriented Real-time Distributed Computing, Kyoto, Japan, April, [2] Tennenhouse, D.L., Smith, J.M., Sincoskie, W.D., 1998, pp. 22-29. Wetherall, D.J., and Minden. G.L, "A survey of active network research," IEEE Communications Magazine, pages 80-86, January 1997.