Coordination Strategies: Degree of Decentralization
Here's a look at degrees of decentralization of agents that serve multiple,
geographically-dispersed users. Note, in the discussion that follows, that I generally
assume that two completely separate instances of the same application (such as Lotus Notes
run at two different companies) would count as a "single central server" (or whichever
category applies) and not as
"multiple, non-mirrored servers that don't know about each other", because the servers
involved are in totally separate and hermetic information domains. The one exception I
make is Julia, because
she could conceivably run multiple instantiations (one per MUD) all on a
single workstation. In this case, even though all the instantiations are essentially in
separate information spaces, it may still be useful to know that there are
multiple instantiations of the same software running around.
- A single central server
This is generally the easiest and most obvious organization for any sort of agent
that serves multiple users.
- Advantages:
- Easy to coordinate; no work has to be done by the agent itself to do so.
- Easy for users to know where to contact.
- Lends itself to crossbar algorithms and similar cases in which the
entire knowledge base must be examined for each query or action.
- If the server is used by very widely spread users, timezones may
spread out some of the load.
- Disadvantages:
Doesn't scale: generally, the workload grows as the square of the
number of users, since each of n users may issue queries that must
examine a knowledge base that itself grows with n (see the sketch
following this section's examples).
- Not fault tolerant: the server is a single point of failure for both
performance and security (it is a single obvious point to
compromise).
- The majority of users will find themselves a good part of an Internet
diameter away from the server; this can be serious if low latency is
required of the server.
- Examples:
SABRE (the airline reservation system).
- Most typical data-warehousing operations.
- HOMR and
maybe Firefly.
- Webhound.
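To make the quadratic-workload claim concrete, here is a minimal sketch (in Python,
with hypothetical user profiles; nothing here is taken from any of the systems named
above) of a crossbar-style matchmaker of the sort a single central server would run.
With n users it performs n(n-1)/2 pairwise comparisons, so doubling the user base
roughly quadruples the work:

```python
from itertools import combinations

# Hypothetical profiles: user -> set of interest keywords.
profiles = {
    "alice": {"agents", "privacy", "music"},
    "bob":   {"music", "hiking"},
    "carol": {"privacy", "agents"},
}

def crossbar_matches(profiles, threshold=1):
    """Compare every pair of users: n(n-1)/2 comparisons, which is
    why a central server's workload grows roughly as the square of
    its user base."""
    matches = []
    for (u1, p1), (u2, p2) in combinations(profiles.items(), 2):
        overlap = p1 & p2
        if len(overlap) >= threshold:
            matches.append((u1, u2, overlap))
    return matches

print(crossbar_matches(profiles))
```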
- Multiple mirrored servers
This describes a class of server where the basic algorithm is run in parallel on a
number of machines (typically, very-loosely-coupled parallelism, e.g., separate
workstations, rather than a single MIMD or SIMD parallel architecture). Such
architectures in general can be divided into:
- Tightly-consistent architectures, in
which all servers have exactly the same, or virtually the same, database and
simply handle multiple requests or take actions for users in parallel,
possibly checkpointing with each other as each action is taken, and
- Loosely-consistent architectures, in
which the servers have mostly the same information or at least information
in the same domain, but they do not try to enforce a particularly strong and
consistent worldview among themselves.
The choice of tight or loose consistency is generally a function of the operations
being supported by the servers; a sketch contrasting the two styles follows the
examples below.
- Advantages:
- These architectures are handy when it is relatively simple to
maintain database consistency between servers (for example, if user
requests or actions taken on their behalf do not side-effect the
database, then its consistency is easier to maintain).
- Load-balancing is fairly simple, and extra hosts can be added
incrementally to accommodate increases in load.
- The servers may be geographically distributed to improve either
network load-balancing, timezone load-balancing, or fault-tolerance.
- Disadvantages:
- If the algorithm requires tight consistency, the requisite
interserver communications costs can eventually come to dominate the
computation.
- Even loosely-consistent servers will probably still suffer from
roughly quadratic growth in load with the number of users. This
implies that, to keep up with even linear user growth, a
quadratically-increasing number of servers must be put online;
keeping up with typical exponential growth, of course, is much harder.
- Examples:
- Current Firefly is
presumably some sort of mirrored, probably loosely-consistent server
architecture.
- Lycos is certainly a
tightly-consistent multiple-server architecture, where its tight
consistency is achieved by distributing the same index database to
all servers on a regular basis.
- Many FTP archives are mirrored in some sort of loosely- or
tightly-coupled server architecture, where the servers are themselves
geographically distributed to spread out the concomitant network
load.
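The two consistency styles above can be contrasted with a small, hypothetical sketch
(Python; the class and function names are invented for illustration and are not taken
from Lycos or Firefly). A tightly-consistent farm pushes one identical snapshot to
every mirror, much as Lycos's index distribution is described above; a
loosely-consistent farm lets an update reach mirrors asynchronously, so different
mirrors may briefly give different answers:

```python
import copy

class Mirror:
    """One server in a mirrored farm; holds its own copy of the index."""
    def __init__(self):
        self.index = {}

    def query(self, key):
        return self.index.get(key)

def tight_sync(master_index, mirrors):
    """Tightly-consistent style: distribute one identical snapshot of
    the whole index to every mirror."""
    for m in mirrors:
        m.index = copy.deepcopy(master_index)

def loose_update(mirrors, reached, key, value):
    """Loosely-consistent style: an update reaches only some mirrors
    now and is left to propagate to the rest later, so mirrors may
    briefly disagree."""
    for i in reached:
        mirrors[i].index[key] = value

mirrors = [Mirror() for _ in range(3)]
tight_sync({"agents": "doc-17"}, mirrors)           # all mirrors identical
loose_update(mirrors, reached=[0], key="privacy", value="doc-42")
print([m.query("privacy") for m in mirrors])        # ['doc-42', None, None]
```

The trade-off described above is visible here: the tight style pays the full
distribution cost on every sync, while the loose style pays nothing up front but
gives up a single consistent worldview.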
- Multiple, non-mirrored servers
These types of agent architectures can fairly trivially be divided into those whose
servers know about each other and those whose servers do not. The Web itself is an
example of this architecture: its servers do not need to know about one another and,
in general, do not mirror one another. Few agent architectures seem to be designed
this way, however, except in the limit of the same agent simply being run in multiple
instantiations in different information domains.
- Advantages:
- Consistency is easy to achieve.
- Load sharing may be implemented as in the mirroring case above.
- Disadvantages:
- Similar to mirrored servers, though the disadvantage of maintaining
consistency is eliminated.
- Load growth may still be a problem.
It may be difficult to find all servers if the algorithm demands it,
since the lack of mirroring means servers may tend to fall out of
touch with each other (see the registry sketch below).
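That last disadvantage can be illustrated with a minimal, hypothetical registry
sketch (Python; the Registry class and the server names are invented). Because
non-mirrored servers share no data, an algorithm that demands all servers can find
them only if each one actively announces itself somewhere well-known; a server that
never announces, or stops announcing, simply falls out of touch:

```python
class Registry:
    """Hypothetical well-known registry; a non-mirrored server stays
    findable only while it actively announces itself here."""
    def __init__(self):
        self.servers = {}

    def announce(self, name, address):
        self.servers[name] = address

    def withdraw(self, name):
        self.servers.pop(name, None)

    def all_servers(self):
        return dict(self.servers)

registry = Registry()
registry.announce("serverA.example", "10.0.0.5")
registry.announce("serverB.example", "10.1.0.9")
registry.withdraw("serverB.example")    # B falls out of touch

# An algorithm that "demands all servers" sees only what announced:
for name, addr in registry.all_servers().items():
    print(name, addr)                   # only serverA.example remains
```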
- Totally distributed peers (no distinction between server and client)
As with the multiple, non-mirrored servers above, totally-distributed peers
can be divided into those that know about (some of) their fellow peers and
those that do not; a sketch of such peer clustering follows the examples below.
This approach resembles an ALife system more than the approaches above, and
deviates most radically from typical client/server or centralized-system approaches.
- Advantages:
Can probably be scaled up to accommodate loading easily, because
servers are also peers and probably associate close to 1-to-1 with
the user base.
- No central point of either failure or compromise.
- Disadvantages:
- Coordination between the peers becomes much more difficult.
- Algorithms that require global consistency are probably impossible to
achieve with acceptable performance.
- It may be difficult to keep all agents at similar software revision
levels.
- Examples:
- Yenta
is a classic example of totally-distributed peers that, for the
most part, don't know about each other (except for those in a
given peer's current clusters).
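The clustering behavior mentioned in the Yenta example can be sketched, very loosely,
as a gossip-and-referral loop (Python; this is an invented illustration of the general
idea of decentralized clustering, not Yenta's actual protocol). Each peer knows only a
handful of others, compares interests with them, clusters with good matches, and
learns of new peers only by referral:

```python
import random

class Peer:
    """Hypothetical peer in a Yenta-like system: knows only a few
    other peers, never the whole population."""
    def __init__(self, name, interests):
        self.name = name
        self.interests = set(interests)
        self.known = []       # the handful of peers this one knows about
        self.cluster = set()  # peers found to share interests

    def similarity(self, other):
        return len(self.interests & other.interests)

    def gossip_round(self):
        """Compare interests with known peers; cluster with good
        matches and ask each for a referral to one of its contacts."""
        for other in list(self.known):
            if self.similarity(other) >= 2:
                self.cluster.add(other.name)
                other.cluster.add(self.name)
            if other.known:
                candidate = random.choice(other.known)
                if candidate is not self and candidate not in self.known:
                    self.known.append(candidate)

random.seed(0)  # for a reproducible run of this sketch
peers = [
    Peer("p1", ["agents", "privacy", "music"]),
    Peer("p2", ["privacy", "agents"]),
    Peer("p3", ["hiking", "music"]),
    Peer("p4", ["agents", "privacy"]),
]
# Bootstrap: each peer initially knows only its neighbor in a ring.
for i, p in enumerate(peers):
    p.known.append(peers[(i + 1) % len(peers)])

for _ in range(3):
    for p in peers:
        p.gossip_round()
print({p.name: sorted(p.cluster) for p in peers})
```

Note how no peer ever holds, or needs, a global view: clusters emerge from purely
local comparisons, which is why global consistency (as noted in the disadvantages
above) is so hard to achieve in this architecture.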