CHAPTER 6	Related Work


6.1 Introduction

We have presented a general architecture and a sample application, and some evalua-
tion of them, designed to promote a particular sociopolitical agenda and to demon-
strate that starting with such an agenda can lead to technological advances. Let us 
now turn our attention to related work in some relevant fields.


This research touches on a large number of possible topics. We shall restrict ourselves 
here to examining:

o	Section 6.2: Other types of matchmaking systems

o	Section 6.3: Other decentralized systems

o	Section 6.4: Other systems and software that have been designed for political purposes


6.2 Matchmakers

In general, systems that perform any sort of matchmaking task are centralized sys-
tems. Such an organization has several useful advantages, especially to the implemen-
tors of such systems, if they do not also have to deal with personal information:

Why centralization is a popular approach

o	They are easier to administer -- all, or almost all, of the relevant software can run 
on hosts directly under the administrative control of the implementors


o	If they are being used for a business, it is often obvious how to structure the system 
such that users may be charged fees, or have advertising delivered to them as they 
use the system


o	If the business model of the matchmaker also requires that personal information be 
reused for other purposes -- such as marketing -- then centralizing all data on the 
company's own servers makes this easy.


Collaborative filtering

Webhound/Webdoggie [103] and HOMR/Ringo/Firefly [112] are typical examples of 
centralized matchmakers. A central server maintains information 
about user interests, and users connect to the server (in both cases, with web brows-
ers) to discover whether they have a match. Both systems require the user to be proac-
tive in establishing and maintaining an interest profile, although Webhound/
Webdoggie also obtained leverage by using a data source the user already kept 
updated, namely his or her hotlist.

Brokering services

Kuokka and Harada [99][100] describe a system that matches advertisements and 
requests from users and hence serves as a brokering service. Also a centralized server, 
their system assumes a highly-structured representation of user interests.

Sixdegrees

Sixdegrees [110] is an interesting idea in matchmaking, generally for professional 
reasons; this site keeps track of whom you already know and uses this information to 
find the shortest chains of acquaintances to others whom you would like to know. It does 
this by asking for email addresses corresponding to others that you know, and also for 
their relationships to you (as well as other information, such as their profession), and then 
contacts those people to see if they agree. If they do not repudiate the relationship, the 
system records the correspondence. Users are always identified; unlike most other 
systems, there are no pseudonyms. Users can then ask queries such as, 'Who do I 
know who knows a lawyer?' 


The system is somewhat cumbersome because of the need to involve everyone explic-
itly (anyone you name must take the effort to become a member themselves), but its 
narrow targeting of social relationships makes it likely to find interesting contacts. It 
is, of course, another centralized system, although it makes some effort both to 
reassure its users that their information will remain private -- though, of course, it 
makes no assurances about either crackers or subpoenas -- and to ensure that the system 
cannot easily be gamed to expose large numbers of relationships. For example, you can 
only find out about the relationships of other people to people you already know, out to a 
very limited diameter, and can only spam those you already know, which is presumably 
not very productive.
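
The kind of query described above ('Who do I know who knows a lawyer?'), together with the limited search diameter, can be sketched as a bounded breadth-first search over confirmed relationships. This is only an illustration, not Sixdegrees' actual implementation; the names find_chain, contacts, professions, and max_hops are all hypothetical.

```python
from collections import deque

def find_chain(contacts, professions, start, wanted, max_hops=2):
    """Return the shortest chain of acquaintances from `start` to someone
    whose profession is `wanted`, or None if nobody is within `max_hops`.

    `contacts` maps each user to the set of users with whom a relationship
    has been confirmed; `professions` maps users to a profession.
    Both structures are assumptions made for this sketch.
    """
    queue = deque([[start]])
    seen = {start}
    while queue:
        chain = queue.popleft()
        person = chain[-1]
        if person != start and professions.get(person) == wanted:
            return chain
        if len(chain) - 1 >= max_hops:
            continue  # do not look beyond the permitted diameter
        for friend in sorted(contacts.get(person, ())):
            if friend not in seen:
                seen.add(friend)
                queue.append(chain + [friend])
    return None

contacts = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave"},
    "carol": {"alice"},
    "dave": {"bob", "erin"},
    "erin": {"dave"},
}
professions = {"dave": "lawyer", "erin": "judge"}

print(find_chain(contacts, professions, "alice", "lawyer"))
```

Because the search stops at max_hops, a user learns nothing about relationships beyond that radius, which is the property that makes bulk harvesting of the relationship graph difficult.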

PlanetAll

PlanetAll [150] takes a somewhat different approach. It concentrates on finding peo-
ple you once knew, rather than on finding new people you might like to know. Like 
Sixdegrees, it is a centralized, web-based service, and everyone using the service is 
identified by their real name. Unlike Sixdegrees, the primary organizing principle 
behind PlanetAll is affinity groups. Such groups are prespecified, named entities cor-
responding to organizations in the real world -- not online -- of which the user was at 
one time a member. They are typically schools, clubs, or religious organizations, and 
PlanetAll allows one to search for them by keyword. When registering, the user spec-
ifies affinity groups, and is then notified when others join the group. He or she can 
send messages into the group or to particular individuals.


Spamming is prohibited by the rules of service, and, since individuals are always 
strongly identified, tracking them down and barring them is easy. On the other hand, it 
is not clear what would happen if someone who was never part of some affinity group 
in real life were to join one anyway -- such a party crasher would probably simply be 
tolerated, at least if he or she was not obnoxious, because everyone else in the group 
might assume that someone knew them.


One can also tell PlanetAll about particular individuals in the system and ask it to 
send mail when that individual's information (such as work address) changes. It is 
presumed that individuals already know each other when they receive notification of 
one joining the group -- thus, PlanetAll concentrates on finding people after one has 
lost track of them, rather than on describing unknowns to each other. PlanetAll also 
has a number of other interesting features. For example, it allows users to enter their 
travel itineraries, and will notify them when their paths cross in foreign cities.


As with Sixdegrees, PlanetAll users must trust that the central site will protect their 
personal information. Since such information could be valuable to a number of com-
mercial interests, and also to those contemplating identity theft, this could be a major 
exposure.

Romantic matchmakers

Although romantic matchmaking is not an explicit goal of Yenta, there are a large 
number of matchmaking systems specialized for this application, and they are worth 
studying. Such systems appear to be invariably centralized. For example, Match.Com 
[41] is a straightforward romantic-matchmaking service. Users fill out a form detail-
ing their own characteristics and those of people they would like to meet (sex, age, 
geographic location, etc.), which are used in a simple match/filter algorithm; they also 
post personal ads to supply more detail once a user's filter has selected some ads. 
The Jewish Matchmaker [43] (unfortunately also called Yenta, for obvious 
reasons) is one of several more-specialized systems that function in much the same way: surveys 
for filtering, personals for secondary selection, and a centralized server, all backed up 
by a web-based interface.
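
The 'simple match/filter algorithm' mentioned above is not specified in detail. One plausible reading, sketched here purely as an assumption, is a mutual filter: two users match when each satisfies the other's stated preferences. The field names and the functions satisfies and mutual_matches are hypothetical.

```python
def satisfies(person, wants):
    """True if `person` meets every constraint in `wants`.

    A constraint is either a (low, high) range or a set of allowed values.
    """
    for field, allowed in wants.items():
        value = person.get(field)
        if isinstance(allowed, tuple):
            low, high = allowed
            if value is None or not (low <= value <= high):
                return False
        elif value not in allowed:
            return False
    return True

def mutual_matches(user, others):
    """Return names of people who fit the user's filter and vice versa."""
    return [
        other["name"]
        for other in others
        if satisfies(other["self"], user["wants"])
        and satisfies(user["self"], other["wants"])
    ]

pat = {
    "name": "pat",
    "self": {"age": 30, "city": "Boston"},
    "wants": {"age": (25, 35), "city": {"Boston", "Cambridge"}},
}
others = [
    {"name": "lee", "self": {"age": 28, "city": "Cambridge"}, "wants": {"age": (28, 40)}},
    {"name": "sam", "self": {"age": 45, "city": "Boston"}, "wants": {}},
    {"name": "kim", "self": {"age": 27, "city": "Boston"}, "wants": {"city": {"New York"}}},
]

print(mutual_matches(pat, others))
```

Note that every profile, and every preference, must be visible to the code doing the matching; in a centralized service that means the server holds all of it.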

A rare decentralized example

Kautz, Milewski, and Selman [91] are one group, of very few, to have taken a more 
distributed approach to matchmaking. They report work on a prototype system for 
expertise location in a large company. Their prototype assumes that users can identify 
who else might be a suitable contact, and uses agents to automate the referral-chaining 
process. They include simulated results showing how the length and accuracy of the 
resulting referral chains are affected by the number of simulated users and the accu-
racy and helpfulness of their recommendations. Yenta differs from this approach in 
using ubiquitous user data to infer interests, rather than explicitly asking about exper-
tise. In addition, Yenta assumes that the individuals involved probably don't already 
know each other, and may have interests that they wish to keep private from at least 
some subset of other users.

6.3 Decentralized systems

There are a variety of other decentralized systems that bear consideration here. For 
the most part, these systems may be divided by their underlying metaphors: biologi-
cal, market-based, or other. We shall discuss all three below.


Both biological and market-based systems are often used in the allocation of scarce 
resources, although with a difference in emphasis. For example, biological systems 
often model individual actors or agents through their births, lives, and deaths. It is 
commonly assumed that the characteristics of agents change relatively slowly over 
their lifetimes, but that an entire population may change through evolution. Individual 
agents generally have very limited models of the world and sometimes vanishingly 
small reasoning abilities. Market-based systems, on the other hand, tend to assume 
agents which exist for indefinite spans of time, but can change their behavior rela-
tively quickly due to learning within an agent. In addition, information flows -- as 
opposed to flows of matter -- are often considered to dominate the interaction, and 
explicit negotiation between agents with high levels of reasoning is common.

Biological metaphors

The artificial life approach is explicitly informed by a biological metaphor [94]. This 
discipline tends to model systems as small collections of local state that have gener-
ally been mapped into a simulation of some physical space. Within this space, these 
bundles of state may interact solely through local interactions -- there is no action at a 
distance. Systems modeled often tend also to simulate real biological systems, albeit 
simplified versions -- ant and termite colonies [142], predator/prey systems and vari-
ous simulations of Darwinian or Lamarckian evolution [19][102], learning [57][62], 
immune systems [95], and many more. Some simulate decidedly nonbiological 
systems using biological metaphors -- many problems in optimization, such as 
producing optimal sorting networks [79], are often effectively solved using genetic 
algorithms [96].


The choice of self-contained bundles of state, and strictly local communication, stems 
naturally from systems which either simulate or are inspired by the natural world, 
where nonlocal effects tend to be rare. Most such systems run on uniprocessors, but 
there are exceptions. For example, many learning [62] or simulated-evolution [165] 
systems have been implemented on SIMD or MIMD architectures such as the CM-2 
or CM-5 Connection Machines from Thinking Machines. Others have been distrib-
uted to collections of uniprocessors connected via the Internet. One example is NetTi-
erra [139], a network-based implementation of the original Tierra [138], a system 
originally written to explore the evolution of RNA-based life via an easy-to-mutate 
machine language.

Market-based metaphors

Market-based approaches tend to use negotiation, barter, and intermediate 
representations of value -- such as money -- to enable a collection of actors to decide on 
individual strategies [25][111]. One example of such a system is Harvest [74], which uses 
a decentralized collection of gatherer, broker, collector, and cache elements to greatly 
improve the performance of, e.g., web servers. Elements use market-based ideas to 
decide how to allocate various resources such as storage or bandwidth.


Consider also a system in which we have a heap, such as that found in a Lisp system, 
where objects point at each other. Reclaiming unused space in a heap is called gar-
bage collection, and doing so if the heap spans multiple machines can be quite slow 
due to communications overhead. A market-based approach, in which storage 
essentially pays rent and storage that runs out of money is deallocated [40], can 
make this problem much more tractable by keeping almost all the computation 
required local to individual machines.
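
To make the rent idea concrete, here is a toy simulation. It is not the scheme of [40]; the funding rule (roots receive outside income, and each object passes whatever exceeds a small reserve on to the objects it points at) and all names and parameter values are assumptions chosen for illustration. The point is only that each step touches an object's own balance and its direct referents, so no global trace of the heap is needed.

```python
def example_heap():
    """A tiny heap: root -> a -> b is live; x <-> y is an unreachable cycle."""
    return {
        "root": {"refs": ["a"], "balance": 2.0},
        "a": {"refs": ["b"], "balance": 2.0},
        "b": {"refs": [], "balance": 2.0},
        "x": {"refs": ["y"], "balance": 2.0},
        "y": {"refs": ["x"], "balance": 2.0},
    }

def run_round(heap, roots, rent=1.0, income=10.0, reserve=2.0):
    """Run one round of rent collection and return the names deallocated."""
    # Roots are funded from outside the heap.
    for name in roots:
        if name in heap:
            heap[name]["balance"] += income

    # Each object passes any money above its reserve to its referents.
    transfers = {}
    for name, obj in heap.items():
        targets = [t for t in obj["refs"] if t in heap]
        surplus = obj["balance"] - reserve
        if targets and surplus > 0:
            obj["balance"] -= surplus
            share = surplus / len(targets)
            for t in targets:
                transfers[t] = transfers.get(t, 0.0) + share
    for t, amount in transfers.items():
        heap[t]["balance"] += amount

    # Everyone pays rent; objects that cannot pay are deallocated.
    evicted = []
    for name in list(heap):
        heap[name]["balance"] -= rent
        if heap[name]["balance"] < 0:
            evicted.append(name)
            del heap[name]
    return evicted

heap = example_heap()
history = [run_round(heap, ["root"]) for _ in range(5)]
print(history, sorted(heap))
```

In this sketch the unreachable cycle is reclaimed because money is transferred rather than created: the two objects can pass funds to each other, but rent drains the pair until neither can pay.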

Other approaches

Not all decentralized systems necessarily require either competition or cooperation 
between agents -- some simply use decentralization to achieve pure parallelism, 
turning a network of uniprocessor CPUs into an emulation of a MIMD multiprocessor. 
One common current example is cryptographic key cracking [32], in which 
thousands of CPUs participate in searching the keyspace of a particular encrypted 
communication. This application is typically political in nature -- in general, 
participants take part in order to help demonstrate that ciphers such as 56-bit DES are 
woefully insecure [15][24][34][42][76][184].
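
The keyspace search described above parallelizes trivially because the chunks are independent. The sketch below shows the partitioning using a deliberately weak 16-bit toy XOR cipher as a stand-in for DES; the cipher, the function names, and the worker count are all assumptions for illustration, and the 'workers' are simulated sequentially rather than run on separate machines.

```python
KEY_BITS = 16  # toy key size; real efforts targeted 56-bit DES

def toy_encrypt(key, plaintext):
    """XOR each byte with the bytes of a 16-bit key (NOT a real cipher)."""
    key_bytes = key.to_bytes(2, "big")
    return bytes(b ^ key_bytes[i % 2] for i, b in enumerate(plaintext))

def split_keyspace(bits, workers):
    """Divide [0, 2**bits) into contiguous chunks, at most one per worker."""
    total = 1 << bits
    step = -(-total // workers)  # ceiling division
    return [range(lo, min(lo + step, total)) for lo in range(0, total, step)]

def search_chunk(chunk, ciphertext, known_plaintext):
    """What a single participant does: try every key in its own chunk."""
    for key in chunk:
        if toy_encrypt(key, known_plaintext) == ciphertext:
            return key
    return None

def crack(ciphertext, known_plaintext, workers=8):
    """Hand out chunks and collect the first successful result."""
    for chunk in split_keyspace(KEY_BITS, workers):
        key = search_chunk(chunk, ciphertext, known_plaintext)
        if key is not None:
            return key
    return None

secret = 0xBEEF
message = b"attack at dawn"
print(hex(crack(toy_encrypt(secret, message), message)))
```

No participant needs to communicate with any other while searching; the only coordination is handing out chunks and reporting a hit.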

6.4 Political software and systems

Let us now examine various software systems that have been designed with a particu-
lar eye towards their political environment. We will concentrate here only on systems 
which attempt to advance what we believe to be the socially responsible position in 
our political argument -- and not, for example, systems such as the centralized Intelli-
gent Transportation Systems described in Section 1.4.

Pretty Good Privacy

By far the most famous example of such software is Pretty Good Privacy, or PGP 
[187]. PGP is one of the most widely-used strong-cryptography packages in the 
world. Recent versions have even been deliberately exported from the United States, 
even though doing so electronically is illegal. Instead, the First Amendment to the US 
Constitution was exploited as a loophole -- it has already been determined that printed 
books are not subject to regulation under US export-control law. Thus, source code 
was printed into a ten-volume book, which is legal to export, in a format that was 
explicitly designed to be easy to scan and convert back into electronic form overseas. 
(Since then, other important cryptographic efforts have been exported in the same 
way -- for example, all of the VHDL and loader code describing how to build a hard-
ware DES-cracking machine was printed in machine-scannable form expressly to 
allow this [42].)


PGP's development was motivated by explicitly political aims -- its author, Philip 
Zimmermann, wrote it to make strong cryptography easily available to the masses, or 
at least to those masses who owned personal computers. And since then, it has 
become a lightning rod for discussion concerning US cryptographic-export policy.


PGP itself does not depend on any sort of network infrastructure -- it encrypts and 
decrypts files only. However, it is most useful when combined with a network, rather 
than when being used to mail encrypted floppies back and forth. Various popular 
mail-handling programs, such as Eudora for Macs and PCs, and Mailcrypt for GNU 
Emacs, have incorporated it into their design.

Anonymous remailers

Other political software has made the network a more explicit part of its design. 
Consider anonymous remailers [10][23][66], which are designed to hide the origin 
and destination of messages being sent from one computer to another. They work by 
encrypting messages in transit, and routing them through a large number of computers 
in various political domains. The assumption is that no single entity could success-
fully compromise every computer and every network link in the chain, and that this 
lack of total surveillance will allow truly-anonymous information exchange.
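
The layering that makes this work can be sketched as follows: the sender wraps the message once per remailer, so that each hop can remove only its own layer and learns only the next hop. This is a toy illustration under stated assumptions, not the format used by any real remailer; the cipher here is a throwaway SHA-256 keystream with shared keys (real remailers use public-key cryptography), and the names wrap, relay, and deliver are hypothetical.

```python
import hashlib
import json

def keystream_xor(key, data):
    """Toy symmetric cipher: XOR with a SHA-256-derived keystream.
    For illustration only; do not use for real secrecy."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def wrap(message, recipient, route, keys):
    """Build nested layers, innermost first, one per remailer in `route`."""
    packet = message.encode()
    next_hop = recipient
    for hop in reversed(route):
        layer = json.dumps({"next": next_hop, "body": packet.hex()}).encode()
        packet = keystream_xor(keys[hop], layer)
        next_hop = hop
    return route[0], packet

def relay(hop, packet, keys):
    """What one remailer does: strip its own layer, learn only the next hop."""
    layer = json.loads(keystream_xor(keys[hop], packet))
    return layer["next"], bytes.fromhex(layer["body"])

def deliver(message, recipient, route, keys):
    """Simulate the whole chain and record which remailers handled it."""
    hop, packet = wrap(message, recipient, route, keys)
    trace = []
    while hop in keys:
        trace.append(hop)
        hop, packet = relay(hop, packet, keys)
    return hop, packet.decode(), trace

keys = {"r1": b"key-one", "r2": b"key-two", "r3": b"key-three"}
print(deliver("hello", "bob@example.org", ["r1", "r2", "r3"], keys))
```

In this sketch only the first remailer sees the sender and only the last sees the recipient, which is why an adversary would need to compromise every hop to link the two.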


The contents of such messages are varied. Many concern topics which are potentially 
embarrassing or dangerous to those discussing them, such as unusual lifestyles, or 
discussion of medical problems such as HIV which might cause the discussant to lose 
his or her job or social standing. Others are explicitly political in nature, sent by peo-
ple living in regions where political dissent can lead to imprisonment or execution 
[11].

anon.penet.fi

One particularly famous remailer was the anon.penet.fi remailer [77], run by Johan 
Helsingius. This service offered single-hop anonymity -- messages sent to this 
remailer had identifying information stripped out, but were then delivered as usual to 
their destination. This made it particularly easy to use without the special software 
often required of multihop Mixmaster [10][23][66] remailers. It also offered nyms -- 
one could have a stable, pseudonymous identity through the use of this service, rather 
than being completely anonymous. Anyone could reply to a message posted through 
anon.penet.fi, back to the original author, even though neither party would know 
the other's actual identity.


This mechanism also led to a certain amount of insecurity. For example, in one well-
publicized case in 1995, the Church of Scientology was able to get the local govern-
ment in Finland to subpoena the site's operator for the mapping between one particu-
lar nym and the real email address of the person behind it. In 1996, the Church tried to 
determine if a particular individual had ever used the service. The site was eventually 
shut down by its operator, who cited the increasing load on his time that running it 
required, and the availability of at least partial substitutes elsewhere on the net.

The Anonymizer

Consider now the Anonymizer, which attempts to make it possible to fetch web pages 
without informing the web server of the identity of the machine doing the fetching -- 
presumably for use in reading pages with controversial content, or to deny marketers 
the ability to target the reader for profiling. It is a single, centralized server, and sim-
ply proxies requests through itself, rewriting HTML links such that following a link 
on a fetched page will go back through the Anonymizer. While it can effectively hide 
users from sites, it is useless against traffic analysis attacks -- it operates at a single, 
well-known address and from a single point of presence. This makes its communica-
tions easy to tap, either at the site or by looking for requests from a given user to the 
Anonymizer itself. Even if SSL is in use, thus hiding the actual URLs being 
requested and the contents of the pages returned, traffic analysis at the user's site can 
instantly reveal that the Anonymizer is in use, and even this is often sufficient to 
target the user for various unfortunate consequences. Further, sites which offer con-
tent may deliberately deny content to the Anonymizer, to force users to come from 
well-identified IP addresses. Finally, users of the Anonymizer must trust that the site 
really is honoring its stated policies of not keeping logs of the traffic through itself.
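
The link rewriting described above can be sketched in a few lines. This is a simplified illustration, not the Anonymizer's actual code: the proxy address PROXY_PREFIX is hypothetical, and a regular expression that handles only double-quoted href attributes stands in for a real HTML parser.

```python
import re
from urllib.parse import quote, urljoin

PROXY_PREFIX = "http://proxy.example/fetch?url="  # hypothetical proxy address

# Matches href="..." attributes; only double-quoted values are handled.
HREF = re.compile(r'(href\s*=\s*")([^"]*)(")', re.IGNORECASE)

def rewrite_links(html, page_url):
    """Rewrite each href so that following it goes back through the proxy."""
    def replace(match):
        # Resolve relative links against the page they appeared on.
        absolute = urljoin(page_url, match.group(2))
        return match.group(1) + PROXY_PREFIX + quote(absolute, safe="") + match.group(3)
    return HREF.sub(replace, html)

page = '<a href="/news.html">News</a>'
print(rewrite_links(page, "http://site.example/index.html"))
```

Every rewritten link names the same proxy host, which illustrates the weakness noted above: all of a user's traffic converges on one well-known address.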

Crowds

A more-sophisticated system, developed after the Anonymizer, is the Crowds system 
[141]. This system is also an attempt to strip identifying information from web surf-
ers, and uses decentralization to foil traffic analysis. Participating users join a 
crowd -- a collection of other machines, all of which participate in the system, and 
which randomly reforward HTTP requests and responses among themselves before 
sending them to their final destinations. This means that any particular web page 
fetched by a user could come from any of the participating machines at random, 
hence denying the web server the ability to know precisely who is fetching which 
pages.
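
The random reforwarding can be sketched as a biased coin flip at each member: forward the request to another randomly chosen member, or submit it to the web server. The forwarding probability, the function name route_request, and the member names are assumptions of this sketch rather than details given here.

```python
import random

def route_request(members, initiator, forward_probability=0.75, rng=random):
    """Return the path a request takes through the crowd.

    The initiator first hands the request to a randomly chosen member.
    Each member then either forwards it to another random member or
    submits it to the web server. The last entry in the path is the
    machine the web server actually sees.
    """
    path = [initiator, rng.choice(members)]
    while rng.random() < forward_probability:
        path.append(rng.choice(members))
    return path

members = ["m1", "m2", "m3", "m4", "m5"]
path = route_request(members, "m1", rng=random.Random(1))
print("server sees:", path[-1], "after", len(path) - 1, "hops")
```

Since the exit machine is chosen at random and may even be the initiator itself, the server cannot tell whether the machine it sees originated the request or merely forwarded it.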


This system is explicitly aware of the problems of traffic analysis, both at the web 
server itself and in the intervening links between that server and the user, and takes 
steps to foil it. It also reduces the problems of trusting the privacy policy of a single 
site.

Web filters

Web-filtering programs grew directly out of political concerns -- they are software 
packages which are deliberately designed to block content from particular users, gen-
erally minors and anyone else who might be coerced into using them, such as library 
patrons in some cases. Some of them, such as RSACi [140], rely on self-ratings by 
sites. Others, such as PICS [177], rely on third-party ratings. These third-party ratings 
may be either public, and possibly distributed, or provided by the manufacturer of the 
filtering software, and often private.


Since someone must choose which sites are acceptable and which are not, there is an 
implicit political agenda to using such software. Even systems which claim to allow 
the user to select any other third party's recommendations may be abused given 
enough control of the network infrastructure. For example, China carefully controls 
traffic across its borders, and could insist that all web surfers use only government-
approved PICS sites for their filter lists. In addition, those systems in which the 
vendor of the filtering software chooses are often extremely heavy-handed about what 
sorts of sites are deemed unacceptable. In response to this, Bennett Haselton [75] has 
spent considerable time and effort exposing the antics of filter manufacturers who 
claim to be blocking 'sexual content' but are also blocking a wide variety of nonsexual 
web sites that happen to have politics that the filter vendors find unacceptable. The 
lists of sites blocked by these packages are secret, ostensibly for reasons of competitive 
commercial advantage, but this means that there is virtually no oversight for what 
often turns into an appalling censorial exercise.

FSF and Open Source

Finally, let us consider an intellectual property methodology, as opposed to particular 
systems or programs. The methodology of interest is the union of the Free Software 
Foundation and the more-recent Open Source movement. Both of these approaches 
view freely-redistributable software as a social good. While they differ on the details 
of what this means and how to achieve it, they are in substantial agreement that the 
freedom to examine and modify source code is the cornerstone of building high-qual-
ity software. Many famous examples of their effort exist, such as the GNU collection 
of hundreds of utility programs -- Emacs, autoconf, automake, gtar, gmake, and all the 
rest -- and other projects which use their licensing terms but were not written by the 
FSF -- such as Linux, SCM, and so forth.


Both the FSF and the Open Source group have an explicit political agenda, which 
they enforce through the technology of copyright and contract law. Thus, their tech-
nology is that of intellectual property per se, rather than that of software itself. Their 
efforts have had an enormous effect on the way that software is currently developed, 
especially -- but not exclusively -- that which runs under various varieties of UNIX, 
and are likely to leave a considerable legacy.

6.5 Summary

In this chapter, we have touched briefly upon matchmakers, decentralized systems, 
and politics. All three of these fields are assuming increasing importance as the Inter-
net continues to expand and its user base continues to grow. The research that led to 
Yenta and its underlying architecture did not arise in a vacuum. Instead, it is 
explicitly informed by -- and, in some cases, is a reaction to -- some of the existing 
systems and methods of practice currently popular in the field.