The Difficulty of Accountability
--------------------------------

In simple distributed systems, rudimentary accountability measures are often sufficient. If the list of peers is generally static and all are known to each other by hostname or address, servers can misbehave only under the threat of permanent bad reputation. Furthermore, if the operators of a system are known, pre-existing jurisdictional mechanisms such as legal contracts help ensure that systems abide by protocol.

In the real world, these two social forces -- reputation and law -- have provided an impetus for fair trade for centuries. Since the earliest days of commerce, buyers and merchants have known each other's identities -- at first through the immediacy of face-to-face contact, and later through postal mail and telephone conversations. This knowledge has allowed them to research the past histories of their trading partners, and to seek legal reprisal when deals go bad. Much of today's e-commerce uses a similar authentication model: clients (both consumers and businesses) purchase items and services from known sources over the Internet and the World Wide Web. These sources are uniquely identified by digital certificates, registered trademarks, and other addressing mechanisms.

Peer-to-peer technology blurs this distinction between client and server. It removes central control of such resources as communication, file storage and retrieval, and computation. The traditional mechanisms for ensuring proper behavior therefore no longer provide the same level of protection.

The problem of peer-to-peer identity is threefold. First, peer-to-peer technology increases the difficulty of uniquely and permanently identifying peers and their operators. Connections and network maps might be transient. Peers might be able to join and leave the system.
Participants in the system might wish to hide personal identifying information by distributing trust among many peers along both operational and jurisdictional lines. Second, even given a known identity, there may be no way to identify a particular individual, to associate a history with that identity, or to predict future performance. Third, individuals running P2P services are rarely bound by contracts, and the cost and time delay of legal enforcement would generally outweigh its possible benefit.

There are a number of different models on which to base a peer-to-peer system. As systems become more dynamic and diverge from real-world notions of identity, it becomes more difficult to achieve accountability and protect against resource degradation attacks.

The simplest type of peer-to-peer system has two main characteristics. First, the list of servers in the system is mainly fixed: new servers cannot be frictionlessly added to the system. Second, the identities of the servers and their operators are known, generally by DNS hostname or static IP address. Since the operators are known, they may have legal responsibility or economic incentive -- leveraged by the power of reputation -- to fulfill the protocols according to expectation.

An example of such a peer-to-peer system is the Mixmaster remailer. The original Mixmaster client software was developed by Lance Cottrell and released in 1995. [Cottrell, Lance. Mixmaster and Remailer Attacks. Web-published essay. http://www.obscura.com/~loki/remailer/remailer-essay.html] Currently, the software runs on about 30 remailer nodes, whose locations are published to the newsgroup alt.privacy.anon-server and at web sites such as efga.org. The software itself can be found at http://mixmaster.anonymizer.com. The remailer nodes are known by hostname and remain generally fixed.
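This static membership model can be pictured as a client-side configuration list. The hostnames and helper function below are hypothetical illustrations, not Mixmaster's actual configuration format; the point is that the protocol itself offers no join operation, so the list changes only when someone edits it by hand:

```python
# Hypothetical sketch of a static node directory for a remailer-style
# network.  Nothing in the protocol adds entries; an operator must
# learn of a new node out of band and edit this list manually.
KNOWN_NODES = [
    "remailer1.example.org",
    "remailer2.example.net",
]

def add_node(nodes: list[str], hostname: str) -> None:
    """Manual addition -- the only way the membership list changes."""
    if hostname not in nodes:
        nodes.append(hostname)

add_node(KNOWN_NODES, "remailer3.example.com")
print(len(KNOWN_NODES))   # prints 3
```

Because every client must be updated this way, the set of nodes stays small and slow-moving, which is exactly what keeps operators identifiable and accountable.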
While anybody can start running a remailer, the operator needs to spread information about the new node via out-of-band means (that is, by manually sending email -- it is not part of the protocol) to the web pages that publicize node statistics. The location of the new node is then manually added to each client's software configuration files. This process of manually adding new nodes leads to a system that remains generally static. Indeed, there are fewer than 30 widely-known Mixmaster nodes. [Electronic Frontiers Georgia List of Public Mixmaster Remailers. Web page. http://anon.efga.org/Remailers]

A slightly more complicated type of peer-to-peer system still has identified operators, but is dynamic in terms of membership. That is, the protocol itself has support for adding and removing participating servers. One example of such a system is Gnutella: it has good support for new users (who are also servers) joining and leaving the system, but at the same time the identity and location of each of these servers is generally known or knowable. [Gnutella Wall of Shame. Web site. http://www.zeropaid.com/busted/] [R. Dingledine, M. Freedman, and D. Molnar. "The Free Haven Project." 2000 Berkeley Workshop on Design Issues in Anonymity and Unobservability. http://www.freehaven.net/doc/freehaven.ps] These sorts of systems can be very effective, because they are generally easy to deploy (there is no need to protect participants' identities from one another) while still allowing many users to freely join the system and donate their resources.

Farther still on the scale of design and deployment difficulty are peer-to-peer systems that are dynamic in terms of participants and have pseudonymous server operators. In these systems, the actual servers that store files or proxy communication live behind digital masks that conceal their geographic location and other identifying features.
Thus, the mapping of pseudonym to real-world identity is not known. A given pseudonym may be pegged with negative attributes, but a user can simply create a new pseudonym, or manage several at once. Since a given server can disappear at any time and reappear as a completely new entity, these designs require either reputation systems or micropayment systems to provide accountability on the server end. An example of a system in this category is the Free Haven design: each server is contactable via a Mixmaster reply block and a public key, but no other identifying features are available.

The final peer-to-peer model on this scale is a dynamic system with fully anonymous operators. A fully anonymous server, in contrast to the pseudonymous servers described above, lacks even that level of temporary identity. Since an anonymous peer's history is by definition unknown, all decisions in an anonymous system must be based only on the information made available during each protocol operation. Peers cannot use a reputation system, since there is no opportunity to establish a profile of any server. This leaves a micropayment system as the only reasonable way to establish accountability. On the other hand, because the servers themselves have no long-term identity, such a system may be limited in the services it can provide: for instance, it would be very difficult to offer long-term file storage and backup.

As these models show, there are a number of design considerations beyond that of simple collaborative networking between nodes on the Internet. Below we describe two main goals -- privacy protection and dynamic operation -- which underscore the difficulty of achieving accountability. First, a focus of many systems is the goal of privacy-protecting file sharing or communication.
Privacy-protecting file sharing requires a mechanism for inserting and retrieving documents either anonymously or pseudonymously. Privacy-protecting communication, on the other hand, requires a means to communicate -- via email, telnet, FTP, IRC, HTTP, and so on -- without divulging any information that could link the user to his real-world persona.

Note that the Internet is not an ideal medium for anonymous communication and publishing systems: both passive sniffing and active attacks are easy. Email headers include the routing paths of messages, including DNS hostnames and IP addresses. Web servers normally see user IP addresses, and cookies on a client's browser may be used to store persistent user information. Commonly used online chat applications such as ICQ and Instant Messenger also divulge IP addresses. Network cards in promiscuous mode can read all data flowing through the local Ethernet, and the cards themselves often have world-unique MAC addresses, hard-wired during manufacture. Given all these possibilities, telephony or dedicated lines might be better suited to the goal of privacy protection, but the ubiquitous nature of the Internet has made it the only realistic choice.

Second, the dynamic nature and explosive growth of the Internet suggest that any peer-to-peer system must be similarly flexible and dynamic in order to sustain long-term use and growth. Similarly, the importance of ad-hoc networks will probably increase in the near future, paralleling the further development of inexpensive wireless technology and connectivity. The system must provide a mechanism for peers to join and leave smoothly without impacting functionality. Such a design also decreases the risk of system-wide compromise as more peers join the system. (This assertion assumes that servers run a variety of operating systems and tools, so that a single exploit cannot compromise most of the servers at once.)
The main goal of accountability is the following: we wish to maximize a server's utility to the overall system while minimizing the threat it poses. In other words, accountability is used to reduce the damage, intentional or not, that a server can inflict on the system. We can minimize the threat by incurring risk either equivalent to our benefit from the transaction or proportional to our trust in the parties involved -- approaches respectively described by the micropayment and reputation models.

The former technique uses a fee-for-service micropayment model. A server makes decisions based on fairly immediate information. Payments and the value of services are generally kept small, so that a server gambles only some small amount of lost resources on any single exchange. If both parties are satisfied with the result, they can continue with successive exchanges. Parties therefore require little prior information about each other, as the risk at any one time is small. Note that the notion of payment in a micropayment scheme may be distinct from any actual currency or cash.

The latter technique uses a reputation model. For a given exchange, a server risks an amount of resources proportional to its trust that the result will be satisfactory. As a server's reputation grows, other nodes become more willing to risk larger amounts with it, and the micropayment approach of small, successive exchanges is no longer necessary. Reputation systems require careful development, however, in the presence of impermanent, pseudonymous identities. If an adversary can gain positive attributes too easily and establish a good reputation, she can damage the system. Conversely, if a well-intentioned server can easily incur negative attributes from short-lived operational problems, she can lose reputation too quickly, and the system would lose the utility offered by these "good" servers.
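The contrast between the two models can be sketched in a few lines of code. All class names, fee amounts, and score weights below are illustrative assumptions, not part of any deployed protocol: a micropayment party risks at most one small fee per outstanding exchange, while a reputation-based party scales its stake with a pseudonym's accumulated history -- and a freshly minted pseudonym starts back at zero.

```python
class MicropaymentExchange:
    """Fee-for-service model: risk is capped at the fee for the one
    exchange currently outstanding, regardless of the peer's history."""

    def __init__(self, fee: int):
        self.fee = fee

    def max_loss(self) -> int:
        # The most a party can lose before deciding to stop trading.
        return self.fee


class ReputationLedger:
    """Reputation model: stake an amount proportional to trust built
    up over past exchanges with a pseudonym."""

    def __init__(self):
        self.scores = {}  # pseudonym -> trust score

    def record(self, peer: str, satisfactory: bool) -> None:
        # Failures are weighted more heavily than successes (an
        # illustrative choice; real systems must tune this carefully).
        delta = 1 if satisfactory else -5
        self.scores[peer] = self.scores.get(peer, 0) + delta

    def max_stake(self, peer: str, unit: int = 10) -> int:
        # An unknown (or freshly renamed) pseudonym gets zero stake.
        return max(0, self.scores.get(peer, 0)) * unit


ledger = ReputationLedger()
for _ in range(3):
    ledger.record("alice", satisfactory=True)
print(ledger.max_stake("alice"))      # prints 30: trust has grown
print(ledger.max_stake("mallory"))    # prints 0: no history, no risk
```

Note that a server which misbehaves and then re-registers under a new pseudonym lands in the "mallory" case above, which is why a reputation system must make positive attributes slow to earn relative to the damage a trusted peer can do.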
There is a third way to handle the accountability problem: ignore the issue and engineer the system simply to survive some faulty servers. Instead of spending effort on ensuring that servers fulfill their function, we can leverage the vast resources of the Internet for redundancy and mirroring. We might not know, and might be unable to find out, whether a server is behaving according to protocol -- whether that protocol means storing files and responding to file queries, forwarding email or other communications on demand, or correctly computing values or analyzing data. But if we replicate the file or functionality throughout the system, we can ensure that the system works correctly with high probability despite misbehaving components. Note that this approach works only for services provided by the network itself; it does not apply to exchanges of objects of value.

The question remains: where are we successful? That is, which servers in the peer-to-peer system are behaving according to protocol, and which offer "correct" information? This is a problem of trust, described in more depth in another chapter. Aspects of trust that make our situation particularly complicated include the problem of document naming (and the associated uncertainty about the actual contents of a document) and the development of efficient tests for distributed computations. More broadly, we need general algorithms to verify the behavior of decentralized systems -- for instance, Free Haven needs to address this issue to know how frequently to query for a given document to see whether it is still available.

In general, the popular peer-to-peer systems take a wide variety of approaches to solving the accountability problem. A few examples follow.

* Freenet dumps unpopular data on the floor, so people flooding the system with unpopular data are ultimately ignored.
Popular data is cached near the requester, so repeated requests won't traverse long sections of the network.

* Gnutella doesn't "publish" your documents anywhere except on your computer, so there's no way you can flood other systems. (This has great impact on the level of anonymity actually offered.)

* Publius limits the submission size to 100k. (It remains to be seen how successful this will be; they recognize it as a problem.)

* Mojo Nation uses micropayments for all peer-to-peer exchanges.

* Free Haven requires publishers to provide reliable space of their own if they want to insert documents into the system. This economy of reputation tries to ensure that people donate to the system in proportion to how much space they use.

In the next sections, we further consider the accountability problem and describe the micropayment, reputation, and other schemes used.
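The redundancy-and-mirroring approach described earlier can be quantified with a back-of-the-envelope calculation: if each server fails or misbehaves independently with some probability, the chance that every replica of a document is lost shrinks exponentially with the number of copies. A minimal sketch follows; the independence assumption is the load-bearing one, and it rarely holds exactly in practice (correlated failures, or an adversary targeting all replicas, break it).

```python
def availability(failure_rate: float, replicas: int) -> float:
    """Probability that at least one of `replicas` independent copies
    of a document remains retrievable, when each server fails or
    misbehaves independently with probability `failure_rate`."""
    return 1.0 - failure_rate ** replicas

# Even if half the servers misbehave, ten copies make loss unlikely.
print(f"{availability(0.5, 10):.4f}")   # prints 0.9990
```

This is why a system can "ignore" accountability for storage and retrieval: it buys correctness with copies rather than with trust in any individual server.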