%%% WShAnon paper for the Free Haven Project.
%%% Due : July 1 in camera ready form. 
\documentclass[11pt, amssymbols]{article}
% \usepackage[]{agaramon}
% \setcounter{page}{7}
\usepackage{epsfig}
\textwidth16cm
\textheight21cm
\topmargin0mm
\oddsidemargin2.5mm
\evensidemargin2.5mm

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% AREA FOR AUTHOR STYLES AND DEFINITIONS
%
% Please reduce your own definitions and macros to an
% absolute minimum. Thank you!
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\newcommand{\cm}[0]{$\surd$}
\newcommand{\subsubsubsection}[1]{\paragraph{#1}}
\newcommand{\dofig}[3]{\begin{figure}
\epsfbox{#1}
\caption{#2}
\label{#3}
\end{figure}}


\begin{document}
\title{The Free Haven Project: \\
Distributed Anonymous Storage Service}
\author{Roger Dingledine \\
   	MIT \\
  	arma@mit.edu
   	\and
  	Michael J. Freedman \\ 
        MIT \\
	mfreed@mit.edu
	\and
	David Molnar \\
	Harvard University \\
	dmolnar@fas.harvard.edu
}
\date{July 25, 2000}
\maketitle


\begin{abstract}
The Free Haven Project is a design for a system of anonymous storage
which resists the attempts of powerful adversaries to find or destroy
stored data.  We explicitly specify requirements for such a storage
system and enumerate distinct notions of anonymity. We suggest a way to
classify anonymous systems based on the kinds of anonymity provided.
Accountability of servers is explicitly addressed through the use of a
``trust network,'' which attempts to limit the damage done by
misbehaving servers.  Design choices common to several anonymous storage
and publication systems are identified.  We end by identifying possible
attacks, defenses, and future questions.
\end{abstract}

\section{Introduction}
Anonymous publication and storage services allow individuals to speak
freely without fear of persecution, yet such systems remain poorly
understood. Political dissidents must publish in order to reach enough
people for their criticisms of a regime to be effective, yet they and
their readers require anonymity at the same time. Less extreme examples
involve cases in which a large and powerful private organization, such
as the Church of Scientology, attempts to silence its critics by
attacking either the critics themselves or those who make the criticism
publically available. Additionally, the recent controversy over Napster
and Gnutella has highlighted both a widespread demand for anonymous
publication services for non-political purposes, and the consequences of
such services failing to provide the anonymity expected.

%Despite the need for such services,
% i don't think we've sufficiently demonstrated a need.
% perhaps freenet and gnutella are enough. and where *are* those
% political dissidents?

Systems meeting these needs are just starting to be deployed, and the
exact requirements and design choices are not yet clear. Recent events
have highlighted some shortcomings of already deployed systems; the
identification and removal of Napster users who downloaded Metallica
songs \cite{metallica} and the Gnutella Wall of Shame \cite{gnushame} are two
examples. These shortcomings are driving the development of a new
generation of anonymous publication services, such as Freenet \cite{freenet},
which focus specifically on providing anonymity.

It is in this spirit that the Free Haven Project aims to design,
implement, and deploy a functioning distributed anonymous \emph{storage}
service. We distinguish storage from publication in that storage
services focus less on availability and more on persistence of data.
 
In the process, we hope to clarify some of the requirements for such
systems and highlight design choices. We recognize that such services
raise significant moral and legal issues which are outside the scope of
this paper; for our consideration of these issues, we refer to the first
author's thesis \cite{rogers-thesis}.

%Systems meeting these needs are just starting to be deployed, and the
%exact requirements and design choices are not yet clear.  The Free Haven
%Project aims to design, implement, and deploy a functioning distributed
%anonymous storage service. In the process, we hope to clarify some of
%the requirements for such systems and highlight design choices.

Here we present a design for a system of anonymous storage and begin
investigating the requirements for such a system. In particular, we
recognize that a system must meet some standard of reliability and
utility before it can be useful. Our design operates on a basic unit of
data called a \emph{document}. Our requirements for reliably processing
these documents are covered in section \ref{sec:storage-reqs}.

We also show that it is not enough simply to talk about ``anonymous''
storage and publication. In section \ref{sec:anon}, we enumerate the
many different \emph{kinds} of anonymity which cover different aspects
of the system, all important for the realization of a truly anonymous
system.

%We also show that it is not enough simply to talk about ``anonymous''
%storage. There are many different \emph{kinds} of anonymity which cover
%different aspects of the system. The kind of anonymity which protects a
%document reader from being shot is different than the kind of anonymity
%which protects a server from being tied to some piece of bad data. These
%kinds of anonymity are enumerated in section \ref{sec:anon}.

Free Haven meets these requirements with a design based on a community
%network (i like the word community. it emphasizes our bottom-up approach.)
of servers called the \emph{servnet}. Each server, or \emph{servnet
node}, holds pieces of some documents; these pieces are called
\emph{shares}.  In addition, each servnet node has a persistent
identification or \emph{pseudonym} which allows it to be identified by
other servnet nodes or potential Free Haven users. Section
\ref{sec:design} describes the design of the Free Haven system and the
operations that it supports, including inserting and retrieving
documents.
%In section XX we describe how users can insert document into the Free
%Haven servnet, and in section XX we describe the steps involved in
%retrieving a document.
%%ah ha! so *you're* the one who spells it retreive. :)

We chose to use a network of pseudonymous servers in order to give each
server a \emph{reputation}. This reputation allows servers to be
``paid'' without needing the robust digital cash scheme required for
systems such as Anderson's Eternity Service \cite{eternity}.  Servers
form ``contracts'' to store given shares for a certain period of time;
successfully fulfilling the contract gains the server trust and the
ability to store some of its own data on other servnet nodes. This gives
an incentive for each server to behave well, as long as cheating servers
can be identified, which we illustrate in section \ref{subsec:buddies}.
The idea is similar to the ``give up space now, get space forever''
scheme used in Intermemory \cite{CEGGSY98}, but allows servers to lose
trust if they start behaving badly. In section \ref{subsec:trust} we
discuss the ``trust network,'' which is the system that keeps track of
trust in each servnet node.  
%Section\ref{subsec:buddies} illustrates some ways servers can misbehave
%and methods to catch them when they do. 

Some of these ``contracts'' are formed when a user inserts data into the
servnet.  Most of them, however, will be formed when two servers swap
shares by \emph{trading}. Trading allows the servnet to be
\emph{dynamic} in the sense that servnet nodes can join and leave easily
and without special treatment. To join, a servnet node starts building
up trust by storing shares for others. To leave, a node trades away all
of its shares and then disappears. The benefits and mechanisms of
trading are described in section \ref{subsec:trading}.

Naturally, such a system has powerful adversaries which can launch a
range of attacks.  We describe some attacks on the Free Haven design in
section \ref{sec:attacks} and show how well the design does (or does
not) resist each attack. We then compare our design with other systems
aimed at anonymous storage and publication using the kinds of anonymity
described in section \ref{sec:analysis-anon}, allowing us to distinguish
systems which at first glance look very similar.  We conclude with a
list of ``challenges'' for anonymous publication and storage systems,
each of which reflects a limitation in the current Free Haven design.


\section{Requirements For Anonymous Storage}
\label{sec:storage-reqs}

\subsection{Parties}

We begin our discussion of requirements for anonymous storage by
enumerating the parties in an anonymous storage system. Information is
stored in units called \emph{documents}.

The \emph{author} of a document is the entity which initially created
the document.  The \emph{publisher} of a document is the entity which
places the document into the system.  Documents may have \emph{readers},
which are entities who retrieve the document from the system.  An
anonymous storage system may have \emph{servers}, which are participants
who provide special services required to keep the system running, such
as dedicated disk space or bandwidth.

\subsection{Storage Requirements}

These requirements are independent of the system's anonymity, but must
be fulfilled if the system is to be useful.

\begin{description}

\item[Integrity:]
An authorized reader must receive the same document as originally placed
into the system by the publisher, or recognize that the document has
been altered.

\item[Controlled Retrieval:]
The system must specify some policy for retrieving the document.


\item[Efficient Retrieval:]
An authorized reader must be able to efficiently retrieve the document.

%%%  WHAT?!?!  We aren't efficient! -mjf

\item[Persistence:]
It must be clear how long the document can remain stored, and the system
must fill all commitments to store a document for a given amount of time.

%%% What about publishing?  Huh???

\end{description}

\section{Anonymity for Anonymous Storage}
\label{sec:anon}
The word ``anonymous'' can mean many different things. Some systems
claim ``anonymity'' without specifying a precise definition. While the
anonymity requirements of communication channels have been considered
previously in depth \cite{web-mix, gold-anon}, we are not aware of a
similar investigation into the requirements for publication and storage
systems.

We do not give formal definitions here. Instead, we attempt to lay the
groundwork for future definitions by enumerating different aspects of
anonymity relevant to anonymous storage. This enumeration will allow us
to compare Free Haven with related work.

In all of these notions of anonymity, there are at least three distinct subnotions
based on what the adversary is assumed to already know. A document may be picked first, 
and then the adversary wishes to learn who authored, read, published, and so on. 
A user may be picked first, and the adversary wishes to know which documents the user 
authored, read, published, and so on. Finally, an adversary may know a document
\emph{and} a user, and then attempt to confirm its suspicion that the two are linked. 

\begin{description}
\item[Author-Anonymity:] 
A system is \emph{author-anonymous} if an adversary cannot link an
author to a document.


\item[Publisher-Anonymity:] 
A system is \emph{publisher-anonymous} if it prevents an adversary from
linking a publisher to a document.

\item[Reader-Anonymity:]
To say that a system has \emph{reader-anonymity} means that a 
document cannot be linked with its readers. Reader-anonymity protects
the privacy of a system's users.

\item[Server-Anonymity:]
Server-anonymity means no server can be linked to a document.
Here, the adversary always picks the document first. That is, given a document's name or 
other identifier, an adversary is no closer to knowing which server or 
servers on the network currently possess this document. 

\item[Document Anonymity:]
Document-anonymity means that a server does not know which documents it
is storing. Server-anonymity and document-anonymity are crucial if mere
possession of some file is cause for action against the server, because
they provide protection to a server operator even after his or her machine has been
seized by an adversary.

\emph{Isolated-server} document-anonymity means that if the server is
allowed to look only at the data that it is storing, it is unable to
figure out the contents of the document. This is achieved via some sort
of secret sharing mechanism, either sharing the document or sharing the
key for recreating the document (or both) across servers.

\emph{Connected-server} document-anonymity refers to the situation in
which the server is allowed to communicate and compare data with all
other servers. Since a connected server may act as a reader and do
document requests itself, connected-server document-anonymity seems
difficult to achieve without some trusted party which can distinguish
server requests from ``ordinary'' reader requests.

\item[Query-Anonymity:] 
Query-anonymity refers to the notion that over the course of a given
document query or request, the ``identity'' of the document itself is not
revealed to the server. In short, this means that although a server may
have many different documents stored, which document was served for a
given request is not knowable by the server. For an overview of private
information retrieval (PIR), see \cite{MALKIN-THESIS}.

A weaker form of query-anonymity may be realized through server
deniability.  The server knows the identity of the requested document,
but no third party can be convinced of its identity.  This concept is
related to deniable encryption
\cite{CDNO97}. 

% designated verifier proofs \cite{JSI96} of knowledge...

%is typically done by a computationally
%intensive (or bandwidth-intensive) process of downloading lots of other
%documents at the same time to mask which document is being retrieved,
%or perhaps by using lots of servers in the transaction. Alternatively,
%the use of massive amounts of resources might be solved by particularly
%clever mathematics. For an overview of PIR, we suggest Malkin's thesis
%\cite{MALKIN-THESIS}.

\end{description}

It seems that some of these notions of anonymity may imply each
other. We leave this investigation as future work.

\subsection{Anonymity and Pseudonymity}

So far, we have restricted ourselves to describing anonymity.  We could
extend these notions to allow for the use of \emph{pseudonyms}. If two
transactions in the system can be linked, then the attributes which
allow them to be linked make up a pseudonym.  For example, in an
\emph{author-pseudonymous} system, the documents digitally signed by
``Publius'' could all be verified as ``belonging to Publius'' without
anyone coming to know who ``Publius'' is in 'real life.'

Both anonymity and pseudonymity protect the privacy of the user's
\emph{location} and \emph{true name}. Location refers to the actual
physical connection to the system. The term ``true name'' was introduced
by Vinge \cite{vinge} and popularized by May\cite{timmay} to refer to
the legal identity of an individual. Knowing someone's true name or
location allows you to hurt him or her.

Many different actions can be linked to the same pseudonym, while an
anonymous system allows no linking at all.  This allows the pseudonym to
acquire a \emph{reputation}. Free Haven uses pseudonyms to give each
servnet node a reputation; the reputation influences how much data a
node can store and provides an incentive to act correctly.

\subsection{Partial Anonymity}

Often an adversary can gain some partial information about the users of
a system, such as the fact that they have high-bandwidth connections or
all live in California. Preventing an adversary from obtaining
\emph{any} such information may be impossible. Instead of asking ``is
the system anonymous,'' the question shifts to ``is it anonymous
enough?''

We might say that a system is \emph{partially anonymous} if an adversary
can only narrow down a search for a user to one of a ``set of
suspects.''  If the set is large enough, then it is impractical for an
adversary to act as if any single suspect were guilty. On the other
hand, when the set of suspects is small, mere suspicion may cause an
adversary to take action against all of them.

Independently, Syverson has developed a logic for talking about the
adversary's view of a set of suspects \cite{syverson}.

\subsection{Uses of An Ideal World}

%% I would like a trickier example.
%% For anonymous channels, the analogous (better) example is 
%% JavaScript downloaded to the browser which sends out the IP 
%% address to the adversary. 

Suppose an author signs his true name to a document before placing it
into an anonymous publication system. Is the system still ``anonymous''?
This situation raises a crucial question: where does the
``responsibility'' of an anonymous publication system begin, and where
does it end? What can such a system reasonably be expected to protect?

We approach this question by introducing an ``ideal world'' for
anonymous publication, influenced by work on secure multiparty
computation \cite{CANETTI-THESIS, ANNA}. The ideal anonymous system in
this world consists of a trusted third party (TTP) Ted with secure
channels to each party in the system.  This TTP receives confidential
messages, strips off the source information, and confidentially forwards
the messages to their destinations in an unlinkable fashion. Our goal is
to come up with a decentralized system that is able to simulate this TTP
for each operation. Equivalently, if Alice is communicating through Ted
to Bob, a set of protocols which allows Alice to communicate directly to
Bob without Ted is said to be anonymous if the transcripts of
communication are indistinguishable from those which include Ted. If
they are distinguishable, then that difference is exactly the ``break''
that causes the system to fail to be anonymous.

This informal description requires significant work before it can become
a formal model. For one thing, we have not precisely specified the
meaning of a ``transcript,'' nor have we investigated the notion of
``security'' necessary for the channels between Ted, Alice, and Bob.
Such work is outside the scope of this paper. Our point is that if an
attack succeeds even in the ideal world, then it is ``outside'' the
scope of an anonymous publication service.
 

\section{The Free Haven Design}
\label{sec:design}

\subsection{Overview}

The overall system consists of the publication system, which is
responsible for storing and serving documents; and the communications
channel, which is responsible for providing confidential and anonymous
communications between parties. This paper focuses on the design of the
publication system as a back-end for the communications channel.

The agents in our publication system are the {\bf author}, the {\bf
server}, and the {\bf reader}. These agents are layered over the
communications channel; currently they communicate with one another via
addresses which are implemented as remailer reply blocks
\cite{nymserver}.  Authors are agents that produce documents and wish to
store them in the service, servers are computers which store data for
authors, and readers are people
%(inside or outside the servnet)
who retrieve documents from the service.
% need to make clear readers are also servers

Free Haven is based on a community of servers (which as a whole is
termed the ``servnet'') where each server hosts data from the other
servers in exchange for the opportunity to store its own data in the
servnet.  The servnet is dynamic: data moves from one server to another
every so often, based on each server's trust of the others. Servers
transfer data by trading. This means that the only way to introduce a
new file into the system is for a server to use (and thus provide) more
space on its local system. This new file will migrate to other servers
by the process of trading.

Each server has a public key and one (or perhaps more) remailer reply
blocks, which together can be used to provide secure, authenticated,
pseudonymous communication with that server.  Every machine in the
servnet has a database which contains the public keys and reply blocks
of other nodes in the servnet.

Authors assign an expiration date to documents when they are published;
servers make a promise to maintain the availability of a given document
until its expiration date is reached.  Honest behavior is enforced by
other servers losing trust in a server that ``drops'' data. This trust is
monitored and updated by use of a \emph{trust network}.  Each server
maintains a database containing its trust of the other servers.

% we can leave this 'supported operations' subsection out --RD
%\subsection{Supported Operations}

% There are two operations which any publishing service must provide:
% adding a document to the system, and retrieving a document from the
% system. Free Haven provides these operations, and also supports a number
% of behind the scenes operations such as trading, expiration, and
% rudimentary support for enforcing accountability without sacrificing
% anonymity.


\subsection{Storage}

When an author (call her Alice) wishes to store a new document in Free
Haven, she must first identify a Free Haven servnet node which is
willing to store the document for her. This might be accomplished by
having Alice run a node herself. Alternatively, some nodes might have
public interfaces or have publically available reply blocks and be
willing to publish data for others.

\subsection{Publication}

To introduce a file $F$ into the servnet, the publishing server first
uses Rabin's information dispersal algorithm (IDA) \cite{rabin-ida} to
break the file into shares $f_1, \dots, f_n$ where any $k$ shares are
sufficient to recreate the file. The server then generates a key pair
$(PK_{doc},SK_{doc})$, constructs and signs a data segment for each share $f_i$, and
inserts those segments as new documents into its local server space. 
Attributes in each share include a timestamp, expiration
information, the public key which was used to sign it (for integrity
verification), information about share numbering, and the signature
itself.

The robustness parameter $k$ should be chosen based on some compromise
between the importance of the file and the size and available space.  A
large value of $k$ relative to $n$ makes the file more brittle, because
it will be unrecoverable after a few shares are lost. On the other hand,
a smaller value of $k$ implies a larger share size, since more data is
stored in each share.

% would be nice to introduce modelling, but not necessary -RD
%A more considered value for $k$ should be available based on the results of
%modelling the mixnet and servnet to determine how many documents are lost in
%what sort of situations.
%The brittleness of documents in the servnet is affected by a variety of
%factors, including percentage of misbehaving nodes, reliability of the
%network itself, trading frequency, and the rate at which new servers
%join the servnet and old servers disappear.

\subsection{Retrieval}

Documents in Free Haven are indexed by the public key $PK_{doc}$ used to
sign the shares of the document. Readers must locate (or be running) a
server which performs the document request.  The reader comes up with a
key pair $(PK_{client},SK_{client})$ for this transaction, as well as a
one-time remailer reply block.  The servnet server does a broadcast of
``\{`request', $PK_{doc}$, $PK_{client}$, reply block\}'' to all servnet
nodes that it knows about. These broadcasts can be queued and then sent
out in bulk to conserve bandwidth.
% (since the mixnets are
%going to introduce some delay as it is, adding a bit more by sending
%requests perhaps once an hour will not significantly affect the latency).

Each server that receives the query will check to see if it has any
shares with the requested $PK_{doc}$, and if it does it will encrypt
each share in the enclosed public key $PK_{client}$, and then send the
encrypted share through the remailer to the enclosed address. These
shares will magically arrive out of the ether at their destination; once
enough shares arrive ($\ge k$), the client recreates the file and is done.
%If not enough shares arrive, then the client has failed to obtain the 
%file (most likely, it is lost).

\subsection{Share Expiration}

Each share includes an expiration date chosen at share creation time.
This is an absolute (as opposed to relative) timestamp which indicates
the time after which the hosting server may delete the share with no ill
consequences.

Expiration dates should be chosen based on how long the introducing
server wants the data to last, considering the file size and likelihood
of finding a server willing to make the trade.

By allowing the publisher of the document to set its expiration time,
Free Haven distinguishes itself from many of the related works such as
Freenet. We maintain an entirely content-neutral approach to the data
stored in the system. That is, in the implicit contract between servers,
each server agrees to store data for the other servers without regard for
the legal or moral issues for that data in any given jurisdiction.

Designing a protocol which encourages content-neutrality may well mean
that server administrators are less liable for the content on the
network. Examples include common carrier law, wherein phone companies
are not responsible for the content they carry over their systems.
However, our strongest argument for maintaining a content-neutral system
is that we think this is the most useful approach to a persistent
anonymous data storage service. The Free Haven system is designed to
provide privacy for its users; rather than being a persistent publication
%system aimed at availability, is it designed to be a private low-profile
system aimed at convenience, is it designed to be a private low-profile
storage system.

\subsection{Document Revocation}

Some publishing systems allow for documents to be ``unpublished'' or
\emph{revoked}. Revocation has some benefits. Revocation would allow the
implementation of a read-write filesystem.  Published documents could be
updated as newer versions became available. It also allows political
dissidents who publish under their real name to realize their mistake
and unpublish incriminating documents.
% cite brett w?

Revocation could be implemented by allowing the author to come up with a
random private value $x$, and then publishing some hash $H(x)$ inside
each share. To revoke, the author could broadcast his original value $x$
to all servnet nodes as a signal to delete the document.

On the other hand, revocation allows new attacks on the system. Firstly,
it complicates accountability. Revocation requests may not reach all
shares of a file, due either to a poor communication channel or to a
malicious adversary who sends unpublishing requests only to some members
of the servnet.  Secondly, authors might use the same hash for new
shares, and thus ``link'' documents. Adversaries might do the same to
make it appear that the same author published two unrelated documents.
%Adversaries might also use the same $H(x)$ even though they
%are unaware of the value of $x$: this would cause artificial linking, as
%observers might conclude that the publisher of the original document
%also published the later documents.
%\item 
Thirdly, the presence of an unpublishing tag $H(x)$ in a share assigns
``ownership'' to a share that is not present otherwise. An author who
remembers his $x$ has evidence that he was associated with that share,
thus breaking forward author-anonymity.

%\end{itemize}

% In addition, if revocation exists, then a corrupt police force or
% intelligence agency has an incentive to track down the original author
% of the document, because chances are good that he still has the value
% $x$ which would allow them to remove the document from Free Haven. 

One of the most serious arguments against revocation was raised by Ross
Anderson \cite{eternity}. If the capability to revoke exists, then an
adversary can find who controls this capability, and threaten or torture
him until he revokes the document.  We could address this problem by
making revocation optional: the share itself could make it clear whether
that share can be unpublished.  If no unpublishing tag is present, there
would be no reason to track down the author.  If an adversary wishes to
create a pretext to hunt down the publisher of a document, he can still
republish the document {\em with} a revocation tag, and use that as
``reasonable cause''.  Furthermore, node operators or networks may be
required by law to refuse unrevocable shares.

Because the ability to revoke shares may put the original publisher in
increased physical danger, as well as allowing new attacks on the
system, we chose to leave revocation out of the current design.

\subsection{Trading}
\label{subsec:trading}

In the Free Haven design, servers periodically trade shares with each
other. There are a number of reasons why servers trade:

\begin{description}
\item[Cover for Publishing:] If trades are common, then there is no
reason to assume that somebody offering a trade is the publisher of a
share.

\item[Servers Join and Leave:] Trading allows servers to gracefully exit
the servnet by giving away all their shares. This support for a dynamic
network is crucial, since many of the participants in Free Haven will be
well-behaved but transient relative to the duration of the longer-lived
shares.

\item[Longer expiration dates:] Long-lasting shares would be rare if
trading them involved finding a server that promised to be up and
available for the next several years.

\item[Accomodates ethical concerns of server operators:] Frequent trading
makes it easy and unsuspicious for server operators to trade away a
particular piece of data with which they do not wish to be associated.
Server operators can use the public key in each share to check 
to what document a share belongs, then trade away the share without
compromising their reputation as a servnet node or the availability
of the document. 

\item[Provides a moving target:] Encouraging shares to move from node to
node through the servnet means that there is never any specific, static
target to attack.

\end{description}

When a server $A$ wants to make a trade (frequency of trade should be a
parameter set by the server operator), it chooses another server $B$
from its list of known servers (based on trust and history), and offers
a share $\Phi_i$ and a request for size or duration of a return share.
%along with some set of hints describing the share that
%it is interested in receiving in return.
If $B$ is interested, it responds with a share $\lambda_k$ of its own.  
(The notation $X_y$ indicates share $y$ of document $X$.)

Trades are considered ``fair'' based on the two-dimensional currency of
$size \times duration$, plus a function of the preferences of the
servers involved in the trade.

The negotiation is finalized by each server sending an acknowledgement
of the trade (including receipt, as described in section
\ref{subsec:receipts}) to the other. In addition, each server sends a
receipt to both the buddy of the share it is sending and the buddy of
the share it is receiving; buddies and the accountability they provide
are described in section \ref{subsec:buddies}. Thus, the entire trading
handshake takes four rounds: the first two to exchange the shares
themselves, and the next two to exchange receipts and at the same time
send receipts to the buddies.

%XXX
%If the offered share $\Phi_i$ is not acceptable, $B$ can simply not
%respond, or it can respond with some description of why it refused the
%trade. This gives some indication of which trades might be accepted in
%the future. If $B$ chooses not to respond, however, he runs the risk of
%having $A$ conclude that he is a dead node; see Section
%\ref{subsec:node-exp} below on Node Expiration.

By providing the receipt on the third round of the trading handshake,
$A$ makes a commitment to store the share $\lambda_k$. Similarly, the
receipt that $B$ generates on the fourth round represents a commitment
to store the share $\Phi_i$. $B$ could attack $A$ by failing to continue
the protocol after the third line: in this case, $A$ has committed to
keeping the share from $B$, but $B$ has not committed to anything. At
this point, $A$'s only recourse is to broadcast a complaint against $B$
and hope that the Trust system causes others to recognize that $B$ has
misbehaved.

We suggest that when a server $A$ trades a share to a server $B$, server
$A$ should keep a copy of the share around for a while, just in case $B$
proves untrustworthy. This will increase the amount of overhead in the
system by a factor of two or so (depending on duration), but provide
greatly increased robustness.  In this case, when a query is done for a
share, the system responding should include a flag for whether it
believes itself to be the ``primary provider'' of the data, or just
happens to have a copy still lying around. The amount of time is a
parameter which requires further study.

\subsection{Receipts}
\label{subsec:receipts}

A receipt contains a hash of the public keys for the source server and
the destination server, information about the share traded away,
information about the share received, and a timestamp. For each share,
it includes a hash of that document's key, which share number it was,
its expiration date, and its size.

This entire set of five elements is signed by server $A$.  If a
broadcast is performed by $B$ (or any other node) complaining about the
behavior of $A$, then furnishing this receipt along with the complaint
will provide some rudimentary level of ``proof'' that $B$ is not
fabricating its complaint.  Note that the expiration date of both shares
is included within the receipt, and the signature makes this value
immutable. Thus, other servers observing a receipt can easily tell
whether the receipt is still ``valid''. The size of each share is also
included, so other servers can make an informed decision about how
influential this transaction should be on their trust of the two servers
involved in the trade.

We really are not treating the receipt as {\em proof} of a transaction,
but rather as an indication of a commitment to keep safe a given
share. This is because the most a given server can do when it detects a
misbehaving server is broadcast a complaint and hope the Trust system
handles it correctly.

\subsection{Accountability}
\label{subsec:buddies}

Malicious servers can accept document shares and then fail to store
them. If enough shares are lost, the document is unrecoverable. Further,
malicious servers can continue to be malicious if there are no
mechanisms in place for identifying and excising ill-behaved nodes.

We propose a ``buddy system'' which associate pairs of shares in a given
document with each other. Each share is responsible for maintaining
information about the location of the other share, or \emph{buddy}.
When a share moves, it notifies its buddy. These notifications are 
signed by the public key of the document, which is inside each share; it 
is for this reason that we include $PK_doc$ in each share and not simply
a hash of the public key. 

Periodically, a server holding a given share should query for its buddy,
to make sure its buddy is still alive.  Should its buddy stop
responding, then the remaining share (or more correctly, the host
currently holding that share) is responsible for announcing which node
had responsibility for it when it disappeared, as described below under
section \ref{subsec:trust}.

We considered allowing abandoned shares to optionally spawn a new share
if their buddy disappears, but discarded this notion.  Buddy spawning
would make the service much more robust, since shares that are destroyed
can be regenerated. Such spawning could cause an
exponential population explosion of shares: if a share is out of touch
for a little while but is not dead, then both shares will end up
spawning new copies of themselves.  This is a strong argument for not
letting shares replicate.

Since the communications channel we have chosen currently has
significant latency, each server is responsible for forwarding buddy notifications
(based on information stored in the receipts it keeps). These forwarding
addresses are therefore available until the share's expiration date has
passed.

When a buddy notification comes in, the forwarder is checked
and the notification is forwarded if appropriate. This forwarding is
{\em not} done in the case of a document request (which includes a buddy
check operation), since this document request has presumably been
broadcast to all nodes in the servnet.
% -- this means that the node on the
%other end of the forwarder will also get its own copy of the request.

We have attempted to distinguish between the design goals of robustness
and accountability.  The fact that a document cannot be lost until a
certain threshold of its shares have been lost provides a simple
robustness.  Accountability, in turn, is provided by the buddy checking
and notification system among shares, which protects against malicious
or otherwise ill-behaving nodes.  Designers can choose the desired
levels of robustness and accountability independently.
%of each other.
%Robustness can be increased by increasing the number of shares for a
%document or decreasing the associated $k$ parameter; accountability can
%be increased by increasing the number of buddies or the frequency of
%checking.

\subsection{Communications Channel}

The Free Haven design requires a means of anonymously passing
information between agents. One such means is the remailer network,
including the Mixmaster remailers first designed by Lance Cottrell
\cite{mixmaster}.  Other examples of anonymous communication channels
are Onion Routing \cite{onion-routing-paper} and Zero Knowledge Systems'
Freedom \cite{freedom-architecture}.  We refer to David Martin's thesis
for a comprehensive overview of anonymous channels in theory and
practice \cite{local-anonymity}.

The design and implementation of an anonymous communication channel is
an ongoing research topic \cite{abe-mix, web-mix, chaum-mix, chaum-dc,
babel, jakobsson-flash-mix, jakobsson-practical-mix, sg-mix, ISDN-mix,
crowds}. The first implementation of the Free Haven design will use the
Cypherpunk and Mixmaster remailers as its anonymous channel. For design
details, see \cite{freedman-thesis}.

\subsection{Trust Network}
\label{subsec:trust}

The Trust Network in Free Haven is responsible for creating
accountability.  Accountability in the face of such strong anonymity is
a difficult task; there are many opportunities to try to take advantage
of other servers, ranging from merely neglecting to send a receipt after
a trade to wrongly accusing a different node of losing a share to more
insidious and complex attacks.

Other systems exist which use reputations to ensure correct or ``better'' operation.
The most directly relevant is the PGP Web of Trust model for public keys \cite{pgpfaq}. Other systems include the Advogato and Slashdot message moderation systems, AOL's Instant Messenger \cite{AOLIM}, and much
of real world commerce and law. 

%The Trust Network is motivated by the need to recognize malicious or
%otherwise misbehaving nodes, so that the amount of damage that such
%``evil'' nodes can perform is limited.  In particular, the protocol
%supports some enforcement of good practice by the other nodes, but there
%is still plenty of opportunity for nodes to obey the rules until they
%are sufficiently trusted, and then start breaking the rules in subtle
%ways.  
%For instance, a node might sometimes fail to provide a receipt to
%a share's buddy during a trade, introducing confusion as to whether the
%other side is trying to trick the buddy into believing that the share
%had been traded away.  More destructive would be for a node to never
%query the buddy shares for any of the shares it possesses, or even
%wrongly accuse a different node of losing that share. The list goes on
%and on.  
Careful trust management should enable each node to keep track
of which nodes it trusts. With the cushioning provided by the
information dispersal algorithm, only a large number of the
nodes turning evil will result in actual loss of documents.

Each node needs to keep two values describing each other node it knows
about: trust and metatrust.  Trust signifies a belief that the node in
question will obey the Free Haven Protocol.  Metatrust represents a
belief that the utterances of that node are valuable information.  For
each of these two values, each node also needs to maintain a confidence
and metaconfidence rating.  This serves to represent the ``stiffness''
of the trust value.

%For example, a node which has Trust 100 but Confidence 1 will be trusted
%with a great deal of data, but will lose or gain trust rapidly in
%response to the utterances of other nodes.  Exactly how these values
%are used is left entirely up to each node.

%Some nodes may wish to set all meta-trusts to zero, thus ignoring all
%outside utterances.  Other nodes may wish to set confidence values
%very high, ignoring not only outside utterances but also direct
%evidence of wrongdoing.

Nodes should broadcast referrals in several circumstances, such as when
they log the honest completion of a trade, when they suspect that a
buddy of a share they hold has been lost, and when the trust or
metatrust in a node changes substantially.

%\begin{itemize}
%\item When they log the honest completion of a trade.
%\item When they fail to verify that a buddy of a share they hold is
%safely held. 
%\item When the trust or metatrust in a node otherwise changes
%substantially. 
%\end{itemize}

The Trust Network provides an easy method of adding new nodes and
removing inactive ones.  New nodes can contact ``introducers'' via the
anonymous communication channel; these introducers will then broadcast
referrals of this new node.  Likewise, a node may mark another as
``dormant'' given some threshold of unanswered requests, such that
dormant nodes are not included in broadcasts or trade requests.  If a
dormant node starts initiating requests again, we conclude it is not
actually dormant and resume sending broadcasts and offering trades to
this node.

The design of a decentralized ``web of trust'' network is a complicated
research problem itself, beyond the scope of discussion in this paper.
The trust network should be able to interpret referrals of other nodes,
referrals which are accompanied by receipts, and disagreements between
referrals.  A node should be able to gain and lose trust independently,
and it should recognize when to broadcast its own trust referral. We
reference \cite{sniffen-thesis} for deeper consideration.

%
% Add due diligence - I'm not convinced we have enough -mjf
%
% PGP Key Servers
% \cite{pgpfaq}
% Netscape CERT
% AOL Instant Messenger
% AIM\cite{aolim}
% \section{Mobile Agents for Network Trust}
% MANET\cite{manet} is a DARPA project to produce ``a
% compromise-tolerant structure for information gathering.''

\subsection{Implementation Status}

The Free Haven project is still in its design stages.
Although we have a basic proof-of-concept implementation, we still wish
to firm up our design, primarily in the areas of accountability and
bandwidth overhead.  Before any deployable implementation, we want to
ensure that the Free Haven system offers better \emph{anonymity}
than current systems.  Still, the design is sufficiently simple and
modular, allowing both a straightforward basic implementation and easy
extensibility.  

%For an in-depth discussion of the design, as well as a
%consideration of the social and legal arguments for anonymous
%publication, please see Dingledine's thesis \cite{rogers-thesis}.

%...we have a basic implementation to prove our design. the next step
%is to firm up our newer design and figure out the specifics of how we
%do accountability. we want to be more sure that what we're offering is
%actually better (in terms of anonymity) before we follow through with it.
%on the other hand, the design is simple enough (and modular enough)
%that the basic implementation was pretty straightforward, and extensions
%to it are pretty straightforward too. reword this, eh?

%we should probably put this subsection at the end of design? but i wanted
%to put it in a very clear place. hrm.
%I moved it to 4.2 -mjf
%% moved to after trust network --dmolnar

\section{Attacks on Free Haven}
\label{sec:attacks}

Anonymous publishing and storage systems will have adversaries.
The attacks and pressures that these adversaries may employ might be technical,
 legal, political, or social
in nature. Obviously, the success of technical attacks is contingent
upon the system's security and robustness to attack.  The system's
design and the nature of anonymity it provides also affects 
the success of non-technical attacks.  

% Several of the types of attacks that may
% be employed cannot be handled merely through technology:  these include
% social attacks on system security and servnet node operators, political
% attacks to discourage servnet use, and government and legal attacks to
% shut down nodes or arrest operators. The success of these attacks will
% often depend upon the political and jurisdictional clime of servnet
% nodes' physical location.  On the other hand, the success of technical
% attacks -- from individuals, organizations, corporations, or national
% security agencies -- is contingent upon the system's security and
% robustness to attack. 

% We have defined anonymity in terms of our ideal anonymous publishing
% system in section \ref{sec:anon}.  In doing so, we specified a list
% of protections to provide for system agents and operations.  This
% section describes the various types of attacks an adversary or group of 
% colluding adversaries might use against Free Haven, relating the effect
% of these attacks on the level of anonymity maintained. 

We now consider possible attacks on the Free Haven system based on their
respective targets: on the availability of documents and servnet
operation; on the accountability offered by the trust network; and on
the various aspects of anonymity relevant to anonymous storage and
publication, as described in section \ref{sec:anon}.  For a more
in-depth consideration of attacks, we refer to \cite{rogers-thesis}.
%\cite{rogers-thesis, chapter 7}.

% As the security and
% anonymity of a system are only as strong as its weakest link, we have
% considered all three of these, and take appropriate countermeasures for
% many of these attacks.

% \subsection{Adversaries and Threat Model}
% \subsection{Communication Channel Attacks}

\subsection{Attacks on Documents or the Servnet}

\begin{description}
%\item Attack the time-synchronization protocol to make files expire
%earlier than expected. 

%{\it Prevention:} We rely on the ability of servnet node operators to
%maintain accurate or near-accurate time on their systems. Presumably if
%an adversary has the capacity to successfully attack a system's time
%server or the link to the time server, then the adversary can do other
%attacks as well.

\item[Physical attack:] Destroy a servnet node.

{\it Prevention:} Because we are breaking documents into shares and only
$k$ of $n$ shares are required to reconstruct the document, an adversary
must find and destroy many nodes before availability is compromised.

\item[Legal action:] Find a physical servnet node, and prosecute the
owner based on its contents.  

{\it Prevention:} Because of the isolated-server document-anonymity
property that the Free Haven design provides, the servnet operator may
be able to plausibly deny knowledge of the data stored on his
computer. This depends on the laws of the country in question. 

\item[Social pressure:]  Bring various forms of social pressure against
servnet node administrators.  Claim that the design is patented or
otherwise illegal.  Sue the Free Haven Project and any known node
administrators.  Conspire to make a cause ``unpopular'', convincing
administrators that they should manually prune their data.  Allege that
they ``aid child pornographers'' and other socially-unacceptable
activities.

{\it Prevention:} We rely on the notion of jurisdictional arbitrage.
Information illegal in one place is frequently legal in others. Free
Haven's content-neutral policies mean that there is no reason to expect
that the server operator has looked at the data he holds, which might
make it more difficult to prosecute.  We further rely on having enough
servnet nodes in enough different jurisdictions that organizations
cannot conspire to bully a sufficient fraction of servers to make Free
Haven unusable.

%\item Conspire to make a cause ``unpopular''.  Convince servnet node
%administrators that they don't want to be hosting data for these
%unpopular causes, and that they should manually prune their data.

%{\it Prevention:} We rely on the judgment of servnet administrators to
%choose to support any and all content, if they can get away with it in
%their jurisdiction.  We rely on having enough servnet nodes in enough
%different jurisdictions that organizations cannot conspire to bully a
%sufficient fraction of servers to make Free Haven unusable.

%\item Decrease the number of individuals willing to run nodes by
%making it expensive to move data, or by alleging that node operators
%``aid child pornographers.''

% Increase the personal cost
%of running a servnet or mixnet node, either by adding a monetary cost
%to moving large quantities of data around, or by adding a bad
%reputation such as ``harboring terrorist data and kiddie porn''.

%{\it Prevention:} 

\item[Denial of service:] Attack the servnet by continued flooding of
queries for data or requests to join the servnet.  These queries may use
up all available bandwidth and processing power for a node.

{\it Prevention:} We must assume that our communications channel has
adequate protection and buffering against this attack, such as the use
of client puzzles \cite{JB99}. Most communications channels we are
likely to choose will not protect against this attack. This is a real
problem.

\item[Data flooding:] Attempt to flood the servnet with shares, to use up
available resources.  

{\it Prevention:} The trading protocol implicitly protects against this
type of denial of service attack against storage resources. The ability
to insert shares, whether ``false'' or valid, is restricted to trading:
that server must find another which trusts its ability to provide space
for the share it would receive in return.

Similarly, the design provides protection against the corrupting of
shares.  Altering (or ``spoofing'') a share cannot be done, because the
share contains a particular public key, and is signed by that
key. Without knowledge of the original key which was used to create a
set of shares, an adversary cannot forge new shares for a given
document.

\item[Share hoarding:] Trade until a sufficient fraction of an objectionable
document is controlled by a group of collaborating servers, and then
destroy this document.  Likewise, a sufficiently wealthy adversary could
purchase a series of servers with very large drives and join the
servnet, trading away garbage for ``valuable data.''  He can trade away
enough garbage to have a significant portion of all the data in the
servnet on his drives, subject to deletion.

{\it Prevention:} We rely on the overall size of the servnet to make it
unlikely or prohibitively expensive or for any given server or group of
collaborating servers to obtain a sufficient fraction of the shares of
any given document.  The failure of this assumption would
leave us with no real defense.


% We rely on the accountability from the buddy system to make it
% unprofitable to destroy a share without also destroying its buddy.
% This attack is actually more complicated than just hoping to possess enough
% shares of a document at a given instant in time: adversaries can obtain
% control over certain shares and then refuse to trade those shares away. This
% means that an adversary might over time increase the fraction of the document
% that he controls. The timing and frequency of trades must be modelled,
% based on the expected size of the servnet, to choose parameters that
% prevent this attack.

%\item Conspire to make a cause ``unpopular''.  Convince servnet node
%administrators that they don't want to be hosting data for these
%unpopular causes, and that they should manually prune their data.

%{\it Prevention:} We rely on the judgment of servnet administrators to
%choose to support any and all content, if they can get away with it in
%their jurisdiction.  We rely on having enough servnet nodes in enough
%different jurisdictions that organizations cannot conspire to bully a
%sufficient fraction of servers to make Free Haven unusable.


\end{description}

\subsection{Attacks on the Trust Network}
\label{sec:attacks-trust}

%There are a variety of attacks which are possible on the Free Haven
%system.  Many of these attacks are far outside the scope of the Trust
%Module: social attacks on system security and servnet node operators,
%political attacks to discourage servnet use, government and legal
%attacks to shut down nodes or arrest operators, denial of service
%attacks on the communications anonymous channel, and attacks on the
%infrastructure of the server network.

While attacks against the Trust Network\footnote{Parts of this section
were originally written by Brian Sniffen in \cite{sniffen-thesis}.} 
are related to attacks directly against nodes, their goal is not to
directly affect document availability or servnet operation.  Rather,
these attacks seek to compromise the means by which we provide
accountability for malicious or otherwise misbehaving nodes.

%The Trust Network is motivated by the need to recognize malicious or
%otherwise misbehaving nodes, 

Some of these attacks, such as temporary denials of service, have
negative repercussions on the trust of a node.  These repercussions
might be qualified as ``unfair'', but are best considered in the
following light: if a node is vulnerable to these attacks, it may not be
capable of meeting the specifications of the Free Haven protocol.  Such
a node is not worthy of trust to meet those specifications.  The trust
system does not judge intent, merely actions.

\begin{description}

\item[Simple Betrayal:] An adversary may become part of the servnet,
act correctly long enough to gain trust, then betray this trust by
deleting files before their expiration dates.

{\it Prevention:} The trust economy is designed to make this
unprofitable.  The size-time currency means that a corrupt node has to
initially store data at least equivalent to that it later deletes.
% The simplest attack is this: Become part of the Servnet, earn trust,
% then betray it by deleting files before their expiration dates.  The
% trust economy is designed to make this as unprofitable as possible.
% The size-time currency means that a corrupt node has to donate at
% least as much to the Free Haven as it removes.  This 50\% useful work
% ratio is a rather loose lower bound --- it requires duping a great
% number of high-metatrust nodes into recommending you.  
A node which engages in this behavior should be caught by the buddy
system when it deletes each share.

\item[Buddy Coopting:] If a corrupt node (or group of colluding nodes)
can gain control of both a share and its buddy, it can delete both of
them without repercussions.  
%This means that corrupt nodes can defeat the
%buddy system by capturing both buddies, then deleting them.

{\it Prevention:} We assume a large quantity of shares in the servnet,
making buddy capture more difficult.  Nodes also can modify trust
ratings if precise trading parameters, or constant trading, suggests an
attempt to capture buddies.  More concretely, a possible work-around
involves separating the reply-block addresses for trading and for buddy
checking, preventing corrupt nodes from acquiring the buddies of the
shares they already have.  Such an approach adds complexity, and
possibly opens other avenues for attack.

\item[False Referrals:] An adversary can broadcast false referrals, or
direct these to specific hosts. 

{\it Prevention:} The metaconfidence trust rating can provide a guard
against false referrals, combined with a single-reporting policy (i.e.,
at most one referral per target per source is used for trust
calculations). 
%Based on field tests of Free Haven, we may need to switch to a policy of
%ignoring referrals which do not have receipts.


\item[Trading Receipt Games:] While we believe that the signed
timestamps attest to who did what and when, receipt-based accountability
may be vulnerable to some attacks.  Most likely, these will involve
multi-node adversaries engaging in coordinated bait-and-switch games
with target nodes.

\item[Entrapment:] 
There are several ways in which an adversary can appear to violate the
protocols.  When another node points them out, the adversary can present
receipts which show her wrong and can accuse her of sending false
referrals. A more thorough system of attestations and protests is
necessary to defend against and account for this type of attack.

\end{description}

\subsection{Attacks on Anonymity}

There are a number of attacks which might be used to determine more
information about the identity of some entity in the system.

\begin{description}
\item[Attacks on reader anonymity:] An adversary might develop and
publish on Free Haven a customized virus which automatically contacts a
given host upon execution. A special case of this attack would be to
include mime-encoded URLs in a document to exploit reader software which
automatically loads URLs. Another approach might be to become a node on
both the servnet and the mixnet, and attempt an end-to-end attack, such
as correlating message timing with document requests. Indeed, servers
could claim to have a document and see who requests it, or simply
monitor queries and record the source of each query. Sophisticated
servers might attempt to correlate readers based on the material they
download, and then try to build statistical profiles and match them to
people (outside Free Haven) based on activity and preferences;
%This would develop into a directed marketing campaign similar to
%Amazon's: ``People who have downloaded this share may also like the
%following shares.'' This last attack is perhaps the most insidious one,
%since corporations with a lot of resources might want to take advantage
%of internet publication services to gain more information about users; 
we prevent this attack by using each reply block for only one
transaction.

\item[Attacks on server anonymity:] Adversaries might create unusually
large shares, and try to reduce the set of known servers who might have
the capacity to store such shares. This attacks the partial anonymity of
these servers.  An adversary could become a servnet node, and then
collect routine status and participation information (such as server
lists) from other nodes.  This information might be extended with
extensive knowledge of the bandwidth characteristics and limitations of
the Internet to map servnet topology.  By joining the mixnet, an
adversary might correlate message timing with trade requests or trust
broadcasts.  An alternate approach is simply to spread a Trojan Horse or
worm which looks for Free Haven servers and reports which shares they
are currently storing.

\item[Attacks on publisher anonymity:] An adversary could become a server
and log publishing acts, and then attempt to correlate source or timing.
Alternatively, he might look at servers who might recently have
published a document, and try to determine who has been communicating
with them recently.

\end{description}

There are entirely social attacks which can be very successful, such as
offering a large sum of money for information leading to the current
location of a given document, server, reader, etc.

We avoid or reduce the threat of many of these attacks by using an anonymous channel 
which supports pseudonyms for our communications. This prevents most or all adversaries from being
able to determine the source or destination of a given message, or
establish linkability between each endpoint of a set of messages.  Even
if node administrators are subpoenaed or otherwise pressured to release
information about these entities, they can openly disclaim any
knowledge.  Obviously, the level of anonymity provided by the 
is based on its robustness to traffic analysis and similar attacks.


%\section{Anonymity Provided By Free Haven}
%Free Haven does not provide every kind of anonymity 
%described in section X.

\section{Related Work}
\label{sec:related-work}

There are a number of projects and papers which discuss anonymous
publication services.  We start this section by providing an overview of
some of the related projects and papers. After this section, we continue
by examining the amount of anonymity that each project offers.
%in detail the amount of anonymity and privacy protection 


\subsection{The Eternity Service}

This work was inspired by Anderson's seminal paper on The Eternity
Service \cite{eternity}.  As Anderson
wrote, ``[t]he basic idea is to use redundancy and scattering
techniques to replicate data across a large set of machines (such as the
Internet), and add anonymity mechanisms to drive up the cost of
selective service denial attacks.''

A publisher uploads a document and some digital cash, along with a
requested file duration (cost would be based on document size and
desired duration). In the simple design, a publisher would upload the
document to 100 servers, and remember ten of these servers for the
purposes of auditing their performance. Because he does not record most
of the servers to whom he submitted the file, there is no way to
identify which of the participating eternity servers are storing his
file. Document queries are done via broadcast, and document delivery is
achieved through one-way anonymous remailers.

There are issues which are not addressed in his brief paper: for
instance, if documents are submitted anonymously but publishers are
expected to remember a random sample of servers so they can audit them,
what do they do when they find that some server is cheating?  
%Since publishers are anonymous, it would seem that they have no power
%at all. 
Anderson passes this responsibility on to the digital cash itself, so
servers do not receive payment if they stop providing the associated
service.  He does not elaborate on the possible implications of this
increased accountability to the anonymity of the publishers.
%increased accountability to the anonymity of the authors.

Eternity has several problems that hinder real-world deployment.  Most
importantly, Eternity relies on a stable digital cash scheme, which is
not available today.
%If this digital cash can be exchanged for ordinary cash, then
%Eternity has a strong correlation between ability to store data into the
%system and amount of real-world capital available. While our proposal
%does have a loose correlation between available resources and amount of
%influence over the servnet, the correlation is not nearly as direct.
There is no consideration to maintaining a dynamic list of available
servers and allowing servers to smoothly join and leave.  Anderson
further proposes that a directory of files in the system should itself
be a file in the system.  However, without a mechanism for updating or
revising files, this would appear very difficult to achieve.

\subsection{Napster}

The Napster service \cite{napster} is a company based around connecting
people who are offering MP3 files to people who want to download them.
While they provide no real anonymity and disclaim all legal liability,
an important thing to note about the Napster service is that it is
highly successful. Thousands of people use Napster daily to exchange
music; if there were greater security (and comparable ease of use), we
suspect that many thousands more would participate. The existence of
Napster shows that demand exists for a distributed storage and retrieval
service.

\subsection{Gnutella}

Gnutella \cite{gnutella} is a peer-to-peer Napster clone. Developed
originally by Nullsoft, it is currently maintained as an open source
project. The Gnutella developers claim that querying the network is ``anonymous.''
Analysis of the Gnutella protocol reveals features which make this statement problematic.

The header of a Gnutella packet includes two fields: \emph{TTL} (time to
live: the number of additional hops after which the packet should be
dropped) and \emph{Hops taken} (the number of hops this packet has made
since its creation). The TTL is started at some default value based on
the expected size of the network, and the Hops value is effectively an
inverse of the TTL during the travel of the packet. Because the Hops
value is 1 when the packet is initially sent, it is clear when a server
is generating a query.

% (assuming it is playing by the protocol, which for the vast 
% majority of users is a reasonable assumption). 

In addition, while the protocol is designed for a user to set up
connections with his ``friends'', there is no infrastructure in place
for finding new friends. Instead, the Gnutella site offers a ``default''
set of friends with which users can start. Most users will never change
this file if the service is functional. This means that the actual
network is a hierarchical system, as shown in pictures of the Gnutella
network topology \cite{gnutellamaps}.  There are a small number of
central nodes which would be ideal targets for collecting information about users.



%If the TTL and Hops fields aren't enough, every message in the system is
%stamped with a Globally Unique Identifier (GUID). This identifier is used
%to recognize messages which have already been seen, so they are not passed
%on redundantly. This prevents the Gnutella network from an exponential
%explosion of messages, since the default TTL is typically as high as 25.

In addition, only queries are protected. The actual downloads are done by point-to-point connections, meaning that
the IP addresses of server and reader are both revealed to each
other. This is done for reasons of efficiency, but it is far from
anonymous.

Sites such as the Gnutella Wall of Shame \cite{gnushame}, which attempts
to entrap child pornographers using the Gnutella service, demonstrate
that the direct file-transfer portion of the Gnutella service does not
adequately protect the anonymity of servers or readers.

% Gnutella is not designed to be an anonymous communications or
% publication network. Gnutella is a network designed to provide {\em
% availability} for data from one user to the next, and it does a
% relatively good job of this.
%% yup, but we're worrying about their anonymity here -dmolnar

%http://gnutella.wego.com/go/wego.pages.page?groupId=116705&view=page&folderId=118398&pageId=118400 
%http://capnbry.dyndns.org/gnutella/protocol.html
%http://www.rixsoft.com/Knowbuddy/gnutellafaq.html


\subsection{Eternity USENET}

Adam Back proposed \cite{eternityusenet} a simpler implementation of the
Eternity Service, using the existing Usenet infrastructure to distribute
the posted files all around the world.  

%This is an excellent idea, but on
%the other hand this limits the participating servers to systems which
%already host Usenet news. Further, news administrators much specifically
%choose to participate in this variant of the eternity service, and so
%they may well choose not to carry the `alt' groups that comprise the
%service.


To achieve anonymity in publishing, Eternity Usenet employs cypherpunks
type I and type II (mixmaster) remailers as gateways from email to newsgroups.
Publishers PGP-sign documents which they wish to publish into the system:
these documents are formatted in html, and readers make http search or
query requests to `Eternity Servers' which map these requests into
NNTP commands either to a remote news server or a local news spool. With
the initial implementation, the default list of newsgroups to read
consists only of  {\tt alt.anonymous.messages}. The Eternity Server
effectively provides an interface to a virtual web filesystem which
posters populate via Usenet posts. Eternity Usenet uses normal Usenet mechanisms for retrieval, posting,
and expiring, so publishers may not have control over the expiration time or
propagation rate of their document. 

Reader anonymity for Eternity USENET is provided when the system is 
used in "local proxy" mode, in which the user downloads the entire
eternity newsgroup from a remote server. The server can still link
the reader to that day's contents of an eternity newsgroup, so the
reader anonymity is not as strong as we might like. 

Back treats Usenet as an append-only file system. His system provides
support for replacing files (virtual addresses) because newer posts
signed with the same PGP key are assumed to be from the same
publisher. Addresses are claimed on a first-come first-served basis, and
PGP signatures provide linkability between an initial file at a given
address and a revision of that file. It is not clear what happens 
when two addresses are claimed at once -- since Usenet posts may
arrive out of order, it would seem that there might be some subtle
attacks against file coherency if two different Eternity Servers have a
different notion of who owns a file.

While the system is not directly `censorable' as we usually
consider it, the term `eternity' is misleading. Usenet posts expire
based on age and size. Back does not provide an analysis of how long a
given document will survive in the network. The task of making a
feasible distributed store of Eternity documents is left as a future
work. 

%Four public-access Eternity Servers are listed at the end of the
%article; none of these servers is still available. This indicates that
%active work on Eternity Usenet is not ongoing.

%\subsubsection{Adam Back's email to freehaven-dev}

%This is just included for reference.  Possibly some of above things
%should be modified.  We might wish to email Avi Rubin, Freenet, and
%Gnutella guy for reference that our interpretation of their system is
%correct.  We of course, can judge what they say...

%In the freehaven paper in table 3.1: Overview: Computational Anonymity,
%Eternity USENET is listed as providing publisher anonymity only.

%- Eternity USENET actually provides pretty aggressive Reader Anonymity:
%consider how hard it would be to track down which internet users read a
%given alt newsgroup post.

%The confusion may come from the availability of two modes of use of the
%Eternity USENET client: local proxy, and public proxy.  The public proxy
%is just to give people something to play with without having to install
%software.  The local proxy version provides reader anonymity.

%- Eternity USENET doesn't provide server anonymity, but it doesn't need
%to because all USENET servers are coopted into being servers, and there
%are many of them.  It doesn't provide server anonymity for public
%proxies, but service remains available to local proxies if public
%proxies are taken down.

%- Also Eternity USENET provides document anonymity, the USENET article
%can be encrypted with a key derived from the URL.

%Also for Eternity USENET public proxies they have the option of
%encrypting their cache contents based on URL which provides weak
%deniability -- at least it doesn't currently know the URL, or stored
%document contents, though it must see the URL and content during access
%as there is no client software in this mode.

%Eternity USENET has it's problems, which are:

%- for the local proxy client, the user only sees documents broadcast
%since it installs the software.  To fix this the author, interested
%reader, or agent would have to periodically republish documents.

%- scalability - there must be a limit to how many megabytes per day of
%Eternity USENET posts will be tolerated by USENET server admins.
%alt.anonymous.messages may be dropped in retaliation.  Of course
%Eternity USENET can work from any and all USENET groups, but the act of
%posting lots of unreadable binaries to arbitrary newsgroups would be
%considered abusive.

%- the race condition you discuss -- my thoughts on a way to combat this
%would be to do a bit commitment to the document first, leave it a few
%days to distribute, then publish the document proper.

%- For the open public proxies, their problems are the limited number in
%existance, lack of server anonymity coupled with limited number of
%servers.

%Also section 3.2.3 on Eternity USENET comments that the participating
%servers have to already host USENET news.  This is wrong.  All the
%participating public server, or local proxy client needs is access to a
%USENET server.

%I ran a local proxy client for a while over a dial up line with the
%local proxy client talking to the ISPs news server using NNTP.


\subsection{Freenet}

Like Gnutella, Freenet \cite{freenet} is a peer to peer network of servers.
 When a user wishes to
request a document, she hashes the name of that document (where she gets
this name is outside the scope of Freenet) and then queries her own
server about the location. If her server does not have it, it passes the
query on to a nearby server which is ``more likely'' to have it. Freenet
clusters documents with similar hashes nearby each other, and uses a
routing protocol to route queries ``downhill'' until they arrive at the
desired document.

%Freenet is similar to Gnutella in that its main purpose is to provide
%highly {\em available} data to its users.  There are also a number of

%% that's not what they say! 

%differences between Freenet and Gnutella. First of all,

%% is this important to our related work discussion? -dmolnar

%  Freenet caches
% data near requestors. Specifically, when a request is answered over a
% given path, all servers along that path cache that document. This means
% that future queries for that document will be answered very quickly,
% assuming the copies of that document haven't expired from the nearby
% caches. This introduces another important feature: persistence of
% data. Because nodes cache documents as they are requested, the documents
% do not disappear from the system when the server offering those
% documents disappears.  This is a key improvement over Napster and
% Gnutella. 

Freenet bases document lifetime on the popularity of the document:
frequently requested files get duplicated around the system, whereas
infrequently requested files live in only a few places or die out
completely.  While this is a valid choice for a system that emphasizes
availability and efficiency, it precludes certain uses of the
system. For example, Yugoslav phone books are currently being collected
``to document residency for the close to one million people forced to
evacuate Kosovo"\cite{yugo}; these phone books might not survive a popularity contest.

% For example, photos of JFK Jr. saluting his father existed for
% years, but only became ``popular'' after his untimely death.

%Examples
%include photos of JFK Jr.\ saluting his father, or a (timestamped) Idaho
%phone book that has those ten extra names that the FBI might one day be
%accused of `erasing'. Indeed, this is already happening
%For example, \cite{yugo} describes a case where Yugoslav phone
% books are being collected ``to
%document residency for the close to one million people forced to
%evacuate Kosovo.'' 

Freenet explicitly sets out to provide anonymity. Their goals include
both sender and reader anonymity, as well as plausible deniability for
servers -- the notion that a server does not know the contents of
documents it is storing. They provide this last, which we called
isolated-server document-anonymity, by
referencing files by $H(name)$ and having users encrypt the
documents themselves with $name$ before inserting them. This means that
anybody who knows the original $name$ string can decrypt the document,
but the server storing the document is unable to invert $H(name)$ to
determine $name$.

Freenet  has a similar potential flaw with publisher- and
reader-anonymity to
Gnutella, due to the presence of the TTL and Depth (comparable to
Hops) fields in the Freenet message headers. Freenet takes steps to avoid the
problems of Gnutella's Depth and TTL headers by randomly assigning values
to both fields, so that a depth of 1 does not necessarily mean that a
request originated with a given node. Packets
with TTL 1 are randomly expired or forwarded onwards.

Document requests are also sent through the caching-enabled network (rather than
peer-to-peer as they are in Gnutella). Because of these measures, Freenet has ``more'' anonymity
than Gnutella provides.

% what does this mean?  ahh...the whole caching/routing bit

Further, statistical attacks similar to
those described in the Crowds \cite{crowds} paper might work to pinpoint
the location of a given reader or publisher; caching provides protection
against this since the network topology for a given document changes
after each request. These attacks need to be analyzed further.

Another attack to consider is that simply doing a request from a strict
political environment will cause a copy of the data to migrate into that
jurisdiction, giving law enforcement more reason to shut those servers
down. On the other hand, it may well be that such environments will
close down servers without requiring ``sufficient cause'', so the
political and legal arguments are meaningless.

% Because of the above, we consider that Freenet provides 
% ``weaker'' anonymity than Free Haven. 
%% that's not going to fly. 


\subsection{Publius}

Publius \cite{publius} attacks the problem of anonymous publishing from
a different angle, employing a one-way anonymous channel to transfer
documents from publishers to servers.
%. Rather than trying to come up with a routing protocol like
%Gnutella and Freenet, Publius simply employs a one-way anonymous
%channel to transfer documents from publishers to servers. 
The Publius protocol is designed to maintain availability of documents
on these servers.

In this system, a publisher generates a key $K$ for her document, and
encrypts the document with this key. She performs Shamir's
secret-sharing algorithm to build a set of $n$ key shares, any $k$ of
which is sufficient to reconstruct $K$. From there, she
chooses some $n$ of the Publius servers and anonymously delivers the
encrypted message plus one share to each of these $n$ servers.

In this way, the document is replicated over each server, but the key is
split over the $n$ servers. Document reading is implemented by running a
local web proxy on the reader's machine; the $n$ addresses chosen as
servers are concatenated into a URL which is presumably published or
otherwise remembered.  The local proxy fetches each share independently,
reconstructs the original key $K$, and then decrypts the document.

% This is weaker than us - all readers know the location of documents to
% get from :(  -mjf

%(Publius provides no description of how a
%directory service might be built, or any other mechanism for remembering
%URLs after documents are inserted.)
%% NEITHER DO WE -- dmolnar

%The URL used to retrieve the document specifies all $n$ servers at which
%the document was stored. To retrieve a given document, the local proxy
%fetches each share independently, reconstructs the original key $K$, and
%then decrypts the document.

The Publius system provides publisher-anonymity by means of a one-way
anonymous channel between authors and servers. In addition, because
Shamir's secret-sharing protocol is used and each server only receives
one share, Publius provides both computational and information-theoretic
isolated-server document-anonymity: a single server is not able to
determine anything about a document it stores.

A minor flaw is that readers cannot determine if a share is corrupt
simply by examining it: the reader must request all of the shares and
attempt to reconstruct in order to determine the integrity of a share. A
verifiable secret sharing scheme \cite{vss} might make the system more
efficient.

The entire scheme is based on a static, system-wide list of available
servers. Since these servers are permanent, there is no support for
adding new servers or purging dead ones. More importantly, there is no
support for recognizing misbehaving servers and removing them.

% Publius is by far the strongest related work in terms of our notions of
% anonymity. The paper is very well-written and goes into considerable
% detail on various attacks and counters to those attacks.


\section{An Analysis of Anonymity}
\label{sec:analysis-anon}

We describe the protections offered for each of the broad categories of
anonymity.  In Table \ref{table:real-anon-pub}, we provide an overview
view of Free Haven and the different publishing systems which we
examined.  We consider the level of privacy provided -- computational
(C), information-theoretic (I-T), and perfect-forward (P-F) anonymity --
by the various systems.  

Computational anonymity means that an adversary modelled as a polynomial-time
Turing Machine has no better than a $\frac12 + neg(k)$ chance of breaking anonymity, for 
some reasonable security parameter $k$ and negliglible function $neg(k)$. Information-theoretic is the same, except
with no computational restrictions on the adversary. Perfect forward anonymity is analogous to perfect forward secrecy : a system is perfect forward anonymous if no information remains 
after a transaction is complete which could later identify the participants if
one side or the other is compromised.

%Informally, for polynomially-bounded adversaries who have passive access
%to some of the edges between agents, computationally-strong anonymity
%indicates that the adversary has less than a $\frac12 + \frac{1}{n^k}$
%(for some constant $k$) chance of correctly guessing if an individual is
%involved in any given transaction.
%
%
%  Do we want to add one line about partial anonymity here:  -mjf
%
%The definition is slightly stronger than partial anonymity concepts of
%``possible''  innocence:  namely, the adversary has individual is not
%exposed, but the ...
%
%the identity of one of the individuals involved in any given
%transaction.   
%% I changed this -- it isn't the identity taken from the large pool of
%% possibilities, it's whether, given a person, is she involved?


\begin{table}[htc]
%\begin{center}
\begin{tabular}{|c|ccc|ccc|ccc|cc|cc|}
\hline
Project     & \multicolumn{3}{c}{Publisher} \vline & \multicolumn{3}{c}{Reader} \vline & \multicolumn{3}{c}{Server} \vline & \multicolumn{2}{c}{Document} \vline & \multicolumn{2}{c}{Query} \vline \\
            & C    & I-T & P-F & C    & I-T & P-F & C    & I-T & P-F & C    & I-T & C    & I-T \\
\hline
Gnutella    &      &     &     &      &     &     &      &     &     &      &     &      &     \\
\hline
Eternity Usenet & \cm &     & \cm & ? && ?& &&& && & \\
\hline
FreeNet     &  ?    &     &  ?   &  ?    &     &  ?   &      &     &     &
\cm  &     &      &     \\
\hline
Publius     & \cm  &     & \cm &      &     &     &      &     &     & \cm  & \cm &      &     \\
\hline
Free Haven  & \cm  &     & \cm & \cm  &     & \cm & \cm  &     &     & \cm  &     &      &     \\
\hline
FH + ideal mix &\cm&\cm&\cm & \cm  & \cm & \cm & \cm  & \cm &\cm  & \cm  &     &      &     \\
\hline
\end{tabular}
\caption{Anonymity Properties of Publishing Systems}
\label{table:real-anon-pub}
\end{table}
%\end{center}

For the purposes of this discussion, we will assume that the anonymous channel 
used by Free Haven is only computationally secure. An ``ideal mix'', in contrast,
would be a perfectly anonymous channel which resists any attack we of which we could 
think, including those by a computationally unbound adversary. All anonymous channels
are assumed to keep no records of communication.

Free Haven provides computational and perfect forward author anonymity, because authors communicate to 
publishers via an anonymous channel. Servers trade to other servers via 
pseudonyms, providing computational but not perfect forward anonymity, as the 
pseudonyms can be broken later. Because trading is constant, however, Free Haven
acheives publisher anonymity for publishers trying to trade away
all shares of the same document. The use of IDA to split documents provides
isolated-server document anonymity, but the public key embedded in each share
(which we require for authenticating buddy messages) makes it trivial for
connected servers to discover what they are storing. Because requests are
broadcast via an anonymous channel, Free Haven provides computational reader 
anonymity, and different reply blocks used and then destroyed after each
request provide perfect forward anonymity. 

Gnutella fails to provide publisher-anonymity, reader-anonymity, or
server-anonymity because of the peer-to-peer connections for actual
file transfer. Because Gnutella servers start out knowing the intended
contents of the document they are offering, they also fail to provide
document-anonymity.

Eternity Usenet provides publisher anonymity via the use of one-way
anonymous remailers. Server anonymity is not provided, because every 
Usenet server which carries the eternity newsgroup is a server.
Back has pointed out that isolated-server document anonymity can be provided
by encrypting files with a key derived from the URL; connected servers 
might find the key and attempt to decrypt stored documents. Reader anonymity
is not provided by open public proxies unless the reader uses an anonymous
channel because the proxy can see what a user queries, downloads, and at what time.
For local proxies, which connect to a separate news server, however, the situation
is better because the news server knows only what the user downloads. Even so, 
this is not quite satisfactory, because the user can be tied by the server to
the contents of the eternity newsgroup at a certain time. 

% Reader anonymity is not protected, and it is clear
% that a Usenet service that offers Eternity files is carrying the
% Eternity feed. Because each downstream host gets its only entire copy of
% the feed, there is no document-anonymity in Eternity Usenet. 

Freenet achieves isolated-server document-anonymity because servers are unable to
reverse the hash of the document name to determine the key with which to
decrypt the document. For connected-server document anonymity, the servers can
check whether they are carrying a particular key, but cannot easily match a stored
document to a key due to the hash function.  Server-anonymity is not provided because given a
document key, it is very easy to locate a server that is carrying that
document -- querying any server at all will result in that server
carrying the document!  Because of the TTL and Hops fields for both
reading and publishing, it is also not clear that Freenet achieves publisher- or
reader-anonymity, although they are much better in these regards than Gnutella.


Publius achieves document-anonymity because the key is split between the
$n$ servers, and without sufficient shares of the key a server is unable
to decrypt the document that is stores.  The secret sharing algorithm
provides a stronger form of this anonymity (albeit in a
storage-intensive manner), since an isolated server really can learn
nothing at all about the contents of a document that it is helping to
store.  Because documents are published to Publius through a one-way
anonymous remailer, it provides publisher-anonymity. Publius
provides no support for protecting readers by itself, however, and the servers containing a
given file are clearly marked in the URL used for retrieving that file. Readers
can use a system such as ZKS Freedom or Onion Routing, to protect themselves, but
servers may still be liable for storing ``bad'' data.

% Certainly a server that is not actively trying to
% rebuild a document (for instance, by doing a request for the other
% shares) will not have the document available, since it is broken into
% shares before publication. 

%This second table provides a more detailed view of the different
%publishing systems which we examined. There are a number of interesting
%points that become clearer once we increase the resolution of the
%table. 

We see that systems can often provide publisher anonymity via one-way
communication channels, effectively removing any linkability; removing
the need for a reply pseudonym on the anonymous channel means that there is
``nothing to crack''.  The last line in this table is important to note. 
 While it implies that Free Haven achieves anonymity in many areas, this is misleading: the ideal
anonymous channel is actually providing the first nine aspects of
anonymity. Assuming a robust ideal anonymous channel, there would be no
linkability between transactions, and mere computational ability on the
part of the adversary would be insufficient to identify the parties in a
transaction.

This would mean that we could leave most of the anonymity to
the communication channel itself, and provide a simple back-end file
system or equivalent service to transfer documents between agents.  Thus
the design of the back-end system could be based primarily on addressing
other issues such as availability of documents, protections against
flooding and denial of service attacks, and accountability in the face
of this anonymity.

% Unfortunately, the design and deployment of such an ideal anonymous
% channel is a very challenging task, and one left to future innovation
% and development. Until then, we focus on providing increased anonymity 
% with a combination of more conventional mix networks and a publication
% system tailored for the mixnet.  

%% This section is for laying out questions like 
%%  "how do we go about asking 'how can we be sure this is anonymous?' "
%% and questions like "what model is best for modelling anon pub systems
%% so we can prove things about them? 

\section{Future Work} 

Our experience designing Free Haven revealed several problems which
proved to be too hard to solve at this time. We state some of these
problems here and refer to the first author's thesis for in-depth
consideration \cite{rogers-thesis}.

\begin{description}

\item[Deployed Free Low-Latency Pseudonymous Channel:] Free Haven
requires pseudonyms in order to create server reputations. The only
current widely deployed channels which supports pseudonyms seem to be
the Cypherpunk remailer network \cite{nymserver} and ZKS
Freedom mail.  These networks run over SMTP and consequently have high
latency. This high latency complicates protocol design. In addition, Freedom
mail requires payment and is not yet open source. Payment may be complicated 
for international users or may allow participation to be traced, and open source
facilitates peer review of the implementation. 

\item[Accountability and Trust:] We found it extremely difficult to
reason about the accountability in Free Haven, especially when
considering the ``buddy system.'' At the same time, accountability is
critical to ensuring that documents remain available in the
system. Future work in this area might develop an ``anonymous system
trust algebra'' for formally reasoning about a servnet node's
reliability based on various circumstances which would allow us to
verify trust protocols. 

\item[Modelling and Metrics:] When desiging Free Haven, we made some
choices, such as the choice to include trading, based only on our
intuition of what would make a robust, anonymous system. A mathematical
model of anonymous storage would allow us to test this intuition and run
simulations. We also need \emph{metrics}: specific quantities which can
be measured and compared to determine which designs are ``better.''  For
example, we might ask ``how many servers must be compromised by an
adversary for how long before any document's availability is
compromised? before a specific targeted document's availability is
compromised?'' or ``how many servers must be compromised by an adversary
for how long before the adversary can link a document and a publisher?''
This modelling might follow from the work of Gulcu and Tsudik
\cite{babel}, Kesdogan, Egner, and Buschkes \cite{sg-mix}, and
Berthold, Federrath, and Kohntopp \cite{web-mix} which apply
statistical modelling to mix-nets.

\item[Formal Definition of Anonymity:] Closely related to the last point
is the need to formalise the ``kinds of anonymity'' presented in section
\ref{sec:anon}. By formally defining anonymity, we can move closer to
providing meaningful \emph{proofs} that a particular system provides the
anonymity we desire. We might leverage our experience with cryptographic
definitions of semantic security and non-malleability to produce similar
definitions and proofs\cite{GOLDREICH-BOOK}. A first step in this
direction might
be to carefully explore the connection remarked by Rackoff and Simon between 
secure multiparty computation and anonymous protocols\cite{traf-analysis}.

\item[Usability Requirements and Interface:] We stated in the
introduction that we began the Free Haven Project out of concern for the
rights of political dissidents.  Unfortunately, at this stage of the
project, we have contacted few political dissidents,
and as a consequence do not have a clear idea of the usability and
interface requirements for an anonymous storage system. Our concern is
heightened by a recent paper which
points out serious deficiencies in PGP's user interface \cite{johnny}.

\end{description}

We consider the above to be ``challenge problems" for anonymous publication 
and storage systems.

\section{Conclusion}

We have presented a design for a content-neutral decentralized anonymous
storage service and investigated the kinds of anonymity required for
such a service.  The design protects anonymity at least as much, and in
some cases more, than related work.  Unfortunately, we have also shown
that this does not say that much, because large problems remain with
evaluating the level of protection provided by any of these systems,
much less providing ``good'' protection. Until the risks involved in
using such systems can be better evaluated, they cannot be used in good
conscience for situations where failure is not an option. Much more work
remains.

\section*{Acknowledgements}

Professor Ronald Rivest provided invaluable assistance as Roger's
Masters and Michael's Bachelors thesis advisor and caused us to think
long and hard about our design decisions. Professor Michael Mitzenmacher
made possible David's involvement in this project and provided
insightful comments on information dispersal and trading.  Beyond
many suggestions for overall design details, Brian Sniffen provided the
design and background for the trust system, and Joseph Sokol-Margolis
was useful for considering attacks on the system. Adam Back and Theodore Hong
commented on our assessment of their systems and made our related work section 
much better. Furthermore, we thank Susan Born,
 Nathan Mahn, Jean-Fran\c cois Raymond, Anna Lysyanskaya, Adam Smith,
Brett Woolsridge, and our anonymous referees for
further insight and feedback.

%% Rivest, Michael Mitzenmacher, brians, nmahn, seph, susan, brett woolsridge, 
%% Anna L, Adam S, Adam B, and everyone else I've forgotten. 

\bibliographystyle{plain} \bibliography{freehaven.bib}


\end{document}


