\documentclass[11pt]{article}
\usepackage[bottom]{draftcopy}
\usepackage[english]{babel}
\usepackage{epsfig}
\usepackage{amscd}
\textheight 9.0 in
\textwidth 6.5 in
\voffset -0.5 in
\hoffset -0.8 in
\draftcopyName{DRAFT}{0}

\begin{document}
\title{Towards a simplified trust network and trading protocol: \\
Decision-making based on hash-space partitions}


\author{Michael J. Freedman \\
	The Free Haven Project \\
	mfreed@mit.edu
}

\dateamerican
\maketitle

\begin{abstract}
We present a new trading protocol for the Free Haven Project.  Trade
decision-making is based on the intersection of node hash-space
partitions and the mapping of share content onto the hash-space.  We
remove the concept of share buddies and trust broadcasts.  We greatly
simplify the Trust Network and its associated trust algebra:  all trust
is determined locally via directly observed behavior.  We describe how
these new concepts may effect the Free Haven system, and present some
open questions.
\end{abstract}

\section{Introduction}
The current trust system for the Free Haven design remains fairly
abstract.  The model has concepts of buddy shares querying each other,
buddy squawking given the appearance of bad behavior, trust broadcasts,
trust referrals, and personal tabs kept about other nodes.

No real model nor proof-of-concept has been made to adequately express
the complexity of this problem.  Indeed, the current concept of trust
algebra is a complicated one, and remains a largely unknown variable of
the Free Haven design.  Even with these possible shortcomings, the means
to achieve accountability is largely offloaded to the Trust Network.

We present a new trading model for Free Haven, based on deterministic
trade-partner identification via hash-space address partitioning.  Using
this new technique, we simplify the requirements and usage of the Trust
Network.  

\section{Design}

\subsection{Overview}

Some node$_A$ that wishes to trade a share will query is hash-space
database for a possible trade-destination for the share (say, node$_B$).
Node$_A$ will attempt to trade with node$_B$.  If the trade is not
successful, node$_A$ will attempt to trade a different share than
initially chosen to that share's trade-destination (most likely
different) node.  If the trade with node$_B$ is successful, node$_A$
will query node$_B$ some time after the trade occurs.  This query seeks
to determine if node$_B$ still has (or can attain) a copy of the traded
share.  Node$_A$ will adjust its trust in node$_B$ accordingly.

\subsection{Data Storage and Structures}
\label{subsec:db}

Each node in the servnet will maintain four database: the node database,
the trust database, the hash-space database, and the trade
database.\footnote{The node and trust db can be combined in relational
databases; the trade database may be merely a flat list.}

\begin{description}

\item[Node-Trust Information]
The node database maintains a mapping from $PK_{node_i}$ to the
node$_i$'s relevant information, which includes the node's reply block,
mixnet type, tec.  The trust database relates the node's public key to
trust metrics in the system, similar to the current Trust DB model.

\item[Hash-Space Information]
The hash-space database maps discrete time blocks (e.g., one week) to a
partitioning of the (SHA-1) hash space, along with some nonce selected
at random for that time interval.  The hash space is partitioned as
follows:

For a given servnet node$_A$, consider some hash-space size of $M$ and a set
of other nodes of order $n$ (the set of all nodes in the servnet $-$
itself).  A node$_i$ is selected at random from this set of known
nodes.  Starting at the front of the hash space, a partition of the
node$_A$'s hash space of size

\begin{displaymath}
\Big(
\frac{\mathrm{trust}_{node_i}}
     {\sum_{j=1}^{n}{\mathrm{trust}_{node_j}}}
\Big) * M
\end{displaymath}

is assigned to node$_i$.  This partitioning occurs until the entire hash
space is divided among node$_A$'s known nodes.  

A key representing the currency time intervale is then mapped in the
hash-space database to this partitioning and some random nonce.

\item[Trade Information]
The trade list/database maintains a list of trade receipts, which we
will describe in greated depth in Table \ref{table:trading}, while
explaining the new trading protocol. 

\end{description}

\subsection{Time Interval Initiation}

When a new time interval begins for a node (say, $time_{k_A}$ for
node$_A$), the node performs hash-space partitioning due to trust.  This
process is described above in section \ref{subsec:db}.  Note that time
intervals are relative, thus we require no time synchronization between
nodes.  We also generate an random nonce $r_{time_{k_A}}$ for this time
interval.  We generate a mapping of all shares to possible trade
destinations due to the share contents and the nonce.  That is, 
given some share $\Phi_i$, the value $hash(\Phi_i, r_{time_{k_A}})$ will
fall within some partition of the hash-space (say, node$_B$). 

If, during this time interval, node$_A$ selects some share to trade, the
node with which she can trade this share is specified by this
pre-computed mapping.   If node$_B$ receives a trade request from
node$_A$, he can only agree to trade if there exists some share in his
share db for which its $hash(share, nonce)$ maps to node$_A$.


\subsection{Trading Protocol}

At some time during $time_{k_A}$, node$_A$ selects some share $\Phi_i$
to trade away.  The share-to-partition mapping procedure, which was
precomputed during time interval initiation, shows the trade-destination
for $\Phi_i$ to be node$_B$.  If node$_B$ has some share $\lambda_j$,
which maps to node$_A$ during his share-to-partition mapping, node$_B$
will agree the to trade.

See Table \ref{table:trading} for a description of the trading protocol.


\begin{table}[htc]
\begin{center}
\begin{tabular}{|l|c|l|}
\hline
Node A & comm (encr/auth) & Node B \\
\hline
\hline
At $time_{k_A}$: & & \\
Partition hash-space & & \\
& & At $time_{l_B}$: \\
& & Partition hash-space \\
... & & ... \\
Select $\Phi_i$ from share db & & \\
Determine target for $\Phi_i$: & & \\
$hash(\Phi_i, n_{time_{k_A}}) \rightarrow B$ & & \\
& $\stackrel{\Phi_i, exp_{\Phi}, ...}{------>}$ & \\
& & Share in partition$_A$? \\
& & If no, then NAK. \\
& & Else choose $\Lambda_j$ \\
& $\stackrel{\Lambda_j, exp_{\Lambda}, ...}{<------}$ & \\
:share: $\Lambda_j$ & & share: $\Phi_i$ \\
trade: $hash(\Phi_i), B$ & & 
trade: $hash(\Lambda_j), A$ \\
trade: $time_{k_A}, exp_{\Phi}$ & &
trade: $time_{l_B}, exp_{\Lambda}$ \\
\hline
\end{tabular}
\caption{{\bf Basic Trading Protocol.} Note that the [db: {\it text}]
notation means that the specified node stores {\it text} into the
corresponding database.}
\label{table:trading}
\end{center}
\end{table}


\subsection{Accountability}

At some time during interval $time_{k_A}$, node$_A$ will perform spot
checks on the retrievability of traded-away shares.  One possible
engineering design choice would be to perform these checks with a
frequency proportional to the node's load and bandwidth.

A receipt is chosen from the trade list, with probability inversely
related to its age ($current\_time - time\_of\_trade$).\footnote{In the
trade protocol description, we only record time interval of trade.
Based on the granularity of time intervals, we may also wish to record
exact time of trade within the interval, using this number for the
``age'' calculation.}  We should probably weight this probability
function by how long the share should remain in the servnet
($expiry\_time - current\_time$).  Shares with long expiry times require
more careful consideration.

This receipt includes the node with which the share was traded (say,
node$_B$).  Node$_A$ will send the recorded $hash(share)$ and some {\it
new} random nonce $r_{query}$ to node$_B$.  Node$_B$ performs the
following check:

\begin{enumerate}
\item If one of his shares has a matching $hash(share)$ value, node$_B$
returns $hash(share, r_{query})$ to node$_A$.  Node$_A$ is happy with
node$_B$. 
\item Elseif a receipt in his trade db has the same $hash(share)$,
node$_B$ sends a request to the node with which he traded (say,
node$_C$), along with the {\it original} $r_{query}$.  In other words,
node$_B$ plays a ``man in the middle.''
\begin{enumerate}
\item If node$_B$ eventually receives $hash(share, r_{query})$ back from
node$_C$ (who performs similar operations as B), this value is passed
back to node$_A$. Node$_A$ is happy with node$_B$.
\item Else node$_B$ receives nothing back from node$_C$, and thus does
nothing further.  Node$_B$ is not happy with node$_C$; Node$_A$ is not
happy with node$_B$.
\end{enumerate}
\item Else node$_B$ does nothing. Node$_A$ is not happy with node$_B$.
\end{enumerate}

Node$_A$ modifies her trust in node$_B$ whether she is happy or not.
This suggests that nodes always expect their trade partners to be able
to retrieve shares accepted during trading.  Even if node$_C$ was the
entity that ``lost'' the share, node$_A$ only cares whether or not
node$_B$ can return proof of recoverability.

Based on the ``query'' transcript, a node would not be able to determine
if it is the first in a ``chain'' of queries, or merely one node in a
long chain.  However, if it does not play by the proper protocol rules,
the node that queries {\it it} will lose trust in {\it it}.  One may
consider if it is within a long chain, it can cause a number of trust
hits along each link in the chain.  However, as nodes query for
shares with decreasing probability based on ``age,'' the probability of
a long chain also decreases.\footnote{This probability function requires
some analysis to reduce the amount of damage a node can inflict on
others, without taking a similar hit itself.  However, the trust algebra
should not be that complex.}

Therefore, the system as presented only relies on {\it local} trust:
there is no concept of trust broadcasts or (buddy) squawking, as in the
old design.\footnote{We do not consider the problem of ``introduces''
here.}  Trust is a local effect with possible global consequences: if a
node continues to drop shares or misbehave, its neighboring nodes will
lose trust in it.  If a sufficient number of nodes lose trust in it, its
total ``hash-space partition'' on all other nodes will proportionally
decrease (thus decreasing the probability that the other nodes have
shares with map to its small hash-space partition.).  Therefore,
ill-trusted nodes will have a hard time finding willing trade-partners.


\section{Discussion}

We note the following attributes and open questions of the presented
trading design.

\subsection{Efficient storage}  

At the beginning of each time interval, the old hash-space mapping can
be deleted.  In the trade db, we store the $hash(share)$, as well as the
trade destination node, time interval of trade, and document expiry
time.  Therefore, we can ``query'' the trade destination node at a
later, using this information, to test whether the share is still
present (thus also checking the returned share's consistency against the
saved $hash$ value.)

Secondly, the trade list can occasionally be scanned, and those entries
in which $current\_time > expiry\_time\_of\_share$ can be
removed. 

\subsection{Reader proof-of-knowledge}
\label{subsec:pok}

We wish to achieve document-anonymity of a stronger nature than that
provided by Freenet and other systems.  Namely, in these systems,
servers do not know the contents of the documents they are storing, nor
are they able to query the system to retrieve extra information about
said documents.  Rather, a $H(name)$ is used to name the file; a reader
supplies the hash pre-image (``name''), allowing the server to verify
that the reader is ``authorized'' to retrieve the file.  However, after one
reader supplies the summer with the hash pre-image, the server knows how
to there-after retrieve the document.  

This is not a problem per se for Freenet {\it et al} systems, which perform
{\it document} storage/caching, for these servers already have the entire
(possibly encrypted) documents.  However, for systems like Free Haven
which do {\it share} storage, servers might not initially know how to
retrieve entire documents.

We suggest the following possibility: when a node attempts to retrieve a
specific share, it only needs to present $hash(share)$.  However, if a
reader is attempting to retrieve the entire {\it document}, it provides
a proof of knowledge on a ``read authorization'' key for the document.
This idea requires further development: performing an interactive proof
across a high-latency mixnet appears highly impractical.  However, we
need to ensure that the server cannot merely present a transcript from a
non-interactive proof.  We also need to consider how this might effect
performance.

\subsection{Share reinsertion}

The protocol as presented suggests returning only $hash(share, nonce)$
so that only valid readers and nodes actively engaged in trading can
determine the actual contents of the share.  This measure emphasis {\it
accountability}:  we modify our trust in other nodes whether or not they
can recover a share.  

If the ``query'' check and the actual document retrieval appear the same
to a node, adversaries would not be able to tell the difference between
node-to-node checks and broadcast retrievals.\footnote{This obviously
does not support the authorized reader suggestion as
described in section \ref{subsec:pok}}. However, if an adversary
merely runs $>1$ node, he can most likely detect the difference.

Therefore, with the accountability measure as presented, adversaries can
gladly prove recoverability during these spot query checks, and merely
ignore all reader requests to recover a document.

We suggest that share reinsertion might be preferred.  The protocol
would be amended such that when node$_A$ queries node$_B$ for some share
(say, $\Phi_i$) with $hash(\Phi_i)$, node$_B$ returns the actual
$\Phi_i$ to node$_A$ (no nonce is required).  Node$_A$ will reinsert
$\Phi_i$ into its share db, and thereafter treat it like a normal share.

Only the node which initiates a query would want to reinsert the share
into its share db to stop adversaries from an obvious DoS flushing
attack.  Nodes would also have to minimize the number of document
queries it performs within a given time interval, so that it does not
run out of storage space.

We can modify this process so that node$_A$ only {\it sometimes}
reinserts a share recovered from the query procedure, perhaps with some
probability based on the share's ``age'' and expiry time.  

We need to ensure that this reinserted share has been ``paid for'' as
normally during trading.  An obvious possibility is that the normal
$size x duration$ calculation is used: shares with longer expiry dates
are more expensive, as the probably of reinsertion is greater.  Futher
consideration is required.

This reinsertion mechanism allows the {\it accountability} process to
also effect the {\it robustness} of storage beyond the simple k-of-n
share-splitting heuristic.


\subsection{Limited storage effects}

We need to consider the effect of limited storage space on hash-space
partitioning.  If a node can only store a smaller number of shares, the
probability that a trade request is from a node that ``hits'' a share in
the hash-space becomes small.  We may wish to consider a different
mechanism for choosing whether or not an ``incoming'' trade request will
be answered in the positive.

Indeed, we may want to remove the entire concept of hash-space
partitioning (which makes all trade ``decision-making'' during the
initially partitioning process, {\it not} when a trade request is
received).  We can perform trade decision-making {\it at} the
time a request is received, based on trust in the requesting node.

Alternatively, reading and publishing should still be accomplished via
broadcasts, yet trading can be performed on a sparsely-connected graph.
Or, we can merely skew the partitioning to be based on some polynomial
function of trust, as opposed to a linear relationship.  If this would
be the case, we would need to ensure that the ``confidence'' rating of
trust (i.e., how easy can trust be changed) is taken well into account.

Future consideration is surely required.


\section{Conclusion}

The goal of this paper is to present new trading mechanism which uses a
simplified local-effect Trust Network.  Less emphasis is placed on
trust broadcasts and communications between nodes.  

The design does not describe any requirement for buddies, or the
four-part trade protocol yielding buddy receipts.  This helps simplify
the system, reducing both storage and communication overhead.
Furthermore, the lack of buddies with the accompanied buddy-squawking
broadcast greatly reduces communication overhead.

The design requires futher study/modeling on the effects of
local trust heuristics to gain a better global understanding of
trustworthyness.  That is, how easily can an adversary escape notice,
provided that it attempts to selectively break protocol.

The design as presented in this document is a work-in-progress.

\end{document}


