outline for a paper on accountability and MIXes
-----------------------------------------------

Goals:  1) Show current and perceived problems in the remailer network
        which stem from a lack of "accountability."
        2) Define "accountability" for MIXes. Define ways of saying
        whether a MIX network is efficient or "works well."
        3) Describe the state of the art for accountability in MIXes.
        4) Argue that the state of the art is not very good. Show
        solutions which "work better."


I. Intro

II. Goal 1 -- Current and Perceived Problems in remailers

Overall Note: We make the point that previously in MIXes, efficiency
is neglected in favor of stronger anonymity. As Adam Back pointed out,
this may be a misguided policy. If the system is not efficient or not
reliable enough, the anonymity may be compromised because *too few people
use it*.

II.1 Public remailers are subject to spam attacks. These are typically
deliberately aimed at shutting down the remailer by causing it to run
afoul of ISP.

This is a variant of a flooding attack.

II.2 Remailers may be compromised or unreliable. The network is so
unreliable that many messages never get through and have to be sent
many times. Not a good idea for traffic analysis.

Senders need some way of determining which remailers are reliable and
which are not, so as to minimize the number of messages which need to
be re-sent. There are currently several 'remailer stats' pages available,
but it's not at all clear that they're reliable, fair, or secure against
manipulation.

Remailer stats pages:

http://anon.efga.org/Remailers
Has a pointer to 4 different stats pages : Frog, publius.net, gretchen,
Green.

http://www.publius.net/mixmaster-list.html
http://www.tahina.priv.at/~cm/stats/
http://www.neuropa.net/~gretchen/rlist2.html
http://generalprotectionfault.net/green/   (not working right now)

See also the "geographic location page" 
http://generalprotectionfault.net/orange/remap.htm

II.3 "Selective" denial of service.
http://www.eff.org/pub/Privacy/Anonymity/1999_09_DoS_remail_vuln.html

As stated, the attack requires compromising or interposing between
many different remailers and the outside world. This may not be feasible.
An easier attack would be to fool with the statistics given out by
the efga.org or other servers; make sure that their test packets go
astray.

II.4 Remailer network transient; permanent nodes are rare. People pop
up, run a remailer, realize that it's not worth the hassle and costs $$$,
then disappear.

This is a case for redeemable micropayments.


III. Goal 2 -- Define "accountability" for remailers and what it
means for an accountability system to "work."


III.1 Remailer accountability can be considered in two parts.

        1) The remailer operator must be accountable to the
        sender of a message. If the remailer fails to deliver
        a message, the sender needs to know that. Danger: if he
        concludes that a dropped message means he should cease
        using that remailer, then this opens the door for an
        adversary to run a 100% reliable remailer and try to reduce
        the reliability of the others.

        2) The sender must be accountable to the remailer operator.
        The sender must not be able to send junk messages it doesn't
        "care about" through the remailer. At least not enough to flood
        the remailer.

III.2 An accountability system "works" if it maximizes the "efficiency"
      of the remailer system.

        First sketch:

        Consider efficiency in terms of a sender Alice A and many
        recipients Bob, with each Bob denoted by B_i.
        Define the random variables

                Y = Alice's message gets through the mixnet
                B = takes on the integer values i for the B_i

        A notion of "efficiency" of a remailer system for Alice is E[Y].
        We can calculate E[Y] by calulating E[E[Y|B]]. So if we can
        compute P(Alice's message gets through mixnet when sent to B_i)
        then we can compute this efficiency.

        We use the law of total expectation together with the probability
        that Alice sends to any particular B_i.

        E[Y] = f_B(0) E[Y|B = 0] + f_B(1) E[Y|B = 1] + ...

        where f_B(i) is the density function of B. f_B(i) us
        what the probability is of Alice sending to B_i.

        If the adversary adds new player(s) B_evil, then Alice doesn't
        know about them at first and won't send messages to them.
        f_B(evil) = 0

        So their values of E[Y|B = evil] do not affect the efficiency at
        all. Is this a reasonable model, or is it too recursively defined?

        We can use a notion of efficiency to show what efficiency would
        be like in a "state of nature" without any reputation or
        accountability mechanisms whatsover. Then we can evaluate
        a mechanism for goodness based on how much better it is than
        the "state of nature."

        For a simple case, consider this simplified model: There are n
        nodes N_i 1 <= i <= n. Each node is corrupt with probability p.
        There is a single sender Alice and m Bobs B_i 1 <= i <= m

        In a "state of nature", each B_i picks a reply block of k
        nodes uniformly at random. The block is "bad" if any one of the
        servers in it is bad. (I think this means this is a binomial
        random variable with parameters k and p). Say a message
        never gets through a bad block.
        Now we know exactly what E[Y|B = i] is for each B_i.

        If we then know f_B(b), we can compute E[Y].
        I should just assume a uniform f_B(i) = 1/m and calculate this;
        maybe later today.

        In general, though, we may be able to ask these things about
        "efficiencies"

        * are they local or global?
        * Do they "make sense"
                - Can an adversary drive them to a particular value
                without "doing any work"?
                - Do they capture what we actually want to
                measure/improve?

        * Are they easy to calculate?
        * Others?

        which may give us some guidance on what to consider as
        efficiencies.


IV. Goal 3 -- Describe the state of the art.

IV.1 Levien's original uptime and reliability list.

IV.2 Modern day descendants (Frog, anon.efga.org)

IV.3 What ZKS is doing


V. Goal 4 -- Argue state of the art can be improved. Suggest improvements.


V.1 Lack of micropayments and lack of incentive to run a node.

Models for micropayment use in mix-net systems: Consider Alice sending
some message to Bob, through some number of intermediate nodes (both
the number and identity of these nodes known only to her, modulo each
node knowing predecessor and successor.)

V.1.1. End-to-end model:

The simplest approach is to make Alice send Bob some form of payment
and not worry about what happens to any intermediaries.  The
intermediate nodes lack any protection from attack.  Bob might even be
fictitious; Alice can attack any number of intermediate peers by
routing her queries through them, using up their bandwidth, or
attempting to perform active traffic analysis.  Consider mix-nets such
as Mixmaster, which keep a pool of outgoing messages to prevent timing
attacks.  Only when this threshold is passed will the remailer choose
one message at random and send it out.  We wish to prevent adversary
from being able to flood this pool with (t-1) messages.  This problem
is precisely one of our motivations for using micropayments!

V.1.2. Pairwise model:

Throw micropayments into every transaction between every pair of
nodes: one long route can be modeled as a number of pairwise
transactions.  This might appear to protect each recipient of
payments, but is also fundamentally flawed.

Using fungible micropayments, each peer earns one unit from its
predecessor, then spends one unit with its successor.  Assuming equal
costs throughout the network, Alice is the only net debtor and Bob the
only net creditor.  But if a single, malicious operator is in charge
of both Alice and Bob, these two peers have managed to extract work
from the intermediate nodes without paying -- a more-subtle DoS or
flooding attack!

Using non-fungible micropayments, Alice remains a net debtor, but so
are all intermediate peers.  Alice can make use of greater
computational resources (centralized or distributed) to flood
intermediate peers with POWs.  Being properly-behaving nodes, these
peers attempt to make good on the micropayment exchange, and start
churning out POWs for the next hop in the protocol. Eventually Alice
can exhaust the resources of a whole set of smaller, honest peers.

V.1.3. Amortized pairwise model:

Taking what we learned about the risks of the previous model, we
design a more sophisticated one that amortizes Alice's cost throughout
the network route by iteratively decreasing the cost of transactions
as they move through the system.

In the case of non-fungible POWs, we still lose.  First of all, Alice
can still make use of greater wealth, economies of scale, distributed
computing, etc. in order to attack intermediate nodes. While the load
decreases as it moves though the system, peers X, Y, and Z still need
to devote some of their own resources; they may be unable to afford
that load.

For fungible payments, this model appears more promising.
Intermediate nodes end up as net creditors: their resources are paid
for by the cut they take from Alice's initial lump-sum payment.

This model has another large weakness from a security point-of-view:
we leak information regarding the route-length.  If we use amortized
payments, the distribution of wealth from Alice's initial lump-sum
payment is some monotonically (or at least linearly) decreasing
function.  Intermediate peers can extrapolate how many hops are in the
route, as well as their relative position in the chain.

V.1.4. All points model:

We hypothesize that the only model to offer adequate accountability
and resource protection is an all-points model: Alice pays each peer
engaged in the protocol, intermediate and endpoint alike.  Obviously,
we wish to use micropayments which do not leak any information about
Alice: systems built on hash-collisions or other cryptographic
puzzles, systems using blinded public key encryption, systems using
proofs-of-knowledge, etc.  There are several existing considerations
for forward-only mixnets.

First of all, we lose our capacity to perform on-demand or interactive
payments.  "On depand" payments are according to the client puzzle
model, in which micropayments are only required if the node detects a
possible connection depletion attack.  Interactive processes cannot be
carried out adequately through the mixnet.  Therefore, we suggest that
payments should *always* accompany messages, and need to be of a
non-interactive variety.

To stop double-spending, the scheme must use either a centralized bank
server model (for instance, Chaumian ecash), use offline fraud-tracing
(for instance, Brands' system, which allows non-interactive payment
through out-of-band recipient-specific nonce), or have
recipient-specific information encoded in the payment (for instance,
hash cash).  Any recipient-specific information should further be
hidden from view, so as to protect an eavesdropper from being able to
piece together the route by looking at the micropayments.  Ideally,
recipient-hiding cryptosystems

[David Hopwood. hopwood@zetnet.co.uk. Recipient-Hiding Blinded
Public-Key Encryption. Draft Manuscript. ]

help ensure that the act of encrypting the micropayment does not
itself leak information about to whom the data is encrypted.
Payment schemes such as Brands' system don't actually transmit 
the encrypted "coin" itself, but rather a proof-of-knowledge 
payment transcript.


V.2 Possibility of corrupted uptime services. Also, do they really
help us make better decisions about where to send messages?

        Outline how reputation systems in general will address this.
        Tallying reputations for pseudonyms allows
          * Improved reliability for messages
          * Improved efficiency for overall remailer system
        while maintaining sufficient pseudonymity of users and nodes.

        Tradeoffs. Possible models:
          * Have a few people doing the reliability checks and building
            public reputation lists from them. Still relatively
            centralized and vulnerable to influence. But much simpler
            to design/implement/analyze.
          * Any mixnet user can 'rate' any mixnet node. More 'fair' and
            decentralized. But much more complex -- pseudospoofing and
            outright lying can make the system fall apart? (This is a
            problem familiar to many anonymous systems.)
          * Using 'maximum flow' model (eg Advogato.org) to limit damage
            done by pseudospoofing.
    
        Developing a scoring system -- how to aggregrate ratings into
          scores:
          * Ways of verifying transactions, to require rating credibility
          * Ways of collecting ratings -- retain anonymity but also remain
            verifiable
          * 'Local' reputation: allow each user to weight the reputation
            calculation based on preferences
          * Privacy and information leaks with publishing scores --
            does it introduce any new traffic analysis attacks?
          * Reputation of the reputation lists? Can we prove that
            we included each rating in our score?

        Automating the use of these reputation scores in mixnet client
        software.


VI. Conclusion and future work