
Enforcing Fair Sharing of Peer-to-Peer Resources

Tsuen-Wan “Johnny” Ngan, Dan S. Wallach, and Peter Druschel


Department of Computer Science, Rice University
{twngan, dwallach, druschel}@cs.rice.edu

Abstract

Cooperative peer-to-peer applications are designed to share the resources of each computer in an overlay network for the common good of everyone. However, users do not necessarily have an incentive to donate resources to the system if they can get the system's resources for free. This paper presents architectures for fair sharing of storage resources that are robust against collusions among nodes. We show how requiring nodes to publish auditable records of their usage can give nodes economic incentives to report their usage truthfully, and we present simulation results that show the communication overhead of auditing is small and scales well to large networks.

1 Introduction

A large number of peer-to-peer (p2p) systems have been developed recently, providing a general-purpose network substrate [10, 11, 13, 14, 16] suitable for sharing files [6, 7], among other applications. In practice, particularly with widespread p2p systems such as Napster, Gnutella, or Kazaa, many users may choose to consume the p2p system's resources without providing any of their own resources for the use of others [1]. Users have no natural incentive to provide services to their peers if it is not somehow required of them.

This paper considers methods to design such requirements directly into the p2p system. While we could take a traditional quota enforcement approach, requiring some kind of trusted authority to give a user "permission" to store files, such notions are hard to create in a network of peers. Why should some peers be placed in a position of authority over others? If all nodes were to publish their resource usage records directly, where other nodes audit those records as a part of the normal functioning of the system, we might be able to create a system where nodes have natural incentives to publish their records accurately. Ideally, we would like to design a system where nodes, acting selfishly, behave collectively to maximize the common welfare. When such a system has no centralized authority with total knowledge of the system making decisions, this becomes a distributed algorithmic mechanism design (DAMD) problem [9], a current area of study which combines computational tractability in theoretical computer science with incentive-compatible mechanism design in the economics literature.

To illustrate the power of such economic systems, we focus on the specific problem of fair sharing in p2p storage systems, although our techniques can potentially be extended to address fairness in bandwidth consumption and other resources. Section 2 discusses adversarial models that a storage system must be designed to address. Section 3 discusses different approaches to implementing fairness policies in p2p storage systems. Section 4 presents some simulation results. Finally, Section 5 discusses related work and Section 6 concludes.

2 Models

Our goal is to support a notion of fair sharing such as limiting any given node to consuming only as much of the network's storage as it provides space for others on its local disk. A centralized broker that monitored all transactions could accomplish such a feat, but it would not easily scale to large numbers of nodes, and it would form a single point of failure; if the broker were offline, all file storage operations would be unable to proceed.

We will discuss several possible decentralized designs in Section 3, where nodes in the p2p network keep track of each other's usage, but first we need to understand the threats such a design must address. It is possible that some nodes may wish to collude to corrupt the system, perhaps gaining more storage for each other than they collectively provide to the network. We consider three adversarial models:

No collusion: Nodes, acting on their own, wish to gain an unfair advantage over the network, but they have no peers with which to collude.

Minority collusion: A subset of the p2p network is willing to form a conspiracy to lie about their resource usage. However, it is assumed that most nodes in the p2p network are uninterested in joining the conspiracy.

Minority bribery: The adversary may choose specific nodes to join the conspiracy, perhaps offering them a bribe in the form of unfairly increased resource usage.

This paper focuses primarily on minority collusions. While bribery is perfectly feasible, and we may well even be able to build mechanisms that are robust against bribery, it is entirely unclear that the lower-level p2p routing and messaging systems can be equally robust. In studying routing security for p2p systems, Castro et al. [3] focused only on minority collusions. Minority bribery would allow very small conspiracies of nodes to defeat the secure routing primitives. For the remainder of this paper, we assume the correctness of the underlying p2p system.

We note that the ability to consume resources, such as remote disk storage, is a form of currency, where remote resources have more value to a node than its local storage. When nodes exchange their local storage for others' remote storage, the trade benefits both parties, giving an incentive for them to cooperate. As such, there is no need for cash or other forms of money to change hands; the storage economy can be expressed strictly as a barter economy.

3 Designs

In this section, we describe three possible designs for storage accounting systems. For all of these designs, we assume the existence of a public key infrastructure, allowing any node to digitally sign a document such that any other node can verify it, yet it is computationally infeasible for others to forge.

Likewise, for any of these designs, it is imperative to ensure that nodes are actually storing the files they claim to store. This is guaranteed by the following challenge mechanism. For each file a node is storing, it periodically picks a node that stores a replica of the same file as a target, and notifies all other replica holders of the file that it is challenging that target. Then it randomly selects a few blocks of the file and queries the target for the hash of those blocks. The target can answer correctly only if it has the file. The target may ask another replica holder for a copy of the file, but any such request during a challenge would cause the challenger to be notified, and thus be able to restart the challenge for another file.
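As a concrete illustration, here is a minimal sketch of the block-hash exchange; the block size, the hash function, and the helper names are our assumptions rather than details fixed by the design.

```python
import hashlib
import random

BLOCK_SIZE = 4096  # assumed block granularity; the design does not fix one

def block_hashes(file_bytes, indices):
    """Hash the requested blocks; only a node holding the file can answer."""
    blocks = [file_bytes[i:i + BLOCK_SIZE]
              for i in range(0, len(file_bytes), BLOCK_SIZE)]
    return [hashlib.sha1(blocks[i]).hexdigest() for i in indices]

def make_challenge(num_blocks_in_file, sample_size=4):
    """The challenger picks a few random block indices to query."""
    return random.sample(range(num_blocks_in_file), sample_size)

# The challenger holds a replica itself, so it can verify the target's answers:
replica = b"example file contents" * 1000
num_blocks = (len(replica) + BLOCK_SIZE - 1) // BLOCK_SIZE
challenge = make_challenge(num_blocks)

target_copy = replica  # an honest target holds the same file
answers = block_hashes(target_copy, challenge)
assert answers == block_hashes(replica, challenge)
```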
3.1 Smart cards

The original PAST paper [7] suggested the use of smart cards to enforce storage quotas. The smart card produces signed endorsements of a node's requests to consume remote storage, while charging that space to an internal counter. When storage is reclaimed, the remote node returns a signed message that the smart card can verify before crediting its internal counter.

Smart cards avoid the bandwidth overheads of the decentralized designs discussed in this paper. However, smart cards must be issued by a trusted organization, and periodically re-issued to invalidate compromised cards. This requires a business model that generates revenues to cover the cost of running the organization. Thus, smart cards appear to be unsuitable for grassroots p2p systems.

3.2 Quota managers

If each smart card were replaced by a collection of nodes in the p2p network, the same design would still be applicable. We can define the manager set for a node to be a set of nodes adjacent to that node in the overlay's node identifier (nodeId) space, making them easy for other parties in the overlay to discover and verify. Each manager must remember the amount of storage consumed by the nodes it manages and must endorse all requests from the managed nodes to store new files. To be robust against minority collusion, a remote node would insist that a majority of the manager nodes agree that a given request is authorized, requiring the manager set to perform a Byzantine agreement protocol [4].

The drawback of this design is that request approval has a relatively high latency and the number of malicious nodes in any manager set must be less than one third of the set size. Furthermore, managers suffer no direct penalty if they grant requests that would be correctly denied, and thus could be vulnerable to bribery attacks.
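A simplified sketch of the manager-set bookkeeping follows; absolute nodeId distance stands in for the overlay's ring metric, and a plain majority count of signed votes stands in for the Byzantine agreement protocol [4].

```python
def manager_set(node_id, all_node_ids, set_size=10):
    """Managers are the nodes closest to node_id in nodeId space, so any
    party can recompute and verify the set (ring distance simplified to
    absolute difference here)."""
    candidates = [m for m in all_node_ids if m != node_id]
    return sorted(candidates, key=lambda m: abs(m - node_id))[:set_size]

def endorsed(votes, managers):
    """A remote node accepts a storage request only if a majority of the
    verifiable manager set endorses it."""
    yes = sum(1 for m in managers if votes.get(m, False))
    return yes > len(managers) // 2

# Example: 6 of a 10-node manager set sign off on a request.
managers = manager_set(42, list(range(100)))
votes = {m: True for m in managers[:6]}
print(endorsed(votes, managers))  # True: majority endorsement
```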

3.3 Auditing

While the smart card and quota manager designs are focused on enforcing quotas, an alternative approach is to require nodes to maintain their own records and publish them, such that other nodes can audit those records. Of course, nodes have no inherent reason to publish their records accurately. This subsection describes how we can create natural economic disincentives to nodes lying in their records.

[Figure 1: A p2p network with local/remote lists.]
3.3.1 Usage files

Every node maintains a usage file, digitally signed, which is available for any other node to read. The usage file has three sections:

- the advertised capacity this node is providing to the system;
- a local list of (nodeId, fileId) pairs, containing the identifiers and sizes of all files that the node is storing locally on behalf of other nodes; and
- a remote list of fileIds of all the files published by this node (stored remotely), with their sizes.

Together, the local and remote lists describe all the credits and debits to a node's account. Note that the nodeIds for the peers storing the files are not stored in the remote list, since this information can be found using mechanisms in the storage system (e.g., PAST). We say a node is "under quota," and thus allowed to write new files into the network, when its advertised capacity minus the sum of its remote list, charging for each replica, is positive.
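To fix ideas, here is a minimal sketch of this record and the under-quota test; the field types and the replicas parameter are our assumptions (the replication factor would come from the underlying storage system, e.g., PAST).

```python
from dataclasses import dataclass, field

@dataclass
class UsageFile:
    """One node's accounting record; the digital signature is omitted here."""
    advertised_capacity: int  # bytes offered to the system
    # local list: (nodeId, fileId) -> size of each file stored for others
    local: dict = field(default_factory=dict)
    # remote list: fileId -> size of each file this node published
    remote: dict = field(default_factory=dict)

def under_quota(usage: UsageFile, replicas: int) -> bool:
    """Advertised capacity minus the sum of the remote list, charging
    for each replica, must be positive."""
    charged = sum(size * replicas for size in usage.remote.values())
    return usage.advertised_capacity - charged > 0
```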
When a node A wishes to store a file F1 on another node B, first B must fetch A's usage file to verify that A is under quota. Then, two records are created: A adds F1 to its remote list and B adds (A, F1) to its local list. This is illustrated in Figure 1. Of course, A might fabricate the contents of its usage file to convince B to improperly accept its files.

We must provide incentives for A to tell the truth. To game the system, A might normally attempt to either inflate its advertised capacity or deflate the sum of its remote list. If A were to increase its advertised capacity beyond the amount of disk it actually has, this might attract storage requests that A cannot honor, assuming the p2p storage system is operating at or near capacity, which is probably a safe assumption. A might compensate by creating fraudulent entries in its local list, to claim the storage is being used. To prevent fraudulent entries in either list, we define an auditing procedure that B, or any other node, may perform on A.

If B detects that F1 is missing from A's remote list, then B can feel free to delete the file. After all, A is no longer "paying" for it. Because an audit could be gamed if A knew the identity of its auditor, anonymous communication is required, and can be accomplished using a technique similar to Crowds [12]. So long as every node that has a relationship with A is auditing it at randomly chosen intervals, A cannot distinguish whether it is being audited by B or any other node with files in its remote list. We refer to this process as a normal audit.
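B's half of this procedure might look as follows, reusing the UsageFile sketch above; fetch_usage_anonymously and delete_local_copy are hypothetical helpers standing in for the anonymous fetch (cf. Crowds [12]) and local storage management.

```python
def normal_audit(stored_file_id, owner_usage):
    """B's audit of A: if the file B stores for A no longer appears in A's
    remote list, A has stopped 'paying' for it and B may delete its copy."""
    return stored_file_id in owner_usage.remote

# Sketch of B's side, run at randomly chosen intervals:
#
#   if not normal_audit(f1_id, fetch_usage_anonymously(a_node_id)):
#       delete_local_copy(f1_id)
```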
Normal auditing, alone, does not provide a disincentive to inflation of the local list. For every entry in A's local list, there should exist an entry for that file in another node's remote list. An auditor could fetch the usage file from A and then connect to every node mentioned in A's local list to test for matching entries. This would detect inconsistencies in A's usage file, but A could collude with other nodes to push its debts off its own books. To fully audit A, the auditor would need to audit the nodes reachable from A's local list, and recursively audit the nodes reachable from those local lists. Eventually, the audit would discover a cheating anchor where the books did not balance (see Figure 2). Implementing such a recursive audit would be prohibitively expensive. Instead, we require all nodes in the p2p overlay to perform random auditing. With a lower frequency than their normal audits, each node should choose a node at random from the p2p overlay. The auditor fetches the usage file and verifies it against the nodes mentioned in that file's local list. Assuming all nodes perform these random audits on a regular schedule, every node will be audited, on a regular basis, with high probability.
[Figure 2: A cheating chain, where node A is the cheating anchor.]

How high? Consider a network with n nodes, where cn nodes are conspiring. The probability that the cheating anchor is not randomly audited by any node in one period is ((n-2)/(n-1))^(n(1-c)) ≈ e^(-(1-c)) ≈ 0.368 for small c, and the cheating anchor would be discovered within three periods with probability higher than 95%.
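Written out, under the assumption that each of the n(1-c) honest nodes independently picks one uniformly random audit target per period:

```latex
\Pr[\text{anchor not audited in one period}]
  = \left(\frac{n-2}{n-1}\right)^{n(1-c)}
  \xrightarrow{\;n\to\infty\;} e^{-(1-c)}
  \approx e^{-1} \approx 0.368 \quad \text{for small } c,
\qquad
\Pr[\text{anchor detected within three periods}]
  = 1 - 0.368^{3} \approx 0.95.
```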
Recall that usage files are digitally signed by their node. Once a cheating anchor has been discovered, its usage file is effectively a signed confession of its misbehavior! This confession can be presented as evidence toward ejecting the cheater from the p2p network. With the cheating anchor ejected, other cheaters who depended on the cheating anchor will now be exposed and subject to ejection, themselves. We note that this design is robust even against bribery attacks, because the collusion will still be discovered and the cheaters ejected. We also note that since everybody, including auditors, benefits when cheaters are discovered and ejected from the p2p network, nodes do have an incentive to perform these random audits [8].
3.3.2 Extensions

Selling overcapacity. As described above, a node cannot consume more resources from the network than it provides itself. However, it is easy to imagine nodes who want to consume more resources than they provide, and, likewise, nodes who provide more resources than they wish to consume. Naturally, this overcapacity could be sold, perhaps through an online bidding system [5], for real-world money. These trades could be directly indicated in the local and remote lists. For example, if D sells 1GB to E, D can write (E, 1GB trade) in its remote list, and E writes (D, 1GB trade) in its local list. All the auditing mechanisms continue to function.

Reducing communication. Another issue is that fetching usage logs repeatedly could result in serious communication overhead, particularly for nodes with slow net connections. To address this, we implemented three optimizations. First, rather than sending the usage logs through the overlay route used to reach the node, they can be sent directly over the Internet: one hop from the target node to the anonymizing relay, and one hop to the auditing node. Second, since an entry in a remote list would be audited by all nodes replicating the logs, those replicas can alternate in auditing that node to share the cost of auditing. Third, we can reduce communication by only transmitting diffs of usage logs, since the logs change slowly. We must be careful that the anonymity of auditors isn't compromised; for example, version numbers could act as cookies to track auditors. To address this, the auditor needs to, with some probability, request the complete usage logs.
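One way the diff/full-fetch tradeoff could look on the auditor's side; the fetch probability is an assumed tuning knob, not a value given here.

```python
import random

def request_usage_log(held_version, full_fetch_prob=0.1):
    """Ask for a diff against the log version we already hold, but with
    some probability request the complete log instead, so that version
    numbers cannot serve as cookies linking our audits together."""
    if random.random() < full_fetch_prob:
        return ("full", None)        # reveals nothing about prior fetches
    return ("diff", held_version)    # cheap, but names a version we held
```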
4 Experiments

In this section, we present some simulation results of the communication costs of the quota managers and the auditing system. For our simulations, we assume all nodes are following the rules and no nodes are cheating. Both storage space and file sizes are chosen from truncated normal distributions.¹ The storage space of each node is chosen from 2 to 200GB, with an average of 48GB. We varied the average file size across experiments. In each day of simulated time, 1% of the files are reclaimed and republished. Two challenges are made to random replicas per file a node is storing per day.

¹The bandwidth consumed for auditing is dependent on the number, rather than the size, of files being stored. We also performed simulations using heavy-tailed file size distributions and obtained similar results.
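For instance, the storage-space distribution could be sampled as follows; the rejection method and the standard deviation are our assumptions, since only the range and the mean are reported.

```python
import random

def truncated_normal(mean, std, lo, hi):
    """Rejection-sample a normal variate until it falls in [lo, hi]; one
    simple reading of 'truncated normal distribution'. Note the realized
    mean shifts slightly under truncation."""
    while True:
        x = random.gauss(mean, std)
        if lo <= x <= hi:
            return x

# Per the setup: node storage from 2 to 200GB, averaging 48GB.
node_storage_gb = truncated_normal(mean=48, std=40, lo=2, hi=200)
```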

For quota managers, we implemented Castro et al.'s BFT algorithm [4]. With a manager set size of ten, the protocol can tolerate three nodes with Byzantine faults in any manager set. For auditing, normal audits are performed on average four times daily on each entry in a node's remote list, and random audits are done once per day. We simulated both with and without the append-only log optimization.

Our simulations include per-node overhead for Pastry-style routing lookups as well as choosing one node, at random, to create one level of indirection on audit requests. The latter provides weak anonymity sufficient for our purposes. Note that we only measure the communication overhead due to storage accounting. In particular, we exclude the cost of p2p overlay maintenance and storing/fetching of files, since it is not relevant to our comparison. Unless otherwise specified, all simulations are done with 10,000 nodes, 285 files stored per node, and an average node lifetime of 14 days.

4.1 Results

Figure 3 shows the average upstream bandwidth required per node, as a function of the number of nodes (the average required downstream bandwidth is identical). The per-node bandwidth requirement is almost constant, thus all systems scale well with the size of the overlay network.

[Figure 3: Overhead with different number of nodes. Bandwidth (bps) vs. number of nodes (log scale), for auditing without caching, auditing with caching, and quota managers.]

Figure 4 shows the bandwidth requirement as a function of the number of files stored per node. The overheads grow linearly with the number of files, but for auditing without caching, the overhead grows nearly twice as fast as in the other two designs. Since p2p storage systems are typically used to store large files, this overhead is not a concern. Also, the system could charge for an appropriate minimum file size to give users an incentive to combine small files into larger archives prior to storing them.

[Figure 4: Overhead with different number of files stored per node. Bandwidth (bps) vs. average number of files stored per node, for the same three designs.]
Figure 5 shows the overhead versus average node lifetime. The overhead for quota managers grows rapidly when the node lifetime gets shorter, mostly from the cost of joining and leaving manager sets and of voting on file insertions for new nodes.

[Figure 5: Overhead with different average node lifetime. Bandwidth (bps) vs. average node lifetime (days), for the same three designs.]

Our simulations have also shown that quota managers are more affected by the file turnover rate, due to the higher cost of voting. Also, the size of manager sets determines the vulnerability of the quota manager design. To tolerate more malicious nodes, we need to increase the size of manager sets, which would result in a higher cost.

In summary, auditing with caching has performance comparable to quota managers, but is not subject to bribery attacks and is less sensitive to the fraction of malicious nodes. Furthermore, in a variety of conditions, the auditing overhead is quite low: only a fraction of a typical p2p node's bandwidth.
5 Related Work

Tangler [15] is designed to provide censorship-resistant publication over a small number of servers (i.e., 30), exchanging data frequently with one another. To maintain fairness, Tangler requires servers to obtain "certificates" from other servers, which can be redeemed to publish files for a limited time. A new server can only obtain these certificates by providing storage for the use of other servers and is not allowed to publish anything for its first month online. As such, new servers must have demonstrated good service to the p2p network before being allowed to consume any network services.

The Eternity Service [2] includes an explicit notion of electronic cash, with which users can purchase storage space. Once published, a document cannot be deleted, even if requested by the publisher.

Fehr and Gachter's study considered an economic game where selfishness was feasible but could easily be detected [8]. When their human test subjects were given the opportunity to spend their money to punish selfish peers, they did so, resulting in a system with less selfish behavior. This result helps justify that users will be willing to pay the costs of random audits.

6 Conclusions

This paper has presented two architectures for achieving fair sharing of resources in p2p networks. Experimental results indicate small overheads and scalability to large numbers of files and nodes. In practice, auditing provides the right incentives, allowing us to benefit from its increased resistance to collusion and bribery attacks.

Acknowledgments

We thank Moez A. Abdel-Gawad, Shu Du, and Khaled Elmeleegy for their work on an earlier version of the quota managers design. We also thank Andrew Fuqua and Hervé Moulin for helpful discussions on economic incentives. This research was supported by NSF grant CCR-9985332, Texas ATP grants 003604-0053-2001 and 003604-0079-2001, and a gift from Microsoft Research.

References

[1] E. Adar and B. Huberman. Free riding on Gnutella. First Monday, 5(10), Oct. 2000.
[2] R. Anderson. The Eternity Service. In Proc. 1st Int'l Conf. on the Theory and Applications of Cryptology, pages 242-252, Prague, Czech Republic, Oct. 1996.
[3] M. Castro, P. Druschel, A. Ganesh, A. Rowstron, and D. S. Wallach. Security for structured peer-to-peer overlay networks. In Proc. OSDI'02, Boston, MA, Dec. 2002.
[4] M. Castro and B. Liskov. Practical Byzantine fault tolerance. In Proc. OSDI'99, New Orleans, LA, Feb. 1999.
[5] B. F. Cooper and H. Garcia-Molina. Bidding for storage space in a peer-to-peer data preservation system. In Proc. 22nd Int'l Conf. on Distributed Computing Systems, Vienna, Austria, July 2002.
[6] F. Dabek, M. F. Kaashoek, D. Karger, R. Morris, and I. Stoica. Wide-area cooperative storage with CFS. In Proc. SOSP'01, Chateau Lake Louise, Banff, Canada, Oct. 2001.
[7] P. Druschel and A. Rowstron. PAST: A large-scale, persistent peer-to-peer storage utility. In Proc. 8th Workshop on Hot Topics in Operating Systems, Schloss Elmau, Germany, May 2001.
[8] E. Fehr and S. Gachter. Altruistic punishment in humans. Nature, 415:137-140, Jan. 2002.
[9] J. Feigenbaum and S. Shenker. Distributed algorithmic mechanism design: Recent results and future directions. In Proc. 6th Int'l Workshop on Discrete Algorithms and Methods for Mobile Computing and Communications, pages 1-13, Atlanta, GA, Sept. 2002.
[10] P. Maymounkov and D. Mazières. Kademlia: A peer-to-peer information system based on the XOR metric. In Proc. IPTPS'02, Cambridge, MA, Mar. 2002.
[11] S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Shenker. A scalable content-addressable network. In Proc. SIGCOMM'01, pages 161-172, San Diego, CA, Aug. 2001.
[12] M. K. Reiter and A. D. Rubin. Crowds: Anonymity for web transactions. ACM Transactions on Information and System Security, 1(1):66-92, 1998.
[13] A. Rowstron and P. Druschel. Pastry: Scalable, distributed object location and routing for large-scale peer-to-peer systems. In Proc. IFIP/ACM Int'l Conf. on Distributed Systems Platforms, pages 329-350, Heidelberg, Germany, Nov. 2001.
[14] I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H. Balakrishnan. Chord: A scalable peer-to-peer lookup service for Internet applications. In Proc. SIGCOMM'01, San Diego, CA, Aug. 2001.
[15] M. Waldman and D. Mazières. Tangler: A censorship-resistant publishing system based on document entanglements. In Proc. 8th ACM Conf. on Computer and Communications Security, Nov. 2001.
[16] B. Y. Zhao, J. D. Kubiatowicz, and A. D. Joseph. Tapestry: An infrastructure for fault-tolerant wide-area location and routing. Technical Report UCB//CSD-01-1141, U. C. Berkeley, Apr. 2001.
