Abstract
Many data sharing systems are open to arbitrary users on the Internet, who are independent and self-interested agents. Therefore, in addition to traditional design goals such as technical performance, data sharing systems should be designed to best support the strategic interactions of these agents.
Our research hypothesis is that designs that maximize the participants' autonomy can produce useful data sharing systems.
We apply this design principle to both the system architecture and the functional design of a data sharing system, and study the resulting class of systems, which we call Decentralized Social Data Sharing ((DS)²) systems.
We formally define this class of systems and provide a reference implementation and an example application: a distributed wiki system called P2Pedia. P2Pedia implements a decentralized collaboration model, where the users are not required to reach a consensus, and instead benefit from being exposed to multiple viewpoints. We demonstrate the value of this collaboration model through an extensive user study.
Allowing the users to autonomously control their data prevents the system architecture from being optimized for efficient query processing. We show that Regular Path Queries, a useful class of graph queries, can still be processed on the shared data: although in the worst case such queries are intractable, we propose a cost estimation technique to identify tractable queries from partial knowledge of the data.
Through simulation, we also show that the users' control over network connections allows them to self-organize and interact with other users with whom their interests are best aligned. This may result in less data being available, and we study cases where this is in fact demonstrably beneficial to the users, as the available data to each user is the most relevant to them.
This suggests that querying this reduced collection of shared data may lead to more tractable query processing without necessarily reducing the users' utility.