DICG '22 Paper #78 Reviews and Comments

Paper #78 Secure Replication for Client-centric Data Stores

Submitted: 31 August 2022
Accepted: 04 October 2022

Review A: 3. Weak accept
Review B: 3. Weak accept

Review A

Overall merit

3. Weak accept

Reviewer expertise

2. Some familiarity

Paper summary

This paper presents a secure and confidential model for Byzantine-resilient state-based Conflict-free Replicated Data Types (CRDTs). The model consists of two data structures that represent JSON data types. In the experiments, the authors compare the model's overhead against OWebSync as a baseline and show that their model provides security at the cost of a significantly larger storage size.

Comments for author

The paper presents the details of the solution, but its correctness is not well studied. Moreover, although the authors compare their model with a non-secure baseline, a comparison with other robust CRDT protocols (e.g., [1]) is missing.

[1] Martin Kleppmann. 2022. Making CRDTs Byzantine Fault Tolerant. In Proceedings of the 9th Workshop on Principles and Practice of Consistency for Distributed Data (PaPoC '22). ACM, USA, 8-15.

Review B

Overall merit

3. Weak accept

Reviewer expertise

4. Expert

Paper summary

A CRDT version of partial JSON is enhanced such that access can be controlled via cryptographic keys, including group membership and key rotation.

Comments for author

This paper presents a protocol and an implementation for securing several CRDTs (ORMap, LWWRegister), making it possible to control the editing of a shared almost-JSON document (arrays are missing). For performance reasons, a modified Merkle-Patricia trie is proposed, which helps when replicating (and is the reason the system outperforms the baseline in the comparison).

I like this work for presenting a working solution, including benchmarks, for shared document editing. The main weakness I see is that some assumptions work against the desired decentralization, and this could be better reflected in the paper. Concepts could be presented in more orthogonal ways, spelling out the price that the "hybrid solution" has.

For example, the delicate use of absolute time is discussed, but to me it remains unclear how the bracketing of time would work in situations with long offline intervals, something that was used to motivate the case (in the introduction). The discussion also refers to Geth, for example, which implies a global consensus process.

Also, more should be said about the implications of controlling the access policy. It seems that the model assumes a document owner. But in general, and also here, there is a consensus problem hidden behind the decision of which group member should be kicked out - something that the paper does not explain except for stating that "The other users can then decide to revoke access if necessary.", without saying how they would come to this decision. The "group membership discussion" (related to "group encryption" in cryptography) is somewhat hidden (on page 4 we learn that there is a list of users) and remains unclear: it seems that such lists can be attached to each sub-document and that keys can be rotated by everybody in the group ("Replicas that decide to rotate a key..."), which would imply that each JSON map is "owned" by potentially a different user. Who would then be in charge of excluding a Byzantine replica?

Similarly, the promised research on pruning would again require some form of consensus, and interactivity, among the group members.

Two pieces of related work come to mind: Secure Scuttlebutt provides a similar trustless replication infrastructure (servers are called "pubs") and has "private groups" via derived shared keys.
Rinberg et al. (https://www.vldb.org/pvldb/vol15/p1053-rinberg.pdf) describe a shared full-JSON model that also addresses the growth of state, without the need for a tombstone set of deleted keys.