Submitted: 28 February 2020
Major revision: 14 October 2021
Reviewer 1: Author Should Prepare A Minor revision
Reviewer 2: Author Should Prepare A Major Revision For A Second Review
Reviewer 3: Author Should Prepare A Minor Revision
While reviewer 1 and reviewer 3 are satisfied with the revision, reviewer 2 still shows concern on the section 3, particularly on providing the empirical proof. I would suggest authors to prepare another major revision.
Author Should Prepare A Minor Revision
This paper presents OWebSync, a system to synchronize state in collaborative web-based applications. The paper addresses an important problem, is of interest to the TPDS audience and in general it is easy to follow even though the organization could be improved.
Please rate the manuscript. Please explain your choice.: Good
Author Should Prepare A Major Revision For A Second Review
The problem of the paper is well-motivated and the background is explained precisely and succinctly. It is convincing that the paper aims at making a significant contribution to the state of the art, with relevant practical applications, as illustrated by two use cases.
My main concerns, however, are about the Chapter 3 (data model). While probably none of the issues are a complete show-stopper, significant additional work is required to make that part of the paper unambiguous, convincing, and correct. The main issue are errors in the core ORMap CRDT implementation. It is under-specified, contains errors, and lacks any kind of (even informal) attempt to proof that it is actually correct.
A fundamental concern is that the authors claim that their ORMap is based on Shaprio's ORSet, but it fact it is fundamentally different. A usual ORSet checks, in its GET function, if some key is in the Removed set. The authors' version does not do so, without explanation. In an ORSet, a key may appear more than once (with different unique tags) in the Observed set (if added by multiple nodes independently). Checking for set inclusion is obviously easy, but if you implement a map instead of set, with key-value pairs, you would end up with multiple values for a single key (and different tags), and deciding which value is the "right" one is not obvious. The authors do not address that problem, may because their "ORMap" is fundamentally different from a usual ORSet, but fail to explain that part. Also, ORSet's delete function deletes *all* instances of a key (all tags), whereas the code in Fig 1 apparently removes only one. The evident differences are significant, and it is certainly not obvious that the proposed ORMap data type is correct. At least some better justification why this data type is a correct CRDT should be provided
I was trying to understand the ORMap better by reading their pseudo code implementation (Fig. 1), but this a difficult task as well. They use a "JOIN" function in line 42 and line 46 that is never defined or explained. Possibly this is meant to be a recursive call to the MERGE function, but I am not sure about this. Maybe some other magic is happening in that undefined function that addresses some the issue I raise here? Hard to tell, if he paper lacks that information.
An important case to be handled by CRDTs is resolving conflicts. Lets consider a very simple case, two nodes concurrently add items with the same key A and different values (1, 2), and different tags. So node 1 has (tag1, A, 1) in its Observed set, and the remote node has (tag2, A, 2) and sends a state update. From what I get from Fig. 1, if a key already exists at node 1 (e.g., line 34, line 40), the local observed set is never changed (the only place this happens is in line 47, which is only reached for a key not locally existing), so based on the code in Fig. 1, tag2 will never make it into node 1's Observed set. In the other direction, by the same argument, the remote node will never add tag1 to its Observed set. While the values (1,2) might be merged by the leaf-level LWWRegister based on their timestamps, the tag in the Observed sets will remain inconsistent after merge. If a node is deleted (by tag id) on one side, it will not be deleted on the other side, resulting in observable difference.
Some additional minor notes on Section 3:
Section 4 (synchronization protocol):
Section 5 (evaluation):
The evaluation is, over all, well explained and convincing. Fig. 6,
however, is somewhat unfortunate because most boxes in the box-plot are
so small that their color is not visible.
References:
Reference [35] is incomplete (lacks proceedings where it was published)
Reference [51] is incomplete (also lacks information where it was
published)
Please rate the manuscript. Please explain your choice.: Fair
Author Should Prepare A Minor Revision
In overall the paper is well written and does a good job of comparing with other similar systems that are available for JS based middleware. The authors try to show the merits and limitations of the chosen approach. One interesting contribution they have is to show the trade-offs among state based and operation based designs when considering the likelihood of partitions, and confirms that operation based approaches are lighter when communication is available but are less efficient when recovering from partitions.
My main issue with the proposal is the use of a set of tombstones for the removed IDs since this set will likely grow linearly in workloads where deletions are common. Notice that in state based delta CRDTs it is possible to track causality in a compact way for a recursive map and avoid tombstones. This choice in the present paper might be compensated since a remove wins semantics is also chosen, and deleting sub-trees could possibly get rid of inner tombstones. In any case, it would be valuable for this paper to discuss this issue in more detail and possibly add an workload that is more heavy in deletions and tombstone creation.
Some details:
1. Introduction.
"replicas eventual consistent" -> "replicas eventually consistent"
Its important to clarify the different approaches to delta CRDTs in [17,24] and [18] since although sharing some terminology they differ. As far as I can tell Legion uses [18] and the recursive map with common version vectors is in [24].
3.1. LWW
Last-writer-win register are not always consistent with causality when clocks get out of sync. A better design is probably a multi-value register (also observed remove) and use of clock timestamp only to choose the value to present among the possibly concurrent values. This is not an issue that requires change, bit its important to signal the drawback.
3.2. OO-Map
I found the use of path[0] and path[1..] hard to follow, maybe clarify notation or add a small concrete example when explaining the implementation.
Misc:
You might want to also check https://github.com/peer-base/js-delta-crdts, from the IPFS team, that seems to contain a recursive map implementation in JS.
Please rate the manuscript. Please explain your choice.: Good