Middleware '19 Paper #75 Reviews and Comments

Paper #75 OWebSync: A Web Middleware with State-Based Replicated Data Types and Merkle-Trees for Seamless Synchronization of Distributed Web Clients

Submitted: 18 May 2019
Rejected: 24 August 2019

Review A: 2. Weak reject
Review B: 3. Weak accept
Review C: 1. Reject

Review A

Overall merit

2. Weak reject

Reviewer expertise

4. Expert

Paper summary

The paper proposes a middleware for data synchronization among clients in online and offline modes. The proposed system supports synchronization of semi-structured data, such as JSON. The synchronization mechanism combines different state-based Conflict-Free Replicated Data Types (Last-Write-Wins Registers and a new Observed-Remove Map) with Merkle trees for digesting data and quickly detecting data changes. The authors use industrial case studies to evaluate the system.

Reasons to accept the paper

Making use of Merkle trees in the proposed CRDTs provides an efficient way to detect the data changes and to decrease the amount of stored metadata. This is a novel and useful contribution that can be used in several similar systems.

The approach is simple to re-implement based on the description and the component technology used.

The paper uses two industrial workloads to evaluate the performance of the system. The use of these workloads is well justified and emphasizes the actual use cases, benefits, and potential of such middleware.

The authors offer an interactive demo of their work, which is available online. The application is well designed and is a good representation of the proposed system and makes it easier to understand their work and contributions.

The paper is very well written and very well presented, and it was enjoyable to read.

Weaknesses

The synchronization depends on the timestamps of the updates, as the system favors the most recent write to resolve conflicts. However, it is not clear how the timestamps are provided. Are the timestamps provided by the clients, and what happens when the client clocks are out of sync? If the server provides the timestamps, then how does the system take network latency into consideration? This introduces complexities into the design that are not discussed in the paper.

It is not clear what consistency model the system offers. For example, the second half of Section 3 describes that the system discards an LWWRegister when it conflicts with an ORMap, in order to preserve the ORMap. This choice could lead to data inconsistencies and strange behavior from the user's point of view.

The evaluation does not properly study one of the main selling points of the paper, which is resynchronization after network disruptions. In online mode, the system does not perform better than the other systems. In offline mode, although the system performs better at resynchronizing data once the network disruptions are over, the studied network disruption scenarios are short and do not push the system to its limits. The system needs to be evaluated against a more extended offline period in which a large number of updates are missed and need to be synced.

One of the bottlenecks of the system is the required bandwidth and the high number of exchanged messages for resynchronization after network disruptions. However, there is no result included in the paper to evaluate the network traffic for syncing a large number of updates after coming back online.

The provided online demo application fails to resynchronize data after network disruptions in the way the paper describes. However, I cannot say whether this is a simple bug in the web application or a failure of the data synchronization approach.

Comments for author

The paper offers a new state-based CRDT approach for replicating and synchronizing JSON data types and resolving the potential conflicts. Their proposed approach makes novel use of Merkle Trees to decrease the amount of metadata required to be exchanged between the clients and also to decrease the stored metadata, required for representing complex data types like JSON.

The paper offers a practical alternative to existing synchronization middleware. In addition to the strong and weak points mentioned above, I would like to describe the bug I found in the online demo: Open 3 clients of the application (3 Chrome tabs). Go offline. In the first tab, resize (make larger) one shape (e.g., a square). In the second tab, resize (make smaller) the same shape that was selected in the first tab. In the third tab, do not touch anything. Go back online. After the resynchronization, the shape has the same size in the first and second tabs (synced correctly), but the size of the same shape in the third tab is wrong (out of sync). This could be a simple application bug, or it could hint at a problem with the authors' approach that leads to data loss and inconsistency.

Review B

Overall merit

3. Weak accept

Reviewer expertise

3. Knowledgeable

Paper summary

This paper presents the OWebSync system for synchronizing JSON objects between web browser clients. The target scale is about 25 concurrent users on one document, and the aim is to support both online and offline collaboration. The system uses state-based CRDTs and Merkle trees for efficient state transfer of changes in JSON objects. The evaluation compares the system with a number of prior systems (some operation-based CRDTs, delta-based CRDTs, etc.) and shows that OWebSync surpasses operation-based and delta-based CRDT synchronization for the target number of concurrent users.

Reasons to accept the paper

Weaknesses

Comments for author

It is not clear if clocks are assumed synchronized across clients or even if timestamps are necessary since all the updates are serialized at the server by order of arrival.

The example of the eWorkforce study is irrelevant, as it involves clients (technicians) working on their own isolated data with little collaboration. For the eDesigner example, the authors should make the case for why automatic conflict resolution (last write wins) is acceptable. In practice, having the last write win can be annoying to users (see Dropbox, e.g.). The authors should provide other concrete examples of applications that would benefit from OWebSync (applications that accept automatic last-write-wins conflict resolution, target up to ~24 clients, etc.) to make the case for OWebSync stronger. The application examples presented in the paper are not sufficiently convincing.

The results in favor of OWebSync are not particularly surprising, given the nature of some of the systems with which the authors compare it (e.g., operation-based and delta-based CRDTs). Similar results on the breakdown of operation-based approaches in particular have already been published, as the authors themselves highlight.

Review C

Overall merit

1. Reject

Reviewer expertise

3. Knowledgeable

Paper summary

The paper describes a technique for using hierarchical hashes (a Merkle tree) for distributed synchronization, to reduce the amount of state transfer when large portions of a document are unmodified. The system is argued to be more scalable than, say, Google Docs.

Reasons to accept the paper

I do not argue for acceptance and have no arguments to list in favor.

Weaknesses

I have two significant objections to the paper in its present form that I think argue for rejection:

Comments for author

I think this has some good ideas but the evaluation needs significant improvement, both in terms of the "industrial case studies" and the test applications. The writing also has many grammatical issues that need careful proofreading.

I would like to see a better explanation of operating "conflict free". It seems like this is hard to guarantee unless you severely restrict the types of applications. I think the idea of "last-write-wins" is fraught with hazards, basically lost edits.

The idea of extra levels in the Merkle tree is reminiscent of work by Torsten Suel & students on hierarchical distributed synchronization (like rsync). Should find & cite.

Questions for authors' response

If you can explain the industrial case studies and the conflict generation in your tests, perhaps I can be persuaded to be more positive.

Author Response

Reviewers A and C question whether the evaluation in the offline scenario (which is our target scenario) is extensive enough with respect to the number of updates and the number of conflicts.
With 24 clients, moving 1 object every second for 1 minute, this results in 1440 moves during the 1 minute offline period. Since there are only 100 or 1000 objects, there will certainly be many conflicts where multiple clients moved the same object. A more extended offline period doesn't change much for OWebSync since only the state is kept and the same client moving the same object twice will result in the same amount of state to be sent. Operation-based approaches will take longer when the time increases, since they have to send all operations anyway.
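The arithmetic behind this argument can be sketched as follows. This is a minimal illustration, not the evaluation code: the function names are hypothetical, and it assumes that each move picks its target object uniformly at random, which the response does not state. Under that assumption, the number of operations grows with the offline period, while the number of distinct objects whose state must be sent is bounded by the object count.

```python
def total_moves(clients: int, moves_per_second: int, offline_seconds: int) -> int:
    """Number of move operations generated while offline."""
    return clients * moves_per_second * offline_seconds


def expected_touched_objects(num_objects: int, moves: int) -> float:
    """Expected number of distinct objects moved at least once.

    Assumption (not from the paper): every move picks one of the
    num_objects objects uniformly at random and independently.
    """
    return num_objects * (1 - (1 - 1 / num_objects) ** moves)


moves = total_moves(clients=24, moves_per_second=1, offline_seconds=60)  # 1440

for num_objects in (100, 1000):
    touched = expected_touched_objects(num_objects, moves)
    # An operation-based approach has to ship all `moves` operations;
    # a state-based one ships state for at most `num_objects` objects.
    print(f"{num_objects} objects: {moves} operations, "
          f"about {touched:.0f} objects with changed state")
```

Doubling `offline_seconds` doubles the operation count, but the expected number of touched objects can never exceed the number of objects in the document.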

Reviewer A has questions about the timestamps.
The timestamps are provided by the clients (we do not consider malicious clients). OWebSync focuses on automatic conflict resolution, which might be perceived as arbitrary to the user (especially with clock drift between clients), but has the benefit that a user is never bothered by the need to reconcile the data themselves. For applications that do require this, one can extend the system with a Multi-Value Register and let the user choose the correct version (this is discussed at the end of section 3).
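The effect of clock drift on last-write-wins resolution can be illustrated with a small sketch. The class below is a generic LWW register, not OWebSync's implementation; the `client_id` tie-breaker and all concrete values are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LWWRegister:
    value: object
    timestamp: float      # wall-clock time supplied by the writing client
    client_id: str = ""   # hypothetical tie-breaker for equal timestamps

    def merge(self, other: "LWWRegister") -> "LWWRegister":
        # Keep the write with the larger (timestamp, client_id) pair.
        # The merge is commutative, associative, and idempotent, so all
        # replicas converge regardless of the order in which states arrive.
        return max(self, other, key=lambda r: (r.timestamp, r.client_id))


# Client A's clock runs 5 seconds fast (assumed for illustration).
# A writes at true time 100 and stamps the write with 105.
write_a = LWWRegister(value="large", timestamp=105.0, client_id="A")
# Client B writes 3 seconds later in real time, with a correct clock.
write_b = LWWRegister(value="small", timestamp=103.0, client_id="B")

merged = write_a.merge(write_b)
print(merged.value)  # A's earlier write wins because of the skewed clock
```

Both replicas still converge to the same value, so the outcome is consistent but may look arbitrary to the user. This is the trade-off that a Multi-Value Register would avoid by keeping both versions.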

Reviewer A observed a bug in the demo application.
We tried several times with Chrome on both Windows and Mac but could not reproduce it. Did you observe it multiple times? Or was the issue solved after refreshing (which would probably indicate a bug in the 2D drawing library, not in OWebSync)?

Reviewer B questions if timestamps are necessary since all the updates are serialized at the server by order of arrival.
Updates are not serialized at the server. In fact there is no notion of update operations reaching the server. OWebSync is using state-based synchronization with Merkle-trees for efficient diffing.
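The general idea of diffing state with Merkle trees can be sketched as follows. This is a simplified in-memory illustration, not OWebSync's protocol: the `merkle` and `diff` helpers, the hashing scheme, and the example document are all assumptions. A real system would exchange hashes level by level over the network instead of comparing two complete trees.

```python
import hashlib
import json


def merkle(node):
    """Return (hash, children) for a nested dict or a leaf value."""
    if isinstance(node, dict):
        children = {key: merkle(node[key]) for key in sorted(node)}
        hasher = hashlib.sha256(b"map")
        for key, (child_hash, _) in children.items():
            hasher.update(json.dumps(key).encode())
            hasher.update(child_hash.encode())
        return hasher.hexdigest(), children
    payload = b"leaf" + json.dumps(node, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest(), None


def diff(a, b, path=()):
    """List the paths that differ, descending only where hashes differ."""
    hash_a, kids_a = a
    hash_b, kids_b = b
    if hash_a == hash_b:
        return []            # identical subtree: nothing to exchange
    if kids_a is None or kids_b is None:
        return [path]        # a leaf changed (or a leaf replaced a map)
    changed = []
    for key in sorted(set(kids_a) | set(kids_b)):
        if key not in kids_a or key not in kids_b:
            changed.append(path + (key,))
        else:
            changed.extend(diff(kids_a[key], kids_b[key], path + (key,)))
    return changed


local = {"shape1": {"left": 10, "top": 20}, "shape2": {"left": 5, "top": 5}}
remote = {"shape1": {"left": 42, "top": 20}, "shape2": {"left": 5, "top": 5}}

print(diff(merkle(local), merkle(remote)))
```

In this example only the `shape1` subtree is descended into, because the hash of `shape2` matches on both sides. No per-update operation is recorded or replayed; only the current state is compared.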

Reviewer C questions our use of "industrial case studies".
The case studies are R&D prototypes from collaborative applied research projects similar to an (R)IA project in Horizon 2020, but with regional funding and co-funding by the companies. Sharing the names of the companies or the type or name of the projects would break the double-blind review process. We could provide them to the PC chair if allowed. The eWorkforce case study is not discussed in the evaluation since the eDesigners case study has more updates and conflicts and we did not have space to present them both in the paper. Both case studies were important for the design and have a working prototype.

Reviewer C asks for information about the location of writes and the size of the objects.
The paper explains that those writes are the x and y position in an object that has 7 attributes (page 8). Figure 2a shows an example of such an object. An update assigns a random value to the left and top attributes.