ATC '19

Review A

Overall merit

2. Weak reject

Reviewer expertise

1. No familiarity

Reviewer confidence

1. Low

Paper summary

The authors present OWebSync, a middleware that supports synchronization of browser-based clients editing tree-like data structures (e.g., JSON) that can tolerate eventual consistency and that do not have inter-node constraints. This is done by using a state-based CRDT with a Merkle tree to determine which portions of the state need to be resynchronized. OWebSync operates in the same way both for online clients and to resynchronize a previously offline client. The authors evaluate OWebSync against three other solutions for synchronizing state -- Yjs, ShareDB, and Legion -- with varying numbers of clients and either 100 or 1000 objects in a single document. The main conclusion from their results is that OWebSync is the most consistent in performance at both the 50th and 99th percentile, increasing only from 2.3s to 2.7s from the smallest to the largest online test, and having comparable performance with limited variability for both online operation and resynchronization after being offline. However, at small scale OWebSync is significantly slower than ShareDB and Legion. The improved performance at scale and during recovery comes at the cost of increased network usage compared to the alternatives.
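
For readers less familiar with the term, the defining property of a state-based CRDT is that replicas exchange state and combine it with a merge function that is commutative, associative, and idempotent. The sketch below is only an illustration of that property, not OWebSync's data model: the `merge` function and the `(timestamp, value)` entry layout are assumptions made here, in the style of a last-writer-wins map.

```python
# Hypothetical last-writer-wins map: each key holds a (timestamp, value) pair.
# This layout is an assumption for illustration, not taken from the paper.

def merge(a, b):
    """Merge two replica states by keeping the larger entry per key."""
    merged = dict(a)
    for key, entry in b.items():
        # Tuples compare by timestamp first, then by value as a tie-breaker.
        if key not in merged or entry > merged[key]:
            merged[key] = entry
    return merged


replica_a = {"title": (1, "draft"), "color": (3, "red")}
replica_b = {"title": (2, "final"), "size": (1, "large")}

# Merging in either order gives the same state, and re-merging changes nothing.
print(merge(replica_a, replica_b) == merge(replica_b, replica_a))
print(merge(replica_a, replica_a) == replica_a)
```

Because the merge is order-insensitive and idempotent, a replica can receive the same state several times, or in any order, without diverging.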

Strengths

The authors do a good job of describing their solution and placing it in the context of alternatives. I liked the use of a Merkle tree as an alternative to tracking what the various clients have seen. I also appreciate that the authors implemented three of the alternatives and show results that indicate an alternative is sometimes better.

Weaknesses

While the authors say they ran 48 benchmarks, it is really only one benchmark run with different parameters: three different numbers of clients, times two different numbers of objects, times four implementations, times two modes (online and resynchronization). I would like to see additional benchmarks, including multiple overlapping sets of documents (and not just one document), and measurements from real-world experience with their industrial use cases.
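
The count of 48 follows directly from the parameter grid, as this small sketch shows. The object counts, implementation names, and modes are the ones mentioned in this review; the client-count labels are placeholders assumed here, since the exact values are not restated above.

```python
from itertools import product

# Placeholder labels (assumption): the three client counts are not restated here.
client_counts = ["small", "medium", "large"]
object_counts = [100, 1000]
implementations = ["OWebSync", "Yjs", "ShareDB", "Legion"]
modes = ["online", "resynchronization"]

# Every benchmark run is one point in the Cartesian product of the parameters.
configurations = list(product(client_counts, object_counts, implementations, modes))
print(len(configurations))  # 3 * 2 * 4 * 2 = 48
```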

Detailed feedback

Overall, I found the paper well written and the main contributions were clear. I appreciate that the authors implemented a range of alternatives and compared OWebSync to these results.

As indicated above, I would like to see a broader range of benchmarks executed against these implementations. For instance, if there are many documents with limited sharing, how does ShareDB do compared to OWebSync after a client has been offline? What happens when the client is offline for more than one minute? What is the experience of eDesigners or eWorkforce in actually using their solution?

I would also like to see a little more analysis of where a solution like ShareDB is better. For instance, from the results in figure 5 with a small number of clients (<8), it seems like ShareDB wins if clients don't work offline. But what happens with 4 clients that have significant periods of offline work?

I personally found the argument for stable performance even after a network disconnect to be more compelling than the better performance with a large number of clients. I am not convinced that 24 clients will be common in practice.

I was somewhat confused by figure 2b since the same UUID is used for all of the items whereas in the description of ORMap on page 4 it says each item has its own UUID.

Review B

Overall merit

2. Weak reject

Reviewer expertise

2. Some familiarity

Reviewer confidence

2. Medium

Paper summary

This paper presents OWebSync: a web middleware that supports seamless synchronization of both online and offline clients that are concurrently editing shared data sets.

Strengths

The paper provides two motivating case studies and then provides the rationale and more background on synchronization mechanisms such as CRDTs.

The analysis of performance evaluation is comprehensive in terms of objects, clients and time to synchronize updates after the network failures.

Weaknesses

Lack of implementation details of OWebSync, such as how it combines state-based CRDTs with specific enhancements based on Merkle-trees.

Detailed feedback

OWebSync implements a fine-grained data synchronization model and leverages Merkle-trees and convergent replicated data types to achieve the performance. Experimental results show that in comparison with operation-based and other state-based approaches, OWebSync scales better to tens of concurrent editors on a single document, and is also better in recovering from offline situations, and can achieve acceptable interactive performance with limited network overhead at a higher scale.

Some problems need to be addressed:

The approach of OWebSync consists of state-based CRDTs and Merkle-trees. It is not clear how to combine state-based CRDTs with Merkle-trees. In other words, what is the unique contribution of OWebSync? Why does the OWebSync use the Merkle-trees to improve performance?
Regarding message batching as a performance optimization tactic: if batching eliminates the concurrent processing of many small messages that could lead to a lot of duplicated work on sub-trees, why does the synchronization time increase?
Table 1 summarizes the results in seconds for the large-scale benchmark in both the online and offline settings, but the paper lacks a discussion of these results.

Review C

Overall merit

2. Weak reject

Reviewer expertise

2. Some familiarity

Reviewer confidence

2. Medium

Paper summary

The paper proposes a synchronization system that allows clients to collaboratively edit objects even when disconnected and synchronize once they reconnect to a server.

Strengths

Real implementation.
Evaluates against real systems.

Weaknesses

Not enough novelty: cobbles together well-known, published CRDT techniques.
Evaluation uses synthetic, limited workload; gives limited insight into fundamental advantages of the technique.
Motivation for using state-based CRDTs is a bit unconvincing.

Detailed feedback

Why is it difficult to provide a causally ordered reliable broadcast for operation-based CRDTs? Establishing a reliable channel between a client and a server seems simple enough, since the client can number its messages.
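
The numbering scheme alluded to here can be sketched as follows. This is a minimal illustration with hypothetical names (`OrderedInbox`, `receive`), not code from any of the evaluated systems. Note that it only gives in-order, duplicate-free delivery per client; causal order across different clients would need additional metadata, such as vector clocks.

```python
class OrderedInbox:
    """Server-side buffer that delivers each client's messages in sequence order."""

    def __init__(self):
        self.next_seq = {}  # client id -> next expected sequence number
        self.pending = {}   # client id -> {sequence number: message}

    def receive(self, client, seq, message):
        """Accept one message and return the messages that are now deliverable."""
        expected = self.next_seq.get(client, 0)
        if seq < expected:
            return []  # already delivered: drop the duplicate
        buffered = self.pending.setdefault(client, {})
        buffered[seq] = message
        delivered = []
        # Drain the buffer as long as the next expected message is present.
        while expected in buffered:
            delivered.append(buffered.pop(expected))
            expected += 1
        self.next_seq[client] = expected
        return delivered


inbox = OrderedInbox()
print(inbox.receive("client-1", 1, "second edit"))  # held back until 0 arrives
print(inbox.receive("client-1", 0, "first edit"))   # releases both, in order
```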

Given the higher network usage of OWebSync, one question is: if the other approaches used the same amount of network, would they obtain equally good synchronization times?

In general, while it's great that the evaluation compares against real, full-fledged systems, it does not quite give a good idea of the fundamental advantage of this style of state-based CRDT versus operation-based CRDT. We don't know if the performance difference is due to implementation quirks of the respective systems. I think the evaluation would be bolstered by some microbenchmarks showing different types of schemes within the same system, if possible.

Synchronizing on a Merkle tree seems very chatty, involving multiple round-trips. In the settings of interest (e.g., a phone on a 4G network connecting to a server), aren't latencies very high due to the hub-and-spoke nature of cell topologies?

In the evaluation, I think a graph that showed network usage versus synchronization latency for different parameterizations would be useful to highlight the different bandwidth/latency trade-off points available.

Overall, the evaluation feels synthetic and limited. It sticks to 100 objects and 1000 objects; what happens at higher scales? Also, clients update objects uniformly randomly. In real applications one would expect to see more skewed use patterns.

I couldn't help thinking that the killer app for this kind of system would be if it worked without a centralized server (or at least, with the server acting only as a communication router). In this case, the server would see only encrypted packets, and you would get a privacy-preserving (from the cloud) Google Docs. But such an approach lends itself much more to operation-based CRDTs.

I liked the description of CRDTs in this paper; it was very clear!

When the discussion of CRDTs comes up in the early part of the paper, I think a motivational protocol-level graph showing the fundamental performance difference between operation and state-based CRDTs would go a long way in motivating this work.

The toy demo app is very nice!

Review D

Overall merit

1. Reject

Reviewer expertise

2. Some familiarity

Reviewer confidence

2. Medium

Paper summary

This paper presents OWebSync, an approach for synchronization among multiple clients in a web-based application. OWebSync uses a variation of state-based CRDTs in order to minimize synchronization time, especially in the case of long disconnections. In particular, OWebSync leverages Merkle trees in order to quickly determine the differences between two large states and to minimize unnecessary network traffic. The evaluation suggests that OWebSync performs similarly to Δ-CRDTs during connected operation, but can achieve lower synchronization time in the presence of lengthy disconnects.

Strengths

OWebSync does achieve its goal of reducing synchronization time in the offline scenario.

Weaknesses

The contribution feels low-level and incremental.
The scope is rather narrow.
The performance evaluation is synthetic.
The results are underwhelming.

Detailed feedback

Thank you for submitting your paper to ATC. Unfortunately, I don't think this paper should be accepted at ATC in its current form. Here are the main reasons behind my decision.

The contribution of the paper feels quite incremental. Your approach is quite similar to many of the approaches that you describe, except that you are using Merkle trees in order to quickly identify differences and minimize synchronization time. This seems more like an implementation-level optimization than a conceptual contribution.

Also, the contribution of the paper is quite narrow. It applies mostly to web-based applications with a large number of clients, all concurrently making a large number of edits to a large number of objects; where the application is using tree-structured JSON objects; and where disconnections are frequent.

I would have preferred to see an evaluation based on real user data. Instead the evaluation is based on a synthetic workload where the clients keep performing updates for 8 minutes at the rather alarming rate of 1 update every second. It is hard for me to imagine that this workload imitates a realistic usage of these applications. Do you really expect users to make updates so frequently? And in fact, would *all* users be making updates at the same time, even during disconnections?

Finally, the results of the evaluation are rather underwhelming. During the “fully online” scenario, Legion is essentially just as good as OWebSync, except with a higher variance. It is only during the disconnected scenario, and especially with a large number of users, that OWebSync shows a sizable performance benefit. Even in this somewhat extreme case, Legion is still within the 10 second limit, though. Not as good as OWebSync, for sure, but still on the verge of acceptable.

Suggestions

I suggest that you rethink how you present your paper. Currently the paper reads more like a low-level performance optimization, rather than a conceptual contribution. I suggest you focus on determining whether there exists an insight behind the mechanism of OWebSync; an insight that could then be leveraged to increase performance.

Your current introduction includes a rather lengthy “related work” paragraph. I suggest that you rethink this paragraph in two ways. First, it is currently going into too much detail about various other approaches, before the reader has acquired enough context to understand these details. Second, this paragraph is clearly too long. I suggest that you aggressively prune the content in this paragraph. You don't need to give all the details here (that's what the “Related work” section is for). All you need to do is give the high-level intuition for why previous approaches fail where you succeeded. When rethinking this paragraph, it would help if you had already identified the critical insight of OWebSync. This would make it easier to distill the essence of why previous approaches fall short of your goal.

Nits

On a few occasions you use "less" where "fewer" should be used.

On page 9, you write [13][14] instead of [13,14].

ATC '19 Paper #181 Reviews and Comments

Paper #181 OWebSync: A web middleware with state-based replicated data types and Merkle-trees for seamless synchronization of distributed web clients
