EuroSys '21 Paper #597 Reviews and Comments

Paper #597 WebLedger: a Byzantine Fault-Tolerant State-Based Ledger for a Decentralized Web without a Blockchain

Submitted: 9 October 2020
Rejected: 20 January 2021

Review A: 3. Weak accept
Review B: 1. Reject
Review C: 3. Weak accept
Review D: 3. Weak reject
Review E: 2. Weak reject
Review F: 1. Reject

Review A

Overall merit

3. Weak accept (OK paper, but I am not enthusiastic)

Novelty

2. Incremental improvement

Writing quality

4. Well-written

Experimental methodology

4. Good

Paper summary

This paper presents Webledger, a browser-based p2p solution that provides a lightweight solution for decentralized web applications. Webledger relies on a leaderless consensus protocol that byzantine resilient. The paper presents the architectures, the protocols and an evaluation with comparisons.

Strengths

Weaknesses

Comments for author

This paper presents a p2p browser-based solution that can be used by clients in sharing economy applications without requiring a complex infrastructure deployement. The paper is well-motivated and provide a list of compelling use cases.

The description of the leaderless protocol is quite dense and it is sometimes hard to extract the novelty of the protocol. Overall this is an interesting and complete architecture.

The evaluation is performed on the loyalty application and provides convincing results over competitors (such as Tendermint). I do not fully understand why two versions of web-ledger are presented while one performs much better than the others.

Review B

Overall merit

1. Reject (Serious problems, I'll argue against this paper)

Novelty

2. Incremental improvement

Writing quality

3. Adequate

Experimental methodology

2. Poor

Paper summary

The paper proposes a lightweight custom BFT protocol for replicating state across a set of browser-based peers/clients. It models state as a set of CDRTs. The replication protocol is state-based meaning that it is basically sharded across different states.

Unfortunately, the new custom BFT protocol is simplistic. Properties (safety and liveness) are not defined - hence the paper is also weak on rigor. Liveness is only probabilistic, even though an eventually synchronous model is assumed (and not a fully asynchronous one), and as such the protocol provides weaker guarantees than classical BFT.

Strengths

Weaknesses

Comments for author

Thank you for submitting to Eurosys.

This opened nicely, reading as a well-written paper. However, as soon as the reader dives into section 3 problems appear. Usually, BFT protocols do not fit into one column and reading the text reveals the issue. This protocol does not provide classical guarantees, notably it only provides probabilistic liveness as oposed to determinsitic liveness even though an eventually synchronous model is used. This is a big issue. It also explains why the protocol fits one column - it is simplistic. Liveness is the key issue why BFT protocols are difficult.

Furthermore, the paper never spells out properties the protocol achieves. This is also a big issue. Consequently, correctness sketch is not useful, since there is no benchmark to asses it (i.e.., protocol specification). On its own, sketch of the proof is too informal and does not give new insights.

These issues alone are enough to reject the paper (at least from Eurosys). One piece of advice to authors is: 1) either try to fix the protocol to guarantee accepted properties (and state them), 2) move to asynchronous model where these properties are acceptable (and state them and prove them correct), or 3) try to argue that practical adversaries allow for your protocol (I would discourage this route)

Further issue is evaluation: authors compare to BFT-Smart and Tendermint, whereas these are not state of the art in BFT protocol performance. These are better found in HotStuff or Mir-BFT. However, the proposed protocol even has higher latency in common case than BFT Smart. Throughput was never compared - since abstract mentions 1 tps for Webledger, I can see why - but hiding this is not good.

FInally why do you use CRDTs? Where are properties of CRDTs exploited in the protocol (hint: nowhere)? Advise: perhaps you can change the protocol to actually exploit them - might make for a better system.

Review C

Overall merit

3. Weak accept (OK paper, but I am not enthusiastic)

Novelty

2. Incremental improvement

Writing quality

4. Well-written

Experimental methodology

3. Average

Paper summary

The paper presents WebLedges, a middleware for decentralized web applications for small community-driven networks. The middleware builds on a novel byzantine fault-tolerant protocol to be used with gossip-based communication. The protocol uses BLS signatures for efficiency. The evaluation shows that the proposed protocol is able to finish transactions in a few seconds.

Strengths

Weaknesses

Comments for author

Although I believe that supporting community-based networks, I find the design of the solution and its implementation unconvincing. The running example, of supporting loyalty points for a group of merchants is interesting. However, having to run a browser to run the supporting software seems a strange solution, prone to errors/faults, such as having the merchant closing the browser by mistake -- why not running some node.js server for running the server?

In the given setting, based on a community network, where the application runs in a browser, I expect the churn to be high by the reasons mentioned before. Furthermore, I would expect that at some points in time, only a fraction of nodes would be active -- a merchant would open the browser when she opens her store, and not all merchants open the store at the same time. The proposed middleware has any mechanism for handling membership changes -- it is not described in the paper? Or the assumptions are different?

In the introduction, the authors points as a drawback of blockchain-based solutions the fact that the full transaction history is maintained. But this fact is by design... in most blockchain system, it would be possible to prune the history of transactions.

The BFT protocol is built for working with a gossip-based synchronization protocol. To this end, the state records the proposals -- this is not much different from storing the list of operations propagated when using a more traditional message-passing based communication. The algorithm resembles a bit the randomized consensus protocol with two steps in each round, but it is different from those. The sketch of the correctness proof helps in the understanding of the algorithm.

I found a bit confusing the example given in the end of subsection 4.1 (i) Public interface subsection. The example sets an owner to the state (or part of the state) by recording in the state the public key of the owner. For executing an operation, the owner would sign the operation, with every node being able to verify that the author. For passing ownership, the owner would sign an operation that sets a new public key as the owner. Is that it?

The evaluation could be improved in a number of directions. First, although I agree that the 99th percentile is an important metric, showing the average is also relevant. The authors could present the average and use bard for showing the 99th percentile in the same graph.

Second, the authors should have evaluated the throughput of the system, as this is important to understand the kind of applications that can be supported by the proposed middleware.

Third, for the failure scenario the nodes fail all at once? In BFT-SMART, does the leader fail -- if not, I would not expect any major impact in the results -- is it possible that the results of BFT-SMART are a consequence of only replica faults for 20 replicas and leader faults for scenarios with more than 20 replicas?

Review D

Overall merit

2. Weak reject (This paper should be rejected, but I'll not fight strongly)

Novelty

2. Incremental improvement

Writing quality

3. Adequate

Experimental methodology

4. Goodt

Paper summary

The paper presents WebLedger, a byzantine-resilient distributed ledger for mobile networks. The main characteristics of WebLedger are:

The authors compare the performance of WebLedger to that achieved by BFT-SMAR and Tendermint.

Strengths

Weaknesses

Comments for author

Thank you for submitting your work to EuroSys. The topic is interesting and timely. I like the fact that you consider practical questions such as how to limit the complexity of the required infrastructure. It is indeed true that one cannot always rely on big servers.

Nevertheless, I have a concern with your work: the motivation is unclear to me. Why do you actually need to design a new BFT protocol?

Additional remarks to improve the paper:

Review E

Overall merit

2. Weak reject (This paper should be rejected, but I'll not fight strongly)

Novelty

2. Incremental improvement

Writing quality

3. Adequate

Experimental methodology

3. Average

Paper summary

The paper presents WebLedger, a middleware for implementing a decentralized Atomic Register, optimized for browser-based applications. The middleware is based on a leaderless consensus algorithm, assuring Byzantine Fault Tolerance. The algorithm assumes partially synchronous network and uses Gossip protocol for broadcast. The system is based on cryptographic signatures schemes to publish the values that are proposed to the consensus and achieves its performance via aggregated signature schemes (BLS). The paper proves the safety and liveness of the protocol, and describes the optimizations made to support the environment. The authors implemented the system in a specific use case, and evaluated the middleware performance over this implementation, as well as comparing to two other middlewares relevant for this use case. Results show that WebLedger with the full optimizations has better latency than one of the compared middlewares (Tendermeint) in a non-byzantine test, and better than the other (BFT-SMART) in the byzantine test. It also has better results in the amount of storage needed for running the middleware.

Strengths

Implementation of a large-scale (over 100) BFT consensus mechanism in a web browser.
Important optimizations achieve good performance in most scenarios tested.
The implementation of the system is optimized to environments low in storage and CPU power (browsers); reaching performance in this kind of environment is not trivial.

Weaknesses

The full implementation and evaluation is technically strong, but the novelty is not well explained, the work seems to mainly combines existing ideas to create a performant system for its specific use case. The model and comparison against previous work should be more detailed.

The full implementation and evaluation is technically strong, but the novelty is not well explained, the work seems to mainly combines existing ideas to create a performant system for its specific use case. The model and comparison against previous work should be more detailed.

Consensus: There are plenty of BFT consensus algorithms, including leaderless solutions. What exactly is the advantage of the one presented here? The related work section provides only limited details. A formal correctness proof is also in order, as these protocols are infamously tricky to get right.

Evaluation:

Can you provide a theoretic analysis of the performance besides the empirical evaluation results?

Related work:
Missing: High-performance blockchain: Bitcoin-NG, Hybrid-Blockchains, Prism, OHIE Byzcoin is mentioned, but the disadvantage compared to WebLedger is not clarified.
Collaborative storage: From Oceanstore to StorJ and IPFS
Tendermint: Reservations from PoS seem to be against its model that assume the majority of currency is held by honest participants. This is not objectively weaker than assuming 3f+1 correct nodes. How does WebLedger compare against the list of protocols mentioned that use randomization?
Avalanche: Is the connection limit actually an issue? It's at 500 for Chromium, for example, which is well above the tested scenarios.
Missing: Teechain state channels

Comments for author

I like this paper and I like the topic, but I would really like to see a real user trial of such a system. There were many papers on CSCW 20 years back and most included not just the middleware scaling eval, but also some HCI.

Review F

Overall merit

1. Reject (Serious problems, I'll argue against this paper)

Novelty

2. Incremental improvement

Writing quality

2. Needs improvement

Experimental methodology

1. Unacceptable

Paper summary

This paper presents WebLedger, a client-driven, leaderless BFT consensus protocol. The goal of WebLedger is to enable local collaboration (e.g. micro-loans or goods exchange) in the absence of trust among the participants. WebLedger is evaluated compared to other BFT systems: BFT-SMART and Tendermint.

Strengths

Weaknesses

Comments for author

Thank you for submitting your paper to EuroSys 2021. While I find the end goal of completely decentralized transactions an intriguing one, I do not think that the current manuscript should be published at EuroSys. Below I list my concerns, including motivation, presentation, novelty, and evaluation.

The paper is motivated by a Byzantine Fault Tolerant version of a fully decentralized environment where individuals can conduct transactions with their peers without relying on any centralized service. But this does not answer the question: _why_ is that environment preferable to a centralized one? The decades-long experience with the cloud concept has shown that services are significantly faster, more efficient, and easier to use when delegated to a company that can consolidate computation. This is particularly important in the presence of Byzantine failures: it is one thing to assume that at most f out of 3f+1 machines will be compromised in a well-protected datacenter environment; and quite another if the participating machines are everyday devices, which can be compromised with ease.

My second concern is that the protocol description is very brief and very hard to parse. The entire protocol description takes one page, including the consensus part and the data synchronization. This makes it almost impossible to fully understand its subtleties, if any.

Furthermore it is hard to pinpoint what, if any, is the intellectual contribution claimed by this protocol. This stems partially from the lack of detail and partially from the lack of context. It would be nice if various aspects of the approach were contrasted with the state of the art, or at least with traditional approaches in BFT, so that the reader can appreciate where the novelty lies and what its magnitude is.

Finally, the evaluation is not at the level expected at a conference like EuroSys. First, the performance is evaluated using micro-benchmarks. Second, Figure 5 reports the 99th percentile of latency only. I agree with the authors that the 99th is important, but this does not mean that you cannot plot both the mean _and_ the 99th percentile. There are several ways in which you can depict both in the same graph. Additionally, this figure does not show the relation between latency and throughput, making it hard to judge whether this latency is achieved at saturation (and is thus less meaningful) or it is achieved at a lighter load. Overall, the evaluation does not explicitly measure throughput at all; the closest thing being the network bandwidth reported in Figure 6.

On top of missing data, the evaluation is also missing a clear message. What should the reader take away from Figure 5? WebLedger is outperformed by one approach or another in either scenario and it is hard to use this information to conclude that WebLedger is indeed an advancement on the state of the art.

Author Response

We would like to thank the reviewers for their detailed feedback on our paper.

Reviewers B, D and F assessed the contributions of the paper based on performance against state-of-the-art BFT protocols, while our focus of the paper is on the lightweight browser-based middleware architecture for consensus, driven by our use cases from decentralized communities. Ordinary merchants on a farmer's market might own a mobile device and have access to a web browser. However, owning a NodeJS server or other server-side infrastructure is still not obvious in many countries. Network failures and short interruptions are omnipresent in these scenarios.

The type of paper we focused on is thus rather a system architecture and middleware paper for that type of cases and context. From that point of view Reviewer A also acknowledges the completeness of the architecture and sufficient evaluation. The focus of the paper is not on the protocol, however, for completeness we described the protocol and a proof sketch. Our type of paper is not unlike two other relevant papers: [19] and [82].

Given this motivation context, the focus of our evaluation is on testing the upper limit of the number of users, given an acceptable throughput for our use cases. We did not focus on finding the upper limit of the throughput with a more limited number of users. In our evaluation we show that WebLedger can achieve latencies similar to state-of-the-art protocols in the environment driven by our use cases, despite running in a browser. In harsher conditions, with up to 25% of the replicas failing (e.g. loosing their network connection, something that often happens in mobile environments), WebLedger-BLS outperforms the other systems we compared to. While it is true that HotStuff is an improvement in throughput and scalability compared to BFT-SMART, it still has similar latencies, which is the focus of our use cases, and therefore this paper.

Reviewer B states that the properties of the protocol (safety and liveness) are not well defined. We agree that we did not explicitly name these properties, but in Section 3.4 we define our safety and liveness property in plain words. We agree that we could have specified this more precise and formal. Our safety property might be called non-divergence in some literature, while our liveness property can be referred to as termination.

Reviewer B also questions our trade-off of only having probabilistic liveness, in return for a much simpler protocol. Only having probabilistic guarantees even in a weakly synchronous system is not new. Algorand [32] and Avalanche [73] both guarantee only probabilistic safety. We opted for probabilistic liveness, which makes for a simpler and more understandable protocol, with easier to prove properties. The simplicity of the protocol is part of our contribution, but this of course comes with trade-offs tailored specifically to our use cases.

Reviewer B asks where we use CRDTs. CRDTs are used in the data synchronization of the actual data, the new proposals, and the votes (explained in Section 3.3). We use CRDTs here as they are more efficient and robust against failures, than simply broadcasting changes. Each set of votes is an Observed-Removed Set CvRDT, and all these state-based CRDTs are efficiently synchronized with Merkle-trees, using an approach similar to Merkle Search Trees [10].

Reviewer A asks why we presented two versions of WebLedger. We evaluated a version with BLS signatures, and one with classical elliptic curve signatures, to show the performance impact of switching to a more advanced signature schema such as BLS.

Reviewer C, D and E ask to explain the results of BFT-SMART in the failure scenario. It is indeed true that a failure of a single replica does not affect BFT-SMART much, however when the leader crashes, a new leader will need to be elected out of the resulting nodes, during which no transactions can be approved. BFT-SMART is unable to handle large network sizes when the latency between the nodes is higher than usual, e.g. in geo-distributed systems or on mobile networks. This has been shown in the literature before [19].

--
[10] A. Auvolat and F. Taïani, "Merkle Search Trees: Efficient State-Based CRDTs in Open Networks," 2019 38th Symposium on Reliable Distributed Systems (SRDS), Lyon, France, 2019, pp. 221-22109.
[19] L. Bonniot, C. Neumann and F. Taïani, "PnyxDB: a Lightweight Leaderless Democratic Byzantine Fault Tolerant Replicated Datastore," 2020 International Symposium on Reliable Distributed Systems (SRDS), Shanghai, China, 2020, pp. 155-164.
[32] Yossi Gilad, Rotem Hemo, Silvio Micali, Georgios Vlachos, and Nickolai Zeldovich. 2017. Algorand: Scaling Byzantine Agreements for Cryptocurrencies. In Proceedings of the 26th Symposium on Operating Systems Principles (SOSP '17). Association for Computing Machinery, New York, NY, USA, 51-68.
[73] Team Rocket, Maofan Yin, Kevin Sekniqi, Robbert van Renesse, and Emin Gün Sirer. 2019. Scalable and Probabilistic Leaderless BFT Consensus through Metastability. arXiv:1906.08936
[82] Albert van der Linde, Pedro Fouto, João Leitão, Nuno Preguiça, Santiago Castiñeira, and Annette Bieniusa. 2017. Legion: Enriching Internet Services with Peer-to-Peer Interactions. In Proceedings of the 26th International Conference on World Wide Web (WWW '17). International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, CHE, 283-292.

Comment @A1 by Reviewer A

Thank you for submitting your work to EuroSys'21. The paper was discussed between the PC members. Despite the interest of the approach, some concerns were raised regarding the lack of motivation, the novelty and the experimental evaluation as reflected in the reviews.