CCS '22

Review A

Overall merit

2. Weak reject

Reviewer expertise in the topic domain

3. Knowledgeable (I don't necessarily work in this topic domain, but I work on related topics or I am cognizant of the work in this domain in recent years)

Reviewer confidence

3. Quite confident

Paper summary

This paper considers a challenging deployment scenario of fault-tolerant consensus protocols. In its setting, a set of low-profile devices are connected via a network that is neither fully-meshed nor stable. To tackle hurdles in the new setting, the paper proposes MobBFT. It states that MobBFT is applicable for low-profile devices (e.g., web browsers) with unstable network connections, and also claims that MobBFT realizes both safety and liveness in an asynchronous network.

Strengths / Reasons to accept

It motivates an interesting question. How to deploy consensus among low-profile machines over the unstable network environment is a meaningful question, and it was largely unexplored.

The authors present a series of engineering optimizations. Many of them (e.g., use BLS signature to compress storage, store the authenticated states instead of the whole transaction log) make sense to me, though not surprising.

It conducts experiments up to 100 nodes and provides comparisons between existing partially-synchronous protocols such as Tendermint and BFT-SMaRT.

Weaknesses / Reasons to reject

A main concern I have is whether the security/threat model really captures the challenging deployment scenario that the authors aim at, cf. my detailed comments.

The paper does not well justify why their engineering-level optimizations are non-trivial. For example, using BLS signatures for space and communication efficiency is a popular technique. Also, it is well known to let the lightweight devices store authenticated states instead of the whole ledger, in particular, Ehtereum's light clients have implemented this trick a long time ago.

Also the liveness proof is not clear to me. I fail to follow the proofs to understand why an sophisticated asynchronous adversary cannot schedule the order of message deliveries to prevent termination. (Do I miss anything here?)

Constructive comments for author

Here is why I think the security/threat model fails: the standard asynchronous network model implicitly assumes that the honest node can resend messages that are likely failed to reach the destination until receiving an ACK from the receiving node (e.g., analog to the retransmission in TCP protocols). Clearly, this assumes that an honest node can stay on-line and remember what messages are under transmission in its memory. However, the authors claim a very different setting where the clients are lightweight and resource-starving devices (e.g., browsers). These devices, even if they are honest, can frequently go off-line and forget what messages failed to deliver, so they cannot eventually deliver messages. It is really unclear to me why the standard asynchronous model can still capture this more harsh application scenario (where the honest nodes might fail to deliver some messages). One fix that I can imagine is: before sending any message, persist the messages in the local database, such that once the node recovers from a suspension, it can resend the message to ensure its eventual delivery. However, I don't see any discussions on the fatal issue.

The authors try to design some asynchronous fault-tolerant protocols that ensure *both* safety and liveness in the presence of arbitrary message delay. But due to my basic sanity check, their protocol likely fails to do so. Recall that the seminal FLP impossibility [FLP85] already stated: any asynchronous fault-tolerant consensus must rely on some randomized execution, but in this paper, the only random execution is gossipping (which essentially can be abstract by a deterministic network diffuse functionality), so it fails to pass such a basic sanity check. Anyway, I do not doubt their design can be tuned to work in the partially synchronous model [29]. If their best-possible end result is just partially synchronous (because safety holds in the presence of asynchrony but liveness relies on synchrony), please tune the claims down, as our community currently tends to call PBFT/BFT-SMaRT and similar protocols asynchronous-safe or partially-synchronous (instead of asynchronous).

P.g. 3, the authors use [29] as the reference to the asynchronous network model, which is not precise. [29] is the seminal work to initiate studies in the partially-synchronous model, which lies between the asynchronous and synchronous settings. For some reference related to the origin of asynchrony, please consider [FLP85], [Rabin83], [Benor83], [Bracha87].

P.g. 3, minor precision issue: computationally bounded adversary *can* forge signatures or find hash collisions with all but negligible probability. We say that it is *infeasible* (not say “cannot”) to break sig and hash.

P.g. 5, the trick of skipping the verifications for BLS signature shares is well known. This trick is not a literal fast-path. Usually, a fast-path, in fault-tolerant consensus, represents an optimistic execution flow that exchanges fewer messages and does not run the entire protocol (so it might save some rounds of interactions). Here is just delaying signature verifications. Typical fast-path is the one in SPBF [35], where some round of interaction can be saved under certain optimistic conditions.

P.g. 7, why not assume the signing key shares are from secret sharing? If the setting is permissioned, it might be possible to consider a dealer or a DKG protocol to set up threshold BLS, which might further reduce storage or save computations.

P.g. 9, it would be better to compare with the cutting-edge asynchronous consensus protocols. These protocols are naturally leaderless, and thus perform much more robustly than Tendermint and BFT-SMart.

P.g. 11, “HoneyBadger [62] also uses threshold signatures [78] for censorship resilience.” => This is not correct. [62] used threshold encryption (not signature) for censorship resilience.

[FLP85] M. Fischer, N. Lynch, M. Paterson. Impossibility of Distributed Consensus with One Faulty Process. JACM 85. (originally published in PODS 83) [Rabin83] M. Rabin. Randomized byzantine generals. FOCS 83. [Benor83] M. Ben-Or. Another advantage of free choice. PODC 83. [Bracha87] G. Bracha. Asynchronous Byzantine agreement protocols. Information and Computation 87.

Does the paper raise ethical concerns?

1. No

Review B

Overall merit

2. Weak reject

Reviewer expertise in the topic domain

4. Expert (I've worked on this topic domain or a closely related topic domain)

Reviewer confidence

3. Quite confident

Paper summary

The paper proposes MobBFT, a browser-based architecture for implementing secure decentralized applications centered around clients. Contrary to previous works, the approach is meant to be lightweight, employing techniques commonly used in weakly consistent systems like CRDTs and gossip protocols. The approach was evaluated considering a prototype of a community-based loyalty points service.

Strengths / Reasons to accept

The overall idea seems interesting, novel, and well-motivated: there are several frameworks like these that tolerate crash faults. It is only natural this could be extended for (active) adversarial scenarios.

Weaknesses / Reasons to reject

The distributed computing abstractions are not defined in a precise way. Therefore, it is not possible to assess the correctness of MobBFT.

Evaluation centers only on one application scenario, and not on the technique itself.

Constructive comments for author

I have two main issues with this paper. The first one is related to the distributed computing abstractions employed in the design.

MobBFT seems to provide an abstraction of shared memory (a KV store, or, more precisely, a collection of independent registers). Put aside the fact this abstraction is not strong enough for implementing coordination (which might or might not limit its usefulness), BFT registers can be easily implementable in asynchronous systems by using, for example, one of the protocols in [A] (see below). Further, a register is an abstraction weaker than a consensus object, so you don't need consensus for implementing a register. Nonetheless, if you use consensus, you can implement a register with local reads, but only satisfying sequential consistency (not atomicity/linearizability). This leads to several possible issues with the paper:

you claim to implement atomic registers. This is not true, because you cannot do local reads in atomic (or linearizable) registers.
This notion is even more confusing because it appears you provide some kind of eventual consistency (although I'm not sure about that). In this case, atomic registers don't even make sense. Probably you are implementing something as defined in [B].
Then it is said you have a new consensus protocol. Consensus is a well-studied problem, classically defined by properties like validity, agreement, and termination. MobBFT properties resemble that, but are defined in an ad-hoc way, and not in terms of proposals and decisions.
The consensus protocol is described (in Algorithm 2) in an unclear way (at least for me). In particular, because it is not clear how messages are exchanged. Is it just reads and writes on the votes vector?
Perhaps the most important problem in the paper is that it is not clear (1) what is new in your consensus protocol, and (2) how it overcomes the FLP impossibility. Regarding (2), on page 5 it is said "the set of possible values to vote on gets smaller with every round, and eventually the view of all the replicas on the votes in round 0 will become the same, and the winning value can be chosen unanimously." I can't see why this is true for any possible message schedule and the proof of Lemma 6.3 isn't convincing for me.

My second concern is related to the evaluation.

I really liked the fact you compared the work with more traditional consensus protocols (BFT-SMaRt and Tendermint), but it would be good to also compare it with existing frameworks that do not tolerate Byzantine behavior. This is important to assess the cost of such additional security.
The evaluation is done using a single application. The imposed load (1 tx/sec) is ok for this application but not enough to saturate the service. It would be important to evaluate more stringent and general scenarios, possibly using micro-benchmarks.
The WebRTC server is a single point of failure. The paper needs to provide a bit more details about how to overcome this in case the adversary controls such a server.
what is the size of the transactions?
Why BFT-SMaRt is better in Figure 5? Doesn't the gossip network make communication more efficient?

Other comments:

p2: The only contribution in your list seems to be the first item. The others are features of your protocol, some of them are not novel at all.

p3: Cosmos is a blockchain network, not a blockchain.

p3: The fact your set of replicas is fixed is a limitation of the protocol. It is important to specify how the set could change, even if the system needs to stop for it. Notice, however, existing systems like BFT-SMaRt already provide reconfigurations, which seems important for the type of systems you want to serve.

p7: Are there any novelty in Section 4.2, or is it just an application of existing techniques?

p9: HMACs cannot be used to sign requests, but to authenticate them.

p11: In the related work, it is stated that MobBFT supports the propagation of votes over multiple hops. This is hardly novel (see [C]), and can be applied in any protocol.

p11: One paper that does something similar (lightweight blockchain for weak clients) is [D]. The techniques seem completely different.

References

[A] Malkhi & Reiter. Byzantine Quorum Systems. Distributed Computing. 11. 1998.

[B] Serafini et al. Eventually Linearizable Shared Objects. ACM Symp on Principles of Distributed Computing. 2010.

[C] Li et al. Gosig: a scalable and high-performance byzantine consensus for consortium blockchains. ACM SoCC. 2020.

[D] Satija et al. Blockene: A High-throughput Blockchain Over Mobile Devices. OSDI'20.

Questions for authors' response

Which consistency model do you want to support for your shared datastore?

Are you solving consensus? If yes, how do you overcome the FLP impossibility?

Does the paper raise ethical concerns?

1. No

Concerns to be addressed during the revision/shepherding

Abstractions (atomic register, consensus, etc.) need to be properly defined.

Evaluation needs to be extended to better exercise the protocol.

CCS '22 Paper #272 Reviews and Comments

Paper #272 MobBFT: a Client-centric State-based BFT Platform for Decentralized and Resilient Community Web-Apps

Review A

Overall merit

Reviewer expertise in the topic domain

Reviewer confidence

Paper summary

Strengths / Reasons to accept

Weaknesses / Reasons to reject

Constructive comments for author

Does the paper raise ethical concerns?

Review B

Overall merit

Reviewer expertise in the topic domain

Reviewer confidence

Paper summary

Strengths / Reasons to accept

Weaknesses / Reasons to reject

Constructive comments for author

Questions for authors' response

Does the paper raise ethical concerns?

Concerns to be addressed during the revision/shepherding