Byzantine Fault Tolerance in Distributed Systems: A Literature Review

Author's Information:

Okoli. Chinonso Johnson

Department of Cybersecurity

ORCID : orcid.org/0009-0001-1168-5434

Vol 02 No 07 (2025):Volume 02 Issue 07 July 2025

Page No.: 390-396

Abstract:

Byzantine Fault Tolerance (BFT) underpins reliable consensus in arbitrary, faulty, and malicious distributed systems. The early BFT algorithms, based on the Byzantine Generals Problem, incur unnecessarily high communication overhead, limiting scalability. This review critically analyzes key milestones in BFT development, including Practical Byzantine Fault Tolerance (PBFT), where the authors introduced a primary-replica model that simplifies communication complexity. However, due to quadratic messaging, PBFT remains relatively unscalable, prompting proposals for asynchronous protocols like HoneyBadger BFT, which trade low latency for robustness. Blockchain consensus protocols, such as Nakamoto Proof-of-Work, Tendermint, and HotStuff, incorporate BFT principles to balance security, finality, and energy efficiency. Beyond blockchain, BFT is crucial in securing distributed machine learning against adversarial attacks. Challenges like scalability, energy consumption, and emerging security threats persist. Current trends focus on hybrid consensus mechanisms, sharding, and using AI to detect anomalies, aiming to enhance BFT’s effectiveness in large-scale, resource-constrained distributed systems.

KeyWords:

Byzantine, Byzantine Fault, Tolerance, Distributed, Systems, Distributed Systems

References:

  1. Benčić, F., Gramoli, V., & Sergot, M. (2023). The limits of BFT scalability: A quantitative analysis of modern protocols. ACM Computing Surveys, 55(4), 1–34. https://doi.org/10.1145/3533414
  2. Cachin, C., & Vukolić, M. (2017). Blockchain Consensus Protocols in the Wild. arXiv preprint arXiv:1707.01873https://arxiv.org/abs/1707.01873
  3. Cachin, C., Guerraoui, R., & Rodrigues, L. (2011). Introduction to Reliable and Secure Distributed Programming (2nd ed.). Springer. https://doi.org/10.1007/978-3-642-19346-6
  4. Castro, M., & Liskov, B. (1999). Practical Byzantine Fault Tolerance. Proceedings of the Third Symposium on Operating Systems Design and Implementation (OSDI ’99), 173–186. https://www.usenix.org/legacy/events/osdi99/full_papers/castro/castro.pdf
  5. Castro, M., & Liskov, B. (2002). Practical byzantine fault tolerance and proactive recovery. ACM Transactions on Computer Systems (TOCS)20(4), 398-461.
  6. Danezis, G., Gailly, N., Jakovljevic, I., & Sonnino, A. (2022). Narwhal and Tusk: Consensus from data dissemination. Proceedings of the 19th USENIX Symposium on Networked Systems Design and Implementation (NSDI), 47–63.
  7. Danezis, G., Li, F., & Teague, V. (2020). Narwhal and Tusk: A DAG-based Mempool and a High-Throughput Consensus. Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security, 133–150. https://doi.org/10.1145/3372297.3417242
  8. DeCandia, G., Hastorun, D., Jampani, M., Kakulapati, G., Lakshman, A., Pilchin, A., ... & Vogels, W. (2007). Dynamo: Amazon’s Highly Available Key-value Store. Proceedings of the Twenty-First ACM SIGOPS Symposium on Operating Systems Principles (SOSP '07), 205–220. https://doi.org/10.1145/1294261.1294281
  9. Dolev, D., Tzachar, N., & Zuck, L. (2022). Scaling Byzantine Fault Tolerant Consensus: Challenges and Solutions. ACM Computing Surveys, 55(6), 1–34. https://doi.org/10.1145/3491097
  10. Dwork, C., Lynch, N., & Stockmeyer, L. (1988). Consensus in the Presence of Partial Synchrony. Journal of the ACM, 35(2), 288–323. https://doi.org/10.1145/42282.42283
  11. Lamport, L., Shostak, R., & Pease, M. (1982). The Byzantine Generals Problem. ACM Transactions on Programming Languages and Systems (TOPLAS), 4(3), 382–401. https://doi.org/10.1145/357172.357176
  12. Li, Q., Huang, J., Ding, Y., & Gao, C. (2021). Byzantine-robust Federated Learning: A Comprehensive Survey. IEEE Transactions on Neural Networks and Learning Systems, 32(6), 2289–2308. https://doi.org/10.1109/TNNLS.2020.3029170
  13. Lin, Y., Zhao, T., & Wang, C. (2023). Byzantine-robust federated learning: Advances and open challenges. IEEE Transactions on Neural Networks and Learning Systems, 34(1), 12–30. https://doi.org/10.1109/TNNLS.2022.3157411
  14. Malkhi, D. (2019). Byzantine Fault Tolerance. In C. Cachin & M. K. Reiter (Eds.), Encyclopedia of Cryptography and Security (2nd ed., pp. 185–188). Springer. https://doi.org/10.1007/978-1-4419-5906-5_27
  15. Malkhi, D., & Reiter, M. K. (1998). Byzantine Quorum Systems. Distributed Computing, 11(4), 203–213. https://doi.org/10.1007/s004460050044
  16. Miller, A., Xia, Y., Croman, K., Shi, E., & Song, D. (2016). The Honey Badger of BFT Protocols. Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, 31–42. https://doi.org/10.1145/2976749.2978390
  17. Wan, Q., Zhang, H., & Luo, X. (2023). Lightweight BFT consensus protocols for edge computing: A review and taxonomy. IEEE Internet of Things Journal, 10(2), 1253–1272. https://doi.org/10.1109/JIOT.2022.3187890
  18. Wang, Z., Li, J., & Tang, J. (2023). Advances in Byzantine Fault Tolerant Protocols for Large-Scale Distributed Systems. IEEE Transactions on Network Science and Engineering, 10(2), 715–727. https://doi.org/10.1109/TNSE.2023.3248571
  19. Xie, C., Koyejo, O., & Gupta, I. (2020). Byzantine-robust Federated Learning through Adaptive Model Aggregation. Proceedings of the 37th International Conference on Machine Learning (ICML 2020), 11558–11567. https://proceedings.mlr.press/v119/xie20c.html
  20. Xie, C., Koyejo, O., & Gupta, I. (2022). Fall of empires: Breaking Byzantine-tolerant federated learning via poisoned leaders. Proceedings of the 39th International Conference on Machine Learning (ICML), 24524–24545.
  21. Yin, M., Malkhi, D., Reiter, M. K., Gueta, G. G., & Abraham, I. (2019). HotStuff: BFT consensus with linearity and responsiveness. Proceedings of the ACM Symposium on Principles of Distributed Computing (PODC), 347–356. https://doi.org/10.1145/3293611.3331591
  22. Zhou, S., Yu, Y., & Lin, C. (2020). Blockchain consensus protocols: A survey of performance and energy consumption. Journal of Parallel and Distributed Computing, 138, 1–14. https://doi.org/10.1016/j.jpdc.2019.11.010