Harmony: Open Consensus for 10B People
Total Page:16
File Type:pdf, Size:1020Kb
Harmony: Open Consensus for 10B People Harmony builds an open marketplace at Google-scale for the decentralized economy. This project aims to provide a consensus protocol over the open Internet at 10 million transactions per second with 100-millisecond latency and at most 0.1% fee. Harmony’s goal is to be 1,000+ times faster and cheaper than Bitcoin and Ethereum. We are rebuilding the decentralized economy with 10x innovations in all components: transport network (Google’s UDP, Bloom tables, 5G mobile), consensus protocol (Byzantine committees, acyclic graphs, monopolist fees), and system tooling (unikernels, multi-core in Rust, zero-copy streaming). We believe communication is the key to the future of humans and machines to create in harmony. As the gateway for microtransactions or online business, our fee must be at most 0.1% of the transaction value to support new marketplaces of metered content or fractional work. As the infrastructure for the world’s data firehose, our bandwidth must scale to 10M tx/sec to support data from supply chain IoT devices or energy grids. Yet, all of the above must settle agreements within 100 milliseconds to support instant reactions for autonomous robots or on-chain quotes in exchanges. We employ technical innovations that are already proven in research and implementation. For example, Google’s UDP currently powers 35% of its traffic (or 7% of the Internet) with 50% latency improvement, OmniLedger Byzantine protocol benchmarks to 13,000 tx/sec and 1.5 sec latency with 1,800 hosts, while unikernels in Rust archives 10M concurrent connections on a standard 96-core machine on Amazon Cloud. The rest of this document describes our technical architecture and company roadmap. Sign up for Harmony! See our talk (simple-rules.com/talk) or our GitHub for the compiler prototype. Our extended team (part time) consists of four Ph.D.s, 3 Ex-Google, 2 Ex-Apple, graduates from Berkeley, CMU, Waterloo, Penn and Harvard. About: Stephen Tse coded in OCaml for 15+ years and graduated with a doctoral degree from the University of Pennsylvania on security protocols and compiler verification. He was a researcher at Microsoft Research, a senior infrastructure engineer at Google, and a principal engineer on search ranking at Apple. He founded the mobile search Spotsetter, a startup Apple later acquired. He may be reached at [email protected]. Harmony: 为百亿人打造的开放式去中心化共识协议 让我们建立一个Google数据级别的开放市场平台,成为去中心化经济系统的顶梁柱。 Harmony的目标是,在开放的互联网上,提供一个去中心化的共识协议,做到每秒处理 千万条交易和数据,保证100毫秒左右的延迟,同时让每笔交易的手续费低于0.1%。 Harmony会做到比世界上最领先的区块链网络还要快1000倍,便宜1000倍,比如和 Bitcoin和Ethereum比起来。我们会通过10倍的技术创新,重构去中心化的经济体系下的 所有层面:传输层(Google的特制UDP协议, 布隆表格,以及5G移动网络),共识协议 层(拜占庭委员会机制,无环图,垄断费),以及系统工具(Rust下Unikernel单内核并 行计算,和零拷贝数据流)。 我们相信在未来,通信与交互 是人与机器融洽相处的关键。和谐与一致,即是Harmony 的内在含义 。要成为一个支持微交易的电子商务网关平台,我们收取的交易费必须低于 0.1%,才足够支持(大数据时代)市场下簇生的新兴交易模式,比如对信息内容的量化 ,以及工作内容的微分。作为基础设施,我们必须像是水电网络一样,为全世界的数据提 供稳定可靠的通道。正因为如此,我们的带宽必须能够扩展到至少每秒千万条交易的级 别。这样,我们才能够支持那些由供应链、物联网(IoT)、能源网生成的数据。在这基 础上,我们还要保障每条交易从握手达成,到被整个网络认证,必须在100毫秒内完成, 这样,我们才可以支持那些需要实时处理能力的应用,比如全自动机器人,交易所报价。 在技术方面,我们只会采纳那些已通过考验的革命性技术。这些技术都已经历过长期大范 围的研究、开发、和测试。举例来说,Google的特制UDP协议,现在已经处理了Google 本身超过35%的流量。这相当于整个互联网7%的流量。从实际数据上看,这个协议至少 降低了用户50%的网络延迟。OmniLedger上的拜占庭协议,可以处理超过每秒13,000条 交易,在1800个主机下运转,只有1.5秒的网络延迟,Rust下单内核技术,则可以在只用 一个亚马逊云端96核的主机的情况下,并行同时处理一千万个网络连接。 Stephen Tse 谢镇滔 (微信 “tsestephen”) 自高中年代起,便一直着迷于编译器和 通信协议方面的研究。他曾反编译过 ICQ 和 X11 的通讯协议,并已用 OCaml 语 言编程长达十五余年。他博士毕业于宾夕法尼亚大学,专注于研究安全通讯协 议,以及编译器校验方面的技术 。 Stephen 曾在微软研究院总部任职研究员,在Google总部任职高级软件工程师 ,负责基础架构方面的项目,并在苹果公司总部任职主任工程师,主导搜索排 序方面的工作。他曾创立 一个专注于移动搜索的公司 Spotsetter,并被苹果公 司收购。Stephen 还是一个前 Google 员工的硅谷创业者每周私人聚会的创办人和组织者(TGI-$ — 大口喝酒,大谈机器学习和区块链 )。 Architecture and Innovations 4 1. Transport Network 4 2. System Tooling 5 3. Consensus Protocols 6 Protocols and Optimizations 7 1. High-Performance Protocols 7 2. Principles for Scaling 9 3. Model for Attacks 9 Locations and AI 10 1. Location Oracles 10 2. Decentralized Maps 10 3. AI Data Marketplace 10 Contracts and Beyond 11 1. Formal Verification vs Hacks 11 2. Language-based Security 12 3. Fairness and Efficiency 13 Team and Collaborators 15 1. Passion and Team 15 2. Expertise and Collaborators 17 Milestones and Roadmap 19 1. Achievements and Milestones 19 2. Roadmap for Launch 19 Tokens and Sales 20 1. Token Model and Incentives 20 2. Planned Use of Proceeds 21 Questions and Answers 21 1. Frequently Asked Questions 21 2. Further readings 23 Appendix: Min Language for Contracts 24 Architecture and Innovations “ Bottlenecks were neither the protocol nor the infrastructure. The bottlenecks were in the implementation of the protocol. ” Bitcoin Unlimited at Stanford We argue that all aspects of a consensus protocol are critical for scaling beyond 1,000x. Our approach is to productionize research innovations to the scale that serves billions of people and devices. Currently most decentralized applications are limited to infrequent transfers of currencies or tokens. Harmony’s high performance will make many real-world decentralized applications practical. Later in this document, we will describe many use cases unique to Harmony, including location oracles, decentralized maps, and AI data marketplaces. Let’s start with our technical differentiation. We seek and master innovations already proven in practice. For Harmony, we focus on the following key components: transport network, consensus protocol, and system tooling. Our team has extensive experience building systems at the largest scale in the top tech companies. We will also show a prototype of our own contract language and compiler in the later sections. 1. Transport Network As discussed in Advances in block propagation by Greg Maxwell, 12% of network bandwidth is consumed by synchronizing blocks among nodes. More critically, 2.5 round trips are needed to confirm transactions. Maxwell’s study concludes that relay engines and template deltas are important research directions to improve the propagation performance, which Harmony follows closely. We base our network stack on Google’s UDP, called QUIC, which accounts for 88% of Chrome traffic (35% Google’s traffic or 7% of Internet traffic). QUIC improves multiplexing over sessions and achieves zero round-trip latency. The multiplexing saturates the network when synchronizing multiple resources, hence drastically improving bandwidth. It also eliminates many session handshakes during connection setup, which are slow and computationally expensive. Furthermore, broadcasting without round trips makes data synchronization robust to network changes and unavailability. The research paper The QUIC Transport Protocol: Design and Internet-Scale Deployment [talk] covers many technical details and deployment insights for QUIC implementations. Another independent study Taking a Long Look at QUIC: An Approach for Rigorous Evaluation depicts Google’s UDP as the future of the transport layer. An important technique we use for transaction propagations is Bloom tables in set reconciliation and coding. Like the template deltas for Bitcoin mentioned above, Bloom tables enable communication between shards in constant size O(1), rather than proportional to the total transaction size O(n). Graphene employs a similar approach, described in What's the Difference? Efficient Set Reconciliation without Prior Context, to huge improvement. Lastly, we design with 5G mobile network in mind, namely 1ms latency, 1M connections/km², and 10Gbps throughput, as described by Huawei 5G vision. This upcoming transport network supports 100B connections all over the world; their initial tests are already deployed in Japan and Europe. 2. System Tooling One principle guiding our work is to actively seek novel architecture and to use optimal languages. Bitcoin Unlimited recently argues the same point of an integrated approach and the quality of implementation. Its talk Measuring maximum sustained transaction throughput at Stanford emphasizes proper parallelization and eliminating locks through Bitcoin’s system design. The technical blog post Terabyte blocks for Bitcoin Cash also argues the importance of an optimal backend architecture. It claims the economic feasibility of 7M tx/sec at 10-min latency. More concretely, A roadmap for scaling Bitcoin Cash discusses many proposals for systems optimization: memory map, parallelization, and indexing unspent outputs. We take an extreme view on eliminating the bottlenecks in our protocol implementation. Our inspiration comes from Mosaic: Processing a trillion-edge graph on a single machine (talk), which achieves 59x speedup on the most challenging distributed benchmarks, including PageRank and community detections. Its novel innovation is to use Hilbert-ordered tiling scheme for locality, managing to process graphs with 1 billion nodes and 1 trillion nodes on a single host. Mosaic runs its benchmarks on standard machines with 244 cores and 850K IOPS, which are readily available on Amazon cloud at a low cost. The most severe overheads of parallel processing are locks and garbage collections. Typically, the more processing cores in the machine, the more contention for the same resources, which are marked mutually exclusive using locks. Similarly, the more memory in the machine, the longer it takes to scan for free space among all data structures. We