A Decentralized Blockchain Oracle

arXiv:1808.00528v1 [cs.CR] 1 Aug 2018 rvd.Ti a h oeta otr rce nocentral so int into do paper oracles this they turn concern, but to this blockchain address potential To the the points-of-failure. onto has ge blockchains This it that provide. e guarantees bring o security to robust is to the attest that without order oracles data in called on operate entities data Trusted only they blockchain. namely the contract limitation, Smart major contracts.” “smart a tha called programs blockchain of the execution on n wh including to applications network extended other been peer-to-peer ous since has a It in double-spending. preventing transactions monetary process r eddt tett at na fott rn external bring to effort an Generall blockchain. in blockchain a of facts the mechanism called to consensus on entities the attest exists into trusted data to it such, As that needed upon. fact are agreed validit the be the only on can data, consensus such reaching the networ support of wrote The natively memory. Ethereum machine’s not of virtual does the user into a data by relevant is initiated that computation data computations prior on these in operate However, example, only computations. outcome For can of the 7]. sequence on consensus a [6, reach of can world” network “outside the [5] the Ethereum with interact to hones behave equilibriu to forced Nash are a players rational conditions all arbitrarily those where manipulation exists under make this that Further, to system. demonstrates the to set for feature able be the desirable difficult—a being can measure shows formal funds parameters that to decision, bounded a same system oracle’s with presents the the manipulate adversary also successfully behind an paper wi parameters of play This probability the to accuracy. of required of are analysis degree they high so certifi a while role manipulation high-risk/high-reward adversarial i a to fall resistant low-risk/low-re Players is a propositions. play that Voters of certifiers. and falsity voters or roles: truth the decides xctdsae ngnrlzdcmuain 5,admore. o and sequence [5], the computations [4], generalized network in peer-to-peer states a the executed in as responsi storage the such 3], file items [2, for on transactions that monetary years consensus way private a recent reach of in order In to domains transactions. parties other such such to of of extended enables set been set has a a concept of this allowed order cons the reach it spending on to application, sus entities double original mutually-distrusting the and its anonymous solving In usin currency while [1]. digital problem network a of peer-to-peer form a the in transactions monetary A STRAEA Abstract ao iiaino lccantcnlg sisinabilit its is technology blockchain of limitation major A handle to developed originally was technology Blockchain voting oracle, Ethereum, blockchain, Keywords: onAdler John A eetaie rcebsdo oiggm that game voting a on based oracle decentralized a , Tepbi lccanwsoiial ocie to conceived originally was blockchain public —The STRAEA ∗ ynBerryhill Ryan , .I I. ∗ NTRODUCTION { ‡ eateto lcrcladCmue niern,Univer Engineering, Computer and Electrical of Department eateto nomtc,Ahn nvriyo Economics of University Athens Informatics, of Department adler,ryan,veneris,zpoulos,nveira eetaie lccanOracle Blockchain Decentralized A : on † eateto optrSine nvriyo Toronto of University Science, Computer of Department h lccan hti,a is, that blockchain, the ∗ nra Veneris Andreas , adrole ward r play ers analysis htthe that roduces t two nto oracles xternal nerally [email protected] have s exist t umer- bility tly. ized ∗† en- ile th m n issPoulos Zissis , y g y y k f hc ssmwa amu oteuaiiyo h ecosystem. will the at of system usability the the of to out harmful and somewhat in is While drop which process. to opportunity dispute no the a does users oracle to Augur’s decentralization, subject greater be or offering repo may times maliciously who results specific may users to at incorrect such, oracle As as outcomes penalty. The monetary so report a market. events to face it prediction users of a [13], specific outcome in Augur require the participants of determine out case to pay the again oracle In attestations an the system. uses protect operating to malicious Softw order a Intel in same uses Extensions TownCrier the Guard system. sees centr the a in site creating point-of-failure attestations, the on-chain influence al to to maliciously output could visitor to website does due every a Additionally, TLS However, information. checkable that (TLS). are guarantee security layer attestations not transport that of be- use is Transfer idea the oracles underlying Hypertext The these the protocol. hind using (HTTPS) accessed Secure Protocol websites t of attest they content as information present cryptographically-checkable above ones oracles. the existing like to for verified challenges facts infeasible provably However, unique [10]). be are Ethereum cannot the as that that such as computations platforms (such on complex computationally perform or of 9]), f outcomes comes [8, prove that be blockchain be data can as another security not data (such external cryptographically may robust either the This correct the when protocols. limitation provide substantial blockchain a not native do of properties oracles such speaking, otatcaatrsiscnb tlzda h underlying the A as in utilized Entities be smart the network. similar can using decentralized with contributions characteristics platforms our other contract Without describe but game. blockchain we Ethereum voting-based generality, a of levera loss through and ledger intelligence public a on human runs that oracle decentralized oesadcriesaentrqie ob nieduring online be syst to the exit required or enter may not roles all are periods; usabili time For particular certifiers process. and certification th and voters on voting high stake their the a “large” truth of in a play outcome the place stake Certifiers they “small” correct. on where being game a vote risk/high-reward vote placing to their by of game opportunity proposition confidence low-risk/low-reward the random a a given play of are Voters they so. certifier where pa doing and and for system voters, the submitters, fees to propositions roles: Boolean three submit of Submitters more or one nti ok epooeA propose we work, this In provide that oracles are [12] TownCrier and [11] TLSnotary } @eecg.toronto.edu ∗ elVeira Neil , iyo Toronto of sity n Business and ∗ n nsai Kastania Anastasia and , STRAEA STRAEA ageneral-purpose —a a alinto fall can ‡ offer t alized e its ter mat em the o rom ges are ty, st s. rt y n e - any time. Due to its requirement for monetary staking and fees, chain. Miners combine transactions into blocks. Each block the system is inherently not vulnerable to Sybil attacks [14]. has a header consisting of a hash of its own transactions, a In more detail, in ASTRAEA voters place a small amount of hash of the previous block header, and a nonce used for PoW. stake and are given the opportunity to vote on a proposition Users select as the head the block with the most cumulative selected randomly from the system. This stake is deposited work in its history. Rewriting the chain to undo a transaction before seeing the proposition. The outcome of voting is requires repeating PoW, and the further back in history the the stake-weighted sum of votes. Due to the randomness transaction is, the more work is required. involved when selecting a proposition, voting is resistant to Ethereum [5] uses similar ideas to generalize consensus manipulation by actors seeking to force an incorrect result for a on the state of a virtual Turing-complete computer. Ethereum particular proposition. On the other hand, certifiers place large transactions may additionally insert a smart contract [15] into stakes on the truth or falsity of propositions of their choosing. the virtual machine or call a function of an existing contract Similar to voting, the outcome of certification is the stake- thus extending the applicability and usability of a blockchain. weighted sum of certifications. However, not all propositions necessarily carry certifications. Nevertheless, when one exists B. Oracles and the voter outcome matches that of the certifier, players who took the corresponding position are rewarded and players This work focuses on a particular issue facing smart con- who took the opposing position are penalized. When the voter tracts: their inability to act on data that exists outside the and certifier outcomes disagree, all of the certifiers who took blockchain. Smart contract execution must be deterministic a position are penalized. It is important to note that votes in order to be publicly verifiable in perpetuity. As such, and certifications are sealed and revealed only at the end of they cannot directly pull data from e.g., the Internet. For the aforementioned process. Intuitively, the proposed system example, consider a contract to allow wagers on the outcome encourages certifiers to place bets on propositions for which of the Mayweather vs McGregor boxing match [16]. It would there is a high degree of confidence that they are true or be straightforward to devise a smart contract that pays out false, and it also encourages voters to vote accurately on the bettors based on a flag representing the outcome of the match, propositions they are randomly given. but trustlessly determining the value of that flag is a major Analysis reveals that the proposed system has desirable challenge. In that sense, oracles allow a blockchain to interact properties. In this paper, we determine the probability of an with the real, outside world. adversary manipulating the voting outcome and we demonstrate that system parameters can be chosen in a way so that C. System Assumptions an adversary with a bounded amount of funds also has an This paper presents ASTRAEA, an oracle that decides the arbitrarily small probability of successfully manipulating it. truth value of Boolean propositions. We assume each propo- Further, we demonstrate that under the same conditions that sition p has a truth value t that is either true (T ) or false prevent an adversary from manipulating voting, the system has (F ). There are N players (voters and certifiers) numbered 1 a Nash equilibrium in which all players play honestly. Finally, through N. For each proposition p and each i ∈ [1,N], let we argue that the certification process avoids degenerate coor- βi(p) ∈ {T, F } denote the belief of player i regarding the dination strategies in which users always vote with a constant truth value of proposition p. Each player i has an accuracy “true” or “false” without regard to the validity of the actual qi ∈ [0, 1] that is, informally, the probability that player i is proposition so as to earn a profit. correct about a given proposition. Formally: The remainder of this paper is outlined as follows. Sec- tion II gives a brief overview of blockchains, oracles, and key t with probability qi assumptions. Section III follows with a summary of current βi(p)= (1) ¬t with probability (1 − q ) blockchain oracle projects. Sections IV and V introduce AS- ( i TRAEA and analyze it from a game-theoretical direction. An The value of β (p) is independent of β (p) for all j 6= i. extension to ASTRAEA is presented in Section VI. Finally, i j applications of a decentralized oracle are shown in Section VII. That is, each player’s beliefs are independent of all other players’ beliefs. Entities that have non-independent beliefs II. PRELIMINARIES AND SYSTEM ASSUMPTIONS (e.g., users that pool their voting efforts in the style of mining pools [17]) are treated as a single player. Further βi(p) is A. The Distributed Public Ledger ′ ′ independent of βi(p ) for all p 6= p. That is, a player’s belief Bitcoin [1] is a blockchain-based distributed ledger of in a proposition is independent of her beliefs in all other monetary transactions which transfers coins between users propositions. An honest player always votes or certifies that identified by their public keys and certified by digital signa- proposition p has truth value βi(p). tures. Transactions are grouped into blocks which are ordered lists of transactions. Each block contains a hash that commits III. PRIOR ART to the previous block, forming a hash-linked chain (i.e., a blockchain). An unambiguous head of the chain determines The challenge of smart contracts interacting with the outside the precise order in which transactions occurred, preventing world has spawned numerous oracle proposals with varying double-spending. Bitcoin’s [1] primary innovation was the use degrees of centralization, performance, and scope tradeoffs. A of Proof-of-Work (PoW) to reach consensus on the head of the brief description of previous work is found here. A. Existing Blockchain Oracle Proposals 1. Stake 2. Random proposition Hivemind (previously Truthcoin) [18] is a prediction mar- ASTRAEA 3. Sealed vote ketplace built upon Bitcoin. It allows users to report on the predicted outcome of a future event by staking reputation Fig. 1. Player voting in ASTRAEA currency, while traders trade on the event’s market using base currency (i.e., Bitcoin). However, it requires a complex scheme 1. Stake to balance incentives involving two currencies and it has a ASTRAEA 2. Chosen proposition fi permissioned and centralized implementation due to token 3. Sealed certi cation distribution limitations. Augur [13] is a prediction market Fig. 2. Player certifying in ASTRAEA platform built upon Ethereum, heavily inspired by Truthcoin. Gnosis [19] and its fork Delphi [20] are also prediction Since certifiers choose which propositions to certify, not markets built upon Ethereum. Both suffer from a gameable all propositions are guaranteed to have certification, and dispute resolution design and centralized token distribution. indeed the system does not mandate them to do so. The ChainLink [21] aims to provide a cross-chain portal to minimum certification deposit size is a system parameter internet-available information i.e., data available on websites, and should be large enough that certifying incurs a through their centralized system. Oraclize.it [22] is a central- substantial risk. ized platform that creates trusted blockchain transactions for Setting the parameters appropriately is important for the use in smart contracts [16]. functionality of the system. The maximum voting deposit Town Crier [12] uses a stack of trusted hardware and size should be small relative to the total voting stake on software, and publishes proofs that this computational stack each proposition. If a single vote can account for 100% of has not been tampered with. This system is highly effective, a proposition’s total voting stake, an adversary can have total so long as users trust that the underlying hardware does not control over the outcome of a randomly drawn proposition. contain backdoors or exploits [23, 24]. In effect, the validity Conversely, if it is 1%, the adversary would somehow need and accuracy of the protocol depends on a central authority. to draw the same proposition repeatedly to control its validity All in all, the need of oracles to assist smart contracts was outcome. On the other hand, the minimum certification deposit identified shortly after blockchain technology came into exis- should be large enough that certifiers incur sufficient risk. tence but no truly decentralized, trustless, and permissionless Otherwise, they may be tempted to abuse their certifications solution has been presented until this work. for griefing purposes as a malicious certification can impact IV. ASTRAEA reward payouts for other players. At first sight, it seems like individual certifiers have enormous influence on the process This section presents the decentralized voting-based oracle for individual propositions, and indeed this can be the case. ASTRAEA. It begins with a high-level overview of the user However, as described later in the paper, the certifiers alone roles and the operation of the voting game and concludes cannot force the oracle to produce an incorrect value and they with a detailed description of the game. The next section are encouraged to behave honestly by the incentive structure; analyzes the coordination game, illustrates its properties, and otherwise they face large penalties. demonstrates a desirable Nash equilibrium. A. Overview B. The Proposition List Users of ASTRAEA participate in one (or more) of the A key aspect of the system is the set of propositions following three roles: submitters, voters, and certifiers, that over which voters and certifiers play the game. We assume behave as follows. that there exists a set of propositions constructed outside of • Submitters choose which propositions enter into the sys- ASTRAEA called the proposition list, which is denoted by P tem by allocating money to fund (in part) the subsequent and is of fixed size |P |. Each proposition pi ∈ P has a hidden effort to validate the Boolean propositions. truth value ti and is associated with a bounty amount Bi, the • Voters play a low-risk/low-reward role. Once such a source of which is described later. The voting game described player submits a deposit (stake), she is given the chance later is played over all propositions in P simultaneously. to vote on a proposition chosen uniformly at random from In addition, we assume there are two certifier reward pools the ones available. The stake is placed before the voter containing RT and RF monetary units, intended to reward is notified of which proposition she will be voting on. certifiers for true and false outcomes, respectively. The The steps of this process are depicted in Figure 1. The use of two separate reward pools is intended to avoid a outcome of the voting process is a function of the sum of degenerate coordination strategy in which users always vote the votes weighted by the deposits. The maximum voting and certify with a constant true or false so as to maximize deposit is a parameter of the system. their profit without expending any effort and its rationale is • Certifiers play a high-risk/high-reward role. They choose justified later in this paper. an available proposition and place a large deposit in order The proposition list can be constructed in a variety of ways to certify it as either true or false. The certification out- and it is outside the scope of this paper. For instance, an come is a function of the sum of certifications weighted auction could be conducted for the |P | spaces, with the auction by the deposits. This process is depicted in Figure 2. prices becoming the bounty for each proposition. B1 p1 σi;j;c TABLE I VOTINGOUTCOMES B2 p2 j; c RT RF B3 p3 Outcome Condition ::: v T sTOT,j,T > sTOT,j,F > BjP j pjP j r F sTOT,j,F sTOT,j,T Ø s =s P TOT,j,T TOT,j,F si;r;v TABLE II CERTIFYINGOUTCOMES $ $ Outcome Condition σ > σ Voters T TOT,j,T TOT,j,F Certifiers F σTOT,j,F > σTOT,j,T Ø σTOT,j,T = σTOT,j,F Fig. 3. Overview of ASTRAEA submits a monetary stake σ σ and a sealed certifi- After reaching a termination condition, propositions are i,j,c ≥ max cation c T, F for proposition p . An honest certifier is decided, at which point the game outcome is computed, the ∈ { } j one who always certifies according to their belief β p . proposition is removed from the list, and the corresponding re- i( j ) 3) Termination and Decision: p wards and penalties are administered. A proposition is decided Once a proposition j has accumulated sufficient voting stake, it is decided. The amount after it has received a certain amount of voting stake denoted D , which is the same for all propositions. The proposition can of voting stake before decision is a parameter of the system de- v D s s then be replaced with a new one by, for instance, conducting noted by v. Four values are computed: TOT,j,T , TOT,j,F , σ σ another auction. TOT,j,T , and TOT,j,F . These values represent, respectively, the total voting stake for true, total voting stake for false, total certifying stake for true, and total certifying stake for C. System Description false. For each b ∈{T, F }, they are computed as follows: For each of the N players, si,j,b denotes the amount that N N player i has staked on voting that proposition p is b where j s = s σ = σ b ∈ {T, F }. If player i has not voted on proposition j, then TOT,j,b i,j,b TOT,j,b i,j,b i=1 i=1 si,j,T and si,j,F are both equal to 0. Otherwise, one or both (in X X the case where player i has voted more than once in differing Voting and certifying outcomes are computed as shown directions) are nonzero and equal to the stake with which they in Tables I and II. It can be seen that a simple majority voted. Additionally, let σi,j,b denote the amount of stake that determines the outcome. In case of a tie, the voting outcome player i used to certify that proposition pj is b ∈ {T, F }. is unknown (denoted by Ø). The same formula applies for Let smax and σmin denote the maximum voting stake and certification outcomes. Alternatively, the system can require a minimum certifying stake parameters, respectively. super-majority instead of a simple majority for a definite T and An overview of the game architecture is shown in Figure 3. F outcome and produce a Ø outcome otherwise. Ultimately It depicts a set of players who are either voters, certifiers, or this does not affect the design or the analysis of the game and both. In that figure, the differing mechanisms of interaction it is excluded from the description for the sake of simplicity. with the propositions can also be seen. Voters must engage in The game and oracle outcomes that correspond to each a multi-step process in order to vote on random propositions, of the nine possible combinations of certification and voting while certifiers simply submit a certification to a proposition results are illustrated in Table III. The headings in the top row of their choosing in one step. correspond to certification, while the labels in the first column 1) Voting: Players may place a deposit in order to vote correspond to voting. on a randomly-chosen proposition. First, the voter indicates The game has three possible outcomes: true (T ), false (F ), and unknown (Ø), each of which carries its own a desire to stake an amount of money si,r,v ≤ smax on a reward structure. Note that the game outcome is only used vote for proposition pr, where r is not yet known to the voter. Subsequently, the system chooses the value for r and sends to determine rewards and penalties, and does not necessarily it to the voter. Note that r is chosen uniformly at random correspond exactly to the oracle’s output. Indeed, anyone from [1, |P |], so a voter may be given the opportunity to vote observing the system is free to compute oracle outcomes as on the same proposition more than once. Generating random they wish and depending on the context of the proposition. numbers in smart contracts is a topic of active research and For the sake of this presentation a suggested mapping is several techniques exist to do so securely [25, 26], including ones that use an oracle [22]. Finally, the voter submits a sealed vote v T, F . This can be accomplished using a commit- TABLE III ∈{ } OUTCOMES FOR (A) THE GAME AND (B) THE ORACLE reveal scheme in which the voter commits to a hash of their vote concatenated with a nonce and later reveals the vote and (a) GAME (b) ORACLE C C nonce. We define an honest voter as one who always votes Ø Ø V T F V T F according to their belief βi(pr). T T Ø Ø T T Ø T 2) Certifying: Players may place a large deposit in order F Ø F Ø F Ø F F to certify propositions of their choosing. The certifier simply Ø Ø Ø Ø Ø Ø Ø Ø presented in Table III-b. The suggested oracle output follows while voters did not choose to vote on it as they received it the voting outcome if it matches the certification outcome or at random. the certification outcome is Ø. The oracle is not restricted to an output of {T,F, Ø}, but could instead have an output in E. Monetary Flows the range [0, 1] indicating confidence in the truth or falsity of Note that voters are not rewarded for unknown outcomes, the proposition. and therefore bounties are not claimed. Thus far, the distribution of penalties and unclaimed bounties has not been dis- D. Rewards and Penalties cussed, and the funding of reward pools and bounty amounts Broadly speaking, players are only rewarded for T and has only been briefly mentioned. F outcomes in which they took a position that matches it. Submitter fees are used to fund bounties, while the certifier Conversely, those who took opposing positions are penalized. reward pools are initially left empty. Certifier reward pools are In unknown outcomes, certifiers are penalized and voters funded by unclaimed bounties and penalties. In the absence receive no rewards or penalties. As argued in the paper, this of reward pools, certification will be rare, which will lead to scheme incentivizes the participants to behave honestly on the most bounties being unclaimed. It is expected that this method validity of propositions. will approach equilibrium where reward pools and bounties For the rest of this subsection, fix a player i ∈ [1,N] and are balanced to provide reasonable incentives for both voters proposition pj ∈ P to be decided so as to enumerate the and certifiers. Further, it is adaptable as it can approach a new rewards and penalties for each of the three possible game equilibrium automatically if external conditions change. outcomes. Voter and certifier rewards are presented separately An additional consideration is the draining of reward pools. although, as noted earlier, nothing prohibits a player from Each time a proposition is decided with voting outcome T , RT being both a voter and a certifier. an amount τ is subtracted from RT . If the certification Rewards and penalties are distilled into a single value rv outcome is also T , the funds are used to pay out certifier for voting and rc for certification. A negative value indicates rewards. Otherwise, the funds are added to RF . A symmetric a penalty, while a positive one indicates a reward. The results case applies for false outcomes. The reasoning behind this are summarized in Table IV. is explained in Section V-D but its intuition lies in the fact that 1) True and False Outcomes: In the case of a T outcome, it ensures that the reward pools encourage certifiers to certify the voting reward is as follows. equal numbers of true and false propositions, thereby discouraging voters from voting with constant T or F values si,j,T in order to maximize profit without considering the actual rv = × Bj − si,j,F (2) sTOT,j,T propositions. The player’s vote reward is their share of the T -voting stake V. SYSTEM ANALYSIS times the proposition’s bounty amount. Their penalty is equal To analyze ASTRAEA we first construct formulas relating to their F -voting stake. The certifier reward is shown in Eq. 3 the parameters of the system to the probability of the voting below. procedure producing incorrect results (i.e., where the outcome

σi,j,T 1 for proposition pj with truth value tj is ¬tj ). Next, we rc = × RT × − σi,j,F (3) introduce mathematics that relate the system parameters to σTOT,j,T τ the minimum accuracy needed by voters so that they remain A certifier’s reward is equal to her share of the T -certifying profitable. Subsequently, we prove that a Nash equilibrium stake times the true certifier reward pool amount RT times exists where all players play honestly under a form of honest the reciprocal of the certification target τ. The true reward majority assumption on the voters. Finally, we argue that pool is a reward pool used to reward certifiers who correctly the certifier reward structure avoids a situation where players certify articles as true. The certification target can be seen as profitably vote and certify everything with a constant T or F , the number of certifications that the pool should have enough even if the proposition list is heavily biased towards true or funds to pay for. For instance, if RT = 1000 and τ = 10, false propositions. then 100 monetary units will be distributed to the certifiers. The next proposition will have RT = 900, and therefore 90 A. Voting Outcomes and Manipulation units will be distributed. This subsection determines the probability that voting out- In the case of an F outcome, the rewards and penalties are comes are correct as a function of accuracy and extends similar. the analysis to determine an adversary’s probability of forc- 2) Unknown Outcome: For an outcome of Ø (unknown), ing incorrect outcomes. For simplicity, we assume all non- voters are neither rewarded nor penalized (rv = 0), while adversarial players are honest, have the same accuracy q, certifiers are penalized as follows. and always vote with smax monetary units of stake. We can therefore treat the voting process on a single proposition as a Dv rc = − (σi,j,F + σi,j,T ) (4) sequence of Bernoulli trials with probability q of success. smax The probability that the voting outcome is correct is therefore: That is, certifiers forfeit all of their stake, regardless of agreement with voters. The rationale for penalizing certifiers D D P B v , q > v and not voters is that certifiers chose to certify proposition pj s 2 · s max max TABLE IV SUMMARY OF REWARDS AND PENALTIES

Reward Penalty

Outcome Voters Certiﬁers Voters Certiﬁers

si,j,T σi,j,T 1 T × Bj × RT × si,j,F σi,j,F sTOT,j,T σTOT,j,T τ si,j,F σi,j,F 1 F × Bj × RF × si,j,T σi,j,T sTOT,j,F σTOT,j,F τ

Ø 0 0 0 σi,j,F + σi,j,T

where B(n,p) denotes a binomial random variable. For ex- TABLE V PROBABILITYOF ADVERSARY MANIPULATING VOTING ample, if Dv = 20, smax =1, and q =0.8, the probability of n obtaining a correct voting outcome is roughly 99.7%. Dv P q P Specific P Any | | |P|·Dv ( ) ( ) Now let us assume an adversary has n monetary units and 20 · smax 100 0.8 0 0.0006 0.0548 seeks to force an incorrect outcome on a specific proposition. 20 · smax 100 0.8 0.05 0.0028 0.2438 For simplicity, we assume n is a multiple of s and that 20 · smax 100 0.8 0.25 0.1275 ≈ 1 max −9 −7 20 · smax 100 0.95 0 < 1 × 10 < 1 × 10 the proposition list does not change during the attack. Each . . < −6 < −4 D 20 · smax 100 0 95 0 05 1 × 10 1 × 10 proposition can once again be modeled as a sequence of v . . . . smax 20 · smax 100 0 95 0 25 0 0123 0 7100 −11 −9 Bernoulli trials. Each trial is successful with probability: 100 · smax 100 0.8 0 < 1 × 10 < 1 × 10 −8 −6 100 · smax 100 0.8 0.05 < 1 × 10 < 1 × 10 . . . . p + (1 − p)(1 − q)=1 − q + p · q 100 · smax 100 0 8 0 25 0 0168 0 8156 100 · smax 100 0.95 0 ≈ 0 ≈ 0 100 · smax 100 0.95 0.05 ≈ 0 ≈ 0 where p is the probability that the vote belongs to the adver- −5 100 · smax 100 0.95 0.25 < 1 × 10 0.0002 sary. If the adversary uses all n tokens to vote, then n votes smax belong to the adversary across all |P | propositions. Once all 100 · smax, manipulation is effectively impossible, even by a propositions in the proposition list are decided, the probability powerful adversary controlling 25% of the votes. that an arbitrarily-chosen vote belongs to the adversary is: B. Minimum Voting Accuracy n smax n According to the analysis in the previous subsection, the × = smax |P | · Dv |P | · Dv accuracy of honest voters is critical to the security of AS- TRAEA. This subsection quantifies the minimum accuracy (i.e., So the probability that an arbitrary vote is incorrect is: the probability that the player’s belief βi(pj ) matches the truth n · q value tj of proposition pj ) players need to achieve profitability 1 − q + (5) |P | · Dv in expectation by constructing formulas that relate the accuracy threshold to the parameters of the system. Since making it Note, both |P | and Dv appear in the denominator demon- more difficult to earn a profit is expected to lower participation, strating that increasing these parameters renders system ma- it is critical to set the parameters carefully to achieve a nipulation more difficult. The probability of an adversary reasonable tradeoff between accuracy and participation. changing the outcome of a proposition is shown below: For simplicity, we assume the parameters are set such that the probability of incorrect decisions is negligible. Consider Dv n · q Dv P B , 1 − q + > player i with accuracy qi. A vote with stake of smax on s |P | · D 2 · s max v max proposition pj yields a profit in expectation when, at decision It can be seen that if the quantity in Eq. 5 is less than 0.5, time for pj : there are parameter values that make it arbitrarily difficult for the adversary to force an incorrect result. Table V shows the qi · smax adversary’s chance of success for various parameter values. Bj > (1 − qi) · smax (6) max(sTOT,j,T ,sTOT,j,F ) The six columns show the voting stake before decision, the size of the proposition list, the voter accuracy, the fraction of In other words, the expected share of the voting rewards is votes controlled by the adversary, the probability of forcing an greater than the expected penalties. Note that there is no need incorrect output for a specific proposition, and the probability to account for Ø outcomes since voters receive neither rewards of doing so for any proposition, respectively. The rows where nor penalties in that case. At decision time, the denominator no votes are controlled by an adversary show the probability is clearly at least half of Dv, and at most Dv. Therefore: of an incorrect outcome occurring due to honest voter behavior 1 alone. It can be seen that if voters are 95% accurate and D = · Dv ≤ max(sTOT,j,T ,sTOT,j,F ) ≤ Dv (7) v 2 It is clear from Eq. 6 that the voter is profitable for adversary does not control voting, this assumption may not sufficiently high values of Bj . Towards the goal of computing hold if a dishonest strategy is easier and still profitable. In an upper bound, we over-approximate the range in which a this subsection, we identify a candidate for such a strategy voter is profitable using Eq. 7 as follows. and argue that the reward structure combats it. Imagine true propositions are more common than false q · s i max B > (1 − q ) · s ones and define as a lazy voter one who always votes T . D j i max v Let p = P (tj = T ) denote the probability that a random true Re-arranging yields the following upper bound on Bj : proposition is . We further assume as before that all voting outcomes are correct, so the lazy voter agrees with the (1 − qi) · Dv voting outcome on every true proposition and disagrees on Bj ≤ (8) qi every false proposition. Intuitively, this lazy strategy seems viable, but note that it is also necessary to consider certifier By capping the bounty according to Eq. 8, it is possible to behavior since without certification no rewards are paid out. enforce a minimum accuracy such that below that threshold Certifier incentives are tied to the certifier reward pools, and voting becomes unprofitable. For instance, if 80% accuracy their values fluctuate over time. is desired and D = 1000, the bounty must be capped at v When a true proposition is decided, the R pool will 250 monetary units. If instead 50% accuracy is desired the T shrink by RT . Over time the R pool will drain much faster bounty must be capped at 1000. Of course this analysis only τ T lower-bounds the threshold; it neglects the voter’s costs in than the RF pool when p > 0.5. Indeed, the RF pool may terms of time to evaluate propositions, computing power, and actually grow in this case. blockchain transaction fees. Again, we omit the details of the formal argument due to space restrictions. However, informally, this process incen- C. Desirable Nash Equilibrium tivizes certifiers to make more certifications on propositions This subsection demonstrates the existence of a desirable that they believe are false, since the potential rewards are Nash equilibrium in which all players are honest. Indeed, it is greater. At equilibrium, roughly equal amounts of true and clear that any situation in which all players always vote and false propositions should carry certifications, meaning that certify in concert is a Nash equilibrium, as any player who the lazy strategy will not be profitable. Note that certifiers are votes against all the others will only stand to lose. However, never incentivized to certify a true proposition as false we seek to show that such an equilibrium exists under the in an attempt to acquire the RF reward, as in that case the assumption that the quantity in Eq. 5 is less than 0.5. In effect, certifier will disagree with voters and be penalized. this assumes that honest voters are sufficiently accurate and in such plurality that a majority of votes are correct. From VI. EXTENSION TO UNKNOWN PROPOSITIONS AND DATA the analysis in Section V-B, the assumption is sufficient to AVAILABILITY show that there exists an assignment to the parameters such This section presents an extension to simultaneously handle that all propositions have the correct voting outcome with two issues that would arise from a practical implementation overwhelming probability. of ASTRAEA. While we omit the details of the analysis due to space requirements, it is clear that playing honestly is a Nash equilib- 1) A proposition p may not have a clear answer between rium when every voting outcome is correct and only feasible true and false, e.g. if the proposition is “Is com- strategies are considered. Rewards are only ever paid to players plexity class P equal to NP?” [27] Since voters are who agree with the voting outcome. Since every proposition assigned a proposition randomly, they may be assigned a proposition for which they cannot vote towards the pj has the voting outcome tj , if player i votes or certifies β (p ) (i.e., honestly), she agrees with the voting outcome truth value, since the truth for some is unknown. i j 2) As storage directly on the blockchain is relatively ex- with probability qi. Naturally, an honest player could perform better by switching to a “perfect” strategy in which they always pensive [28], an implementation of ASTRAEA would vote correctly rather than honestly, but such a strategy is not most likely use an off-chain storage solution for propo- feasible since players do not know the underlying truth values. sitions, allowing them to be uniquely identified by an Since all beliefs are independent by assumption, a player immutable public hash [29]. As such, the problem of data availability [30] presents itself: what if the submitter pi cannot develop complex strategies that leverage statistical correlation between beliefs to vote correctly with probability does not propagate her proposition sufficiently for the majority of the network to see it, or a malicious actor better than qi. Even an adversary with perfect accuracy who controls 100% of the certifying stake is incentivized to play submits a proposition but intentionally does not share the data corresponding to the hash? Voters would again honestly (or not play at all e.g., in the case where RT =0 and every proposition is true). Indeed, any other strategy results be unable to vote towards the truth, in this case because in the adversary losing all of their stake. the question is unavailable to them. In both cases, the assumption in Section II-C that each D. Proposition Bias and Reward Pools proposition has a truth value is violated. ASTRAEA can be While the previous subsection demonstrates that playing extended by adding a third option for voters, this of unknown. honestly is an equilibrium under the assumption that an If the voting outcome is unknown (Ø) due to a majority of voters voting unknown, voters are neither rewarded nor pun- [5] G. Wood, “Ethereum: A secure decentralised generalised transaction ished, as in Section IV-D2. Certifiers may not take a position ledger EIP–150 revision,” http://gavwood.com/paper.pdf, 2014. of unknown, they should simply abstain from participating [6] “Why many smart contract use cases are simply impossible,” https:// in such cases. The reward structure for the game outcomes www.coindesk.com/three-smart-contract-misconceptions/, accessed: 2018-02-22. remains the same as described in Section IV, and the analysis [7] “Blockchains: How they work and why they’ll change of desiderata follows in a straightforward manner from that in the world,” https://spectrum.ieee.org/computing/networks/ Section V. This extension protects players from penalization blockchains-how-they-work-and-why-theyll-change-the-world, 2017, accessed: 2018-02-22. when submitters provide unclear propositions or withhold data. [8] A. Kiayias, A. Miller, and D. Zindros, “Non-interactive proofs of proof- of-work,” Cryptology ePrint Archive, Report 2017/963, 2017, https:// VII. APPLICATIONS eprint.iacr.org/2017/963. Practical applications of a decentralized, permissionless, and [9] M. Herlihy, “Atomic cross-chain swaps,” 2018. [10] J. Teutsch and C. Reitwießner, “A scalable verification solution for trustless oracle are numerous. This section gives an overview blockchains,” https://people.cs.uchicago.edu/∼teutsch/papers/truebit.pdf, of candidate use cases for ASTRAEA. 2017. Machine Learning and Data Annotation: Modern data [11] “TLSnotary – a mechanism for independently audited https sessions,” https://tlsnotary.org/TLSNotary.pdf, 2014. science techniques require a tremendous amount of annotated [12] F. Zhang, E. Cecchetti, K. Croman, A. Juels, and E. Shi, “Town crier: An data [31] to train predictive models, with applications ranging authenticated data feed for smart contracts,” Cryptology ePrint Archive, from self-driving cars, to targeted marketing or medical di- Report 2016/168, 2016, https://eprint.iacr.org/2016/168. [13] J. Peterson, J. Krug, M. Zoltu, A. K. Williams, and S. Alexander, agnoses. Centralized crowd-sourcing platforms [32, 33] offer “Augur: a decentralized oracle and prediction market platform,” http:// a means for individuals to perform human intelligence tasks www.augur.net/whitepaper.pdf, 2018. and receive a payment. However, there have been reports of [14] I. Bentov, A. Gabizon, and A. Mizrahi, “Cryptocurrencies without proof of work,” in Financial Cryptography and Data Security. Berlin, these platforms not compensating workers properly [34] and Heidelberg: Springer Berlin Heidelberg, 2016, pp. 142–157. due to the lack of incentives to label data correctly, low- [15] N. Szabo, “Formalizing and securing relationships on public networks,” quality labeling is prevalent in the industry. This requires http://dx.doi.org/10.5210/fm.v2i9.548, 1997. [16] /u/pmu pmu pmu, “V2 trustless mayweather vs mcgregor parimutuel customers to pay for redundant work, since there is no easy smart contract,” https://www.reddit.com/r/ethereum/comments/6tulip, means of actually determining which labels are incorrect. A 2017, accessed: 2018-02-01. decentralized oracle that incentivizes workers to vote honestly [17] Y. Lewenberg, Y. Bachrach, Y. Sompolinsky, A. Zohar, and J. S. can be used for data annotation, potentially reducing cost and Rosenschein, “Bitcoin mining pools: A cooperative game theoretic analysis,” in Proceedings of the 2015 International Conference on improving quality over existing solutions. Autonomous Agents and Multiagent Systems. International Foundation Data Availability: Decentralized applications that make use of for Autonomous Agents and Multiagent Systems, 2015, pp. 919–927. [18] P. Sztorc, “Truthcoin peer-to-peer oracle system and prediction mar- off-chain resources [10] suffer from the data availability prob- ketplace,” http://bitcoinhivemind.com/papers/truthcoin-whitepaper.pdf, lem [30], in which off-chain data may only exist transiently, or 2015. may never have existed at all (e.g. in the case of a malicious [19] Gnosis, “Gnosis whitepaper,” https://gnosis.pm/resources/default/pdf/ gnosis-whitepaper-DEC2017.pdf, 2017. actor). ASTRAEA, with the extension of Section VI, can be [20] Delphi, “Delphi,” https://delphi.markets/whitepaper.pdf, 2017. used as a data availability oracle for all such applications. [21] S. Ellis, A. Juels, and S. Nazarov, “Chainlink a decentralized oracle Adjudication Mechanisms: network,” https://link.smartcontract.com/whitepaper, 2017. Negotiations between parties that [22] Oraclize, http://www.oraclize.it, accessed: 2018-02-01. require an adjudication mechanism can instead use a decen- [23] python sweetness, “The mysterious case of the linux page table iso- tralized oracle such as ASTRAEA. In this case, the oracle will lation patches,” http://pythonsweetness.tumblr.com/post/169166980422/ essentially serve as a public jury. Decentralized applications the-mysterious-case-of-the-linux-page-table, 2018, accessed: 2018-02- 01. that deal with real-world resources, such as legal agreements, [24] Intel, “Intel active management technology, intel small transfers of assets, etc. can make use of this system. business technology, and intel standard manageability escalation of privilege,” https://security-center.intel.com/advisory.aspx? VIII. CONCLUSION intelid=INTEL-SA-00075&languageid=en-fr, 2017, accessed: 2018-02- 01. This work introduces a decentralized, trustless, and permis- [25] “Randao,” https://github.com/randao/randao, 2016. sionless blockchain oracle system, ASTRAEA. Submitters enter [26] S. Micali, M. Rabin, and S. Vadhan, “Verifiable random functions,” in propositions into the system, while voters and certifiers play Symposium on Foundations of Computer Science, 1999, pp. 120–130. [27] J. Hartmanis, “Gödel, von neumann, and the P = NP problem,” pp. a game to determine the truth value of each proposition. We 101–107, 1989. analyze the game-theoretical incentive structure to show that [28] V. Buterin, “Eip 103 (serenity): Blockchain rent,” https://github.com/ a desirable Nash equilibrium exists whereby under a set of ethereum/EIPs/issues/35, 2015, accessed: 2018-02-01. [29] P. L. Inc., “multihash,” https://github.com/multiformats/multihash, ac- simple assumptions all rational players behave honestly. cessed: 2018-02-01. [30] V. Buterin, “A note on data availability and era- REFERENCES sure coding,” https://github.com/ethereum/research/wiki/ [1] S. Nakamoto, “Bitcoin: A peer-to-peer electronic cash system,” http:// A-note-on-data-availability-and-erasure-coding, 2017, accessed: bitcoin.org/bitcoin.pdf, 2008. 2018-02-01. [2] E. Ben-Sasson, A. Chiesa, E. Tromer, and M. Virza, “Succinct non- [31] K. Burke, “Humans help train their robot replacements,” interactive zero knowledge for a von neumann architecture,” Cryptology http://www.autonews.com/article/20170827/OEM06/170829822/ ePrint Archive, Report 2013/879, 2013, https://eprint.iacr.org/2013/879. data-annotation-self-driving, 2017, accessed: 2018-02-01. [3] N. van Saberhagen, “Cryptonote v2.0,” https://cryptonote.org/ [32] “Crowdflower,” https://www.crowdflower.com. whitepaper.pdf, 2013. [33] “Amazon mechanical turk,” https://www.mturk.com. [4] D. Vorick and L. Champine, “Sia: Simple decentralized storage,” https:// [34] F. A. Schmidt, “The good, the bad and the ugly: Why crowdsourcing sia.tech/sia.pdf, 2014. needs ethics,” in Cloud and Green Computing, 2013, pp. 531–535.