In the recent Risk Assessment on Tendermint / Cosmos Hub Validators, we covered a new threat we have dubbed the Stake Flip Attack. This post starts with some definitions and looks at how stake is distributed in the network. It then presents the attack with examples, first against a single validator and then against multiple validators to control the network as a whole.
- Stake: the units of currency used to establish how much investment each validator has, and consequently which validators are the live validators; we use the generic term Stake here as the units are different between the Cosmos Hub testnet (Steak) and production (Atom) networks
- Validator: a machine that has been configured to be a Tendermint validator in the Cosmos Hub network
- Live validator: a validator with sufficient stake to put it in the list of top-N validators, making it a live, voting member of the network
- Validator candidate: a validator that does not have sufficient stake to make it a Live validator
In order to make the example more concrete, we will start by grabbing the current stake data from the gaia-7005 testnet thanks to the Hubble tool from Figment Networks. You can see the exact data we used in the table in the appendix, below. If we plot out the distribution, we get the following:
As you can see, it follows a long-tail distribution, which is what we would expect for a network such as this. Given the effects of preferential attachment, we can expect this trend to become more pronounced as Cosmos Hub moves into production.
Since the testnet only had 139 validators, we’ll use the top 50 (rather than the top 100) for our example to ensure that our live validators make up less than half of the total pool of validators. Given that, we see that we have 58,705 total stake in play. Of that, 52,455 is staked to the top 50 validators. Consensus requires more than two-thirds of the live stake, so 34,971 stake is required for consensus across the network. That much stake is provided by the top 23 live validators, or by the bottom 42 live validators.
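The threshold arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names `quorum_stake` and `validators_needed` are ours, and the five-validator stake list is made up for the example rather than taken from the testnet data.

```python
def quorum_stake(live_stake: int) -> int:
    """Smallest whole amount of stake that is strictly more than 2/3 of live_stake."""
    return (2 * live_stake) // 3 + 1


def validators_needed(stakes: list[int], quorum: int) -> int:
    """Count how many validators, taken in the given order, it takes to reach quorum."""
    running = 0
    for count, stake in enumerate(stakes, start=1):
        running += stake
        if running >= quorum:
            return count
    raise ValueError("the given stakes never reach the quorum")


# Live stake of the top 50 validators, as quoted in the text.
print(quorum_stake(52_455))  # 34971

# Hypothetical long-tail distribution (NOT the real testnet data), largest first.
toy_stakes = [500, 300, 100, 50, 50]
toy_quorum = quorum_stake(sum(toy_stakes))
print(validators_needed(toy_stakes, toy_quorum))        # counting from the top
print(validators_needed(toy_stakes[::-1], toy_quorum))  # counting from the bottom
```

Running the same two calls over the real stake table, sorted largest first, is how one would arrive at "top 23" and "bottom 42" style figures.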
Single node stake flip attack
Our test data is pretty interesting, as there is a jump between the bottom live validator (#50 is either mpaxeNode or kochacolaj, both with 200 stake) and the #51 validator (F6738260186D33D9C14FC6E7017AFE6BB952A63D) with 160 stake. Now, a validator towards the bottom of the live list will probably be paying close attention to the stake gap between it and the top validator candidate. The conventional wisdom is that the top validator candidate could just add 41 stake (in this case) to become a live validator.
In the stake flip attack, the top validator candidate still needs 41 stake, but instead of investing it all in their own validator, they already have 20 of it invested in the bottom live validator. When the attack is carried out, they pull that investment and add the other 21 stake to their own validator. This makes the attacker’s validator live in the #50 position with 181 stake, and the previous validator drops out of the live pool with a stake of 180. The net effect is the same, but the sudden drop in stake may have additional effects, particularly around delegate trust in that validator: if delegates see that validator drop from live to candidate status and see other delegates pulling out their investment, they may panic and do the same.
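To make the arithmetic concrete, here is a minimal sketch of the single-node flip using the numbers from the text. The `Validator` class and `stake_flip` function are illustrative names of ours, not part of any Tendermint or Cosmos API.

```python
from dataclasses import dataclass


@dataclass
class Validator:
    name: str
    stake: int


def stake_flip(victim: Validator, attacker: Validator,
               ruse_stake: int, new_stake: int) -> bool:
    """Withdraw the attacker's delegation from the victim, add fresh stake to the
    attacker's own validator, and report whether the attacker now outranks the victim."""
    if ruse_stake > victim.stake:
        raise ValueError("cannot withdraw more than the victim holds")
    victim.stake -= ruse_stake
    attacker.stake += new_stake
    return attacker.stake > victim.stake


victim = Validator("bottom-live", 200)      # the #50 validator in the example
attacker = Validator("top-candidate", 160)  # the #51 validator in the example

flipped = stake_flip(victim, attacker, ruse_stake=20, new_stake=21)
print(victim.stake, attacker.stake, flipped)  # 180 181 True
```

The attacker still commits 41 stake in total, the same as the conventional approach, but 20 of it does double duty by first propping up and then undercutting the victim.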
Now, one might expect some churn right around the bottom of the live pool and the top of the candidates. The attack becomes more profound if we consider that the attacker might have a significant investment (perhaps disguised as multiple delegators) in one of the middle live validators. In our testnet data, we see that these validators have a stake of 1100. If 900 of that comes from delegators controlled by the attacker, and the attacker pulls it out while pumping a few hundred into their own validator, they can quickly take out a validator that thought it had a strong stake and firmly establish their own validator in the pool of live validators.
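A rough way to see the effect on the live set is to re-rank the validators before and after the flip. The sketch below uses a made-up five-validator network with a live set of four. Only the 1100, 900, and 160 figures echo the text; the other stakes, the names, and the 300 of new stake are hypothetical, chosen so that there are no ties at the cutoff.

```python
def live_set(stakes: dict[str, int], n: int) -> list[str]:
    """Return the names of the top-n validators by stake, largest first."""
    ranked = sorted(stakes, key=lambda name: stakes[name], reverse=True)
    return ranked[:n]


# Hypothetical mini-network (not the testnet data).
stakes = {"big": 5000, "mid": 1100, "low-a": 250, "low-b": 220, "attacker": 160}

before = live_set(stakes, n=4)

stakes["mid"] -= 900       # attacker-controlled delegations are withdrawn
stakes["attacker"] += 300  # "a few hundred" of new stake on the attacker's validator

after = live_set(stakes, n=4)

print(before)  # ['big', 'mid', 'low-a', 'low-b']
print(after)   # ['big', 'attacker', 'low-a', 'low-b']
```

Note that the mid-table validator falls all the way out of the live set, while the attacker's validator lands well above the cutoff rather than clinging to the last slot.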
Network stake flip attack
Now, the single node stake flip attack is mostly a curiosity. One might even say it’s not an attack, it’s just business. Things become thornier when one considers the potential for this attack to take control of the network, that is, for the validators under the attacker's control to hold more than two-thirds of the live stake in the network.
We should begin by reiterating that this attack would require an attacker with a great amount of resources. Given the successful 51% attacks we have seen of late against numerous proof-of-work blockchain networks, we do need to take this type of attacker seriously.
As noted above, the bottom 42 live validators have a stake of 35,505, well above the two-thirds stake needed for consensus. An attacker would want to invest their ruse stake in the bottom nodes such that those nodes can be kicked out of the pool, reducing the stake in the pool from legitimate nodes. We can figure out how much stake they need to do this from the attacker's share of the live stake after the flip:

$$\text{share} = \frac{PV + N}{\mathrm{Org} - R - DV + PV + N}$$

where Org is the original total stake in the network, R is the ruse stake that gets pulled out from legitimate nodes, DV is the stake remaining in nodes that are demoted from live to candidate status, PV is the stake in nodes that are promoted from candidate to live status, and N is the new stake the attacker is adding to those promoted nodes. If we solve for the new stake and account for the fact that the attacker needs to get over two-thirds, we get:

$$N > 2\,(\mathrm{Org} - R - DV) - PV$$

So the total amount of stake the attacker must invest is the new stake, the ruse stake, and the stake in the promoted nodes. If we substitute the above formula for N, we get:

$$N + R + PV > 2\,\mathrm{Org} - R - 2\,DV$$
Let's go back to the example. If the original stake is 52,455, and the attacker is responsible for half the investment in the bottom 42 validators (making for a ruse stake of 17,753), and we are looking to demote those 42 validators with their remaining 17,752 stake and move in the next 42 validators with a stake of only 5,313, then the new stake to be added to those validators needs to be 28,588. That stake would need to be divided up among the 42 candidates (assuming they are all under the attacker's control), ranging from 647 to 730, to firmly put all of them above the 720 stake remaining on the top node that got demoted. In this new network, the live stake will be 50,851, with the attacker controlling 33,901 of it, or 66.66732% of the stake in the system.
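The numbers in this example can be checked with a short script. It assumes the attacker's share after the flip is (PV + N) / (Org - R - DV + PV + N), which is our reading of the definitions above; the function name `min_new_stake` is ours.

```python
def min_new_stake(org: int, ruse: int, demoted: int, promoted: int) -> int:
    """Smallest whole N such that the attacker holds strictly more than 2/3 of live stake.

    Assumes share = (promoted + N) / (org - ruse - demoted + promoted + N),
    which rearranges to N > 2 * (org - ruse - demoted) - promoted.
    """
    return max(0, 2 * (org - ruse - demoted) - promoted + 1)


org = 52_455       # original live stake
ruse = 17_753      # ruse stake withdrawn from the bottom 42 live validators
demoted = 17_752   # stake left on those validators when they are demoted
promoted = 5_313   # existing stake on the 42 promoted candidates

new = min_new_stake(org, ruse, demoted, promoted)
live_after = org - ruse - demoted + promoted + new
attacker = promoted + new
total_invested = new + ruse + promoted

print(new)                             # 28588
print(live_after)                      # 50851
print(attacker)                        # 33901
print(f"{attacker / live_after:.5%}")  # 66.66732%
print(total_invested)                  # 51654
```

The last figure is the attacker's full outlay under these assumptions: roughly as much stake as was live in the network to begin with.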
This seems like a rather convoluted way to take control of the network, and in practice we would probably not see a Stake Flip Attack used as the sole attack to take control. Instead, we would likely see it used as part of a much larger campaign that included live nodes that behaved entirely legitimately for months or even years and attracted outside delegators, as well as attacks on legitimate nodes to either take them over or, failing that, have their stake stricken through availability and integrity attacks. The stake flip attack would likely just be the final push to flip the attacker past the two-thirds stake mark and allow them to wreak havoc until humans could step in and… well, it’s not clear what could be done once an attacker actually controlled two-thirds of the network.
So there you have the Stake Flip Attack — it certainly requires deep pockets to pull off, but in combination with other attacks it presents a distinct risk to the network and — by extension — all the legitimate validator operators and delegators on it. For the purposes of this post, we’ll refrain from digging into risk treatments for this threat and instead invite the members of the Cosmos / Tendermint community to share their thoughts on how this can be addressed in the comments below.
Appendix: Raw Stake Data
|Adrian Brink –
|⚛️ melea-trust 🐞||1220|
|⚛️ cosmos-trust.com 🐞 🎬 🌈 🦄
We are opening up the permissions on this post specifically to encourage further discussions of this threat.
Stake Flip Attack by S Terry Brugger, PhD is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Based on a work at https://bubowerks.io/blog/2018/08/08/stake-flip-attack/.