ThreeFold’s architecture intentionally minimizes the use of blockchains and distributed ledgers. That is, as we say, a feature, not a bug. We do use a blockchain as a database for essential shared data about nodes and users, as well as for billing. In fact, we are looking for ways to rely on the blockchain even less in future iterations of the tech.
Blockchains are an important innovation, to be sure, but they come with a large overhead. Not only is the data replicated, but so are the computations. Carrying out the same computation on a hundred or more nodes is simply overkill for most applications. Scalability is thus very limited, and privacy is very difficult to achieve. We don’t think this is the general-purpose solution for building the foundational layer of a cloud.
Of course, you can run blockchains and most other kinds of distributed systems on top of the basic infrastructure building blocks that ThreeFold offers. Thus developers and operators can choose to replicate their data and computations as many times as they see fit.
Traditional clouds are not entirely free of disaster scenarios either. OVH had a serious fire at one of their data centers in 2021, for example. Some customers certainly made a wrong assumption that their data was safe and that they didn’t need a disaster recovery plan. But generally speaking, large enterprises that use clouds make plans to respond to failures in their cloud provider’s infrastructure.
Even in more typical situations, VMs running on AWS can and do go down because of hardware failures. The biggest difference there is that AWS is able to reliably bring them back online in a short period of time. What makes that possible is that the underlying storage layer for the VM root filesystems and volumes, EBS, is a rather complex distributed storage system. On a smaller scale, a SAN architecture can provide a similar benefit by leveraging specialized hardware and techniques like RAID.
ThreeFold’s approach is to make it as simple as possible for farmers to provide capacity using generic hardware. Even for farmers installing nodes in a data center environment, the amount of specialized knowledge required should be kept to a minimum. This is in line with our goal of creating a cloud infrastructure system that can be deployed anywhere in the world and scale everywhere. It’s also what keeps Zero OS a feasible project, since we don’t have to worry about supporting many permutations of specialized hardware configurations like SANs.
On the other hand, could we build something like EBS on top of the collection of physical hardware that farmers provide, and use that to serve VM root filesystems with higher availability and easy migration when compute nodes go down? Maybe. But that won’t be anytime soon, if ever. We are not convinced that such complexity is needed in the base layer.
What we are working on is incremental improvements and features that can be implemented in a simple way. One example is volume snapshots. This will allow users to create copies of their VMs and data in a simple way. By also adding the ability to send those snapshots to another node and use them to bring up new VMs, we can already improve the situation substantially. Of course restoring from a snapshot generally implies some amount of data loss, so it’s still up to the user to provide a sync solution for any data that requires greater durability.
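To put a number on "some amount of data loss", here is a minimal sketch of how to reason about the gap between the last snapshot and a failure. It does not use any ThreeFold API (the snapshot feature is still in progress); the function name `data_loss_window` and the timestamps are hypothetical and only illustrate the arithmetic.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional


def data_loss_window(
    snapshot_times: Iterable[datetime], failure_time: datetime
) -> Optional[timedelta]:
    """Return how much recent work is lost when restoring after a failure.

    The restore uses the newest snapshot taken at or before the failure.
    Returns None when no usable snapshot exists (everything is lost).
    """
    usable = [t for t in snapshot_times if t <= failure_time]
    if not usable:
        return None
    return failure_time - max(usable)


# Hypothetical schedule: one snapshot per day at midnight.
snapshots = [datetime(2024, 1, day) for day in (1, 2, 3)]
failure = datetime(2024, 1, 3, 17, 30)
print(data_loss_window(snapshots, failure))
```

Anything written after the newest snapshot falls inside that window, which is why data needing greater durability still calls for its own sync solution.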
I’ll stress again, though, that this situation is actually the same as with other cloud providers. Even EBS can’t provide 100% durability guarantees. No matter who is providing your VM, there is some chance that you’ll have to restore it from a backup one day. The primary differences are how easy it is to create and restore those backups, and how likely it is that your volume will be the one affected on any given day.
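As a rough illustration of that likelihood, the sketch below turns a per-volume annual failure rate into the chance of needing at least one restore over several years. The rates are made-up placeholders, not measurements of ThreeFold nodes or of any other provider, and the model assumes a constant failure rate that is independent from year to year.

```python
def chance_of_restore(annual_failure_rate: float, years: float) -> float:
    """Probability of at least one volume failure within `years`.

    Assumes a constant, independent annual failure rate (a simplification).
    """
    if not 0.0 <= annual_failure_rate <= 1.0:
        raise ValueError("annual_failure_rate must be between 0 and 1")
    if years < 0:
        raise ValueError("years must not be negative")
    # Chance of surviving every year, subtracted from 1.
    return 1.0 - (1.0 - annual_failure_rate) ** years


# Hypothetical rates, for illustration only.
for label, rate in [("lower-rate volume", 0.002), ("higher-rate volume", 0.05)]:
    p = chance_of_restore(rate, years=5)
    print(f"{label}: {p:.1%} chance of at least one restore in 5 years")
```

Whatever the real rates are, the result is never zero, so a tested backup and restore procedure matters in either case.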
Personally I have been running a service (the node status bot) inside ThreeFold VMs for about 2.5 years. My experience is that gold certified nodes are rather reliable. If something goes wrong with a node, I can restore from synced backups of the essential data pretty quickly and easily. That said, I’d rather minimize the chance that I’ll need to intervene manually at an inconvenient time, so I’m exploring ways to add redundancy (beyond the fact that I already operate a second copy of the bot as a staging environment for new releases). I’ll write about that another time. Some good news is that ThreeFold’s pricing is inexpensive enough that you can often rent two VMs for less than the price of one equivalent VM from another provider.
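To show why a second VM can be worth the cost, here is a back-of-the-envelope sketch of the availability gained by running independent copies. The 99% figure is a hypothetical placeholder, not a measured number for any node, and the calculation assumes the copies fail independently (for example, on different nodes in different locations) and that failover between them is instant.

```python
HOURS_PER_YEAR = 24 * 365


def combined_availability(single_vm_availability: float, copies: int) -> float:
    """Availability of a service that is up as long as any one copy is up.

    Assumes independent failures; correlated failures would lower the result.
    """
    if not 0.0 <= single_vm_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # The service is down only when every copy is down at the same time.
    return 1.0 - (1.0 - single_vm_availability) ** copies


def expected_downtime_hours(availability: float) -> float:
    """Expected hours of downtime per year at the given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


# Hypothetical single-VM availability, for illustration only.
single = 0.99
for n in (1, 2):
    a = combined_availability(single, n)
    print(f"{n} VM(s): {a:.4%} available, "
          f"about {expected_downtime_hours(a):.1f} hours down per year")
```

Real deployments also need a way to detect failure and switch traffic, so the practical gain will be smaller than this idealized figure.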
To conclude, I want to recall that our aim is to build a self-healing and self-driving cloud that can reach planetary scale. We’ve achieved a lot, but the work isn’t done yet. The idea of a “virtual system administrator”, who can manage deployments on our behalf and help deal with the fact that hardware is inherently unreliable, is still mostly just an idea. It’s true that there are some inconveniences and uncertainties in using the ThreeFold network today compared with the incumbent cloud providers. But we think that we’ve developed a solid foundation for a new wave of innovation, and that plenty of useful and highly reliable services can be built on top.