How to use HDD on the Grid with Zdbfs

There has been some interest lately in using hard disks (HDD) on the Grid in a simple way to store data, without the added complexity of the Quantum Safe Filesystem (QSFS). Currently, the only way HDD can be reserved on the Grid is in the form of a “Zero Database”, or zdb. It turns out there is a simpler way to write files into a zdb, effectively using a subset of the QSFS stack.

For this tutorial, we’ll use Terraform to reserve some zdb space, since that’s not currently possible in the playground, and then we’ll mount a zdbfs within a VM, using that zdb to store the data. I’ll deploy the VM within the same Terraform file, but this isn’t necessary. You can deploy the VM independently, and it can even be on a separate node from the zdb. Of course, the best performance will come from placing them on the same node.

The Terraform Deployment

Let’s start with an example main.tf file for Terraform:

terraform {
  required_providers {
    grid = {
      source = "threefoldtech/grid"
    }
  }
}

provider "grid" {
  mnemonics = "YOUR WORDS HERE"
  network = "main"
}

resource "grid_network" "net1" {
    nodes = [11]
    ip_range = "10.1.0.0/16"
    name = "network"
    description = "my network"
}

resource "grid_deployment" "d1" {
  node = 11
  network_name = grid_network.net1.name

  zdbs {
    name = "zdb"
    description = "zdb"
    password = "PASSWORD"
    size = 10 # Data size in GB
    mode = "seq" 
  }

  vms {
    name = "vm1"
    flist = "https://hub.grid.tf/tf-official-apps/base-ubuntu:latest.flist"
    cpu = 1
    publicip = false
    memory = 512
    rootfs_size = 500
    entrypoint = "/sbin/zinit init"
    env_vars = {
      SSH_KEY = "YOUR PUBLIC SSH KEY"
    }
    planetary = true
  }
}

output "zdb_ip" {
  value = grid_deployment.d1.zdbs[0].ips[0]
}

output "zdb_ns" {
  value = grid_deployment.d1.zdbs[0].namespace
}

output "zdb_port" {
  value = grid_deployment.d1.zdbs[0].port
}

output "ygg_ip" {
    value = grid_deployment.d1.vms[0].ygg_ip
}

Replace the mnemonics, public SSH key, and password with your own info. Using a strong password is a good idea here, because the zdb is accessible over the network. The VM specs here are very minimal for demonstration purposes but are probably adequate to run a basic backup server. Adjust them according to your needs, including adding WireGuard or public IP access if you need it.
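
When you’re ready to deploy, it’s the usual Terraform workflow. As a minimal sketch, run from the directory containing main.tf:

terraform init
terraform apply

# print the values we'll need in the next steps
terraform output zdb_ip
terraform output zdb_ns
terraform output ygg_ip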

Grab zdb and zdbfs binaries

Go ahead and deploy the Terraform file, and make a note of the zdb IP and namespace. Then SSH into your VM and run the commands below. We’re going to download two binaries into the VM, zdb and zdbfs (the links below are the latest releases as of the time of writing, but you can check the repos for newer versions). We’ll also be running zdb locally in the VM to store the metadata and temp data. Keeping the metadata on SSD might help a bit with performance, and it also keeps our zdbfs invocations simpler.

apt update
apt install wget
wget -O zdb https://github.com/threefoldtech/0-db/releases/download/v2.0.4/zdb-2.0.4-linux-amd64-static
wget -O zdbfs https://github.com/threefoldtech/0-db-fs/releases/download/v0.1.10/zdbfs-0.1.10-amd64-linux-static
chmod u+x zdb zdbfs

Complete setup

Alright, with our executables lined up, we’ll now launch our local zdb within the VM and start zdbfs. Zdbfs uses a total of three namespaces: meta, data, and temp. It comes with a convenient mount option, autons, that creates any namespaces that are missing; we’ll only need it on the first run, but it won’t hurt to include it after that.

We set three options for the data zdb that we created earlier with Terraform: the IP (dh), namespace (dn), and password (ds). Use the Terraform outputs from earlier for the IP and namespace, along with the password you specified, in place of ZDB_IP, ZDB_NS, and PASSWORD. Since we don’t specify these fields for meta or temp, they default to the zdb running within the VM, with default namespace labels.

Without further ado:

mkdir /mnt/zdbfs
./zdb --background --mode seq
./zdbfs --background -o autons,dh=ZDB_IP,dn=ZDB_NS,ds=PASSWORD /mnt/zdbfs
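
To sanity check the mount, you can write a test file and read it back; something like this should work, though the exact df output will vary (and see the note on size below):

echo "hello from zdbfs" > /mnt/zdbfs/test.txt
cat /mnt/zdbfs/test.txt
df -h /mnt/zdbfs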

Any files written to /mnt/zdbfs will be stored in the data zdb, giving us as close to “just a regular filesystem on a hard drive” as we can get with the Grid right now 🙂

If you want to run more than one zdbfs in the same VM, you’d want to give each one unique meta and temp namespaces in the local zdb, along with a separate data zdb and a separate mount point. Something like this:

mkdir /mnt/zdbfs2
./zdbfs --background -o autons,mn=meta2,tn=temp2,dh=ZDB2_IP,dn=ZDB2_NS,ds=PASSWORD /mnt/zdbfs2

Quick note on size

The release of zdbfs we’re using here has a hardcoded max size field of 10GB, which shows up, for example, when running df to check disk usage. This won’t actually limit how much data you can write into the zdbfs, but the used/available fields of df will break once you exceed 10GB. It’s best to just check the size of the contents with du and compare that to how much space you reserved earlier. The ability to set this size has been coded in, just not yet included in the released binaries.
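
For example, something like this gives a rough idea of how much you’ve written against the 10GB reserved in the Terraform file above (keeping in mind that the zdb itself is append only, so deleted or rewritten data still consumes backend space):

du -sh /mnt/zdbfs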

Autostart

As a final step, we might want to autostart zdb and zdbfs, in case our VM ever reboots. With our official micro VM images, like the one included in the Terraform file above, we use zinit to start services, and we can add a simple startup job as shown below. You could do the same thing using cron or systemd, according to your preference and environment.

I’ll assume here that all previous commands were run in /root, so that’s where the binaries and zdb data files are. You can adjust for a different directory as needed. Let’s create two zinit yaml files:

echo 'exec: "/root/zdb --mode seq --data /root/zdb-data --index /root/zdb-index"' > /etc/zinit/zdb.yaml
echo 'exec: "/root/zdbfs -o dh=ZDB_IP,dn=ZDB_NS,ds=PASSWORD /mnt/zdbfs"' > /etc/zinit/zdbfs.yaml
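
If the shell quoting gives you any trouble, each file should end up containing a single exec line, which you can verify with cat:

cat /etc/zinit/zdb.yaml
# exec: "/root/zdb --mode seq --data /root/zdb-data --index /root/zdb-index"
cat /etc/zinit/zdbfs.yaml
# exec: "/root/zdbfs -o dh=ZDB_IP,dn=ZDB_NS,ds=PASSWORD /mnt/zdbfs"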

Now zinit will start these two commands at boot and restart them if they ever exit for some reason. Try it out by shutting everything down and restarting zinit:

pkill zdbfs
pkill zdb
pkill zinit
/sbin/zinit init &

You can check the logs to make sure everything looks good:

/sbin/zinit log zdb
/sbin/zinit log zdbfs
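
You can also confirm that the filesystem itself came back up after the restart:

df -h /mnt/zdbfs
mount | grep zdbfs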

Wrapping up

We covered how to use zdbfs as a standalone component to give access to HDD in 3Nodes. What we set up is basically the front end of a QSFS deployment, without the encryption and dispersion to the “backend” zdbs. This configuration is still “append only”, so any deleted files will continue to exist as data in the zdb. You could periodically copy only the files you want to keep into a new data zdb, using the tip above for mounting two zdbfs filesystems in the same VM.
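
As a rough sketch of that cleanup, assuming you’ve deployed a second data zdb and mounted it at /mnt/zdbfs2 as shown above:

# copy everything you want to keep into the fresh zdbfs
cp -a /mnt/zdbfs/. /mnt/zdbfs2/

# after confirming the copy, the old data zdb can be removed from the Terraform file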

Last thing I’ll say here is that yes, this is still a pretty complicated way to just get some disk storage. While there are some good reasons to use only zdb for HDD, allowing regular HDD volumes is certainly something we can discuss further as a future option.

Wonderful tutorial.

Indeed, it is a very complex process as of now.
That being said, it is very nice to see how it can be done and how to use the actual tools for such a project.

Thanks for sharing.