Current SU design: how does this work exactly?

Hi there,

I’d like to plan the storage configuration of my one-node farm, and for that I’d need a clear statement on how SUs are actually used:

The wiki makes different statements about what 1 SU is:

The wiki states that the price for January 2021 is for actual consumption and not reservation. How does that work exactly? Some real-case examples and clarifications would be appreciated :slight_smile:


Hmmm, and here it says 200 GB SSD for an SU. I see that @weynandkuijpers was just working on this and might be able to weigh in on what was decided.

1 Like

Seems your link does not work anymore @scott :slight_smile:

Could someone from the TF team answer these questions? I think these are critical pieces of information for people looking to get involved with TF (from a farming or a consumer perspective).

It is a mistake on our end. The definition of an SU is the first one: 1200 GB of HDD and 300 GB of SSD, resulting in 1000 GB of net usable storage. The inner workings of the storage solution use “disk managers” to manage the selected slices of disk space throughout the TF Grid. These disk managers (containers with 0-DBs) use SSDs to treat the spinning disks in the “nicest way possible”, collecting and storing information in larger chunks (in a smart way) than typical disk block sizes.

The second definition is an old one and should have been deleted. Will do that. Apologies for the late response, will improve my response times going forward :slight_smile:
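To make the “larger chunks than typical disk block sizes” idea concrete, here is a toy sketch of a disk manager that buffers small writes on the fast tier and flushes them to the spinning disk in large sequential chunks. All names, sizes, and the two-tier structure are illustrative assumptions of mine, not 0-DB’s actual implementation.

```python
class DiskManager:
    """Toy write buffer: small writes land on SSD first and are flushed
    to HDD in large chunks, sparing the spinning disk many tiny writes.
    (Illustrative sketch only, not the real 0-DB code.)"""

    def __init__(self, flush_threshold=4 * 1024 * 1024):  # e.g. 4 MiB chunks
        self.ssd_buffer = bytearray()   # fast tier: absorbs small writes
        self.hdd_chunks = []            # slow tier: receives large chunks
        self.flush_threshold = flush_threshold

    def write(self, data: bytes):
        self.ssd_buffer.extend(data)    # cheap append on SSD
        while len(self.ssd_buffer) >= self.flush_threshold:
            chunk = bytes(self.ssd_buffer[:self.flush_threshold])
            self.hdd_chunks.append(chunk)  # one large sequential HDD write
            del self.ssd_buffer[:self.flush_threshold]

# Four 3-byte writes with an 8-byte threshold: one 8-byte chunk reaches
# the HDD, the remaining 4 bytes stay buffered on the SSD.
dm = DiskManager(flush_threshold=8)
for _ in range(4):
    dm.write(b"abc")
```

The point of the pattern is that the HDD only ever sees a few large sequential writes instead of many small random ones, which is what spinning disks handle best.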


@weynandkuijpers thank you for the information!

So if I understand correctly, if someone gets 1 SU, the 300 GB of SSD is not directly usable for container deployments?

Instead, this space is reserved for the 1 TB of HDD usage and used as a sort of cache?

As for the price depending on actual consumption, how does this work exactly? Because if I remember correctly, we pay for the whole SU when we make the reservation?

1 Like

We combine the benefits of both types of storage to give consumers the best possible experience: lower-cost (although that is changing rapidly) but slow HDD, and higher-cost but fast SSD. You can reserve SSD only, but then the price of 300 GB of SSD is one SU.

The second part of the question: you can rent fractions of an SU (theoretically down to MBs).
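Putting those two statements together (300 GB of SSD alone is one SU, and fractions of an SU are possible), a quick calculation looks like this. The linear scaling for fractions is my assumption from the answer above, not a quoted pricing rule.

```python
SSD_GB_PER_SU = 300  # from the answer above: 300 GB of SSD alone is one SU

def su_for_ssd(ssd_gb):
    # Fractions of an SU are possible, so assume linear scaling (my assumption).
    return ssd_gb / SSD_GB_PER_SU

su_for_ssd(300)  # 1.0  -> a full SSD-only SU
su_for_ssd(30)   # 0.1  -> a tenth of an SU for 30 GB of SSD
```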

These disk managers (containers with 0-DBs) use SSDs to treat the spinning disks in the “nicest way possible”, collecting and storing information in larger chunks (in a smart way) than typical disk block sizes.

Is there technical documentation that goes through this algorithm? I’d like some more precise answers than “nicest way possible” and “in a smart way”, if you don’t mind :slight_smile: I think I get how most of the TF solution works, but the storage part is a big mystery to me.

Right now, if I use solutions like Nutanix for work and have confidence in them, it’s because I read all of the Nutanix Bible! If you could provide something similar that goes into deep technical detail, I’m sure it would help people gain trust in TF :slight_smile:

And I’m sorry if I’m not getting your answers, but I’m still not sure what the behavior is if I buy one SU:

  • I get 1 TB of HDD, and the 300 GB of SSD is used by 0-DB containers for accessing that 1 TB of HDD (I hope it doesn’t work like that :D, but I’d like to know what percentage of SSD space is used for this TB of HDD)
  • I get 1 TB of HDD OR 300 GB of SSD, and whichever is depleted first consumes my SU
  • If I consume only 500 GB of HDD for one month but paid for one SU for that month, will I get another month of my 500 GB of used space?

I’m sure a detailed practical example would help me a lot to understand if you don’t mind :slight_smile:

1 Like

Ok, so essentially, we work with the concept of SU and CU. Whenever you want to deploy a workload, you first need a pool. A pool contains an amount of CU and SU you can use. Essentially, the pool represents your purchased capacity.

When you deploy a workload, we calculate how much CU and SU this workload consumes per second. Once the node reports the workload is deployed, this amount will be drained from the pool every second, until the pool is empty, or the workload is deleted.

Exactly how 1 SU (or CU, for that matter) is defined depends on the user’s configuration. Basically, an SU is a combination of both HDD and SSD (though SSD counts for more than HDD). Should you only require one of the two, that is of course possible. In this way, some types of workloads allow you to mix and match. For instance, containers allow you to specify exactly how many cores and how much memory you want, and the amount of CU used is calculated from that.

Right now, payment is based on the reserved amounts. I.e., if you reserve 10 SU worth of disk space, 10 SU will be deducted from your pool every second, even if you only use 1 SU.
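The per-second drain described above can be modeled with a few lines of Python. Function and variable names here are illustrative assumptions, not taken from the actual grid code.

```python
# Toy model of a capacity pool: the *reserved* amount is drained every
# second, regardless of how much of it is actually used.
def drain_pool(pool_su, reserved_su_per_second, seconds):
    """Return what is left in the pool after `seconds` of draining."""
    for _ in range(seconds):
        if pool_su <= 0:
            break  # pool empty: the workload would be decommissioned
        pool_su -= reserved_su_per_second
    return max(pool_su, 0)

# Reserving 10 SU drains 10 SU per second even if only 1 SU is used:
drain_pool(3600, reserved_su_per_second=10, seconds=300)  # 600 left
drain_pool(3600, reserved_su_per_second=10, seconds=500)  # 0, exhausted
```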

Formula-wise, this is (currently) how unit consumption is calculated:

cu = round(min(MRU/4, CRU/2)*1000) / 1000
su = round((HRU/1200+SRU/300)*1000) / 1000

The reserved raw units are directly usable and addressable.
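For anyone who wants to play with the numbers, here are the two formulas above transcribed directly into Python. The formulas are quoted from the post; the unit interpretations in the comments (memory in GB, cores as a count, disk in GB) are my assumptions.

```python
# CU/SU formulas as posted above. Assumed units (mine): MRU = memory in GB,
# CRU = number of vCPUs, HRU = HDD in GB, SRU = SSD in GB.

def cu(mru, cru):
    return round(min(mru / 4, cru / 2) * 1000) / 1000

def su(hru, sru):
    return round((hru / 1200 + sru / 300) * 1000) / 1000

cu(mru=4, cru=2)     # 1.0  -> 4 GB RAM with 2 cores is one compute unit
su(hru=1200, sru=0)  # 1.0  -> 1200 GB of HDD alone is one storage unit
su(hru=0, sru=300)   # 1.0  -> 300 GB of SSD alone is also one storage unit
su(hru=600, sru=75)  # 0.75 -> half the HDD plus a quarter of the SSD
```

Note that the `min` in the CU formula means you are charged for the cheaper of the memory/core pair, and the rounding keeps three decimal places.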

1 Like

Whoops, looks like I linked to an issue on a private repository on Github… I’ve removed that link to avoid any future confusion. Thanks for pointing that out :slight_smile:

I’m hoping that things are clearer with Weynand’s input. Also hoping that I can get some testing in soon to give some more concrete examples of how this currently operates in practice.

Thanks for the formula, this is much clearer!

If you have some precise technical documentation about how 0-DB works, it would be appreciated.


Not sure if this helps:

1 Like

There’s also a good introduction to the technical specifics of 0-DB in the readme of its GitHub repository. Some more information on how 0-OS handles storage can be found in its documentation here.

1 Like

Thanks for the docs.

I learned:

  • data volumes requested during container deployment are stored on a standard storage pool created at boot, when 0-OS discovers new disks
  • 0-DB is a low-level key-value Redis-like server with persistent data stored in append-only files, which brings advantages like optimized use of HDDs and SSDs and out-of-the-box data history. Still, I’d like to know how often compaction is triggered to keep the growth of the append file in check
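To illustrate why compaction matters for an append-only design, here is a minimal toy key-value store: every SET appends, so overwritten values pile up in the log until a compaction pass rewrites only the live entries. This is a sketch of the general technique, not 0-DB’s actual code or compaction policy.

```python
class AppendOnlyStore:
    """Toy append-only key-value store: SET always appends; stale versions
    stay in the log until compact() rewrites only the live entries."""

    def __init__(self):
        self.log = []    # (key, value) pairs, append-only
        self.index = {}  # key -> position of the latest entry

    def set(self, key, value):
        self.index[key] = len(self.log)
        self.log.append((key, value))

    def get(self, key):
        return self.log[self.index[key]][1]

    def compact(self):
        # Keep only the newest entry per key; the history is dropped,
        # which is exactly the trade-off compaction makes.
        live = [(k, self.log[pos][1]) for k, pos in self.index.items()]
        self.log = []
        self.index = {}
        for k, v in live:
            self.set(k, v)

db = AppendOnlyStore()
db.set("a", 1)
db.set("a", 2)   # old value of "a" remains in the log for now
db.set("b", 3)
db.compact()     # log shrinks from 3 entries to 2; get("a") is still 2
```

The open question in the thread (how often 0-DB triggers such a pass, and whether it is size-based or manual) is exactly what the toy model cannot answer.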

As far as I know, for the moment the 0-DB solution is used only by the minio/S3 container solution, or by someone skilled enough to code a program that creates namespaces and uses them afterwards.
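Coding against 0-DB directly is less daunting than it sounds, because 0-DB speaks the Redis wire protocol (RESP). The command names below (NSNEW, SELECT) are taken from the 0-DB readme, so verify them against the repository before relying on them; this sketch only encodes the commands as RESP bytes without connecting anywhere, so it stays self-contained.

```python
def resp(*parts):
    """Encode one command in the Redis serialization protocol (RESP),
    which is what 0-DB speaks on the wire."""
    out = [f"*{len(parts)}\r\n".encode()]
    for p in parts:
        b = p if isinstance(p, bytes) else str(p).encode()
        out.append(b"$%d\r\n%s\r\n" % (len(b), b))
    return b"".join(out)

# Command names per the 0-DB readme (check the repo before use):
resp("NSNEW", "myns")       # create a namespace
resp("SELECT", "myns")      # switch the connection to that namespace
resp("SET", "key", "val")   # then plain SET/GET work inside it
```

In practice you would send these bytes over a TCP socket to the 0-DB port, or just use any off-the-shelf Redis client library, which does this encoding for you.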

The theory sounds good, but practical benchmarks and detailed projections of the potential gains in hard drive and SSD lifetime would be nice to convince users of the benefits of your solutions!

And a last point: currently minio gives us an object storage solution using 0-DB, but we will need a Kubernetes plugin for PersistentVolumes that uses 0-DB to really get a distributed and redundant virtual datacenter over the TF Grid!

For the moment I will try to build such a thing with existing solutions such as Ceph in a container, but it will use the standard storage pool technology…


Good analysis, and you hit the nail on the head. The Kubernetes plugin is underway, and persistent storage for containers is available. What’s missing is good and clear documentation, which is our focus for the coming weeks.


That’s good news !

Can you tell us more about the plugin you are developing?

1 Like

I will get an update for you; I need to get some more information from the devs before I can share. The idea is that you can use TF Grid storage in any Kubernetes cluster by using the plugin. Stay posted, I will come back!

1 Like

Hi, sorry not to come back to you earlier; to be honest, I forgot about it. But a lovely lady (@karoline) reminded me of the promise I made… I will get you answers shortly.

I have spoken to the team and will come back with a response today. The short version is that the plugin is not finished and will take some more time. I will try to get some more information posted here.


Here is some preliminary documentation:

The state of the plugin is being checked.