No rewards for new nodes? Please adjust policy

marshu1 · June 24, 2024, 2:58pm

Hello,

It came to my attention that last month’s newly added nodes did not get any rewards, despite passing the 95% uptime requirement. Yet the reason given for withholding the rewards was that same 95% uptime requirement.
It turned out that the entire length of the May period was used in the uptime calculation, a clear departure from prior months, including April, when the new policies were already implemented.

After discussing with support, this was intentional:
The behavior regarding the uptime for new nodes that joined during a minting period was intentional. That is to say that the uptime for new nodes is still relative to the length of the entire period, not just the amount of the period where the node has existed.
The reason for this is that scaling makes it too easy for farmers to circumvent the uptime requirement by reregistering a node if they realize that their nodes are not going to meet the uptime requirement.
Uptime scaling was initially implemented during the v2 to v3 migration, as the plan was to have farmers migrate as early as possible. It was intended to be removed later on.
It should also be noted that scaling can undermine the idea of SLA entirely if new nodes are re-registered in an attempt to capture uptime again for the rest of the period.
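
To make the difference between the two calculations concrete, here is a minimal sketch. It assumes a 30-day minting period purely for illustration (the real period length may differ), and the function names are hypothetical, not taken from the minting code.

```python
# Assumption: a 30-day minting period, used only for illustration.
PERIOD_HOURS = 30 * 24


def uptime_full_period(online_hours, period_hours=PERIOD_HOURS):
    """Uptime measured against the whole period (the behavior support describes)."""
    return online_hours / period_hours


def uptime_scaled(online_hours, joined_at_hour, period_hours=PERIOD_HOURS):
    """Uptime measured only against the part of the period the node existed for."""
    return online_hours / (period_hours - joined_at_hour)


def downtime_budget_hours(threshold_percent, period_hours=PERIOD_HOURS):
    """Hours a node may be offline (or not yet registered) and still pass."""
    return period_hours * (100 - threshold_percent) / 100


# A node that registers 48 hours into the period and never goes down afterwards.
joined = 48
online = PERIOD_HOURS - joined

print(f"downtime budget at 95%: {downtime_budget_hours(95):.0f} hours")
print(f"full-period uptime:     {uptime_full_period(online):.1%}")
print(f"scaled uptime:          {uptime_scaled(online, joined):.1%}")
```

Under these assumptions the node is online 100% of the time it exists, but it still falls below 95% when measured against the full period, because it registered after the 36-hour budget was already used up.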

I understand the thought behind this, but I greatly disagree with it. We did not vote for a change in the uptime calculation; we voted for the 95% threshold. This sudden change was never communicated. In fact, it was explicitly contradicted. The exact answer from the 3.14 discussion thread:
“Uptime for the first minting cycle is based on the portion of the cycle that the node has existed for”.

It has quite a significant impact though, because every rational farmer would now wait until the end of the month to register new nodes. Growth will be staggered, as nodes will likely come online in waves at the end of each month. New farmers who are unaware of this will potentially have to wait up to 10 weeks for their first rewards (for only half of the service they provided) if they register just after the first 36 hours of the monthly cycle. This raises questions, and rightfully so.

This situation is the result of a flawed policy, with the adjusted uptime calculation serving as a quick fix for its obvious shortcomings. Every farmer who, for whatever reason, has 36 hours of downtime (or 2 Farmerbot violations) is basically being asked nicely to provide up to 4 weeks of unpaid service. I want to stress that I’m not against strict or harsh policies to ensure uptime, but the model can never be such that temporarily shutting down a node is the rational decision. The world doesn’t run on kindness.

I really don’t understand why we shouldn’t cut a farmer’s rewards up until the point of violation and no further. If that’s not enough punishment, maybe cut rewards for the remaining days by a small percentage as well. But at least give the farmers a reason to keep their nodes online. I created a separate forum post about this, but no meaningful responses were posted there.

Even if for some reason current policy is considered optimal, there should have been a vote for this. Instead, we get to vote for mainnet releases that serve no purpose but a formality and a delay.

Regards, Marsh

scott · July 1, 2024, 8:00pm

Hi there,

Let me say first that, yes, I did make a mistake in communicating how this uptime requirement would be implemented. In the future we will take more care in ensuring that any clarifying questions around implementation of GEPs are reviewed among the team in the same way as the original GEP text.

As you’ve echoed here, there’s a clear abuse vector in calculating uptime based on a partial period for new nodes joining the network. What it ultimately means is that we can’t enforce an uptime requirement effectively without requiring that new nodes meet the requirement based on a full period duration in their first month.

Just to clarify on this point, the uptime requirement was not enforced in April, only starting in May. Since it has been activated, it has been carried out consistently.

I don’t really see this as an issue per se. We don’t have a situation whereby we need to provide an incentive to bring capacity online today rather than at the next period boundary.

This is just an issue of documentation and communication. When the policies are stated clearly, then it’s the responsibility of the farmer to read them and plan accordingly.

Maybe there are ways to improve the rewards system based on the ideas you shared. But this is a separate topic from whether new nodes should have their uptime calculated against a reduced period length. It’s just as bad from the perspective of the user whether a farmer turns off their node or reregisters it. The outcome is the same for the user which is that their workload is gone.

So ultimately this implementation is done in a way that’s necessary to prevent abuse and therefore to actually provide the intent of what was voted on, which is an uptime service level agreement.

marshu1 · July 3, 2024, 3:55pm

Thanks for this detailed explanation.

I think the rewards model has an issue (as stated here), and I was annoyed mainly because this solution related to uptime calculation kind of signaled that the problem isn’t going to be fixed the right way. To me, the adjustment would be obvious and easy to implement (my assumption). I didn’t really get a response on this. Instead, the uptime calculation was altered which solves only 1 of 2 situations. Although reregistering is no longer possible, farmers are still asked to deliver a service for free for up to 4 weeks (in future) in case of a violation, and I still think that’s an issue.

I agree that the downsides of this new uptime are minor, but unnecessary.

stanikzai · July 5, 2024, 4:30pm

The problem is:
TFT price is so low that if we run our nodes 24/7 we will have to pay out of pocket for electricity cost every month. That is in addition to the initial investment of purchasing the equipment.
If we use Farmerbot and it fails to wake our node, we are in trouble, because we are not home 24/7 to fix it within 30 minutes. That counts as a violation for the node. If we then keep the node on for the rest of the month, we are not paid anything. This is neither fair nor good for the project.
I think it would be a good idea to pay a reduced amount to nodes that don’t meet the uptime requirement, instead of paying them nothing at all.

jakubprogramming · July 9, 2024, 4:54pm

How about paying out only for the remaining days of the month (or payout period)?
Example: Downtime on July 21 (21 days of payout lost)
Payout: Only for days from July 22 to July 31 (10 days of payout still to be received)

Result: 10/31 = 32.3% payout
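
The arithmetic in this example can be sketched as a small function. This is only an illustration of the proposal above, not an existing minting rule; the function name and the day-based granularity are assumptions.

```python
def remaining_days_payout(violation_day, days_in_period):
    """Fraction of the period's rewards paid if only the days after the
    violation day count (hypothetical rule, day-level granularity)."""
    if not 1 <= violation_day <= days_in_period:
        raise ValueError("violation_day must fall inside the period")
    return (days_in_period - violation_day) / days_in_period


# Downtime on July 21: days 22 through 31 are still paid.
fraction = remaining_days_payout(violation_day=21, days_in_period=31)
print(f"{fraction:.1%}")  # 32.3%
```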

Mik · July 10, 2024, 1:18pm

What the grid needs for organizations and projects to build with us are:

high uptime
lots of nodes

So we need to expand the grid, and we also need high uptime. That’s why we implemented the uptime requirement in the last GEP.

As of now, you need 95% (DIY) and 98% (certified) uptime for nodes. This means you need to start a new node at the very end of a minting period so you get the most uptime for the next minting period.

If we allow this, it would incite farmers to wipe their disks and get new node IDs for any kind of reason, e.g. a user deploys a workload and the farmer wipes their disk to avoid paying for the electricity used by the deployment.

How it is set up now, farmers don’t have incentive for this behaviour.

With the last GEP, farmers have two strikes for this. So if you see that your node didn’t wake up within 30 minutes, you should simply remove it from the Farmerbot. Then you won’t lose the whole minting period.

Note that WOL tech isn’t 100% reliable. For example, I have 7 nodes at a given farm: 2 nodes didn’t work with WOL, and the other 5 did. So when I got the 30-minute warning you mentioned, I simply removed those 2 nodes. Since then, I have had 0 issues with Farmerbot for the remaining 5 nodes. So the WOL issue was within the nodes themselves, not the Farmerbot.

This means that I have 5 nodes on Fbot and 2 running all the time. This is already way better in terms of electricity savings.

This is what I recommend for farmers: try the Farmerbot on all nodes, and remove the nodes that have issues with WOL. At least it gives you a big part of the electricity savings, and you never lose a whole minting period if you use the two strikes with the 30-minute rule.

marshu1 · July 11, 2024, 11:02am

If we allow this, it would incite farmers to wipe their disks and get new node IDs for any kind of reason, e.g. a user deploys a workload and the farmer wipes their disk to avoid paying for the electricity used by the deployment. How it is set up now, farmers don’t have incentive for this behaviour.

It seems to me that it’s the other way around, right? If only the remaining days after the violation were paid out, there would be no incentive either to create a new ID or to shut down the node.

scott · August 22, 2024, 10:32pm

To me, there’s a different solution to the problem of farmers wiping the disks on their nodes. Once we implement the measure in the recent GEP to give 50% of utilization revenue to farmers, they will have a real and significant incentive to attract workloads to their nodes. Users will want to rent the most reliable nodes, and so farmers also have an incentive to demonstrate that their nodes are reliable.

A node that was just wiped has no history and therefore can’t be deemed reliable. If a farmer chooses to power off their node rather than continue running it through a month they won’t mint due to missing the uptime requirement or having a violation, then their node loses overall reliability. That is to say that a node which had 90% uptime in one month still looks much more attractive than one that had only 20%, even though both won’t mint.

These factors can also aggregate over a farm. So a farm that registers a lot of node ids versus a comparatively small number of active nodes would signal a generally low level of reliability for a farm. Likewise, achieving the best possible uptime for all nodes in the farm would reflect positively on the farm.

I am open to the idea that there’s a second bracket for uptime that has a reduced reward. For example, from 70-95% uptime could be rewarded at 50% rewards (just example figures). I understand that some other projects do it this way.
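
As a rough sketch of what such a second bracket could look like, using the example figures above (70-95% uptime paid at 50%). The thresholds, the rate, and the function name are illustrative assumptions only, not a proposal for actual values.

```python
def reward_multiplier(uptime, full_threshold=0.95,
                      reduced_threshold=0.70, reduced_rate=0.5):
    """Share of the normal reward paid for a given uptime fraction.

    All default values are example figures, not actual policy.
    """
    if uptime >= full_threshold:
        return 1.0           # meets the requirement: full reward
    if uptime >= reduced_threshold:
        return reduced_rate  # second bracket: reduced reward
    return 0.0               # below both brackets: no reward


for uptime in (0.99, 0.90, 0.20):
    print(f"{uptime:.0%} uptime -> {reward_multiplier(uptime):.0%} of rewards")
```

With these figures, a node at 90% uptime would still earn half of its rewards, while a node at 20% would earn nothing.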

What I’m not crazy about is any scheme that’s based on when in the month a breach of the requirements occurred, such as only canceling rewards for the part of the month before a violation occurs. The reason is that, if we assume events that cause a loss of uptime or a violation occur randomly within each month, then it’s not fair to the farmer who happens to have an incident occur towards the end of a month.

If two farmers both happen to have one power outage in a given year that causes them to miss the uptime requirement for a single month, then it doesn’t make sense to me that one farmer can lose 99% of rewards and another can lose 1%, based only on where the event randomly falls within the month for each of them.