Check if a node is online

During the migration of the Bancadati and TLRE farms, I noticed that most, but not all, nodes were registered to the grid.

Bancadati registered 85 out of 88 nodes.
TLRE registered 182 out of 186 nodes.

Now I want to find these missing nodes without going through the IPMI terminal one by one to see whether they are running properly.

Is there a way, as in the old version of ZOS, to get a client of some sort onto the nodes via a local connection and check on their health?
If there is, can someone point me to the docs so I can write the scripts, etc.?
Other strategies to accomplish this are of course very welcome.

0-OS v2 sends a heartbeat to the explorer every 10 minutes.
The timestamp of the last heartbeat received is available through the explorer API. Example: https://explorer.grid.tf/explorer/nodes/EAZRGmJ4By2MR57USP1z64XU8cws9xrWTJy9yTWpgUsk
Check the "updated" field in the result.
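
A minimal sketch of that check in Python, assuming "updated" is a Unix timestamp in seconds and treating a node as online when its last heartbeat is at most two intervals old. The 20-minute threshold and the function names is_online and fetch_node are my own choices for illustration, not part of the explorer API.

```python
import json
import time
import urllib.request

HEARTBEAT_INTERVAL = 10 * 60      # seconds; 0-OS v2 reports every 10 minutes
MAX_AGE = 2 * HEARTBEAT_INTERVAL  # assumption: tolerate one missed heartbeat


def is_online(node, now=None, max_age=MAX_AGE):
    """Return True if the node's last heartbeat is at most max_age seconds old."""
    now = time.time() if now is None else now
    return (now - node["updated"]) <= max_age


def fetch_node(node_id, base="https://explorer.grid.tf/explorer"):
    """Fetch one node document, using the same URL layout as the example link."""
    with urllib.request.urlopen(f"{base}/nodes/{node_id}") as resp:
        return json.load(resp)
```

For a single node this would be called as is_online(fetch_node("EAZRGmJ4By2MR57USP1z64XU8cws9xrWTJy9yTWpgUsk")).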

A Python explorer client is available at https://github.com/threefoldtech/jumpscaleX_libs/tree/development/JumpscaleLibs/clients/explorer
There is also a SAL that provides convenient methods to create reservations and explore the grid: https://github.com/threefoldtech/jumpscaleX_libs/tree/development/JumpscaleLibs/sal/zosv2

There is also a Go client for the explorer at https://github.com/threefoldtech/tfexplorer/tree/master/client

We are in the middle of documenting the explorer API so that anyone can implement their own client for it.

If you want a UI to check the status of your nodes, the Farm management page in the admin panel also gives you this information.
The explorer also lets you filter by farm and gives you the same information.


I don’t think this is really what I’m looking for.
I want to be able to map physical nodes to the nodes on the explorer so I can check which ones are there or not.

Currently I don’t know a way to do this, and I don’t want to look for 7 needles in a haystack myself.

Hmm ok, I can see the management interface IP on the node detail page, that could work!

I could also do something with node_id_v1

Indeed, you should be able to match your DHCP configuration against the information returned in the ifaces array.
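
A rough sketch of that matching, under two assumptions that are not confirmed in this thread: each entry in ifaces is an object with a "name" and an "addrs" list of CIDR strings, and your DHCP data can be expressed as a dict of IP address to hostname (dhcp_leases is a hypothetical name). Nodes that report no ifaces cannot be matched this way, so their IPs end up in the "missing" result as well.

```python
import ipaddress


def node_ips(node):
    """Collect the bare IP addresses reported in a node's ifaces array."""
    ips = set()
    for iface in node.get("ifaces") or []:  # ifaces may be null
        for addr in iface.get("addrs") or []:
            # Assumed CIDR form, e.g. "10.10.110.127/23"; strip the prefix length.
            ips.add(str(ipaddress.ip_interface(addr).ip))
    return ips


def find_missing(dhcp_leases, nodes):
    """Return the DHCP entries whose IP is not reported by any node."""
    seen = set()
    for node in nodes:
        seen |= node_ips(node)
    return {ip: host for ip, host in dhcp_leases.items() if ip not in seen}
```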

Would it help if the hostname of the node were also included in the node document?

Hmm, I think I’ll be able to do fine with the IP addresses, no need for hostnames (so far).

Hmm, maybe it would be a good idea to add the hostname, if hostnames are present for all nodes.

I’ve been looping over all nodes and printing each node ID and the interface that has an IP starting with 10.10.
As you can see from the excerpt of the result below, not all nodes seem to have an interface with such an IP.
Also, looking at the result of the API call (underneath), ifaces returns null.
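
For reference, the loop looks roughly like the sketch below. It assumes the node documents have already been fetched into a list of dicts and that each ifaces entry carries "name" and "addrs" keys; the function name report is made up for this post. It also returns the IDs of nodes with no matching interface, including those where ifaces is null.

```python
def report(nodes, prefix="10.10."):
    """Return (text, unmatched): a printable listing and the IDs with no matching address."""
    lines = []
    unmatched = []
    for node in nodes:
        lines.append(node["node_id"])
        found = False
        for iface in node.get("ifaces") or []:  # the explorer may return null here
            for addr in iface.get("addrs") or []:
                if addr.startswith(prefix):
                    lines.append(f'{iface["name"]} {addr}')
                    found = True
        if not found:
            unmatched.append(node["node_id"])
        lines.append("")  # blank line between nodes
    return "\n".join(lines), unmatched
```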

I also don’t have a way to map with node_id_v1: I don’t have that data stored in files I have access to. I do have a list of the ZT IPs, but I no longer have access to the ZT networks (my GIG Google account has been deleted).

...
BSXnWFXa9HpkRbGrxzEMZCTvm6EY8685obo6Yt3yNzZ4
zos 10.10.110.127/23

eo42DKKMWY2cF6G9VG39RNRLQ5WfYoFapzE4x77a2oj
zos 10.10.110.125/23

DT2DiitgXyGKj8TFsEC8snBT7jy4ytQiPLQvqNwsjByb

7mYLPqhhdU2tP64ETpjsEyaDNFqMVRWgLijCeo8VRJnJ

eWKRy5VDbzn5mHzK6DPjr6E1N1ZHFkP2CaS6PPr1hC8

BH991CgyjBHdLb3mTqUYwnd2UTwZpDDt2RK7Us81pwBq
zos 10.10.110.11/23

7Gio9RjM4VJB421W7ci7mTLGSf3H7JYmaT5aVNrr9Nnk

A6YP8CJ2D2tqkSwCdBjHCLoDzDbkXiqCyvwsE2Vr2htW
zos 10.10.110.14/23

CeL6tZMAe7qd3EbBLTX36vPBCTV1SonrBuVNPW8k5tsN
zos 10.10.110.13/23

...

NYmbjAuE3UosHtD5Pxngz121NdYhiMrtYsCHxUXY2zd
zos 10.10.110.33/23

9VruBWdCHoQzaMfhL9Ky86NGbPuEFERmi1KT2gLH19HV

6TnVLJ7fMo2u4su4Giaf8PfB6iCnBWBsr9AwPdm9mj9k

2kTP36meeJXayRtFDV5Xs9Lv6Wg5awL7565m7dQNmm1m
zos 10.10.110.82/23
...
{
  "id": 433,
  "node_id": "DT2DiitgXyGKj8TFsEC8snBT7jy4ytQiPLQvqNwsjByb",
  "node_id_v1": "ac1f6b457b2c",
  "farm_id": 173635,
  "os_version": "0.3.1",
  "created": 1589209551,
  "updated": 1589545463,
  "uptime": 336244,
  "address": "",
  "location": {
    "city": "Uknown",
    "country": "Switzerland",
    "continent": "Europe",
    "latitude": 45.922,
    "longitude": 8.9844
  },
  "total_resources": {
    "cru": 8,
    "mru": 63,
    "hru": 0,
    "sru": 22356
  },
  "used_resources": {
    "cru": 0,
    "mru": 0,
    "hru": 0,
    "sru": 0
  },
  "reserved_resources": {
    "cru": 0,
    "mru": 0,
    "hru": 0,
    "sru": 0
  },
  "workloads": {
    "network": 0,
    "volume": 0,
    "zdb_namespace": 0,
    "container": 0,
    "k8s_vm": 0,
    "proxy": 0,
    "reverse_proxy": 0,
    "subdomain": 0,
    "delegate_domain": 0
  },
  "proofs": null,
  "ifaces": null,
  "public_config": null,
  "free_to_use": false,
  "approved": false,
  "public_key_hex": "b8f6b90a9b55a08e2c19def83ae472242b5f5ebbac88e357a6dd6eb0a47b69ea",
  "wg_ports": null
}
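
One observation on the document above: node_id_v1 is 12 hex digits, which has the shape of a MAC address without separators. That it really is the MAC of one of the node's NICs is an assumption on my part and not confirmed in this thread. If it holds, a small helper like the hypothetical v1_id_to_mac below would put it in the colon-separated form used by DHCP lease tables and IPMI inventories, so the two could be compared.

```python
import re


def v1_id_to_mac(node_id_v1):
    """Format a 12-hex-digit node_id_v1 as a lowercase colon-separated MAC address."""
    if not re.fullmatch(r"[0-9a-fA-F]{12}", node_id_v1):
        raise ValueError(f"not a 12-digit hex string: {node_id_v1!r}")
    value = node_id_v1.lower()
    # Split into six two-character octets and join them with colons.
    return ":".join(value[i:i + 2] for i in range(0, 12, 2))
```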

It seems a bug was recently introduced in the network daemon of 0-OS. Normally you should always get at least the details for the zos interface. I’ll check why exactly some of your nodes didn’t report them.

I also opened a feature request for the hostname: https://github.com/threefoldtech/zos/issues/new

I also checked the logs from DT2DiitgXyGKj8TFsEC8snBT7jy4ytQiPLQvqNwsjByb and found this:
[+] networkd: 2020-05-15T12:48:41Z fatal failed to create DMZ error="ndmz: could not node create pub iface 6: failed to find a valid network interface to use as parent for ndmz public interface: no interface found with ipv6"

There is no IPv6 available in the Bancadati datacenter; Moresi is working on a proposal to offer it. Should I open a new topic to discuss the IPv6-related stuff?

As you must have seen in the network documentation for 0-OS v2: https://github.com/threefoldtech/zos/blob/master/docs/network/Deploy_Network-V2.md

IPv6 is a hard requirement right now.

Yes, this document has been forwarded to Moresi.