Node 1655 does not seem to be connected correctly [RESOLVED]

Hi, not sure who to contact, but I have been deploying VMs on node 1655, farmID 84 (Terminator). The network configuration is not done properly. AFAICS I have provisioned an IPv4 address:

{
    "version": 0,
    "contractId": 14356,
    "nodeId": 1655,
    "name": "VMc02ecdfc",
    "created": 1673433242,
    "status": "ok",
    "message": "",
    "flist": "https://hub.grid.tf/tf-official-vms/ubuntu-18.04-lts.flist",
    "publicIP": {
        "ip": "87.251.36.6/24",
        "ip6": "",
        "gateway": "87.251.36.1"
    },
    "planetary": "304:5069:f7aa:c456:ede2:6255:b0c8:8607",
    "interfaces": [
        {
            "network": "NW61a62ca3",
            "ip": "10.20.2.2"
        }
    ],
    "capacity": {
        "cpu": 4,
        "memory": 4096
    },
    "mounts": [
        {
            "name": "DISK84fd1857",
            "mountPoint": "/",
            "size": 53687091200,
            "state": "ok",
            "message": ""
        }
    ],
    "env": {
        "SSH_KEY": "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDAOP0h6VImNcxnIBRMoMfbMfb0xwGHDlaPxZ+nu0CL8ATJekVDHDLMGEPdvACfHBe0sqIw/l6jqoEMR4Dzhjgm4bVEUBVEnG1FvkeNB59sT2DOxDCZuqJvjx2M1bJlH8AR/JQXxUQ+zvfTbavc4/zfCuJm4PYNUsmEt/IQmRwLznGOkoJbwYLhKCC3ykZd0EGpmCWgUUYn0ihaaYkyrliQi5Ny00x0s6jOIJg0CG2Xh5xcrkhOfCZMxZAB+/LGQpZ3tu+Cy5jRf8V/JZ8XQmtYM2GmBUZ1KGcMcsGzrtuudn13JeYLtWJBw6A7Q3Fb7dQSCMLC9UA0uMSZk67M6DFV john@RescuedMac"
    },
    "entrypoint": "/init.sh",
    "metadata": "{\"type\":\"vm\",\"name\":\"VMc02ecdfc\",\"projectName\":\"Fullvm\"}",
    "description": "",
    "corex": false
}

But pinging it shows a routing issue inside the DC network, or in the switch/router connecting the 3nodes:

➜  ~ ping 87.251.36.6
PING 87.251.36.6 (87.251.36.6) 56(84) bytes of data.
From 213.136.2.25 icmp_seq=1 Destination Host Unreachable
From 213.136.2.25 icmp_seq=2 Destination Host Unreachable
From 213.136.2.25 icmp_seq=3 Destination Host Unreachable
From 213.136.2.25 icmp_seq=4 Destination Host Unreachable

Or:

 ping meet.mytrunk.org
PING meet.mytrunk.org (87.251.36.6) 56(84) bytes of data.
From lo0.leaf-sw1.bit-2d.network.bit.nl (213.136.2.25) icmp_seq=1 Destination Host Unreachable
From lo0.leaf-sw1.bit-2d.network.bit.nl (213.136.2.25) icmp_seq=2 Destination Host Unreachable
From lo0.leaf-sw1.bit-2d.network.bit.nl (213.136.2.25) icmp_seq=3 Destination Host Unreachable

From within the VM (you can connect to it over the planetary network):

root@meet:~# ping 1.1.1.1
PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data.
From 87.251.36.6 icmp_seq=1 Destination Host Unreachable

Also a tracepath from the VM does this:

root@meet:~# tracepath 1.1.1.1
 1?: [LOCALHOST]                      pmtu 1500
 1:  ???                                                 1403.418ms !H
     Resume: pmtu 1500 
root@meet:~# 
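For whoever picks this up, here are a few checks from inside the VM (over the planetary address) that should show whether the gateway answers at all. A rough sketch using standard iproute2 commands, with the addresses taken from the deployment above:

ip -4 addr show                # is 87.251.36.6/24 actually configured on an interface?
ip route show default          # does the default route point at 87.251.36.1?
ping -c 3 87.251.36.1          # can the gateway itself be reached?
ip neigh show 87.251.36.1      # FAILED/INCOMPLETE here means ARP towards the gateway is never answered

A Destination Host Unreachable coming from the VM's own address, as above, usually means that last step: the gateway simply does not answer on that segment.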

In the meantime I will delete the VM; I need to have something deployed that has network connectivity.

@teisie, I believe 1655 is yours?

Seems that I am late to the party; it's been detected and reported a long time ago. Here's the issue: https://github.com/threefoldtech/test_feedback/issues/364

Hello, yes indeed this is my node. The problem is that I asked the team many times to help me with the IPv4 setup but never got a clear answer. Before I went to the DC I asked @RobertL and Jan De Landsheert what I needed to do, because Robert got it working already. My problem with the setup is that every cable is plugged into a random switch, so there is no separation between private and public. Unfortunately I only heard that after going to the DC. I will be at the DC again in about 10 days and will test everything there to be 100% sure. If @weynandkuijpers can wait 10 days I should have a valid setup and the nodes should be working.

No worries, it's just that these are fantastic, large nodes and I have not been using them. What do you mean by "every cable is in random switches"? Let's do some planning before you go to the DC and make sure we can get your server park online and ready for workloads :slight_smile:

Hé Weynand, according to the GitHub issue that Jan unfortunately sent me after I went to the DC, there should be 2 switches, one for the first cable and one for the second, to separate the public and private networks. Also, cable 1 should be in NIC 1 and cable 2 in NIC 2.

What I did wrong is that all these cables are plugged in randomly, so there is really no separation.

Like it says here, all the way at the bottom:

If there is another way to fix this on a UniFi Dream Machine Pro, I would really like to know!

Well… it is possible to run both NIC connections (the ZOS primary grid connection, NATed via a router that supplies local IPs via DHCP, and a second connection for public IP routing) on a single switch. But it depends on how your WAN uplink is configured. For example, if your DC provides you public IPs (to be used for the nodes) in the exact same range/net as the gateway (and the router's WAN IP), it should work. In this case the second NIC connections would bypass your router and talk directly to the gateway. This would also work with a second (or more) switch(es).

Of course a clean setup would run on separated/isolated nets, which could easily be achieved by configuring VLANs on the Dream Machine and the switches. However, we need more information about your setup to figure it out. I had the setup described above working in a DC, running 40+ nodes on two Ubiquiti UniFi switches without VLANs. You can shoot me a DM and we can have a quick call to figure it out if you don't want to publish sensitive information about your setup here.
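If you want to verify that the public path really bypasses the router, a rough check from a Linux host plugged into the public segment (same switch/ports as the nodes' second NICs) is to look at which MAC address answers ARP for the DC gateway:

ping -c 1 87.251.36.1          # populate the neighbour table
ip neigh show 87.251.36.1      # note the MAC that answered

If that MAC belongs to the DC's gateway and not to one of your router's own interfaces (compare against the MAC addresses of your router's interfaces), the second NICs are talking to the gateway directly.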

I can partially follow what you mean. The internet gateway is the same as for the public IPs, if that's what you mean.

But somewhere it has to know which cable to bypass and which not, right?

The cables are plugged in completely randomly. Every node has 2 cables, but both cables are randomized. So there are also nodes where the 2 cables connect to 1 switch, and nodes where it's separated.

Sure, what's your Telegram?

The question is: how is the DC providing your internet uplink? I guess your rack is uplinked with one (or more) CAT/RJ45 connections (or fiber or SFP+, it doesn't make any difference) to which the public IP range is routed, and you have connected this uplink to the WAN side of your router? Or the DC has its own router/gateway installed in your rack, and you connect the WAN side of your own router to a LAN port of the gateway/router provided by the DC. Correct?

In this case your setup needs a very specific configuration (it also depends on some other parameters). Given that your gateway uses the same IP range as the public IPs provided to the nodes, your router isn't really routing… that's simple switching. A setup like this would require "bridging" two different router ports that both use the same IP range. This is possible, but it isn't needed, and it requires much more resources (CPU and memory) on your router as well as specific routing and firewall rules. "Routing" between similar nets is used when you want to build a "stealth firewall" (also called a "transparent filtering bridge"). I guess that's not what you want to do, unless you are trying to intercept your nodes' traffic and hack workloads deployed on your nodes with, for example, a man-in-the-middle attack.

There is a very simple solution that avoids a setup like this, but it requires that your DC uplink is established with a simple static IP (which is the case in most DCs, and also here). If you have to use PPPoE or PPTP/L2TP (like a consumer internet connection at most homes), this would not work.

Let's assume your WAN is established with a static IP. In this case you simply attach the WAN uplink provided by the DC to one of your switches and NOT to the WAN side of your own router. The WAN side of your router then needs to be attached to that switch too. By doing so, your nodes will be able to connect directly to the DC gateway (in the same way your router connects its WAN side to the gateway) without the public IP traffic being routed/bridged through your router (bypassing it). With a network configured like this, it is absolutely not important on which ports you connect which NIC of your nodes; you can just plug them in anywhere. But (!) with one restriction: the DC uplink must use a static IP. Dynamic IP would not work either, because you would then have two DHCP servers in the same physical network (the one from the DC and your own router).

Looks like this…

PS: If you are using any kind of LOM (lights-out management), you should NOT connect those interfaces to the switch where your WAN uplink goes, because you would expose the LOM interfaces directly to the internet.
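PPS: one way to sanity-check the uplink described above before plugging in any nodes is to take a spare Linux laptop, connect it to the switch that carries the DC uplink and give it one of the public IPs by hand. A rough sketch (87.251.36.10 and eth0 are placeholders for a free IP from your block and the laptop's NIC; run as root):

ip addr add 87.251.36.10/24 dev eth0
ip route add default via 87.251.36.1
ping -c 3 87.251.36.1          # is the DC gateway reachable directly?
ping -c 3 1.1.1.1              # and the internet beyond it?
# clean up afterwards
ip route del default via 87.251.36.1
ip addr del 87.251.36.10/24 dev eth0

If both pings work, a node's public NIC on that same switch should work the same way.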

Amazing advice and wisdom here @Dany

@teisie: would love to know if this works out for you.

Pretty sure @weynandkuijpers can check after you have reconfigured the setup

I will, thank you @Dany for all the extended insights and help!

@teisie: in addition to yesterday's call, here is the post on how to assign fixed public IPs on a 3node via the Polkadot UI:

Just went to the DC and this is the config:

So the public part is completely separated from the private part.

But unfortunately still not working :unamused:

The setup definitely should work!! When the router is connected to the internet/gateway with static public IP settings on its WAN side, then every other host connected to "switch 2" should be online too.

Are you sure that the IP block is routed correctly?

Can you ping the WAN side of your router from outside the DC? (The router needs to be configured to respond to pings on its WAN side.)
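A quick way to narrow it down from a machine outside the DC (the WAN address below is just a placeholder, fill in your router's real one):

ROUTER_WAN=203.0.113.1           # placeholder: replace with your router's WAN IP
ping -c 3 "$ROUTER_WAN"          # does the router's WAN side answer at all?
traceroute -n "$ROUTER_WAN"      # where does that path stop?
traceroute -n 87.251.36.6        # compare with the path towards the 3node's public IP

If the path towards the router's WAN IP is clean but the path towards the node's public IP dies at the DC's leaf switch (like the 213.136.2.25 replies earlier in this thread), the /24 is most likely not being routed towards your uplink.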

You need to talk to the DC guys… it looks like there is something weird with the routing.

This is the result of a traceroute to 87.251.36.1:

and this is the result of a traceroute to 87.251.36.7:

As you can see… the request is hopping across different hosts along the way. This is not normal for plain routing of a /24 net.

The weird thing is I tried a server with a static public IP to test the IPs, and it was working perfectly. It just seems the public NIC doesn't get a static IP. Can I check somewhere whether a NIC got a static IP through the deployment, or anywhere else?

Could it be that my DC doesn't automatically allow SSH port 22?

That's impossible. Apart from that… the public IP on the 3node should at least respond to pings.
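A blocked port 22 and a missing/unrouted IP look very different from outside; a rough way to tell them apart (assuming nc is installed):

ping -c 3 87.251.36.6            # ICMP is independent of any SSH/port-22 filtering
nc -vz -w 5 87.251.36.6 22       # does anything answer on the SSH port at all?
# and inside the VM, over the planetary address:
ip -4 addr show                  # was 87.251.36.6/24 actually configured on the public NIC?

If the ping already fails, a filtered port 22 is not the explanation.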