CapRover deployment inaccessible [Resolved]

I deployed a CapRover instance and launched a WordPress app on it.
A few issues have come up, so I am trying to log in to the CapRover instance, but it refuses incoming SSH connections.

    ssh: connect to host 185.69.167.215 port 22: Connection refused
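
A "Connection refused" means the host answered but nothing accepted the connection on port 22, which is a different failure from a timeout (no answer at all). As a rough sketch for telling the two apart from the client side, the following uses only the Python standard library; the helper name `probe_tcp` is made up for this illustration and is not part of any ThreeFold or CapRover tooling.

```python
import socket


def probe_tcp(host, port, timeout=5.0):
    """Classify one TCP connection attempt.

    Returns "open" if the connection is accepted, "refused" if the host
    actively rejects it, "timeout" if nothing answers in time, and
    "error" for any other network failure.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError:
        return "error"
```

Calling `probe_tcp("185.69.167.215", 22)` from the machine that runs `ssh` would show whether the port is refusing connections or simply not answering.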

The JSON file looks as follows:

    {
    "version": 0,
    "contractId": 14620,
    "nodeId": 3010,
    "name": "CRLnaiein",
    "created": 1674043928,
    "status": "ok",
    "message": "",
    "flist": "https://hub.grid.tf/tf-official-apps/tf-caprover-main.flist",
    "publicIP": {
        "ip": "185.69.167.215/24",
        "ip6": "",
        "gateway": "185.69.167.1"
    },
    "planetary": "",
    "interfaces": [
        {
            "network": "NWnaiein",
            "ip": "10.200.2.2"
        }
    ],
    "capacity": {
        "cpu": 1,
        "memory": 1024
    },
    "mounts": [
        {
            "name": "data0",
            "mountPoint": "/var/lib/docker",
            "size": 53687091200,
            "state": "ok",
            "message": ""
        }
    ],
    "env": {
        "SWM_NODE_MODE": "leader",
        "CAPROVER_ROOT_DOMAIN": "naiein.com",
        "CAPTAIN_IMAGE_VERSION": "v1.4.2",
        "PUBLIC_KEY": "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQCwa04MlP4jib8+UdKMOzoWUfAFqC2nGrLFlImSqdQUDdjDtfgzVAYcbjtex0hncP2rotX76uCnVdzWMIoJMMm+xNkHlkbUB9GT2LAijHdKZyxthwDielV1hRvUBVSsSB4xNGGgafSIoYF+qsGL9NftlqLv04tsVgL75mtJ9i82FJ6GiZ/mh64AsvWsF8IJHhm+O3y/Su1ta1scLzELzrrn8kEGRftkvJl3uQwStAi8/N7/WWYRb0fO7uuV1pKJg5kT5gCMzhjLS2Mwruo0bkE69p4y/N3NIbH2LNsKPWueyQAwd24e7zeCPNuY6Sz+RIS/UbBIahL68NNA3alOLwRT5KjzSXaI9fiUSQSyRi33H6/1NB6VJzXEENpXqvOe+Bj5N5GgxDZ0bB1sjci+/cPLDg8OnqBALqe62AtgLx998goowuItmIACBYVFsELpECazE1buTup3Fualy+IJi8x1yIxBwr5zKC8jKD7rXkFBmUtyd7kYddgRqzX27teyen0= toto@Totos-MacBook-Air.local",
        "DEFAULT_PASSWORD": "///"
    },
    "entrypoint": "/sbin/zinit init",
    "metadata": "{\"type\":\"vm\",\"name\":\"naiein\",\"projectName\":\"CapRover\"}",
    "description": "caprover leader machine/node",
    "corex": false
    }
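
For troubleshooting, the relevant values can be pulled out of a file like this with a few lines of Python. This is only a sketch: the field names are the ones visible in the JSON above, while the function name `summarize_deployment` and the assumption that `size` is given in bytes are mine.

```python
import ipaddress
import json


def summarize_deployment(raw):
    """Extract the values most useful for connectivity checks."""
    doc = json.loads(raw)
    # "publicIP.ip" carries a prefix length (e.g. /24); SSH needs the bare address.
    iface = ipaddress.ip_interface(doc["publicIP"]["ip"])
    return {
        "ssh_host": str(iface.ip),
        "network": str(iface.network),
        "gateway": doc["publicIP"]["gateway"],
        "status": doc["status"],
        # Assumption: mount sizes are expressed in bytes.
        "disk_gib": [m["size"] / 2**30 for m in doc["mounts"]],
    }
```

The point of `ipaddress.ip_interface` is to strip the `/24` suffix safely, so the address passed to `ssh` is the host address and not the CIDR string.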

http://captain.naiein.com says that nothing is there yet and https://captain.caprover.naiein.com/ times out. These used to be fine…
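
When a dashboard URL stops responding while another app on the same server still works, one quick check is whether each hostname still resolves to the deployment's public IP. The sketch below uses only the standard library; the helper names `resolved_ipv4` and `points_at` are hypothetical and chosen for this example.

```python
import socket


def resolved_ipv4(hostname):
    """Return the set of IPv4 addresses a hostname resolves to (empty on failure)."""
    try:
        infos = socket.getaddrinfo(
            hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
        )
    except socket.gaierror:
        return set()
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def points_at(hostname, expected_ip):
    """True if the hostname resolves to the expected IPv4 address."""
    return expected_ip in resolved_ipv4(hostname)
```

For example, `points_at("captain.naiein.com", "185.69.167.215")` would tell whether that name still points at the server; a `False` result would suggest a DNS problem and not a problem with the VM itself.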

The WordPress editor is still available, though, via http://naiein-web-wordpress.caprover.naiein.com/

But this is useless if I can’t modify the CapRover deployment, since I need to change a few PHP files and check the error logs.

Am I missing something here? How can I access the deployment?

Hi @tototator

I am not sure but can you check if you have enough TFT in your deployment wallet?
Try and add some more, perhaps.

For example, Teis in the Telegram channel couldn’t SSH into his 3nodes due to a lack of funds in the wallet.
When he put more TFT back into the wallet, the SSH connection was finally working.

You can see whether your deployments are in the grace period in the Deployments section of the Playground.

If that is the case, once you add more TFT to your wallet, the deployments’ status should go back to “Created” instead of “GracePeriod”. It can take up to an hour or so.

Hi Mik,

Thanks for the quick response.

Fortunately or unfortunately, I have enough TFT in the wallet and the deployment is not in a GracePeriod.

OK. Let us troubleshoot some more then.

Do you have a firewall set up?
It could be blocking the port.

What if you try to restart the SSH server?

    sudo service ssh restart

Did you change the SSH key pair during your deployment?
Sometimes it happens and then the SSH connection doesn’t work anymore.

The SSH server is running on the remote VM, which isn’t reachable over SSH in the first place. So restarting it this way won’t help.


I did some checks. The CapRover deployment includes an SSH server in the image, and I was able to connect via SSH to a freshly deployed instance.

Then I tried to SSH to the instance at 185.69.167.215, and it looks like it’s trying to authenticate, which means the SSH port is now accepting connections.

Is the issue ongoing for you @tototator?

I just checked: the issue seems to be resolved, and I can SSH into the instance.

This bothers me, though: why does it work now and not before? What changed, and how can I prevent this from happening in the future?

Did you use the same profile manager?

Perhaps you logged into the profile manager and the connection then worked.

Yes, I only use this one profile manager on this machine.

OK. Thanks for the information.

I will open an issue on GitHub. The TF Dev team might be able to have a look and find out why this happened. I will let you know how it unfolds.

EDIT: The issue is available here https://github.com/threefoldtech/grid_weblets/issues/1315.

Changing the profile in the manager could only change the SSH key supplied to future deployments—it couldn’t change an existing deployment.

The only other issue similar to this I could find was caused by a RAM issue in the node. I suspect it could be some intermittent problem with either the node hardware or networking, but I wouldn’t rule out some software issue in Zos either. Hopefully we can get some input from dev/ops to narrow it down.
