How do we troubleshoot missing resources (CLOSED)

I turned up 2 identical R620 servers, IDs 3314 and 3316. For some reason ID 3316 is not showing all my HDDs. Each server has 9 x 600 GB HDDs, totaling 5400 GB. For ID 3316 I am only seeing a total of 3600 GB, which means 3 HDDs are not being seen. Does this mean my hard drives may be bad? What do you all suggest I do, or how should I troubleshoot this?

I assume these are second-hand hard drives.
If ZOS does not recognize all of them correctly, the HDDs may not have been wiped.
If the HDDs are ok, and accessible via another operating system, try to wipe them using wipefs on Linux.
I don’t think it’s the controller as some of your disks are working.
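As a rough illustration of the wipefs suggestion above, here is a minimal Python sketch that only builds and prints the commands, without running them. The device paths `/dev/sdb` and `/dev/sdc` and the helper names are hypothetical; check the real device names (for example with `lsblk`) before wiping anything, because `wipefs --all` erases every signature it finds on the device.

```python
import shlex


def build_wipefs_command(device: str) -> list[str]:
    """Return the argv that erases all signatures on one block device."""
    if not device.startswith("/dev/"):
        raise ValueError(f"expected a /dev/ path, got {device!r}")
    # --all erases every filesystem, RAID and partition-table signature
    # that wipefs detects on the device.
    return ["wipefs", "--all", device]


def plan_wipe(devices: list[str]) -> list[str]:
    """Return one shell-ready command line per device (dry run only)."""
    return [shlex.join(build_wipefs_command(d)) for d in devices]


# Hypothetical device names, used for illustration only.
for line in plan_wipe(["/dev/sdb", "/dev/sdc"]):
    print(line)
```

Printing the plan first makes it easy to review the device list before running the commands as root on the live Linux system.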

I appreciate the response. Yes, these were all wiped: they were triple-pass wiped with a 0/1 overwrite process. What I am trying to figure out now is which hard drives are not functioning, since they all show up as recognized in Dell iDRAC.

How are they set up? As a RAID?

All individual HDDs, no RAID. I had to flash the RAID controller to remove that option in order to use the drives, per the server setup process. Each hard drive is 600 GB and there are 9 of them, totaling 5400 GB. As of now only 3600 GB is seen, which means 3 drives are either 1. not functioning, 2. functioning but not accepted, or 3. functioning and accepted but not added to the total. When I set them up, Dell iDRAC saw them all and reported no issues. I have not rebooted into iDRAC since starting the farm, so I was trying to see if there are other options before shutting the server down, loading into iDRAC, and seeing what it can see again. I am not in the server room in my office at the moment, so I am trying to figure out another way to diagnose this remotely.
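The capacity arithmetic in this post can be checked with a small Python sketch. The function name is hypothetical, and it assumes the reported total is a plain sum of whole drives, which may not hold exactly for real capacity figures.

```python
def missing_drives(drive_count: int, drive_size_gb: int, reported_gb: int) -> tuple[int, int]:
    """Return (whole drives missing, leftover GB not explained by whole drives)."""
    expected_gb = drive_count * drive_size_gb
    missing_gb = expected_gb - reported_gb
    # divmod splits the shortfall into whole drives plus any remainder.
    return divmod(missing_gb, drive_size_gb)


# Figures from the post: 9 drives of 600 GB each, 3600 GB reported.
drives, remainder_gb = missing_drives(9, 600, 3600)
print(drives, remainder_gb)
```

With the numbers from the post, the expected total is 5400 GB and the shortfall is 1800 GB, which is exactly 3 drives with nothing left over.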

Zos doesn’t provide any capability to figure out which drive is not being detected or utilized.

This is an unusual case. Normally, drives that are visible to system tools and have been properly wiped are picked up by Zos no problem. Are all the drives connected to the same controller?

Sometimes these things are as simple as a reboot, so that’s always a good place to start.

I figured as much. It would be a nice suggestion to add tools for checking things like this. Hmm, now I have to see if there is a way to reboot remotely. Does the OS have this option? I don't see anything that gives us direct access to the core OS to force a reboot. I am assuming I have to do this physically unless there is another way.

You are right. There is no other way than to shut down the 3node physically.

Developing on this idea:

As 3nodes can’t do a “graceful” shutdown (i.e. the only way to shut down is to switch off the machine or lose power), you could remotely reboot the 3node if you can remotely cut the power and then bring it back on.

EDIT: The statement above concerns Zero-OS. As you have an R620, note that with iDRAC you can do many power control operations (restart, power on/off, etc.) remotely.

video: https://www.youtube.com/watch?v=6F3OoEWN1uQ
text: https://techexpert.tips/dell-idrac/reboot-idrac-server/
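Besides the iDRAC web interface shown in the links above, a remote power cycle can also be scripted. The Python sketch below only builds an `ipmitool` command line and does not run it. It assumes that IPMI over LAN is enabled on the iDRAC, which this thread does not confirm. The address `192.0.2.10`, the user `root`, and the helper name are placeholders.

```python
import shlex

# Power actions accepted by "ipmitool chassis power".
POWER_ACTIONS = {"status", "on", "off", "cycle", "reset", "soft"}


def build_power_command(host: str, user: str, action: str) -> list[str]:
    """Return the ipmitool argv for one chassis power action over LAN."""
    if action not in POWER_ACTIONS:
        raise ValueError(f"unsupported power action: {action!r}")
    return [
        "ipmitool",
        "-I", "lanplus",  # IPMI v2.0 over LAN
        "-H", host,       # address of the iDRAC (placeholder below)
        "-U", user,
        "-E",             # read the password from the IPMI_PASSWORD env var
        "chassis", "power", action,
    ]


# Placeholder host and user, for illustration only.
print(shlex.join(build_power_command("192.0.2.10", "root", "cycle")))
```

Keep in mind that a power cycle is a hard power-off followed by power-on, which matches the point above that Zero-OS has no graceful shutdown.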

Okay, I have to say I am not a server guy per se and have never done remote iDRAC before, but I can definitely look into that. Thank you again. See, this is the help I need: just some simple direction that I can use.

Nice to know it helps!

Here’s some more good documentation:

Setting up the IP address for iDRAC:
https://www.dell.com/support/kbdoc/en-ca/000124653/dell-poweredge-how-to-configure-the-idrac8-network-ip

Specific idrac+3node information in this great video by @FLnelson: https://www.youtube.com/watch?v=h5nb09VksYw

The general philosophy with Zero OS is, no shell, no GUI, and no remote control. Anything that could potentially provide attack surface is off the table.

Systems like idrac should provide enough of this kind of functionality in hardware that includes it. Except, of course, in figuring out why your drives aren’t detected by Zos when idrac sees them :sunglasses:

As stated, I had to reboot manually, and after the reboot the hard drives still were not detected. Ultimately I had to load Linux and rewipe the drives. This time I used the process from this forum post, “How to clear disks for DIY 3Nodes?”, and this worked: all my drives are now being utilized. I have a new issue, but I will open a new discussion or search for old ones, as this discussion can be closed as resolved.
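For anyone who hits the same problem, one way to see which disks still carry old data structures before rewiping is to inspect `lsblk` output from the live Linux system. The Python sketch below parses the JSON produced by `lsblk --json --output NAME,SIZE,TYPE,FSTYPE` and lists disks that still have partitions or a detected signature. It is a check only, it does not reproduce the procedure from the linked forum post, and the sample data and function name are made up for illustration.

```python
import json


def disks_with_leftovers(lsblk_json: str) -> list[str]:
    """Return names of whole disks that still show partitions or a signature."""
    report = json.loads(lsblk_json)
    flagged = []
    for dev in report.get("blockdevices", []):
        if dev.get("type") != "disk":
            continue  # skip loop devices, optical drives, etc.
        has_partitions = bool(dev.get("children"))
        has_signature = dev.get("fstype") is not None
        if has_partitions or has_signature:
            flagged.append(dev["name"])
    return flagged


# Made-up sample in the shape lsblk emits; real output comes from:
#   lsblk --json --output NAME,SIZE,TYPE,FSTYPE
sample = json.dumps({
    "blockdevices": [
        {"name": "sda", "size": "558.9G", "type": "disk", "fstype": None},
        {"name": "sdb", "size": "558.9G", "type": "disk", "fstype": None,
         "children": [{"name": "sdb1", "size": "558.9G", "type": "part", "fstype": "ext4"}]},
        {"name": "sdc", "size": "558.9G", "type": "disk", "fstype": "linux_raid_member"},
    ]
})
print(disks_with_leftovers(sample))
```

Disks that come back in this list are candidates for another wipe. A disk that is absent from `lsblk` altogether points to a hardware or controller problem instead.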

Thanks for the feedback @amishtech (and thanks for the great clarification @scott).

Don’t hesitate to create a new post if you have any more questions. The Threefold Farmer chat on Telegram can also be quite useful. Here is the link: https://t.me/threefoldfarmers