Virtual SATA Disk (ESXi) Health Status Not Supported (Preventing Storage Pool Creation) #14
Comments
Hm, we don't think the emulation will be needed as these are normally populated. Good catch that they're not anymore. It may be a problem with failing
Edit: another idea (we didn't test it yet), but maybe the problem is even simpler: the disk is beyond the standard supported number of disks in the UI for the 3615. After all, we're artificially forcing more disks than the 3615 supports. That idea comes from the fact that the 918+ actually does work and creates arrays... or maybe we hit some accidental bug, as v7 on the 918 and v7 on the 3615 run a very different kernel and mfgBIOS, hmm...
I'm not sure I follow... you mean because the disk shows up as Disk 2 and nothing shows as Disk 1? The Storage Manager UI does show a graphic of a DS3615xs with 12 drive slots. Is there something in the backend of lkm you're saying that will allow more than 12 disks? From the way Storage Manager words the error when creating a pool, combined with the way virtual disks appear and work on Jun's loaders (if we can assume for a second nothing else major changed in DSM 7), I think getting DSM to simply report "Healthy" on its own instead of "Not Supported" for the health status should be enough to let pool creation work. I can't pinpoint where in DSM the decision is being made to mark a disk "Not Supported" vs "Healthy" in Storage Manager.
Yes, we set it to 15, similarly to how Jun's loader set it, as it was confirmed to be working with 15 drives + boot (but with our kernel-mode fix it would probably work with the full 16 too). However, that was always a hack and DSM 7 may have a problem with it, as even the graphic doesn't fit (as you said, it shows 12).
The thing is, the healthy/not-healthy status does work on v6 (and shows Healthy on v6). Additionally, v6 doesn't have the hddmon module. So it suggests that v7 did change something.
Do you know where in DSM 6 the decision is made by the system to mark a drive "Healthy" or not? I did some preliminary searching around but can't find where that code is implemented. Also, does creating storage pools with virtual disks on Proxmox work okay? I don't have a Proxmox system to test with.
From preliminary grepping around on v6:
One crucial detail worth mentioning is that it doesn't say that the disk is failing or anything like that - it says "Not Supported", which indicates it couldn't access that info rather than that it determined the disk is not suitable.
As for Proxmox + 3615xs + v7: we can SWEAR it was broken and showed the drive as "Not Supported", but we just generated a fresh image, installed v7 and... it does show the drive as Healthy and creates the pool without a problem. The only warning/"error" we get is the standard one about the disk not being on the HCL.
p.s. As for the broken General tab... it's a JS error - the API returns data but the JS on that tab fails and an error is logged to the standard Chrome web console.
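For anyone who wants to repeat that kind of search, a rough sketch of where one might start on a DSM 6 box - the paths below are assumptions for illustration, not the original findings:

```sh
# Rough sketch only; the paths are assumptions, not confirmed locations.
# Grep the web API and system libraries for health-related strings:
grep -rli "health" /usr/syno/synoman/webapi 2>/dev/null
grep -rli "health" /usr/lib /usr/syno/lib 2>/dev/null
# Runtime per-disk state that synostorage exposes (directory may differ on v6):
ls /run/synostorage/disks/ 2>/dev/null
```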
I've made some progress on figuring out what's going on here and was able to create a storage pool with a hacky kind of method! I booted up Kali and Burp Suite to try and pinpoint where the DSM GUI is making the request for disk health status. Full write-up below, with some further notes in between the screenshots.
Turns out, when you open Storage Manager, webapi requests are made to the underlying DSM for disk info. I used Burp to intercept the server response from DSM before the GUI picked it up. Noticed there was one response with a boatload of info about the sdb disk (the one data disk I have added to the ESXi VM). Trial and error revealed that changing the intercepted "overview_status" value from "unknown" to "normal" (highlighted with a red box) before the GUI received the response works to make the GUI report "Healthy" on the disk. I changed nothing else in the server response being intercepted.
Once the GUI reported a "Healthy" disk, Storage Manager let me continue to create a storage pool as normal. It does complain about the disk not being on Synology's compatibility list but lets me continue past that error. Then I created a shared folder on the volume and it works! Lets me write data, create folders, etc.
Here's the thing though... Once I stop the intercept in Burp Suite, or once Storage Manager refreshes and pulls the disk info again (such as after the new volume is made), of course the "overview_status" value reverts back to "unknown", making the disk Health Status show "Not Supported" again. BUT, since the volume is already created, DSM allows it to still be used, with a warning in Storage Manager about this "abnormal" status.
In conclusion, I believe the code in DSM 7 preventing volumes from being created on ESXi is simply done in the GUI/web interface, not on the backend, because changing the HTTP response let the process continue. BUT, of course, this still raises the question of why DSM is returning "unknown" in "overview_status" in the first place. What is generating that response that gets sent to the GUI Storage Manager? Find that out and shim it to always report "normal" instead, and I think we have our fix. That being said, after this whole process, I have not looked at the backend DSM logs or anything else to see if it still complains about the volume being hacked together with the above method.
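For illustration only, that manual Burp edit boils down to a one-line rewrite of the intercepted JSON before it reaches the browser. The field name and values are the ones described above; response.json is just a placeholder for the captured body:

```sh
# Same effect as the manual Burp edit, shown as an illustrative one-liner:
# flip the disk's overview_status from "unknown" to "normal" in the intercepted
# webapi response before the Storage Manager GUI parses it.
sed 's/"overview_status"[[:space:]]*:[[:space:]]*"unknown"/"overview_status":"normal"/g' response.json
```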
Also have this issue on esxi 6.7 (bromolow 7) |
Me too with Parallels on bromolow with DSM 7 |
Noticed you posted your reply while I was writing up my research, where I got a storage pool creation to work by intercepting some HTTP traffic and modifying a json response (see #14 (comment)). This lines up with what you're saying about DSM not simply being able to access the right info to mark as healthy. I was grepping around on DSM 7 and noticed it contains similar sets of files as you mention for the synostorage module and webapi so hopefully not too much has changed between 6 and 7.
@ilovepancakes95 your analysis is very interesting. With my current Jun's loader 6.2.3 with LSI card passthrough, the disks are detected as real disks with SMART data working. It would be interesting to test LSI passthrough on 7.0 bromolow to check how disks are handled, but I don't have spare disks to try, and I'm not ready to risk my prod data...
A few instructions to skip this; adapt to your own configuration (a rough sketch of the commands follows this list):
3.1. Create the last partition using the free space (only if the Syno layout already exists):
3.2. Create the Syno layout, only if the disk is empty:
(If md2 already exists, you should use the next md number. The same with sd*3, of course.)
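A minimal sketch of what those steps typically look like - these are not the original commands; the device names (/dev/sdb, /dev/md2, /dev/sdb3) and the synopartition layout id "12" are assumptions for a typical single-disk setup:

```sh
# Minimal sketch only -- adapt device names and layout to your configuration.

# 3.1-style case: the Syno layout already exists, so only add the last (data)
# partition in the free space (e.g. with fdisk/parted), then run the mdadm and
# mkfs steps below against that new partition.

# 3.2-style case: the disk is empty, so create the standard Syno partition
# layout (system, swap and data partitions):
synopartition --part /dev/sdb 12

# Build a single-device md array on the data partition and format it:
mdadm --create /dev/md2 --level=1 --raid-devices=1 --force /dev/sdb3
mkfs.btrfs -f /dev/md2    # or mkfs.ext4, depending on the volume type
```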
TL;DR: it's SMART. v7 requires it. We went ahead and wrote a SMART emulator for disks without it :D The shim is part of commit d032ac4 - can you guys try the newest release and report whether it solves your issues? We've tested on ESXi 7.0.2 and it seems to be working flawlessly without any hacks.
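For anyone testing the new image, a quick, illustrative way to check the emulated SMART data from an SSH shell (the device name is only an example):

```sh
# Illustrative check; /dev/sdb is an example device name.
# With the SMART shim active these should return identify info and an ATA
# attribute table instead of failing with an access error:
smartctl -i /dev/sdb
smartctl -d ata -A /dev/sdb
```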
Hello, I'm trying to play with the new build. I have a dedicated virtual SATA controller for the loader (SATA0:1). It worked with this before.
Here are my current settings: {
I want to use a SAS HBA IT-mode controller, so I prepared a loader with supportsas enabled, but currently the SAS card is not plugged into the VM. Only a virtual disk is added to virtual SATA1.
If I add "supportsas": "yes" in synoinfo,
If I remove the supportsas line, the disk is detected but the install fails at 55%. Once booted, the serial log spams:
Serial console log in attachment. Thanks
Actually, it works with the 6.2.4 loader.
Edit: I confirm it fails on 7.0
Now it works like a charm even on 7.0.1
I'm seeing the same as @OrpheeGT
I too am getting the same as @OrpheeGT and @Scoobdriver: trying to install DSM 7 (3615xs) now fails again at 55% after working flawlessly in previous releases. I confirmed I am using the SATA boot menu option, but /var/log/messages shows the following:
Hi, it looks like the SATA shim fails with: [ 9.296466] <redpill/boot_device_shim.c:48> Registering boot device router shim
It should be fixed now - we borked a rebase causing a variable to be initialized after it's checked and not before. So practically it affected anybody trying 3615xs with SATA boot.
@OrpheeGT: FYI: you don't need to use two separate controllers (unless you have some other reason).
@OrpheeGT: As far as we know (we didn't dig deeper into that) you cannot simply add
@OrpheeGT: Thank you for the log. That gpio spam means that the module unloaded/crashed, and we can see why in the log.
@OrpheeGT: Try again now :)
@kaileu: It looks like you have ESXi with a SCSI disk - that may not actually work, as we didn't test the combo of SCSI+ESXi, only SATA+ESXi. You can try again with the new fix. As to SMART, you should do
@Scoobdriver: Should now work.
@ilovepancakes95: GPIO flood on 918 means that the module didn't load or crashed completely. So when you see something like that, scroll up in the log and see if there are any errors beforehand (see the sketch after these replies).
@labrouss: Yup, it's the broken rebase right there. Can you check again?
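Following that advice about scrolling back through the log when the GPIO flood appears, something like this from the serial console or an SSH shell can help; the exact filters are illustrative:

```sh
# Illustrative only: pull the messages that precede the GPIO flood so the
# actual module load error is visible.
dmesg | grep -iE "redpill|shim|error" | head -n 80
grep -iE "redpill|gpio" /var/log/messages 2>/dev/null | head -n 80
```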
Tested the 3615 loader on VMware Workstation:
@ttg-public Well, my LSI HBA IT-mode card needs the mpt2sas module, and it is already included inside the loader.
Edit: about the SATA1 controller - if I remove "DiskIdxMap": "1000", "SataPortMap": "4" and use the same SATA1 controller for both the loader and the data disk, DSM asks me if I want to erase 2 disks instead of only the one on the SATA2 controller.
The problem with changing
What is the message saying precisely? This message in the installer is very confusing if you have a single disk. It will say that it will erase "2 disks" where it DOESN'T mean "TWO DISKS" but "DISK TWO". If you have more than one then the message says "2 4 11 disks" which is clunky but makes sense (so you know that the "disks" relates to disks number 2, 4, and 11).
@ttg-public I'm only using the DS3615xs platform, as my CPU is not compatible with DS918+... Does it mean that even enabling the mpt2sas module (with supportsas or by manually loading it) is not enough to make an LSI HBA passthrough card work? You may be right about TWO DISKS vs DISK TWO, I will test again.
And with supportsas = yes enabled: no disk detected. As a reminder, my LSI card is not enabled, but I expected to see the virtual 16 GB disk at least.
@OrpheeGT Hm, if you're using the 3615 it should support SAS out of the box, but maybe they filter for only their own SAS controllers. We can try to force all SAS ports to be seen as SATA, but we don't feel confident just publishing that without testing, as we're lacking a free LSI card on hand to test with (but soon we will get one ;)). We saw you created issue #19, so let's continue the discussion about SAS specifically there.
@MartinKuhl: can you tell us something more about the config? Does it only happen on Parallels, or is that just the only platform you tested on?
Hi @ttg-public, this issue only appears with Parallels; it is the only tool I am using for testing.
Yep, confirming it works now with commit 021ed51. Thank you!
Sorry to write in a closed topic. Just wanted to thank @ilovepancakes95 and the other contributors, and add my 2 cents for those who will google "Access Error" in the Synology DSM Storage Manager.
Storage Manager seems to make a call to smartctl -d ata -A /dev/sdX when evaluating the health of each internal disk. Pay attention to the -d ata part - it asks for the ATA device type only, so smartctl will output no relevant information if your internal drive is of another type, and that's exactly what causes the drive health warning. In my case, I managed to pass an external USB drive off as an internal one so that it could be recognized by Storage Manager, and it is the -d sat option that allowed smartctl to output relevant information. So I managed to make my fake drive's status be considered healthy by means of the following simple shim:
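What follows is only a minimal sketch of such a wrapper, not the original script; it assumes the real binary has been moved aside as smartctl.real and the wrapper dropped in its place, and the paths are illustrative:

```sh
#!/bin/sh
# Minimal sketch of the wrapper described above -- not the original script.
# Assumed setup (paths are illustrative):
#   mv /usr/bin/smartctl /usr/bin/smartctl.real
#   install this script as /usr/bin/smartctl
REAL=/usr/bin/smartctl.real

args=""
prev=""
for a in "$@"; do
    # Storage Manager invokes "smartctl -d ata -A /dev/sdX"; swap the device
    # type to "sat" so SAT-bridged (e.g. USB) drives return real SMART data.
    if [ "$prev" = "-d" ] && [ "$a" = "ata" ]; then
        a="sat"
    fi
    args="$args $a"
    prev="$a"
done

exec $REAL $args
```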
Can install DSM 7 (3615xs) just fine on ESXi with latest lkm release, however when I try to create a storage pool/volume in DSM, despite the virtual SATA disk showing up okay in Storage Manager, it blocks use of it because the "Health" status is "Not Supported" according to DSM. I know actual "health status" and even SMART will not work/be useful with virtual disks but DSM 7 will at least need to think the virtual disk has a status of "Healthy" in order to let the disk be used. In Jun's loader, the health status shows "Healthy" for virtual disks, even though when you click "Health Info" it shows "Access Error" and no actual health stats.
I am assuming there is a flag somewhere that gets written in DSM on whether a disk supports health status or not and then the actual status of the disk. I did some initial digging and found "/run/synostorage/disks/sdb" contains files that appear to show disk information and certain compatibility flags. While nothing in there says "health" I compared these files with a real DSM 7 system and changed some of them to reflect "normal", "supported", etc. in the appropriate places. This doesn't produce any immediate changes in DSM or Storage Manager for the disk so I tried rebooting DSM but the values in that folder change back. I am assuming there will need to be a way to shim synostorage into thinking the disks are all healthy before DSM is loaded fully. Not sure where to go from here. Some pictures are below.
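For reference, a quick way to dump those per-disk flag files from an SSH shell; the directory is the one mentioned above, and the loop itself is only an illustration:

```sh
# Dump every flag file synostorage exposes for the disk in question.
# /run/synostorage/disks/sdb is the directory referenced above.
for f in /run/synostorage/disks/sdb/*; do
    printf '%s: %s\n' "$(basename "$f")" "$(cat "$f" 2>/dev/null)"
done
```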