Drobo

CHKDSK Limbo and Strange Status Message

I have a support case open on this, but with the holidays I expect that dialogue will be on hold until next week sometime. In the meantime, can anyone offer some insight on my situation?

Here is the problem report that I filed with Drobo tech support on 14-Dec:

[i]1) Since reformatting my Drobo to 16TB (with 5TB actual storage) and loading it with 2.5TB of data, it has been very slow to mount and dismount. Both mounting and dismounting now consistently take between 30 and 60 minutes. Is this normal? If not, how can I correct this?

2) Also, when I look at the storage graph in Drobo, under “How is my storage being used?”, the graph constantly flips back and forth between two different states, every 10 seconds or so. In one state, Available for Data / Reserved for Expansion / Used for Protection / Overhead is 3.15 TB / 931.50 GB / 497.84 GB / 3.26 GB; in the other, the same labels and bars read 3.15 TB / 0 B / 1.39 TB / 5.70 GB. Why is this happening? Does it indicate a problem?[/i]

My OS is Windows 7. The Drobo technician asked me to perform a CHKDSK /R. After running 24 hours a day for 10 days, CHKDSK at last completed the last scanning stage yesterday. For the past 24+ hours, the CHKDSK status message has read:



CHKDSK is verifying free space (stage 5 of 5)…
3613218988 free clusters processed.
Free space verification is complete.
Adding -681748308 bad clusters to the Bad Clusters File.


So, at this point, my questions are:

Has CHKDSK hung, or is it still doing something active and useful? The CHKDSK process is still consuming 25% of CPU resources.

Also, is it normal for such a large negative number of clusters to be added to the Bad Clusters File, and if so, what does this indicate?

Any thoughts on cause and correction of the underlying problem will also be appreciated.

I am hoping to soon conclude the therapy and welcome back a healthy Drobo. It is inconvenient having to work around my primary archive and primary backup storage being offline for so long.

David A. Gilmour LPPO
Lumacraft Photography

Wow, that looks to me like a major {male chicken}up. How can a supported filesystem (NTFS) on a supported platform (Windows) do such a thing?

This smells fishy to me. Are Drobo’s algorithms really bulletproof?

After 12 non-stop days of processing, CHKDSK ultimately aborted, with the message “Insufficient disk space to fix the bad clusters file. CHKDSK aborted.” The full CHKDSK status log follows.

Any insights and advice on this will be appreciated.

David A. Gilmour LPPO
Lumacraft Photography

Microsoft Windows [Version 6.1.7600]
Copyright © 2009 Microsoft Corporation. All rights reserved.

D:\Windows\system32>chkdsk z: /r
The type of the file system is NTFS.

Chkdsk cannot run because the volume is in use by another
process. Chkdsk may run if this volume is dismounted first.
ALL OPENED HANDLES TO THIS VOLUME WOULD THEN BE INVALID.
Would you like to force a dismount on this volume? (Y/N) Y
Volume dismounted. All opened handles to this volume are now invalid.
Volume label is Drobo.

CHKDSK is verifying files (stage 1 of 5)…
363776 file records processed.
File verification completed.
11 large file records processed.
0 bad file records processed.
0 EA records processed.
0 reparse records processed.
CHKDSK is verifying indexes (stage 2 of 5)…
397068 index entries processed.
Index verification completed.
0 unindexed files scanned.
0 unindexed files recovered.
CHKDSK is verifying security descriptors (stage 3 of 5)…
363776 file SDs/SIDs processed.
Security descriptor verification completed.
16647 data files processed.
CHKDSK is verifying Usn Journal…
1215544 USN bytes processed.
Usn Journal verification completed.
CHKDSK is verifying file data (stage 4 of 5)…
363760 files processed.
File data verification completed.
CHKDSK is verifying free space (stage 5 of 5)…
3613218988 free clusters processed.
Free space verification is complete.
Adding -681748308 bad clusters to the Bad Clusters File.
Insufficient disk space to fix the bad clusters file.
CHKDSK aborted.

Ok

first issue the mounting and dismounting time

under windows larger volumes take MUCH MUCH longer to mount.

the rule of thumb is that the time it takes to mount a drobo, in minutes, is roughly the volume's size in TB

so a 16TB volume takes about 16 mins (VERY roughly)

thats the argument AGAINST having massive volumes (on windows at least).

however - for a while my drobo was plugged into a machine powered by an intel atom. that took AGES to mount, like 45-60 minutes. so evidently whatever it is that takes so long to mount a big volume is dramatically affected by your machine speed - so in fact your 30-60 minutes, if you have it plugged into a relatively slow machine, may be perfectly normal.

as for the chkdsk

i have NEVER gotten windows 7 chkdsk to successfully run on an 8 or 16TB volume. it seems to get right to the end, then hangs a LONG LONG time, THEN presents the results, but never actually finishes and returns me to the command line. i have NO IDEA why. 4TB it seems happy with.

so the short version:

  1. your results seem perfectly normal to me
  2. im no help at all

CHKDSK /R is BAD! I’m not sure why they would have you use that parameter. It scans and tries to reallocate bad sectors, but with thin provisioning, the scan is bound to find sectors that aren’t addressable, unless Drobo is somehow smart enough to detect a CHKDSK scan and “fake” an appropriate response.

Given your report of negative bad blocks, I conclude that Drobo doesn’t detect bad sector scanning from a CHKDSK /R operation.
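For what it's worth, the negative figure looks like a plain integer wraparound. The sketch below only checks the arithmetic, assuming CHKDSK prints the bad-cluster count as a signed 32-bit integer (that is an assumption; nothing in this thread confirms how CHKDSK stores the value). The helper name as_int32 is made up for illustration, and the two numbers are copied from the stage 5 output posted above.

```python
# Assumption: the bad-cluster count is displayed as a signed 32-bit integer.
FREE_CLUSTERS = 3613218988   # "free clusters processed" from the stage 5 output
REPORTED_BAD = -681748308    # "bad clusters" figure from the same output


def as_int32(n: int) -> int:
    """Reinterpret a non-negative count as a two's-complement 32-bit value."""
    return (n + 2**31) % 2**32 - 2**31


# The free-cluster count, squeezed into 32 signed bits, gives the reported figure.
print(as_int32(FREE_CLUSTERS))

# Going the other way, the reported figure read as unsigned is the free-cluster count.
print(REPORTED_BAD % 2**32)
```

If that assumption holds, the two figures are the same number, which would mean CHKDSK tried to mark every free cluster on the volume as bad. That would fit the thin provisioning theory, but it is a guess from the numbers, not a confirmed diagnosis.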

Try running CHKDSK /F. That’ll fix filesystem errors without the bad sector scan.

I agree with Docchris - on my Win7 x64 box my 16TB (8TB actual) Drobo volume does the same - CHKDSK /F runs fine, fixes any errors, presents the summary then never returns to the command prompt.

On my Vista x64 box with the same Drobo connected, CHKDSK /F does complete and return to the command prompt. So it’s definitely some difference between how Win7 and Vista’s CHKDSK /F works.

To add further observed difference, my Vista x64 box takes AGES (haven’t timed it, but it’s >5 minutes) to boot up with the Drobo connected.
The Win7 x64 box does not have this hang at startup. So, other than more differences between Win7 and Vista, I’m not sure what to make of it.

Brandon

sorry i missed that he was using the r rather than the f, my observations were meant for chkdsk/f (agree with bhiga - r is a bad idea on drobo)

interesting that your win7 x64 doesnt have the hangup. you sure it isnt just a much more powerful machine?

my machine with the absurd delay was win 7 x86, i havent had a drobo connected to my x64 box, but its a quad core i7 going at 4ghz, so hopefully it would be a bit faster than my dual core atom @ 1.66 anyway!

Bah, I did dumb things while I was half dozed…

Anyway…

My Win7 x64 machine is an Atom 330 (dual-core 1.6GHz) with 4GB of RAM.
On this machine, Drobo does not delay boot-up and it mounts quickly.
However, chkdsk /f, when run, goes through its paces, then gives the summary - but never seems to exit to the prompt.

Granted, I think the longest I ever waited for it to exit was 4 hours, but that seems like it’d be more than enough? The OP is definitely more patient than me, waiting 10 days!!

On the other hand, my Vista x64 machine is an Athlon64 4400+ with 4GB of RAM.
On that machine, Drobo delays the startup of Windows for somewhere between 5 and 15 minutes, so it seems to hold to Docchris’s general figure of 1 minute per provisioned TB, as I have a 16TB provisioned volume.

I haven’t had any dismount delays, however.

very curious, my atom 330 with win 7 x86 (only 2 gb ram tho) waited months (it felt like) before booting with a 16tb drobo.

i gave chkdsk a full 24 hours, still never gave me a command prompt

Docchris, bhiga, those are very interesting comments, thank you.

My machine is Windows 7 Ultimate x64 on Intel Quad Core Q8200 with 8GB RAM.

Before opening the support ticket with DRI, I had run CHKDSK /F. It completed normally in a few hours, and reported no errors.

I was curious about how CHKDSK /R would handle thin provisioning, and alluded to that question in the support ticket dialogue at the start of the long CHKDSK run, but the tech did not address it.

Also, when I originally formatted the Drobo to 16TB, I recall that its mounting and dismounting times were normal, and I do not recall observing Dashboard’s weird bar graph flip-flop behaviour at that time. Only after I loaded the Drobo with my archive and backup data over the course of a week or so – it now contains about 2.6TB – did I begin to observe slow mounting and odd Dashboard status; whether this had anything to do with the amount of data, or whether something else was the trigger, I do not know.

I have wondered whether the alternating bar graph observation might be an indication that one of the four drives is glitchy. All four drives that I installed prior to formatting to 16TB were either new or nearly new, with no SMART anomalies at the time of installation. I did include a diagnostic log when I opened the support case, and I would have thought that any SMART type flags on any of the drives would be indicated in it, but the tech made no comment to indicate any concern with the diagnostic log content.

The diagnostic log definitely would have included info if Drobo was encountering drive data or communication errors.

You haven’t used any defrag or partition-level tools (GParted, Parted, Disk Management) on the volume, have you?

Also, what Dashboard and Firmware versions do you have?

Since my Drobo just finished relayout (hooray!) and I ended up running Chkdsk /f on it anyway, here’s what my output is for my NTFS 16TB volume, from my Vista x64 machine (it returns normally to the command prompt immediately after the summary output).

[code]The type of the file system is NTFS.
Volume label is Drobo.

CHKDSK is verifying files (stage 1 of 3)…
202944 file records processed.
File verification completed.
6436 large file records processed.
0 bad file records processed.
0 EA records processed.
1 reparse records processed.
CHKDSK is verifying indexes (stage 2 of 3)…
235942 index entries processed.
Index verification completed.
0 unindexed files processed.
CHKDSK is verifying security descriptors (stage 3 of 3)…
202944 security descriptors processed.
Security descriptor verification completed.
16499 data files processed.
Windows has checked the file system and found no problems.

16777087 MB total disk space.
3829748 MB in 179955 files.
69288 KB in 16501 indexes.
0 KB in bad sectors.
732087 KB in use by the system.
4096 KB occupied by the log file.
12946556 MB available on disk.

  4096 bytes in each allocation unit.

4294934518 total allocation units on disk.
3314318484 allocation units available on disk.[/code]
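As a sanity check on that summary, here is a small sketch that recomputes the MB figures from the allocation unit counts. It assumes CHKDSK's "MB" means 1024 * 1024 bytes and that it truncates rather than rounds (both are assumptions on my part); the constants are copied from the output above.

```python
# Figures copied from the CHKDSK summary above.
CLUSTER_BYTES = 4096
TOTAL_CLUSTERS = 4294934518
AVAILABLE_CLUSTERS = 3314318484

MB = 1024 * 1024  # assumption: CHKDSK reports binary megabytes

# Integer division mirrors the assumed truncation in the report.
total_mb = TOTAL_CLUSTERS * CLUSTER_BYTES // MB
available_mb = AVAILABLE_CLUSTERS * CLUSTER_BYTES // MB

print(total_mb, "MB total")
print(available_mb, "MB available")

# How close the volume sits to 2**32 clusters, and whether the counts
# would still fit in a signed 32-bit integer (maximum 2**31 - 1).
print(2**32 - TOTAL_CLUSTERS, "clusters below 2**32")
print(TOTAL_CLUSTERS > 2**31 - 1, AVAILABLE_CLUSTERS > 2**31 - 1)
```

Both MB figures come out the same as the report. The side observation is that a 16TB volume with 4096-byte clusters sits just under 2**32 clusters, so both the total and the available counts are too large for a signed 32-bit integer.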

oooohhhh

No, I learned my lesson about that prior to reformatting to 16TB. Previously I had tried speeding things up by enabling write caching for Drobo in Windows Device Manager. Considerable mayhem followed, and recovery took some effort. Since then, I have adhered to the policy of using only the functions built into Dashboard for disk management, e.g., I use PerfectDisk for disk defrag, and the Drobo is specifically excluded.

Dashboard 1.6.6, Firmware 1.3.5

Thanks David… Hope we can get to the bottom of this.

A few more comments on my stats.

The reparse record on my volume is because I made a reparse point (symlink) in the FS with mklink, so this typically wouldn’t appear for most users.

Here’s how it went:
Drobo was not connected to computer. I had disconnected it so relayout would complete as fast as possible (it still took somewhere over 72 hours).
Drobo was blinking orange/green (in relayout)
Drobo stopped blinking and went dark when it finished
I connected Drobo to my Vista x64 machine via USB
Vista saw the Drobo, but Dashboard did not.
I waited about 20 minutes for Dashboard to recognize the Drobo. It did not.
I started to get worried.
I checked my drives, Drobo was there, content was OK. Whew!
Ran Chkdsk /f on it. All looks good.
Exited Drobo Dashboard.
Ran Drobo Dashboard again.
Drobo Dashboard recognized Drobo.

It might’ve just been that Drobo Dashboard was PO-ed after not seeing a Drobo for 72+ hours, no clue. No harm, no foul, but it was a little weird.

im sure jennifer said that chkdsk on win 7 runs fine for dri though, so i wonder why they would get returned to the command prompt, and we wont

Here is the latest on my situation.

When I rebooted my system for the first time after the long CHKDSK run, the Drobo dismounted immediately and normally during the shutdown. It took about 16 minutes for the Drobo to mount when the system restarted, which according to member Docchris is to be expected. So my Drobo may now be mounting and dismounting normally. Perhaps the long CHKDSK /r run did correct something, although I did not see any indication of this in its status message.

The DRI tech responding to my case has asked me to provide another diagnostic log, and also to downgrade Dashboard to 1.5.1. Since changing to v1.5.1, the storage use status graph has calmed down. It only changed its status once. Initially it indicated “reserved for expansion”/“used for protection” as 0B/1.39TB. About 30 minutes after mounting, this changed to 931.50GB/496.84GB. 18 hours later, I have not noticed any further changes in this status message.
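A quick arithmetic check on those two readings, offered only as a sketch: it assumes Dashboard uses binary units (1 TB = 1024 GB), which I have not confirmed. The variable names are just for illustration.

```python
GB_PER_TB = 1024  # assumption: Dashboard reports binary units

# Second reading: "reserved for expansion" / "used for protection"
reserved_gb = 931.50
protection_gb = 496.84

# First reading: 0 B reserved, 1.39 TB used for protection
first_reading_tb = 1.39

combined_tb = (reserved_gb + protection_gb) / GB_PER_TB
print(f"{combined_tb:.2f} TB")  # prints "1.39 TB"
print(abs(combined_tb - first_reading_tb) < 0.01)
```

Under that assumption, the two readings add up to the same total of about 1.39 TB, so the graph may simply be dividing the same space between the two categories in different ways rather than reporting a change in capacity. That is an interpretation of the numbers, not something DRI has confirmed.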

Sounds like things are getting back to a more “sane” state for you David, glad to hear it.

It’s quite possible that Chkdsk did fix something in the filesystem, before it got to the “I can’t count that high” number of “bad” blocks. :)

i wonder how much they re-write chkdsk with each version of windows, is it always the best it can be, or the same old crap they keep on chucking out there

They probably had to do some update for Vista’s GPT support.
Still doesn’t explain my Win7 hangs but Vista is OK for me. Might end up being one of life’s odd mysteries.

If/when I hook my Drobo up to the Win7 machine (likely I will), I’ll let Chkdsk sit overnight.
Maybe in Vista you “pay the delay” in advance but in Win7 you “pay the delay” later?

ive left it overnight, it does nothing for me, it just sits there…