Drobo FS from Fedora: Error -11 sending data on socket to server

Any ideas what may be causing these?

Mar 2 01:09:00 myworkstation kernel: CIFS VFS: Server our-drobo has not responded in 120 seconds. Reconnecting…
Mar 2 01:11:23 myworkstation kernel: CIFS VFS: sends on sock ffff994089f7b980 stuck for 15 seconds
Mar 2 01:11:23 myworkstation kernel: CIFS VFS: Error -11 sending data on socket to server
Mar 2 01:12:13 myworkstation kernel: CIFS VFS: sends on sock ffff994089f7b980 stuck for 15 seconds
Mar 2 01:12:13 myworkstation kernel: CIFS VFS: Error -11 sending data on socket to server

Hi rkudyba,
I’m not sure about the messages, but can I check whether your Drobo FS is connected through network equipment or directly attached to your computer?

If it is not directly attached, are you able to try attaching it directly, just to see if the same error occurs? Does it still happen even after shutting everything down and starting it again 15 minutes later?

(If you do this, it may be best to shut the Drobo down from Dashboard, or to shut everything else down first and then unplug the connection cable from the Drobo FS so that it goes into standby mode.)

It’s network-attached by Ethernet. I can’t connect it directly because it is at a different site. I did restart via the dashboard, and one backup worked without errors. Then the exact same error recurred.

Ah, thanks for the info.
Are you able to check the status in Dashboard, such as the health of each drive and the used and free space values and percentages it is showing?

I’m just wondering whether the backup data is being written to the Drobo while the Drobo is running low on space (or has almost run out), which could cause the problem.

86% free, 14% used, of total 5.38 TB, all green indicators, and the firmware is 1.2.7.

It seems I can reboot the Drobo FS and the tar backup then works without an error. If I reboot the host instead, it’s the same: one backup without error. Then on the 2nd backup the error returns. The tar backup fails with:

[code]ERROR OUTPUT:
/bin/tar: /ourhost/etc-new.tgz: Cannot close:
Input/output error
/bin/tar: Exiting with failure status due to previous errors

STANDARD OUTPUT:
Failed to rename /ourhost/home-new.tgz to
/ourhost/home.tgz: File exists

Backup of /etc FAILED[/code]

here’s the command and error in context:

/usr/local/sbin/drobo-backup -n -v -c /etc/drobo-backup.conf
Reading configfile /etc/drobo-backup.conf
/bin/mkdir /ourserver
ionice -c2 -n7 nice -n19 /bin/tar -cf /ourserver/home-new.tgz -C / --atime-preserve --one-file-system --warning=no-file-ignored --warning=no-file-changed --warning=no-file-removed home --exclude=.gvfs --exclude='Windows 7.vdi'
/bin/mkdir /ourserver
ionice -c2 -n7 nice -n19 /bin/tar -cf /ourserver/etc-new.tgz -C / --atime-preserve --one-file-system --warning=no-file-ignored --warning=no-file-changed --warning=no-file-removed etc
/bin/mkdir /ourserver
ionice -c2 -n7 nice -n19 /bin/tar -cf /ourserver/root-new.tgz -C / --atime-preserve --one-file-system --warning=no-file-ignored --warning=no-file-changed --warning=no-file-removed root
/bin/mkdir /ourserver
ionice -c2 -n7 nice -n19 /bin/tar -cf /ourserver/usr-local-new.tgz -C / --atime-preserve --one-file-system --warning=no-file-ignored --warning=no-file-changed --warning=no-file-removed usr/local
/bin/mkdir /ourserver
ionice -c2 -n7 nice -n19 /bin/tar -cf /ourserver/var-lib-new.tgz -C / --atime-preserve --one-file-system --warning=no-file-ignored --warning=no-file-changed --warning=no-file-removed var/lib --exclude=/var/lib/yum --exclude=/var/lib/rpm
/bin/mkdir /ourserver
ionice -c2 -n7 nice -n19 /bin/tar -cf /ourserver/var-log-new.tgz -C / --atime-preserve --one-file-system --warning=no-file-ignored --warning=no-file-changed --warning=no-file-removed var/log
/bin/mkdir /ourserver
ionice -c2 -n7 nice -n19 /bin/tar -cf /ourserver/var-www-new.tgz -C / --atime-preserve --one-file-system --warning=no-file-ignored --warning=no-file-changed --warning=no-file-removed var/www
/bin/mkdir /ourserver
ionice -c2 -n7 nice -n19 /bin/tar -cf /ourserver/var-yp-new.tgz -C / --atime-preserve --one-file-system --warning=no-file-ignored --warning=no-file-changed --warning=no-file-removed var/yp

Hi, I was wondering what normally happens when you take a backup and then take another one. (For example, when things are working properly, what should happen across a couple of backups?)

Is it just that each folder gets archived (in this case as a .tgz), and then the next backup is supposed to overwrite the previous one and keep the same filename?

When it says this:
“Failed to rename /ourhost/home-new.tgz to /ourhost/home.tgz: File exists”
it seems the file could not be overwritten because it already exists, but I’m not sure whether it was locked and in use by something else, or maybe didn’t have the necessary permissions. I’m not sure about the syntax, but please hang in there and hopefully someone else can chime in if they have had a similar error. If you do want to change any permissions, I think it would be best to take a full data backup first in case something goes wrong. One idea could be to try taking backups not from the Drobo to the Drobo, but from the Drobo to a computer, to see if that works.

Also, maybe the folder (or area) on the Drobo itself where the backups are being made is low on space? (Not the Drobo overall, but the internal place where the backup files are initially generated.)

There is plenty of disk space. As mentioned, when we reboot either the host that is being backed up or the Drobo, the first backup after the reboot runs fine without the input/output error. It’s the subsequent backup that hits those errors. I was hoping someone here knew what the Error -11 log messages meant.
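
On the Error -11 question: the kernel logs errors as negative errno values, so -11 corresponds to errno 11, which on typical Linux systems is EAGAIN ("Resource temporarily unavailable"). In other words, the CIFS client could not push data into the socket, which fits the "sends on sock ... stuck" lines. A minimal Perl sketch for looking a code up on your own host (`describe_kernel_error` is a hypothetical helper name, not part of the backup script):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(strerror);

# Hypothetical helper (not part of drobo-backup): turn a kernel-style
# negative error code such as -11 into its errno number and message.
sub describe_kernel_error {
    my ($code) = @_;
    my $errno = abs($code);    # the kernel logs -errno
    return sprintf("errno %d: %s", $errno, strerror($errno));
}

print describe_kernel_error(-11), "\n";
```

This only names the error; it does not say why the server stopped accepting data.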

Here is the script used to create the backups:

[code]#!/usr/bin/perl -W
use POSIX;

# Global variables

# Host name will be used as name of directory for backups on drobo

my($hostname)=`/bin/hostname`;
chomp($hostname);
my($configfile)="/etc/drobo-backup.conf";
my($tar)="/bin/tar"; # Path to tar utility
my($mkdir)="/bin/mkdir"; # Path to mkdir utility
my($verbose)=0;
my($testmode)=0;

sub Usage {
    print "Usage: drobo-backup [-v] [-n] [-c configfile]\n";
    print " -v : verbose mode\n";
    print " -c : specify configuration file (default $configfile)\n";
    print " -n : print but don't execute commands (for testing config)\n";
    exit 1;
}

# Subroutine to back up one directory to drobo

sub do_backup {
my($drobo,$args,$backup,$cond) = @_;

# The backup arg may include per-backup tar args. Strip off these
# and quotes to get at the filename

my($backuppath)=$backup;

# If quoted, remove quotes for naming. Otherwise it is required
# not to have embedded blanks so that per-backup tar arguments may follow.

if( $backuppath =~ /^"([^"]*)"/ ) { # double quotes
    $backuppath = $1;
}
elsif( $backuppath =~ /^'([^']*)'/ ) { # single quotes
    $backuppath = $1;
}
elsif( $backuppath =~ /^(\S*)/ ) { # otherwise no blanks in path
    $backuppath = $1;
}

if( -d $backuppath || -f $backuppath ) { # check it is a valid dir or file
    my($drobodir) = "$drobo/$hostname";
    # make sure the drobo subdirectory for this host exists
    my($mkdircmd) = "$mkdir $drobodir";
    if( $verbose ) {
        print "$mkdircmd\n";
    }
    if( ! ($testmode || -d $drobodir) ) {
        system($mkdircmd);
    }
    # if it did not work, bail.
    if( ! ($testmode || -d $drobodir) ) {
        print "Failed to create destination dir $drobodir\n";
        exit 1;
    }

    # construct tarfile name, e.g. usr-local.tgz

    my($backupfilestem) = $backuppath;
                            # Sanitize the tarfile name
    while($backupfilestem =~ s,^/,,) { next; } # remove any leading slash(es)
    while($backupfilestem =~ s,/$,,) { next; } # remove any trailing slash(es)
    $backupfilestem =~ s,/,-,g;     # all internal slashes become hyphens
    $backupfilestem =~ s/[^-\.\w]/X/g;# all remaining non-word chars exc . become X

    my($backupname)="$drobodir/$backupfilestem-new.tgz";
    my($backuprename) = "$drobodir/$backupfilestem.tgz";

    # $backup =~ s|/|| if $backup =~ m|^['"]?/|; # remove any leading slash

    # my($tarcmd) = "$tar -C / -czf $backupname $args $backup";

    my($tarcmd);
    if($backup =~ m|^['"]?/|){
       $backup =~ s|/||;  # remove any leading slash
       $tarcmd = "ionice -c2 -n7 nice -n19 $tar -cf $backupname -C / $args $backup";
    }else{
       $tarcmd = "ionice -c2 -n7 nice -n19 $tar -cf $backupname $args $backup";
    }

    if( $cond ) {
        $cond =~ s/^\[//;   # remove the [ ] around condition
        $cond =~ s/\]$//;
        $cond =~ s/BKPATH/$backuppath/g; # convenience substitutions
        $cond =~ s/TARFILE/$backuprename/g;
        $cond =~ s/TARDIR/$drobodir/g;
    }
    if( $cond && WEXITSTATUS(system("test $cond")) != 0 ) {
        if( $verbose ) {
            print "Condition [$cond] tests false\n";
            print "No backup of $backuppath\n";
        }
    }
    else {
        if( $verbose ) {
            if( $cond ) {
                print "Condition [$cond] tests true\n";
            }
            print "$tarcmd\n";
        }
    # tar returns 0 for success, 1 for warnings such as file changed while
    # being copied.  So we take either as meaning success.  Rename foo-new.tgz
    # to the (usually existing) foo.tgz.  N.B. system() returns status<<8.
        if( !$testmode ) {
            if( WEXITSTATUS(system($tarcmd)) >= 2 ) {
                print "\nBackup of $backuppath FAILED\n\n";
                # to avoid bad backup being renamed to good in second try, call it bad
                $backuprename = "$drobodir/$backupfilestem-FAILED.tgz";
            }
            if( rename("$backupname","$backuprename") ) {
                print "Backed up $backuppath to $backuprename\n";
            }
            else {
                print "Failed to rename $backupname to $backuprename: $!\n";
            }
        }
    }
}
else {
    print "$backuppath is not a directory or file\n";
}

}

# default arguments to use on every backup

my($tarargs)="--atime-preserve --one-file-system --warning=no-file-ignored --warning=no-file-changed --warning=no-file-removed --ignore-failed-read";

# set default drobopath according to dsm (lc) or cis (rh) network

my($drobopath)="/drobo-xx/drobo-yy"; # cis value
if($hostname =~ /\.sub\.domain\.edu/) {
    $drobopath="/drobo-yy/drobo-xx"; # dsm value
}

# Process command line arguments

while(@ARGV) {
    if( $ARGV[0] eq "-c" ) { # -c configfile
        shift (@ARGV);
        if(@ARGV) {
            $configfile = $ARGV[0];
        }
        else {
            Usage();
        }
    }
    elsif( $ARGV[0] eq "-v" ) { # -v (verbose mode)
        ++$verbose;
    }
    elsif( $ARGV[0] eq "-n" ) { # -n (no-exec mode)
        $testmode = 1;
    }
    else { # unrecognized argument
        Usage();
    }

    shift (@ARGV);
}

open(CONFIGFILE,$configfile) || die("Cannot open configfile $configfile: $!");

if($verbose) {
    print "Reading configfile $configfile\n";
}

my(@backup);
my(%condition);
my($configline) = 0;
foreach (<CONFIGFILE>) {
    $configline++;
    if( /^\s*#/ || /^\s*$/ ) { # skip blank & comment lines (first nonblank is #)
        next;
    }
    # drobo=/path/to/drobo
    if( /^\s*drobo\s*=\s*(.*)$/ ) {
        $drobopath=$1;
    }
    # tarargs=global tar arguments
    elsif( /^\s*tarargs\s*=\s*(.*)$/ ) {
        $tarargs=$1;
    }
    # backup [condition] =/path/for/backup [tar args]
    elsif( /^\s*backup\s*(\[[^\]]*\])?\s*=\s*(.*)$/ ) {
        push(@backup,$2);
        if( $1 ) {
            $condition{$2} = $1;
        }
    }
    else {
        print "Unknown config directive at line $configline in $configfile:\n";
        print;
        exit 1;
    }
}

close(CONFIGFILE);

my($path);
foreach $path (@backup) {
do_backup($drobopath,$tarargs,$path,$condition{$path});
}

# For unknown reason, rename of some files often fails. Try again here.

foreach my $tarfile ( glob("$drobopath/$hostname/*-new.tgz") ) {
    my $rename_name = $tarfile;
    $rename_name =~ s/-new\.tgz$/.tgz/;
    if( rename($tarfile,$rename_name) ) {
        print "Second try renamed $tarfile to $rename_name\n";
    }
}
[/code]
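
A note on the "File exists" message: it suggests that rename() onto an existing target is being refused on this mount. Below is a hedged sketch of one workaround, assuming that really is the cause. `replace_file` is a hypothetical helper that is not in the script above, and the scratch-directory demo stands in for the Drobo share. The fallback is not atomic, and it does nothing about the Error -11 socket stalls themselves.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper (not in the original script): move $src onto $dst,
# removing an existing $dst first if a plain rename() is refused.
# Returns 1 on success, 0 on failure.
sub replace_file {
    my ($src, $dst) = @_;
    return 1 if rename($src, $dst);
    # Only fall back when the target really is in the way.
    return 0 unless -e $dst;
    unlink($dst) or return 0;    # not atomic: brief window with no $dst
    return rename($src, $dst) ? 1 : 0;
}

# Small local demonstration in a scratch directory.
my $dir = tempdir(CLEANUP => 1);
my ($new, $old) = ("$dir/etc-new.tgz", "$dir/etc.tgz");
for my $spec ([$new, "new"], [$old, "old"]) {
    open(my $fh, '>', $spec->[0]) or die "open $spec->[0]: $!";
    print {$fh} $spec->[1];
    close($fh) or die "close $spec->[0]: $!";
}
my $ok = replace_file($new, $old);
print $ok ? "replaced $old\n" : "could not replace $old: $!\n";
```

In the script, a call like this could take the place of the bare rename() calls, at the cost of briefly having no foo.tgz on the share if the second rename also fails.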

I believe I found the issue. It was Tracker, which I have uninstalled. It seems it was indexing files and the Trash during the backup. I posted a write-up at Server Fault.

Hi, well done for finding out the cause :)

Hm, well, the problem is back. I submitted it to the Linux CIFS maintainers, who analyzed a Wireshark capture and came up with the response below. The unit is out of warranty, so Drobo won’t issue a firmware update. We could try the one-time $99 support option, but I doubt they’d provide a custom firmware upgrade. Here’s what they wrote:

"The problem appears to be firewall related because I am seeing ICMP Dest Unreachable responses to an SMB Write&X Response (frame 5816 and 5817).

Do a search on tcp.port == 42166 and you will see the response and then a DEST Unreachable, then a retransmit in 120 seconds and another DEST Unreachable etc.

However, there is also another SMB session on tcp.port == 53858 where the Drobo is returning INVALID handle on the WRITES … which seems to be the fundamental problem.

However, stuff is missing from the capture.

This session seems to indicate the problem: tcp.port == 53860

There are a bunch of writes, but eventually the window fills up and the Drobo stops responding at the TCP layer and eventually one party or other RSTs the connection. Looks like a Drobo problem to me."

Hi rkudyba, if Tracker hasn’t made its way back into the system, I just had a thought…

You know how, after a reboot, the backup routine runs fine the 1st time, but then you get the problems when the 2nd attempt kicks in (as far as I understood it)…

… are these runs (1st time and 2nd time) still carried out in the same session?

If they are, I was wondering what happens if you terminate any back-end connection and then connect again before running the backup command?

No, Tracker isn’t back. And no, the previous sessions finish. The Perl script has some error checking to retry the backup where it fails. The failure does seem to happen at the same place, usually when it starts the /etc path.