Buffalo NAS-Central Forums


All times are UTC+01:00




Posted: Fri Oct 12, 2007 12:09 am
Newbie

Joined: Sat Aug 04, 2007 12:45 am
Posts: 51
Location: North West, UK
Hi there,

I'd like to share my experience with anybody who is willing to read this ;) and ask for help.
I've got an LS1 with FreeLink upgraded to Etch and a 2.6.22.x kernel. I've got a couple of USB hard drives and I tried to use one of them with the LS1. However, after a couple of tests it turned out that the box simply hangs whenever I try to access, copy, read or write larger amounts of data on the USB drive. The transfer is also very slow at times. I tried to copy 70GB; it took more than a day (about 650KB/s), and in the end some of the big files I copied differ from the originals.

I've spent the last two weeks testing and rebooting and came to the conclusion that something may be wrong with the kernel. Here is why:
- I've tested 2 different USB disks & enclosures to make sure the hardware isn't faulty. They both work fine on my other machines.
- When I boot into the 2.4.17 kernel, everything is rock solid (albeit a bit slowish at 5MB/s).
- When I boot into 2.6.22 and THEN attach the USB drive, things mostly work fine, even though it crashed on me once.
- When I boot into 2.6.22 with the USB drive attached, it goes downhill.
I can crash the box every time simply by running
Code:
 hdparm -tT /dev/sda1
twice (or 3 times if I am lucky).
I've also tried to do some performance tests using bonnie++. This also crashes the box within 15 minutes of starting the test. Doesn't happen with kernel 2.4. The parameters I am using with bonnie++ are:
Code:
time bonnie -d . -s 1g:4k -n 0 -x 1 -u robo > /mnt/bonnie/result.csv


Is anybody here willing/able to test/prove/disprove this theory by running the code above when booted WITH the USB disk connected?

The LS1 simply stops responding until the watchdog performs a reset. There is absolutely nothing in any of the logs. It simply stops logging anything (as if all disk I/O had stopped) and then the next entry is the boot.

Can any of the kernel hackers here (andre?) tell me whether I can enable something somewhere to get some diagnostics/traces to see what is actually going on inside?

I've also tried ext2, ext3, jfs, with udev (initially), without udev (to be sure), with usbmount and without usbmount. No difference.
Any help or idea would be greatly appreciated.

Cheers

Rob


Posted: Fri Oct 12, 2007 5:17 am
Site Admin

Joined: Sun Jul 17, 2005 4:34 pm
Posts: 5332
I've never had any problems with 2.6.22.* and USB media (among them the Buffalo DriveStation). Assuming you're using the latest kernel, have tried a different cable, and aren't using a USB hub, could you post information on the USB parts of the logs and on /etc/fstab? Also make sure your swap partition is active, and there is a sufficient amount of swap. Watch "top" while copying. I've never used bonnie but will give it a try this weekend if the problem persists.


Posted: Fri Oct 12, 2007 3:04 pm
Site Admin

Joined: Sun Jul 17, 2005 4:34 pm
Posts: 5332
real 30m23.881s
user 20m36.248s
sys 4m49.568s
ls,1G:4k,1452,95,9544,75,6082,47,1558,94,15252,45,119.6,5,,,,,,,,,,,,,


Posted: Sat Oct 13, 2007 10:39 pm
Newbie

Joined: Sat Aug 04, 2007 12:45 am
Posts: 51
Location: North West, UK
Andre,

Thanks for looking at this.
The bonnie result means that your LS writes at 9.5MB/s and reads at 15.2MB/s :). BTW, you can use bon_csv2txt < result.csv (and bon_csv2html) to see the result in a nice format.
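
If bon_csv2txt is not at hand, the two block-transfer figures can be pulled straight out of the CSV line. This is only a sketch: the function name bonnie_summary is made up for illustration, and it assumes the CSV layout of the result posted above, where field 5 is the block write rate and field 11 the block read rate, both in KB/s.

```shell
#!/bin/sh
# bonnie_summary CSV_LINE  (hypothetical helper, not part of bonnie++)
# Assumption: field 5 = block write in KB/s, field 11 = block read in KB/s,
# as in the result line posted in this thread.
bonnie_summary() {
    printf '%s\n' "$1" | awk -F, '{ printf "write %s KB/s, read %s KB/s\n", $5, $11 }'
}

# The line from the previous post:
bonnie_summary 'ls,1G:4k,1452,95,9544,75,6082,47,1558,94,15252,45,119.6,5,,,,,,,,,,,,,'
```

For that line it prints the 9544 and 15252 figures that the 9.5MB/s and 15.2MB/s above are taken from.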

So this is what I've done based on your comments.
Yes, I am using the latest kernel (tried both 2.6.22-5 and -9). Tried two different USB drives, two disks and two cables. No hub.
I've checked swap, disabled it, re-ran mkswap -c, rebooted.
Now it doesn't crash when I run hdparm. I tried to copy my 70GB archive and it went through at about 9MB/s, which is great.
However when I run diff -rq on those two disks, my LS dies within a couple of seconds. I was running vmstat during diff and LS hasn't touched the swap:
Code:
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  0     40   9280   1636  18188    0    0     0     5    1   12  0  0 100  0
 0  0     40   9800   1636  18180    0    0     0     0    5   22  0  0 99  0
 0  0     40   9772   1652  18184    0    0     0    12   13   56  1  3 96  0
 0  0     40   9772   1652  18184    0    0     0     4   14   40  0  0 100  0
 0  0     40   9784   1660  18184    0    0     0     5   11   43  1  1 99  0
 0  0     40   9784   1660  18184    0    0     0     0    6   26  0  0 100  0
 0  0     40   9796   1668  18184    0    0     0     5    1   13  0  0 100  0
 0  0     40   9796   1668  18184    0    0     0     0   11   44  0  1 99  0
 2  1     40   1160    188  28500    0    0 15004    13  745 1149 11 48 34  8  <-- diff was started here
 0  1     40   1632    204  28100    0    0 16798     5  980 1371 12 57  0 32


And here it died. So swap utilization hasn't changed; the system just started doing lots of I/O.

At the moment I am copying /mnt to /mnt/2 (both on the same disk). I will run diff on those two, to see whether it is related to memory or I/O or it really does it only when using USB disk.

I will post more later tonight/tomorrow.


Posted: Sun Oct 14, 2007 9:56 pm
Newbie

Joined: Sat Aug 04, 2007 12:45 am
Posts: 51
Location: North West, UK
Hi there,

So, here is the situation.

I wrote a quick & dirty program that allocates memory in 128MB chunks. Ran it a couple of times to make the LS start paging. No problems at all. As I have already recreated swap and tried a swap file, I don't believe the problem lies in this area.

Basically the behaviour is:
- Write to USB drive seems to be mostly OK
- Long reads and reads of big files (hundreds of MB) either hang the box or return mangled data.
- I've tested it on a big (4GB) ISO file. Ran diff a couple of times. It either said that the original file and the file on USB drive are different or it died. When I rebooted into 2.4.17 kernel, the diff finished without any problems and said that the files are the same. I've tried this on a few more big files with the same result.
- I've tried both USB slots with the same results.
- I can copy big files and diff them within the internal disk without any problem.
- I ran iostat and vmstat during all the tests. The LS hasn't touched swap space at all. iostat showed loads of I/O ranging from 9MB/s to 20MB/s on both hda and sda, which is exactly as it should be.
- I ran dd if=/dev/sda2 of=/dev/zero bs=4096 and it killed the box after about 50-60 minutes.
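
Because nothing reaches the logs before the box dies, one way to narrow down the long dd read is to do it in pieces and record progress after each piece. The following is a minimal sketch under some assumptions: read_in_chunks is a made-up helper name, the log file must live on the internal disk, and the paths in the example comment are placeholders. It calls sync after every chunk so the last line written should already be on disk when the watchdog resets the box.

```shell
#!/bin/sh
# read_in_chunks SRC CHUNK_MB LOG  (hypothetical helper, not an existing tool)
# Reads SRC in CHUNK_MB-sized pieces and appends one line per finished
# chunk to LOG, calling sync each time so the line is flushed to disk
# before the next read starts. Prints the number of chunks read.
read_in_chunks() {
    src=$1
    chunk_mb=$2
    log=$3
    full=$((chunk_mb * 1048576))
    i=0
    while :; do
        # Count the bytes dd actually delivered for this chunk.
        n=$(dd if="$src" bs=1048576 skip=$((i * chunk_mb)) count="$chunk_mb" 2>/dev/null | wc -c)
        n=$((n + 0))
        [ "$n" -eq 0 ] && break
        echo "chunk $i: $n bytes read" >> "$log"
        sync
        i=$((i + 1))
        # A short chunk means the end of SRC was reached.
        [ "$n" -lt "$full" ] && break
    done
    echo "$i"
}

# Example (placeholder paths): read_in_chunks /dev/sda2 64 /root/usb-read.log
```

After a reset, the last line in the log shows roughly how far the read got, which at least tells whether the hang happens at a repeatable offset or at random.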

I really don't know what to think about this. Whichever way I look at it, it seems that I either have a weird HW problem/race condition or a problem with the kernel (or an interaction between 2.4 and 2.6 when it gets loaded).
Any ideas what else I can try?

lsusb shows this about my USB drive:
Code:
~# lsusb -s 001:002 -v

Bus 001 Device 002: ID 059f:0951 LaCie, Ltd
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0
  bDeviceProtocol         0
  bMaxPacketSize0        64
  idVendor           0x059f LaCie, Ltd
  idProduct          0x0951
  bcdDevice            0.00
  iManufacturer          10 LaCie SA
  iProduct               11 LaCie Hard Drive USB
  iSerial                 5 031705062FFF
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           32
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          4 USB Mass Storage
    bmAttributes         0xc0
      Self Powered
    MaxPower                2mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass         8 Mass Storage
      bInterfaceSubClass      6 SCSI
      bInterfaceProtocol     80 Bulk (Zip)
      iInterface              6 MSC Bulk-Only Transfer
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x02  EP 2 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
Device Qualifier (for other device speed):
  bLength                10
  bDescriptorType         6
  bcdUSB               2.00
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0
  bDeviceProtocol         0
  bMaxPacketSize0        64
  bNumConfigurations      1
can't get debug descriptor: Connection timed out
Device Status:     0xa801
  Self Powered


dmesg shows this about usb:
Code:
...
ehci_hcd 0000:00:0e.2: EHCI Host Controller
ehci_hcd 0000:00:0e.2: new USB bus registered, assigned bus number 1
ehci_hcd 0000:00:0e.2: irq 19, io mem 0xbfffcf00
ehci_hcd 0000:00:0e.2: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 5 ports detected
ohci_hcd: 2006 August 04 USB 1.1 'Open' Host Controller (OHCI) Driver
ohci_hcd 0000:00:0e.0: OHCI Host Controller
ohci_hcd 0000:00:0e.0: new USB bus registered, assigned bus number 2
ohci_hcd 0000:00:0e.0: irq 19, io mem 0xbfffe000
usb usb2: configuration #1 chosen from 1 choice
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 3 ports detected
ohci_hcd 0000:00:0e.1: OHCI Host Controller
ohci_hcd 0000:00:0e.1: new USB bus registered, assigned bus number 3
ohci_hcd 0000:00:0e.1: irq 19, io mem 0xbfffd000
usb 1-1: new high speed USB device using ehci_hcd and address 2
usb usb3: configuration #1 chosen from 1 choice
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 2 ports detected
usb 1-1: configuration #1 chosen from 1 choice
USB Universal Host Controller Interface driver v3.0
usbcore: registered new interface driver usblp
drivers/usb/class/usblp.c: v0.13: USB Printer Device Class driver
Initializing USB Mass Storage driver...
scsi0 : SCSI emulation for USB Mass Storage devices
usbcore: registered new interface driver usb-storage
USB Mass Storage support registered.
usb-storage: device found at 2
usb-storage: waiting for device to settle before scanning
...
usb-storage: device scan complete
scsi 0:0:0:0: Direct-Access     SAMSUNG  HD500LJ               PQ: 0 ANSI: 2 CCS
sd 0:0:0:0: [sda] 976773168 512-byte hardware sectors (500108 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 38 00 00
sd 0:0:0:0: [sda] Assuming drive cache: write through
sd 0:0:0:0: [sda] 976773168 512-byte hardware sectors (500108 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 38 00 00
sd 0:0:0:0: [sda] Assuming drive cache: write through
 sda: sda1 sda2
sd 0:0:0:0: [sda] Attached SCSI disk



Posted: Mon Oct 15, 2007 3:46 pm
Site Admin

Joined: Sun Jul 17, 2005 4:34 pm
Posts: 5332
Pretty thorough testing indeed. I don't have an answer, but I can rule out 2.4/2.6 interaction (kernel 2.6 via uboot or kernel loader takes over completely). You might have run into a Linux bug. Maybe the upcoming release, 2.6.22.9 (v85/r321), out soon, will perform better for you, but I'm not optimistic.


Posted: Mon Oct 15, 2007 11:48 pm
Newbie

Joined: Sat Aug 04, 2007 12:45 am
Posts: 51
Location: North West, UK
Andre,

Not an answer I was hoping for :cry:.
Would you (or ANYBODY else!!!) be able to copy a 1+GB file from your disk to a USB-attached disk and then do a diff -q ?

I spent a couple of hours googling different lists and forums. It seems that several people have similar problems with USB from time to time. None of them had a solution; however, their problems weren't as fatal as mine, i.e. their machines didn't die a sudden death.

Would you be able to do me a favour and compile a special kernel for me with CONFIG_USB_DEBUG enabled?
Alternatively, could you share your patched sources with me? I can try to compile it myself. I haven't compiled my own kernel for 6+ years, but I will probably remember how to do it :?.
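
Before building, it is worth checking whether the option is already set in the kernel configuration. A small sketch, assuming the kernel's .config file is available; config_enabled is a made-up helper name, not an existing tool:

```shell
#!/bin/sh
# config_enabled OPTION FILE  (hypothetical helper)
# Succeeds if OPTION is set to y or m in the kernel config FILE.
# A line like "# CONFIG_USB_DEBUG is not set" does not count as enabled.
config_enabled() {
    grep -Eq "^$1=(y|m)\$" "$2"
}

# Example: config_enabled CONFIG_USB_DEBUG .config && echo "USB debugging is on"
```

The pattern is anchored at both ends, so asking for CONFIG_USB does not accidentally match a CONFIG_USB_DEBUG line.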


Posted: Tue Oct 16, 2007 9:06 am
Site Admin

Joined: Sun Jul 17, 2005 4:34 pm
Posts: 5332
I can compile the kernel later today. Or you could do it yourself -- thanks to Sylver, SVN is possible. See the news and .config on my server, though my own v85 won't get loaded (I've messed up the loader for now). It works, apart from that.

When my work on the loader is done, I could compile the kernel, and copy and diff a 5.some GB movie for you if you want.


Posted: Wed Oct 17, 2007 11:55 am
Site Admin

Joined: Sun Jul 17, 2005 4:34 pm
Posts: 5332
Sorry for the delay, there's a life out there :)

The USB debugging kernel is in my tmp directory.

Copying a 5.8 GB movie from my LS's HDD to a DriveStation (though not attached on boot), in part while the kernel was being compiled, worked. The files look identical in ls -l. diff -q, though, killed my system, which rebooted by itself (into 2.4, i.e. no clean shutdown).

EDIT: My /etc/sysctl.conf contains
# disable oom Out of Memory killer
vm.overcommit_memory=2


Posted: Wed Oct 17, 2007 8:57 pm
Newbie

Joined: Sat Aug 04, 2007 12:45 am
Posts: 51
Location: North West, UK
Life? What life! We are not allowed to have a life :p

Quote:
diff -q though killed my system, which rebooted by itself

This is good news. At least for me :D. Probably not for you. It looks like I am not the only one who has this problem.

Could you try the diff while booted back in 2.4 kernel?

I've downloaded the new kernel you compiled. Thanks, but I think I need the modules as well: USB is built as a module, so the debug code will be in the module. Can you give me those as well, please?

Thanks


Posted: Thu Oct 18, 2007 7:24 am
Site Admin

Joined: Sun Jul 17, 2005 4:34 pm
Posts: 5332
A revised tarball is up.

I should probably have used md5sum; tried that?
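
One way to use md5sum for this is to checksum the same file on the USB disk several times and see whether the digest stays the same. The sketch below assumes that the file is much larger than the box's RAM, so that each pass really goes back to the disk instead of the page cache; reads_stable is a made-up helper name.

```shell
#!/bin/sh
# reads_stable FILE COUNT  (hypothetical helper)
# Runs md5sum on FILE COUNT times, printing each digest, and succeeds
# only if every pass returns the same digest as the first one.
reads_stable() {
    file=$1
    count=$2
    first=
    pass=1
    while [ "$pass" -le "$count" ]; do
        sum=$(md5sum < "$file" | cut -d' ' -f1)
        echo "pass $pass: $sum"
        if [ -z "$first" ]; then
            first=$sum
        elif [ "$sum" != "$first" ]; then
            return 1
        fi
        pass=$((pass + 1))
    done
    return 0
}

# Example (placeholder path): reads_stable /mnt/usb/big.iso 3 || echo "reads differ"
```

Unlike diff, this reads only one file per pass, so it also separates "reading the USB disk returns mangled data" from "reading two disks at once kills the box".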

Diffing on 2.4 would mean downtime for my server (it refuses to run on 2.4). Maybe somebody else could do that for you?


Posted: Sun Oct 21, 2007 5:35 pm
Newbie

Joined: Sat Aug 04, 2007 12:45 am
Posts: 51
Location: North West, UK
Sorry for the delay, had a few problems.

I've tried md5sum and the file was OK.
So did my "kill" test with the kernel you sent. It was a little more chatty when I connected the disk, but logged nothing when it died, so unfortunately I wasn't able to get any more information.

I tried to compile kernel 2.6.23. Everything went smoothly, but after about 1.5 hrs the box simply switched itself off. Maybe it overheated :(. Then I had problems getting it to start again. I managed to get it started after a day and a half and had to fix the filesystem. So I am not sure whether I'll dare to start the kernel make again.

Have you got any plans to provide 2.6.23? Do you have to apply any custom patches to make it work on LS1 or is it just a standard kernel? I've found some patches for 2.4.17, but nothing for 2.6...
I may try a cross-toolchain and compile it on my PC sometime next week.


Posted: Sun Oct 21, 2007 8:19 pm
Site Admin

Joined: Sun Jul 17, 2005 4:34 pm
Posts: 5332
I always compile on my LS1; 2.6.23.1/v86 is out; you can get everything via SVN (Sylver's Universal PPC kernel contains all the patches).

Are you sure your LS's HDD and power supply are OK?


Posted: Sun Oct 21, 2007 11:39 pm
Newbie

Joined: Sat Aug 04, 2007 12:45 am
Posts: 51
Location: North West, UK
2.6.23.1 wo-hoo! Well done guys. I will test it when I get back home. I am in London now, until Friday :(.

Finally found Sylver's post about svn and kernel on the forum. I was looking in wiki before.

The disk is only a couple of months old. It was thoroughly tested before I put it in the LS. I'm not sure how to test the PSU apart from measuring the voltage on the pins. I guess it's OK, as it was running without any problems for a couple of months before I started playing with it :D


Posted: Mon Oct 22, 2007 5:05 am
Site Admin

Joined: Sun Jul 17, 2005 4:34 pm
Posts: 5332
There seems to be an RTC issue presently, FYI

