[OmniOS-discuss] iSCSI traffic suddenly comes to a halt and then resumes

Matej Zerovnik matej at zunaj.si
Fri May 29 11:09:56 UTC 2015


Today the server crashed again. I’m not sure if it’s because I was running SMART short self-tests or not, but it looks like it started around that time. 

I’m still running SMART tests, but so far there are no errors on the drives, although some tests take up to 30 minutes to finish… iostat -E also reports no errors.
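For whoever is following along, this is roughly how I’m checking the self-test results per disk. Just a sketch: the awk filter assumes smartctl’s usual self-test log table (entries starting with "# <n>", healthy ones reading "Completed without error"), and the device path is a placeholder.

```shell
# Sketch: flag self-test log entries that did not complete cleanly.
# Assumes smartctl's usual self-test log table, where entries start
# with "# <n>" and healthy ones read "Completed without error".
selftest_failures() {
  awk '/^# *[0-9]/ && !/Completed without error/ && !/in progress/ { print }'
}
# Intended use per disk (path is a placeholder):
#   smartctl -d sat,12 -l selftest /dev/rdsk/c0t0d0s2 | selftest_failures
```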

When it froze, I started iostat and tried to write a file to the ZFS pool. As usual, it froze, but I left iostat running, hoping it would give me some info… After 30 or so minutes, the system became responsive again, and this is what my iostat output looks like:
http://pastebin.com/W4EWgnzq

The system became responsive at 'Fri May 29 11:38:45 CEST 2015'.
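Next time I’ll leave iostat running with timestamps on every line, so the exact start and end of the freeze can be read off the log afterwards. A minimal sketch (the interval and log path are just placeholders):

```shell
# Sketch: prefix every line of a long-running iostat with a timestamp,
# so the exact start/end of a freeze can be measured from the log later.
stamp() {
  while IFS= read -r line; do
    printf '%s %s\n' "$(date '+%Y-%m-%dT%H:%M:%S')" "$line"
  done
}
# Intended use (interval and path are placeholders):
#   iostat -xn 5 | stamp >> /var/tmp/iostat-trace.log
```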

It’s weird, to say the least. It looks like something in the write buffer hogs ZFS for quite some time and then gets released, or times out after a certain period. But I’m not sure what it is, or what would have such a long timeout. It looks like the freeze lasted about 15 minutes.
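To separate the read path from the write path next time it hangs, I’ll time a small local write directly on the pool. A rough sketch (the mountpoint is a placeholder, and this only shows whether the local write path stalls, not why):

```shell
# Sketch: time a small local write into a directory on the pool.
# If this stalls while reads still work, the write path (txg sync,
# write buffering, etc.) is the suspect rather than iSCSI or the network.
probe_write() {
  dir=$1
  start=$(date +%s)
  dd if=/dev/zero of="$dir/.probe.$$" bs=131072 count=8 2>/dev/null
  sync
  rm -f "$dir/.probe.$$"
  echo "write probe took $(( $(date +%s) - start ))s"
}
# Intended use (mountpoint is a placeholder): probe_write /volumes/data
```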

Matej

> On 28 May 2015, at 18:30, Josten Landtroop <josten at omniti.com> wrote:
> 
> Have you verified that your disks are not having any issues with smartctl and iostat -E ?
> 
> I'd suggest running a short test on the disks: smartctl -d sat,12 -t short /path/to/disk (note: you may need to append s2 to the physical disk name).
> 
> I built a test target and iSCSI initiator, wrote 1G from /dev/zero, and ended up crashing the session; are your sessions under load?
> 
> On Wed, May 27, 2015 at 2:58 AM, Matej Zerovnik <matej at zunaj.si> wrote:
> Hello Josten,
> 
> 
>> On 26 May 2015, at 22:18, Josten Landtroop <josten at omniti.com> wrote:
>> 
>> Hi Matej,
>> 
>> Do you have sar running on your system? I'd recommend running it at a short interval so that you get historical disk statistics. You can use this info to rule out whether it's the disks or not. You can also use iotop -P to get a real-time view of %IO to see if it's the disks, as well as zpool iostat -v 1.
> 
> I didn’t have sar or iotop running, but I did have 'iostat -xn' and 'zpool iostat -v 1' running when things stopped working, and there is nothing unusual in there. Write ops suddenly fall to 0 and that’s it. Reads are still happening, and according to network traffic there is outgoing data even while I’m unable to write to the ZFS filesystem (even locally on the server). I created a simple text file, so next time the system hangs I will be able to check whether the filesystem is still readable (currently I only have iSCSI volumes, so I can’t check that locally on the server).
> 
>> 
>> Also, do you have a baseline performance benchmark, and do you know if you're meeting/exceeding it? The baseline should cover both random and sequential IO; you can use bonnie++ to get this information.
> 
> I can say with 99.99% certainty that I’m exceeding the performance of the pool itself. It’s a single raidz2 vdev with 50 hard drives and 70 connected clients. Some are idling, but 10-20 clients are pushing data to the server. I know the zpool configuration is very bad, but that’s legacy I can’t change easily. I’m already syncing data to another server with 7 vdevs, but since this server is so busy, transfers are happening VERY slowly (read: zfs sync doing 10MB/s).
> 
>> 
>> Are you able to share your ZFS configuration and iSCSI configuration?
> 
> Sure! Here are zfs settings:
> 
> zfs get all data:
> NAME  PROPERTY              VALUE                  SOURCE
> data  type                  filesystem             -
> data  creation              Fri Oct 25 20:26 2013  -
> data  used                  104T                   -
> data  available             61.6T                  -
> data  referenced            1.09M                  -
> data  compressratio         1.08x                  -
> data  mounted               yes                    -
> data  quota                 none                   default
> data  reservation           none                   default
> data  recordsize            128K                   default
> data  mountpoint            /volumes/data          received
> data  sharenfs              off                    default
> data  checksum              on                     default
> data  compression           off                    received
> data  atime                 off                    local
> data  devices               on                     default
> data  exec                  on                     default
> data  setuid                on                     default
> data  readonly              off                    local
> data  zoned                 off                    default
> data  snapdir               hidden                 default
> data  aclmode               discard                default
> data  aclinherit            restricted             default
> data  canmount              on                     default
> data  xattr                 on                     default
> data  copies                1                      default
> data  version               5                      -
> data  utf8only              off                    -
> data  normalization         none                   -
> data  casesensitivity       sensitive              -
> data  vscan                 off                    default
> data  nbmand                off                    default
> data  sharesmb              off                    default
> data  refquota              none                   default
> data  refreservation        none                   default
> data  primarycache          all                    default
> data  secondarycache        all                    default
> data  usedbysnapshots       0                      -
> data  usedbydataset         1.09M                  -
> data  usedbychildren        104T                   -
> data  usedbyrefreservation  0                      -
> data  logbias               latency                default
> data  dedup                 off                    local
> data  mlslabel              none                   default
> data  sync                  standard               default
> data  refcompressratio      1.00x                  -
> data  written               1.09M                  -
> data  logicalused           98.1T                  -
> data  logicalreferenced     398K                   -
> data  filesystem_limit      none                   default
> data  snapshot_limit        none                   default
> data  filesystem_count      none                   default
> data  snapshot_count        none                   default
> data  redundant_metadata    all                    default
> data  nms:dedup-dirty       on                     received
> data  nms:description       datauporabnikov        received
> 
> I’m not sure which iSCSI configuration you want/need? But as far as I figured out during the last 'freeze', iSCSI is not the problem, since I’m unable to write to the ZFS volume even locally on the server itself.
> 
>> 
>> For iSCSI, can you take a look at this: http://docs.oracle.com/cd/E23824_01/html/821-1459/fpjwy.html#fsume
> 
> Interesting. I tried running 'iscsiadm list target' but it doesn’t return anything. There is also nothing in /var/adm/messages, as usual :) But the target service is online (according to svcs), and clients are connected and passing traffic.
> 
>> 
>> Do you have detailed logs for the clients experiencing the issues? If not are you able to enable verbose logging (such as debug level logs)?
> 
> I have the clients' logs, but they mostly just report losing connections and reconnecting:
> 
> Example 1:
> Apr 29 10:33:53 eee kernel: connection1:0: detected conn error (1021)
> Apr 29 10:33:54 eee iscsid: Kernel reported iSCSI connection 1:0 error (1021 - ISCSI_ERR_SCSI_EH_SESSION_RST: Session was dropped as a result of SCSI error recovery) state (3)
> Apr 29 10:33:56 eee iscsid: connection1:0 is operational after recovery (1 attempts)
> Apr 29 10:36:37 eee kernel: connection1:0: detected conn error (1021)
> Apr 29 10:36:37 eee iscsid: Kernel reported iSCSI connection 1:0 error (1021 - ISCSI_ERR_SCSI_EH_SESSION_RST: Session was dropped as a result of SCSI error recovery) state (3)
> Apr 29 10:36:40 eee iscsid: connection1:0 is operational after recovery (1 attempts)
> Apr 29 10:36:50 eee kernel: sd 3:0:0:0: Device offlined - not ready after error recovery
> Apr 29 10:36:51 eee kernel: sd 3:0:0:0: Device offlined - not ready after error recovery
> Apr 29 10:36:51 eee kernel: sd 3:0:0:0: Device offlined - not ready after error recovery
> 
> Example 2:
> Apr 16 08:41:40 vf kernel: connection1:0: pdu (op 0x5e itt 0x1) rejected. Reason code 0x7
> Apr 16 08:43:11 vf kernel: connection1:0: pdu (op 0x5e itt 0x1) rejected. Reason code 0x7
> Apr 16 08:44:13 vf kernel: connection1:0: pdu (op 0x5e itt 0x1) rejected. Reason code 0x7
> Apr 16 08:45:51 vf kernel: connection1:0: detected conn error (1021)
> Apr 16 08:45:51 vf iscsid: Kernel reported iSCSI connection 1:0 error (1021 - ISCSI_ERR_SCSI_EH_SESSION_RST: Session was dropped as a result of SCSI error recovery) state (3)
> Apr 16 08:45:53 vf iscsid: connection1:0 is operational after recovery (1 attempts)
> 
> 
> I’m already in contact with OmniTI regarding our new build, but in the meantime I would love for our clients to be able to use the storage, so I’m trying to resolve the current issue somehow…
> 
> Matej
> 
> 
> 


