[Hardware] SCSI drive dropping out

Salavat Yarmukhametov salavat at regiongarant.ru
Wed Jun 16 14:04:09 MSD 2004


	Here is the situation. I added 2 drives to the server (4 in total
now). Since then, one of the new drives periodically drops off the system.
Reseating the drive does not help. After a reboot it works fine for
about a week, then it fails again.

[root at sambaserver kernel]# pwd
/var/log/kernel
[root at sambaserver kernel]# less errors.2.bz2
Jun  3 04:26:25 sambaserver kernel: Aborting journal on device sd(8,49).
Jun  3 10:19:05 sambaserver kernel: EXT3-fs: unable to read superblock
Jun  3 11:11:53 sambaserver kernel: kmod: failed to exec /sbin/modprobe -s -k scsi_hostadapter, errno = 2
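
The device in the journal abort message is given as a (major, minor) pair. A minimal sketch of how `sd(8,49)` maps to a device name, assuming the standard Linux numbering for the first SCSI disk major (major 8, 16 minors per disk); the helper name `sd_name` is made up for illustration:

```python
import string

SD_MAJOR = 8            # first SCSI disk major (covers sda..sdp)
MINORS_PER_DISK = 16    # minor 0 of each group is the whole disk, 1..15 are partitions


def sd_name(major: int, minor: int) -> str:
    """Map a (major, minor) pair on the first sd major to a /dev name."""
    if major != SD_MAJOR:
        raise ValueError("this sketch only handles major 8 (sda..sdp)")
    if not 0 <= minor < 256:
        raise ValueError("minor number out of range")
    disk, part = divmod(minor, MINORS_PER_DISK)
    name = "/dev/sd" + string.ascii_lowercase[disk]
    return name if part == 0 else f"{name}{part}"


print(sd_name(8, 49))   # /dev/sdd1
```

Under that assumption the journal was aborted on the first partition of the fourth disk, which matches `/dev/sdd` being the device that can no longer be opened.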

[root at sambaserver kernel]# less warnings.3.bz2
Jun  3 04:03:00 sambaserver kernel: scsi0:0:4:0: Attempting to queue an ABORT message
Jun  3 04:03:00 sambaserver kernel: scsi0: Dumping Card State while idle, at SEQADDR 0x8
Jun  3 04:03:00 sambaserver kernel: ACCUM = 0x0, SINDEX = 0x56, DINDEX = 0xe4, ARG_2 = 0x0
Jun  3 04:03:00 sambaserver kernel: HCNT = 0x0 SCBPTR = 0x16
Jun  3 04:03:00 sambaserver kernel: SCSISEQ = 0x12, SBLKCTL = 0xa
Jun  3 04:03:00 sambaserver kernel:  DFCNTRL = 0x0, DFSTATUS = 0x89
Jun  3 04:03:00 sambaserver kernel: LASTPHASE = 0x1, SCSISIGI = 0x0, SXFRCTL0 = 0x80
Jun  3 04:03:00 sambaserver kernel: SSTAT0 = 0x0, SSTAT1 = 0x8
Jun  3 04:03:00 sambaserver kernel: SCSIPHASE = 0x0
Jun  3 04:03:00 sambaserver kernel: STACK == 0x3, 0x108, 0x160, 0x0
Jun  3 04:03:00 sambaserver kernel: SCB count = 196
Jun  3 04:03:00 sambaserver kernel: Kernel NEXTQSCB = 124
Jun  3 04:03:00 sambaserver kernel: Card NEXTQSCB = 124
Jun  3 04:03:06 sambaserver kernel: QINFIFO entries: 
Jun  3 04:03:06 sambaserver kernel: Waiting Queue entries: 
Jun  3 04:03:06 sambaserver kernel: Disconnected Queue entries: 9:182 
Jun  3 04:03:06 sambaserver kernel: QOUTFIFO entries: 
Jun  3 04:03:06 sambaserver kernel: Sequencer Free SCB List: 22 13 17 10 8 2 7 30 25 29 6 21 23 27 1 31 4 3 0 15 24 28 18 5 14 26 11 16 19 20 12 
Jun  3 04:03:06 sambaserver kernel: Sequencer SCB Info: 0(c 0x60, s 0x27, l 0, t 0xff) 1(c 0x60, s 0x7, l 0, t 0xff) 2(c 0x60, s 0x7, l 0, t 0xff) 3(c 0x60, s 0x7, l 0, t 0xff) 4(c 0x60, s 0x7, l 0, t 0xff) 5(c 0x60, s 0x27, l 0, t 0xff) 6(c 0x60, s 0x7, l 0, t 0xff) 7(c 0x60, s 0x7, l 0, t 0xff) 8(c 0x60, s 0x7, l 0, t 0xff) 9(c 0x64, s 0x47, l 0, t 0xb6) 10(c 0x60, s 0x7, l 0, t 0xff) 11(c 0x60, s 0x7, l 0, t 0xff) 12(c 0x60, s 0x7, l 0, t 0xff) 13(c 0x60, s 0x7, l 0, t 0xff) 14(c 0x60, s 0x7, l 0, t 0xff) 15(c 0x60, s 0x7, l 0, t 0xff) 16(c 0x60, s 0x7, l 0, t 0xff) 17(c 0x60, s 0x7, l 0, t 0xff) 18(c 0x60, s 0x7, l 0, t 0xff) 19(c 0x60, s 0x7, l 0, t 0xff) 20(c 0x60, s 0x7, l 0, t 0xff) 21(c 0x60, s 0x27, l 0, t 0xff) 22(c 0x60, s 0x27, l 0, t 0xff) 23(c 0x60, s 0x7, l 0, t 0xff) 24(c 0x60, s 0x7, l 0, t 0xff) 25(c 0x60, s 0x7, l 0, t 0xff) 26(c 0x60, s 0x7, l 0, t 0xff) 27(c 0x60, s 0x7, l 0, t 0xff) 28(c 0x60, s 0x27, l 0, t 0xff) 29(c 0x60, s 0x7, l 0, t 0xff) 30(c 0x60, s 0x7, l 0, t 0xff) 31(c 0x60, s 0x27, 
Jun  3 04:03:06 sambaserver kernel: Pending list: 182(c 0x60, s 0x47, l 0)
Jun  3 04:03:06 sambaserver kernel: Kernel Free SCB list: 86 65 54 41 133 91 39 44 1 143 16 73 60 185 176 101 169 178 188 130 21 98 79 157 75 113 147 5 57 0 59 47 20 8 30 19 131 76 158 78 63 154 31 167 82 7 22 64 168 43 150 11 29 106 149 13 52 92 72 187 99 120 170 125 40 33 164 24 116 127 108 32 186 23 84 173 163 122 152 174 67 115 12 104 48 36 55 56 58 121 123 138 110 27 17 83 141 68 45 159 3 134 128 111 117 166 71 142 137 88 135 103 146 181 49 9 95 87 51 153 144 46 77 38 171 177 90 180 112 61 139 15 172 14 50 162 156 148 109 6 26 4 132 96 129 93 165 97 195 25 18 70 190 94 119 155 100 151 34 191 175 118 126 107 183 28 85 42 74 66 140 89 114 80 2 189 81 160 136 179 105 35 102 145 53 184 62 69 10 161 37 194 193 192 
Jun  3 04:03:06 sambaserver kernel: DevQ(0:0:0): 0 waiting
Jun  3 04:03:06 sambaserver kernel: DevQ(0:2:0): 0 waiting
Jun  3 04:03:06 sambaserver kernel: DevQ(0:3:0): 0 waiting
Jun  3 04:03:06 sambaserver kernel: DevQ(0:4:0): 0 waiting
Jun  3 04:03:06 sambaserver kernel: DevQ(0:15:0): 0 waiting
Jun  3 04:03:06 sambaserver kernel: (scsi0:A:4:0): Queuing a recovery SCB
Jun  3 04:03:06 sambaserver kernel: scsi0:0:4:0: Device is disconnected, re-queuing SCB
Jun  3 04:03:06 sambaserver kernel: Recovery code sleeping
Jun  3 04:03:06 sambaserver kernel: Recovery SCB completes
Jun  3 04:03:06 sambaserver kernel: Recovery code awake
Jun  3 04:03:06 sambaserver kernel: aic7xxx_abort returns 0x2002
Jun  3 04:03:06 sambaserver kernel: scsi0:0:4:0: Attempting to queue a TARGET RESET message
Jun  3 04:03:06 sambaserver kernel: scsi0:0:4:0: Command not found
Jun  3 04:03:06 sambaserver kernel: aic7xxx_dev_reset returns 0x2002
Jun  3 04:03:06 sambaserver kernel: SCSI disk error : host 0 channel 0 id 4 lun 0 return code = 10000
Jun  3 04:03:06 sambaserver kernel:  I/O error: dev 08:31, sector 4128
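
The `return code = 10000` in the disk error can be split into its component bytes. A minimal sketch, assuming the sd driver prints the value in hexadecimal and that the result word uses the usual Linux SCSI layout (host byte in bits 16..23); the `DID_*` names are the host-byte codes from the kernel's SCSI headers, while the function names are made up for illustration:

```python
# Host-byte codes as defined in the Linux SCSI headers (first ten values).
HOST_BYTE_NAMES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x06: "DID_PARITY",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
    0x09: "DID_BAD_INTR",
}


def decode_result(code_text: str) -> dict:
    """Split a SCSI result word into its four bytes.

    Assumption: the log prints the code in hex, so "10000" means 0x10000.
    """
    result = int(code_text, 16)
    return {
        "status": result & 0xFF,
        "msg": (result >> 8) & 0xFF,
        "host": (result >> 16) & 0xFF,
        "driver": (result >> 24) & 0xFF,
    }


def host_byte_name(code_text: str) -> str:
    host = decode_result(code_text)["host"]
    return HOST_BYTE_NAMES.get(host, f"unknown (0x{host:02x})")


print(host_byte_name("10000"))   # DID_NO_CONNECT
```

If the hex assumption holds, the host byte says the adapter could not connect to target 4 at all, which would be consistent with the drive dropping off the bus rather than returning a media error.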

[root at sambaserver kernel]# lsmod
aic7xxx               109760  11 
sd_mod                 11036  22 
scsi_mod               88284   2  [aic7xxx sd_mod]

[root at sambaserver kernel]# scsi_info /dev/sdd
open() failed: No such device or address

# the drive is this model:
[root at sambaserver kernel]# scsi_info /dev/sdc
SCSI_ID="0,3,0"
MODEL="SEAGATE ST318404LC"
FW_REV="0005"

hardware:  IBM Netfinity 5100
http://www-307.ibm.com/pc/support/site.wss/document.do?lndocid=MIGR-4K5RG4
Dual channel Ultra2 SCSI controller (both channels internal)

[root at sambaserver kernel]# uname -a
Linux sambaserver.regiongarant.ru 2.4.18-alt13-up #1 Thu Sep 25 22:58:16 MSD 2003 i686 unknown
Master2.0

	Is it possible to re-initialize one channel of the SCSI controller
without touching the others, i.e. without disconnecting the other 3 drives?
	What should I suspect first: the drive, the SCSI controller, or the
kernel module?
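
For reference on the first question: 2.4 kernels accept per-device commands written to /proc/scsi/scsi, which detach and re-probe a single target without touching the rest of the bus. A minimal sketch, assuming the failing drive is host 0, channel 0, id 4, lun 0 as in the log; whether this actually brings the drive back on this hardware is untested, and the helper names are made up for illustration. It only prints the commands unless `dry_run` is turned off:

```python
PROC_SCSI = "/proc/scsi/scsi"   # control file of the 2.4 SCSI mid-layer


def single_device_commands(host: int, channel: int, target: int, lun: int) -> list:
    """Build the remove/add command pair for one SCSI device address."""
    addr = f"{host} {channel} {target} {lun}"
    return [
        f"scsi remove-single-device {addr}",
        f"scsi add-single-device {addr}",
    ]


def apply(commands, path: str = PROC_SCSI, dry_run: bool = True) -> None:
    """Print the commands, or write them to the control file when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(f"echo '{cmd}' > {path}")
        else:
            # Needs root; any filesystem on the device should be unmounted first.
            with open(path, "w") as ctl:
                ctl.write(cmd + "\n")


# Address taken from the log lines "scsi0:0:4:0" (assumed to be the failing drive).
apply(single_device_commands(0, 0, 4, 0))
```

The other three drives keep their addresses (0:0:0, 0:2:0, 0:3:0 in the DevQ lines), so commands for target 4 alone should not affect them.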
--
Salavat Yarmukhametov		
Jabber: salik at jabber.ru		
ICQ:	21144441

