[Sysadmins] server hangs (logs, gathering information)

master altlinux master.altlinux at gmail.com
Sat Sep 6 13:01:24 MSD 2008


Many thanks to everyone...

I have made some progress.
1. Besides the OS, the server runs application software (it occasionally
writes small amounts of data, about 50-100 MB per hour). I started an
iozone test alongside the application software, and the server hung
after 2.5 hours. A day later I ran iozone again, this time without the
application software, and the server held up.

2. I started experimenting: with the application software running, I
launch the obsec script and the system reboots. With the application
software running, I launch the updatedb script and the system hangs.

3. Yesterday it got really bad: operators copy files from the server
over sftp, and it reboots every time.

2Michael Shigorin: I like your (that is, altlinux) kernel builds too.
But your company dropped support for the 2.4 kernel branch, so I have
to build it myself. Vacuuming is a good idea, but then why do they
crash and hang together?
I believe the problem is in the disk subsystem, more precisely in the
driver for the RAID array (aacraid).

Here is the lsmod output:
Module                  Size  Used by    Not tainted
sg                     29468   0  (autoclean)
sr_mod                 14320   0  (autoclean)
cdrom                  27552   0  (autoclean) [sr_mod]
floppy                 48056   0  (autoclean)
usb-storage            26040   0
autofs4                 8532   0  (autoclean)
usb-uhci               21996   0  (unused)
ehci-hcd               16872   0  (unused)
e1000                  97640   4  (autoclean)
ide-scsi                9296   0
ipmi_kcs_drv            8333   1
ipmi_devintf            3592   0  (unused)
w83627hf               14332   0  (unused)
bmcsensors             15937   0  (unused)
i2c-proc                5892   0  [w83627hf bmcsensors]
i2c-ipmi                2028   0  (unused)
ipmi_msghandler        14824   0  [ipmi_kcs_drv ipmi_devintf i2c-ipmi]
i2c-isa                  808   0  (unused)
i2c-i801                4664   0  (unused)
i2c-core               15172   0  [w83627hf bmcsensors i2c-proc i2c-ipmi i2c-isa i2c-i801]
rtc                     6780   0  (autoclean)
aacraid                30212   4
sd_mod                 10832   8

And here is the lspci -v output:
00:00.0 Host bridge: Intel Corp.: Unknown device 25d8 (rev b1)
	Subsystem: Super Micro Computer Inc: Unknown device 8080
	Flags: bus master, fast devsel, latency 0, IRQ 16
	Capabilities: [50] Power Management version 2
	Capabilities: [58] Message Signalled Interrupts: 64bit- Queue=0/1 Enable-
	Capabilities: [6c] #10 [0041]

00:02.0 PCI bridge: Intel Corp.: Unknown device 25f7 (rev b1) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0
	Bus: primary=00, secondary=01, subordinate=07, sec-latency=0
	I/O behind bridge: 00002000-00003fff
	Memory behind bridge: d8000000-d86fffff
	Prefetchable memory behind bridge: 00000000c0000000-00000000cff00000
	Capabilities: [50] Power Management version 2
	Capabilities: [58] Message Signalled Interrupts: 64bit- Queue=0/1 Enable-
	Capabilities: [6c] #10 [0041]

00:04.0 PCI bridge: Intel Corp.: Unknown device 25f8 (rev b1) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0
	Bus: primary=00, secondary=08, subordinate=08, sec-latency=0
	Capabilities: [50] Power Management version 2
	Capabilities: [58] Message Signalled Interrupts: 64bit- Queue=0/1 Enable-
	Capabilities: [6c] #10 [0141]

00:06.0 PCI bridge: Intel Corp.: Unknown device 25f9 (rev b1) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0
	Bus: primary=00, secondary=09, subordinate=09, sec-latency=0
	Capabilities: [50] Power Management version 2
	Capabilities: [58] Message Signalled Interrupts: 64bit- Queue=0/1 Enable-
	Capabilities: [6c] #10 [0141]

00:08.0 System peripheral: Intel Corp.: Unknown device 1a38 (rev b1)
	Subsystem: Super Micro Computer Inc: Unknown device 8080
	Flags: bus master, fast devsel, latency 0, IRQ 16
	Memory at fe700000 (64-bit, non-prefetchable) [size=1K]
	Capabilities: [50] Power Management version 2
	Capabilities: [58] Message Signalled Interrupts: 64bit- Queue=0/0 Enable-
	Capabilities: [6c] #10 [0091]

00:10.0 Host bridge: Intel Corp.: Unknown device 25f0 (rev b1)
	Subsystem: Super Micro Computer Inc: Unknown device 8080
	Flags: fast devsel

00:10.1 Host bridge: Intel Corp.: Unknown device 25f0 (rev b1)
	Subsystem: Super Micro Computer Inc: Unknown device 8080
	Flags: fast devsel

00:10.2 Host bridge: Intel Corp.: Unknown device 25f0 (rev b1)
	Subsystem: Super Micro Computer Inc: Unknown device 8080
	Flags: fast devsel

00:11.0 Host bridge: Intel Corp.: Unknown device 25f1 (rev b1)
	Subsystem: Super Micro Computer Inc: Unknown device 8080
	Flags: fast devsel

00:13.0 Host bridge: Intel Corp.: Unknown device 25f3 (rev b1)
	Subsystem: Super Micro Computer Inc: Unknown device 8080
	Flags: fast devsel

00:15.0 Host bridge: Intel Corp.: Unknown device 25f5 (rev b1)
	Subsystem: Super Micro Computer Inc: Unknown device 8080
	Flags: fast devsel

00:16.0 Host bridge: Intel Corp.: Unknown device 25f6 (rev b1)
	Subsystem: Super Micro Computer Inc: Unknown device 8080
	Flags: fast devsel

00:1c.0 PCI bridge: Intel Corp.: Unknown device 2690 (rev 09) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0
	Bus: primary=00, secondary=0a, subordinate=0a, sec-latency=0
	Capabilities: [40] #10 [0141]
	Capabilities: [80] Message Signalled Interrupts: 64bit- Queue=0/0 Enable-
	Capabilities: [90] #0d [0000]
	Capabilities: [a0] Power Management version 2

00:1d.0 USB Controller: Intel Corp.: Unknown device 2688 (rev 09) (prog-if 00 [UHCI])
	Subsystem: Super Micro Computer Inc: Unknown device 8080
	Flags: bus master, medium devsel, latency 0, IRQ 17
	I/O ports at 1800 [size=32]

00:1d.1 USB Controller: Intel Corp.: Unknown device 2689 (rev 09) (prog-if 00 [UHCI])
	Subsystem: Super Micro Computer Inc: Unknown device 8080
	Flags: bus master, medium devsel, latency 0, IRQ 19
	I/O ports at 1820 [size=32]

00:1d.2 USB Controller: Intel Corp.: Unknown device 268a (rev 09) (prog-if 00 [UHCI])
	Subsystem: Super Micro Computer Inc: Unknown device 8080
	Flags: bus master, medium devsel, latency 0, IRQ 18
	I/O ports at 1840 [size=32]

00:1d.7 USB Controller: Intel Corp.: Unknown device 268c (rev 09) (prog-if 20 [EHCI])
	Subsystem: Super Micro Computer Inc: Unknown device 8080
	Flags: bus master, medium devsel, latency 0, IRQ 17
	Memory at d8a00000 (32-bit, non-prefetchable) [size=1K]
	Capabilities: [50] Power Management version 2
	Capabilities: [58] #0a [20a0]

00:1e.0 PCI bridge: Intel Corp. 82801BA/CA/DB/EB/ER Hub interface to PCI Bridge (rev d9) (prog-if 01 [Subtractive decode])
	Flags: bus master, fast devsel, latency 0
	Bus: primary=00, secondary=0b, subordinate=0b, sec-latency=32
	I/O behind bridge: 00004000-00004fff
	Memory behind bridge: d8700000-d87fffff
	Prefetchable memory behind bridge: 00000000d0000000-00000000d7f00000
	Capabilities: [50] #0d [0000]

00:1f.0 ISA bridge: Intel Corp.: Unknown device 2670 (rev 09)
	Subsystem: Super Micro Computer Inc: Unknown device 8080
	Flags: bus master, medium devsel, latency 0

00:1f.1 IDE interface: Intel Corp.: Unknown device 269e (rev 09) (prog-if 8a [Master SecP PriP])
	Subsystem: Super Micro Computer Inc: Unknown device 8080
	Flags: bus master, medium devsel, latency 0, IRQ 18
	I/O ports at <unassigned>
	I/O ports at <unassigned>
	I/O ports at <unassigned>
	I/O ports at <unassigned>
	I/O ports at 1860 [size=16]

00:1f.3 SMBus: Intel Corp.: Unknown device 269b (rev 09)
	Subsystem: Super Micro Computer Inc: Unknown device 8080
	Flags: medium devsel, IRQ 19
	I/O ports at 1100 [size=32]

01:00.0 PCI bridge: Intel Corp.: Unknown device 3500 (rev 01) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0
	Bus: primary=01, secondary=02, subordinate=06, sec-latency=0
	I/O behind bridge: 00002000-00003fff
	Memory behind bridge: d8000000-d85fffff
	Prefetchable memory behind bridge: 00000000c0000000-00000000cff00000
	Capabilities: [44] #10 [0051]
	Capabilities: [70] Power Management version 2
	Capabilities: [80] #0d [0000]

01:00.3 PCI bridge: Intel Corp.: Unknown device 350c (rev 01) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0
	Bus: primary=01, secondary=07, subordinate=07, sec-latency=64
	Capabilities: [44] #10 [0071]
	Capabilities: [6c] Power Management version 2
	Capabilities: [80] #0d [0000]
	Capabilities: [d8]

02:00.0 PCI bridge: Intel Corp.: Unknown device 3510 (rev 01) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0
	Bus: primary=02, secondary=03, subordinate=05, sec-latency=0
	I/O behind bridge: 00002000-00002fff
	Memory behind bridge: d8000000-d84fffff
	Prefetchable memory behind bridge: 00000000c0000000-00000000cff00000
	Capabilities: [44] #10 [0061]
	Capabilities: [60] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
	Capabilities: [70] Power Management version 2
	Capabilities: [80] #0d [0000]

02:02.0 PCI bridge: Intel Corp.: Unknown device 3518 (rev 01) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0
	Bus: primary=02, secondary=06, subordinate=06, sec-latency=0
	I/O behind bridge: 00003000-00003fff
	Memory behind bridge: d8500000-d85fffff
	Capabilities: [44] #10 [0061]
	Capabilities: [60] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
	Capabilities: [70] Power Management version 2
	Capabilities: [80] #0d [0000]

03:00.0 PCI bridge: Intel Corp. PCI Bridge Hub A (rev 09) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0
	Bus: primary=03, secondary=04, subordinate=04, sec-latency=64
	Memory behind bridge: d8000000-d83fffff
	Prefetchable memory behind bridge: 00000000c0000000-00000000cff00000
	Capabilities: [44] #10 [0071]
	Capabilities: [5c] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
	Capabilities: [6c] Power Management version 2
	Capabilities: [d8]

03:00.2 PCI bridge: Intel Corp. PCI Bridge Hub B (rev 09) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0
	Bus: primary=03, secondary=05, subordinate=05, sec-latency=64
	I/O behind bridge: 00002000-00002fff
	Memory behind bridge: d8400000-d84fffff
	Capabilities: [44] #10 [0071]
	Capabilities: [5c] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
	Capabilities: [6c] Power Management version 2
	Capabilities: [d8]

04:01.0 RAID bus controller: Adaptec AAC-RAID (rev 02)
	Subsystem: Adaptec ASR-2020S PCI-X ZCR (Skyhawk)
	Flags: bus master, stepping, 66Mhz, medium devsel, latency 32, IRQ 16
	Memory at d8200000 (64-bit, non-prefetchable) [size=2M]
	Memory at d8000000 (32-bit, non-prefetchable) [size=2M]
	Memory at c0000000 (32-bit, prefetchable) [size=256M]
	Expansion ROM at <unassigned> [disabled] [size=32K]
	Capabilities: [c0] Power Management version 2
	Capabilities: [d0] Message Signalled Interrupts: 64bit+ Queue=0/1 Enable-
	Capabilities: [e0] PCI-X non-bridge device.

05:01.0 Ethernet controller: Intel Corp. 82546GB Gigabit Ethernet Controller (rev 03)
	Subsystem: Intel Corp. PRO/1000 MT Dual Port Network Connection
	Flags: bus master, 66Mhz, medium devsel, latency 52, IRQ 16
	Memory at d8480000 (64-bit, non-prefetchable) [size=128K]
	Memory at d8400000 (64-bit, non-prefetchable) [size=256K]
	I/O ports at 2000 [size=64]
	Expansion ROM at <unassigned> [disabled] [size=256K]
	Capabilities: [dc] Power Management version 2
	Capabilities: [e4]
	Capabilities: [f0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-

05:01.1 Ethernet controller: Intel Corp. 82546GB Gigabit Ethernet Controller (rev 03)
	Subsystem: Intel Corp. PRO/1000 MT Dual Port Network Connection
	Flags: bus master, 66Mhz, medium devsel, latency 52, IRQ 17
	Memory at d84a0000 (64-bit, non-prefetchable) [size=128K]
	Memory at d8440000 (64-bit, non-prefetchable) [size=256K]
	I/O ports at 2040 [size=64]
	Expansion ROM at <unassigned> [disabled] [size=256K]
	Capabilities: [dc] Power Management version 2
	Capabilities: [e4]
	Capabilities: [f0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-

06:00.0 Ethernet controller: Intel Corp.: Unknown device 1096 (rev 01)
	Subsystem: Super Micro Computer Inc: Unknown device 0000
	Flags: bus master, fast devsel, latency 0, IRQ 18
	Memory at d8500000 (32-bit, non-prefetchable) [size=128K]
	I/O ports at 3000 [size=32]
	Capabilities: [c8] Power Management version 2
	Capabilities: [d0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
	Capabilities: [e0] #10 [0001]

06:00.1 Ethernet controller: Intel Corp.: Unknown device 1096 (rev 01)
	Subsystem: Super Micro Computer Inc: Unknown device 0000
	Flags: bus master, fast devsel, latency 0, IRQ 19
	Memory at d8520000 (32-bit, non-prefetchable) [size=128K]
	I/O ports at 3020 [size=32]
	Capabilities: [c8] Power Management version 2
	Capabilities: [d0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
	Capabilities: [e0] #10 [0001]

0b:01.0 VGA compatible controller: ATI Technologies Inc: Unknown device 515e (rev 02) (prog-if 00 [VGA])
	Subsystem: Super Micro Computer Inc: Unknown device 8080
	Flags: bus master, stepping, fast Back2Back, medium devsel, latency 66, IRQ 18
	Memory at d0000000 (32-bit, prefetchable) [size=128M]
	I/O ports at 4000 [size=256]
	Memory at d8700000 (32-bit, non-prefetchable) [size=64K]
	Expansion ROM at <unassigned> [disabled]  [size=128K]
	Capabilities: [50] Power Management version 2
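
One thing that can be read off the listing above is which devices report
the same IRQ: the RAID controller (04:01.0) and one port of the 82546GB
card (05:01.0) both show IRQ 16. Shared interrupts are normal and prove
nothing by themselves, but they may be worth keeping in mind when
suspecting a driver. Below is a minimal sketch, assuming Python 3 is
available somewhere to run it, that groups devices by the IRQ shown on
their "Flags:" line. The function name devices_by_irq and the SAMPLE
excerpt are only illustrative; the excerpt is copied from the listing
with the wrapped header lines rejoined.

```python
import re
from collections import defaultdict

# A device header line starts with a bus:slot.function address.
DEVICE_RE = re.compile(r"^([0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) (.+)$")
# The IRQ, when present, appears on the "Flags:" line.
IRQ_RE = re.compile(r"\bIRQ (\d+)\b")


def devices_by_irq(lspci_text):
    """Map IRQ number -> list of (slot, description) from `lspci -v` text."""
    groups = defaultdict(list)
    current = None
    for line in lspci_text.splitlines():
        header = DEVICE_RE.match(line)
        if header:
            current = (header.group(1), header.group(2))
            continue
        if current and line.strip().startswith("Flags:"):
            irq = IRQ_RE.search(line)
            if irq:
                groups[int(irq.group(1))].append(current)
    return dict(groups)


# Excerpt from the listing above (illustrative input only).
SAMPLE = """\
04:01.0 RAID bus controller: Adaptec AAC-RAID (rev 02)
\tFlags: bus master, stepping, 66Mhz, medium devsel, latency 32, IRQ 16
05:01.0 Ethernet controller: Intel Corp. 82546GB Gigabit Ethernet Controller (rev 03)
\tFlags: bus master, 66Mhz, medium devsel, latency 52, IRQ 16
05:01.1 Ethernet controller: Intel Corp. 82546GB Gigabit Ethernet Controller (rev 03)
\tFlags: bus master, 66Mhz, medium devsel, latency 52, IRQ 17
"""

if __name__ == "__main__":
    for irq, devs in sorted(devices_by_irq(SAMPLE).items()):
        if len(devs) > 1:
            slots = ", ".join(slot for slot, _ in devs)
            print("IRQ %d shared by: %s" % (irq, slots))
```

To use it on a real machine, the saved output of lspci -v would be
passed in place of SAMPLE; comparing against /proc/interrupts shows
which drivers actually registered on each line.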

In the past, at other sites, I built systems with this hardware and
these drivers on kernels 2.4.32 and 2.4.35, and everything worked
without problems.

I'm really hoping for advice. The situation is critical right now. If
you have any advice on rebuilding the kernel or on getting more
detailed diagnostic output when the system crashes, I will be very
grateful.
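
As one possible way to capture state right before a hang, here is a
minimal sketch of a periodic snapshot logger. It assumes Python 3 is
installed on the server (not confirmed anywhere in this thread), and
the function names and the choice of /proc files are only illustrative.
Each record is flushed and fsync'ed so that the last snapshot has a
chance of surviving a hard lockup.

```python
import os
import time


def snapshot(paths=("/proc/loadavg", "/proc/meminfo")):
    """Return one text record: a timestamp plus the contents of each file."""
    parts = [time.strftime("%Y-%m-%d %H:%M:%S")]
    for path in paths:
        try:
            with open(path) as f:
                parts.append("== %s ==\n%s" % (path, f.read().rstrip()))
        except OSError as exc:
            # Keep logging even if one source cannot be read.
            parts.append("== %s == unreadable: %s" % (path, exc))
    return "\n".join(parts) + "\n\n"


def append_durably(log_path, record):
    """Append the record and force it to disk before returning."""
    with open(log_path, "a") as log:
        log.write(record)
        log.flush()
        os.fsync(log.fileno())


def run(log_path, interval=5.0, iterations=None):
    """Write a snapshot every `interval` seconds; loop forever if iterations is None."""
    count = 0
    while iterations is None or count < iterations:
        append_durably(log_path, snapshot())
        count += 1
        if iterations is None or count < iterations:
            time.sleep(interval)
```

A call such as run("/some/other/disk/hang.log", interval=5) (the path is
hypothetical) would leave a trail ending shortly before the hang. If
the disk subsystem itself is the culprit, a log on the same array may
never reach the platters, so a target on a different device, or sending
the data over the network as suggested in the quoted reply below, is
probably the safer choice.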

On 9/3/08, Michael Shigorin <mike at osdn.org.ua> wrote:
> On Mon, Sep 01, 2008 at 06:59:42AM +0400, master altlinux wrote:
>> Respect for any ideas or suggestions...
>
> Well, if it's for any...
>
> I definitely like the kernels built by vsu@ more than my own.
>
> If you get the chance, do routine maintenance on the systems,
> starting with a vacuum cleaner over the case, coolers and memory.
>
> Look over the capacitors by eye to see whether any have bulged.
>
> Run memtest86+ for a few hours (or at least half an hour), and
> bonnie++ on the disks.  Afterwards smartctl -a, if this is
> software RAID.
>
> For trend analysis I can also suggest using collectd in its
> client-server setup (so that every byte of statistics it manages
> to push onto the network gets saved); you could try backporting
> to M24 the package from Daedalus, which still hasn't made it
> into Sisyphus and the branches after rework.
>
> PS: when you start thinking about moving the systems: putting
> one wholesale into an OpenVZ container under Server 4.0 has been
> tested, followed by a gradual, controlled, rollback-capable
> migration of services to 4.0.
>
> --
>  ---- WBR, Michael Shigorin <mike at altlinux.ru>
>   ------ Linux.Kiev http://www.linux.kiev.ua/
> _______________________________________________
> Sysadmins mailing list
> Sysadmins at lists.altlinux.org
> https://lists.altlinux.org/mailman/listinfo/sysadmins
>

