Curator(s):
KT
Author | Message | |||
---|---|---|---|---|
|
||||
Member Status: Offline |
WHEN ASKING FOR HELP, POST THE S.M.A.R.T. DATA OF THE PROBLEM DRIVE! You can view it with Everest, AIDA 64, Victoria 4.x, Dtemp, HDDScan, HD Tune, Crystal Disk Info, SpeedFan… Pay attention to the DATA/RAW values; they are the main indicators of disk health.
In Crystal Disk Info, open Function > Advanced Feature > Raw Values and choose "10 [DEC]"; this makes the utility's output somewhat easier for forum members to read.
Screenshots: when posting screenshots, remember the restrictions imposed by rule 3.12 of the forum rules, which prohibits: "Placing images larger than 500 kB per post inside «Img» tags. Images up to 2 MB are allowed under the «spoiler=» tag, as are direct links to images of any size. Links to pages where the image is shown amid advertising are prohibited; sites that use them are blocked by the auto-censor."
For a better understanding of the issue, see the information on the first page of the thread, compiled by comrade Ing-Syst. A very detailed article on ixbt.com can also help you make sense of SMART readings: "Assessing disk health with S.M.A.R.T." Solving your problem may require running a cycle of procedures with the Victoria and MHDD utilities. Links to instructions for these programs are on the first page of the thread.
Related threads: [FAQ] All about Western Digital hard drives; Data recovery; Seagate has officially acknowledged the problem with the 7200.11.
Useful posts by participants of this thread: Resetting some SMART parameters on Samsung hard drives. ShutUp, a program by comrade CoolCMD that prevents frequent HDD head parking: https://disk.yandex.ru/d/x3UITAgo3EGqub The program reads one sector at a user-defined interval. R.baza: tracking and searching for hard drive spare parts.
Last edited by KT on 29.11.2021 18:36; edited 15 times in total. |
vensant_jarden |
|
Member Status: Offline |
Sania., understood. So I should simply install it and, if no obvious problems come up, keep an eye on things over time. |
Sania. |
|
Member Status: Offline |
Yes, a silly question. If you install a driver for a video card, does that stop the card from working? If a driver could cause something like that, don't you think you would have been warned, and that everyone else would have sued Intel over such a broken driver? |
Sinestery |
|
Junior Status: Offline |
Tomset wrote: "It died, and its SMART has clearly gone haywire." What exactly is wrong with it? Attachment: (image)
|
Sania. |
|
Member Status: Offline |
Sinestery wrote: "What exactly is wrong with it?" Half of the SMART readout shows nonsense; use a modern SMART-reading program. |
kolyan1980-08-11 |
|
Member Status: Offline |
userID, I'm not claiming anything, but it did help me once. |
RuckusDJ |
|
||
Junior Status: Offline |
Hello!
|
fixit |
|
Member Status: Offline |
RuckusDJ wrote: "Can I safely throw the disk away?" Now run a remap under DOS. |
Sania. |
|
Member Status: Offline |
Give it some cooling; it is heating up to 52 degrees right now, which is very bad. |
RuckusDJ |
|
Junior Status: Offline |
Sania. |
Sania. |
|
Member Status: Offline |
In any case, don't let the disk overheat. |
7Gluk7 |
|
Junior Status: Offline |
Good day, everyone! Code: ID | Attribute description | Threshold | Value | Worst | Data | Status. Attachment:
I didn't save the read graph, but it was flat at 540 MB/s. Last edited by 7Gluk7 on 20.01.2020 15:24; edited 1 time in total. |
Sania. |
|
Member Status: Offline |
7Gluk7 wrote: "What can you advise?" Wipe the disk "down to zilch" and run this test from another disk. |
O Smirnoff |
|
Member Status: Offline |
Sania. wrote: "Wipe the disk down to zilch." Secure Erase I understand; but "down to zilch": where is that, what for, and for whom?.. |
Sania. |
|
Member Status: Offline |
Deleting the MBR will probably be enough there, but the author can do whatever else he comes up with; the main thing is that the disk ends up empty. |
O Smirnoff |
|
Member Status: Offline |
Sania. wrote: "deleting the MBR will be enough." Ah, so that is exactly what "wipe the disk down to zilch" means? |
Sania. |
|
Member Status: Offline |
7Gluk7 |
|
Junior Status: Offline |
Sania. wrote: "Wipe the disk down to zilch and run this test from another disk." Doesn't AIDA64 fill the disk with zeros during its write test anyway? |
O Smirnoff |
|
Member Status: Offline |
Sania. wrote:
Not » «; just write in clear terms already, otherwise your verbiage only leads people astray… Added 46 seconds later: 7Gluk7 wrote: "Doesn't AIDA64 fill the disk with zeros during its write test anyway?" Secure Erase is still the better choice. |
Sania. |
|
Member Status: Offline |
7Gluk7 wrote: "Doesn't AIDA64 fill the disk with zeros during its write test anyway?" Not on a non-empty disk, one filled not with zeros and ones but with actual files, which Windows won't let AIDA overwrite, so that you don't come crying later that half of Windows and your photos vanished somewhere. Added 2 minutes 7 seconds later: O Smirnoff wrote: "Not » «; just write in clear terms already, otherwise your verbiage only leads people astray…" This way I have to write less; I do need to find out how knowledgeable the asker is. |
7Gluk7 |
|
Junior Status: Offline |
O Smirnoff wrote: "Secure Erase is still the better choice." I'll try it. Sania. wrote: "Not on a non-empty disk, one filled not with zeros and ones but with actual files, which Windows won't let AIDA overwrite, so that you don't come crying later that half of Windows and your photos vanished somewhere." I'm running from a LiveUSB, and I have moved Windows onto a VHD for now. Last edited by 7Gluk7 on 20.01.2020 15:49; edited 1 time in total. |
Moderators: Trinity admin`s, Free-lance moderator`s
-
pinkzebra
- Junior member
- Posts: 9
- Registered: 18 Apr 2017, 12:58
What is this error? Phy is bad on enclosure.
HARDWARE—
Controller: Controller0: LSI MegaRAID SAS 9280-8e(Bus 5,Dev 0,Domain 0)
Status: Optimal
Firmware Package Version:12.15.0-0239
Firmware Version: 2.130.403-4660
BBU: NO
Enclosure(s): 1
Drive(s): 13
Virtual Drive(s): 3
Enclosures—
PRODUCT NAME TYPE STATUS
SAS2X28 Ses OK
Drives—
CONNECTOR PRODUCT ID VENDOR ID STATE DISK TYPE CAPACITY POWER STATE
null x0 & null x0 ST1000NM0001 SEAGATE Online SAS 931.000 GB On
null x0 & null x0 ST1000NM0001 SEAGATE Online SAS 931.000 GB On
null x0 & null x0 ST1000NM0001 SEAGATE Online SAS 931.000 GB On
null x0 & null x0 ST1000NM0001 SEAGATE Online SAS 931.000 GB On
null x0 & null x0 ST1000NM0001 SEAGATE Online SAS 931.000 GB On
null x0 & null x0 ST1000NM0001 SEAGATE Online SAS 931.000 GB On
null x0 & null x0 ST1000NM0001 SEAGATE Online SAS 931.000 GB On
null x0 & null x0 MG03SCA200 TOSHIBA Online SAS 1.819 TB On
null x0 & null x0 MG03SCA200 TOSHIBA Online SAS 1.819 TB On
null x0 & null x0 ST1000NM00339ZM ATA Dedicated Hot Spare SATA 931.000 GB Powersave
null x0 & null x0 ST1000NM00339ZM ATA Dedicated Hot Spare SATA 931.000 GB Powersave
null x0 & null x0 MAXTORSTM316081 ATA Online SATA 148.531 GB On
null x0 & null x0 MAXTORSTM316081 ATA Online SATA 148.531 GB On
-
Stranger03
- Trinity staff
- Posts: 12979
- Registered: 14 Nov 2003, 16:25
- Location: St. Petersburg, Yekaterinburg
- Contact information:
Re: What is this error? Phy is bad on enclosure.
Post
Stranger03 » 25 Apr 2017, 09:08
pinkzebra
I don't see a log, or the error in one. Is the error on one particular disk or on all of them?
-
pinkzebra
- Junior member
- Posts: 9
- Registered: 18 Apr 2017, 12:58
Re: What is this error? Phy is bad on enclosure.
Post
pinkzebra » 25 Apr 2017, 11:46
Stranger03 wrote: pinkzebra
I don't see a log, or the error in one. Is the error on one particular disk or on all of them?
On the last 4 disks, the ones in ATA mode.
Everything works fine; it is just that right after boot this message appears and the red indicator lights up on the tray of each of the 4.
The 9 SAS disks behave properly.
-
gs
- Trinity staff
- Posts: 16650
- Registered: 23 Aug 2002, 17:34
- Location: Moscow
- Contact information:
Re: What is this error? Phy is bad on enclosure.
Post
gs » 25 Apr 2017, 11:48
Actually, the controller is complaining about four ports. Are they all SATA drives? Are they on the compatibility list? There are also messages saying the disks were put into sleep mode, which does not always work smoothly either.
-
pinkzebra
- Junior member
- Posts: 9
- Registered: 18 Apr 2017, 12:58
Re: What is this error? Phy is bad on enclosure.
Post
pinkzebra » 25 Apr 2017, 13:46
Fine, not 4 disks but 4 ports.
Yes, these 4 disks are SATA; two of them are on the compatibility list.
-
Stranger03
- Trinity staff
- Posts: 12979
- Registered: 14 Nov 2003, 16:25
- Location: St. Petersburg, Yekaterinburg
- Contact information:
Re: What is this error? Phy is bad on enclosure.
Post
Stranger03 » 25 Apr 2017, 13:47
pinkzebra wrote: Fine, not 4 disks but 4 ports.
Yes, these 4 disks are SATA; two of them are on the compatibility list.
Try plugging in any SAS disk to rule out a faulty port or backplane.
-
pinkzebra
- Junior member
- Posts: 9
- Registered: 18 Apr 2017, 12:58
Re: What is this error? Phy is bad on enclosure.
Post
pinkzebra » 27 Apr 2017, 09:08
I inserted a SAS disk into an empty tray, and into a tray in place of a SATA disk; in every case it started normally, without this error.
Conclusion: the error is caused specifically by the disks with a SATA connector…
As I understand it, the problem is in the expander? Does it need to be reflashed?
Code:
ID = 114
SEQUENCE NUMBER = 1365
TIME = 27-04-2017 10:48:47
LOCALIZED MESSAGE = Controller ID: 0 State change: PD = Port B:1:12 Previous = Unconfigured Bad Current = Unconfigured Good
ID = 247
SEQUENCE NUMBER = 1364
TIME = 27-04-2017 10:48:47
LOCALIZED MESSAGE = Controller ID: 0 Device inserted Device Type: Disk Device Id: 31
ID = 91
SEQUENCE NUMBER = 1363
TIME = 27-04-2017 10:48:47
LOCALIZED MESSAGE = Controller ID: 0 PD inserted: Port B:1:12
ID = 247
SEQUENCE NUMBER = 1362
TIME = 27-04-2017 10:48:47
LOCALIZED MESSAGE = Controller ID: 0 Device inserted Device Type: Disk Device Id: 29
ID = 91
SEQUENCE NUMBER = 1361
TIME = 27-04-2017 10:48:47
LOCALIZED MESSAGE = Controller ID: 0 PD inserted: Port B:1:10
ID = 185
SEQUENCE NUMBER = 1360
TIME = 27-04-2017 10:48:47
LOCALIZED MESSAGE = Controller ID: 0 Phy is bad on enclosure: 1 PHY 10
ID = 331
SEQUENCE NUMBER = 1359
TIME = 27-04-2017 10:47:42
LOCALIZED MESSAGE = Controller ID: 0 Power state change on PD = Port B:1:10 Previous = Powersave Current = On
ID = 114
SEQUENCE NUMBER = 1358
TIME = 27-04-2017 10:47:25
LOCALIZED MESSAGE = Controller ID: 0 State change: PD = Port B:1:10 Previous = Unconfigured Good Current = Unconfigured Bad
ID = 248
SEQUENCE NUMBER = 1357
TIME = 27-04-2017 10:47:25
LOCALIZED MESSAGE = Controller ID: 0 Device removed Device Type: Disk Device Id: 31
ID = 112
SEQUENCE NUMBER = 1356
TIME = 27-04-2017 10:47:25
LOCALIZED MESSAGE = Controller ID: 0 PD removed: Port B:1:10
ID = 289
SEQUENCE NUMBER = 1355
TIME = 27-04-2017 10:47:25
LOCALIZED MESSAGE = Controller ID: 0 Redundant path broken PD : Port A:1:10 Path : 1 SAS Address : 0x500000E01714D813
ID = 114
SEQUENCE NUMBER = 1354
TIME = 27-04-2017 10:47:12
LOCALIZED MESSAGE = Controller ID: 0 State change: PD = Port B:1:10 Previous = Unconfigured Bad Current = Unconfigured Good
ID = 113
SEQUENCE NUMBER = 1353
TIME = 27-04-2017 10:47:12
LOCALIZED MESSAGE = Controller ID: 0 Unexpected sense: PD = Port B:1:10Power on occurred, CDB = 0x28 0x00 0x08 0x8f 0xc1 0xcf 0x00 0x00 0x01 0x00 , Sense = 0x70 0x00 0x06 0x00 0x00 0x00 0x00 0x28 0x00 0x00 0x00 0x00 0x29 0x01 0x00 0x00 0x00 0x00 0x00 0x28 0x00 0x01 0x04 0x03 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x22 0x13 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
ID = 247
SEQUENCE NUMBER = 1352
TIME = 27-04-2017 10:47:12
LOCALIZED MESSAGE = Controller ID: 0 Device inserted Device Type: Disk Device Id: 31
ID = 91
SEQUENCE NUMBER = 1351
TIME = 27-04-2017 10:47:12
LOCALIZED MESSAGE = Controller ID: 0 PD inserted: Port B:1:10
ID = 331
SEQUENCE NUMBER = 1350
TIME = 27-04-2017 10:46:59
LOCALIZED MESSAGE = Controller ID: 0 Power state change on PD = Port B:1:9 Previous = Transition Current = On
ID = 331
SEQUENCE NUMBER = 1349
TIME = 27-04-2017 10:46:49
LOCALIZED MESSAGE = Controller ID: 0 Power state change on PD = Port B:1:9 Previous = Powersave Current = Transition
ID = 113
SEQUENCE NUMBER = 1348
TIME = 27-04-2017 10:46:49
LOCALIZED MESSAGE = Controller ID: 0 Unexpected sense: PD = Port B:1:7Power on occurred, CDB = 0x2e 0x00 0xe8 0xe0 0x62 0x6b 0x00 0x00 0x01 0x00 , Sense = 0x70 0x00 0x06 0x00 0x00 0x00 0x00 0x28 0x00 0x00 0x00 0x00 0x29 0x01 0x00 0x00 0x00 0x00 0x00 0x2e 0x01 0x08 0x17 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x19 0x19 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
ID = 113
SEQUENCE NUMBER = 1347
TIME = 27-04-2017 10:46:49
LOCALIZED MESSAGE = Controller ID: 0 Unexpected sense: PD = Port B:1:6Mode parameters changed, CDB = 0x2e 0x00 0x74 0x70 0x47 0x6b 0x00 0x00 0x01 0x00 , Sense = 0x70 0x00 0x06 0x00 0x00 0x00 0x00 0x0a 0x00 0x00 0x00 0x00 0x2a 0x01 0x00 0x00 0x00 0x00
ID = 113
SEQUENCE NUMBER = 1346
TIME = 27-04-2017 10:46:49
LOCALIZED MESSAGE = Controller ID: 0 Unexpected sense: PD = Port B:1:5Mode parameters changed, CDB = 0x2e 0x00 0x74 0x70 0x47 0x6b 0x00 0x00 0x01 0x00 , Sense = 0x70 0x00 0x06 0x00 0x00 0x00 0x00 0x0a 0x00 0x00 0x00 0x00 0x2a 0x01 0x00 0x00 0x00 0x00
ID = 114
SEQUENCE NUMBER = 1345
TIME = 27-04-2017 10:46:49
LOCALIZED MESSAGE = Controller ID: 0 State change: PD = Port B:1:10 Previous = Hot Spare Current = Unconfigured Bad
ID = 248
SEQUENCE NUMBER = 1344
TIME = 27-04-2017 10:46:49
LOCALIZED MESSAGE = Controller ID: 0 Device removed Device Type: Disk Device Id: 29
ID = 112
SEQUENCE NUMBER = 1343
TIME = 27-04-2017 10:46:49
LOCALIZED MESSAGE = Controller ID: 0 PD removed: Port B:1:10
ID = 114
SEQUENCE NUMBER = 1342
TIME = 27-04-2017 10:46:03
LOCALIZED MESSAGE = Controller ID: 0 State change: PD = Port B:1:11 Previous = Unconfigured Good Current = Unconfigured Bad
ID = 248
SEQUENCE NUMBER = 1341
TIME = 27-04-2017 10:46:03
LOCALIZED MESSAGE = Controller ID: 0 Device removed Device Type: Disk Device Id: 31
ID = 112
SEQUENCE NUMBER = 1340
TIME = 27-04-2017 10:46:03
LOCALIZED MESSAGE = Controller ID: 0 PD removed: Port B:1:11
ID = 289
SEQUENCE NUMBER = 1339
TIME = 27-04-2017 10:46:03
LOCALIZED MESSAGE = Controller ID: 0 Redundant path broken PD : Port A:1:11 Path : 1 SAS Address : 0x500000E01714D813
ID = 114
SEQUENCE NUMBER = 1338
TIME = 27-04-2017 10:45:51
LOCALIZED MESSAGE = Controller ID: 0 State change: PD = Port B:1:11 Previous = Unconfigured Bad Current = Unconfigured Good
ID = 247
SEQUENCE NUMBER = 1337
TIME = 27-04-2017 10:45:51
LOCALIZED MESSAGE = Controller ID: 0 Device inserted Device Type: Disk Device Id: 31
ID = 91
SEQUENCE NUMBER = 1336
TIME = 27-04-2017 10:45:51
LOCALIZED MESSAGE = Controller ID: 0 PD inserted: Port B:1:11
ID = 114
SEQUENCE NUMBER = 1335
TIME = 27-04-2017 10:45:21
LOCALIZED MESSAGE = Controller ID: 0 State change: PD = Port B:1:12 Previous = Unconfigured Good Current = Unconfigured Bad
ID = 248
SEQUENCE NUMBER = 1334
TIME = 27-04-2017 10:45:21
LOCALIZED MESSAGE = Controller ID: 0 Device removed Device Type: Disk Device Id: 31
ID = 112
SEQUENCE NUMBER = 1333
TIME = 27-04-2017 10:45:21
LOCALIZED MESSAGE = Controller ID: 0 PD removed: Port B:1:12
ID = 289
SEQUENCE NUMBER = 1332
TIME = 27-04-2017 10:45:20
LOCALIZED MESSAGE = Controller ID: 0 Redundant path broken PD : Port B:1:12 Path : 0 SAS Address : 0x500000E01714D812
ID = 247
SEQUENCE NUMBER = 1331
TIME = 27-04-2017 10:45:01
LOCALIZED MESSAGE = Controller ID: 0 Device inserted Device Type: Disk Device Id: 31
ID = 91
SEQUENCE NUMBER = 1330
TIME = 27-04-2017 10:45:01
LOCALIZED MESSAGE = Controller ID: 0 PD inserted: Port B:1:12
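A log like the one above is easier to read once the events are grouped per port. What follows is only a rough Python sketch, not a MegaRAID tool: the four field names come from the excerpt, while `parse_events`, `events_per_port`, `SAMPLE_LOG` and the `Port X:n:n` pattern are illustrative assumptions.

```python
import re
from collections import Counter

# Field names as they appear in the log excerpt.
FIELD_RE = re.compile(r"^(ID|SEQUENCE NUMBER|TIME|LOCALIZED MESSAGE) = (.*)$")
# Assumption: ports are always written like "Port B:1:10" in the message text.
PORT_RE = re.compile(r"Port [A-Z]:\d+:\d+")

# Two entries copied from the excerpt, used only as sample input.
SAMPLE_LOG = """ID = 185
SEQUENCE NUMBER = 1360
TIME = 27-04-2017 10:48:47
LOCALIZED MESSAGE = Controller ID: 0 Phy is bad on enclosure: 1 PHY 10
ID = 112
SEQUENCE NUMBER = 1356
TIME = 27-04-2017 10:47:25
LOCALIZED MESSAGE = Controller ID: 0 PD removed: Port B:1:10
"""


def parse_events(text):
    """Return one dict per event; a new event starts at each 'ID = ' line."""
    events = []
    for line in text.splitlines():
        match = FIELD_RE.match(line.strip())
        if not match:
            continue  # skip anything that is not a known field
        key, value = match.groups()
        if key == "ID":
            events.append({})
        if events:
            events[-1][key] = value
    return events


def events_per_port(events):
    """Count how many events mention each port string."""
    counts = Counter()
    for event in events:
        match = PORT_RE.search(event.get("LOCALIZED MESSAGE", ""))
        if match:
            counts[match.group(0)] += 1
    return counts
```

Running `events_per_port(parse_events(log_text))` on the full log would show which ports account for most of the insert/remove churn.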
-
Stranger03
- Trinity staff
- Posts: 12979
- Registered: 14 Nov 2003, 16:25
- Location: St. Petersburg, Yekaterinburg
- Contact information:
Re: What is this error? Phy is bad on enclosure.
Post
Stranger03 » 27 Apr 2017, 09:16
pinkzebra wrote: As I understand it, the problem is in the expander? Does it need to be reflashed?
More likely it needs to be replaced, or else switch to NL SAS disks.
-
gs
- Trinity staff
- Posts: 16650
- Registered: 23 Aug 2002, 17:34
- Location: Moscow
- Contact information:
Re: What is this error? Phy is bad on enclosure.
Post
gs » 27 Apr 2017, 10:54
I rather doubt that the interface type itself is the issue. More likely these particular SATA drives are incompatible with the controller.
-
Stranger03
- Trinity staff
- Posts: 12979
- Registered: 14 Nov 2003, 16:25
- Location: St. Petersburg, Yekaterinburg
- Contact information:
Re: What is this error? Phy is bad on enclosure.
Post
Stranger03 » 27 Apr 2017, 11:25
gs wrote: I rather doubt that the interface type itself is the issue. More likely these particular SATA drives are incompatible with the controller.
That can be checked by connecting them directly, without the backplane.
DISK
DISK Displays information about the disks in the system.
This command is used to change the configuration settings for the disks
in the system and monitor the status of the disk channels. The command
will display the current disk configuration settings and the status of
each disk channel. The INFO= parameter can be used to display all of
the information about a disk in the system. The LIST parameter will
display a list of the disks installed in the system and indicate how
many were found.
- AGINGLIMIT=x|OFF
- Sets the maximum time a command should wait in the disk command
queue.
This parameter is for Hitachi SAS drives only.
Each unit of this timer is 50 ms, where 0 is 50 ms.
Range: 0 to 4 (50 to 250 milliseconds) or OFF.
Default is 3 (200 milliseconds).
- AUTOREASSIGN=ON|OFF
- Allows the user to turn on or off whether bad blocks will be
reassigned when a medium error occurs on a healthy tier.
Default is ON.
- CMD_TIMEOUT=x
- This parameter sets the retry disk timeout (in seconds) for an I/O
request. The retry timeout value indicates the maximum amount of
time that is allotted to receive a reply for each retry of an I/O
request. If the I/O request does not complete within this time, it is
aborted and potentially retried: if there is still time remaining in
the overall disk timeout to allow for another retry, it is retried;
if not, it completes with an error status.
This parameter must be smaller than or equal to TIMEOUT.
Valid range is 1 to 512 seconds.
Recommended value for SAS drives is 11 seconds.
Recommended value for SATA drives is 31 seconds. Setting the timeout
below the recommended values can cause disk failures.
Default is 31 seconds.
- DEFECTLIST[=tc]
- Allows the user to display the number of defects in the defect list
for the specified disk. The defect list contains all the physical
sectors on the disk that the drive has identified as bad, and to
which the disk’s hardware prevents access. The list is classified
into two types: the permanent list and the grown list. The permanent
list consists of the bad sectors that are identified by the disk
manufacturer; the grown list consists of the bad sectors that are
found after the disk has left the factory (and which can be added to
at any time).
The disk is specified by its tier and channel locations, ‘tc’, where:
‘t’ indicates the tier in the range <1..128>, and
‘c’ indicates the channel in the range <ABCDEFGHPS>.
- DIAG[=tc]
- Performs a series of diagnostics tests on the specified disk.
The disk is specified by its physical tier and channel locations,
‘tc’, where:
‘t’ indicates the tier in the range <1..128>, and
‘c’ indicates the channel in the range <ABCDEFGHPS>.
- FAIL[=tc]
- This parameter tells the system to fail the specified disk at the
physical tier and channel locations indicated by ‘tc’, where:
‘t’ indicates the tier in the range <1..128>, and
‘c’ indicates the channel in the range <ABCDEFGHPS>.
When a non-SPARE disk is specified:
If failing the disk won’t cause a multi-channel failure, the disk
is marked as failed, and an attempt is made to replace it with a
spare disk.
When a SPARE disk is specified:
If the spare disk is currently in use as a replacement for a
failed disk, then the disk that the spare is replacing is put back to
a failed status, and the spare is released, but it is marked as
unhealthy and unavailable.
- FAST_FAIL=[ON|OFF]
- This parameter turns on/off the fast fail mode for disks that are
slow to respond to data access commands. The fast fail parameters can
be customized to a particular need. Default is OFF.
- FAST_FAIL_THRESHOLD='num cmds'
- This parameter indicates how many consecutive commands in the fast
fail algorithm must occur before failing the drive for this reason.
The default value is 5.
Valid range = 2 to 20.
- FAST_FAIL_WINDOW_END='t'
- This parameter indicates the timeout in seconds for when a disk
response is received outside of a window in the future. If the
command finishes outside of this time value, it is not aggregated in
the slow disk algorithm as it is considered a separate instance of
the event and the counter will restart. The default value is 90
seconds.
Valid range = 3 to 180.
- FAST_FAIL_WINDOW_START='t'
- This parameter indicates the timeout in seconds for when a disk
response is considered slow and will count against the drive in the
slow disk fail algorithm. The default value is 5 seconds.
Valid range = 2 to 179.
- INFO[=tc]
- This parameter displays the information and status about a specific
disk in the system.
The disk is specified by its physical tier and channel locations,
‘tc’, where:
‘t’ indicates the tier in the range <1..128>, and
‘c’ indicates the channel in the range <ABCDEFGHPS>.
- LIST[=SAS_ID|SPEED]
- This parameter displays a list of all the disks installed in the
system and indicates how many were found of each type.
The optional SAS_ID parameter will display the SAS ID of the device
instead of the serial number.
The optional SPEED parameter will display the link speed of the
device instead of the RPM.
- LLFORMAT[=tc]
- Allows the user to perform a low level format of a disk drive.
The disk is specified by its tier and channel locations, ‘tc’, where:
‘t’ indicates the tier in the range <1..128>, and
‘c’ indicates the channel in the range <ABCDEFGHPS>.
- MAXCMDS=x
- Sets the maximum command queue depth to a tier of disks.
Range: 1 to 32 commands per tier.
Default: 16 commands.
Setting should be as follows:
  - 16 if any SATA drives are used.
  - 32 for everything else.
- MAXREADLEN=x
- Sets the maximum read command length for SATA drives in KiB.
This parameter is used to increase throughput on systems with a large
number of SATA tiers by reducing the contention for the SAS lanes.
128K is the recommended setting for systems with 16 tiers or more of
SATA disks.
2048K is the recommended setting for systems with SAS disks.
Range is 128 to 2048.
Default is 128.
- MAXWRITELEN=x
- Sets the maximum write command length to the drives in KiB.
This parameter is provided for testing only and should normally not
be changed.
Range is 128 to 2048.
Default is 2048.
- PLS[=[t][c]]
-
Requests/displays the PHY Link Error Status Block information for the
specified drive. Note that SATA and SAS drives report PHY errors
differently. The PHY information consists of the following items:

SATA AAMUX PHY errors:
ERROR    Explanation
H-RX     Number of SATA FIS CRC errors received on the host port of the
         AAMUX.
H-TX     Number of SATA R_ERR primitives received on the host port,
         indicating a problem with the transmitter of the AAMUX.
H-Link   Number of times the PHY has lost link on the host port.
H-Disp   Number of frame errors for the host port of the AAMUX. These
         include: code error, disparity error, or realignment.
O-RX     Number of SATA FIS CRC errors received on the other host port
         of the AAMUX.
O-TX     Number of SATA R_ERR primitives received on the other host
         port, indicating a problem with the transmitter of the AAMUX.
O-Link   Number of times the PHY has lost link on the other host port.
O-Disp   Number of frame errors for the other host port of the AAMUX.
         These include: code error, disparity error, or realignment.
D-RX     Number of SATA FIS CRC errors received on the device port of
         the AAMUX.
D-TX     Number of SATA R_ERR primitives received on the device port,
         indicating a problem with the transmitter of the AAMUX.
D-Link   Number of times the PHY has lost link on the device port.
D-Disp   Number of frame errors for the device port of the AAMUX. These
         include: code error, disparity error, or realignment.

SAS PHY errors:
ERROR    Explanation
InvDW    Invalid DWORD count: the number of invalid dwords received
         outside of the PHY reset sequence.
RunDis   Running disparity count: the number of dwords containing
         running disparity errors received outside of the PHY reset
         sequence.
LDWSYN   Loss of DWORD synchronization count: the number of times the
         PHY has lost synchronization and restarted the link reset
         sequence.
PHYRES   PHY reset problem count: the number of times the PHY reset
         sequence has failed.

The disk is specified by its physical tier and channel locations,
‘tc’, where:
- ‘t’ indicates the tier in the range <1..128>, and
- ‘c’ indicates the channel in the range <ABCDEFGHPS>.
If neither the tier nor the channel are specified, the PLS
information is requested from all drives.
If only the tier is specified, the PLS information is requested from
all the drives on the specified tier.
- PMBIT=ON|OFF
- When ON this parameter sets the PM (performance mode) bit in Seagate
SAS drives mode pages. When OFF the Seagate drive uses its default
performance mode settings.
Default is OFF.
- QUARANTINE
- Displays the number of quarantine events on this controller for each
disk in the system. Only tiers with quarantine counts will be
displayed.
Use QUARANTINECLEAR to reset the quarantine counts.
- QUARANTINE=[ON|OFF]
- Enables/disables the disk quarantine feature for all of the disks. A
disk cannot be quarantined unless FASTAV is enabled for the LUN.
Default is OFF.
- QUARANTINECLEAR
- Resets the quarantine counts for all of the disks.
- QUARANTINECMDLIMIT=x
- Sets the maximum number of outstanding disk commands after a good
response before a quarantined disk can be put back into service.
Range 0 to 32 where 0 indicates no delay before putting the disk back
into service.
Default is 0.
- QUARANTINETIMEOUT=x
- Sets the minimum timeout before a disk can be quarantined in 16.6
millisecond increments. A disk cannot be quarantined unless FASTAV is
enabled and has timed out on the command.
Range 6 to 65535.
Default is 12 (200 milliseconds).
- REASSIGN[=tc] [0xh]
- Allows for the reassigning of defective logical blocks on a disk to
an area of the disk reserved for this purpose.
The disk is specified by its tier and channel locations, ‘tc’, where:
‘t’ indicates the tier in the range <1..128>, and
‘c’ indicates the channel in the range <ABCDEFGHPS>.
0xh is the hexadecimal value of the LBA (Logical Block Address) to be
reassigned.
- REBUILD[=tc]|ALL
-
This parameter tells the system to start a rebuild operation on a
(presumably) already failed disk. A rebuild operation restores a
failed disk to a healthy status once it completes. Note that this
operation can take several hours to complete depending on the size of
the disk and the speed of the rebuild operation. The speed of the
rebuild operation can be adjusted with the DELAY and EXTENT
parameters of the TIER command.
In addition, the rebuild operation can be stopped, or paused and
resumed with the TIER STOP, TIER PAUSE, and TIER RESUME commands.
The TIER AUTOREBUILD command can be used to automate the rebuild
process.
Note that SPARE disks are handled slightly differently from other
disks, in that SPARES that are not in use as an active replacement
for a failed disk elsewhere in the system are simply returned to a
normal healthy status by this command; SPAREs that are in use are
already considered healthy and are not rebuilt.
The failed disk to be rebuilt is specified by its physical tier and
channel locations, ‘tc’, where:
- ‘t’ indicates the tier in the range <1..128>, and
- ‘c’ indicates the channel in the range <ABCDEFGHPS>.
All failed and replaced disks can be rebuilt using the ALL parameter.
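Many of the parameters above take the same ‘tc’ tier-and-channel argument, so a script that drives this command may want to validate it first. This is a minimal Python sketch under stated assumptions: the help text gives only the ranges, so the exact spelling (tier digits immediately followed by one channel letter, for example `12A`), the case-insensitive handling and the name `parse_tc` are all hypothetical.

```python
import re

# Channel letters exactly as listed in the help text.
VALID_CHANNELS = "ABCDEFGHPS"
# Assumed spelling: 1 to 3 digits followed directly by one letter.
TC_RE = re.compile(r"^(\d{1,3})([A-Za-z])$")


def parse_tc(spec):
    """Parse a 'tc' string such as '12A' into a (tier, channel) tuple."""
    match = TC_RE.match(spec.strip())
    if not match:
        raise ValueError(f"not a tier/channel spec: {spec!r}")
    tier = int(match.group(1))
    channel = match.group(2).upper()
    if not 1 <= tier <= 128:
        raise ValueError(f"tier {tier} is outside the range 1..128")
    if channel not in VALID_CHANNELS:
        raise ValueError(f"channel {channel!r} is not one of {VALID_CHANNELS}")
    return tier, channel
```

Note that REPLACE lists the narrower channel set <ABCDEFGHP>, so a caller building a REPLACE argument would also need to reject channel S.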
- REBUILDNOJOURNAL[=tc]|ALL
-
This parameter tells the system to start a rebuild operation on a
(presumably) already failed disk without using the journal. A
rebuild operation restores a failed disk to a healthy status once it
completes. Note that this operation can take several hours to
complete depending on the size of the disk and the speed of the
rebuild operation. The speed of the rebuild operation can be
adjusted with the DELAY and EXTENT parameters of the TIER command.
In addition, the rebuild operation can be stopped, or paused and
resumed with the TIER STOP, TIER PAUSE, and TIER RESUME commands.
The TIER AUTOREBUILD command can be used to automate the rebuild
process.
Note that SPARE disks are handled slightly differently from other
disks, in that SPARES that are not in use as an active replacement
for a failed disk elsewhere in the system are simply returned to a
normal healthy status by this command; SPAREs that are in use are
already considered healthy and are not rebuilt.
The failed disk to be rebuilt is specified by its physical tier and
channel locations, ‘tc’, where:
- ‘t’ indicates the tier in the range <1..128>, and
- ‘c’ indicates the channel in the range <ABCDEFGHPS>.
All failed and replaced disks can be rebuilt using the ALL parameter.
- REBUILDVERIFY=ON|OFF
-
This parameter determines if the system will send SCSI Write with
Verify commands to the disks when rebuilding failed disks. This
feature is used to guarantee that the data on the disks is rebuilt
correctly.
Note: This feature will increase the time it takes for rebuilds to
finish.
Default is OFF.
- REPLACE[=tc]
-
This parameter tells the system to replace the specified failed disk
with a spare disk or replace a healthy disk that is believed to be on
the verge of failing. The healthy disk replacement is referred to in
the system as a proactive replacement operation. A replace operation
is used to temporarily replace a disk with a healthy spare disk.
This operation can take several hours to complete depending on the
size of the disk and speed of the replace operation. The speed of
the replace operation can be adjusted with the DELAY and EXTENT
parameters of the TIER command.
The disk to be replaced is specified by its physical tier and channel
locations, ‘tc’, where:
- ‘t’ indicates the tier in the range <1..128>, and
- ‘c’ indicates the channel in the range <ABCDEFGHP>.
(Note that spare disks themselves cannot be replaced with this
command).
- RESTART[=tc]
-
This parameter tells the system to start a restart operation on a
(presumably) already failed disk.
The failed disk to be restarted is
specified by its physical tier and channel locations, ‘tc’, where:
- ‘t’ indicates the tier in the range <1..128>, and
- ‘c’ indicates the channel in the range <ABCDEFGHPS>.
All failed and replaced disks can be restarted using the ALL parameter.
- SCAN
- This parameter checks each disk channel in the system for any new
disks and verifies that the existing disks are in the correct
location. It also starts a rebuild operation on any failed disks
which pass the disk diagnostics.
- STATUS
- Displays the loop status of each disk channel and a count of the
fibre channel errors encountered on each channel.
- STATUSCLEAR
- Resets the fibre channel error counts on each disk channel.
- TIMEOUT=x
-
This parameter sets the total disk timeout (in seconds) for an I/O
request. The total disk timeout value indicates the total overall
length of time allotted to each I/O request to complete; if an I/O
request has not completed within this time frame, then an error
status is reported for it.
This parameter must be greater than or equal to CMD_TIMEOUT.
Valid range is 1 to 512 seconds.
- Recommended value for SAS drives is 27 seconds.
- Recommended value for SATA drives is 60 seconds.
Default is 60 seconds.
- WRITESAME=ON|OFF
-
Enables and disables use of the SCSI Write Same command when
formatting LUNs. The SCSI Write Same command is used by the system to
format LUNs faster. This parameter is provided for backwards
compatibility with disks or enclosures that do not support the SCSI
Write Same command.
Default is OFF.
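The location and timeout rules above can be checked before a command is sent. The following is a minimal Python sketch, not part of the product: the function names are hypothetical, and it assumes a ‘tc’ value is written as the tier number followed directly by the channel letter (for example 12B), which the text above does not state explicitly.

```python
import re
from typing import Tuple

VALID_CHANNELS = "ABCDEFGHPS"  # channel letters listed in the text above
MAX_TIER = 128                 # tier range <1..128> from the text above


def parse_tc(tc: str, allow_spare: bool = True) -> Tuple[int, str]:
    """Split a 'tc' location into (tier, channel).

    Assumption: the value looks like '12B' (digits, then one letter).
    Pass allow_spare=False for REPLACE, whose channel range omits S.
    """
    match = re.fullmatch(r"(\d{1,3})([A-Za-z])", tc.strip())
    if match is None:
        raise ValueError(f"not a tier/channel location: {tc!r}")
    tier, channel = int(match.group(1)), match.group(2).upper()
    if not 1 <= tier <= MAX_TIER:
        raise ValueError(f"tier {tier} is outside 1..{MAX_TIER}")
    allowed = VALID_CHANNELS if allow_spare else VALID_CHANNELS.replace("S", "")
    if channel not in allowed:
        raise ValueError(f"channel {channel!r} is not one of {allowed}")
    return tier, channel


def check_timeout(timeout_s: int, cmd_timeout_s: int) -> None:
    """Apply the two TIMEOUT rules: 1..512 seconds and >= CMD_TIMEOUT."""
    if not 1 <= timeout_s <= 512:
        raise ValueError("TIMEOUT must be between 1 and 512 seconds")
    if timeout_s < cmd_timeout_s:
        raise ValueError("TIMEOUT must be greater than or equal to CMD_TIMEOUT")
```

For example, parse_tc("12b") yields tier 12 on channel B, while parse_tc("3S", allow_spare=False) is rejected because a spare cannot be the target of REPLACE.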
The steps I took to fix it:
- Updated the BIOS.
- In the BIOS, disabled the SATA IDE Combined Mode with this help.
- Read the kernel documentation on kernel parameters, since every solution online involved adding parameters there.
- Found out, with a good shell script, that my SSD actually only supports a SATA speed of 3.0 Gbps:
for i in `grep -l Gbps /sys/class/ata_link/*/sata_spd`; do
  echo Link "${i%/*}" Speed `cat $i`
  cat "${i%/*}"/device/dev*/ata_device/dev*/id | perl -nE 's/([0-9a-f]{2})/print chr hex $1/gie' | echo " " Device `strings` | cut -f 1-3
done
- In the GRUB configuration, limited the SATA port of the SSD to a maximum speed of 3.0 Gbps:
vi /etc/default/grub
I changed the parameter in this line to allow only 3 Gbps for SATA port 7 (my SSD):
GRUB_CMDLINE_LINUX_DEFAULT="libata.force=7:3.0G quiet"
Then update GRUB and reboot:
update-grub
reboot
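For readers who prefer Python to the shell loop, here is a rough sketch of the same lookup: it lists the negotiated speed of each ATA link from sysfs and builds the kernel option used above. The function names are made up for this example, and the sysfs_root argument exists only so the code can be tried against a test directory; it assumes each sata_spd file holds a string such as "3.0 Gbps".

```python
from pathlib import Path


def link_speeds(sysfs_root: str = "/sys/class/ata_link") -> dict:
    """Map each ATA link name to its negotiated speed string.

    Links whose sata_spd file does not mention Gbps are skipped,
    mirroring the grep in the shell script.
    """
    speeds = {}
    for spd_file in sorted(Path(sysfs_root).glob("*/sata_spd")):
        text = spd_file.read_text().strip()
        if "Gbps" in text:
            speeds[spd_file.parent.name] = text
    return speeds


def force_option(port: int, speed: str = "3.0G") -> str:
    """Build a libata.force value in the form used in the GRUB line above."""
    return f"libata.force={port}:{speed}"


if __name__ == "__main__":
    for link, speed in link_speeds().items():
        print(link, speed)
    print(force_option(7))
```

The port number still has to be matched to the physical drive by hand (or with the device id lookup in the shell script); this sketch only reports link speeds.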
It took me a long time to reach this solution. I basically approached the whole problem from scratch every other day.
The problems I found along the way were:
- I checked my SMART stats every day and compared them. The error count didn’t increase even though the exceptions kept being thrown.
- My SSD was actually the one causing the kernel exceptions. This script helped me a lot to understand which ATA device was actually which drive in the case.
- My SSD and two other drives were on a completely wrong speed setting (UDMA):
root@msa-nas1:~# sudo hdparm -I /dev/sd{a,b,c,d,e,f,g} | grep -i udma
DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
DMA: mdma0 mdma1 mdma2 udma0 udma1 *udma2 udma3 udma4 udma5 udma6
DMA: mdma0 mdma1 mdma2 udma0 *udma1 udma2 udma3 udma4 udma5 udma6
DMA: mdma0 mdma1 mdma2 udma0 udma1 *udma2 udma3 udma4 udma5 udma6
- The dmesg log showed some strange messages about 40-wire cables, even though those don’t really exist anymore. I bought two different NEW cables; nothing helped.
[ 1.193091] ata5.01: ATA-8: SanDisk SD6SF1M128G1022I, X231200, max UDMA/133
[ 1.193095] ata5.01: 250069680 sectors, multi 1: LBA48 NCQ (depth 0/32)
[ 1.193743] ata5.00: limited to UDMA/33 due to 40-wire cable
[ 1.193746] ata5.01: limited to UDMA/33 due to 40-wire cable
- An unexpected kernel driver was loaded for the last two drives: pata_atiixp. I was expecting the AHCI driver.
[ 1.022724] scsi4 : pata_atiixp
[ 1.022834] scsi5 : pata_atiixp
[ 1.022887] ata5: PATA max UDMA/100 cmd 0x1f0 ctl 0x3f6 bmdma 0xf100 irq 14
[ 1.022888] ata6: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0xf108 irq 15
- I checked the power consumption against the rating of the power supply; it was not exceeded. Not even close.
- I replaced the SSD with exactly the same model from another machine. Exactly the same model. Still the same errors.
- The SSD was in fact incredibly slow, so the UDMA output from hdparm was actually correct.
root@msa-nas1:~# hdparm -t -T /dev/sdf
/dev/sdf:
Timing cached reads: 2144 MB in 2.00 seconds = 1072.18 MB/sec
Timing buffered disk reads: 8 MB in 3.60 seconds = 2.22 MB/sec
I tried reaching out to SanDisk, since it was their drive giving me the exceptions, without any success. I could not find anyone with exactly the same problem, but many people had similar ones. In the end I tried a few of the suggested solutions, and it turned out to be a mix of a few things. Now it all makes perfect sense to me; hindsight is always clearer, I guess.
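The UDMA check from the list above is easy to automate. This is a small Python sketch, with hypothetical function names, that picks out the mode hdparm marks with an asterisk and flags drives below udma6. It assumes the DMA lines appear in the same order as the devices passed to hdparm, which is how the output above reads.

```python
import re

# The seven "DMA:" lines from the hdparm output quoted above, in order.
DMA_LINES = [
    "DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6",
    "DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6",
    "DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6",
    "DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6",
    "DMA: mdma0 mdma1 mdma2 udma0 udma1 *udma2 udma3 udma4 udma5 udma6",
    "DMA: mdma0 mdma1 mdma2 udma0 *udma1 udma2 udma3 udma4 udma5 udma6",
    "DMA: mdma0 mdma1 mdma2 udma0 udma1 *udma2 udma3 udma4 udma5 udma6",
]
DEVICES = ["/dev/sd" + letter for letter in "abcdefg"]


def active_dma_mode(line):
    """Return the mode marked with '*' in a hdparm 'DMA:' line, or None."""
    match = re.search(r"\*(\w+)", line)
    return match.group(1) if match else None


def slow_drives(devices, dma_lines, expected="udma6"):
    """Return {device: active mode} for drives not running at `expected`."""
    result = {}
    for device, line in zip(devices, dma_lines):
        mode = active_dma_mode(line)
        if mode != expected:
            result[device] = mode
    return result


if __name__ == "__main__":
    print(slow_drives(DEVICES, DMA_LINES))
```

On the output above this singles out /dev/sde, /dev/sdf and /dev/sdg, which matches the "SSD and two other drives" observation.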
-
pinkzebra
- Junior member
- Posts: 9
- Registered: 18 Apr 2017, 12:58
What is this error? Phy is bad on enclosure.
HARDWARE—
Controller: Controller0: LSI MegaRAID SAS 9280-8e(Bus 5,Dev 0,Domain 0)
Status: Optimal
Firmware Package Version:12.15.0-0239
Firmware Version: 2.130.403-4660
BBU: NO
Enclosure(s): 1
Drive(s): 13
Virtual Drive(s): 3
Enclosures—
PRODUCT NAME TYPE STATUS
SAS2X28 Ses OK
Drives—
CONNECTOR PRODUCT ID VENDOR ID STATE DISK TYPE CAPACITY POWER STATE
null x0 & null x0 ST1000NM0001 SEAGATE Online SAS 931.000 GB On
null x0 & null x0 ST1000NM0001 SEAGATE Online SAS 931.000 GB On
null x0 & null x0 ST1000NM0001 SEAGATE Online SAS 931.000 GB On
null x0 & null x0 ST1000NM0001 SEAGATE Online SAS 931.000 GB On
null x0 & null x0 ST1000NM0001 SEAGATE Online SAS 931.000 GB On
null x0 & null x0 ST1000NM0001 SEAGATE Online SAS 931.000 GB On
null x0 & null x0 ST1000NM0001 SEAGATE Online SAS 931.000 GB On
null x0 & null x0 MG03SCA200 TOSHIBA Online SAS 1.819 TB On
null x0 & null x0 MG03SCA200 TOSHIBA Online SAS 1.819 TB On
null x0 & null x0 ST1000NM00339ZM ATA Dedicated Hot Spare SATA 931.000 GB Powersave
null x0 & null x0 ST1000NM00339ZM ATA Dedicated Hot Spare SATA 931.000 GB Powersave
null x0 & null x0 MAXTORSTM316081 ATA Online SATA 148.531 GB On
null x0 & null x0 MAXTORSTM316081 ATA Online SATA 148.531 GB On
-
Stranger03
- Trinity staff
- Posts: 12979
- Registered: 14 Nov 2003, 16:25
- Location: St. Petersburg, Yekaterinburg
- Contact information:
Re: What is this error? Phy is bad on enclosure.
Post
Stranger03 » 25 Apr 2017, 09:08
pinkzebra
I don't see a log or the error in it. Is the error on one particular disk or on all of them?
-
pinkzebra
- Junior member
- Posts: 9
- Registered: 18 Apr 2017, 12:58
Re: What is this error? Phy is bad on enclosure.
Post
pinkzebra » 25 Apr 2017, 11:46
Stranger03 wrote:pinkzebra
I don't see a log or the error in it. Is the error on one particular disk or on all of them?
On the last 4 disks, the ones in ATA mode.
Everything works fine; it is just that right after boot this message appears and the red light comes on at the bay of each of the 4.
The 9 SAS disks behave themselves.
-
gs
- Trinity staff
- Posts: 16650
- Registered: 23 Aug 2002, 17:34
- Location: Moscow
- Contact information:
Re: What is this error? Phy is bad on enclosure.
Post
gs » 25 Apr 2017, 11:48
Actually, the controller is complaining about four ports. Are they all SATA drives? Are they on the compatibility list? There are also messages saying the disks were put into sleep mode, which does not always work smoothly either.
-
pinkzebra
- Junior member
- Posts: 9
- Registered: 18 Apr 2017, 12:58
Re: What is this error? Phy is bad on enclosure.
Post
pinkzebra » 25 Apr 2017, 13:46
Fine, not 4 disks but 4 ports.
Yes, these 4 disks are SATA, and two of them are on the compatibility list.
-
Stranger03
- Trinity staff
- Posts: 12979
- Registered: 14 Nov 2003, 16:25
- Location: St. Petersburg, Yekaterinburg
- Contact information:
Re: What is this error? Phy is bad on enclosure.
Post
Stranger03 » 25 Apr 2017, 13:47
pinkzebra wrote:Fine, not 4 disks but 4 ports.
Yes, these 4 disks are SATA, and two of them are on the compatibility list.
Try plugging in any SAS disk to rule out a faulty port or backplane.
-
pinkzebra
- Junior member
- Posts: 9
- Registered: 18 Apr 2017, 12:58
Re: What is this error? Phy is bad on enclosure.
Post
pinkzebra » 27 Apr 2017, 09:08
I inserted a SAS disk into an empty bay and into a bay in place of a SATA disk; in every case it started normally without this error.
Conclusion: it is specifically the disks with a SATA connector that cause this error…
As I understand it, the problem is in the expander? Does it need to be reflashed?
Code:
ID = 114
SEQUENCE NUMBER = 1365
TIME = 27-04-2017 10:48:47
LOCALIZED MESSAGE = Controller ID: 0 State change: PD = Port B:1:12 Previous = Unconfigured Bad Current = Unconfigured Good
ID = 247
SEQUENCE NUMBER = 1364
TIME = 27-04-2017 10:48:47
LOCALIZED MESSAGE = Controller ID: 0 Device inserted Device Type: Disk Device Id: 31
ID = 91
SEQUENCE NUMBER = 1363
TIME = 27-04-2017 10:48:47
LOCALIZED MESSAGE = Controller ID: 0 PD inserted: Port B:1:12
ID = 247
SEQUENCE NUMBER = 1362
TIME = 27-04-2017 10:48:47
LOCALIZED MESSAGE = Controller ID: 0 Device inserted Device Type: Disk Device Id: 29
ID = 91
SEQUENCE NUMBER = 1361
TIME = 27-04-2017 10:48:47
LOCALIZED MESSAGE = Controller ID: 0 PD inserted: Port B:1:10
ID = 185
SEQUENCE NUMBER = 1360
TIME = 27-04-2017 10:48:47
LOCALIZED MESSAGE = Controller ID: 0 Phy is bad on enclosure: 1 PHY 10
ID = 331
SEQUENCE NUMBER = 1359
TIME = 27-04-2017 10:47:42
LOCALIZED MESSAGE = Controller ID: 0 Power state change on PD = Port B:1:10 Previous = Powersave Current = On
ID = 114
SEQUENCE NUMBER = 1358
TIME = 27-04-2017 10:47:25
LOCALIZED MESSAGE = Controller ID: 0 State change: PD = Port B:1:10 Previous = Unconfigured Good Current = Unconfigured Bad
ID = 248
SEQUENCE NUMBER = 1357
TIME = 27-04-2017 10:47:25
LOCALIZED MESSAGE = Controller ID: 0 Device removed Device Type: Disk Device Id: 31
ID = 112
SEQUENCE NUMBER = 1356
TIME = 27-04-2017 10:47:25
LOCALIZED MESSAGE = Controller ID: 0 PD removed: Port B:1:10
ID = 289
SEQUENCE NUMBER = 1355
TIME = 27-04-2017 10:47:25
LOCALIZED MESSAGE = Controller ID: 0 Redundant path broken PD : Port A:1:10 Path : 1 SAS Address : 0x500000E01714D813
ID = 114
SEQUENCE NUMBER = 1354
TIME = 27-04-2017 10:47:12
LOCALIZED MESSAGE = Controller ID: 0 State change: PD = Port B:1:10 Previous = Unconfigured Bad Current = Unconfigured Good
ID = 113
SEQUENCE NUMBER = 1353
TIME = 27-04-2017 10:47:12
LOCALIZED MESSAGE = Controller ID: 0 Unexpected sense: PD = Port B:1:10Power on occurred, CDB = 0x28 0x00 0x08 0x8f 0xc1 0xcf 0x00 0x00 0x01 0x00 , Sense = 0x70 0x00 0x06 0x00 0x00 0x00 0x00 0x28 0x00 0x00 0x00 0x00 0x29 0x01 0x00 0x00 0x00 0x00 0x00 0x28 0x00 0x01 0x04 0x03 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x22 0x13 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
ID = 247
SEQUENCE NUMBER = 1352
TIME = 27-04-2017 10:47:12
LOCALIZED MESSAGE = Controller ID: 0 Device inserted Device Type: Disk Device Id: 31
ID = 91
SEQUENCE NUMBER = 1351
TIME = 27-04-2017 10:47:12
LOCALIZED MESSAGE = Controller ID: 0 PD inserted: Port B:1:10
ID = 331
SEQUENCE NUMBER = 1350
TIME = 27-04-2017 10:46:59
LOCALIZED MESSAGE = Controller ID: 0 Power state change on PD = Port B:1:9 Previous = Transition Current = On
ID = 331
SEQUENCE NUMBER = 1349
TIME = 27-04-2017 10:46:49
LOCALIZED MESSAGE = Controller ID: 0 Power state change on PD = Port B:1:9 Previous = Powersave Current = Transition
ID = 113
SEQUENCE NUMBER = 1348
TIME = 27-04-2017 10:46:49
LOCALIZED MESSAGE = Controller ID: 0 Unexpected sense: PD = Port B:1:7Power on occurred, CDB = 0x2e 0x00 0xe8 0xe0 0x62 0x6b 0x00 0x00 0x01 0x00 , Sense = 0x70 0x00 0x06 0x00 0x00 0x00 0x00 0x28 0x00 0x00 0x00 0x00 0x29 0x01 0x00 0x00 0x00 0x00 0x00 0x2e 0x01 0x08 0x17 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x19 0x19 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
ID = 113
SEQUENCE NUMBER = 1347
TIME = 27-04-2017 10:46:49
LOCALIZED MESSAGE = Controller ID: 0 Unexpected sense: PD = Port B:1:6Mode parameters changed, CDB = 0x2e 0x00 0x74 0x70 0x47 0x6b 0x00 0x00 0x01 0x00 , Sense = 0x70 0x00 0x06 0x00 0x00 0x00 0x00 0x0a 0x00 0x00 0x00 0x00 0x2a 0x01 0x00 0x00 0x00 0x00
ID = 113
SEQUENCE NUMBER = 1346
TIME = 27-04-2017 10:46:49
LOCALIZED MESSAGE = Controller ID: 0 Unexpected sense: PD = Port B:1:5Mode parameters changed, CDB = 0x2e 0x00 0x74 0x70 0x47 0x6b 0x00 0x00 0x01 0x00 , Sense = 0x70 0x00 0x06 0x00 0x00 0x00 0x00 0x0a 0x00 0x00 0x00 0x00 0x2a 0x01 0x00 0x00 0x00 0x00
ID = 114
SEQUENCE NUMBER = 1345
TIME = 27-04-2017 10:46:49
LOCALIZED MESSAGE = Controller ID: 0 State change: PD = Port B:1:10 Previous = Hot Spare Current = Unconfigured Bad
ID = 248
SEQUENCE NUMBER = 1344
TIME = 27-04-2017 10:46:49
LOCALIZED MESSAGE = Controller ID: 0 Device removed Device Type: Disk Device Id: 29
ID = 112
SEQUENCE NUMBER = 1343
TIME = 27-04-2017 10:46:49
LOCALIZED MESSAGE = Controller ID: 0 PD removed: Port B:1:10
ID = 114
SEQUENCE NUMBER = 1342
TIME = 27-04-2017 10:46:03
LOCALIZED MESSAGE = Controller ID: 0 State change: PD = Port B:1:11 Previous = Unconfigured Good Current = Unconfigured Bad
ID = 248
SEQUENCE NUMBER = 1341
TIME = 27-04-2017 10:46:03
LOCALIZED MESSAGE = Controller ID: 0 Device removed Device Type: Disk Device Id: 31
ID = 112
SEQUENCE NUMBER = 1340
TIME = 27-04-2017 10:46:03
LOCALIZED MESSAGE = Controller ID: 0 PD removed: Port B:1:11
ID = 289
SEQUENCE NUMBER = 1339
TIME = 27-04-2017 10:46:03
LOCALIZED MESSAGE = Controller ID: 0 Redundant path broken PD : Port A:1:11 Path : 1 SAS Address : 0x500000E01714D813
ID = 114
SEQUENCE NUMBER = 1338
TIME = 27-04-2017 10:45:51
LOCALIZED MESSAGE = Controller ID: 0 State change: PD = Port B:1:11 Previous = Unconfigured Bad Current = Unconfigured Good
ID = 247
SEQUENCE NUMBER = 1337
TIME = 27-04-2017 10:45:51
LOCALIZED MESSAGE = Controller ID: 0 Device inserted Device Type: Disk Device Id: 31
ID = 91
SEQUENCE NUMBER = 1336
TIME = 27-04-2017 10:45:51
LOCALIZED MESSAGE = Controller ID: 0 PD inserted: Port B:1:11
ID = 114
SEQUENCE NUMBER = 1335
TIME = 27-04-2017 10:45:21
LOCALIZED MESSAGE = Controller ID: 0 State change: PD = Port B:1:12 Previous = Unconfigured Good Current = Unconfigured Bad
ID = 248
SEQUENCE NUMBER = 1334
TIME = 27-04-2017 10:45:21
LOCALIZED MESSAGE = Controller ID: 0 Device removed Device Type: Disk Device Id: 31
ID = 112
SEQUENCE NUMBER = 1333
TIME = 27-04-2017 10:45:21
LOCALIZED MESSAGE = Controller ID: 0 PD removed: Port B:1:12
ID = 289
SEQUENCE NUMBER = 1332
TIME = 27-04-2017 10:45:20
LOCALIZED MESSAGE = Controller ID: 0 Redundant path broken PD : Port B:1:12 Path : 0 SAS Address : 0x500000E01714D812
ID = 247
SEQUENCE NUMBER = 1331
TIME = 27-04-2017 10:45:01
LOCALIZED MESSAGE = Controller ID: 0 Device inserted Device Type: Disk Device Id: 31
ID = 91
SEQUENCE NUMBER = 1330
TIME = 27-04-2017 10:45:01
LOCALIZED MESSAGE = Controller ID: 0 PD inserted: Port B:1:12
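A log in this four-line record format is easier to reason about once it is grouped by port. The sketch below is one possible way to do that in Python; the function names are invented for the example, the sample records are copied from the log above, and the parsing relies only on the "KEY = value" layout visible there.

```python
import re
from collections import Counter

# Four records copied from the log above.
SAMPLE = """ID = 185
SEQUENCE NUMBER = 1360
TIME = 27-04-2017 10:48:47
LOCALIZED MESSAGE = Controller ID: 0 Phy is bad on enclosure: 1 PHY 10
ID = 112
SEQUENCE NUMBER = 1356
TIME = 27-04-2017 10:47:25
LOCALIZED MESSAGE = Controller ID: 0 PD removed: Port B:1:10
ID = 112
SEQUENCE NUMBER = 1343
TIME = 27-04-2017 10:46:49
LOCALIZED MESSAGE = Controller ID: 0 PD removed: Port B:1:10
ID = 112
SEQUENCE NUMBER = 1340
TIME = 27-04-2017 10:46:03
LOCALIZED MESSAGE = Controller ID: 0 PD removed: Port B:1:11
"""


def parse_events(log_text):
    """Group 'KEY = value' lines into one dict per event; 'ID' starts a record."""
    events, current = [], {}
    for raw in log_text.splitlines():
        key, sep, value = raw.strip().partition(" = ")
        if not sep:
            continue  # not a KEY = value line
        if key == "ID" and current:
            events.append(current)
            current = {}
        current[key] = value
    if current:
        events.append(current)
    return events


def bad_phys(events):
    """Return (enclosure, phy) pairs from 'Phy is bad on enclosure' messages."""
    pattern = re.compile(r"Phy is bad on enclosure: (\d+) PHY (\d+)")
    found = []
    for event in events:
        match = pattern.search(event.get("LOCALIZED MESSAGE", ""))
        if match:
            found.append((int(match.group(1)), int(match.group(2))))
    return found


def removals_per_port(events):
    """Count 'PD removed' messages per port."""
    counts = Counter()
    for event in events:
        match = re.search(r"PD removed: (Port \S+)", event.get("LOCALIZED MESSAGE", ""))
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    parsed = parse_events(SAMPLE)
    print(bad_phys(parsed), dict(removals_per_port(parsed)))
```

Counting removals per port makes it easier to see whether the drop-outs follow particular bays or particular drives when disks are swapped around.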
-
Stranger03
- Trinity staff
- Posts: 12979
- Registered: 14 Nov 2003, 16:25
- Location: St. Petersburg, Yekaterinburg
- Contact information:
Re: What is this error? Phy is bad on enclosure.
Post
Stranger03 » 27 Apr 2017, 09:16
pinkzebra wrote:As I understand it, the problem is in the expander? Does it need to be reflashed?
More likely it needs to be replaced, or else use NL SAS disks.
-
gs
- Trinity staff
- Posts: 16650
- Registered: 23 Aug 2002, 17:34
- Location: Moscow
- Contact information:
Re: What is this error? Phy is bad on enclosure.
Post
gs » 27 Apr 2017, 10:54
I somewhat doubt that the interface type itself is the issue. More likely these particular SATA drives are incompatible with the controller.
-
Stranger03
- Trinity staff
- Posts: 12979
- Registered: 14 Nov 2003, 16:25
- Location: St. Petersburg, Yekaterinburg
- Contact information:
Re: What is this error? Phy is bad on enclosure.
Post
Stranger03 » 27 Apr 2017, 11:25
gs wrote:I somewhat doubt that the interface type itself is the issue. More likely these particular SATA drives are incompatible with the controller.
You can check by connecting them directly, without the backplane.
Author | Message | |||
---|---|---|---|---|
vensant_jarden |
|
Member Status: Offline |
Sania. Got it. So I should just install it and, if no obvious problems come up, keep an eye on things over time. |
Sania. |
|
Member Status: Offline |
Yes, a silly question. If you install a driver for a video card, does that stop the card from working? If a driver could cause that, don't you think you would have been warned, and that everyone else would have sued Intel over such a broken driver? |
Sinestery |
|
Junior Status: Offline |
Tomset wrote: It's dead, and its SMART has clearly been wiped. What exactly is wrong with it? img Attachment:
|
Sania. |
|
Member Status: Offline |
Sinestery wrote: What exactly is wrong with it? Half of the SMART readout is nonsense. Use a current SMART-reading program. |
kolyan1980-08-11 |
|
Member Status: Offline |
userID I'm not claiming anything, but it did help me once. |
RuckusDJ |
|
||
Junior Status: Offline |
Hello!
|
fixit |
|
Member Status: Offline |
RuckusDJ wrote: Can I safely throw the disk away? Now do a remap under DOS. |
Sania. |
|
Member Status: Offline |
Give it some cooling; it is heating up to 52 degrees right now, which is very bad. |
RuckusDJ |
|
Junior Status: Offline |
Sania. |
Sania. |
|
Member Status: Offline |
In any case, don't let the disk overheat. |
7Gluk7 |
|
Junior Status: Offline |
Good day, everyone! Code: ID Attribute description Threshold Value Worst Data Status Attachment:
I did not save the read graph, but it was flat at 540 MB/s. Last edited by 7Gluk7 on 20.01.2020 15:24; edited 1 time in total. |
Sania. |
|
Member Status: Offline |
7Gluk7 wrote: What can you advise? Wipe the disk down to zero and run this test from another disk. |
O Smirnoff |
|
Member Status: Offline |
Sania. wrote: Wipe the disk down to zero. Secure Erase I understand; but "down to zero" means what exactly, why, and for whom?.. |
Sania. |
|
Member Status: Offline |
Deleting the MBR would probably be enough there, but the author can do more if he likes; the main thing is that the disk ends up empty. |
O Smirnoff |
|
Member Status: Offline |
Sania. wrote: Deleting the MBR would be enough. Oh, so that is exactly what Sania. wrote: Wipe the disk down to zero means? |
Sania. |
|
Member Status: Offline |
|
7Gluk7 |
|
Junior Status: Offline |
Sania. wrote: Wipe the disk down to zero and run this test from another disk. Doesn't AIDA64 fill the disk with zeros during its write test anyway? |
O Smirnoff |
|
Member Status: Offline |
Sania. wrote:
Not » «; write in clear terms already, otherwise your verbiage only leads people astray… Added 46 seconds later: 7Gluk7 wrote: Doesn't AIDA64 fill the disk with zeros during its write test anyway? Secure Erase is still the better option. |
Sania. |
|
Member Status: Offline |
7Gluk7 wrote: Doesn't AIDA64 fill the disk with zeros during its write test anyway? Not on a non-empty disk, one filled not with zeros and ones but with actual files, which Windows will not let AIDA overwrite, so that you don't come complaining that half your disk with your photos has vanished somewhere. Added 2 minutes 7 seconds later: O Smirnoff wrote: Not » «; write in clear terms already, otherwise your verbiage only leads people astray… It means less typing this way, and I do need to find out how knowledgeable the person asking is. |
7Gluk7 |
|
Junior Status: Offline |
O Smirnoff wrote: Secure Erase is still the better option. I'll try it. Sania. wrote: Not on a non-empty disk, one filled not with zeros and ones but with actual files, which Windows will not let AIDA overwrite, so that you don't come complaining that half your disk with your photos has vanished somewhere. I'm running from a LiveUSB, and I have moved Windows to a VHD for now. Last edited by 7Gluk7 on 20.01.2020 15:49; edited 1 time in total. |
-
#1
I have a CRON job running on my TrueNAS that watches a few of the key SMART
parameters on my boot drives. The count on each of the following parameters:
168|SATA_Phy_Error_Count
218|CRC_Error_Count
incremented by 1 on each of July 13, 14, 15, 19 and 21.
The counts are not super high:
168|SATA_Phy_Error_Count|32
218|CRC_Error_Count|32
but I’m pretty sure some sort of preemptive maintenance is in order.
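A watcher like the cron job described above essentially compares two snapshots of the counters. Here is a minimal Python sketch of that comparison, assuming snapshots are stored one attribute per line in the same ID|NAME|VALUE form shown above. The function names and the "yesterday" values are illustrative and not taken from the actual job.

```python
def parse_snapshot(text):
    """Parse 'ID|NAME|VALUE' lines into {(id, name): value}; skip other lines."""
    counts = {}
    for line in text.splitlines():
        parts = line.strip().split("|")
        if len(parts) == 3 and parts[0].isdigit() and parts[2].isdigit():
            counts[(int(parts[0]), parts[1])] = int(parts[2])
    return counts


def increments(previous, current):
    """Return {(id, name): delta} for every counter that went up."""
    changed = {}
    for key, value in current.items():
        delta = value - previous.get(key, 0)
        if delta > 0:
            changed[key] = delta
    return changed


if __name__ == "__main__":
    # Hypothetical earlier snapshot; the later one uses the counts quoted above.
    yesterday = parse_snapshot("168|SATA_Phy_Error_Count|31\n218|CRC_Error_Count|31")
    today = parse_snapshot("168|SATA_Phy_Error_Count|32\n218|CRC_Error_Count|32")
    for (attr_id, name), delta in increments(yesterday, today).items():
        print(f"{attr_id} {name} increased by {delta}")
```

Logging the timestamp of each increment alongside system events (scrubs, heavy writes, reboots) is one way to narrow down an intermittent fault.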
My boot pool is a mirror of two budget 120GB SSDs running off of
SATA ports on the Motherboard. I have the system database on the
boot pool since I want the system to be functional without the data
pool if I want to troubleshoot the system with the data pool drives
removed.
The drive showing the errors is a KINGSTON Model# SA400S37120G
(Smart Info at the end of this post.)
The other drive is older and is an HP S700 120GB SSD that seems to be
fine.
IIUC this could be a drive problem, a cable problem, a (Motherboard) SATA
Port problem or a power supply problem.
My question is how to troubleshoot given the intermittent nature of the
problem. Any suggestions would be much appreciated.
DMESG entries pertaining to the fault.
Code:
(ada3:ahcich5:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 d0 20 c3 7c 40 04 00 00 00 00 00
(ada3:ahcich5:0:0:0): CAM status: Uncorrectable parity/CRC error
(ada3:ahcich5:0:0:0): Retrying command, 3 more tries remain
(ada3:ahcich5:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 30 08 cb b3 40 04 00 00 00 00 00
(ada3:ahcich5:0:0:0): CAM status: Uncorrectable parity/CRC error
(ada3:ahcich5:0:0:0): Retrying command, 3 more tries remain
(ada3:ahcich5:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 20 20 9b ae 40 04 00 00 00 00 00
(ada3:ahcich5:0:0:0): CAM status: Uncorrectable parity/CRC error
SMART Output for drive:
smartctl -x /dev/ada3
smartctl 7.2 2020-12-30 r5155 [FreeBSD 12.2-RELEASE-p14 amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Phison Driven SSDs
Device Model: KINGSTON SA400S37120G
Serial Number: REDACTED
LU WWN Device Id: 5 0026b7 782ea1dc1
Firmware Version: S3500102
User Capacity: 120,034,123,776 bytes [120 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
TRIM Command: Available
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-3 T13/2161-D revision 4
SATA Version is: SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Fri Jul 22 03:23:00 2022 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is: Unavailable
APM feature is: Disabled
Rd look-ahead is: Enabled
Write cache is: Enabled
DSN feature is: Unavailable
ATA Security is: Disabled, frozen [SEC2]
Wt Cache Reorder: Unavailable
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x02) Offline data collection activity was completed without error. Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run.
Total time to complete Offline data collection: ( 120) seconds.
Offline data collection capabilities: (0x11) SMART execute Offline immediate. No Auto Offline data collection support. Suspend Offline collection upon new command. No Offline surface scan supported. Self-test supported. No Conveyance Self-test supported. No Selective Self-test supported.
SMART capabilities: (0x0002) Does not save SMART data before entering power-saving mode. Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported. General Purpose Logging supported.
Short self-test routine recommended polling time: ( 2) minutes.
Extended self-test routine recommended polling time: ( 10) minutes.
SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS  VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     -O--CK 100   100   000    -    100
  9 Power_On_Hours          -O--CK 100   100   000    -    21839
 12 Power_Cycle_Count       -O--CK 100   100   000    -    31
148 Unknown_Attribute       ------ 100   100   000    -    0
149 Unknown_Attribute       ------ 100   100   000    -    0
167 Write_Protect_Mode      ------ 100   100   000    -    0
168 SATA_Phy_Error_Count    -O--C- 100   100   000    -    33
169 Bad_Block_Rate          ------ 100   100   000    -    0
170 Bad_Blk_Ct_Erl/Lat      ------ 100   100   010    -    0/0
172 Erase_Fail_Count        -O--CK 100   100   000    -    0
173 MaxAvgErase_Ct          ------ 100   100   000    -    0
181 Program_Fail_Count      -O--CK 100   100   000    -    0
182 Erase_Fail_Count        ------ 100   100   000    -    0
187 Reported_Uncorrect      -O--CK 100   100   000    -    0
192 Unsafe_Shutdown_Count   -O--C- 100   100   000    -    19
194 Temperature_Celsius     -O---K 044   062   000    -    44 (Min/Max 31/62)
196 Reallocated_Event_Count -O--CK 100   100   000    -    0
199 SATA_CRC_Error_Count    -O--CK 100   100   000    -    0
218 CRC_Error_Count         -O--CK 100   100   000    -    33
231 SSD_Life_Left           ------ 090   090   000    -    90
233 Flash_Writes_GiB        -O--CK 100   100   000    -    8865
241 Lifetime_Writes_GiB     -O--CK 100   100   000    -    12051
242 Lifetime_Reads_GiB      -O--CK 100   100   000    -    2641
244 Average_Erase_Count     ------ 100   100   000    -    202
245 Max_Erase_Count         ------ 100   100   000    -    222
246 Total_Erase_Count       ------ 100   100   000    -    40787
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning
General Purpose Log Directory Version 1
SMART Log Directory Version 1 [multi-sector log support]
Address   Access R/W Size Description
0x00      GPL,SL R/O 1    Log Directory
0x01      SL     R/O 1    Summary SMART error log
0x02      SL     R/O 1    Comprehensive SMART error log
0x03      GPL    R/O 1    Ext. Comprehensive SMART error log
0x04      GPL,SL R/O 8    Device Statistics log
0x06      SL     R/O 1    SMART self-test log
0x07      GPL    R/O 1    Extended self-test log
0x10      GPL    R/O 1    NCQ Command Error log
0x11      GPL    R/O 1    SATA Phy Event Counters log
0x30      GPL,SL R/O 9    IDENTIFY DEVICE data log
0x80-0x9f GPL,SL R/W 16   Host vendor specific log
0xde      GPL    VS  8    Device vendor specific log
SMART Extended Comprehensive Error Log Version: 1 (1 sectors)
Device Error Count: 33 (device log contains only the most recent 4 errors)
CR = Command Register
FEATR = Features Register
COUNT = Count (was: Sector Count) Register
LBA_48 = Upper bytes of LBA High/Mid/Low Registers ] ATA-8
LH = LBA High (was: Cylinder High) Register ] LBA
LM = LBA Mid (was: Cylinder Low) Register ] Register
LL = LBA Low (was: Sector Number) Register ]
DV = Device (was: Device/Head) Register
DC = Device Control Register
ER = Error register
ST = Status register
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 33 [0] log entry is empty
Error 32 [3] log entry is empty
Error 31 [2] occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were: ER -- ST COUNT LBA_48 LH LM LL DV DC -- -- -- == -- == == == -- -- -- -- -- 04 -- 51 00 00 00 00 00 00 00 00 40 00 Error: ABRT Commands leading to the command that caused the error were: CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name -- == -- == -- == == == -- -- -- -- -- --------------- -------------------- b0 00 d1 01 01 00 00 4f 00 c2 01 40 08 00:00:00.000 SMART READ ATTRIBUTE THRESHOLDS [OBS-4] 2f 00 00 01 01 00 00 00 00 00 03 40 08 00:00:00.000 READ LOG EXT 2f 00 00 01 01 00 00 00 00 00 00 40 08 00:00:00.000 READ LOG EXT b0 00 d5 01 01 00 00 4f 00 c2 00 40 08 00:00:00.000 SMART READ LOG b0 00 da 00 00 00 00 4f 00 c2 00 40 08 00:00:00.000 SMART RETURN STATUS Error 30 [1] occurred at disk power-on lifetime: 0 hours (0 days + 0 hours) When the command that caused the error occurred, the device was active or idle. After command completion occurred, registers were: ER -- ST COUNT LBA_48 LH LM LL DV DC -- -- -- == -- == == == -- -- -- -- -- 04 -- 51 00 00 00 00 00 00 00 00 40 00 Error: ABRT Commands leading to the command that caused the error were: CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name -- == -- == -- == == == -- -- -- -- -- --------------- -------------------- b0 00 d1 01 01 00 00 4f 00 c2 01 40 08 00:00:00.000 SMART READ ATTRIBUTE THRESHOLDS [OBS-4] 2f 00 00 01 01 00 00 00 00 00 03 40 08 00:00:00.000 READ LOG EXT 2f 00 00 01 01 00 00 00 00 00 00 40 08 00:00:00.000 READ LOG EXT b0 00 d5 01 01 00 00 4f 00 c2 00 40 08 00:00:00.000 SMART READ LOG b0 00 da 00 00 00 00 4f 00 c2 00 40 08 00:00:00.000 SMART RETURN STATUS SMART Extended Self-test Log Version: 1 (1 sectors) Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error # 1 Extended offline Completed without error 00% 21598 - # 2 Extended offline Completed without error 00% 21591 - # 3 Extended offline Completed without error 00% 20707 - # 4 Extended offline Completed without error 00% 
18218 - # 5 Extended offline Completed without error 00% 13958 - # 6 Extended offline Completed without error 00% 7400 - # 7 Extended offline Completed without error 00% 6975 - # 8 Extended offline Completed without error 00% 1348 - # 9 Extended offline Completed without error 00% 0 - #10 Short offline Completed without error 00% 0 - Selective Self-tests/Logging not supported SCT Commands not supported Device Statistics (GP Log 0x04) Page Offset Size Value Flags Description 0x01 ===== = = === == General Statistics (rev 1) == 0x01 0x008 4 31 --- Lifetime Power-On Resets 0x01 0x010 4 21839 --- Power-on Hours 0x01 0x018 6 3799441290 --- Logical Sectors Written 0x01 0x020 6 9010952 --- Number of Write Commands 0x01 0x028 6 1245637882 --- Logical Sectors Read 0x01 0x030 6 1333687 --- Number of Read Commands 0x07 ===== = = === == Solid State Device Statistics (rev 1) == 0x07 0x008 1 22 --- Percentage Used Endurance Indicator |||_ C monitored condition met ||__ D supports DSN |___ N normalized value Pending Defects log (GP Log 0x0c) not supported SATA Phy Event Counters (GP Log 0x11) ID Size Value Description 0x0001 4 7 Command failed due to ICRC error 0x0002 4 7 R_ERR response for data FIS 0x0005 4 0 R_ERR response for non-data FIS 0x000a 4 18 Device-to-host register FISes sent due to a COMRESET
-
#2
The count on each of the following parameters:
168|SATA_Phy_Error_Count
218|CRC_Error_Count
incremented by 1 on each of July 13, 14, 15, 19 and 21.
Are those linked to either the dates of SMART tests or scrubs?
The drive showing the errors is a KINGSTON Model# SA400S37120G
(Smart Info at the end of this post.)
The other drive is older and is an HP S700 120GB SSD that seems to be fine.
IIUC this could be a drive problem, a cable problem, a (motherboard) SATA port problem or a power supply problem.
My question is how to troubleshoot given the intermittent nature of the problem. Any suggestions would be much appreciated.
You also didn’t mention the 100 read errors reported by SMART… those are from the drive itself, so indicate some level of failure unrelated to cabling.
The CRC errors can be the controller on the drive, the cabling or the SATA controller, so as you say, hard to narrow down unless something obvious like a loose connection or burning smell from the controller chip.
I would generally treat the drive as untrustworthy and consider living with a single boot device (keeping config backups just in case).
-
#3
Are those linked to either the dates of SMART tests or scrubs?
I don’t think so… no way of finding out. I don’t run regularly scheduled SMART tests, so it’s not likely a SMART test. As for scrubs, I know the system does one every few days… I’m not sure what the default config is set to, but from another report I’m pretty sure the last two issues were not during a scrub. The one on the 21st certainly wasn’t.
The report comes from a cron job that I run daily. It runs smartctl -a, compares a set of values with the ones from the previous day, and, if they don’t match, produces a report showing the old and new values. I wrote the script to alert me to just this type of situation. I am not getting any alerts from TrueNAS, just the report I produce.
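For anyone wanting to do something similar, here is a minimal Python sketch of that kind of day-over-day comparison. It is not the actual script described above. It assumes the brief attribute table layout shown in the smartctl -x output earlier in this thread (ID, name, flags, value, worst, thresh, fail, raw); the table printed by smartctl -a has different columns and would need a different pattern. The function names and the "yesterday" values in the usage note are made up for illustration.

```python
import re

# One row of the brief attribute table as printed by `smartctl -x`:
# ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
# (assumed layout; `smartctl -a` prints a different set of columns)
ROW = re.compile(
    r"^\s*(\d{1,3})\s+(\S+)\s+([-POSRCK+]{6})\s+(\d+)\s+(\d+)\s+(\d+|---)\s+(\S+)\s+(.+?)\s*$"
)


def parse_attributes(report: str) -> dict[int, tuple[str, str]]:
    """Return {attribute id: (name, raw value as text)} for every table row found."""
    attrs = {}
    for line in report.splitlines():
        m = ROW.match(line)
        if m:
            attrs[int(m.group(1))] = (m.group(2), m.group(8))
    return attrs


def diff_attributes(old, new, watch=None):
    """List 'id name: old -> new' for raw values that changed.

    `watch` optionally restricts the comparison to a set of attribute ids,
    so that values expected to change daily (temperature, power-on hours)
    do not trigger a report.
    """
    changes = []
    for attr_id in sorted(new):
        if watch is not None and attr_id not in watch:
            continue
        name, raw_new = new[attr_id]
        raw_old = old.get(attr_id, (name, None))[1]
        if raw_old != raw_new:
            changes.append(f"{attr_id:>3} {name}: {raw_old} -> {raw_new}")
    return changes
```

Usage would be along the lines of `diff_attributes(parse_attributes(yesterday_text), parse_attributes(today_text), watch={168, 199, 218})`, where the two texts are saved smartctl reports from consecutive days.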
You also didn’t mention the 100 read errors reported by SMART… those are from the drive itself, so indicate some level of failure unrelated to cabling.
Sorry what 100 read errors???? What am I missing. Are you confusing «Raw_Read_Error_Rate» with read errors?
The CRC errors can be the controller on the drive, the cabling or the SATA controller, so as you say, hard to narrow down unless something obvious like a loose connection or burning smell from the controller chip.
I would generally treat the drive as untrustworthy and consider living with a single boot device (keeping config backups just in case).
I hadn’t thought of the controller chip on the drive. I’ll keep an eye on it for now, and an eye out for a sale on a replacement drive. SSDs are pretty cheap… about what a good USB drive used to cost back in the day. When I get a moment I will likely open the box and pull all the cables and reseat them, just in case the contacts have oxidized.
-
#4
Sorry what 100 read errors???? What am I missing. Are you confusing «Raw_Read_Error_Rate» with read errors?
OK, so it’s not a count of read errors… but it’s also not OK…
That should be 120 (not 100) until something is wrong.
-
#5
OK, so it’s not a count of read errors… but it’s also not OK…
That should be 120 (not 100) until something is wrong.
Thanks for the reply…. Great idea, wrong data sheet…. Different drives have slightly different interpretations.
I didn’t know Kingston published this info. AFAIK Western Digital doesn’t, so I didn’t even think to look. I did some additional searching, which led me to a smartmontools page:
https://www.smartmontools.org/ticket/801
which led me to the correct datasheet:
https://media.kingston.com/support/downloads/MKP_521_Phison_SMART_attribute.pdf
Here are the descriptions for the drive in question:
001 Read Error Rate
Counts the number of uncorrectable errors that accumulate when controller
reads data from Flash and ECC events occur.
168 SATA PHY Error Count
Counts the number of SATA PHY errors. This value includes all PHY error
counts, e.g. data FIS CRC, code errors, disparity errors, command FIS CRC.
Value clears upon system power-down.
218 CRC Error Count
Counts the number of CRC error (read/write data FIS CRC error).
I’m not sure what to think about Read Error Rate — IIUC as the drive wears out, there will be errors, and the drive «handles» them. Since the drive has 90% life left, I would think that there would have been a few errors — but I may well be wrong, and would welcome someone correcting me if I am.
Other than reseating or changing the cables, or swapping the drive, is there any meaningful troubleshooting to be done?
DISK
DISK Displays information about the disks in the system.
This command is used to change the configuration settings for the disks
in the system and monitor the status of the disk channels. The command
will display the current disk configuration settings and the status of
each disk channel. The INFO= parameter can be used to display all of
the information about a disk in the system. The LIST parameter will
display a list of the disks installed in the system and indicate how
many were found.
- AGINGLIMIT=x|OFF
- Sets the maximum time a command should wait in the disk command
queue.
This parameter is for Hitachi SAS drives only.
Each unit of this timer is 50 ms, where 0 is 50 ms.
Range: 0 to 4 (50 to 250 milliseconds) or OFF.
Default is 3 (200 milliseconds).
- AUTOREASSIGN=ON|OFF
- Allows the user to turn on or off whether bad blocks will be
reassigned when a medium error occurs on a healthy tier.
Default is ON.
- CMD_TIMEOUT=x
- This parameter sets the retry disk timeout (in seconds) for an I/O
request. The retry timeout value indicates the maximum amount of
time that is allotted to receive a reply for each retry of an I/O
request. If the I/O request does not complete within this time, it is
aborted and potentially retried: if there is still time remaining in
the overall disk timeout to allow for another retry, it is retried;
if not, it completes with an error status.
This parameter must be smaller than or equal to TIMEOUT.
Valid range is 1 to 512 seconds.
Recommended value for SAS drives is 11 seconds.
Recommended value for SATA drives is 31 seconds. Setting the timeout
below the recommended values can cause disk failures.
Default is 31 seconds.
- DEFECTLIST[=tc]
- Allows the user to display the number of defects in the defect list
for the specified disk. The defect list contains all the physical
sectors on the disk that the drive has identified as bad, and to
which the disk’s hardware prevents access. The list is classified
into two types: the permanent list and the grown list. The permanent
list consists of the bad sectors that are identified by the disk
manufacturer; the grown list consists of the bad sectors that are
found after the disk has left the factory (and which can be added to
at any time).
The disk is specified by its tier and channel locations, ‘tc’, where:
‘t’ indicates the tier in the range <1..128>, and
‘c’ indicates the channel in the range <ABCDEFGHPS>.
- DIAG[=tc]
- Performs a series of diagnostics tests on the specified disk.
The disk is specified by its physical tier and channel locations,
‘tc’, where:
‘t’ indicates the tier in the range <1..128>, and
‘c’ indicates the channel in the range <ABCDEFGHPS>.
- FAIL[=tc]
- This parameter tells the system to fail the specified disk at the
physical tier and channel locations indicated by ‘tc’, where:
‘t’ indicates the tier in the range <1..128>, and
‘c’ indicates the channel in the range <ABCDEFGHPS>.
When a non-SPARE disk is specified:
If failing the disk won’t cause a multi-channel failure, the disk
is marked as failed, and an attempt is made to replace it with a
spare disk.
When a SPARE disk is specified:
If the spare disk is currently in use as a replacement for a
failed disk, then the disk that the spare is replacing is put back to
a failed status, and the spare is released, but it is marked as
unhealthy and unavailable.
- FAST_FAIL=[ON|OFF]
- This parameter turns on/off the fast fail mode for disks that are
slow to respond to data access commands. The fast fail parameters can
be customized to a particular need. Default is OFF.
- FAST_FAIL_THRESHOLD=’num cmds’
- This parameter indicates how many consecutive commands in the fast
fail algorithm must occur before failing the drive for this reason.
The default value is 5.
Valid range = 2 to 20.
- FAST_FAIL_WINDOW_END=’t’
- This parameter indicates the timeout in seconds for when a disk
response is received outside of a window in the future. If the
command finishes outside of this time value, it is not aggregated in
the slow disk algorithm as it is considered a separate instance of
the event and the counter will restart. The default value is 90
seconds.
Valid range = 3 to 180.
- FAST_FAIL_WINDOW_START=’t’
- This parameter indicates the timeout in seconds for when a disk
response is considered slow and will count against the drive in the
slow disk fail algorithm. The default value is 5 seconds.
Valid range = 2 to 179.
- INFO[=tc]
- This parameter displays the information and status about a specific
disk in the system.
The disk is specified by its physical tier and channel locations,
‘tc’, where:
‘t’ indicates the tier in the range <1..128>, and
‘c’ indicates the channel in the range <ABCDEFGHPS>.
- LIST[=SAS_ID|SPEED]
- This parameter displays a list of all the disks installed in the
system and indicates how many were found of each type.
The optional SAS_ID parameter will display the SAS ID of the device
instead of the serial number.
The optional SPEED parameter will display the link speed of the
device instead of the RPM.
- LLFORMAT[=tc]
- Allows the user to perform a low level format of a disk drive.
The disk is specified by its tier and channel locations, ‘tc’, where:
‘t’ indicates the tier in the range <1..128>, and
‘c’ indicates the channel in the range <ABCDEFGHPS>.
- MAXCMDS=x
- Sets the maximum command queue depth to a tier of disks.
Range: 1 to 32 commands per tier.
Default: 16 commands.
Setting should be as follows:
16 if any SATA drives are used;
32 for everything else.
- MAXREADLEN=x
- Sets the maximum read command length for SATA drives in KiB.
This parameter is used to increase throughput on systems with a large
number of SATA tiers by reducing the contention for the SAS lanes.
128K is the recommended setting for systems with 16 tiers or more of
SATA disks.
2048K is the recommended setting for systems with SAS disks.
Range is 128 to 2048.
Default is 128.
- MAXWRITELEN=x
- Sets the maximum write command length to the drives in KiB.
This parameter is provided for testing only and should normally not
be changed.
Range is 128 to 2048.
Default is 2048.
- PLS[=[t][c]]
-
Requests/displays the PHY Link Error Status Block information for the
specified drive. Note that SATA and SAS drives report PHY errors
differently. The PHY information consists of the following items:
SATA AAMUX PHY errors:
H-RX    Number of SATA FIS CRC errors received on the host port of the AAMUX.
H-TX    Number of SATA R_ERR primitives received on the host port, indicating a problem with the transmitter of the AAMUX.
H-Link  Number of times the PHY has lost link on the host port.
H-Disp  Number of frame errors for the host port of the AAMUX. These include code error, disparity error, or realignment.
O-RX    Number of SATA FIS CRC errors received on the other host port of the AAMUX.
O-TX    Number of SATA R_ERR primitives received on the other host port, indicating a problem with the transmitter of the AAMUX.
O-Link  Number of times the PHY has lost link on the other host port.
O-Disp  Number of frame errors for the other host port of the AAMUX. These include code error, disparity error, or realignment.
D-RX    Number of SATA FIS CRC errors received on the device port of the AAMUX.
D-TX    Number of SATA R_ERR primitives received on the device port, indicating a problem with the transmitter of the AAMUX.
D-Link  Number of times the PHY has lost link on the device port.
D-Disp  Number of frame errors for the device port of the AAMUX. These include code error, disparity error, or realignment.
SAS PHY errors:
InvDW   Invalid DWORD count: the number of invalid dwords received outside of the PHY reset sequence.
RunDis  Running disparity count: the number of dwords containing running disparity errors received outside of the PHY reset sequence.
LDWSYN  Loss of DWORD synchronization count: the number of times the PHY has lost dword synchronization and restarted the link reset sequence.
PHYRES  PHY reset problem count: the number of times the PHY reset sequence has failed.
The disk is specified by its physical tier and channel locations,
‘tc’, where:
‘t’ indicates the tier in the range <1..128>, and
‘c’ indicates the channel in the range <ABCDEFGHPS>.
If neither the tier nor the channel are specified, the PLS
information is requested from all drives.
If only the tier is specified, the PLS information is requested from
all the drives on the specified tier.
- PMBIT=ON|OFF
- When ON this parameter sets the PM (performance mode) bit in Seagate
SAS drives mode pages. When OFF the Seagate drive uses its default
performance mode settings.
Default is OFF.
- QUARANTINE
- Displays the number of quarantine events on this controller for each
disk in the system. Only tiers with quarantine counts will be
displayed.
Use QUARANTINECLEAR to reset the quarantine counts.
- QUARANTINE=[ON|OFF]
- Enables/disables the disk quarantine feature for all of the disks. A
disk cannot be quarantined unless FASTAV is enabled for the LUN.
Default is OFF.
- QUARANTINECLEAR
- Resets the quarantine counts for all of the disks.
- QUARANTINECMDLIMIT=x
- Sets the maximum number of outstanding disk commands after a good
response before a quarantined disk can be put back into service.
Range 0 to 32 where 0 indicates no delay before putting the disk back
into service.
Default is 0.
- QUARANTINETIMEOUT=x
- Sets the minimum timeout before a disk can be quarantined in 16.6
millisecond increments. A disk cannot be quarantined unless FASTAV is
enabled and has timed out on the command.
Range 6 to 65535.
Default is 12 (200 milliseconds).
- REASSIGN[=tc] [0xh
- Allows for the reassigning of defective logical blocks on a disk to
an area of the disk reserved for this purpose.
The disk is specified by its tier and channel locations, ‘tc’, where:
‘t’ indicates the tier in the range <1..128>, and
‘c’ indicates the channel in the range <ABCDEFGHPS>.
0xh is the hexadecimal value of the LBA (Logical Block Address) to be
reassigned.
- REBUILD[=tc]|ALL
-
This parameter tells the system to start a rebuild operation on a
(presumably) already failed disk. A rebuild operation restores a
failed disk to a healthy status once it completes. Note that this
operation can take several hours to complete depending on the size of
the disk and the speed of the rebuild operation. The speed of the
rebuild operation can be adjusted with the DELAY and EXTENT
parameters of the TIER command.
In addition, the rebuild operation can be stopped, or paused and
resumed with the TIER STOP, TIER PAUSE, and TIER RESUME commands.
The TIER AUTOREBUILD command can be used to automate the rebuild
process.
Note that SPARE disks are handled slightly differently from other
disks, in that SPARES that are not in use as an active replacement
for a failed disk elsewhere in the system are simply returned to a
normal healthy status by this command; SPAREs that are in use are
already considered healthy and are not rebuilt.
The failed disk to be rebuilt is specified by its physical tier and
channel locations, ‘tc’, where:
‘t’ indicates the tier in the range <1..128>, and
‘c’ indicates the channel in the range <ABCDEFGHPS>.
All failed and replaced disks can be rebuilt using the ALL parameter.
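The ‘tc’ designator recurs throughout this command reference, so a script that builds DISK commands may want to validate it first. The following Python sketch is one way to do that. The ranges (tier 1..128, channel one of ABCDEFGHPS) come from the text above; the written form, a tier number immediately followed by the channel letter (for example 12C), is an assumption, since the reference does not show an example, and `parse_tc` is a hypothetical helper name.

```python
import re

# Channel letters accepted by most DISK parameters, per the reference text.
CHANNELS = "ABCDEFGHPS"

# ASSUMPTION: a designator is the tier number directly followed by one
# channel letter, e.g. "12C". The reference does not spell out the syntax.
TC_PATTERN = re.compile(r"^(\d{1,3})([A-Z])$")


def parse_tc(spec: str) -> tuple[int, str]:
    """Split a 'tc' designator into (tier, channel) or raise ValueError."""
    match = TC_PATTERN.match(spec.strip().upper())
    if match is None:
        raise ValueError(f"not a tier/channel designator: {spec!r}")
    tier = int(match.group(1))
    channel = match.group(2)
    if not 1 <= tier <= 128:
        raise ValueError(f"tier {tier} is outside the range 1..128")
    if channel not in CHANNELS:
        raise ValueError(f"channel {channel!r} is not one of {CHANNELS}")
    return tier, channel
```

Note that REPLACE accepts only the channels ABCDEFGHP (no spare channel), so a caller building that command would need an additional check.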
- REBUILDNOJOURNAL[=tc]|ALL
-
This parameter tells the system to start a rebuild operation on a
(presumably) already failed disk without using the journal. A
rebuild operation restores a failed disk to a healthy status once it
completes. Note that this operation can take several hours to
complete depending on the size of the disk and the speed of the
rebuild operation. The speed of the rebuild operation can be
adjusted with the DELAY and EXTENT parameters of the TIER command.
In addition, the rebuild operation can be stopped, or paused and
resumed with the TIER STOP, TIER PAUSE, and TIER RESUME commands.
The TIER AUTOREBUILD command can be used to automate the rebuild
process.
Note that SPARE disks are handled slightly differently from other
disks, in that SPARES that are not in use as an active replacement
for a failed disk elsewhere in the system are simply returned to a
normal healthy status by this command; SPAREs that are in use are
already considered healthy and are not rebuilt.
The failed disk to be rebuilt is specified by its physical tier and
channel locations, ‘tc’, where:
‘t’ indicates the tier in the range <1..128>, and
‘c’ indicates the channel in the range <ABCDEFGHPS>.
All failed and replaced disks can be rebuilt using the ALL parameter.
- REBUILDVERIFY=ON|OFF
-
This parameter determines if the system will send SCSI Write with
Verify commands to the disks when rebuilding failed disks. This
feature is used to guarantee that the data on the disks is rebuilt
correctly.
Note: This feature will increase the time it takes for rebuilds to
finish.
Default is OFF.
- REPLACE[=tc]
-
This parameter tells the system to replace the specified failed disk
with a spare disk or replace a healthy disk that is believed to be on
the verge of failing. The healthy disk replacement is referred to in
the system as a proactive replacement operation. A replace operation
is used to temporarily replace a disk with a healthy spare disk.
This operation can take several hours to complete depending on the
size of the disk and speed of the replace operation. The speed of
the replace operation can be adjusted with the DELAY and EXTENT
parameters of the TIER command.
The disk to be replaced is specified by its physical tier and channel
locations, ‘tc’, where:
‘t’ indicates the tier in the range <1..128>, and
‘c’ indicates the channel in the range <ABCDEFGHP>.
(Note that spare disks themselves cannot be replaced with this
command.)
- RESTART[=tc]
-
This parameter tells the system to start a restart operation on a
(presumably) already failed disk. The failed disk to be restarted is
specified by its physical tier and channel locations, ‘tc’, where:
‘t’ indicates the tier in the range <1..128>, and
‘c’ indicates the channel in the range <ABCDEFGHPS>.
All failed and replaced disks can be restarted using the ALL parameter.
- SCAN
- This parameter checks each disk channel in the system for any new
disks and verifies that the existing disks are in the correct
location. It also starts a rebuild operation on any failed disks
which pass the disk diagnostics.
- STATUS
- Displays the loop status of each disk channel and a count of the
fibre channel errors encountered on each channel.
- STATUSCLEAR
- Resets the fibre channel error counts on each disk channel.
- TIMEOUT=x
-
This parameter sets the total disk timeout (in seconds) for an I/O
request. The total disk timeout value indicates the total overall
length of time allotted to each I/O request to complete; if an I/O
request has not completed within this time frame, then an error
status is reported for it.
This parameter must be greater than or equal to CMD_TIMEOUT.
Valid range is 1 to 512 seconds.
Recommended value for SAS drives is 27 seconds.
Recommended value for SATA drives is 60 seconds.
Default is 60 seconds.
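The interaction between CMD_TIMEOUT (per attempt) and TIMEOUT (overall) can be sketched in Python as follows. This is an illustration of the rules as described above, not the controller's actual algorithm. It assumes the worst case where every attempt times out, that a retry is started whenever any time remains in the overall budget, and that the final attempt is cut short at the overall deadline; `worst_case_attempts` is a hypothetical helper name.

```python
def worst_case_attempts(timeout: int, cmd_timeout: int) -> list[tuple[int, int]]:
    """Return (start, deadline) in seconds for each attempt of one I/O request.

    timeout     -- total disk timeout (TIMEOUT), 1 to 512 seconds
    cmd_timeout -- per-retry timeout (CMD_TIMEOUT), 1 to 512 seconds,
                   and no larger than timeout

    ASSUMPTION: every attempt runs into its deadline, and a retry is started
    as long as any part of the overall budget is left.
    """
    for name, value in (("TIMEOUT", timeout), ("CMD_TIMEOUT", cmd_timeout)):
        if not 1 <= value <= 512:
            raise ValueError(f"{name} must be between 1 and 512 seconds")
    if cmd_timeout > timeout:
        raise ValueError("CMD_TIMEOUT must be smaller than or equal to TIMEOUT")

    attempts = []
    start = 0
    while start < timeout:
        # The last attempt cannot run past the overall deadline.
        deadline = min(start + cmd_timeout, timeout)
        attempts.append((start, deadline))
        start = deadline
    return attempts
```

Under these assumptions, the SATA recommendations (TIMEOUT=60, CMD_TIMEOUT=31) allow one full attempt and one shortened retry before the request completes with an error status, while the SAS recommendations (27 and 11) allow two full attempts and a shortened third.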
- WRITESAME=ON|OFF
-
Enables and disables use of the SCSI Write Same command when
formatting LUNs. The SCSI Write Same command is used by the system to
format LUNs faster. This parameter is provided for backwards
compatibility with disks or enclosures that do not support the SCSI
Write Same command.
Default is OFF.